Discussion on: Building our First Machine Learning Model (Pt. 4)


Replies for: Right! My bad 😅 The order is actually correct. Doing get_dummies first before splitting the data might cause a data leakage. We want to make sure t...

I added the two lines above, but I still get the same error message: "ValueError: could not convert string to float: 'setosa'"

I think this may be because the train_X data still has the species in text format. How does the fit function know the relationship between the X species column and the Y setosa/versicolor/virginica columns? Do I need to do one-hot encoding on the X data?

Could you post a full, working Python script somewhere so I can see how this is supposed to work?

Michael Learns • Jan 17 '19 • Edited

Oh right! Take species out of the features array, since it's the target and not an input. That should fix the "ValueError: could not convert string to float: 'setosa'"

Also, I've added the missing import, from sklearn.metrics import mean_absolute_error, for the mean_absolute_error function.
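For anyone following along without the notebook, here is a rough sketch of what the fix above amounts to. It is not the post's exact code: it assumes the iris data is loaded from scikit-learn rather than a CSV, that the columns are named as shown, and that the model is a DecisionTreeRegressor. Those choices are illustrative assumptions only.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.metrics import mean_absolute_error  # the import that was missing
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Assumption: build the iris table from scikit-learn instead of reading a CSV.
iris = load_iris()
columns = ["sepal_length", "sepal_width", "petal_length", "petal_width"]  # assumed names
data = pd.DataFrame(iris.data, columns=columns)
data["species"] = iris.target_names[iris.target]  # text labels such as 'setosa'

# Features hold only the numeric measurements; species is left out on purpose.
features = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
X = data[features]
y = data["species"]

# Split first, then one-hot encode the target in each split.
train_X, val_X, train_y, val_y = train_test_split(
    X, y, random_state=0, stratify=y
)
train_y = pd.get_dummies(train_y, dtype=float)
val_y = pd.get_dummies(val_y, dtype=float)

# Assumption: a decision tree regressor stands in for the post's model.
model = DecisionTreeRegressor(random_state=0)
model.fit(train_X, train_y)

predictions = model.predict(val_X)
mae = mean_absolute_error(val_y, predictions)
print(mae)
```

On the question of how fit knows the relationship: it pairs each row of train_X with the row of train_y at the same position, so the numeric features need no one-hot encoding once the text species column is removed from them.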

Here's a link to a working kaggle notebook: kaggle.com/interestedmike/iris-dat...