DEV Community

Python Machine Learning fun

petercour
・2 min read

Machine Learning is great. You can use data you already have and make all kinds of apps.

Why data? Machine Learning algorithms use data. Without data, no Machine Learning. So what can you do with data?

One example is making predictions. It's hard to build intelligent machines when every rule is hand-coded, because the number of possible situations is far greater than any programmer has time to cover.

So you need to use data and train the algorithm.
The sklearn module (scikit-learn) is a popular choice for making Machine Learning apps.

Oh blimey! Take notes

Machine Learning Algorithm

First load the data. Say you have the data in CSV format.
Pima Indians diabetes dataset

I took the Pima Indians diabetes dataset, but the principle works with any dataset. Load it:

#!/usr/bin/python3
import numpy as np

# load the CSV file as a numpy matrix
dataset = np.loadtxt("./pima-indians-diabetes.csv", delimiter=",")

# separate the features (columns 0-7) from the target (column 8)
X = dataset[:,0:8]
y = dataset[:,8]
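
NumPy slices leave out the end index, so it is easy to drop a column by accident. Here is a minimal sketch for checking the shapes, assuming the file has eight feature columns followed by one target column. The array is a placeholder built with `np.arange`, not real data:

```python
import numpy as np

# Placeholder for the loaded CSV: 3 rows, 9 columns (8 features + 1 target).
# The numbers are arbitrary and only there to show the slicing.
dataset = np.arange(27, dtype=float).reshape(3, 9)

X = dataset[:, 0:8]   # columns 0..7, the end index 8 is excluded
y = dataset[:, 8]     # column 8 only

print(X.shape)  # (3, 8)
print(y.shape)  # (3,)
```

If `X.shape` shows fewer columns than you expect, the slice is probably one short.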

So you have X and y. You need an algorithm that learns from this data and makes predictions:

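from sklearn.linear_model import LogisticRegression

# max_iter is raised so the solver has room to converge on this data
model = LogisticRegression(max_iter=1000)
model.fit(X, y)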

What is fit? It is the step where the algorithm learns from the data.
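
To see what fit leaves behind, here is a small sketch on toy data. The values and the names `X_toy` and `y_toy` are made up for illustration and have nothing to do with the diabetes file:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data (made up): one feature, small values are class 0, large are class 1.
X_toy = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression()
model.fit(X_toy, y_toy)  # the learning happens here

# After fit, the learned parameters are stored on the model object.
print(model.coef_, model.intercept_)

# Points well inside each group should be assigned to that group.
print(model.predict([[0.5], [4.5]]))
```

Before fit is called, `coef_` does not exist yet. That is a quick way to tell whether a model has been trained.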
Then you can make predictions:

expected = y
predicted = model.predict(X)

Because you know both the expected and the predicted values, you can measure how good the algorithm's predictions are.
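
As a rough illustration of that measuring step, the sketch below compares two short label lists. The labels are made up, so the numbers say nothing about the diabetes model:

```python
from sklearn import metrics

# Made-up labels for illustration: 1 = positive, 0 = negative.
expected = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

# Fraction of predictions that match the expected labels (6 of 8 here).
accuracy = metrics.accuracy_score(expected, predicted)
print(accuracy)  # 0.75

# Rows are the expected classes, columns are the predicted classes.
matrix = metrics.confusion_matrix(expected, predicted)
print(matrix)
```

The diagonal of the confusion matrix counts the correct predictions. Everything off the diagonal is a mistake.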

Run the App

Because we have X and y, you can see how well the predictor is doing.

#!/usr/bin/python3
from sklearn import metrics
from sklearn.linear_model import LogisticRegression
import numpy as np

# load the CSV file as a numpy matrix
dataset = np.loadtxt("./pima-indians-diabetes.csv", delimiter=",")

# separate the features (columns 0-7) from the target (column 8)
X = dataset[:,0:8]
y = dataset[:,8]

# train the model (max_iter raised so the solver has room to converge)
model = LogisticRegression(max_iter=1000)
model.fit(X, y)

# make predictions
expected = y
predicted = model.predict(X)

# summarize the fit of the model
print('RESULT')
print(metrics.classification_report(expected, predicted))
print('CONFUSION MATRIX')
print(metrics.confusion_matrix(expected, predicted))
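
The program above scores the model on the same rows it learned from, which tends to give an optimistic picture. A common alternative is to hold back part of the data for testing. The sketch below uses synthetic data from `make_classification` as a stand-in, so it runs without the CSV file. The 25% split and the `random_state` values are arbitrary choices:

```python
from sklearn import metrics
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (not the diabetes file): 200 rows, 8 features.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Hold out 25% of the rows; the model never sees them during fit.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Score only on the held-out rows.
predicted = model.predict(X_test)
print(metrics.classification_report(y_test, predicted))
```

With the real file, you would pass the loaded X and y to `train_test_split` instead of the synthetic arrays.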

That was fun!

Discussion (1)

Emmanuel Oreoluwa

I always thought it was rocket science
I love this niche