DEV Community

First Machine Learning program

petercour
・2 min read

Think Machine Learning has to be difficult?

Think again. You can start with a simple program after installing the scikit-learn package, which is imported as sklearn.

#!/usr/bin/python3
import sklearn
print("Scikit-Learn", sklearn.__version__)

So what does it do? You guessed it: it prints the installed scikit-learn version.

python3 first.py  
Scikit-Learn 0.21.2

So let's start cooking a Machine Learning app.

Algorithms

Machine Learning focuses on two things: algorithms and data. Sklearn contains many algorithms, for example:

#!/usr/bin/python3
from sklearn import tree
from sklearn.naive_bayes import GaussianNB
from sklearn import neighbors

# k-nearest neighbors classifier
# (n_neighbors and weights were left undefined; the values here are assumptions)
n_neighbors = 3
weights = "uniform"
knn = neighbors.KNeighborsClassifier(n_neighbors=n_neighbors, weights=weights)

# decision tree algorithm
clf = tree.DecisionTreeClassifier()

# naive Bayes
gnb = GaussianNB()

Nothing special here. It just shows that sklearn ships with many machine learning algorithms.
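These classifiers share the same basic interface: each one is trained with fit() and queried with predict(). A minimal sketch, assuming a tiny made-up dataset (the numbers, the 'low'/'high' labels and the variable names are illustrative only and are not part of the recipe that follows):

```python
from sklearn import neighbors, tree
from sklearn.naive_bayes import GaussianNB

# toy data (an assumption for illustration): one numeric feature, two labels
toy_X = [[0], [1], [2], [10], [11], [12]]
toy_y = ['low', 'low', 'low', 'high', 'high', 'high']

classifiers = {
    'knn': neighbors.KNeighborsClassifier(n_neighbors=3),
    'decision tree': tree.DecisionTreeClassifier(),
    'naive bayes': GaussianNB(),
}

# every classifier is trained and queried in exactly the same way
results = {}
for name, clf in classifiers.items():
    clf.fit(toy_X, toy_y)
    results[name] = list(clf.predict([[1.5], [10.5]]))
    print(name, results[name])
```

Because the calls are identical, swapping one algorithm for another usually means changing a single line.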

Let's cook!

Cooking time! If you haven't seen the series Breaking Bad, this may seem a bit strange. So what will we make?

We'll make a predictor.

Based on the input data, it will predict whether the product is good or bad.

Example runs:

python3 skfirst.py  
Enter color: blue
How many percent pure: 99

Prediction : ['heisenberg']

-----------------------------
python3 skfirst.py
Enter color: blue
How many percent pure: 25

Prediction : ['garbage']
-----------------------------
python3 skfirst.py  
Enter color: white
How many percent pure: 99

Prediction : ['garbage']

A Machine Learning algorithm like this one learns from training data. Each row of the data has two variables: color (0 for blue, 1 for anything else) and pureness in percent. Each row also has one of two possible outcomes (heisenberg, garbage).

# training data                                                                                                                                                         
X = [[0,90], [0,80], [1,20], [1,60], [0,70],[0,70],[1,99]] 
Y = ['heisenberg','heisenberg','garbage','garbage','garbage', 'garbage', 'garbage']
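To make the encoding explicit: in the complete program further down, blue is stored as 0 and every other color as 1, and the second number is the pureness in percent. A small sketch that prints each training row in readable form (the helper name describe_row is hypothetical):

```python
# training data copied from the article
X = [[0, 90], [0, 80], [1, 20], [1, 60], [0, 70], [0, 70], [1, 99]]
Y = ['heisenberg', 'heisenberg', 'garbage', 'garbage', 'garbage', 'garbage', 'garbage']

def describe_row(features, label):
    """Turn one encoded training row into a readable line (hypothetical helper)."""
    color_code, pureness = features
    color = 'blue' if color_code == 0 else 'not blue'
    return f"{color}, {pureness}% pure -> {label}"

# pair every feature row with its label and print it
lines = [describe_row(x, y) for x, y in zip(X, Y)]
for line in lines:
    print(line)
```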

Then for any new input, it can predict if it's a good or bad product:

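# new sample: color is 0 for blue or 1 for any other color, pure is a percentage
P = [[color, pure]]

# train the decision tree, then make the prediction
c = tree.DecisionTreeClassifier()
c = c.fit(X, Y)
print(c.predict(P))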
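To see what the tree actually learned, the fitted rules can be printed as text. The sketch below uses two fixed samples instead of keyboard input and tree.export_text, which is available from scikit-learn 0.21. The random_state=0 argument and the feature names are assumptions added here. On a dataset this small several splits can be equally good, and scikit-learn breaks such ties at random, so the rules (and the prediction for a borderline input) may differ between runs unless random_state is fixed.

```python
from sklearn import tree

# training data copied from the article: [color, pureness]
X = [[0, 90], [0, 80], [1, 20], [1, 60], [0, 70], [0, 70], [1, 99]]
Y = ['heisenberg', 'heisenberg', 'garbage', 'garbage', 'garbage', 'garbage', 'garbage']

# random_state=0 is an assumption, added only to make runs repeatable
c = tree.DecisionTreeClassifier(random_state=0)
c = c.fit(X, Y)

# two fixed samples: blue at 85% pure, not blue at 20% pure
samples = [[0, 85], [1, 20]]
predictions = list(c.predict(samples))
print(predictions)

# print the learned if/else rules as plain text
rules = tree.export_text(c, feature_names=['color', 'pureness'])
print(rules)
```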

All code below:

#!/usr/bin/python3
from sklearn import tree

# training data: [color, pureness], where color is 0 for blue and 1 for anything else
X = [[0, 90], [0, 80], [1, 20], [1, 60], [0, 70], [0, 70], [1, 99]]
Y = ['heisenberg', 'heisenberg', 'garbage', 'garbage', 'garbage', 'garbage', 'garbage']

# new data for prediction
color = input("Enter color: ")
if color.strip().lower() == "blue":
    color = 0.0
else:
    color = 1.0

pure = input("How many percent pure: ")
pure = float(pure)

P = [[color, pure]]

# train the decision tree and make the prediction
c = tree.DecisionTreeClassifier()
c = c.fit(X, Y)
print("\nPrediction : " + str(c.predict(P)))
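For experimenting without typing answers each time, the same steps could be wrapped in a function. This is only a sketch: predict_product is a hypothetical name, random_state=0 is an added assumption for repeatable results, and the color check here ignores case and surrounding spaces.

```python
from sklearn import tree

# training data copied from the article: [color, pureness]
X = [[0, 90], [0, 80], [1, 20], [1, 60], [0, 70], [0, 70], [1, 99]]
Y = ['heisenberg', 'heisenberg', 'garbage', 'garbage', 'garbage', 'garbage', 'garbage']

# train once, then reuse the fitted model for every prediction
model = tree.DecisionTreeClassifier(random_state=0).fit(X, Y)

def predict_product(color, pure):
    """Return the predicted label for a color name and a pureness percentage."""
    color_code = 0.0 if color.strip().lower() == "blue" else 1.0
    return str(model.predict([[color_code, float(pure)]])[0])

print(predict_product("blue", 85))
print(predict_product("white", 20))
```

Training once and calling the function many times avoids refitting the tree for every question.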

That was fun!
