Random forest analysis was performed to evaluate the importance of a series of explanatory variables in predicting a categorical response variable. The explanatory variables included as possible contributors to the random forest predicting flower type (the output) were petal length, petal width, sepal length, and sepal width; a feature-importance summary is sketched at the end of this post.
This is my copy of the Colab notebook, so everyone can see the output along with the code.
Python Code:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
iris = load_iris()
classes = iris['target_names']
classes
array(['setosa', 'versicolor', 'virginica'], dtype='<U10')
X = iris['data']    # feature matrix: sepal/petal measurements
Y = iris['target']  # class labels (0, 1, 2)
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3)
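Note that train_test_split shuffles the data randomly, so the exact test-set accuracy will differ slightly between runs. If reproducible results are wanted, a fixed random_state can be passed; a minimal sketch (random_state=42 is an arbitrary example value, not from the original notebook):

# Optional: fix the seed so the split (and thus the score) is repeatable.
# stratify=Y keeps the class proportions similar in the train and test sets.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=42, stratify=Y
)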
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=50)
model.fit(X_train,Y_train)
model.score(X_test,Y_test)
0.9555555555555556
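For a classifier, model.score reports the mean accuracy on the given data, i.e. the fraction of test samples predicted correctly. The same number can be computed explicitly; a minimal sketch:

from sklearn.metrics import accuracy_score
# Compare true labels with the model's predictions; this should match model.score
accuracy_score(Y_test, model.predict(X_test))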
Output
print("actual result:",classes[Y_test[2]])
print("predicted result:",classes[model.predict([X_test[2]])[0]])
actual result: virginica
predicted result: virginica
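Checking a single test sample is only a spot check. For a per-class breakdown of precision and recall over the whole test set, scikit-learn's classification_report can be used; a minimal sketch:

from sklearn.metrics import classification_report
# Per-class precision, recall and F1 for setosa, versicolor and virginica
print(classification_report(Y_test, model.predict(X_test), target_names=classes))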
predictions = model.predict(X_test)
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(Y_test,predictions)
import seaborn as sn
sn.heatmap(cm,annot=True)
The confusion matrix is created from the predictions made on the test dataset.
The y-axis (rows) represents the actual classes and the x-axis (columns) represents the predicted classes.
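To make that orientation explicit, the heatmap axes can be labelled with the class names; a minimal sketch using the same cm as above:

# Label the heatmap so rows (y-axis) read as actual classes and
# columns (x-axis) read as predicted classes.
sn.heatmap(cm, annot=True, xticklabels=classes, yticklabels=classes)
plt.xlabel("Predicted class")
plt.ylabel("Actual class")
plt.show()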
With the random forest classifier, we get an overall accuracy of about 96% on the test set (0.9556 in the run shown above).
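Since the stated goal was to evaluate the importance of the explanatory variables, the fitted forest's feature_importances_ attribute can also be inspected; a minimal sketch (not part of the original notebook):

import pandas as pd
# Impurity-based importance of each feature, averaged over the 50 trees
importances = pd.Series(model.feature_importances_, index=iris['feature_names'])
print(importances.sort_values(ascending=False))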