Context
As a Python beginner, I try to learn from different sources and with different methods. I was reading a Brazilian blog (estatsite.com.br) and found an interesting post about linear regression models, but it was written in R.
So I took its example and ported it to Python. I soon ran into problems using the sklearn package, and that is how I found a NumPy function instead: numpy.polyfit(). The official documentation is very clear and concise.
What do we want?
We want to check whether students get better grades when they receive more money, using a linear regression model.
Quick review of the official documentation
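According to it, numpy.polyfit() performs a least squares polynomial fit. Consider a polynomial p of degree deg:
$$p(x) = p_0 x^{deg} + p_1 x^{deg-1} + \dots + p_{deg}$$
The function returns the vector of coefficients p that minimizes the squared error, ordered from the highest power (deg) down to the constant term.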
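To make the coefficient ordering concrete, here is a minimal sketch. The points are made up for illustration (they lie exactly on the line y = 2x + 1) and are not part of the blog example.

```python
import numpy as np

# Made-up points lying exactly on the line y = 2x + 1 (illustrative only)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# deg=1 fits a straight line; coefficients come back highest power first
coeffs = np.polyfit(x, y, 1)
slope, intercept = coeffs

# np.polyval evaluates a polynomial using the same coefficient ordering
y_hat = np.polyval(coeffs, x)

print("slope:", slope)
print("intercept:", intercept)
```

For a degree 1 fit the first coefficient is the slope and the second is the intercept, which is why the result can be unpacked into two names.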
The general syntax is:
numpy.polyfit(x, y, deg, rcond=None, full=False, w=None, cov=False)
The parameters x, y, and deg are required; the remaining ones are optional.
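As a rough sketch of how the required and optional parameters change what comes back, the snippet below calls the function three ways. The data is invented for illustration only.

```python
import numpy as np

# Made-up, slightly noisy data (illustrative only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Only the required arguments: returns just the coefficient array
p = np.polyfit(x, y, 1)

# full=True also returns diagnostics from the least squares solver
p_full, residuals, rank, singular_values, rcond = np.polyfit(x, y, 1, full=True)

# cov=True returns the coefficients plus their covariance matrix
p_cov, cov = np.polyfit(x, y, 1, cov=True)

print("coefficients:", p)
print("sum of squared residuals:", residuals)
print("covariance matrix shape:", cov.shape)
```

For a simple line fit, the plain three-argument call is usually all that is needed.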
The squared error that is minimized, summed over the sample points (x_j, y_j), is:
$$E = \sum_{j=0}^{k} |p(x_j) - y_j|^2$$
Curious readers should consult the official documentation for the details.
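One way to see what "minimizing the squared error" means is to compute E by hand. This is only a sketch: the data is made up, and squared_error is a hypothetical helper, not a NumPy function.

```python
import numpy as np

# Made-up, slightly noisy data (illustrative only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

def squared_error(coeffs, x, y):
    """Hypothetical helper: sum of squared differences between p(x_j) and y_j."""
    return np.sum((np.polyval(coeffs, x) - y) ** 2)

# Coefficients chosen by the least squares fit
best = np.polyfit(x, y, 1)
e_best = squared_error(best, x, y)

# Nudging either coefficient away from the fitted value increases E
e_nudged_slope = squared_error(best + np.array([0.1, 0.0]), x, y)
e_nudged_intercept = squared_error(best + np.array([0.0, 0.1]), x, y)

print("E at the fitted coefficients:", e_best)
print("E with a nudged slope:", e_nudged_slope)
print("E with a nudged intercept:", e_nudged_intercept)
```

Any other line through the same points gives an error at least as large as the fitted one.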
Code
Hypothesis: receiving more money improves students' performance.
# importing the data manipulation library
import pandas as pd
# importing the numerical computing library
import numpy as np
# importing the visualization library
import matplotlib.pyplot as plt
# creating a dataframe (one column per student at this point)
alunos = pd.DataFrame({
'André': {'Mesada': 36.02, 'Nota_Testao': 48},
'Joao': {'Mesada': 11.83, 'Nota_Testao': 25},
'Bia': {'Mesada': 22.0, 'Nota_Testao': 43},
'Ana': {'Mesada': 24.0, 'Nota_Testao': 39},
'José': {'Mesada': 100.0, 'Nota_Testao': 60},
'Vinicius': {'Mesada': 10.0, 'Nota_Testao': 40},
'Tulio': {'Mesada': 20.0, 'Nota_Testao': 48},
'Josué': {'Mesada': 25.0, 'Nota_Testao': 47},
'Antonella': {'Mesada': 22.0, 'Nota_Testao': 43},
})
# transposing so that each student becomes a row, then displaying the dataframe
alunos = alunos.T
alunos
# converting the pandas Series into NumPy arrays
x = np.array(alunos['Mesada'])
y = np.array(alunos['Nota_Testao'])
The column Nota_Testao (the test grade) is our y, the dependent variable. The column Mesada (the allowance, in R$) is our x, the predictor, since the hypothesis is that the grade depends on the money received.
print('Independent variable: ', x)
print('Dependent variable: ', y)
# plotting the data points and the fitted line
plt.plot(x, y, 'o')
m, b = np.polyfit(x, y, 1)
plt.plot(x, m * x + b)
plt.xlabel("Mesada (R$)")
plt.ylabel("Nota")
plt.show()
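Once m and b are known, the fitted line can also be used to estimate a grade for a given allowance. The sketch below repeats the values from the dataframe as plain arrays and takes Mesada as the predictor and Nota_Testao as the response, as the hypothesis states. The helper predict_nota is a hypothetical name introduced only for illustration.

```python
import numpy as np

# Same values as the dataframe, written as plain arrays
mesada = np.array([36.02, 11.83, 22.0, 24.0, 100.0, 10.0, 20.0, 25.0, 22.0])
nota = np.array([48, 25, 43, 39, 60, 40, 48, 47, 43], dtype=float)

# Degree 1 fit: m is the slope, b is the intercept
m, b = np.polyfit(mesada, nota, 1)

def predict_nota(valor_mesada):
    """Hypothetical helper: grade estimated by the fitted line."""
    return m * valor_mesada + b

print(f"slope m = {m:.3f}, intercept b = {b:.3f}")
print(f"estimated grade for an allowance of R$ 50.00: {predict_nota(50.0):.1f}")
```

A positive slope is consistent with the hypothesis, but with only nine students and one very large allowance (José), the estimate should be read with care.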
Thank you for reading. I'm open to suggestions, and I apologize for any mistakes. My GitHub link is: github.com/biangomes.