
YASHWANTH CHIKKI HD


Pre-machine learning

ML libraries
-pandas
reading a dataset

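import pandas as pd

df = pd.read_csv("address")  # "address" is the path or URL of the CSV file
df = pd.read_excel("address", sheet_name="sheet", usecols=["column1"])  # load one column from one sheet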
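
To try the reading step without a file on disk, here is a minimal sketch that feeds `pd.read_csv` an in-memory string. The CSV content and the column names (`name`, `age`, `city`) are made up for illustration; with a real dataset you would pass its path instead.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file.
csv_text = "name,age,city\nAsha,31,Pune\nRavi,27,Delhi\nMeera,45,Chennai\n"

# read_csv accepts a path, a URL, or a file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

# usecols limits loading to the listed columns.
ages_only = pd.read_csv(io.StringIO(csv_text), usecols=["age"])

print(df.shape)
print(list(ages_only.columns))
```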

viewing the dataset

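print(df.head())     # first 5 rows; head is a method, so it needs parentheses
print(df.tail())     # last 5 rows
print(df.sample(5))  # 5 randomly chosen rows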
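
A small sketch of the viewing calls on a toy frame, to show that `head` and `tail` take an optional row count and that `sample` can be made repeatable. The `value` column and the `random_state` choice are only for illustration.

```python
import pandas as pd

# Toy frame with ten rows, values 0..9.
df = pd.DataFrame({"value": range(10)})

default_head = df.head()                     # first 5 rows by default
first_three = df.head(3)                     # first 3 rows
last_two = df.tail(2)                        # last 2 rows
random_four = df.sample(4, random_state=0)   # 4 random rows, repeatable

print(first_three)
print(last_two)
```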

dataset info

df.info()      # prints column names, dtypes, and non-null counts
df.describe()  # summary statistics for the numeric columns
df.shape       # (number of rows, number of columns)

selecting columns

column = df["column_name"]           # one column, returned as a Series
subset = df[["column1", "column2"]]  # several columns, returned as a DataFrame
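
Single brackets and double brackets return different types, which matters later when a model expects a 2-D input. A quick sketch with made-up column names:

```python
import pandas as pd

df = pd.DataFrame({"column1": [1, 2], "column2": [3, 4], "column3": [5, 6]})

single = df["column1"]               # Series (1-D)
subset = df[["column1", "column2"]]  # DataFrame (2-D)
one_col_frame = df[["column1"]]      # still a DataFrame, with one column

print(type(single).__name__, type(one_col_frame).__name__)
```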

add two compatible columns

df["new"]=df["col1"]+df["col2"]

filter rows

filtered = df[df["column"] > 10]  # keep only rows where the column value is above 10
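
Filtering works by passing a boolean mask to the frame. The sketch below, on made-up `score` and `group` columns, also shows how two conditions can be combined with `&`; each condition needs its own parentheses.

```python
import pandas as pd

df = pd.DataFrame({
    "score": [5, 12, 18, 9, 25],
    "group": ["a", "b", "a", "b", "a"],
})

# Single condition: rows where score is above 10.
above_ten = df[df["score"] > 10]

# Two conditions combined with & (use | for "or").
group_a_above_ten = df[(df["score"] > 10) & (df["group"] == "a")]

print(above_ten)
print(group_a_above_ten)
```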

merge two dataframes

newdf = pd.merge(df1, df2, on="common_column")  # inner join on the shared column

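
By default `pd.merge` does an inner join, so rows without a match in both frames are dropped. A minimal sketch with two toy frames sharing a hypothetical `id` column, comparing the default with `how="left"`:

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "score": [80, 90, 70]})

# Inner join (default): only ids present in both frames survive.
inner = pd.merge(df1, df2, on="id")

# Left join: every row of df1 is kept; missing scores become NaN.
left = pd.merge(df1, df2, on="id", how="left")

print(inner)
print(left)
```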

missing values
-check


values=df.isnull().sum()

-fill with values


df["col"] = df["col"].fillna(0)  # replace missing values in "col" with 0

-drop


df=df.dropna()

-duplicates


df.duplicated().sum()  # number of rows that repeat an earlier row
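
The check, fill, drop, and duplicate steps above can be tried together on one toy frame. The data and the column names `col` and `other` are made up; `drop_duplicates` is included as a likely next step after counting duplicates.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col": [1.0, np.nan, 3.0, 3.0],
    "other": [4.0, 5.0, 6.0, 6.0],
})

# Check: count missing values per column.
missing_per_column = df.isnull().sum()

# Fill: work on a copy and assign the result back to the column.
filled = df.copy()
filled["col"] = filled["col"].fillna(0)

# Drop: remove every row that has at least one missing value.
dropped = df.dropna()

# Duplicates: count repeated rows, then remove them.
duplicate_count = df.duplicated().sum()
deduplicated = df.drop_duplicates()

print(missing_per_column)
print(duplicate_count)
```

Assigning the result of `fillna` back to the column avoids relying on `inplace=True` on a selected column, which may not update the original frame in recent pandas versions.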

add new column


import numpy as np

y = np.array([1, 2, 3, 4])  # length must equal the number of rows in df
df["col_new"] = y
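
The array being assigned has to match the number of rows in the frame. A small sketch, with a made-up column `a`, that shows both the working case and the `ValueError` pandas raises when the lengths differ:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [10, 20, 30, 40]})

# Works: four values for four rows.
df["col_new"] = np.array([1, 2, 3, 4])

# Fails: three values for four rows raises ValueError.
try:
    df["too_short"] = np.array([1, 2, 3])
    length_error = None
except ValueError as exc:
    length_error = str(exc)

print(df)
print(length_error)
```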

drop column


df.drop(columns=["col_name"], inplace=True)  # removes the column from df itself

iloc indexer


X = df.iloc[:, :-1]  # all rows, every column except the last
y = df.iloc[:, -1]   # all rows, only the last column
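
A common use of these two `iloc` slices is splitting a dataset into features and target, assuming the target is stored in the last column. A minimal sketch with made-up column names:

```python
import pandas as pd

df = pd.DataFrame({
    "f1": [1, 2, 3],
    "f2": [4, 5, 6],
    "target": [0, 1, 0],
})

# Rows come before the comma, columns after it.
X = df.iloc[:, :-1]  # DataFrame with every column except the last
y = df.iloc[:, -1]   # Series holding only the last column

print(X.shape)
print(y.name)
```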

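-numpy


import numpy as np

mean = np.mean(data)      # data is any array-like of numbers
st_dev = np.std(data)     # population standard deviation (ddof=0)
median = np.median(data)
var = np.var(data)        # population variance (ddof=0)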
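
One detail worth knowing: `np.std` and `np.var` divide by `n` by default (population statistics), while pandas' `.std()` divides by `n - 1`. The sketch below uses made-up numbers and passes `ddof=1` to get the sample version from NumPy.

```python
import numpy as np

data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

mean = np.mean(data)
median = np.median(data)
st_dev = np.std(data)             # population standard deviation (ddof=0)
var = np.var(data)                # population variance (ddof=0)
sample_std = np.std(data, ddof=1) # sample standard deviation (divides by n - 1)

print(mean, median, st_dev, var, sample_std)
```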
