# Self Study: Data Science - Machine Learning journey : Day 2 (Statistics | R | Python | Anaconda | Jupyter)

Vignesh C Updated on ・2 min read

## Prerequisites:

Statistics is generally considered a prerequisite for studying machine learning. We need statistics to transform observations into information and to answer questions about samples of observations.

### Statistics is needed in Machine Learning for..

Another prerequisite for data science and machine learning is a programming language, typically R or Python. R is mainly used for statistical analysis and model building, while Python is used beyond statistics, with a wide range of libraries and better integration with other programming languages.

## Applied Statistics:

The field of statistics has two broad categories:

1. Descriptive statistics
2. Inferential statistics

Descriptive statistics is the process of categorizing and describing the data we have collected.

Inferential statistics includes the process of analyzing a sample of data and using it to draw inferences about the population from which it was drawn.
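To make the difference concrete, here is a minimal sketch using only Python's standard library. The sample values are made up for illustration, and the 95% interval uses the normal critical value 1.96 as a simplifying assumption (for a sample this small, a t-based interval would be somewhat wider).

```python
import math
import statistics

# Hypothetical sample: heights (cm) of 10 people drawn from a larger population.
sample = [162.0, 170.5, 168.2, 175.1, 159.8, 181.3, 166.4, 172.9, 169.0, 174.6]

# Descriptive statistics: summarize the sample itself.
mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# Inferential statistics: use the sample to estimate the population mean.
# Approximate 95% confidence interval with the normal critical value 1.96
# (an assumption made to keep the sketch short).
standard_error = stdev / math.sqrt(len(sample))
ci_low = mean - 1.96 * standard_error
ci_high = mean + 1.96 * standard_error

print(f"mean={mean:.2f} median={median:.2f} stdev={stdev:.2f}")
print(f"approx. 95% CI for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The first three values only describe the ten observations. The interval is a statement about the population they were drawn from, which is what makes it inferential.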

We need to be familiar with all these concepts to continue our machine learning journey effectively. Most of them will likely have been covered as part of a graduate degree.

## Install RStudio

Install R and RStudio Desktop for your version of OS.

Sample R code to illustrate AUC and ROC from Day 1:

https://github.com/IamVigneshC/Machine-Learning-Data-Science/blob/master/R/ROC_AUC.R
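For readers following along in Python instead, the idea behind ROC and AUC can be sketched in a few lines. This is not a port of the R script linked above. It is a standalone illustration with hypothetical function names (`roc_points`, `auc`) and made-up labels and scores, and it assumes both classes appear in the labels.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points for a ROC curve.

    labels: 1 for positive, 0 for negative. scores: higher means more positive.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # Sweep the decision threshold from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item of a group of tied scores.
        if i == len(ranked) - 1 or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up example: three positives and three negatives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(f"AUC = {auc(roc_points(labels, scores)):.3f}")
```

An AUC of 1.0 means every positive is scored above every negative, while 0.5 is what a random ranking would give.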

## Install Python

You can install and use Python from the command line or through Anaconda, which comes with tutorials and reference material for various libraries.

Once installed, you can open JupyterLab or Jupyter Notebook and work in Python.
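A quick way to confirm the setup is to run a small check in the first notebook cell. The sketch below prints the Python version and reports whether a few libraries can be imported. The package list is an illustrative assumption about what a typical data science install contains, so adjust it to your own environment.

```python
import importlib.util
import sys

# Libraries often used for data science; this list is an assumption for illustration.
packages = ["numpy", "pandas", "matplotlib", "sklearn"]

version = sys.version_info
print(f"Python {version.major}.{version.minor}.{version.micro}")

# find_spec returns None when a top-level package cannot be found.
availability = {name: importlib.util.find_spec(name) is not None for name in packages}
for name, found in availability.items():
    print(f"{name}: {'available' if found else 'not installed'}")
```

Anything reported as not installed can be added with your package manager before moving on.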

Some of my samples to get started:

https://anaconda.org/iamvigneshc

https://github.com/IamVigneshC/Machine-Learning-Data-Science/tree/master/Python