DEV Community

Neha Gupta

Beginner's Guide to Maths for Machine Learning

Hey everyone! As I said in my last post, I'd share the mathematics I am currently studying for machine learning in an upcoming post, so here it is.
In this post I'll be sharing some of the basic maths that I have studied so far. So let's get started :)
Maths is one of the most important topics for machine learning. If you want a good grasp of this subject, you need to be good at both the mathematics and the programming. So let's start by understanding what statistics is.

Statistics and its role in machine learning

Statistics is a branch of mathematics that deals with collecting, organizing, analyzing, interpreting, and presenting data.
It provides a robust set of tools for understanding patterns and trends, and making inferences and predictions based on data.

The role of statistics in machine learning:

  1. Data analysis and preprocessing: Statistics helps in understanding the underlying distribution of data, identifying outliers, handling missing values, and scaling features. Measures like mean, median, mode, standard deviation, correlation, and percentiles are commonly used for data analysis and preprocessing.
  2. Constructing machine learning models: Statistics provides the methodologies and principles for creating models in machine learning.
  3. Interpreting results: Measures such as p-values, confidence intervals, R-squared, and others provide us with a statistical perspective on the machine learning model's performance.
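To make the first point concrete, here is a minimal sketch of preprocessing with basic statistics, using only Python's standard `statistics` module. The `raw_ages` data, the choice to fill missing values with the median, and the cut-off of 2 standard deviations for outliers are all assumptions made for illustration, not fixed rules.

```python
import statistics

# Hypothetical feature column: None marks a missing value, 95 looks like an outlier.
raw_ages = [22, 25, None, 24, 23, 95, 26, 24]

# Handle missing values: fill each gap with the median of the observed values.
observed = [x for x in raw_ages if x is not None]
median_age = statistics.median(observed)
filled = [median_age if x is None else x for x in raw_ages]

# Scale the feature: standardize to zero mean and unit (population) standard deviation.
mean_age = statistics.mean(filled)
std_age = statistics.pstdev(filled)
z_scores = [(x - mean_age) / std_age for x in filled]

# Identify outliers: flag values more than 2 standard deviations from the mean
# (the threshold of 2 is an assumption; 3 is also a common choice).
outliers = [x for x, z in zip(filled, z_scores) if abs(z) > 2]

print("median used for filling:", median_age)
print("outliers:", outliers)
```

The median is used for filling here because, unlike the mean, it is not pulled upwards by the single extreme value.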

There's more to it, but going too deep at the beginning isn't helpful, so I am keeping it simple here.

Now let's look at some key concepts of statistics that matter in machine learning.

Key Concepts of Statistics

  1. Descriptive Statistics: Descriptive statistics involves summarizing and describing the main features of a dataset. This includes measures of central tendency (e.g., mean, median, mode) and measures of dispersion (e.g., variance, standard deviation, range).
  2. Inferential Statistics: Inferential statistics aims to make predictions or draw conclusions about a population based on a sample of data. It includes hypothesis testing, confidence intervals, and estimation of parameters.

Technical language can be quite difficult to understand!
Let's make sense of these terms with a simple example.

For example, if you have a list of students' ages in a classroom, you might use descriptive statistics to find out things like:

  • The average (mean) age of the students.

  • The most common (mode) age among the students.

  • How spread out (range) the ages are.
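The three descriptive measures above can be computed in a few lines of Python. This is just a small sketch; the `ages` list is made-up data for an imaginary classroom.

```python
import statistics

# Hypothetical ages of students in one classroom.
ages = [14, 15, 15, 16, 15, 14, 17, 15]

mean_age = statistics.mean(ages)    # the average age
mode_age = statistics.mode(ages)    # the most common age
age_range = max(ages) - min(ages)   # how spread out the ages are

print("mean:", mean_age)    # mean: 15.125
print("mode:", mode_age)    # mode: 15
print("range:", age_range)  # range: 3
```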

Now if you wanted to know the average height of all people in your town but couldn't measure everyone, you might measure the heights of a smaller group of people (your sample) and use inferential statistics to estimate the average height of the entire town (the population).
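Here is a rough sketch of that idea in Python. The town is simulated, so every number in it is an assumption: 10,000 people, heights drawn from a normal distribution with mean 170 cm and standard deviation 10 cm, and a sample of 100 people. The interval uses the normal approximation, which is reasonable for a sample of this size; for very small samples a t-distribution would be the more careful choice.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Simulated town (the population). In real life we could not measure everyone.
population = [rng.gauss(170, 10) for _ in range(10_000)]

# Measure only a smaller group of people (the sample).
sample = rng.sample(population, 100)

# Point estimate of the town's average height.
sample_mean = statistics.mean(sample)

# Approximate 95% confidence interval: mean +/- z * (s / sqrt(n)).
z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"estimated average height: {sample_mean:.1f} cm")
print(f"approximate 95% interval: {ci_low:.1f} to {ci_high:.1f} cm")
print(f"true average (known only because we simulated): {statistics.mean(population):.1f} cm")
```

The sample mean will usually land close to the true average, and the interval gives a sense of how much that estimate could be off.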

So, in simple terms:

  • Descriptive statistics describe data you have.

  • Inferential statistics help you make educated guesses or inferences about a larger group based on a smaller sample.

So this is how statistics plays a very important role in machine learning. I am keeping this post short and simple; let's explore more in my next post.
Stay connected to know more and don't forget to follow me :)

Top comments (1)

nigel447

Statistics is what we use when we can't quantify a parameter of a distribution, so all we have left is the best guess we can come up with.