Here, ML means Machine Learning, not the programming language. According to Wikipedia, “Machine learning is the study of algorithms and mathematical models that computer systems use to progressively improve their performance on a specific task.”
So, why am I talking about ML and Data Science here? Because I’m interested in them. In college I haven’t had much time to research which specialization I’d like to follow. Between my CS and math majors and my job, there is not much time left for me. However, being in my last year of college, I dedicated some time to finding out which path I want to follow after I’m done. Just a side note: in most cases you don’t want to pick a specialization too early. Use your time in college to learn the fundamental concepts; don’t run without knowing how to walk. Do your research close to graduation.
Anyway, I decided that I wanted to pursue something related to ML/Data Science. I love programming and solving problems in general. Systems would be my second choice. But I decided to go with ML/Data Science for a few reasons. The first one is that I also enjoy math, and since it is my second major, this is a natural way to combine math with CS. The second reason is the huge number of applications that ML and Data Science have. With a degree in ML/Data Science you can work for a wide range of companies and organizations. The world is generating much more data than we are able to process.
Regarding the first of my reasons, math, some people might wonder how much math is really necessary in ML/Data Science. Without getting into much detail here, especially since I’m just starting to learn ML, here is what I have. I’m following this book: “An Introduction to Statistical Learning”, 7th printing. It is intended to be light on math. However, here is what I have encountered, not only in this book but also in Andrew Ng’s famous ML video lectures: the “light on math” description is a little misleading. As I understand it, it means that the material leaves out proofs and a lot of rigorous detail. It does not mean no math at all.
Here is the math I think you need, in order of importance (in my opinion):
1- A course in Probability and Statistics. Why? ML and Data Science are all about statistics and probability. You have to know concepts like probability distributions, both discrete and continuous. You have to know how to work with data and analyze it. Concepts like mean, standard deviation, confidence intervals, regression analysis and more are at the heart of Data Science and ML.
2- A course in Linear Algebra. Why? You have to work with matrices and vectors. This involves matrix operations, vector operations, matrix transformations, linear programming, eigenvectors and eigenvalues, and more.
3- Calculus at different levels. You definitely have to deal with differentiation and integration. Integrals are also very important in statistics (point 1) for dealing with continuous distributions. Vector calculus is very important for understanding concepts like contour maps, multivariate functions, partial derivatives and so on.
4- Optimization. This one does not have to be a separate course. It is normally introduced in calculus and linear algebra. Optimization here refers to linear and non-linear programming. Linear programming is introduced in Linear Algebra and non-linear programming in Calculus, especially Vector Calculus.
5- Numerical Algorithms. This one does not have to be a course by itself either. It is normally introduced in the courses above. However, you should have some knowledge of it, because it is very important. Grab a book on Numerical Analysis/Algorithms and learn a bit. The knowledge gained from requirements 1-4 should be enough to study it on your own.
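To make the list above a bit more concrete, here is a minimal sketch of how several of these topics show up in even the simplest ML task: fitting a straight line to data. This is a toy illustration and is not taken from the book or the lectures. The data points, the function names `fit_closed_form` and `fit_gradient_descent`, and the learning rate and step count are all assumptions made up for this example. The first function uses statistics (means, covariance, variance); the second uses calculus and optimization (partial derivatives of the mean squared error) in a small numerical loop.

```python
import statistics

# Toy data, invented for illustration: y is roughly 2*x + 1 plus small noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]


def fit_closed_form(xs, ys):
    """Least-squares line from statistics: slope = cov(x, y) / var(x)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_gradient_descent(xs, ys, learning_rate=0.01, steps=5000):
    """Minimize mean squared error by stepping against its partial derivatives.

    learning_rate and steps are hand-picked for this toy data set.
    """
    n = len(xs)
    slope, intercept = 0.0, 0.0
    for _ in range(steps):
        # Prediction error for each data point under the current line.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE with respect to slope and intercept.
        grad_slope = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2.0 / n * sum(errors)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


print("closed form:     ", fit_closed_form(xs, ys))
print("gradient descent:", fit_gradient_descent(xs, ys))
```

Both approaches should land on essentially the same line. The closed-form version is what a statistics course teaches; the iterative version is the kind of numerical optimization that scales to models where no neat formula exists.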
Don’t let anyone fool you by saying that math is of little importance. That is not true. But don’t let that discourage you. Most of these requirements are covered in a math minor, depending on your school. I’d rather tell you the truth from the beginning. Do you need to be super fluent in math? No, but you should at least have had an introduction to the topics I mentioned above. Most people stay away from math, but in order to pursue ML/Data Science you must embrace it, at least at a computational level.
Most people want to make things perfect. Sometimes we size up the complexity of an upcoming goal or problem, and then the fear of not completing it perfectly, or of getting it "wrong" (yeah, who are the judges? 🤔), stops us from even trying.