Many people are under the misconception that data science is all about machine learning algorithms. That is not true. Data science is a combination of mathematics, computer science, and machine learning.
Data science is the study of data: you maintain datasets and derive insights from them. It approaches problems through the parts listed below.
Perception - identifying patterns in the data
Planning - involves two steps:
- Finding all possible solutions
- Finding the best possible solution among all solutions
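The two planning steps can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the data, the threshold rule, and the names `observations`, `accuracy`, and `best_threshold` are all invented here and do not come from any particular library.

```python
# Hypothetical toy data, invented for illustration:
# (hours_studied, passed_exam) pairs.
observations = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 1), (6, 1)]


def accuracy(threshold, data):
    """Fraction of rows where the rule 'x >= threshold' matches the label."""
    correct = sum(1 for x, label in data if int(x >= threshold) == label)
    return correct / len(data)


# Step 1: find all possible solutions.
# Here a "solution" is a cut-off value, and every observed value is a candidate.
candidates = sorted({x for x, _ in observations})

# Step 2: find the best possible solution among all solutions.
# "Best" is assumed to mean the highest accuracy on the data.
best_threshold = max(candidates, key=lambda t: accuracy(t, observations))

print(best_threshold, accuracy(best_threshold, observations))
```

Enumerating every candidate only works when the set of solutions is small. For larger problems, the second step is usually handled by an optimization method rather than an exhaustive search.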
What do you need to know to be a successful data scientist?
- Programming knowledge
- Data modelling and evaluation
- Data visualization and reporting
- Probability and statistics
- Machine learning techniques
- Relational database knowledge
Let's get started with some basic terminology used in data science:
- Observations - data points in your dataset (rows)
- Features - variables in your dataset (columns)
- Target variable - the variable you are trying to predict
- Train data - data from which your algorithm learns
- Test data - data to evaluate your model performance
- Model - set of patterns learned from the data
- Algorithm - specific machine learning process used to train your model
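To make these terms concrete, here is a minimal sketch in plain Python. The dataset is invented for illustration, and the choice of a one-feature least-squares line as the algorithm is an assumption made to keep the example short. The names `rows`, `fit_line`, and `predict` are hypothetical helpers, not part of any library.

```python
# Hypothetical dataset, invented for illustration.
# Each tuple is one observation (row).
# First value: hours_studied (feature). Second value: exam_score (target variable).
rows = [
    (1.0, 52.0), (2.0, 54.0), (3.0, 56.0),
    (4.0, 58.0), (5.0, 60.0), (6.0, 62.0),
]

# Train data: what the algorithm learns from.
# Test data: held back to evaluate the model.
# A simple positional split is used here; in practice the split is usually random.
train, test = rows[:4], rows[4:]


def fit_line(data):
    """Algorithm: ordinary least squares for a single feature."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned pattern to a new feature value."""
    slope, intercept = model
    return slope * x + intercept


# Model: the pattern learned from the train data (a slope and an intercept).
model = fit_line(train)

# Evaluate on the test data with the mean absolute error.
mae = sum(abs(predict(model, x) - y) for x, y in test) / len(test)
print(model, mae)
```

The key design point is that the model never sees the test rows during fitting, so the error measured on them reflects how well the learned pattern generalizes to new observations.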