Get started with Machine Learning (Part-1)

#machinelearning #beginners #python

Welcome everyone!

Today, in the first session of our series on symbolic Machine Learning, we will talk about some ML basics.

So, what is Machine Learning?

Machine Learning is a branch of Artificial Intelligence (AI) that gives systems the ability to learn from data and improve with experience without being explicitly programmed.

According to guru ChatGPT,

Machine Learning is a technology that uses mathematical and statistical techniques to enable computer systems to learn from data and improve their performance on a particular task without being explicitly programmed. By analyzing and identifying patterns in data, machine learning algorithms can make predictions, identify anomalies, or classify new data points based on their similarity to previously seen data. The mathematical and statistical methods used in machine learning include linear algebra, calculus, probability theory, and optimization.

Data -> Learning Algorithm -> Understanding [Prediction/Hypothesis]

ML takes us all the way from data to understanding.
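
To make that flow a bit more concrete, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the toy data, the names `learn` and `hypothesis`, and the choice of model (a straight line through the origin, y = w * x). Real learning algorithms are far richer, but the shape is the same: data goes in, a function that can make predictions comes out.

```python
# Toy data: (input, known output) pairs. These numbers are invented for the example.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]


def learn(examples):
    """Learning algorithm: fit y = w * x by least squares and return a hypothesis."""
    # Closed-form least-squares weight for a line through the origin.
    w = sum(x * y for x, y in examples) / sum(x * x for x, _ in examples)

    def hypothesis(x):
        # The learned "understanding": a function that predicts y for a new x.
        return w * x

    return hypothesis


hypothesis = learn(data)   # data -> learning algorithm -> hypothesis
print(hypothesis(10.0))    # 20.0, a prediction for an input the algorithm never saw
```

Notice that nobody wrote "multiply by 2" anywhere in the code. The rule was recovered from the examples, which is the "without being explicitly programmed" part.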

Now, someone might ask ...

If Machine Learning goes from data to a hypothesis/understanding, then what's the difference between Machine Learning and data mining?

To be clear, data mining is the process of discovering patterns, correlations, and insights from large datasets using statistical and computational methods. It involves identifying hidden patterns and relationships in data that can be used to make informed decisions, improve business processes, or gain insights into human behavior. Data mining techniques include clustering, classification, association rule mining, and anomaly detection, among others. The ultimate goal of data mining is to extract valuable information from data and use it to improve business outcomes or gain new insights.

Confusing, huh?

Well, although the two concepts look almost the same, there are differences!

First, data mining is primarily concerned with discovering patterns, correlations, and insights in large datasets, whereas Machine Learning is focused on building predictive models and making decisions based on those models.

Secondly, data mining can be applied to both structured and unstructured data, whereas Machine Learning generally needs data that has been preprocessed into features, plus labels when the training is supervised.

Lastly, on applications: data mining is commonly used in areas such as marketing, finance, and healthcare, where the goal is to gain insights and improve decision-making. Machine Learning is used in a wide range of applications, including image recognition, natural language processing, speech recognition, and recommendation systems.

Is it clear now?
No worries if you still have doubts, just let me know!

Now, let's move on!

The most commonly recognized types of Machine Learning:

Supervised Learning: The algorithm is trained on labeled data, where the correct outputs are already known. The goal is to learn a mapping function that can predict the output for new input data.
Unsupervised Learning: The algorithm is trained on unlabeled data. The goal is to discover hidden patterns, structures, and relationships in the data without any predefined labels.
Semi-supervised Learning: The algorithm is trained on a small amount of labeled data combined with (usually much more) unlabeled data. The goal is to use the unlabeled data to get a more accurate model than the labeled data alone would give.
Reinforcement Learning: The algorithm learns by interacting with an environment and receiving feedback in the form of rewards or penalties. The goal is to learn a policy that maximizes the cumulative reward over time.

These types of Machine Learning are not mutually exclusive, and many real-world applications combine several of them.
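
Reinforcement Learning is probably the hardest of the four to picture, so here is a rough sketch of the "act, get a reward, update" loop. It uses an epsilon-greedy strategy on a two-action problem (often called a multi-armed bandit), which is only one simple way to do this. The `REWARDS` table, the function name `run_bandit`, and all the parameter values are invented for this example, and a real environment would normally be noisy and have state.

```python
import random

# Hypothetical environment: two actions with fixed rewards.
# Real environments are usually noisy and stateful; this one is neither.
REWARDS = [0.2, 1.0]


def run_bandit(steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-known action, sometimes explore."""
    rng = random.Random(seed)
    values = [0.0] * len(REWARDS)   # estimated reward per action
    counts = [0] * len(REWARDS)     # how often each action was tried
    total_reward = 0.0

    for step in range(steps):
        if step < len(REWARDS):
            action = step                                   # try every action once
        elif rng.random() < epsilon:
            action = rng.randrange(len(REWARDS))            # explore
        else:
            action = max(range(len(REWARDS)), key=lambda a: values[a])  # exploit

        reward = REWARDS[action]                            # feedback from the environment
        counts[action] += 1
        # Incremental average of the rewards seen so far for this action.
        values[action] += (reward - values[action]) / counts[action]
        total_reward += reward

    return values, total_reward


values, total_reward = run_bandit()
best_action = max(range(len(values)), key=lambda a: values[a])
print(best_action, values)   # 1 [0.2, 1.0]
```

The "policy" here is tiny: pick the action with the highest estimated value, with a little random exploration mixed in so the agent does not get stuck on the first thing it tried.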

Different Machine Learning tasks

1. Regression:
Input: known features and labeled data. Output: a numeric value.
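
As a small illustration of regression, here is a sketch of ordinary least squares with a single feature, written in plain Python. The function name `fit_line` and the training numbers are made up for this example; in practice you would reach for a library, and you would have many more data points.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up labeled data: one known feature (x) and its numeric label (y).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept   # numeric output for a new input
print(slope, intercept, prediction)    # 2.0 1.0 11.0
```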

2. Classification:
Input: known features and labeled data. Output: a category (cat/dog, spam/not spam).
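
For classification, one of the easiest algorithms to read is k-nearest neighbours: look at the k most similar labeled examples and take a majority vote. The sketch below assumes two invented features per message (number of links, number of exclamation marks) and a tiny made-up training set; the name `classify` and the choice of k = 3 are also just for illustration.

```python
import math
from collections import Counter

# Made-up training set: features are (number of links, number of "!" marks)
# in a message, and every example carries a known label.
training_data = [
    ((0, 1), "not spam"),
    ((1, 0), "not spam"),
    ((1, 1), "not spam"),
    ((7, 8), "spam"),
    ((8, 7), "spam"),
    ((9, 9), "spam"),
]


def classify(features, examples, k=3):
    """k-nearest neighbours: majority label among the k closest training examples."""
    nearest = sorted(examples, key=lambda ex: math.dist(features, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(classify((8, 8), training_data))   # spam
print(classify((0, 0), training_data))   # not spam
```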

3. Clustering:
Input: known features but unlabeled data. Output: a number of groups based on similarity.
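
A classic clustering algorithm is k-means, and a stripped-down version fits in a few lines. This sketch works on one-dimensional data only and starts from the first k points, which is a simplification I chose for readability (real implementations pick starting points more carefully). The function name `k_means_1d` and the data are made up.

```python
def k_means_1d(points, k, max_iters=100):
    """Plain k-means on 1-D data. Returns (labels, centroids)."""
    centroids = list(points[:k])   # assumption: start from the first k points
    labels = []
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(p - centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:   # nothing moved, so we have converged
            break
        centroids = new_centroids
    return labels, centroids


# Unlabeled, made-up data with two obvious groups.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
labels, centroids = k_means_1d(points, k=2)
print(labels)      # [0, 0, 0, 1, 1, 1]
print(centroids)   # [1.5, 9.5]
```

Note that no labels were given to the algorithm. The group numbers 0 and 1 are just names it came up with for "these points look alike".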

4. Anomaly Detection:
Identifying instances that differ significantly from the majority of the data (fraud detection, system health monitoring, and so on).
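
One simple way to sketch anomaly detection is the z-score: measure how many standard deviations each value sits from the mean and flag the ones that are far out. The function name `find_anomalies`, the threshold of 2.0, and the transaction amounts below are all assumptions made for this example. On small samples the z-score cannot get very large, so the threshold usually needs tuning.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Flag values whose z-score (distance from the mean in standard deviations) is large."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []   # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Made-up transaction amounts; one of them is clearly unusual.
amounts = [10, 11, 9, 10, 12, 10, 11, 95]
print(find_anomalies(amounts))   # [95]
```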

That's pretty much it for the fundamentals of Machine Learning. Stay tuned for the next part!

Goodbye!