Hands-on practice on machine learning

Practical machine learning.

Part 0. Prerequisites

Basic knowledge of the following is helpful:

  1. Python
  2. NumPy
  3. Pandas
  4. Matplotlib
  5. Scikit-learn

Part 1. Data Preprocessing

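Data preprocessing is an important step in the data mining process: raw data usually has to be cleaned, encoded, and scaled before a model can learn from it. The usual steps are: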

  • Import the libraries.
  • Get the data.
  • Check for missing or null data.
  • Convert categorical data into numbers.
  • Split into train and test.
  • Feature scaling.
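
The steps above can be sketched in a few lines of pandas and scikit-learn. This is a minimal sketch, not the contents of the notebook: the table, its column names (`country`, `age`, `salary`, `purchased`), and the choice of mean imputation and one-hot encoding are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Get the data: a tiny made-up table standing in for a real CSV file.
df = pd.DataFrame({
    "country": ["France", "Spain", "Germany", "Spain",
                "Germany", "France", "Spain", "France"],
    "age": [44.0, 27.0, 30.0, 38.0, np.nan, 35.0, 52.0, 48.0],
    "salary": [72000.0, 48000.0, 54000.0, 61000.0,
               58000.0, np.nan, 83000.0, 79000.0],
    "purchased": ["no", "yes", "no", "no", "yes", "yes", "no", "yes"],
})

# Check for missing or null data.
missing_per_column = df.isna().sum()
print(missing_per_column)

X = df.drop(columns="purchased")
y = (df["purchased"] == "yes").astype(int)  # encode the label as 0/1

# Split into train and test before fitting any transformer, so that
# statistics from the test rows do not leak into the preprocessing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

numeric = ["age", "salary"]
categorical = ["country"]

preprocess = ColumnTransformer([
    # Fill missing numbers with the training-column mean, then scale.
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="mean")),
        ("scale", StandardScaler()),
    ]), numeric),
    # Convert categorical data into numbers (one column per category).
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

# Fit on the training rows only, then apply the same transform to the test rows.
X_train_prepared = preprocess.fit_transform(X_train)
X_test_prepared = preprocess.transform(X_test)
print(X_train_prepared.shape, X_test_prepared.shape)
```

Fitting the imputer, scaler, and encoder on the training split only is a deliberate choice: the test set then behaves like genuinely unseen data.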

For data preprocessing, use this Jupyter notebook.

Part 2. Supervised Learning

Supervised learning trains a model on data that contains both input variables and a known output variable; the algorithm learns a mapping from the inputs to the output.

Supervised learning problems fall into two categories:

  • Classification: A classification problem is one where the output variable is a category, such as "disease" or "no disease".
  • Regression: A regression problem is one where the output variable is a real value, such as a price.
  1. Classification
    Classification has a wide variety of applications, from healthcare to marketing.
    Learn how to implement the following classification models:

  2. Regression
    Regression techniques range from linear regression to random forests.
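
As a rough illustration of both categories, the sketch below fits one classifier and two regressors on datasets bundled with scikit-learn. The datasets (`load_breast_cancer`, `load_diabetes`), the choice of logistic regression as the classifier, and the hyperparameters are assumptions picked for the example, not the models the article links to.

```python
from sklearn.datasets import load_breast_cancer, load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import accuracy_score, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Classification: the target is a category (malignant vs. benign tumour).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
classifier = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
classifier.fit(X_train, y_train)
accuracy = accuracy_score(y_test, classifier.predict(X_test))
print(f"classification accuracy: {accuracy:.3f}")

# Regression: the target is a real value (a disease progression score).
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
regressors = {
    "linear regression": LinearRegression(),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
r2_scores = {}
for name, model in regressors.items():
    model.fit(X_train, y_train)
    r2_scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name} R^2: {r2_scores[name]:.3f}")
```

Note the different metrics: accuracy counts correct category predictions, while R^2 measures how much of the variance in a continuous target the model explains.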

Part 3. Unsupervised Learning

Unsupervised learning is where only input data is available, with no corresponding output variable, so the algorithm has to find structure in the data on its own.

Unsupervised learning has two categories of algorithms:

  • Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior.
  • Association: An association rule learning problem is where you want to discover rules that describe a large portion of your data, such as "people who buy X also tend to buy Y".
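
To make the "people who buy X also tend to buy Y" idea concrete, here is a small plain-Python sketch that scores pairwise rules by support and confidence. The baskets and the two thresholds (0.4 and 0.7) are invented for illustration; real projects would typically use a dedicated algorithm such as Apriori rather than this brute-force loop.

```python
from itertools import permutations

# Hypothetical shopping baskets, invented for illustration.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def support(items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    return support({antecedent, consequent}) / support({antecedent})


all_items = sorted(set().union(*baskets))

# Keep rules "x -> y" that are both frequent enough and reliable enough.
rules = []
for x, y in permutations(all_items, 2):
    if support({x, y}) >= 0.4 and confidence(x, y) >= 0.7:
        rules.append((x, y, support({x, y}), confidence(x, y)))

for x, y, sup, conf in rules:
    print(f"{x} -> {y}: support={sup:.2f}, confidence={conf:.2f}")
```

Support says how often the pair appears at all; confidence says how often the rule holds when its left-hand side is present. A rule needs both to be useful.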

3.1 Clustering
Clustering is similar to classification, but the basis is different. In clustering, you don't know what you are looking for, and you are trying to identify some segments or clusters in your data.
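
A minimal sketch of this idea, assuming k-means as the clustering model and synthetic 2-D points from `make_blobs` as the data (both are choices made for the example). The true group labels are thrown away, so the algorithm only sees the inputs.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic points drawn around three centres; the true labels are discarded.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)

# k-means needs the number of clusters up front (assumed to be 3 here).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

# With no ground truth, quality is judged from the data itself:
# the silhouette score is higher when clusters are compact and well separated.
score = silhouette_score(X, labels)
print("cluster sizes:", [int((labels == k).sum()) for k in range(3)])
print("silhouette score:", round(float(score), 3))
```

In practice the number of clusters is rarely known in advance; a common approach is to try several values of `n_clusters` and compare their silhouette scores.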

Learn how to implement the following machine learning clustering models:

The main challenge is choosing the right estimator for a given problem.
Scikit-learn's "Choosing the right estimator" map is a useful starting point.
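
Beyond a flowchart, one practical way to pick an estimator is to compare a few candidates with cross-validation. The sketch below does this on the bundled iris dataset; the three candidate models and the 5-fold setting are assumptions chosen for the example, and the ranking will differ from one dataset to another.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Candidate estimators; scaling is placed inside the pipeline so it is
# re-fitted on the training part of every cross-validation fold.
candidates = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "k-nearest neighbours": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

mean_scores = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)  # accuracy on 5 folds
    mean_scores[name] = scores.mean()
    print(f"{name}: mean accuracy {mean_scores[name]:.3f}")

best = max(mean_scores, key=mean_scores.get)
print("best on this data:", best)
```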

To make the world a better place, use data wisely.

Happy coding and have a great time learning how to make machines smarter.