Charles

Machine Learning Basics: What is ML? Supervised vs Unsupervised, Features vs Labels

In this article, we will cover:

  • What ML really is
  • The difference between supervised and unsupervised learning
  • What features and labels are – and why they matter

What is machine learning?

Machine learning (ML) is the subset of artificial intelligence (AI) focused on algorithms that can learn the patterns of training data and, subsequently, make accurate inferences about new data. This pattern recognition ability enables machine learning models to make decisions or predictions without explicit, hard-coded instructions.

Examples of Machine Learning.

1. Virtual and voice assistants.
ML powers popular virtual assistants like Amazon Alexa and Apple Siri. It enables speech recognition, natural language processing (NLP), and text-to-speech conversion. When you ask a question, ML models infer your intent from the transcribed speech, and the assistant can then look up relevant answers or draw on similar past interactions to personalize its response.

2. Email Filtering and Management.
ML algorithms in Gmail automatically categorize emails into Primary, Social, and Promotions tabs while detecting and moving spam to the spam folder. Beyond basic rules, ML tools classify incoming emails, route them to the right team members, extract attachments, and enable automated personalized replies.

3. Transportation and Navigation.

Machine Learning has transformed modern transportation in several ways:

  • Google Maps uses ML to analyze real-time traffic conditions, calculate the fastest routes, suggest nearby places to explore, and provide accurate arrival time predictions.

  • Ride-sharing apps like Uber and Bolt apply ML to match riders with drivers, dynamically set pricing (surge pricing), optimize routes based on live traffic, and predict accurate ETAs.

  • Self-driving cars (e.g., Tesla) rely heavily on computer vision models, which are typically trained on large amounts of labeled data (supervised learning). These systems process data from cameras and sensors in real time to understand their surroundings and make instant driving decisions.

Types of machine learning

Machine learning is commonly grouped into a few learning paradigms. The two covered here, supervised learning and unsupervised learning, differ in the type of data they use and the objective they aim to achieve.

1. Supervised Learning

Supervised learning trains a model using labeled data — where every input example is paired with the correct output (label). The goal is to learn the mapping between inputs and outputs so the model can accurately predict outcomes on new, unseen data.

Common Tasks:

  • Classification — Predict discrete categories (e.g., spam/not spam, cat/dog, approve/reject loan)

  • Regression — Predict continuous values (e.g., house price, temperature, sales forecast)

How it works:
In supervised learning, the model learns from examples where the answers are already known. It is given inputs (features) together with the correct outputs (labels), and over time it identifies patterns in the data. As it trains, it continuously adjusts itself to reduce the difference between its predictions and the actual answers.
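The "adjusts itself to reduce the difference" step can be sketched in a few lines. This is a minimal illustration, not a production method: it assumes a one-feature linear model trained by gradient descent on mean squared error, and the toy data, the `train` function, and its parameters are all made up for the example.

```python
# Toy labeled data: each input x (feature) is paired with its correct output y (label).
# The labels follow y = 2x + 1, which the model does not know in advance.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def train(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y ≈ w*x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Difference between the model's prediction and the known answer
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Nudge the parameters in the direction that shrinks the error
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

w, b = train(xs, ys)
prediction = w * 10.0 + b  # prediction for an input the model never saw
print(w, b, prediction)    # w and b end up close to 2 and 1
```

Real projects would normally use a library such as scikit-learn rather than a hand-written loop, but the idea is the same: compare predictions with labels, then adjust.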

Real-world examples:

  • Spam detection
  • Image classification
  • Credit risk scoring

Analogy:
Think of a student learning with a teacher. The teacher shows examples and clearly labels them — “this is a cat,” “this is a dog.” Over time, the student begins to recognize the differences and can correctly identify new animals on their own.

2. Unsupervised Learning

Unsupervised learning works with unlabeled data. The model must discover hidden patterns, structures, or groupings on its own — without any “correct answers” provided.

Common tasks:

  • Clustering — grouping similar data points together (e.g., customer segmentation)

  • Association — finding relationships in data (e.g., people who buy X also buy Y)

  • Dimensionality reduction — simplifying data while keeping the most important information
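To make clustering concrete, here is a rough sketch of k-means, one common clustering algorithm (the article does not name a specific one, so this choice is an assumption). The customer data, the starting centroids, and the `kmeans` helper are hypothetical. Note that no labels appear anywhere: the groups come from the data alone.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        centroids = new_centroids
    return assignments, centroids

# Hypothetical customers: (store visits per month, average spend per visit)
customers = [(1, 20), (2, 25), (1, 22), (9, 200), (10, 210), (8, 190)]

# Start from two of the points as initial centroids (an arbitrary choice)
assignments, centroids = kmeans(customers, [customers[0], customers[3]])
print(assignments)  # → [0, 0, 0, 1, 1, 1]
```

The algorithm separates occasional low spenders from frequent high spenders, but it is up to you to interpret what each group means.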

Real-world examples:

  • Customer segmentation in retail (grouping shoppers based on buying habits)

  • Fraud detection in mobile money or banking (flagging unusual transactions)

  • Product recommendations on e-commerce sites (suggesting items similar to what you’ve viewed)

  • Music or movie suggestions based on what you like (Spotify, Prime Video)

Supervised vs Unsupervised Learning

| Aspect | Supervised Learning | Unsupervised Learning |
|---|---|---|
| Data used | Labeled (features + answers) | Unlabeled (just features, no answers) |
| Goal | Predict an output / category | Find hidden patterns or groupings |
| Task types | Classification & regression | Clustering, association, dimensionality reduction |
| How hard to evaluate | Easy: you have ground truth to compare | Trickier: no "right answer" to check against |
| Real-world examples | Spam detection, price prediction | Customer segments, fraud detection |
| Complexity | Generally simpler | More complex (no teacher to guide) |
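The evaluation point is easy to see in code. As a small sketch with made-up predictions and labels, a supervised model can be scored by counting how often it matches the ground truth (the `accuracy` helper is written here for illustration):

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)

# Hypothetical ground truth and model output for five emails
true_labels = ["spam", "not spam", "spam", "not spam", "spam"]
predicted = ["spam", "not spam", "not spam", "not spam", "spam"]

score = accuracy(predicted, true_labels)
print(score)  # → 0.8
```

An unsupervised model has no `true_labels` list to compare against, which is why judging its output usually takes indirect measures or human review.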

Key Takeaway:

  • Use Supervised Learning when you have labeled historical data and want to make predictions.

  • Use Unsupervised Learning when you have lots of raw data and want to discover insights or patterns you didn’t already know.

Modern systems often combine both. For example, many Large Language Models (LLMs) use self‑supervised learning during pre‑training, followed by supervised fine‑tuning and RLHF (reinforcement learning from human feedback).

Features vs Labels

If you're doing supervised learning, you'll run into two terms constantly: features and labels. Here's what they actually mean.

What is a Feature?
A feature is any piece of information you feed the model: a clue that helps it make a prediction. Features are also called independent variables, predictors, or attributes.

Examples of Features:

  • In house price prediction: square footage, number of bedrooms
  • In spam detection: length of email, number of capital letters

Features can be numerical (age, price), categorical (gender, color), or text-based.
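Most models expect numbers, so categorical features are usually converted before training. One common approach is one-hot encoding, sketched below. The `one_hot` helper, the category list, and the sample record are hypothetical and only illustrate the idea.

```python
def one_hot(value, categories):
    """Encode a categorical value as a list of 0/1 flags, one per known category."""
    return [1 if value == c else 0 for c in categories]

colors = ["red", "green", "blue"]  # hypothetical list of known categories

# A raw record mixing a numerical and a categorical feature
record = {"age": 34, "color": "green"}

# The numerical feature passes through; the categorical one becomes flags
feature_vector = [record["age"]] + one_hot(record["color"], colors)
print(feature_vector)  # → [34, 0, 1, 0]
```

Text-based features need their own conversion (for example word counts), which is beyond this sketch.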

What is a Label?

A label is the answer the model tries to guess: the output, or correct answer, for a given example. It is also called the target or dependent variable.

Examples of Labels:

  • House price prediction --> Actual sale price (Kshs)
  • Spam detection --> “Spam” or “Not Spam”

Labels exist only in supervised learning: they are the ground truth that the model's predictions are compared against during training.

Features vs Labels – Quick Comparison

| Aspect | Features (the inputs) | Label (the answer) |
|---|---|---|
| What it is | What the model uses to learn | What the model tries to guess |
| Other names | Independent variables, predictors | Target variable, dependent variable |
| Do you always have it? | Yes, in any dataset | Only in supervised learning |
| House price example | Size, bedrooms, location | The price tag |

Key Takeaway:

  • Features = clues. Label = the answer.
  • When preparing data for a supervised model, split it into X (features) and y (label).
  • Garbage in --> garbage out: bad features or wrong labels will ruin your model.
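Splitting a dataset into X and y might look like the sketch below. The house records, column names, and prices are invented for illustration; in practice the rows would come from a file or database, often via a library such as pandas.

```python
# Hypothetical house records; prices are made-up values in Kshs
rows = [
    {"sqft": 1200, "bedrooms": 2, "price": 8_500_000},
    {"sqft": 1800, "bedrooms": 3, "price": 12_000_000},
    {"sqft": 2400, "bedrooms": 4, "price": 17_500_000},
]

feature_names = ["sqft", "bedrooms"]  # the clues
label_name = "price"                  # the answer

# X: one list of feature values per example
X = [[row[name] for name in feature_names] for row in rows]
# y: the label for each example, in the same order as X
y = [row[label_name] for row in rows]

print(X)  # → [[1200, 2], [1800, 3], [2400, 4]]
print(y)  # → [8500000, 12000000, 17500000]
```

Keeping X and y in the same row order matters: each label must stay paired with the features it belongs to.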

Conclusion

  • Machine Learning lets computers learn from data without hard‑coded rules.
  • Supervised learning uses labeled data to predict outcomes (spam detection, prices).
  • Unsupervised learning finds hidden patterns in unlabeled data (customer segments, fraud).
  • Features are the clues you feed the model. Labels are the answers you want to predict.
