DEV Community

Pejman Rezaei

Supervised vs. Unsupervised Learning

Machine Learning (ML) is a powerful tool that enables computers to learn from data and make predictions or decisions. But not all ML is the same—there are different types of learning, each suited for specific tasks. Two of the most common types are Supervised Learning and Unsupervised Learning. In this article, we’ll explore the differences between them, provide real-world examples, and walk through code snippets to help you understand how they work.


What is Supervised Learning?

Supervised Learning is a type of ML where the algorithm learns from labeled data. In other words, the data you provide to the model includes both input features and the correct output (labels). The goal is for the model to learn the relationship between the inputs and outputs so it can make accurate predictions on new, unseen data.

Real-World Examples of Supervised Learning

Email Spam Detection:

  • Input: The text of an email.

  • Output: A label indicating whether the email is "spam" or "not spam."

  • The model learns to classify emails based on labeled examples.

House Price Prediction:

  • Input: Features of a house (e.g., square footage, number of bedrooms, location).

  • Output: The price of the house.

  • The model learns to predict prices based on historical data.

Medical Diagnosis:

  • Input: Patient data (e.g., symptoms, test results).

  • Output: A diagnosis (e.g., "healthy" or "diabetic").

  • The model learns to diagnose conditions based on labeled medical records.
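
To make the spam detection example concrete, here is a minimal sketch of a supervised text classifier. The six emails and their labels are invented for illustration, and the choice of a bag-of-words vectorizer with a Naive Bayes classifier is an assumption, not the only way to approach the task.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labeled examples: each email text comes with its correct label
emails = [
    "win a free prize now",
    "free money claim your prize",
    "limited offer win cash now",
    "meeting agenda for monday",
    "please review the project report",
    "lunch with the team on friday",
]
labels = ["spam", "spam", "spam", "not spam", "not spam", "not spam"]

# Turn text into word counts, then fit a Naive Bayes classifier on the labels
spam_model = make_pipeline(CountVectorizer(), MultinomialNB())
spam_model.fit(emails, labels)

# Classify emails the model has not seen before
new_emails = ["claim your free cash prize", "project meeting on monday"]
predictions = spam_model.predict(new_emails)

for text, label in zip(new_emails, predictions):
    print(f"{text!r} -> {label}")
```

The labels are what make this supervised: the model only learns which words signal spam because every training email says which class it belongs to.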


What is Unsupervised Learning?

Unsupervised Learning is a type of ML where the algorithm learns from unlabeled data. Unlike supervised learning, there are no correct outputs provided. Instead, the model tries to find patterns, structures, or relationships in the data on its own.

Real-World Examples of Unsupervised Learning

Customer Segmentation:

  • Input: Customer data (e.g., age, purchase history, location).

  • Output: Groups of similar customers (e.g., "frequent buyers," "budget shoppers").

  • The model identifies clusters of customers with similar behaviors.

Anomaly Detection:

  • Input: Network traffic data.

  • Output: Identification of unusual patterns that could indicate a cyberattack.

  • The model detects outliers or anomalies in the data.

Market Basket Analysis:

  • Input: Transaction data from a grocery store.

  • Output: Groups of products frequently bought together (e.g., "bread and butter").

  • The model identifies associations between products.
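
As a rough illustration of market basket analysis, the sketch below counts how often each pair of products appears in the same transaction. The transactions are made up, and plain co-occurrence counting is a simplification; dedicated association-rule algorithms such as Apriori go further than this.

```python
from collections import Counter
from itertools import combinations

# Hypothetical grocery transactions (no labels, only the items bought)
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
    {"milk", "bread"},
]

# Count every unordered pair of items that shares a basket
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support = fraction of all transactions that contain both items
support = {pair: count / len(transactions) for pair, count in pair_counts.items()}

top_pair, top_support = max(support.items(), key=lambda item: item[1])
print(f"Most frequent pair: {top_pair}, support = {top_support:.2f}")
```

No one told the code that bread and butter belong together; the association comes out of the transaction data itself.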


Key Differences Between Supervised and Unsupervised Learning

| Aspect | Supervised Learning | Unsupervised Learning |
|---|---|---|
| Data | Labeled (inputs and outputs provided) | Unlabeled (only inputs provided) |
| Goal | Predict outcomes or classify data | Discover patterns or structures in data |
| Typical tasks | Classification, regression | Clustering, dimensionality reduction |
| Evaluation | Easier (predictions can be compared with known outputs) | Harder (no ground truth to compare against) |
| Use cases | Spam detection, price prediction | Customer segmentation, anomaly detection |
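
The difference in evaluation can be sketched in code. In the example below, the synthetic dataset from `make_blobs`, the logistic regression classifier, and the silhouette score are all illustrative choices rather than anything prescribed here. The supervised model is scored against known labels, while the clustering can only be scored with an internal measure of how well separated the groups are.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, silhouette_score
from sklearn.model_selection import train_test_split

# Synthetic data: X holds the inputs, y holds the true group of each point
X, y = make_blobs(n_samples=200, centers=3, cluster_std=1.0, random_state=42)

# Supervised: labels exist, so predictions can be checked against the truth
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = accuracy_score(y_test, clf.predict(X_test))

# Unsupervised: y is ignored, so quality is judged from the data alone.
# The silhouette score ranges from -1 (poor) to 1 (compact, separated clusters).
cluster_ids = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)
silhouette = silhouette_score(X, cluster_ids)

print(f"Accuracy (needs labels): {accuracy:.2f}")
print(f"Silhouette score (no labels needed): {silhouette:.2f}")
```

Accuracy answers "how often is the model right?", whereas the silhouette score only says how tidy the clusters look, which is why unsupervised results usually need extra human judgment.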

Code Examples

Let’s dive into some code to see how supervised and unsupervised learning work in practice. We’ll use Python and the popular Scikit-learn library.

Supervised Learning Example: Predicting House Prices

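We’ll use a simple linear regression model to predict house prices from a single feature, square footage. The dataset has only ten rows, so the resulting error is illustrative rather than a reliable estimate of model quality.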

# Import libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Create a sample dataset
data = {
    'SquareFootage': [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700],
    'Price': [245000, 312000, 279000, 308000, 199000, 219000, 405000, 324000, 319000, 255000]
}
df = pd.DataFrame(data)

# Features (X) and labels (y)
X = df[['SquareFootage']]
y = df['Price']

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse:.2f}")
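
A mean squared error is expressed in squared dollars, which is hard to read. The sketch below rebuilds the same model and reports a few more interpretable quantities: the root mean squared error (in dollars), the learned price per square foot, and a prediction for a new house. The 2,000 sq ft house is a hypothetical input chosen for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Same sample dataset as in the example above
df = pd.DataFrame({
    'SquareFootage': [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700],
    'Price': [245000, 312000, 279000, 308000, 199000, 219000, 405000, 324000, 319000, 255000],
})
X = df[['SquareFootage']]
y = df['Price']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)

# RMSE is the square root of the MSE, so it is in the same unit as the price
rmse = np.sqrt(mean_squared_error(y_test, model.predict(X_test)))

# The slope is the estimated change in price per extra square foot
slope = model.coef_[0]
intercept = model.intercept_

# Predict the price of a hypothetical 2,000 sq ft house
new_house = pd.DataFrame({'SquareFootage': [2000]})
predicted_price = model.predict(new_house)[0]

print(f"RMSE: {rmse:,.0f}")
print(f"Price per extra square foot: {slope:,.2f}")
print(f"Predicted price for 2,000 sq ft: {predicted_price:,.0f}")
```

With only two rows in the test set, these numbers will shift noticeably if the split changes, so treat them as a demonstration of the workflow.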

Unsupervised Learning Example: Customer Segmentation

We’ll use the K-Means clustering algorithm to group customers based on their age and spending habits.

# Import libraries
import pandas as pd
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Create a sample dataset
data = {
    'Age': [25, 34, 22, 45, 32, 38, 41, 29, 35, 27],
    'SpendingScore': [30, 85, 20, 90, 50, 75, 80, 40, 60, 55]
}
df = pd.DataFrame(data)

# Features (X)
X = df[['Age', 'SpendingScore']]

# Train a K-Means clustering model
# (n_init is set explicitly because its default differs between scikit-learn versions)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
df['Cluster'] = kmeans.fit_predict(X)

# Visualize the clusters
plt.scatter(df['Age'], df['SpendingScore'], c=df['Cluster'], cmap='viridis')
plt.xlabel('Age')
plt.ylabel('Spending Score')
plt.title('Customer Segmentation')
plt.show()
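
The example above fixes the number of clusters at three. One common way to sanity-check that choice is to try several values and compare silhouette scores, as sketched below. Standardizing the features first and searching k from 2 to 5 are assumptions made for this sketch; K-Means relies on distances, so features on different scales can otherwise dominate the result.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Same sample dataset as in the example above
df = pd.DataFrame({
    'Age': [25, 34, 22, 45, 32, 38, 41, 29, 35, 27],
    'SpendingScore': [30, 85, 20, 90, 50, 75, 80, 40, 60, 55],
})

# Put both features on a comparable scale (mean 0, standard deviation 1)
X_scaled = StandardScaler().fit_transform(df[['Age', 'SpendingScore']])

# Fit K-Means for several cluster counts and record the silhouette score
scores = {}
for k in range(2, 6):
    cluster_ids = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X_scaled)
    scores[k] = silhouette_score(X_scaled, cluster_ids)

# Higher silhouette means tighter, better separated clusters
best_k = max(scores, key=scores.get)

for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
print(f"Best k by silhouette: {best_k}")
```

On ten data points the scores are noisy, so the "best" k here is a hint at most. On a real customer dataset the same loop gives a more meaningful comparison.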

When to Use Supervised vs. Unsupervised Learning

Use Supervised Learning when:

  • You have labeled data.

  • You want to predict outcomes or classify data.

  • Examples: Predicting sales, classifying images, detecting fraud.

Use Unsupervised Learning when:

  • You have unlabeled data.

  • You want to discover hidden patterns or structures.

  • Examples: Grouping customers, reducing data dimensions, finding anomalies.
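
Reducing data dimensions is the one unsupervised task listed here without a code example, so here is a minimal sketch using Principal Component Analysis (PCA). The Iris dataset bundled with Scikit-learn and the choice of two components are assumptions made for illustration; the labels are deliberately discarded.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Load four numeric features per flower; the labels are ignored on purpose
X, _ = load_iris(return_X_y=True)

# Standardize so that no single feature dominates the variance
X_scaled = StandardScaler().fit_transform(X)

# Compress the four features into two new axes that keep most of the variance
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

explained = pca.explained_variance_ratio_.sum()
print(f"Shape before: {X.shape}, after: {X_2d.shape}")
print(f"Variance kept by 2 components: {explained:.1%}")
```

The two-column result can be plotted directly or passed to a clustering algorithm, which is a typical reason to reduce dimensions before exploring unlabeled data.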


Conclusion

Supervised and Unsupervised Learning are two fundamental approaches in Machine Learning, each with its own strengths and use cases. Supervised Learning is great for making predictions when you have labeled data, while Unsupervised Learning shines when you want to explore and uncover patterns in unlabeled data.

By understanding the differences and practicing with real-world examples (like the ones in this article), you’ll be well on your way to mastering these essential ML techniques. If you have any questions or want to share your own experiences, feel free to leave a comment below.
