Shubham Thakur
What Chapter 1 of The Hundred-Page Machine Learning Book Taught Me

Introduction

I recently finished Chapter 1 of Andriy Burkov's The Hundred-Page Machine Learning Book. This chapter wasn't about building models or writing code. Instead, it focused on something more important: building the right mental model of what machine learning actually is.

Here’s what I learned from just the first chapter.

  1. What Machine Learning Really Means

At its core, machine learning is a simple process:

a) Gather data

b) Build a statistical model from that data

c) Use the model to solve a real-world problem

The “learning” part means we don’t hard-code rules. We let the algorithm find patterns from examples.
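As a toy illustration of that difference (the "suspicious word count" feature and all numbers below are invented, not from the book), compare a hand-coded rule with one derived from labeled examples:

```python
# Hard-coded rule: a human picks the threshold by intuition.
def hardcoded_is_spam(n_suspicious_words):
    return n_suspicious_words > 5  # arbitrary, chosen by hand

# "Learned" rule: derive the threshold from labeled examples by
# placing it midway between the average of each class.
def learn_threshold(examples):
    spam = [x for x, label in examples if label == "spam"]
    ham = [x for x, label in examples if label == "ham"]
    return (sum(spam) / len(spam) + sum(ham) / len(ham)) / 2

data = [(9, "spam"), (11, "spam"), (1, "ham"), (3, "ham")]
threshold = learn_threshold(data)  # found from the data, not hand-coded
```

The point isn't the specific rule; it's that the threshold comes from the examples, so it changes when the data changes.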

  2. Types of Learning

Supervised learning: Data comes with labels (x, y).

Unsupervised learning: Data has no labels, only inputs x.

Semi-supervised learning: A mix of a few labeled examples and many unlabeled ones.

Reinforcement learning: An agent interacts with an environment and learns from rewards.

In practice, supervised learning is the most commonly used, so the chapter focuses mainly on that.
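The difference between the first three settings is easiest to see in the shape of the data itself. A quick sketch (all values invented):

```python
# Supervised: every input x comes paired with a label y.
supervised = [([2.0, 0.5], 1), ([1.1, 0.3], 0)]

# Unsupervised: inputs only, no labels at all.
unsupervised = [[2.0, 0.5], [1.1, 0.3]]

# Semi-supervised: a few labels, many unlabeled (None marks "no label").
semi_supervised = [([2.0, 0.5], 1), ([1.1, 0.3], None)]
```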

  3. Features and Labels

In supervised learning, each example is represented as a feature vector.
Each feature is just a number describing some aspect of the object (for example, an email, an image, or a person).

The label is what we want to predict: a class (like spam / not spam) or a number.

So the learning problem becomes:
Learn a function that maps feature vectors to labels and works well on new data.
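Concretely, one example might look like this in code (the features and counts here are made up for illustration):

```python
# A feature vector for one email: each entry is a number describing
# some aspect of the email.
x = [3,   # number of links
     12,  # number of ALL-CAPS words
     1]   # contains the word "free"? (1 = yes, 0 = no)

# The label we want to predict for this email.
y = 1  # 1 = spam, 0 = not spam
```

The learning problem is then: find a function f such that f(x) ≈ y, not just on these emails but on new ones too.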

  4. Decision Boundaries and SVM

To explain classification, the chapter introduces Support Vector Machines (SVM).

SVM views each example as a point in a high-dimensional space and tries to separate classes using a decision boundary, defined by:

wx − b = 0

Prediction is done using:

sign(wx − b)

In simple terms, the model checks which side of the boundary a point lies on.
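A minimal sketch of that prediction rule, with made-up weights and points:

```python
def predict(w, x, b):
    """SVM-style prediction: which side of the hyperplane wx - b = 0
    does the point x fall on?"""
    score = sum(wi * xi for wi, xi in zip(w, x)) - b
    return 1 if score >= 0 else -1

w = [1.0, -1.0]  # toy weight vector
b = 0.0          # toy bias

predict(w, [3.0, 1.0], b)  # positive side of the boundary
predict(w, [1.0, 3.0], b)  # negative side of the boundary
```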

  5. Why Margin Matters

SVM doesn’t just look for any separating boundary.
It looks for the one with the largest margin — the maximum distance from the closest points of both classes.

Why?
Because a larger margin usually means better generalization, i.e., better performance on unseen data.

This leads to an optimization problem:

Minimize ||w||
Subject to: yᵢ(wxᵢ − b) ≥ 1

Training, in this view, is simply solving this optimization problem.
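To see why minimizing ||w|| maximizes the margin: under the constraints yᵢ(wxᵢ − b) ≥ 1, the width of the margin works out to 2/||w||, so a smaller ||w|| means a wider margin. A quick sketch (the weight vectors are arbitrary toy values):

```python
import math

def margin(w):
    """Margin width 2 / ||w|| for a boundary satisfying
    y_i(w x_i - b) >= 1 on the training data."""
    return 2 / math.sqrt(sum(wi * wi for wi in w))

margin([2.0, 0.0])  # ||w|| = 2   -> narrow margin
margin([0.5, 0.0])  # ||w|| = 0.5 -> wide margin
```

So of all boundaries that separate the data, the one with the smallest ||w|| is the one SVM picks.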

  6. Why Models Work on New Data

The chapter also explains why machine learning models can work on new examples.
If training data and future data come from the same distribution, new examples are likely to appear near old ones. A well-chosen decision boundary will still separate them correctly most of the time.

More data usually means a better approximation of reality — and fewer surprises.
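This intuition can be simulated with a toy distribution (the Gaussians and the midpoint rule below are my own illustration, not from the book): draw train and test sets from the same distribution, fit a boundary on the train set, and check that it still works on the test set.

```python
import random

random.seed(0)

def sample(n):
    """Draw n points from each of two well-separated classes."""
    data = [(random.gauss(0, 1), "a") for _ in range(n)]
    data += [(random.gauss(10, 1), "b") for _ in range(n)]
    return data

train, test = sample(50), sample(50)

# Fit a threshold on the training data: midpoint of the class means.
a = [x for x, y in train if y == "a"]
b = [x for x, y in train if y == "b"]
threshold = (sum(a) / len(a) + sum(b) / len(b)) / 2

# Because test points come from the same distribution, they land near
# the training points, and the learned boundary still separates them.
correct = sum((x > threshold) == (y == "b") for x, y in test)
accuracy = correct / len(test)
```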
