Nilavukkarasan R

Perceptron: The Foundation of Modern AI

"We now have a new kind of programming paradigm. Instead of telling the computer what to do, we show it examples of what we want, and it figures out how to do it."

-- Michael Nielsen

Back to the Beginning

My first encounter with AI was in college. I memorised more than I understood, and none of what I memorised appeared in the exam. So I wrote whatever I could, and I'm sure the professor didn't understand my answers either.

Fast forward twenty years of building software systems. In all that time, I barely touched AI/ML. Sure, I designed applications that integrated with black-box AI/ML systems for OCR, but that was it.

Then ChatGPT happened.

Like many of you, I started experimenting. RAG chatbots, embedding models, agents, agentic patterns. I was building with these tools, but something bothered me. I didn't understand how any of it actually worked.

So I went back. Not to the latest paper, but to the very beginning. To the first artificial neuron.

Five Lines That Changed Programming

A perceptron takes inputs, multiplies each by a weight, adds them up, and makes a decision.

def perceptron(inputs, weights, bias):
    # weight each input by how much it matters, then sum
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    weighted_sum += bias
    # fire (1) only if the sum clears the threshold set by the bias
    return 1 if weighted_sum > 0 else 0

That's it. Each input has a weight (how important is this input?). We sum them up, add a bias, and if the result is positive, output 1. Otherwise, output 0.

Now consider the AND logic gate:

Input: [0, 0] → Output: 0
Input: [0, 1] → Output: 0
Input: [1, 0] → Output: 0
Input: [1, 1] → Output: 1

The traditional way? Write an if/else. The perceptron way? Show it examples and let it figure out the weights.

With learned weights [0.5, 0.5] and bias of −0.7, the perceptron solves this:

  • [0, 0]: 0×0.5 + 0×0.5 − 0.7 = −0.7 → Output: 0 ✓
  • [0, 1]: 0×0.5 + 1×0.5 − 0.7 = −0.2 → Output: 0 ✓
  • [1, 0]: 1×0.5 + 0×0.5 − 0.7 = −0.2 → Output: 0 ✓
  • [1, 1]: 1×0.5 + 1×0.5 − 0.7 = 0.3 → Output: 1 ✓

The if/else is hardcoded. The perceptron learned these numbers from examples.
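You can check that arithmetic yourself by plugging the learned weights into the perceptron function from earlier:

```python
def perceptron(inputs, weights, bias):
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum + bias > 0 else 0

weights, bias = [0.5, 0.5], -0.7
for inputs in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(inputs, perceptron(inputs, weights, bias))
# [0, 0] 0
# [0, 1] 0
# [1, 0] 0
# [1, 1] 1
```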

How? It starts with random weights. It feeds in [1,1], gets the wrong answer, and nudges the weights a little in the direction that would have been correct. Feeds in [0,1], checks again, nudges again. After a few passes through all four examples, the weights settle at values that get everything right. That's the entire learning algorithm. Try, fail, adjust.
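That try-fail-adjust loop is small enough to sketch in full. One simplification from the description above: I start the weights at zero rather than random values, purely so the run is reproducible; the algorithm works either way.

```python
def perceptron(inputs, weights, bias):
    # the same five-line neuron as above
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum + bias > 0 else 0

def train_perceptron(examples, epochs=25, lr=0.1):
    """Try, fail, adjust: the perceptron learning rule."""
    n = len(examples[0][0])
    weights, bias = [0.0] * n, 0.0  # zeros instead of random, for reproducibility
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - perceptron(inputs, weights, bias)  # -1, 0, or +1
            # nudge each weight toward the answer it should have given
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias

and_examples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_perceptron(and_examples)
print([perceptron(x, weights, bias) for x, _ in and_examples])  # [0, 0, 0, 1]
```

Notice there's no calculus here. Each mistake nudges the weights by a fixed step in the right direction, and for linearly separable data that's guaranteed to converge.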

That shift, from writing rules to showing examples, is what Nielsen meant. And it's the same shift that powers every modern AI system.

It Draws a Line

A perceptron draws a line. The weights control the angle. The bias controls where it sits. Everything on one side is class 0, everything on the other is class 1. Training just means nudging the line until it separates the classes correctly.

For AND, the line puts [1,1] on one side and everything else on the other. Easy. For OR, it puts [0,0] alone on one side. Also easy.

For XOR (output 1 when inputs differ, 0 when they match), the class 1 points sit diagonally opposite each other. Try drawing one straight line that separates them. You can't. It's geometrically impossible.

That's the perceptron's entire story. If your problem lives on opposite sides of a line, it works beautifully. If not, no amount of training will help.
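You can watch that failure numerically. This sketch runs the same learning rule on XOR and counts misclassified points after each pass; the count never reaches zero, no matter how long you train:

```python
def perceptron(inputs, weights, bias):
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum + bias > 0 else 0

xor_examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

weights, bias, lr = [0.0, 0.0], 0.0, 0.1
history = []
for epoch in range(50):
    for inputs, target in xor_examples:
        error = target - perceptron(inputs, weights, bias)
        weights = [w + lr * error * x for w, x in zip(weights, inputs)]
        bias += lr * error
    # count misclassified points after this pass
    history.append(sum(perceptron(x, weights, bias) != t for x, t in xor_examples))

print(min(history))  # never 0: no straight line separates XOR's classes
```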

See It

Open the playground and train on AND. Watch the red dashed line settle into place, cleanly separating the orange dots from the blue ones. The error count drops to zero. Done.

Now switch to XOR. The line thrashes around, never settling. The error count never hits zero. The perceptron keeps trying, keeps adjusting, and keeps failing.

That contrast is the concept. Stare at it until it sticks.

AND vs XOR: the line settles for AND, fails for XOR

What's Next

Every neuron in GPT-4, in every transformer you've ever used, is built on this same idea: weighted inputs, summed, passed through an activation. The perceptron isn't history. It's the foundation.

But as you've just seen, there's one simple logic gate it cannot learn. No matter how you adjust the weights, a single straight line can never solve XOR. Cracking that problem required an idea that changed everything, and that's where this series goes next.


References:
Nielsen, M. (2015). Neural Networks and Deep Learning. neuralnetworksanddeeplearning.com

Series: Learning AI from First Principles | Code: GitHub
