
Nilavukkarasan R


Perceptron: The Foundation of Modern AI

"We now have a new kind of programming paradigm. Instead of telling the computer what to do, we show it examples of what we want, and it figures out how to do it."

-- Michael Nielsen

My Journey Back to the Beginning

My first encounter with Artificial Intelligence was during my college days. I had memorised more than I understood, but none of what I studied appeared in the exam, so I wrote whatever I could, and I’m quite certain the professor didn’t understand my answers either.

Fast forward 20 years of building software systems. In all that time, I barely touched AI/ML. Sure, I designed applications that integrated with black-box AI/ML systems for OCR, but that was it.

Then ChatGPT happened.

Like many of you, I started with the ChatGPT web interface, learning prompt engineering. Then I began experimenting—building RAG chatbots, exploring chunking strategies, testing different embedding models and retrieval techniques. I experimented with agents, explored MCPs and agentic patterns. I was learning these tools, building with them—but something bothered me.

I didn't understand how any of it actually worked.

So I decided to go back. Not to the latest paper or the newest framework, but to the very beginning. To the first artificial neuron.

Why This Matters

You might wonder why you should bother learning about a decades-old concept when we have ChatGPT, Claude, and countless AI tools at our fingertips.

Here's why: every single neuron in GPT-4, in every transformer, in every neural network you've ever used, works on the same basic principle as that first artificial neuron: a weighted sum of inputs passed through an activation function. The perceptron isn't history; it's the foundation.

Understanding it means understanding what's actually happening when you call an LLM API. It means knowing why things work, not just that they work.

If you've felt this same curiosity and want to truly understand the foundations beneath the tools we use every day, join me. Learning from first principles, one concept at a time.

From Biology to Silicon

In 1943, Warren McCulloch and Walter Pitts created the first mathematical model of a neuron. But it was Frank Rosenblatt in 1958 who built the perceptron, the first artificial neuron that could actually learn.

Rosenblatt's breakthrough came from mimicking nature. He studied how biological neurons work and translated that logic into mathematics. Here's how they compare:

Biological Neuron:

Dendrites  →  Cell Body  →  Threshold Check  →  Axon
(receive)     (process)     (fire if met)        (output)

Artificial Neuron (Perceptron):

Inputs     →  Weighted Sum  →  Threshold Check  →  Output
x₁,x₂,...     Σ(xᵢ × wᵢ)       (≥ threshold?)       0 or 1

The key insight: Learning happens by adjusting the weights.

How a Perceptron Works

Let's break it down to basics.

A perceptron takes inputs, multiplies each by a weight, adds them up, and makes a decision.

def perceptron_forward(inputs, weights, bias):
    # Multiply each input by its weight
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))

    # Add bias (shifts the decision boundary)
    weighted_sum += bias

    # Activation: output 1 if positive, 0 otherwise
    return 1 if weighted_sum > 0 else 0

That's it. That's the core of a perceptron.

What's happening:

  1. Each input has a weight (how important is this input?)
  2. We sum up: (input₁ × weight₁) + (input₂ × weight₂) + ... + bias
  3. If the sum is positive, output 1. Otherwise, output 0.

Example: AND gate

Let's say we want to implement the AND logic gate:

  • Input: [0, 0] → Output: 0
  • Input: [0, 1] → Output: 0
  • Input: [1, 0] → Output: 0
  • Input: [1, 1] → Output: 1

Traditional way (if/else):

def and_gate_traditional(input1, input2):
    if input1 == 1 and input2 == 1:
        return 1
    else:
        return 0

Perceptron way (learned weights):

With the right weights ([0.5, 0.5] and bias -0.7), the perceptron can solve this:

  • [0, 0]: 0×0.5 + 0×0.5 - 0.7 = -0.7 → Output: 0 ✓
  • [0, 1]: 0×0.5 + 1×0.5 - 0.7 = -0.2 → Output: 0 ✓
  • [1, 0]: 1×0.5 + 0×0.5 - 0.7 = -0.2 → Output: 0 ✓
  • [1, 1]: 1×0.5 + 1×0.5 - 0.7 = 0.3 → Output: 1 ✓
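The arithmetic above is easy to check in code. Here is a minimal sketch that runs all four inputs through the perceptron with those same weights and bias. The function is repeated so the block runs on its own, and the names `AND_WEIGHTS`, `AND_BIAS`, and `and_outputs` are just illustrative choices of mine.

```python
def perceptron_forward(inputs, weights, bias):
    # Weighted sum of the inputs plus the bias, then a step activation
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if weighted_sum > 0 else 0


# Hand-picked values from the worked example (not learned here)
AND_WEIGHTS = [0.5, 0.5]
AND_BIAS = -0.7

# Evaluate every combination of two binary inputs
and_outputs = {
    (x1, x2): perceptron_forward([x1, x2], AND_WEIGHTS, AND_BIAS)
    for x1 in (0, 1)
    for x2 in (0, 1)
}

for inputs, output in and_outputs.items():
    print(inputs, "->", output)
```

Only the pair (1, 1) pushes the weighted sum above zero, so it is the only input that produces a 1.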

The difference? The traditional way is hardcoded. The perceptron learns these weights from examples. That's the new programming paradigm Nielsen talked about.

What Clicked for Me

After implementing and testing the perceptron, here's what became clear:

Weights are just numbers. There's no magic. A weight of 0.5 means "this input matters half as much as an input with weight 1.0."

The bias shifts the boundary. Without bias, the decision boundary always goes through the origin. Bias lets it move anywhere.

Learning is adjustment. When the perceptron makes a mistake, we adjust the weights. That's learning.
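To make "adjustment" concrete, here is a minimal sketch of the classic perceptron update rule: for each mistake, nudge every weight by `learning_rate × error × input` and the bias by `learning_rate × error`. This is my own illustrative version, not necessarily how `perceptron.py` in the repository does it; `train_perceptron`, `learning_rate`, `max_epochs`, and `AND_SAMPLES` are assumed names and defaults.

```python
def perceptron_forward(inputs, weights, bias):
    # Weighted sum of the inputs plus the bias, then a step activation
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if weighted_sum > 0 else 0


def train_perceptron(samples, learning_rate=0.1, max_epochs=100):
    # Start from zero weights and zero bias (an assumption; any start works)
    weights = [0.0] * len(samples[0][0])
    bias = 0.0

    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in samples:
            # error is +1 (should have fired), -1 (should not have), or 0
            error = target - perceptron_forward(inputs, weights, bias)
            if error != 0:
                mistakes += 1
                # Move each weight in the direction that reduces the error
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if mistakes == 0:
            break  # a full pass with no mistakes: nothing left to adjust

    return weights, bias


# Training examples for the AND gate: (inputs, expected output)
AND_SAMPLES = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

weights, bias = train_perceptron(AND_SAMPLES)
predictions = [perceptron_forward(x, weights, bias) for x, _ in AND_SAMPLES]
print("weights:", weights, "bias:", bias, "predictions:", predictions)
```

The learned weights will generally differ from the hand-picked [0.5, 0.5] and -0.7. Many different lines separate the AND examples correctly, and the rule stops at the first one it finds.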

It's a linear classifier. The perceptron draws a straight line (or hyperplane) to separate classes. This is both its power and its limitation.
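To see the "straight line" concretely: for two inputs, the decision boundary is the set of points where the weighted sum is exactly zero. A short worked version, assuming the AND example's hand-picked values ($w_1 = w_2 = 0.5$, $b = -0.7$):

```latex
w_1 x_1 + w_2 x_2 + b = 0
\quad\Longrightarrow\quad
0.5\,x_1 + 0.5\,x_2 - 0.7 = 0
\quad\Longrightarrow\quad
x_1 + x_2 = 1.4
```

Of the four AND inputs, only [1, 1] has $x_1 + x_2 = 2 > 1.4$, so it is the only point on the positive side of the line. Setting $b = 0$ leaves $w_1 x_1 + w_2 x_2 = 0$, which the origin always satisfies. That is why a perceptron without bias can only draw lines through the origin.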

Explore the Code

I've implemented a complete perceptron from scratch with visualizations:

[Screenshot: sample visualization from the playground]

GitHub Repository: perceptrons-to-transformers

What you'll find:

  • 01-perceptron/perceptron.py - Full implementation with learning algorithm
  • 01-perceptron/perceptron_playground.py - Streamlit app to play with it

What's Next

The perceptron can learn AND, OR, and NAND gates perfectly. But it has a fundamental limitation.

No matter how you adjust the weights, there's one simple logic gate it cannot learn. This limitation exposed a critical weakness in single-layer networks.

In the next post, we'll explore this limitation and see why it led to the invention of multilayer networks.

Spoiler: The problem is called XOR, and solving it ultimately opened the path to modern deep learning.


References

  1. Nielsen, M. (2015). Neural Networks and Deep Learning. Determination Press. Available at: http://neuralnetworksanddeeplearning.com/

Tags: #MachineLearning #AI #DeepLearning #Perceptron #NeuralNetworks

Series: From Perceptron to Transformers

Code: GitHub Repository
