Shrijith Venkatramana

Understanding Backpropagation from Scratch with micrograd - Derivatives

Hello, I'm Shrijith. I'm working on git-lrc: a Git hook for checking AI-generated code.

Neural networks might seem complex, but at their core, they rely on a simple yet powerful concept: derivatives. Andrej Karpathy’s micrograd proves this beautifully—it's just two Python files with less than 150 lines of code, yet it captures the fundamental ideas behind neural networks.

This blog breaks down micrograd step by step, starting with the very foundation: what derivatives really mean and how we compute them. You’ll learn:

  • How backpropagation works by understanding derivatives in the simplest way
  • The difference between symbolic and computational differentiation
  • How small input changes affect output (positive, negative, and zero slopes)
  • Why neural networks don’t need explicit derivative formulas

With visual explanations, simple code snippets, and practical insights, you'll come away with a solid grasp of how gradients drive learning in neural networks, without drowning in unnecessary complexity. Let's dive in.

Karpathy's micrograd is just 2 files of Python (< 150 LOC)

micrograd consists of just two small files:


  1. engine.py: Less than 100 lines of code; it defines the Value class, the code that powers the neural network
  2. nn.py: Defines Neuron, Layer and MLP (Multi-Layer Perceptron). In total, around 60 lines of code.

Fundamentally, the core ideas behind neural networks can be captured in just under 150 lines of simple Python code. The rest of the complexity in other libraries is about efficiency.
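To make this concrete, here is roughly what working with the Value class from engine.py looks like. This is a minimal sketch, assuming micrograd is installed (e.g. pip install micrograd); the expected gradients in the comments come from differentiating the expression by hand.

```python
from micrograd.engine import Value

# Build a tiny expression graph: c = a*b + a^2
a = Value(2.0)
b = Value(-3.0)
c = a * b + a**2

# Backpropagate to populate the gradients dc/da and dc/db
c.backward()

print(c.data)  # -2.0, the forward value
print(a.grad)  # dc/da = b + 2a = 1.0
print(b.grad)  # dc/db = a = 2.0
```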

Groundwork for understanding the definition of derivatives

The first goal is to understand the concept of derivatives with some examples. So we do the following to prepare some groundwork:

  1. Define a function f - takes in scalar input, gives scalar output
  2. Generate a range of values for x and y (input/output)
  3. Plot the values

Groundwork for derivatives
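Here is a minimal sketch of those three steps. The use of numpy and matplotlib is an assumption; the function matches the expression used later in this post.

```python
import numpy as np
import matplotlib.pyplot as plt

# A simple function: scalar in, scalar out
def f(x):
    return 3*x**2 - 4*x + 5

xs = np.arange(-5, 5, 0.25)  # a range of input values
ys = f(xs)                   # the corresponding outputs

plt.plot(xs, ys)
plt.show()
```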

Two Ways of Calculating a Derivative

The task is to find the derivative of the function at particular points, such as x = 3.

In school, we are usually taught the symbolic method.

Say for the expression 3*x**2 - 4*x + 5, we can find the derivative expression to be 6*x - 4.
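For instance, a symbolic math library such as sympy can derive that expression mechanically. Note that sympy is not part of micrograd; this is just to illustrate the symbolic approach.

```python
import sympy as sp

x = sp.symbols('x')
expr = 3*x**2 - 4*x + 5

# Symbolic differentiation: produces the derivative as an expression
print(sp.diff(expr, x))  # prints: 6*x - 4
```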

But since we're dealing with neural networks, the expressions involved can be huge, and nobody writes them down symbolically.

So instead of taking a symbolic approach, we take a computational approach.

However, it is useful to first understand what derivatives mean at a conceptual level, before we move on to the computations.

The Meaning of a Differentiable Function

The key formula is the following:

$$
f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}
$$

In the formula above, h is a very small value, and it keeps getting smaller, vanishing towards 0.

The question is: how does a function's output respond when there is a small bump (increase) in its input?

In other words: at a point x, if we increase it by a tiny amount h to get x + h, does the output increase or decrease, and by what magnitude?

The resulting value of the formula is a slope. A positive slope means a bump in the input increases the output.

A negative slope means a bump in the input decreases the output.

And at the specific point x = 2/3 in the plot of f above (where the derivative 6*x - 4 equals 0), a slight increase in the input leaves the output essentially unchanged: a zero slope.

Numerical Exploration

The above intuition can be validated numerically, by picking a concrete x value and a tiny h value.

Positive Slope Example
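A rough sketch of the probe, with h = 0.001 picked arbitrarily as the tiny bump:

```python
def f(x):
    return 3*x**2 - 4*x + 5

h = 0.001  # a tiny bump in the input
x = 3.0

# Rise over run: how much the output moves per unit of input bump
print((f(x + h) - f(x)) / h)  # ~14.003 -> positive slope: output increases
```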

Negative Slope Example
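The same probe at x = -3, where the symbolic derivative 6*x - 4 predicts -22:

```python
def f(x):
    return 3*x**2 - 4*x + 5

h = 0.001
x = -3.0

print((f(x + h) - f(x)) / h)  # ~-21.997 -> negative slope: output decreases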

Zero Slope Example
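And at x = 2/3, where the symbolic derivative 6*x - 4 is exactly 0:

```python
def f(x):
    return 3*x**2 - 4*x + 5

h = 0.001
x = 2/3

print((f(x + h) - f(x)) / h)  # ~0.003 -> essentially zero slope
```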

Reference

The spelled-out intro to neural networks and backpropagation: building micrograd (Andrej Karpathy)

git-lrc
AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs -- without telling you. You often find out in production.

git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.

Feedback and contributors are welcome! It's online, source-available, and ready for anyone to use.

⭐ Star it on GitHub: HexmosTech/git-lrc