DEV Community

Alper GÖÇEN


Free from-scratch deep learning notes: tensors, attention, and a tiny GPT

I'm an AI PhD student, and I've started writing a free public notebook on how AI models work under the hood:

https://insideaimodels.com/

The goal is to make the mechanics easier to reason about, without hiding everything behind library calls. I am writing the notes I wish I had when I was moving from "I can run the code" to "I understand what the model is doing."

What is inside so far

  • Building GPT from scratch in PyTorch: tokenizer, embeddings, masked self-attention, multi-head attention, residual blocks, training loop, and generation.
  • Attention explained from scratch: query, key, value vectors, softmax, context vectors, and why the mechanism matters.
  • Tensors for deep learning: shapes, dimensions, and why tensor thinking is the language of neural networks.
  • Gradient descent intuition: learning rate, derivatives, backpropagation, and the optimization loop.
  • Identity-aware negative sampling: a short note from my deepfake-detection research direction.
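
To give a flavor of the attention topics in the list above, here is a minimal sketch of masked (causal) self-attention in plain Python, with no library calls beyond `math`. It is an illustration written for this post, not code taken from the notebook: the function names `softmax` and `causal_self_attention` and the toy input are assumptions, and the sketch skips the learned query/key/value projections a real model would apply first.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_self_attention(queries, keys, values):
    """Scaled dot-product attention with a causal mask (hypothetical helper).

    queries, keys, values: lists of equal-length vectors, one per token.
    Returns (context_vectors, attention_weights).
    """
    d_k = len(keys[0])
    contexts, all_weights = [], []
    for i, q in enumerate(queries):
        # Causal mask: token i may only attend to positions 0..i.
        scores = [
            sum(qa * ka for qa, ka in zip(q, keys[j])) / math.sqrt(d_k)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Context vector: weighted sum of the visible value vectors.
        context = [
            sum(w * values[j][d] for j, w in enumerate(weights))
            for d in range(len(values[0]))
        ]
        contexts.append(context)
        # Pad masked (future) positions with zero weight so rows line up.
        all_weights.append(weights + [0.0] * (len(queries) - i - 1))
    return contexts, all_weights

# Toy input: 3 tokens with 2-dimensional vectors (made-up numbers).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contexts, weights = causal_self_attention(x, x, x)
```

Each row of `weights` sums to 1, and the first token can only attend to itself, so its context vector equals its own value vector. A PyTorch version would typically replace the loops with a matrix product and a triangular mask.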

A few direct links:

There is no paywall, signup, or course funnel. I am sharing it publicly because writing helps me learn, and because practical ML resources are better when they stay open.

If there is a part of modern AI models that usually gets hand-waved in tutorials, I would love to hear what I should cover next.
