<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mudasir Habib</title>
    <description>The latest articles on DEV Community by Mudasir Habib (@mudasirhabib123).</description>
    <link>https://dev.to/mudasirhabib123</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1446387%2F5c2834aa-8c5c-4be1-a134-c3c9ada65ded.jpeg</url>
      <title>DEV Community: Mudasir Habib</title>
      <link>https://dev.to/mudasirhabib123</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mudasirhabib123"/>
    <language>en</language>
    <item>
      <title>A Minimal ~9M Parameter Transformer LLM Trained from Scratch</title>
      <dc:creator>Mudasir Habib</dc:creator>
      <pubDate>Thu, 23 Apr 2026 14:30:13 +0000</pubDate>
      <link>https://dev.to/mudasirhabib123/a-minimal-9m-parameter-transformer-llm-trained-from-scratch-1p21</link>
      <guid>https://dev.to/mudasirhabib123/a-minimal-9m-parameter-transformer-llm-trained-from-scratch-1p21</guid>
<description>&lt;h2&gt;SelfLM — Building a Tiny LLM from Scratch (End-to-End)&lt;/h2&gt;

&lt;p&gt;LLMs are complex, but not magical — once you break them into components, everything becomes understandable.&lt;/p&gt;

&lt;p&gt;It started with a simple question:&lt;br&gt;
&lt;strong&gt;“How do models like GPT actually work?”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So I decided to build a smaller version myself — step by step — from dataset generation to tokenization, training, and deployment. Everything is fully open-source.&lt;/p&gt;




&lt;h2&gt;What This Project Covers&lt;/h2&gt;

&lt;p&gt;Instead of treating models as black boxes, this project focuses on the &lt;strong&gt;entire pipeline&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Synthetic dataset generation (~60K samples)&lt;/li&gt;
&lt;li&gt;Tokenization &amp;amp; preprocessing&lt;/li&gt;
&lt;li&gt;Transformer architecture (from scratch)&lt;/li&gt;
&lt;li&gt;Training pipeline&lt;/li&gt;
&lt;li&gt;Inference &amp;amp; deployment&lt;/li&gt;
&lt;/ul&gt;
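
&lt;p&gt;To make the "from scratch" part concrete, here is a minimal sketch of the core building block of any decoder-only transformer: single-head causal self-attention, in plain NumPy. This is an illustrative toy, not SelfLM's actual code (which is multi-head and written for training); names like &lt;code&gt;causal_self_attention&lt;/code&gt; are mine.&lt;/p&gt;

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(x, Wq, Wk, Wv):
    """Single-head causal self-attention: each position attends
    only to itself and earlier positions."""
    T, d = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = (q @ k.T) / np.sqrt(d)
    # Causal mask: zero out (via -inf-like score) attention to future tokens.
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[mask] = -1e9
    return softmax(scores) @ v

rng = np.random.default_rng(0)
T, d = 4, 8
x = rng.normal(size=(T, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = causal_self_attention(x, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

&lt;p&gt;Because of the mask, the first token's output depends only on itself — stacking a few of these blocks with feed-forward layers and embeddings is essentially the whole architecture.&lt;/p&gt;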




&lt;h2&gt;Highlights&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Trained in ~5 minutes (Colab T4 GPU)&lt;/li&gt;
&lt;li&gt;Fully custom LLM (~9M parameters)&lt;/li&gt;
&lt;li&gt;Hugging Face model + dataset + live Space&lt;/li&gt;
&lt;li&gt;Serverless deployment using ONNX on Vercel (free tier)&lt;/li&gt;
&lt;li&gt;Lightweight, browser-friendly inference&lt;/li&gt;
&lt;/ul&gt;
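
&lt;p&gt;Where does a figure like "~9M parameters" come from? A quick back-of-the-envelope count shows how small config choices land in that range. The config below is a hypothetical example, not SelfLM's actual one (which is in the repo); the formula ignores biases and layer norms and assumes tied input/output embeddings.&lt;/p&gt;

```python
def transformer_param_count(vocab_size, d_model, n_layers, d_ff):
    """Rough decoder-only parameter count (weight matrices only,
    tied embeddings, biases and norms ignored)."""
    embed = vocab_size * d_model          # token embedding, shared with output head
    per_layer = (
        4 * d_model * d_model             # Q, K, V and output projections
        + 2 * d_model * d_ff              # feed-forward up- and down-projections
    )
    return embed + n_layers * per_layer

# Hypothetical config in the single-digit-millions range.
n = transformer_param_count(vocab_size=16384, d_model=256, n_layers=6, d_ff=1024)
print(f"{n/1e6:.1f}M parameters")  # 8.9M parameters
```

&lt;p&gt;At this scale the embedding table dominates, which is why tiny LLMs usually pair a small vocabulary with a small hidden size.&lt;/p&gt;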




&lt;h2&gt;Live Demo&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://selflm.vercel.app/docs" rel="noopener noreferrer"&gt;https://selflm.vercel.app/docs&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;Hugging Face Space&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://huggingface.co/spaces/Mudasir-Habib/selflm-demo" rel="noopener noreferrer"&gt;https://huggingface.co/spaces/Mudasir-Habib/selflm-demo&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;Colab Notebook&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://colab.research.google.com/drive/1EyR5mFuHupJWdnJWazvdjU1Bre2rF2RD?usp=sharing" rel="noopener noreferrer"&gt;https://colab.research.google.com/drive/1EyR5mFuHupJWdnJWazvdjU1Bre2rF2RD?usp=sharing&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;GitHub Repository&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/Mudasirhabib123/selflm" rel="noopener noreferrer"&gt;https://github.com/Mudasirhabib123/selflm&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;Customization Feature&lt;/h2&gt;

&lt;p&gt;One of the most interesting parts: you can &lt;strong&gt;customize the model with your own data&lt;/strong&gt; by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Editing the first cell in the Colab notebook, or&lt;/li&gt;
&lt;li&gt;Modifying &lt;code&gt;src/dataset/data.py&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Add your own context, retrain, and instantly get a personalized LLM.&lt;/p&gt;
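
&lt;p&gt;As a rough sketch of what "add your own context" can look like, here is one common pattern for synthetic Q&amp;amp;A generation: a small fact table expanded through templates. This is an assumption about the approach, not the actual contents of &lt;code&gt;src/dataset/data.py&lt;/code&gt; — check the repo for the real logic.&lt;/p&gt;

```python
import json
import random

# Hypothetical template-based generator; the real implementation
# lives in src/dataset/data.py in the SelfLM repo.
FACTS = {
    "SelfLM": "a tiny transformer language model trained from scratch",
    "ONNX": "a portable model format used here for serverless inference",
}
TEMPLATES = [
    ("What is {topic}?", "{topic} is {fact}."),
    ("Tell me about {topic}.", "{topic} is {fact}."),
]

def generate_samples(n, seed=0):
    """Expand the fact table into n prompt/response training pairs."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        topic, fact = rng.choice(list(FACTS.items()))
        q_tpl, a_tpl = rng.choice(TEMPLATES)
        samples.append({
            "prompt": q_tpl.format(topic=topic),
            "response": a_tpl.format(topic=topic, fact=fact),
        })
    return samples

data = generate_samples(4)
print(json.dumps(data[0]))
```

&lt;p&gt;Swap in facts about yourself or your project, rerun the notebook, and the retrained model answers from your data.&lt;/p&gt;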




&lt;h2&gt;Goal&lt;/h2&gt;

&lt;p&gt;This project is built for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Learning how LLMs actually work&lt;/li&gt;
&lt;li&gt;Experimentation with small-scale models&lt;/li&gt;
&lt;li&gt;Understanding the full pipeline end-to-end&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;Open Source&lt;/h2&gt;

&lt;p&gt;Fully open-source and designed to make LLMs &lt;strong&gt;accessible, transparent, and understandable&lt;/strong&gt;.&lt;/p&gt;




&lt;p&gt;If you find it useful, consider giving it a star on GitHub.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>llm</category>
      <category>challenge</category>
    </item>
  </channel>
</rss>
