The Frustration That Started It All
It was 2 AM. I'd been reading LLM training tutorials for six hours.
One tutorial told me to install 47 dependencies manually. Another assumed I had $10,000 worth of GPUs lying around. A third just... stopped halfway through with "figure out the rest yourself."
I'm a third-year CS student at Mumbai University. I don't have a research lab. I don't have unlimited cloud credits. I just wanted to understand how these things work by building one myself.
That's when the idea hit me:
What if training an LLM was as easy as npx create-next-app?
One command. Everything ready. Just start training.
That's how create-llm was born.
The Vision: Vercel for LLMs
I love how Vercel made web deployment stupid simple:
npx create-next-app my-app
npm run dev
# You have a website
Why couldn't LLM training be the same?
npx create-llm my-llm
python train.py
# You have a language model
No scattered tutorials. No dependency hell. No "works on my machine."
Just: scaffold → train → deploy.
Simple.
Weeks 1-2: The Naive Optimism
I started building with pure enthusiasm and zero idea what I was getting into.
Initial plan:
- CLI tool in TypeScript (scaffold projects)
- Python training code (PyTorch)
- Templates (tiny, small, base)
- One command to rule them all
Reality check: Nothing worked. Everything broke. I loved it.
First Win: The Scaffolder
Getting the CLI to generate a project structure was surprisingly fun. Running npx create-llm test and seeing files appear? Magic.
That first dopamine hit kept me going through what came next.
Weeks 3-4: Everything Breaks
This is where reality hit hard.
Problem 1: The 32,000 Token Disaster
My first training run showed perplexity of 1.0 - the model memorized everything instead of learning.
After hours of debugging, I found it:
My config had vocab_size: 32000 hardcoded, but my tokenizer only created 423 tokens.
The math:
- Model allocated: 32,000 × 768 = 24,576,000 parameters
- Actually used: 423 × 768 = 324,864 parameters
- Wasted: 24,251,136 parameters (99% of embedding layer!)
With 23M parameters chasing a vocabulary of just 423 tokens, the model memorized the dataset almost instantly.
The fix: Auto-detect vocab size from tokenizer.
# Before
vocab_size = config['model']['vocab_size']  # 32000, no matter what the tokenizer produced

# After
if config['model']['vocab_size'] == 'auto':
    vocab_size = tokenizer.get_vocab_size()  # 423
    print(f"Auto-detected vocab_size: {vocab_size}")
else:
    vocab_size = config['model']['vocab_size']
Lesson learned: Don't hardcode what should be dynamic.
Problem 2: Model Size vs Data Size
Even with the correct vocab size, my tiny model (23M params) was overfitting on small datasets.
The rule of thumb I learned (turned into a quick check below):
- 1M parameters needs ~10K examples minimum
- 10M parameters needs ~100K examples minimum
- Your model should match your data
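In code, the heuristic is just a ten-line warning (a hedged sketch of the idea; create-llm's actual validation may differ):

def check_data_size(num_params: int, num_examples: int) -> None:
    """Warn when the dataset looks too small for the model (rule of thumb only)."""
    # ~10K examples per 1M parameters, i.e. roughly 0.01 examples per parameter
    min_examples = int(num_params * 0.01)
    if num_examples < min_examples:
        print(f"WARNING: {num_examples:,} examples for {num_params:,} params; "
              f"rule of thumb suggests at least {min_examples:,}. "
              "Expect overfitting: add data or shrink the model.")

check_data_size(num_params=23_000_000, num_examples=5_000)
# WARNING: 5,000 examples for 23,000,000 params; rule of thumb suggests at least 230,000. ...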
I restructured the templates (an illustrative config follows this list):
- nano: 200K-700K params (1-2 min training, learning tool)
- tiny: 2-5M params (5-10 min, actually usable)
- small: 50-100M params (1-3 hours, production)
- base: 500M-1B params (days, research)
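Under the hood a template is mostly a config that the training script reads. Something like this illustrative nano-sized config (field names are made up for the example, not the exact schema create-llm ships):

config = {
    "model": {
        "vocab_size": "auto",   # resolved from the tokenizer at train time
        "d_model": 128,
        "n_layers": 2,
        "n_heads": 2,
    },
    "training": {
        "batch_size": 16,
        "max_steps": 500,        # roughly a one-minute run on a laptop
        "learning_rate": 3e-4,
    },
}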
Problem 3: Cross-Platform Hell
It worked on my Mac. It broke on Windows. Classic.
UTF-8 encoding, path separators, torch.load warnings - I fixed them all one by one.
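The fixes were mostly small and boring. Roughly this kind of thing (a sketch of the pattern, not the exact diffs):

from pathlib import Path
import torch

# Always pass an explicit encoding: Windows defaults to cp1252, not UTF-8
text = Path("data/raw/sample.txt").read_text(encoding="utf-8")

# Build paths with pathlib instead of hardcoding "/" separators
checkpoint_path = Path("checkpoints") / "checkpoint-best.pt"

# Opt in to weights_only to quiet the torch.load pickle warning
state = torch.load(checkpoint_path, map_location="cpu", weights_only=True)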
Windows users deserve love too.
Weeks 5-6: The "Aha!" Moments
Moment 1: Mode Collapse is a Feature
After fixing vocab size, I trained the nano template. It generated:
You: "Once upon a time"
Model: "time time time time time..."
Mode collapse! My first instinct: "It's broken, hide it."
Then I realized: This is EDUCATIONAL.
Beginners SHOULD see mode collapse. They should understand:
- Why model size matters
- Why data quality matters
- What overfitting looks like
- How to fix it
I rewrote the nano template docs:
"nano is intentionally small. It will show mode collapse with limited data. That's the point - you learn by seeing what goes wrong, then fixing it."
Honesty > perfection.
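And if you'd rather catch it programmatically than by eyeballing samples, a crude repetition check goes a long way (hypothetical helper, not necessarily what create-llm ships):

def repetition_ratio(text: str) -> float:
    """Fraction of repeated words; close to 1.0 means the model is looping."""
    words = text.split()
    if not words:
        return 0.0
    return 1.0 - len(set(words)) / len(words)

sample = "time time time time time"
if repetition_ratio(sample) > 0.5:
    print("Looks like mode collapse: more data, a bigger model, or a higher sampling temperature.")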
Moment 2: Validation > Perfection
I added overfitting detection:
if perplexity < 1.1:
    print("⚠️ WARNING: Perplexity < 1.1 indicates severe overfitting!")
    print("   Suggestions:")
    print("   - Add more training data")
    print("   - Increase dropout")
    print("   - Reduce model size")
Users started thanking me for these warnings. They learned faster because the tool taught them.
My tool isn't just a CLI - it's a teacher.
Moment 3: Speed Matters
The nano template trains in 60 seconds. People love this.
Why? Instant gratification.
"I ran one command and trained my own LLM in a minute" is way more powerful than "I spent 3 days setting up CUDA."
Fast iteration = more learning = better outcomes.
Weeks 7-8: Building in Public
I started posting updates on Twitter.
Day 1: "Building create-llm - npm create-next-app but for LLMs"
Response: 3 likes
Day 15: "Just fixed the vocab size mismatch bug [screenshot]"
Response: 50 likes, 5 people wanting to beta test
Day 30: "create-llm now has auto-detection and overfitting warnings"
Response: 200+ likes, people asking when it launches
Building in public was scary but worth it. Real-time feedback shaped the product.
Weeks 9-12: Polish & Panic
Final stretch. Everything worked but nothing felt "done."
I added:
- Live training dashboard (Flask + SocketIO; minimal sketch below)
- Model comparison tool
- Deployment to HuggingFace
- Comprehensive docs
That put me at 29 of the 30 tasks on my checklist.
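The dashboard is conceptually tiny: the training loop pushes metrics over a socket and the browser charts them. A stripped-down sketch of the idea (assuming Flask + Flask-SocketIO; event names are illustrative, not create-llm's actual API):

from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app, cors_allowed_origins="*")

@app.route("/")
def index():
    # The real dashboard serves an HTML page that charts incoming events
    return "<pre>Connect a Socket.IO client to see 'metrics' events.</pre>"

def report(step: int, loss: float) -> None:
    # Called from the training loop after each logging step
    socketio.emit("metrics", {"step": step, "loss": loss})

if __name__ == "__main__":
    socketio.run(app, host="127.0.0.1", port=5000)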
But I kept finding "one more thing" to fix.
Classic trap: Perfectionism masquerading as thoroughness.
I almost didn't launch because "it's not ready yet."
Then I realized: It trains models. It generates text. It has docs. It has validation.
It's ready. I'm just scared.
The Lessons
1. Ship Before You're Ready
The features had been 95% done for two weeks, but I kept adding "just one more thing."
Finally shipped. Users loved it. The "missing" 5% didn't matter.
Lesson: Shipping beats perfecting.
2. Honest > Perfect
The nano template shows mode collapse. I could've hidden it.
Instead, I documented it: "This is a learning tool. It will overfit. That's educational."
Users appreciated the honesty more than fake polish.
3. Build What You Wish Existed
I built create-llm because I wished it existed when I started learning.
That clarity - "I'm building for past-me" - made every decision easier.
4. Validation is a Feature
Adding overfitting warnings, vocab size checks, and model size recommendations made the tool better than competitors.
Don't just build tools. Build teachers.
5. Your Bugs are Lessons
Every bug taught me something:
- Vocab mismatch → parameter efficiency
- Overfitting → model sizing
- Mode collapse → training dynamics
I learned more from bugs than tutorials.
6. Community > Code
The best part wasn't writing code. It was people saying:
"I finally understand how LLMs work!"
"I trained my first model today!"
"This tool saved me hours!"
Building alone is coding. Building with community is impact.
What's Next
create-llm v1.0 is just the beginning.
Short term (v1.1-1.2):
- SynthexAI integration (synthetic data generation)
- Better benchmarking (tokens/sec, FTL, RAM usage)
- More model architectures (BERT, T5)
- Template marketplace
Medium term (v2.0):
- Cloud training platform (train on our GPUs)
- Model hosting (get API endpoints)
- Collaborative features (share configs, compare results)
Long term (v3.0+):
- Full "Vercel for LLMs" platform
- One-click deploy
- Model marketplace
- Pay-as-you-go pricing
The dream: Make custom LLMs as accessible as creating websites.
The Stats (So Far)
After 12 weeks:
- 50+ active users
- Featured in AI newsletters
- 10+ production deployments
But the real win? The messages:
"I got my first ML job because of this project."
"I finally understand transformers now."
"Teaching my students with create-llm."
That's the impact I wanted.
Try It Yourself
Want to train your first LLM?
npx create-llm my-first-llm --template nano
cd my-first-llm
python tokenizer/train.py --data data/raw/sample.txt
python data/prepare.py
python training/train.py
python chat.py --checkpoint checkpoints/checkpoint-best.pt
60 seconds later, you're chatting with your own model.
Not perfect. But yours.
Final Thoughts
Three months ago, I was frustrated by complex tutorials.
Today, I've built a tool that's helping people around the world learn how LLMs work.
The journey taught me:
- Building is learning
- Shipping beats perfecting
- Community is everything
- Your frustration is someone else's too
If you're stuck on a problem, build the solution. Someone else needs it too.
And maybe, just maybe, you'll change how people learn.
Links
- GitHub: github.com/theaniketgiri/create-llm
- Documentation: docs
- Twitter: @theaniketgiri
Thank You
To everyone who:
- Starred the repo
- Filed issues
- Contributed code
- Shared feedback
- Believed in the vision
This is for you. And for everyone who's ever felt frustrated trying to learn ML.
Let's make AI accessible together.
Built with ❤️ by Aniket Giri
CS (AIML) Student | Building in public
Found this helpful? Star the repo, share the post, or just say hi on Twitter. I read everything.
Want to contribute? We're always looking for help with docs, features, and examples.
Have questions? Drop them in the comments. I respond to every single one.
Tags: #machinelearning #ai #llm #opensource #buildinpublic #indiehacker #startup #developer #python #typescript
Published on 24/10/2025