Abubakersiddique771

Posted on Jun 1

I Built an AI That Learns Only From GitHub Bugs 🐛🤖

#github #ai #chatgpt #tutorial

"Smart people learn from their mistakes. Geniuses learn from GitHub issues."

Imagine an AI that doesn’t just write perfect code — but actually studies thousands of real-world bugs, failures, and pull request debates to understand how things go wrong.

That’s what I tried to build.

This is the story of how I scraped GitHub, then built and deployed a local LLM tool that doesn’t just generate code: it warns me about bugs real devs have already made. And it’s powered entirely by GitHub’s open issue tracker.

🔥 The Problem: Tutorials Don’t Teach You What Breaks

As developers, we spend hours on tutorials that show us:

“How to set up a REST API”
“How to train a basic LLM”
“How to deploy a project with Docker”

But these are happy-path instructions. They show you what works, not what breaks.

Meanwhile, on GitHub:

Issues filled with edge cases
Pull request discussions full of trade-offs
"Why was this line changed?" mysteries

That’s the real learning — the dark matter of dev education.

So I asked myself:

Can I make an AI that doesn’t learn from code examples, but from code mistakes?

🧠 The Idea: Train an LLM on GitHub Issues & Fixes

Instead of feeding an AI perfect Stack Overflow answers, I did this:

1. Crawled open-source repos with high issue + PR activity
2. Extracted, for each bug:
   Title + body of the bug report
   Linked PRs that fixed it
   Reviewer comments
3. Split each (issue → fix) pair into chunks and embedded them
4. Indexed the embeddings in a vector DB
5. Created a CLI and VS Code extension where I could ask:

“Has anyone fixed a bug like this before?”

And shockingly... it worked.
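
To make the (issue → fix) pairing concrete, here is a minimal sketch of how one pair could be represented and split into chunks ready for embedding. The `IssueFixPair` class, its field names, the `chunk_pair` helper, and the 500-character chunk size are all illustrative assumptions, not the actual Bug Sage code.

```python
from dataclasses import dataclass, field


@dataclass
class IssueFixPair:
    """One bug report plus the PR that fixed it (hypothetical schema)."""
    repo: str
    issue_number: int
    issue_title: str
    issue_body: str
    pr_number: int
    pr_body: str
    review_comments: list[str] = field(default_factory=list)


def chunk_pair(pair: IssueFixPair, max_chars: int = 500) -> list[dict]:
    """Split a pair into fixed-size text chunks, each carrying metadata for retrieval."""
    full_text = "\n\n".join(
        [
            f"Issue #{pair.issue_number}: {pair.issue_title}",
            pair.issue_body,
            f"Fixed by PR #{pair.pr_number}",
            pair.pr_body,
            *pair.review_comments,
        ]
    )
    chunks = []
    for start in range(0, len(full_text), max_chars):
        chunks.append(
            {
                "text": full_text[start:start + max_chars],
                "repo": pair.repo,
                "issue": pair.issue_number,
                "pr": pair.pr_number,
            }
        )
    return chunks
```

Keeping the repo, issue number, and PR number on every chunk means a search hit can always be traced back to the original discussion.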

🛠️ My Stack: Building the "Bug Sage"

Component	Tool Used
Issue Scraping	GitHub API + GraphQL
Embedding	`text-embedding-ada-002` via OpenAI OR `Instructor-XL` locally
Vector DB	ChromaDB
Retrieval	LangChain
UI	CLI + VS Code Sidebar
Local LLM	`Phi-3` or `Mistral` via Ollama

🐛 How It Works (Real Example)

Say I’m debugging a KeyError in my FastAPI app when deploying to AWS Lambda.

Instead of googling aimlessly or hitting Stack Overflow, I type:

bugsage "KeyError during AWS Lambda cold start in FastAPI app"

And it retrieves this issue from another repo:

📝 #328 - FastAPI app fails on cold start due to environment variables missing
🔧 Fixed by moving .env loading inside the handler in PR #329

Suddenly, I’m not just getting a fix.
I’m getting context, explanation, and real-world patterns.

🤖 The Coolest Part? The AI Learns With Every Crawl

Every weekend, a GitHub Action re-scrapes:

New issues from starred repos
PRs containing fix keywords (e.g. "Fixes #123")
Issues tagged bug, regression, or performance

It self-indexes all this into ChromaDB. Over time, the "Bug Sage" gets smarter — like a developer mentor who reads every project on GitHub for you.
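
One way a weekly re-crawl could avoid indexing the same issue twice is to track the IDs already stored and only pass unseen items along. The sketch below assumes each crawled item is a plain dict with `repo`, `number`, and `labels` keys. `WATCHED_LABELS`, `item_id`, and `select_new_items` are hypothetical names for illustration.

```python
WATCHED_LABELS = {"bug", "regression", "performance"}


def item_id(item: dict) -> str:
    """Stable ID so the same issue is never indexed twice."""
    return f"{item['repo']}#{item['number']}"


def select_new_items(crawled: list[dict], indexed_ids: set[str]) -> list[dict]:
    """Keep items that carry a watched label and are not in the index yet.

    Note: indexed_ids is updated in place as new items are accepted.
    """
    fresh = []
    for item in crawled:
        labels = {label.lower() for label in item.get("labels", [])}
        if labels & WATCHED_LABELS and item_id(item) not in indexed_ids:
            fresh.append(item)
            indexed_ids.add(item_id(item))  # also guards against duplicates within one crawl
    return fresh
```

Only the items returned by `select_new_items` would then need to be embedded and written to the vector DB.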

🧠 The Educational Value: Learning From Pain Points

This isn’t just a productivity tool.

It’s a learning engine.

You start to see:

How real teams debug
What kinds of mistakes repeat
Why certain decisions are controversial

You start coding defensively — with foresight.

It’s like pair programming with 1,000 senior devs whispering, “Hey… that didn’t work for us either.”

🤯 Unexpected Use Cases

✅ Code review help: It suggests real PR debates for similar changes
📉 Prevent regressions: Matches code diffs to past rollback issues
🎓 Learning prompts: “Give me 3 bugs people faced with WebSockets + Django Channels”
🕵️‍♂️ Open-source archaeology: “What were the most common bugs in X repo over 2 years?”

😂 Dev Humor (You Know It’s Coming)

🧟 “I don’t make the same bug twice… I make it 10 times, slightly differently”
🧙‍♂️ “Bug Sage, what’s the ancient wisdom on async DB calls?”
👶 Me: "Why is my code crashing?" GitHub AI: "It has happened before… and it will happen again."

📚 How You Can Build Your Own “Bug Sage”

Want to try this at home?

Step 1: Crawl GitHub issues and PRs

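import os
from github import Auth, Github

# Read the token from the environment instead of hard-coding it
g = Github(auth=Auth.Token(os.environ["GITHUB_TOKEN"]))
repo = g.get_repo("tiangolo/fastapi")
# The issues endpoint also returns pull requests, so skip those (lazy generator)
issues = (i for i in repo.get_issues(state="closed", labels=[repo.get_label("bug")]) if i.pull_request is None)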

Step 2: Pair issues to PRs

Look for text like "Fixes #123" in PR bodies.
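
A small regex is enough to get started on this step. The sketch below matches GitHub's closing keywords (close, fix, resolve and their variants) followed by an issue number. The function name `linked_issue_numbers` is my own, and this version deliberately ignores cross-repo references like `owner/repo#123`.

```python
import re
from typing import Optional

# GitHub's closing keywords: close(s/d), fix(es/ed), resolve(s/d), case-insensitive.
CLOSING_REF = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\b:?\s+#(\d+)",
    re.IGNORECASE,
)


def linked_issue_numbers(pr_body: Optional[str]) -> list[int]:
    """Return issue numbers referenced with a closing keyword, in order, deduplicated."""
    if not pr_body:
        return []  # PR bodies can be empty or missing
    seen: list[int] = []
    for match in CLOSING_REF.finditer(pr_body):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen
```

Calling `linked_issue_numbers("Fixes #123")` returns `[123]`, which you can then use to look up the matching issue in the same repo.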

Step 3: Embed text

Use:

# Older LangChain releases: from langchain.embeddings import OpenAIEmbeddings
from langchain_openai import OpenAIEmbeddings

Or local models like Instructor-XL via HuggingFace.

Step 4: Store + Query with Chroma

# Older LangChain releases: from langchain.vectorstores import Chroma
from langchain_chroma import Chroma
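
To show what "store + query" means without installing anything, here is a toy stand-in for a vector store. To be clear, this is not Chroma and there is no real embedding model in it: I swapped in bag-of-words counts and cosine similarity so the retrieval idea can be run as-is. `TinyVectorStore`, `embed`, and `cosine` are made-up names.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a real setup would call an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """In-memory store: keeps (text, vector) pairs and ranks them per query."""

    def __init__(self) -> None:
        self._docs: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self._docs.append((text, embed(text)))

    def query(self, question: str, k: int = 3) -> list[str]:
        """Return up to k stored texts, most similar first."""
        q = embed(question)
        ranked = sorted(self._docs, key=lambda doc: cosine(q, doc[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

A real vector DB does the same thing at scale: it stores vectors next to the text and returns the nearest ones for a query. Word counts only match on shared words, so an actual embedding model is what lets "crash on startup" find "fails during cold start".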

Step 5: Build CLI or integrate into VS Code!
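
For the CLI, the standard library's `argparse` is enough for a first version. This is only a skeleton under my own assumptions: the `-k/--top-k` flag is hypothetical, and `search` is a placeholder where the vector store query would go.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="bugsage",
        description="Search indexed GitHub issues for bugs similar to yours.",
    )
    parser.add_argument("query", help="Description of the bug you are seeing")
    parser.add_argument("-k", "--top-k", type=int, default=3, help="Number of matches to show")
    return parser


def search(query: str, top_k: int) -> list[str]:
    """Placeholder: a real version would query the vector store here."""
    return []


def main(argv: Optional[list[str]] = None) -> int:
    """Parse arguments, run the search, print ranked results, return an exit code."""
    args = build_parser().parse_args(argv)
    results = search(args.query, args.top_k)
    if not results:
        print("No similar issues found.")
    for rank, text in enumerate(results, start=1):
        print(f"{rank}. {text}")
    return 0
```

Call `main()` from your script's entry point and the command from earlier, `bugsage "KeyError during AWS Lambda cold start in FastAPI app"`, maps straight onto the `query` argument.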

🌍 Final Thought: Make GitHub Your Mistake Mentor

We often treat GitHub as a place to show perfect work.

But it’s really a museum of broken code — and if you mine it well, you’ll learn ten times faster than any tutorial.

Don’t just write code.
Study how it breaks — and let AI help you never repeat it.

DEV Community