Ten days ago, I had zero public GitHub repos. Today I have 51. Here is what I learned about what actually gets discovered on GitHub — and what dies in obscurity.
## The Experiment
I wanted to answer one question: Can you build a meaningful GitHub presence from scratch in under 2 weeks?
Not with a viral project. Not with an existing audience. Starting from zero, with nothing but code and READMEs.
## The Results (Raw Numbers)
| Metric | Day 1 | Day 10 |
|---|---|---|
| Repos | 0 | 51 |
| Stars | 0 | 10 |
| Forks | 0 | 2 |
| Clones | 0 | 1,800+ |
| Views | 0 | 300+ |
10 stars in 10 days with zero promotion outside Dev.to. Not viral, but real organic discovery.
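If you want to track the same numbers, clones and views come from GitHub's repository traffic API, which only repo admins can read. A minimal sketch — the owner/repo values and token handling here are placeholders, not from my setup:

```python
# Sketch: pull clone/view counts from GitHub's traffic API.
# Requires a personal access token with repo access; OWNER/REPO are placeholders.
import json
import urllib.request

API = "https://api.github.com/repos/{owner}/{repo}/traffic/{metric}"

def traffic_url(owner: str, repo: str, metric: str) -> str:
    """Build the traffic endpoint URL; metric is 'clones' or 'views'."""
    if metric not in ("clones", "views"):
        raise ValueError("metric must be 'clones' or 'views'")
    return API.format(owner=owner, repo=repo, metric=metric)

def fetch_traffic(owner: str, repo: str, metric: str, token: str) -> dict:
    """Return the API response: {'count': ..., 'uniques': ..., 'clones'/'views': [...]}."""
    req = urllib.request.Request(
        traffic_url(owner, repo, metric),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Note that GitHub only retains 14 days of traffic data, so if you run an experiment like this, snapshot the numbers daily.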
## What Got Traffic (And What Didn't)
### Winner #1: Curated Lists
My awesome-web-scraping-2026 repo got 142 views and 9 stars — the most of any repo.
Why? People search GitHub for curated tool lists. "Awesome" repos are a known pattern that developers trust.
### Winner #2: Data Collections
ai-market-research-reports got 1,668 clones and 478 unique visitors — orders of magnitude more than any code repo.
People clone data repos. They rarely clone code tutorials. This was my biggest insight.
### Loser: Tutorial Repos Without a Clear Value Prop
Generic tutorial repos with names like python-data-pipelines got near-zero traffic. The name doesn't tell you WHY you should click.
## The 5 Rules I Learned
### Rule 1: README Is Your Landing Page
A repo with great code and a bad README is invisible. A repo with mediocre code and a great README gets stars.
Every README must have:
- One-line value prop (what problem does this solve?)
- Real numbers ("reduced costs by 97%", not "improves performance")
- A story (not docs — a human story about a real problem)
- Quick start (copy-paste and it works in 30 seconds)
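As a sketch, that checklist compiles into a template like this — the helper name and example fields are mine for illustration, not a real tool:

```python
# Sketch of the four-part README checklist as a template.
# The function and section labels are illustrative, not an existing generator.

def readme_skeleton(name, value_prop, numbers, story, quick_start):
    """Assemble the four must-have README sections, in order."""
    return "\n\n".join([
        f"# {name}",
        value_prop,                           # one-line value prop
        f"**Results:** {numbers}",            # real numbers, not adjectives
        f"**Why this exists:** {story}",      # the human story
        f"**Quick start:** `{quick_start}`",  # copy-paste and it runs
    ])
```

Calling it with, say, `readme_skeleton("cost-cutter", "Cut idle VPS spend automatically.", "reduced costs by 97%", "My VPS cost $50/month but used 2.5% of capacity.", "pip install cost-cutter")` gives you a skeleton you then flesh out — the point is that no section on the checklist is optional.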
### Rule 2: Topics Are Your SEO
GitHub topics are like meta keywords for search. Every repo should have 5-8 relevant topics. I added topics to all 51 repos and saw a noticeable traffic bump within days.
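You can set topics from the CLI with `gh repo edit --add-topic`, or via the REST API's replace-topics endpoint (`PUT /repos/{owner}/{repo}/topics`). Before pushing, a quick sanity check helps; this validator is my own sketch, and the exact format rule is an assumption based on GitHub's stated constraints (lowercase letters, digits, hyphens, 35 characters max):

```python
# Sketch: sanity-check a topic list before setting it on a repo.
# The 5-8 range is this post's rule of thumb; the character rule is an
# assumption matching GitHub's documented topic constraints.
import re

def valid_topics(topics: list[str]) -> bool:
    """True if the list fits the 5-8 rule and each topic looks legal."""
    if not 5 <= len(topics) <= 8:
        return False
    return all(re.fullmatch(r"[a-z0-9][a-z0-9-]{0,34}", t) for t in topics)
```

For example, `["web-scraping", "python", "data-collection", "automation", "scrapy"]` passes; a list of two topics, or one containing `Web-Scraping` with capitals, does not.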
### Rule 3: Cross-Link Everything
Every README links to 3-5 related repos. When someone finds one repo, they discover your whole portfolio. It is like internal linking for SEO, but on GitHub.
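A throwaway script can stamp a related-repos footer onto every README. The helper and username below are illustrative; the repo names are from this post:

```python
# Sketch: generate a "Related projects" footer to append to each README.
# Helper name and username are placeholders.

def related_section(user: str, repos: list[tuple[str, str]]) -> str:
    """Render a markdown list linking to sibling repos with one-line blurbs."""
    lines = ["## Related projects", ""]
    for name, blurb in repos:
        lines.append(f"- [{name}](https://github.com/{user}/{name}) - {blurb}")
    return "\n".join(lines)

print(related_section("your-username", [
    ("awesome-web-scraping-2026", "500+ curated scraping tools"),
    ("ai-market-research-reports", "500+ AI market reports"),
    ("ml-fine-tuning-free", "fine-tune models on free GPUs"),
]))
```

Keep the blurbs honest and short — the footer is a map of your portfolio, not an ad.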
### Rule 4: Data Beats Code
People will clone a dataset 10x more than a code tutorial. If you have domain knowledge, package it as structured data — JSON, CSV, markdown tables. It gets discovered faster than any tutorial.
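In practice that just means shipping the same records in more than one format, so the repo is clone-and-go whatever tooling the visitor has. A minimal sketch with made-up fields:

```python
# Sketch: publish one dataset as both CSV and JSON.
# File stem and record fields are illustrative.
import csv
import json
from pathlib import Path

def export_dataset(records: list[dict], stem: str) -> None:
    """Write records to <stem>.csv and <stem>.json."""
    fields = list(records[0])
    with open(f"{stem}.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(records)
    Path(f"{stem}.json").write_text(json.dumps(records, indent=2))

export_dataset(
    [{"report": "LLM market size 2026", "source": "example", "year": 2026}],
    "ai-market-research",
)
```

Markdown tables in the README then serve as the human-readable preview of the same data.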
### Rule 5: One Story Per Repo
The repos that got stars all had a compelling story:
- "I lost $2,000 of scraped data because my pipeline had no error handling"
- "My VPS cost $50/month but only used 2.5% of capacity"
No one stars a README that starts with "This is a collection of..."
## What I Would Do Differently
- Start with 5 repos, not 51. Quality over quantity. My best 5 repos drove 95% of all traffic.
- Write the story first, code second. The narrative in the README matters more than the implementation.
- Focus on searchable niches. "web scraping tools 2026" gets searched. "my python scripts" does not.
## Top Repos by Category
**Data & Research:**
- ai-market-research-reports — 500+ AI market reports (1,668 clones)
- hn-tech-trends-dataset — Hacker News trend data
**Web Scraping:**
- awesome-web-scraping-2026 — 500+ tools curated (9 stars)
- automated-testing-scrapers — Test framework for scrapers
**AI/ML:**
- ml-fine-tuning-free — Fine-tune models on free GPUs
- llm-prompt-engineering — 15 production prompt patterns
Have you tried building a GitHub portfolio from scratch? What worked for you? Drop a comment — I am genuinely curious what others have experienced.
I help teams set up data collection infrastructure. If you need web scraping at scale, reach out.