The first version used a cron job. Every six hours, it would hit the GitHub API, pull repo data, and write it to a JSON file that got committed back to the repo. It worked for about a week.
Then I hit GitHub's rate limit. Then the cron silently failed for three days and nobody noticed. Then I realized the JSON file had grown to 4MB because I was storing every commit message for every repo.
I was building getfolio.dev, a service that generates developer portfolios synced live with GitHub. The whole point was that your portfolio would never go stale. Which meant the data layer couldn't be fragile. It had to actually work at scale, for hundreds of users, without me babysitting a crontab.
So I rewrote it. And then I rewrote it again.
Version 1: Static JSON + GitHub Actions
The appeal was simplicity. GitHub Actions triggers on a schedule, fetches API data, commits a JSON file, Vercel rebuilds. Zero infrastructure.
Problems showed up fast. Rate limiting was the obvious one. But the deeper issue was latency. A user signs up, connects their GitHub, and then waits up to six hours to see their repos populate. That's not a product. That's a broken promise.
Version 2: Serverless Functions + Direct API Calls
Next attempt: call GitHub's API on every page load through a Next.js API route. Fresh data every time.
This solved the staleness problem and created a new one. GitHub's REST API isn't slow exactly, but when you're fetching repos, languages, contribution data, and pinned items, you're looking at 4-5 sequential requests. Page loads went from 200ms to 2+ seconds. Some users had 80+ repos. That made it worse.
I tried parallelizing the requests with Promise.all. Got it down to about 900ms on a good day. Still too slow for a portfolio that someone might bounce from in under three seconds.
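The parallelized version looked roughly like this (a sketch: `ghFetch` is a hypothetical thin wrapper around `fetch` that adds auth headers and parses JSON; the paths are standard GitHub REST routes):

```typescript
type GhFetch = (path: string) => Promise<unknown>;

// Fire all requests at once with Promise.all instead of awaiting each
// in turn, so total latency is the slowest request, not the sum.
async function fetchProfileData(username: string, ghFetch: GhFetch) {
  const [profile, repos, events] = await Promise.all([
    ghFetch(`/users/${username}`),
    ghFetch(`/users/${username}/repos?per_page=100&sort=pushed`),
    ghFetch(`/users/${username}/events/public`),
  ]);
  return { profile, repos, events };
}
```

This caps the wait at the slowest single request, but it still pays GitHub's latency on every page load, which is why it bottomed out around 900ms rather than 150ms.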
Version 3: Firebase + Smart Caching (What Actually Shipped)
The solution that stuck uses Firebase as a caching and sync layer between GitHub and the portfolio.
When a user connects their account, we do a full initial fetch and store the processed data in Firestore. After that, we use a combination of webhooks (when available) and intelligent polling to keep things current. The polling interval adapts. If you pushed code today, we check more frequently. If your last commit was three weeks ago, we back off.
Page loads hit Firestore, not GitHub. Response times dropped back to ~150ms.
The schema looks roughly like this:
```
users/
  {uid}/
    github_profile: { ... }
    repos/
      {repo_id}: {
        name, description, stars, forks,
        languages: { TypeScript: 14500, CSS: 3200 },
        last_pushed: timestamp,
        pinned: boolean
      }
    sync_meta: {
      last_full_sync: timestamp,
      next_scheduled: timestamp,
      poll_interval_minutes: 60
    }
```
The adaptive polling was the part that took the longest to get right. Too aggressive and you burn through API quota across all users. Too passive and someone pushes a cool project and their portfolio doesn't reflect it for a day.
Current logic: base interval of 360 minutes. If last_pushed is within 24 hours, drop to 60 minutes. Within the last hour, drop to 15. It's simple and it works.
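That rule boils down to a small pure function (a sketch of the logic above; the function name is illustrative):

```typescript
// Adaptive polling: recently active repos get checked more often.
// Thresholds match the rule described above: <=1h -> 15min,
// <=24h -> 60min, otherwise the 360-minute base interval.
function pollIntervalMinutes(lastPushed: Date, now: Date = new Date()): number {
  const hoursSince = (now.getTime() - lastPushed.getTime()) / 3_600_000;
  if (hoursSince <= 1) return 15;
  if (hoursSince <= 24) return 60;
  return 360;
}
```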
The Rest of the Stack
For anyone curious about the full picture:
- Next.js 14 App Router for the portfolio rendering and the dashboard. Server components handle the data-heavy portfolio pages. Client components for the drag-drop editor and theme preview.
- Tailwind + Framer Motion for styling and animations across the five themes (DarkPro, Terminal, Minimal, Glass, Editorial). Each theme is a separate component tree, not a CSS swap.
- Stripe for billing. Pro plan unlocks custom domains and analytics.
- Vercel for hosting. Edge functions for the custom domain resolution.
What I'd Do Differently
I'd skip versions 1 and 2 entirely and start with the caching layer. The instinct to keep things simple by avoiding a database was wrong. The database is what made it simple. Without it, I was fighting the GitHub API's constraints on every request.
If you're building anything that depends on a third-party API as a primary data source, put a cache in front of it from day one. Not as an optimization. As architecture.
Getfolio launches soon. The early version is live at getfolio.dev if you want to see how your GitHub data looks across different themes.
What's a technical decision in your projects that you got wrong twice before finding the right approach? Curious what other people's version 1 → version 3 journeys looked like 🛠️
Originally published on getfolio.dev.