This is a submission for the GitHub Finish-Up-A-Thon Challenge
What I Built
SpendWise AI is a free tool that audits your AI tool spending (Cursor, Copilot, Claude, ChatGPT, Gemini, Windsurf) against verified vendor pricing and tells you exactly where you're overspending and what to do about it.
I originally built this as a week-long assignment for a startup. The problem it solves is simple: founders and engineering managers pay for multiple AI tools but have no idea if they're getting ripped off. SpendWise gives them that answer in under a minute, no signup needed.
The interesting part is that the core audit engine has zero AI in it. It runs 6 hardcoded rules against verified pricing data, so every recommendation is reproducible and verifiable. AI (Groq's Llama 3) only kicks in to write a friendly summary paragraph on top of the structured results. I made this choice because financial recommendations need to be deterministic. Same input, same output, every time.
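To make the "same input, same output" idea concrete, here is a minimal sketch of what a rule-based audit could look like. The rule names, types, and prices are assumptions for illustration, not the actual six SpendWise rules or its pricing table.

```typescript
// Illustrative sketch only: the rule names, types, and prices below are
// assumptions, not SpendWise's actual rules or pricing data.
type Pricing = Record<string, Record<string, number>>; // tool -> plan -> USD per seat per month

interface ToolEntry {
  tool: string;
  plan: string;
  seats: number;
  monthlySpend: number; // what the user reports paying, in USD
}

interface Recommendation {
  tool: string;
  rule: string;
  monthlySavings: number;
}

// Hardcoded pricing table (placeholder values).
const PRICING: Pricing = {
  cursor: { pro: 20, teams: 40 },
};

type Rule = (entry: ToolEntry) => Recommendation | null;

// Rule 1: reported spend is higher than list price for the plan and seat count.
const overListPrice: Rule = (entry) => {
  const price = PRICING[entry.tool]?.[entry.plan];
  if (price === undefined) return null;
  const savings = entry.monthlySpend - price * entry.seats;
  return savings > 0
    ? { tool: entry.tool, rule: "over-list-price", monthlySavings: savings }
    : null;
};

// Rule 2: a single seat on a team plan could move to the individual plan.
const soloOnTeamPlan: Rule = (entry) => {
  const plans = PRICING[entry.tool];
  if (!plans || entry.plan !== "teams" || entry.seats !== 1) return null;
  const teams = plans["teams"];
  const pro = plans["pro"];
  if (teams === undefined || pro === undefined || pro >= teams) return null;
  return { tool: entry.tool, rule: "solo-on-team-plan", monthlySavings: teams - pro };
};

const RULES: Rule[] = [overListPrice, soloOnTeamPlan];

// Pure function: no network calls, clock, or randomness, so the output
// depends only on the entries and the pricing table.
function runAudit(entries: ToolEntry[]): Recommendation[] {
  const recs: Recommendation[] = [];
  for (const entry of entries) {
    for (const rule of RULES) {
      const rec = rule(entry);
      if (rec) recs.push(rec);
    }
  }
  return recs;
}
```

Because every rule is a pure function of the entry and the pricing table, running the audit twice on the same input yields identical recommendations. An LLM summary would only ever read this structured output, never produce it.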
The stack is Next.js 16, TypeScript, Tailwind + shadcn/ui, Supabase for the database, Groq for AI summaries, Resend for emails, and Vitest for testing. Deployed on Vercel.
Live app: spendwise-ai-test.vercel.app
Source code: github.com/Karam-999/SpendWise-AI
Demo
The original audit tool:
The comeback (re-audit on pricing change):
You can try the Round 1 version live at spendwise-ai-test.vercel.app. Pick a tool like Cursor on the Teams plan at $40/mo, run the audit, and see the full savings breakdown. The Round 2 features (pricing change detection, re-audit diff view) live on a separate branch that is not merged to main yet, but the demo video above walks through the complete flow.
The Comeback Story
Where it was:
The original version was basically a calculator. You fill in your AI tools, it shows you where you can save money, and that's it. If Cursor changed its pricing the next week, your audit was already stale and you'd never know about it.
It worked fine as a one-time thing. It had the form, the audit engine, AI summaries, email lead capture, shareable URLs with OG tags, localStorage persistence, spam protection, 9 tests, and CI. But audits had no memory. No pricing snapshot, no change detection, no way to bring users back.
What I added:
I turned it into a system that actually stays useful after the first visit.
The big changes:
Every audit now stores the exact pricing data that was used at the time as a snapshot in the database. This way I can always compare what pricing looked like "then" vs "now."
Built a /api/detect-changes endpoint that accepts pricing overrides, compares them against every stored audit, re-runs the ones that are affected, and sends a consolidated email per user telling them what changed.
Added a re-audit diff view at /audit/[id]/reaudit that shows old vs new recommendations side by side, with the savings delta, changed/added/removed recommendations highlighted, and score comparison.
Had to refactor the audit engine to support injectable pricing. The original had pricing hardcoded as a private constant, which was the right call for Round 1 but made Round 2 harder. Ended up exporting the pricing, creating a typed version of the engine that accepts custom pricing, and updating 15 internal references. All 9 original tests still passed after the refactor.
Wrote 6 new tests for the pricing diff logic. Total is now 15 tests, build and type checking both pass clean.
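The changes above (pricing snapshot, injectable pricing, change detection, and the re-audit diff) can be sketched roughly as follows. This is a simplified illustration under assumed names and shapes: StoredAudit, applyOverrides, isAffected, diffRecommendations, and reaudit are hypothetical, and the single rule inside runAudit stands in for the real engine.

```typescript
// Hypothetical shapes for illustration; the real SpendWise types may differ.
type Pricing = Record<string, Record<string, number>>; // tool -> plan -> USD per seat per month

interface Entry { tool: string; plan: string; seats: number; monthlySpend: number }

interface StoredAudit {
  id: string;
  entries: Entry[];
  pricingSnapshot: Pricing; // pricing that was used when the audit first ran
}

interface Recommendation { key: string; monthlySavings: number }

interface AuditDiff {
  added: Recommendation[];
  removed: Recommendation[];
  changed: { key: string; before: number; after: number }[];
  savingsDelta: number;
}

// Engine with injected pricing: the table is a parameter, not a module constant.
function runAudit(entries: Entry[], pricing: Pricing): Recommendation[] {
  const recs: Recommendation[] = [];
  for (const e of entries) {
    const price = pricing[e.tool]?.[e.plan];
    if (price === undefined) continue;
    const savings = e.monthlySpend - price * e.seats;
    if (savings > 0) recs.push({ key: `${e.tool}:over-list-price`, monthlySavings: savings });
  }
  return recs;
}

// Overlay pricing overrides on a base table without mutating either one.
function applyOverrides(base: Pricing, overrides: Pricing): Pricing {
  const merged: Pricing = {};
  for (const tool of Object.keys(base).concat(Object.keys(overrides))) {
    merged[tool] = { ...base[tool], ...overrides[tool] };
  }
  return merged;
}

// An audit is affected when any tool/plan it uses is priced differently now
// than in its stored snapshot.
function isAffected(audit: StoredAudit, current: Pricing): boolean {
  return audit.entries.some(
    (e) => audit.pricingSnapshot[e.tool]?.[e.plan] !== current[e.tool]?.[e.plan],
  );
}

// Compare two recommendation lists by key.
function diffRecommendations(before: Recommendation[], after: Recommendation[]): AuditDiff {
  const oldByKey = new Map<string, Recommendation>();
  for (const r of before) oldByKey.set(r.key, r);
  const newKeys = new Set<string>(after.map((r) => r.key));

  const added = after.filter((r) => !oldByKey.has(r.key));
  const removed = before.filter((r) => !newKeys.has(r.key));
  const changed: AuditDiff["changed"] = [];
  for (const r of after) {
    const prev = oldByKey.get(r.key);
    if (prev && prev.monthlySavings !== r.monthlySavings) {
      changed.push({ key: r.key, before: prev.monthlySavings, after: r.monthlySavings });
    }
  }
  const total = (rs: Recommendation[]) => rs.reduce((sum, r) => sum + r.monthlySavings, 0);
  return { added, removed, changed, savingsDelta: total(after) - total(before) };
}

// Re-run only affected audits: once with the snapshot ("then"), once with current pricing ("now").
function reaudit(audit: StoredAudit, current: Pricing): AuditDiff | null {
  if (!isAffected(audit, current)) return null;
  return diffRecommendations(
    runAudit(audit.entries, audit.pricingSnapshot),
    runAudit(audit.entries, current),
  );
}
```

As a made-up example, take an audit stored with a snapshot of { cursor: { teams: 40 } } and one entry of 5 seats at $220/mo. Re-running it against an override of { cursor: { teams: 32 } } reports one changed recommendation, with savings going from $20 to $60 and a delta of $40. An audit whose tools are untouched by the override returns null and triggers no email.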
Things I intentionally skipped: HTML email templates (shipped plain text to get the full flow working instead of polishing one piece), admin dashboard (the data is queryable via SQL already), and scheduled cron (manual endpoint works, Vercel Cron needs Pro tier).
My Experience with GitHub Copilot
I used Copilot mostly for the boring stuff that would have eaten up time otherwise.
Documentation was the biggest win. I had a bunch of markdown files to write (architecture docs, PR descriptions, devlogs, reflections) and Copilot helped me get first drafts out fast. I'd review and rewrite them, but not having to start from a blank page saved a lot of time.
Bug fixing was another area where it helped. When Supabase inserts were silently failing (returning null for both data and error because of RLS), Copilot helped me quickly write the diagnostic logging to trace what was happening. It also caught Next.js lint issues like using <a> tags instead of <Link> and setState patterns that were wrong.
TypeScript type checking during the Round 2 refactor was where Copilot really earned its keep. I was changing 15 references from the hardcoded PRICING constant to an injected pricing parameter and creating new types for the pricing data structure. Copilot's inline suggestions made that mechanical refactoring way faster.
It also handled test scaffolding for the 6 new pricing diff tests. Copilot generated the boilerplate (imports, describe/it blocks) and I filled in the actual test logic.
What I didn't use it for: the actual audit rules (I verified all pricing manually against vendor pages), architectural decisions, and the business/strategy docs. Those need real thinking, not autocomplete.