I built a free AI book-recommendation app — and the hardest part was staying under a 1,000/day API limit

#webdev #react #showdev #ai

I kept finishing a book I loved and then having no idea what to read next. Goodreads-style recommendations always felt generic, so I built My Next Book: you add a few books you've loved, and it returns five tailored picks, each with a one-line reason for the match plus cover, genre and a real description.

Live (no login, free): https://mynextbook.io

The stack

React + Vite, deployed on Vercel
Recommendations from Claude (Haiku) via a small serverless proxy, so the API key never touches the browser
Book metadata (covers, genre, page count, publisher blurb) from the Google Books API — real data, not model guesses
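
For illustration, here is a minimal sketch of the request such a proxy might build server-side. The function name, prompt wording, and `max_tokens` value are assumptions made for this example, not the app's actual code; the endpoint and headers follow Anthropic's Messages API, and the model ID is left as a parameter rather than guessed.

```typescript
// Hypothetical sketch: builds the upstream request a serverless proxy would send.
// The key is passed in from a server-side environment variable, so it never
// appears in any code shipped to the browser.

interface UpstreamRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildRecommendationRequest(
  lovedBooks: string[],
  apiKey: string, // read from a server-side env var
  model: string,  // model ID supplied via configuration (assumption)
): UpstreamRequest {
  // Illustrative prompt only; the real prompt is not shown in the post.
  const prompt =
    `I loved these books: ${lovedBooks.join("; ")}. ` +
    `Recommend five other books, each with a one-line reason.`;
  return {
    url: "https://api.anthropic.com/v1/messages",
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01",
      },
      body: JSON.stringify({
        model,
        max_tokens: 1024,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

Inside the serverless handler, the result would be passed to `fetch(url, init)` and the response forwarded to the browser, which only ever talks to the proxy's own URL.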

The actual hard part: the Google Books free tier

Google Books gives you ~1,000 calls/day on the free tier. With search-as-you-type plus enriching every recommendation with a cover and metadata, I blew through that almost instantly in testing.

What fixed it was layering caches:

Build-time bundled cache. A curated list of ~440 popular titles, with cover URLs fetched once at build time by a script and baked straight into the bundle. These show up instantly on load with zero API calls.
Threshold-based search. When a query has enough strong matches in that bundled cache, the app skips the Google API entirely and serves from the cache. The live API is only hit for searches the cache can't answer.
localStorage caching. Both search results and recommendation enrichment (covers/genre/description) are cached client-side with a TTL, so repeat lookups of the same book are free.
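
The threshold-based search layer can be sketched roughly as follows. The entry shape, the threshold of 3, and the substring matching are assumptions for this example; the post doesn't say how the app defines a "strong match".

```typescript
// Hypothetical shape of one entry in the build-time bundled cache.
interface BundledBook {
  title: string;
  author: string;
  coverUrl: string;
}

// Assumed threshold: how many local matches count as "enough".
const MIN_LOCAL_MATCHES = 3;

// Case-insensitive substring match on title or author. This only
// approximates a "strong match"; the real app may score matches differently.
function searchBundled(query: string, books: BundledBook[]): BundledBook[] {
  const q = query.trim().toLowerCase();
  if (q === "") return [];
  return books.filter(
    (b) => b.title.toLowerCase().includes(q) || b.author.toLowerCase().includes(q),
  );
}

// Serve from the bundled cache when it has enough matches; otherwise fall
// back to the live API (passed in as a function so it can be swapped or mocked).
async function search(
  query: string,
  books: BundledBook[],
  fetchLive: (q: string) => Promise<BundledBook[]>,
): Promise<{ source: "bundled" | "live"; results: BundledBook[] }> {
  const local = searchBundled(query, books);
  if (local.length >= MIN_LOCAL_MATCHES) {
    return { source: "bundled", results: local };
  }
  return { source: "live", results: await fetchLive(query) };
}
```

Passing `fetchLive` in as a parameter keeps the quota-spending call in one place, which makes it easy to count or log how often the live API is actually reached.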

Net effect: most common searches now resolve with zero network calls, and the daily quota goes a very long way.
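
To make the localStorage layer concrete, here is a minimal TTL cache sketch. The function names, entry shape, and key format are hypothetical; in the browser, `window.localStorage` can be passed as the `store` argument since it satisfies the small interface below.

```typescript
// Minimal storage contract; the browser's window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

interface CacheEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

// Store a JSON-serializable value with a time-to-live.
function cacheSet<T>(
  store: KeyValueStore,
  key: string,
  value: T,
  ttlMs: number,
  now: number = Date.now(),
): void {
  const entry: CacheEntry<T> = { value, expiresAt: now + ttlMs };
  try {
    store.setItem(key, JSON.stringify(entry));
  } catch {
    // Storage full or unavailable: skip caching rather than break the app.
  }
}

// Return the cached value, or null if missing, expired, or unreadable.
function cacheGet<T>(
  store: KeyValueStore,
  key: string,
  now: number = Date.now(),
): T | null {
  const raw = store.getItem(key);
  if (raw === null) return null;
  try {
    const entry = JSON.parse(raw) as CacheEntry<T>;
    if (typeof entry.expiresAt !== "number" || now >= entry.expiresAt) {
      store.removeItem(key); // evict stale or malformed entries
      return null;
    }
    return entry.value;
  } catch {
    store.removeItem(key); // unparseable entry: drop it
    return null;
  }
}
```

A lookup would call `cacheGet` first and only hit the Google Books API on a `null`, writing the response back with `cacheSet`, so a repeat lookup of the same book costs no quota until the TTL expires.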

What I'd love feedback on

Recommendation quality (does it actually "get" your taste?)
Anything in the UX that feels off

Top comments (2)

Mateo Ruiz • Jun 8

This is a great example of where the hardest engineering problem isn't the AI at all; it's the surrounding system design.

A lot of AI app showcases focus on prompts and models, but things like API quotas, caching strategies, latency, and cost control are usually what determine whether a project stays usable once real users show up.

I especially like the layered caching approach here. The bundled cache + threshold-based lookup + localStorage combination is a practical way to stretch a free-tier limit without degrading the user experience.

We've seen the same pattern in AI product development at IT Path Solutions: model quality gets attention, but thoughtful engineering around rate limits, data retrieval, caching, and infrastructure is often what makes an AI-powered product sustainable at scale.

Curious: did you notice any measurable difference in recommendation quality when users provided 2–3 books versus larger reading histories?

Andreas • Jun 8

Thanks for reading!

Good question — anecdotally, 2–3 books gives recs that are "close but generic," while 5+ starts surfacing patterns the user themselves might not have articulated (e.g. preference for unreliable narrators, or a specific era). The reasoning text also gets meatier.

I haven't measured it rigorously though. If anyone has tried the app and noticed a quality shift based on input size, I'd love to hear about it — that's actually more valuable feedback than my own intuition at this point.