I built a free tool so LLMs stop wasting tokens when they pull web pages

juneyoung — Thu, 25 Jun 2026 02:59:09 +0000

When I let an LLM go search the web for me, it grabs the whole page raw and
burns a bunch of tokens on stuff it doesn't actually need. I kept wishing it
could just pull the part that matters. So I built a small tool for that.

Lean Reader takes a URL and gives you back just the body text, cleaned up,
plus the numbers on how many tokens it saved. It tells you which model and
tokenizer it's counting against, so you can check it yourself.

For example, one React docs page went from 119,126 tokens down to 4,942.
That's about 96% fewer tokens, or roughly $0.29 saved at gpt-4o pricing.
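
As a rough check on those numbers, here is a minimal Python sketch of the arithmetic. The rate of $2.50 per million input tokens is an assumption used for illustration, not a figure reported by the tool, and token_savings is a hypothetical helper name.

```python
def token_savings(before: int, after: int, usd_per_million: float):
    """Return (tokens saved, percent saved, dollars saved) for one page."""
    saved = before - after
    pct = 100 * saved / before
    usd = saved * usd_per_million / 1_000_000
    return saved, pct, usd


# Token counts from the React docs example above.
# ASSUMPTION: 2.50 USD per 1M input tokens; swap in the current rate.
saved, pct, usd = token_savings(119_126, 4_942, 2.50)
print(f"{saved:,} tokens saved ({pct:.1f}%), about ${usd:.2f}")
```

With a different model or price, only the last argument changes.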

How it works is pretty boring. It grabs the page, throws out the nav, cookie
banners, scripts and that kind of junk, then runs two extractors and keeps
whichever one holds onto more of the actual body.
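
To make the "run two extractors, keep the better one" idea concrete, here is a minimal sketch using only the Python standard library. It is not Lean Reader's actual code. The two extractors (text inside article/main, and text inside p tags), the junk tag list, and "more body" measured as plain text length are all assumptions for illustration.

```python
from html.parser import HTMLParser

# ASSUMPTION: tags treated as junk; the real tool's list may differ.
JUNK_TAGS = {"script", "style", "nav", "header", "footer", "aside", "noscript"}


class TextCollector(HTMLParser):
    """Collects text found inside the given container tags, skipping junk tags."""

    def __init__(self, containers):
        super().__init__()
        self.containers = containers
        self.junk_depth = 0
        self.container_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in JUNK_TAGS:
            self.junk_depth += 1
        elif tag in self.containers:
            self.container_depth += 1

    def handle_endtag(self, tag):
        if tag in JUNK_TAGS:
            self.junk_depth = max(0, self.junk_depth - 1)
        elif tag in self.containers:
            self.container_depth = max(0, self.container_depth - 1)

    def handle_data(self, data):
        text = data.strip()
        if text and self.junk_depth == 0 and self.container_depth > 0:
            self.chunks.append(text)


def extract(html, containers):
    """Run one extractor: keep text inside `containers`, drop junk."""
    parser = TextCollector(containers)
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def best_body(html):
    """Run both hypothetical extractors and keep the longer result."""
    candidates = [
        extract(html, {"article", "main"}),  # extractor A: semantic containers
        extract(html, {"p"}),                # extractor B: paragraphs only
    ]
    return max(candidates, key=len)


sample = """
<html><body>
<nav><a href="/">Home</a></nav>
<main><h1>Hooks</h1><p>State lives in components.</p></main>
<script>track()</script>
<footer><p>Cookie settings</p></footer>
</body></html>
"""
print(best_body(sample))
```

On this sample, extractor A wins because it keeps the heading as well as the paragraph. On a page with no article or main tag, extractor A returns nothing and the paragraph-only result is used instead.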

It's not perfect. It fetches pages statically, so JS-heavy sites and GitHub
repos come out thin. For those, a tool that actually renders the page, like
Jina or Firecrawl, will do better. And the tech honestly isn't fancy. It
fetches, cleans, and counts tokens. That's it.

It's free and open source (MIT). You can run it with npx lean-reader, and it
works as an MCP server too, so an agent can call it mid-search.

Link: lean-reader-web.vercel.app

If you try it, I'd really like to know where it breaks for you.

DEV Community: juneyoung
