DEV Community

Anal Shaju

I built a CLI tool that converts messy webpages into clean markdown for AI tools

The Problem

Every time I paste a webpage into Claude or ChatGPT, I waste tokens on navigation bars, footers, ads, and other junk.

What I Built

grabctx is a CLI tool that strips out everything except the main content of a page and gives it back to you as clean markdown.

How It Works

npm install -g grabctx
grabctx https://any-article.com --copy

It fetches the page, extracts the main content using Mozilla's
Readability algorithm (the same one Firefox uses for Reader View),
converts it to markdown, and shows you how many tokens you saved.
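For anyone curious how those steps fit together, here is a rough sketch of the pipeline using the libraries listed under Tech Stack. This is not grabctx's actual source: the names `htmlToMarkdown`, `fetchAsMarkdown`, and `ExtractResult` are made up for illustration, and counting "before" tokens on the raw HTML is my assumption about what the tool measures.

```typescript
import { Readability } from "@mozilla/readability";
import { parseHTML } from "linkedom";
import TurndownService from "turndown";
import { encode } from "gpt-tokenizer";

// Hypothetical result shape, not grabctx's real output format.
interface ExtractResult {
  title: string;
  markdown: string;
  tokensBefore: number;
  tokensAfter: number;
}

// Hypothetical helper: HTML string in, cleaned markdown plus token counts out.
export function htmlToMarkdown(html: string): ExtractResult | null {
  // linkedom builds a lightweight DOM without needing a browser.
  const { document } = parseHTML(html);

  // Readability picks out the main article; parse() returns null if it finds none.
  const article = new Readability(document as unknown as Document).parse();
  if (!article || !article.content) return null;

  // Turndown converts the cleaned article HTML into markdown.
  const markdown = new TurndownService().turndown(article.content);

  return {
    title: article.title ?? "",
    markdown,
    // Assumption: "before" is the token count of the raw HTML.
    tokensBefore: encode(html).length,
    tokensAfter: encode(markdown).length,
  };
}

// Hypothetical wrapper that fetches a URL first (global fetch, Node 18+).
export async function fetchAsMarkdown(url: string): Promise<ExtractResult | null> {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`Request failed with status ${response.status}`);
  return htmlToMarkdown(await response.text());
}
```

Keeping the conversion separate from the fetch makes it possible to try the extraction on a saved HTML file without touching the network.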

Results

Tested on a Wikipedia page:

  • Before: 91,461 tokens
  • After: 19,757 tokens
  • Saved: 78%
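The saved percentage is just the token difference divided by the original count. A quick sketch of that arithmetic, with `tokenSavingsPercent` being a name I made up for this example rather than anything from grabctx:

```typescript
// Hypothetical helper: percentage of tokens removed by the cleanup.
function tokenSavingsPercent(before: number, after: number): number {
  if (before <= 0) throw new RangeError("before must be a positive token count");
  return ((before - after) / before) * 100;
}

// Numbers from the Wikipedia test above.
const saved = tokenSavingsPercent(91_461, 19_757);
console.log(`${saved.toFixed(1)}%`); // prints "78.4%"
```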

Tech Stack

  • TypeScript
  • Node.js
  • @mozilla/readability
  • linkedom
  • turndown
  • commander
  • gpt-tokenizer

Links

Would love feedback from the community!
