filtede98

I built an API that converts any webpage to clean Markdown in under 1 second

## The Problem

I was building a RAG pipeline and needed a way to feed web content into my LLM. The options were:

  - Copy-paste the text manually: doesn't scale
  - Use Beautiful Soup's `get_text()`: returns raw text, loses all structure
  - Use a headless browser: slow, expensive, complex to maintain

None of them gave me what I actually needed: structured Markdown that preserves headings, tables, code blocks, and
links.

## So I Built WTM API

One POST request. Any URL. Clean Markdown back in under 1 second.


```bash
curl -X POST https://wtmapi.com/api/v1/convert \
  -H "x-api-key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/map"}'
```

What you get back:

```markdown
# Array.prototype.map()

The **map()** method of Array instances creates a new array
populated with the results of calling a provided function
on every element in the calling array.

## Syntax

map(callbackFn)
map(callbackFn, thisArg)

## Examples

const numbers = [1, 4, 9];
const roots = numbers.map((num) => Math.sqrt(num));
// roots is now [1, 2, 3]
```

Headings, code blocks with syntax hints, bold text, links: all preserved. Not just raw text.
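For calling the endpoint from application code instead of curl, a minimal TypeScript sketch might look like the following. The endpoint, the `x-api-key` header, and the `{"url": ...}` body come from the curl example; the function names are hypothetical, and the response is read as plain text only because the post does not document the response schema.

```typescript
// Endpoint and header name are taken from the curl example in the post.
const WTM_ENDPOINT = "https://wtmapi.com/api/v1/convert";

interface ConvertRequest {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds the request options separately so they can be
// inspected or tested without touching the network.
function buildConvertRequest(pageUrl: string, apiKey: string): ConvertRequest {
  // Fail early on malformed input; the URL constructor throws on invalid URLs.
  new URL(pageUrl);
  return {
    method: "POST",
    headers: {
      "x-api-key": apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ url: pageUrl }),
  };
}

// Assumption: the body is read as text because the response format
// (raw Markdown vs. a JSON envelope) is not stated in the post.
async function convertToMarkdown(pageUrl: string, apiKey: string): Promise<string> {
  const response = await fetch(WTM_ENDPOINT, buildConvertRequest(pageUrl, apiKey));
  if (!response.ok) {
    throw new Error(`WTM API request failed with HTTP ${response.status}`);
  }
  return response.text();
}
```

Keeping the request builder separate from the network call makes it easy to unit test the headers and body, and to swap in retries or a different HTTP client later.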

## How It Works

No headless browser. No Puppeteer. No Playwright.

The engine runs server-side with Cheerio (a lightweight HTML parser) and does four things:

1. Minimal cleanup: removes only `<nav>`, `<script>`, `<style>`, and cookie banners
2. Recursive conversion: walks the DOM tree and converts each element to its Markdown equivalent
3. URL resolution: relative links become absolute URLs
4. Table conversion: HTML tables become proper Markdown tables
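Steps 1 to 3 can be illustrated with a small sketch. This is not the engine's actual code: `HtmlNode` is a hypothetical, simplified tree type standing in for the parsed DOM that Cheerio would provide, and only a handful of tags are handled. Link resolution uses the standard `URL` constructor.

```typescript
// Hypothetical minimal node shape standing in for a parsed DOM tree.
type HtmlNode =
  | { kind: "text"; value: string }
  | { kind: "element"; tag: string; attrs: Record<string, string>; children: HtmlNode[] };

// Step 1 (cleanup): tags whose whole subtree is dropped.
const REMOVED_TAGS = new Set(["nav", "script", "style"]);

// Steps 2 and 3: walk the tree recursively, emit Markdown for each element,
// and resolve link targets against the page URL.
function toMarkdown(node: HtmlNode, baseUrl: string): string {
  if (node.kind === "text") return node.value;
  if (REMOVED_TAGS.has(node.tag)) return "";

  // Convert children first, then wrap the result according to the tag.
  const inner = node.children.map((child) => toMarkdown(child, baseUrl)).join("");

  switch (node.tag) {
    case "h1": return `# ${inner}\n\n`;
    case "h2": return `## ${inner}\n\n`;
    case "p": return `${inner}\n\n`;
    case "strong": return `**${inner}**`;
    case "code": return `\`${inner}\``;
    case "a": {
      const href = node.attrs["href"];
      if (href === undefined) return inner;
      // new URL(relative, base) produces an absolute URL.
      return `[${inner}](${new URL(href, baseUrl).href})`;
    }
    default: return inner; // unknown tags keep their content
  }
}

const page: HtmlNode = {
  kind: "element", tag: "div", attrs: {}, children: [
    { kind: "element", tag: "nav", attrs: {}, children: [{ kind: "text", value: "Home" }] },
    { kind: "element", tag: "h1", attrs: {}, children: [{ kind: "text", value: "Docs" }] },
    { kind: "element", tag: "p", attrs: {}, children: [
      { kind: "text", value: "See the " },
      { kind: "element", tag: "a", attrs: { href: "/guide" }, children: [{ kind: "text", value: "guide" }] },
      { kind: "text", value: "." },
    ] },
  ],
};

const markdown = toMarkdown(page, "https://example.com/docs/index.html");
// markdown === "# Docs\n\nSee the [guide](https://example.com/guide).\n\n"
```

The `<nav>` subtree disappears, the heading and paragraph keep their structure, and the relative `/guide` link comes out absolute.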

The result is a faithful conversion of the page content, not a lossy text extraction.
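Table conversion (step 4) comes down to emitting a header row, a separator row, and one row per table row. The sketch below assumes the cell text has already been extracted from the HTML; `tableToMarkdown` is a hypothetical name, not the engine's function.

```typescript
// Render a header row plus body rows as a Markdown pipe table.
function tableToMarkdown(header: string[], rows: string[][]): string {
  // Escape pipes so cell text cannot break the column layout.
  const cell = (text: string): string => text.trim().replace(/\|/g, "\\|");
  const line = (cells: string[]): string => `| ${cells.map(cell).join(" | ")} |`;
  // One "---" per column marks the row above it as the header.
  const separator = `| ${header.map(() => "---").join(" | ")} |`;
  return [line(header), separator, ...rows.map(line)].join("\n");
}

const table = tableToMarkdown(
  ["Method", "Returns"],
  [["map()", "new array"], ["forEach()", "undefined"]],
);
// | Method | Returns |
// | --- | --- |
// | map() | new array |
// | forEach() | undefined |
```

Escaping `|` inside cells matters on documentation pages, where table cells often contain code such as `a | b`.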

## What I Used to Build It

The entire stack runs on free tiers:

| Service | Purpose |
| --- | --- |
| Next.js 16 | Framework (App Router) |
| Supabase | Auth + PostgreSQL + Row Level Security |
| Stripe | Subscription billing (Free/Pro/Enterprise) |
| Vercel | Hosting and deployment |
| Cheerio | HTML parsing engine |

Total monthly cost: $0.

## Pricing

- Free: 50 calls/month (no credit card)
- Pro: $9/month for 10,000 calls
- Enterprise: $49/month for 100,000 calls
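To compare the tiers for a given workload, the arithmetic can be sketched in a few lines. The figures are copied from the list above; the sketch assumes the paid quotas are monthly, like the free tier, and the helper names are hypothetical.

```typescript
interface Plan {
  name: string;
  monthlyUsd: number;
  monthlyCalls: number;
}

// Figures copied from the pricing list; quotas assumed to be per month.
const PLANS: Plan[] = [
  { name: "Free", monthlyUsd: 0, monthlyCalls: 50 },
  { name: "Pro", monthlyUsd: 9, monthlyCalls: 10000 },
  { name: "Enterprise", monthlyUsd: 49, monthlyCalls: 100000 },
];

// Return the cheapest plan whose quota covers the expected volume,
// or undefined when even the largest plan is too small.
function cheapestPlanFor(callsPerMonth: number): Plan | undefined {
  return PLANS
    .filter((plan) => plan.monthlyCalls >= callsPerMonth)
    .sort((a, b) => a.monthlyUsd - b.monthlyUsd)[0];
}

// Effective price per call when the full quota is used.
function usdPerCall(plan: Plan): number {
  return plan.monthlyUsd / plan.monthlyCalls;
}
```

At full quota, Pro works out to $0.0009 per call and Enterprise to $0.00049 per call, so a pipeline ingesting around 2,500 pages a month fits comfortably in Pro.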

## Try It Now

There's a live demo on the site: 3 free conversions, no signup required. Paste any URL and see the output instantly.

https://wtmapi.com

I'd love to hear what you think. What URLs would you test it on? What features would you want next?