Digiwares

Posted on Jan 23

I Built a Free Article-to-Audio Converter!

#podcast #webdev #nextjs #openai

I Built a Free Article-to-Audio Converter in a Weekend

My "read later" list was out of control. Hundreds of articles I'd never get to. So I built Sornic — paste any article URL, get audio in seconds.

The Problem

I wanted to catch up on articles while commuting, cooking, or working out. But:

Articles saved to Pocket/Instapaper just pile up unread
Most TTS apps sound robotic
Browser extensions are clunky

I wanted something dead simple: URL in, audio out.

The Stack

Next.js 14 (App Router)
OpenAI TTS API (natural voices)
Vercel (hosting + serverless functions)
Upstash Redis (rate limiting)
Tailwind CSS (styling)

How It Works

1. Article Extraction

When you paste a URL, the server fetches the page and extracts the article content using Mozilla's Readability library (same one Firefox uses for Reader View).

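import { JSDOM } from 'jsdom';
import { Readability } from '@mozilla/readability';

const dom = new JSDOM(html, { url }); // `url` lets jsdom resolve relative links
const reader = new Readability(dom.window.document);
const article = reader.parse(); // null when no article content is detected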

For JS-heavy sites that don't work with a simple fetch, I fall back to Firecrawl.
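How to decide that the simple fetch "didn't work" isn't spelled out here. One possible heuristic, sketched below, treats a missing or very short Readability result as the signal to fall back. The `needsFallback` name, the `ExtractedArticle` shape, and the 500-character threshold are all assumptions for illustration, not the actual Sornic logic.

```typescript
// Minimal shape of a Readability result (only the field used here).
interface ExtractedArticle {
  textContent: string | null | undefined;
}

// Hypothetical heuristic: fall back when extraction returned nothing,
// or when the extracted text is too short to be a real article.
function needsFallback(
  article: ExtractedArticle | null,
  minChars: number = 500 // assumed threshold, tune for your content
): boolean {
  if (article === null) return true;
  const text = (article.textContent || "").trim();
  return text.length < minChars;
}
```

In the extraction flow this would sit right after `reader.parse()`: if `needsFallback(article)` is true, hand the URL to the fallback scraper instead.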

2. Content Cleanup

Raw extracted text often includes navigation, ads, and "Subscribe now!" prompts. I use Claude Haiku to clean it up:

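import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment
const response = await anthropic.messages.create({
  model: 'claude-haiku-4-20250514',
  max_tokens: 4096, // required by the Messages API; caps the cleaned output length
  messages: [{
    role: 'user',
    content: `Clean this article for text-to-speech.
              Remove nav, ads, CTAs. Keep only the article body.
              ${rawText}`
  }]
});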

3. Text-to-Speech

OpenAI's TTS API accepts at most 4096 characters per request, so I chunk long articles:

function splitTextIntoChunks(text: string, maxLength: number): string[] {
  // Break at sentence boundaries when possible
  const sentenceMatch = remaining.slice(0, maxLength).match(/.*[.!?]\s/s);
  // ...
}
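The excerpt above elides the loop around the sentence match. A complete version might look like the following sketch. It assumes a hard cut at `maxLength` when no sentence boundary is found; the name `chunkText` and that fallback behavior are assumptions, not the code Sornic actually runs.

```typescript
function chunkText(text: string, maxLength: number): string[] {
  if (maxLength < 1) throw new RangeError("maxLength must be at least 1");

  const chunks: string[] = [];
  let remaining = text.trim();

  while (remaining.length > maxLength) {
    const head = remaining.slice(0, maxLength);
    // Greedy match: everything up to the last sentence end followed by whitespace.
    // [\s\S] matches any character including newlines, like `.` with the `s` flag.
    const sentenceMatch = head.match(/[\s\S]*[.!?]\s/);
    // Assumed fallback: hard cut at maxLength if the window has no sentence boundary.
    const cut = sentenceMatch ? sentenceMatch[0].length : maxLength;
    chunks.push(remaining.slice(0, cut).trim());
    remaining = remaining.slice(cut).replace(/^\s+/, "");
  }

  if (remaining.length > 0) chunks.push(remaining);
  return chunks;
}
```

Every returned chunk is at most `maxLength` characters, so each one can be sent as a single TTS request.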

Then generate audio for each chunk and concatenate:

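const audioBuffers: ArrayBuffer[] = [];
for (const chunk of chunks) {
  // One request per chunk so each input stays under the character limit
  const mp3Response = await openai.audio.speech.create({
    model: 'tts-1',
    voice: 'nova',
    input: chunk,
    speed: 1.0
  });
  audioBuffers.push(await mp3Response.arrayBuffer());
}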
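The loop collects one buffer per chunk; the concatenation step itself isn't shown. A minimal sketch follows, assuming the MP3 segments can simply be joined byte-for-byte. That tends to work because MP3 is a sequence of independent frames, but it is worth verifying in your target players. `concatBuffers` is a hypothetical helper name.

```typescript
// Join several ArrayBuffers into one contiguous byte array, preserving order.
function concatBuffers(buffers: ArrayBuffer[]): Uint8Array {
  const total = buffers.reduce((sum, b) => sum + b.byteLength, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const b of buffers) {
    out.set(new Uint8Array(b), offset); // copy this segment at the current offset
    offset += b.byteLength;
  }
  return out;
}
```

The result of `concatBuffers(audioBuffers)` could then be returned from the API route with an `audio/mpeg` content type.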

4. Rate Limiting

The free tier allows 5 articles per day per IP, enforced with Upstash Redis:

import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.fixedWindow(5, '24h')
});
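To make the fixed-window behavior concrete, here is a small in-memory sketch of the idea: requests are counted per identifier per window, and the count starts over when a new window begins. This is an illustration only, not how `@upstash/ratelimit` is implemented; `FixedWindowLimiter` and its `allow` method are hypothetical names.

```typescript
// In-memory illustration of a fixed-window limiter.
// Old window keys are never cleaned up here; a real store would expire them.
class FixedWindowLimiter {
  private counts: Record<string, number> = {};

  constructor(private limit: number, private windowMs: number) {}

  // Returns true if the request is allowed. `now` is injectable for testing.
  allow(id: string, now: number = Date.now()): boolean {
    // All requests that fall in the same window share one counter key.
    const windowIndex = Math.floor(now / this.windowMs);
    const key = id + ":" + windowIndex;
    const used = this.counts[key] || 0;
    if (used >= this.limit) return false;
    this.counts[key] = used + 1;
    return true;
  }
}
```

One property of fixed windows worth knowing: a client can use its full quota just before a window boundary and again just after it, so short bursts of up to twice the limit are possible.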

Challenges

1. Vercel Timeouts

The default function timeout is 10 seconds, but long articles can take 30-60 seconds to process. I fixed it with:

// vercel.json
{
  "functions": {
    "app/api/**/*.ts": { "maxDuration": 60 }
  }
}
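Next.js also lets a route declare its own limit through the `maxDuration` route segment config, which may be convenient when only one endpoint is slow. The file path in the comment is a hypothetical example, not necessarily Sornic's layout, and the value is still subject to whatever ceiling your hosting plan allows.

```typescript
// app/api/convert/route.ts (hypothetical path)
// Route segment config: maximum execution time for this route, in seconds.
export const maxDuration = 60;
```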

2. ESM/CommonJS Conflicts

jsdom v27 broke on Vercel due to ESM issues, so I downgraded to v24:

npm install jsdom@24.1.3

3. Sites Blocking Scraping

Some sites block server-side requests. Firecrawl handles these as a fallback — it uses headless browsers and handles anti-bot measures.

Cost Breakdown

Per article (~2000 words):

OpenAI TTS: ~$0.03
Claude Haiku cleanup: ~$0.001
Vercel/Upstash: Free tier

With the rate limit capping each IP at 5 free articles per day, costs stay manageable.
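OpenAI prices `tts-1` per input character, so the per-article figure scales with character count rather than word count. Here is a hedged sketch of the arithmetic. `estimateTtsCost` is a hypothetical helper, and both the characters-per-word average and the price are placeholders to fill in from the provider's current pricing page.

```typescript
// Rough per-article TTS cost from a word count.
// The numeric inputs are assumptions to be replaced with real values.
function estimateTtsCost(
  words: number,
  charsPerWord: number,         // assumed average, including the trailing space
  pricePerMillionChars: number  // look up the current rate for your TTS model
): { chars: number; requests: number; cost: number } {
  const chars = Math.round(words * charsPerWord);
  const requests = Math.ceil(chars / 4096); // one request per 4096-character chunk
  const cost = (chars / 1e6) * pricePerMillionChars;
  return { chars, requests, cost };
}
```

Multiplying the resulting `cost` by the daily cap gives a worst-case spend per IP per day.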

What's Next

Download as MP3
Browser extension
Playlist/queue feature
Premium tier with more articles

Try It

sornic.com — no signup required.

Drop a URL, pick a voice, hit play. Would love feedback on what features would make it more useful.

Tags: webdev, nextjs, openai, javascript

Top comments (3)

Marat Sabitov • Jan 23

Wow, it sounds great! 🚀
It would be awesome if you had a Telegram bot that would send audio in response to a URL. For me, using bots is often more convenient than visiting websites.

Digiwares • Jan 24 • Edited

Just built it! 🚀

@SornicBot on Telegram - send any URL, get audio back.

Thanks for the suggestion. Let me know how it works for you.