How I built a local-first content processor with Python + Ollama

#python #ollama #productivity #showdev

I have 847 saved videos, articles, and podcasts I will never watch or read.

For two years I told myself I'd get to them. I didn't. The pile just grew.

The problem isn't discipline. It's that "save for later" is a one-way door — things go in but never come out as anything useful.

So I built something to fix that. It's called DRIP.

What it does

DRIP reads your saved content — YouTube videos, articles, podcasts — and converts each one into a structured Markdown note that drops directly into your Obsidian vault (or any folder you point it at).

A saved recipe becomes a note with an ingredient table and steps. A podcast becomes guest, topics, and key takeaways. A long article becomes a summary with the core arguments pulled out.

Everything runs locally on your machine. Nothing is uploaded anywhere.

The stack

Python for orchestration and file handling
Ollama for local LLM inference (llama3, but model-agnostic)
yt-dlp for pulling transcripts from YouTube
Readability for extracting clean article text
Whisper (optional) for podcasts without transcripts

The core loop is simple: fetch content → extract clean text → send to Ollama with a structured prompt → write Markdown to the output directory.

Why local-first

Two reasons.

Privacy. My saved content includes private reading habits, half-formed research, and things I save "just in case." I didn't want to pipe all of that through a third-party API.

Cost. Running a cloud LLM on 847 items would get expensive fast. Running Ollama locally is free after setup.

The tradeoff is speed — local inference is slower than API calls. But this is a background process; I kick it off and come back later.

The hard parts

Transcript quality varies a lot. YouTube auto-captions are often garbled, especially for technical talks. I added a cleaning pass before the LLM step to strip filler words and fix common OCR-style errors.

Getting reliable structured output from Ollama. I needed clean Markdown with consistent heading levels — not JSON, not prose. This took more prompt iteration than expected. The fix was being extremely explicit in the system prompt with a concrete example of the exact output format.

Concurrency limits. Even locally, hammering Ollama with many concurrent requests degrades output quality. I settled on a small queue with a configurable concurrency limit (default: 3).

Current state

It's working and I use it daily. I've processed about 600 items so far — the notes are genuinely useful and I've actually started surfacing things I saved years ago.

I packaged it as a one-time purchase tool at thebvl.com — $39, runs on your machine, no subscription, you own it.

Happy to answer questions about the implementation, particularly the Ollama prompt structure or the transcript pipeline. Both took longer to get right than I expected.

DEV Community