Hopkins Jesse

Building My Personal AI Content Pipeline: Architecture, Tools, and Numbers

I used to spend three hours every Sunday drafting technical posts. That stopped eight months ago. Now I publish four posts a week with roughly twenty minutes of active editing. The trick was not asking an AI to write for me. The trick was building a system that handles repetitive formatting and structural assembly while keeping me in control. I call it my content pipeline. It processes about 42 raw notes per week, costs me $11.63 monthly to run, and publishes directly to my static site generator. Here is exactly how I built it, what it runs on, and the numbers that actually matter.

The pipeline splits into four stages. First comes capture. I dump ideas, code snippets, and rough paragraphs into a local Obsidian vault. Second is enrichment. A Python script pulls those files, runs basic formatting checks, and pushes clean markdown to a Redis queue. Third is generation. Workers pick up items from the queue, call an LLM with structured prompts, and return draft variations. Fourth is review and publish. I read the drafts in a custom web interface, make edits, and trigger a GitHub Actions workflow that builds and deploys the site. Nothing runs in the cloud except the LLM calls and the deployment runner. Everything else sits on a single Raspberry Pi 4.
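To make the enrichment handoff concrete, here is a minimal sketch of the script that feeds the queue. The vault path, queue name, and the crude frontmatter check are illustrative stand-ins, not the production code:

```python
import json
from pathlib import Path

import redis

VAULT = Path.home() / "vault"   # illustrative path, not my real vault location
QUEUE = "content:generation"    # illustrative queue name

r = redis.Redis(host="localhost", port=6379, db=0)

def enqueue_ready_notes() -> int:
    """Push every markdown note with a frontmatter block onto the Redis queue."""
    pushed = 0
    for path in VAULT.glob("**/*.md"):
        text = path.read_text(encoding="utf-8")
        if not text.startswith("---"):  # crude check, stand-in for the real validation
            continue
        r.rpush(QUEUE, json.dumps({"path": str(path), "markdown": text}))
        pushed += 1
    return pushed
```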

I chose Obsidian for capture because the folder structure maps directly to my content categories. Each note gets a YAML frontmatter block with status, tags, and target word count. Redis acts as the message broker; it handles my roughly 600 jobs a week without dropping connections. The worker service runs on FastAPI with Celery. I could have used plain cron jobs, but Celery's retry logic saves me when API rate limits hit. For generation, I route prompts to Claude 3.5 Sonnet. I tested GPT-4, Llama 3, and Mistral across 200 test prompts, and Sonnet won on technical accuracy, especially when asked to preserve my existing code samples. I cap each prompt at 8,000 tokens, which keeps costs predictable. The review interface is a small Next.js app that fetches drafts from PostgreSQL and exposes simple approve and edit buttons.
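That frontmatter block is the contract between capture and everything downstream, so it gets validated before a note can move on. A minimal sketch of that parser, assuming PyYAML, with `target_words` as an illustrative name for the word-count field:

```python
import yaml  # PyYAML

REQUIRED_FIELDS = {"status", "tags", "target_words"}  # target_words is illustrative

def parse_frontmatter(markdown: str) -> dict:
    """Extract the YAML frontmatter block and check the required fields."""
    if not markdown.startswith("---"):
        raise ValueError("note has no frontmatter block")
    # Frontmatter is the text between the first two '---' delimiters.
    _, block, _ = markdown.split("---", 2)
    meta = yaml.safe_load(block)
    missing = REQUIRED_FIELDS - meta.keys()
    if missing:
        raise ValueError(f"note is missing frontmatter fields: {missing}")
    return meta
```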

The most critical piece lives in the prompt builder. I do not send raw notes to the model. I sanitize, chunk, and attach constraints. Here is the exact function that prepares the payload.

```python
from dataclasses import dataclass

@dataclass
class Note:
    clean_markdown: str  # sanitized markdown produced by the enrichment stage

def prepare_generation_payload(note: Note) -> dict:
    # Hard constraints go into the system prompt so the model cannot drift.
    constraints = [
        "Keep the tone conversational but precise.",
        "Do not invent APIs or libraries.",
        "Preserve all code blocks exactly as provided.",
        "Target 1200 to 1500 words.",
        "End with a single practical takeaway."
    ]
    payload = {
        "model": "claude-3-5-sonnet-20241022",
        "max_tokens": 2048,
        "temperature": 0.3,
        "system": f"You are a senior developer editing a draft. Follow these rules: {', '.join(constraints)}",
        "messages": [
            {"role": "user", "content": f"Transform this raw note into a complete article:\n\n{note.clean_markdown}"}
        ]
    }
    return payload
```
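Because the payload keys mirror the Messages API parameters, sending it is a single call with the official anthropic SDK. A minimal sketch that reuses `Note` and `prepare_generation_payload` from above, minus the error handling the worker adds:

```python
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def generate_draft_text(note: Note) -> str:
    """Send the prepared payload and return the draft text."""
    response = client.messages.create(**prepare_generation_payload(note))
    return response.content[0].text  # the reply's first text block
```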

I keep the temperature at 0.3. Higher values introduce fluff that I always end up deleting. The strict system message stops the model from adding generic introductions. I run this function inside a Celery task that retries three times with exponential backoff. If the third attempt fails, it logs to a CSV and notifies me via Discord webhook. I get about a two percent failure rate on API calls, which usually traces back to transient network hiccups.
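The retry wiring is standard Celery. A condensed sketch, with the webhook URL and CSV path as placeholders and the API call stubbed out:

```python
import csv
from datetime import datetime, timezone

import requests
from celery import Celery

app = Celery("pipeline", broker="redis://localhost:6379/0")

DISCORD_WEBHOOK = "https://discord.com/api/webhooks/..."  # placeholder URL
FAILURE_LOG = "failures.csv"                              # placeholder path

def call_llm(note_id: str) -> str:
    ...  # stand-in for the API call built by prepare_generation_payload

@app.task(bind=True, max_retries=3)
def generate_draft(self, note_id: str) -> str:
    try:
        return call_llm(note_id)
    except Exception as exc:
        if self.request.retries >= self.max_retries:
            # Third retry failed: record it and ping Discord, then give up.
            with open(FAILURE_LOG, "a", newline="") as f:
                csv.writer(f).writerow(
                    [datetime.now(timezone.utc).isoformat(), note_id, str(exc)]
                )
            requests.post(DISCORD_WEBHOOK, json={"content": f"Generation failed for {note_id}: {exc}"})
            raise
        # Exponential backoff: wait 1s, 2s, then 4s between attempts.
        raise self.retry(exc=exc, countdown=2 ** self.request.retries)
```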

The pipeline handles 420 items per month. I spend roughly 80 minutes a week on active writing, which includes reviewing AI drafts, fixing broken links, and adding personal context. My total running cost sits at $11.63 a month. The Pi draws about three watts, and Redis runs on the same box. The only external charge is the LLM API, which averages $9.40 per month because I limit context windows and cache frequent queries. Draft quality comes in at 91 percent on the first pass, which I define as needing only minor grammar fixes and no factual corrections. The remaining nine percent get rewritten manually or sent back with adjusted prompts. Publishing takes 42 seconds from clicking approve to seeing the live URL.
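The cache is nothing clever: responses keyed by a hash of the payload, stored in the same Redis instance. A sketch, with the 24-hour TTL as an illustrative value:

```python
import hashlib
import json

import redis

def cached_generate(payload: dict, call_api, r: redis.Redis, ttl: int = 86400) -> str:
    """Return a cached response for this payload, hitting the API only on a miss."""
    key = "llm:" + hashlib.sha256(
        json.dumps(payload, sort_keys=True).encode()
    ).hexdigest()
    hit = r.get(key)
    if hit is not None:
        return hit.decode()   # cache hit: no API charge
    text = call_api(payload)  # cache miss: one paid generation
    r.setex(key, ttl, text)   # expire entries so stale drafts age out
    return text
```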

I learned three things while tuning this setup. First, structured constraints beat creative freedom. When I told the model to write however it wanted, I got three paragraphs of filler before the actual technical content appeared. Now I force it to mirror my existing outline. Second, local processing cuts latency by 70 percent. I used to upload files to S3, trigger a Lambda, and wait for a callback. Moving to Redis and Celery on the Pi removed that network hop entirely. Third, version control matters more than I expected. Every draft gets committed to a private Git repository with a timestamp tag. I can diff any article against its original note and see exactly what changed. That saved me twice when I accidentally published outdated security warnings.
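The versioning step reduces to a few subprocess calls. A sketch, with the repo location and tag format as illustrative choices:

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path

REPO = Path.home() / "drafts"  # illustrative location of the private repo

def commit_draft(draft_path: Path) -> str:
    """Commit a draft and tag it with a UTC timestamp so any version can be diffed."""
    tag = "draft-" + datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")
    subprocess.run(["git", "-C", str(REPO), "add", str(draft_path)], check=True)
    subprocess.run(["git", "-C", str(REPO), "commit", "-m", f"Draft: {draft_path.name}"], check=True)
    subprocess.run(["git", "-C", str(REPO), "tag", tag], check=True)
    return tag
```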

The pipeline will not replace your voice. It handles repetitive assembly so you can focus on the parts that actually require human judgment. I still write the initial observations. I still verify every command and configuration. The AI just turns scattered notes into readable drafts. If you want to replicate this, start small. Route ten notes through a basic script. Measure the time spent editing. Adjust the prompt constraints. Add a queue when you hit rate limits. Ship when the numbers make sense. I am currently adding an image generation step that pulls code screenshots and converts them to optimized WebP files. That should cut another twelve minutes from my weekly routine. The system grows with the work. It stays practical because every component has a clear purpose and a measurable output.
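For reference, the conversion piece of that upcoming step is only a few lines with Pillow; the quality setting here is an illustrative value I have not finished tuning:

```python
from pathlib import Path

from PIL import Image

def to_webp(src: Path, quality: int = 80) -> Path:
    """Convert a screenshot to a WebP file next to the original."""
    dst = src.with_suffix(".webp")
    Image.open(src).save(dst, "WEBP", quality=quality)
    return dst
```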


