Building an AI persona that doesn't lie: the parts nobody bothers to build

#ai #opensource #python #showdev

Most "I built an AI bot that posts to social media" projects are about 80 lines of code: hit a news API, drop the result into a prompt, call an LLM, post whatever comes back. Mine started there too. The trouble is that the naive version does exactly what you'd expect. It's fluent, it's confident, and it gets technical details wrong constantly.

AlexaPavlova posts dry takes on tech news and open source as a senior Berlin developer, on Mastodon and Bluesky. The whole thing runs on hardware in my house: local LLM for the text, local image model for the pictures, nothing going out to a cloud API. The code is now up under MIT.

The part I actually care about isn't that it posts. It's how much has to happen between "here is a news story" and "here is something worth publishing." That gap is where the months went.

What it actually publishes: the persona's take plus a portrait from the image model.

The pipeline is the product

A single post goes through all of this before I ever lay eyes on it:

sources → curate → enrich → generate → sanitize → validate → regen? → approval → publish
  HN        pick     fetch +   local      ~10 deter-  LLM grounding  one      Telegram   Mastodon
  Lobsters  slots    summarize  LLM        ministic    check, gated   retry    photo +    + Bluesky,
  blogs              (factual              cleanup     by a regex     with     inline     each with
                      only)                passes      pre-screen     failed   keyboard   its own
                                                                      claims              creds
                                                                      injected

The approval step is just my phone. Each post arrives as a Telegram card with the image and a version tailored to each platform, and nothing publishes until I approve it.

And that's only the generation path. Replies are a separate long-running process. It polls published posts at random intervals between 30 minutes and 2 hours, pulls the full reply tree rather than just the top-level replies, drafts an answer in voice, and sends it through the same approval step on Telegram. I kept its callback handling apart from the post-approval listener on purpose, because two pollers hitting the same Telegram update stream end up stepping on each other.

None of this is glamorous, and all of it earned its place the hard way. Here are the layers that matter.

Layer 1: Enrichment, with a hard "empty beats wrong" rule

Posts start from real stories. Each one gets fetched, the main text pulled out with lxml, then summarized by a separate LLM call. The summarizer prompt is deliberately boring: a neutral one-to-three-sentence factual extract, no opinion, no critique. The first version let the summary carry opinion, and the persona just parroted the source's framing back out as if it were its own. Splitting "what happened" from "what I think about it" is most of what keeps the posts from reading like recycled headlines.

The decision I'd point to is what happens on failure. Paywall, JS-only page with nothing to extract, timeout, summarizer error: the slot gets dropped, not faked. I'd rather publish three real things than four where one is invented. Summaries are cached for 24 hours so reruns and regenerations don't refetch, and hosts that never extract cleanly are skipped before they waste a slot.

Reading the article at all was a later change, and it mattered more than any prompt I wrote. The first version worked off the headline and whatever snippet the source handed back, and it showed. The takes were either generic or quietly invented, because the model was reacting to a title instead of to anything real. Fetching the page, pulling out the body, and summarizing it is what made the opinions land. It also made the system pickier, and that shows up in the numbers. Between enrichment dropping anything it can't read cleanly and the validator dropping anything it can't ground, a batch that starts as eight candidates now tends to publish two or three. It used to be seven or eight. That fall isn't the system breaking. It's the quality bar doing its job, and I'd rather it stay that way.

That rule, drop it rather than invent it, is the spine of the whole system. Everything after it exists to enforce the same thing harder.

Layer 2: Deterministic sanitization before any LLM judgment

Before the expensive validation, the raw output runs through about ten deterministic cleanup passes. Plain functions, no model, no cost. They strip the things the LLM gets wrong in predictable ways: banned hashtags, leaked internal slot IDs, the byline it likes to sign at the end, orphaned sentence openers, the same opening tic over and over, asterisk self-censorship, emoji wrapped in XML-looking tags. Each persona carries its own hashtag allowlist and banned-phrase list in its identity file.

The principle is just: don't put a model on what a regex can do. A surprising amount of "bad LLM output" is mechanical, and mechanical problems have mechanical solutions that run faster and stay consistent in a way an LLM cleanup pass never does.

There's a deeper reason this matters, and it took me a while to see it. My first instinct whenever a post came out wrong was to add another rule to the prompt. But every rule competes for the model's attention, and past a point the model just satisfies a few constraints and lets the rest go, usually starting with the overall coherence. On small quantized models that ceiling comes fast. Every behavior I could push out of the prompt and into deterministic code was attention I handed back to the writing itself. The cleanup layer isn't only about speed. It's about keeping the prompt short enough that the model can still hold a voice.

Layer 3: A regex pre-screen in front of the LLM validator

Then the grounding check, which is an LLM pass asking whether the claims in a post are actually supported by the source material. It works, but it costs a round trip, and the round trip is slow and sometimes wrong in its own right.

What made it cheap was noticing that the most common hallucinations have a shape. Invented file paths, file:line references, version strings like v2.3.1, error codes, CVE numbers. All of those match a regex. So a pre-screen runs first, and if it finds a core/handler.py:88 token that points at nothing real, the post is flagged before any LLM call happens. The cheap deterministic check catches the common case instantly, and the model only looks at what survives it.

A failed post gets one regeneration attempt, with the specific claims that failed fed back into the prompt so the retry knows what to avoid instead of rolling the dice again.

Layer 4: The generator and the validator have to want the same thing

This is the one that cost me the most time. Early versions of the persona prompt asked for the exact thing the validator then threw out. The prompt said to "reference specific technical details," the model read that as "invent a plausible file path," the grounding check caught the invented path, and the post got dropped. I was burning compute to generate posts designed to fail my own quality gate, then burning more compute to detect that they'd failed.

The fix had to touch both ends. The prompts now say outright not to invent file paths, line numbers, version numbers, function names, or API endpoints, and the validator enforces that same line. If your generate-then-validate loop rejects a lot, the problem usually isn't the model. It's that the prompt is asking for something the validator is built to punish.

Memory had a version of the same bug. I used to feed whole previous post bodies into the context for continuity, and the model pattern-matched them. If last week's post had a file:line reference in it, this week's would invent a fresh one, because a few examples had taught it that "posts like this contain specifics." Switching to titles only, framed as "topics already covered, don't repeat them," killed that whole category. Examples in context are instructions whether you mean them that way or not.

What this adds up to

The rest is plumbing. Each persona is isolated in its own directory with its own identity file, prompts, and SQLite databases, so adding one is a directory copy. A scheduler that stays in UTC end to end. A dispatcher loop. Two separate Telegram approval surfaces. Publishing to both Mastodon and Bluesky with per-account credentials, each platform getting its own version of the text generated to fit its length limit. Around 312 tests, all mocked, holding it together.

The thing I keep coming back to is that getting an LLM to produce text is the easy 20%. Getting it to reliably not make things up is the other 80%, and that 80% is curation, the discipline of dropping a story instead of faking it, deterministic cleanup, layered validation, and a human with a veto. Almost none of the hard part is the model. Almost all of it is the scaffolding around the model.

How it actually runs

Two machines. A Raspberry Pi 4 does all the orchestration and never turns off: it holds the SQLite databases, runs the three persistent processes as systemd units so they come back after a reboot, and fires the daily batch from cron. A laptop with a GPU does the heavy lifting, and only when there's actually something to generate. The two talk over the LAN.

Four processes do the work:

main_batch.py — the daily run, kicked off by cron. Pulls stories, builds up to eight candidate posts and their images, and sends the ones that survive enrichment and validation to Telegram for approval, then exits. One-shot, not a daemon.
main_telegram_listener.py — a persistent loop handling the APPROVE / REGEN / CANCEL buttons on each Telegram card.
main_dispatcher.py — a persistent loop that publishes approved posts to Mastodon and Bluesky at their scheduled time.
main_reply_listener.py — a persistent loop that polls published posts for replies, drafts an answer, and routes it back through Telegram for approval.

The GPU side is just two local HTTP services the Pi calls into: LMStudio for the text model and SwarmUI for the images.

What keeps the standing cost low is the split, not the hardware. The always-on half is a Pi 4 pulling a few watts, on 24/7. The inference half is not modest: it's an MSI Vector laptop with an RTX 5080 (16 GB), a real GPU machine. The point is that it doesn't have to run around the clock. A 14B model quantized to GGUF fits comfortably in 16 GB, and for one persona posting a handful of times a day the GPU only does real work during the daily batch and the occasional reply draft, idle the rest of the time. The thing that's always on costs almost nothing; the thing that costs something is almost never on.

The Pi 4 that never sleeps: SQLite, the listeners, the daily cron.

The MSI that does the inference — only awake when there's something to generate.

The code

MIT, runs end to end: https://github.com/msalsas/amanuensis. The text runs on a local GGUF-quantized model served through LMStudio. I went through a few over the project: Hermes 3 (Llama 3.1 8B) to start, Mistral Nemo Instruct in the middle, and Hermes 4 14B by the end. The code doesn't care which one is loaded, it just talks to the LMStudio endpoint. One thing did shape the choice: the model has to be uncensored enough to actually voice the persona's sarcastic criticism. The more heavily aligned models kept refusing or sanding the takes down into corporate mush, which kills the character outright. Images are Juggernaut XL with a LoRA I trained myself on a fully synthetic dataset, so the face stays consistent across posts and isn't based on any real person. Telegram handles approvals, Mastodon and Bluesky handle publishing. Nothing calls out to a cloud model.

One war story, because it makes the model-agnostic thing concrete. The MSI isn't a dev machine, it's a dedicated inference server, but it hosts whatever model I need at the time: Qwen when I'm using it for coding from my workstation, Hermes for Alexa. The batch uses whatever's loaded, and one day Qwen was. It generated the posts in its thinking mode, the reasoning tokens bled into a response format the pipeline wasn't expecting, and the batch fell over. Naming the model in config doesn't fully fix this, because the model still has to actually be loaded into LMStudio. I wrote a small tool to load and swap models programmatically (llm-control) but never wired it into the pipeline, so loading the right model before a run stayed a manual step I had to remember.

It's an experiment rather than a maintained product, so issues and PRs may sit. But it works, it's tested, and the anti-fabrication parts carry over to any LLM pipeline where a wrong answer is worse than no answer.