MORINAGA

Posted on Jun 28

How I built a pre-post QC gate that blocks Bluesky automation from self-revealing

#ai #webdev #showdev #indiehackers

Three weeks into running a Bluesky queue from an automated content pipeline, I saw a post go out that referenced "the content pipeline" directly. Not egregiously — it was a passing phrase — but it was the kind of thing that reads differently on a social timeline than it does in a dev.to article. On dev.to, being honest about automation is a feature. On Bluesky, unprompted mentions of your own automation mechanism register as a red flag to human readers who are already primed to distrust content farms.

The JSONL-based queue I described earlier was working fine mechanically — entries generate, sit in the queue, and flush one at a time via a cron job. But there was no filter between the generation step and the post step. Whatever the prompt produced went into the queue, and whatever was at the front of the queue got posted. The Bluesky AT Protocol post API has no server-side content filter beyond spam detection, so the responsibility sits entirely with the client.

I spent a Saturday building bluesky-qc.mjs. It's a gate script that now runs as the first step in the posting workflow. Here's how it works.

The architecture: a gate between queue and post

Before bluesky-qc.mjs, the cron job was:

bluesky-post-queue.mjs → Bluesky API

After:

bluesky-qc.mjs → (PASS) bluesky-post-queue.mjs → Bluesky API

Both scripts read from the same content/bluesky-queue.jsonl file. The QC script walks entries in order, applies four gates to each, and either clears the first clean entry for the post script or moves failing entries to a rejection log. The post script then finds the first unposted entry in the queue — which, if QC just ran, should be a clean one.
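The walk described above can be sketched as a pure function. This is a minimal illustration under stated assumptions, not the code in bluesky-qc.mjs: the `runQc` name, the `posted` field, and the gate signature (entry in, list of rejection reasons out) are all hypothetical.

```javascript
// Hypothetical sketch of the QC walk; names are illustrative only.
// Each gate takes an entry and returns a list of rejection reasons (empty = pass).
function runQc(entries, gates) {
  const rejected = [];
  for (const entry of entries) {
    if (entry.posted) continue; // assumed flag: already-posted entries are skipped
    const reasons = gates.flatMap((gate) => gate(entry));
    if (reasons.length === 0) {
      // First clean entry found: stop here and hand it to the post step.
      return { cleared: entry, rejected };
    }
    rejected.push({ ...entry, qc_reasons: reasons });
  }
  // Nothing clean: the caller logs a message and exits 0 without posting.
  return { cleared: null, rejected };
}

// Usage with two toy gates standing in for G1 and G2.
const gates = [
  (e) => (/\bcron\b/i.test(e.text) ? ["G1: matched cron"] : []),
  (e) => (/\btoday\b/i.test(e.text) ? ["G2: stale phrasing"] : []),
];
const result = runQc(
  [
    { id: 1, text: "I run this as a cron job" },
    { id: 2, text: "A small CSS trick for sticky headers" },
    { id: 3, text: "Shipping today" },
  ],
  gates
);
console.log(result.cleared.id, result.rejected.length);
```

Stopping at the first clean entry means later entries are not evaluated until they reach the front, so each entry is judged close to the time it would actually post.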

In GitHub Actions this runs as pnpm bluesky:qc-then-post, a single composite command. If QC rejects everything in the queue and there's nothing clean to pass through, the workflow exits 0 without posting. Skipping a day is fine. Posting something that reads as automation-reveal isn't.

This connects to the same philosophy behind the Cloudflare Pages race fix I wrote about earlier — building explicit serialization into a pipeline that otherwise looks like it can run all steps in parallel.

Gate 1: vocabulary rejection

G1 is a compiled case-insensitive regex against a list of phrases that signal automated origin:

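// JavaScript regex literals cannot span lines and have no `x` flag,
// so the pattern is assembled from a list of terms joined with "|".
const REVEAL = new RegExp(
  [
    "programmatic", "content[\\s-]pipeline", "AI-curated", "AI-generated",
    "AI authorship", "my sites", "three (independent )?sites", "pages generated",
    "page count", "records_in", "llm_calls", "build_duration", "directory site",
    "scaled content", "batch test", "120 pages", "\\bcron\\b", "autonomous agent",
    "unsupervised PR", "curat", "generation", "generate", "disclos", "thin pages",
    "thin-content", "\\bprose\\b", "HCU", "index vs", "get dropped", "pruning",
    "bounce rate", "citation rate", "refresh signals", "content experiments",
  ].join("|"),
  "i" // case-insensitive
);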

This is aggressive by design. "curat" catches "curated" and "curation" — both common in automated content contexts. "generate" catches anything mentioning generation. Posts occasionally trip G1 on phrases I'd consider acceptable, and I intentionally leave the regex tight. If a post needs to reference generation or curation to make its point, Bluesky probably isn't the right channel for that thought; dev.to is.

The pattern grew over time. I started with about 15 terms and added ~15 more after specific entries cleared an early version and still felt off when I reviewed the rejection log. The final regex is the output of four weeks of manually auditing what the gate should have caught.

One thing I added late: \bcron\b. Even the word "cron" — in a sentence like "I run this as a cron job" — is a signaling phrase for automation in a social context where the default assumption is that people are posting manually.
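To make the rejection log useful, it helps if G1 reports which term fired. The sketch below is an assumption about how that could look: `REVEAL_SAMPLE` is a shortened stand-in for the full list, and the `gate1` name and reason format are hypothetical.

```javascript
// Shortened illustrative pattern: a subset of terms, not the full REVEAL list.
const REVEAL_SAMPLE = new RegExp(
  ["content[\\s-]pipeline", "\\bcron\\b", "curat", "generate"].join("|"),
  "i"
);

// Returns [] on pass, or a one-element list naming the first term that matched.
function gate1(entry) {
  const hit = REVEAL_SAMPLE.exec(entry.text);
  return hit ? [`G1: matched "${hit[0]}"`] : [];
}

console.log(gate1({ text: "A carefully curated list of CSS resets" }));
console.log(gate1({ text: "Three CSS resets I keep coming back to" }));
```

The first call is rejected because the "curat" stem matches inside "curated", which is exactly the kind of aggressive match described above. The second call passes with an empty list.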

Gate 2: freshness, two parts

Staleness shows up in two distinct forms.

Stale phrasing: the entry uses time-relative language that was accurate when generated but isn't accurate when it finally posts.

const STALE =
  /\btoday\b|this week|yesterday|this morning|just\s+(announced|released|landed|launched|dropped)/i;

An entry might say "just dropped" about something that was new when the prompt ran but is three days old by post time. G2a catches this before it reaches the timeline.

Stale timestamp: the entry was created more than TTL_DAYS = 14 days ago. An entry can sit that long when it was generated a while back and newer entries were queued ahead of it. Three field names are checked (created_at, generated_at, and createdAt) because the generation scripts in this repo aren't consistent:

const ts = entry.created_at || entry.generated_at || entry.createdAt;
if (ts) {
  const ageDays = (Date.now() - Date.parse(ts)) / 86400000;
  if (ageDays > TTL_DAYS) {
    reasons.push(
      `G2: ${Math.floor(ageDays)} days old (TTL ${TTL_DAYS})`
    );
  }
}

The 14-day TTL is based on my queue depth and posting rate. At one post per day, anything more than 14 entries deep in the queue was probably generated in a context that no longer applies — the tool being referenced may have had an update, the framing may feel dated.

Both sub-checks write separate rejection reasons so the log makes clear which flavor of staleness triggered.
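Combined, the two sub-checks might look like the sketch below. It is an illustration rather than the script's actual code: the `gate2` name, the injectable `now` parameter, and the G2a/G2b labels are assumptions. It also treats an unparseable timestamp as a failure, since `Date.parse` returns NaN in that case and a NaN age would otherwise pass the TTL comparison silently.

```javascript
const TTL_DAYS = 14;
const MS_PER_DAY = 86400000;
const STALE =
  /\btoday\b|this week|yesterday|this morning|just\s+(announced|released|landed|launched|dropped)/i;

// `now` is injected so the check can be exercised with a fixed clock (an assumption).
function gate2(entry, now = Date.now()) {
  const reasons = [];
  const phrase = STALE.exec(entry.text);
  if (phrase) reasons.push(`G2a: stale phrasing "${phrase[0]}"`);

  const ts = entry.created_at || entry.generated_at || entry.createdAt;
  if (ts) {
    const parsed = Date.parse(ts);
    if (Number.isNaN(parsed)) {
      // Assumption: an unreadable timestamp is rejected rather than passed.
      reasons.push(`G2b: unparseable timestamp "${ts}"`);
    } else {
      const ageDays = (now - parsed) / MS_PER_DAY;
      if (ageDays > TTL_DAYS) {
        reasons.push(`G2b: ${Math.floor(ageDays)} days old (TTL ${TTL_DAYS})`);
      }
    }
  }
  return reasons;
}

// Example with an arbitrary fixed clock.
const now = Date.parse("2025-06-28T00:00:00Z");
const stale = gate2({ text: "This just dropped", created_at: "2025-06-01T00:00:00Z" }, now);
console.log(stale);
```

With this clock the example entry fails both sub-checks: the phrase "just dropped" trips the first, and its age of 27 days exceeds the 14-day TTL.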

Gate 3: engagement prediction (warn-only in v1)

Gate 3 uses data/bluesky-engagement-profile.json, generated weekly by bluesky-engagement-stats.mjs. That script pulls up to my last 300 posts from the Bluesky API, calculates a score per post as likes + 2×reposts + replies (reposts weighted higher as a stronger signal), and builds a breakdown by hashtag.
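A rough sketch of how such a profile could be computed is shown here. The score formula comes from the description above. The input shape (`text`, `likes`, `reposts`, `replies`), the `count` field, and the function names are assumptions, and fetching from the Bluesky API is left out entirely.

```javascript
// Score formula from the article: likes + 2 x reposts + replies.
const score = (p) => p.likes + 2 * p.reposts + p.replies;

function median(values) {
  if (values.length === 0) return 0; // assumption: an empty set scores 0
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Input shape ({ text, likes, reposts, replies }) is assumed for this sketch.
function buildProfile(posts) {
  const byTag = new Map();
  for (const post of posts) {
    // A Set ensures a tag repeated within one post is counted once.
    const tags = new Set(
      [...post.text.matchAll(/#[A-Za-z][A-Za-z0-9_]+/g)].map((m) => m[0].toLowerCase())
    );
    for (const tag of tags) {
      if (!byTag.has(tag)) byTag.set(tag, []);
      byTag.get(tag).push(score(post));
    }
  }
  const by_hashtag = {};
  for (const [tag, scores] of byTag) {
    by_hashtag[tag] = { median: median(scores), count: scores.length };
  }
  return { overall: { median: median(posts.map(score)), count: posts.length }, by_hashtag };
}

const profile = buildProfile([
  { text: "Grid tip #css", likes: 4, reposts: 1, replies: 0 }, // score 6
  { text: "Flex tip #css #webdev", likes: 1, reposts: 0, replies: 1 }, // score 2
  { text: "Build log #webdev", likes: 0, reposts: 0, replies: 0 }, // score 0
]);
console.log(profile.overall.median, profile.by_hashtag["#css"].median);
```

Medians are used rather than means so that one unusually popular post does not dominate a hashtag's profile while the sample is still small.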

At post time, G3 looks at which hashtags appear in the pending entry and computes a predicted score relative to the baseline:

if (profile?.by_hashtag) {
  const tags = [...entry.text.matchAll(/#[A-Za-z][A-Za-z0-9_]+/g)]
    .map((m) => m[0].toLowerCase());
  const found = tags
    .map((t) => ({ t, m: profile.by_hashtag[t]?.median }))
    .filter((x) => x.m !== undefined);
  const baseline = profile.overall?.median ?? 0;
  if (found.length) {
    const predicted = Math.max(...found.map((x) => x.m));
    g3 = { predicted, baseline, tags: found.map((x) => x.t) };
  }
}

G3 is warn-only in v1. It logs the predicted vs. baseline score but doesn't reject. The reason: I don't have enough post history yet for the signal to be reliable. The shared Claude Haiku client I use for content generation has been running for about two months, and post volume is roughly one per day — that's around 60 data points. Median engagement is still low enough that G3's predictions are noisy.

When I flip G3 to hard-fail, I expect it to catch G1/G2 survivors that technically pass the text checks but target hashtags that don't perform for this account. A comment in the code marks where the threshold check goes:

// v1: warn-only. To be promoted to hard-fail once enough data has accumulated

The engagement stats script also breaks down by time window (30d, 60d, 90d), so the profile should become more useful as data accumulates over the next few months.

Gate 4: reserved for Codex

The fourth gate is unimplemented. The design intent is a --codex flag that pulls Codex for a quality pass on anything that cleared G1-G3 but still needs a final review. In the current setup, Codex runs at generation time through the article pipeline (the three-layer Codex protocol handles article quality) — but for Bluesky posts, the generation step doesn't include a Codex pass. G4 would close that gap.

I'm deferring G4 until G3 is stable, because adding latency and API cost to a cron that already runs three times a day doesn't make sense while the data foundations aren't solid yet.

What happens to rejected entries

Every failing entry gets appended to data/bluesky-qc-rejected.jsonl:

appendFileSync(
  REJECTED,
  JSON.stringify({
    ...entry,
    qc_reasons: reasons,
    qc_at: new Date().toISOString(),
  }) + "\n"
);

The rejected log serves two purposes: nothing is lost (entries can be salvaged by editing and re-adding to the queue), and it's the primary feedback mechanism for tuning generation prompts upstream. If G1 keeps hitting "content pipeline" in posts that were supposed to be tool roundups, the prompt that generates those posts is leaking pipeline jargon. That's a prompt fix, not a gate fix.
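For the weekly review, a small tally of reasons per gate makes upstream prompt leaks easier to spot. This sketch assumes every `qc_reasons` string starts with a gate label followed by a colon. The `tallyReasons` name is hypothetical, and it takes the JSONL text as a string, so reading data/bluesky-qc-rejected.jsonl from disk is left to the caller.

```javascript
// Counts rejection reasons by gate prefix ("G1", "G2", ...) from JSONL text.
// Assumes each reason string begins with the gate label and a colon.
function tallyReasons(jsonlText) {
  const counts = {};
  for (const line of jsonlText.split("\n")) {
    if (!line.trim()) continue; // skip blank lines, including the trailing one
    const entry = JSON.parse(line);
    for (const reason of entry.qc_reasons ?? []) {
      const gate = reason.split(":")[0];
      counts[gate] = (counts[gate] ?? 0) + 1;
    }
  }
  return counts;
}

// Two made-up log lines in the same shape the rejection log uses.
const sample =
  JSON.stringify({ text: "a", qc_reasons: ['G1: matched "curat"'] }) + "\n" +
  JSON.stringify({ text: "b", qc_reasons: ['G1: matched "cron"', "G2: 20 days old (TTL 14)"] }) + "\n";
const counts = tallyReasons(sample);
console.log(counts);
```

A count dominated by G1 points at the generation prompt, while a count dominated by G2 suggests the queue is deeper than the posting rate can drain.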

I review the rejected log about once a week, the same session where I run bluesky-engagement-stats.mjs to refresh the G3 profile. Together this takes about 20 minutes.

Differences from the post-deploy check pattern

The single CI pipeline this sits in already runs post-deploy checks after every Cloudflare Pages build. Those checks are about production correctness — did the right pages render, do the JSON-LD blocks validate, is the sitemap current. The Bluesky QC gate is about tone and context — does the text feel authentic for a social timeline.

The design principle is the same: gate as early as possible, make failures informative, never silently swallow errors. But the failure modes are different. A broken JSON-LD block is objectively wrong. A post that mentions "content pipeline" isn't wrong — it's contextually wrong for the audience.

What I'd do differently

Atomic state machine. The two-script design has a coordination gap: QC clears an entry and the post script runs, but if the network fails mid-post, the entry stays at the front of the queue and QC re-evaluates it on the next run. This is harmless but wasteful. A single script that locks the entry state before attempting the post, and marks it as qc_passed in the JSONL, would eliminate the re-evaluation churn.
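One way to picture that single-script design is a small transition table. Everything here is hypothetical: the current queue format may not have a `status` field, and the state and event names are invented for illustration.

```javascript
// Hypothetical status values and events for a single-script design.
const TRANSITIONS = {
  queued: { qc_pass: "qc_passed", qc_fail: "rejected" },
  qc_passed: { post_start: "posting" },
  // A failed post returns to qc_passed, so the next run retries without re-running QC.
  posting: { post_ok: "posted", post_error: "qc_passed" },
};

// Returns a new entry with the next status, or throws on an illegal move.
function transition(entry, event) {
  const next = TRANSITIONS[entry.status]?.[event];
  if (!next) {
    throw new Error(`Invalid transition: ${entry.status} + ${event}`);
  }
  return { ...entry, status: next };
}

let entry = { id: 7, text: "Sticky headers in CSS", status: "queued" };
entry = transition(entry, "qc_pass");
entry = transition(entry, "post_start");
entry = transition(entry, "post_error"); // network failed mid-post
console.log(entry.status);
```

After the simulated network failure the entry is back in qc_passed rather than queued, which is the re-evaluation saving described above. Persisting the status to the JSONL before calling the API is what would make the step atomic in practice.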

Classifier instead of regex. The REVEAL regex is 30+ terms and growing. The right long-term design is a small logistic regression trained on the rejected log, or a few-shot Claude classifier prompted with examples from it, once there are 100+ examples. I'll have enough data in about six months at current rejection rates. Until then, a regex is zero latency, zero cost, and deterministic, properties I'd be trading away for accuracy I don't yet need.

G3 as a hard gate sooner. The warn-only period for G3 feels too conservative in retrospect. I could have set a very permissive threshold (say, predicted score < 0.5 × baseline median) and still caught the clearly low-signal cases without needing extensive data. I'll revisit once I hit 90 days of post history.
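A permissive hard-fail version of G3 could be as small as the sketch below. The 0.5 factor is the example threshold from the paragraph above. The `gate3` and `FACTOR` names are assumptions, as is the choice to pass entries when there is no known hashtag or no positive baseline.

```javascript
// Hypothetical hard-fail version of G3; FACTOR and the function name are assumptions.
const FACTOR = 0.5;

function gate3(entry, profile) {
  const baseline = profile?.overall?.median ?? 0;
  const medians = [...entry.text.matchAll(/#[A-Za-z][A-Za-z0-9_]+/g)]
    .map((m) => profile?.by_hashtag?.[m[0].toLowerCase()]?.median)
    .filter((m) => m !== undefined);
  // No known hashtags or no baseline: there is no signal, so the entry passes.
  if (medians.length === 0 || baseline <= 0) return [];
  // Same prediction rule as the warn-only version: the best-performing tag wins.
  const predicted = Math.max(...medians);
  return predicted < FACTOR * baseline
    ? [`G3: predicted ${predicted} < ${FACTOR} x baseline ${baseline}`]
    : [];
}

const profile = {
  overall: { median: 4 },
  by_hashtag: { "#css": { median: 5 }, "#buildlog": { median: 1 } },
};
console.log(gate3({ text: "Weekly notes #buildlog" }, profile));
console.log(gate3({ text: "Grid tip #css #buildlog" }, profile));
```

The first entry is rejected because its only known tag has a median of 1 against a threshold of 2. The second passes because one strong tag is enough to lift the prediction.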

FAQ

Why not filter at generation time rather than at post time?

I do — generation prompts include explicit instructions to avoid revealing automation. But prompts are probabilistic. The gate is deterministic. Both layers running in sequence is cheaper than relying on either alone.

What if the whole queue fails QC?

The script exits 0 with a log message. Nothing posts. The queue isn't cleared, so the next run tries again. If the queue stays empty for several consecutive days, the queue refill cron (bluesky-refill-queue.mjs) adds fresh entries.

How do you decide what goes into the REVEAL list?

Anything I'd feel awkward saying to a human who asked "did you write this?" — tool names for automated workflows, quantitative signals from CI pipelines, phrases that have no natural place in a first-person social post.

Does the gate ever false-positive on legitimate content?

Yes. "generate" catches legitimate uses of the word. Posts about generators, code generation tools, or electrical generation would trip G1. I accept that loss. Posts about those topics are edge cases for this particular Bluesky presence, and if they do come up I can edit the queue entry to rephrase before re-adding.


Part of an ongoing 6-month experiment running three AI-curated directory sites. The technical claims here are real; this article was AI-assisted.
