MORINAGA

How I schedule three daily Bluesky posts from a JSONL queue without an external service

The Bluesky image upload race I fixed a few weeks ago was the last painful incident in an otherwise simple posting pipeline. Here's how the queue system works — the design is different from every social-scheduling SaaS I looked at, and that difference matters on GitHub Actions.

The queue: a flat JSONL file

The entire post schedule lives in content/bluesky-queue.jsonl. Each line is a self-contained JSON object:

{"text": "New article: What I learned about JSON-LD audits in CI. #webdev #tutorial https://aiappdex.com/articles/jsonld-audit-post-deploy-ci"}
{"text": "TIL: Turso vs Cloudflare D1 for Astro monorepos — the practical difference. #opensource #astro https://ossfind.com/articles/turso-libsql"}
{"posted_at": "2026-05-20T09:02:15Z", "post_uri": "at://did:plc:abc123/app.bsky.feed.post/3xyz", "text": "..."}

Unposted entries have only text. After a post succeeds, the script rewrites that line in-place with posted_at and post_uri added. The queue drains from top to bottom; the script picks the first line without a posted_at field and exits after posting one entry.
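The pick-and-mark step described above can be sketched as two pure functions over the file contents. This is a minimal illustration, not the actual script: `pickNext` and `markPosted` are hypothetical names, and the post URI and timestamp in the usage lines are placeholders.

```javascript
// Sketch only: pickNext and markPosted are hypothetical helpers.

// Return the index and parsed entry of the first line without posted_at, or null.
function pickNext(jsonl) {
  const lines = jsonl.split("\n");
  for (let i = 0; i < lines.length; i++) {
    if (lines[i].trim() === "") continue; // tolerate blank lines and a trailing newline
    const entry = JSON.parse(lines[i]);
    if (!("posted_at" in entry)) return { index: i, entry };
  }
  return null;
}

// Rewrite one line with posted_at and post_uri added; every other line is left as-is.
function markPosted(jsonl, index, postUri, postedAt) {
  const lines = jsonl.split("\n");
  const entry = JSON.parse(lines[index]);
  lines[index] = JSON.stringify({ posted_at: postedAt, post_uri: postUri, ...entry });
  return lines.join("\n");
}

const queue = [
  '{"posted_at": "2026-05-20T09:02:15Z", "post_uri": "at://did:plc:abc123/app.bsky.feed.post/3xyz", "text": "..."}',
  '{"text": "first unposted"}',
  '{"text": "second unposted"}',
].join("\n");

const next = pickNext(queue);
// Placeholder URI and timestamp, standing in for the values returned after a real post.
const updated = markPosted(queue, next.index, "at://did:plc:abc123/app.bsky.feed.post/3abc", "2026-05-21T00:02:00Z");
```

Keeping both helpers free of file I/O means the real script only has to read the file, call them, and write the result back.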

This format is a deliberate trade-off. It's not a real database. You can't query it, you can't easily filter by tag, and editing it by hand means being careful about JSON syntax on every line. What it gives you: a single file that's diff-friendly in git history, trivially readable, and appended to by any CI job that generates content — the article-publish workflow appends a Bluesky promotion line to the queue after each successful publish.
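Appending from another CI job is the one write path where hand-built JSON is easy to get wrong. A small sketch, assuming a hypothetical `appendEntry` helper that the publish workflow could call with the current file contents:

```javascript
// Hypothetical helper: returns the queue contents with one unposted entry appended.
function appendEntry(jsonl, text) {
  // JSON.stringify escapes quotes and newlines, so the entry always stays on one line.
  const line = JSON.stringify({ text });
  const needsNewline = jsonl.length > 0 && !jsonl.endsWith("\n");
  return jsonl + (needsNewline ? "\n" : "") + line + "\n";
}

const before = '{"text": "existing entry"}';
const after = appendEntry(before, 'New article: "quoted title"\nsecond line');
```

Serializing with `JSON.stringify` instead of string concatenation avoids the per-line syntax mistakes that manual edits are prone to.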

The post script: richtext facets for hashtags and URLs

Bluesky's API expects richtext facets — byte-range annotations that tell the client which parts of the text are links or hashtags. These aren't inferred; you have to compute them and include them in the post record. The post script builds them from the text string using regex:

function buildFacets(text) {
  const facets = [];
  const enc = new TextEncoder();

  // Hashtags: "#" plus a letter, at the start of the text or after whitespace/punctuation.
  for (const m of text.matchAll(/(?:^|[\s,.;:!?])(#[a-zA-Z][a-zA-Z0-9_]*)/g)) {
    const tagWithHash = m[1];
    // m[0] may include the leading delimiter, so step past it to the "#".
    const offset = (m.index ?? 0) + m[0].length - tagWithHash.length;
    // Facet ranges are UTF-8 byte offsets, so measure the encoded prefix.
    const byteStart = enc.encode(text.slice(0, offset)).length;
    const byteEnd = byteStart + enc.encode(tagWithHash).length;
    facets.push({
      index: { byteStart, byteEnd },
      // The tag feature stores the tag without the leading "#".
      features: [{ $type: "app.bsky.richtext.facet#tag", tag: tagWithHash.slice(1) }],
    });
  }

  // Links: everything from http(s):// up to the next whitespace or ")".
  for (const m of text.matchAll(/https?:\/\/[^\s)]+/g)) {
    const byteStart = enc.encode(text.slice(0, m.index ?? 0)).length;
    const byteEnd = byteStart + enc.encode(m[0]).length;
    facets.push({
      index: { byteStart, byteEnd },
      features: [{ $type: "app.bsky.richtext.facet#link", uri: m[0] }],
    });
  }

  return facets;
}

The byte offset calculation is the non-obvious part. Bluesky byte ranges are UTF-8 byte positions, not JavaScript character positions. A string with emoji before a hashtag would have different byte and character offsets. Using TextEncoder to measure text.slice(0, offset) gives the correct UTF-8 byte position regardless of what precedes the match.
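A short example makes the gap between the two offsets concrete. The sample text is made up for illustration; the rocket emoji takes two UTF-16 code units in a JavaScript string but four bytes in UTF-8.

```javascript
const enc = new TextEncoder();
const text = "🚀 Shipped #webdev";

// JavaScript string index of the hashtag, counted in UTF-16 code units.
const charOffset = text.indexOf("#webdev");

// UTF-8 byte range of the same hashtag, which is what a facet index needs.
const byteStart = enc.encode(text.slice(0, charOffset)).length;
const byteEnd = byteStart + enc.encode("#webdev").length;

// Round-trip check: slicing the encoded bytes by the facet range recovers the tag.
const decoded = new TextDecoder().decode(enc.encode(text).slice(byteStart, byteEnd));
```

Here `charOffset` is 11 while `byteStart` is 13 and `byteEnd` is 20. Using the character offset directly would shift the facet two bytes to the left and annotate the wrong span.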

Off-minute cron scheduling

The workflow fires three times daily:

schedule:
  - cron: "37 23 * * *"   # 23:37 UTC (08:37 JST) → ~09:00 JST
  - cron: "37 7 * * *"    # 07:37 UTC (16:37 JST) → ~17:00 JST
  - cron: "37 13 * * *"   # 13:37 UTC (22:37 JST) → ~23:00 JST

The :37 offset is intentional. GitHub Actions schedules at top-of-hour slots — 0 * * * *, 0 0 * * * — are heavily contended globally. I measured 3–4 hour actual delays on a 0 0 * * * slot before moving to :37. The off-minute timing doesn't eliminate delay but reduces it significantly; real-world delivery now lands within 15–20 minutes of the intended JST time.
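GitHub Actions evaluates cron expressions in UTC, so each entry has to be shifted by nine hours to read it as JST. A small sketch of that arithmetic, using a hypothetical `utcCronToJst` helper:

```javascript
// JST is UTC+9 and has no daylight saving time, so the shift is constant.
function utcCronToJst(minute, hourUtc) {
  const hourJst = (hourUtc + 9) % 24;
  const pad = (n) => String(n).padStart(2, "0");
  return `${pad(hourJst)}:${pad(minute)} JST`;
}

// The three cron entries from the workflow: minute 37 at hours 23, 7, and 13 UTC.
const slots = [23, 7, 13].map((hourUtc) => utcCronToJst(37, hourUtc));
```

The three slots come out as 08:37, 16:37, and 22:37 JST, each a little ahead of the intended posting hour so that queueing delay and the random sleep land the post close to it.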

Inside the job, there's a random additional delay before posting:

- name: Random start delay (0-5 min) to avoid bot-pattern timing
  run: |
    DELAY=$(( RANDOM % 300 ))
    echo "Sleeping ${DELAY}s before posting"
    sleep $DELAY

This spreads the actual post time across a 5-minute window. My assumption is that Bluesky's feed ranking may de-emphasize accounts that post at machine-exact times, and a random delay makes the pattern look more organic. I don't have data to prove it works, but the cost is at most five minutes of idle runner time.

Self-trigger prevention

After posting, the script rewrites content/bluesky-queue.jsonl and commits the change back to the repo. Without a guard, that commit would trigger the workflow again immediately — draining the queue faster than intended.

The guard is a commit message convention:

- name: Commit queue update
  run: |
    git add content/bluesky-queue.jsonl
    git commit -m "chore(bluesky): mark queued post as posted [skip bluesky-queue]"
    git push

The job then checks for that token before running:

jobs:
  post:
    if: "!contains(github.event.head_commit.message, '[skip bluesky-queue]')"

The job is skipped when the triggering commit message contains [skip bluesky-queue]. On a scheduled run the event payload has no head_commit, so the condition passes and the job runs as usual. It's the same skip-token pattern I use across every self-committing workflow in the shared CI pipeline (article ETL, OG image regeneration, the sitemap rebuild), each with a distinct [skip <name>] token so workflows don't accidentally skip each other.
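The same check can be mirrored in plain JavaScript to see why distinct tokens matter. This is only an illustration of the logic, not how GitHub evaluates the expression: `shouldSkip` is a hypothetical helper and `[skip og-images]` is a made-up token for a second workflow. The comparison is lowercased because the `contains` expression function is not case sensitive.

```javascript
// Hypothetical mirror of: contains(github.event.head_commit.message, token)
function shouldSkip(commitMessage, token) {
  // A missing message (for example on a scheduled run) is treated as an empty string.
  return (commitMessage ?? "").toLowerCase().includes(token.toLowerCase());
}

const message = "chore(bluesky): mark queued post as posted [skip bluesky-queue]";
const skipsBluesky = shouldSkip(message, "[skip bluesky-queue]");
const skipsOther = shouldSkip(message, "[skip og-images]"); // made-up token for another workflow
const skipsScheduled = shouldSkip(undefined, "[skip bluesky-queue]");
```

Only the Bluesky workflow sees its own token, so a queue commit leaves the other self-committing workflows free to run.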

What I'd do differently

The JSONL format breaks down if you want to schedule posts for a specific future date rather than "next in queue." A scheduled_after ISO timestamp field would fix this without changing the format much; the picker logic shifts from "first line without posted_at" to "first line where scheduled_after <= now and posted_at is absent."
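The revised picker could look like the sketch below. It assumes the proposed `scheduled_after` field holds an ISO 8601 timestamp and that entries without it stay eligible immediately; `pickDue` is a hypothetical name.

```javascript
// Return the first unposted entry whose scheduled_after has passed, or null.
function pickDue(jsonl, now) {
  const lines = jsonl.split("\n");
  for (let i = 0; i < lines.length; i++) {
    if (lines[i].trim() === "") continue;
    const entry = JSON.parse(lines[i]);
    if ("posted_at" in entry) continue;
    // Assumption: entries without scheduled_after are due right away.
    const due = entry.scheduled_after === undefined || new Date(entry.scheduled_after) <= now;
    if (due) return { index: i, entry };
  }
  return null;
}

const scheduledQueue = [
  '{"posted_at": "2026-05-20T09:02:15Z", "text": "already posted"}',
  '{"text": "launch announcement", "scheduled_after": "2026-07-01T00:00:00Z"}',
  '{"text": "evergreen post"}',
].join("\n");

const beforeLaunch = pickDue(scheduledQueue, new Date("2026-06-01T00:00:00Z"));
const afterLaunch = pickDue(scheduledQueue, new Date("2026-07-02T00:00:00Z"));
```

Before the launch date the picker skips the scheduled entry and returns the evergreen post; afterwards the scheduled entry wins because it comes first in the file. An unparseable timestamp compares as not due, so a malformed entry is skipped rather than posted early.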

The other gap: no retry backoff on failed posts. If the Bluesky API returns an error, the script exits and the next scheduled run retries the same entry — correct behavior, but without backoff a transient 500 hits the same entry three times across the day's posting slots. So far that hasn't caused duplicate posts, but it's a latent issue worth fixing before the account grows.
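One way to add backoff inside a single run is a small retry wrapper with exponentially growing delays. This is a sketch under assumptions, not the current script: `backoffDelays` and `withBackoff` are hypothetical names, and the attempt count and base delay are arbitrary defaults.

```javascript
// Delays between attempts: baseMs, 2*baseMs, 4*baseMs, ... (one fewer than the attempt count).
function backoffDelays(attempts, baseMs) {
  return Array.from({ length: attempts - 1 }, (_, i) => baseMs * 2 ** i);
}

// Call fn until it succeeds or the attempts run out, sleeping between failures.
async function withBackoff(fn, options = {}) {
  const {
    attempts = 3,
    baseMs = 1000,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;
  const delays = backoffDelays(attempts, baseMs);
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      if (i === attempts - 1) throw err; // out of attempts: surface the last error
      await sleep(delays[i]);
    }
  }
}

const defaultDelays = backoffDelays(3, 1000);
```

With the defaults the wrapper waits 1 s and then 2 s before giving up. It is safest to wrap only calls that are known to have failed before creating the post, since retrying a request that actually succeeded server-side is how duplicates would appear.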

Part of an ongoing 6-month experiment running three AI-curated directory sites. The technical claims here are real; this article was AI-assisted.
