5 things I noticed this week: CI cost, Bluesky QC, and CC0 licensing

#showdev #webdev #githubactions #indiehackers

Five things I noticed or shipped this week while running the AI directory sites and a YouTube automation pipeline. No big announcements — mostly friction I hit and adjustments I made.

1. GitHub Actions was eating my free quota silently

I had a Bluesky posting cron running three times a day. Each run triggered a four-way matrix build (three sites × multiple jobs). When I looked at the Actions minutes consumed this week, the math was embarrassing: 3 cron runs a day × 5-6 minutes each × 7 days = 105 to 126 minutes, so roughly 120 minutes/week just to post a tweet-sized status update.
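The arithmetic is small enough to check in a few lines. This is only a sketch of the estimate above; the function name is made up for illustration, and it ignores how GitHub rounds billed minutes per job.

```python
RUNS_PER_DAY = 3   # cron fired three times a day
DAYS_PER_WEEK = 7


def weekly_minutes(minutes_per_run: int) -> int:
    """Estimated Actions minutes per week for the posting cron."""
    return RUNS_PER_DAY * minutes_per_run * DAYS_PER_WEEK


# Each run took 5-6 minutes, so the weekly total is a range.
print(f"{weekly_minutes(5)} to {weekly_minutes(6)} minutes per week")
# prints "105 to 126 minutes per week"
```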

Two fixes. First, I changed the Bluesky cron from 0 */8 * * * to a single daily trigger — the queue buffers three posts regardless of how many times the cron fires, and posts don't go out faster than the queue feeds them. Second, I added a path filter so content-only commits (articles, copy edits) skip the four-way matrix build entirely. A new article doesn't need a full CI rebuild of all three Astro sites.
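In GitHub Actions this kind of skip is normally written as a paths or paths-ignore filter on the workflow trigger. The decision it encodes can be sketched in Python. The patterns below are hypothetical, not the repo's real layout, and treating an empty change list as "build" is my own conservative assumption.

```python
from fnmatch import fnmatch

# Hypothetical content-only patterns; adjust to the real repo layout.
CONTENT_PATTERNS = ["articles/*", "content/*", "*.md"]


def is_content_only(changed_files: list[str]) -> bool:
    """True when every changed file matches a content-only pattern."""
    if not changed_files:
        return False  # nothing known about the change: do not skip
    return all(
        any(fnmatch(path, pattern) for pattern in CONTENT_PATTERNS)
        for path in changed_files
    )


def needs_matrix_build(changed_files: list[str]) -> bool:
    """The expensive matrix build runs unless the commit is content-only."""
    return not is_content_only(changed_files)
```

A commit touching only articles/new-post.md would skip the build; adding a single source file to the same commit brings it back.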

Actions quota is not infinite. Even on a paid plan, burning minutes on no-ops is a bad habit to get into before the repo scales.

2. Bluesky posts need a quality gate before they leave the queue

I added a QC gate to the Bluesky post pipeline this week — a step that reads each queued post, checks it against a short ruleset (no broken links, no expired announcements, no posts that reveal the automation stack in a tone that sounds like spam), and drops anything that fails before the cron fires.

The immediate trigger: I audited the outbox and found 17 posts that read like a bot talking to itself. Phrases like "🔁 queued" and "auto-generated" showed up in contexts where I had not disclosed the automation. Not illegal, but not the tone I want on a personal account.

The gate runs as a step before the actual Bluesky post command. If it rejects a post, the item stays in the queue with a flagged status so I can review it manually. Net result: fewer posts per day (down from three to one or two), but ones I would not be embarrassed to have written manually.
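A minimal sketch of such a gate, covering only the bot-phrase rule. The QueuedPost shape, the status strings, and the phrase list are assumptions for illustration; the link and expiry checks from the ruleset are left out.

```python
from dataclasses import dataclass, field

# Hypothetical phrase list, seeded with the two examples from the audit.
BOT_PHRASES = ("🔁 queued", "auto-generated")


@dataclass
class QueuedPost:
    text: str
    status: str = "queued"          # assumed states: "queued" or "flagged"
    reasons: list[str] = field(default_factory=list)


def qc_gate(post: QueuedPost) -> bool:
    """Return True if the post may go out; otherwise flag it in place."""
    lowered = post.text.lower()
    reasons = [f"bot phrase: {p}" for p in BOT_PHRASES if p in lowered]
    if not post.text.strip():
        reasons.append("empty post")
    if reasons:
        # Rejected posts stay in the queue, marked for manual review.
        post.status = "flagged"
        post.reasons = reasons
        return False
    return True
```

The posting step would then send only the items where qc_gate returns True and leave the flagged ones untouched.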

3. Stopping model routing made the pipeline simpler

I wrote a YT script this week about removing model routing — the pattern where you send different content types to different AI models based on some classifier. I had been routing "short factual" queries to a faster/cheaper model and "synthesis" queries to a more capable one.

What I found after removing it: latency stayed basically the same, cost went up about 8%, and the code got significantly simpler. The routing classifier itself had edge cases. When the classifier misfired on a synthesis query and sent it to the cheaper model, the output was noticeably worse. The 8% cost increase to send everything to the capable model is cheaper than debugging routing bugs.

This is not a universal takeaway — at scale, routing probably pays off. At indie scale with a handful of daily API calls, the complexity cost is real and the savings are marginal.

4. Openverse CC0 filtering is not default — you have to opt in

I added image slides to the YouTube slide renderer this week using Openverse. The API returns results across multiple Creative Commons license types by default. For a monetized YouTube channel, using CC-BY images without visible on-screen attribution is a real licensing problem.

The filter I needed is license=cc0,pdm (CC0 plus Public Domain Mark), which is not the default. Without it, you get CC-BY, CC-BY-SA, and CC-BY-NC results mixed in, and nothing in the result order separates them from the ones that need no credit. The API does return a license field per result, but if you're batch-processing slides and forget to filter upstream, you will miss one eventually.
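A sketch of filtering twice: once in the query, and once on the per-result license field as a safety net. The endpoint URL and the lowercase license codes are my understanding of the Openverse API and should be checked against its current docs; the function names are made up.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the current Openverse API docs.
OPENVERSE_IMAGES = "https://api.openverse.org/v1/images/"
NO_ATTRIBUTION = ("cc0", "pdm")


def build_search_url(query: str) -> str:
    """Build an image search URL restricted to CC0 and Public Domain Mark."""
    params = {"q": query, "license": ",".join(NO_ATTRIBUTION)}
    # safe="," keeps the comma readable instead of encoding it as %2C.
    return f"{OPENVERSE_IMAGES}?{urlencode(params, safe=',')}"


def keep_no_attribution(results: list[dict]) -> list[dict]:
    """Second line of defence: drop anything not marked cc0 or pdm."""
    return [
        r for r in results
        if str(r.get("license", "")).lower() in NO_ATTRIBUTION
    ]
```

Running the second filter right before the slide renderer means a forgotten query parameter upstream drops images rather than letting a CC-BY one through.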

A second issue: Openverse sometimes returns results pointing to images that have since been removed from the source host. The API returns 200 with metadata, but the actual image URL 404s. I added a requests.head() check before the slide renderer tries to download anything, and skip results that return non-200.

5. Self-hosted observability tools have a comfort vs. capability gap

I compared Netdata, SigNoz, and OpenObserve this week as options for monitoring the three sites. All three install in under 10 minutes. The divergence shows up in what you're comfortable touching at 2am when something breaks.

Netdata is the most comfortable out of the box — it auto-discovers processes and starts charting immediately. SigNoz requires you to send OpenTelemetry traces explicitly, which means instrumenting your code first. OpenObserve is log-focused and works well if you're piping structured JSON logs, but its dashboard interface has a steeper learning curve than the other two.

For my current situation (Vercel + Cloudflare Pages, no VPS to instrument), all three are somewhat over-engineered. I ended up with a single Datadog free-tier integration for error alerting and left the self-hosted tools as a future option if the infrastructure changes.

Part of an ongoing 6-month experiment running three AI-curated directory sites. The technical claims here are real; this article was AI-assisted.