I've been building in private for too long.
Two months of solo development, zero public commits, and a growing suspicion that "I'll share it when it's ready" is just fear wearing a productivity costume. So this is me breaking that habit. Starting now, I'm building PostAll — a Content Automation Tool — in public on Dev.to.
Every week I'll write one of these. Actual technical decisions, actual dead ends, actual code. Not a highlight reel.
Here's where we are after week one.
What PostAll Actually Is (The Technical Version)
PostAll is a SaaS platform that takes your content brief — a topic, a brand voice, a target audience, an output format — and produces structured, publish-ready content at scale.
"Publish-ready" is doing a lot of work in that sentence, so let me be specific: the goal isn't to dump raw GPT output into a text box and call it done. The goal is CMS-ready output: correct heading hierarchy, metadata populated, internal link placeholders flagged, tone-checked against your brand guide, and formatted for your specific platform.
That's the product. The gap between "AI writes text" and "AI writes text your CMS can ingest directly" is where PostAll lives — and it's a bigger engineering gap than most people expect.
The Stack I Landed On (And the One I Almost Chose Instead)
I went back and forth on this for longer than I want to admit.
What I'm running:
- Next.js 14 (App Router) — Frontend and API routes. I wanted one codebase to start. Will extract the API into a separate service if/when it becomes a bottleneck.
- PostgreSQL + Prisma — Relational data for users, workspaces, content jobs, and output history. I considered MongoDB for the flexibility on content schemas, but the relational constraints turned out to be a feature, not a bug — more on that below.
- BullMQ + Redis — Job queue for async content generation. This was the most deliberate decision of the week.
- OpenAI API (gpt-4o) — Primary generation model for now.
- Vercel — Hosting. Yes, I know. I'll move to a VPS when the pricing math stops working in Vercel's favor.
What I almost chose: Supabase for the whole backend. Realtime subscriptions for job status updates looked great in the docs. I ended up keeping Supabase for auth but pulling back on letting it be my entire backend, because I didn't want to couple my job queue architecture to their platform. Decoupling that decision early felt right.
The First Real Problem: You Can't Fake Content Job State
Here's the thing nobody talks about in content generation tutorials: a content job isn't a request. It's a state machine.
When a user submits a brief, that job moves through at least six states before it's done:
QUEUED → PROCESSING → GENERATING → VALIDATING → FORMATTING → COMPLETE
And it can fail at any of those stages, for different reasons, requiring different recovery strategies. A GENERATING failure is different from a FORMATTING failure: the first might mean retrying with a different prompt, the second might mean the output was generated fine but the post-processing broke.
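To make the "state machine, not a request" idea concrete, here is a minimal sketch of a transition table. The names (JobState, TRANSITIONS, canTransition) are hypothetical, and the FAILED state and its re-entry points are my assumptions for illustration rather than anything taken from the PostAll codebase.

```typescript
// Hypothetical names: JobState, TRANSITIONS and canTransition are illustrative only.
type JobState =
  | 'QUEUED' | 'PROCESSING' | 'GENERATING' | 'VALIDATING'
  | 'FORMATTING' | 'COMPLETE' | 'FAILED';

// Each state lists the states it is allowed to move to next.
// Every in-flight state can also fall through to FAILED.
const TRANSITIONS: Record<JobState, readonly JobState[]> = {
  QUEUED: ['PROCESSING', 'FAILED'],
  PROCESSING: ['GENERATING', 'FAILED'],
  GENERATING: ['VALIDATING', 'FAILED'],
  VALIDATING: ['FORMATTING', 'FAILED'],
  FORMATTING: ['COMPLETE', 'FAILED'],
  COMPLETE: [],
  // Assumption: a failed job re-enters at the stage that needs redoing.
  FAILED: ['GENERATING', 'FORMATTING'],
};

// True when moving from one state to another is a legal step.
function canTransition(from: JobState, to: JobState): boolean {
  return TRANSITIONS[from].includes(to);
}
```

A guard like this, called before every status write, would reject a jump such as QUEUED straight to COMPLETE instead of silently recording it.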
My first implementation had just two working states, pending and done, plus a catch-all failed. Here's what that schema looked like:
// Week 1, Day 1: too naive
model ContentJob {
  id        String   @id @default(cuid())
  status    String   // "pending" | "done" | "failed"
  output    String?
  createdAt DateTime @default(now())
  updatedAt DateTime @updatedAt
}
This worked until I tried to handle a partial failure — generation succeeded, formatting broke. I had no way to retry just the formatting step. The whole job was marked failed and the generated content was gone.
By day 3, the schema looked like this:
// Week 1, Day 3: actually models what's happening
model ContentJob {
  id              String        @id @default(cuid())
  status          ContentStatus // enum in Prisma schema
  stage           ContentStage  // which stage it's in
  generatedOutput String?       // raw LLM output, preserved separately
  formattedOutput String?       // post-processed output
  failureReason   String?
  retryCount      Int           @default(0)
  lastAttemptAt   DateTime?
  createdAt       DateTime      @default(now())
  updatedAt       DateTime      @updatedAt
}

enum ContentStatus {
  QUEUED
  PROCESSING
  COMPLETE
  FAILED
  RETRYING
}

enum ContentStage {
  GENERATION
  VALIDATION
  FORMATTING
}
Separating status from stage was the key insight. Status is "is this job healthy?" Stage is "where in the pipeline is it?" Now I can fail a job at the FORMATTING stage and retry only that stage without losing the generated content.
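Here is a rough sketch of how a retry could use that split to pick its starting point. FailedJob and resumeStage are hypothetical names, and the rule itself (resume at the failed stage only when the raw output was saved) is an assumption about how such a retry might work, not the actual PostAll logic.

```typescript
type ContentStage = 'GENERATION' | 'VALIDATION' | 'FORMATTING';

// Hypothetical shape: just the two fields the retry decision needs.
interface FailedJob {
  stage: ContentStage;            // stage recorded when the job failed
  generatedOutput: string | null; // raw LLM output, if it was saved
}

// Decide where a retry should start. With the raw output preserved, a
// VALIDATION or FORMATTING failure resumes at that stage; without it,
// the only option is to generate again.
function resumeStage(job: FailedJob): ContentStage {
  if (job.generatedOutput === null) return 'GENERATION';
  return job.stage;
}
```

With the original two-state schema there is nothing to branch on here, which is why every failure meant starting over.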
The lesson here applies beyond content generation: if your workflow has more than two meaningful states, model them explicitly from day one. Retrofitting a state machine onto a status: string column is painful.
The Queue: Why BullMQ and Not Just setTimeout
The honest answer is I started with setTimeout.
For local development with one user (me), it was fine. A content job comes in, I kick off an async function, it runs, it finishes. Simple.
Then I thought: what happens if the server restarts mid-generation? The job is gone. The user has no idea what happened. I have no visibility into what was running.
BullMQ with Redis gives me:
- Persistence — Jobs survive server restarts
- Visibility — I can see queue depth, processing time, and failure rates
- Retry logic — Exponential backoff built in
- Concurrency control — I can process X jobs at a time and not hammer the OpenAI API
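For a feel of what the retry option implies, here is a small sketch. The retryOptions object uses BullMQ's attempts and backoff job options, but the values are made up, and backoffSchedule is a hypothetical helper that only illustrates a doubling schedule. As far as I understand it, BullMQ's exponential strategy behaves this way, but treat the exact delays as an assumption and check them against the version you run.

```typescript
// Job options of the shape BullMQ accepts in queue.add(name, data, opts).
// The numbers are illustrative, not PostAll's real settings.
const retryOptions = {
  attempts: 4, // the first run plus up to 3 retries
  backoff: { type: 'exponential', delay: 2000 },
};

// Hypothetical helper: the wait before each retry if the delay doubles every time.
function backoffSchedule(baseDelayMs: number, retries: number): number[] {
  return Array.from({ length: retries }, (_, i) => baseDelayMs * 2 ** i);
}

// Waits before retry 1, 2 and 3 under the options above.
const waits = backoffSchedule(retryOptions.backoff.delay, retryOptions.attempts - 1);
// waits is [2000, 4000, 8000]
```

Three retries under these settings add up to 14 seconds of waiting, which is worth keeping in mind when a single generation already takes 10+ seconds.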
Here's the actual worker setup:
// src/lib/queue/contentWorker.ts
import { Worker, Job } from 'bullmq';
import { generateContent } from '../generation/generateContent';
import { formatOutput } from '../formatting/formatOutput';
// saveGeneratedOutput is assumed to live next to updateJobStatus
import { updateJobStatus, saveGeneratedOutput } from '../db/contentJobs';
import { redisConnection } from '../redis';

const CONCURRENCY = 3; // OpenAI gpt-4o rate limit buffer on my current tier

export const contentWorker = new Worker(
  'content-generation',
  async (job: Job) => {
    const { jobId, brief } = job.data;
    // Track the stage locally so the catch block knows where we failed
    let currentStage: 'GENERATION' | 'FORMATTING' = 'GENERATION';
    try {
      // Stage 1: Generation
      await updateJobStatus(jobId, 'PROCESSING', currentStage);
      const rawOutput = await generateContent(brief);
      await saveGeneratedOutput(jobId, rawOutput);

      // Stage 2: Formatting
      currentStage = 'FORMATTING';
      await updateJobStatus(jobId, 'PROCESSING', currentStage);
      const formatted = await formatOutput(rawOutput, brief.outputFormat);

      // Complete
      await updateJobStatus(jobId, 'COMPLETE', currentStage, { formattedOutput: formatted });
    } catch (error) {
      // Preserve which stage we were in when it failed
      await updateJobStatus(jobId, 'FAILED', currentStage, {
        failureReason: error instanceof Error ? error.message : 'Unknown error',
      });
      // Re-throw so BullMQ handles retry logic.
      // Note: a BullMQ retry re-runs this whole function, generation included.
      throw error;
    }
  },
  {
    connection: redisConnection,
    concurrency: CONCURRENCY,
    limiter: {
      max: 3, // Max 3 jobs processed
      duration: 1000, // per second, as an OpenAI rate limit guard
    },
  }
);
The concurrency: 3 and limiter config isn't arbitrary — it's specifically calibrated to my current OpenAI API tier. I'll bump this as the tier increases.
What I Underestimated: Prompt Variability
I thought the hard part of a content generation tool would be the infrastructure. I was wrong.
The hard part is prompt consistency.
When you run the same brief through the same prompt 10 times, you don't get the same output 10 times. You get 10 variations with different structure, different tone emphasis, different heading choices. For a single piece of content, that's fine — variety is a feature. For a business that needs 200 product descriptions that sound like they came from the same brand, it's a serious problem.
I've spent the back half of this week on what I'm calling "structured generation" — instead of prompting for the full article in one shot, I'm breaking generation into structured sub-calls:
- Generate an outline (JSON output, validated against a schema)
- Generate each section from the outline independently
- Assemble and run a final consistency pass
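The three steps above can be sketched roughly like this. Everything here is hypothetical: the Outline shape, the prompts, and the ModelCall type (any function that sends a prompt and resolves to text) are stand-ins, and running the section calls concurrently with Promise.all is an assumption rather than a measured choice.

```typescript
// Hypothetical shapes: Outline and ModelCall are illustrative, not PostAll's types.
interface Outline {
  title: string;
  sections: { heading: string; summary: string }[];
}

// Any function that sends a prompt to a model and resolves to its text reply.
type ModelCall = (prompt: string) => Promise<string>;

// Parse the outline JSON and reject anything that does not match the shape.
function parseOutline(raw: string): Outline {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== 'object' || data === null) throw new Error('Outline is not an object');
  const { title, sections } = data as { title?: unknown; sections?: unknown };
  if (typeof title !== 'string') throw new Error('Outline is missing a title');
  if (!Array.isArray(sections) || sections.length === 0) throw new Error('Outline has no sections');
  const parsed = sections.map((s: unknown, i: number) => {
    const { heading, summary } = (s ?? {}) as { heading?: unknown; summary?: unknown };
    if (typeof heading !== 'string' || typeof summary !== 'string') {
      throw new Error(`Section ${i} is malformed`);
    }
    return { heading, summary };
  });
  return { title, sections: parsed };
}

async function generateStructured(brief: string, callModel: ModelCall): Promise<string> {
  // Step 1: ask for the outline as JSON and validate it before spending more calls.
  const outline = parseOutline(await callModel(`Return a JSON outline for: ${brief}`));

  // Step 2: generate each section from its own outline entry.
  // Assumption: sections are independent enough to run concurrently.
  const bodies = await Promise.all(
    outline.sections.map((s) => callModel(`Write the section "${s.heading}": ${s.summary}`)),
  );

  // Step 3: assemble the draft, then run one consistency pass over the whole thing.
  const draft = outline.sections
    .map((s, i) => `## ${s.heading}\n\n${bodies[i]}`)
    .join('\n\n');
  return callModel(`Edit for consistent tone and terminology:\n\n# ${outline.title}\n\n${draft}`);
}
```

Validating the outline first means a malformed response costs one call instead of a full article's worth. Swapping Promise.all for a plain loop gives the sequential version if sections turn out to depend on each other.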
This is slower and costs more per job. It's also dramatically more consistent. I'll have benchmarks next week.
The Number That Surprised Me
I ran 47 test jobs through the queue this week. Average generation time: 11.3 seconds per job for a ~800-word article with gpt-4o.
That sounds fast until you think about it at scale: 500 articles add up to about 94 minutes of generation time if run one at a time, or roughly 31 minutes at my current concurrency of 3, assuming the workers stay fully busy. That's fine for batch jobs, not fine if a user expects to see 500 articles in their dashboard within 20 minutes of submitting.
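The arithmetic is easy to check in a few lines. This is an idealised estimate: it assumes every job takes the 11.3-second average and that workers never sit idle, so real batches will run somewhat slower. The function names are hypothetical.

```typescript
// Wall-clock minutes for a batch, assuming average job time and fully busy workers.
function batchMinutes(jobs: number, avgSeconds: number, concurrency: number): number {
  return (jobs * avgSeconds) / concurrency / 60;
}

// Smallest concurrency that fits the batch inside a target window.
function concurrencyFor(jobs: number, avgSeconds: number, targetMinutes: number): number {
  return Math.ceil((jobs * avgSeconds) / (targetMinutes * 60));
}

const oneAtATime = batchMinutes(500, 11.3, 1);  // ≈ 94.2 minutes
const threeWide = batchMinutes(500, 11.3, 3);   // ≈ 31.4 minutes
const needed = concurrencyFor(500, 11.3, 20);   // 5 workers for a 20-minute window
```

So 500 jobs is about 94 minutes of total work, and hitting a 20-minute window would take at least 5 jobs in flight at once, rate limits permitting.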
Next week's problem: figuring out where the 11 seconds actually go (API latency vs. my code), and whether structured generation can run section calls in parallel.
What's Coming Next Week
- Structured generation architecture (the sub-call approach) with actual benchmark numbers
- The formatting layer: how I'm turning raw LLM output into CMS-ready HTML/Markdown
- First beta user onboarding — someone from my network agreed to try it this week
The Open Question I Actually Don't Know the Answer To
I'm debating whether job status updates should push to the frontend via Server-Sent Events or if polling every 3 seconds is fine for the beta.
SSE is more elegant. Polling is simpler to debug. At 3-second intervals, polling is probably imperceptible to users. But it feels wrong.
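For reference, the polling option is about this much code. This is a sketch under assumptions: fetchStatus is a hypothetical injected function (in practice it would call whatever status endpoint exists), and treating FAILED as terminal ignores the case where a retry follows it.

```typescript
type JobStatus = 'QUEUED' | 'PROCESSING' | 'COMPLETE' | 'FAILED' | 'RETRYING';

// Assumption: COMPLETE and FAILED both end the polling loop.
function isTerminal(status: JobStatus): boolean {
  return status === 'COMPLETE' || status === 'FAILED';
}

// Poll until the job reaches a terminal status or the attempt budget runs out.
async function pollJob(
  fetchStatus: () => Promise<JobStatus>, // hypothetical: wraps the status request
  intervalMs = 3000,
  maxAttempts = 20,
): Promise<JobStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus();
    if (isTerminal(status)) return status;
    // Wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Job did not finish within the polling window');
}
```

At 3-second intervals a 10 to 15 second job costs around four or five requests, which is the main thing SSE would save.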
What would you choose for a content generation job that takes 10-15 seconds? Drop it in the comments — I'll use whatever the consensus is and report back next week.
PostAll is currently in public beta. If you're a developer building content workflows and want early access, reply here or DM me.
Next devlog drops soon.