maymay5692

I Had AI Write an Article. Then My AI Quality Gate Rejected It 5 Times.

I use AI to write technical articles. I'm not going to pretend otherwise — it's 2026, most of us do to some degree. The issue isn't using AI. The issue is that AI-generated text has a smell, and readers can tell.

So I built a quality gate. Three AI personalities that independently review every article before it goes live. Two out of three have to approve, or the article goes back for revisions.

My first article went through five rounds of rejection before it finally passed. Here's every reason it got bounced.

The System: Three Judges, One Article

The quality gate is called MAGI — named after the supercomputer system in Neon Genesis Evangelion. Three distinct personas, each with a different lens:

MELCHIOR (The Scientist) — Only cares about data and accuracy. "Is the code correct? Do the numbers check out? I don't care if it's boring, I care if it's wrong."

BALTHAZAR (The Mother) — Focused on safety and risk. "Will this get us in trouble? Will readers trust us less after reading this? What's the worst case?"

CASPER (The Woman) — Gut instinct and brutal honesty. "This is boring." "Nobody talks like this." "I wouldn't finish reading this." She's the one you're afraid of.

Each one reads the article independently. No peeking at the others' opinions. Then they vote: approve, reject, or conditional. Two-thirds majority to pass.

(Spoiler: CASPER is the hardest to please. Her sensitivity to AI-generated text is borderline paranoid. She's also almost always right.)
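For the curious: the gate logic itself is tiny. Here's a minimal sketch in Python, with stub judges standing in for the real persona prompts. The function names and vote handling are my simplification for illustration, not the production code:

```python
from collections import Counter

def magi_vote(article: str, judges: dict) -> bool:
    """Each judge reads the article independently and returns
    "approve", "conditional", or "reject". Two approvals pass."""
    votes = {name: judge(article) for name, judge in judges.items()}
    tally = Counter(votes.values())
    return tally["approve"] >= 2  # two-thirds majority to pass

# Stub judges standing in for the real persona system prompts.
judges = {
    "MELCHIOR": lambda text: "approve",       # accuracy lens
    "BALTHAZAR": lambda text: "conditional",  # risk lens
    "CASPER": lambda text: "reject",          # style/gut lens
}
print(magi_vote("draft", judges))  # 1 approve, 1 conditional, 1 reject → False
```

In the real system each judge is a separate Claude session that never sees the other verdicts; the aggregation step is the only place the votes meet.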

Rejection #1: "Stop Using Part Headings"

My first draft was neatly organized into Part 1, Part 2, Part 3, Part 4.

CASPER rejected it immediately.

"Is this a textbook? Humans don't write blog posts in numbered parts. This screams 'I told an AI to structure this for me.'"

MELCHIOR was fine with it — the technical content was accurate. BALTHAZAR flagged it as a risk: "If readers sense it's AI-generated, they'll bounce."

Vote: 1 approve, 1 conditional, 1 reject. Back to revisions.

I flattened the headings. No more Part 1/2/3/4. Just descriptive section titles like a normal blog post.

Rejection #2: "Don't Open with a Proverb"

Fixed the structure. But I started the revised version with one of those aphoristic sentences AI loves:

"A trader with discipline will outperform a genius without it."

CASPER caught it instantly.

"Name one personal blog post that opens with a fortune cookie quote. I'll wait."

She also flagged something I hadn't noticed: "AI loves putting things in groups of three. If you have a bullet list with exactly three items that are all roughly the same length — that's a tell."

I went back and looked. She was right. Every list was three items. Every item was one line. Perfectly symmetrical. Perfectly artificial.

Fix: killed the proverb, made my lists uneven — some with two items, some with five, some with a long and a short one mixed together.
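The three-item tell is easy enough to check mechanically. A rough heuristic sketch, assuming markdown-style bullets; the tolerance value is arbitrary:

```python
import re

def flag_uniform_lists(markdown: str, tolerance: int = 10) -> list:
    """Flag bullet lists with exactly three items of near-equal
    length — the pattern CASPER called out. Heuristic only."""
    flagged, current = [], []
    for line in markdown.splitlines() + [""]:  # trailing "" flushes the last list
        if re.match(r"\s*[-*•]\s+", line):
            current.append(line.strip())
        else:
            if len(current) == 3:
                lengths = [len(item) for item in current]
                if max(lengths) - min(lengths) <= tolerance:
                    flagged.append(current)
            current = []
    return flagged

doc = """
- fast and simple setup
- clean and simple code
- neat and simple docs
"""
print(len(flag_uniform_lists(doc)))  # → 1
```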

Rejection #3: "Your Paragraphs Are Too Uniform"

Third submission. CASPER rejected again.

"Every single paragraph is 2-4 sentences. Same length. Same rhythm. Humans are messier than this."

MELCHIOR approved — accuracy was fine. BALTHAZAR leaned conditional. Counting the conditional as a soft yes, it was technically 2-1 in favor. But CASPER's point was too good to ignore. She wasn't wrong. Real writing has one-sentence paragraphs. It has paragraphs that ramble on for six sentences because the writer got excited about something. It has rhythm changes.

AI doesn't do that. AI writes in perfectly measured blocks.

Fix: I deliberately mixed in single-sentence paragraphs. Added a digression that went off-topic for a few lines. Let some sections be short and some long.

Like this paragraph. Just one line.
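This kind of uniformity is measurable, too. A quick sketch that counts terminal punctuation as a crude sentence tally, which is rough but enough to expose a flat rhythm:

```python
import statistics

def paragraph_rhythm(text: str) -> dict:
    """Measure paragraph lengths in sentences. Near-zero variance
    across many paragraphs is the uniformity CASPER flagged."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    counts = [p.count(".") + p.count("!") + p.count("?") for p in paragraphs]
    return {
        "lengths": counts,
        "stdev": statistics.pstdev(counts) if counts else 0.0,
    }

uniform = "One. Two. Three.\n\nFour. Five. Six.\n\nSeven. Eight. Nine."
print(paragraph_rhythm(uniform)["stdev"])  # → 0.0
```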

Rejection #4: "Where's the Failure?"

Now CASPER was mostly satisfied. But BALTHAZAR spoke up.

"This reads like a success story. Everything works, everything is great. Where's the part where something went wrong? Nobody trusts a story with no failures."

The article was about building a crypto trading bot. So I added the part where every single trade closed in the red. $33 invested, 100% loss rate. The bot was working correctly — it was executing the strategy as designed. The strategy just happened to be wrong for that market phase.

CASPER had one more note: "The sentence endings are all identical. Every sentence ends with a period and a declarative statement. Mix it up."

For the Japanese version, this meant adding casual sentence endings — the equivalent of "you know?" and "honestly" and trailing thoughts. For English, it meant varying between short punchy statements and longer explanatory ones. Throwing in a parenthetical now and then. (Like this.)

Rejection #5: "It Still Reads Like a Translation"

This one was specific to the Japanese article, but the principle applies universally.

BALTHAZAR flagged phrases that were structurally correct but unnatural. Things like "it is possible to perform" instead of just "you can do." Formal constructions that no one uses in blog posts.

Then MELCHIOR — the data-focused one who usually doesn't care about style — backed it up: "I compared the sentence-ending distribution against the top 50 posts on this platform. The statistical profile doesn't match. The article reads like it was written in English and translated."

That was the kill shot. It wasn't just a gut feeling anymore — there was data.

The final fix: I fed the AI several popular posts by human authors on the target platform, had it generate a style analysis report (word frequency, sentence length distribution, casualness score), and then rewrote the article to match that profile.
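A stripped-down version of that style report might look like this. The field names and the formal-marker word list here are assumptions for illustration, not MAGI's actual output:

```python
import re
from collections import Counter

def style_profile(text: str) -> dict:
    """Build a small style report: sentence-length distribution,
    top word frequencies, and a crude formality signal."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    words = re.findall(r"[A-Za-z']+", text.lower())
    formal_markers = {"furthermore", "additionally", "moreover", "thus"}
    return {
        "sentence_lengths": [len(s.split()) for s in sentences],
        "top_words": Counter(words).most_common(5),
        "formal_marker_rate": sum(w in formal_markers for w in words) / max(len(words), 1),
    }

draft1 = "This works well. Furthermore, it is fast."
print(style_profile(draft1)["formal_marker_rate"] > 0)  # → True
```

Run the same profile over your draft and over a pile of genuinely popular human posts, and the gap between the two distributions is the rewrite target.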

That's when all three approved.

Before and After

Here's what the same paragraph looked like at each stage:

Draft 1 (rejected):

Automated trading systems have the ability to execute transactions at predetermined intervals. This capability enables consistent performance regardless of market conditions. Furthermore, the elimination of emotional bias represents a significant advantage over manual trading approaches.

Draft 5 (approved):

The bot checks every hour, decides whether to buy or sell, and goes back to sleep. It doesn't panic at 3 AM. It doesn't hold too long because "maybe it'll go higher." That's the whole point.

Same information. Completely different feel.

The changes that got it through:

  • Part 1/2/3/4 → flat headings
  • proverb opening → personal anecdote
  • uniform paragraph length → deliberately uneven
  • pure success story → failures included
  • "Furthermore" and "Additionally" → gone
  • neat three-item lists → messy, real

Using AI to Catch AI (The Irony)

Yes, it's ironic. AI writes the article, AI rejects it, AI suggests the fix, AI rewrites it. The snake eating its tail.

But here's why it actually works: AI knows its own patterns better than anyone. It knows it defaults to three-item lists. It knows it loves "Furthermore." It knows it produces uniform paragraph lengths. So it's uniquely qualified to spot those patterns in text.

The three-persona approach covers blind spots. MELCHIOR alone would approve anything that's technically correct. CASPER alone would reject everything for being "not fun enough." BALTHAZAR alone would be too cautious. Together they approximate something close to a real editorial review.

The hardest part of making AI write like a human? It's not the vocabulary or the grammar. It's the imperfections. Humans are inconsistent, messy, occasionally brilliant and occasionally lazy in the same paragraph.

Teaching AI to be imperfect on purpose — that's the actual challenge.

This post describes a real workflow I use. The MAGI system runs on Claude and tmux. The personality descriptions are built into the system prompts. The rejection examples are from actual review cycles, though simplified for readability.
