Daniel Nwaneri
Why I Built My Own Humanizer (And Why You Should Too)

There's a tool called humanizer — a Claude Code skill built by blader, inspired by Wikipedia's guide to detecting AI writing. It's good. 6,600 stars, hundreds of forks, an active community adding patterns and language support. If you want to strip AI tells from any text, it does that well.

I used it. Then I built something different.

The problem isn't that humanizer is wrong. It's that it's solving a slightly different problem than the one I actually have.

Humanizer checks your writing against a generic human baseline. It knows what AI writing looks like and flags the patterns — significance inflation, copula avoidance, the rule of three, em dash overuse. Twenty-four patterns derived from Wikipedia's AI cleanup guide. Run your draft through it, find the tells, rewrite.

That works if your goal is writing that doesn't look AI-generated.

My goal is writing that sounds like me.

Those are related but not the same thing. I can write a draft that passes every humanizer check and still sounds nothing like my published work. No AI tells, no voice. Sterile, voiceless prose is as detectable as slop — it just gets detected by different readers.


The thing I needed wasn't a list of patterns to avoid. It was a calibration against my own writing at its best.

So I built voice-humanizer. Same foundation as blader's tool — same 24 patterns, now 27 with three new ones from a community PR. But with one addition that changes what it does: a CORPUS.md file containing your own published writing, from which the skill extracts your voice fingerprint before it checks anything else.

Voice check first. AI pattern check second.

The fingerprint tracks what you reach for and — just as important — what you don't. Rhythm, specificity, the patterns absent from your corpus that signal drift.
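To make "fingerprint" concrete, here is a rough sketch of the idea in Python. The feature set and function names are my illustrative assumptions, not code from the repo; the actual skill works through prompt instructions, not a script like this.

```python
# Illustrative sketch only: the features (sentence length, em dash rate,
# list width) and the function names are assumptions, not the skill's internals.
import re
from statistics import mean

def extract_fingerprint(corpus: str) -> dict:
    """Reduce a writing corpus to a few measurable voice signals."""
    sentences = [s for s in re.split(r"[.!?]+\s+", corpus) if s.strip()]
    # Crude list detection: comma-separated items inside one sentence.
    list_widths = [s.count(",") + 1 for s in sentences if "," in s]
    return {
        "avg_sentence_words": mean(len(s.split()) for s in sentences),
        "em_dashes_per_1k_words": 1000 * corpus.count("—") / max(len(corpus.split()), 1),
        "typical_list_width": mean(list_widths) if list_widths else 0,
    }

def flag_voice_drift(draft: str, fp: dict) -> list[str]:
    """Voice check first: compare a draft against the corpus baseline."""
    draft_fp = extract_fingerprint(draft)
    flags = []
    if draft_fp["typical_list_width"] > fp["typical_list_width"] + 2:
        flags.append("list wider than your corpus norm")
    if draft_fp["em_dashes_per_1k_words"] > 2 * fp["em_dashes_per_1k_words"] + 1:
        flags.append("em dash rate above your corpus baseline")
    return flags
```

Run a five-item sentence against a corpus that compresses lists to two, and the list-width check fires even though no generic AI pattern would.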

Here's what that looks like in practice. I ran voice-humanizer on a draft of this post before publishing. It caught this:

Before (draft):

The fingerprint tracks rhythm patterns, paragraph opening style, specificity signals, what you reach for when you need a concrete detail, and — just as important — what you don't do.

Flag:

Voice drift — list of five items where your corpus shows you compress to two. Em dash doing emotional emphasis work your corpus handles structurally.

After:

The fingerprint tracks what you reach for and — just as important — what you don't. Rhythm, specificity, the patterns absent from your corpus that signal drift.

No AI pattern was triggered. A generic humanizer would have passed this. Voice-humanizer caught it because the corpus knew this author compresses lists. That's the difference.


When it flags something now, it doesn't just say "this pattern looks like AI." It says "this reads as Claude because it uses three parallel items where your corpus shows you compress to two. Here's what you'd likely do instead."

That's a different kind of feedback.


The corpus approach also solves a problem humanizer can't: false positives.

My writing uses em dashes. Not excessively, but deliberately — once per piece, structurally. A generic humanizer would flag that. Voice-humanizer won't, because it appears in the corpus. It's my pattern, not AI bleeding through.

Same for any other stylistic choice that looks like an AI tell in isolation but is actually part of your voice. The corpus is the ground truth.


You can use voice-humanizer with your own writing. The repo is public: github.com/dannwaneri/voice-humanizer

CORPUS.md is gitignored — your writing stays private. CORPUS.example.md shows you what to put there. Five questions in SETUP.md help you extract your own voice fingerprint before you start.

It won't work without a corpus. That's intentional. A humanizer calibrated to nobody's voice in particular isn't calibrated to yours.

Credit to blader for the foundation — the pattern list and skill format this is built on. Voice-humanizer solves a narrower problem for a specific kind of writer: someone who's been writing long enough to know what their best work sounds like — and doesn't want AI assistance to flatten it.

Top comments (16)

Ali-Funk

I saved it to read a second time. Before I build my own, I'll test the standard market tools first. Cool project. @dannwaneri

Daniel Nwaneri

That's exactly the right order. Test the standard tools first, know what they do and don't catch, then decide if calibrating to your own voice is worth the extra setup.

Curious what you find when you run your writing through the generic humanizer. What it flags might tell you something about your patterns.

Ali-Funk

Your ideas are genuinely refreshing and I need to try this out first thing tomorrow

Ingo Steinke, web developer

The question remains: what is "my tone," and how does training on past material not hold back developing a better writing style? I always felt my sentences were too long and my texts too hard to understand, yet some of my posts became quite popular. I'd choose popular + recent + revised posts with a high readability score and retrain the system with newer material again a few months from now. And I'd have to split technical DEV posts from cultural blogging aimed at a more general audience.

Daniel Nwaneri

The em dash problem is real. On a Mac it's Option+Shift+Hyphen; on Windows it's Alt+0151. Worth adding to muscle memory if you write a lot.

The corpus staleness question is the sharpest thing in this thread. You're right that training on old material risks optimizing for who you were. Weight recent pieces more heavily, revisit the corpus every few months, and treat it as a living document, not a fixed baseline.

The split corpus idea — technical posts separate from cultural writing — is something I hadn't considered and probably should implement. Different registers, different fingerprints. Worth building into the setup instructions.

Ingo Steinke, web developer

Linux doesn't seem to have a similar preconfigured way to type that character and I haven't even been missing it at all. I suppose that you also use special typographic quotation marks instead of the ASCII 34 " replacement?

Ingo Steinke, web developer

"My writing uses em dashes."

I always wondered how people do this. There is no such symbol in the standard German keyboard layout, so for me it's less likely than using an emoji in my text.

Matthew Hou

The corpus-first approach is the right call. I've been working on something adjacent — maintaining voice consistency across different content types (articles, comments, product copy) — and the lesson is the same: generic detection catches AI tells but misses voice drift. The false positive problem you describe is real. I use em dashes deliberately too, and every generic checker flags them. Having a ground truth corpus that says 'this is actually how this person writes' changes the signal entirely. One thing I'd add: the corpus should probably evolve. Your writing voice shifts over months. A static CORPUS.md calibrated to writing from a year ago might start penalizing your current voice. Have you thought about a rolling window approach?

Daniel Nwaneri

The rolling window problem is real and I don't have a clean solution yet. My working approach is manual: revisit the corpus every few months, weight recent pieces more heavily, remove anything that no longer represents how I write.

The harder version of your question: if your voice is shifting because of AI assistance, the corpus starts capturing AI-influenced voice as authentic. At what point does the ground truth stop being ground truth?

Maintaining voice consistency across content types is the adjacent problem I haven't solved. Articles and comments already feel like different registers, and product copy is a third one entirely. Separate corpus files per content type is the obvious answer, but it adds friction most people won't accept.
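The "weight recent pieces more heavily" policy could be automated as exponential decay over a piece's age. The 180-day half-life and the data shape here are assumptions for illustration, not anything implemented in voice-humanizer:

```python
# Illustrative sketch: exponential recency weighting for corpus pieces.
# The half-life default and the dataclass shape are assumptions.
from dataclasses import dataclass
from datetime import date

@dataclass
class CorpusPiece:
    title: str
    published: date
    text: str

def piece_weight(piece: CorpusPiece, today: date, half_life_days: float = 180.0) -> float:
    """A piece half_life_days old counts half as much as one published today."""
    age_days = (today - piece.published).days
    return 0.5 ** (age_days / half_life_days)
```

With a 180-day half-life, a year-old post contributes roughly a quarter of the weight of a new one, which approximates the manual "revisit every few months" policy without deleting anything.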

klement Gunndu

The corpus decay problem is real — we ran into this building content pipelines where the "voice fingerprint" drifted after just a few months of AI-assisted editing. Curious if you've considered versioning the CORPUS.md to track that drift over time?

Daniel Nwaneri

Versioning CORPUS.md is the right instinct: treating the voice fingerprint as a living document rather than a fixed calibration. I haven't implemented it yet, but the approach would be semantic versioning tied to publishing milestones: v1.0 captures pre-AI-assisted writing, each major drift gets a new version, and rollback is available if the current voice diverges too far from the baseline.

The harder question you're pointing at: at what point does the drifted version become the authentic voice rather than a corrupted one? If the writing genuinely improved through AI collaboration, penalizing that drift is the wrong call. The version history then becomes a record of how the voice evolved, which is a different thing from a record of how it degraded.
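One way to decide when "drift gets a new version": compare a cheap fingerprint vector across corpus revisions and bump the major version once the distance crosses a threshold. The features and the threshold below are assumptions for illustration, not an implemented part of the skill:

```python
# Illustrative sketch: fingerprint distance between two CORPUS.md versions.
# Features are unnormalized and the threshold is arbitrary; tune both.
import math

def fingerprint_vector(text: str) -> list[float]:
    words = text.split() or [""]
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    return [
        len(words) / max(len(sentences), 1),  # average sentence length
        text.count("—") / len(words),         # em dash rate per word
        text.count(",") / len(words),         # list density proxy
    ]

def drift(old_corpus: str, new_corpus: str) -> float:
    """Euclidean distance between the two versions' fingerprints."""
    a, b = fingerprint_vector(old_corpus), fingerprint_vector(new_corpus)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def needs_major_bump(old_corpus: str, new_corpus: str, threshold: float = 0.25) -> bool:
    return drift(old_corpus, new_corpus) >= threshold
```

Tagging each CORPUS.md revision in git and running a check like this between tags would give the rollback point described above.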

Dejan

Congratulations on your success. However, human evaluation of text or speech has common flaws. Some detection systems rate sentences of fewer than 10 characters as over 80% AI-generated; that's a problem with the tools, not a methodological one. Speaking of the corpus you mentioned: it knows how to recognize sentences. Whether a model finds a word, learns the probability of the next word, and then constructs the entire sentence differs from model to model, and there are many common methods. Plenty of identification tools could probably rate this very comment as AI-generated. It's important to remember that AI must recognize that humans are prone to making mistakes. That doesn't require reinforcement learning or strong AI; it needs to be able to analyze the sentiment of the entire sentence. I think that's the answer. In any case, congratulations again, and I sincerely hope you continue to post articles like this.

adam raphael

Hello. I saved it so I can read it again later. I also want to give this a try.

Benjamin Nguyen

Wow!
