<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Raunak Kathuria</title>
    <description>The latest articles on DEV Community by Raunak Kathuria (@raunakkathuria).</description>
    <link>https://dev.to/raunakkathuria</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3833513%2F1c61582a-db6e-4c5b-a126-109ffab3ef60.png</url>
      <title>DEV Community: Raunak Kathuria</title>
      <link>https://dev.to/raunakkathuria</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/raunakkathuria"/>
    <language>en</language>
    <item>
      <title>The AI model is a commodity. Taste isn't.</title>
      <dc:creator>Raunak Kathuria</dc:creator>
      <pubDate>Mon, 06 Apr 2026 13:43:10 +0000</pubDate>
      <link>https://dev.to/raunakkathuria/the-ai-model-is-a-commodity-taste-isnt-5177</link>
      <guid>https://dev.to/raunakkathuria/the-ai-model-is-a-commodity-taste-isnt-5177</guid>
      <description>&lt;p&gt;When everyone has access to the same AI, your edge is the judgment behind the output.&lt;/p&gt;

&lt;p&gt;The first &lt;a href="https://www.reddit.com/r/openclaw/comments/1s6w0rt/comment/od4vjq0" rel="noopener noreferrer"&gt;comment&lt;/a&gt; on my Reddit post wasn't a question.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I don't think I've ever seen so many stupid ideas and em-dashes in such a short amount of text.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;No counter-argument. No nuance. Just a precise observation about punctuation that I couldn't shake, because they were right. Not about the em-dashes, but about the generic text underneath them.&lt;/p&gt;

&lt;p&gt;The post had been written with AI assistance. I'd used em-dashes in almost every sentence, in a rhythm that wasn't mine. The Reddit commenter knew something was off. That's the thing about taste — you recognise the absence of it before you can name it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Everyone has the same AI
&lt;/h2&gt;

&lt;p&gt;Here's the uncomfortable truth: &lt;strong&gt;the model is no longer the differentiator&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Give ten capable people a reasonable prompt and you'll get ten tidy versions of the same answer. Same sentence balance. Same transitions. Same polished neutrality.&lt;/p&gt;

&lt;p&gt;The AI optimises for readability, and readable turns out to mean forgettable.&lt;/p&gt;

&lt;p&gt;What nobody's discussing: AI was trained to be useful to everyone. That means it has no taste. It has capability without judgment, fluency without instinct. It can write anything, which means it sounds like nothing in particular.&lt;/p&gt;

&lt;h2&gt;
  
  
  Taste is judgment you can teach
&lt;/h2&gt;

&lt;p&gt;Everyone who's spent years thinking about something develops a &lt;em&gt;taste&lt;/em&gt; for it. Not just preferences: &lt;em&gt;a finely tuned sense of what belongs and what doesn't&lt;/em&gt;. What to say. What to leave out. What sounds true and what sounds performed.&lt;/p&gt;

&lt;p&gt;That taste isn't in the model. It's in you.&lt;/p&gt;

&lt;p&gt;Taste is the personalisation you bring to AI — the part that can't be trained on someone else's data.&lt;/p&gt;

&lt;p&gt;A doctor has it in diagnosis, the pattern recognition from thousands of cases that flags something wrong before the tests confirm it. A great editor has it for sentences, whether a line is doing the work it thinks it's doing.&lt;/p&gt;

&lt;p&gt;AI can generate the words. It can't generate the judgment behind them.&lt;/p&gt;

&lt;p&gt;The question is whether you can give it yours.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I built to do exactly that
&lt;/h2&gt;

&lt;p&gt;After the Reddit comment, I didn't just edit the post. I tried to fix the root cause.&lt;/p&gt;

&lt;p&gt;The problem wasn't the output. It was that I'd given the AI no real constraint to work with. &lt;em&gt;Write like me&lt;/em&gt; is not a constraint. It's a wish. The AI interpreted it as: write clearly, write professionally, write in a way that nobody will object to. And nobody would object to it, because it had no point of view.&lt;/p&gt;

&lt;p&gt;So I spent a few sessions trying to articulate what my voice actually was. Not in general terms, specific ones.&lt;/p&gt;

&lt;p&gt;I asked the agent to read through my Substack posts, my LinkedIn writing, our conversation history. Identify the patterns. Push back on what it got wrong. Refine until the description felt accurate enough to be useful.&lt;/p&gt;

&lt;p&gt;The output was a file I called &lt;code&gt;TASTE.md&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It's not a style guide. Style guides tell you how to format things. This was closer to a brief for what kind of mind should be speaking: what it cares about, what it would never say, the things that would sound wrong even if they were technically correct.&lt;/p&gt;
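
&lt;p&gt;To make that concrete, here's a short sketch of what such a file might contain. Every line below is illustrative; it shows the shape of the file, not the contents of mine:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# TASTE.md (illustrative excerpt)

## What this voice cares about
- Concrete claims over abstract ones. If a sentence could appear
  in anyone's post, cut it or sharpen it.
- One idea per paragraph. Fragments are fine.

## What this voice would never do
- Open with throat-clearing ("In today's fast-paced world...").
- Use an em-dash more than once in a section.
- Hedge a claim it actually believes.

## What sounds wrong even when technically correct
- Balanced "on the one hand / on the other" constructions.
- Transitions that exist only to sound smooth.&lt;/code&gt;&lt;/pre&gt;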

&lt;p&gt;The difference in the writing was immediate. Not perfect — it took a proper audit loop to get the AI detection score from &lt;strong&gt;78% down to 11%&lt;/strong&gt;: the first draft with &lt;code&gt;TASTE.md&lt;/code&gt; loaded still scored 62%, and three rounds of auditing got it to 11%. But the direction was right from the start.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters beyond writing
&lt;/h2&gt;

&lt;p&gt;Writing is just the most visible place where taste shows up.&lt;/p&gt;

&lt;p&gt;In code review, the taste is your sense of what good architecture feels like in your specific system, not in the benchmark, not in general, in yours. In product decisions, it's the feel for which trade-off your team can absorb right now and which one will cost you a quarter.&lt;/p&gt;

&lt;p&gt;These aren't prompting techniques. They're the accumulated judgment you've built over years, finally expressible in a form something else can use.&lt;/p&gt;

&lt;p&gt;The teams that get the most out of AI aren't the ones with the best prompts. They're the ones who've taken the time to write their judgment down: something like &lt;code&gt;TASTE.md&lt;/code&gt; for their domain, specific enough to hand to something that has no beliefs of its own.&lt;/p&gt;

&lt;h2&gt;
  
  
  The inversion worth sitting with
&lt;/h2&gt;

&lt;p&gt;AI is trained to be a generalist, and treating it as one is what holds people back.&lt;/p&gt;

&lt;p&gt;The same model with your specific taste, the things that would sound wrong to you even if they're technically correct, produces something different from the same model without it. Each piece builds on the last. The output starts to sound like it came from someone.&lt;/p&gt;

&lt;p&gt;And the part I didn't expect: building &lt;code&gt;TASTE.md&lt;/code&gt; didn't just make the AI more useful. It made me clearer about what I actually believe. You can't articulate your taste to something else without first understanding it yourself.&lt;/p&gt;

&lt;p&gt;The Reddit comment that stung on a Sunday evening was useful. Not because of the em-dashes, but because the commenter was right about how generic the text was.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Everyone has access to the same AI. What differentiates you is what you bring to it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;How clearly have you articulated your taste?&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is part one of a two-part series. Part two covers the exact process: how to build your own &lt;code&gt;TASTE.md&lt;/code&gt;, the audit loop I use, and what the before/after looks like.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://raunakkathuria.substack.com/p/the-ai-model-is-a-commodity-taste" rel="noopener noreferrer"&gt;raunakkathuria.substack.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>programming</category>
    </item>
    <item>
      <title>The Agentic Stack</title>
      <dc:creator>Raunak Kathuria</dc:creator>
      <pubDate>Mon, 30 Mar 2026 05:16:30 +0000</pubDate>
      <link>https://dev.to/raunakkathuria/the-agentic-stack-5ej7</link>
      <guid>https://dev.to/raunakkathuria/the-agentic-stack-5ej7</guid>
      <description>&lt;p&gt;We’ve spent years building software the same way. A service receives a request, calls another service, writes to a database. Something reads it back. You scale the slow parts, monitor the breaking ones.&lt;/p&gt;

&lt;p&gt;That model still works. It’s reliable, well-understood, and well-tooled. But something’s shifted — and if you haven’t felt it yet in your architecture decisions, you will soon.&lt;/p&gt;

&lt;p&gt;AI has introduced a new execution layer. Not infrastructure. Not middleware. Something that does work — the kind that used to require a human to sit down, think, and grind through.&lt;/p&gt;

&lt;h2&gt;
  
  
  The question nobody’s asking right now
&lt;/h2&gt;

&lt;p&gt;Most leaders ask: “Should we use AI?” That’s the wrong level. Everyone’s already using it — in their IDE, their code review tool, their incident runbook.&lt;/p&gt;

&lt;p&gt;Ask instead: What’s the architecture?&lt;/p&gt;

&lt;p&gt;Because when AI stops being a feature and starts being an execution layer — something that does work rather than assists with work — your system’s structure changes. How you define capability changes. How intent enters the system changes. That’s what the agentic stack is about, and it’s worth understanding before your team figures it out without you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three layers. You already know all of them
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Claw&lt;/strong&gt; is the new unit of architecture. A bounded execution unit with its own role, context, tools, and constraints. Not a microservice — it doesn’t expose an API endpoint. It checks Slack, reads files, calls APIs, drafts responses, takes action. It does things.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Skill&lt;/strong&gt; is the instruction layer. Reusable logic that tells a claw what to do, how to do it, what rules it’s operating under, and what good output looks like. If a claw is the executor, the skill is the judgment baked in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt&lt;/strong&gt; is how intent enters the system. Not a rigid API contract — a natural language instruction that activates claws and routes work through skills.&lt;/p&gt;

&lt;p&gt;Three layers. Each one maps to something familiar.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Claw is the architecture. Skill is the language. Prompt is the protocol. That’s the stack your team is building on — whether you’ve named it yet or not.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What this replaces and what it doesn't
&lt;/h2&gt;

&lt;p&gt;Here’s where most explanations go sideways: they frame this as replacement. It isn’t, and treating it that way will cause you to either over-invest or dismiss it entirely.&lt;/p&gt;

&lt;p&gt;Services still define how systems run. Databases still store state. APIs still integrate third-party tools. None of that changes.&lt;/p&gt;

&lt;p&gt;What changes is the coordination layer on top, the work that used to live in Notion docs, Jira comments, and Slack threads.&lt;/p&gt;

&lt;p&gt;The glue work. The “someone needs to look at this and decide” work. Before the agentic stack, that required a human. After, a claw handles it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Meet Priya
&lt;/h2&gt;

&lt;p&gt;She’s a Senior Engineering Manager at a scaling platform team. Her week, pre-agentic stack: 15 PRs moving through review, senior engineers burning two to three hours a day reading diffs, junior engineers waiting three to five days for feedback that came back cryptic half the time anyway.&lt;/p&gt;

&lt;p&gt;The process ran like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;PR opened → Priya manually assigns reviewer based on who knows the area&lt;/li&gt;
&lt;li&gt;Reviewer reads 300–500 lines of code&lt;/li&gt;
&lt;li&gt;Leaves comments → author reads them, guesses at intent → revises&lt;/li&gt;
&lt;li&gt;Reviewer re-reads, re-approves or re-requests changes&lt;/li&gt;
&lt;li&gt;Repeat until it's good enough&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's six to eight human touchpoints per PR. Multiply by 15 PRs a week.&lt;/p&gt;

&lt;p&gt;They tried building a claw to do first-pass reviews. The skill encoded the team’s standards — naming conventions, test coverage thresholds, security anti-patterns, documentation expectations. First two weeks were rough. It flagged too aggressively, left feedback in a tone that annoyed the engineers, and missed context on a few architectural decisions it didn’t have visibility into.&lt;/p&gt;

&lt;p&gt;They fixed the skill. Rewrote the tone guidelines, added the architecture docs to its context, tuned the flagging thresholds. By week four, it was doing what a good junior reviewer would do — and Priya’s seniors were only seeing the PRs that genuinely needed a human call.&lt;/p&gt;

&lt;p&gt;That’s the actual story. It wasn’t clean. It worked anyway.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it looks like in practice
&lt;/h2&gt;

&lt;p&gt;Here's a simplified skill definition for a code review claw:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;code-review&lt;/span&gt;
&lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Review pull requests against team engineering standards. Flag violations. Approve clean PRs. Escalate anything touching auth or payments.&lt;/span&gt;
&lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;repo&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;standards/engineering-guidelines.md&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.github/CODEOWNERS&lt;/span&gt;
&lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;read_pr_diff&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;post_review_comment&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;request_human_review&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;approve_pr&lt;/span&gt;
&lt;span class="na"&gt;constraints&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Never approve PRs touching /src/auth without human sign-off&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Always flag missing tests as blocking&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Post feedback in plain English, not just line references&lt;/span&gt;
&lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;structured review with verdict&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;approve | changes_requested | escalate&lt;/span&gt;
&lt;span class="s"&gt;The skill is the logic. The claw is the executor. The PR opening is the prompt.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No new service to build. No new API to maintain. You're composing behaviour, not writing code.&lt;/p&gt;

&lt;h2&gt;
  
  
  What actually changes for you
&lt;/h2&gt;

&lt;p&gt;Your senior engineers stop being the default reviewers. Their judgment gets encoded into skills — and applied at scale, consistently, without their calendar getting eaten. They shift from doing reviews to defining what a good review looks like. That’s a leverage upgrade.&lt;/p&gt;

&lt;p&gt;Code standards become executable, not aspirational. There’s a meaningful difference between a policy nobody reads and a skill a claw runs on every PR. One is stated. One is applied. You’ve probably written a lot of policies that belong in the second category.&lt;/p&gt;

&lt;p&gt;Your junior engineers get faster feedback loops. They learn faster because the signal isn’t filtered through whoever had time to review that week.&lt;/p&gt;

&lt;p&gt;And you get visibility into work that was previously invisible. Every claw action is logged. Every decision is traceable. Which is a kind of accountability your process almost certainly doesn’t have right now.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to start
&lt;/h2&gt;

&lt;p&gt;Pick one workflow that is a bottleneck. Not a hypothetical one, a specific named workflow that your team complains about in retros.&lt;/p&gt;

&lt;p&gt;Write down, in plain language, what a good human does in that workflow. Step by step. Including the judgment calls, the rules they apply, the things they'd escalate. That's your first skill. The trigger is your first prompt. The claw executes it.&lt;/p&gt;
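
&lt;p&gt;Written down in the same shape as the code-review skill above, a first skill for a hypothetical ticket-triage workflow might look like this (the names, paths, and tools are illustrative, not a fixed schema):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;name: ticket-triage
role: Read new support tickets, label severity, and route to the owning team. Escalate anything mentioning data loss or billing.
context:
  - repo: docs/oncall-runbook.md
  - file: teams/ownership-map.yaml
tools:
  - read_ticket
  - apply_label
  - assign_team
  - page_oncall
constraints:
  - Never close a ticket without a human decision
  - Always escalate suspected data loss, even at low confidence
  - Quote the customer's words when summarising; don't paraphrase complaints
output:
  - "Triage note with verdict: routed | needs_info | escalated"&lt;/code&gt;&lt;/pre&gt;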

&lt;p&gt;Run it alongside your human process for two weeks. Don’t replace anything yet — just compare the outputs. Trust the parts that are right. Fix the parts that aren’t.&lt;/p&gt;

&lt;p&gt;That’s it. Everything else is refinement.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The most interesting thing about this stack isn’t the technology. It’s that the judgment your best people carry around in their heads — the stuff that makes them irreplaceable — can now be encoded and applied everywhere, all the time, without them being in the room.&lt;br&gt;
That’s what’s actually happening. The naming is new. The leverage isn’t.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;What workflow in your team still requires a human because nobody’s written down what good looks like? Genuinely curious — drop it in the comments.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>openclaw</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
