Luca Bartoccini for Superdots

Posted on May 17 • Originally published at superdots.sh

How to Use AI for Performance Reviews (Done Right)

#performancereviews #forhr #feedback #management

Most managers who try AI for performance reviews use it for the one part it handles worst: writing the review from scratch.

The sequence goes like this. Review season arrives. You open ChatGPT, type "write a performance review for [name]," paste in a few notes, and hit go. What comes back is technically a performance review — right words, right structure. And it sounds like every other AI-generated review anyone has ever read, which is to say it sounds like nothing. The employee can tell. You can tell. It goes back in the draft pile.

The fix isn't a better prompt. It's inserting AI at a different point in the process entirely.

What most managers get wrong

HR managers and team leads who've shifted to AI-assisted reviews describe the same failure mode, almost word for word: they asked AI to do the hard part — write specific, meaningful feedback — without giving it what it actually needs: specific, meaningful observations. AI is a transformer, not an oracle. It can reshape input, restructure it, and make it clearer. It cannot invent the substance.

The managers who get real value use AI as a thinking tool, not a writing tool. They use it:

To organize raw observations before writing anything
To surface gaps and recency bias in their own notes
To audit a draft for vague language before sending
To turn bullet points into polished prose — after the thinking is done

The ordering matters. Writing comes last, not first.

What AI can (and can't) help with

Where AI adds genuine value:

Structuring messy observations. You have 12 months of 1:1 notes, project comments, and email threads. AI can organize those into themes — strengths, development areas, behavioral patterns — so you're not staring at a wall of text when you sit down to write.
Flagging recency bias. Recency bias is the tendency to weight recent events more heavily than earlier ones in the same period — a well-documented pattern in performance evaluation research. AI can scan your notes and flag when your examples cluster in Q3-Q4 with nothing from the first half of the year.
Catching vague language. "Strong communicator." "Team player." "Shows initiative." These phrases say nothing. AI can identify them in your draft and prompt you to replace each one with a specific example from your notes.
First draft from your material. Once you have structured, specific bullet points, AI is excellent at turning them into readable prose. This genuinely saves time — but only if the input is good.

Where AI doesn't help:

AI can't know what actually happened. You do.
AI can't judge whether a behavior was a one-time event or a pattern. You can.
AI can't replace the delivery conversation, which matters more than the document.

Teams that want to ground feedback in year-round data can look at AI people analytics software, which makes observation more systematic from the start.

The three prompts every manager needs

Here's the pre-review framework that HR teams who've made this shift use consistently. Run these three prompts in sequence before writing a single word of the actual review.

Prompt 1: The brain dump organizer

Paste your raw notes — 1:1 meeting notes, project feedback, peer comments, goal progress — and use:

"Here are my raw notes about [employee name]'s performance this year. Organize them into 3-4 themes: what they did well, where they struggled, patterns in their working style, and any areas where I seem to have little evidence. Don't write the review — just organize the material and flag anything that looks underrepresented or missing."

The output isn't a draft. It's a structured view of your material so you can see where you have real evidence and where you're working on impression.
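If you run this prompt for several direct reports, it can help to assemble it the same way each time. Here is a minimal Python sketch of that idea. The instruction text is adapted from Prompt 1; the function name `build_brain_dump_prompt` and the "Notes:" layout are illustrative assumptions, not part of any tool's API. It only builds the text you would paste into a chat window.

```python
# Instruction text adapted from Prompt 1 in this article.
# Names and layout below are hypothetical, chosen for illustration.
PROMPT_1 = (
    "Here are my raw notes about {name}'s performance this year. "
    "Organize them into 3-4 themes: what they did well, where they struggled, "
    "patterns in their working style, and any areas where I seem to have "
    "little evidence. Don't write the review. Just organize the material and "
    "flag anything that looks underrepresented or missing."
)


def build_brain_dump_prompt(name: str, notes: list[str]) -> str:
    """Combine the organizer instruction with raw notes, one bullet per note."""
    # Drop blank entries and stray whitespace from copy-pasted notes.
    cleaned = [note.strip() for note in notes if note.strip()]
    if not cleaned:
        # Thin notes are the real problem; no prompt fixes an empty input.
        raise ValueError("no notes to organize; gather material first")
    body = "\n".join(f"- {note}" for note in cleaned)
    return PROMPT_1.format(name=name) + "\n\nNotes:\n" + body
```

The deliberate choice here is to fail loudly on empty notes rather than send a prompt with nothing to organize.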

Prompt 2: The recency bias check

Run this on your organized notes before writing anything:

"Review these observations. Are there time periods — specific quarters or project phases — that appear underrepresented? Are there themes where all my examples come from the last 2-3 months? Flag any temporal gaps in my evidence and suggest what I might be forgetting."

Managers who use this prompt regularly tend to find the same thing: Q1 and Q2 observations are missing from their drafts, not because nothing happened, but because recent memory crowds out earlier events. AI makes the gap visible before it becomes a fairness problem.
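If your notes carry dates, you can get a rough version of this check before you even open a chat window. The sketch below counts notes per calendar quarter and lists the quarters that look thin. It assumes a calendar-year review period, and the 15% threshold (`min_share`) is an arbitrary illustrative value, not a research-backed cutoff. All function names are hypothetical.

```python
from collections import Counter
from datetime import date

QUARTERS = ("Q1", "Q2", "Q3", "Q4")


def quarter_of(d: date) -> str:
    """Map a date to its calendar quarter label, e.g. February -> 'Q1'."""
    return f"Q{(d.month - 1) // 3 + 1}"


def coverage_by_quarter(note_dates: list[date]) -> dict[str, int]:
    """Count how many notes fall in each calendar quarter."""
    counts = Counter(quarter_of(d) for d in note_dates)
    return {q: counts.get(q, 0) for q in QUARTERS}


def underrepresented(note_dates: list[date], min_share: float = 0.15) -> list[str]:
    """Return quarters holding less than min_share of all notes."""
    counts = coverage_by_quarter(note_dates)
    total = sum(counts.values())
    if total == 0:
        # No notes at all: every quarter is a gap.
        return list(QUARTERS)
    return [q for q, n in counts.items() if n / total < min_share]
```

A count like this only shows where evidence is missing. It cannot tell you what you forgot, which is what the second half of Prompt 2 is for.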

Prompt 3: The specificity audit

Run this on your draft before finalizing:

"Review this performance review draft. For each sentence that could apply to any employee — generic praise, vague criticism, phrases like 'team player' or 'areas for growth' — flag it and ask me: what specific behavior or example does this come from? I want to replace every generic sentence with one that only this person would recognize."

This single prompt meaningfully improves review quality. Most first drafts have 4-6 sentences that fail this test. The goal isn't to add more words. It's to replace empty ones.
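For a quick first pass before running the prompt, a plain keyword scan can catch the most obvious offenders. This Python sketch flags sentences containing the stock phrases named in this article. It is much cruder than the prompt: it only finds phrases on the list and will miss generic sentences worded differently. The function names and the naive sentence splitter are assumptions made for illustration.

```python
import re

# Stock phrases taken from this article; extend the tuple with your own.
GENERIC_PHRASES = (
    "team player",
    "strong communicator",
    "shows initiative",
    "areas for growth",
)


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def flag_generic(draft: str, phrases: tuple[str, ...] = GENERIC_PHRASES) -> list[tuple[str, str]]:
    """Return (phrase, sentence) pairs for sentences containing a stock phrase."""
    flagged = []
    for sentence in split_sentences(draft):
        lowered = sentence.lower()
        for phrase in phrases:
            if phrase in lowered:
                flagged.append((phrase, sentence))
    return flagged
```

Each flagged sentence is a prompt for the same question the audit asks: what specific behavior or example does this come from?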

How to audit your own feedback for consistency

Beyond the three prompts, AI is useful for something most managers skip: checking whether your language is consistent across the team.

If you manage 6-8 people, copy the first paragraph from each completed review into a single document and run:

"Here are opening paragraphs from [number] performance reviews I've written. Are there phrases or sentences that appear in multiple reviews? Flag any language that suggests I'm using templated descriptions rather than observations specific to each person."

This matters for two reasons. Your team members compare notes — template language gets noticed. And inconsistent specificity across reviews can create problems if the documents are later examined for fairness or bias patterns.
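The same cross-review check can be roughed out without AI by looking for word sequences that repeat across reviews. The Python sketch below assumes you have each opening paragraph as plain text keyed by a label of your choosing. The four-word window (`n=4`) is an arbitrary starting point and the function names are hypothetical. It only catches verbatim repeats, not paraphrased templates.

```python
import re
from collections import defaultdict


def word_ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Return the set of n-word sequences in text, lowercased."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_phrases(reviews: dict[str, str], n: int = 4) -> dict[str, list[str]]:
    """Map each n-word phrase found in more than one review to those reviews."""
    seen: dict[str, set[str]] = defaultdict(set)
    for label, text in reviews.items():
        for gram in word_ngrams(text, n):
            seen[" ".join(gram)].add(label)
    # Keep only phrases that appear in at least two different reviews.
    return {phrase: sorted(labels) for phrase, labels in seen.items() if len(labels) > 1}
```

A shorter window produces more noise from ordinary phrasing, while a longer one only catches copy-pasted sentences, so expect to tune `n` for your own writing.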

AI tools for change management include several platforms that build this kind of cross-review consistency analysis directly into their feedback modules, which is useful if you're rolling this out at scale.

The 30-minute pre-review workflow

Once you're using AI as a thinking tool, the process becomes predictable. Here's the sequence:

Minutes 0–5: Gather your material. Pull together your 1:1 notes, project artifacts, peer feedback, and the employee's goals from the start of the year. If your notes are thin, use AI to prompt your memory (Prompt 1 above works here too — paste the goals and ask what questions would help you recall the year).

Minutes 5–15: Run the brain dump organizer. Paste your notes into Prompt 1. Review the themes the AI returns. Add anything it missed or misclassified. You should now have 3-4 organized themes with the evidence clearly attributed.

Minutes 15–20: Add specifics to each theme. For each theme, write 2-3 bullet points with project names, observed behaviors, and outcomes where you have them. No prose yet. This is the step AI can't do for you — and it's the most important one.

Minutes 20–30: Draft from your bullets. Use:

"Here are bullet points organized into themes for [employee name]'s performance review: [paste themes and bullets]. Write a first draft in a professional, direct tone. Each paragraph should include at least one specific example from the bullets. Avoid generic phrases like 'team player,' 'strong communicator,' or 'areas for growth.'"

Plan 20-30 minutes after this to edit and polish. You'll be revising something specific and accurate, not starting from scratch. The step-by-step guide to AI performance review drafting covers the mechanics of the drafting phase in more detail if you want the full process.

What goes wrong

Over-relying on AI for the substance. If your notes are thin, no prompt will fix that. AI organizes and articulates. It doesn't observe.

Skipping the specificity audit. Prompt 3 is the one most managers skip because the draft looks polished. Polished generic output is worse than rough specific output — it obscures the problem until the employee reads it.

Not reading the draft aloud. AI prose reads fine on screen and hollow in a 1:1 conversation. If you're discussing the review with your employee, read it aloud before the meeting. You'll immediately hear what sounds like you and what sounds like a template.

Expecting efficiency from the wrong step. AI saves time on drafting, not on observation. If you consistently reach review season with nothing to work from, the fix is a year-round note-taking habit — not a better prompt. Even 5 minutes after each 1:1 to log one specific thing changes what you have to work with in December.

Try this today

Take the most recent performance review you wrote, ideally from the last cycle. Paste it into ChatGPT or Claude and run:

"Here is a performance review I wrote. For each sentence that could apply to any employee — vague praise, non-specific criticism, or generic phrases — flag it. For each flagged sentence, ask me: what specific behavior or event is this based on?"

Count how many sentences get flagged. For most managers doing this for the first time, the number is 4-7 in a standard review. That's your baseline.

For the next review you write, run the same prompt on your draft before it's final. The gap between first pass and final version is the work AI is actually doing for you: not writing the review, but making you a more specific, better-prepared reviewer.
