resumeadapter

Posted on May 21

Why We Don't Use a Single LLM Prompt to Rewrite Resumes (and What We Built Instead)

#ai #architecture #career #llm

The hype version of an AI resume tool
If you've used any "AI resume optimizer" in the last 18 months, it probably works like this:

1. You upload a resume (PDF or DOCX).

2. You paste a job description.

3. The app stuffs both into one massive prompt that says something like "Rewrite this resume to match this job. Add ATS keywords. Sound professional. Don't make stuff up."

4. The LLM returns a wall of text.

5. The app renders it back as a "new resume."

This is a ChatGPT wrapper with a file uploader. And on the surface, it works. The output reads fluently. Recruiter-y verbs everywhere. Bullets feel quantified.

But run the same input twice and you get two different resumes. Run it across 100 users and you'll see invented job titles, dropped certifications, fabricated metrics, and sometimes a "Master's degree from Stanford" that never existed.

That's not a tooling problem. It's an architecture problem.

A resume is not a document. It is structured data.
This is the principle we ended up building ResumeAdapter around.

Treat a resume as text and you're at the mercy of whatever the LLM happened to sample that day. Treat it as a typed object, with parsed fields, validated schemas, and deterministic comparison logic, and most of the failure modes above disappear.

The reframe sounds obvious in retrospect. But almost every tool in the category still operates on text blobs because building the structured layer is harder, slower, and less Twitter-friendly than shipping a prompt.

The five components
Here's the pipeline we landed on. Each piece does one thing, has its own schema, and can be tested in isolation.

CRDM: Canonical Resume Data Model
The single structured representation of a user's resume. Every parsed CV gets turned into a CRDM with typed fields:

type CRDM = {
  contact: ContactInfo;
  experience: ExperienceEntry[];
  skills: Skill[];
  education: EducationEntry[];
  achievements: Achievement[];
  summaries: Summary[];
  metadata: ResumeMetadata;
  recencyMap: RecencyMap;
};

No missing fields. No ambiguity. Validated by a Zod schema before it enters any downstream step. If the parser can't extract a field with high enough confidence, that's encoded in the metadata, not silently filled with a hallucination.
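
One way the "low confidence goes into metadata" rule could look, as a minimal sketch. The `ParsedField` wrapper, the `0.8` threshold, and `lowConfidenceFields` are illustrative assumptions, not the actual ResumeAdapter types.

```typescript
// Hypothetical wrapper: every extracted value carries the parser's confidence.
type ParsedField<T> = { value: T | null; confidence: number }; // confidence in [0, 1]

// A tiny slice of a resume model, for illustration only.
type ContactDraft = {
  name: ParsedField<string>;
  email: ParsedField<string>;
  phone: ParsedField<string>;
};

// Assumed threshold; a real pipeline would tune this.
const MIN_CONFIDENCE = 0.8;

// Returns the names of fields that are missing or below the threshold,
// so they can be recorded in metadata instead of being guessed.
function lowConfidenceFields(contact: ContactDraft): string[] {
  return (Object.keys(contact) as (keyof ContactDraft)[]).filter((key) => {
    const field = contact[key];
    return field.value === null || field.confidence < MIN_CONFIDENCE;
  });
}

const draft: ContactDraft = {
  name: { value: "Ada Lovelace", confidence: 0.97 },
  email: { value: "ada@example.com", confidence: 0.92 },
  phone: { value: null, confidence: 0 },
};

const flagged = lowConfidenceFields(draft);
```

Here `flagged` contains only `"phone"`: the field stays explicitly empty and flagged, and nothing downstream has to guess whether a value was real.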

CJDM: Canonical Job Description Model
Same idea, applied to the job description. Responsibilities, required skills, preferred skills, keywords, seniority signals, domain requirements, all extracted into a typed structure.

The point: once both sides are CRDM and CJDM, comparing them is a data problem, not a prompting problem.

GAE: Gap Analysis Engine
This is where most tools quietly cheat. They prompt the LLM with "What are the gaps between this resume and this job?" and trust the answer.

We made GAE a pure function:

// GAE as a function signature: CRDM + CJDM in, gap report out.
type GAE = (resume: CRDM, job: CJDM) => {
  keywordGaps: string[];
  skillGaps: SkillGap[];
  experienceRelevanceDelta: number;
  missingAccomplishments: string[];
  recencyMismatches: RecencyMismatch[];
  roleSpecificGaps: RoleGap[];
};

No LLM. Deterministic. Testable. Reproducible. If you run it twice on the same input, you get the same output. If a recruiter or a user asks "why did you flag this gap?", we can show them the exact comparison that produced it.
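
To make "pure function" concrete, here is a deliberately small sketch covering only skill and keyword gaps. `ResumeModel`, `JobModel`, and `analyzeGaps` are simplified stand-ins invented for this example, and the substring matching is an assumption, not how the real GAE compares terms.

```typescript
// Simplified stand-ins for the CRDM and CJDM types described above.
type ResumeModel = { skills: string[]; bullets: string[] };
type JobModel = { requiredSkills: string[]; keywords: string[] };

type GapReport = { skillGaps: string[]; keywordGaps: string[] };

const normalizeTerm = (s: string): string => s.trim().toLowerCase();

// Pure function: no I/O, no randomness, no LLM. Same input, same output.
function analyzeGaps(resume: ResumeModel, job: JobModel): GapReport {
  const skills = new Set(resume.skills.map(normalizeTerm));
  const resumeText = resume.bullets.map(normalizeTerm).join("\n");

  const skillGaps = job.requiredSkills.filter((s) => !skills.has(normalizeTerm(s)));
  // Naive substring match; a real engine would tokenize and handle synonyms.
  const keywordGaps = job.keywords.filter((k) => !resumeText.includes(normalizeTerm(k)));

  // Sort so the report does not depend on the order of the job's lists.
  return { skillGaps: [...skillGaps].sort(), keywordGaps: [...keywordGaps].sort() };
}

const sampleResume: ResumeModel = {
  skills: ["TypeScript", "PostgreSQL"],
  bullets: ["Built a billing API in TypeScript", "Cut query latency by 40%"],
};
const sampleJob: JobModel = {
  requiredSkills: ["typescript", "Kubernetes"],
  keywords: ["API", "observability"],
};

const report = analyzeGaps(sampleResume, sampleJob);
```

For this input `report.skillGaps` is `["Kubernetes"]` and `report.keywordGaps` is `["observability"]`, on every run. That makes a unit test a one-liner, and a bug report reduces to a pair of JSON fixtures.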

This single decision (gap analysis as a pure function instead of a prompt) eliminates a huge class of trust problems.

RPG: Rewrite Plan Generator
Given a CRDM and a GAE output, RPG produces a blueprint:

Which specific bullets to rewrite

Which skills to surface

Where to add quantifications

Which generic verbs to replace

Which ATS constraints to enforce

Critically, this is the plan, not the rewrite. It's a structured set of instructions that downstream steps will execute. You can render this plan to the user and they can approve, reject, or edit individual items before any text generation happens.
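
A plan like this maps naturally onto a discriminated union. The sketch below is an assumption about how such a plan could be typed and filtered by user decisions. The `PlanItem` variants, `approvedItems`, and `describeItem` are hypothetical names, and editing an item is left out for brevity.

```typescript
// Hypothetical plan item shapes; each variant is one kind of instruction.
type PlanItem =
  | { id: string; kind: "rewriteBullet"; bulletIndex: number; gap: string }
  | { id: string; kind: "surfaceSkill"; skill: string }
  | { id: string; kind: "addQuantification"; bulletIndex: number }
  | { id: string; kind: "replaceVerb"; bulletIndex: number; verb: string };

type Decision = "approve" | "reject";

// Keeps only the items the user approved. Items without a decision are
// treated as not approved, so nothing runs without explicit consent.
function approvedItems(plan: PlanItem[], decisions: Record<string, Decision>): PlanItem[] {
  return plan.filter((item) => decisions[item.id] === "approve");
}

// One line of human-readable text per item, for a preview screen.
function describeItem(item: PlanItem): string {
  switch (item.kind) {
    case "rewriteBullet":
      return `Rewrite bullet ${item.bulletIndex + 1} to address "${item.gap}"`;
    case "surfaceSkill":
      return `Surface skill "${item.skill}"`;
    case "addQuantification":
      return `Add a metric to bullet ${item.bulletIndex + 1}`;
    case "replaceVerb":
      return `Replace the verb "${item.verb}" in bullet ${item.bulletIndex + 1}`;
  }
}

const samplePlan: PlanItem[] = [
  { id: "p1", kind: "rewriteBullet", bulletIndex: 0, gap: "observability" },
  { id: "p2", kind: "replaceVerb", bulletIndex: 2, verb: "managed" },
  { id: "p3", kind: "surfaceSkill", skill: "PostgreSQL" },
];
const sampleDecisions: Record<string, Decision> = { p1: "approve", p2: "reject" };
```

With these decisions, `approvedItems(samplePlan, sampleDecisions)` returns only `p1`. `p2` was rejected and `p3` was never decided, so neither reaches text generation.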

MRC: Modular Rewrite Chain
This is where LLMs finally show up.

Instead of one prompt that rewrites the whole resume, MRC executes a chain of small, scoped rewrites:

Rewrite summary (1 prompt, 1 schema)
Rewrite bullet #1 (1 prompt, 1 schema)
Add quantification to bullet #3 (1 prompt, 1 schema)
Replace "managed" with stronger verb (1 prompt, 1 schema)
Enforce ATS-safe formatting (1 prompt, 1 schema)

Each step:

Has its own minimal prompt

Returns JSON validated by Zod

Is independently testable

Can be retried on validation failure without re-running the whole chain

Is explainable (we know exactly which step produced which change)

If step 7 hallucinates a job title, step 7 fails validation and reruns. Steps 1 through 6 are untouched. Compare this to a single-prompt rewrite where one bad token at position 4,000 silently corrupts the whole output.
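
The per-step retry behaviour can be sketched without any LLM at all. In the code below, `Step`, `runStep`, and `runChain` are hypothetical names, `generate` stands in for a model call, and `validate` is any function that throws on bad output (which is how a Zod schema's `.parse` behaves). The limit of three attempts is an assumption.

```typescript
// A step produces raw (untrusted) output and validates it into a typed value.
type Step<T> = {
  name: string;
  generate: (attempt: number) => Promise<unknown>; // stand-in for an LLM call
  validate: (raw: unknown) => T; // must throw on invalid output
};

// Runs one step, retrying only that step when generation or validation fails.
async function runStep<T>(step: Step<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return step.validate(await step.generate(attempt));
    } catch (err) {
      lastError = err; // a real pipeline could tighten the prompt here
    }
  }
  throw new Error(`${step.name} failed after ${maxAttempts} attempts: ${String(lastError)}`);
}

// Runs steps in order. Steps that already succeeded are never re-executed.
async function runChain<T>(steps: Step<T>[]): Promise<T[]> {
  const results: T[] = [];
  for (const step of steps) results.push(await runStep(step));
  return results;
}

// Demo validator: accepts any non-empty string.
const asText = (raw: unknown): string => {
  if (typeof raw !== "string" || raw.length === 0) throw new Error("expected non-empty string");
  return raw;
};

const calls = { summary: 0, bullet: 0 };
const demoSteps: Step<string>[] = [
  {
    name: "summary",
    validate: asText,
    generate: async () => { calls.summary++; return "Backend engineer."; },
  },
  {
    // Returns a number on the first attempt, which fails validation.
    name: "bullet",
    validate: asText,
    generate: async (attempt) => { calls.bullet++; return attempt === 1 ? 42 : "Cut latency by 40%."; },
  },
];
```

Calling `runChain(demoSteps)` resolves to both strings, with `calls.summary` at 1 and `calls.bullet` at 2: the failing step was retried once and the earlier step was left alone.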

Why this matters in practice
Three things change once you build it this way:

Reproducibility. Same resume + same job in two runs gives nearly identical output. The only stochastic piece is the rewrite text inside each MRC step, and even that is bounded by schema constraints.

Explainability. Every change we make to a bullet maps back to a specific gap from GAE, a specific instruction from RPG, and a specific MRC step. When a user asks "why did you change this?", we can answer with structure, not vibes.
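
That mapping is easiest to see as a record written whenever a step changes text. The `ChangeTrace` shape and `explain` helper below are a hypothetical sketch of such an audit trail, not the product's actual schema.

```typescript
// Hypothetical audit record written each time an MRC step changes text.
type ChangeTrace = {
  bulletId: string;
  before: string;
  after: string;
  gapId: string; // which GAE finding motivated the change
  planItemId: string; // which RPG instruction requested it
  stepName: string; // which MRC step produced it
};

// Answers "why did you change this?" for one bullet by listing every
// trace that touched it, in the order the traces were recorded.
function explain(traces: ChangeTrace[], bulletId: string): string[] {
  return traces
    .filter((t) => t.bulletId === bulletId)
    .map((t) => `${t.stepName}: addressed gap ${t.gapId} per plan item ${t.planItemId}`);
}

const sampleTraces: ChangeTrace[] = [
  {
    bulletId: "b1",
    before: "Worked on the billing service",
    after: "Built the billing service API",
    gapId: "gap-keyword-2",
    planItemId: "p1",
    stepName: "rewriteBullet",
  },
  {
    bulletId: "b3",
    before: "Managed the deployment pipeline",
    after: "Automated the deployment pipeline",
    gapId: "gap-verb-1",
    planItemId: "p7",
    stepName: "replaceVerb",
  },
];
```

`explain(sampleTraces, "b3")` returns a single line naming the step, the gap, and the plan item, which is the "structure, not vibes" answer in data form.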

Safe failure. When something goes wrong in a giant prompt, the whole resume is suspect. When something goes wrong in an MRC step, we know exactly which bullet failed and can rerun just that piece.

The Zod-everywhere rule
The non-negotiable rule across the entire pipeline: every LLM response gets validated by Zod before it enters any downstream step.

This is the part most teams skip because it feels like overkill. It isn't. LLMs return malformed JSON, miss fields, hallucinate enum values, and occasionally output prose where a string array was expected. Without schema validation at every boundary, those errors propagate silently into the user's resume.

With Zod validation, a malformed response is just a retry. The pipeline keeps its guarantees.

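import { z } from "zod";

// Shape every rewritten bullet must satisfy before it moves downstream.
const RewrittenBulletSchema = z.object({
  text: z.string().min(20).max(300),
  keywordsCovered: z.array(z.string()),
  quantificationAdded: z.boolean(),
  reasoning: z.string(),
});

// parse() throws a ZodError if the response doesn't match the schema.
const result = await llm.rewrite(bullet, plan);
const validated = RewrittenBulletSchema.parse(result);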

If parse throws, we retry the MRC step with a tightened prompt. We do not let unvalidated LLM output into the next stage. Ever.

What this is not
It's worth being clear about what we are not claiming.

We are not claiming LLMs are bad at rewriting resume bullets. They're great at it, when you scope the task narrowly.

We are not claiming determinism solves every problem. Job description language is messy, resumes are unstructured at the source, and parsing PDFs is still a hard problem (especially scanned ones).

We are not claiming our architecture is novel computer science. Schema validation, separation of concerns, and pure functions are decades-old ideas. The novel part is applying them ruthlessly inside an AI product category where most teams are shipping prompt chains and calling it a day.

What we learned shipping this
A few things that surprised me as we built it out:

The parser is the hardest part. Everything downstream depends on CRDM quality. Bad parse means bad gap analysis means bad rewrite plan. We ended up building confidence scoring with eight positive signals and four penalty signals, plus automatic routing of scanned PDFs to OCR.
Users do not want a black box. Once we started exposing the plan (RPG) to users before executing the rewrite (MRC), conversion went up. People trust changes they can preview and approve.
Schema migrations are painful but worth it. Adding a new field to CRDM means touching the parser, GAE, RPG, and MRC. The discipline forces you to think before adding scope.
Pure functions are gold for debugging. When a user reports a bad gap analysis, we can replay the exact CRDM and CJDM and reproduce the result locally. No "well, the LLM was acting weird that day."
Atomic rewrites compose better than monolithic ones. Once you have an MRC, adding a new transformation (e.g., "convert to Canadian English") is one new step, not a rewrite of the whole prompt.
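
For the parser confidence scoring mentioned above, a rough sketch of the idea follows. The post does not name the actual signals, so every signal, weight, and threshold here is invented for illustration, and far fewer signals are used than the eight positive and four penalty signals described.

```typescript
// Hypothetical signals; the real ones are not listed in this post.
type ParseSignals = {
  hasEmail: boolean;
  hasDatedExperience: boolean;
  hasSectionHeadings: boolean;
  garbledCharRatio: number; // share of unreadable characters, 0..1
  extractedCharCount: number; // characters found in the PDF text layer
};

type Route = "accept" | "ocr" | "manualReview";

// Integer score from 0 to 100: positive signals add points, penalties subtract.
function scoreParse(s: ParseSignals): number {
  let score = 0;
  if (s.hasEmail) score += 30;
  if (s.hasDatedExperience) score += 40;
  if (s.hasSectionHeadings) score += 30;
  score -= Math.round(Math.min(Math.max(s.garbledCharRatio, 0), 1) * 50);
  return Math.max(0, Math.min(100, score));
}

function route(s: ParseSignals): Route {
  // Almost no text layer usually means a scanned document: send it to OCR.
  if (s.extractedCharCount < 200) return "ocr";
  return scoreParse(s) >= 70 ? "accept" : "manualReview";
}

const cleanPdf: ParseSignals = {
  hasEmail: true, hasDatedExperience: true, hasSectionHeadings: true,
  garbledCharRatio: 0, extractedCharCount: 3000,
};
const scannedPdf: ParseSignals = {
  hasEmail: false, hasDatedExperience: false, hasSectionHeadings: false,
  garbledCharRatio: 0, extractedCharCount: 12,
};
const garbledPdf: ParseSignals = {
  hasEmail: true, hasDatedExperience: false, hasSectionHeadings: true,
  garbledCharRatio: 0.4, extractedCharCount: 2500,
};
```

Under these made-up weights, `cleanPdf` is accepted, `scannedPdf` is routed to OCR, and `garbledPdf` scores 40 and goes to manual review. Like the gap analysis, the scoring is itself a pure function, so it can be tested with fixtures.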

Closing
I'm not saying you need this much architecture for every AI product. If you're building a one-off internal tool, ship the prompt and move on.

But if you're building a product that millions of job seekers will trust with their career data, prompts are not enough. You need types. You need schemas. You need pure functions. You need atomic LLM calls validated at every boundary.

We are not prompting. We are engineering.

If you want to see the engine in action, you can try the resume analyzer or the cover letter generator. Both run on the same pipeline described above.

Happy to answer questions in the comments. Especially curious if anyone else is going the structured-data route for an AI product, or if you've found a way to get reliable output from monolithic prompts that I'm missing.
