DEV Community

Azeez Roheem

From PDF to ATS-Optimised Resume in Three Steps — How I Built It with Next.js and OpenAI

When I apply for a job, I want to know which of the required skills
my resume already covers and which are missing. I wanted an app that
fixes the gaps automatically, so I could get a better resume and
apply faster.

As job applicants, we need ATS-optimised resumes, and we need them fast.
Checking a resume against a job description manually is time-consuming
and requires knowledge most candidates don't have. Most people have
never heard of ATS keywords — let alone know how to use them.

So I built a pipeline to do it automatically. Here's how it works.

Step 1 — PDF Upload and Extraction

The pipeline starts with a PDF upload. PDF is the standard format
for resumes, but extracting clean text from one is harder than it
sounds. Column layouts, custom fonts, and formatting cause the raw
text to come out in the wrong order — a candidate's name can end up
in the wrong section entirely.

extractAndStructure() solves this in two steps. First it cleans the
raw text — removing blank lines, trimming whitespace, and splitting
it into named sections. Then it sends those sections to OpenAI, which
reorganises them into a structured JSON object with fields for name,
experience, skills, education, and contact details.
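
The cleaning half of that function might look roughly like the sketch
below. The cleanAndSplit name, the heading list, and the "header"
bucket are illustrative assumptions, not the app's actual code, and
the OpenAI call that follows the cleaning is left out.

```typescript
// Hypothetical heading names; real resumes may label sections differently.
const SECTION_HEADINGS = ["summary", "experience", "skills", "education", "contact"];

type Sections = Record<string, string[]>;

// Trim every line, drop blank ones, and group the rest under the most
// recent heading. Lines that appear before any heading go to "header".
function cleanAndSplit(rawText: string): Sections {
  const sections: Sections = { header: [] };
  let current = "header";

  for (const rawLine of rawText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // remove blank lines

    // Treat "Skills" and "Skills:" as the same heading.
    const key = line.toLowerCase().replace(/:$/, "");
    if (SECTION_HEADINGS.includes(key)) {
      current = key;
      sections[current] = sections[current] ?? [];
      continue;
    }
    sections[current].push(line);
  }
  return sections;
}
```

Matching on exact heading names is the simplest possible rule. It will
miss headings such as "Work History", which is one reason the second
step hands the sections to a model for reorganising.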

Structured JSON is returned instead of raw text because every step
after this needs to read specific fields. The analyse route needs the
skills array. The rewrite route needs the experience highlights. Raw
text would require re-parsing at every step — structuring once here
makes the rest of the pipeline simple.
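
As a rough idea of what that structured object could look like, here
is a sketch of the types. The top-level fields follow the list above,
but the exact field names and the nested shapes (title, company,
email, phone) are my assumptions for illustration.

```typescript
interface ExperienceEntry {
  title: string;        // assumed field
  company: string;      // assumed field
  highlights: string[]; // bullet points, read by the rewrite route
}

interface StructuredResume {
  name: string;
  contact: { email?: string; phone?: string }; // assumed shape
  skills: string[];                            // read by the analyse route
  experience: ExperienceEntry[];
  education: string[];
}

// Later steps read fields directly instead of re-parsing raw text.
function allHighlights(resume: StructuredResume): string[] {
  const bullets: string[] = [];
  for (const job of resume.experience) {
    bullets.push(...job.highlights);
  }
  return bullets;
}
```

With this in place, the analyse route can read resume.skills and the
rewrite route can call allHighlights(resume) without touching the PDF
text again.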

Step 2 — Match Analysis and Keyword Extraction

When the user submits a job description, two functions run at the
same time: extractKeywords() and analyseMatch(). They run in
parallel because neither needs the other's result. Both only need
the resume and job description, so there is no reason to wait.
The response then takes as long as the slower call rather than the
sum of both, which roughly halves it when the two take similar time.
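
Running two independent async calls side by side is usually done with
Promise.all. The sketch below shows the pattern; the result shapes,
the function signatures, and the analyse() wrapper are assumptions,
and the two functions are passed in so the example stays
self-contained.

```typescript
// Assumed result shapes, based on the descriptions in this post.
interface Keywords {
  requiredSkills: string[];
  tools: string[];
  atsKeywords: string[];
}

interface MatchResult {
  score: number; // out of 100
  matchedSkills: string[];
  missingSkills: string[];
}

type ExtractKeywords = (jobDescription: string) => Promise<Keywords>;
type AnalyseMatch = (resume: string, jobDescription: string) => Promise<MatchResult>;

// Start both calls before awaiting either, then wait for both to finish.
async function analyse(
  resume: string,
  jobDescription: string,
  extractKeywords: ExtractKeywords,
  analyseMatch: AnalyseMatch
): Promise<{ keywords: Keywords; match: MatchResult }> {
  const [keywords, match] = await Promise.all([
    extractKeywords(jobDescription),
    analyseMatch(resume, jobDescription),
  ]);
  return { keywords, match };
}
```

Promise.all rejects as soon as either call fails, so one failed
request fails the whole analysis. Promise.allSettled is the
alternative if partial results should still be shown.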

extractKeywords() pulls required skills, tools and technologies,
and ATS keywords directly from the job description. analyseMatch()
scores the resume against it — returning a match score out of 100,
matched skills, and missing skills.

The match score tells the user how well their resume fits the role
before any rewriting begins. It is most useful for resumes scoring
between 40 and 70, where real gaps exist but the candidate is still
a plausible fit. The user sees the score and extracted keywords before
continuing, so they can decide whether to proceed with the rewrite
or try a different role entirely.

Step 3 — Bullet Rewriting

When the user hits Rewrite & Continue, the pipeline combines
missingSkills and atsKeywords into one target list. missingSkills
comes from the match analysis — skills the resume lacks. atsKeywords
comes from the job description itself — what the ATS scanner is
looking for. Combining both gives the rewriter the most complete
picture of what the role needs.
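
Combining the two lists can be as small as the sketch below. The
buildTargetList name is hypothetical, and the case-insensitive
de-duplication is my assumption: the same skill often shows up in
both lists, and there is little point in sending it to the rewriter
twice.

```typescript
// Merge both lists, keeping first-seen order and dropping repeats
// that differ only by case or surrounding whitespace.
function buildTargetList(missingSkills: string[], atsKeywords: string[]): string[] {
  const seen = new Set<string>();
  const targets: string[] = [];

  for (const term of [...missingSkills, ...atsKeywords]) {
    const cleaned = term.trim();
    const key = cleaned.toLowerCase();
    if (key === "" || seen.has(key)) continue;
    seen.add(key);
    targets.push(cleaned);
  }
  return targets;
}

const targets = buildTargetList(["Docker", "GraphQL"], ["docker ", "CI/CD"]);
// targets is ["Docker", "GraphQL", "CI/CD"]
```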

For each bullet, three functions run in sequence. scoreBullet()
rates the bullet on three criteria — action verb, skill or tool,
and outcome. Bullets that score poorly are flagged for rewriting.
Bullets that are already strong are left unchanged. rewriteBullet()
then rewrites only the flagged bullets, incorporating the target
keywords naturally. validateRewrite() runs immediately after each
rewrite — checking that the keywords fit naturally and the
truthfulness risk is low. If the rewrite fails validation, the
original bullet is kept.
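
The control flow for a single bullet can be sketched like this. The
three function names come from the pipeline, but their signatures,
the result shapes, and the rule that a bullet counts as strong only
when it meets all three criteria are assumptions for illustration.
The model-backed functions are injected so the sketch runs on its own.

```typescript
interface BulletScore {
  hasActionVerb: boolean;
  hasSkillOrTool: boolean;
  hasOutcome: boolean;
}

interface Validation {
  keywordsFitNaturally: boolean;
  truthfulnessRisk: "low" | "medium" | "high";
}

// Assumed signatures for the three steps.
interface BulletSteps {
  scoreBullet(bullet: string): Promise<BulletScore>;
  rewriteBullet(bullet: string, targets: string[]): Promise<string>;
  validateRewrite(original: string, rewritten: string): Promise<Validation>;
}

interface BulletResult {
  original: string;
  final: string;
  changed: boolean;
}

async function processBullet(
  bullet: string,
  targets: string[],
  steps: BulletSteps
): Promise<BulletResult> {
  const unchanged: BulletResult = { original: bullet, final: bullet, changed: false };

  // Assumption: a bullet is "strong" only if it meets all three criteria.
  const score = await steps.scoreBullet(bullet);
  if (score.hasActionVerb && score.hasSkillOrTool && score.hasOutcome) {
    return unchanged;
  }

  const rewritten = await steps.rewriteBullet(bullet, targets);
  const check = await steps.validateRewrite(bullet, rewritten);

  // Keep the original unless the rewrite passes both checks.
  const passed = check.keywordsFitNaturally && check.truthfulnessRisk === "low";
  return passed ? { original: bullet, final: rewritten, changed: true } : unchanged;
}
```

Falling back to the original on a failed validation means the worst
case is a bullet that stays as the candidate wrote it. The changed
flag is what a before and after view and the improved count would
be built from.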

The user sees a before and after comparison for every bullet that
changed, a summary of how many were improved, and a button to
download the tailored resume.

What I'd Do Differently

The pipeline works best for candidates with match scores between
40 and 70. In this range, real gaps exist but the candidate is still
a plausible fit for the role. Above 80, the resume is already strong,
so rewriting bullets won't move the needle. Below 40, the role itself
is likely the wrong target.
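
Those thresholds could be turned into a small helper that decides
what message to show next to the score. This is only a sketch: the
scoreBand name and the band labels are hypothetical, and scores from
71 to 80 are marked as unspecified because the ranges above do not
cover them.

```typescript
type ScoreBand =
  | "likely-wrong-target"    // below 40
  | "best-fit-for-rewriting" // 40 to 70
  | "unspecified"            // 71 to 80, not covered by the ranges above
  | "already-strong";        // above 80

function scoreBand(score: number): ScoreBand {
  if (score < 40) return "likely-wrong-target";
  if (score <= 70) return "best-fit-for-rewriting";
  if (score > 80) return "already-strong";
  return "unspecified";
}
```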

The biggest limitation right now is PDF format. Most people write
their resume in Word or Google Docs and export to PDF. But if the
file was scanned from a printout or saved as an image, the PDF has
no text layer. The pipeline fails at the first step and the user has
no way to continue without fixing their file. Word documents and
Google Docs exports need to be supported directly in a future version.
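
Until other formats are supported, a cheap guard is to check how much
text the extraction returned and fail with a clear message instead of
passing an empty string down the pipeline. The sketch below is an
assumption about how that could look: the function names and the
50-character threshold are made up for illustration.

```typescript
// Hypothetical threshold: fewer non-whitespace characters than this
// is treated as "no usable text layer".
const MIN_TEXT_CHARS = 50;

function looksLikeScannedPdf(extractedText: string): boolean {
  const visibleChars = extractedText.replace(/\s/g, "").length;
  return visibleChars < MIN_TEXT_CHARS;
}

type ExtractionCheck = { ok: true } | { ok: false; message: string };

function checkExtraction(extractedText: string): ExtractionCheck {
  if (looksLikeScannedPdf(extractedText)) {
    return {
      ok: false,
      message:
        "No readable text found. The PDF may be a scanned image. " +
        "Try exporting it again from your editor.",
    };
  }
  return { ok: true };
}
```

This does not fix the file, but it tells the user what went wrong at
the upload step rather than at a later, more confusing one.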

The final piece is PDF generation. Right now the pipeline analyses
and rewrites but the output lives on a web page. Week 4 turns the
rewritten JSON into a formatted, downloadable PDF — a complete,
ATS-optimised resume the candidate can send directly to recruiters.
That is what transforms this from a pipeline into a product.

This is Week 3 of my AI/ML learning curriculum. Week 2 covered
the Node.js pipeline that powers this app. The full code is on
GitHub: github.com/Azeez1314
