Devanshu Biswas

Posted on Jun 7

I Built the Resume-vs-JD Scorer Every ATS Uses — In 30 Lines of JavaScript

#javascript #nlp #beginners #hackathon

🌐 Live demo: https://dev48v.infy.uk/solve/day1-resume-jd-match.html

Day 1 of SolveFromZero — pick a real hackathon problem, ship the working solution. Today's brief is a classic from Unstop: "Build a tool that scores a resume against a job description and surfaces the missing keywords."

Every Applicant Tracking System (Workday, Greenhouse, Lever, all of them) does some version of this before a human ever sees your resume. Most candidates have no idea their CV is competing in a 30-second word-overlap contest.

Let's build the contest judge.

The whole algorithm in 4 steps

tokenize(resume)   →   list of meaningful words
tokenize(JD)       →   list of meaningful words
coverage =  |JD-tokens found in resume|  /  |JD-tokens|
jaccard  =  |intersection|  /  |union|
score    =  round( ( 0.7 × coverage  +  0.3 × jaccard ) × 100 )

That's it. The whole thing fits in 30 lines of JavaScript. No model, no API key, no backend — runs in your browser tab.

Step 1 — Tokenize

Split on anything that's not a letter (or + or # for C++ / C#), lowercase everything, drop short words and stopwords.

const STOP = new Set(
  "a an the of in on for to and or with by from as is are was were be been being have has had do does did will would can could may might must shall should i you he she it we they me him her us them my your his their our this that these those at if not but so".split(" ")
);

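function tokens(text) {
  return text.toLowerCase()
    // Split on runs of anything except letters, "+" and "#".
    .split(/[^a-z+#]+/)
    // Drop short words and stopwords, but keep "c#" / "f#" style names,
    // which the length filter alone would discard.
    .filter(w => (w.length > 2 || /^[a-z][+#]+$/.test(w)) && !STOP.has(w));
}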

Stopwords matter. Without the filter, every resume scores high against every JD, because words like "the", "and", and "with" appear in both and count as matched keywords. The filter is what makes the rest of the math meaningful.
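To see the effect, here is a small sketch that computes JD coverage for two unrelated texts, with and without the stopword filter. The sample sentences and the `coverageOf` helper are made up for illustration, and the stopword list is a shortened stand-in for the one above.

```javascript
// Shortened stopword list, for illustration only.
const STOP = new Set("a an the of in on for to and or with by from as is are can".split(" "));

// Simplified tokenizer with an optional stopword filter so both modes can be compared.
function tokens(text, useStopwords) {
  return text.toLowerCase()
    .split(/[^a-z+#]+/)
    .filter(w => w.length > 2 && (!useStopwords || !STOP.has(w)));
}

// Fraction of unique JD tokens that also appear in the resume.
function coverageOf(resume, jd, useStopwords) {
  const setR = new Set(tokens(resume, useStopwords));
  const setJ = new Set(tokens(jd, useStopwords));
  const covered = [...setJ].filter(k => setR.has(k));
  return setJ.size ? covered.length / setJ.size : 0;
}

// Hypothetical, deliberately unrelated texts.
const resume = "I am the manager of a bakery and I bake bread for the town";
const jd = "We are looking for the engineer who can design and test the firmware";

const unfiltered = coverageOf(resume, jd, false); // "the", "and", "for" count as matches
const filtered = coverageOf(resume, jd, true);    // no keyword overlap remains
console.log(unfiltered.toFixed(2), filtered.toFixed(2));
```

A baker's resume still "covers" part of a firmware JD until the function words are removed, which is the inflation the filter is there to prevent.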

Step 2 — Coverage (the important one)

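// Deduplicate: each keyword counts once, however often it appears.
const setR = new Set(tokens(resume));
const setJ = new Set(tokens(jd));

// JD keywords that also appear in the resume.
const covered = [...setJ].filter(k => setR.has(k));
// Guard against an empty JD, which would otherwise give 0 / 0 = NaN.
const coverage = setJ.size ? covered.length / setJ.size : 0;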

Coverage answers the question recruiters actually care about: "What fraction of what we're asking for is in this resume?" For example, 70 % coverage means 28 of the JD's 40 keywords appear in the candidate's text.
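The 28-of-40 arithmetic can be checked with a small helper. This is only a sketch: `coverageRatio` and the synthetic `kw0` to `kw39` tokens are placeholders, and the early return for an empty JD is an added assumption about how that edge case should behave.

```javascript
// Fraction of unique JD tokens found among the resume tokens.
function coverageRatio(resumeTokens, jdTokens) {
  const setR = new Set(resumeTokens);
  const setJ = new Set(jdTokens);
  if (setJ.size === 0) return 0; // empty JD: avoid 0 / 0 = NaN
  return [...setJ].filter(k => setR.has(k)).length / setJ.size;
}

// Synthetic tokens: the JD has 40 keywords, the resume contains the first 28.
const jdTokens = Array.from({ length: 40 }, (_, i) => `kw${i}`);
const resumeTokens = jdTokens.slice(0, 28).concat(["unrelated", "extra"]);

console.log(coverageRatio(resumeTokens, jdTokens)); // 0.7
console.log(coverageRatio(resumeTokens, []));       // 0
```

Extra resume tokens ("unrelated", "extra") do not change coverage at all; only the JD side is in the denominator.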

Step 3 — Jaccard similarity

function jaccard(a, b) {
  const A = new Set(a), B = new Set(b);
  const inter = [...A].filter(x => B.has(x)).length;
  const union = new Set([...A, ...B]).size;
  return union ? inter / union : 0;
}

Jaccard adds nuance: if your resume has 100 unique tokens and the JD has 40, even with 100 % coverage your Jaccard is only 40 / 100 = 0.4 (the resume is "wider" than the JD). This keeps a wall-of-buzzwords resume from maxing out the score.
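The 100-versus-40 case is easy to reproduce. The sketch below reuses the `jaccard` function as written above and feeds it synthetic tokens (`t0` to `t99`), which are placeholders rather than real resume data.

```javascript
function jaccard(a, b) {
  const A = new Set(a), B = new Set(b);
  const inter = [...A].filter(x => B.has(x)).length;
  const union = new Set([...A, ...B]).size;
  return union ? inter / union : 0;
}

// Synthetic tokens: the resume has 100 unique tokens, the JD is the first 40 of them.
const resumeTokens = Array.from({ length: 100 }, (_, i) => `t${i}`);
const jdTokens = resumeTokens.slice(0, 40);

console.log(jaccard(resumeTokens, jdTokens)); // 0.4
console.log(jaccard([], []));                 // 0 (the empty union is guarded)
```

Every JD token is in the resume, so coverage would be 1.0 here, while Jaccard reports only 0.4 because the union is dominated by the resume's extra 60 tokens.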

Step 4 — Blend

// jaccard is the function from Step 3, so it has to be called, not multiplied.
const score = Math.round((0.7 * coverage + 0.3 * jaccard(setR, setJ)) * 100);

70 % coverage, 30 % Jaccard. Coverage matters more because that's what the recruiter is really screening for. Tune the weights for your domain — academic CVs vs. product CVs might want different blends.
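If you want to experiment with the weights, one option is to pull the blend into a function with a single tunable parameter. This is a sketch, not the demo's code: `blend` and `wCoverage` are hypothetical names, and giving Jaccard the remainder `1 - wCoverage` is an assumption that keeps the score within 0 to 100.

```javascript
// Blend coverage and Jaccard (both in [0, 1]) into a 0-100 score.
// wCoverage is the weight on coverage; Jaccard gets the remainder.
function blend(coverage, jaccard, wCoverage = 0.7) {
  if (wCoverage < 0 || wCoverage > 1) {
    throw new RangeError("wCoverage must be between 0 and 1");
  }
  return Math.round((wCoverage * coverage + (1 - wCoverage) * jaccard) * 100);
}

// Full coverage with a Jaccard of 0.4 (the 100-token resume vs. 40-token JD case).
console.log(blend(1.0, 0.4));      // 82 with the default 70/30 blend
console.log(blend(1.0, 0.4, 0.5)); // 70 with equal weights
```

Shifting weight toward Jaccard penalizes broad resumes more heavily, so the same candidate drops from 82 to 70.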

Step 5 — The actionable output: missing keywords

const missing = [...setJ].filter(k => !setR.has(k));

The score alone is useless. The list of missing words is the takeaway. Add those (where genuinely true to your experience) and your next application clears the ATS filter.
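Putting the steps together, a minimal end-to-end sketch might look like the following. `scoreResume` is a hypothetical wrapper, the stopword list is abbreviated, the tokenizer is the simple length-filter version, and the sample resume and JD are invented for illustration.

```javascript
// Abbreviated stopword list, for illustration only.
const STOP = new Set("a an the of in on for to and or with by from as is are".split(" "));

const tokens = text => text.toLowerCase()
  .split(/[^a-z+#]+/)
  .filter(w => w.length > 2 && !STOP.has(w));

// Returns the blended score plus the JD keywords absent from the resume.
function scoreResume(resume, jd) {
  const setR = new Set(tokens(resume));
  const setJ = new Set(tokens(jd));
  const covered = [...setJ].filter(k => setR.has(k));
  const missing = [...setJ].filter(k => !setR.has(k));
  const union = new Set([...setR, ...setJ]).size;
  const coverage = setJ.size ? covered.length / setJ.size : 0;
  const jaccard = union ? covered.length / union : 0;
  const score = Math.round((0.7 * coverage + 0.3 * jaccard) * 100);
  return { score, coverage, jaccard, missing };
}

// Hypothetical sample inputs.
const resume = "Built REST APIs in Node and React; deployed with Docker on AWS.";
const jd = "Looking for an engineer with React, Node, Docker, Kubernetes and TypeScript experience.";

const result = scoreResume(resume, jd);
console.log(result.score);   // 33
console.log(result.missing); // [ 'looking', 'engineer', 'kubernetes', 'typescript', 'experience' ]
```

Filler such as "looking" and "experience" lands in the missing list alongside the real gaps ("kubernetes", "typescript"); extending the stopword list is one way to trim that noise.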

When you'd upgrade to embeddings

Word-set matching misses synonyms: "JavaScript" / "JS" / "ECMAScript" all count as different. For higher recall, swap Jaccard for cosine similarity on sentence embeddings:

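import { pipeline } from "@xenova/transformers";
const embed = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");

// Mean-pool the token vectors into one vector per text and scale it to unit length.
const opts = { pooling: "mean", normalize: true };
const [eR, eJ] = await Promise.all([embed(resume, opts), embed(jd, opts)]);

// For unit-length vectors, cosine similarity is just the dot product.
const dot = (a, b) => a.reduce((sum, x, i) => sum + x * b[i], 0);
const cosine = dot(eR.data, eJ.data);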

Same shape, fancier math, much better synonym recall. Costs you a 50 MB model download (cached after first load). Use this when your word-set version is missing valid matches because of vocabulary differences.
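Cosine similarity itself needs only a dot product and a vector norm. Here is a minimal sketch on plain arrays, independent of any embedding library; the `dot`, `norm`, and `cosine` helpers and the toy vectors are illustrative, and returning 0 for a zero-length vector is an assumption.

```javascript
// Sum of element-wise products.
const dot = (a, b) => a.reduce((sum, x, i) => sum + x * b[i], 0);
// Euclidean length of a vector.
const norm = a => Math.sqrt(dot(a, a));

// Cosine of the angle between two equal-length vectors.
function cosine(a, b) {
  if (a.length !== b.length) throw new Error("vectors must have equal length");
  const denom = norm(a) * norm(b);
  return denom ? dot(a, b) / denom : 0; // zero vector: treat as no similarity
}

console.log(cosine([1, 0], [1, 0]));       // same direction
console.log(cosine([1, 0], [0, 1]));       // orthogonal
console.log(cosine([1, 2, 3], [2, 4, 6])); // parallel, different magnitude
```

Because the result depends only on direction, a long resume and a short JD can still score close to 1 if they point the same way in embedding space.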

Try it now

👉 https://dev48v.infy.uk/solve/day1-resume-jd-match.html

Three tabs on the page:

LOOK — paste a resume + JD, score it
UNDERSTAND — 9 click-through steps with the WHY for each line of math
BUILD — copy the 30 lines, paste into an HTML file, double-click to run

Sample resume + sample JD are loadable with one click if you don't want to dig up your own.

Why this is Day 1 of SolveFromZero

This is the launch of my new series: 50 real hackathon problem statements, 50 working solutions. Sources: Smart India Hackathon, Devpost, Unstop, Devfolio.

Every day I pick a real brief that someone actually posted and ship a solution you can fork. The goal isn't to win the hackathon retroactively — it's to see the floor of what a working solution looks like, so you don't waste your first 3 hours staring at a blank repo.

Tomorrow: pothole detection from dashcam video — Smart India Hackathon classic.

🌐 All problems: https://dev48v.infy.uk/solvefromzero.php
🌐 The series umbrella: https://dev48v.infy.uk/zero-to-hero.php

Top comments (1)

Jon Randy 🎖️ • Jun 8 • Edited

In no way a criticism of any of your code, or your post.... but this is one of the dumbest things you could do to assess someone's CV. If any recruiter dismisses candidates based on this, they're a total clown.