HackerRank's Open-Source ATS: My Resume Scored 90, 74, Then 88
TL;DR: HackerRank open-sourced its Applicant Tracking System (ATS), and the internet immediately started stress-testing it. When I ran my own resume through it multiple times, I got scores of 90, 74, and 88 — on the same document, with no changes. That's not a bug report; it's a feature of how LLM-based resume scoring actually works. This article breaks down what happened, what it means for job seekers, and how to use this tool (and its limitations) to your actual advantage.
Key Takeaways
- HackerRank's open-sourced ATS uses LLM-based scoring, which is non-deterministic whenever the model samples its output; expect score variance of 10–20 points between runs
- A single score means almost nothing; patterns across multiple runs are what matter
- The tool is genuinely useful for identifying keyword gaps and structural weaknesses in your resume
- Traditional ATS systems (Greenhouse, Lever, Workday) use different logic — don't assume this tool mirrors every employer's system
- Your resume optimization strategy should focus on clarity and relevance, not gaming a single score
- The open-source nature of this tool is its biggest strength — you can read exactly how it evaluates your resume
What Actually Happened: HackerRank Open Sources Its ATS
In mid-2026, HackerRank made a move that got the developer and job-seeker communities buzzing: they open-sourced the core of their Applicant Tracking System on GitHub. The pitch was compelling — finally, candidates could see how their resumes were being evaluated, not just whether they made the cut.
The community response was predictable and entirely human. Everyone immediately uploaded their own resume to see the score.
And that's where things got interesting — or, depending on your perspective, deeply frustrating.
My resume scored 90 out of 100 on the first run. I felt great for approximately four minutes. Then I ran it again. 74. Then again. 88. Same PDF. Same job description. No edits. Three completely different scores within a 15-minute window.
I wasn't alone. Reddit threads, LinkedIn posts, and developer forums filled up with people comparing their wildly inconsistent results. The discourse split pretty cleanly: some people called it broken, others called it a revelation. Both groups were partially right.
Why the Score Keeps Changing: The LLM Problem Nobody Warned You About
Here's the technical reality that most coverage of this story glossed over: HackerRank's ATS uses a Large Language Model at its core, not a deterministic keyword-matching algorithm.
Traditional ATS tools — the kind that have been rejecting your resume for the past decade — work more like spreadsheets. They scan for specific keywords, count them, check formatting rules, and spit out a score. Run it twice, get the same number. Boring, but consistent.
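To make the "spreadsheet" comparison concrete, here is a minimal sketch of a keyword-counting scorer. It is not the logic of any real ATS product; the function name `keyword_score`, the sample resume, and the keyword list are all made up for illustration. The point is only that this style of scoring has no randomness in it.

```python
import re

def keyword_score(resume: str, keywords: list[str]) -> int:
    """Return the percentage (0-100) of keywords that appear in the resume."""
    if not keywords:
        return 0
    text = resume.lower()
    hits = 0
    for kw in keywords:
        # Case-insensitive whole-word match; a simplification that ignores
        # synonyms and terms ending in punctuation such as "c++".
        if re.search(r"\b" + re.escape(kw.lower()) + r"\b", text):
            hits += 1
    return round(100 * hits / len(keywords))

# Hypothetical inputs, for illustration only.
resume = "Built Python microservices on AWS; tuned PostgreSQL queries."
keywords = ["python", "aws", "kubernetes", "postgresql"]

print(keyword_score(resume, keywords))  # 75
print(keyword_score(resume, keywords))  # 75 again: same input, same score
```

Three of the four keywords match, so both calls print 75. Nothing in the function can change that between runs.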
LLM-based systems are fundamentally different. They're probabilistic. Every time the model generates a response, it samples each token from a probability distribution over possible next tokens. The temperature setting (a parameter that controls how "creative" or "random" the output is) determines how much variance you see. A temperature of 0 makes the model always pick its most likely token, which gives you near-identical outputs from run to run (hosted models can still drift slightly because of floating-point and batching effects). Anything above that introduces more variability.
Many production LLM applications run at a nonzero temperature (values somewhere around 0.3 to 0.8 are common) because fully greedy output can feel robotic and repetitive. That choice suits conversation better than scoring, and it means your resume "score" is genuinely not a fixed property of your document.
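A toy sketch of temperature sampling shows why the number moves. This is a deliberate simplification: real models sample one token at a time, not a whole score, and the logits and candidate scores below are invented for illustration rather than taken from HackerRank's code.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Pick an index from logits; temperature 0 falls back to argmax."""
    if temperature == 0:
        # Greedy decoding: always take the highest-scoring option.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax over temperature-scaled logits (shifted by the max for stability).
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

# Hypothetical: the model "prefers" 90 but also gives weight to 88 and 74.
candidates = [90, 88, 74]
logits = [2.0, 1.6, 0.9]
rng = random.Random()

for temperature in (0, 0.7):
    picks = [candidates[sample_with_temperature(logits, temperature, rng)]
             for _ in range(10)]
    print(f"temperature={temperature}: {picks}")
```

At temperature 0 every pick is 90. At 0.7 the list is a mix of all three values, and the mix changes each time you run the script.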
What This Means Practically
| Run | Score | What Changed |
|---|---|---|
| 1st | 90/100 | Nothing — baseline |
| 2nd | 74/100 | Nothing — same file |
| 3rd | 88/100 | Nothing — same file |
| Average | 84/100 | Nothing, but a better estimate than any single run |
The actionable takeaway: Run your resume through the tool at least five times and average the scores. Averaging n independent runs shrinks the random noise by roughly a factor of √n, so a five-run average is noticeably more stable than any single data point.
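The arithmetic for my three runs, as a small sketch using only the standard library. The helper name `summarize` is my own; the standard error line assumes the runs are independent, which seems reasonable for separate submissions but is not something I have verified.

```python
from math import sqrt
from statistics import mean, stdev

def summarize(scores):
    """Return mean, range, sample standard deviation, and standard error."""
    sd = stdev(scores)  # sample standard deviation (n - 1 in the denominator)
    return {
        "mean": mean(scores),
        "range": max(scores) - min(scores),
        "stdev": sd,
        "stderr": sd / sqrt(len(scores)),  # uncertainty of the average itself
    }

stats = summarize([90, 74, 88])
print(f"mean={stats['mean']:.1f} range={stats['range']} "
      f"stdev={stats['stdev']:.1f} stderr={stats['stderr']:.1f}")
# mean=84.0 range=16 stdev=8.7 stderr=5.0
```

With only three runs the average is itself uncertain by about five points, which is why more runs help.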
What HackerRank's Open-Source ATS Actually Evaluates
To understand the scores, I went to the source — the actual repository. Here's what the evaluation framework appears to prioritize (based on the publicly available code and prompt engineering):
1. Keyword and Skills Alignment
The system compares your resume's stated skills against the job description you provide. This is where most of the score weight lives. If the job description asks for "distributed systems experience" and your resume says "worked on microservices at scale," the LLM might connect those dots — or it might not, depending on the run.
2. Experience Relevance Scoring
It doesn't just check if you have experience; it tries to assess whether your specific experience is relevant to the role. A 10-year career in backend engineering might score differently against a "Senior Backend Engineer" role versus a "Full-Stack Product Engineer" role, even if your resume is identical.
3. Formatting and Readability
The tool penalizes resumes that are hard to parse — dense walls of text, unusual formatting, or non-standard section headers. This is one area where the scoring tends to be more consistent across runs.
4. Quantified Achievements
Like most modern resume advice, the system rewards bullet points that include measurable outcomes ("reduced API latency by 40%") over vague descriptions ("improved system performance").
5. Education and Credential Matching
For roles with explicit educational requirements, the system checks alignment. This is weighted lower than experience for most technical roles.
How to Actually Use This Tool (Without Losing Your Mind)
Despite the inconsistency drama, HackerRank's open-source ATS is genuinely useful if you approach it correctly. Here's a practical workflow:
Step 1: Run It Multiple Times First
Don't act on a single score. Run your resume against the same job description five times and note the range. A resume that scores 85–92 consistently is in good shape. One that swings between 60 and 88 likely contains ambiguous content that the model reads differently from run to run, and that ambiguity is worth fixing.
Step 2: Use the Feedback, Not the Number
The score is a headline. The feedback is the article. Most runs will generate qualitative comments about what's missing or weak. Look for patterns in that feedback across multiple runs — if three out of five runs mention "lacks specific cloud infrastructure experience," that's signal worth acting on.
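One low-tech way to find those patterns is to label each run's comments with short theme tags by hand and then tally them. The sketch below assumes that manual labeling step; the theme names, the `recurring_themes` helper, and the three-run threshold are illustrative choices, not anything the tool provides.

```python
from collections import Counter

def recurring_themes(runs, min_runs=3):
    """Return themes mentioned in at least min_runs runs, most frequent first."""
    counts = Counter()
    for themes in runs:
        counts.update(set(themes))  # count each theme at most once per run
    return [theme for theme, n in counts.most_common() if n >= min_runs]

# Hypothetical hand-labeled feedback from five runs.
runs = [
    ["missing metrics", "cloud infrastructure"],
    ["missing metrics"],
    ["cloud infrastructure", "formatting"],
    ["missing metrics", "cloud infrastructure"],
    ["missing metrics"],
]

print(recurring_themes(runs))
# ['missing metrics', 'cloud infrastructure']
```

"Formatting" appeared in only one run, so it drops out as probable noise, while the other two themes are worth acting on.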
Step 3: Compare Against Multiple Job Descriptions
Run your resume against three to five different job descriptions for roles you're targeting. This reveals whether your resume is genuinely weak or just mismatched to a specific role's language.
Step 4: Treat It as One Signal Among Many
HackerRank's ATS is one tool. It doesn't represent how Greenhouse, Lever, Workday, or iCIMS will evaluate you. Use it for directional guidance, not as the final word.
How This Compares to Other Resume Scoring Tools
Since we're being honest about what this tool can and can't do, let's put it in context.
| Tool | Scoring Method | Consistency | Best For |
|---|---|---|---|
| HackerRank Open-Source ATS | LLM-based | Low-Medium | Holistic relevance assessment |
| Jobscan | Keyword matching | High | ATS keyword optimization |
| Resume Worded | ML + rules-based | Medium-High | Comprehensive resume feedback |
| Teal HQ | Keyword + AI hybrid | Medium | Job tracking + resume tailoring |
| Manual recruiter review | Human judgment | Variable | Final hiring decisions |
Honest assessments:
Jobscan is the most reliable for traditional ATS keyword optimization. It's not glamorous, but if you're applying to companies using Workday or Greenhouse, it's more directly applicable than HackerRank's tool. The free tier is limited but useful for a quick check.
Resume Worded gives more detailed feedback on resume quality beyond just keywords — it'll tell you if your bullet points are weak, not just whether they contain the right terms. Worth using alongside the HackerRank tool.
Teal HQ is my pick for job seekers who want an all-in-one workflow. The resume scoring is decent, but the real value is in tracking applications and tailoring your resume to specific roles at scale.
The Bigger Picture: What Open-Sourcing an ATS Actually Means
Let's zoom out for a second, because the score drama is actually the less interesting part of this story.
The fact that HackerRank open-sourced this tool is genuinely significant. For years, ATS systems have been black boxes. Candidates knew their resumes were being filtered by software but had no visibility into how. Open-sourcing the evaluation logic is a meaningful step toward transparency in hiring.
But it also reveals something uncomfortable: even the companies building these tools can't fully predict what score a given run will produce. An LLM-based evaluation system that produces scores of 90, 74, and 88 on the same input isn't a precisely engineered measurement instrument. It's a probabilistic approximation of what a human recruiter might think.
That's not necessarily bad — human recruiters are also inconsistent. Studies have shown that the same resume can get dramatically different evaluations from different recruiters, or even from the same recruiter on different days. In that sense, the LLM's variance is honest. It's just unexpected when you're looking at a number that implies precision.
What This Means for Job Seekers in 2026
The hiring landscape has shifted significantly. More companies are using AI-assisted screening, but the technology is still maturing. The practical implications:
- Keyword optimization still matters for traditional ATS systems, but it's becoming less sufficient on its own
- Narrative coherence — how well your career story hangs together — is increasingly evaluated by LLM-based tools
- Tailoring your resume to each job description is more important than ever, because AI systems are better at detecting generic applications
- Human review still happens for most roles above a certain level — your resume needs to work for both machines and people
Practical Resume Improvements Based on What This Tool Reveals
Whether your score was 90, 74, or somewhere in between, here are the improvements that consistently move the needle across multiple runs:
Quick Wins (Do These Today)
- Add a skills section with explicit technology/tool names that match your target job descriptions
- Convert vague bullet points to achievement-oriented statements with numbers ("managed team" → "managed 6-person team that shipped 3 product features in Q1")
- Make sure your job titles clearly communicate seniority and function
- Remove anything more than 10–12 years old unless it's directly relevant
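For the bullet-point rewrite, a crude self-audit can flag which lines to revisit first. This sketch only checks whether a bullet contains a digit, so it is a rough heuristic of my own and not how HackerRank's tool judges achievements; the `unquantified` helper and the sample bullets are illustrative.

```python
import re

def unquantified(bullets):
    """Return the bullets that contain no digits at all."""
    return [b for b in bullets if not re.search(r"\d", b)]

bullets = [
    "Managed team",
    "Managed 6-person team that shipped 3 product features in Q1",
    "Reduced API latency by 40%",
    "Improved system performance",
]

for bullet in unquantified(bullets):
    print("Needs a number:", bullet)
# Needs a number: Managed team
# Needs a number: Improved system performance
```

A digit anywhere passes the check, so a bullet can slip through on a date or version number alone. Treat the output as a to-do list, not a verdict.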
Medium-Effort Improvements
- Write a tailored professional summary for each job category you're targeting (not each individual application — that's unsustainable)
- Audit your resume for jargon vs. clarity — internal company terminology that made sense at your last job may confuse both AI and human reviewers
- Ensure your section headers are standard ("Work Experience," "Education," "Skills") rather than creative alternatives that parsing systems may not recognize
Structural Changes Worth Considering
- If you're a career changer, consider a hybrid resume format that leads with a skills/competencies section before chronological experience
- For senior roles, add a career highlights section at the top that surfaces your three to five most impressive achievements immediately
Frequently Asked Questions
Q: Is HackerRank's open-source ATS the same system companies actually use to screen candidates?
Not exactly. HackerRank has released a version of their ATS tooling, but individual companies configure and customize ATS systems to their own requirements. The open-source version gives you insight into one approach to AI-assisted resume screening, but it doesn't perfectly replicate what any specific employer's system will do with your resume. Use it as directional guidance, not a definitive predictor.
Q: Why does my score change every time I run it?
This is expected behavior for LLM-based systems, not necessarily a bug. Large Language Models are probabilistic: they sample from probability distributions when generating text, which means the same input can produce different outputs. The variance you're seeing (often 10–20 points) comes from that sampling, and it tells you the score is less precise than a single number suggests. Running the tool multiple times and averaging the scores gives you a more reliable signal than any single result.
Q: Should I optimize my resume specifically for this tool?
Only partially. Optimizing for the patterns this tool consistently flags — better keyword alignment, quantified achievements, clear formatting — will generally make your resume stronger across all evaluation systems. But don't chase a specific score or optimize for quirks of this particular tool at the expense of resume clarity and authenticity.
Q: How does this compare to what companies using Greenhouse or Workday actually see?
Traditional enterprise ATS platforms like Greenhouse, Workday, and Lever still rely heavily on keyword matching and structured data parsing rather than LLM-based evaluation. For applications going into those systems, tools like Jobscan that specifically model keyword matching may be more directly predictive. HackerRank's tool is more useful for roles at companies using AI-forward screening processes.
Q: What's the most reliable way to know if my resume is actually good?
Honest answer: get it reviewed by a human recruiter or hiring manager in your target field. AI tools — including this one — are useful for identifying obvious gaps and optimizing for automated screening, but human feedback remains the gold standard. Many career coaches offer resume reviews, and communities like Blind, relevant subreddits, and professional associations often have peer review opportunities.
The Bottom Line
HackerRank's open-source ATS is a flawed, fascinating, and genuinely useful tool — as long as you understand what it is and isn't. The score that swings between 74 and 90 isn't telling you your resume is bad; it's telling you that resume evaluation, even by sophisticated AI systems, contains more uncertainty than the clean number suggests.
Use the tool for what it's good at: identifying patterns in how your resume aligns with job descriptions, catching structural weaknesses, and understanding the logic behind AI-assisted screening. Ignore the score as a single source of truth.
The real value of HackerRank open-sourcing this system isn't the tool itself — it's the transparency. For the first time, candidates can look inside the black box. What we found is that the box is more uncertain than we thought. That's not a reason to despair. It's a reason to stop obsessing over a number and start focusing on building a resume that clearly communicates your value to both machines and the humans who ultimately make hiring decisions.
Ready to put this into practice? Start by running your resume through the HackerRank open-source ATS five times and averaging your scores. Then cross-reference with Resume Worded for qualitative feedback. If you're actively job hunting, Teal HQ can help you track applications and tailor your resume systematically across multiple roles.
Have you tested your resume with HackerRank's open-source ATS? Drop your experience in the comments — especially if your scores were as all over the place as mine.