Esther Studer

I Built an AI Job Application Screener in 47 Lines of Python — Here's What I Learned


Every time I open my inbox after posting a job listing, I die a little inside.

Not because of the volume (though 300+ applications for a mid-level dev role is absurd). But because 80% of them could be filtered in 10 seconds by a half-awake intern.

So I built a small Python script to do it for me. Here's what happened.


The Problem

Most job applications are noise. Keywords stuffed into a PDF, a cover letter ChatGPT wrote (badly), and zero signal about whether this person can actually do the job.

Recruiters spend an average of 7.4 seconds on a resume before deciding yes or no. That's not because they're lazy — it's because they literally can't read 300 documents carefully.

I wanted to build something that:

  1. Parsed resume PDFs
  2. Scored them against job requirements
  3. Flagged obvious red flags (gaps, keyword stuffing, mismatched roles)
  4. Ranked candidates in < 5 seconds per application

The Stack

  • pdfplumber — PDF text extraction
  • openai — GPT-4o-mini for scoring (cheap and fast)
  • pydantic — structured output validation
  • rich — because terminal output should be beautiful
pip install pdfplumber openai pydantic rich

The Code (47 Lines)

import pdfplumber
from openai import OpenAI
from pydantic import BaseModel
from rich.console import Console
from rich.table import Table
import json, sys

client = OpenAI()  # uses OPENAI_API_KEY env var
console = Console()

JOB_REQUIREMENTS = """
Senior Python Developer — 5+ years experience, FastAPI or Django,
PostgreSQL, Docker, cloud (AWS/GCP), team lead experience preferred.
"""

class CandidateScore(BaseModel):
    name: str
    score: int  # 0-100
    strengths: list[str]
    red_flags: list[str]
    recommendation: str  # "advance" | "maybe" | "pass"

def extract_text(pdf_path: str) -> str:
    with pdfplumber.open(pdf_path) as pdf:
        return " ".join(page.extract_text() or "" for page in pdf.pages)

def score_candidate(resume_text: str) -> CandidateScore:
    system = (
        f"You are a senior tech recruiter. Score this resume against: {JOB_REQUIREMENTS}\n"
        "Respond with JSON only, using exactly these keys: name (str), "
        "score (int, 0-100), strengths (list of str), red_flags (list of str), "
        "recommendation ('advance' | 'maybe' | 'pass')."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": resume_text[:3000]}
        ],
        # json_object mode requires "JSON" to appear in the prompt and only
        # guarantees valid JSON, not this schema; pydantic catches any drift.
        response_format={"type": "json_object"}
    )
    data = json.loads(response.choices[0].message.content)
    return CandidateScore(**data)

def main():
    pdfs = sys.argv[1:]
    results = []

    for pdf in pdfs:
        text = extract_text(pdf)
        score = score_candidate(text)
        results.append((pdf, score))

    results.sort(key=lambda x: x[1].score, reverse=True)

    table = Table(title="Candidate Rankings")
    for col in ("File", "Score", "Rec", "Red Flags"):
        table.add_column(col)

    for path, s in results:
        table.add_row(path, str(s.score), s.recommendation, ", ".join(s.red_flags[:2]))

    console.print(table)

if __name__ == "__main__":
    main()
python screener.py applications/*.pdf
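One portability note on that invocation: Unix shells expand the `*.pdf` before Python ever sees it, but Windows cmd and PowerShell pass the literal pattern through in `sys.argv`. A small helper covers both cases (`expand_args` is my name for it, not part of the script above):

```python
import glob

def expand_args(args: list[str]) -> list[str]:
    """Expand glob patterns the shell didn't already expand (e.g. Windows cmd)."""
    out = []
    for arg in args:
        matches = sorted(glob.glob(arg))
        # Keep unmatched literals so the later open() fails loudly
        # instead of silently skipping a typo'd path.
        out.extend(matches or [arg])
    return out
```

On Unix this is effectively a no-op, since already-expanded paths match themselves.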

What the Output Looks Like

Candidate Rankings
File              | Score | Rec      | Red Flags
anna_m.pdf        |  87   | advance  | No Docker mentioned
dev_john.pdf      |  72   | maybe    | 2yr gap 2021-2023
chatgpt_guy.pdf   |  31   | pass     | Keyword stuffing, vague

Processed 47 resumes in 23 seconds. Cost: $0.11 in OpenAI credits.
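Most of those 23 seconds are network wait, because the loop scores one resume at a time. Since each call is I/O-bound, a thread pool should cut wall time roughly in proportion to the worker count. A sketch, where `score_fn` stands in for `score_candidate` above (watch your OpenAI rate limits before cranking `workers` up):

```python
from concurrent.futures import ThreadPoolExecutor

def score_all(texts, score_fn, workers=8):
    # Each scoring call is network-bound, so threads overlap the waits.
    # pool.map preserves input order, matching the sequential loop.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score_fn, texts))
```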


What Surprised Me

1. GPT-4o-mini is shockingly good at this

I expected generic scoring. Instead, it caught things like: "mentions 'machine learning' 11 times but all in the context of using no-code tools" — that's a red flag a human recruiter might miss under time pressure.

2. The real value is the red flags, not the score

A score of 72 vs 68 is noise. But "employment gap not explained" or "role titles don't match claimed responsibilities" are actual signals.
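If the red flags are the real signal, it's worth triaging on them directly instead of just sorting by score. A sketch: the field names mirror the `CandidateScore` model above, but the bucketing logic is my suggestion, not something in the script:

```python
from collections import defaultdict

def triage(results):
    """Bucket (path, score) pairs by recommendation; strong candidates
    who still carry red flags get their own bucket for a human look."""
    buckets = defaultdict(list)
    for path, s in results:
        if s.recommendation == "advance" and s.red_flags:
            buckets["advance_with_notes"].append(path)
        else:
            buckets[s.recommendation].append(path)
    return dict(buckets)
```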

3. It's wrong ~15% of the time

PDF parsing breaks on two-column layouts, and some resumes embed their text as images, so extraction comes back empty or garbled. Always have a human do a sanity-check pass on the top 20%.
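A cheap guard catches most of those failures before the LLM sees them: if extraction returns almost no text, route the file to a human instead of scoring noise. The 400-character threshold is a guess; tune it on your own pile:

```python
def needs_manual_review(text: str, min_chars: int = 400) -> bool:
    # Image-only PDFs and mangled two-column layouts typically yield
    # little usable text; don't let the model score garbage.
    return len(" ".join(text.split())) < min_chars
```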


The Bigger Picture: AI Won't Replace Recruiters

But it will make the bad ones obsolete and the good ones 10x faster.

The same logic applies to candidates. If you're applying for jobs in 2025 without understanding how these systems work, you're invisible. Not because you're underqualified — because your resume wasn't written for the algorithm that reads it first.

I've seen smart engineers get filtered out in 0.3 seconds because they wrote "worked with databases" instead of "PostgreSQL, Redis, query optimization".

If you're navigating a career switch and want an AI that actually understands the job market, jobwechsel-ki.ch is worth a look — it's built specifically for that context.


What's Next

I'm adding:

  • Email auto-responders for rejected candidates (with actual useful feedback)
  • GitHub profile analysis as a secondary signal
  • A simple web UI so non-technical hiring managers can use it
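For the GitHub signal, the idea is to condense public repo metadata into a few numbers. A sketch of the aggregation half: the input is the JSON shape returned by the GitHub REST API's `GET /users/{user}/repos` endpoint (only `fork`, `language`, and `stargazers_count` are used), and the scoring choices are my guess at the feature, not shipped code:

```python
def github_signal(repos: list[dict]) -> dict:
    """Condense a candidate's public repos into a secondary signal."""
    own = [r for r in repos if not r.get("fork")]  # ignore forks
    langs: dict[str, int] = {}
    for r in own:
        if r.get("language"):
            langs[r["language"]] = langs.get(r["language"], 0) + 1
    return {
        "original_repos": len(own),
        "total_stars": sum(r.get("stargazers_count", 0) for r in own),
        "top_languages": sorted(langs, key=langs.get, reverse=True)[:3],
    }
```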

Want the full repo? Drop a comment and I'll put it on GitHub.


What's your take — are AI screeners fair to candidates? Or are we just automating bias faster? Let's discuss below. 👇
