Applicant Tracking Systems are black boxes. They score and rank resumes before any human reads them. The rules are not published. Every ATS behaves differently. The feedback loop is nonexistent.
SIRA is a pipeline of 9 specialized AI agents that analyze, score, and optimize a resume in about 60 seconds. Here is the technical breakdown.
The Problem: ATS Is a Black Box
ATS platforms like Workday, Greenhouse, and Lever use keyword matching, section detection, and scoring algorithms to rank applicants. Most job seekers do not realize their resumes get filtered before a human reads them.
As developers, we can do better.
Architecture: Why 9 Agents
A monolithic prompt (dump the resume and job description into GPT and ask for improvements) produces generic output. Monolithic prompts hallucinate metrics, miss structural issues, and fail to reason about multiple optimization dimensions simultaneously.
SIRA uses a multi-agent pipeline where each agent has a single, well-defined responsibility. Think of it like microservices for reasoning.
Agent 1: Document Parser
Extracts raw text and structural metadata from PDF/DOCX uploads. Handles multi-column layouts, tables, headers, and edge cases like image-based PDFs (via OCR fallback). Outputs a normalized JSON representation.
Agent 2: Section Classifier
Identifies and labels resume sections: Contact Info, Summary, Experience, Education, Skills, Projects, Certifications. Uses a fine-tuned classifier trained on resume section samples.
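To make the labeling task concrete, here is a heuristic baseline (the real agent is a fine-tuned classifier, and these heading patterns are illustrative): scan for recognizable section headings and group the lines beneath them.

```python
import re
from typing import Optional

# Hypothetical heading patterns -- a heuristic stand-in for the
# fine-tuned classifier described above.
SECTION_PATTERNS = {
    "experience": r"^(work\s+)?experience|employment\s+history",
    "education": r"^education",
    "skills": r"^(technical\s+)?skills",
    "projects": r"^projects",
    "certifications": r"^certifications?|licenses",
    "summary": r"^(professional\s+)?summary|objective",
}

def classify_heading(line: str) -> Optional[str]:
    """Return a section label if the line looks like a resume heading."""
    text = line.strip().lower()
    for label, pattern in SECTION_PATTERNS.items():
        if re.match(pattern, text):
            return label
    return None

def split_sections(resume_lines: list[str]) -> dict[str, list[str]]:
    """Group resume lines under the most recent recognized heading."""
    sections: dict[str, list[str]] = {}
    current = "contact"  # everything before the first heading
    for line in resume_lines:
        label = classify_heading(line)
        if label:
            current = label
        else:
            sections.setdefault(current, []).append(line)
    return sections
```

A trained classifier replaces the regex table but keeps the same interface: lines in, labeled sections out.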
Agent 3: ATS Compatibility Analyzer
The core engine. It simulates how major ATS platforms parse the resume:
- Keyword extraction from the job description (TF-IDF + semantic embeddings)
- Keyword matching against resume content (exact match, synonym match, semantic similarity)
- Format compatibility scoring
- Section completeness check
Outputs a compatibility score from 0-100 with a detailed breakdown.
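The TF-IDF half of the keyword-extraction step can be sketched in a few lines (the background corpus of other job descriptions here is illustrative; SIRA combines this signal with semantic embeddings):

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> list[str]:
    # Keep tech-style tokens like "c++", "c#", "node.js"
    return re.findall(r"[a-z][a-z+#.\-]*", text.lower())

def top_keywords(job_description: str, corpus: list[str], k: int = 10) -> list[str]:
    """Rank JD terms by TF-IDF against a background corpus of other JDs.

    A minimal sketch: terms frequent in this posting but rare across
    postings in general score highest.
    """
    tf = Counter(tokenize(job_description))
    n_docs = len(corpus) + 1
    doc_tokens = [set(tokenize(doc)) for doc in corpus]

    def idf(term: str) -> float:
        df = 1 + sum(term in toks for toks in doc_tokens)
        return math.log(n_docs / df) + 1.0

    scored = {term: count * idf(term) for term, count in tf.items()}
    return [t for t, _ in sorted(scored.items(), key=lambda x: -x[1])[:k]]
```

Generic terms ("python" in a corpus full of Python roles) get discounted, while posting-specific requirements float to the top.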
Agent 4: Keyword Gap Analyzer
Compares the job description's required/preferred qualifications against resume content. Produces a gap report:
```json
{
  "missing_hard_skills": ["Kubernetes", "Terraform"],
  "missing_soft_skills": ["cross-functional collaboration"],
  "weak_matches": ["Python (mentioned but not emphasized)"],
  "strong_matches": ["React", "TypeScript", "AWS"]
}
```
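The bucketing logic behind a report like this can be sketched with a simplified exact-match rule (SIRA also uses synonym and semantic matching; the "mentioned more than once = strong" heuristic is an assumption for illustration):

```python
import re

def gap_report(jd_skills: list[str], resume_text: str) -> dict[str, list[str]]:
    """Bucket JD skills by how strongly the resume covers them.

    Simplified sketch: a skill is "strong" if mentioned more than once,
    "weak" if mentioned exactly once, and "missing" otherwise.
    """
    text = resume_text.lower()
    report: dict[str, list[str]] = {
        "strong_matches": [], "weak_matches": [], "missing": []
    }
    for skill in jd_skills:
        count = len(re.findall(re.escape(skill.lower()), text))
        if count > 1:
            report["strong_matches"].append(skill)
        elif count == 1:
            report["weak_matches"].append(skill)
        else:
            report["missing"].append(skill)
    return report
```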
Agent 5: Impact Quantifier
Scans bullet points and flags vague language. "Improved system performance" becomes "Improved system response time by 40% by implementing Redis caching, reducing average API latency from 850ms to 510ms."
This agent uses few-shot prompting with industry-specific examples. It will not fabricate numbers. It suggests where metrics should go and provides templates.
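A hypothetical version of that few-shot prompt might be assembled like this (the example pairs and wording are illustrative, not the production prompt; note the bracketed placeholders that keep the model from inventing numbers):

```python
# Hypothetical few-shot pairs: vague bullet -> metric-driven template.
FEW_SHOT_EXAMPLES = [
    ("Improved system performance",
     "Improved system response time by [X]% by implementing [technique], "
     "reducing average API latency from [before] to [after]"),
    ("Worked on the payments team",
     "Shipped [N] payment features serving [M] daily transactions, "
     "cutting checkout failures by [X]%"),
]

def build_quantifier_prompt(bullet: str) -> str:
    """Build a few-shot prompt that rewrites a bullet without fabricating metrics."""
    lines = [
        "Rewrite the resume bullet to be metric-driven.",
        "Never invent numbers: leave [bracketed] placeholders "
        "for the candidate to fill in.",
        "",
    ]
    for vague, improved in FEW_SHOT_EXAMPLES:
        lines.append(f"Vague: {vague}")
        lines.append(f"Improved: {improved}")
        lines.append("")
    lines.append(f"Vague: {bullet}")
    lines.append("Improved:")
    return "\n".join(lines)
```

The placeholder convention is what makes the "will not fabricate numbers" guarantee enforceable: any digits in the output that are not placeholders can be flagged by a validator.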
Agent 6: Language and Tone Optimizer
Ensures consistent tense, active voice, professional tone, and eliminates filler words. Handles localization for different markets.
Agent 7: Structure Advisor
Recommends section reordering based on the target role. A senior engineer should lead with Experience. A fresh graduate should lead with Education or Projects. It also flags formatting issues that break ATS parsing.
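The core reordering rule reduces to a small lookup (an illustrative sketch; the real advisor conditions on the parsed job description, not just a seniority cutoff, and the 3-year threshold is an assumption):

```python
# Hypothetical ordering profiles keyed by candidate seniority.
SECTION_ORDER = {
    "senior": ["summary", "experience", "skills", "projects", "education"],
    "new_grad": ["summary", "education", "projects", "skills", "experience"],
}

def recommend_order(years_of_experience: float) -> list[str]:
    """Lead with Experience for seasoned candidates, Education/Projects otherwise."""
    profile = "senior" if years_of_experience >= 3 else "new_grad"
    return SECTION_ORDER[profile]
```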
Agent 8: Competitive Benchmarker
Compares the resume against aggregated patterns from successful resumes in the same role/industry.
Agent 9: Final Synthesizer
Takes all outputs from agents 3-8, resolves conflicts, and produces an optimized resume (markdown + formatted PDF), a detailed score report, and a prioritized action list.
The Orchestration Layer
The agents run as a directed acyclic graph (DAG):
```
Parser -> Classifier -> [ATS Analyzer, Keyword Gap, Impact Quantifier]
       -> [Language Optimizer, Structure Advisor] -> Benchmarker -> Synthesizer
```
Agents 3, 4, and 5 run in parallel. Agents 6 and 7 also parallelize. Total processing time: ~60 seconds instead of 3-4 minutes for sequential execution.
Each agent communicates via structured JSON schemas. Validation runs at every handoff point. If an agent produces malformed output, it gets one retry with the error message injected, then fails gracefully with partial results.
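The orchestration described above maps naturally onto `asyncio` (a minimal sketch: the agent signatures and payload keys are assumptions, and the stage names follow the pipeline in this post):

```python
import asyncio

async def run_agent(agent, payload: dict, validate) -> dict:
    """Run one agent; on invalid output, retry once with the error injected."""
    result = await agent(payload)
    try:
        validate(result)
        return result
    except ValueError as err:
        retry_payload = {**payload, "previous_error": str(err)}
        result = await agent(retry_payload)
        validate(result)  # second failure propagates -> graceful partial results
        return result

async def pipeline(resume: str, jd: str, agents: dict, validate) -> dict:
    """Sketch of the DAG: two sequential stages, two parallel fan-outs, then synthesis."""
    parsed = await run_agent(agents["parser"], {"resume": resume}, validate)
    sections = await run_agent(agents["classifier"], parsed, validate)
    # Stage 3: ATS analyzer, keyword gap, and impact quantifier in parallel
    ats, gaps, impact = await asyncio.gather(
        run_agent(agents["ats"], {**sections, "jd": jd}, validate),
        run_agent(agents["gap"], {**sections, "jd": jd}, validate),
        run_agent(agents["impact"], sections, validate),
    )
    # Stage 4: language optimizer and structure advisor in parallel
    language, structure = await asyncio.gather(
        run_agent(agents["language"], sections, validate),
        run_agent(agents["structure"], sections, validate),
    )
    bench = await run_agent(agents["bench"], {"ats": ats, "gaps": gaps}, validate)
    return await run_agent(
        agents["synth"],
        {"ats": ats, "gaps": gaps, "impact": impact,
         "language": language, "structure": structure, "bench": bench},
        validate,
    )
```

`asyncio.gather` gives the two parallel fan-outs for free; the single retry with the validation error injected into the payload is what turns malformed JSON into a recoverable event instead of a pipeline failure.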
NLP Under the Hood
Semantic keyword matching: Exact keyword matching misses too much. "Machine Learning" and "ML" are the same thing. SIRA uses sentence-transformers (all-MiniLM-L6-v2) to compute embeddings and cosine similarity, with a threshold tuned per-category.
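The matching logic itself is small once embeddings exist (a sketch: in production the vectors come from encoding phrases with `all-MiniLM-L6-v2`, and the per-category thresholds below are illustrative, not SIRA's tuned values):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Illustrative per-category thresholds; tuned on real data in practice.
THRESHOLDS = {"hard_skill": 0.80, "soft_skill": 0.65}

def semantic_match(jd_vec: list[float],
                   resume_vecs: list[list[float]],
                   category: str) -> bool:
    """True if any resume phrase embedding clears the category threshold."""
    threshold = THRESHOLDS[category]
    return any(cosine(jd_vec, vec) >= threshold for vec in resume_vecs)
```

This is how "Machine Learning" and "ML" end up matching: their embeddings sit close in vector space even though the strings share almost nothing.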
Job description parsing: JDs are messy. Required versus preferred qualifications are not always separated. A custom NER model extracts structured requirements from free-text postings.
Resume scoring calibration: The 0-100 ATS score is calibrated against real-world outcomes from beta users; that data is used to weight the scoring components.
Key Learnings
Multi-agent beats monolithic prompts for complex analytical tasks. Each agent is independently testable and tunable.
ATS platforms are surprisingly simple. Most choke on two-column layouts, tables, and creative formatting. A clean single-column resume outperforms a gorgeously designed one.
The keyword game is real, but nuance matters. Keyword stuffing gets caught. Contextual usage in achievement-oriented bullet points scores well.
Feedback loops are everything. The biggest quality improvement came from collecting user feedback on whether optimized resumes led to more callbacks.
Try It
SIRA is live and free to use:
- Web app: https://sira.now. Upload your resume, paste a job description, get your optimization report
- Telegram bot: https://t.me/sira_cv_bot. Send your resume directly in chat for quick analysis
The job market is tough in ways that hurt qualified people. If you are a developer who has been ghosted after dozens of applications, the problem might not be your skills. It might be how your resume talks to machines.
Upload your resume to SIRA today and see where the pipeline identifies gaps.