Introduction
Last month, our AI hiring platform could only process PDF resumes.
This week, it handles PDFs, Word documents, and scanned images with OCR.
We didn't change a single line of code in our resume processing agent. We just deployed new extractors to the cluster — and the LLM-powered agent started using them automatically.
This is what building on MCP Mesh feels like.
The Platform
We built a multi-agent hiring platform on Kubernetes:
┌─────────────────────────────────────────────────────────────────────────┐
│                            MCP Mesh Registry                            │
│                     (Discovery + Topology + Health)                     │
└─────────────────────────────────────────────────────────────────────────┘
     ▲            ▲              ▲             ▲            ▲
     │            │              │             │            │
┌────┴────┐ ┌─────┴─────┐ ┌──────┴──────┐ ┌────┴────┐ ┌─────┴─────┐
│ Resume  │ │    Job    │ │  Interview  │ │ Scoring │ │    LLM    │
│  Agent  │ │  Matcher  │ │    Agent    │ │  Agent  │ │ Providers │
│  (LLM)  │ │           │ │    (LLM)    │ │  (LLM)  │ │           │
└────┬────┘ └───────────┘ └─────────────┘ └─────────┘ └───────────┘
     │
     │ dynamically discovers extractors
     ▼
┌───────────────────────────────────────────────────────────────────┐
│                          Extractor Tools                          │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐               │
│  │   PDF   │  │   DOC   │  │  Image  │  │ Future  │               │
│  │Extractor│  │Extractor│  │  (OCR)  │  │   ...   │               │
│  └─────────┘  └─────────┘  └─────────┘  └─────────┘               │
│  tags: [extractor, pdf] [extractor, doc] [extractor, ocr]         │
└───────────────────────────────────────────────────────────────────┘
The key insight: agents powered by @mesh.llm don't have hardcoded dependencies. They discover tools at runtime — and intelligently choose which ones to use.
The Resume Agent
Here's the core of our resume processing:
@mesh.llm(
    provider={"capability": "llm", "tags": ["+claude"]},
    filter=[{"tags": ["extractor"]}],  # Discover all extractors
    system_prompt="""You process uploaded resumes.
    Available tools let you extract text from different file formats.
    Choose the appropriate extractor based on the file type.
    Then analyze the extracted text and return structured candidate data.""",
    max_iterations=3,
)
@mesh.tool(
    capability="process_resume",
    tags=["resume", "ai"],
)
async def process_resume(
    file_path: str,
    file_type: str,
    llm: MeshLlmAgent = None,
) -> CandidateProfile:
    return await llm(f"Process this {file_type} resume: {file_path}")
The magic is in filter=[{"tags": ["extractor"]}]. The LLM sees every tool tagged with "extractor" — and decides which one to call based on the file type.
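For context, ExtractedText and CandidateProfile are plain data models. Here's a minimal sketch — content, confidence, and metadata all appear in the extractor examples below, while the candidate fields are purely illustrative:

from typing import Optional
from pydantic import BaseModel

class ExtractedText(BaseModel):
    """Text pulled out of an uploaded file by an extractor tool."""
    content: str
    confidence: Optional[float] = None  # OCR extractors report this
    metadata: dict = {}

class CandidateProfile(BaseModel):
    """Structured candidate data returned by the Resume Agent."""
    # Illustrative fields - your schema will differ
    name: str
    email: Optional[str] = None
    skills: list[str] = []
    summary: str = ""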
Day 1: PDF Only
When we launched, we had one extractor:
# pdf_extractor.py
@mesh.tool(
    capability="extract_pdf",
    tags=["extractor", "pdf"],
    description="Extract text content from PDF files"
)
async def extract_pdf(file_path: str) -> ExtractedText:
    # PDF extraction logic
    return ExtractedText(content=text, metadata={...})
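The body behind that comment is ordinary Python. A minimal sketch using pypdf — our library choice here, nothing MCP Mesh mandates (decorator omitted for brevity):

from pypdf import PdfReader

async def extract_pdf(file_path: str) -> ExtractedText:
    reader = PdfReader(file_path)
    # Join the text layer of every page; pages with no text layer yield ""
    text = "\n".join(page.extract_text() or "" for page in reader.pages)
    return ExtractedText(content=text, metadata={"pages": len(reader.pages)})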
The Resume Agent's LLM sees: Available tools: [extract_pdf]
User uploads resume.pdf → LLM reasons: "This is a PDF, I'll use extract_pdf" → Extracts text → Returns structured profile.
Day 30: Adding Word Support
Product wants Word document support. We write a new extractor:
# doc_extractor.py
@mesh.tool(
    capability="extract_doc",
    tags=["extractor", "doc", "docx"],
    description="Extract text content from Word documents (.doc, .docx)"
)
async def extract_doc(file_path: str) -> ExtractedText:
    # Word extraction logic
    return ExtractedText(content=text, metadata={...})
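The body is a few lines with python-docx — again our choice, and it only covers .docx; legacy .doc files would need a converter (e.g., headless LibreOffice) in front (decorator omitted):

from docx import Document

async def extract_doc(file_path: str) -> ExtractedText:
    doc = Document(file_path)
    # Concatenate paragraphs in document order
    text = "\n".join(p.text for p in doc.paragraphs)
    return ExtractedText(content=text, metadata={"paragraphs": len(doc.paragraphs)})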
Deploy to Kubernetes:
helm install doc-extractor oci://ghcr.io/dhyansraj/mcp-mesh/mcp-mesh-agent \
--version 0.7.12 \
-n mcp-mesh \
-f doc-extractor/helm-values.yaml
Within 10 seconds, the Resume Agent's LLM sees: Available tools: [extract_pdf, extract_doc]
User uploads resume.docx → LLM reasons: "This is a Word document, I'll use extract_doc" → Works.
No code change to the Resume Agent. No restart. No config update.
Day 60: Image OCR for Scanned Resumes
HR reports that some candidates upload scanned PDFs or photos of their resumes. We add OCR:
# image_extractor.py
@mesh.tool(
    capability="extract_image_ocr",
    tags=["extractor", "image", "ocr", "scan"],
    description="Extract text from images or scanned documents using OCR"
)
async def extract_image_ocr(file_path: str) -> ExtractedText:
    # OCR logic (Tesseract, Cloud Vision, etc.)
    return ExtractedText(content=text, confidence=0.92, metadata={...})
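For the curious, here's roughly what the Tesseract variant of that body looks like — a sketch with pytesseract and Pillow, where the confidence math is our own approximation (decorator omitted):

import pytesseract
from PIL import Image

async def extract_image_ocr(file_path: str) -> ExtractedText:
    image = Image.open(file_path)
    # Word-level OCR results, including per-word confidence scores
    data = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)
    words = [w for w in data["text"] if w.strip()]
    confs = [float(c) for c in data["conf"] if float(c) >= 0]
    avg_confidence = round(sum(confs) / len(confs) / 100, 2) if confs else 0.0
    return ExtractedText(
        content=" ".join(words),
        confidence=avg_confidence,
        metadata={"engine": "tesseract"},
    )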
helm install image-extractor oci://ghcr.io/dhyansraj/mcp-mesh/mcp-mesh-agent \
--version 0.7.12 \
-n mcp-mesh \
-f image-extractor/helm-values.yaml
The Resume Agent's LLM now sees: Available tools: [extract_pdf, extract_doc, extract_image_ocr]
User uploads resume_scan.jpg → LLM reasons: "This is an image, I'll use extract_image_ocr" → Works.
But here's where it gets interesting. User uploads a PDF that's actually a scanned image (no selectable text). The LLM:
- Tries extract_pdf → Gets empty/garbled text
- Reasons: "The PDF extraction returned garbage. This might be a scanned document."
- Calls extract_image_ocr on the same file → Gets clean text
- Returns structured profile
The agent got smarter. It learned a new recovery strategy without anyone writing that logic.
We didn't tell the Resume Agent about OCR. We didn't update its prompts. We just deployed an extractor with good tags and a clear description — and the LLM figured out when to use it.
Why This Works
Traditional microservices require explicit wiring:
# Traditional approach - hardcoded routing
def process_resume(file_path: str, file_type: str):
    if file_type == "pdf":
        text = call_pdf_service(file_path)
    elif file_type in ["doc", "docx"]:
        text = call_doc_service(file_path)
    elif file_type in ["jpg", "png"]:
        text = call_ocr_service(file_path)
    else:
        raise UnsupportedFormatError(file_type)
    # ...
Every new format requires code changes, redeployment, and testing.
MCP Mesh with @mesh.llm inverts this:
- Tools self-describe — Each extractor has tags and descriptions
- LLM discovers tools — filter=[{"tags": ["extractor"]}] broadcasts intent
- LLM reasons about tools — Chooses based on context, not hardcoded rules
- Mesh handles routing — Tool calls go to the right agent automatically
The Resume Agent's code stays frozen. The platform's capabilities expand with each helm install.
The Enterprise Reality
This isn't a toy demo. It's running in production:
| Aspect | Traditional | MCP Mesh |
|---|---|---|
| Adding new file format | Code change + deploy + test | helm install |
| Config files for routing | Per-service | 0 |
| Recovery logic for edge cases | Manual if/else | LLM figures it out |
| Time to add capability | Hours/days | Minutes |
Infrastructure
# One-time cluster setup
helm install mcp-core oci://ghcr.io/dhyansraj/mcp-mesh/mcp-mesh-core \
--version 0.7.12 \
-n mcp-mesh --create-namespace
# Deploys: Registry + PostgreSQL + Redis + Tempo + Grafana
Same agent code runs locally (meshctl start), in Docker Compose, and in Kubernetes. Only the infrastructure changes.
What the LLM Sees
When the Resume Agent's LLM runs, it receives a tool list like:
{
  "tools": [
    {
      "name": "extract_pdf",
      "description": "Extract text content from PDF files",
      "tags": ["extractor", "pdf"],
      "input_schema": {"file_path": "string"}
    },
    {
      "name": "extract_doc",
      "description": "Extract text content from Word documents (.doc, .docx)",
      "tags": ["extractor", "doc", "docx"],
      "input_schema": {"file_path": "string"}
    },
    {
      "name": "extract_image_ocr",
      "description": "Extract text from images or scanned documents using OCR",
      "tags": ["extractor", "image", "ocr", "scan"],
      "input_schema": {"file_path": "string"}
    }
  ]
}
The LLM reads descriptions, understands capabilities, and makes intelligent choices. Add a new tool? It appears in this list within seconds.
LLM Failover (Bonus)
We run two LLM providers:
helm install claude-provider oci://ghcr.io/dhyansraj/mcp-mesh/mcp-mesh-agent \
-f claude-provider/helm-values.yaml -n mcp-mesh
helm install openai-provider oci://ghcr.io/dhyansraj/mcp-mesh/mcp-mesh-agent \
-f openai-provider/helm-values.yaml -n mcp-mesh
The Resume Agent's provider={"capability": "llm", "tags": ["+claude"]} prefers Claude.
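We haven't shown the providers themselves. Conceptually, each one is just another mesh agent exposing the llm capability under its own tag — the same registration pattern as the extractors. A rough sketch (the real provider wiring handles streaming, tool-call round-trips, and retries; treat this as the discovery side only, with the model name purely illustrative):

# claude_provider.py (sketch)
import anthropic

@mesh.tool(
    capability="llm",
    tags=["claude"],
    description="Chat completion backed by Anthropic Claude",
)
async def complete(prompt: str) -> str:
    client = anthropic.AsyncAnthropic()  # reads ANTHROPIC_API_KEY from env
    response = await client.messages.create(
        model="claude-3-5-sonnet-latest",  # illustrative model choice
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

The + in ["+claude"] marks a preference, not a hard requirement: use a claude-tagged provider when one is healthy, accept any llm provider otherwise. That's what makes the failover automatic.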
When Claude's API goes down:
- Mesh marks the Claude provider unhealthy (missed heartbeats)
- Topology updates
- Next request routes to OpenAI
- When Claude recovers, traffic returns
Zero failover code. It's how the mesh works.
What's Next
This article showed what we built — an AI platform that genuinely gets smarter as you add capabilities.
The next article explains why we chose MCP over REST as the foundation — and why that choice matters more than you might think.
👉 [Next: MCP vs REST — Why MCP is the better microservice protocol]