Stop Hardcoding Your Agent Workflows (or Don't): A Dev's Guide to Supervisor Delegation
If you're building anything with LLM agents right now, you've probably hit this fork in the road: do you hardcode which agent handles what, or do you let a "supervisor" agent decide at runtime?
It's tempting to reach for the clever solution—dynamic delegation feels proper, like good OOP. But after shipping a few of these systems, I've learned the hard way that the right answer is annoyingly context-dependent.
Let me walk you through the trade-offs, so you can make the call before you burn through your token budget.
The Two Approaches (in 60 Seconds)
Hardcoded routing is exactly what it sounds like:
def route_task(task):
if "refund" in task.lower():
return refund_agent.handle(task)
elif "track order" in task.lower():
return tracking_agent.handle(task)
else:
return general_support_agent.handle(task)
Simple. Deterministic. Zero tokens spent on routing logic.
Supervisor delegation puts an LLM in charge of routing:
def route_task(task):
supervisor_prompt = f"""
Given this task: {task}
Available agents:
- refund_agent: handles refund requests
- tracking_agent: handles order tracking
- support_agent: general queries
Which agent should handle this? Return JSON.
"""
decision = llm.call(supervisor_prompt)
return get_agent(decision['agent']).handle(task)
Flexible. Handles edge cases you didn't anticipate. Also: costs tokens on every single request.
When the Supervisor Actually Earns Its Keep
Dynamic delegation makes sense when your input space is genuinely unpredictable. I'm talking:
- Research workflows where you don't know upfront whether a query needs web search, database lookup, or code execution
- Multi-domain customer support where a single ticket might touch billing, technical support, and account management
- Data pipelines where the shape of incoming data determines which transformation agents fire
The key signal: you can't write the if/else tree because you genuinely don't know the patterns yet.
If you're in this camp, supervisor and sub-agents can save you from an unmaintainable mess of routing logic.
When You Should Just Write the Damn If Statement
Here's the uncomfortable truth: most enterprise use cases are way more constrained than we pretend.
Customer service triage? You've got maybe 6–10 intent categories. Document classification? Probably fewer than 20 types. Internal tooling automation? You know exactly what your users are going to ask for because you control the interface.
If you can enumerate the cases in a planning doc, you can hardcode the routing.
Hardcoded routing gives you:
- Predictable costs: no surprise token spikes
- Faster execution: skip the LLM call entirely
- Easier debugging: stack traces beat "the supervisor made a weird choice"
- Simpler observability: you know exactly which code path ran
Start here. Add the supervisor later if you actually need it.
The Hidden Costs Nobody Talks About
Even if dynamic delegation works, it compounds costs in ways that'll bite you:
- Supervisor reasoning: tokens on every request
- Sub-agent context: each agent needs enough context to work, often duplicating the supervisor's input
- Inter-agent communication: if agents collaborate, they're burning tokens talking to each other
- Retry logic: when delegation fails, you pay again
I've seen teams blow their monthly OpenAI budget in a week because their supervisor was re-analysing the same 500-word input on every routing decision.
Observability Is Your Real Problem
Debugging "why did the supervisor send this to the wrong agent?" is miserable. You're spelunking through LLM logs, trying to reconstruct reasoning that wasn't deterministic to begin with.
Hardcoded routing? grep works. Supervisor delegation? You need structured logging, trace IDs, and probably a vector database just to understand what happened.
If you're not already doing observability well, adding a supervisor will hurt.
My Default Recommendation
Start with hardcoded routing. Write the simplest if/elif/else chain that handles your known cases. Add a catch-all that logs unknown inputs.
Then instrument heavily:
if match := extract_intent(task):
logger.info(f"Routing {task_id} to {match.agent}",
intent=match.intent, confidence=match.score)
return route_to(match.agent, task)
else:
logger.warning(f"No route for {task_id}", task=task)
return fallback_agent.handle(task)
Once you've got a month of prod data showing you can't write better rules, then evaluate a supervisor.
And if you're working with a team that specialises in AI automation and software development, they'll probably tell you the same thing: solve the problem you have, not the one that sounds clever.
TL;DR
- Supervisor delegation ≠ always better
- Hardcode first if your domain is constrained
- Dynamic routing pays for itself when inputs are genuinely unpredictable
- Observability and cost control are harder than you think
- Start simple, instrument everything, upgrade if data proves you need it
Your future self (and your token budget) will thank you.
Top comments (0)