In early 2024, I had a problem. My phone repair shop was processing thousands of customer inquiries a month across WhatsApp, phone calls, and walk-ins. My team was drowning in repetitive questions: "Is my phone ready?", "Do you have the screen for iPhone 14?", "Can I book for tomorrow at 5pm?"
Twelve months later, an AI agent named Jacobo was handling ~90% of those interactions autonomously. Customers got instant answers. My team focused on actual repairs. And when I sold the business in early 2025, the agent was a key part of what made it sellable.
Here's how I built it.
The Problem: Three Channels, One Bottleneck
Santifer iRepair had been running for 16 years when I started this project. We'd already automated the back office with Airtable — 12 connected databases handling repairs, inventory, invoicing, the works. But customer communication was still manual.
The pain points:
- WhatsApp: Customers expected instant replies. We couldn't deliver.
- Phone calls: Staff interrupted mid-repair to answer "what's my repair status?"
- Booking: Back-and-forth messages to find a slot that worked.
I needed something that could talk to customers across channels, understand what they wanted, and actually do things — not just generate text.
Architecture: A Router With Specialized Sub-Agents
The breakthrough came when I stopped thinking "chatbot" and started thinking "agent orchestration."
```
              ┌─────────────────────┐
              │  INCOMING REQUEST   │
              │  (Voice/WhatsApp)   │
              └──────────┬──────────┘
                         │
              ┌──────────▼──────────┐
              │     MAIN ROUTER     │
              │ (Intent Classifier) │
              └──────────┬──────────┘
                         │
       ┌─────────────────┼─────────────────┐
       │                 │                 │
┌──────▼───────┐  ┌──────▼───────┐  ┌──────▼───────┐
│ APPOINTMENTS │  │  DISCOUNTS   │  │    ORDERS    │
│  Sub-Agent   │  │  Sub-Agent   │  │  Sub-Agent   │
└──────┬───────┘  └──────┬───────┘  └──────┬───────┘
       │                 │                 │
       └─────────────────┼─────────────────┘
                         │
              ┌──────────▼──────────┐
              │    HITL HANDOFF     │
              │  (When confidence   │
              │   is low or         │
              │   escalation needed)│
              └─────────────────────┘
```
Main Router: Every incoming message hits the router first. It classifies intent and delegates to the right sub-agent via tool calling. No giant monolithic prompt trying to do everything.
Sub-Agents: Each one is laser-focused on a single domain:
- Appointments: Queries available slots from Airtable, handles booking logic, sends confirmation via WhatsApp
- Discounts: Pulls customer history, calculates applicable promos, explains the discount
- Orders: Validates stock against inventory DB, creates the order, sends ETA notification
HITL Handoff: When confidence drops below threshold or the customer explicitly asks for a human, Jacobo escalates — but passes the full conversation context so nobody starts from zero.
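Collapsed into pseudocode, the delegate-or-escalate logic looks roughly like this. It's a minimal Python sketch with hypothetical names — in production the intent label and confidence score came from an LLM classification step inside n8n, not from the caller:

```python
# Minimal sketch of the router's delegate-or-escalate shape. All names are
# hypothetical; in the real system, intent and confidence come from an LLM
# classification step inside an n8n workflow.

SUB_AGENTS = {
    "appointments": lambda msg: f"[appointments] {msg}",
    "discounts": lambda msg: f"[discounts] {msg}",
    "orders": lambda msg: f"[orders] {msg}",
}

def route(message: str, intent: str, confidence: float, threshold: float = 0.7):
    """Delegate to the matching sub-agent, or hand off to a human."""
    if confidence < threshold or intent not in SUB_AGENTS:
        # HITL handoff: the message (and full conversation) travels with it
        return ("human", message)
    return ("agent", SUB_AGENTS[intent](message))
```

The router stays deliberately thin: one decision, then delegation.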
The Stack
| Component | Tool | Why |
|---|---|---|
| LLM | Claude API | Best balance of reasoning + tool use at the time |
| Orchestration | n8n | Visual workflows, easy to debug, self-hosted |
| WhatsApp | WATI | Clean WhatsApp Business API wrapper |
| Voice | ElevenLabs | Natural-sounding Spanish TTS |
| Phone | Aircall | Cloud PBX with good API |
| Backend/DB | Airtable | Already our source of truth for everything |
The key insight: Airtable wasn't just storage — it was the agent's brain. Every sub-agent queried Airtable directly. Customer history, inventory levels, appointment slots — all live data, no sync issues.
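As a rough illustration, a live slot lookup against Airtable's REST API could be built like this. The base ID, table name, and field names are made up for the example, and the real request also needs an `Authorization: Bearer` header with the API key:

```python
# Hedged sketch: building (not sending) an Airtable lookup for free slots.
# Base ID, table name, and field names are illustrative placeholders.

def airtable_slot_query(base_id: str, date_iso: str, service: str) -> dict:
    """Return the URL and query params for a live slot lookup."""
    formula = (
        f"AND({{Date}} = '{date_iso}', "
        f"{{Service}} = '{service}', "
        f"{{Status}} = 'Free')"
    )
    return {
        "url": f"https://api.airtable.com/v0/{base_id}/Appointments",
        "params": {"filterByFormula": formula, "maxRecords": 10},
    }
```

Because every sub-agent hit the same base, a booking made by the agent was instantly visible to staff in the same views they already used.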
Key Technical Decisions (And Why)
1. Tool Calling Over Prompt Stuffing
Early versions tried to cram everything into the system prompt. "Here's how to check inventory, here's how to book appointments, here's our discount rules..."
It was brittle. The model would hallucinate discounts or book non-existent slots.
Tool calling changed everything. Each sub-agent has explicit tools:
```
check_available_slots(date, service_type) → returns actual slots
create_booking(customer_id, slot_id)      → books or fails with reason
calculate_discount(customer_id, service)  → returns applicable promo
```
The model reasons about what to do. The tools handle how. Clean separation.
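For concreteness, here is roughly how those three signatures could be declared in the Claude API's tool-use format (JSON Schema inputs). The descriptions and parameter types are my illustrative guesses, not the production definitions:

```python
# Sketch of the three tool signatures as Claude tool definitions.
# Parameter types and descriptions are illustrative assumptions.

TOOLS = [
    {
        "name": "check_available_slots",
        "description": "Return open appointment slots for a date and service type.",
        "input_schema": {
            "type": "object",
            "properties": {
                "date": {"type": "string", "description": "ISO date, e.g. 2024-03-01"},
                "service_type": {"type": "string"},
            },
            "required": ["date", "service_type"],
        },
    },
    {
        "name": "create_booking",
        "description": "Book a slot for a customer, or fail with a reason.",
        "input_schema": {
            "type": "object",
            "properties": {
                "customer_id": {"type": "string"},
                "slot_id": {"type": "string"},
            },
            "required": ["customer_id", "slot_id"],
        },
    },
    {
        "name": "calculate_discount",
        "description": "Return the promo applicable to a customer and service.",
        "input_schema": {
            "type": "object",
            "properties": {
                "customer_id": {"type": "string"},
                "service": {"type": "string"},
            },
            "required": ["customer_id", "service"],
        },
    },
]
```

These get passed as the `tools` parameter on a request; the model replies with a tool-use block, and the orchestrator executes it and returns the result.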
2. Sub-Agent Specialization Over One Big Agent
A single agent handling appointments, discounts, orders, and general FAQs? That's a recipe for confusion.
Each sub-agent has:
- Its own system prompt (focused, ~200 tokens)
- Its own tool set (only what it needs)
- Its own failure modes (easier to debug)
The router is dumb on purpose. It just classifies and delegates. Complexity lives at the edges.
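That separation can be sketched as a small config object per domain. The structure is hypothetical — in the real system each sub-agent lived as its own n8n workflow — but it captures the idea of a focused prompt plus a minimal tool set:

```python
# Hypothetical per-domain config; each sub-agent carries only its own
# prompt and the tools it actually needs.
from dataclasses import dataclass

@dataclass(frozen=True)
class SubAgent:
    name: str
    system_prompt: str        # focused, roughly ~200 tokens
    tools: tuple[str, ...]    # only what this domain needs

appointments = SubAgent(
    name="appointments",
    system_prompt="You book phone-repair appointments. Ask only for what the tools require.",
    tools=("check_available_slots", "create_booking"),
)
```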
3. Graceful HITL, Not Graceful Degradation
Some AI systems try to "degrade gracefully" — giving worse answers when uncertain. I took a different approach: escalate early, escalate with context.
When Jacobo wasn't confident:
- Customer got a message: "Let me connect you with the team"
- Staff got a Slack notification with full conversation history
- Average human response time: under 2 minutes
The 10% that needed humans got better service than before, because staff had full context.
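Mechanically, the handoff amounts to packaging two messages — one for the customer, one for staff — with the transcript attached (field names here are hypothetical):

```python
# Sketch of the escalation payload; the key design point is that the full
# transcript travels to Slack so staff never start from zero.

def build_handoff(conversation: list, reason: str) -> dict:
    """Package the customer-facing reply and the staff-facing context."""
    transcript = "\n".join(f"{m['role']}: {m['text']}" for m in conversation)
    return {
        "customer_message": "Let me connect you with the team",
        "slack_payload": {"text": f"Escalation ({reason}):\n{transcript}"},
    }
```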
Lessons Learned
Start with the most repetitive task. Appointment booking was 40% of all inquiries. Automating that alone bought us massive breathing room.
Your database is your agent's memory. Don't build a separate "AI database." Query what you already have. Airtable's API was fast enough for real-time lookups.
Tool calling > RAG for transactional tasks. RAG is great for knowledge retrieval. But when you need to do things — book, order, check status — tool calling is the architecture.
Measure deflection rate, not just accuracy. "Did the agent answer correctly?" matters less than "Did the customer get what they needed without human help?" We tracked both.
The Outcome
After 12 months in production:
- ~90% of customer interactions handled without human intervention
- Staff spent 70% more time on actual repairs
- Customer satisfaction stayed flat (no degradation — that was the goal)
- The system became a selling point when I exited the business
What I'd Do Differently
Voice was harder than expected. ElevenLabs sounds great, but latency in the voice → transcription → LLM → TTS loop was noticeable. I'd explore tighter integrations if rebuilding today.
More observability earlier. I added proper logging and trace monitoring late in the project. Should've been day one.
Simpler discount logic. The discount sub-agent had too many edge cases baked into the prompt. Should've moved more logic into deterministic code and kept the LLM for natural language understanding only.
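As a hedged sketch of what that deterministic code could look like: the LLM only extracts structured facts (visit count, service) from the conversation, and fixed rules do the rest. The rule values below are illustrative, not the shop's actual promos:

```python
# Illustrative deterministic discount rules; the LLM's only job is to
# extract (customer_visits, service) from natural language.

def applicable_discount(customer_visits: int, service: str) -> int:
    """Return a discount percentage from fixed, testable rules."""
    if customer_visits >= 5:
        return 15  # loyalty tier
    if service == "screen_replacement" and customer_visits >= 2:
        return 10  # repeat screen-repair promo
    return 0
```

Rules like these are unit-testable, auditable, and impossible to hallucinate — everything a prompt-embedded policy is not.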
Building Jacobo taught me that AI agents aren't magic — they're systems engineering with an LLM in the middle. The LLM handles the messy human language part. Everything else is APIs, databases, and good old-fashioned software architecture.
The 90% automation wasn't because the AI was brilliant. It was because we picked the right problems, built the right tools, and knew when to hand off to humans.
I'm currently open to AI Product Manager and Forward Deployed Engineer roles. Check my portfolio at santifer.io.