What I learned building an AI receptionist that actually books the job (not just answers the phone)

#ai #startup #smallbusiness #buildinpublic

We built Robin, an AI receptionist for local service businesses (contractors, salons, dental practices, auto shops), and the biggest lesson from shipping it wasn't about the LLM. It was about the last 10 feet: what happens the moment after the AI understands what the caller wants.

The easy 80%

Getting an LLM to sound natural on a phone call or SMS thread is largely a solved problem now. Voice models are good, latency is low enough, and prompt engineering for "sound like a helpful front desk person, not a robot" is a well-trodden path. That part took us maybe 20% of total build time.

The hard 20% that actually matters

The parts that took real engineering:

Real calendar awareness. Not "the AI thinks it booked something." Robin checks actual availability, actual business hours (including the owner's real schedule, holidays, and buffer time), and writes to a real calendar. If it can't verify a slot is open, it doesn't pretend to book it.
Confirmation loops that don't feel like a form. A human customer texting back and forth expects a conversation, not a wizard. But you still need structured data (name, service, time, phone number) out the other end. We landed on parsing structured intent out of natural language turns rather than forcing rigid steps.
Graceful failure. When Robin genuinely doesn't know something (custom pricing, a weird edge case, an angry customer), it needs to say "let me have the owner call you back" instead of confidently making something up. Confident wrong answers are worse than an admission of not knowing, especially when a $97/month product is standing in for a $3,000/month human employee.
Owning the business's voice. Every business sounds a little different. A dental office and an HVAC contractor should not have Robin talking the same way. This ended up being less about model choice and more about a structured "business profile" layer that gets injected into every conversation.
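
To make the first and third points concrete, here is a minimal Python sketch of "verify before you book, and hand off when you can't." This is not Robin's actual code. BusinessProfile, Calendar, try_book, and the message strings are all hypothetical names made up for illustration, and the in-memory calendar stands in for whatever real calendar backend you use (which I'm assuming would raise ConnectionError when it can't be reached).

```python
from dataclasses import dataclass, field
from datetime import date, datetime, time, timedelta

CALLBACK = "I can't confirm that right now. Let me have the owner call you back."


@dataclass
class BusinessProfile:
    # Hypothetical per-business settings; field names are illustrative.
    opens: time
    closes: time
    buffer: timedelta
    holidays: set[date] = field(default_factory=set)


@dataclass
class Calendar:
    # In-memory stand-in for a real calendar backend.
    bookings: list[tuple[datetime, datetime]] = field(default_factory=list)

    def is_free(self, start: datetime, end: datetime, buffer: timedelta) -> bool:
        # Free only if the slot, padded by the buffer, overlaps no booking.
        return all(
            end + buffer <= b_start or start - buffer >= b_end
            for b_start, b_end in self.bookings
        )

    def write(self, start: datetime, end: datetime) -> None:
        self.bookings.append((start, end))


@dataclass
class BookingResult:
    booked: bool
    message: str


def try_book(profile: BusinessProfile, calendar: Calendar,
             start: datetime, duration: timedelta) -> BookingResult:
    end = start + duration
    # Check business rules first: holidays and opening hours.
    if start.date() in profile.holidays:
        return BookingResult(False, "We're closed that day. Want another date?")
    same_day = start.date() == end.date()
    if not (same_day and profile.opens <= start.time() and end.time() <= profile.closes):
        return BookingResult(False, "That's outside business hours. Want another time?")
    try:
        # Verify against the calendar itself, then write only if the slot is open.
        if not calendar.is_free(start, end, profile.buffer):
            return BookingResult(False, "That slot is taken. Want another time?")
        calendar.write(start, end)
    except ConnectionError:
        # Availability could not be verified, so do not claim a booking.
        return BookingResult(False, CALLBACK)
    return BookingResult(True, f"You're booked for {start:%A at %I:%M %p}.")
```

The design choice worth noticing: the only path that returns booked=True is the one where the availability check passed and the write went through. Every other path either offers another time or hands off to the owner.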

Why this matters beyond our use case

If you're building any AI agent meant to replace or augment a human role (not just answer questions), the lesson generalizes: the model is not your product. The scaffolding around the model (state management, verification against ground truth, and honest failure modes) is where almost all of the real engineering effort goes, and it's the part most demos skip.
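
For the state management piece, here is a small Python sketch of one way to collect the four booking fields mentioned earlier (name, service, time, phone number) across free-form turns instead of rigid steps. It assumes some upstream extractor (the model, in practice) returns a dict of whatever fields it found in each message. That extractor is not shown, and BookingState, merge, and missing are hypothetical names, not our production code.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class BookingState:
    # The structured data that has to come out the other end of the chat.
    name: Optional[str] = None
    service: Optional[str] = None
    time: Optional[str] = None
    phone: Optional[str] = None

    def merge(self, extracted: dict) -> None:
        # Keep only known fields with non-empty values.
        # A later turn may overwrite an earlier one (the caller changed their mind).
        known = {f.name for f in fields(self)}
        for key, value in extracted.items():
            if key in known and value:
                setattr(self, key, value)

    def missing(self) -> list[str]:
        # Fields the conversation still needs before a booking can be attempted.
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


state = BookingState()
# Turn 1: "Can I get an AC tune-up Tuesday at 10?"
state.merge({"service": "AC tune-up", "time": "Tuesday 10am"})
# Turn 2: "It's Dana, by the way." (unknown keys are ignored)
state.merge({"name": "Dana", "address": "ignored"})
print(state.missing())  # ['phone']
```

The caller can give details in any order, and the agent only asks about whatever missing() still returns, which keeps the exchange feeling like a conversation while the output stays structured.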

We put up a live demo (no signup required) if anyone wants to poke at how it handles a real scheduling conversation: clawvr.com/robin

Happy to go deeper into any of the above in the comments if useful.