FARHAN HABIB FARAZ
The Voice AI Prompt That Was Too Human (And Why Robot Instructions Worked Better)

I spent 3 weeks writing the perfect conversational prompt. 18,000 words. Every scenario covered. Every response scripted like a movie dialogue. The system rejected it. Not because it was wrong. Because it was too much like how humans write, not how AI processes.

I’d written it like a play script:

USER says: “Assalamualaikum”
BOT responds: “Walaikum Assalam, how can I help you today?”
USER says: “I want to book an appointment”
BOT responds: “Sure! What date works for you?”

Eighteen thousand words of this. Every greeting. Every question. Every possible response path. I was so proud. This covered everything. The voice AI system loaded it… and crashed. Every single time.

Turns out, the quotation marks were killing it. Every time I wrote USER says: “Hello”, the server tried to parse those quotation marks as string delimiters. It thought I was ending commands. It got confused about what was instruction versus what was dialogue. But that wasn’t the only problem. The real issue was that I’d written instructions for a human actor, not an AI system. I was telling a story about how conversations should go. The AI needed rules about how to behave.
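You can reproduce this class of failure with any quote-aware tokenizer. Here Python’s `shlex` stands in for whatever parser the real system used (the post doesn’t name it), purely as an illustration of why dialogue-style quoting is fragile:

```python
import shlex

# Illustrative only: shlex stands in for whatever quote-aware parser
# the real voice AI system used.
line = 'USER says: "Hello" BOT responds: "Hi there!"'
tokens = shlex.split(line)
print(tokens)
# The quotes were consumed as string delimiters, so the parser can no
# longer tell which words were instruction and which were dialogue.

# Worse: a single unbalanced quote anywhere in 18,000 words aborts parsing.
try:
    shlex.split('USER says: "Hello')
except ValueError as err:
    print("parser error:", err)  # No closing quotation
```

Once the quotes are stripped during tokenization, the structural distinction the author relied on is gone, and any stray quote turns into a hard crash.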

I rewrote everything. Threw out the dialogue format. Used directives instead:

When caller says: Assalamualaikum
Action: Greet in Bangla
Response: ওয়ালাইকুম আসসালাম

When caller mentions: appointment
Action: Initiate booking flow
Next step: Ask for preferred date

No quotation marks. No dialogue. Just trigger, action, response. The difference was philosophical. Old way: “Here’s how a conversation looks.” New way: “Here’s what to do when X happens.” AI doesn’t need to see conversations. It needs to know: if this, then that.
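The trigger, action, response idea can be sketched as a tiny keyword dispatcher. Everything below (the `RULES` structure, the `dispatch` function, the sample entries) is a hypothetical illustration of the pattern, not the actual prompt format the production system consumed:

```python
# Hypothetical sketch of trigger -> action -> response dispatch;
# rule names and structure are illustrative, not the production format.
RULES = [
    {"trigger": "assalamualaikum", "action": "greet_bangla",
     "response": "ওয়ালাইকুম আসসালাম"},
    {"trigger": "appointment", "action": "start_booking",
     "response": "What date works for you?"},
]

def dispatch(utterance):
    """Return the first rule whose trigger keyword appears in the utterance."""
    text = utterance.lower()
    for rule in RULES:
        if rule["trigger"] in text:
            return rule
    return None  # no match: fall through to a default handler

print(dispatch("Assalamualaikum!")["response"])  # ওয়ালাইকুম আসসালাম
```

Note there is nothing to interpret here: no dialogue to parse, no pattern to infer. The condition either matches or it doesn’t.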

The original prompt was 18,000 words. The new prompt was 3,000 words. I cut 83 percent of the content. And somehow, it worked better. I stopped giving examples and started giving patterns. Old approach: show 20 examples of greetings. New approach: one rule that covers all greetings. Detect greeting keywords. Respond with a Bangla greeting. Use natural variation. One rule replaced twenty examples. I did this everywhere. Consolidated. Merged. Turned specific cases into general patterns. The AI didn’t need to see every scenario played out. It needed to understand the underlying logic.

When you write USER: “I need help” BOT: “Of course! What can I help you with?”, the AI has to parse the dialogue structure, extract patterns, infer timing, and recreate conversational flow. That’s a lot of processing. When you write Trigger: Help request detected. Response: Acknowledge and ask for specifics, the AI just does it. No interpretation needed.

Instead of writing 15 different ways to say “thank you,” I created a variation library. Acknowledgment responses: ধন্যবাদ, কৃতজ্ঞতা, অনেক ভালো, চমৎকার. Understanding responses: বুঝতে পারছি, জানি, বুঝলাম, ঠিক আছে. Confirmation responses: হ্যাঁ, অবশ্যই, ঠিক আছে, নিশ্চিত. One section. Hundreds of dialogue examples gone. The AI now knows that when it acknowledges, it should pick from this list.
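A variation library is just a category-to-phrases map with a random pick. This is a minimal sketch; the category names are my own labels for the Bangla lists quoted above, not identifiers from the actual prompt:

```python
import random

# Sketch of a variation library: one category per response intent.
# Category names are illustrative labels, not the original prompt's.
VARIATIONS = {
    "acknowledge": ["ধন্যবাদ", "কৃতজ্ঞতা", "অনেক ভালো", "চমৎকার"],
    "understand": ["বুঝতে পারছি", "জানি", "বুঝলাম", "ঠিক আছে"],
    "confirm": ["হ্যাঁ", "অবশ্যই", "ঠিক আছে", "নিশ্চিত"],
}

def pick(category):
    """One rule replaces dozens of scripted replies: choose any variation."""
    return random.choice(VARIATIONS[category])

print(pick("acknowledge"))  # any of the four acknowledgment phrases
```

The random pick is what gives you “natural variation” without scripting fifteen separate dialogue turns per intent.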

The old prompt style was descriptive: “The bot should respond politely when the customer asks for help.” The new style was imperative: “Action: Respond politely. Response: exact text.” AI systems don’t need “should.” They need commands. “Should” requires interpretation; an “Action” executes immediately.

The voice AI that kept crashing started working immediately. Processing time per call dropped from 6 seconds to under 2. The AI wasn’t reading a novel anymore. It was executing a decision tree. Old format: read 18,000 words, find an example, extract a pattern, generate a response. New format: keyword detected, rule triggered, response executed. Speed improved dramatically. Consistency improved even more. Dialogue examples caused blended responses. Directive rules removed ambiguity.

I tested both versions with the same 50 sample calls. The old version was slower, inconsistent, and crashed three times. The new version was fast, consistent, and never crashed. It sounded slightly more systematic, but customers didn’t care. They cared about fast, correct answers.

AI doesn’t think like humans write. Humans write narratively. AI processes structurally. Writing for AI isn’t creative writing. It’s technical specification. Examples are expensive. Every example adds tokens, processing time, and ambiguity. One good rule beats ten examples. Quotation marks are dangerous in systems that parse text programmatically. Use triggers instead. Compression forces clarity. Cutting 15,000 words showed me how much I was over-explaining.

If I had to compress this into a formula: avoid quotation marks, use trigger-action rules, replace examples with patterns, use commands instead of descriptions, prefer structure over narrative, and use keyword mapping instead of dialogue simulation.

If you’re building chatbots, voice AI, IVR systems, or customer service bots, stop writing conversation scripts. Start writing behavior rules. Your prompt isn’t a screenplay. It’s an instruction manual. AI doesn’t need to see conversations. It needs to know what to do when something happens.

I didn’t make the prompt better by adding more examples. I made it better by removing them. Before, the AI had to read and interpret. After, it could execute directly. That’s the difference between a prompt that struggles and a prompt that flies.

Are your prompts written for humans to read or for AI to execute? How much of your prompt is examples versus rules? Have you ever had a prompt work better after cutting it down?

Written by Faraz Farhan, Senior Prompt Engineer and Team Lead at PowerInAI. Building AI prompts that AI can actually process.
Tags: promptengineering, voiceai, optimization, systemdesign, conversationalai, efficiency
