Leonidas Williamson

Posted on Mar 23

I Replaced 3 Hours of Prompt Engineering with a 50ms Free API Call

#ai #llm #promptengineering #mcp

The secret? Stop treating prompts like strings. Start treating them like compiled code.

Last month I was building a customer support agent for a client. Standard stuff: Claude-powered, MCP-connected, deployed on their Slack workspace.

The agent took me 2 hours to build.

The prompt took me 3 days.

Every time I thought I had it right, the agent would:

Lose its persona after 4-5 messages
Forget constraints I explicitly defined
Respond generically when it should have escalated
Behave differently on Slack vs. their web widget

Sound familiar?

I was about to start my fourth rewrite when I had the idea for ASK LEONIDAS.

Sixty seconds later, I had a production-ready prompt that solved every single one of those problems.

The Real Problem: Prompts Aren't Code (But They Should Be)

Here's what nobody tells you about prompt engineering:

You're not writing instructions. You're writing a specification.

And specifications need structure. They need modularity. They need to handle edge cases.

Most developers write prompts like this:

You are a helpful customer support agent for Acme Corp. 
Be friendly and professional. Help users with their questions.
Escalate complex issues to human support.

That's not a spec. That's a wish list.

What happens when:

The user asks something outside your product domain?
The conversation goes past 10 messages?
The agent needs to switch from support to sales?
The same prompt runs on WhatsApp vs. Discord?

Your "prompt" has no answer. So the model guesses. And guesses produce inconsistent behavior.

The LEONIDAS Framework: 8 Pillars, 0 Guesswork

The LEONIDAS framework treats prompts like what they actually are: behavioral specifications.

Every prompt gets 8 distinct components:

| Pillar | What It Defines |
| --- | --- |
| L – Persona | Identity, expertise, voice, decision-making style |
| E – Objective | The singular mission every message should advance |
| O – Tone & Format | Channel-specific formatting rules |
| N – Constraints | Hard limits (what the agent must never do) |
| I – Business Logic | Decision trees, qualification criteria, escalation rules |
| D – Structure | Conversation flow patterns |
| A – Human Behavior | Psychology and relationship dynamics |
| S – Multipurpose | Platform adaptations (Slack ≠ WhatsApp ≠ Web) |

This isn't a prompting technique. It's prompt architecture.
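If it helps to see "prompt architecture" as a data structure, here is a minimal sketch of the eight pillars as a typed record. `LeonidasSpec`, its field names, and `render` are my own illustrative names, not part of the ASK LEONIDAS API; the bracketed section headers follow the example output later in this post.

```python
from dataclasses import dataclass, fields

# Section headers in pillar order, mirroring the "[X – NAME]" labels
# used in the example output in this post.
PILLAR_HEADERS = {
    "persona": "L – PERSONA",
    "objective": "E – OBJECTIVE",
    "tone_format": "O – TONE & FORMAT",
    "constraints": "N – CONSTRAINTS",
    "business_logic": "I – BUSINESS LOGIC",
    "structure": "D – STRUCTURE",
    "human_behavior": "A – HUMAN BEHAVIOR",
    "multipurpose": "S – MULTIPURPOSE",
}


@dataclass(frozen=True)
class LeonidasSpec:
    """Hypothetical container: one string per pillar."""
    persona: str
    objective: str
    tone_format: str
    constraints: str
    business_logic: str
    structure: str
    human_behavior: str
    multipurpose: str

    def render(self) -> str:
        """Join the eight sections into one prompt, refusing empty pillars."""
        blocks = []
        for f in fields(self):
            text = getattr(self, f.name).strip()
            if not text:
                raise ValueError(f"pillar '{f.name}' is empty")
            blocks.append(f"[{PILLAR_HEADERS[f.name]}]\n{text}")
        return "\n\n".join(blocks)
```

The point of the record is that a missing pillar becomes an error at build time instead of a gap the model has to guess around.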

The Part That Made Me Actually Use It: MCP Integration

Here's where it gets interesting for developers.

ASK LEONIDAS exposes its prompt engine as an MCP server. That means you can call it directly from:

Claude Desktop (native tool)
Cursor / Windsurf (IDE integration)
Any MCP-compatible agent (via HTTP)

No LLM in the loop. Pure logic. Sub-50ms response times.
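For the "any MCP-compatible agent (via HTTP)" case, here is a rough sketch of what a tool call looks like on the wire. MCP tool calls are JSON-RPC 2.0 messages with the method `tools/call`, and the Streamable HTTP transport expects the `Accept` header shown. The `description` argument name is an assumption on my part (check the tool's real input schema via `tools/list`), and the `initialize` handshake that must precede any call is left out. The URL is the one from the Claude Desktop config in this post.

```python
import json

# Endpoint taken from the Claude Desktop config in this post.
MCP_URL = "https://www.askleonidas.com/api/mcp/stream"


def build_tool_call(tool: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request body as defined by MCP."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


def build_headers(api_key: str) -> dict:
    """Headers for an MCP Streamable HTTP POST with bearer auth."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }


# "description" is a hypothetical argument name, used for illustration only.
payload = build_tool_call(
    "generate_prompt",
    {"description": "B2B sales agent that qualifies leads via WhatsApp"},
)
print(json.dumps(payload, indent=2))
```

In practice an MCP client library handles the handshake and session for you; the sketch is only meant to show that there is nothing exotic in the request itself.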

Quick Setup (60 seconds)

Get your free API key at askleonidas.com/dev
Add this to your Claude Desktop config file (claude_desktop_config.json):

{
  "mcpServers": {
    "ask-leonidas": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://www.askleonidas.com/api/mcp/stream",
        "--header",
        "Authorization: Bearer YOUR_API_KEY"
      ]
    }
  }
}

Restart Claude Desktop
Use it:

Use the generate_prompt tool to create a LEONIDAS prompt 
for a B2B sales agent that qualifies leads via WhatsApp 
and books demo calls.

That's it. You now have structured prompt generation inside Claude Desktop, and the same server entry works in MCP-capable IDEs like Cursor.
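If you already have other MCP servers configured, pasting JSON by hand is an easy way to clobber them. Here is a small sketch that merges the entry above into an existing config instead. The function names are mine, and the path is the macOS default location; on Windows the file lives under `%APPDATA%\Claude`.

```python
import json
from pathlib import Path

# macOS default location of the Claude Desktop config file.
CONFIG_PATH = (
    Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"
)


def with_leonidas_server(config: dict, api_key: str) -> dict:
    """Return a copy of `config` with the ask-leonidas entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep existing servers
    servers["ask-leonidas"] = {
        "command": "npx",
        "args": [
            "-y",
            "mcp-remote",
            "https://www.askleonidas.com/api/mcp/stream",
            "--header",
            f"Authorization: Bearer {api_key}",
        ],
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, api_key: str) -> None:
    """Read the config (if present), merge the entry, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(with_leonidas_server(existing, api_key), indent=2))
```

Calling `update_config_file(CONFIG_PATH, "YOUR_API_KEY")` and then restarting Claude Desktop should leave you in the same state as the manual steps.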

Available Tools

| Tool | Purpose |
| --- | --- |
| `generate_prompt` | Run the LEONIDAS engine → structured prompt in <50ms |
| `list_templates` | Browse the full public template library |
| `get_template` | Fetch a specific template by ID |
| `get_template_by_slug` | Fetch by URL slug |

Free tier: 500 calls/day. No credit card.
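If you call the API from scripts or CI, a client-side counter keeps you from burning through the daily allowance by accident. This is only a sketch: `DailyQuota` is a hypothetical helper, and resetting on the local calendar date is my assumption, since I don't know when the server-side counter actually resets.

```python
from datetime import date
from typing import Optional


class DailyQuota:
    """Client-side guard for a per-day call budget (500 on the free tier)."""

    def __init__(self, limit: int = 500):
        self.limit = limit
        self._day: Optional[date] = None
        self._used = 0

    def try_acquire(self, today: Optional[date] = None) -> bool:
        """Return True and count the call if budget remains for `today`."""
        today = today or date.today()
        if today != self._day:
            # New day: reset the counter (assumed reset rule).
            self._day, self._used = today, 0
        if self._used >= self.limit:
            return False
        self._used += 1
        return True
```

Check `try_acquire()` before each request and skip or queue the call when it returns `False`.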

Before & After: What This Actually Looks Like

Let me show you a real comparison.

❌ Before: What I Was Writing

You are a helpful sales assistant. Qualify leads by asking 
about their budget, timeline, and decision-making authority. 
If they're qualified, offer to book a demo call. Be professional 
but friendly.

Result: Agent asks the same questions in the same order every time. Sounds robotic. Goes off-script when users ask unexpected questions. Forgets persona after extended conversations.

✅ After: LEONIDAS Output

[L – PERSONA]
You are Marcus, a senior B2B sales consultant with 12 years 
of experience closing enterprise deals. You've personally 
qualified 50,000+ leads and have a 28% SQL-to-opportunity 
conversion rate. You specialize in consultative qualification —
never pushy, always value-first.

[E – OBJECTIVE]
Your sole mission is to qualify inbound leads within 5 exchanges 
and either (a) route qualified leads to a demo call, or 
(b) add unqualified leads to a nurture sequence with appropriate 
tags. Every message must move toward one of these two outcomes.

[O – TONE & FORMAT]
Conversational and curious — never robotic. WhatsApp messages 
must be under 3 sentences. Use line breaks, not paragraphs. 
Ask one question at a time. Never use bullet points.

[N – CONSTRAINTS]
Do not ask about budget directly in the first 2 messages. 
Do not send more than 2 consecutive messages without a response. 
If a lead explicitly says "not interested," acknowledge and exit 
gracefully — do not push.

[I – BUSINESS LOGIC]
Qualification criteria: budget >$10K/month, decision-maker or 
direct influencer, timeline <90 days, active pain point. 
If 3 of 4 criteria are met, route to demo booking. 
If fewer than 3, add to nurture with tag: "warm-not-ready".

[D – STRUCTURE]
Conversation flow: Warm greeting → Context discovery (1 question) → 
Pain validation → Urgency/timeline probe → Budget signal → 
Authority check → Routing decision.

[A – HUMAN BEHAVIOR]
People resist being qualified. Frame every question as helping 
them — "To make sure I connect you with the right resource..." 
Mirror their communication style. Fast responders can handle 
2 questions per message; slow responders need 1.

[S – MULTIPURPOSE]
Optimized for WhatsApp. For Telegram, increase message length 
by 20%. For web chat, enable markdown. For voice, add filler 
phrases for natural pacing.

Result: Agent maintains persona across 20+ message conversations. Adapts to user communication style. Knows exactly when to escalate. Behaves consistently across channels.
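What makes the [I – BUSINESS LOGIC] section different from "qualify leads" is that it is precise enough to write as code. A sketch of the routing rule, reading "3 of 4 criteria" as "at least 3" (my interpretation); `Lead`, its field names, and the `book_demo`/`nurture` action labels are illustrative, not anything the tool emits:

```python
from dataclasses import dataclass


@dataclass
class Lead:
    """Signals gathered during the conversation (hypothetical shape)."""
    monthly_budget_usd: float
    is_decision_maker_or_influencer: bool
    timeline_days: int
    has_active_pain_point: bool


def criteria_met(lead: Lead) -> int:
    """Count how many of the four qualification criteria the lead meets."""
    checks = [
        lead.monthly_budget_usd > 10_000,      # budget > $10K/month
        lead.is_decision_maker_or_influencer,  # authority
        lead.timeline_days < 90,               # timeline < 90 days
        lead.has_active_pain_point,            # active pain point
    ]
    return sum(checks)


def route(lead: Lead) -> dict:
    """Apply the rule: 3 or more criteria books a demo, otherwise nurture."""
    if criteria_met(lead) >= 3:
        return {"action": "book_demo", "tags": []}
    return {"action": "nurture", "tags": ["warm-not-ready"]}
```

If you can't write a section of your prompt as a function like this, the model probably can't follow it consistently either.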

Why This Matters for the Agentic Future

We're building toward an agentic economy where AI agents don't just answer questions — they act. They book appointments. They qualify leads. They manage tickets. They coordinate with other agents.

This requires a new trust infrastructure. Your agents need:

Clear identity — verifiable persona and capabilities
Defined behavior — constraints that actually constrain
Auditability — decisions that can be traced back to rules
Cross-platform consistency — same agent, same behavior, any channel

The LEONIDAS framework isn't just about better prompts. It's about agent specifications that can be versioned, tested, and trusted.
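To make "tested" concrete: the WhatsApp rules from the [O – TONE & FORMAT] example above can be checked against real agent replies. This is a deliberately naive sketch. The sentence splitter will miscount abbreviations like "e.g.", I read "under 3 sentences" as "at most 2", and `violations` is a hypothetical helper name.

```python
import re


def count_sentences(message: str) -> int:
    """Naive count: split on ., ! or ? followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", message.strip())
    return len([p for p in parts if p.strip()])


def violations(message: str) -> list[str]:
    """Return the format rules a WhatsApp reply breaks (empty list if none)."""
    problems = []
    if count_sentences(message) >= 3:
        problems.append("3 or more sentences")
    if message.count("?") > 1:
        problems.append("more than one question")
    # Lines starting with -, *, a bullet character, or "1." count as bullets.
    if re.search(r"^\s*(?:[-*•]|\d+\.)\s+", message, flags=re.MULTILINE):
        problems.append("bullet points")
    return problems
```

Run a check like this over logged conversations and a constraint stops being a hope and becomes a regression test.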

If you're building production agents, you should be following the emerging work on agent trust and verification. The standards are evolving fast, and prompt architecture is a foundational piece.

My New Workflow

Here's how I build agents now:

Define the problem — What should this agent actually do?
Generate the prompt — MCP call to LEONIDAS, 50ms
Review & tune — Adjust constraints and business logic
Deploy — Same prompt works across all channels
Iterate — Use the framework structure to make targeted changes
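The "targeted changes" step is where the structure pays off: because every pillar has its own header, you can swap one section and leave the rest untouched. A sketch of that, assuming the prompt uses the "[X – NAME]" headers from the example above. `replace_section` and `spec_version` are hypothetical helpers, and the content hash is just one simple way to tell prompt revisions apart.

```python
import hashlib
import re

# Matches headers such as "[N – CONSTRAINTS]" or "[O – TONE & FORMAT]".
HEADER = re.compile(r"^\[([A-Z]) – ([A-Z &]+)\]$", re.MULTILINE)


def replace_section(prompt: str, letter: str, new_body: str) -> str:
    """Replace the body of one pillar, leaving every other section as-is."""
    matches = list(HEADER.finditer(prompt))
    for i, m in enumerate(matches):
        if m.group(1) == letter:
            end = matches[i + 1].start() if i + 1 < len(matches) else len(prompt)
            tail = prompt[end:]
            separator = "\n\n" if tail else ""
            return prompt[: m.end()] + "\n" + new_body.strip() + separator + tail
    raise KeyError(f"no section for pillar '{letter}'")


def spec_version(prompt: str) -> str:
    """Short content hash, usable as a version label for a prompt revision."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:12]
```

For example, tightening only the constraints is `replace_section(prompt, "N", "Never discuss pricing.")`, and comparing `spec_version` before and after tells you which revision is deployed.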

Time from "idea" to "deployed agent": under an hour.

Prompt rewrites due to persona drift: zero.

TL;DR

Prompts aren't strings — they're behavioral specifications
The LEONIDAS framework provides an 8-pillar structure that eliminates guesswork
It's available as an MCP server: 50ms responses, 500 free calls/day
Works with Claude Desktop, Cursor, and any MCP-compatible agent
This is infrastructure for the agentic economy, not just a prompting trick

Try It

Web wizard: askleonidas.com/wizard

Developer API: askleonidas.com/dev

Example prompts: askleonidas.com/examples

Free. No signup for the wizard. 500 API calls/day on the free tier.

If you're still writing prompts by hand in 2026, you're hand-writing assembly while the rest of us use a compiler.

What prompt frameworks are you using? I'm especially curious about how folks are handling multi-agent coordination — drop your approach in the comments.

Tags: #ai #llm #mcp #claude #promptengineering #agents #devtools
