In Part 1, I talked about why building conversational apps is still messy.
Now let’s get into the part that actually matters:
How do these systems really work under the hood?
If a user sends:
“I want 2 fried rice”
How does that message turn into a real backend action?
Not just a response — but something like:
- creating an order
- updating a database
- returning a real result
To understand that, we need to look at the core building blocks behind modern conversational systems.
The Core Idea
At a high level, the architecture looks like this:
User (Telegram / Slack / WhatsApp)
↓
Agent
↓
Tool
↓
Backend API
↓
Database
Each piece has a very specific role.
1. Agent — The Brain
An agent is the AI layer that understands user messages and decides what to do.
It:
- reads the user’s message
- interprets intent
- decides whether to respond directly or call a tool
You can think of it as the decision-maker.
But here’s the important part:
The agent should not contain your business logic.
It shouldn’t calculate totals, write database queries, or enforce rules.
That belongs in your backend.
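To make the agent's role concrete, here is a minimal TypeScript sketch of the decision it produces. Everything in it is hypothetical: the `AgentDecision` type, the `decide` function, and the keyword rule that stands in for a real language model. The point is only the shape of the output: either a direct reply or a request to call a named tool, with no pricing or database logic inside the agent.

```typescript
// Hypothetical types for illustration; not taken from any SDK.
type AgentDecision =
  | { kind: "reply"; text: string }
  | { kind: "tool_call"; tool: string; args: Record<string, unknown> };

// Stand-in for a language model: a real agent would infer intent with an LLM.
// Note what is absent: no prices, no SQL, no business rules.
function decide(message: string): AgentDecision {
  const match = message.match(/(\d+)\s+(.+)/);
  if (/\bwant\b|\border\b/i.test(message) && match) {
    return {
      kind: "tool_call",
      tool: "create_order",
      args: { items: [{ name: match[2].trim(), quantity: Number(match[1]) }] },
    };
  }
  return { kind: "reply", text: "What would you like to order?" };
}

const decision = decide("I want 2 fried rice");
console.log(JSON.stringify(decision));
```

The agent only names the tool and passes arguments. Whether fried rice exists on the menu, and what it costs, is decided elsewhere.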
2. Tool — The Bridge to Your Backend
A tool is just a function your agent can call.
For example:
- get_menu
- create_order
- track_order
Instead of the agent guessing or hallucinating results, it calls a tool like:
create_order({
  items: [...]
})
That tool then runs real backend logic.
This is the most important concept:
Tools are the safest way for agents to interact with real systems, because every action runs through code you control.
They:
- validate input
- execute real logic
- return structured results
No guessing. No hallucination.
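As a rough illustration of those three responsibilities, the sketch below treats a tool as a plain function. The names `createOrderTool`, `ToolResult`, and `placeOrder`, along with the price, are assumptions made up for this example; a real SDK will have its own way of declaring tools.

```typescript
// Hypothetical shapes for illustration; a real SDK will define its own.
type OrderItem = { name: string; quantity: number };
type ToolResult =
  | { ok: true; orderId: number; total: number }
  | { ok: false; error: string };

// Stand-in for the existing backend; in practice this is your service or API call.
function placeOrder(items: OrderItem[]): { orderId: number; total: number } {
  const prices: Record<string, number> = { "fried rice": 8 }; // made-up price
  const total = items.reduce((sum, i) => sum + (prices[i.name] ?? 0) * i.quantity, 0);
  return { orderId: 1, total };
}

// The tool: validate untrusted agent input, run real logic, return structured data.
function createOrderTool(input: unknown): ToolResult {
  const items = (input as { items?: unknown })?.items;
  if (!Array.isArray(items) || items.length === 0) {
    return { ok: false, error: "items must be a non-empty array" };
  }
  const valid: OrderItem[] = [];
  for (const raw of items) {
    const { name, quantity } = (raw ?? {}) as Partial<OrderItem>;
    if (typeof name !== "string" || typeof quantity !== "number" ||
        !Number.isInteger(quantity) || quantity <= 0) {
      return { ok: false, error: "each item needs a name and a positive integer quantity" };
    }
    valid.push({ name, quantity });
  }
  const { orderId, total } = placeOrder(valid);
  return { ok: true, orderId, total };
}

console.log(createOrderTool({ items: [{ name: "fried rice", quantity: 2 }] }));
console.log(createOrderTool({ items: [] }));
```

Treating the input as `unknown` is deliberate: whatever the agent sends is checked before it reaches real logic, and failures come back as data the agent can explain to the user.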
3. Backend — The Source of Truth
Your backend still does everything it normally does:
- database queries
- validation
- business rules
- calculations
Nothing changes here.
The agent doesn’t replace your backend.
It just sits on top of it.
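One way to picture "sits on top" is a single backend function with two thin callers. The sketch below uses an in-memory array in place of PostgreSQL and a plain function in place of an HTTP route; `MENU`, `createOrder`, `httpHandler`, and `createOrderTool` are all hypothetical names with made-up prices.

```typescript
// Hypothetical in-memory backend standing in for a real API and database.
type Item = { name: string; quantity: number };
type Order = { id: number; items: Item[]; total: number };

const MENU: Record<string, number> = { "fried rice": 8, "spring rolls": 5 }; // made-up prices
const orders: Order[] = [];

// Business rules and calculations live here, and only here.
function createOrder(items: Item[]): Order {
  let total = 0;
  for (const item of items) {
    const price = MENU[item.name];
    if (price === undefined) throw new Error(`Unknown menu item: ${item.name}`);
    total += price * item.quantity;
  }
  const order: Order = { id: orders.length + 1, items, total };
  orders.push(order);
  return order;
}

// Two thin callers share the same logic: an HTTP-style handler and an agent tool.
const httpHandler = (body: { items: Item[] }) => ({ status: 201, json: createOrder(body.items) });
const createOrderTool = (args: { items: Item[] }) => createOrder(args.items);

console.log(httpHandler({ items: [{ name: "fried rice", quantity: 2 }] }));
console.log(createOrderTool({ items: [{ name: "spring rolls", quantity: 1 }] }));
```

Because both callers go through `createOrder`, an order placed from chat follows the same rules as one placed through the regular API.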
4. Channels — Where Users Come From
Users don’t talk to your backend directly.
They come from platforms like:
- Telegram
- Slack
- Discord
- SMS
Each of these platforms has:
- different APIs
- different message formats
- different authentication
This is where things usually get messy.
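A common way to contain that mess is to convert each platform's payload into one internal shape as early as possible. The sketch below assumes heavily simplified versions of a Telegram update and a Slack Events API payload, keeping only the fields it reads; `IncomingMessage`, `fromTelegram`, and `fromSlack` are hypothetical names.

```typescript
// A channel-neutral message shape (hypothetical; pick whatever fits your app).
type IncomingMessage = { channel: "telegram" | "slack"; userId: string; text: string };

// Simplified payload shapes: only the fields this sketch reads.
type TelegramUpdate = { message?: { from?: { id: number }; text?: string } };
type SlackEvent = { event?: { type: string; user?: string; text?: string } };

function fromTelegram(update: TelegramUpdate): IncomingMessage | null {
  const from = update.message?.from;
  const text = update.message?.text;
  if (from === undefined || typeof text !== "string") return null; // ignore non-text updates
  return { channel: "telegram", userId: String(from.id), text };
}

function fromSlack(payload: SlackEvent): IncomingMessage | null {
  const ev = payload.event;
  if (ev === undefined || ev.type !== "message") return null; // ignore other event types
  if (ev.user === undefined || typeof ev.text !== "string") return null;
  return { channel: "slack", userId: ev.user, text: ev.text };
}

console.log(fromTelegram({ message: { from: { id: 42 }, text: "I want 2 fried rice" } }));
console.log(fromSlack({ event: { type: "message", user: "U123", text: "I want 2 fried rice" } }));
```

Everything downstream of these adapters can then work with `IncomingMessage` and stay unaware of which platform the user came from.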
5. The Missing Piece: Routing
So how does everything connect?
That’s where the webhook layer comes in.
User → Channel → Webhook → Agent → Tool → Backend
When a user sends a message:
- The channel sends it to your webhook
- The webhook hands it to the agent, which interprets it
- The agent decides to call a tool
- The tool runs backend logic
- A response is sent back to the user
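The flow above can be compressed into a few lines to show how the pieces hand off to each other. This is only a sketch under stated assumptions: the agent is a keyword stub rather than a real model, the tool fakes the backend with a made-up price, and `handleWebhook`, `agent`, and `tools` are hypothetical names.

```typescript
// All names here are hypothetical; the agent is a keyword stub, not a real model.
type ToolCall = { tool: string; args: { name: string; quantity: number } };
type Tool = (args: ToolCall["args"]) => string;

// Tools wrap backend logic (faked here with a made-up price of 8 per item).
const tools: Partial<Record<string, Tool>> = {
  create_order: ({ name, quantity }) =>
    `Order placed: ${quantity} x ${name}, total $${quantity * 8}`,
};

// The agent reads the message and either picks a tool or answers directly.
function agent(text: string): ToolCall | string {
  const m = text.match(/(\d+)\s+(.+)/);
  return m
    ? { tool: "create_order", args: { name: m[2], quantity: Number(m[1]) } }
    : "Sorry, what would you like to order?";
}

// The webhook receives the channel's message and returns the reply to send back.
function handleWebhook(text: string): string {
  const decision = agent(text);
  if (typeof decision === "string") return decision;
  const tool = tools[decision.tool];
  return tool ? tool(decision.args) : `Unknown tool: ${decision.tool}`;
}

console.log(handleWebhook("I want 2 fried rice"));
// prints "Order placed: 2 x fried rice, total $16"
```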
Where Konsier Fits In
Instead of building all of this yourself, Konsier provides a structured way to handle it.
It gives you:
- a way to define agents
- a way to define tools
- a unified webhook layer
- built-in support for multiple channels
So instead of wiring everything manually, you focus on defining what your system can do, and Konsier handles the rest.
Why This Model Works
This approach solves a major problem:
It separates AI reasoning from business logic.
- The agent decides what to do
- The backend decides how it’s done
That separation makes the system:
- more reliable
- easier to debug
- easier to scale
A Simple Mental Model
If you remember nothing else, remember this:
- Agent = decision maker
- Tool = controlled execution
- Backend = source of truth
- Channel = entry point
- Webhook = connection layer
What’s Next
Now that the architecture is clear, the next question is:
How do you actually plug this into an existing backend?
In Part 3, I’ll walk through how I integrated this model into an existing Express + PostgreSQL API using Konsier.
We’ll go from theory to actual code:
- setting up the SDK
- defining tools
- mounting the webhook
- syncing configuration
If you're building conversational systems, understanding this pattern will save you a lot of time.
In the next post, we’ll make it real.