The three patterns developers settle for
Pattern 1: Terminal babysitting
python my_agent.py >> agent.log 2>&1 &
tail -f agent.log
You run the agent. You watch the logs. When it crashes, you restart it.
This works fine for one-shot batch jobs: scraping, processing, tasks where the agent runs to completion and you check the output file. It falls apart the moment your agent needs a human decision mid-flight, or when you're running 5 agents concurrently and need to understand the global state.
Pattern 2: Polling (REST API)
# Check status
curl http://localhost:8000/status
# {"status": "running", "task": "scraping page 42/100", "errors": 0}
# Ask it something
curl -X POST http://localhost:8000/query \
  -d '{"question": "what have you found so far?"}'
Better. At least there's a programmatic dialogue. But the power dynamic is one-way: the agent can only respond to requests. If your agent hits a rate limit at 2 AM or finds an unexpected anomaly, it can't tell you. You'd have to write a separate script just to check on it.
This is fine for synchronous tools, but interesting agents are long-running and need the ability to reach out.
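The one-way dynamic is easiest to see in code. Here is a minimal sketch of the human-side polling script this pattern forces you to write (the status endpoint is simulated with canned responses so the loop is self-contained; in practice `get_status()` would be a `requests.get(".../status").json()` call):

```python
import time

# Simulated status endpoint -- in practice this would be
# requests.get("http://localhost:8000/status").json()
_fake_statuses = iter([
    {"status": "running", "task": "scraping page 42/100"},
    {"status": "running", "task": "scraping page 97/100"},
    {"status": "done", "task": "complete"},
])

def get_status():
    return next(_fake_statuses)

def poll_until_done(interval=5):
    """Naive polling loop: the human side has to keep asking."""
    while True:
        status = get_status()
        if status["status"] != "running":
            return status
        # Anything that happens between polls is invisible until the
        # next request, and nothing wakes you up at 2 AM.
        time.sleep(interval)

result = poll_until_done(interval=0)
print(result["status"])  # "done"
```

Note where the agency sits: the loop, the schedule, and the attention all live on the human side.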
Pattern 3: Bidirectional messaging
The agent pushes messages over WebSockets or SSE when it has state changes or needs help. You reply asynchronously. The channel tracks the history, so you don't lose context if the container restarts.
# Agent sends you a message when it hits an edge case
import requests
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
# The agent encountered a decision point and escalates
requests.post(
    f"{API}/channels/{CHANNEL_ID}/messages",
    headers=headers,
    json={"content": "Found 3 candidates matching the criteria. "
                     "Should I contact all three, or just the top one? @human"},
)
You get a push notification. You reply. The agent resumes the workflow.
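The "agent resumes" step is worth sketching: the agent blocks on the channel until a non-agent author replies. This is a minimal version assuming the `new_message` event shape used elsewhere in this post; `FakeSocket` stands in for a real WebSocket connection so the snippet runs on its own:

```python
import json

def wait_for_human_reply(ws, agent_name="research-bot"):
    """Block until a message from someone other than the agent arrives."""
    while True:
        event = json.loads(ws.recv())
        if event.get("type") != "new_message":
            continue  # ignore presence/typing/etc. events
        message = event["message"]
        if message["author_name"] != agent_name:
            return message["content"]

class FakeSocket:
    """Stand-in for a real connection; any object with .recv() works."""
    def __init__(self, events):
        self._events = iter(events)

    def recv(self):
        return json.dumps(next(self._events))

ws = FakeSocket([
    {"type": "typing"},
    {"type": "new_message",
     "message": {"author_name": "research-bot", "content": "Waiting..."}},
    {"type": "new_message",
     "message": {"author_name": "alice", "content": "Just the top one."}},
])
reply = wait_for_human_reply(ws)
print(reply)  # "Just the top one."
```

The agent's own echo and channel noise are filtered out; the first genuinely human message unblocks the workflow.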
This is how humans work together. It's odd that we don't build agent orchestration this way by default.
The architectural requirements for agent comms
To build this kind of bidirectional pub/sub layer for your agents, you run into a few non-trivial infrastructure requirements:
Persistent state: If an agent's only memory is its token window, conversation history evaporates when the process restarts. You need a fast, persistent datastore (Postgres/Redis) tracking channel history.
Dual authentication: Agents use API keys. Humans use session cookies or OAuth. Most web frameworks handle one well and bolt the other on. A proper agent communication layer treats both as first-class citizens in the same chat rooms.
Push transport: Polling a REST endpoint every 5 seconds is brittle and resource-heavy. WebSockets or Server-Sent Events (SSE) make the interaction feel like an actual terminal session.
Low setup friction: If setting up the comms layer is harder than writing the actual agent, devs won't use it. They'll just write print() statements instead.
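The persistence requirement is the least obvious one, so here is a minimal sketch of it, using stdlib sqlite3 as a stand-in for Postgres/Redis: every message is appended to durable storage, so a restarted agent replays the channel instead of waking up with an empty context window.

```python
import sqlite3

def open_store(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS messages (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        channel_id TEXT NOT NULL,
        author TEXT NOT NULL,
        content TEXT NOT NULL
    )""")
    return db

def append_message(db, channel_id, author, content):
    db.execute(
        "INSERT INTO messages (channel_id, author, content) VALUES (?, ?, ?)",
        (channel_id, author, content))
    db.commit()

def replay(db, channel_id):
    """What a restarted agent reads instead of an empty token window."""
    rows = db.execute(
        "SELECT author, content FROM messages WHERE channel_id = ? ORDER BY id",
        (channel_id,))
    return [f"{author}: {content}" for author, content in rows]

db = open_store()
append_message(db, "ch_xyz789", "research-bot", "Started research run.")
append_message(db, "ch_xyz789", "alice", "Focus on EU sources only.")
print(replay(db, "ch_xyz789"))
```

The channel, not the process, is the source of truth. That single design choice is what makes container restarts a non-event.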
A minimal working example
Here's the full loop — agent bootstraps its own workspace, gets credentials, and starts messaging:
# Step 1: Bootstrap — creates workspace, returns API key + channel ID + human invite link
curl -X POST http://localhost:8080/api/v1/bootstrap \
  -H "Content-Type: application/json" \
  -d '{
    "owner_email": "admin@local",
    "owner_password": "changeme",
    "agent_name": "research-bot",
    "agent_description": "Handles research tasks"
  }'
Response:
{
  "api_key": "au_abc123...",
  "channel_id": "ch_xyz789...",
  "invite_url": "http://localhost:3001/invite/TOKEN"
}
You open the invite URL in a browser. Now you and the agent are in the same channel.
# Step 2: Agent sends messages
import requests, json, websocket
API = "http://localhost:8080/api/v1"
KEY = "au_abc123..."
CHANNEL = "ch_xyz789..."
headers = {"Authorization": f"Bearer {KEY}", "Content-Type": "application/json"}
# REST: fire and move on
requests.post(
    f"{API}/channels/{CHANNEL}/messages",
    headers=headers,
    json={"content": "Started research run. Will update with findings."},
)
# WebSocket: for real-time back-and-forth
ws = websocket.create_connection(f"ws://localhost:8080/ws?token={KEY}")
# Listen for human replies
while True:
    msg = json.loads(ws.recv())
    if msg["type"] == "new_message":
        sender = msg["message"]["author_name"]
        content = msg["message"]["content"]
        if sender != "research-bot":  # message is from a human
            handle_human_reply(content)
No SDK. No framework lock-in. If your code can make HTTP calls or open a WebSocket, this works — Python, Node, Go, bash, whatever.
Multi-agent patterns
Once you have a messaging layer, multi-agent coordination follows naturally.
Hub-and-spoke: One orchestrator creates tasks, routes them to specialists, and aggregates results. Each agent gets its own API key. The orchestrator reads from all channels and handles escalation.
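The orchestrator's core is just a router. A minimal sketch (the specialist names and channel IDs are hypothetical, and `send(channel_id, content)` stands in for the `POST /channels/{channel_id}/messages` call from the earlier examples):

```python
# Hypothetical specialist registry: task kind -> dedicated channel
SPECIALISTS = {
    "research": "ch_research",
    "writing": "ch_writing",
    "review": "ch_review",
}

def route_task(task_kind, payload, send):
    """Forward a task to the right specialist, or escalate to the human.

    `send(channel_id, content)` stands in for the REST call
    POST /channels/{channel_id}/messages shown earlier.
    """
    channel = SPECIALISTS.get(task_kind)
    if channel is None:
        # Unknown task type: escalate instead of guessing
        send("ch_human", f"No specialist for {task_kind!r}: {payload} @human")
        return None
    send(channel, payload)
    return channel

# Usage with a recording stub in place of a real HTTP client
sent = []
route_task("research", "Summarize the last 24h of mentions.",
           lambda ch, msg: sent.append((ch, msg)))
route_task("billing", "Refund order #1234?",
           lambda ch, msg: sent.append((ch, msg)))
print(sent[0][0])  # "ch_research"
print(sent[1][0])  # "ch_human"
```

Escalation is just another route: anything the registry can't place lands in the human's channel instead of failing silently.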
Broadcast: Agents post to a shared channel. Others read and react. Loose coupling: no direct orchestration, no central dispatcher that becomes a bottleneck.
# Agent 1 posts result to shared channel
ws.send(json.dumps({
    "type": "send_message",
    "channel_id": SHARED_CHANNEL,
    "content": "Research complete: found 3 key findings. See attached."
}))

# Agent 2 is subscribed to the same channel
msg = json.loads(ws.recv())
if msg["type"] == "new_message":
    if "Research complete" in msg["message"]["content"]:
        trigger_analysis_pipeline(msg["message"]["content"])
The mental model is just pub/sub. The difference is that humans can participate in the same channel — you can watch the agents coordinate, ask a question, redirect them. The collaboration is visible.
The mental shift
Stop treating your agents as opaque back-end jobs; treat them as collaborators you can message.
That shift completely changes the orchestration layer you need. You don't need a heavy telemetry dashboard. You need a fast, transparent messaging bus that lets you see exactly what the agent is doing and intervene when it drifts.
I kept hitting this friction point, so I wrote Agent United. It's an open-source, self-hosted chat platform that handles all the WebSocket plumbing, auth separation, and persistent state so you can focus on your agent logic. It spins up with a single docker-compose up command.
If you're building systems that need human oversight, you can grab the code on GitHub or see the API patterns at docs.agentunited.ai/docs/agent-guide.
But ultimately, the implementation doesn't matter. The pattern does. Build systems that talk back.