Aviad Rozenhek

Posted on Nov 6

Communication Protocols for AI Agents That Can't Talk to Each Other


4 iterations to get file-based messaging working when you can't use Slack

Part 5 of the Multi-Agent Development Series

Part 1: Can 5 Claude Code Agents Work Independently?
Part 2: The Reality of "Autonomous" Multi-Agent Development
Part 3: Property-Based Testing with Hypothesis
Part 4: Zero-Conflict Architecture

TL;DR

The Problem: 5 AI agents in isolated context windows need to coordinate work. No shared state. No real-time chat. Different tool environments (Web vs CLI).

What we tried:

GitHub PR comments (agents in Web can't read them)
File-based messages (agents didn't understand them)
Clear action items (still too vague)
FEEDBACK-PR-X.md + explicit instructions + GitHub redundancy (finally worked!)

The lesson: Designing a communication protocol is hard. What seems obvious to humans ("check your PR comments") isn't obvious to agents in different environments. A successful protocol needs:

Explicit instructions (not assumptions)
Multiple channels (redundancy)
Clear format (markdown structure)
Async design (file-based, not real-time)

Final solution: Git as the communication bus, markdown files as the message format.

The Challenge

The Setup

5 work stream agents (PR-1 through PR-5)
1 integration agent (PR-6)
Each in isolated Claude Code session (separate context windows)
No shared state between sessions
Need to communicate: Status updates, test results, bug reports, action items

The Constraints

Technical constraints:

Isolated contexts: Each agent can't see other agents' conversations
Different environments: Some agents in Web (no gh CLI), some in CLI
No real-time: Agents don't run continuously, can't push notifications
No shared memory: Can't use global variables or shared state

Operational constraints:

Async-first: Agents work at different speeds
Human-in-the-loop: Human orchestrates, can't relay every message
Must scale: 5 agents × 2 channels (send/receive) = 10 communication paths

What we needed:

Integration agent tells work stream agents: "Your PR merged, verify tests pass"
Work stream agents tell integration: "Tests verified, all good" or "Found issues, need help"
Persistent communication trail (for debugging)

Iteration 1: GitHub PR Comments (FAILED)

The Plan

Seemed obvious:

# Integration posts on PR-2:
@claude-agent-pr-2

Your PR has been merged into integration branch!

Action items:
1. Verify all tests still pass
2. Check for integration issues
3. Report back via PR comment

Thanks!

Expected workflow:

Integration agent posts comment
PR-2 agent checks PR, reads comment
PR-2 agent responds with results

Seemed foolproof, right?

What Actually Happened

Integration agent:

$ gh pr comment 123 --body "Your PR merged, please verify..."
Comment created successfully!

PR-2 agent (in Claude Code Web):

> Check your PR for integration status

Agent: "Let me check the PR comments..."
Agent: "I'll use gh CLI to read comments"
System: Error - 'gh' command not found
Agent: "I can't access GitHub CLI in this environment"

Root cause: Claude Code Web has no gh CLI, so the agent can't read PR comments.

Why It Failed

Assumptions we made:

✅ GitHub is accessible (true)
✅ PR comments are visible (true via UI)
❌ Agents can READ PR comments programmatically (FALSE in Web environment)

The gap: Web agents have no tool to fetch PR comments. They can't even curl the GitHub API without auth tokens.

Result: One-way communication. Integration can WRITE comments, agents can't READ them.
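One way to catch this earlier is to probe for tools before picking a channel. The following is a minimal sketch, assuming the agent can run Python; `pick_read_channel` and the channel names are hypothetical and not part of Claude Code or gh.

```python
import shutil


def pick_read_channel(which=shutil.which):
    """Return the channel this environment can actually read from.

    `which` is injectable so the check can be exercised without
    depending on what happens to be installed locally.
    """
    if which("gh") is not None:
        # gh is installed (it may still need authentication)
        return "pr-comments"
    if which("git") is not None:
        # fall back to files on a git branch
        return "git-files"
    # nothing usable: a human has to relay the message
    return "human-relay"
```

Running a check like this in each environment up front would have shown that the Web agents could only use git.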

Iteration 2: File-Based Messages (PARTIAL SUCCESS)

The Plan

Git as message bus:

Integration branch (claude/integrate-...):
  FEEDBACK-PR-2.md  # Message for PR-2 agent
  FEEDBACK-PR-3.md  # Message for PR-3 agent
  ...

Work stream branch (claude/budget-allocation-...):
  INTEGRATION-RESPONSE.md  # Response to integration

Workflow:

# Integration writes message:
$ echo "Your PR merged..." > FEEDBACK-PR-2.md
$ git add FEEDBACK-PR-2.md
$ git commit -m "feedback: PR-2 integration complete"
$ git push

# PR-2 agent reads message:
$ git fetch origin claude/integrate-...
$ git show origin/claude/integrate-...:FEEDBACK-PR-2.md
# Reads the message!

# PR-2 agent responds:
$ echo "Tests verified..." > INTEGRATION-RESPONSE.md
$ git add INTEGRATION-RESPONSE.md
$ git commit -m "response: Tests passing"
$ git push

# Integration reads response:
$ git fetch origin claude/budget-allocation-...
$ git show origin/claude/budget-allocation-...:INTEGRATION-RESPONSE.md
# Reads the response!

Benefits:

✅ Pure git operations (works in all environments)
✅ Persistent (messages in git history)
✅ Async (write anytime, read anytime)
✅ No special tools needed
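The read side of this workflow can be wrapped in a small helper. This is a sketch under the assumption that the agent can run Python and git; `read_message` is a hypothetical name, and the `run` parameter exists only so the function can be tested without a real repository.

```python
import subprocess


def read_message(branch, path, remote="origin", run=subprocess.run):
    """Fetch `branch` from `remote` and return the text of `path` on it.

    Mirrors the two shell commands above: `git fetch` followed by
    `git show <remote>/<branch>:<path>`.
    """
    run(["git", "fetch", remote, branch],
        check=True, capture_output=True, text=True)
    shown = run(["git", "show", f"{remote}/{branch}:{path}"],
                check=True, capture_output=True, text=True)
    return shown.stdout
```

Because `check=True` raises on a non-zero exit code, a missing branch or file surfaces as an exception instead of an empty message.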

What Actually Happened

Integration wrote:

# FEEDBACK-PR-2.md

Your PR has been merged.

Please verify tests and report back.

PR-2 agent:

> Check for messages from integration

Agent: "Let me fetch the integration branch"
$ git fetch origin claude/integrate-...

Agent: "Let me read the feedback file"
$ git show origin/claude/integrate-...:FEEDBACK-PR-2.md

# FEEDBACK-PR-2.md
Your PR has been merged.
Please verify tests and report back.

Agent: "I see a message. It says my PR merged and to verify tests."
Agent: "But what exactly should I do? Run which tests? What format for response?"

The problem: Vague instructions. The agent understood there was a message but didn't know which concrete actions to take.

Why It Partially Worked

What worked:

✅ Message delivery (agent can read the file)
✅ Persistence (message in git)
✅ Cross-environment (works in Web and CLI)

What didn't work:

❌ Vague instructions ("verify tests" - which tests? how?)
❌ No clear response format (how to report back?)
❌ No deadline/urgency indication
❌ No context (what was merged? what changed?)

Agent was willing but unclear on next steps.

Iteration 3: Clear Action Items (BETTER)

The Improvement

Added explicit instructions:

# FEEDBACK-PR-2.md

## Summary
✅ Your PR #123 (Budget Allocation Tests) was successfully merged into integration branch `claude/integrate-test-improvements-011CUrkV7Q9B7UpLgPpwL1K8` at 2025-11-06 14:30.

## What Changed
- Added 14 budget allocation unit tests
- Modified `video_moderator.py` with rolling budget accumulator
- All tests passed in isolation on your branch

## Required Action

**Please verify that your tests still pass after integration:**

1. Fetch the integration branch:

git fetch origin claude/integrate-test-improvements-011CUrkV7Q9B7UpLgPpwL1K8

2. Run your specific tests:

uv run pytest tests/test_video_moderation/unit/test_budget_allocation.py -v

3. Verify all 14 tests pass

4. Create response file on YOUR branch:

echo "See response below" > INTEGRATION-RESPONSE.md
git add INTEGRATION-RESPONSE.md
git commit -m "@integration - verification complete"
git push

5. Use this response format:

# INTEGRATION-RESPONSE.md

Agent: PR-2
Status: [ALL PASSING / ISSUES FOUND]
Date: YYYY-MM-DD HH:MM

## Test Results
[Paste pytest output here]

## Issues (if any)
[Describe any problems]


## Timeline
Please complete within 4 hours if possible (not blocking other PRs).

## Questions?
If you encounter issues, create a file called INTEGRATION-QUESTIONS.md on your branch.

What Actually Happened

PR-2 agent:

> Check for integration feedback

$ git fetch origin claude/integrate-...
$ git show origin/claude/integrate-...:FEEDBACK-PR-2.md

Agent: "I see detailed instructions!"
Agent: "Step 1: Fetch integration branch"
$ git fetch origin claude/integrate-test-improvements-011CUrkV7Q9B7UpLgPpwL1K8
Done.

Agent: "Step 2: Run tests"
$ uv run pytest tests/test_video_moderation/unit/test_budget_allocation.py -v

... tests run ...
14/14 passing ✅

Agent: "Step 3: Create response"
$ echo "Results below..." > INTEGRATION-RESPONSE.md
$ git add INTEGRATION-RESPONSE.md
$ git commit -m "@integration - all tests passing"
$ git push

# INTEGRATION-RESPONSE.md created with:
**Status**: ALL PASSING
**Date**: 2025-11-06 15:00

## Test Results
test_tier1_critical_urgency: PASSED
test_tier2_never_moderated: PASSED
... (14/14 tests)

SUCCESS! Agent followed instructions completely.

Why It Worked Better

Improvements:

✅ Explicit commands (copy-paste bash commands)
✅ Clear success criteria (all 14 tests pass)
✅ Response template (exact format specified)
✅ Timeline (4 hours)
✅ Escape hatch (INTEGRATION-QUESTIONS.md if stuck)

Agent had everything needed: What to do, how to do it, what to report, when to do it by.

Iteration 4: Dual-Channel Redundancy (FINAL)

The Refinement

Problem: What if an agent never checks the file? The file-based channel has no notification mechanism.

Solution: Dual-channel communication

Channel 1: File-based (detailed instructions)

FEEDBACK-PR-X.md on integration branch
Full context, commands, expected results
Permanent record

Channel 2: GitHub PR comment (notification + summary)

Posted on the actual PR
Brief summary + pointer to detailed file
Notification mechanism (shows up in GitHub UI)

Implementation

Integration agent workflow:

# 1. Create detailed feedback file
# Quote the delimiter ('EOF') so backticks and $ in the message are written literally
cat > FEEDBACK-PR-2.md <<'EOF'
# Detailed instructions as shown in Iteration 3
EOF

git add FEEDBACK-PR-2.md
git commit -m "feedback: PR-2 integration complete"
git push

# 2. Post GitHub notification (for human visibility)
gh pr comment 123 --body "@claude-agent

PR-2 has been integrated!

📋 **Detailed instructions**: See FEEDBACK-PR-2.md on integration branch

**Quick summary**:
- Your PR merged successfully ✅
- Please verify tests still pass
- Respond via INTEGRATION-RESPONSE.md on your branch

**To read detailed instructions**:
\`\`\`bash
git fetch origin claude/integrate-test-improvements-011CUrkV7Q9B7UpLgPpwL1K8
git show origin/claude/integrate-...:FEEDBACK-PR-2.md
\`\`\`

Timeline: 4 hours (not blocking)
"

Why Dual-Channel?

Redundancy benefits:

Humans can see progress (GitHub PR comments visible in UI)
Agents have detailed instructions (FEEDBACK file)
Notification layer (PR comment draws attention)
Debugging trail (both channels logged)

Real-world benefit:

Human could monitor progress via GitHub web UI
Agents had clear instructions via git files
If agent missed file, human could prompt: "Check FEEDBACK-PR-2.md"
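The two channels can be driven from one function so that neither is forgotten. This is a rough sketch, not the integration agent's actual code: `send_feedback` is a hypothetical helper, and it treats the PR comment as best-effort, which is an assumption on my part.

```python
import shutil
import subprocess


def send_feedback(pr_number, feedback_file, body, summary,
                  run=subprocess.run, which=shutil.which):
    """Publish `body` as a git-tracked file, then post `summary` on the PR.

    Returns the list of channels that were used. The file is the source
    of truth; the comment is skipped when the gh CLI is not installed.
    """
    with open(feedback_file, "w", encoding="utf-8") as handle:
        handle.write(body)
    run(["git", "add", feedback_file], check=True)
    run(["git", "commit", "-m", f"feedback: {feedback_file}"], check=True)
    run(["git", "push"], check=True)
    used = ["file"]
    if which("gh") is not None:
        run(["gh", "pr", "comment", str(pr_number), "--body", summary],
            check=True)
        used.append("comment")
    return used
```

Writing the file first means the detailed instructions already exist by the time anyone follows the pointer in the comment.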

The Final Protocol

Message Types

1. FEEDBACK-PR-X.md (Integration → Work Stream)

Purpose: Tell work stream agent about integration status, request actions

Template:

# FEEDBACK-PR-X.md

## Summary
[One-line status: merged successfully / issues found / waiting]

## What Changed
[What was merged, what's new in integration branch]

## Required Actions
1. [Specific action with bash command]
2. [Another action with bash command]
...

## Expected Results
[What "success" looks like]

## Response Format
[Template for INTEGRATION-RESPONSE.md]

## Timeline
[Deadline if any]

## Questions?
[How to ask for help]
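Filling in this template by hand invites omissions, so it can help to generate it from structured fields. Below is a minimal sketch; `render_feedback` and its parameter names are hypothetical, and only the section headings come from the template above.

```python
def render_feedback(pr, summary, changes, actions, expected,
                    response_format, timeline,
                    help_file="INTEGRATION-QUESTIONS.md"):
    """Build the text of FEEDBACK-PR-<pr>.md from structured fields.

    `actions` is a list of (description, shell_command) pairs, so every
    step carries an exact command rather than a vague request.
    """
    if not actions:
        raise ValueError("a feedback message needs at least one action")
    lines = [f"# FEEDBACK-PR-{pr}.md", "", "## Summary", summary,
             "", "## What Changed"]
    lines += [f"- {change}" for change in changes]
    lines += ["", "## Required Actions"]
    for number, (description, command) in enumerate(actions, start=1):
        lines.append(f"{number}. {description}: `{command}`")
    lines += ["", "## Expected Results", expected,
              "", "## Response Format", response_format,
              "", "## Timeline", timeline,
              "", "## Questions?", f"Create {help_file} on your branch."]
    return "\n".join(lines) + "\n"
```

Requiring a command for each action makes it impossible to send a message that only says "please verify".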

2. INTEGRATION-RESPONSE.md (Work Stream → Integration)

Purpose: Report back on verification status

Template:

# INTEGRATION-RESPONSE.md

**Agent**: PR-X
**Status**: ALL PASSING | ISSUES FOUND | NEED HELP
**Date**: YYYY-MM-DD HH:MM

## Test Results

$ pytest ...
[Full output]

## Issues (if any)
[Describe problems encountered]

## Questions (if any)
[Ask integration agent for clarification]

3. STATUS-PR-X.md (Work Stream → Integration, Optional)

Purpose: Progress updates during long-running work

Template:

# STATUS-PR-X.md

**Last Updated**: YYYY-MM-DD HH:MM
**Current Activity**: [What agent is doing now]
**Progress**: X / Y tasks complete

## Completed
- [x] Task 1
- [x] Task 2

## In Progress
- [ ] Task 3 (current)

## Blocked
[Any blockers encountered]

## ETA
[Estimated completion time]
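A STATUS file is only useful if someone notices when it stops changing. As a sketch of how that might be checked, assuming the `**Last Updated**` line follows the template exactly: `is_stale` is a hypothetical helper, and the 4-hour default simply borrows the timeline used elsewhere in this article.

```python
import re
from datetime import datetime, timedelta


def is_stale(status_text, now, max_age=timedelta(hours=4)):
    """Return True when the status has no timestamp or it is too old."""
    match = re.search(
        r"\*\*Last Updated\*\*:\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2})",
        status_text)
    if match is None:
        # A status file without a readable timestamp is treated as stale.
        return True
    updated = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M")
    return now - updated > max_age
```

Passing `now` in explicitly keeps the check deterministic, which matters when the caller and the writer run at different times.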

Communication Workflow

┌─────────────────┐
│  Integration    │
│     Agent       │
└────────┬────────┘
         │
         │ 1. Merge PR-2
         │ 2. Create FEEDBACK-PR-2.md
         │ 3. Post GitHub comment
         │
         ↓
    ╔════════════════════════╗
    ║  Integration Branch    ║
    ║  FEEDBACK-PR-2.md      ║
    ╚════════════════════════╝
         │
         │ 4. PR-2 agent fetches
         │
         ↓
┌─────────────────┐
│    PR-2 Agent   │
│  (Work Stream)  │
└────────┬────────┘
         │
         │ 5. Reads FEEDBACK-PR-2.md
         │ 6. Executes actions
         │ 7. Creates INTEGRATION-RESPONSE.md
         │ 8. Pushes to PR-2 branch
         │
         ↓
    ╔════════════════════════╗
    ║  PR-2 Branch           ║
    ║  INTEGRATION-RESPONSE  ║
    ╚════════════════════════╝
         │
         │ 9. Integration fetches
         │
         ↓
┌─────────────────┐
│  Integration    │
│     Agent       │
└─────────────────┘
    Reads response,
    takes next action

What We Learned About Agent Communication

1. Explicitness Over Cleverness

❌ Don't:

Please verify your changes integrated correctly.

✅ Do:

Run this exact command:

uv run pytest tests/test_video_moderation/unit/test_budget_allocation.py -v

Expected output: All 14 tests should PASS.

Why: Agents are literal. "Verify" is vague. "Run this command" is clear.

2. Templates Over Freeform

❌ Don't:

Report back with your results.

✅ Do:

Create INTEGRATION-RESPONSE.md with this exact format:

Status: [ALL PASSING / ISSUES FOUND]
Date: YYYY-MM-DD

Test Results

[Paste output here]

Why: Templates reduce ambiguity. Agent knows exactly what format to use.

3. Commands Over Descriptions

❌ Don't:

Check the integration branch for changes.

✅ Do:

$ git fetch origin claude/integrate-test-improvements-011CUrkV7Q9B7UpLgPpwL1K8
$ git log origin/claude/integrate-...

Why: Copy-paste commands are foolproof. No interpretation needed.

4. Async Over Real-Time

❌ Don't:

Respond immediately via Slack/chat.

✅ Do:

Create response file within 4 hours (not blocking other work).

Why: Agents don't run continuously. Async file-based messaging works across time zones (metaphorically speaking).

5. Redundancy Over Single Channel

❌ Don't:

Only post GitHub comment OR only create file

✅ Do:

Create detailed FEEDBACK file + post GitHub summary comment

Why:

File: Detailed, persistent, git-tracked
Comment: Notification, human-visible
Both: Redundancy if one fails

Edge Cases We Hit

Edge Case 1: Agent Didn't Check Messages

Scenario: PR-3 agent never fetched integration branch, didn't see FEEDBACK file.

Solution: Human intervention

> PR-3 agent, please check for feedback from integration
> Check integration branch for feedback

Agent: [fetches and reads FEEDBACK-PR-3.md]
Agent: "I see the feedback now! Working on it..."

Lesson: No automatic polling mechanism. Humans must prompt agents to check.

Ideal solution (not implemented):

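# Hypothetical: Agent runs on schedule
import time

while True:
    feedback = check_for_feedback()      # hypothetical helper: returns None when nothing is new
    if feedback is not None:
        process_and_respond(feedback)    # hypothetical helper
    time.sleep(3600)  # Check hourly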

Reality: Human orchestrates the "check now" trigger.

Edge Case 2: Agent Misunderstood Template

Scenario: PR-4 agent created response but used wrong format.

Expected:

**Status**: ALL PASSING
**Date**: 2025-11-06

Actual:

Status: All tests passing ✅
Date: November 6th, 2025

Solution: Integration agent parsed it anyway (flexible)

Lesson: Even with templates, agents interpret slightly differently. Build parsers with flexibility.
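Here is a minimal sketch of what a forgiving parser might look like. It is an illustration, not the integration agent's actual logic: `parse_status`, `parse_date` and the keyword lists are assumptions chosen to cover the two variants shown above.

```python
import re
from datetime import datetime

# Matches "Status: x" as well as "**Status**: x" at the start of a line.
_FIELD = r"^[* \t]*{name}[* \t]*:[* \t]*(.+)$"


def _field(text, name):
    match = re.search(_FIELD.format(name=name), text,
                      flags=re.IGNORECASE | re.MULTILINE)
    return match.group(1).strip() if match else None


def parse_status(text):
    """Normalize the status line, or return None if it is unrecognized."""
    value = _field(text, "status")
    if value is None:
        return None
    # Lowercase and drop punctuation and emoji before matching keywords.
    words = set(re.sub(r"[^a-z ]", " ", value.lower()).split())
    if words & {"issue", "issues", "failed", "failing", "failure", "failures"}:
        return "ISSUES FOUND"
    if "help" in words:
        return "NEED HELP"
    if words & {"pass", "passed", "passing"}:
        return "ALL PASSING"
    return None


def parse_date(text):
    """Accept ISO dates and dates like 'November 6th, 2025'."""
    value = _field(text, "date")
    if value is None:
        return None
    value = re.sub(r"(\d)(st|nd|rd|th)\b", r"\1", value)  # "6th" -> "6"
    for fmt in ("%Y-%m-%d %H:%M", "%Y-%m-%d", "%B %d, %Y"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    return None
```

Failure keywords are checked before success keywords so that a line mentioning both is reported as a problem rather than a pass.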

Edge Case 3: Circular Waiting

Scenario:

Integration waiting for PR-4 response
PR-4 waiting for PR-2 to finish (thought there was dependency)
Neither progressing

Solution: Human detected deadlock, clarified

> PR-4, you don't need to wait for PR-2. Please proceed independently.

Lesson: Make dependencies explicit in FEEDBACK files

## Dependencies
This task has NO dependencies. Proceed immediately.

OR

## Dependencies
Wait for PR-2 to complete before starting. You'll receive another FEEDBACK when ready.
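Once dependencies are written down explicitly, a deadlock like this can be detected mechanically instead of by a human noticing that nothing moves. The sketch below assumes the declared dependencies have already been collected into a dictionary; `find_wait_cycle` and the graph in the usage note are hypothetical.

```python
def find_wait_cycle(waits_for):
    """Return one cycle in a wait-for graph as a list of names, or None.

    `waits_for` maps each agent to the agents it is currently waiting on.
    """
    visiting, done = [], set()

    def visit(agent):
        if agent in done:
            return None
        if agent in visiting:
            # Found a path back to an agent we are still exploring.
            return visiting[visiting.index(agent):] + [agent]
        visiting.append(agent)
        for other in waits_for.get(agent, ()):
            cycle = visit(other)
            if cycle:
                return cycle
        visiting.pop()
        done.add(agent)
        return None

    for agent in waits_for:
        cycle = visit(agent)
        if cycle:
            return cycle
    return None
```

For example, if integration waits on PR-4, PR-4 believes it waits on PR-2, and PR-2 is idle until integration sends more feedback, the function returns `["integration", "PR-4", "PR-2", "integration"]`.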

Edge Case 4: Message File Overwritten

Scenario: Integration sent FEEDBACK-PR-2.md twice (updated instructions). PR-2 only saw the second version, missed the first.

Solution: Git history preserves both

$ git log --all -- FEEDBACK-PR-2.md
# Shows both versions

$ git show commit1:FEEDBACK-PR-2.md  # First version
$ git show commit2:FEEDBACK-PR-2.md  # Second version

Lesson: Git history is valuable. Don't delete or overwrite messages; append to them or version them.

Better approach:

FEEDBACK-PR-2-v1.md  (initial message)
FEEDBACK-PR-2-v2.md  (update)
FEEDBACK-PR-2.md     (copy of latest; git show on a symlink prints only the target path)
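Picking the next version number is easy to get wrong by hand, especially once v10 sorts before v2 alphabetically. A small sketch, assuming the file names follow the pattern above exactly; `next_versioned_name` is a hypothetical helper.

```python
import re


def next_versioned_name(existing, pr):
    """Return the next FEEDBACK-PR-<pr>-v<N>.md name.

    `existing` is any iterable of file names; names that do not match
    the versioned pattern for this PR are ignored.
    """
    pattern = re.compile(rf"^FEEDBACK-PR-{re.escape(str(pr))}-v(\d+)\.md$")
    versions = [int(m.group(1)) for name in existing
                if (m := pattern.match(name))]
    return f"FEEDBACK-PR-{pr}-v{max(versions, default=0) + 1}.md"
```

Comparing versions as integers rather than strings is what keeps v10 ahead of v9.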

Alternative Protocols We Considered

Alternative 1: Shared Database

Idea: All agents read/write to shared Postgres/Redis

Pros:

Real-time updates
Queryable state
Structured data

Cons:

Requires external service
Authentication complexity
Not git-tracked (no history)
Claude Code doesn't have DB clients built-in

Verdict: Too complex for our use case.

Alternative 2: GitHub Issues as Messages

Idea: Create GitHub issue per agent, use comments for communication

Pros:

Native GitHub UI
Notifications built-in
Searchable/linkable

Cons:

Web agents can't read issues (same gh CLI problem)
Clutters issue tracker
Not suitable for rapid back-and-forth

Verdict: Same problem as PR comments.

Alternative 3: Shared Google Doc

Idea: All agents edit shared Google Doc with sections per PR

Pros:

Real-time collaboration
Human-readable
Version history

Cons:

Requires Google API auth
Claude Code can't edit Google Docs
Not in git (separate system)
Race conditions if concurrent edits

Verdict: Doesn't work with Claude Code constraints.

Alternative 4: Kafka/Message Queue

Idea: Agents publish/subscribe to Kafka topics

Pros:

Designed for async messaging
Durable, scalable
Structured events

Cons:

Massive overkill for 6 agents
Requires Kafka cluster
Claude Code doesn't have Kafka client
Message history lives in Kafka, not in git-tracked files

Verdict: Way too complex.

Why Git-Based Messaging Won

Git as communication bus wins because:

Already there: Every PR has a git branch
Universal: Works in Web and CLI environments
Persistent: Complete history in git log
Async-native: Fetch/push anytime
No external dependencies: Just git
Debuggable: Can inspect messages anytime
Human-readable: Markdown files anyone can read

The downside:

No real-time notifications (have to poll)
Requires explicit fetch commands
File-based (not structured data)

But the upsides far outweighed the downsides.

Recommendations for Multi-Agent Communication

✅ Design Principles

1. Async-first

Agents work at different speeds
Messages must work without real-time synchronization
File-based > real-time chat

2. Explicit over clever

Provide exact bash commands
Use templates for responses
Don't assume agents will "figure it out"

3. Redundant channels

Primary: Detailed file (FEEDBACK-PR-X.md)
Secondary: Notification (GitHub comment, Slack, email)
Humans monitor both channels

4. Self-contained messages

Each message includes full context
Don't reference previous messages (agent may not have seen them)
Include commands, expected results, templates

5. Git as source of truth

All communication in git-tracked files
Permanent history
Inspectable by humans anytime

✅ Message Design Checklist

Before sending a message, verify:

[ ] Clear action items (numbered steps)
[ ] Exact bash commands (copy-paste ready)
[ ] Expected results (what success looks like)
[ ] Response template (format specified)
[ ] Timeline (deadline or "not blocking")
[ ] Escape hatch (how to ask for help)
[ ] Context (what changed, why agent should care)

Example passing checklist:

## Action Items
1. Fetch integration branch:

git fetch origin claude/integrate-...

2. Run tests:

uv run pytest tests/... -v

3. Create response:

cat > INTEGRATION-RESPONSE.md <<EOF
Status: ALL PASSING
Date: $(date +%Y-%m-%d)
EOF
git add INTEGRATION-RESPONSE.md && git commit -m "response" && git push

## Expected Results
All 14 tests should PASS. If any fail, report in response.

## Response Template
[Template here]

## Timeline
Within 4 hours (not blocking other PRs)

## Need Help?
Create INTEGRATION-QUESTIONS.md on your branch
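The checklist itself can be turned into a rough automated check that runs before a message is committed. This is only a heuristic sketch: `CHECKS` and `missing_items` are hypothetical, the heading names are taken from the templates in this article, and the "commands" pattern is a crude guess that a message mentions git, uv, or pytest.

```python
import re

# One regular expression per checklist item.
CHECKS = {
    "action items": r"^\s*\d+\.\s+\S",
    "commands": r"\b(git|uv|pytest)\s+\S",
    "expected results": r"^##\s*Expected Results",
    "response template": r"^##\s*Response (Template|Format)",
    "timeline": r"^##\s*Timeline",
    "escape hatch": r"^##\s*(Need Help|Questions)\??",
    "context": r"^##\s*(Summary|What Changed)",
}


def missing_items(message):
    """Return the checklist items that `message` does not appear to cover."""
    return [name for name, pattern in CHECKS.items()
            if not re.search(pattern, message,
                             flags=re.IGNORECASE | re.MULTILINE)]
```

An empty result means every section was found, not that the content of each section is good; a human or the sending agent still has to read it.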

❌ Common Pitfalls

1. Vague instructions

❌ "Please check if everything works"
✅ "Run: uv run pytest tests/test_budget_allocation.py -v
    Expected: All 14 tests PASS"

2. Assuming tool availability

❌ "Use gh CLI to check PR status"
✅ "Use git to fetch the branch:
    git fetch origin <branch>
    git show origin/<branch>:FILE.md"

3. No response format

❌ "Let me know the results"
✅ "Create INTEGRATION-RESPONSE.md with:
    **Status**: [ALL PASSING / ISSUES FOUND]
    **Test Output**: [paste here]"

4. Missing context

❌ "Your PR merged, please verify"
✅ "PR #123 (Budget Tests) merged at 14:30.
    Changes: Added 14 tests to test_budget_allocation.py
    Please verify tests still pass after integration"

5. Unclear timeline

❌ "Please respond ASAP"
✅ "Please respond within 4 hours (not blocking other work)"

Scaling Communication

Our experiment: 1 integration agent + 5 work stream agents

Communication paths:

Integration → PR-1, PR-2, PR-3, PR-4, PR-5 (5 outgoing)
PR-1, PR-2, PR-3, PR-4, PR-5 → Integration (5 incoming)
Total: 10 message channels

Manageable!

What if we scale to 10 work streams?

Integration → 10 agents (10 outgoing)
10 agents → Integration (10 incoming)
Total: 20 message channels

Still manageable with file-based approach:

FEEDBACK-PR-1.md
FEEDBACK-PR-2.md
...
FEEDBACK-PR-10.md

What if agents need to talk to EACH OTHER?

10 agents × 9 other agents = 90 communication paths
Not manageable without hierarchy

Solution: Hub-and-spoke

   Integration (hub)
      /    |    \
    PR-1  PR-2  PR-3 ... (spokes)

Agents only talk to integration, not to each other.

This is what we did, and it worked.
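The arithmetic behind these counts is simple enough to write down. In this sketch a "path" is a one-way channel, matching how the totals above are counted; the function names are illustrative only.

```python
def hub_and_spoke_paths(workers):
    """One-way paths when every worker talks only to a single hub."""
    return 2 * workers  # one outgoing and one incoming path per worker


def full_mesh_paths(agents):
    """One-way paths when every agent can message every other agent."""
    return agents * (agents - 1)
```

Hub-and-spoke grows linearly with the number of work streams, while a full mesh grows quadratically, which is why the hierarchy matters more as the team gets larger.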

Real-World Applicability

Use Case 1: CI/CD Pipeline Agents

Scenario: Multiple agents handling build, test, deploy stages

Communication:

build-agent creates: BUILD-RESULTS.md
test-agent reads: BUILD-RESULTS.md
test-agent creates: TEST-RESULTS.md
deploy-agent reads: TEST-RESULTS.md
deploy-agent creates: DEPLOY-STATUS.md

Protocol:

Each agent writes status file
Next agent in pipeline reads it
Git commits track full pipeline history
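As a sketch of how one stage could gate on the previous stage's file: the `Status: OK` convention, the `run_stage` helper, and the `PIPELINE` table are assumptions made for illustration, and the git commit after each write is left out to keep the example short.

```python
from pathlib import Path

# (stage name, file it reads, file it writes)
PIPELINE = [
    ("build", None, "BUILD-RESULTS.md"),
    ("test", "BUILD-RESULTS.md", "TEST-RESULTS.md"),
    ("deploy", "TEST-RESULTS.md", "DEPLOY-STATUS.md"),
]


def run_stage(workdir, name, reads, writes, work):
    """Run one stage unless the upstream file is missing or not OK.

    `work` is a callable that does the stage's job and returns True on
    success. Returns True only if the stage ran and succeeded.
    """
    workdir = Path(workdir)
    if reads is not None:
        upstream = workdir / reads
        if not upstream.exists() or "Status: OK" not in upstream.read_text():
            return False  # upstream has not reported success yet
    ok = work()
    status = "OK" if ok else "FAILED"
    (workdir / writes).write_text(f"# {writes}\n\nStage: {name}\nStatus: {status}\n")
    return ok
```

A failed stage still writes its file, so the next agent sees an explicit FAILED status rather than having to guess from a missing file.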

Use Case 2: Code Review Agents

Scenario: Multiple specialized review agents (security, performance, style)

Communication:

security-agent creates: SECURITY-REVIEW.md
performance-agent creates: PERFORMANCE-REVIEW.md
style-agent creates: STYLE-REVIEW.md

coordinator-agent reads all three, creates: REVIEW-SUMMARY.md

Protocol:

Parallel review agents
Each writes findings to separate file
Coordinator aggregates
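The coordinator's aggregation step might look something like the following. This is a sketch only; `summarize_reviews`, the COMPLETE/INCOMPLETE status line, and the placeholder text are hypothetical.

```python
def summarize_reviews(reviews):
    """Combine per-agent review texts into REVIEW-SUMMARY.md content.

    `reviews` maps a file name such as "SECURITY-REVIEW.md" to its text,
    or to None when that agent has not written its file yet.
    """
    incomplete = any(text is None for text in reviews.values())
    lines = ["# REVIEW-SUMMARY.md", "",
             "Status: INCOMPLETE" if incomplete else "Status: COMPLETE"]
    for name in sorted(reviews):
        text = reviews[name]
        lines += ["", f"## {name}",
                  text.strip() if text else "(not received yet)"]
    return "\n".join(lines) + "\n"
```

Marking the summary INCOMPLETE when a review is missing keeps the coordinator from silently reporting on two reviewers out of three.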

Use Case 3: Documentation Agents

Scenario: Agents generating API docs, tutorials, changelog

Communication:

api-doc-agent creates: docs/API.md
tutorial-agent creates: docs/TUTORIAL.md
changelog-agent creates: CHANGELOG.md

reviewer-agent creates: DOCUMENTATION-REVIEW.md

Protocol:

Each agent owns its documentation domain
Reviewer checks consistency across all docs
All tracked in git

Conclusion

What we proved:

✅ File-based async messaging works for multi-agent coordination
✅ Git is an excellent communication bus
✅ Explicit instructions beat clever assumptions

What we learned:

Iteration is required (took 4 tries to get protocol right)
Redundancy helps (dual-channel: files + GitHub comments)
Templates reduce ambiguity (specify exact format)
Human orchestration still needed (agents don't poll automatically)

The final protocol:

1. Integration creates FEEDBACK-PR-X.md (detailed instructions)
2. Integration posts GitHub comment (notification)
3. Human prompts agent: "Check for feedback"
4. Agent fetches integration branch
5. Agent reads FEEDBACK-PR-X.md
6. Agent executes actions
7. Agent creates INTEGRATION-RESPONSE.md
8. Agent pushes response to their branch
9. Integration fetches and reads response
10. Integration takes next action

Would we do it again? Yes! File-based messaging worked well despite initial struggles.

Next time we'd improve:

Start with templates from day 1 (don't iterate to them)
Add STATUS files for long-running work
Implement HEARTBEAT mechanism (liveness checks)
Create checklist for message format compliance

What's Next

In the final article:

Article 6: The Budget Calculator Paradox: When Tests Don't Match Reality
- Flip-flopping 8 times on a simple formula
- Build the calculator first, use it everywhere
- Cycle quantization and margin requirements

Tags: #multi-agent #communication #protocols #git #async-messaging #coordination #distributed-systems


Discussion: How do your agents communicate? File-based, API-based, or something else? What challenges have you faced with agent coordination? Share in the comments!