BarakBot

Tomer Barak — Tue, 01 Apr 2025 04:15:57 +0000

BarakBot (ברק בוט) is an advanced multi-agent system where users interact with specialized Large Language Models (LLMs) via a Telegram bot interface. Originally designed for family communication in both Hebrew and English, the project has evolved into a comprehensive personal assistant capable of handling various tasks through different specialized agents, each with specific roles and capabilities.

Features

Multi-Agent Architecture

BarakBot uses a self-referential agent architecture in which specialized LLM agents collaborate to handle different tasks. Rather than relying on a complex "agent hub," each agent requests assistance from other agents by sending a message to the bot itself, the same way a human user would. This creates a more natural interaction flow and simplifies the system architecture.

Tasker Agent with Obsidian Integration

The newest addition to BarakBot is a dedicated Tasker Agent that helps users manage their tasks, projects, and information using an integrated Obsidian vault:

Creates, reads, updates, and deletes notes in an organized structure
Maintains a working memory to track user preferences and ongoing tasks
Implements a comprehensive priority system for tasks:
- 🔺 Highest priority (critical and time-sensitive)
- ⏫ High priority (important but not critical)
- 🔼 Medium priority (should be done soon)
- 🔽 Low priority (can wait if needed)
Adds due dates to time-sensitive tasks, marked with 📅
Tracks task completion status ([ ] vs [x])
Organizes information in a logical folder structure
Presents information naturally without exposing the underlying note-taking system to users
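
As a rough illustration of the task conventions listed above, the sketch below renders one task as a single checkbox line. The `Task` class, the `format_task` helper, the priority key names, and the exact ordering of the markers are assumptions made for this example, not BarakBot's actual code.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Priority markers as listed above; the key names are illustrative.
PRIORITY_EMOJI = {
    "highest": "🔺",
    "high": "⏫",
    "medium": "🔼",
    "low": "🔽",
}


@dataclass
class Task:
    description: str
    priority: Optional[str] = None  # one of the PRIORITY_EMOJI keys
    due: Optional[date] = None
    done: bool = False


def format_task(task: Task) -> str:
    """Render a task as one checkbox line (assumed layout)."""
    checkbox = "[x]" if task.done else "[ ]"
    parts = [f"- {checkbox} {task.description}"]
    if task.priority is not None:
        parts.append(PRIORITY_EMOJI[task.priority])
    if task.due is not None:
        # ISO date after the calendar marker is an assumption.
        parts.append(f"📅 {task.due.isoformat()}")
    return " ".join(parts)


print(format_task(Task("Pay rent", priority="highest", due=date(2025, 4, 1))))
```

Keeping each task on one line means completion status, priority, and due date can all be read back from the note with simple string matching.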

General Agent & Session Management

A general agent serves as the primary interface, managing conversation flows and deciding when to invoke specialized agents. It maintains context-aware conversations in sessions, with prompts and responses handled appropriately based on whether the interaction is in a private chat or group setting.

Scheduler Integration

The scheduler agent manages notifications and reminders based on triggers defined in a JSON file. It can handle both recurring schedules (using cron syntax) and one-time events:

Users can set up reminders for specific times or days
The scheduler can trigger other agents automatically
Cross-chat communication is supported, allowing messages to be sent between different chats
Reminders can be modified or rescheduled as needed
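
To make the trigger idea concrete, here is a minimal sketch of how a JSON trigger list with both recurring and one-time entries might be evaluated. The field names (`chat_id`, `agent`, `message`, `cron`, `at`) and the helper functions are hypothetical, and the cron matcher is deliberately simplified: it only understands `*` and comma-separated integers, and it requires every field to match.

```python
import json
from datetime import datetime

# Hypothetical trigger file contents; the field names are assumptions.
TRIGGERS_JSON = """
[
  {"chat_id": 1, "agent": "general", "message": "Good morning!",
   "cron": "0 8 * * 0,1,2,3,4"},
  {"chat_id": 2, "agent": "photo", "message": "Share a photo from last summer",
   "at": "2025-04-01T18:30"}
]
"""


def field_matches(field: str, value: int) -> bool:
    """Match one cron field; supports only '*' and comma-separated integers."""
    if field == "*":
        return True
    return value in {int(part) for part in field.split(",")}


def cron_matches(expr: str, now: datetime) -> bool:
    """Check a five-field expression: minute hour day month weekday."""
    minute, hour, day, month, weekday = expr.split()
    cron_weekday = now.isoweekday() % 7  # cron counts Sunday as 0
    return (
        field_matches(minute, now.minute)
        and field_matches(hour, now.hour)
        and field_matches(day, now.day)
        and field_matches(month, now.month)
        and field_matches(weekday, cron_weekday)
    )


def due_triggers(triggers: list, now: datetime) -> list:
    """Return the triggers that should fire in the current minute."""
    now = now.replace(second=0, microsecond=0)
    due = []
    for trigger in triggers:
        if "cron" in trigger and cron_matches(trigger["cron"], now):
            due.append(trigger)
        elif "at" in trigger and datetime.fromisoformat(trigger["at"]) == now:
            due.append(trigger)
    return due


triggers = json.loads(TRIGGERS_JSON)
for trigger in due_triggers(triggers, datetime(2025, 4, 1, 8, 0)):
    print(trigger["chat_id"], trigger["agent"], trigger["message"])
```

Because each entry names a target chat and agent, the same loop would cover both cross-chat messages and automatic agent triggering. A production scheduler would more likely use an established cron library than this hand-rolled matcher.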

Photo Handling

The photo agent manages image sharing and captioning. Users can request photos from specific time periods or with particular family members, and the agent will retrieve and caption appropriate images. This creates an interactive photo-sharing experience that feels natural and contextual.

Web UI

A web interface provides tools for prompting and testing the LLMs, making it easier to develop and refine prompts without going through the Telegram interface.

System Architecture

Self-Referential Design

The self-referential agent architecture allows BarakBot to function as a cohesive system while maintaining specialized capabilities. When an agent needs functionality outside its domain:

It identifies the need for another agent's capabilities
Rather than using a dedicated communication channel, it simply sends a message to the bot itself
The bot routes the message to the appropriate specialized agent
The response is returned through the same channel

This approach simplifies agent coordination without requiring additional infrastructure, creating a more natural and coherent experience for both users and the system itself.
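
The flow above can be sketched in a few lines. This is a toy model under stated assumptions: the `Bot` class, its keyword-based routing, the agent functions, and the placeholder caption are all invented for illustration, and the real bot presumably routes with an LLM rather than substring matching.

```python
from typing import Callable, Dict, List, Tuple


class Bot:
    """Toy bot: one entry point shared by human users and agents."""

    def __init__(self) -> None:
        self.agents: Dict[str, Callable[["Bot", str], str]] = {}
        self.log: List[Tuple[str, str]] = []

    def register(self, keyword: str, agent: Callable[["Bot", str], str]) -> None:
        self.agents[keyword] = agent

    def handle_message(self, sender: str, text: str) -> str:
        self.log.append((sender, text))
        # Naive keyword routing, used here only to keep the sketch runnable.
        for keyword, agent in self.agents.items():
            if keyword in text.lower():
                return agent(self, text)
        return "Sorry, I can't help with that yet."


def scheduler_agent(bot: Bot, text: str) -> str:
    return f"Reminder saved: {text}"


def photo_agent(bot: Bot, text: str) -> str:
    caption = "Family picnic"  # placeholder caption
    # No scheduling ability here, so the agent messages the bot as a user would.
    reply = bot.handle_message("photo_agent", "Remind me tomorrow about this image")
    return f"{caption}. {reply}"


bot = Bot()
bot.register("remind", scheduler_agent)
bot.register("photo", photo_agent)

print(bot.handle_message("human", "Show me a photo from the picnic"))
print(bot.log)
```

Both the human request and the agent's follow-up pass through the same `handle_message` call, which is the whole point of the design. A real system would also need a guard against agents messaging each other in an endless loop.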

Action-Based Communication

All agents use a standardized action-based communication format:

ACTIONS:
ACTION1_parameters_END_ACTION1
ACTION2_parameters_END_ACTION2

This creates a consistent interface between components and allows for:

Internal thinking processes (THINK action) that aren't shown to users
Natural responses in the appropriate language (RESPOND action)
Agent-to-agent communication (CHAT_AGENT action)
Specialized actions for each agent type (e.g., GET_PHOTO, VAR_CIBUS_RUN)
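
A parser for this format could look roughly like the sketch below. It assumes each action occupies a single line and that the delimiters are literally `NAME_` and `_END_NAME` as in the template above; the function names and the sample reply are made up for the example.

```python
import re
from typing import List, Tuple

# Assumes one action per line, written as NAME_parameters_END_NAME.
# The backreference makes sure the closing name equals the opening one,
# which also handles names that contain underscores (e.g. CHAT_AGENT).
ACTION_LINE = re.compile(
    r"^(?P<name>[A-Z][A-Z0-9_]*?)_(?P<params>.*)_END_(?P=name)$"
)


def parse_actions(reply: str) -> List[Tuple[str, str]]:
    """Extract (name, parameters) pairs from the lines after 'ACTIONS:'."""
    actions = []
    in_block = False
    for line in reply.splitlines():
        line = line.strip()
        if line == "ACTIONS:":
            in_block = True
            continue
        if not in_block or not line:
            continue
        match = ACTION_LINE.match(line)
        if match:
            actions.append((match.group("name"), match.group("params")))
    return actions


def visible_text(actions: List[Tuple[str, str]]) -> str:
    """Keep only RESPOND parameters; THINK and other actions stay internal."""
    return "\n".join(params for name, params in actions if name == "RESPOND")


sample_reply = """ACTIONS:
THINK_The user wants a photo_END_THINK
CHAT_AGENT_Please remind me about this image_END_CHAT_AGENT
RESPOND_Here is your photo!_END_RESPOND"""

parsed = parse_actions(sample_reply)
print(parsed)
print(visible_text(parsed))
```

Separating parsing from dispatch keeps the user-facing text (RESPOND) apart from internal reasoning (THINK) and from requests that go back into the bot (CHAT_AGENT).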

Technologies Used

Python for backend logic and agent coordination
Telegram Bot API for messaging and user interactions
OpenAI's GPT models for language understanding and natural responses
Obsidian for structured information and task management
Flask for the web testing interface
JSON for configuration and scheduling data
Web search capabilities for real-time information access

For more information, see my website.

Self-reference in AI agents

Tomer Barak — Tue, 01 Apr 2025 04:13:10 +0000

In developing BarakBot, I encountered a fundamental challenge in multi-agent coordination. How should distinct AI agents communicate effectively while keeping the architecture simple and scalable? The solution turned out to be surprisingly intuitive: let them talk to themselves.

The Multi-Agent Challenge

BarakBot consists of multiple specialized LLM agents operating within a Telegram bot interface:

A general agent managing conversations and delegating tasks
A photo agent generating captions for images
A scheduler agent handling reminders and notifications

Each agent performs well in isolation, but they often need to collaborate. For example, the photo agent might need the scheduler agent to remind a user about an analyzed image. The naive approach, hardcoding direct interactions between each pair of agents, quickly becomes unmanageable as the number of agents grows.

The Conventional Approach: Centralized Agent Hub

A typical solution is to implement an "agent hub," a structured interface where agents formally request services from each other. While common in multi-agent systems, this method introduces unnecessary complexity:

Requires additional infrastructure for agent communication
Adds debugging challenges as the network of interactions expands
Creates separate protocols for human-agent and agent-agent interactions

This approach felt artificial—unlike how humans coordinate thoughts and actions internally.

The Self-Referential Solution

The breakthrough came when I reframed the problem from an agent's perspective. If an agent needed another agent's help, what would be the most natural way to request it? The answer: do exactly what a human user does—send a message to the bot.

This connects to an interesting observation: while the agents were switching behind the scenes, users didn't notice the transitions. From the human perspective, the bot remained a singular, coherent entity, even as it dynamically changed between specialized agents. The key insight is that humans naturally maintain a unified mental model of the bot, regardless of its internal complexity.

If the system was coherent enough for users to treat it as one entity, why not let the agents do the same? Instead of addressing specific subcomponents, each agent could simply refer to a single, overarching "assistant"—which, in reality, is just the bot itself. When an agent requires a capability it lacks, it simply asks the bot, which routes the request appropriately—just as it does for human users.

Why This Works

Simplicity: No additional communication protocols or infrastructure.
Consistency: The same mechanism handles human and agent interactions.
Scalability: New agents integrate seamlessly without modifying existing connections.

Conclusion

This approach simplifies multi-agent coordination without requiring additional infrastructure, allowing agents to leverage existing communication channels. By treating the bot as a unified entity, both users and agents interact with it in a way that feels natural and coherent.

More broadly, this highlights an interesting parallel between AI coordination and human cognition. Just as people maintain a stable sense of identity despite shifting internal processes, AI agents can function cohesively within a larger system while operating independently behind the scenes.

There is still much to explore, especially in understanding the implications of self-referential communication in AI systems. As BarakBot evolves, these insights will help refine its design and expand its capabilities. For more details, visit my website.

DEV Community: Tomer Barak