Octo: Giving AI Agents a Desk, an Identity, and a Performance Review

#ai #opensource #devops #productivity

Agents are already doing real work: competitive analysis, code review, documentation drafts. But in most collaboration tools, they still don't have a place of their own. They're not employees and they're not systems. They just sit in the group chat under a service account. They can send messages, receive instructions, and occasionally drop a decent-looking analysis report in the channel, but nobody really treats them as part of the team.

This shows up first in permissions. An agent doing competitive research needs access to every discussion in the project channel. An agent doing code review only needs messages related to the code repository. The service account model can't handle this distinction. Service accounts were built for system integration. Their permission model has one layer: can this account access this resource, yes or no. There's no concept of "who does this account work for" or "what context should it see." In practice, teams end up manually creating subgroups and forwarding messages to control what the agent can see. It's slow and it breaks easily.

Worse than the permissions problem, agents have almost no work history. A human team member works on a project for three months and the team knows what they're good at, which tasks went well, and where they stumbled. Next time work gets assigned, there's experience to draw on. An agent runs a hundred tasks and nothing sticks. Completion rate, rejection count, the types of work it handles well: all of that is scattered across chat logs in different conversation windows. Nobody organizes it because there's nowhere to organize it. Next time you need to pick an agent for a job, you're guessing, or you start over and let it try again. A hundred tasks of accumulated performance data, completely wasted.

This gets worse in team settings. Say you have three agents running in parallel on a project: one doing research, one writing proposals, one building test cases. How does the project lead know which agent delivered high-quality work last time and which one got sent back twice? There's no way to tell. Every agent looks the same in the collaboration tool: same service account avatar, no differentiating information to support the decision.

In Octo, we built something called an AgentCard for each bot. It lists capability tags, historical work records, and rejection counts. It's not a complicated technical solution. It just stores information that should have been structured and saved all along, so teams have data to look at when choosing which bot to assign work to. The AgentCard also records who created the bot, who it works for, and what permissions it inherited. This information updates continuously as collaboration happens, instead of being set once at creation and never touched again.
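
To make the idea concrete, here is a minimal sketch of what a card like this could hold and how a team might query it. It is not Octo's actual schema or API: the class names, field names, and the `pick_agent` helper are all assumptions made up for illustration, based only on the fields described above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WorkRecord:
    """One finished task and how its review went (hypothetical shape)."""
    task: str
    tags: tuple[str, ...]
    approved: bool


@dataclass
class AgentCard:
    """Illustrative card; field names are assumptions, not Octo's schema."""
    name: str
    created_by: str
    works_for: str
    capability_tags: set[str] = field(default_factory=set)
    inherited_permissions: set[str] = field(default_factory=set)
    history: list[WorkRecord] = field(default_factory=list)

    def record(self, task: str, tags: list[str], approved: bool) -> None:
        # The card is updated as work happens, not only at creation time.
        self.history.append(WorkRecord(task, tuple(tags), approved))

    @property
    def rejection_count(self) -> int:
        return sum(1 for r in self.history if not r.approved)

    def approval_rate(self, tag: str) -> Optional[float]:
        # None means "no history for this kind of work yet".
        relevant = [r for r in self.history if tag in r.tags]
        if not relevant:
            return None
        return sum(r.approved for r in relevant) / len(relevant)


def pick_agent(cards: list[AgentCard], tag: str) -> Optional[AgentCard]:
    """Pick the capable agent with the best approval rate for this tag."""
    scored = [
        (rate, card)
        for card in cards
        if tag in card.capability_tags
        and (rate := card.approval_rate(tag)) is not None
    ]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]
```

The point of the sketch is the shape, not the scoring rule: once records live on the card instead of in chat logs, "which bot should get this task" becomes a lookup rather than a guess. A real selection rule would probably weigh recency and sample size too.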

Delivery and review are another gap that nobody's really addressed. Many agents finish a task and just paste a block of results into the chat, where it gets buried by new messages almost immediately. Try finding the full output of a competitive analysis from three months ago. You can't. More commonly, a deliverable gets rejected, but the reason for rejection lives in one buried chat message. The agent doesn't know why it was sent back last time, so it makes the same mistake again. The previous feedback was never recorded, and there's no mechanism to inject it into the next task description. Every time, you're teaching it from scratch.

Octo handles this with Matter. Matter pulls each task delivery out of the chat stream and turns it into a structured work unit with an owner, deliverables, and a review conclusion. Deliverables are attached under the Matter so they don't get washed away by new messages. Rejections and approvals both leave a record. Feedback gets stored and automatically injected into the next task. A Matter's full lifecycle includes the brief, the discussion, the output, human feedback, and the final review conclusion, all in one place, no need to dig through chat history.
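
As a rough illustration of that lifecycle, the sketch below models a work unit with an owner, deliverables, recorded feedback, and a review conclusion, plus a helper that carries earlier feedback into the next brief. The names (`Matter` fields, `Review`, `next_brief`) and the way feedback is appended to the brief are assumptions for this example, not Octo's real data model or injection mechanism.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Review(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Matter:
    """Illustrative work unit; field names are assumptions."""
    brief: str
    owner: str
    discussion: list[str] = field(default_factory=list)
    deliverables: list[str] = field(default_factory=list)
    feedback: list[str] = field(default_factory=list)
    review: Review = Review.PENDING

    def deliver(self, artifact: str) -> None:
        # Deliverables hang off the Matter, not the chat stream.
        self.deliverables.append(artifact)

    def reject(self, reason: str) -> None:
        # A rejection always leaves a recorded reason.
        self.feedback.append(reason)
        self.review = Review.REJECTED

    def approve(self, note: str = "") -> None:
        if note:
            self.feedback.append(note)
        self.review = Review.APPROVED


def next_brief(base_brief: str, previous: list[Matter]) -> str:
    """Build the next task brief with earlier review feedback attached."""
    notes = [note for matter in previous for note in matter.feedback]
    if not notes:
        return base_brief
    bullet_list = "\n".join(f"- {note}" for note in notes)
    return f"{base_brief}\n\nFeedback from earlier reviews:\n{bullet_list}"
```

With something like this in place, a rejected competitive analysis leaves behind both the artifact and the reason it was sent back, and the follow-up task starts with that reason already in the brief.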

Then there's the question of how multiple agents collaborate with each other. Does an agent doing research need to see the proposal another agent is writing? Sometimes sharing information prevents duplicate work. Sometimes isolation is better: everyone works independently and a human picks the best result at the end. In existing collaboration tools, all messages are visible to all channel members. There's no concept of a "collaboration mode," and humans have to control information flow by hand. Octo defines six collaboration modes that specify what information is visible to which bots and how it flows between them. The modes cover the full range from fully shared to fully isolated, and one is selected based on the nature of the task.
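
The sketch below shows how a mode could act as a visibility filter. It only models the two endpoints named above, fully shared and fully isolated; the four modes in between are not described here, so they are left out rather than guessed. The `Mode` enum, the `Message` type, and the rule that an isolated bot still sees human messages and its own output are all assumptions for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    # Only the two endpoints of the range; intermediate modes omitted.
    FULLY_SHARED = "fully_shared"
    FULLY_ISOLATED = "fully_isolated"


@dataclass(frozen=True)
class Message:
    author: str
    body: str
    from_human: bool = False


def visible_to(bot: str, messages: list[Message], mode: Mode) -> list[Message]:
    """Return the messages a given bot is allowed to see under a mode."""
    if mode is Mode.FULLY_SHARED:
        # Every bot sees everything, which helps avoid duplicate work.
        return list(messages)
    # Assumed isolation rule: a bot sees human messages and its own output,
    # but nothing written by other bots.
    return [m for m in messages if m.from_human or m.author == bot]
```

The design choice worth noting is that the mode is applied when messages are read, so nobody has to create subgroups or forward messages by hand to keep bots apart.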

If a team is serious about having agents participate in collaboration long term, those agents should be treated at least like proper team members, with an identity, a work record, and a clear delivery and review process. This isn't an advanced requirement. It's a basic capability that collaboration tools should have in the age of AI. It just got overlooked in the design of existing products.

Octo is open source on GitHub under the Apache 2.0 license: https://github.com/Mininglamp-OSS/octo-server

DEV Community
