Agent traces are not enough. Agent runs need operating records.

#agents #ai #devops #mcp

Most production-agent discussions eventually land on observability.

That is good. Traces matter.

But I think traces are only one slice of what teams actually need once agents start touching tools, files, tickets, browsers, MCP servers, credentials, or customer-facing systems.

A trace answers: what happened inside this run?

An operating record answers a wider set of questions:

Which agent was installed and running?
Which model/provider/config was active?
Which MCP servers and tools were visible to the agent?
Which permissions were granted for this run?
Which actions required approval?
What did the agent actually change?
What failed, retried, or timed out?
Can another person replay the decision later?
Can I stop, recover, or uninstall the system cleanly?

That second set of questions is the part I keep seeing teams rebuild ad hoc.

The shift from prompt debugging to run operations

When an agent is just a demo, the prompt feels like the center of the system.

When an agent is running every day, the center shifts to operations:

setup state
tool exposure
run boundaries
approval policy
event history
rollback path
cost and latency drift
evidence for what happened

This is especially true with MCP. A manifest can tell you which tools exist. It does not, by itself, tell you which tools were exposed to a specific agent run, which arguments were passed, what side effects happened, or why a guard allowed or blocked an action.
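
One way to make that gap concrete is to snapshot, per run, the subset of manifest tools that was actually exposed, and to log each call with its arguments and the reason it was allowed or blocked. The sketch below is only an illustration: `effective_tools`, `ToolCallEvent`, and `record_call` are hypothetical names, the policy is a plain allowlist, and none of this is an MCP or Armorer API.

```python
from dataclasses import dataclass, field, asdict
from typing import Any


def effective_tools(manifest_tools: list[str], run_allowlist: set[str]) -> list[str]:
    """Return the manifest tools actually exposed to one run, in manifest order."""
    return [name for name in manifest_tools if name in run_allowlist]


@dataclass
class ToolCallEvent:
    """One tool call as it might appear in a run's event history (hypothetical shape)."""
    run_id: str
    tool: str
    arguments: dict[str, Any]
    allowed: bool
    reason: str  # why the call was allowed or blocked
    side_effects: list[str] = field(default_factory=list)


def record_call(run_id: str, tool: str, arguments: dict[str, Any],
                exposed: list[str]) -> ToolCallEvent:
    """Assumed policy for this sketch: a call is allowed only if the tool was exposed."""
    allowed = tool in exposed
    reason = "tool exposed to this run" if allowed else "tool not exposed to this run"
    return ToolCallEvent(run_id, tool, arguments, allowed, reason)


# The manifest lists three tools, but this run only sees two of them.
manifest = ["read_file", "write_file", "delete_file"]
exposed = effective_tools(manifest, {"read_file", "write_file"})

event = record_call("run-001", "delete_file", {"path": "/tmp/report.txt"}, exposed)
print(exposed)
print(asdict(event))
```

The point of the sketch is that the manifest and the per-run exposure are stored as separate facts, so a later reader can tell "the tool existed" apart from "this run could call it".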

What I want from agent infrastructure

For local and self-hosted agents, I want a boring control surface:

Install the agent.
Configure the provider and runtime.
See which tools and permissions exist.
Start and stop runs.
Inspect the job state.
Require approvals for risky actions.
Keep receipts for what happened.
Uninstall cleanly.
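
The "start and stop runs" and "inspect the job state" items can be sketched as a small state machine. This is a minimal illustration under my own assumptions: the state names, the transition table, and the `Job` class are hypothetical and do not describe any particular tool's job model.

```python
from enum import Enum


class JobState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    AWAITING_APPROVAL = "awaiting_approval"
    STOPPED = "stopped"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transition table: terminal states have no outgoing transitions.
ALLOWED = {
    JobState.PENDING: {JobState.RUNNING, JobState.STOPPED},
    JobState.RUNNING: {
        JobState.AWAITING_APPROVAL,
        JobState.STOPPED,
        JobState.COMPLETED,
        JobState.FAILED,
    },
    JobState.AWAITING_APPROVAL: {JobState.RUNNING, JobState.STOPPED},
    JobState.STOPPED: set(),
    JobState.COMPLETED: set(),
    JobState.FAILED: set(),
}


class Job:
    """Tracks the current state and keeps every state the job has been in."""

    def __init__(self, job_id: str) -> None:
        self.job_id = job_id
        self.state = JobState.PENDING
        self.history = [JobState.PENDING]

    def transition(self, new_state: JobState) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} is not allowed")
        self.state = new_state
        self.history.append(new_state)


job = Job("job-42")
job.transition(JobState.RUNNING)
job.transition(JobState.AWAITING_APPROVAL)  # a risky action pauses the run
job.transition(JobState.STOPPED)            # the operator declines and stops it
print([s.value for s in job.history])
```

Keeping the history alongside the current state is what lets "inspect the job state" answer both "where is it now?" and "how did it get here?".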

That sounds less exciting than a new agent demo, but it is the layer that makes repeated use feel sane.

Where Armorer fits

This is what we are building Armorer around: a local control plane for AI agents.

Armorer is not meant to be another agent framework. The goal is to sit around agents and make the operational state visible: installed agents, setup, running jobs, local configuration, approvals, audit trails, and recovery.

Repo: https://github.com/ArmorerLabs/Armorer

Where Armorer Guard fits

Armorer Guard is the companion piece: a local Rust guard layer for agent inputs and tool-call risk.

The key idea is that a guard decision should not just be a yes/no result or a block count. It should leave a small record that someone can inspect later: what was evaluated, what policy applied, why the decision happened, and what the runtime did with it.
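
To illustrate what such a record might contain, here is a minimal sketch in Python. It is not Armorer Guard's actual format or API (Guard itself is written in Rust); the `GuardDecisionRecord` type, its field names, and the three decision values are assumptions made for this example.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Assumed decision vocabulary for this sketch.
DECISIONS = {"allow", "block", "needs_approval"}


@dataclass(frozen=True)
class GuardDecisionRecord:
    evaluated: str       # what was evaluated, e.g. a summary of the tool call
    policy: str          # which policy applied
    decision: str        # one of DECISIONS
    reason: str          # why the decision happened
    runtime_action: str  # what the runtime did with the decision
    timestamp: str       # UTC, ISO 8601


def make_record(evaluated: str, policy: str, decision: str,
                reason: str, runtime_action: str) -> GuardDecisionRecord:
    """Build an immutable record, rejecting decisions outside the assumed vocabulary."""
    if decision not in DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    return GuardDecisionRecord(
        evaluated=evaluated,
        policy=policy,
        decision=decision,
        reason=reason,
        runtime_action=runtime_action,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )


record = make_record(
    evaluated="shell.exec(command='rm -rf ./build')",
    policy="destructive-shell-commands",
    decision="block",
    reason="command matches a recursive delete pattern",
    runtime_action="tool call skipped; agent was told the action was blocked",
)

# One JSON object per line is easy to append to a log and to grep later.
line = json.dumps(asdict(record), sort_keys=True)
print(line)
```

The record is frozen so that a decision, once written, cannot be edited in place; corrections would be new records.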

Repo: https://github.com/ArmorerLabs/Armorer-Guard

The question I am trying to answer

What is the minimum useful operating record for an AI-agent run?

My current answer is:

run identity
agent/runtime version
effective tool/capability set
inputs and relevant context
policy/guard decisions
approvals
side effects
recovery/stop state
evidence links
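
As a starting point for discussion, the list above can be written down as a single record type. This is a sketch of my current answer, not a finished schema: the `OperatingRecord` name, the field names and types, and the example values are all assumptions.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class OperatingRecord:
    # run identity
    run_id: str
    # agent/runtime version
    agent_version: str
    runtime_version: str
    # effective tool/capability set for this run
    effective_tools: list[str]
    # inputs and relevant context
    inputs: dict[str, str]
    # policy/guard decisions, approvals, and side effects accumulate during the run
    guard_decisions: list[dict] = field(default_factory=list)
    approvals: list[dict] = field(default_factory=list)
    side_effects: list[str] = field(default_factory=list)
    # recovery/stop state
    stop_state: str = "running"
    # evidence links (logs, diffs, traces)
    evidence_links: list[str] = field(default_factory=list)


record = OperatingRecord(
    run_id="run-001",
    agent_version="0.1.0",
    runtime_version="example-runtime-1.0",
    effective_tools=["read_file", "write_file"],
    inputs={"task": "summarize the weekly report"},
)

# Events are appended as the run progresses.
record.guard_decisions.append({"tool": "write_file", "decision": "needs_approval"})
record.approvals.append({"tool": "write_file", "approved_by": "operator", "approved": True})
record.side_effects.append("wrote ./summary.md")
record.stop_state = "completed"
record.evidence_links.append("file://./logs/run-001.jsonl")

payload = json.dumps(asdict(record), indent=2)
print(payload)
```

A trace would typically be one of the evidence links here, which is the relationship I have in mind: the trace is part of the record, not a substitute for it.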

Curious how other people are modeling this. If you are running agents in production or locally with MCP-heavy workflows, what fields do you wish every run left behind?