AI agents are no longer just generating text. They're sending emails, pushing code, updating CRM records, and modifying databases.
And when they go wrong, they really go wrong.
I've seen this pattern repeatedly: an agent works perfectly in testing, gets deployed, and then sends 200 emails to the wrong list. Or deletes the wrong GitHub issues. Or overwrites 3 months of CRM data.
The model didn't fail. The prompt was fine. There was just no safety net.
The problem isn't the agent. It's the execution layer.
Most teams handle this with logging. They add Langfuse or Helicone, watch the traces, and hope to catch mistakes before they do real damage.
But logging tells you what went wrong after it happened. What you actually need is the ability to undo it.
What reversible execution looks like
The core idea is simple: before any action executes, you log it and snapshot any state it is about to overwrite. After it executes, you store enough information to reverse it. If something goes wrong, you unwind in LIFO order, undoing the most recent action first.
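To make the log-then-unwind idea concrete, here is a minimal sketch of an in-memory action journal. Everything in it (`ActionJournal`, `run`, `rollback`, the status values) is a hypothetical name chosen for illustration, not AgentRein's API, and a real system would persist the journal rather than keep it in memory.

```typescript
// Minimal sketch of a reversible action journal (hypothetical names throughout).
type Compensate = () => Promise<void>;

interface JournalEntry {
  name: string;
  status: 'pending' | 'done' | 'failed' | 'compensated';
  compensate?: Compensate; // set only once the action has succeeded
}

class ActionJournal {
  readonly entries: JournalEntry[] = [];

  // Log the action before it runs; record how to undo it once it succeeds.
  async run<T>(
    name: string,
    execute: () => Promise<T>,
    makeCompensate: (result: T) => Compensate,
  ): Promise<T> {
    const entry: JournalEntry = { name, status: 'pending' };
    this.entries.push(entry); // logged before execution
    try {
      const result = await execute();
      entry.compensate = makeCompensate(result);
      entry.status = 'done';
      return result;
    } catch (err) {
      entry.status = 'failed'; // nothing to undo for an action that never completed
      throw err;
    }
  }

  // Unwind completed actions in LIFO order: most recent first.
  async rollback(): Promise<string[]> {
    const undone: string[] = [];
    for (const entry of [...this.entries].reverse()) {
      if (entry.status !== 'done' || !entry.compensate) continue;
      await entry.compensate();
      entry.status = 'compensated';
      undone.push(entry.name);
    }
    return undone;
  }
}
```

The `makeCompensate` callback receives the action's result, so the undo step can capture whatever it needs (a previous value, a created ID) at the moment the action succeeds.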
For every connector, you need a compensation handler, a function that defines what "undo" means for that specific action:
```typescript
// Email sent: can't unsend, but can send a correction and flag the thread
compensate: async (action) => {
  await sendCorrection(action.payload.to, action.payload.subject)
  await flagThread(action.payload.threadId)
}

// CRM record updated: revert to the snapshot taken before the update
compensate: async (action) => {
  await crm.records.update(action.payload.recordId, action.snapshot.before)
}

// GitHub issue created: close it
compensate: async (action) => {
  await octokit.issues.update({
    // owner and repo are required by the GitHub API; this assumes the
    // original create payload was recorded on the action
    owner: action.payload.owner,
    repo: action.payload.repo,
    issue_number: action.result.number,
    state: 'closed'
  })
}
```
Compensation isn't symmetric. "Undo send email" is not the same as "delete sent email." The action already had consequences. So each handler has to be action-aware, not generic.
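One way to keep handlers action-aware is a registry keyed by action type that refuses to guess when no handler exists. The sketch below is an assumption about how that could look: `registerCompensation`, `compensate`, and the action type strings are hypothetical names, and the handlers return a description string only so the behavior is easy to see.

```typescript
// Hypothetical registry: one compensation handler per action type, no generic fallback.
interface RecordedAction {
  type: string;
  payload: Record<string, unknown>;
}

type CompensationHandler = (action: RecordedAction) => Promise<string>;

const handlers = new Map<string, CompensationHandler>();

function registerCompensation(type: string, handler: CompensationHandler): void {
  handlers.set(type, handler);
}

async function compensate(action: RecordedAction): Promise<string> {
  const handler = handlers.get(action.type);
  if (!handler) {
    // Fail loudly: a generic "undo" would hide the fact that nothing was reversed.
    throw new Error(`No compensation handler registered for ${action.type}`);
  }
  return handler(action);
}

// Each handler describes what "undo" means for its own action.
registerCompensation('email.send', async (a) =>
  `sent correction to ${String(a.payload['to'])}`);
registerCompensation('crm.update', async (a) =>
  `restored record ${String(a.payload['recordId'])}`);
```

Throwing on an unregistered action type is a deliberate choice here: it surfaces gaps in coverage at rollback time instead of silently skipping them.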
Approval gates for high-risk actions
Not every action needs rollback. Some need prevention.
The pattern that works: define a risk threshold per action type. Actions above the threshold pause and wait for human approval before executing.
```typescript
const session = await agentrein.newSession({
  agentId: 'email-agent',
  intent: 'Send follow-up emails to leads from last week',
  approvalRules: [
    { action: 'gmail.send', requireApproval: true },
    { action: 'gmail.draft', requireApproval: false }
  ]
})
```
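If you want to see the threshold logic itself, here is a rough sketch that is independent of any SDK. The risk scores, the `APPROVAL_THRESHOLD` value, and the function names are all made up for illustration; how you score risk is up to you.

```typescript
// Hypothetical risk scores per action type (0 = harmless, 1 = irreversible).
const riskByAction = new Map<string, number>([
  ['gmail.draft', 0.1],
  ['crm.update', 0.5],
  ['gmail.send', 0.8],
]);

const APPROVAL_THRESHOLD = 0.6; // illustrative value, tune per deployment

function needsApproval(action: string): boolean {
  // Unknown action types are treated as maximum risk.
  const risk = riskByAction.get(action) ?? 1;
  return risk > APPROVAL_THRESHOLD;
}

type GateResult<T> = { status: 'executed'; result: T } | { status: 'rejected' };

// Pause on high-risk actions until a human approves; run low-risk ones directly.
async function executeWithGate<T>(
  action: string,
  execute: () => Promise<T>,
  requestApproval: (action: string) => Promise<boolean>,
): Promise<GateResult<T>> {
  if (needsApproval(action) && !(await requestApproval(action))) {
    return { status: 'rejected' };
  }
  return { status: 'executed', result: await execute() };
}
```

In practice `requestApproval` would post to Slack or a review queue and resolve when someone responds. Defaulting unknown actions to high risk means a newly added tool is gated until someone explicitly scores it.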
The audit trail problem
Even with rollback and approval gates, you need to know why the agent took each action, not just what it did.
Most logging tools capture the API call. What you actually need is the intent at the time of execution: what was the agent trying to accomplish, and did this action match that goal?
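As a sketch of what intent-aware logging could capture, the record below stores the session's intent and the agent's stated reason next to every action, and flags actions that fall outside what the session was expected to do. The field names and the `expectedActions` allow-list are assumptions for illustration, not a description of any particular tool.

```typescript
// Hypothetical audit record: the "why" is stored alongside the "what".
interface AuditRecord {
  timestamp: string;
  agentId: string;
  intent: string;         // session goal at the time of execution
  action: string;         // e.g. 'gmail.send'
  payload: unknown;
  reason: string;         // the agent's stated rationale for this step
  matchesIntent: boolean; // was this action type expected for the intent?
}

interface SessionContext {
  agentId: string;
  intent: string;
  expectedActions: string[]; // assumed allow-list declared with the intent
}

class AuditTrail {
  private readonly records: AuditRecord[] = [];
  private readonly session: SessionContext;

  constructor(session: SessionContext) {
    this.session = session;
  }

  record(action: string, payload: unknown, reason: string): AuditRecord {
    const entry: AuditRecord = {
      timestamp: new Date().toISOString(),
      agentId: this.session.agentId,
      intent: this.session.intent,
      action,
      payload,
      reason,
      matchesIntent: this.session.expectedActions.includes(action),
    };
    this.records.push(entry);
    return entry;
  }

  // Actions that did not fit the declared intent: the first thing to review.
  offIntent(): AuditRecord[] {
    return this.records.filter((r) => !r.matchesIntent);
  }
}
```

An allow-list is a crude proxy for "did this action match the goal," but it turns the question into something you can query after the fact.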
What we built
We built AgentRein to solve exactly this. It's a drop-in SDK that wraps your existing tools and adds rollback, approval gates, and audit logs.
```typescript
import { AgentRein } from 'agentrein'
import { Octokit } from '@octokit/rest'

const agentrein = new AgentRein({ apiKey: 'YOUR_API_KEY' })
const octokit = new Octokit({ auth: 'GITHUB_TOKEN' })

const session = await agentrein.newSession({
  agentId: 'onboarding-agent',
  intent: 'Create GitHub issue and notify Slack'
})

// Wrap the existing client so its calls are tracked in this session
const agentOctokit = agentrein.wrap(octokit, session, { connector: 'github' })

const issue = await agentOctokit.issues.create({
  owner: 'my-org',
  repo: 'my-repo',
  title: 'Onboarding task'
})

await agentrein.completeSession(session)
```
It ships with pre-built compensation handlers for GitHub, Stripe, Slack, Gmail, Notion, and HubSpot. A free tier is available.
If you're running agents in production that touch real systems, I'd love your feedback: agentrein.com