When deploying AI agents in enterprise environments, three requirements typically surface: every action must be traceable, multi-step operations must be atomic, and context must persist across sessions. AgentHelm v0.3.0 addresses all three.
This walkthrough demonstrates building a contract processing system with two specialized agents and full observability.
## The Enterprise Requirements
Before diving into code, let's establish what enterprise deployments demand:
- Traceability: Every tool call must be logged with inputs, outputs, timing, and cost
- Atomicity: If step 3 of 5 fails, steps 1-2 must be rolled back
- Memory Persistence: Agent context survives restarts and can be audited
- Cost Visibility: Know exactly what each operation costs before the invoice arrives
## Architecture Overview

```
        ┌─────────────────────────────────┐
        │          PlannerAgent           │
        │ (Generates execution blueprint) │
        └────────────────┬────────────────┘
                         │
                         ▼
        ┌─────────────────────────────────┐
        │          Orchestrator           │
        │    (Saga pattern execution)     │
        └────────────────┬────────────────┘
                         │
            ┌────────────┴────────────┐
            ▼                         ▼
 ┌─────────────────────┐   ┌─────────────────────┐
 │    AnalysisAgent    │   │    DocumentAgent    │
 │ (Extract & analyze) │   │  (Generate & save)  │
 └──────────┬──────────┘   └──────────┬──────────┘
            │                         │
            └────────────┬────────────┘
                         │
                         ▼
        ┌─────────────────────────────────┐
        │            MemoryHub            │
        │  (SQLite short-term + Qdrant)   │
        └────────────────┬────────────────┘
                         │
                         ▼
        ┌─────────────────────────────────┐
        │         ExecutionTracer         │
        │    (SQLite + OpenTelemetry)     │
        └────────────────┬────────────────┘
                         │
                         ▼
        ┌─────────────────────────────────┐
        │             Jaeger              │
        │     (Trace visualization)       │
        └─────────────────────────────────┘
```
## Setup

```shell
pip install agenthelm
```

```python
import dspy
from agenthelm import (
    ToolAgent, PlannerAgent, Orchestrator, AgentRegistry,
    tool, MemoryHub, ExecutionTracer,
)
from agenthelm.core.storage import SqliteStorage
from agenthelm.tracing import init_tracing

lm = dspy.LM("mistral/mistral-large-latest")
```
## Configure Observability First

In enterprise deployments, observability is not an afterthought.

```python
# Initialize OpenTelemetry with Jaeger
init_tracing(
    service_name="contract-processor",
    otlp_endpoint="http://jaeger:4317",
    enabled=True,
)

# Create execution tracer with persistent storage
tracer = ExecutionTracer(
    storage=SqliteStorage("/var/log/agenthelm/traces.db"),
    session_id="contract-batch-2025-12-26",
)
```
Every tool execution is now:
- Logged to SQLite with full inputs/outputs
- Exported to Jaeger for distributed tracing
- Tagged with session ID for batch correlation
## Configure Memory Persistence

```python
# Production memory configuration
memory = MemoryHub(
    data_dir="/var/data/agenthelm",  # Local persistence
    # Or for network mode:
    # redis_url="redis://cache.internal:6379",
    # qdrant_url="http://qdrant.internal:6333",
)
```
MemoryHub provides:
- Short-term memory: Session state with TTL (SQLite locally, Redis for distributed)
- Semantic memory: Vector search for context retrieval (Qdrant with FastEmbed)
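The short-term layer behaves like a key-value store whose entries expire. A minimal sketch of that idea in plain Python (illustrative only; `TTLStore` is not an AgentHelm class):

```python
import time


class TTLStore:
    """Key-value store whose entries expire after `ttl` seconds."""

    def __init__(self, ttl: float = 3600.0):
        self.ttl = ttl
        self._data: dict[str, tuple[float, object]] = {}

    def set(self, key: str, value: object) -> None:
        # Record the expiry deadline alongside the value
        self._data[key] = (time.monotonic() + self.ttl, value)

    def get(self, key: str, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._data[key]  # lazily evict expired entries on read
            return default
        return value


store = TTLStore(ttl=0.05)
store.set("batch_status", {"processed": 1})
print(store.get("batch_status"))  # {'processed': 1}
time.sleep(0.1)
print(store.get("batch_status"))  # None (expired)
```

Redis gives you the same semantics natively via `SETEX`/`EXPIRE`, which is why it slots in for the distributed case.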
## Define Tools with Compensation
Atomicity requires compensating actions for rollback.
```python
import hashlib
import os


@tool()
def extract_contract_data(document_path: str) -> dict:
    """Extract structured data from a contract document."""
    # Simulated extraction
    return {
        "parties": ["Acme Corp", "Widget Inc"],
        "value": 150000,
        "terms": "12 months",
        "effective_date": "2025-01-01",
    }


@tool()
def validate_compliance(contract_data: dict) -> dict:
    """Validate contract against compliance rules."""
    issues = []
    if contract_data.get("value", 0) > 100000:
        issues.append("Requires senior approval for contracts > $100k")
    return {"valid": len(issues) == 0, "issues": issues}


@tool(compensating_tool="delete_record")
def create_record(contract_data: dict, record_type: str) -> str:
    """Create a record in the system."""
    # Use a stable digest; built-in hash() varies between processes
    digest = int(hashlib.sha256(str(contract_data).encode()).hexdigest(), 16)
    record_id = f"REC-{digest % 10000:04d}"
    # In production: database insert
    return record_id


@tool()
def delete_record(record_id: str) -> str:
    """Delete a record (compensation action)."""
    # In production: database delete
    return f"Deleted {record_id}"


@tool(compensating_tool="archive_document")
def generate_summary(contract_data: dict, output_path: str) -> str:
    """Generate and save a contract summary document."""
    content = f"Contract Summary: {contract_data}"
    with open(output_path, "w") as f:
        f.write(content)
    return output_path


@tool()
def archive_document(output_path: str) -> str:
    """Archive a document (compensation action)."""
    if os.path.exists(output_path):
        os.rename(output_path, f"{output_path}.archived")
    return f"Archived {output_path}"
```
Note the `compensating_tool` parameter. When the Orchestrator detects a failure, it automatically calls these in reverse order.
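The mechanics are easy to sketch in isolation. Below is a stripped-down saga executor (an illustration of the pattern, not AgentHelm's internals): run each step, remember its compensation, and on failure undo the completed steps in reverse:

```python
def run_saga(steps):
    """Execute (action, compensation) pairs in order.

    On failure, invoke the compensations of completed steps in
    reverse order, then re-raise the original exception.
    """
    completed = []  # (compensation, result) per finished step
    try:
        for action, compensation in steps:
            result = action()
            completed.append((compensation, result))
    except Exception:
        for compensation, result in reversed(completed):
            if compensation is not None:
                compensation(result)  # undo with the step's own output
        raise
    return [result for _, result in completed]


# Demo: step 2 fails, so step 1's compensation runs automatically.
audit = []

def create_record():
    audit.append("created REC-0001")
    return "REC-0001"

def delete_record(record_id):
    audit.append(f"deleted {record_id}")

def generate_summary():
    raise RuntimeError("disk full")

try:
    run_saga([(create_record, delete_record), (generate_summary, None)])
except RuntimeError:
    pass

print(audit)  # ['created REC-0001', 'deleted REC-0001']
```

Passing each step's result into its compensation is the key detail: `delete_record` needs the record ID that `create_record` produced.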
## Create Specialized Agents

```python
# Agent 1: Analysis specialist
analysis_agent = ToolAgent(
    name="analyst",
    lm=lm,
    tools=[extract_contract_data, validate_compliance],
    tracer=tracer,
    memory=memory,
    role="You are a contract analysis specialist. Extract data accurately.",
)

# Agent 2: Document specialist
document_agent = ToolAgent(
    name="documenter",
    lm=lm,
    tools=[create_record, generate_summary],
    tracer=tracer,
    memory=memory,
    role="You are a document management specialist. Create records precisely.",
)

# Register agents
registry = AgentRegistry()
registry.register(analysis_agent)
registry.register(document_agent)
```
Both agents share:
- The same tracer (unified trace storage)
- The same memory hub (shared context)
## Generate the Execution Plan

```python
# Planner agent has visibility into all tools
planner = PlannerAgent(
    name="planner",
    lm=lm,
    tools=[
        extract_contract_data, validate_compliance,
        create_record, generate_summary,
    ],
)

plan = planner.plan(
    "Process contract.pdf: extract data, validate compliance, "
    "create a system record, and generate a summary document"
)
print(plan.to_yaml())
```
Generated plan:

```yaml
goal: Process contract and create records
reasoning: |
  Sequential process: extract first, then validate, then create
  record, finally generate summary. Each step depends on previous.
steps:
  - id: extract
    agent: analyst
    tool: extract_contract_data
    description: Extract structured data from contract
    args:
      document_path: "contract.pdf"
  - id: validate
    agent: analyst
    tool: validate_compliance
    description: Check against compliance rules
    depends_on: [extract]
    args:
      contract_data: "${extract.result}"
  - id: record
    agent: documenter
    tool: create_record
    description: Create system record
    depends_on: [validate]
    args:
      contract_data: "${extract.result}"
      record_type: "contract"
  - id: summary
    agent: documenter
    tool: generate_summary
    description: Generate summary document
    depends_on: [record]
    args:
      contract_data: "${extract.result}"
      output_path: "/output/contract_summary.md"
```
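The `${extract.result}` strings are placeholders that the orchestrator substitutes with earlier step outputs before each tool call. A minimal resolver for this convention might look like the following (a hypothetical sketch, not AgentHelm's actual substitution code):

```python
import re

# Matches argument values of the exact form "${step_id.result}"
_PLACEHOLDER = re.compile(r"^\$\{(\w+)\.result\}$")


def resolve_args(args: dict, step_results: dict) -> dict:
    """Replace "${step_id.result}" placeholders with prior step outputs."""
    resolved = {}
    for key, value in args.items():
        match = _PLACEHOLDER.match(value) if isinstance(value, str) else None
        resolved[key] = step_results[match.group(1)] if match else value
    return resolved


step_results = {"extract": {"parties": ["Acme Corp", "Widget Inc"], "value": 150000}}
args = {"contract_data": "${extract.result}", "record_type": "contract"}
print(resolve_args(args, step_results))
```

Because `depends_on` guarantees `extract` completes before `record` runs, the lookup can never miss for a well-formed plan.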
## Review and Edit the Plan

Before execution, the plan can be reviewed and modified.

```python
# Save plan for review
plan_path = "/reviews/contract_plan.yaml"
with open(plan_path, "w") as f:
    f.write(plan.to_yaml())

# Manual review happens here...
# Reviewers can edit the YAML, add steps, or modify args

# Load the reviewed plan
from agenthelm import Plan

with open(plan_path) as f:
    reviewed_plan = Plan.from_yaml(f.read())

# Approve for execution
reviewed_plan.approved = True
```
In production, this review step integrates with your approval workflow: Slack notifications, PR-based reviews, or manual sign-off.
## Execute with Saga Rollback

```python
orchestrator = Orchestrator(
    registry=registry,
    enable_rollback=True,  # Saga pattern enabled
)

# Inside an async context (e.g. driven by asyncio.run)
result = await orchestrator.execute(reviewed_plan)
```
If `generate_summary` fails after `create_record` succeeds:

1. `generate_summary` is marked as FAILED
2. The Orchestrator triggers rollback
3. `delete_record` is called automatically (the compensating action for `create_record`)
4. The system returns to a consistent state
## Inspect Traces

After execution, full traceability:

```python
# Programmatic access
for event in result.events:
    print(f"{event.tool_name}: {event.execution_time:.3f}s, "
          f"${event.estimated_cost_usd:.4f}")

# Summary
print(f"Total cost: ${result.total_cost_usd:.4f}")
print(f"Total tokens: {result.token_usage.total_tokens}")
```

Via the CLI:

```shell
# List recent traces
agenthelm traces list -s /var/log/agenthelm/traces.db

# Filter by tool
agenthelm traces filter --tool create_record --status success

# Export for audit
agenthelm traces export -o audit_report.json -f json
agenthelm traces export -o audit_report.csv -f csv
```
## Memory for Context Continuity

Store and retrieve context across sessions:

```python
from agenthelm.memory import MemoryContext

async with MemoryContext(memory, session_id="contract-batch-2025-12-26") as ctx:
    # Store processing context
    await ctx.set("last_processed_contract", "contract.pdf")
    await ctx.set("batch_status", {"processed": 1, "failed": 0})

    # Store semantic memory for future retrieval
    await ctx.store_memory(
        "Contract with Acme Corp processed successfully. Value: $150k, 12 months.",
        metadata={"contract_id": "contract.pdf", "status": "complete"},
    )

# Later, in another session
async with MemoryContext(memory, session_id="new-session") as ctx:
    # Recall relevant past contracts
    results = await ctx.recall("Acme Corp contracts", top_k=5)
    for r in results:
        print(f"Score: {r.score:.2f} - {r.text}")
```
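Under the hood, `recall` is a nearest-neighbour search: the query and stored memories are embedded as vectors and ranked by similarity. A dependency-free sketch of just the scoring step, with made-up 3-dimensional "embeddings" (in AgentHelm, FastEmbed produces the vectors and Qdrant does the search):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def recall(query_vec, memories, top_k=5):
    """memories: list of (text, vector); returns top_k (score, text) pairs."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in memories]
    return sorted(scored, reverse=True)[:top_k]


memories = [
    ("Acme Corp contract processed, $150k", [0.9, 0.1, 0.0]),
    ("Team lunch scheduled for Friday", [0.0, 0.2, 0.9]),
]
for score, text in recall([1.0, 0.0, 0.0], memories, top_k=1):
    print(f"Score: {score:.2f} - {text}")  # Score: 0.99 - Acme Corp contract ...
```

Real embeddings have hundreds of dimensions, and vector databases avoid the brute-force scan with approximate-nearest-neighbour indexes, but the scoring idea is the same.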
## Jaeger Integration

With OpenTelemetry configured, view traces in Jaeger:

```shell
# Start Jaeger
docker run -d -p 16686:16686 -p 4317:4317 jaegertracing/all-in-one

# Run your agent workflow
python process_contracts.py

# Open Jaeger UI
open http://localhost:16686
```
Each agent execution appears as a span with:
- Tool name and arguments
- Execution duration
- Success/failure status
- Cost attribution
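Conceptually, a span is just a named, timed scope with attributes and a status attached. A toy stand-in using only the standard library (the real OpenTelemetry API adds trace/span IDs, context propagation, and exporters on top of this):

```python
import time
from contextlib import contextmanager

SPANS = []  # stand-in for an exporter's buffer


@contextmanager
def span(name, **attributes):
    """Record name, attributes, duration, and success/failure for a scope."""
    start = time.perf_counter()
    record = {"name": name, "attributes": attributes, "status": "ok"}
    try:
        yield record
    except Exception:
        record["status"] = "error"
        raise
    finally:
        # Runs whether the body succeeded or raised
        record["duration_s"] = time.perf_counter() - start
        SPANS.append(record)


with span("create_record", record_type="contract"):
    time.sleep(0.01)  # simulated tool work

print(SPANS[0]["name"], SPANS[0]["status"])  # create_record ok
```

The `finally` clause is what guarantees the duration and status are captured even when a tool raises, mirroring the failure spans you see in Jaeger.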
## Key Enterprise Benefits
| Requirement | AgentHelm Feature |
|---|---|
| Audit trail | ExecutionTracer + SQLite storage |
| Distributed tracing | OpenTelemetry + Jaeger |
| Atomic operations | Saga pattern with compensating tools |
| Session persistence | MemoryHub with Redis/SQLite |
| Context search | Semantic memory with Qdrant |
| Cost control | Built-in pricing for 20+ LLM providers |
| Human review | Plan YAML export/import workflow |
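The cost arithmetic behind that last pricing row reduces to token counts times per-million-token rates. A sketch with hypothetical numbers (the prices below are invented for illustration; AgentHelm's actual pricing table and provider rates are not shown here):

```python
# Hypothetical USD prices per million tokens; real provider rates differ.
PRICING = {
    "mistral-large-latest": {"input": 2.00, "output": 6.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD = tokens * per-million-token price, per direction."""
    prices = PRICING[model]
    return (
        input_tokens * prices["input"] + output_tokens * prices["output"]
    ) / 1_000_000


cost = estimate_cost("mistral-large-latest", 12_000, 3_000)
print(f"${cost:.4f}")  # $0.0420
```

Tracking input and output tokens separately matters because output tokens are typically priced several times higher than input tokens.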
## Conclusion
Enterprise AI agent deployments require more than just "an agent that works." They require:
- Traceability for compliance and debugging
- Atomicity for data consistency
- Memory persistence for context continuity
AgentHelm v0.3.0 provides these as first-class features, not afterthoughts.
Documentation: hadywalied.github.io/agenthelm
GitHub: github.com/hadywalied/agenthelm