TL;DR
You can stop babysitting AI agents by implementing three key systems: guardrails (hard constraints to prevent major failures), observability (detailed logs and metrics for visibility), and checkpoints (automatic pauses for human verification). With these in place, agents can run autonomously for hours, not just minutes. Tools like Apidog let you define strict API contracts, so your API layer acts as a safety net your agents can’t bypass.
Introduction
Last week I saw a developer spend 4 hours supervising an AI agent that was supposed to save time. Every few minutes, he’d step in, fix mistakes, and restart the process. By the end, he had done more manual work than if he’d written the code from scratch.
This is the babysitting problem—the main reason AI agents often fail to deliver value. The models are capable, but without the right setup, teams get stuck in constant supervision mode.
The root issue: most AI agent workflows treat LLMs like junior devs who need hand-holding. But LLMs are more like fast, unpredictable interns—they’ll confidently make mistakes unless you set clear boundaries.
💡 Tip: If you’re building APIs or working with AI agents that call APIs, Apidog helps you enforce those boundaries. By defining exact request/response schemas, you build contracts agents can’t accidentally break—giving them a map instead of letting them wander.
Define API contracts your AI agents can follow
By the end of this guide, you’ll have:
- A mental model for agent autonomy
- Concrete patterns for guardrails, observability, and checkpoints
- Ready-to-use code examples
- A checklist to assess if an agent is ready for unsupervised execution
Why Agents Need Constant Supervision
AI agents fail in predictable ways. Knowing these failure modes helps you prevent them.
Failure mode 1: Scope creep
Request: “Add authentication to the API endpoint.”
Agent adds authentication, then rate limiting, then refactors the database, then deletes what it thinks are unused files. Why? No one told it when to stop. LLMs lack an innate sense of “done.”
Failure mode 2: Wrong abstractions
Ask an agent to “improve error handling” and it might add try-catch blocks everywhere—technically correct, practically useless. The agent follows literal instructions, missing the real intent.
Failure mode 3: Cascading failures
A small mistake in step 1 propagates through every subsequent decision. What starts as a typo becomes a broken API, failed tests, and hours of debugging. Because the agent doesn’t verify its work at each step, the failure goes unnoticed until the end.
Failure mode 4: Resource exhaustion
Without constraints, agents can loop forever—retrying APIs, spawning sub-agents, or generating endless code until you hit a quota or billing limit.
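The fix for this failure mode is mechanical, not prompt-based: cap retries explicitly in code instead of trusting the agent to stop. A minimal sketch (the function name is hypothetical, not from any specific framework):

```python
def call_with_retry_cap(fn, max_attempts: int = 3):
    """Run fn(), retrying on failure at most max_attempts times.

    Without a cap like this, an agent that keeps "trying again"
    can loop until it exhausts a quota or billing limit.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            return fn()
        except Exception as e:  # in practice, catch the specific API error
            last_error = e
    raise RuntimeError(f"Gave up after {max_attempts} attempts: {last_error}")
```

The key design choice is that the limit lives outside the agent's control: the loop ends whether or not the model "decides" to stop.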
The Autonomy Framework: Guardrails, Observability, Checkpoints
Solve these problems with three layers—a pyramid:
- Guardrails (bottom, prevention)
- Observability (middle, detection)
- Checkpoints (top, human-in-the-loop recovery)
Layer 1: Guardrails (prevention)
Guardrails are hard rules your agent cannot break, enforced by code.
Hard constraints via code:
```python
# Don't just tell the agent what not to do. Enforce it.
from pathlib import Path

ALLOWED_DIRECTORIES = {"src", "tests", "docs"}

def validate_file_path(path: str) -> bool:
    """Agent cannot write outside allowed directories."""
    abs_path = Path(path).resolve()
    return any(
        # is_relative_to avoids prefix false-positives like "src_old" matching "src"
        abs_path.is_relative_to(Path(d).resolve())
        for d in ALLOWED_DIRECTORIES
    )

def agent_write_file(path: str, content: str):
    if not validate_file_path(path):
        raise ValueError(f"Cannot write to {path}: outside allowed directories")
    with open(path, "w") as f:
        f.write(content)
```
API schema constraints:
When agents call APIs, use schemas to reject malformed requests. Apidog enforces these contracts.
```typescript
// apidog-schema.ts
import Ajv from 'ajv' // JSON Schema validator (import was missing from the original snippet)

// Note: the 'email' format check requires the ajv-formats plugin
const ajv = new Ajv()

export const CreateUserSchema = {
  type: 'object',
  required: ['email', 'name'],
  properties: {
    email: { type: 'string', format: 'email' },
    name: { type: 'string', minLength: 1, maxLength: 100 },
    role: { type: 'string', enum: ['user', 'admin', 'guest'] }
  },
  additionalProperties: false
}

// Validate before calling the API
function validateRequest(schema: object, data: unknown): void {
  const valid = ajv.validate(schema, data)
  if (!valid) {
    throw new Error(`Invalid request: ${JSON.stringify(ajv.errors)}`)
  }
}
```
Budget constraints:
```python
import time
from dataclasses import dataclass

@dataclass
class AgentBudget:
    max_steps: int = 50
    max_tokens: int = 100000
    max_time_seconds: int = 600
    max_api_calls: int = 100

class BudgetEnforcer:
    def __init__(self, budget: AgentBudget):
        self.budget = budget
        self.start_time = time.time()
        self.steps = 0
        self.tokens_used = 0
        self.api_calls = 0

    def check(self) -> bool:
        elapsed = time.time() - self.start_time
        if self.steps >= self.budget.max_steps:
            raise RuntimeError(f"Step limit reached: {self.steps}")
        if self.tokens_used >= self.budget.max_tokens:
            raise RuntimeError(f"Token limit reached: {self.tokens_used}")
        if elapsed >= self.budget.max_time_seconds:
            raise RuntimeError(f"Time limit reached: {elapsed:.0f}s")
        if self.api_calls >= self.budget.max_api_calls:
            raise RuntimeError(f"API call limit reached: {self.api_calls}")
        return True

    def record_step(self, tokens: int, api_calls: int = 0):
        self.steps += 1
        self.tokens_used += tokens
        self.api_calls += api_calls
        self.check()
```
Layer 2: Observability (detection)
With long-running agents, you need visibility into what they’re doing.
Structured logging:
```python
import json
from datetime import datetime, timezone
from typing import Any

class AgentLogger:
    def __init__(self, log_file: str = "agent_trace.jsonl"):
        self.log_file = log_file
        self.entries = []

    def log(self, event: str, data: dict[str, Any] | None = None):
        entry = {
            # datetime.utcnow() is deprecated; use an aware UTC timestamp
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "data": data or {}
        }
        self.entries.append(entry)
        with open(self.log_file, "a") as f:
            f.write(json.dumps(entry) + "\n")

    def log_decision(self, decision: str, reasoning: str, confidence: float):
        self.log("decision", {
            "decision": decision,
            "reasoning": reasoning,
            "confidence": confidence
        })

    def log_action(self, action: str, params: dict, result: str):
        self.log("action", {
            "action": action,
            "params": params,
            "result": result[:200]  # truncate long outputs
        })

    def log_error(self, error: str, context: dict):
        self.log("error", {"error": error, "context": context})

# Usage in agent
logger = AgentLogger()
logger.log_decision(
    decision="Add rate limiting to API",
    reasoning="Current endpoint has no protection against abuse",
    confidence=0.85
)
logger.log_action(
    action="write_file",
    params={"path": "src/middleware/rate-limit.ts"},
    result="Successfully wrote 45 lines"
)
```
Metrics dashboard:
```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class AgentMetrics:
    actions_taken: Counter = field(default_factory=Counter)
    files_modified: list[str] = field(default_factory=list)
    api_calls: dict[str, int] = field(default_factory=dict)
    errors: list[str] = field(default_factory=list)
    decisions_by_confidence: dict[str, int] = field(default_factory=lambda: {
        "high (>0.9)": 0,
        "medium (0.7-0.9)": 0,
        "low (<0.7)": 0
    })

    def record_action(self, action: str):
        self.actions_taken[action] += 1

    def record_file_modification(self, path: str):
        if path not in self.files_modified:
            self.files_modified.append(path)

    def record_api_call(self, endpoint: str):
        self.api_calls[endpoint] = self.api_calls.get(endpoint, 0) + 1

    def record_error(self, error: str):
        self.errors.append(error)

    def record_decision(self, confidence: float):
        if confidence > 0.9:
            self.decisions_by_confidence["high (>0.9)"] += 1
        elif confidence >= 0.7:
            self.decisions_by_confidence["medium (0.7-0.9)"] += 1
        else:
            self.decisions_by_confidence["low (<0.7)"] += 1

    def summary(self) -> str:
        return f"""
Agent Metrics Summary
=====================
Actions: {dict(self.actions_taken)}
Files modified: {len(self.files_modified)}
API calls: {self.api_calls}
Errors: {len(self.errors)}
Decisions by confidence: {self.decisions_by_confidence}
"""
```
Layer 3: Checkpoints (recovery)
Checkpoints are automatic pauses for human verification—catch issues early.
Automatic checkpoints:
```python
from enum import Enum
from dataclasses import dataclass

class CheckpointTrigger(Enum):
    BEFORE_FILE_WRITE = "before_file_write"
    BEFORE_API_CALL = "before_api_call"
    BEFORE_GIT_COMMIT = "before_git_commit"
    BEFORE_DELETE = "before_delete"
    AFTER_N_STEPS = "after_n_steps"

@dataclass
class Checkpoint:
    trigger: CheckpointTrigger
    description: str
    data: dict
    requires_approval: bool = True

class CheckpointManager:
    def __init__(self, auto_approve: set[CheckpointTrigger] | None = None):
        self.auto_approve = auto_approve or set()
        self.pending: list[Checkpoint] = []

    def create_checkpoint(
        self,
        trigger: CheckpointTrigger,
        description: str,
        data: dict
    ) -> bool:
        """Returns True if the action may proceed immediately."""
        if trigger in self.auto_approve:
            return True
        self.pending.append(
            Checkpoint(trigger=trigger, description=description, data=data)
        )
        return False

    def approve(self, checkpoint_id: int) -> None:
        if 0 <= checkpoint_id < len(self.pending):
            self.pending.pop(checkpoint_id)

    def reject(self, checkpoint_id: int) -> None:
        raise RuntimeError(f"Checkpoint rejected: {self.pending[checkpoint_id]}")

# Usage (assumes an `agent` object with a pause() method)
checkpoints = CheckpointManager(
    auto_approve={CheckpointTrigger.BEFORE_FILE_WRITE}
)

if not checkpoints.create_checkpoint(
    trigger=CheckpointTrigger.BEFORE_DELETE,
    description="About to delete src/legacy/ directory",
    data={"path": "src/legacy/", "files": ["old_handler.ts", "deprecated.ts"]}
):
    # Wait for human approval
    agent.pause("Waiting for approval to delete files")
```
Building Autonomous Agents with Apidog
When AI agents interact with APIs, malformed requests are a major risk. Apidog lets you define strict API schemas and generate validated clients for your agents.
Setting up API contracts:
- Import or define your OpenAPI spec in Apidog
- Generate client code with built-in validation
- Provide the validated client to your agent (not raw HTTP)
```typescript
// Don't let the agent call APIs directly
const rawResponse = await fetch('/api/users', {
  method: 'POST',
  body: JSON.stringify(data) // No validation
})

// Use the validated client instead
import { UsersApi } from './generated/apidog-client'

const usersApi = new UsersApi()

// The agent can only send valid requests - the schema is enforced
const response = await usersApi.createUser({
  email: 'user@example.com',
  name: 'Test User',
  role: 'user' // Must match the enum
})
```
Now your API layer acts as a guardrail—the agent cannot send invalid data.
Generate validated API clients for your AI agents
Proven Patterns and Common Mistakes
Pattern 1: The Approval Sandwich
For risky operations, require approval both before and after the action.
```python
def risky_operation(agent, operation):
    # Pre-approval
    if not agent.checkpoint(f"About to: {operation.description}"):
        return "Cancelled by user"

    # Execute
    result = operation.execute()

    # Post-approval
    if not agent.checkpoint(f"Verify result of: {operation.description}"):
        operation.rollback()
        return "Rolled back by user"

    return result
```
Pattern 2: Confidence Thresholds
Don’t let agents act on low-confidence decisions.
```python
MIN_CONFIDENCE = 0.75

def agent_decide(options: list[dict]) -> dict:
    best = max(options, key=lambda x: x.get('confidence', 0))
    if best['confidence'] < MIN_CONFIDENCE:
        # Escalate to human
        return {
            'action': 'escalate',
            'reason': f"Best option has confidence {best['confidence']:.2f} < {MIN_CONFIDENCE}",
            'options': options
        }
    return best
```
Pattern 3: Idempotent Operations
Make agent actions repeatable and safe.
```python
import hashlib
import os

def idempotent_write(path: str, content: str) -> bool:
    """Only write if content changed. Uses the AgentLogger defined earlier."""
    content_hash = hashlib.sha256(content.encode()).hexdigest()
    existing_hash = None
    if os.path.exists(path):
        with open(path, "r") as f:
            existing_hash = hashlib.sha256(f.read().encode()).hexdigest()
    if content_hash == existing_hash:
        logger.log_action("write_file", {"path": path}, "Skipped - no changes")
        return False
    with open(path, "w") as f:
        f.write(content)
    logger.log_action("write_file", {"path": path}, f"Wrote {len(content)} bytes")
    return True
```
Common Mistakes to Avoid
- Trusting prompts as constraints: “Don’t delete files” in a prompt isn’t a real constraint. File permissions are.
- No rollback plan: Always use git or backups so you can undo mistakes.
- Ignoring confidence scores: Most LLMs provide or can be prompted for confidence. Low confidence? Pause and escalate.
- Over-monitoring: If you’re watching every step, it’s not automation—just manual work with extra steps.
- Under-specifying success: The agent needs a clear completion signal. “Fix the bug” is vague. “Fix the bug and all tests pass” is actionable.
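That last point is worth making concrete: "done" should be a set of machine-checkable predicates, not prose. A minimal sketch (the criteria names and lambdas are hypothetical placeholders):

```python
from typing import Callable

def is_done(criteria: dict[str, Callable[[], bool]]) -> tuple[bool, list[str]]:
    """Evaluate named success criteria; return (all_passed, failed_names)."""
    failed = [name for name, check in criteria.items() if not check()]
    return (not failed, failed)

# "Fix the bug and all tests pass" becomes two explicit checks the agent
# (or its harness) can evaluate before declaring completion:
done, failed = is_done({
    "tests_pass": lambda: True,            # e.g. run pytest, check exit code
    "reproducer_fixed": lambda: True,      # e.g. rerun the original failing case
})
```

If any predicate fails, the agent keeps working or escalates instead of stopping early.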
Alternatives and Comparisons
| Approach | Autonomy | Risk | Best for |
|---|---|---|---|
| Manual coding | None | Low | Complex, critical work |
| Pair programming with AI | Low | Low | Learning, exploration |
| Supervised agents | Medium | Medium | Routine tasks |
| Autonomous agents with guardrails | High | Controlled | Bulk operations, migrations |
| Fully autonomous agents | Very high | High | Trusted, well-tested workflows |
Most teams should aim for “autonomous with guardrails”—it delivers 80% of the time savings with just 10% of the risk.
Real-World Use Cases
Codebase migration:
A team migrated 200 API endpoints from REST to GraphQL using an autonomous agent. Guardrails blocked schema changes; checkpoints required approvals before deleting old endpoints. Migration finished in 3 days (not 3 weeks) with zero production incidents.
Documentation generation:
An agent auto-generates API docs from code. Guardrails restrict it to specific directories. Checkpoints pause before publishing, so the team reviews docs weekly instead of writing manually.
Test coverage:
An agent analyzes code and writes missing tests. Budget constraints prevent runaway generation. Confidence thresholds flag uncertain tests. Coverage improved from 60% to 85% in a month.
Wrapping Up
Key takeaways:
- AI agents fail in predictable ways: scope creep, wrong abstractions, cascading failures, resource exhaustion
- Three layers solve most issues: guardrails (prevention), observability (detection), checkpoints (recovery)
- Guardrails should be enforced in code—not prompts
- Observability = structured logs and metrics, not manual supervision
- Checkpoints let humans verify decisions at key moments
- API schemas from Apidog make your API layer a guardrail
Your next steps:
- Identify your most repetitive AI-assisted task
- Define guardrails: what must the agent never do?
- Add structured logging to monitor behavior
- Create checkpoints for high-risk operations
- Let it run for 30 minutes and review the logs
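To see how the three layers fit into a single run, here is a deliberately simplified standalone loop: a budget guardrail, a log callback, and an approval checkpoint, as stand-ins for the fuller BudgetEnforcer, AgentLogger, and CheckpointManager classes in this guide. All names here are illustrative.

```python
import time

def run_agent(steps, max_steps=50, max_seconds=600,
              needs_approval=lambda name: False,
              approve=lambda name: True,
              log=print):
    """Run a list of (name, fn) steps under all three layers.

    Guardrail: hard step and time budget.
    Observability: every action is logged.
    Checkpoint: risky steps pause for approval before executing.
    """
    start = time.time()
    for i, (name, fn) in enumerate(steps):
        if i >= max_steps or time.time() - start >= max_seconds:  # guardrail
            raise RuntimeError("Budget exhausted")
        if needs_approval(name) and not approve(name):            # checkpoint
            log(f"rejected: {name}")
            return "stopped"
        log(f"running: {name}")                                   # observability
        fn()
    return "done"
```

In a real system `approve` would block on human input; here it is just a callback so the control flow stays visible.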
The goal isn’t to eliminate humans, but to have them in the right part of the loop: making high-level decisions, not fixing low-level errors.
Build API guardrails for your AI agents—free
FAQ
What’s the difference between an AI agent and an AI assistant?
An assistant waits for your next instruction and responds. An agent takes a goal and autonomously plans and executes steps. Assistants need you at every step; agents run until a checkpoint or completion.
How do I know if my agent is ready to run autonomously?
Run it in supervised mode for 10 sessions. Track interventions. If you intervene less than twice per session, and only for minor clarifications, it’s ready. Frequent or major interventions? Add more guardrails.
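This rule of thumb is easy to compute from your intervention counts; a sketch using the thresholds above (10 sessions, fewer than 2 interventions each):

```python
def ready_for_autonomy(interventions_per_session: list[int]) -> bool:
    """Ready if there are 10+ supervised sessions, each with fewer than 2 interventions."""
    return (len(interventions_per_session) >= 10
            and all(n < 2 for n in interventions_per_session))
```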
What’s the biggest risk with autonomous agents?
Cascading failures the agent doesn’t detect. Small mistakes can snowball. Checkpoints break the chain, forcing verification.
Can I use these patterns with any LLM?
Yes. Guardrails, observability, and checkpoints are model-agnostic—use with Claude, GPT-4, Gemini, etc.
How much does observability slow down the agent?
Negligible—logging is fast. The main slowdown is from checkpoints waiting for human input. Use checkpoints only at high-risk moments for maximum autonomy.
What if the agent makes a decision I disagree with?
Checkpoints enable you to reject those decisions. The agent can roll back or try another approach. Also, update your instructions to reflect preferences.
Should I start with supervised or autonomous agents?
Start supervised. Add checkpoints to every significant action until trust builds. Gradually reduce checkpoints on low-risk operations.
How does Apidog specifically help with AI agents?
Apidog generates validated API clients from your schemas. Agents using these clients can’t send malformed requests—a whole class of errors is prevented before reaching your backend.