MCP Agent Mode: How Bifrost Auto-Executes Tool Calls

Bifrost Agent Mode removes manual tool-execution round trips by running approved MCP tool calls autonomously, with configurable per-tool auto-approval.

Every MCP gateway has to make a choice about tool calls: return them to the application for explicit handling, or execute them automatically and loop until the task is done. The first approach gives applications full control. The second is what Bifrost, the open-source AI and MCP gateway built in Go by Maxim AI, calls Agent Mode. When Agent Mode is on, Bifrost auto-executes approved tools, feeds results back to the model, and repeats until the agent finishes or a depth ceiling is hit. This post covers the execution model, configuration fields, and security patterns for running Agent Mode safely in production.

Defining Agent Mode in an MCP Gateway

Agent Mode is a gateway-level configuration that replaces the explicit, per-call execution model with an autonomous loop. Rather than returning each LLM tool call to the application for handling, the Bifrost AI gateway executes the approved tools itself, collects the results, appends them to the conversation, and sends the updated context back to the model for the next iteration.

The default behavior is worth understanding first. Without Agent Mode, every tool call from the model surfaces as a pending approval: the application calls the MCP gateway's tool execution endpoint, confirms the call, and submits the result. This stateless, explicit pattern is a clean audit trail, but it requires the application to orchestrate each step of a multi-tool workflow. Agent Mode moves that orchestration responsibility into Bifrost, shrinking the application surface for agent-heavy tasks.

The Case for Autonomous Tool Execution in Agent Systems

Consider a task that reads a directory, opens several files, queries a database, and then produces a summary. Without autonomous execution, the application must drive each of those tool calls independently: submit the LLM request, receive tool calls, execute them one by one, resubmit with results, and repeat. For a single-tool task that is manageable. For a twenty-step agent workflow, it introduces meaningful latency and code complexity.

Running the loop inside the MCP gateway eliminates those round trips. The model's tool calls are executed in place, and the conversation history is updated before the next LLM call. Anthropic, which published the Model Context Protocol specification in November 2024, has noted that developers are now building agents with access to hundreds or thousands of tools across many MCP servers. At that scale, per-call orchestration in application code is a bottleneck. Consistent tool access filtering applied at the gateway level is what keeps that autonomy bounded.

How the Agent Mode Loop Runs in Bifrost

With Agent Mode active, Bifrost follows a fixed execution sequence on every request:

The model returns one or more tool calls in its response.
Bifrost executes the tools on the auto-execute list immediately.
Completed results are added to the conversation and the next LLM call is made.
This cycle continues until the model issues no further tool calls, or max_agent_depth is reached.
Tools not on the auto-execute list are returned to the application as pending approvals.

Four behaviors shape how that loop runs in practice:

Max depth. max_agent_depth caps the number of iterations. Its default is 10, and it can be set anywhere from 1 to 50. Each iteration is one LLM call that produces tool calls. Once the cap is hit, the current response is returned as-is, including any pending tool calls that have not yet run.
Parallel execution. When a single model response contains more than one auto-executable tool, Bifrost runs all of them in parallel rather than sequentially, then collects results before starting the next iteration.
Per-tool timeout. Individual tool calls are bounded by tool_execution_timeout, which defaults to 30 seconds. A tool that exceeds that limit returns an error result; the loop continues with the error included in the conversation.
Non-streaming only. Because the loop requires a complete response before deciding whether to continue, Agent Mode is incompatible with streaming endpoints. Use the non-streaming chat and responses endpoints when Agent Mode is enabled.

When a response mixes auto-executable and non-auto-executable tools, Bifrost runs the auto-executable ones first, places a JSON summary of their results in the content field, leaves the pending tools in tool_calls, and sets finish_reason to stop. The application then reviews the pending approvals, executes or rejects them, and continues the conversation with those results.

Configuring Auto-Execution: The Two-List Model

Two fields on each MCP client config control which tools participate in Agent Mode:

Field	Purpose	Semantics
`tools_to_execute`	Whitelist of tools the model can call	`["*"]` = all, `[]` = none, `["a","b"]` = specific tools
`tools_to_auto_execute`	Subset that runs without explicit approval	Same semantics; must be a subset of `tools_to_execute`

The execute list is authoritative. If a tool appears in tools_to_auto_execute but not in tools_to_execute, it is ignored. No tools are auto-executed by default, so Agent Mode is fully opt-in at the tool level.

A common starting point is to whitelist all tools but restrict auto-execution to read operations:

{
  "name": "filesystem",
  "connection_type": "stdio",
  "stdio_config": {
    "command": "npx",
    "args": ["-y", "@anthropic/mcp-filesystem"]
  },
  "tools_to_execute": ["*"],
  "tools_to_auto_execute": ["read_file", "list_directory"]
}

These settings are configurable per client through the Bifrost web UI (a per-tool "Automatically execute tool" toggle), through the gateway API, or in config.json. Because configuration is per client, different MCP servers can carry different auto-execution policies in the same deployment. For team or user-level scoping, MCP tool filtering per virtual key layers on top of the client config to restrict which tools are reachable per consumer.

Which Tools Belong on the Auto-Execute List

The auto-execute list functions as a security boundary. Any tool placed on it will run without human review during the agent loop, so the decision of what to include has direct security implications.

Appropriate candidates for auto-execution:

Read-only file operations: read_file, list_directory
Search and retrieval: search, fetch_url
Any non-destructive query that has no side effects beyond returning data

Operations that should require explicit approval:

File writes: write_file, create_file
Deletions: delete_file, delete_record
Command execution: run_command, execute_script
Operations with external effects: sending email, initiating purchases, triggering deployments

Keeping write and delete operations off the auto-execute list means Bifrost returns them to the application as pending approvals rather than running them in the loop. For enterprise deployments in regulated environments, combining this pattern with immutable audit logs produces a tamper-evident record of every tool suggestion, approval decision, and execution result, which is relevant evidence for SOC 2, GDPR, and HIPAA audits.

Governance and Observability Inside the Agent Loop

Autonomous execution does not bypass Bifrost's governance layer. Every tool execution in the agent loop runs through the same controls applied to any other request.

Virtual keys scope access, budgets, and rate limits per consumer. Tool availability can be filtered per key, so what is auto-executable for one team or user can be restricted for another.
Tool filtering stacks client-level, request-level, and per-key filters before the loop runs. Only tools that pass all three layers are eligible for auto-execution.
OpenTelemetry tracing and native Prometheus metrics capture each iteration of the agent loop, making every intermediate tool call traceable rather than opaque inside a black box.

This governance stack is what makes Agent Mode practical for coding agents routed through the gateway. Teams can point Claude Code and other CLI agents at Bifrost, auto-execute read-only tools for speed, and hold mutating operations for explicit review, with a single consistent audit trail across the entire governance model. The same deployment can also expose its aggregated tools outward through a single Bifrost MCP server endpoint, handling discovery, governance, and execution in one place.

Agent Mode: Common Questions

Does Bifrost run tool calls automatically without any configuration?

No. By default, Bifrost returns all tool calls to the application and executes nothing on its own. Tools run autonomously only when they are explicitly listed in tools_to_auto_execute.

Is Agent Mode compatible with streaming?

No. The autonomous loop requires complete responses to determine whether another iteration is needed. Agent Mode only works with the non-streaming chat and responses endpoints.

What controls how many iterations the loop runs?

max_agent_depth sets the ceiling. The default is 10 iterations; the configurable range is 1 to 50. Reaching the limit returns the current response, which may still contain unresolved tool calls.

How are tools that require approval handled during the loop?

Bifrost completes all auto-executable tools first. The response is then returned with a content summary of what ran and the pending approvals in tool_calls with finish_reason set to stop. The application processes the pending approvals and continues the conversation.

Deploy Agent Mode with Bifrost

Autonomous MCP tool execution through Agent Mode lets AI agents handle multi-step workflows without per-call orchestration code, while configurable auto-approval lists, depth limits, and per-tool timeouts keep execution bounded and auditable. Paired with Bifrost's governance layer, auto-execution can run at production scale without sacrificing visibility or control. Bifrost is benchmarked at 11 microseconds of overhead at 5,000 RPS, making it a solid foundation for enterprises running mission-critical autonomous agent workloads.

Book a demo with the Bifrost team to explore how Agent Mode fits your deployment, or get started directly with the open-source project on GitHub and the Bifrost documentation.