We Mapped 500 AI Agent Infrastructure Projects

#ai #mcp #agents #opensource

Early conversations about agents focused on prompts, tools, and demos. That was useful, but production systems need more than an agent loop wrapped around a few APIs.

Once agents touch real files, browsers, APIs, credentials, workflows, and customer data, the infrastructure layer becomes the product boundary. Teams need to know where code runs, what tools are available, what state is persisted, how failures are observed, and how risky actions are contained.

The 10-layer agent infrastructure stack

We expanded Awesome Agent Runtime into a curated map of 500 projects across 10 infrastructure categories:

Agent runtime
Execution sandbox
Browser automation
Tool protocol
App integrations
Memory/context
Safety/evals
Model gateway
Observability
Deployment/compute

Why this map exists

This is not a generic AI tools directory.

The goal is to track infrastructure that helps builders run agents in real products:

control state and workflows
run code safely
automate browsers
connect tools and apps
store memory and context
evaluate behavior
route models
observe failures
deploy and scale workloads

What we learned from the first 500 projects

Several patterns stood out during curation.

Agent frameworks are maturing quickly, but runtime safety is still uneven. Builders are increasingly asking not only "can the agent call a tool?" but "where does that tool run, what can it access, and how do we recover when it behaves badly?"
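One way to make those three questions concrete is to attach a small policy to each tool and check it before every call. The following is a minimal sketch in Python, not any particular framework's API: `ToolPolicy`, `PolicyViolation`, `guarded_call`, and the `"sandbox"` label are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ToolPolicy:
    """Hypothetical per-tool policy: where it runs and what it may touch."""
    runs_in: str                          # illustrative label, e.g. "sandbox" or "host"
    allowed_paths: tuple[str, ...] = ()   # path prefixes the tool may access
    max_retries: int = 0                  # bounded retries after a failure


class PolicyViolation(Exception):
    """Raised when a call falls outside the tool's policy."""


def guarded_call(tool: Callable[[str], Any], policy: ToolPolicy, path: str) -> Any:
    # "Where does the tool run?": this sketch simply refuses anything
    # that is not declared as sandboxed.
    if policy.runs_in != "sandbox":
        raise PolicyViolation(f"tool must run in a sandbox, not {policy.runs_in!r}")

    # "What can it access?": a naive prefix check. A real system would
    # normalize paths first to defeat tricks like "../".
    if not any(path.startswith(prefix) for prefix in policy.allowed_paths):
        raise PolicyViolation(f"{path!r} is outside the allowed paths")

    # "How do we recover?": retry a bounded number of times, then re-raise.
    attempts = policy.max_retries + 1
    for attempt in range(attempts):
        try:
            return tool(path)
        except Exception:
            if attempt == attempts - 1:
                raise
```

The point of the sketch is only that the policy lives next to the tool, so the runtime can answer these questions before the model's request is executed rather than after.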

Execution sandboxes and browser automation are becoming first-class agent primitives. If an agent can write code, open pages, call CLIs, or operate SaaS workflows, isolation and repeatability matter as much as model quality.

MCP and other tool protocols are giving the ecosystem a shared language. The protocol layer is becoming the place where agents, tools, permissions, and app integrations start to meet.

Observability is moving beyond prompt logs. Production teams need traces, evals, cost visibility, tool-call history, runtime events, and failure analysis.
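As a small illustration of tool-call history and failure analysis, the sketch below wraps each tool call and records its name, outcome, and duration. It is a standard-library toy, not the API of any tracing product; `Trace` and `ToolCallEvent` are hypothetical names, and a production setup would more likely export spans to a tracing backend.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class ToolCallEvent:
    tool: str
    ok: bool
    duration_s: float
    error: Optional[str] = None   # repr of the exception when the call failed


@dataclass
class Trace:
    events: list[ToolCallEvent] = field(default_factory=list)

    def call(self, name: str, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        """Run fn, record what happened, and let any exception propagate."""
        start = time.perf_counter()
        try:
            result = fn(*args, **kwargs)
        except Exception as exc:
            elapsed = time.perf_counter() - start
            self.events.append(ToolCallEvent(name, False, elapsed, repr(exc)))
            raise
        self.events.append(ToolCallEvent(name, True, time.perf_counter() - start))
        return result

    def failures(self) -> list[ToolCallEvent]:
        """Return only the failed calls, in the order they happened."""
        return [event for event in self.events if not event.ok]
```

A usage pattern would be to route every tool invocation through `trace.call("search", search_fn, query)` and inspect `trace.failures()` when a run goes wrong, which is already more than a prompt log can tell you.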

Deployment is also splitting into separate concerns. Model inference is only one part of the stack; sandbox execution, tool infrastructure, workers, browsers, and integration runtimes all need their own operating model.