DEV Community

Saul Fernandez
Agentic Platform Engineering: How to Build an Agent Infrastructure That Scales From Your Laptop to the Enterprise

Starting local is not thinking small. It's thinking strategically.


Stripe recently published a series on how they build "Minions" — fully autonomous coding agents that take a task and complete it end-to-end without human intervention. One agent reads the codebase, another writes the implementation, another runs the tests, another reviews the result. All in parallel. All coordinated. All producing production-ready code at scale.

Reading that, most engineers react in one of two ways: either "that's years away for us," or they start thinking about what foundations would need to be in place to even begin moving in that direction.

This article is about the second reaction.

What Stripe describes is not a product you buy. It's a capability you build, incrementally, on top of solid infrastructure. And that infrastructure — the way agent configuration is stored, versioned, distributed, and composed — is what separates teams that can scale AI seriously from teams that are still copy-pasting prompts into chat windows.

I call this discipline Agentic Platform Engineering. And its first principle is simple: treat agent intelligence as infrastructure, not as improvisation.


The Problem at Any Scale

Whether you're a solo engineer with 10 repositories or a platform team with 200, you hit the same structural problem when you try to work seriously with AI agents.

At the individual level, it looks like this: you spend time configuring your agent well — crafting instructions, building reusable procedures, defining what it should and shouldn't do depending on where you're working. Then you switch machines, reinstall a tool, or try a different agent. Your configuration is gone. You start over.

At the team level, it's worse: every engineer has their own private, undocumented, non-transferable agent setup. There's no shared knowledge about how agents should behave in your codebase, no consistency in what they can and cannot do, no way to onboard a new member into an agentic workflow. The AI capability of your team cannot scale because it lives in individual heads and local files.

And at the enterprise level — the level where Stripe operates — you cannot even begin to think about autonomous agents running pipelines if you haven't solved the fundamental question: where does agent configuration live, who owns it, and how does it reach every context where it's needed?

Most engineers interact with AI agents in one of two ways:

  1. Ad-hoc: No configuration, just prompting. Works for one-off tasks, but the agent has no memory of your stack, your conventions, or your constraints.

  2. Single-file config: One big AGENTS.md or CLAUDE.md at the root of a repo. Better, but it doesn't scale — the same instructions get injected everywhere, regardless of whether they're relevant, and they live in one repo while you work across twenty.

Neither approach is infrastructure. Neither can scale. Neither gives you what you'd need to move toward autonomous multi-agent systems.

The question is: what would infrastructure actually look like?


The Architecture: Three Repos, One System

The solution I landed on separates concerns into three distinct repositories, each with a single, well-defined responsibility. This is not a personal productivity hack. It's a deliberate architectural pattern that mirrors how platform engineering works for any shared infrastructure — you version it, you document what exists, and you decouple the interface from the implementation.

agent-library/    ← The brain (tool-agnostic intelligence)
agent-setup/      ← The bridge (tool-specific deployment)
resource-catalog/ ← The map (inventory of everything)

Let me explain each one.


1. agent-library — The Brain

This is the single source of truth for everything the agent knows and how it behaves. It contains no tool-specific configuration. If tomorrow I switch from one AI coding tool to another, this repo stays untouched.

The structure:

agent-library/
├── library.yaml          ← Central manifest
├── SKILLS-INDEX.md       ← Human-readable index of all skills
├── layers/               ← Context-specific instructions
│   ├── global.md         ← Identity, principles, environment map
│   ├── repos.md          ← Shared git conventions
│   ├── work/             ← Work domain
│   │   ├── domain.md     ← Conservative rules, safety constraints
│   │   ├── terraform.md  ← Terraform-specific workflow
│   │   ├── gitops.md     ← GitOps/FluxCD rules
│   │   └── code.md       ← Code conventions
│   └── personal/         ← Personal domain
│       ├── domain.md     ← Experimental mode, fast iteration
│       ├── fintech-app.md
│       └── infra-gcp.md
├── skills/               ← Reusable procedures
│   ├── global/           ← Available everywhere
│   └── work/             ← Domain-specific
├── rules/                ← Always-on constraints
└── prompts/              ← Reusable prompt templates

The key concept is layers. Each layer is a Markdown file that becomes an AGENTS.md (or equivalent) for a specific directory. They are designed to be cumulative — the agent loads them from parent to child, each adding context on top of the previous one.

When the agent is working in ~/repos/work/terraform/, it loads:

~/.agent/AGENTS.md               → global.md         (who I am, core principles)
~/repos/AGENTS.md                → repos.md          (git conventions)
~/repos/work/AGENTS.md           → work/domain.md    (conservative, safety-first)
~/repos/work/terraform/AGENTS.md → work/terraform.md (terraform workflow)

Each layer is laser-focused. global.md doesn't know about Terraform. work/terraform.md doesn't know about React. The agent assembles its context from the bottom up, with exactly the information it needs for where it currently is.
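The cumulative, parent-to-child loading can be made concrete with a small sketch. This is my own illustration, not part of the author's tooling: it walks from a root directory down to the working directory and collects every AGENTS.md along the way, parent first. (The global layer at ~/.agent/ is not an ancestor of the working directory, so a real loader would prepend it separately.)

```python
from pathlib import Path

def collect_layers(cwd: str, root: str) -> list[Path]:
    """Return AGENTS.md paths from root down to cwd, parent first,
    so each layer adds context on top of the previous one."""
    cwd_p = Path(cwd).resolve()
    root_p = Path(root).resolve()
    chain = []
    d = cwd_p
    while True:
        chain.append(d)
        if d == root_p:
            break
        if d.parent == d:  # reached the filesystem root without finding root_p
            raise ValueError(f"{cwd} is not under {root}")
        d = d.parent
    chain.reverse()  # parent-first ordering
    return [d / "AGENTS.md" for d in chain if (d / "AGENTS.md").exists()]
```

For ~/repos/work/terraform/ this yields repos.md, then work/domain.md, then work/terraform.md, in exactly the order shown above.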

The library.yaml manifest is the glue. It declares every layer, skill, rule, and prompt — what it is, where it lives in the repo, and where it should be deployed on the filesystem:

layers:
  - name: work-terraform
    description: "Terraform-specific rules for work domain"
    source: layers/work/terraform.md
    target: ~/repos/work/terraform/AGENTS.md
    scope: "~/repos/work/terraform/*"

skills:
  - name: terraform-plan
    description: "Terraform plan/apply workflow"
    source: skills/work/terraform-plan.md
    scope: "~/repos/work/terraform/*"

2. agent-setup — The Bridge

This repo is the adapter between the tool-agnostic brain and the specific AI agent tool I use today. If I switch tools next year, I replace only this repo. The brain stays the same.

Its core is a single setup.sh script that reads library.yaml and deploys everything via symlinks:

# Creates: ~/repos/work/terraform/AGENTS.md → agent-library/layers/work/terraform.md
# Creates: ~/.agent/skills/terraform-plan → agent-library/skills/work/terraform-plan.md
# ... and so on for every layer, skill, rule, and prompt

Why symlinks instead of copies?

Because when I edit a layer in agent-library, the change is immediately live everywhere. No re-deployment needed. setup.sh only needs to run again when I add a new file (a new symlink to create). For edits to existing files, the symlink already points to the right place.

The repo also contains tool-specific settings, keybindings, and extensions — things that only make sense for a specific tool.
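The symlink step itself is small. Here is a minimal sketch of that deployment logic, with hypothetical paths in a scratch directory; a real setup.sh would derive the source/target pairs from library.yaml (parsing YAML in pure shell is awkward, so a yq or Python helper is a common choice):

```shell
#!/usr/bin/env sh
set -eu

# deploy_link SOURCE TARGET: symlink TARGET -> SOURCE, creating parent dirs.
deploy_link() {
  src="$1"; dst="$2"
  mkdir -p "$(dirname "$dst")"
  # -s symbolic, -f replace an existing link, -n don't descend into one
  ln -sfn "$src" "$dst"
}

# Demo with hypothetical paths in a scratch directory.
scratch="$(mktemp -d)"
mkdir -p "$scratch/agent-library/layers/work"
echo "terraform layer" > "$scratch/agent-library/layers/work/terraform.md"

deploy_link "$scratch/agent-library/layers/work/terraform.md" \
            "$scratch/repos/work/terraform/AGENTS.md"
```

Because the target is a link, not a copy, editing the source file in agent-library changes what the agent reads with no further sync step.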


3. resource-catalog — The Map

This is the index of everything that exists in my engineering ecosystem. It follows the Backstage catalog format — the same standard used in enterprise engineering platforms.

# components/agent-library.yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: agent-library
  description: "Tool-agnostic agent configuration library"
  annotations:
    github.com/project-slug: your-username/agent-library
spec:
  type: ai-agent-config
  lifecycle: production
  owner: your-name
  system: personal-ai-agent-platform

Every repository I own — infrastructure, applications, documentation, and yes, the agent-library itself — is registered here with its type, owner, system, and source location.

The catalog is not where agent logic lives. It's a map, not an engine. The distinction matters: the catalog tells you that agent-library exists and what it is. The agent-library itself contains what the agent knows. Mixing these two concerns would be like embedding source code inside your package.json.


Skills: The Reusable Procedure Library

Beyond layers (which define how the agent behaves), the library contains skills — reusable, step-by-step procedures for common tasks.

A skill looks like this:

# Skill: Terraform Plan

Use this skill when working with Terraform in the work domain.

## Steps
1. Check context — confirm directory and workspace
2. terraform fmt -recursive
3. terraform validate
4. terraform plan -out=tfplan
5. Review plan — summarize what will change
6. Highlight risks — flag any destroys or critical changes
7. Wait for confirmation — never apply without explicit approval
...

## Red Flags (Stop and Ask)
- Any resource marked for destruction
- Changes to IAM policies
- Changes to production databases

Skills are invoked explicitly: /skill:terraform-plan. They're never loaded automatically — that's intentional. The agent doesn't pre-load every procedure it might need. It loads the skill when the task calls for it.
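A toy dispatcher for that invocation style might look like the following. This is purely illustrative; real agent tools implement their own command handling, and the skills directory layout here is an assumption:

```python
from pathlib import Path

def resolve_skill(command: str, skills_dir: str) -> Path:
    """Map an explicit '/skill:<name>' command to its procedure file."""
    prefix = "/skill:"
    if not command.startswith(prefix):
        raise ValueError(f"not a skill invocation: {command!r}")
    name = command[len(prefix):]
    path = Path(skills_dir) / f"{name}.md"
    if not path.is_file():
        raise FileNotFoundError(f"unknown skill: {name}")
    return path
```

The point of the explicit prefix is that nothing is loaded until the user (or the agent, deliberately) asks for it.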


The Token Efficiency Design

This is where the architecture earns its keep.

A naive approach would be: put everything in one big AGENTS.md. All the rules, all the skills, all the context. The agent always knows everything.

The problem: that file becomes enormous. Every single message you send to the agent carries the full weight of that context as tokens. You're paying for Terraform rules when you're writing a Python script. You're loading GitOps procedures when you're working on documentation.

The architecture solves this at three levels:

Level 1: Layers are scoped by directory. The Terraform layer only activates when you're in ~/repos/work/terraform/. Not in your React app. Not in your docs.

Level 2: Each layer only declares what's relevant at its level. global.md lists 6 universal skills (debug, code-review, refactor, test, documentation, git-workflow). It does not list terraform-plan or catalog-management — those are irrelevant in most contexts. work/terraform.md lists terraform-plan and nothing else, because that's the only skill you need there.

Level 3: Meta-skills are scoped to their home. create-skill (the skill that creates new skills) is only available inside agent-library/. catalog-management is only available inside resource-catalog/. Why would the agent know how to modify the agent library while it's working on your fintech app?

The result: when the agent is in work/terraform/, its active context is exactly:

  • 6 global skills
  • 1 domain skill (infrastructure-review)
  • 1 directory skill (terraform-plan)

That's it. No noise.
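The scoping rule can be sketched as glob matching against the scope field from library.yaml. This is a simplification of my own (real scope resolution may differ), using only the stdlib:

```python
from fnmatch import fnmatch
from pathlib import Path

def skills_in_scope(cwd: str, skills: list[dict], home: str) -> list[str]:
    """Return names of skills whose scope glob matches the working dir."""
    cwd_norm = str(Path(cwd))
    active = []
    for skill in skills:
        scope = skill.get("scope", "*")          # no scope means global
        pattern = scope.replace("~", home)       # expand the manifest's ~ shorthand
        # Match the directory itself or anything beneath it
        if fnmatch(cwd_norm, pattern) or fnmatch(cwd_norm + "/", pattern):
            active.append(skill["name"])
    return active
```

Run against a manifest with global, terraform, and catalog skills from ~/repos/work/terraform/, only the global and terraform entries survive; the catalog skill never enters the context.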


Disaster Recovery in Under 5 Minutes

The entire system is built for one guarantee: if everything breaks, you can rebuild from scratch in under 5 minutes.

# Step 1: Clone the three repos
mkdir -p ~/repos && cd ~/repos
git clone git@github.com:your-username/agent-library.git
git clone git@github.com:your-username/agent-setup.git
git clone git@github.com:your-username/resource-catalog.git

# Step 2: Deploy
cd agent-setup && bash setup.sh

# Step 3: Verify
cd ~/repos/work/terraform && your-agent "What context am I in?"
# → Agent responds with terraform-specific context ✓

Done. The agent has its full identity, all its domain knowledge, all its skills, and all the right rules for every directory it works in.

This is only possible because:

  1. Everything is in git. No local-only configuration that can be lost.
  2. The brain is separate from the tool. Reinstalling the tool doesn't lose the intelligence.
  3. The manifest declares everything. library.yaml is the complete description of the system — setup.sh just executes it.
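Because the manifest is a complete description of the system, it also enables a consistency check: verify that every deployed target is a symlink pointing back at its declared source. A minimal sketch, assuming entries shaped like the library.yaml excerpt earlier with paths already absolute (YAML parsing is left out to keep it stdlib-only):

```python
import os
from pathlib import Path

def check_deployment(entries: list[dict]) -> list[str]:
    """Return problems found: targets that are missing, are not symlinks,
    or do not point at their declared source."""
    problems = []
    for entry in entries:
        target = Path(os.path.expanduser(entry["target"]))
        source = Path(os.path.expanduser(entry["source"])).resolve()
        if not target.is_symlink():
            problems.append(f"{entry['name']}: {target} is not a symlink")
        elif target.resolve() != source:
            problems.append(f"{entry['name']}: {target} points elsewhere")
    return problems
```

An empty result means the deployed state matches the manifest; anything else tells you exactly which entry drifted.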

The Mental Model: A Package Manager for Agent Intelligence

Think of it like a software package manager, but for how agents think and behave.

library.yaml is your package.json — it declares everything that should exist and what it does. setup.sh is your npm install — it takes the manifest and wires everything up on any machine. The layers are your source modules — composable, scoped, loaded on demand. The skills are your function library — procedures you invoke when you need them, not before.

The difference from a traditional package manager: the "packages" here are not code. They're instructions for how to reason in a given context.

This matters beyond the individual level. If you're a platform team and you want every engineer to work with agents consistently — the same safety rules around production, the same conventions for code review, the same escalation procedures — you publish to agent-library. Engineers run setup.sh. Done. The intelligence is distributed, versioned, and auditable. Just like any other shared infrastructure.

This is the foundation Stripe's approach requires. Before you can run autonomous agents in parallel on real codebases, you need to have solved: where do agents get their instructions? Who updates them? How do changes propagate? How do you ensure an agent working on your payment service doesn't behave like an agent working on an internal tool?

The architecture described in this article is an answer to those questions — starting from a single developer setup, but designed to scale.


What This Looks Like Day-to-Day

When I open a terminal in ~/repos/work/terraform/:
The agent already knows it's in conservative mode, that any terraform apply needs a reviewed plan and explicit confirmation, that pre-commit hooks must run before any commit, and exactly which skill to use for the full workflow.

When I open a terminal in ~/repos/personal/fintech-app/:
The agent knows it can move fast, that this is a Python financial analysis platform, that API keys live in environment variables and never in code, and that tests run before committing.

When I want to add a new skill:
I run /skill:create-skill. The agent walks me through creating the file, registering it in library.yaml with the right scope, updating SKILLS-INDEX.md, and committing. The skill is live the moment it's committed — no redeployment needed (symlinks).

When a colleague joins my team or I get a new machine:
Three git clones and one bash script. Same agent, same behavior, same context everywhere.


The Design Decisions That Matter

Why not one big repo? Separation of concerns. The brain shouldn't depend on the tool. The catalog shouldn't contain executable logic. Mix them and you create coupling that makes the whole system fragile.

Why Backstage format for the catalog? It's an industry standard built exactly for this — describing what exists, who owns it, and how it relates to other things. It's human-readable, tool-agnostic, and designed to scale.

Why symlinks instead of copies? Real-time updates without redeployment. Edit terraform.md in the library, it's immediately live in ~/repos/work/terraform/. No sync step, no drift between source and deployed config.

Why scope skills to directories instead of loading all of them? Token efficiency and cognitive clarity. An agent with 30 loaded skills is an agent that has to decide which one applies. An agent with 2 loaded skills knows exactly what to use.


What I Haven't Built Yet

This is an honest article, so here's what's still on the roadmap — and what brings the architecture closer to the Stripe model:

  • Extensions (next): Custom tools for things like catalog lookup, repo navigation, and library sync directly from the terminal
  • MCP integration: Model Context Protocol servers for deeper, structured tool integrations — giving agents access to live data sources, not just static instructions
  • Multi-agent orchestration: The ability to spawn specialized agents in parallel for complex tasks — one reads the codebase, another implements, another validates. This is the direction Stripe's Minions move in, and this architecture is specifically designed so the layer system can feed each specialized agent exactly the context it needs, nothing more
  • Centralized distribution: Moving from local symlinks to a pull-based model where any machine or CI environment can fetch the latest agent configuration from agent-library automatically

Each of these is a step up the autonomy ladder. But none of them are possible without the foundation: versioned, scoped, composable agent configuration that you control.


Conclusion

Stripe's Minions are impressive. But they're not magic — they're the result of building the right infrastructure first.

The architecture described here — three repos, clear separation of concerns, a manifest-driven deployment, and scoped context loading — is that infrastructure, starting at the smallest possible scale. One developer, one machine, three git repositories.

The local setup is not the destination. It's the proof of concept for a pattern that scales: agent intelligence is configuration, configuration belongs in git, and git belongs in a system where it can be versioned, distributed, and composed.

Start local. Think at scale. Build the foundation that makes the next step possible.

That's Agentic Platform Engineering.
