Juan Torchia

Posted on Apr 24 • Originally published at juanchi.dev

Agent Vault: I tested the open-source credential proxy for agents — here's what it solves (and what it doesn't)

#english #produccion #railway #arquitectura

Why are we still thinking about agent credentials like they're app credentials? We've had .env, Vault, Secrets Manager for years — a whole industry built on the premise that a human decides when a credential gets used. With agents, that premise broke. And nobody's saying it out loud.

I saw the Agent Vault Show HN with 107 points on Tuesday morning. First reaction: "another vault." Second reaction, after reading the full README: "wait, there's a specific idea here worth digging into." Third reaction, after running it against my actual setup: "it solves something real, but not what I actually needed to solve."

I'm going slow because the topic deserves it.

The structural problem Agent Vault claims to solve

When I built CrabTrap last year, the problem was different — I wanted a judge between my agent and the final output to catch hallucinations in production. Credentials weren't the focus. I handled them with environment variables like any normal backend and called it a day.

After measuring the real costs of every design decision in my agent, I started paying closer attention to how often the agent was touching external resources. And that's where the discomfort showed up: the agent wasn't just using credentials — it was deciding when to use them based on prompt context.

That's fundamentally different from a traditional app.

In a traditional app:

User → Request → Handler → Credential → External API → Response

The flow is deterministic. The handler always calls the same API with the same credential at the same point in the code. You can audit that.

In an agent:

User → Prompt → Agent → [decides] → Credential A or B or C → External API N
                                   → [in a loop, with memory] → more APIs

The agent reasons about which tool to use. A Stripe credential can get triggered because the agent interpreted "handle the payment" as requiring a refund action you never explicitly asked for. That happened in one of my setups three months ago. It wasn't catastrophic, but it made me sit down and think hard.

My thesis: the credential problem in agents isn't about storage — it's about dynamic authorization. Agent Vault solves the first better than any open-source alternative I've tested, but it barely touches the second.

What Agent Vault is and how I installed it

Agent Vault is an HTTP proxy that sits between your agent and external APIs. Credentials live in the proxy, not in the agent process. The agent makes requests to localhost:8743 (or wherever you run it), the proxy intercepts them, injects the right credential, and forwards them on.
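To make the mechanism concrete, here is a minimal sketch of the injection step. It is not Agent Vault's actual code: the path-prefix routing (`/stripe/...`), the `UPSTREAMS` table, the placeholder secret, and the `buildUpstreamRequest` name are all assumptions made for illustration.

```javascript
// Hypothetical sketch of credential injection; not Agent Vault's real internals.
// Assumption: the agent calls http://localhost:8743/<namespace>/<upstream path>.
const UPSTREAMS = {
  stripe: { baseUrl: 'https://api.stripe.com', secret: 'sk_test_placeholder' },
};

function buildUpstreamRequest(incoming, upstreams = UPSTREAMS) {
  // Split "/stripe/v1/customers" into namespace "stripe" and the rest of the path.
  const [, namespace, ...rest] = incoming.path.split('/');
  if (!Object.prototype.hasOwnProperty.call(upstreams, namespace)) {
    throw new Error(`Unknown credential namespace: ${namespace}`);
  }
  const upstream = upstreams[namespace];

  // Drop any Authorization header the agent sent, then inject the real one.
  const headers = {};
  for (const [name, value] of Object.entries(incoming.headers || {})) {
    if (name.toLowerCase() !== 'authorization') headers[name] = value;
  }
  headers.Authorization = `Bearer ${upstream.secret}`;

  return {
    method: incoming.method,
    url: `${upstream.baseUrl}/${rest.join('/')}`,
    headers,
  };
}
```

The point of the sketch is the direction of trust: the agent only ever names a namespace, and the secret is attached on the proxy side after the agent's own headers have been discarded.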

The idea is related to what I was doing with parallel agents in Zed, where I started thinking about intermediation layers, but Agent Vault goes lower in the stack.

Installation on my Railway + Docker setup:

# Dockerfile.agent-vault
FROM node:20-alpine

WORKDIR /app

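# Install production dependencies from a local checkout of Agent Vault (open-source, MIT)
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

# Proxy source; the CMD below expects src/proxy.js to exist in the image
COPY src ./src

# The config only references process.env, so secrets are resolved at runtime and never baked into the image
COPY agent-vault.config.js ./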

EXPOSE 8743

CMD ["node", "src/proxy.js"]

// agent-vault.config.js — this file does NOT go to git
// Real credentials come from environment variables in Railway

module.exports = {
  port: 8743,
  credentials: {
    // Each agent tool has its own namespace
    stripe: {
      secret: process.env.STRIPE_SECRET_KEY,
      // Important: define which endpoints it can touch
      allowedPaths: ['/v1/customers', '/v1/payment_intents'],
      // Which HTTP methods are allowed for this namespace
      allowedMethods: ['GET', 'POST'],
    },
    github: {
      token: process.env.GITHUB_TOKEN,
      allowedPaths: ['/repos/**', '/user'],
      // Read-only — the agent can't push
      allowedMethods: ['GET'],
    },
    postgres: {
      connectionString: process.env.DATABASE_URL,
      // Agent Vault has less support here — we'll come back to this
      allowedQueries: 'readonly', // experimental in v0.4
    },
  },
  // Log every access — this I genuinely loved
  auditLog: {
    enabled: true,
    output: './logs/agent-vault-audit.jsonl',
  },
};
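Because every secret in this config comes from `process.env`, a variable that is missing in Railway silently becomes `undefined`. A small startup check can catch that early. This is a sketch under assumptions: `findMissingSecrets` and `SECRET_FIELDS` are hypothetical names, not part of Agent Vault, and the field list simply mirrors the three keys used in the config above.

```javascript
// Hypothetical startup check; not part of Agent Vault.
// Returns the namespaces whose secret resolved to undefined or an empty string.
const SECRET_FIELDS = ['secret', 'token', 'connectionString'];

function findMissingSecrets(config) {
  const missing = [];
  for (const [namespace, entry] of Object.entries(config.credentials)) {
    // Find which secret field this namespace uses, if any.
    const field = SECRET_FIELDS.find((name) => name in entry);
    if (!field || !entry[field]) missing.push(namespace);
  }
  return missing;
}

// Usage: fail fast before the proxy starts listening.
// const missing = findMissingSecrets(require('./agent-vault.config.js'));
// if (missing.length > 0) throw new Error(`Missing secrets: ${missing.join(', ')}`);
```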

Real installation time: 47 minutes. Clear documentation, one bug with environment variables in Docker that I fixed in 15 minutes using an already-open GitHub issue.

What Agent Vault solves well

Three concrete things that worked from day one:

1. Credential isolation from the agent process

The agent never sees the real credential. It sends POST https://api.stripe.com/v1/customers through the proxy, and Agent Vault injects the Bearer token. If the agent gets compromised (prompt injection, for example, a topic I get into in my analysis of LLM-generated security reports), the real credentials aren't sitting in its context or its process memory.

That's real value. Not nothing.

2. Automatic audit log

Every request lands in agent-vault-audit.jsonl with a timestamp, endpoint touched, HTTP method, and — this is the good part — the agent's tool call that originated it (if you set up the agent SDK integration).

{"ts":"2026-07-14T09:23:41Z","credential":"stripe","path":"/v1/customers","method":"GET","agent_tool":"get_customer_info","prompt_hash":"a3f...","latency_ms":234}
{"ts":"2026-07-14T09:23:44Z","credential":"stripe","path":"/v1/payment_intents","method":"POST","agent_tool":"create_payment","prompt_hash":"a3f...","latency_ms":891}

That log showed me something uncomfortable: in a 40-minute session, my agent made 23 calls to Stripe. I was expecting around 8. The extra 15 were redundant GET /v1/customers calls the agent was making to "confirm" context at each step of the loop. That's a design problem on my end, not Agent Vault's — but I never would have seen it without the audit log.
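Spotting that kind of redundancy doesn't need special tooling. A rough sketch that counts calls per credential, method, and path from the JSONL file; the field names come from the sample entries above, while `countCalls` is just an illustrative name.

```javascript
// Count calls per credential + method + path from audit-log lines (JSONL).
// Field names (credential, method, path) match the sample log entries.
function countCalls(jsonl) {
  const counts = new Map();
  for (const line of jsonl.split('\n')) {
    if (line.trim() === '') continue; // skip blank lines
    const entry = JSON.parse(line);
    const key = `${entry.credential} ${entry.method} ${entry.path}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Most frequent first, so redundant calls float to the top.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Usage:
// const fs = require('node:fs');
// console.table(countCalls(fs.readFileSync('./logs/agent-vault-audit.jsonl', 'utf8')));
```

Grouping by `prompt_hash` as well would show how many of those calls belong to a single user request, which is where repeated "confirmation" reads become obvious.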

3. Path filtering as a minimum blast-radius layer

The agent simply can't touch /v1/refunds because it's not in allowedPaths. That's a concrete safety net. Not sufficient on its own (I'll explain why), but dramatically better than nothing.
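For intuition, this is roughly what an allow-list check of this kind does. It is a hypothetical re-implementation, not Agent Vault's matcher, and the glob semantics are my assumption: `**` spans path segments, `*` stays within one segment, and a pattern without wildcards must match the path exactly.

```javascript
// Hypothetical allow-list check; Agent Vault's real matching rules may differ.
function globToRegExp(pattern) {
  // Escape regex metacharacters, leaving "*" for the glob handling below.
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  const source = escaped
    .replace(/\*\*/g, '\u0000') // protect "**" before handling single "*"
    .replace(/\*/g, '[^/]*')    // "*" matches within one path segment
    .replace(/\u0000/g, '.*');  // "**" matches across segments
  return new RegExp(`^${source}$`);
}

function isAllowed(rule, method, path) {
  const methodOk = rule.allowedMethods.includes(method.toUpperCase());
  const pathOk = rule.allowedPaths.some((p) => globToRegExp(p).test(path));
  return methodOk && pathOk;
}
```

With the Stripe rule from the config, `isAllowed(rule, 'POST', '/v1/refunds')` is false because no pattern matches, which is the blast-radius limit described above.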

What Agent Vault doesn't solve (and should say so more clearly)

Here's the crux of it.

Agent Vault controls access: which endpoints, which methods, which credential. It doesn't control intent: why the agent is touching that endpoint at this particular moment in the conversation.

Concrete example. If my agent has permission to POST /v1/payment_intents, Agent Vault will let that request through. It has no idea whether the agent is doing it because the user said "process payment for order 1234" or because the agent arrived at that conclusion through a reasoning chain that drifted from an ambiguous context.

The problem isn't the what — it's the why and the when.

This reminds me of something I learned building with MCP: tool protocols define capabilities, but they don't define contextual authorization. Agent Vault is excellent at the capabilities layer. The contextual authorization layer is still unsolved territory.

Three specific gotchas I hit:

Gotcha 1: rate limiting per credential, not per user session

Agent Vault lets you define rate limits per credential:

stripe: {
  secret: process.env.STRIPE_SECRET_KEY,
  rateLimit: { requests: 100, windowMs: 60000 }, // 100 req/min
}

But that's the global limit for all agents using that credential. If you have multiple simultaneous users in production, one agent going haywire can exhaust the rate limit for everyone else. You need your own session logic on top.
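A minimal sketch of what that session logic could look like: a fixed-window limiter keyed by session ID that runs before the request ever reaches the proxy. Everything here is hypothetical (`createSessionLimiter`, the in-memory `Map`, the injectable `now` clock); it is not an Agent Vault feature.

```javascript
// Hypothetical per-session limiter to run in front of the shared credential
// limit. Fixed window, in-memory, single process.
function createSessionLimiter({ requests, windowMs, now = Date.now }) {
  const windows = new Map(); // sessionId -> { start, count }

  return function allow(sessionId) {
    const t = now();
    const current = windows.get(sessionId);
    // First call, or the previous window has expired: start a new window.
    if (!current || t - current.start >= windowMs) {
      windows.set(sessionId, { start: t, count: 1 });
      return true;
    }
    if (current.count >= requests) return false; // budget used up
    current.count += 1;
    return true;
  };
}

// Usage: give each session a slice of the 100 req/min credential budget.
// const allow = createSessionLimiter({ requests: 20, windowMs: 60000 });
// if (!allow(sessionId)) { /* reject or queue the tool call */ }
```

This version never evicts old sessions and does not share state across instances, so a real deployment would need expiry and probably an external store.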

Gotcha 2: database credentials are second-class citizens

PostgreSQL/MySQL support is marked "experimental" in v0.4 and it shows. The allowedQueries: 'readonly' option doesn't actually parse SQL to verify it's truly read-only — it trusts your ORM or driver to handle that correctly. That's a false sense of security.

For my Railway PostgreSQL setup, I ended up leaving the database connection outside Agent Vault entirely and handling it with my own wrapper that validates the query type before executing.
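A wrapper along those lines can be sketched as below. This is an illustration, not my production code and not a SQL parser: `isReadOnlyQuery` is a hypothetical, deliberately conservative keyword check that will reject some legitimate queries. It also cannot see side effects hidden inside functions called from a SELECT, so it should sit on top of a database role that only has SELECT privileges, not replace one.

```javascript
// Hypothetical guard; a conservative keyword check, not a SQL parser.
function isReadOnlyQuery(sql) {
  const text = sql.trim().replace(/;\s*$/, ''); // tolerate one trailing semicolon

  // Refuse stacked statements and comments outright, even when the characters
  // only appear inside a string literal.
  if (/;|--|\/\*/.test(text)) return false;

  // Only plain SELECT statements; this also rejects WITH ... CTEs, which can
  // wrap INSERT/UPDATE/DELETE in PostgreSQL.
  if (!/^select\b/i.test(text)) return false;

  // Reject SELECT forms that write or lock rows: SELECT INTO and FOR UPDATE/SHARE.
  return !/\b(into|for\s+(update|share|no\s+key\s+update|key\s+share))\b/i.test(text);
}

// Usage:
// if (!isReadOnlyQuery(sql)) throw new Error('Agent attempted a non-read-only query');
```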

Gotcha 3: latency that stacks up

Every request goes through the proxy. In my tests that added 12ms per call on average. Not dramatic on its own, but when the agent makes 23 Stripe calls in a session (as the audit log revealed), that's 23 × 12ms = 276ms of accumulated proxy overhead, assuming the calls run one after another. Compared with the benchmarks I've seen around TPU inference latency, this overhead is minor, but in long agent loops you feel it.

What an honest architecture actually looks like

What I'm running today, after a week with Agent Vault in staging:

User
   │
   ▼
Agent (Next.js API Route)
   │
   ├── [tools that don't touch external APIs] → direct
   │
   └── [tools that touch external APIs]
          │
          ▼
      Agent Vault Proxy (:8743)
          │
          ├── Audit log (JSONL)
          ├── Path filtering
          └── Credential injection
                 │
                 └── External APIs (Stripe, GitHub, etc.)

What Agent Vault does NOT cover and I have to handle myself:

Agent
   │
   └── [contextual authorization] → my own logic
          │
          ├── Does this tool call make sense given the prompt?
          ├── Did the user explicitly authorize this action?
          └── Are we in a loop that shouldn't be happening?

That second box is CrabTrap territory (output quality) mixed with something that still doesn't exist as a mature product: an intent validator for agents. Agent Vault and CrabTrap are complementary layers, not substitutes.
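To show the shape of that missing layer, here is a deliberately small sketch of a pre-flight check for tool calls. Every name in it is hypothetical (`authorizeToolCall`, `SENSITIVE_TOOLS`, the `session` object); none of it comes from Agent Vault or CrabTrap. It only covers two of the three questions in the diagram, explicit user authorization and runaway loops. Whether a call "makes sense given the prompt" is the hard part and is not attempted here.

```javascript
// Hypothetical pre-flight check for a tool call.
const SENSITIVE_TOOLS = new Set(['create_payment']);

function authorizeToolCall(call, session, { maxRepeats = 3 } = {}) {
  // Sensitive tools need an explicit grant recorded from the user's own turn.
  if (SENSITIVE_TOOLS.has(call.tool) && !session.userGrants.includes(call.tool)) {
    return { allowed: false, reason: 'not explicitly authorized by the user' };
  }

  // Loop guard: the same tool with identical arguments, too many times.
  const signature = `${call.tool}:${JSON.stringify(call.args)}`;
  const repeats = session.history.filter((s) => s === signature).length;
  if (repeats >= maxRepeats) {
    return { allowed: false, reason: 'repeated identical call, possible loop' };
  }

  session.history.push(signature);
  return { allowed: true, reason: 'ok' };
}
```

A check like this runs before the request leaves the agent, so the proxy still enforces paths and methods while this layer decides whether the call should happen at all.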

FAQ — What the team Slack channel asked when I demoed it

Does Agent Vault work with any agent or only specific frameworks?

Works with anything that can make HTTP calls. LangChain, Mastra, LlamaIndex, a custom SDK — all you need to do is point external API calls at the proxy instead of the original endpoints. The tool call integration for the audit log does require a specific SDK or manually adding the X-Agent-Tool header to each request.
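If you go the manual route, a tiny helper keeps the header consistent across tools. Only the `X-Agent-Tool` header name comes from Agent Vault; `withAgentTool` is a hypothetical helper, and it assumes `headers` is a plain object, not a `Headers` instance.

```javascript
// Hypothetical helper: returns fetch options with the X-Agent-Tool header set.
// Assumes init.headers, if present, is a plain object.
function withAgentTool(toolName, init = {}) {
  return {
    ...init,
    headers: { ...(init.headers || {}), 'X-Agent-Tool': toolName },
  };
}

// Usage (proxyUrl is whatever address your proxy listens on):
// await fetch(proxyUrl, withAgentTool('get_customer_info', { method: 'GET' }));
```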

Is it safe to run in production today?

I have it in staging and I'm keeping it there until v0.5 ships with more solid database support. For REST APIs like Stripe or GitHub, yes — I'd consider it production-ready. For databases, not yet.

How is this different from HashiCorp Vault or AWS Secrets Manager?

Vault and Secrets Manager solve secure credential storage. Agent Vault solves dynamic injection of those credentials into HTTP requests without the agent ever seeing them. They're different layers — in fact, Agent Vault can read its credentials from Vault or Secrets Manager. They're not competitors. They're complementary.

Does the proxy become a single point of failure?

Yes, and you have to design for that. On Railway I ran it with automatic restart and had zero downtime in a week of staging. For real production with high traffic, you need at least two instances and a health check. The Agent Vault docs touch on this but don't give a complete operational guide.
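As a rough illustration of the two-instance setup, the agent side needs some way to pick a proxy that recently passed its health check. The sketch below only covers that selection step. `pickHealthyProxy`, the instance record shape, and the 10-second freshness window are assumptions of mine; how the probe itself is performed depends on what your deployment exposes.

```javascript
// Hypothetical failover helper; not part of Agent Vault. Each instance record
// carries the result of its most recent health probe.
function pickHealthyProxy(instances, { now = Date.now(), maxAgeMs = 10000 } = {}) {
  // Take the first instance that is healthy and whose probe is recent enough.
  const fresh = instances.find(
    (i) => i.healthy && now - i.checkedAt <= maxAgeMs
  );
  if (!fresh) throw new Error('No healthy Agent Vault instance available');
  return fresh.url;
}
```

Failing loudly when no instance is healthy is intentional here: falling back to direct API calls would put the credentials back in the agent process.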

Does it solve prompt injection?

Partially. If an attacker gets the agent to execute a malicious tool call, Agent Vault can limit the blast radius (it can't touch endpoints outside allowedPaths). But it doesn't detect that the tool call was the result of an injection — for that you need something higher up the chain, closer to what I explored with LLM-generated security reports.

Is it worth it given the 12ms overhead per call?

For most agent use cases, yes. The overhead is real but predictable. What Agent Vault gives you — audit log, path filtering, credential isolation — is worth more than those 12ms in almost any serious production architecture.

What I'm taking away and what I'm not buying

All of this reminded me of when the Next.js App Router dropped in 2022 and I spent two weeks furious because it broke everything I knew. Then I understood it was the right abstraction. With Agent Vault I feel something similar, but inverted: the abstraction exists and it's correct at its layer, but it's being sold as if it solves more than it actually does.

What I accept: Agent Vault is the best open-source solution I've tested for the credential storage and isolation problem in agents. The audit log alone justifies the install.

What I don't buy: that credential proxy = agent security. They're treated as the same problem in the same pitch doc, and they're not. An agent can behave in ways that break all your security assumptions without touching a single endpoint outside the allowed list — just by using the allowed ones in ways you didn't anticipate.

The honest trade-off: install it, use the audit log to understand what your agent is actually doing, and build your contextual authorization layer on top. Not the other way around.

If you've built something that attacks the intent validation problem in agents — that second box I drew above — I want to see it. That's the gap that's still wide open.
