DEV Community: Nico

Barbacane vs Portkey and LiteLLM: picking an AI gateway in 2026

Nico — Mon, 18 May 2026 14:34:38 +0000

If you are picking an AI gateway in 2026, Portkey, LiteLLM, and Barbacane are all real options. They overlap enough to make the choice real, and they differ enough that the right answer depends on what else you want your gateway to do.

Every AI-gateway evaluation runs into the same question after the first demo: once your OpenAI calls go through a gateway, what about everything else? The rate limits your platform team owns, the auth your security team owns, the audit trail your compliance team owns, the spec-first workflow your API team relies on, the agents calling back the other way. The more of that lives next to the AI traffic, the more the choice of AI gateway becomes an architecture decision and not a feature match.

This post compares the three products on that axis. What they share. What separates them. How to pick.

The overlap: outbound LLM proxying

All three products sit between your application and one or more LLM providers. All three give you:

Provider abstraction with an OpenAI-compatible API surface
Fallback chains when a provider errors, times out, or is unreachable
Token usage and latency metrics per call and per provider
Budget and rate-limit guardrails at the gateway layer
Prompt and response guardrails (scope varies by product)

If outbound LLM proxy is all you need, all three will work. The differences show up in what else the gateway does, how it is configured, and what happens when your requirements grow beyond the LLM path.
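To make the shared "fallback chain" behaviour concrete, here is a minimal Python sketch of the idea, not of any of the three products' internals. The `ProviderError` type and the two stand-in provider functions are hypothetical; a real gateway would wrap actual provider clients and distinguish more failure types.

```python
from typing import Callable, Sequence


class ProviderError(Exception):
    """Raised by a provider call that errors, times out, or is unreachable."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (provider_name, reply)."""
    failures = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            failures.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(failures))


# Hypothetical stand-ins for real provider clients.
def flaky_primary(prompt: str) -> str:
    raise ProviderError("timeout")


def healthy_secondary(prompt: str) -> str:
    return f"echo: {prompt}"


chain = [("primary", flaky_primary), ("secondary", healthy_secondary)]
used, reply = complete_with_fallback("hello", chain)
```

The caller never sees the first failure; it only learns which provider finally answered, which is also the piece of information per-provider metrics need.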

What Portkey is

Portkey is a commercial AI gateway, available as managed SaaS or self-hosted. It focuses specifically on the LLM path and invests heavily in the operator experience: a configuration UI, a playground, a prompt library, an observability dashboard purpose-built for LLM traffic. It tends to be the right pick if you want an AI gateway as a product (vendor support, managed upgrades, fancy UI) and AI is the thing your team cares about most.

What LiteLLM is

LiteLLM is an open-source Python proxy that exposes a very broad set of LLM providers behind one unified OpenAI-compatible API. It is actively developed, has wide provider coverage, and can run either as a Python library or as a proxy server. It is a good pick if you want broad provider support, an MIT-licensed OSS foundation, and a Python-native runtime that plays well with your ML tooling.

What Barbacane is

Barbacane is an open-source, Rust-native API gateway. AI capability is built from composable plugins rather than a monolithic feature:

ai-proxy dispatcher routes requests to OpenAI, Anthropic, and Ollama (plus any OpenAI-compatible endpoint: vLLM, TGI, LocalAI, Azure). The client always sends OpenAI format; the dispatcher translates per provider, pins the provider API version, and handles SSE streaming where the provider supports it.
Named targets + cel middleware express policy-driven routing. A target like premium is a full provider profile (provider, model, credentials); the cel middleware writes ai.target into the request context when a rule matches, and the dispatcher picks the target from there. Credentials never leave dispatcher config.
ai-prompt-guard, ai-token-limit, ai-cost-tracker, ai-response-guard middlewares compose around the dispatcher. Each is a separate, skippable concern with named profiles, CEL expressions, and fail-closed defaults on misconfig.
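The split between "a rule names a target" and "the dispatcher resolves the target" can be sketched in a few lines of Python. This is an illustration of the pattern only, not Barbacane's implementation: the `Target` class, the `roles` claim, and the `default` target are assumptions made for the example; only the `ai.target` context key and the `premium` target name come from the description above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Target:
    provider: str
    model: str
    api_key_env: str  # name of the env var holding the credential (hypothetical field)


# Dispatcher-side config: the only place credentials are referenced.
TARGETS = {
    "default": Target("ollama", "llama3", ""),
    "premium": Target("openai", "gpt-4o", "OPENAI_API_KEY"),
}

# Routing rules: (predicate over the request, target name to select).
RULES: list[tuple[Callable[[dict], bool], str]] = [
    (lambda req: "premium" in req.get("claims", {}).get("roles", []), "premium"),
]


def apply_rules(request: dict, context: dict) -> None:
    """Middleware step: the first matching rule writes ai.target into the context."""
    for predicate, target_name in RULES:
        if predicate(request):
            context["ai.target"] = target_name
            return


def pick_target(context: dict) -> Target:
    """Dispatcher step: resolve the target name, falling back to 'default'."""
    return TARGETS[context.get("ai.target", "default")]


ctx: dict = {}
apply_rules({"claims": {"roles": ["premium"]}}, ctx)
selected = pick_target(ctx)
```

The rule layer only ever handles a target name, which is why credentials can stay in dispatcher config.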

And one more capability Portkey and LiteLLM do not offer: Barbacane is also an MCP gateway. The same artifact that proxies your LLM traffic outbound also exposes your existing APIs to AI agents as tools inbound. One gateway covers both directions of AI traffic.

The architectural difference: monolithic AI proxy vs dispatcher plus middlewares

This is where the three products diverge.

Portkey and LiteLLM treat the AI gateway as a unified product: one binary, one config, one API surface. Every operational concern (rate limits, caching, observability, guardrails) is a feature baked into the proxy. This is the right shape when AI is the only traffic the gateway handles.

Barbacane treats the AI gateway as a set of primitives you compose:

The ai-proxy dispatcher handles translation and routing.
Each concern is a separate middleware, ordered explicitly in the spec.
You stack the middlewares you need, skip the ones you do not, and compose multiple instances of the same plugin (stack two ai-token-limit instances for a minute-and-hour window, stack multiple cel rules for routing).
The exact same primitives govern non-AI traffic on the same gateway.
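Stacking two limits for a minute-and-hour window is easiest to see in code. The sketch below is a generic fixed-window limiter in Python, written for illustration; it is not the `ai-token-limit` plugin, and the class name, budgets, and injected clock are assumptions.

```python
import time
from typing import Callable


class TokenWindowLimit:
    """Fixed-window token budget: at most `max_tokens` per `window_seconds`."""

    def __init__(self, max_tokens: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_tokens = max_tokens
        self.window_seconds = window_seconds
        self.clock = clock
        self.window_start = clock()
        self.used = 0

    def would_allow(self, tokens: int) -> bool:
        # Start a fresh window once the current one has elapsed.
        now = self.clock()
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.used = 0
        return self.used + tokens <= self.max_tokens

    def record(self, tokens: int) -> None:
        self.used += tokens


def admit(limits: list[TokenWindowLimit], tokens: int) -> bool:
    """Admit only if every stacked limit has room, then charge all of them."""
    if not all(limit.would_allow(tokens) for limit in limits):
        return False
    for limit in limits:
        limit.record(tokens)
    return True


# A fake clock keeps the example deterministic.
fake_now = [0.0]
per_minute = TokenWindowLimit(1_000, 60, clock=lambda: fake_now[0])
per_hour = TokenWindowLimit(1_500, 3_600, clock=lambda: fake_now[0])
stack = [per_minute, per_hour]
```

A request has to fit under both budgets: the minute window absorbs bursts, while the hour window caps sustained use even after the minute window has reset.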

The trade-off is sharp. If you want the shortest path from zero to "OpenAI call via a gateway", Portkey and LiteLLM win on time to go live. If you want AI traffic governed the same way your team already governs every other HTTP request, Barbacane's composition model gets you there without a second product to run, a second config source to reconcile, or a second telemetry stack to watch.

The architectural bet is the same one the service-mesh community made five years ago: specialized proxies for specialized traffic, or one data plane that handles every protocol your platform cares about. Both are valid; they produce different operational footprints.

Spec-first: OpenAPI as source of truth

Portkey and LiteLLM configure AI routes in their own config files (YAML for LiteLLM, config UI or SDK for Portkey). Barbacane configures AI routes in your OpenAPI spec:

paths:
  /v1/chat/completions:
    post:
      operationId: chatCompletion
      summary: Route LLM chat completion requests
      x-barbacane-dispatch:
        name: ai-proxy
        config:
          provider: openai
          model: gpt-4o
          api_key: "${OPENAI_API_KEY}"
          fallback:
            - provider: anthropic
              model: claude-sonnet-4-20250514
              api_key: "${ANTHROPIC_API_KEY}"
            - provider: ollama
              model: llama3
              base_url: http://ollama:11434

The documentation your frontend team reads, the client SDKs they generate, the contracts your platform team enforces, and the gateway config your SRE team operates all derive from the same file. Adding an LLM route adds an entry in the spec. Renaming a parameter renames it everywhere. Linting with Vacuum shifts left: it runs in your editor, in a pre-commit hook, or in CI, so provider typos and invalid regex patterns fail at lint time, not at call time.
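As a rough picture of what "a provider typo fails at lint time" means, here is a plain-Python check over an already-parsed spec. It is a stand-in for illustration, not a Vacuum rule and not Barbacane's own validation; the set of known providers is an assumption taken from the providers named in this post.

```python
KNOWN_PROVIDERS = {"openai", "anthropic", "ollama"}  # assumption: providers named in this post


def lint_dispatch_providers(spec: dict) -> list[str]:
    """Return one message per unknown provider in ai-proxy dispatch blocks."""
    problems = []
    for path, operations in spec.get("paths", {}).items():
        for method, operation in operations.items():
            if not isinstance(operation, dict):
                continue  # skip path-level keys such as "parameters"
            dispatch = operation.get("x-barbacane-dispatch")
            if not dispatch or dispatch.get("name") != "ai-proxy":
                continue
            config = dispatch.get("config", {})
            # Check the primary provider and every fallback entry.
            for entry in [config, *config.get("fallback", [])]:
                provider = entry.get("provider")
                if provider not in KNOWN_PROVIDERS:
                    problems.append(
                        f"{method.upper()} {path}: unknown provider {provider!r}"
                    )
    return problems


spec = {
    "paths": {
        "/v1/chat/completions": {
            "post": {
                "operationId": "chatCompletion",
                "x-barbacane-dispatch": {
                    "name": "ai-proxy",
                    "config": {
                        "provider": "openai",
                        "model": "gpt-4o",
                        "fallback": [{"provider": "antropic", "model": "claude"}],
                    },
                },
            }
        }
    }
}
problems = lint_dispatch_providers(spec)
```

Run in CI, a non-empty `problems` list would fail the build before the misspelled fallback is ever exercised by a real outage.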

If your organization is already spec-first for non-AI APIs, extending that discipline to AI routes is the cheapest integration path. If you do not run spec-first APIs, Portkey and LiteLLM feel more familiar because they do not ask you to change your workflow.

The inbound direction: MCP

One axis Portkey and LiteLLM do not compete on.

Portkey and LiteLLM sit between your application and the LLM. They do not stand between an AI agent and your APIs. That inbound direction is a different gateway category; we covered it at length in the canonical MCP gateway post.

Barbacane is a full MCP gateway in addition to its outbound AI capability. One artifact handles both directions. Whether that matters depends on whether agents calling your APIs is in scope:

If you are building an agent product and your agents only hit public tools and third-party services, the inbound direction does not apply and the MCP capability is not doing work for you.
If your agents call your internal APIs, or if you are a platform team preparing to expose internal APIs to agents built elsewhere, the inbound direction is real work. Barbacane treats it as a first-class concern. Portkey and LiteLLM leave it outside the gateway entirely, which means a separate MCP server per service and all the sprawl the canonical post describes.

When to pick which

Situation	Pick
Fastest path from zero to an OpenAI call via a gateway, with an operator UI	Portkey
Very broad LLM provider coverage, Python-native, OSS-first	LiteLLM
Managed SaaS with vendor support and a polished dashboard	Portkey
AI gateway as part of a broader API gateway, not a second box	Barbacane
AI routes defined in your OpenAPI spec alongside the rest of your API	Barbacane
Same gateway also exposes your APIs to AI agents via MCP	Barbacane
OSS, self-hostable, Rust-native, FIPS-ready for regulated-industry posture	Barbacane
Platform team; AI is one of many gateway concerns (auth, routing, observability)	Barbacane
AI-first product team; LLM calls are the only traffic the gateway proxies	Portkey or LiteLLM

Feature comparison

A compact, direction-setting comparison. All three products evolve; check current docs before committing.

Concern	Portkey	LiteLLM	Barbacane
Outbound LLM proxy	Yes	Yes	Yes (`ai-proxy` dispatcher)
Inbound MCP gateway	No	No	Yes
Provider coverage	Broad	Very broad (100+ models)	OpenAI, Anthropic, Ollama, plus any OpenAI-compat API
Provider fallback	Yes	Yes	Yes
Policy-driven routing	Yes	Yes	Yes (via `cel` middleware + named targets)
Prompt and response guardrails	Built in	Built in	`ai-prompt-guard` + `ai-response-guard` middlewares
Token rate limits	Built in	Built in	`ai-token-limit` middleware
Cost tracking	Built-in dashboard	Built-in metrics	`ai-cost-tracker` middleware
Source of truth for config	Config UI or SDK	YAML config	OpenAPI spec
Runtime	SaaS and self-host	Python proxy	Rust binary
License	Commercial	MIT	AGPLv3 + commercial
Governs non-AI HTTP traffic	No	No	Yes (full API gateway)

Where a row says "No", the product was not designed for that concern. Forcing a tool into the wrong role is how shadow stacks start.

What to watch for during procurement

If you are being pitched an AI gateway and the first question is "do you already run an API gateway?", you are in the right conversation. If it is not asked, ask it yourself. The answer changes what you need from the new product.

A short procurement checklist:

Where does AI gateway config live? If the answer is "a second config file", you are creating a drift source. Prefer products that integrate with the spec or config surface your team already uses.
Is the feature set monolithic or composable? Monolithic is simpler day one and harder to extend. Composable is more to learn and easier to shape to your operational model.
Does it govern agent traffic too? If agents calling your APIs is on your roadmap, ask about MCP. If not, skip.
How does it integrate with your observability stack? Prometheus, OpenTelemetry, structured logs. Avoid products that ship their own telemetry you have to separately consume.
Self-hosting path and license. SaaS is fine for many teams; regulated, on-prem, or air-gapped environments will need an OSS, self-hostable option.

Closing thoughts

All three products handle the core outbound LLM path competently. The axis that differentiates them is how the AI gateway relates to the rest of your infrastructure:

If AI is the primary problem and the AI gateway stands alone, Portkey or LiteLLM will get you live faster. Pick Portkey if you want SaaS with a UI. Pick LiteLLM if you want OSS breadth and a Python runtime.
If AI is one of several gateway concerns and you want one spec-first artifact covering auth, rate limits, routing, AI, and MCP, Barbacane is the architecture fit.

Pick by architecture, not feature count. The feature sets will converge; the architectural assumptions will not.

For the Barbacane side of the comparison, the /ai page is the five-minute version, and the canonical MCP gateway post is the longer read. For Portkey and LiteLLM, their own docs are the right place to start; their positioning is consistent enough that a fair comparison is easier now than it was a year ago.

Why agents break where developers cope: API governance as agent readiness

Nico — Tue, 12 May 2026 15:48:33 +0000

Every API team has a list of things they keep meaning to fix. Agents are about to decide which of those things are actually optional.

If you have worked on an internal API platform for any length of time, you know the inventory. The endpoint that returns 200 with an error body instead of 4xx. The field that is documented as required and is, in practice, sometimes null. The auth header that is technically optional because one legacy caller never adopted the new flow. The OpenAPI spec that is mostly right, drifting from reality at the edges. The rate-limit response that returns a different shape on the staging cluster than in production.

None of this stops anyone from shipping. Human developers absorb it. They read the issue tracker, ask in Slack, copy a working example from another service, and move on. The API works, in the sense that the people calling it have learned how to call it.

Then agents show up, and the bill comes due.

What human developers absorb

Most internal APIs are held together by a layer of unwritten knowledge. Some of it is documented, most of it is not, and a meaningful slice is contradictory. Developers cope by reading source code, copying from working clients, asking the team that owns the service, and pattern-matching from other APIs they have used.

A short, non-exhaustive list of things human developers routinely route around:

Spec drift. The OpenAPI file says one thing, the runtime returns another. Devs notice the divergence, log a ticket, and update their client by hand.
Field shape inconsistency. created_at is a Unix timestamp on one endpoint, an ISO string on another, both in the same service. The dev writes a small adapter.
Inconsistent error contracts. Some endpoints return RFC 9457 problem details, some return a custom envelope, some return a string. Each is fine if you know about it.
Auth quirks. The endpoint accepts either a JWT in Authorization or a session cookie, but only the JWT path enforces the scope check. Nobody documents this; everyone who matters knows.
Undocumented side effects. Calling POST /orders also triggers a webhook to billing. The doc does not mention it; the integration tests imply it.
Inconsistent pagination. Some endpoints use cursor pagination, some use offset, some return the whole list because "it is a small table." Until it is not.
Rate limit signals. One service returns 429 with Retry-After, another returns 503, another returns 200 with an empty list. Backoff logic is written defensively.
Operation IDs and descriptions. Often missing, sometimes copy-pasted, occasionally lying. Developers ignore the field and read the path.

Every one of these has a workaround. The workaround lives in someone's head, or in a wrapper library, or in a runbook, or in the code of the first team that integrated and figured it out. The API surface and the API contract are two different things, and human developers spend a non-trivial fraction of their time reconciling them.

This has been tolerable for fifteen years because the cost of friction is bounded by the patience of the human on the other end. The pattern works until the other end stops being human.
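The "small adapter" a developer writes for the `created_at` case looks something like the Python sketch below. The function name and the treat-naive-values-as-UTC rule are assumptions for the example; the point is that this knowledge lives in client code, where an agent cannot see it.

```python
from datetime import datetime, timezone


def normalize_created_at(value) -> datetime:
    """Accept a Unix timestamp (seconds) or an ISO 8601 string; return UTC."""
    if isinstance(value, bool):
        raise TypeError("created_at must be a number or a string")
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    if isinstance(value, str):
        # datetime.fromisoformat() before Python 3.11 rejects a trailing "Z".
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
        # Assumption: timestamps without an offset are treated as UTC.
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc)
    raise TypeError("created_at must be a number or a string")
```

Both `normalize_created_at(0)` and `normalize_created_at("1970-01-01T00:00:00Z")` yield the same instant, which is exactly the reconciliation the declared contract never mentions.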

What agents cannot absorb

An LLM-driven agent is not a slower developer. It is a different kind of consumer, and the differences matter.

Agents work from declared contracts. When an agent calls a tool via MCP, the only thing it knows about that tool is what the schema says. Tool name, description, parameters, return type. If the schema is wrong, the agent's plan is wrong. There is no Slack channel to check. There is no senior engineer who has seen this bug before. The agent does what the contract advertises, and when reality diverges from the contract, the agent loops, hallucinates, or fails.
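A small Python sketch shows how little an agent gets to work with. It maps one OpenAPI operation to an MCP-style tool definition (`name`, `description`, `inputSchema`). The mapping rules are assumptions made for illustration, not a description of how any particular gateway derives tools.

```python
def operation_to_tool(operation: dict) -> dict:
    """Build an MCP-style tool definition from one OpenAPI operation object."""
    name = operation.get("operationId")
    description = operation.get("description") or operation.get("summary")
    if not name:
        raise ValueError("operation has no operationId, so the tool has no stable name")
    if not description:
        raise ValueError(f"{name}: no description, so an agent cannot tell what it does")

    properties, required = {}, []
    for param in operation.get("parameters", []):
        properties[param["name"]] = param.get("schema", {})
        if param.get("required"):
            required.append(param["name"])

    return {
        "name": name,
        "description": description,
        "inputSchema": {"type": "object", "properties": properties, "required": required},
    }


get_order = {
    "operationId": "getOrder",
    "summary": "Fetch one order by ID",
    "parameters": [
        {"name": "id", "in": "path", "required": True, "schema": {"type": "string"}}
    ],
}
tool = operation_to_tool(get_order)
```

Everything the agent will ever know about `getOrder` is in that returned dictionary; whatever the spec gets wrong, the tool gets wrong.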

Agents do not pattern-match across services the way developers do. A developer who has used twenty APIs has a strong prior about how errors look, how pagination works, how dates are encoded. They bring that prior to every new API and use it to fill in gaps. An agent has only the current contract. If the contract is incomplete, the gap is a coin flip.

Agents fail loudly and expensively. A confused developer pauses and asks a question. A confused agent burns tokens, retries, fans out, and produces a plausible-looking wrong answer. Every retry is paid for. Every loop shows up as latency. Every wrong tool call may have side effects the agent does not understand it has triggered.

Agents are cheap and parallel. There will be more agent traffic against your API in 2027 than human-developer traffic, by orders of magnitude, on any service that gets MCP-exposed. The handful of pain points your developers tolerate gets multiplied by a number you have not budgeted for.

Concretely, every item in the previous section becomes something different when the caller is an agent:

Spec drift becomes silent tool failure.
Field shape inconsistency becomes unparseable responses and retry loops.
Inconsistent error contracts become an agent that does not know whether to back off, retry, or escalate.
Auth quirks become unpredictable 401s the agent has no strategy for.
Undocumented side effects become consequences the agent never planned for and cannot reason about.
Inconsistent pagination becomes either truncated answers or runaway scans.
Inconsistent rate limit signals become traffic that does not back off, because the agent does not recognise the signal as a limit.
Missing operation IDs and descriptions become tools the agent never selects, because nothing in the schema told it what they do.

The cost of inconsistency stops being a friction tax on developer time and becomes a reliability tax on production.

The reframe: agent readiness is API discipline

For a decade, the case for spec-first development, schema linting, consistent error contracts, and centralized auth has been made on developer-experience grounds. Cleaner specs make for happier integrators. Consistent errors make for nicer SDKs. Centralized auth makes for a saner security review. All true, all worth doing, all easy to push to next quarter.

The agent era reframes the same investments as reliability work. The exact same backlog, with a different price tag attached.

Spec discipline is no longer about docs. Your OpenAPI file is the input to the tool surface agents see. A missing description is a tool the agent cannot use. A missing operationId is a tool with no stable identity. A wrong type is a contract the agent will honor and the runtime will reject.
Consistent error contracts are no longer about SDK ergonomics. They are the signal agents use to decide between "retry," "back off," "ask the human," and "escalate." Without consistency, every agent has to implement bespoke heuristics per endpoint, and most will get it wrong.
Centralized auth is no longer about security review. It is about giving agents one predictable failure mode for "you are not allowed to do this," instead of N endpoint-specific ones.
Rate limits and quotas are no longer about cost control. They are the only thing standing between an agent in a loop and your database.
Observability is no longer about debugging. It is about being able to answer "what did the agent do last night, in what order, with what consequences," which is a question your team will start getting asked.

This is not a new investment. It is the investment you have been postponing, with a new and less negotiable deadline.
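To see why consistency is what makes the "retry, back off, ask the human, escalate" decision possible, consider the Python sketch below. The status-to-action mapping is a hypothetical policy chosen for the example; it only works because it assumes every endpoint signals the same condition the same way.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"
    RETRY = "retry"
    BACK_OFF = "back off"
    ASK_HUMAN = "ask the human"
    ESCALATE = "escalate"


def decide(status: int, headers: dict[str, str]) -> tuple[Action, float]:
    """Map a response to (action, seconds to wait), assuming a uniform contract."""
    if 200 <= status < 300:
        return Action.PROCEED, 0.0
    if status == 429:
        # Retry-After may also be an HTTP date; this sketch handles seconds only.
        retry_after = headers.get("Retry-After", "")
        wait = float(retry_after) if retry_after.isdigit() else 1.0
        return Action.BACK_OFF, wait
    if status in (401, 403):
        return Action.ASK_HUMAN, 0.0
    if status in (502, 503, 504):
        return Action.RETRY, 1.0
    return Action.ESCALATE, 0.0
```

If one service signals a rate limit with 503 and another with a 200 and an empty list, no single function like this can be correct, and each agent ends up carrying per-endpoint heuristics instead.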

Where the gateway sits in this

If the contract is the thing agents depend on, the question is where the contract gets enforced. The historical answer, "each backend service enforces its own piece," is exactly the pattern that produced the inconsistencies in the first place. Ten teams will produce ten error envelopes, ten auth flows, and ten rate-limit responses, no matter how good the style guide is.

The gateway is the natural enforcement point because it is the only place that sees every request the same way. Auth, rate limits, validation, error shape, and observability can be applied uniformly there, without asking ten teams to coordinate. This is the same argument that drove the API gateway category fifteen years ago, with a different forcing function: agents instead of integration partners.

In practice, agent-readiness at the gateway looks like:

Spec-first compilation. The OpenAPI spec is not documentation that lives next to the gateway; it is the input the gateway is built from. Tool surfaces, request validation, and response schemas are all derived from the same artifact. There is nothing to drift.
Uniform auth and authorization. One identity story across every operation, with policy decisions made at the gateway before the request reaches a backend that might implement them differently.
One error contract. Every failure mode (validation, auth, rate limit, upstream error) returns the same shape. Agents learn one envelope, not ten.
Consistent rate limit signaling. One 429 shape, one Retry-After semantic, one rate-limit headers convention, applied to every route the gateway fronts.
Spec validation in CI. Missing descriptions, missing operation IDs, drifted types, and inconsistent error references fail the build, not the agent at 3am. Linting the spec used to be a nice-to-have. With agents in the loop, it is the cheapest reliability investment available.
Agent-specific middleware. Token-based limits, prompt and response guarding, cost attribution, and per-agent audit logs sit at the same layer as the API governance, not as a separate sidecar.

None of this is new infrastructure. It is the API gateway you already wanted, with the dev-experience arguments replaced by reliability arguments, and the deadline moved up.
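As an illustration of "one error contract, one 429 shape", here is a minimal Python sketch that renders every gateway-side failure as RFC 9457 problem details. The helper names and the detail messages are hypothetical; the member names (`type`, `title`, `status`, `detail`) and the `application/problem+json` media type come from the RFC.

```python
import json

PROBLEM_CONTENT_TYPE = "application/problem+json"


def problem(status: int, title: str, detail: str = "",
            type_uri: str = "about:blank", **extensions) -> tuple[int, dict, str]:
    """Build (status, headers, body) for an RFC 9457 problem details response."""
    body = {"type": type_uri, "title": title, "status": status}
    if detail:
        body["detail"] = detail
    body.update(extensions)  # extension members, if any
    return status, {"Content-Type": PROBLEM_CONTENT_TYPE}, json.dumps(body)


def rate_limited(retry_after_seconds: int) -> tuple[int, dict, str]:
    """The single 429 shape: problem details plus a Retry-After header."""
    status, headers, body = problem(
        429, "Too Many Requests", "Token budget exhausted for this window."
    )
    headers["Retry-After"] = str(retry_after_seconds)
    return status, headers, body


def unauthorized() -> tuple[int, dict, str]:
    return problem(401, "Unauthorized", "A valid bearer token is required.")
```

Because every failure path goes through `problem()`, a caller that can parse one error can parse all of them.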

What to do about it

The honest version of the advice: take the list of things your developers have been quietly working around, and treat it as a backlog of agent-readiness work. Sort it by how often the workaround shows up in client code, because that is a good proxy for how often an agent will hit it.

A few specific moves that pay off quickly:

Lint your specs in CI. Use Vacuum, Spectral, or equivalent. Fail the build on missing descriptions, missing operationIds, undeclared error responses, and inconsistent schema references. This is a one-week project that catches a quarter of the problems described above.
Pick one error contract and enforce it at the gateway. RFC 9457 problem details is a defensible default. The exact choice matters less than the consistency.
Move authorization decisions to the gateway where the decision can be expressed declaratively, not implemented per service. CEL for route-level guards, OPA for centralized policy; we wrote about the two approaches in the post on authorization at the gateway.
Audit your rate-limit responses. Pick one signaling convention. Make sure every limit, gateway-side and backend-side, follows it.
Treat your OpenAPI as the source of truth, not a derivative. If your gateway is configured separately from your spec, your gateway will drift from your spec, and your agents will fail in the gap. The compile-don't-configure pattern closes the gap by construction.
Decide what gets MCP-exposed at the spec level, not in a parallel registry. Per-operation opt-out via spec annotation keeps the agent surface honest. More on that here.

The pattern across all of these is the same: the contract is the thing agents depend on, the gateway is the place where the contract becomes operational, and the spec is the source of truth that connects them.
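The first move, linting specs in CI, is small enough to sketch. The Python below is a stand-in for what a Vacuum or Spectral ruleset would check, written against an already-parsed spec; the rule set and message wording are assumptions for the example.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def lint_operations(spec: dict) -> list[str]:
    """Flag operations an agent could not reliably select or interpret."""
    findings = []
    seen_ids: dict[str, str] = {}
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue
            where = f"{method.upper()} {path}"
            op_id = op.get("operationId")
            if not op_id:
                findings.append(f"{where}: missing operationId")
            elif op_id in seen_ids:
                findings.append(f"{where}: operationId {op_id!r} already used by {seen_ids[op_id]}")
            else:
                seen_ids[op_id] = where
            if not (op.get("description") or op.get("summary")):
                findings.append(f"{where}: missing description")
            codes = [str(code) for code in op.get("responses", {})]
            if not any(c.startswith(("4", "5")) or c == "default" for c in codes):
                findings.append(f"{where}: no error response declared")
    return findings


spec = {
    "paths": {
        "/orders": {
            "get": {
                "operationId": "listOrders",
                "summary": "List orders",
                "responses": {"200": {}, "429": {}},
            },
            "post": {"responses": {"201": {}}},
        }
    }
}
findings = lint_operations(spec)
```

In a CI job, exiting non-zero whenever `findings` is non-empty is enough to turn these from review comments into build failures.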

Closing thoughts

The agent era is not a new layer of work bolted on top of API platforms. It is a forcing function that converts the discretionary improvements API teams have been making the case for (contract discipline, consistency, centralized policy, observable surfaces) into operational requirements. The work was always worth doing. The new thing is that postponing it now produces user-visible failures instead of developer-visible friction.

The good news is that everything that makes an API agent-ready also makes it better for the humans who were already using it. The teams that get this right do not end up with a second platform for agents. They end up with one well-governed platform that happens to also be safe for agents to call.

At Barbacane we build that platform on the assumption that this is where things are going: spec-first compilation, gateway-enforced governance, MCP exposure derived from the same OpenAPI you already maintain. The categories matter more than the product. But the categories are converging quickly, and the teams that act on the convergence early will skip the second-system rebuild that the ones who delay are about to start.

Barbacane is open source (AGPLv3) and available at github.com/barbacane-dev/barbacane. If MCP and agent-readiness are on your roadmap, the /mcp page is the short version of how we approach it.

Authorization at the gateway: CEL and OPA for policy-driven access control

Nico — Thu, 07 May 2026 08:01:24 +0000

Authentication is a solved problem. Authorization is where things get complicated.

Once you know who is making a request, how do you decide what they're allowed to do?

At small scale, authorization is simple. An admin role gets full access, a viewer role gets read-only. You hardcode a few rules and move on. But enterprise APIs don't stay small. Teams multiply, services proliferate, and authorization logic becomes a tangled web of role hierarchies, resource ownership, temporal constraints, and regulatory requirements.

This is where most gateway setups start to crack.

The Authorization Gap

Traditional API gateways handle authentication well. JWT validation, API key checks, OAuth2 introspection: these are table stakes. But once the token is verified, the authorization question is typically punted to the application layer:

Gateway: "This is Alice, she has a valid token."
Backend: "Great, but can Alice delete this specific order?"
Gateway: "¯\_(ツ)_/¯"

This pushes authorization logic into every backend service. Each team implements its own checks. Rules diverge. Auditing becomes a nightmare. And when a policy change is needed (say, revoking access for a departing employee's role), you're patching multiple services instead of updating one policy.

The alternative is moving authorization decisions to the gateway, where they can be enforced before the request reaches your backends. But this requires expressive policy languages, not just role lists.

Two Philosophies, One Gateway

Barbacane now ships two authorization plugins that represent fundamentally different approaches to the same problem:

	CEL	OPA
Execution	Inline, in-process	External service (HTTP)
Language	CEL expressions	Rego policies
Latency	Microseconds	HTTP round-trip
Policy location	In the OpenAPI spec	In a policy repository
Best for	Route-level guards	Centralized policy management

They're not competing. They're complementary. Most enterprise deployments will use both.

CEL: Inline Policy Expressions

CEL (Common Expression Language) is a lightweight expression language designed by Google for evaluating policies. It's the same language behind Kubernetes validating admission policies, Envoy RBAC filters, and Firebase Security Rules. If you've written a CEL expression anywhere in the cloud-native ecosystem, you already know how it works in Barbacane.

The CEL plugin evaluates expressions directly in the gateway process. No sidecar. No HTTP call. No external dependency. You write the rule in your spec, and it runs at request time:

paths:
  /admin/users:
    get:
      x-barbacane-middlewares:
        - name: jwt-auth
          config:
            issuer: "https://auth.example.com"
        - name: cel
          config:
            expression: "'admin' in request.claims.roles"
            deny_message: "Admin access required"

The expression has access to the full request context: method, path, headers, query parameters, client IP, and (critically) the parsed claims from an upstream auth middleware:

// Only admins can DELETE
request.method != 'DELETE' || 'admin' in request.claims.roles

// Rate-limit bypass for internal services
request.headers['x-internal-service'] != '' && request.client_ip.startsWith('10.')

// Reads are open to any caller; every other method requires the admin role
request.method == 'GET' || request.claims.role == 'admin'

// Resource ownership via path params
request.claims.sub == request.path_params.user_id || 'admin' in request.claims.roles

Because CEL evaluates in-process, latency overhead is measured in microseconds. There's no network hop, no serialization, no external service to monitor. The expression is compiled once and cached for subsequent requests.

This makes CEL ideal for route-level guards: rules that are specific to an endpoint and belong alongside the route definition. When you read the spec, you see exactly what's enforced. The policy is the configuration, the same principle that drives everything in Barbacane.
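When reviewing a guard, it helps to spell out its truth table. The Python below mirrors two of the expressions above so their outcomes can be checked against sample requests. It is only a reasoning aid under the assumption that the request context has the shape shown; it is not how the gateway evaluates CEL.

```python
def owner_or_admin(request: dict) -> bool:
    """Mirror of: request.claims.sub == request.path_params.user_id
    || 'admin' in request.claims.roles"""
    claims = request["claims"]
    return claims["sub"] == request["path_params"]["user_id"] or "admin" in claims["roles"]


def admin_only_delete(request: dict) -> bool:
    """Mirror of: request.method != 'DELETE' || 'admin' in request.claims.roles"""
    return request["method"] != "DELETE" or "admin" in request["claims"]["roles"]


# Hypothetical sample requests.
alice_own = {"method": "GET", "path_params": {"user_id": "alice"},
             "claims": {"sub": "alice", "roles": ["viewer"]}}
alice_other = {"method": "GET", "path_params": {"user_id": "bob"},
               "claims": {"sub": "alice", "roles": ["viewer"]}}
admin_other = {"method": "DELETE", "path_params": {"user_id": "bob"},
               "claims": {"sub": "root", "roles": ["admin"]}}
```

Alice passes the ownership guard on her own record and fails it on Bob's, while the admin passes both guards regardless of whose record it is.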

OPA: Centralized Policy Engine

Open Policy Agent takes the opposite approach: policies live outside your specs, in a dedicated policy repository, written in Rego (OPA's purpose-built policy language). The gateway sends request context to OPA via its REST API and enforces the boolean decision.

paths:
  /orders/{id}:
    delete:
      x-barbacane-middlewares:
        - name: oauth2-auth
          config:
            introspection_endpoint: "https://auth.example.com/introspect"
        - name: opa-authz
          config:
            opa_url: "http://opa:8181/v1/data/api/authz/allow"
            include_claims: true

The OPA plugin constructs an input payload from the request and POSTs it to your OPA endpoint:

{
  "input": {
    "method": "DELETE",
    "path": "/orders/ord-42",
    "headers": { "x-auth-consumer": "alice" },
    "client_ip": "10.0.0.1",
    "claims": {
      "sub": "alice",
      "roles": ["order-manager"],
      "department": "fulfillment"
    }
  }
}

Your Rego policy evaluates the decision:

package api.authz

default allow := false

# Order managers can delete orders
allow if {
    input.method == "DELETE"
    startswith(input.path, "/orders/")
    input.claims.roles[_] == "order-manager"
}

# Admins can do anything
allow if {
    input.claims.roles[_] == "admin"
}

# Read-only access for authenticated users
allow if {
    input.method == "GET"
    input.claims.sub != ""
}
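The exchange between gateway and OPA can be sketched in Python using only the standard library. The payload shape follows the JSON above, and OPA's Data API wraps the decision in a `result` field that is absent when the rule is undefined. The function names, the one-second timeout, and the choice to deny when OPA is unreachable are assumptions for this example, not documented plugin behaviour.

```python
import json
import urllib.request


def build_opa_input(method, path, headers, client_ip, claims=None) -> dict:
    """Assemble the document that gets POSTed to the OPA Data API."""
    payload = {"method": method, "path": path, "headers": headers, "client_ip": client_ip}
    if claims is not None:
        payload["claims"] = claims
    return {"input": payload}


def is_allowed(opa_response: dict) -> bool:
    """Only an explicit boolean true counts as allow; a missing result denies."""
    return opa_response.get("result") is True


def query_opa(opa_url: str, document: dict, timeout: float = 1.0) -> bool:
    request = urllib.request.Request(
        opa_url,
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return is_allowed(json.load(response))
    except (OSError, ValueError):
        # Assumption: deny when OPA is unreachable or returns invalid JSON.
        return False


doc = build_opa_input(
    "DELETE", "/orders/ord-42", {"x-auth-consumer": "alice"}, "10.0.0.1",
    {"sub": "alice", "roles": ["order-manager"], "department": "fulfillment"},
)
```

Calling `query_opa("http://opa:8181/v1/data/api/authz/allow", doc)` would perform the round-trip; the pure helpers are kept separate so the decision handling can be tested without a running OPA.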

This model introduces an HTTP round-trip per request, which is a real cost. But what you get in return is significant:

Centralized policy management. All authorization rules live in one repository, versioned and reviewed like code.
Decoupled policy evolution. Update policies without recompiling gateway artifacts or redeploying services.
Audit trails. OPA's decision logs provide a complete record of every authorization decision.
Complex logic. Rego supports data joins, partial evaluation, and recursive rules that go well beyond what inline expressions can express.

For organizations that need to answer "who had access to what, and when?" (think compliance-heavy industries, multi-tenant platforms, regulated APIs), OPA is the right tool.

Composing Authorization with Authentication

Both plugins are designed to slot into Barbacane's middleware chain after an authentication middleware. The auth plugin sets standard headers (x-auth-consumer, x-auth-consumer-groups, x-auth-claims) that the authorization plugin reads. This decoupling means you can swap auth methods without touching authorization logic:

# Global: authenticate with OIDC
x-barbacane-middlewares:
  - name: oidc-auth
    config:
      issuer_url: "https://accounts.google.com"
      audience: "my-api"

paths:
  /admin/settings:
    put:
      # Route-level: CEL guard for admin-only
      x-barbacane-middlewares:
        - name: cel
          config:
            expression: "'admin' in request.claims.roles"

  /reports/{id}:
    get:
      # Route-level: OPA for complex ownership rules
      x-barbacane-middlewares:
        - name: opa-authz
          config:
            opa_url: "http://opa:8181/v1/data/reports/access"
            include_claims: true

Notice what's happening: OIDC authentication is global, but authorization varies per route. Simple admin checks use CEL (no external dependency, microsecond overhead). Complex ownership checks delegate to OPA (centralized policy, full audit trail). The gateway runs the right tool for each endpoint.

This layered approach also composes with Barbacane's existing ACL middleware, which handles group-based allow/deny lists. For many routes, ACL is sufficient. CEL and OPA extend the authorization spectrum for cases where group membership alone isn't enough.

Choosing the Right Tool

Here's a practical decision framework:

Use ACL when:

Authorization is based on group/role membership (e.g., "admins can access /admin/*")
Rules are static allow/deny lists
You don't need expression logic

Use CEL when:

Rules are specific to a route and benefit from living in the spec
You need expressions beyond simple group checks (method + path + claims combinations)
Latency is critical (no external dependency)
The team maintaining the spec also owns the authorization rules

Use OPA when:

Policies are managed by a dedicated security/platform team
Rules are complex, cross-cutting, or frequently updated independently of deployments
You need audit logs of every authorization decision
Compliance requirements mandate centralized policy governance
Policies reference external data (user attributes, resource metadata)

Use CEL + OPA together when:

Simple route guards in CEL, complex cross-cutting policies in OPA
CEL as a fast pre-filter, OPA for the authoritative decision
Different teams own different parts of the authorization surface
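The "fast pre-filter, authoritative decision" combination is essentially a short-circuit. A minimal Python sketch, with both checks passed in as plain functions (the names and request shape are hypothetical):

```python
def authorize(request, pre_filter, authoritative):
    """Run a cheap local check first; consult the remote engine only if it passes."""
    if not pre_filter(request):
        return False  # rejected locally, no network call made
    return authoritative(request)


calls = []  # records which requests reached the "remote" policy engine


def has_any_role(request):
    # Plays the role of an inline CEL-style guard.
    return bool(request.get("roles"))


def remote_policy(request):
    # Plays the role of an OPA query; here it is just a local function.
    calls.append(request["path"])
    return "finance" in request["roles"]


print(authorize({"path": "/reports/1", "roles": []}, has_any_role, remote_policy))
print(authorize({"path": "/reports/2", "roles": ["finance"]}, has_any_role, remote_policy))
print(calls)
```

Only the second request reaches `remote_policy`, which is the point of the pre-filter: obviously invalid requests never pay for the round-trip.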

Toward Zero Trust at the Gateway

The traditional enterprise pattern (check roles in a middleware, check permissions in the backend, and hope they agree) is fundamentally at odds with zero trust. In a zero trust model, no request is trusted by default. Every call, whether it comes from the public internet or from a service two hops away in your Kubernetes cluster, must be explicitly verified against policy before it reaches its destination.

The API gateway is a natural enforcement point for this. It already sits on the request path. It already knows the caller's identity (via auth middleware). What's been missing is the ability to express and evaluate policies, not just check role lists.

That's what CEL and OPA bring to the table. Every request gets evaluated against an explicit policy. Internal traffic doesn't get a free pass. External traffic doesn't get a different code path. The same expressions, the same Rego rules, the same decision framework applies everywhere. And because policies are declared in the spec (CEL) or in a versioned policy repo (OPA), they're auditable. You can answer "what policy was enforced on this endpoint last Tuesday?" without digging through application logs.

This doesn't mean moving all authorization to the gateway. Fine-grained object-level checks ("can Alice edit this specific document?") still belong in the backend, where you have the data context. But coarse-grained and medium-grained decisions ("can this role call DELETE on this endpoint?", "does this department have access to this API?", "is this consumer allowed to write to production resources?") are gateway concerns. Enforcing them before the request reaches your backend reduces attack surface, simplifies backend code, and provides a single enforcement point that security teams can actually audit.

With CEL and OPA, Barbacane gives you two industry-standard tools for building this layer. CEL for the rules that belong next to the route definition. OPA for the policies that belong in a dedicated repository. Both enforced at the same gateway, both integrated with the same authentication chain, both verifiable before deployment.

Strengths and Tradeoffs

What works well:

CEL expressions are validated at compile time, catching typos before deployment
OPA integration uses the standard Data API, so any OPA deployment works
Both plugins produce RFC 9457 Problem Details for HTTP APIs, consistent with the rest of the gateway
Authentication and authorization are cleanly separated via standard headers

What to keep in mind:

CEL expressions live in your spec, so changes require recompilation and redeployment
OPA adds an HTTP round-trip per request (mitigated by running OPA as a local sidecar)
The OPA plugin evaluates a single boolean decision; structured deny reasons require custom Rego
CEL doesn't support external data lookups; if your policy needs database queries, use OPA

These are deliberate design choices. CEL optimizes for speed and spec-locality. OPA optimizes for flexibility and centralized governance. Barbacane gives you both, and the middleware chain lets you compose them per route.

Getting Started

Both plugins are available today. Add them to your spec, compile, and deploy:

# CEL: no external dependencies
barbacane compile --spec api.yaml -m barbacane.yaml -o api.bca

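# OPA: run OPA alongside your gateway
# (assumes your Rego files live in ./policies; OPA 1.x listens on localhost only unless --addr is given)
docker run -d -p 8181:8181 -v "$(pwd)/policies:/policies" openpolicyagent/opa:latest run --server --addr 0.0.0.0:8181 /policies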
barbacane compile --spec api.yaml -m barbacane.yaml -o api.bca

The middleware documentation covers configuration details, expression syntax, and OPA input format. The plugin development guide shows how to build custom authorization plugins if CEL and OPA don't fit your model.

Barbacane is open source (AGPLv3) and available at github.com/barbacane-dev/barbacane. The CEL and OPA authorization plugins ship with v0.1.x. Try them against your specs and let us know what works.

One gateway, many specs: how Barbacane unifies your API ecosystem

Nico — Tue, 05 May 2026 08:04:12 +0000

Most API tooling assumes one repository, one specification. But your architecture doesn't work that way.

In our previous article, we explored how Barbacane eliminates configuration drift by compiling your OpenAPI spec directly into the gateway's runtime artifact. One spec, one .bca file, zero drift.

But what happens when your architecture has more than one spec?

The Multi-Spec Reality

In a typical microservices setup, your API surface isn't described by a single file. Whether you're working contract-first or generating specs from code, you end up with multiple specifications:

A User Service with its own openapi.yaml
An Order Service with its own openapi.yaml
An Inventory Service exposing both REST endpoints and event consumers, described across OpenAPI and AsyncAPI files
Event schemas scattered across repos

Each file is validated in isolation, deployed independently, and versioned on its own timeline. Single-spec tools can't see across these boundaries, so cross-service mismatches only surface at runtime. The feedback loop is as slow as it gets: write specs, deploy, discover the conflict in production, fix, redeploy.

One Command, Multiple Specs

Barbacane's compile command accepts multiple specification files in a single invocation:

barbacane compile \
  -s services/user-service/openapi.yaml \
  -s services/order-service/openapi.yaml \
  -s services/inventory-service/openapi.yaml \
  -s services/inventory-service/asyncapi.yaml \
  -m barbacane.yaml \
  -o gateway.bca

The compiler parses every file (OpenAPI 3.x and AsyncAPI 3.0.x) and merges their routes into a single .bca artifact. The output message tells you exactly what you got:

compiled 4 spec(s) to gateway.bca (23 routes, 5 plugin(s) bundled)

One artifact. One routing table. One deployment.

What Multi-Spec Compilation Actually Does

When you pass multiple specs, the compiler:

Parses each file independently. OpenAPI and AsyncAPI specs are each validated against their respective standards.
Merges routes into a unified routing table. All operations from all specs end up in a single routes.json inside the .bca artifact. The gateway doesn't care which file a route came from; it serves them all.
Detects routing conflicts. If two specs define the same path and method combination (e.g., both declare GET /users/{id}), compilation fails with error E1010. This is a hard gate: you cannot produce an artifact with ambiguous routing.
Bundles everything together. WASM plugins, dispatcher configurations, and the original source specs are all packaged into the artifact. The source specs remain accessible at /__barbacane/specs for documentation and debugging.
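The merge-and-detect step is easy to picture in code. Below is a minimal Python sketch that works on already-parsed OpenAPI documents. It is not the Barbacane compiler: the `RouteConflict` exception, the `merge_routes` function, and the decision to treat `/users/{id}` and `/users/{userId}` as the same template are all assumptions made for illustration.

```python
import re

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


class RouteConflict(Exception):
    """Raised when two specs declare the same method and path."""


def normalize(path):
    # Treat /users/{id} and /users/{userId} as the same route template.
    # Whether the real compiler does this is an assumption of the sketch.
    return re.sub(r"\{[^/}]+\}", "{}", path)


def merge_routes(specs):
    """Merge parsed OpenAPI documents ({name: spec_dict}) into one routing table."""
    table = {}
    for spec_name, spec in specs.items():
        for path, operations in spec.get("paths", {}).items():
            for method in operations:
                if method.lower() not in HTTP_METHODS:
                    continue  # skip non-operation keys such as "parameters"
                key = (method.upper(), normalize(path))
                if key in table:
                    raise RouteConflict(
                        f"{key[0]} {path} declared in both {table[key]} and {spec_name}"
                    )
                table[key] = spec_name
    return table


users = {"paths": {"/users/{id}": {"get": {}, "parameters": []}}}
orders = {"paths": {"/orders": {"post": {}}, "/users/{userId}": {"get": {}}}}

print(merge_routes({"user-service": users}))
try:
    merge_routes({"user-service": users, "order-service": orders})
except RouteConflict as err:
    print("conflict:", err)
```

The second call fails because both specs claim GET on the same user route, which is the situation the article's E1010 error is described as guarding against.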

This isn't magic. It's the same compilation pipeline applied across multiple input files. But the practical impact is a meaningful shift left: routing conflicts that would previously surface as mysterious 404s or wrong-handler bugs in production now fail your build.

Specs Stay Accessible at Runtime

Compilation merges routes, but the original source specs aren't thrown away. They're embedded in the .bca artifact and served by the gateway at /__barbacane/specs:

GET /__barbacane/specs

This returns an index of every spec that was compiled into the running artifact, with links to the full OpenAPI and AsyncAPI documents. The served specs are stripped of Barbacane-specific extensions (x-barbacane-*), so what your API consumers and documentation tools see is clean, standard OpenAPI and AsyncAPI with no vendor-specific noise. And because these are the exact specs that were compiled, they can't drift from what the gateway is actually running.

No separate spec hosting. No stale docs. The gateway is the documentation server.
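Stripping vendor extensions from a parsed spec is a small recursive walk. The following Python sketch shows one way it could be done; the function name and the choice to return a copy rather than mutate in place are assumptions, not a description of Barbacane's implementation.

```python
def strip_extensions(node, prefix="x-barbacane-"):
    """Return a deep copy of a parsed spec without keys starting with prefix."""
    if isinstance(node, dict):
        return {
            key: strip_extensions(value, prefix)
            for key, value in node.items()
            if not (isinstance(key, str) and key.startswith(prefix))
        }
    if isinstance(node, list):
        return [strip_extensions(item, prefix) for item in node]
    return node


spec = {
    "openapi": "3.1.0",
    "x-barbacane-middlewares": [{"name": "oidc-auth"}],
    "paths": {
        "/reports/{id}": {
            "get": {
                "summary": "Fetch a report",
                "x-barbacane-middlewares": [{"name": "opa-authz"}],
                "x-internal-note": "other vendor extensions are kept",
            }
        }
    },
}

print(strip_extensions(spec))
```

Extensions with other prefixes, such as the hypothetical `x-internal-note` above, are left alone in this sketch.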

A Practical Example

Consider an e-commerce platform with four services:

User Service      → openapi.yaml
Order Service     → openapi.yaml
Inventory Service → openapi.yaml + asyncapi.yaml
Notification Svc  → asyncapi.yaml

Without multi-spec compilation, you'd deploy each service's gateway configuration independently, trusting that teams coordinated their path prefixes and schema versions. With Barbacane:

barbacane compile \
  -s services/user-service/openapi.yaml \
  -s services/order-service/openapi.yaml \
  -s services/inventory-service/openapi.yaml \
  -s services/inventory-service/asyncapi.yaml \
  -s services/notification-service/asyncapi.yaml \
  -m barbacane.yaml \
  -o gateway.bca

If the Order Service accidentally defines a route that collides with the User Service, compilation fails. You find out in seconds, not after a deploy.

CI/CD Integration

Multi-spec compilation fits naturally into a CI gate. Block merges to main if the combined specs don't compile cleanly:

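# .github/workflows/validate-contracts.yml
name: validate-contracts
on:
  pull_request:
    branches: [main]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Assumes the barbacane CLI is already on the runner's PATH (install step not shown)
      - name: Compile gateway artifact
        run: |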
          barbacane compile \
            -s services/user-service/openapi.yaml \
            -s services/order-service/openapi.yaml \
            -s services/inventory-service/openapi.yaml \
            -s services/inventory-service/asyncapi.yaml \
            -m barbacane.yaml \
            -o gateway.bca

A non-zero exit code (1 for validation failures, 2 for manifest errors) blocks the pipeline. No ambiguous warnings: either the specs compile together, or they don't.

Progressive Adoption

You don't have to compile everything at once. Start with your most critical services and expand:

# Start with two services
barbacane compile \
  -s user-service/openapi.yaml \
  -s auth-service/openapi.yaml \
  -m barbacane.yaml \
  -o gateway.bca

# Later, add event-driven services
barbacane compile \
  -s user-service/openapi.yaml \
  -s auth-service/openapi.yaml \
  -s order-service/openapi.yaml \
  -s order-service/asyncapi.yaml \
  -m barbacane.yaml \
  -o gateway.bca

Each additional spec increases the surface area of conflict detection. The more specs you compile together, the more mismatches you catch before deployment.

Strengths and Limitations

Multi-spec compilation extends the "compile, don't configure" philosophy across service boundaries, but it's worth understanding what it does and doesn't do today.

What it catches:

Routing conflicts (duplicate path + method across specs)
Spec-level validation errors (malformed OpenAPI/AsyncAPI)
Missing plugin or dispatcher declarations

What it doesn't do (yet):

Cross-spec schema validation (e.g., verifying that an Order object is consistent between two specs)
Breaking change detection between spec versions
Dependency graph analysis between services

These are real limitations. Multi-spec compilation today is primarily about route-level unification and conflict detection, not deep semantic analysis across your API ecosystem. For schema consistency, you'll still need complementary tooling or careful code review.

Shifting Left Across Service Boundaries

The idea behind "shift left" is simple: catch problems earlier in the development lifecycle, when they're cheapest to fix. Linters shift left on code quality. Type systems shift left on correctness. Multi-spec compilation shifts left on cross-service integration.

In our previous article, we showed how Barbacane shifts gateway configuration left by compiling the spec into the runtime artifact. Multi-spec compilation takes this further: instead of discovering that two services disagree on routing after deployment, you discover it at compile time, in CI, on a pull request.

It's not a silver bullet. Cross-service consistency is a hard problem, and route-level conflict detection is just one piece of the puzzle. But it's a piece that most gateway tooling doesn't offer at all, and one that pays off immediately in any multi-service architecture. The earlier you catch a conflict, the less it costs.

Barbacane is open source and available at github.com/barbacane-dev/barbacane. Check the documentation for the full CLI reference and getting started guide.

Beyond configuration drift: how Barbacane reimagines the API gateway with Rust and WASM

Nico — Wed, 29 Apr 2026 18:54:22 +0000

What if your OpenAPI spec wasn't just documentation, but the actual configuration of your production gateway?

For years, API teams have lived with a quiet frustration: the gap between specification and reality. You write a beautiful OpenAPI spec. You configure your gateway (Kong, Tyk, AWS API Gateway) with routes, plugins, and security rules. And then… drift happens. A hotfix bypasses the spec. A plugin gets misconfigured. The documentation lies. The gateway behaves unexpectedly. The contract between frontend and backend fractures.

This isn't a people problem. It's an architecture problem.

Enter Barbacane, a spec-driven API gateway built in Rust that treats your OpenAPI (and AsyncAPI) specification as the single source of truth. No separate configuration files. No UI clicks that diverge from Git. Just your spec, compiled into a self-contained artifact that runs at the edge with memory safety guarantees and sub-millisecond latency.

Let's dive into why this approach matters, and whether it's ready for your production workloads.

The Configuration Drift Crisis

Most API gateways follow the same pattern:

You write an OpenAPI spec (hopefully)
You separately configure the gateway via YAML, UI, or CLI
You hope these two artifacts stay in sync

This dual-source model creates inevitable drift:

# openapi.yaml
paths:
  /users/{id}:
    get:
      security: [{ jwt: [] }]

# kong.yaml (oops, forgot to add auth plugin!)
routes:
  - name: users-get
    paths: [/users/{id}]
    # missing jwt-auth plugin configuration

The result? A route that should require authentication ships to production wide open. Security teams panic. Post-mortems happen. Trust erodes.
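The drift in this example can be detected mechanically, which is what teams often end up scripting by hand. Here is a minimal Python sketch that cross-checks the two files once they are parsed. The simplified gateway structure mirrors the snippet above rather than Kong's full declarative format, and `unprotected_routes` and the `jwt-auth` plugin name are illustrative assumptions.

```python
def unprotected_routes(spec, gateway_config, auth_plugins=("jwt-auth",)):
    """List operations that declare security in the spec but have no auth plugin."""
    protected = set()
    for route in gateway_config.get("routes", []):
        names = {plugin.get("name") for plugin in route.get("plugins", [])}
        if names & set(auth_plugins):
            protected.update(route.get("paths", []))

    drift = []
    for path, operations in spec.get("paths", {}).items():
        for method, operation in operations.items():
            if not isinstance(operation, dict):
                continue  # skip non-operation entries such as "parameters"
            if operation.get("security") and path not in protected:
                drift.append((method.upper(), path))
    return drift


spec = {"paths": {"/users/{id}": {"get": {"security": [{"jwt": []}]}}}}
gateway = {"routes": [{"name": "users-get", "paths": ["/users/{id}"]}]}

print(unprotected_routes(spec, gateway))
```

A script like this only narrows the gap; it is one more artifact that has to be kept in sync with both files.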

Barbacane eliminates this entire class of failure by making drift architecturally impossible.

The Core Insight: Compile, Don't Configure

Barbacane's philosophy is radical in its simplicity:

Your spec is your gateway.

Instead of parsing specs at runtime or maintaining parallel configuration, Barbacane introduces a compilation step:

# Step 1: Write your spec (as usual)
openapi: 3.1.0
info:
  title: User API
  version: 1.0.0
x-barbacane-plugins:
  - name: oidc-auth
    config:
      issuer_url: "https://auth.example.com"
      audience: "my-api"

# Step 2: Compile it
barbacane compile --spec openapi.yaml --manifest barbacane.yaml --output api.bca

# Step 3: Run the gateway
barbacane serve --artifact api.bca

The .bca artifact is a self-contained binary bundle:

Pre-compiled routing trie (FlatBuffers, zero-copy deserialization)
JSON Schema validators for request/response validation
WASM plugins (including your auth middleware)
OPA policies for fine-grained authorization
Dispatcher configurations (HTTP upstreams, Lambda, Kafka)

Critically: no runtime spec parsing. The gateway starts in <100ms because everything is pre-optimized. What you compile is exactly what runs. No surprises.
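For readers unfamiliar with routing tries, here is a small Python sketch of the data structure. It shows only the lookup idea: path segments are walked one at a time, and a `{name}` segment matches any single value. It says nothing about Barbacane's actual FlatBuffers layout, and the `RouteTrie` class is hypothetical. Literal segments win over parameters, and there is no backtracking.

```python
class RouteTrie:
    """Path-segment trie; a "{name}" segment matches any single segment."""

    def __init__(self):
        self.root = self._new_node()

    @staticmethod
    def _new_node():
        return {"children": {}, "param": None, "handlers": {}}

    def insert(self, method, path, handler):
        node = self.root
        for segment in path.strip("/").split("/"):
            if segment.startswith("{") and segment.endswith("}"):
                if node["param"] is None:
                    node["param"] = (segment[1:-1], self._new_node())
                node = node["param"][1]
            else:
                node = node["children"].setdefault(segment, self._new_node())
        node["handlers"][method.upper()] = handler

    def lookup(self, method, path):
        node, params = self.root, {}
        for segment in path.strip("/").split("/"):
            if segment in node["children"]:
                node = node["children"][segment]  # literal match wins
            elif node["param"] is not None:
                name, child = node["param"]
                params[name] = segment
                node = child
            else:
                return None
        handler = node["handlers"].get(method.upper())
        return (handler, params) if handler is not None else None


trie = RouteTrie()
trie.insert("GET", "/users/{id}", "get_user")
trie.insert("GET", "/users/me", "get_current_user")

print(trie.lookup("GET", "/users/42"))
print(trie.lookup("GET", "/users/me"))
print(trie.lookup("DELETE", "/users/42"))
```

Building this structure once at compile time, instead of on every startup, is the part that a precompiled artifact saves.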

Architecture Deep Dive: Control Plane vs. Data Plane

Barbacane cleanly separates concerns:

The Control Plane (`barbacane-control`)

Stateful service (PostgreSQL-backed)
Handles spec ingestion, validation, and compilation
Serves artifacts to data planes
Provides UI for fleet visibility

The Data Plane (`barbacane`)

Completely stateless single binary
Loads .bca artifact at startup (memory-mapped via FlatBuffers)
Zero runtime dependencies
Optional WebSocket connection to control plane for health reporting

This separation enables true edge deployment: ship a 15MB static binary with your compiled artifact to a CDN POP, and it runs independently. No coordination required. Scale horizontally by launching more binaries. No consensus protocols. No distributed state.

WASM Plugins: Safety Without Sacrifice

Barbacane ships as a "bare binary" with zero bundled plugins. Every capability (JWT auth, rate limiting, CORS) is implemented as a WASM module explicitly declared in your spec:

x-barbacane-plugins:
  - name: rate-limit
    config:
      quota: 100
      window: 60
      partition_key: "header:x-api-key"

During compilation:

Plugin is fetched from registry (or local cache)
Validated against spec requirements
Bundled into the .bca artifact

At runtime:

Plugins execute in a wasmtime sandbox with strict resource limits
Memory isolation prevents plugin crashes from taking down the gateway
Host functions are capability-gated (e.g., vault access requires explicit grant)
Execution timeouts prevent CPU starvation

This model delivers what Lua plugins in Kong wish they had: true isolation without sacrificing performance. Benchmarks show 261 µs of overhead per WASM middleware invocation, including instantiation, on modern hardware.
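To give the rate-limit declaration earlier in this section some shape, here is a Python sketch of what `quota: 100` and `window: 60` could mean under a fixed-window algorithm, with the partition key standing in for the value of the x-api-key header. Which algorithm the actual plugin uses is not stated in this article, so fixed-window counting is an assumption, as is the `FixedWindowLimiter` class itself.

```python
import time


class FixedWindowLimiter:
    """Allow at most `quota` requests per `window` seconds for each partition key."""

    def __init__(self, quota, window, clock=time.monotonic):
        self.quota = quota
        self.window = window
        self.clock = clock
        self.counters = {}  # partition key -> (window index, request count)

    def allow(self, key):
        index = int(self.clock() // self.window)
        stored_index, count = self.counters.get(key, (index, 0))
        if stored_index != index:
            count = 0  # a new window has started, so the counter resets
        if count >= self.quota:
            return False
        self.counters[key] = (index, count + 1)
        return True


limiter = FixedWindowLimiter(quota=100, window=60)
results = [limiter.allow("api-key-123") for _ in range(101)]
print(results.count(True), results.count(False))
```

Each distinct key gets its own counter, which is the effect a per-header partition key is meant to have.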

Security by Construction

Barbacane's security model is defense-in-depth by design:

Layer	Mechanism	Why It Matters
Memory Safety	Rust + WASM sandbox	Eliminates entire classes of CVEs (buffer overflows, use-after-free)
Secrets Management	Vault fetch at startup only	No secrets in Git, specs, or artifacts. Only in runtime memory
AuthN/AuthZ	Plugin-based + OPA	No vendor lock-in; policies compiled to WASM for speed
Compilation	Fail-fast validation	Blocks dangerous configs early (e.g., `http://` backends in prod)
Transport	Rustls (no OpenSSL)	Memory-safe TLS with modern crypto defaults

For secrets, specs reference them by ID only:

x-barbacane-dispatcher:
  name: http-upstream
  config:
    url: "https://backend.example.com"
    headers:
      Authorization: "Bearer {{ vault://prod/api-gateway/backend-token }}"

At startup, the data plane fetches secrets from a secret manager, never storing them on disk. Rotate keys in your secret manager, and the gateway picks up new values on next restart (or via periodic refresh).
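The placeholder substitution can be sketched as a recursive walk over the configuration. In this Python illustration the secret manager is replaced by an in-memory dictionary, and the regular expression is inferred from the single `{{ vault://... }}` example above; both are assumptions rather than Barbacane's documented behavior.

```python
import re

# Matches placeholders of the form {{ vault://some/path }}.
SECRET_REF = re.compile(r"\{\{\s*vault://([^\s}]+)\s*\}\}")


def resolve_secrets(node, fetch):
    """Recursively replace {{ vault://path }} placeholders using fetch(path)."""
    if isinstance(node, dict):
        return {key: resolve_secrets(value, fetch) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_secrets(item, fetch) for item in node]
    if isinstance(node, str):
        return SECRET_REF.sub(lambda match: fetch(match.group(1)), node)
    return node


# An in-memory dict stands in for the secret manager in this sketch.
fake_vault = {"prod/api-gateway/backend-token": "s3cr3t"}

config = {
    "url": "https://backend.example.com",
    "headers": {
        "Authorization": "Bearer {{ vault://prod/api-gateway/backend-token }}"
    },
}

resolved = resolve_secrets(config, fake_vault.__getitem__)
print(resolved["headers"]["Authorization"])
```

The resolved values exist only in the returned structure, and a missing secret raises a `KeyError`, which in a real gateway would translate to refusing to start rather than running with an unresolved placeholder.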

Performance: Why FlatBuffers Matters

Most gateways deserialize JSON configs at startup. For small specs, this is fine. For large specs (500+ routes, complex schemas), it becomes a bottleneck.

Barbacane uses FlatBuffers for its artifact format, a choice that pays dividends:

Zero-copy deserialization: Memory-map the artifact and access data directly
Startup in <100ms: Even for 1,000-route specs
No GC pressure: Critical for latency-sensitive edge workloads
Schema evolution: Backward/forward compatibility built-in

Benchmarks show route lookup in 83 nanoseconds for 1,000 routes, faster than a single L3 cache miss. Full request validation (parameters + body schema) averages 1.2 microseconds. This isn't theoretical; it's the difference between viable and non-viable edge deployment.

Strengths and Tradeoffs

No tool is the right fit for every situation. Here's where Barbacane shines and what to keep in mind.

Strengths

Spec integrity: Drift is architecturally impossible
Security posture: Rust + WASM sandboxing beats Lua/JS runtimes
Edge readiness: Stateless, fast startup, minimal footprint
AsyncAPI support: Rare among gateways. Handles WebSockets/MQTT alongside HTTP
GitOps native: Specs in Git → CI validation → artifact deployment

Tradeoffs to consider

Young project: v0.1.x, actively developed with a growing community
Focused plugin set: ~17 official plugins covering core use cases, with more on the way
Compile-first workflow: Changes go through CI/CD rather than runtime hot-patching
Static backends: Service discovery requires a custom plugin or DNS-based resolution

Barbacane prioritizes configuration integrity and safety over plugin breadth and dynamic reconfiguration. If that tradeoff works for your team, it's worth evaluating.

Competitive Landscape

Gateway	Spec-Driven	Memory Safe	WASM Plugins	Edge-Ready	AsyncAPI
Barbacane	Native	Rust	First-class	Yes	Yes
Kong	Separate config	Lua/Nginx	Experimental	Heavy	No
Tyk	Separate config	Go (GC)	No	Heavy	No
AWS API Gateway	Import only	N/A	No	Managed	No
KrakenD	JSON config	Go (GC)	No	Yes	No

Barbacane targets a different design point than Kong or Tyk: configuration integrity and security over plugin ecosystem breadth.

Who Should Consider Barbacane Today?

Strong fits:

Greenfield APIs with OpenAPI-first development workflows
Edge deployments requiring sub-5ms latency overhead
Security-sensitive domains (fintech, healthcare, govtech)
Teams with mature GitOps/CI-CD practices
Organizations investing in Rust/WASM toolchains

Poor fits:

Legacy systems requiring dynamic runtime reconfiguration
Teams needing 50+ pre-built plugins immediately
Environments without DevOps automation for compilation
Brownfield migrations where spec completeness is low

The Bigger Picture: A Shift in Gateway Philosophy

Barbacane represents more than a new gateway. It's a philosophical shift:

Stop configuring your gateway to match your spec. Make your spec the configuration.

This aligns with broader industry movements:

Infrastructure as Code → Behavior as Specification
Runtime validation → Compile-time validation
Configuration drift → Configuration integrity

It's not the only path forward (declarative gateways like KrakenD point in a similar direction), but Barbacane's Rust/WASM/FlatBuffers stack delivers uniquely strong safety and performance guarantees.

Final Thoughts

Barbacane's spec-driven model addresses a real pain point for API teams: keeping specs and gateway behavior in sync. By compiling the spec into the runtime artifact, that problem goes away entirely. The Rust and WASM foundation delivers strong performance and safety guarantees on top.

The project is at v0.1.x, so it's best suited for new projects where you control the spec lifecycle. If your team already works OpenAPI-first with CI/CD automation, Barbacane fits naturally into that workflow.

The goal: your API contract is your production configuration. Security policies validated before deployment. Edge gateways starting in milliseconds with zero configuration drift. That's the direction we're heading.

Barbacane is open source and available at github.com/barbacane-dev/barbacane. As of February 2026, it remains an early-stage project, so evaluate it thoroughly before production use.
