<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sahajmeet Kaur</title>
    <description>The latest articles on DEV Community by Sahajmeet Kaur (@sahajmeet_kaur_).</description>
    <link>https://dev.to/sahajmeet_kaur_</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3978504%2F0ef5b27d-0f02-4f25-ab3a-6e9534bbf6e9.png</url>
      <title>DEV Community: Sahajmeet Kaur</title>
      <link>https://dev.to/sahajmeet_kaur_</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sahajmeet_kaur_"/>
    <language>en</language>
    <item>
      <title>Bifrost vs TrueFoundry: What changes when you go from OSS gateway to enterprise platform</title>
      <dc:creator>Sahajmeet Kaur</dc:creator>
      <pubDate>Thu, 11 Jun 2026 13:23:42 +0000</pubDate>
      <link>https://dev.to/sahajmeet_kaur_/what-i-learned-evaluating-five-mcp-gateways-for-production-3la8</link>
      <guid>https://dev.to/sahajmeet_kaur_/what-i-learned-evaluating-five-mcp-gateways-for-production-3la8</guid>
      <description>&lt;p&gt;Bifrost and TrueFoundry both market themselves as AI gateways in 2026, and if you look at a feature grid, they cover a lot of the same ground: LLM routing, MCP support, guardrails, observability, rate limiting, cost attribution, agent execution. The overlap is real.&lt;/p&gt;

&lt;p&gt;But the overlap obscures the actual decision. Bifrost is a single Go binary you run. TrueFoundry is a Kubernetes-native control plane whose gateway is one layer of a larger platform. These are different architectural bets, and the right choice depends almost entirely on where your team is starting from and how far you expect to grow.&lt;/p&gt;

&lt;p&gt;Here's what I found running both.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you're actually deploying
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Bifrost&lt;/strong&gt; starts in seconds. One binary, no external dependencies: it initializes a local SQLite store and stands up on first run. The startup log from a v1.5.7 instance shows exactly what it spins up: config, logs, and governance stores; a per-user OAuth sweep worker; a pricing sync worker; and its model catalog. Everything in one process. Apache 2.0 licensed, self-hosted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.truefoundry.com/" rel="noopener noreferrer"&gt;TrueFoundry&lt;/a&gt;&lt;/strong&gt; inverts this entirely. There's no binary to download — the gateway installs into Kubernetes as part of a control plane, configured via YAML through the TrueFoundry CLI. Available as managed SaaS, VPC deployment, on-prem, or air-gapped. That's more operational surface area at the start. In return, you get deployment options with documented compliance posture, and a control plane that handles things no single binary can: multi-team RBAC tied to your identity provider, SCIM-driven provisioning, and managed model hosting alongside the gateway.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://www.truefoundry.com/" rel="noopener noreferrer"&gt;TrueFoundry gateway&lt;/a&gt; itself is stateless: it is built on the Hono framework and synced from the control plane over a NATS queue. Auth, RBAC, and rate limiting run in memory; logs write asynchronously to ClickHouse. Their published benchmarks: &lt;strong&gt;~250 RPS on 1 vCPU / 1 GB pod&lt;/strong&gt;, reaching &lt;strong&gt;~350 RPS&lt;/strong&gt; before saturation, adding roughly +7 ms overhead (closer to +12 ms with full tracing enabled). These are vendor-stated figures measured under vendor-specified conditions.&lt;/p&gt;
&lt;h2&gt;
  
  
  Model access: both are OpenAI-compatible drop-ins
&lt;/h2&gt;

&lt;p&gt;The code change to adopt either is minimal — a base_url swap:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bifrost (from a running v1.5.7 instance):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:8080/openai&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dummy-api-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;   &lt;span class="c1"&gt;# handled by Bifrost
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;List files in current directory&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;TrueFoundry (same SDK, gateway endpoint):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://&amp;lt;org&amp;gt;.truefoundry.com/api/llm&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tfy-...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai-main/gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;# provider/model set in GitOps YAML
&lt;/span&gt;    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;List files in current directory&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the v1.5.7 instance I ran, Bifrost's catalog showed 3,020 models across 89 providers. Their current public docs say "1000+ models" and "23+ providers". Those numbers don't match what I observed in the instance, and I can't reconcile them. Regardless of the exact count, the catalog is large in both cases. TrueFoundry documents 1,600+ managed models plus self-hosted options. For most teams the model coverage difference won't be a deciding factor.&lt;/p&gt;

&lt;h2&gt;
  
  
  MCP and agent execution: closer than expected
&lt;/h2&gt;

&lt;p&gt;Both gateways have invested meaningfully in MCP, and the design pattern is similar.&lt;/p&gt;

&lt;p&gt;Bifrost offers two modes: Manual Tool Execution, where the client calls and approves each tool, and Agent Mode, where the gateway auto-executes whitelisted tools. You configure this with tools_to_execute (what can be called) and tools_to_auto_execute (what runs without approval):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bifrost agent-mode pattern (from docs)
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[...],&lt;/span&gt;
    &lt;span class="n"&gt;extra_body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools_to_execute&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;list_files&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;read_file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools_to_auto_execute&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;list_files&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# runs without approval
&lt;/span&gt;    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
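
&lt;p&gt;The relationship between the two lists can be sketched as a small decision function. This is an illustration of the semantics described above, not Bifrost's code: the function name &lt;code&gt;classify_tool_call&lt;/code&gt;, the three labels, and the &lt;code&gt;delete_file&lt;/code&gt; tool are hypothetical, and it assumes a tool must be on the allow-list before auto-execution is considered.&lt;/p&gt;

```python
def classify_tool_call(tool_name, tools_to_execute, tools_to_auto_execute):
    """Return "auto", "needs_approval", or "blocked" for a requested tool."""
    allowed = set(tools_to_execute)
    auto = set(tools_to_auto_execute)
    if tool_name not in allowed:
        # Assumption: a tool missing from the allow-list is never run,
        # even if it also appears in the auto-execute list.
        return "blocked"
    if tool_name in auto:
        # Allowed and whitelisted for auto-execution: no approval step.
        return "auto"
    # Allowed, but the client still has to approve each call.
    return "needs_approval"


# "delete_file" is a made-up tool name used to show the blocked case.
decisions = {
    name: classify_tool_call(name, ["list_files", "read_file"], ["list_files"])
    for name in ["list_files", "read_file", "delete_file"]
}
print(decisions)
```

&lt;p&gt;Checking the allow-list first keeps the auto-execute list from widening access by accident. Whether Bifrost resolves that edge case the same way is worth confirming in its docs.&lt;/p&gt;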



&lt;p&gt;TrueFoundry frames the same control differently: &lt;a href="https://www.truefoundry.com/docs/ai-gateway/mcp/virtual-mcp-server#virtual-mcp-server" rel="noopener noreferrer"&gt;Virtual MCP Servers &lt;/a&gt;(curated subsets of tools exposed to specific teams), per-team RBAC on tool access, and pre/post-call MCP guardrails. It also ships prebuilt connectors for Slack, Confluence, Sentry, and Datadog, and can wrap any REST/OpenAPI service as an MCP server.&lt;/p&gt;

&lt;p&gt;Where Bifrost has a clear advantage: if you want lean, self-hosted MCP execution without standing up a full platform, Bifrost's implementation is immediately runnable. Where TrueFoundry adds value: when you need org-level identity on every tool call — one auto-refreshed OAuth token per user across all MCP servers — or multi-agent, session-aware workflows through their Agent Gateway.&lt;/p&gt;

&lt;p&gt;For teams that just need reliable tool calling without enterprise identity overhead, Bifrost's approach is genuinely simpler.&lt;/p&gt;

&lt;h2&gt;
  
  
  Identity, compliance, and deployment options
&lt;/h2&gt;

&lt;p&gt;This is where the two products diverge most clearly.&lt;/p&gt;

&lt;p&gt;Bifrost's auth model is built around per-user OAuth with automatic token refresh — visible as a running worker in the boot log. Its Enterprise tier adds SAML-based SSO, RBAC, and OIDC directory sync. Compliance claims (SOC 2 Type II, HIPAA, ISO 27001, GDPR) are marketed as part of the Enterprise Governance module with immutable audit logs. Pricing for the Enterprise tier isn't publicly documented.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TrueFoundry's&lt;/strong&gt; identity model operates primarily at the org level: SSO via OIDC or SAML 2.0 through any major IdP, optional SCIM provisioning for automated user/group sync, and RBAC. On its higher-tier on-prem Enterprise plan, auth traffic can be routed directly to your IdP without touching TrueFoundry's servers. &lt;br&gt;
&lt;strong&gt;Compliance claims&lt;/strong&gt;: &lt;strong&gt;SOC 2 Type II, HIPAA, GDPR, and ITAR&lt;/strong&gt; for export-controlled defense/aerospace workloads.&lt;/p&gt;

&lt;p&gt;The honest take on the compliance parity: both vendors market SOC 2 and HIPAA, both offer VPC/on-prem/air-gapped deployment. As is standard, certifications attach to the managed/audited environment — for self-hosted deployments, compliance also depends on your own controls.&lt;/p&gt;

&lt;p&gt;The genuine differentiators:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ITAR&lt;/strong&gt;: TrueFoundry claims ITAR-compliant deployments; Bifrost does not advertise this. If this matters for your work, you'll need to verify TrueFoundry's current ITAR posture directly — I haven't independently confirmed the scope.&lt;br&gt;
&lt;strong&gt;SCIM&lt;/strong&gt;: TrueFoundry offers SCIM-driven provisioning for automated team/user management. Bifrost's Enterprise tier has OIDC directory sync, which is directionally similar, but SCIM is the more standardized enterprise provisioning protocol.&lt;br&gt;
&lt;strong&gt;Per-user OAuth at the tool level&lt;/strong&gt;: Bifrost's built-in per-user OAuth is cleaner for MCP tool authentication where individual user credentials need to flow through to downstream services. TrueFoundry handles this differently — worth evaluating which model fits your auth architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  Observability, cost, and prompts
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Observability:&lt;/strong&gt; Bifrost ships with a dashboard, LLM logs, MCP logs, and 365-day log retention. It integrates with Maxim's evaluation platform for evals. TrueFoundry is fully OpenTelemetry-compliant with metadata tagging and a dedicated tracing product. Both are solid; TrueFoundry's OTel compliance means easier integration with existing monitoring infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost attribution:&lt;/strong&gt; Bifrost initializes a governance store at boot and applies budgets and rate limits per user/team. TrueFoundry enforces budgets at user/team/model level with chargeback. Comparable in intent; TrueFoundry goes deeper on multi-level attribution and consolidated reporting across the broader platform.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt management&lt;/strong&gt;: Both treat prompts as managed artifacts — Bifrost has a Prompt Repository; TrueFoundry offers prompt lifecycle management with versioning, rollback, and publishing. This is closer to parity than either vendor's marketing implies.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where each wins
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Bifrost is the better choice when:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You want open-source code you can read, audit, and fully own&lt;br&gt;
A zero-dependency, single-binary setup matters — operationally or philosophically&lt;br&gt;
You need MCP + agent-mode auto-execution without adopting a broader platform&lt;br&gt;
Your infra doesn't run Kubernetes&lt;br&gt;
Per-user OAuth flowing through to downstream MCP servers is the auth model you need&lt;br&gt;
You don't need vendor-documented ITAR compliance&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TrueFoundry is the better choice when:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You need ITAR/export-controlled deployment posture (if confirmed for your use case)&lt;br&gt;
You want SCIM-driven org provisioning and centralized identity management&lt;br&gt;
You need to consolidate gateway + model deployment/training + MCP hosting + multi-agent workflows&lt;br&gt;
You're governing AI tool access across many teams from a single control plane&lt;br&gt;
Your org is already Kubernetes-native and wants a managed SaaS or VPC option with vendor support&lt;/p&gt;

&lt;h2&gt;
  
  
  What this comparison doesn't settle
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Version gap:&lt;/strong&gt; I evaluated Bifrost v1.5.7. The Helm chart is at v2.1.22 and the project has 86 contributors and thousands of commits. Several capabilities described above may have changed. Check the Bifrost GitHub for current state before drawing conclusions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security research controversy:&lt;/strong&gt; A dev.to post raised concerns about Bifrost and Maxim AI (H3 Labs) potentially fitting patterns of API key harvesting services. I have not verified these claims, but anyone evaluating Bifrost for production workloads involving sensitive API keys should read &lt;a href="https://dev.to/bradleymatera/research-why-bifrost-maxim-ai-h3-labs-inc-fits-the-exact-pattern-of-api-key-harvesting-2844"&gt;that post&lt;/a&gt; and form their own judgment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt; Neither vendor publishes clear pricing for enterprise tiers. This matters if you're comparing total cost. Bifrost's core is open source with no licensing cost; TrueFoundry is a commercial product. Get quotes before assuming either fits your budget.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Benchmark comparison:&lt;/strong&gt; This comparison doesn't include a head-to-head latency test of the two gateways. Bifrost's own marketing claims "&amp;lt;100 µs overhead at 5k RPS." TrueFoundry publishes "+7 ms at 250 RPS." These figures were measured under different conditions and aren't comparable without a controlled test under identical load and configuration.&lt;/p&gt;
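
&lt;p&gt;One way to put overhead figures on the same footing is to measure each gateway against the same direct-to-provider baseline and compare the differences. The sketch below is a minimal illustration, assuming you have already collected latency samples in milliseconds. The function name &lt;code&gt;added_overhead_ms&lt;/code&gt; and every sample value are made up for the example; none of them are measurements of either product.&lt;/p&gt;

```python
import statistics


def added_overhead_ms(direct_ms, via_gateway_ms):
    """Median latency through the gateway minus median latency direct."""
    if not direct_ms or not via_gateway_ms:
        raise ValueError("both sample lists must be non-empty")
    return statistics.median(via_gateway_ms) - statistics.median(direct_ms)


# Made-up latencies in milliseconds, for illustration only.
direct = [200.0, 210.0, 205.0, 198.0, 202.0]
gateway_a = [207.0, 217.0, 212.0, 205.0, 209.0]
gateway_b = [200.5, 210.5, 205.5, 198.5, 202.5]

overhead_a = added_overhead_ms(direct, gateway_a)
overhead_b = added_overhead_ms(direct, gateway_b)
print(f"gateway A adds {overhead_a} ms, gateway B adds {overhead_b} ms")
```

&lt;p&gt;Medians are used here because a few slow provider responses would otherwise dominate a mean. The comparison only holds if both runs use the same model, payloads, concurrency, and hardware.&lt;/p&gt;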

&lt;p&gt;If you've run either in production at scale and have data on latency, memory usage, or operational overhead that differs from what's documented here, I'd be interested in hearing about it in the comments.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>apigateway</category>
      <category>mcp</category>
      <category>llm</category>
    </item>
    <item>
      <title>What I learned evaluating five MCP gateways for production</title>
      <dc:creator>Sahajmeet Kaur</dc:creator>
      <pubDate>Thu, 11 Jun 2026 13:05:45 +0000</pubDate>
      <link>https://dev.to/sahajmeet_kaur_/what-i-learned-evaluating-five-mcp-gateways-for-production-2clg</link>
      <guid>https://dev.to/sahajmeet_kaur_/what-i-learned-evaluating-five-mcp-gateways-for-production-2clg</guid>
      <description>&lt;p&gt;When Anthropic released the Model Context Protocol in November 2024, the initial conversation was mostly about the protocol itself: a standard way for AI agents to discover and call tools without building custom integrations for every API. That problem was real and the protocol mostly solved it.&lt;/p&gt;

&lt;p&gt;But MCP adoption created a second problem that teams started hitting around mid-2025: how do you manage dozens of MCP server connections at scale, control what agents can access, see what they're actually doing, and handle credential rotation without your security team losing sleep? The base protocol doesn't address any of this.&lt;/p&gt;

&lt;p&gt;That's the gap MCP gateways fill. I spent several weeks evaluating the main options. This covers what I found, including where each tool has a genuine edge and where it falls short.&lt;/p&gt;

&lt;h2&gt;
  
  
  What actually matters when evaluating an MCP gateway
&lt;/h2&gt;

&lt;p&gt;Most comparison posts lead with latency benchmarks or feature checkboxes. Those matter, but three questions did more to differentiate the tools in practice:&lt;/p&gt;

&lt;p&gt;Where does it fit in your existing stack? Some gateways are standalone infrastructure; others integrate tightly with a specific cloud provider or container runtime. The right choice depends heavily on what you're already running, because a poor architectural fit creates more work than it saves.&lt;/p&gt;

&lt;p&gt;What security model does it enforce? Tool poisoning, credential exposure, and unauthorized agent access are production concerns, not theoretical ones. Gateways take meaningfully different approaches, and the differences aren't cosmetic.&lt;/p&gt;

&lt;p&gt;What's the operational overhead at scale? Managing 5 MCP connections with no central observability is fine. Managing 50 without it isn't. Solutions that are easy to set up often become painful to operate as workloads grow.&lt;/p&gt;

&lt;h2&gt;
  
  
  The five tools I evaluated
&lt;/h2&gt;

&lt;h2&gt;
  
  
  1. TrueFoundry
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmjt7x0qs4a4f748mbj8r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmjt7x0qs4a4f748mbj8r.png" alt="Truefoundry's unified interface for accessing LLMs" width="799" height="453"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;TrueFoundry's&lt;/strong&gt; &lt;a href="https://www.truefoundry.com/mcp-gateway" rel="noopener noreferrer"&gt;&lt;strong&gt;MCP gateway&lt;/strong&gt;&lt;/a&gt; is built around a specific architectural bet: teams managing LLM workloads shouldn't have to run separate infrastructure for &lt;a href="https://www.truefoundry.com/docs/ai-gateway/mcp/mcp-overview" rel="noopener noreferrer"&gt;MCP&lt;/a&gt; orchestration. The unified platform handles both, with the same &lt;strong&gt;security, observability,&lt;/strong&gt; and &lt;strong&gt;rate-limiting&lt;/strong&gt; mechanisms applying to LLM calls and tool calls alike.&lt;/p&gt;

&lt;p&gt;The performance claims in their docs are specific: &lt;strong&gt;sub-3ms latency, 350+ RPS&lt;/strong&gt; on a single vCPU, attributed to &lt;strong&gt;in-memory auth&lt;/strong&gt; and rate limiting rather than database lookups. The architecture makes that plausible, but the numbers come with no published benchmark methodology or test configuration. Without the payload size, model size, infra spec, and test harness, readers can't reproduce the result or compare it against their own workload. If latency is a hard requirement, run your own test before planning capacity.&lt;/p&gt;

&lt;p&gt;The genuinely strong parts: unified billing and observability across LLM and tool usage, MCP Server Groups for per-team isolation without separate gateway deployments, and an interactive playground that generates production-ready client code across multiple languages. If you're already tracking LLM costs through &lt;a href="https://www.truefoundry.com/docs/ai-gateway/intro-to-llm-gateway" rel="noopener noreferrer"&gt;TrueFoundry's AI gateway&lt;/a&gt;, getting consolidated tool-call data in the same dashboard is a real operational win rather than a feature checkbox.&lt;/p&gt;

&lt;p&gt;The weaknesses: this is a full-platform product, which means you're adopting a broader dependency. If you want a thin, standalone MCP proxy, TrueFoundry is more than you need. It's also a commercial product, and pricing isn't published in a way that makes quick evaluation easy for smaller teams.&lt;/p&gt;

&lt;p&gt;Best fit: Teams already running significant AI workloads on TrueFoundry, or those who want a single vendor managing both LLM routing and MCP orchestration with unified cost visibility.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Docker MCP Gateway
&lt;/h2&gt;

&lt;p&gt;Docker applied its core capability — container isolation — to the MCP problem, and the result is coherent. Each MCP server runs in its own container with CPU capped at 1 core, memory at 2GB, and no host filesystem access by default. Cryptographically signed container images address supply chain security in a way that none of the other tools in this list have an equivalent for.&lt;/p&gt;

&lt;p&gt;The Docker MCP Catalog now ships with 300+ verified, pre-packaged server images. That's the largest pre-built library of any option here, and it lowers the barrier to trying new tools significantly — the difference between "pull this image" and "read the setup docs and figure out how to auth it" is non-trivial when you're evaluating a dozen servers at once.&lt;/p&gt;

&lt;p&gt;The Docker Desktop integration is a genuine differentiator for local development. Developers get safe, isolated MCP experimentation without complex setup, and the same container model carries forward to production environments.&lt;/p&gt;

&lt;p&gt;Where Docker is weaker: the latency profile (50–200ms in cited benchmarks, none with a published methodology) reflects container overhead, which compounds for agents making many short sequential tool calls. There's also limited built-in observability beyond logging and call tracing; you need to bring your own monitoring stack for meaningful analytics.&lt;/p&gt;

&lt;p&gt;Best fit: Container-first infrastructure teams, use cases involving code execution or high-isolation requirements (the resource caps matter a lot here), and teams that want standardized packaging across many MCP servers without managing custom deployment scripts.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. IBM ContextForge
&lt;/h2&gt;

&lt;p&gt;One important correction to the framing you'll find in some other writeups: IBM ContextForge is no longer in alpha/beta, and "no commercial support" is no longer accurate. IBM released v1.0.0-GA and offers IBM Elite Support for commercial deployments. The project now has 100+ open source contributors and powers IBM Consulting Advantage, which serves 160,000+ users. It's not an experimental side project — that framing is outdated.&lt;/p&gt;

&lt;p&gt;With that corrected: ContextForge's federation capabilities are the most architecturally sophisticated of any option here. Auto-discovery via mDNS, health monitoring across gateway instances, and capability merging that lets multiple gateways present as a unified endpoint — these are features that matter in genuinely complex deployments where multiple teams or regions each manage their own MCP infrastructure. Virtual server composition lets you combine multiple MCP backends into a single logical endpoint, simplifying agent interactions without restructuring your backend.&lt;/p&gt;

&lt;p&gt;Authentication flexibility is also notable: JWT Bearer, Basic Auth, custom header schemes, AES encryption for tool credentials, and multi-database backend support (PostgreSQL, MySQL, SQLite). If you're integrating with existing enterprise identity systems, this matters.&lt;/p&gt;

&lt;p&gt;Where ContextForge is weaker: the learning curve is steeper than with Docker or TrueFoundry. Configuration requires more infrastructure expertise, and the latency overhead (100–300ms in cited benchmarks, none with a published methodology) is meaningfully higher than lighter options. Also worth evaluating: how tightly it integrates with IBM's own cloud services versus cloud-agnostic deployments.&lt;/p&gt;

&lt;p&gt;Best fit: Large enterprises anticipating multiple gateway deployments across environments or regions — this is the only tool here that was designed for federation from the ground up. Requires a team comfortable with infrastructure complexity.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Microsoft MCP Gateway (Azure)
&lt;/h2&gt;

&lt;p&gt;Microsoft's approach isn't a single product — it's a set of integration points across Azure services that together handle gateway responsibilities. Azure API Management handles policy enforcement and OAuth flows; Azure App Service and Functions handle server hosting; Microsoft Entra ID handles authentication and RBAC. The recent Build 2026 updates added built-in MCP for Azure App Service and improved Functions MCP extensions with native Entra auth.&lt;/p&gt;

&lt;p&gt;For Azure-native organizations, this native integration is a real advantage. OAuth flows work without additional configuration. Entra ID policies apply directly. The Azure Resource Manager MCP Server gives agents first-class access to infrastructure operations — querying, deploying, and managing Azure resources through ARM — in ways that would require significant custom integration with other gateways.&lt;/p&gt;

&lt;p&gt;Where Microsoft wins clearly: if your security and compliance posture is built around Entra ID, and your agents primarily interact with Azure-hosted services, the native identity flow is substantive, not cosmetic. No other tool here matches it in the Azure-native scenario.&lt;/p&gt;

&lt;p&gt;Where it's weaker: anything outside the Azure ecosystem. Multi-cloud or hybrid deployments require significant custom work. The operational surface area is large — you're managing multiple Azure services rather than a single gateway product — and getting comprehensive monitoring requires stitching together Azure Monitor, Application Insights, and service-specific logging.&lt;/p&gt;

&lt;p&gt;Best fit: Azure-committed organizations where Entra ID investment can be leveraged directly. Not a good fit for multi-cloud architectures or teams that need quick setup and centralized observability in a single console.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Lasso Security
&lt;/h2&gt;

&lt;p&gt;Lasso (a 2024 Gartner Cool Vendor for AI Security) is built around a problem the other tools treat as secondary: when AI agents interact with tools, most gateway infrastructure gives you almost no visibility into whether those interactions were legitimate or malicious.&lt;/p&gt;

&lt;p&gt;Their specific capabilities: real-time prompt injection detection that blocks malicious inputs before they reach MCP tools; MCP server reputation scoring based on behavior patterns, code analysis, and community data (automatic blocking of flagged servers addresses supply chain attacks from compromised tool packages); token masking to prevent credential exposure in tool call logs. The plugin-based architecture lets organizations add security controls incrementally rather than adopting all-or-nothing.&lt;/p&gt;

&lt;p&gt;This is the only tool here that was built with agent security as the primary design axis. That shows both in the depth of the security features and in what it doesn't do well as a general gateway — it's more security overlay than full orchestration infrastructure.&lt;/p&gt;

&lt;p&gt;Where Lasso wins clearly: regulated industries and environments where agent tool access is a high-consequence security surface. Healthcare, financial services, and legal sectors handling sensitive data get threat detection specifically designed for AI agent behavior patterns, which general-purpose security tools don't provide.&lt;/p&gt;

&lt;p&gt;Where it's weaker: the security scanning adds latency overhead (100–250ms in cited benchmarks, none with a published methodology), which compounds for high-frequency tool calls. It's also not a complete gateway replacement; most teams would run it alongside Docker or TrueFoundry for the routing and management layer.&lt;/p&gt;

&lt;p&gt;Best fit: Regulated industries, any team where a security incident involving agent tool access would have severe downstream consequences. Likely used in conjunction with another gateway rather than as a standalone solution.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to choose
&lt;/h2&gt;

&lt;p&gt;Rather than a ranked list, here's how I'd map situations to tools:&lt;/p&gt;

&lt;p&gt;Already running TrueFoundry for LLM routing, want unified cost and observability → TrueFoundry&lt;br&gt;
Container-first infra, code execution isolation needed, want a large pre-built server catalog → Docker&lt;br&gt;
Multiple gateway deployments across environments or regions, federation is a real requirement → IBM ContextForge&lt;br&gt;
Azure-native, Entra ID is your identity backbone, agents interact primarily with Azure services → Microsoft&lt;br&gt;
Regulated industry, agent security threat detection is a non-negotiable → Lasso Security (likely layered with one of the above)&lt;/p&gt;

&lt;p&gt;One honest caveat on the performance comparisons: the latency numbers cited above — and in most comparison posts, including earlier versions of this one — come without published benchmark methodology, test configuration, or hardware specs. They're directionally useful but not reliable for capacity planning. If latency matters for your use case, run your own test on representative payloads and tool call patterns before committing.&lt;/p&gt;
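
&lt;p&gt;A starting point for such a test can be very small. The sketch below times a callable and reports percentiles using only the Python standard library. The names &lt;code&gt;measure_latency_ms&lt;/code&gt; and &lt;code&gt;fake_tool_call&lt;/code&gt; are hypothetical, and the stand-in call just sleeps, so swap in a real request through your gateway with representative payloads before reading anything into the numbers.&lt;/p&gt;

```python
import statistics
import time


def measure_latency_ms(call, runs=50, warmup=5):
    """Time `call()` repeatedly and return p50, p95, and max in milliseconds."""
    for _ in range(warmup):
        call()  # warm caches and connections before timing
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=20) returns 19 cut points; index 18 is the 95th percentile.
    cuts = statistics.quantiles(samples, n=20)
    return {
        "p50": statistics.median(samples),
        "p95": cuts[18],
        "max": max(samples),
        "runs": len(samples),
    }


def fake_tool_call():
    # Stand-in for a real tool call through the gateway (hypothetical).
    time.sleep(0.002)


report = measure_latency_ms(fake_tool_call, runs=20, warmup=2)
print(report)
```

&lt;p&gt;This runs calls sequentially, so it measures per-call latency rather than throughput. A capacity test would also need concurrent load, which this sketch deliberately leaves out.&lt;/p&gt;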

&lt;h2&gt;
  
  
  Things this post didn't settle
&lt;/h2&gt;

&lt;p&gt;A few open questions I'm still thinking about after this evaluation:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Observability standards:&lt;/strong&gt; Each tool exports metrics and logs in different schemas. There's no common format yet, which means your monitoring stack needs custom adapters regardless of which gateway you choose. This is underdiscussed in most gateway comparisons.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost modeling at scale:&lt;/strong&gt; The operational cost picture — caching overhead, retry rates, security scanning compute — is hard to predict without production data. Most teams I've talked to have been surprised by total tool-call costs at scale.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-agent coordination&lt;/strong&gt;: None of these tools natively handle agent-to-agent tool routing in a multi-agent architecture. If your setup involves several agents sharing a gateway, you'll hit undocumented edge cases.&lt;/p&gt;

&lt;p&gt;If you've deployed any of these in production and have data that contradicts what I've written here — especially on latency or operational overhead at scale — I'd genuinely like to hear about it in the comments.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>apigateway</category>
      <category>mcp</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
