<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Signadot</title>
    <description>The latest articles on DEV Community by Signadot (@signadot).</description>
    <link>https://dev.to/signadot</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F467077%2Fb24943f8-8c14-4c75-841c-7017b22507ee.png</url>
      <title>DEV Community: Signadot</title>
      <link>https://dev.to/signadot</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/signadot"/>
    <language>en</language>
    <item>
      <title>Why the MCP Server Is Now a Critical Microservice</title>
      <dc:creator>Signadot</dc:creator>
      <pubDate>Thu, 19 Feb 2026 15:55:36 +0000</pubDate>
      <link>https://dev.to/signadot/why-the-mcp-server-is-now-a-critical-microservice-5cla</link>
      <guid>https://dev.to/signadot/why-the-mcp-server-is-now-a-critical-microservice-5cla</guid>
      <description>&lt;p&gt;&lt;em&gt;Read this article on &lt;a href="https://www.signadot.com/blog/why-the-mcp-server-is-now-a-critical-microservice?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=Why+the+MCP+Server+Is+Now+a+Critical+Microservice" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In my previous article on &lt;a href="https://www.signadot.com/blog/your-ci-cd-pipeline-is-not-ready-to-ship-ai-agents" rel="noopener noreferrer"&gt;preparing CI/CD pipelines to ship production-ready agents&lt;/a&gt;, I argued that we cannot ship agents to production that are driven primarily by non-deterministic models. Instead, they must be built as robust workflows where the large language model (LLM) is introduced strategically at specific steps within a deterministic control flow.&lt;/p&gt;

&lt;p&gt;Now we must examine the most critical node in that framework.&lt;/p&gt;

&lt;p&gt;The Model Context Protocol (MCP) server facilitates interactions between the probabilistic LLM node and the deterministic microservices workflow. It acts as the translation layer connecting the reasoning engine to external data and tools.&lt;/p&gt;

&lt;p&gt;The model is one half of the agent architecture. The MCP server is the other half. While model evaluations validate the reasoning engine, they cannot verify the system as a whole. Validation strategies relying on &lt;a href="https://www.signadot.com/blog/why-mocks-fail-real-environment-testing-for-microservices" rel="noopener noreferrer"&gt;mocks fail to test&lt;/a&gt; the agent as a workflow.&lt;/p&gt;

&lt;p&gt;Reliability of the end-to-end workflow is paramount when shipping agents to production. The MCP server is the critical node in this topology, acting as both sensory organ and effector arm. If it transmits ambiguous signals, the agent acts erratically. It hallucinates. It degrades user trust. It causes critical business errors.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Architectural Shift From Contracts to Semantics
&lt;/h3&gt;

&lt;p&gt;To understand the failure risks, we must examine how the MCP server alters service contracts.&lt;/p&gt;

&lt;p&gt;Service-to-service communication is deterministic in standard microservices environments. Service A calls Service B using a strict REST or gRPC contract. The interaction is rigid. It is predictable. It is easily validated.&lt;/p&gt;

&lt;p&gt;An agentic workflow inverts this.&lt;/p&gt;

&lt;p&gt;The agent is a nondeterministic actor operating on probabilistic logic. It decides when to call a tool based on semantic context provided by the MCP server. The server exposes a world model rather than just an API endpoint.&lt;/p&gt;

&lt;p&gt;This makes the MCP server a distinct type of microservice. It is a translation layer converting probabilistic intent into deterministic action. This responsibility manifests in three operations requiring rigorous engineering.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F49hvyp0pw9j5esb0xxw9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F49hvyp0pw9j5esb0xxw9.png" alt=" " width="800" height="378"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  1. Defining Capability Boundaries
&lt;/h4&gt;

&lt;p&gt;The MCP server defines agent capabilities through JSON-RPC tool definitions.&lt;/p&gt;

&lt;p&gt;If the server exposes a schema with vague descriptions, the agent cannot formulate a valid execution plan. A human developer might read documentation to clarify an API field, but the agent relies solely on the metadata the server returns from its tool listing (the tools/list request in MCP, exposed as list_tools in some SDKs).&lt;/p&gt;

&lt;p&gt;Consider a payment operations agent handling refunds. A fragile MCP implementation might expose a tool named refund_user to process a refund.&lt;/p&gt;

&lt;p&gt;This lacks semantic density. The model does not know whether this applies to a full or partial refund or if it handles tax calculation. It is a black box.&lt;/p&gt;

&lt;p&gt;A robust implementation defines the boundary with precision. It exposes process_prorated_subscription_refund. The description explicitly states that it calculates the remaining balance for the current billing cycle and issues a credit.&lt;/p&gt;

&lt;p&gt;The reasoning chain breaks without this specificity.&lt;/p&gt;
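
&lt;p&gt;The contrast between the two refund tools can be sketched as data. This is a minimal illustration, assuming tool definitions shaped like MCP tool entries (name, description, inputSchema); the parameter names, the description_gaps helper and its eight-word threshold are hypothetical and not part of any SDK.&lt;/p&gt;

```python
# Hypothetical tool definitions shaped like entries in an MCP tool listing.
VAGUE_TOOL = {
    "name": "refund_user",
    "description": "Process a refund.",
    "inputSchema": {
        "type": "object",
        "properties": {"user_id": {"type": "string"}},
    },
}

PRECISE_TOOL = {
    "name": "process_prorated_subscription_refund",
    "description": (
        "Calculates the remaining balance for the current billing cycle "
        "and issues a credit."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "subscription_id": {
                "type": "string",
                "description": "Identifier of the subscription to refund.",
            },
        },
        "required": ["subscription_id"],
    },
}


def description_gaps(tool, min_words=8):
    """Return human-readable problems with a tool's metadata (illustrative lint)."""
    gaps = []
    words = tool.get("description", "").split()
    if min_words > len(words):
        gaps.append("tool description is too short to convey intent")
    properties = tool.get("inputSchema", {}).get("properties", {})
    for name, schema in properties.items():
        if not schema.get("description"):
            gaps.append("parameter '" + name + "' has no description")
    return gaps


vague_gaps = description_gaps(VAGUE_TOOL)
precise_gaps = description_gaps(PRECISE_TOOL)
```

&lt;p&gt;A check like this could run in CI as a cheap first gate, though it only measures whether metadata is present, not whether an agent actually understands it.&lt;/p&gt;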

&lt;h4&gt;
  
  
  2. Governing the Context Economy
&lt;/h4&gt;

&lt;p&gt;The MCP server governs the context window. It must retrieve backend data and format it for LLM consumption.&lt;/p&gt;

&lt;p&gt;This data engineering challenge requires differentiating between signal and noise.&lt;/p&gt;

&lt;p&gt;Providing a raw 5 MB JSON dump dilutes agent attention. It wastes tokens and increases latency. Conversely, providing too little data causes the agent to hallucinate missing details.&lt;/p&gt;

&lt;p&gt;The server must act as a transformation layer that optimizes raw data into context-ready snippets.&lt;/p&gt;
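
&lt;p&gt;One way to picture that transformation layer is a projection step that keeps only the fields the agent needs and caps long lists. The sketch below is an assumption-laden example: the record shape, the field names and the to_context_snippet helper are invented for illustration.&lt;/p&gt;

```python
import json


def to_context_snippet(record, keep_fields, max_items=5):
    """Project a raw backend record onto the fields the agent needs."""
    snippet = {}
    for field in keep_fields:
        if field not in record:
            continue
        value = record[field]
        if isinstance(value, list):
            total = len(value)
            value = value[:max_items]
            if total != len(value):
                # Tell the model the list was truncated so it does not
                # assume it has seen everything.
                snippet[field + "_total"] = total
        snippet[field] = value
    # Compact separators keep the token count down.
    return json.dumps(snippet, separators=(",", ":"), sort_keys=True)


# Hypothetical raw record with a large field the agent never needs.
raw_record = {
    "status": "active",
    "invoices": [{"id": i} for i in range(8)],
    "internal_audit_log": ["entry"] * 1000,
}
snippet_text = to_context_snippet(raw_record, ["status", "invoices"], max_items=2)
```

&lt;p&gt;Reporting the original list length alongside the truncated list is a design choice aimed at the second failure mode above: the agent knows more data exists instead of inventing it.&lt;/p&gt;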

&lt;h4&gt;
  
  
  3. Executing Side Effects
&lt;/h4&gt;

&lt;p&gt;The MCP server executes actions for the agent. When an agent triggers a deployment, the server is the execution mechanism.&lt;/p&gt;

&lt;p&gt;A confused agent can trigger destructive loops if the server lacks idempotency or error handling. The server must implement safeguards preventing the model from erroneously retrying state-changing operations.&lt;/p&gt;
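
&lt;p&gt;A common safeguard of this kind is an idempotency key. The following is a minimal in-memory sketch, assuming the caller supplies a key per logical operation; the IdempotentExecutor class and the issue_refund function are hypothetical, and a real server would persist keys in a durable store.&lt;/p&gt;

```python
class IdempotentExecutor:
    """Run a state-changing action at most once per idempotency key."""

    def __init__(self, action):
        self._action = action
        self._results = {}  # in-memory only; an assumption for this sketch

    def execute(self, idempotency_key, **params):
        if idempotency_key in self._results:
            # A retry with the same key returns the stored result
            # instead of repeating the side effect.
            return self._results[idempotency_key]
        result = self._action(**params)
        self._results[idempotency_key] = result
        return result


refund_calls = []


def issue_refund(subscription_id):
    # Stand-in for a real payment call.
    refund_calls.append(subscription_id)
    return {"credit_id": "cr-" + str(len(refund_calls))}


executor = IdempotentExecutor(issue_refund)
first = executor.execute("refund-sub-42-cycle-7", subscription_id="sub-42")
retry = executor.execute("refund-sub-42-cycle-7", subscription_id="sub-42")
```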

&lt;h3&gt;
  
  
  The Engineering Rigor Required for Production
&lt;/h3&gt;

&lt;p&gt;Shipping agents to production requires due diligence exceeding standard microservice development. This is most visible in return state ambiguity.&lt;/p&gt;

&lt;p&gt;A traditional API might return a 404 error code, which a client handles with logic. An MCP server faces a more complex challenge. It must return a natural language description or structured tool result explaining why the action failed.&lt;/p&gt;

&lt;p&gt;If the server returns a generic stack trace, the agent may retry endlessly or invent a plausible but incorrect reason for failure. The error message becomes part of the prompt for the next conversation turn. It must be engineered as carefully as the system prompt.&lt;/p&gt;
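
&lt;p&gt;As a rough sketch of an engineered error, the helper below builds a result modeled on the MCP tool result shape (an isError flag plus text content). The tool_error function, its parameters and the wording of the message are assumptions made for illustration.&lt;/p&gt;

```python
def tool_error(reason, retryable, suggestion):
    """Build a tool result that tells the model why the call failed."""
    retry_hint = "Retrying may succeed. " if retryable else "Do not retry this call. "
    text = (
        "Action failed: " + reason + ". "
        + retry_hint
        + "Suggested next step: " + suggestion
    )
    return {"isError": True, "content": [{"type": "text", "text": text}]}


error_result = tool_error(
    reason="the subscription was already refunded for this billing cycle",
    retryable=False,
    suggestion="tell the user that a credit has already been issued.",
)
```

&lt;p&gt;Stating explicitly whether a retry is safe gives the model a concrete instruction rather than leaving it to infer one from a stack trace.&lt;/p&gt;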

&lt;p&gt;Latency is also critical. Agents operate in a sequential thought loop. They reason. They call a tool. They wait. They reason again.&lt;/p&gt;

&lt;p&gt;A slow server breaks the cognitive chain. High latency causes context timeouts, forcing the agent to abandon workflows. This leaves systems in inconsistent states.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scaling Testing via Multitenancy
&lt;/h3&gt;

&lt;p&gt;The nondeterministic nature of the client makes testing difficult. Traditional unit tests are insufficient.&lt;/p&gt;

&lt;p&gt;Unit testing a Python function to ensure valid JSON output does not prove that an agent will understand how to use it. Mocks are equally ineffective. They decouple the &lt;a href="https://thenewstack.io/why-your-microservice-integration-tests-miss-real-problems/" rel="noopener noreferrer"&gt;test from real system behavior&lt;/a&gt; and create false confidence.&lt;/p&gt;

&lt;p&gt;The only way to validate an MCP server is through rigorous end-to-end testing against real dependencies. However, spinning up full cluster replicas for every test is rarely feasible.&lt;/p&gt;

&lt;p&gt;To validate an MCP server without the overhead of full environment replication, we treat the test run as a logical slice within a shared cluster. This life cycle relies on header-based routing and session affinity:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Handshake and routing: The test harness initializes the agent with specific context metadata (such as a baggage header or a custom routing parameter) during the WebSocket or transport handshake. This signals the ingress controller or service mesh to route the persistent JSON-RPC session specifically to the candidate MCP server (the version under test), bypassing the stable production traffic.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Session isolation: Once connected, the agent operates within a strictly isolated session. While the underlying compute resources may be shared, the logical control flow is pinned to the candidate artifact. This ensures that the nondeterministic reasoning of the agent is exercising only the new code paths.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Shared downstream state: The candidate MCP server processes the agent’s intent but executes side effects against shared downstream dependencies such as staging databases or stable microservices. This eliminates the need for mocks, allowing the agent to interact with a realistic “world model” where API contracts and data schemas are genuine.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
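
&lt;p&gt;The routing decision in the first step can be sketched in a few lines. This is a simplified stand-in for what an ingress controller or service mesh would do, assuming the routing key travels in a W3C-style baggage header; the key name routing-key, the backend addresses and both helper functions are hypothetical.&lt;/p&gt;

```python
def parse_baggage(header_value):
    """Parse a baggage-style header ("k1=v1,k2=v2;prop") into a dict."""
    entries = {}
    for member in header_value.split(","):
        key, sep, value = member.strip().partition("=")
        if sep:
            # Drop any ";property" suffix attached to the value.
            entries[key.strip()] = value.split(";")[0].strip()
    return entries


def pick_backend(headers, candidates, baseline):
    """Route to a candidate MCP server when the routing key matches, else baseline."""
    routing_key = parse_baggage(headers.get("baggage", "")).get("routing-key")
    return candidates.get(routing_key, baseline)


candidates = {"pr-123": "mcp-candidate:8080"}
test_target = pick_backend(
    {"baggage": "tenant=acme, routing-key=pr-123;ttl=60"},
    candidates,
    "mcp-stable:8080",
)
default_target = pick_backend({}, candidates, "mcp-stable:8080")
```

&lt;p&gt;Requests without a matching key fall through to the stable server, which is what keeps the test session from disturbing other traffic in the shared cluster.&lt;/p&gt;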

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe487fjlg1oealeep18jc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe487fjlg1oealeep18jc.png" alt=" " width="800" height="613"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This architecture enables safe end-to-end semantic testing. The harness prompts the agent to perform an operation and verifies the state change against downstream microservices.&lt;/p&gt;

&lt;p&gt;Isolation at the connection layer turns the test run into a private lane on a public highway. This enables full end-to-end validation of the MCP server without saturating &lt;a href="https://www.signadot.com/blog/smart-ephemeral-environments-share-more-copy-less" rel="noopener noreferrer"&gt;testing infrastructure or introducing resource contention in shared staging environments&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Treat It Like Critical Infrastructure
&lt;/h3&gt;

&lt;p&gt;Teams that are shipping advanced, customer-facing agents understand that robust MCP servers are critical infrastructure. We must recognize them as complex architectural nodes that directly affect agent reliability.&lt;/p&gt;

&lt;p&gt;Model evals are critical but insufficient for production standards. Rigorous integration testing of agents with MCP servers is necessary.&lt;/p&gt;

&lt;p&gt;An agent is only as effective as its tools. A fragile MCP server creates a fragile agent. Elevating the MCP server to a fully validated microservice is essential for advancing agent development from internal experiments to products that are ready for production.&lt;/p&gt;

&lt;p&gt;Learn more about how to implement this testing workflow for your agents at &lt;a href="https://www.signadot.com/" rel="noopener noreferrer"&gt;Signadot.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>aiagents</category>
      <category>platformengineering</category>
      <category>mcp</category>
      <category>backend</category>
    </item>
    <item>
      <title>Your CI/CD Pipeline Is Not Ready To Ship AI Agents</title>
      <dc:creator>Signadot</dc:creator>
      <pubDate>Wed, 18 Feb 2026 15:43:01 +0000</pubDate>
      <link>https://dev.to/signadot/your-cicd-pipeline-is-not-ready-to-ship-ai-agents-219</link>
      <guid>https://dev.to/signadot/your-cicd-pipeline-is-not-ready-to-ship-ai-agents-219</guid>
      <description>&lt;p&gt;&lt;em&gt;Read this blog on &lt;a href="https://www.signadot.com/blog/your-ci-cd-pipeline-is-not-ready-to-ship-ai-agents?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=Your+CI%2FCD+Pipeline+Is+Not+Ready+To+Ship+AI%C2%A0Agents" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Let’s be honest with ourselves for a minute. If you look past the hype cycles, the viral Twitter demos and the astronomical valuations of foundation model companies, you will notice a distinct gap in the AI landscape.&lt;/p&gt;

&lt;p&gt;We are incredibly early, and our infrastructure is failing us.&lt;/p&gt;

&lt;p&gt;While every SaaS company has slapped a copilot sidebar onto its UI, actual autonomous agents are rare in the wild. I am referring to software that reliably executes complex and multistep tasks without human hand-holding. Most agents today are internal tools glued together by enthusiastic engineers to summarize Slack threads or query a SQL database. They live in the safe harbor of internal usage where a 20% failure rate is a quirky annoyance rather than a churn event.&lt;/p&gt;

&lt;p&gt;Why aren’t these agents facing customers yet? It is not because the models lack intelligence. It is because our delivery pipelines lack rigor. Taking an agent from cool demo to production-grade reliability is an engineering nightmare that few have solved because traditional CI/CD pipelines simply were not designed for non-deterministic software.&lt;/p&gt;

&lt;p&gt;We are learning the hard way that shipping agents is not an AI problem. It is a systems engineering problem. Specifically, it is a &lt;a href="https://thenewstack.io/why-your-microservice-integration-tests-miss-real-problems/" rel="noopener noreferrer"&gt;testing infrastructure problem&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Death of ‘Prompt and Pray’
&lt;/h3&gt;

&lt;p&gt;For the last year, the industry has been obsessed with frameworks that promised magic. You give the framework a goal and it figures out the rest. This was the &lt;a href="https://www.oreilly.com/radar/beyond-prompt-and-pray/" rel="noopener noreferrer"&gt;“prompt and pray”&lt;/a&gt; era.&lt;/p&gt;

&lt;p&gt;But as recent discussions in the engineering community highlight, specifically the insightful &lt;a href="https://github.com/humanlayer/12-factor-agents" rel="noopener noreferrer"&gt;conversation around 12-Factor Agents&lt;/a&gt;, production reality is boringly deterministic. The developers actually shipping reliable agents are abandoning the idea of total autonomy. Instead, they are building robust and deterministic workflows where large language models (LLMs) are treated as fuzzy function calls injected at specific leverage points.&lt;/p&gt;

&lt;p&gt;The 12-Factor philosophy correctly argues that you must own your control flow. You cannot outsource your logic loop to a probabilistic model. If you do, you end up with a system that works 80% of the time and hallucinates itself into a corner the other 20%.&lt;/p&gt;

&lt;p&gt;So we build the agent as a workflow. We treat the LLM as a component rather than the architect. But once we settle on this architecture, we run headfirst into a wall that traditional software engineering solved a decade ago but which AI has reopened. That wall is integration testing.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Trap of Evals
&lt;/h3&gt;

&lt;p&gt;When teams start testing agents, they almost always start with evals.&lt;/p&gt;

&lt;p&gt;Evals are critical. You need frameworks to score your LLM outputs for relevance, toxicity and hallucinations. You need to know if your prompt changes caused a regression in reasoning.&lt;/p&gt;

&lt;p&gt;However, in the context of shipping a product, evals are essentially unit tests. They test the logic of the node, but they do not test the integrity of the graph.&lt;/p&gt;

&lt;p&gt;In a production environment, your agent is not chatting in a void. It is acting. It is calling tools. It is fetching data from a CRM, updating a ticket in Jira or triggering a deployment via an &lt;a href="https://thenewstack.io/mcp-the-missing-link-between-ai-agents-and-apis/" rel="noopener noreferrer"&gt;MCP (Model Context Protocol) server&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The reliability of your agent is not just defined by how well it writes text or code. It is defined by how consistently it handles the messy and structured data returned by these external dependencies.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Integration Nightmare
&lt;/h3&gt;

&lt;p&gt;This is where the platform engineering headache begins.&lt;/p&gt;

&lt;p&gt;Imagine you have an agent designed to troubleshoot Kubernetes pod failures. To test this agent, you cannot just feed it a text prompt. You need to put it in an environment where it can do several things. It must call the Kubernetes API or an MCP server wrapping it. It must receive a JSON payload describing a CrashLoopBackOff. It must parse that payload. It must decide to check the logs. Finally, it must call the log service.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjf8wg92a7q3ekw6fexi4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjf8wg92a7q3ekw6fexi4.png" alt=" " width="800" height="492"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If the structure of that JSON payload changes, or if the latency of the log service spikes, or if the MCP server returns a slightly different error schema, your agent might break. It might hallucinate a solution because the input context did not match its training examples.&lt;/p&gt;
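
&lt;p&gt;The troubleshooting flow described above can be sketched as a deterministic function in which the model is invoked at exactly one step. Everything here is illustrative: diagnose_pod, the payload shape and the injected callables are assumptions, and the stand-in lambdas exist only to make the sketch runnable, whereas real tests would exercise real dependencies.&lt;/p&gt;

```python
def diagnose_pod(pod, get_status, get_logs, summarize):
    """Deterministic control flow; the model is called at exactly one step."""
    status = get_status(pod)
    reason = status.get("reason")
    if reason is None:
        # Schema drift: stop here instead of letting the model guess.
        return {"ok": False, "detail": "status payload has no 'reason' field"}
    if reason != "CrashLoopBackOff":
        return {"ok": True, "detail": "no action needed for reason " + reason}
    logs = get_logs(pod)
    # The only fuzzy step: ask the model to explain the logs.
    return {"ok": True, "detail": summarize(reason, logs)}


# Stand-in callables, used only so the sketch can run on its own.
healthy_run = diagnose_pod(
    "checkout-7f9c",
    get_status=lambda pod: {"reason": "CrashLoopBackOff"},
    get_logs=lambda pod: "panic: missing DB_URL",
    summarize=lambda reason, logs: reason + " caused by: " + logs,
)
drifted_run = diagnose_pod(
    "checkout-7f9c",
    get_status=lambda pod: {"state": "waiting"},
    get_logs=lambda pod: "",
    summarize=lambda reason, logs: "",
)
```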

&lt;p&gt;To test this reliably, you &lt;a href="https://www.signadot.com/blog/we-need-a-new-approach-to-testing-microservices" rel="noopener noreferrer"&gt;need integration testing&lt;/a&gt;. But integration testing for agents is significantly harder than for standard web apps.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Traditional Testing Fails
&lt;/h3&gt;

&lt;p&gt;In traditional software development, we mock dependencies. We stub out the database and the third-party APIs.&lt;/p&gt;

&lt;p&gt;But with LLM agents, the data is the control flow. If you mock the response from an MCP server, you are feeding the LLM a perfect and sanitized scenario. You are testing the happy path. But LLMs are most dangerous on the unhappy path.&lt;/p&gt;

&lt;p&gt;You need to know how the agent reacts when the MCP server returns a 500 error, an empty list or a schema with missing fields. If you mock these interactions, you are writing the test to pass rather than to find bugs. You are not testing the agent’s ability to reason. You are testing your own ability to write mocks.&lt;/p&gt;

&lt;p&gt;The alternative to mocking is usually a full staging environment where you spin up the agent, the MCP servers, the databases and the message queues.&lt;/p&gt;

&lt;p&gt;But in a modern microservices architecture, spinning up a duplicate stack for every pull request is prohibitively expensive and slow. You cannot wait 45 minutes for a full environment provision just to test if a tweak to the system prompt handles a database error correctly.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Need for Ephemeral Sandboxes
&lt;/h3&gt;

&lt;p&gt;To ship production-grade agents, we need to rethink our CI/CD pipeline. We need infrastructure that allows us to perform high-fidelity integration testing early in the software development life cycle.&lt;/p&gt;

&lt;p&gt;We need ephemeral sandboxes.&lt;/p&gt;

&lt;p&gt;A platform engineer needs to provide a way for the AI developer to spin up a lightweight, isolated environment that contains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The version of the agent being tested.&lt;/li&gt;
&lt;li&gt;The specific MCP servers and microservices it depends on.&lt;/li&gt;
&lt;li&gt;Access to real (or realistic) data stores.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Crucially, we do not need to duplicate the entire platform. We need a system that allows us to spin up the changed components while routing traffic intelligently to shared and stable baselines for the rest of the stack.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fddujny88yp7dlc8koxmu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fddujny88yp7dlc8koxmu.png" alt=" " width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This approach solves the data fidelity problem. The agent interacts with real MCP servers running real logic. If the MCP server returns a complex JSON object, the agent has to ingest it. If the agent makes a state-changing call like restart pod, it actually hits the service or a sandboxed version of it. This ensures the loop is closed.&lt;/p&gt;

&lt;p&gt;This is the only way to verify that the workflow holds up.&lt;/p&gt;

&lt;h3&gt;
  
  
  Shifting Left on Agentic Reliability
&lt;/h3&gt;

&lt;p&gt;The future of AI agents is not just better models. It is better DevOps.&lt;/p&gt;

&lt;p&gt;If we accept that production agents are just software with fuzzy logic, we must accept that they require the same rigor in integration testing as a payment gateway or a flight control system.&lt;/p&gt;

&lt;p&gt;We are moving toward a world where the agent is just one microservice in a Kubernetes cluster. It communicates via MCP to other services. The challenge for platform engineers is to give developers the confidence to merge code.&lt;/p&gt;

&lt;p&gt;That confidence does not come from a green checkmark on a prompt eval. It comes from seeing the agent navigate a live environment, query a live MCP server and execute a workflow successfully.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Building the agent is the easy part. Building the stack to reliably test the agent is where the battle is won or lost.&lt;/p&gt;

&lt;p&gt;As we move from internal toys and controlled demos to customer-facing products, the teams that win will be those that can iterate fast without breaking things. They will be the teams that abandon the idea of “prompt and pray” and instead bring production fidelity to their pull request (PR) review. This requires a specific type of infrastructure focused on request-level isolation and ephemeral &lt;a href="https://www.signadot.com/blog/kubernetes-isnt-your-ai-bottleneck----its-your-secret-weapon" rel="noopener noreferrer"&gt;testing environments that work natively within Kubernetes&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Solving this infrastructure gap is our core mission at Signadot. We allow platform teams to create lightweight &lt;a href="https://www.signadot.com/blog/sandbox-testing-the-devex-game-changer-for-microservices" rel="noopener noreferrer"&gt;sandboxes to test agents&lt;/a&gt; against real dependencies without the complexity of full environments. If you are refining the architecture for your AI workflows, you can learn more about this testing pattern at signadot.com.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>aiagents</category>
      <category>automation</category>
    </item>
    <item>
      <title>Ramp’s Inspect shows closed-loop AI agents are software’s future</title>
      <dc:creator>Signadot</dc:creator>
      <pubDate>Thu, 12 Feb 2026 15:23:51 +0000</pubDate>
      <link>https://dev.to/signadot/ramps-inspect-shows-closed-loop-ai-agents-are-softwares-future-4ic1</link>
      <guid>https://dev.to/signadot/ramps-inspect-shows-closed-loop-ai-agents-are-softwares-future-4ic1</guid>
      <description>&lt;p&gt;&lt;em&gt;Read this blog on &lt;a href="https://www.signadot.com/blog/ramps-inspect-shows-closed-loop-ai-agents-are-softwares-future?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=Ramp%E2%80%99s+Inspect+shows+closed-loop+AI+agents+are+software%E2%80%99s+future" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The recent release of the &lt;a href="https://engineering.ramp.com/post/why-we-built-our-background-agent" rel="noopener noreferrer"&gt;background coding agent Inspect&lt;/a&gt; by Ramp’s engineering team serves as a definitive proof point that closed-loop agentic systems are the future of software development. It has transformed coding agents into truly autonomous engineering partners, and it is fundamentally changing the way agents deliver software.&lt;/p&gt;

&lt;p&gt;Whether teams use a custom cloud development environment (CDE) like Ramp or another approach, the signal is clear: Teams need to solve for this kind of autonomy or risk getting left behind. Modern engineers need access to coding agents that do not just generate code but also run it, verify the output, and iterate on the solution until it works.&lt;/p&gt;

&lt;p&gt;This distinction represents a fundamental shift. The industry has been focused on optimizing the “brain” of agents, solving for context windows and reasoning. Ramp’s success validates that the “body” matters just as much.&lt;/p&gt;

&lt;p&gt;The ability to interact with a runtime environment is what transforms code from a hypothesis into a solution. This verification loop separates truly autonomous coding agents from those that rely on humans to validate their work.&lt;/p&gt;

&lt;h3&gt;
  
  
  The open-loop bottleneck
&lt;/h3&gt;

&lt;p&gt;Modern coding agents are impressive. They can plan complex refactors and generate thousands of lines of code. However, these &lt;a href="https://thenewstack.io/your-ci-cd-pipeline-is-not-ready-to-ship-ai-agents/" rel="noopener noreferrer"&gt;agents typically operate&lt;/a&gt; in an open loop. They rely on the developer to act as the runtime environment. The agent proposes a solution. The human must compile, run the tests, interpret the error messages, and feed them back to the agent. The cognitive load of verification remains with the user.&lt;/p&gt;

&lt;p&gt;This workflow caps developer velocity. The speed of the agent is irrelevant if the verification process is slow. We have optimized code generation to be near instantaneous, but verification remains bound by human bandwidth and linear CI pipelines.&lt;/p&gt;

&lt;p&gt;Inspect demonstrates that closing that loop unlocks a new category of velocity. By giving the agent access to a &lt;a href="https://www.signadot.com/blog/sandbox-testing-the-devex-game-changer-for-microservices" rel="noopener noreferrer"&gt;sandbox to run builds and tests&lt;/a&gt;, the agent transitions from text generator to task completer. It hands off a verified solution rather than a draft.&lt;/p&gt;

&lt;p&gt;The impact is measurable. Ramp reported that its internal adoption charts went vertical. Within months, approximately 30% of all pull requests merged to its frontend and backend repositories were written by Inspect. This penetration suggests closed-loop agents are a step-function change in productivity, not a marginal improvement.&lt;/p&gt;

&lt;h3&gt;
  
  
  The economics of curiosity
&lt;/h3&gt;

&lt;p&gt;The value proposition of closed-loop agents is not just delivering code faster. It is about the parallelization of solution discovery.&lt;/p&gt;

&lt;p&gt;In traditional workflows, exploring refactors or library upgrades is expensive. It requires context switching, stashing work and fighting dependency conflicts. Because experimentation costs are high, we experiment less. We stick to safe patterns to avoid the time sink of failure.&lt;/p&gt;

&lt;p&gt;Background agents change the economics of curiosity. If an engineer can spin up 10 concurrent agent sessions to explore 10 architectural approaches, the cost of failure drops significantly.&lt;/p&gt;

&lt;p&gt;Consider a team migrating a legacy component. Currently, this is a multiweek spike. In the new paradigm, a &lt;a href="https://www.signadot.com/blog/your-infrastructure-isnt-ready-for-agentic-development-at-scale" rel="noopener noreferrer"&gt;developer could instead task a fleet of agents&lt;/a&gt; to attempt the migration using different strategies. One agent might try a strangler fig pattern. Another might attempt a hard cutover. A third might focus on integration tests.&lt;/p&gt;

&lt;p&gt;The developer then reviews results rather than typing code. The agents run in isolated sandboxes. They build, catch syntax errors, and run test suites until they achieve a green state. The developer wakes up to three potential pull requests verified against the CI pipeline and chooses the best one.&lt;/p&gt;
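
&lt;p&gt;The fan-out described above can be approximated with a thread pool. This is a toy sketch, not how Inspect or any particular product works: the explore function and the fake_attempt stand-in (which pretends one strategy fails) are hypothetical, and a real attempt would launch an agent session in its own sandbox.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def explore(strategies, run_attempt, max_workers=3):
    """Run one attempt per strategy concurrently and keep the passing ones."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input strategies.
        results = list(pool.map(run_attempt, strategies))
    return [result for result in results if result["green"]]


def fake_attempt(strategy):
    # Stand-in for a real agent session; pretends the hard cutover fails.
    return {"strategy": strategy, "green": strategy != "hard cutover"}


green_candidates = explore(
    ["strangler fig", "hard cutover", "integration tests first"],
    fake_attempt,
)
```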

&lt;h3&gt;
  
  
  Verification beyond localhost
&lt;/h3&gt;

&lt;p&gt;Ramp’s Inspect platform validates within a custom-built CDE. To ensure these environments start quickly despite their complexity, a sophisticated snapshotting system keeps images warm and ready to launch. Ramp was able to extend this CDE infrastructure to also support integration testing, a brilliant engineering feat that works well for its specific context.&lt;/p&gt;

&lt;p&gt;However, for many organizations building complex, cloud native applications with high levels of dependencies, this approach faces significant hurdles. Often, the entire stack is too large to be spun up on a single virtual machine (VM) or devpod. In these scenarios, while CDEs remain excellent for replacing local development laptops, high-fidelity integration &lt;a href="https://www.signadot.com/blog/we-need-a-new-approach-to-testing-microservices" rel="noopener noreferrer"&gt;testing requires a different approach&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To enable true autonomy in these complex environments, we need a way to perform integration testing without replicating the entire world. We can connect agents directly to a &lt;a href="https://www.signadot.com/blog/smart-ephemeral-environments-share-more-copy-less" rel="noopener noreferrer"&gt;shared baseline environment&lt;/a&gt; using existing Kubernetes infrastructure.&lt;/p&gt;

&lt;p&gt;In this model, the agent deploys only the modified service to a lightweight sandbox. The infrastructure uses dynamic routing and context propagation to direct specific &lt;a href="https://www.signadot.com/blog/microservices-testing-4-use-cases-for-sandbox-environments" rel="noopener noreferrer"&gt;test traffic to that sandbox&lt;/a&gt; while fulfilling all other dependencies from a shared, stable baseline.&lt;/p&gt;

&lt;p&gt;This approach gives coding agents the power to execute autonomous end-to-end testing, regardless of the stack’s size or complexity. It leverages the existing cluster to provide high-fidelity context. An agent can then run integration tests against real upstream and downstream services. It sees how the change interacts with the actual message queue schema and the latency of the live database.&lt;/p&gt;

&lt;p&gt;This closes the loop with higher fidelity while lowering the infrastructure barrier. By testing against a shared cluster, the agent can catch integration regressions that might pass in a hermetic VM without requiring the platform team to build a custom orchestration engine to support it.&lt;/p&gt;

&lt;h3&gt;
  
  
  The future of software delivery
&lt;/h3&gt;

&lt;p&gt;The release of Inspect is a clear signal of where software development is heading. The era of the human engineer as the sole verifier is ending. We are moving toward a world where agents operate as autonomous partners capable of exploring solutions and verifying their own work.&lt;/p&gt;

&lt;p&gt;Ramp has proven that this workflow is not science fiction. It is working in production today and is driving massive efficiency gains. The question for the rest of the industry is not whether to adopt this workflow, but how.&lt;/p&gt;

&lt;p&gt;Whether a team chooses to build a custom platform like Ramp or adopt an existing cloud native solution like &lt;a href="https://www.signadot.com/" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt; to give their agents a runtime, the imperative is the same. We must provide our agents with a body. We must close the loop between generation and verification. Once we do, we unlock a level of velocity that will define the next generation of high-performing engineering teams.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>agents</category>
      <category>ai</category>
      <category>devops</category>
    </item>
    <item>
      <title>Your infrastructure isn’t ready for agentic development at scale</title>
      <dc:creator>Signadot</dc:creator>
      <pubDate>Thu, 05 Feb 2026 15:53:22 +0000</pubDate>
      <link>https://dev.to/signadot/your-infrastructure-isnt-ready-for-agentic-development-at-scale-25jk</link>
      <guid>https://dev.to/signadot/your-infrastructure-isnt-ready-for-agentic-development-at-scale-25jk</guid>
      <description>&lt;p&gt;Read this blog on &lt;a href="https://www.signadot.com/blog/your-infrastructure-isnt-ready-for-agentic-development-at-scale?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=Your+infrastructure+isn%E2%80%99t+ready+for+agentic+development+at+scale" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I have spent the last year watching the AI conversation shift from smart autocomplete to autonomous contribution. When I test tools like Claude Code or GitHub Copilot Workspace, I am no longer just seeing code suggestions. I am watching them solve tickets and refactor entire modules.&lt;/p&gt;

&lt;p&gt;The promise is seductive. I imagine assigning a complex task and returning to merged work. But while these agents generate code in seconds, I have discovered that code verification is the new bottleneck.&lt;/p&gt;

&lt;p&gt;For agents to be force multipliers, they cannot rely on humans to &lt;a href="https://www.signadot.com/blog/traditional-code-review-is-dead-what-comes-next" rel="noopener noreferrer"&gt;validate every step&lt;/a&gt;. If I have to debug every intermediate state, my productivity gains evaporate. To achieve 10 times the impact, we must transition to an agent-driven loop where humans provide intent while agents handle implementation and integration.&lt;/p&gt;

&lt;h3&gt;
  
  
  The code generation feedback loop crisis
&lt;/h3&gt;

&lt;p&gt;Consider a scenario where an agent is tasked with updating a deprecated API endpoint in a user service. The agent parses the codebase, identifies the relevant files, and generates syntactically correct code. It may even generate a unit test that passes within the limited context of that specific repository.&lt;/p&gt;

&lt;p&gt;However, problems emerge when code interacts with the broader system. A change might break a contract with a downstream payment gateway or an upstream authentication service. If the agent cannot see this failure, it assumes the task is complete and opens a pull request.&lt;/p&gt;

&lt;p&gt;The burden then falls on human developers. They have to pull down the agent’s branch, spin up a local environment, or wait for a slow staging build to finish, only to discover the integration error. The developer pastes the error log back into the chat window and asks the agent to try again. This ping-pong effect destroys velocity.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.linkedin.com/in/bcherny/" rel="noopener noreferrer"&gt;Boris Cherny&lt;/a&gt;, creator of Claude Code, has noted the necessity of &lt;a href="https://x.com/bcherny/status/2007179832300581177" rel="noopener noreferrer"&gt;closed-loop systems&lt;/a&gt; for agents to be effective. An agent is only as capable as its ability to observe the consequences of its actions. Without a feedback loop that includes real runtime data, an agent is building in the dark.&lt;/p&gt;

&lt;p&gt;In &lt;a href="https://thenewstack.io/cloud-native/" rel="noopener noreferrer"&gt;cloud native development&lt;/a&gt;, unit tests and mocks are insufficient for this feedback. In a microservices architecture, correctness is a function of the broader ecosystem.&lt;/p&gt;

&lt;p&gt;Code that passes a unit test is merely a suggestion that it might work. True verification requires the code to run against real dependencies, real network latency, and real data schemas. For an agent to iterate autonomously, it needs access to runtime reality.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frxhi3rozcn71v13ojttz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frxhi3rozcn71v13ojttz.png" alt=" " width="800" height="463"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The requirement: Realistic runtime environments at scale
&lt;/h3&gt;

&lt;p&gt;In a recent blog post, &lt;a href="https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents" rel="noopener noreferrer"&gt;“Effective harnesses for long-running agents,”&lt;/a&gt; Anthropic’s engineering team argued that an agent’s performance is strictly limited by the quality of its harness. If the harness provides slow or inaccurate feedback, the agent cannot learn or correct itself.&lt;/p&gt;

&lt;p&gt;This presents a massive infrastructure challenge for engineering leadership. In a large organization, you might deploy 100 autonomous agents to tackle backlog tasks simultaneously. To support this, you effectively need 100 distinct staging environments.&lt;/p&gt;

&lt;p&gt;The traditional approach to this problem fails at scale. Spinning up full Kubernetes namespaces or ephemeral clusters for every task is cost-prohibitive and slow. Provisioning a full cluster with 50 or more microservices, databases, and message queues can take 15 minutes or more. This latency is fatal for an AI workflow. Large language models (LLMs) operate on a timescale of seconds, so an agent that waits 15 minutes for each environment spends nearly all of its time idle.&lt;/p&gt;

&lt;p&gt;We are left with a fundamental conflict. We need production-like fidelity to ensure reliability, but we cannot afford the production-level overhead for every &lt;a href="https://thenewstack.io/ai-agents-vs-agentic-ai-a-kubernetes-developers-guide/" rel="noopener noreferrer"&gt;agentic task&lt;/a&gt;. We need a way to verify code that is fast, cheap, and accurate.&lt;/p&gt;

&lt;h3&gt;
  
  
  The solution: Environment virtualization
&lt;/h3&gt;

&lt;p&gt;The answer lies in decoupling the environment from the underlying infrastructure. This concept is known as environment virtualization.&lt;/p&gt;

&lt;p&gt;Environment virtualization allows the creation of lightweight and ephemeral sandboxes within a shared Kubernetes cluster. In this model, a baseline environment runs the stable versions of all services. When an agent proposes a change to a specific service, such as the user service mentioned earlier, it does not clone the entire cluster. Instead, it spins up only the modified workload containing the agent’s new code as a shadow deployment.&lt;/p&gt;

&lt;p&gt;The environment then utilizes dynamic traffic routing to create the illusion of a dedicated environment. It employs context propagation headers to route specific requests to the agent’s sandbox. If a request carries a specific routing key associated with the agent’s task, the service mesh or ingress controller directs that request to the shadow deployment. All other downstream calls fall back to the stable baseline services.&lt;/p&gt;
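
&lt;p&gt;To make the routing decision concrete, here is a minimal sketch of it in Python. It is an illustration under stated assumptions, not how any particular service mesh or product implements it: the header name, the routing key, and the workload addresses are all hypothetical.&lt;/p&gt;

```python
from typing import Mapping, Optional

# Hypothetical names: the header, the routing key, and the workload
# addresses below are illustrative assumptions, not a real product's API.
ROUTING_HEADER = "x-routing-key"

# Routing key -> {service name -> address of the sandboxed workload}.
SANDBOXES = {
    "agent-task-42": {"user-service": "user-service-sbx-42:8080"},
}

# Stable baseline address for every service in the cluster.
BASELINE = {
    "user-service": "user-service:8080",
    "payment-gateway": "payment-gateway:8080",
}


def resolve(service: str, headers: Mapping[str, str]) -> str:
    """Return the address that should serve a request for `service`."""
    key: Optional[str] = headers.get(ROUTING_HEADER)
    overrides = SANDBOXES.get(key, {}) if key else {}
    # Only services the sandbox overrides are redirected; everything
    # else falls back to the stable baseline.
    return overrides.get(service, BASELINE[service])
```

&lt;p&gt;With this table, a request for user-service that carries the key agent-task-42 resolves to the sandboxed workload, while the same request without the key, or any request for payment-gateway, resolves to the baseline. A real proxy would also match header names case-insensitively, which this sketch skips.&lt;/p&gt;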

&lt;p&gt;This architecture solves the agent-environment fit in three specific ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Speed:&lt;/strong&gt; Because a single container or pod is launching, rather than a full cluster, sandboxes spin up in seconds.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cost:&lt;/strong&gt; The infrastructure footprint is minimal. You are not paying for idle databases or duplicate copies of stable services.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Fidelity:&lt;/strong&gt; Agents test against real dependencies and valid data rather than stubs. The modified service interacts with the actual payment gateways and databases in the baseline.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
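
&lt;p&gt;A rough back-of-the-envelope calculation shows the size of the cost difference. The sketch below reuses the figures from earlier (100 agents, a stack of 50 services) and assumes, purely for illustration, one workload per service and one modified service per agent.&lt;/p&gt;

```python
def workloads_full_clone(agents: int, services: int) -> int:
    """Every agent gets its own copy of the whole stack."""
    return agents * services


def workloads_shared_baseline(agents: int, services: int,
                              changed_per_agent: int = 1) -> int:
    """One shared baseline plus only the workloads each agent modified."""
    return services + agents * changed_per_agent


agents, services = 100, 50  # figures used earlier in this article
full = workloads_full_clone(agents, services)          # 100 * 50 = 5000
shared = workloads_shared_baseline(agents, services)   # 50 + 100 = 150
print(f"full clones: {full}, shared baseline: {shared}")
```

&lt;p&gt;Under these assumptions, the shared model runs 150 workloads instead of 5,000, roughly a 33-fold reduction, before counting the duplicated databases and message queues that full clones would also need.&lt;/p&gt;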

&lt;h3&gt;
  
  
  The seamless verification workflow for AI agents
&lt;/h3&gt;

&lt;p&gt;The mechanics of this verification loop rely on precise context propagation, typically carried in standard headers such as the W3C baggage header that OpenTelemetry uses.&lt;/p&gt;

&lt;p&gt;When an agent works on a task, its environment is virtually mapped to the remote Kubernetes cluster. This setup supports conflict-free parallelism. Multiple agents can simultaneously work on the same microservice in different sandboxes without collision because routing is determined by unique headers attached to test traffic.&lt;/p&gt;
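
&lt;p&gt;As a sketch of how a routing key can travel with a request, the Python below parses a W3C-style baggage header and copies it onto a downstream call. The entry name routing-key is a hypothetical choice for this example, and in practice an OpenTelemetry SDK would normally handle propagation rather than hand-written code like this.&lt;/p&gt;

```python
from typing import Dict, Optional
from urllib.parse import unquote


def parse_baggage(header: str) -> Dict[str, str]:
    """Parse a baggage header value into a dict of key/value entries."""
    entries: Dict[str, str] = {}
    for member in header.split(","):
        # Drop optional ";property" metadata after the key=value pair.
        pair = member.split(";", 1)[0].strip()
        if "=" not in pair:
            continue
        key, value = pair.split("=", 1)
        entries[key.strip()] = unquote(value.strip())
    return entries


def routing_key(headers: Dict[str, str],
                entry: str = "routing-key") -> Optional[str]:
    """Return the routing key carried in baggage (entry name is hypothetical)."""
    return parse_baggage(headers.get("baggage", "")).get(entry)


def outgoing_headers(incoming: Dict[str, str]) -> Dict[str, str]:
    """Copy baggage onto a downstream call so the key survives each hop."""
    baggage = incoming.get("baggage")
    return {"baggage": baggage} if baggage else {}
```

&lt;p&gt;Two agents working on the same service simply send different keys, for example routing-key=agent-a and routing-key=agent-b, so their test traffic never lands in each other’s sandbox.&lt;/p&gt;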

&lt;p&gt;Here is the autonomous workflow for an agent refactoring a microservice:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Generation:&lt;/strong&gt; The agent analyzes a ticket and generates a code fix with local static analysis. At this stage, the code is theoretical.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Instantiation:&lt;/strong&gt; The agent triggers a sandbox via the &lt;a href="https://thenewstack.io/why-the-mcp-server-is-now-a-critical-microservice" rel="noopener noreferrer"&gt;Model Context Protocol (MCP)&lt;/a&gt; server. This deploys only the modified workload alongside the running baseline in seconds.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Verification:&lt;/strong&gt; The agent runs integration tests against the cluster using a specific routing header. Requests route to the modified service while dependencies fall back to the baseline.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Feedback:&lt;/strong&gt; If the change breaks a downstream contract, the baseline service returns a real runtime error (e.g., 400 Bad Request). The agent captures this actual exception rather than relying on a mock.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Iteration:&lt;/strong&gt; The agent analyzes the error, refines the code to fix the integration failure, and updates the sandbox instantly. It runs the test again to confirm the fix works in the real environment.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Submission:&lt;/strong&gt; Once tests pass, the agent submits a verified pull request (PR). The human reviewer receives a sandbox link to interact with the running code immediately, bypassing local setup.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
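
&lt;p&gt;The steps above can be sketched as a simple control loop. This is a hypothetical outline, not a real agent framework: the generate, deploy, and run_tests callables stand in for the model call, the sandbox update, and the integration test run, and the attempt limit is an assumption I added so the loop always terminates.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RunResult:
    passed: bool
    error: Optional[str] = None  # e.g. "400 Bad Request" from a baseline service


def verification_loop(
    generate: Callable[[Optional[str]], str],
    deploy: Callable[[str], None],
    run_tests: Callable[[], RunResult],
    max_attempts: int = 5,
) -> Optional[str]:
    """Return a verified patch, or None if no attempt passed."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        patch = generate(feedback)   # Generation (or Iteration, given feedback)
        deploy(patch)                # Instantiation: create or update the sandbox
        result = run_tests()         # Verification against the shared cluster
        if result.passed:
            return patch             # Submission: safe to open the PR
        feedback = result.error      # Feedback: the real runtime error
    return None
```

&lt;p&gt;The important design choice is that the only thing fed back into generation is the error observed at runtime. If the loop exhausts its attempts, it returns nothing, and a human takes over with the last error in hand.&lt;/p&gt;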

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpo1vbnxj6l6ptqdc6ac3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpo1vbnxj6l6ptqdc6ac3.png" alt=" " width="800" height="463"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why engineering’s future is autonomous
&lt;/h3&gt;

&lt;p&gt;As we scale the use of AI agents, the bottleneck moves from the keyboard to the infrastructure. If we treat agents as faster typists but force them to wait for slow &lt;a href="https://thenewstack.io/your-ci-cd-pipeline-is-not-ready-to-ship-ai-agents" rel="noopener noreferrer"&gt;legacy CI/CD pipelines&lt;/a&gt;, we gain nothing. We simply build a longer queue of unverified pull requests.&lt;/p&gt;

&lt;p&gt;To move toward a truly autonomous engineering workforce, we must give agents the ability to see. They need to see how their code performs in the real world rather than just in a text editor. They need to experience the friction of deployment and the reality of network calls. This is &lt;a href="https://www.signadot.com/" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt;’s approach.&lt;/p&gt;

&lt;p&gt;Environment virtualization is shifting from a tool for developer experience to foundational infrastructure. By closing the loop, agents can do the messy and iterative work of integration. This leaves architects and engineers free to focus on system design, high-level intent, and the creative aspects of building software.&lt;/p&gt;

</description>
      <category>aiagents</category>
      <category>kubernetes</category>
      <category>devops</category>
      <category>platformengineering</category>
    </item>
    <item>
      <title>Traditional Code Review Is Dead. What Comes Next?</title>
      <dc:creator>Signadot</dc:creator>
      <pubDate>Tue, 27 Jan 2026 18:22:10 +0000</pubDate>
      <link>https://dev.to/signadot/traditional-code-review-is-dead-what-comes-next-41oi</link>
      <guid>https://dev.to/signadot/traditional-code-review-is-dead-what-comes-next-41oi</guid>
      <description>&lt;p&gt;&lt;em&gt;Read this blog on &lt;a href="https://www.signadot.com/blog/traditional-code-review-is-dead-what-comes-next?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=Traditional+Code+Review+Is+Dead.+What+Comes+Next%3F" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I noticed a quiet shift in our engineering team recently that brought me to a broader realization about the future of software development: Code review has changed fundamentally.&lt;/p&gt;

&lt;p&gt;It started with a pull request (PR). An engineer had used an agent to generate the entire change, iterating with it to define business logic, but ultimately relying on the agent to write the code. It was a substantial chunk of work. The code was syntactically perfect. It followed our linting rules. It even included unit tests that passed green.&lt;/p&gt;

&lt;p&gt;The human reviewer, a senior engineer who is usually meticulous about architectural patterns and naming conventions, approved it almost immediately. The time between the PR opening and the approval was less than two minutes.&lt;/p&gt;

&lt;p&gt;When I asked about the speed of the approval, they said they checked if the output was correct and moved on. They did not feel the need to parse every line of syntax because it was written by an agent. They spun up the deploy preview, clicked the buttons, verified the state changes and merged it.&lt;/p&gt;

&lt;p&gt;This made sense, but it still took me by surprise. I realized that I was witnessing the silent death of traditional code review.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Silent Death of the Code Review
&lt;/h3&gt;

&lt;p&gt;For decades, the peer review process has been the primary quality gate in software engineering. Humans reading code written by other humans served two critical purposes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It caught logic bugs that automated tests missed.&lt;/li&gt;
&lt;li&gt;It maintained a shared mental model of the codebase across the team.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The assumption behind this process was that code is a scarce resource produced slowly. A human developer might write 50 to 100 lines of meaningful code in a day. Another human can reasonably review that volume while maintaining high cognitive focus.&lt;/p&gt;

&lt;p&gt;But we are entering an era where code is becoming abundant and cheap. In fact, the precise goal of implementing coding agents is to generate code at a velocity and volume that by design makes it impossible for humans to keep up.&lt;/p&gt;

&lt;p&gt;When an engineer sees a massive block of AI-generated code, the instinct is to offload the syntax-checking to the machine. If the linter is happy and the tests pass, the human assumes the code is valid. The rigorous line-by-line inspection vanishes.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem: AI Trust and the Rubber Stamp
&lt;/h3&gt;

&lt;p&gt;This shift leads to what I call the rubber stamp effect. We see an “LGTM” (looks good to me) approval on code that nobody actually read.&lt;/p&gt;

&lt;p&gt;This significantly changes the risk profile. Human errors usually manifest as syntax errors or obvious logic gaps. AI errors are different. Large language models (LLMs) often hallucinate plausible but functionally incorrect code.&lt;/p&gt;

&lt;p&gt;Traditional diff-based review tools are ill-equipped for this. A diff shows you what changed in the text file. It does not show you the emergent behavior of that change. When a human writes code, the diff is a representation of their intent. When an AI writes code, the diff is just a large volume of tokens that may or may not align with the prompt.&lt;/p&gt;

&lt;p&gt;We are moving from a syntax-first culture to an outcome-first culture. The question is no longer “Did you write this correctly?” The question is “Does this do what we asked the agent to do?”&lt;/p&gt;

&lt;h3&gt;
  
  
  Previews as the New Source of Truth
&lt;/h3&gt;

&lt;p&gt;In this new world, where engineers are logic architects who offload the writing of code to agents, the most important artifact is not the code. It is the preview.&lt;/p&gt;

&lt;p&gt;If we cannot rely on humans to read the code, we must rely on humans to verify the behavior. But to verify behavior, we need more than a diff. We need a destination. The code must be deployed to a live environment where it can be exercised.&lt;/p&gt;

&lt;p&gt;While frontend previews have become standard, the critical gap, and the harder problem to solve, is the backend.&lt;/p&gt;

&lt;p&gt;Consider a change to a payment processing microservice generated by an agent. The code might look syntactically correct. The logic flow seems correct. But does it handle the race condition when two requests hit the API simultaneously? Does the new database migration lock a critical table for too long?&lt;/p&gt;

&lt;p&gt;You cannot see these problems in a text diff. You cannot even see them in a unit test mock. You can only see them when the code is running in a live, integrated environment.&lt;/p&gt;

&lt;p&gt;A backend preview environment allows for true end-to-end verification. It allows a reviewer to execute real API calls against a real database instance. It transforms the review process from a passive reading exercise into an active verification session. We are not just checking whether the code compiles. We are checking whether the system behaves.&lt;/p&gt;

&lt;p&gt;As AI agents write more code, the “review” phase of the software development life cycle must evolve into a “validation” phase. We are not reviewing the recipe. We are tasting the dish.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Infrastructure Challenge: The Concurrency Explosion
&lt;/h3&gt;

&lt;p&gt;However, this shift to outcome-based verification comes with a massive infrastructure challenge that most platform engineering teams are not ready for.&lt;/p&gt;

&lt;p&gt;A human developer typically works linearly. They open a branch, write code, open a pull request, wait for review and merge. They might context switch between two tasks, but rarely more.&lt;/p&gt;

&lt;p&gt;AI agents work in parallel. An agent tasked with fixing a bug might spin up 10 different strategies to solve it. It could open 10 parallel pull requests, each with a different implementation, and ask the human to select the best one.&lt;/p&gt;

&lt;p&gt;This creates an explosion of concurrency.&lt;/p&gt;

&lt;p&gt;Traditional &lt;a href="https://www.signadot.com/blog/your-ci-cd-pipeline-wasnt-built-for-microservices" rel="noopener noreferrer"&gt;CI/CD pipelines are built&lt;/a&gt; for linear human workflows. They assume a limited number of concurrent builds. If your AI agent opens 20 parallel sessions to test different hypotheses, you face two prohibitive problems: cost and contention.&lt;/p&gt;

&lt;p&gt;First, you cannot have 20 full-scale staging &lt;a href="https://www.signadot.com/blog/reimagining-environments-for-cloud-native-architectures" rel="noopener noreferrer"&gt;environments spinning up on expensive cloud instances&lt;/a&gt;. Imagine spinning up a dedicated Kubernetes cluster and database for 20 variations of a single bug fix. The cloud costs would be astronomical.&lt;/p&gt;

&lt;p&gt;Second, and perhaps worse, is the bottleneck of shared resources. Many pipelines rely on a single &lt;a href="https://www.signadot.com/blog/why-staging-doesnt-scale-for-microservice-testing" rel="noopener noreferrer"&gt;staging environment or limited testing slots&lt;/a&gt;. To avoid data collisions, these systems force PRs into a queue.&lt;/p&gt;

&lt;p&gt;With existing human engineering teams, these queues are already a frustrating bottleneck. With multiple agents dumping 20 PRs into the pipe simultaneously, the queue becomes a deadlock. The alternative of running them all at once on shared infrastructure results in race conditions and flaky tests.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frj98b9ys9f5u0y7x9wy6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frj98b9ys9f5u0y7x9wy6.png" alt=" " width="800" height="410"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Scaling Development With Environment Virtualization
&lt;/h3&gt;

&lt;p&gt;To scale agent-driven development, we cannot rely on infrastructure built for linear human pacing. We are talking about potentially hundreds of concurrent agents generating PRs in parallel, all of which need to be validated with previews. Cloning the entire stack for each one is not a viable option.&lt;/p&gt;

&lt;p&gt;The solution is to multiplex these &lt;a href="https://www.signadot.com/blog/smart-ephemeral-environments-share-more-copy-less" rel="noopener noreferrer"&gt;environments on shared infrastructure&lt;/a&gt;. Just as a single physical computer can host multiple virtual machines (VMs), a single Kubernetes cluster can multiplex thousands of lightweight, ephemeral environments.&lt;/p&gt;

&lt;p&gt;By applying smart isolation techniques at the application layer, we can provide strict separation for each agent’s work without duplicating the underlying infrastructure. This allows us to spin up a dedicated sandbox for every change, ensuring agents can work in parallel and validate code end-to-end without stepping on each other’s toes or exploding cloud costs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;There is a clear shift happening in the way we review changes. As agents take over the writing of code, the review process naturally evolves from checking syntax to verifying behavior. The preview is no longer just a convenience. It is the only scalable way to validate the work that agents produce.&lt;/p&gt;

&lt;p&gt;At &lt;a href="https://www.signadot.com/" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt;, we are building for this future. We provide the orchestration layer that enables fleets of agents to work in parallel, generating and validating code end-to-end in a closed loop with instant, cost-effective previews.&lt;/p&gt;

&lt;p&gt;The winners of the next era won’t be the teams with the best style guides, but those who can handle the parallelism of AI agents without exploding their cloud budgets or bringing their CI/CD pipelines to a grinding halt.&lt;/p&gt;

&lt;p&gt;In an AI-first world, reading code is a luxury we can no longer afford. Verification is the new standard. If you cannot preview it, you cannot ship it.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>kubernetes</category>
      <category>testing</category>
    </item>
    <item>
      <title>Merging To Test Is Killing Your Microservices Velocity</title>
      <dc:creator>Signadot</dc:creator>
      <pubDate>Mon, 19 Jan 2026 19:57:45 +0000</pubDate>
      <link>https://dev.to/signadot/merging-to-test-is-killing-your-microservices-velocity-4do0</link>
      <guid>https://dev.to/signadot/merging-to-test-is-killing-your-microservices-velocity-4do0</guid>
      <description>&lt;p&gt;&lt;em&gt;Read this blog on &lt;a href="https://www.signadot.com/blog/merging-to-test-is-killing-your-microservices-velocity?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=Merging+To+Test+Is+Killing+Your+Microservices+Velocity" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Frontend and data layers have evolved with branch-based previews and isolated environments. Why is the backend service layer stuck with shared staging?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you are a platform engineer or an engineering leader, look at your current development pipeline. Is everything treated equally?&lt;/p&gt;

&lt;p&gt;To me, it seems that there is a glaring discrepancy in the way we treat different parts of the stack.&lt;/p&gt;

&lt;p&gt;When a frontend developer pushes code to a feature branch, tools like Vercel or Netlify immediately spin up a deploy preview. It is a unique URL, isolated from production, where they can click around and validate changes instantly.&lt;/p&gt;

&lt;p&gt;When a database engineer needs to test a schema migration, modern platforms like Neon or PlanetScale allow them to branch the database. They get an isolated, copy-on-write clone of the production data to wreck and repair without affecting a single real user.&lt;/p&gt;

&lt;p&gt;But what happens when a backend engineer pushes a change to one microservice in a mesh of 50?&lt;/p&gt;

&lt;p&gt;Nothing.&lt;/p&gt;

&lt;p&gt;There is a gaping hole in the middle of our cloud native stack. While frontend and data layers have evolved to embrace branch-based development, the backend service layer is stuck in the stone age of shared environments.&lt;/p&gt;

&lt;p&gt;This isn’t just an annoyance. It is the primary bottleneck preventing teams from truly shifting left.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fimiry5qfmqr8w6e3jkx1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fimiry5qfmqr8w6e3jkx1.png" alt=" " width="800" height="426"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Merge To Validate Anti-Pattern
&lt;/h3&gt;

&lt;p&gt;In most distributed architectures, a developer working on a backend service cannot realistically run the entire platform on their laptop. It is too heavy.&lt;/p&gt;

&lt;p&gt;So they rely on &lt;a href="https://www.signadot.com/blog/why-mocks-fail-real-environment-testing-for-microservices" rel="noopener noreferrer"&gt;unit tests and mocks&lt;/a&gt;. But we all know that mocks are liars. They do not catch the contract drift between services or the latency issues that only appear over the network.&lt;/p&gt;

&lt;p&gt;To get real validation, the &lt;a href="https://www.signadot.com/blog/the-struggle-to-test-microservices-before-merging" rel="noopener noreferrer"&gt;developer has to merge&lt;/a&gt; their branch to the main trunk so it can be deployed to a shared staging environment.&lt;/p&gt;

&lt;p&gt;This is where velocity goes to die.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The queue:&lt;/strong&gt; Developers wait in line to deploy to staging.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The block:&lt;/strong&gt; If one developer breaks staging, everyone is blocked.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The noise:&lt;/strong&gt; A test fails, but is it your code, or did someone else deploy a bad config to the auth-service five minutes ago?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We have normalized this dysfunction. We treat staging as a fragile, sacred monolith. But in an era where we want to deploy multiple times a day, merging to trunk just to see if your code works is backward. It is like pouring concrete before you have checked the blueprints.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution: Service Branching
&lt;/h3&gt;

&lt;p&gt;We need to bring the Vercel and Neon experience to the Kubernetes backend. We need service branching.&lt;/p&gt;

&lt;p&gt;The goal is simple. Every git branch should result in a testable, isolated environment.&lt;/p&gt;

&lt;p&gt;However, the physics of microservices makes this hard. You cannot duplicate a cluster with 100+ services for every single pull request. The cost and spin-up time would be prohibitive.&lt;/p&gt;

&lt;p&gt;The solution is not duplication. It is isolation.&lt;/p&gt;

&lt;p&gt;Imagine a base environment, your existing staging cluster, that runs the stable version of all your services. When a developer pushes a change to a specific service, the platform shouldn’t clone the cluster. It should simply spin up a lightweight sandbox containing only the modified service.&lt;/p&gt;

&lt;p&gt;Smart routing does the rest:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Standard traffic flows through the stable baseline.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.signadot.com/blog/sandbox-testing-the-devex-game-changer-for-microservices" rel="noopener noreferrer"&gt;Test traffic is intercepted and rerouted only to the sandboxed service.&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;If the sandboxed service needs to call other services, it routes back into the stable baseline.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This gives the developer the experience of a full, dedicated environment with a fraction of the infrastructure footprint.&lt;/p&gt;

&lt;h3&gt;
  
  
  The New Mental Model: Git Equals Environment
&lt;/h3&gt;

&lt;p&gt;For this to work at scale, platform engineers need to provide a clean mental model that maps source code directly to infrastructure.&lt;/p&gt;

&lt;p&gt;This is what the new standard looks like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Trunk (main) corresponds to the baseline environment (staging). This is the source of truth. It represents the stable state of the world where all services are interacting as expected.&lt;/li&gt;
&lt;li&gt;Feature branch (feat-xyz) corresponds to a sandbox environment. This is ephemeral. It lives only as long as the PR is open. It contains only the delta of the services that have changed in that specific branch.&lt;/li&gt;
&lt;/ul&gt;
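
&lt;p&gt;The mapping above is simple enough to write down as a function. The sketch below is one hypothetical way a platform team might encode it. The sbx- prefix, the naming rule, and the use of "*" to mean every service are assumptions for this example rather than an established convention.&lt;/p&gt;

```python
import re
from typing import Dict, List

BASELINE_BRANCH = "main"


def environment_for(branch: str, changed_services: List[str]) -> Dict[str, object]:
    """Map a git branch to the environment that should exist for it."""
    if branch == BASELINE_BRANCH:
        # Trunk is the source of truth: the long-lived staging baseline.
        return {"kind": "baseline", "name": "staging", "services": ["*"]}
    # Derive a DNS-friendly sandbox name from the branch (assumed rule).
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    # The sandbox holds only the delta: the services changed on this branch.
    return {
        "kind": "sandbox",
        "name": f"sbx-{slug}",
        "services": sorted(changed_services),
    }
```

&lt;p&gt;For example, a branch named feat/xyz that touches only user-service maps to a sandbox called sbx-feat-xyz containing that single service, and the sandbox can be deleted as soon as the PR closes.&lt;/p&gt;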

&lt;p&gt;When a developer opens a PR, they do not need to think about clusters or namespaces. They just get a dedicated playground that mirrors their branch perfectly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdazpgwch5no3nk9hk3rn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdazpgwch5no3nk9hk3rn.png" alt=" " width="800" height="516"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Holy Grail: The Full Virtual Stack
&lt;/h3&gt;

&lt;p&gt;When you combine this service branching approach with the existing tools for frontend and database branching, you unlock something powerful: a full virtual stack per branch.&lt;/p&gt;

&lt;p&gt;Imagine a workflow where a developer creates a branch, and magically, a complete, isolated environment materializes. To the developer, it feels like they have their own private copy of the entire company’s infrastructure.&lt;/p&gt;

&lt;p&gt;This includes the frontend, backend services and database schemas, all aligned to the developer’s specific code changes.&lt;/p&gt;

&lt;p&gt;They can run end-to-end integration tests on their branch before merging. They can hand a URL to a product manager to demo the feature. They can validate complex migrations safely. It is a dedicated reality for their feature, created instantly and destroyed just as quickly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9llrwcfgh7asl0wvsamu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9llrwcfgh7asl0wvsamu.png" alt=" " width="800" height="496"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This Matters: Speed and Quality at Scale
&lt;/h3&gt;

&lt;p&gt;This model shifts the paradigm from serial blocking to massive parallelism.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Remove the bottleneck:&lt;/strong&gt; Large engineering teams no longer have to queue up for staging. You can have 10, 50 or 100 developers and agents testing simultaneously without stepping on each other’s toes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;True shift left:&lt;/strong&gt; Integration testing happens during development, not after the merge. You catch the bug when you write it, not three days later when the staging build fails.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;More quality, faster:&lt;/strong&gt; When testing is easy and isolated, people do more of it. We stop fearing deployments and start treating them as routine.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result is a software delivery pipeline that is both significantly faster and more stable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Closing the Gap
&lt;/h3&gt;

&lt;p&gt;The technology to do this exists. The patterns are proven. It is time for platform teams to stop managing static environments and start managing dynamic, ephemeral workflows.&lt;/p&gt;

&lt;p&gt;If you are looking to implement this service branching layer to complete your testing strategy, this is exactly what Signadot was built for. &lt;a href="https://www.signadot.com/?utm_content=inline+mention" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt; provides the orchestration layer that brings request-based isolation to Kubernetes.&lt;/p&gt;

&lt;p&gt;Stop merging to test. Start branching to validate.&lt;/p&gt;

</description>
      <category>cloudnative</category>
      <category>productivity</category>
      <category>devops</category>
      <category>testing</category>
    </item>
    <item>
      <title>Agentic Coding Tools Are Accelerating Output, Not Velocity</title>
      <dc:creator>Signadot</dc:creator>
      <pubDate>Mon, 12 Jan 2026 16:14:19 +0000</pubDate>
      <link>https://dev.to/signadot/agentic-coding-tools-are-accelerating-output-not-velocity-51fc</link>
      <guid>https://dev.to/signadot/agentic-coding-tools-are-accelerating-output-not-velocity-51fc</guid>
      <description>&lt;p&gt;&lt;em&gt;Read this article on &lt;a href="https://www.signadot.com/blog/agentic-coding-tools-are-accelerating-output-not-velocity?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=Agentic+Coding+Tools+Are+Accelerating+Output%2C+Not+Velocity" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;We are at a pivotal moment in software development. In just a few years, LLMs have gone from impressive chatbots to full-blown coding agents built directly into your IDE.  &lt;/p&gt;

&lt;p&gt;This has the entire industry racing to accelerate developer velocity with AI. With &lt;a href="https://survey.stackoverflow.co/2025/" rel="noopener noreferrer"&gt;84% of developers now using or planning to use AI tools&lt;/a&gt;, and companies like Google stating that 25% or more of their code is now generated by AI, the mandate is clear.&lt;/p&gt;

&lt;p&gt;So far, this hasn’t translated into significantly more code being generated, but that is already changing. Models and tooling are evolving rapidly, with agents now able to work in parallel on longer coding tasks.&lt;/p&gt;

&lt;p&gt;Right now, this workflow is still maturing. We are in a transition phase where tools are helpful but often require heavy verification, limiting the actual output of code that makes it to PR. But as the tools continue to get better, the per-developer output of code is only going to increase.&lt;/p&gt;

&lt;p&gt;And that is the goal. Teams are adopting these tools now with the expectation that they will continue to improve developer velocity, ultimately enabling the same teams to write 5x or 10x more code. &lt;/p&gt;

&lt;p&gt;But the future state where that goal is achieved presents a critical challenge. The resulting explosion in code output will inevitably make existing CI/CD bottlenecks worse in direct proportion to the coding efficiencies gained.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Invisible Queue
&lt;/h3&gt;

&lt;p&gt;Picture the emerging workflow. A developer runs three Cursor sessions at once. One refactors the auth service. Another adds a feature to the notification system. A third optimizes database queries. By the end of the day she has three PRs ready to publish. Multiply that by a hundred developers and you get a sense of the scale of code that will be hitting legacy CI/CD pipelines. &lt;/p&gt;

&lt;p&gt;Without modernizing validation infrastructure in parallel to developer tooling, the acceleration in code generation will never translate to an equal acceleration in productivity. Instead, it will create a merge crisis. All those PRs will continue to pile up at the exact same chokepoint: validation.&lt;/p&gt;

&lt;p&gt;Code review becomes the new bottleneck. Even with AI-assisted reviews, humans still need to sign off. If teams achieve their goal of 5x developer output, that translates to hundreds of PRs a week needing eyeballs, testing, and approval. The queue grows. PRs age. Conflicts multiply.&lt;/p&gt;

&lt;p&gt;But the real chaos happens post-merge. All that code lands in staging and staging breaks, because nobody could actually test how their microservice changes interact until they're all deployed together. Now 100 developers are blocked, staring at a broken staging environment, waiting for someone to untangle the mess.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Only Way Forward
&lt;/h3&gt;

&lt;p&gt;There's only one solution, and it's counterintuitive: test more, but &lt;em&gt;earlier&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Not unit tests. Not mocks. Real integration tests. End-to-end tests. Running against actual services. But here's the key: run them at the granularity of every code change, during local development, before the PR even exists.&lt;/p&gt;

&lt;p&gt;Think about it differently. Right now, that Cursor session is generating code. It writes a new API endpoint, updates three service calls, modifies a database schema. The developer reviews the code, thinks it looks good, publishes the PR. Then the waiting starts. Code review queue. CI pipeline. Manual testing in staging. Maybe it works. Maybe it doesn't. Either way, you find out hours or days later.&lt;/p&gt;

&lt;p&gt;What if instead, while Cursor is writing that code, it's also spinning up a lightweight test environment, deploying those changes, running integration tests, and validating everything works, all before the developer even switches back to that tab?&lt;/p&gt;

&lt;p&gt;That's not science fiction. That's how development needs to work in the agentic era.&lt;/p&gt;

&lt;h3&gt;
  
  
  Shifting Left, For Real This Time
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ijqjzusqhms0e1sefy5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ijqjzusqhms0e1sefy5.png" alt=" " width="800" height="385"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We've been talking about "shift left" for years. This time it's not optional. It's critical for survival.&lt;/p&gt;

&lt;p&gt;The paradigm: every unit of code produced by an AI agent gets tested end-to-end during local development. Not after merge. Not during the PR review. &lt;em&gt;While being written.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;When this works, something magical happens. The PR that lands for review isn't a leap of faith. It's already validated. The reviewer sees: "15 integration tests passed. 3 end-to-end flows validated. Contract tests clean." Code review becomes fast. Confidence is high. Merges happen quickly. Staging stays green.&lt;/p&gt;

&lt;p&gt;This is where Signadot comes in.&lt;/p&gt;

&lt;h3&gt;
  
  
  Making It Real
&lt;/h3&gt;

&lt;p&gt;The solution isn't better code review tools or faster CI pipelines. It's parallelizing validation itself.&lt;/p&gt;

&lt;h4&gt;
  
  
  Parallelize Validation
&lt;/h4&gt;

&lt;p&gt;To unblock the queue, we must enable full integration testing during local development.&lt;/p&gt;

&lt;p&gt;Signadot allows developers to use the existing staging environment as a shared baseline. By isolating requests rather than duplicating infrastructure, every developer gets a sandboxed virtual private staging environment.&lt;/p&gt;

&lt;p&gt;This allows 100+ developers to test in parallel on the same cluster without impacting each other. No queue. No contention. No waiting for staging to be free.&lt;/p&gt;

&lt;h4&gt;
  
  
  Agent-Native Infrastructure via MCP
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F57jjrrbaywqxidxfpb81.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F57jjrrbaywqxidxfpb81.png" alt=" " width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Signadot integrates directly into the AI coding workflow via the Signadot MCP (Model Context Protocol) Server.&lt;/p&gt;

&lt;p&gt;Tools like Cursor and Claude Code can now "speak" infrastructure. They use plain English commands to instantiate sandboxes, configure routing, and manage resources directly from the agent interface.&lt;/p&gt;

&lt;p&gt;The agent doesn't need you to switch contexts. It handles infrastructure the same way it handles code.&lt;/p&gt;

&lt;h4&gt;
  
  
  Bi-Directional Tunneling
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3sd6qhgcss0dikkrcfhh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3sd6qhgcss0dikkrcfhh.png" alt=" " width="800" height="309"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once the sandbox is requested, Signadot establishes a secure, bi-directional tunnel between the agent's local environment and the remote Kubernetes cluster.&lt;/p&gt;

&lt;p&gt;Requests triggered from the staging UI in the cluster are routed to your local machine. Simultaneously, your locally running service directly hits remote dependencies—database, auth, other microservices—in the cluster. You're testing against the real stack, not mocks.&lt;/p&gt;

&lt;h4&gt;
  
  
  AI-Driven Test Generation via Traffic Capture
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fww1fnefvudx4r213bqqp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fww1fnefvudx4r213bqqp.png" alt=" " width="800" height="596"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you interact with your local service, Signadot captures the actual request/response traffic flows. The coding agent analyzes this captured traffic to automatically generate functional API tests, ensuring your changes don't break existing contracts.&lt;/p&gt;

&lt;p&gt;No manual test writing. The agent sees how your service actually behaves and writes tests that validate those behaviors.&lt;/p&gt;

&lt;h4&gt;
  
  
  Autonomous Debugging Loop
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc4o5vlay00769huqjpy7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc4o5vlay00769huqjpy7.png" alt=" " width="800" height="468"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Test failed? The agent reads the live logs streamed directly from the Signadot sandbox to pinpoint the error. It corrects the code and reruns the tests instantly in the same sandbox.&lt;/p&gt;

&lt;p&gt;This closes the feedback loop without human intervention. The agent iterates until tests pass, all during local development.&lt;/p&gt;
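
&lt;p&gt;As a rough sketch, the loop described above has the following shape. The three callables and the &lt;code&gt;max_attempts&lt;/code&gt; cap are hypothetical stand-ins for whatever the agent and sandbox actually expose. The cap is an added assumption so the loop cannot run forever.&lt;/p&gt;

```python
def debug_loop(run_tests, read_logs, propose_fix, max_attempts=5):
    """Rerun tests in the same sandbox until they pass or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        if run_tests():
            return attempt  # number of test runs it took to go green
        logs = read_logs()  # hypothetical: live logs from the sandbox
        propose_fix(logs)   # hypothetical: agent edits the code
    return None  # still failing: hand the problem back to a human


# Usage with stubs: the first run fails, the "fix" clears the bug.
state = {"bug": True}
attempts = debug_loop(
    run_tests=lambda: not state["bug"],
    read_logs=lambda: "KeyError: 'user_id'",
    propose_fix=lambda logs: state.update(bug=False),
)
```

&lt;p&gt;The important property is that every iteration reuses the same sandbox, so each retry costs one test run rather than a fresh environment.&lt;/p&gt;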

&lt;p&gt;When you publish the PR, you attach the sandbox link. Your reviewer sees exactly what was tested, how it performed, and what edge cases were validated. The PR goes from "needs investigation" to "LGTM" in minutes.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Real Unlock
&lt;/h3&gt;

&lt;p&gt;Agentic coding tools promise 10x code generation and companies are racing to achieve that goal. But without granular pre-merge testing, we will remain stuck with 1x shipping velocity.&lt;/p&gt;

&lt;p&gt;The companies that will win are not the ones generating the most code. They are the ones who can validate and ship that code as fast as it is written. That means testing at the same granularity as coding. Every change. Every branch. Every experiment.&lt;/p&gt;

&lt;p&gt;The bottleneck is moving. The question is whether your testing infrastructure is moving with it.&lt;/p&gt;

&lt;p&gt;At Signadot, we're building the platform that makes agentic coding actually productive. Because code in a PR queue isn't value. Code in production is.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>kubernetes</category>
      <category>productivity</category>
    </item>
    <item>
      <title>It’s Time To Kill Staging: The Case for Testing in Production</title>
      <dc:creator>Signadot</dc:creator>
      <pubDate>Thu, 04 Dec 2025 16:02:20 +0000</pubDate>
      <link>https://dev.to/signadot/its-time-to-kill-staging-the-case-for-testing-in-production-521a</link>
      <guid>https://dev.to/signadot/its-time-to-kill-staging-the-case-for-testing-in-production-521a</guid>
      <description>&lt;p&gt;&lt;em&gt;Read this blog on &lt;a href="https://www.signadot.com/blog/its-time-to-kill-staging-the-case-for-testing-in-production?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=It%E2%80%99s+Time+To+Kill+Staging%3A+The+Case+for+Testing+in+Production" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Learn how testing in production with the right guardrails can eliminate bottlenecks, reduce costs and help your team ship more reliable code faster.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Staging has always been a necessary evil. New approaches to isolation and on-demand sandboxes have finally made it just plain evil.&lt;/p&gt;

&lt;p&gt;For decades, the staging environment has been a fixture of software development. And for just as long, it has been hated by developers everywhere. It’s the proverbial traffic jam that every developer is forced to sit in just to get their work validated.&lt;/p&gt;

&lt;p&gt;Yet staging is no longer necessary; its time has come and gone. Newer &lt;a href="https://www.signadot.com/blog/smart-ephemeral-environments-share-more-copy-less" rel="noopener noreferrer"&gt;isolation methods&lt;/a&gt; enable developers to test safely in live environments, providing fast, high-fidelity feedback that is impossible to achieve in a staging environment.&lt;/p&gt;

&lt;p&gt;It is time to kill your staging environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem We All Know
&lt;/h3&gt;

&lt;p&gt;Staging environments make sense, in theory. We must test code in a production-like environment before we ship to production. Anything else would be madness.&lt;/p&gt;

&lt;p&gt;But the cure has become its own disease. In any organization with more than a handful of microservices, the staging environment inevitably becomes a wasteland of developer pain and burned cash.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;It’s a bottleneck:&lt;/strong&gt; When 50 developers merge code, staging becomes a shared queue. Tests fail not because of bad code, but because another developer deployed a conflicting change.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It’s not production:&lt;/strong&gt; We call it “production-like,” but it never has the same data scale, traffic patterns or identity and access management (IAM) policies. This fidelity gap is where dangerous bugs hide.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It kills velocity:&lt;/strong&gt; Commit code, wait for CI, wait for a deploy slot, run a 40-minute test suite. This multihour cycle destroys flow state.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nobody maintains it:&lt;/strong&gt; Teams treat it as a dumping ground for unstable builds, further diverging from production.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We have accepted this broken workflow for 20 years. We believed it was the only way.&lt;/p&gt;

&lt;h3&gt;
  
  
  From Environment Isolation To Request Isolation
&lt;/h3&gt;

&lt;p&gt;Staging exists because of the assumption that testing must be isolated at the environment level. To test a new version of the payment service, you must deploy it to an environment that also contains a cart service, user service and auth service.&lt;/p&gt;

&lt;p&gt;This assumption is obsolete.&lt;/p&gt;

&lt;p&gt;The new model is request-level isolation. Instead of cloning an entire environment, you spin up only the service you’re changing. This model is enabled by Kubernetes-native platforms that provide on-demand sandboxes and route individual test requests to them.&lt;/p&gt;

&lt;p&gt;Here is how it works:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A new service version is spun up in an on-demand, isolated “sandbox.”&lt;/li&gt;
&lt;li&gt;When a test request is sent (tagged with a unique header), it is routed to the sandboxed service.&lt;/li&gt;
&lt;li&gt;As that service calls its dependencies, those calls are routed back to the stable, baseline services in production.&lt;/li&gt;
&lt;li&gt;The &lt;a href="https://www.signadot.com/blog/shifting-testing-left-the-request-isolation-solution" rel="noopener noreferrer"&gt;test request remains isolated&lt;/a&gt; as it travels through the stack, while all other traffic flows normally.&lt;/li&gt;
&lt;/ul&gt;
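
&lt;p&gt;The steps above boil down to a per-hop routing decision plus header propagation. The sketch below is a minimal illustration under stated assumptions: the header name &lt;code&gt;x-sandbox-id&lt;/code&gt;, the service addresses and the in-memory tables are all hypothetical. In a real cluster this logic would typically live in a proxy or service mesh rather than in application code.&lt;/p&gt;

```python
ROUTING_HEADER = "x-sandbox-id"  # hypothetical header name

# Hypothetical routing tables: baseline services and per-sandbox overrides.
BASELINE = {"payment": "payment.prod:8080", "cart": "cart.prod:8080"}
SANDBOXES = {"sb-42": {"payment": "payment-sb-42.prod:8080"}}


def resolve(service, headers):
    """Pick the upstream address for one hop of a request."""
    sandbox_id = headers.get(ROUTING_HEADER)
    overrides = SANDBOXES.get(sandbox_id, {})
    # Only services the sandbox overrides are rerouted; the rest use baseline.
    return overrides.get(service, BASELINE[service])


def outgoing_headers(incoming):
    """Propagate the routing header so later hops stay isolated."""
    if ROUTING_HEADER in incoming:
        return {ROUTING_HEADER: incoming[ROUTING_HEADER]}
    return {}
```

&lt;p&gt;A request tagged &lt;code&gt;sb-42&lt;/code&gt; reaches the sandboxed payment service but still resolves &lt;code&gt;cart&lt;/code&gt; to the baseline, and untagged traffic never sees the sandbox at all.&lt;/p&gt;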

&lt;p&gt;With this approach, you get high-fidelity testing (real dependencies, real network policies) without the downsides of shared environments (no collisions, no queues, dramatically lower cost).&lt;/p&gt;

&lt;p&gt;Request-level isolation can be introduced into a traditional local &amp;gt; staging &amp;gt; production deployment flow to improve it, eliminating contention and long waits for a CI pipeline. But its real power lies in bypassing the need for staging altogether, enabling testing in production.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Safety Model
&lt;/h3&gt;

&lt;p&gt;Testing in production sounds dangerous. It’s not when you have the right guardrails.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Strict data isolation is critical.&lt;/strong&gt; The biggest concern is that a sandboxed service could corrupt data in other services. The solution is simple. The same routing header that isolates test traffic also routes database operations to separate test databases. For example, when a test request flows through the system, each service recognizes the test context and directs all database writes to isolated test data stores, completely separate from production databases. Test users interact only with test data. Production data remains untouched.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multitenancy provides the foundation.&lt;/strong&gt; Virtual private network (VPN) restrictions ensure test traffic originates only from authorized internal networks. Audit logs track every sandbox session for compliance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Request routing provides blast radius control.&lt;/strong&gt; Your sandbox is isolated at the request level. Your colleagues’ work is unaffected. Production traffic flows normally.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Progressive rollout remains essential.&lt;/strong&gt; Sandboxes handle preproduction validation, but you still use canary deployments, feature flags and observability to safely roll out to real users.&lt;/li&gt;
&lt;/ul&gt;
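
&lt;p&gt;The data isolation guardrail can be pictured as a small decision made before every write. This is a sketch only: the header name, the database names and the one-store-per-sandbox naming scheme are assumptions made for illustration.&lt;/p&gt;

```python
ROUTING_HEADER = "x-sandbox-id"  # hypothetical header name


def write_target(headers, production_db="orders", test_db_prefix="orders_test_"):
    """Choose the database a write should go to for this request."""
    sandbox_id = headers.get(ROUTING_HEADER)
    if sandbox_id is None:
        # Untagged traffic is real traffic and uses the production store.
        return production_db
    # Hypothetical naming scheme: one isolated test store per sandbox.
    return test_db_prefix + sandbox_id.replace("-", "_")
```

&lt;p&gt;Because the choice depends only on the presence of the header, a service that forgets to propagate it would fall back to production. That makes header propagation the part of this model that deserves the most testing.&lt;/p&gt;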

&lt;h3&gt;
  
  
  Answering the Hard Questions
&lt;/h3&gt;

&lt;p&gt;Testing in production is the logical evolution of the movement to shift testing left. But making such a foundational change to your CI/CD pipeline naturally brings up some critical questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;“How do you guarantee test traffic doesn’t corrupt production data?”&lt;/strong&gt; Test writes are isolated. The isolation header redirects all database writes to ephemeral, nonproduction data stores that are destroyed after the test. Production data is never touched.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;“What about the blast radius? How do you stop a bad test from DDoSing a downstream service?”&lt;/strong&gt; Sandboxes are “shadow” deployments built with guardrails like circuit breakers and network policies. A buggy test with runaway network requests is automatically throttled and contained, preventing it from overwhelming baseline services or affecting other users.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;“This sounds fine for simple APIs, but what about Kafka or gRPC?”&lt;/strong&gt; The isolation model is protocol-agnostic. The isolation header is propagated over gRPC or as a Kafka message header. A sandboxed consumer, for example, reads from the main topic but only processes messages with its unique sandbox ID.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;“What about compliance and audit requirements?”&lt;/strong&gt; This model is more auditable. Every sandbox is tied to a specific user and a pull request/dev session. All test traffic is explicitly tagged with a sandbox ID and user identity, creating a granular audit log that is far superior to a shared staging environment.&lt;/li&gt;
&lt;/ul&gt;
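
&lt;p&gt;For the Kafka case, the sandboxed consumer’s filter might look like the sketch below. Messages are modeled as plain dictionaries whose headers are (key, bytes) pairs, so no client library is assumed. The header name is hypothetical, and how the baseline consumer treats tagged messages is left out because the text above does not specify it.&lt;/p&gt;

```python
SANDBOX_HEADER = "x-sandbox-id"  # hypothetical header name


def sandbox_of(message):
    """Return the sandbox ID carried in the message headers, if any."""
    for key, value in message["headers"]:
        if key == SANDBOX_HEADER:
            return value.decode("utf-8")
    return None


def messages_for_sandbox(messages, sandbox_id):
    """Keep only the messages tagged with this consumer's sandbox ID."""
    return [m for m in messages if sandbox_of(m) == sandbox_id]


# Usage: three messages on the main topic, one tagged for sandbox sb-42.
topic = [
    {"value": "order-1", "headers": []},
    {"value": "order-2", "headers": [(SANDBOX_HEADER, b"sb-42")]},
    {"value": "order-3", "headers": [(SANDBOX_HEADER, b"sb-7")]},
]
mine = messages_for_sandbox(topic, "sb-42")
```

&lt;p&gt;The sandboxed consumer still reads every message on the topic; isolation comes from skipping the ones that are not its own.&lt;/p&gt;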

&lt;p&gt;Addressing these and making the shift to testing in production will inevitably involve some upfront engineering investment. However, the return on investment for that work is not just the savings from eliminating the direct infrastructure costs of your staging. It’s also a better developer experience, faster product delivery and fewer opportunities lost to competitors that can iterate and ship faster than you.&lt;/p&gt;

&lt;h3&gt;
  
  
  It’s Time To Let Go
&lt;/h3&gt;

&lt;p&gt;Eliminating staging may sound like a pipe dream, but several prominent cloud native teams like &lt;a href="https://careersatdoordash.com/blog/moving-e2e-testing-into-production-with-multi-tenancy-for-increased-speed-and-reliability/" rel="noopener noreferrer"&gt;DoorDash&lt;/a&gt; and &lt;a href="https://www.uber.com/blog/shifting-e2e-testing-left/" rel="noopener noreferrer"&gt;Uber&lt;/a&gt; have already made the shift left to testing in production. Driven by their highly complex &lt;a href="https://www.signadot.com/blog/its-time-to-kill-staging-the-case-for-testing-in-production" rel="noopener noreferrer"&gt;microservice stacks and a need to get better testing&lt;/a&gt; fidelity, they are also realizing huge infrastructure cost savings.&lt;/p&gt;

&lt;p&gt;Teams like these, which are deprecating their staging environments to test in production, represent a broader trend: the rejection of approximation in favor of reality. Staging environments are &lt;a href="https://www.signadot.com/blog/why-duplicating-environments-for-microservices-backfires" rel="noopener noreferrer"&gt;artifacts of an era when duplicating infrastructure&lt;/a&gt; was a harder problem to solve than coordinating humans around shared resources.&lt;/p&gt;

&lt;p&gt;That era is ending.&lt;/p&gt;

&lt;p&gt;The future isn’t about building better approximations of production or optimizing your CI pipeline. It’s about adopting an entirely new paradigm. The teams taking this step aren’t just moving faster and cutting costs. They’re also shipping more reliable code.&lt;/p&gt;

&lt;p&gt;It’s time to kill your staging environment.&lt;/p&gt;

</description>
      <category>testing</category>
      <category>devops</category>
      <category>discuss</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Kubernetes Isn't Your AI Bottleneck - It's Your Secret Weapon</title>
      <dc:creator>Signadot</dc:creator>
      <pubDate>Mon, 17 Nov 2025 15:24:20 +0000</pubDate>
      <link>https://dev.to/signadot/kubernetes-isnt-your-ai-bottleneck-its-your-secret-weapon-190j</link>
      <guid>https://dev.to/signadot/kubernetes-isnt-your-ai-bottleneck-its-your-secret-weapon-190j</guid>
      <description>&lt;p&gt;&lt;em&gt;Read this blog on &lt;a href="https://www.signadot.com/blog/kubernetes-isnt-your-ai-bottleneck----its-your-secret-weapon?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=Kubernetes+Isn%E2%80%99t+Your+AI+Bottleneck+%E2%80%94+It%E2%80%99s+Your+Secret+Weapon" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Let’s get one thing straight: In the AI era, the only thing that matters is experimenting faster than your competition. Generative coding assistants are enabling rapid iteration and users are flocking to whichever platform offers them the latest, most automated features.&lt;/p&gt;

&lt;p&gt;But for many established companies, there’s a huge, complex anchor slowing you down: Kubernetes (K8s).&lt;/p&gt;

&lt;p&gt;You adopted it to scale, and now it feels like a velocity tax. It’s the stumbling block that gets in the way of rapid, AI-assisted coding converting into actual product iteration. You watch agile, AI-first teams shipping 10 times faster on simpler platforms, and you worry that your “mature” stack is leaving you behind.&lt;/p&gt;

&lt;p&gt;I’ll be blunt: If you’re a five-person team trying to find product-market fit, this criticism is 100% correct. You shouldn’t be near K8s. Your constraint is discovery, not delivery. Stop worrying about infrastructure.&lt;/p&gt;

&lt;p&gt;But this article isn’t for you.&lt;/p&gt;

&lt;p&gt;This is for the teams in growth mode and beyond who feel that exact pain. The teams that need to ship fast to compete but are trapped by their own scale.&lt;/p&gt;

&lt;p&gt;I’m here to tell you Kubernetes isn’t your stumbling block. If you use it right, it’s the superpower that will let you win the AI integration race.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Bottleneck Has Moved
&lt;/h3&gt;

&lt;p&gt;Let’s be real about what’s happening. Tools like Cursor, Claude and the whole fleet of AI coding tools are ridiculously good at generating code, to the extent that they are even &lt;a href="https://www.signadot.com/blog/your-next-pull-request-will-come-from-a-product-manager" rel="noopener noreferrer"&gt;enabling a new class of code contributors&lt;/a&gt;. The “blank page” problem is vanishing. I can ask an agent to “refactor this Python service to use a new gRPC endpoint and add retry logic,” and it will generate a 90% correct pull request (PR) in 30 seconds.&lt;/p&gt;

&lt;p&gt;The bottleneck is no longer &lt;em&gt;writing&lt;/em&gt; code. It’s &lt;em&gt;validating&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;My engineers’ job is shifting from “typist” to “editor-in-chief.” Their day is less about writing code line by line and more about asking:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“The agent gave me three valid-looking approaches. Which one is actually better?”&lt;/li&gt;
&lt;li&gt;“This PR looks right, but did it silently break one of the 15 downstream services that depend on it?”&lt;/li&gt;
&lt;li&gt;“How can I test this end to end without spending two days mocking dependencies?”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The new currency for competitive advantage is speed of experimentation. The team that can validate and ship 10 AI-generated experiments while the other team is still setting up their environment is going to win. Period.&lt;/p&gt;

&lt;p&gt;And this is where K8s, the platform everyone loves to hate, becomes your secret weapon.&lt;/p&gt;

&lt;h3&gt;
  
  
  K8s Is the Ultimate Experimentation Platform
&lt;/h3&gt;

&lt;p&gt;The old complaint about K8s was, “It’s too complex! I don’t want to run 20+ microservices on my laptop just to test a one-line change.”&lt;/p&gt;

&lt;p&gt;That argument is dead. If you’re still doing that, you’re using it wrong.&lt;/p&gt;

&lt;p&gt;Tools (like my own startup, &lt;a href="https://www.signadot.com/" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt;) can help. No one runs the whole stack locally anymore. You connect your local machine to the cluster, or better yet, you get an isolated “sandbox” inside the cluster for every single PR.&lt;/p&gt;

&lt;p&gt;This is the game-changer that unlocks rapid experimentation at enterprise scale.&lt;/p&gt;

&lt;p&gt;When an engineer gets a PR from an AI agent, they shouldn’t be testing it on their MacBook. They should have a workflow that, with one click, gives them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;An ephemeral sandbox:&lt;/strong&gt; K8s is brilliant at this. It spins up a lightweight, isolated environment containing just the new version of their service.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Request-level isolation:&lt;/strong&gt; This new &lt;a href="https://www.signadot.com/blog/microservices-testing-4-use-cases-for-sandbox-environments" rel="noopener noreferrer"&gt;“sandbox” environment&lt;/a&gt; intelligently routes only the developer’s test requests to their new code. All other traffic flows to the stable “baseline” services.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faznv766cs6ugfnbnkepz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faznv766cs6ugfnbnkepz.png" alt=" " width="800" height="563"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A preview URL:&lt;/strong&gt; The developer gets a unique URL to test their AI-generated feature against the entire production-like stack, without colliding with anyone else or breaking staging.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp06g8uxa04y3ndlqb9ej.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp06g8uxa04y3ndlqb9ej.png" alt=" " width="800" height="730"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, that engineer can review three different AI-generated PRs in an hour. They can run a full suite of end-to-end (E2E) tests against each one. They’re not validating code in a vacuum; they’re validating outcomes in a real-world environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stop Validating Code and Start Validating Hypotheses
&lt;/h3&gt;

&lt;p&gt;This brings me to my last point: testing.&lt;/p&gt;

&lt;p&gt;What about running all those tests? K8s gives you the ultimate building blocks. Sure, you could just use a managed CI provider, but that’s like building your factory inside someone else’s warehouse. You’re stuck with their rules, their limitations and their pricing. You’ll inevitably outgrow it. K8s is about owning the factory itself. It gives you the control to build a custom validation platform that is tailored precisely to your company’s workflow. You’re not renting a generic tool; you’re building a durable, competitive asset.&lt;/p&gt;

&lt;p&gt;Your whole validation pipeline runs on the same platform as your application.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An agent generates a PR.&lt;/li&gt;
&lt;li&gt;CI kicks in, builds a container and deploys it to a K8s sandbox.&lt;/li&gt;
&lt;li&gt;A series of Kubernetes Jobs is triggered, hammering that &lt;a href="https://www.signadot.com/blog/sandbox-testing-the-devex-game-changer-for-microservices" rel="noopener noreferrer"&gt;sandbox’s preview URL with E2E tests&lt;/a&gt;, load tests and contract tests.&lt;/li&gt;
&lt;li&gt;The engineer gets a report: “Approach A passed all tests. Approach B failed the latency test under load.”&lt;/li&gt;
&lt;/ul&gt;
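
&lt;p&gt;The last step, the report, can be sketched in a few lines of Python. This is an illustrative sketch only: the &lt;code&gt;summarize&lt;/code&gt; function, the result layout and the approach names are assumptions made for the example, not part of any real CI tool.&lt;/p&gt;

```python
# Hypothetical sketch: turn per-approach test results into a short report.
# The data layout and all names here are illustrative assumptions.

def summarize(results):
    """results maps an approach name to a dict of {test_name: passed}."""
    lines = []
    for approach in sorted(results):
        failed = [name for name, ok in sorted(results[approach].items()) if not ok]
        if failed:
            lines.append(f"{approach} failed: {', '.join(failed)}.")
        else:
            lines.append(f"{approach} passed all tests.")
    return " ".join(lines)


results = {
    "Approach A": {"e2e": True, "load": True, "contract": True},
    "Approach B": {"e2e": True, "load": False, "contract": True},
}
print(summarize(results))
```

&lt;p&gt;In a real pipeline, the &lt;code&gt;results&lt;/code&gt; dictionary would be filled from the exit status of each test Job rather than hard-coded.&lt;/p&gt;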

&lt;p&gt;This is how you build an “experimentation engine,” not a “CI/CD pipeline.” You’re creating a factory for validating hypotheses at scale. The human’s job is to manage the factory, not turn the wrenches.&lt;/p&gt;

&lt;p&gt;So yes, K8s is complex. It’s a beast. But in an era where code is generated instantly, the battleground has shifted from creation to validation. And the only platform built to handle that level of high-concurrency, isolated, on-demand experimentation at scale is Kubernetes.&lt;/p&gt;

&lt;p&gt;If you’re in growth mode, speed of innovation is all that matters. Stop arguing about YAML and start building your experimentation engine.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>testing</category>
      <category>ai</category>
      <category>devops</category>
    </item>
    <item>
      <title>The Great AWS Outage: The $11 Billion Argument for Kubernetes</title>
      <dc:creator>Signadot</dc:creator>
      <pubDate>Mon, 10 Nov 2025 18:45:51 +0000</pubDate>
      <link>https://dev.to/signadot/the-great-aws-outage-the-11-billion-argument-for-kubernetes-5dod</link>
      <guid>https://dev.to/signadot/the-great-aws-outage-the-11-billion-argument-for-kubernetes-5dod</guid>
      <description>&lt;p&gt;&lt;em&gt;Image by Erik McLean from Unsplash. Read this blog on &lt;a href="https://www.signadot.com/blog/the-great-aws-outage-the-11-billion-argument-for-kubernetes?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=The+Great+AWS+Outage%3A+The+%2411+Billion+Argument+for+Kubernetes" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Well, that was a rough week.&lt;/p&gt;

&lt;p&gt;If you’re in tech, you know exactly which week I’m talking about. The great &lt;a href="https://aws.amazon.com/?utm_content=inline+mention" rel="noopener noreferrer"&gt;AWS&lt;/a&gt; us-east-1 outage of October 2025. Where were you when the alerts started firing?&lt;/p&gt;

&lt;p&gt;I was in the middle of a demo, and suddenly, nothing worked. Not our app, not our auth, not our CI/CD pipelines. It was a digital ghost town. For eight hours, a massive chunk of the internet (from streaming services and e-commerce sites to, terrifyingly, financial and healthcare platforms) simply vanished.&lt;/p&gt;

&lt;p&gt;In the days and weeks since, the postmortems have rolled in. Publications like &lt;a href="https://www.tomsguide.com/news/live/amazon-outage-october-2025" rel="noopener noreferrer"&gt;Tom’s Guide&lt;/a&gt; chronicled the massive, cascading impact, while &lt;a href="https://www.forbes.com/sites/christerholloman/2025/10/20/aws-outage-billions-lost-multi-cloud-is-wall-streets-solution/" rel="noopener noreferrer"&gt;Forbes&lt;/a&gt; has been tallying the cost. The current estimate? Over $11 billion in lost revenue and market value.&lt;/p&gt;

&lt;p&gt;Eleven. Billion. Dollars.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Siren Song of the Single Cloud
&lt;/h3&gt;

&lt;p&gt;For a decade, the public cloud has been an unbelievable accelerator. We traded capital expenditure for operational expenditure, and in return, we got compute, storage and a rich ecosystem of managed services on demand. It was a fantastic deal.&lt;/p&gt;

&lt;p&gt;But we also got comfortable. We built our entire systems, our businesses, around one provider’s proprietary APIs. We built on top of DynamoDB, used Lambda for functions and wired everything together with identity and access management (IAM). It was fast, it was powerful, and it was a ticking time bomb.&lt;/p&gt;

&lt;p&gt;The October outage wasn’t just a service failure; it was the failure of the control plane. When the identity services went down, the whole house of cards collapsed. It exposed the fundamental flaw in our thinking: We built incredibly resilient applications on top of a single, massive point of failure.&lt;/p&gt;

&lt;p&gt;The Forbes article pointed out that Wall Street’s new favorite phrase is “multicloud.” And they’re not wrong. The companies that had active-active setups across AWS and Google Cloud Platform (GCP), for example, were the ones tweeting, “We’re experiencing minor issues” instead of “We are completely dead.”&lt;/p&gt;

&lt;p&gt;But for most of us, “just go multicloud” is a terrible, simplistic and wildly expensive piece of advice.&lt;/p&gt;

&lt;p&gt;It’s easy to jump on the bandwagon and blame AWS. But let’s be honest, engineering at that scale is impossibly hard. Failures will happen.&lt;/p&gt;

&lt;p&gt;Now that the dust has settled, we need to ask ourselves the real, uncomfortable question: Why were we so fragile?&lt;/p&gt;

&lt;h3&gt;
  
  
  Why ‘Multicloud’ Is a Trap (Usually)
&lt;/h3&gt;

&lt;p&gt;If your answer to the outage is to have your team also learn the ins and outs of &lt;a href="https://cloud.google.com/?utm_content=inline+mention" rel="noopener noreferrer"&gt;Google&lt;/a&gt;’s IAM, Azure’s Active Directory and all their distinct managed database services … good luck.&lt;/p&gt;

&lt;p&gt;True multicloud is hard. It’s not just running a few virtual machines (VMs) in two places. It’s:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Different APIs:&lt;/strong&gt; The way you provision a load balancer in AWS is completely different from how you do it in GCP.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Different services:&lt;/strong&gt; There is no 1:1 equivalent for every managed service. You end up building for the lowest common denominator, or worse, building two (or three) entirely separate stacks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Different tooling:&lt;/strong&gt; Your AWS-specific scripts are useless in Azure. Your entire CI/CD and observability stack may need to be duplicated or rearchitected.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach doesn’t just double your infrastructure cost; it doubles your cognitive load. You’re shipping features, managing reliability and fighting fires across two completely alien ecosystems.&lt;/p&gt;

&lt;p&gt;For years, this complexity was the price of admission for true resilience. Most of us, rightly, decided not to pay it.&lt;/p&gt;

&lt;p&gt;But what if the price of admission just dropped to zero? What if the platform we’ve been adopting for other reasons was the key all along?&lt;/p&gt;

&lt;h3&gt;
  
  
  Kubernetes as the Great Abstraction Layer
&lt;/h3&gt;

&lt;p&gt;This is where Kubernetes (K8s) changes the game.&lt;/p&gt;

&lt;p&gt;For many, K8s is just a “container orchestrator.” It’s what you use to run your microservices. But that description misses the forest for the trees.&lt;/p&gt;

&lt;p&gt;Kubernetes is a consistent, cloud-agnostic API for your entire application.&lt;/p&gt;

&lt;p&gt;Think about it. A Kubernetes &lt;code&gt;Deployment.yaml&lt;/code&gt; looks identical whether you’re submitting it to a cluster running on Amazon Elastic Kubernetes Service (EKS), Google Kubernetes Engine (GKE) or Azure Kubernetes Service (AKS). A &lt;code&gt;Service&lt;/code&gt; object abstracts away the underlying cloud load balancer. A &lt;code&gt;PersistentVolumeClaim&lt;/code&gt; abstracts away the underlying storage backend (Amazon Elastic Block Store, Google Persistent Disk, etc.).&lt;/p&gt;
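
&lt;p&gt;To make that concrete, here is a minimal Python sketch that builds one Deployment manifest and renders it for three clusters. The service name, image and cluster labels are made-up examples. The manifest is emitted as JSON, which Kubernetes accepts alongside YAML.&lt;/p&gt;

```python
import json

# Hypothetical sketch: one Deployment manifest, reused for every cluster.
# The service name, image and cluster labels are made-up examples.

def deployment_manifest(name, image, replicas=2):
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


manifest = deployment_manifest("checkout", "registry.example.com/checkout:1.0.0")
# The same document is rendered for each cluster; nothing in it names a cloud.
rendered = {cluster: json.dumps(manifest, sort_keys=True) for cluster in ("eks", "gke", "aks")}
print(len(set(rendered.values())))
```

&lt;p&gt;The script prints the number of distinct renderings, which is one. The provider-specific parts (load balancers, disks) live behind the cluster’s own controllers, not in the manifest.&lt;/p&gt;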

&lt;p&gt;K8s is the abstraction layer we’ve been missing. It’s the “operating system” for the cloud.&lt;/p&gt;

&lt;p&gt;When your application only speaks Kubernetes, you are no longer locked into a cloud provider’s proprietary APIs. You are locked into an open source standard that runs everywhere.&lt;/p&gt;

&lt;p&gt;This makes the multicloud dream a practical reality:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;True portability:&lt;/strong&gt; A container image is a container image. Your app, packaged as a container, will run identically on your laptop, in an AWS us-east-1 cluster and in a GCP europe-west2 cluster.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure as data:&lt;/strong&gt; Your application’s entire desired state is just a set of YAML or JSON files. Pointing your Argo CD or Flux (GitOps) pipeline to a new, empty cluster in a different region — or on a different cloud — is trivial.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Federation and failover:&lt;/strong&gt; With modern tooling, you can manage a fleet of clusters as one logical unit. Service meshes (like &lt;a href="https://www.signadot.com/blog/using-istio-or-linkerd-to-unlock-ephemeral-environments" rel="noopener noreferrer"&gt;Linkerd or Istio&lt;/a&gt;) can automatically route traffic away from a failing region or cloud provider, often with no human intervention.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Adopting Kubernetes isn’t just about container orchestration. It’s a strategic business decision to buy back your freedom. It’s how you save billions of dollars, not just by surviving an outage, but by not having to build and maintain N different versions of your platform.&lt;/p&gt;

&lt;h3&gt;
  
  
  Beyond Resilience: The Real Win Is Velocity
&lt;/h3&gt;

&lt;p&gt;Here’s the part that most “multicloud” think pieces are missing. Focusing on Kubernetes purely for disaster recovery is like buying an F1 car to go grocery shopping.&lt;/p&gt;

&lt;p&gt;The real, day-to-day magic of Kubernetes is what it does for your developer productivity.&lt;/p&gt;

&lt;p&gt;We are living in a new, AI-native world. We have copilots and agents generating enormous amounts of code. The &lt;a href="https://www.signadot.com/blog/why-staging-is-a-bottleneck-for-microservice-testing" rel="noopener noreferrer"&gt;bottleneck in software&lt;/a&gt; is no longer writing code; it’s testing and validating it.&lt;/p&gt;

&lt;p&gt;How can you be sure your AI-generated (or junior dev-generated) change doesn’t break one of the 50 other microservices it has to talk to?&lt;/p&gt;

&lt;p&gt;The old way was to have a &lt;a href="https://www.signadot.com/blog/smart-ephemeral-environments-share-more-copy-less" rel="noopener noreferrer"&gt;shared staging environment&lt;/a&gt;. A single, brittle, always-broken “God” environment that everyone was terrified to touch. It was a permanent bottleneck. But, used the right way, Kubernetes can be a velocity supercharger.&lt;/p&gt;

&lt;p&gt;With its native concepts of namespaces, resource quotas and network policies, Kubernetes is an excellent multitenant platform. This multitenancy unlocks a far more powerful and scalable model than simply spinning up complete, &lt;a href="https://www.signadot.com/blog/shifting-testing-left-the-request-isolation-solution" rel="noopener noreferrer"&gt;isolated copies of your entire stack for every pull request&lt;/a&gt; — a strategy that quickly becomes unfeasible with dozens or hundreds of microservices.&lt;/p&gt;

&lt;p&gt;Imagine this more advanced approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A developer opens a pull request (PR) with a change to a single microservice.&lt;/li&gt;
&lt;li&gt;A CI/CD pipeline instantly spins up only that changed service.&lt;/li&gt;
&lt;li&gt;Using a service mesh (like Linkerd or Istio) and context-aware routing, the &lt;a href="https://www.signadot.com/blog/streamlining-microservices-testing-with-platform-engineering" rel="noopener noreferrer"&gt;platform creates a “virtual” test&lt;/a&gt; environment.&lt;/li&gt;
&lt;li&gt;When a developer or an automated end-to-end test sends a request to this environment (e.g., by adding a special HTTP header), the mesh intelligently routes that request.&lt;/li&gt;
&lt;li&gt;Requests for the changed service go to the new version. Requests for all other (unchanged) services are routed to the stable, shared baseline stack.&lt;/li&gt;
&lt;li&gt;The developer gets a high-fidelity, isolated test against the full stack, but without the massive overhead of duplicating it.&lt;/li&gt;
&lt;li&gt;Once the PR is merged, only that single, lightweight namespace is destroyed.&lt;/li&gt;
&lt;/ul&gt;
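
&lt;p&gt;The routing decision in the middle of that list can be sketched as a small Python function. This is an illustration under stated assumptions: the header name, the service names and the sandbox registry are all hypothetical, and a real mesh would express the same rule in its own configuration rather than in application code.&lt;/p&gt;

```python
# Hypothetical sketch of request-level routing. The header name, service names
# and sandbox registry are illustrative assumptions, not a real mesh API.

ROUTING_HEADER = "x-routing-key"  # assumed header name

# Which services each sandbox overrides; everything else uses the baseline.
SANDBOXES = {
    "pr-123": {"checkout": "checkout-pr-123"},
}


def pick_backend(service, headers):
    """Return the workload that should serve a request for `service`."""
    key = headers.get(ROUTING_HEADER)
    overrides = SANDBOXES.get(key, {})
    return overrides.get(service, f"{service}-baseline")


print(pick_backend("checkout", {ROUTING_HEADER: "pr-123"}))
print(pick_backend("payment", {ROUTING_HEADER: "pr-123"}))
print(pick_backend("checkout", {}))
```

&lt;p&gt;Only requests that carry the sandbox’s key and target the changed service reach the new version. Everything else, including untagged traffic, lands on the shared baseline.&lt;/p&gt;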

&lt;p&gt;This is the holy grail of CI/CD. It gives teams the confidence to merge and deploy 50+ times a day. And it’s something that is simply not feasible, financially or technically, on a traditional VM-based architecture.&lt;/p&gt;

&lt;h3&gt;
  
  
  Don’t Focus on the Last Outage, Prep for the Next Decade
&lt;/h3&gt;

&lt;p&gt;The AWS outage was a painful, expensive lesson in the fragility of concentration. Yes, Kubernetes is the technical solution that enables the multiregion and multicloud resilience that Wall Street is now demanding.&lt;/p&gt;

&lt;p&gt;But that’s just the beginning.&lt;/p&gt;

&lt;p&gt;Don’t adopt K8s just to survive the next provider outage. Adopt it to build a platform that is resilient to the bottlenecks in your own development process. Adopt it so your teams can ship faster, safer and with more confidence.&lt;/p&gt;

&lt;p&gt;This is the vision that excites me. At Signadot, we see Kubernetes as the ultimate foundation for developer productivity — a platform that lets every developer get an isolated, high-fidelity test environment on demand, even in this new AI-driven world of constant, rapid change. (You can see more about this approach in our &lt;a href="https://www.signadot.com/docs/overview/?utm_source=the+new+stack&amp;amp;utm_medium=referral&amp;amp;utm_campaign=tns+platform" rel="noopener noreferrer"&gt;docs&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;The future of software is fast, distributed and complex. Stop building castles on a single provider’s sand. It’s time to build on rock.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>cloud</category>
      <category>platformengineering</category>
      <category>devops</category>
    </item>
    <item>
      <title>Developer-First Traffic Management: See Live Traffic &amp; Override APIs Instantly</title>
      <dc:creator>Signadot</dc:creator>
      <pubDate>Thu, 06 Nov 2025 15:46:06 +0000</pubDate>
      <link>https://dev.to/signadot/developer-first-traffic-management-see-live-traffic-override-apis-instantly-peh</link>
      <guid>https://dev.to/signadot/developer-first-traffic-management-see-live-traffic-override-apis-instantly-peh</guid>
      <description>&lt;p&gt;&lt;em&gt;Image by Guerilla Buzz from Unsplash&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read this article on &lt;a href="https://www.signadot.com/blog/developer-first-traffic-management-see-live-traffic-override-apis-instantly?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=Developer-First+Traffic+Management%3A+See+Live+Traffic+%26+Override+APIs+Instantly" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;What if developers could see live traffic flowing in a complex cluster, and then instantly mock or override any API in that microservices stack for testing?&lt;/p&gt;

&lt;p&gt;Service meshes are the undisputed titans of modern infrastructure, giving platform engineers unprecedented control over production traffic. But they were built for platform engineers, not developers. When you hand the keys to that complex engine to a developer who just wants to take a test drive, you get friction, slowdowns, and a development cycle that feels more like navigating a bureaucracy than building innovative software.&lt;/p&gt;

&lt;p&gt;This isn't a critique of the meshes themselves; it's a critique of how they're used. We’ve learned a hard lesson: tools built for the operational rigor of production are often the wrong tools for the creative chaos of development.&lt;/p&gt;

&lt;p&gt;This post focuses on &lt;em&gt;what&lt;/em&gt; we built to solve this problem: a new set of developer-first traffic management tools that puts the developer back in the driver's seat. In a follow-up post (Part 2), we'll dive deep into &lt;em&gt;how&lt;/em&gt; we built it.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Developer's Dilemma: A Day in the Life
&lt;/h3&gt;

&lt;p&gt;Let's ground this in a concrete example. Imagine you're a developer working on a large e-commerce application, tasked with adding a feature to the &lt;code&gt;checkout-service&lt;/code&gt;. To test it properly, you need it to interact with the real &lt;code&gt;payment-service&lt;/code&gt; and &lt;code&gt;inventory-service&lt;/code&gt; running in a shared staging environment.&lt;/p&gt;

&lt;p&gt;Testing this change the "old way" presented a terrible choice: attempt to run the entire stack locally (a fantasy), or redeploy the entire service for every minor code change (a nightmare). We've previously solved this core problem with &lt;strong&gt;Sandboxes&lt;/strong&gt;: a way to create an isolated environment that intelligently routes all traffic for &lt;em&gt;your&lt;/em&gt; &lt;code&gt;checkout-service&lt;/code&gt; to your local workstation, while all other service-to-service traffic remains unaffected.&lt;/p&gt;

&lt;p&gt;This is a massive leap forward. But as developers started using this power, they wanted to do more. The "all-or-nothing" routing of an entire service was great, but it didn't solve for more granular, complex scenarios. Developers still needed to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deeply inspect live traffic:&lt;/strong&gt; Understand the &lt;em&gt;actual&lt;/em&gt; request and response payloads flowing between services, not just what the (often stale) documentation claims.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reroute a single API call:&lt;/strong&gt; Make a granular change to just one API endpoint (e.g., &lt;code&gt;POST /checkout/new-feature&lt;/code&gt;) to test a new function, while letting the stable cluster version handle all other calls (e.g., &lt;code&gt;GET /checkout/history&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mock specific network calls:&lt;/strong&gt; Test failure modes by mocking a specific response (e.g., what happens when the &lt;code&gt;payment-service&lt;/code&gt; returns a &lt;code&gt;503&lt;/code&gt;?) without writing a ton of custom mocking code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debug and reproduce specific cases:&lt;/strong&gt; Isolate a single problematic API call to reproduce a bug, without having to change the entire running service.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Technically, most of these things are possible with a service mesh. But as we quickly learned, "possible" is not the same as "practical."&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvcjxo86w7nm8wor0qyzn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvcjxo86w7nm8wor0qyzn.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where the Mesh Falls Short for Developers&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is where the dream collides with reality. To implement that "simple" granular route, a developer must:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Become a Mesh Expert:&lt;/strong&gt; They have to understand the specific, complex CRDs of their mesh, like Istio's &lt;code&gt;VirtualService&lt;/code&gt;, &lt;code&gt;DestinationRule&lt;/code&gt;, or the Kubernetes Gateway API constructs like &lt;code&gt;HTTPRoute&lt;/code&gt;. They need to learn a platform engineer-centric YAML schema that was designed for cluster-wide operations, not for a developer's temporary workflow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Suffer High Cognitive Load:&lt;/strong&gt; Their focus shifts from "How does my feature work?" to "What's the right incantation of YAML to make the network do what I want?" Even with newer standards like the Kubernetes Gateway API, it's still a leaky abstraction that forces developers to understand and manage Kubernetes-native configuration, which is not their primary job.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Risk Breaking Things:&lt;/strong&gt; In many setups, developers might be editing shared mesh configuration files. A small typo in an Istio &lt;code&gt;VirtualService&lt;/code&gt; could inadvertently disrupt traffic for the entire team working in that shared environment. The blast radius for a simple mistake is far too large.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Using mesh-native tools for simple dev tasks felt like using a sledgehammer to crack a nut. We needed a different approach.&lt;/p&gt;

&lt;h3&gt;
  
  
  Unleashing Developer Superpowers: What We Built
&lt;/h3&gt;

&lt;p&gt;We built a new, programmable layer that sits &lt;em&gt;above&lt;/em&gt; the mesh, designed specifically as a developer-first toolkit. It provides two core capabilities to start, enabling a complete, frictionless workflow. This isn't an exhaustive list, but rather the first two powerful tools we chose to build.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Real-Time Traffic Discovery: "What's &lt;em&gt;Actually&lt;/em&gt; Happening?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before you can change an API, you have to understand it. Stale documentation and out-of-date OpenAPI specs don't cut it. Our new layer lets a developer safely tap into a sandboxed version of a service on the cluster and see the real-time flow.&lt;/p&gt;

&lt;p&gt;With a simple policy, a developer can get a "read-only" stream of all request and response payloads for a given service in their sandbox. This provides a safe, passive view into the live behavior of the system, letting them work with the ground truth, not just documentation.&lt;/p&gt;

&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/97QXE3-oFbI"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. API-Level Sandboxing: "Surgical Override for Your Local Code"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Once a developer understands the traffic, they can act. This is the killer feature. Instead of redeploying an entire service, the developer runs a single command:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;signadot local override --sandbox ... -w ... --p=80 --to=localhost:8000&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/fiWOhTmZd1U"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

&lt;p&gt;This command connects their local process (running on &lt;code&gt;localhost:8000&lt;/code&gt;) to a specific sandbox's workload port. Here's the magic: our proxy first sends the request to the local process. If that process returns the &lt;code&gt;sd-override&lt;/code&gt; header set to &lt;code&gt;true&lt;/code&gt;, its response is used (the override is successful). If it &lt;em&gt;doesn't&lt;/em&gt; return that header, the proxy seamlessly falls back, using the response from the original workload in the sandbox.&lt;/p&gt;
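
&lt;p&gt;The fallback decision described above can be sketched in Python. Only the &lt;code&gt;sd-override&lt;/code&gt; header name comes from this post; the function names, the response shape and the example paths are assumptions for illustration, not Signadot's actual implementation.&lt;/p&gt;

```python
# Hypothetical sketch of the fallback decision described above. Only the
# `sd-override` header name comes from the article; the function names and
# response shape are assumptions, not Signadot's implementation.

def resolve(request, call_local, call_sandbox):
    """Try the local process first; fall back to the sandbox workload."""
    local = call_local(request)
    headers = {k.lower(): v for k, v in local.get("headers", {}).items()}
    if headers.get("sd-override") == "true":
        return local
    return call_sandbox(request)


def local_handler(request):
    # Override one endpoint only; every other path carries no override header.
    if request["path"] == "/checkout/new-feature":
        return {"status": 200, "headers": {"sd-override": "true"}, "body": "local"}
    return {"status": 404, "headers": {}, "body": ""}


def sandbox_handler(request):
    return {"status": 200, "headers": {}, "body": "sandbox"}


print(resolve({"path": "/checkout/new-feature"}, local_handler, sandbox_handler)["body"])
print(resolve({"path": "/checkout/history"}, local_handler, sandbox_handler)["body"])
```

&lt;p&gt;In this sketch the local process decides, per request, whether it wants to answer. That is what makes a single-endpoint override possible without redeploying the service.&lt;/p&gt;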

&lt;p&gt;This simple mechanism becomes incredibly powerful when coupled with a new workload type we call "virtual." Unlike "fork" or "local" workloads, a "virtual" workload isn't a separate copy of a service; it's just a pointer to the stable, "baseline" version.&lt;/p&gt;

&lt;p&gt;By using the &lt;code&gt;override&lt;/code&gt; command against a "virtual" workload, a developer can use their sandbox's routing key (an identifier for their request) to test a single API call against the stable baseline service, but have it served by their local code. This means in just two commands, they can surgically override one API call from the baseline cluster service with code running in their local IDE, all without touching any YAML.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion: A New Framework for Developer-First Networking
&lt;/h3&gt;

&lt;p&gt;Our journey taught us a critical lesson, and it's the key takeaway we want to leave you with: &lt;strong&gt;a vast number of powerful capabilities already exist in our infrastructure, but they aren't exposed at the right level of abstraction. When we build the right developer-first layer to expose them, we can supercharge the developer workflow.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In the modern stack, where teams are building with AI and moving faster than ever, the ability to safely inspect and manipulate live traffic isn't just a convenience; it's a developer superpower. The tools built for networking and operations, like service meshes, are powerful but operate at the wrong abstraction level for this kind of development velocity. The key to unlocking developer productivity is to &lt;em&gt;partner&lt;/em&gt; with platform engineers and use a simple, programmable, developer-first layer on top of the mesh to safely expose its power.&lt;/p&gt;

&lt;p&gt;The capabilities we've shown today, live traffic discovery and API-level sandboxing, are just the beginning. We've built a general "middleware" framework that allows many more traffic capabilities to be exposed to developers safely. We're on a path to build out a full plugin system, one that empowers platform teams to write and deploy their own custom middlewares, tailored to their organization's specific needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Coming in Part 2:&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;Now that you've seen what this layer does, you're probably wondering how it works. In Part 2, we'll dive into the architecture: how we built a high-performance middleware proxy, and integrated it all with different service meshes.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>kubernetes</category>
      <category>testing</category>
      <category>servicemesh</category>
    </item>
    <item>
      <title>Using Sandboxes with Apollo GraphQL</title>
      <dc:creator>Signadot</dc:creator>
      <pubDate>Mon, 03 Nov 2025 15:35:41 +0000</pubDate>
      <link>https://dev.to/signadot/using-sandboxes-with-apollo-graphql-40ob</link>
      <guid>https://dev.to/signadot/using-sandboxes-with-apollo-graphql-40ob</guid>
      <description>&lt;p&gt;&lt;em&gt;Read this tutorial on &lt;a href="https://www.signadot.com/blog/using-sandboxes-with-apollo-graphql?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=Using+Sandboxes+with+Apollo+GraphQL" rel="noopener noreferrer"&gt;Signadot&lt;/a&gt;. See git repo &lt;a href="https://github.com/signadot/examples/tree/main/apollographql-tutorial" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;Introducing schema changes in a federated GraphQL architecture can be tricky. You need a way to safely test and validate new functionality without disrupting baseline traffic or impacting other developers. This is especially important in fast-moving teams where multiple changes can happen in parallel.&lt;/p&gt;

&lt;p&gt;In this post, we’ll walk through how to combine &lt;strong&gt;Apollo GraphQL&lt;/strong&gt; with &lt;strong&gt;Signadot Sandboxes&lt;/strong&gt; to create fully isolated environments for testing schema updates tied to pull requests.&lt;/p&gt;

&lt;h3&gt;
  
  
  Baseline Environment
&lt;/h3&gt;

&lt;p&gt;Before we dive into using Sandboxes with Apollo GraphQL, let’s set up a simple baseline environment.&lt;/p&gt;

&lt;p&gt;For this guide, we’ll use the &lt;a href="https://github.com/apollographql/supergraph-demo" rel="noopener noreferrer"&gt;Apollo Supergraph Demo&lt;/a&gt; and deploy it to a local &lt;strong&gt;Minikube&lt;/strong&gt; cluster, running side-by-side with the &lt;strong&gt;Signadot Operator&lt;/strong&gt;. This will give us a clean, realistic playground to test how Sandboxes and Apollo can work together.&lt;/p&gt;

&lt;p&gt;The demo relies on &lt;strong&gt;Apollo Managed Federation&lt;/strong&gt; and the &lt;strong&gt;Apollo Router&lt;/strong&gt;, but the same approach can be adapted to other Apollo deployments as well. Think of this as a starting point you can evolve to fit your own setup.&lt;/p&gt;

&lt;h4&gt;
  
  
  Prerequisites
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;kubectl&lt;/code&gt; and &lt;code&gt;minikube&lt;/code&gt; installed&lt;/li&gt;
&lt;li&gt;An Apollo Studio account&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;rover&lt;/code&gt; CLI installed and authenticated&lt;/li&gt;
&lt;li&gt;A Signadot account with the operator installed in the cluster&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;signadot&lt;/code&gt; CLI installed and authenticated&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Steps
&lt;/h4&gt;

&lt;p&gt;We’ll start by creating a new graph in &lt;a href="https://studio.apollographql.com/" rel="noopener noreferrer"&gt;Apollo Studio&lt;/a&gt;. Once the graph is created, export the following environment variables using the values returned by Apollo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export APOLLO_KEY=&amp;lt;generated-key&amp;gt;
export APOLLO_GRAPH_NAME=&amp;lt;graph-name&amp;gt;
export APOLLO_BASELINE_VARIANT=&amp;lt;graph-variant&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
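
&lt;p&gt;Before moving on, it can help to confirm that all three variables are actually set, since an empty value only surfaces later as a failed router deployment. The following optional Python sketch does that check. The variable names come from the step above; the &lt;code&gt;missing_vars&lt;/code&gt; helper is a hypothetical name used for illustration.&lt;/p&gt;

```python
import os

# Hypothetical pre-flight check; the variable names come from the step above.
REQUIRED = ("APOLLO_KEY", "APOLLO_GRAPH_NAME", "APOLLO_BASELINE_VARIANT")


def missing_vars(env):
    """Return the required variables that are unset or empty in `env`."""
    return [name for name in REQUIRED if not env.get(name)]


missing = missing_vars(os.environ)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All Apollo variables are set.")
```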



&lt;p&gt;With those values set, you’re ready to prepare your environment. Run the following commands in your terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Clone the signadot/examples repo
mkdir -p ~/git/signadot/
cd ~/git/signadot/
git clone https://github.com/signadot/examples.git


# Clone the apollographql/supergraph-demo repo
mkdir -p ~/git/third-party/apollographql
cd ~/git/third-party/apollographql
git clone https://github.com/apollographql/supergraph-demo.git


# Build demo images
cd ~/git/signadot/examples/apollographql-tutorial
eval $(minikube docker-env)
./build.sh


# Create and setup all the subgraphs in Apollo Studio
./setup-apollo.sh


# Create the namespace and router secret
kubectl create ns apollographql-demo
kubectl -n apollographql-demo create secret generic router \
    --from-literal=APOLLO_KEY="${APOLLO_KEY}" \
    --from-literal=APOLLO_GRAPH_NAME="${APOLLO_GRAPH_NAME}" \
    --from-literal=APOLLO_BASELINE_VARIANT="${APOLLO_BASELINE_VARIANT}"


# Deploy baseline services + the Apollo Router
kubectl apply -n apollographql-demo -f ./k8s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Validation
&lt;/h4&gt;

&lt;p&gt;At this point, your federated graph should be active and visible in Apollo Studio. You can open the Studio console to confirm that all subgraphs have been successfully registered and are part of the graph.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgyc3bl1adf0kk5rlrjoy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgyc3bl1adf0kk5rlrjoy.png" alt=" " width="800" height="429"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can also explore the router’s GraphQL Playground locally using Signadot:&lt;/p&gt;

&lt;p&gt;1) Start the local connection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;signadot local connect
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;2) Open the router playground in your browser: &lt;a href="http://router.apollographql-demo.svc:4000/" rel="noopener noreferrer"&gt;http://router.apollographql-demo.svc:4000/&lt;/a&gt;&lt;br&gt;
3) Paste and run the following GraphQL query in the editor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;query {
  allProducts {
    id
    createdBy {
      name
      email
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzz4l4wtyp20kks9y9dkw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzz4l4wtyp20kks9y9dkw.png" alt=" " width="800" height="429"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Using Sandboxes
&lt;/h3&gt;

&lt;p&gt;To safely introduce schema changes without impacting the baseline environment, we’ll leverage &lt;a href="https://www.apollographql.com/docs/graphos/platform/graph-management/variants" rel="noopener noreferrer"&gt;Apollo Variants&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The idea is simple: each sandbox corresponds to a dedicated Apollo variant, ensuring that schema changes are fully isolated and don’t interfere with the baseline graph.&lt;/p&gt;

&lt;p&gt;In a typical setup, variant creation happens automatically in the CI pipeline, usually triggered by a pull request. To support this, we provide a helper script called &lt;code&gt;clone-variant.sh&lt;/code&gt;, which uses the &lt;code&gt;rover&lt;/code&gt; CLI to fetch all subgraph schemas from the baseline supergraph and publish them into a new PR variant. From there, any specific changes from the PR can be applied on top.&lt;/p&gt;
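&lt;p&gt;To make the cloning step concrete, here is a minimal Python sketch of the &lt;code&gt;rover&lt;/code&gt; invocations such a script might issue. It is not the actual &lt;code&gt;clone-variant.sh&lt;/code&gt;: the function name, the scratch file paths, the graph and variant names in the demo, and the subgraph-to-URL mapping passed in are all hypothetical, and the sketch only builds the command lines rather than running them.&lt;/p&gt;

```python
def plan_variant_clone(graph, baseline, pr_variant, subgraphs):
    """Return the rover commands needed to copy every subgraph to a new variant.

    `subgraphs` maps a subgraph name to its routing URL. This input is an
    assumption of the sketch; a real script could discover it with
    `rover subgraph list`.
    """
    steps = []
    for name, routing_url in sorted(subgraphs.items()):
        schema_file = f"/tmp/{name}.graphql"  # hypothetical scratch path
        fetch = ["rover", "subgraph", "fetch", f"{graph}@{baseline}", "--name", name]
        publish = [
            "rover", "subgraph", "publish", f"{graph}@{pr_variant}",
            "--name", name,
            "--schema", schema_file,
            "--routing-url", routing_url,
        ]
        # The fetch command prints the schema; it would be saved to
        # `schema_file` before the publish command runs.
        steps.append({"fetch": fetch, "save_to": schema_file, "publish": publish})
    return steps


if __name__ == "__main__":
    demo_subgraphs = {
        "users": "http://users.apollographql-demo.svc:4000/graphql",
        # Hypothetical URL, following the same pattern as the users service.
        "products": "http://products.apollographql-demo.svc:4000/graphql",
    }
    for step in plan_variant_clone("my-graph", "main", "pr-1234", demo_subgraphs):
        print(" ".join(step["fetch"]), "# output saved to", step["save_to"])
        print(" ".join(step["publish"]))
```

&lt;p&gt;Each step fetches a subgraph schema from the baseline variant and republishes it under the PR variant with the same routing URL, so the new variant starts as a copy of the baseline.&lt;/p&gt;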

&lt;p&gt;In this demo, we include an alternative version of the users subgraph, which introduces a few new queries and a &lt;code&gt;phoneNumber&lt;/code&gt; field on the User entity (you can find the code &lt;a href="https://github.com/signadot/examples/tree/main/apollographql-tutorial/subgraphs/users" rel="noopener noreferrer"&gt;here&lt;/a&gt;). These changes are what we’ll be sandboxing.&lt;/p&gt;

&lt;p&gt;To continue with the example, let’s manually create a PR variant, as if this were happening in the context of PR #1234:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export PR_GRAPH_VARIANT="pr-1234"

cd ~/git/signadot/examples/apollographql-tutorial

# Create the PR variant based on the baseline
./clone-variant.sh "${APOLLO_GRAPH_NAME}" "${APOLLO_BASELINE_VARIANT}" "${PR_GRAPH_VARIANT}"

# Apply the schema changes introduced in the PR
rover subgraph publish "${APOLLO_GRAPH_NAME}@${PR_GRAPH_VARIANT}" \
  --schema ./subgraphs/users/users.graphql \
  --name users \
  --routing-url http://users.apollographql-demo.svc:4000/graphql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now let’s create a sandbox that forks both the router (to point it to the PR variant) and the users service (to use the modified image with the schema changes). Replace &lt;code&gt;&amp;lt;cluster-name&amp;gt;&lt;/code&gt; with the name of your cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;signadot sandbox apply -f - &amp;lt;&amp;lt;EOF
name: apollographql-users
spec:
  cluster: &amp;lt;cluster-name&amp;gt;
  forks:
    # Fork the router pointing it to the PR variant
    - forkOf:
        kind: Deployment
        namespace: apollographql-demo
        name: router
      customizations:
        env:
          - name: APOLLO_GRAPH_VARIANT
            value: "${PR_GRAPH_VARIANT}"

    # Fork the users workload pointing it to the new version of the service
    - forkOf:
        kind: Deployment
        namespace: apollographql-demo
        name: users
      customizations:
        images:
          - image: signadot/apollographql-demo-users-fg:latest
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this sandbox in place, any request sent using the associated routing key will resolve against the forked &lt;code&gt;users&lt;/code&gt; service and the router configured with the PR variant, giving you an isolated environment to safely test the schema changes.&lt;/p&gt;
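&lt;p&gt;Outside the browser, the same routing can be exercised from a script. The sketch below builds (but does not send) a GraphQL request that carries a routing key. It assumes the key is propagated as an &lt;code&gt;sd-routing-key&lt;/code&gt; entry in a &lt;code&gt;baggage&lt;/code&gt; header and that the router accepts POST requests at the playground URL; verify both against your own setup. The function name and the placeholder key are hypothetical.&lt;/p&gt;

```python
import json
import urllib.request

ROUTER_URL = "http://router.apollographql-demo.svc:4000/"


def build_sandbox_request(routing_key, query, url=ROUTER_URL):
    """Build (but do not send) a GraphQL POST that carries a sandbox routing key."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: the routing key travels as a baggage entry.
            "baggage": f"sd-routing-key={routing_key}",
        },
    )


if __name__ == "__main__":
    request = build_sandbox_request(
        "your-routing-key",  # placeholder, use the key of your sandbox
        "query { allUsers { name phoneNumber } }",
    )
    print(request.get_method(), request.full_url)
    print(request.header_items())
    # To actually send it (needs `signadot local connect` running):
    # urllib.request.urlopen(request)
```

&lt;p&gt;Dropping the &lt;code&gt;baggage&lt;/code&gt; header from the same request would send it to the baseline router instead.&lt;/p&gt;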

&lt;h4&gt;
  
  
  Validation
&lt;/h4&gt;

&lt;p&gt;If you open the router playground again, this time under the context of the sandbox you just created, you’ll see the new &lt;code&gt;phoneNumber&lt;/code&gt; field available on the User type, along with the &lt;code&gt;allUsers&lt;/code&gt; and &lt;code&gt;user&lt;/code&gt; queries.&lt;/p&gt;

&lt;p&gt;In this demo, we’re using the &lt;a href="https://www.signadot.com/docs/guides/developer-environments/access-sandboxes#1-chrome-extension-for-web-applications" rel="noopener noreferrer"&gt;Signadot browser extension&lt;/a&gt; to easily route traffic through the sandbox.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1lkhax4tdyyd283l507.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1lkhax4tdyyd283l507.png" alt=" " width="800" height="429"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you run the same query against the baseline environment, you’ll get an error, since neither the &lt;code&gt;allUsers&lt;/code&gt; query nor the new &lt;code&gt;phoneNumber&lt;/code&gt; field exists in that version of the graph.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2lvev1q1ncar460hq1ws.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2lvev1q1ncar460hq1ws.png" alt=" " width="800" height="429"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  How it Works
&lt;/h3&gt;

&lt;p&gt;Once the sandbox is created, the request flow changes slightly so that requests carrying the sandbox routing key are routed through the forked workloads instead of the baseline ones.&lt;/p&gt;

&lt;p&gt;Here’s how it works step by step:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcg2sef3pm8k04fwt11qu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcg2sef3pm8k04fwt11qu.png" alt=" " width="800" height="358"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Requests enter through the DevMesh Proxy in front of the Apollo Router. If the request includes the corresponding routing key, it’s automatically routed to the sandbox fork of the router (in this example, the PR #1234 environment).&lt;/li&gt;
&lt;li&gt;The sandboxed router is configured to use the PR variant we created earlier. This means it exposes the updated schema, including the new fields and queries introduced in the forked users subgraph.&lt;/li&gt;
&lt;li&gt;To resolve queries, the router uses the original routing endpoints defined in Apollo. However, since the request still carries the routing key, the DevMesh proxy forwards the request to the forked users service instead of the baseline version.&lt;/li&gt;
&lt;li&gt;The response is then returned through the same path, giving the client a fully isolated experience of the new schema, all without affecting production traffic.&lt;/li&gt;
&lt;/ol&gt;
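&lt;p&gt;The routing decision in the steps above can be modeled in a few lines of Python. This is a conceptual sketch, not the DevMesh proxy’s implementation: the &lt;code&gt;baggage&lt;/code&gt; header format, the fork names, and the shape of the &lt;code&gt;sandboxes&lt;/code&gt; table are assumptions made for illustration.&lt;/p&gt;

```python
def routing_key_from(headers):
    """Extract the sd-routing-key entry from a baggage header, if present."""
    lowered = {name.lower(): value for name, value in headers.items()}
    for member in lowered.get("baggage", "").split(","):
        key, _, value = member.strip().partition("=")
        if key == "sd-routing-key" and value:
            return value
    return None


def resolve_target(service, headers, sandboxes):
    """Pick the workload that should serve a request addressed to `service`.

    `sandboxes` maps a routing key to the forks of that sandbox, keyed by
    the baseline service name. Services without a fork fall back to baseline.
    """
    forks = sandboxes.get(routing_key_from(headers), {})
    return forks.get(service, service)


if __name__ == "__main__":
    # Hypothetical routing table for the sandbox created earlier.
    sandboxes = {"abc123": {"router": "router-fork", "users": "users-fork"}}
    with_key = {"baggage": "sd-routing-key=abc123"}
    for service in ("router", "users", "products"):
        print(service, "with key:", resolve_target(service, with_key, sandboxes))
        print(service, "without key:", resolve_target(service, {}, sandboxes))
```

&lt;p&gt;With the key present, &lt;code&gt;router&lt;/code&gt; and &lt;code&gt;users&lt;/code&gt; resolve to their forks while &lt;code&gt;products&lt;/code&gt; still resolves to the baseline workload, which is why only the forked services need to run inside the sandbox.&lt;/p&gt;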

&lt;h3&gt;
  
  
  Alternatives and Improvements
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Single Router Mode
&lt;/h4&gt;

&lt;p&gt;In this article we explored integrations using the built-in capabilities of Apollo, where each Router instance can only point to a single graph variant at a time.&lt;/p&gt;

&lt;p&gt;Because of this limitation, every sandbox that includes a schema change must also spin up its own Router instance. While this works well operationally, it is sub-optimal from a resource perspective.&lt;/p&gt;

&lt;p&gt;Ideally, we’d be able to handle multiple schema variants within a single Router. In that model, the router would dynamically resolve, load, and cache variants on a per-request basis, trading off increased memory usage for reduced infrastructure footprint.&lt;/p&gt;

&lt;p&gt;This more advanced (and therefore more complex) setup is technically possible through a native Rust plugin in the Router, which can load and cache multiple supergraph services and dispatch requests to the appropriate variant at runtime.&lt;/p&gt;

&lt;p&gt;Though this approach requires custom development and adds operational complexity, it can reduce the number of running router instances when many short-lived sandboxes are in play.&lt;/p&gt;
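&lt;p&gt;The resolve, load, and cache behavior described above boils down to a bounded per-variant cache. The following Python sketch illustrates the idea only; it is not a Router plugin (which would be written in Rust), and the class name, the &lt;code&gt;load_schema&lt;/code&gt; callback, and the least-recently-used eviction policy are assumptions.&lt;/p&gt;

```python
from collections import OrderedDict


class VariantSchemaCache:
    """Keep the most recently used supergraph schemas in memory.

    `load_schema` is a hypothetical callback that fetches the supergraph
    for a variant. `capacity` must be at least 1.
    """

    def __init__(self, load_schema, capacity):
        self._load_schema = load_schema
        self._capacity = capacity
        self._entries = OrderedDict()

    def get(self, variant):
        if variant in self._entries:
            # Cache hit: mark the variant as most recently used.
            self._entries.move_to_end(variant)
            return self._entries[variant]
        if len(self._entries) == self._capacity:
            # Cache full: evict the least recently used variant.
            self._entries.popitem(last=False)
        schema = self._load_schema(variant)
        self._entries[variant] = schema
        return schema


if __name__ == "__main__":
    def fake_loader(variant):
        print("loading supergraph for", variant)
        return f"supergraph({variant})"

    cache = VariantSchemaCache(fake_loader, capacity=2)
    for variant in ("main", "pr-1234", "main", "pr-1235", "pr-1234"):
        cache.get(variant)
```

&lt;p&gt;The capacity is the knob for the memory trade-off mentioned earlier: a larger cache holds more variants in memory, while a smaller one reloads them more often.&lt;/p&gt;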

&lt;h4&gt;
  
  
  Variant Reconciler
&lt;/h4&gt;

&lt;p&gt;In the solution described in this tutorial, the PR variant is created at CI time and remains unchanged afterward. However, in a dynamic development environment where multiple changes are happening in parallel, this can lead to schema drift between the baseline and the PR variants.&lt;/p&gt;

&lt;p&gt;For example, imagine we’re working on PR #1234. After submitting the PR and creating the variant and sandbox, another PR, say #1233, is merged, introducing changes to the products service. The baseline now includes those updates, but the PR #1234 variant is still based on the old state and doesn’t reflect the new schema.&lt;/p&gt;

&lt;p&gt;To address this issue, you can introduce a variant reconciler process. This process would detect changes introduced to the baseline variant and automatically propagate them to existing PR variants, ensuring they stay in sync with the latest baseline. By running this reconciler at the cluster level, you can maintain consistency across all active sandboxes without manual intervention.&lt;/p&gt;
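&lt;p&gt;As a rough illustration of what one reconciliation pass could compute, the Python sketch below compares the baseline against a PR variant and returns the subgraphs that should be republished. It is a sketch under assumptions: the function name is hypothetical, schemas are represented as plain strings, and the set of subgraphs owned by the PR is assumed to be known (for example, recorded when the variant was created).&lt;/p&gt;

```python
def plan_reconcile(baseline, pr_variant, pr_owned):
    """Return the subgraph schemas to republish into a PR variant.

    `baseline` and `pr_variant` map subgraph names to schema text.
    `pr_owned` is the set of subgraphs the PR itself modifies; those are
    never overwritten, so the PR's own changes survive reconciliation.
    """
    updates = {}
    for name, schema in baseline.items():
        if name in pr_owned:
            continue
        if pr_variant.get(name) != schema:
            updates[name] = schema
    return updates


if __name__ == "__main__":
    # PR #1233 was merged and changed the products subgraph in the baseline.
    baseline = {"products": "products v2", "users": "users v1"}
    # The PR #1234 variant still has the old products schema.
    pr_1234 = {"products": "products v1", "users": "users with phoneNumber"}
    print(plan_reconcile(baseline, pr_1234, pr_owned={"users"}))
```

&lt;p&gt;In the PR #1233 scenario above, only &lt;code&gt;products&lt;/code&gt; is selected for republishing, and the modified &lt;code&gt;users&lt;/code&gt; schema in the PR #1234 variant is left alone.&lt;/p&gt;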

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Using Signadot Sandboxes with Apollo GraphQL provides a clean, flexible way to test schema changes in isolation without interfering with baseline traffic or blocking other teams. By combining Apollo Variants with sandbox routing, each pull request can get its own dedicated environment, enabling faster iteration and safer releases.&lt;/p&gt;

&lt;p&gt;We also explored potential improvements, such as reducing resource overhead by handling multiple variants in a single router, or implementing a variant reconciler to keep PR variants in sync with the baseline as it evolves. These enhancements can make the workflow even more efficient at scale.&lt;/p&gt;

</description>
      <category>graphql</category>
      <category>microservices</category>
      <category>devops</category>
      <category>apollographql</category>
    </item>
  </channel>
</rss>
