<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Benjamin Cane</title>
    <description>The latest articles on DEV Community by Benjamin Cane (@madflojo).</description>
    <link>https://dev.to/madflojo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2573285%2F844e50df-c3e8-456f-a257-020f7711bb97.png</url>
      <title>DEV Community: Benjamin Cane</title>
      <link>https://dev.to/madflojo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/madflojo"/>
    <language>en</language>
    <item>
      <title>Generating Code Faster Is Only Valuable If You Can Validate Every Change With Confidence</title>
      <dc:creator>Benjamin Cane</dc:creator>
      <pubDate>Fri, 27 Mar 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/madflojo/generating-code-faster-is-only-valuable-if-you-can-validate-every-change-with-confidence-3fe3</link>
      <guid>https://dev.to/madflojo/generating-code-faster-is-only-valuable-if-you-can-validate-every-change-with-confidence-3fe3</guid>
      <description>&lt;p&gt;Generating code faster is only valuable if you can validate every change with confidence.&lt;/p&gt;

&lt;p&gt;Software engineering has never really been about writing code. Coding is often the easy part.&lt;/p&gt;

&lt;p&gt;Testing is harder, and many teams struggle with it.&lt;/p&gt;

&lt;p&gt;As tools make it easier to generate code quickly, that gap widens. If you can produce changes faster than you can validate them, you eventually create more code than you can safely operate.&lt;/p&gt;

&lt;p&gt;Which raises the question: what does good testing actually look like?&lt;/p&gt;

&lt;h2&gt;🔍 What Good Looks Like&lt;/h2&gt;

&lt;p&gt;One of the biggest challenges I see is that teams never define what “good” testing means, and so they struggle to achieve it.&lt;/p&gt;

&lt;p&gt;Pipelines are often built early in a project, when the team is small, and they rarely keep pace with the system and organization as they grow.&lt;/p&gt;

&lt;p&gt;My starting principle is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;At pull request time, you should have strong confidence that the change will not break the service or platform being modified.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Within a day of merging, you should have strong confidence that the change hasn’t broken the full customer journey that the platform supports.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;🔁 On Pull Request&lt;/h2&gt;

&lt;p&gt;For backend platforms, I like to see three levels of automated testing before merging.&lt;/p&gt;

&lt;h3&gt;Code Tests (Unit Tests)&lt;/h3&gt;

&lt;p&gt;This level is the foundation. Unit tests validate internal logic, error handling, and edge cases. Techniques such as fuzz testing and benchmarking also reveal issues early. As the test pyramid tells us, this is where the majority of testing and logic validation should take place.&lt;/p&gt;

&lt;h3&gt;Service-Level Functional Tests&lt;/h3&gt;

&lt;p&gt;Too many teams stop at unit tests for pull requests. Functional tests should also be run in CI for every pull request.&lt;/p&gt;

&lt;p&gt;Services should be tested in isolation with functional tests. External dependencies can be mocked, but infrastructure such as databases should ideally run for real (for example, in Docker containers).&lt;/p&gt;

&lt;p&gt;This is where API contracts are validated and regressions can be identified without wondering whether the issue came from this change or another service.&lt;/p&gt;

&lt;h3&gt;Platform-Level Functional Tests&lt;/h3&gt;

&lt;p&gt;Testing a service alone isn’t enough. Changes can break upstream or downstream dependencies. Platform-level tests spin up the entire platform in CI and validate that services interact correctly.&lt;/p&gt;

&lt;p&gt;These tests ensure the platform continues to work as a system.&lt;/p&gt;

&lt;p&gt;For platforms with strict latency or resiliency requirements, I recommend introducing light stress tests at both the service and platform levels. These aren’t full performance tests, but they act as early indicators of performance regressions.&lt;/p&gt;

&lt;p&gt;If these three layers pass, you should have high confidence in the change. But not complete confidence.&lt;/p&gt;

&lt;h2&gt;🌙 Nightly Testing&lt;/h2&gt;

&lt;p&gt;Some failures take time to appear.&lt;/p&gt;

&lt;p&gt;Memory leaks, performance degradation, and cross-platform integration issues may not show up immediately.&lt;/p&gt;

&lt;p&gt;That’s why I like to run a nightly build (or one every few hours).&lt;/p&gt;

&lt;p&gt;This environment runs end-to-end customer journey tests, performance tests, and chaos tests.&lt;/p&gt;

&lt;p&gt;These are typically the same tests used during release validation, but running them continuously accelerates feedback. If something breaks, you learn about it early, before the pressure of a release.&lt;/p&gt;

&lt;h2&gt;🧠 Final Thoughts&lt;/h2&gt;

&lt;p&gt;There is no universal approach everyone can follow.&lt;/p&gt;

&lt;p&gt;Different systems have different needs; mission-critical systems may focus heavily on correctness and resilience. Non-mission-critical systems may focus more on validating core functionality.&lt;/p&gt;

&lt;p&gt;Your testing strategy depends heavily on architecture, dependencies, and operational constraints. But if your organization is increasing its ability to generate code quickly, your testing capabilities must evolve at the same pace.&lt;/p&gt;

&lt;p&gt;AI-generated code becomes much easier to review when you already have high confidence in your testing.&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>programming</category>
      <category>softwareengineering</category>
      <category>testing</category>
    </item>
    <item>
      <title>When You Go to Production with gRPC, Make Sure You’ve Solved Load Distribution First</title>
      <dc:creator>Benjamin Cane</dc:creator>
      <pubDate>Fri, 20 Mar 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/madflojo/when-you-go-to-production-with-grpc-make-sure-youve-solved-load-distribution-first-30f8</link>
      <guid>https://dev.to/madflojo/when-you-go-to-production-with-grpc-make-sure-youve-solved-load-distribution-first-30f8</guid>
      <description>&lt;p&gt;When you go to production with gRPC, make sure you’ve solved load distribution first.&lt;/p&gt;

&lt;p&gt;I was recently talking with another engineer who is rolling out gRPC into production. He asked what the biggest gotchas were.&lt;/p&gt;

&lt;p&gt;My first answer: Load Distribution.&lt;/p&gt;

&lt;h2&gt;🚦 HTTP/1 vs. HTTP/2&lt;/h2&gt;

&lt;p&gt;Most teams first implement services using REST over HTTP/1 and then migrate to gRPC as they seek its performance benefits.&lt;/p&gt;

&lt;p&gt;That shift introduces a subtle but important change in how traffic gets distributed across instances.&lt;/p&gt;

&lt;p&gt;With HTTP/1, requests are generally tied closely to connections. A client opens a connection, sends a request, waits for the response, and then sends another (if connection re-use is enabled).&lt;/p&gt;

&lt;p&gt;HTTP/2 (which underpins gRPC) works differently.&lt;/p&gt;

&lt;p&gt;HTTP/2 multiplexes requests over persistent connections. A client can send many requests over the same connection without waiting for responses.&lt;/p&gt;

&lt;p&gt;This is one of the reasons gRPC provides a performance boost, but it can create unexpected load distribution issues.&lt;/p&gt;

&lt;p&gt;If your infrastructure isn’t built for an HTTP/2 world, you’ll quickly find traffic becoming unevenly distributed.&lt;/p&gt;

&lt;h2&gt;🏗️ Infrastructure Support&lt;/h2&gt;

&lt;p&gt;In an HTTP/1 world, load balancing at the connection (Layer 4) level often works well enough. But with HTTP/2, connections live much longer and carry far more concurrent traffic.&lt;/p&gt;

&lt;p&gt;If your load balancer distributes traffic based only on connections, a busy client may hammer a single instance while others sit idle.&lt;/p&gt;

&lt;p&gt;Unfortunately, much of the infrastructure still doesn’t fully support HTTP/2-aware load balancing.&lt;/p&gt;

&lt;p&gt;Depending on your environment, your load balancers or ingress controllers may operate primarily at Layer 4. That works fine for HTTP/1, but once you introduce HTTP/2 via gRPC, the effectiveness changes significantly.&lt;/p&gt;

&lt;h2&gt;⚙️ Supporting gRPC&lt;/h2&gt;

&lt;p&gt;To get the most out of gRPC, the best approach is to use infrastructure that understands HTTP/2 and load-balances requests rather than just connections.&lt;/p&gt;

&lt;p&gt;If that’s not possible, another option is client-side load balancing.&lt;/p&gt;

&lt;p&gt;Many gRPC clients support opening a pool of connections and distributing requests across them. You still benefit from HTTP/2’s persistent connections, but you avoid concentrating all traffic on a single backend instance.&lt;/p&gt;
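&lt;p&gt;Conceptually, client-side load balancing looks like the sketch below: a hand-rolled round-robin pool in Go. It’s purely illustrative; real gRPC clients provide this through built-in load-balancing policies such as &lt;code&gt;round_robin&lt;/code&gt;, so you wouldn’t write it yourself.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// conn stands in for one long-lived HTTP/2 connection to a backend.
type conn struct{ backend string }

// pool spreads requests across a fixed set of connections round-robin,
// so a single chatty client can't pin all of its traffic to one instance.
type pool struct {
	conns []*conn
	next  atomic.Uint64
}

// pick returns the next connection in rotation; safe for concurrent use.
func (p *pool) pick() *conn {
	n := p.next.Add(1) - 1
	return p.conns[n%uint64(len(p.conns))]
}

func main() {
	p := &pool{conns: []*conn{{"10.0.0.1"}, {"10.0.0.2"}, {"10.0.0.3"}}}
	// Six requests land evenly across the three backends.
	for i := 0; i < 6; i++ {
		fmt.Println(p.pick().backend)
	}
}
```

&lt;p&gt;Each connection stays persistent and multiplexed, so you keep HTTP/2’s performance benefits while the request load still spreads across backends.&lt;/p&gt;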

&lt;h2&gt;🧠 Final Thoughts&lt;/h2&gt;

&lt;p&gt;gRPC offers many advantages, including performance, strongly typed contracts, and efficient communication. But it also introduces different networking behavior.&lt;/p&gt;

&lt;p&gt;If you’re rolling out gRPC into production, make sure your load balancing infrastructure is ready for an HTTP/2 world.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>backend</category>
      <category>distributedsystems</category>
      <category>networking</category>
    </item>
    <item>
      <title>You may be building for availability, but are you building for resiliency?</title>
      <dc:creator>Benjamin Cane</dc:creator>
      <pubDate>Fri, 13 Mar 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/madflojo/you-may-be-building-for-availability-but-are-you-building-for-resiliency-34om</link>
      <guid>https://dev.to/madflojo/you-may-be-building-for-availability-but-are-you-building-for-resiliency-34om</guid>
      <description>&lt;p&gt;You may be building for availability, but are you building for resiliency? Many teams design for availability. Far fewer design for resiliency.&lt;/p&gt;

&lt;p&gt;A concept that took me a while to really grasp is that building highly available systems and highly resilient systems is not the same thing.&lt;/p&gt;

&lt;p&gt;The difference is how the system reacts to failure.&lt;/p&gt;

&lt;h2&gt;🚄 High Availability&lt;/h2&gt;

&lt;p&gt;When you build for high availability, the goal is simple: ensure there is always another path.&lt;/p&gt;

&lt;p&gt;If something fails, traffic can be redirected somewhere else.&lt;/p&gt;

&lt;p&gt;For example, a service might run across multiple availability zones or regions. If one fails, traffic is routed to another.&lt;/p&gt;

&lt;p&gt;Detecting failures and redirecting traffic are core elements of building for high availability.&lt;/p&gt;

&lt;p&gt;Availability is about rerouting traffic when something fails.&lt;/p&gt;

&lt;h2&gt;🚂 High Resiliency&lt;/h2&gt;

&lt;p&gt;Building for resiliency is different.&lt;/p&gt;

&lt;p&gt;The solution to failure isn’t another path; it’s how the system handles the error.&lt;/p&gt;

&lt;p&gt;When a dependency fails, the decision becomes:&lt;/p&gt;

&lt;p&gt;Do we retry? Do we continue without that dependency? Do we degrade functionality? Do we stop processing altogether?&lt;/p&gt;

&lt;p&gt;Resiliency is about defining what happens when things go wrong.&lt;/p&gt;

&lt;p&gt;Sometimes you can continue processing. Sometimes you can defer work and fix it later.&lt;/p&gt;

&lt;p&gt;Resiliency is absorbing failure instead of avoiding it.&lt;/p&gt;

&lt;h2&gt;🧩 A Simple Example&lt;/h2&gt;

&lt;p&gt;When you design systems with resiliency in mind, you tend to treat dependencies differently.&lt;/p&gt;

&lt;p&gt;A simple example is configuration.&lt;/p&gt;

&lt;p&gt;Many systems use distributed configuration services so that runtime behavior can change without redeployment.&lt;/p&gt;

&lt;p&gt;But that configuration service then becomes a dependency. To avoid turning it into a hard dependency, many systems cache the configuration in memory.&lt;/p&gt;

&lt;p&gt;When updates occur, the system fetches the new configuration and switches only after it’s fully loaded into memory.&lt;/p&gt;

&lt;p&gt;If configuration refresh fails, the system continues operating with the last known configuration. Transient failures don’t bring the system down.&lt;/p&gt;

&lt;p&gt;That’s resiliency.&lt;/p&gt;

&lt;h2&gt;🧠 Final Thoughts&lt;/h2&gt;

&lt;p&gt;When I talk about non-functional requirements, you’ll hear me say:&lt;/p&gt;

&lt;p&gt;“Highly available and resilient systems”&lt;/p&gt;

&lt;p&gt;I separate them intentionally because the approaches are different.&lt;/p&gt;

&lt;p&gt;Availability ensures there is always another path. Resiliency ensures the system can continue operating when failures occur.&lt;/p&gt;

&lt;p&gt;Availability routes around failure. Resiliency survives failure. You need both.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>distributedsystems</category>
      <category>sre</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>When your coding agent doesn’t understand your project, you’ll get junk</title>
      <dc:creator>Benjamin Cane</dc:creator>
      <pubDate>Fri, 06 Mar 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/madflojo/when-your-coding-agent-doesnt-understand-your-project-youll-get-junk-8b2</link>
      <guid>https://dev.to/madflojo/when-your-coding-agent-doesnt-understand-your-project-youll-get-junk-8b2</guid>
      <description>&lt;p&gt;When your coding agent doesn’t understand your project, you’ll get junk.&lt;/p&gt;

&lt;p&gt;Junk in, junk out.&lt;/p&gt;

&lt;p&gt;One of the best ways to get more from agentic coding tools is to give the agent context.&lt;/p&gt;

&lt;p&gt;The more an agent understands your project, the better its work will be.&lt;/p&gt;

&lt;p&gt;If you ask an agent to add a method to a class, it will. It might read the file. It might infer some structure. But it won’t understand the project’s intent.&lt;/p&gt;

&lt;p&gt;If you asked a human engineer to make the same change, they would have questions.&lt;/p&gt;

&lt;p&gt;What is the purpose of this project? How is it used? What constraints exist?&lt;/p&gt;

&lt;p&gt;If they skipped that step, you’d get exactly what you asked for, even if it was wrong.&lt;/p&gt;

&lt;p&gt;That’s the same challenge many face with coding agents. A lack of context means it only does what it’s told — which isn’t always what you actually need.&lt;/p&gt;

&lt;p&gt;But when it understands a project, it operates with far more clarity.&lt;/p&gt;

&lt;h2&gt;🧙‍♂️ My “Old School” Method&lt;/h2&gt;

&lt;p&gt;Before I start serious work with an agent, I have it learn the project.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read the docs 📚&lt;/li&gt;
&lt;li&gt;Review the codebase ⚙️&lt;/li&gt;
&lt;li&gt;Understand the architecture 🏙️&lt;/li&gt;
&lt;li&gt;Learn how to build, test, and run the project locally 👩‍🔧&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I even ask the agent to summarize its understanding back to me.&lt;/p&gt;

&lt;p&gt;This started as a saved prompt, turned into a slash command, and is now a skill.&lt;/p&gt;

&lt;p&gt;This step is a huge productivity boost.&lt;/p&gt;

&lt;h2&gt;🤖 Agents Files (&lt;code&gt;AGENTS.md&lt;/code&gt;)&lt;/h2&gt;

&lt;p&gt;Over the past year, an open standard for providing agents with structured context has emerged.&lt;/p&gt;

&lt;p&gt;Instead of prompting the agent to rediscover your project every time, document that context once — and the agent will reference it going forward.&lt;/p&gt;

&lt;p&gt;Most modern agents support an &lt;code&gt;AGENTS.md&lt;/code&gt; file and reference it during each interaction.&lt;/p&gt;

&lt;h2&gt;💽 What Goes in an Agents File?&lt;/h2&gt;

&lt;p&gt;Think of the Agents file as onboarding documentation, but for an agent.&lt;/p&gt;

&lt;p&gt;Project context:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Purpose&lt;/li&gt;
&lt;li&gt;Architecture&lt;/li&gt;
&lt;li&gt;Layout&lt;/li&gt;
&lt;li&gt;CI/CD instructions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Team context:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Code style preferences&lt;/li&gt;
&lt;li&gt;Testing philosophy (TDD or YOLO)&lt;/li&gt;
&lt;li&gt;Tech stack constraints&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Any tribal knowledge you’d expect a new team member to learn belongs in an Agents file.&lt;/p&gt;
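&lt;p&gt;As a sketch, a minimal Agents file might look like this (the project details are, of course, made up):&lt;/p&gt;

```markdown
# AGENTS.md

## Project
A payments API written in Go. Services live under `cmd/`,
shared packages under `pkg/`.

## Build & Test
- `make build` compiles all services
- `make tests` runs unit and functional tests; all must pass before a PR

## Conventions
- Table-driven tests for all new code
- Wrap errors with `fmt.Errorf("...: %w", err)`
- Never commit directly to `main`; open a pull request
```

&lt;p&gt;Short and specific beats long and exhaustive; the agent reads this on every interaction, so keep it focused on what it actually needs.&lt;/p&gt;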

&lt;h2&gt;👨‍💻 Personal Agent Files&lt;/h2&gt;

&lt;p&gt;Many tools also support a personal Agents file in your home directory.&lt;/p&gt;

&lt;p&gt;That’s where your workflow preferences live. Are you a two-space indentation person? Do you want your agent to prefer table tests?&lt;/p&gt;

&lt;p&gt;If you have preferences you want to apply to every project, but are unique to you, they go in the personal Agents file.&lt;/p&gt;

&lt;h2&gt;🧠 Final Thoughts&lt;/h2&gt;

&lt;p&gt;Using an Agents file dramatically improves agent quality.&lt;/p&gt;

&lt;p&gt;Even then, I still use my “learn-this” slash command — sometimes that extra context makes a difference.&lt;/p&gt;

&lt;p&gt;If you wouldn’t drop a new engineer into a project without context, don’t do it to your agents.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>coding</category>
      <category>productivity</category>
    </item>
    <item>
      <title>You can have 100% Code Coverage and still have ticking time bombs in your code. 💣</title>
      <dc:creator>Benjamin Cane</dc:creator>
      <pubDate>Fri, 27 Feb 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/madflojo/you-can-have-100-code-coverage-and-still-have-ticking-time-bombs-in-your-code-51hk</link>
      <guid>https://dev.to/madflojo/you-can-have-100-code-coverage-and-still-have-ticking-time-bombs-in-your-code-51hk</guid>
      <description>&lt;p&gt;You can have 100% Code Coverage and still have ticking time bombs in your code. 💣&lt;/p&gt;

&lt;p&gt;I was listening to a team recently, and an engineer was discussing how a coding agent added additional tests to a project that already had 100% code coverage.&lt;/p&gt;

&lt;p&gt;The conversation reminded me that coverage is directional and often mistaken for quality. Just because your coverage shows 100% doesn’t mean your software is fully tested.&lt;/p&gt;

&lt;h2&gt;👨‍🏫 Understanding How Coverage Is Measured&lt;/h2&gt;

&lt;p&gt;Code Coverage measures the percentage of executable lines that run during code tests. Executed doesn’t mean well-tested.&lt;/p&gt;

&lt;p&gt;Just because every function runs doesn’t mean it’s free of logic errors or safe.&lt;/p&gt;

&lt;h2&gt;😃 Happy Path Testing&lt;/h2&gt;

&lt;p&gt;A common challenge teams face with testing is focusing too much on the happy path.&lt;/p&gt;

&lt;p&gt;Suppose you have a function that accepts an array. In your tests, you always pass 5 elements — because that’s the expected usage. Coverage shows all branches executed. You’re good, right?&lt;/p&gt;

&lt;p&gt;What happens if you pass 4 elements? Or 0?&lt;/p&gt;

&lt;p&gt;If you never test fewer than 5, how do you know? You may say: “But wait, it’s only ever called with 5 elements.” That may be true, for now.&lt;/p&gt;

&lt;h2&gt;⚠️ Protecting Against Your Future Self&lt;/h2&gt;

&lt;p&gt;Code is rarely static; someone will come along and change things. That might be you, or it might be someone else.&lt;/p&gt;

&lt;p&gt;Eventually someone changes that function. Will they add tests for new edge cases? Maybe. Assume they won’t.&lt;/p&gt;

&lt;p&gt;When you write tests, don’t just focus on how you know a function is going to be used; also include tests that misuse the function.&lt;/p&gt;

&lt;p&gt;Rather than sending an array with 5 elements, send arrays with 4 elements, with 0 elements, and a nil value.&lt;/p&gt;

&lt;p&gt;Rather than sending strings that match an expected pattern, send junk that doesn’t.&lt;/p&gt;

&lt;p&gt;Does the function still behave correctly? Should it?&lt;/p&gt;
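&lt;p&gt;A table test makes these misuse cases cheap to add. The sketch below (in Go, with a hypothetical &lt;code&gt;average&lt;/code&gt; function) exercises the expected 5-element call alongside the 4-element, empty, and nil cases:&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
)

// average computes the mean of a slice. The empty-input guard is
// exactly the kind of behavior the misuse cases below lock in.
func average(vals []float64) (float64, error) {
	if len(vals) == 0 {
		return 0, errors.New("no values provided")
	}
	var sum float64
	for _, v := range vals {
		sum += v
	}
	return sum / float64(len(vals)), nil
}

func main() {
	// Table test: the expected usage plus the misuse cases.
	tests := []struct {
		name    string
		in      []float64
		wantErr bool
	}{
		{"happy path: 5 elements", []float64{1, 2, 3, 4, 5}, false},
		{"fewer than expected: 4 elements", []float64{1, 2, 3, 4}, false},
		{"empty slice", []float64{}, true},
		{"nil slice", nil, true},
	}
	for _, tc := range tests {
		_, err := average(tc.in)
		if (err != nil) != tc.wantErr {
			panic(tc.name)
		}
		fmt.Println("ok:", tc.name)
	}
}
```

&lt;p&gt;Delete the empty-input guard and the happy-path tests still pass with full coverage; only the misuse rows catch the regression, by name.&lt;/p&gt;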

&lt;p&gt;The more you test outside the happy path, the more resilient your code becomes — and the less likely it is to break later.&lt;/p&gt;

&lt;h2&gt;🧠 Final Thoughts&lt;/h2&gt;

&lt;p&gt;Code coverage is a guide; don’t let it give you false confidence. Test the happy path, and the unexpected ones. Validate function outputs against the inputs you provide.&lt;/p&gt;

&lt;p&gt;100% coverage is easy. Writing reliable code is not.&lt;/p&gt;

</description>
      <category>codequality</category>
      <category>softwaredevelopment</category>
      <category>softwareengineering</category>
      <category>testing</category>
    </item>
    <item>
      <title>Getting More Out of Agentic Coding Tools</title>
      <dc:creator>Benjamin Cane</dc:creator>
      <pubDate>Fri, 20 Feb 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/madflojo/getting-more-out-of-agentic-coding-tools-4l9g</link>
      <guid>https://dev.to/madflojo/getting-more-out-of-agentic-coding-tools-4l9g</guid>
      <description>&lt;p&gt;Are you getting the most out of Agentic Coding Tools?&lt;/p&gt;

&lt;p&gt;Software engineering is changing fast.&lt;/p&gt;

&lt;p&gt;Agentic coding tools became widely available last year, and if you’re not using them today, you’re already behind. But many still struggle to move beyond the “fancy chat” experience.&lt;/p&gt;

&lt;p&gt;Just like any tool in our engineering tool belts, knowing how to use it effectively matters.&lt;/p&gt;

&lt;h2&gt;🤖 Agents Are More Than A Better Chat&lt;/h2&gt;

&lt;p&gt;Last year, most of us were using tab-completion plus a useful chat interface where you could ask questions, get suggestions, and maybe copy/paste into your code.&lt;/p&gt;

&lt;p&gt;But agents can do much more than make suggestions — they can understand your codebase and act.&lt;/p&gt;

&lt;p&gt;Instead of asking an agent:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Can you suggest additional tests?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Tell your agent:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Create additional test cases, then run &lt;code&gt;make tests&lt;/code&gt; and validate they pass.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;An agent can create tests, run them, inspect failures, adjust the implementation, and re-run the suite until it passes.&lt;/p&gt;

&lt;p&gt;This isn’t about suggestions anymore; agents have more autonomy.&lt;/p&gt;

&lt;p&gt;I think of coding agents as assistants working toward a shared goal. They do some work, you do some, and you iterate together.&lt;/p&gt;

&lt;h2&gt;🏆 Moving from Direction to Outcomes&lt;/h2&gt;

&lt;p&gt;A big mental shift is moving away from simple directions to defining an outcome with guidance &amp;amp; guardrails.&lt;/p&gt;

&lt;p&gt;Agents don’t just perform a single task; they can execute multiple steps (and even parallelize them). You don’t need to spoon-feed each directive one by one.&lt;/p&gt;

&lt;p&gt;Instead, define the outcome you want, along with guidance and guardrails.&lt;/p&gt;

&lt;p&gt;The clearer you are on the outcomes, constraints, and context around what you are trying to do, the better the agent will perform.&lt;/p&gt;

&lt;h2&gt;📋 Examples: Real-world tasks I’ve asked Agents to handle&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;“Using the existing DB Driver X as a reference, create a set of table tests for driver Y. The tests should be structured similarly to the existing driver, surface any logic issues and concurrency issues, and act as clear assurance against the defined interface.”&lt;/p&gt;

&lt;p&gt;“Update CI workflows to Go 1.26.0, find and update any references to 1.25.6, then run tests to ensure everything still builds and passes.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I also use agents for mundane work like git commits and opening pull requests. They consistently produce better commit messages and PR descriptions than I would.&lt;/p&gt;

&lt;p&gt;Agents don’t always get it exactly right, but with a bit of feedback and occasional adjustment, you can get a lot done quickly.&lt;/p&gt;

&lt;p&gt;Avoid going down the rabbit hole of endless refinement; sometimes it’s better to reset with a clearer prompt.&lt;/p&gt;

&lt;h2&gt;👨‍🏫 Context is Key&lt;/h2&gt;

&lt;p&gt;If you want the best results from agents, you need to give them context.&lt;/p&gt;

&lt;p&gt;Before I do serious work on a project, I have the agent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read the Docs 📚&lt;/li&gt;
&lt;li&gt;Review the Architecture 🏙️&lt;/li&gt;
&lt;li&gt;Understand the Project Structure 📐&lt;/li&gt;
&lt;li&gt;Understand how to build, test, and run the application locally 👩‍🔧&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The same steps that a human would take. Agents are no different.&lt;/p&gt;

&lt;p&gt;(I’ll dive deeper into Agent files, skills, and effective ways to provide more context in a future post)&lt;/p&gt;

&lt;h2&gt;🧠 Final Thoughts&lt;/h2&gt;

&lt;p&gt;Engineers are doing amazing things with agents, and new capabilities are being added daily. But you don’t need to be at the bleeding edge to get more out of them (I certainly am not).&lt;/p&gt;

&lt;p&gt;Don’t worry about the hype. Understand what these tools can do; small adjustments in how you use them can drastically change what you get back.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>productivity</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>Why is Infrastructure-as-Code so important? Hint: It's correctness</title>
      <dc:creator>Benjamin Cane</dc:creator>
      <pubDate>Fri, 13 Feb 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/madflojo/why-is-infrastructure-as-code-so-important-hint-its-correctness-2pao</link>
      <guid>https://dev.to/madflojo/why-is-infrastructure-as-code-so-important-hint-its-correctness-2pao</guid>
      <description>&lt;p&gt;Why is Infrastructure-as-Code so important? Hint: It's correctness.&lt;/p&gt;

&lt;p&gt;I’ve worked on many systems in my career, and one thing that I’ve noticed is that those that leverage infrastructure-as-code tend to be more stable than those that don’t.&lt;/p&gt;

&lt;h2&gt;🤔 But wait, isn’t everyone using IaC these days?&lt;/h2&gt;

&lt;p&gt;You may be thinking, "Why is he talking about IaC in 2026? Isn’t it just the de facto standard at this point?"&lt;/p&gt;

&lt;p&gt;My hope is that, yes, everyone uses IaC, but I’m sure many don’t invest the time in it.&lt;/p&gt;

&lt;p&gt;I’m not here to tell you to use IaC; I’m here to tell you why it’s important, and it’s not necessarily about the speed of deployment.&lt;/p&gt;

&lt;h2&gt;🏎️ Fast is great, but it’s not the biggest benefit&lt;/h2&gt;

&lt;p&gt;A clear and valid reason people adopt IaC is the speed of infrastructure provisioning.&lt;/p&gt;

&lt;p&gt;Provisioning infrastructure with IaC takes far less time, enabling you to scale faster and do cool things like ephemeral environments.&lt;/p&gt;

&lt;p&gt;But the biggest benefit of IaC, in my mind, is correctness.&lt;/p&gt;

&lt;h2&gt;⚠️ IaC reduces human error&lt;/h2&gt;

&lt;p&gt;Humans make mistakes. When you ask humans to click the same buttons in the same sequence every time, you’ll get mixed results.&lt;/p&gt;

&lt;p&gt;Steps get missed — especially when time passes or people rely on memory instead of process.&lt;/p&gt;

&lt;p&gt;Documentation helps, but there are those of us who think, “I’ve done this a million times, I don’t need instructions.”&lt;/p&gt;

&lt;p&gt;This attitude is the same reason one of my kids’ desks wobbles and the other one doesn’t…&lt;/p&gt;

&lt;p&gt;IaC is a contract. Once defined, every environment is created from the same source of truth.&lt;/p&gt;

&lt;h2&gt;✅ Consistency is essential to production stability&lt;/h2&gt;

&lt;p&gt;The consistency of IaC is what brings production stability.&lt;/p&gt;

&lt;p&gt;When your performance testing environment matches production, your tests become more accurate.&lt;/p&gt;

&lt;p&gt;If one service has a larger memory footprint in testing than it does in production, you might find yourself surprised by out-of-memory errors, especially if heap sizes are configured based on your test environment and not your production environment (because, of course, they would be the same, right?).&lt;/p&gt;

&lt;p&gt;When I come across platforms that use IaC, I see fewer mistakes and fewer incorrect assumptions. And production tends to be more stable, at least with respect to infrastructure and capacity-related issues.&lt;/p&gt;

&lt;h2&gt;🧠 Final Thoughts&lt;/h2&gt;

&lt;p&gt;So, to answer the question, why is IaC so important? It’s not the speed of provisioning; it’s the correctness of the environments.&lt;/p&gt;

&lt;p&gt;In production systems, correctness beats speed every time.&lt;/p&gt;

</description>
      <category>automation</category>
      <category>devops</category>
      <category>sre</category>
    </item>
    <item>
      <title>Optimizing the team’s workflow can be more impactful than building business features</title>
      <dc:creator>Benjamin Cane</dc:creator>
      <pubDate>Fri, 06 Feb 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/madflojo/optimizing-the-teams-workflow-can-be-more-impactful-than-building-business-features-54kf</link>
      <guid>https://dev.to/madflojo/optimizing-the-teams-workflow-can-be-more-impactful-than-building-business-features-54kf</guid>
      <description>&lt;p&gt;Optimizing the team’s workflow can be more impactful than building business features. It defies logic, but it’s true.&lt;/p&gt;

&lt;p&gt;I work with and talk to a lot of engineers, and to explain my point, I’ll describe two engineers on the same team.&lt;/p&gt;

&lt;h2&gt;💪 Engineer 1&lt;/h2&gt;

&lt;p&gt;The first engineer churns out a lot of code and user stories. They’re focused, consistently finishing on time, and often doing more than they’re assigned.&lt;/p&gt;

&lt;p&gt;When it comes to shipping business features, this person does a great job.&lt;/p&gt;

&lt;p&gt;But this person is also more than happy to let their build run for 3 hours.&lt;/p&gt;

&lt;h2&gt;🦾 Engineer 2&lt;/h2&gt;

&lt;p&gt;The second engineer completes their assigned user stories, but when they encounter inefficiencies, they spend time fixing them. Sometimes it’s improving the build pipeline, fixing flaky tests, making code more maintainable, etc.&lt;/p&gt;

&lt;p&gt;While this engineer may finish fewer user stories because they are distracted by these “side quests,” they make a bigger impact.&lt;/p&gt;

&lt;h2&gt;🏋️ Enabling Others&lt;/h2&gt;

&lt;p&gt;Without leaning on the 10x engineer trope, Engineer 2 has a bigger impact because they resolve issues affecting the whole team.&lt;/p&gt;

&lt;p&gt;A slow pipeline slows everyone’s work.&lt;/p&gt;

&lt;p&gt;Open a single change, then wait 3 hours. A test fails—wait another 3 hours. Feedback comes in—wait 3 more.&lt;/p&gt;

&lt;p&gt;Broken workflows turn simple changes into long, inefficient endeavors.&lt;/p&gt;

&lt;p&gt;By fixing these issues not just for themselves but for everyone, they enable the whole team to ship code faster.&lt;/p&gt;

&lt;h2&gt;📈 Invest in Workflows&lt;/h2&gt;

&lt;p&gt;Investing time in optimizing your workflow and the team’s workflow usually pays dividends.&lt;/p&gt;

&lt;p&gt;Sometimes it’s hard to quantify, but the smallest optimizations can be huge.&lt;/p&gt;

&lt;p&gt;Someone on the team who gets frustrated with inefficiencies and decides to fix them is incredibly valuable.&lt;/p&gt;

&lt;h2&gt;👩‍🔧 Do you take ownership of your codebase?&lt;/h2&gt;

&lt;p&gt;If you want to make a greater impact, look at how you work.&lt;/p&gt;

&lt;p&gt;When you fix a bug, do you search the codebase for the same bug elsewhere?&lt;/p&gt;

&lt;p&gt;When your build pipeline is slow, or you have flaky tests, do you fix them or live with them, complaining while nothing changes?&lt;/p&gt;

</description>
      <category>career</category>
      <category>devops</category>
      <category>productivity</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>I follow an architecture principle I call The Law of Collective Amnesia</title>
      <dc:creator>Benjamin Cane</dc:creator>
      <pubDate>Fri, 30 Jan 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/madflojo/i-follow-an-architecture-principle-i-call-the-law-of-collective-amnesia-4nf2</link>
      <guid>https://dev.to/madflojo/i-follow-an-architecture-principle-i-call-the-law-of-collective-amnesia-4nf2</guid>
      <description>&lt;p&gt;I follow an architecture principle I call The Law of Collective Amnesia.&lt;/p&gt;

&lt;p&gt;Over time, everyone (including yourself) forgets the original intention of the system's design as new requirements emerge.&lt;/p&gt;

&lt;p&gt;This law applies at every level, from system design down to individual &lt;em&gt;microservices&lt;/em&gt; and even libraries.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧬 Systems Evolve (and Intent Fades)
&lt;/h2&gt;

&lt;p&gt;When building new platforms, services, or anything in between, we create a system design that follows a structure.&lt;/p&gt;

&lt;p&gt;Different components have distinct responsibilities; they interact clearly with the rest of the system, and there is a plan.&lt;/p&gt;

&lt;p&gt;But as time progresses, new people may not understand the original intentions of the design.&lt;/p&gt;

&lt;p&gt;As new requirements come in, the pressure to deliver may push you or others down a path that doesn't align with the original plan.&lt;/p&gt;

&lt;p&gt;When the architecture’s intent is understood, additions can be beneficial. When it’s forgotten, they start to feel duct-taped on.&lt;/p&gt;

&lt;p&gt;Duct-taped solutions turn into technical debt or operational/management complexity that starts to weigh the system down.&lt;/p&gt;

&lt;h2&gt;
  
  
  📠 How Good Systems Become Legacy Nightmares
&lt;/h2&gt;

&lt;p&gt;We've all seen the legacy platform that feels brittle, does too much, and is daunting to refactor.&lt;/p&gt;

&lt;p&gt;It didn't start that way.&lt;/p&gt;

&lt;p&gt;At the time, it was probably a great design, but over time, new features and capabilities turned it into Frankenstein's monster.&lt;/p&gt;

&lt;h2&gt;
  
  
  👮 How to Defend Architecture from Collective Amnesia
&lt;/h2&gt;

&lt;p&gt;While it may not be possible to prevent the system from devolving forever, you can reduce the need for duct tape solutions by designing for change.&lt;/p&gt;

&lt;h3&gt;
  
  
  📜 Roles and Responsibilities
&lt;/h3&gt;

&lt;p&gt;An important—but not always effective—step is to document and define the roles and responsibilities of components within the system.&lt;/p&gt;

&lt;p&gt;When a system is broken down into components with distinct roles and responsibilities, it becomes easier for people to make informed decisions about where new capabilities should reside.&lt;/p&gt;

&lt;p&gt;The documentation “should” influence how change is implemented.&lt;/p&gt;

&lt;p&gt;But it relies on people following that documentation, which is the fundamental flaw.&lt;/p&gt;

&lt;h3&gt;
  
  
  🚧 Architectural Guardrails: Make the Right Path the Easy Path
&lt;/h3&gt;

&lt;p&gt;When I say "architectural guardrails," you probably think of review boards and ADRs. These processes are essential, but they don't always work as a prevention.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Instead, I mean designing the system so that the correct placement of functionality is the path of least resistance.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  🔏 Contracts as Constraints, Not Convenience
&lt;/h3&gt;

&lt;p&gt;In general, I feel like back-end &lt;code&gt;APIs&lt;/code&gt; should provide as much data as possible, and it should be up to the clients to use what's relevant.&lt;/p&gt;

&lt;p&gt;But sometimes contracts can be used to enforce design behaviors.&lt;/p&gt;

&lt;p&gt;Systems can't act unless they receive the data required to act.&lt;/p&gt;
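&lt;p&gt;As a sketch of contracts-as-constraints (the &lt;code&gt;SettlementRequest&lt;/code&gt; type and its fields are hypothetical, invented for illustration), a deliberately narrow contract means callers simply can't hand the system data it was never meant to act on:&lt;/p&gt;

```python
from dataclasses import dataclass

# Hypothetical sketch: a deliberately narrow contract. The settlement
# service only accepts the fields it is responsible for, so new
# behaviors can't be smuggled in through a grab-bag payload.
@dataclass(frozen=True)
class SettlementRequest:
    account_id: str
    amount_cents: int
    currency: str

    def __post_init__(self):
        # The system can't act unless it receives the data required to act.
        if not self.account_id or not self.currency:
            raise ValueError("account_id and currency are required")

def settle(request):
    # Only a SettlementRequest gets through; anything else is rejected
    # before it can reach the business logic.
    if not isinstance(request, SettlementRequest):
        raise TypeError("settle() only accepts SettlementRequest")
    return {"account_id": request.account_id, "settled": request.amount_cents}
```

&lt;p&gt;The contract itself, not documentation, is what enforces the design decision.&lt;/p&gt;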

&lt;h3&gt;
  
  
  🚪 Control Ingress and Egress to Control Evolution
&lt;/h3&gt;

&lt;p&gt;Ensuring that only specific systems serve as entry and exit points helps direct future design decisions.&lt;/p&gt;

&lt;p&gt;It's often easier to add a new endpoint than to add a new platform that serves as an entry point.&lt;/p&gt;

&lt;p&gt;Knowing this can allow you to put in place processing at those entry and exit points that ensure future capabilities follow specific patterns.&lt;/p&gt;
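&lt;p&gt;A toy illustration of that idea (the gateway, routes, and correlation-ID rule are all invented for this sketch): when every request enters through a single point, processing added there is inherited by every future endpoint:&lt;/p&gt;

```python
import uuid

# Hypothetical sketch: a single ingress point. Every endpoint is
# registered with the gateway, so cross-cutting rules (correlation IDs,
# auditing, auth checks) are applied once and inherited by any future
# capability added behind it.
ROUTES = {}

def route(path):
    def register(handler):
        ROUTES[path] = handler
        return handler
    return register

def gateway(path, payload):
    handler = ROUTES.get(path)
    if handler is None:
        raise LookupError(f"no such endpoint: {path}")
    # Processing enforced at the entry point: every request gets a
    # correlation ID, whether the handler's author thought of it or not.
    payload = dict(payload, correlation_id=str(uuid.uuid4()))
    return handler(payload)

@route("/orders")
def create_order(payload):
    return {"status": "created", "correlation_id": payload["correlation_id"]}
```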

&lt;h2&gt;
  
  
  🧩 Design for Change, Not Today’s Requirements
&lt;/h2&gt;

&lt;p&gt;When you are first building a system, it's tempting to build it quickly around the requirements in front of you.&lt;/p&gt;

&lt;p&gt;But when you know a platform will evolve, it's beneficial to take time and implement interfaces that make the system more modular.&lt;/p&gt;

&lt;p&gt;Within a &lt;em&gt;microservice&lt;/em&gt;, this can mean how you structure the application and how you create packages that can be extended, even though you don't need that extensibility on day one.&lt;/p&gt;
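&lt;p&gt;A minimal sketch of that kind of internal seam (&lt;code&gt;OrderStore&lt;/code&gt; and the other names here are hypothetical): the service depends on an interface rather than a concrete store, so a future backend can be slotted in without touching business logic:&lt;/p&gt;

```python
from abc import ABC, abstractmethod

# Hypothetical sketch: an internal seam you don't strictly need on day
# one. The service depends on OrderStore, not on any concrete storage,
# so a database-backed store can later replace the in-memory one.
class OrderStore(ABC):
    @abstractmethod
    def save(self, order_id, order): ...

    @abstractmethod
    def load(self, order_id): ...

class InMemoryOrderStore(OrderStore):
    def __init__(self):
        self._orders = {}

    def save(self, order_id, order):
        self._orders[order_id] = order

    def load(self, order_id):
        return self._orders[order_id]

class OrderService:
    def __init__(self, store):
        self.store = store  # any OrderStore implementation works here

    def place(self, order_id, items):
        self.store.save(order_id, {"items": items, "status": "placed"})
        return self.store.load(order_id)
```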

&lt;p&gt;At a platform level, it could be the decision between a &lt;em&gt;monolith&lt;/em&gt; and &lt;em&gt;microservices&lt;/em&gt;. If you expect rapid change, it may make sense to leverage &lt;em&gt;microservices&lt;/em&gt;. If you know change will come slowly, start with a &lt;em&gt;monolith&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧠 Final Thoughts: Assume Intent Will Be Forgotten
&lt;/h2&gt;

&lt;p&gt;The above examples are just a subset of the ways you can enforce a design that aligns with your intentions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The key lesson:&lt;/strong&gt; don't build a plan that relies on people to follow your intentions. They won't.&lt;/p&gt;

&lt;p&gt;You have to assume the next person won't design systems the way you do, they won't understand the reasons behind your design, and they'll be under pressure to deliver.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>discuss</category>
      <category>softwareengineering</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Performance testing without a target is like running a race with no finish line</title>
      <dc:creator>Benjamin Cane</dc:creator>
      <pubDate>Fri, 23 Jan 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/madflojo/performance-testing-without-a-target-is-like-running-a-race-with-no-finish-line-5hbn</link>
      <guid>https://dev.to/madflojo/performance-testing-without-a-target-is-like-running-a-race-with-no-finish-line-5hbn</guid>
      <description>&lt;p&gt;Performance testing without a target is like running a race with no finish line.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Did you win or did you stop early?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I previously shared my thoughts on benchmark and endurance tests, but before ever running a test, a target must be defined.&lt;/p&gt;

&lt;h2&gt;
  
  
  🎯 Why Set Targets?
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Without a target, how do you know what good looks like?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I've often come across teams that have incorporated performance testing into their releases (which is excellent). But they had no targets defined.&lt;/p&gt;

&lt;p&gt;No production baseline.&lt;/p&gt;

&lt;p&gt;No service-level objectives from the business.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How did they know whether the system was meeting expectations?&lt;/em&gt; They didn't.&lt;/p&gt;

&lt;p&gt;In some cases, after targets were defined, the system was performing as needed.&lt;/p&gt;

&lt;p&gt;In others, it clearly wasn't, and the team had no idea until targets were defined and compared with production.&lt;/p&gt;

&lt;h2&gt;
  
  
  🏆 Defining Targets
&lt;/h2&gt;

&lt;p&gt;It's easier to define targets for existing systems (and modernization projects) than for a brand-new system.&lt;/p&gt;

&lt;p&gt;Existing platforms have production numbers you can reference, user expectations, and service-level objectives that can be translated into performance targets.&lt;/p&gt;

&lt;p&gt;New systems rarely have much to baseline from.&lt;/p&gt;

&lt;p&gt;For a brand-new system, I like to work with the product/business team and understand their goals.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;📈 What is the expected growth? Slow and steady, or fast and unpredictable?&lt;/li&gt;
&lt;li&gt;🚨 What is the criticality of the platform? If it fails to respond, is it a problem or an inconvenience?&lt;/li&gt;
&lt;li&gt;🌟 What unique constraints or features of the platform might influence performance requirements?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once defined, targets should not be treated as static.&lt;/p&gt;

&lt;p&gt;As traffic starts, you can adjust targets accordingly. Maybe it's higher, perhaps it's lower.&lt;/p&gt;

&lt;h2&gt;
  
  
  🪫 Leave Some Buffer
&lt;/h2&gt;

&lt;p&gt;Once a target is agreed upon, I like to add a bit of buffer.&lt;/p&gt;

&lt;p&gt;If the requirement is 100ms, I’ll target closer to 75ms, or lower, depending on the system and its purpose.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Why?&lt;/em&gt; Adding capacity or tuning the system takes time.&lt;/p&gt;

&lt;p&gt;Things change, sometimes in unexpected ways.&lt;/p&gt;

&lt;p&gt;Sometimes unexpected changes can be handled by automatic/manual scaling, but not always.&lt;/p&gt;

&lt;p&gt;It's important to give yourself a bit of buffer to respond to those changes.&lt;/p&gt;
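&lt;p&gt;A tiny worked example using the numbers above: hold back 25% of a 100ms requirement and engineer against the stricter internal target (the helper name is invented for illustration):&lt;/p&gt;

```python
import operator

# Hypothetical numbers: the business requirement is 100 ms, and we hold
# 25% of that back as buffer, so day-to-day we engineer against 75 ms.
REQUIREMENT_MS = 100.0
BUFFER_FRACTION = 0.25
internal_target_ms = REQUIREMENT_MS * (1 - BUFFER_FRACTION)  # 75.0

def within_target(p99_ms, target_ms=internal_target_ms):
    # True when the observed p99 latency is at or under the target.
    return operator.le(p99_ms, target_ms)
```

&lt;p&gt;A release that passes at 90ms would still meet the business requirement, but it has eaten the buffer you'd want when traffic shifts unexpectedly.&lt;/p&gt;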

&lt;h2&gt;
  
  
  🧠 Final Thoughts
&lt;/h2&gt;

&lt;p&gt;I've talked a lot about setting targets and their importance. But one of the most important aspects of having targets is monitoring and measuring production.&lt;/p&gt;

&lt;p&gt;Having visibility in production helps validate that your targets are realistic.&lt;/p&gt;

&lt;p&gt;Maybe they are too high, and you have reserved infrastructure going to waste.&lt;/p&gt;

&lt;p&gt;Perhaps they are too low, and you won't be able to survive the next traffic spike.&lt;/p&gt;

&lt;p&gt;Traffic changes over time, and application performance naturally drifts as new capabilities are added.&lt;/p&gt;

&lt;p&gt;Clear visibility into traffic and latency patterns is essential for anyone operating mission-critical, large-scale systems.&lt;/p&gt;

&lt;p&gt;But it's also a foundational practice for most platforms.&lt;/p&gt;

&lt;p&gt;Do you have performance targets for your platform? Are they grounded in production measurements? Should they be?&lt;/p&gt;

</description>
      <category>devops</category>
      <category>performance</category>
      <category>testing</category>
    </item>
    <item>
      <title>Many teams think performance testing means throwing traffic at a system until it breaks.</title>
      <dc:creator>Benjamin Cane</dc:creator>
      <pubDate>Fri, 16 Jan 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/madflojo/many-teams-think-performance-testing-means-throwing-traffic-at-a-system-until-it-breaks-that-okb</link>
      <guid>https://dev.to/madflojo/many-teams-think-performance-testing-means-throwing-traffic-at-a-system-until-it-breaks-that-okb</guid>
      <description>&lt;p&gt;Many teams think performance testing means throwing traffic at a system until it breaks. That approach is fine, but it misses how systems are actually stressed in the real world.&lt;/p&gt;

&lt;p&gt;The approach I’ve found most effective is to split performance testing into two distinct categories:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🏋️‍♀️ &lt;strong&gt;Benchmark testing&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;🚣‍♀️ &lt;strong&gt;Endurance testing&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both stress the system, but they answer &lt;em&gt;different questions&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  🏋️‍♀️ Benchmark Testing:
&lt;/h2&gt;

&lt;p&gt;Benchmark tests are where most teams start: increasing load until the system fails.&lt;/p&gt;

&lt;p&gt;Failure might mean:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;⏱️ Latency SLAs are exceeded&lt;/li&gt;
&lt;li&gt;⚠️ Error rates cross acceptable thresholds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sometimes failure is measured by when the system stops responding entirely. This is known as &lt;em&gt;breakpoint testing&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Even when SLAs are the target, I recommend running breakpoint tests after thresholds are exceeded.&lt;/p&gt;

&lt;p&gt;Knowing how the system breaks under load is useful when dealing with the uncertainties of production.&lt;/p&gt;
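&lt;p&gt;A toy sketch of a breakpoint ramp (&lt;code&gt;measure&lt;/code&gt; is a stand-in for a real load generator, and the thresholds are made up): keep doubling load until a latency or error-rate threshold is crossed, and record where the system gave out:&lt;/p&gt;

```python
import operator

# Hypothetical sketch of a benchmark ramp. measure() stands in for a
# real load generator returning (p99_ms, error_rate) at a given RPS;
# here it's a toy model where latency and errors grow with load.
def measure(rps):
    p99_ms = 20 + rps * 0.05
    error_rate = max(0.0, (rps - 3000) * 0.0001)
    return p99_ms, error_rate

def ramp_until_failure(sla_ms=200.0, max_error_rate=0.01, start_rps=100):
    rps = start_rps
    while True:
        p99_ms, error_rate = measure(rps)
        sla_breached = operator.gt(p99_ms, sla_ms)
        errors_breached = operator.gt(error_rate, max_error_rate)
        if sla_breached or errors_breached:
            # Record where, and how, the system failed.
            return rps, p99_ms, error_rate
        rps *= 2  # double the load each step
```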

&lt;h2&gt;
  
  
  🚣‍♀️ Endurance Testing:
&lt;/h2&gt;

&lt;p&gt;Endurance tests answer a &lt;em&gt;different question&lt;/em&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Can the system sustain high load over time?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Running at high but realistic levels (often &lt;em&gt;near production max&lt;/em&gt;) over extended periods exposes different problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🪣 Queues, file systems, and databases slowly fill&lt;/li&gt;
&lt;li&gt;🧹 Garbage collection and thread pools behave differently&lt;/li&gt;
&lt;li&gt;🧵 Memory or thread leaks become visible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These issues &lt;em&gt;rarely&lt;/em&gt; show up in short spikes of traffic. If you only run benchmarks, you’ll discover them for the first time in production.&lt;/p&gt;
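&lt;p&gt;A contrived example of why duration matters (the leaky handler is invented for illustration): a per-request leak that is invisible in a short spike shows up clearly once you sustain load and sample memory over the run:&lt;/p&gt;

```python
import operator
import tracemalloc

# Hypothetical sketch: the kind of slow growth an endurance run exposes.
# handle_request stands in for a real workload that retains a little
# memory on every call; a short benchmark spike rarely runs long enough
# for the trend to be visible.
_retained = []

def handle_request(payload):
    _retained.append(payload * 100)  # the "leak": state that never drops
    return len(_retained)

def memory_grew(iterations=1000):
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    for i in range(iterations):
        handle_request(f"req-{i}")
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    # True when traced memory is strictly higher after the sustained run.
    return operator.gt(after, before)
```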

&lt;h2&gt;
  
  
  ⌛️ Testing Thoroughly vs Deployment Speed:
&lt;/h2&gt;

&lt;p&gt;Benchmarks run fast; endurance testing takes time.&lt;/p&gt;

&lt;p&gt;A 24-hour endurance test can slow down releases, especially when you want to release the same service multiple times a day.&lt;/p&gt;

&lt;p&gt;It's a &lt;strong&gt;trade-off&lt;/strong&gt; between the system's criticality and the need for rapid deployments.&lt;/p&gt;

&lt;p&gt;How tolerant is the system to minor performance regressions?&lt;/p&gt;

&lt;p&gt;If performance truly matters, slowing releases down to run endurance tests might be the right call.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧠 Final Thoughts:
&lt;/h2&gt;

&lt;p&gt;Effective performance testing isn’t just about surviving spikes.&lt;/p&gt;

&lt;p&gt;Spikes matter, but so does answering:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;📈 Can the system withstand peak load for extended periods?&lt;/li&gt;
&lt;li&gt;🔎 If not, how does it fail, and why?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All too often, I see a system's capacity become the breaking point during unexpected traffic patterns.&lt;/p&gt;

&lt;p&gt;While an application might handle spikes, the overall platform often can't sustain them. That's where endurance tests deliver their &lt;strong&gt;real value&lt;/strong&gt;.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Pre-populating caches is a “bolt-on” cache-optimization I've used successfully in many systems. It works, but it adds complexity</title>
      <dc:creator>Benjamin Cane</dc:creator>
      <pubDate>Fri, 09 Jan 2026 00:00:00 +0000</pubDate>
      <link>https://dev.to/madflojo/pre-populating-caches-is-a-bolt-on-cache-optimization-ive-used-successfully-in-many-systems-it-26ne</link>
      <guid>https://dev.to/madflojo/pre-populating-caches-is-a-bolt-on-cache-optimization-ive-used-successfully-in-many-systems-it-26ne</guid>
      <description>&lt;p&gt;Pre-populating caches is a “&lt;em&gt;bolt-on&lt;/em&gt;” cache-optimization I've used successfully in many systems.&lt;/p&gt;

&lt;p&gt;It works, but it &lt;strong&gt;adds complexity&lt;/strong&gt;, which is why most teams avoid it.&lt;/p&gt;

&lt;h2&gt;
  
  
  📖 Context
&lt;/h2&gt;

&lt;p&gt;For context, in this post, I’m talking about scenarios where one system requires data from another system, i.e., the &lt;em&gt;source of record (SOR)&lt;/em&gt;. The data is needed frequently, and the decision to cache has already been made.&lt;/p&gt;

&lt;p&gt;A good traditional approach is the &lt;em&gt;cache-aside pattern&lt;/em&gt;, which maintains a local cache of data.&lt;/p&gt;

&lt;p&gt;That cache is populated organically by checking for records as needed, finding that the data is not cached, fetching it from the SOR, and storing the result.&lt;/p&gt;

&lt;p&gt;A pro of this approach is that the cache is &lt;strong&gt;transient&lt;/strong&gt;. If it's dropped, it's ok because you can always go back to the SOR, albeit with a performance penalty.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But slow is better than broken.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  🤔 Why?
&lt;/h2&gt;

&lt;p&gt;Calls to the SOR are problematic for low-latency or random-access workloads.&lt;/p&gt;

&lt;p&gt;When 9 out of 10 requests all want the same data, you’ll have infrequent cache misses. But when 9 out of 10 requests all require different data, you’ll have more cache misses, which reduces the effectiveness of caching.&lt;/p&gt;

&lt;p&gt;Pre-populating caches is a way to avoid those cache misses, trading added complexity for lower latency.&lt;/p&gt;

&lt;h2&gt;
  
  
  ⚙️ How?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Caveat:&lt;/strong&gt; I use pre-population purely as a &lt;em&gt;bolt-on&lt;/em&gt; optimization, not a core dependency.&lt;/p&gt;

&lt;p&gt;Typically, I keep the cache-aside path as the &lt;em&gt;primary mechanism&lt;/em&gt;. If anything goes wrong (and it will), there is always the option to go to the SOR for data (&lt;code&gt;slow &amp;gt; broken&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A key decision&lt;/strong&gt; is whether to pull the data or listen for it.&lt;/p&gt;

&lt;p&gt;I prefer the SOR publishes updates as they occur, but platform constraints or circumstances may require you to pull the data.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Pub/sub&lt;/code&gt; works great when the SOR publishes, but other options exist as well (webhooks, files) with their own trade-offs.&lt;/p&gt;
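&lt;p&gt;A stripped-down, in-process sketch of the listen-for-updates approach (the event shape and handler names are invented): the SOR publishes changes and a subscriber applies them to the local cache, with the cache-aside path still available as the fallback:&lt;/p&gt;

```python
# Hypothetical sketch: in a real system, publish() is your message
# broker and subscribe() is a consumer registration; here everything is
# in-process to show the shape of the pattern.
subscribers = []

def subscribe(handler):
    subscribers.append(handler)

def publish(event):
    for handler in subscribers:
        handler(event)

cache = {}

def apply_update(event):
    if event["type"] == "deleted":
        cache.pop(event["key"], None)
    else:  # created / updated events pre-populate the cache
        cache[event["key"]] = event["value"]

subscribe(apply_update)
```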

&lt;p&gt;&lt;em&gt;Use whatever makes sense for your environment.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  ⚠️ Why Not?
&lt;/h2&gt;

&lt;p&gt;Pre-populating a cache can be &lt;em&gt;easier said than done&lt;/em&gt;, as a lot can go wrong.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What happens if you lose a message or two?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What happens when you’re rebuilding the cache (errors or new instances)?&lt;/em&gt; &lt;em&gt;How do you repopulate?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The cache-aside path will cover any dropped messages, but implementing &lt;strong&gt;republish mechanisms is complicated&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;You can’t rely solely on deltas; at some point, you'll need to &lt;em&gt;republish the entire dataset&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Building all of these systems is complicated; there's more to monitor, patch, and manage.&lt;/p&gt;

&lt;p&gt;If the latency hit and traffic volume to the SOR are not a concern, then that complexity is &lt;em&gt;not worth it&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧠 Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Pre-populating caches can be a &lt;strong&gt;significant performance win&lt;/strong&gt;, but it can also be an &lt;strong&gt;operational overhead&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If your data is primarily static (&lt;em&gt;changing infrequently&lt;/em&gt;), the overhead can be worthwhile.&lt;/p&gt;

&lt;p&gt;If your data changes frequently, stick with &lt;em&gt;cache-aside&lt;/em&gt; (and aggressive &lt;code&gt;TTLs&lt;/code&gt;), or no cache at all.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>backend</category>
      <category>performance</category>
      <category>systemdesign</category>
    </item>
  </channel>
</rss>
