<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ilya Ploskovitov</title>
    <description>The latest articles on DEV Community by Ilya Ploskovitov (@aragossa).</description>
    <link>https://dev.to/aragossa</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3583415%2F7a4dcf1d-1d9c-47d7-aa7f-ef413d464224.jpg</url>
      <title>DEV Community: Ilya Ploskovitov</title>
      <link>https://dev.to/aragossa</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aragossa"/>
    <language>en</language>
    <item>
      <title>How to mask PII in Kubernetes before sending logs to Datadog</title>
      <dc:creator>Ilya Ploskovitov</dc:creator>
      <pubDate>Sun, 15 Mar 2026 23:05:58 +0000</pubDate>
      <link>https://dev.to/aragossa/how-to-mask-pii-in-kubernetes-before-sending-logs-to-datadog-3b4l</link>
      <guid>https://dev.to/aragossa/how-to-mask-pii-in-kubernetes-before-sending-logs-to-datadog-3b4l</guid>
      <description>&lt;h2&gt;
  
  
  The Problem: Datadog Bills and GDPR Nightmares
&lt;/h2&gt;

&lt;p&gt;If you are running applications in Kubernetes and shipping your logs to Datadog, you have probably faced two major headaches:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; Datadog charges you based on the volume of logs ingested and indexed, so every megabyte counts. Datadog does offer a built-in Sensitive Data Scanner, but it is a premium feature billed separately on top of your base log costs. A free, open-source sidecar lets you avoid paying for vendor-side scrubbing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance:&lt;/strong&gt; Sending Personally Identifiable Information (PII) such as emails or credit card numbers, along with secrets such as API keys, to a third-party logging service can violate GDPR and other privacy laws.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The more comprehensive your logs are for debugging, the higher your Datadog bill gets, and the bigger your risk of a privacy breach becomes.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Standard Approach (And Why It Hurts)
&lt;/h2&gt;

&lt;p&gt;The standard way to solve this is to configure the Datadog Agent to mask or scrub PII before it leaves your cluster.&lt;/p&gt;

&lt;p&gt;However, this approach has significant drawbacks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Complexity:&lt;/strong&gt; Setting up custom parsing rules, regexes, and pipelines in the Datadog Agent configuration can be tedious and difficult to maintain.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;High CPU Usage:&lt;/strong&gt; Running heavy regex operations over massive volumes of text inside your log shipper consumes a lot of CPU resources. This can slow down your node's performance or require larger, more expensive compute instances.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Whack-a-Mole:&lt;/strong&gt; You are constantly updating rules as your application output changes, which takes time and effort.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Solution: PII-Shield as a Lightweight Sidecar
&lt;/h2&gt;

&lt;p&gt;Instead of burdening your cluster-wide log shipper with heavy processing, you can mask PII &lt;em&gt;before&lt;/em&gt; it even leaves the pod.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;PII-Shield&lt;/strong&gt; is a lightning-fast, zero-dependency tool written in Go. It acts as a sidecar container that sits right next to your application. It intercepts the logs in real-time, scrubs sensitive data using entropy detection and deterministic hashing, and then passes the clean logs forward.&lt;/p&gt;

&lt;p&gt;By the time the Datadog Agent picks up the logs from the Kubernetes node, they are already completely sanitized.&lt;/p&gt;

&lt;h3&gt;
  
  
  Ready-to-Use Pod Configuration
&lt;/h3&gt;

&lt;p&gt;Here is a practical example of how to inject PII-Shield as a sidecar into your Kubernetes Pod. We use a shared volume so PII-Shield can read the application's output and write safe logs to its own standard output.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-app-with-pii-shield&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-app&lt;/span&gt;
      &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-app-image:v1.0.0&lt;/span&gt;
      &lt;span class="c1"&gt;# Instead of writing directly to stdout, the app writes to a shared file or pipe&lt;/span&gt;
      &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/bin/sh"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-c"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./my-app-binary&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;/shared-logs/app.log"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;volumeMounts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;shared-logs&lt;/span&gt;
          &lt;span class="na"&gt;mountPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/shared-logs&lt;/span&gt;

    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pii-shield-sidecar&lt;/span&gt;
      &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;thelisdeep/pii-shield:v1.2.3&lt;/span&gt;
      &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PII_SALT&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-secure-random-salt"&lt;/span&gt;
      &lt;span class="c1"&gt;# PII-Shield reads the file in real-time, scrubs the data, and outputs to stdout&lt;/span&gt;
      &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/bin/sh"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-c"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tail&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-n&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;+1&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-f&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;/shared-logs/app.log&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;|&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pii-shield"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;volumeMounts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;shared-logs&lt;/span&gt;
          &lt;span class="na"&gt;mountPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/shared-logs&lt;/span&gt;

  &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;shared-logs&lt;/span&gt;
      &lt;span class="na"&gt;emptyDir&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Note: The &lt;code&gt;thelisdeep/pii-shield&lt;/code&gt; image is multi-arch (supporting both &lt;code&gt;amd64&lt;/code&gt; and &lt;code&gt;arm64&lt;/code&gt;), which is perfect if you are saving costs by running on ARM processors like AWS Graviton.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How does Datadog know what to read?&lt;/strong&gt;&lt;br&gt;
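Because the main application now redirects its standard output (&lt;code&gt;stdout&lt;/code&gt;) to a file, that stream is empty as far as the container runtime is concerned. The Datadog Agent, which collects each container's &lt;code&gt;stdout&lt;/code&gt; and &lt;code&gt;stderr&lt;/code&gt; when container log collection is enabled, therefore picks up &lt;em&gt;only&lt;/em&gt; the clean stream from the &lt;code&gt;pii-shield-sidecar&lt;/code&gt;, so there are no duplicate logs. Note that the example redirects only &lt;code&gt;stdout&lt;/code&gt;: if your application also logs to &lt;code&gt;stderr&lt;/code&gt;, redirect that stream as well (for example with &lt;code&gt;2&amp;gt;&amp;amp;1&lt;/code&gt;), or it will still reach the Agent unmasked.&lt;/p&gt;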

&lt;h3&gt;
  
  
  Why this is better:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Zero Configuration for Log Shippers:&lt;/strong&gt; Datadog just receives clean logs. There are no complex pipeline rules to manage.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Bypass Premium Vendor Fees:&lt;/strong&gt; Datadog's built-in Sensitive Data Scanner is a premium feature billed on top of your regular log volumes. By using a free, open-source sidecar, you completely eliminate the need for expensive vendor-side scrubbing.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Predictable Performance:&lt;/strong&gt; PII-Shield uses zero-allocation JSON parsing and has a memory footprint of roughly 30Mi. For a sidecar that runs in every pod across your cluster, keeping that footprint small is critical.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Easy Debugging:&lt;/strong&gt; With deterministic hashing, &lt;code&gt;user@email.com&lt;/code&gt; becomes something like &lt;code&gt;[HIDDEN:a1b2c3]&lt;/code&gt;. You can still trace that same user across your Datadog logs for debugging without ever knowing their real email.&lt;/li&gt;
&lt;/ul&gt;
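
&lt;p&gt;As a rough illustration of the deterministic hashing idea from the last bullet, the Python sketch below maps the same input to the same short token using a keyed HMAC. The &lt;code&gt;mask_value&lt;/code&gt; helper, the SHA-256 choice, and the 6-character truncation are assumptions made for this example; PII-Shield's actual algorithm and token format may differ.&lt;/p&gt;

```python
import hashlib
import hmac


def mask_value(value, salt, length=6):
    """Return a stable placeholder for value (hypothetical helper, not PII-Shield's API)."""
    digest = hmac.new(salt.encode("utf-8"), value.encode("utf-8"), hashlib.sha256)
    # Truncating keeps log lines short; the length is an illustrative assumption.
    return "[HIDDEN:" + digest.hexdigest()[:length] + "]"


if __name__ == "__main__":
    salt = "your-secure-random-salt"
    first = mask_value("user@email.com", salt)
    second = mask_value("user@email.com", salt)
    other = mask_value("someone.else@email.com", salt)
    print(first == second)  # same input and salt give the same token
    print(first != other)   # different inputs give different tokens, barring collisions
```

&lt;p&gt;Because the salt is part of the key, rotating it changes every token, so logs written before and after a rotation can no longer be correlated.&lt;/p&gt;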

&lt;p&gt;By putting the shield right where the data is generated, you protect your users' privacy and keep your observability bills in check.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Ready to secure your Kubernetes logs?&lt;/strong&gt; &lt;br&gt;
Check out the &lt;a href="https://github.com/thelisdeep/pii-shield" rel="noopener noreferrer"&gt;PII-Shield repository on GitHub&lt;/a&gt;, try out the Helm chart, and if you find it useful, consider dropping a star!&lt;/p&gt;

</description>
      <category>datalog</category>
      <category>kubernetes</category>
      <category>log</category>
    </item>
    <item>
      <title>Integrating PII-Shield into GuardSpine (WASM vs Native execution)</title>
      <dc:creator>Ilya Ploskovitov</dc:creator>
      <pubDate>Wed, 25 Feb 2026 05:39:27 +0000</pubDate>
      <link>https://dev.to/aragossa/integrating-pii-shield-into-guardspine-wasm-vs-native-execution-1m4e</link>
      <guid>https://dev.to/aragossa/integrating-pii-shield-into-guardspine-wasm-vs-native-execution-1m4e</guid>
      <description>&lt;h2&gt;
  
  
  1. Introduction
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/DNYoussef/codeguard-action" rel="noopener noreferrer"&gt;GuardSpine's&lt;/a&gt; main job is AI-aware code governance—using AI models to automatically review pull requests and create cryptographic proof of those reviews. However, because GuardSpine sends code to third-party AI models (like Claude or GPT) to be reviewed, there is a massive risk. Sensitive information—like passwords and personal data (PII)—must be stopped from leaking to these AI providers. You cannot let personal data poison third-party AI models.&lt;/p&gt;

&lt;p&gt;To solve this, the GuardSpine team turned to &lt;a href="https://github.com/aragossa/pii-shield" rel="noopener noreferrer"&gt;PII-Shield&lt;/a&gt;: a lightning-fast, open-source Go engine designed to find and redact sensitive data. The big engineering challenge was figuring out how to integrate this powerful external Go dependency into GuardSpine's strictly Python-based ecosystem. We had to make sure the integration was 100% secure (no data could accidentally leak to the internet) and extremely fast.&lt;/p&gt;

&lt;p&gt;We had to make a choice. Should we run the Go tool as a normal program using Python's &lt;code&gt;subprocess.run&lt;/code&gt;? Should we deploy a local server? Or, should we try something newer and run the Go code straight inside Python using WebAssembly (WASM)? This article explains how we tested normal programs against WebAssembly to adapt PII-Shield for this AI pipeline, the problems we found with slow startup times, and when you should use each method.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. WebAssembly (WASI) Integration: The Sandboxed Approach
&lt;/h2&gt;

&lt;p&gt;WebAssembly (WASM) is a good fit for embedding tools in Python. Its biggest benefit is portability. Instead of building different versions of the tool for Linux, Mac, and Windows, we only need to build one &lt;code&gt;.wasm&lt;/code&gt; file. Any computer running &lt;code&gt;wasmtime&lt;/code&gt; can use it. WASM is also well isolated because it runs in a "sandbox." This means the Go tool is locked inside and cannot access your network or your file system unless the host explicitly grants that access.&lt;/p&gt;

&lt;p&gt;To make this work, we built a Python script (&lt;code&gt;pii_wasm_client.py&lt;/code&gt;). When we need to hide data, this script starts the WASM engine. It passes important settings (like the secret &lt;code&gt;PII_SALT&lt;/code&gt; value) to the module through environment variables. Then, it uses temporary files to send the data into the WASM tool and get the clean text back.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjgwbjbstaa9xbhjaaejb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjgwbjbstaa9xbhjaaejb.png" alt=" " width="637" height="474"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;wasmtime&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="c1"&gt;# 1. Setup the Engine (Turn on the computer)
&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;wasmtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Engine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# 2. Load the Code (Load the program)
&lt;/span&gt;&lt;span class="n"&gt;module&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;wasmtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pii-shield-wasi.wasm&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 3. Setup the Rules (Configure WASI)
&lt;/span&gt;&lt;span class="n"&gt;linker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;wasmtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Linker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;linker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;define_wasi&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# 4. Create a Safe Workspace (Create a Store)
&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;wasmtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;wasi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;wasmtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;WasiConfig&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;wasi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;inherit_stdout&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_wasi&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wasi&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 5. Run the Program
&lt;/span&gt;&lt;span class="n"&gt;instance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;linker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;instantiate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;module&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;_start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exports&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;_start&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;_start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;wasmtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ExitTrap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt; &lt;span class="c1"&gt;# Expected exit
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When we explain this to business leaders, they usually ask these important security questions:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Why not just use a paid API on the internet?"&lt;/strong&gt; Because the WASM tool runs locally on your machine, your code never leaves your private network.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"How do you keep the secret salt (PII_SALT) safe?"&lt;/strong&gt; The secret is passed directly into the locked WASM sandbox. Because the sandbox is blocked off, nothing outside of it can steal the secret.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Why not just use simple search patterns (Regex)?"&lt;/strong&gt; Simple search patterns are fast but easily break on weird data, often flagging safe text as secrets by mistake (false positives). PII-Shield uses smart tokenization and entropy scoring to find real secrets, which requires a strong Go engine, not just simple text matching.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Real-World Benchmarks &amp;amp; The Bottlenecks
&lt;/h2&gt;

&lt;p&gt;When we started, we thought running the WASM tool directly inside Python would be faster than asking the computer (OS) to start a whole new external program (&lt;code&gt;subprocess.run&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;To test this, we built a script to run the tool 10,000 times (you can find the open-source &lt;code&gt;benchmark.py&lt;/code&gt; script in this public &lt;a href="https://gist.github.com/aragossa/68f19a73a982a49db0596857d30a6055" rel="noopener noreferrer"&gt;GitHub Gist&lt;/a&gt;). The text we tested was very small: just a 100-character JSON string. We compared a normal Linux program against the WASM engine.&lt;/p&gt;

&lt;p&gt;The results were surprising. In our test with the tiny text, the normal program was about 4.7x faster:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Normal Program (&lt;code&gt;subprocess.run&lt;/code&gt;): ~0.86 ms per run.&lt;/li&gt;
&lt;li&gt;Starting and running the WASM Engine: ~4.03 ms per run.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even when scaled to 100,000 iterations, the per-execution latency stayed essentially unchanged (0.85 ms vs 4.05 ms). Total execution time scaled linearly, which suggests that while WASM carries a fixed "cold start" cost of roughly 4 ms per run, it showed no sign of memory leaks or degrading performance over time. That is a crucial property for GuardSpine's stability.&lt;/p&gt;
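
&lt;p&gt;For readers who want to reproduce this kind of measurement, here is a minimal per-run latency harness in Python. It is not the author's &lt;code&gt;benchmark.py&lt;/code&gt;: the &lt;code&gt;time_per_run_ms&lt;/code&gt; helper is hypothetical, and the command being timed is a stand-in (an empty Python process) that you would replace with the real binary.&lt;/p&gt;

```python
import subprocess
import sys
import time


def time_per_run_ms(func, iterations):
    """Average wall-clock milliseconds per call of func (hypothetical helper)."""
    start = time.perf_counter()
    for _ in range(iterations):
        func()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iterations


def run_native_stand_in():
    # Stand-in for launching the native binary; swap in the real command to measure it.
    subprocess.run([sys.executable, "-c", "pass"], check=True)


if __name__ == "__main__":
    per_run = time_per_run_ms(run_native_stand_in, iterations=20)
    print(f"{per_run:.2f} ms per run")
    # Ratio of the figures reported above: 4.03 ms (WASM) vs 0.86 ms (native).
    print(f"{4.03 / 0.86:.1f}x")
```

&lt;p&gt;Averaging over many iterations smooths out scheduler noise, which matters when individual runs take less than a millisecond.&lt;/p&gt;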

&lt;p&gt;But saying "WASM is slower" is not fair. WebAssembly code is actually very fast. The problem was the "Cold Start"—how long it takes to turn the tool on. We found three big slowdowns:&lt;/p&gt;

&lt;p&gt;First, &lt;strong&gt;Starting Go is Slow&lt;/strong&gt;. Every time we ran the WASM tool, the Go runtime had to start up its memory manager and background tasks inside the sandbox. A native binary goes through the same runtime startup, but there it completes almost instantly. In WASM, we had to wait for it every single time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpl6xmpxn0a3iboumdq2n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpl6xmpxn0a3iboumdq2n.png" alt=" " width="637" height="150"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Second, &lt;strong&gt;The File Speed Problem (I/O)&lt;/strong&gt;. To send text into the WASM tool, our Python script created temporary files on the hard drive. Creating, writing, and deleting files takes a lot of time and caused about 30-50% of the delay.&lt;/p&gt;

&lt;p&gt;Finally, &lt;strong&gt;The Text Was Too Small&lt;/strong&gt;. Because our test text was only 100 characters, 99% of the time was spent just turning the tool on and managing files. The actual work of finding the secrets took almost zero time. If we tested a massive 10MB file instead, the 4.7x speed difference would likely disappear because the real work (finding secrets) would take much longer than the startup time.&lt;/p&gt;
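
&lt;p&gt;The arithmetic behind that last point can be sketched as a fixed startup cost plus a scanning cost that grows with payload size. In the Python sketch below, the startup figures are the per-run numbers measured above, treated as pure startup cost. The &lt;code&gt;scan_ms_per_mb&lt;/code&gt; value is a made-up placeholder, assumed to be the same for both runtimes, which may not hold in practice.&lt;/p&gt;

```python
def total_ms(startup_ms, scan_ms_per_mb, size_mb):
    """Fixed startup cost plus scanning time that grows with payload size."""
    return startup_ms + scan_ms_per_mb * size_mb


def slowdown(size_mb, scan_ms_per_mb=50.0):
    # Startup costs are the per-run figures measured above for the tiny payload.
    # scan_ms_per_mb is a hypothetical placeholder, assumed identical for both runtimes.
    native = total_ms(0.86, scan_ms_per_mb, size_mb)
    wasm = total_ms(4.03, scan_ms_per_mb, size_mb)
    return wasm / native


if __name__ == "__main__":
    for size_mb in (0.0001, 1.0, 10.0):
        print(f"{size_mb} MB: WASM takes {slowdown(size_mb):.2f}x the native time")
```

&lt;p&gt;Under these assumptions the ratio moves toward 1 as the payload grows, because the fixed startup gap becomes a small share of the total.&lt;/p&gt;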

&lt;h2&gt;
  
  
  4. Unlocking WASM's True Potential
&lt;/h2&gt;

&lt;p&gt;Even though the "Cold Start" is slow, we shouldn't give up on WASM's amazing security and portability. To make WASM lightning-fast, we need to change how we run it:&lt;/p&gt;

&lt;p&gt;The easiest fix is &lt;strong&gt;Keeping It Running&lt;/strong&gt;. Our test turned the WASM tool on and off 10,000 times. If we turn it on just once when the server starts, we never have to wait for the Go startup delay again. With that cost paid only once, per-call latency should come much closer to that of a normal program.&lt;/p&gt;

&lt;p&gt;Next, we must fix the slow files by using &lt;strong&gt;In-Memory Pipes&lt;/strong&gt;. Instead of writing data to real files on the hard drive, we can push the data directly through standard in-memory streams (pipes). This fixes the file speed problem easily.&lt;/p&gt;
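
&lt;p&gt;The two ideas above, starting the engine once and exchanging data over in-memory pipes, can be sketched together in Python. This is an illustration only: the child process here is a tiny stand-in that upper-cases each line, and the &lt;code&gt;PersistentWorker&lt;/code&gt; class is a hypothetical name. A real integration would launch the redaction engine instead.&lt;/p&gt;

```python
import subprocess
import sys

# Stand-in child: echoes each input line in upper case. A real setup would
# launch the redaction engine here instead (that command is not shown).
CHILD_SOURCE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


class PersistentWorker:
    """Start the child once, then exchange one line per request over pipes."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-u", "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def process(self, line):
        # One request per line: write, flush, then wait for exactly one reply line.
        self.proc.stdin.write(line + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        # Closing stdin ends the child's read loop so it can exit normally.
        self.proc.stdin.close()
        self.proc.wait()


if __name__ == "__main__":
    worker = PersistentWorker()
    try:
        for message in ("first log line", "second log line"):
            print(worker.process(message))
    finally:
        worker.close()
```

&lt;p&gt;The startup cost is paid once in the constructor, and no temporary files are created. The line-per-request protocol assumes payloads contain no embedded newlines, so multi-line input would need framing.&lt;/p&gt;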

&lt;p&gt;The perfect, final goal is &lt;strong&gt;Shared Memory&lt;/strong&gt;. Instead of copying text back and forth, Python and WASM can look at the exact same spot in the computer's memory. This is the fastest possible way to share data, though it introduces complex engineering challenges, such as safely managing Go's Garbage Collector across the Python-WASM boundary.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkucqd0r2mtc7ib703u3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkucqd0r2mtc7ib703u3.png" alt=" " width="189" height="637"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, we should look at &lt;strong&gt;TinyGo&lt;/strong&gt;. The standard Go toolchain bundles a fairly heavy runtime into every binary. TinyGo is an alternative Go compiler aimed at small targets such as microcontrollers and WebAssembly. Using it should make the tool much smaller, and we expect it to start up considerably faster (our rough estimate is 10 to 20 times).&lt;/p&gt;

&lt;h2&gt;
  
  
  5. The Verdict: When to use which?
&lt;/h2&gt;

&lt;p&gt;Even though normal programs start faster, speed isn't the only thing that matters. Security and stability matter too. The hidden superpower of WASM is that Python is in total control. Because the WASM instance lives inside the Python process, it dies cleanly with it if Python crashes. If you use &lt;code&gt;subprocess.run&lt;/code&gt;, a crashed Python app might leave behind orphaned child processes that eat up your server's resources.&lt;/p&gt;

&lt;p&gt;So, when should you use a &lt;strong&gt;Normal Program&lt;/strong&gt;? It is best for things you run rarely, like a command-line developer tool or a script that runs once an hour. The computer's Operating System is great at starting them quickly. Proper process management in Python can mitigate the risk of leftover processes, but with WASM the tool's lifetime is tied to the Python process by construction.&lt;/p&gt;

&lt;p&gt;When should you use &lt;strong&gt;WASM&lt;/strong&gt;? WASM is a strong choice for modern, always-on cloud servers. When you need to run untrusted code safely, or you need your tool to work on any operating system without building a separate binary for each platform, WASM is hard to beat. It keeps your code confined to a sandbox.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Conclusion
&lt;/h2&gt;

&lt;p&gt;Integrating PII-Shield into GuardSpine taught us a lot about the trade-offs between raw speed and safe, modern design. While normal programs won the simple speed test, adopting WebAssembly gave the platform sandboxed execution and the ability to run anywhere without complex cross-compilation.&lt;/p&gt;

&lt;p&gt;As the platform's processing volume scales, the roadmap for this integration is clear. We need to stop turning the WASM tool on and off for every log line. Instead, the plan is to keep the engine running persistently, compile it with TinyGo, and share memory directly across the Python-WASM boundary. The ultimate goal isn't just to match the speed of normal programs, but to beat them completely, creating a lightning-fast, unbreakable redaction layer for AI development.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>go</category>
      <category>privacy</category>
      <category>security</category>
    </item>
    <item>
      <title>Stop Leaking API Keys in your AI Agent Logs: A Go Sidecar Approach</title>
      <dc:creator>Ilya Ploskovitov</dc:creator>
      <pubDate>Thu, 05 Feb 2026 21:41:24 +0000</pubDate>
      <link>https://dev.to/aragossa/stop-leaking-api-keys-in-your-ai-agent-logs-a-go-sidecar-approach-d3</link>
      <guid>https://dev.to/aragossa/stop-leaking-api-keys-in-your-ai-agent-logs-a-go-sidecar-approach-d3</guid>
      <description>&lt;h1&gt;
  
  
  Stop Leaking API Keys in your AI Agent Logs: A Go Sidecar Approach
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Subtitle:&lt;/strong&gt; The Hidden Privacy Leak in your AI Agents (and why your LLM "Audit Logs" are a GDPR Nightmare)&lt;/p&gt;




&lt;h2&gt;
  
  
  1. The Problem
&lt;/h2&gt;

&lt;p&gt;Everyone is building AI agents right now. Whether you're using LangChain, AutoGPT, or custom loops, you are almost certainly logging their work. You keep traces of every step to debug reasoning loops or monitor costs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Here lies the pain:&lt;/strong&gt; Everything goes into these logs. User prompts, model responses, and raw JSON payloads from APIs.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Visualization
&lt;/h3&gt;

&lt;p&gt;Imagine this scenario:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Input Log:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"aragossa"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"prompt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"My key is sk-live-123456"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; 
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What your Logging System Saved:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"aragossa"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"prompt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"My key is sk-live-123456"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; 
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a user pastes their password, API key, or PII into the chat, or if an API returns a sensitive internal token, that data is now permanently etched into your Elasticsearch, Datadog, or S3 bucket.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Consequence:&lt;/strong&gt; Your fine-tuning dataset is now "poisoned" with real user data. This is a massive GDPR violation and a ticking security time bomb. You can't just "delete" it if you don't know where it is.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. The Gap: Why usual methods fail
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Regex?&lt;/strong&gt; Good luck maintaining a regex list for every possible API key format, session token, and PII variation in existence. It’s a game of whack-a-mole you will lose.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;ML-based protection?&lt;/strong&gt; Too slow. If your agent operates in real-time, you cannot afford a 500ms roundtrip to a BERT model just to sanitize specific log lines. PII scans often become the bottleneck.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Python-native logic?&lt;/strong&gt; Processing massive text streams in an interpreted language (like Python) adds significant CPU overhead per log line compared to a compiled Go binary. In high-throughput pipes, this latency adds up fast.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Existing Observability Tools?&lt;/strong&gt;&lt;br&gt;
Systems like Fluent Bit, Datadog, or OpenTelemetry already offer redaction and PII masking, usually via pattern rules and regex. For many workloads that’s perfectly fine. The trade‑off is that these pipelines are not optimized for AI‑agent traces: they either run late in the pipeline (after logs have already left the pod) or rely on configuration‑heavy pattern catalogs that are hard to keep up‑to‑date in a world of ever‑changing API keys and internal tokens.&lt;/p&gt;

&lt;p&gt;Static scanners like &lt;strong&gt;TruffleHog&lt;/strong&gt; shine for repositories and CI, where you scan code at rest. They’re not meant to sit inline on a hot log stream and make sub‑millisecond decisions on every line.&lt;/p&gt;

&lt;p&gt;Where &lt;strong&gt;PII‑Shield&lt;/strong&gt; is different is not in “inventing redaction”, but in the combination of techniques tailored for AI agents: entropy + bigram signals for unknown secrets, &lt;strong&gt;deterministic HMAC&lt;/strong&gt; instead of &lt;code&gt;***&lt;/code&gt; for referential integrity, and deep JSON traversal to keep your log schemas intact while still scrubbing sensitive values.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. The Implementation: Enter PII-Shield
&lt;/h2&gt;

&lt;p&gt;Meet &lt;strong&gt;PII-Shield&lt;/strong&gt;. It’s a lightweight sidecar written in Go that sits right next to your agent.&lt;/p&gt;

&lt;h3&gt;
  
  
  Killer Feature #1: Entropy-based Detection
&lt;/h3&gt;

&lt;p&gt;We don't just search for "password=". We look for &lt;strong&gt;chaos&lt;/strong&gt;.&lt;br&gt;
API keys and authentication tokens naturally have high "entropy" (randomness/complexity). Normal human language has much lower entropy. By scoring the Shannon entropy of each token, we can flag 64-character hex strings or base64 blobs without knowing their specific format.&lt;/p&gt;
&lt;h3&gt;
  
  
  Killer Feature #2: Deterministic HMAC
&lt;/h3&gt;

&lt;p&gt;This is the feature that caught attention on Hacker News.&lt;br&gt;
We don't just replace secrets with &lt;code&gt;***&lt;/code&gt;. We turn &lt;code&gt;secret123&lt;/code&gt; into &lt;code&gt;[HIDDEN:a1b2c3]&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Input Log:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"aragossa"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"prompt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"My key is sk-live-123456"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; 
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;PII-Shield Output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"aragossa"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"prompt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"My key is [HIDDEN:8f2a1b]"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; 
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt;&lt;br&gt;
This is a deterministic HMAC (Hash-based Message Authentication Code): the same secret, hashed with the same salt, always produces the same tag.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  It allows you to &lt;strong&gt;trace&lt;/strong&gt; a specific user or session across multiple log lines without knowing who they are.&lt;/li&gt;
&lt;li&gt;  It preserves &lt;strong&gt;Referential Integrity&lt;/strong&gt; for debugging. You can see that "Session A" failed 5 times, but you cannot see the Session ID itself.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Killer Feature #3: Statistical Adaptive Threshold
&lt;/h3&gt;

&lt;p&gt;PII-Shield doesn't rely on a single hardcoded cutoff. It learns the "baseline noise" of your logs: by tracking the mean and standard deviation of what it sees and applying a 2σ rule, it automatically adjusts its sensitivity to your specific environment.&lt;/p&gt;
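&lt;p&gt;To make the idea concrete, here is a minimal sketch of an adaptive cutoff. It assumes the threshold is "mean plus two standard deviations" of recent token scores and uses Welford's online algorithm to keep the running statistics; the &lt;code&gt;baseline&lt;/code&gt; type, its method names, and the fallback value are hypothetical and not taken from the PII-Shield source.&lt;/p&gt;

```go
package main

import (
    "fmt"
    "math"
)

// baseline keeps a running mean and variance of token scores using
// Welford's online algorithm (an assumption made for this sketch).
type baseline struct {
    n    float64
    mean float64
    m2   float64 // sum of squared deviations from the running mean
}

// observe folds one score into the running statistics.
func (b *baseline) observe(score float64) {
    b.n++
    delta := score - b.mean
    b.mean += delta / b.n
    b.m2 += delta * (score - b.mean)
}

// threshold returns mean + k*stddev once at least two samples have been
// seen, and the fallback value before that.
func (b *baseline) threshold(k, fallback float64) float64 {
    if b.n >= 2 {
        std := math.Sqrt(b.m2 / (b.n - 1))
        return b.mean + k*std
    }
    return fallback
}

func main() {
    b := new(baseline)
    for _, s := range []float64{2.0, 2.5, 3.0, 2.5, 2.0, 3.0} {
        b.observe(s)
    }
    limit := b.threshold(2, 4.5)
    fmt.Printf("threshold=%.2f flag(5.1)=%v\n", limit, 5.1 > limit)
}
```

&lt;p&gt;The fallback matters during warm-up: with fewer than two samples there is no variance to speak of, so a fixed cutoff is the only safe choice.&lt;/p&gt;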
&lt;h2&gt;
  
  
  4. The Logic: Sidecar Architecture
&lt;/h2&gt;

&lt;p&gt;The architecture is dead simple, leveraging the power of Kubernetes sidecars or UNIX pipes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe2a293mf3qpmfv2r502b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe2a293mf3qpmfv2r502b.png" alt=" " width="800" height="442"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Your agent simply writes logs to stdout. PII-Shield intercepts the stream, scans it in real time with minimal overhead, sanitizes it, and passes it forward.&lt;/p&gt;
&lt;h2&gt;
  
  
  5. The Technical Meat
&lt;/h2&gt;

&lt;p&gt;Why Go? Because we need raw speed and no dependencies.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Entropy &amp;amp; Bigrams (The Math)
&lt;/h3&gt;

&lt;p&gt;Here is how we calculate the "Chaos" (Entropy) of a token in &lt;a href="https://github.com/aragossa/pii-shield/blob/main/pkg/scanner/scanner.go" rel="noopener noreferrer"&gt;scanner.go&lt;/a&gt;. We use a combination of Shannon Entropy, Character Class Bonuses, and English Bigram analysis.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Bigram Check&lt;/strong&gt; is crucial: it penalizes strings that look like valid English (common letter pairs) and boosts the score for "unnatural" strings. (Note: this is optimized for English, but it can be disabled or tuned via &lt;code&gt;PII_DISABLE_BIGRAM_CHECK&lt;/code&gt; for other languages.)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// From scanner.go&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;CalculateComplexity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// 1. Shannon Entropy&lt;/span&gt;
    &lt;span class="n"&gt;entropy&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;calculateShannon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 

    &lt;span class="c"&gt;// 2. Class Bonus (Upper, Lower, Digit, Symbols)&lt;/span&gt;
    &lt;span class="c"&gt;// bonus := float64(classes-1) * 0.5&lt;/span&gt;
    &lt;span class="n"&gt;bonus&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;calculateClassBonus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 

    &lt;span class="c"&gt;// 3. Bigram Check (English Likelihood)&lt;/span&gt;
    &lt;span class="c"&gt;// Penalizes common English, boosts random noise&lt;/span&gt;
    &lt;span class="n"&gt;bigramScore&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;calculateBigramAdjustment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;entropy&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;bonus&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;bigramScore&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
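&lt;p&gt;The excerpt above calls helpers that are not shown. As a rough sketch of what the first two might look like, here is a standalone Shannon entropy function (bits per character) and a class bonus that follows the commented formula &lt;code&gt;(classes-1) * 0.5&lt;/code&gt;. The names &lt;code&gt;shannonEntropy&lt;/code&gt; and &lt;code&gt;classBonus&lt;/code&gt; are placeholders, and the real implementations in scanner.go may differ.&lt;/p&gt;

```go
package main

import (
    "fmt"
    "math"
    "unicode"
)

// shannonEntropy returns the entropy of s in bits per character.
func shannonEntropy(s string) float64 {
    counts := make(map[rune]int)
    total := 0
    for _, r := range s {
        counts[r]++
        total++
    }
    if total == 0 {
        return 0
    }
    entropy := 0.0
    for _, c := range counts {
        p := float64(c) / float64(total)
        entropy -= p * math.Log2(p)
    }
    return entropy
}

// classBonus counts which character classes appear (upper, lower, digit,
// symbol) and returns (classes-1) * 0.5, as in the commented formula.
func classBonus(s string) float64 {
    var upper, lower, digit, symbol bool
    for _, r := range s {
        switch {
        case unicode.IsUpper(r):
            upper = true
        case unicode.IsLower(r):
            lower = true
        case unicode.IsDigit(r):
            digit = true
        default:
            symbol = true
        }
    }
    classes := 0
    for _, seen := range []bool{upper, lower, digit, symbol} {
        if seen {
            classes++
        }
    }
    if classes == 0 {
        return 0
    }
    return float64(classes-1) * 0.5
}

func main() {
    for _, tok := range []string{"password", "sk-live-9fA3xQ7LmZ2"} {
        fmt.Printf("%-22s entropy=%.2f bonus=%.1f\n", tok, shannonEntropy(tok), classBonus(tok))
    }
}
```

&lt;p&gt;A single-class word like "password" gets no bonus, while a token mixing cases, digits, and symbols gets the maximum of 1.5 on top of its entropy.&lt;/p&gt;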



&lt;h3&gt;
  
  
  2. False Positives? (Whitelists)
&lt;/h3&gt;

&lt;p&gt;"Entropy is great, but won't it eat my Git Hashes or UUIDs?"&lt;br&gt;
PII-Shield includes built-in &lt;strong&gt;Whitelists&lt;/strong&gt; (&lt;a href="https://github.com/aragossa/pii-shield/blob/main/pkg/scanner/scanner.go#670-760" rel="noopener noreferrer"&gt;isSafe&lt;/a&gt; function) for standard identifiers like UUIDs, IPv6 addresses, Git Commit hashes (SHA-1), and MongoDB ObjectIDs. This ensures your debugging data stays intact while secrets get redacted.&lt;/p&gt;
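&lt;p&gt;A whitelist check of this kind can be sketched with a few anchored patterns. The patterns and the &lt;code&gt;looksSafe&lt;/code&gt; name below are illustrative assumptions, not a copy of the real &lt;code&gt;isSafe&lt;/code&gt; function.&lt;/p&gt;

```go
package main

import (
    "fmt"
    "net"
    "regexp"
    "strings"
)

// Hypothetical patterns for this sketch; the real isSafe may differ.
var safePatterns = []*regexp.Regexp{
    // UUID, e.g. 123e4567-e89b-12d3-a456-426614174000
    regexp.MustCompile(`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`),
    // Git commit hash (SHA-1): 40 lowercase hex characters
    regexp.MustCompile(`^[0-9a-f]{40}$`),
    // MongoDB ObjectID: 24 lowercase hex characters
    regexp.MustCompile(`^[0-9a-f]{24}$`),
}

// looksSafe reports whether token matches a well-known identifier format.
func looksSafe(token string) bool {
    for _, re := range safePatterns {
        if re.MatchString(token) {
            return true
        }
    }
    // IPv6: require a colon so plain IPv4 addresses do not match this branch.
    if strings.Contains(token, ":") {
        return net.ParseIP(token) != nil
    }
    return false
}

func main() {
    for _, tok := range []string{
        "123e4567-e89b-12d3-a456-426614174000",
        "da39a3ee5e6b4b0d3255bfef95601890afd80709",
        "2001:db8::1",
        "sk-live-123456",
    } {
        fmt.Printf("%-42s safe=%v\n", tok, looksSafe(tok))
    }
}
```

&lt;p&gt;The trade-off is worth keeping in mind: a format-only whitelist cannot tell a Git hash from a 40-character hex secret, so any token that happens to share a whitelisted shape passes through untouched.&lt;/p&gt;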
&lt;h3&gt;
  
  
  3. Credit Card Detection (Luhn Algorithm)
&lt;/h3&gt;

&lt;p&gt;Entropy isn't enough for credit cards, as numbers often have low randomness. PII-Shield includes a high-performance implementation of the &lt;strong&gt;Luhn Algorithm&lt;/strong&gt; to scan for valid card checksums in the stream (&lt;a href="https://github.com/aragossa/pii-shield/blob/main/pkg/scanner/scanner.go#813-883" rel="noopener noreferrer"&gt;FindLuhnSequences&lt;/a&gt;).&lt;/p&gt;
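&lt;p&gt;For reference, the Luhn checksum itself fits in a few lines. This is a minimal sketch of the validity check only, assuming the candidate has already been reduced to a run of ASCII digits; it is not the &lt;code&gt;FindLuhnSequences&lt;/code&gt; implementation, and &lt;code&gt;luhnValid&lt;/code&gt; is a placeholder name.&lt;/p&gt;

```go
package main

import "fmt"

// luhnValid reports whether digits (ASCII digits only, no separators)
// passes the Luhn checksum.
func luhnValid(digits string) bool {
    if len(digits) == 0 {
        return false
    }
    sum := 0
    double := false
    // Walk right to left, doubling every second digit.
    for i := len(digits) - 1; i >= 0; i-- {
        c := digits[i]
        // Byte arithmetic wraps around, so this rejects anything outside '0'..'9'.
        if c-'0' > 9 {
            return false
        }
        d := int(c - '0')
        if double {
            d *= 2
            if d > 9 {
                d -= 9
            }
        }
        sum += d
        double = !double
    }
    return sum%10 == 0
}

func main() {
    fmt.Println(luhnValid("4111111111111111")) // true: a well-known test card number
    fmt.Println(luhnValid("4111111111111112")) // false: last digit changed
}
```

&lt;p&gt;On its own the checksum accepts roughly one in ten random digit strings, so a scanner would normally combine it with length limits (card numbers are typically 13 to 19 digits) before redacting.&lt;/p&gt;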
&lt;h3&gt;
  
  
  4. Deep JSON Inspection
&lt;/h3&gt;

&lt;p&gt;For JSON logs, PII-Shield performs deep inspection (&lt;a href="https://github.com/aragossa/pii-shield/blob/main/pkg/scanner/scanner.go#962-981" rel="noopener noreferrer"&gt;processJSONLine&lt;/a&gt;), preserving the schema while redacting values. &lt;br&gt;
&lt;em&gt;Note: PII-Shield re-serializes JSON (using &lt;code&gt;json.Marshal&lt;/code&gt;), which may change key order/sorting. It guarantees semantic integrity but is best used early in your pipeline, before any byte-sensitive steps (like signing or exact-diff comparisons).&lt;/em&gt;&lt;/p&gt;
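&lt;p&gt;The traversal can be pictured as a small recursive walk over the decoded document. The sketch below is an assumption about the general shape, not the &lt;code&gt;processJSONLine&lt;/code&gt; source: &lt;code&gt;redactValue&lt;/code&gt; is a stand-in detector that masks any string containing "sk-live-", and the other function names are hypothetical.&lt;/p&gt;

```go
package main

import (
    "encoding/json"
    "fmt"
    "strings"
)

// redactValue is a placeholder detector for this sketch: it masks any
// string containing "sk-live-". A real scanner would score tokens instead.
func redactValue(s string) string {
    if strings.Contains(s, "sk-live-") {
        return "[HIDDEN]"
    }
    return s
}

// walk redacts string values in place at any depth and returns v.
// Keys, numbers, booleans, and nulls are left untouched.
func walk(v interface{}) interface{} {
    switch t := v.(type) {
    case map[string]interface{}:
        for k, child := range t {
            t[k] = walk(child)
        }
        return t
    case []interface{}:
        for i, child := range t {
            t[i] = walk(child)
        }
        return t
    case string:
        return redactValue(t)
    default:
        return v
    }
}

// redactJSONLine parses one JSON log line, redacts string values, and
// re-serializes the result. Invalid JSON yields an error.
func redactJSONLine(line string) (string, error) {
    parsed := new(interface{})
    if err := json.Unmarshal([]byte(line), parsed); err != nil {
        return "", err
    }
    out, err := json.Marshal(walk(*parsed))
    if err != nil {
        return "", err
    }
    return string(out), nil
}

func main() {
    out, err := redactJSONLine(`{"user":"aragossa","ctx":{"prompts":["hi","My key is sk-live-123456"]}}`)
    if err != nil {
        fmt.Println("error:", err)
        return
    }
    // prints {"ctx":{"prompts":["hi","[HIDDEN]"]},"user":"aragossa"}
    fmt.Println(out)
}
```

&lt;p&gt;The output shows the key reordering mentioned in the note: &lt;code&gt;json.Marshal&lt;/code&gt; sorts map keys, so "ctx" now precedes "user". Decoding into generic values also turns every number into a float64, which can lose precision on very large integers.&lt;/p&gt;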
&lt;h3&gt;
  
  
  5. Deterministic Redaction
&lt;/h3&gt;

&lt;p&gt;Here is the HMAC logic using a secure Salt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;redactWithHMAC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sensitiveData&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// CurrentConfig.Salt is loaded securely from env vars&lt;/span&gt;
    &lt;span class="n"&gt;mac&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;hmac&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sha256&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currentConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Salt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;mac&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sensitiveData&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;hash&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;hex&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EncodeToString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mac&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="c"&gt;// We only keep a short prefix for tracing identity&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"[HIDDEN:%s]"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="m"&gt;6&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The main loop (&lt;a href="https://github.com/aragossa/pii-shield/blob/main/cmd/cleaner/main.go" rel="noopener noreferrer"&gt;cmd/cleaner/main.go&lt;/a&gt;) is a highly efficient buffered reader:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;bufio&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewScanner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Stdin&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Scan&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;reader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="c"&gt;// Core logic: Low allocation overhead&lt;/span&gt;
        &lt;span class="c"&gt;// Imports github.com/aragossa/pii-shield/pkg/scanner&lt;/span&gt;
        &lt;span class="n"&gt;cleaned&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;scanner&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ScanAndRedact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
        &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cleaned&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
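&lt;p&gt;Two details are easy to miss in a loop like this: &lt;code&gt;bufio.Scanner&lt;/code&gt; stops on lines longer than its buffer limit (64 KiB by default), and it reports that only through &lt;code&gt;Err()&lt;/code&gt;. The sketch below shows one way to handle both. The &lt;code&gt;process&lt;/code&gt; and &lt;code&gt;processString&lt;/code&gt; helpers, the 1 MiB limit, and the stand-in redactor are assumptions for illustration, not code from the repository.&lt;/p&gt;

```go
package main

import (
    "bufio"
    "fmt"
    "io"
    "os"
    "strings"
)

// process copies r to w line by line, passing each line through redact.
func process(r io.Reader, w io.Writer, redact func(string) string) error {
    sc := bufio.NewScanner(r)
    // Raise the per-line limit from the 64 KiB default to 1 MiB (arbitrary choice).
    sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
    for sc.Scan() {
        if _, err := fmt.Fprintln(w, redact(sc.Text())); err != nil {
            return err
        }
    }
    // Scan returns false on oversized lines and read errors alike; Err tells them apart from EOF.
    return sc.Err()
}

// processString runs process over an in-memory string, which is handy for demos and tests.
func processString(in string, redact func(string) string) (string, error) {
    out := new(strings.Builder)
    err := process(strings.NewReader(in), out, redact)
    return out.String(), err
}

func main() {
    // Placeholder redactor standing in for scanner.ScanAndRedact.
    redact := func(s string) string {
        return strings.ReplaceAll(s, "sk-live-123456", "[HIDDEN]")
    }
    out, err := processString("user=aragossa\nMy key is sk-live-123456\n", redact)
    fmt.Print(out)
    if err != nil {
        fmt.Fprintln(os.Stderr, "read error:", err)
        os.Exit(1)
    }
}
```

&lt;p&gt;For a real pipe, the same &lt;code&gt;process&lt;/code&gt; function can be called with &lt;code&gt;os.Stdin&lt;/code&gt; and &lt;code&gt;os.Stdout&lt;/code&gt;. What to do with an oversized line (drop it, truncate it, or fail closed) is a policy decision the sketch leaves open.&lt;/p&gt;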



&lt;h3&gt;
  
  
  A Note on Scope: High-Entropy vs. Natural Language
&lt;/h3&gt;

&lt;p&gt;Let's be clear: PII-Shield is laser-focused on &lt;strong&gt;High-Entropy secrets&lt;/strong&gt; (API keys, tokens, auth headers) and structural patterns (Credit Cards). It is &lt;strong&gt;not&lt;/strong&gt; a magic NLP bullet for detecting names like "John Smith" or free-text addresses. Treat PII-Shield as a low-latency "first line of defense" for your infrastructure, and complement it with heavier offline NLP tools where you need semantic analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. How to try it
&lt;/h2&gt;

&lt;p&gt;I am looking for edge cases. If you are building AI agents, try running your trace logs through this and see what it catches (or misses).&lt;/p&gt;

&lt;p&gt;You can run it locally with Docker:&lt;br&gt;
&lt;code&gt;docker run -i -e PII_SALT="mysalt" pii-shield &amp;lt; logs.txt&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/aragossa/pii-shield" rel="noopener noreferrer"&gt;https://github.com/aragossa/pii-shield&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why this matters:&lt;/strong&gt;&lt;br&gt;
Security often trails behind innovation. With the explosion of AI Agents, we are generating massive amounts of sensitive data in logs. PII-Shield is a "drop-in" safety net to ensure your innovation doesn't become a liability.&lt;/p&gt;

</description>
      <category>security</category>
      <category>ai</category>
      <category>devops</category>
      <category>go</category>
    </item>
    <item>
      <title>Playwright &amp; Chaos Engineering: 3 Ways to Break Your UI in 10 Lines of Code 🧨</title>
      <dc:creator>Ilya Ploskovitov</dc:creator>
      <pubDate>Tue, 03 Feb 2026 13:00:00 +0000</pubDate>
      <link>https://dev.to/aragossa/playwright-chaos-engineering-3-ways-to-break-your-ui-in-10-lines-of-code-2bkj</link>
      <guid>https://dev.to/aragossa/playwright-chaos-engineering-3-ways-to-break-your-ui-in-10-lines-of-code-2bkj</guid>
      <description>&lt;p&gt;"The tests are green, but production is down."&lt;/p&gt;

&lt;p&gt;We’ve all been there. Your CI/CD pipeline looks like a Christmas tree (all green), yet 5 minutes after deployment, the support tickets start rolling in. Why? Because we tend to test only the &lt;strong&gt;"Happy Path."&lt;/strong&gt; In the real world, users enter elevators (network loss), backends have database deadlocks (500 errors), and low-end devices struggle with heavy JS (CPU race conditions).&lt;/p&gt;

&lt;p&gt;Here are 3 simple ways to inject chaos into your Playwright tests using &lt;strong&gt;Python&lt;/strong&gt; and &lt;strong&gt;TypeScript&lt;/strong&gt; without any external dependencies.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. The "Kill the Backend" Scenario (500 Error Injection)
&lt;/h2&gt;

&lt;p&gt;What happens if your billing API fails? Does your UI show a "Retry" button, or does it hang forever?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scenario&lt;/strong&gt;: Intercept a critical API call and return a 500 Internal Server Error.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Python Code&lt;/b&gt;&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_billing_failure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Intercepting the payment endpoint
&lt;/span&gt;    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**/api/v1/billing/pay&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fulfill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;content_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Internal Database Error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/checkout&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_by_role&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;button&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Pay Now&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Assert that the UI handles the crash gracefully
&lt;/span&gt;    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.error-message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;to_be_visible&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;b&gt;TypeScript Code&lt;/b&gt;&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;handle billing failure&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;amp;&lt;/span&gt;&lt;span class="nx"&gt;gt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;**/api/v1/billing/pay&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;route&lt;/span&gt; &lt;span class="o"&gt;=&amp;amp;&lt;/span&gt;&lt;span class="nx"&gt;gt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fulfill&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;contentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Internal Database Error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="p"&gt;}));&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/checkout&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByRole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;button&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Pay Now&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.error-message&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  2. The "Elevator Effect" (Sudden Offline Mode)
&lt;/h2&gt;

&lt;p&gt;Users move. Networks drop. If your app is an SPA, losing connection mid-session can lead to corrupted local states.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scenario&lt;/strong&gt;: Start a file upload and cut the internet connection.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Python Code&lt;/b&gt;&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_upload_interruption&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/upload&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_by_label&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;File&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;set_input_files&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;heavy_video.mp4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Chaos: Go offline instantly
&lt;/span&gt;    &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_offline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Expect a "Resume" button or "Connection lost" banner
&lt;/span&gt;    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_by_role&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;button&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Resume&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;to_be_visible&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_offline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Restore network
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;b&gt;TypeScript Code&lt;/b&gt;&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;recovery on network loss&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;amp;&lt;/span&gt;&lt;span class="nx"&gt;gt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/upload&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByLabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;File&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;setInputFiles&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;heavy_video.mp4&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setOffline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByRole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;button&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Resume&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setOffline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  3. The "Old Phone" Race Condition (CPU Throttling)
&lt;/h2&gt;

&lt;p&gt;Async bugs often hide behind the speed of your developer laptop. By slowing down the CPU, you change the execution order of scripts and catch elusive race conditions.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Python Code&lt;/b&gt;&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_race_condition&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Slow down CPU by 6x using Chrome DevTools Protocol (CDP)
&lt;/span&gt;    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new_cdp_session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Emulation.setCPUThrottlingRate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/heavy-dashboard&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_by_role&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;button&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Load Stats&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Assert that the status eventually becomes 'Ready'
&lt;/span&gt;    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;to_contain_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Ready&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;b&gt;TypeScript Code&lt;/b&gt;&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;catch race conditions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;amp;&lt;/span&gt;&lt;span class="nx"&gt;gt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;context&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;newCDPSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Emulation.setCPUThrottlingRate&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;rate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/heavy-dashboard&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByRole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;button&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Load Stats&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#status&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toContainText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Ready&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  💡 Pro Tip: When to Run These?
&lt;/h2&gt;

&lt;p&gt;Don't run chaos tests on every PR. They are inherently more complex and can be "flaky" if your timeouts aren't tuned.&lt;/p&gt;

&lt;p&gt;Best Practice: Add them to a nightly or pre-release suite.&lt;/p&gt;

&lt;p&gt;Limit: Remember that CPU throttling relies on the Chrome DevTools Protocol (CDP), so it only works on Chromium-based browsers.&lt;/p&gt;

&lt;p&gt;Wrapping Up&lt;br&gt;
Resilience is a feature. If you only test for success, you're only doing half of your job as a QA Engineer. Break your UI before your users do.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I’ve written a more detailed deep-dive on Resilience Strategy &amp;amp; CI/CD integration on my new blog. Check it out at &lt;a href="http://chaosqa.com" rel="noopener noreferrer"&gt;ChaosQA.com&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>testing</category>
      <category>automation</category>
      <category>playwright</category>
      <category>python</category>
    </item>
    <item>
      <title>Please, Stop Redirecting to Login on 401 Errors 🛑</title>
      <dc:creator>Ilya Ploskovitov</dc:creator>
      <pubDate>Thu, 15 Jan 2026 21:57:24 +0000</pubDate>
      <link>https://dev.to/aragossa/please-stop-redirecting-to-login-on-401-errors-3c0l</link>
      <guid>https://dev.to/aragossa/please-stop-redirecting-to-login-on-401-errors-3c0l</guid>
      <description>&lt;p&gt;You spend 15 minutes filling out a long configuration form. You get a Slack notification, switch tabs, reply to a colleague, and grab a coffee.&lt;/p&gt;

&lt;p&gt;30 minutes later, you come back to the form and click &lt;strong&gt;"Save"&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The page flashes. The login screen appears. &lt;strong&gt;And your data is gone.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the most annoying UX pattern in web development, and we need to stop doing it.&lt;/p&gt;

&lt;h3&gt;
  
  
  The "Lazy" Pattern
&lt;/h3&gt;

&lt;p&gt;Why does this happen? Usually, it's because the JWT (access token) expired, the backend returned a 401 Unauthorized, and the frontend code did exactly what the tutorials said to do:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Don't do this&lt;/span&gt;
&lt;span class="nx"&gt;axios&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;interceptors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;href&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/login&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// RIP data 💀&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Developers often argue: &lt;em&gt;"But it's a security requirement! The session is dead!"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Yes, the session is dead. But that doesn't mean you have to kill the current page state.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Better Way (Resilience)
&lt;/h3&gt;

&lt;p&gt;If a user is just reading a dashboard, a redirect is fine. But if they have unsaved input (forms, comments, settings), a redirect is a bug.&lt;/p&gt;

&lt;p&gt;Here is how a robust app handles this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Intercept:&lt;/strong&gt; Catch the 401 error.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Queue:&lt;/strong&gt; Pause the failed request. Do not reload the page.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Refresh:&lt;/strong&gt; Try to get a new token in the background (using a refresh token) OR show a modal asking for the password again.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Retry:&lt;/strong&gt; Once authenticated, replay the original request with the new token.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The user doesn't even notice. The form saves successfully.&lt;/p&gt;
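
&lt;p&gt;The four steps above can be sketched roughly as follows. This is a minimal, framework-agnostic Python sketch rather than a production interceptor: &lt;code&gt;send_with_silent_refresh&lt;/code&gt;, &lt;code&gt;fake_save&lt;/code&gt;, and the (status, body) response tuple are hypothetical names made up for illustration. A real frontend would apply the same idea inside its HTTP client's response interceptor.&lt;/p&gt;

```python
from typing import Callable, Tuple

# A response is modelled as (status_code, body). All names here are hypothetical.
Response = Tuple[int, str]


def send_with_silent_refresh(
    send: Callable[[str], Response],
    refresh: Callable[[], str],
    token: str,
) -> Tuple[Response, str]:
    """Send a request; on 401, refresh the token once and replay the request."""
    response = send(token)  # 1. Intercept: inspect the status code
    if response[0] != 401:
        return response, token
    # 2. Queue: the original request is held here; no page state is thrown away.
    new_token = refresh()  # 3. Refresh: obtain a new access token
    return send(new_token), new_token  # 4. Retry: replay with the new token


# Usage with a fake backend that only accepts the token "fresh".
def fake_save(token: str) -> Response:
    return (200, "saved") if token == "fresh" else (401, "expired")


result, current_token = send_with_silent_refresh(fake_save, lambda: "fresh", "stale")
```

&lt;p&gt;The sketch retries only once on purpose. If the replayed request still returns 401, the caller gets that response back and can fall back to the password modal instead of looping.&lt;/p&gt;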

&lt;h3&gt;
  
  
  How to test this? (The hard part)
&lt;/h3&gt;

&lt;p&gt;Implementing the "Silent Refresh" is tricky, but testing it is even more annoying.&lt;/p&gt;

&lt;p&gt;Access tokens usually last 1 hour. You can't ask your QA team to "wait 60 minutes and then click Save" to verify the fix.&lt;/p&gt;

&lt;p&gt;You need a way to trigger a 401 error &lt;strong&gt;exactly when you click the button&lt;/strong&gt;, even if the token is valid.&lt;/p&gt;

&lt;h3&gt;
  
  
  The "Chaos" Approach
&lt;/h3&gt;

&lt;p&gt;Instead of waiting for the token to expire naturally, we can just delete it "mid-flight."&lt;/p&gt;

&lt;p&gt;I use &lt;strong&gt;Playwright&lt;/strong&gt; for this. We can intercept the outgoing request and strip the Authorization header before it hits the server.&lt;/p&gt;

&lt;p&gt;This forces the backend to reject the request, triggering your app's recovery logic immediately.&lt;/p&gt;

&lt;p&gt;Here is a Python/Playwright snippet I use to verify my apps are "expiry-proof":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_chaos_silent_logout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# 1. Login and go to a form
&lt;/span&gt;    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/login&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# ... perform login logic ...
&lt;/span&gt;    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/settings/profile&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 2. Fill out data
&lt;/span&gt;    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#bio&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Important text I don&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t want to lose.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 3. CHAOS: Intercept the 'save' request
&lt;/span&gt;    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;kill_token&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;
        &lt;span class="c1"&gt;# We manually delete the token to simulate expiration
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;del&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="c1"&gt;# Send the "naked" request. Backend will throw 401.
&lt;/span&gt;        &lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;continue_&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Attach the interceptor
&lt;/span&gt;    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**/api/profile/save&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kill_token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 4. Click Save
&lt;/span&gt;    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#save-btn&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 5. Check if we survived
&lt;/span&gt;
    &lt;span class="c1"&gt;# If the app is bad, we are now on /login
&lt;/span&gt;    &lt;span class="c1"&gt;# if page.url == "/login": fail()
&lt;/span&gt;
    &lt;span class="c1"&gt;# If the app is good, it refreshed the token and retried.
&lt;/span&gt;    &lt;span class="c1"&gt;# The text should still be there, and the save should succeed.
&lt;/span&gt;    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#bio&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;to_have_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Important text I don&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t want to lose.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.success-message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;to_be_visible&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Summary
&lt;/h3&gt;

&lt;p&gt;Network failures and expired tokens are facts of life. Your app should handle them without punishing the user.&lt;/p&gt;

&lt;p&gt;If you want to build high-quality software, treat 401 Unauthorized as a recoverable error, not a fatal crash.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;PS: If you need to test this on real mobile devices where you can't run Playwright scripts, you can use a &lt;a href="https://chaos-proxy.debuggo.app/blog/silent-logout-auth-chaos" rel="noopener noreferrer"&gt;Chaos Proxy&lt;/a&gt; to strip headers on the network level.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>testing</category>
      <category>ux</category>
      <category>playwright</category>
    </item>
    <item>
      <title>I got tired of guessing why my server crashed: Building a "Smart" Monitor with Global Checks &amp; JSON Validation</title>
      <dc:creator>Ilya Ploskovitov</dc:creator>
      <pubDate>Sat, 03 Jan 2026 21:08:37 +0000</pubDate>
      <link>https://dev.to/aragossa/i-got-tired-of-guessing-why-my-server-crashed-building-a-smart-monitor-with-global-checks-json-f42</link>
      <guid>https://dev.to/aragossa/i-got-tired-of-guessing-why-my-server-crashed-building-a-smart-monitor-with-global-checks-json-f42</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgw8qbvcvj0sv8qhvkyab.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgw8qbvcvj0sv8qhvkyab.png" alt=" " width="800" height="385"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;: Basic uptime tools just tell you "It's down." I wanted a tool that tells me why (DNS? SSL? App crash?), checks from Tokyo/NY, and validates JSON schemas. So I built OpsPulse.&lt;/p&gt;

&lt;h4&gt;
  
  
  The "It’s Just Down" Problem
&lt;/h4&gt;

&lt;p&gt;Every developer has been there. It’s 3:00 AM. PagerDuty/Telegram screams "Service Down." You wake up, rush to the terminal, check the logs... and the service is fine.&lt;/p&gt;

&lt;p&gt;Was it a network blip? Did the load balancer choke? Did an ISP in Europe drop packets?&lt;/p&gt;

&lt;p&gt;Most uptime monitors are lazy. They check for a 200 OK from a single region (usually AWS us-east-1) and call it a day. That wasn't enough for me. I decided to build a platform that digs deeper without costing as much as Datadog.&lt;/p&gt;

&lt;p&gt;Here is how I built &lt;strong&gt;OpsPulse&lt;/strong&gt; and what makes it different.&lt;/p&gt;

&lt;h4&gt;
  
  
  1. Smart Diagnostics (Root Cause Analysis)
&lt;/h4&gt;

&lt;p&gt;The killer feature of OpsPulse is context. It doesn’t just yell "Error!"; it tries to diagnose the patient.&lt;/p&gt;

&lt;p&gt;When an HTTP check fails, the worker triggers a cascade of lower-level checks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ping (ICMP):&lt;/strong&gt; Is the server even reachable?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;TCP Connect:&lt;/strong&gt; Is the port open even though Nginx is hanging?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SSL Handshake:&lt;/strong&gt; Did the cert expire, or is the chain of trust broken?&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The Result:&lt;/strong&gt; Instead of a generic "Error 500", you get a Telegram alert saying:&lt;br&gt;
🔴 &lt;strong&gt;Status:&lt;/strong&gt; DOWN&lt;br&gt;
📉 &lt;strong&gt;Reason:&lt;/strong&gt; Web Server Error&lt;br&gt;
🧠 &lt;strong&gt;Context:&lt;/strong&gt; Port 443 open, Ping OK, but Nginx returned 502 Bad Gateway. The issue is on the backend.&lt;/p&gt;
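
&lt;p&gt;To illustrate how such a cascade can be turned into a readable reason, here is a small Python sketch. It is not the OpsPulse implementation: the function name &lt;code&gt;diagnose&lt;/code&gt;, its boolean inputs, and the wording of the messages are assumptions. It only shows the decision order, from the lowest network layer up to HTTP.&lt;/p&gt;

```python
from typing import Optional


def diagnose(ping_ok: bool, tcp_ok: bool, tls_ok: bool, http_status: Optional[int]) -> str:
    """Map the results of the lower-level checks to a human-readable reason."""
    # A failed ping alone is not conclusive, because firewalls often drop ICMP.
    if not ping_ok and not tcp_ok:
        return "Host unreachable: no ICMP reply and the port is closed"
    if not tcp_ok:
        return "Port closed: host answers ping but refuses TCP connections"
    if not tls_ok:
        return "TLS failure: certificate expired or chain of trust broken"
    if http_status is None:
        return "Web server hanging: port open, but no HTTP response"
    if http_status >= 500:
        return f"Web server error: port open, but the server returned {http_status}"
    return f"Unexpected HTTP status {http_status}"


reason = diagnose(ping_ok=True, tcp_ok=True, tls_ok=True, http_status=502)
```

&lt;p&gt;Checking the layers bottom-up keeps the message honest: a 502 is only reported as a backend problem once the network and TLS layers are known to be fine.&lt;/p&gt;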

&lt;h4&gt;
  
  
  2. True Global Monitoring (Multi-Region)
&lt;/h4&gt;

&lt;p&gt;Local checks lie. To verify availability properly, I integrated &lt;strong&gt;Google Cloud Functions&lt;/strong&gt;. OpsPulse spins up ephemeral runners to check your resource simultaneously from the US, Europe, and Asia.&lt;/p&gt;

&lt;p&gt;This enabled a &lt;strong&gt;Global DNS Monitor&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Checks propagation of A, MX, TXT records worldwide.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Uses &lt;strong&gt;Fuzzy Matching&lt;/strong&gt; (handling trailing dots and format quirks).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If your site is up in NY but down in Tokyo — you’ll know.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
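
&lt;p&gt;The kind of fuzzy matching meant here might look like the sketch below. The helper names &lt;code&gt;normalize_record&lt;/code&gt; and &lt;code&gt;records_match&lt;/code&gt; and the exact normalization rules are assumptions for illustration, not the rules OpsPulse applies. The idea is to ignore cosmetic differences (trailing root dot, hostname case, quoted TXT answers) while leaving TXT content case-sensitive.&lt;/p&gt;

```python
def normalize_record(record_type: str, value: str) -> str:
    """Normalize a DNS answer so cosmetic differences do not cause false alarms."""
    value = value.strip()
    if record_type == "TXT":
        # Resolvers often wrap TXT answers in double quotes; the content itself
        # is case-sensitive, so it is left untouched.
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        return value
    if record_type == "MX":
        # "10 Mail.Example.com." becomes "10 mail.example.com"
        priority, _, host = value.partition(" ")
        return f"{priority} {host.strip().rstrip('.').lower()}"
    # A records and hostnames: drop the trailing root dot, compare case-insensitively.
    return value.rstrip(".").lower()


def records_match(record_type: str, expected: str, actual: str) -> bool:
    return normalize_record(record_type, expected) == normalize_record(record_type, actual)


mx_matches = records_match("MX", "10 mail.example.com", "10 Mail.Example.com.")
```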

&lt;h4&gt;
  
  
  3. Dev-Centric Features (Not just for websites)
&lt;/h4&gt;

&lt;p&gt;I built this for developers, not just for marketing landing pages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Advanced HTTP Monitor:&lt;/strong&gt; It supports custom headers, all methods (GET, POST, PATCH), and strict &lt;strong&gt;Content Validation&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Positive Match:&lt;/strong&gt; Ensure the response contains "Success".&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Negative Match:&lt;/strong&gt; Alert if the response contains "Exception" or "MySQL Error".&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Heartbeat (Dead Man's Switch) with JSON Schema:&lt;/strong&gt; Perfect for backups and cron jobs.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Scenario:&lt;/em&gt; Your backup script sends { "status": "ok", "size_mb": 2 }.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Config:&lt;/em&gt; Alert if size_mb &amp;lt; 50.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;The Magic:&lt;/em&gt; I added &lt;strong&gt;JSON Schema Validation&lt;/strong&gt;. You can enforce a strict structure on your incoming webhooks. It turns uptime monitoring into business-metric monitoring.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
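
&lt;p&gt;As a rough illustration of the scenario above, the following Python sketch validates a heartbeat payload using only the standard library. It is a hand-rolled stand-in for a real JSON Schema validator, and &lt;code&gt;REQUIRED_FIELDS&lt;/code&gt;, &lt;code&gt;validate_heartbeat&lt;/code&gt;, and the message texts are hypothetical names chosen for this example.&lt;/p&gt;

```python
from typing import Any, Dict, List

# Hypothetical rule set: required field name mapped to the accepted Python type(s).
REQUIRED_FIELDS = {"status": str, "size_mb": (int, float)}


def validate_heartbeat(payload: Dict[str, Any], min_size_mb: float = 50) -> List[str]:
    """Return a list of problems; an empty list means the heartbeat is healthy."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    if problems:
        return problems  # the structure is invalid, so skip the value checks
    if payload["status"] != "ok":
        problems.append(f"status is {payload['status']!r}")
    if min_size_mb > payload["size_mb"]:
        problems.append(f"size_mb {payload['size_mb']} is below {min_size_mb}")
    return problems


problems = validate_heartbeat({"status": "ok", "size_mb": 2})
```

&lt;p&gt;With the payload from the scenario, the structure is valid but the backup is suspiciously small, so the monitor would alert even though the job reported "ok".&lt;/p&gt;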

&lt;h4&gt;
  
  
  4. Security First
&lt;/h4&gt;

&lt;p&gt;Since a monitoring tool sends requests everywhere, I had to prevent abuse:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SSRF Protection:&lt;/strong&gt; Strict blocking of internal network scanning (localhost, 192.168.x.x) and cloud metadata endpoints.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SSL Chain Validation:&lt;/strong&gt; We don't just check the expiry date. We validate the full chain of trust.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Header Sanitization:&lt;/strong&gt; Stripping dangerous headers before webhook dispatch.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
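
&lt;p&gt;For the SSRF part, a minimal sketch of an address filter could use Python's standard &lt;code&gt;ipaddress&lt;/code&gt; module. The function name &lt;code&gt;is_blocked_address&lt;/code&gt; and the exact set of blocked ranges are assumptions. A real deployment would typically apply the check to the resolved IP at connection time, so that a hostname cannot pass validation and then resolve to an internal address.&lt;/p&gt;

```python
import ipaddress


def is_blocked_address(ip: str) -> bool:
    """Return True if a resolved IP address must not be probed by the monitor."""
    addr = ipaddress.ip_address(ip)
    return (
        addr.is_loopback  # 127.0.0.0/8 and ::1
        or addr.is_private  # 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, ...
        or addr.is_link_local  # 169.254.0.0/16, including the 169.254.169.254 metadata endpoint
        or addr.is_reserved
        or addr.is_multicast
        or addr.is_unspecified  # 0.0.0.0 and ::
    )


blocked = [ip for ip in ["127.0.0.1", "192.168.1.10", "169.254.169.254", "8.8.8.8"]
           if is_blocked_address(ip)]
```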

&lt;h4&gt;
  
  
  5. Alerts You Actually Want to Read
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Grace Period:&lt;/strong&gt; Ignore 1-second network hiccups.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Recovery Alerts:&lt;/strong&gt; Get notified when systems are back online.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Channels:&lt;/strong&gt; Telegram (bot), Slack (rich formatting), and custom Webhooks.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
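
&lt;p&gt;Grace periods and recovery alerts fit naturally into one small state machine. The sketch below is an assumption about how this could be done, not the OpsPulse code: the class name &lt;code&gt;GracePeriod&lt;/code&gt;, the threshold of three consecutive failures, and the "DOWN" / "RECOVERED" strings are all made up for illustration.&lt;/p&gt;

```python
class GracePeriod:
    """Raise a DOWN alert only after several consecutive failed checks."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.failures = 0
        self.alerting = False

    def record(self, check_ok: bool) -> str:
        """Return "DOWN", "RECOVERED", or "" (no notification) for one check."""
        if check_ok:
            self.failures = 0
            if self.alerting:
                self.alerting = False
                return "RECOVERED"
            return ""
        self.failures += 1
        if self.failures >= self.threshold and not self.alerting:
            self.alerting = True
            return "DOWN"
        return ""


monitor = GracePeriod(threshold=3)
checks = [True, False, True, False, False, False, False, True]
events = [monitor.record(ok) for ok in checks]
```

&lt;p&gt;In this run the single early failure is swallowed, the third consecutive failure produces one "DOWN" notification, and the final successful check produces one "RECOVERED" notification.&lt;/p&gt;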

&lt;h4&gt;
  
  
  The Tech Stack
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Frontend:&lt;/strong&gt; Next.js + React (Real-time Dashboard).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Backend Worker:&lt;/strong&gt; Python (For heavy lifting and network checks).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cloud:&lt;/strong&gt; Google Cloud Functions (For global nodes).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Database:&lt;/strong&gt; PostgreSQL (via Supabase).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Conclusion
&lt;/h4&gt;

&lt;p&gt;OpsPulse started as a side project to stop the 3 AM guessing game. Now it’s a full platform that helps me sleep better.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://opspulse.debuggo.app/" rel="noopener noreferrer"&gt;OpsPulse&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What checks are missing from your current monitoring tools? Let me know in the comments! 👇&lt;/p&gt;

</description>
      <category>devops</category>
      <category>monitoring</category>
      <category>cloudfunctions</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Stop Building "Zombie UI": The Resilient UX Checklist (Playwright + Python)</title>
      <dc:creator>Ilya Ploskovitov</dc:creator>
      <pubDate>Wed, 24 Dec 2025 10:08:06 +0000</pubDate>
      <link>https://dev.to/aragossa/stop-building-zombie-ui-the-resilient-ux-checklist-playwright-python-3ahk</link>
      <guid>https://dev.to/aragossa/stop-building-zombie-ui-the-resilient-ux-checklist-playwright-python-3ahk</guid>
      <description>&lt;p&gt;&lt;strong&gt;The Problem: The "Zombie UI"&lt;/strong&gt; 🧠&lt;/p&gt;

&lt;p&gt;You click "Submit". The database is writing data. The API is processing the request perfectly. The backend is healthy. But on the screen... &lt;strong&gt;nothing happens&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The button still looks clickable. The cursor is still a pointer. This is the "&lt;strong&gt;Dead Zone&lt;/strong&gt;" — the gap between the user's input and the interface's reaction.&lt;/p&gt;

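&lt;p&gt;&lt;strong&gt;Jakob Nielsen's Response Time Limits&lt;/strong&gt; put the threshold for "instant" at about 100ms and the limit for an uninterrupted flow of thought at about 1 second. For interactive UI, that translates into a strict budget:&lt;/p&gt;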

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;0 - 100ms&lt;/strong&gt;: Instant. Feels like manipulating a physical object.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;100 - 300ms&lt;/strong&gt;: Slight delay. Acceptable, but noticeable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;300 - 1000ms&lt;/strong&gt;: User loses focus. "Is it working? Did I miss the button?"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&amp;gt; 1000ms: Zombie Mode&lt;/strong&gt;. The user thinks the app crashed. They will refresh the page or &lt;strong&gt;rage-click&lt;/strong&gt; the button, triggering duplicate transactions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A working backend is not enough. If your UI freezes for 2 seconds without feedback, &lt;strong&gt;your feature is broken&lt;/strong&gt;.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu8flm38c9bnn1153p4d3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu8flm38c9bnn1153p4d3.png" alt=" " width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The Solution: The 3-Step Feedback Loop&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We are used to writing tests like this: &lt;code&gt;expect(success_message).to_be_visible()&lt;/code&gt;. But that is not enough. We must assert the &lt;strong&gt;intermediate states&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;✅ I use this Resilient UX Checklist for every async action:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Immediate (&amp;lt;100ms)&lt;/strong&gt;: The button MUST become disabled. This prevents Rage Clicks and double-charges.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Short Wait (300ms)&lt;/strong&gt;: A spinner or skeleton loader MUST appear.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long Wait (&amp;gt;3000ms)&lt;/strong&gt;: If the network is terrible (e.g., subway tunnel), show a "This is taking longer than usual..." toast. Never leave the user staring at an infinite spinner.&lt;/li&gt;
&lt;/ul&gt;



&lt;p&gt;&lt;strong&gt;The Code (Python + Playwright)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;How do we test this? We can't rely on random network lag. We need to deliberately freeze the request for 3 seconds to verify the application's "Patience Logic".&lt;/p&gt;

&lt;p&gt;Here is a Playwright test that fails whenever the "Zombie UI" shows up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;playwright.sync_api&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Page&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Route&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expect&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_slow_network_ux&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Page&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# 🛑 1. Setup the "Freeze" Interceptor
&lt;/span&gt;    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;slow_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Route&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;❄️ Freezing request to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; for 3s...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# Simulate Bad 3G / Subway Network
&lt;/span&gt;        &lt;span class="c1"&gt;# Note: time.sleep() also blocks the sync test thread itself; in async tests, use 'await asyncio.sleep(3)'
&lt;/span&gt;        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
        &lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;continue_&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Intercept the checkout API
&lt;/span&gt;    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;**/api/checkout&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;slow_handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/cart&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 🎬 2. Trigger the Action
&lt;/span&gt;    &lt;span class="n"&gt;submit_btn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#submit-order&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;submit_btn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# ✅ 3. Assert "Immediate Feedback" (0-100ms)
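&lt;/span&gt;    &lt;span class="c1"&gt;# The button must be disabled immediately.
&lt;/span&gt;    &lt;span class="c1"&gt;# Note: expect() retries up to its default 5s timeout; pass timeout=100 to enforce the 100ms budget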
&lt;/span&gt;    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;submit_btn&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to_be_disabled&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# ✅ 4. Assert "Loading State" (100-300ms)
&lt;/span&gt;    &lt;span class="c1"&gt;# The spinner must appear while we wait (we have 3 seconds)
&lt;/span&gt;    &lt;span class="n"&gt;spinner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.spinner-loader&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;spinner&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to_be_visible&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# ✅ 5. Assert "Success State" (After 3s)
&lt;/span&gt;    &lt;span class="c1"&gt;# Eventually, the request completes
&lt;/span&gt;    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.success-message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;to_be_visible&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# The spinner should disappear
&lt;/span&gt;    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;spinner&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;not_to_be_visible&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why automate this?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you test this manually on &lt;code&gt;localhost&lt;/code&gt;, you will blink and miss the spinner. By forcing a &lt;strong&gt;3-second delay&lt;/strong&gt; in CI, you verify on every build that users on a slow mobile connection get a responsive UI, not a dead one.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Architecture Note: CDP vs System Proxy&lt;/strong&gt; 🏗️&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Why your local tests might be lying to you&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Browser-level network interception tests (like the one above) work from inside the browser. In Chromium, Playwright does this through the &lt;strong&gt;Chrome DevTools Protocol (CDP)&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CDP is great for Browsers&lt;/strong&gt;: It gives you perfect control over traffic inside the Chrome process.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CDP fails on "The Full Matrix"&lt;/strong&gt;: You cannot easily attach CDP to a physical iPhone running Safari, a Smart TV app, or a native Android build.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The "Real World" Reality&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you want to run these Chaos scenarios on &lt;strong&gt;real physical devices&lt;/strong&gt; (not emulators), code-based interception isn't enough.&lt;/p&gt;

&lt;p&gt;You need a &lt;strong&gt;System Level Proxy&lt;/strong&gt; (like Charles Proxy or a cloud-native tool like &lt;a href="https://chaos-proxy.debuggo.app/" rel="noopener noreferrer"&gt;Chaos Proxy Debuggo&lt;/a&gt;). These tools sit between the physical device and the internet, allowing you to apply "3G Throttling" or "Random 500 Errors" to a real iPhone without changing a single line of your app's code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Perceived Performance &amp;gt; Actual Performance.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You cannot always fix the slow SQL query. You cannot fix the user's spotty 4G connection. But you can fix how your UI communicates that delay.&lt;/p&gt;

</description>
      <category>testing</category>
      <category>ux</category>
      <category>playwright</category>
      <category>qa</category>
    </item>
    <item>
      <title>💥 Break your API before your users do. Automated Network Chaos for CI/CD.</title>
      <dc:creator>Ilya Ploskovitov</dc:creator>
      <pubDate>Sun, 21 Dec 2025 17:46:45 +0000</pubDate>
      <link>https://dev.to/aragossa/chaos-network-proxy-3cbf</link>
      <guid>https://dev.to/aragossa/chaos-network-proxy-3cbf</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fat7wueacjpruo3l0nns0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fat7wueacjpruo3l0nns0.png" alt=" " width="800" height="420"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;The Story: Why I built Chaos Proxy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We've all been there. The feature works perfectly on &lt;code&gt;localhost&lt;/code&gt;. The E2E tests pass with flying colors. Then we deploy to production, and users on 3G networks start complaining that the app freezes, crashes, or—worst of all—&lt;strong&gt;charges them twice&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I realized that our CI pipelines were living in a fantasy world of &lt;strong&gt;0ms latency and 100% uptime&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I wanted to simulate "Bad Network" conditions automatically in GitHub Actions, specifically for &lt;strong&gt;mobile apps&lt;/strong&gt; and &lt;strong&gt;backend idempotency checks&lt;/strong&gt;. I tried mocking requests in Playwright, but that didn't cover native Android/iOS emulators. I tried local proxies, but they were hard to script.&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;Chaos Proxy&lt;/strong&gt;, a cloud-based, programmable proxy designed for CI/CD.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How it works&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Debuggo isn't just a GUI tool. It’s an &lt;strong&gt;API-first platform&lt;/strong&gt;. You can treat your network infrastructure like code.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Create&lt;/strong&gt;: Your CI script calls our API to spin up an isolated, ephemeral proxy container.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connect&lt;/strong&gt;: You route your E2E test traffic (Web, Android, iOS) through this proxy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Break&lt;/strong&gt;: You send API commands to inject latency, trigger 503 errors, or tamper with headers in real-time.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=NbPPoqCv3mY" rel="noopener noreferrer"&gt;Demo: Simulating 503 Errors in Chrome (Visual)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Features&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;API for CI/CD&lt;/strong&gt;: Spin up and destroy proxies programmatically. No long-living servers to manage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The "Rage Click" Test&lt;/strong&gt;: Inject 3 seconds of latency into specific endpoints (e.g., &lt;code&gt;/api/pay&lt;/code&gt;) to ensure your UI disables buttons correctly before the user clicks twice.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native Mobile Support&lt;/strong&gt;: Since it works at the network level (HTTP Proxy), it supports Android Emulators and iOS Simulators perfectly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Response Fuzzing&lt;/strong&gt;: Automatically tamper with JSON bodies to see if your app crashes on malformed data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;⚡️ &lt;strong&gt;See it in action&lt;/strong&gt;&lt;br&gt;
Here is how simple it is to inject a 503 Service Unavailable error into your checkout flow using curl:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; PUT https://api.debuggo.app/v1/sessions/&lt;span class="nv"&gt;$SESSION_ID&lt;/span&gt;/rules &lt;span class="se"&gt;\&lt;/span&gt;
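  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;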
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "rules": [
      {
        "url_pattern": "*/api/checkout",
        "failure_rate": 50,
        "error_code": 503
      }
    ]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=x_S-guPwPEk" rel="noopener noreferrer"&gt;Demo: Automating Network Chaos via Terminal (CLI)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How you can get involved&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I just launched the &lt;strong&gt;Public API Beta&lt;/strong&gt;. I am looking for QA Engineers and DevOps folks who are tired of "flaky" apps and want to build true resilience.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Try the Free Tier: You can start manually or via API for free.&lt;/li&gt;
&lt;li&gt;Break your App: Try the "Rage Click" test (Tip #5 on our blog).&lt;/li&gt;
&lt;li&gt;Feedback: Let me know what integration you need next!&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Stop trusting localhost. Start testing reality.&lt;/p&gt;

</description>
      <category>launch</category>
      <category>devops</category>
      <category>chaosengineering</category>
      <category>testing</category>
    </item>
    <item>
      <title>Announcing Chaos Proxy API: Automate Network Chaos in CI/CD 🚀</title>
      <dc:creator>Ilya Ploskovitov</dc:creator>
      <pubDate>Sun, 21 Dec 2025 14:45:04 +0000</pubDate>
      <link>https://dev.to/aragossa/announcing-chaos-proxy-api-automate-network-chaos-in-cicd-1lac</link>
      <guid>https://dev.to/aragossa/announcing-chaos-proxy-api-automate-network-chaos-in-cicd-1lac</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2feazg1cp9ctjxvqzg79.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2feazg1cp9ctjxvqzg79.png" alt=" " width="800" height="436"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Moving Beyond "Localhost" Testing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Until now, Debuggo has been a fantastic tool for manual testing. You spin up a proxy, connect your phone, and verify how your app handles a 503 error or high latency. It works great for ad-hoc debugging.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But manual testing doesn't scale.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You cannot ask your QA team to manually verify "Offline Mode" handling on every single Pull Request. You cannot manually check if your payment gateway handles double-clicks correctly before every deploy.&lt;/p&gt;

&lt;p&gt;To build truly resilient apps, you need &lt;strong&gt;Continuous Chaos&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Today, we are launching the &lt;strong&gt;Chaos Proxy API&lt;/strong&gt;. Now you can programmatically create proxies, configure chaos rules, and tear them down—all within your CI/CD pipeline (GitHub Actions, GitLab CI, Jenkins).&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Architecture: How it works in CI&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The API gives you full control over the lifecycle of a Chaos Proxy directly from your pipeline scripts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Create&lt;/strong&gt;: Spin up a fresh, isolated proxy instance on demand (&lt;code&gt;POST /sessions&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configure&lt;/strong&gt;: Apply chaos rules (latency, errors, body tampering) via JSON (&lt;code&gt;PUT /rules&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Certify&lt;/strong&gt;: Download the CA certificate to install on Android Emulators or iOS Simulators (&lt;code&gt;GET /certs&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test&lt;/strong&gt;: Run your E2E suite (Playwright, Appium, Cypress) routing traffic through the proxy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Destroy&lt;/strong&gt;: Clean up resources when the test finishes (&lt;code&gt;DELETE /sessions&lt;/code&gt;).&lt;/li&gt;
&lt;/ol&gt;
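&lt;p&gt;The lifecycle above can also be sketched in Python using only the standard library. This is an illustration under assumptions, not an official client: the helper names &lt;code&gt;build_request&lt;/code&gt; and &lt;code&gt;lifecycle_requests&lt;/code&gt; are hypothetical, the base URL and paths are the ones used in the examples in this post, and the requests are only built, not sent. In a real run the session id would come from the response to the first request.&lt;/p&gt;

```python
import json
import urllib.request

# Base URL as it appears in the examples in this post.
BASE_URL = "https://chaos-proxy.debuggo.app/api/v1"


def build_request(method, path, api_key, body=None):
    """Build (but do not send) an authenticated request for the API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(BASE_URL + path, data=data, method=method)
    request.add_header("Authorization", "Bearer " + api_key)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


def lifecycle_requests(api_key, session_id, rules):
    """Return the Create, Configure, Certify and Destroy requests in order."""
    return [
        build_request("POST", "/sessions", api_key),
        build_request("PUT", "/sessions/" + session_id + "/rules", api_key, {"rules": rules}),
        build_request("GET", "/certs/ca.pem", api_key),
        build_request("DELETE", "/sessions/" + session_id, api_key),
    ]


requests_in_order = lifecycle_requests(
    "dbg_ci_YOUR_KEY",
    "sess_abc123",
    [{"url_pattern": "*/api/checkout", "delay": 3000}],
)
for request in requests_in_order:
    print(request.get_method(), request.full_url)
```

&lt;p&gt;Each request could then be sent with &lt;code&gt;urllib.request.urlopen(request)&lt;/code&gt;; the Test step runs between Certify and Destroy.&lt;/p&gt;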



&lt;p&gt;&lt;strong&gt;Real-World Example&lt;/strong&gt;: GitHub Actions&lt;br&gt;
Here is a complete workflow. This script spins up a proxy, injects a &lt;strong&gt;3-second latency&lt;/strong&gt; to simulate a slow network, runs tests to ensure the UI handles "Rage Clicks" correctly, and then shuts everything down.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;🧪 Chaos E2E Tests&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;chaos-test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v3&lt;/span&gt;

      &lt;span class="c1"&gt;# 1. Start the Proxy&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;🚀 Start Debuggo Proxy&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;start_proxy&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;RESPONSE=$(curl -s -X POST https://chaos-proxy.debuggo.app/api/v1/sessions \&lt;/span&gt;
            &lt;span class="s"&gt;-H "Authorization: Bearer ${{ secrets.DEBUGGO_API_KEY }}")&lt;/span&gt;

          &lt;span class="s"&gt;# Extract and save details to ENV&lt;/span&gt;
          &lt;span class="s"&gt;echo "PROXY_ID=$(echo $RESPONSE | jq -r .id)" &amp;gt;&amp;gt; $GITHUB_ENV&lt;/span&gt;
          &lt;span class="s"&gt;echo "PROXY_HOST=$(echo $RESPONSE | jq -r .host)" &amp;gt;&amp;gt; $GITHUB_ENV&lt;/span&gt;
          &lt;span class="s"&gt;echo "PROXY_PORT=$(echo $RESPONSE | jq -r .port)" &amp;gt;&amp;gt; $GITHUB_ENV&lt;/span&gt;
          &lt;span class="s"&gt;echo "PROXY_AUTH=$(echo $RESPONSE | jq -r .auth)" &amp;gt;&amp;gt; $GITHUB_ENV&lt;/span&gt;

      &lt;span class="c1"&gt;# 2. Configure Chaos (The "Bad 3G" Simulation)&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;💣 Configure Chaos Rules&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;curl -X PUT https://chaos-proxy.debuggo.app/api/v1/sessions/$PROXY_ID/rules \&lt;/span&gt;
            &lt;span class="s"&gt;-H "Authorization: Bearer ${{ secrets.DEBUGGO_API_KEY }}" \&lt;/span&gt;
            &lt;span class="s"&gt;-H "Content-Type: application/json" \&lt;/span&gt;
            &lt;span class="s"&gt;-d '{&lt;/span&gt;
              &lt;span class="s"&gt;"rules": [&lt;/span&gt;
                &lt;span class="s"&gt;{&lt;/span&gt;
                  &lt;span class="s"&gt;"url_pattern": "*/api/checkout",&lt;/span&gt;
                  &lt;span class="s"&gt;"delay": 3000,&lt;/span&gt;
                  &lt;span class="s"&gt;"error_code": null&lt;/span&gt;
                &lt;span class="s"&gt;}&lt;/span&gt;
              &lt;span class="s"&gt;]&lt;/span&gt;
            &lt;span class="s"&gt;}'&lt;/span&gt;

      &lt;span class="c1"&gt;# 3. Run Tests&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;🧪 Run Playwright Tests&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;# Route traffic through the authenticated proxy&lt;/span&gt;
          &lt;span class="s"&gt;export HTTPS_PROXY="http://$PROXY_AUTH@$PROXY_HOST:$PROXY_PORT"&lt;/span&gt;
          &lt;span class="s"&gt;npx playwright test&lt;/span&gt;

      &lt;span class="c1"&gt;# 4. Cleanup (Always run this, even if tests fail)&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;🧹 Cleanup&lt;/span&gt;
        &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always()&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;curl -X DELETE https://chaos-proxy.debuggo.app/api/v1/sessions/$PROXY_ID \&lt;/span&gt;
            &lt;span class="s"&gt;-H "Authorization: Bearer ${{ secrets.DEBUGGO_API_KEY }}"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;strong&gt;API Reference&lt;/strong&gt;&lt;br&gt;
Use these endpoints to integrate Chaos Proxy into your custom scripts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Authentication&lt;/strong&gt;&lt;br&gt;
Authenticate every request by sending your API key in the &lt;code&gt;Authorization&lt;/code&gt; header. You can generate a key in your Dashboard Settings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Authorization: Bearer dbg_ci_YOUR_KEY
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;1. Start Proxy Session&lt;/strong&gt;&lt;br&gt;
Creates a new isolated proxy container. Returns the host, port, and credentials.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Endpoint: &lt;code&gt;POST /api/v1/sessions&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Response:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sess_abc123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"host"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"proxy-us-east.debuggo.app"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"port"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10245&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"auth"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user:pass"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
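&lt;p&gt;To hand this response to a test runner, the fields usually need to be reshaped into proxy settings. The sketch below is one way to do that, with assumptions stated: the helper name &lt;code&gt;proxy_settings&lt;/code&gt; is hypothetical, &lt;code&gt;auth&lt;/code&gt; is assumed to be a &lt;code&gt;user:pass&lt;/code&gt; pair split on the first colon, and the &lt;code&gt;http://&lt;/code&gt; scheme follows the proxy URL used in the workflow above. The resulting keys (&lt;code&gt;server&lt;/code&gt;, &lt;code&gt;username&lt;/code&gt;, &lt;code&gt;password&lt;/code&gt;) match the shape of Playwright's &lt;code&gt;proxy&lt;/code&gt; launch option.&lt;/p&gt;

```python
import json


def proxy_settings(session):
    """Turn a session response into a server/username/password dict."""
    # Assumption: "auth" is "user:pass"; split on the first colon only.
    username, _, password = session["auth"].partition(":")
    return {
        "server": "http://{}:{}".format(session["host"], session["port"]),
        "username": username,
        "password": password,
    }


# The sample response shown above.
session = json.loads(
    '{"id": "sess_abc123", "host": "proxy-us-east.debuggo.app", '
    '"port": 10245, "auth": "user:pass"}'
)
print(proxy_settings(session))
```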



&lt;p&gt;&lt;strong&gt;2. Configure Rules&lt;/strong&gt;&lt;br&gt;
Updates the chaos logic in real-time. You can change rules mid-test (e.g., test success first, then inject failure).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Endpoint: &lt;code&gt;PUT /api/v1/sessions/{session_id}/rules&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Body Example:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rules"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url_pattern"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"*/api/v1/checkout"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"failure_rate"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"error_code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;503&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"delay"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url_pattern"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"*/api/v1/search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"delay"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
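&lt;p&gt;When rules are generated from test code, a small builder keeps the JSON consistent. This is a sketch under assumptions: &lt;code&gt;make_rule&lt;/code&gt; and &lt;code&gt;rules_body&lt;/code&gt; are hypothetical names, &lt;code&gt;failure_rate&lt;/code&gt; is assumed to be a whole-number percentage from 0 to 100, and &lt;code&gt;delay&lt;/code&gt; is assumed to be in milliseconds (as the 3000 = 3 seconds example suggests). The field names themselves come from the body example above.&lt;/p&gt;

```python
import json


def make_rule(url_pattern, delay=0, failure_rate=None, error_code=None):
    """Build one rule dict, rejecting obviously invalid values."""
    if 0 > delay:
        raise ValueError("delay must be zero or positive (milliseconds)")
    rule = {"url_pattern": url_pattern, "delay": delay}
    if failure_rate is not None:
        # Assumption: failure_rate is an integer percentage, 0 to 100.
        if failure_rate not in range(0, 101):
            raise ValueError("failure_rate must be an integer from 0 to 100")
        rule["failure_rate"] = failure_rate
        rule["error_code"] = error_code
    return rule


def rules_body(*rules):
    """Serialize rules into the JSON body expected by the rules endpoint."""
    return json.dumps({"rules": list(rules)}, indent=2)


print(rules_body(
    make_rule("*/api/v1/checkout", failure_rate=100, error_code=503),
    make_rule("*/api/v1/search", delay=2000),
))
```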



&lt;p&gt;&lt;strong&gt;3. Download CA Certificate&lt;/strong&gt;&lt;br&gt;
Retrieves the Root CA certificate. Essential for automated setup of Android Emulators or iOS Simulators in CI.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Endpoint: &lt;code&gt;GET /api/v1/certs/ca.pem&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Usage:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-O&lt;/span&gt; https://chaos-proxy.debuggo.app/api/v1/certs/ca.pem
&lt;span class="c"&gt;# Then install via adb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;4. Stop Session&lt;/strong&gt;&lt;br&gt;
Stops the proxy and releases the port.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Endpoint: &lt;code&gt;DELETE /api/v1/sessions/{session_id}&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Why automate Chaos?&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Catch Regressions in "Unhappy Paths":&lt;/strong&gt; Developers often break error-handling logic because they rarely see errors locally. An automated 500 Error test catches the moment your "Something went wrong" screen stops working.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validate Idempotency:&lt;/strong&gt; By injecting latency into your payment endpoints during CI, you can verify that your backend correctly handles duplicate requests (Rage Clicks) before they reach production.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native Mobile Testing:&lt;/strong&gt; Unlike Playwright’s built-in &lt;code&gt;page.route&lt;/code&gt; (which only works in a browser context), Debuggo works at the system level. This allows you to test &lt;strong&gt;Native Android and iOS apps&lt;/strong&gt; running in emulators within your CI pipeline.&lt;/li&gt;
&lt;/ol&gt;
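&lt;p&gt;For the idempotency point, it helps to picture what "correctly handles duplicate requests" means on the backend. The toy model below is one common approach (an idempotency key per order), not necessarily how any particular payment backend works; the class &lt;code&gt;PaymentService&lt;/code&gt; and its fields are hypothetical and live entirely in memory.&lt;/p&gt;

```python
class PaymentService:
    """Toy in-memory backend that charges at most once per idempotency key."""

    def __init__(self):
        self._results = {}  # idempotency key -> stored response
        self.charges = 0    # number of real charges performed

    def charge(self, idempotency_key, amount_cents):
        # A repeated key returns the stored response instead of charging again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges += 1
        response = {
            "status": "charged",
            "amount_cents": amount_cents,
            "charge_no": self.charges,
        }
        self._results[idempotency_key] = response
        return response


service = PaymentService()
first = service.charge("order-42", 1999)
second = service.charge("order-42", 1999)  # the rage click
print(service.charges, first == second)
```

&lt;p&gt;A chaos test with injected latency is a way to check that the real backend behaves like this when the second click actually arrives.&lt;/p&gt;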

&lt;p&gt;Ready to break your build (on purpose)? &lt;a href="https://chaos-proxy.debuggo.app/" rel="noopener noreferrer"&gt;Get your API Key&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>cicd</category>
      <category>automation</category>
      <category>api</category>
      <category>testing</category>
    </item>
    <item>
      <title>The "Spinner of Death": Why Localhost Latency is Lying to You</title>
      <dc:creator>Ilya Ploskovitov</dc:creator>
      <pubDate>Thu, 18 Dec 2025 23:18:21 +0000</pubDate>
      <link>https://dev.to/aragossa/the-spinner-of-death-why-localhost-latency-is-lying-to-you-3326</link>
      <guid>https://dev.to/aragossa/the-spinner-of-death-why-localhost-latency-is-lying-to-you-3326</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzqkenu53ndcrptns9qhl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzqkenu53ndcrptns9qhl.png" alt=" " width="800" height="255"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The "Localhost" Bias&lt;/strong&gt;&lt;br&gt;
We've all been there.&lt;/p&gt;

&lt;p&gt;On your machine, the API responds in &lt;strong&gt;5ms&lt;/strong&gt;. The UI updates instantly. You click "Submit," the modal closes, and you move on to the next ticket. Status: &lt;strong&gt;Done&lt;/strong&gt;. ✅&lt;/p&gt;

&lt;p&gt;But on a user's 4G connection in a subway tunnel, that same API call takes &lt;strong&gt;2 seconds&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Because you tested on localhost, where latency is effectively zero, you missed critical timing bugs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;🖱️ &lt;strong&gt;The Double-Click Bug&lt;/strong&gt;: The user clicks "Submit" twice because "nothing happened," charging their credit card twice.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;🔄 &lt;strong&gt;The Infinite Spinner&lt;/strong&gt;: The loader gets stuck forever because a packet was dropped.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;🏎️ &lt;strong&gt;Race Conditions&lt;/strong&gt;: Data arrives out of order, overwriting the user's input.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Your app feels fast because you are cheating. &lt;strong&gt;0ms latency is a lie&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Wrong Solution&lt;/strong&gt;: &lt;code&gt;time.sleep()&lt;/code&gt;&lt;br&gt;
I often see tests that look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# ❌ Don't do this
page.click("#submit")
time.sleep(2) # Simulating "network lag"
expect(page.locator(".success")).to_be_visible()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why this fails&lt;/strong&gt;: &lt;code&gt;sleep()&lt;/code&gt; just pauses the test execution script. The browser engine itself is still blazing fast. It doesn't simulate network queues, slow handshakes, or constrained bandwidth. You aren't testing the network; you're just making your test suite slower.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Right Solution: Network Throttling (CDP)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To test this properly in automation, you need to talk directly to the browser engine. You need to tell Chrome: "&lt;em&gt;Pretend you are on a terrible 50 KB/s connection.&lt;/em&gt;"&lt;/p&gt;

&lt;p&gt;We can do this using the &lt;strong&gt;Chrome DevTools Protocol (CDP)&lt;/strong&gt; within Playwright. The browser itself then sees delayed responses and constrained bandwidth, so its loading states behave much closer to what a real user would experience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Code (Python + Playwright)&lt;/strong&gt;&lt;br&gt;
Here is how to inject a "Bad 3G" connection into your test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from playwright.sync_api import Page, expect

def test_slow_network_handling(page: Page):
    # 1. Connect to Chrome DevTools Protocol (CDP)
    # This gives us low-level access to the browser
    client = page.context.new_cdp_session(page)

    # 2. 🧨 CHAOS: Emulate "Bad 3G"
    # Latency: 2000ms (2 seconds)
    # Throughput: 50 KB/s (CDP expects bytes per second)
    client.send("Network.emulateNetworkConditions", {
        "offline": False,
        "latency": 2000, 
        "downloadThroughput": 50 * 1024,
        "uploadThroughput": 50 * 1024
    })

    page.goto("https://myapp.com/search")

    # 3. Trigger the slow action
    page.fill("#search-box", "Playwright")
    page.click("#search-btn")

    # 4. Resilience Assertion

    # Check 1: Does the UI prevent double submission?
    expect(page.locator("#search-btn")).to_be_disabled()

    # Check 2: Does the user get immediate feedback?
    expect(page.locator(".loading-spinner")).to_be_visible()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why this matters&lt;/strong&gt;: This test proves your UI provides feedback. If a user clicks a button and waits 2 seconds with no visual feedback, they will assume the app is broken.&lt;/p&gt;
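
&lt;p&gt;If you want to reuse the throttling in several tests, or switch it off again halfway through one, the payload can be built by a small helper. This is only a sketch: the names &lt;code&gt;throttle_params&lt;/code&gt;, &lt;code&gt;BAD_3G&lt;/code&gt; and &lt;code&gt;NO_THROTTLE&lt;/code&gt; are made up for illustration. The fields are the same ones passed to &lt;code&gt;Network.emulateNetworkConditions&lt;/code&gt; above, where throughput is in bytes per second and &lt;code&gt;-1&lt;/code&gt; disables the limit.&lt;/p&gt;

```python
def throttle_params(latency_ms: int, down_kb_per_s: float, up_kb_per_s: float) -> dict:
    """Build the payload for CDP's Network.emulateNetworkConditions.

    CDP expects throughput in bytes per second, so kilobytes are converted here.
    """
    return {
        "offline": False,
        "latency": latency_ms,
        "downloadThroughput": int(down_kb_per_s * 1024),
        "uploadThroughput": int(up_kb_per_s * 1024),
    }


# Hypothetical preset; tune the numbers to the networks your users really have.
BAD_3G = throttle_params(latency_ms=2000, down_kb_per_s=50, up_kb_per_s=50)

# -1 turns throughput throttling off again; latency 0 adds no extra delay.
NO_THROTTLE = {
    "offline": False,
    "latency": 0,
    "downloadThroughput": -1,
    "uploadThroughput": -1,
}

# Usage inside a Playwright test, assuming `client` is the CDP session
# created with page.context.new_cdp_session(page):
#   client.send("Network.emulateNetworkConditions", BAD_3G)
#   ...assert on the spinner and the disabled button...
#   client.send("Network.emulateNetworkConditions", NO_THROTTLE)
```

&lt;p&gt;Resetting the conditions at the end keeps the slowdown from leaking into later steps that share the same page.&lt;/p&gt;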

&lt;p&gt;&lt;strong&gt;But wait, what about Mobile Apps?&lt;/strong&gt; 📱&lt;br&gt;
The script above works well for automated CI pipelines running Chrome. But CDP has a major limitation: &lt;strong&gt;it only talks to Chromium-based browsers, so it can't throttle a native app or Safari on a physical phone&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you are a Mobile Developer or manual QA, you can't "attach Playwright" to the phone in your hand to simulate a subway tunnel.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Manual Alternative (System-Level Proxy)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To test latency on a real device without writing code, you need a &lt;strong&gt;System-Level Proxy&lt;/strong&gt; that sits between your phone and the internet.&lt;/p&gt;

&lt;p&gt;You can use desktop tools like Charles Proxy (if you enjoy configuring Java apps and firewalls), or you can use a cloud-based tool like &lt;a href="https://chaos-proxy.debuggo.app/" rel="noopener noreferrer"&gt;Chaos Proxy&lt;/a&gt; (which I'm building).&lt;/p&gt;

&lt;p&gt;It allows you to simulate &lt;strong&gt;"Subway Mode" (2s latency)&lt;/strong&gt; on any device—iPhone, Android, or Laptop—just by connecting to a Wi-Fi proxy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Workflow:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a "Chaos Rule" (e.g., Latency = 2000ms).&lt;/li&gt;
&lt;li&gt;Connect your phone to the proxy via QR code.&lt;/li&gt;
&lt;li&gt;Watch your app struggle (and then fix it).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Summary&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Stop trusting Localhost&lt;/strong&gt;. It hides your worst bugs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automated&lt;/strong&gt;: Use Playwright + CDP to inject latency in your E2E tests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manual/Mobile&lt;/strong&gt;: Use a Chaos Proxy to test resilience on physical devices.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Happy testing! 🧪
&lt;/h2&gt;

&lt;p&gt;If you found this useful, check out my previous post: &lt;a href="https://dev.to/aragossa/stop-testing-success-kill-the-database-27mg"&gt;Stop Testing Success. Kill the Database.&lt;/a&gt;&lt;/p&gt;

</description>
      <category>testing</category>
      <category>webdev</category>
      <category>qa</category>
      <category>playwright</category>
    </item>
    <item>
      <title>Stop Testing Success. Kill the Database. 🧨</title>
      <dc:creator>Ilya Ploskovitov</dc:creator>
      <pubDate>Thu, 11 Dec 2025 10:00:44 +0000</pubDate>
      <link>https://dev.to/aragossa/stop-testing-success-kill-the-database-27mg</link>
      <guid>https://dev.to/aragossa/stop-testing-success-kill-the-database-27mg</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl5nnr9md142co5npw3ri.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl5nnr9md142co5npw3ri.jpg" alt=" " width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Intro to Chaos Engineering for QA. Learn how to test resilience by injecting failures with Docker and Playwright.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;We are obsessed with the "Happy Path".&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In traditional QA, we verify that the application works when everything is perfect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The network is stable.&lt;/li&gt;
&lt;li&gt;The database responds in 5ms.&lt;/li&gt;
&lt;li&gt;Third-party APIs are online.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But in production, &lt;strong&gt;nothing is perfect&lt;/strong&gt;. Pods crash, networks lag, and databases lock up.&lt;/p&gt;

&lt;p&gt;When these things happen, a standard Selenium/Playwright test just says: &lt;code&gt;Failed&lt;/code&gt;. It doesn't tell you &lt;em&gt;how&lt;/em&gt; the application failed. Did it show a graceful error message? Or did it crash with a white screen and a raw stack trace?&lt;/p&gt;

&lt;p&gt;This is where &lt;strong&gt;Chaos Engineering&lt;/strong&gt; comes in.&lt;/p&gt;

&lt;h2&gt;
  
  
  From QA to Resilience Engineering
&lt;/h2&gt;

&lt;p&gt;Chaos Engineering isn't just for Site Reliability Engineers (SREs). As modern QAs, we need to stop asking "Does it work?" and start asking "What happens when it breaks?"&lt;/p&gt;

&lt;p&gt;Today, I’ll show you how to write a &lt;strong&gt;Chaos Test&lt;/strong&gt; using Python, Playwright, and the Docker SDK.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Goal
&lt;/h3&gt;

&lt;p&gt;We aren't going to wait for the database to fail. We are going to &lt;strong&gt;kill it intentionally&lt;/strong&gt; in the middle of a test and verify that our frontend handles it gracefully.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Stack
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Python&lt;/strong&gt; (Test logic)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Playwright&lt;/strong&gt; (UI Interaction)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker SDK&lt;/strong&gt; (The Chaos Injector)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Code 🐍
&lt;/h2&gt;

&lt;p&gt;Here is the complete script. It connects to your local Docker daemon, finds the Postgres container, and strangles it while the user is trying to work.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;docker&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;playwright.sync_api&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Page&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expect&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_database_failure_resilience&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Page&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# 1. Setup: Connect to Docker
&lt;/span&gt;    &lt;span class="c1"&gt;# We use the Docker SDK for Python (the docker package) to control the infrastructure
&lt;/span&gt;    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;docker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_env&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Target your specific database container
&lt;/span&gt;    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;db_container&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;containers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgres-prod&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;docker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NotFound&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Database container 'postgres-prod' not found! Check the container name.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 2. Happy Path: Verify the app loads normally
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ Step 1: Loading Dashboard...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:3000/dashboard&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.user-balance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;to_be_visible&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# 🧨 CHAOS TIME: Kill the Database
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🔥 Step 2: Injecting Chaos (Stopping DB)...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;db_container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# 3. Resilience Assertion
&lt;/span&gt;    &lt;span class="c1"&gt;# The app should NOT show a white screen or crash.
&lt;/span&gt;    &lt;span class="c1"&gt;# It SHOULD show a friendly "Connection Lost" toast or retry button.
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;👀 Step 3: Verifying graceful degradation...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Trigger an action that requires the DB
&lt;/span&gt;    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reload&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; 

    &lt;span class="c1"&gt;# Assert UI handles the error
&lt;/span&gt;    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.error-toast&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;to_contain_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Connection lost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.retry-button&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;to_be_visible&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# 🩹 RECOVERY: Bring the Database back
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🩹 Step 4: Healing the infrastructure...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;db_container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Give the app a moment to reconnect (or trigger a manual retry)
&lt;/span&gt;    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.retry-button&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# 4. Self-Healing Assertion
&lt;/span&gt;    &lt;span class="c1"&gt;# The app should recover without requiring a full page refresh
&lt;/span&gt;    &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.user-balance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;to_be_visible&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ Test Passed: System is resilient.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
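
&lt;p&gt;One caveat with the script as written: if an assertion fails while the database is down, &lt;code&gt;db_container.start()&lt;/code&gt; never runs and the next test inherits a dead database. A possible way around that is a context manager that always restarts the container and waits until it is ready. This is a sketch under assumptions: &lt;code&gt;outage&lt;/code&gt; and &lt;code&gt;wait_until&lt;/code&gt; are hypothetical helper names, and the readiness check is something you supply yourself.&lt;/p&gt;

```python
import time
from contextlib import contextmanager


def wait_until(predicate, timeout_s: float = 30.0, interval_s: float = 0.5) -> bool:
    """Poll predicate() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


@contextmanager
def outage(container, is_ready, timeout_s: float = 30.0):
    """Stop a container for the duration of the with-block, then always restart it."""
    container.stop()
    try:
        yield container
    finally:
        # Runs even when an assertion inside the with-block fails.
        container.start()
        if not wait_until(is_ready, timeout_s=timeout_s):
            raise RuntimeError("container did not become ready in time")


# Usage sketch, assuming the Postgres image ships the pg_isready tool:
#   def is_ready():
#       return db_container.exec_run("pg_isready").exit_code == 0
#
#   with outage(db_container, is_ready):
#       page.reload()
#       expect(page.locator(".error-toast")).to_contain_text("Connection lost")
#   page.locator(".retry-button").click()
```

&lt;p&gt;Waiting for readiness matters because a container that has started is not necessarily a database that accepts connections yet; clicking "retry" too early would make the recovery check flaky.&lt;/p&gt;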



&lt;p&gt;&lt;strong&gt;Why this matters&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you run this test and your application shows a &lt;strong&gt;500 Server Error&lt;/strong&gt; page, you have found a bug. Not a functional bug, but an &lt;strong&gt;architectural bug&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;By adding "Chaos Tests" to your regression suite, you check on every run that your product doesn't just work, it &lt;strong&gt;survives&lt;/strong&gt;.&lt;/p&gt;




&lt;p&gt;👋 &lt;strong&gt;Want more Chaos?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I write The 5-Minute QA—a daily newsletter for Senior QAs and SDETs. Every morning, I send one actionable tip on &lt;strong&gt;Chaos Engineering&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://substack.com/@qaespresso" rel="noopener noreferrer"&gt;👉 Subscribe here to get the tips in your inbox&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>testing</category>
      <category>devops</category>
      <category>chaosengineering</category>
      <category>python</category>
    </item>
    <item>
      <title>Why "Record and Playback" is Dead (And Why I'm Betting on Natural Language)</title>
      <dc:creator>Ilya Ploskovitov</dc:creator>
      <pubDate>Sat, 06 Dec 2025 19:43:13 +0000</pubDate>
      <link>https://dev.to/aragossa/why-record-and-playback-is-dead-and-why-im-betting-on-natural-language-lb8</link>
      <guid>https://dev.to/aragossa/why-record-and-playback-is-dead-and-why-im-betting-on-natural-language-lb8</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdhck9y8ae3x5r8g4evwu.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdhck9y8ae3x5r8g4evwu.jpg" alt="Intent-based AI" width="800" height="436"&gt;&lt;/a&gt;&lt;br&gt;
Hey dev.to!&lt;/p&gt;

&lt;p&gt;It's Ilya from &lt;strong&gt;Debuggo&lt;/strong&gt; again.&lt;/p&gt;

&lt;p&gt;Let's talk about the "elephant in the room" of test automation. About the thing we all started with and the thing we all eventually learned to hate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Record and Playback.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You know these tools. You hit the red "Record" button, click around your website, the tool saves your actions, and... voilà! You have a test.&lt;/p&gt;

&lt;p&gt;It looks like magic. Right until you run that test tomorrow. Or until a developer moves a button by 5 pixels. Or until an element ID changes.&lt;/p&gt;

&lt;p&gt;Then the magic turns into a pumpkin.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Trap of Classic Recorders&lt;/strong&gt;&lt;br&gt;
Why do classic recorders (from the old Selenium IDE to modern alternatives) break so often?&lt;/p&gt;

&lt;p&gt;Because they record the &lt;strong&gt;Implementation&lt;/strong&gt;, not the &lt;strong&gt;Intent&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When you click the "Buy" button, the recorder sees:&lt;br&gt;
&lt;code&gt;click(css="#app &amp;gt; div:nth-child(2) &amp;gt; button.red-btn")&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The recorder doesn't know this is a "Buy" button. It only knows its "address" in the DOM tree. If you wrap that button in a new &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt;, the address changes and the test fails. Tests like this are called brittle tests.&lt;/p&gt;
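
&lt;p&gt;To see the effect on a toy example, here is a sketch that uses Python's standard &lt;code&gt;xml.etree.ElementTree&lt;/code&gt; as a stand-in for a real DOM. It is not how any recorder works internally; the tree, the paths and the &lt;code&gt;build_page&lt;/code&gt; helper are invented for illustration. A position-based path stops matching after the button is wrapped, while a lookup by visible text still finds it.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET


def build_page(wrap_button: bool) -> ET.Element:
    """Build a tiny stand-in DOM; wrap_button simulates a layout refactor."""
    app = ET.Element("div", {"id": "app"})
    ET.SubElement(app, "div", {"class": "header"})
    content = ET.SubElement(app, "div", {"class": "content"})
    if wrap_button:
        parent = ET.SubElement(content, "div", {"class": "wrapper"})
    else:
        parent = content
    button = ET.SubElement(parent, "button", {"class": "red-btn"})
    button.text = "Buy"
    return app


# Position-based path, similar in spirit to a recorded CSS selector.
STRUCTURAL = "./div[2]/button"
# Text-based path: "the button whose label is Buy", wherever it sits.
BY_TEXT = ".//button[.='Buy']"

before = build_page(wrap_button=False)
after = build_page(wrap_button=True)

structural_before = before.find(STRUCTURAL)  # matches the button
structural_after = after.find(STRUCTURAL)    # None: the "address" changed
by_text_after = after.find(BY_TEXT)          # still matches the button
```

&lt;p&gt;A text lookup is still far from understanding intent (it would miss a button renamed to "Purchase now"), but it shows why anchoring on meaning survives changes that anchoring on position does not.&lt;/p&gt;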

&lt;p&gt;&lt;strong&gt;Intent-Based Testing: The Evolution&lt;/strong&gt;&lt;br&gt;
That's why in Debuggo, I ditched "click recording" in favor of &lt;strong&gt;Natural Language&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of recording coordinates or rigid selectors, you write:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Click the "Buy" button&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;At this moment, the AI magic happens. It analyzes the page, understands the context, and translates your intent into an action.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's the difference?&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Scenario&lt;/strong&gt;: A developer changes the layout. The "Buy" button now has a different class, a different ID, and sits in a different part of the page.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recorder&lt;/strong&gt;: Looks for the old selector &lt;code&gt;#btn-123&lt;/code&gt;. Doesn't find it. &lt;strong&gt;Test Fails&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debuggo&lt;/strong&gt;: The AI looks at the page. It "thinks": "Okay, the old selector is gone. But the user asked to click 'Buy'. I see a button with the text 'Purchase now' and a cart icon. Semantically, this is the same thing." &lt;strong&gt;Test Passes&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;"But isn't AI slow?" (Math and Ecology)&lt;/strong&gt;&lt;br&gt;
Here, an experienced engineer will ask: &lt;em&gt;"Ilya, if you run screenshots through an LLM for every single step, your test suite will take forever! And it will cost a fortune!"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;And you would be absolutely right.&lt;/p&gt;

&lt;p&gt;In my benchmarks, a complex scenario requiring AI planning and analysis (Reasoning) takes about &lt;strong&gt;10 minutes&lt;/strong&gt; to generate. Imagine if every CI run took 10 minutes per test. Your pipeline would grind to a halt.&lt;/p&gt;

&lt;p&gt;That is why I use a Hybrid Architecture, which saves both time and electricity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. "Think Once, Execute Thousands"&lt;/strong&gt;&lt;br&gt;
I use the AI only during the test creation phase. The Agent analyzes the page, "thinks" for 10 minutes, finds the perfect locators, and saves them to the database as optimized steps.&lt;/p&gt;

&lt;p&gt;When you run this test again (e.g., in CI/CD), Debuggo &lt;strong&gt;does not use the AI&lt;/strong&gt;. It pulls the ready-made steps from the database. The execution time for that same test from the DB? &lt;strong&gt;130 seconds&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Compare: &lt;strong&gt;600 seconds&lt;/strong&gt; (&lt;strong&gt;with AI&lt;/strong&gt;) vs &lt;strong&gt;130 seconds&lt;/strong&gt; (&lt;strong&gt;without AI&lt;/strong&gt;). That is a speedup of about 4.6x (600 / 130) on every run.&lt;/p&gt;
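
&lt;p&gt;The "think once, execute thousands" idea boils down to caching. The sketch below is not Debuggo's actual implementation; &lt;code&gt;LocatorCache&lt;/code&gt; and &lt;code&gt;fake_ai_resolver&lt;/code&gt; are hypothetical names, a dictionary stands in for the database, and the "AI" is a placeholder function. It only illustrates that the slow path runs once per step, however many times the test is executed.&lt;/p&gt;

```python
from typing import Callable, Dict


class LocatorCache:
    """Resolve each natural-language step once, then reuse the stored locator."""

    def __init__(self, resolve: Callable[[str], str]):
        self._resolve = resolve            # slow path, e.g. an AI call
        self._store: Dict[str, str] = {}   # stands in for the database
        self.slow_calls = 0

    def locator_for(self, step: str) -> str:
        if step not in self._store:
            self.slow_calls += 1
            self._store[step] = self._resolve(step)
        return self._store[step]


def fake_ai_resolver(step: str) -> str:
    # Placeholder for the expensive analysis; returns a made-up selector
    # built from the quoted label in the step text.
    return "text=" + step.split('"')[1]


cache = LocatorCache(fake_ai_resolver)
for _ in range(1000):
    locator = cache.locator_for('Click the "Buy" button')
```

&lt;p&gt;After the loop, &lt;code&gt;cache.slow_calls&lt;/code&gt; is 1: the resolver ran on the first lookup and the other 999 runs were served from the store.&lt;/p&gt;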

&lt;p&gt;&lt;strong&gt;2. Self-Healing and Ecology (Green IT)&lt;/strong&gt;&lt;br&gt;
There is also an ethical angle. We all know LLMs consume massive amounts of energy. Training and inference require powerful GPUs that heat up the planet.&lt;/p&gt;

&lt;p&gt;Using "heavy" AI for every run of a regression test (which might run hundreds of times a day) is &lt;strong&gt;ecological waste&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In Debuggo, the AI only "wakes up" when it is actually needed—for &lt;strong&gt;Self-Healing&lt;/strong&gt;. If a locator breaks due to a layout change during a fast run (the 130-second version):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The test pauses.&lt;/li&gt;
&lt;li&gt;We make &lt;strong&gt;one targeted request to the AI&lt;/strong&gt;: "Find this button again."&lt;/li&gt;
&lt;li&gt;The AI finds the new locator and updates the database.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Important&lt;/strong&gt;: The test continues and is marked as &lt;strong&gt;Passed&lt;/strong&gt;, but with a &lt;strong&gt;Warning&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Why a Warning? Because we don't want to hide changes. We are telling the tester:&lt;/p&gt;

&lt;p&gt;"&lt;em&gt;Hey, the test passed, but the 'Buy' button moved. I found it and fixed the test, but you should take a look: is this a redesign or a layout bug?&lt;/em&gt;"&lt;/p&gt;

&lt;p&gt;This way, we get the reliability of an AI agent, but keep the speed of classic code and leave the final control to the human.&lt;/p&gt;
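
&lt;p&gt;The self-healing flow described above can be sketched in a few lines. Again, this is an illustration under assumptions rather than Debuggo's real code: &lt;code&gt;run_step&lt;/code&gt;, &lt;code&gt;StepResult&lt;/code&gt;, and the &lt;code&gt;exists&lt;/code&gt; and &lt;code&gt;heal&lt;/code&gt; callbacks are invented names, and the page is faked with a set of selectors.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StepResult:
    status: str                                   # "passed" or "failed"
    warnings: List[str] = field(default_factory=list)


def run_step(step: str,
             locators: Dict[str, str],
             exists: Callable[[str], bool],
             heal: Callable[[str], str]) -> StepResult:
    """Run one stored step; ask the healer only if the stored locator is gone."""
    locator = locators[step]
    if exists(locator):
        return StepResult("passed")

    new_locator = heal(step)                      # the single targeted AI request
    if not exists(new_locator):
        return StepResult("failed")

    locators[step] = new_locator                  # persist the fix for future runs
    message = f"{step!r}: locator changed from {locator!r} to {new_locator!r}"
    return StepResult("passed", [message])


# Fake page after a redesign: the old selector no longer exists.
page_elements = {"#purchase-now"}
stored = {'Click the "Buy" button': "#btn-123"}

result = run_step(
    'Click the "Buy" button',
    stored,
    exists=lambda loc: loc in page_elements,
    heal=lambda step: "#purchase-now",
)
```

&lt;p&gt;Here the step ends up as passed with one warning, and the stored locator is updated so the next run takes the fast path again.&lt;/p&gt;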

&lt;p&gt;&lt;strong&gt;My Bet&lt;/strong&gt;&lt;br&gt;
I'm building Debuggo on the hypothesis that the future of QA is a balance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Speed and Sustainability&lt;/strong&gt; of standard code during execution.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intelligence&lt;/strong&gt; of AI during creation and maintenance.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This eliminates the main weakness of recorders (brittleness) and the main weakness of AI agents (slowness and energy cost).&lt;/p&gt;

&lt;p&gt;If you are tired of fixing tests that break with every layout shift, and you want to do it efficiently—give this approach a try.&lt;/p&gt;

&lt;p&gt;I'm waiting for you in the beta: &lt;a href="https://debuggo.app" rel="noopener noreferrer"&gt;https://debuggo.app&lt;/a&gt;&lt;/p&gt;

</description>
      <category>testing</category>
      <category>automation</category>
      <category>architecture</category>
      <category>greenit</category>
    </item>
  </channel>
</rss>
