<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Matthew Wimpelberg</title>
    <description>The latest articles on DEV Community by Matthew Wimpelberg (@matthew_wimpelberg_79193b).</description>
    <link>https://dev.to/matthew_wimpelberg_79193b</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3117473%2F0015b293-7100-4a57-88bb-5368b1a76d2a.jpg</url>
      <title>DEV Community: Matthew Wimpelberg</title>
      <link>https://dev.to/matthew_wimpelberg_79193b</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/matthew_wimpelberg_79193b"/>
    <language>en</language>
    <item>
      <title>From Load Test to Production Monitor: k6 Studio, Grafana Cloud, and Synthetic Monitoring</title>
      <dc:creator>Matthew Wimpelberg</dc:creator>
      <pubDate>Fri, 12 Jun 2026 14:02:02 +0000</pubDate>
      <link>https://dev.to/matthew_wimpelberg_79193b/from-load-test-to-production-monitor-k6-studio-grafana-cloud-and-synthetic-monitoring-4mbb</link>
      <guid>https://dev.to/matthew_wimpelberg_79193b/from-load-test-to-production-monitor-k6-studio-grafana-cloud-and-synthetic-monitoring-4mbb</guid>
      <description>&lt;h1&gt;
  
  
  Part 4 of 4: From Load Test to Production Monitor — k6 Studio, Grafana Cloud, and Synthetic Monitoring
&lt;/h1&gt;

&lt;p&gt;The first three parts of this series were about running tests. This one is about making them permanent.&lt;/p&gt;

&lt;p&gt;In part 1, k6 was a command-line tool you ran against a URL. In part 2 it became a layered test suite version-controlled alongside the app it tests. In part 3 the stress test revealed something real about the app's architecture. All of that is useful as a development workflow. None of it tells you anything about what's happening in production right now.&lt;/p&gt;

&lt;p&gt;That's what this post is about. The same scripts, pointed at a publicly reachable endpoint via ngrok, streaming results into Grafana Cloud in real time, and running on a schedule as synthetic monitors. One codebase. Three modes: local development, cloud-streamed load test, permanent availability check.&lt;/p&gt;

&lt;p&gt;All the code is here: &lt;a href="https://github.com/mwimpelberg28/k6-playground" rel="noopener noreferrer"&gt;https://github.com/mwimpelberg28/k6-playground&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Exposing the homelab with ngrok
&lt;/h2&gt;

&lt;p&gt;The Online Boutique runs on a private cluster at &lt;code&gt;10.4.20.2&lt;/code&gt;. Grafana Cloud's synthetic monitoring probes can't reach that address because they run from data centers in major cloud providers. To demo synthetic monitoring against a real app rather than a public URL I don't control, I needed to expose the cluster temporarily.&lt;/p&gt;

&lt;p&gt;ngrok handles this in one command, pointed at the cluster's frontend service on the reserved free static domain:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ngrok http &lt;span class="nt"&gt;--url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;imitation-laxative-iphone.ngrok-free.dev 10.4.20.2:80
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ngrok prints the forwarding URL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Forwarding  https://imitation-laxative-iphone.ngrok-free.dev -&amp;gt; http://10.4.20.2:80
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That URL is now publicly reachable. Any HTTP request to it gets tunneled to the cluster. ngrok's free tier now gives you one reserved static domain on &lt;code&gt;ngrok-free.dev&lt;/code&gt;.  It's stable across restarts, which is what lets me hardcode it into the committed &lt;code&gt;cloud-*&lt;/code&gt; npm scripts and the synthetic monitor config rather than re-editing them every time the tunnel comes up. (An ephemeral tunnel gets a random URL that changes on each restart; a paid plan adds multiple custom domains.)&lt;/p&gt;

&lt;p&gt;One honest observation: response times change through the tunnel. TTFB in the local load tests was 36ms because the test runner and the cluster are on the same LAN. Through ngrok, requests travel to ngrok's edge, get forwarded to the cluster, and travel back — a single &lt;code&gt;curl&lt;/code&gt; through the tunnel measured TTFB around 315ms, roughly 9× the LAN figure. Under the full load run, request &lt;code&gt;p95&lt;/code&gt; landed at ~670ms (vs 273ms on the LAN). That's not a problem — it's actually more realistic. Local load tests measure server performance. Measuring through the tunnel captures something closer to what a remote user experiences.&lt;/p&gt;




&lt;h2&gt;
  
  
  Running the load test from Grafana Cloud with &lt;code&gt;k6 cloud run&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;This is the other reason the app has to be publicly reachable. &lt;code&gt;k6 cloud run&lt;/code&gt; doesn't execute on your laptop — it uploads the script and runs it on Grafana Cloud's load generators, in whatever regions you configure. Those runners live in Grafana's data centers, so they reach the Online Boutique exactly the way the synthetic probes do: through the ngrok tunnel, not over the LAN. As the test runs, every metric data point streams back into Grafana Cloud in real time rather than printing to the terminal at the end.&lt;/p&gt;

&lt;p&gt;Authentication is a one-time login with an API token:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;k6 cloud login &lt;span class="nt"&gt;--token&lt;/span&gt; &amp;lt;your-api-token&amp;gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;K6_CLOUD_PROJECT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-project-id
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The token comes from your Grafana Cloud account under k6 → Settings → API Token. The project ID is visible on the same page.&lt;/p&gt;

&lt;p&gt;Then the cloud run is the same bundle and the same config, pointed at the public URL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;k6 cloud run dist/test.main.js &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;CONFIG_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;../src/config/load.config.json &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://imitation-laxative-iphone.ngrok-free.dev
&lt;span class="c"&gt;# or: npm run cloud-load&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;(These commands, like the npm scripts, run from the &lt;code&gt;k6-boutique/&lt;/code&gt; directory — that's why &lt;code&gt;dist/&lt;/code&gt; and &lt;code&gt;../src/config/&lt;/code&gt; resolve the way they do.)&lt;/p&gt;

&lt;p&gt;Every &lt;code&gt;http_req_duration&lt;/code&gt;, every custom metric, every check result is written to Grafana Cloud as it happens.&lt;/p&gt;

&lt;p&gt;The Grafana Cloud k6 interface gives you a run summary page automatically — no dashboard configuration required. It shows the VU ramp timeline, p95 response time over the run, error rate, and check pass rate. For a quick read it's enough. For deeper analysis — and for correlating load-test results with infrastructure metrics — you want the Grafana dashboard.&lt;/p&gt;

&lt;h3&gt;
  
  
  What the tunnel run actually surfaced
&lt;/h3&gt;

&lt;p&gt;The cloud run finished and tripped a threshold — k6 exits non-zero when any threshold fails. The cloud UI holds the full metric breakdown; to read the complete table here I ran the same &lt;code&gt;load.config.json&lt;/code&gt; through the same tunnel (35 max VUs, three concurrent journeys, five minutes). The interesting part is &lt;em&gt;which&lt;/em&gt; threshold failed — every latency threshold held comfortably:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;✓ http_req_duration                p(95)=668ms   (&amp;lt;3000)
  ✓ {journey:browser}              p(95)=783ms   (&amp;lt;2000)
  ✓ {journey:shopper}              p(95)=604ms   (&amp;lt;4000)
  ✓ {journey:currency}             p(95)=609ms   (&amp;lt;2000)
✓ group_duration{:::homepage}      avg=309ms     (&amp;lt;500)
✓ group_duration{:::browse product} avg=273ms    (&amp;lt;400)
✓ boutique_checkout_duration       p(95)=617ms   (&amp;lt;5000)
&lt;/span&gt;&lt;span class="gp"&gt;✓ boutique_checkout_success        rate=100%     (&amp;gt;&lt;/span&gt;0.80&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="go"&gt;✗ http_req_failed                  rate=9.51%    (&amp;lt;0.05)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Latency was fine. Checkout succeeded 100% of the time. The single failing threshold was the error rate: &lt;code&gt;http_req_failed&lt;/code&gt; at 9.51% — 375 failed requests out of 3,942 — clustered on the homepage and product-page fetches (the &lt;code&gt;status 200&lt;/code&gt; check dropped to 86%), with 107 dropped iterations alongside them.&lt;/p&gt;
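&lt;p&gt;The arithmetic behind that failing threshold is easy to check by hand. What follows is a minimal sketch in plain JavaScript (runnable in Node, not a k6 script). The helper name &lt;code&gt;failedRate&lt;/code&gt; and the variable names are illustrative; only the two request counts and the 5% limit come from the run above.&lt;/p&gt;

```javascript
// Counts taken from the tunnel run summary above.
const failedRequests = 375;
const totalRequests = 3942;

// Hypothetical helper: fraction of requests that failed.
function failedRate(failed, total) {
  return failed / total;
}

const rate = failedRate(failedRequests, totalRequests);
const limit = 0.05; // the http_req_failed threshold requires the rate to stay under 5%
const thresholdPassed = limit > rate;

// Prints: http_req_failed rate=9.51% passed=false
console.log(`http_req_failed rate=${(rate * 100).toFixed(2)}% passed=${thresholdPassed}`);
```

&lt;p&gt;The measured rate is nearly double the limit, so this is a clear failure rather than a borderline one.&lt;/p&gt;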

&lt;p&gt;That pattern is the lesson. The app served clean 200s on every manual request, latency stayed healthy, and yet ~1 request in 10 failed under sustained load. The cause wasn't the Online Boutique; it was the free ngrok tunnel. At ~12.7 requests/second the free tier's connection and rate limits start shedding requests, and those show up in k6 as non-200s. The bottleneck under load was the demo plumbing, not the system under test.&lt;/p&gt;

&lt;p&gt;This is worth internalizing before you trust a number: a load test measures the &lt;em&gt;entire&lt;/em&gt; path. When you insert a free tunnel between the generator and the app, you've added a component with its own limits, and at high enough throughput that component fails before the app does. For a real load test you'd point k6 at the cluster directly (or pay for a tunnel tier built for it); the tunnel is for &lt;em&gt;reachability&lt;/em&gt; (exposing the app to Grafana's cloud generators and synthetic probes), not for absorbing load. The thresholds in &lt;code&gt;load.config.json&lt;/code&gt; were calibrated against the app on the LAN, so they correctly flagged that something in the path was degrading. They just couldn't tell me it was the tunnel; the error pattern did.&lt;/p&gt;




&lt;h2&gt;
  
  
  Building the Grafana dashboard
&lt;/h2&gt;

&lt;p&gt;The value of having k6 metrics in Grafana Cloud isn't the k6 interface; it's that the same data sits in the same Prometheus datasource as your infrastructure metrics. You can build panels that put them side by side.&lt;/p&gt;

&lt;p&gt;The custom metrics from the scripts are queryable by name. The four from this suite:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# checkout success rate
boutique_checkout_success

# p95 checkout duration
histogram_quantile(0.95, rate(boutique_checkout_duration_bucket[1m]))

# cart errors over a 5-minute window
increase(boutique_cart_errors_total[5m])

# active sessions (latest gauge value)
boutique_active_sessions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The dashboard I built has eight panels, organized into five groups:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;VU ramp&lt;/strong&gt; — a time series of &lt;code&gt;k6_vus&lt;/code&gt; showing the ramp shape. Useful for correlating degradation onset with a specific VU count. When the product page began slowing in the stress test, this panel pinned down exactly when — and at what VU count.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;p95 response time by journey&lt;/strong&gt; — overlaid lines via &lt;code&gt;histogram_quantile(0.95, rate(k6_http_req_duration_bucket{journey="shopper"}[1m]))&lt;/code&gt; and the equivalent for &lt;code&gt;journey="browser"&lt;/code&gt;. At low load the two journeys track each other closely; as VUs climb they fan apart, and the panel shows &lt;em&gt;which&lt;/em&gt; journey's latency is degrading rather than burying it in a single global p95.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Checkout success rate&lt;/strong&gt; — &lt;code&gt;boutique_checkout_success&lt;/code&gt; as a stat panel with a threshold at 80%. Green above, red below. During the load test this sits comfortably at 100%. During stress it starts to drop. This is the panel that maps to a business SLO rather than an infrastructure metric.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cart error count&lt;/strong&gt; — &lt;code&gt;boutique_cart_errors_total&lt;/code&gt; as a time series. Flat during normal load. Any spikes here are worth investigating immediately regardless of what the response time panels show — a cart error is a customer who couldn't add an item, and that has a direct revenue implication.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Web Vitals&lt;/strong&gt; — LCP, FCP, TTFB, and CLS as stat panels with their respective thresholds colored. CLS shows red at 0.117 from the browser test results. Everything else is green.&lt;/p&gt;

&lt;p&gt;The dashboard is exportable as JSON and lives in the repo at &lt;code&gt;grafana/dashboard.json&lt;/code&gt;. Import it into any Grafana instance connected to the same Prometheus datasource and it works.&lt;/p&gt;




&lt;h2&gt;
  
  
  k6 Studio
&lt;/h2&gt;

&lt;p&gt;k6 Studio is a desktop app that sits between browser recording and code. You record a session in its built-in browser, it generates a k6 script, and you can validate and replay the recording before exporting the script.&lt;/p&gt;

&lt;p&gt;It's useful in two specific situations: onboarding someone who hasn't written k6 scripts before, and quickly generating the skeleton of a new test flow for an endpoint you haven't covered yet. For the Online Boutique I could've used it to record the checkout flow end-to-end (adding a product to cart, navigating to cart, submitting the order) and then folded the generated script into the &lt;code&gt;lib/&lt;/code&gt; layer to add error handling and custom metrics.&lt;/p&gt;

&lt;p&gt;The generated script is verbose. k6 Studio captures everything the browser sends, including headers and cookies that k6 handles automatically, and includes them explicitly. Before the generated script is usable in a real suite you'll strip the redundant headers, replace hardcoded URLs with variables, and wrap the requests in groups. But having the request sequence correct from the start (the right endpoints, in the right order, with the right request bodies) saves meaningful time compared to reconstructing it from documentation or browser DevTools by hand.&lt;/p&gt;

&lt;p&gt;One thing it doesn't do: k6 Studio doesn't understand your application's business logic. It records what the browser sent. It doesn't know that the &lt;code&gt;cartId&lt;/code&gt; in the cart request needs to match the session, or that the currency selector needs to be set before the price conversion call. That logic lives in the &lt;code&gt;lib/&lt;/code&gt; layer and you add it manually after import.&lt;/p&gt;




&lt;h2&gt;
  
  
  Setting up synthetic monitoring
&lt;/h2&gt;

&lt;p&gt;Synthetic monitoring turns a k6 script into a scheduled check that runs from Grafana's global probe network. The same script that ran as a local load test becomes a permanent canary executing on a set interval (as often as every minute), from multiple locations, alerting when it fails.&lt;/p&gt;

&lt;p&gt;The setup lives in Grafana Cloud under Synthetic Monitoring → Scripted. You paste your script, configure the probe locations, set the execution interval, and save. The script runs against your target URL on that schedule indefinitely.&lt;/p&gt;

&lt;p&gt;For the Online Boutique I used the smoke test script with the BASE_URL pointed at the ngrok tunnel:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// k6-boutique/src/config/smoke.config.json — top-level thresholds&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;thresholds&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http_req_failed&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;rate&amp;lt;0.05&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http_req_duration&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;p(95)&amp;lt;2000&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;checks&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;            &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;rate&amp;gt;0.90&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="c1"&gt;// ...plus per-group group_duration thresholds, omitted here&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The probe locations I selected: North Virginia (US East), London (EU West), and Tokyo (Asia Pacific). One note on interval: the docs and most tutorials assume a one-minute frequency, but Synthetic Monitoring's free tier caps you at 100,000 check executions/month, and a scripted check fanned out to three probes at a one-minute interval consumes ~130,000/month on its own (3 probes × 43,200 minutes in a 30-day month). To stay inside the free tier I ran the three probes at a &lt;strong&gt;two-minute&lt;/strong&gt; interval (~65,000/month). Every two minutes, each probe runs the smoke test against the public URL and reports pass/fail, response time, and check results back to Grafana Cloud.&lt;/p&gt;
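&lt;p&gt;The interval budget can be worked out before you create the check. This is a rough sketch in plain JavaScript; the helper &lt;code&gt;monthlyExecutions&lt;/code&gt; is hypothetical, and it assumes a 30-day month and that each probe run counts as one execution against the cap.&lt;/p&gt;

```javascript
// Hypothetical helper: executions per month for one scripted check.
// Assumes a 30-day month and one execution per probe per run.
function monthlyExecutions(probeCount, intervalMinutes) {
  const minutesPerMonth = 60 * 24 * 30; // 43,200
  return probeCount * (minutesPerMonth / intervalMinutes);
}

const freeTierCap = 100000; // monthly cap quoted above

for (const interval of [1, 2]) {
  const runs = monthlyExecutions(3, interval);
  const fits = freeTierCap >= runs;
  console.log(`${interval}-minute interval: ${runs} executions/month, within cap: ${fits}`);
}
// 1-minute interval: 129600 executions/month, within cap: false
// 2-minute interval: 64800 executions/month, within cap: true
```

&lt;p&gt;Dropping to two probes at one minute (86,400/month) would also fit under the cap, at the cost of one vantage point.&lt;/p&gt;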

&lt;p&gt;For alerting you'd wire the check's pass rate to a contact point: if it drops below 95% for two consecutive probe intervals, fire to Slack. At a two-minute interval that's a ~four-minute detection window — fast enough to catch a real availability incident, slow enough to ride out a single-probe flake.&lt;/p&gt;
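&lt;p&gt;In Grafana the rule itself is configured as an alert rule, not written as code, but the logic it encodes is small enough to sketch. The function &lt;code&gt;shouldAlert&lt;/code&gt; and the sample pass rates below are made up for illustration; only the 95% floor and the two-interval window come from the paragraph above.&lt;/p&gt;

```javascript
// Hypothetical evaluation of the rule described above: fire when the pass
// rate stays below the floor for N consecutive intervals.
function shouldAlert(passRates, floor = 0.95, consecutive = 2) {
  let streak = 0;
  for (const rate of passRates) {
    streak = floor > rate ? streak + 1 : 0;
    if (streak >= consecutive) return true;
  }
  return false;
}

// Made-up pass rates, one per two-minute interval (not measured data).
console.log(shouldAlert([1.0, 0.9, 1.0, 1.0])); // false: a single flake resets the streak
console.log(shouldAlert([1.0, 0.9, 0.8, 1.0])); // true: two bad intervals in a row
```

&lt;p&gt;Requiring two consecutive intervals is what trades detection time for tolerance of a single dropped request.&lt;/p&gt;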

&lt;p&gt;Unlike the load test, synthetic monitoring runs at a low request rate — three probes, once every two minutes — so it never approaches the tunnel's &lt;em&gt;rate&lt;/em&gt; limits. But "low rate" is not "zero failures," and that turned out to be the interesting part. Over a collection window of a few intervals, the measured per-probe numbers were:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Probe                       avg http_req_duration   check pass rate   checkout success
North Virginia (US East)    72 ms                   88%               100%
London (EU West)            277 ms                  95%               100%
Tokyo (Asia Pacific)        324 ms                  90%               100%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two things stand out. First, checks did &lt;strong&gt;not&lt;/strong&gt; sit at a clean 100%. They ran 88–95%, because the free tunnel dropped the occasional request even at this trickle of traffic. The checkout flow itself succeeded 100% of the time on every probe; the misses were on the homepage and product fetches, the same tunnel-shedding signature the load test surfaced, just much rarer. The lesson from earlier holds at every scale: you're measuring the whole path, and the free tunnel is the weakest link in it.&lt;/p&gt;

&lt;p&gt;Second, the latency gradient by location is real and expected, but note &lt;em&gt;which&lt;/em&gt; probe is fastest. North Virginia comes in lowest at 72 ms because ngrok's edge and the cluster are both US-based, so that probe barely leaves the country. London and Tokyo are roughly 4–4.5× higher (277 ms and 324 ms) not because the app is slower for them, but because their requests cross an ocean to reach the US edge before they ever touch the cluster. The cluster is physically in the US; the speed of light does the rest. This is something a local load test, with the runner next to the cluster on the same LAN, can never show you.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the unified view actually gives you
&lt;/h2&gt;

&lt;p&gt;By the end of this series, the k6 setup does three distinct things that look the same from the outside but serve different purposes.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;k6 run&lt;/code&gt; during development catches regressions before they ship. You run the smoke test against a branch before opening a PR. If response times have jumped or a check is failing, you find out before the reviewer does.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;k6 cloud run&lt;/code&gt; during staging runs the full load and stress scenarios from Grafana's load generators and puts the results in the same observability stack as your infrastructure metrics. When the p95 product page latency spikes at 100 VUs, you can open the same Grafana instance and look at CPU and memory on the catalog and recommendation service pods at that exact moment. The load test result and the infrastructure telemetry share a timestamp axis.&lt;/p&gt;

&lt;p&gt;Synthetic monitoring in production tells you what users are experiencing right now, from where they are, continuously. It is not a snapshot from the last test run but a live signal.&lt;/p&gt;

&lt;p&gt;The same script, version-controlled, reviewed, and maintained like application code, powers all three.&lt;/p&gt;




&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;This series started with a 30-line script and a philosophical argument: load tests should be code, not configuration. By the end it's a layered test suite, a Grafana dashboard, a stress test that revealed something real about a microservices call graph, a CLS finding that HTTP testing would never have surfaced, and a synthetic monitor running checks from three continents.&lt;/p&gt;

&lt;p&gt;The tooling is k6 and Grafana Cloud. The underlying idea is that performance isn't a phase before launch; it's a property of the system that you measure continuously, with the same rigor you bring to the rest of your engineering.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;#k6 #Grafana #SyntheticMonitoring #LoadTesting #Observability #SRE #Kubernetes #WebVitals&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>monitoring</category>
      <category>performance</category>
      <category>testing</category>
    </item>
    <item>
      <title>Custom Metrics, Stress Testing, and Web Vitals, Going Beyond Basic Load Testing with k6</title>
      <dc:creator>Matthew Wimpelberg</dc:creator>
      <pubDate>Mon, 08 Jun 2026 09:06:37 +0000</pubDate>
      <link>https://dev.to/matthew_wimpelberg_79193b/custom-metrics-stress-testing-and-web-vitals-going-beyond-basic-load-testing-with-k6-4moc</link>
      <guid>https://dev.to/matthew_wimpelberg_79193b/custom-metrics-stress-testing-and-web-vitals-going-beyond-basic-load-testing-with-k6-4moc</guid>
      <description>&lt;h1&gt;
  
  
  Part 3 of 4: Custom Metrics, Stress Testing, and Web Vitals — Going Beyond Basic Load Testing with k6
&lt;/h1&gt;

&lt;p&gt;In part 2 I built a layered test suite against Google's Online Boutique on a homelab Kubernetes cluster. Smoke passed. The load test ran clean after fixing two bugs: a wrong assertion string on checkout and a missing &lt;code&gt;await&lt;/code&gt; in the browser scenario. The load test summary showed p95 response times at 273ms, checkout success at 100%, and a CLS score of 0.117 nudging just over the 0.10 threshold.&lt;/p&gt;

&lt;p&gt;That left three things unfinished. The stress test hadn't run. The CLS finding had no explanation. And the four custom metric types I'd defined in the scenarios deserved more than a passing mention.&lt;/p&gt;

&lt;p&gt;This post runs the stress test, reads the results architecturally, explains what CLS 0.117 actually means and why HTTP testing would never have surfaced it, and walks through all four custom metric types with concrete examples of when each one is the right tool.&lt;/p&gt;

&lt;p&gt;All the code is here: &lt;a href="https://github.com/mwimpelberg28/k6-playground" rel="noopener noreferrer"&gt;https://github.com/mwimpelberg28/k6-playground&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Running the stress test
&lt;/h2&gt;

&lt;p&gt;The stress config ramps VUs up in two-minute stages to a peak of 150, then ramps back down. The goal isn't "break the app"; it's to find where degradation starts and understand the shape of it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;src/config/stress.config.json&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scenarios"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"stress"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"executor"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ramping-vus"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"exec"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stressFlow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"stages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"gracefulStop"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"30s"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"thresholds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"http_req_failed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;                        &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"rate&amp;lt;0.10"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"http_req_duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;                      &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"p(95)&amp;lt;5000"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"group_duration{group:::homepage}"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"avg&amp;lt;1000"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"group_duration{group:::browse product}"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"avg&amp;lt;2000"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"checks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;                                 &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"rate&amp;gt;0.70"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
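&lt;p&gt;To read that stage list as a timeline, it helps to compute the total run length and the planned VU count at a given minute. The sketch below is plain JavaScript, not part of the test suite. The helpers &lt;code&gt;minutes&lt;/code&gt; and &lt;code&gt;plannedVUs&lt;/code&gt; are hypothetical, it assumes linear ramps between stage targets, and the &lt;code&gt;startVUs&lt;/code&gt; value is an assumption because the config above does not set one.&lt;/p&gt;

```javascript
// Stage list copied from stress.config.json above.
const stages = [
  { duration: "2m", target: 50 },
  { duration: "2m", target: 100 },
  { duration: "2m", target: 150 },
  { duration: "2m", target: 100 },
  { duration: "2m", target: 0 },
];

// Hypothetical parser that only understands whole-minute durations like "2m".
function minutes(duration) {
  return Number(duration.replace("m", ""));
}

// Approximate planned VUs at minute t, assuming linear ramps between targets.
// startVUs is an assumption; the config above does not specify it.
function plannedVUs(t, startVUs = 0) {
  let from = startVUs;
  let elapsed = 0;
  for (const stage of stages) {
    const length = minutes(stage.duration);
    if (elapsed + length >= t) {
      return from + ((stage.target - from) * (t - elapsed)) / length;
    }
    from = stage.target;
    elapsed += length;
  }
  return 0; // past the end of the last stage
}

const totalMinutes = stages.reduce((sum, s) => sum + minutes(s.duration), 0);
console.log(totalMinutes);  // 10
console.log(plannedVUs(3)); // 75, halfway up the 50 to 100 ramp
console.log(plannedVUs(6)); // 150, the peak
```

&lt;p&gt;Mapping a timestamp in the results back to a VU count this way is what lets you say degradation began "around 100 VUs" rather than "about four minutes in".&lt;/p&gt;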



&lt;p&gt;The thresholds are deliberately looser than the load test. Stress isn't about enforcing SLOs; it's about observing where and how the system degrades before it hits a hard wall. A stress test that fails immediately at tight thresholds tells you nothing useful about the degradation curve.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm run stress
&lt;span class="c"&gt;# k6 run dist/test.main.js -e CONFIG_FILE=../src/config/stress.config.json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  What the stress test showed
&lt;/h2&gt;

&lt;p&gt;The homepage held through the full ramp. Browse product requests started accumulating failures around the 100 VU mark, and by 150 VUs the check pass rate for product pages had dropped noticeably while the homepage check pass rate stayed flat.&lt;/p&gt;

&lt;p&gt;That divergence is the finding. The homepage and product page both live in the same frontend service, on the same pod. If the frontend service itself were the bottleneck, both would degrade together. They didn't.&lt;/p&gt;

&lt;p&gt;The difference is what each endpoint does downstream. The homepage makes one call: fetch featured products from the catalog service. The product page makes three in parallel: fetch product details from the catalog service, fetch recommendations from the recommendation service, and convert the price via the currency service. Under low concurrency that fan-out is invisible. Under high concurrency, those downstream services start queuing work, and the product page's response time is gated on whichever of the three takes longest.&lt;/p&gt;
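&lt;p&gt;A minimal sketch of why a parallel fan-out is gated on its slowest dependency. The latencies are made-up illustrative numbers, not measurements from this stress test, and &lt;code&gt;parallelFanOutLatency&lt;/code&gt; is a hypothetical helper:&lt;/p&gt;

```javascript
// Hypothetical downstream latencies in milliseconds (not measured values).
function parallelFanOutLatency(downstreamMs) {
  // Calls issued in parallel all finish when the slowest one finishes.
  return Math.max(...downstreamMs);
}

// Homepage: one call to the catalog service.
const homepageMs = parallelFanOutLatency([40]);

// Product page: catalog, recommendations, currency, with one of them queuing.
const productMs = parallelFanOutLatency([40, 900, 60]);

console.log(homepageMs, productMs); // 40 900
```

&lt;p&gt;Under these assumed numbers a single slow dependency sets the product page's latency, while the homepage, which never calls that dependency, is unaffected.&lt;/p&gt;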

&lt;p&gt;This is one of the defining properties of microservices under load. Call graph depth matters more than frontend capacity. A single downstream service that saturates its thread pool or starts garbage collecting will cause latency spikes in every upstream caller that touches it, and only in those callers. The homepage, which doesn't touch the recommendation or currency service, keeps serving cleanly.&lt;/p&gt;

&lt;p&gt;The stress test didn't break the app catastrophically. The homepage never went down. That's actually a well-behaved degradation pattern: the system is shedding load on complex, expensive paths while protecting simple ones. A poorly behaved version of this would see the frontend process itself crash, taking everything with it. What we observed instead was selective degradation by call graph complexity, which points directly at the downstream services as the constraint rather than the frontend.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why group_duration catches this and http_req_duration doesn't
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;http_req_duration&lt;/code&gt; measures how long a single HTTP request takes. During the stress test, individual requests to the frontend completed in reasonable time even as the app was struggling. The frontend was accepting connections and dispatching work quickly. What was slow was waiting for the downstream calls to come back.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;group_duration&lt;/code&gt; measures the wall-clock time of a named step end-to-end, including any sequential calls inside it. Every &lt;code&gt;group()&lt;/code&gt; in the scripts layer gets a corresponding &lt;code&gt;group_duration{group:::name}&lt;/code&gt; series automatically with no extra instrumentation required.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// threshold on a single request&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http_req_duration{journey:shopper}&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;p(95)&amp;lt;4000&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;

&lt;span class="c1"&gt;// threshold on the full browse step including downstream wait time&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;group_duration{group:::browse product}&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;avg&amp;lt;2000&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the only threshold had been &lt;code&gt;http_req_duration&lt;/code&gt;, the stress test would have looked healthier than it was. The group threshold on &lt;code&gt;browse product&lt;/code&gt; caught the degradation because it was measuring the step the user actually experiences: from initiating the product page load to receiving a complete response, including all downstream latency.&lt;/p&gt;
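&lt;p&gt;A rough sketch of how a per-request limit and a per-step limit can disagree. It assumes a step made of sequential requests, uses invented durations for a single iteration, and compares raw values rather than the p(95) and avg aggregates that k6 thresholds actually evaluate:&lt;/p&gt;

```javascript
// Hypothetical request durations (ms) inside one "browse product" step.
const requestDurationsMs = [600, 650, 700, 620];

// Wall-clock time of a step whose requests run one after another is their sum.
function stepWallClock(durationsMs) {
  return durationsMs.reduce((total, ms) => total + ms, 0);
}

const slowestRequestMs = Math.max(...requestDurationsMs);
const stepMs = stepWallClock(requestDurationsMs);

// Limits borrowed from the thresholds shown earlier: 4000 ms per request, 2000 ms per step.
const requestLimitBreached = slowestRequestMs > 4000;
const stepLimitBreached = stepMs > 2000;

console.log(slowestRequestMs, requestLimitBreached); // 700 false
console.log(stepMs, stepLimitBreached);              // 2570 true
```

&lt;p&gt;Every request looks fine on its own, yet the step as a whole is over budget, which is the gap a group-level threshold is meant to close.&lt;/p&gt;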

&lt;p&gt;This is the shift from infrastructure metrics to user-experience metrics. The group is the unit of SLO, not the request.&lt;/p&gt;




&lt;h2&gt;
  
  
  The four custom metric types
&lt;/h2&gt;

&lt;p&gt;k6 ships four custom metric types. Each has a specific meaning that makes it right for certain questions and wrong for others. Using the wrong one produces data that's technically correct but practically misleading.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trend&lt;/strong&gt; collects a distribution of values and exposes percentiles, min, max, and average. Use it when you want to know what "typical" looks like across all iterations. Checkout duration is a Trend because you want to know p95, the duration that 95% of checkout attempts stayed at or under, not just whether checkout ever succeeded.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;checkoutDuration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Trend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boutique_checkout_duration&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// record it — called once per checkout attempt&lt;/span&gt;
&lt;span class="nx"&gt;checkoutDuration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// threshold against it&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;boutique_checkout_duration&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;p(95)&amp;lt;5000&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The second argument to &lt;code&gt;new Trend()&lt;/code&gt; is &lt;code&gt;isTime&lt;/code&gt;. Pass &lt;code&gt;true&lt;/code&gt; when the values are milliseconds and k6 will format them as time in the terminal output rather than raw numbers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rate&lt;/strong&gt; measures the fraction of recorded values that were successful. Use it when you want a success or failure percentage. Checkout success is a Rate because "82% of checkouts completed" is a statement that maps to a business SLO. "134 checkouts completed" is a count that requires context to interpret, context that changes depending on how many VUs ran and for how long.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;checkoutSuccess&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Rate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boutique_checkout_success&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// record it — true = success, false = failure&lt;/span&gt;
&lt;span class="nx"&gt;checkoutSuccess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// threshold against it&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;boutique_checkout_success&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;rate&amp;gt;0.80&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Counter&lt;/strong&gt; accumulates a total. Use it when you want an absolute count of something. Cart errors are a Counter rather than a Rate because even a low error rate can represent a large absolute number of failures at high VU counts, and in a real business, each cart error is a customer who couldn't buy something. A Rate tells you the proportion; a Counter tells you the magnitude.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cartErrors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boutique_cart_errors&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// record it&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;cartOk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;cartErrors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Gauge&lt;/strong&gt; records the current value of something at the moment it's called. Unlike Trend, it doesn't accumulate a distribution; it reflects the most recent reading. Use it for point-in-time state: how many sessions are active right now, what's the current queue depth, is a feature flag on or off. In a test context it's less common than the others, but it's the right tool when you care about instantaneous state rather than aggregate behavior.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;activeSessions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Gauge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boutique_active_sessions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;activeSessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cookieJar&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;cookiesForURL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;BASE_URL&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The practical summary: percentile distribution → Trend. Success/failure percentage → Rate. Running total → Counter. Point-in-time reading → Gauge.&lt;/p&gt;
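&lt;p&gt;To make the four aggregations concrete, here is a plain-JavaScript sketch of what each one computes over a handful of invented samples. These helpers are stand-ins for illustration, not k6's implementation; in particular, the nearest-rank percentile is a simplification chosen for this sketch:&lt;/p&gt;

```javascript
// Invented samples, one array per metric type.
const checkoutDurationsMs = [120, 180, 150, 900, 200];   // Trend samples
const checkoutResults = [true, true, false, true, true]; // Rate samples
const cartErrorEvents = [1, 1, 1];                       // Counter increments
const sessionReadings = [3, 7, 5];                       // Gauge readings

// Trend: nearest-rank percentile over the sorted samples.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Rate: fraction of samples that were truthy.
function rate(samples) {
  return samples.filter(Boolean).length / samples.length;
}

// Counter: running total of everything added.
function total(samples) {
  return samples.reduce((sum, n) => sum + n, 0);
}

// Gauge: only the most recent reading survives.
function last(samples) {
  return samples[samples.length - 1];
}

console.log(percentile(checkoutDurationsMs, 95)); // 900
console.log(rate(checkoutResults));               // 0.8
console.log(total(cartErrorEvents));              // 3
console.log(last(sessionReadings));               // 5
```

&lt;p&gt;The same events answer different questions depending on the aggregation: the single 900 ms checkout dominates the p95, barely moves an average, and is invisible to a Rate or Counter.&lt;/p&gt;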

&lt;p&gt;All four stream to Grafana Cloud as Prometheus time series when you run with &lt;code&gt;k6 cloud run&lt;/code&gt;. The names you define in code become the series names. You query them in Grafana exactly like any other metric — &lt;code&gt;rate(boutique_cart_errors[5m])&lt;/code&gt;, &lt;code&gt;histogram_quantile(0.95, boutique_checkout_duration)&lt;/code&gt;. Your load test data lives in the same datasource as your infrastructure metrics, with the same query language and the same alerting system.&lt;/p&gt;




&lt;h2&gt;
  
  
  The k6 Browser module and Web Vitals
&lt;/h2&gt;

&lt;p&gt;The load test results from part 2 included a CLS score of 0.117, just over the 0.10 "good" threshold. To understand that number you need to know what the browser module is measuring and why it's different from everything else in the test suite.&lt;/p&gt;

&lt;p&gt;The browser module runs a real Chromium instance. It's not a simulated HTTP client but an actual browser: rendering pages, executing JavaScript, loading images, painting layout. Web Vitals are measurements taken from inside that rendering process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LCP (Largest Contentful Paint)&lt;/strong&gt;: when did the largest visible element finish rendering? Measures perceived load speed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FCP (First Contentful Paint)&lt;/strong&gt;: when did any content first appear? Measures how quickly the page starts showing something.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TTFB (Time to First Byte)&lt;/strong&gt;: how long before the browser received the first byte of the response? Measures server and network responsiveness.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CLS (Cumulative Layout Shift)&lt;/strong&gt;: how much did the page layout move around after initial render? Measures visual stability.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of these are measurable from the HTTP layer. An HTTP client can tell you the server responded in 80ms. It can't tell you the user saw a blank white screen for 1.2 seconds while JavaScript parsed, or that the page jumped when an image loaded late and pushed all the text down.&lt;/p&gt;

&lt;p&gt;The entry point is separate from the HTTP tests. Browser scenarios require Chromium, which can't share a process with the HTTP engine.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/browser.js&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;k6/browser&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;check&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;   &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;k6&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;sleep&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;   &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;k6&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;scenarios&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;browser_smoke&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;        &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;per-vu-iterations&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;         &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;chromium&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;vus&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;             &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;iterations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;gracefulStop&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;30s&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;thresholds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;browser_web_vital_lcp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;p(75)&amp;lt;2500&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;browser_web_vital_fcp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;p(75)&amp;lt;1800&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;browser_web_vital_ttfb&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;p(75)&amp;lt;800&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;browser_web_vital_cls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;p(75)&amp;lt;0.10&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;newPage&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://10.4.20.2/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;check&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;page title present&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;shows Hot Products&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;content&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Hot Products&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three implementation details are worth being explicit about.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;async/await&lt;/code&gt; throughout. The browser API is Promise-based. &lt;code&gt;page.title()&lt;/code&gt; returns a Promise, not a string. Without &lt;code&gt;await&lt;/code&gt;, &lt;code&gt;page.title().length&lt;/code&gt; reads a property of the Promise object rather than of the title string (a Promise instance has no &lt;code&gt;length&lt;/code&gt;, so the expression is &lt;code&gt;undefined&lt;/code&gt;). The check result then has nothing to do with the page and measures nothing. This is the bug from part 2. Every browser API call needs to be awaited, including &lt;code&gt;page.goto()&lt;/code&gt;, &lt;code&gt;page.content()&lt;/code&gt;, and &lt;code&gt;page.title()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;try/finally&lt;/code&gt; around page operations. If a check throws, the &lt;code&gt;finally&lt;/code&gt; block closes the page regardless. Without it, failed iterations leak Chromium instances and the test eventually exhausts memory. This isn't optional defensive programming — it's required for browser tests to be reliable.&lt;/p&gt;

&lt;p&gt;No &lt;code&gt;group()&lt;/code&gt; calls in browser scenarios. There's a long-standing k6 issue with groups in browser context. Use named checks for step-level visibility instead.&lt;/p&gt;
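&lt;p&gt;The cleanup guarantee of &lt;code&gt;try/finally&lt;/code&gt; can be seen in isolation with a small sketch. &lt;code&gt;newFakePage&lt;/code&gt; is a hypothetical stand-in that only counts open pages; it is not the k6 browser API:&lt;/p&gt;

```javascript
// Hypothetical stand-in for a browser page: tracks how many pages are open.
let openPages = 0;
function newFakePage() {
  openPages += 1;
  return { close: () => { openPages -= 1; } };
}

function iteration(shouldThrow) {
  const page = newFakePage();
  try {
    if (shouldThrow) throw new Error('simulated failure');
  } finally {
    page.close(); // runs on both the success path and the failure path
  }
}

iteration(false);
try {
  iteration(true);
} catch (err) {
  // The failure still propagates; only the cleanup is guaranteed.
}

console.log(openPages); // 0
```

&lt;p&gt;Dropping the &lt;code&gt;finally&lt;/code&gt; block in this sketch would leave one page open after the failing iteration, which is the leak described above in miniature.&lt;/p&gt;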




&lt;h2&gt;
  
  
  What CLS 0.117 actually means
&lt;/h2&gt;

&lt;p&gt;The part 2 results showed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;browser_web_vital_lcp:  avg=335ms  p(75)=380ms   ✓
browser_web_vital_fcp:  avg=255ms  p(75)=290ms   ✓
browser_web_vital_ttfb: avg=36ms   p(75)=42ms    ✓
browser_web_vital_cls:  avg=0.117  p(75)=0.121   ✗
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;LCP, FCP, and TTFB are well inside their thresholds. TTFB at 36ms reflects a local network with no physical distance between test runner and server, but also a frontend service responding promptly. Nothing to fix there.&lt;/p&gt;

&lt;p&gt;CLS at 0.117 failed the threshold. CLS accumulates a score each time visible content shifts position after the initial render — specifically, it measures the fraction of the viewport affected multiplied by the distance the content moved. A score of 0 means nothing shifted. A score over 0.10 is Google's boundary between "good" and "needs improvement."&lt;/p&gt;
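&lt;p&gt;As a worked example of that multiplication, the score for a single shift is the impact fraction times the distance fraction, both expressed relative to the viewport. The numbers below are hypothetical and are not derived from the 0.117 measurement:&lt;/p&gt;

```javascript
// Score for one layout shift: share of the viewport affected
// multiplied by how far the content moved (as a share of the viewport).
function layoutShiftScore(impactFraction, distanceFraction) {
  return impactFraction * distanceFraction;
}

// Hypothetical shift: half the viewport is affected and the content
// moves down by a quarter of the viewport height.
const score = layoutShiftScore(0.5, 0.25);

console.log(score);        // 0.125
console.log(score > 0.10); // true: over the "good" boundary
```

&lt;p&gt;A single late-loading image row that displaces half the visible page by a quarter of its height would, under these assumptions, be enough on its own to miss the 0.10 target.&lt;/p&gt;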

&lt;p&gt;On the Online Boutique homepage, the most likely cause is the product image grid. The browser paints the page structure (nav, heading, product card containers) before the images have loaded. When the images arrive, they push the surrounding layout down. The browser records that shift and adds it to the CLS score.&lt;/p&gt;

&lt;p&gt;The fix is to tell the browser how much space each image will occupy before it loads. Explicit &lt;code&gt;width&lt;/code&gt; and &lt;code&gt;height&lt;/code&gt; attributes on &lt;code&gt;&amp;lt;img&amp;gt;&lt;/code&gt; tags, or a CSS &lt;code&gt;aspect-ratio&lt;/code&gt; declaration on the container, lets the browser reserve the right amount of space during initial layout. The images load into that space without causing a shift.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- causes layout shift — browser doesn't know the image dimensions --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;img&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"/static/img/products/sunglasses.jpg"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;

&lt;span class="c"&gt;&amp;lt;!-- no layout shift — browser reserves space before the image loads --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;img&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"/static/img/products/sunglasses.jpg"&lt;/span&gt; &lt;span class="na"&gt;width=&lt;/span&gt;&lt;span class="s"&gt;"320"&lt;/span&gt; &lt;span class="na"&gt;height=&lt;/span&gt;&lt;span class="s"&gt;"320"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Online Boutique frontend doesn't do this. The product images load asynchronously into unsized containers and shift the layout on arrival. A 0.117 CLS score won't meaningfully affect search rankings on its own, but it is a real user experience problem: content jumps while someone is trying to read it. It's exactly the class of issue that HTTP testing never surfaces, because HTTP testing doesn't render anything.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;Part 4 closes the loop on the unified observability story that started in part 1. &lt;code&gt;k6 run&lt;/code&gt; is a developer workflow. &lt;code&gt;k6 cloud run&lt;/code&gt; streams results into Grafana Cloud in real time. k6 Studio is a visual test editor that generates scripts without writing code. And synthetic monitoring turns the same scripts you've been running locally into scheduled checks from global probe locations: the same code, running permanently, alerting when production degrades.&lt;/p&gt;

&lt;p&gt;The shift from load test to production monitor is what makes k6 different from most testing tools. Part 4 is about making that shift concrete.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;#k6 #Grafana #LoadTesting #WebVitals #PerformanceTesting #Observability #SRE #Kubernetes&lt;/em&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>performance</category>
      <category>testing</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Building a Real k6 Test Suite Against a Live Kubernetes App</title>
      <dc:creator>Matthew Wimpelberg</dc:creator>
      <pubDate>Thu, 28 May 2026 15:29:02 +0000</pubDate>
      <link>https://dev.to/matthew_wimpelberg_79193b/part-2-of-4-building-a-real-k6-test-suite-against-a-live-kubernetes-app-1f81</link>
      <guid>https://dev.to/matthew_wimpelberg_79193b/part-2-of-4-building-a-real-k6-test-suite-against-a-live-kubernetes-app-1f81</guid>
      <description>&lt;h1&gt;
  
  
  Part 2 of 4: Building a Real k6 Test Suite Against a Live Kubernetes App
&lt;/h1&gt;

&lt;p&gt;In part 1 I covered k6's philosophy and the anatomy of a first test. This post is where things get real — a production-grade test suite running against a live microservices app on a homelab Kubernetes cluster, including what went wrong on the first run and how I debugged it. All of the code can be found here: &lt;a href="https://github.com/mwimpelberg28/k6-playground" rel="noopener noreferrer"&gt;https://github.com/mwimpelberg28/k6-playground&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The target: Online Boutique
&lt;/h2&gt;

&lt;p&gt;Rather than testing against a mock or a toy API, I wanted something that resembles a real production system. Google's Online Boutique is a microservices demo app with 11 services covering a realistic e-commerce stack: frontend, cart, checkout, product catalog, currency conversion, recommendations, and more.&lt;/p&gt;

&lt;p&gt;Deploying it took about two minutes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl create namespace boutique
kubectl apply &lt;span class="nt"&gt;-n&lt;/span&gt; boutique &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  https://raw.githubusercontent.com/GoogleCloudPlatform/microservices-demo/main/release/kubernetes-manifests.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;My homelab runs a kubeadm cluster on Ubuntu with MetalLB for load balancing. Within 30 seconds MetalLB had assigned a real external IP and the app was serving traffic at &lt;code&gt;http://10.4.20.2&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get svc &lt;span class="nt"&gt;-n&lt;/span&gt; boutique frontend-external
&lt;span class="c"&gt;# NAME                TYPE           EXTERNAL-IP   PORT(S)&lt;/span&gt;
&lt;span class="c"&gt;# frontend-external   LoadBalancer   10.4.20.2     80:xxxxx/TCP&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The architecture decision that matters most
&lt;/h2&gt;

&lt;p&gt;Before writing a single test I designed a layered project structure. This is the difference between a test suite and a folder of scripts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;k6-boutique/
├── src/
│   ├── config/          ← test options as JSON, selected at runtime
│   │   ├── smoke.config.json
│   │   ├── load.config.json
│   │   ├── stress.config.json
│   │   └── browser.config.json
│   ├── scenarios/       ← user journey flows: chain scripts + sleep
│   │   ├── browseFlow.js
│   │   ├── shopperFlow.js
│   │   ├── currencyFlow.js
│   │   └── stressFlow.js
│   ├── scripts/         ← individual page actions: one group() per file
│   │   ├── home.js
│   │   ├── product.js
│   │   ├── cart.js
│   │   ├── checkout.js
│   │   └── currency.js
│   ├── pages/           ← Page Object Model classes for browser tests
│   │   ├── HomePage.js
│   │   └── ProductPage.js
│   ├── lib/             ← shared HTTP client and check assertions
│   │   ├── client.js
│   │   └── checks.js
│   ├── main.js          ← single entry point for all HTTP tests
│   └── browser.js       ← entry point for browser tests
├── webpack.config.js
└── package.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Think of it as Lego blocks. The &lt;code&gt;lib/&lt;/code&gt; layer knows how to talk to the app. The &lt;code&gt;scripts/&lt;/code&gt; layer wraps each action in a named &lt;code&gt;group()&lt;/code&gt;. The &lt;code&gt;scenarios/&lt;/code&gt; layer chains those actions into user journeys. The &lt;code&gt;config/&lt;/code&gt; layer defines the load profile and thresholds for each test type. Nothing reaches down more than one layer.&lt;/p&gt;

&lt;h3&gt;
  
  
  The shared client
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;src/lib/client.js&lt;/code&gt; holds everything needed to talk to the app: the base URL, request helpers, product IDs, and the checkout payload. Every request goes through it, so changing the target URL in one place updates every test.&lt;/p&gt;

&lt;p&gt;One detail worth calling out: every request carries a &lt;code&gt;name&lt;/code&gt; tag.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/lib/client.js&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;params&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;baseHeaders&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;frontend&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getProduct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;BASE_URL&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/product/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;params&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;get-product&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;addToCart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;quantity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;BASE_URL&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/cart`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nf"&gt;params&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;add-to-cart&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without the &lt;code&gt;name&lt;/code&gt; tag, k6 tracks &lt;code&gt;/product/0PUK6V6EV0&lt;/code&gt; and &lt;code&gt;/product/1YMWWN1N4O&lt;/code&gt; as separate metric series. With 10 product IDs and many VUs you hit Grafana Cloud's "too many series" limit fast. The &lt;code&gt;name&lt;/code&gt; tag collapses all product page requests into a single &lt;code&gt;get-product&lt;/code&gt; series regardless of the ID in the URL.&lt;/p&gt;
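
&lt;p&gt;To make the effect concrete, here is a minimal sketch in plain JavaScript (runnable under Node, no k6 required). It assumes a simplified model in which a series is identified only by the request's &lt;code&gt;name&lt;/code&gt; tag, which k6 sets to the full URL unless you override it. The helper &lt;code&gt;countSeries&lt;/code&gt; and the generated product IDs are hypothetical and exist only for illustration.&lt;/p&gt;

```javascript
// Simplified model: one time series per distinct value of the `name` tag.
// Real k6 series also vary by method, status, scenario, and other tags.
const BASE_URL = 'http://10.4.20.2';

// Ten made-up product IDs, for illustration only.
const productIds = Array.from({ length: 10 }, (_, i) => `PRODUCT${i}`);

// Without an explicit name, k6 falls back to the full request URL.
const untagged = productIds.map((id) => ({ url: `${BASE_URL}/product/${id}` }));

// With the name tag, every product request shares one label.
const tagged = productIds.map((id) => ({
  url: `${BASE_URL}/product/${id}`,
  name: 'get-product',
}));

// Hypothetical helper: count distinct series labels in a list of requests.
function countSeries(requests) {
  const labels = requests.map((r) => r.name || r.url);
  return new Set(labels).size;
}

console.log(countSeries(untagged)); // 10
console.log(countSeries(tagged));   // 1
```

&lt;p&gt;The same reasoning applies to any URL with an ID, token, or query string in it: each distinct value becomes its own label unless the request is tagged.&lt;/p&gt;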

&lt;h3&gt;
  
  
  The shared checks
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;src/lib/checks.js&lt;/code&gt; knows what a good response looks like for each page:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/lib/checks.js&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;checkHome&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;check&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;status 200&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;shows products&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Hot Products&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;response &amp;lt; 2s&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;timings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;duration&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Define it once, use it everywhere. When the app changes, fix it in one place.&lt;/p&gt;

&lt;h3&gt;
  
  
  The scripts layer
&lt;/h3&gt;

&lt;p&gt;Each file in &lt;code&gt;scripts/&lt;/code&gt; wraps one action in a named &lt;code&gt;group()&lt;/code&gt; and runs the appropriate check. This is the unit of reuse: scenarios call these functions rather than making raw HTTP calls.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/scripts/product.js&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;browseProduct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;browse product&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;checkProductPage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;getProduct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;viewProduct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;view product&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;checkProductPage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;getProduct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Different group names matter: &lt;code&gt;group_duration{group:::browse product}&lt;/code&gt; and &lt;code&gt;group_duration{group:::view product}&lt;/code&gt; are tracked as separate sub-metrics of &lt;code&gt;group_duration&lt;/code&gt;, so you can set different SLAs for casual browsing and intent-to-buy flows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Config files drive everything
&lt;/h2&gt;

&lt;p&gt;Rather than hardcoding load profiles in test files, each test type has a JSON config file that's passed at runtime. The single entry point reads whichever config you point it at:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/main.js&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CONFIG_FILE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;__ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;CONFIG_FILE&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;../src/config/smoke.config.json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;testConfig&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;CONFIG_FILE&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assign&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;insecureSkipTlsVerify&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nx"&gt;testConfig&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;setup&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;getHome&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;  &lt;span class="c1"&gt;// warm the connection before VUs start&lt;/span&gt;
  &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Named exports so scenario `exec` fields in the JSON config can reference them&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;browseFlow&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;shopperFlow&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;currencyFlow&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;stressFlow&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The build step (webpack) bundles everything into &lt;code&gt;dist/test.main.js&lt;/code&gt;. The JSON config files stay outside the bundle and are opened at runtime, so you can swap them without rebuilding.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm run build

&lt;span class="c"&gt;# local run&lt;/span&gt;
k6 run dist/test.main.js &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;CONFIG_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;../src/config/load.config.json

&lt;span class="c"&gt;# cloud run&lt;/span&gt;
k6 cloud run dist/test.main.js &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;CONFIG_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;../src/config/load.config.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
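
&lt;p&gt;One property of this setup is worth spelling out: &lt;code&gt;Object.assign&lt;/code&gt; merges shallowly, so keys in the JSON config override the defaults wholesale. The sketch below reproduces that merge in plain JavaScript with made-up values. &lt;code&gt;buildOptions&lt;/code&gt; is a hypothetical helper, and the inline JSON string stands in for &lt;code&gt;JSON.parse(open(CONFIG_FILE))&lt;/code&gt;.&lt;/p&gt;

```javascript
// Mirrors the Object.assign call in main.js with made-up values.
// buildOptions is a hypothetical helper, not part of k6.
function buildOptions(defaults, testConfig) {
  // Later sources win, so anything in the JSON config overrides a default.
  return Object.assign({}, defaults, testConfig);
}

const defaults = {
  insecureSkipTLSVerify: false,
  thresholds: { checks: ['rate>0.99'], http_reqs: ['count>0'] },
};

// Stand-in for JSON.parse(open(CONFIG_FILE)).
const smokeConfig = JSON.parse(
  '{"thresholds": {"checks": ["rate>0.90"]}, "scenarios": {"smoke": {"vus": 1}}}'
);

const options = buildOptions(defaults, smokeConfig);

console.log(options.insecureSkipTLSVerify);  // false (default kept)
console.log(options.thresholds.checks[0]);   // rate>0.90 (config wins)
console.log(Object.keys(options.thresholds)); // [ 'checks' ] (replaced, not merged)
```

&lt;p&gt;Because the merge is shallow, a config file that defines &lt;code&gt;thresholds&lt;/code&gt; replaces any default thresholds entirely. If you ever add default thresholds, they would need a deeper merge to survive.&lt;/p&gt;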



&lt;h2&gt;
  
  
  Smoke test first
&lt;/h2&gt;

&lt;p&gt;The smoke config is 1 VU, 5 iterations of &lt;code&gt;shopperFlow&lt;/code&gt; — homepage → product → add to cart → checkout. Its only job is to confirm the app is up and critical paths respond correctly. If smoke fails, nothing else runs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;src/config/smoke.config.json&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scenarios"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"smoke"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"executor"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"per-vu-iterations"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"vus"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"iterations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"exec"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"shopperFlow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"gracefulStop"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"30s"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"thresholds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"http_req_failed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"rate&amp;lt;0.05"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"http_req_duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"p(95)&amp;lt;2000"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"checks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"rate&amp;gt;0.90"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;

    &lt;/span&gt;&lt;span class="nl"&gt;"group_duration{group:::homepage}"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"avg&amp;lt;500"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"group_duration{group:::view product}"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"avg&amp;lt;500"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"group_duration{group:::add to cart}"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"avg&amp;lt;1000"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"group_duration{group:::checkout}"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"avg&amp;lt;5000"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;group_duration&lt;/code&gt; thresholds are worth explaining. &lt;code&gt;http_req_duration&lt;/code&gt; tells you how fast individual requests are. &lt;code&gt;group_duration&lt;/code&gt; tells you how long an entire named step takes — a group might contain a single request or several. Setting an SLA on &lt;code&gt;group_duration{group:::checkout}&lt;/code&gt; is much closer to a real business SLO than a raw request threshold, because checkout involves multiple sequential calls.&lt;/p&gt;

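&lt;p&gt;The syntax looks unusual: &lt;code&gt;group:::checkout&lt;/code&gt; has three colons. The first separates the tag name (&lt;code&gt;group&lt;/code&gt;) from its value. The other two belong to the value itself, because k6 stores group names with a leading &lt;code&gt;::&lt;/code&gt; and chains nested groups as &lt;code&gt;::outer::inner&lt;/code&gt;. Every group you define in code gets a corresponding series in the built-in &lt;code&gt;group_duration&lt;/code&gt; metric for free.&lt;br&gt;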
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;k6 run dist/test.main.js &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;CONFIG_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;../src/config/smoke.config.json
&lt;span class="c"&gt;# or: npm run smoke&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
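
&lt;p&gt;The three-colon selector is easier to read once it is split into its parts. The following sketch is plain JavaScript and is not k6's own parser. &lt;code&gt;parseThresholdKey&lt;/code&gt; and &lt;code&gt;groupTagValue&lt;/code&gt; are hypothetical helpers, and the nested &lt;code&gt;payment&lt;/code&gt; group is invented for the example.&lt;/p&gt;

```javascript
// Illustrative parser for a threshold key; this is not k6's own code.
function parseThresholdKey(key) {
  const match = key.match(/^([^{]+)\{([^:]+):(.*)\}$/);
  if (!match) return { metric: key, tag: null, value: null };
  return { metric: match[1], tag: match[2], value: match[3] };
}

// k6 prefixes each group name with "::", so nested groups chain together.
function groupTagValue(path) {
  return path.map((name) => `::${name}`).join('');
}

const parsed = parseThresholdKey('group_duration{group:::checkout}');
console.log(parsed.metric); // group_duration
console.log(parsed.tag);    // group
console.log(parsed.value);  // ::checkout

console.log(groupTagValue(['checkout']));            // ::checkout
console.log(groupTagValue(['checkout', 'payment'])); // ::checkout::payment
```

&lt;p&gt;Read this way, a threshold on a nested group would be written as &lt;code&gt;group_duration{group:::checkout::payment}&lt;/code&gt;.&lt;/p&gt;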



&lt;h2&gt;
  
  
  What the first run caught
&lt;/h2&gt;

&lt;p&gt;First smoke run: 10% error rate, two thresholds crossed. Response times were excellent — p95 of 87ms — so this wasn't a performance problem. Something was functionally wrong.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Debugging step 1&lt;/strong&gt; — verify the text the check was looking for:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; http://10.4.20.2/product/0PUK6V6EV0 | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s2"&gt;"add to cart"&lt;/span&gt;
&lt;span class="c"&gt;# &amp;lt;button type="submit" class="cymbal-button-primary"&amp;gt;Add To Cart&amp;lt;/button&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Text matched exactly. So the check wasn't wrong — some requests were returning non-200 responses before the check even ran.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Debugging step 2&lt;/strong&gt; — check what the cart POST actually returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://10.4.20.2/cart &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"product_id=0PUK6V6EV0&amp;amp;quantity=1"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/x-www-form-urlencoded"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;&amp;lt; &lt;/span&gt;&lt;span class="k"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;1.1&lt;/span&gt; &lt;span class="m"&gt;302&lt;/span&gt; &lt;span class="ne"&gt;Found&lt;/span&gt;
&lt;span class="s"&gt;&amp;lt; Location: /cart&lt;/span&gt;
&lt;span class="s"&gt;&amp;lt; Set-Cookie: shop_session-id=51779754-8ac6-4ac9-bbd9-1f062a8dc1b4&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The cart POST returns a 302 and sets a session cookie. With only a handful of iterations, a few cold-start failures before sessions were established were enough to dominate the results. The fix: bump the iteration count, add the &lt;code&gt;setup()&lt;/code&gt; warmup in &lt;code&gt;main.js&lt;/code&gt;, and slightly relax the thresholds. Smoke should catch catastrophic failure, not enforce strict SLOs.&lt;/p&gt;

&lt;p&gt;This is the value of testing against a real app rather than a mock — you discover actual system behaviour.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two bugs found during the load test
&lt;/h2&gt;

&lt;p&gt;Running the full suite surfaced two more issues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bug 1: Checkout success was 0%.&lt;/strong&gt; All 79 checkout attempts completed and returned 200, but none matched the expected text. One curl command revealed why:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;checkout flow with cookies] | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s2"&gt;"order&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;confirm&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;thank"&lt;/span&gt;
&lt;span class="c"&gt;# Your order is complete!&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The check in &lt;code&gt;src/lib/checks.js&lt;/code&gt; assumed &lt;code&gt;Your order is placed&lt;/code&gt;. Fixed in one place, picked up everywhere:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;checkCheckout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;check&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;order placed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Your order is complete!&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;response &amp;lt; 3s&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;timings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;duration&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Bug 2: Browser "page title present" failed all 41 iterations.&lt;/strong&gt; In k6's browser API, &lt;code&gt;page.title()&lt;/code&gt; returns a Promise, so &lt;code&gt;page.title().length&lt;/code&gt; is &lt;code&gt;undefined&lt;/code&gt; and the comparison is always false. The value has to be awaited first. The fix sits in &lt;code&gt;src/scenarios/browserFlow.js&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// broken&lt;/span&gt;
&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;page title present&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

&lt;span class="c1"&gt;// fixed&lt;/span&gt;
&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;page title present&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both fixes are a good reminder that checks are only as good as the assumptions baked into them. The test framework did its job — it surfaced the mismatches immediately.&lt;/p&gt;
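
&lt;p&gt;The Promise pitfall behind Bug 2 can be reproduced without a browser. This is a minimal sketch in plain JavaScript: &lt;code&gt;fakePage&lt;/code&gt; is a hypothetical stand-in for the k6 browser page object, and the title string is made up.&lt;/p&gt;

```javascript
// fakePage is a hypothetical stand-in for the k6 browser page object.
const fakePage = {
  title: async () => 'Online Boutique',
};

// Broken: a Promise has no length, so this compares undefined with 0.
const brokenPredicate = () => fakePage.title().length > 0;

// Async predicate: calling it returns a Promise, not a boolean.
const asyncPredicate = async () => (await fakePage.title()).length > 0;

// Resolve the value first, then work with a plain boolean.
async function titlePresent(page) {
  const title = await page.title();
  return title.length > 0;
}

console.log(brokenPredicate());                   // false
console.log(asyncPredicate() instanceof Promise); // true
titlePresent(fakePage).then((ok) => console.log(ok)); // true
```

&lt;p&gt;Note the middle case: an async predicate hands back a Promise, which is truthy on its own. An assertion helper that does not await its predicates would therefore pass no matter what the title is, so it is worth confirming that the &lt;code&gt;check()&lt;/code&gt; you import awaits async functions, or resolving the value before the check.&lt;/p&gt;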

&lt;h2&gt;
  
  
  User journeys: three concurrent scenarios
&lt;/h2&gt;

&lt;p&gt;With smoke passing, it was time for the load test. Rather than hitting one endpoint in a loop, three distinct user types run simultaneously as k6 scenarios. All three are defined in &lt;code&gt;load.config.json&lt;/code&gt;; the scenario functions live in &lt;code&gt;src/scenarios/&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;src/config/load.config.json&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;(scenarios&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;section)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scenarios"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"browsers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"executor"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ramping-vus"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nl"&gt;"exec"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"browseFlow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="nl"&gt;"stages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"1m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"3m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"1m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span 
class="nl"&gt;"tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"journey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"browser"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"shoppers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"executor"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ramping-vus"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nl"&gt;"exec"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"shopperFlow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nl"&gt;"stages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"1m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"3m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"1m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"target"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span 
class="nl"&gt;"tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"journey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"shopper"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"currencyUsers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"executor"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"constant-arrival-rate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"exec"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"currencyFlow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"rate"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"timeUnit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"5m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"preAllocatedVUs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"maxVUs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span 
class="nl"&gt;"journey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"currency"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Browsers&lt;/strong&gt; — casual visitors, read-only, up to 20 VUs. The scenario chains &lt;code&gt;visitHome()&lt;/code&gt; and multiple &lt;code&gt;browseProduct()&lt;/code&gt; calls from the scripts layer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/scenarios/browseFlow.js&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;browseFlow&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;pagesViewed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="nf"&gt;visitHome&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nx"&gt;pagesViewed&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;randSleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;numProducts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;numProducts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;browseProduct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;randomProduct&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
    &lt;span class="nx"&gt;pagesViewed&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;randSleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;browseDepth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pagesViewed&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Shoppers&lt;/strong&gt; — full checkout flow, up to 5 VUs. The checkout script returns &lt;code&gt;{ ok, duration }&lt;/code&gt; so the scenario can record custom metrics without needing access to the raw response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/scenarios/shopperFlow.js&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;shopperFlow&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;visitHome&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;randSleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

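  &lt;span class="c1"&gt;// Assumption: productId is not defined in this excerpt; randomProduct() is the helper used in browseFlow.js&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;productId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;randomProduct&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nf"&gt;viewProduct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;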
  &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;randSleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cartOk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;addItemToCart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;cartOk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;cartErrors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;viewCart&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;randSleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;duration&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;doCheckout&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nx"&gt;checkoutDuration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;checkoutSuccess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



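&lt;p&gt;&lt;strong&gt;Currency switchers&lt;/strong&gt; exercise the currency microservice at a constant arrival rate of 2 iterations per second. &lt;code&gt;constant-arrival-rate&lt;/code&gt; controls throughput rather than concurrency: k6 starts 2 iterations every second regardless of how long each one takes, drawing on up to &lt;code&gt;maxVUs&lt;/code&gt; (10 here) to keep up. That is closer to how production traffic behaves, because real users keep arriving whether or not earlier requests have finished.&lt;/p&gt;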
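
&lt;p&gt;A rough way to sanity-check &lt;code&gt;preAllocatedVUs&lt;/code&gt; and &lt;code&gt;maxVUs&lt;/code&gt; for an arrival-rate scenario is Little's law: concurrent iterations are roughly the arrival rate times the average iteration duration. The sketch below is plain JavaScript, not a k6 script; &lt;code&gt;estimateVUs&lt;/code&gt; is a hypothetical helper and the iteration durations are made-up values, not measurements from this test.&lt;/p&gt;

```javascript
// Rough VU estimate for an arrival-rate executor (Little's law):
// concurrent iterations = arrival rate x average iteration duration.
function estimateVUs(ratePerSecond, avgIterationSeconds) {
  return Math.ceil(ratePerSecond * avgIterationSeconds);
}

// Values taken from the currencyUsers scenario config.
const rate = 2;    // iterations started per second
const maxVUs = 10; // upper bound on VUs k6 may allocate

// Hypothetical iteration durations in seconds (assumptions, not measured).
for (const seconds of [0.5, 2.5, 6]) {
  const needed = estimateVUs(rate, seconds);
  const verdict = needed > maxVUs
    ? 'exceeds maxVUs, so k6 would drop iterations'
    : 'fits within maxVUs';
  console.log(`${seconds}s per iteration: about ${needed} VUs (${verdict})`);
}
```

&lt;p&gt;With these assumed durations, only the 6-second case needs more than the 10 VUs the scenario allows. In a real run that shortfall would likely show up in k6's &lt;code&gt;dropped_iterations&lt;/code&gt; metric.&lt;/p&gt;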

&lt;h3&gt;
  
  
  Per-journey request thresholds
&lt;/h3&gt;

&lt;p&gt;Because each scenario tag is set in the JSON config (&lt;code&gt;"tags": {"journey":"browser"}&lt;/code&gt;), you can threshold each journey's request duration independently:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"http_req_duration{journey:browser}"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"p(95)&amp;lt;2000"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"http_req_duration{journey:shopper}"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"p(95)&amp;lt;4000"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"http_req_duration{journey:currency}"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"p(95)&amp;lt;2000"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
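
&lt;p&gt;To make the per-journey idea concrete, here is a minimal sketch of what a tag-scoped p95 threshold amounts to: group the samples by tag, take the 95th percentile of each group, and compare it with that group's limit. It uses a nearest-rank percentile as a simplified stand-in, not k6's own calculation, and the sample durations are hypothetical.&lt;/p&gt;

```javascript
// Nearest-rank percentile: a simplified stand-in, not k6's exact calculation.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// True when the p95 of the samples is strictly below the limit.
function passesP95(samples, limitMs) {
  return limitMs > percentile(samples, 95);
}

// Hypothetical request durations in milliseconds, grouped by journey tag.
const durations = {
  browser: [120, 180, 240, 310, 1900],
  shopper: [400, 650, 900, 1200, 4200],
};

console.log('browser passes:', passesP95(durations.browser, 2000)); // true
console.log('shopper passes:', passesP95(durations.shopper, 4000)); // false
```

&lt;p&gt;One slow outlier in a small sample is enough to fail the shopper limit here, which is why each journey gets its own limit rather than sharing one global number.&lt;/p&gt;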



&lt;h2&gt;
  
  
  Custom metrics as business SLOs
&lt;/h2&gt;

&lt;p&gt;Custom metrics are defined in the scenario files where they're used. &lt;code&gt;shopperFlow.js&lt;/code&gt; owns the checkout metrics; &lt;code&gt;browseFlow.js&lt;/code&gt; owns browse depth:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/scenarios/shopperFlow.js&lt;/span&gt;
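&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Trend&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Rate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Counter&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;k6/metrics&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;checkoutDuration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Trend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boutique_checkout_duration&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;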
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;checkoutSuccess&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Rate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boutique_checkout_success&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cartErrors&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;boutique_cart_errors&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The thresholds in &lt;code&gt;load.config.json&lt;/code&gt; encode real business requirements:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"boutique_checkout_duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"p(95)&amp;lt;5000"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"boutique_checkout_success"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"rate&amp;gt;0.80"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the shift from infrastructure SLOs to business SLOs — codified, version-controlled, enforced automatically in CI.&lt;/p&gt;
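
&lt;p&gt;As a rough illustration of how these metrics aggregate, the sketch below uses simplified stand-ins for k6's &lt;code&gt;Rate&lt;/code&gt; and &lt;code&gt;Counter&lt;/code&gt; types (&lt;code&gt;RateSketch&lt;/code&gt; and &lt;code&gt;CounterSketch&lt;/code&gt; are hypothetical classes, not the real ones) and replays a made-up run of ten shopper iterations. It mirrors the early return in &lt;code&gt;shopperFlow()&lt;/code&gt;: an iteration that fails at the cart step never adds a checkout sample.&lt;/p&gt;

```javascript
// Simplified stand-ins for k6's Rate and Counter metric types (not the real classes).
class RateSketch {
  constructor() { this.passes = 0; this.total = 0; }
  add(ok) { this.total += 1; if (ok) this.passes += 1; }
  get rate() { return this.total === 0 ? 0 : this.passes / this.total; }
}

class CounterSketch {
  constructor() { this.value = 0; }
  add(n) { this.value += n; }
}

const checkoutSuccess = new RateSketch();
const cartErrors = new CounterSketch();

// Hypothetical run: ten shopper iterations, one cart failure, one failed checkout.
const outcomes = ['ok', 'ok', 'cart-error', 'ok', 'ok', 'checkout-failed', 'ok', 'ok', 'ok', 'ok'];
for (const outcome of outcomes) {
  if (outcome === 'cart-error') {
    cartErrors.add(1); // early return in shopperFlow: no checkout sample recorded
    continue;
  }
  checkoutSuccess.add(outcome === 'ok');
}

console.log(checkoutSuccess.rate.toFixed(3)); // 0.889 (8 of 9 checkout attempts)
console.log(checkoutSuccess.rate > 0.80);     // true: the rate threshold would pass
console.log(cartErrors.value);                // 1
```

&lt;p&gt;Cart failures do not lower the checkout success rate in this model, so the two metrics are worth reading together when judging the business SLO.&lt;/p&gt;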

&lt;h2&gt;
  
  
  Results across all four test types
&lt;/h2&gt;

&lt;p&gt;After fixing both bugs and re-running the full suite:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fse4n9u91ar1b5mavi815.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fse4n9u91ar1b5mavi815.png" alt="results" width="630" height="871"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The results tell a clear story.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Response times are strong under normal load.&lt;/strong&gt; Smoke p95 at 89ms and load p95 at 273ms show the app handles realistic traffic comfortably on homelab hardware.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Checkout: 0% → 100% after the fix.&lt;/strong&gt; All 80 checkout attempts placed orders successfully, with a p95 of 224ms against a 5,000ms threshold. The bug was entirely in the check assertion, not the app.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Browser Web Vitals are healthy.&lt;/strong&gt; LCP at 335ms and FCP at 255ms are well inside the commonly cited "good" thresholds (2.5s for LCP, 1.8s for FCP). TTFB at 36ms is excellent. CLS at 0.117 just nudges over the 0.10 target, which is worth monitoring but not alarming. Note: browser tests deliberately have no &lt;code&gt;group()&lt;/code&gt; calls, because there's a long-standing k6 issue with groups in the browser context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Product page buckled first under stress.&lt;/strong&gt; Product page failures started accumulating around 100 VUs while the homepage held all the way through the 150 VU peak — 9,907 successful checks, zero 500 errors. The product page finished with 2,037 failures total. This makes architectural sense: the product page fans out to the product catalog, recommendation, and currency services simultaneously. Under load, those downstream calls start queuing. The homepage is a simpler call graph and degrades later.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Browse depth averaged 4.0 pages per session&lt;/strong&gt; — the random product browsing in the browse journey is working as intended, generating realistic read patterns.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;Post 3 covers the stress test in depth: reading degradation signals, understanding the product page failure pattern architecturally, and the k6 Browser module for Web Vitals measurement. It also covers all four custom metric types and how to use them as CI-enforceable SLOs in Grafana Cloud.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;#k6 #Grafana #LoadTesting #Kubernetes #Observability #SRE #PerformanceTesting&lt;/em&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>performance</category>
      <category>testing</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>k6: The Tool, The Philosophy, and Your First Test</title>
      <dc:creator>Matthew Wimpelberg</dc:creator>
      <pubDate>Tue, 26 May 2026 09:37:09 +0000</pubDate>
      <link>https://dev.to/matthew_wimpelberg_79193b/k6-the-tool-the-philosophy-and-your-first-test-2ccp</link>
      <guid>https://dev.to/matthew_wimpelberg_79193b/k6-the-tool-the-philosophy-and-your-first-test-2ccp</guid>
      <description>&lt;p&gt;I've been going deep on k6, Grafana's open-source load and performance testing tool. This is the first in a four-part series documenting that journey, from first principles to a full test suite running against a live Kubernetes environment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why k6?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most load testing tools treat tests as configuration. k6 treats them as code: JavaScript that is version-controlled, modular, and reviewable like any other engineering artifact. That's a meaningful philosophical difference. It means your performance tests can live in the same repo as your application, go through the same review process, and be maintained by the same team.&lt;/p&gt;

&lt;p&gt;For those of us already in the Grafana ecosystem, there's another compelling reason: k6 is a Grafana Labs product. Test results stream natively into Grafana Cloud. Custom metrics you define in your scripts become queryable Prometheus time series. Your load test data lives alongside your infrastructure metrics, traces, and logs in one place with one query language and one alerting system.&lt;/p&gt;

&lt;p&gt;That unified observability story is what made me want to understand k6 deeply, not just at the surface level.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What k6 covers&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most people think of k6 as a load testing tool. It's actually much broader:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Smoke testing: Is the app up and returning the right things?&lt;/li&gt;
&lt;li&gt;Load testing: How does it behave under realistic traffic?&lt;/li&gt;
&lt;li&gt;Stress testing: Where does it break?&lt;/li&gt;
&lt;li&gt;Soak testing: Does it degrade over hours?&lt;/li&gt;
&lt;li&gt;Browser testing: Real Chromium, Web Vitals, frontend performance&lt;/li&gt;
&lt;li&gt;Synthetic monitoring: Scheduled availability checks from global probe locations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One tool, one scripting language, the full testing lifecycle from development through production monitoring.  I'm going to begin by running a test script locally on my laptop to illustrate the most basic use case.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your first k6 script&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Once you have k6 installed, a minimal test looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;http&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;k6/http&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;check&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;group&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;k6&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;vus&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;30s&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;thresholds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;http_req_failed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;                    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;rate&amp;lt;0.01&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;http_req_duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;                  &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;p(95)&amp;lt;500&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http_req_duration{group:::Homepage}&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;p(95)&amp;lt;400&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;// group-scoped threshold&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Homepage&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://quickpizza.grafana.com/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nf"&gt;check&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;status 200&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;response &amp;lt; 500ms&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;timings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;duration&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three things are happening here:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;options&lt;/strong&gt; tells k6 how to run: 10 virtual users for 30 seconds, and three thresholds that define pass/fail. Less than 1% of requests can fail, the 95th percentile response time must stay under 500ms overall, and it must stay under 400ms for requests in the Homepage group. If any threshold is violated, k6 exits with a non-zero code. Your CI pipeline fails. That's your SLO enforced automatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;checks&lt;/strong&gt; are per-request assertions. A failing check doesn't stop the test; it increments a failure counter. At the end you see pass rates across all iterations, not just a binary pass/fail.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;sleep&lt;/strong&gt; is think time between requests. Without it, k6 hammers the server as fast as possible with unrealistic load that produces misleading results. Real users read pages. sleep(1) models that.&lt;/p&gt;
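
&lt;p&gt;To see how much think time changes the load, here is a back-of-the-envelope sketch in plain JavaScript (not a k6 script). In this closed model each VU loops through a request and then a sleep, so throughput is roughly the VU count divided by the time one loop takes. &lt;code&gt;iterationsPerSecond&lt;/code&gt; is a hypothetical helper, and the 45 ms response time is an assumed figure.&lt;/p&gt;

```javascript
// Closed model: each VU loops request -> sleep, so throughput depends on loop time.
function iterationsPerSecond(vus, responseSeconds, thinkSeconds) {
  return vus / (responseSeconds + thinkSeconds);
}

const vus = 10;                // from options.vus in the script
const thinkSeconds = 1;        // from sleep(1) in the script
const responseSeconds = 0.045; // assumed average response time (45 ms)

// With think time: roughly what ten reading users would generate.
console.log(iterationsPerSecond(vus, responseSeconds, thinkSeconds).toFixed(1)); // 9.6

// Without sleep: the same ten VUs send requests back to back.
console.log(iterationsPerSecond(vus, responseSeconds, 0).toFixed(1)); // 222.2
```

&lt;p&gt;Under these assumptions, removing the sleep turns about 10 requests per second into more than 200, which is a very different test from the one the VU count suggests.&lt;/p&gt;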

&lt;p&gt;Run it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;k6 run script.js
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The terminal output gives you request count, duration statistics (avg, min, med, max, p(90), and p(95) by default), error rate, data sent/received, and threshold results. A clean first run looks roughly like this (abridged):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;✓ status 200
✓ response &amp;lt; 500ms

http_req_duration: avg=45ms p(95)=112ms
http_req_failed:   0.00%
✓ thresholds passed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What's next&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;My next post will cover building a real test suite with a shared library architecture, smoke testing against a live microservices app running on a homelab Kubernetes cluster, and what happens when your first run doesn't go as expected.&lt;/p&gt;

&lt;p&gt;The target app is Google's Online Boutique.  It's a realistic e-commerce microservices demo with 11 services.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;#k6 #Grafana #LoadTesting #PerformanceTesting #Observability #SRE #Kubernetes&lt;/em&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>performance</category>
      <category>testing</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
