<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: angufibo lincoln</title>
    <description>The latest articles on DEV Community by angufibo lincoln (@angufibo_lincoln_13822ecd).</description>
    <link>https://dev.to/angufibo_lincoln_13822ecd</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3812160%2Fdafa7c90-71d6-44e2-91af-d966f519cbad.jpg</url>
      <title>DEV Community: angufibo lincoln</title>
      <link>https://dev.to/angufibo_lincoln_13822ecd</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/angufibo_lincoln_13822ecd"/>
    <language>en</language>
    <item>
      <title>Introducing FlameIQ — Deterministic Performance Regression Detection for Python</title>
      <dc:creator>angufibo lincoln</dc:creator>
      <pubDate>Sat, 07 Mar 2026 22:06:07 +0000</pubDate>
      <link>https://dev.to/angufibo_lincoln_13822ecd/introducing-flameiq-deterministic-performance-regression-detection-for-python-2n3o</link>
      <guid>https://dev.to/angufibo_lincoln_13822ecd/introducing-flameiq-deterministic-performance-regression-detection-for-python-2n3o</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Performance regressions are invisible in code review.&lt;/p&gt;

&lt;p&gt;A careless refactor that recompiles a regex on every function call. A new dependency that adds 40ms to your p95 latency. A database query that wasn't indexed. None of these show up in a diff. They accumulate silently across hundreds of commits — a 3ms latency increase here, a 2% throughput drop there — until they become expensive production incidents.&lt;/p&gt;

&lt;p&gt;Type checkers enforce correctness automatically. Linters enforce style automatically. &lt;strong&gt;Nothing enforces performance — until now.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Introducing FlameIQ
&lt;/h2&gt;

&lt;p&gt;Today we are releasing &lt;strong&gt;FlameIQ v1.0.0&lt;/strong&gt; — an open-source, deterministic, CI-native performance regression engine for Python.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;flameiq-core
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;FlameIQ compares your current benchmark results against a stored baseline and &lt;strong&gt;fails your CI pipeline&lt;/strong&gt; if any metric exceeds its configured threshold — the same way a type checker fails your build on a type error.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick Start
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Step 1 — Initialise&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;my-project
flameiq init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 2 — Run your benchmarks and produce a metrics file&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"schema_version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"metadata"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"commit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"abc123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"branch"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"main"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"environment"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ci"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"metrics"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"latency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"mean"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;120.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"p95"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;180.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"p99"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;240.0&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"throughput"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;950.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"memory_mb"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;512.0&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 3 — Set a baseline&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;flameiq baseline &lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;--metrics&lt;/span&gt; benchmark.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 4 — Compare on every PR&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;flameiq compare &lt;span class="nt"&gt;--metrics&lt;/span&gt; current.json &lt;span class="nt"&gt;--fail-on-regression&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  Metric           Baseline    Current      Change   Threshold  Status
  ────────────────────────────────────────────────────────────────────
  latency.p95       2.45 ms     4.51 ms     +84.08%    ±10.0%  REGRESSION
  throughput        412.30      231.50      -43.84%    ±10.0%  REGRESSION

  ✗ REGRESSION — 2 metric(s) exceeded threshold.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Exit code &lt;code&gt;1&lt;/code&gt;. Pipeline fails. Regression caught before merge.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Real Example: Catching a Regex Regression
&lt;/h2&gt;

&lt;p&gt;Here is the kind of bug FlameIQ is designed to catch. A developer refactors a text processing function and accidentally recompiles the regex on every call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# FAST — original implementation
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[^\w\s]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# Python caches compiled regex
&lt;/span&gt;    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\s+&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# SLOW — regressed implementation
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;punct_re&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[^\w\s]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;     &lt;span class="c1"&gt;# recompiled on every call!
&lt;/span&gt;    &lt;span class="n"&gt;space_re&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\s+&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="c1"&gt;# recompiled on every call!
&lt;/span&gt;    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;punct_re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;space_re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is invisible in code review. The logic is identical. The diff looks clean.&lt;br&gt;
FlameIQ catches it with an 84% p95 latency increase — well above the 10% threshold.&lt;/p&gt;


&lt;h2&gt;
  
  
  GitHub Actions Integration
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install FlameIQ&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pip install flameiq-core&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Restore baseline cache&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/cache@v4&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.flameiq/&lt;/span&gt;
    &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;flameiq-${{ github.base_ref }}&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run benchmarks&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python run_benchmarks.py &amp;gt; metrics.json&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Check for regressions&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;flameiq compare --metrics metrics.json --fail-on-regression&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Key Design Decisions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Deterministic by design&lt;/strong&gt;&lt;br&gt;
Given identical inputs, FlameIQ always produces identical outputs. No randomness, no network calls, no &lt;code&gt;datetime.now()&lt;/code&gt;. Safe for any CI environment including air-gapped infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No vendor dependency&lt;/strong&gt;&lt;br&gt;
Baselines are local JSON files. No SaaS account. No API keys. No telemetry. Your performance data stays on your infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Direction-aware thresholds&lt;/strong&gt;&lt;br&gt;
FlameIQ knows that latency increases are regressions and throughput decreases are regressions. Thresholds are sign-aware per metric type — no manual configuration required for known metrics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Statistical mode&lt;/strong&gt;&lt;br&gt;
For noisy benchmark environments, FlameIQ can apply the &lt;strong&gt;Mann-Whitney U test&lt;/strong&gt; alongside threshold comparison. A regression is only declared if both the threshold is exceeded &lt;em&gt;and&lt;/em&gt; the result is statistically significant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Versioned schema&lt;/strong&gt;&lt;br&gt;
The metrics schema is versioned (currently v1) with a formal specification. The threshold algorithm and statistical methodology are both fully documented in &lt;code&gt;/specs&lt;/code&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  HTML Reports
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;flameiq report &lt;span class="nt"&gt;--metrics&lt;/span&gt; current.json &lt;span class="nt"&gt;--output&lt;/span&gt; report.html
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Generates a self-contained HTML report with a full metric diff table, regression highlights, and trend analysis. No external assets — works offline.&lt;/p&gt;


&lt;h2&gt;
  
  
  Configuration
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;flameiq.yaml&lt;/code&gt; (created by &lt;code&gt;flameiq init&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;thresholds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;latency.p95&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;   &lt;span class="s"&gt;10%&lt;/span&gt;    &lt;span class="c1"&gt;# Allow up to 10% latency increase&lt;/span&gt;
  &lt;span class="na"&gt;latency.p99&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;   &lt;span class="s"&gt;15%&lt;/span&gt;
  &lt;span class="na"&gt;throughput&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;    &lt;span class="s"&gt;-5%&lt;/span&gt;    &lt;span class="c1"&gt;# Allow up to 5% throughput decrease&lt;/span&gt;
  &lt;span class="na"&gt;memory_mb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;      &lt;span class="s"&gt;8%&lt;/span&gt;

&lt;span class="na"&gt;baseline&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rolling_median&lt;/span&gt;
  &lt;span class="na"&gt;rolling_window&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;

&lt;span class="na"&gt;statistics&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.95&lt;/span&gt;

&lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Try the Demo
&lt;/h2&gt;

&lt;p&gt;We built a demo project — &lt;strong&gt;flameiq-demo&lt;/strong&gt; — that walks through the full regression detection workflow using a real Python library:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/flameiq/demo-flameiq" rel="noopener noreferrer"&gt;https://github.com/flameiq/demo-flameiq&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PyPI:&lt;/strong&gt; &lt;a href="https://pypi.org/project/flameiq-core/" rel="noopener noreferrer"&gt;https://pypi.org/project/flameiq-core/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation:&lt;/strong&gt; &lt;a href="https://flameiq-core.readthedocs.io" rel="noopener noreferrer"&gt;https://flameiq-core.readthedocs.io&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://github.com/flameiq/flameiq-core" rel="noopener noreferrer"&gt;https://github.com/flameiq/flameiq-core&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Demo project:&lt;/strong&gt; &lt;a href="https://github.com/flameiq/demo-flameiq" rel="noopener noreferrer"&gt;https://github.com/flameiq/demo-flameiq&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Feedback, issues, and contributions are welcome. If you have caught a regression with FlameIQ or have a use case we haven't considered, open an issue or start a discussion on GitHub.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; &lt;code&gt;python&lt;/code&gt; &lt;code&gt;opensource&lt;/code&gt; &lt;code&gt;devtools&lt;/code&gt; &lt;code&gt;ci&lt;/code&gt; &lt;code&gt;performance&lt;/code&gt;&lt;/p&gt;

</description>
      <category>cicd</category>
      <category>opensource</category>
      <category>performance</category>
      <category>python</category>
    </item>
    <item>
      <title>Introducing FlameIQ — Deterministic Performance Regression Detection for Python</title>
      <dc:creator>angufibo lincoln</dc:creator>
      <pubDate>Sat, 07 Mar 2026 22:06:07 +0000</pubDate>
      <link>https://dev.to/angufibo_lincoln_13822ecd/introducing-flameiq-deterministic-performance-regression-detection-for-python-5coa</link>
      <guid>https://dev.to/angufibo_lincoln_13822ecd/introducing-flameiq-deterministic-performance-regression-detection-for-python-5coa</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Performance regressions are invisible in code review.&lt;/p&gt;

&lt;p&gt;A careless refactor that recompiles a regex on every function call. A new dependency that adds 40ms to your p95 latency. A database query that wasn't indexed. None of these show up in a diff. They accumulate silently across hundreds of commits — a 3ms latency increase here, a 2% throughput drop there — until they become expensive production incidents.&lt;/p&gt;

&lt;p&gt;Type checkers enforce correctness automatically. Linters enforce style automatically. &lt;strong&gt;Nothing enforces performance — until now.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Introducing FlameIQ
&lt;/h2&gt;

&lt;p&gt;Today we are releasing &lt;strong&gt;FlameIQ v1.0.0&lt;/strong&gt; — an open-source, deterministic, CI-native performance regression engine for Python.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;flameiq-core
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;FlameIQ compares your current benchmark results against a stored baseline and &lt;strong&gt;fails your CI pipeline&lt;/strong&gt; if any metric exceeds its configured threshold — the same way a type checker fails your build on a type error.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick Start
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Step 1 — Initialise&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;my-project
flameiq init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 2 — Run your benchmarks and produce a metrics file&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"schema_version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"metadata"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"commit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"abc123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"branch"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"main"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"environment"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ci"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"metrics"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"latency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"mean"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;120.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"p95"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;180.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"p99"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;240.0&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"throughput"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;950.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"memory_mb"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;512.0&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 3 — Set a baseline&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;flameiq baseline &lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;--metrics&lt;/span&gt; benchmark.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 4 — Compare on every PR&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;flameiq compare &lt;span class="nt"&gt;--metrics&lt;/span&gt; current.json &lt;span class="nt"&gt;--fail-on-regression&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  Metric           Baseline    Current      Change   Threshold  Status
  ────────────────────────────────────────────────────────────────────
  latency.p95       2.45 ms     4.51 ms     +84.08%    ±10.0%  REGRESSION
  throughput        412.30      231.50      -43.84%    ±10.0%  REGRESSION

  ✗ REGRESSION — 2 metric(s) exceeded threshold.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Exit code &lt;code&gt;1&lt;/code&gt;. Pipeline fails. Regression caught before merge.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Real Example: Catching a Regex Regression
&lt;/h2&gt;

&lt;p&gt;Here is the kind of bug FlameIQ is designed to catch. A developer refactors a text processing function and accidentally recompiles the regex on every call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# FAST — original implementation
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[^\w\s]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# Python caches compiled regex
&lt;/span&gt;    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\s+&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# SLOW — regressed implementation
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;punct_re&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[^\w\s]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;     &lt;span class="c1"&gt;# recompiled on every call!
&lt;/span&gt;    &lt;span class="n"&gt;space_re&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\s+&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="c1"&gt;# recompiled on every call!
&lt;/span&gt;    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;punct_re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;space_re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is invisible in code review. The logic is identical. The diff looks clean.&lt;br&gt;
FlameIQ catches it with an 84% p95 latency increase — well above the 10% threshold.&lt;/p&gt;


&lt;h2&gt;
  
  
  GitHub Actions Integration
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install FlameIQ&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pip install flameiq-core&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Restore baseline cache&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/cache@v4&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.flameiq/&lt;/span&gt;
    &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;flameiq-${{ github.base_ref }}&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run benchmarks&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python run_benchmarks.py &amp;gt; metrics.json&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Check for regressions&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;flameiq compare --metrics metrics.json --fail-on-regression&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Key Design Decisions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Deterministic by design&lt;/strong&gt;&lt;br&gt;
Given identical inputs, FlameIQ always produces identical outputs. No randomness, no network calls, no &lt;code&gt;datetime.now()&lt;/code&gt;. Safe for any CI environment including air-gapped infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No vendor dependency&lt;/strong&gt;&lt;br&gt;
Baselines are local JSON files. No SaaS account. No API keys. No telemetry. Your performance data stays on your infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Direction-aware thresholds&lt;/strong&gt;&lt;br&gt;
FlameIQ knows that latency increases are regressions and throughput decreases are regressions. Thresholds are sign-aware per metric type — no manual configuration required for known metrics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Statistical mode&lt;/strong&gt;&lt;br&gt;
For noisy benchmark environments, FlameIQ can apply the &lt;strong&gt;Mann-Whitney U test&lt;/strong&gt; alongside threshold comparison. A regression is only declared if both the threshold is exceeded &lt;em&gt;and&lt;/em&gt; the result is statistically significant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Versioned schema&lt;/strong&gt;&lt;br&gt;
The metrics schema is versioned (currently v1) with a formal specification. The threshold algorithm and statistical methodology are both fully documented in &lt;code&gt;/specs&lt;/code&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  HTML Reports
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;flameiq report &lt;span class="nt"&gt;--metrics&lt;/span&gt; current.json &lt;span class="nt"&gt;--output&lt;/span&gt; report.html
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Generates a self-contained HTML report with a full metric diff table, regression highlights, and trend analysis. No external assets — works offline.&lt;/p&gt;


&lt;h2&gt;
  
  
  Configuration
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;flameiq.yaml&lt;/code&gt; (created by &lt;code&gt;flameiq init&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;thresholds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;latency.p95&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;   &lt;span class="s"&gt;10%&lt;/span&gt;    &lt;span class="c1"&gt;# Allow up to 10% latency increase&lt;/span&gt;
  &lt;span class="na"&gt;latency.p99&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;   &lt;span class="s"&gt;15%&lt;/span&gt;
  &lt;span class="na"&gt;throughput&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;    &lt;span class="s"&gt;-5%&lt;/span&gt;    &lt;span class="c1"&gt;# Allow up to 5% throughput decrease&lt;/span&gt;
  &lt;span class="na"&gt;memory_mb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;      &lt;span class="s"&gt;8%&lt;/span&gt;

&lt;span class="na"&gt;baseline&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rolling_median&lt;/span&gt;
  &lt;span class="na"&gt;rolling_window&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;

&lt;span class="na"&gt;statistics&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.95&lt;/span&gt;

&lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Try the Demo
&lt;/h2&gt;

&lt;p&gt;We built a demo project — &lt;strong&gt;flameiq-demo&lt;/strong&gt; — that walks through the full regression detection workflow using a real Python library:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/flameiq/demo-flameiq" rel="noopener noreferrer"&gt;https://github.com/flameiq/demo-flameiq&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PyPI:&lt;/strong&gt; &lt;a href="https://pypi.org/project/flameiq-core/" rel="noopener noreferrer"&gt;https://pypi.org/project/flameiq-core/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation:&lt;/strong&gt; &lt;a href="https://flameiq-core.readthedocs.io" rel="noopener noreferrer"&gt;https://flameiq-core.readthedocs.io&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://github.com/flameiq/flameiq-core" rel="noopener noreferrer"&gt;https://github.com/flameiq/flameiq-core&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Demo project:&lt;/strong&gt; &lt;a href="https://github.com/flameiq/demo-flameiq" rel="noopener noreferrer"&gt;https://github.com/flameiq/demo-flameiq&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Feedback, issues, and contributions are welcome. If you have caught a regression with FlameIQ or have a use case we haven't considered, open an issue or start a discussion on GitHub.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; &lt;code&gt;python&lt;/code&gt; &lt;code&gt;opensource&lt;/code&gt; &lt;code&gt;devtools&lt;/code&gt; &lt;code&gt;ci&lt;/code&gt; &lt;code&gt;performance&lt;/code&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>performance</category>
      <category>python</category>
      <category>testing</category>
    </item>
  </channel>
</rss>
