<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: orenlab</title>
    <description>The latest articles on DEV Community by orenlab (@orenlab).</description>
    <link>https://dev.to/orenlab</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3844366%2F7c41355d-8cb0-4d62-aae0-775c1cc580f0.jpg</url>
      <title>DEV Community: orenlab</title>
      <link>https://dev.to/orenlab</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/orenlab"/>
    <language>en</language>
    <item>
      <title>Structural review that finally knows what your tests cover</title>
      <dc:creator>orenlab</dc:creator>
      <pubDate>Thu, 16 Apr 2026 13:52:28 +0000</pubDate>
      <link>https://dev.to/orenlab/codeclone-b5-structural-review-that-finally-knows-what-your-tests-cover-4l80</link>
      <guid>https://dev.to/orenlab/codeclone-b5-structural-review-that-finally-knows-what-your-tests-cover-4l80</guid>
      <description>&lt;p&gt;In earlier posts, I wrote about &lt;a href="https://dev.to/orenlab/i-built-a-baseline-aware-python-code-health-tool-for-ci-and-ai-assisted-coding-5dij"&gt;why I built CodeClone&lt;/a&gt;, &lt;a href="https://dev.to/orenlab/i-turned-my-python-code-quality-tool-into-a-budget-aware-mcp-server-for-ai-agents-127j"&gt;why I exposed it through MCP for AI agents&lt;/a&gt;, and how &lt;a href="https://dev.to/orenlab/codeclone-b4-from-cli-tool-to-a-real-review-surface-for-vs-code-claude-desktop-and-codex-150c"&gt;&lt;code&gt;b4&lt;/code&gt; turned it into a real review surface&lt;/a&gt; for VS Code, Claude Desktop, and Codex.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;b5&lt;/code&gt; is the release where structural review stops being a parallel universe to your test suite.&lt;/p&gt;

&lt;p&gt;Until now, CodeClone could tell you that a function is long, complex, duplicated, or coupled to everything — but it had no idea whether that function was covered by a single unit test. That mattered more than I wanted to admit. A complex function with a &lt;code&gt;0.98&lt;/code&gt; coverage ratio is not the same risk as the identical function with &lt;code&gt;0.0&lt;/code&gt;. A reviewer knows this. An AI agent reading an MCP response doesn't — unless the tool tells it.&lt;/p&gt;

&lt;p&gt;So &lt;code&gt;b5&lt;/code&gt; fixes that, and while doing it, also lifts a few other things that kept getting in the way:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;typing and docstring coverage as first-class review facts&lt;/li&gt;
&lt;li&gt;public API drift as a baseline-governed signal&lt;/li&gt;
&lt;li&gt;golden-fixture suppression, so intentionally duplicated test fixtures stop polluting health and CI gates&lt;/li&gt;
&lt;li&gt;a much clearer triage story for MCP and IDE clients&lt;/li&gt;
&lt;li&gt;a rebuilt HTML report with unified filters and cleaner empty states&lt;/li&gt;
&lt;li&gt;a Claude Desktop launcher that actually picks the right Python&lt;/li&gt;
&lt;li&gt;a warm-path benchmark that now tells the truth&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let me walk through what changed and why.&lt;/p&gt;

&lt;h2&gt;1. Bring your &lt;code&gt;coverage.xml&lt;/code&gt; into the review&lt;/h2&gt;

&lt;p&gt;The headline feature of &lt;code&gt;b5&lt;/code&gt; is &lt;strong&gt;Coverage Join&lt;/strong&gt;. Point CodeClone at any Cobertura XML produced by &lt;code&gt;coverage.py&lt;/code&gt;, &lt;code&gt;pytest-cov&lt;/code&gt;, or your CI and it fuses test coverage into the same run that produces clone groups, complexity, cohesion, and dead code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;codeclone &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--coverage&lt;/span&gt; coverage.xml &lt;span class="nt"&gt;--coverage-min&lt;/span&gt; 50 &lt;span class="nt"&gt;--html&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What comes out is not "new coverage tool, please delete the old one." It's coverage used as a &lt;strong&gt;modifier on structural review&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each function in the current run gets a factual coverage ratio.&lt;/li&gt;
&lt;li&gt;Functions below the threshold show up as &lt;strong&gt;coverage hotspots&lt;/strong&gt; with their complexity and caller count alongside.&lt;/li&gt;
&lt;li&gt;High-risk findings can now read "complex + uncovered + new vs baseline" instead of just "complex."&lt;/li&gt;
&lt;li&gt;A new gate, &lt;code&gt;--fail-on-untested-hotspots&lt;/code&gt;, fails CI on below-threshold functions &lt;strong&gt;only where the coverage report actually measured them&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last distinction is the part I care about most.&lt;/p&gt;

&lt;h2&gt;2. Honest about scope: measured vs out-of-scope&lt;/h2&gt;

&lt;p&gt;The easy mistake when bolting coverage onto a second tool is to silently treat "function missing from &lt;code&gt;coverage.xml&lt;/code&gt;" as "function is uncovered." It makes the dashboard look busier, but it's a lie — the function might be covered by a coverage run that was filtered to a different package, or it might be a module the coverage config excluded on purpose.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;b5&lt;/code&gt; keeps these two cases cleanly separate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Coverage hotspots&lt;/strong&gt; — code that &lt;code&gt;coverage.xml&lt;/code&gt; measured and reported below threshold. This is a hard signal.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coverage scope gaps&lt;/strong&gt; — code present in your repo but not in the coverage XML at all. This is a &lt;strong&gt;scoping observation&lt;/strong&gt;, not a verdict.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both show up in the report and through MCP, but with different meanings. In mixed monorepos, that distinction stops being cosmetic very fast.&lt;/p&gt;

&lt;p&gt;None of this changes clone identity, fingerprints, or NEW-vs-KNOWN semantics — the baseline model is untouched. Coverage Join is a &lt;strong&gt;current-run fact&lt;/strong&gt;, not baseline truth.&lt;/p&gt;
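&lt;p&gt;To make the distinction concrete, here is an illustration (my sketch, not CodeClone's implementation) of deriving a per-function ratio from a Cobertura report, keeping "not measured" separate from "measured and uncovered":&lt;/p&gt;

```python
# Illustration only (not CodeClone's implementation): derive a per-function
# coverage ratio from a Cobertura XML report, keeping "not measured" distinct
# from "measured and uncovered".
import xml.etree.ElementTree as ET
from typing import Optional


def line_hits(cobertura_path: str, filename: str) -> dict:
    """Map line number -> hit count for one file in the report."""
    root = ET.parse(cobertura_path).getroot()
    hits = {}
    for cls in root.iter("class"):
        if cls.get("filename") == filename:
            for line in cls.iter("line"):
                hits[int(line.get("number"))] = int(line.get("hits"))
    return hits


def coverage_ratio(hits: dict, start: int, end: int) -> Optional[float]:
    """Executed fraction of a function's measured lines.

    Returns None when the report measured nothing in the span:
    a scope gap, not a 0.0 hotspot.
    """
    measured = [n for n in range(start, end + 1) if n in hits]
    if not measured:
        return None
    return sum(1 for n in measured if hits[n] > 0) / len(measured)
```

&lt;p&gt;The &lt;code&gt;None&lt;/code&gt; return is the whole point: a function absent from the XML gets no ratio at all instead of a fabricated zero.&lt;/p&gt;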

&lt;h2&gt;3. Typing and docstring coverage are now part of the picture&lt;/h2&gt;

&lt;p&gt;I used to expose "typing coverage" and "docstring coverage" as optional toggles. In practice, nobody turned them on, and they kept hiding behind flags that felt vestigial.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;b5&lt;/code&gt; removes the toggles and just collects adoption coverage whenever you run in metrics mode:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;parameter annotation coverage&lt;/li&gt;
&lt;li&gt;return annotation coverage&lt;/li&gt;
&lt;li&gt;public docstring coverage&lt;/li&gt;
&lt;li&gt;explicit &lt;code&gt;Any&lt;/code&gt; count&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They land in the main CLI &lt;code&gt;Metrics&lt;/code&gt; block, in the HTML Overview, in MCP summaries, and in the baseline. And they get their own CI gates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;codeclone &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--min-typing-coverage&lt;/span&gt; 80 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--min-docstring-coverage&lt;/span&gt; 60 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--fail-on-typing-regression&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--fail-on-docstring-regression&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The regression gates are the interesting pair: they don't force you to reach a specific threshold; they just fail CI when adoption &lt;strong&gt;drops&lt;/strong&gt; compared to your trusted baseline. That tends to be more realistic for codebases where you're migrating gradually.&lt;/p&gt;
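&lt;p&gt;The idea behind the adoption metrics is easy to sketch with the standard &lt;code&gt;ast&lt;/code&gt; module (an illustration, not the shipped collector):&lt;/p&gt;

```python
# Sketch, not the shipped collector: parameter-annotation coverage for a
# module, counting each parameter (excluding self/cls) and checking whether
# it carries a type annotation.
import ast


def param_annotation_coverage(source: str) -> float:
    tree = ast.parse(source)
    total = annotated = 0
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for arg in node.args.args + node.args.kwonlyargs:
                if arg.arg in ("self", "cls"):
                    continue
                total += 1
                if arg.annotation is not None:
                    annotated += 1
    return annotated / total if total else 1.0
```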

&lt;h2&gt;4. Public API drift becomes a first-class signal&lt;/h2&gt;

&lt;p&gt;Another thing that used to live outside the review surface: "did this PR break the public API?"&lt;/p&gt;

&lt;p&gt;&lt;code&gt;b5&lt;/code&gt; adds an opt-in &lt;strong&gt;API Surface&lt;/strong&gt; layer that records a snapshot of your public symbols — modules, classes, functions, their parameters and return types — in the metrics baseline. Subsequent runs produce a baseline diff with explicit categories: additions, breaking changes, and everything else.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Record the snapshot&lt;/span&gt;
codeclone &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--api-surface&lt;/span&gt; &lt;span class="nt"&gt;--update-metrics-baseline&lt;/span&gt;

&lt;span class="c"&gt;# Guard PRs&lt;/span&gt;
codeclone &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--fail-on-api-break&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's not a type checker and it's not SemVer enforcement. It's "the set of externally-callable names in this package just changed in a way that is likely to break downstream users, please confirm." For libraries that's the thing you want CI to block on.&lt;/p&gt;

&lt;p&gt;Private symbols are classified separately from public ones, so moving an internal helper around doesn't pollute the diff.&lt;/p&gt;
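&lt;p&gt;The core mechanic is easy to picture. Here is a simplified sketch (the names and shape are mine, not CodeClone's snapshot format):&lt;/p&gt;

```python
# Simplified sketch (names and shape are mine, not CodeClone's snapshot
# format): record a module's public top-level symbols, then classify a later
# run against the recorded baseline.
import ast


def public_symbols(source: str) -> set:
    """Top-level non-underscore functions and classes in a module."""
    tree = ast.parse(source)
    return {
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        and not node.name.startswith("_")
    }


def diff_surface(baseline: set, current: set) -> dict:
    return {
        "added": current - baseline,      # new capability, usually safe
        "breaking": baseline - current,   # removed public name, likely breaks callers
    }
```

&lt;p&gt;Because &lt;code&gt;public_symbols&lt;/code&gt; skips underscore names, moving an internal helper never shows up in the diff, which matches the private/public separation described above.&lt;/p&gt;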

&lt;h2&gt;5. Golden fixtures stop showing up as debt&lt;/h2&gt;

&lt;p&gt;Some repositories — including CodeClone itself — intentionally keep duplicated golden fixtures to lock report contracts and parser behavior. Those clones are real. They are also &lt;em&gt;not&lt;/em&gt; live review debt.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;b5&lt;/code&gt; adds a project-level policy for exactly that case:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[tool.codeclone]&lt;/span&gt;
&lt;span class="py"&gt;golden_fixture_paths&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"tests/fixtures/golden_*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clone groups fully contained in those paths are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;excluded from the health score&lt;/li&gt;
&lt;li&gt;excluded from CI gates&lt;/li&gt;
&lt;li&gt;excluded from active findings&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;still carried&lt;/strong&gt; in the report as suppressed facts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the tool stays honest — you can still see the suppressed groups in the HTML Clones tab and in the canonical JSON — without making CI noisier than it needs to be. If a group stops being "fully inside the fixture paths," it stops being suppressed automatically.&lt;/p&gt;
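&lt;p&gt;The containment rule is worth spelling out. A sketch of its logic (my paraphrase, not the shipped matcher):&lt;/p&gt;

```python
# My paraphrase of the containment rule, not the shipped matcher: a clone
# group is suppressed only while every occurrence path matches one of the
# golden-fixture globs. One occurrence outside, and the group counts again.
from fnmatch import fnmatch


def is_suppressed(group_paths, fixture_globs):
    return bool(group_paths) and all(
        any(fnmatch(path, glob) for glob in fixture_globs)
        for path in group_paths
    )
```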

&lt;h2&gt;6. Triage that says what it's actually looking at&lt;/h2&gt;

&lt;p&gt;MCP summary and triage payloads in &lt;code&gt;b5&lt;/code&gt; include a few compact interpretation fields that turned out to matter a lot for both AI agents and humans:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;health_scope&lt;/code&gt; — is this number repository-wide, production-only, or for a specific focus?&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;focus&lt;/code&gt; — what does "new findings" actually mean for this run?&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;new_by_source_kind&lt;/code&gt; — of the new findings, how many are in production code vs tests vs tooling?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The net effect is that an agent asking "is this PR risky?" no longer has to guess whether "3 new findings" means "three new bugs in production" or "three new flake-prone tests." The payload tells it directly. The VS Code extension uses the same fields to explain repository-wide health, production focus, and outside-focus debt without widening the review flow.&lt;/p&gt;
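&lt;p&gt;A client-side reading of those fields might look like this (the payload shape here is my assumption based on the field names, not the exact MCP schema):&lt;/p&gt;

```python
# Assumed payload shape (field names from the post; the exact MCP schema may
# differ): a client escalates only when new findings land in production code.
payload = {
    "health_scope": "repository-wide",
    "focus": "new findings vs trusted baseline",
    "new_by_source_kind": {"production": 1, "tests": 2, "tooling": 0},
}


def needs_human_review(p):
    """Three new flaky tests and one new production finding are different risks."""
    return p.get("new_by_source_kind", {}).get("production", 0) > 0
```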

&lt;p&gt;The extension also now surfaces Coverage Join facts in its overview when the connected server supports them, and the optional in-IDE help topics are gated by server version so they stay honest about what's actually available.&lt;/p&gt;

&lt;h2&gt;7. The HTML report got a proper rebuild&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;b4&lt;/code&gt; made the HTML report useful. &lt;code&gt;b5&lt;/code&gt; makes it feel finished.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unified filters popover&lt;/strong&gt; — Clones and Suggestions share the same filter UX: one button, one menu, an active-filter count, keyboard dismiss. Every control lives in the same place on every tab that has filters. No more two-row filter strips that wrap on narrow screens.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cleaner empty states&lt;/strong&gt; — instead of empty tables, sections with no findings now render a single reassuring row with an explicit "no issues detected" message and an icon. Silence has meaning now.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coverage Join subtab&lt;/strong&gt; — Quality gets a dedicated Coverage Join view with per-function rows: coverage %, complexity, callers, source kind, and a clear marker for scope gaps.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Adaptive theme toggle&lt;/strong&gt; — the theme button shows a sun in light mode and a moon in dark mode, resolved at paint time so you don't flash the wrong icon on first load.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Refreshed palette&lt;/strong&gt; — the whole report moved to a chromatic neutral scale tinted toward the brand indigo, so surfaces, borders, and text live on the same hue axis instead of looking like "grayscale + one purple button."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better provenance&lt;/strong&gt; — the meta block makes it explicit which python tag the baseline was built for, and calls out baseline mismatches instead of hiding them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stat-card rhythm&lt;/strong&gt; — KPI cards across Overview, Quality, Dependencies, and Dead Code share one card component now. Same padding, same typography, same tone variants.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of that changes a single report contract. It's pure render-layer work.&lt;/p&gt;

&lt;h2&gt;8. Claude Desktop launches the right Python&lt;/h2&gt;

&lt;p&gt;A boring but high-impact &lt;code&gt;b5&lt;/code&gt; change: the Claude Desktop bundle now resolves your project's runtime before falling back to a global one. Poetry's &lt;code&gt;.venv&lt;/code&gt;, workspace &lt;code&gt;.venv&lt;/code&gt;, and an explicit &lt;code&gt;workspace_root&lt;/code&gt; override all come before anything on &lt;code&gt;PATH&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Before: installing CodeClone into your project, then launching it via Claude Desktop, would often run &lt;em&gt;some other&lt;/em&gt; CodeClone from &lt;code&gt;/usr/local/bin&lt;/code&gt; because that happened to be first on &lt;code&gt;PATH&lt;/code&gt;. That's fixed.&lt;/p&gt;

&lt;p&gt;If you've been getting subtly wrong results through Claude Desktop and couldn't explain why, this is the one to pull.&lt;/p&gt;

&lt;h2&gt;9. Safer and more deterministic under the hood&lt;/h2&gt;

&lt;p&gt;Two changes that are unglamorous but worth noting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Git diff ref validation&lt;/strong&gt;. When you use &lt;code&gt;--diff-against&lt;/code&gt;, the supplied revision is now validated as a safe single-revision expression before being passed to &lt;code&gt;git&lt;/code&gt;. No shell surprises, no accidental multi-ref expressions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Canonical segment digests&lt;/strong&gt;. Segment clone digests no longer use &lt;code&gt;repr()&lt;/code&gt; — they're computed from canonical JSON bytes. This closes a subtle determinism hole where two runs on different interpreters could, in rare cases, produce different segment digests for the same input.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Neither changes clone identity or fingerprint semantics.&lt;/p&gt;
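&lt;p&gt;The digest change is the textbook fix for this class of bug. A sketch of the principle (not the actual hashing code):&lt;/p&gt;

```python
# Sketch of the principle, not the actual hashing code: hashing canonical
# JSON bytes (sorted keys, fixed separators) makes the digest depend only on
# the data, while repr() leaks dict ordering and interpreter details.
import hashlib
import json


def canonical_digest(segment):
    data = json.dumps(segment, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(data.encode("utf-8")).hexdigest()
```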

&lt;h2&gt;10. The warm path is actually warm&lt;/h2&gt;

&lt;p&gt;One of the more satisfying &lt;code&gt;b5&lt;/code&gt; fixes wasn't a feature at all.&lt;/p&gt;

&lt;p&gt;I'd been quietly suspicious of the benchmark numbers for a while — warm runs were looking &lt;em&gt;too&lt;/em&gt; good, and I couldn't make the shape of the curve match what the pipeline was actually doing. Turns out the benchmark harness had a bug that broke process-pool execution on warm runs, so the cache was being credited for work it wasn't doing.&lt;/p&gt;

&lt;p&gt;After fixing the harness and tightening gating around benchmark runs so repo quality gates don't interfere, the numbers are now both fast &lt;strong&gt;and&lt;/strong&gt; trustworthy. From the Linux smoke benchmark:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;cold_full&lt;/code&gt;: &lt;code&gt;6.58s&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;warm_full&lt;/code&gt;: &lt;code&gt;0.95s&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;warm_clones_only&lt;/code&gt;: &lt;code&gt;0.86s&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's about a &lt;strong&gt;6.9× speedup&lt;/strong&gt; on warm runs. The cache is no longer "probably helping" — it is clearly doing useful work, and now I can say that with a straight face.&lt;/p&gt;

&lt;h2&gt;Wrapping up&lt;/h2&gt;

&lt;p&gt;If &lt;code&gt;b4&lt;/code&gt; made CodeClone a real review surface, &lt;code&gt;b5&lt;/code&gt; is the release where that surface learned to ask useful second-order questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is this complex function actually tested?&lt;/li&gt;
&lt;li&gt;Is this low-coverage number a hard signal or a scope gap?&lt;/li&gt;
&lt;li&gt;Is this new finding in production code or in fixtures?&lt;/li&gt;
&lt;li&gt;Did this PR break the public API?&lt;/li&gt;
&lt;li&gt;Is this duplication intentional test scaffolding or real debt?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every one of those used to require me to eyeball two dashboards and a coverage report. Now there's a single canonical answer, and it ships consistently through CLI, HTML, JSON, SARIF, MCP, the VS Code extension, the Claude Desktop bundle, and the Codex plugin.&lt;/p&gt;

&lt;h2&gt;Try it&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Base install&lt;/span&gt;
uv tool &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--pre&lt;/span&gt; codeclone

&lt;span class="c"&gt;# With MCP for AI agents (Claude Desktop, Codex, VS Code, Cursor, ...)&lt;/span&gt;
uv tool &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--pre&lt;/span&gt; &lt;span class="s2"&gt;"codeclone[mcp]"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A one-liner to feel the new shape on your own repo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;codeclone &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--coverage&lt;/span&gt; coverage.xml &lt;span class="nt"&gt;--coverage-min&lt;/span&gt; 70 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--min-typing-coverage&lt;/span&gt; 80 &lt;span class="nt"&gt;--fail-on-typing-regression&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--api-surface&lt;/span&gt; &lt;span class="nt"&gt;--fail-on-api-break&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--html&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open the HTML report, watch the Coverage Join tab populate, and check whether your "risky" functions really were the risky ones.&lt;/p&gt;

&lt;p&gt;Feedback, issues, and PRs welcome on &lt;a href="https://github.com/orenlab/codeclone" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>python</category>
      <category>devtools</category>
      <category>ai</category>
      <category>showdev</category>
    </item>
    <item>
      <title>CodeClone b4: from CLI tool to a real review surface for VS Code, Claude Desktop, and Codex</title>
      <dc:creator>orenlab</dc:creator>
      <pubDate>Sun, 05 Apr 2026 18:28:09 +0000</pubDate>
      <link>https://dev.to/orenlab/codeclone-b4-from-cli-tool-to-a-real-review-surface-for-vs-code-claude-desktop-and-codex-150c</link>
      <guid>https://dev.to/orenlab/codeclone-b4-from-cli-tool-to-a-real-review-surface-for-vs-code-claude-desktop-and-codex-150c</guid>
      <description>&lt;p&gt;I already wrote about why I built CodeClone and why I cared about &lt;a href="https://dev.to/orenlab/i-built-a-baseline-aware-python-code-health-tool-for-ci-and-ai-assisted-coding-5dij"&gt;baseline-aware code health&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Then I wrote about turning it into a &lt;a href="https://dev.to/orenlab/i-turned-my-python-code-quality-tool-into-a-budget-aware-mcp-server-for-ai-agents-127j"&gt;read-only, budget-aware MCP server for AI agents&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This post is about what changed in &lt;code&gt;2.0.0b4&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The short version:&lt;/strong&gt; if &lt;code&gt;b3&lt;/code&gt; made CodeClone usable through MCP, &lt;code&gt;b4&lt;/code&gt; made it feel like a product.&lt;/p&gt;

&lt;p&gt;Not because I added more analysis magic or built a separate "AI mode." But because I pushed the same structural truth into the places where people and agents actually work — VS Code, Claude Desktop, Codex — and tightened the contract between all of them.&lt;/p&gt;

&lt;p&gt;A lot of developer tools are strong on analysis and weak on workflow. A lot of AI-facing tools shine in a demo and fall apart in daily use.&lt;/p&gt;

&lt;p&gt;For &lt;code&gt;b4&lt;/code&gt;, I wanted a tighter shape:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the CLI, HTML report, MCP, and IDE clients should agree on what "health" means&lt;/li&gt;
&lt;li&gt;the first pass should stay conservative&lt;/li&gt;
&lt;li&gt;deeper inspection should be explicit, not accidental&lt;/li&gt;
&lt;li&gt;report-only signals should stay visible without polluting gates&lt;/li&gt;
&lt;li&gt;setup failures should tell you what went wrong&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the release theme. Not "more output" — better day-to-day workflows.&lt;/p&gt;

&lt;h2&gt;The most interesting new layer: Overloaded Modules&lt;/h2&gt;

&lt;p&gt;Clone detection tells you &lt;em&gt;this logic is repeated&lt;/em&gt;. Complexity tells you &lt;em&gt;this function is locally hard to reason about&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Overloaded Modules&lt;/code&gt; asks a different question: &lt;strong&gt;which modules are taking on too much responsibility?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The signals include module size pressure, dependency pressure, hub-like shape, and reimport-heavy structure. This points to code that often &lt;em&gt;feels&lt;/em&gt; wrong before it is easy to classify. You know the file keeps attracting logic. Every change in it feels heavier than it should. But it is not a clone group or a single high-complexity function.&lt;/p&gt;

&lt;p&gt;The important design choice: this layer is &lt;strong&gt;report-only&lt;/strong&gt; for now. It shows up in JSON, HTML, Markdown, text, MCP, and the VS Code extension — but it does not affect health score, gates, baseline novelty, or SARIF.&lt;/p&gt;

&lt;p&gt;I wanted the signal to be useful before letting it become consequential.&lt;/p&gt;

&lt;h2&gt;VS Code became a real client, not a demo&lt;/h2&gt;

&lt;p&gt;The preview VS Code extension is the first release where CodeClone feels properly usable &lt;em&gt;inside&lt;/em&gt; an editor instead of only around one.&lt;/p&gt;

&lt;p&gt;It is now live on the &lt;a href="https://marketplace.visualstudio.com/items?itemName=orenlab.codeclone" rel="noopener noreferrer"&gt;Visual Studio Marketplace&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The extension is not a generic linter panel. It is built around a review loop:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Analyze the workspace.&lt;/li&gt;
&lt;li&gt;Look at compact structural health.&lt;/li&gt;
&lt;li&gt;Review priorities first.&lt;/li&gt;
&lt;li&gt;Reveal source.&lt;/li&gt;
&lt;li&gt;Open detail only when needed.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A lot of extensions get this wrong by dumping every result into the IDE and calling it integration. I wanted the opposite: a client that is baseline-aware, triage-first, source-first, trust-aware, and read-only.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;b4&lt;/code&gt; also tightened the surrounding UX:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Restricted Mode&lt;/strong&gt; — onboarding works in untrusted workspaces, but analysis stays gated until trust is granted&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Explicit analysis profiles&lt;/strong&gt; — "deeper review" is a deliberate follow-up, not silent threshold drift&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hard version checks&lt;/strong&gt; — if the IDE client quietly talks to the wrong local server version, you do not get a tool you can trust; you get folklore&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last one mattered more than I expected.&lt;/p&gt;

&lt;h2&gt;Claude Desktop and Codex speak the same contract&lt;/h2&gt;

&lt;p&gt;I also added native client paths for Claude Desktop and Codex.&lt;/p&gt;

&lt;p&gt;The goal was not "be available in more places." It was keeping one analysis contract across all of them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;no second analyzer&lt;/li&gt;
&lt;li&gt;no plugin-specific findings&lt;/li&gt;
&lt;li&gt;no AI-only semantics&lt;/li&gt;
&lt;li&gt;no client that quietly disagrees with the CLI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Claude Desktop gets a local &lt;code&gt;.mcpb&lt;/code&gt; bundle with pre-loaded review instructions.&lt;br&gt;
Codex gets a native plugin with two focused skills — full review and quick hotspot discovery. Both sit on top of the same &lt;code&gt;codeclone-mcp&lt;/code&gt; server.&lt;/p&gt;

&lt;p&gt;That may sound boring, but boring is good here. The more clients you add, the easier it becomes to fork your own semantics without noticing. A lot of the &lt;code&gt;b4&lt;/code&gt; work was about resisting exactly that.&lt;/p&gt;

&lt;h2&gt;Conservative first, deeper only when you mean it&lt;/h2&gt;

&lt;p&gt;CodeClone defaults are intentionally conservative. That is the right first pass for CI, baseline-aware review, and agent-driven workflows.&lt;/p&gt;

&lt;p&gt;But there is a real second need: sometimes the default pass looks clean, and you want to go hunting for smaller, more local repetition.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;b4&lt;/code&gt; makes that distinction explicit:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start with defaults or &lt;code&gt;pyproject.toml&lt;/code&gt; thresholds.&lt;/li&gt;
&lt;li&gt;Use that as the stable first pass.&lt;/li&gt;
&lt;li&gt;Lower thresholds only for an intentional deeper review.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This now shows up clearly in MCP help topics and in the VS Code analysis profiles.&lt;/p&gt;

&lt;p&gt;"More sensitive" is not the same as "more correct." A clean conservative pass&lt;br&gt;
does not prove there is no finer-grained repetition. But a lower-threshold exploratory pass should not quietly pretend to have the same meaning as the default profile. That distinction needed to become product-level.&lt;/p&gt;

&lt;h2&gt;MCP got smarter about guiding agents — and cheaper to talk to&lt;/h2&gt;

&lt;p&gt;Two things happened on the MCP side that are easy to miss but matter a lot in practice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First: the &lt;code&gt;help&lt;/code&gt; tool.&lt;/strong&gt; In &lt;code&gt;b3&lt;/code&gt;, agents had 20 analysis and query tools but no way to ask "what should I do next?" or "what does this baseline state mean?" without burning tokens on trial and error.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;b4&lt;/code&gt; adds a &lt;code&gt;help(topic=...)&lt;/code&gt; tool with bounded, static answers for common uncertainty points: workflow sequencing, analysis profile semantics, baseline interpretation, suppression rules, review state, and changed-scope review. An agent can ask one cheap question instead of making three exploratory tool calls to figure out the right next step.&lt;/p&gt;

&lt;p&gt;This is a small surface — seven topics, short answers, no dynamic analysis. But it changes the economics of agent workflows significantly. The difference between "the agent guesses and retries" and "the agent asks and proceeds" is often 3–5x in token cost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Second: tighter token budgets across the board.&lt;/strong&gt; &lt;code&gt;b4&lt;/code&gt; continued the budget-aware work from &lt;code&gt;b3&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;finding IDs are now sha256-based short forms instead of full canonical URIs&lt;/li&gt;
&lt;li&gt;the &lt;code&gt;derived&lt;/code&gt; section in MCP payloads is projected down to what agents actually need&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;metrics_detail&lt;/code&gt; is paginated with family and path filters so agents never pull the full metrics table by accident&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of this changes the canonical report — the JSON is still the complete truth. But the MCP view over it is now meaningfully leaner.&lt;/p&gt;
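&lt;p&gt;The short-ID scheme is simple to sketch (the truncation length here is my guess, not the actual wire format):&lt;/p&gt;

```python
# Sketch of a sha256-based short finding ID (the truncation length is my
# guess, not the actual wire format): deterministic across runs, and far
# cheaper in tokens than repeating a full canonical URI in every payload.
import hashlib


def short_finding_id(canonical_uri, length=12):
    return hashlib.sha256(canonical_uri.encode("utf-8")).hexdigest()[:length]
```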

&lt;h2&gt;The boring fixes that matter most&lt;/h2&gt;

&lt;p&gt;Some of my favorite changes in &lt;code&gt;b4&lt;/code&gt; are not flashy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;setup guidance that matches the real install path&lt;/li&gt;
&lt;li&gt;faster launcher failure behavior with clear error messages&lt;/li&gt;
&lt;li&gt;stricter local version handling across all client surfaces&lt;/li&gt;
&lt;li&gt;enriched MCP server instructions so agents get behavioral context on connect, not just a list of tools&lt;/li&gt;
&lt;li&gt;terminology cleanup around module hotspots&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not the kind of work that looks impressive in a screenshot. But it is exactly the kind of work that makes an engineering tool feel trustworthy over weeks and months.&lt;/p&gt;

&lt;h2&gt;What &lt;code&gt;b4&lt;/code&gt; feels like&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;b1&lt;/code&gt; — CodeClone became more than a clone detector.&lt;br&gt;
&lt;code&gt;b3&lt;/code&gt; — it became a serious MCP server.&lt;br&gt;
&lt;code&gt;b4&lt;/code&gt; — it started to feel coherent across the CLI, the report, MCP, and every client surface.&lt;/p&gt;

&lt;p&gt;You can start in the editor. You can stay aligned with baseline-aware truth. You can inspect module-level pressure without turning it into fake gating. You can move between human and agent workflows without changing the underlying semantics.&lt;/p&gt;

&lt;p&gt;That is much closer to what I wanted CodeClone to become.&lt;/p&gt;

&lt;h2&gt;Try it&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv tool &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--pre&lt;/span&gt; codeclone        &lt;span class="c"&gt;# core CLI (beta)&lt;/span&gt;
uv tool &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--pre&lt;/span&gt; &lt;span class="s2"&gt;"codeclone[mcp]"&lt;/span&gt; &lt;span class="c"&gt;# + MCP server for agents and IDEs&lt;/span&gt;
codeclone &lt;span class="nb"&gt;.&lt;/span&gt;                            &lt;span class="c"&gt;# analyze the current project&lt;/span&gt;
codeclone &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--html&lt;/span&gt; &lt;span class="nt"&gt;--open-html-report&lt;/span&gt;  &lt;span class="c"&gt;# open the interactive report&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/orenlab/codeclone" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; — source, extensions, plugin&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://orenlab.github.io/codeclone/" rel="noopener noreferrer"&gt;Docs&lt;/a&gt; — contracts, guides, live report&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://orenlab.github.io/codeclone/mcp/" rel="noopener noreferrer"&gt;MCP guide&lt;/a&gt; — agent and IDE setup&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pypi.org/project/codeclone/" rel="noopener noreferrer"&gt;PyPI&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you are building review workflows around IDEs, MCP clients, or AI-assisted refactoring, I would love feedback on one question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What makes a structural analysis tool feel trustworthy once it leaves the CLI and starts living inside real developer workflows?&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>mcp</category>
      <category>vscode</category>
      <category>devtools</category>
    </item>
    <item>
      <title>I turned my Python code quality tool into a budget-aware MCP server for AI agents</title>
      <dc:creator>orenlab</dc:creator>
      <pubDate>Wed, 01 Apr 2026 13:00:50 +0000</pubDate>
      <link>https://dev.to/orenlab/i-turned-my-python-code-quality-tool-into-a-budget-aware-mcp-server-for-ai-agents-127j</link>
      <guid>https://dev.to/orenlab/i-turned-my-python-code-quality-tool-into-a-budget-aware-mcp-server-for-ai-agents-127j</guid>
      <description>&lt;p&gt;I already wrote about why I built CodeClone and why I care about baseline-aware&lt;br&gt;
code health:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/orenlab/i-built-a-baseline-aware-python-code-health-tool-for-ci-and-ai-assisted-coding-5dij"&gt;I built a baseline-aware Python code health tool for CI and AI-assisted coding&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This post is about what changed in &lt;code&gt;2.0.0b3&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The short version: this is the first release where CodeClone feels less like a Python structural analysis CLI and more like a serious MCP surface for AI coding agents.&lt;/p&gt;

&lt;p&gt;Not by building a second engine.&lt;br&gt;
Not by adding AI-specific heuristics to the core.&lt;br&gt;
But by exposing the same deterministic, baseline-aware pipeline through a read-only MCP layer that agents can actually use.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why MCP mattered for CodeClone
&lt;/h2&gt;

&lt;p&gt;Once you start using coding agents seriously, the hard part is not "can the model write code?"&lt;/p&gt;

&lt;p&gt;The harder questions are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what changed structurally?&lt;/li&gt;
&lt;li&gt;is this debt new or already accepted in baseline?&lt;/li&gt;
&lt;li&gt;is this production risk or just test noise?&lt;/li&gt;
&lt;li&gt;should this block CI?&lt;/li&gt;
&lt;li&gt;what is the safest next refactor target?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the gap I wanted CodeClone to close.&lt;/p&gt;
&lt;h2&gt;
  
  
  What shipped in &lt;code&gt;2.0.0b3&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;The headline is an optional MCP server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--pre&lt;/span&gt; &lt;span class="s2"&gt;"codeclone[mcp]"&lt;/span&gt;
codeclone-mcp &lt;span class="nt"&gt;--transport&lt;/span&gt; stdio
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since &lt;code&gt;b3&lt;/code&gt; is still a beta, the &lt;code&gt;--pre&lt;/code&gt; flag matters here.&lt;/p&gt;

&lt;p&gt;But the useful part is the workflow around it.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;b3&lt;/code&gt; adds three things that matter together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a read-only MCP surface for agents and IDE clients&lt;/li&gt;
&lt;li&gt;review-oriented workflows: changed-files analysis, run comparison, gate preview, and PR summaries&lt;/li&gt;
&lt;li&gt;tighter surrounding surfaces: stronger SARIF, better HTML navigation, and directory hotspots&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There is also a packaging change worth mentioning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CodeClone source code is now under &lt;code&gt;MPL-2.0&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;documentation stays under &lt;code&gt;MIT&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What makes this MCP layer different
&lt;/h2&gt;

&lt;p&gt;I think there are a lot of tools now that can expose "some analysis" over MCP.&lt;/p&gt;

&lt;p&gt;What I wanted from CodeClone was stricter than that.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Canonical-report-first
&lt;/h3&gt;

&lt;p&gt;The MCP layer is not a second truth path.&lt;/p&gt;

&lt;p&gt;It reads the same canonical report model as the CLI, HTML, and SARIF surfaces.&lt;br&gt;
That means an agent is not looking at an "AI view" that quietly disagrees with what CI or the report says.&lt;/p&gt;
&lt;h3&gt;
  
  
  2. Read-only
&lt;/h3&gt;

&lt;p&gt;This was non-negotiable for me.&lt;/p&gt;

&lt;p&gt;CodeClone MCP does not mutate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;source files&lt;/li&gt;
&lt;li&gt;baselines&lt;/li&gt;
&lt;li&gt;repository state&lt;/li&gt;
&lt;li&gt;on-disk report artifacts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The only mutable part is session-local review state, and that stays in memory only.&lt;/p&gt;
&lt;h3&gt;
  
  
  3. Budget-aware by design
&lt;/h3&gt;

&lt;p&gt;This is the part I ended up caring about more than I expected.&lt;/p&gt;

&lt;p&gt;A lot of MCP tools are technically useful, but easy to use badly. An agent can burn a lot of tokens just by listing too much too early.&lt;/p&gt;

&lt;p&gt;CodeClone MCP is intentionally shaped so that the cheapest useful path is also the most obvious one.&lt;/p&gt;

&lt;p&gt;It is not only bounded in payload shape. It actively guides agents toward low-cost, high-signal workflows.&lt;/p&gt;
&lt;h2&gt;
  
  
  The workflow I wanted agents to follow
&lt;/h2&gt;

&lt;p&gt;The right first pass is not "dump all findings." In practice, the first useful question is rarely “show me everything.” It is usually “where should I look first?”&lt;/p&gt;

&lt;p&gt;In tool terms, that looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;analyze_repository or analyze_changed_paths
→ get_run_summary or get_production_triage
→ list_hotspots or focused check_*
→ get_finding
→ get_remediation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That sounds simple, but it matters a lot.&lt;/p&gt;

&lt;p&gt;It means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cheap overview first&lt;/li&gt;
&lt;li&gt;narrow triage second&lt;/li&gt;
&lt;li&gt;deep detail only when it is actually needed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a better fit for LLMs, and honestly a better fit for humans too.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real token cost on a dirty repository
&lt;/h2&gt;

&lt;p&gt;I wanted to check whether this was just a nice theory, so I tested it on one of my own messier private Python repos.&lt;br&gt;
It is still in an early development stage, is not public yet, and from CodeClone's point of view it currently has a lot of structural debt.&lt;br&gt;
It works, but "works" and "structurally healthy" are obviously not the same thing.&lt;/p&gt;

&lt;p&gt;In one local run, that looked like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;449&lt;/code&gt; Python files&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;108,939&lt;/code&gt; lines&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;2,729&lt;/code&gt; functions&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;1,048&lt;/code&gt; classes&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;659&lt;/code&gt; findings&lt;/li&gt;
&lt;li&gt;health score &lt;code&gt;34 (F)&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then I compared two MCP paths.&lt;/p&gt;

&lt;h3&gt;
  
  
  Broad first-pass flow
&lt;/h3&gt;

&lt;p&gt;A naive "ask for a lot of things up front" cycle came out to about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;10,566&lt;/code&gt; tokens&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Guided first-pass flow
&lt;/h3&gt;

&lt;p&gt;Following the new MCP guidance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;analyze_repository&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;get_production_triage&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;list_hotspots&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;get_finding&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;get_remediation&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The same first-pass workflow came out to about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;2,535&lt;/code&gt; tokens&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is roughly a &lt;code&gt;76%&lt;/code&gt; reduction in token cost for a useful first pass.&lt;/p&gt;
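&lt;p&gt;The arithmetic behind that figure is easy to check against the two token counts above:&lt;/p&gt;

```python
# Token counts from the two first-pass runs described above.
broad = 10_566   # naive "ask for a lot of things" flow
guided = 2_535   # guided triage-first flow

reduction = 1 - guided / broad
print(f"{reduction:.1%}")  # → 76.0%
```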

&lt;p&gt;The payloads did not magically become tiny; the main change was that the MCP surface now guided the client toward a narrower first-pass workflow.&lt;/p&gt;

&lt;p&gt;That result mattered to me because it changed how I think about MCP quality.&lt;/p&gt;

&lt;p&gt;For agent tooling, payload size is only half the story.&lt;br&gt;
The other half is whether the server nudges the agent toward the right path.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters for PR review
&lt;/h2&gt;

&lt;p&gt;In practice, the most valuable agent loop is usually not “analyze the whole repository forever,” but “review what changed, compare it to baseline, and decide whether anything should block the merge.”&lt;/p&gt;

&lt;p&gt;It is usually closer to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;code changed&lt;/li&gt;
&lt;li&gt;tests passed&lt;/li&gt;
&lt;li&gt;now check whether the structure got better or worse&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That is why &lt;code&gt;b3&lt;/code&gt; puts a lot of weight on changed-scope review.&lt;/p&gt;

&lt;p&gt;With CodeClone MCP, an agent can now ask things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what findings touch the files changed in this branch?&lt;/li&gt;
&lt;li&gt;are these findings new relative to baseline?&lt;/li&gt;
&lt;li&gt;what is the highest-priority structural issue here?&lt;/li&gt;
&lt;li&gt;would this fail CI?&lt;/li&gt;
&lt;li&gt;can I produce a short PR-ready summary?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a much better review loop than a giant flat findings dump.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the MCP surface is good at now
&lt;/h2&gt;

&lt;p&gt;The shape I like most is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;full repository analysis when you need canonical truth&lt;/li&gt;
&lt;li&gt;changed-files analysis when you need review focus&lt;/li&gt;
&lt;li&gt;compact triage first&lt;/li&gt;
&lt;li&gt;single-finding drill-down second&lt;/li&gt;
&lt;li&gt;markdown PR summary at the end&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice, that keeps prompts simple.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;h3&gt;
  
  
  Changed-files review
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Use CodeClone MCP to review the files changed in this branch.
Show me only findings that touch changed files, rank them by priority, and tell me whether anything here should block CI.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Safe refactor pick
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Use CodeClone MCP to find one high-priority structural issue that looks safe to refactor. Explain why it is a good first target and what refactor shape you would use. Do not change code yet.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  AI-generated code check
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;I added a lot of code with an AI agent.
Use CodeClone MCP to check for structural drift: new clone groups, duplicated branches, dead code, or design hotspots. Prioritize what is new relative to baseline.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the kind of MCP ergonomics I was aiming for: prompts stay fairly client-agnostic, and the server gives the agent a disciplined path.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;b3&lt;/code&gt; is not only about MCP
&lt;/h2&gt;

&lt;p&gt;Even though MCP is the headline, I did not want it to be isolated from the rest of the product.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;2.0.0b3&lt;/code&gt; also tightens the surrounding surfaces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;canonical report schema &lt;code&gt;2.2&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;cache schema &lt;code&gt;2.3&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;canonical design-finding thresholds recorded in report metadata&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Hotspots by Directory&lt;/code&gt; in the HTML overview&lt;/li&gt;
&lt;li&gt;stronger SARIF identities for code-scanning workflows&lt;/li&gt;
&lt;li&gt;Composite GitHub Action v2 for CI and PR automation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That matters because I want all of these surfaces to agree:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CLI for CI&lt;/li&gt;
&lt;li&gt;MCP for agents&lt;/li&gt;
&lt;li&gt;HTML for navigation&lt;/li&gt;
&lt;li&gt;SARIF for platform workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The product truth I am taking from this release
&lt;/h2&gt;

&lt;p&gt;The biggest lesson from &lt;code&gt;b3&lt;/code&gt; is that a good MCP server is not just a pile of tools.&lt;/p&gt;

&lt;p&gt;It is a control surface.&lt;/p&gt;

&lt;p&gt;For CodeClone, that now means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;deterministic&lt;/li&gt;
&lt;li&gt;canonical-report-first&lt;/li&gt;
&lt;li&gt;read-only&lt;/li&gt;
&lt;li&gt;budget-aware&lt;/li&gt;
&lt;li&gt;triage-first&lt;/li&gt;
&lt;li&gt;agent-guiding&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the direction I want to keep pushing.&lt;/p&gt;

&lt;p&gt;Not "AI magic."&lt;br&gt;
Better control loops.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it (don't forget to use &lt;code&gt;--pre&lt;/code&gt;)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/orenlab/codeclone" rel="noopener noreferrer"&gt;orenlab/codeclone&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Docs: &lt;a href="https://orenlab.github.io/codeclone/" rel="noopener noreferrer"&gt;orenlab.github.io/codeclone&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;MCP guide: &lt;a href="https://orenlab.github.io/codeclone/mcp/" rel="noopener noreferrer"&gt;orenlab.github.io/codeclone/mcp/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;PyPI: &lt;a href="https://pypi.org/project/codeclone/" rel="noopener noreferrer"&gt;pypi.org/project/codeclone&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you are already building with MCP clients, I would especially love feedback on one question:&lt;/p&gt;

&lt;p&gt;What would make PR review through an MCP tool genuinely useful for your team?&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>opensource</category>
      <category>mcp</category>
    </item>
    <item>
      <title>I built a baseline-aware Python code health tool for CI and AI-assisted coding</title>
      <dc:creator>orenlab</dc:creator>
      <pubDate>Thu, 26 Mar 2026 11:49:56 +0000</pubDate>
      <link>https://dev.to/orenlab/i-built-a-baseline-aware-python-code-health-tool-for-ci-and-ai-assisted-coding-5dij</link>
      <guid>https://dev.to/orenlab/i-built-a-baseline-aware-python-code-health-tool-for-ci-and-ai-assisted-coding-5dij</guid>
      <description>&lt;h1&gt;
  
  
  I built a baseline-aware Python code health tool for CI and AI-assisted coding
&lt;/h1&gt;

&lt;p&gt;If you write Python with AI tools today, you’ve probably felt this already:&lt;/p&gt;

&lt;p&gt;the code usually works, tests may pass, lint is green, but the structure gets worse in ways that are hard to notice&lt;br&gt;
until the repository starts fighting back.&lt;/p&gt;

&lt;p&gt;Not in one dramatic commit. More like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the same logic gets rewritten in slightly different ways across multiple files;&lt;/li&gt;
&lt;li&gt;helper functions quietly grow until nobody wants to touch them;&lt;/li&gt;
&lt;li&gt;coupling increases one import at a time;&lt;/li&gt;
&lt;li&gt;framework callbacks look unused even when they are not;&lt;/li&gt;
&lt;li&gt;dead code accumulates because generated code tends to leave leftovers behind.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the problem space I built &lt;strong&gt;CodeClone&lt;/strong&gt; for.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;CodeClone 2.0.0b1&lt;/code&gt; is the first version where the tool really matches the model I wanted from the beginning: not just&lt;br&gt;
“find some clones,” but &lt;strong&gt;track structural code health over time&lt;/strong&gt;, in CI, with a trusted baseline.&lt;/p&gt;

&lt;p&gt;This post is an introduction to that version and the design choices behind it.&lt;/p&gt;
&lt;h2&gt;
  
  
  First: I know the ecosystem is not empty
&lt;/h2&gt;

&lt;p&gt;I’m not pretending this is the first serious tool in this space.&lt;/p&gt;

&lt;p&gt;There are already strong tools around adjacent problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SonarQube / SonarCloud&lt;/strong&gt; for broad code quality, governance, and quality gates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PMD CPD&lt;/strong&gt; as one of the classic copy/paste detectors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;jscpd&lt;/strong&gt; for practical duplicate-code scanning across multiple languages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vulture&lt;/strong&gt; for Python dead-code detection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Radon / Xenon&lt;/strong&gt; for complexity-related checks&lt;/li&gt;
&lt;li&gt;and newer tools like &lt;strong&gt;pyscn&lt;/strong&gt;, which also move toward structural/code-health analysis for Python&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That matters, because I don’t think useful tools should be framed as “everything before this was wrong.”&lt;/p&gt;

&lt;p&gt;CodeClone is not trying to replace all of the above.&lt;/p&gt;

&lt;p&gt;Its angle is narrower and, I think, pretty specific:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;structural duplication is a first-class signal;&lt;/li&gt;
&lt;li&gt;baseline-aware governance is the center of the workflow, not an extra feature;&lt;/li&gt;
&lt;li&gt;deterministic output is non-negotiable;&lt;/li&gt;
&lt;li&gt;and the UI/report layer is not allowed to invent conclusions the analysis engine did not produce.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If I had to summarize the difference in one sentence, it would be this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;CodeClone is built around separating &lt;strong&gt;accepted debt&lt;/strong&gt; from &lt;strong&gt;new regressions&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That sounds simple, but it changes the entire shape of the tool.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why I think this matters more now
&lt;/h2&gt;

&lt;p&gt;AI coding assistants are genuinely useful. I use them. They speed things up.&lt;/p&gt;

&lt;p&gt;But they also change the failure mode of a codebase.&lt;/p&gt;

&lt;p&gt;The biggest risk is often not “the AI wrote something syntactically invalid.” That part is easy to catch.&lt;/p&gt;

&lt;p&gt;The harder problem is that AI tools are very good at producing &lt;strong&gt;locally plausible&lt;/strong&gt; code:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one more handler,&lt;/li&gt;
&lt;li&gt;one more service method,&lt;/li&gt;
&lt;li&gt;one more variant of the same logic,&lt;/li&gt;
&lt;li&gt;one more utility that overlaps with three existing ones.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each individual change looks reasonable.&lt;/p&gt;

&lt;p&gt;The repository as a whole gets worse.&lt;/p&gt;

&lt;p&gt;That is why I think structural analysis is especially useful for AI-assisted teams. If you are using Claude Code,&lt;br&gt;
Cursor, Codex, or similar tools, the important question is often not:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Is this code valid?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;but:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Did this change make the repository structurally worse?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is exactly the question a baseline-aware tool can answer well.&lt;/p&gt;
&lt;h2&gt;
  
  
  What CodeClone focuses on
&lt;/h2&gt;

&lt;p&gt;At the core, CodeClone analyzes Python projects and looks at structural signals such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;function clones&lt;/li&gt;
&lt;li&gt;block clones&lt;/li&gt;
&lt;li&gt;segment clones&lt;/li&gt;
&lt;li&gt;structural findings like duplicated branch families&lt;/li&gt;
&lt;li&gt;dead code&lt;/li&gt;
&lt;li&gt;complexity&lt;/li&gt;
&lt;li&gt;coupling&lt;/li&gt;
&lt;li&gt;cohesion&lt;/li&gt;
&lt;li&gt;dependency cycles&lt;/li&gt;
&lt;li&gt;a combined health score&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The outputs come in multiple formats:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HTML&lt;/li&gt;
&lt;li&gt;JSON&lt;/li&gt;
&lt;li&gt;Markdown&lt;/li&gt;
&lt;li&gt;SARIF&lt;/li&gt;
&lt;li&gt;Text&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But they all come from a single canonical report document. That was important to me because I wanted consistency between&lt;br&gt;
machine-readable outputs and the human-facing report.&lt;/p&gt;
&lt;h2&gt;
  
  
  The key idea: baseline-aware governance
&lt;/h2&gt;

&lt;p&gt;This is the part I care about most.&lt;/p&gt;

&lt;p&gt;A lot of code quality tools can tell you that your repository has problems. That is useful, but it is not enough for&lt;br&gt;
real CI.&lt;/p&gt;

&lt;p&gt;In a non-trivial codebase, there is usually historical debt:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;old duplication&lt;/li&gt;
&lt;li&gt;old complexity hotspots&lt;/li&gt;
&lt;li&gt;old dead code&lt;/li&gt;
&lt;li&gt;old architectural compromises&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If a tool only says “you have 400 problems,” that doesn’t help much. Most teams will either ignore it or disable it.&lt;/p&gt;

&lt;p&gt;CodeClone is designed around a different model:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;take the current state as a baseline;&lt;/li&gt;
&lt;li&gt;trust and validate that baseline explicitly;&lt;/li&gt;
&lt;li&gt;keep accepted debt visible;&lt;/li&gt;
&lt;li&gt;block &lt;strong&gt;new&lt;/strong&gt; regressions.&lt;/li&gt;
&lt;/ol&gt;
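&lt;p&gt;Conceptually, that model reduces to a set difference over stable finding identities. The sketch below is illustrative only; CodeClone's real baseline format and identity scheme are internal, and the IDs here are hypothetical:&lt;/p&gt;

```python
# Illustrative sketch, not CodeClone's actual implementation.
# The finding IDs below are hypothetical placeholders.
baseline = {"clone:a.py:12", "dead-code:b.py:40"}   # accepted debt
current = {"clone:a.py:12", "clone:c.py:7"}         # findings in this run

new_regressions = current - baseline   # should gate CI
resolved = baseline - current          # debt that was paid down

print(sorted(new_regressions))  # → ['clone:c.py:7']
print(sorted(resolved))         # → ['dead-code:b.py:40']
```

&lt;p&gt;Accepted debt stays visible (it is still in the baseline), while only the new regressions block the merge.&lt;/p&gt;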

&lt;p&gt;That makes the tool much more usable in practice.&lt;/p&gt;

&lt;p&gt;Instead of asking teams to become perfect overnight, it asks a much more realistic question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Did this branch make the codebase worse than the state we already accepted?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is the main reason I describe CodeClone as &lt;strong&gt;baseline-aware&lt;/strong&gt; before I describe it as a clone detector.&lt;/p&gt;
&lt;h2&gt;
  
  
  What changed in 2.0.0b1
&lt;/h2&gt;

&lt;p&gt;Version &lt;code&gt;2.0.0b1&lt;/code&gt; is the point where that model became much more complete.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. A real code-health model
&lt;/h3&gt;

&lt;p&gt;CodeClone now computes a health score from multiple dimensions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;clones&lt;/li&gt;
&lt;li&gt;complexity&lt;/li&gt;
&lt;li&gt;coupling&lt;/li&gt;
&lt;li&gt;cohesion&lt;/li&gt;
&lt;li&gt;dead code&lt;/li&gt;
&lt;li&gt;dependencies&lt;/li&gt;
&lt;li&gt;coverage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I did not want this to become a decorative “AI score.” The point is not the number by itself; the point is whether the score can be traced back to concrete structural reasons.&lt;/p&gt;

&lt;p&gt;That is why the new HTML overview is built around:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a health gauge&lt;/li&gt;
&lt;li&gt;KPI cards&lt;/li&gt;
&lt;li&gt;an executive summary&lt;/li&gt;
&lt;li&gt;source-scope breakdown&lt;/li&gt;
&lt;li&gt;a health profile chart&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal is to answer not only “what failed?” but also “what should I look at first?”&lt;/p&gt;
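&lt;p&gt;To make "traceable" concrete, here is a minimal sketch of a multi-dimensional score as a weighted average. The weights and per-dimension scores are invented for illustration; this is not CodeClone's published formula:&lt;/p&gt;

```python
# Hypothetical weights and per-dimension scores, for illustration only.
weights = {"clones": 0.25, "complexity": 0.20, "coupling": 0.15,
           "cohesion": 0.10, "dead_code": 0.15, "dependencies": 0.10,
           "coverage": 0.05}
scores = {"clones": 40, "complexity": 55, "coupling": 60,
          "cohesion": 70, "dead_code": 30, "dependencies": 80,
          "coverage": 50}

health = sum(weights[d] * scores[d] for d in weights)
print(round(health))  # → 52
```

&lt;p&gt;The useful property is that each dimension's contribution stays inspectable, so a low number points back at concrete structural causes instead of being a decorative score.&lt;/p&gt;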
&lt;h3&gt;
  
  
  2. Baseline became a first-class contract
&lt;/h3&gt;

&lt;p&gt;In &lt;code&gt;2.0.0b1&lt;/code&gt;, baseline handling is no longer just a convenience file.&lt;/p&gt;

&lt;p&gt;It is now a stricter contract with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;trust semantics&lt;/li&gt;
&lt;li&gt;compatibility checks&lt;/li&gt;
&lt;li&gt;integrity fields&lt;/li&gt;
&lt;li&gt;deterministic payload handling&lt;/li&gt;
&lt;li&gt;unified clone + metrics baseline flow&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That matters a lot in CI. If the baseline itself is not trustworthy, the entire gating story becomes shaky.&lt;/p&gt;
&lt;h3&gt;
  
  
  3. Dead code arrived, but with explicit suppressions
&lt;/h3&gt;

&lt;p&gt;Dead-code analysis is now part of the model, but I did not want to solve dynamic Python behavior with magic heuristics.&lt;/p&gt;

&lt;p&gt;So for intentional runtime-driven cases, CodeClone uses explicit inline suppressions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# codeclone: ignore[dead-code]
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle_exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;or:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Middleware&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# codeclone: ignore[dead-code]
&lt;/span&gt;    &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is a deliberate design choice.&lt;/p&gt;

&lt;p&gt;I would rather have a local, visible policy mechanism than silently broaden the detector until it becomes hard to reason&lt;br&gt;
about.&lt;/p&gt;
&lt;h3&gt;
  
  
  4. SARIF was added in 2.0.0b1
&lt;/h3&gt;

&lt;p&gt;This is worth calling out explicitly because I do not want to misrepresent the release: &lt;strong&gt;SARIF is new in 2.0.0b1&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I wanted it to be useful beyond “technically yes, there is a SARIF file.”&lt;/p&gt;

&lt;p&gt;So the current implementation is designed to work better with IDE/code-scanning workflows, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;%SRCROOT%&lt;/code&gt; anchoring&lt;/li&gt;
&lt;li&gt;artifacts&lt;/li&gt;
&lt;li&gt;richer rule metadata&lt;/li&gt;
&lt;li&gt;location alignment&lt;/li&gt;
&lt;li&gt;baseline state for clone results when applicable&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  5. Detection thresholds got more practical
&lt;/h3&gt;

&lt;p&gt;The default detection thresholds are now more sensitive than before.&lt;/p&gt;

&lt;p&gt;That means CodeClone filters out less and analyzes more. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;function-level &lt;code&gt;min_loc&lt;/code&gt; was lowered from &lt;code&gt;15&lt;/code&gt; to &lt;code&gt;10&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;block thresholds were relaxed&lt;/li&gt;
&lt;li&gt;segment thresholds were relaxed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This does increase analysis volume, so it has performance implications. But it also makes the tool more honest. It stops politely ignoring a bunch of small-but-real structural issues.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why this is useful for AI-generated code
&lt;/h2&gt;

&lt;p&gt;I want to be careful here, because “AI code quality” can turn into hand-wavy marketing really fast.&lt;/p&gt;

&lt;p&gt;I am &lt;strong&gt;not&lt;/strong&gt; claiming that CodeClone can detect whether a human or an LLM wrote a piece of code.&lt;/p&gt;

&lt;p&gt;That is not the point.&lt;/p&gt;

&lt;p&gt;The point is simpler:&lt;/p&gt;

&lt;p&gt;AI-assisted development tends to amplify a certain class of structural problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;repeated patterns with small variations&lt;/li&gt;
&lt;li&gt;copy-pasted orchestration logic&lt;/li&gt;
&lt;li&gt;overgrown functions&lt;/li&gt;
&lt;li&gt;dead callback surfaces&lt;/li&gt;
&lt;li&gt;architecture drift that happens in many individually “reasonable” steps&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;CodeClone is a good fit for that environment because it is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;structural rather than stylistic&lt;/li&gt;
&lt;li&gt;deterministic enough for CI&lt;/li&gt;
&lt;li&gt;baseline-aware, so it can focus on regression control&lt;/li&gt;
&lt;li&gt;explicit about suppressions instead of hiding runtime ambiguity behind heuristics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your team ships a lot of AI-assisted code, the practical question is not “is AI bad?” It is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“How do we keep the repository readable, stable, and governable while code is being produced faster?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is the problem I think CodeClone helps with.&lt;/p&gt;
&lt;h2&gt;
  
  
  What it is not
&lt;/h2&gt;

&lt;p&gt;I think first posts do better when they are honest about scope, so here is the short version.&lt;/p&gt;

&lt;p&gt;CodeClone is &lt;strong&gt;not&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a replacement for SonarQube&lt;/li&gt;
&lt;li&gt;a style linter&lt;/li&gt;
&lt;li&gt;a security scanner&lt;/li&gt;
&lt;li&gt;a magic AI-code detector&lt;/li&gt;
&lt;li&gt;a claim that every other tool got the problem wrong&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is a Python-focused, baseline-aware, structural analysis tool with a strong CI orientation.&lt;/p&gt;

&lt;p&gt;And yes, it is still beta.&lt;/p&gt;
&lt;h2&gt;
  
  
  Quick start
&lt;/h2&gt;

&lt;p&gt;If you want to try the prerelease:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--pre&lt;/span&gt; codeclone
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;or:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv tool &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--pre&lt;/span&gt; &lt;span class="nv"&gt;codeclone&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;2.0.0b1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;codeclone &lt;span class="nb"&gt;.&lt;/span&gt;
codeclone &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--html&lt;/span&gt;
codeclone &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--ci&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to adopt the baseline workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;codeclone &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--update-baseline&lt;/span&gt;
codeclone &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--ci&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Where to look next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Docs: &lt;a href="https://orenlab.github.io/codeclone/" rel="noopener noreferrer"&gt;https://orenlab.github.io/codeclone/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Live sample
report: &lt;a href="https://orenlab.github.io/codeclone/examples/report/" rel="noopener noreferrer"&gt;https://orenlab.github.io/codeclone/examples/report/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;PyPI: &lt;a href="https://pypi.org/project/codeclone/" rel="noopener noreferrer"&gt;https://pypi.org/project/codeclone/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/orenlab/codeclone" rel="noopener noreferrer"&gt;https://github.com/orenlab/codeclone&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Closing thought
&lt;/h2&gt;

&lt;p&gt;If I had to summarize &lt;code&gt;CodeClone 2.0.0b1&lt;/code&gt; in one line, it would be this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;It is the point where the project stopped being “just a clone detector” and became a baseline-aware structural quality&lt;br&gt;
tool for Python CI.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is the direction I wanted from the beginning.&lt;/p&gt;

&lt;p&gt;And with AI-assisted development becoming normal, I think tools in this category are becoming more important, not less.&lt;/p&gt;

&lt;p&gt;If this sounds useful, I would be glad to hear what breaks, what feels noisy, what you would want from the CI workflow,&lt;br&gt;
and what kinds of repositories you would actually trust a tool like this on.&lt;/p&gt;

</description>
      <category>python</category>
      <category>opensource</category>
      <category>devops</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
