<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Lex cano</title>
    <description>The latest articles on DEV Community by Lex cano (@lex_cano).</description>
    <link>https://dev.to/lex_cano</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3947481%2F2e01fbcd-337e-4abd-9662-22f6a674fa19.jpeg</url>
      <title>DEV Community: Lex cano</title>
      <link>https://dev.to/lex_cano</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/lex_cano"/>
    <language>en</language>
    <item>
      <title>I made Claude Code refuse to write code unless the ticket scores 80/100</title>
      <dc:creator>Lex cano</dc:creator>
      <pubDate>Sat, 23 May 2026 10:21:43 +0000</pubDate>
      <link>https://dev.to/lex_cano/i-made-claude-code-refuse-to-write-code-unless-the-ticket-scores-80100-45lh</link>
      <guid>https://dev.to/lex_cano/i-made-claude-code-refuse-to-write-code-unless-the-ticket-scores-80100-45lh</guid>
      <description>&lt;p&gt;I've been using Claude Code daily on real projects for several months. It's excellent. It's also a brilliant improviser — and that's the problem.&lt;/p&gt;

&lt;h2&gt;The failure mode&lt;/h2&gt;

&lt;p&gt;The recurring pattern looked like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I'd write a vague ticket. Claude Code would happily start coding.&lt;/li&gt;
&lt;li&gt;It would ship something &lt;em&gt;close&lt;/em&gt; to what I asked, but not quite.&lt;/li&gt;
&lt;li&gt;It would touch files I hadn't anticipated.&lt;/li&gt;
&lt;li&gt;Security and UI review lived in the same head as the implementation — so nothing caught the obvious.&lt;/li&gt;
&lt;li&gt;Each session forgot what the last one learned. The same bugs reappeared in different forms.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The model wasn't wrong. The &lt;em&gt;workflow&lt;/em&gt; was wrong. I was treating a senior pair programmer like a junior who'd guess if I was unclear. The result was a slow drift toward "vibe-coded" software — superficially impressive demos that broke when they met real users.&lt;/p&gt;

&lt;p&gt;So I stopped treating each ticket as a prompt and started treating it as a spec. The methodology that came out is now open-source. It's called &lt;strong&gt;Forgekeel&lt;/strong&gt;, and the core idea is a quality gate I named &lt;strong&gt;KERNEL&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;KERNEL: a scored gate before any code&lt;/h2&gt;

&lt;p&gt;The premise: &lt;strong&gt;no code is written until the ticket passes a gate.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The gate scores every ticket against 6 orthogonal dimensions, 100 points total:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Weight&lt;/th&gt;
&lt;th&gt;Question&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Clarity&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;Is the objective unambiguous in one sentence?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scope&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;Are inclusions and exclusions explicit?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Context&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Enough context (files, deps, prior decisions) to execute without asking?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Risk&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Risks identified and mitigated? (auth, DB, migrations, deletes, PII)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Validation&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Acceptance criteria verifiable and reproducible?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Priority&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Does this advance an active goal or unblock a critical path?&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The score decides what happens:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&amp;lt; 60&lt;/strong&gt; → reject. Return to me with concrete flags per dimension.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;60–79&lt;/strong&gt; → conditional. The &lt;code&gt;architect&lt;/code&gt; subagent (Opus, read-only) drafts a plan before any write action.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;≥ 80&lt;/strong&gt; → execute. The &lt;code&gt;builder&lt;/code&gt; subagent (Sonnet) proceeds directly.&lt;/li&gt;
&lt;/ul&gt;
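
&lt;p&gt;As a rough illustration of the table and thresholds above, here is a minimal TypeScript sketch of the gate. The names &lt;code&gt;WEIGHTS&lt;/code&gt;, &lt;code&gt;totalScore&lt;/code&gt;, and &lt;code&gt;decide&lt;/code&gt; are hypothetical and not taken from the Forgekeel codebase; only the weights and the 60 and 80 cut-offs come from the rubric.&lt;/p&gt;

```typescript
// Hypothetical sketch of the KERNEL gate; names are illustrative, not Forgekeel's API.
type Dimension = "clarity" | "scope" | "context" | "risk" | "validation" | "priority";
type Scores = { [D in Dimension]: number };
type Decision = "reject" | "conditional" | "execute";

// Maximum points per dimension, as listed in the rubric table (they sum to 100).
const WEIGHTS: Scores = {
  clarity: 20,
  scope: 20,
  context: 15,
  risk: 15,
  validation: 15,
  priority: 15,
};

// Sum the six scores, refusing any value outside 0..weight (NaN included).
function totalScore(scores: Scores): number {
  let total = 0;
  for (const dim of Object.keys(WEIGHTS) as Dimension[]) {
    const value = scores[dim];
    if (!(value >= 0) || value > WEIGHTS[dim]) {
      throw new RangeError(`${dim} must be between 0 and ${WEIGHTS[dim]}`);
    }
    total += value;
  }
  return total;
}

// Map a total to the three outcomes: under 60 reject, 60 to 79 conditional, 80 and up execute.
function decide(total: number): Decision {
  if (total >= 80) return "execute";
  if (total >= 60) return "conditional";
  return "reject";
}

// Scores from the rejected "Add password reset email" ticket in this post.
const vague: Scores = { clarity: 15, scope: 10, context: 8, risk: 5, validation: 8, priority: 12 };
console.log(totalScore(vague), decide(totalScore(vague))); // prints: 58 reject
```

&lt;p&gt;Keeping the weights in one object means the range check and the total cannot drift apart if a weight is changed.&lt;/p&gt;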

&lt;p&gt;The 6 dimensions are orthogonal, so a high score in one cannot compensate for a low score in another. A ticket with Clarity 20 and Risk 5 can still reach a passing total, but the low Risk score forces an architect review before anything touches the DB or auth.&lt;/p&gt;
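
&lt;p&gt;One way to picture that override is a per-dimension floor checked alongside the total. The sketch below is an assumption, not Forgekeel's implementation: the post does not state the exact Risk cut-off, so &lt;code&gt;RISK_FLOOR&lt;/code&gt;, &lt;code&gt;decideWithRiskFloor&lt;/code&gt;, and the &lt;code&gt;touchesSensitiveArea&lt;/code&gt; flag are hypothetical.&lt;/p&gt;

```typescript
// Hypothetical sketch: a Risk floor that overrides an otherwise passing total.
type Decision = "reject" | "conditional" | "execute";

// ASSUMPTION: the post gives no exact cut-off; 8 out of 15 is an illustrative value.
const RISK_FLOOR = 8;

function decideWithRiskFloor(
  total: number,
  risk: number,
  touchesSensitiveArea: boolean, // e.g. the ticket involves DB or auth changes
): Decision {
  if (60 > total) return "reject";
  if (touchesSensitiveArea) {
    // A weak Risk score forces an architect plan first, however high the total is.
    if (RISK_FLOOR > risk) return "conditional";
  }
  return total >= 80 ? "execute" : "conditional";
}
```

&lt;p&gt;With these assumed values, a ticket totalling 90 with Risk 5 that touches auth would come back as conditional rather than execute.&lt;/p&gt;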

&lt;p&gt;The score is &lt;strong&gt;not a vanity metric&lt;/strong&gt;. The line that ended up at the top of the rubric file:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A ticket that scores 95 and breaks production is worse than one that scored 65, was flagged, and got a plan first.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;A concrete example&lt;/h2&gt;

&lt;p&gt;Here's one of my own tickets that the gate rejected: &lt;em&gt;"Add password reset email."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;KERNEL pass:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Clarity      15 / 20   "Reset" undefined — link only, or full flow + template?
Scope        10 / 20   No mention of expiry, rate limits, or email provider
Context       8 / 15   Doesn't say which auth provider or where templates live
Risk          5 / 15   Auth flow + email = high risk, not addressed
Validation    8 / 15   "Make it work" isn't testable
Priority     12 / 15   Active KPI
                       ─────
                       58 / 100   → REJECT
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Forgekeel refused to touch code. I rewrote the ticket as I would have if a junior asked me to spec it properly:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Clarity      19 / 20   Add `/reset-password` flow: email link with 30-min expiry token
Scope        18 / 20   This ticket: link email only. Excluded: SMS, recovery codes
Context      14 / 15   Supabase Auth `resetPasswordForEmail`. Template in /emails/
Risk         13 / 15   Rate-limit 3/h per email. No token leak in logs. RLS audit
Validation   13 / 15   Cypress: request reset, click link, set new password, login
Priority     13 / 15   Unblocks login retention KPI
                       ─────
                       90 / 100   → EXECUTE
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same problem, two tickets, two outcomes. The first version would have shipped &lt;em&gt;something&lt;/em&gt;. Probably without rate limits, without auditing whether the token could leak into logs, with "manual testing" as the validation step. The second version forces every gap shut before the IDE opens.&lt;/p&gt;

&lt;p&gt;This is what KERNEL really is: &lt;strong&gt;the difference between debugging in your editor and debugging in production.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;Around the gate&lt;/h2&gt;

&lt;p&gt;KERNEL is the most distinctive piece, but the methodology has more:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;7 specialized subagents&lt;/strong&gt; with enforced read-only vs write roles. &lt;code&gt;architect&lt;/code&gt;, &lt;code&gt;security-auditor&lt;/code&gt;, and &lt;code&gt;ui-reviewer&lt;/code&gt; &lt;em&gt;never modify files&lt;/em&gt; — their job is to think, not type. &lt;code&gt;builder&lt;/code&gt;, &lt;code&gt;tester&lt;/code&gt;, and &lt;code&gt;designer&lt;/code&gt; can write, but only after the read-only review.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A per-project constitution&lt;/strong&gt; declaring stack, non-negotiable principles, design tokens, allowed/forbidden MCPs. Every agent reads it before acting. Without it, &lt;code&gt;/design-iterate&lt;/code&gt; refuses to run because the &lt;code&gt;designer&lt;/code&gt; would hallucinate values.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A learnings loop.&lt;/strong&gt; Every closed ticket appends "what went wrong / how it was resolved / what not to repeat" to a &lt;code&gt;learnings.md&lt;/code&gt; file. By ticket 20, the project has a written history of bugs it should never re-introduce. Future sessions read it as part of context.&lt;/li&gt;
&lt;/ul&gt;
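
&lt;p&gt;The read-only versus write split in the first bullet can be sketched as a small permission check. This is an illustration under assumptions, not Forgekeel's actual configuration: &lt;code&gt;AGENT_MODES&lt;/code&gt; and &lt;code&gt;mayWrite&lt;/code&gt; are hypothetical names, and only the six agents named above are listed.&lt;/p&gt;

```typescript
// Hypothetical sketch of read-only vs write roles; not Forgekeel's real config format.
type Mode = "read-only" | "write";

// Agent names come from the post; the shape of this mapping is an assumption.
const AGENT_MODES: { [agent: string]: Mode } = {
  architect: "read-only",
  "security-auditor": "read-only",
  "ui-reviewer": "read-only",
  builder: "write",
  tester: "write",
  designer: "write",
};

// A write is allowed only for a write-role agent, and only after the read-only review is done.
function mayWrite(agent: string, reviewDone: boolean): boolean {
  if (!Object.prototype.hasOwnProperty.call(AGENT_MODES, agent)) {
    throw new Error(`unknown agent: ${agent}`);
  }
  if (AGENT_MODES[agent] === "read-only") return false;
  return reviewDone;
}

console.log(mayWrite("architect", true)); // prints: false
console.log(mayWrite("builder", true)); // prints: true
```

&lt;p&gt;Failing loudly on an unknown agent name keeps a typo from silently granting or denying write access.&lt;/p&gt;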

&lt;p&gt;Full breakdown in the &lt;a href="https://github.com/forgekeel/forgekeel" rel="noopener noreferrer"&gt;README&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Why it's stack-locked&lt;/h2&gt;

&lt;p&gt;Forgekeel is opinionated and stack-locked to &lt;strong&gt;Next.js + Supabase + TypeScript + Tailwind v4 + shadcn/ui + pnpm&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That's deliberate. The agents reference specific patterns (Server Actions, RLS on every table, Tailwind &lt;code&gt;@theme&lt;/code&gt; tokens). If I made them stack-agnostic, the methodology would become advice instead of execution. Adapting to a different stack means editing the agents directly — there's no abstraction layer planned.&lt;/p&gt;

&lt;h2&gt;What I'm looking for&lt;/h2&gt;

&lt;p&gt;Forgekeel is MIT-licensed and at v0.1.0. It was used internally on real projects before this release.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Repo:&lt;/strong&gt; &lt;a href="https://github.com/forgekeel/forgekeel" rel="noopener noreferrer"&gt;https://github.com/forgekeel/forgekeel&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;npm:&lt;/strong&gt; &lt;a href="https://www.npmjs.com/package/forgekeel" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/forgekeel&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'd genuinely value feedback on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The KERNEL rubric — would you weight the dimensions differently? Blind spots in the 6?&lt;/li&gt;
&lt;li&gt;Anyone running similar structured workflows on Claude Code — what worked, what didn't?&lt;/li&gt;
&lt;li&gt;The subagent setup — what's missing for your daily loop?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If KERNEL stops one ticket from shipping broken to your users, my week was worth it.&lt;/p&gt;

</description>
      <category>showdev</category>
      <category>ai</category>
      <category>claude</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
