<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Alan West</title>
    <description>The latest articles on DEV Community by Alan West (@alanwest).</description>
    <link>https://dev.to/alanwest</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3834047%2F6413d0cf-9d90-4ccc-80a9-123656fd78ba.png</url>
      <title>DEV Community: Alan West</title>
      <link>https://dev.to/alanwest</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alanwest"/>
    <language>en</language>
    <item>
      <title>Auth0 vs Clerk vs Authon: Picking Auth for Your Vibe-Coded Project</title>
      <dc:creator>Alan West</dc:creator>
      <pubDate>Wed, 29 Apr 2026 03:53:17 +0000</pubDate>
      <link>https://dev.to/alanwest/auth0-vs-clerk-vs-authon-picking-auth-for-your-vibe-coded-project-1jb1</link>
      <guid>https://dev.to/alanwest/auth0-vs-clerk-vs-authon-picking-auth-for-your-vibe-coded-project-1jb1</guid>
      <description>&lt;p&gt;If you've spent any time on r/vibecoding lately, you've seen the pattern. Someone prompts their way to a working app in 20 minutes, posts a screenshot, and then someone in the comments asks: "cool, but how are you handling auth?"&lt;/p&gt;

&lt;p&gt;Silence.&lt;/p&gt;

&lt;p&gt;Authentication is where vibe coding hits a wall. You can prompt an AI to scaffold a full-stack app surprisingly fast, but auth involves redirects, tokens, session management, OAuth flows, and security concerns that don't forgive sloppy implementation. This is exactly where a managed auth service earns its keep.&lt;/p&gt;

&lt;p&gt;I've been shipping side projects and client work using AI-assisted coding for months now, and I've tried three major auth providers in that workflow: &lt;strong&gt;Auth0&lt;/strong&gt;, &lt;strong&gt;Clerk&lt;/strong&gt;, and &lt;strong&gt;Authon&lt;/strong&gt;. Here's how they actually compare when your development style leans heavily on AI code generation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Auth Matters More in Vibe Coding
&lt;/h2&gt;

&lt;p&gt;When you're prompting your way through a project, the AI generates code fast. But auth code that &lt;em&gt;looks&lt;/em&gt; right can be dangerously wrong. A subtle mistake in token validation or session handling won't throw an error — it'll just leave your app wide open.&lt;/p&gt;

&lt;p&gt;Managed auth services take that risk off the table. You integrate their SDK, call their APIs, and the security-critical stuff happens on their end. The question is which one fits your workflow best.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Contenders at a Glance
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Auth0&lt;/th&gt;
&lt;th&gt;Clerk&lt;/th&gt;
&lt;th&gt;Authon&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Type&lt;/td&gt;
&lt;td&gt;Hosted&lt;/td&gt;
&lt;td&gt;Hosted&lt;/td&gt;
&lt;td&gt;Hosted&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Free tier&lt;/td&gt;
&lt;td&gt;7,500 MAU&lt;/td&gt;
&lt;td&gt;10,000 MAU&lt;/td&gt;
&lt;td&gt;Unlimited users&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OAuth providers&lt;/td&gt;
&lt;td&gt;30+&lt;/td&gt;
&lt;td&gt;20+&lt;/td&gt;
&lt;td&gt;10+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SDKs&lt;/td&gt;
&lt;td&gt;Many (official + community)&lt;/td&gt;
&lt;td&gt;~10 (React-focused)&lt;/td&gt;
&lt;td&gt;15 across 6 languages&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSO (SAML/LDAP)&lt;/td&gt;
&lt;td&gt;Yes (paid)&lt;/td&gt;
&lt;td&gt;Yes (paid)&lt;/td&gt;
&lt;td&gt;Planned, not yet available&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Custom domains&lt;/td&gt;
&lt;td&gt;Yes (paid)&lt;/td&gt;
&lt;td&gt;Yes (paid)&lt;/td&gt;
&lt;td&gt;Planned, not yet available&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pricing model&lt;/td&gt;
&lt;td&gt;Per-user tiers&lt;/td&gt;
&lt;td&gt;Per-user tiers&lt;/td&gt;
&lt;td&gt;No per-user pricing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Auth0: The Enterprise Workhorse
&lt;/h2&gt;

&lt;p&gt;Auth0 has been around forever in auth-service years. It's battle-tested, extremely configurable, and has documentation for practically every edge case.&lt;/p&gt;

&lt;p&gt;Here's what a basic Auth0 setup looks like in a Next.js app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/api/auth/[...auth0]/route.js&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;handleAuth&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@auth0/nextjs-auth0&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// This single line handles login, logout, callback, and profile routes&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;GET&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;handleAuth&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/layout.js&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;UserProvider&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@auth0/nextjs-auth0/client&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;RootLayout&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;children&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;UserProvider&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;children&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/UserProvider&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;      &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/body&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/html&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The good: Auth0 handles nearly any auth scenario you can think of. Machine-to-machine tokens, multi-tenant setups, custom social connections — it's all there.&lt;/p&gt;

&lt;p&gt;The bad: The dashboard is sprawling. Configuration lives across multiple screens and concepts (tenants, applications, APIs, rules, actions, hooks). When you're vibe coding and want to move fast, Auth0's setup can feel like navigating a government website. The pricing also scales with users, which gets expensive quickly once you pass the free tier.&lt;/p&gt;

&lt;h2&gt;
  
  
  Clerk: The Developer Experience Play
&lt;/h2&gt;

&lt;p&gt;Clerk is newer and designed specifically for the React ecosystem. Their pre-built components are genuinely nice — drop in a &lt;code&gt;&amp;lt;SignIn /&amp;gt;&lt;/code&gt; component and you get a polished auth UI immediately.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/sign-in/[[...sign-in]]/page.jsx&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;SignIn&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@clerk/nextjs&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;SignInPage&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"flex justify-center items-center min-h-screen"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="cm"&gt;/* Clerk handles the entire UI — no form code needed */&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;SignIn&lt;/span&gt; &lt;span class="na"&gt;afterSignInUrl&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"/dashboard"&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// middleware.js&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;clerkMiddleware&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@clerk/nextjs/server&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Protects all routes by default, configure public routes separately&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;clerkMiddleware&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;matcher&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/((?!.*&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s1"&gt;..*|_next).*)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/(api|trpc)(.*)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The good: Clerk's DX is excellent, especially for Next.js. The components look polished out of the box, and the middleware approach to route protection is clean. AI tools tend to generate Clerk integration code pretty reliably because there are tons of examples in training data.&lt;/p&gt;

&lt;p&gt;The bad: It's very React/Next.js-centric. If your vibe-coded project ends up being a Python API or a Go service, Clerk's SDK support thins out. Pricing is also per-user, which can be unpredictable for a side project that unexpectedly gets traction.&lt;/p&gt;

&lt;h2&gt;
  
  
  Authon: The Newer Alternative
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://authon.dev" rel="noopener noreferrer"&gt;Authon&lt;/a&gt; is a hosted auth service that takes a different approach to pricing — there's no per-user cost. Their free plan includes unlimited users, which is honestly refreshing when you're prototyping and don't want to worry about a surprise bill.&lt;/p&gt;

&lt;p&gt;They offer 15 SDKs across 6 languages, which is solid coverage. The SDK design is intended to be familiar if you've used Clerk or Auth0 before, aiming for compatibility with those patterns.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Example Authon setup in a Node.js/Express app&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;AuthonClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@authon/node&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;authon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AuthonClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;projectId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AUTHON_PROJECT_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AUTHON_SECRET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Middleware to protect routes&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/protected&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;authon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;verifySession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;authorization&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;valid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Unauthorized&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The good: Unlimited users on the free tier is a real differentiator. Having 15 SDKs across 6 languages means you're less likely to hit a wall if your project isn't JavaScript-only. If you're vibe coding across different stacks — maybe a React frontend today, a Python backend tomorrow — that breadth matters.&lt;/p&gt;

&lt;p&gt;The tradeoffs: Authon currently supports 10+ OAuth providers, which covers the major ones but is less than what Auth0 or Clerk offer. SSO via SAML/LDAP and custom domains are both planned but not available yet, so if you need enterprise SSO today, this isn't your option. It's also newer, which means smaller community, fewer Stack Overflow answers, and — importantly for vibe coding — potentially less representation in AI training data.&lt;/p&gt;

&lt;h2&gt;
  
  
  So Which One Do You Pick?
&lt;/h2&gt;

&lt;p&gt;Here's my honest take after using all three in AI-assisted workflows:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose Auth0&lt;/strong&gt; if you're building something that needs enterprise features &lt;em&gt;now&lt;/em&gt; — SSO, advanced RBAC, compliance certifications. The complexity is the price of completeness. Just budget for it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose Clerk&lt;/strong&gt; if you're building a Next.js or React app and developer experience is your top priority. The pre-built components alone save hours of UI work, and AI tools generate Clerk code reliably.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choose Authon&lt;/strong&gt; if you're cost-sensitive (that unlimited free tier is not a gimmick), working across multiple languages, or you want to avoid per-user pricing as you scale. Just be aware of the current feature gaps around SSO and custom domains.&lt;/p&gt;

&lt;h3&gt;
  
  
  A Note on Vibe Coding and Auth Security
&lt;/h3&gt;

&lt;p&gt;Whichever service you pick, please don't just paste the AI-generated auth code and ship it. Take ten minutes to actually read through the integration. Check that tokens are validated server-side, that routes are actually protected, and that you're not accidentally exposing user data in client-side state.&lt;/p&gt;

&lt;p&gt;Auth is the one part of your vibe-coded app where "it works on my machine" truly isn't good enough. Use a managed service, read their security docs, and test the unhappy paths — what happens when a token expires, when a session is revoked, when someone hits your API without credentials.&lt;/p&gt;

&lt;p&gt;The best part about all three of these services is that they make it really hard to mess up the crypto. The rest is on you.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>security</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Why Your LLM App Fails in Production (and How to Debug It)</title>
      <dc:creator>Alan West</dc:creator>
      <pubDate>Wed, 29 Apr 2026 02:23:16 +0000</pubDate>
      <link>https://dev.to/alanwest/why-your-llm-app-fails-in-production-and-how-to-debug-it-3mio</link>
      <guid>https://dev.to/alanwest/why-your-llm-app-fails-in-production-and-how-to-debug-it-3mio</guid>
      <description>&lt;p&gt;You shipped your LLM-powered feature. It worked great in testing. Then users started reporting hallucinations, inconsistent outputs, and responses that completely ignored their instructions. Sound familiar?&lt;/p&gt;

&lt;p&gt;I've been there. Three times in the last year alone. The problem isn't that LLMs are unreliable — it's that most of us are flying blind once our AI features hit production. We don't have the observability, evaluation, or guardrail infrastructure that we'd never skip for a traditional backend service.&lt;/p&gt;

&lt;p&gt;Let me walk through how I stopped guessing and started actually debugging my LLM applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Root Cause: You Can't Debug What You Can't See
&lt;/h2&gt;

&lt;p&gt;With a traditional API, debugging is straightforward. You check logs, look at status codes, trace the request through your system. With LLM applications, the failure mode is completely different.&lt;/p&gt;

&lt;p&gt;Your API returned a 200. The response was valid JSON. The model even sounded confident. But the answer was wrong, or it leaked context from another user's session, or it ignored a critical instruction in your system prompt.&lt;/p&gt;

&lt;p&gt;The root causes usually fall into three buckets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prompt drift&lt;/strong&gt; — your prompts work for your test cases but fail on real-world input patterns you didn't anticipate&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context window mismanagement&lt;/strong&gt; — you're stuffing too much (or too little) context, and the model loses track of what matters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Missing guardrails&lt;/strong&gt; — there's no validation layer between the model's output and your user&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The fix isn't just "write better prompts." It's building proper infrastructure around your LLM calls.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Add Tracing to Every LLM Interaction
&lt;/h2&gt;

&lt;p&gt;Before you can fix anything, you need visibility. Every call to an LLM should be traced — the full prompt, the response, latency, token counts, and any metadata about the user's session.&lt;/p&gt;

&lt;p&gt;Here's the pattern I use with OpenAI's Python client, wrapping calls with trace context:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;

&lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_tracing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;traced_completion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;trace_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trace_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Log the full request for later analysis
&lt;/span&gt;    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trace_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;trace_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_request&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;params&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;
    &lt;span class="p"&gt;}))&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;duration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;

    &lt;span class="c1"&gt;# Log the response alongside the request trace
&lt;/span&gt;    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trace_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;trace_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llm_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;duration_ms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;duration&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tokens_used&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_tokens&lt;/span&gt;
    &lt;span class="p"&gt;}))&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trace_id&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the bare minimum. In production, you want this data flowing into something queryable — not just log files.&lt;/p&gt;

&lt;p&gt;Open-source platforms like &lt;a href="https://github.com/future-agi/future-agi" rel="noopener noreferrer"&gt;FutureAGI&lt;/a&gt; provide end-to-end tracing, evaluation, and guardrail infrastructure specifically for this problem. It's Apache 2.0 licensed and self-hostable, which matters if you're dealing with sensitive data. But even if you roll your own, the principle is the same: &lt;strong&gt;trace everything&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Build Evaluation Pipelines, Not Just Unit Tests
&lt;/h2&gt;

&lt;p&gt;Here's where most teams get stuck. You can't unit test an LLM the way you test a function. The output is non-deterministic. So what do you do?&lt;/p&gt;

&lt;p&gt;You build eval pipelines. The idea is simple: maintain a dataset of input-output pairs that represent what "good" looks like, and continuously run your prompts against them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eval_dataset_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt_template&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eval_dataset_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;case&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt_template&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;case&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trace_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;traced_completion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Score using your criteria — could be exact match,
&lt;/span&gt;        &lt;span class="c1"&gt;# semantic similarity, or even another LLM as judge
&lt;/span&gt;        &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;evaluate_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;case&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;expected_output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;criteria&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;case&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;criteria&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;accuracy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;case&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;expected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;case&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;expected_output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;actual&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;trace_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;trace_id&lt;/span&gt;  &lt;span class="c1"&gt;# link back to the full trace
&lt;/span&gt;        &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;passing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Eval results: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;passing&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; passing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key insight: your eval dataset should grow over time. Every production failure you catch? Add it to the dataset. Every edge case a user reports? That's a new eval case. After a few months, you'll have a regression suite that actually reflects how your app is used.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Add Output Guardrails
&lt;/h2&gt;

&lt;p&gt;Tracing tells you what happened. Evals tell you if things are getting worse. But guardrails prevent bad outputs from reaching users in the first place.&lt;/p&gt;

&lt;p&gt;I keep my guardrails simple and composable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;GuardrailPipeline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;checks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add_check&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_fn&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;checks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_fn&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;failures&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_fn&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;checks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;passed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reason&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;check_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;passed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;failures&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;check&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;failures&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;failures&lt;/span&gt;

&lt;span class="c1"&gt;# Example checks
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;no_pii_leak&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Catch common PII patterns in the output&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;
    &lt;span class="n"&gt;patterns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;\b\d{3}-\d{2}-\d{4}\b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# SSN
&lt;/span&gt;        &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;\b\d{16}\b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;               &lt;span class="c1"&gt;# credit card (simplified)
&lt;/span&gt;    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pattern&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;patterns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PII pattern detected: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;response_not_empty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Empty response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

&lt;span class="c1"&gt;# Wire it up
&lt;/span&gt;&lt;span class="n"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GuardrailPipeline&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;pipeline&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_check&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;no_pii&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;no_pii_leak&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;pipeline&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_check&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;not_empty&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response_not_empty&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run this between your LLM call and your user-facing response. When a check fails, you can retry with a modified prompt, return a fallback response, or flag it for human review.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Close the Feedback Loop
&lt;/h2&gt;

&lt;p&gt;The most important step is one that most teams skip: feeding production observations back into your eval datasets and prompt iterations.&lt;/p&gt;

&lt;p&gt;Here's the workflow that actually works:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Trace&lt;/strong&gt; every LLM call in production&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flag&lt;/strong&gt; low-quality responses (via guardrails, user feedback, or automated scoring)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add&lt;/strong&gt; flagged cases to your eval dataset&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Iterate&lt;/strong&gt; on prompts using the eval pipeline&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploy&lt;/strong&gt; and repeat&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This isn't a one-time setup. It's a continuous loop, and it's the difference between an LLM feature that degrades over time and one that gets better.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prevention: What I Do on Every New LLM Project Now
&lt;/h2&gt;

&lt;p&gt;After getting burned enough times, here's my checklist before any LLM feature goes to production:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tracing from day one&lt;/strong&gt; — not after the first incident&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;At least 50 eval cases&lt;/strong&gt; before launch, covering happy paths and known edge cases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output guardrails&lt;/strong&gt; for PII, format validation, and content policy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A fallback strategy&lt;/strong&gt; — what happens when the model fails? Users should see a graceful degradation, not a hallucination&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost monitoring&lt;/strong&gt; — runaway token usage can wreck your budget overnight&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want a batteries-included solution rather than building all this from scratch, check out &lt;a href="https://github.com/future-agi/future-agi" rel="noopener noreferrer"&gt;FutureAGI&lt;/a&gt;. It bundles tracing, evals, simulations, datasets management, a gateway, and guardrails into a single self-hostable platform under an Apache 2.0 license. It's the kind of thing I wish existed when I first started shipping LLM features.&lt;/p&gt;

&lt;p&gt;But whether you use a platform or build your own, the principle is the same: treat your LLM like any other critical system dependency. Observe it, test it, and put safety nets around it. Your users — and your on-call rotation — will thank you.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>ai</category>
      <category>python</category>
      <category>observability</category>
    </item>
    <item>
      <title>Why Local LLMs Keep Failing at Code Generation (and How to Fix It)</title>
      <dc:creator>Alan West</dc:creator>
      <pubDate>Wed, 29 Apr 2026 01:00:59 +0000</pubDate>
      <link>https://dev.to/alanwest/why-local-llms-keep-failing-at-code-generation-and-how-to-fix-it-275</link>
      <guid>https://dev.to/alanwest/why-local-llms-keep-failing-at-code-generation-and-how-to-fix-it-275</guid>
      <description>&lt;p&gt;You finally got that 34B parameter model running on your beefy GPU. You feed it a prompt. It confidently writes a function that looks perfect — until you realize it's calling an API that literally doesn't exist. Sound familiar?&lt;/p&gt;

&lt;p&gt;I spent the better part of three months trying to make local LLMs my primary coding assistant. I wanted the privacy, the zero-cost inference, the offline capability. What I got was a masterclass in debugging AI-generated hallucinations. But I also figured out what actually works, and more importantly, &lt;em&gt;why&lt;/em&gt; local models struggle with code in ways that aren't immediately obvious.&lt;/p&gt;

&lt;p&gt;Let's break this down.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Root Cause: It's Not Just "Model Size"
&lt;/h2&gt;

&lt;p&gt;The knee-jerk explanation is "local models are too small." That's part of it, but it misses the real problem. Code generation fails locally for three interconnected reasons:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Quantization destroys code precision.&lt;/strong&gt; When you squish a 70B model down to 4-bit quantization so it fits in your 24GB of VRAM, you're losing fidelity in the exact places that matter for code. Natural language is forgiving — swap a synonym and meaning is preserved. Code isn't. A single wrong token means a &lt;code&gt;TypeError&lt;/code&gt; or a function that doesn't exist.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Context window limits kill real-world usefulness.&lt;/strong&gt; Most local setups give you 4K-8K context reliably. Some models advertise 32K or 128K, but actual performance degrades badly in the upper ranges when running quantized on consumer hardware. Real coding tasks — refactoring a module, understanding how a service connects to three others — need a lot of context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Training data gaps compound everything.&lt;/strong&gt; Smaller models have seen fewer code examples, fewer Stack Overflow answers, fewer GitHub repos. They're especially weak on newer frameworks, niche libraries, and language-specific idioms that larger training runs would catch.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Pick the Right Model for Code (Not the Biggest One)
&lt;/h2&gt;

&lt;p&gt;Not all models are equal for code tasks. A general-purpose 70B chat model will often perform &lt;em&gt;worse&lt;/em&gt; at code than a specialized 7B-15B code model. Here's what to look for:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# My current local model selection criteria&lt;/span&gt;
&lt;span class="na"&gt;priority_order&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Code-specialized training (not just general chat)&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Native context length (not extended via RoPE hacks)&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Quantization headroom (a 15B at Q6_K &amp;gt; a 70B at Q3_K_M)&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Instruction-tuned for code completion AND chat&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Models fine-tuned specifically on code datasets — things like CodeLlama variants, DeepSeek-Coder, or StarCoder-based models — punch way above their parameter count. A 7B code-specialized model will often outperform a general-purpose 13B model on function generation, bug fixing, and code explanation.&lt;/p&gt;

&lt;p&gt;Check the model card for what it was trained on. If the training data section doesn't specifically mention code corpora, keep looking.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Fix Your Quantization Strategy
&lt;/h2&gt;

&lt;p&gt;This is where most people silently lose quality. The default advice of "just use Q4_K_M" is fine for chatting about philosophy. It's not fine when a single wrong token breaks your build.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Instead of this (common default):&lt;/span&gt;
llama-server &lt;span class="nt"&gt;-m&lt;/span&gt; codellama-34b.Q4_K_M.gguf &lt;span class="nt"&gt;-c&lt;/span&gt; 4096

&lt;span class="c"&gt;# Try a smaller model at higher quantization:&lt;/span&gt;
llama-server &lt;span class="nt"&gt;-m&lt;/span&gt; deepseek-coder-v2-lite.Q6_K.gguf &lt;span class="nt"&gt;-c&lt;/span&gt; 8192 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--n-gpu-layers&lt;/span&gt; 35  &lt;span class="c"&gt;# offload as many layers to GPU as fit&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tradeoff math is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Q6_K or Q8_0&lt;/strong&gt; on a smaller model = precise token prediction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Q3_K or Q4_K&lt;/strong&gt; on a bigger model = more knowledge, fuzzier output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For code, precision wins. I'd rather have a model that correctly generates the 15 most common patterns than one that &lt;em&gt;almost&lt;/em&gt; gets 50 patterns right.&lt;/p&gt;

&lt;p&gt;Test this yourself. Take the same prompt, run it against a 34B-Q4 and a 15B-Q6 five times each. Count the outputs that run without modification. I'll bet the smaller, higher-quant model wins.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Engineer Your Prompts Like You Mean It
&lt;/h2&gt;

&lt;p&gt;Local models are way more sensitive to prompt quality than the big cloud APIs. A lazy prompt that works fine with a 400B+ parameter model will crash and burn locally.&lt;/p&gt;

&lt;p&gt;What works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# BAD prompt for local models:
&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write a function to parse CSV files&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# GOOD prompt for local models:
&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Write a Python 3.11 function that:
- Takes a file path (str) as input
- Reads a CSV file using the csv module from stdlib
- Returns a list of dictionaries where keys are column headers
- Handles the case where the file doesn&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t exist (raise FileNotFoundError)
- Do NOT use pandas
- Include type hints

Function signature: def parse_csv(filepath: str) -&amp;gt; list[dict[str, str]]:
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The difference is night and day. Key principles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Specify the language and version.&lt;/strong&gt; Don't let the model guess.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Name the libraries&lt;/strong&gt; (or explicitly exclude them). Local models love to import packages that don't exist or mix up APIs across libraries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Provide the function signature.&lt;/strong&gt; This constrains the output and reduces hallucination.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Be explicit about error handling.&lt;/strong&gt; Otherwise you'll get either nothing or a seven-layer try/except lasagna.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 4: Use Fill-in-the-Middle, Not Chat
&lt;/h2&gt;

&lt;p&gt;Here's a trick that dramatically improved my local code generation quality. Stop using chat mode for inline coding tasks. Most code-specialized models support Fill-in-the-Middle (FIM) — you give them a prefix and suffix, and they generate what goes between.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# FIM format (model-specific, check docs):
&lt;/span&gt;&lt;span class="n"&gt;prefix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;def calculate_tax(income: float, rate: float) -&amp;gt; float:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Calculate tax with standard deduction.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;    &lt;/span&gt;&lt;span class="sh"&gt;""""&lt;/span&gt;
&lt;span class="n"&gt;suffix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;    return round(tax, 2)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# The model fills in the middle — constrained by both sides
# This produces FAR more accurate code than open-ended chat generation
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;FIM works because it constrains the model's output on &lt;em&gt;both&lt;/em&gt; ends. The model can't hallucinate a wildly different function signature or return type because the suffix already defines the boundary. For autocomplete-style coding — which is honestly 70% of what you want a coding assistant for — FIM with a local model is genuinely competitive.&lt;/p&gt;

&lt;p&gt;Most editor integrations (Continue, llama.vim, Tabby) support FIM natively. Use them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Set Up a Validation Pipeline
&lt;/h2&gt;

&lt;p&gt;Here's the uncomfortable truth: even with all the above, local models will still generate broken code sometimes. The fix is to stop trusting and start verifying.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# save as: validate_generated.sh&lt;/span&gt;
&lt;span class="c"&gt;# Run generated code through basic checks before accepting&lt;/span&gt;

&lt;span class="nv"&gt;FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;

&lt;span class="c"&gt;# Syntax check&lt;/span&gt;
python3 &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"import ast; ast.parse(open('&lt;/span&gt;&lt;span class="nv"&gt;$FILE&lt;/span&gt;&lt;span class="s2"&gt;').read())"&lt;/span&gt; 2&amp;gt;&amp;amp;1
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nv"&gt;$?&lt;/span&gt; &lt;span class="nt"&gt;-ne&lt;/span&gt; 0 &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"FAIL: Syntax error in generated code"&lt;/span&gt;
    &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="c"&gt;# Type check with mypy (fast, catches hallucinated APIs)&lt;/span&gt;
python3 &lt;span class="nt"&gt;-m&lt;/span&gt; mypy &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--ignore-missing-imports&lt;/span&gt; 2&amp;gt;&amp;amp;1
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nv"&gt;$?&lt;/span&gt; &lt;span class="nt"&gt;-ne&lt;/span&gt; 0 &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"WARN: Type errors detected"&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;

&lt;span class="c"&gt;# Run any existing tests that touch the modified module&lt;/span&gt;
python3 &lt;span class="nt"&gt;-m&lt;/span&gt; pytest tests/ &lt;span class="nt"&gt;-x&lt;/span&gt; &lt;span class="nt"&gt;--timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;10 2&amp;gt;&amp;amp;1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I run something like this automatically in my editor whenever I accept a generated code block. It catches the most common local LLM failure — confidently calling functions or methods that don't exist on an object. mypy is particularly good at catching these.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Bail: Know the Limits
&lt;/h2&gt;

&lt;p&gt;Even with all these fixes, local LLMs have hard limits you should respect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-file refactoring&lt;/strong&gt;: Needs too much context. Don't even try.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debugging complex runtime errors&lt;/strong&gt;: The model needs to understand state, call stacks, and timing. It won't.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anything requiring up-to-date API knowledge&lt;/strong&gt;: If the library was updated after the model's training cutoff, you'll get plausible-looking code for an API version that no longer exists.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generating tests for complex business logic&lt;/strong&gt;: The model doesn't understand your domain. It'll write tests that pass but test nothing meaningful.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For these tasks, you're better off using local models for smaller subtasks — generate a single function, write a type definition, convert a data structure — and doing the architectural thinking yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Honest Summary
&lt;/h2&gt;

&lt;p&gt;Local LLMs for coding aren't useless. They're just unforgiving. You can't use them the way you'd use a cloud model — fire off a vague prompt and get back working code. You need to pick specialized models, preserve quantization quality, write precise prompts, use FIM for inline completion, and validate everything.&lt;/p&gt;

&lt;p&gt;Is it more work? Yeah. But you get offline capability, complete privacy, zero API costs, and — once you dial it in — a surprisingly capable coding assistant that runs on hardware you already own.&lt;/p&gt;

&lt;p&gt;The trick is matching the tool to the task. Use local models for the 80% of coding work that's pattern-matching and boilerplate. Keep your judgment for the 20% that requires actual understanding.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>ai</category>
      <category>codegen</category>
      <category>productivity</category>
    </item>
    <item>
      <title>How to Migrate Your Open-Source Project Away from GitHub</title>
      <dc:creator>Alan West</dc:creator>
      <pubDate>Wed, 29 Apr 2026 00:43:50 +0000</pubDate>
      <link>https://dev.to/alanwest/how-to-migrate-your-open-source-project-away-from-github-10h0</link>
      <guid>https://dev.to/alanwest/how-to-migrate-your-open-source-project-away-from-github-10h0</guid>
      <description>&lt;p&gt;So you've probably seen the news — Ghostty, the terminal emulator created by Mitchell Hashimoto (yes, the HashiCorp co-founder), is leaving GitHub. And if you've been following the broader conversation in the open-source community, Ghostty isn't alone. More projects are questioning whether GitHub is still the right home for their code.&lt;/p&gt;

&lt;p&gt;This got me thinking about a practical problem: what does it actually take to migrate a project off GitHub without losing contributors, breaking workflows, or creating chaos?&lt;/p&gt;

&lt;p&gt;I've helped move two mid-sized projects between forges, and let me tell you — the code is the easy part. Everything else is where it gets messy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Projects Are Moving
&lt;/h2&gt;

&lt;p&gt;Before we dig into the how, let's talk about the why. The reasons I keep hearing (and have experienced) boil down to a few themes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Platform lock-in&lt;/strong&gt; — GitHub Actions, GitHub Packages, Dependabot, code search... the more features you use, the harder it becomes to leave. That's not accidental.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Governance concerns&lt;/strong&gt; — When your project's entire infrastructure depends on a single company, you're at the mercy of their decisions. Policy changes, AI training on public repos, terms of service shifts — you don't get a vote.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Centralization risk&lt;/strong&gt; — A huge chunk of the world's open-source code lives on one platform. That's a single point of failure for the entire ecosystem.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feature gaps for specific workflows&lt;/strong&gt; — GitHub's issue tracker, PR review process, and CI system are good enough for most, but some projects outgrow them.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Mitchell Hashimoto wrote extensively about Ghostty's specific reasons in his blog post at mitchellh.com. Whatever your motivation, the migration process is largely the same.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Audit Your GitHub Dependencies
&lt;/h2&gt;

&lt;p&gt;Before you touch anything, figure out how deeply you're entangled with GitHub. Run through this checklist:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Find all references to GitHub-specific features in your repo&lt;/span&gt;
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s2"&gt;"github.com"&lt;/span&gt; .github/ &lt;span class="nt"&gt;--include&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"*.yml"&lt;/span&gt; &lt;span class="nt"&gt;--include&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"*.yaml"&lt;/span&gt;
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s2"&gt;"actions/"&lt;/span&gt; .github/workflows/
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s2"&gt;"GITHUB_TOKEN"&lt;/span&gt; .github/
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s2"&gt;"github.event"&lt;/span&gt; .github/workflows/

&lt;span class="c"&gt;# Check for GitHub-specific bot configs&lt;/span&gt;
&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-la&lt;/span&gt; .github/
&lt;span class="c"&gt;# Look for: dependabot.yml, CODEOWNERS, FUNDING.yml,&lt;/span&gt;
&lt;span class="c"&gt;# stale.yml, workflows/, ISSUE_TEMPLATE/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Make a list. Every workflow file, every bot integration, every webhook — it all needs a replacement or needs to be dropped. In my experience, most projects have between 5 and 20 GitHub-specific touchpoints. Some have way more.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Choose Your Destination
&lt;/h2&gt;

&lt;p&gt;You've got real options now. The self-hosted forge space has matured significantly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Forgejo&lt;/strong&gt; — Community fork of Gitea, fully open-source under good governance. This is what a lot of projects are choosing. Lightweight, familiar UI for GitHub users, and has a growing ecosystem of CI runners.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gitea&lt;/strong&gt; — The original. Still solid, though the community split with Forgejo created some tension about its direction.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitLab CE&lt;/strong&gt; — The heavyweight option. More features than you'll ever need, but also more operational overhead to self-host.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Codeberg&lt;/strong&gt; — If you don't want to self-host, Codeberg runs Forgejo and is backed by a nonprofit. Zero cost for open-source projects.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;sourcehut&lt;/strong&gt; — The minimalist choice. Email-based workflow, no JavaScript required. Not for everyone, but the people who love it &lt;em&gt;really&lt;/em&gt; love it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For most open-source projects, I'd recommend starting with either Codeberg (hosted) or Forgejo (self-hosted). The migration path from GitHub is the smoothest, and contributors won't feel completely lost.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Migrate the Git History (The Easy Part)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone your GitHub repo with all branches and tags&lt;/span&gt;
git clone &lt;span class="nt"&gt;--mirror&lt;/span&gt; https://github.com/yourorg/yourproject.git
&lt;span class="nb"&gt;cd &lt;/span&gt;yourproject.git

&lt;span class="c"&gt;# Add your new forge as a remote&lt;/span&gt;
git remote add newforge https://your-forge.example.com/yourorg/yourproject.git

&lt;span class="c"&gt;# Push everything — all branches, all tags, all refs&lt;/span&gt;
git push newforge &lt;span class="nt"&gt;--mirror&lt;/span&gt;

&lt;span class="c"&gt;# Verify the push&lt;/span&gt;
git ls-remote newforge | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-20&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it for the code. Every commit, every branch, every tag — it all comes along. Git doesn't care where it lives.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Migrate Issues (The Hard Part)
&lt;/h2&gt;

&lt;p&gt;This is where most migrations get painful. GitHub issues contain years of context — bug reports, feature discussions, workarounds buried in comment threads. You have a few options:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Using the GitHub API to export issues
# You'll need: pip install requests
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;export_github_issues&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;owner&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;token &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;issues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# state=all gets both open and closed issues
&lt;/span&gt;        &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.github.com/repos/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;owner&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/issues&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;state&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;all&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;page&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;per_page&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;batch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;issues_export.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dump&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Exported &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; issues&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;issues&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Forgejo and Gitea both have built-in GitHub import tools that handle issues, pull requests, labels, and milestones. It's not perfect — some formatting gets mangled, and image links pointing to GitHub's CDN may break — but it gets you 90% of the way there.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pro tip:&lt;/strong&gt; Don't try to migrate every single issue. Archive closed issues as a static export and only migrate open issues plus recently active closed ones. Your contributors will thank you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Replace Your CI Pipeline
&lt;/h2&gt;

&lt;p&gt;GitHub Actions is the stickiest part of the GitHub ecosystem. Here's a rough translation guide:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Forgejo Actions&lt;/strong&gt; — Uses the same YAML syntax as GitHub Actions. Many GitHub Actions work unmodified, and there's a growing compatibility layer. This is the path of least resistance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Woodpecker CI&lt;/strong&gt; — Lightweight, purpose-built for Forgejo/Gitea. Different syntax but very capable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Jenkins / Drone / Buildkite&lt;/strong&gt; — If you're self-hosting anyway, these are forge-agnostic options.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your workflows are simple (build, test, lint), the migration is straightforward. If you've got complex matrix builds with caching and artifact uploads, budget extra time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 6: The Contributor Communication Plan
&lt;/h2&gt;

&lt;p&gt;This is the step people skip, and it's the one that kills migrations. Your contributors need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Advance notice&lt;/strong&gt; — At least 2-4 weeks before the switch. Post it everywhere: README, GitHub discussions, mailing list, Discord/Matrix.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A migration guide&lt;/strong&gt; — Step-by-step instructions for setting up on the new platform. Don't assume everyone knows how to use Forgejo.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redirects&lt;/strong&gt; — Keep the GitHub repo around as an archive with a pinned issue or README pointing to the new home. Don't delete it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Patience&lt;/strong&gt; — Some contributors won't follow you. That's okay. Make the onboarding as frictionless as possible and accept that you'll lose some people.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Prevention: Avoiding Lock-in From Day One
&lt;/h2&gt;

&lt;p&gt;If you're starting a new project, here's how to stay portable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Keep CI config generic.&lt;/strong&gt; Use Makefiles or shell scripts as your actual build logic. Let the CI YAML just call those scripts. Switching CI providers becomes trivial.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't use GitHub-specific markdown extensions&lt;/strong&gt; in your docs. Stick to standard CommonMark.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run your issue tracker separately&lt;/strong&gt; if governance matters to you. A mailing list or a Discourse instance ages better than any platform's built-in tracker.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mirror your repo&lt;/strong&gt; to multiple forges from day one. &lt;code&gt;git push&lt;/code&gt; to two remotes costs nothing.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Add multiple push URLs to a single remote&lt;/span&gt;
git remote set-url &lt;span class="nt"&gt;--add&lt;/span&gt; origin https://codeberg.org/you/project.git
git remote set-url &lt;span class="nt"&gt;--add&lt;/span&gt; origin https://gitlab.com/you/project.git

&lt;span class="c"&gt;# Now 'git push' sends to all three (GitHub + Codeberg + GitLab)&lt;/span&gt;
git remote &lt;span class="nt"&gt;-v&lt;/span&gt;
&lt;span class="c"&gt;# origin  https://github.com/you/project.git (fetch)&lt;/span&gt;
&lt;span class="c"&gt;# origin  https://github.com/you/project.git (push)&lt;/span&gt;
&lt;span class="c"&gt;# origin  https://codeberg.org/you/project.git (push)&lt;/span&gt;
&lt;span class="c"&gt;# origin  https://gitlab.com/you/project.git (push)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;Ghostty's departure from GitHub is part of a larger trend that I think is healthy for the ecosystem. We've been through this cycle before — SourceForge to Google Code to GitHub. Platforms rise, centralize, and eventually projects diversify again.&lt;/p&gt;

&lt;p&gt;The tools for self-hosting and federation have gotten dramatically better. Forgejo is working on ActivityPub-based federation, which could eventually let forges talk to each other the way email servers do. Imagine opening a pull request on someone's Forgejo instance from your Codeberg account. That future is being actively built.&lt;/p&gt;

&lt;p&gt;Whether or not you move your project today, doing the audit from Step 1 is worth your time. Know your exit path. Git was designed to be decentralized — it might be time we actually used it that way.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>github</category>
      <category>git</category>
      <category>devops</category>
    </item>
    <item>
      <title>pgbackrest Maintenance Has Stopped — How to Plan Your PostgreSQL Backup Migration</title>
      <dc:creator>Alan West</dc:creator>
      <pubDate>Tue, 28 Apr 2026 22:00:49 +0000</pubDate>
      <link>https://dev.to/alanwest/pgbackrest-maintenance-has-stopped-how-to-plan-your-postgresql-backup-migration-44m6</link>
      <guid>https://dev.to/alanwest/pgbackrest-maintenance-has-stopped-how-to-plan-your-postgresql-backup-migration-44m6</guid>
      <description>&lt;p&gt;If you manage PostgreSQL in production, you probably felt a chill when the news hit Hacker News: &lt;a href="https://github.com/pgbackrest/pgbackrest" rel="noopener noreferrer"&gt;pgbackrest appears to no longer be actively maintained&lt;/a&gt;. For a tool that's been the backbone of PostgreSQL backup strategies for years, that's a big deal.&lt;/p&gt;

&lt;p&gt;Let's talk about what this means practically, how to assess your exposure, and how to migrate to an alternative without losing sleep.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Hurts
&lt;/h2&gt;

&lt;p&gt;pgbackrest has been the gold standard for PostgreSQL backup and restore for a long time. It handles incremental backups, parallel backup/restore, encryption, and repository management in a way that pg_dump simply can't match. A lot of production setups — including several I've worked on — depend on it heavily.&lt;/p&gt;

&lt;p&gt;When a critical infrastructure tool loses active maintenance, you're looking at a few real problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Security patches stop.&lt;/strong&gt; Any CVEs discovered going forward won't get fixed upstream.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;New PostgreSQL versions may break compatibility.&lt;/strong&gt; PostgreSQL 17 and beyond might introduce changes pgbackrest can't handle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bug fixes are on you.&lt;/strong&gt; That edge case in differential backup you've been meaning to report? It's staying.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This doesn't mean you need to panic and rip it out tomorrow. But you do need a plan.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Audit Your Current pgbackrest Setup
&lt;/h2&gt;

&lt;p&gt;Before migrating anything, document what you're actually using. Not every pgbackrest feature has a 1:1 replacement in every alternative tool.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check your pgbackrest config&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; /etc/pgbackrest/pgbackrest.conf

&lt;span class="c"&gt;# List your current backup stanzas and their status&lt;/span&gt;
pgbackrest &lt;span class="nt"&gt;--stanza&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_db info

&lt;span class="c"&gt;# Check your repo type (local filesystem, S3, Azure, GCS?)&lt;/span&gt;
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s1"&gt;'repo1-type'&lt;/span&gt; /etc/pgbackrest/pgbackrest.conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Write down the answers to these questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Are you using &lt;strong&gt;incremental or differential&lt;/strong&gt; backups?&lt;/li&gt;
&lt;li&gt;What's your &lt;strong&gt;retention policy&lt;/strong&gt; (how many full backups do you keep)?&lt;/li&gt;
&lt;li&gt;Are you backing up to &lt;strong&gt;object storage&lt;/strong&gt; (S3, GCS, Azure) or local disk?&lt;/li&gt;
&lt;li&gt;Do you use &lt;strong&gt;encryption at rest&lt;/strong&gt;?&lt;/li&gt;
&lt;li&gt;Are you using pgbackrest for &lt;strong&gt;PITR&lt;/strong&gt; (point-in-time recovery)?&lt;/li&gt;
&lt;li&gt;Do you rely on &lt;strong&gt;parallel backup/restore&lt;/strong&gt;?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This list becomes your migration checklist.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Evaluate Your Alternatives
&lt;/h2&gt;

&lt;p&gt;There are three serious contenders worth evaluating. Each has tradeoffs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Barman (by EnterpriseDB)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://pgbarman.org/" rel="noopener noreferrer"&gt;Barman&lt;/a&gt; is the most feature-complete alternative. It's actively maintained by EnterpriseDB (the folks behind the major PostgreSQL commercial distribution) and handles most of what pgbackrest does.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Strengths:&lt;/strong&gt; PITR, incremental backups, parallel jobs, S3/Azure/GCS support, solid documentation, active development.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Weaknesses:&lt;/strong&gt; Python-based (adds a runtime dependency), the configuration model is different enough to require real migration effort.&lt;/p&gt;

&lt;h3&gt;
  
  
  WAL-G
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/wal-g/wal-g" rel="noopener noreferrer"&gt;WAL-G&lt;/a&gt; is a Go-based tool originally developed at Citus Data. It's focused on cloud-native backup workflows and is quite fast.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Strengths:&lt;/strong&gt; Written in Go (single binary, no runtime deps), excellent cloud storage support, delta backups, good performance with large databases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Weaknesses:&lt;/strong&gt; Less mature PITR tooling compared to Barman, fewer configuration knobs for complex setups.&lt;/p&gt;

&lt;h3&gt;
  
  
  pg_basebackup + Manual WAL Archiving
&lt;/h3&gt;

&lt;p&gt;The built-in option. PostgreSQL ships with &lt;code&gt;pg_basebackup&lt;/code&gt; and WAL archiving out of the box. No third-party tool required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Strengths:&lt;/strong&gt; Zero additional dependencies, guaranteed compatibility with your PostgreSQL version, well-documented in core PostgreSQL docs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Weaknesses:&lt;/strong&gt; No incremental backups, no built-in retention management, no parallel restore, you're writing your own wrapper scripts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Migration Path (pgbackrest → Barman Example)
&lt;/h2&gt;

&lt;p&gt;Here's a concrete walkthrough for migrating to Barman, since it covers the most common pgbackrest use cases.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install Barman&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get &lt;span class="nb"&gt;install &lt;/span&gt;barman  &lt;span class="c"&gt;# Debian/Ubuntu&lt;/span&gt;
&lt;span class="c"&gt;# or&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;yum &lt;span class="nb"&gt;install &lt;/span&gt;barman      &lt;span class="c"&gt;# RHEL/CentOS&lt;/span&gt;

&lt;span class="c"&gt;# Create a Barman configuration for your server&lt;/span&gt;
&lt;span class="nb"&gt;sudo cat&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /etc/barman.d/main-db.conf &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
[main-db]
description = "Main Production Database"
ssh_command = ssh postgres@db-server
conninfo = host=db-server user=barman dbname=postgres
backup_method = postgres
# Use streaming for WAL archiving (replaces pgbackrest archive-push)
streaming_archiver = on
slot_name = barman
streaming_conninfo = host=db-server user=streaming_barman
# Retention: keep 4 full backups (adjust to match your pgbackrest policy)
retention_policy = RECOVERY WINDOW OF 14 DAYS
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then set up the replication slot and test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# On the PostgreSQL server, create the replication slot&lt;/span&gt;
psql &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"SELECT pg_create_physical_replication_slot('barman');"&lt;/span&gt;

&lt;span class="c"&gt;# Back on the Barman server, verify the connection&lt;/span&gt;
barman check main-db

&lt;span class="c"&gt;# Take your first backup&lt;/span&gt;
barman backup main-db

&lt;span class="c"&gt;# Verify it worked&lt;/span&gt;
barman list-backup main-db
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Critical Overlap Period
&lt;/h3&gt;

&lt;p&gt;Don't cut over immediately. Run both tools in parallel for at least two full backup cycles. This means:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Keep pgbackrest running on its existing schedule&lt;/li&gt;
&lt;li&gt;Run Barman alongside it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test a restore from Barman&lt;/strong&gt; to a staging environment&lt;/li&gt;
&lt;li&gt;Only after a successful test restore, disable pgbackrest&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I cannot stress point 3 enough. A backup you haven't tested restoring is not a backup. It's a hope.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Update Your archive_command
&lt;/h2&gt;

&lt;p&gt;If you're using pgbackrest's &lt;code&gt;archive-push&lt;/code&gt; in your &lt;code&gt;postgresql.conf&lt;/code&gt;, you'll need to update this. With Barman using streaming replication, you might not need &lt;code&gt;archive_command&lt;/code&gt; at all, but if you want belt-and-suspenders:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="c"&gt;# postgresql.conf — old pgbackrest config
# archive_command = 'pgbackrest --stanza=main-db archive-push %p'
&lt;/span&gt;
&lt;span class="c"&gt;# Option A: Switch to barman-wal-archive
&lt;/span&gt;&lt;span class="py"&gt;archive_command&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'barman-wal-archive barman-server main-db %p'&lt;/span&gt;

&lt;span class="c"&gt;# Option B: If using streaming replication with Barman,
# you can use a simple copy as fallback
&lt;/span&gt;&lt;span class="py"&gt;archive_command&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'cp %p /var/lib/postgresql/wal_archive/%f'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Reload PostgreSQL after changing this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl reload postgresql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Prevention: Making Your Backup Strategy More Resilient
&lt;/h2&gt;

&lt;p&gt;This situation is a good reminder that depending on a single tool for critical infrastructure is risky. Here's what I'm doing going forward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Layer your backups.&lt;/strong&gt; Use a tool like Barman or WAL-G for your primary backup pipeline, but also run periodic &lt;code&gt;pg_dump&lt;/code&gt; exports as a secondary safety net. They're slower and larger, but they're format-independent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test restores regularly.&lt;/strong&gt; Set up a cron job or CI pipeline that restores your latest backup to a throwaway instance at least weekly. If you're not testing restores, you don't have backups.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor backup health.&lt;/strong&gt; Whatever tool you use, set up alerts for failed backups, growing backup age, and WAL archive lag. The worst time to discover your backups aren't working is during a recovery.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document your recovery procedure.&lt;/strong&gt; Write a runbook. Actually write it down. Include the exact commands, the expected timelines, and who has access to what. Future-you at 3 AM during an incident will be grateful.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Don't Panic, But Don't Wait
&lt;/h2&gt;

&lt;p&gt;pgbackrest losing active maintenance doesn't mean your existing backups are suddenly invalid. The tool still works. Your existing backups are still there. But the clock is ticking on compatibility with future PostgreSQL releases and security patches.&lt;/p&gt;

&lt;p&gt;Start your evaluation now, pick the alternative that matches your feature needs, and run a parallel migration over the next few weeks. Your future self — and your on-call rotation — will thank you.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>database</category>
      <category>devops</category>
      <category>backup</category>
    </item>
    <item>
      <title>Open-Source LLMs You Can Actually Run Today vs. Waiting for Grok 3</title>
      <dc:creator>Alan West</dc:creator>
      <pubDate>Tue, 28 Apr 2026 15:30:22 +0000</pubDate>
      <link>https://dev.to/alanwest/open-source-llms-you-can-actually-run-today-vs-waiting-for-grok-3-2jc2</link>
      <guid>https://dev.to/alanwest/open-source-llms-you-can-actually-run-today-vs-waiting-for-grok-3-2jc2</guid>
      <description>&lt;p&gt;The r/LocalLLaMA community has been buzzing with a familiar refrain: "Where's the open-source Grok 3?" Elon Musk has repeatedly signaled that xAI would open-source its models, and while Grok-1 did get released back in March 2024, Grok 3 remains firmly closed. If you're sitting around waiting for that drop, I have good news and bad news. The bad news: nobody knows when (or if) it'll happen. The good news: the open-source LLM landscape is so stacked right now that you might not even need it.&lt;/p&gt;

&lt;p&gt;Let's compare what's actually available today, how to get these models running locally, and what tradeoffs you're making with each option.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Comparison Matters
&lt;/h2&gt;

&lt;p&gt;Running LLMs locally isn't just a hobby anymore. There are real reasons to go open-source and self-hosted:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data privacy&lt;/strong&gt; — your prompts never leave your machine&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost control&lt;/strong&gt; — no per-token billing at scale&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Customization&lt;/strong&gt; — fine-tune on your own data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reliability&lt;/strong&gt; — no API outages or rate limits&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Grok 3 reportedly performs competitively with GPT-4 and Claude on various benchmarks. But benchmarks don't ship features. Let's look at what you can actually deploy right now.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Contenders: A Side-by-Side Look
&lt;/h2&gt;

&lt;p&gt;Here's my honest assessment after running each of these on real projects over the past few months.&lt;/p&gt;

&lt;h3&gt;
  
  
  Meta's Llama 3.1 / 3.3
&lt;/h3&gt;

&lt;p&gt;The 800-pound gorilla of open-source LLMs. Llama 3.1 405B is massive, and the 70B variant hits a sweet spot for most use cases. Llama 3.3 70B brought further improvements in instruction following.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Running Llama 3.3 70B with ollama — dead simple
# Install: curl -fsSL https://ollama.com/install.sh | sh
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ollama&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ollama&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;llama3.3:70b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Explain the builder pattern in Rust&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt; Huge community, excellent tool-calling support, permissive license for most uses.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Cons:&lt;/strong&gt; The 405B model needs serious hardware (multiple GPUs). Even 70B wants at least 48GB VRAM for decent quantization.&lt;/p&gt;
&lt;h3&gt;
  
  
  Mistral / Mixtral
&lt;/h3&gt;

&lt;p&gt;Mistral has been punching above its weight class since day one. Their Mixture of Experts architecture means you get big-model quality without big-model VRAM requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt; Efficient inference, strong multilingual support, good coding performance.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Cons:&lt;/strong&gt; Licensing has gotten murkier with newer releases — check the specific model's license before deploying commercially.&lt;/p&gt;
&lt;h3&gt;
  
  
  DeepSeek-V3 / DeepSeek-R1
&lt;/h3&gt;

&lt;p&gt;DeepSeek shook the industry with R1's reasoning capabilities. The open-weights release of both V3 (671B MoE) and R1 was a big deal.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# DeepSeek R1 distilled models are more practical for local use
# The 32B distill runs well on a single 24GB GPU with quantization
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ollama&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ollama&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;deepseek-r1:32b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Write a SQL query to find duplicate rows by email&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# R1 shows its chain-of-thought reasoning in &amp;lt;think&amp;gt; tags
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt; Exceptional reasoning, transparent chain-of-thought, competitive with frontier closed models on many tasks.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Cons:&lt;/strong&gt; The full 671B model is impractical for most setups. Distilled versions lose some of that magic.&lt;/p&gt;

&lt;h3&gt;
  
  
  Qwen 2.5
&lt;/h3&gt;

&lt;p&gt;Alibaba's Qwen series has been quietly excellent. The 72B model is genuinely strong at coding tasks, and their smaller models (7B, 14B) are some of the best in their weight class.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt; Great coding performance, solid instruction following, Apache 2.0 license.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Cons:&lt;/strong&gt; Less community tooling compared to Llama ecosystem.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Comparison Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Best Size for Local&lt;/th&gt;
&lt;th&gt;Min VRAM (Q4)&lt;/th&gt;
&lt;th&gt;Coding&lt;/th&gt;
&lt;th&gt;Reasoning&lt;/th&gt;
&lt;th&gt;License&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Llama 3.3&lt;/td&gt;
&lt;td&gt;70B&lt;/td&gt;
&lt;td&gt;~40GB&lt;/td&gt;
&lt;td&gt;Great&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Llama License&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mixtral 8x7B&lt;/td&gt;
&lt;td&gt;8x7B (MoE)&lt;/td&gt;
&lt;td&gt;~26GB&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Apache 2.0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DeepSeek-R1 (distill)&lt;/td&gt;
&lt;td&gt;32B&lt;/td&gt;
&lt;td&gt;~20GB&lt;/td&gt;
&lt;td&gt;Great&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qwen 2.5&lt;/td&gt;
&lt;td&gt;72B&lt;/td&gt;
&lt;td&gt;~42GB&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Apache 2.0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Grok 3&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;Unknown&lt;/td&gt;
&lt;td&gt;Unknown&lt;/td&gt;
&lt;td&gt;Closed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That last row is the point. You can't run what you can't download.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building Apps on Top: The Auth Question
&lt;/h2&gt;

&lt;p&gt;Once you pick your model and get it running, you'll probably want to build an actual application around it. I've been wiring up a local LLM-powered code review tool, and one of the first questions was how to handle user authentication.&lt;/p&gt;

&lt;p&gt;If you're comparing auth solutions for your LLM-powered app, here's the quick rundown I landed on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Auth0&lt;/strong&gt; — The incumbent. Feature-rich, expensive at scale with per-user pricing, and the DX has gotten bloated over the years.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clerk&lt;/strong&gt; — Great developer experience, modern React components, but you're locked into their ecosystem and pricing scales with users.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authon&lt;/strong&gt; (&lt;a href="https://authon.dev" rel="noopener noreferrer"&gt;authon.dev&lt;/a&gt;) — A hosted auth service with 15 SDKs across 6 languages and 10+ OAuth providers. The part that caught my attention: free plan with unlimited users and no per-user pricing. It's also designed for compatibility with Clerk and Auth0 migration paths. SSO (SAML/LDAP) and custom domains aren't available yet but are on the roadmap. If you need those today, look elsewhere.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Example: protecting an LLM inference endpoint with Authon&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;AuthonClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@authon/node&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;authon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AuthonClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AUTHON_API_KEY&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;express&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Middleware to verify the user's session&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/inference&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;authon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;verifySession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;authorization&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;valid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Unauthorized&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/inference&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Now you know who's making the request&lt;/span&gt;
  &lt;span class="c1"&gt;// Forward to your local LLM endpoint&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://localhost:11434/api/chat&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;deepseek-r1:32b&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tradeoff is straightforward: Auth0 and Clerk have more mature ecosystems and enterprise features right now. Authon is newer but the pricing model is genuinely better if you're building something where user count is unpredictable — which describes most side projects and early-stage apps built on local LLMs.&lt;/p&gt;

&lt;h2&gt;
  
  
  So Should You Wait for Grok 3?
&lt;/h2&gt;

&lt;p&gt;Honestly? No. Here's my take:&lt;/p&gt;

&lt;p&gt;If Grok 3 drops as open weights tomorrow, that's great — more competition is always good. But the models available today are already production-capable for most use cases. I've been running DeepSeek-R1's 32B distill for code review tasks, and it catches issues that I'd expect from a much larger model.&lt;/p&gt;

&lt;p&gt;The open-source LLM space moves so fast that waiting for any single model is like waiting for the "right time" to buy a GPU. There's always something better around the corner.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Recommendation
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;For coding tasks:&lt;/strong&gt; Qwen 2.5 72B or DeepSeek-R1 32B distill&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For general-purpose use:&lt;/strong&gt; Llama 3.3 70B — the ecosystem support is unmatched&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For limited hardware (16GB VRAM):&lt;/strong&gt; Qwen 2.5 14B or Llama 3.2 8B&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For reasoning-heavy tasks:&lt;/strong&gt; DeepSeek-R1 distilled variants, full stop&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Stop waiting for Grok 3. Start building with what's here. And if xAI does eventually open-source it, you'll already have the infrastructure to swap it in.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Further reading: &lt;a href="https://ollama.com/library" rel="noopener noreferrer"&gt;Ollama model library&lt;/a&gt;, &lt;a href="https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard" rel="noopener noreferrer"&gt;Hugging Face Open LLM Leaderboard&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>llm</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>How to Secure Voice and Biometric Data in Your AI Training Pipeline</title>
      <dc:creator>Alan West</dc:creator>
      <pubDate>Tue, 28 Apr 2026 15:10:25 +0000</pubDate>
      <link>https://dev.to/alanwest/how-to-secure-voice-and-biometric-data-in-your-ai-training-pipeline-54f1</link>
      <guid>https://dev.to/alanwest/how-to-secure-voice-and-biometric-data-in-your-ai-training-pipeline-54f1</guid>
      <description>&lt;p&gt;A reported breach involving terabytes of voice samples from tens of thousands of AI contractors recently made the rounds on Hacker News. Whether or not you're handling voice data specifically, the underlying problem is one I've seen across nearly every ML project I've consulted on: sensitive training data sitting in places it shouldn't, protected by controls that wouldn't stop a determined intern, let alone an attacker.&lt;/p&gt;

&lt;p&gt;Let's walk through how to actually lock this stuff down.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Root Problem: Training Data Is Treated Like Throwaway Data
&lt;/h2&gt;

&lt;p&gt;Here's what I see constantly. A team spins up an ML pipeline. Voice samples, images, text with PII — it all gets dumped into an S3 bucket or a shared NFS mount. Access controls? "We'll tighten those up before launch." Encryption? "It's behind a VPN, it's fine."&lt;/p&gt;

&lt;p&gt;Then the project scales. Contractors get onboarded. Data gets copied to staging environments. Someone shares a pre-signed URL in Slack. And suddenly your "temporary" storage has become a 4TB treasure chest with the access controls of a public park.&lt;/p&gt;

&lt;p&gt;The core issue is that biometric data — voice prints, facial geometry, fingerprints — isn't like a leaked password. You can't rotate someone's voice. Once it's out, it's out forever.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Encrypt at Rest AND in Transit (Yes, Both)
&lt;/h2&gt;

&lt;p&gt;This sounds obvious, but I still find projects where object storage encryption is "default" (meaning the cloud provider manages keys and anyone with bucket access sees plaintext). You need customer-managed keys at minimum.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# AWS example: create a dedicated KMS key for training data&lt;/span&gt;
aws kms create-key &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--description&lt;/span&gt; &lt;span class="s2"&gt;"ML training data encryption"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--key-usage&lt;/span&gt; ENCRYPT_DECRYPT &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--origin&lt;/span&gt; AWS_KMS

&lt;span class="c"&gt;# Use it for your bucket's server-side encryption&lt;/span&gt;
aws s3api put-bucket-encryption &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--bucket&lt;/span&gt; ml-voice-samples &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--server-side-encryption-configuration&lt;/span&gt; &lt;span class="s1"&gt;'{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "aws:kms",
        "KMSMasterKeyID": "arn:aws:kms:us-east-1:123456789:key/your-key-id"
      },
      "BucketKeyEnabled": true
    }]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But here's the part people skip: encrypt the data &lt;em&gt;before&lt;/em&gt; it hits storage too. If your pipeline ingests voice samples from contractors, encrypt them client-side before upload. That way, a compromised bucket credential alone isn't enough.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;cryptography.fernet&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Fernet&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;encrypt_sample_before_upload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Encrypt voice sample client-side before sending to storage.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;fernet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Fernet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c1"&gt;# Encrypted blob — useless without the key even if bucket is exposed
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fernet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encrypt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Key should come from a secrets manager, never hardcoded
&lt;/span&gt;&lt;span class="n"&gt;encryption_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SAMPLE_ENCRYPTION_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;encrypted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;encrypt_sample_before_upload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;recording_0421.wav&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encryption_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: Enforce Least-Privilege Access With Short-Lived Credentials
&lt;/h2&gt;

&lt;p&gt;The pattern I see over and over: a service account with broad read access to the entire training data bucket, and that credential living in an &lt;code&gt;.env&lt;/code&gt; file on six contractors' laptops.&lt;/p&gt;

&lt;p&gt;Stop doing this. Use scoped, time-limited credentials.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_scoped_training_data_session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;contractor_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Generate a short-lived session scoped to one contractor&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s data prefix.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;sts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Session valid for 1 hour, scoped to a specific S3 prefix
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assume_role&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;RoleArn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arn:aws:iam::123456789:role/ContractorDataReader&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;RoleSessionName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;contractor-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;contractor_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;DurationSeconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# 1 hour max
&lt;/span&gt;        &lt;span class="n"&gt;Policy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;{{
            &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Version&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2012-10-17&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,
            &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Statement&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: [{{
                &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Effect&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Allow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,
                &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: [&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3:GetObject&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;],
                &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Resource&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arn:aws:s3:::ml-voice-samples/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;contractor_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/*&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;
            }}]
        }}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Credentials&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each contractor can only access their own data prefix. The credentials expire in an hour. If one set leaks, the blast radius is limited to one person's samples, not 40,000 people's.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Audit Everything, Detect Bulk Access
&lt;/h2&gt;

&lt;p&gt;The difference between normal pipeline access and exfiltration is usually volume. Your training pipeline reads samples sequentially during training runs. An attacker (or a compromised account) downloads everything as fast as possible.&lt;/p&gt;

&lt;p&gt;Set up access logging and alert on anomalies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Example CloudWatch alarm for unusual S3 GetObject volume&lt;/span&gt;
&lt;span class="na"&gt;Resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;BulkAccessAlarm&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::CloudWatch::Alarm&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;AlarmName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TrainingDataBulkAccessAlert&lt;/span&gt;
      &lt;span class="na"&gt;MetricName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;NumberOfObjects&lt;/span&gt;
      &lt;span class="na"&gt;Namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS/S3&lt;/span&gt;
      &lt;span class="na"&gt;Statistic&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Sum&lt;/span&gt;
      &lt;span class="na"&gt;Period&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;300&lt;/span&gt;  &lt;span class="c1"&gt;# 5-minute window&lt;/span&gt;
      &lt;span class="na"&gt;EvaluationPeriods&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
      &lt;span class="na"&gt;Threshold&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10000&lt;/span&gt;  &lt;span class="c1"&gt;# normal training reads ~500 objects per window&lt;/span&gt;
      &lt;span class="na"&gt;ComparisonOperator&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;GreaterThanThreshold&lt;/span&gt;
      &lt;span class="na"&gt;AlarmActions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;SecurityAlertSNSTopic&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the application side, log every access with context:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;

&lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data_access_audit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;audited_fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;requester&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;purpose&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Wrap every data access with an audit log entry.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data_access&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;extra&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sample_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;sample_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;requester&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;requester&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;purpose&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;purpose&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# "training", "validation", "export"
&lt;/span&gt;            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source_ip&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;get_request_ip&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Proceed with actual fetch
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;fetch_sample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 4: Separate Raw Biometrics From Training Features
&lt;/h2&gt;

&lt;p&gt;This is the one most teams skip entirely. You almost never need raw voice recordings sitting around after feature extraction. Your model trains on mel spectrograms, MFCCs, or embeddings — not the raw &lt;code&gt;.wav&lt;/code&gt; files.&lt;/p&gt;

&lt;p&gt;Build your pipeline so raw biometric data flows through and gets transformed, but doesn't persist in an accessible form:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Ingest&lt;/strong&gt;: Contractor uploads encrypted voice sample&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Process&lt;/strong&gt;: Pipeline decrypts, extracts features (spectrograms, embeddings), then &lt;em&gt;deletes the raw file&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Store&lt;/strong&gt;: Only the derived features (which can't reconstruct the original voice) persist in your training dataset&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Archive&lt;/strong&gt;: If you must keep originals for legal/compliance, put them in cold storage with separate access controls and a retention policy&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The derived features are still useful for training but dramatically less valuable to an attacker. You can't clone someone's voice from an MFCC matrix.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Implement Data Retention and Deletion Policies
&lt;/h2&gt;

&lt;p&gt;I've seen training datasets from 2019 still sitting in production buckets because nobody bothered to clean up. Every voice sample you store is liability. Set retention policies and enforce them automatically.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# S3 lifecycle rule: move raw samples to Glacier after 30 days,&lt;/span&gt;
&lt;span class="c"&gt;# delete after 1 year&lt;/span&gt;
aws s3api put-bucket-lifecycle-configuration &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--bucket&lt;/span&gt; ml-voice-samples &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--lifecycle-configuration&lt;/span&gt; &lt;span class="s1"&gt;'{
    "Rules": [{
      "ID": "BiometricRetention",
      "Status": "Enabled",
      "Filter": {"Prefix": "raw-samples/"},
      "Transitions": [{
        "Days": 30,
        "StorageClass": "GLACIER"
      }],
      "Expiration": {"Days": 365}
    }]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Prevention Checklist
&lt;/h2&gt;

&lt;p&gt;Before your next ML project goes live with sensitive data:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Encrypt client-side&lt;/strong&gt; before data reaches your storage layer&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use customer-managed keys&lt;/strong&gt;, not provider defaults&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scope credentials&lt;/strong&gt; per-user/per-contractor with short TTLs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Log and alert&lt;/strong&gt; on bulk access patterns that deviate from normal training runs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extract and discard&lt;/strong&gt; — persist features, not raw biometrics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set retention policies&lt;/strong&gt; — data you don't have can't be stolen&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Threat model your contractors&lt;/strong&gt; — they're often the widest part of your attack surface&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run periodic access reviews&lt;/strong&gt; — who still has access to what, and do they still need it?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Uncomfortable Truth
&lt;/h2&gt;

&lt;p&gt;Most ML teams I've worked with treat data security as someone else's problem. The infra team handles encryption. The security team handles access controls. Meanwhile, the ML engineers are the ones actually deciding where data lives, how it flows, and who can touch it.&lt;/p&gt;

&lt;p&gt;If you're building training pipelines with biometric data, security isn't a layer you add at the end. It's a constraint you design around from day one. The cost of getting it right upfront is a few extra hours of pipeline work. The cost of getting it wrong is the kind of headline nobody wants to be associated with.&lt;/p&gt;

</description>
      <category>security</category>
      <category>machinelearning</category>
      <category>devops</category>
      <category>python</category>
    </item>
    <item>
      <title>How to Stop Getting Garbage Sprite Sheets from AI Image Generators</title>
      <dc:creator>Alan West</dc:creator>
      <pubDate>Tue, 28 Apr 2026 01:03:34 +0000</pubDate>
      <link>https://dev.to/alanwest/how-to-stop-getting-garbage-sprite-sheets-from-ai-image-generators-3pli</link>
      <guid>https://dev.to/alanwest/how-to-stop-getting-garbage-sprite-sheets-from-ai-image-generators-3pli</guid>
      <description>&lt;p&gt;If you've ever tried to use an AI image generator to create sprite sheets for a 2D game, you already know the pain. You type in a prompt like "8-directional walk cycle for a knight character, pixel art, sprite sheet" and what you get back is... a vaguely knight-shaped blob with inconsistent frame sizes, no transparency, and animation frames that look like they belong to four different characters.&lt;/p&gt;

&lt;p&gt;I spent an embarrassing amount of time last month trying to wrangle DALL-E and Stable Diffusion into producing usable sprite sheets for a small game jam project. The result? Hours of manual cleanup in Aseprite for every single character. There has to be a better way.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why AI Image Generators Fail at Sprite Sheets
&lt;/h2&gt;

&lt;p&gt;The root cause is simple: general-purpose image generators don't understand the &lt;em&gt;structure&lt;/em&gt; of a sprite sheet. A sprite sheet isn't just a picture — it's a grid of consistently-sized frames that need to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Maintain the same character proportions across every frame&lt;/li&gt;
&lt;li&gt;Use transparent backgrounds (not white, not colored — actual alpha)&lt;/li&gt;
&lt;li&gt;Follow a logical animation sequence&lt;/li&gt;
&lt;li&gt;Align to a consistent grid so your game engine can slice them&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you prompt a generic AI model, it treats "sprite sheet" as an aesthetic concept, not a structural one. It'll give you something that &lt;em&gt;looks like&lt;/em&gt; a sprite sheet in a thumbnail but falls apart the moment you try to load it into Unity, Godot, or even a simple &lt;code&gt;&amp;lt;canvas&amp;gt;&lt;/code&gt; renderer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# What you WANT to do:
&lt;/span&gt;&lt;span class="n"&gt;frames&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;split_sprite_sheet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;knight_walk.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;frame_width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;frame_height&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Expected: 8 cleanly separated frames
# Reality: frames overlap, sizes are wrong, background bleeds through
&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;PIL&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;split_sprite_sheet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;frame_width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;frame_height&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;sheet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;cols&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sheet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="n"&gt;frame_width&lt;/span&gt;
    &lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sheet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;height&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="n"&gt;frame_height&lt;/span&gt;
    &lt;span class="n"&gt;frames&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;box&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;frame_width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;frame_height&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;frame_width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;frame_height&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;frame&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sheet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;crop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;frames&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The code above works perfectly — when the sprite sheet is actually structured correctly. The problem is upstream.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Pipeline Approach: Structure Before Generation
&lt;/h2&gt;

&lt;p&gt;The fix isn't to prompt harder. It's to wrap the AI generation step in a pipeline that enforces structure. Instead of asking an AI to generate a full sprite sheet in one shot, you break the process into discrete steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Generate a single reference frame&lt;/strong&gt; — one pose, one angle, clean background&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use that reference to generate variations&lt;/strong&gt; — maintaining style consistency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Post-process each frame&lt;/strong&gt; — background removal, size normalization, alignment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Composite into a proper grid&lt;/strong&gt; — with correct spacing and metadata&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is exactly the approach that tools like &lt;a href="https://github.com/0x0funky/agent-sprite-forge" rel="noopener noreferrer"&gt;agent-sprite-forge&lt;/a&gt; take. It's an open-source project that wraps AI image generation into a structured pipeline specifically designed for sprite sheet output. Rather than hoping a single prompt produces a usable sheet, it handles the generation-to-spritesheet pipeline as separate concerns.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementing Background Removal That Actually Works
&lt;/h2&gt;

&lt;p&gt;The most common failure point is transparency. AI generators almost never produce true alpha channels. Here's a practical approach to cleaning up generated frames:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;PIL&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;remove_background&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;threshold&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;240&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Remove near-white backgrounds and add alpha channel.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;img_array&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;convert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;RGBA&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# Detect pixels that are close to white
&lt;/span&gt;    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;img_array&lt;/span&gt;&lt;span class="p"&gt;[:,:,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;img_array&lt;/span&gt;&lt;span class="p"&gt;[:,:,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;img_array&lt;/span&gt;&lt;span class="p"&gt;[:,:,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;white_mask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Set those pixels to fully transparent
&lt;/span&gt;    &lt;span class="n"&gt;img_array&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;white_mask&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromarray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img_array&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;normalize_frame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Center the sprite content within a fixed-size frame.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Find the bounding box of non-transparent content
&lt;/span&gt;    &lt;span class="n"&gt;bbox&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getbbox&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;bbox&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;RGBA&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="n"&gt;cropped&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;crop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bbox&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Scale to fit within target while maintaining aspect ratio
&lt;/span&gt;    &lt;span class="n"&gt;cropped&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;thumbnail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LANCZOS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Center on a transparent canvas
&lt;/span&gt;    &lt;span class="n"&gt;canvas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;RGBA&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;offset_x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_size&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;cropped&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="n"&gt;offset_y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_size&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;cropped&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="n"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;paste&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cropped&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;offset_x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;offset_y&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;canvas&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This two-step process — remove background, then normalize — catches most of the issues you'll hit with raw AI output. The threshold-based approach isn't perfect (it struggles with light-colored characters), but it handles 80% of cases.&lt;/p&gt;

&lt;h3&gt;
  
  
  Handling Edge Cases
&lt;/h3&gt;

&lt;p&gt;For sprites with light colors near the edges, a smarter approach uses flood-fill from the corners:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;PIL&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ImageDraw&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;flood_fill_remove_bg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tolerance&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Remove background using flood fill from corners.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;convert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;RGBA&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;pixels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;height&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;

    &lt;span class="c1"&gt;# Sample background color from corners
&lt;/span&gt;    &lt;span class="n"&gt;corners&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
               &lt;span class="n"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    &lt;span class="c1"&gt;# Use the most common corner color as background reference
&lt;/span&gt;    &lt;span class="n"&gt;bg_color&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;corners&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;corners&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;visited&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;stack&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;stack&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stack&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nf"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;visited&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="n"&gt;visited&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

        &lt;span class="n"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="c1"&gt;# Check if pixel is similar to background color
&lt;/span&gt;        &lt;span class="n"&gt;diff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;bg_color&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;diff&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;tolerance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Make transparent
&lt;/span&gt;            &lt;span class="n"&gt;stack&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is more computationally expensive but handles colored backgrounds and doesn't accidentally erase light-colored sprite content.&lt;/p&gt;

&lt;h2&gt;
  
  
  Assembling the Final Sheet
&lt;/h2&gt;

&lt;p&gt;Once you have clean, normalized frames, compositing them into a proper sprite sheet is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_sprite_sheet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Assemble individual frames into a grid-based sprite sheet.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No frames provided&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;frame_w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;frame_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;
    &lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;  &lt;span class="c1"&gt;# ceiling division
&lt;/span&gt;
    &lt;span class="n"&gt;sheet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;RGBA&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cols&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;frame_w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;frame_h&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;frame&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;divmod&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;sheet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;paste&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;frame_w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;frame_h&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sheet&lt;/span&gt;

&lt;span class="c1"&gt;# Usage
&lt;/span&gt;&lt;span class="n"&gt;raw_frames&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;frame_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;clean_frames&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;normalize_frame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;remove_background&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;raw_frames&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;sheet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_sprite_sheet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clean_frames&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;sheet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;knight_walk_sheet.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Generating Animated GIFs for Previews
&lt;/h2&gt;

&lt;p&gt;While sprite sheets are what your game engine needs, animated GIFs are invaluable for quick previews and sharing with your team:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;frames_to_gif&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;duration&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Convert frames to an animated GIF for preview.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
    &lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;save_all&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;append_images&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:],&lt;/span&gt;
        &lt;span class="n"&gt;duration&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# milliseconds per frame
&lt;/span&gt;        &lt;span class="n"&gt;loop&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;disposal&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;  &lt;span class="c1"&gt;# clear frame before drawing next — prevents ghosting
&lt;/span&gt;    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;disposal=2&lt;/code&gt; parameter is one of those things that'll cost you an hour of debugging if you don't know about it. Without it, transparent pixels in later frames show the previous frame bleeding through.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prevention: Building a Repeatable Workflow
&lt;/h2&gt;

&lt;p&gt;The real lesson here isn't about any specific tool — it's about treating AI-generated assets as &lt;em&gt;raw material&lt;/em&gt;, not finished product. Here's what I'd recommend:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Never ask an AI to generate a full sprite sheet in one prompt.&lt;/strong&gt; Generate individual frames or small batches and composite them yourself.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automate your post-processing pipeline.&lt;/strong&gt; The background removal and normalization code above should live in a script you run on every batch of generated frames.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Version your prompts.&lt;/strong&gt; When you find a prompt that produces consistent results with your chosen model, save it alongside your assets. Future you will thank present you.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use structured tools when they exist.&lt;/strong&gt; Projects like agent-sprite-forge exist specifically because this problem is common enough to warrant dedicated tooling. Don't reinvent the wheel if someone's already built the pipeline.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The gap between "AI can generate images" and "AI can generate &lt;em&gt;game-ready assets&lt;/em&gt;" is wider than most people expect. But with the right pipeline, you can bridge it without losing your weekend to manual pixel pushing.&lt;/p&gt;

</description>
      <category>gamedev</category>
      <category>python</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>How to Track and Control AI Coding Assistant Costs Before They Spiral</title>
      <dc:creator>Alan West</dc:creator>
      <pubDate>Tue, 28 Apr 2026 00:42:26 +0000</pubDate>
      <link>https://dev.to/alanwest/how-to-track-and-control-ai-coding-assistant-costs-before-they-spiral-4mmk</link>
      <guid>https://dev.to/alanwest/how-to-track-and-control-ai-coding-assistant-costs-before-they-spiral-4mmk</guid>
      <description>&lt;p&gt;The bill arrived, and it was ugly.&lt;/p&gt;

&lt;p&gt;I'd been happily using AI-powered code completion across three projects — a React dashboard, a Go microservice, and a Python data pipeline. Everything felt great until I looked at the monthly invoice and realized my team's AI tooling costs had quietly tripled. The shift toward usage-based billing for AI coding tools means this is going to happen to a lot more developers.&lt;/p&gt;

&lt;p&gt;Let's talk about why AI assistant costs sneak up on you, and more importantly, how to get them under control.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Usage-Based AI Billing Catches Teams Off Guard
&lt;/h2&gt;

&lt;p&gt;Flat-rate subscriptions were simple. You paid per seat, you got the tool. Done. But AI coding assistants are moving toward consumption-based models — and for good reason. Not every developer uses the same amount of compute. Premium model requests (Claude, GPT-4-class models) cost more than base completions.&lt;/p&gt;

&lt;p&gt;The problem is that &lt;strong&gt;nobody tracks their AI request volume&lt;/strong&gt;. You don't think about it. You tab-complete, you chat, you ask for refactors. Each interaction is a request, and some cost more than others.&lt;/p&gt;

&lt;p&gt;Here's what typically drives costs up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chat-heavy workflows&lt;/strong&gt;: Asking the AI to explain code, generate tests, or debug issues uses significantly more tokens than inline completions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Premium model requests&lt;/strong&gt;: Using the most capable models for every task instead of reserving them for complex problems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Large context windows&lt;/strong&gt;: Pasting entire files or long error logs into chat&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redundant requests&lt;/strong&gt;: Re-asking similar questions because you didn't save the output&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 1: Measure What You're Actually Using
&lt;/h2&gt;

&lt;p&gt;Before you can optimize, you need visibility. Most AI coding tools expose usage data through APIs or dashboards, but few developers actually check them.&lt;/p&gt;

&lt;p&gt;If your tool provides a CLI or API, start by pulling your usage stats. Here's a generic pattern for tracking API-based AI tool consumption:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timedelta&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;analyze_ai_usage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;usage_log_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Parse an AI tool usage log and break down costs by category.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;usage_log_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;entries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;breakdown&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;requests&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;unknown&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# e.g., "completion", "chat", "review"
&lt;/span&gt;        &lt;span class="n"&gt;breakdown&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;requests&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;breakdown&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Sort by token consumption — the real cost driver
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;breakdown&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;usage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;analyze_ai_usage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ai_usage_april.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;requests&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; requests, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tokens&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When I ran something like this on my own usage, the results were eye-opening. Chat interactions were 15% of my requests but nearly 60% of my token consumption.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Set Up Budget Alerts and Spending Caps
&lt;/h2&gt;

&lt;p&gt;Most platforms that bill by usage let you configure spending limits. &lt;strong&gt;Use them.&lt;/strong&gt; Don't wait until month-end to find out your team blew through the budget.&lt;/p&gt;

&lt;p&gt;If your tool integrates with your organization's billing API, you can automate alerts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# Simple spending check — run via cron daily&lt;/span&gt;

&lt;span class="nv"&gt;SPEND_LIMIT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;500  &lt;span class="c"&gt;# monthly budget in dollars&lt;/span&gt;
&lt;span class="nv"&gt;CURRENT_SPEND&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"https://api.your-ai-tool.com/v1/billing/usage"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$API_TOKEN&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="s1"&gt;'.current_month_spend'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="nv"&gt;PERCENT_USED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"scale=0; (&lt;/span&gt;&lt;span class="nv"&gt;$CURRENT_SPEND&lt;/span&gt;&lt;span class="s2"&gt; / &lt;/span&gt;&lt;span class="nv"&gt;$SPEND_LIMIT&lt;/span&gt;&lt;span class="s2"&gt;) * 100"&lt;/span&gt; | bc&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;DAY_OF_MONTH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%d&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;DAYS_IN_MONTH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y-%m-01&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt; + 1 month - 1 day"&lt;/span&gt; +%d 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;echo &lt;/span&gt;30&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Alert if spend rate exceeds linear projection&lt;/span&gt;
&lt;span class="nv"&gt;EXPECTED_PERCENT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"scale=0; (&lt;/span&gt;&lt;span class="nv"&gt;$DAY_OF_MONTH&lt;/span&gt;&lt;span class="s2"&gt; / &lt;/span&gt;&lt;span class="nv"&gt;$DAYS_IN_MONTH&lt;/span&gt;&lt;span class="s2"&gt;) * 100"&lt;/span&gt; | bc&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PERCENT_USED&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-gt&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$EXPECTED_PERCENT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"WARNING: AI tool spend is &lt;/span&gt;&lt;span class="nv"&gt;$PERCENT_USED&lt;/span&gt;&lt;span class="s2"&gt;% of budget on day &lt;/span&gt;&lt;span class="nv"&gt;$DAY_OF_MONTH&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    | mail &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"AI Spending Alert"&lt;/span&gt; team@yourcompany.com
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key insight: &lt;strong&gt;compare your spending rate to the linear projection for the month&lt;/strong&gt;, not just absolute thresholds. Being at 50% spend on day 10 is very different from being at 50% on day 25.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Optimize Your Request Patterns
&lt;/h2&gt;

&lt;p&gt;This is where the real savings come from. You don't need to use AI less — you need to use it smarter.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use the right model tier for the job
&lt;/h3&gt;

&lt;p&gt;Not every task needs the most powerful model. Inline code completions? A fast, base-tier model handles those fine. Complex architectural questions or multi-file refactors? That's when you reach for the premium model.&lt;/p&gt;

&lt;p&gt;Think of it like choosing between a sports car and a commuter bike. Both get you there — but one costs a lot more per mile.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reduce context size
&lt;/h3&gt;

&lt;p&gt;Every token you send costs money. Instead of pasting an entire 500-line file into chat, extract the relevant function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Instead of this (expensive):
# "Here's my entire app.py file [500 lines], why is the login broken?"
&lt;/span&gt;
&lt;span class="c1"&gt;# Do this (focused):
# "This auth middleware returns 401 even with a valid token:"
&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;verify_token&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SECRET_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;algorithms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HS256&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="c1"&gt;# BUG: this check fails because exp is compared as string, not int
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;exp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;()):&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;JWTError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Smaller, focused prompts get better answers AND cost less. Win-win.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cache and reuse responses
&lt;/h3&gt;

&lt;p&gt;If you asked the AI to generate a testing pattern for your API routes, save that response. Don't ask the same question next week. I keep a &lt;code&gt;docs/ai-patterns/&lt;/code&gt; directory in my projects with useful generated snippets that I reference instead of regenerating.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Set Team-Wide Policies
&lt;/h2&gt;

&lt;p&gt;For teams, the cost multiplier is real. Five developers each casually chatting with AI all day adds up fast. Here are policies that actually work without killing productivity:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reserve premium models for code review and architecture work&lt;/strong&gt; — not for generating boilerplate&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Share useful AI outputs&lt;/strong&gt; in your team wiki or Slack channel instead of everyone generating the same patterns independently&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set per-developer monthly budgets&lt;/strong&gt; with soft alerts at 80%&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review usage weekly&lt;/strong&gt; during standups — not to shame anyone, but to identify patterns and share optimization tips&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Prevention: Build Cost Awareness Into Your Workflow
&lt;/h2&gt;

&lt;p&gt;The best fix is making costs visible before they become a problem.&lt;/p&gt;

&lt;p&gt;Add a simple dashboard to your team's internal tools that shows AI usage trends. Most developer platforms provide usage APIs — hook them into Grafana, Datadog, or even a simple spreadsheet that auto-updates.&lt;/p&gt;

&lt;p&gt;The developers on my team who can see their usage in real-time naturally self-optimize. It's not about restricting access — it's about making the invisible visible.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;Usage-based billing is the future for AI developer tools. It's actually fairer — light users pay less, heavy users pay for what they consume. But it requires a mindset shift from "unlimited buffet" to "pay per plate."&lt;/p&gt;

&lt;p&gt;The teams that figure out cost-efficient AI usage now will have a real advantage. Not because they spend less, but because they spend &lt;em&gt;intentionally&lt;/em&gt;. They'll use premium models where it matters, optimize their prompts, and treat AI compute as a resource to manage — just like they manage cloud infrastructure costs today.&lt;/p&gt;

&lt;p&gt;Start with measurement. You can't optimize what you can't see.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>devops</category>
      <category>beginners</category>
    </item>
    <item>
      <title>How to Stop AI Agents From Nuking Your Production Database</title>
      <dc:creator>Alan West</dc:creator>
      <pubDate>Mon, 27 Apr 2026 13:41:44 +0000</pubDate>
      <link>https://dev.to/alanwest/how-to-stop-ai-agents-from-nuking-your-production-database-1h45</link>
      <guid>https://dev.to/alanwest/how-to-stop-ai-agents-from-nuking-your-production-database-1h45</guid>
      <description>&lt;p&gt;So you gave an AI agent access to your infrastructure and it dropped your production database. Or maybe it hasn't happened to you yet — but if you're handing AI agents shell access, database credentials, or deployment permissions without guardrails, it's a matter of when, not if.&lt;/p&gt;

&lt;p&gt;This isn't a hypothetical. Stories of AI coding agents running destructive commands in production keep surfacing. Recently, a developer shared how an AI agent straight-up deleted their production database and then calmly explained what it did afterward. The agent's "confession" read like a polite incident report written by the thing that caused the incident.&lt;/p&gt;

&lt;p&gt;I've been integrating AI agents into my own workflows for a while now, and I've had a few close calls myself. Let me walk through why this happens and how to actually prevent it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why AI Agents Go Rogue on Your Data
&lt;/h2&gt;

&lt;p&gt;The root cause is almost always the same: &lt;strong&gt;the agent had permissions it never should have had&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;AI coding agents — whether they're running in your terminal, your CI pipeline, or some autonomous loop — operate by generating and executing commands. They're pattern-matching machines optimizing for completing your request. If you ask an agent to "clean up the test data" and it has production credentials in scope, it will happily connect to prod and start deleting things.&lt;/p&gt;

&lt;p&gt;Here's what typically goes wrong:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Flat credential access&lt;/strong&gt;: The agent inherits your shell environment, which has &lt;code&gt;DATABASE_URL&lt;/code&gt; pointing at production&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No execution sandbox&lt;/strong&gt;: Commands run with your full user permissions — &lt;code&gt;DROP TABLE&lt;/code&gt; works just fine&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ambiguous instructions&lt;/strong&gt;: "Reset the database" means different things in dev vs. prod, and the agent might guess wrong&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No confirmation step&lt;/strong&gt;: The agent chains commands together without pausing for human review on destructive operations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The agent isn't malicious. It's doing exactly what it thinks you want. The problem is in the environment you put it in.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Isolate Agent Environments Completely
&lt;/h2&gt;

&lt;p&gt;The first rule is dead simple: &lt;strong&gt;AI agents should never have production credentials&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Set up a dedicated environment for agent execution. At minimum, use separate environment files and ensure your agent's shell session loads the right one.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# .env.agent — this is what your AI agent sees&lt;/span&gt;
&lt;span class="nv"&gt;DATABASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;postgresql://agent:password@localhost:5432/myapp_dev
&lt;span class="nv"&gt;REDIS_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;redis://localhost:6379/1
&lt;span class="nv"&gt;AWS_PROFILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;dev-readonly

&lt;span class="c"&gt;# .env.production — this should NEVER be in the agent's path&lt;/span&gt;
&lt;span class="c"&gt;# DATABASE_URL=postgresql://admin:secret@prod-db.internal:5432/myapp&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you're launching agents from a wrapper script, explicitly clear dangerous variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# launch-agent.sh — sanitize the environment before handing control to the agent&lt;/span&gt;

&lt;span class="c"&gt;# Kill any production credentials&lt;/span&gt;
&lt;span class="nb"&gt;unset &lt;/span&gt;DATABASE_URL
&lt;span class="nb"&gt;unset &lt;/span&gt;PROD_DB_HOST
&lt;span class="nb"&gt;unset &lt;/span&gt;AWS_SECRET_ACCESS_KEY

&lt;span class="c"&gt;# Load dev-only config&lt;/span&gt;
&lt;span class="nb"&gt;export&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; .env.agent | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="s1"&gt;'^#'&lt;/span&gt; | xargs&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Drop privileges if running as a powerful user&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PGUSER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;readonly_dev

&lt;span class="c"&gt;# Now start the agent&lt;/span&gt;
&lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$@&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is basic hygiene, but I'm constantly surprised how many teams skip it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Use Database Roles With Minimal Permissions
&lt;/h2&gt;

&lt;p&gt;Even in development, your AI agent doesn't need &lt;code&gt;SUPERUSER&lt;/code&gt; access. Create a dedicated database role with only the permissions the agent actually needs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Create a restricted role for AI agent access&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;ROLE&lt;/span&gt; &lt;span class="n"&gt;ai_agent&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;LOGIN&lt;/span&gt; &lt;span class="n"&gt;PASSWORD&lt;/span&gt; &lt;span class="s1"&gt;'agent_local_pass'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Grant read access to all tables&lt;/span&gt;
&lt;span class="k"&gt;GRANT&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;ALL&lt;/span&gt; &lt;span class="n"&gt;TABLES&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="k"&gt;SCHEMA&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="n"&gt;ai_agent&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Grant write access only to specific tables if needed&lt;/span&gt;
&lt;span class="k"&gt;GRANT&lt;/span&gt; &lt;span class="k"&gt;INSERT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;specific_table&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="n"&gt;ai_agent&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Explicitly deny destructive operations&lt;/span&gt;
&lt;span class="c1"&gt;-- (the absence of DROP/DELETE grants handles this, but be explicit)&lt;/span&gt;
&lt;span class="k"&gt;REVOKE&lt;/span&gt; &lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;ALL&lt;/span&gt; &lt;span class="n"&gt;TABLES&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="k"&gt;SCHEMA&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;ai_agent&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Prevent schema modifications entirely&lt;/span&gt;
&lt;span class="k"&gt;REVOKE&lt;/span&gt; &lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;SCHEMA&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;ai_agent&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The principle of least privilege isn't new. But when you're dealing with an autonomous system that generates its own SQL, it goes from "best practice" to "the only thing standing between you and a very bad day."&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Add a Command Allowlist or Confirmation Gate
&lt;/h2&gt;

&lt;p&gt;If your agent can execute arbitrary shell commands, you need a gate. Most agent frameworks support some form of tool-use approval. If yours doesn't, build a simple wrapper.&lt;/p&gt;

&lt;p&gt;Here's a basic approach using a shell wrapper that intercepts dangerous patterns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;

&lt;span class="c1"&gt;# Patterns that should NEVER run without explicit human approval
&lt;/span&gt;&lt;span class="n"&gt;BLOCKED_PATTERNS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DROP\s+(TABLE|DATABASE|SCHEMA)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DELETE\s+FROM\s+(?!.*WHERE)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# DELETE without WHERE clause
&lt;/span&gt;    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TRUNCATE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rm\s+-rf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--force&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--hard&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FORMAT\s+C:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# just in case
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Return False if the command matches a dangerous pattern.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pattern&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;BLOCKED_PATTERNS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IGNORECASE&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BLOCKED: Command matches dangerous pattern &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;pattern&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Command was: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;confirm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Override? Type &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;yes-i-mean-it&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; to proceed: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;confirm&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;yes-i-mean-it&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;check_command&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Command blocked. Exiting.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shell&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;capture_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a blunt instrument, but blunt instruments work when the alternative is a deleted database. You can refine it over time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Enable Point-in-Time Recovery (Do This Today)
&lt;/h2&gt;

&lt;p&gt;Guardrails fail. Humans make mistakes. Agents hallucinate. You need a safety net.&lt;/p&gt;

&lt;p&gt;If you're running PostgreSQL, enable WAL archiving and set up point-in-time recovery (PITR). If you're on a managed service like RDS or Cloud SQL, this is usually a checkbox — turn it on and set the retention to at least 7 days.&lt;/p&gt;

&lt;p&gt;For self-managed PostgreSQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# postgresql.conf&lt;/span&gt;
wal_level &lt;span class="o"&gt;=&lt;/span&gt; replica
archive_mode &lt;span class="o"&gt;=&lt;/span&gt; on
archive_command &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'cp %p /var/lib/postgresql/wal_archive/%f'&lt;/span&gt;

&lt;span class="c"&gt;# Set a reasonable checkpoint frequency&lt;/span&gt;
checkpoint_timeout &lt;span class="o"&gt;=&lt;/span&gt; 15min
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also: &lt;strong&gt;test your backups&lt;/strong&gt;. A backup you've never restored is just a wish. Schedule monthly restore drills. I know it's tedious. Do it anyway.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Implement Audit Logging
&lt;/h2&gt;

&lt;p&gt;When things go wrong — and they will — you need to know exactly what happened. Enable statement logging for your agent's database role:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Log all statements from the agent role&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;ROLE&lt;/span&gt; &lt;span class="n"&gt;ai_agent&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;log_statement&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'all'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;ROLE&lt;/span&gt; &lt;span class="n"&gt;ai_agent&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;log_min_duration_statement&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you a complete trail. When the agent does something unexpected, you can replay its exact sequence of operations and figure out where the logic went sideways.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;We're in a weird transition period where AI agents are powerful enough to be genuinely useful but not reliable enough to be trusted with production systems unsupervised. The developers I know who are using agents most effectively treat them like junior engineers with &lt;code&gt;sudo&lt;/code&gt; access — capable, but requiring guardrails and review.&lt;/p&gt;

&lt;p&gt;Here's my checklist for any team using AI agents near infrastructure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Separate credentials&lt;/strong&gt;: Dev, staging, and prod should be completely isolated&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Least privilege roles&lt;/strong&gt;: The agent gets the minimum database permissions needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Command gating&lt;/strong&gt;: Destructive operations require human confirmation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backups with tested restore&lt;/strong&gt;: PITR enabled, restore drills scheduled&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit logging&lt;/strong&gt;: Every agent action is recorded and reviewable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network isolation&lt;/strong&gt;: Agent environments can't even reach production hosts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of this is revolutionary. It's the same defense-in-depth we've been preaching for decades. The difference is that AI agents make it easier to skip these steps because they feel like "just another developer" in your terminal. They're not. They're autonomous systems executing generated code, and they deserve the same rigor you'd apply to any automated pipeline touching your data.&lt;/p&gt;

&lt;p&gt;Don't learn this lesson the hard way. Set up the guardrails before the agent teaches you why you needed them.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>database</category>
      <category>security</category>
    </item>
    <item>
      <title>How to Stop Your GitHub Issues From Becoming a Graveyard</title>
      <dc:creator>Alan West</dc:creator>
      <pubDate>Mon, 27 Apr 2026 13:39:26 +0000</pubDate>
      <link>https://dev.to/alanwest/how-to-stop-your-github-issues-from-becoming-a-graveyard-1mbn</link>
      <guid>https://dev.to/alanwest/how-to-stop-your-github-issues-from-becoming-a-graveyard-1mbn</guid>
      <description>&lt;p&gt;Every maintainer knows the feeling. You open your repo's issue tracker and there are 847 open issues. Half of them are from 2022. Some reference APIs that no longer exist. A few are duplicates nobody caught. And buried somewhere in that mess are the three issues that actually matter this week.&lt;/p&gt;

&lt;p&gt;Stale issues and PRs aren't just visual clutter — they actively slow down your team. New contributors can't tell what's relevant. Triaging becomes a full-time job. And that PR from eight months ago? It's now 200 commits behind main and would take longer to rebase than to rewrite.&lt;/p&gt;

&lt;p&gt;Let's talk about how to automate the cleanup.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Manual Triage Doesn't Scale
&lt;/h2&gt;

&lt;p&gt;I've tried the "issue bankruptcy" approach — closing everything older than 6 months and asking people to reopen if it's still relevant. It works once. Then six months later you're back in the same spot.&lt;/p&gt;

&lt;p&gt;The core problem is that issues and PRs don't have a natural lifecycle. They get opened, maybe discussed for a day, then forgotten. Unlike branches (which you eventually merge or delete), issues just... persist.&lt;/p&gt;

&lt;p&gt;Manual triage fails because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It requires consistent human attention on a recurring schedule&lt;/li&gt;
&lt;li&gt;Context about &lt;em&gt;why&lt;/em&gt; something can be closed lives in people's heads&lt;/li&gt;
&lt;li&gt;The decision to close requires reading the full thread, checking if the bug still exists, or verifying if a feature was shipped another way&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Automated Scanning With ClawSweeper
&lt;/h2&gt;

&lt;p&gt;ClawSweeper is an open-source tool that tackles this by scanning all your issues and PRs on a weekly schedule. It analyzes each one and suggests what can be closed, along with the reasoning. Think of it as a triage assistant that never takes a day off.&lt;/p&gt;

&lt;p&gt;The general approach works like this: the tool runs as a scheduled GitHub Action, iterates through open issues and PRs, evaluates them against a set of heuristics, and then either comments with a close recommendation or auto-labels items for review.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting Up Automated Issue Scanning
&lt;/h2&gt;

&lt;p&gt;The pattern for scheduled issue scanning via GitHub Actions looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/sweep-issues.yml&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Sweep Stale Issues&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;schedule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Run every Monday at 9am UTC&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;cron&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;9&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;1'&lt;/span&gt;
  &lt;span class="na"&gt;workflow_dispatch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# Allow manual triggers for testing&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;sweep&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;issues&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;
      &lt;span class="na"&gt;pull-requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run issue scanner&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openclaw/clawsweeper@main&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;github-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.GITHUB_TOKEN }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;workflow_dispatch&lt;/code&gt; trigger is something I always add — it lets you test the action without waiting a full week for the cron to fire.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Makes an Issue "Closeable"?
&lt;/h2&gt;

&lt;p&gt;The interesting part isn't the scheduling — it's the heuristics for deciding what's stale. Here are the signals that generally indicate an issue can be closed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Common heuristics for stale issue detection
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;should_suggest_close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;signals&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="c1"&gt;# No activity in a long time
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;days_since_last_comment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;signals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;no_recent_activity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Original author hasn't responded to questions
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;has_unanswered_maintainer_question&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;days_waiting&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;signals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;awaiting_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Referenced PR was already merged
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;linked_pr_merged&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;signals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fix_already_shipped&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Issue references a version that's EOL
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;references_eol_version&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;signals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;outdated_version&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Duplicate of another issue (based on content similarity)
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;similarity_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;issue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;open_issues&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.85&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;signals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;likely_duplicate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;signals&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;signals&lt;/span&gt;  &lt;span class="c1"&gt;# Require multiple signals
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key insight here is requiring &lt;em&gt;multiple&lt;/em&gt; signals before suggesting closure. A single signal (like "no activity in 90 days") catches too many false positives — some issues are legitimate feature requests that just haven't been prioritized yet.&lt;/p&gt;

&lt;h2&gt;
  
  
  Handling PRs Differently Than Issues
&lt;/h2&gt;

&lt;p&gt;PRs have different staleness signals than issues. A PR that's been open for months usually means one of three things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The author lost interest&lt;/li&gt;
&lt;li&gt;It's blocked on a design decision nobody made&lt;/li&gt;
&lt;li&gt;It's drifted so far from main that it needs a complete rewrite
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Configuration for PR-specific rules&lt;/span&gt;
&lt;span class="na"&gt;pr_rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;max_days_without_update&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;60&lt;/span&gt;
  &lt;span class="na"&gt;max_merge_conflicts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
  &lt;span class="na"&gt;check_ci_status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="c1"&gt;# Don't auto-suggest closing draft PRs — they're explicitly WIP&lt;/span&gt;
  &lt;span class="na"&gt;ignore_drafts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I'd recommend being more aggressive with PR cleanup than issue cleanup. A stale PR is almost never going to get merged as-is. The code has diverged, the review context is lost, and often the approach itself has been superseded.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prevention: Reducing Future Staleness
&lt;/h2&gt;

&lt;p&gt;Automated scanning catches the backlog, but you also want to reduce the inflow of issues that &lt;em&gt;become&lt;/em&gt; stale:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use issue templates&lt;/strong&gt; that require reproduction steps. Issues without repro steps are the ones that sit forever because nobody can verify them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Label issues at creation time.&lt;/strong&gt; A &lt;code&gt;needs-triage&lt;/code&gt; label makes it obvious what hasn't been looked at yet.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set expectations in your CONTRIBUTING.md.&lt;/strong&gt; Tell people that issues inactive for 90+ days may be closed, and that they can always reopen.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Close PRs that fail CI for more than 2 weeks.&lt;/strong&gt; If the author isn't fixing the tests, they're not coming back.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The "Suggest, Don't Auto-Close" Philosophy
&lt;/h2&gt;

&lt;p&gt;One thing I appreciate about ClawSweeper's approach is that it &lt;em&gt;suggests&lt;/em&gt; closures rather than auto-closing. I've seen too many bots that just slam issues shut with a generic "this issue has been inactive" message. It's hostile to contributors and it creates busywork when people reopen things.&lt;/p&gt;

&lt;p&gt;The better workflow is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Bot comments with &lt;em&gt;why&lt;/em&gt; it thinks the issue can be closed&lt;/li&gt;
&lt;li&gt;Maintainer reviews the suggestion (takes 5 seconds per issue vs. 2 minutes of manual triage)&lt;/li&gt;
&lt;li&gt;Maintainer either closes it or removes the label to indicate "keep open"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This keeps humans in the loop while eliminating the cognitive overhead of the initial scan.&lt;/p&gt;

&lt;h2&gt;
  
  
  Running It On Your Own Repos
&lt;/h2&gt;

&lt;p&gt;If you maintain any repo with more than ~50 open issues, I'd recommend setting up some form of automated sweeping. Whether you use ClawSweeper specifically or build your own with the GitHub API and a scheduled action, the pattern is the same: scan weekly, suggest closures with reasoning, let humans make the final call.&lt;/p&gt;

&lt;p&gt;The GitHub API gives you everything you need:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Quick check: how bad is your backlog?&lt;/span&gt;
gh issue list &lt;span class="nt"&gt;--state&lt;/span&gt; open &lt;span class="nt"&gt;--json&lt;/span&gt; number,title,updatedAt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--jq&lt;/span&gt; &lt;span class="s1"&gt;'[.[] | select(.updatedAt &amp;lt; "2025-01-01")] | length'&lt;/span&gt;
&lt;span class="c"&gt;# Output: 142  (yikes)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If that number makes you uncomfortable, it's time to automate.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Stale issues aren't a moral failing — they're an inevitable consequence of active projects attracting more attention than any team can handle. The fix isn't "be more disciplined about triage" (that never sticks). The fix is tooling that does the boring scanning work for you, so you can spend your limited attention on the decisions that actually require human judgment.&lt;/p&gt;

&lt;p&gt;Check out &lt;a href="https://github.com/openclaw/clawsweeper" rel="noopener noreferrer"&gt;ClawSweeper on GitHub&lt;/a&gt; if you want a turnkey solution, or steal the heuristics above and build your own. Either way, your future self will thank you when the issue count drops below triple digits.&lt;/p&gt;

</description>
      <category>github</category>
      <category>opensource</category>
      <category>automation</category>
      <category>productivity</category>
    </item>
    <item>
      <title>How to Avoid License Violations When Publishing Derivative AI Models</title>
      <dc:creator>Alan West</dc:creator>
      <pubDate>Mon, 27 Apr 2026 02:05:15 +0000</pubDate>
      <link>https://dev.to/alanwest/how-to-avoid-license-violations-when-publishing-derivative-ai-models-1532</link>
      <guid>https://dev.to/alanwest/how-to-avoid-license-violations-when-publishing-derivative-ai-models-1532</guid>
      <description>&lt;p&gt;So you fine-tuned a model, ran some abliteration passes, maybe merged a few LoRAs together, and now you want to publish it. Cool. But did you check the license on every upstream model you touched?&lt;/p&gt;

&lt;p&gt;I've been watching the open-weight AI community long enough to see this pattern repeat itself: someone publishes a derivative model, strips out the attribution, slaps on a different license, and acts surprised when the community notices. It happened again recently in the LocalLLaMA community, and honestly, it keeps happening because people treat model weights like they're somehow exempt from software licensing rules.&lt;/p&gt;

&lt;p&gt;They're not.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Keeps Happening
&lt;/h2&gt;

&lt;p&gt;The root cause is a combination of two things: the AI model ecosystem moves fast, and most people publishing derivative models have never had to think about license compliance before.&lt;/p&gt;

&lt;p&gt;When you &lt;code&gt;git clone&lt;/code&gt; a repo and start modifying code, most developers instinctively know to check the LICENSE file. But when you download model weights from Hugging Face, apply some transformations, and re-upload, that same instinct doesn't kick in. The weights feel like "data" rather than "software," so people treat them differently.&lt;/p&gt;

&lt;p&gt;But legally and ethically, a derivative model is a derivative work. If the upstream model uses a license that requires attribution — and most of them do — you need to provide it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Specific Problem: Abliteration and Derivative Works
&lt;/h2&gt;

&lt;p&gt;Abliteration (the technique for removing refusal directions from model activations) is a great example of where this gets tricky. You're taking an existing model, running a transformation on its weights, and producing something new. The output model is absolutely a derivative work.&lt;/p&gt;

&lt;p&gt;Here's what a typical abliteration script looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;

&lt;span class="c1"&gt;# Load the base model you're modifying
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;original-model-id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;original-model-id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Identify refusal direction (simplified)
# This is the core of abliteration — finding the activation direction
# associated with refusal behavior and removing it
&lt;/span&gt;&lt;span class="n"&gt;refusal_dir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_refusal_direction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;harmful_prompts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;harmless_prompts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Modify weights by subtracting the refusal direction
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;layer&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;layers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;layer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;self_attn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;o_proj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;weight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;outer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;refusal_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;refusal_dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="n"&gt;layer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;self_attn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;o_proj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;weight&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output of this process is a modified version of &lt;code&gt;original-model-id&lt;/code&gt;. Whatever license that original model carries? It still applies to your output. Period.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step-by-Step: How to Stay Compliant
&lt;/h2&gt;

&lt;p&gt;Here's my checklist for publishing any derivative model work. I use this every time, and it takes maybe 10 minutes.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Identify Every Upstream Model
&lt;/h3&gt;

&lt;p&gt;Before you publish anything, trace your lineage. If you merged three models and abliterated the result, you have at least three upstream licenses to check.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a simple lineage file — I keep one in every model repo&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; MODEL_LINEAGE.md &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;LINEAGE&lt;/span&gt;&lt;span class="sh"&gt;'
## Model Lineage

- **Base model:** org/base-model-v2 (Apache 2.0)
- **Fine-tune source:** researcher/finetuned-variant (CC-BY-4.0)
- **Merge component:** community/specialized-lora (MIT)
- **Technique applied:** Abliteration via orthogonal projection
- **Original abliteration method:** Credit to [original researchers/authors]
&lt;/span&gt;&lt;span class="no"&gt;LINEAGE
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This isn't just good practice — for some licenses, it's literally required.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Check License Compatibility
&lt;/h3&gt;

&lt;p&gt;Not all open-source licenses play nicely together. Here are the ones you'll see most often in the model ecosystem:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Apache 2.0&lt;/strong&gt; — Permissive. Requires attribution and notice of changes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MIT&lt;/strong&gt; — Permissive. Requires copyright notice preservation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CC-BY-4.0&lt;/strong&gt; — Requires attribution. Common for datasets and some models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CC-BY-SA-4.0&lt;/strong&gt; — Attribution + share-alike. Your derivative must use the same or compatible license.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Llama Community License / Custom licenses&lt;/strong&gt; — Read. Every. Word. These vary wildly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The share-alike licenses are where people trip up most. If your upstream model uses CC-BY-SA-4.0, you cannot release your derivative under Apache 2.0. The share-alike provision requires your derivative to carry the same license.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Provide Proper Attribution
&lt;/h3&gt;

&lt;p&gt;This is the part that takes zero effort and people still skip. Your model card should include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The name and link to every upstream model&lt;/li&gt;
&lt;li&gt;The original authors or organizations&lt;/li&gt;
&lt;li&gt;The license of each upstream component&lt;/li&gt;
&lt;li&gt;A description of what you changed
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# In your README.md / model card metadata&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;license&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cc-by-sa-4.0&lt;/span&gt;
&lt;span class="na"&gt;base_model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;org/base-model-v2&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;researcher/finetuned-variant&lt;/span&gt;
&lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;abliterated&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;merge&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="c1"&gt;## Attribution&lt;/span&gt;

&lt;span class="s"&gt;This model is a derivative of [org/base-model-v2](link) by Original Org&lt;/span&gt;
&lt;span class="s"&gt;(Apache 2.0) and [researcher/finetuned-variant](link) by Researcher Name&lt;/span&gt;
&lt;span class="s"&gt;(CC-BY-SA-4.0).&lt;/span&gt;

&lt;span class="s"&gt;The abliteration technique used is based on work by [original authors](link).&lt;/span&gt;

&lt;span class="na"&gt;Modifications&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Applied orthogonal projection to remove refusal direction,&lt;/span&gt;
&lt;span class="s"&gt;then merged with specialized LoRA weights.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Verify Before You Upload
&lt;/h3&gt;

&lt;p&gt;I wrote a quick pre-upload check I run before pushing to Hugging Face:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check_model_compliance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;repo_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Basic compliance checks before publishing a derivative model.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;issues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;repo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;repo_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Check for LICENSE file
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;LICENSE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;LICENSE.md&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MISSING: No LICENSE file found&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Check README for attribution section
&lt;/span&gt;    &lt;span class="n"&gt;readme&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;README.md&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;readme&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;readme&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_text&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attribution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;credit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WARNING: README has no attribution section&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;base_model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WARNING: No base_model specified in metadata&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MISSING: No README.md found&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Check for lineage documentation
&lt;/span&gt;    &lt;span class="n"&gt;has_lineage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MODEL_LINEAGE.md&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ATTRIBUTION.md&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NOTICE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;has_lineage&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUGGESTION: Consider adding a MODEL_LINEAGE.md file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;issues&lt;/span&gt;

&lt;span class="c1"&gt;# Run it
&lt;/span&gt;&lt;span class="n"&gt;issues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;check_model_compliance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./my-model-repo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;issue&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;issue&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Is it sophisticated? No. Does it catch the most common mistakes? Yes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prevention: Build the Habit
&lt;/h2&gt;

&lt;p&gt;The AI model ecosystem is still figuring out norms, but the legal framework isn't ambiguous. Derivative works require compliance with upstream licenses. Here's how to make this painless:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Start every project with a lineage doc.&lt;/strong&gt; Before you even begin training or modifying, document what you're starting from and what license it carries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Hugging Face's model card metadata properly.&lt;/strong&gt; The &lt;code&gt;base_model&lt;/code&gt; and &lt;code&gt;license&lt;/code&gt; fields exist for a reason. Fill them out.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When in doubt, ask.&lt;/strong&gt; Most model authors are happy to clarify their licensing intent. Open an issue or discussion on their repo.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Treat model weights like code.&lt;/strong&gt; Because from a licensing perspective, they are.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;The open-weight AI community is built on trust and reciprocity. When someone publishes their work under a license that says "use this, just give me credit," stripping that credit isn't just a legal violation — it undermines the entire ecosystem that makes open AI development possible.&lt;/p&gt;

&lt;p&gt;I've seen communities fracture over exactly this kind of thing. Maintainers stop sharing. Researchers go closed-source. Everyone loses.&lt;/p&gt;

&lt;p&gt;Spend the ten minutes. Check your licenses. Write the attribution. It's the lowest-effort, highest-impact thing you can do to keep this ecosystem healthy.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>ai</category>
      <category>machinelearning</category>
      <category>licensing</category>
    </item>
  </channel>
</rss>
