<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ertuğrul Demir</title>
    <description>The latest articles on DEV Community by Ertuğrul Demir (@erturul_demir_695474ad8d).</description>
    <link>https://dev.to/erturul_demir_695474ad8d</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3700961%2Fa008c5d6-a099-4c11-827e-bc2df02828a9.jpg</url>
      <title>DEV Community: Ertuğrul Demir</title>
      <link>https://dev.to/erturul_demir_695474ad8d</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/erturul_demir_695474ad8d"/>
    <language>en</language>
    <item>
      <title>Decoding Bronze Age Paperwork: Modern AI vs. Ancient Assyrian Clay Tablets</title>
      <dc:creator>Ertuğrul Demir</dc:creator>
      <pubDate>Sat, 28 Mar 2026 12:17:54 +0000</pubDate>
      <link>https://dev.to/gde/decoding-bronze-age-paperwork-modern-ai-vs-ancient-assyrian-clay-tablets-5adf</link>
      <guid>https://dev.to/gde/decoding-bronze-age-paperwork-modern-ai-vs-ancient-assyrian-clay-tablets-5adf</guid>
      <description>&lt;p&gt;Four thousand years ago, Assyrian merchants were doing what people have always done: tracking debts, chasing payments, arguing over contracts. They pressed these records into clay tablets. Not sacred texts, not epic poetry. Just the ancient equivalent of office emails.&lt;/p&gt;

&lt;p&gt;Nearly 23,000 of these tablets survive. Half have never been translated — not because they're damaged, but because a few people on Earth can read Old Assyrian.&lt;/p&gt;

&lt;p&gt;When the Deep Past Initiative turned this into a Kaggle competition, build a machine translation system for Old Assyrian cuneiform — I jumped in. The task: take transliterated text (cuneiform signs converted to Latin characters) and produce an English translation.&lt;/p&gt;

&lt;p&gt;The training set? Around 1500 pairs. That's it.&lt;/p&gt;

&lt;p&gt;For context, standard translation models train on millions of sentence pairs. Even research on "low-resource" languages works with tens of thousands. We got fifteen hundred documents and a pat on the back.&lt;/p&gt;

&lt;p&gt;So the question was straightforward: how do you build a translation model when you barely have any data, for a language that no modern tokenizer has ever seen, where every proper noun and number matters because these are legal and financial records?&lt;/p&gt;

&lt;p&gt;What started as "fine-tune a model on some ancient text" turned into a full-stack AI pipeline: Gemini vision for OCR-ing scanned academic books, LLMs for sentence alignment and cross-lingual translation, ByT5 as a byte-level backbone that doesn't choke on cuneiform, Unsloth for efficient LoRA training, and vLLM for fast inference on Kaggle T4s. The results surprised us.&lt;/p&gt;

&lt;p&gt;Let's start with why the obvious approaches don't work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the Obvious Approaches Don't Work
&lt;/h2&gt;

&lt;p&gt;The first thing I tried was what everyone tries — throw a pretrained LLM at it. Gemma, Qwen, the usual suspects. Prompt it with some examples, let it translate.&lt;/p&gt;

&lt;p&gt;And honestly? The outputs look pretty good at first glance. Fluent English, reasonable sentence structure, feels like it could be right. But "feels right" is dangerous when you're translating ancient legal documents.&lt;/p&gt;

&lt;p&gt;The problem is hallucination — and not the subtle kind. These models confidently fill in names of merchants, cities, and commodities that simply aren't in the source text. When the transliteration says &lt;code&gt;A-šùr-i-dí&lt;/code&gt; the model might output a completely different name that sounds plausibly Bronze Age. When it hits an unfamiliar trade term, it improvises. For documents where every name, every number, every commodity is the actual information — that's not a minor quality issue, it's the whole problem.&lt;/p&gt;

&lt;p&gt;Ok so what about standard encoder-decoder translation models? Here the issue is more fundamental: tokenization. Modern tokenizers are trained on modern text. Akkadian transliteration is a different universe — hyphenated syllable sequences like &lt;code&gt;a-na&lt;/code&gt;, Sumerian logograms in ALL CAPS like &lt;code&gt;KÙ.BABBAR&lt;/code&gt;, determinatives in curly braces like &lt;code&gt;{d}&lt;/code&gt; and &lt;code&gt;{ki}&lt;/code&gt;, subscript digits encoding phonetic variants like &lt;code&gt;il₅&lt;/code&gt;, and gap markers like &lt;code&gt;&amp;lt;gap&amp;gt;&lt;/code&gt; for broken sections of the physical tablet.&lt;/p&gt;

&lt;p&gt;Feed this into a standard tokenizer and it fragments on every character it hasn't seen. Proper nouns that have never appeared in any pretraining corpus get silently mangled. The &lt;code&gt;&amp;lt;gap&amp;gt;&lt;/code&gt; markers that indicate missing text get treated as noise or special tokens.&lt;/p&gt;

&lt;p&gt;So: decoder-only models hallucinate, standard translation models can't tokenize the input properly. What actually fits this problem?&lt;/p&gt;

&lt;h2&gt;
  
  
  ByT5 — The Right Tool for a Weird Job
&lt;/h2&gt;

&lt;p&gt;One of the best things about Kaggle competitions is the community. People share findings, discuss approaches in the forums, and collectively narrow down what works. Early on, several participants converged on the same answer: ByT5.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhjrn2lis6dpfmlfo8i5o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhjrn2lis6dpfmlfo8i5o.png" alt="ByT5 architecture" width="800" height="415"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;small&gt;Image from "ByT5: Towards a Token-Free Future with Pre-trained Byte-to-Byte Models" (Xue et al., 2021)&lt;/small&gt;



&lt;p&gt;ByT5 comes from a 2021 Google Research paper — &lt;em&gt;"Towards a Token-Free Future with Pre-trained Byte-to-Byte Models"&lt;/em&gt;. The idea is simple and kind of radical: skip tokenization entirely. Instead of mapping text to a learned vocabulary of subwords, ByT5 operates directly on raw bytes. A standard Transformer, minimal modifications, just processing one byte at a time.&lt;/p&gt;

&lt;p&gt;Why does this matter for our problem? Because every character is valid input by definition. It doesn't matter that &lt;code&gt;A-mur-{d}UTU&lt;/code&gt; has never appeared in any pretraining corpus — ByT5 doesn't need it to. No vocabulary misses, no fragmented tokens, no special handling for curly braces or subscript digits. The model just sees bytes.&lt;/p&gt;

&lt;p&gt;The paper also showed something else that turned out to be critical: byte-level models are significantly more robust to noise. When your source text comes from OCR'd clay tablets with inconsistent transcription conventions across different scholars and decades — that robustness isn't a nice-to-have, it's a requirement.&lt;/p&gt;

&lt;p&gt;Architecture: solved. Now came the harder problem — we had the right model, but nowhere near enough data to train it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Data Problem
&lt;/h2&gt;

&lt;p&gt;With ByT5 as the architecture, the bottleneck shifted entirely to data. And the competition host made the challenges very clear in a public discussion post.&lt;/p&gt;

&lt;p&gt;Two things consistently broke translations more than anything else:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Named entities.&lt;/strong&gt; Personal names, place names, divine names — they're transliterated inconsistently across different editions, they preserve older spelling conventions, and they're completely opaque to the model. In practice, many otherwise reasonable translations failed because a name got mangled, dropped, or hallucinated. The host even prepared an onomasticon (a curated list of attested name spellings) as supplemental data to help with this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Transliteration format inconsistency.&lt;/strong&gt; Different corpora encode the same text using different conventions. One participant converted diacritics to ASCII before training (&lt;code&gt;š → sz&lt;/code&gt;, &lt;code&gt;ú → u2&lt;/code&gt;) — a reasonable instinct, but the evaluation data expects diacritics. Collapsing &lt;code&gt;ṣ&lt;/code&gt; into &lt;code&gt;S₂&lt;/code&gt; or &lt;code&gt;š&lt;/code&gt; into &lt;code&gt;SZ&lt;/code&gt; removes distinctions that are semantically meaningful in Akkadian. The rule was clear: normalize &lt;em&gt;toward&lt;/em&gt; the format used in the evaluation set, not away from it.&lt;/p&gt;

&lt;p&gt;On top of that, gap handling was tricky. Damaged sections of tablets are marked with &lt;code&gt;&amp;lt;gap&amp;gt;&lt;/code&gt;, but the training data wasn't perfectly aligned — sometimes a large gap appears in the transliteration but not in the translation, forcing the model to learn misalignment rather than translation. Edge cases like &lt;code&gt;&amp;lt;gap&amp;gt;-A-šùr&lt;/code&gt; (a gap attached to a word) needed to be preserved, not blindly stripped.&lt;/p&gt;

&lt;p&gt;The host's closing point stuck with me: these aren't model architecture problems. They're data problems. And with only ~1,500 training pairs, every one of these issues hits harder because the model sees so few examples to learn from.&lt;/p&gt;

&lt;p&gt;So the path forward was obvious — find more data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding More Data — The AKT Books
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzxq6mxsgy4zhmqc57gxy.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzxq6mxsgy4zhmqc57gxy.jpg" alt="AKT 5 Cover" width="250" height="330"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The training data had to come from somewhere. The competition hosts pointed the way — they shared scanned PDFs of the AKT series (Anatolian Kültepe Texts), a multi-volume scholarly publication of Old Assyrian tablets from the Kültepe excavations in Turkey. Each volume contains transliterations and translations of tablets. Exactly the domain, exactly the format we needed.&lt;/p&gt;

&lt;p&gt;The catch? These are academic books published between 1990 and the 2020s, by different authors, in different languages. AKT 1, 2, 4, 9a, and 10 are in Turkish. AKT 3 is in German. Each volume has its own layout, its own heading conventions, its own way of marking tablet edges and sections. Different fonts, different editorial styles, different decades of typesetting.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft0r6loipcfioqnhvq8y6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft0r6loipcfioqnhvq8y6.png" alt="Example of publication" width="800" height="660"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This isn't structured data you can parse with a script. These are scanned pages of physical books — some crisp, some not — where a tablet's transliteration might start on one page and continue on the next, where scholarly commentary sits right next to the translation text, and where the format changes just enough between volumes that nothing generalizes cleanly.&lt;/p&gt;

&lt;p&gt;But inside these messy PDFs was exactly what we were starving for: hundreds of additional transliteration-translation pairs, many with line-by-line alignment that the original training set didn't have.&lt;/p&gt;

&lt;p&gt;The question was whether I could extract it reliably enough to actually help the model — or whether the noise would make things worse. This is where Gemini's multimodal capabilities came in — specifically its ability to understand page layouts, distinguish between transliteration blocks and commentary, and handle multilingual content out of the box. I decided to build the pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Extraction Pipeline
&lt;/h2&gt;

&lt;p&gt;Building this pipeline was its own mini-project. Each step solved one problem and revealed the next.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: PDF → Page Images&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The simplest step — render each PDF page as a numbered PNG. This is the only part that runs purely local. Everything else goes through Gemini.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Page Images → Structured JSON&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each page image gets sent to Gemini's vision model via the Vertex AI Batch API. The flow: build a JSONL of requests (one per page image, referencing GCS URIs), submit to Vertex, parse the predictions back.&lt;/p&gt;

&lt;p&gt;A quick note on why batch inference: when you're processing hundreds of pages and don't need real-time responses, the Batch API is a no-brainer. You get a 50% discount over standard inference, much higher rate limits, and the service handles parallelization and retries for you — typically completing within 24 hours. You submit one job, go do something else, come back to results. For a pipeline like this where I was processing multiple books with hundreds of pages each, it saved both money and sanity.&lt;/p&gt;

&lt;p&gt;The request construction:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gcs_uri&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt_text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;request&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;contents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fileData&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mimeType&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image/png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fileUri&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;gcs_uri&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;
                    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt_text&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="p"&gt;}],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;generationConfig&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;responseMimeType&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mediaResolution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MEDIA_RESOLUTION_HIGH&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# needed for diacritics
&lt;/span&gt;                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thinkingConfig&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thinkingLevel&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MEDIUM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We used &lt;code&gt;gemini-3.1-flash-lite-preview&lt;/code&gt; with medium thinking enabled — the reasoning step helped significantly with understanding complex page layouts and making correct decisions about where one tablet ends and another begins.&lt;/p&gt;

&lt;p&gt;Submit with the Vertex AI Batch API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vertexai&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;job&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;batches&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-3.1-flash-lite-preview&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;src&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gs://your-bucket/book/ocr_batch/requests.jsonl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;CreateBatchJobConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;dest&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gs://your-bucket/book/ocr_batch/predictions/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One gotcha that bit me early: &lt;strong&gt;predictions come back shuffled&lt;/strong&gt;. You can't rely on line order in the output — you have to extract the page number from each prediction's original request URI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_page_num&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pred&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;request&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;contents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fileData&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fileUri&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;page_(\d+)\.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is actually a feature — it forces you to write robust parsing from the start.&lt;/p&gt;

&lt;p&gt;Every AKT volume needs its own prompt. Different heading formats, different edge markers (&lt;code&gt;Ö.y.&lt;/code&gt;, &lt;code&gt;Ak.&lt;/code&gt; for Turkish volumes; &lt;code&gt;Vs.&lt;/code&gt;, &lt;code&gt;Rs.&lt;/code&gt; for German), different conventions for commentary blocks. Get this wrong and you extract commentary as translation, or merge two tablets into one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: JSON Pages → Tablets CSV&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A book-specific export script aggregates all the per-page JSONs into a flat CSV — one row per tablet with combined transliteration and translation fields. Each volume needs its own exporter because the structure varies enough that a generic one would silently break.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4: Visual QC&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Dump everything to an HTML file and actually look at it. This is where you spot the real problems: misread headings, commentary leaking into translation fields, duplicate translations from continuation pages. No amount of automated testing replaces eyeballing the output.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 5: Cleanup&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Book-specific cleanup scripts apply the fixes found during QC — drop bad rows, merge tablets that got split across pages, strip commentary that leaked through. Unglamorous and manual but completely necessary.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 6: Sentence Chunking + Translation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbs63lyyuenjkftivfaa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbs63lyyuenjkftivfaa.png" alt="Batch Job Flow" width="800" height="422"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's where it gets interesting again. The original training data is document-level — full tablet in, full translation out. But the AKT books have something better: line-by-line structure. Each transliteration line has a marker (&lt;code&gt;(Vs.1)&lt;/code&gt;, &lt;code&gt;(2)&lt;/code&gt;, &lt;code&gt;(Rs.14)&lt;/code&gt;) and each translation sentence references those markers.&lt;/p&gt;

&lt;p&gt;A second Gemini batch job handles two things at once: align transliteration lines to translation sentences by marker, and translate the non-English content (Turkish or German) into English. For each tablet, I retrieved the most similar examples from the official training set using TF-IDF cosine similarity and included them as few-shot context. This turned out to be crucial — not just for translation quality, but for matching the distribution of the host's wording, style, and terminology choices. The model wasn't just translating, it was learning to translate &lt;em&gt;the way the competition data expected&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Same batch pattern — build JSONL, submit, parse shuffled predictions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 7: Normalization&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most of the invisible work happened here. The competition test set uses a specific character format, and the books don't match it. Every volume has its own OCR artifacts, its own conventions.&lt;/p&gt;

&lt;p&gt;A few examples from the normalization stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ḫ/Ḫ → h/H&lt;/code&gt; (test set uses plain H)&lt;/li&gt;
&lt;li&gt;Unicode subscripts → plain digits (&lt;code&gt;₄ → 4&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Superscript determinatives → brace format (&lt;code&gt;ᵈ → {d}&lt;/code&gt;, &lt;code&gt;ᵏⁱ → {ki}&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;OCR artifacts: &lt;code&gt;KU.BABBAR → KÙ.BABBAR&lt;/code&gt;, &lt;code&gt;ś → š&lt;/code&gt;, &lt;code&gt;ş → ṣ&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Gap deduplication: &lt;code&gt;&amp;lt;gap&amp;gt; &amp;lt;gap&amp;gt; → &amp;lt;gap&amp;gt;&lt;/code&gt;, while preserving attachments like &lt;code&gt;&amp;lt;gap&amp;gt;-A-šùr&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a character-level model like ByT5, this isn't cosmetic. A single character mismatch between training and test — &lt;code&gt;ḫ&lt;/code&gt; vs &lt;code&gt;h&lt;/code&gt;, &lt;code&gt;₄&lt;/code&gt; vs &lt;code&gt;4&lt;/code&gt; — is invisible to a human reviewer and catastrophic to a model that has learned exactly one representation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 8: Merge&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The final step pulls normalized chunks into the main training set. Starting from ~1,500 pairs, the pipeline roughly multiplied our available training data — and more importantly, added sentence-level pairs that gave the model a much finer-grained learning signal than document-level translations alone.&lt;/p&gt;

&lt;h2&gt;
  
  
  Training — ByT5 Gets You Far, Then Stops
&lt;/h2&gt;

&lt;p&gt;With the expanded dataset ready, training ByT5 was straightforward — standard seq2seq encoder-decoder training using HuggingFace Transformers. No tricks, no exotic schedulers. The model picked up patterns fast and translated training-domain tablets surprisingly well.&lt;/p&gt;

&lt;p&gt;But then the leaderboard scores started telling a different story.&lt;/p&gt;

&lt;p&gt;In our case, the hidden test set on Kaggle seemed to have a different distribution than what we trained on. Our best guess: different books, different topics, different translator styles, unfamiliar names and locations. Our ByT5 was doing well on what it had seen directly in training, but the leaderboard scores suggested it wasn't generalizing beyond that.&lt;/p&gt;

&lt;p&gt;We hit a ceiling. Many teams went on to have great success pushing ByT5 further — better augmentation, longer training, smarter tricks I guess. But in our setup, the gains had stalled, and we decided to explore a different direction.&lt;/p&gt;

&lt;h2&gt;
  
  
  Back to Decoder-Only — But This Time, Fine-Tuned
&lt;/h2&gt;

&lt;p&gt;This is where the story comes full circle. Earlier, we'd dismissed decoder-only LLMs because they hallucinate. That's still true — out of the box. But fine-tuning changes the picture completely.&lt;/p&gt;

&lt;p&gt;The reasoning was simple: ByT5 and Qwen were solving different problems. ByT5 was a great fit for the transliteration itself — every character mattered, and byte-level modeling let it handle weird orthography, diacritics, subscripts, and determinatives without fighting the tokenizer. But once the task became generalization across unfamiliar tablets, translator styles, and topic shifts, Qwen3.5 had something ByT5 didn't: much stronger pretrained language knowledge.&lt;/p&gt;

&lt;p&gt;Out of the box, that strength was useless because it came with hallucination. Fine-tuning changed that. LoRA gave us a way to keep the model's broader language ability while grounding it in the task and the dataset. Instead of prompting a general-purpose model and hoping for the best, we trained a lightweight adapter on our curated examples. Combined with few-shot prompting to match the host's translation style, the fine-tuned Qwen handled the distribution shifts that our ByT5 couldn't.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fine-Tuning with Unsloth — Making LLMs Affordable
&lt;/h2&gt;

&lt;p&gt;Before diving into the training details, a quick primer for anyone who hasn't fine-tuned a model before.&lt;/p&gt;

&lt;p&gt;The naive approach to fine-tuning a large language model means updating all its parameters — billions of them. That requires serious hardware, serious memory, and serious money. For a Kaggle competition where you're iterating fast on limited GPUs, it's a non-starter.&lt;/p&gt;

&lt;p&gt;This is where LoRA (Low-Rank Adaptation) comes in. Instead of updating the entire model, you freeze the original weights and train a small set of adapter matrices on top. You get most of the benefits of full fine-tuning at a fraction of the cost. QLoRA takes it a step further by quantizing the base model to 4-bit precision, which dramatically cuts memory usage — making it possible to fine-tune models that would otherwise never fit on a single GPU.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fma6se1jr7jilhm9b8wik.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fma6se1jr7jilhm9b8wik.png" alt="Unsloth" width="225" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For this project we used Unsloth, which makes the whole process surprisingly painless. It handles the LoRA/QLoRA setup, optimizes training to run ~2x faster with ~70% less VRAM, and supports a wide range of models out of the box — including Qwen3.5, which is what we needed.&lt;/p&gt;

&lt;p&gt;The training itself was SFT (Supervised Fine-Tuning) using Unsloth's built-in SFT trainer. We structured our data as chat conversations: a system prompt setting the role of an expert Assyriologist, few-shot examples retrieved via TF-IDF similarity, and the target tablet as the final user message. The model only learns from the assistant completion — the actual translation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# each training example looks like this
&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are an expert Assyriologist...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="c1"&gt;# few-shot examples from similar tablets
&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Translate: um-ma A-šùr-i-dí-ma ...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Thus says Aššur-idī: ...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Translate: um-ma Pu-šu-ki-in-ma ...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Thus says Pūšu-kēn: ...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="c1"&gt;# the actual tablet to translate
&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Translate: a-na A-lim {ki} ...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;To the City: ...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;# model learns this
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An important detail here: we used completion-only masking. The loss is computed only on the assistant's translation tokens — the prompt tokens (system message, few-shot examples, user messages) are masked out during training. This means the model isn't wasting capacity learning to predict the input; it's focused entirely on producing accurate translations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1gj9rl0yko1xfy4srvhf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1gj9rl0yko1xfy4srvhf.png" alt="Completion masking: prompt tokens are masked in the labels, loss is only computed on the completion tokens" width="800" height="185"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This meant the model wasn't just learning to translate — it was learning to translate &lt;em&gt;in context&lt;/em&gt;, grounded by similar examples. The same retrieval and prompt structure would be used at inference time, so there was no gap between how the model trained and how it would be evaluated.&lt;/p&gt;

&lt;p&gt;One direction we started exploring but ran out of time for: reinforcement learning on top of the fine-tuned model. The idea was to use GRPO (Group Relative Policy Optimization) with custom reward functions — combining the competition metric itself, gap alignment between transliteration and translation, and length balance — to push the model beyond what SFT alone could achieve. Each reward would target a specific failure mode that supervised training couldn't address directly. We didn't get there before the deadline, but it felt like the natural next step.&lt;/p&gt;

&lt;h2&gt;
  
  
  Inference — vLLM on Kaggle T4s
&lt;/h2&gt;

&lt;p&gt;With a fine-tuned model ready, the next challenge was actually running it within Kaggle's competition constraints. This is a code competition — no internet access at submission time, two T4 GPUs with 16GB VRAM each, and a strict time limit.&lt;/p&gt;

&lt;p&gt;A quick intro on vLLM for those unfamiliar: it's an open-source inference engine originally developed at UC Berkeley that's become the go-to for serving LLMs efficiently. The key innovation is PagedAttention — instead of pre-allocating a fixed block of memory for each sequence's key-value cache, it pages the KV cache dynamically, similar to how operating systems manage virtual memory. This means you can serve larger models on less hardware. On top of that you get continuous batching, optimized CUDA kernels, tensor parallelism, and seamless HuggingFace model support out of the box.&lt;/p&gt;

&lt;p&gt;Sounds perfect, right? In theory. In practice, we hit a wall.&lt;/p&gt;

&lt;p&gt;Qwen3.5 was released in the final weeks of the competition. The model was brand new — vLLM support was experimental and unstable. On top of that, Kaggle's T4 GPUs have compute capability 7.5, which means no FlashAttention 2 support. We had to fall back to Triton attention backend, wrestle with environment compatibility issues, and work around the fact that you can't pip install anything at submission time — every dependency needs to be pre-packaged in your dataset.&lt;/p&gt;

&lt;p&gt;Getting a 9B parameter model to load, run, and generate translations on two T4s without crashing was its own mini-project. Tensor parallelism across both GPUs was non-negotiable — the model simply wouldn't fit on a single card.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LLM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MODEL_PATH&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;float16&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_model_len&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;gpu_memory_utilization&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.85&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;enforce_eager&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;            &lt;span class="c1"&gt;# no CUDA graphs on T4
&lt;/span&gt;    &lt;span class="n"&gt;tensor_parallel_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="c1"&gt;# split across both T4s
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The inference prompt mirrors the training setup exactly — same system prompt, same TF-IDF few-shot retrieval. For each test tablet, we retrieve the 5 most similar examples from our training data and include them as conversation context:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;prompts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nf"&gt;build_messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;transliteration&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;transliteration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;few_shot_examples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;top_k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;transliteration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;test_rows&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;outputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sampling_params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sampling_params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Keeping the inference pipeline identical to training — same prompt structure, same retrieval, same style anchoring — meant the model was seeing exactly the kind of input it was trained on. No distribution shift at inference time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results and Reflections
&lt;/h2&gt;

&lt;p&gt;Our team finished with a silver medal out of 2500+ teams. In the final days of the competition, the OCR extraction pipeline was still producing new data — each batch of cleaned and normalized tablets pushed our scores higher. We genuinely felt like gold was within reach with a couple more days. That stings a bit, but honestly? The journey was worth more than the medal.&lt;/p&gt;

&lt;p&gt;Here's what I'm taking away from this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gemini's batch inference is a superpower for unstructured data.&lt;/strong&gt; We used it to turn scanned academic books from the 1990s — messy layouts, multiple languages, inconsistent formatting — into clean, structured training data. If it works for 4,000-year-old Assyrian tablets in Turkish and German PDFs, it'll work for your use case too. The Vertex AI Batch API made it affordable and painless at scale.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Few-shot retrieval is still easy gains.&lt;/strong&gt; TF-IDF character n-gram similarity is dead simple to implement, and using retrieved examples to anchor both training and inference gave us consistent improvements with minimal effort. Small iterations, big returns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fine-tuning is more accessible than you think.&lt;/strong&gt; LoRA + Unsloth meant we could train a 9B parameter model on Kaggle's free GPUs. You don't need a cluster. You need good data and the right tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;vLLM makes deployment practical.&lt;/strong&gt; Even on constrained hardware like Kaggle T4s, with a brand-new model and no internet access, we got a 9B model running with tensor parallelism. The ecosystem is maturing fast.&lt;/p&gt;

&lt;p&gt;And the bigger picture — the one that got me into this competition in the first place — is that there are still thousands of untranslated tablets sitting in museums. The pipeline we built here isn't a one-off competition hack. It's a blueprint: scan the books, extract the data, train the models, translate the tablets. The tools already exist. The data is already out there. At this point, the bottleneck is no longer whether this can be done. It's whether someone is willing to do it.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>kaggle</category>
      <category>gemini</category>
      <category>vertexai</category>
    </item>
    <item>
      <title>Skills, Not Vibes: Teaching AI Agents to Write Clean Code</title>
      <dc:creator>Ertuğrul Demir</dc:creator>
      <pubDate>Mon, 26 Jan 2026 11:17:47 +0000</pubDate>
      <link>https://dev.to/gde/skills-not-vibes-teaching-ai-agents-to-write-clean-code-3l9e</link>
      <guid>https://dev.to/gde/skills-not-vibes-teaching-ai-agents-to-write-clean-code-3l9e</guid>
      <description>&lt;p&gt;In February 2025, Andrej Karpathy coined "vibe coding" to describe programming's new reality: give in to the vibes, accept all changes, "forget that the code even exists." He called it "not too bad for throwaway weekend projects." But for production systems? That's where the trouble starts.&lt;/p&gt;

&lt;p&gt;I've watched AI-generated codebases accumulate the same mess developers spent decades learning to avoid—duplication everywhere, inconsistent naming, missing edge cases. Then it hit me: these are exactly the problems Robert C. Martin warned about in &lt;em&gt;Clean Code&lt;/em&gt; almost two decades ago.&lt;/p&gt;

&lt;p&gt;So I went back to the book, specifically Chapter 17's catalog of 66 code smells and heuristics. These aren't just relevant to AI coding—they're &lt;em&gt;more&lt;/em&gt; relevant. AI makes exactly the mistakes Uncle Bob warned us about, just faster and at scale.&lt;/p&gt;

&lt;p&gt;The solution? &lt;strong&gt;Skills&lt;/strong&gt;—instruction files that AI agents read before writing code. I've translated Clean Code's complete catalog into Python skills you can use today. They work in Google's Antigravity IDE, Anthropic's Claude Code, and anywhere that supports the Agent Skills standard.&lt;/p&gt;

&lt;p&gt;Let me show you why we need this, and how to implement it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Even Linus Torvalds Vibe Codes (Sometimes)
&lt;/h2&gt;

&lt;p&gt;In January 2026, Linus Torvalds revealed a side project called &lt;a href="https://github.com/torvalds/AudioNoise" rel="noopener noreferrer"&gt;AudioNoise&lt;/a&gt;—a digital audio effects simulator he'd been tinkering with over the holidays. The Python visualizer, he noted, was "basically written by vibe-coding."&lt;/p&gt;

&lt;p&gt;In his own words from the repo:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I know more about analog filters—and that's not saying much—than I do about python. It started out as my typical 'google and do the monkey-see-monkey-do' kind of programming, but then I cut out the middle-man—me—and just used Google Antigravity to do the audio sample visualizer."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The Hacker News discussion revealed two camps. Some saw it as validation: "It's official, vibe coding is legit." Others noted the crucial context: Torvalds used AI for the part he lacks expertise in (Python visualization) while hand-coding the parts he knows (C and digital signal processing).&lt;/p&gt;

&lt;p&gt;One commenter nailed it: "There's a big difference between vibe-coding an entire project and having an AI build a component that you lack competency for."&lt;/p&gt;

&lt;p&gt;Another observation cut deeper: "If anyone on the planet knows how to do vibe coding right, it's him"—because Torvalds spent decades mastering code review. He can spot bad code instantly. Most of us can't.&lt;/p&gt;

&lt;p&gt;But here's what's telling: Torvalds wrote tests for his hand-coded C—numerical accuracy checks for the DSP primitives he understands. The vibe-coded Python visualizer? &lt;strong&gt;No tests, no type hints, and a duplicated function definition that slipped right through.&lt;/strong&gt; The same four-line method appears twice in a row—the first an empty stub, the second the real implementation. It's textbook "Accept All, don't read the diffs." The code runs fine (Python silently overwrites the first definition), but it's exactly the kind of dead code that accumulates into maintenance nightmares.&lt;/p&gt;

&lt;p&gt;This works for Torvalds' toy project precisely. It's a throwaway learning exercise. The moment that visualizer needs to be production code, those missing guardrails become technical debt.&lt;/p&gt;

&lt;p&gt;The same week, Torvalds rejected "AI slop" submissions to the Linux kernel, arguing that documentation telling people not to submit garbage won't help because "the people who would submit it won't read the documentation anyway."&lt;/p&gt;

&lt;p&gt;The lesson isn't that vibe coding is bad. It's that context matters. Skills let you define when to enforce rigor and when to let the vibes flow.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Data: AI Code Quality Is Getting Worse
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://cloud.google.com/resources/content/2025-dora-ai-assisted-software-development-report" rel="noopener noreferrer"&gt;Google's DORA Report&lt;/a&gt;&lt;/strong&gt;  found AI adoption shows a negative relationship with software delivery stability. The 2025 report's central finding: "AI doesn't fix a team; it amplifies what's already there." Without robust control systems—strong testing, mature practices, fast feedback loops—increased AI-generated code leads to instability. Skills are exactly those control systems, encoded as instructions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://arxiv.org/abs/2511.04427" rel="noopener noreferrer"&gt;Carnegie Mellon researchers&lt;/a&gt;&lt;/strong&gt; analyzed 807 GitHub repositories after Cursor adoption: +30% static analysis warnings, +41% code complexity. The speed gains were transient; the quality problems compounded.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.gitclear.com/ai_assistant_code_quality_2025_research" rel="noopener noreferrer"&gt;GitClear's&lt;/a&gt;&lt;/strong&gt; analysis of 211 million lines of code from Google, Microsoft, Meta, and enterprise repositories found code duplication increased &lt;strong&gt;4x&lt;/strong&gt; with AI adoption. For the first time in their dataset, copy/pasted code exceeded refactored code.&lt;/p&gt;

&lt;p&gt;Even &lt;strong&gt;&lt;a href="https://claude.com/blog/eight-trends-defining-how-software-gets-built-in-2026" rel="noopener noreferrer"&gt;Anthropic's Agentic Coding Trends Report&lt;/a&gt;&lt;/strong&gt; shows the gap: developers use AI in roughly 60% of their work, but can fully delegate only 0-20% of tasks. The rest requires "thoughtful setup, active supervision, and human judgment."&lt;/p&gt;

&lt;p&gt;That gap—between what AI touches and what AI can own—is exactly what skills address. The setup &lt;em&gt;is&lt;/em&gt; the skill. The supervision &lt;em&gt;is&lt;/em&gt; the rules.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Pattern: AI Recreates Classic Code Smells
&lt;/h3&gt;

&lt;p&gt;The research consistently identifies the same failure patterns. Here's how they map to specific Clean Code violations:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Naming and Consistency Problems&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inconsistent variable names across similar functions&lt;/li&gt;
&lt;li&gt;Vague names like &lt;code&gt;data&lt;/code&gt;, &lt;code&gt;tmp&lt;/code&gt;, &lt;code&gt;proc&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Mixing naming conventions (camelCase and snake_case)&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Clean Code rules: N1 (descriptive names), G11 (consistency), G24 (conventions)&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Code Duplication&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Copy/paste instead of extracting shared logic&lt;/li&gt;
&lt;li&gt;Same calculation appearing in multiple places&lt;/li&gt;
&lt;li&gt;Pattern repetition that should be abstracted&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Clean Code rule: G5 (DRY - Don't Repeat Yourself)&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Missing Safety Checks&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No validation of input boundaries&lt;/li&gt;
&lt;li&gt;Assumptions about data structure without verification&lt;/li&gt;
&lt;li&gt;Missing null/None checks&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Clean Code rules: G3 (boundary conditions), G4 (don't override safeties), G26 (be precise)&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Readability Issues&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Magic numbers without explanation (what does 86400 mean?)&lt;/li&gt;
&lt;li&gt;Unused variables cluttering code&lt;/li&gt;
&lt;li&gt;Functions mixing multiple abstraction levels&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Clean Code rules: G12 (remove clutter), G16 (no obscured intent), G34 (single abstraction level)&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Performance Problems&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Functions doing multiple things at once&lt;/li&gt;
&lt;li&gt;Exposing internal data unnecessarily&lt;/li&gt;
&lt;li&gt;Nested loops that could be optimized&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Clean Code rules: G8 (minimize public interface), G30 (functions do one thing)&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These aren't arbitrary style preferences—they're the exact problems that make code hard to maintain, debug, and extend. The skills we'll build enforce these rules automatically.&lt;/p&gt;

&lt;p&gt;The fix isn't to stop using AI. It's to give AI the explicit rules it needs to follow.&lt;/p&gt;

&lt;p&gt;That's what skills do.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Are Skills?
&lt;/h2&gt;

&lt;p&gt;Skills are markdown files containing domain-specific instructions that AI agents read before working on your code. They follow the &lt;a href="https://agentskills.io" rel="noopener noreferrer"&gt;Agent Skills&lt;/a&gt; open standard and work in Google Antigravity, Anthropic's Claude Code, and other compatible agents.&lt;/p&gt;

&lt;p&gt;The architecture is called &lt;strong&gt;Progressive Disclosure&lt;/strong&gt;. Instead of dumping every instruction into the agent's context at once (causing what Antigravity's docs call "Context Saturation"), skills work in layers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Discovery&lt;/strong&gt;: The agent sees only a lightweight menu of skill names and descriptions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Activation&lt;/strong&gt;: When your request matches a skill's description, the full instructions load&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execution&lt;/strong&gt;: Scripts and templates are read only when the task requires them&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This keeps the agent fast and focused. It's not thinking about database migrations when you're writing a React component.&lt;/p&gt;

&lt;p&gt;The format is simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;skill-name&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;When this skill should activate&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="c1"&gt;# Skill Title&lt;/span&gt;

&lt;span class="s"&gt;Your instructions, examples, and rules here.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The &lt;code&gt;description&lt;/code&gt; field is crucial—it's the trigger phrase. The agent semantically matches your request against all available skill descriptions to decide which ones to load. "Enforces function best practices" is vague. "Use when writing or refactoring Python functions" tells the agent exactly when to activate.&lt;/p&gt;

&lt;p&gt;Skills can do far more than enforce coding standards—the community has built skills for Stripe integration, Metasploit security testing, voice agents, and even multi-agent startup automation. This article focuses on one specific use case: encoding Clean Code principles.&lt;/p&gt;

&lt;p&gt;Let me show you how to translate Clean Code's catalog into working skills.&lt;/p&gt;


&lt;h2&gt;
  
  
  Building the Skills: Three Examples
&lt;/h2&gt;

&lt;p&gt;Rather than catalog all 66 rules exhaustively, I'll show you three critical categories in detail. The complete implementation is at the end.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Comments (C1-C5): Code Should Explain Itself
&lt;/h3&gt;

&lt;p&gt;Uncle Bob is famously skeptical of comments—not because documentation is bad, but because comments rot faster than code updates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;File Reference: &lt;code&gt;clean-comments/SKILL.md&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;clean-comments&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Use when writing, fixing, editing, or reviewing Python comments and docstrings. Enforces Clean Code principles—no metadata, no redundancy, no commented-out code.&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Clean Comments&lt;/span&gt;

&lt;span class="gu"&gt;## C1: No Inappropriate Information&lt;/span&gt;

Comments shouldn't hold metadata. Use Git for author names, change history, 
ticket numbers, and dates. Comments are for technical notes about code only.

&lt;span class="gu"&gt;## C2: Delete Obsolete Comments&lt;/span&gt;

If a comment describes code that no longer exists or works differently, 
delete it immediately. Stale comments become "floating islands of 
irrelevance and misdirection."

&lt;span class="gu"&gt;## C3: No Redundant Comments&lt;/span&gt;

&lt;span class="gh"&gt;# Bad - the code already says this&lt;/span&gt;
i += 1  # increment i
user.save()  # save the user

&lt;span class="gh"&gt;# Good - explains WHY, not WHAT&lt;/span&gt;
i += 1  # compensate for zero-indexing in display

&lt;span class="gu"&gt;## C4: Write Comments Well&lt;/span&gt;

If a comment is worth writing, write it well:
&lt;span class="p"&gt;-&lt;/span&gt; Choose words carefully
&lt;span class="p"&gt;-&lt;/span&gt; Use correct grammar
&lt;span class="p"&gt;-&lt;/span&gt; Don't ramble or state the obvious
&lt;span class="p"&gt;-&lt;/span&gt; Be brief

&lt;span class="gu"&gt;## C5: Never Commit Commented-Out Code&lt;/span&gt;

&lt;span class="gh"&gt;# DELETE THIS - it's an abomination&lt;/span&gt;
&lt;span class="gh"&gt;# def old_calculate_tax(income):&lt;/span&gt;
&lt;span class="gh"&gt;#     return income * 0.15&lt;/span&gt;

Who knows how old it is? Who knows if it's meaningful? Delete it. 
Git remembers everything.

&lt;span class="gu"&gt;## The Goal&lt;/span&gt;

The best comment is the code itself. If you need a comment to explain 
what code does, refactor first, comment last.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  2. Functions (F1-F4): Small, Focused, Obvious
&lt;/h3&gt;

&lt;p&gt;Functions should do one thing, do it well, and have an obvious purpose.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;File Reference: &lt;code&gt;clean-functions/SKILL.md&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;clean-functions&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Use when writing or refactoring Python functions. Enforces Clean Code principles—maximum 3 arguments, single responsibility, no flag parameters.&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Clean Functions&lt;/span&gt;

&lt;span class="gu"&gt;## F1: Too Many Arguments (Maximum 3)&lt;/span&gt;

&lt;span class="gh"&gt;# Bad - too many parameters&lt;/span&gt;
def create_user(name, email, age, country, timezone, language, newsletter):
    ...

&lt;span class="gh"&gt;# Good - use a dataclass or dict&lt;/span&gt;
@dataclass
class UserData:
    name: str
    email: str
    age: int
    country: str
    timezone: str
    language: str
    newsletter: bool

def create_user(data: UserData):
    ...

More than 3 arguments means your function is doing too much or needs 
a data structure.

&lt;span class="gu"&gt;## F2: No Output Arguments&lt;/span&gt;

Don't modify arguments as side effects. Return values instead.

&lt;span class="gh"&gt;# Bad - modifies argument&lt;/span&gt;
def append_footer(report: Report) -&amp;gt; None:
    report.append("&lt;span class="se"&gt;\n&lt;/span&gt;---&lt;span class="se"&gt;\n&lt;/span&gt;Generated by System")

&lt;span class="gh"&gt;# Good - returns new value&lt;/span&gt;
def with_footer(report: Report) -&amp;gt; Report:
    return report + "&lt;span class="se"&gt;\n&lt;/span&gt;---&lt;span class="se"&gt;\n&lt;/span&gt;Generated by System"

&lt;span class="gu"&gt;## F3: No Flag Arguments&lt;/span&gt;

Boolean flags mean your function does at least two things.

&lt;span class="gh"&gt;# Bad - function does two different things&lt;/span&gt;
def render(is_test: bool):
    if is_test:
        render_test_page()
    else:
        render_production_page()

&lt;span class="gh"&gt;# Good - split into two functions&lt;/span&gt;
def render_test_page(): ...
def render_production_page(): ...

&lt;span class="gu"&gt;## F4: Delete Dead Functions&lt;/span&gt;

If it's not called, delete it. No "just in case" code. Git preserves history.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  3. General Principles (G1-G36): The Core Rules
&lt;/h3&gt;

&lt;p&gt;These are the fundamental patterns that separate clean code from legacy nightmares.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;File Reference: &lt;code&gt;clean-general/SKILL.md&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;clean-general&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Use when reviewing Python code quality. Enforces Clean Code's core principles—DRY, single responsibility, clear intent, no magic numbers, proper abstractions.&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# General Clean Code Principles&lt;/span&gt;

&lt;span class="gu"&gt;## Critical Rules&lt;/span&gt;

&lt;span class="gs"&gt;**G5: DRY (Don't Repeat Yourself)**&lt;/span&gt;

Every piece of knowledge has one authoritative representation.

&lt;span class="gh"&gt;# Bad - duplication&lt;/span&gt;
tax_rate = 0.0825
ca_total = subtotal &lt;span class="err"&gt;*&lt;/span&gt; 1.0825
ny_total = subtotal &lt;span class="err"&gt;*&lt;/span&gt; 1.07

&lt;span class="gh"&gt;# Good - single source of truth&lt;/span&gt;
TAX_RATES = {"CA": 0.0825, "NY": 0.07}
def calculate_total(subtotal: float, state: str) -&amp;gt; float:
    return subtotal &lt;span class="err"&gt;*&lt;/span&gt; (1 + TAX_RATES[state])

&lt;span class="gs"&gt;**G16: No Obscured Intent**&lt;/span&gt;

Don't be clever. Be clear.

&lt;span class="gh"&gt;# Bad - what does this do?&lt;/span&gt;
return (x &amp;amp; 0x0F) &amp;lt;&amp;lt; 4 | (y &amp;amp; 0x0F)

&lt;span class="gh"&gt;# Good - obvious intent&lt;/span&gt;
return pack_coordinates(x, y)

&lt;span class="gs"&gt;**G23: Prefer Polymorphism to If/Else**&lt;/span&gt;

&lt;span class="gh"&gt;# Bad - will grow forever&lt;/span&gt;
def calculate_pay(employee):
    if employee.type == "SALARIED":
        return employee.salary
    elif employee.type == "HOURLY":
        return employee.hours &lt;span class="err"&gt;*&lt;/span&gt; employee.rate
    elif employee.type == "COMMISSIONED":
        return employee.base + employee.commission

&lt;span class="gh"&gt;# Good - open/closed principle&lt;/span&gt;
class SalariedEmployee:
    def calculate_pay(self): return self.salary

class HourlyEmployee:
    def calculate_pay(self): return self.hours &lt;span class="err"&gt;*&lt;/span&gt; self.rate

class CommissionedEmployee:
    def calculate_pay(self): return self.base + self.commission

&lt;span class="gs"&gt;**G25: Replace Magic Numbers with Named Constants**&lt;/span&gt;

&lt;span class="gh"&gt;# Bad&lt;/span&gt;
if elapsed_time &amp;gt; 86400:
    ...

&lt;span class="gh"&gt;# Good&lt;/span&gt;
SECONDS_PER_DAY = 86400
if elapsed_time &amp;gt; SECONDS_PER_DAY:
    ...

&lt;span class="gs"&gt;**G30: Functions Should Do One Thing**&lt;/span&gt;

If you can extract another function, your function does more than one thing.

&lt;span class="gs"&gt;**G36: Law of Demeter (Avoid Train Wrecks)**&lt;/span&gt;

&lt;span class="gh"&gt;# Bad - reaching through multiple objects&lt;/span&gt;
output_dir = context.options.scratch_dir.absolute_path

&lt;span class="gh"&gt;# Good - one dot&lt;/span&gt;
output_dir = context.get_scratch_dir()

&lt;span class="gu"&gt;## Enforcement Checklist&lt;/span&gt;

When reviewing AI-generated code, verify:
&lt;span class="p"&gt;-&lt;/span&gt; [ ] No duplication (G5)
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Clear intent, no magic numbers (G16, G25)
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Polymorphism over conditionals (G23)
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Functions do one thing (G30)
&lt;span class="p"&gt;-&lt;/span&gt; [ ] No Law of Demeter violations (G36)
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Boundary conditions handled (G3)
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Dead code removed (G9)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Complete Catalog
&lt;/h3&gt;

&lt;p&gt;I've translated all 66 rules from Clean Code Chapter 17 into skills covering six categories:&lt;/p&gt;

&lt;p&gt;
  Click to expand all skill categories
  &lt;p&gt;&lt;strong&gt;Comments (C1-C5)&lt;/strong&gt;: Minimal, accurate commenting&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;C1: No inappropriate information (metadata belongs in version control)&lt;/li&gt;
&lt;li&gt;C2: Delete obsolete comments immediately&lt;/li&gt;
&lt;li&gt;C3: No redundant comments that repeat the code&lt;/li&gt;
&lt;li&gt;C4: Write comments well—brief, grammatical, purposeful&lt;/li&gt;
&lt;li&gt;C5: Never commit commented-out code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Environment (E1-E2)&lt;/strong&gt;: One-command build and test&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;E1: Build requires only one step&lt;/li&gt;
&lt;li&gt;E2: Tests require only one step&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Functions (F1-F4)&lt;/strong&gt;: Small, focused, obvious&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;F1: Maximum 3 arguments (use data structures for more)&lt;/li&gt;
&lt;li&gt;F2: No output arguments (return values instead)&lt;/li&gt;
&lt;li&gt;F3: No flag arguments (split into separate functions)&lt;/li&gt;
&lt;li&gt;F4: Delete dead functions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;General (G1-G36)&lt;/strong&gt;: Core principles&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;G1: Multiple languages in one source file&lt;/li&gt;
&lt;li&gt;G2: Obvious behavior is unimplemented&lt;/li&gt;
&lt;li&gt;G3: Incorrect behavior at the boundaries&lt;/li&gt;
&lt;li&gt;G4: Overridden safeties&lt;/li&gt;
&lt;li&gt;G5: Duplication&lt;/li&gt;
&lt;li&gt;G6: Code at wrong level of abstraction&lt;/li&gt;
&lt;li&gt;G7: Base classes depending on their derivatives&lt;/li&gt;
&lt;li&gt;G8: Too much information&lt;/li&gt;
&lt;li&gt;G9: Dead code&lt;/li&gt;
&lt;li&gt;G10: Vertical separation&lt;/li&gt;
&lt;li&gt;G11: Inconsistency&lt;/li&gt;
&lt;li&gt;G12: Clutter&lt;/li&gt;
&lt;li&gt;G13: Artificial coupling&lt;/li&gt;
&lt;li&gt;G14: Feature envy&lt;/li&gt;
&lt;li&gt;G15: Selector arguments&lt;/li&gt;
&lt;li&gt;G16: Obscured intent&lt;/li&gt;
&lt;li&gt;G17: Misplaced responsibility&lt;/li&gt;
&lt;li&gt;G18: Inappropriate static&lt;/li&gt;
&lt;li&gt;G19: Use explanatory variables&lt;/li&gt;
&lt;li&gt;G20: Function names should say what they do&lt;/li&gt;
&lt;li&gt;G21: Understand the algorithm&lt;/li&gt;
&lt;li&gt;G22: Make logical dependencies physical&lt;/li&gt;
&lt;li&gt;G23: Prefer polymorphism to if/else or switch/case&lt;/li&gt;
&lt;li&gt;G24: Follow standard conventions&lt;/li&gt;
&lt;li&gt;G25: Replace magic numbers with named constants&lt;/li&gt;
&lt;li&gt;G26: Be precise&lt;/li&gt;
&lt;li&gt;G27: Structure over convention&lt;/li&gt;
&lt;li&gt;G28: Encapsulate conditionals&lt;/li&gt;
&lt;li&gt;G29: Avoid negative conditionals&lt;/li&gt;
&lt;li&gt;G30: Functions should do one thing&lt;/li&gt;
&lt;li&gt;G31: Hidden temporal couplings&lt;/li&gt;
&lt;li&gt;G32: Don't be arbitrary&lt;/li&gt;
&lt;li&gt;G33: Encapsulate boundary conditions&lt;/li&gt;
&lt;li&gt;G34: Functions should descend only one level of abstraction&lt;/li&gt;
&lt;li&gt;G35: Keep configurable data at high levels&lt;/li&gt;
&lt;li&gt;G36: Avoid transitive navigation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Names (N1-N7)&lt;/strong&gt;: Descriptive, unambiguous, right-sized&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;N1: Choose descriptive names&lt;/li&gt;
&lt;li&gt;N2: Choose names at the right abstraction level&lt;/li&gt;
&lt;li&gt;N3: Use standard nomenclature where possible&lt;/li&gt;
&lt;li&gt;N4: Use unambiguous names&lt;/li&gt;
&lt;li&gt;N5: Use long names for long scopes&lt;/li&gt;
&lt;li&gt;N6: Avoid encodings (Hungarian notation, etc.)&lt;/li&gt;
&lt;li&gt;N7: Names should describe side effects&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Tests (T1-T9)&lt;/strong&gt;: Fast, independent, exhaustive&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;T1: Insufficient tests—test everything that could break&lt;/li&gt;
&lt;li&gt;T2: Use a coverage tool&lt;/li&gt;
&lt;li&gt;T3: Don't skip trivial tests&lt;/li&gt;
&lt;li&gt;T4: Ignored tests indicate ambiguity&lt;/li&gt;
&lt;li&gt;T5: Test boundary conditions&lt;/li&gt;
&lt;li&gt;T6: Exhaustively test near bugs&lt;/li&gt;
&lt;li&gt;T7: Patterns of failure are diagnostic&lt;/li&gt;
&lt;li&gt;T8: Coverage patterns can be revealing&lt;/li&gt;
&lt;li&gt;T9: Tests should be fast&lt;/li&gt;
&lt;/ul&gt;



&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Get the complete skill files:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;

&lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/ertugrul-dmr" rel="noopener noreferrer"&gt;
        ertugrul-dmr
      &lt;/a&gt; / &lt;a href="https://github.com/ertugrul-dmr/clean-code-skills" rel="noopener noreferrer"&gt;
        clean-code-skills
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Clean Code Skills for AI Agents&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href="https://agentskills.io" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/2bad303febd09cbe378475a843a53a6edf564fbe547636be2bb815d8835c7e1e/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4167656e74253230536b696c6c732d436f6d70617469626c652d626c7565" alt="Agent Skills"&gt;&lt;/a&gt;
&lt;a href="https://github.com/ertugrul-dmr/clean-code-skills/LICENSE" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/784362b26e4b3546254f1893e778ba64616e362bd6ac791991d2c9e880a3a64e/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4c6963656e73652d4d49542d677265656e2e737667" alt="License: MIT"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Teach your AI to write code that doesn't suck.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This repository contains &lt;a href="https://agentskills.io" rel="nofollow noopener noreferrer"&gt;Agent Skills&lt;/a&gt; that enforce Robert C. Martin's &lt;em&gt;Clean Code&lt;/em&gt; principles. They work with Google Antigravity, Anthropic's Claude Code, and any agent that supports the Agent Skills standard.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Why?&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;AI generates code fast, but research shows it also generates technical debt fast:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitClear&lt;/strong&gt;: 4x increase in code duplication with AI adoption&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Carnegie Mellon&lt;/strong&gt;: +30% static analysis warnings, +41% code complexity after Cursor adoption&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Google DORA&lt;/strong&gt;: Negative relationship between AI adoption and software delivery stability&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These skills encode battle-tested solutions to exactly these problems—directly into your AI workflow.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;What's Included&lt;/h2&gt;
&lt;/div&gt;
&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Skill&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Rules&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;boy-scout&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Orchestrator&lt;/strong&gt;—always leave code cleaner than you found it&lt;/td&gt;
&lt;td&gt;Coordinates all skills&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;python-clean-code&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Master skill&lt;/strong&gt; with all 66 rules&lt;/td&gt;
&lt;td&gt;C1-C5, E1-E2, F1-F4, G1-G36, N1-N7, P1-P3, T1-T9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;clean-comments&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Minimal, accurate commenting&lt;/td&gt;
&lt;td&gt;C1-C5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;clean-functions&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Small, focused, obvious functions&lt;/td&gt;
&lt;td&gt;F1-F4&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;…&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/ertugrul-dmr/clean-code-skills" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;




&lt;p&gt;The repo includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;boy-scout&lt;/code&gt;&lt;/strong&gt;: An orchestrator skill that embodies the Boy Scout Rule—"always leave code cleaner than you found it"—and coordinates the other skills&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;python-clean-code&lt;/code&gt;&lt;/strong&gt;: A master skill with all 66 rules, plus a quick reference table and anti-patterns cheatsheet&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Individual skills&lt;/strong&gt; for each category (&lt;code&gt;clean-comments&lt;/code&gt;, &lt;code&gt;clean-functions&lt;/code&gt;, &lt;code&gt;clean-general&lt;/code&gt;, &lt;code&gt;clean-names&lt;/code&gt;, &lt;code&gt;clean-tests&lt;/code&gt;)—drop in only what you need&lt;/li&gt;
&lt;li&gt;Installation instructions for Antigravity, Claude Code, and other Agent Skills-compatible tools&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  How to Use These Skills
&lt;/h2&gt;

&lt;p&gt;Skills sit in a specific place in the agent ecosystem. &lt;strong&gt;Rules&lt;/strong&gt; are passive guardrails that are always on. &lt;strong&gt;Skills&lt;/strong&gt; are agent-triggered—the model decides when to equip them based on your intent. If you're using MCP servers (connections to external tools like GitHub or Postgres), think of MCP as the "hands" and skills as the "brains" that direct them.&lt;/p&gt;

&lt;h3&gt;
  
  
  For Antigravity
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Create &lt;code&gt;.agent/skills/&lt;/code&gt; in your project root (or &lt;code&gt;~/.gemini/antigravity/skills/&lt;/code&gt; for global access)&lt;/li&gt;
&lt;li&gt;Save the skill as a folder with a &lt;code&gt;SKILL.md&lt;/code&gt; file inside (e.g., &lt;code&gt;.agent/skills/python-clean-code/SKILL.md&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Ask the agent to review or write code—it'll automatically apply the rules when relevant&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Global vs Project Skills
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Project-specific&lt;/strong&gt;: &lt;code&gt;.agent/skills/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Global Antigravity&lt;/strong&gt;: &lt;code&gt;~/.gemini/antigravity/skills/&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The agent only loads full skill content when needed, so comprehensive skills don't slow down simple requests.&lt;/p&gt;

&lt;h3&gt;
  
  
  Going Further
&lt;/h3&gt;

&lt;p&gt;The skills in this article are instruction-only—they tell the agent what to do. For stricter enforcement, you could add a &lt;code&gt;scripts/&lt;/code&gt; folder with a linter that compatible agents runs them automatically, or an &lt;code&gt;examples/&lt;/code&gt; folder with before/after code samples for few-shot learning. The format supports it; we're just keeping things simple here.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Real-World Example
&lt;/h2&gt;

&lt;p&gt;Here's code that violates multiple Clean Code rules:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;utils&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;  &lt;span class="c1"&gt;# P1
# Author: John, Modified: 2024-01-15  # C1
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;proc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flag&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;  &lt;span class="c1"&gt;# N1, F1, F3
&lt;/span&gt;    &lt;span class="c1"&gt;# Process the data  # C3
&lt;/span&gt;    &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;  &lt;span class="c1"&gt;# N1
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;flag&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# F3
&lt;/span&gt;            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;A&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# G23
&lt;/span&gt;                &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;val&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;1.0825&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# G25
&lt;/span&gt;            &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;B&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;val&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;1.05&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# G25
&lt;/span&gt;        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;val&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/tmp/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.json&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# G6
&lt;/span&gt;        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dump&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Old approach  # C5
&lt;/span&gt;    &lt;span class="c1"&gt;# for item in d:
&lt;/span&gt;    &lt;span class="c1"&gt;#     print(item)
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Violations&lt;/strong&gt;: P1, C1, C3, C5, F1, F3, G6, G23, G25, N1&lt;/p&gt;

&lt;p&gt;With the Clean Code skill active, ask your AI agent to refactor this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dataclasses&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dataclass&lt;/span&gt;

&lt;span class="n"&gt;TAX_RATE_CA&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0825&lt;/span&gt;
&lt;span class="n"&gt;TAX_RATE_NY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.05&lt;/span&gt;
&lt;span class="n"&gt;TransactionType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;CA&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;NY&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nd"&gt;@dataclass&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Transaction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;
    &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TransactionType&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;apply_tax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Transaction&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Apply state-specific tax to transaction value.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;tax_rates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;CA&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TAX_RATE_CA&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;NY&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TAX_RATE_NY&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;tax_rates&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_transactions_with_tax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Transaction&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Calculate taxed values for all transactions.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;apply_tax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_transactions_without_tax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Transaction&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Extract raw values from all transactions.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_results&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Save processed values to JSON file.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mkdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exist_ok&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dump&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The refactored version:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ No wildcard imports (P1)&lt;/li&gt;
&lt;li&gt;✅ No metadata comments (C1)&lt;/li&gt;
&lt;li&gt;✅ No redundant comments (C3)&lt;/li&gt;
&lt;li&gt;✅ No commented-out code (C5)&lt;/li&gt;
&lt;li&gt;✅ Descriptive names (N1)&lt;/li&gt;
&lt;li&gt;✅ No flag arguments (F3)&lt;/li&gt;
&lt;li&gt;✅ Named constants instead of magic numbers (G25)&lt;/li&gt;
&lt;li&gt;✅ Functions do one thing (G30)&lt;/li&gt;
&lt;li&gt;✅ Polymorphism through data structure (G23)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Anatomy of a Vibe-Coded Script
&lt;/h3&gt;

&lt;p&gt;Remember the duplicated function I mentioned in Torvalds' &lt;a href="https://github.com/torvalds/AudioNoise/blob/3a6b51032da587e5d2e269515f3dc21c96da15c4/visualize.py#L342C9-L342C27" rel="noopener noreferrer"&gt;AudioNoise visualizer&lt;/a&gt;? Here it is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_slider_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Helper to update slider texts (Width and End Point).&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;start_val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end_val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;
    &lt;span class="n"&gt;width&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;end_val&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_val&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_slider_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Helper to update slider texts (Width and End Point).&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;start_val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end_val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;
    &lt;span class="n"&gt;width&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;end_val&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_val&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x_mode&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Time&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;slider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;valtext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Window: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;start_val&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;slider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;valtext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Window: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start_val&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first definition unpacks values, calculates width, then... returns &lt;code&gt;None&lt;/code&gt;. The second definition is the real implementation. Python silently overwrites the first with the second, so the code runs. But it's textbook dead code—&lt;strong&gt;Clean Code rule G9: Remove dead code.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With the skill active, an agent refactors the entire 600-line script. The duplicate vanishes, magic numbers become constants, and nested functions get extracted into focused methods:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_slider_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Update slider text with either time or sample count.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;start_val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end_val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;
    &lt;span class="n"&gt;width&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;end_val&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_val&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x_mode&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Time&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;slider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;valtext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Window: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;start_val&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;slider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;valtext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Window: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start_val&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; + &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbaemhkvcb4do3479focu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbaemhkvcb4do3479focu.png" alt="Antigravity Review"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The refactored version:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Dead code removed (G9)&lt;/li&gt;
&lt;li&gt;✅ Type hints added (clarity)&lt;/li&gt;
&lt;li&gt;✅ Single, authoritative definition (G5)&lt;/li&gt;
&lt;li&gt;✅ Magic numbers extracted to constants (G25)&lt;/li&gt;
&lt;li&gt;✅ Large methods decomposed (G30)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The full diff shows 600+ lines reduced to ~440—not by removing functionality, but by eliminating duplication and extracting reusable patterns.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters Now
&lt;/h2&gt;

&lt;p&gt;Vibe coding isn't going away. AI will get better at generating code, not worse. But "better at generating" doesn't mean "better at maintaining."&lt;/p&gt;

&lt;p&gt;The research is clear: AI produces code faster, but that code accumulates technical debt faster too. Without guard rails, we're building tomorrow's legacy systems today.&lt;/p&gt;

&lt;p&gt;Uncle Bob's Clean Code principles are almost 20 years old, but they're exactly what we need now. They're not arbitrary style preferences—they're battle-tested solutions to the problems AI recreates at scale.&lt;/p&gt;

&lt;p&gt;Skills give you the mechanism to encode these rules directly into your AI workflow. Whether you're using Antigravity, Claude Code, or another agent, the approach is the same: &lt;strong&gt;define what clean code means, then let the AI follow the rules&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Your agent doesn't know what good code looks like unless you tell it.&lt;/p&gt;

&lt;p&gt;So tell it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Book&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Clean Code&lt;/strong&gt; by Robert C. Martin: &lt;a href="https://www.amazon.com/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882" rel="noopener noreferrer"&gt;Amazon&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Skills Documentation&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://agentskills.io" rel="noopener noreferrer"&gt;Agent Skills Standard&lt;/a&gt; — The open standard for AI agent instructions&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://antigravity.google/docs/skills" rel="noopener noreferrer"&gt;Antigravity Skills Guide&lt;/a&gt; — Google's official documentation&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview" rel="noopener noreferrer"&gt;Claude Code Agent Skills&lt;/a&gt; — Anthropic's implementation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Research Cited&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://cloud.google.com/resources/content/2025-dora-ai-assisted-software-development-report" rel="noopener noreferrer"&gt;DORA 2025: AI-Assisted Software Development&lt;/a&gt; — Google's findings on AI and delivery stability&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://arxiv.org/abs/2511.04427" rel="noopener noreferrer"&gt;Code Quality After Cursor Adoption&lt;/a&gt; — Carnegie Mellon's analysis of 807 repositories&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.gitclear.com/ai_assistant_code_quality_2025_research" rel="noopener noreferrer"&gt;GitClear 2025 Code Quality Report&lt;/a&gt; — 211M lines analyzed&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://claude.com/blog/eight-trends-defining-how-software-gets-built-in-2026" rel="noopener noreferrer"&gt;Agentic Coding Trends&lt;/a&gt; — Anthropic's delegation gap analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Get the Skills&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
Clean Code Skills Repository — All 66 rules as ready-to-use skill files&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The future of programming is human intent translated by AI. Make sure the translation preserves quality, not just speed.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>antigravity</category>
      <category>python</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
