<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: kivumia</title>
    <description>The latest articles on DEV Community by kivumia (@swarmly).</description>
    <link>https://dev.to/swarmly</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3849501%2Fa6661c84-1c46-49da-b721-26eb3938197e.png</url>
      <title>DEV Community: kivumia</title>
      <link>https://dev.to/swarmly</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/swarmly"/>
    <language>en</language>
    <item>
      <title>We validated our COBOL-to-Python engine on 15,552 real-world programs. 98.78% produce valid Python. Zero LLMs involved.</title>
      <dc:creator>kivumia</dc:creator>
      <pubDate>Sun, 05 Apr 2026 04:39:11 +0000</pubDate>
      <link>https://dev.to/swarmly/we-validated-our-cobol-to-python-engine-on-15552-real-world-programs-9878-produce-valid-python-2d6a</link>
      <guid>https://dev.to/swarmly/we-validated-our-cobol-to-python-engine-on-15552-real-world-programs-9878-produce-valid-python-2d6a</guid>
<description>&lt;p&gt;We validated our COBOL-to-Python engine on 15,552 real-world programs. 98.78% produce valid Python. Zero LLMs involved.&lt;br&gt;
Last week we published a proof of concept with IBM's SAM1 — 505 lines, 32 milliseconds.&lt;br&gt;
This week we scaled it to the entire planet.&lt;br&gt;
The corpus&lt;br&gt;
15,552 COBOL source files. Not synthetic benchmarks. Real programs, collected from 131&lt;br&gt;
open-source repositories across 5 continents:&lt;br&gt;
— Norway. France. Brazil. India. Japan. USA.&lt;br&gt;
— GitHub. HuggingFace. CBT Tape. GnuCOBOL. IBM public repositories.&lt;br&gt;
— Commercial COBOL. GnuCOBOL extensions. TypeCOBOL. Mainframe dialects.&lt;br&gt;
No selection bias. No curated samples. Everything we could find.&lt;br&gt;
The result&lt;br&gt;
Corpus | Valid Python | Failures | Net gain&lt;br&gt;
Before (v5.6): 14,508 files | 14,020 (96.84%) | 456 | —&lt;br&gt;
After (v5.8e): 15,552 files (+1,044) | 15,362 (98.78%) | 190 | +1,342 files&lt;br&gt;
On the original v5.7 reference corpus: 99.25%. 180 of 289 failures corrected in a single session.&lt;br&gt;
What "valid Python" means&lt;br&gt;
We are not using LLMs to judge output quality. We are not doing string comparison. We are not&lt;br&gt;
running style checks.&lt;br&gt;
We use ast.parse().&lt;br&gt;
Binary. Deterministic. No margin for interpretation.&lt;br&gt;
If the generated Python passes ast.parse() without raising a SyntaxError — it is valid. If it raises — it&lt;br&gt;
fails. Nothing in between.&lt;br&gt;
This is the strictest possible definition of syntactic correctness. A human reviewer cannot override it.&lt;br&gt;
A model cannot hallucinate its way through it.&lt;br&gt;
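The gate itself fits in a few lines. A minimal sketch of that ast.parse() check (the sample inputs are illustrative, not from our corpus):&lt;br&gt;

```python
import ast

def is_valid_python(source: str) -> bool:
    """Return True iff the source parses without a SyntaxError."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

# Binary, deterministic: same input, same verdict, every time.
print(is_valid_python("x = 1 + 2"))      # True
print(is_valid_python("PERFORM UNTIL"))  # False: not Python
```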
What fails and why&lt;br&gt;
190 files still fail. Here is what they are:&lt;br&gt;
Category | ~Files | Example&lt;br&gt;
TypeCOBOL | ~60 | Multi-level qualifications, REPLACE, typed expressions&lt;br&gt;
GnuCOBOL extensions | ~40 | GUI, bitwise composed, OO, SCREEN SECTION&lt;br&gt;
Non-standard COBOL | ~30 | WebSocket, brainfuck interpreter, .NET GUI&lt;br&gt;
Deep STRING/UNSTRING | ~25 | Complex nesting, multiple delimiters&lt;br&gt;
Exotic mainframe | ~35 | CICS inline, complex EXEC SQL, nested copybooks&lt;br&gt;
These are not parsing bugs. These are constructions that sit at the outer boundary of what any&lt;br&gt;
standard COBOL parser is expected to handle. The sanitizer cannot fix what the parser never&lt;br&gt;
understood.&lt;br&gt;
We know exactly what they are. We are working on them.&lt;br&gt;
How it works&lt;br&gt;
AGUELLID CODE does not translate COBOL to Python.&lt;br&gt;
It transforms COBOL into a semantic intermediate representation, then generates Python that is&lt;br&gt;
provably equivalent — not line-by-line, but behavior-by-behavior.&lt;br&gt;
No neural network. No prompt. No sampling.&lt;br&gt;
The transformation is deterministic: the same input always produces the same output. The output&lt;br&gt;
can be audited. The logic can be traced. There is no black box.&lt;br&gt;
This matters in banking. In insurance. In government systems. In any environment where "the model&lt;br&gt;
thought it was right" is not an acceptable explanation.&lt;br&gt;
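AGUELLID CODE's internals are not public, so purely as an illustration of the rule-based IR idea (every name below is hypothetical): a translator of this shape maps each construct through a fixed rule to an IR node, then prints the node as Python — no model, no sampling, same input, same output.&lt;br&gt;

```python
# Hypothetical sketch of a deterministic IR pipeline.
# None of these names come from AGUELLID CODE; its internals are not public.
from dataclasses import dataclass

@dataclass(frozen=True)
class Move:
    """IR node for COBOL 'MOVE src TO dst'."""
    src: str
    dst: str

def cobol_to_ir(line: str) -> Move:
    # Fixed rule: "MOVE A TO B" maps to Move("A", "B"). No sampling.
    _, src, _, dst = line.split()
    return Move(src, dst)

def ir_to_python(node: Move) -> str:
    # Behavior-equivalent Python for the IR node.
    return f"{node.dst} = {node.src}"

print(ir_to_python(cobol_to_ir("MOVE TOTAL TO RESULT")))  # RESULT = TOTAL
```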
Why this matters&lt;br&gt;
There are an estimated 220 billion lines of COBOL in active production today.&lt;br&gt;
Most of it runs on systems that organizations can no longer maintain. The engineers who wrote it&lt;br&gt;
are retired. The documentation is incomplete. The behavior is institutional memory encoded in&lt;br&gt;
syntax.&lt;br&gt;
Modernizing this code is not a style choice. It is a survival question for dozens of industries.&lt;br&gt;
Current approaches:&lt;br&gt;
— Manual rewrite: expensive, slow, error-prone&lt;br&gt;
— LLM translation: non-deterministic, unauditable, high hallucination risk on legacy syntax&lt;br&gt;
— Transpilers: brittle, shallow, fail on complex constructs&lt;br&gt;
AGUELLID CODE is none of these.&lt;br&gt;
98.78% on 15,552 real files. Deterministic. Auditable. No LLMs.&lt;br&gt;
What comes next&lt;br&gt;
The 190 remaining failures map to specific parser gaps. We are working through them by gain/risk&lt;br&gt;
ratio — some TypeCOBOL patterns alone can recover 20-30 files in a single micro-patch.&lt;br&gt;
Target: 99.2-99.5% on the full expanded corpus.&lt;br&gt;
The forge is still burning.&lt;br&gt;
KIVUMIA — AGUELLID CODE v5.8e&lt;br&gt;
Validated: 2026-04-05 03:27 UTC&lt;br&gt;
Corpus: 131 sources, 15,552 files, 5 continents&lt;br&gt;
Engine: deterministic, zero LLMs&lt;br&gt;
kivumia.ai&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsbdmnxbf06sv5cnf8hfh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsbdmnxbf06sv5cnf8hfh.png" alt=" " width="604" height="203"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>cobol</category>
      <category>python</category>
      <category>legacy</category>
      <category>opensource</category>
    </item>
    <item>
      <title>We ran 6.2 billion COBOL validation passes. Zero errors. Here's what we learned.</title>
      <dc:creator>kivumia</dc:creator>
      <pubDate>Sun, 29 Mar 2026 15:10:32 +0000</pubDate>
      <link>https://dev.to/swarmly/we-ran-62-billion-cobol-validation-passes-zero-errors-heres-what-we-learned-29ic</link>
      <guid>https://dev.to/swarmly/we-ran-62-billion-cobol-validation-passes-zero-errors-heres-what-we-learned-29ic</guid>
      <description>&lt;p&gt;COBOL is not dead. It's everywhere.&lt;br&gt;
95% of ATM transactions worldwide run on COBOL. 80% of in-person point-of-sale transactions. An estimated 3 billion lines of COBOL are actively running in banking systems, insurance companies, and government infrastructure.&lt;br&gt;
And yet — no modernization vendor has ever published a large-scale validation benchmark. Promises accumulate. Evidence remains absent.&lt;br&gt;
We decided to change that.&lt;br&gt;
The test&lt;br&gt;
Environment: Hostinger KVM 8 VPS — 8 cores, 32 GB RAM, Ubuntu 24.04&lt;br&gt;
Corpus: 9,595 real COBOL files — 4,490,720 lines&lt;br&gt;
Method: 1,380 complete validation passes, 8 parallel workers&lt;br&gt;
Total duration: 12.7 hours continuous&lt;br&gt;
No synthetic data. No fabricated corpus. Real COBOL files — the raw material of industry.&lt;br&gt;
The results&lt;/p&gt;

&lt;p&gt;Total validations: 6,197,193,600&lt;br&gt;
Errors: 0&lt;br&gt;
Success rate: 100.000%&lt;br&gt;
Stable speed (0–5h): 293,000 lines/second&lt;br&gt;
Peak speed: 329,411 lines/second&lt;br&gt;
Average speed: 283,881 lines/second&lt;br&gt;
Memory leak: None&lt;br&gt;
Crash: None&lt;/p&gt;
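&lt;p&gt;The headline figure is exactly corpus size times passes; a quick sanity check of that arithmetic:&lt;/p&gt;

```python
lines_per_pass = 4_490_720  # corpus: 9,595 files, 4,490,720 lines
passes = 1_380              # complete validation passes

total = lines_per_pass * passes
print(f"{total:,}")  # 6,197,193,600 -- matches the reported total
```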

&lt;p&gt;Milestones: 1B at 0.9h — 2B at 1.9h — 3B at 2.8h — 4B at 3.8h — 5B at 4.9h — 6B at 7.9h&lt;br&gt;
The speed curve — and what it reveals&lt;br&gt;
The parser held stable at ~293K lines/second for the first 5 hours. Then throughput declined progressively.&lt;br&gt;
This is not a parser failure. It is the VPS being throttled by Hostinger after 5 hours of sustained 100% CPU load.&lt;br&gt;
The parser did not fail. The infrastructure was externally limited.&lt;br&gt;
This is the floor of an entry-level cloud VPS — not the ceiling of KIVUMIA.CODE.&lt;br&gt;
What this means&lt;br&gt;
Six billion validations. Zero errors. On a standard VPS.&lt;br&gt;
Next step: run on local Ryzen hardware, no throttle, targeting 125 billion validations.&lt;br&gt;
About KIVUMIA&lt;br&gt;
Multi-agent AI platform dedicated to COBOL modernization — semantic migration to Python, large-scale validation, European digital sovereignty.&lt;br&gt;
We don't conquer. We pollinate. 🐝&lt;br&gt;
🌐 kivumia.com | kivumia.ai&lt;/p&gt;

</description>
      <category>cobol</category>
      <category>benchmark</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
