<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Zhijie Wong</title>
    <description>The latest articles on DEV Community by Zhijie Wong (@zhijiewong).</description>
    <link>https://dev.to/zhijiewong</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3523370%2Fdbce5ad4-386a-479e-b1ac-e04498dcf897.jpeg</url>
      <title>DEV Community: Zhijie Wong</title>
      <link>https://dev.to/zhijiewong</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/zhijiewong"/>
    <language>en</language>
    <item>
      <title>How a pure-TypeScript flex layout engine closed the last WASM-Yoga gap</title>
      <dc:creator>Zhijie Wong</dc:creator>
      <pubDate>Sat, 23 May 2026 09:06:24 +0000</pubDate>
      <link>https://dev.to/zhijiewong/how-a-pure-typescript-flex-layout-engine-closed-the-last-wasm-yoga-gap-12ef</link>
      <guid>https://dev.to/zhijiewong/how-a-pure-typescript-flex-layout-engine-closed-the-last-wasm-yoga-gap-12ef</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;I've been building &lt;a href="https://github.com/pilatesjs/pilates" rel="noopener noreferrer"&gt;Pilates&lt;/a&gt;, a flex layout engine for terminal UIs in pure TypeScript. As of last week, it is faster than WASM Yoga (the engine Ink uses) on all 9 scenarios in my bench suite. That includes the structural-mutation workload (append + remove a row per frame), where Yoga led by ~5× until phases 15–17 flipped it to a ~1.7× Pilates win.&lt;/p&gt;

&lt;p&gt;No native bindings. No WASM port. The fix was algorithmic, and the algorithmic fix worked in TS.&lt;/p&gt;

&lt;h2&gt;
  
  
  The numbers
&lt;/h2&gt;

&lt;p&gt;Median latency, win32-x64, Node 22, ~5s tinybench windows with bootstrap CI95:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Pilates&lt;/th&gt;
&lt;th&gt;yoga-layout (WASM)&lt;/th&gt;
&lt;th&gt;Ratio&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;tiny (10 nodes)&lt;/td&gt;
&lt;td&gt;4.5µs&lt;/td&gt;
&lt;td&gt;19.0µs&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4.2× faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;realistic (~100)&lt;/td&gt;
&lt;td&gt;121µs&lt;/td&gt;
&lt;td&gt;328µs&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.7× faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;stress (~1000)&lt;/td&gt;
&lt;td&gt;601µs&lt;/td&gt;
&lt;td&gt;1.94ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3.2× faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;big (~5000)&lt;/td&gt;
&lt;td&gt;3.32ms&lt;/td&gt;
&lt;td&gt;9.17ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.8× faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;huge (~10000)&lt;/td&gt;
&lt;td&gt;8.62ms&lt;/td&gt;
&lt;td&gt;18.5ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.1× faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;hot-relayout&lt;/td&gt;
&lt;td&gt;16.3µs&lt;/td&gt;
&lt;td&gt;83.0µs&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;5.1× faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;hot-relayout + boundaries&lt;/td&gt;
&lt;td&gt;15.8µs&lt;/td&gt;
&lt;td&gt;77.8µs&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4.9× faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;hot-relayout (text mutation)&lt;/td&gt;
&lt;td&gt;8.9µs&lt;/td&gt;
&lt;td&gt;90.6µs&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;10× faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;hot-structural&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;71.3µs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;118.3µs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.7× faster&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Caveats up front: 9 hand-picked scenarios, not a universal claim. Reproduce with &lt;code&gt;pnpm bench&lt;/code&gt; — about 5 minutes on a recent machine.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why pure TS can beat WASM here
&lt;/h2&gt;

&lt;p&gt;Terminal UI is a curiously hostile workload for a WASM engine. Trees are small (10–10,000 nodes), but updates are frequent — one keystroke, one tick, one frame. The crossing cost from JS into WASM dominates: Yoga's per-call kernel is a few microseconds, but &lt;code&gt;node.setWidth(N)&lt;/code&gt; from JS to WASM is also a few microseconds. A pure-TS engine pays no crossing cost.&lt;/p&gt;

&lt;p&gt;That was the thesis going in. Phases 15–17 are evidence the thesis holds even in the worst case — the workload where Yoga's compute kernel is exactly what's being measured, with the tree pre-built and only the structural-mutation layout timed.&lt;/p&gt;

&lt;h2&gt;
  
  
  How hot-structural went from ~450µs to ~70µs
&lt;/h2&gt;

&lt;p&gt;Two algorithmic changes did the work.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Linear-recurrence main-axis positions
&lt;/h3&gt;

&lt;p&gt;The original main-axis position rule was a cumulative sum: each cell's position depended on the size of every prior sibling. A 100-cell row in the stress fixture meant ~300 dependency edges per row.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Old rule — every cell reads every prior sibling&lt;/span&gt;
&lt;span class="nx"&gt;mainPos&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;N&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;siblings&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="nx"&gt;N&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;mainSize&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;margin&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;gap&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Replaced with a linear recurrence — each cell only reads the cell immediately before it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// New rule — each cell only reads the previous one&lt;/span&gt;
&lt;span class="nx"&gt;mainPos&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;N&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;mainPos&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;N&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;mainSize&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;marginEnd&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;me&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;marginStart&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;gap&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
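&lt;p&gt;A minimal runnable sketch of the two rules side by side. The &lt;code&gt;Cell&lt;/code&gt; shape and both function names are assumptions made for illustration, not the actual Pilates source; the point is only that both rules produce the same positions while the second reads one neighbour per cell.&lt;/p&gt;

```typescript
// Hypothetical cell shape; the field names are assumptions for illustration.
interface Cell {
  mainSize: number;
  marginStart: number;
  marginEnd: number;
}

// Old shape: every position is rebuilt from all prior siblings.
function cumulativePositions(cells: Cell[], gap: number): number[] {
  return cells.map((me, n) => {
    let pos = me.marginStart;
    for (const prev of cells.slice(0, n)) {
      pos += prev.marginStart + prev.mainSize + prev.marginEnd + gap;
    }
    return pos;
  });
}

// New shape: position N reads only position N-1 and the previous cell.
function recurrencePositions(cells: Cell[], gap: number): number[] {
  const pos: number[] = [];
  cells.forEach((me, n) => {
    if (n === 0) {
      pos.push(me.marginStart);
      return;
    }
    const prev = cells[n - 1];
    pos.push(pos[n - 1] + prev.mainSize + prev.marginEnd + me.marginStart + gap);
  });
  return pos;
}
```

&lt;p&gt;In a dependency-tracking engine, the practical difference is that a size change in one cell dirties only its immediate successor's inputs rather than every later sibling's.&lt;/p&gt;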



&lt;p&gt;Reverse-direction (&lt;code&gt;row-reverse&lt;/code&gt; / &lt;code&gt;column-reverse&lt;/code&gt;) keeps the cumulative-sum fallback because the recurrence depends on the prior cell's already-resolved position, which doesn't hold when iteration is reversed.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Fold default-valued style inputs
&lt;/h3&gt;

&lt;p&gt;Observation: roughly half of all input fields in the grammar were sitting at default values forever — &lt;code&gt;margin: 0&lt;/code&gt;, &lt;code&gt;minWidth: 0&lt;/code&gt;, &lt;code&gt;maxWidth: undefined&lt;/code&gt;, etc. They still consumed dirty-flag slots, propagated through dependents, and appeared in dependency sets.&lt;/p&gt;

&lt;p&gt;Phase 17 folds these defaults into compile-time constants at grammar-build time. Each per-cell node went from ~15 fields to ~7. The classifier's &lt;code&gt;nodeSig&lt;/code&gt; was extended with fold-predicate bits so that mutating from default → non-default correctly triggers a structural rebuild.&lt;/p&gt;
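&lt;p&gt;A rough sketch of the folding idea, under stated assumptions: the &lt;code&gt;Style&lt;/code&gt; shape, the &lt;code&gt;DEFAULTS&lt;/code&gt; table, and all function names are hypothetical, and the signature here is a comma-joined string of field names rather than the fold-predicate bits that &lt;code&gt;nodeSig&lt;/code&gt; uses. It only illustrates why a default → non-default mutation has to trigger a rebuild.&lt;/p&gt;

```typescript
// Hypothetical style shape and defaults; names are assumptions for illustration.
type Style = { [field: string]: number | undefined };

const DEFAULTS: Style = { margin: 0, minWidth: 0, maxWidth: undefined, width: undefined };

// Fields that differ from their default stay live; the rest can be folded to constants.
function liveFields(style: Style): string[] {
  return Object.keys(DEFAULTS).filter(
    (field) => (style[field] ?? DEFAULTS[field]) !== DEFAULTS[field],
  );
}

// Signature of which fields are live. A string stands in for a bitmask here.
function foldSignature(style: Style): string {
  return liveFields(style).join(",");
}

// If the set of live fields changes, the folded constants are stale.
function needsStructuralRebuild(before: Style, after: Style): boolean {
  return foldSignature(before) !== foldSignature(after);
}
```

&lt;p&gt;Changing the value of a field that is already live keeps the signature stable, so only the first move away from a default pays for a rebuild.&lt;/p&gt;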

&lt;p&gt;Combined, hot-structural went from ~450µs to ~70µs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why pure TS over a native rewrite
&lt;/h2&gt;

&lt;p&gt;I considered porting the engine to a language that compiles to WASM before doing the algorithmic work. Glad I didn't.&lt;/p&gt;

&lt;p&gt;Yoga's advantage wasn't raw arithmetic speed. Its C++ kernel is fast and well-tuned, but arithmetic wasn't the bottleneck on this workload. The advantage was the structural-mutation algorithm: Yoga handled it natively, while the pure-TS engine was redoing too much work per mutation.&lt;/p&gt;

&lt;p&gt;A native-compiled port from my side would have inherited the same algorithmic shape and reached parity at best. The fix was algorithmic, and the algorithmic fix worked in TypeScript. &lt;strong&gt;"Pure TS is competitive with native code on this workload"&lt;/strong&gt; is the actually-interesting result.&lt;/p&gt;

&lt;h2&gt;
  
  
  Validation, including a same-day hotfix story
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;1,470 unit + integration tests pass&lt;/li&gt;
&lt;li&gt;Structural-differential fuzzer green at 3,000 runs&lt;/li&gt;
&lt;li&gt;33 Yoga oracle fixtures (cell-for-cell comparison)&lt;/li&gt;
&lt;li&gt;Byte-identical cached-vs-cold differential mode at 833 runs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A small incident worth mentioning: within hours of publishing 2.0.0, the fast-check property fuzzer caught a real bug. &lt;code&gt;createStyleDirtier&lt;/code&gt; was throwing on a node whose entire style had been folded out, a case my analysis said couldn't happen. 2.0.1 shipped the same day with the fix and a pinned regression test, and 2.0.0 was deprecated on npm with a pointer to 2.0.1.&lt;/p&gt;

&lt;p&gt;Property-based fuzzing earns its keep. I had been on the fence about whether the fuzzer was worth maintaining; this answered it.&lt;/p&gt;

&lt;h2&gt;
  
  
  API stability
&lt;/h2&gt;

&lt;p&gt;The output of the public &lt;code&gt;calculateLayout()&lt;/code&gt; is byte-identical between 1.x and 2.x. The SemVer-major bump reflects shifts in internal APIs and memory characteristics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Typed-array runtime (&lt;code&gt;Field.id&lt;/code&gt; integer + array storage replacing &lt;code&gt;Map&amp;lt;Field, X&amp;gt;&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;LayoutPool&lt;/code&gt; grows unbounded (I tried FinalizationRegistry-based recycling in phase 15C; it caused a 2× regression, so I removed it)&lt;/li&gt;
&lt;li&gt;Per-property dirty bitmask replacing single dirty bool&lt;/li&gt;
&lt;li&gt;Linear recurrence + fold default values (the algorithmic changes above)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're using only the documented public API, you upgrade and the speedup is transparent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/pilatesjs/pilates
&lt;span class="nb"&gt;cd &lt;/span&gt;pilates
pnpm &lt;span class="nb"&gt;install
&lt;/span&gt;pnpm bench   &lt;span class="c"&gt;# ~5 min&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or install the engine directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @pilates/core
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Full React stack (reconciler + widgets):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @pilates/react @pilates/widgets react
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Adversarial benchmarks are very welcome — if there's a workload where this approach breaks down, I'd genuinely like to find it. That's the most valuable feedback the project can get right now.&lt;/p&gt;




&lt;p&gt;Repo (MIT): &lt;a href="https://github.com/pilatesjs/pilates" rel="noopener noreferrer"&gt;https://github.com/pilatesjs/pilates&lt;/a&gt;&lt;br&gt;&lt;br&gt;
npm: &lt;a href="https://www.npmjs.com/package/@pilates/core" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@pilates/core&lt;/a&gt;&lt;/p&gt;

</description>
      <category>typescript</category>
      <category>javascript</category>
      <category>performance</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Why Pattern-Matching Scanners Miss Structural Bugs (and What I Built Instead)</title>
      <dc:creator>Zhijie Wong</dc:creator>
      <pubDate>Wed, 22 Apr 2026 10:48:03 +0000</pubDate>
      <link>https://dev.to/zhijiewong/why-pattern-matching-scanners-miss-structural-bugs-and-what-i-built-instead-34k9</link>
      <guid>https://dev.to/zhijiewong/why-pattern-matching-scanners-miss-structural-bugs-and-what-i-built-instead-34k9</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fom90i7tpmkt1kvbgjd2l.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fom90i7tpmkt1kvbgjd2l.gif" alt=" " width="1278" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Pattern-matching scanners (Semgrep, Snyk, CodeQL) find what their rulebook encodes. Bugs that arrive as &lt;strong&gt;structural variants&lt;/strong&gt; — the sink is three calls away, the taint flows through an unusual shape, the CVE matters but the pattern doesn't match verbatim — slip through.&lt;/p&gt;

&lt;p&gt;I built &lt;strong&gt;mythos-agent&lt;/strong&gt;, an open-source AI code reviewer (MIT, TypeScript, &lt;a href="https://github.com/mythos-agent/mythos-agent" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;), to layer an LLM-based hypothesis stage on top of a traditional SAST foundation. This post is the technical writeup: what the pipeline looks like, what bug classes it surfaces that regex-only scanners miss, and where it still gets things wrong.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx mythos-agent scan     &lt;span class="c"&gt;# pattern scan, no API key&lt;/span&gt;
npx mythos-agent hunt     &lt;span class="c"&gt;# full AI hypothesis + analyzer pipeline&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  1. The problem: rulebook coverage vs. bug space
&lt;/h2&gt;

&lt;p&gt;A pattern scanner's ruleset is a finite set of &lt;code&gt;(sink, source, condition)&lt;/code&gt; triples. A security reviewer reading the same code carries a much larger implicit model — they notice that &lt;em&gt;this&lt;/em&gt; DB transaction reads and writes the same row without locking, that &lt;em&gt;this&lt;/em&gt; handler joins a user-supplied path against a config root without resolving symlinks, that &lt;em&gt;this&lt;/em&gt; &lt;code&gt;eval&lt;/code&gt; receives a value that's been stringified three functions upstream.&lt;/p&gt;

&lt;p&gt;Concrete example. Semgrep's default TypeScript ruleset catches this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/run&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;code&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;           &lt;span class="c1"&gt;// flagged: eval() on request input&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It does &lt;strong&gt;not&lt;/strong&gt; catch this, even though it's the same bug:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;normalise&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;buildPayload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;normalise&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/run&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;buildPayload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;code&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)();&lt;/span&gt;        &lt;span class="c1"&gt;// not flagged: sink ≠ eval, source is 2 calls away&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The pattern rule is looking for &lt;code&gt;eval(&amp;lt;tainted&amp;gt;)&lt;/code&gt; literally. The real bug is &lt;code&gt;&amp;lt;any dynamic-code sink&amp;gt;(&amp;lt;tainted, possibly transformed, possibly renamed&amp;gt;)&lt;/code&gt;. You can write a Semgrep rule for this variant — but you can only write rules for variants you've already thought of. The space of "things that behave like eval" is open-ended.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. The approach: hypothesis generation per function
&lt;/h2&gt;

&lt;p&gt;The mythos-agent pipeline is four stages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Recon → Hypothesize → Analyze → Exploit (optional)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The interesting stage is &lt;strong&gt;Hypothesize&lt;/strong&gt;. For each function the parser extracts, a prompted LLM agent produces specific, code-grounded security claims — not CWE labels, but statements about &lt;em&gt;this&lt;/em&gt; code:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"This handler reads &lt;code&gt;req.query.path&lt;/code&gt; and passes it to &lt;code&gt;fs.readFileSync&lt;/code&gt; via &lt;code&gt;path.join(ROOT, userPath)&lt;/code&gt; without resolving symlinks. Potential path traversal if the filesystem contains symlinks pointing outside &lt;code&gt;ROOT&lt;/code&gt;."&lt;/p&gt;

&lt;p&gt;"This transaction reads &lt;code&gt;balance&lt;/code&gt; at line 42 and writes &lt;code&gt;balance - amount&lt;/code&gt; at line 51, without wrapping in &lt;code&gt;SELECT … FOR UPDATE&lt;/code&gt; or an equivalent lock. Potential TOCTOU race allowing double-spend under concurrent requests."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The hypotheses are inputs to the next stage, not outputs to the user.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. The analyzer: grading hypotheses against the code
&lt;/h2&gt;

&lt;p&gt;A separate analyzer agent re-reads the function with the hypothesis attached and decides whether the claim actually holds given the control flow, input reachability, and sink characteristics. Findings get a confidence score in &lt;code&gt;[0, 1]&lt;/code&gt;; &lt;code&gt;--severity high&lt;/code&gt; only surfaces results above a threshold.&lt;/p&gt;

&lt;p&gt;This two-stage split matters. The hypothesis stage is allowed to be speculative — it's cheap to generate a hypothesis that turns out to be wrong, and the analyzer will filter it. The analyzer stage is allowed to be conservative. Running them together in a single prompt collapses the useful separation: the model both proposes and evaluates, and in practice that means it emits plausibility-matched false positives.&lt;/p&gt;
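&lt;p&gt;The split can be sketched as two injected functions and a threshold filter. Everything below is a hypothetical illustration (the type names, &lt;code&gt;hunt&lt;/code&gt;, and the synchronous signatures are assumptions); real stages would be asynchronous model calls, and this is not mythos-agent's actual code.&lt;/p&gt;

```typescript
// All names are hypothetical; real stages would be asynchronous model calls.
interface Finding {
  hypothesis: string;
  confidence: number; // expected to lie in [0, 1]
}

// Stage 1: speculative, cheap to be wrong.
type Hypothesizer = (source: string) => string[];
// Stage 2: conservative, grades one claim against the code.
type Analyzer = (source: string, hypothesis: string) => number;

function hunt(
  source: string,
  hypothesize: Hypothesizer,
  analyze: Analyzer,
  threshold: number,
): Finding[] {
  return hypothesize(source)
    .map((hypothesis) => ({ hypothesis, confidence: analyze(source, hypothesis) }))
    .filter((finding) => finding.confidence >= threshold);
}
```

&lt;p&gt;Keeping the two stages as separate calls is what lets the first over-generate freely: nothing it proposes reaches the user without a second, independent grading pass.&lt;/p&gt;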

&lt;p&gt;Example output (real, from scanning a test corpus):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; ✗ src/api/transfer.ts:38   [HIGH, conf 0.88]
   Hypothesis: read-modify-write of `balance` without row lock;
               concurrent requests can double-spend.
   Evidence:   line 42 reads `balance`, line 51 writes `balance - amount`;
               no FOR UPDATE / transaction isolation in scope.
   Suggested:  wrap in BEGIN ... SELECT ... FOR UPDATE ... COMMIT,
               or use SERIALIZABLE isolation level.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  4. Structural variant analysis
&lt;/h2&gt;

&lt;p&gt;Given a reference CVE (from NVD, or a user-supplied patch), the variant analyzer searches the codebase for AST-shape-similar regions with semantic-role matching on inputs/sinks. Similar in spirit to what Google Project Zero described in the public &lt;strong&gt;Big Sleep&lt;/strong&gt; writeup, applied to an open-source TypeScript toolchain.&lt;/p&gt;

&lt;p&gt;The use case this actually solves: &lt;em&gt;"we patched bug X in module A; are there other places in the codebase that look like module A before the patch?"&lt;/em&gt; Regex search over &lt;code&gt;git diff&lt;/code&gt; misses these because the variant can rename the variables, reorder the statements, split a helper out, etc.&lt;/p&gt;
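&lt;p&gt;To make the rename and reorder problem concrete, here is a deliberately simplified stand-in: it normalises identifiers at the token level and compares sets of statement shapes with Jaccard similarity. This is an assumption-laden sketch, not the variant analyzer itself, which works on AST shapes with semantic-role matching; the &lt;code&gt;KEEP&lt;/code&gt; list and function names are hypothetical.&lt;/p&gt;

```typescript
// Names kept verbatim because they carry a semantic role (keywords, sources, sinks).
// This list is a hypothetical example, not a real ruleset.
const KEEP = new Set(["const", "return", "new", "eval", "Function", "req", "query"]);

// Replace every other identifier with a placeholder so renames do not matter.
function normaliseStatement(statement: string): string {
  return statement.replace(/[A-Za-z_$][A-Za-z0-9_$]*/g, (name) =>
    KEEP.has(name) ? name : "ID",
  );
}

// Jaccard similarity over sets of normalised statements, so order does not matter.
function shapeSimilarity(a: string[], b: string[]): number {
  const left = new Set(a.map(normaliseStatement));
  const right = new Set(b.map(normaliseStatement));
  const shared = Array.from(left).filter((shape) => right.has(shape)).length;
  const union = new Set(Array.from(left).concat(Array.from(right))).size;
  return union === 0 ? 0 : shared / union;
}
```

&lt;p&gt;A plain text search treats &lt;code&gt;eval(p)&lt;/code&gt; and &lt;code&gt;eval(payload)&lt;/code&gt; as different strings; after normalisation both become the same shape, which is the property a variant search needs.&lt;/p&gt;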




&lt;h2&gt;
  
  
  5. What's in the box
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;43 scanner categories&lt;/strong&gt; (15 production-wired, 28 experimental): SQL injection, SSRF, path traversal, command injection, XSS, JWT algorithm confusion, session handling, race conditions, crypto audit, secrets, IaC misconfig, supply chain, AI/LLM security, API security, cloud misconfig, zero trust, privacy/GDPR, GraphQL, WebSocket, CORS, OAuth, SSTI, and more.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;329+ built-in rules&lt;/strong&gt; across &lt;strong&gt;8 languages&lt;/strong&gt; (TypeScript, JavaScript, Python, Go, Java, PHP, C/C++, Rust). Rules compose — "SQL injection" is N smaller rules, not one regex.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output&lt;/strong&gt;: SARIF 2.1.0 (drop-in for GitHub Code Scanning), HTML reports, JSON for piping.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backends&lt;/strong&gt;: Claude, GPT-4o, Ollama, or any OpenAI-compatible endpoint. &lt;strong&gt;Pattern-only mode works offline without any API key&lt;/strong&gt; — the hypothesis stage is opt-in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Releases are Sigstore-signed&lt;/strong&gt; (cosign) with CycloneDX SBOMs attached to each GitHub release.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  6. Where it still gets things wrong
&lt;/h2&gt;

&lt;p&gt;Hypothesis-driven scanning is not free. Honest limits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dynamically-typed languages&lt;/strong&gt; (Python, JS) produce more noise than statically-typed ones. Type information is a signal the analyzer leans on heavily; without it, confidence scores drift lower and the high-severity filter leaves more on the floor.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inter-procedural taint across package boundaries&lt;/strong&gt; still loses signal. If the tainted value crosses into a third-party dep with no source, the hypothesis stage has to reason about the dep's public surface, and it often over-generates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost&lt;/strong&gt;. Running the hypothesis stage across a 100k-LOC codebase with Claude or GPT-4o is not free. The &lt;code&gt;--severity high&lt;/code&gt; filter helps; incremental scans on changed files help more. CI integration should scope to diff-only by default.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  7. Try it
&lt;/h2&gt;

&lt;p&gt;One command, no install, no API key needed for pattern-only mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx mythos-agent quick       &lt;span class="c"&gt;# 10-second security check&lt;/span&gt;
npx mythos-agent scan        &lt;span class="c"&gt;# full pattern scan&lt;/span&gt;
npx mythos-agent hunt        &lt;span class="c"&gt;# AI-guided scan (needs a model endpoint)&lt;/span&gt;
npx mythos-agent fix &lt;span class="nt"&gt;--apply&lt;/span&gt; &lt;span class="c"&gt;# AI-generated patches for high-confidence findings&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/mythos-agent/mythos-agent" rel="noopener noreferrer"&gt;https://github.com/mythos-agent/mythos-agent&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Landing / docs&lt;/strong&gt;: &lt;a href="https://mythos-agent.com" rel="noopener noreferrer"&gt;https://mythos-agent.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Community (EN)&lt;/strong&gt;: &lt;a href="https://mythos-agent.com/discord" rel="noopener noreferrer"&gt;https://mythos-agent.com/discord&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Community (CN · 飞书)&lt;/strong&gt;: &lt;a href="https://mythos-agent.com/feishu" rel="noopener noreferrer"&gt;https://mythos-agent.com/feishu&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Releases&lt;/strong&gt;: Sigstore-signed, SBOM attached&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;MIT licensed. v4.0.0 shipped today. If you have a codebase you'd want tested against hypothesis generation (public or a redacted snippet), open an issue or a discussion — I'm specifically looking for cases where the analyzer produces unexpected false positives, since those are the most useful signal for tuning the prompt.&lt;/p&gt;

&lt;h2&gt;
  
  
  Questions I'd value technical feedback on
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;For &lt;strong&gt;per-function hypothesis generation&lt;/strong&gt;, where has the "speculate then analyze" split produced the most noise in systems you've built or used?&lt;/li&gt;
&lt;li&gt;For &lt;strong&gt;structural variant analysis on dynamically-typed languages&lt;/strong&gt;, what's your experience with AST-shape normalisation to get useful similarity scores across Python or JS?&lt;/li&gt;
&lt;li&gt;Which &lt;strong&gt;SARIF 2.1.0 consumers beyond GitHub Code Scanning&lt;/strong&gt; actually render SARIF well, and which silently drop half the fields?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Thanks for reading. ⭐Star on GitHub if this is useful; open an issue if you find a bug.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>opensource</category>
      <category>typescript</category>
    </item>
    <item>
      <title>Pawdig is an AI document intelligence tool</title>
      <dc:creator>Zhijie Wong</dc:creator>
      <pubDate>Fri, 03 Apr 2026 13:13:09 +0000</pubDate>
      <link>https://dev.to/zhijiewong/pawdig-is-an-ai-document-intelligence-tool-2hme</link>
      <guid>https://dev.to/zhijiewong/pawdig-is-an-ai-document-intelligence-tool-2hme</guid>
      <description>&lt;p&gt;If you’ve ever tried to copy-paste a table from a PDF, invoice, or contract, you know the pain. &lt;/p&gt;

&lt;p&gt;The formatting breaks. The cells merge. You end up manually re-typing 200 rows of data.&lt;/p&gt;

&lt;p&gt;So, I built a better way. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://pawdig.com/sign-in" rel="noopener noreferrer"&gt;Pawdig&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pawdig&lt;/strong&gt; is an AI document intelligence tool that doesn't just extract messy data instantly; it also turns your files into your own private knowledge base.&lt;/p&gt;

&lt;p&gt;Just drag and drop. Pawdig instantly structures your data, and our built-in AI agents let you chat directly with your documents to extract insights, summarize pages, or find exact clauses.&lt;/p&gt;

&lt;p&gt;It handles:&lt;br&gt;
✅ Building an instantly searchable AI knowledge base&lt;br&gt;
✅ Borderless tables &amp;amp; complex merged cells&lt;br&gt;
✅ Scanned images and poor-quality invoices&lt;br&gt;
✅ Documents with massive page counts&lt;br&gt;
✅ Instant export to Excel, CSV, JSON, or Markdown&lt;/p&gt;

&lt;p&gt;If you deal with invoices, reports, or contracts daily, try it out. 👇&lt;/p&gt;

&lt;p&gt;&lt;a href="https://pawdig.com/sign-in" rel="noopener noreferrer"&gt;https://pawdig.com/sign-in&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>saas</category>
      <category>productivity</category>
      <category>career</category>
    </item>
    <item>
      <title>I built an open-source project OpenHarness🪼</title>
      <dc:creator>Zhijie Wong</dc:creator>
      <pubDate>Wed, 01 Apr 2026 13:54:34 +0000</pubDate>
      <link>https://dev.to/zhijiewong/i-built-an-open-source-project-openharness-2lc6</link>
      <guid>https://dev.to/zhijiewong/i-built-an-open-source-project-openharness-2lc6</guid>
      <description>&lt;p&gt;I built &lt;strong&gt;OpenHarness — an open-source terminal coding agent&lt;/strong&gt; with 17 tools and 16 slash commands. It works with Ollama (free, local), OpenAI, Anthropic, OpenRouter, Deepseek, Qwen or any OpenAI-compatible API.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The problem&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Coding agents are amazing but locked to cloud models.&lt;br&gt;
I wanted the same experience with my local Ollama models: free, private,&lt;br&gt;
and with no API key needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;What I built&lt;/strong&gt;
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;17 tools: file read/edit/write, bash, grep, glob, web search, task management, jupyter notebooks, sub-agents&lt;/li&gt;
&lt;li&gt;16 slash commands: /diff /undo /commit /cost /compact /plan /review&lt;/li&gt;
&lt;li&gt;Git-safe: every AI edit auto-committed, /undo reverts instantly&lt;/li&gt;
&lt;li&gt;Headless mode: &lt;code&gt;oh run "fix tests" --json&lt;/code&gt; for CI/CD&lt;/li&gt;
&lt;li&gt;Permission gates: ask/trust/deny — approve before the agent acts&lt;/li&gt;
&lt;li&gt;React+Ink terminal UI with markdown rendering&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Install&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm install -g @zhijiewang/openharness
oh --model ollama/llama3
oh --model ollama/qwen2.5:7b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Tech stack&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;TypeScript, React+Ink, Zod for tool schemas, async generators for streaming.&lt;/p&gt;

&lt;p&gt;Everyone is welcome to join and build it together. 👏&lt;br&gt;
GitHub: &lt;a href="https://github.com/zhijiewong/openharness" rel="noopener noreferrer"&gt;https://github.com/zhijiewong/openharness&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>webdev</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
