<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: George Hertz</title>
    <description>The latest articles on DEV Community by George Hertz (@hertzg).</description>
    <link>https://dev.to/hertzg</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F463527%2Ffb10ccb7-e49c-44f9-88a7-e9563bc774af.png</url>
      <title>DEV Community: George Hertz</title>
      <link>https://dev.to/hertzg</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hertzg"/>
    <language>en</language>
    <item>
      <title>WASM SIMD by example: 16 RGB pixels to grayscale per instruction</title>
      <dc:creator>George Hertz</dc:creator>
      <pubDate>Sat, 13 Jun 2026 01:16:50 +0000</pubDate>
      <link>https://dev.to/hertzg/wasm-simd-by-example-16-rgb-pixels-to-grayscale-per-instruction-i00</link>
      <guid>https://dev.to/hertzg/wasm-simd-by-example-16-rgb-pixels-to-grayscale-per-instruction-i00</guid>
      <description>&lt;p&gt;I finally understood WASM SIMD by writing a real kernel: RGB → grayscale luma, 16 pixels per loop iteration instead of one at a time. Same math, &lt;strong&gt;~4× faster&lt;/strong&gt; on a single core. This is the condensed version; the &lt;a href="https://hertz.gg/til/2026-05-26-wasm-simd-by-example-rgb-to-luma.html" rel="noopener noreferrer"&gt;full write-up&lt;/a&gt; has every op desugared into plain TypeScript, lane-by-lane ASCII diagrams, and an optional detour down to the carry wires.&lt;/p&gt;

&lt;p&gt;Spoiler first, theory after. One file, nothing to install besides Deno — it compiles its own AssemblyScript at startup:&lt;/p&gt;

&lt;p&gt;🪄 code A → code B, runnable and benchmarked:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// luma_bench.ts, fully self-contained: the AssemblyScript compiles itself on the fly.&lt;/span&gt;
&lt;span class="c1"&gt;// Run: deno bench -A luma_bench.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;asc&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npm:assemblyscript/asc&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// ---------- code A: the loop anyone would write ----------&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;rgbToLumaNaive&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rgb&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Uint8Array&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;out&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Uint8Array&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pixelCount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;out&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;pixelCount&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;rgb&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;rgb&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;rgb&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="nx"&gt;out&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2126&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.7152&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;g&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.0722&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// ---------- code A.5: same loop, float math swapped for Q15 integers ----------&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;rgbToLumaScalar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rgb&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Uint8Array&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;out&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Uint8Array&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pixelCount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;out&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;pixelCount&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;rgb&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;rgb&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;rgb&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="nx"&gt;out&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6966&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;23436&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;g&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;2366&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mh"&gt;0x4000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// ---------- code B: the same math, spoken in 16-wide lane language ----------&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`
export function luma(inPtr: usize, outPtr: usize, pixels: i32): void {
  const redWeight = i16x8.splat(6966);  // 0.2126 in Q15
  const greenWeight = i16x8.splat(23436); // 0.7152 in Q15
  const blueWeight = i16x8.splat(2366);  // 0.0722 in Q15
  const q15Half = i32x4.splat(0x4000);

  for (let i = 0; i + 16 &amp;lt;= pixels; i += 16) {
    const chunkPtr = inPtr + &amp;lt;usize&amp;gt;(i * 3);

    // load 48 interleaved bytes, de-interleave into R/G/B planes
    const chunk0 = v128.load(chunkPtr);
    const chunk1 = v128.load(chunkPtr, 16);
    const chunk2 = v128.load(chunkPtr, 32);
    const redPartial = i8x16.shuffle(chunk0, chunk1, 0,3,6,9,12,15,18,21,24,27,30,0,0,0,0,0);
    const redPlane = i8x16.shuffle(redPartial, chunk2, 0,1,2,3,4,5,6,7,8,9,10,17,20,23,26,29);
    const greenPartial = i8x16.shuffle(chunk0, chunk1, 1,4,7,10,13,16,19,22,25,28,31,0,0,0,0,0);
    const greenPlane = i8x16.shuffle(greenPartial, chunk2, 0,1,2,3,4,5,6,7,8,9,10,18,21,24,27,30);
    const bluePartial = i8x16.shuffle(chunk0, chunk1, 2,5,8,11,14,17,20,23,26,29,0,0,0,0,0,0);
    const bluePlane = i8x16.shuffle(bluePartial, chunk2, 0,1,2,3,4,5,6,7,8,9,16,19,22,25,28,31);

    // widen u8 -&amp;gt; u16
    const redLow = i16x8.extend_low_i8x16_u(redPlane);
    const redHigh = i16x8.extend_high_i8x16_u(redPlane);
    const greenLow = i16x8.extend_low_i8x16_u(greenPlane);
    const greenHigh = i16x8.extend_high_i8x16_u(greenPlane);
    const blueLow = i16x8.extend_low_i8x16_u(bluePlane);
    const blueHigh = i16x8.extend_high_i8x16_u(bluePlane);

    // multiply + accumulate in Q15, 4 pixels per accumulator
    let luma0to3 = i32x4.extmul_low_i16x8_u(redLow, redWeight);
    luma0to3 = i32x4.add(luma0to3, i32x4.extmul_low_i16x8_u(greenLow, greenWeight));
    luma0to3 = i32x4.add(luma0to3, i32x4.extmul_low_i16x8_u(blueLow, blueWeight));
    let luma4to7 = i32x4.extmul_high_i16x8_u(redLow, redWeight);
    luma4to7 = i32x4.add(luma4to7, i32x4.extmul_high_i16x8_u(greenLow, greenWeight));
    luma4to7 = i32x4.add(luma4to7, i32x4.extmul_high_i16x8_u(blueLow, blueWeight));
    let luma8to11 = i32x4.extmul_low_i16x8_u(redHigh, redWeight);
    luma8to11 = i32x4.add(luma8to11, i32x4.extmul_low_i16x8_u(greenHigh, greenWeight));
    luma8to11 = i32x4.add(luma8to11, i32x4.extmul_low_i16x8_u(blueHigh, blueWeight));
    let luma12to15 = i32x4.extmul_high_i16x8_u(redHigh, redWeight);
    luma12to15 = i32x4.add(luma12to15, i32x4.extmul_high_i16x8_u(greenHigh, greenWeight));
    luma12to15 = i32x4.add(luma12to15, i32x4.extmul_high_i16x8_u(blueHigh, blueWeight));

    // round, shift out of Q15, narrow i32 -&amp;gt; i16 -&amp;gt; u8, store 16 results at once
    luma0to3 = i32x4.shr_s(i32x4.add(luma0to3, q15Half), 15);
    luma4to7 = i32x4.shr_s(i32x4.add(luma4to7, q15Half), 15);
    luma8to11 = i32x4.shr_s(i32x4.add(luma8to11, q15Half), 15);
    luma12to15 = i32x4.shr_s(i32x4.add(luma12to15, q15Half), 15);
    const luma0to7 = i16x8.narrow_i32x4_s(luma0to3, luma4to7);
    const luma8to15 = i16x8.narrow_i32x4_s(luma8to11, luma12to15);
    v128.store(outPtr + &amp;lt;usize&amp;gt;i, i8x16.narrow_i16x8_u(luma0to7, luma8to15));
  }
}
`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// compile in memory and import the bytes as a module, no files involved&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;binary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;stderr&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;asc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compileString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;optimizeLevel&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;runtime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;stub&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;enable&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;simd&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;luma&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;import&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="s2"&gt;`data:application/wasm;base64,&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;binary&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toBase64&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;WebAssembly&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Memory&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;luma&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;o&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// ---------- one 12-megapixel photo (4000x3000) ----------&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4000&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;IN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;65536&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;OUT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;IN&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;N&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;grow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ceil&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;OUT&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;N&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;65536&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt; &lt;span class="c1"&gt;// stub runtime starts at 0 pages&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rgbWasm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Uint8Array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;IN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;N&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// views AFTER grow (grow detaches)&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;rgbWasm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;rgbWasm&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2654435761&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rgbJs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;rgbWasm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// the JS versions get their own plain copy, fair is fair&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;outJs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Uint8Array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;N&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// ---------- the receipts ----------&lt;/span&gt;

&lt;span class="nx"&gt;Deno&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bench&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;code A: naive float&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;rgbToLumaNaive&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rgbJs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;outJs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;Deno&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bench&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;code A.5: scalar Q15&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;rgbToLumaScalar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rgbJs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;outJs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;Deno&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bench&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;code B: WASM SIMD&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;luma&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;IN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;OUT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;N&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;p&gt;The receipts, M2 Pro, one 12-megapixel photo:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;benchmark&lt;/th&gt;
&lt;th&gt;time/iter&lt;/th&gt;
&lt;th&gt;speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;code A: naive float&lt;/td&gt;
&lt;td&gt;23.3 ms&lt;/td&gt;
&lt;td&gt;1×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;code A.5: scalar Q15&lt;/td&gt;
&lt;td&gt;19.1 ms&lt;/td&gt;
&lt;td&gt;~1.2×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;code B: WASM SIMD&lt;/td&gt;
&lt;td&gt;5.7 ms&lt;/td&gt;
&lt;td&gt;~4.1×&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;(The integer trick alone cuts the run time by about 18%, from 23.3 ms to 19.1 ms, before any SIMD.)&lt;/p&gt;

&lt;h2&gt;The one idea behind all of it&lt;/h2&gt;

&lt;p&gt;A &lt;code&gt;v128&lt;/code&gt; is just &lt;strong&gt;128 bits = 16 bytes&lt;/strong&gt;. The &lt;em&gt;type prefix&lt;/em&gt; on an op decides how you slice those bytes into "lanes":&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;i8x16:  16 lanes ×  8-bit   [b0][b1]...[b15]
i16x8:   8 lanes × 16-bit   [ h0 ][ h1 ]...[ h7 ]
i32x4:   4 lanes × 32-bit   [   w0   ]...[   w3   ]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;"Lane-wise" = the op runs on every lane in parallel. One instruction, N results. That's the whole win.&lt;/p&gt;
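&lt;p&gt;A rough way to see the slicing without any WASM is to put typed-array views over one 16-byte &lt;code&gt;ArrayBuffer&lt;/code&gt;. This is only an analogy sketched in plain TypeScript: the names &lt;code&gt;asI8x16&lt;/code&gt;, &lt;code&gt;asI16x8&lt;/code&gt;, &lt;code&gt;asI32x4&lt;/code&gt; and &lt;code&gt;addLanes&lt;/code&gt; are made up for illustration, and the helper merely emulates, lane by lane, what a single lane-wise instruction would do in one step.&lt;/p&gt;

```typescript
// One 16-byte buffer, viewed three ways (an analogy for i8x16 / i16x8 / i32x4).
const bytes = new ArrayBuffer(16);
const asI8x16 = new Uint8Array(bytes); // 16 lanes x 8-bit
const asI16x8 = new Uint16Array(bytes); // 8 lanes x 16-bit
const asI32x4 = new Uint32Array(bytes); // 4 lanes x 32-bit

// Fill the byte lanes with 0..15; the wider views see the same 16 bytes.
asI8x16.forEach((_, lane) => {
  asI8x16[lane] = lane;
});

// Hypothetical helper: a "lane-wise" add, emulated one lane at a time.
// Results wrap modulo 256 because each lane is only 8 bits wide.
function addLanes(a: Uint8Array, b: Uint8Array): Uint8Array {
  return a.map((value, lane) => value + b[lane]);
}

const doubled = addLanes(asI8x16, asI8x16);
console.log(asI8x16.length, asI16x8.length, asI32x4.length, doubled[15]);
```

&lt;p&gt;The three views never copy anything. Only the lane width changes, which is the same thing the type prefix on an op decides.&lt;/p&gt;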
&lt;h2&gt;The problem&lt;/h2&gt;

&lt;p&gt;Grayscale (Rec.709): &lt;span class="katex-element"&gt;\(L = 0.2126R + 0.7152G + 0.0722B\)&lt;/span&gt;, one output byte per pixel. Every pixel is independent, so it &lt;em&gt;should&lt;/em&gt; vectorize perfectly. The friction is purely mechanical: pixels arrive interleaved (&lt;code&gt;RGBRGBRGB…&lt;/code&gt;) but SIMD wants each channel contiguous, and bytes are too narrow to multiply without overflowing.&lt;/p&gt;
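&lt;p&gt;Both frictions can be poked at in plain TypeScript. The sketch below assumes a hypothetical &lt;code&gt;shuffle&lt;/code&gt; helper that stands in for &lt;code&gt;i8x16.shuffle&lt;/code&gt; (indices 0 to 15 pick from the first vector, 16 to 31 from the second) and a made-up test pattern. The two index lists are copied from the red channel of the kernel above.&lt;/p&gt;

```typescript
// Hypothetical plain-TypeScript stand-in for i8x16.shuffle:
// indices 0..15 select bytes from `a`, indices 16..31 select bytes from `b`.
function shuffle(a: Uint8Array, b: Uint8Array, indices: number[]): Uint8Array {
  return Uint8Array.from(indices, (index) => (index >= 16 ? b[index - 16] : a[index]));
}

// 16 interleaved pixels = 48 bytes. Test pattern: pixel p has R = p, G = 100 + p, B = 200 + p.
const interleaved = new Uint8Array(48);
for (let p = 0; p !== 16; p++) {
  interleaved.set([p, 100 + p, 200 + p], p * 3);
}
const chunk0 = interleaved.subarray(0, 16);
const chunk1 = interleaved.subarray(16, 32);
const chunk2 = interleaved.subarray(32, 48);

// Same two-step gather as the kernel: 11 reds from chunks 0 and 1, then 5 more from chunk 2.
const redPartial = shuffle(chunk0, chunk1, [0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 0, 0, 0, 0, 0]);
const redPlane = shuffle(redPartial, chunk2, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 17, 20, 23, 26, 29]);

// Why bytes are too narrow to multiply: an 8-bit lane wraps modulo 256.
const wrapped = Uint8Array.of(200 * 2)[0];

console.log(Array.from(redPlane).join(","), wrapped);
```

&lt;p&gt;With this pattern the red plane comes out as 0 through 15 in pixel order, and 200 × 2 stored into a byte wraps to 144, which is why the kernel widens to 16-bit lanes before multiplying.&lt;/p&gt;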
&lt;h2&gt;Ditch the floats first: Q15&lt;/h2&gt;

&lt;p&gt;Store fractions in plain integers: scale by &lt;span class="katex-element"&gt;\(2^{15} = 32768\)&lt;/span&gt; and remember that the binary point lives 15 bits from the right. It is the binary version of "store dollars as cents":&lt;/p&gt;


&lt;div class="katex-element"&gt;
\[
\begin{gathered}
\lfloor 0.2126 \cdot 2^{15} \rceil = 6966 \\
\lfloor 0.7152 \cdot 2^{15} \rceil = 23436 \\
\lfloor 0.0722 \cdot 2^{15} \rceil = 2366
\end{gathered}
\]
&lt;/div&gt;



&lt;p&gt;And 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;6966&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;+&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;23436&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;+&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;2366&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;32768&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;15&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 &lt;strong&gt;exactly&lt;/strong&gt; — deliberate, so dividing by the total weight is an exact shift and pure white comes out as exactly 255, no bias. Multiply by the constant, add half (
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;14&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 = &lt;code&gt;0x4000&lt;/code&gt;) to round, shift right by 15. Bonus: integer math is bit-deterministic across platforms — my float version drifted ±1 between machines, the integer one never does.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;out&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6966&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;23436&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;g&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;2366&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mh"&gt;0x4000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
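&lt;p&gt;A quick way to convince yourself of the constants: the sketch below re-derives them and checks a few colours. It assumes the Rec. 709 coefficients 0.2126, 0.7152 and 0.0722; the names &lt;code&gt;Q15_ONE&lt;/code&gt;, &lt;code&gt;weights&lt;/code&gt; and &lt;code&gt;lumaQ15&lt;/code&gt; are made up for this illustration and are not taken from the kernel source.&lt;/p&gt;

```typescript
// Assumed Rec. 709 luma coefficients, scaled to Q15 (15 fractional bits).
const Q15_ONE = 2 ** 15; // 32768
const weights: number[] = [0.2126, 0.7152, 0.0722].map((w) => Math.round(w * Q15_ONE));
const weightSum: number = weights.reduce((acc, w) => acc + w, 0);

// The one-line kernel from above, wrapped in a function (hypothetical name).
function lumaQ15(r: number, g: number, b: number): number {
  return (6966 * r + 23436 * g + 2366 * b + 0x4000) >> 15;
}

console.log(weights.join(" "), "sum:", weightSum); // 6966 23436 2366 sum: 32768
console.log(lumaQ15(255, 255, 255)); // 255: pure white stays pure white
console.log(lumaQ15(0, 0, 0)); // 0
console.log(lumaQ15(255, 0, 0)); // 54
```

&lt;p&gt;Because the three weights sum to exactly 32768, the shift by 15 divides by the true total weight, which is why white needs no special casing.&lt;/p&gt;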



&lt;p&gt;That line is the entire kernel. The rest is plumbing to make it happen 16 pixels at a time.&lt;/p&gt;

&lt;h2&gt;
  
  
  The kernel in four moves
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. De-interleave.&lt;/strong&gt; Three loads grab 48 bytes = 16 RGB pixels. &lt;code&gt;i8x16.shuffle&lt;/code&gt; picks each output byte from any of 32 input bytes — "every 3rd byte" turns &lt;code&gt;RGBRGB…&lt;/code&gt; into planar &lt;code&gt;RRRR… / GGGG… / BBBB…&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;v128&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunkPtr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;v128&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunkPtr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;v128&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunkPtr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;redPartial&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;i8x16&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;shuffle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunk0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;chunk1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;21&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;27&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;redPlane&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;i8x16&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;shuffle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;redPartial&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;chunk2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;17&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;23&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;26&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;29&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// redPlane = [R0 R1 ... R15] — same trick for green and blue&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
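&lt;p&gt;To sanity-check the index lists, here is a minimal scalar model of the shuffle in plain TypeScript. The &lt;code&gt;shuffle&lt;/code&gt; helper and the fake pixel data are assumptions for illustration; the red index lists are the ones from the kernel, while the green ones are derived here by moving every source position one byte to the right and are not copied from the original code.&lt;/p&gt;

```typescript
// Scalar model of i8x16.shuffle: each output lane picks one byte from the
// 32-byte concatenation of a and b (0..15 address a, 16..31 address b).
function shuffle(a: Uint8Array, b: Uint8Array, indices: number[]): Uint8Array {
  const out = new Uint8Array(16);
  for (let lane = 0; lane !== 16; lane++) {
    const k = indices[lane];
    out[lane] = k >= 16 ? b[k - 16] : a[k];
  }
  return out;
}

// 16 fake pixels, interleaved RGBRGB...: R = p, G = 100 + p, B = 200 + p.
const rgb = new Uint8Array(48);
for (let p = 0; p !== 16; p++) {
  rgb[3 * p] = p;
  rgb[3 * p + 1] = 100 + p;
  rgb[3 * p + 2] = 200 + p;
}
const chunk0 = rgb.subarray(0, 16);
const chunk1 = rgb.subarray(16, 32);
const chunk2 = rgb.subarray(32, 48);

// Red: same index lists as the kernel (the trailing zeros are don't-care filler).
const redPartial = shuffle(chunk0, chunk1, [0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 0, 0, 0, 0, 0]);
const redPlane = shuffle(redPartial, chunk2, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 17, 20, 23, 26, 29]);

// Green: derived for this sketch by adding 1 to every source position.
const greenPartial = shuffle(chunk0, chunk1, [1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 0, 0, 0, 0, 0]);
const greenPlane = shuffle(greenPartial, chunk2, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 18, 21, 24, 27, 30]);

console.log(Array.from(redPlane).join(" ")); // 0 1 2 ... 15
console.log(Array.from(greenPlane).join(" ")); // 100 101 ... 115
```

&lt;p&gt;The first shuffle can only see two chunks, so it collects 11 of the 16 values; the second keeps those 11 in place and pulls the last five from the third chunk.&lt;/p&gt;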



&lt;p&gt;&lt;strong&gt;2. Widen.&lt;/strong&gt; A product like &lt;code&gt;255 × 23436&lt;/code&gt; needs about 23 bits, far more than a byte holds, so start by zero-extending u8 → u16 (step 3 widens again to 32 bits):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;redLow&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;i16x8&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend_low_i8x16_u&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;redPlane&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// pixels 0..7  as u16&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;redHigh&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;i16x8&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend_high_i8x16_u&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;redPlane&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// pixels 8..15 as u16&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. Multiply-accumulate in Q15.&lt;/strong&gt; &lt;code&gt;splat&lt;/code&gt; broadcasts a constant to all lanes; &lt;code&gt;extmul_low&lt;/code&gt; and &lt;code&gt;extmul_high&lt;/code&gt; each multiply four u16 lane pairs (the low or high half of their inputs) into 4×i32 products. The split is forced: 8 wide products are 256 bits and a register holds 128:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;redWeight&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;i16x8&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;splat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6966&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;luma0to3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;i32x4&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extmul_low_i16x8_u&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;redLow&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;redWeight&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;luma0to3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;i32x4&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;luma0to3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;i32x4&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extmul_low_i16x8_u&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;greenLow&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;greenWeight&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="nx"&gt;luma0to3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;i32x4&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;luma0to3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;i32x4&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extmul_low_i16x8_u&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;blueLow&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;blueWeight&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="c1"&gt;// max value ≈ 23 bits — can't overflow i32, and summing before rounding&lt;/span&gt;
&lt;span class="c1"&gt;// means exactly one rounding error per pixel&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
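&lt;p&gt;Desugared, the widening multiply and the "≈ 23 bits" bound look roughly like this. It is a sketch under assumptions: &lt;code&gt;extmulLowU&lt;/code&gt; and &lt;code&gt;extmulHighU&lt;/code&gt; are hypothetical scalar stand-ins for the real instructions, and the sample lane values are invented.&lt;/p&gt;

```typescript
// Scalar stand-in for i32x4.extmul_low_i16x8_u: multiply the LOW four u16
// lanes of a and b into four 32-bit products.
function extmulLowU(a: Uint16Array, b: Uint16Array): Uint32Array {
  const out = new Uint32Array(4);
  for (let lane = 0; lane !== 4; lane++) out[lane] = a[lane] * b[lane];
  return out;
}

// Scalar stand-in for i32x4.extmul_high_i16x8_u: same, for lanes 4..7.
function extmulHighU(a: Uint16Array, b: Uint16Array): Uint32Array {
  const out = new Uint32Array(4);
  for (let lane = 0; lane !== 4; lane++) out[lane] = a[lane + 4] * b[lane + 4];
  return out;
}

const redLow = Uint16Array.of(255, 0, 128, 1, 7, 7, 7, 7); // invented sample lanes
const redWeight = new Uint16Array(8).fill(6966); // what i16x8.splat(6966) holds

const lowProducts = extmulLowU(redLow, redWeight);
const highProducts = extmulHighU(redLow, redWeight);

// Worst case of the full sum: all three channels at 255, plus the rounding half.
const worstCase = 255 * (6966 + 23436 + 2366) + 0x4000;
const bitsNeeded = Math.floor(Math.log2(worstCase)) + 1;

console.log(Array.from(lowProducts).join(" ")); // 1776330 0 891648 6966
console.log(Array.from(highProducts).join(" ")); // 48762 48762 48762 48762
console.log(worstCase, bitsNeeded); // 8372224 23
```

&lt;p&gt;With 23 of 31 value bits used, the accumulator has plenty of headroom, so the three products can be summed before any rounding happens.&lt;/p&gt;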



&lt;p&gt;&lt;strong&gt;4. Round, narrow, store.&lt;/strong&gt; Add half, shift out of Q15, then squeeze i32 → i16 → u8 and write all 16 grays with one store. The &lt;code&gt;narrow&lt;/code&gt; ops &lt;strong&gt;saturate&lt;/strong&gt; (clamp, not wrap) — that's why they're real ops and not bit-casts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;luma0to3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;i32x4&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;shr_s&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i32x4&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;luma0to3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;q15Half&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;luma0to7&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;i16x8&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;narrow_i32x4_s&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;luma0to3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="nx"&gt;luma4to7&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;luma8to15&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;i16x8&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;narrow_i32x4_s&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;luma8to11&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;luma12to15&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;v128&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;outPtr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;i8x16&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;narrow_i16x8_u&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;luma0to7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;luma8to15&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt; &lt;span class="c1"&gt;// 16 px, one store&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
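&lt;p&gt;The difference between saturating and wrapping is easiest to see on out-of-range values. The following is a scalar sketch; &lt;code&gt;narrowI32x4S&lt;/code&gt;, &lt;code&gt;narrowI16x8U&lt;/code&gt; and &lt;code&gt;clamp&lt;/code&gt; are hypothetical helpers modelling the two &lt;code&gt;narrow&lt;/code&gt; ops, and the inputs are deliberately out of range (the real kernel never produces them).&lt;/p&gt;

```typescript
const clamp = (x: number, lo: number, hi: number): number => Math.min(hi, Math.max(lo, x));

// Model of i16x8.narrow_i32x4_s: 4 + 4 signed 32-bit lanes -> 8 signed 16-bit lanes.
function narrowI32x4S(a: Int32Array, b: Int32Array): Int16Array {
  const out = new Int16Array(8);
  for (let lane = 0; lane !== 4; lane++) {
    out[lane] = clamp(a[lane], -32768, 32767);
    out[lane + 4] = clamp(b[lane], -32768, 32767);
  }
  return out;
}

// Model of i8x16.narrow_i16x8_u: 8 + 8 signed 16-bit lanes -> 16 unsigned bytes.
function narrowI16x8U(a: Int16Array, b: Int16Array): Uint8Array {
  const out = new Uint8Array(16);
  for (let lane = 0; lane !== 8; lane++) {
    out[lane] = clamp(a[lane], 0, 255);
    out[lane + 8] = clamp(b[lane], 0, 255);
  }
  return out;
}

const narrowed16 = narrowI32x4S(Int32Array.of(0, 255, 300, -5), Int32Array.of(70000, 128, 1, 254));
const narrowed8 = narrowI16x8U(narrowed16, narrowed16);
const wrapped = Uint8Array.of(300, -5); // what a plain truncating store would give

console.log(Array.from(narrowed16).join(" ")); // 0 255 300 -5 32767 128 1 254
console.log(Array.from(narrowed8).slice(0, 8).join(" ")); // 0 255 255 0 255 128 1 254
console.log(Array.from(wrapped).join(" ")); // 44 251
```

&lt;p&gt;Saturation turns 300 into 255 and -5 into 0, where truncation would give 44 and 251: a nearly white pixel would come out dark grey.&lt;/p&gt;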



&lt;p&gt;Every op above is desugared into a plain TS loop (with lane diagrams) in the &lt;a href="https://hertz.gg/til/2026-05-26-wasm-simd-by-example-rgb-to-luma.html" rel="noopener noreferrer"&gt;full version&lt;/a&gt; — if you can read &lt;code&gt;for (let lane = 0; lane &amp;lt; 8; lane++)&lt;/code&gt;, you can read SIMD.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why it's fast
&lt;/h2&gt;

&lt;p&gt;The lanes aren't "connected" by clever hardware — they're deliberately &lt;strong&gt;disconnected&lt;/strong&gt;. The vector ALU is one wide slab of adders, and the lane prefix just decides where the carry chain gets cut. All 16 byte-adds physically exist side by side and fire in the same cycle. The other half of the win: one instruction fetch/decode/retire now buys 16 results, so the CPU front-end does 1/16th of the work. WASM &lt;code&gt;v128&lt;/code&gt; compiles ~1:1 to NEON/SSE, so none of it is lost in translation. (Full post has the carry-wire detour, down to the AND gate.)&lt;/p&gt;
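&lt;p&gt;The "where the carry chain gets cut" idea can be imitated in plain TypeScript. This is only an analogy under stated assumptions: four bytes stand in for a full 128-bit register, a little-endian &lt;code&gt;DataView&lt;/code&gt; plays the role of the 32-bit lane, and all variable names are invented.&lt;/p&gt;

```typescript
// Four bytes of one "register"; lane 0 is about to overflow.
const lanes: number[] = [0xff, 0x00, 0x00, 0x00];

// i8x16.add style: add 1 to every byte, carry chain cut at each byte boundary.
const perByte: number[] = lanes.map((x) => (x + 1) % 256);

// i32x4.add style on the same bits: one 32-bit lane, so the carry out of
// byte 0 runs on into byte 1.
const view = new DataView(new ArrayBuffer(4));
lanes.forEach((x, i) => view.setUint8(i, x));
const sum = (view.getUint32(0, true) + 0x01010101) % 2 ** 32;
view.setUint32(0, sum, true);
const perWord: number[] = [0, 1, 2, 3].map((i) => view.getUint8(i));

console.log(perByte.join(" ")); // 0 1 1 1
console.log(perWord.join(" ")); // 0 2 1 1
```

&lt;p&gt;Same input bits, same adder, different answer in byte 1: the only thing that changed is where the carry was allowed to travel.&lt;/p&gt;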

&lt;h2&gt;
  
  
  Gotchas
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Pass &lt;code&gt;--enable simd&lt;/code&gt; to the AssemblyScript compiler or &lt;code&gt;v128&lt;/code&gt; won't compile.&lt;/li&gt;
&lt;li&gt;Lane width must match the op — &lt;code&gt;i32x4.add&lt;/code&gt; on bytes silently treats 4 bytes as one int.&lt;/li&gt;
&lt;li&gt;The loop condition &lt;code&gt;i + 16 &amp;lt;= pixels&lt;/code&gt; &lt;strong&gt;silently skips the last &lt;code&gt;pixels % 16&lt;/code&gt; pixels&lt;/strong&gt; — no crash, no error, just stale output bytes. Real code needs a scalar tail loop; mine dodges it because my image dimensions are multiples of 16.&lt;/li&gt;
&lt;li&gt;Integer (Q15) math made the kernel bit-exact against its TS reference. Floats wouldn't.&lt;/li&gt;
&lt;/ul&gt;
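&lt;p&gt;For the tail-pixel gotcha, the usual shape of the fix is a vector loop followed by a scalar loop. The sketch below is written in plain TypeScript rather than AssemblyScript; &lt;code&gt;kernel16&lt;/code&gt; is a hypothetical stand-in for the 16-pixel SIMD body, and &lt;code&gt;rgbToLuma&lt;/code&gt; is an invented wrapper, not the function from the post.&lt;/p&gt;

```typescript
const LANES = 16;

function lumaQ15(r: number, g: number, b: number): number {
  return (6966 * r + 23436 * g + 2366 * b + 0x4000) >> 15;
}

// Stand-in for the SIMD kernel: always handles exactly 16 pixels.
function kernel16(rgb: Uint8Array, out: Uint8Array, start: number): void {
  for (let p = start; p !== start + LANES; p++) {
    out[p] = lumaQ15(rgb[3 * p], rgb[3 * p + 1], rgb[3 * p + 2]);
  }
}

function rgbToLuma(rgb: Uint8Array, pixels: number): Uint8Array {
  const out = new Uint8Array(pixels);
  let i = 0;
  // Vector loop: full blocks only (same condition as i + 16 being at most pixels).
  for (; pixels >= i + LANES; i += LANES) kernel16(rgb, out, i);
  // Scalar tail: the remaining pixels % 16 pixels, one at a time.
  for (; pixels > i; i++) {
    out[i] = lumaQ15(rgb[3 * i], rgb[3 * i + 1], rgb[3 * i + 2]);
  }
  return out;
}

// 37 white pixels: two full blocks plus a tail of five.
const white = new Uint8Array(37 * 3).fill(255);
const gray = rgbToLuma(white, 37);
console.log(gray.length, gray[31], gray[36]); // 37 255 255
```

&lt;p&gt;Without the second loop, pixels 32 to 36 would keep whatever the output buffer held before, which for a fresh &lt;code&gt;Uint8Array&lt;/code&gt; is 0.&lt;/p&gt;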

&lt;h2&gt;
  
  
  Isn't this just a tiny GPU?
&lt;/h2&gt;

&lt;p&gt;Kind of, yeah. A GPU is the same lane trick scaled until it's the whole chip — a warp is 32 lanes in lock-step, and the shader programming model exists so you write the scalar loop while the hardware runs the lanes. The catch is that the GPU is in another building: pixels have to commute through buffer uploads and a dispatch queue, and those costs don't care that your math is three multiplies per pixel. For one cheap pass over data already in your hands, SIMD on the CPU is hard to argue with.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Condensed from the &lt;a href="https://hertz.gg/til/2026-05-26-wasm-simd-by-example-rgb-to-luma.html" rel="noopener noreferrer"&gt;canonical version&lt;/a&gt;, which has the per-op TS desugarings, ASCII lane diagrams, and the silicon detour. Origin story: &lt;a href="https://hertz.gg/blog/2026-05-25-eink-dithering-wasm-simd-64ms-to-16ms.html" rel="noopener noreferrer"&gt;dithering photos for an e-ink display, 64 ms → 16 ms&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webassembly</category>
      <category>performance</category>
      <category>deno</category>
      <category>typescript</category>
    </item>
    <item>
      <title>🐋 Incremental (+Parallel) Builds + Manifest Lists = ❤️</title>
      <dc:creator>George Hertz</dc:creator>
      <pubDate>Fri, 09 Apr 2021 15:41:18 +0000</pubDate>
      <link>https://dev.to/hertzg/how-to-build-images-for-multiple-platforms-kind-of-incrementally-using-docker-buildx-50ik</link>
      <guid>https://dev.to/hertzg/how-to-build-images-for-multiple-platforms-kind-of-incrementally-using-docker-buildx-50ik</guid>
      <description>&lt;p&gt;This is &lt;a href="https://hertz.gg/blog/2021-02-13-buildx-each-arch-and-manually-create-manifest-list.html" rel="noopener noreferrer"&gt;a cross post&lt;/a&gt; of &lt;a href="https://hertz.gg/" rel="noopener noreferrer"&gt;my (not really) blog posts from github&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Using buildx to build docker images for foreign architectures separately using qemu and publishing as one multi-arch image to docker hub.
&lt;/h1&gt;

&lt;p&gt;Steps involved in words:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Build image for each architecture and push to temp registry&lt;/li&gt;
&lt;li&gt;Create a manifest list grouping them together in the temp registry&lt;/li&gt;
&lt;li&gt;Use skopeo to copy from temp registry to public registry&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These steps are easier said than done; a few things need to happen first.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example project
&lt;/h2&gt;

&lt;p&gt;Let's imagine a project that runs on docker, for which we would like to build images for the following platforms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;linux/amd64&lt;/li&gt;
&lt;li&gt;linux/arm64/v8&lt;/li&gt;
&lt;li&gt;linux/arm/v7&lt;/li&gt;
&lt;li&gt;linux/arm/v6&lt;/li&gt;
&lt;li&gt;linux/ppc64le&lt;/li&gt;
&lt;li&gt;linux/s390x&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The build should happen in parallel for each platform, but only publish one "multi-arch" image (in other words a&lt;br&gt;
manifest list).&lt;/p&gt;

&lt;p&gt;Here's a sample app&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app.js&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;http&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createServer&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;statusCode&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text/plain&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Hello World&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;port&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Server running at %j`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;address&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And its accompanying (not very good) Dockerfile&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FROM node:14-alpine
RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--"]
WORKDIR /app
COPY ./app.js ./app.js
CMD [ "node", "/app/app.js" ]
EXPOSE 3000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 1.1: Setup
&lt;/h2&gt;

&lt;p&gt;To perform the first step we need to set up a few things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;registry&lt;/li&gt;
&lt;li&gt;qemu - to emulate different cpus for building&lt;/li&gt;
&lt;li&gt;binfmt&lt;/li&gt;
&lt;li&gt;buildx builder that has access to all above&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 1.1.1: registry
&lt;/h3&gt;

&lt;p&gt;First start a v2 registry and expose as an &lt;strong&gt;INSECURE&lt;/strong&gt; &lt;code&gt;localhost:5000&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run --rm --name registry -p 5000:5000 registry:2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 1.1.2:  qemu, binfmt &amp;amp; buildx
&lt;/h3&gt;

&lt;p&gt;Now set up &lt;code&gt;qemu&lt;/code&gt;, the &lt;code&gt;binfmt&lt;/code&gt; configuration that uses it, and a dedicated &lt;code&gt;buildx&lt;/code&gt; builder container that has access to the host network.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt-get install qemu-user-static

docker run --privileged --rm tonistiigi/binfmt --install all

docker buildx create \
                --name builder \
                --driver docker-container \
                --driver-opt network=host \
                --use

docker buildx inspect builder --bootstrap
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;tonistiigi/binfmt --install all&lt;/code&gt; is a docker container "with side-effects" that will set up &lt;code&gt;binfmt&lt;/code&gt;&lt;br&gt;
configuration &lt;strong&gt;on your host&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;--driver-opt network=host&lt;/code&gt; option allows the &lt;code&gt;buildx&lt;/code&gt; container to reach the &lt;code&gt;registry&lt;/code&gt; running on the host at &lt;code&gt;localhost:5000&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;buildx inspect --bootstrap&lt;/code&gt; command will kick off the container and print its information for us.&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 1.2: Build
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt;: Buildx by itself runs the builds in parallel if you pass a comma-separated list of platforms to the &lt;code&gt;buildx build&lt;/code&gt; command via the &lt;code&gt;--platform&lt;/code&gt; flag.&lt;/p&gt;

&lt;p&gt;The problem for me, and the whole reason for writing this post, is that if a build with multiple platforms in &lt;code&gt;--platform&lt;/code&gt; fails for &lt;strong&gt;one of the platforms&lt;/strong&gt;, the whole build is marked as failed and you get nothing.&lt;/p&gt;

&lt;p&gt;Another use case: together with the one multi-arch image, you may also want to push arch-specific repositories (e.g. &lt;code&gt;docker.io/app/app&lt;/code&gt;, &lt;code&gt;docker.io/arm64v8/app&lt;/code&gt; and &lt;code&gt;docker.io/amd64/app&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;The other case is that I do builds on multiple actual machines that natively have &lt;code&gt;arm/v6&lt;/code&gt;, &lt;code&gt;arm/v7&lt;/code&gt; and &lt;code&gt;arm64/v8&lt;/code&gt;&lt;br&gt;
cpus (a cluster of different Pis and similar).&lt;/p&gt;

&lt;p&gt;There are probably even more reasons why you would want to build them this way 🤷.&lt;/p&gt;

&lt;p&gt;Now we are ready to start building for different architectures with our &lt;code&gt;buildx&lt;/code&gt; builder for this example.&lt;/p&gt;

&lt;p&gt;The base &lt;code&gt;alpine&lt;/code&gt; image supports the following architectures.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;linux/amd64&lt;/li&gt;
&lt;li&gt;linux/arm/v6&lt;/li&gt;
&lt;li&gt;linux/arm/v7&lt;/li&gt;
&lt;li&gt;linux/arm64/v8&lt;/li&gt;
&lt;li&gt;linux/ppc64le&lt;/li&gt;
&lt;li&gt;linux/s390x&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's target all of them 😎&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker buildx build \
        --tag localhost:5000/app:linux-amd64 \
        --platform linux/amd64 \
        --load \
        --progress plain \
        . &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;
docker buildx build \
        --tag localhost:5000/app:linux-arm-v6 \
        --platform linux/arm/v6 \
        --load \
        --progress plain \
        . &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;
docker buildx build \
        --tag localhost:5000/app:linux-arm-v7 \
        --platform linux/arm/v7 \
        --load \
        --progress plain \
        . &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;
docker buildx build \
        --tag localhost:5000/app:linux-arm64-v8 \
        --platform linux/arm64/v8 \
        --load \
        --progress plain \
        . &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;
docker buildx build \
        --tag localhost:5000/app:linux-ppc64le \
        --platform linux/ppc64le \
        --load \
        --progress plain \
        . &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;
docker buildx build \
        --tag localhost:5000/app:linux-s390x \
        --platform linux/s390x \
        --load \
        --progress plain \
        . &amp;gt; /dev/null 2&amp;gt;&amp;amp;1 &amp;amp;
wait

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once this is done, the images will be loaded and visible with the &lt;code&gt;docker images&lt;/code&gt; command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ docker images

...
localhost:5000/app   linux-arm64-v8    e3ec56e457e6   About a minute ago   115MB
localhost:5000/app   linux-arm-v7      ab770e5be5d1   About a minute ago   106MB
localhost:5000/app   linux-ppc64le     3a328d516acf   About a minute ago   126MB
localhost:5000/app   linux-s390x       73e064c0c3d4   About a minute ago   119MB
localhost:5000/app   linux-amd64       f6260fedf498   About a minute ago   116MB
localhost:5000/app   linux-arm-v6      5a1fb75b0a45   2 minutes ago        110MB
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Strictly speaking there is no need to &lt;code&gt;--load&lt;/code&gt; the images into your local docker; &lt;code&gt;buildx&lt;/code&gt; can also push directly to the local registry.&lt;/p&gt;

&lt;p&gt;Because this example loads them locally, the images have to be pushed as an extra step:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker push --all-tags -q localhost:5000/app
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: Manifest List
&lt;/h2&gt;

&lt;p&gt;Now we only need to group those images together into one big manifest list.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker manifest create --insecure
    localhost:5000/app:1.0.0 \
        localhost:5000/app:linux-amd64 \
        localhost:5000/app:linux-arm-v6 \
        localhost:5000/app:linux-arm-v7 \
        localhost:5000/app:linux-arm64-v8 \
        localhost:5000/app:linux-ppc64le \
        localhost:5000/app:linux-s390x

docker manifest push --insecure localhost:5000/app:1.0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
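&lt;p&gt;The six build commands and the manifest command differ only in the platform and the tag derived from it, so they are easy to generate. The following is a sketch, not part of the original setup: &lt;code&gt;tagFor&lt;/code&gt;, &lt;code&gt;buildArgs&lt;/code&gt; and &lt;code&gt;manifestArgs&lt;/code&gt; are hypothetical helper names, while the repository, platforms and flags are the ones used in the commands above. It only prints the command lines; actually running them is left to whatever process API you prefer.&lt;/p&gt;

```typescript
// Hypothetical helpers; repo, platforms and flags mirror the commands above.
const repo = "localhost:5000/app";
const platforms: string[] = [
  "linux/amd64",
  "linux/arm/v6",
  "linux/arm/v7",
  "linux/arm64/v8",
  "linux/ppc64le",
  "linux/s390x",
];

// "linux/arm/v6" -> "localhost:5000/app:linux-arm-v6"
const tagFor = (platform: string): string =>
  `${repo}:${platform.split("/").join("-")}`;

// One argv array per platform, same flags as the hand-written commands.
const buildArgs = (platform: string): string[] => [
  "docker", "buildx", "build",
  "--tag", tagFor(platform),
  "--platform", platform,
  "--load",
  "--progress", "plain",
  ".",
];

// The manifest list groups every per-platform tag under one version tag.
const manifestArgs = (version: string): string[] => [
  "docker", "manifest", "create", "--insecure",
  `${repo}:${version}`,
  ...platforms.map(tagFor),
];

for (const platform of platforms) console.log(buildArgs(platform).join(" "));
console.log(manifestArgs("1.0.0").join(" "));
```

&lt;p&gt;Keeping the platform list in one place means a platform that fails to build can be dropped from both the builds and the manifest list with a single edit.&lt;/p&gt;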



&lt;h2&gt;
  
  
  Step 3.1: Skopeo
&lt;/h2&gt;

&lt;p&gt;The last step is to copy the manifest list and only the blobs that it links to. For this we need &lt;code&gt;skopeo&lt;/code&gt;, an amazing tool for working with registries.&lt;/p&gt;

&lt;p&gt;You can either build it from source or, on Ubuntu 20.04, use the prebuilt &lt;code&gt;kubic&lt;/code&gt; packages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt;: You don't have to install it with this script, just follow the install guide&lt;br&gt;
at &lt;a href="https://github.com/containers/skopeo/blob/master/install.md" rel="noopener noreferrer"&gt;https://github.com/containers/skopeo/blob/master/install.md&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OS="x$(lsb_release --id -s)_$(lsb_release --release -s)"
echo "deb http://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/${OS}/ /" &amp;gt; /etc/apt/sources.list.d/skopeop.kubic.list
wget -qO- "https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/${OS}/Release.key" | apt-key add -
apt-get update
apt-get install -y skopeo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because our local registry is insecure, &lt;code&gt;skopeo&lt;/code&gt; will complain when we try to copy from it, so we need to explicitly&lt;br&gt;
configure it to allow insecure connections to our temporary registry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[[registry]]
location = 'localhost:5000'
insecure = true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Save this as &lt;code&gt;/etc/containers/registries.conf.d/localhost-5000.conf&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3.2: Copy
&lt;/h2&gt;

&lt;p&gt;The final step is to copy only &lt;code&gt;localhost:5000/app:1.0.0&lt;/code&gt; to, let's say, &lt;code&gt;hertzg/example:app-1.0.0&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;First you might need to authenticate with your target registry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;skopeo login docker.io
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we can finally copy the image&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;skopeo copy \
        --all \
        docker://localhost:5000/app:1.0.0 \
        docker://docker.io/hertzg/example:app-1.0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This might take some time, but once it's finished you can check Docker Hub or just pull and run the image on the target&lt;br&gt;
architectures:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run --rm -it hertzg/example:app-1.0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;p&gt;Cover image from &lt;a href="https://laptrinhx.com/multi-arch-all-the-things-1320316701/" rel="noopener noreferrer"&gt;https://laptrinhx.com/multi-arch-all-the-things-1320316701/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>docker</category>
      <category>node</category>
      <category>skopeo</category>
      <category>qemu</category>
    </item>
    <item>
      <title>Hacking BLE Kitchen Scale</title>
      <dc:creator>George Hertz</dc:creator>
      <pubDate>Fri, 04 Sep 2020 16:13:57 +0000</pubDate>
      <link>https://dev.to/hertzg/hacking-ble-kitchen-scale-55io</link>
      <guid>https://dev.to/hertzg/hacking-ble-kitchen-scale-55io</guid>
      <description>&lt;h1&gt;
  
  
  TL;DR: Result
&lt;/h1&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/hertzg" rel="noopener noreferrer"&gt;
        hertzg
      &lt;/a&gt; / &lt;a href="https://github.com/hertzg/metekcity" rel="noopener noreferrer"&gt;
        metekcity
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      ETEKCITY smart nutrition scale protocol reverse engineering
    &lt;/h3&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;h1&gt;
  
  
  Backstory
&lt;/h1&gt;

&lt;p&gt;Recently I have been gaining weight and blaming it all on COV-19 (jira issue).&lt;br&gt;
So I decided I had to manage my food intake and count calories. Therefore I did what I do best: procrastinate and do other stuff while still thinking about the task at hand.&lt;br&gt;
All of this, plus Amazon and my interest in IoT, somehow convinced me to buy the &lt;a href="https://www.amazon.de/gp/product/B07RJV9QPF" rel="noopener noreferrer"&gt;Etekcity Smart Nutrition Food Calorie Kitchen Digital Scale&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Having a hard time waking up the next day after late-night impulse buying, I tried the device, played a bit with the (meh) app and realized that this is just an overpriced kitchen scale with an app whose sole purpose is data mining. After installing the VeSync app, skipping as much of the registration as possible and playing with it enough, I decided to try to gain control of the device without having to use the (meh) app.&lt;/p&gt;
&lt;h1&gt;
  
  
  Disclaimer
&lt;/h1&gt;

&lt;p&gt;Before I go into technical details I would like to mention that I have never worked with BLE devices before. Armed with zero technical knowledge of Bluetooth Low Energy, I was (not) equipped with all the knowledge I needed and (definitely not) ready to start hacking.&lt;/p&gt;
&lt;h1&gt;
  
  
  Step 1: Take it apart 🛠
&lt;/h1&gt;

&lt;p&gt;Having dabbled in some other IoT devices (ESP), my first instinct was to disassemble the device and try to find out how this thing worked. I was hoping to find the microcontroller name and model or some exposed and labeled debug ports, but I was disappointed to see this.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Fhertzg%2Fetekcity%2Fraw%2Fmaster%2Fresearch%2Fhardware%2Fesn00%2Fphoto_2020-09-04_01-17-35.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Fhertzg%2Fetekcity%2Fraw%2Fmaster%2Fresearch%2Fhardware%2Fesn00%2Fphoto_2020-09-04_01-17-35.jpg" alt="Photograph of PCB inside with blobbed microcontroller" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The PCB was labeled here and there, but the labels were not too helpful as they were just "component IDs" for pick and place. The communications module, however, had some information for me.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Fhertzg%2Fetekcity%2Fraw%2Fmaster%2Fresearch%2Fhardware%2Fesn00%2Fphoto_2020-09-04_01-17-34.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Fhertzg%2Fetekcity%2Fraw%2Fmaster%2Fresearch%2Fhardware%2Fesn00%2Fphoto_2020-09-04_01-17-34.jpg" alt="Photograph of BLE communication module with IC" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The communications module is for Bluetooth 4, which is something I can start investigating.&lt;/p&gt;
&lt;h1&gt;
  
  
  Step 2: Maybe there's a lib for it? 🥺
&lt;/h1&gt;

&lt;p&gt;The next step was to figure out how to communicate with the scale. Maybe someone else had already done some hacking on it? 💔 I was not able to find any information about this device 💔. One of the few related projects was oliexdev/openScale:&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/oliexdev" rel="noopener noreferrer"&gt;
        oliexdev
      &lt;/a&gt; / &lt;a href="https://github.com/oliexdev/openScale" rel="noopener noreferrer"&gt;
        openScale
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Open-source weight and body metrics tracker, with support for Bluetooth scales
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;&amp;nbsp; &lt;a rel="noopener noreferrer" href="https://github.com/oliexdev/openScale/blob/master/fastlane/metadata/android/en-GB/images/icon.png"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Foliexdev%2FopenScale%2Fraw%2Fmaster%2Ffastlane%2Fmetadata%2Fandroid%2Fen-GB%2Fimages%2Ficon.png" alt="openScale logo" height="60"&gt;&lt;/a&gt; &amp;nbsp;openScale &lt;a href="https://github.com/oliexdev/openScale/actions/workflows/ci_master.yml" rel="noopener noreferrer"&gt;&lt;img src="https://github.com/oliexdev/openScale/actions/workflows/ci_master.yml/badge.svg" alt="CI"&gt;&lt;/a&gt;
&lt;a href="https://hosted.weblate.org/engage/openscale/?utm_source=widget" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/7fb8078b1cb0de184a1bba2dbb264580330f2dbe09aa55b3082d1d5b1d519ccf/68747470733a2f2f686f737465642e7765626c6174652e6f72672f776964676574732f6f70656e7363616c652f2d2f737472696e67732f7376672d62616467652e737667" alt="Translation status"&gt;&lt;/a&gt;
&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;Open-source weight and body metrics tracker, with support for Bluetooth scales&lt;/p&gt;
&lt;a href="https://f-droid.org/repository/browse/?fdid=com.health.openscale" rel="nofollow noopener noreferrer"&gt;
  &lt;img src="https://camo.githubusercontent.com/cc19b595c1ae3eb0a0db3de92866179ff59cd5dcf21a9c0707814232cadc00c9/68747470733a2f2f662d64726f69642e6f72672f62616467652f6765742d69742d6f6e2e706e67" alt="Get it on F-Droid" height="80"&gt;
&lt;/a&gt;
&lt;a href="https://play.google.com/store/apps/details?id=com.health.openscale.oss" rel="nofollow noopener noreferrer"&gt;
  &lt;img src="https://camo.githubusercontent.com/38fc1d921d1e02a6d7799c50fd5d2803bf0ac13e6e7db71712d13a72409e4ca3/68747470733a2f2f706c61792e676f6f676c652e636f6d2f696e746c2f656e5f75732f6261646765732f696d616765732f67656e657269632f656e2d706c61792d62616467652e706e67" alt="Get it on Google Play (Beta)" title="Beta version only" height="80"&gt;
&lt;/a&gt;
&lt;div class="markdown-alert markdown-alert-note"&gt;
&lt;p class="markdown-alert-title"&gt;Note&lt;/p&gt;
&lt;p&gt;On &lt;a href="https://play.google.com/store/apps/details?id=com.health.openscale.oss" rel="nofollow noopener noreferrer"&gt;Google Play&lt;/a&gt; the &lt;strong&gt;openScale&lt;/strong&gt; version is offered as an open beta.&lt;/p&gt;
&lt;p&gt;For the latest development state, install the latest &lt;a href="https://github.com/oliexdev/openScale/releases/tag/dev-build" rel="noopener noreferrer"&gt;openScale dev&lt;/a&gt; build from the &lt;a href="https://github.com/oliexdev/openScale/releases" rel="noopener noreferrer"&gt;GitHub release page&lt;/a&gt;
Please be aware that the development version, may contain bugs, and will not receive automatic updates.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Summary 📋&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;Monitor and track your weight, BMI, body fat, body water, muscle and other body metrics in an open source app that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;has an easy to use user interface with graphs,&lt;/li&gt;
&lt;li&gt;supports various Bluetooth scales,&lt;/li&gt;
&lt;li&gt;doesn't require you to create an account,&lt;/li&gt;
&lt;li&gt;can be configured to only show the metrics you care about, and&lt;/li&gt;
&lt;li&gt;respects your privacy and lets you decide what to do with your data.&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Supported Bluetooth scales 🚀&lt;/h1&gt;

&lt;/div&gt;
&lt;p&gt;openScale has built-in support for a number of Bluetooth (BLE or "smart") scales from  many manufacturers, e.g. Beurer, Sanitas, Yunmai…&lt;/p&gt;&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/oliexdev/openScale" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;
 

&lt;p&gt;But it only targets body weight scales 💔.&lt;/p&gt;

&lt;p&gt;I was also able to find a GitHub issue asking about this particular device and model, and it was rejected for obvious reasons 💔.&lt;/p&gt;


&lt;div class="ltag_github-liquid-tag"&gt;
  &lt;h1&gt;
    &lt;a href="https://github.com/oliexdev/openScale/issues/509" rel="noopener noreferrer"&gt;
      &lt;img class="github-logo" alt="GitHub logo" src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg"&gt;
      &lt;span class="issue-title"&gt;
        Add Support for the ETEKCITY Bluetooth Scale
      &lt;/span&gt;
      &lt;span class="issue-number"&gt;#509&lt;/span&gt;
    &lt;/a&gt;
  &lt;/h1&gt;
  &lt;div class="github-thread"&gt;
    &lt;div class="timeline-comment-header"&gt;
      &lt;a href="https://github.com/Dipan61241828" rel="noopener noreferrer"&gt;
        &lt;img class="github-liquid-tag-img" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Favatars2.githubusercontent.com%2Fu%2F56672883%3Fv%3D4" alt="Dipan61241828 avatar"&gt;
      &lt;/a&gt;
      &lt;div class="timeline-comment-header-text"&gt;
        &lt;strong&gt;
          &lt;a href="https://github.com/Dipan61241828" rel="noopener noreferrer"&gt;Dipan61241828&lt;/a&gt;
        &lt;/strong&gt; posted on &lt;a href="https://github.com/oliexdev/openScale/issues/509" rel="noopener noreferrer"&gt;&lt;time&gt;Oct 17, 2019&lt;/time&gt;&lt;/a&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="ltag-github-body"&gt;
      &lt;p&gt;Hi it's great app. it works like a charm for almost all devices. Thanks for this great creation.
recently i bought new weight measurement scale of ETEKCITY and it is not supported by this app.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.etekcity.com/product/100334" rel="nofollow noopener noreferrer"&gt;https://www.etekcity.com/product/100334&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;here the debug log file attached with your debug app&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/oliexdev/openScale/files/3738102/openScale_2019-10-17_12-57.txt" rel="noopener noreferrer"&gt;openScale_2019-10-17_12-57.txt&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;More debug Log,
&lt;a href="https://github.com/oliexdev/openScale/files/3738761/openScale_2019-10-17_16-04_new.txt" rel="noopener noreferrer"&gt;openScale_2019-10-17_16-04_new.txt&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Scale information
&lt;a href="https://user-images.githubusercontent.com/56672883/67012835-b41b4400-f10f-11e9-998e-3ee65d5b0b51.jpg" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fuser-images.githubusercontent.com%2F56672883%2F67012835-b41b4400-f10f-11e9-998e-3ee65d5b0b51.jpg" alt="Screenshot_20191017-184936"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Let me know if the above is sufficient or should I need to give more.&lt;/p&gt;
&lt;p&gt;Thank you.&lt;/p&gt;

    &lt;/div&gt;
    &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/oliexdev/openScale/issues/509" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;


&lt;h1&gt;
  
  
  Step 3: Down the rabbit hole 🐰
&lt;/h1&gt;

&lt;p&gt;I love JS and Node.js, and I felt confident (for some weird reason) that in the worst case I could use some Linux tools with &lt;code&gt;child_process&lt;/code&gt; or even hack something together in &lt;code&gt;C&lt;/code&gt; to communicate over BLE (via USB). It was already getting late and I was getting delirious :D.&lt;/p&gt;

&lt;p&gt;At this point I wanted to at least be able to read the measurements. I quickly googled a module for Node, which was a good start.&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/noble" rel="noopener noreferrer"&gt;
        noble
      &lt;/a&gt; / &lt;a href="https://github.com/noble/noble" rel="noopener noreferrer"&gt;
        noble
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      A Node.js BLE (Bluetooth Low Energy) central module 
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;&lt;a rel="noopener noreferrer" href="https://github.com/noble/noble/assets/noble-logo.png"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2Fnoble%2Fnoble%2FHEAD%2Fassets%2Fnoble-logo.png" alt="noble"&gt;&lt;/a&gt;&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a href="https://travis-ci.org/noble/noble" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/3ad657bd76705fcebe470c9434d48ce535eb0229d8bf8dd3a3f34ee2c13a6f77/68747470733a2f2f7472617669732d63692e6f72672f6e6f626c652f6e6f626c652e7376673f6272616e63683d6d6173746572" alt="Build Status"&gt;&lt;/a&gt;
&lt;a href="https://gitter.im/sandeepmistry/noble?utm_source=badge&amp;amp;utm_medium=badge&amp;amp;utm_campaign=pr-badge&amp;amp;utm_content=badge" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/abe08b740a4156153736f791393ec4da6619c4be73212e75769f52edacc0e2b5/68747470733a2f2f6261646765732e6769747465722e696d2f4a6f696e253230436861742e737667" alt="Gitter"&gt;&lt;/a&gt; &lt;a href="https://github.com/noble/noble#backers" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/56817752b1a467b8baceed5510cdaa0bd8d067489bb1bb69db5a4d7b4f98edb9/68747470733a2f2f6f70656e636f6c6c6563746976652e636f6d2f6e6f626c652f6261636b6572732f62616467652e737667" alt="OpenCollective"&gt;&lt;/a&gt;
&lt;a href="https://github.com/noble/noble#sponsors" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/6dec6860472a2a96a3a1f35608d45826244091e95616c796d3ba645e39c50044/68747470733a2f2f6f70656e636f6c6c6563746976652e636f6d2f6e6f626c652f73706f6e736f72732f62616467652e737667" alt="OpenCollective"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;A Node.js BLE (Bluetooth Low Energy) central module.&lt;/p&gt;
&lt;p&gt;Want to implement a peripheral? Checkout &lt;a href="https://github.com/sandeepmistry/bleno" rel="noopener noreferrer"&gt;bleno&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; macOS / Mac OS X, Linux, FreeBSD and Windows are currently the only supported OSes. Other platforms may be developed later on.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Prerequisites&lt;/h2&gt;
&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;OS X&lt;/h3&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;install &lt;a href="https://itunes.apple.com/ca/app/xcode/id497799835?mt=12" rel="nofollow noopener noreferrer"&gt;Xcode&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Linux&lt;/h3&gt;

&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Kernel version 3.6 or above&lt;/li&gt;
&lt;li&gt;&lt;code&gt;libbluetooth-dev&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h4 class="heading-element"&gt;Ubuntu/Debian/Raspbian&lt;/h4&gt;

&lt;/div&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;sudo apt-get install bluetooth bluez libbluetooth-dev libudev-dev&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;Make sure &lt;code&gt;node&lt;/code&gt; is on your path, if it's not, some options:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;symlink &lt;code&gt;nodejs&lt;/code&gt; to &lt;code&gt;node&lt;/code&gt;: &lt;code&gt;sudo ln -s /usr/bin/nodejs /usr/bin/node&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://nodejs.org/en/download/package-manager/#debian-and-ubuntu-based-linux-distributions" rel="nofollow noopener noreferrer"&gt;install Node.js using the NodeSource package&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h4 class="heading-element"&gt;Fedora / Other-RPM based&lt;/h4&gt;

&lt;/div&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;sudo yum install bluez bluez-libs bluez-libs-devel&lt;/pre&gt;

&lt;/div&gt;
&lt;div class="markdown-heading"&gt;
&lt;h4 class="heading-element"&gt;Intel Edison&lt;/h4&gt;

&lt;/div&gt;
&lt;p&gt;See &lt;a href="http://rexstjohn.com/configure-intel-edison-for-bluetooth-le-smart-development/" rel="nofollow noopener noreferrer"&gt;Configure Intel Edison for Bluetooth LE (Smart) Development&lt;/a&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;FreeBSD&lt;/h3&gt;

&lt;/div&gt;
&lt;p&gt;Make sure you have GNU Make:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;sudo pkg install gmake&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;Disable automatic loading of the default Bluetooth stack by putting &lt;a href="https://gist.github.com/myfreeweb/44f4f3e791a057bc4f3619a166a03b87" rel="noopener noreferrer"&gt;no-ubt.conf&lt;/a&gt; into &lt;code&gt;/usr/local/etc/devd/no-ubt.conf&lt;/code&gt; and restarting devd (&lt;code&gt;sudo service devd restart&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;Unload &lt;code&gt;ng_ubt&lt;/code&gt; kernel module if already loaded:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;sudo kldunload ng_ubt&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;…&lt;/p&gt;&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/noble/noble" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;And started hacking away and logging output. With some luck (and then more luck) I was able to guess the correct service and characteristic, and ended up with some notes from which I could start looking at the protocol.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;
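
&lt;p&gt;For reference, subscribing to the scale's notifications with noble might look roughly like the sketch below. It uses the service and notify characteristic IDs from the README further down (&lt;code&gt;1910&lt;/code&gt; and &lt;code&gt;2c12&lt;/code&gt;), and it assumes the scale advertises that service so the scan can filter on it, which is not confirmed here. The &lt;code&gt;listenToScale&lt;/code&gt; name is made up, and the noble module is passed in as a parameter so the function can be tried against a stub without hardware.&lt;/p&gt;

```javascript
// Sketch only: connects to the first matching peripheral and forwards every
// notification as a hex string. `noble` is injected (assumption) rather than
// required here, so this file runs even without a Bluetooth adapter.
function listenToScale(noble, onPacket) {
  const SERVICE = "1910"; // service ID taken from the README
  const NOTIFY = "2c12"; // NOTIFY/INDICATE characteristic from the README

  noble.on("stateChange", function (state) {
    // Assumption: the scale advertises SERVICE, so we can filter the scan on it.
    if (state === "poweredOn") noble.startScanning([SERVICE], false);
  });

  noble.on("discover", function (peripheral) {
    noble.stopScanning();
    peripheral.connect(function (error) {
      if (error) return onPacket(error);
      peripheral.discoverSomeServicesAndCharacteristics(
        [SERVICE],
        [NOTIFY],
        function (err, services, characteristics) {
          if (err) return onPacket(err);
          const notify = characteristics[0];
          if (!notify) return onPacket(new Error("notify characteristic not found"));
          // Register the listener first, then ask the device to start notifying.
          notify.on("data", function (data) {
            onPacket(null, data.toString("hex"));
          });
          notify.subscribe(function (subErr) {
            if (subErr) onPacket(subErr);
          });
        }
      );
    });
  });
}
```

&lt;p&gt;With real hardware this would be called as &lt;code&gt;listenToScale(require("noble"), callback)&lt;/code&gt;; each notification then arrives as a hex string that can be compared against the protocol notes.&lt;/p&gt;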


&lt;p&gt;Around 4 am I finished writing the README and was finally tired enough to just go to bed and rest.&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/hertzg" rel="noopener noreferrer"&gt;
        hertzg
      &lt;/a&gt; / &lt;a href="https://github.com/hertzg/metekcity" rel="noopener noreferrer"&gt;
        metekcity
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      ETEKCITY smart nutrition scale protocol reverse engineering
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;&lt;p&gt;&lt;a href="https://actions-badge.atrox.dev/hertzg/etekcity/goto?ref=master" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/4d801c223ca3170551dc2b4f0b61e569a5a2e380c370757767086d8c8bc0e119/68747470733a2f2f696d672e736869656c64732e696f2f656e64706f696e742e7376673f75726c3d6874747073253341253246253246616374696f6e732d62616467652e6174726f782e646576253246686572747a672532466574656b6369747925324662616467652533467265662533446d6173746572267374796c653d666c6174" alt="Build Status"&gt;&lt;/a&gt;
&lt;a href="https://codecov.io/gh/hertzg/node-net-keepalive" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/85aadebddc46906adb713c95d693ad3968d4053f45f4b4c36f1f0aa16a20b069/68747470733a2f2f636f6465636f762e696f2f67682f686572747a672f6e6f64652d6e65742d6b656570616c6976652f6272616e63682f6d61737465722f67726170682f62616467652e737667" alt="codecov"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;ETEKCITY Smart Nutrition Scale&lt;/h1&gt;

&lt;/div&gt;
&lt;p&gt;⚠️ Very much work in progress ⚠️&lt;/p&gt;
&lt;p&gt;This is a potential project that tries to reverse engineer the BLE protocol that ETEKCITY Smart Nutrition Scale (ESN00)
uses.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.etekcity.com/product/100334" rel="nofollow noopener noreferrer"&gt;ETEKCITY Smart Nutrition Scale (ESN00)&lt;/a&gt; (&lt;a href="https://www.amazon.de/gp/product/B07RJV9QPF" rel="nofollow noopener noreferrer"&gt;DE&lt;/a&gt;
| &lt;a href="https://www.amazon.com/Etekcity-ESN00-Nutrition-Counting-Bluetooth/dp/B07FCZSC41" rel="nofollow noopener noreferrer"&gt;US&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer nofollow" href="https://camo.githubusercontent.com/99b9a1efd02583c5de9da5478818f2c594650ba3f9ba0a34ea546e8adb460f7a/68747470733a2f2f696d6167652e6574656b636974792e636f6d2f7468756d622f3230313831302f32382f37663234386337356131333962363662396430653662303831633235613061312e6a7067"&gt;&lt;img src="https://camo.githubusercontent.com/99b9a1efd02583c5de9da5478818f2c594650ba3f9ba0a34ea546e8adb460f7a/68747470733a2f2f696d6167652e6574656b636974792e636f6d2f7468756d622f3230313831302f32382f37663234386337356131333962363662396430653662303831633235613061312e6a7067" alt=""&gt;&lt;/a&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;BLE Protocol&lt;/h2&gt;

&lt;/div&gt;
&lt;p&gt;This section describes the protocol (what was researched so far)&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;BLE Services &amp;amp; Characteristics&lt;/h3&gt;

&lt;/div&gt;
&lt;div class="snippet-clipboard-content notranslate position-relative overflow-auto"&gt;&lt;pre class="notranslate"&gt;&lt;code&gt;&amp;gt; Service: 00001910-0000-1000-8000-00805f9b34fb
&amp;gt;&amp;gt; Characteristic: 00002c10-0000-1000-8000-00805f9b34fb [READ]
&amp;gt;&amp;gt; Characteristic: 00002c11-0000-1000-8000-00805f9b34fb [WRITEWITHOUTRESPONSE, WRITE]
&amp;gt;&amp;gt; Characteristic: 00002c12-0000-1000-8000-00805f9b34fb [NOTIFY, INDICATE]
&amp;gt; Service: 0000180a-0000-1000-8000-00805f9b34fb
&amp;gt;&amp;gt; Characteristic: 00002a23-0000-1000-8000-00805f9b34fb [READ]
&amp;gt;&amp;gt; Characteristic: 00002a50-0000-1000-8000-00805f9b34fb [READ]
&amp;gt; Service: 00001800-0000-1000-8000-00805f9b34fb
&amp;gt;&amp;gt; Characteristic: 00002a00-0000-1000-8000-00805f9b34fb [READ]
&amp;gt;&amp;gt; Characteristic: 00002a01-0000-1000-8000-00805f9b34fb [READ]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Communication happens on service &lt;code&gt;0x1910&lt;/code&gt;: device-to-client communication happens on characteristic &lt;code&gt;0x2c12&lt;/code&gt; (notify) and client-to-device
communication on &lt;code&gt;0x2c11&lt;/code&gt; (write).&lt;/p&gt;
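&lt;p&gt;The short IDs and the long UUIDs in the listing are the same values: a 16-bit ID is placed into the standard Bluetooth Base UUID. A minimal sketch of that expansion (the &lt;code&gt;expandUuid16&lt;/code&gt; helper name is hypothetical):&lt;/p&gt;

```javascript
// Suffix of the Bluetooth Base UUID, shared by every 16-bit UUID.
const BASE_SUFFIX = "-0000-1000-8000-00805f9b34fb";

// Hypothetical helper: expands a 16-bit ID such as 0x1910 to its 128-bit form.
function expandUuid16(shortUuid) {
  const hex = shortUuid.toString(16).padStart(4, "0");
  return "0000" + hex + BASE_SUFFIX;
}

console.log(expandUuid16(0x1910)); // 00001910-0000-1000-8000-00805f9b34fb
console.log(expandUuid16(0x2c12)); // 00002c12-0000-1000-8000-00805f9b34fb
```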
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Protocol&lt;/h2&gt;

&lt;/div&gt;
&lt;p&gt;All packets have this structure&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;Packet&lt;/h3&gt;

&lt;/div&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer nofollow" href="https://camo.githubusercontent.com/eb2d64f7e74578a947562c01200db61df512dd098c4ee309ee3d5c8b42613ba2/68747470733a2f2f6b726f6b692e696f2f7061636b6574646961672f7376672f654e6f7253457a4f5469314a7955784d56366a6d556c417730445732557642495455784a4c624a5751414c362d67724f2d586e464a596c354a56594b4268567071616c707951614a526b41644a6c594b495a55467151725252666b6c69535770746b626d4272485745423042594c5042306b4346706c594b50716c353653555a61457142436c305353784b426b6b41356f446f74434463364a7a5850316a545747746b4a49416d77436d63506277774c4959374d5345334f4c69374e3561726c4167414c4d6a7665"&gt;&lt;img src="https://camo.githubusercontent.com/eb2d64f7e74578a947562c01200db61df512dd098c4ee309ee3d5c8b42613ba2/68747470733a2f2f6b726f6b692e696f2f7061636b6574646961672f7376672f654e6f7253457a4f5469314a7955784d56366a6d556c417730445732557642495455784a4c624a5751414c362d67724f2d586e464a596c354a56594b4268567071616c707951614a526b41644a6c594b495a55467151725252666b6c69535770746b626d4272485745423042594c5042306b4346706c594b50716c353653555a61457142436c305353784b426b6b41356f446f74434463364a7a5850316a545747746b4a49416d77436d63506277774c4959374d5345334f4c69374e3561726c4167414c4d6a7665" alt=""&gt;&lt;/a&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h4 class="heading-element"&gt;Structure: Data&lt;/h4&gt;

&lt;/div&gt;
&lt;p&gt;&lt;a href="https://github.com/hertzg/metekcity/packages/esn00-packet/README.md" rel="noopener noreferrer"&gt;Payload structure is defined in esn00-packet README&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;



&lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/hertzg/metekcity" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;
&lt;br&gt;


&lt;h1&gt;
  
  
  Next steps
&lt;/h1&gt;

&lt;p&gt;I would like to write an (at least half-) decent library to listen to the scale and possibly control its big display with nutritional information from outside the app. For now I need an Android device to sniff the packets and analyze the result.&lt;/p&gt;

&lt;p&gt;I actually do not know which device to choose, so maybe one late night I will pick a random cheap Android phone and invest more in my procrastination, or maybe someone will tell me in the comments which one to go for ¯\_(ツ)_/¯.&lt;/p&gt;

&lt;p&gt;The end goal would (probably) be to have it integrate with homebridge or home-assistant and have it comfortably enable the nutritional value display based on voice commands.&lt;/p&gt;

</description>
      <category>hacking</category>
      <category>javascript</category>
      <category>bluetooth</category>
      <category>node</category>
    </item>
  </channel>
</rss>
