<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mohammed Aman Khan</title>
    <description>The latest articles on DEV Community by Mohammed Aman Khan (@mohammed_amankhan_7cd0b5).</description>
    <link>https://dev.to/mohammed_amankhan_7cd0b5</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3950376%2F55ed46ff-74c4-4876-890a-d508aa2b8def.jpg</url>
      <title>DEV Community: Mohammed Aman Khan</title>
      <link>https://dev.to/mohammed_amankhan_7cd0b5</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mohammed_amankhan_7cd0b5"/>
    <language>en</language>
    <item>
      <title>Zinc – zero-copy shared memory for polyglot stacks. Want your feedback before publishing to npm / PyPI / crates.io / etc.</title>
      <dc:creator>Mohammed Aman Khan</dc:creator>
      <pubDate>Mon, 25 May 2026 09:47:10 +0000</pubDate>
      <link>https://dev.to/mohammed_amankhan_7cd0b5/zinc-zero-copy-shared-memory-for-polyglot-stacks-want-your-feedback-before-publishing-to-npm--501f</link>
      <guid>https://dev.to/mohammed_amankhan_7cd0b5/zinc-zero-copy-shared-memory-for-polyglot-stacks-want-your-feedback-before-publishing-to-npm--501f</guid>
      <description>&lt;p&gt;Here's the problem I kept running into: two processes on the same machine, say, a Python model loader and a Rust inference server, need to share a 100MB tensor. The default answer is Redis, gRPC, or a Unix socket. All three serialize the data, copy it through kernel space, and deserialize on the other side. The tensor was already in RAM. None of that work is necessary.&lt;/p&gt;

&lt;p&gt;Zinc fixes this. It maps the same physical RAM pages into multiple processes written in different languages. Every adapter (Rust, Python, Go, Node.js, Bun, Deno, C++, Java, C#) gets a zero-copy view of identical bytes. A single Rust core is compiled to a shared library (&lt;code&gt;libzinc_core.so&lt;/code&gt; / &lt;code&gt;.dylib&lt;/code&gt;) with a stable 8-function C ABI, and every language adapter calls those same 8 functions through its native FFI mechanism. No logic is reimplemented in adapters. No serialization format is imposed.&lt;/p&gt;
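
&lt;p&gt;The core idea, two handles onto the same bytes with no copy in between, can be sketched with Python's standard-library &lt;code&gt;multiprocessing.shared_memory&lt;/code&gt; module. This is not Zinc's API or implementation: the function name &lt;code&gt;demo_shared_view&lt;/code&gt; is made up for illustration, and both handles live in one process here only to keep the sketch short.&lt;/p&gt;

```python
from multiprocessing import shared_memory


def demo_shared_view(size=1024):
    """Write through one handle and read the same bytes through another."""
    # "Creator" side: allocate a named shared-memory region.
    owner = shared_memory.SharedMemory(create=True, size=size)
    try:
        # "Opener" side: attach to the existing region by its name.
        peer = shared_memory.SharedMemory(name=owner.name)
        try:
            owner.buf[0:4] = b"zinc"       # write through the first mapping
            seen = bytes(peer.buf[0:4])    # read through the second mapping
        finally:
            peer.close()
    finally:
        owner.close()
        owner.unlink()                     # only the creator removes the name
    return seen


if __name__ == "__main__":
    print(demo_shared_view())
```

&lt;p&gt;In a real pipeline the second handle would be opened from a different process, possibly in a different language, using only the region's name.&lt;/p&gt;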

&lt;p&gt;What this looks like in practice:&lt;br&gt;
Python writes a float32 tensor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SharedRegion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pipeline&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;arr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_numpy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# zero-copy numpy view
&lt;/span&gt;&lt;span class="n"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;[:]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model_output&lt;/span&gt;
&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;notify&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Rust reads it with no copy and no deserialization, from the same physical bytes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;SharedRegion&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"pipeline"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="nf"&gt;.wait&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;ptr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="nf"&gt;.as_ptr&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="nb"&gt;f32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;// read directly from shared memory&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Go does the same:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;zinc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"pipeline"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Wait&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Bytes&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c"&gt;// []byte backed by mmap, not a copy&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Numbers (measured, not theoretical):&lt;br&gt;
Notify/wait roundtrip (the synchronization overhead, not data access):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Linux (futex): P50 &amp;lt; 1µs, P99 &amp;lt; 3µs&lt;/li&gt;
&lt;li&gt;macOS (spin loop): P50 ~2µs, P99 ~10µs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Throughput for a 100MB region:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Zinc: ~60 GB/s (memory-bandwidth-bound, it's just a memory read)&lt;/li&gt;
&lt;li&gt;Unix socket: ~1.4 GB/s (kernel copy-bound)&lt;/li&gt;
&lt;li&gt;gRPC/protobuf: serialization alone is 10–30ms on top of that&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The hot path (&lt;code&gt;zinc_ptr&lt;/code&gt;, &lt;code&gt;zinc_capacity&lt;/code&gt;, &lt;code&gt;zinc_notify&lt;/code&gt;, &lt;code&gt;zinc_wait&lt;/code&gt;) is allocation-free.&lt;/p&gt;

&lt;p&gt;For more detail, including throughput examples recorded on an M2 Pro machine, see &lt;a href="https://mine-27913f41.mintlify.app/performance/benchmarks" rel="noopener noreferrer"&gt;https://mine-27913f41.mintlify.app/performance/benchmarks&lt;/a&gt;.&lt;/p&gt;
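
&lt;p&gt;One way to picture the notify/wait handshake is a sequence counter stored in the region itself, which the waiter polls until it changes or a timeout expires. This is an assumption about the general shape of a spin-loop backend, not Zinc's actual implementation: the counter layout and the helper names &lt;code&gt;notify&lt;/code&gt;, &lt;code&gt;wait_for_change&lt;/code&gt; and &lt;code&gt;demo_handshake&lt;/code&gt; are hypothetical, and a real implementation would use atomic operations (and a futex on Linux) rather than plain reads and writes.&lt;/p&gt;

```python
import struct
import time
from multiprocessing import shared_memory

# Hypothetical layout: an unsigned 64-bit sequence counter at offset 0.
SEQ = struct.Struct("=Q")


def notify(buf):
    """Bump the counter so that waiters observe a change (not atomic)."""
    (seq,) = SEQ.unpack_from(buf, 0)
    SEQ.pack_into(buf, 0, seq + 1)


def wait_for_change(buf, last_seen, timeout_ms):
    """Poll until the counter differs from last_seen.

    Returns the new counter value, or None if the timeout expires first.
    """
    deadline = time.monotonic() + timeout_ms / 1000.0
    while True:
        (seq,) = SEQ.unpack_from(buf, 0)
        if seq != last_seen:
            return seq
        if time.monotonic() >= deadline:
            return None
        time.sleep(0)  # yield to other threads between polls


def demo_handshake():
    region = shared_memory.SharedMemory(create=True, size=64)
    try:
        SEQ.pack_into(region.buf, 0, 0)                  # start the counter at zero
        timed_out = wait_for_change(region.buf, 0, 10)   # nothing notified yet
        notify(region.buf)
        observed = wait_for_change(region.buf, 0, 10)    # sees the bump
    finally:
        region.close()
        region.unlink()
    return timed_out, observed


if __name__ == "__main__":
    print(demo_handshake())
```

&lt;p&gt;A polling loop like this burns CPU while it waits, which is the usual trade-off against a kernel-assisted wait such as a futex.&lt;/p&gt;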

&lt;p&gt;What's done:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rust core with stable C ABI (8 functions, cbindgen-generated header)&lt;/li&gt;
&lt;li&gt;Linux (futex) and macOS backends, both fully working&lt;/li&gt;
&lt;li&gt;All 9 adapters: Rust (native), Python (cffi + numpy), Go (cgo), Node.js (napi-rs), Bun (bun:ffi), Deno (Deno.dlopen), C++ (header-only RAII), Java (JNA), C# (P/Invoke)&lt;/li&gt;
&lt;li&gt;Ownership model enforced at the type level: creator owns, openers can't unlink&lt;/li&gt;
&lt;li&gt;Docs site: &lt;a href="https://mine-27913f41.mintlify.app" rel="noopener noreferrer"&gt;https://mine-27913f41.mintlify.app&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/Mohammed-Aman-Khan/zinc" rel="noopener noreferrer"&gt;https://github.com/Mohammed-Aman-Khan/zinc&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
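
&lt;p&gt;The "creator owns, openers can't unlink" rule can be sketched as two handle types, where only the creating type exposes an &lt;code&gt;unlink()&lt;/code&gt; method. The class names &lt;code&gt;OwnedRegion&lt;/code&gt; and &lt;code&gt;OpenedRegion&lt;/code&gt; are hypothetical and the sketch is built on Python's standard-library &lt;code&gt;multiprocessing.shared_memory&lt;/code&gt; rather than on Zinc. In Python the restriction is a matter of API shape, not a compile-time guarantee as it could be in Rust.&lt;/p&gt;

```python
from multiprocessing import shared_memory


class OpenedRegion:
    """Attached handle: can read, write and close, but exposes no unlink()."""

    def __init__(self, name):
        self._shm = shared_memory.SharedMemory(name=name)

    @property
    def buf(self):
        return self._shm.buf

    def close(self):
        self._shm.close()


class OwnedRegion:
    """Creating handle: the only type that can remove the region's name."""

    def __init__(self, size):
        self._shm = shared_memory.SharedMemory(create=True, size=size)

    @property
    def name(self):
        return self._shm.name

    @property
    def buf(self):
        return self._shm.buf

    def close(self):
        self._shm.close()

    def unlink(self):
        self._shm.unlink()


def demo_ownership():
    owner = OwnedRegion(64)
    try:
        peer = OpenedRegion(owner.name)
        try:
            peer.buf[0] = 7          # the opener may still write
            value = owner.buf[0]     # the creator sees the same byte
        finally:
            peer.close()
    finally:
        owner.close()
        owner.unlink()               # cleanup is the creator's job
    return value, hasattr(peer, "unlink")


if __name__ == "__main__":
    print(demo_ownership())
```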

&lt;p&gt;What's not done yet:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Windows (the platform backend is a stub; Zinc is a POSIX-first library)&lt;/li&gt;
&lt;li&gt;Package registry publishing (npm, PyPI, crates.io, pkg.go.dev, NuGet, Maven)&lt;/li&gt;
&lt;li&gt;CI for all 9 adapters&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What I'm actually looking for feedback on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API surface - Does the create / open / notify / wait model make sense? Anything you'd expect that's missing?&lt;/li&gt;
&lt;li&gt;Adapter ergonomics - If you work in Python, Go, Java, or C#, does the adapter feel idiomatic or does it feel like a thin C wrapper you're fighting?&lt;/li&gt;
&lt;li&gt;The Windows gap - Is this a dealbreaker for your use case? Honest question.&lt;/li&gt;
&lt;li&gt;Anything I'm obviously missing - competing libraries I should benchmark against, edge cases in the ownership model, platform behaviors I haven't accounted for.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not looking for hype. Looking for the things that would make you not use this or tell a colleague to avoid it. Those are more useful right now than upvotes.&lt;/p&gt;

</description>
      <category>performance</category>
      <category>rust</category>
      <category>showdev</category>
      <category>systems</category>
    </item>
  </channel>
</rss>
