DEV Community

Mohammed Aman Khan


Zinc – zero-copy shared memory for polyglot stacks. Want your feedback before publishing to npm / PyPI / crates.io / etc.

Here's the problem I kept running into: two processes on the same machine (say, a Python model loader and a Rust inference server) need to share a 100 MB tensor. The default answer is Redis, gRPC, or a Unix socket. All three serialize the data, copy it through kernel space, and deserialize it on the other side. The tensor was already in RAM, so none of that work is necessary.

Zinc fixes this. It maps the same physical RAM pages into multiple processes written in different languages. Every adapter (Rust, Python, Go, Node.js, Bun, Deno, C++, Java, C#) gets a zero-copy view of identical bytes. There's a single Rust core compiled to a shared library (libzinc_core.so / .dylib) with a stable 8-function C ABI. Every language adapter calls those same 8 functions through its native FFI mechanism. No logic is reimplemented in adapters. No serialization format is imposed.
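To make the "same bytes, two views" idea concrete, here is a minimal sketch that uses Python's standard-library multiprocessing.shared_memory instead of Zinc itself. It is not Zinc's implementation, and the segment name and size are made up for the example. Both handles live in one process here only to keep the demo short; in practice the second handle would be opened from another process.

```python
import os
from multiprocessing import shared_memory

# Hypothetical name and size, chosen only for this sketch.
SKETCH_NAME = f"zinc_sketch_{os.getpid()}"
SKETCH_SIZE = 4096


def demo_shared_view() -> bytes:
    # "Creator" side: allocate a named shared memory segment.
    creator = shared_memory.SharedMemory(
        name=SKETCH_NAME, create=True, size=SKETCH_SIZE
    )
    try:
        # "Opener" side: attach to the same segment by name.
        opener = shared_memory.SharedMemory(name=SKETCH_NAME)
        try:
            # Write through the creator's view...
            creator.buf[:5] = b"hello"
            # ...and observe the same bytes through the opener's view.
            # bytes(...) copies here only so the demo can return a value.
            return bytes(opener.buf[:5])
        finally:
            opener.close()
    finally:
        creator.close()
        creator.unlink()  # remove the name once nobody needs it
```

The write never passes through a socket or a serializer: both handles are views over one mapping, which is the property the adapters rely on.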

What this looks like in practice:
Python writes a float32 tensor:

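import numpy as np
# SharedRegion is provided by the Zinc Python adapter

region = SharedRegion.create("pipeline", 64 * 1024 * 1024)  # 64 MiB region
arr = region.as_numpy(dtype=np.float32)  # zero-copy numpy view
arr[:] = model_output  # model_output must match or broadcast to arr's shape
region.notify()  # wake any process blocked in wait()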

Rust reads it — no copy, no deserialization, same physical bytes:

let region = SharedRegion::open("pipeline")?;
region.wait(1000)?;
let ptr = region.as_ptr() as *const f32;
// read directly from shared memory

Go does the same:

region, _ := zinc.Open("pipeline")
region.Wait(1000)
data := region.Bytes() // []byte backed by mmap, not a copy
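The notify/wait handshake in the snippets above can be approximated with a generation counter stored at the start of the shared buffer. This is a rough sketch under my own assumptions, not Zinc's mechanism: SeqSignal is a hypothetical name, the counter layout is invented, it assumes a single writer, and it polls where Zinc uses a futex on Linux and a spin loop on macOS.

```python
import struct
import threading
import time
from typing import Optional


class SeqSignal:
    """Hypothetical generation counter kept in the first 8 bytes of a buffer."""

    def __init__(self, buf) -> None:
        self._buf = buf  # any writable buffer, e.g. a shared memory view

    def current(self) -> int:
        return struct.unpack_from("<Q", self._buf, 0)[0]

    def notify(self) -> None:
        # Single-writer assumption: this read-modify-write is not atomic.
        struct.pack_into("<Q", self._buf, 0, self.current() + 1)

    def wait(self, last_seen: int, timeout_ms: int) -> Optional[int]:
        """Return the new counter value, or None if the timeout expires."""
        deadline = time.monotonic() + timeout_ms / 1000.0
        while True:
            seq = self.current()
            if seq != last_seen:
                return seq
            if time.monotonic() >= deadline:
                return None
            time.sleep(0)  # yield instead of burning a full core


def demo_roundtrip() -> bool:
    buf = bytearray(8 + 16)  # 8-byte counter followed by a small payload area
    signal = SeqSignal(buf)

    def producer() -> None:
        buf[8:12] = b"data"  # write the payload first...
        signal.notify()      # ...then publish it

    worker = threading.Thread(target=producer)
    worker.start()
    seen = signal.wait(last_seen=0, timeout_ms=1000)
    worker.join()
    return seen == 1 and bytes(buf[8:12]) == b"data"
```

Comparing against the last seen counter value, rather than waiting for an edge, means a notify that lands before the reader starts waiting is not lost.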

Numbers (measured, not theoretical):
Notify/wait roundtrip (the synchronization overhead, not data access):

  • Linux (futex): P50 < 1µs, P99 < 3µs
  • macOS (spin loop): P50 ~2µs, P99 ~10µs

Throughput for a 100MB region:

  • Zinc: ~60 GB/s (memory-bandwidth-bound, it's just a memory read)
  • Unix socket: ~1.4 GB/s (kernel copy-bound)
  • gRPC/protobuf: serialization alone is 10–30ms on top of that

The hot path (zinc_ptr, zinc_capacity, zinc_notify, zinc_wait) is allocation-free.

For more detail, see https://mine-27913f41.mintlify.app/performance/benchmarks, which also includes the throughput examples recorded on an M2 Pro machine.
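If you want to reproduce P50/P99 figures like the ones above on your own machine, a latency harness can be as small as the sketch below. This is not the benchmark code from the repo. measure_ns and percentile are hypothetical helpers, and the percentile uses the nearest-rank definition, which may differ from what the published numbers use.

```python
import time
from typing import Callable, List


def measure_ns(fn: Callable[[], None], iterations: int) -> List[int]:
    """Time each call to fn in nanoseconds and return the sorted samples."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn()
        samples.append(time.perf_counter_ns() - start)
    samples.sort()
    return samples


def percentile(sorted_samples: List[int], p: int) -> int:
    """Nearest-rank percentile (integer p in 1..100) of a sorted list."""
    if not sorted_samples:
        raise ValueError("no samples")
    n = len(sorted_samples)
    rank = max(1, (p * n + 99) // 100)  # ceil(p * n / 100) in integer math
    return sorted_samples[rank - 1]
```

In use, fn would be one notify plus one wait against a peer process, and you would report percentile(samples, 50) and percentile(samples, 99). Timer overhead in Python is itself on the order of the sub-microsecond numbers quoted above, so a serious measurement belongs in Rust or C.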

What's done:

  • Rust core with stable C ABI (8 functions, cbindgen-generated header)
  • Linux (futex) and macOS backends, both fully working
  • All 9 adapters: Rust (native), Python (cffi + numpy), Go (cgo), Node.js (napi-rs), Bun (bun:ffi), Deno (Deno.dlopen), C++ (header-only RAII), Java (JNA), C# (P/Invoke)
  • Ownership model enforced at the type level: creator owns, openers can't unlink
  • Docs site: https://mine-27913f41.mintlify.app
  • Repo: https://github.com/Mohammed-Aman-Khan/zinc
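The "creator owns, openers can't unlink" rule from the list above can be illustrated with two handle types, where only the creating type exposes unlink. This sketch again uses Python's standard-library shared memory and not Zinc. OwnedRegion and BorrowedRegion are hypothetical names, and Python only approximates what a statically typed language would enforce at compile time.

```python
import os
from multiprocessing import shared_memory


class BorrowedRegion:
    """Handle from opening an existing region: map and close, never unlink."""

    def __init__(self, name: str) -> None:
        self._shm = shared_memory.SharedMemory(name=name)

    @property
    def buf(self) -> memoryview:
        return self._shm.buf

    def close(self) -> None:
        self._shm.close()


class OwnedRegion:
    """Handle from creating a region: the only type that can unlink it."""

    def __init__(self, name: str, size: int) -> None:
        self._shm = shared_memory.SharedMemory(name=name, create=True, size=size)

    @property
    def buf(self) -> memoryview:
        return self._shm.buf

    def close(self) -> None:
        self._shm.close()

    def unlink(self) -> None:
        self._shm.unlink()


def demo_ownership():
    name = f"zinc_own_{os.getpid()}"  # hypothetical region name
    owner = OwnedRegion(name, 64)
    reader = BorrowedRegion(name)
    owner.buf[0] = 42
    value = reader.buf[0]  # same byte, seen through the borrowed handle
    reader.close()
    owner.close()
    owner.unlink()  # only the owner has this method
    return value, hasattr(reader, "unlink")
```

Splitting the handle into two types means a consumer cannot remove the region by accident, because the method simply does not exist on what it holds.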

What's not done yet:

  • Windows (the platform backend is a stub; Zinc is a POSIX-first library)
  • Package registry publishing (npm, PyPI, crates.io, pkg.go.dev, NuGet, Maven)
  • CI for all 9 adapters

What I'm actually looking for feedback on:

  • API surface - Does the create / open / notify / wait model make sense? Anything you'd expect that's missing?
  • Adapter ergonomics - If you work in Python, Go, Java, or C#, does the adapter feel idiomatic or does it feel like a thin C wrapper you're fighting?
  • The Windows gap - Is this a dealbreaker for your use case? Honest question.
  • Anything I'm obviously missing - competing libraries I should benchmark against, edge cases in the ownership model, platform behaviors I haven't accounted for.

Not looking for hype. Looking for the things that would make you not use this or tell a colleague to avoid it. Those are more useful right now than upvotes.
