What I learned building Hyerix — a Tauri v2 + async-nats desktop app for NATS infrastructure. Covers IPC backpressure, cross-platform signing pain, and the local-first AI architecture.
I built Hyerix — a desktop app for managing NATS infrastructure (JetStream streams, KV buckets, Object Store, consumers, cluster topology). It's Tauri v2 + Rust + React/TypeScript, talking to clusters via async-nats. There's also a natural-language query layer over live cluster state.
This is the technical retrospective. Not a pitch — if you're building a Tauri app or a NATS tool, hopefully some of these notes save you the time I spent figuring them out.
Launching today on Product Hunt — feedback welcome: producthunt.com/products/hyerix
Why a desktop app
The NATS CLI is excellent for one-shot scripting. It's painful for "I have 80 consumers across 12 streams and I need to find the one that's stuck." Every NATS operator I know has a private library of nats consumer info | jq incantations they pull out at 2am.
The visual diff — what changed in the last hour, and which subject filter is responsible — is the gap. The CLI doesn't give you that without a lot of plumbing.
Why Tauri v2 (not Electron)
Three reasons:
- Binary size. Release builds come in around 12MB on macOS. Electron's baseline is ~150MB. For a tool people run alongside their normal dev environment, that matters.
- Rust on the backend. I wanted async-nats, not a Node wrapper around it. Rust gives me the same client the NATS team itself uses, no impedance mismatch.
- Webview is the right primitive for dense UI. Trees, time-series charts, virtualized tables. egui or iced would have been faster to bootstrap but slower to land the visual density. I needed Recharts and react-virtual without rebuilding them.
async-nats notes
The official Rust client is solid. The JetStream API is well-typed and the streaming consumer iterators map cleanly to Tokio's primitives.
The thing that bit me: pull subscription reconnect semantics are subtle. On certain disconnect classes you need to explicitly recreate the pull subscription rather than relying on the underlying connection's auto-reconnect. The docs are thin on which classes need this. After a few painful field reports, I ended up with this pattern:
use futures::StreamExt; // for .next() on the subscription stream

let mut pull_sub = stream.pull_subscriber("durable").await?;
loop {
    match pull_sub.next().await {
        // Happy path: hand the message to the app.
        Some(Ok(msg)) => process(msg).await,
        // Connection-closed errors are the class auto-reconnect doesn't
        // cover: recreate the pull subscription explicitly.
        Some(Err(e)) if e.is_connection_closed() => {
            pull_sub = stream.pull_subscriber("durable").await?;
            continue;
        }
        // Any other error is fatal for this loop.
        Some(Err(e)) => return Err(e.into()),
        // Stream ended cleanly.
        None => break,
    }
}
The is_connection_closed() check is what I wish I'd known earlier.
Tauri v2 IPC: streaming live data needs backpressure
Tauri v2's channel API is a real upgrade over v1's emit/listen for live data like consumer lag samples or message rate windows.
But backpressure is your problem. If the UI is slow to drain a channel and the producer keeps pushing, you'll either OOM the renderer or drop frames. I landed on a small ring buffer per subscription on the Rust side:
use std::collections::VecDeque;

struct LagBuffer {
    samples: VecDeque<LagSample>,
    capacity: usize,
}

impl LagBuffer {
    fn new(capacity: usize) -> Self {
        Self { samples: VecDeque::with_capacity(capacity), capacity }
    }

    // Lossy push: at capacity, evict the oldest sample rather than grow
    // or block. The producer never waits on the UI.
    fn push(&mut self, sample: LagSample) {
        if self.samples.len() == self.capacity {
            self.samples.pop_front();
        }
        self.samples.push_back(sample);
    }
}
Lossy at the head, never blocks the producer. Worst case the chart is a few frames behind reality. For ops tooling that's fine.
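For context, here's what the drain side can look like with Tauri v2's tauri::ipc::Channel. A minimal sketch; the command name, payload fields, and tick interval are illustrative, not Hyerix's actual API:

use std::time::Duration;
use tauri::ipc::Channel;

#[derive(Clone, serde::Serialize)]
struct LagSample {
    consumer: String,
    pending: u64,
}

// Illustrative command: stream lag samples to the webview on a fixed tick,
// sending whatever the ring buffer currently holds instead of forwarding
// every sample as it arrives.
#[tauri::command]
async fn subscribe_lag(on_sample: Channel<LagSample>) -> Result<(), String> {
    loop {
        // In the real app this would drain the per-subscription LagBuffer.
        let sample = LagSample { consumer: "durable".into(), pending: 0 };
        on_sample.send(sample).map_err(|e| e.to_string())?;
        tokio::time::sleep(Duration::from_millis(250)).await;
    }
}

The tick is the point: the channel carries a bounded number of payloads per second no matter how fast samples arrive on the NATS side.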
tokio::sync::watch over broadcast
I started with tokio::sync::broadcast for fanning cluster updates to UI subscribers. Switched to tokio::sync::watch for "latest state" channels — way fewer footguns.
broadcast is right when every subscriber must see every value (event-log semantics). watch is right when subscribers only care about the most recent value (state-replication semantics). Most of the UI is state replication. Picking the wrong primitive cost me a week of "why is this consumer lag updating laggily" debugging.
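A minimal sketch of the watch side, with a hypothetical ClusterSnapshot standing in for the real cluster-state model:

use std::time::Duration;
use tokio::sync::watch;

#[derive(Clone, Debug, Default)]
struct ClusterSnapshot {
    consumer_count: usize, // stand-in for the real state model
}

#[tokio::main]
async fn main() {
    let (tx, mut rx) = watch::channel(ClusterSnapshot::default());

    // Producer: each send overwrites the previous value. A slow subscriber
    // skips intermediate snapshots instead of building a backlog.
    let producer = tokio::spawn(async move {
        for n in 1..=3 {
            tx.send(ClusterSnapshot { consumer_count: n }).ok();
            tokio::time::sleep(Duration::from_millis(100)).await;
        }
        // tx drops here, closing the channel and ending the reader loop.
    });

    // Subscriber: wakes on change, reads only the most recent value.
    while rx.changed().await.is_ok() {
        println!("latest: {:?}", *rx.borrow_and_update());
    }
    producer.await.unwrap();
}

The producer never blocks on a slow subscriber, and readers jump straight to the newest snapshot. That's state-replication semantics in code.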
Cross-platform signing tax
This was the single biggest "looks easy in docs, isn't" surprise. Budget two weeks across all three OSes.
Linux
- webkit2gtk 4.0 vs 4.1 split. Tauri v2 needs 4.1; older Ubuntu LTS shipped 4.0. Make the .deb pin the right runtime dep.
- AppImage signing tooling is sparse. I ship a detached .sig for verification.
- RPM signing: Tauri's rpm-rs produces signatures that newer rpm + rpm-sequoia (Ubuntu 22.04+) reject as malformed OpenPGP. Workaround: don't sign in Tauri. Re-sign in CI inside an AlmaLinux 8 container using rpmsign, which delegates to gpg and produces spec-conformant output. Took several iterations to figure out — same root cause that broke goreleaser's RPM signing in 2.5.1.
macOS
- Notarization is solved-but-slow. xcrun notarytool submit --wait takes 1-5 min per artifact. Plan your CI matrix with that in mind.
- Tauri v2 only notarizes the inner .app, not the DMG itself. If you want offline Gatekeeper to work, submit the DMG separately and staple post-build, then re-upload to whatever distribution channel you use.
- The DMG ships with an embedded SLA license file that hangs hdiutil attach indefinitely in CI. Verifying the inner .app instead of mounting the DMG sidesteps it.
Windows
- Code signing is the usual nightmare. Azure Trusted Signing was the cheapest path that didn't require a year of EV-cert wait.
- ARM64 builds work fine on windows-latest runners as long as the Rust toolchain has the target installed.
Local-first AI architecture
The natural-language query layer was the part I was most uncertain about. Cluster state — even structural metadata like consumer names and KV bucket layouts — is sensitive. "We send your whole cluster to OpenAI" is not an acceptable answer.
The architecture:
- The Rust backend already maintains a model of cluster state (streams, consumers, KV buckets, recent metrics) for the UI.
- When a user asks "which consumers have growing pending counts in the orders stream?", the LLM receives a structured summary of relevant cluster state — never an unbounded query against the whole cluster.
- The LLM's job is to translate the question into a sequence of API calls Hyerix already supports. It returns a query plan, not an answer. The plan executes locally against cluster state.
- The LLM never sees message bodies, KV values, or anything in the data path. Structural metadata + numeric metrics only.
- Provider is configurable (OpenAI by default). Off by default.
The architectural commitment: the LLM sees a summary, not the cluster. That boundary is what customers actually care about.
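To make that boundary concrete, here's a hedged sketch of what a query-plan type can look like. The step names and JSON shape are invented for illustration; Hyerix's actual plan format is not shown here:

use serde::Deserialize;

// The LLM emits JSON that must deserialize into one of these known steps.
// The enum is the entire surface the model can drive: there is no
// "run arbitrary query" escape hatch, and no step carries message bodies.
#[derive(Deserialize, Debug)]
#[serde(tag = "op", rename_all = "snake_case")]
enum PlanStep {
    ListConsumers { stream: String },
    ConsumerInfo { stream: String, consumer: String },
    FilterPendingGrowth { window_secs: u64 },
}

#[derive(Deserialize, Debug)]
struct QueryPlan {
    steps: Vec<PlanStep>,
}

fn parse_plan(llm_output: &str) -> Result<QueryPlan, serde_json::Error> {
    // Anything that doesn't match a known step is rejected up front.
    serde_json::from_str(llm_output)
}

fn main() {
    let raw = r#"{"steps":[{"op":"list_consumers","stream":"orders"}]}"#;
    println!("{:?}", parse_plan(raw).unwrap());
}

Execution then walks the parsed steps against local cluster state, so a malformed or overreaching plan fails at the parse boundary, not somewhere inside the cluster.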
Honest tradeoffs
- Local-first means no team-shared dashboards. Each engineer runs their own copy. Hosted/multiplayer mode is the most-requested feature on the roadmap.
- Per-machine licensing. Fingerprint changes (laptop swap) require re-activation.
- The AI layer sends cluster schema and metrics to whichever LLM provider you configure. If your security policy disallows that, turn it off — the rest of the app still works.
What I'd do differently
- Start with tokio::sync::watch for state channels instead of working backward from broadcast.
- Prototype cross-platform signing in week 1, not week 12.
- Wrap the pull-subscription reconnect logic in a generic helper from day one — I had three copies of that loop before I noticed. A sketch of that helper follows this list.
- Make the LLM provider boundary visible in the UI from the start. The AI layer was initially a hidden default; customers wanted explicit "this query is going to provider X" telemetry.
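Here's the helper shape I ended up wanting, sketched under two assumptions: the same pull_subscriber / is_connection_closed calls as the loop earlier, and a synchronous message handler for brevity. The ConnectionClosed trait is scaffolding for the sketch, not an async-nats API:

use futures::{Stream, StreamExt};
use std::future::Future;

// Scaffolding: lets the helper ask whether an error is the
// "recreate the subscription" kind. Implement it for your error type.
trait ConnectionClosed {
    fn is_connection_closed(&self) -> bool;
}

// Generic resubscribe loop: `subscribe` builds (or rebuilds) the pull
// subscription, `handle` consumes each message.
async fn with_resubscribe<S, F, Fut, Msg, E>(
    mut subscribe: F,
    mut handle: impl FnMut(Msg),
) -> Result<(), E>
where
    F: FnMut() -> Fut,
    Fut: Future<Output = Result<S, E>>,
    S: Stream<Item = Result<Msg, E>> + Unpin,
    E: ConnectionClosed,
{
    let mut sub = subscribe().await?;
    loop {
        match sub.next().await {
            Some(Ok(msg)) => handle(msg),
            // The one case auto-reconnect doesn't cover: rebuild explicitly.
            Some(Err(e)) if e.is_connection_closed() => sub = subscribe().await?,
            Some(Err(e)) => return Err(e),
            None => return Ok(()),
        }
    }
}

Called as with_resubscribe(|| stream.pull_subscriber("durable"), |msg| { /* ... */ }), the reconnect policy lives in exactly one place instead of three copied loops.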
Try it
If you run NATS in production, there's a 14-day trial: hyerix.ai.
If you don't run NATS but want to play with the stack: github.com/hyerix/hyerix-demo-cluster is a docker-compose 3-node JetStream cluster with synthetic activity. MIT, no telemetry. Useful as a fixture for testing client libraries or monitoring tooling regardless of whether you care about Hyerix.
If any of these Tauri / async-nats notes save you a week, drop a comment — curious what you're building.