<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Anshuman</title>
    <description>The latest articles on DEV Community by Anshuman (@ansh_xh07).</description>
    <link>https://dev.to/ansh_xh07</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3971645%2Fa0de8532-e379-46ec-b169-2347ec9342c9.jpg</url>
      <title>DEV Community: Anshuman</title>
      <link>https://dev.to/ansh_xh07</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ansh_xh07"/>
    <language>en</language>
    <item>
      <title>How I Avoided C-Linker Hell by Decoupling Rust &amp; Python for an AI Memory Daemon</title>
      <dc:creator>Anshuman</dc:creator>
      <pubDate>Sat, 06 Jun 2026 18:18:36 +0000</pubDate>
      <link>https://dev.to/ansh_xh07/how-i-avoided-c-linker-hell-by-decoupling-rust-python-for-an-ai-memory-daemon-3e4d</link>
      <guid>https://dev.to/ansh_xh07/how-i-avoided-c-linker-hell-by-decoupling-rust-python-for-an-ai-memory-daemon-3e4d</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4502oc4hh4xfpk442u83.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4502oc4hh4xfpk442u83.png" alt="A 3D PCA phase space graph showing data points clustering into structural peaks, representing continuous hyperdimensional memory states." width="800" height="640"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When I set out to build &lt;strong&gt;&lt;a href="https://github.com/CodNoob100/null-drift" rel="noopener noreferrer"&gt;null-drift&lt;/a&gt;&lt;/strong&gt; - a lightweight, local memory daemon for AI agents - my original goal was the holy grail of modern backend development: a single, blazingly fast Rust binary. &lt;/p&gt;

&lt;p&gt;I wanted a self-contained application that could handle both machine learning inference (generating text embeddings) and the heavy lifting of a highly concurrent state machine. &lt;/p&gt;

&lt;p&gt;It sounded great on paper. In practice, I ran straight into a brick wall.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem: C-Linker Hell
&lt;/h3&gt;

&lt;p&gt;To do machine learning inference in Rust, you generally rely on bindings to C/C++ libraries. I chose the &lt;code&gt;ort&lt;/code&gt; crate (ONNX Runtime) to handle the embedding models. &lt;/p&gt;

&lt;p&gt;However, trying to cross-compile this setup for Windows immediately resulted in absolute chaos. I encountered endless MSVC linker errors caused by conflicts between static (&lt;code&gt;/MT&lt;/code&gt;) and dynamic (&lt;code&gt;/MD&lt;/code&gt;) C-runtimes. Even when I managed to get it compiling, I ran into bizarre C-runtime deadlocks.&lt;/p&gt;

&lt;p&gt;I realized I was spending more time fighting the C/C++ build toolchain than actually writing my memory daemon.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution: Decoupled Microservices
&lt;/h3&gt;

&lt;p&gt;I decided to stop fighting the ecosystem and instead play to the strengths of different languages. I decoupled the project into a two-container microservice architecture:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Python (FastAPI):&lt;/strong&gt; Python is the undisputed king of ML tooling. Setting up a FastAPI service to handle sentence-transformer embeddings was trivial, and the ML toolchain "just works" across all operating systems without any linker headaches.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rust (Axum/Tokio):&lt;/strong&gt; Rust took over the job it was born to do: managing a highly contended, continuous 10k-dimensional state array.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By splitting the workload, the Python service acts as a pure, stateless compute node, while Rust handles the high-concurrency memory indexing and disk synchronization.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scaling Concurrency in Rust
&lt;/h3&gt;

&lt;p&gt;In the Rust daemon, the core data structure is constantly being queried and updated. To handle this, I wrapped the daemon's state in an asynchronous read-write lock:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;std&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;sync&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;Arc&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;tokio&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;sync&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;RwLock&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;SharedState&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Arc&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;RwLock&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;DaemonState&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



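&lt;p&gt;We specifically chose &lt;code&gt;tokio::sync::RwLock&lt;/code&gt; over the standard library lock because waiting on it yields to the async runtime instead of blocking a worker thread, and its guards can be held across &lt;code&gt;.await&lt;/code&gt; points. Both locks allow many concurrent readers while writers occasionally mutate the state. But there's a hidden security benefit to this choice as well.&lt;/p&gt;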

&lt;h3&gt;
  
  
  Securing the State Machine (Why Tokio Locks Matter)
&lt;/h3&gt;

&lt;p&gt;If you use standard library locks (&lt;code&gt;std::sync::RwLock&lt;/code&gt;) in Rust, you have to deal with &lt;strong&gt;lock poisoning&lt;/strong&gt;. If a thread panics (crashes) while holding the write guard, Rust marks the lock as "poisoned" to keep other threads from reading potentially corrupted data. From then on, every &lt;code&gt;read()&lt;/code&gt; and &lt;code&gt;write()&lt;/code&gt; call returns an error unless the code explicitly recovers from it.&lt;/p&gt;

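&lt;p&gt;In a web server, this is a serious vulnerability. If an attacker crafts a malicious request that triggers a panic while the write guard is held (a math panic such as an integer division by zero, or an out-of-bounds index), the lock poisons. If the handlers simply call &lt;code&gt;.unwrap()&lt;/code&gt; on the lock result, &lt;strong&gt;every subsequent request that touches the state fails as well&lt;/strong&gt;. It's a trivial Denial of Service (DoS) attack. An Out-of-Memory (OOM) condition is a separate failure mode: by default Rust aborts the whole process when an allocation fails instead of panicking, so it takes the daemon down without poisoning anything.&lt;/p&gt;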
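
&lt;p&gt;The poisoning behavior described above can be reproduced with the standard library alone. The following is a minimal sketch, not code from &lt;code&gt;null-drift&lt;/code&gt;: the function name &lt;code&gt;read_fails_after_panic&lt;/code&gt; and the single-counter state are hypothetical stand-ins chosen for illustration.&lt;/p&gt;

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Spawns a thread that panics while holding a guard on a shared counter,
/// then reports whether a later read attempt fails with a poison error.
fn read_fails_after_panic(panic_in_writer: bool) -> bool {
    // Hypothetical stand-in for the daemon state: a single counter.
    let state = Arc::new(RwLock::new(0u64));
    let worker_state = state.clone();

    let worker = thread::spawn(move || {
        if panic_in_writer {
            let mut guard = worker_state.write().unwrap();
            *guard += 1;
            panic!("simulated bug while holding the write guard");
        } else {
            let _guard = worker_state.read().unwrap();
            panic!("simulated bug while holding a read guard");
        }
    });

    // The panic is contained in the worker thread and surfaces here as Err.
    let joined = worker.join();
    assert!(joined.is_err());

    // Bind the result to a local so the temporary lock result is dropped
    // before `state` goes out of scope.
    let read_fails = state.read().is_err();
    read_fails
}
```

&lt;p&gt;Calling &lt;code&gt;read_fails_after_panic(true)&lt;/code&gt; returns &lt;code&gt;true&lt;/code&gt;, while &lt;code&gt;read_fails_after_panic(false)&lt;/code&gt; returns &lt;code&gt;false&lt;/code&gt;: only a panic under the write guard poisons a &lt;code&gt;std::sync::RwLock&lt;/code&gt;.&lt;/p&gt;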

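&lt;p&gt;&lt;code&gt;tokio::sync::RwLock&lt;/code&gt; explicitly &lt;em&gt;does not&lt;/em&gt; implement lock poisoning. If an Axum task panics, its lock guard is dropped during unwinding and the lock is simply released. A single bad request cannot permanently lock the daemon's memory state! The trade-off is that a writer that panics halfway through an update can leave the state partially modified, which is one more reason to validate inputs before taking the lock.&lt;/p&gt;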

&lt;h3&gt;
  
  
  Defensive Programming for Local Daemons
&lt;/h3&gt;

&lt;p&gt;You might be wondering: &lt;em&gt;"It's a local daemon running on &lt;code&gt;localhost&lt;/code&gt;. Why worry about attackers?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Actually, there are three major reasons we built strict defenses into &lt;code&gt;null-drift&lt;/code&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Untrusted AI Inputs:&lt;/strong&gt; The daemon is designed for AI agents. Agents scrape the web and ingest raw, untrusted data. If an agent blindly dumps malformed data into its memory daemon, we need it to fail gracefully rather than crashing the entire pipeline.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network Exposure:&lt;/strong&gt; By default, the daemon binds to &lt;code&gt;0.0.0.0&lt;/code&gt; as a fallback, meaning it listens on every network interface, not just loopback. Anyone on the same network, such as a public Wi-Fi hotspot, could theoretically send it payloads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Localhost CSRF:&lt;/strong&gt; Even if tightly bound to &lt;code&gt;127.0.0.1&lt;/code&gt;, a malicious website you visit could use JavaScript to execute Cross-Site Request Forgery (CSRF), silently sending &lt;code&gt;POST&lt;/code&gt; requests to &lt;code&gt;localhost:3000&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To counter this, we implemented multiple layers of defense, most of which run before a request ever reaches the lock:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 1. Strict dimensionality validation before linear algebra&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="py"&gt;.embedding&lt;/span&gt;&lt;span class="nf"&gt;.len&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;384&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;DaemonError&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;InvalidDimension&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// 2. Strict body limits to prevent memory exhaustion&lt;/span&gt;
&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;Router&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c1"&gt;// ... routes&lt;/span&gt;
    &lt;span class="nf"&gt;.layer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;DefaultBodyLimit&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="c1"&gt;// 3. Bounded deserialization for state restoration&lt;/span&gt;
&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;bincode_opts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;bincode&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;DefaultOptions&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="nf"&gt;.with_limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;cog_state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;CognitiveState&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bincode_opts&lt;/span&gt;&lt;span class="nf"&gt;.deserialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We use &lt;code&gt;bincode&lt;/code&gt; for extremely fast, compact binary serialization of our massive 10k-dimensional state arrays to disk. By configuring it with &lt;code&gt;with_limit()&lt;/code&gt;, we ensure that a corrupted state file claiming an enormous length fails to deserialize instead of blowing up system RAM on restart.&lt;/p&gt;
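
&lt;p&gt;Coming back to the network exposure point from the list above, one way to narrow the attack surface is to listen on loopback by default and require an explicit opt-in for all interfaces. The sketch below is an assumption about how that could look, not how &lt;code&gt;null-drift&lt;/code&gt; is configured: &lt;code&gt;listen_addr&lt;/code&gt; and its &lt;code&gt;expose_to_network&lt;/code&gt; flag are hypothetical names.&lt;/p&gt;

```rust
use std::net::{Ipv4Addr, SocketAddr};

/// Picks the listen address. Loopback is the default; all interfaces are
/// used only on explicit opt-in. `expose_to_network` is a hypothetical flag.
fn listen_addr(expose_to_network: bool, port: u16) -> SocketAddr {
    let ip = if expose_to_network {
        Ipv4Addr::UNSPECIFIED // 0.0.0.0: reachable from other machines
    } else {
        Ipv4Addr::LOCALHOST // 127.0.0.1: reachable only from this machine
    };
    SocketAddr::from((ip, port))
}
```

&lt;p&gt;With this, &lt;code&gt;listen_addr(false, 3000)&lt;/code&gt; yields &lt;code&gt;127.0.0.1:3000&lt;/code&gt;. Inside a container the service usually still has to listen on &lt;code&gt;0.0.0.0&lt;/code&gt;, with exposure controlled by the published port mapping instead, and a loopback bind does nothing against the localhost CSRF case.&lt;/p&gt;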

&lt;h3&gt;
  
  
  Wrapping Up
&lt;/h3&gt;

&lt;p&gt;Building &lt;code&gt;null-drift&lt;/code&gt; was a great lesson in choosing the right tool for the job. By letting Python handle the ML friction and Rust handle the concurrent state, the architecture became drastically simpler to deploy, compile, and maintain.&lt;/p&gt;

&lt;p&gt;If you want to join the broader discussion, see the original visual phase-space hook, or share this project with other local-AI builders, check out the launch thread on X:&lt;/p&gt;

&lt;p&gt;&lt;iframe class="tweet-embed" id="tweet-2063316920938778694-78" src="https://platform.twitter.com/embed/Tweet.html?id=2063316920938778694"&gt;
&lt;/iframe&gt;




&lt;/p&gt;

&lt;p&gt;If you're interested in checking out the poison-free locking, the multi-threaded state architecture, or the Docker setup, you can find the repository here:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🔗 &lt;a href="https://github.com/CodNoob100/null-drift" rel="noopener noreferrer"&gt;null-drift on GitHub&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let me know what you think in the comments!&lt;/p&gt;

</description>
      <category>rust</category>
      <category>python</category>
      <category>opensource</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
