<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Guocheng Song</title>
    <description>The latest articles on DEV Community by Guocheng Song (@feichai0017).</description>
    <link>https://dev.to/feichai0017</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3892636%2F5cbb5b15-055c-4701-8853-ed71ca6044e3.jpeg</url>
      <title>DEV Community: Guocheng Song</title>
      <link>https://dev.to/feichai0017</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/feichai0017"/>
    <language>en</language>
    <item>
      <title>Why We Split meta/root and coordinator in NoKV</title>
      <dc:creator>Guocheng Song</dc:creator>
      <pubDate>Wed, 22 Apr 2026 13:47:30 +0000</pubDate>
      <link>https://dev.to/feichai0017/why-we-split-metaroot-and-coordinator-in-nokv-4alc</link>
      <guid>https://dev.to/feichai0017/why-we-split-metaroot-and-coordinator-in-nokv-4alc</guid>
      <description>&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Repo:&lt;/strong&gt; &lt;a href="https://github.com/feichai0017/NoKV" rel="noopener noreferrer"&gt;https://github.com/feichai0017/NoKV&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interactive demo:&lt;/strong&gt; &lt;a href="https://demo.eric-sgc.cafe/" rel="noopener noreferrer"&gt;https://demo.eric-sgc.cafe/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;When people first look at a distributed KV system, one of the most natural assumptions is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“There should be one control-plane service that owns the cluster metadata.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That intuition is understandable. If you’ve looked at systems like TiKV, the first mental model you often get is a &lt;strong&gt;PD-style&lt;/strong&gt; (Placement Driver) component: routing, timestamps, heartbeats, scheduling decisions, and cluster topology, all gathered around one control-plane authority.&lt;/p&gt;

&lt;p&gt;We started with a similar intuition.&lt;/p&gt;

&lt;p&gt;But as NoKV grew, that model started to feel too coarse.&lt;/p&gt;

&lt;p&gt;The problem was not whether we wanted a control plane. We absolutely did. The problem was whether the &lt;strong&gt;durable truth&lt;/strong&gt; of the distributed system should live inside the same process that answers requests, serves views, and reacts to runtime events.&lt;/p&gt;

&lt;p&gt;Our answer became: &lt;strong&gt;no&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That is why NoKV ended up with a deliberate split between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;meta/root&lt;/code&gt;: the rooted truth kernel&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;coordinator&lt;/code&gt;: the control-plane service and rebuildable runtime view&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once that split became explicit, many other design choices became cleaner.&lt;/p&gt;




&lt;h2&gt;
  
  
  The core idea
&lt;/h2&gt;

&lt;p&gt;In NoKV, the “brain” of the distributed system is &lt;strong&gt;not&lt;/strong&gt; the coordinator.&lt;/p&gt;

&lt;p&gt;The durable metadata truth lives in &lt;code&gt;meta/root&lt;/code&gt;, which is implemented as a &lt;strong&gt;typed, append-only committed log plus compact applied state&lt;/strong&gt;. Coordinator lease changes, allocator fences, region lifecycle, pending peer/range changes: these are not “just some in-memory fields inside the control plane”. They are &lt;strong&gt;rooted, replicated, and auditable metadata truth&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The coordinator sits &lt;strong&gt;above&lt;/strong&gt; that truth.&lt;/p&gt;

&lt;p&gt;It is a &lt;strong&gt;service + view&lt;/strong&gt;, not the ultimate owner of metadata persistence.&lt;/p&gt;

&lt;p&gt;That distinction matters, because the moment you let the control plane also be the sole durable metadata owner, you couple together several concerns that have very different failure and evolution properties:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;serving RPCs&lt;/li&gt;
&lt;li&gt;maintaining routing views&lt;/li&gt;
&lt;li&gt;lease competition&lt;/li&gt;
&lt;li&gt;allocator windows&lt;/li&gt;
&lt;li&gt;scheduling logic&lt;/li&gt;
&lt;li&gt;metadata durability&lt;/li&gt;
&lt;li&gt;metadata replication&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We wanted those boundaries to be explicit instead of implicit.&lt;/p&gt;
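
&lt;p&gt;To make the “typed, append-only committed log plus compact applied state” idea concrete, here is a minimal sketch in Python. It is not NoKV code: the event types (&lt;code&gt;LeaseGranted&lt;/code&gt;, &lt;code&gt;RegionCreated&lt;/code&gt;, &lt;code&gt;RegionRemoved&lt;/code&gt;) and the &lt;code&gt;RootLog&lt;/code&gt; and &lt;code&gt;AppliedState&lt;/code&gt; classes are hypothetical names chosen for illustration, and replication is left out entirely.&lt;/p&gt;

```python
from dataclasses import dataclass, field


# Hypothetical event types; the real event schema in NoKV may differ.
@dataclass(frozen=True)
class LeaseGranted:
    holder: str


@dataclass(frozen=True)
class RegionCreated:
    region_id: int
    start_key: str
    end_key: str


@dataclass(frozen=True)
class RegionRemoved:
    region_id: int


@dataclass
class AppliedState:
    """Compact state derived only from committed events."""
    applied_index: int = 0
    lease_holder: str = ""
    regions: dict = field(default_factory=dict)

    def apply(self, event):
        # Each event type has exactly one deterministic effect.
        if isinstance(event, LeaseGranted):
            self.lease_holder = event.holder
        elif isinstance(event, RegionCreated):
            self.regions[event.region_id] = (event.start_key, event.end_key)
        elif isinstance(event, RegionRemoved):
            self.regions.pop(event.region_id, None)
        else:
            raise TypeError(f"unknown event type: {type(event).__name__}")
        self.applied_index += 1


class RootLog:
    """Append-only: committed events are never edited or deleted."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)
        return len(self._events)  # index of the newly committed event

    def replay(self):
        # Fold the whole log into a fresh compact state.
        state = AppliedState()
        for event in self._events:
            state.apply(event)
        return state


log = RootLog()
log.append(LeaseGranted("coord-a"))
log.append(RegionCreated(1, "a", "m"))
log.append(RegionCreated(2, "m", "z"))
log.append(RegionRemoved(1))
state = log.replay()
print(state.applied_index, state.lease_holder, state.regions)
```

&lt;p&gt;The point of the sketch is that the applied state is never written directly. It can only change by committing an event, so replaying the same log always yields the same state.&lt;/p&gt;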




&lt;h2&gt;
  
  
  From a PD-like intuition to a rooted-truth design
&lt;/h2&gt;

&lt;p&gt;A useful way to explain the evolution is this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the initial intuition was closer to a &lt;strong&gt;TiKV / PD-style control-plane concentration&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;the final direction is closer to a &lt;strong&gt;FoundationDB-style role separation&lt;/strong&gt;, combined with a &lt;strong&gt;Delos-like rooted-truth design&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not about copying another system’s exact implementation; it is about adopting a cleaner architectural boundary:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the log is the truth&lt;/li&gt;
&lt;li&gt;services above it are consumers, views, and operators&lt;/li&gt;
&lt;li&gt;restart should rebuild from truth, not recover from hidden local authority&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the key shift.&lt;/p&gt;

&lt;p&gt;In other words, we did not want &lt;code&gt;coordinator&lt;/code&gt; to become a giant “metadata brain process” that owns everything and then needs more and more local state to stay alive. We wanted it to become something &lt;strong&gt;horizontally deployable&lt;/strong&gt; and &lt;strong&gt;operationally replaceable&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;So in NoKV:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;meta/root&lt;/code&gt; owns durable rooted truth&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;coordinator&lt;/code&gt; consumes rooted truth and builds a runtime cluster view&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;raftstore&lt;/code&gt; executes data-plane work and region-level replication&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is also why the repository documentation describes the system as having &lt;strong&gt;three planes&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;truth plane&lt;/li&gt;
&lt;li&gt;control plane&lt;/li&gt;
&lt;li&gt;execution plane&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why this split is useful in practice
&lt;/h2&gt;

&lt;p&gt;This is not just a conceptual refinement. It has concrete engineering payoffs.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Coordinator becomes much lighter on restart
&lt;/h3&gt;

&lt;p&gt;A coordinator restart is no longer “recover local metadata authority”.&lt;/p&gt;

&lt;p&gt;It becomes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;reconnect to rooted truth&lt;/li&gt;
&lt;li&gt;rebuild the in-memory view&lt;/li&gt;
&lt;li&gt;resume lease competition if appropriate&lt;/li&gt;
&lt;li&gt;continue serving&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That makes the coordinator much easier to reason about operationally. What differentiates active and standby coordinators is not some private local metadata store, but the rooted lease state.&lt;/p&gt;
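
&lt;p&gt;The restart path above can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than the NoKV API: &lt;code&gt;RootClient&lt;/code&gt;, its &lt;code&gt;read_all&lt;/code&gt; method, and the &lt;code&gt;("lease", ...)&lt;/code&gt; and &lt;code&gt;("route", ...)&lt;/code&gt; event tuples are hypothetical, and lease expiry is ignored to keep the example short.&lt;/p&gt;

```python
class RootClient:
    """Stand-in for a connection to rooted truth (hypothetical API)."""

    def __init__(self, committed_events):
        self._events = list(committed_events)

    def read_all(self):
        return list(self._events)


class Coordinator:
    def __init__(self, name, root):
        self.name = name
        self.root = root
        self.routes = {}          # region id to store id, rebuildable
        self.lease_holder = None  # learned from rooted truth only

    def start(self):
        # Reconnect to rooted truth and rebuild the in-memory view.
        self.routes.clear()
        self.lease_holder = None
        for kind, payload in self.root.read_all():
            if kind == "route":
                region_id, store_id = payload
                self.routes[region_id] = store_id
            elif kind == "lease":
                self.lease_holder = payload
        # The role comes from rooted lease state, not from local files.
        return self.role()

    def role(self):
        return "active" if self.lease_holder == self.name else "standby"


root = RootClient([
    ("lease", "coord-a"),
    ("route", (1, "store-1")),
    ("route", (2, "store-2")),
    ("route", (1, "store-3")),
])

a = Coordinator("coord-a", root)
b = Coordinator("coord-b", root)
role_a, role_b = a.start(), b.start()

# Simulate a crash of coord-a: the fresh process holds no local state.
restarted = Coordinator("coord-a", root)
role_restarted = restarted.start()
print(role_a, role_b, role_restarted, restarted.routes)
```

&lt;p&gt;In this sketch the restarted process ends up with the same routing view as the standby, and the two differ only in what the rooted lease record says about them.&lt;/p&gt;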

&lt;h3&gt;
  
  
  2. Durable truth stops being mixed with runtime convenience
&lt;/h3&gt;

&lt;p&gt;Routing caches, heartbeat-derived state, scheduling hints, and local runtime maps are useful, but they are not the same thing as authoritative metadata truth.&lt;/p&gt;

&lt;p&gt;The split forces us to say that explicitly.&lt;/p&gt;

&lt;p&gt;That reduces a whole category of ambiguity around:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;which state is “just a view”&lt;/li&gt;
&lt;li&gt;which state must survive as the source of truth&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Control-plane horizontal scaling becomes more realistic
&lt;/h3&gt;

&lt;p&gt;If the coordinator is “everything”, then horizontal scaling is awkward, because every extra coordinator replica either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;becomes a passive hot standby with hidden state coupling, or&lt;/li&gt;
&lt;li&gt;requires reimplementing distributed truth inside the coordinator layer itself&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But if the durable metadata truth already lives below, then multiple coordinator processes become much simpler:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;all consume the same rooted truth&lt;/li&gt;
&lt;li&gt;all rebuild the same kind of view&lt;/li&gt;
&lt;li&gt;lease determines who is currently active for singleton duties&lt;/li&gt;
&lt;li&gt;standby instances are not fake; they are real, warm consumers of the same truth&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a much cleaner story for scaling and failover.&lt;/p&gt;
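
&lt;p&gt;One way to picture a lease-guarded singleton duty is ID allocation with a rooted fence. The Python sketch below is an assumption about how such a scheme can work, not a description of NoKV’s allocator: &lt;code&gt;Root&lt;/code&gt;, &lt;code&gt;advance_fence&lt;/code&gt;, &lt;code&gt;alloc_id&lt;/code&gt;, and the window size &lt;code&gt;WINDOW&lt;/code&gt; are all hypothetical. The lease holder durably reserves a window of IDs, serves them from memory, and a successor always starts at the committed fence.&lt;/p&gt;

```python
WINDOW = 3  # hypothetical number of ids reserved per fence advance


class Root:
    """Stand-in for rooted truth: lease holder plus allocator fence."""

    def __init__(self):
        self.lease_holder = None
        self.id_fence = 0  # ids below the fence may already be issued

    def grant_lease(self, name):
        self.lease_holder = name

    def advance_fence(self, name, window):
        # Only the current lease holder may move the fence.
        if name != self.lease_holder:
            raise PermissionError(f"{name} does not hold the lease")
        start = self.id_fence
        self.id_fence = start + window
        return range(start, self.id_fence)


class Coordinator:
    def __init__(self, name, root):
        self.name = name
        self.root = root
        self._window = iter(())  # in-memory only, lost on crash

    def alloc_id(self):
        next_id = next(self._window, None)
        if next_id is None:
            # Window exhausted or fresh process: reserve a new one durably.
            self._window = iter(self.root.advance_fence(self.name, WINDOW))
            next_id = next(self._window)
        return next_id


root = Root()
a = Coordinator("coord-a", root)
b = Coordinator("coord-b", root)

root.grant_lease("coord-a")
ids_from_a = [a.alloc_id() for _ in range(4)]  # crosses one window boundary

try:
    b.alloc_id()
    standby_refused = False
except PermissionError:
    standby_refused = True

root.grant_lease("coord-b")  # failover: coord-a is gone
ids_from_b = [b.alloc_id() for _ in range(2)]
print(ids_from_a, standby_refused, ids_from_b)
```

&lt;p&gt;Failover may leave a gap in the ID space, because the unused tail of the old window is abandoned. In exchange, the new holder needs nothing from the old process: the committed fence alone is enough to avoid reusing an ID.&lt;/p&gt;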

&lt;h3&gt;
  
  
  4. Authority handoff becomes auditable
&lt;/h3&gt;

&lt;p&gt;Lease grant, seal, closure, handoff: these become committed rooted events rather than side effects lost inside a single service process.&lt;/p&gt;

&lt;p&gt;That matters both for correctness and for understanding the system later.&lt;/p&gt;
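
&lt;p&gt;As a small illustration of what “auditable” can mean here, the Python sketch below replays committed lease events into a timeline of holders. The tuple layout &lt;code&gt;(index, kind, coordinator)&lt;/code&gt; and the &lt;code&gt;lease_timeline&lt;/code&gt; helper are hypothetical, and only grant and seal events are modelled.&lt;/p&gt;

```python
# Hypothetical committed lease events as (index, kind, coordinator) tuples.
committed = [
    (1, "grant", "coord-a"),
    (2, "seal", "coord-a"),
    (3, "grant", "coord-b"),
    (4, "seal", "coord-b"),
    (5, "grant", "coord-a"),
]


def lease_timeline(events):
    """Replay lease events into a list of (holder, granted_at, sealed_at)."""
    timeline = []
    current = None  # (holder, granted_at) for the open lease, if any
    for index, kind, who in events:
        if kind == "grant":
            if current is not None:
                raise ValueError(f"grant at {index} while lease is still open")
            current = (who, index)
        elif kind == "seal":
            if current is None or current[0] != who:
                raise ValueError(f"seal at {index} by non-holder {who}")
            timeline.append((current[0], current[1], index))
            current = None
    if current is not None:
        timeline.append((current[0], current[1], None))  # still active
    return timeline


timeline = lease_timeline(committed)
for holder, granted_at, sealed_at in timeline:
    print(holder, granted_at, sealed_at)
```

&lt;p&gt;Because the handoff history is just a fold over committed events, the same replay also works as a consistency check: a grant that overlaps an unsealed lease shows up as an error instead of silently winning.&lt;/p&gt;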




&lt;h2&gt;
  
  
  Why we built a dashboard for this
&lt;/h2&gt;

&lt;p&gt;Once you split the system this way, a static architecture diagram is no longer enough, because the interesting part is not just “there are three kinds of nodes”.&lt;/p&gt;

&lt;p&gt;The interesting part is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;who is currently the &lt;code&gt;meta-root&lt;/code&gt; Raft leader&lt;/li&gt;
&lt;li&gt;which coordinator currently holds the lease&lt;/li&gt;
&lt;li&gt;how region leaders are distributed across stores&lt;/li&gt;
&lt;li&gt;how failover changes the live control path&lt;/li&gt;
&lt;li&gt;what stays durable truth, and what is only a rebuildable view&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is why we built a &lt;strong&gt;live dashboard&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The dashboard is not only there to make the demo prettier. It is there because this architecture is much easier to understand when you can observe it from several angles at once:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;truth plane&lt;/strong&gt;: rooted truth ownership and replication&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;control plane&lt;/strong&gt;: lease holder, routing view, coordinator role&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;execution plane&lt;/strong&gt;: per-region leadership and store-level state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It turns the system from “a diagram in a README” into something you can actually inspect while it is running.&lt;/p&gt;

&lt;p&gt;That is especially useful for a project like NoKV, because one of our goals is not just to build a storage system, but to build a &lt;strong&gt;maintainable and extensible distributed storage research platform&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If the architecture cannot be made visible, it is much harder to evolve it rigorously.&lt;/p&gt;




&lt;h2&gt;
  
  
  If you want to read the code, start here
&lt;/h2&gt;

&lt;p&gt;If you want to understand this split from the source code instead of only from this post, these are the best entry points:&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;meta/root/&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;The rooted truth kernel:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;typed events&lt;/li&gt;
&lt;li&gt;compact state&lt;/li&gt;
&lt;li&gt;storage backend&lt;/li&gt;
&lt;li&gt;remote service/client&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;coordinator/&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;The control-plane service:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;routing&lt;/li&gt;
&lt;li&gt;heartbeats&lt;/li&gt;
&lt;li&gt;lease handling&lt;/li&gt;
&lt;li&gt;allocator serving&lt;/li&gt;
&lt;li&gt;rebuildable cluster view&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;raftstore/&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;The execution plane:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;multi-Raft region lifecycle&lt;/li&gt;
&lt;li&gt;replicated command execution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The shortest doc path is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;README.md&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;docs/architecture.md&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;docs/rooted_truth.md&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;docs/coordinator.md&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those four together give the cleanest route from:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“What is this repo?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;to:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Why are these package boundaries the way they are?”&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Closing thought
&lt;/h2&gt;

&lt;p&gt;A lot of distributed systems talk about separation of concerns, but in practice still let the control plane quietly accumulate too much hidden authority.&lt;/p&gt;

&lt;p&gt;What we wanted in NoKV was a cleaner line:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the durable metadata truth should live in its own rooted substrate&lt;/li&gt;
&lt;li&gt;the coordinator should be a service layer on top of that truth&lt;/li&gt;
&lt;li&gt;the execution plane should stay separate from both&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That separation made the architecture easier to explain, easier to visualize, and, I think, easier to extend.&lt;/p&gt;

&lt;p&gt;And that is exactly why the dashboard exists: &lt;strong&gt;not as decoration, but as a way to make those boundaries visible.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>database</category>
      <category>distributedsystems</category>
      <category>systemdesign</category>
    </item>
  </channel>
</rss>
