With the Synaptic Canvas GUI rendering, my bare-metal kernel was fully functional. However, as I expanded the OS features, I ran into multitasking bottlenecks: how do I run background compilation, model inference, and GUI rendering concurrently without crashing the system?
Last night, I solved this by implementing three core infrastructure services: Nexus Swarms, Beacon Headless Streaming, and Zero-Downtime OTA Hot-Patching.
We are building a bare-metal, self-healing operating system running entirely inside the CPU's L3 cache. Here is the roadmap for this 12-part series: The V.E.L.O.C.I.T.Y.-OS 12-Part Roadmap
1. The Nexus Core Swarm Runtime (nexus.rs)
To support concurrent compilation and optimization, I built the Nexus Core Swarm Runtime.
The runtime allows JIT threads or the LLM shell to launch child agents via sys_spawn_agent(source_ptr, source_len, mem_limit). Each spawned agent (such as the translator_agent or optimizer_agent) runs in an isolated heap with sandboxed PIDs under a cooperative scheduler.
Agents communicate using Synaptic Message Rings—lock-free circular ring buffers in shared memory. Every packet header contains a rolling Merkle hash calculated on write and validated on read to prevent message corruption.
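To make the ring idea concrete, here is a minimal sketch of a single-producer, single-consumer ring whose packets carry a hash written on push and checked on pop. It is not the code in nexus.rs: the names (`Ring`, `Packet`, `CorruptPacket`, `SLOTS`, `MAX_PAYLOAD`) are hypothetical, FNV-1a stands in for the rolling Merkle hash, and it uses `std` so it can run on a host machine.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering};

const SLOTS: usize = 8; // assumed ring size; must be a power of two
const MAX_PAYLOAD: usize = 32; // assumed fixed payload capacity per packet

#[derive(Clone, Copy)]
struct Packet {
    hash: u64, // header hash, computed on write
    len: usize,
    payload: [u8; MAX_PAYLOAD],
}

#[derive(Debug, PartialEq)]
struct CorruptPacket;

/// FNV-1a, used here only as a stand-in for the post's rolling Merkle hash.
fn fnv1a(bytes: &[u8]) -> u64 {
    bytes.iter().fold(0xcbf2_9ce4_8422_2325, |h, &b| {
        (h ^ u64::from(b)).wrapping_mul(0x0000_0100_0000_01b3)
    })
}

struct Ring {
    slots: [UnsafeCell<Packet>; SLOTS],
    head: AtomicUsize, // next slot to read, advanced only by the consumer
    tail: AtomicUsize, // next slot to write, advanced only by the producer
}

// SAFETY: sound only under the one-producer, one-consumer assumption.
unsafe impl Sync for Ring {}

impl Ring {
    fn new() -> Self {
        let empty = Packet { hash: 0, len: 0, payload: [0; MAX_PAYLOAD] };
        Ring {
            slots: std::array::from_fn(|_| UnsafeCell::new(empty)),
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
        }
    }

    /// Producer side: returns false if the ring is full or the payload is too large.
    fn push(&self, data: &[u8]) -> bool {
        if data.len() > MAX_PAYLOAD {
            return false;
        }
        let tail = self.tail.load(Ordering::Relaxed);
        if tail.wrapping_sub(self.head.load(Ordering::Acquire)) == SLOTS {
            return false;
        }
        let mut packet = Packet { hash: fnv1a(data), len: data.len(), payload: [0; MAX_PAYLOAD] };
        packet.payload[..data.len()].copy_from_slice(data);
        // SAFETY: the consumer never touches a slot before `tail` is published.
        unsafe { *self.slots[tail % SLOTS].get() = packet };
        self.tail.store(tail.wrapping_add(1), Ordering::Release);
        true
    }

    /// Consumer side: None when empty, Err when the header hash does not match.
    fn pop(&self) -> Option<Result<Vec<u8>, CorruptPacket>> {
        let head = self.head.load(Ordering::Relaxed);
        if head == self.tail.load(Ordering::Acquire) {
            return None;
        }
        // SAFETY: the producer never overwrites a slot before `head` moves past it.
        let packet = unsafe { *self.slots[head % SLOTS].get() };
        self.head.store(head.wrapping_add(1), Ordering::Release);
        match packet.payload.get(..packet.len) {
            Some(p) if fnv1a(p) == packet.hash => Some(Ok(p.to_vec())),
            _ => Some(Err(CorruptPacket)),
        }
    }
}
```

The Acquire/Release pairing on `head` and `tail` is what lets the two sides avoid locks: each index has exactly one writer, and the other side only reads it.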
Here is the cooperative context switcher implementation in src/gui.rs showing the raw assembly context swap and how task registers are pushed and popped to switch execution stacks on core quiescent ticks:
// velocity-bootloader/src/gui.rs: Cooperative Context Switcher
pub struct JitTask {
    pub id: usize,
    pub title: String,
    pub program: Arc<crate::nda_jit::JitProgram>,
    pub stack: Vec<u8>,
    pub rsp: u64,
    pub completed: bool,
}

pub struct CooperativeScheduler {
    pub tasks: Vec<JitTask>,
    pub current_task_idx: Option<usize>,
    pub scheduler_rsp: u64,
}

// Low-level assembly context switcher (Win64 calling convention).
// Under Win64, `from_rsp` arrives in rcx and `to_rsp` in rdx.
#[cfg(target_os = "uefi")]
#[unsafe(naked)]
pub unsafe extern "win64" fn switch_context(from_rsp: *mut u64, to_rsp: u64) {
    core::arch::naked_asm!(
        // 1. Preserve the callee-saved SIMD registers (xmm6 through xmm15)
        "sub rsp, 160",
        "movdqu [rsp + 0], xmm6",
        "movdqu [rsp + 16], xmm7",
        "movdqu [rsp + 32], xmm8",
        "movdqu [rsp + 48], xmm9",
        "movdqu [rsp + 64], xmm10",
        "movdqu [rsp + 80], xmm11",
        "movdqu [rsp + 96], xmm12",
        "movdqu [rsp + 112], xmm13",
        "movdqu [rsp + 128], xmm14",
        "movdqu [rsp + 144], xmm15",
        // 2. Preserve the callee-saved general-purpose registers
        "push rbx", "push rbp", "push rdi", "push rsi",
        "push r12", "push r13", "push r14", "push r15",
        // 3. Swap stack pointer registers
        "mov [rcx], rsp", // Save old stack pointer
        "mov rsp, rdx",   // Load new stack pointer
        // 4. Restore new task's registers (reverse order of the saves above)
        "pop r15", "pop r14", "pop r13", "pop r12",
        "pop rsi", "pop rdi", "pop rbp", "pop rbx",
        "movdqu xmm15, [rsp + 144]",
        "movdqu xmm14, [rsp + 128]",
        "movdqu xmm13, [rsp + 112]",
        "movdqu xmm12, [rsp + 96]",
        "movdqu xmm11, [rsp + 80]",
        "movdqu xmm10, [rsp + 64]",
        "movdqu xmm9, [rsp + 48]",
        "movdqu xmm8, [rsp + 32]",
        "movdqu xmm7, [rsp + 16]",
        "movdqu xmm6, [rsp + 0]",
        "add rsp, 160",
        "ret"
    );
}
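The switcher above can only resume a stack that already holds a saved frame, so a brand-new task needs one built by hand before its first switch. The sketch below shows one way that might be done. The function `prepare_initial_stack` and its `exit_trampoline` parameter are hypothetical and not taken from gui.rs; the layout simply mirrors what `switch_context` pops (8 general-purpose registers, 160 bytes of XMM state, then a return address).

```rust
const GPR_BYTES: usize = 8 * 8; // r15, r14, r13, r12, rsi, rdi, rbp, rbx
const XMM_BYTES: usize = 10 * 16; // xmm6 through xmm15
const FRAME_BYTES: usize = GPR_BYTES + XMM_BYTES;
const SHADOW_BYTES: usize = 32; // Win64 shadow space a callee may write to

/// Builds the first frame for a task and returns the offset into `stack`
/// whose address should be stored as the task's initial `rsp`.
/// Returns None if the stack is too small.
fn prepare_initial_stack(stack: &mut [u8], entry: u64, exit_trampoline: u64) -> Option<usize> {
    let base = stack.as_ptr() as usize;
    let top = base.checked_add(stack.len())?;

    // Slot that `ret` pops. It must be 16-byte aligned so that the entry
    // function starts with rsp % 16 == 8, exactly as if it had been `call`ed.
    // Above it sit a fake return address and the shadow space.
    let ret_slot = top.checked_sub(8 + 8 + SHADOW_BYTES)? & !0xF;
    let frame_start = ret_slot.checked_sub(FRAME_BYTES)?;
    if frame_start < base {
        return None;
    }

    let frame = frame_start - base;
    let ret = ret_slot - base;
    // Zeroed register images: the task starts with cleared callee-saved state.
    stack[frame..ret].fill(0);
    // `ret` in switch_context jumps here.
    stack[ret..ret + 8].copy_from_slice(&entry.to_le_bytes());
    // Where the entry function lands if it ever returns (assumed cleanup routine).
    stack[ret + 8..ret + 16].copy_from_slice(&exit_trampoline.to_le_bytes());
    Some(frame)
}
```

A scheduler along these lines would store `stack.as_ptr() as u64 + offset as u64` in `JitTask::rsp` and then call `switch_context(&mut scheduler_rsp, task.rsp)` to start the task.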
2. The Beacon Remote Headless Protocol (beacon.rs)
For edge VMs or headless servers without physical displays, I developed the Beacon Headless Protocol.
The compositor divides the screen into a grid of cells. On every tick, the protocol computes a signature for each cell, compares it with the previous frame's signature to detect changed pixels, and streams Run-Length Encoded (RLE) delta frames over COM1 serial or Ethernet at 30+ FPS.
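As a rough illustration of that pipeline, the sketch below hashes fixed-size cells, finds the ones that changed, and run-length encodes only those. It makes several assumptions that are not from beacon.rs: cells are flat byte chunks rather than 2D tiles, the signature is FNV-1a, runs are stored as `(count, value)` pairs capped at 255, and all function names are hypothetical.

```rust
/// FNV-1a over a cell's bytes; a stand-in for whatever signature Beacon uses.
fn cell_signature(pixels: &[u8]) -> u64 {
    pixels.iter().fold(0xcbf2_9ce4_8422_2325, |h, &b| {
        (h ^ u64::from(b)).wrapping_mul(0x0000_0100_0000_01b3)
    })
}

/// Indices of cells whose signature changed between two frames.
/// `cell_bytes` must be non-zero.
fn dirty_cells(prev: &[u8], next: &[u8], cell_bytes: usize) -> Vec<usize> {
    prev.chunks(cell_bytes)
        .zip(next.chunks(cell_bytes))
        .enumerate()
        .filter(|(_, (a, b))| cell_signature(a) != cell_signature(b))
        .map(|(i, _)| i)
        .collect()
}

/// Run-length encodes bytes as (count, value) pairs, splitting runs at 255.
fn rle_encode(data: &[u8]) -> Vec<(u8, u8)> {
    let mut runs: Vec<(u8, u8)> = Vec::new();
    for &byte in data {
        if let Some(last) = runs.last_mut() {
            if last.1 == byte && last.0 < u8::MAX {
                last.0 += 1;
                continue;
            }
        }
        runs.push((1, byte));
    }
    runs
}

/// A delta frame: for each changed cell, its index and RLE-compressed contents.
fn encode_delta(prev: &[u8], next: &[u8], cell_bytes: usize) -> Vec<(usize, Vec<(u8, u8)>)> {
    dirty_cells(prev, next, cell_bytes)
        .into_iter()
        .map(|i| {
            let start = i * cell_bytes;
            let end = (start + cell_bytes).min(next.len());
            (i, rle_encode(&next[start..end]))
        })
        .collect()
}
```

Comparing signatures instead of raw pixels means the previous frame only has to be kept as one hash per cell, which matters when the whole system is meant to fit in cache.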
Incoming packets from Beacon clients decode keyboard and mouse movements, injecting them directly into the kernel's keyboard::INPUT_QUEUE and mouse registers. (Note: This custom protocol will be replaced with V.E.L.O.C.I.T.Y. Remote soon).
3. Zero-Downtime OTA Hot-Patching (ota.rs)
If a core OS driver (such as fat or nvme) has a bug, rebooting a live JIT compiler is dangerous. I built a cryptographic Zero-Downtime OTA Hot-Patching module.
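// Atomically exchange the active FAT32 read pointer. Note that `swap` is an
// unconditional exchange (not a compare-and-swap); it returns the old pointer.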
let old_ptr = FAT_READ_PTR.swap(new_ptr, Ordering::SeqCst);
Core driver entrypoints are stored in a global Sitemap Dispatch Table. When an update is pushed, the kernel:
- Allocates fresh memory pages and compiles the new driver code.
- Cryptographically verifies the payload signature against the public developer key embedded in the bootloader.
- Swaps the function pointers atomically using a Compare-And-Swap (lock cmpxchg) instruction.
- Reclaims the old memory pages using a Read-Copy-Update (RCU) reclamation pattern once all active CPU cores pass their quiescent ticks.
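The verify-then-swap steps above can be sketched as a single dispatch slot patched with a compare-and-swap. This is an illustration under stated assumptions, not the code in ota.rs: `DispatchSlot`, `PatchError`, `read_v1`, and `read_v2` are hypothetical names, signature verification is reduced to a boolean passed in by the caller, and RCU reclamation is left out (the old pointer is just returned so the caller could free it after a grace period).

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

type ReadFn = fn(u32) -> u32;

#[derive(Debug, PartialEq)]
enum PatchError {
    BadSignature, // payload failed verification; nothing was swapped
    Raced,        // another patch replaced the entry first
}

/// One entry of a dispatch table holding the active driver function.
struct DispatchSlot {
    ptr: AtomicPtr<()>,
}

impl DispatchSlot {
    fn new(f: ReadFn) -> Self {
        DispatchSlot { ptr: AtomicPtr::new(f as *mut ()) }
    }

    /// Currently installed function.
    fn current(&self) -> ReadFn {
        let raw = self.ptr.load(Ordering::Acquire);
        // SAFETY: the slot only ever stores pointers created from `ReadFn` values.
        unsafe { std::mem::transmute::<*mut (), ReadFn>(raw) }
    }

    /// Callers always go through the slot, so they pick up a patch on the next call.
    fn call(&self, arg: u32) -> u32 {
        (self.current())(arg)
    }

    /// Installs `new` only if the payload verified and the slot still holds `expected`.
    /// On success, returns the old function so it can be reclaimed later.
    fn patch(&self, expected: ReadFn, new: ReadFn, signature_ok: bool) -> Result<ReadFn, PatchError> {
        if !signature_ok {
            return Err(PatchError::BadSignature);
        }
        self.ptr
            .compare_exchange(expected as *mut (), new as *mut (), Ordering::AcqRel, Ordering::Acquire)
            // SAFETY: same invariant as in `current`.
            .map(|old| unsafe { std::mem::transmute::<*mut (), ReadFn>(old) })
            .map_err(|_| PatchError::Raced)
    }
}

// Hypothetical old and new driver entrypoints.
fn read_v1(sector: u32) -> u32 { sector + 1 }
fn read_v2(sector: u32) -> u32 { sector * 2 }
```

Using `compare_exchange` rather than a plain `swap` lets a patcher detect that someone else updated the entry between its verification step and its install step, which an unconditional exchange would silently overwrite.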
[Figure: architectural overview comparing the multi-agent cooperative stack switcher and the RCU pointer hot-patching pipeline]
Pascal's Analysis: Distributed Transactions
Pascal analyzed the agent coordination and hot-patching architecture:
"The pre-commit notification pattern... is essentially a distributed transaction with optimistic concurrency. The discourse board is your conflict resolution layer... The audit trail isn't just for debugging — it's a record of why each change was made and who agreed to it."
Pascal noted that by combining RCU pointer swapping with Merkle message verification, the OS was applying kernel-level code updates with the same safety guarantees as database transactions.
But to make this OS self-improving, I needed a way to let the local LLM optimize its own kernel code on-the-fly.
In the next post, I'll document how I completed the self-healing loop, the content-addressed Biosphere registry, and the Boot-to-NDA LLM Terminal handover.
Discussion
How do you handle task scheduling and state consensus in multi-agent environments? Have you implemented cooperative context switching or dynamic RCU hot-patching in low-level systems? Let's discuss in the comments below!
Special thanks to for helping me conceptualize the conflict resolution board for multi-agent state consensus.
Disclaimer: AI was used throughout this project, so it is only fitting that it co-authors with me. Special thanks to the Foundry for its tireless hours toiling away and to Gemini for producing the cover image.

Top comments (1)
@pascal_cescato_692b7a8a20 One left 🥳 Then a bit of a wait till Series 2, where it becomes an actually usable OS and get it to a point where I can use it as a dev platform!