<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vivek jadhav</title>
    <description>The latest articles on DEV Community by Vivek jadhav (@vivek_1502).</description>
    <link>https://dev.to/vivek_1502</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3943230%2F54a45b7b-6a79-4c1c-bd2a-5a3e962eb926.png</url>
      <title>DEV Community: Vivek jadhav</title>
      <link>https://dev.to/vivek_1502</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vivek_1502"/>
    <language>en</language>
    <item>
      <title>How I built an AWS Lambda clone with Firecracker microVMs</title>
      <dc:creator>Vivek jadhav</dc:creator>
      <pubDate>Thu, 21 May 2026 04:31:19 +0000</pubDate>
      <link>https://dev.to/vivek_1502/how-built-an-aws-lambda-clone-with-firecracker-microvms-1572</link>
      <guid>https://dev.to/vivek_1502/how-built-an-aws-lambda-clone-with-firecracker-microvms-1572</guid>
      <description>&lt;p&gt;Ever wondered what actually happens when you invoke a Lambda function? Not the API layer but the execution layer. What runs your code, how it's isolated, and how AWS gets cold starts low enough to be usable?&lt;/p&gt;

&lt;p&gt;I wanted to understand that deeply. So I built it.&lt;/p&gt;

&lt;p&gt;This is a breakdown of how I built a Firecracker-based serverless runtime from scratch, the architectural decisions I made, and what the numbers look like.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem: cold starts
&lt;/h2&gt;

&lt;p&gt;Every serverless platform faces the same fundamental tension. You want functions to start instantly, but strong isolation requires spinning up a fresh environment per invocation.&lt;/p&gt;

&lt;p&gt;A standard Linux VM boot takes ~200ms at minimum. At scale, that's unusable.&lt;/p&gt;

&lt;p&gt;AWS's solution, and the core idea behind this project, is &lt;strong&gt;VM snapshots&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  How snapshot-based cold start works
&lt;/h2&gt;

&lt;p&gt;Instead of booting a VM on every invocation:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Boot the VM once&lt;/li&gt;
&lt;li&gt;Load the Node.js runtime (my project's choice, not AWS's) and the function handler&lt;/li&gt;
&lt;li&gt;Snapshot the initialized memory state to disk&lt;/li&gt;
&lt;li&gt;On every subsequent invocation, &lt;strong&gt;restore from that snapshot&lt;/strong&gt; rather than booting fresh&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Restoring from a snapshot takes 1–5ms. A full cold boot takes 200ms. That's a 40–200x improvement.&lt;/p&gt;

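&lt;p&gt;AWS applies the same idea in Lambda's Firecracker-based execution model, most visibly in Lambda SnapStart, which resumes functions from a snapshot of an already initialized environment.&lt;/p&gt;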
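
&lt;p&gt;As a rough sketch, the snapshot and restore steps map onto three requests against Firecracker's HTTP API, which is served over a Unix socket. The request shapes below follow my reading of Firecracker's snapshot documentation and the field names have changed between releases, so check them against the version you run. The helper names and file paths are hypothetical, and the code only builds the requests instead of sending them.&lt;/p&gt;

```python
import json


def pause_vm():
    # A microVM has to be paused before it can be snapshotted.
    return ("PATCH", "/vm", {"state": "Paused"})


def create_snapshot(snapshot_path, mem_path):
    # Writes the VM state and the guest memory to two separate files.
    return (
        "PUT",
        "/snapshot/create",
        {
            "snapshot_type": "Full",
            "snapshot_path": snapshot_path,
            "mem_file_path": mem_path,
        },
    )


def load_snapshot(snapshot_path, mem_path):
    # Sent to a fresh Firecracker process; resume_vm starts the guest at once.
    return (
        "PUT",
        "/snapshot/load",
        {
            "snapshot_path": snapshot_path,
            "mem_backend": {"backend_type": "File", "backend_path": mem_path},
            "resume_vm": True,
        },
    )


def render(request):
    # Human-readable form of one request, handy for logging.
    method, path, body = request
    return f"{method} {path} {json.dumps(body, sort_keys=True)}"


if __name__ == "__main__":
    # Hypothetical paths, chosen only for illustration.
    state, mem = "/srv/snapshots/hello/vmstate", "/srv/snapshots/hello/mem"
    for request in (pause_vm(), create_snapshot(state, mem), load_snapshot(state, mem)):
        print(render(request))
```

&lt;p&gt;The first two requests run once per deployment. Only the load request sits on the invocation path, which is why a restore is so much cheaper than a boot.&lt;/p&gt;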

&lt;h2&gt;
  
  
  Architecture overview
&lt;/h2&gt;

&lt;p&gt;The system has two main components:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Control Plane&lt;/strong&gt; — handles everything outside the VM:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Function deployment (accepts a zip, builds a minimal rootfs)&lt;/li&gt;
&lt;li&gt;VM lifecycle (create, snapshot, restore, destroy)&lt;/li&gt;
&lt;li&gt;Per-function request queues with concurrency control&lt;/li&gt;
&lt;li&gt;Multi-tenant scheduling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;MicroVM Runtime&lt;/strong&gt; — runs inside each Firecracker VM:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A minimal Linux kernel + custom rootfs&lt;/li&gt;
&lt;li&gt;Node.js runtime executing user handlers&lt;/li&gt;
&lt;li&gt;Deterministic execution: one request → one execution → response&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4wbkvd8s0v7l0s7nhxac.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4wbkvd8s0v7l0s7nhxac.png" alt="Architecture Diagram" width="799" height="396"&gt;&lt;/a&gt;&lt;/p&gt;
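
&lt;p&gt;To make the "per-function request queues with concurrency control" item concrete, here is a minimal sketch using one semaphore per function. It is not the project's scheduler: &lt;code&gt;FunctionLimiter&lt;/code&gt;, its fields, and the &lt;code&gt;echo&lt;/code&gt; handler are names I made up for illustration, and the sleep stands in for the round-trip to a VM.&lt;/p&gt;

```python
import asyncio
from collections import defaultdict


class FunctionLimiter:
    """Caps in-flight invocations per function name (hypothetical helper)."""

    def __init__(self, max_concurrency):
        # One semaphore per function, created lazily on first use.
        self._slots = defaultdict(lambda: asyncio.Semaphore(max_concurrency))
        self.in_flight = defaultdict(int)
        self.peak = defaultdict(int)

    async def invoke(self, function_name, handler, event):
        # Callers beyond the limit wait here, so the semaphore acts as the queue.
        async with self._slots[function_name]:
            self.in_flight[function_name] += 1
            self.peak[function_name] = max(
                self.peak[function_name], self.in_flight[function_name]
            )
            try:
                return await handler(event)
            finally:
                self.in_flight[function_name] -= 1


async def demo():
    limiter = FunctionLimiter(max_concurrency=2)

    async def echo(event):
        await asyncio.sleep(0.01)  # stand-in for the VM round-trip
        return {"echo": event}

    results = await asyncio.gather(
        *(limiter.invoke("hello", echo, i) for i in range(6))
    )
    return results, limiter.peak["hello"]


if __name__ == "__main__":
    results, peak = asyncio.run(demo())
    print(results, "peak concurrency:", peak)
```

&lt;p&gt;Because each function gets its own semaphore, a burst against one function cannot use up the slots of another, which is the property multi-tenant scheduling needs.&lt;/p&gt;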

&lt;h2&gt;
  
  
  IPC: how the host talks to the VM
&lt;/h2&gt;

&lt;p&gt;This is where a lot of serverless runtimes lose performance. Every round-trip between host and VM has overhead. If you open a new connection per request, that overhead compounds.&lt;/p&gt;

&lt;p&gt;I used two mechanisms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;vsock (virtio sockets)&lt;/strong&gt; for host ↔ VM communication. vsock is designed specifically for VM-to-host traffic and avoids the overhead of a full network stack.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unix domain sockets&lt;/strong&gt; for intra-VM routing. They are faster than TCP for local communication because they skip the TCP/IP stack entirely.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Eliminating per-request connection setup was the key unlock for throughput.&lt;/p&gt;
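
&lt;p&gt;Here is a minimal sketch of what reusing one connection looks like. It assumes length-prefixed JSON frames, which is my framing choice and not necessarily the project's, and it uses &lt;code&gt;socket.socketpair()&lt;/code&gt; as a stand-in for the vsock link so it runs without a VM. All function names are hypothetical.&lt;/p&gt;

```python
import json
import socket
import struct
import threading


def send_frame(sock, payload):
    """Send one JSON message prefixed with its 4-byte big-endian length."""
    body = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack("!I", len(body)) + body)


def recv_exact(sock, size):
    """Read exactly `size` bytes, or raise if the peer closes early."""
    chunks = []
    remaining = size
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_frame(sock):
    """Read one length-prefixed JSON message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length))


def serve(sock, handler, count):
    """Guest side: answer `count` requests on the same connection."""
    for _ in range(count):
        send_frame(sock, handler(recv_frame(sock)))


def demo():
    # socketpair() stands in for the host-to-VM link.
    host, guest = socket.socketpair()
    worker = threading.Thread(
        target=serve, args=(guest, lambda req: {"sum": req["a"] + req["b"]}, 3)
    )
    worker.start()
    replies = []
    for a in range(3):
        send_frame(host, {"a": a, "b": 10})  # same connection every time
        replies.append(recv_frame(host))
    worker.join()
    host.close()
    guest.close()
    return replies


if __name__ == "__main__":
    print(demo())
```

&lt;p&gt;The connection is opened once and then carries every request, so the per-request cost is one write and one read. The length prefix is what lets both sides find message boundaries on a long-lived stream.&lt;/p&gt;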

&lt;h2&gt;
  
  
  Execution flow
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. User deploys function.zip via POST /deploy
2. Control plane builds a minimal rootfs with user code inside
3. Firecracker VM boots, runtime initializes
4. Memory snapshot is created and stored
5. On invocation:
   a. Pull a warm VM from the pool (if available)
   b. If no warm VM → restore from snapshot
   c. Send request via vsock
   d. Runtime executes handler
   e. Response returned to client
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
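
&lt;p&gt;Step 5 is the hot path, so here is a small sketch of it under simplifying assumptions: a single-threaded pool, and a &lt;code&gt;FakeVm&lt;/code&gt; class standing in for a restored microVM. &lt;code&gt;VmPool&lt;/code&gt;, &lt;code&gt;FakeVm&lt;/code&gt;, and &lt;code&gt;invoke&lt;/code&gt; are hypothetical names, not the project's API.&lt;/p&gt;

```python
from collections import deque


class VmPool:
    """Hands out warm VMs first and restores a snapshot only on a miss."""

    def __init__(self, restore_from_snapshot):
        self._warm = deque()
        self._restore = restore_from_snapshot  # hypothetical restore callback
        self.restores = 0

    def acquire(self):
        if self._warm:
            return self._warm.popleft()  # step 5a: reuse a warm VM
        self.restores += 1
        return self._restore()  # step 5b: restore from the snapshot

    def release(self, vm):
        self._warm.append(vm)  # keep the VM warm for the next request


class FakeVm:
    """Stand-in for a restored microVM; a real one would be reached over vsock."""

    def __init__(self, handler):
        self._handler = handler

    def send(self, request):
        return self._handler(request)  # steps 5c and 5d


def invoke(pool, request):
    vm = pool.acquire()
    try:
        return vm.send(request)  # step 5e: hand the response back
    finally:
        pool.release(vm)


def demo():
    pool = VmPool(lambda: FakeVm(lambda req: {"hello": req["name"]}))
    replies = [invoke(pool, {"name": n}) for n in ("a", "b", "c")]
    return replies, pool.restores


if __name__ == "__main__":
    print(demo())
```

&lt;p&gt;Three sequential requests trigger a single restore here: the first one misses the pool, and the next two reuse the VM it left behind.&lt;/p&gt;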



&lt;h2&gt;
  
  
  Benchmark results
&lt;/h2&gt;

&lt;p&gt;Benchmarked with &lt;code&gt;autocannon&lt;/code&gt; — 10 concurrent connections, 30 seconds:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Throughput&lt;/td&gt;
&lt;td&gt;5,400 req/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;p50 latency&lt;/td&gt;
&lt;td&gt;1ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;p99 latency&lt;/td&gt;
&lt;td&gt;4ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Total requests&lt;/td&gt;
&lt;td&gt;164,000&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The key optimizations behind these numbers: snapshot reuse, a persistent runtime across invocations, and lower IPC overhead from connection pooling.&lt;/p&gt;
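
&lt;p&gt;As a rough sanity check, the figures in the table can be cross-checked with a few lines of arithmetic. The mean latency estimate uses Little's Law and assumes every one of the 10 connections always has exactly one request in flight, which is an assumption about the benchmark setup and not something I measured.&lt;/p&gt;

```python
connections = 10
duration_s = 30
total_requests = 164_000

# Throughput implied by the total request count.
throughput = total_requests / duration_s

# Little's Law: requests in flight = throughput * mean latency.
mean_latency_ms = connections / throughput * 1000

print(f"{throughput:.0f} req/sec, about {mean_latency_ms:.2f} ms mean latency")
```

&lt;p&gt;That gives roughly 5,467 req/sec, in line with the rounded 5,400 in the table, and a mean latency of about 1.83ms, which sits between the reported p50 and p99.&lt;/p&gt;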

&lt;h2&gt;
  
  
  The tradeoffs
&lt;/h2&gt;

&lt;p&gt;Nothing is free.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Runtime reuse introduces shared state.&lt;/strong&gt; When you restore from a snapshot and reuse the same runtime across invocations, module-level state in the user's code persists between calls. Isolation at the VM level stays strong, but the runtime itself isn't fully stateless.&lt;/p&gt;

&lt;p&gt;This is the same tradeoff AWS makes. Lambda reuses execution environments between invocations; it just doesn't guarantee reuse, and it doesn't tell you when a new environment is created.&lt;/p&gt;
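
&lt;p&gt;Here is a small illustration of that shared state. The project runs Node.js handlers, so treat this as a Python analogue of the same behavior. The &lt;code&gt;handler&lt;/code&gt; function and the event shape are hypothetical.&lt;/p&gt;

```python
# Module-level state: created once when the runtime loads the handler,
# then captured in the snapshot and reused by every later invocation.
cache = {}
invocation_count = 0


def handler(event):
    global invocation_count
    invocation_count += 1
    # Values written by an earlier call are still visible here.
    previous = cache.get("last_user")
    cache["last_user"] = event["user"]
    return {"count": invocation_count, "previous_user": previous}


if __name__ == "__main__":
    print(handler({"user": "alice"}))
    print(handler({"user": "bob"}))
```

&lt;p&gt;The second call sees the user written by the first one. That is useful for caching a database connection, and a data leak if the handler stores anything request-specific at module level.&lt;/p&gt;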

&lt;p&gt;&lt;strong&gt;Throughput vs. isolation purity.&lt;/strong&gt; You can enforce one invocation per VM (destroy and recreate the VM after every call) for perfect isolation, but throughput tanks. The snapshot model is the practical middle ground.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned
&lt;/h2&gt;

&lt;p&gt;Building this taught me more about OS-level virtualization than any course or book. Specifically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How Firecracker works and why it matters for security&lt;/li&gt;
&lt;li&gt;Why vsock exists and what problem it solves over TCP&lt;/li&gt;
&lt;li&gt;How rootfs construction works at a practical level&lt;/li&gt;
&lt;li&gt;Why the IPC layer is the performance bottleneck in VM-based execution, not the VM itself&lt;/li&gt;
&lt;li&gt;How to think about isolation vs. throughput tradeoffs in real systems&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try it yourself
&lt;/h2&gt;

&lt;p&gt;The full source, architecture diagrams, and setup instructions are on GitHub:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/vivek1504/serverless-runtime" rel="noopener noreferrer"&gt;github.com/vivek1504/serverless-runtime&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A prebuilt kernel image and rootfs are available in the releases so you don't have to build from scratch. You'll need a Linux host with KVM support (&lt;code&gt;/dev/kvm&lt;/code&gt; accessible) and the Firecracker binary in your PATH.&lt;/p&gt;

&lt;p&gt;If you've built something similar or have questions about any part of the implementation, I'm happy to go deeper in the comments.&lt;/p&gt;

</description>
      <category>linux</category>
      <category>infrastructure</category>
      <category>serverless</category>
      <category>firecracker</category>
    </item>
  </channel>
</rss>
