<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Slavik</title>
    <description>The latest articles on DEV Community by Slavik (@vyaslav).</description>
    <link>https://dev.to/vyaslav</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3312037%2F57f63e80-7c4e-4375-b03e-c97512985e3e.jpeg</url>
      <title>DEV Community: Slavik</title>
      <link>https://dev.to/vyaslav</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vyaslav"/>
    <language>en</language>
    <item>
      <title>Building a parenting app with UDP</title>
      <dc:creator>Slavik</dc:creator>
      <pubDate>Tue, 05 Aug 2025 13:38:43 +0000</pubDate>
      <link>https://dev.to/vyaslav/building-a-parenting-app-with-udp-4kn2</link>
      <guid>https://dev.to/vyaslav/building-a-parenting-app-with-udp-4kn2</guid>
      <description>&lt;h2&gt;
  
  
  The Problem That Started It All
&lt;/h2&gt;

&lt;p&gt;When I needed a screen time limiting app for my child's Fire HD device, Amazon's built-in parental controls just didn't cut it for my specific use case. Like many developers faced with a problem, I decided to build my own solution. What started as a simple hardcoded timer has evolved into something much more interesting - a real-time parent control system using UDP communication over the local network.&lt;/p&gt;

&lt;h2&gt;
  
  
  From Simple Beginnings
&lt;/h2&gt;

&lt;p&gt;The first version was beautifully simple: a hardcoded time limit with two buttons for the child to request extensions - one minute or five minutes. I implemented this because, let's be honest, kids &lt;em&gt;always&lt;/em&gt; ask for "just a little bit more" time at the end! &lt;/p&gt;

&lt;p&gt;The app reliably did what it needed to do: block the tablet with an overlay activity once the time credit ran out. Simple and effective.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Evolution: Adding Remote Parent Control
&lt;/h2&gt;

&lt;p&gt;When I decided to open-source the app, I started polishing it and implementing features I'd always thought about but never had the time (or motivation) to build. The most interesting addition? A companion parent app that can communicate with the child's device over the local network.&lt;/p&gt;

&lt;p&gt;Initially, I had implemented admin settings directly in the child app, but this approach had serious drawbacks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cumbersome to use&lt;/strong&gt;: Parents had to physically access the child's device&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Disruptive experience&lt;/strong&gt;: It interrupted the child's interaction with the device&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Impractical&lt;/strong&gt;: Imagine trying to check time limits or grant extensions from across the room!&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Enter UDP: The Perfect Protocol for Local Network Communication
&lt;/h2&gt;

&lt;p&gt;This is where &lt;strong&gt;UDP (User Datagram Protocol)&lt;/strong&gt; comes into play. But first, let's understand what UDP actually is:&lt;/p&gt;

&lt;h3&gt;
  
  
  What is UDP?
&lt;/h3&gt;

&lt;p&gt;UDP is a communication protocol that's like sending postcards between devices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fast and lightweight&lt;/strong&gt;: No handshaking or connection establishment needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Broadcast capable&lt;/strong&gt;: One device can send a single message to every device on the local network segment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simple&lt;/strong&gt;: Perfect for local network communication where speed matters more than guaranteed delivery&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fire-and-forget&lt;/strong&gt;: Send a message and move on - ideal for real-time controls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Unlike TCP (which is like a phone call - establishing a connection and ensuring every word is heard), UDP is more like shouting across a room - quick, direct, and perfect for commands like "check time remaining" or "extend time by 5 minutes."&lt;/p&gt;

&lt;h3&gt;
  
  
  Why UDP is Perfect for Parent-Child Device Control
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Discovery&lt;/strong&gt;: The parent app can broadcast "Is there a child device here?" and any child devices respond&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time commands&lt;/strong&gt;: Instant blocking, time extensions, or status checks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No complex setup&lt;/strong&gt;: No pairing, no account creation - just connect to the same WiFi&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local network only&lt;/strong&gt;: Commands stay within your home network for privacy&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Technical Implementation
&lt;/h2&gt;

&lt;p&gt;Here's how the system works:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Device Discovery
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Parent App → Broadcasts: "CST_PARENT_DISCOVERY" on port 8888
Child Device → Responds: "CST_CHILD_RESPONSE" with device info
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
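&lt;p&gt;To make the exchange concrete, here is a minimal Python sketch of the same request/response pattern. It is an illustration only, not the app's Java code: the function names are made up, and the demo runs both roles on the loopback interface with an OS-assigned port instead of broadcasting on port 8888.&lt;/p&gt;

```python
import socket
import threading

DISCOVERY_PORT = 8888  # port named in the article; the demo below does not bind it
DISCOVERY_REQUEST = b"CST_PARENT_DISCOVERY"
DISCOVERY_RESPONSE = b"CST_CHILD_RESPONSE"


def serve_one_discovery(sock):
    """Child side: answer a single discovery datagram, ignore anything else."""
    data, sender = sock.recvfrom(1024)
    if data == DISCOVERY_REQUEST:
        sock.sendto(DISCOVERY_RESPONSE, sender)


def discover(target, timeout=2.0):
    """Parent side: send the discovery message and wait for one reply.

    Returns (reply_bytes, child_address), or None if nobody answered in time.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # SO_BROADCAST must be set before sending to a broadcast address
        # such as ("255.255.255.255", DISCOVERY_PORT).
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.settimeout(timeout)
        sock.sendto(DISCOVERY_REQUEST, target)
        try:
            return sock.recvfrom(1024)
        except socket.timeout:
            return None


def demo():
    """Run both roles on the loopback interface (hypothetical test setup)."""
    child = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    child.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    child.settimeout(2.0)
    worker = threading.Thread(target=serve_one_discovery, args=(child,))
    worker.start()
    result = discover(child.getsockname())
    worker.join()
    child.close()
    return result
```

&lt;p&gt;On a real network the parent would call &lt;code&gt;discover(("255.255.255.255", DISCOVERY_PORT))&lt;/code&gt; and the child would bind to that port on all interfaces. The reply's source address tells the parent where to send later commands.&lt;/p&gt;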



&lt;h3&gt;
  
  
  2. Secure Command Communication
&lt;/h3&gt;

&lt;p&gt;All commands are encrypted using AES-CBC with device ID-based keys:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Parent App → Sends: "CST_CMD:[encrypted_command]"
Child Device → Responds: "CST_RESP:[encrypted_response]"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Available Commands
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;GET_TIME_LEFT&lt;/code&gt;: Check remaining screen time&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;LOCK_DEVICE&lt;/code&gt;: Immediately block the device&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;EXTEND_TIME:30&lt;/code&gt;: Add 30 minutes to the time limit&lt;/li&gt;
&lt;/ul&gt;
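&lt;p&gt;Once a command has been decrypted, handling it is plain string dispatch. The Python sketch below shows one way this could look. The state dictionary and the reply strings (&lt;code&gt;TIME_LEFT:&lt;/code&gt;, &lt;code&gt;OK&lt;/code&gt;, &lt;code&gt;ERROR:...&lt;/code&gt;) are assumptions made for illustration, not the app's actual wire format.&lt;/p&gt;

```python
def process_command(command, state):
    """Apply one decrypted command to `state` and return a reply string.

    `state` is a hypothetical dict with "minutes_left" (int) and "locked" (bool).
    """
    name, _, argument = command.partition(":")
    if name == "GET_TIME_LEFT":
        return "TIME_LEFT:" + str(state["minutes_left"])
    if name == "LOCK_DEVICE":
        state["locked"] = True
        return "OK"
    if name == "EXTEND_TIME":
        # Accept only a positive whole number of minutes.
        if not argument.isdecimal() or int(argument) == 0:
            return "ERROR:bad_argument"
        state["minutes_left"] += int(argument)
        return "OK"
    return "ERROR:unknown_command"
```

&lt;p&gt;Validating the argument before using it matters here, because anything that arrives over the network should be treated as untrusted input.&lt;/p&gt;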

&lt;h3&gt;
  
  
  4. Real-World Usage Flow
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Parent opens the companion app&lt;/strong&gt; on their phone/tablet&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;App automatically discovers&lt;/strong&gt; child devices on the network&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parent can see&lt;/strong&gt; real-time status: time remaining, current state (active/blocked)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Instant control&lt;/strong&gt;: Block immediately or grant time extensions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No disruption&lt;/strong&gt; to the child's experience&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Security Layer
&lt;/h2&gt;

&lt;p&gt;Since this involves controlling a child's device, security is paramount:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AES-CBC Encryption&lt;/strong&gt;: All commands are encrypted using device-specific keys&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local network only&lt;/strong&gt;: Commands never leave your home WiFi&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Device ID-based keys&lt;/strong&gt;: Each device has unique encryption keys&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No internet required&lt;/strong&gt;: Everything works offline&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Code Deep Dive: The Discovery Service
&lt;/h2&gt;

&lt;p&gt;Here's a simplified version of how the child device handles incoming messages from the parent app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ParentDiscoveryService&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Service&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="no"&gt;DISCOVERY_PORT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;8888&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;DISCOVERY_REQUEST&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"CST_PARENT_DISCOVERY"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;DISCOVERY_RESPONSE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"CST_CHILD_RESPONSE"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;handleIncomingMessage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;InetAddress&lt;/span&gt; &lt;span class="n"&gt;sender&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;DISCOVERY_REQUEST&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Respond to discovery&lt;/span&gt;
            &lt;span class="n"&gt;sendResponse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;DISCOVERY_RESPONSE&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sender&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;startsWith&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"CST_CMD:"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Handle encrypted command&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;encrypted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;substring&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;decrypted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;securityManager&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;decryptMessage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encrypted&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;processCommand&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;decrypted&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="c1"&gt;// Send encrypted response back&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;p&gt;Building this system taught me several valuable lessons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Simple protocols work best&lt;/strong&gt;: UDP's simplicity made implementation straightforward&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security can't be an afterthought&lt;/strong&gt;: Encryption was essential for a family app&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User experience drives architecture&lt;/strong&gt;: The need for non-disruptive parent control shaped the entire design&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local-first is powerful&lt;/strong&gt;: No internet dependency makes the system more reliable&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Technical Challenges
&lt;/h2&gt;

&lt;p&gt;Of course, it wasn't all smooth sailing:&lt;/p&gt;

&lt;h3&gt;
  
  
  Challenge 1: Service Persistence
&lt;/h3&gt;

&lt;p&gt;Android aggressively kills background services. The solution? Multiple layers of protection:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Foreground services with proper notification channels&lt;/li&gt;
&lt;li&gt;AlarmManager for restart scheduling&lt;/li&gt;
&lt;li&gt;WorkManager for periodic health checks&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Challenge 2: Network Discovery
&lt;/h3&gt;

&lt;p&gt;UDP broadcast doesn't always work reliably across all WiFi configurations. The solution:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Retry logic with exponential backoff&lt;/li&gt;
&lt;li&gt;Multiple discovery attempts&lt;/li&gt;
&lt;li&gt;Graceful degradation&lt;/li&gt;
&lt;/ul&gt;
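&lt;p&gt;As a rough illustration of the retry idea, the Python sketch below doubles the wait between discovery attempts up to a cap. The function names, the default timings, and the &lt;code&gt;try_once&lt;/code&gt; callback are assumptions for this example, not values taken from the app.&lt;/p&gt;

```python
import time


def backoff_delays(count, base=0.5, cap=8.0):
    """Return `count` waits: base, 2*base, 4*base, ... each limited to `cap`."""
    return [min(cap, base * 2 ** i) for i in range(count)]


def discover_with_retry(try_once, attempts=5, base=0.5, cap=8.0, sleep=time.sleep):
    """Call try_once() until it returns a non-None result or attempts run out."""
    delays = backoff_delays(attempts - 1, base, cap)
    for attempt in range(attempts):
        result = try_once()
        if result is not None:
            return result
        if attempt != attempts - 1:
            sleep(delays[attempt])  # wait longer after each failure
    return None  # graceful degradation: the caller can show "no device found"
```

&lt;p&gt;Passing &lt;code&gt;sleep&lt;/code&gt; as a parameter keeps the function easy to test without real waiting.&lt;/p&gt;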

&lt;h3&gt;
  
  
  Challenge 3: Encryption Key Management
&lt;/h3&gt;

&lt;p&gt;Securely sharing encryption keys between devices without user complexity:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Device ID-based key derivation&lt;/li&gt;
&lt;li&gt;SHA-256 hashing of the device ID, so both apps derive the same key&lt;/li&gt;
&lt;li&gt;No manual key exchange required&lt;/li&gt;
&lt;/ul&gt;
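&lt;p&gt;The derivation idea can be sketched in a few lines of Python. This assumes the key is simply the SHA-256 digest of the device ID string; the app's real derivation may differ, and &lt;code&gt;derive_key&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```python
import hashlib


def derive_key(device_id):
    """Hash the device ID with SHA-256 to get a 32-byte key (AES-256 size)."""
    return hashlib.sha256(device_id.encode("utf-8")).digest()


# Both apps compute the same key from the same ID, so nothing is exchanged.
parent_key = derive_key("example-device-id")
child_key = derive_key("example-device-id")
```

&lt;p&gt;The trade-off of this scheme is that the key is only as secret as the device ID itself.&lt;/p&gt;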

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;The UDP communication system opens up exciting possibilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-device management&lt;/strong&gt;: Control multiple child devices from one parent app&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Usage analytics&lt;/strong&gt;: Real-time monitoring and historical data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Family coordination&lt;/strong&gt;: Multiple parent devices controlling the same child device&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;The project is open source and &lt;a href="https://github.com/childscreentime/cst" rel="noopener noreferrer"&gt;available on GitHub&lt;/a&gt;. The UDP communication system demonstrates how simple protocols can solve complex real-world problems elegantly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Sometimes simple is better&lt;/strong&gt;: UDP's simplicity made it perfect for this use case&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local network communication is underutilized&lt;/strong&gt;: Many problems can be solved without cloud services&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User experience should drive technical decisions&lt;/strong&gt;: The need for non-disruptive control shaped the entire architecture&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security matters in family apps&lt;/strong&gt;: Encryption is essential, even for local communication&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Android background services require careful handling&lt;/strong&gt;: Multiple protection layers are necessary&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;&lt;em&gt;Have you built similar local network communication systems? What protocols and patterns have worked well for you? Share your experiences in the comments!&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;UDP in Android&lt;/strong&gt;: &lt;a href="https://developer.android.com/guide/topics/connectivity/network-ops" rel="noopener noreferrer"&gt;Android Developer Guide&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Foreground Services&lt;/strong&gt;: &lt;a href="https://developer.android.com/guide/background" rel="noopener noreferrer"&gt;Best Practices for Background Work&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AES Encryption in Java&lt;/strong&gt;: &lt;a href="https://docs.oracle.com/javase/8/docs/technotes/guides/security/crypto/CryptoSpec.html" rel="noopener noreferrer"&gt;Java Cryptography Architecture&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>mobile</category>
      <category>network</category>
      <category>java</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Building a Toy SSTable Storage Engine in Python</title>
      <dc:creator>Slavik</dc:creator>
      <pubDate>Tue, 01 Jul 2025 08:00:53 +0000</pubDate>
      <link>https://dev.to/vyaslav/building-a-toy-sstable-storage-engine-in-python-a28</link>
      <guid>https://dev.to/vyaslav/building-a-toy-sstable-storage-engine-in-python-a28</guid>
      <description>&lt;p&gt;Have you ever wondered how modern databases like LevelDB, RocksDB, or Cassandra store and retrieve massive amounts of data efficiently? The secret sauce is often a data structure called the &lt;strong&gt;Log-Structured Merge-Tree (LSM-Tree)&lt;/strong&gt; and its core component, the &lt;strong&gt;Sorted String Table (SSTable)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this post, we’ll build a toy, educational SSTable-based storage engine in Python, inspired by Martin Kleppmann’s &lt;em&gt;Designing Data-Intensive Applications&lt;/em&gt;. We’ll start simple and gradually add complexity, so you can follow along even if you’re new to storage internals!&lt;/p&gt;




&lt;h2&gt;
  
  
  What Are LSM-Trees and SSTables?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;LSM-Tree&lt;/strong&gt; stands for &lt;strong&gt;Log-Structured Merge-Tree&lt;/strong&gt;. It’s a data structure designed to make writing data to disk very fast and efficient.&lt;/p&gt;

&lt;p&gt;An &lt;strong&gt;SSTable&lt;/strong&gt; is a file format for storing a large set of key-value pairs on disk, sorted by key. The key properties are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sorted&lt;/strong&gt;: All keys are stored in order, making range queries and binary search possible.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Immutable&lt;/strong&gt;: Once written, SSTables are never modified. New data is written to new files.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficient&lt;/strong&gt;: Combined with an in-memory memtable, SSTables enable fast writes and reasonably fast reads.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;SSTables are the building blocks behind many modern, high-performance databases like LevelDB, RocksDB, and Cassandra.&lt;/p&gt;

&lt;h3&gt;
  
  
  How Does It Work?
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Writes go to memory first:&lt;/strong&gt;
When you add or update data, it’s first stored in a fast, in-memory structure (called a &lt;em&gt;memtable&lt;/em&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flush to disk as SSTable:&lt;/strong&gt;
When the memtable gets full, all its data is written to disk as a new SSTable file. These files are never changed after being written.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reads check memory and disk:&lt;/strong&gt;
When you read data, the system first checks the memtable, then searches through the SSTables on disk.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  How Is This Different from Other Approaches?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Traditional databases&lt;/strong&gt; (like those using B-Trees) update data in place on disk. This means lots of small, random writes, which can be slow on hard drives and even SSDs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LSM-Trees&lt;/strong&gt; always write new data in large, sequential chunks, which is much faster for disks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SSTables&lt;/strong&gt; are immutable, so there’s no need to lock files for writing, and old data can be cleaned up later in the background (a process called &lt;em&gt;compaction&lt;/em&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why Use LSM-Trees and SSTables?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fast writes:&lt;/strong&gt; Great for applications that need to handle lots of inserts and updates quickly (like logs, metrics, or time series data).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficient storage:&lt;/strong&gt; Sequential disk writes are much faster than random writes and tend to cause less wear on SSDs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability:&lt;/strong&gt; LSM-Trees can handle huge amounts of data by merging and compacting SSTables in the background.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Project Structure
&lt;/h2&gt;

&lt;p&gt;Here’s what we’ll build (and you can find the &lt;a href="https://github.com/vyaslav/py_sstable" rel="noopener noreferrer"&gt;full code on GitHub&lt;/a&gt;):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;memtable.py&lt;/code&gt;: An in-memory, sorted key-value store.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sstable_writer.py&lt;/code&gt;: Writes sorted key-value pairs to disk as an SSTable, with a sparse index and Bloom filter.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sstable_reader.py&lt;/code&gt;: Reads from SSTables using the index and Bloom filter.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sstable.py&lt;/code&gt;: Orchestrates the LSM-Tree logic, combining memtable and SSTables.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;simple_bloom_filter.py&lt;/code&gt;: A simple Bloom filter for fast negative lookups.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sstable_server.py&lt;/code&gt;: A UNIX socket server exposing set/get operations.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;main.py&lt;/code&gt;: A CLI client to interact with the server.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;stress_test.py&lt;/code&gt;: A script to stress test the system.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Step 1: The Memtable – Fast In-Memory Writes
&lt;/h2&gt;

&lt;p&gt;When you write data, it first lands in the &lt;strong&gt;memtable&lt;/strong&gt;, a sorted, in-memory structure. In our Python version, we use a sorted list and the &lt;code&gt;bisect&lt;/code&gt; module, which finds a key or its insertion point by binary search.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# memtable.py
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Memtable&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Insert or update, keeping the list sorted
&lt;/span&gt;        &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Binary search for fast lookup
&lt;/span&gt;        &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
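&lt;p&gt;One possible way to fill in that skeleton is shown below. It is a sketch, not the code from the repository: it keeps keys and values in two parallel lists (an assumption made so &lt;code&gt;bisect&lt;/code&gt; can search the keys directly) and adds hypothetical &lt;code&gt;items&lt;/code&gt; and &lt;code&gt;__len__&lt;/code&gt; helpers for flushing.&lt;/p&gt;

```python
import bisect


class Memtable:
    def __init__(self):
        self._keys = []    # sorted list of keys
        self._values = []  # values, kept parallel to self._keys

    def set(self, key, value):
        """Insert a new key in sorted position, or overwrite an existing one."""
        i = bisect.bisect_left(self._keys, key)
        if i != len(self._keys) and self._keys[i] == key:
            self._values[i] = value
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def get(self, key):
        """Binary search for the key; return None when it is absent."""
        i = bisect.bisect_left(self._keys, key)
        if i != len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def items(self):
        """Return all pairs in key order, ready to be written to an SSTable."""
        return list(zip(self._keys, self._values))

    def __len__(self):
        return len(self._keys)
```

&lt;p&gt;Lookups take logarithmic time, while each insert still shifts list elements. That is fine for a toy; production engines typically use a skip list or a balanced tree instead.&lt;/p&gt;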



&lt;p&gt;When the memtable gets too big, we &lt;strong&gt;flush&lt;/strong&gt; it to disk as a new SSTable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2: Writing SSTables – Persistence and Order
&lt;/h2&gt;

&lt;p&gt;Flushing the memtable means writing all its sorted key-value pairs to a file. But how do we make reads efficient?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sparse Index&lt;/strong&gt;: Every Nth key and its file offset are written to an index file. This lets us quickly jump to the right part of the SSTable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bloom Filter&lt;/strong&gt;: A probabilistic data structure that tells us if a key is &lt;em&gt;definitely not&lt;/em&gt; in the file, saving unnecessary disk reads. It can return false positives ("maybe present") but never false negatives.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# sstable_writer.py
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SSTableWriter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sorted_kv_pairs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Write data lines and build sparse index
&lt;/span&gt;        &lt;span class="c1"&gt;# Serialize and store the Bloom filter
&lt;/span&gt;        &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
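&lt;p&gt;To show what the Bloom filter does, here is a small standalone sketch. The class name, the sizes, and the use of salted SHA-256 hashes are assumptions for illustration and may not match &lt;code&gt;simple_bloom_filter.py&lt;/code&gt; in the repository. It uses one byte per slot instead of packed bits to keep the code easy to read.&lt;/p&gt;

```python
import hashlib


class SimpleBloomFilter:
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.slots = bytearray(size)  # one byte per slot, 0 or 1

    def _positions(self, key):
        """Yield num_hashes slot positions derived from salted hashes of key."""
        for seed in range(self.num_hashes):
            salted = (str(seed) + ":" + key).encode("utf-8")
            digest = hashlib.sha256(salted).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.slots[pos] = 1

    def might_contain(self, key):
        """False means definitely absent; True means possibly present."""
        return all(self.slots[pos] for pos in self._positions(key))
```

&lt;p&gt;A writer would call &lt;code&gt;add&lt;/code&gt; for every key it stores, and a reader would skip the file whenever &lt;code&gt;might_contain&lt;/code&gt; returns &lt;code&gt;False&lt;/code&gt;.&lt;/p&gt;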






&lt;h2&gt;
  
  
  Step 3: Reading SSTables – Fast Lookups
&lt;/h2&gt;

&lt;p&gt;When you want to read a key:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Check the memtable&lt;/strong&gt; (fastest).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check the Bloom filter&lt;/strong&gt; for each SSTable (quickly skip files that don’t have the key).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use the sparse index&lt;/strong&gt; to jump to the right spot in the SSTable file and scan for the key.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# sstable_reader.py
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SSTableReader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Use Bloom filter and sparse index to minimize disk I/O
&lt;/span&gt;        &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
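&lt;p&gt;The sparse index lookup from step 3 can be sketched as follows. To stay self-contained, this example writes tab-separated lines into an in-memory buffer rather than a real file, and the function names and line format are assumptions, not the repository's actual on-disk layout.&lt;/p&gt;

```python
import bisect
import io


def write_sstable(sorted_kv_pairs, every=2):
    """Serialize pairs as tab-separated lines and index every `every`-th key.

    Returns (data_bytes, index) where index is a sorted list of (key, offset).
    """
    buf = io.BytesIO()
    index = []
    for i, (key, value) in enumerate(sorted_kv_pairs):
        if i % every == 0:
            index.append((key, buf.tell()))  # byte offset where this line starts
        buf.write((key + "\t" + value + "\n").encode("utf-8"))
    return buf.getvalue(), index


def lookup(data, index, key):
    """Seek to the closest indexed key at or before `key`, then scan forward."""
    keys = [k for k, _ in index]
    i = bisect.bisect_right(keys, key) - 1
    if i == -1:
        return None  # key sorts before the first key in the file
    buf = io.BytesIO(data)
    buf.seek(index[i][1])
    for raw in buf:
        k, _, v = raw.decode("utf-8").rstrip("\n").partition("\t")
        if k == key:
            return v
        if max(k, key) == k:
            return None  # scanned past the place where key would be
    return None
```

&lt;p&gt;Because the data is sorted, the scan can stop as soon as it sees a key larger than the one requested, so at most &lt;code&gt;every&lt;/code&gt; lines are read before a hit or a miss is known, unless the key falls after the last indexed entry.&lt;/p&gt;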






&lt;h2&gt;
  
  
  Step 4: The LSM-Tree – Orchestrating Everything
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;SSTable&lt;/code&gt; class manages the memtable, SSTable files, and the index cache. It handles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Set&lt;/strong&gt;: Write to memtable, flush to SSTable when full.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Get&lt;/strong&gt;: Check memtable, then SSTables from newest to oldest.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# sstable.py
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SSTable&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
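&lt;p&gt;Stripped of files, indexes, and Bloom filters, the orchestration logic is small. The sketch below is a simplified stand-in for the &lt;code&gt;SSTable&lt;/code&gt; class, using a hypothetical &lt;code&gt;TinyLSM&lt;/code&gt; name, a dict as the memtable, and in-memory tuples in place of SSTable files.&lt;/p&gt;

```python
class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.memtable = {}
        self.sstables = []  # oldest first; each is an immutable sorted tuple

    def set(self, key, value):
        """Write to the memtable and flush once it reaches the size limit."""
        self.memtable[key] = value
        if len(self.memtable) == self.memtable_limit:
            self.flush()

    def flush(self):
        """Freeze the memtable into a new sorted table and start a fresh one."""
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key):
        """Check the memtable first, then tables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None
```

&lt;p&gt;Searching newest to oldest is what makes updates work: an older table may still hold a stale value for the same key, but the newer one is found first.&lt;/p&gt;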






&lt;h2&gt;
  
  
  Step 5: Server and CLI – Putting It All Together
&lt;/h2&gt;

&lt;p&gt;We expose our storage engine via a simple UNIX socket server (&lt;code&gt;sstable_server.py&lt;/code&gt;). You can interact with it using the CLI (&lt;code&gt;main.py&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; sstable_server   &lt;span class="c"&gt;# Start the server&lt;/span&gt;
python &lt;span class="nt"&gt;-m&lt;/span&gt; main &lt;span class="nb"&gt;set &lt;/span&gt;mykey 123
python &lt;span class="nt"&gt;-m&lt;/span&gt; main get mykey
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 6: Stress Testing
&lt;/h2&gt;

&lt;p&gt;How does it perform? The &lt;code&gt;stress_test.py&lt;/code&gt; script:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Starts the server&lt;/li&gt;
&lt;li&gt;Inserts 1000 random key-value pairs&lt;/li&gt;
&lt;li&gt;Reads them all back and prints the sum and average
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python stress_test.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Why Does This Matter?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Write-Optimized&lt;/strong&gt;: LSM-Trees and SSTables are designed for fast, sequential writes—perfect for write-heavy workloads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficient Reads&lt;/strong&gt;: Sparse indexes and Bloom filters keep reads fast, even as data grows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-World Use&lt;/strong&gt;: These ideas power LevelDB, RocksDB, Cassandra, and more.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Next Steps
&lt;/h2&gt;

&lt;p&gt;This project is a &lt;strong&gt;toy&lt;/strong&gt;—but it’s a great way to learn! You can extend it by adding:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compaction (merging old SSTables)&lt;/li&gt;
&lt;li&gt;Range queries&lt;/li&gt;
&lt;li&gt;Deletion markers (tombstones)&lt;/li&gt;
&lt;li&gt;Compression&lt;/li&gt;
&lt;/ul&gt;
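&lt;p&gt;As a starting point for the first and third ideas, here is a hedged sketch of compaction with tombstones. It assumes SSTables are available as lists of pairs and uses &lt;code&gt;None&lt;/code&gt; as a hypothetical deletion marker. It also loads everything into memory, which a real implementation would avoid by merging the sorted files as streams.&lt;/p&gt;

```python
TOMBSTONE = None  # hypothetical marker meaning "this key was deleted"


def compact(sstables):
    """Merge SSTables (given oldest first) into one sorted table.

    Newer tables win for duplicate keys, and keys whose latest value is
    TOMBSTONE are dropped from the result.
    """
    latest = {}
    for table in sstables:
        for key, value in table:
            latest[key] = value  # later (newer) tables overwrite earlier ones
    return [(k, v) for k, v in sorted(latest.items()) if v is not TOMBSTONE]
```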




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Building your own SSTable-based storage engine is a fantastic way to understand the internals of modern databases. By starting simple and adding complexity, you’ll gain intuition for how real-world systems handle massive data efficiently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Check out the &lt;a href="https://github.com/vyaslav/py_sstable" rel="noopener noreferrer"&gt;full code on GitHub&lt;/a&gt; and try it yourself!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>database</category>
      <category>datastructures</category>
      <category>development</category>
    </item>
  </channel>
</rss>
