<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Tejas Rastogi</title>
    <description>The latest articles on DEV Community by Tejas Rastogi (@tejas_rastogi_6d73fa2a7a3).</description>
    <link>https://dev.to/tejas_rastogi_6d73fa2a7a3</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3726669%2F98a8877c-d42a-4c98-8c7e-ad9acf2aac2b.png</url>
      <title>DEV Community: Tejas Rastogi</title>
      <link>https://dev.to/tejas_rastogi_6d73fa2a7a3</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tejas_rastogi_6d73fa2a7a3"/>
    <language>en</language>
    <item>
      <title>Sentry</title>
      <dc:creator>Tejas Rastogi</dc:creator>
      <pubDate>Thu, 22 Jan 2026 21:23:34 +0000</pubDate>
      <link>https://dev.to/tejas_rastogi_6d73fa2a7a3/sentry-1kna</link>
      <guid>https://dev.to/tejas_rastogi_6d73fa2a7a3/sentry-1kna</guid>
      <description>&lt;p&gt;Building Sentry: A Distributed Message Broker in Go (From Scratch)&lt;br&gt;
I’m learning systems engineering and system design by doing —&lt;br&gt;
not by watching tutorials, not by drawing diagrams, but by actually building infrastructure.&lt;br&gt;
This blog documents my journey building Sentry, a Kafka-inspired distributed message broker written in Go, from scratch.&lt;br&gt;
Not a wrapper.&lt;br&gt;
Not a framework experiment.&lt;br&gt;
Not a CRUD backend.&lt;br&gt;
A real broker: raw TCP, binary wire protocol, append-only logs, offset indexing, crash recovery, and concurrency at scale.&lt;br&gt;
Why I Started This Project&lt;br&gt;
Most backend engineers (including me, earlier) live behind abstractions:&lt;br&gt;
HTTP frameworks&lt;br&gt;
ORMs&lt;br&gt;
Message queues as black boxes&lt;br&gt;
Managed services that “just work”&lt;br&gt;
But at some point I realized something uncomfortable:&lt;br&gt;
I could use Kafka, Redis, and RabbitMQ…&lt;br&gt;
but I had no real idea how they worked internally.&lt;br&gt;
So I asked myself:&lt;br&gt;
How does a broker actually store messages on disk?&lt;br&gt;
How are offsets tracked?&lt;br&gt;
What happens when the process crashes mid-write?&lt;br&gt;
How do consumers resume safely?&lt;br&gt;
How does a binary protocol work over raw TCP?&lt;br&gt;
What breaks under network lag or partial writes?&lt;br&gt;
And that’s where Sentry was born.&lt;br&gt;
What Is Sentry?&lt;br&gt;
Sentry is a high-performance, Kafka-like distributed message broker written in Go.&lt;br&gt;
It focuses on:&lt;br&gt;
Simplicity over features&lt;br&gt;
Deterministic behavior&lt;br&gt;
Low-level correctness&lt;br&gt;
Failure-first design&lt;br&gt;
Observability via logs&lt;br&gt;
Learning by implementation&lt;br&gt;
The goal is not to replace Kafka.&lt;br&gt;
The goal is to understand Kafka-class systems by building one.&lt;br&gt;
High-Level Architecture&lt;br&gt;
At a high level, Sentry has these core layers:&lt;br&gt;
Network Layer&lt;br&gt;
Raw TCP server&lt;br&gt;
Per-connection goroutines&lt;br&gt;
Binary protocol decoding&lt;br&gt;
Protocol Layer&lt;br&gt;
Custom wire protocol&lt;br&gt;
Length-prefixed frames&lt;br&gt;
Correlation IDs&lt;br&gt;
Versioning support&lt;br&gt;
Broker Core&lt;br&gt;
Message routing&lt;br&gt;
Partition selection&lt;br&gt;
Offset assignment&lt;br&gt;
Persistence Layer&lt;br&gt;
Append-only log segments&lt;br&gt;
Offset → byte index files&lt;br&gt;
Time-based index files&lt;br&gt;
Crash-safe replay&lt;br&gt;
Consumer Layer&lt;br&gt;
Offset-based reads&lt;br&gt;
Replay semantics&lt;br&gt;
Deterministic fetch order&lt;br&gt;
Each layer is explicit.&lt;br&gt;
Nothing is hidden behind magic.&lt;br&gt;
Custom Binary Wire Protocol&lt;br&gt;
One of the first things I built was the wire protocol.&lt;br&gt;
Instead of HTTP or gRPC, Sentry uses a custom binary protocol over raw TCP.&lt;br&gt;
Why?&lt;br&gt;
Because production brokers don’t speak JSON over HTTP.&lt;br&gt;
They speak:&lt;br&gt;
Framed binary messages&lt;br&gt;
Big-endian encoded integers&lt;br&gt;
Compact payloads&lt;br&gt;
Deterministic layouts&lt;br&gt;
Versioned schemas&lt;br&gt;
Protocol Design&lt;br&gt;
Each request frame looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;| Frame Length   (4 bytes) |
| Message Type   (2 bytes) |
| Correlation ID (4 bytes) |
| Payload        (N bytes) |
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Key features:&lt;br&gt;
Length-prefixed framing&lt;br&gt;
So partial TCP reads can be reassembled correctly.&lt;br&gt;
Correlation IDs&lt;br&gt;
So responses match requests in concurrent connections.&lt;br&gt;
Big-endian encoding&lt;br&gt;
For deterministic cross-platform decoding.&lt;br&gt;
Version field (planned)&lt;br&gt;
For forward compatibility.&lt;br&gt;
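To make this concrete, here is a minimal sketch of how such a frame could be encoded and decoded in Go. The Frame type and the WriteFrame/ReadFrame names are illustrative, not Sentry’s actual API; only the layout above comes from the design.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;package protocol

import (
    "encoding/binary"
    "fmt"
    "io"
)

// Frame mirrors the wire layout above: message type, correlation ID, payload.
// The 4-byte length prefix is written and read separately.
type Frame struct {
    Type          uint16
    CorrelationID uint32
    Payload       []byte
}

// WriteFrame encodes a frame with big-endian integers and a length prefix.
// The length counts everything after the prefix: type + correlation ID + payload.
func WriteFrame(w io.Writer, f Frame) error {
    buf := make([]byte, 10+len(f.Payload))
    binary.BigEndian.PutUint32(buf[0:4], uint32(6+len(f.Payload)))
    binary.BigEndian.PutUint16(buf[4:6], f.Type)
    binary.BigEndian.PutUint32(buf[6:10], f.CorrelationID)
    copy(buf[10:], f.Payload)
    _, err := w.Write(buf)
    return err
}

// ReadFrame reassembles exactly one frame even when TCP delivers it in pieces:
// io.ReadFull keeps reading until the expected number of bytes has arrived.
func ReadFrame(r io.Reader) (Frame, error) {
    var lenBuf [4]byte
    if _, err := io.ReadFull(r, lenBuf[:]); err != nil {
        return Frame{}, err
    }
    length := binary.BigEndian.Uint32(lenBuf[:])
    if length &lt; 6 {
        return Frame{}, fmt.Errorf("frame too short: %d bytes", length)
    }
    body := make([]byte, length)
    if _, err := io.ReadFull(r, body); err != nil {
        return Frame{}, err
    }
    return Frame{
        Type:          binary.BigEndian.Uint16(body[0:2]),
        CorrelationID: binary.BigEndian.Uint32(body[2:6]),
        Payload:       body[6:],
    }, nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;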
Concurrency Model&lt;br&gt;
Sentry uses Go’s strengths:&lt;br&gt;
Goroutines&lt;br&gt;
Channels&lt;br&gt;
Mutexes&lt;br&gt;
Worker pools&lt;br&gt;
Ingress Model&lt;br&gt;
One goroutine per TCP connection&lt;br&gt;
Each connection reads frames&lt;br&gt;
Frames are pushed into a bounded worker pool&lt;br&gt;
Workers decode and route requests&lt;br&gt;
This prevents:&lt;br&gt;
Unbounded goroutine growth&lt;br&gt;
Memory pressure&lt;br&gt;
Head-of-line blocking&lt;br&gt;
Backpressure&lt;br&gt;
If the worker pool queue is full:&lt;br&gt;
New requests block&lt;br&gt;
Producers slow down&lt;br&gt;
The system protects itself&lt;br&gt;
No silent overload.&lt;br&gt;
No hidden memory leaks.&lt;br&gt;
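A minimal sketch of this ingress model in Go is shown below. The request type and the serve, readFrame, and handle names are hypothetical, used only to show the shape of the pattern; readFrame and handle are left as placeholders for the protocol and broker layers.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Sketch of the ingress pattern: one reader goroutine per TCP connection and a
// bounded worker pool fed through a channel.
package broker

import "net"

type request struct {
    conn  net.Conn
    frame []byte
}

func readFrame(c net.Conn) ([]byte, error) { return nil, nil } // placeholder: length-prefixed read
func handle(r request)                     {}                  // placeholder: decode frame, route to broker core

func serve(ln net.Listener, workers int) {
    queue := make(chan request, 1024) // bounded queue: a full queue is the backpressure signal

    // Fixed pool of workers: decode and route requests from the queue.
    for i := 0; i &lt; workers; i++ {
        go func() {
            for req := range queue {
                handle(req)
            }
        }()
    }

    for {
        conn, err := ln.Accept()
        if err != nil {
            return
        }
        // One goroutine per connection: read frames and push them into the queue.
        go func(c net.Conn) {
            defer c.Close()
            for {
                frame, err := readFrame(c)
                if err != nil {
                    return // client disconnect or read error
                }
                // Blocks when the queue is full, which slows producers down.
                queue &lt;- request{conn: c, frame: frame}
            }
        }(conn)
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;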
Persistence: Append-Only Log Segments&lt;br&gt;
This is the heart of the system.&lt;br&gt;
Every topic partition in Sentry is backed by:&lt;br&gt;
A .log file → message bytes&lt;br&gt;
A .index file → offset → byte position&lt;br&gt;
A .timeindex file → timestamp → offset&lt;br&gt;
Why Append-Only?&lt;br&gt;
Because append-only logs give you:&lt;br&gt;
Crash safety&lt;br&gt;
Sequential disk writes&lt;br&gt;
High throughput&lt;br&gt;
Simple replay&lt;br&gt;
Deterministic ordering&lt;br&gt;
Segment Structure&lt;br&gt;
Each segment is defined by:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;type Segment struct {
    baseOffset uint64   // first offset stored in this segment
    log        *os.File // append-only message bytes (.log)
    index      *os.File // offset → byte position entries (.index)
    timeIndex  *os.File // timestamp → offset entries (.timeindex)
    topic      string
    partition  uint32
    active     bool     // true while this segment accepts new writes
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Writing a Message&lt;br&gt;
The write path:&lt;br&gt;
Append message bytes to .log&lt;br&gt;
Record (relativeOffset, bytePosition) in .index&lt;br&gt;
Record (timestamp, relativeOffset) in .timeindex&lt;br&gt;
fsync the log file&lt;br&gt;
Return offset to producer&lt;br&gt;
This ensures:&lt;br&gt;
Durability&lt;br&gt;
Deterministic offsets&lt;br&gt;
Replay safety&lt;br&gt;
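A minimal sketch of this write path, written as a method on the Segment struct above: the Append signature and the fixed-width index encodings are assumptions for illustration, while the five steps themselves follow the list above.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Sketch of the write path. Uses encoding/binary and io from the standard
// library; Append, its parameters, and the exact encodings are illustrative.
func (s *Segment) Append(msg []byte, timestamp uint64, nextOffset uint64) (uint64, error) {
    // 1. Append message bytes to the .log file, remembering where they start.
    pos, err := s.log.Seek(0, io.SeekEnd)
    if err != nil {
        return 0, err
    }
    if _, err := s.log.Write(msg); err != nil {
        return 0, err
    }

    rel := uint32(nextOffset - s.baseOffset) // offset relative to this segment

    // 2. Record (relativeOffset, bytePosition) in the .index file.
    var idx [8]byte
    binary.BigEndian.PutUint32(idx[0:4], rel)
    binary.BigEndian.PutUint32(idx[4:8], uint32(pos))
    if _, err := s.index.Write(idx[:]); err != nil {
        return 0, err
    }

    // 3. Record (timestamp, relativeOffset) in the .timeindex file.
    var tidx [12]byte
    binary.BigEndian.PutUint64(tidx[0:8], timestamp)
    binary.BigEndian.PutUint32(tidx[8:12], rel)
    if _, err := s.timeIndex.Write(tidx[:]); err != nil {
        return 0, err
    }

    // 4. fsync the log before acknowledging: durability over raw throughput.
    if err := s.log.Sync(); err != nil {
        return 0, err
    }

    // 5. Return the assigned offset to the producer.
    return nextOffset, nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;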
Offset Indexing&lt;br&gt;
To avoid scanning entire log files:&lt;br&gt;
Each message write updates an index entry:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;| Relative Offset (4 bytes) |
| Byte Position   (4 bytes) |
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Stored in .index.&lt;br&gt;
This allows:&lt;br&gt;
O(1) offset → file seek&lt;br&gt;
Fast consumer reads&lt;br&gt;
Predictable latency&lt;br&gt;
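A minimal sketch of why this lookup is constant-time, assuming one fixed-size 8-byte entry per message as described above; positionOf is an illustrative name, not Sentry’s actual API.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Because every message has a fixed-size entry, entry i lives at byte i*8 in
// the .index file, so no scan of the log is needed. Uses encoding/binary and fmt.
func (s *Segment) positionOf(offset uint64) (uint32, error) {
    rel := offset - s.baseOffset
    entry := make([]byte, 8)
    if _, err := s.index.ReadAt(entry, int64(rel)*8); err != nil {
        return 0, err
    }
    storedRel := binary.BigEndian.Uint32(entry[0:4])
    if uint64(storedRel) != rel {
        return 0, fmt.Errorf("index entry mismatch for offset %d", offset)
    }
    // The byte position where the message starts in the .log file.
    return binary.BigEndian.Uint32(entry[4:8]), nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;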
Time-Based Indexing&lt;br&gt;
Each message also updates a .timeindex file:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;| Timestamp       (8 bytes) |
| Relative Offset (4 bytes) |
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This enables future features like:&lt;br&gt;
Fetch by timestamp&lt;br&gt;
Log compaction&lt;br&gt;
Retention policies&lt;br&gt;
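For the planned fetch-by-timestamp feature, here is a minimal sketch of how the .timeindex entries could be searched; the offsetForTime name and the error handling are illustrative, not an implemented API.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Sketch of a fetch-by-timestamp lookup over fixed-size 12-byte .timeindex
// entries (8-byte timestamp + 4-byte relative offset). Entries are written in
// append order, so timestamps are non-decreasing and binary search applies.
// Uses encoding/binary, fmt, and sort.
func (s *Segment) offsetForTime(ts uint64) (uint64, error) {
    info, err := s.timeIndex.Stat()
    if err != nil {
        return 0, err
    }
    n := int(info.Size() / 12)

    // Find the first entry whose timestamp is &gt;= ts.
    i := sort.Search(n, func(i int) bool {
        entry := make([]byte, 12)
        if _, err := s.timeIndex.ReadAt(entry, int64(i)*12); err != nil {
            return true // treat unreadable entries as past the target
        }
        return binary.BigEndian.Uint64(entry[0:8]) &gt;= ts
    })
    if i == n {
        return 0, fmt.Errorf("no message at or after timestamp %d", ts)
    }

    entry := make([]byte, 12)
    if _, err := s.timeIndex.ReadAt(entry, int64(i)*12); err != nil {
        return 0, err
    }
    return s.baseOffset + uint64(binary.BigEndian.Uint32(entry[8:12])), nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;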
Crash Recovery&lt;br&gt;
This is where systems engineering actually begins.&lt;br&gt;
When the broker restarts:&lt;br&gt;
It scans the partition directory&lt;br&gt;
Discovers all segment files&lt;br&gt;
Loads .index files&lt;br&gt;
Replays .log files if needed&lt;br&gt;
Rebuilds in-memory offsets&lt;br&gt;
Marks last segment as active&lt;br&gt;
This ensures:&lt;br&gt;
No data loss&lt;br&gt;
No duplicate offsets&lt;br&gt;
Safe resume for consumers&lt;br&gt;
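A minimal sketch of such a recovery scan, assuming segments are named by their base offset (a Kafka-style convention) and a hypothetical openSegment helper that reopens each segment’s files and rebuilds in-memory offsets; neither detail is spelled out in this post.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Sketch of startup recovery for a single partition directory.
// Uses os, sort, strconv, and strings from the standard library.
func recoverPartition(dir string) ([]*Segment, error) {
    entries, err := os.ReadDir(dir)
    if err != nil {
        return nil, err
    }

    // Discover all segments by their base offsets.
    var bases []uint64
    for _, e := range entries {
        name := e.Name()
        if !strings.HasSuffix(name, ".log") {
            continue
        }
        base, err := strconv.ParseUint(strings.TrimSuffix(name, ".log"), 10, 64)
        if err != nil {
            continue // not a segment file
        }
        bases = append(bases, base)
    }
    sort.Slice(bases, func(i, j int) bool { return bases[i] &lt; bases[j] })

    // Reopen each segment; only the last one accepts new writes.
    var segments []*Segment
    for i, base := range bases {
        seg, err := openSegment(dir, base) // hypothetical helper: reopen .log/.index/.timeindex
        if err != nil {
            return nil, err
        }
        seg.active = (i == len(bases)-1)
        segments = append(segments, seg)
    }
    return segments, nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;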
Failure-First Design&lt;br&gt;
Everything in Sentry is built assuming failure:&lt;br&gt;
Disk write failures&lt;br&gt;
Partial TCP reads&lt;br&gt;
Process crashes&lt;br&gt;
Power loss&lt;br&gt;
Client disconnects&lt;br&gt;
Examples:&lt;br&gt;
Writes check active segment state&lt;br&gt;
Index lookups validate bounds&lt;br&gt;
Reads handle EOF explicitly&lt;br&gt;
Recovery paths are first-class&lt;br&gt;
Consumer Fetch Logic&lt;br&gt;
Consumers fetch messages using:&lt;br&gt;
Topic&lt;br&gt;
Partition&lt;br&gt;
Offset&lt;br&gt;
Max batch size&lt;br&gt;
The broker:&lt;br&gt;
Looks up offset in .index&lt;br&gt;
Seeks to byte position&lt;br&gt;
Reads message bytes&lt;br&gt;
Returns batch to consumer&lt;br&gt;
Advances consumer offset&lt;br&gt;
This allows:&lt;br&gt;
Replay from any offset&lt;br&gt;
Exactly-once semantics (client-side)&lt;br&gt;
Crash-safe resumes&lt;br&gt;
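A minimal sketch of this fetch path, building on the positionOf sketch above. It assumes each message is stored in the .log file with a 4-byte big-endian length prefix, which is an assumption for illustration rather than Sentry’s actual on-disk format; Fetch and its signature are likewise illustrative.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Sketch of the fetch path: resolve the starting byte position through the
// .index file, then read messages sequentially until maxBytes is reached.
// Uses encoding/binary.
func (s *Segment) Fetch(offset uint64, maxBytes uint32) ([][]byte, uint64, error) {
    pos, err := s.positionOf(offset) // .index lookup from the earlier sketch
    if err != nil {
        return nil, offset, err
    }

    var batch [][]byte
    var read uint32
    next := offset
    for read &lt; maxBytes {
        var lenBuf [4]byte
        if _, err := s.log.ReadAt(lenBuf[:], int64(pos)); err != nil {
            break // end of segment: return what we have
        }
        msgLen := binary.BigEndian.Uint32(lenBuf[:])
        msg := make([]byte, msgLen)
        if _, err := s.log.ReadAt(msg, int64(pos)+4); err != nil {
            break
        }
        batch = append(batch, msg)
        pos += 4 + msgLen
        read += 4 + msgLen
        next++ // the offset the consumer will resume from
    }
    return batch, next, nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;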
Observability via Logs&lt;br&gt;
Every major action logs explicitly:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;[INFO] Sentry Broker Starting...
[INFO] Data directory ready: ./data
[INFO] Loading log segments from disk...
[INFO] Loaded 3 log segments from disk
[INFO] Recovered offsets up to 10432
[INFO] Listening for producer connections...
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Why this matters:&lt;br&gt;
Debuggability&lt;br&gt;
Replay verification&lt;br&gt;
Crash analysis&lt;br&gt;
Trust in the system&lt;br&gt;
What This Project Taught Me&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Abstractions Hide Complexity&lt;br&gt;
Before Sentry, “Kafka stores messages” was a black box.&lt;br&gt;
Now I know:&lt;br&gt;
How bytes hit disk&lt;br&gt;
How offsets are assigned&lt;br&gt;
How indexes are structured&lt;br&gt;
How recovery actually works&lt;/li&gt;
&lt;li&gt;Concurrency Is Not Free&lt;br&gt;
Goroutines are cheap — but not infinite.&lt;br&gt;
Without:&lt;br&gt;
Worker pools&lt;br&gt;
Backpressure&lt;br&gt;
Bounded queues&lt;br&gt;
Your system will explode under load.&lt;/li&gt;
&lt;li&gt;Failure Is the Default&lt;br&gt;
If you don’t explicitly design for:&lt;br&gt;
crashes&lt;br&gt;
restarts&lt;br&gt;
partial writes&lt;br&gt;
corrupt state&lt;br&gt;
your system is already broken.&lt;/li&gt;
&lt;li&gt;Performance Is a Trade-Off&lt;br&gt;
Every choice has consequences:&lt;br&gt;
fsync = durability vs throughput&lt;br&gt;
memory buffers = speed vs safety&lt;br&gt;
batching = latency vs efficiency&lt;br&gt;
There is no “best” choice.&lt;br&gt;
Only conscious trade-offs.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;What’s Next for Sentry&lt;br&gt;
This is only the beginning.&lt;br&gt;
Planned improvements:&lt;br&gt;
Consumer groups&lt;br&gt;
Replication across nodes&lt;br&gt;
Leader election&lt;br&gt;
Segment compaction&lt;br&gt;
Retention policies&lt;br&gt;
Snapshotting&lt;br&gt;
Metrics &amp;amp; dashboards&lt;br&gt;
Raft-based metadata layer&lt;/p&gt;

&lt;p&gt;Why I’m Building This in Public&lt;br&gt;
Because:&lt;br&gt;
It keeps me accountable&lt;br&gt;
I get feedback from real engineers&lt;br&gt;
It forces me to document decisions&lt;br&gt;
It creates a learning trail others can follow&lt;/p&gt;

&lt;p&gt;Final Thoughts&lt;br&gt;
Building Sentry taught me something critical:&lt;br&gt;
Building features is easy.&lt;br&gt;
Designing systems that don’t break is the real challenge.&lt;br&gt;
This project took me far beyond CRUD apps and APIs.&lt;br&gt;
It forced me to think in:&lt;br&gt;
bytes&lt;br&gt;
offsets&lt;br&gt;
failure modes&lt;br&gt;
recovery paths&lt;br&gt;
concurrency limits&lt;br&gt;
durability guarantees&lt;br&gt;
And honestly?&lt;br&gt;
This has been the most educational project I’ve ever built.&lt;/p&gt;

&lt;p&gt;Repo &amp;amp; Resources&lt;br&gt;
GitHub:&lt;br&gt;
&lt;a href="https://github.com/tejas2428cse990-svg/Sentry.git" rel="noopener noreferrer"&gt;https://github.com/tejas2428cse990-svg/Sentry.git&lt;/a&gt;&lt;br&gt;
Blog series (coming soon):&lt;br&gt;
👉 Dev.to link&lt;br&gt;
Feedback Welcome&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>go</category>
      <category>showdev</category>
      <category>systemdesign</category>
    </item>
  </channel>
</rss>
