<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Serhii Pavlenko</title>
    <description>The latest articles on DEV Community by Serhii Pavlenko (@serhii_pavlenko).</description>
    <link>https://dev.to/serhii_pavlenko</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3900617%2F03c129f9-351d-4bca-a6d6-a1ced1841fe7.png</url>
      <title>DEV Community: Serhii Pavlenko</title>
      <link>https://dev.to/serhii_pavlenko</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/serhii_pavlenko"/>
    <language>en</language>
    <item>
      <title>System Design Interview: It’s Not About the Design</title>
      <dc:creator>Serhii Pavlenko</dc:creator>
      <pubDate>Mon, 27 Apr 2026 13:59:12 +0000</pubDate>
      <link>https://dev.to/serhii_pavlenko/system-design-interview-its-not-about-the-design-jfc</link>
      <guid>https://dev.to/serhii_pavlenko/system-design-interview-its-not-about-the-design-jfc</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2wo5s430ttzk3qdwqxh9.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2wo5s430ttzk3qdwqxh9.webp" alt=" " width="800" height="800"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;h1&gt;
  
  
  The System Design Interview: A Map to Winning It
&lt;/h1&gt;

&lt;p&gt;You have 45 minutes. The problem is something you've never built. The interviewer expects a production-grade answer. Welcome to the modern System Design interview.&lt;/p&gt;

&lt;p&gt;Let me be direct: the System Design interview is not a test of whether you can design a system. Nobody designs anything production-worthy in 45 minutes. It is, fundamentally, an &lt;strong&gt;erudition and experience test&lt;/strong&gt; — a structured conversation where you demonstrate breadth of knowledge, depth in key areas, and the kind of instincts you only develop by actually shipping distributed systems.&lt;/p&gt;

&lt;p&gt;The problem you're given — collaborative editor, distributed index, drone delivery — is a canvas. What the interviewer is actually evaluating is whether you paint on it with the vocabulary, patterns, and intuition of someone who has been in the trenches.&lt;/p&gt;

&lt;p&gt;This article is a map of what that conversation should cover, and how to win it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The 45-Minute Reality
&lt;/h2&gt;

&lt;p&gt;The interviewer hands you a problem. The surface topic almost doesn't matter. What they're watching for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Do you know the &lt;strong&gt;vocabulary&lt;/strong&gt; of distributed systems?&lt;/li&gt;
&lt;li&gt;Do you recognize the &lt;strong&gt;canonical problems&lt;/strong&gt; and name them before they do?&lt;/li&gt;
&lt;li&gt;Do you have &lt;strong&gt;real-world intuition&lt;/strong&gt; — war stories, legacy awareness, production instincts?&lt;/li&gt;
&lt;li&gt;Do you make &lt;strong&gt;precise tool selections&lt;/strong&gt; — and explain why?&lt;/li&gt;
&lt;li&gt;Do you have &lt;strong&gt;domain depth&lt;/strong&gt; — the named algorithms, the math, the edge cases?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're hitting all five, you're passing. Let's go through each.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Speak the Language
&lt;/h2&gt;

&lt;p&gt;An interviewer can tell within the first three minutes whether you've &lt;em&gt;lived&lt;/em&gt; in distributed systems or studied them last week. You don't need to recite definitions — you need to reach for the right term at the right moment and use it naturally.&lt;/p&gt;

&lt;p&gt;When discussing a social feed, you say &lt;strong&gt;"eventual consistency is fine here."&lt;/strong&gt; When the interviewer asks about a banking ledger, you say &lt;strong&gt;"strong consistency — we can't show a stale balance at withdrawal time."&lt;/strong&gt; When someone mentions collaborative editing, you say &lt;strong&gt;"causal consistency"&lt;/strong&gt; — not just "we need to keep things in order."&lt;/p&gt;

&lt;p&gt;The same applies to infrastructure vocabulary: &lt;strong&gt;consistent hashing&lt;/strong&gt; (and why it minimizes data movement), &lt;strong&gt;leader-follower vs. leaderless replication&lt;/strong&gt; (and when conflict resolution becomes your problem), &lt;strong&gt;range partitioning vs. hash partitioning&lt;/strong&gt; (and which one you can actually do a range scan on).&lt;/p&gt;
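
&lt;p&gt;Being able to back the term with a few lines helps too. As a purely illustrative sketch (the class and parameter names are mine, not any library's), a toy consistent-hash ring shows why adding or removing a node only remaps one arc of keys instead of reshuffling everything:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import bisect
import hashlib

def _h(s):
    # Stable hash onto a large integer ring.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted((_h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self._hashes = [h for h, _ in self._ring]

    def node_for(self, key):
        # First virtual node clockwise from the key's position on the ring.
        idx = bisect.bisect(self._hashes, _h(key)) % len(self._ring)
        return self._ring[idx][1]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Adding a node only claims the keys that hash just before its new virtual nodes; everything else stays put, which is the "minimizes data movement" property in one picture.&lt;/p&gt;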

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The move:&lt;/strong&gt; name the model, name the trade-off, name the business consequence. Every time.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;CAP is the classic example. Don't just say "CAP theorem." Say: &lt;em&gt;"In the presence of a partition, I'm choosing availability here — a post missing from a social feed for 200ms is invisible to users. But for the payment service, I'd pick consistency — a stale balance means real money lost."&lt;/em&gt; That's the level of fluency they're testing.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Name the Problem Before the Interviewer Does
&lt;/h2&gt;

&lt;p&gt;Every distributed system hits the same walls. &lt;strong&gt;Naming the wall — unprompted, at the right moment&lt;/strong&gt; — is one of the strongest signals of real experience.&lt;/p&gt;

&lt;p&gt;Here's what I mean by &lt;em&gt;unexpected&lt;/em&gt;:&lt;/p&gt;

&lt;h3&gt;
  
  
  Thundering Herd
&lt;/h3&gt;

&lt;p&gt;Not just about caches. Yes, the classic scenario is a hot cache key expiring and thousands of requests stampeding the database. But the same pattern shows up when you restart a fleet of microservices simultaneously — they all try to establish database connections, fetch configs, and warm up at the same moment. Or after a brief DNS outage, when every client retries at the same instant.&lt;/p&gt;

&lt;p&gt;The solution vocabulary is the same — &lt;strong&gt;jitter, request coalescing, probabilistic early expiration (XFetch)&lt;/strong&gt; — but recognizing the pattern in non-obvious contexts is what makes you stand out.&lt;/p&gt;
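
&lt;p&gt;For illustration, here's a minimal sketch of the probabilistic-early-expiration idea, assuming a generic cache with &lt;code&gt;get&lt;/code&gt;/&lt;code&gt;set&lt;/code&gt; and a &lt;code&gt;recompute&lt;/code&gt; callback (the names are mine, not any particular library's API):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import math
import random
import time

def xfetch(cache, key, ttl, recompute, beta=1.0):
    """Probabilistic early expiration: a lone caller occasionally refreshes a hot
    key before it expires, so it never expires for everyone at once."""
    entry = cache.get(key)                      # expected shape: (value, delta, expiry)
    now = time.time()
    if entry is not None:
        value, delta, expiry = entry
        # The longer recomputation takes (delta) and the closer we are to expiry,
        # the more likely this caller refreshes early. beta &gt; 1 is more aggressive.
        if now - delta * beta * math.log(random.random()) &lt; expiry:
            return value
    start = time.time()
    value = recompute()                         # only this caller pays the cost
    delta = time.time() - start
    cache.set(key, (value, delta, time.time() + ttl))
    return value
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;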

&lt;h3&gt;
  
  
  Circuit Breaker
&lt;/h3&gt;

&lt;p&gt;The answer to &lt;em&gt;"what if your dependency is slow, not down?"&lt;/em&gt; Most candidates think about failure as binary. It's not. A dependency returning responses in 30 seconds instead of 30 milliseconds is &lt;em&gt;worse&lt;/em&gt; than one that's completely down — because your threads pile up waiting, your connection pool fills, and you become the outage. Name the three states (&lt;strong&gt;closed, open, half-open&lt;/strong&gt;), mention the half-open probe, and you've signaled that you've been paged at 3 AM for a cascading failure.&lt;/p&gt;
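
&lt;p&gt;A minimal sketch of those three states (illustrative only; the thresholds, names, and single-probe policy are choices made up for the example):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import time

class CircuitBreaker:
    """Closed: pass calls through. Open: fail fast. Half-open: allow one probe."""

    def __init__(self, failure_threshold=5, reset_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout      # how long to stay open before probing
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state == "open":
            if time.time() - self.opened_at &gt;= self.reset_timeout:
                self.state = "half-open"        # allow a single probe request
            else:
                raise RuntimeError("circuit open: failing fast instead of piling up")
        try:
            # fn should enforce its own timeout, so "slow" shows up here as a failure.
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures &gt;= self.failure_threshold:
                self.state = "open"             # trip, or re-trip after a failed probe
                self.opened_at = time.time()
            raise
        self.failures = 0
        self.state = "closed"                   # success closes the circuit again
        return result
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;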

&lt;h3&gt;
  
  
  DLQ Depth as an SLO Signal
&lt;/h3&gt;

&lt;p&gt;When discussing any event-driven architecture, mention that &lt;strong&gt;Dead Letter Queue depth&lt;/strong&gt; is one of the first alerts you'd set up. If the DLQ is growing, something systematic is broken. This is a production instinct — not something you'd get from a textbook.&lt;/p&gt;
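
&lt;p&gt;The shape of the alert matters less than having it at all. As a hedged sketch (the &lt;code&gt;get_dlq_depth&lt;/code&gt; and &lt;code&gt;page_oncall&lt;/code&gt; hooks are placeholders for whatever your queue and alerting stack actually expose):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import time

def watch_dlq(get_dlq_depth, page_oncall, threshold=100, interval_s=60):
    """Alert on absolute DLQ backlog, and on sustained growth (the scarier signal)."""
    growth_streak = 0
    previous = get_dlq_depth()
    while True:
        time.sleep(interval_s)
        current = get_dlq_depth()
        growth_streak = growth_streak + 1 if current &gt; previous else 0
        # Depth catches an existing backlog; a streak of increases catches a
        # systematic failure that is still unfolding.
        if current &gt;= threshold or growth_streak &gt;= 3:
            page_oncall(f"DLQ depth {current}, rising for {growth_streak} checks")
        previous = current
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;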




&lt;p&gt;The list goes on — &lt;strong&gt;backpressure, idempotency, saga pattern (choreography vs. orchestration)&lt;/strong&gt; — but the point isn't to enumerate them all. The point is: drop these names naturally, at the moment in your design where the problem actually appears. That's the difference between a candidate who's read a list and one who's fought these fires.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Show Real-World Intuition
&lt;/h2&gt;

&lt;p&gt;This is where you separate from candidates who learned system design from YouTube. The marker isn't knowing what to build — it's knowing how systems are &lt;em&gt;actually&lt;/em&gt; built, with all the messy reality of existing infrastructure, legacy services, and organizational constraints.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Legacy System Reality
&lt;/h3&gt;

&lt;p&gt;Almost every design problem in a real company comes with a footnote: &lt;em&gt;"We already have X."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; The interviewer asks you to design a web crawler. At some point, you need to discuss seed data.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Textbook answer:&lt;/strong&gt; "We start with a curated list of popular domains."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Experience answer:&lt;/strong&gt; "In most companies, this isn't a cold start problem. We likely already have crawl data from an existing system — maybe a legacy crawler, maybe a sitemap index from a partner integration. I'd bootstrap the seed queue from the last crawl's output. For new domains, I'd build a nomination pipeline — partners submit URLs through an API, and they go through a freshness scoring model before entering the queue. The key insight is that &lt;strong&gt;seed quality matters more than seed quantity&lt;/strong&gt; — garbage in means wasted crawl budget."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That answer tells the interviewer: you've worked in a real system where nothing starts from scratch. You think about migration paths, not greenfield fantasies.&lt;/p&gt;

&lt;p&gt;This pattern applies everywhere: designing a search index? The ranking signals probably already exist in a legacy analytics pipeline. Building a recommendation engine? There's almost certainly a collaborative filtering service already running somewhere — even if it's a cron job generating a CSV file.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Precise Tool Selection
&lt;/h2&gt;

&lt;p&gt;This is not about memorizing a table of databases. It's about showing the interviewer that you &lt;strong&gt;observe the data before reaching for a tool&lt;/strong&gt; — and that your observations lead to non-obvious, precise choices.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Observation That Changes Everything
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; The interviewer asks you to design a URL shortening service. You're discussing the redirect lookup — given a short code, find the original URL.&lt;/p&gt;

&lt;p&gt;Most candidates immediately jump to database sharding: &lt;em&gt;"We'll shard by short code hash across N Postgres instances."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;But stop. Think about the data. How many unique domains exist on the internet? Roughly 350 million registered domains. That's a lot — but it's &lt;strong&gt;bounded&lt;/strong&gt;. And for your short URL service, the number of target domains your users actually link to is much smaller — probably in the tens of thousands, following a power-law distribution.&lt;/p&gt;

&lt;p&gt;That observation changes everything. A bounded, high-frequency access pattern with a power-law distribution is a &lt;strong&gt;caching problem, not a sharding problem&lt;/strong&gt;. You can fit the top 10,000 domains (which cover 95%+ of redirects) in a Redis instance with trivial memory. The long tail hits the database, but that's a tiny fraction of traffic.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The move:&lt;/strong&gt; not &lt;em&gt;"I'll add Redis in front of the database"&lt;/em&gt; (everyone says that), but &lt;em&gt;"The cardinality of target domains is bounded and follows a power law, so the cache hit ratio will be extremely high — I'd solve this with caching before I'd even consider sharding."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
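
&lt;p&gt;A minimal cache-aside sketch of that redirect lookup (assuming a local Redis and a &lt;code&gt;db_lookup&lt;/code&gt; callback standing in for the real store; the details are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import redis

r = redis.Redis()   # assumes a Redis reachable on localhost; purely for illustration

def resolve(short_code, db_lookup, ttl_s=24 * 3600):
    """Cache-aside: the power-law access pattern keeps the hot mappings resident."""
    cached = r.get(short_code)
    if cached is not None:
        return cached.decode()                  # cache hit: no database touched
    target = db_lookup(short_code)              # long-tail miss: go to the database
    if target is not None:
        r.set(short_code, target, ex=ttl_s)     # populate on the way back out
    return target
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;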

&lt;h3&gt;
  
  
  Cache Patterns — Know When, Not Just What
&lt;/h3&gt;

&lt;p&gt;There are four classic cache patterns: &lt;strong&gt;cache-aside, read-through, write-through,&lt;/strong&gt; and &lt;strong&gt;write-behind&lt;/strong&gt;. You should know them — but more importantly, you should reach for the right one at the right moment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Write-behind&lt;/strong&gt; is the interesting one for interviews. It's risky — you can lose data if the cache node dies before flushing to the database. But for a metrics ingestion pipeline where you're aggregating counters, that trade-off is acceptable: lose a few seconds of counter increments vs. hammering the database with per-request writes.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"I'd use write-behind here because losing a few seconds of counter data on a cache node crash is acceptable, but saturating the database with per-event writes is not."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's a precise, defensible decision.&lt;/p&gt;
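
&lt;p&gt;To make the write-behind trade-off concrete, here's a toy in-memory aggregator; the flush interval, names, and the bulk-write hook are all assumptions for the example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import threading
from collections import Counter

class WriteBehindCounters:
    """Increments land in memory; the database sees one aggregated write per
    flush interval. A crash loses at most one interval of increments."""

    def __init__(self, flush_to_db, interval_s=5.0):
        self.flush_to_db = flush_to_db          # e.g. a bulk UPSERT of (key, delta) pairs
        self.interval_s = interval_s
        self.pending = Counter()
        self.lock = threading.Lock()
        self._schedule()

    def increment(self, key, amount=1):
        with self.lock:
            self.pending[key] += amount         # fast path: memory only

    def _schedule(self):
        timer = threading.Timer(self.interval_s, self._flush)
        timer.daemon = True
        timer.start()

    def _flush(self):
        with self.lock:
            batch, self.pending = self.pending, Counter()
        if batch:
            self.flush_to_db(batch)             # one batched write instead of thousands
        self._schedule()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;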




&lt;h2&gt;
  
  
  5. Domain Depth — The Real Differentiator
&lt;/h2&gt;

&lt;p&gt;This is where the interview is won or lost. This is where you demonstrate that you're not just an infrastructure generalist — you have the nerdy, specific algorithmic knowledge that comes from real curiosity and deep work.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The key move: name the algorithms.&lt;/strong&gt; Don't say "I'd add rate limiting." Say &lt;em&gt;which&lt;/em&gt; rate limiting algorithm and &lt;em&gt;why&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Rate Limiting — Yes, You Need to Know All Five
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Algorithm&lt;/th&gt;
&lt;th&gt;Behavior&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Fixed Window Counter&lt;/td&gt;
&lt;td&gt;Simple, boundary bursts&lt;/td&gt;
&lt;td&gt;Internal admin APIs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sliding Window Log&lt;/td&gt;
&lt;td&gt;Precise, memory-heavy&lt;/td&gt;
&lt;td&gt;Audit-sensitive systems&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sliding Window Counter&lt;/td&gt;
&lt;td&gt;Approximate, memory-efficient&lt;/td&gt;
&lt;td&gt;General APIs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Token Bucket&lt;/td&gt;
&lt;td&gt;Bursty-friendly&lt;/td&gt;
&lt;td&gt;User-facing APIs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Leaky Bucket&lt;/td&gt;
&lt;td&gt;Smooth egress&lt;/td&gt;
&lt;td&gt;Downstream integrations&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Interview-winning moment:&lt;/strong&gt; &lt;em&gt;"I'd use a token bucket here because this is a user-facing API where occasional bursts are acceptable — a user opening a dashboard triggers 20 API calls simultaneously, and I don't want to reject those. But for the downstream payment processor, I'd use a leaky bucket to guarantee a smooth egress rate, even if it means buffering."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That single sentence shows you know the difference, know when it matters, and have opinions.&lt;/p&gt;
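
&lt;p&gt;For reference, a token bucket is only a few lines (parameter names here are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import time

class TokenBucket:
    """Allows bursts up to `capacity` while enforcing `rate` tokens/sec on average."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic()

    def allow(self, cost=1):
        now = time.monotonic()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens &gt;= cost:
            self.tokens -= cost
            return True
        return False    # over the limit; a leaky bucket would queue and drain instead
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The dashboard burst from the quote is exactly what the &lt;code&gt;capacity&lt;/code&gt; headroom absorbs; the leaky bucket's smoothing comes from queueing and draining at a fixed rate instead of rejecting.&lt;/p&gt;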

&lt;h3&gt;
  
  
  Conflict Resolution — Name the Algorithm, Know Its Constraints
&lt;/h3&gt;

&lt;p&gt;When the problem involves collaborative editing, multi-leader replication, or offline-first sync:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LWW (Last Write Wins):&lt;/strong&gt; simple, lossy, fine for user preferences, never for document editing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector Clocks:&lt;/strong&gt; causality detection, conflict surfacing. Amazon's original Dynamo used this for its shopping cart, deliberately preserving concurrent additions (better to show extra items than lose one).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OT (Operational Transformation):&lt;/strong&gt; the algorithm behind Google Docs. Requires a central server to serialize transforms, an approach that goes back to the Jupiter protocol (1995); fully decentralized OT is notoriously hard.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CRDTs (Conflict-free Replicated Data Types):&lt;/strong&gt; the modern answer for decentralized/P2P systems. Mathematically guaranteed convergence. Know the types: &lt;strong&gt;G-Counter, PN-Counter, LWW-Register, RGA for text sequences&lt;/strong&gt;. Figma's multiplayer is built on CRDT-inspired structures, and Apple Notes uses CRDTs for collaborative editing.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The move in a collaborative editor interview:&lt;/strong&gt; mention OT, acknowledge the centralization constraint, pivot to CRDTs for offline-first scenarios. Name specific CRDT types. The interviewer will know you've gone beyond the surface.&lt;/p&gt;
&lt;/blockquote&gt;
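
&lt;p&gt;To show what naming specific CRDT types can look like in practice, here's a toy G-Counter sketch: each replica increments only its own slot, and merge takes the element-wise max, so replicas converge regardless of the order in which they exchange state.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class GCounter:
    """Grow-only counter CRDT: one slot per replica, merge = element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}                        # replica id mapped to its local count

    def increment(self, amount=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Commutative, associative, idempotent: merge order doesn't matter.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)

# Two replicas counting independently, then converging:
a, b = GCounter("a"), GCounter("b")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)
assert a.value() == b.value() == 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;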

&lt;h3&gt;
  
  
  The Unexpected Domains
&lt;/h3&gt;

&lt;p&gt;The deepest impression comes from domain-specific knowledge the interviewer wasn't expecting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Food delivery by drone?&lt;/strong&gt; Most candidates talk about GPS and routing APIs. You talk about &lt;strong&gt;battery health modeling&lt;/strong&gt; — effective range is a function of &lt;code&gt;capacity × efficiency(wind, payload, temperature)&lt;/code&gt;, and capacity degrades with cycle count. You mention &lt;strong&gt;drone migration&lt;/strong&gt; — what happens when battery drops to 20% mid-route? Dynamic rerouting to the nearest charging station, similar to EV range-anxiety routing. You mention &lt;strong&gt;geofencing&lt;/strong&gt; — FAA LAANC authorization, temporary flight restrictions, R-tree spatial indexes for polygon containment queries. You mention &lt;strong&gt;fleet rebalancing&lt;/strong&gt; — the same optimization problem as scooter/bike-share.&lt;/p&gt;
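
&lt;p&gt;A back-of-the-envelope sketch of what battery health modeling can look like; every coefficient below is invented purely for illustration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def effective_range_km(nominal_capacity_wh, cycle_count, payload_kg,
                       headwind_ms, temp_c, wh_per_km=12.0):
    """Toy model: usable range shrinks with battery age, payload, wind, and cold."""
    capacity = nominal_capacity_wh * max(0.7, 1.0 - 0.0002 * cycle_count)   # degradation
    efficiency = 1.0
    efficiency -= 0.05 * payload_kg                 # heavier payload costs more Wh/km
    efficiency -= 0.02 * max(0.0, headwind_ms)      # headwind costs more Wh/km
    if temp_c &lt; 10:
        efficiency -= 0.01 * (10 - temp_c)          # cold cells deliver less energy
    usable_wh = capacity * max(0.3, efficiency)
    return usable_wh / wh_per_km

# Dispatch rule of thumb: only assign a delivery if the round trip fits inside
# roughly 80% of the modeled range, keeping reserve for rerouting and landing.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;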

&lt;p&gt;&lt;strong&gt;Designing a stock exchange?&lt;/strong&gt; Talk about order matching engines, LMAX Disruptor pattern for single-threaded mechanical sympathy, the difference between price-time priority and pro-rata matching.&lt;/p&gt;
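
&lt;p&gt;Price-time priority itself fits in a few lines; this toy sketch matches incoming buys against resting asks (the data shapes are assumptions for the example):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import heapq
import itertools

# Best (lowest) ask price wins; within a price level, earliest arrival wins.
_arrival = itertools.count()
asks = []   # min-heap of (price, arrival_seq, quantity)

def add_ask(price, qty):
    heapq.heappush(asks, (price, next(_arrival), qty))

def match_buy(limit_price, qty):
    fills = []
    while qty &gt; 0 and asks and asks[0][0] &lt;= limit_price:
        price, arrived, available = heapq.heappop(asks)
        traded = min(qty, available)
        fills.append((price, traded))
        qty -= traded
        if available &gt; traded:                      # partially filled resting order
            heapq.heappush(asks, (price, arrived, available - traded))
    return fills    # any remaining qty would rest on the bid side (not modeled here)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Pro-rata matching would instead split a fill across all resting orders at the best price in proportion to their size, which changes queue-position incentives entirely.&lt;/p&gt;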

&lt;p&gt;&lt;strong&gt;Designing a code deployment pipeline?&lt;/strong&gt; Talk about blue-green vs. canary vs. feature flags, progressive rollout percentages, automated rollback on error budget violations.&lt;/p&gt;
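
&lt;p&gt;A hedged sketch of the progressive-rollout loop (the &lt;code&gt;route_traffic&lt;/code&gt;, &lt;code&gt;error_rate&lt;/code&gt;, and &lt;code&gt;rollback&lt;/code&gt; hooks are placeholders for whatever your deploy tooling provides):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import time

def canary_rollout(route_traffic, error_rate, rollback,
                   steps=(1, 5, 25, 50, 100), error_budget=0.001, soak_s=300):
    """Widen canary traffic step by step; roll back if errors eat the budget."""
    for percent in steps:
        route_traffic(percent)          # shift this share of traffic to the new build
        time.sleep(soak_s)              # let it soak before judging
        if error_rate() &gt; error_budget:
            rollback()
            return False
    return True                         # fully rolled out
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;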

&lt;p&gt;Name the specifics. Even if the interviewer has never built a drone system, they recognize engineering depth when they see it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Putting It Together: The 45-Minute Framework
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[0–5 min]   Clarify requirements
            Ask about scale, consistency needs, read/write ratio, SLA.
            Don't design before you understand the problem.

[5–10 min]  High-level design
            Components, data flow, APIs. Whiteboard the happy path.

[10–25 min] Deep dive
            Pick 2–3 critical components and go deep.
            This is where you demonstrate the above.

[25–35 min] Scaling and failure modes
            "What happens at 10x load?"
            "What's the single point of failure?"

[35–45 min] Trade-offs and evolution
            "If I had more time, I'd..."
            "The decision I'm least confident about is X because..."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The last 10 minutes signal maturity. An engineer who says &lt;em&gt;"I'd revisit the database choice as we get real load data — I might have over-indexed on write throughput at the expense of query flexibility"&lt;/em&gt; is an engineer who has shipped real systems and learned from them.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;The best system design interviews share one trait: the candidate thinks out loud about &lt;strong&gt;trade-offs, not solutions&lt;/strong&gt;. The solution is almost always &lt;em&gt;"it depends."&lt;/em&gt; The trade-off is where the experience lives.&lt;/p&gt;

&lt;p&gt;When you say &lt;em&gt;"I'd use a leaky bucket here instead of a token bucket because this service feeds into a payment processor — I'd rather smooth the egress rate than allow any bursting, even if it means slightly higher latency for legitimate requests"&lt;/em&gt; — &lt;strong&gt;that sentence is the interview&lt;/strong&gt;. Not the diagram.&lt;/p&gt;

&lt;p&gt;Know the names. Know the algorithms. Have the war stories. And think out loud about the trade-offs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;That's the game.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>interview</category>
      <category>distributedsystems</category>
      <category>leetcode</category>
    </item>
  </channel>
</rss>
