<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jyotheendra Doddala</title>
    <description>The latest articles on DEV Community by Jyotheendra Doddala (@jyotheendra_doddala).</description>
    <link>https://dev.to/jyotheendra_doddala</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3773375%2F143a54b4-c332-4f9d-84d0-214c0c8b3cd1.png</url>
      <title>DEV Community: Jyotheendra Doddala</title>
      <link>https://dev.to/jyotheendra_doddala</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jyotheendra_doddala"/>
    <language>en</language>
    <item>
      <title>Designing a Playback Resume System at Scale (It’s Not Just a Timestamp)</title>
      <dc:creator>Jyotheendra Doddala</dc:creator>
      <pubDate>Sun, 15 Feb 2026 02:05:44 +0000</pubDate>
      <link>https://dev.to/jyotheendra_doddala/designing-a-playback-resume-system-at-scale-its-not-just-a-timestamp-2il4</link>
      <guid>https://dev.to/jyotheendra_doddala/designing-a-playback-resume-system-at-scale-its-not-just-a-timestamp-2il4</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;At a surface level, this sounds trivial, as it’s just storing userId, videoId, and timestamp.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foddv1fejr61mbngdrkjb.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foddv1fejr61mbngdrkjb.gif" alt="Seems easy enough" width="220" height="262"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But not when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Millions of users press play at the same time&lt;/li&gt;
&lt;li&gt;People switch from TV to phone in seconds&lt;/li&gt;
&lt;li&gt;Writes happen every few seconds&lt;/li&gt;
&lt;li&gt;Resume must feel instant&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  1. Clarifying the Problem
&lt;/h1&gt;

&lt;p&gt;We are designing a Playback Resume System that allows users to resume watching from where they left off across devices.&lt;/p&gt;

&lt;p&gt;We are not designing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The video streaming pipeline itself&lt;/li&gt;
&lt;li&gt;Real-time co-watch (two users watching in sync)&lt;/li&gt;
&lt;li&gt;Multi-region replication, global failover, or cross-region consistency trade-offs&lt;/li&gt;
&lt;li&gt;Perfect real-time synchronisation across devices (1–2 second eventual consistency is acceptable)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This service would live within an existing micro-services architecture, so I won’t deep-dive into service discovery, deployment, etc., and will focus purely on the playback state.&lt;/p&gt;

&lt;h1&gt;
  
  
  2. Functional Requirements (User Centric)
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;User should be able to resume a video from the last watched position.&lt;/li&gt;
&lt;li&gt;User should be able to switch devices and continue seamlessly.&lt;/li&gt;
&lt;li&gt;User should have an independent watch history per profile.&lt;/li&gt;
&lt;li&gt;The system should update the playback position periodically while watching.&lt;/li&gt;
&lt;li&gt;Latest progress should win if multiple devices update.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  3. Non-Functional Requirements
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Resume reads &amp;lt;150ms&lt;/li&gt;
&lt;li&gt;Writes &amp;lt;500ms&lt;/li&gt;
&lt;li&gt;High availability&lt;/li&gt;
&lt;li&gt;Scalable to millions of concurrent users&lt;/li&gt;
&lt;li&gt;Eventual consistency across devices is acceptable (1–2 sec lag)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;a href="https://en.wikipedia.org/wiki/CAP_theorem" rel="noopener noreferrer"&gt;CAP Theorem Consideration&lt;/a&gt;
&lt;/h4&gt;

&lt;p&gt;During network partitions, we prefer Availability + Partition tolerance over Strong Consistency + Partition tolerance.&lt;/p&gt;

&lt;p&gt;Why?&lt;/p&gt;

&lt;p&gt;Because if one replica is slightly behind, the user resuming 1 second earlier is acceptable. But we cannot afford downtime.&lt;/p&gt;

&lt;p&gt;So:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High availability &amp;gt; Strong consistency&lt;/li&gt;
&lt;li&gt;Eventual consistency + last write wins is good enough&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For playback, you don’t need perfection. Responsiveness is what matters.&lt;/p&gt;

&lt;h1&gt;
  
  
  4. Data Model
&lt;/h1&gt;

&lt;p&gt;Instead of user_id, we use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(account_id, profile_id, video_id)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because in one household:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Account 123
├── Profile A → V1 → 1200s
└── Profile B → V1 → 300s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each profile tracks progress independently.&lt;/p&gt;

&lt;p&gt;We also store:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;position
updated_at
device_id
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;updated_at enables conflict resolution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Using updated_at for last write wins assumes reasonably synchronised clocks. In production, this is typically handled using server-generated timestamps or monotonic counters. I’m keeping the conflict resolution logic simple here to focus on system behaviour rather than clock management.&lt;/p&gt;
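
&lt;p&gt;As a minimal sketch of that record (assuming positions in seconds and server-generated epoch-millisecond timestamps):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from dataclasses import dataclass

@dataclass
class PlaybackState:
    account_id: str   # the household account
    profile_id: str   # each profile resumes independently
    video_id: str
    position: int     # seconds into the video
    updated_at: int   # server-generated epoch millis, drives last-write-wins
    device_id: str    # which device wrote the checkpoint (debugging / UX)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
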

&lt;h1&gt;
  
  
  5. Scale Estimation
&lt;/h1&gt;

&lt;p&gt;Assume:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;10M daily users&lt;/li&gt;
&lt;li&gt;3M actively watching&lt;/li&gt;
&lt;li&gt;Update every 10 seconds&lt;/li&gt;
&lt;li&gt;30 min session → ~180 updates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;~540M writes/day&lt;br&gt;
~6K writes/sec&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkbveobq0uliqxju1e0vh.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkbveobq0uliqxju1e0vh.gif" alt="Okay... this escalated" width="560" height="311"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is not a small system. Logical reads are similar in magnitude, but database reads are significantly reduced via caching.&lt;/p&gt;
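
&lt;p&gt;The back-of-envelope maths behind those numbers (assuming one 30-minute session per active watcher per day):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;active_watchers   = 3_000_000
update_interval_s = 10
session_s         = 30 * 60

updates_per_session = session_s // update_interval_s          # 180
writes_per_day      = active_watchers * updates_per_session   # 540,000,000
avg_writes_per_sec  = writes_per_day / 86_400                 # ~6,250
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
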

&lt;h4&gt;
  
  
  Smarter Write Strategy
&lt;/h4&gt;

&lt;p&gt;In reality, we don’t blindly update every 10 seconds.&lt;br&gt;
We optimise by writing only when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Position delta &amp;gt; 15–30 seconds&lt;/li&gt;
&lt;li&gt;OR user pauses&lt;/li&gt;
&lt;li&gt;OR the app goes to the background&lt;/li&gt;
&lt;li&gt;OR periodic checkpoint (e.g., every 60 seconds)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This reduces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Write amplification&lt;/li&gt;
&lt;li&gt;Cache churn&lt;/li&gt;
&lt;li&gt;Queue pressure&lt;/li&gt;
&lt;li&gt;Database cost&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That 540M/day number can realistically drop 3–5x with smarter checkpointing.&lt;/p&gt;
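
&lt;p&gt;A minimal client-side sketch of that decision, using the thresholds above as illustrative defaults:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def should_checkpoint(last_saved_pos, current_pos, secs_since_last_save,
                      paused=False, backgrounded=False):
    # Only write when it is worth a round trip (thresholds are illustrative).
    if paused or backgrounded:
        return True                                  # natural save points
    if abs(current_pos - last_saved_pos) &amp;gt; 30:
        return True                                  # position moved far enough
    if secs_since_last_save &amp;gt;= 60:
        return True                                  # periodic safety checkpoint
    return False
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
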
&lt;h1&gt;
  
  
  6. API Design
&lt;/h1&gt;

&lt;p&gt;Update Playback&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST /playback/update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Body:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;account_id&lt;/li&gt;
&lt;li&gt;profile_id&lt;/li&gt;
&lt;li&gt;video_id&lt;/li&gt;
&lt;li&gt;position&lt;/li&gt;
&lt;li&gt;device_id&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Resume Playback&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /playback/resume
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
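
&lt;p&gt;An illustrative exchange (the body fields come from the list above; the resume query parameters and example values are assumptions):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST /playback/update
{ "account_id": "123", "profile_id": "A", "video_id": "V1",
  "position": 1200, "device_id": "tv-living-room" }

GET /playback/resume?account_id=123&amp;amp;profile_id=A&amp;amp;video_id=V1
→ { "position": 1200, "updated_at": 1700000000000 }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
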



&lt;h1&gt;
  
  
  7. Start Simple: DB-Only
&lt;/h1&gt;

&lt;p&gt;We could store everything in DynamoDB/Cassandra.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Primary key:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(account_id#profile_id, video_id)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
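
&lt;p&gt;For the household from earlier, the key would look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;partition key: "123#A"   → account_id "#" profile_id
sort key:      "V1"      → video_id
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
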



&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simple&lt;/li&gt;
&lt;li&gt;Durable&lt;/li&gt;
&lt;li&gt;Easy to scale horizontally&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every resume hits DB&lt;/li&gt;
&lt;li&gt;Higher latency at scale&lt;/li&gt;
&lt;li&gt;Costly under heavy read traffic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Good for MVP. But not ideal for massive scale.&lt;/p&gt;

&lt;h1&gt;
  
  
  8. Hybrid Architecture
&lt;/h1&gt;

&lt;p&gt;Because resume is latency-sensitive and read-heavy, we introduce caching.&lt;/p&gt;

&lt;h3&gt;
  
  
  High Level Design
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fll2tv34b55wnlk8repdx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fll2tv34b55wnlk8repdx.png" alt="HLD for Playback resume system" width="800" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  How It Works
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Write Flow&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Client sends update.&lt;/li&gt;
&lt;li&gt;Service performs a conditional write to the DB (if updated_at is newer).&lt;/li&gt;
&lt;li&gt;Redis cache is updated.&lt;/li&gt;
&lt;li&gt;Event is optionally published for analytics.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Conditional Writes (Idempotency)&lt;/strong&gt;&lt;br&gt;
To avoid stale overwrites, we use conditional writes:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Update only if incoming.updated_at &amp;gt; existing.updated_at&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This ensures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Last write wins&lt;/li&gt;
&lt;li&gt;Safe retries&lt;/li&gt;
&lt;li&gt;No duplicate corruption&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Redis Crash Safety&lt;/strong&gt;&lt;br&gt;
Instead of writing only to Redis first:&lt;br&gt;
We persist to DB first (durable), then update Redis.&lt;/p&gt;

&lt;p&gt;In a worst-case scenario, if Redis crashes, the DB remains the source of truth. We prefer durability over extreme write latency savings.&lt;/p&gt;
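
&lt;p&gt;A sketch of that write path, assuming DynamoDB and Redis (table, key, and attribute names are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import json, time
import boto3, redis
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("playback_state")   # assumed table name
cache = redis.Redis()

def save_progress(account_id, profile_id, video_id, position, device_id):
    now_ms = int(time.time() * 1000)            # server-generated timestamp
    pk = f"{account_id}#{profile_id}"
    try:
        # 1. Durable, conditional write: accept only if newer than what is stored.
        table.update_item(
            Key={"pk": pk, "video_id": video_id},
            UpdateExpression="SET #p = :p, updated_at = :t, device_id = :d",
            ConditionExpression="attribute_not_exists(updated_at) OR updated_at &amp;lt; :t",
            ExpressionAttributeNames={"#p": "position"},
            ExpressionAttributeValues={":p": position, ":t": now_ms, ":d": device_id},
        )
    except ClientError as e:
        if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return                              # a newer checkpoint already exists; drop this one
        raise
    # 2. Update the cache only after the DB write succeeds; Redis is never the sole copy.
    cache.set(f"playback:{pk}:{video_id}",
              json.dumps({"position": position, "updated_at": now_ms}),
              ex=24 * 3600)                     # cache TTL; the DB stays the source of truth
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
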

&lt;p&gt;&lt;strong&gt;Read Flow&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check Redis.&lt;/li&gt;
&lt;li&gt;If hit → instant resume.&lt;/li&gt;
&lt;li&gt;If miss → fetch from DB → repopulate cache.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most reads should never touch the database.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Brief stale reads may occur due to replication lag, which is acceptable under our 1–2 second tolerance.&lt;/p&gt;
&lt;/blockquote&gt;
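
&lt;p&gt;Continuing the same sketch on the read path (cache-aside), reusing the table and cache handles from the write path above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def get_progress(account_id, profile_id, video_id):
    pk = f"{account_id}#{profile_id}"
    key = f"playback:{pk}:{video_id}"
    cached = cache.get(key)
    if cached:
        return json.loads(cached)       # cache hit: instant resume
    # Cache miss: read the durable copy, then repopulate the cache.
    item = table.get_item(Key={"pk": pk, "video_id": video_id}).get("Item")
    if item is None:
        return None                     # nothing to resume: start from the beginning
    state = {"position": int(item["position"]), "updated_at": int(item["updated_at"])}
    cache.set(key, json.dumps(state), ex=24 * 3600)
    return state
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
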

&lt;p&gt;&lt;strong&gt;Failure Handling &amp;amp; Retries&lt;/strong&gt;&lt;br&gt;
In production, both Redis and the database may occasionally time out or throttle under load.&lt;/p&gt;

&lt;p&gt;To protect latency &lt;a href="https://en.wikipedia.org/wiki/Service-level_objective" rel="noopener noreferrer"&gt;SLOs&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reads fall back to DB if Redis times out.&lt;/li&gt;
&lt;li&gt;Writes use bounded retries with exponential backoff.&lt;/li&gt;
&lt;li&gt;Timeouts are enforced at the service layer to avoid request pile-ups.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If a write ultimately fails, we prefer dropping that checkpoint rather than blocking playback. The next update will reconcile the state thanks to our last-write-wins logic.&lt;/p&gt;
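
&lt;p&gt;A small helper along those lines (attempt counts and delays are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import random, time

def with_bounded_retries(op, attempts=3, base_delay_s=0.05):
    # Exponential backoff with jitter; give up rather than block playback.
    for attempt in range(attempts):
        try:
            return op()
        except Exception:
            if attempt == attempts - 1:
                return None             # drop this checkpoint; the next update reconciles
            time.sleep(base_delay_s * (2 ** attempt) + random.uniform(0, base_delay_s))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;On the write path, the checkpoint call would be wrapped in a helper like this; on the read path, a Redis timeout simply falls through to the DB.&lt;/p&gt;
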

&lt;h1&gt;
  
  
  9. Multi-Device Conflict Handling
&lt;/h1&gt;

&lt;p&gt;If TV and phone both send updates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compare updated_at&lt;/li&gt;
&lt;li&gt;Latest wins&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We accept slight inconsistencies because availability matters more.&lt;br&gt;
That’s our CAP trade-off in action.&lt;/p&gt;
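
&lt;p&gt;The conflict rule itself is tiny; as a sketch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def resolve(existing, incoming):
    # Last write wins: keep whichever checkpoint carries the newer timestamp.
    return incoming if incoming["updated_at"] &amp;gt; existing["updated_at"] else existing
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
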

&lt;h1&gt;
  
  
  10. Other Production Considerations
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Storage Lifecycle (TTL)&lt;/strong&gt;&lt;br&gt;
Playback entries shouldn’t live forever. We can expire inactive entries after X days (e.g., 180 days) using TTL policies.&lt;/p&gt;
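
&lt;p&gt;A sketch of the expiry attribute, assuming a store-level TTL feature (e.g., DynamoDB TTL) configured on an expires_at attribute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import time

RETENTION_DAYS = 180   # assumed retention window

def expires_at():
    # Epoch-seconds value refreshed on every write; the store purges the row once it passes.
    return int(time.time()) + RETENTION_DAYS * 24 * 3600
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
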

&lt;p&gt;This prevents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unbounded storage growth&lt;/li&gt;
&lt;li&gt;Cold data occupying hot partitions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Hot Partition Prevention&lt;/strong&gt;&lt;br&gt;
If we partitioned incorrectly (e.g., by video_id), a trending show at 8 pm could create hot shards. Using:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(account_id#profile_id, video_id)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ensures even distribution and avoids the hot partition problem.&lt;/p&gt;

&lt;p&gt;Proper database capacity planning or auto scaling is required to handle peak write bursts and avoid write throttling under load.&lt;/p&gt;

&lt;h1&gt;
  
  
  11. UX Guardrails &amp;amp; Data Freshness
&lt;/h1&gt;

&lt;p&gt;Resuming should feel intuitive, not surprising.&lt;br&gt;
To prevent confusing jumps in playback:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If a resume position differs by only a few seconds, the client may ignore minor regressions.&lt;/li&gt;
&lt;li&gt;We may cap backward jumps beyond a safety threshold (e.g., don’t resume 5 minutes earlier unless requested).&lt;/li&gt;
&lt;li&gt;Clients can display “Resume from 11:11?” to give users control when conflicts occur.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This keeps the system technically simple (last-write-wins) while protecting the user experience from edge-case inconsistencies.&lt;/p&gt;
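
&lt;p&gt;A client-side sketch of those guardrails (thresholds are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def choose_resume(local_pos, server_pos, minor_s=5, big_jump_s=300):
    # Returns (position_to_use, should_prompt_user).
    delta = server_pos - local_pos
    if abs(delta) &amp;lt;= minor_s:
        return local_pos, False      # ignore minor regressions; keep what the device has
    if delta &amp;lt; -big_jump_s:
        return server_pos, True      # large backward jump: ask "Resume from 11:11?" first
    return server_pos, False         # otherwise trust the freshest server state
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
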

&lt;h1&gt;
  
  
  Final Thoughts
&lt;/h1&gt;

&lt;p&gt;This problem looks like a key-value store.&lt;br&gt;
It’s not.&lt;br&gt;
It touches:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Distributed systems&lt;/li&gt;
&lt;li&gt;Caching strategy&lt;/li&gt;
&lt;li&gt;Conflict resolution&lt;/li&gt;
&lt;li&gt;UX latency expectations&lt;/li&gt;
&lt;li&gt;CAP trade-offs&lt;/li&gt;
&lt;li&gt;Data modelling for real households&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>systemdesign</category>
      <category>architecture</category>
      <category>backend</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
