Vikas Kumar

Design HLD - Dropbox | Image Upload Service

Requirements

Functional Requirements

  1. Support image upload and download across devices.
  2. Identify and manage exact duplicate images.
  3. Ensure safe retry of upload operations.
  4. Support image transformations (e.g., thumbnails).
  5. Provide secure image access.
  6. Support automatic synchronization across user devices.
  7. Support safe image deletion.

Non-Functional Requirements

  1. Highly available and fault tolerant.
  2. Low-latency and high-throughput operations.
  3. High scalability with growing traffic.
  4. Durable and reliable file storage.
  5. Secure storage and access control.
  6. Support large file uploads up to 50 GB.
  7. Cost-efficient at scale.

Key Concepts You Must Know

These concepts are referenced throughout the design below.

Object Storage vs Metadata Storage

Object storage is a distributed storage system optimized for storing large, unstructured binary data, while metadata storage is a structured data store used to manage information about those objects.

  • Databases are optimized for small, structured records and queries, not large files.
  • Object storage systems are optimized for durability, scalability, and cost, but not for complex querying.
  • Separating image bytes from metadata allows each system to do what it is best at.

Analogy (Library Model)
Object storage is the warehouse storing heavy books. Metadata storage is the catalog system telling you what the book is and where it lives.

Example
Metadata DB → image_id, owner_id, size, hash, storage_path
Object Store → actual image bytes

Multipart / Resumable Uploads

Multipart uploads divide large files into smaller parts that can be uploaded independently and reassembled by the storage system.

  • Large uploads are prone to network failures and timeouts.
  • Chunking allows retries at a fine-grained level instead of restarting the entire upload.
  • Upload state is tracked via an upload session.

Analogy (Shipping Boxes)
Instead of shipping one huge box, ship many small boxes. If one box is lost, only that box is resent.

Example
UploadSession ID
→ Chunk 1 uploaded
→ Chunk 2 uploaded
→ Chunk 3 failed → retry
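
A minimal client-side sketch of chunked upload with per-chunk retries. This is illustrative only: upload_chunk is a hypothetical function standing in for the real chunk-upload API, and the 5 MB chunk size is an assumption.

CHUNK_SIZE = 5 * 1024 * 1024  # assumed 5 MB per chunk

def upload_file(path, session_id, upload_chunk, max_retries=3):
    """Upload a file chunk by chunk; retry only the chunks that fail."""
    failed = []
    with open(path, "rb") as f:
        chunk_number = 0
        while True:
            data = f.read(CHUNK_SIZE)
            if not data:
                break
            for _ in range(max_retries):
                if upload_chunk(session_id, chunk_number, data):  # True on success
                    break
            else:
                failed.append(chunk_number)  # resume later with only these chunks
            chunk_number += 1
    return failed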

Signed / Time-Bound URLs

Signed URLs provide temporary, secure access to private objects by embedding authentication information into the URL itself.

  • The backend validates access and generates a URL with an expiry time and signature.
  • Storage systems trust the signature and serve the object directly.
  • This avoids routing large downloads through application servers.

Analogy (Hotel Key Card)
A hotel card opens your room only for a limited time. After checkout, it stops working automatically.

Example
GET /image/123
→ Backend returns signed URL (expires in 5 min)
→ Client downloads from storage
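
For illustration (the design itself is storage-agnostic), a backend using AWS S3 could mint such a URL with boto3; the bucket name here is a placeholder:

import boto3

s3 = boto3.client("s3")

def get_download_url(storage_path: str, expires_in: int = 300) -> str:
    # The expiry and signature are embedded in the URL; S3 serves the object
    # directly, so the bytes never pass through our application servers.
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": "image-blobs", "Key": storage_path},
        ExpiresIn=expires_in,  # 5 minutes
    )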

Content-Based Deduplication

Content-based deduplication eliminates redundant data by identifying identical content using cryptographic hashes.

  • Before storing an image, the system computes its hash.
  • If the hash already exists, storage is skipped and a new reference is created.
  • Multiple users can reference the same underlying object.

Analogy (Pointer to Same File)
Instead of saving the same file twice, create another pointer to it.

Example
Hash(H1) exists
→ ref_count++
→ no new storage write
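
A rough sketch of that decision at upload-completion time. The metadata-store helpers (get_blob, increment_ref, create_blob, create_image) are hypothetical names, not a real API:

def finalize_upload(owner_id, content_hash, new_object_key, meta):
    blob = meta.get_blob(content_hash)
    if blob is not None:
        meta.increment_ref(content_hash)        # same bytes already stored
        storage_path = blob["storage_path"]     # no new storage write
    else:
        meta.create_blob(content_hash, new_object_key, ref_count=1)
        storage_path = new_object_key
    # Each user still gets their own Image row pointing at the shared blob.
    # (A production system would guard this with a conditional insert to avoid races.)
    return meta.create_image(owner_id, content_hash, storage_path)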

Cryptographic Hash (SHA-256)

SHA-256 is a cryptographic hash function that produces a fixed-length, collision-resistant fingerprint for any input.

  • Same input always produces the same hash.
  • Any change in input produces a drastically different hash.
  • Collision probability is negligible for practical systems.

Analogy (DNA for Files)
A file's hash is like its DNA: a compact fingerprint that uniquely identifies its exact contents.

Example
image.jpg → SHA-256 → 256-bit hash
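
For example, with Python's hashlib the hash can be computed in a streaming fashion, so the full file never has to fit in memory:

import hashlib

def sha256_of_file(path: str, chunk_size: int = 4 * 1024 * 1024) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)       # hash one chunk at a time
    return h.hexdigest()          # 64 hex characters = 256 bits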

Idempotent Operations

Idempotency ensures that repeating an operation produces the same final state as executing it once.

  • Network failures often cause retries.
  • Without idempotency, retries can corrupt data or create duplicates.
  • Idempotency is usually enforced using unique request IDs.

Analogy (Light Switch)
Turning the light ON multiple times keeps it ON.

Example
DELETE image/123
→ deleted = true
→ retry DELETE → no change
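
A minimal sketch of an idempotent delete handler; the db object and its methods are hypothetical stand-ins for the metadata store:

def delete_image(image_id, db):
    image = db.get(image_id)
    if image is None or image["status"] == "deleted":
        # Retry of an earlier delete: same response, no further state change.
        return {"status": "deleted"}
    db.update(image_id, status="deleted")   # soft delete only (phase 1)
    return {"status": "deleted"}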

Two-Phase Deletion

Two-phase deletion separates logical deletion from physical deletion to ensure safety and consistency.

  • Immediate physical deletion is risky in distributed systems.
  • Soft delete hides the image immediately.
  • Hard delete is done later by a background process.

Analogy (Recycle Bin)
You delete a file → it goes to trash → later permanently removed.

Example
Phase 1: deleted = true
Phase 2: GC job removes blob
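
A sketch of the second phase as a background job, assuming hypothetical metadata and object-store helpers and the ref_count semantics described later in the schema:

def gc_pass(meta, object_store, grace_days=30):
    """Phase 2: physically remove blobs that no live image references anymore."""
    for image in meta.list_images(status="deleted", older_than_days=grace_days):
        remaining = meta.decrement_ref(image["content_hash"])
        meta.purge_image(image["image_id"])
        if remaining == 0:
            blob = meta.get_blob(image["content_hash"])
            object_store.delete(blob["storage_path"])   # drop the bytes
            meta.purge_blob(image["content_hash"])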


Capacity Estimation

Key Assumptions

  • DAU (Daily Active Users): ~10 million
  • Uploads per user per day: ~2 images
  • Average image size: ~5 MB
  • Traffic pattern: Read-heavy (images viewed more than uploaded)
  • System scale: Large-scale, distributed system assumed

Upload Volume Estimation

Total uploads per day => 10M users × 2 uploads = ~20M images/day
Total data uploaded per day => 20M images × 5 MB ≈ ~100 TB/day

Throughput Estimation (QPS)

Write Traffic - Average write QPS (Queries Per Second) => 20M / 86,400 ≈ 230, i.e. ~200+ uploads/sec
Read Traffic - Reads are assumed ~5× writes => Average read QPS: ~1,000+/sec

Metadata Size Estimation

Metadata per image: ~100 bytes (IDs, hash, timestamps, flags)
Metadata per day => 20M × 100 B ≈ ~2 GB/day
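
The same back-of-the-envelope numbers written out as a quick script (rough daily averages, ignoring peak factors):

dau = 10_000_000
uploads_per_user = 2
avg_image_mb = 5
metadata_bytes = 100

uploads_per_day = dau * uploads_per_user                        # 20M images/day
upload_tb_per_day = uploads_per_day * avg_image_mb / 1_000_000  # ~100 TB/day
write_qps = uploads_per_day / 86_400                            # ~230/sec
read_qps = 5 * write_qps                                        # ~1,000+/sec
metadata_gb_per_day = uploads_per_day * metadata_bytes / 1e9    # ~2 GB/day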


Core Entities

  • User: Represents a system user who uploads, owns, and accesses images.
  • Image: Represents a logical image uploaded by a user; stores ownership and state, not the raw image bytes.
  • ImageObject (ImageBlob): Represents the actual binary image file stored in object storage; can be shared across multiple images due to deduplication.
  • ImageVariant: Represents derived versions of an image such as thumbnails or resized formats.
  • UploadSession: Represents an in-progress multipart upload and enables safe retries and resumable uploads.

Database Design

Users Table

Represents system users.

User
----
user_id (PK)
email
created_at
status

Used for

  • Ownership
  • Sharing
  • Access control

Image (Asset) Table

Represents a user-visible image.

Image
-----
image_id (PK)
owner_id (FK → User)
content_hash
name
size
visibility
status (active / deleted)
created_at
updated_at

Key Points

  • One row per user image.
  • Multiple images can reference the same content hash.
  • Soft delete is handled via status.

ImageContent (Blob) Table

Represents the actual stored image content.

ImageContent
------------
content_hash (PK)
storage_path
size
ref_count
created_at


Key Points

  • One row per unique image content.
  • ref_count tracks how many images reference this blob.
  • Enables safe deduplication and deletion.

ImageVariant Table

Represents thumbnails or resized versions.

ImageVariant
------------
variant_id (PK)
content_hash (FK → ImageContent)
variant_type (thumbnail_small, large, etc.)
storage_path
created_at

Key Points

  • Variants are tied to content, not individual users.
  • Generated asynchronously.

UploadSession Table

Tracks multipart uploads.

UploadSession
-------------
upload_session_id (PK)
owner_id
content_fingerprint
status (uploading / completed)
created_at
expires_at

Optional (if chunk-level tracking is needed)

UploadChunk
-----------
upload_session_id (FK)
chunk_number
status (uploaded / pending)
etag

Key Points

  • Enables resumable uploads.
  • Prevents restarting large uploads.

Indexing Strategy

| Access Pattern    | Index                  |
| ----------------- | ---------------------- |
| Fetch user images | (owner_id, created_at) |
| Dedup lookup      | content_hash           |
| Cleanup jobs      | status + ref_count     |
| Sync              | updated_at             |

Indexes are chosen based on actual query patterns, not theoretical normalization.

Consistency Model

  • Strong consistency for metadata updates (uploads, deletes).
  • Eventual consistency for sync across devices, variant availability, and background cleanup.

This balances correctness with scalability.

Transactions & Conditional Writes

  • Deduplication uses conditional inserts on content_hash.
  • Reference counts are updated atomically.
  • Prevents race conditions when multiple users upload the same image.

Failure Handling at DB Level

  • If metadata write fails → upload not finalized.
  • Orphaned blobs are cleaned by background jobs.
  • DB failures degrade performance, not correctness.

API / Endpoints

Start Upload → POST: /uploads

Initializes a new upload session and returns the chunk size and session ID.

Request

{
  "file_name": "photo.jpg",
  "file_size": 50000000,
  "mime_type": "image/jpeg"
}

Response

{
  "upload_session_id": "us_123",
  "chunk_size": 5000000
}

Upload Chunk → PUT: /uploads/{upload_session_id}/chunks/{chunk_number}

Uploads a single chunk of the file and supports safe retries.

Request

Raw binary chunk data

Response

{
  "chunk_number": 3,
  "status": "uploaded"
}

Chunk number = position of this piece in the file (0,1,2,…)

Complete Upload → POST: /uploads/{upload_session_id}/complete

Finalizes the upload, assembles chunks, checks deduplication, and creates the image.

Response

{
  "image_id": "img_456",
  "status": "completed"
}

Get Image → GET: /images/{image_id}

Returns a time-bound signed URL to securely download the image.

Response

{
  "download_url": "https://signed-url",
  "expires_in": 300
}

Get Image Metadata → GET: /images/{image_id}/metadata

Fetches lightweight metadata without downloading the image.

Response

{
  "image_id": "img_456",
  "owner_id": "user_1",
  "size": 50000000,
  "status": "active",
  "created_at": "2026-02-05T10:00:00Z"
}

Update Image Metadata → PATCH: /images/{image_id}

Updates image metadata such as name or visibility.

Request

{
  "name": "vacation_photo.jpg",
  "visibility": "private"
}

Response

{
  "status": "updated"
}

Image Variants (Thumbnails) → GET: /images/{image_id}/variants/{variant_type}

Returns a signed URL for a specific image variant (e.g., thumbnail).

Response

{
  "download_url": "https://signed-url",
  "variant": "thumbnail_small"
}


Soft Delete → DELETE: /images/{image_id}

Soft-deletes the image by marking it as deleted in metadata.

Response

{
  "status": "deleted"
}

Hard Delete (Internal) → POST: /internal/images/{image_id}/cleanup

Permanently removes the image from storage after safety checks.

Response

{
  "status": "permanently_deleted"
}

Sync API (Multi-Device) → GET: /sync?since=timestamp

Returns images added, updated, or deleted since the last sync.

Response

{
  "added": ["img_789"],
  "updated": ["img_456"],
  "deleted": ["img_123"]
}
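
Tying the endpoints above together, a client upload might look roughly like this (Python with requests; the host is a placeholder and error handling/retries are omitted for brevity):

import os
import requests

BASE = "https://api.example.com"   # placeholder host

def upload_image(path, auth):
    # 1. Start an upload session
    session = requests.post(f"{BASE}/uploads", headers=auth, json={
        "file_name": os.path.basename(path),
        "file_size": os.path.getsize(path),
        "mime_type": "image/jpeg",
    }).json()
    sid, chunk_size = session["upload_session_id"], session["chunk_size"]
    # 2. Upload chunks one by one
    with open(path, "rb") as f:
        chunk_number = 0
        while chunk := f.read(chunk_size):
            requests.put(f"{BASE}/uploads/{sid}/chunks/{chunk_number}",
                         headers=auth, data=chunk)
            chunk_number += 1
    # 3. Finalize: the server assembles chunks, dedups, and creates the image
    return requests.post(f"{BASE}/uploads/{sid}/complete", headers=auth).json()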

System Components

1. Client (Web / Mobile)

  • Provides UI for users to upload, download, view, and delete images.
  • Splits large images into fixed-size chunks and uploads them independently.
  • Retries only failed chunks during network failures.
  • Maintains local image state and syncs changes with the server.

2. Load Balancer & API Gateway

  • Acts as the single entry point for all client requests.
  • Authenticates users and enforces authorization rules.
  • Applies rate limiting and routes requests to backend services.
  • Shields backend services from direct internet exposure.

3. Image Service (Application Layer)

  • Stateless service that orchestrates all workflows.
  • Creates and manages upload sessions.
  • Generates signed URLs for secure upload and download.
  • Validates permissions and updates image metadata.
  • Coordinates deduplication, deletion, and sync logic.
  • Never handles raw image bytes directly.

4. Metadata Database

  • Persists all image-related metadata and relationships.
  • Stores ownership, content hash, object location, reference counts, and lifecycle state.
  • Serves as the source of truth for deduplication, access control, synchronization, and deletion safety.

5. Object Storage

  • Stores the actual image binaries and transformed variants.
  • Images are addressed using their content hash.
  • Guarantees high durability and virtually unlimited scale.
  • Supports large objects (up to 50 GB).

6. Image Processing Service (Async Workers)

  • Consumes upload-completion events.
  • Generates thumbnails and other image variants asynchronously.
  • Writes transformed images back to object storage.
  • Updates metadata once processing completes.
  • Scales independently from user traffic.

7. CDN (Content Delivery Network)

  • Caches images and thumbnails close to end users.
  • Serves read-heavy traffic efficiently.
  • Uses signed URLs to ensure only authorized access.
  • Reduces load on object storage and backend services.

8. Sync / Notification Layer

  • Observes metadata changes in the system.
  • Notifies connected devices of updates using push (WebSockets/SSE) for actively connected clients and polling for the rest.
  • Enables eventual consistency across all devices.

High-Level Flows

Flow 1: Image Upload

  • Client requests an upload session from the Image Service.
  • Image Service returns chunk size and signed upload URLs.
  • Client uploads image chunks directly to object storage.
  • On completion, the Image Service computes the SHA-256 hash, checks for duplicates, and creates or updates metadata.
  • Image becomes available across devices.

Flow 2: Retry / Resume Upload

  • If a chunk upload fails, the client retries only that chunk.
  • Upload session tracks completed chunks.
  • Duplicate chunk uploads are ignored.
  • Ensures idempotent and reliable uploads.

Flow 3: Image Download

  • Client requests access to an image.
  • Image Service verifies ownership or shared access.
  • A time-bound signed URL is generated.
  • Client downloads the image from CDN or object storage.

Flow 4: Deduplication

  • SHA-256 hash uniquely identifies image content.
  • If a matching hash exists, no new blob is stored and the reference count is incremented.
  • If not, the image is stored as a new object.
  • Each user receives an independent asset reference.

Flow 5: Image Transformation

  • Upload completion emits an asynchronous event.
  • Image processing workers generate thumbnails and variants.
  • Variants are stored as separate objects.
  • Metadata is updated to reference new variants.

Flow 6: Multi-Device Synchronization

  • Metadata updates record change timestamps or versions.
  • Other devices fetch changes via sync APIs or receive push notifications.
  • Devices apply updates locally.
  • System converges using eventual consistency.
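
A minimal polling variant of this flow, assuming the /sync endpoint shown earlier and a locally persisted cursor (apply_changes is a hypothetical local callback):

import time
import requests

def sync_loop(base_url, auth, apply_changes, since, interval=30):
    while True:
        delta = requests.get(f"{base_url}/sync", headers=auth,
                             params={"since": since}).json()
        # Apply metadata deltas locally; image bytes are fetched lazily on view.
        apply_changes(delta["added"], delta["updated"], delta["deleted"])
        since = time.time()   # in practice, use a server-provided cursor/version
        time.sleep(interval)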

Flow 7: Image Deletion (Two-Phase)

  • User deletes image → metadata is marked as deleted.
  • Image is immediately hidden from all devices.
  • Background job checks reference count.
  • Image blob is permanently removed only when no references remain.

Deep Dives - Functional Requirements

1. Support Image Upload and Download Across Devices

  • Clients (web, mobile, desktop) upload images using direct-to-object-storage uploads via signed URLs.
  • Large files are split into chunks and uploaded independently.
  • Downloads use time-bound signed URLs and are served via CDN.
  • This allows seamless access from any device with low latency and high throughput.

2. Identify and Manage Exact Duplicate Images

  • The system computes a SHA-256 hash of image content during upload.
  • This hash uniquely identifies the image bytes.
  • If the hash already exists, the image blob is not stored again.
  • A new metadata reference (asset) is created pointing to the existing content.

3. Ensure Safe Retry of Upload Operations

  • Uploads use multipart (chunked) uploads.
  • Each chunk is uploaded independently and tracked via an upload session.
  • Failed chunks are retried without re-uploading completed chunks.
  • Operations are idempotent, preventing duplicate writes.

4. Support Image Transformations (e.g., Thumbnails)

  • After upload completion, an event is emitted.
  • Asynchronous workers generate thumbnails and other image variants.
  • Transformed images are stored separately and linked via metadata.
  • This keeps uploads fast and processing scalable.

5. Provide Secure Image Access

  • All images are stored in private object storage.
  • Access is granted using short-lived signed URLs after permission checks.
  • URLs expire automatically, limiting unauthorized access.
  • CDN integration ensures fast and secure delivery.

6. Support Automatic Synchronization Across User Devices

  • Metadata is the source of truth for image state.
  • Clients sync changes using polling or push notifications (WebSocket/SSE).
  • Only deltas (added, updated, deleted images) are synced.
  • Ensures eventual consistency across all devices.

7. Support Safe Image Deletion

  • Deletion is handled using a two-phase delete.
  • First, the image is soft-deleted in metadata and hidden immediately.
  • A background job deletes the image blob only when no references remain.
  • This prevents accidental data loss and works with deduplication.

Deep Dives - Non-Functional Requirements

1. High Availability & Fault Tolerance

  • All backend services are stateless and deployed across multiple availability zones.
  • Metadata and storage systems are replicated.
  • Idempotent APIs ensure retries don’t corrupt state.
  • Availability: 99.9%+ (system remains usable despite node/AZ failures)

2. Low Latency & High Throughput

  • Uploads and downloads go directly to object storage using signed URLs.
  • CDN serves read traffic close to users.
  • Duplicate uploads are short-circuited before storing data.
  • Heavy work (thumbnails, scans) runs asynchronously.
  • Duplicate upload latency: < 50 ms (no file transfer)
  • Image read latency (CDN): ~5–20 ms

3. High Scalability with Growing Traffic

  • Stateless services scale horizontally.
  • Metadata, storage, and processing scale independently.
  • Sharding by user/content hash avoids hotspots.
  • Scaling model: Linear (add instances → increase capacity)

4. Durable & Reliable File Storage

  • Images are stored in object storage with built-in replication.
  • Content-addressed (hash-based) storage ensures immutability.
  • Metadata is persisted in a replicated database.
  • Durability: Object storage-grade (11 nines)

5. Secure Storage & Access Control

  • All data encrypted in transit and at rest.
  • Storage buckets remain private.
  • Access granted via short-lived signed URLs after permission checks.
  • Signed URL validity: 5–10 minutes

6. Support Large File Uploads (Up to 50 GB)

  • Files are uploaded using multipart (chunked) uploads.
  • Clients retry only failed chunks.
  • Upload state tracked via upload sessions.
  • Max file size: 50 GB (network-bound, not server-bound)

7. Cost Efficiency at Scale

  • Exact deduplication stores identical images only once.
  • CDN reduces repeated reads from storage.
  • Lifecycle rules clean up unused data.
  • Storage savings via dedup: Significant (workload-dependent)

Trade-Offs

1. Object Storage vs Database for Image Bytes

Choice: Store image bytes in object storage, not in a database.

Pros

  • Handles very large files efficiently
  • High durability and low cost
  • Scales independently from metadata

Cons

  • No complex querying on image data
  • Requires separate metadata store

Why This Works

Databases are optimized for small, structured data. Object storage is purpose-built for large blobs and is the industry standard for this use case.

2. Content-Based Deduplication (SHA-256)

Choice: Deduplicate images using cryptographic hashes.

Pros

  • Massive storage savings
  • Simple, deterministic duplicate detection
  • Enables safe reference counting

Cons

  • Hash computation adds CPU overhead
  • Only detects exact duplicates (not visually similar images)

Why This Works

Exact deduplication is reliable, fast, and sufficient for most storage optimization needs. Near-duplicate detection can be added later asynchronously.

3. Multipart Uploads vs Single Upload

Choice: Use multipart (chunked) uploads.

Pros

  • Supports very large files (up to 50 GB)
  • Allows resumable uploads
  • Improves user experience and reliability

Cons

  • More complex client logic
  • Requires tracking upload state

Why This Works

Single uploads do not scale for large files and fail badly under unreliable networks. Chunking is the industry-standard solution.

4. Direct-to-Object Storage Uploads

Choice: Clients upload/download directly from object storage using signed URLs.

Pros

  • Very high throughput
  • Backend stays lightweight and scalable
  • Lower infrastructure cost

Cons

  • Less visibility into byte-level progress on backend
  • Requires careful security handling

Why This Works

Keeping application servers out of the data path is critical for performance and cost at scale.

5. Asynchronous Image Processing

Choice: Generate thumbnails and variants asynchronously.

Pros

  • Faster upload completion
  • Better system throughput
  • Easy horizontal scaling

Cons

  • Variants are not immediately available
  • Requires eventual consistency handling

Why This Works

Users care more about upload completion than immediate thumbnails. Async processing optimizes both latency and scale.

6. Two-Phase Deletion

Choice: Soft delete first, hard delete later.

Pros

  • Prevents accidental data loss
  • Works safely with deduplication
  • Enables recovery and auditing

Cons

  • Requires background cleanup jobs
  • Storage freed with a delay

Why This Works

Immediate deletion is dangerous in distributed systems. Two-phase deletion is safer and widely used.

7. Eventual Consistency for Sync

Choice: Use eventual consistency for multi-device synchronization.

Pros

  • High availability and scalability
  • Reduced coordination overhead
  • Better performance under load

Cons

  • Temporary inconsistencies across devices
  • Requires conflict resolution logic

Why This Works

Strong consistency is unnecessary for file sync and would significantly reduce system availability and throughput.

8. Signed URLs as Bearer Tokens

Choice: Use short-lived signed URLs for access control.

Pros

  • Simple and scalable access control
  • Works seamlessly with CDN
  • No backend involvement during download

Cons

  • URLs can be shared while valid
  • Requires short expiration windows

Why This Works

Short-lived URLs significantly reduce risk while enabling high-performance delivery. Additional restrictions can be layered if needed.


Frequently Asked Questions in Interviews

Q. Why do production systems strictly separate binary storage from metadata storage?

Relational and NoSQL databases are optimized for small, mutable records with indexing and transactions. Storing large binaries:

  • Pollutes the buffer cache
  • Increases replication lag
  • Makes backups and restores slow
  • Raises cost per GB significantly

Object storage is optimized for immutable large objects, providing:

  • Multi-AZ replication by default
  • High write throughput
  • Lifecycle policies (cold storage, deletion)
  • No need for manual sharding

The metadata DB stores only pointers (object_key, hash, size) — never raw bytes.

Q. What does a real metadata schema look like?

A minimal but scalable model:

Blob Table (Content-level)

hash (PK)
object_key
size
ref_count
created_at

Image Table (Ownership-level)

image_id (PK)
user_id (indexed)
hash (FK)
visibility / ACL
created_at
deleted_at

This allows:

  • Exact deduplication
  • Independent ownership
  • Safe deletion via reference counting

Q. Why are uploads designed as direct-to-object-storage in real systems?

Because backend servers:

  • Are expensive per byte
  • Are limited by NIC bandwidth
  • Add failure points

In production, backend servers act as a control plane:

  • Issue upload credentials
  • Validate metadata
  • Finalize uploads

All file bytes flow directly from client → object storage.

Q. How are signed uploads implemented technically?

Backend:

  • Initiates multipart upload with object storage
  • Generates signed URLs for each part
  • Returns upload session metadata to client

Client:

  • Uploads parts directly using signed URLs
  • Retries failed parts independently
  • Calls “complete upload” API after all parts succeed

The backend never touches file bytes.
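
As one concrete illustration (the answer above is storage-agnostic), with AWS S3 and boto3 the control-plane steps could look like this; the bucket name is a placeholder:

import boto3

s3 = boto3.client("s3")
BUCKET = "image-blobs"   # placeholder

def start_signed_upload(object_key, num_parts, expires_in=900):
    mpu = s3.create_multipart_upload(Bucket=BUCKET, Key=object_key)
    upload_id = mpu["UploadId"]
    part_urls = [
        s3.generate_presigned_url(
            "upload_part",
            Params={"Bucket": BUCKET, "Key": object_key,
                    "UploadId": upload_id, "PartNumber": n},
            ExpiresIn=expires_in,
        )
        for n in range(1, num_parts + 1)   # S3 part numbers start at 1
    ]
    return upload_id, part_urls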

Q. How is the entire upload workflow made idempotent?

Idempotency is enforced at three layers:

  • The upload session ID uniquely identifies an upload attempt
  • Chunk uploads are keyed by (session_id, part_number)
  • The completion step uses a conditional update:

UPDATE uploads
SET status = COMPLETED
WHERE session_id = X AND status != COMPLETED

Retries are safe at every step.

Q. What happens if object storage succeeds but metadata commit fails?

The upload remains in a COMPLETED_IN_STORAGE but PENDING_METADATA state.

A background reconciler:

  • Scans incomplete uploads
  • Verifies object existence
  • Retries metadata commit
  • Expires uploads past their TTL

No user-visible corruption occurs.

Q. Why is content-addressed storage used instead of IDs?

IDs identify ownership, not content.

Content hashes provide:

  • Deterministic identity
  • Deduplication
  • Integrity verification

Using IDs alone makes deduplication race-prone and expensive.

Q. When and how is the hash computed?

Client computes hash while chunking the file (streaming).
This avoids loading the full file into memory.

Optionally:

  • Backend verifies hash asynchronously for trust
  • Upload path is never blocked on verification

Q. How do you safely deduplicate under concurrent uploads?

Blob creation uses conditional insert:

INSERT INTO blobs (hash, ...)
IF NOT EXISTS

Outcomes:

  • One writer wins
  • Others reuse the existing blob
  • The reference count increment is atomic

No locks, no race conditions.
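
The same idea expressed as a PostgreSQL-style upsert. This is a sketch only: it folds blob creation and the reference-count increment into one atomic statement, assuming the Blob table above and a psycopg2-style connection:

def get_or_create_blob(conn, content_hash, object_key, size):
    with conn.cursor() as cur:
        # Only one concurrent writer creates the row; everyone else hits
        # the conflict branch and bumps ref_count atomically.
        cur.execute(
            """INSERT INTO blobs (hash, object_key, size, ref_count)
               VALUES (%s, %s, %s, 1)
               ON CONFLICT (hash) DO UPDATE
                   SET ref_count = blobs.ref_count + 1
               RETURNING object_key""",
            (content_hash, object_key, size),
        )
        winning_key = cur.fetchone()[0]   # the existing key if we lost the race
    conn.commit()
    return winning_key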

Q. How do you avoid hot-hash contention?

  • Shard the blob table by hash prefix
  • Cache hash existence in Redis
  • Use Bloom filters to skip DB hits on negative lookups

This keeps deduplication fast even for viral content.

Q. Why are multipart uploads mandatory?

Single uploads fail due to:

  • Client timeouts
  • Gateway size limits
  • Network instability

Multipart uploads allow:

  • Parallelism
  • Resume from failure
  • Independent retries per chunk

Q. How is resume implemented without backend state?

  • Object storage tracks uploaded parts.
  • The client queries the uploaded-part list and uploads only the missing chunks.

Backend state is optional — object storage is the source of truth.
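
With S3, for instance, the uploaded-part list can be fetched directly from object storage (boto3 sketch; bucket, key, and upload_id come from the upload session):

import boto3

s3 = boto3.client("s3")

def missing_parts(bucket, key, upload_id, total_parts):
    done = set()
    for page in s3.get_paginator("list_parts").paginate(
            Bucket=bucket, Key=key, UploadId=upload_id):
        for part in page.get("Parts", []):
            done.add(part["PartNumber"])
    # Resume by re-uploading only the parts that never made it.
    return [n for n in range(1, total_parts + 1) if n not in done]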

Q. What happens if an application server crashes mid-request?

  • Nothing breaks. All servers are stateless.
  • Requests retry against another instance.
  • No in-memory state is required for recovery.

Q. How does the system survive AZ or region failures?

  • App servers: multi-AZ autoscaling
  • Metadata DB: replicas + failover
  • Object storage: multi-AZ by default
  • CDN serves cached content during partial outages

Availability degrades gracefully, not catastrophically.

Q. Why is eventual consistency chosen?

Strong consistency requires cross-region coordination, increasing latency and reducing availability.

Eventual consistency:

  • Matches user expectations for file systems
  • Improves availability
  • Enables global scaling

Correctness is preserved at the metadata layer.

Q. How do multiple devices stay in sync?

Devices sync metadata deltas, not binaries:

  • Polling or push notifications
  • Only changed image IDs are fetched
  • Actual images are downloaded lazily

This minimizes bandwidth and latency.

Q. How is access control enforced technically?

  • Buckets are private
  • The backend validates ACLs
  • Signed URLs are scoped to object + operation + expiry

Clients never receive long-lived credentials.

Q. What prevents signed URL abuse?

  • Short expiration (minutes)
  • Single-object scope
  • Optional IP or device binding
  • Read-only vs write-only URLs

Even leaked URLs have a minimal blast radius.

Q. What are the largest cost optimizations in practice?

  • Exact deduplication (storage)
  • CDN caching (egress)
  • Avoiding backend data transfer
  • Lifecycle rules for cold data

These dwarf micro-optimizations.

Q. Why not aggressively compress images?

JPEG/PNG/WebP are already compressed.
Extra compression:

  • Increases CPU cost
  • Adds latency
  • Saves negligible space

Compression is applied selectively, not globally.

Q. What bottleneck appears first at scale?

Metadata write throughput.
Solved via:

  • Sharding
  • Batching
  • Async writes
  • Cache-first lookups

Q. What changes at 10× or 100× scale?

Architecture remains unchanged.
We add:

  • More shards
  • More async workers
  • More regions

No redesign — only capacity expansion.

High-Level Summary

This system allows users to upload, store, and sync images across devices at scale. Images are stored using content-addressed object storage to enable exact deduplication, while metadata drives access control, synchronization, and lifecycle management. Large uploads are handled using multipart uploads, and all heavy processing is done asynchronously to keep latency low.

Feel free to ask questions or share your thoughts — happy to discuss!

