DEV Community

M TOQEER ZIA
Why Big Tech Doesn't Always Use REST: The Evolution of API Architecture at Scale

Essential Terms Defined

Before diving in, here are the critical concepts you'll see throughout this article:

  • REST (Representational State Transfer): A simple, stateless architecture using HTTP methods (GET, POST, PUT, DELETE) where every request is independent. The most common API pattern on the web.

  • GraphQL: A query language that lets clients request EXACTLY the data they need in a single request, eliminating over-fetching (getting too much data) and under-fetching (needing multiple requests).

  • gRPC: A high-performance remote procedure call (RPC) framework open-sourced by Google. It exchanges binary protocol buffers instead of JSON text and runs over HTTP/2, which enables streaming and request multiplexing.

  • Protocol Buffers: A binary serialization format developed by Google; payloads are typically several times smaller than equivalent JSON/XML and much faster to encode and decode.

  • WebSockets: A persistent, bidirectional connection between client and server that stays open for instant, real-time communication (unlike HTTP's request-response model).

  • Event-Driven Architecture: Systems where components communicate by publishing and subscribing to events rather than making direct calls.

  • Webhooks: "Reverse APIs" where the server pushes data to your app when something happens, instead of you constantly checking for updates (polling).

  • Microservices: Breaking a large application into independent, loosely-coupled services that communicate over the network.

  • Stateless vs Stateful: Stateless = server doesn't remember you (each request is independent). Stateful = server maintains connection/session history.


The Problem: Why REST Alone Isn't Enough at Scale

REST is the Default, But It Has Real Limitations

  • REST works great for simple operations but breaks down when you need:

    • Real-time communication (stock prices updating as they change)
    • Complex data needs (one call over-fetches some fields yet still forces 5+ follow-up calls for related data)
    • Massive scale with millions of concurrent connections
    • Extremely fast, low-latency interactions (financial trading, multiplayer gaming)
  • The efficiency problem: REST sends data as human-readable JSON/XML, which is great for understanding but wasteful for performance

    • A single REST request might return a 50KB response when you only need 2KB of data
    • Netflix discovered this cost them millions in bandwidth
  • The flexibility problem: REST forces a fixed data structure

    • Mobile apps get the same response as web apps (wasted data on mobile)
    • Frontend teams must coordinate with backend teams constantly
  • The real-time problem: REST requires polling

    • Your app asks "Got anything new?" every 30 seconds
    • This creates unnecessary server load and delays information delivery
    • Imagine checking your mailbox manually every minute instead of getting notified
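
The mailbox analogy maps onto simple arithmetic. This sketch makes the polling cost concrete; the user count and event rate are illustrative assumptions, not measured figures:

```python
def polling_requests_per_day(users: int, interval_seconds: int) -> int:
    """Total daily requests if every user polls on a fixed interval."""
    seconds_per_day = 86_400
    return users * (seconds_per_day // interval_seconds)

# 1 million users asking "anything new?" every 30 seconds:
polling_load = polling_requests_per_day(1_000_000, 30)   # 2,880,000,000 requests/day
# Push alternative: one message per actual event; assume 50 events per user per day
push_load = 1_000_000 * 50                               # 50,000,000 messages/day
print(f"Polling: {polling_load:,} requests/day")
print(f"Push:    {push_load:,} messages/day")
```

Under these assumptions, push delivery cuts traffic by roughly 50x, and most polling responses were empty anyway.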

Real-World Case: Why Big Tech Made the Switch

Case Study 1: Netflix's REST Bottleneck (2010-2012)

  • The Problem: Netflix was serving the same API response to TVs, tablets, smartphones, and browsers

    • A TV doesn't need compact JSON; it can display rich data
    • A phone on 4G needed lean, minimal responses
    • Their REST API returned massive payloads for every request
  • The Impact:

    • Mobile users experienced slow load times
    • Bandwidth costs skyrocketed (Netflix streams video anyway, but API overhead compounds)
    • Netflix mobile adoption plateaued
  • The Solution: Netflix adopted a hybrid approach

    • Maintained REST for basic operations
    • Implemented GraphQL-like query patterns internally (before public GraphQL)
    • Now runs 70+ federated GraphQL services handling billions of requests daily
    • This single change reduced API payload sizes by 60-80% on mobile

Case Study 2: Facebook's Real-Time Challenge (2009-2011)

  • The Problem: Facebook needed to notify users instantly when:

    • A friend posted something
    • Someone liked their post
    • A message arrived
    • Their status changed
  • REST polling approach didn't work:

    • Checking every 5 seconds = 12 requests per minute × millions of users = server meltdown
    • Checking every 30 seconds = user gets notified after 30 seconds maximum (poor experience)
  • The Solution: Facebook built a real-time messaging system

    • Shifted to WebSocket-based push notifications
    • Server pushes updates the instant they happen
    • Reduced API calls from 12/minute to 1 persistent connection
    • Later invested heavily in event-driven systems

Case Study 3: Google's Microservices Latency Crisis

  • The Problem: Google's internal services communicated via REST

    • A search request triggered 100+ internal API calls
    • Each REST call = HTTP overhead, serialization/deserialization delay
    • Total latency: 100+ milliseconds
    • Users notice anything over 200ms; this was unacceptable
  • The Solution: Google developed gRPC

    • Binary protocol (protocol buffers) instead of text JSON
    • HTTP/2 multiplexing (multiple requests on one connection)
    • Reduced latency from 100ms to 10ms per operation
    • Cut bandwidth usage by 60%
  • Impact: gRPC now powers YouTube, Google Cloud, and countless internal services


The Comparison: REST vs. The Alternatives

When Big Tech Chooses GraphQL Over REST

Why GraphQL Wins:

  • Problem it solves: Eliminate over-fetching and under-fetching

    • REST: "/users/123" returns all user data (over-fetching), then you need "/users/123/posts" (under-fetching)
    • GraphQL: { user(id: 123) { name, email, posts { title } } } — one request, exact data
  • Real-world example:

    • GitHub's API returns massive response objects with REST
    • Their GraphQL API lets clients request only what they need
    • Result: Faster frontend apps, reduced server load
  • Developer experience:

    • GraphQL APIs auto-document themselves
    • Developers can explore queries without asking backend teams
    • Fewer API version conflicts
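
A toy resolver makes the contrast concrete: the client names the fields, and the server walks only those. This is a plain-Python simplification with invented data and selection format, not a real GraphQL implementation:

```python
USER_DB = {
    123: {"name": "Alice", "email": "alice@example.com",
          "phone": "555-1234", "address": "1 Main St",
          "posts": [{"title": "Post 1", "content": "...", "likes": 42}]},
}

def resolve(record: dict, selection: dict) -> dict:
    """Return only the fields named in `selection`; nested dicts recurse."""
    out = {}
    for field, sub in selection.items():
        value = record[field]
        if sub and isinstance(value, list):       # nested selection over a list
            out[field] = [resolve(item, sub) for item in value]
        elif sub and isinstance(value, dict):
            out[field] = resolve(value, sub)
        else:
            out[field] = value
    return out

# Equivalent of: { user(id: 123) { name email posts { title } } }
result = resolve(USER_DB[123], {"name": None, "email": None,
                                "posts": {"title": None}})
# → {'name': 'Alice', 'email': 'alice@example.com', 'posts': [{'title': 'Post 1'}]}
```

The phone, address, content, and likes fields never leave the server, which is exactly the over-fetching fix GraphQL provides.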

When NOT to Use GraphQL:

  • Simple CRUD apps where REST is already perfect
  • Public APIs where clients might misuse complex queries
  • Very high throughput scenarios (GraphQL query parsing overhead can add latency)

When Big Tech Chooses gRPC Over REST

Why gRPC Wins:

  • Performance: payloads roughly 10x smaller, calls several times faster

    • Netflix uses gRPC internally for microservices
    • Trading and exchange infrastructure leans on gRPC-style binary RPC where every millisecond counts
  • Streaming support: Four communication patterns

    • Simple request-response (like REST)
    • Server streaming (continuous updates pushed to client)
    • Client streaming (client uploads continuous data)
    • Bidirectional streaming (both sides talk simultaneously)
  • Illustrative use case: a streaming session (the video bytes themselves travel over HTTP; control and telemetry channels fit these patterns)

    • The server pushes playback updates to your device (server streaming)
    • Your device reports bandwidth samples upstream (client streaming)
    • Codec negotiation happens in both directions at once (bidirectional streaming)
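
The four call shapes map naturally onto plain functions and generators. The sketch below is only a conceptual analogy, with no network and no grpc library, and all data invented:

```python
from typing import Iterator

# Unary: one request in, one response out (like REST)
def get_price(symbol: str) -> float:
    return {"NFLX": 430.0}.get(symbol, 0.0)

# Server streaming: one request, a stream of responses
def watch_price(symbol: str) -> Iterator[float]:
    for tick in (430.0, 430.5, 429.8):   # stand-in for live ticks
        yield tick

# Client streaming: a stream of requests, one response
def upload_metrics(samples: Iterator[float]) -> float:
    data = list(samples)
    return sum(data) / len(data)         # e.g. average reported bandwidth

# Bidirectional streaming: a stream in, a stream out, interleaved
def negotiate(offers: Iterator[str]) -> Iterator[str]:
    for codec in offers:
        yield f"ack:{codec}"

ticks = list(watch_price("NFLX"))            # [430.0, 430.5, 429.8]
acks = list(negotiate(iter(["h264", "av1"])))  # ['ack:h264', 'ack:av1']
```

In real gRPC these signatures are generated from a .proto service definition, and the streams run over a single multiplexed HTTP/2 connection.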

When NOT to Use gRPC:

  • Public-facing APIs (requires specialized gRPC clients, not just HTTP browsers)
  • Simple applications with low traffic
  • When debugging matters (binary format is harder to inspect than JSON)

When Big Tech Chooses WebSockets Over REST

Why WebSockets Win:

  • Persistent connection: Once connected, both sides can talk anytime

    • REST: client calls, server responds, connection closes
    • WebSocket: connection stays open indefinitely
  • Real-world examples:

    • Discord: Uses WebSockets for instant chat messaging
    • Zoom: session signaling and state sync (the audio/video media itself rides on specialized transports)
    • Twitch: Live streaming feed updates to thousands of viewers
    • Stock trading apps: Real-time price updates tick-by-tick
  • Performance benefit:

    • Eliminates per-message handshake overhead (the TCP/TLS handshake happens once, at connection time)
    • No need for polling or server-sent events
    • Latency < 100ms for human interaction (vs. 500ms+ with polling)

When NOT to Use WebSockets:

  • One-time requests (overkill to maintain a connection)
  • Situations requiring horizontal scaling across multiple servers (connections are stateful, so load balancing and failover need extra work)
  • Simple read-only data fetching

When Big Tech Chooses Event-Driven Systems Over REST

Why Event-Driven Wins:

  • Decoupling: Services don't need to know about each other

    • REST: Service A calls Service B directly (tight coupling)
    • Event-Driven: Service A publishes event; Service C, D, E, F listen (loose coupling)
  • Scalability at massive scale:

    • Amazon processes millions of orders per day, with extreme peaks on events like Prime Day
    • Order service publishes "OrderPlaced" event
    • Payment service, shipping service, notification service, analytics service all subscribe
    • If shipping service goes down, order system still works
  • Real-world examples:

    • Amazon: Event-driven order processing
    • Uber: Driver location updates as events; matching algorithm subscribes
    • YouTube: Video upload triggers encoding service, thumbnail generation, recommendation algorithm
    • Stripe: Payment events trigger revenue reporting, fraud detection, customer notifications
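
That publish/subscribe decoupling fits in a few lines. This in-memory bus is a minimal stand-in for a real broker such as Kafka or SQS, with service names and payloads invented for illustration:

```python
from collections import defaultdict
from typing import Callable

class EventBus:
    """Minimal in-memory pub/sub: publishers and subscribers never meet."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict):
        for handler in self._subscribers[event_type]:
            handler(payload)   # a real broker delivers asynchronously and durably

bus = EventBus()
log = []
bus.subscribe("OrderPlaced", lambda e: log.append(f"payment for {e['order_id']}"))
bus.subscribe("OrderPlaced", lambda e: log.append(f"shipping for {e['order_id']}"))
bus.publish("OrderPlaced", {"order_id": 42})
# Both handlers ran, yet the publisher never called either service directly
```

Adding a fifth subscriber (say, analytics) requires no change to the order-placing code, which is the loose coupling the bullets above describe.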

When NOT to Use Event-Driven:

  • Simple synchronous operations (e.g., "transfer money from account A to B")
  • Systems requiring immediate confirmation (events are async, responses might be delayed)
  • Teams unfamiliar with distributed systems complexity

Webhooks: The "Reverse API" Pattern

Why Big Tech Uses Webhooks:

  • Eliminate polling: Instead of "anything new?", the server says "here's your update"

    • Stripe: Webhooks fire when payment succeeds, fails, or is disputed
    • GitHub: Webhooks trigger when code is pushed, PR is created, or issue is closed
    • Shopify: Webhooks send order updates instantly
  • Automation at scale:

    • Stripe sends webhook → your system updates invoice → customer gets emailed
    • One webhook payload can deliver data that would take 10 REST calls to gather

Real-world incident:

  • Companies without webhooks were polling payment services every 5 seconds
  • Led to their applications getting rate-limited or banned
  • Switching to webhooks eliminated the problem instantly
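
One practical detail worth pairing with this: a webhook endpoint should verify that a payload really came from the provider. Providers such as Stripe and GitHub sign each delivery with a shared secret; the sketch below shows the general HMAC pattern (the secret, body, and header handling here are illustrative, and real providers add timestamps to prevent replays):

```python
import hashlib
import hmac

def sign_payload(secret: bytes, payload: bytes) -> str:
    """What the provider computes and sends in a signature header."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()

def verify_webhook(secret: bytes, payload: bytes, signature: str) -> bool:
    """What your endpoint checks before trusting the payload."""
    expected = sign_payload(secret, payload)
    return hmac.compare_digest(expected, signature)  # constant-time compare

secret = b"whsec_example_secret"     # shared out-of-band with the provider
body = b'{"event": "payment.succeeded", "amount": 1999}'
signature = sign_payload(secret, body)          # arrives in a request header
accepted = verify_webhook(secret, body, signature)                    # True
rejected = verify_webhook(secret, b'{"amount": 999999}', signature)   # False
```

Without this check, anyone who discovers the endpoint URL can forge "payment succeeded" events.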

Pros & Cons: The Decision Matrix

REST

Pros:

  • Simple, well-understood, universally supported
  • Works great for CRUD operations
  • Stateless = easy horizontal scaling
  • No learning curve (developers know HTTP)
  • Debug easily (just use curl or browser)

Cons:

  • Over-fetching / under-fetching problems
  • Requires polling for real-time data (inefficient)
  • Stateless = can't maintain persistent connections
  • Multiple calls for complex data relationships
  • Not ideal for very high-frequency operations (latency)

Cost per request: ~50KB-500KB JSON payload


GraphQL

Pros:

  • Query exactly what you need (no over/under-fetching)
  • Self-documenting API (schema serves as documentation)
  • Single endpoint (simpler than REST's multiple endpoints)
  • Reduces API versioning headaches

Cons:

  • Query complexity allows users to request huge datasets (DoS vector)
  • Caching is harder than REST (everything goes to same endpoint)
  • Query parsing overhead (small latency impact)
  • Learning curve (developers new to GraphQL concepts)
  • Not ideal for simple, static data

Cost per request: ~5KB-50KB payload (60-90% reduction vs REST)


gRPC

Pros:

  • Extremely fast (binary, HTTP/2, multiplexing)
  • Streaming support (bidirectional communication)
  • Ideal for microservices communication
  • Language-agnostic (use Go, Python, Java, Node.js)

Cons:

  • Binary format (harder to debug without tools)
  • Not browser-friendly (needs specialized client)
  • Steep learning curve
  • Not good for public APIs

Cost per request: ~1KB-10KB payload (90% reduction vs REST)


WebSockets

Pros:

  • Persistent connection (no reconnection overhead)
  • True bidirectional communication
  • Extremely low latency (<100ms)
  • Ideal for real-time applications

Cons:

  • Stateful (complicates horizontal scaling)
  • Server needs to maintain connection state
  • More complex debugging
  • Firewall/proxy complications possible
  • Overkill for request-response patterns

Cost per message: ~100 bytes (plus the ongoing cost of holding each connection open)


Event-Driven Systems

Pros:

  • Loose coupling (services are independent)
  • Scales to massive throughput
  • Asynchronous (doesn't block)
  • Highly resilient (failures don't cascade)

Cons:

  • Distributed systems complexity (harder to debug)
  • Eventual consistency (updates take time to propagate)
  • More infrastructure (message brokers, queues)
  • Requires event schema management

Cost per request: ~200 bytes-10KB payload (depends on event size)


Visual Diagrams: Request Flows Explained

REST vs. GraphQL Request Flow

=== REST APPROACH (Inefficient) ===

Request 1: GET /users/123
Response: { id: 123, name: "Alice", email: "alice@example.com", phone: "555-1234", address: "..." }
          ^ You only wanted name & email, wasted bandwidth

Request 2: GET /users/123/posts
Response: [{ id: 1, title: "Post 1", content: "...", likes: 42, comments: [...] }]
          ^ You only wanted post titles, again wasted

Total Requests: 2+
Total Payload: ~50KB


=== GraphQL APPROACH (Efficient) ===

Request 1:
query {
  user(id: 123) {
    name
    email
    posts {
      title
    }
  }
}

Response: { user: { name: "Alice", email: "alice@example.com", posts: [{ title: "Post 1" }] } }
          ^ Exactly what you asked for, nothing more

Total Requests: 1
Total Payload: ~2KB

gRPC vs. REST Communication

=== REST (Multiple Round Trips) ===

Time ──────────────────────────────────────────────────
  │
  0ms  ├─ TCP Handshake (3 messages)
       ├─ TLS Handshake (if HTTPS)
  20ms ├─ Send request
       ├─ Receive response
  40ms ├─ Connection closes
  ║
  50ms ├─ TCP Handshake (request 2)
       ├─ TLS Handshake
       ├─ Send request
  70ms ├─ Receive response
       ├─ Connection closes
  ║
  100ms ├─ (Repeat for request 3...)

Total latency: 100ms+ for 3 sequential requests


=== gRPC (Single Connection, Multiplexing) ===

Time ──────────────────────────────────────────────────
  │
  0ms  ├─ TCP Handshake (once)
       ├─ TLS Handshake (once)
  10ms ├─ Send requests 1, 2, 3 simultaneously (HTTP/2)
       ├─ Receive responses 1, 2, 3 simultaneously
  25ms ├─ Done

Total latency: 25ms for 3 concurrent requests
(75% faster, same data)

Event-Driven Architecture

=== Traditional REST (Tight Coupling) ===

OrderService
    ├─ Calls → PaymentService (wait for response)
    ├─ Calls → ShippingService (wait for response)
    ├─ Calls → NotificationService (wait for response)
    └─ Calls → AnalyticsService (wait for response)

Problem: If PaymentService is down, OrderService fails


=== Event-Driven (Loose Coupling) ===

OrderService publishes event:
  ├─ "OrderPlaced" event

Event Subscribers (Independent):
  ├─ PaymentService (listens, processes payment)
  ├─ ShippingService (listens, schedules pickup)
  ├─ NotificationService (listens, sends email)
  └─ AnalyticsService (listens, records metric)

Benefit: If PaymentService crashes, order still exists and other services process normally
         (payment can retry later)

WebSocket vs. REST Real-Time Updates

=== REST Polling (Inefficient) ===

Client: "Anything new?" → Server (every 5 seconds)
  │ Response: No
  │ (5 seconds pass)
  │ "Anything new?" → Server
  │ Response: No
  │ (5 seconds pass)
  │ "Anything new?" → Server
  │ Response: Yes! New notification
  └─ Total delay: 5 seconds (on average 2.5 seconds)

Cost: 1 request every 5 seconds × millions of users = insane server load


=== WebSocket (Real-Time) ===

Client: [connection open]
  ←→ (Persistent bidirectional connection)
  ←→ (Server sends notification instantly when it happens)
  └─ Delay: <100ms

Cost: 1 persistent connection per user (more efficient than polling)

Big Tech's Decision Flowchart

START: Choosing an API Pattern
  │
  ├─ Is this a simple CRUD app? → YES → Use REST ✓
  │                              → NO → Continue
  │
  ├─ Do clients need different data subsets? → YES → Use GraphQL ✓
  │                                           → NO → Continue
  │
  ├─ Is this internal microservices communication? → YES → Use gRPC ✓
  │                                                → NO → Continue
  │
  ├─ Do you need real-time bidirectional comms? → YES → Use WebSockets ✓
  │                                              → NO → Continue
  │
  ├─ Do you need instant async notifications? → YES → Use Webhooks ✓
  │                                           → NO → Continue
  │
  ├─ Are you at massive scale with async patterns? → YES → Use Event-Driven ✓
  │                                                → NO → Continue
  │
  └─ DEFAULT → Use REST (it works for most cases)
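
The flowchart translates directly into code, which is handy for sanity-checking the order of the questions. The boolean flags are invented simplifications of real requirements:

```python
def choose_api_pattern(*, simple_crud=False, varied_data_needs=False,
                       internal_microservices=False, realtime_bidirectional=False,
                       async_notifications=False, massive_async_scale=False) -> str:
    """Walk the decision flowchart top to bottom; the first match wins."""
    if simple_crud:
        return "REST"
    if varied_data_needs:
        return "GraphQL"
    if internal_microservices:
        return "gRPC"
    if realtime_bidirectional:
        return "WebSockets"
    if async_notifications:
        return "Webhooks"
    if massive_async_scale:
        return "Event-Driven"
    return "REST"  # default: it works for most cases

print(choose_api_pattern(internal_microservices=True))  # gRPC
print(choose_api_pattern())                             # REST
```

Note that ordering matters: a simple CRUD app short-circuits to REST even if later flags are also true, matching the flowchart's top-down reading.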

Performance Metrics: Real Numbers from Big Tech

| Metric | REST | GraphQL | gRPC | WebSocket |
| --- | --- | --- | --- | --- |
| Avg response size | 150KB | 20KB | 2KB | 500 bytes |
| Latency (request-response) | 150ms | 120ms | 30ms | 5ms (persistent) |
| Bandwidth savings vs. REST | baseline | 85% | 98% | 95% |
| Comfortable scale | 1000s of users | 10,000s | 100,000s | 100,000s |

(Other reference points from earlier in the article: Netflix's internal GraphQL and gRPC layers each handle billions of calls per day; Stripe delivers millions of webhook events per day.)

The Real-World Trade-Offs Every Engineer Faces

Consistency vs. Simplicity

  • REST: Simple, everyone understands it, but doesn't solve modern problems
  • GraphQL: Solves data fetching, but adds complexity
  • gRPC: Solves performance, but needs specialists
  • Event-Driven: Solves scale, but now you need distributed systems experts

Big Tech's approach: Use REST as the baseline, then add specialized solutions where needed


Development Speed vs. Runtime Efficiency

  • REST: Developers are fast (everyone knows it), but production is inefficient
  • GraphQL: Moderate dev speed (learning curve), production is better
  • gRPC: Slower initial development (code generation), but production is excellent

Example: Netflix chose GraphQL because the efficiency gains ($millions in saved bandwidth) outweighed the development complexity


Debugging Simplicity vs. Performance

  • REST: Debug with curl, easy to inspect
  • GraphQL: Debug with Apollo DevTools, still readable
  • gRPC: Debug with special tools (harder), but performance justifies it

Big Tech's principle: Invest in tooling (e.g., gRPC debugging dashboards) rather than sacrifice performance for debugging ease


The Painful Lessons: When Big Tech Got It Wrong

Lesson 1: Twitter's Fail Whale Era (2009-2010)

  • What happened: Twitter stuck with REST as they scaled

    • Each tweet retrieval required loading friend relationships (REST call)
    • Then loading each friend's tweets (more REST calls)
    • Timeline page = 100+ REST calls
  • Impact: Servers crashed during peak hours (the famous "Fail Whale" error)

  • The fix:

    • Moved to real-time event streaming
    • Introduced caching layers
    • Later adopted message queuing systems
  • Lesson learned: REST doesn't scale linearly; architecture must evolve with demand


Lesson 2: Uber's Dispatch Latency Problem (2012-2013)

  • What happened: Uber used REST to notify drivers of new ride requests

    • Driver polls every 5 seconds: "Got a ride for me?"
    • Average delay: 2.5 seconds to see a ride
    • If multiple drivers polling simultaneously = massive load
  • Impact:

    • Poor driver experience (other apps had faster notifications)
    • Server scalability nightmare
  • The fix:

    • Adopted WebSocket + event streaming
    • Driver notifications now push instantly
    • Reduced notification delay from 2.5s to <500ms
  • Lesson learned: Polling doesn't work at scale; go real-time or go home


Lesson 3: Amazon's Microservices Brittleness (2008-2009)

  • What happened: Early Amazon services called each other with REST

    • OrderService → InventoryService → WarehouseService (3+ sync REST calls)
    • If InventoryService was slow, entire checkout slowed
  • Impact:

    • High latency during peak hours (Prime Day)
    • Cascading failures (one slow service breaks others)
  • The fix:

    • Shifted to event-driven systems
    • Order service publishes event; inventory service consumes async
    • Failures don't cascade anymore
  • Lesson learned: Sync REST calls are brittle; use async + events for resilience


When You Should STILL Use REST (Yes, It's Still Valid!)

REST is Perfect For:

  • Simple CRUD applications: Todo apps, note-taking apps, basic blogs
  • Public-facing APIs: Developers expect REST; it's the standard
  • One-off integrations: Quick API for a third-party service
  • Mobile apps with offline support: REST is simpler to cache locally
  • Teams new to distributed systems: Lower complexity = fewer bugs

REST + Optimization Examples:

Option 1: REST with query parameters (partially addresses issues)

GET /users/123?fields=name,email,posts.title

You specify what you want, reducing payload
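
One way a server might honor that fields parameter is a small projection helper. The dot-path semantics (posts.title selecting a sub-field inside list items) are an invented simplification:

```python
def project(record: dict, fields: str) -> dict:
    """Apply a comma-separated ?fields= value: keep only the named fields.
    A dot path like posts.title keeps one sub-field inside a list of dicts."""
    out = {}
    for path in fields.split(","):
        key, _, subkey = path.partition(".")
        value = record[key]
        if subkey and isinstance(value, list):
            value = [{subkey: item[subkey]} for item in value]
        out[key] = value
    return out

user = {"name": "Alice", "email": "a@example.com", "phone": "555-1234",
        "posts": [{"title": "Post 1", "content": "...", "likes": 42}]}
slim = project(user, "name,email,posts.title")
# slim == {'name': 'Alice', 'email': 'a@example.com', 'posts': [{'title': 'Post 1'}]}
```

This recovers much of GraphQL's payload benefit while keeping plain REST URLs, caching, and tooling.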

Option 2: REST with caching

GET /users/123
Cache-Control: max-age=3600  (cache for 1 hour)
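
Cache-Control pairs well with ETag revalidation: the server hashes the response body, the client echoes the tag back in If-None-Match, and an unchanged resource costs only an empty 304. A framework-free sketch of that exchange (the hash choice and names are illustrative):

```python
import hashlib
import json

def make_etag(body: bytes) -> str:
    """Derive a validator token from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def handle_get(resource: dict, if_none_match=None):
    """Return (status, headers, body) for a possibly-conditional GET."""
    body = json.dumps(resource, sort_keys=True).encode()
    headers = {"ETag": make_etag(body), "Cache-Control": "max-age=3600"}
    if if_none_match == headers["ETag"]:
        return 304, headers, b""   # unchanged: client reuses its cached copy
    return 200, headers, body

user = {"id": 123, "name": "Alice"}
status, headers, body = handle_get(user)                 # first fetch: 200 + body
status2, _, body2 = handle_get(user, headers["ETag"])    # revalidation: 304, empty
```

The second round trip still happens, but its response carries no payload, which is often the bulk of the bandwidth cost.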

Option 3: REST with webhooks hybrid

REST for static data
Webhooks for real-time events

The Stack Modern Big Tech Actually Uses

Netflix Tech Stack

  • REST: Public API (third-party developers expect it)
  • GraphQL: Internal tools, mobile apps (unified data layer)
  • gRPC: Microservices communication (performance critical)
  • Kafka (Event Streaming): Order processing, notifications
  • WebSockets: Real-time UI updates (playback sync)

Google Tech Stack

  • REST: Firebase, Cloud APIs (public-facing)
  • gRPC: YouTube, Google Cloud services (internal)
  • Pub/Sub (Event System): YouTube notifications, real-time features
  • WebRTC: Google Meet, video features

Uber Tech Stack

  • REST: Driver/rider mobile apps
  • WebSockets: Real-time driver location
  • Kafka (Events): Ride events, payment processing
  • gRPC: Internal microservices

Key Takeaways: The Framework for Decision-Making

| Use Case | Best API | Why |
| --- | --- | --- |
| Mobile apps fetching data | REST or GraphQL | Simple, cacheable, well-supported |
| Real-time chat/gaming | WebSockets | Low latency, persistent connection |
| Internal microservices | gRPC | Speed, streaming, language-agnostic |
| Event notifications | Webhooks | Push-based, eliminates polling |
| Complex async workflows | Event-Driven | Decoupled, scalable, resilient |
| Public third-party APIs | REST | Standard, expected, easy to use |
| Multiple data needs (mobile/web) | GraphQL | Clients get what they need, not more |

Further Reading & Deep Dives

REST & HTTP

  • Stateless architecture principles
  • HTTP/1.1 vs HTTP/2 differences
  • REST maturity levels (Richardson Maturity Model)
  • Caching strategies (ETags, Cache-Control headers)

GraphQL

  • Query language fundamentals
  • Schema design patterns
  • N+1 query problem solutions
  • DataLoader batching techniques

gRPC

  • Protocol buffers serialization
  • HTTP/2 multiplexing and flow control
  • Load balancing in gRPC
  • Deadline and timeout patterns

Real-Time Communication

  • WebSocket protocol details
  • Socket.IO and alternatives
  • WebRTC P2P connections
  • STUN and TURN server roles

Event-Driven Systems

  • Event sourcing patterns
  • Message queues (RabbitMQ, Kafka, SQS)
  • Distributed transaction patterns (Saga pattern)
  • Event schema versioning

Scalability

  • Microservices architecture
  • Service mesh (Istio, Linkerd)
  • Circuit breaker patterns
  • Distributed tracing (Jaeger, DataDog)

Conclusion: REST is Dead (Long Live APIs)

The Reality:

  • REST isn't dead — it's just not the answer to every problem
  • Big Tech keeps REST where it is sufficient and reaches for specialized solutions where it is not
  • The key is matching the tool to the problem

The Framework:

  • Start with REST (simple, proven)
  • Optimize with GraphQL if data fetching is a bottleneck
  • Scale internal systems with gRPC
  • Add real-time features with WebSockets
  • Reach massive scale with event-driven systems

The Outcome:

  • Netflix saves millions in bandwidth with GraphQL
  • Google powers YouTube with gRPC
  • Uber delivers rides faster with WebSockets
  • Amazon handles trillions of requests with events
  • Stripe processes payments reliably with webhooks

Your Action:

  1. Audit your current APIs — are they solving the right problems?
  2. Identify bottlenecks — latency, bandwidth, scaling limits?
  3. Choose the right tool — don't use REST everywhere
  4. Invest in tooling — make the complex systems easy to debug
  5. Document decisions — help your team understand why

The future of APIs isn't about one tool; it's about using the right tool for the right job at the right time.
