DEV Community

Rishabh Agarwal

How Pinterest uses Kafka for Long-Term Data Storage

I spent hours diving into this so you don’t have to!

Here is what I learned:

  • Pinterest doesn't store all data on Kafka brokers forever.
  • Older data is moved to remote storage such as Amazon S3.
  • They built a tool called Segment Uploader to automate this process.
  • The Segment Uploader periodically transfers older data from Kafka brokers to remote storage.
  • Segment Uploader runs as a sidecar alongside the Kafka broker.
  • They also developed a specialized Consumer Library to fetch data intelligently.
  • The library fetches old data directly from remote storage and new data from Kafka brokers.
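
To make the flow above concrete, here is a minimal sketch of the idea in Python. It is not Pinterest's code: `Broker`, `RemoteStore`, `upload_closed_segments`, `fetch`, and the `keep_local` parameter are all hypothetical names, and both the broker and the remote store are in-memory stand-ins. The sketch assumes the uploader copies closed segments to remote storage, that the broker keeps only the most recent ones, and that the consumer picks a source by comparing the requested offset with the earliest offset still held locally.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Segment:
    """A contiguous run of records starting at base_offset (hypothetical model)."""
    base_offset: int
    records: list[str]
    closed: bool = True  # assumption: only closed segments are eligible for upload

    @property
    def end_offset(self) -> int:
        # Exclusive end offset of this segment.
        return self.base_offset + len(self.records)


@dataclass
class Broker:
    """In-memory stand-in for one partition's local log on a Kafka broker."""
    segments: list[Segment] = field(default_factory=list)


@dataclass
class RemoteStore:
    """In-memory stand-in for remote storage such as Amazon S3."""
    segments: dict[int, Segment] = field(default_factory=dict)


def upload_closed_segments(broker: Broker, remote: RemoteStore, keep_local: int) -> int:
    """Copy closed segments to remote storage, then drop all but the newest
    `keep_local` closed segments from the broker. Returns the number uploaded."""
    closed = [s for s in broker.segments if s.closed]
    uploaded = 0
    for seg in closed:
        if seg.base_offset not in remote.segments:
            remote.segments[seg.base_offset] = seg
            uploaded += 1
    # Assumption: local retention is modeled as "keep the newest N closed segments".
    to_drop = closed[:-keep_local] if keep_local > 0 else closed
    dropped = {s.base_offset for s in to_drop}
    broker.segments = [s for s in broker.segments if s.base_offset not in dropped]
    return uploaded


def fetch(offset: int, broker: Broker, remote: RemoteStore) -> tuple[str, str]:
    """Return (source, record): old offsets come from remote storage,
    offsets the broker still holds come from the broker."""
    earliest_local = min((s.base_offset for s in broker.segments), default=None)
    if earliest_local is not None and offset >= earliest_local:
        source, segments = "broker", broker.segments
    else:
        source, segments = "remote", list(remote.segments.values())
    for seg in segments:
        if seg.base_offset <= offset < seg.end_offset:
            return source, seg.records[offset - seg.base_offset]
    raise KeyError(f"offset {offset} not found in {source}")


# Example: two closed segments plus one active segment that is still being written.
broker = Broker([
    Segment(0, ["a", "b"]),
    Segment(2, ["c", "d"]),
    Segment(4, ["e"], closed=False),
])
remote = RemoteStore()
upload_closed_segments(broker, remote, keep_local=1)

print(fetch(1, broker, remote))  # old offset, served from remote storage
print(fetch(3, broker, remote))  # recent offset, served from the broker
```

In this sketch the consumer never needs to know where a record lives; the routing decision sits in one place, which is roughly the convenience a specialized consumer library is meant to provide.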

By combining Kafka’s real-time capabilities with cost-efficient remote storage, Pinterest ensures scalability, reliability, and efficient long-term data management.


PS - I recently published an article in my free newsletter covering this case study in depth, with visuals: https://designsystemsweekly.substack.com/p/how-pinterest-leverages-kafka-for
