DEV Community

Rishabh Agarwal

How Pinterest uses Kafka for Long-Term Data Storage

I spent hours diving into this so you don’t have to!

Here is what I learned:

  • Pinterest doesn't store all data on Kafka brokers forever.
  • Older data is moved to remote storage such as Amazon S3.
  • They built a tool called Segment Uploader to automate this process.
  • The Segment Uploader periodically transfers older data from Kafka brokers to remote storage.
  • Segment Uploader runs as a sidecar alongside the Kafka broker.
  • They also developed a specialized Consumer Library to fetch data intelligently.
  • The library fetches old data directly from remote storage and new data from Kafka brokers.
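
To make the flow in the list above concrete, here is a minimal toy sketch in Python. It is not Pinterest's implementation: the names `Segment`, `TieredLog`, `upload_closed_segments`, `expire_local`, and `read` are hypothetical, plain in-memory structures stand in for the Kafka broker and for remote storage such as S3, and the routing rule (serve an offset from the broker if it is still there, otherwise from remote storage) is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Segment:
    base_offset: int     # offset of the first record in this segment
    records: List[str]   # record payloads, in offset order


@dataclass
class TieredLog:
    """Toy model of one partition: broker-local segments plus a remote store."""
    local: List[Segment] = field(default_factory=list)
    # Stand-in for remote storage (e.g. an S3 bucket), keyed by base offset.
    remote: Dict[int, Segment] = field(default_factory=dict)

    def upload_closed_segments(self) -> None:
        # Hypothetical uploader step: copy every segment except the last
        # (still active) one to remote storage. Re-running it is harmless.
        for seg in self.local[:-1]:
            self.remote.setdefault(seg.base_offset, seg)

    def expire_local(self, keep_last: int = 1) -> None:
        # Simulated broker retention: drop old local segments, but only
        # those that already exist in remote storage.
        cutoff = max(len(self.local) - keep_last, 0)
        not_uploaded = [s for s in self.local[:cutoff]
                        if s.base_offset not in self.remote]
        self.local = not_uploaded + self.local[cutoff:]

    def read(self, offset: int) -> Tuple[str, str]:
        # Hypothetical consumer routing: try the broker first, then fall
        # back to remote storage for data the broker no longer holds.
        tiers = (("broker", self.local), ("remote", list(self.remote.values())))
        for source, segments in tiers:
            for seg in segments:
                if seg.base_offset <= offset < seg.base_offset + len(seg.records):
                    return source, seg.records[offset - seg.base_offset]
        raise KeyError(f"offset {offset} not found in any tier")


if __name__ == "__main__":
    log = TieredLog(local=[
        Segment(0, ["a", "b"]),
        Segment(2, ["c", "d"]),
        Segment(4, ["e"]),      # active segment, stays on the broker
    ])
    log.upload_closed_segments()
    log.expire_local(keep_last=1)
    for offset in range(5):
        print(offset, log.read(offset))
```

The point of the sketch is that the consumer never needs to know where a record lives: it asks for an offset, and the library decides which tier to read from.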

By combining Kafka’s real-time capabilities with cost-efficient remote storage, Pinterest ensures scalability, reliability, and efficient long-term data management.


PS: I recently published an article in my free newsletter covering this case study in depth, with visuals: https://designsystemsweekly.substack.com/p/how-pinterest-leverages-kafka-for
