The GeekNarrator

Running Distributed Systems like a Pro with Mayank Shrivastava

Hey Everyone, 

In this episode I talk to Mayank Shrivastava, who has extensive experience building and maintaining high-scale distributed systems. He was on the team that originally built Apache Pinot at LinkedIn and is now Head of Core Data Engineering at StarTree.

He shares some great insights from his experience, and there is a lot to learn from our discussion.

We discuss the following:

00:00 Introduction 

04:20 Practices to follow while designing and developing Distributed Systems 

05:47 What do we mean by Solid Scalable Design? How do we approach that? 

09:00 Safety Nets for developing Distributed systems 

10:21 When is the right time to do performance benchmarking? 

17:00 What is release certification? 

21:00 Deploying to Production 

24:45 Example of when Canary Deployment might not be a good strategy

26:00 Example of when Canary Deployment is a good strategy

27:30 Post Deployment - how do we observe our system? 

33:30 How do we avoid on-call (alerting) noise?

42:00 Maintaining a Large scale Distributed system 

47:15 Scaling up/down for stateful systems 

51:30 Handling Failures in Production (Disaster Recovery) 

01:00:30 Runbooks - How do we keep them updated?  

References: 

The GeekNarrator LinkedIn page: https://www.linkedin.com/company/86276626

Kaivalya Apte: https://www.linkedin.com/in/kaivalya-apte-2217221a/

The GeekNarrator website: www.geeknarrator.com

Mayank Shrivastava: https://www.linkedin.com/in/mayankshriv/

StarTree: https://www.startree.ai/

Apache Pinot: https://pinot.apache.org/ 

I hope you enjoy the discussion and learn from it. If you liked my conversation with Mayank, please hit the like button and subscribe to the channel for more content like this.

Cheers, The GeekNarrator
