
Satish Chandra Gupta

Originally published at ml4devs.com

Architecture for High-Throughput Low-Latency Big Data Pipeline on Cloud

Scalable and efficient data pipelines are as important for the success of analytics and ML as reliable supply lines are for winning a war.


When deploying big-data analytics, data science, and machine learning (ML) applications in the real world, analytics tuning and model training account for only around 25% of the work. Approximately 50% of the effort goes into making data ready for analytics and ML. The remaining 25% goes into making insights and model inferences easily consumable at scale. The data pipeline puts it all together. It is the railroad on which the heavy and marvelous wagons of ML run. Long-term success depends on getting the data pipeline right.

This article gives an introduction to the data pipeline and an overview of architecture alternatives.

