cover photo by CEphoto
Several concepts are key parts of the developer experience. In general we learn that real-world developers do not build abstract 'programs' or conceptual 'networks' but rather real services with actual connections between layers. In coding boot camps and practical academic programs we often build with concepts like CI/CD, test-driven development and (practically always) change management.
I'd like to suggest that to this list of key concepts we should add data pipelines. A quick definition:
Data pipelines are the arteries of any modern data infrastructure. Their purpose is pretty simple: they are implemented and deployed to copy or move data from “System A” to “System B.”
To be a bit more formal (and abstract enough to justify our titles as engineers), a data pipeline is a process responsible for replicating the state from “System A” to “System B.”
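To make "replicating the state" concrete, here is a minimal sketch in Python. It assumes both systems can be modeled as in-memory dictionaries keyed by record ID; the `replicate` function and the sample records are hypothetical illustrations, not how any real pipeline tool is implemented.

```python
from typing import Any, Dict

Record = Dict[str, Any]


def replicate(source: Dict[str, Record], destination: Dict[str, Record]) -> Dict[str, int]:
    """Make `destination` mirror `source` and report how many changes were applied."""
    stats = {"upserted": 0, "deleted": 0}

    # Copy over every record that is missing or out of date in the destination.
    for key, record in source.items():
        if destination.get(key) != record:
            destination[key] = dict(record)  # shallow copy so the systems stay independent
            stats["upserted"] += 1

    # Remove records that no longer exist in the source.
    for key in list(destination.keys() - source.keys()):
        del destination[key]
        stats["deleted"] += 1

    return stats


# Hypothetical sample data: "System A" is the source of truth, "System B" has drifted.
system_a = {
    "u1": {"email": "a@example.com", "plan": "pro"},
    "u2": {"email": "b@example.com", "plan": "free"},
}
system_b = {
    "u1": {"email": "a@example.com", "plan": "free"},
    "u3": {"email": "c@example.com", "plan": "free"},
}

first_run = replicate(system_a, system_b)
second_run = replicate(system_a, system_b)  # nothing left to do once the states match
```

After the first run, `system_b` holds the same records as `system_a`, and the second run applies no changes. A real pipeline also has to deal with networks, schemas, retries, and scale, but the core job is this same state replication.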
Am I biased because these pipelines are what RudderStack is built to help with? Okay, very possibly! But the fact is that the question of 'how does this event/record/information get from our data warehouse over here to our sales/marketing/analytics system' is one that I've dealt with at every single dev role, contract job, startup, and enterprise team I've worked on.
You can see questions of data pipelines in so many job listings: those requests for a SQL pro who understands regex and the Stripe API are actually requests for someone who understands data pipelines.
If you want to read more about pipelines in general, check out this post by Kostas on the subject.