cover image by Gillfoto
While a huge part of any Data Engineer's job is building and managing their company's own data lake, more and more we're tasked with moving data from one remote repository to another. This 'data pipeline' concept is growing in the same way that SaaS tools for web applications are coming to swallow self-hosted web apps.
Data Engineers are being asked to build integrations out of their data warehouses to many of those same SaaS tools (e.g. warehouse -> CRM, warehouse -> Marketing Automation), because the ability to warehouse data and then apply and automate data modeling on top of it has expanded greatly. Snowflake and, more generally, the separation of compute and storage in cloud data warehouses have made it extremely inexpensive to store a lot of data that you don't access frequently.
Is the warehouse doomed? Don't count on it, but I'm sure that data pipelines are part of everyone's future.