Snowflake Migration Series — Lesson 1: The Power of a Strategic MVP: Your First Slice of Value
Empower Your Snowflake Migration Through Battle-Tested Lessons
tl;dr
This series is a follow-up to my migration talk this year. After fielding a recurring set of questions from clients and at our Toronto Snowflake User Group, I decided to turn the talk into a series of articles, one for each lesson.
Before we get into Lesson 1, let's set the stage. Throughout this series, I'll follow a retail corporation (call it RetailCorp) on its journey through a large data migration to Snowflake.
The Challenge
When RetailCorp started exploring a migration to Snowflake, the potential advantages looked compelling:
- Real scalability, absorbing peak holiday traffic without outages.
- Cost savings, replacing heavy hardware and Oracle licensing costs with a pay-as-you-go model.
- Greater agility, letting analysts reach data and produce reports faster.
- Advanced analytics, such as machine learning for predicting customer churn and forecasting demand.
On the technical side, Snowflake delivered on its promises: separation of storage and compute, strong handling of semi-structured data, and solid data sharing features.
The migration itself, however, proved more difficult than expected. The decade-old legacy systems hid real complexity: poorly documented Informatica workflows built by long-departed developers, with dependencies reaching into systems nobody had mapped. The team also faced notable skill gaps; deep Informatica expertise did not transfer directly, and they had to ramp up quickly on Snowflake SQL, Azure, dbt transformations, Git version control, and cloud cost management.
The technical challenges went beyond mastering new tools. Automated Informatica-to-dbt conversion tools helped, but they were not the silver bullet the team had hoped for. Data validation proved especially time-consuming: confirming that the new Snowflake/dbt pipeline's outputs exactly matched the trusted legacy Informatica reports took substantial effort. And while pay-as-you-go pricing is appealing in theory, inefficient queries and improperly sized virtual warehouses left running around the clock produced unwelcome surprises on the bill.
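To make that validation burden concrete, here is a minimal sketch of the kind of reconciliation check such a team ends up running constantly. It assumes the snowflake-connector-python and pandas packages; the table, columns, and credentials are illustrative placeholders, not RetailCorp's actual schema.

```python
# Sketch: reconcile a trusted legacy report extract against the new
# Snowflake/dbt output. Connection details, table, and column names are
# illustrative placeholders.
import pandas as pd
import snowflake.connector

# Legacy output, exported to CSV from the trusted Informatica report.
legacy = pd.read_csv("legacy_daily_sales.csv", dtype={"sale_date": str, "store_id": str})

conn = snowflake.connector.connect(
    account="my_account",          # placeholder credentials
    user="migration_svc",
    password="***",
    warehouse="MIGRATION_WH",
    database="ANALYTICS",
    schema="MARTS",
)

# Cast the join keys to strings so both sides compare on identical dtypes.
new = pd.read_sql(
    """
    SELECT TO_VARCHAR(sale_date, 'YYYY-MM-DD') AS sale_date,
           TO_VARCHAR(store_id)                AS store_id,
           SUM(net_sales)                      AS net_sales
    FROM fct_daily_sales
    GROUP BY 1, 2
    """,
    conn,
)
new.columns = [c.lower() for c in new.columns]  # Snowflake returns upper-case names

# Outer-join on the grain and flag any store/day whose totals disagree by
# more than a cent; rows missing on either side also surface as mismatches.
merged = legacy.merge(new, on=["sale_date", "store_id"],
                      suffixes=("_legacy", "_new"), how="outer")
diff = (merged["net_sales_legacy"] - merged["net_sales_new"]).abs()
mismatches = merged[diff.isna() | (diff > 0.01)]
print(f"{len(mismatches)} of {len(merged)} store-days need review")
```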
These experiences underscored the gap between cloud migration promises and real-world implementation, and they are the source of the lessons in this series.
The MVP
Your first step is to audit the environment and identify all source and target data sets, at least for the workload in scope, preferably using automated tooling if you are in a complex environment. For example, I have built small Python utilities that capture the environment's metadata down to the column level and use LLMs to flag early signs of complexity. In practice, they work well and surface many issues before they become surprises.
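To illustrate the pattern (a condensed sketch, not my actual utilities), assume the legacy catalog has already been dumped to a JSON file by a separate extraction script and that the OpenAI Python SDK is available; the file layout, model name, and prompt are all assumptions.

```python
# Sketch of an environment-audit helper: feed column-level metadata from the
# legacy catalog to an LLM and ask it to flag early signs of complexity.
import json
from openai import OpenAI  # any LLM client would do; this SDK is an assumption

# Metadata captured from the legacy catalog (e.g. ALL_TAB_COLUMNS on Oracle),
# dumped to JSON by a separate extraction script. Illustrative layout:
# [{"table": "...", "columns": [{"name": "...", "type": "..."}, ...]}, ...]
with open("legacy_catalog.json") as f:
    tables = json.load(f)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "You are assessing a data warehouse migration to Snowflake. "
    "For the table below, flag early signs of complexity: exotic or "
    "vendor-specific data types, very wide tables, ambiguous column names, "
    "likely semi-structured payloads, and probable PII. "
    "Reply as a short JSON list of findings."
)

for table in tables:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; use whichever model you have access to
        messages=[
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": json.dumps(table)},
        ],
    )
    print(table["table"], "->", response.choices[0].message.content)
```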
Once you have a parallel process running, assessing the legacy system while building your new cloud environment, the main question is: where do you begin? The best approach is to focus on the Minimum Viable Product (MVP). In a data migration context, however, an MVP isn't merely a small pilot or a throwaway prototype. It should be the immediate, concrete outcome of your initial assessment and foundational work: a complete, meaningful slice that demonstrates the new platform's capabilities end to end.
Proving a Critical Path with Business Outcome
The primary purpose of the MVP is to prove that one critical, end-to-end data path works in the new environment. This path should be identified and de-risked through your parallel survey of the legacy system. Crucially, the outcome must be valuable to a business sponsor, delivering a vertical business outcome rather than a simple “like-for-like” technical port of an old process.
Getting this single, imperfect thread working in production achieves several goals:
- It lays a new foundation for future work.
- It builds confidence among stakeholders.
- It shows tangible progress, creating momentum.
- It becomes the learning ground for the entire team.
Building and Showcasing the New Foundation
The MVP is more than just data; it serves as a way to showcase the new, modern platform components. The aim is to bring the new “wiring patterns” to life using actual data slices. This means the MVP should actively build and showcase foundational elements like:
- The new secure cloud environment.
- CI/CD pipelines built on an Infrastructure as Code (IaC) framework (see the sketch after this list).
- The modern deployment process and security model.
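To make the IaC bullet concrete, here is a minimal sketch using Pulumi's Python SDK, assuming the pulumi-snowflake provider; Terraform or any other IaC tool works just as well, and every name and size below is a placeholder rather than a recommendation.

```python
# Minimal Infrastructure-as-Code sketch (Pulumi Python, pulumi-snowflake provider).
# Run inside a Pulumi project with `pulumi up`. All names are placeholders.
import pulumi
import pulumi_snowflake as snowflake

# A dedicated database for the MVP slice.
db = snowflake.Database("mvp-db", name="MVP_ANALYTICS")

# A small warehouse that suspends itself quickly, codifying a cost guardrail
# against the runaway-warehouse bills described earlier.
wh = snowflake.Warehouse(
    "mvp-wh",
    name="MVP_WH",
    warehouse_size="XSMALL",
    auto_suspend=60,   # seconds of inactivity before auto-suspend
    auto_resume=True,
)

pulumi.export("database", db.name)
pulumi.export("warehouse", wh.name)
```

Because the environment is declared in code and versioned in Git, the same definition flows through the CI/CD pipeline to produce identical dev, test, and production environments.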
The First Step in an Iterative Journey
The MVP is only the first step in an iterative journey. Once completed, it kicks off a “scan-select-build-deploy-operate” cycle. Insights gained from delivering the first slice guide the selection and build of the next, such as shifting focus from sales data to customer experience or inventory data. Each subsequent slice builds on the previous one, growing the platform's capability and complexity while the broader assessment continues.
This approach enables leadership to demonstrate regular successes, helping sustain project momentum and secure continuous budget approvals, thereby effectively minimizing the risk of a large-scale, overly ambitious “boil the ocean” project.
How LLMs Help Deliver Your MVP
LLMs can be a powerful accelerator in the “scan-select-build” portion of your MVP cycle.
- Selecting the Right Slice: When deciding on an MVP, you can use an LLM to help analyze the trade-offs. Provide the LLM with summaries of different business processes and ask it to “Compare the potential business impact versus the technical complexity for a ‘customer churn prediction’ pipeline versus a ‘daily sales reporting’ pipeline, based on these descriptions.” This helps you select a high-impact, achievable first target (see the sketch after this list).
- Creating Test Plans: To ensure the MVP is robust, use an LLM to draft a comprehensive test plan. “Create a test plan for our sales data MVP. Include sections for data validation, unit tests for dbt models, integration tests for the CI/CD pipeline, and user acceptance testing criteria for the final dashboard.”
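As a hedged sketch of the first bullet, here is one way to script the slice-selection prompt so every candidate pipeline gets the same impact-versus-complexity treatment; the helper function, model name, and candidate summaries are illustrative, and the same pattern can drive the test plan draft.

```python
# Sketch: a reusable slice-selection prompt. Helper name, model, and the
# candidate summaries are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def compare_slices(candidates: dict[str, str]) -> str:
    """candidates maps a pipeline name to a short business/technical summary."""
    described = "\n\n".join(f"## {name}\n{summary}" for name, summary in candidates.items())
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; use whichever model you have access to
        messages=[{
            "role": "user",
            "content": "Compare the potential business impact versus the technical "
                       "complexity of each candidate MVP slice below, then recommend "
                       "one with your reasoning:\n\n" + described,
        }],
    )
    return response.choices[0].message.content

# Illustrative inputs, not RetailCorp's real inventory of mappings.
print(compare_slices({
    "customer churn prediction": "ML scoring over 14 Informatica mappings across 3 source systems.",
    "daily sales reporting": "A single star schema, 5 mappings, feeding a board-level dashboard.",
}))
```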
By defining the MVP as the core upfront goal, you create focus for the entire team. It’s the mechanism that turns the promise of a modern data platform into a tangible reality, building confidence and momentum for the full migration journey ahead.
Conclusion
Every major migration starts with uncertainty, complexity, and unexpected challenges buried in legacy systems. RetailCorp’s experience teaches us that success isn’t achieved by moving everything at once but by demonstrating value early through a carefully selected MVP. Concentrating on a single, critical path that delivers a complete business outcome builds a foundation the team can trust and extend.
An MVP isn’t merely a pilot; it is the foundation that brings modern cloud practices, security, and automation into production with tangible business value. Technologies like LLMs accelerate the process by helping teams weigh trade-offs, scaffold code, and draft more robust test plans. The result is faster confidence-building, stronger momentum, and executive backing.
Migration isn't solely about technology; it's about providing proof, fostering trust, and gaining insights rapidly. By adopting a strategic MVP, you transform Snowflake’s promise into tangible progress, laying the foundation for all subsequent lessons in this series.
My next article in this series will cover how to conduct the actual assessment. Keep an eye out for it.
I am Augusto Rosa, a Snowflake Data Superhero and Snowflake SME. I am also the Head of Data, Cloud, & Security Architecture at Archetype Consulting. You can follow me on LinkedIn.
Subscribe to my Medium blog https://blog.augustorosa.com for the most interesting Data Engineering and Snowflake news.