Have you ever tried to put a large dataset or model weights into Git? Git is amazing until you hit big files... and in machine learning, you hit them pretty often.
As part of an MLOps Tutorials series, I made a video covering:
- Git fundamentals for ML
- How to connect external storage (Google Drive!) to a GitHub repo so it can hold datasets and trained models
There are also some inklings of a topic we'll develop further in upcoming videos: what does it mean to version data as code? How do we create high-level abstractions that separate data from the way it's stored? Stay tuned.
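
To make "separating data from the way it's stored" a bit more concrete, here's a minimal sketch of one common idea: keep a tiny pointer file in Git that records a hash of the big file, while the big file itself lives somewhere else. This is an illustration only, not necessarily the approach used in the video; the function names and the `.pointer.json` suffix are made up for this example.

```python
import hashlib
import json
from pathlib import Path


def make_pointer(data_path: Path) -> dict:
    """Describe a large file by its content hash and size (hypothetical helper)."""
    digest = hashlib.sha256()
    with data_path.open("rb") as f:
        # Read in 1 MiB chunks so large files never have to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return {
        "path": data_path.name,
        "sha256": digest.hexdigest(),
        "size": data_path.stat().st_size,
    }


def write_pointer(data_path: Path) -> Path:
    """Write the pointer next to the data file; the '.pointer.json' suffix is an assumption."""
    pointer_path = data_path.with_name(data_path.name + ".pointer.json")
    pointer_path.write_text(json.dumps(make_pointer(data_path), indent=2) + "\n")
    return pointer_path
```

The pointer file is only a few lines long, so Git handles it happily. You would commit the pointer, ignore the real file in Git, and upload the real file to whatever storage you like. Anyone who checks out the repo can then fetch the file and compare its hash against the pointer to confirm they have the right version.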