MLOps Community

Data Engineering for ML // Chad Sanderson // Coffee Sessions #117

MLOps Coffee Sessions #117 with Chad Sanderson, Head of Product, Data Platform at Convoy: Data Engineering for ML, co-hosted by Josh Wills.

// Abstract
Data modeling is building relationships between core concepts within your data. The physical data model shows how those relationships manifest in your data environment, but then there's the semantic data model: the way that entity relationship design is abstracted away from any data-centric implementation.

Let's have some good old fun talking about why data modeling is so important!

// Bio
Chad Sanderson is the Product Lead for Convoy's Data Platform team, which includes the data warehouse, streaming, BI & visualization, experimentation, machine learning, and data discovery.

Chad has built everything from feature stores, experimentation platforms, metrics layers, streaming platforms, analytics tools, and data discovery systems to workflow development platforms. He’s implemented open source and SaaS products (early and late-stage) and has built cutting-edge technology from the ground up. Chad loves the data space, and if you're interested in chatting about it with him, don't hesitate to reach out.

// MLOps Jobs board  
https://mlops.pallet.xyz/jobs

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
https://odsc.com/speakers/scaling-machine-learning-with-data-mesh/
https://docs.google.com/presentation/d/1rVtltHkRkP_JaGZdkAS3U_SXfr5Gg-RP980FKXh0YNU/edit?usp=sharing
Josh Wills will be teaching a course on Data Engineering for Machine Learning in September here:
https://www.getsphere.com/ml-engineering/data-engineering-for-machine-learning

--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Josh on LinkedIn: https://www.linkedin.com/in/josh-wills-13882b/
Connect with Chad on LinkedIn: https://www.linkedin.com/in/chad-sanderson/

Timestamps:
[00:00] Introduction of the new co-host Josh Wills  
[00:54] Introduction to Chad Sanderson
[01:46] Josh will lead a course on Data Engineering for Machine Learning in mid-September
[02:16] Chad's data modeling blog post
[06:10] Idea of Strategy
[09:40] Modern cloud data warehouses  
[17:01] Layering on contracts
[20:38] Scaling at larger companies
[25:30] Carrot-stick strategy
[34:27] Second and third-order effects
[39:53] Stockholm Syndrome
[41:22] Quality checks at Slack
[45:28] Success in two main ways according to Chad
[47:35] Completely and utterly different universes
[53:42] Product use case to push semantic events
[56:00] Pattern of analysis of the sequence of events
[57:23] Wrap up
