MLOps Community

MLOps Investments // Sarah Catanzaro // Coffee Session #33

Coffee Sessions #33 with Sarah Catanzaro of Amplify Partners, MLOps Investments.

//Bio
Sarah Catanzaro is a Partner at Amplify Partners, where she focuses on investing in and advising high-potential startups in machine intelligence, data management, and distributed systems. Her investments at Amplify include startups like RunwayML, Maze Design, OctoML, and Metaphor Data, among others. Sarah also has several years of experience defining data strategy and leading data science teams at startups and in the defense/intelligence sector, including through roles at Mattermark, Palantir, Cyveillance, and the Center for Advanced Defense Studies.

// We had a wide-ranging discussion with Sarah. Three takeaways stood out:

  1. The relationship between unstructured data and structured data is due for change. In most settings, you have some form of structured data (e.g., a metadata table) and unstructured data (e.g., images or text). Managing the relationship between these forms of data can constitute the bulk of MLOps work. Because of this difficulty, Sarah forecast that new tooling will arise to make data management easier.
  2. Academic benchmarks suffer from a lack of transparency about production and industry use cases. In conversation with Andrew Ng, Sarah shared her lesson that, although industry professionals blame academics for narrowly optimizing to benchmarks with little practical meaning, industry shares the blame for making it difficult to create meaningful benchmarks. Companies are loath to share realistic data and the true context in which ML has to operate.
  3. MLOps is due for consolidation, especially as companies adopt platform-driven strategies. As many of you know, there are a great many MLOps tools available. As more companies address these challenges, Sarah predicted that many of the point solutions will be consolidated into larger platforms.

// Other Links
https://amplifypartners.com/team/sarah/
https://projectstoknow.amplifypartners.com/ml-and-data
https://twitter.com/sarahcat21/status/1360105479620284419

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Sarah on LinkedIn: https://www.linkedin.com/in/sarah-catanzaro-9770b98/

Timestamps:
[00:00] Introduction to Sarah Catanzaro
[02:07] Sarah's background in tech
[06:00] Staying engineer-oriented despite being an investment firm
[08:50] Tools you wished you had earlier in your career
[12:36] 2 Motives of ML Engineers and ML Platform Team
[16:36] Open-sourcing
[21:29] Startup focus on resources
[23:57] Playout of open-source project
[27:32] Consolidation
[33:18] Finding solutions
[36:18] Evolution of MLOps industry in the coming years
[42:36] Frameworks
[43:14] Structured datasets available to researchers, and meaningful advances in deep learning applied to structured data

