MLOps Community

Scalable Python for Everyone, Everywhere // Matthew Rocklin // MLOps Meetup #38

Parallel Computing with Dask and Coiled

Python makes data science and machine learning accessible to millions of people around the world. Historically, however, Python hasn't handled parallel computing well, which causes problems as researchers tackle increasingly large datasets. Dask is an open source Python library that adds parallel and distributed computing to the existing Python data science stack (NumPy, pandas, scikit-learn, Jupyter, ...). Today Dask has been broadly adopted by most major Python libraries and is maintained by a robust open source community across the world.

This talk discusses parallel computing generally, Dask's approach to parallelizing an existing ecosystem of software, and some of the challenges we've seen in deploying distributed systems.

The talk also addresses the challenge of deploying distributed systems robustly, which turns out to be one of the main accessibility barriers for users today. We hope that by the end of the meetup attendees will better understand parallel computing, have built intuition for how Dask works, and have had the opportunity to play with their own Dask cluster on the cloud.
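For readers who want a concrete picture of what "parallel computing" means here, the following is a minimal sketch of the chunk-and-combine pattern: split a dataset into pieces, process each piece as an independent task, then merge the partial results. It uses only Python's standard library and is not Dask's API; the names `chunked` and `parallel_sum_of_squares` are hypothetical and chosen purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, size):
    """Yield consecutive slices of `values`, each with at most `size` items."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def parallel_sum_of_squares(values, chunk_size=1000, max_workers=4):
    """Sum the squares of `values` by processing chunks as separate tasks."""

    def partial_sum(chunk):
        # Each chunk is an independent task with no shared state.
        return sum(v * v for v in chunk)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one task per chunk and collect the partial results in order.
        partials = list(pool.map(partial_sum, chunked(values, chunk_size)))

    # Combine step: merge the per-chunk results into the final answer.
    return sum(partials)


data = list(range(10_000))
total = parallel_sum_of_squares(data)
```

Threads are used only to keep the sketch short and portable. In standard CPython, pure-Python CPU-bound work like this will not run faster on threads because of the global interpreter lock, which is part of the historical limitation the description refers to.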

Matthew is an open source software developer in the numeric Python ecosystem. He maintains several PyData libraries, but today focuses mostly on Dask, a library for scalable computing. Matthew worked for Anaconda Inc for several years, then built out the Dask team at NVIDIA for RAPIDS, and most recently founded Coiled Computing to improve Python's scalability with Dask for large organizations.

Matthew has given talks at a variety of technical, academic, and industry conferences. A list of talks and keynotes is available at https://matthewrocklin.com/talks.

Matthew holds a bachelor’s degree from UC Berkeley in physics and mathematics, and a PhD in computer science from the University of Chicago.

Check out our posts here to get more context around where we're coming from:
https://medium.com/coiled-hq/coiled-dask-for-everyone-everywhere-376f5de0eff4
https://medium.com/coiled-hq/the-unbearable-challenges-of-data-science-at-scale-83d294fa67f8

----------- Connect With Us ✌️-------------

Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Matthew on LinkedIn: https://www.linkedin.com/in/matthew-rocklin-461b4323/