Awesome Data Science with Python

Florian Rohrer (r0f1) ・1 min read

I have created a list of useful Python packages for data science.

r0f1 / datascience

Curated list of Python resources for data science.

Awesome Data Science with Python

A curated list of awesome resources for practicing data science using Python, including not only libraries, but also links to tutorials, code snippets, blog posts and talks.


pandas - Data structures built on top of numpy.
scikit-learn - Core ML library.
matplotlib - Plotting library.
seaborn - Data visualization library based on matplotlib.
pandas_summary - Basic statistics using DataFrameSummary(df).summary().
pandas_profiling - Descriptive statistics using ProfileReport.
sklearn_pandas - Helpful DataFrameMapper class.
missingno - Missing data visualization.
rainbow-csv - Plugin to display .csv files with nice colors.
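
As a rough illustration of what the summary and missing-data packages above build on, here is a minimal sketch that uses plain pandas only. The toy DataFrame and its column names are made up for this example, and the calls shown are standard pandas methods, not the APIs of pandas_summary, pandas_profiling or missingno.

```python
import pandas as pd

# Hypothetical toy data; the column names and values are illustrative only.
df = pd.DataFrame({
    "age": [23, 35, None, 41],
    "city": ["Vienna", "Graz", "Vienna", None],
})

# Descriptive statistics for the numeric columns (count, mean, std, quartiles).
stats = df.describe()

# Number of missing values per column, the kind of information
# that a missing-data visualization is based on.
missing_per_column = df.isna().sum()

print(stats)
print(missing_per_column)
```

The dedicated packages add richer reports and plots on top of this kind of information, so a quick describe() and isna().sum() is often enough to decide whether a fuller report is worth generating.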

Environment and Jupyter

General Jupyter Tricks
Fixing environment: link
Python debugger (pdb) - blog post, video, cheatsheet
cookiecutter-data-science - Project template for data science projects.
nteract - Open Jupyter Notebooks with a double-click.
papermill - Parameterize and execute Jupyter notebooks, tutorial.
nbdime - Diff two notebook files. Alternative GitHub App: ReviewNB.
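
For the pdb entry in the list above, the following is a small sketch of a debugger session using only the standard library. In everyday use you would simply put breakpoint() in your code and type the commands by hand. Here the commands are fed in from a string so that the example runs unattended. The function mean and its input values are hypothetical.

```python
import io
import pdb


def mean(values):
    total = sum(values)
    return total / len(values)


# Scripted debugger commands: step over one line, print a variable, continue.
commands = io.StringIO("next\np total\ncontinue\n")
output = io.StringIO()

# readrc=False ignores any local .pdbrc; nosigint=True leaves signal handlers alone.
debugger = pdb.Pdb(stdin=commands, stdout=output, nosigint=True, readrc=False)

# runcall stops at the first line of the function and then follows the commands.
result = debugger.runcall(mean, [10, 20, 12])

print(output.getvalue())  # the recorded debugger session
print(result)
```

The same three commands (next, p, continue) cover a large share of interactive debugging in a notebook or script.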

Sometimes, I have also linked to YouTube talks, other GitHub repos that contain short examples, etc.

Want to contribute? Let me know.



Short examples are great in this space. Appreciate the list.