Are you a data science enthusiast, a seasoned practitioner, or just starting your journey into this exciting field? 🤔
How are you learning? Paid courses? Bootcamps? 📚 Why not kickstart your learning with some of the best free data science resources available online? 🆓
GitHub is a treasure trove for open-source projects, learning resources, and curated data science repositories that can significantly boost your skills.
Here are my top 5 GitHub repositories that will help you master data science, from foundational concepts to hands-on projects. 💻
Remember, how much you code matters more than how many repositories you know. The key is to apply what you learn!
5. Virgilio 🧠
Presenting a fantastic web-based guide for data science learners.
This repository is a meticulously compiled collection of theoretical resources, perfect for building a solid foundation in data science concepts.
Virgilio is an open-source initiative that aims to mentor and guide anyone in the world of Data Science. Our vision is to give everyone the chance to get involved in this field, get self-started as a practitioner, gain new skills, and learn to navigate through the infinite web of resources and find the ones useful for you.
What is Virgilio?
Studying and reading through the Internet means swimming in an infinite jungle of chaotic information, even more so in rapidly changing innovative fields.
Have you ever felt overwhelmed when trying to approach Data Science without a real “path” to follow?
Are you tired of clicking “Run”, “Run”, “Run”.. on a Jupyter Notebook, with that false sense of confidence given by the comfort…
4. Python Data Science Handbook 📖
O'Reilly books are considered the gold standard in the data science community, and they rarely go on sale! 💎
But guess what? This repository contains the complete Python Data Science Handbook along with the code notebooks, making it an invaluable data science learning resource for anyone interested in Python.
jakevdp/PythonDataScienceHandbook
Python Data Science Handbook: full text in Jupyter Notebooks
Python Data Science Handbook
This repository contains the entire Python Data Science Handbook, in the form of (free!) Jupyter notebooks.
How to Use this Book
- Read the book in its entirety online at https://jakevdp.github.io/PythonDataScienceHandbook/
- Run the code using the Jupyter notebooks available in this repository's notebooks directory.
- Launch executable versions of these notebooks using Google Colab.
- Launch a live notebook server with these notebooks using binder.
- Buy the printed book through O'Reilly Media.
About
The book was written and tested with Python 3.5, though other Python versions (including Python 2.7) should work in nearly all cases.
The book introduces the core libraries essential for working with data in Python: particularly IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and related packages. Familiarity with Python as a language is assumed; if you need a quick introduction to the language itself, see the free companion project, A…
3. Awesome DataScience ✨
Who doesn't love a good cheatsheet? 🤩 This "awesome" repository acts as the ultimate data science cheatsheet, providing a curated list of distributed data, projects, tutorials, and other useful GitHub repositories for all things data science.
It’s the perfect place to find your next project or tutorial!
academic/awesome-datascience
📝 An awesome Data Science repository to learn and apply for real world problems.
AWESOME DATA SCIENCE
An open-source Data Science repository to learn and apply towards solving real world problems.
This is a shortcut path to start studying Data Science. Just follow the steps to answer the questions, "What is Data Science and what should I study to learn Data Science?"
2. Notebooks for Data Science ✍️
Learning isn't just about reading theory—it’s about writing code!
This repository is a perfect solution, offering a comprehensive collection of data science IPython notebooks filled with hands-on examples and code to help you apply what you've learned.
Get ready to dive deep!
donnemartin/data-science-ipython-notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
data-science-ipython-notebooks
Index
- deep-learning
- scikit-learn
- statistical-inference-scipy
- pandas
- matplotlib
- numpy
- python-data
- kaggle-and-business-analyses
- spark
- mapreduce-python
- amazon web services
- command lines
- misc
- notebook-installation
- credits
- contributing
- contact-info
- license
deep-learning
IPython Notebook(s) demonstrating deep learning functionality.
tensor-flow-tutorials
Additional TensorFlow tutorials:
- pkmital/tensorflow_tutorials
- nlintz/TensorFlow-Tutorials
- alrojo/tensorflow-tutorial
- BinRoot/TensorFlow-Book
- tuanavu/tensorflow-basic-tutorials
Notebooks and their descriptions:
- tsf-basics: Learn basic operations in TensorFlow, a library for various kinds of perceptual and language understanding tasks from Google.
- tsf-linear: Implement linear regression in TensorFlow.
- tsf-logistic: Implement logistic regression in TensorFlow.
- tsf-nn: Implement nearest neighbors in TensorFlow.
- tsf-alex: Implement AlexNet in TensorFlow.
- tsf-cnn: Implement convolutional neural networks in TensorFlow.
- tsf-mlp: Implement multilayer perceptrons in TensorFlow.
- tsf-rnn: Implement recurrent neural networks in TensorFlow.
- tsf-gpu: Learn about basic multi-GPU computation in TensorFlow.
- tsf-gviz: Learn about graph visualization in TensorFlow.
- tsf-lviz: Learn about loss visualization in TensorFlow.
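To give a feel for what an entry like tsf-linear covers, here is a minimal, library-free sketch of simple linear regression using the closed-form least-squares solution. It is not the notebook's code and does not use TensorFlow; the helper name `fit_line` and the toy data are assumptions made purely for illustration.

```python
# Library-free sketch of simple linear regression (ordinary least squares).
# This is NOT the code from the tsf-linear notebook; fit_line is a
# hypothetical helper introduced only for illustration.

def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error for y = slope*x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Unnormalized covariance of x and y, and unnormalized variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data generated from y = 2x + 1, so the fit should recover those values.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

Typing out something this small by hand, then comparing it with the notebook's TensorFlow version, is one way to apply what you read instead of only clicking "Run".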
tensor-flow-exercises
Notebooks and their descriptions:
- tsf-not-mnist: Learn simple data curation by creating a pickle with formatted datasets for training, development and testing in
Honorable Mention ⭐
Before we get to the top spot, I want to mention a truly top-class data science resource. It hosts a huge number of datasets and has since moved to its own platform. I highly recommend checking it out for your data science projects! 📊🔍
1. Microsoft Data Science Repo 🌟
Yes, you read that right! Microsoft has launched its own free data science repository for beginners. 🤩 This is, without a doubt, one of the best free data science courses I have ever found. It includes detailed lectures and code to help you learn and practice from scratch. A must-see for anyone serious about a data science career! 🎓
microsoft/Data-Science-For-Beginners
10 Weeks, 20 Lessons, Data Science for All!
Data Science for Beginners - A Curriculum
Azure Cloud Advocates at Microsoft are pleased to offer a 10-week, 20-lesson curriculum all about Data Science. Each lesson includes pre-lesson and post-lesson quizzes, written instructions to complete the lesson, a solution, and an assignment. Our project-based pedagogy allows you to learn while building, a proven way for new skills to 'stick'.
Hearty thanks to our authors: Jasmine Greenaway, Dmitry Soshnikov, Nitya Narasimhan, Jalen McGee, Jen Looper, Maud Levy, Tiffany Souterre, Christopher Harrison.
🙏 Special thanks 🙏 to our Microsoft Student Ambassador authors, reviewers and content contributors, notably Aaryan Arora, Aditya Garg, Alondra Sanchez, Ankita Singh, Anupam Mishra, Arpita Das, ChhailBihari Dubey, Dibri Nsofor, Dishita Bhasin, Majd Safi, Max Blum, Miguel Correa, Mohamma Iftekher (Iftu) Ebne Jalal, Nawrin Tabassum, Raymond Wangsa Putra…
Conclusion
So there you have it: my top 5 data science repositories to help you learn and build amazing data science projects.
These resources are fantastic whether you're a beginner or looking to sharpen your skills. 🛠️📊
Based on your experience, which one is your favorite? Let me know in the comments! 👇