DEV Community



Intro to Python Libraries for Data Science

Python is one of the most popular and widely used programming languages, and it is a go-to language for data scientists. It is easy to learn, straightforward to debug, used by experts in many different fields, and has some very powerful libraries.

The libraries I touch on are NumPy, Pandas, Matplotlib, and SciKit-Learn.


NumPy, or Numerical Python, is the fundamental package for numerical computing in Python, built around its powerful N-dimensional array object. It provides high-performance tools for working with these arrays and allows for efficient operations on them. NumPy also forms the base for libraries such as SciKit-Learn.
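To make the array idea concrete, here is a minimal sketch. The numbers are made up for illustration; the point is that arithmetic applies to the whole array at once, with no explicit Python loop.

```python
import numpy as np

# Made-up measurements: 3 samples (rows) by 2 features (columns).
data = np.array([[1.0, 2.0],
                 [3.0, 4.0],
                 [5.0, 6.0]])

print(data.shape)  # (3, 2)

# Vectorized arithmetic: every element is doubled in one operation.
doubled = data * 2

# Reduce along axis 0 to get one mean per column.
col_means = data.mean(axis=0)

# Broadcasting: the 1-D row of means is subtracted from every row.
centered = data - col_means
print(centered)
```

After centering, each column averages to zero, which is a common first step before feeding data to a model.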


If you are aspiring to be a data scientist, you will find Pandas to be your best friend. It's the most popular Python library for data analysis. It comes with many capabilities, including easy cleaning and manipulation of data with its data frames.
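As a rough sketch of what cleaning data in a data frame can look like, the example below uses a small invented table (the names, ages, and cities are placeholders, not real data). It drops a duplicated row, tidies up a text column, and fills a missing value with the column mean.

```python
import numpy as np
import pandas as pd

# Invented, deliberately messy data: one duplicate row, one missing age,
# and inconsistent whitespace and capitalization in the city column.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", "Ben"],
    "age": [28.0, 41.0, np.nan, 41.0],
    "city": [" Boston", "austin ", "Denver", "austin "],
})

clean = (
    df.drop_duplicates()  # remove the repeated "Ben" row
      .assign(
          # strip stray spaces and normalize capitalization
          city=lambda d: d["city"].str.strip().str.title(),
          # fill the missing age with the mean of the known ages
          age=lambda d: d["age"].fillna(d["age"].mean()),
      )
      .reset_index(drop=True)
)

print(clean)
```

Filling with the mean is only one possible choice; depending on the data, dropping the row or using the median may be more appropriate.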


Matplotlib is another very powerful tool, used to create beautiful visualizations. You can use it to spot correlations between variables, for example with a scatter plot. Matplotlib also lets you visualize the distribution of your data, for example with a histogram, to gain new insights into it.
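Here is a small sketch of both ideas, using synthetic data generated with NumPy (the "hours studied" and "exam score" variables are invented for illustration). The left panel is a scatter plot to eyeball correlation, and the right panel is a histogram of the distribution.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: scores rise with hours studied, plus random noise.
rng = np.random.default_rng(seed=0)
hours = rng.uniform(0, 10, size=100)
scores = 50 + 4 * hours + rng.normal(0, 5, size=100)

fig, (ax_scatter, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

# Scatter plot: an upward trend suggests a positive correlation.
ax_scatter.scatter(hours, scores)
ax_scatter.set_xlabel("Hours studied")
ax_scatter.set_ylabel("Exam score")
ax_scatter.set_title("Correlation")

# Histogram: shows how the scores are distributed.
ax_hist.hist(scores, bins=15)
ax_hist.set_xlabel("Exam score")
ax_hist.set_ylabel("Count")
ax_hist.set_title("Distribution")

fig.tight_layout()
fig.savefig("study_plots.png")  # hypothetical output file name
```

In a notebook or an interactive session you would typically drop the `matplotlib.use("Agg")` line and call `plt.show()` instead of saving to a file.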


SciKit-Learn is known as the library full of machine learning algorithms. If you plan to use data science for prediction, you will find yourself combing through the many models this library has to offer.
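As a minimal sketch of a predictive workflow, the example below trains a classifier on the iris dataset that ships with the library. The choice of logistic regression, the 25% test split, and the `random_state` value are my assumptions for illustration; any other estimator from the library could be swapped in the same way.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the built-in iris dataset: 150 flowers, 4 measurements each.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data to check the model on unseen samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit the model on the training data only.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Mean accuracy on the held-out test set.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Most estimators in the library share this `fit` and `score` (or `predict`) pattern, which is what makes trying several models relatively painless.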
