Deepak Raj


Python Libraries for Data Science You Should Know

This article was originally published at

According to Wikipedia:

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data.
Data science is related to data mining, deep learning and big data.

Put simply:

Data Science is the process of extracting useful information from data in order to solve real-world problems.

There are some must-know Python libraries that make this work easier for a data scientist.

NumPy:- It is a Python library for working with arrays. It also has functions for linear algebra, Fourier transforms, and matrices. NumPy was created in 2005 by Travis Oliphant. It is an open-source project and you can use it freely.
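
As a quick illustration of the array, linear algebra, and Fourier transform features mentioned above, here is a minimal sketch. The matrix and vector values are made up for the example.

```python
import numpy as np

# Build a 2x2 matrix and a vector from plain Python lists (illustrative values).
a = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Arithmetic applies element-wise to the whole array at once.
doubled = a * 2

# Linear algebra: solve the system a @ x = b for x.
x = np.linalg.solve(a, b)

# Fourier transform of a short constant signal.
spectrum = np.fft.fft(np.ones(4))

print(doubled)
print(x)
print(spectrum)
```

The same pattern (create arrays, then call functions from `np.linalg` or `np.fft`) covers most everyday NumPy use.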

Pandas:- pandas is a Python library for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series. It is free software released under the three-clause BSD license.
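
To make "numerical tables and time series" concrete, here is a small sketch. The `sales` table, its column names, and the dates are invented for the example.

```python
import pandas as pd

# A small table (DataFrame) with illustrative, made-up data.
sales = pd.DataFrame(
    {
        "city": ["Delhi", "Mumbai", "Delhi", "Mumbai"],
        "units": [10, 7, 5, 3],
    }
)

# Group rows by city and add up the units sold in each one.
totals = sales.groupby("city")["units"].sum()

# A time series: one value per day, indexed by date.
ts = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)

# Resample the daily series into two-day totals.
two_day = ts.resample("2D").sum()

print(totals)
print(two_day)
```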

Matplotlib:- It is a plotting library for the Python programming language and its numerical mathematics extension NumPy. It provides an object-oriented API for embedding plots into applications using general-purpose GUI toolkits like Tkinter, wxPython, Qt, or GTK+.
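
Here is a minimal sketch of the object-oriented API: you create a figure and an axes object, then call methods on them. The `Agg` backend and the in-memory buffer are choices made for this example so it runs without a display; in a notebook or GUI app you would likely skip both.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

# Sample one period of a sine wave.
x = np.linspace(0, 2 * np.pi, 100)

# Object-oriented style: work with explicit Figure and Axes objects.
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A simple line plot")
ax.legend()

# Render the figure as PNG into memory.
# Pass a filename such as "sine.png" instead to write a file.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```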

Seaborn:- It is a library for making statistical graphics in Python. It is built on top of matplotlib and closely integrated with pandas data structures. Among other things, it offers options for visualizing univariate or bivariate distributions and for comparing them between subsets of data.
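
As a rough sketch of comparing a distribution between subsets of data, the example below draws one box per group straight from a pandas DataFrame. The `group` and `value` columns are made-up data, and the `Agg` backend is only there so the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import pandas as pd
import seaborn as sns

# Illustrative data: two groups with a handful of measurements each.
df = pd.DataFrame(
    {
        "group": ["a", "a", "a", "a", "b", "b", "b", "b"],
        "value": [1.0, 2.0, 2.5, 3.0, 4.0, 4.5, 5.0, 6.0],
    }
)

# seaborn reads the column names directly from the DataFrame
# and returns a regular matplotlib Axes object.
ax = sns.boxplot(data=df, x="group", y="value")
ax.set_title("Value distribution per group")
```

Because the result is a matplotlib Axes, anything you know from matplotlib (titles, labels, saving the figure) still applies.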

Scikit-learn:- Scikit-learn is a library in Python that provides many unsupervised and supervised learning algorithms. It's built on NumPy and SciPy, and it works well alongside tools you might already be familiar with, like pandas and Matplotlib.
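
To show one supervised and one unsupervised algorithm side by side, here is a small sketch on the iris dataset that ships with scikit-learn. The choice of logistic regression and k-means, and the parameter values, are just for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small built-in dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Supervised learning: hold out a test set, fit a classifier, measure accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# Unsupervised learning: group the same samples into 3 clusters without labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans.fit(X)
labels = kmeans.labels_

print(accuracy)
print(labels[:10])
```

Most scikit-learn estimators follow this same `fit` then `predict` or `score` pattern, so swapping in another algorithm usually means changing one line.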

TensorFlow:- Created by the Google Brain team, TensorFlow is an open-source library for numerical computation and large-scale machine learning. TensorFlow bundles together a slew of machine learning and deep learning (aka neural networking) models and algorithms and makes them useful by way of a common metaphor.
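
Here is a minimal sketch of the numerical computation side, assuming TensorFlow 2.x with eager execution. It multiplies two small matrices and lets TensorFlow compute a derivative automatically, which is the building block behind training neural networks. The values are arbitrary.

```python
import tensorflow as tf

# Tensors behave much like NumPy arrays.
m = tf.constant([[1.0, 2.0], [3.0, 4.0]])
prod = tf.matmul(m, m)

# Automatic differentiation: record operations on a variable,
# then ask for the gradient of the result.
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x**2 + 2.0 * x

# dy/dx = 2x + 2, evaluated at x = 3.
grad = tape.gradient(y, x)

print(prod.numpy())
print(grad.numpy())
```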

Keras:- Keras is an open-source neural-network library written in Python. It has been able to run on top of backends such as TensorFlow, Microsoft Cognitive Toolkit, Theano, or PlaidML. Designed to enable fast experimentation with deep neural networks, it focuses on being user-friendly, modular, and extensible.
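
To give a feel for the "fast experimentation" claim, here is a small sketch that defines, trains, and uses a tiny network in a few lines. It assumes TensorFlow 2.x is installed so that Keras is available as `tensorflow.keras`; the layer sizes and the random training data are made up.

```python
import numpy as np
from tensorflow import keras

# Made-up training data: 32 samples with 4 features each.
# The target is simply the sum of the features.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4)).astype("float32")
y = X.sum(axis=1, keepdims=True)

# A small feed-forward network built by stacking layers.
model = keras.Sequential(
    [
        keras.Input(shape=(4,)),
        keras.layers.Dense(8, activation="relu"),
        keras.layers.Dense(1),
    ]
)
model.compile(optimizer="adam", loss="mse")

# Train for a few epochs and predict on the first three samples.
history = model.fit(X, y, epochs=5, verbose=0)
preds = model.predict(X[:3], verbose=0)

print(history.history["loss"])
print(preds)
```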

PyTorch:- PyTorch is an open-source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing, primarily developed by Facebook's AI Research lab. It is free and open-source software released under the Modified BSD license.
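
As a short sketch of what working with PyTorch looks like, the example below computes a gradient with autograd and passes a random batch through a single linear layer. The sizes and values are arbitrary.

```python
import torch
from torch import nn

# Autograd: track operations on a tensor, then backpropagate.
x = torch.tensor(3.0, requires_grad=True)
y = x**2 + 2 * x
y.backward()

# dy/dx = 2x + 2, evaluated at x = 3, is now stored in x.grad.
print(x.grad)

# A single fully connected layer mapping 4 inputs to 2 outputs.
layer = nn.Linear(4, 2)
batch = torch.randn(5, 4)  # 5 random samples with 4 features each
out = layer(batch)

print(out.shape)
```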


These are some of the Python libraries used for data science, machine learning, and artificial intelligence. As you learn more, you will become familiar with more advanced libraries and tools.

Thanks for reading.
