Top Library setups for Data Science beginners

shubham chaudhari (shubh28698) ・2 min read

Data Science has been a boon for many applications around the world, whether in healthcare, education, entertainment, or industrial sectors such as supply chain logistics. Many students and professionals aspire to work in data science or are transitioning their careers into it. This post is meant to give them a small start with the core libraries.

This post covers some of the important libraries for beginners, along with installation commands and short introductions, so that they do not have to spend much time browsing the internet and can find everything in one place.

The only prerequisite for these libraries is Python itself.

All the installation commands should be run in the command prompt (on Windows) or in a terminal (on Linux).

1. NumPy

  • Powerful N-dimensional arrays
  • Numerical computing tools
  • Interoperable

Installation commands

For Windows:
pip install numpy

For Linux:
$ sudo apt install python3-numpy
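
Once NumPy is installed, a quick way to check that it works is to build a small array and run a few operations on it. The snippet below is only a minimal sketch; the variable names and sample values are made up for illustration.

```python
import numpy as np

# Build a 2-D array (2 rows, 3 columns) from nested lists.
matrix = np.array([[1, 2, 3], [4, 5, 6]])

# Element-wise arithmetic applies to every entry without a Python loop.
doubled = matrix * 2

# Reductions can run over the whole array or along one axis.
# axis=0 sums down each column.
column_sums = matrix.sum(axis=0)

print(matrix.shape)
print(doubled)
print(column_sums)
```

If the import fails, the installation step above most likely did not complete for the Python interpreter you are running.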

2. SciPy

  • Open source
  • Contains modules for optimization, linear algebra, integration, and special functions
  • Built on NumPy and works directly with NumPy arrays

Installation commands

For Windows:
pip install scipy

For Linux:
$ sudo apt-get install python3-pip

$ pip3 install numpy scipy
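
To get a feel for the integration and optimization modules mentioned above, here is a small sketch that assumes both NumPy and SciPy are installed. The functions being integrated and minimized are arbitrary examples chosen because their exact answers are easy to check by hand.

```python
import numpy as np
from scipy import integrate, optimize

# Numerically integrate sin(x) from 0 to pi; the exact answer is 2.
area, abs_error = integrate.quad(np.sin, 0, np.pi)

# Find the minimum of the parabola (x - 3)^2; the true minimum is at x = 3.
result = optimize.minimize_scalar(lambda x: (x - 3) ** 2)

print("Integral of sin(x) on [0, pi]:", area)
print("Estimated integration error:", abs_error)
print("Minimizer of (x - 3)^2:", result.x)
```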

3. Pandas

  • Contains high-level data structures and manipulation tools
  • Designed for data manipulation and analysis
  • Fast and flexible

Installation commands

For Windows:
pip install pandas

For Linux:
$ pip3 install pandas
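
As a rough illustration of what data manipulation looks like in Pandas, the sketch below builds a small DataFrame, filters it, and aggregates it. The column names and values are hypothetical sample data invented for this example.

```python
import pandas as pd

# Hypothetical sample data, made up for this illustration.
df = pd.DataFrame({
    "city": ["Pune", "Mumbai", "Pune", "Mumbai"],
    "sales": [100, 150, 200, 50],
})

# Keep only the rows that satisfy a boolean condition.
large_sales = df[df["sales"] >= 100]

# Group rows by a column and sum the values within each group.
total_by_city = df.groupby("city")["sales"].sum()

print(large_sales)
print(total_by_city)
```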

4. Scikit-learn

  • Simple and efficient tools for predictive data analysis
  • Covers machine learning and statistical modeling tasks including classification, regression, clustering, and dimensionality reduction
  • Features various algorithms such as support vector machines and random forests

Installation commands

For Windows:
pip install scikit-learn

For Linux:
$ pip3 install scikit-learn
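
To see the typical scikit-learn workflow (split the data, fit a model, score it), here is a minimal sketch using a random forest on the iris dataset that ships with the library. The split size, number of trees, and random seeds are arbitrary choices for illustration, not recommended settings.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load the small iris dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples for testing; the seed makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Train a random forest classifier on the training portion.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# score() returns the mean accuracy on the held-out samples.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Other estimators, such as a support vector machine, follow the same fit and score pattern.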


5. NLTK

  • Contains text processing libraries for tokenization, parsing, classification, stemming, tagging, and semantic reasoning
  • Used for developing applications and services that can understand human languages

Installation commands

For Windows:
pip3 install nltk

For Linux:
$ sudo apt-get install python3-numpy python3-nltk
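
Tokenization and stemming, two of the NLTK features listed above, can be tried with the short sketch below. It deliberately uses a regex-based tokenizer and the Porter stemmer because, as far as I know, neither needs extra NLTK data downloads; the sample sentence is made up.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

# A made-up sentence to process.
text = "Students are studying machine learning libraries."

# Split the text into word and punctuation tokens using a regular expression.
tokens = wordpunct_tokenize(text)

# Reduce each token to its stem with the Porter stemming algorithm.
stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]

print(tokens)
print(stems)
```

Other NLTK tools, such as part-of-speech taggers, usually require downloading additional resources first with nltk.download().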

6. Matplotlib

  • Comprehensive library for creating static, animated, and interactive visualizations
  • Develop publication-quality plots with just a few lines of code
  • Use interactive figures that can zoom, pan, and update

Installation commands

For Windows:
python -m pip install -U pip
python -m pip install -U matplotlib

For Linux:
$ sudo apt-get install python3-matplotlib
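
A plot really does take only a few lines with Matplotlib. The sketch below draws a simple line chart and saves it to a file; the output filename simple_plot.png is a hypothetical name chosen for this example, and the Agg backend is selected only so the script also runs on machines without a display.

```python
import matplotlib

# Use a non-interactive backend so the script works without a display.
matplotlib.use("Agg")
import matplotlib.pyplot as plt

# Sample data: the squares of a few integers.
x = [0, 1, 2, 3, 4]
y = [value ** 2 for value in x]

# Create a figure with one set of axes and draw a labeled line on it.
fig, ax = plt.subplots()
ax.plot(x, y, marker="o", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A simple line plot")
ax.legend()

# Write the figure to an image file (hypothetical filename).
fig.savefig("simple_plot.png")
```

To view the figure in an interactive window instead, drop the matplotlib.use("Agg") line and call plt.show() at the end.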

I hope this post proves helpful to data science enthusiasts.

Any feedback would be much appreciated.

Posted on Jun 2 by:

shubham chaudhari


Data Science | Machine learning | NLP


I would like to add spaCy for NLP, and TensorFlow and PyTorch for neural networks.


Yeah, I forgot to add those. Thanks for the addition. It will surely help our data science enthusiasts.