DEV Community

AdaChime


Data visualization with Python for beginners

You might have heard of data visualization before, but do you really know what it means? Data visualization is the graphical representation of data using charts, graphs, plots, and so on.

Why is data visualization important?
Data visualization is important for the following reasons:

  • Identification of useful patterns in data:
    Data visualization helps us see whether our data shows any correlation, positive or negative.

  • Better understanding of our data:
    Data visualization helps us ask reasonable questions about why something is the way it is. It makes the data easier to comprehend. From the plot below, we can see that tables of size 4 pay the highest tips.

[Plot: tips by table size; tables of size 4 pay the highest tip]

  • Generating useful insights: When data is visualized, it helps decision makers get useful insights from it. For example, the plot below shows that there are more customers on Saturday than on other days. This tells you on which days you should have more waiters, and so on.

[Plot: number of customers per day; Saturday has the most customers]

  • Making profitable business decisions: With the useful insights derived from data, decision makers are able to make profitable business decisions.
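
To make the points above concrete, here is a minimal sketch of how such patterns could be checked with pandas and seaborn. The table below is made up for illustration (it is not the dataset behind the plots in this post), and the column names `total_bill`, `tip`, `size` and `day` are assumptions.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Made-up restaurant records, for illustration only (not the post's dataset).
bills = pd.DataFrame({
    "total_bill": [12.5, 20.0, 35.2, 48.9, 18.3, 27.4, 40.1, 15.8],
    "tip":        [2.0, 3.0, 5.5, 7.0, 2.5, 4.0, 6.0, 2.2],
    "size":       [2, 2, 3, 4, 2, 3, 4, 2],
    "day":        ["Thur", "Fri", "Sat", "Sat", "Sun", "Sat", "Sun", "Fri"],
})

# Pattern: a value close to +1 means tips rise with the bill (positive correlation).
correlation = bills["total_bill"].corr(bills["tip"])
print(f"Correlation between bill and tip: {correlation:.2f}")

# Understanding: average tip for each table size.
mean_tip_by_size = bills.groupby("size")["tip"].mean()

# Insight: number of visits recorded on each day.
visits_per_day = bills["day"].value_counts()

# Draw both views side by side.
fig, (ax_tip, ax_day) = plt.subplots(1, 2, figsize=(10, 4))
sns.barplot(data=bills, x="size", y="tip", ax=ax_tip)   # bar height = mean tip
ax_tip.set_title("Average tip by table size")
sns.countplot(data=bills, x="day", ax=ax_day)           # bar height = row count
ax_day.set_title("Visits per day")
plt.tight_layout()
# In a notebook the figure appears below the cell; in a script, call plt.show().
```

With real data you would load your own file (for example with `pd.read_csv`) instead of typing the rows by hand.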

When plotting a visualization, we need to take note of the type of data we have: continuous or categorical.

Continuous data (a continuous variable) is quantitative data that can take any value within a range. Examples of continuous data are the scores of students in a test and the prices of houses over a given period of time. We use specific plots for this kind of data, e.g.

  1. Scatter plots
  2. Histograms
  3. Swarm plots
  4. Line plots
  5. Strip plots, etc.
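
As a rough sketch, two of these plots could be drawn with matplotlib as follows. The scores and house prices are made-up numbers, used only to show the syntax.

```python
import matplotlib.pyplot as plt

# Made-up test scores and monthly house prices, for illustration only.
scores = [45.5, 52.0, 58.5, 61.0, 64.5, 67.0, 70.5, 72.0, 75.5, 88.0]
months = [1, 2, 3, 4, 5, 6]
house_prices = [200.0, 204.5, 203.0, 210.0, 215.5, 221.0]  # in thousands

fig, (ax_hist, ax_line) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: groups the scores into bins to show how they are distributed.
counts, bin_edges, _ = ax_hist.hist(scores, bins=5)
ax_hist.set_xlabel("Test score")
ax_hist.set_ylabel("Number of students")

# Line plot: shows how a continuous value changes over time.
ax_line.plot(months, house_prices, marker="o")
ax_line.set_xlabel("Month")
ax_line.set_ylabel("House price (thousands)")

plt.tight_layout()
# In a notebook the figure appears below the cell; in a script, call plt.show().
```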

Categorical data (a categorical variable) is data that has a limited number of possible values or can be grouped into categories. Examples of categorical data are gender, days of the week, race, eye/hair color, and highest level of educational qualification. Some of the plots that can be used to visualize categorical data are:

  1. Point plot
  2. Count plot
  3. Bar plot (bar chart)
  4. Pie chart, etc.
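
Here is a small sketch of a count plot and a pie chart for categorical data. The survey answers and the column name `eye_color` are made up for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Made-up survey answers, for illustration only.
survey = pd.DataFrame({
    "eye_color": ["brown", "blue", "brown", "green", "brown", "blue", "brown"],
})

fig, (ax_count, ax_pie) = plt.subplots(1, 2, figsize=(10, 4))

# Count plot: one bar per category, bar height = number of rows in that category.
sns.countplot(data=survey, x="eye_color", ax=ax_count)

# Pie chart: the same counts shown as shares of the whole.
color_counts = survey["eye_color"].value_counts()
ax_pie.pie(
    color_counts.values,
    labels=color_counts.index.tolist(),
    autopct="%1.0f%%",  # print each share as a whole-number percentage
)

plt.tight_layout()
# In a notebook the figure appears below the cell; in a script, call plt.show().
```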

Why do we use Python?
Python is a programming language that can be used in many different ways. It can be used for website development, software development, data visualization, etc. It has great libraries for visualizing data, for example matplotlib and seaborn, which offer a comprehensive range of visualizations.
One of the most used IDEs (integrated development environments) for visualization is Jupyter Notebook, which can be installed through Anaconda. Jupyter Notebook is used because it is user-friendly, it is said to be safer, and it doesn't require an internet connection. Other IDEs that can be used in place of Jupyter Notebook are PyCharm, Spyder, etc.

To create a visualization, we first import these libraries into our notebook:

import pandas as pd  # used for data analysis
import matplotlib.pyplot as plt  # core plotting library
import seaborn as sns  # statistical plots built on top of matplotlib

We are going to make a simple visualization:

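# This is what the basic syntax for making a plot looks like
x = [2, 3, 6]     # values for the horizontal axis
y = [10, 30, 50]  # values for the vertical axis
plt.plot(x, y)    # draw a line through the (x, y) points
plt.show()        # display the figure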

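
A plot is easier to read once it has labels. As a possible next step, the same data can be drawn with axis labels, a title and a legend. The label texts here are made up for illustration.

```python
import matplotlib.pyplot as plt

x = [2, 3, 6]
y = [10, 30, 50]

# plt.subplots() returns a figure and an axes object to draw on.
fig, ax = plt.subplots()
ax.plot(x, y, marker="o", label="y values")  # label text is made up
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A simple line plot")
ax.legend()  # shows the label given to ax.plot
# In a notebook the figure appears below the cell; in a script, call plt.show().
```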

To learn more about data visualization, you can watch this video or take this course.
Always remember, Google is your friend.
