Computers work with data all the time. Sometimes that data lives in databases, sometimes on the web, sometimes it comes from sensor input, and sometimes it is office data such as Excel or CSV files.
You probably know that you can easily parse a CSV file with pandas. But did you know you can just as easily create plots directly from the CSV data?
A CSV data set is simply data in plain text. It could come from an office suite like GSheets or OpenOffice, where you can save a spreadsheet as CSV (comma-separated values). As the name suggests, the values in each row are separated by commas.
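To make the format concrete, here is a minimal sketch that parses a few lines of CSV text with pandas. The titles and numbers are invented for illustration, and `io.StringIO` is only used so the example runs without a file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content: a header row, then one record per line.
# The last record leaves its revenue field empty on purpose.
csv_text = """Title,Runtime (Minutes),Revenue (Millions)
Movie A,121,333.13
Movie B,124,126.46
Movie C,117,
"""

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.columns.tolist())  # column names taken from the header row
print(sample.dtypes)            # pandas infers a type for each column
print(sample)                   # the empty field shows up as NaN
```

Each comma-separated field becomes one column, and the header row supplies the column names.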
Any data set will work, but the example below uses the file IMDB-Movie-Data.csv.
This data set is about movies.
For every movie it stores values such as:
- Runtime (Minutes)
- Revenue (Millions)
That is a fair amount of information per movie, but with 1000 records it is still a small data set.
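Before plotting, it can help to look at what a data set contains. The sketch below uses a tiny stand-in DataFrame with invented values, because the real file may not be at hand. Only the column names come from this article.

```python
import pandas as pd

# Stand-in rows with invented values; with the real file you would use
# movie = pd.read_csv("IMDB-Movie-Data.csv") instead.
movie = pd.DataFrame({
    "Runtime (Minutes)": [121, 124, 117, 108],
    "Revenue (Millions)": [333.13, 126.46, 138.12, 270.32],
    "Rating": [8.1, 7.0, 7.3, 7.2],
})

print(movie.shape)             # (number of rows, number of columns)
print(movie.columns.tolist())  # column names
print(movie.head(2))           # first two records
print(movie.describe())        # summary statistics for each numeric column
```

On the real data set, `movie.shape` is a quick way to confirm the number of records that were loaded.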
We first import pandas, matplotlib for plotting and numpy for number crunching. Then we load the CSV data, create the figure and use matplotlib to plot the data.
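#!/usr/bin/python3
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Load the CSV file into a DataFrame
movie = pd.read_csv("IMDB-Movie-Data.csv")

# Print the average rating
print(movie["Rating"].mean())

# Quick version: pandas draws the histogram itself
movie["Rating"].plot(kind="hist", figsize=(20, 8))

# More control: a second figure drawn with matplotlib, using 20 bins
plt.figure(figsize=(20, 8), dpi=80)
plt.hist(movie["Rating"], 20)

# 21 evenly spaced ticks, one on each edge of the 20 bins
plt.xticks(np.linspace(movie["Rating"].min(), movie["Rating"].max(), 21))
plt.grid(linestyle="--", alpha=0.5)

# Show both figures
plt.show()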
That shows you the distribution of the movie ratings. Mind you, the CSV file holds 1000 records and pandas processes them almost instantly.
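To see the numbers behind a histogram like this, you can compute the bins without drawing anything. The following is a rough sketch using synthetic ratings, since the real values depend on the CSV file. It shows why 20 bins go together with 21 tick positions.

```python
import numpy as np

# Synthetic ratings standing in for movie["Rating"]; the mean, spread
# and 1-10 range are assumptions made for illustration only.
rng = np.random.default_rng(0)
ratings = np.clip(rng.normal(loc=6.7, scale=0.9, size=1000), 1.0, 10.0).round(1)

# 20 equal-width bins between the smallest and largest rating,
# the same default binning that plt.hist(ratings, 20) uses.
counts, edges = np.histogram(ratings, bins=20)

# 20 bins have 21 edges, so 21 evenly spaced ticks land on the bin edges.
ticks = np.linspace(ratings.min(), ratings.max(), 21)

print(len(counts), len(edges))    # number of bins and number of edges
print(np.allclose(edges, ticks))  # whether the ticks coincide with the edges
print(counts.sum())               # every rating falls into exactly one bin
```

With the real data you would pass `movie["Rating"]` to `np.histogram` in place of the synthetic array.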