Seaborn is a Python visualization library built on matplotlib. It's a great tool for statistical visualization. Let's dive in.
First we need to install it with pip install seaborn
and then import the libraries we will use:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
For visualization we need data. We can use any data here: stock data, a CSV file, or even values entered manually. More data makes the plots easier to understand. Seaborn also ships with ready-made example datasets. Run sns.get_dataset_names()
to list the names of all the available datasets.
To load one of them as a table (a pandas DataFrame), we use sns.load_dataset()
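Here is a minimal sketch of the "input manually" route, so it runs without a download. The column names and values are made up for illustration. As far as I know, load_dataset fetches the example data over the internet the first time, so those calls are left as comments.

```python
import pandas as pd
import seaborn as sns

# Hand-typed stand-in for a small dataset (values are illustrative only)
df = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71],
    "day": ["Sun", "Sun", "Sat", "Sat", "Fri", "Fri"],
})
print(df.head())

# With internet access, the bundled examples can be listed and loaded instead:
# print(sns.get_dataset_names())
# tips = sns.load_dataset("tips")
```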
Now, in seaborn there are three main types of plots that we will look at:
 Distribution Plots
 Categorical Plots
 Matrix Plots
To be really informal:
 Distribution plot: data plotted against a single parameter.
 Categorical plot: a parameter has multiple categories (subparameters), and there is data for each one.
 Matrix plot: data plotted against two parameters.
We need to keep this idea in mind while choosing a plot that suits our data.
Secondly, in a Jupyter notebook we should use Shift+Tab inside a function call to bring up its docstring.
Now let's start
Distribution Plot:
Let's discuss some plots that allow us to visualize the distribution of a data set. These plots are:
 distplot
 jointplot
 pairplot
 rugplot
 kdeplot
Distplot:
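distplot shows the distribution of a single variable as a histogram with a KDE curve on top. Note that in seaborn 0.11 and later it is deprecated in favor of histplot and displot. Again, there are lots of options to customize. Just check the docstring with Shift+Tab
and play with it.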
Jointplot:
Here we join two different distribution plots, so we need to define the x and y axes. We can also select the kind of plot drawn in the middle.
The kind options include scatter, reg, resid, kde, and hex.
Pairplot:
Pairplot will plot pairwise relationships across an entire dataframe (for the numerical columns) and supports a color hue argument (for categorical columns).
Again, check the docstring. Try adding hue (color separation according to a categorical column) and palette (a color scheme).
Rugplot:
Rugplots are actually a very simple concept: they just draw a dash mark for every point on a univariate distribution. They are the building block of a KDE plot.
Kdeplot:
kdeplots are Kernel Density Estimation plots. These KDE plots replace every single observation with a Gaussian (Normal) distribution centered around that value, then add all of those Gaussians together to get one smooth curve.
Categorical Plots:
Now let's discuss using seaborn to plot categorical data! There are a few main plot types for this:
 factorplot
 boxplot
 barplot
 countplot
 violinplot
Factorplot:
Factorplot is the most general form of a categorical plot. It can take in a kind parameter to adjust the plot type. Note that factorplot was renamed to catplot in seaborn 0.9, so use catplot on newer versions.
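A short sketch using catplot, assuming seaborn 0.9 or later. The data is synthetic and the column names are placeholders.

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(5)
n = 120
df = pd.DataFrame({
    "sex": rng.choice(["Female", "Male"], size=n),
    "total_bill": rng.gamma(4.0, 5.0, size=n),
})

# kind picks the underlying plot, for example "bar", "box", "violin" or "strip"
g = sns.catplot(data=df, x="sex", y="total_bill", kind="bar")
```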
Barplot and Countplot:
These very similar plots let you aggregate data over a categorical feature. barplot is a general plot that aggregates the values in each category with some function, by default the mean.
Countplot is essentially the same as barplot, except that the estimator explicitly counts the number of occurrences. That is why we only pass the x value.
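Here is a sketch that puts the two side by side on made-up data. Passing estimator=np.mean just spells out the default; np.median or np.std could be swapped in.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(6)
n = 120
df = pd.DataFrame({
    "sex": rng.choice(["Female", "Male"], size=n),
    "total_bill": rng.gamma(4.0, 5.0, size=n),
})

fig, (ax_bar, ax_count) = plt.subplots(1, 2, figsize=(9, 4))

# barplot: aggregate y within each x category (mean here)
sns.barplot(data=df, x="sex", y="total_bill", estimator=np.mean, ax=ax_bar)

# countplot: only x is needed, the bar height is the number of rows
sns.countplot(data=df, x="sex", ax=ax_count)
```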
Boxplot and violinplot:
Boxplot:
Boxplots and violinplots are used to show the distribution of categorical data. A box plot (or box-and-whisker plot) shows the distribution of quantitative data in a way that facilitates comparisons between variables or across levels of a categorical variable. The box shows the quartiles of the dataset while the whiskers extend to show the rest of the distribution, except for points that are determined to be “outliers” using a method that is a function of the interquartile range.
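A sketch of a boxplot on synthetic data, followed by a hand computation of the outlier rule for one category. It assumes the usual default of 1.5 times the interquartile range for the whiskers; the day names and values are invented.

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(7)
n = 200
days = ["Thur", "Fri", "Sat", "Sun"]
df = pd.DataFrame({
    "day": rng.choice(days, size=n),
    "total_bill": rng.gamma(4.0, 5.0, size=n),
})

ax = sns.boxplot(data=df, x="day", y="total_bill", order=days)

# The outlier rule by hand for one category (assuming the 1.5 * IQR default)
sat = df.loc[df["day"] == "Sat", "total_bill"]
q1, q3 = sat.quantile([0.25, 0.75])
iqr = q3 - q1
outliers = sat[(sat < q1 - 1.5 * iqr) | (sat > q3 + 1.5 * iqr)]
print(len(outliers), "points fall outside the whiskers for Sat")
```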
Violin plot:
A violin plot plays a similar role as a box and whisker plot. It shows the distribution of quantitative data across several levels of one (or more) categorical variables such that those distributions can be compared. Unlike a box plot, in which all of the plot components correspond to actual datapoints, the violin plot features a kernel density estimation of the underlying distribution.
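A minimal violinplot sketch on the same kind of made-up data. With a two-level hue, split=True draws one half of the violin per level so they can be compared directly.

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(8)
n = 200
days = ["Thur", "Fri", "Sat", "Sun"]
df = pd.DataFrame({
    "day": rng.choice(days, size=n),
    "sex": rng.choice(["Female", "Male"], size=n),
    "total_bill": rng.gamma(4.0, 5.0, size=n),
})

# Each violin is a KDE of total_bill for that day, split by sex
ax = sns.violinplot(data=df, x="day", y="total_bill", hue="sex",
                    split=True, order=days)
```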
Matrix Plot:
Matrix plots allow you to plot data as color-encoded matrices.
Let's begin by exploring seaborn's heatmap.
In order for a heatmap to work properly, your data should already be in matrix form; the sns.heatmap function basically just colors it in for you. To get a matrix representation we can use .pivot_table()
To get a good plot, let's use the 'flights' dataset.
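A sketch of the pivot-then-heatmap step. To keep it runnable offline, it builds a tiny flights-like table by hand (the passenger numbers are random, not the real data) with year, month and passengers columns.

```python
import numpy as np
import pandas as pd
import seaborn as sns

months = ["Jan", "Feb", "Mar", "Apr"]
years = [1949, 1950, 1951]
rng = np.random.default_rng(9)

# Long-form table: one row per (year, month) pair, values are made up
flights = pd.DataFrame(
    [(y, m, int(rng.integers(100, 200))) for y in years for m in months],
    columns=["year", "month", "passengers"],
)

# Pivot to a matrix: months as rows, years as columns
matrix = flights.pivot_table(values="passengers", index="month", columns="year")
matrix = matrix.reindex(months)  # restore calendar order after the pivot

ax = sns.heatmap(matrix, cmap="magma", linecolor="white", linewidths=1)
```

With internet access, sns.load_dataset("flights") could replace the hand-built table.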
There are more customizing options and plotting dimensions. But this was the basic idea.
Notes:
 Know what data goes with your plot and select that data for the desired plot
 Know the types of the plots
 Syntax is the main thing

Shift+Tab
is your best friend. Get to know it and play with it.