Nivesh Bansal

Seaborn Complete Roadmap - in 2 Days (with "tips" dataset)

A focused 2-day roadmap to master Seaborn for data analytics using the tips dataset.
This guide includes setup, explanations, code examples, pitfalls, checkpoints, and a mini-project.
By the end, you’ll have job-ready visualization skills.


Source Code: Click here
Written By: Nivesh Bansal Linkedin GitHub Instagram


Seaborn Roadmap for Data Analytics (Using tips)

Goal: Master Seaborn’s essentials in 2 days (or one power-day).
This roadmap covers every necessary topic with explanations, code snippets, and practice tasks—focused purely on practical analytics.

Key Outcomes:

  • Dataset: sns.load_dataset("tips")
  • Focus: EDA • Storytelling • Clean visuals
  • Result: Job-ready plotting skills

Tip: Don’t memorize. For each topic:

  • Run the example
  • Tweak 2–3 parameters
  • Write one insight in plain English

Requirements:

  • Python ≥ 3.9
  • Libraries: pandas, numpy, matplotlib, seaborn
  • IDE: Jupyter/Colab or any Python IDE

0) Quick Setup

pip install seaborn matplotlib pandas numpy

Then import the libraries, set the theme, and load the dataset:

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

sns.set_theme(style="whitegrid", context="notebook")
tips = sns.load_dataset("tips")
tips.head()

Dataset Columns:
total_bill, tip, sex, smoker, day, time, size


📅 Day 1 — Foundations & Core EDA

Goal: Understand Seaborn’s API, explore distributions, compare categories, and scan pairwise relationships quickly.


1) Seaborn Basics: Figure-level vs Axes-level

  • Axes-level → e.g., sns.scatterplot (draws on Matplotlib Axes, returns Axes)
  • Figure-level → e.g., sns.catplot, sns.pairplot (manages its own figure/layout, returns a grid object such as FacetGrid or PairGrid)
  • Common params: data=, x=, y=, hue=, style=, size=
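
A minimal sketch of the difference between the two levels. It assumes a small hand-made DataFrame (`demo`, a hypothetical stand-in for a few rows of tips so it runs offline) and a non-interactive Matplotlib backend:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for a few rows of tips
demo = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71],
    "time": ["Dinner", "Dinner", "Lunch", "Lunch", "Dinner", "Lunch"],
})

# Axes-level: draws on the Axes you pass in and returns that same Axes
fig, ax = plt.subplots(figsize=(4, 3))
returned_ax = sns.scatterplot(data=demo, x="total_bill", y="tip", ax=ax)
print(returned_ax is ax)

# Figure-level: creates its own figure and returns a FacetGrid
grid = sns.relplot(data=demo, x="total_bill", y="tip", col="time", height=3)
print(type(grid).__name__, grid.axes.shape)
```

Rule of thumb: reach for the axes-level function when you already have a subplot layout, and the figure-level one when you want seaborn to build the grid for you.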

2) Univariate Distributions

Use these to understand shape, center, spread, and outliers:

  • histplot — histogram (+ KDE option)
  • kdeplot — kernel density estimate
  • ecdfplot — empirical CDF (great for medians & quantiles)
  • countplot — frequency for categorical variables
# Histogram & KDE
sns.histplot(tips, x="total_bill", bins=20, kde=True)
plt.title("Distribution of Total Bill")
plt.show()

# ECDF
sns.ecdfplot(tips, x="tip")
plt.title("ECDF of Tip")
plt.show()

# Count
sns.countplot(data=tips, x="day")
plt.title("Count by Day")
plt.show()

When to use: sanity checks, skewness, choosing transforms, spotting outliers.
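
To see what the ECDF and the skewness check are telling you, here is a rough sketch that computes both by hand with NumPy and pandas. The tip values are hypothetical, not taken from the tips dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical tip amounts (not taken from the tips dataset)
tip = pd.Series([1.0, 1.5, 2.0, 2.0, 3.0, 3.5, 4.0, 5.0, 10.0])

# ECDF by hand: sorted values against the share of observations <= each value
x = np.sort(tip.to_numpy())
y = np.arange(1, len(x) + 1) / len(x)

# The median is the first x where the ECDF reaches 0.5
median_from_ecdf = x[np.searchsorted(y, 0.5)]

# A mean above the median and a positive skewness both point to a right tail
print(median_from_ecdf, tip.median(), tip.mean(), tip.skew())
```

On a plot from sns.ecdfplot you read the same thing visually: find 0.5 on the y-axis and look across to the curve.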


3) Categorical ↔ Numerical

Compare distributions across groups:

  • boxplot — median, IQR, whiskers, outliers
  • violinplot — full distribution via KDE
  • boxenplot — for large samples
  • stripplot / swarmplot — raw points
  • barplot / pointplot — aggregated means/CI
# Box vs Violin
fig, ax = plt.subplots(1,2, figsize=(10,4))
sns.boxplot(data=tips, x="day", y="total_bill", ax=ax[0])
sns.violinplot(data=tips, x="day", y="tip", ax=ax[1])
ax[0].set_title("Total Bill by Day")
ax[1].set_title("Tip by Day")
plt.tight_layout(); plt.show()

# Strip plot
sns.stripplot(data=tips, x="smoker", y="tip", jitter=True)
plt.title("Raw Tips by Smoker")
plt.show()

# Mean with CI
# seaborn >= 0.12: errorbar=("ci", 95) replaces the deprecated ci=95
sns.barplot(data=tips, x="sex", y="tip", estimator="mean", errorbar=("ci", 95))
plt.title("Avg Tip by Sex")
plt.show()

Overlay a stripplot on a violinplot to show the distribution shape and the raw data in one chart.
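
One possible way to layer the two plots is to draw both on the same Axes. This sketch uses a hypothetical `demo` DataFrame in place of tips and a headless backend, so treat the values as placeholders:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from matplotlib.collections import PathCollection

# Hypothetical stand-in for tips
demo = pd.DataFrame({
    "day": ["Sat"] * 6 + ["Sun"] * 6,
    "tip": [1.0, 2.0, 2.5, 3.0, 3.5, 5.0, 1.5, 2.0, 3.0, 3.5, 4.0, 6.5],
})

fig, ax = plt.subplots(figsize=(5, 4))
# inner=None drops the violin's built-in box so the points stay readable
sns.violinplot(data=demo, x="day", y="tip", inner=None, color="lightgray", ax=ax)
# Same Axes, so the raw points are drawn on top of the violins
sns.stripplot(data=demo, x="day", y="tip", color="black", size=4, jitter=True, ax=ax)
ax.set_title("Tip by Day: distribution + raw points")

# Count the scatter collections that stripplot added
point_layers = [c for c in ax.collections if isinstance(c, PathCollection)]
print(len(point_layers))
```

With the real dataset, swap `demo` for `tips`; a muted violin color keeps the points as the visual focus.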


4) Numeric ↔ Numeric Relationships

Start with scatterplots, optionally add regression.

# Basic scatter
sns.scatterplot(data=tips, x="total_bill", y="tip")
plt.title("Total Bill vs Tip")
plt.show()

# Add hue/style/size
sns.scatterplot(data=tips, x="total_bill", y="tip", hue="sex", style="smoker", size="size")
plt.title("Bill vs Tip by Sex/Smoker/Size")
plt.show()

# Trend line
sns.regplot(data=tips, x="total_bill", y="tip", scatter_kws={"alpha":0.6})
plt.title("Trend: Tip ~ Total Bill")
plt.show()
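
regplot draws the fitted line but does not report its coefficients. If you want numbers for your insight, a rough sketch with NumPy (hypothetical values, not from the tips dataset) could look like this:

```python
import numpy as np

# Hypothetical bills and tips (not taken from the tips dataset)
total_bill = np.array([10.0, 15.0, 20.0, 25.0, 30.0, 40.0])
tip = np.array([1.5, 2.5, 3.0, 4.0, 4.5, 6.5])

# Degree-1 polynomial fit = ordinary least squares line
slope, intercept = np.polyfit(total_bill, tip, deg=1)

# Pearson r summarises how tightly the points follow that line
r = np.corrcoef(total_bill, tip)[0, 1]
print(f"tip ~ {slope:.3f} * total_bill + {intercept:.3f}, r = {r:.3f}")
```

The slope reads as "extra tip per extra dollar of bill", which is an easy sentence to put next to the chart.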

5) Fast Pairwise Scans

g = sns.pairplot(tips, hue="sex", diag_kind="hist")
g.figure.suptitle("Pairwise Relationships (tips)", y=1.02)  # title the grid's own figure
plt.show()

Checkpoint (Day 1 done): You can read distributions, compare groups, and see pairwise trends. Write 3 insights from the dataset.


📅 Day 2 — Multivariate, Facets, Correlations & Pro Styling

Goal: Add faceting, correlations, palettes, and create presentation-ready visuals.


6) Faceting & Small Multiples

Split data into subplots by category.

# Facet by smoker
g = sns.catplot(data=tips, x="day", y="tip", hue="sex", col="smoker", kind="bar")
g.figure.suptitle("Tips by Day (faceted by Smoker)", y=1.02)  # title the grid's own figure
plt.show()

# Scatter with facets
sns.relplot(data=tips, x="total_bill", y="tip", hue="sex", col="time", kind="scatter")
plt.show()

Facets make comparisons obvious without clutter.
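
Facets can also run in two directions at once (`row=` and `col=`), and the returned FacetGrid has helpers for labels and titles. A sketch under the assumption of a hypothetical `demo` DataFrame and a headless backend:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import pandas as pd
import seaborn as sns

# Hypothetical rows shaped like tips; every smoker/time combination is present
demo = pd.DataFrame({
    "total_bill": [12.0, 18.5, 24.0, 31.0, 15.5, 22.0, 28.5, 40.0],
    "tip": [2.0, 3.0, 3.5, 5.0, 2.5, 3.0, 4.0, 6.0],
    "smoker": ["No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes"],
    "time": ["Lunch", "Dinner", "Lunch", "Dinner", "Lunch", "Dinner", "Lunch", "Dinner"],
})

# One panel per smoker (rows) x time (columns) combination
g = sns.relplot(data=demo, x="total_bill", y="tip",
                row="smoker", col="time", height=2.5)
g.set_axis_labels("Total Bill ($)", "Tip ($)")   # shared axis labels
g.set_titles("{row_name} | {col_name}")          # short panel titles
print(g.axes.shape)
```

Keep the grid small: beyond roughly 2×3 panels the individual charts get hard to read.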


7) Correlations & Heatmaps

corr = tips[["total_bill","tip","size"]].corr()
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", square=True)
plt.title("Correlation (tips)")
plt.show()

# Clustered heatmap
sns.clustermap(corr, annot=True, fmt=".2f", cmap="coolwarm")
plt.show()

Read values: close to ±1 → strong linear relationship; close to 0 → weak or no linear relationship (a nonlinear one can still exist).
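
A quick way to convince yourself that the default (Pearson) correlation only measures linear relationships. The columns below are synthetic and purely illustrative:

```python
import numpy as np
import pandas as pd

x = np.arange(-5, 6)  # symmetric around 0
df = pd.DataFrame({
    "x": x,
    "linear": 2 * x + 1,   # perfectly linear in x
    "squared": x ** 2,     # fully determined by x, but not linear
})

corr = df.corr()  # Pearson by default
print(corr.round(2))
```

`squared` depends entirely on `x`, yet its correlation with `x` is essentially zero, so always look at the scatterplot before trusting a heatmap cell.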


8) Time/Ordered Trends

avg = tips.groupby("size", as_index=False)["tip"].mean()
sns.lineplot(data=avg, x="size", y="tip")
plt.title("Average Tip by Party Size")
plt.show()

9) Styling, Palettes & Layout

sns.set_theme(style="whitegrid", context="talk", palette="deep")

ax = sns.scatterplot(data=tips, x="total_bill", y="tip", hue="sex")
ax.set_title("Tips vs Total Bill")
ax.set_xlabel("Total Bill ($)")
ax.set_ylabel("Tip ($)")
sns.despine()
plt.tight_layout(); plt.show()
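
set_theme changes the style for every later plot. If you only want a different look for one chart, sns.axes_style can be used as a context manager. A small sketch with hypothetical points and a headless backend:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import seaborn as sns

before = plt.rcParams["axes.facecolor"]

# axes_style() as a context manager changes the style only inside the block
with sns.axes_style("darkgrid"):
    inside = plt.rcParams["axes.facecolor"]
    fig, ax = plt.subplots(figsize=(4, 3))
    ax.plot([10, 20, 30], [1.5, 3.0, 4.5], marker="o")  # hypothetical points
    ax.set_title("Drawn with a temporary darkgrid style")

after = plt.rcParams["axes.facecolor"]
print(before, inside, after)
```

The Axes must be created inside the with block, because the style is applied when the Axes is built.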

10) Legends, Annotations & Saving

ax = sns.regplot(data=tips, x="total_bill", y="tip")
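ax.annotate("Higher tips with higher bills",
            xy=(40,7), xytext=(25,8.5),
            arrowprops=dict(arrowstyle="->", color="black"))  # black stays visible on the white background
# regplot adds no legend by default; remove one only if it exists
if ax.legend_ is not None:
    ax.legend_.remove()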
plt.tight_layout()
plt.savefig("tips_scatter.png", dpi=300, bbox_inches="tight", transparent=True)
plt.show()
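
The example above has no legend to manage. For a plot that does (any plot with hue=), sns.move_legend (available since seaborn 0.11.2) can reposition it. This sketch uses a hypothetical `demo` DataFrame and saves to an in-memory buffer instead of a file so it leaves nothing on disk:

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical rows shaped like tips
demo = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71],
    "sex": ["Female", "Male", "Male", "Male", "Female", "Male"],
})

fig, ax = plt.subplots(figsize=(5, 4))
sns.scatterplot(data=demo, x="total_bill", y="tip", hue="sex", ax=ax)

# Move the legend outside the plotting area, to the right of the Axes
sns.move_legend(ax, "upper left", bbox_to_anchor=(1, 1))

# bbox_inches="tight" keeps the outside legend inside the saved image
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=300, bbox_inches="tight", transparent=True)
png_bytes = buf.getvalue()
print(len(png_bytes) > 0)
```

To write a file instead, pass a filename such as "tips_scatter.png" to savefig in place of the buffer.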

11) Cheat-Sheet: Axes vs Figure Level

  • Axes-level: scatterplot, lineplot, histplot, kdeplot, boxplot, violinplot, heatmap, regplot
    Use when you manage subplots manually.

  • Figure-level: relplot, catplot, jointplot, pairplot, lmplot
    Use for quick grids/facets and auto layouts.


12) Common Pitfalls

  • Overplotting → use alpha, hexbin, or kdeplot
  • Don’t rely on defaults → always set titles/labels
  • For groups → prefer violin/box + strip over bar means
  • Keep consistent color semantics

Checkpoint (Day 2 done): You can facet, compare multivariate trends, style for clarity, and export.


Mini-Project (Deliverable)

Question: What factors drive higher tips?

Steps:

  1. Univariate: distribution of total_bill, tip
  2. Groups: tip by day, sex, smoker, time
  3. Relationship: total_bill ↔ tip (add hue & regression)
  4. Correlation heatmap for numeric vars
  5. Facet by smoker/time
  6. Report: 5 insights + 2 charts for LinkedIn/portfolio
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips").assign(tip_pct=lambda d: d["tip"] / d["total_bill"] * 100)

# 1) Distribution
sns.histplot(tips, x="tip_pct", bins=20, kde=True)
plt.title("Tip % Distribution"); plt.show()

# 2) Groups
sns.boxplot(data=tips, x="day", y="tip_pct", hue="smoker")
plt.title("Tip % by Day & Smoker"); plt.show()

# 3) Relationship with hue
sns.scatterplot(data=tips, x="total_bill", y="tip_pct", hue="time", style="sex")
plt.title("Tip % vs Total Bill by Time/Sex"); plt.show()

# 4) Correlation
num = tips[["total_bill","tip","size","tip_pct"]]
sns.heatmap(num.corr(), annot=True, fmt=".2f", cmap="coolwarm")
plt.title("Correlation (with Tip %)"); plt.show()
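
For the report step, it helps to back each chart with a small summary table. A possible sketch with pandas, using hypothetical rows shaped like tips (with the real dataset, day and smoker are categorical columns, which is why observed=True is passed):

```python
import pandas as pd

# Hypothetical rows shaped like tips
demo = pd.DataFrame({
    "total_bill": [10.0, 20.0, 40.0, 25.0, 50.0, 16.0],
    "tip": [2.0, 3.0, 4.0, 5.0, 5.0, 4.0],
    "day": ["Sat", "Sat", "Sat", "Sun", "Sun", "Sun"],
    "smoker": ["No", "Yes", "No", "No", "Yes", "No"],
}).assign(tip_pct=lambda d: d["tip"] / d["total_bill"] * 100)

# Count, median, and mean tip % per day/smoker group
summary = (
    demo.groupby(["day", "smoker"], observed=True)["tip_pct"]
        .agg(["count", "median", "mean"])
        .round(1)
        .reset_index()
)
print(summary)
```

Quote the count next to every average in your insights; a high mean from two or three bills is not much evidence.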

Practice Checklist

  • Plot hist+KDE for total_bill; describe skewness
  • Compare tip across day using box+strip
  • Scatter total_bill vs tip with hue=sex, style=smoker
  • Create pairplot with hue=time
  • Build correlation heatmap; write 2 interpretations
  • Facet bar chart by smoker and time
  • Export one figure at 300 DPI with transparent background

Quick Reference

Most-used APIs:

  • scatterplot, lineplot, histplot, kdeplot, ecdfplot
  • boxplot, violinplot, stripplot, barplot, pointplot
  • pairplot, jointplot, relplot, catplot
  • heatmap, clustermap, regplot

Styling:

  • sns.set_theme(style, palette, context)
  • sns.despine(), plt.tight_layout()
  • Palettes: deep, muted, pastel, bright, dark, colorblind

Written for: Nivesh Bansal — Data Analytics Journey, Day 10.
You can copy any code block and practice directly.

Happy plotting!


