Nivesh Bansal

Posted on Sep 16

Seaborn Complete Roadmap - in 2 Days (with "tips" dataset)

#seaborn #roadmap #python #datavisualization

A focused 2-day roadmap to master Seaborn for data analytics using the tips dataset.
This guide includes setup, explanations, code examples, pitfalls, checkpoints, and a mini-project.
By the end, you’ll have job-ready visualization skills.

Source Code: Click here
Written By: Nivesh Bansal Linkedin GitHub Instagram

Seaborn Roadmap for Data Analytics (Using `tips`)

Goal: Master Seaborn’s essentials in 2 days (or one power-day).
This roadmap covers every necessary topic with explanations, code snippets, and practice tasks—focused purely on practical analytics.

Key Outcomes:

Dataset: sns.load_dataset("tips")
Focus: EDA • Storytelling • Clean visuals
Result: Job-ready plotting skills

Tip: Don’t memorize. For each topic:

Run the example
Tweak 2–3 parameters
Write one insight in plain English

Requirements:

Python ≥ 3.9
Libraries: pandas, numpy, matplotlib, seaborn (the snippets pass data positionally, which needs seaborn ≥ 0.12)
IDE: Jupyter/Colab or any Python IDE

0) Quick Setup

pip install seaborn matplotlib pandas numpy

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

sns.set_theme(style="whitegrid", context="notebook")
tips = sns.load_dataset("tips")
tips.head()

Dataset Columns:
total_bill, tip, sex, smoker, day, time, size

📅 Day 1 — Foundations & Core EDA

Goal: Understand Seaborn’s API, explore distributions, compare categories, and scan pairwise relationships quickly.

1) Seaborn Basics: Figure-level vs Axes-level

Axes-level → e.g., sns.scatterplot (draws on Matplotlib Axes, returns Axes)
Figure-level → e.g., sns.catplot, sns.pairplot (manages own figure/layout)
Common params: data=, x=, y=, hue=, style=, size=
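To make the axes-level vs figure-level split concrete, here is a minimal sketch of what each kind of function hands back. It uses a small made-up frame (the `demo` name and its rows are illustrative, not the real tips data) so it runs without downloading anything; swap in `sns.load_dataset("tips")` for real results.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs as a script; not needed in Jupyter
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows with tips-like column names (illustrative only)
demo = pd.DataFrame({
    "total_bill": [10.0, 20.0, 30.0, 40.0],
    "tip": [1.5, 3.0, 4.0, 6.5],
    "time": ["Lunch", "Dinner", "Lunch", "Dinner"],
})

# Axes-level: draws on the Axes you pass in and returns that same Axes
fig, ax = plt.subplots()
returned_ax = sns.scatterplot(data=demo, x="total_bill", y="tip", ax=ax)

# Figure-level: creates its own figure and returns a FacetGrid
grid = sns.relplot(data=demo, x="total_bill", y="tip", col="time")

print(returned_ax is ax)        # True: same object you passed in
print(type(grid).__name__)      # FacetGrid
print(grid.axes.shape)          # (1, 2): one column per `time` value
```

In practice this means axes-level calls slot into `plt.subplots` layouts, while figure-level calls are customized through the returned grid object.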

2) Univariate Distributions

Use these to understand shape, center, spread, and outliers:

histplot — histogram (+ KDE option)
kdeplot — kernel density estimate
ecdfplot — empirical CDF (great for medians & quantiles)
countplot — frequency for categorical variables

# Histogram & KDE
sns.histplot(tips, x="total_bill", bins=20, kde=True)
plt.title("Distribution of Total Bill")
plt.show()

# ECDF
sns.ecdfplot(tips, x="tip")
plt.title("ECDF of Tip")
plt.show()

# Count
sns.countplot(data=tips, x="day")
plt.title("Count by Day")
plt.show()

When to use: sanity checks, skewness, choosing transforms, spotting outliers.
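As a rough illustration of what `ecdfplot` draws and how a median is read from it, the curve can be built by hand with NumPy. The values below are made up for the sketch (not the real `tip` column), and `median_from_ecdf` is just an illustrative name.

```python
import numpy as np

# Made-up tip amounts (illustrative only)
tip_values = np.array([1.0, 2.0, 2.0, 3.0, 3.5, 4.0, 10.0])

# ECDF: sort the values, then give each one the proportion of data <= it
x = np.sort(tip_values)
y = np.arange(1, len(x) + 1) / len(x)

# Median read off the ECDF: first x where the curve reaches 0.5
median_from_ecdf = x[np.searchsorted(y, 0.5)]

print(median_from_ecdf)                      # 3.0
print(tip_values.mean() > median_from_ecdf)  # True: mean above median hints at right skew
```

A mean noticeably above the median is one quick sign of right skew, which is when a log transform is often worth trying.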

3) Categorical ↔ Numerical

Compare distributions across groups:

boxplot — median, IQR, whiskers, outliers
violinplot — full distribution via KDE
boxenplot — for large samples
stripplot / swarmplot — raw points
barplot / pointplot — aggregated means/CI

# Box vs Violin
fig, ax = plt.subplots(1,2, figsize=(10,4))
sns.boxplot(data=tips, x="day", y="total_bill", ax=ax[0])
sns.violinplot(data=tips, x="day", y="tip", ax=ax[1])
ax[0].set_title("Total Bill by Day")
ax[1].set_title("Tip by Day")
plt.tight_layout(); plt.show()

# Strip plot
sns.stripplot(data=tips, x="smoker", y="tip", jitter=True)
plt.title("Raw Tips by Smoker")
plt.show()

# Mean with CI
sns.barplot(data=tips, x="sex", y="tip", estimator="mean", errorbar=("ci", 95))  # `ci=` is deprecated since seaborn 0.12
plt.title("Avg Tip by Sex")
plt.show()

Combine violinplot + stripplot for distributions + raw data.
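One possible way to layer the two plots is to draw both on the same Axes, with the violin's inner marks turned off so the raw points stay readable. The data here is randomly generated with tips-like column names (an assumption for the sketch, not the real dataset).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs as a script; not needed in Jupyter
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Randomly generated stand-in for tips (illustrative only)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "day": np.repeat(["Thur", "Fri", "Sat", "Sun"], 30),
    "tip": rng.gamma(shape=4.0, scale=0.75, size=120),
})

fig, ax = plt.subplots(figsize=(6, 4))
# Violin first (background shape), with no inner box so points are not hidden
sns.violinplot(data=demo, x="day", y="tip", inner=None, color="lightgray", ax=ax)
# Raw observations on top, small and semi-transparent
sns.stripplot(data=demo, x="day", y="tip", size=3, alpha=0.6, color="black", ax=ax)
ax.set_title("Tip by Day: density plus raw points")
fig.tight_layout()
```

Drawing order matters: whichever plot is called last sits on top.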

4) Numeric ↔ Numeric Relationships

Start with scatterplots, optionally add regression.

# Basic scatter
sns.scatterplot(data=tips, x="total_bill", y="tip")
plt.title("Total Bill vs Tip")
plt.show()

# Add hue/style/size
sns.scatterplot(data=tips, x="total_bill", y="tip", hue="sex", style="smoker", size="size")
plt.title("Bill vs Tip by Sex/Smoker/Size")
plt.show()

# Trend line
sns.regplot(data=tips, x="total_bill", y="tip", scatter_kws={"alpha":0.6})
plt.title("Trend: Tip ~ Total Bill")
plt.show()
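`regplot` draws the fitted line but does not return its coefficients. If you want the slope for a written insight, one option (an assumption of this sketch, not something the plot itself provides) is a separate least-squares fit with `np.polyfit`. The data is simulated with a known slope of 0.15 so the result can be sanity-checked.

```python
import numpy as np

# Simulated bills and tips with a known relationship (illustrative only)
rng = np.random.default_rng(42)
total_bill = rng.uniform(5, 50, size=100)
tip = 1.0 + 0.15 * total_bill + rng.normal(0, 0.5, size=100)

# Degree-1 polynomial fit returns (slope, intercept)
slope, intercept = np.polyfit(total_bill, tip, deg=1)
# Pearson correlation between the two columns
r = np.corrcoef(total_bill, tip)[0, 1]

print(f"tip = {intercept:.2f} + {slope:.2f} * total_bill (r = {r:.2f})")
```

With the real dataset, replace the simulated arrays with `tips["total_bill"]` and `tips["tip"]`.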

5) Fast Pairwise Scans

sns.pairplot(tips, hue="sex", diag_kind="hist")
plt.suptitle("Pairwise Relationships (tips)", y=1.02)
plt.show()

✅ Checkpoint (Day 1 done): You can read distributions, compare groups, and see pairwise trends. Write 3 insights from the dataset.

📅 Day 2 — Multivariate, Facets, Correlations & Pro Styling

Goal: Add faceting, correlations, palettes, and create presentation-ready visuals.

6) Faceting & Small Multiples

Split data into subplots by category.

# Facet by smoker
sns.catplot(data=tips, x="day", y="tip", hue="sex", col="smoker", kind="bar")
plt.suptitle("Tips by Day (faceted by Smoker)", y=1.02)
plt.show()

# Scatter with facets
sns.relplot(data=tips, x="total_bill", y="tip", hue="sex", col="time", kind="scatter")
plt.show()

Facets make comparisons obvious without clutter.
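Figure-level functions return a `FacetGrid`, and labeling the facets through that object is often tidier than going through `plt`. A minimal sketch, using a made-up frame with tips-like columns (illustrative only):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs as a script; not needed in Jupyter
import pandas as pd
import seaborn as sns

# Made-up rows: 4 observations per (day, smoker) combination
demo = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri"] * 4,
    "smoker": ["Yes"] * 8 + ["No"] * 8,
    "tip": [2.0, 2.5, 3.0, 3.5, 1.5, 2.0, 4.0, 4.5,
            2.2, 2.8, 3.1, 3.9, 1.8, 2.4, 3.6, 4.2],
})

g = sns.catplot(data=demo, x="day", y="tip", col="smoker", kind="bar")
g.set_axis_labels("Day", "Average tip")      # shared axis labels
g.set_titles("Smoker: {col_name}")           # one title per facet
g.figure.suptitle("Tips by Day (faceted by Smoker)", y=1.02)

print(g.axes.shape)   # (1, 2): one panel per smoker value
```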

7) Correlations & Heatmaps

corr = tips[["total_bill","tip","size"]].corr()
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", square=True)
plt.title("Correlation (tips)")
plt.show()

# Clustered heatmap
sns.clustermap(corr, annot=True, fmt=".2f", cmap="coolwarm")
plt.show()

Read values: near ±1 → strong linear relationship; near 0 → little or no linear relationship (a nonlinear pattern can still exist).
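A correlation near zero only rules out a linear trend. This small constructed example (not from the tips data) has a perfect quadratic relationship and a Pearson correlation of essentially zero, which is why a scatterplot should accompany the heatmap.

```python
import numpy as np

# Constructed example: y depends entirely on x, but not linearly
x = np.linspace(-1, 1, 101)
y = x ** 2

r = np.corrcoef(x, y)[0, 1]
print(abs(r) < 1e-9)   # True: no linear correlation despite a perfect relationship
```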

8) Time/Ordered Trends

avg = tips.groupby("size", as_index=False)["tip"].mean()
sns.lineplot(data=avg, x="size", y="tip")
plt.title("Average Tip by Party Size")
plt.show()

9) Styling, Palettes & Layout

sns.set_theme(style="whitegrid", context="talk", palette="deep")

ax = sns.scatterplot(data=tips, x="total_bill", y="tip", hue="sex")
ax.set_title("Tips vs Total Bill")
ax.set_xlabel("Total Bill ($)")
ax.set_ylabel("Tip ($)")
sns.despine()
plt.tight_layout(); plt.show()

10) Legends, Annotations & Saving

ax = sns.regplot(data=tips, x="total_bill", y="tip")
ax.annotate("Higher tips with higher bills",
            xy=(40,7), xytext=(25,8.5),
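            arrowprops=dict(arrowstyle="->", color="black"))  # a white arrow would vanish on the white background
legend = ax.get_legend()
if legend is not None:  # regplot adds no legend by default; the guard matters for hue-based plots
    legend.remove()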
plt.tight_layout()
plt.savefig("tips_scatter.png", dpi=300, bbox_inches="tight", transparent=True)
plt.show()

11) Cheat-Sheet: Axes vs Figure Level

Axes-level: scatterplot, lineplot, histplot, kdeplot, boxplot, violinplot, heatmap, regplot…
Use when you manage subplots manually.
Figure-level: relplot, catplot, jointplot, pairplot, lmplot…
Use for quick grids/facets and auto layouts.

12) Common Pitfalls

Overplotting → use alpha, hex bins (sns.jointplot(kind="hex") or Matplotlib's hexbin), or kdeplot
Don’t rely on defaults → always set titles/labels
For groups → prefer violin/box + strip over bar means
Keep consistent color semantics
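For consistent color semantics, one approach is to fix a category-to-color mapping once and pass it to every plot, so a group never changes color between charts. A minimal sketch with made-up rows (illustrative only); `sex_palette` is a name chosen for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs as a script; not needed in Jupyter
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows with tips-like column names
demo = pd.DataFrame({
    "total_bill": [10.0, 15.0, 22.0, 30.0, 41.0, 48.0],
    "tip": [1.5, 2.0, 3.5, 4.0, 6.0, 7.5],
    "sex": ["Female", "Male", "Female", "Male", "Female", "Male"],
    "day": ["Sat", "Sat", "Sun", "Sun", "Sat", "Sun"],
})

# Define the mapping once, from a colorblind-friendly palette
colors = sns.color_palette("colorblind", 2)
sex_palette = {"Female": colors[0], "Male": colors[1]}

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
# Same dict in both plots, so each group keeps its color; alpha eases overplotting
sns.scatterplot(data=demo, x="total_bill", y="tip", hue="sex",
                palette=sex_palette, alpha=0.6, ax=axes[0])
sns.stripplot(data=demo, x="day", y="tip", hue="sex",
              palette=sex_palette, ax=axes[1])
fig.tight_layout()
```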

✅ Checkpoint (Day 2 done): You can facet, compare multivariate trends, style for clarity, and export.

Mini-Project (Deliverable)

Question: What factors drive higher tips?

Steps:

Univariate: distribution of total_bill, tip
Groups: tip by day, sex, smoker, time
Relationship: total_bill ↔ tip (add hue & regression)
Correlation heatmap for numeric vars
Facet by smoker/time
Report: 5 insights + 2 charts for LinkedIn/portfolio

# Derive tip as a percentage of the bill
tips = sns.load_dataset("tips").assign(tip_pct=lambda d: d["tip"] / d["total_bill"] * 100)

# 1) Distribution
sns.histplot(tips, x="tip_pct", bins=20, kde=True)
plt.title("Tip % Distribution"); plt.show()

# 2) Groups
sns.boxplot(tips, x="day", y="tip_pct", hue="smoker")
plt.title("Tip % by Day & Smoker"); plt.show()

# 3) Relationship with hue
sns.scatterplot(tips, x="total_bill", y="tip_pct", hue="time", style="sex")
plt.title("Tip % vs Total Bill by Time/Sex"); plt.show()

# 4) Correlation
num = tips[["total_bill","tip","size","tip_pct"]]
sns.heatmap(num.corr(), annot=True, fmt=".2f", cmap="coolwarm")
plt.title("Correlation (with Tip %)"); plt.show()
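To back the written insights with numbers, a small `groupby` summary next to each chart can help. This is a sketch on made-up rows (not the real dataset); with the real data, apply the same `agg` call to the `tips` frame built above.

```python
import pandas as pd

# Made-up rows with tips-like column names (illustrative only)
demo = pd.DataFrame({
    "total_bill": [10.0, 20.0, 40.0, 50.0],
    "tip": [2.0, 3.0, 4.0, 10.0],
    "smoker": ["No", "No", "Yes", "Yes"],
}).assign(tip_pct=lambda d: d["tip"] / d["total_bill"] * 100)

# Count, mean, and median tip percentage per group
summary = (
    demo.groupby("smoker")["tip_pct"]
    .agg(["count", "mean", "median"])
    .round(1)
)
print(summary)
```

Reporting the median alongside the mean guards against a few very generous tippers skewing the story.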

Practice Checklist

Plot hist+KDE for total_bill; describe skewness
Compare tip across day using box+strip
Scatter total_bill vs tip with hue=sex, style=smoker
Create pairplot with hue=time
Build correlation heatmap; write 2 interpretations
Facet bar chart by smoker and time
Export one figure at 300 DPI with transparent background

Quick Reference

Most-used APIs:

scatterplot, lineplot, histplot, kdeplot, ecdfplot
boxplot, violinplot, stripplot, barplot, pointplot
pairplot, jointplot, relplot, catplot
heatmap, clustermap, regplot

Styling:

sns.set_theme(style, palette, context)
sns.despine(), plt.tight_layout()
Palettes: deep, muted, pastel, bright, dark, colorblind

Written for: Nivesh Bansal — Data Analytics Journey, Day 10.
You can copy any code block and practice directly.

Happy plotting!
