Oscar Leo

Matplotlib Tutorial: Life Expectancy & Genocide

Background

In this tutorial, I'm creating a horrifying data visualization about how life expectancy at birth changed in Bangladesh, Rwanda, and Cambodia during their genocides. I'm using data from the World Bank open data platform, which you should check out.

If you like this tutorial and visualization, make sure to sign up for my free newsletter, The Data Sensei.


Step 1: Imports

We don't need anything special for this tutorial: pandas, Seaborn, and Matplotlib are enough.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

from matplotlib.patches import Rectangle
from matplotlib.lines import Line2D

Step 2: Styling

I always use Seaborn to create a style that suits the data. Cheerful colors would be out of place for data like this, but I still want a bright chart, so I use only grayscale colors.

font_family = "DejaVu Serif"
background_color = "#e5e5e5"
light_gray = "#858585"
dark_gray = "#454545"
text_color = "#000000"

sns.set_style({
    "axes.facecolor": background_color + "00",
    "figure.facecolor": background_color,
    "axes.edgecolor": text_color,
    "axes.grid": True,
    "axes.axisbelow": True,
    "grid.color": light_gray,
    "text.color": text_color,
    "font.family": font_family,

    "xtick.color": text_color,
    "ytick.color": text_color,
    "xtick.bottom": False,
    "xtick.top": False,
    "ytick.left": False,
    "ytick.right": False,

    "axes.spines.left": False,
    "axes.spines.bottom": False,
    "axes.spines.right": False,
    "axes.spines.top": False,
})
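
The `+ "00"` suffix on the axes face color may look odd at first. As a small sketch of what it does (using `matplotlib.colors.to_rgba`, which is not part of the tutorial code): appending two hex digits to a six-digit color turns it into an `#RRGGBBAA` value, and `00` means fully transparent, so the figure background shows through the axes.

```python
from matplotlib.colors import to_rgba

background_color = "#e5e5e5"

# Six hex digits: an opaque color (alpha defaults to 1.0).
opaque = to_rgba(background_color)

# Eight hex digits: the last two are the alpha channel; "00" is fully transparent.
transparent = to_rgba(background_color + "00")

print("alpha without suffix:", opaque[3])
print("alpha with '00' suffix:", transparent[3])
```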

Step 3: Helper functions

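When I create plots, I usually write reusable helper functions, but this tutorial only needs one because the visualization is so simple. It picks the bar color: dark gray when life expectancy is at or below a threshold (38 years by default) and light gray otherwise.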

def color(v, threshold=38):
    return light_gray if v > threshold else dark_gray
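
To see how the helper behaves around the threshold, here is a short sketch. The input values are made up for illustration and do not come from the dataset.

```python
light_gray = "#858585"
dark_gray = "#454545"

def color(v, threshold=38):
    # Strict comparison: a value exactly at the threshold gets the dark color.
    return light_gray if v > threshold else dark_gray

# Hypothetical life-expectancy values, chosen only to exercise both branches.
for value in (25.0, 38.0, 52.3):
    print(value, color(value))
```

Because the comparison is `v > threshold`, a value of exactly 38 is still drawn in dark gray.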

Step 4: Shared styles

It's a good idea to define styles that you reuse in several places once, outside the main plotting code. Otherwise you have to change things like colors and font sizes in multiple locations, which is annoying and makes it easy to introduce a design bug that you don't find in time.

BAR_STYLE = {"width": 1, "edgecolor": text_color}
LINE_STYLE = {"zorder": -10, "color": dark_gray, "linewidth": 2}
# "PT Serif Caption" must be installed; otherwise Matplotlib warns and falls back to a default font.
YEAR_STYLE = {"va": "center", "fontsize": 28, "fontfamily": "PT Serif Caption"}
INFO_STYLE = {"ha": "left", "va": "top", "fontsize": 24, "linespacing": 1.8, "color": dark_gray}
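
These dictionaries are applied with Python's `**` unpacking. The sketch below shows the mechanism on a single `Rectangle`; the `thin_bar_style` variant is a hypothetical example of a one-off override and is not used in the tutorial.

```python
from matplotlib.patches import Rectangle

text_color = "#000000"
BAR_STYLE = {"width": 1, "edgecolor": text_color}

# ** unpacks the dictionary into keyword arguments:
# this is the same as Rectangle((0, 0), height=5, width=1, edgecolor=text_color).
bar = Rectangle((0, 0), height=5, **BAR_STYLE)

# For a one-off variation, build a modified copy instead of editing the shared dict.
thin_bar_style = {**BAR_STYLE, "width": 0.5}
thin_bar = Rectangle((0, 0), height=5, **thin_bar_style)

print(bar.get_width(), thin_bar.get_width(), BAR_STYLE["width"])
```

One thing to watch: a key that is also passed explicitly in the call raises a `TypeError`, so keys that vary per call (such as `ha` for the year labels) are best left out of the shared dictionary.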

Step 5: Loading data

I have created a preprocessed dataset so that you can focus on the Matplotlib code. The CSV has four columns: one per country (Rwanda, Cambodia, and Bangladesh) plus one for the year. Each value is the life expectancy at birth, meaning how long someone born that year could expect to live if the mortality conditions of that year stayed the same throughout their life. As you will see, that value changes drastically over time.

data = pd.read_csv(
    "https://raw.githubusercontent.com/oscarleoo/matplotlib-tutorial-data/main/life_expectancy_genocide.csv"
)
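
Before plotting, it can help to confirm that the frame has the layout the chart code expects. The sketch below is not part of the original tutorial: `check_columns` is a hypothetical helper, the `year` column name is an assumption (the chart code only accesses `rwanda`, `cambodia`, and `bangladesh`), and the tiny frame uses placeholder numbers rather than real World Bank values.

```python
import pandas as pd

REQUIRED = ("rwanda", "cambodia", "bangladesh")

def check_columns(df, required=REQUIRED):
    """Raise a clear error if a column the chart code needs is missing."""
    missing = [name for name in required if name not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df

# Placeholder numbers for illustration only, not real data.
toy = pd.DataFrame({
    "year": [1960, 1961],
    "rwanda": [1.0, 2.0],
    "cambodia": [3.0, 4.0],
    "bangladesh": [5.0, 6.0],
})

check_columns(toy)

# iterrows() yields (index, row); the chart code uses the index as the x position.
for i, row in toy.iterrows():
    print(i, row.rwanda, row.cambodia, row.bangladesh)
```

With the real dataset, the equivalent call would be `check_columns(data)` right after `pd.read_csv`.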

Step 6: Data visualization

Now, it's time to create the actual chart. It's a straightforward chart where I use Rectangle to draw the bars, and everything else basically adds information using lines and text. I'm using Rectangle instead of something like seaborn.barplot() because it gives me more control, and I don't need subplots. With subplots, I find it more difficult to add custom styles.
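
To make the `Rectangle` approach concrete, here is a stripped-down sketch with one row of bars on a raised baseline. The values are placeholders, and the sketch uses `ax.add_patch`, Matplotlib's dedicated method for patches, rather than the `add_artist` call in the tutorial code. As far as I know, artists added with `add_artist` are not automatically included in autoscaling, so setting the axis limits explicitly matters with either approach.

```python
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

# Placeholder values for illustration only, not real data.
values = [40.0, 30.0, 50.0]
baseline = 100  # y position where this row of bars starts

fig, ax = plt.subplots(figsize=(6, 3))

for i, v in enumerate(values):
    # Each bar is one unit wide, centered on i, and rises from the baseline.
    ax.add_patch(Rectangle((i - 0.5, baseline), width=1, height=v,
                           facecolor="#858585", edgecolor="#000000"))

# Set the limits explicitly so the result does not depend on autoscaling.
ax.set(xlim=(-1, len(values)), ylim=(0, 200))
```

Stacking several countries in one axes then only requires a different baseline per country, which is what the values 200, 100, and 0 do in the full chart.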

There are some hard-coded values to decide the text position, but I've found that it's faster than trying to automate the placement. All charts require some custom design to get the right feel.
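
The hard-coded x positions are not arbitrary. Each bar sits at its row index, and in the chart code that follows the annotation lines for 1971 and 1994 start at x = 11 and x = 34, so the data appears to start in 1960. Under that assumption, the positions can be derived as in this sketch (`FIRST_YEAR`, `year_to_x`, and `span_to_x` are hypothetical names, not part of the tutorial):

```python
FIRST_YEAR = 1960  # assumption inferred from the line positions (1971 -> x = 11)

def year_to_x(year, first_year=FIRST_YEAR):
    """Map a calendar year to the x position (row index) of its bar."""
    return year - first_year

def span_to_x(start_year, end_year, first_year=FIRST_YEAR):
    """Return the x position at the midpoint of a multi-year period."""
    return (year_to_x(start_year, first_year) + year_to_x(end_year, first_year)) / 2

print(year_to_x(1971), span_to_x(1975, 1978), year_to_x(1994))
# prints: 11 16.5 34
```

These match the starting x-values of the three annotation lines for Bangladesh, Cambodia, and Rwanda.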

fig = plt.figure(figsize=(30, 30))
ax = fig.add_subplot()
ax.set(ylim=(-68, 370), xlim=(-2, 63))

# Add bars
for i, row in data.iterrows():
    ax.add_artist(Rectangle((i-0.5, 200), height=row.rwanda, facecolor=color(row.rwanda), **BAR_STYLE))
    ax.add_artist(Rectangle((i-0.5, 100), height=row.cambodia, facecolor=color(row.cambodia), **BAR_STYLE))
    ax.add_artist(Rectangle((i-0.5, 0), height=row.bangladesh, facecolor=color(row.bangladesh), **BAR_STYLE))

# Add country names
ax.annotate("Rwanda", (-0.5, 265), ha="left", va="bottom", fontsize=48)
ax.annotate("Cambodia", (-0.5, 165), ha="left", va="bottom", fontsize=48)
ax.annotate("Bangladesh", (-0.5, 65), ha="left", va="bottom", fontsize=48)

# Tick formatting
ax.tick_params(labelsize=28, pad=16)
ax.set_xticks([])
ax.set_yticks(
    ticks = [0, 20, 40, 60, 100, 120, 140, 160, 200, 220, 240, 260],
    labels = ["0", "20", "40", "60", "0", "20", "40", "60", "0", "20", "40", "60"]
)

# Add lines
ax.add_artist(Line2D([11, 11, 7], [0, -20, -20], **LINE_STYLE))
ax.add_artist(Line2D([16.5, 16.5, 21], [100, -20, -20], **LINE_STYLE))
ax.add_artist(Line2D([34, 34, 42], [200, -20, -20], **LINE_STYLE))

# Bangladesh info 
ax.annotate("1971", (6.5, -20), ha="right", **YEAR_STYLE)
ax.annotate(
    "Ethnic cleansing of\n"
    "Bengalis in East Pakistan during\n"
    "the Bangladesh Liberation War.", (3.8, -28), **INFO_STYLE
)

# Cambodia info
ax.annotate("1975-1978", (21.5, -20), ha="left", **YEAR_STYLE)
ax.annotate(
    "Systematic killing of Cambodian\n"
    "citizens by the Khmer Rouge under\n"
    "the leadership of Pol Pot.", (21.6, -28), **INFO_STYLE
)

# Rwanda info
ax.annotate("1994", (42.5, -20), ha="left", **YEAR_STYLE)
ax.annotate(
    "Hutu militias slaughter\n"
    "the Tutsi population\n"
    "in Rwanda", (42.6, -28), **INFO_STYLE
)

# Title and source
title = "Life Expectancy at Birth During Genocides"
ax.annotate(title, (-0.5, 320), ha="left", va="bottom", fontsize=64)
ax.annotate("Source: World Development Indicators", (-0.5, 343), ha="left", va="bottom", fontsize=32, color=dark_gray)

plt.show()
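
The y-axis tick lists in the code above repeat the same four labels for each country band. As an optional sketch, they can be generated from the three baselines instead of being typed out; the names `offsets` and `steps` are mine, not the tutorial's.

```python
offsets = [0, 100, 200]   # baselines for Bangladesh, Cambodia, and Rwanda
steps = [0, 20, 40, 60]   # tick values within each country's band

# Tick positions are shifted by the baseline; labels show the unshifted value.
ticks = [offset + step for offset in offsets for step in steps]
labels = [str(step) for _ in offsets for step in steps]

print(ticks)
print(labels)
```

The result can be passed as `ax.set_yticks(ticks=ticks, labels=labels)`. Note that, to my knowledge, the `labels` argument of `set_yticks` requires Matplotlib 3.5 or newer.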

When you run this code, you get the following visualization.

[Image: Life expectancy data visualization]


Conclusion

Above, you see the final data visualization. It might not look beautiful at first glance, but remember that the data carries a sad message, and the design has to respect that. I'm happy that the chart holds a fair amount of information yet is still easy to understand at a glance. I think the color scheme works, and the whole visualization has a sober newspaper feel.
