DEV Community

Oscar Leo

How to Create Eye-Catching Country Rankings Using Python and Matplotlib

Hi, and welcome to this tutorial, where I’ll teach you to create a country ranking chart using Python and Matplotlib.

What I like about this visualization is its clean and beautiful way of showing how countries rank compared to each other on a particular metric.

The alternative, a standard line chart showing the actual values, gets messy if some countries are close to each other or if some countries outperform others by a lot.

If you want access to the code for this tutorial, you can find it in this GitHub repository.

If you enjoy this tutorial, make sure to check out my other accounts.

Let’s get started.


About the data

I’ve created a simple CSV containing GDP values for today’s ten largest economies for this tutorial.

Screenshot of pandas DataFrame
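
As a rough sketch of the layout the code in this tutorial expects: one row per country, a country_name column, and one column per year. The country names and GDP values below are made-up placeholders, not World Bank figures, and the exact layout of rankings.csv is an assumption inferred from how the later code reads it.

```python
import io

import pandas as pd

# Placeholder rows in the assumed layout of rankings.csv (values are invented)
csv_text = """country_name,2000,2005
Country A,3.0e12,3.5e12
Country B,1.0e12,2.0e12
"""

example_df = pd.read_csv(io.StringIO(csv_text), index_col=None)

# Header cells are read as strings, which is why the years are written
# as strings ("2000", "2005", ...) in the rest of the code
print(list(example_df.columns))
print(example_df.dtypes)
```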

The data comes from the World Bank, and the full name of the indicator is "GDP (constant 2015 US$)".

If you want to know more about different ways of measuring GDP, you can look at this Medium story, where I use the same type of data visualization.

Let’s get on with the tutorial.


Step 1: Creating rankings

Step one is to rank the countries for each year in the dataset, which is easy to do with pandas.

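def create_rankings(df, columns):
    # One rank column per input column: rank_0, rank_1, ...
    rank_columns = ["rank_{}".format(i) for i in range(len(columns))]
    for i, column in enumerate(columns):
        # ascending=False gives rank 1 to the largest value
        df[rank_columns[i]] = df[column].rank(ascending=False)

    return df, rank_columns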

The resulting columns look like this.

Screenshot of pandas DataFrame
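
One detail worth knowing: rank() uses method="average" by default, so tied values get fractional ranks, which would put a point between two grid rows. Ties are unlikely with real GDP figures, but here is a small sketch with made-up numbers that shows the default next to the method="min" alternative.

```python
import pandas as pd

# Toy values, invented for illustration only
toy = pd.DataFrame({
    "country_name": ["A", "B", "C"],
    "2000": [3.0, 1.0, 2.0],
    "2005": [2.0, 2.0, 1.0],  # A and B are tied
})

# No ties: the largest value gets rank 1
toy["rank_0"] = toy["2000"].rank(ascending=False)

# Ties: the default averages the tied positions, method="min" shares the best one
toy["rank_avg"] = toy["2005"].rank(ascending=False)
toy["rank_min"] = toy["2005"].rank(ascending=False, method="min")

print(toy[["country_name", "rank_0", "rank_avg", "rank_min"]])
```

With the default, A and B both land on 1.5 for 2005. With method="min", they share rank 1 and C gets rank 3.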

That’s all the preprocessing we need to continue with the data visualization.


Step 2: Creating and styling a grid

Now that we have prepared our data, it’s time to create a grid where we can draw our lines and flags.

Here’s a function using Seaborn that creates the overall style. It defines things like the background color and font family. I’m also removing spines and ticks.

import seaborn as sns


def set_style(font_family, background_color, grid_color, text_color):
    sns.set_style({
        "axes.facecolor": background_color,
        "figure.facecolor": background_color,

        "axes.grid": True,
        "axes.axisbelow": True,

        "grid.color": grid_color,

        "text.color": text_color,
        "font.family": font_family,

        # Hide all tick marks
        "xtick.bottom": False,
        "xtick.top": False,
        "ytick.left": False,
        "ytick.right": False,

        # Hide all spines
        "axes.spines.left": False,
        "axes.spines.bottom": False,
        "axes.spines.right": False,
        "axes.spines.top": False,
    })

I run the function with the following values.

font_family = "PT Mono"
background_color = "#FAF0F1"
text_color = "#080520"
grid_color = "#E4C9C9"

set_style(font_family, background_color, grid_color, text_color)
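
If you would rather not depend on Seaborn for this step, the same keys are ordinary Matplotlib rcParams. Here is a minimal sketch of that alternative; set_style_mpl is a hypothetical name, not part of the tutorial's code, and unlike a local style it changes the global defaults for every figure created afterwards.

```python
import matplotlib

matplotlib.use("Agg")  # only so the sketch runs without a display
import matplotlib.pyplot as plt


def set_style_mpl(font_family, background_color, grid_color, text_color):
    # Hypothetical Matplotlib-only counterpart to set_style()
    plt.rcParams.update({
        "axes.facecolor": background_color,
        "figure.facecolor": background_color,
        "axes.grid": True,
        "axes.axisbelow": True,
        "grid.color": grid_color,
        "text.color": text_color,
        "font.family": font_family,
        "xtick.bottom": False,
        "xtick.top": False,
        "ytick.left": False,
        "ytick.right": False,
        "axes.spines.left": False,
        "axes.spines.bottom": False,
        "axes.spines.right": False,
        "axes.spines.top": False,
    })


set_style_mpl("PT Mono", "#FAF0F1", "#E4C9C9", "#080520")
print(plt.rcParams["axes.grid"], plt.rcParams["axes.spines.top"])
```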

To create the actual grid, I have a function that formats the x- and y-axes. It takes a few parameters that allow me to try different setups, such as the size of the labels.

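def format_ticks(ax, years, padx=0.25, pady=0.5, y_label_size=20, x_label_size=24):
    # Note: reads the global `df` to get the number of countries.
    # Ranks are plotted as negative y-values so that rank 1 ends up at the top.
    ax.set(xlim=(-padx, len(years) - 1 + padx), ylim=(-len(df) - pady, -pady))

    # One x tick per year (passing labels to set_xticks needs Matplotlib 3.5+)
    xticks = [i for i in range(len(years))]
    ax.set_xticks(ticks=xticks, labels=years)

    yticks = [-i for i in range(1, len(df) + 1)]
    ylabels = ["{}".format(i) for i in range(1, len(df) + 1)]
    ax.set_yticks(ticks=yticks, labels=ylabels)

    ax.tick_params("y", labelsize=y_label_size, pad=16)
    # labeltop=True repeats the year labels above the chart
    ax.tick_params("x", labeltop=True, labelsize=x_label_size, pad=8)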

Here’s what it looks like when I run everything we have so far.

import matplotlib.pyplot as plt
import pandas as pd

# Load data
years = ["2000", "2005", "2010", "2015", "2020", "2022"]
df = pd.read_csv("rankings.csv", index_col=None)
df, rank_columns = create_rankings(df, years)

# Create chart
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(15, 1.6*len(df)))
format_ticks(ax, years)

And here’s the resulting grid.

Matplotlib grid
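
The grid uses negative y-values so that rank 1 sits at the top. As a sanity check, here is a standalone sketch of the same arithmetic that format_ticks() does. grid_layout is a hypothetical helper, not part of the tutorial's code, and it takes the number of countries as an argument instead of reading the global df.

```python
def grid_layout(n_countries, n_years, padx=0.25, pady=0.5):
    # Years sit at x = 0, 1, ..., n_years - 1 with a little horizontal padding
    xlim = (-padx, n_years - 1 + padx)
    # Rank r is drawn at y = -r, so rank 1 is the highest row
    ylim = (-n_countries - pady, -pady)
    yticks = [-rank for rank in range(1, n_countries + 1)]
    ylabels = [str(rank) for rank in range(1, n_countries + 1)]
    return xlim, ylim, yticks, ylabels


xlim, ylim, yticks, ylabels = grid_layout(n_countries=10, n_years=6)
print(xlim)         # (-0.25, 5.25)
print(ylim)         # (-10.5, -0.5)
print(yticks[:3])   # [-1, -2, -3]
print(ylabels[:3])  # ['1', '2', '3']
```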

Now we can start to add some data.


Step 3: Adding lines

I want a line showing each country's rank for each year in the dataset—an easy task in Matplotlib.

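from matplotlib.lines import Line2D


def add_line(ax, row, columns, linewidth=3):
    # x is the year index; y is the negated rank so that rank 1 is at the top
    x = [i for i in range(len(columns))]
    y = [-row[rc] for rc in columns]

    # Uses the global text_color defined earlier
    ax.add_artist(
        Line2D(x, y, linewidth=linewidth, color=text_color)
    )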

Then I run the function for each row in the dataset like this.

# Load data
years = ["2000", "2005", "2010", "2015", "2020", "2022"]
df = pd.read_csv("rankings.csv", index_col=None)
df, rank_columns = create_rankings(df, years)

# Create chart
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(15, 1.6*len(df)))
format_ticks(ax, years)

# Draw lines
for i, row in df.iterrows():
    add_line(ax, row, rank_columns)

Grid with lines
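
If you prefer not to build the Line2D yourself, ax.plot() creates one and adds it to the axes in a single call. Here is a minimal sketch with made-up ranks; rank_line_coords is a hypothetical helper, and the Agg backend is only there so the sketch runs without a display.

```python
import matplotlib

matplotlib.use("Agg")
import matplotlib.pyplot as plt


def rank_line_coords(ranks):
    # Same coordinates as add_line(): year index on x, negated rank on y
    x = list(range(len(ranks)))
    y = [-rank for rank in ranks]
    return x, y


fig, ax = plt.subplots()

# Made-up ranks for one country across four years
x, y = rank_line_coords([2, 2, 1, 1])

# ax.plot returns a list of Line2D objects, one per plotted line
(line,) = ax.plot(x, y, linewidth=3, color="#080520")
print(list(line.get_xdata()), list(line.get_ydata()))
```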

I’m using the same color for each line because I want to use country flags to guide the eye. Using a unique color for each line makes sense, but it looks messy.


Step 4: Drawing pie charts

I want to indicate how a country’s economy grows over time without adding text. Instead, I aim to inform in a visual format.

My idea is to draw a pie chart on each point showing the size of a country’s economy compared to its best year.

I’m using PIL to create a pie chart image, but you can use Matplotlib directly. I don’t because I had some issues with aspect ratios.

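from PIL import Image, ImageDraw
from matplotlib.offsetbox import AnnotationBbox, OffsetImage


def add_pie(ax, x, y, ratio, size=572, zoom=0.1):
    # Transparent square canvas for the pie
    image = Image.new('RGBA', (size, size))
    draw = ImageDraw.Draw(image)

    # PIL angles start at 3 o'clock and increase clockwise, so -90 is 12 o'clock
    draw.pieslice((0, 0, size, size), start=-90, end=360*ratio-90, fill=text_color, outline=text_color)
    im = OffsetImage(image, zoom=zoom, interpolation="lanczos", resample=True, visible=True)

    # Center the image on the data point (x, y)
    ax.add_artist(AnnotationBbox(
        im, (x, y), frameon=False,
        xycoords="data",
    ))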

The value for the size parameter is slightly larger than my flag images, which are 512x512. That leaves a 30-pixel ring of the pie visible around each flag when I paste the flags on the pie charts later.

Here’s the updated code.

# Load data
years = ["2000", "2005", "2010", "2015", "2020", "2022"]
df = pd.read_csv("rankings.csv", index_col=None)
df, rank_columns = create_rankings(df, years)

# Create chart
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(15, 1.6*len(df)))
format_ticks(ax, years)

# Draw lines
for i, row in df.iterrows():
    add_line(ax, row, rank_columns)

    for j, rc in enumerate(rank_columns):
        add_pie(ax, j, -row[rc], ratio=row[years[j]] / row[years].max())

And here’s the result.

Grid with pie charts
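
To see how the start and end angles work, here is a small sketch with a hypothetical pie_image helper and made-up GDP values. PIL measures angles clockwise from 3 o'clock, so start=-90 begins at 12 o'clock, and a ratio of 0.25 fills only the top-right quarter.

```python
from PIL import Image, ImageDraw


def pie_image(ratio, size=100, color="#080520"):
    # Transparent canvas with a slice that starts at 12 o'clock and runs clockwise
    image = Image.new("RGBA", (size, size))
    draw = ImageDraw.Draw(image)
    draw.pieslice((0, 0, size, size), start=-90, end=360 * ratio - 90, fill=color)
    return image


# Made-up values: the ratio compares one year against the country's best year
values = [1.0e12, 2.0e12, 4.0e12]
ratio = values[0] / max(values)  # 0.25

quarter = pie_image(ratio)
top_right = quarter.getpixel((75, 25))    # inside the filled quarter
bottom_left = quarter.getpixel((25, 75))  # outside the slice, still transparent
print(top_right, bottom_left)
```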

It’s starting to look informative, so it’s time to make it beautiful.


Step 5: Adding flags

I love using flags in my charts because they are simply beautiful.

Here, the purpose of the flags is to make the chart visually appealing, explain which countries we’re looking at, and guide the eye along the lines.

I’m using these rounded flags. They require a license, so, unfortunately, I can’t share them, but you can find similar flags in other places.

I’ve had some issues getting the pie and flag to align perfectly, so instead of creating a separate function to add a flag, I’m rewriting the add_pie() function.

def add_pie_and_flag(ax, x, y, name, ratio, size=572, zoom=0.1):
    # Flag files are named after the lowercased country name
    flag = Image.open("<location>/{}.png".format(name.lower()))
    image = Image.new('RGBA', (size, size))
    draw = ImageDraw.Draw(image)
    # Offset that centers a 512x512 flag on the larger pie canvas
    pad = int((size - 512) / 2)

    draw.pieslice((0, 0, size, size), start=-90, end=360*ratio-90, fill=text_color, outline=text_color)
    # The flag's last band (alpha for RGBA images) is the paste mask
    image.paste(flag, (pad, pad), flag.split()[-1])

    im = OffsetImage(image, zoom=zoom, interpolation="lanczos", resample=True, visible=True)

    ax.add_artist(AnnotationBbox(
        im, (x, y), frameon=False,
        xycoords="data",
    ))

In the main code, I call it in place of add_pie().

# Load data
years = ["2000", "2005", "2010", "2015", "2020", "2022"]
df = pd.read_csv("rankings.csv", index_col=None)
df, rank_columns = create_rankings(df, years)

# Create chart
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(15, 1.6*len(df)))
format_ticks(ax, years)

# Draw lines
for i, row in df.iterrows():
    add_line(ax, row, rank_columns)

    for j, rc in enumerate(rank_columns):
        add_pie_and_flag(
            ax, j, -row[rc], 
            name=row.country_name,
            ratio=row[years[j]] / row[years].max()
        )

And now you can behold the visual magic of using flags. It’s a huge difference compared to the previous output.

Grid with flags

We suddenly have something that looks nice and is easy to understand. The last thing to do is to add some helpful information.
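
Since the flag images can't be shared, here is a sketch of the pasting step with a stand-in: a plain red disc on a transparent background instead of a real flag. The disc and its color are invented for illustration; the sizes and the mask trick follow add_pie_and_flag().

```python
from PIL import Image, ImageDraw

FLAG_SIZE = 512
size = 572
pad = (size - FLAG_SIZE) // 2  # 30 pixels of pie stay visible around the flag

# Stand-in for a rounded flag: an opaque red disc on a transparent background
flag = Image.new("RGBA", (FLAG_SIZE, FLAG_SIZE), (0, 0, 0, 0))
ImageDraw.Draw(flag).ellipse(
    (0, 0, FLAG_SIZE - 1, FLAG_SIZE - 1), fill=(200, 30, 30, 255)
)

# Half-filled pie, which is what ratio=0.5 would produce
image = Image.new("RGBA", (size, size))
ImageDraw.Draw(image).pieslice(
    (0, 0, size, size), start=-90, end=90, fill=(8, 5, 32, 255)
)

# The alpha channel is the mask, so the transparent corners of the flag
# leave the pie underneath untouched
image.paste(flag, (pad, pad), flag.split()[-1])

center = image.getpixel((size // 2, size // 2))      # covered by the flag
right_ring = image.getpixel((size - 15, size // 2))  # pie ring, filled half
left_ring = image.getpixel((15, size // 2))          # pie ring, empty half
print(center, right_ring, left_ring)
```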


Step 6: Adding additional information

Since not everyone knows all the flags by heart, I want to add the country’s name to the right.

I also want to show the size of the economy and how each country compares to the highest-ranked one.

Here’s my code for doing that.

def add_text(ax, value, max_value, y):
    trillions = round(value / 1e12, 1)
    ratio_to_max = round(100 * value / max_value, 1)

    text = "{}\n${:,}T ({}%)".format(
        row.country_name, 
        trillions,
        ratio_to_max
    )

    ax.annotate(
        text, (1.03, y), 
        fontsize=20,
        linespacing=1.7,
        va="center",
        xycoords=("axes fraction", "data")
    )
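
Note that add_text() reads the country name from the row variable of the surrounding loop rather than from its arguments. As a sketch of the formatting on its own, here is a hypothetical format_label helper that takes the name as a parameter, run on made-up values.

```python
def format_label(name, value, max_value):
    # GDP in trillions, one decimal
    trillions = round(value / 1e12, 1)
    # Share of the largest economy, in percent
    ratio_to_max = round(100 * value / max_value, 1)
    return "{}\n${:,}T ({}%)".format(name, trillions, ratio_to_max)


# Made-up values, not World Bank figures
label = format_label("Country B", value=5.0e12, max_value=20.0e12)
print(label)
# Country B
# $5.0T (25.0%)
```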

As before, I add the function to the main code block. Note that I’m also adding a title.

years = ["2000", "2005", "2010", "2015", "2020", "2022"]
df = pd.read_csv("rankings.csv", index_col=None)
df, rank_columns = create_rankings(df, years)

fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(15, 1.6*len(df)))
format_ticks(ax, years)

for i, row in df.iterrows():
    add_line(ax, row, rank_columns)

    for j, rc in enumerate(rank_columns):
        add_pie_and_flag(
            ax, j, -row[rc], 
            name=row.country_name,
            ratio=row[years[j]] / row[years].max()
        )

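    # Label each line at its final-year rank, compared against the largest final-year value
    add_text(ax, value=row[years[-1]], max_value=df[years[-1]].max(), y=-row[rank_columns[-1]])

# Set the title once, after the loop
plt.title("Comparing Today's Largest Economies\nGDP (constant 2015 US$)", linespacing=1.8, fontsize=32, x=0.58, y=1.12)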

Voila.

Country rankings chart

That’s it; we’re done.


Conclusion

Today, you’ve learned an alternative way to visualize country rankings.

I like this type of data visualization because it’s easy on the eye and conveys a ton of information with very little text.

If you enjoyed it as much as I did, make sure to subscribe to my channel for more of the same! :)

Thank you for reading.

Top comments (3)

Bala Madhusoodhanan

Love the tutorial @oscarleo

Oscar Leo

That's great! :D

Lukas Krimphove

This looks awesome! Great tutorial!