Max Humber

How to make an animated gif fit for /r/dataisbeautiful

A good visualization should capture the interest of the audience and make an impression. Few things capture interest more than bright colors and movement.

In this post I'm going to show you exactly how to make an animated gif, so that you can go farm some internet points on /r/dataisbeautiful, maybe~

Here's what we're going to make:

businesses.gif

Step 0 - Data

Before you make a graph you've gotta get your hands on some data. I grabbed some business data from StatsCanada available here. The data isn't in the best shape, so here's a pinch of pandas to make it suck less:

import pandas as pd

df = pd.read_csv("3310027001-noSymbol.csv", skiprows=7).iloc[1:5]
df = df.rename(columns={"Business dynamics measure": "status"})
df['status'] = df['status'].apply(lambda x: x[:-13])
df = pd.melt(df, id_vars="status", var_name="date", value_name="count")
df["date"] = pd.to_datetime(df["date"], format="%B %Y")
df['count'] = df['count'].apply(lambda x: int(x.replace(",", "")))

print(df.head())
#        status       date   count
# 0      Active 2015-01-01  775497
# 1     Opening 2015-01-01   40213
# 2  Continuing 2015-01-01  731116
# 3     Closing 2015-01-01   30979
# 4      Active 2015-02-01  778554
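
If the melt step feels like magic, here's a minimal sketch of the same wide-to-long reshape on a tiny made-up table. The column layout is an assumption about what the cleaned export looks like, and the numbers are only illustrative:

```python
import pandas as pd

# Toy wide table: one row per status, one column per month (illustrative values)
wide = pd.DataFrame({
    "status": ["Active", "Opening"],
    "January 2015": ["775,497", "40,213"],
    "February 2015": ["778,554", "41,000"],
})

# One row per (status, month) pair instead of one column per month
tidy = pd.melt(wide, id_vars="status", var_name="date", value_name="count")

# Parse "January 2015" into a timestamp and "775,497" into an integer
tidy["date"] = pd.to_datetime(tidy["date"], format="%B %Y")
tidy["count"] = tidy["count"].str.replace(",", "").astype(int)

print(tidy)
```

Two statuses times two months gives four rows, which is the shape matplotlib wants for plotting a line per status.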

Step 1 - Graph

If you want to make a gif, you first have to make a single frame. Which, coincidentally, is just a graph:

from matplotlib import pyplot as plt

da = df[df["status"] == "Active"]
plt.plot(da["date"], da["count"])

Step 1

Step 2 - Size

The graph could be bigger, and the y-axis limits could be adjusted. No problem, that's just two extra lines of code, one to size the figure and one to set the limits:

plt.figure(figsize=(8, 5), dpi=300)
plt.plot(da["date"], da["count"])
plt.ylim([0, da['count'].max() * 1.1])

Step 2

Step 3 - Tick

I like to manually set the ticks on my graphs. You don't have to, but if you want to:

ymax = int(da['count'].max() * 1.1 // 1)

plt.figure(figsize=(8, 5), dpi=300)
plt.plot(da["date"], da["count"])
plt.ylim([0, ymax])
plt.yticks(range(0, ymax, 200_000))
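
One thing worth knowing here: range stops before ymax, so the top tick lands on the last multiple of 200,000 below it. A quick sketch of the arithmetic, assuming a hypothetical peak of 800,000 (not the real figure):

```python
peak = 800_000  # hypothetical maximum count, for illustration only

# 10% of headroom above the peak, floored to a whole number
ymax = int(peak * 1.1 // 1)

# range() excludes its stop value, so ymax itself never gets a tick
ticks = list(range(0, ymax, 200_000))

print(ymax, ticks)
```

With these numbers you get five ticks, which matters later when each tick needs a matching label.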

Step 3

Step 4 - Label

If someone saw our graph right now, without any context, they'd have no idea what's going on. Let's fix that by adding some labels:

plt.figure(figsize=(8, 5), dpi=300)
plt.plot(da["date"], da["count"])
plt.ylim([0, ymax])
plt.yticks(range(0, ymax, 200_000))
plt.title("Active Businesses in Canada (Seasonally Adjusted)")
plt.xlabel("Year")
plt.ylabel("Count")

Step 4

Step 4.5 - Detour

Our graph is exclusively about "Active" businesses in Canada. Here's what the "Opening" and "Closing" numbers look like:

dc = df[df["status"] == "Closing"]
do = df[df["status"] == "Opening"]

plt.plot(dc["date"], dc["count"], color='red', label="closing")
plt.plot(do["date"], do["count"], color='green', label="opening")
plt.legend()

Step 4.5

Step 5 - Combine

The "Opening and Closing" graph adds some interesting colour to the "Active" data. Let's combine both with some fancy-pants matplotlib:

rows = 7
figure = plt.figure(figsize=(8, 4), constrained_layout=False, dpi=300)
grid = plt.GridSpec(
    nrows=rows,
    ncols=1,
    wspace=0,
    hspace=0.5,
    figure=figure
)

main = plt.subplot(grid[:5, 0])
sub = plt.subplot(grid[5:, 0])

main.plot(da["date"], da["count"])
sub.plot(do["date"], do["count"])
sub.plot(dc["date"], dc["count"])
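
The slices are what split the space: grid[:5, 0] hands the top five of the seven rows to main, and grid[5:, 0] leaves the bottom two for sub. Here's a small sketch that checks the resulting heights. It assumes the non-interactive Agg backend so it runs without a display, and uses fig.add_subplot in place of plt.subplot:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no window needed
from matplotlib import pyplot as plt

fig = plt.figure(figsize=(8, 4))
grid = plt.GridSpec(nrows=7, ncols=1, hspace=0.5, figure=fig)

main = fig.add_subplot(grid[:5, 0])  # rows 0-4
sub = fig.add_subplot(grid[5:, 0])   # rows 5-6

# Heights are fractions of the figure height
main_height = main.get_position().height
sub_height = sub.get_position().height
print(round(main_height / sub_height, 2))

plt.close(fig)
```

Changing the 5 in both slices is the easiest way to shift space between the two panels.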

Step 5

Step 6 - Colour

I'm not keen on the colours or spacing of what we have right now. To fix that, along with some axis adjustments, here's what you'll need:

figure = plt.figure(figsize=(8, 4), constrained_layout=False, dpi=300)
grid = plt.GridSpec(
    nrows=rows,
    ncols=1,
    wspace=0,
    hspace=0.75,
    figure=figure
)

main = plt.subplot(grid[:5, 0])
sub = plt.subplot(grid[5:, 0])

main.plot(da["date"], da["count"], color="purple")
sub.plot(do["date"], do["count"], color="blue")
sub.plot(dc["date"], dc["count"], color="red")

main.set_xticks([])
main.set_ylim([0, ymax])
main.set_yticks(range(0, ymax, 200_000))
main.set_yticklabels([0, "200K", "400K", "600K", "800K\nbusinesses"])

sub.set_ylim([0, 110_000])
sub.set_yticks([0, 100_000])
sub.set_yticklabels([0, "100K"])
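
Typing tick labels by hand works, but the list has to match the number of ticks exactly. If you'd rather generate them, here's one possible helper. The name thousands_label is my own invention (not a matplotlib function), and the ymax of 880,000 is hypothetical:

```python
def thousands_label(value):
    """Format 200000 as '200K' and leave zero as a bare '0'."""
    return "0" if value == 0 else f"{value // 1000}K"


ticks = range(0, 880_000, 200_000)  # hypothetical ymax of 880,000
labels = [thousands_label(tick) for tick in ticks]

print(labels)
```

You could then pass labels straight to set_yticklabels, and the counts will always line up.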

Step 6

Step 7 - Refactor

Our graph code is nearly ready to go. We just need to refactor it so that we can take an individual date and build an individual frame for that date. I've also added some vlines and fixed the xlims to improve legibility and ensure that the plotting space is consistent across plots:

date = pd.Timestamp("2019-08-01")

xmin = df['date'].min()
xmax = df['date'].max()

dd = df[df["date"] <= date]
dc = dd[dd["status"] == "Closing"]
do = dd[dd["status"] == "Opening"]
da = dd[dd["status"] == "Active"]

figure = plt.figure(figsize=(8, 4), constrained_layout=False, dpi=300)
grid = plt.GridSpec(
    nrows=rows,
    ncols=1,
    wspace=0,
    hspace=1.25,
    figure=figure
)

main = plt.subplot(grid[:5, 0])
sub = plt.subplot(grid[5:, 0])

main.plot(da["date"], da["count"], color="#457b9d")
main.vlines(date, ymin=0, ymax=1e20, color="#000000")
sub.plot(do["date"], do["count"], color="#a8dadc")
sub.plot(dc["date"], dc["count"], color="#e63946")
sub.vlines(date, ymin=0, ymax=1e20, color="#000000")

main.set_xlim([xmin, xmax])
main.set_xticks([])
main.set_ylim([0, ymax])
main.set_yticks(range(0, ymax, 200_000))
main.set_yticklabels([0, "200K", "400K", "600K", "800K"])
main.set_title("Active Businesses in Canada")

sub.set_xlim([xmin, xmax])
sub.set_xticks([date])
sub.set_xticklabels([date.strftime("%B '%y")])
sub.set_ylim([0, 110_000])
sub.set_yticks([0, 100_000])
sub.set_yticklabels([0, "100K"])
sub.set_title("Businesses Opening and Closing")
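
The key move is the `<=` filter on date: each frame only sees data up to its own date, so the line grows as the date advances. A small sketch on made-up numbers:

```python
import pandas as pd

# Three months of made-up counts
toy = pd.DataFrame({
    "date": pd.to_datetime(["2015-01-01", "2015-02-01", "2015-03-01"]),
    "count": [10, 12, 11],
})

# How many rows each frame would get to draw
rows_per_frame = [len(toy[toy["date"] <= date]) for date in toy["date"]]

print(rows_per_frame)
```

Because the visible data changes from frame to frame, the x limits have to be pinned to the full date range. Otherwise matplotlib would rescale the axis on every frame.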

Step 7

Step 7.5 - Functionize

In order to build a bunch of frames on a bunch of dates, we should wrap our code in a function:

def plot(date):
    dd = df[df["date"] <= date]
    dc = dd[dd["status"] == "Closing"]
    do = dd[dd["status"] == "Opening"]
    da = dd[dd["status"] == "Active"]

    figure = plt.figure(figsize=(8, 4), constrained_layout=False, dpi=300)
    grid = plt.GridSpec(
        nrows=rows,
        ncols=1,
        wspace=0,
        hspace=1.25,
        figure=figure
    )

    main = plt.subplot(grid[:5, 0])
    sub = plt.subplot(grid[5:, 0])

    main.plot(da["date"], da["count"], color="#457b9d")
    main.vlines(date, ymin=0, ymax=1e20, color="#000000")
    sub.plot(do["date"], do["count"], color="#a8dadc")
    sub.plot(dc["date"], dc["count"], color="#e63946")
    sub.vlines(date, ymin=0, ymax=1e20, color="#000000")

    main.set_xlim([xmin, xmax])
    main.set_xticks([])
    main.set_ylim([0, ymax])
    main.set_yticks(range(0, ymax, 200_000))
    main.set_yticklabels([0, "200K", "400K", "600K", "800K"])
    main.set_title("Active Businesses in Canada")

    sub.set_xlim([xmin, xmax])
    sub.set_xticks([date])
    sub.set_xticklabels([date.strftime("%b '%y")])
    sub.set_ylim([0, 110_000])
    sub.set_yticks([0, 100_000])
    sub.set_yticklabels([0, "100K"])
    sub.set_title("Businesses Opening and Closing");

So that we can build a frame with just one call:

plot(pd.Timestamp("2017-06-01"))

Step 8 - import gif

To turn static frames into an animated gif, all we have to do now is to install and import the gif package:

import gif

Decorate the plot function with gif.frame:

@gif.frame
def plot(date):
    dd = df[df["date"] <= date]
    dc = dd[dd["status"] == "Closing"]
    do = dd[dd["status"] == "Opening"]
    da = dd[dd["status"] == "Active"]

    figure = plt.figure(figsize=(8, 4), constrained_layout=False, dpi=300)
    grid = plt.GridSpec(
        nrows=7,
        ncols=1,
        wspace=0,
        hspace=1.25,
        figure=figure
    )

    main = plt.subplot(grid[:5, 0])
    sub = plt.subplot(grid[5:, 0])

    main.plot(da["date"], da["count"], color="#457b9d")
    main.vlines(date, ymin=0, ymax=1e20, color="#000000")
    sub.plot(do["date"], do["count"], color="#a8dadc")
    sub.plot(dc["date"], dc["count"], color="#e63946")
    sub.vlines(date, ymin=0, ymax=1e20, color="#000000")

    main.set_xlim([xmin, xmax])
    main.set_xticks([])
    main.set_ylim([0, ymax])
    main.set_yticks(range(0, ymax, 200_000))
    main.set_yticklabels([0, "200K", "400K", "600K", "800K"])
    main.set_title("Active Businesses in Canada")

    sub.set_xlim([xmin, xmax])
    sub.set_xticks([date])
    sub.set_xticklabels([date.strftime("%b '%y")])
    sub.set_ylim([0, 110_000])
    sub.set_yticks([0, 100_000])
    sub.set_yticklabels([0, "100K"])
    sub.set_title("Businesses Opening and Closing");
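
If you're curious what a decorator like this might be doing, here's a rough sketch of the general idea: run the plotting function, grab the current figure as image bytes, then close it. To be clear, capture_frame is a hypothetical stand-in I wrote for illustration, not the gif package's actual implementation, and it assumes the Agg backend:

```python
import io
from functools import wraps

import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen
from matplotlib import pyplot as plt


def capture_frame(func):
    """Hypothetical decorator: return the drawn figure as PNG bytes."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        func(*args, **kwargs)              # draw onto the current figure
        buffer = io.BytesIO()
        plt.savefig(buffer, format="png")  # render to memory, not disk
        plt.close()                        # free the figure before the next frame
        return buffer.getvalue()
    return wrapper


@capture_frame
def tiny_plot(n):
    plt.figure(figsize=(2, 1))
    plt.plot(range(n))


png_bytes = tiny_plot(5)
print(len(png_bytes) > 0)
```

The point is that the plotting function itself doesn't change. The decorator just changes what you get back when you call it.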

Build all the frames:

dates = pd.date_range(df['date'].min(), df['date'].max(), freq="1MS")

frames = [plot(date) for date in dates]
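
The "MS" in the frequency string stands for month start, so date_range yields the first day of every month between the two endpoints. A quick sketch over a short, hypothetical range:

```python
import pandas as pd

# Four month starts, January through April 2015 (range chosen for illustration)
months = pd.date_range("2015-01-01", "2015-04-01", freq="MS")

print([month.strftime("%Y-%m-%d") for month in months])
```

Each of those timestamps becomes one call to plot, and so one frame in the gif.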

And save the animation to disk:

gif.save(frames, "businesses.gif", duration=5, unit="s", between="startend")
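
Assuming between="startend" means the duration covers the whole animation rather than the gap between frames (which is how I understand the option in the version of gif that accepts unit and between), the per-frame delay is just the total divided by the frame count. A back-of-the-envelope sketch with a hypothetical 60 frames:

```python
n_frames = 60      # hypothetical frame count, yours will depend on the data
total_seconds = 5  # same duration passed to gif.save above

# Delay per frame in milliseconds
per_frame_ms = total_seconds * 1000 / n_frames

print(f"{per_frame_ms:.0f} ms per frame")
```

If the animation feels rushed, bump the duration rather than dropping frames.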

businesses.gif

Now it's your turn to find some interesting data and turn it into a gif.

And if you want to learn more, I'm running a workshop on gifs with ODSC/AI+ on October 22. Hope to see you in class!
