DEV Community

Dexter

Wrangling financial inclusion data with JavaScript, D3 and Markdown

Taken from my tweet: https://twitter.com/dextersjab/status/1546257828058345472

I wrestled with the recent Global Findex 2021 dataset launched by the World Bank. My aim was to update an interactive map I created for the 2017 dataset.

Tools: @observablehq (JS, Markdown, d3).

I also explained every step to myself (links included):

First, I opened the data in Google Sheets.

There are too many indicators to display at once (>7,000).

An example indicator: Financial institution account (% age 15+).

The rows are also grouped into years: 2011, 2014, 2017, 2021.

We'll need to restructure the data.

raw data in Google Sheets

Choosing the target shape for the data is easy because I'm reusing a schema I designed for my original project:

• Country Name
• Country Code
• Indicator Name
• Indicator Code
• 2011 data
• 2014 data
• 2017 data
• 2021 data
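
To make that target shape concrete, here is a minimal sketch that pivots per-year records into one row per country and indicator. The input field names (`countryName`, `year`, `value` and so on), the indicator code and the sample values are placeholders invented for this example, not the real Findex columns or figures.

```javascript
// Hypothetical per-year records; field names and values are made up for illustration.
const yearlyRecords = [
  { countryName: "Exampleland", countryCode: "EXL", indicatorName: "Financial institution account (% age 15+)", indicatorCode: "example.account", year: 2011, value: 0.4 },
  { countryName: "Exampleland", countryCode: "EXL", indicatorName: "Financial institution account (% age 15+)", indicatorCode: "example.account", year: 2017, value: 0.5 },
  { countryName: "Exampleland", countryCode: "EXL", indicatorName: "Financial institution account (% age 15+)", indicatorCode: "example.account", year: 2021, value: 0.6 },
];

const YEARS = [2011, 2014, 2017, 2021];

// Collapse the per-year records into one row per country + indicator.
function toTargetShape(records) {
  const rows = new Map();
  for (const r of records) {
    const key = `${r.countryCode}|${r.indicatorCode}`;
    if (!rows.has(key)) {
      const row = {
        "Country Name": r.countryName,
        "Country Code": r.countryCode,
        "Indicator Name": r.indicatorName,
        "Indicator Code": r.indicatorCode,
      };
      for (const y of YEARS) row[y] = null; // null marks a year with no data
      rows.set(key, row);
    }
    rows.get(key)[r.year] = r.value;
  }
  return [...rows.values()];
}

const targetRows = toTargetShape(yearlyRecords);
```

In this sample there is no 2014 record, so the 2014 column stays `null` rather than being dropped, which keeps every row the same shape.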

First, we import the data and extract indicator texts.

JSON table data

Rather than four rows per country (one per survey year), we make lookups simpler by having each row represent one combination of:

• country
• year
• indicator

The next step is to extract the unique values for each.
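
One way to do this in plain JavaScript is with `Set`, which keeps only distinct values. This is a minimal sketch, assuming each record is an object with `country`, `year` and `indicator` fields; those names and the sample values are chosen for this example only.

```javascript
// Hypothetical records: one per combination of country, year and indicator.
const records = [
  { country: "AAA", year: 2017, indicator: "account", value: 0.5 },
  { country: "AAA", year: 2021, indicator: "account", value: 0.6 },
  { country: "BBB", year: 2021, indicator: "account", value: 0.7 },
  { country: "BBB", year: 2021, indicator: "borrowed", value: 0.2 },
];

// Collect the distinct values of one field, in first-seen order.
const uniqueValues = (rows, field) => [...new Set(rows.map((row) => row[field]))];

const countries = uniqueValues(records, "country");
const years = uniqueValues(records, "year");
const indicators = uniqueValues(records, "indicator");

// A composite key turns "value for this country, year and indicator" into a single Map lookup.
const lookup = new Map(
  records.map((r) => [`${r.country}|${r.year}|${r.indicator}`, r.value])
);
```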

JSON table data

Easy part over, now things get painful 😂

The shape of the data is one thing, but now we need to deal with the keys 🫠

After looking at the dataset, we can see that similar indicators have keys matching certain patterns. Code can handle most of these.
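
A regular expression is one way to capture such patterns. The key format below (a base name with an optional dot-separated group suffix) and the suffix meanings are invented for illustration; the real Findex keys need their own patterns.

```javascript
// Made-up convention: "<base>" optionally followed by ".<group>".
const GROUP_LABELS = new Map([
  ["f", "female"],
  ["m", "male"],
  ["rural", "rural"],
]);

const KEY_PATTERN = /^([a-z0-9_]+)(?:\.([a-z]+))?$/;

// Return { base, group } for keys that fit the pattern, or null otherwise.
function parseKey(key) {
  const match = KEY_PATTERN.exec(key);
  if (!match) return null; // leave unmatched keys for manual handling
  const [, base, group] = match;
  if (group === undefined) return { base, group: "all" };
  if (!GROUP_LABELS.has(group)) return null; // unknown suffix: also handled manually
  return { base, group: GROUP_LABELS.get(group) };
}

const parsed = ["account", "account.f", "account.rural", "Account (odd key)"].map(parseKey);
```

Returning `null` instead of guessing keeps the awkward keys visible so they can be dealt with by hand.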

screenshots from notebook

For everything that the code doesn't neatly handle, we do more manual work.

To minimise the boring stuff, we isolate the data that's hard to work with and afterwards put it back into mappings (JSON objects and d3 data structures).
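
A minimal sketch of that isolate-then-merge idea follows. The `parse` rule, the keys and the manual override table are all placeholders made up for this example.

```javascript
// Placeholder rule: keys shaped like "name.group" parse cleanly; anything else does not.
const parse = (key) => {
  const parts = key.split(".");
  return parts.length === 2 ? { base: parts[0], group: parts[1] } : null;
};

const keys = ["account.all", "account.female", "saved any money", "borrowed.all"];

// 1. Split the keys into those the rule handles and those it cannot.
const handled = new Map();
const awkward = [];
for (const key of keys) {
  const result = parse(key);
  if (result) handled.set(key, result);
  else awkward.push(key);
}

// 2. Fix the awkward keys by hand (hypothetical override table).
const manualOverrides = new Map([
  ["saved any money", { base: "saved", group: "all" }],
]);

// 3. Merge the manual fixes back into the same mapping.
for (const key of awkward) {
  if (!manualOverrides.has(key)) throw new Error(`No manual mapping for "${key}"`);
  handled.set(key, manualOverrides.get(key));
}
```

Throwing on a missing override means a new awkward key fails loudly instead of silently dropping out of the mapping.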

screenshot from notebook

After fixing indicators, I created a tree structure.

I like to use the d3 hierarchy to do this so I can always quickly use d3 tools like the tidy tree visualisation.
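
As a rough sketch of the idea, the snippet below nests a flat list of indicators into a `{ name, children }` object, which is the shape `d3.hierarchy` reads by default. The topic names, paths and codes are hypothetical, and this is not necessarily how the notebook builds its tree.

```javascript
// Hypothetical flat list: each indicator has a path from broad topic to specific measure.
const indicatorList = [
  { path: ["Accounts", "Financial institution account"], code: "example.1" },
  { path: ["Accounts", "Mobile money account"], code: "example.2" },
  { path: ["Saving", "Saved any money"], code: "example.3" },
];

// Walk each path, creating missing nodes along the way.
function buildTree(items, rootName = "Indicators") {
  const root = { name: rootName, children: [] };
  for (const { path, code } of items) {
    let node = root;
    for (const name of path) {
      let child = node.children.find((c) => c.name === name);
      if (!child) {
        child = { name, children: [] };
        node.children.push(child);
      }
      node = child;
    }
    node.code = code; // attach the indicator code to the final node of the path
  }
  return root;
}

const tree = buildTree(indicatorList);
// In a notebook where d3 is available, d3.hierarchy(tree) wraps this object,
// and the result can be passed to layouts such as d3.tree().
```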

screenshots from notebook
hierarchy screenshot from notebook

After some (relatively) simple mapping of population data, all that remained was to rename some variables in the main project to match the newly wrangled data.

Now we can navigate our stats about financial inclusion here:

https://observablehq.com/@dextersjab/global-financial-access-findex

And the ugly behind-the-scenes data wrangling lives here:

https://observablehq.com/@dextersjab/global-findex-data

I'll make some improvements to both notebooks at some point, but sometimes I just like to publish my work as is 😬
