DEV Community

J.S.

Using fetch() and reduce() to grab and format data from an external API - A practical guide

Today we’re going to learn how to get and manipulate data from an external API. We’ll use a practical example from one of my current projects that you will hopefully be able to use as a template when starting something of your own.
For this exercise, we will look at current job posting data for New York City agencies. New York City is great about publishing all sorts of datasets, but I chose this particular one because it doesn’t require dealing with API keys – the endpoint is a publicly accessible URL.

Here’s a quick roadmap of our plan. We’ll get the data from New York City’s servers by using JavaScript’s Fetch API, which is a good way to start working with promises. I’ll go over the very bare basics here, but I recommend Mariko Kosaka’s excellent illustrated blog The Promise of a Burger Party for a more thorough (and delicious) primer.

If you’ve ever used $.getJSON() in jQuery, you’re mostly there conceptually. If not, that’s okay, too. Take a look at the code below:

const cityJobsData = fetch("https://data.cityofnewyork.us/resource/swhp-yxa4.json");

We declare a variable, cityJobsData, and set its value to fetch(the URL that contains the data we want), which returns something called a promise. For now, just think of it as the data we will eventually get back from the URL when the request is complete. We can access and manipulate this data once it loads by subsequently calling then() on cityJobsData. To perform multiple operations, we can chain then()s together, making sure we 1. always pass in our data as an argument to the callback, and 2. return a value.
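As a rough sketch of those two rules, here is a chain built on a promise that resolves immediately with made-up numbers instead of a real request (numbersPromise and chainResult are hypothetical names, used only so the example runs without a network connection):

```javascript
// A stand-in for fetch(): a promise that resolves right away with a small array.
// (Hypothetical data, so the sketch runs without a network connection.)
const numbersPromise = Promise.resolve([1, 2, 3]);

const chainResult = numbersPromise
  // Rule 1: each callback receives whatever the previous step resolved with.
  .then(numbers => numbers.map(n => n * 2))
  // Rule 2: each callback returns a value for the next then() to receive.
  .then(doubled => doubled.length);

chainResult.then(count => console.log(count)); // logs 3
```

If a callback forgets to return, the next then() receives undefined, which is a common source of confusion when chaining.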

const cityJobsData = fetch("https://data.cityofnewyork.us/resource/swhp-yxa4.json");
cityJobsData
  .then(data => data.json())

In the above snippet, we’re telling the computer to execute everything contained inside then() once the response arrives from the URL. This is what we call ‘asynchronous’ code. In this case, .then(data => data.json()) reads the response body and parses it as JSON, passing the resulting JavaScript value to the next then() so we can operate on it.
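One caveat worth knowing: the promise returned by fetch() only rejects on network-level failures, so a response with an HTTP error status such as 404 still reaches the first then(). A minimal sketch of a guard, assuming a hypothetical helper named parseJsonOrThrow that is not part of the original code:

```javascript
// Hypothetical helper: parse the body only if the HTTP status signals success.
function parseJsonOrThrow(response) {
  // response.ok is true for statuses in the 200-299 range.
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  // json() reads the body and returns a promise for the parsed value.
  return response.json();
}

// Intended usage (commented out here because it needs a network connection):
// fetch("https://data.cityofnewyork.us/resource/swhp-yxa4.json")
//   .then(parseJsonOrThrow)
//   .then(postings => console.log(postings.length))
//   .catch(err => console.error(err));
```

Throwing inside a then() callback rejects the chain, so the error ends up in whatever catch() sits at the end.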

Just a quick aside for wrangling huge amounts of JSON: If you go in your web browser to the URL that contains the data we want, you’ll see an enormous, unformatted block of text that is very hard to read. However, you can copy and paste that text into something like jsonviewer, which will give you an organized, hierarchical overview of the contents. Let’s say we want to see how many postings there are for each city agency. How can we do this? Well, if we look at our JSON schema in this viewer, we can see that it’s an array of objects, with each object containing all the data that makes up a single job posting.

[Image: formatted JSON]
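If you would rather stay in the console than paste into a viewer, JSON.stringify() can indent a single posting for you. This is only a sketch with made-up postings: the agency key comes from the dataset, while title is a hypothetical field added for illustration.

```javascript
// Two made-up postings shaped like the dataset's rows (only `agency` is taken
// from the article; `title` is a hypothetical field for illustration).
const samplePostings = [
  { agency: "DEPARTMENT OF FINANCE", title: "Analyst" },
  { agency: "LAW DEPARTMENT", title: "Paralegal" }
];

// The third argument to JSON.stringify sets the indentation width.
const pretty = JSON.stringify(samplePostings[0], null, 2);
console.log(pretty);
```

With the real data, the same call could go inside a then() callback, for example on the first element of the parsed array.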

Note that each object contains a key, agency, whose value is the name of the city agency that has a job available.

Therefore, if we can somehow keep track of how many times each agency is mentioned throughout this array of objects, we’ll be able to know how many jobs are currently available per agency.

How can we do this? One way is to use reduce(). From MDN, “The reduce() method applies a function against an accumulator and each element in the array (from left to right) to reduce it to a single value.” If this sounds like a bunch of nonsense to you, don’t worry! We’ll see soon that it’s not so bad when we have some examples to work through.

Most introductions to reduce() involve simple addition, which is a fine starting point. Let’s walk through this example together:

const arr = [1, 2, 4, 6];
const added = arr.reduce((accumulator, item) => {
 return accumulator + item;
}, 0);

console.log(added); // 13

Here’s how it works: the reduce() function loops through the array, arr, and adds each item to an accumulator, which has an initial value of 0 (we make this value reduce()'s second argument, after the callback function). The accumulator’s current value is returned at the end of every loop, which is how the adding happens. Thus, the final value of added is 13.

If you’re having trouble visualizing this, try adding a console.log() statement before your return that outputs the current values of the accumulator and the item – this way, you’ll be able to see the looping that’s happening behind the scenes. Here’s a set of log statements for the above example:

adding 1 to accumulator: 0
adding 2 to accumulator: 1
adding 4 to accumulator: 3
adding 6 to accumulator: 7
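
The log lines above could come from a version like the following sketch. The logLines array is a hypothetical addition that just keeps a copy of each message so the trace can be inspected afterwards.

```javascript
const arr = [1, 2, 4, 6];
const logLines = []; // hypothetical: keeps a copy of every logged message

const added = arr.reduce((accumulator, item) => {
  // Log before returning, so we see the accumulator as it was on entry.
  const line = `adding ${item} to accumulator: ${accumulator}`;
  logLines.push(line);
  console.log(line);
  return accumulator + item;
}, 0);

console.log(added); // 13
```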

This is all well and good, and it’s fun to do some addition with ~*functional programming~*, but did you know reduce() can do more than simply add things up? And that the accumulator can be something other than a number? It's true!

In our case, we’ll use it to find out how many current job postings there are per New York City agency. This might seem like a big leap from simply adding numbers together, but the core concepts of looping and accumulating are the same.

This time, instead of reducing an array of four numbers, we want to reduce our JSON blob of job posting data. And instead of reducing to a single number, we’re going to reduce to a single object. Yes, an object! Once the function is completed, the accumulator object’s keys will be the names of the city agencies and the keys’ values will be the number of postings they have, like this: {"name of agency": number of job postings}. Here’s the whole program:

const cityJobsData = fetch("https://data.cityofnewyork.us/resource/swhp-yxa4.json");
cityJobsData
  .then(data => data.json())
  .then(data => {
    const agencyFrequency = data.reduce((agencies, value) => {
      agencies[value.agency] = agencies[value.agency] ? agencies[value.agency] + 1 : 1;
      return agencies;
    }, {});
    console.log(agencyFrequency);
  })
  .catch(err => console.log(err));


How does this work, exactly? Let’s break it down. Each time around the loop, we’re looking at a specific value, i.e., one object in data, our aforementioned array of objects. We’re checking to see if a key with the name of the current agency (value.agency) already exists within our accumulator object. If not, we add it to the accumulator object and set its value to 1. If a key with the name of the current agency already exists within the accumulator object, we add 1 to its existing value. We return the accumulator object when we’re done and get this nice set of data:

{ 
  'FIRE DEPARTMENT': 17,
  'DEPT OF ENVIRONMENT PROTECTION': 134,
  'DEPARTMENT OF INVESTIGATION': 22,
  'DEPARTMENT OF SANITATION': 14,
  'DEPT OF HEALTH/MENTAL HYGIENE': 247,
  'OFFICE OF THE COMPTROLLER': 14,
  'ADMIN FOR CHILDREN\'S SVCS': 43,
  'DEPT OF DESIGN & CONSTRUCTION': 48,
  'ADMIN TRIALS AND HEARINGS': 16,
  'DEPT OF PARKS & RECREATION': 34,
  'HUMAN RIGHTS COMMISSION': 4,
  'POLICE DEPARTMENT': 36,
  'DEPT OF INFO TECH & TELECOMM': 50,
  'DISTRICT ATTORNEY KINGS COUNTY': 4,
  'TAXI & LIMOUSINE COMMISSION': 11,
  'HOUSING PRESERVATION & DVLPMNT': 21,
  'DEPARTMENT OF BUSINESS SERV.': 18,
  'HRA/DEPT OF SOCIAL SERVICES': 31,
  'DEPARTMENT OF PROBATION': 3,
  'TAX COMMISSION': 4,
  'NYC EMPLOYEES RETIREMENT SYS': 6,
  'OFFICE OF COLLECTIVE BARGAININ': 2,
  'DEPARTMENT OF BUILDINGS': 9,
  'DEPARTMENT OF FINANCE': 29,
  'LAW DEPARTMENT': 21,
  'DEPARTMENT OF CORRECTION': 12,
  'DEPARTMENT OF TRANSPORTATION': 67,
  'DEPT OF YOUTH & COMM DEV SRVS': 5,
  'FINANCIAL INFO SVCS AGENCY': 7,
  'CULTURAL AFFAIRS': 1,
  'OFFICE OF EMERGENCY MANAGEMENT': 12,
  'DEPARTMENT OF CITY PLANNING': 5,
  'DEPT OF CITYWIDE ADMIN SVCS': 15,
  'DEPT. OF HOMELESS SERVICES': 3,
  'DEPARTMENT FOR THE AGING': 2,
  'CONSUMER AFFAIRS': 7,
  'MAYORS OFFICE OF CONTRACT SVCS': 7,
  'DISTRICT ATTORNEY RICHMOND COU': 3,
  'NYC HOUSING AUTHORITY': 9,
  'CIVILIAN COMPLAINT REVIEW BD': 5,
  'OFF OF PAYROLL ADMINISTRATION': 1,
  'EQUAL EMPLOY PRACTICES COMM': 1 
}
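
To trace the tallying by hand without hitting the network, here is a minimal sketch that runs an equivalent reducer over three made-up postings. The function name countByAgency and the sample data are hypothetical; the `|| 0` form is just a shorter way of writing the ternary used in the program above.

```javascript
// Three made-up postings; only the `agency` key matters for the tally.
const samplePostings = [
  { agency: "LAW DEPARTMENT" },
  { agency: "FIRE DEPARTMENT" },
  { agency: "LAW DEPARTMENT" }
];

// Equivalent to the reducer in the full program, pulled out into a named
// function (countByAgency is a hypothetical name).
function countByAgency(agencies, posting) {
  // Start from 0 when the agency has not been seen yet, then add 1.
  agencies[posting.agency] = (agencies[posting.agency] || 0) + 1;
  return agencies;
}

const tally = samplePostings.reduce(countByAgency, {});
console.log(tally); // { 'LAW DEPARTMENT': 2, 'FIRE DEPARTMENT': 1 }
```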

Et voilà! We now know that if we want to work for the City government, we should check out the Department of Health and Mental Hygiene’s 247 openings!
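
Rather than scanning the output by eye, the busiest agency can also be found in code. A short sketch, assuming a hand-copied slice of the counts from the output (the variable names ranked, topAgency, and topCount are hypothetical):

```javascript
// A hand-copied slice of the counts printed in the output, to keep this short.
const agencyFrequency = {
  "FIRE DEPARTMENT": 17,
  "DEPT OF ENVIRONMENT PROTECTION": 134,
  "DEPT OF HEALTH/MENTAL HYGIENE": 247,
  "DEPARTMENT OF TRANSPORTATION": 67
};

// Object.entries gives [name, count] pairs; sort them by count, highest first.
const ranked = Object.entries(agencyFrequency).sort((a, b) => b[1] - a[1]);

const [topAgency, topCount] = ranked[0];
console.log(`${topAgency}: ${topCount}`); // DEPT OF HEALTH/MENTAL HYGIENE: 247
```

The ranked array is also a convenient shape for charting, since each entry already pairs a label with a value.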

We can do a bunch of useful things with this data – personally, I want to dip my toes into data visualization, so I’ll be using it to make a simple chart. I hope you’ll be able to use this example as a jumping-off point for your own projects.
If you enjoyed this article, please reach out to me on Twitter!

Thanks to Jim O’Brien for editing.

Top comments (8)

Spunkie

I'm sure there are some super awesome things you can do with reduce() but the examples given just seem to provide the same functionality as a foreach loop?

$cityJobsData = json_decode(SOME_JSON, true);
$agencyFrequency = array();

foreach($cityJobsData as $city) {
    $agency = $city['agency'];

    if(!isset($agencyFrequency[$agency])) {
        $agencyFrequency[$agency] = 0;
    }
    $agencyFrequency[$agency]++;
}

Is there a reduce() example you could give that could not be easily accomplished with a foreach or ends up being significantly more readable than the equivalent foreach loop? I'm primarily a php developer that is just getting into doing more serious javascript past basic jquery/dom stuff, so excuse me if I'm missing something.

hulloanson

It handles the object initialization for you, for one. Less typing, less bugs :)

Thor77

Thanks for this great article.
I had to try Python's itertools.groupby because of this comment to filter the results:

import itertools
import operator

import requests  # third-party HTTP library

data = requests.get(...).json()
agency_keyfunc = operator.itemgetter('agency')
# groupby only groups adjacent items, so sort by the same key first
agency_frequency = dict(
    map(
        lambda gt: (gt[0], len(list(gt[1]))),
        itertools.groupby(
            sorted(data, key=agency_keyfunc),
            agency_keyfunc
        )
    )
)

Andy

Think about grouping/partitioning your data and then aggregating the values. This would avoid manipulating the single map object and allow parallelization in other languages :)

ramdajs.com/docs/#groupBy

Manny

Great article

logesh

Awesome.

Ben Halpern

Really great tutorial 👌
