<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Clement Gitonga</title>
    <description>The latest articles on DEV Community by Clement Gitonga (@cgitosh).</description>
    <link>https://dev.to/cgitosh</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F827322%2F6aab5031-73ac-4350-a467-13f3dffd92bc.jpg</url>
      <title>DEV Community: Clement Gitonga</title>
      <link>https://dev.to/cgitosh</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/cgitosh"/>
    <language>en</language>
    <item>
      <title>Extracting data from a website API and Save as CSV</title>
      <dc:creator>Clement Gitonga</dc:creator>
      <pubDate>Wed, 12 Oct 2022 02:32:32 +0000</pubDate>
      <link>https://dev.to/cgitosh/extracting-data-from-a-website-api-and-save-as-csv-o4l</link>
      <guid>https://dev.to/cgitosh/extracting-data-from-a-website-api-and-save-as-csv-o4l</guid>
      <description>&lt;p&gt;&lt;strong&gt;Background&lt;/strong&gt;&lt;br&gt;
I recently wrote a Python script that extracted data from the US Department of Agriculture website, transformed it into the desired shape, and saved it to a CSV file. Since I did some research online while working on the script, I thought it would only be fair to give back to the community by writing down the steps I followed, plus some example code, in the hope that someone out there will benefit.&lt;br&gt;
&lt;strong&gt;Problem statement&lt;/strong&gt;&lt;br&gt;
The FoodData Central database has amino acid levels for many foods. Extract data from the 'SR Legacy' foods category and list each food and its amino acid levels in CSV format, with a row for each food and a field for each of the 18 amino acids&lt;br&gt;
&lt;strong&gt;Solution&lt;/strong&gt;&lt;br&gt;
The FoodData Central API provides access to the FoodData Central (FDC) database. To gain access, you have to register for an API key through this link: &lt;a href="https://fdc.nal.usda.gov/api-key-signup.html"&gt;https://fdc.nal.usda.gov/api-key-signup.html&lt;/a&gt;.&lt;br&gt;
For the purpose of this article, I created the GET query on the API web interface, specified all the parameters, then copied the URL (with its parameters and API key) into my script; normally you would specify the parameters in your code. Below is the full GET query URL, with the API key redacted&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;endpoint = 'https://api.nal.usda.gov/fdc/v1/foods/list?dataType=SR%20Legacy&amp;amp;pageSize=20&amp;amp;pageNumber=3&amp;amp;sortBy=fdcId&amp;amp;sortOrder=asc&amp;amp;api_key=XXXXXXXXX'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
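&lt;p&gt;As mentioned, you would normally specify the parameters in your code instead of hard-coding the full URL. Below is a minimal sketch (not the original script) of how that could look with the 'requests' library; 'DEMO_KEY' is a placeholder, not a real key.&lt;br&gt;&lt;/p&gt;

```python
import requests

# Sketch of building the same GET query in code rather than hard-coding
# the URL. "DEMO_KEY" is a placeholder, not a real API key.
BASE_URL = "https://api.nal.usda.gov/fdc/v1/foods/list"
params = {
    "dataType": "SR Legacy",
    "pageSize": 20,
    "pageNumber": 3,
    "sortBy": "fdcId",
    "sortOrder": "asc",
    "api_key": "DEMO_KEY",
}

# Prepare the request without sending it, just to inspect the encoded URL;
# requests.get(BASE_URL, params=params) would actually send it.
prepared = requests.Request("GET", BASE_URL, params=params).prepare()
print(prepared.url)
```

&lt;p&gt;Letting 'requests' encode the parameters also takes care of special characters, such as the space in 'SR Legacy'.&lt;/p&gt;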



&lt;p&gt;The next step is to query the API endpoint using the Python 'requests' HTTP library&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;response = requests.get(endpoint).json()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
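&lt;p&gt;One caveat: calling .json() on a failed response produces a confusing parse error. An optional safeguard, not part of the original script, is to check the HTTP status first with raise_for_status():&lt;br&gt;&lt;/p&gt;

```python
import requests

def fetch_json(endpoint):
    # Optional safeguard (not in the original script): fail with a clear
    # HTTPError on 4xx/5xx responses, e.g. a bad or missing API key,
    # instead of a confusing JSON decode error later on.
    response = requests.get(endpoint, timeout=30)
    response.raise_for_status()
    return response.json()
```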



&lt;p&gt;The response variable above now holds the data from the FoodData Central database, so it's time to shape it into the desired format. For each food item, I want the name of the food (Description) and the quantity of each amino acid that the item contains, so my CSV file will have a total of 19 fields. The next step is to specify the list of keys, which will serve both as dictionary keys and as the CSV header&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# keys global variable will be used to generate keys for the response_dict dictionary and for fieldnames for the csv file
keys = ['Description','Alanine', 'Arginine', 'Aspartic','Cysteine','Glutamic','Glycine','Histidine','Isoleucine','Leucine','Lysine','Methionine','Phenylalanine','Proline','Serine', 'Threonine','Tryptophan','Tyrosine','Valine']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next block of code will loop through the response data, match each nutrient in the response to the respective dictionary key, then append the populated dictionary to a food_list variable.&lt;/p&gt;

&lt;p&gt;Below is the complete function&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def get_data():

    # The API endpoint to query data from preformated with the desired parameters
    endpoint = 'https://api.nal.usda.gov/fdc/v1/foods/list?dataType=SR%20Legacy&amp;amp;pageSize=20&amp;amp;pageNumber=3&amp;amp;sortBy=fdcId&amp;amp;sortOrder=asc&amp;amp;api_key=3VzJQvEyn9UBBwFgK3pdli3x82rCnpy9sVJFNt3Z'

    # The python HTTP requests library queries the endpoint and returns the data in JSON format
    response = requests.get(endpoint).json()

    food_list = []
    for data in response:

        # empty dictionary initialized with keys from the global 'keys' list
        response_dict = {key: 0 for key in keys}

        response_dict['Description'] = data['description']
        for nutrient in data['foodNutrients']:
            if nutrient['name'] in keys:
                response_dict[nutrient["name"]] = nutrient['amount']                              
        food_list.append(response_dict)
    return food_list

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, let's write our food_list data to a CSV file&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;filename = 'FoodDatacsv.csv'
with open(filename, 'w', newline='') as myfile:
    writer = csv.DictWriter(myfile, fieldnames=keys)
    writer.writeheader()
    writer.writerows(get_data())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
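&lt;p&gt;As a quick sanity check (optional, not part of the original script), you can read the file back with csv.DictReader and confirm the header matches the keys list. The sample row below is hypothetical, standing in for the output of get_data():&lt;br&gt;&lt;/p&gt;

```python
import csv

# Hypothetical sample standing in for get_data() output, so this
# snippet runs on its own; real rows would come from the API.
keys = ['Description', 'Alanine', 'Arginine']
rows = [{'Description': 'Butter, salted', 'Alanine': 0.029, 'Arginine': 0.03}]

filename = 'FoodDatacsv.csv'
with open(filename, 'w', newline='') as myfile:
    writer = csv.DictWriter(myfile, fieldnames=keys)
    writer.writeheader()
    writer.writerows(rows)

# Read the file back and confirm the header matches the keys list
with open(filename, newline='') as myfile:
    reader = csv.DictReader(myfile)
    header = reader.fieldnames
    data = list(reader)
```

&lt;p&gt;Note that DictReader returns every value as a string, so a numeric amount like 0.029 comes back as '0.029'.&lt;/p&gt;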



&lt;p&gt;Below is a sample of the extracted data.&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--t3d5mBf_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/c8q9668cbip0z6s4qb5p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--t3d5mBf_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/c8q9668cbip0z6s4qb5p.png" alt="Image description" width="880" height="469"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And that is it for today. &lt;/p&gt;

&lt;p&gt;P.S. &lt;br&gt;
If you would like some data extraction done, click &lt;a href="https://www.upwork.com/services/product/development-it-data-extracted-from-the-web-and-save-in-csv-file-in-your-desired-format-1580095848756686848?ref=project_share"&gt;Here&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>datascience</category>
      <category>tutorial</category>
      <category>analytics</category>
    </item>
  </channel>
</rss>
