<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Shreyas Soni</title>
    <description>The latest articles on DEV Community by Shreyas Soni (@sonishreyas).</description>
    <link>https://dev.to/sonishreyas</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F720589%2F59796a46-e01c-4435-94a4-862040c89f1e.jpeg</url>
      <title>DEV Community: Shreyas Soni</title>
      <link>https://dev.to/sonishreyas</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sonishreyas"/>
    <language>en</language>
    <item>
      <title>Explore Geopolitical data from GDELT</title>
      <dc:creator>Shreyas Soni</dc:creator>
      <pubDate>Wed, 06 Oct 2021 18:15:11 +0000</pubDate>
      <link>https://dev.to/sonishreyas/explore-geopolitical-data-from-gdelt-2hnm</link>
      <guid>https://dev.to/sonishreyas/explore-geopolitical-data-from-gdelt-2hnm</guid>
      <description>&lt;p&gt;In this post, we explore geopolitical data from GDELT and see how it can be used for analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is GDELT?
&lt;/h2&gt;

&lt;p&gt;The GDELT Project, created by Kalev H. Leetaru, monitors the world's news from every country in over 100 languages and identifies the people, locations, organizations, themes, sources, emotions, counts, quotes, images, and events driving our global society.&lt;/p&gt;

&lt;p&gt;Here we focus on the GDELT Event Database and how its data can be used for analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Event Database
&lt;/h2&gt;

&lt;p&gt;The GDELT Event Database catalogs events into 20 main categories and more than 300 subcategories. Each category is identified by a CAMEO code. We will look at the 20 main (root) CAMEO codes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Make Public Statement&lt;/li&gt;
&lt;li&gt;Appeal&lt;/li&gt;
&lt;li&gt;Express intent to cooperate&lt;/li&gt;
&lt;li&gt;Consult&lt;/li&gt;
&lt;li&gt;Engage in diplomatic cooperation&lt;/li&gt;
&lt;li&gt;Engage in material cooperation&lt;/li&gt;
&lt;li&gt;Provide aid&lt;/li&gt;
&lt;li&gt;Yield&lt;/li&gt;
&lt;li&gt;Investigate&lt;/li&gt;
&lt;li&gt;Demand&lt;/li&gt;
&lt;li&gt;Disapprove&lt;/li&gt;
&lt;li&gt;Reject&lt;/li&gt;
&lt;li&gt;Threaten&lt;/li&gt;
&lt;li&gt;Protest&lt;/li&gt;
&lt;li&gt;Exhibit military posture&lt;/li&gt;
&lt;li&gt;Reduce relations&lt;/li&gt;
&lt;li&gt;Coerce&lt;/li&gt;
&lt;li&gt;Assault&lt;/li&gt;
&lt;li&gt;Fight&lt;/li&gt;
&lt;li&gt;Use unconventional mass violence&lt;/li&gt;
&lt;/ul&gt;
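
&lt;p&gt;For reference, the 20 categories above can be kept in a small lookup table. This is a minimal sketch that assumes the list is in CAMEO root-code order, so the zero-padded codes "01" to "20" map to the names in sequence. The names &lt;code&gt;CAMEO_ROOT_NAMES&lt;/code&gt; and &lt;code&gt;CAMEO_ROOT_CODES&lt;/code&gt; are illustrative and not part of any library.&lt;/p&gt;

```python
# Names of the 20 main event categories, in the order listed above.
# Assumption: this order matches CAMEO root codes "01" through "20".
CAMEO_ROOT_NAMES = [
    "Make Public Statement",
    "Appeal",
    "Express intent to cooperate",
    "Consult",
    "Engage in diplomatic cooperation",
    "Engage in material cooperation",
    "Provide aid",
    "Yield",
    "Investigate",
    "Demand",
    "Disapprove",
    "Reject",
    "Threaten",
    "Protest",
    "Exhibit military posture",
    "Reduce relations",
    "Coerce",
    "Assault",
    "Fight",
    "Use unconventional mass violence",
]

# Map zero-padded root codes ("01", "02", ...) to category names.
CAMEO_ROOT_CODES = {
    f"{i:02d}": name for i, name in enumerate(CAMEO_ROOT_NAMES, start=1)
}

print(CAMEO_ROOT_CODES["14"])  # Protest
```

&lt;p&gt;If the root code column is parsed as integers when the data is loaded, the keys would need to be integers as well, or the column converted to zero-padded strings first.&lt;/p&gt;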

&lt;p&gt;Let's see how we can get the data for these events for all countries.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to get the data?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;BigQuery: you can query exactly the data you need. Here is an example query.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;select SQLDATE,EventRootCode,Actor1CountryCode,NumMentions from gdeltv2.events;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
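
&lt;p&gt;The example query reads the whole table. One way to narrow it is to filter on &lt;code&gt;SQLDATE&lt;/code&gt;, which holds the date as an integer in YYYYMMDD form. The sketch below only builds the query text for a date range; the function name &lt;code&gt;build_events_query&lt;/code&gt; and its parameters are hypothetical, and running the query against BigQuery is left out.&lt;/p&gt;

```python
from datetime import date


def build_events_query(start, end, table="gdeltv2.events"):
    """Return SQL selecting the four columns for an inclusive date range.

    SQLDATE is an integer in YYYYMMDD form, so both bounds are
    formatted the same way before being placed in the query.
    """
    first = start.strftime("%Y%m%d")
    last = end.strftime("%Y%m%d")
    return (
        "SELECT SQLDATE, EventRootCode, Actor1CountryCode, NumMentions "
        f"FROM {table} "
        f"WHERE SQLDATE BETWEEN {first} AND {last}"
    )


# Hypothetical usage: the first week of January 2020.
query = build_events_query(date(2020, 1, 1), date(2020, 1, 7))
print(query)
```

&lt;p&gt;The table name defaults to the one used in the example query; pass a different value if your project refers to the dataset by another name.&lt;/p&gt;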



&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Using the gdelt Python package&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Installation: &lt;code&gt;pip install gdelt&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Create a gdelt object for version 2 of the database.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gd2 = gdelt.gdelt(version=2)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Use the gd2 object to search for the data of a given date, with table set to 'events'.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;results = gd2.Search(['2020-01-01'],table='events',coverage=True)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Processing the data into a time series for all countries
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Load the data into the notebook. If you saved it as a CSV file (gdelt.csv here), read it with pandas.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df = pd.read_csv("gdelt.csv");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;The output of the gdelt search contains every column of the Events database. Keep only the columns needed: SQLDATE, EventRootCode, Actor1CountryCode, and NumMentions.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;results = results[['SQLDATE','EventRootCode','NumMentions','Actor1CountryCode']]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Parse SQLDATE from its 'YYYYMMDD' form into a datetime, which displays as 'YYYY-MM-DD'.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;results['SQLDATE'] = results['SQLDATE'].apply(lambda x: pd.to_datetime(str(x), format='%Y-%m-%d'))            
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Aggregate the data by SQLDATE, EventRootCode, and Actor1CountryCode, summing NumMentions within each group.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;results = results.groupby(['SQLDATE','EventRootCode','Actor1CountryCode']).agg('sum').reset_index()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
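
&lt;p&gt;Put together, the processing steps above can be sketched as a single function. The sample rows below are made up for illustration and are not real GDELT data, and the helper name &lt;code&gt;to_timeseries&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
import pandas as pd

# Tiny made-up sample standing in for the events table (not real GDELT data).
raw = pd.DataFrame({
    "SQLDATE": [20200101, 20200101, 20200101, 20200102],
    "EventRootCode": ["14", "14", "01", "14"],
    "Actor1CountryCode": ["USA", "USA", "USA", "IND"],
    "NumMentions": [3, 5, 2, 4],
    "SOURCEURL": ["a", "b", "c", "d"],  # extra column that gets dropped
})


def to_timeseries(events):
    """Keep the four columns, parse SQLDATE, and sum mentions per group."""
    cols = ["SQLDATE", "EventRootCode", "Actor1CountryCode", "NumMentions"]
    out = events[cols].copy()
    # SQLDATE arrives as YYYYMMDD, so the format string has no separators.
    out["SQLDATE"] = pd.to_datetime(out["SQLDATE"].astype(str), format="%Y%m%d")
    keys = ["SQLDATE", "EventRootCode", "Actor1CountryCode"]
    return out.groupby(keys, as_index=False)["NumMentions"].sum()


timeseries = to_timeseries(raw)
print(timeseries)
```

&lt;p&gt;The two USA rows with root code 14 on the same day collapse into one row, so the four input rows become three.&lt;/p&gt;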



&lt;h2&gt;
  
  
  Data Analysis and Visualization
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Plot a line chart of a particular CAMEO code for a country over time.&lt;br&gt;
Example: Protest in the USA (aggregated weekly)&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F538qmq746r0pxo7y1v21.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F538qmq746r0pxo7y1v21.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Rank the top CAMEO codes in a country by number of mentions.&lt;br&gt;
Example: Top Trends in the USA (Last Week)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo83j465hua7r8udx6mor.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo83j465hua7r8udx6mor.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Rank the top countries for a particular CAMEO code by number of mentions.&lt;br&gt;
Example: Top Countries in Protest (Last Week)&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7eejebzd5yi58hugr471.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7eejebzd5yi58hugr471.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Plot a choropleth map for a particular CAMEO code.&lt;br&gt;
Example: Protest (Today)&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg8kc0hnua6twynecf301.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg8kc0hnua6twynecf301.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
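
&lt;p&gt;The data behind the first three charts can be prepared with a few pandas operations. This is a sketch under assumptions: the input has the shape produced by the processing steps, the sample rows are made up, and the helper names &lt;code&gt;weekly_mentions&lt;/code&gt;, &lt;code&gt;top_codes&lt;/code&gt;, and &lt;code&gt;top_countries&lt;/code&gt; are hypothetical. Plotting is left out; each result can be passed to a charting library such as Plotly.&lt;/p&gt;

```python
import pandas as pd

# Made-up aggregated rows in the shape produced by the processing steps.
ts = pd.DataFrame({
    "SQLDATE": pd.to_datetime(
        ["2020-01-06", "2020-01-08", "2020-01-13", "2020-01-13", "2020-01-14"]
    ),
    "EventRootCode": ["14", "14", "14", "01", "14"],
    "Actor1CountryCode": ["USA", "USA", "USA", "USA", "IND"],
    "NumMentions": [10, 5, 7, 20, 9],
})


def weekly_mentions(df, country, root_code):
    """Weekly NumMentions totals for one country and one root code."""
    sub = df[df["Actor1CountryCode"] == country]
    sub = sub[sub["EventRootCode"] == root_code]
    # "W" buckets are calendar weeks ending on Sunday.
    return sub.set_index("SQLDATE")["NumMentions"].resample("W").sum()


def top_codes(df, country, n=5):
    """Root codes with the most mentions in one country, largest first."""
    sub = df[df["Actor1CountryCode"] == country]
    return sub.groupby("EventRootCode")["NumMentions"].sum().nlargest(n)


def top_countries(df, root_code, n=5):
    """Countries with the most mentions of one root code, largest first."""
    sub = df[df["EventRootCode"] == root_code]
    return sub.groupby("Actor1CountryCode")["NumMentions"].sum().nlargest(n)


print(weekly_mentions(ts, "USA", "14"))
print(top_codes(ts, "USA"))
print(top_countries(ts, "14"))
```

&lt;p&gt;To restrict the rankings to the last week, filter the rows on &lt;code&gt;SQLDATE&lt;/code&gt; before calling the helpers.&lt;/p&gt;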

&lt;h2&gt;
  
  
  Technology Used
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Python&lt;/li&gt;
&lt;li&gt;Pandas&lt;/li&gt;
&lt;li&gt;Plotly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Code: &lt;a href="https://colab.research.google.com/drive/11gFRPtbPK7fz6OOUR7dYtmtdnN-O_pWM?usp=sharing" rel="noopener noreferrer"&gt;Link&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Co-author: &lt;a class="mentioned-user" href="https://dev.to/ashishsalunkhe"&gt;@ashishsalunkhe&lt;/a&gt; &lt;/p&gt;

</description>
      <category>python</category>
      <category>datascience</category>
      <category>gdelt</category>
    </item>
  </channel>
</rss>
