<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: David Ostler</title>
    <description>The latest articles on DEV Community by David Ostler (@dro248).</description>
    <link>https://dev.to/dro248</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1555703%2Fb1854028-4622-4804-bb41-892883c6a1df.png</url>
      <title>DEV Community: David Ostler</title>
      <link>https://dev.to/dro248</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dro248"/>
    <language>en</language>
    <item>
      <title>Converting .shp files to CSV with GeoPandas</title>
      <dc:creator>David Ostler</dc:creator>
      <pubDate>Fri, 31 May 2024 17:31:56 +0000</pubDate>
      <link>https://dev.to/dro248/converting-shp-files-to-csv-with-geopandas-7k6</link>
      <guid>https://dev.to/dro248/converting-shp-files-to-csv-with-geopandas-7k6</guid>
      <description>&lt;p&gt;Working with geospatial data is really fun! We use this type of data all over the place in data science, and data warehouses have gotten really good at working with it (I'm looking at you, Snowflake!).&lt;/p&gt;

&lt;p&gt;However, as you get started down the path of using geospatial data, you will likely run into a speed bump: &lt;code&gt;.shp&lt;/code&gt; files.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is a &lt;code&gt;.shp&lt;/code&gt; file?
&lt;/h2&gt;

&lt;p&gt;A &lt;code&gt;.shp&lt;/code&gt; file (often called a shapefile) is a common way of encoding geospatial data in a binary format. The shapes it stores are typically points, lines, polygons (a collection of geospatial points that form an area), or multipolygons (shapes made up of several polygons).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomahubfcr80javfeuqwr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomahubfcr80javfeuqwr.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

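&lt;p&gt;A &lt;code&gt;.shp&lt;/code&gt; file comes bundled with other files that share its name but have different file extensions (e.g., &lt;code&gt;.shx&lt;/code&gt;, &lt;code&gt;.dbf&lt;/code&gt;, &lt;code&gt;.prj&lt;/code&gt;). In short, you will need all of them, so keep them together in the same folder.&lt;/p&gt;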
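
&lt;p&gt;As a rough sketch of what "all of them" means in practice, the snippet below checks that the companion files sit next to a given &lt;code&gt;.shp&lt;/code&gt; file before you try to read it. The function name &lt;code&gt;missing_sidecars&lt;/code&gt; is made up for this example, and the extension list is an assumption based on the usual bundle (&lt;code&gt;.shx&lt;/code&gt; for the index, &lt;code&gt;.dbf&lt;/code&gt; for the attributes, &lt;code&gt;.prj&lt;/code&gt; for the coordinate system). It also assumes lowercase extensions.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

from pathlib import Path

# Companion extensions assumed for this sketch (not an exhaustive list)
SIDECAR_EXTENSIONS = ('.shx', '.dbf', '.prj')


def missing_sidecars(shp_path):
    """Return the names of companion files not found next to shp_path."""
    shp = Path(shp_path)
    return [
        shp.with_suffix(ext).name
        for ext in SIDECAR_EXTENSIONS
        if not shp.with_suffix(ext).exists()
    ]


if __name__ == '__main__':
    missing = missing_sidecars('Grid_100m.shp')
    print('Missing companion files:', missing or 'none')

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;An empty list means the bundle looks complete, at least for the extensions listed here.&lt;/p&gt;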

&lt;h2&gt;
  
  
  What is the problem with &lt;code&gt;.shp&lt;/code&gt; files?
&lt;/h2&gt;

&lt;p&gt;Unfortunately, most data warehouses don't read &lt;code&gt;.shp&lt;/code&gt; files natively. If you are looking to load this dataset into your data warehouse, you are going to need to convert it to a compatible format like CSV.&lt;/p&gt;

&lt;p&gt;Additionally, a &lt;code&gt;.shp&lt;/code&gt; file might use a different coordinate system than the latitude / longitude system you are familiar with (&lt;a href="https://en.wikipedia.org/wiki/World_Geodetic_System#WGS_84" rel="noopener noreferrer"&gt;WGS 84&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;In this tutorial, we will convert a &lt;code&gt;.shp&lt;/code&gt; file to CSV and transform its geospatial data to the latitude / longitude coordinate system.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The latitude / longitude coordinate system we are familiar with is called WGS 84, and its identifier is EPSG:4326.&lt;/p&gt;
&lt;/blockquote&gt;
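
&lt;p&gt;To find out which coordinate system a dataset uses before converting anything, one option is to look at its &lt;code&gt;.prj&lt;/code&gt; file, which stores the coordinate system description as plain text. The helper below is only a sketch (&lt;code&gt;read_prj&lt;/code&gt; is a hypothetical name) that returns that text, or &lt;code&gt;None&lt;/code&gt; when no &lt;code&gt;.prj&lt;/code&gt; file exists. With GeoPandas, the same information should be available from the &lt;code&gt;crs&lt;/code&gt; attribute of the loaded data.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

from pathlib import Path


def read_prj(shp_path):
    """Return the coordinate system text stored next to shp_path, or None."""
    prj = Path(shp_path).with_suffix('.prj')
    if not prj.exists():
        return None
    return prj.read_text().strip()


if __name__ == '__main__':
    wkt = read_prj('Grid_100m.shp')
    if wkt is None:
        print('No .prj file found: the coordinate system is unknown')
    else:
        # The text typically starts with PROJCS[...] for projected systems
        # and GEOGCS[...] for latitude / longitude systems
        print(wkt[:80])

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;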

&lt;h2&gt;
  
  
  Convert a &lt;code&gt;.shp&lt;/code&gt; file to CSV
&lt;/h2&gt;

&lt;p&gt;In this example, we'll use &lt;a href="https://geopandas.org/en/stable/" rel="noopener noreferrer"&gt;GeoPandas&lt;/a&gt; for the conversion.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

import geopandas as gpd

# Read the .shp file into a GeoDataFrame
geopandas_df = gpd.read_file('Grid_100m.shp')

# Convert geospatial data to latitude/longitude coordinate system
converted_df = geopandas_df.to_crs('EPSG:4326')

# Write data to CSV
converted_df.to_csv('Grid_100m.csv', index=False)


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

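&lt;p&gt;If all goes well, you should now have a well-formatted CSV file that you can inspect with Excel (and load into your data warehouse). Each shape is written to the &lt;code&gt;geometry&lt;/code&gt; column as WKT text, which many data warehouses can parse.&lt;/p&gt;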
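
&lt;p&gt;If you would rather check the result without opening Excel, here is a small sketch that uses only Python's built-in &lt;code&gt;csv&lt;/code&gt; module. It relies on the shapes being stored as WKT text (for example &lt;code&gt;POLYGON ((...))&lt;/code&gt;) and reports the shape type of the first few rows. The function name &lt;code&gt;geometry_types&lt;/code&gt; is invented for this example, and the &lt;code&gt;geometry&lt;/code&gt; column name is an assumption based on the GeoPandas default.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

import csv

# Large polygons can exceed the default field size limit of the csv module
csv.field_size_limit(10**8)


def geometry_types(csv_path, limit=5, column='geometry'):
    """Return the WKT shape type (e.g. POLYGON) of the first rows."""
    types = []
    with open(csv_path, newline='') as f:
        for row in csv.DictReader(f):
            if len(types) == limit:
                break
            # WKT text looks like 'POLYGON ((x y, ...))': keep the leading word
            types.append(row[column].split('(')[0].strip())
    return types


# Example call, once the CSV from the tutorial exists:
# print(geometry_types('Grid_100m.csv'))

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;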

&lt;p&gt;Cheers! 🚀&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>python</category>
      <category>tutorial</category>
      <category>dataengineering</category>
    </item>
  </channel>
</rss>
