Exploring the Power of Location Intelligence with Snowflake’s Precise Geospatial Features
I have been curious about geospatial data analysis, aka, Geomatics, and as I see newer Snowflake capabilities, I have decided to try to learn more and I will bring you along with me. I confess that geomatics is a new term for me.
What is Geomatics?
Geomatics is a cutting-edge discipline that brings together activities like collecting, storing, processing, modeling, analyzing, and delivering spatially referenced or location information. This includes geographic information systems (GIS), remote sensing, land surveying, and global positioning systems (GPS). The goal of geomatics is to gather, analyze, and interpret data concerning the Earth’s surface and its features.
Key Areas in Geomatics
- Geographic Information Systems (GIS) refers to the use of software for analyzing and visualizing spatial data. GIS applications are employed in urban planning, environmental management, and transportation.
- Remote Sensing involves collecting data from a distance, typically using satellites or aircraft. It is used for environmental monitoring, agriculture, and disaster management.
- Land surveying involves the measurement and mapping of land areas using tools such as theodolites, GPS, and drones to create precise maps and boundaries.
- Global Positioning Systems (GPS) is the use of space satellites to pinpoint exact locations on Earth. GPS technology plays a crucial role in navigation, mapping, and accurate timing.
- Photogrammetry involves using photography to measure and map objects and is utilized in topographic mapping, architecture, and forensics.
- Cartography involves creating and studying maps. Cartographers design and produce maps for navigation, education, and scientific research.
One cool thing about this area that I was surprised is how many areas it combines: geography, engineering, computer science, and environmental science. While affecting many industries: agriculture, forestry, mining, urban planning, and disaster management.
What are the latest Technological Advancements in Geomatics?
- LiDAR (Light Detection and Ranging) uses laser pulses to create high-resolution maps and 3D models of the Earth’s surface.
- Unmanned Aerial Vehicles (UAVs) and Drones: the use of drones for aerial surveys has become widespread due to their cost-effectiveness, flexibility, and ability to access hard-to-reach areas.
- Satellite Remote Sensing, satellite sensors and the launch of new satellites provide higher resolution images and more frequent data collection.
- Big data and large cloud computing capacity have helped this area grow.
- Artificial Intelligence (AI) and Machine Learning (ML), algorithms are increasingly used for pattern recognition, predictive modeling, and automating data processing tasks.
- Internet of Things (IoT), such as sensors and smart meters, collect continuous spatial and environmental data.
What is Business Value in Geomatics?
All that is nice, but how is Geomatics evolving with technology? It seems that much is happening that touches. This is not a hard list, but it touches everything from drones, LIDAR, data-focused applications, real-time processing, IoT devices, and AI/ML.
It also touches many different important CORE industries, like farming, disaster response, environment, critical infrastructure inspection, self-driving cars, and land identification and classification. It clearly seems to have become an important defense industry. I could write an article just on this part alone. It was quite interesting to see the growth in this area.
How Does Snowflake Enable Spatial Use Cases?
Clearly, Snowflake is a big data technology enabled by the Cloud. At its basic level, Snowflake is a high-performing analytical platform with near-unlimited scale, which means it deals well with spatial geometry-based cases. However, the Snowflake product has been building many features that make practitioners' lives easier.
Snowflake offers native support for geospatial features such as points, lines, and polygons on the Earth’s surface. Let’s walk through some features in Snowflake that are specific to geometry or help geometry.
Data Types
Snowflake provides native first-class data types for geospatial data:
- GEOGRAPHY data type , which models Earth as though it were a perfect sphere. It follows the WGS 84 standard (spatial reference ID 4326; for details, see https://epsg.io/4326) and represents data in points on the earth are represented as degrees of longitude (from -180 degrees to +180 degrees) and latitude (-90 to +90). Snowflake uses 14 decimal places to store GEOGRAPHY coordinates. When the data includes decimal places exceeding this limit, the coordinates are rounded to ensure compliance with the specified length constraint. Altitude is currently not supported.
- GEOMETRY data type , which represents features in a planar (Euclidean, Cartesian) coordinate system. The GEOMETRY data type represents features in a planar (Euclidean, Cartesian) coordinate system. Represented as pairs of real numbers (x, y). Currently, only 2D coordinates are supported. The units of the X and Y are determined by the spatial reference system (SRS) associated with the GEOMETRY object.
Here is a list of differences between these:
Finally, remember that for data validation, you need to follow these rules OGC rules for Simple Features, and a given shape can be a valid GEOGRAPHY object but an invalid GEOMETRY object, and the same applies to the reverse.
Supported standard input and output formats¶
The GEOGRAPHY and GEOMETRY data types support the following standard industry formats for input and output:
- Well-Known Text (“WKT”)
- Well-Known Binary (“WKB”)
- Extended WKT and WKB (EWKT and EWKB)
- IETF GeoJSON
Supported geospatial object types
The GEOGRAPHY and GEOMETRY data types can store the following types of geospatial objects:
- WKT / WKB / EWKT / EWKB / GeoJSON geospatial objects: Point, MultiPoint, LineString, MultiLineString, Polygon, MultiPolygon and GeometryCollection
- These GeoJSON-specific geospatial objects: Feature and FeatureCollection
Output format:
Note: Snowflake clients will behave differently in geospatial object types, and each will give different responses.
Here is an example:
-- Create a table and insert a value
create table geospatial_table (id INTEGER, g GEOGRAPHY);
insert into geospatial_table values
(1, 'POINT(-122.35 47.63)'), (2, 'LINESTRING(31.8 -5, 31.8 -4.7)');
-- make the output GeoJSON
alter session set GEOGRAPHY_OUTPUT_FORMAT='GeoJSON';
-- Select statement
select g
from geospatial_table
order by id;
+------------------------+
| G |
|------------------------|
| { |
| "coordinates": [ |
| -122.35, |
| 47.63 |
| ], |
| "type": "Point" |
| } |
| { |
| "coordinates": [ |
| [ |
| 31.8, |
| -5 |
| ], |
| [ |
| 31.8, |
| -4.7 |
| ] |
| ], |
| "type": "LineString" |
| } |
+------------------------+
Loading geospatial data from stages
Data from CSV or JSON / AVRO files in a stage can be loaded directly into a GEOGRAPHY column.
- CSV: String values from the corresponding CSV column are parsed as GeoJSON, WKT, EWKT, WKB, or EWKB (see TO_GEOGRAPHY(VARCHAR)).
- JSON / AVRO: The JSON values in the file are interpreted as GeoJSON (see TO_GEOGRAPHY(VARIANT)).
Loading data from other file formats like Parquet and ORC is possible through a COPY transform.
Using UDFs with Geospatial data
Java or Python UDFs are supported for geospatial data, and these are even good ways to allow reading formats that are not supported natively, such as Shapefiles (.SHP), TAB, KML, GPKG, and others.
I have been playing with these samples in the Snowflake Github repository.
H3 Geospatial
H3 is a hierarchical geospatial index that partitions the world into hexagonal cells in a discrete global grid system. Snowflake has a wide range of SQL functions to enable H3 with GEOGRAPHY objects.
Snowflake provides SQL functions that enable you to use H3 with GEOGRAPHY objects. You can use these functions to:
Geospatial Functions
Snowflake provides geospatial functions that operate on the GEOGRAPHY and GEOMETRY data types. As you can see, there are many of them, and I cannot cover them in this article, but let me discuss the categories and maybe use an example to demonstrate how useful they can be.
Here is a cool one in action:
-- ST_AREA funtion returns the Area of the object
-- Let say you have a regions defined as polygons in a table for a country or state/province
SELECT region, ST_AREA(geometry) AS area FROM regions;
Search Optimization
You can speed up geospatial queries with search optimization.
ALTER TABLE customers ADD SEARCH OPTIMIZATION ON GEO(cordinates);
Conclusion
This was interesting. There is a lot to know about geospatial in Snowflake and even a lot to know about geospatial in general. I will follow up with some more of what I learned in the subject and other articles.
However, you can really see how well Snowflake supports things, and it’s part of many different existing services in Snowflake. I saw an example in the documentation that gave an answer to how many countries a line passes through between the two data types, and it was interesting to see how GEOGRAPHY and GEOMETRY work differently.
If you want, suggest the next article of the series, and it could be a way to use this to solve an existing problem in your industry.
Thank you, Layne Newhouse, for reviewing this article and ideas for the next.
I am Augusto Rosa, VP of Engineering for Infostrux Solutions. I am also a Snowflake Data Super Hero and Snowflake SME.
Thanks for reading my blog post. You can follow me on LinkedIn.
Subscribe to Infostrux Medium Blogs https://medium.com/infostrux-solutions for the most interesting Data Engineering and Snowflake news.
Sources:
- https://docs.carto.com/getting-started/carto-in-a-nutshell
- https://docs.snowflake.com/en/user-guide/search-optimization/geospatial-queries
Top comments (0)