<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vahid Fazel-Rezai</title>
    <description>The latest articles on DEV Community by Vahid Fazel-Rezai (@vahidfazelrezai).</description>
    <link>https://dev.to/vahidfazelrezai</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F204761%2F1a48e538-ffae-4bb0-86e1-8f41561a46f4.png</url>
      <title>DEV Community: Vahid Fazel-Rezai</title>
      <link>https://dev.to/vahidfazelrezai</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vahidfazelrezai"/>
    <language>en</language>
    <item>
      <title>Using Tableau with DynamoDB: How to Build a Real-Time SQL Dashboard on NoSQL Data</title>
      <dc:creator>Vahid Fazel-Rezai</dc:creator>
      <pubDate>Wed, 28 Aug 2019 07:00:00 +0000</pubDate>
      <link>https://dev.to/rocksetcloud/using-tableau-with-dynamodb-how-to-build-a-real-time-sql-dashboard-on-nosql-data-52nk</link>
      <guid>https://dev.to/rocksetcloud/using-tableau-with-dynamodb-how-to-build-a-real-time-sql-dashboard-on-nosql-data-52nk</guid>
      <description>&lt;p&gt;In this blog, we examine DynamoDB reporting and analytics, which can be challenging given the lack of SQL and the difficulty running analytical queries in DynamoDB. We will demonstrate how you can build an interactive dashboard with Tableau, using SQL on data from DynamoDB, in a series of easy steps, with no ETL involved.&lt;/p&gt;

&lt;p&gt;DynamoDB is a widely popular transactional primary data store. It is built to handle unstructured data models and massive scales. DynamoDB is often used for organization’s most critical business data, and as such there is value in being able to visualize and dig deeper into this data.&lt;/p&gt;

&lt;p&gt;Tableau, also widely popular, is a tool for building live, interactive charts and dashboards. In this blog post, we will walk through an example of using Tableau to visualize data in DynamoDB.&lt;/p&gt;

&lt;p&gt;DynamoDB works well out-of-the-box for simple lookups by the primary key. For lookups by a different attribute, DynamoDB allows creating a local or global secondary index. However, for even more complex access patterns like filtering on nested or multiple fields, sorting, and aggregations—types of queries that commonly power dashboards—DynamoDB alone is not sufficient. &lt;a href="https://rockset.com/blog/tableau-operational-dashboards-reporting-dynamodb-redshift-athena/" rel="noopener noreferrer"&gt;This blog post&lt;/a&gt; evaluates a few approaches to bridge this gap.&lt;/p&gt;

&lt;p&gt;In this post, we will create an example business dashboard in Tableau on data in DynamoDB, using Rockset as the SQL intelligence layer in between, and JDBC to connect Tableau and Rockset.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Data
&lt;/h2&gt;

&lt;p&gt;For this example, I’ve combined &lt;a href="https://www.kaggle.com/airbnb/seattle" rel="noopener noreferrer"&gt;sample data from Airbnb&lt;/a&gt; and &lt;a href="https://mockaroo.com/" rel="noopener noreferrer"&gt;mock data from Mockaroo&lt;/a&gt; to generate realistic records of users with listings, bookings, and reviews for a hypothetical home rental marketplace. (All names and emails are fake.) The mock data and scripts are &lt;a href="https://github.com/rockset/recipes/tree/master/mock-rental-data" rel="noopener noreferrer"&gt;available on Github&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The data model is typical for a DynamoDB use case—here’s an example item:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "user_id": "28c38f9e-463d-4eae-b53d-16cdad48f150",
    "first_name": "Kimberlyn",
    "last_name": "Maudlin",
    "email": "kmaudlin24@bandcamp.com",
    "listings": [
        {
            "listing_id": "8472954",
            "title": "Private bedroom in adorable home",
            "description": "The spare bedroom in our adorable 2 bedroom home ... ",
            "city": "Bomomani",
            "country": "Indonesia",
            "listed_date": "2015-09-30",
            "cancellation_policy": "flexible",
            "price_usd": "51.00",
            "bathrooms": "2",
            "bedrooms": "2",
            "beds": "2",
            "bookings": [
                {
                    "user": {
                        "user_id": "530cd0c7-b79b-4f94-9e0f-969fc7f9855b",
                        "first_name": "Nahum",
                        "last_name": "Yaus",
                        "email": "nyaus9@angelfire.com"
                    },
                    "start_date": "2015-12-07",
                    "length_days": "5",
                    "review": {
                        "text": "Great convenient location, clean, and ... ",
                        "rating": "3"
                    },
                    "cost_usd": "230.84"
                }
            ]
        }
    ]
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few things to note:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In our data, sometimes the &lt;code&gt;review&lt;/code&gt; field will be missing (if the user did not leave a review).&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;bookings&lt;/code&gt; and &lt;code&gt;listings&lt;/code&gt; arrays may be empty, or arbitrarily long!&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;user&lt;/code&gt; field is denormalized and duplicated within a booking, but also exists separately as its own item.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We start with a DynamoDB table called &lt;code&gt;rental_data&lt;/code&gt; loaded with 21,964 such records:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F17d04KOUD3gjwJ8yh1qTyr%2Fea3acb90226d23fa7ad50c2d1ae3296c%2Fdynamodb-table.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F17d04KOUD3gjwJ8yh1qTyr%2Fea3acb90226d23fa7ad50c2d1ae3296c%2Fdynamodb-table.png" alt="dynamodb-table"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Connecting Tableau to DynamoDB
&lt;/h2&gt;

&lt;p&gt;Let’s see this data into Tableau!&lt;/p&gt;

&lt;p&gt;We’ll need accounts for Tableau Desktop and Rockset. I also assume we’ve already &lt;a href="https://docs.rockset.com/amazon-dynamodb/#create-a-dynamodb-integration" rel="noopener noreferrer"&gt;set up credentials&lt;/a&gt; to access our DynamoDB table.&lt;/p&gt;

&lt;p&gt;First, we need to download the Rockset JDBC driver from &lt;a href="https://oss.sonatype.org/#nexus-search;quick~rockset" rel="noopener noreferrer"&gt;Maven&lt;/a&gt; and place it in &lt;code&gt;~/Library/Tableau/Drivers&lt;/code&gt; for Mac or &lt;code&gt;C:\Program Files\Tableau\Drivers&lt;/code&gt; for Windows.&lt;/p&gt;

&lt;p&gt;Next, let’s create an API key in Rockset that Tableau will use for authenticating requests:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F5sJX30S5iSZQuyJuNzPvP%2Fe3f07fe192e351adf01bea1bbef2b4d2%2Frockset-apikey.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F5sJX30S5iSZQuyJuNzPvP%2Fe3f07fe192e351adf01bea1bbef2b4d2%2Frockset-apikey.png" alt="rockset-apikey"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In Tableau, we connect to Rockset by choosing “Other Databases (JDBC)” and filling the fields, with our API key as the password:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F5qoil33PxGw0a0ToULBfzp%2F0d7729c407899add9a4a1da2b88353c3%2Ftableau-connect.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F5qoil33PxGw0a0ToULBfzp%2F0d7729c407899add9a4a1da2b88353c3%2Ftableau-connect.png" alt="tableau-connect"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, back in Rockset, we just create a new collection directly from the DynamoDB table:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F34dLf7TjnDXj2bUyVmUWKJ%2F171d4d09263b30b16e7890c4f69a8f64%2Frockset-collection.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F34dLf7TjnDXj2bUyVmUWKJ%2F171d4d09263b30b16e7890c4f69a8f64%2Frockset-collection.png" alt="rockset-collection"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We see the new collection reflected as a table in Tableau:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2Fz2vplSDXoLANYyB0BeoBj%2Fafead77d11db3250734b594d2aa9c898%2Ftableau-table.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2Fz2vplSDXoLANYyB0BeoBj%2Fafead77d11db3250734b594d2aa9c898%2Ftableau-table.png" alt="tableau-table"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Users Table
&lt;/h2&gt;

&lt;p&gt;Our DynamoDB table has some fields of type Map and List, whereas Tableau expects a relational model where it can do joins on flat tables. To resolve this, we’ll compose SQL queries in the Rockset Console that reshapes the data as desired, and add these as custom SQL data sources in Tableau.&lt;/p&gt;

&lt;p&gt;First, let’s just get a list of all the users on our rental platform:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F3qfm2hTVmOQR3IhPH9ptA0%2Fd7f7a9c87f0731090ddaefb4a4b0f326%2Frockset-query.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F3qfm2hTVmOQR3IhPH9ptA0%2Fd7f7a9c87f0731090ddaefb4a4b0f326%2Frockset-query.png" alt="rockset-query"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In Tableau, we drag “New Custom SQL” to the top section, paste this query (without the LIMIT clause), and rename the result to Users:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F5nPbTuPhSbRSxcj4nbW9eV%2Ff982ccd42e677454ae437c521da556bc%2Ftableau-custom-sql.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F5nPbTuPhSbRSxcj4nbW9eV%2Ff982ccd42e677454ae437c521da556bc%2Ftableau-custom-sql.gif" alt="tableau-custom-sql"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Looks good! Now, let’s repeat this process to also pull out listings and bookings into their own tables.&lt;/p&gt;

&lt;h2&gt;
  
  
  Listings Table
&lt;/h2&gt;

&lt;p&gt;Note that in the original table, each row (user) has an array of listing items. We want to pull out these arrays and concatenate them such that each item itself becomes a row. To do so, we can use the &lt;a href="https://docs.rockset.com/commands/#unnest" rel="noopener noreferrer"&gt;UNNEST function&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F2kPV5eXKIr1n8nh2cA65Ev%2F394a4d5c9ac27dd47f9940cf824ea944%2Frockset-query-2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F2kPV5eXKIr1n8nh2cA65Ev%2F394a4d5c9ac27dd47f9940cf824ea944%2Frockset-query-2.png" alt="rockset-query-2"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, let’s select the fields we want to have in our listings table:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F5DS0plvM4lG3XZiL3MfJiK%2Ffc7d7825f74b7d21623e0facfb840927%2Frockset-query-3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F5DS0plvM4lG3XZiL3MfJiK%2Ffc7d7825f74b7d21623e0facfb840927%2Frockset-query-3.png" alt="rockset-query-3"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And we paste this as custom SQL in Tableau to get our Listings table:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F5rJO4RnLUiY6R1aFk7GrsU%2F083367de8bf0e8fa7e3f07730765f5e5%2Ftableau-data-source.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F5rJO4RnLUiY6R1aFk7GrsU%2F083367de8bf0e8fa7e3f07730765f5e5%2Ftableau-data-source.png" alt="tableau-data-source"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Bookings Table
&lt;/h2&gt;

&lt;p&gt;Let’s create one more data source for our Bookings table with another UNNEST query:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F3Kv3M9TX6PM2h1QKOdEeqM%2Fe965ffa88a5ab8f76a3971eb715e4adc%2Ftableau-custom-sql-query.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F3Kv3M9TX6PM2h1QKOdEeqM%2Fe965ffa88a5ab8f76a3971eb715e4adc%2Ftableau-custom-sql-query.png" alt="tableau-custom-sql-query"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Chart 1: Listings Overview
&lt;/h2&gt;

&lt;p&gt;Let’s get a high level view of the listings around the world on our platform. With a few drag-and-drops, we use the city/country to place the listings on a map, sized by booking count and colored by cancellation policy.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F1NYbNXRFjSpSZc6SLkwfYY%2Ff3728449f44e94f9cbe274976d0b7aa3%2Ftableau-sheet-overview.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F1NYbNXRFjSpSZc6SLkwfYY%2Ff3728449f44e94f9cbe274976d0b7aa3%2Ftableau-sheet-overview.gif" alt="tableau-sheet-overview"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Looks like we have a lot of listings in Europe, South America, and East Asia.&lt;/p&gt;

&lt;h2&gt;
  
  
  Chart 2: Listings Leaderboard
&lt;/h2&gt;

&lt;p&gt;Let’s try to find out more about the listings pulling in the most revenue. We’ll build a leaderboard with the following information:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;labeled by &lt;strong&gt;listing ID&lt;/strong&gt; and &lt;strong&gt;email of host&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;total revenue&lt;/strong&gt; as the sum of cost across all bookings (sorted from highest to lowest)&lt;/li&gt;
&lt;li&gt;colored by &lt;strong&gt;year&lt;/strong&gt; it was listed&lt;/li&gt;
&lt;li&gt;details about &lt;strong&gt;title&lt;/strong&gt; , &lt;strong&gt;description&lt;/strong&gt; , and &lt;strong&gt;number of beds&lt;/strong&gt; shown on hover&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note that to accomplish this, we have to combine information across all three of our tables, but we can do so directly in Tableau.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F5RCJQ0SuAZp29wFclXJPav%2Fdfb02a7c292421fbc61e28483ddb5629%2Ftableau-sheet-leaderboard.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F5RCJQ0SuAZp29wFclXJPav%2Fdfb02a7c292421fbc61e28483ddb5629%2Ftableau-sheet-leaderboard.gif" alt="tableau-sheet-leaderboard"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Chart 3: Rating by Length
&lt;/h2&gt;

&lt;p&gt;Next, suppose we want to know what kind of users our platform is pleasing the most. Let's look at the average rating for each of the different lengths of bookings.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F1sNyIg1hzdk0MteBwZwKjv%2F58325b685e3cd21653611222cf684861%2Ftableau-sheet-analysis.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F1sNyIg1hzdk0MteBwZwKjv%2F58325b685e3cd21653611222cf684861%2Ftableau-sheet-analysis.gif" alt="tableau-sheet-analysis"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  User Dashboard on Real-Time Data
&lt;/h2&gt;

&lt;p&gt;Let’s throw all these charts together in a dashboard:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F1H0joHsReZ9ejz2D7ZKcqV%2F8bb5effa0b0456a134b7e9ee4e75e1c3%2Ftableau-dashboard.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F1H0joHsReZ9ejz2D7ZKcqV%2F8bb5effa0b0456a134b7e9ee4e75e1c3%2Ftableau-dashboard.png" alt="tableau-dashboard"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You may notice the ratings by length are roughly the same between length of stay—and that’s because the mock data was generated for each length from the same rating distribution!&lt;/p&gt;

&lt;p&gt;To illustrate that this dashboard gets updated in real time on the live DynamoDB source, we’ll add one record to try and noticeably skew some of the charts.&lt;/p&gt;

&lt;p&gt;Let’s say I decide to sign up for this platform and list my own bedroom in San Francisco, listed for $44 a night. Then, I book my own room 444 times and give it a rating of 4 each time. This Python code snippet generates that record and adds it to DynamoDB:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import boto3

booking = {
        "user": {
            "first_name": "Vahid",
            "last_name": "Fazel-Rezai",
            "email": "vahid@rockset.com",
            "user_id": "fc8ca81a-d1fa-4156-b983-dc2b07c1443c"
        },
        "start_date": "2019-04-04",
        "length_days": "4",
        "review": {
            "rating": "4",
            "text": "Worked 4 me!"
        },
        "cost_usd": "44.00"
    }
item = {
        "first_name": "Vahid",
        "last_name": "Fazel-Rezai",
        "email": "vahid@rockset.com",
        "user_id": "fc8ca81a-d1fa-4156-b983-dc2b07c1443c",
        "listings": [{
            "listing_id": "444444",
            "title": "Bedroom for rent",
            "description": "A place to stay, simple but sufficient.",
            "city": "San Francisco",
            "country": "United States",
            "listed_date": "2019-04-04",
            "price_usd": "11.00",
            "cancellation_policy": "flexible",
            "bathrooms": "1",
            "bedrooms": "1",
            "beds": "1",
            "bookings": 444 * [booking]
        }]
    }

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("rental_data")
table.put_item(Item = item)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sure enough, we just have to refresh our dashboard in Tableau and we can see the difference immediately!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F37qlQnZkFqILeb6gYBqncx%2F4f94f7d2e16a0fcd9c2033482f945590%2Ftableau-real-time-dashboard.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.ctfassets.net%2F1d31s1aajogl%2F37qlQnZkFqILeb6gYBqncx%2F4f94f7d2e16a0fcd9c2033482f945590%2Ftableau-real-time-dashboard.png" alt="tableau-real-time-dashboard"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;In this blog post, we walked through creating an interactive dashboard in Tableau that monitors core business data stored in DynamoDB. We used Rockset as the SQL intelligence layer between DynamoDB and Tableau. The steps we followed were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Start with data in a DynamoDB table.&lt;/li&gt;
&lt;li&gt;Create a collection in Rockset, using the DynamoDB table as a source.&lt;/li&gt;
&lt;li&gt;Write one or more SQL queries that return the data needed in Tableau.&lt;/li&gt;
&lt;li&gt;Create a data source in Tableau using custom SQL.&lt;/li&gt;
&lt;li&gt;Use the Tableau interface to create charts and dashboards.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>showdev</category>
      <category>aws</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Data-Driven Decisions for Where to Park in SF</title>
      <dc:creator>Vahid Fazel-Rezai</dc:creator>
      <pubDate>Fri, 16 Aug 2019 07:00:00 +0000</pubDate>
      <link>https://dev.to/rocksetcloud/data-driven-decisions-for-where-to-park-in-sf-2549</link>
      <guid>https://dev.to/rocksetcloud/data-driven-decisions-for-where-to-park-in-sf-2549</guid>
      <description>&lt;p&gt;Have you ever felt uncertain parking in a shady area? In particular, have you ever parked in San Francisco and wondered, if I measured the average inverse square distance to every vehicle incident recorded by the SFPD in the last year, at what percentile would my current location fall?&lt;/p&gt;

&lt;p&gt;If so, we built &lt;a href="https://rockset.com/sf-parking"&gt;an app for that&lt;/a&gt;. In this post we’ll explain our methodology and its implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Parking in San Francisco
&lt;/h2&gt;

&lt;p&gt;Vehicle-related break-ins and thefts are &lt;a href="https://www.nbcbayarea.com/news/local/Breaking-Point-475109113.html"&gt;notoriously common in San Francisco&lt;/a&gt;. Just last week, items worth half a million dollars were stolen in a &lt;a href="https://abc7news.com/sf-community-reacts-to-police-response-over-car-burglary-with-celebrity-victim/5465666/"&gt;high-profile car burglary&lt;/a&gt;. There’s even a &lt;a href="https://twitter.com/sfcarbreakins"&gt;Twitter account&lt;/a&gt; tracking incidents.&lt;/p&gt;

&lt;p&gt;The San Francisco Police Department maintains an ongoing &lt;a href="https://data.sfgov.org/Public-Safety/Police-Department-Incident-Reports-2018-to-Present/wg3w-h783"&gt;dataset of all incidents&lt;/a&gt; since January 1, 2018 (there is &lt;a href="https://data.sfgov.org/Public-Safety/Police-Department-Incident-Reports-Historical-2003/tmnf-yvry"&gt;another one&lt;/a&gt; for 2003-2018).The San Francisco Chronicle has created a &lt;a href="https://projects.sfchronicle.com/trackers/sf-car-breakins/"&gt;great map visualization&lt;/a&gt; from this to track break-ins. We wanted to make this data even more actionable, to help asses the security of parking in a particular location in real-time.&lt;/p&gt;

&lt;p&gt;Hence, the motivating question: if I am looking to park in SF, how can I get a sense of how safe my current spot is?&lt;/p&gt;

&lt;h2&gt;
  
  
  Defining a Risk Score
&lt;/h2&gt;

&lt;p&gt;Of course, the risk of a parking spot can be measured in many different qualitative and quantitative ways. We chose a quantitative measure, admittedly quite arbitrary, as the average inverse square of the distance between the parking location and every break-in location in the past year.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ygihcVVg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/14z48ECo5RdF3LyTAQEwGp/16cfc8376d86c9b662f00d8447c7cea1/image__1_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ygihcVVg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/14z48ECo5RdF3LyTAQEwGp/16cfc8376d86c9b662f00d8447c7cea1/image__1_.png" alt="image (1)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This just gives a numerical score. We then evaluate this score across a representative sample of parking spots across SF, and place the current parking spot at a percentile within that sample. The higher the score, the closer the spot is to historical incidents (inverse of distance), the higher the risk.&lt;/p&gt;

&lt;p&gt;We’ll wrap this all in a simple mobile web app, so in you can open it from your car, press a button, and get a percentile of how secure your parking spot is.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--reob9go---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/4pyylPmgECB4tuLmN7F4TR/c5aef2c4c61066159aae6789b16cc57f/image__2_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--reob9go---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/4pyylPmgECB4tuLmN7F4TR/c5aef2c4c61066159aae6789b16cc57f/image__2_.png" alt="image (2)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, we just have to use the data to compute the risk score percentile. For this task, we’ll load the SFPD data into a Rockset collection and query it upon a user clicking the button.&lt;/p&gt;

&lt;h2&gt;
  
  
  Loading the Data
&lt;/h2&gt;

&lt;p&gt;To get started quickly, we’ll simply download the data as a CSV and upload the file into a new collection.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--IcPHLE04--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/6eQ5jGFJl65exKhbi3KV5O/98e2be05a8c7da7cc85e1a392489458c/image__3_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--IcPHLE04--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/6eQ5jGFJl65exKhbi3KV5O/98e2be05a8c7da7cc85e1a392489458c/image__3_.png" alt="image (3)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Later, we can set up a periodic job to forward the dataset into the collection via the API, so that it always stays up to date.&lt;/p&gt;

&lt;h2&gt;
  
  
  Filtering the Data
&lt;/h2&gt;

&lt;p&gt;Let’s switch over to the query tab and try writing a query to filter down to the incidents we care about. There are a few conditions we want to check:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vehicle-related incidents.&lt;/strong&gt; Each incident has an “Incident Subcategory” assigned by the Crime Analysis Unit of the Police Department. We do a &lt;code&gt;SELECT DISTINCT&lt;/code&gt; query on this field and scan the results to pick out the ones we consider vehicle-related.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Motor Vehicle Theft&lt;/li&gt;
&lt;li&gt;Motor Vehicle Theft (Attempted)&lt;/li&gt;
&lt;li&gt;Theft From Vehicle&lt;/li&gt;
&lt;li&gt;Larceny - Auto Parts&lt;/li&gt;
&lt;li&gt;Larceny - From Vehicle&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yWW7oHoB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/4Onw198CgowVVE8SCt16Z6/b92dbd2a9da78a98fb066f3a72c676f5/image__4_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yWW7oHoB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/4Onw198CgowVVE8SCt16Z6/b92dbd2a9da78a98fb066f3a72c676f5/image__4_.png" alt="image (4)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Initial report.&lt;/strong&gt; According to the &lt;a href="https://data.sfgov.org/Public-Safety/Police-Department-Incident-Reports-2018-to-Present/wg3w-h783"&gt;data documentation&lt;/a&gt;, records cannot be edited once they are filed, so some records are filed as “supplemental” to an existing incident. We can filter those out by looking for the word “Initial” in the report type description.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--waJvJAq5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/3Fj8GSCVvQNPIthVKs3M67/85588b3eccb299bd8934aa06385aa6be/image__5_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--waJvJAq5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/3Fj8GSCVvQNPIthVKs3M67/85588b3eccb299bd8934aa06385aa6be/image__5_.png" alt="image (5)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Within SF.&lt;/strong&gt; The documentation also specifies that some incidents occur outside SF, and that such incidents will have the value “Out of SF” in the police district field.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--bQ2FIXzD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/2DKfo7TasT6zfxyNJE5gZw/8ccfcc1b38471d2cee5897f5d09fd9bf/image__6_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bQ2FIXzD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/2DKfo7TasT6zfxyNJE5gZw/8ccfcc1b38471d2cee5897f5d09fd9bf/image__6_.png" alt="image (6)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Last year.&lt;/strong&gt; The dataset provides a datetime field, which we can parse and ensure is within the last 12 months.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0CIbACm5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/6cwl9kjufwQrCzPAgtX4ho/f5c2e23d66cb9716fc94e84229d0774b/image__7_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0CIbACm5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/6cwl9kjufwQrCzPAgtX4ho/f5c2e23d66cb9716fc94e84229d0774b/image__7_.png" alt="image (7)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Geolocation available.&lt;/strong&gt; We notice some rows are missing the latitude and longitude fields, instead having an empty string. We will simply ignore these records by filtering them out.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Putting all these conditions together, we can prune down from 242,012 records in this dataset to just the 28,224 relevant vehicle incidents, packaged up into a &lt;code&gt;WITH&lt;/code&gt; query.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--XhcKEqH5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/2447gpZL6dJp2zLNnm6L1v/76036b14c84c2733f632be114837101d/image__8_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--XhcKEqH5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/2447gpZL6dJp2zLNnm6L1v/76036b14c84c2733f632be114837101d/image__8_.png" alt="image (8)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Calculating a Risk Score, One Spot
&lt;/h2&gt;

&lt;p&gt;Now that we have all vehicle incidents in the last year, let’s see if we can calculate the security score for San Francisco City Hall, which has a latitude of 37.7793° N and longitude of 122.4193° W.&lt;/p&gt;

&lt;p&gt;Using some good old math tricks (radius times angle in radians to get arc length, approximating arc length as straight-line distance, and Pythagorean theorem), we can compute the distance in miles to each past incident:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--aCpA3FIR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/jIkNDW8IF2YY1uKffoJAi/f716733301f39cd878f56c74745531f9/image__9_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--aCpA3FIR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/jIkNDW8IF2YY1uKffoJAi/f716733301f39cd878f56c74745531f9/image__9_.png" alt="image 9"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We aggregate these distances using our formula from above, and voila!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jldBOD67--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/4ztemVlqRHW2Ve5RbsHREZ/822daa21e732b866525a74961b02bdee/image__10_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jldBOD67--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/4ztemVlqRHW2Ve5RbsHREZ/822daa21e732b866525a74961b02bdee/image__10_.png" alt="image (10)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For our app, we will replace the latitude/longitude of City Hall with parameters coming from the user’s browser location.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sample of Parking Spots in SF
&lt;/h2&gt;

&lt;p&gt;So we can calculate a risk score—1.63 for City Hall—but that is meaningless unless we can compare it to the other parking spots in SF. We need to find a representative set of all possible parking spots in SF and compute the risk score for each to get a distribution of risk scores.&lt;/p&gt;

&lt;p&gt;Turns out, the SFMTA has exactly what we need—field surveys are conducted to count the number of on-street parking spots and their results are &lt;a href="https://data.sfgov.org/Transportation/On-street-Parking-based-on-Parking-Census/9ivs-nf5y"&gt;published as an open dataset&lt;/a&gt;. We’ll upload this into Rockset as well!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hJpg7USE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/3LmqUAp8f6sSaC0DT88fEW/7356cfe587a64c25adc0f5daac32e261/image__11_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hJpg7USE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/3LmqUAp8f6sSaC0DT88fEW/7356cfe587a64c25adc0f5daac32e261/image__11_.png" alt="image (11)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let’s see what this dataset contains:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--nt-f_2G2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/2SGnNv7s1NvEgk1hetZwNw/ab678ba8b0bff6396a1e4bbb1cfa4fbc/image__12_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--nt-f_2G2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/2SGnNv7s1NvEgk1hetZwNw/ab678ba8b0bff6396a1e4bbb1cfa4fbc/image__12_.png" alt="image 12"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For each street, let’s pull out the latitude/longitude values (just the first point, close enough approximation), count of spots, and a unique identifier (casting types as necessary):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9PG_Q9pE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/2HcHZaoaWn2IofkjLdyc4E/729c9df35de62fc8146f37e8a5a108b7/image__13_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9PG_Q9pE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/2HcHZaoaWn2IofkjLdyc4E/729c9df35de62fc8146f37e8a5a108b7/image__13_.png" alt="image 13"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Calculating Risk Score, Every Spot in SF
&lt;/h2&gt;

&lt;p&gt;Now, let’s try calculating a score for each of these points, just like we did above for City Hall:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--aF68xCZr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/24f9gHCNt1B3gb4ks16gKs/7ed9901ffc9fce657322e86287d35726/image__14_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--aF68xCZr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/24f9gHCNt1B3gb4ks16gKs/7ed9901ffc9fce657322e86287d35726/image__14_.png" alt="image 14"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And there we have it! A parking risk score for each street segment in SF. This is a heavy query, so to lighten the load we’ve actually sampled 5% of each streets and incidents.&lt;/p&gt;

&lt;p&gt;(Coming soon to Rockset: geo-indexing—watch out for a blog post about that in the coming weeks!)&lt;/p&gt;

&lt;p&gt;Let’s stash the results of this query in another collection so that we can use it to calculate percentiles. We first create a new empty collection:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--IRkR-cSu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/2hh1DMQefrR7cnLxsnuRs6/43598f6aa8953109a78901178503a7a9/image__15_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--IRkR-cSu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/2hh1DMQefrR7cnLxsnuRs6/43598f6aa8953109a78901178503a7a9/image__15_.png" alt="image (15)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now we run an &lt;code&gt;INSERT INTO sf_risk_scores SELECT ...&lt;/code&gt; query, bumping up to 10% sampling on both incidents and streets:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xuwVpmoC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/30cftaWgbKF8cOJc9Gdjru/fbeb23885fe41d921197d19ce192d92d/image__17_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xuwVpmoC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/30cftaWgbKF8cOJc9Gdjru/fbeb23885fe41d921197d19ce192d92d/image__17_.png" alt="image (17)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Ranking Risk Score as Percentile
&lt;/h2&gt;

&lt;p&gt;Now let’s get a percentile for City Hall against the sample we’ve inserted into &lt;code&gt;sf_risk_scores&lt;/code&gt;. We keep our spot score calculation as we had at first, but now also count what percent of our sampled parking spots are safer than the current spot.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0Am1F8cr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/4u9R298T4GXo1WykVwPWgw/70702ed7259413311a3b935e5e676908/image__16_.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0Am1F8cr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://images.ctfassets.net/1d31s1aajogl/4u9R298T4GXo1WykVwPWgw/70702ed7259413311a3b935e5e676908/image__16_.png" alt="image 16"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Parking-Spot-Risk-Score-as-a-Service
&lt;/h2&gt;

&lt;p&gt;Now that we have an arguably useful query, let’s turn it into an app!&lt;/p&gt;

&lt;p&gt;We’ll keep it simple—we’ll create an AWS Lambda function that will serve two types of requests. On &lt;code&gt;GET&lt;/code&gt; requests, it will serve a local &lt;code&gt;index.html&lt;/code&gt; file, which serves as the UI. On &lt;code&gt;POST&lt;/code&gt; requests, it will parse query params for &lt;code&gt;lat&lt;/code&gt; and &lt;code&gt;lon&lt;/code&gt; and pass them on as parameters in the last query above. The lambda code looks like this:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;For the client-side, we write a script to fetch the browser's location and then call the backend:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;We add some basic HTML elements and styling, and it’s ready to ship!&lt;/p&gt;

&lt;p&gt;To finish it off, we add API Gateway as a trigger for our lambda and drop a Rockset API key into the environment, which can all be done in the AWS Console.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;To summarize what we did here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We took two fairly straightforward datasets—one for incidents reported by SPFD and one for parking spots reported by SFMTA—and loaded the data into Rockset.&lt;/li&gt;
&lt;li&gt;Several iterations of SQL later, we had an API we could call to fetch a risk score for a given geolocation. &lt;/li&gt;
&lt;li&gt;We wrote some simple code into an AWS Lambda to serve this as a mobile web app.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The only software needed was a web browser (download the data, query in Rockset Console, and deploy in AWS Console), and all told this took less than a day to build, from idea to production. The source code for the lambda is available &lt;a href="https://github.com/rockset/recipes/tree/master/sf-parking"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you park in SF, feel free to try out the app &lt;a href="https://rockset.com/sf-parking"&gt;here&lt;/a&gt;. If not, comment with what kind of real-life decisions you think could be improved with more data!&lt;/p&gt;

</description>
      <category>aws</category>
      <category>showdev</category>
      <category>serverless</category>
    </item>
  </channel>
</rss>
