<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Peter Miller</title>
    <description>The latest articles on DEV Community by Peter Miller (@phm200).</description>
    <link>https://dev.to/phm200</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F314826%2Fe668e2f9-2c11-488b-9a79-8732dc1d1be3.jpeg</url>
      <title>DEV Community: Peter Miller</title>
      <link>https://dev.to/phm200</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/phm200"/>
    <language>en</language>
    <item>
      <title>Mapbox's Sheet Mapper with D3</title>
      <dc:creator>Peter Miller</dc:creator>
      <pubDate>Wed, 13 May 2020 15:58:32 +0000</pubDate>
      <link>https://dev.to/phm200/mapbox-s-sheet-mapper-with-d3-2bep</link>
      <guid>https://dev.to/phm200/mapbox-s-sheet-mapper-with-d3-2bep</guid>
      <description>&lt;p&gt;As &lt;a href="https://killedbygoogle.com/"&gt;Killed By Google&lt;/a&gt; dramatically illustrates, Google frequently creates and then shuts down products and APIs. This culture of rapid evolution can leave folks behind if they are not paying attention. Today, I'm taking a look at how this could happen for a nice tool from Mapbox and how to quickly avoid it.&lt;/p&gt;

&lt;p&gt;In the spring of 2020, Mapbox introduced &lt;a href="https://www.mapbox.com/impact-tools/sheet-mapper"&gt;Sheet Mapper&lt;/a&gt;, a tool that displays points of interest (POI) on a map. That covers the mapper part; for the sheet part, the tool uses &lt;a href="https://github.com/jsoma/tabletop"&gt;Tabletop.js&lt;/a&gt; to read POI data from a Google Sheet. All told, a quick and dirty visualization tool that requires little programming background to get started.&lt;/p&gt;

&lt;p&gt;Subsequently, Mapbox revised Sheet Mapper with &lt;a href="https://www.mapbox.com/impact-tools/sheet-mapper-advanced-caching"&gt;Sheet Mapper Advanced&lt;/a&gt;, adding caching with S3 and Lambda. In Sheet Mapper Advanced, the app reads POIs from CSV files on S3 using the &lt;a href="https://github.com/d3/d3"&gt;D3.js&lt;/a&gt; library. A more scalable and robust solution for sure.&lt;/p&gt;

&lt;p&gt;However, the original Sheet Mapper is still a fun tool that unfortunately is going to break as of September 2020 when &lt;a href="https://cloud.google.com/blog/products/g-suite/migrate-your-apps-use-latest-sheets-api"&gt;Google turns off the version of the Sheets API&lt;/a&gt; that Tabletop.js uses. The creator of Tabletop.js has a short section in the readme with a workaround using the &lt;a href="https://www.papaparse.com/"&gt;Papa Parse&lt;/a&gt; library.&lt;/p&gt;

&lt;p&gt;I created a &lt;a href="https://github.com/phm200/sheetmapper-d3"&gt;repo in Github&lt;/a&gt;, replacing Tabletop.js with D3.js to follow Sheet Mapper Advanced. The changes were minimal, copied from Sheet Mapper Advanced, including publishing the Sheet as CSV and using d3.csv() to load the data.&lt;/p&gt;

&lt;p&gt;You can see the result online at &lt;a href="https://phm200.github.io/sheetmapper-d3/"&gt;https://phm200.github.io/sheetmapper-d3/&lt;/a&gt; and check out the repo for more details.&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--i3JOwpme--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/github-logo-ba8488d21cd8ee1fee097b8410db9deaa41d0ca30b004c0c63de0a479114156f.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/phm200"&gt;
        phm200
      &lt;/a&gt; / &lt;a href="https://github.com/phm200/sheetmapper-d3"&gt;
        sheetmapper-d3
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Mapbox's Sheet Mapper using D3.js instead of Tabletop.js
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
sheetmapper-d3&lt;/h1&gt;
&lt;p&gt;Mapbox's &lt;a href="https://www.mapbox.com/impact-tools/sheet-mapper" rel="nofollow"&gt;Sheet Mapper&lt;/a&gt; using D3.js instead of Tabletop.js&lt;/p&gt;
&lt;p&gt;Mapbox's Sheet Mapper impact tool is a quick way to:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Create a live-updating map that displays the locations of all of your POIs or events, powered by a simple spreadsheet.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Mapbox's &lt;a href="https://github.com/mapbox/impact-tools/blob/1696b19fc5d3ed8872756f1a58a121293410ea4b/Sheet-Mapper-Sample-Code.html"&gt;code template&lt;/a&gt; (as of publishing this repo on May 13th, 2020) uses &lt;a href="https://github.com/jsoma/tabletop"&gt;Tabletop.js&lt;/a&gt; to read data from a Google Sheet.&lt;/p&gt;
&lt;p&gt;As of September 2020, Google is retiring v3 of the Sheets API and Tabletop.js will no longer work. The creator recommends using the &lt;a href="https://www.papaparse.com/" rel="nofollow"&gt;Papa Parse&lt;/a&gt; library instead.&lt;/p&gt;
&lt;p&gt;Given that Mapbox uses D3.js for the &lt;a href="https://www.mapbox.com/impact-tools/sheet-mapper-advanced-caching" rel="nofollow"&gt;Sheet Mapper Advanced&lt;/a&gt; impact tool, I updated Sheet Mapper to follow suit and published this code as reference.&lt;/p&gt;
&lt;p&gt;See a live preview at: &lt;a href="https://phm200.github.io/sheetmapper-d3/" rel="nofollow"&gt;https://phm200.github.io/sheetmapper-d3/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;NOTE:&lt;/em&gt; With the change to D3.js, this sample cannot be run directly from an HTML file on your local disk. You need to run it off a server, local or online&lt;/p&gt;
&lt;/div&gt;



&lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/phm200/sheetmapper-d3"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;
&lt;br&gt;


</description>
    </item>
    <item>
      <title>Range and T-Shaped Things</title>
      <dc:creator>Peter Miller</dc:creator>
      <pubDate>Wed, 08 Apr 2020 16:17:03 +0000</pubDate>
      <link>https://dev.to/phm200/range-and-t-shaped-things-12jl</link>
      <guid>https://dev.to/phm200/range-and-t-shaped-things-12jl</guid>
      <description>&lt;p&gt;One of the fascinating points in his wonderful book, &lt;a href="https://davidepstein.com/the-range/"&gt;Range: Why Generalists Triumph In A Specialized World&lt;/a&gt;, is when David Epstein explains why a "T-shaped" person with a broad base of knowledge across many subjects is often more innovative and makes better decisions within their area of specialty.&lt;/p&gt;

&lt;p&gt;Epstein finds that a breadth of knowledge encourages a person to make connections, think laterally and avoid the group-think of other experts in their field. This agile thinking is particularly important for solving what Epstein calls "wicked" problems, problems that are unique, complex and not rule-bound. This describes a lot of the most important problems out there and is contrast to "kind" problems, that have well defined rules and are repeatable.&lt;/p&gt;

&lt;p&gt;For example, a golf swing is kind (despite what many golfers may think) because the general parameters remain the same every time. You aren't often given a shovel to use in the back 9. You can practice and perfect the relevant techniques. &lt;/p&gt;

&lt;p&gt;Examples of wicked problems include geo-political conflicts and the famous examples of decision making gone wrong like the Challenger space shuttle disaster. Think novel situations with imperfect information and clouded motivations.&lt;/p&gt;

&lt;h1&gt;
  
  
  T-Shaped Software Design
&lt;/h1&gt;

&lt;p&gt;For (much, much) lower stakes, the concepts of T-shaped and having broad range helped me articulate an advantage I thought my previous company had while working on contact center projects. We were brought in as subject matter experts on implementing contact center software like Twilio. Clients expected us to implement a contact center that was cost-effective in a timely fashion. &lt;/p&gt;

&lt;p&gt;What ended up frequently happening was that our teams would dig into the problem as presented and surprise the client by suggesting non-contact center solutions. For example, we'd suggest ways to improve an app experience or streamline a process such that there was less need for customers to even use the contact center.&lt;/p&gt;

&lt;p&gt;Our advantage here was having teams with broad consulting backgrounds who didn't live and breathe just contact center all the time. Having developed lots of types of apps for lots of business scenarios, we had the range and the T-shaped knowledge to connect our current problem to past ones and propose novel solutions. And it was really the combination of deep knowledge about the contact center platform and broad knowledge of other contexts that gave us the edge.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Range&lt;/em&gt; has a lot more going for it that I didn't cover here. Give it a spin and I think you'll learn even more about how diversity of thought, experience and background can give you and your teams an edge as well.&lt;/p&gt;

&lt;p&gt;As always happy reading!&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Getting To Yes: A Classic with a Caveat</title>
      <dc:creator>Peter Miller</dc:creator>
      <pubDate>Fri, 20 Mar 2020 18:53:54 +0000</pubDate>
      <link>https://dev.to/phm200/getting-to-yes-a-classic-with-a-caveat-2j47</link>
      <guid>https://dev.to/phm200/getting-to-yes-a-classic-with-a-caveat-2j47</guid>
      <description>&lt;p&gt;Robert Fisher and William Ury's book, &lt;a href="https://www.amazon.com/Getting-Yes-Negotiating-Agreement-Without/dp/0143118757"&gt;Getting to Yes: Negotiating Agreement Without Giving In&lt;/a&gt;, is a classic study of how to conduct successful negotiations. It has gone through several revisions over the past 30 years and continues to be a best seller.&lt;/p&gt;

&lt;p&gt;As the authors address in the forward, folks doing creative work sometimes dismiss negotiation as a specialized activity and not part of our everyday life. Instead, the authors point out how we are constantly negotiating, not just on a contract, but on work-life balance, how to divvy up work for a project, how to co-exist in a relationship.&lt;/p&gt;

&lt;p&gt;Having established the prevalence of negotiation, the authors address another misperception, which is that negotiation is all about tips and tricks to push the other side into accepting your demands. This is the stereotype of the shady salesperson who uses psychological tactics to pressure you into moving off your price to theirs. Rather than this type of behavior being central to successful negotiation, the authors identify bargaining over positions as the problem with most negotiation. Instead, they recommend focusing on the underlying interests behind given positions, focusing on the problem, not the people, looking for mutual gains and using objective criteria to evaluate options.&lt;/p&gt;

&lt;p&gt;If you've done any software consulting or product development, you've probably done a lot of negotiation and I'll illustrate a few of these tips with some anecdotes from my past experiences.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Focus on Interests, Not Positions&lt;/em&gt;&lt;br&gt;
While working on a contact center implementation for a customer, I was tasked with gathering requirements and implementing a solution for displaying realtime metrics. The customer's opening ask was  30 to 40 metrics, each of which "had to be" updated every few seconds. This was going to be a large effort and a natural way to proceed would have been to haggle down the number of metrics between the two sides.&lt;/p&gt;

&lt;p&gt;My team dug deeper into why those metrics were needed in that time frame. Turns out the initial ask was a wishlist from several groups who thought they were listing just what their wildest dreams would be. That got translated into a hard requirement along the way. Once we got past that, we could focus on the customer's interest, around managing the agent population against activity surges and populating performance reports. From there we prioritized and shrank the work ahead of us and could move forward. Instead of trying to win by getting the list down to 20 instead of 40, we exited that bad game and moved into a better one.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Disentangle the People from the Problem&lt;/em&gt;&lt;br&gt;
Consulting projects can be tough. Sometimes you end up with a "difficult" client, where interactions are fraught and the project life is stressful. In these projects, it is hard to get through negotiations. The relationship is not setup for a natural give and take. I was in such a project a while ago and even the most basic discussions of ongoing work were difficult. &lt;/p&gt;

&lt;p&gt;Personal insults were flung around and I was tempted to throw up my hands and write the project off, but got some good advice from a colleague along the lines of Fisher and Ury's book. To focus on the problem, not the people. In this case, we paused to take an inventory of how we were running the project vs how the client typically worked. We figured out that a major sticking point was that we perceived as being open about our development process seemed confusing and unstructured to them. This led to a lack of trust in our work.&lt;/p&gt;

&lt;p&gt;We started providing extremely clear and structured updates on what we were doing; we got way stricter on how meetings were run and on only discussing the work items in flight. These adjustments helped rebuild the client relationship. Our acting more "rigid" was actually a comfort to the client. The same people were no longer as difficult and the project could move forward.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The Value of Walking Away&lt;/em&gt;&lt;br&gt;
The only critique I have of Fisher and Ury's book is that it does not focus enough on the value of walking away. It is a book focused on solutions, on making negotiations work and has a hopeful tone throughout. The authors acknowledge that it can be difficult to negotiate this way if the other party is not willing, but only glance off the possibility of the best outcome being a failed negotiation with the idea of a BATNA or best alternative to a negotiated agreement.&lt;/p&gt;

&lt;p&gt;A BATNA lets you evaluate how far you are willing to go in a negotiation, and how good the outcome really is, with the implication that you should walk away if the BATNA is better. However, the BATNA is situated in the context of two parties where one party is more powerful or unwilling to negotiate.&lt;/p&gt;

&lt;p&gt;From my past experience, even a negotiation in good faith, with parties looking for the win-win can end up the worse for all sides. It can be rushed, circumstances change midway through or any number of other reasons. I'd love to see more from these authors on these type of scenarios.&lt;/p&gt;

&lt;p&gt;As always happy reading!&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Software Lessons from Scarcity</title>
      <dc:creator>Peter Miller</dc:creator>
      <pubDate>Fri, 13 Mar 2020 14:57:15 +0000</pubDate>
      <link>https://dev.to/phm200/software-lessons-from-scarcity-book-7i7</link>
      <guid>https://dev.to/phm200/software-lessons-from-scarcity-book-7i7</guid>
      <description>&lt;p&gt;Sendir Mullainathan and Eldar Shafir's book &lt;a href="https://www.amazon.com/Scarcity-Science-Having-Defines-Lives/dp/125005611X/"&gt;&lt;em&gt;Scarcity: The New Science of Having Less and How it Defines Our Lives&lt;/em&gt;&lt;/a&gt; is a wonderful achievement and a great read for anyone with an interest in psychology and behavioral economics. Mullainathan and Sharif present a novel frame for the common problem of scarcity, of not having enough money, time or other resource. &lt;/p&gt;

&lt;p&gt;Their insight is that scarcity taxes our attention, what they call a bandwidth tax, and causes us to narrowly focus our (compromised) attention on the most immediate problem ahead, what they call tunneling. The result is consistent and predictably poor decision making by those facing scarcity. Poor decision making hinders getting more of the scarce resource and so in the end, scarcity systematically creates more scarcity.&lt;/p&gt;

&lt;p&gt;Mullainathan and Sharif use descriptive anecdotes and experimental data to support their thesis. Again, worth your time to check out if you liked books like &lt;em&gt;Thinking Fast and Slow&lt;/em&gt; by Daniel Kahneman or &lt;em&gt;Freakonomics&lt;/em&gt; by Steven D. Levitt and Stephen J. Dubner.&lt;/p&gt;

&lt;p&gt;The idea of scarcity is an interesting lens to apply to software development as well. Here's a few thoughts that came to mind, some conventional wisdom, some not, all informed by scarcity.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Organizations and teams are right to cautiously adopt new technologies&lt;/em&gt;&lt;br&gt;
As a long-time software consultant, I'm used to hearing complaints, and complaining myself about a client that seems stuck in the mind, intent on using what seems like a Stone Age tech stack. How can they not realize how much better, cooler, faster new technology X is?&lt;/p&gt;

&lt;p&gt;From a scarcity perspective, sticking with a known solution can be a smart strategy. In the context of a consulting project, it is often the case that the teams' bandwidth is limited by time or money. Operating under this bandwidth tax, the team doesn't have enough capacity to fairly evaluate a new tech approach, in addition to implementing the specific deliverable. In contrast, if those tech decisions are already made, the team can focus their limited attention on the business value.&lt;/p&gt;

&lt;p&gt;This does not mean orgs and teams are always right to be cautious. Creative organizations will find a way to give the right people enough time and support to make a well reasoned evaluation of new technology. Organizations that want to innovate can also build more slack into their timelines. Slack is a critical way to mitigate the mistakes that come about in a scarce environment. A team that has enough time to make a mistake is often one that can learn from it.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;One person cannot shape and implement at the same time&lt;/em&gt;&lt;br&gt;
I've long been a huge fan of Basecamp, formally known as 37Signals. Recently, Basecamp released guidance on their software development lifecycle, &lt;a href="https://basecamp.com/shapeup"&gt;Shape Up&lt;/a&gt;. A key facet of their process is that a small group shapes (defines) the parameters of a small cycle of work and then another group implements that work. Once the pitch for the work is complete and approved, the implementing team has freedom within the pitch definition to implement it.&lt;/p&gt;

&lt;p&gt;Having these &lt;a href="https://basecamp.com/shapeup/1.1-chapter-02#two-tracks"&gt;two tracks&lt;/a&gt;, of shaping and building, makes perfect sense from a scarcity perspective. When leading a team to build an application, I'm focused (tunneled) into the implementation. If I'm also trying to figure out what we're building, one is going to get short-changed. Our brains are good at focusing and we can be incredibly productive in the tunnel, but at the expense of items outside it. Whatever track we are on, our brain wants to get back to and will shortchange the other track.&lt;/p&gt;

&lt;p&gt;This insight seems mundane on the surface, but in my experience it is quite common for technical leads on projects to be in charge of both implementing the current phase and planning the next one. While they may have the skill to do both, the expectation that those different tracks can occur in parallel without a loss of quality in one or the other is misleading. &lt;/p&gt;

&lt;p&gt;&lt;em&gt;Even LeBron James needs rest days&lt;/em&gt;&lt;br&gt;
If NBA teams were run more like software projects, then star players like LeBron James would never be given a rest day, or limited minutes. Why would you take your best performer off the court? What seems obvious in a physical undertaking like basketball, that overwork, a scarcity of rest, leads to injury or poor performance is just as true for mental work. From &lt;em&gt;Scarcity&lt;/em&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;...our effects [of scarcity] correspond to between 13 and 14 IQ points... losing 13 points can take you from "average" to a category labeled "borderline deficient"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;More to the point of high performers, the (temporary) loss of IQ is also enough to take someone from "superior" to "average". This effect has nothing to do with that person's inherent grit or toughness. Put the same person in a better, more abundant situation can perform to their potential.&lt;/p&gt;

&lt;p&gt;To put it another way, when a team is told to consistently put in extra hours, the implicit message is that we are no longer concerned about the quality of the work, we just hope to get the work done at any quality level in a given calendar timeframe. For software consultancies that differentiate on quality of work, this doesn't sound too appealing.&lt;/p&gt;

&lt;p&gt;There's a lot more in &lt;em&gt;Scarcity&lt;/em&gt; that I didn't cover here. And I'm sure other sources to provide contrary lenses on these points. Keep reading and learning, and as always leave me comments and questions below. Thanks!&lt;/p&gt;

</description>
      <category>books</category>
    </item>
    <item>
      <title>More on Geohashing: Covering and an updated DynamoDB library</title>
      <dc:creator>Peter Miller</dc:creator>
      <pubDate>Wed, 05 Feb 2020 20:54:06 +0000</pubDate>
      <link>https://dev.to/phm200/more-on-geohashing-covering-and-an-updated-dynamodb-library-2p3n</link>
      <guid>https://dev.to/phm200/more-on-geohashing-covering-and-an-updated-dynamodb-library-2p3n</guid>
      <description>&lt;p&gt;&lt;em&gt;This is the third part in a series of posts. See the links above to jump to the prior posts&lt;/em&gt; &lt;/p&gt;

&lt;h1&gt;
  
  
  S2 vs. H3 Covering
&lt;/h1&gt;

&lt;p&gt;In my last post I talked through how we can use S2 cells to fill or cover a shape on the map. In this case a circle, within which we want to search for items of interest. I also mentioned that Uber came up with H3 cells, a technique for covering a shape on the map with hexagons.&lt;/p&gt;

&lt;p&gt;The images below show S2 and H3 coverings of the same area in Washington, DC.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;S2 Covering&lt;/strong&gt; (shown in blue)&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fygpmv1nhkwxwirwdx163.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fygpmv1nhkwxwirwdx163.png" alt="S2 Covering downtown DC"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;H3 Covering&lt;/strong&gt; (shown in red)&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fkmh78a2t8pr8f3cdcbo7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fkmh78a2t8pr8f3cdcbo7.png" alt="H3 Covering downtown DC"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The S2 covering is composed of cells of different sizes or levels. Some are larger cells that fit mostly into the circle, others are smaller cells that help cover the edges. This means fewer cells overall are needed to cover the space vs the H3 example, albeit with the downside of a more uneven covering compared to what we see in H3.&lt;/p&gt;

&lt;p&gt;With that somewhat uneven covering, we can see how important that distance calculation is in the final step of our S2 geo-search from the prior post. It excludes those points of interest that came from the outer edges of covering cells.&lt;/p&gt;

&lt;p&gt;These images also illustrate how S2 cells can nest. Any given S2 cell will fit completely into its parent cell at a higher level. S2 cells nesting allow us to calculate coverings with diverse cell sizes. Nesting also would allow us to roll up location based data into larger aggregates. If data is stored at a small (high level) cell, then we can roll up that data to a neighborhood, city or region level by summing up the data from all the child cells in a bigger cell that covers the target area.&lt;/p&gt;

&lt;p&gt;In contrast while H3 provides a more precise covering (at least a high resolution), hexagons do not nest. As we scale up levels, the hexagons overlap incompletely. So we cannot have a meaningful H3 covering with diverse hexagon sizes. Without nesting, aggregation is not as automatic. While we can still roll up data from the smaller hexagons, we cannot map it directly to larger ones.&lt;/p&gt;
&lt;h2&gt;
  
  
  Dash-Labs dynamodb-geo
&lt;/h2&gt;

&lt;p&gt;While watching &lt;a href="https://t.co/oYKSyktKni?amp=1" rel="noopener noreferrer"&gt;a brilliant discussion on Twitch&lt;/a&gt; between DynamoDB gurus &lt;a href="https://twitter.com/houlihan_rick" rel="noopener noreferrer"&gt;Rick Houlihan&lt;/a&gt; and &lt;a href="https://twitter.com/alexbdebrie" rel="noopener noreferrer"&gt;Alex DeBrie&lt;/a&gt;, at around an hour in, Rick dropped a quick reference to an AWS customer who had taken the DynamoDB Geo library I looked at last post and improved upon it. Thanks to twitch user switch184 in the comments for pointing me to the repo at: &lt;a href="https://github.com/Dash-Labs/dynamodb-geo" rel="noopener noreferrer"&gt;https://github.com/Dash-Labs/dynamodb-geo&lt;/a&gt;&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev.to%2Fassets%2Fgithub-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/Dash-Labs" rel="noopener noreferrer"&gt;
        Dash-Labs
      &lt;/a&gt; / &lt;a href="https://github.com/Dash-Labs/dynamodb-geo" rel="noopener noreferrer"&gt;
        dynamodb-geo
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;p&gt;#Geo Library for Amazon DynamoDB&lt;/p&gt;
&lt;p&gt;&lt;a href="https://travis-ci.com/Dash-Labs/dynamodb-geo" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/01016e215fe5b2b3ab3d5bcf4a333976f48c2daadd1369eac41b583f8a32ff94/68747470733a2f2f7472617669732d63692e636f6d2f446173682d4c6162732f64796e616d6f64622d67656f2e7376673f6272616e63683d6d6173746572" alt="Build Status"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This library was forked from the &lt;a href="http://awslabs.github.io/dynamodb-geo/" rel="nofollow noopener noreferrer"&gt;AWS geo library&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Following limitations with the aws geo library were the main reasons that necessitated this fork:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Usage required a table’s hash and range key to be replaced by geo data. This approach is not feasible as it cannot be used when performing user-centric queries, where the hash and range key have to be domain model specific attributes.&lt;/li&gt;
&lt;li&gt;Developed prior to GSI, hence only used LSI&lt;/li&gt;
&lt;li&gt;No solution for composite queries. For e.g. “Find something within X miles of lat/lng AND category=‘restaurants’;&lt;/li&gt;
&lt;li&gt;The solution executed the queries and returned the final result. It did not provide the client with any control over the query execution.&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;What methods are available for geo-querying in this library?&lt;/h2&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Query for a given lat/long&lt;/li&gt;
&lt;li&gt;Radius query&lt;/li&gt;
&lt;li&gt;Box/Rectangle query&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All of the above queries can be run as composite queries, depending on their…&lt;/p&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Dash-Labs/dynamodb-geo" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;While this fork of the original dynamodb-geo is in Java, not JavaScript, it is worth your time to take a look. There are many improvements, which are summarized in the repo's readme. &lt;/p&gt;

&lt;p&gt;One bit of documentation that stuck out to me was more discussion on hash key length, including the number of queries the library produces for a radius search with a given hash key. Something that tripped me up at first is that the geohash key they are talking about here is the first X digits of an S2 cell id, and the length of that key does not match 1:1 to the level of the cell. As the Dash-Labs repo suggests, a 5 or 6 digit long geohash key is well suited for near proximity searches. I still struggle with understanding the math behind that assertion, but the sample query results are convincing.&lt;/p&gt;

&lt;p&gt;Thanks for reading and happy geohashing!&lt;/p&gt;

</description>
      <category>geo</category>
      <category>guide</category>
      <category>dynamodb</category>
    </item>
    <item>
      <title>The Problem of Nearness: Part 2 - A Solution with S2</title>
      <dc:creator>Peter Miller</dc:creator>
      <pubDate>Fri, 24 Jan 2020 19:32:00 +0000</pubDate>
      <link>https://dev.to/phm200/the-problem-of-nearness-part-2-a-solution-with-s2-23gm</link>
      <guid>https://dev.to/phm200/the-problem-of-nearness-part-2-a-solution-with-s2-23gm</guid>
      <description>&lt;p&gt;&lt;em&gt;This is part 2 of a series, for the full context see the &lt;a href="https://dev.to/untilawesome/the-problem-of-nearness-part-1-geohash-4hh8"&gt;first post&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;We covered a lot of theory and math around calculating distance last time. For today's post, we will focus on some of the details of an implementation I referenced. &lt;/p&gt;

&lt;h1&gt;
  
  
  Caffeinate-Me by &lt;a class="mentioned-user" href="https://dev.to/jbesw"&gt;@jbesw&lt;/a&gt;
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://read.acloud.guru/location-based-search-results-with-dynamodb-and-geohash-267727e5d54f" rel="noopener noreferrer"&gt;Location-based search results with DynamoDB and Geohash [Medium]&lt;/a&gt;, [Caffeinate me! Build a serverless app to find the nearest Starbucks &lt;a href="https://dev.tosimilar%20to%20first%20post,%20but%20more%20on%20the%20front-end"&gt;Medium&lt;/a&gt;](&lt;a href="https://medium.com/swlh/caffeinate-me-build-a-serverless-app-to-find-the-nearest-starbucks-54512124e639" rel="noopener noreferrer"&gt;https://medium.com/swlh/caffeinate-me-build-a-serverless-app-to-find-the-nearest-starbucks-54512124e639&lt;/a&gt;), and if you are interested in how it scales &lt;a href="https://itnext.io/will-it-scale-lets-load-test-geohashing-on-dynamodb-fbdc612d9ec3" rel="noopener noreferrer"&gt;Will it scale? Let’s load test geohashing on DynamoDB [Medium]&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.youtube.com/watch?v=GZGAoeD1TS0" rel="noopener noreferrer"&gt;Using geospatial searches with DynamoDB [YouTube]&lt;/a&gt; and &lt;a href="https://www.youtube.com/watch?v=y-mQtpB-a6g" rel="noopener noreferrer"&gt;Caffeinate me! Using VueJS to query your API Gateway [YouTube]&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://gitlab.com/jbesw/askjames-caffeinate-me" rel="noopener noreferrer"&gt;Caffeinate-Me backend API repo [Gitlab]&lt;/a&gt; and &lt;a href="https://gitlab.com/jbesw/askjames-caffeinate-me-frontend" rel="noopener noreferrer"&gt;front-end Vue JS repo [Gitlab]&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://twitter.com/jbesw" rel="noopener noreferrer"&gt;James Beswick&lt;/a&gt; is a developer advocate at AWS for Serverless. He wrote three fantastic posts about using DynamoDB to implement location-based searches, accompanied by explanatory videos and the implementing Git repo's. Please take a moment to check out his work, it is excellent. I learned a lot playing around with it and also found a few interesting items that I highlight below.&lt;/p&gt;

&lt;h1&gt;
  
  
  "Geohash" and Google's S2 Geometry
&lt;/h1&gt;

&lt;p&gt;My prior post talked about geohashing, specifically the canonical implementation of GeoHash by Gustavo Niemeyer using alphanumeric hashes to address a grid of nested squares and rectangles that covers the Earth.&lt;/p&gt;

&lt;p&gt;The DynamoDB geo library that James' Starbucks locator uses &lt;a href="https://aws.amazon.com/blogs/mobile/geo-library-for-amazon-dynamodb-part-1-table-structure/" rel="noopener noreferrer"&gt;does not use that geohash algorithm&lt;/a&gt;. Instead, it uses &lt;a href="https://s2geometry.io/about/overview" rel="noopener noreferrer"&gt;Google's S2 Geometry&lt;/a&gt; for addressing locations. I promised less math, so the big takeaway to focus on here is that points of interest in S2 are placed in cells, that like our squares and rectangles from geohash are nested and cover the Earth's surface. S2 cells are addressed by 64-bit integers (not alphanumeric strings) and certain distance and covering calculations are much faster than with GeoHash. &lt;/p&gt;

&lt;p&gt;For a more detailed look at how S2 works, including a fun animated gif of the Hilbert Curve, check out &lt;a href="http://blog.christianperone.com/2015/08/googles-s2-geometry-on-the-sphere-cells-and-hilbert-curve/" rel="noopener noreferrer"&gt;Christian Perone's post on S2&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;An important concept in S2 is "covering" geographic shapes. This means identifying the neighboring cells that when tiled together, fill (or nearly fill) the specified shape. For (math) reasons, generating a range of covering cells can be done very quickly with S2. To nerd out a bit more on S2 vs. geohash, check out &lt;a href="https://blog.nobugware.com/post/2016/geo_db_s2_geohash_database/" rel="noopener noreferrer"&gt;this post from Fabrice Aneche&lt;/a&gt;. &lt;a href="https://docs.google.com/presentation/d/1Hl4KapfAENAOf4gv-pSngKwvS_jwNVHRPZTTDzXXn6Q/view#slide=id.i130" rel="noopener noreferrer"&gt;Google's presentation on S2&lt;/a&gt; has a nice visual of what covering (referred to as approximating regions here) looks like in practice.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fdjyglgdtxhx65s6x0h86.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fdjyglgdtxhx65s6x0h86.png" alt="Approximating Regions Using S2 Cells"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When I was trying to reach the end of the internet while writing this post, I came across an additional geohashing or spatial library. As any old school war-gamer knows, hexagonal grids are way cooler than square grids, and sure enough, Uber created &lt;a href="https://uber.github.io/h3/#/" rel="noopener noreferrer"&gt;H3&lt;/a&gt; a "hexagonal hierarchical geospatial indexing system". I'll leave the details for you to look into if you are interested, but Uber states it is a good fit for &lt;a href="https://uber.github.io/h3/#/documentation/overview/use-cases" rel="noopener noreferrer"&gt;use cases&lt;/a&gt; like analysis of "locations of cars in a city". If that's your thing.&lt;/p&gt;

&lt;p&gt;Now that we know we are working with S2, let's check out some of the details of James' Caffeinate-Me app.&lt;/p&gt;

&lt;h1&gt;
  
  
  Finding Items within a Circle
&lt;/h1&gt;

&lt;p&gt;When using the Caffeinate-Me app, you click around a map and are shown all the Starbucks that are within a circle centered on your click. The code to get the Starbucks within that circle is shown below (from query.js):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;

&lt;span class="nx"&gt;myGeoTableManager&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;queryRadius&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;RadiusInMeter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
    &lt;span class="na"&gt;CenterPoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;latitude&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;40.7769099&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;longitude&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;73.9822532&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The queryRadius method on the GeoDataManager.js shows how the dynamodb-geo package breaks down this request:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;

     &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;param&lt;/span&gt; &lt;span class="nx"&gt;queryRadiusInput&lt;/span&gt;
     &lt;span class="o"&gt;*&lt;/span&gt;    &lt;span class="nx"&gt;Container&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;the&lt;/span&gt; &lt;span class="nx"&gt;necessary&lt;/span&gt; &lt;span class="nx"&gt;parameters&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt; &lt;span class="nx"&gt;execute&lt;/span&gt; &lt;span class="nx"&gt;radius&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
     &lt;span class="o"&gt;*&lt;/span&gt;
     &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;Result&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;radius&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
     &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;    &lt;span class="nx"&gt;GeoDataManager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;prototype&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queryRadius&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;queryRadiusInput&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;_this&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;latLngRect&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;S2Util_1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;S2Util&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getBoundingLatLngRectFromQueryRadiusInput&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;queryRadiusInput&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;covering&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Covering_1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Covering&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;S2RegionCoverer&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;getCoveringCells&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;latLngRect&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dispatchQueries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;covering&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;queryRadiusInput&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;_this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filterByRadius&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;queryRadiusInput&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In pseudo-code:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;(Line 8) Get a rectangle that defines the min and max latitude and longitudes of a bounding box that encloses a circle of the specified RadiusInMeter from the center point&lt;/li&gt;
&lt;li&gt;(Line 9) Get a collection of S2 cell addresses (hashes) that cover this rectangle of space&lt;/li&gt;
&lt;li&gt;(Line 10-100) Query DynamoDB to retrieve the Starbucks within the specified S2 cells and then drop the Starbucks that were part of the covering rectangle, but beyond the radius of the circle&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The S2 library handily takes care of the details of #1 and #2. More specifically, to back to questions from my first post, that getCoveringCells method is figuring out the neighboring geo-bins (cells). Like with GeoHash, &lt;a href="https://s2geometry.io/resources/s2cell_statistics.html" rel="noopener noreferrer"&gt;S2 cells have different levels&lt;/a&gt;, from 0 (huge) to 30 (1cm squared). &lt;/p&gt;

&lt;p&gt;By default, the S2 library will attempt to return 8 S2 cells (possibly at different levels) to cover the given shape. This creates some work for the dispatchQueries method, which has to generate one or more DynamoDB queries per covering cell:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;

&lt;span class="nx"&gt;GeoDataManager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;prototype&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dispatchQueries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;covering&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;geoQueryInput&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;_this&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;promises&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;covering&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getGeoHashRanges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;hashKeyLength&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;range&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;hashKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;S2Manager_1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;S2Manager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateHashKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;range&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rangeMin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;_this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;hashKeyLength&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;_this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dynamoDBManager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;queryGeohash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;geoQueryInput&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;QueryInput&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;hashKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;range&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;promises&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;mergedResults&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
            &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;queryOutputs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;queryOutputs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;queryOutput&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;mergedResults&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;push&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;mergedResults&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;queryOutput&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Items&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;mergedResults&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In pseudo-code:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;(Line 3) Get a collection of geohashes of the table's hash key length that encompasses the covering S2 cells&lt;/li&gt;
&lt;li&gt;(Line 4-5) Setup a DynamoDB query that uses the partition key of the hash key, and the range key of with the covering S2 cell addresses to get all the Starbucks in that part of the covering region&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's a lot to take in. For a concrete example, I added some logging to the library, and from a center point in New York City, I got 8 covering S2 cells, which went 1:1 to 8 DynamoDB queries. For example, one query was of hash key -82501, sort key of S2 cell ids between &lt;strong&gt;-85201&lt;/strong&gt;50788008312831 and &lt;strong&gt;-85201&lt;/strong&gt;41991915290625. Another query was of hash key -85199, sort key of S2 cells ids between &lt;strong&gt;-85199&lt;/strong&gt;82196314865663 to &lt;strong&gt;-85199&lt;/strong&gt;82196180647937.&lt;/p&gt;

&lt;p&gt;This is where selecting the length of the hash key becomes important. As James explains in his post, based on the radius of the circle you are searching in and length of that key, the number of queries against DynamoDB and how much you are hammering individual partitions (hash keys) can vary dramatically.&lt;/p&gt;

&lt;p&gt;The dynamodb-geo library defaults to a 6 digit hash key. James uses a 5 digit hash key in his example. A rule of thumb for these mostly local type of searches seems to be 5 to 7 digits.&lt;/p&gt;

&lt;p&gt;There's a lot more to dig into on S2, the dynamodb-geo library and spatial searches, but for, let's call it a day. Please reach out with any comments or questions. Happy geo searching!&lt;/p&gt;

</description>
      <category>geo</category>
      <category>guide</category>
      <category>dynamodb</category>
    </item>
    <item>
      <title>The Problem of Nearness: Part 1 - Geohash</title>
      <dc:creator>Peter Miller</dc:creator>
      <pubDate>Mon, 20 Jan 2020 21:02:12 +0000</pubDate>
      <link>https://dev.to/phm200/the-problem-of-nearness-part-1-geohash-4hh8</link>
      <guid>https://dev.to/phm200/the-problem-of-nearness-part-1-geohash-4hh8</guid>
      <description>&lt;p&gt;&lt;em&gt;updated 2020-01-24&lt;/em&gt;&lt;br&gt;
&lt;em&gt;-noted that the example in part 2 will use S2, not Geohash&lt;/em&gt;&lt;br&gt;
&lt;em&gt;-removed "relational" term when discussing spatial data types&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Google Maps, Yelp, and Meetup all can help us answer a variation on the same question of "what X is nearest to Y". What subway stop is closest to my friend's house, where's the nearest sushi place, who else is interested in hiking in my town, etc. These sites use geographic data about points of interest, and our current location or a location of our choosing to calculate what's close.&lt;/p&gt;

&lt;p&gt;In this post, I'll discuss how we can make these types of calculations in our own apps. Because of the number of points in these datasets and the number of visitors querying the datasets, we will be looking at not only how to make these calculations, but also how to make them faster and more efficient.&lt;/p&gt;

&lt;p&gt;We'll start by looking at points of interest on a line, then move our way up to 2d planes and then a globe. Along the way we'll encounter geohashing, an elegant solution to the problem of nearness.&lt;/p&gt;

&lt;h1&gt;
  
  
  Staying Close in Lineland
&lt;/h1&gt;

&lt;p&gt;Let's imagine a world with one-dimension. All points are described by a single value, the relative position on that single great line of existence. To avoid &lt;a href="https://en.wikipedia.org/wiki/Flatland#Plot" rel="noopener noreferrer"&gt;any unfortunate incidents&lt;/a&gt;, we can stay in our 3d world and just observe this lineland from afar.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fa4s64lpe14us93z5290m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fa4s64lpe14us93z5290m.png" alt="Points in lineland"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We see 5 points of interest on the line, with the following positions:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Place of Interest: Position on Line
"The Red Fox Tavern": -5
"Charlie's Chicken Shack": -3
"Smoothie Town": 0
"The Wilted Cauliflower": 2
"Eggplant Paradise": 4


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With only one-dimension, the distance between any two points in lineland is simply the difference in their positions. If we want a lineland Yelp clone to answer the question of what restaurants are within 5 units of my house at position 1, we can use a brute-force approach by calculating the distance from every point to the center point:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Distances from Points of Interest to 1
"The Red Fox Tavern": -5 =&amp;gt; 1 - (-5) =&amp;gt; 6
"Charlie's Chicken Shack": -3 =&amp;gt; 1 - (-3) =&amp;gt; 4
"Smoothie Town": 0 =&amp;gt; 1 - 0 =&amp;gt; 1
"The Wilted Cauliflower": 2 =&amp;gt; 1 - 2 =&amp;gt; -1
"Eggplant Paradise": 4 =&amp;gt; 1 - 4 =&amp;gt; -3


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We find three restaurants in the specified range.&lt;/p&gt;

&lt;p&gt;As our dataset of points of interest grows larger, calculating the distance from every point to the center point is more and more work and our algorithm gets slower (or we parallelize the calculation and spend more computation in exchange for time). &lt;/p&gt;

&lt;p&gt;To keep the size of dataset manageable in our example, we can modify our algorithm to return all restaurants in the position range of -4 through 6, taking the farthest allowable distance to the left through the farthest allowable distance to the right.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fkyrl8vsa1dgelk9wa459.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fkyrl8vsa1dgelk9wa459.png" alt="Position range in lineland"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Despite it's range of culinary delights, there's not much going on in lineland, so let's bump up another dimension.&lt;/p&gt;

&lt;h1&gt;
  
  
  Infinite Planes and the Distance Formula
&lt;/h1&gt;

&lt;p&gt;We are now in the world of two-dimensions, with the familiar X and Y coordinate system of geometry and mathematics. We plot points of interest on an infinite plane, with a horizontal X position and vertical Y position. &lt;/p&gt;

&lt;p&gt;In our plane world, we see the same 5 restaurants from lineland, but with positions composed of X and Y pairs:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Place of Interest: (X, Y) coordinates
"The Red Fox Tavern": (-5,8)
"Charlie's Chicken Shack": (-3,-1)
"Smoothie Town": (0,-4)
"The Wilted Cauliflower": (2,7)
"Eggplant Paradise": (4,-6)


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fkp6cfbb1aw2pcujvefgz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fkp6cfbb1aw2pcujvefgz.png" alt="Points in plane"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With two-dimensions the distance between any two points is calculated using a derivation of the Pythagorean theorem, the &lt;a href="https://www.mathwarehouse.com/algebra/distance_formula/index.php" rel="noopener noreferrer"&gt;Distance formula&lt;/a&gt;, the square root of the squares of the differences:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Ff3yjnjo7zuqj2ayzx0tp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Ff3yjnjo7zuqj2ayzx0tp.png" alt="Distance Formula"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In our 2d Yelp clone, we want to find restaurants within 5 units of my house at position (1,2). Again, we start with brute-force and do the calculation for every point to the center point:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Distances to (1,2)
"The Red Fox Tavern": (-5,8) =&amp;gt; sqrt((-5 - 1)*(-5 - 1) + (8 - 2)*(8 - 2)) =&amp;gt; 7.75
"Charlie's Chicken Shack": (-3,-1) =&amp;gt; sqrt(16+9) =&amp;gt; 5
"Smoothie Town": (0,-4) =&amp;gt; sqrt(1+36) =&amp;gt; 6.08
"The Wilted Cauliflower": (2,7) =&amp;gt; sqrt(1+25) =&amp;gt; 5.10
"Eggplant Paradise": (4,-6) =&amp;gt; sqrt(9+64) =&amp;gt; 8.54


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Doing this calculation for 5 points is no big deal, but as the dataset grows larger, our algorithm needs more and more compute capacity and/or time. As with lineland, we address scalability by limiting the size of the dataset. The fewer points to feed into the distance formula, the faster.&lt;/p&gt;

&lt;p&gt;To limit the size of the dataset we use a &lt;a href="https://en.wikipedia.org/wiki/Minimum_bounding_rectangle" rel="noopener noreferrer"&gt;minimum bonding rectangle&lt;/a&gt; or bounding box:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The &lt;strong&gt;minimum bounding rectangle&lt;/strong&gt; (&lt;strong&gt;MBR&lt;/strong&gt;), also known as &lt;strong&gt;bounding box&lt;/strong&gt; (BBOX) or &lt;strong&gt;envelope&lt;/strong&gt;, is an expression of the maximum extents of a 2-dimensional object (e.g. point, line, polygon) or set of objects within its (or their) 2-D (x, y) &lt;a href="https://en.wikipedia.org/wiki/Coordinate_system" rel="noopener noreferrer"&gt;coordinate system&lt;/a&gt;, in other words min(x), max(x), min(y), max(y). The MBR is a 2-dimensional case of the &lt;a href="https://en.wikipedia.org/wiki/Minimum_bounding_box" rel="noopener noreferrer"&gt;minimum bounding box&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;As the diagram below illustrates, we can create a bounding box that is guaranteed to include at least all the points within R units of the center point. Given a point at (x, y), the four corners of the box will be at:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Upper Right Corner: ((x + r), (y + r))
Lower Right Corner: ((x + r), (y - r))
Lower Left Corner: ((x - r), (y - r))
Upper Left Corner: ((x - r), (y + r))


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fjddiopa2rh9l6kjpm8zq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fjddiopa2rh9l6kjpm8zq.png" alt="Bounding box for a plane"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With these outer bounds defined, we can restrict the set of points we calculate the distance for to be only points between these bounds. &lt;/p&gt;

&lt;p&gt;In pseudocode, our approach becomes something like:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pick a center point and how far out we are going to look (the distance)&lt;/li&gt;
&lt;li&gt;Calculate a bounding box around the center point, based on the distance&lt;/li&gt;
&lt;li&gt;Find the subset of the points of interest with a X between the min and max X of our bounding box and Y between the min and max Y of our bounding box&lt;/li&gt;
&lt;li&gt;For each point in the bounding box, run the Distance formula to calculate distance from the center point&lt;/li&gt;
&lt;li&gt;Return points whose distance to the center point is within the desired distance&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Without getting into the technical details here, most data stores have indices that support range queries to make finding the subset of points within the bounding box efficient.&lt;/p&gt;

&lt;p&gt;With that, we're ready to enter the third dimension and think about distances over a spherical object, like say planet Earth.&lt;/p&gt;

&lt;h1&gt;
  
  
  Spheres, Latitude, Longitude and the Haversine Formula
&lt;/h1&gt;

&lt;p&gt;While we use street addresses to locate places in our everyday lives, under the covers we are all using &lt;a href="https://en.wikipedia.org/wiki/Geographic_coordinate_system" rel="noopener noreferrer"&gt;latitude and longitude&lt;/a&gt; coordinates:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The combination of these two components specifies the position of any location on the surface of Earth, without consideration of &lt;a href="https://en.wikipedia.org/wiki/Altitude" rel="noopener noreferrer"&gt;altitude&lt;/a&gt; or depth.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://gisgeography.com/latitude-longitude-coordinates/" rel="noopener noreferrer"&gt;From GISGeography's site&lt;/a&gt;: Latitude coordinates are essentially Y-values between -90 and 90 degrees with 0 at the Equator. Longitude values are essentially X-values between -180 and 180 degrees with 0 at the Prime Meridian.&lt;/p&gt;

&lt;p&gt;For example, the latitude and longitude of the Washington Monument is 38.8895°, -77.0353°.&lt;/p&gt;

&lt;p&gt;Before we move on, it is important to note that calculating the exact distance between points on Earth is &lt;a href="https://stackoverflow.com/questions/1108965/taking-altitude-into-account-when-calculating-geodesic-distance" rel="noopener noreferrer"&gt;far more complex&lt;/a&gt; than for points on a two-dimensional plane. Fortunately, for the scenarios we are concerned with, say locating the nearest taco stand, we are OK with simplifying our distance calculations at the expense of precision by making these assumptions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The Earth is a perfect sphere (&lt;a href="https://en.wikipedia.org/wiki/Figure_of_the_Earth" rel="noopener noreferrer"&gt;it's actually an oblate spheroid&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;All points of interest are directly on the surface of the Earth, i.e. we ignore elevation changes&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;With these assumptions in place, let's take a look at the 5 points of interest from our prior examples, now specified by latitude and longitude coordinates:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Place of Interest: (Latitude, Longitude) coordinates
"The Red Fox Tavern": 32.7549°, -117.1425°
"Charlie's Chicken Shack": 32.7524°, -117.1427°
"Smoothie Town": 32.7376°, -117.1714°
"The Wilted Cauliflower": 32.5229°, -117.1165°
"Eggplant Paradise": 32.5073°, -117.0855°


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Me drawing those points on a sphere isn't going to help anyone, but if you are curious, these are all just locations selected around San Diego and Tijuana.&lt;/p&gt;

&lt;p&gt;To calculate the distance between points on Earth with latitude and longitude, many software packages use the &lt;a href="https://en.wikipedia.org/wiki/Haversine_formula" rel="noopener noreferrer"&gt;Haversine formula&lt;/a&gt;, a type of &lt;a href="https://en.wikipedia.org/wiki/Great-circle_distance" rel="noopener noreferrer"&gt;great-circle distance calculation&lt;/a&gt;. The idea of great-circle distance is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;the shortest &lt;a href="https://en.wikipedia.org/wiki/Distance" rel="noopener noreferrer"&gt;distance&lt;/a&gt; between two &lt;a href="https://en.wikipedia.org/wiki/Point_(geometry)" rel="noopener noreferrer"&gt;points&lt;/a&gt; on the surface of a &lt;a href="https://en.wikipedia.org/wiki/Sphere" rel="noopener noreferrer"&gt;sphere&lt;/a&gt;, measured along the surface of the sphere (as opposed to a straight line through the sphere's interior). The distance between two points in &lt;a href="https://en.wikipedia.org/wiki/Euclidean_space" rel="noopener noreferrer"&gt;Euclidean space&lt;/a&gt; is the length of a straight line between them, but on the sphere there are no straight lines. In &lt;a href="https://en.wikipedia.org/wiki/Manifold" rel="noopener noreferrer"&gt;spaces with curvature&lt;/a&gt;, straight lines are replaced by &lt;a href="https://en.wikipedia.org/wiki/Geodesic" rel="noopener noreferrer"&gt;geodesics&lt;/a&gt;. Geodesics on the sphere are circles on the sphere whose centers coincide with the center of the sphere, and are called &lt;em&gt;&lt;a href="https://en.wikipedia.org/wiki/Great_circle" rel="noopener noreferrer"&gt;great circles&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The Haversine formula itself is described in great detail on the Wikipedia page if you want to flex your geometry muscles. Lots of sines and cosines. The phi variables are latitude, the lambda variables are longitude:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwikimedia.org%2Fapi%2Frest_v1%2Fmedia%2Fmath%2Frender%2Fsvg%2Fa65dbbde43ff45bacd2505fcf32b44fc7dcd8cc0" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwikimedia.org%2Fapi%2Frest_v1%2Fmedia%2Fmath%2Frender%2Fsvg%2Fa65dbbde43ff45bacd2505fcf32b44fc7dcd8cc0" alt="{\displaystyle {\begin{aligned}d&amp;amp;=2r\arcsin \left(Proximity%20-%201.assets/a65dbbde43ff45bacd2505fcf32b44fc7dcd8cc0)\\&amp;amp;=2r\arcsin \left({\sqrt {\sin ^{2}\left({\frac {\varphi _{2}-\varphi _{1}}{2}}\right)+\cos(\varphi _{1})\cos(\varphi _{2})\sin ^{2}\left({\frac {\lambda _{2}-\lambda _{1}}{2}}\right)}}\right)\end{aligned}}}"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In our 3d Yelp clone, we want to find restaurants within 5 miles of my house at 32.7584°, -117.1402° (not actually my house). Again, we start with brute-force and do the calculation for every point to the center point (in this case using an &lt;a href="https://andrew.hedges.name/experiments/haversine/" rel="noopener noreferrer"&gt;online calculator&lt;/a&gt; to avoid doing the math myself):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Distances to 32.7584°, -117.1402°
"The Red Fox Tavern": 32.7549°, -117.1425° =&amp;gt; 0.276 miles
"Charlie's Chicken Shack": 32.7524°, -117.1427° =&amp;gt; 0.44 miles
"Smoothie Town": 32.7376°, -117.1714° =&amp;gt; 2.315 miles
"The Wilted Cauliflower": 32.5229°, -117.1165° =&amp;gt; 16.339 miles
"Eggplant Paradise": 32.5073°, -117.0855° =&amp;gt; 17.649 miles


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Many modern data stores have support for storing latitude and longitude pairs, as well as built-in functions for calculating the distance between these points. Typically these features or packages are called Spatial Data support or GIS (Geographic information system) support. &lt;/p&gt;

&lt;p&gt;Even it is built-in to the data store, this type of calculation can get more and more expensive as we increase the size of our dataset. Again, our approach to limit computational and time use is to limit the dataset we operate on.&lt;/p&gt;

&lt;p&gt;Conceptually, like with planes, we want to only calculate the distance between points that are within a bounding box around the center point. One method for calculating the edges of such a bounding box using latitude and longitude is described in &lt;a href="http://janmatuschek.de/LatitudeLongitudeBoundingCoordinates" rel="noopener noreferrer"&gt;this paper&lt;/a&gt;. The linked paper suggests that this method could be used with a SQL-compliant datastore to leverage latitude and longitude indices to select the subset of points in the bounding box and only then calculate the distance. A similar approach using MySQL specifically, &lt;a href="https://derickrethans.nl/spatial-indexes-mysql.html" rel="noopener noreferrer"&gt;is detailed in this post&lt;/a&gt;. There are many more similar techniques I came across while writing this post. I haven't tested them to give a concrete recommendation of one over the other.&lt;/p&gt;

&lt;p&gt;The solution I want to talk about is called geohashing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Geohashing
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Geohash&lt;/strong&gt; is a &lt;a href="https://en.wikipedia.org/wiki/Public_domain" rel="noopener noreferrer"&gt;public domain&lt;/a&gt; &lt;a href="https://en.wikipedia.org/wiki/Geocode#Geocode_system" rel="noopener noreferrer"&gt;geocode system&lt;/a&gt; invented in 2008 by Gustavo Niemeyer[&lt;a href="https://en.wikipedia.org/wiki/Geohash#cite_note-first2008-1" rel="noopener noreferrer"&gt;1]&lt;/a&gt; and (similar work in 1966) G.M. Morton[&lt;a href="https://en.wikipedia.org/wiki/Geohash#cite_note-morton66-2" rel="noopener noreferrer"&gt;2]&lt;/a&gt;, which encodes a geographic location into a short string of letters and digits. It is a hierarchical spatial data structure which subdivides space into buckets of &lt;a href="https://en.wikipedia.org/wiki/Grid_(spatial_index)" rel="noopener noreferrer"&gt;grid&lt;/a&gt; shape, which is one of the many applications of what is known as a &lt;a href="https://en.wikipedia.org/wiki/Z-order_curve" rel="noopener noreferrer"&gt;Z-order curve&lt;/a&gt;, and generally &lt;a href="https://en.wikipedia.org/wiki/Space-filling_curves" rel="noopener noreferrer"&gt;space-filling curves&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;em&gt;from &lt;a href="https://en.wikipedia.org/wiki/Geohash" rel="noopener noreferrer"&gt;https://en.wikipedia.org/wiki/Geohash&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Flv48ewgfrz4cwb79gvus.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Flv48ewgfrz4cwb79gvus.jpg" alt="Geohash grid"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In other words, through geohashing &lt;a href="https://www.pubnub.com/learn/glossary/what-is-geohashing/" rel="noopener noreferrer"&gt;you divide the world into a grid of squares and rectangles&lt;/a&gt;, addressed by hash. Any point on the earth (latitude and longitude) has a corresponding hash that fits into one of these  grid cells or bins. To see a visual explanation of geohashing, check out this &lt;a href="https://www.youtube.com/watch?v=UaMzra18TD8" rel="noopener noreferrer"&gt;video&lt;/a&gt; or this &lt;a href="https://www.movable-type.co.uk/scripts/geohash.html" rel="noopener noreferrer"&gt;page&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;The length of the geohash we use determines how large the bin is. For example, a geohash length of 5 will give us an approximately 5 km by 5 km bin.&lt;/p&gt;

&lt;p&gt;Geohashes were originally used as part of a URL-shortening service, but across the industry are now used for &lt;a href="https://www.mapzen.com/blog/geohashes-and-you/" rel="noopener noreferrer"&gt;spatial indexing and searches as well&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To return to our Yelp-like example scenario of finding nearby restaurants, let's add 5 character geohashes to our points of interest:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

Place of Interest: (Latitude, Longitude) coordinates | Geohash
"The Red Fox Tavern": 32.7549°, -117.1425° | 9mudq
"Charlie's Chicken Shack": 32.7524°, -117.1427° | 9mudq
"Smoothie Town": 32.7376°, -117.1714° | 9mudj
"The Wilted Cauliflower": 32.5229°, -117.1165° | 9mu9n
"Eggplant Paradise": 32.5073°, -117.0855° | 9mu8z


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Again, my fake house is at my house at 32.7584°, -117.1402° or 9mudq.&lt;/p&gt;

&lt;p&gt;A naive algorithm to find nearby points of interest would be to return points with the same geohash as the center point. This gets us &lt;em&gt;some&lt;/em&gt; of the nearby points of interest, but depending on the distance we want to search, likely leaves out nearby points of interest that are in neighboring geohash bins.&lt;/p&gt;

&lt;p&gt;To get a more comprehensive set of points of interest, we can get all the points in the center geohash bin and its 8 immediate neighbors. From there, assuming we want to be precise, we calculate the distance between all the points and the center, dropping the points that are too far away.&lt;/p&gt;

&lt;p&gt;Like a bounding box algorithm, a geohash based algorithm can handle larger datasets by first chopping those datasets into manageable chunks for our distance calculations. Also, while the latitude and longitude based bounding boxes described above depend on the data store having support for spatial data, we can manipulate geohashes in any data store, as they are just strings.&lt;/p&gt;

&lt;p&gt;The ability to use geohashes without custom spatial data types will come in handy in my next post, where I dig into a solution &lt;a href="https://read.acloud.guru/location-based-search-results-with-dynamodb-and-geohash-267727e5d54f" rel="noopener noreferrer"&gt;I studied that uses DynamoDB as a data store, along with some JavaScript functions to implement a simple Starbucks locator&lt;/a&gt;. This solution actually uses Google's S2 as a different type of geohash, which I'll explain. In that post, I'll also get into two issues of some importance that I glided past here: first, deciding how big a geohash to use in your application and second, how to find the neighboring geohash bins.&lt;/p&gt;

&lt;p&gt;See you then!&lt;/p&gt;

</description>
      <category>geo</category>
      <category>guide</category>
    </item>
  </channel>
</rss>
