<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Citation Builder</title>
    <description>The latest articles on DEV Community by Citation Builder (@citationbuilder).</description>
    <link>https://dev.to/citationbuilder</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4013570%2F441ecc5c-b657-43c2-a3b8-fe56b3631401.png</url>
      <title>DEV Community: Citation Builder</title>
      <link>https://dev.to/citationbuilder</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/citationbuilder"/>
    <language>en</language>
    <item>
      <title>Checking local-business NAP consistency in code</title>
      <dc:creator>Citation Builder</dc:creator>
      <pubDate>Fri, 03 Jul 2026 12:00:49 +0000</pubDate>
      <link>https://dev.to/citationbuilder/checking-local-business-nap-consistency-in-code-5148</link>
      <guid>https://dev.to/citationbuilder/checking-local-business-nap-consistency-in-code-5148</guid>
      <description>&lt;p&gt;If you have ever done local SEO, you have heard the acronym &lt;strong&gt;NAP&lt;/strong&gt;: Name, Address, Phone. The folklore is that these three fields should be byte-for-byte identical everywhere your business is listed (Google, Yelp, Apple Maps, the long tail of directories), or your local rankings suffer.&lt;/p&gt;

&lt;p&gt;"Just compare the strings" is the instinct. Then you actually try it, and every assumption falls apart. This is a writeup of the engineering problems I ran into building a free NAP checker, and how I ended up solving them. No framework, just plain JavaScript and a few HTTP calls.&lt;/p&gt;

&lt;h2&gt;
  
  
  The core problem: equality is the wrong operator
&lt;/h2&gt;

&lt;p&gt;Here is the same real business as three directories store it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Joe's Plumbing &amp;amp; Heating, LLC   | 123 N Main St Ste 4   | (415) 555-0199
Joe's Plumbing and Heating      | 123 North Main Street  | +1 415-555-0199
Joe’s Plumbing &amp;amp; Heating LLC    | 123 N. Main St. #4     | 415.555.0199
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A human reads those as one business. &lt;code&gt;===&lt;/code&gt; reads three different businesses across all three fields. The whole job is closing that gap &lt;strong&gt;without&lt;/strong&gt; being so fuzzy that two genuinely different businesses score as a match. Every field needs its own normalization pipeline before you can compare anything.&lt;/p&gt;

&lt;h2&gt;
  
  
  Phone is the easy win, so start there
&lt;/h2&gt;

&lt;p&gt;Phone numbers feel messy but they are the most tractable field, because there is a real spec underneath. Strip everything to digits, drop the country code, and compare the &lt;strong&gt;national significant number&lt;/strong&gt;. &lt;code&gt;libphonenumber&lt;/code&gt; does this properly, but the 80% version is small:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;normalizePhone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;region&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;US&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
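  &lt;span class="c1"&gt;// strip a trailing extension ("x12", "ext. 12") first so its digits never reach the key&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;digits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="sr"&gt;*(?:x|ext&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="sr"&gt;?)&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="sr"&gt;*&lt;/span&gt;&lt;span class="se"&gt;\d&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="sr"&gt;*$/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\D&lt;/span&gt;&lt;span class="sr"&gt;/g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;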
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;digits&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// US/CA: 11 digits starting with 1 -&amp;gt; drop the country code&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;nsn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;digits&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;region&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;US&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;nsn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;11&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;nsn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;nsn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;nsn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="c1"&gt;// store last 10 as the comparable key&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;nsn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;nsn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;nsn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nf"&gt;normalizePhone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;(415) 555-0199&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;   &lt;span class="c1"&gt;// "4155550199"&lt;/span&gt;
&lt;span class="nf"&gt;normalizePhone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;+1 415-555-0199&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// "4155550199"&lt;/span&gt;
&lt;span class="nf"&gt;normalizePhone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;415.555.0199&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;     &lt;span class="c1"&gt;// "4155550199"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All three formats above collapse to &lt;code&gt;4155550199&lt;/code&gt;, so phone becomes a clean equality check on a normalized key. The one trap is extensions: &lt;code&gt;555-0199 x12&lt;/code&gt; should match &lt;code&gt;555-0199&lt;/code&gt; on the main number, so strip the extension before extracting digits and compare the main number only; a naive "last 10 digits" over the whole string swallows the extension into the key. For anything beyond US/CA, hand it to &lt;code&gt;libphonenumber&lt;/code&gt; and trust the region metadata, because national-number length is not universal.&lt;/p&gt;

&lt;h2&gt;
  
  
  Names: strip the legal suffix, then match tokens
&lt;/h2&gt;

&lt;p&gt;Business names break equality in two predictable ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Legal suffixes.&lt;/strong&gt; &lt;code&gt;LLC&lt;/code&gt;, &lt;code&gt;Inc&lt;/code&gt;, &lt;code&gt;Ltd&lt;/code&gt;, &lt;code&gt;Co&lt;/code&gt;, &lt;code&gt;GmbH&lt;/code&gt;, &lt;code&gt;Corp&lt;/code&gt;, &lt;code&gt;L.L.C.&lt;/code&gt; These are noise for matching. One directory keeps them, one drops them, one abbreviates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connectors and punctuation.&lt;/strong&gt; &lt;code&gt;&amp;amp;&lt;/code&gt; vs &lt;code&gt;and&lt;/code&gt;, smart quotes (&lt;code&gt;’&lt;/code&gt; vs &lt;code&gt;'&lt;/code&gt;), trailing commas, casing.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So the name pipeline is: lowercase, normalize Unicode (&lt;code&gt;’ → '&lt;/code&gt;), expand/standardize &lt;code&gt;&amp;amp;&lt;/code&gt;, strip a known list of legal suffixes, collapse whitespace. Then I don't do a raw string compare. I tokenize and measure &lt;strong&gt;token overlap&lt;/strong&gt;, because word order and a stray extra word shouldn't tank the score:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;LEGAL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\b(&lt;/span&gt;&lt;span class="sr"&gt;llc|l&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="sr"&gt;l&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="sr"&gt;c|inc|incorporated|ltd|co|corp|company|gmbh|kg|bv|sa|srl&lt;/span&gt;&lt;span class="se"&gt;)\b&lt;/span&gt;&lt;span class="sr"&gt;/g&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;normalizeName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;NFKC&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;’'&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;/g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&amp;amp;/g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt; and &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;LEGAL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;[^&lt;/span&gt;&lt;span class="sr"&gt;a-z0-9&lt;/span&gt;&lt;span class="se"&gt;\s]&lt;/span&gt;&lt;span class="sr"&gt;/g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="sr"&gt;+/g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;tokenScore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;A&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;normalizeName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Boolean&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;B&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;normalizeName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Boolean&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;A&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;size&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;B&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;shared&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;t&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;A&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;B&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="nx"&gt;shared&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;shared&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;A&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;B&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Jaccard-ish, biased to the longer side&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;"Joe's Plumbing &amp;amp; Heating, LLC"&lt;/code&gt; vs &lt;code&gt;"Joe's Plumbing and Heating"&lt;/code&gt; now scores 1.0 on shared tokens. Dividing by the &lt;em&gt;longer&lt;/em&gt; token set (rather than the shorter one) keeps a directory that bolts on a city name (&lt;code&gt;"... Heating San Francisco"&lt;/code&gt;) from scoring a perfect match against the clean version: the four shared tokens are divided by six, not four.&lt;/p&gt;
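
&lt;p&gt;To make the denominator choice concrete, here is a small sketch that scores the same pair of token lists three ways. The helper name &lt;code&gt;overlapScores&lt;/code&gt; and the pre-tokenized inputs are illustrative assumptions, not part of the checker itself:&lt;/p&gt;

```javascript
// Hypothetical helper: compare three ways of normalizing a shared-token count.
// Inputs are already-normalized token arrays, e.g. normalizeName(s).split(" ").
function overlapScores(tokensA, tokensB) {
  const A = new Set(tokensA);
  const B = new Set(tokensB);
  if (A.size === 0 || B.size === 0) {
    return { byShorter: 0, byLonger: 0, byUnion: 0 };
  }
  let shared = 0;
  for (const t of A) if (B.has(t)) shared++;
  const union = A.size + B.size - shared;
  return {
    byShorter: shared / Math.min(A.size, B.size), // overlap coefficient
    byLonger: shared / Math.max(A.size, B.size),  // the denominator tokenScore uses
    byUnion: shared / union,                      // classic Jaccard
  };
}

const clean = ["joes", "plumbing", "and", "heating"];
const withCity = ["joes", "plumbing", "and", "heating", "san", "francisco"];
console.log(overlapScores(clean, withCity));
```

&lt;p&gt;Dividing by the shorter list scores this pair 1.0, while the longer-side denominator gives 4/6. When one list is a strict superset of the other, as here, the longer-side and union denominators agree; they only diverge when both sides carry tokens the other lacks.&lt;/p&gt;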

&lt;h2&gt;
  
  
  Addresses: the abbreviation swamp
&lt;/h2&gt;

&lt;p&gt;Addresses are the worst field, and it is worth being honest about why. The variation isn't typos, it's a dictionary of equivalent forms that postal systems treat as identical:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;In the wild&lt;/th&gt;
&lt;th&gt;Canonical&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;St&lt;/code&gt;, &lt;code&gt;St.&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Street&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;Ave&lt;/code&gt;, &lt;code&gt;Av&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Avenue&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Blvd&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Boulevard&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;N&lt;/code&gt;, &lt;code&gt;N.&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;&lt;code&gt;North&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;Ste&lt;/code&gt;, &lt;code&gt;#&lt;/code&gt;, &lt;code&gt;Unit&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;(suite designator)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Hwy&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Highway&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The fix is an expansion map applied token by token, the same way you normalize the name. &lt;code&gt;123 N Main St Ste 4&lt;/code&gt; and &lt;code&gt;123 North Main Street #4&lt;/code&gt; both reduce to &lt;code&gt;123 north main street 4&lt;/code&gt; and compare cleanly. Keep the map small and US/region-specific; an over-eager map will "correct" things that were never abbreviations.&lt;/p&gt;
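
&lt;p&gt;A minimal sketch of that expansion map, assuming only the abbreviations from the table above. The map contents, the &lt;code&gt;UNIT_WORDS&lt;/code&gt; set, and the choice to drop unit designators entirely are assumptions for illustration; the real normalizer may differ:&lt;/p&gt;

```javascript
// Illustrative expansion map: deliberately small and US-centric.
const ADDRESS_MAP = new Map([
  ["st", "street"],
  ["ave", "avenue"],
  ["av", "avenue"],
  ["blvd", "boulevard"],
  ["hwy", "highway"],
  ["n", "north"],
]);

// Unit designators are dropped so "Ste 4", "#4" and "Unit 4" all leave just "4".
const UNIT_WORDS = new Set(["ste", "suite", "unit"]);

function normalizeAddress(raw) {
  return (raw || "")
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ") // dots, commas, "#" and other punctuation become spaces
    .split(/\s+/)
    .filter((t) => t.length > 0)
    .filter((t) => !UNIT_WORDS.has(t))
    .map((t) => ADDRESS_MAP.get(t) || t) // expand known abbreviations, keep the rest
    .join(" ");
}

console.log(normalizeAddress("123 N Main St Ste 4"));
console.log(normalizeAddress("123 North Main Street #4"));
```

&lt;p&gt;This also shows the over-eager failure mode: a street like &lt;code&gt;St Charles Ave&lt;/code&gt; would come out as &lt;code&gt;street charles avenue&lt;/code&gt;, so a production map needs position-aware rules rather than a blind token lookup.&lt;/p&gt;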

&lt;h3&gt;
  
  
  Why reverse-geocoding lies to you
&lt;/h3&gt;

&lt;p&gt;Here is the subtle one that cost me the most time. Some directories only give you a &lt;strong&gt;lat/long&lt;/strong&gt;, not a clean street string. The tempting move is to reverse-geocode those coordinates back into an address and compare. &lt;strong&gt;Don't trust that comparison for a mismatch verdict.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Reverse geocoding snaps a point to the &lt;em&gt;nearest known feature&lt;/em&gt; in the geocoder's database. A rooftop pin 12 meters off can resolve to the building next door, a different suite, or the road centerline. So you reverse-geocode &lt;code&gt;123 Main St&lt;/code&gt; and get back &lt;code&gt;125 Main St&lt;/code&gt;, and your checker screams "address mismatch!" when the listing is perfectly fine. The error is in your verification method, not in the data.&lt;/p&gt;

&lt;p&gt;The rule I settled on: reverse-geocoded addresses are &lt;strong&gt;advisory&lt;/strong&gt;. If the forward, text-based address agrees, great. If it disagrees, fall back to a coordinate-distance check (are the two points within ~50m?) before ever reporting a mismatch. A mismatch born from snapping is a false positive, and false positives in this tool are worse than gaps, because they send people editing listings that were already correct.&lt;/p&gt;
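
&lt;p&gt;The distance gate can be sketched with the haversine formula. The function names, the &lt;code&gt;{ lat, lon }&lt;/code&gt; point shape, and treating an in-tolerance pair as a match are assumptions for this example; only the ~50m tolerance comes from the rule above:&lt;/p&gt;

```javascript
const EARTH_RADIUS_M = 6371000; // mean Earth radius in meters

// Great-circle distance between two { lat, lon } points given in degrees.
function distanceMeters(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1, Math.sqrt(h)));
}

// Hypothetical gate: report "mismatch" only when the text disagrees
// and the two pins are further apart than the tolerance.
function addressVerdict(textMatches, minePoint, theirPoint, toleranceM = 50) {
  if (textMatches) return "match";
  return distanceMeters(minePoint, theirPoint) > toleranceM ? "mismatch" : "match";
}

const mine = { lat: 37.7749, lon: -122.4194 };
const snapped = { lat: 37.7750, lon: -122.4194 }; // roughly 11 m north
console.log(addressVerdict(false, mine, snapped));
```

&lt;p&gt;A softer third state (for example "needs review") for the in-tolerance case would be just as reasonable; the point is that snapping alone never produces a mismatch.&lt;/p&gt;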

&lt;h2&gt;
  
  
  Scoring: per field, not one blob
&lt;/h2&gt;

&lt;p&gt;Early on I concatenated N + A + P and computed one similarity number. Useless. A wrong phone and a wrong suite number are completely different problems, and an aggregate score hides which one you have.&lt;/p&gt;

&lt;p&gt;So scoring is &lt;strong&gt;field-level&lt;/strong&gt;. Each directory result returns a per-field verdict, and &lt;code&gt;not_found&lt;/code&gt; is explicitly &lt;em&gt;not&lt;/em&gt; a failure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;scoreField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;mine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;theirs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;theirs&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;not_found&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;   &lt;span class="c1"&gt;// coverage gap, not a mismatch&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
    &lt;span class="nx"&gt;kind&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;phone&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nf"&gt;normalizePhone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;mine&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nf"&gt;normalizePhone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;theirs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;kind&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;name&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nf"&gt;tokenScore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;mine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;theirs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;
    &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;normalizeAddress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;mine&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nf"&gt;normalizeAddress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;theirs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ok&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;match&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mismatch&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;theirs&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;not_found&lt;/code&gt; distinction matters more than it looks. If a directory simply doesn't list you, that is a &lt;em&gt;coverage&lt;/em&gt; signal ("go claim this listing"), not a &lt;em&gt;consistency&lt;/em&gt; signal ("you have conflicting data"). Folding them together makes a business with thin coverage look like a business with a data integrity problem. They need opposite fixes.&lt;/p&gt;
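
&lt;p&gt;One way to keep the two signals apart when rolling results up is sketched below. The &lt;code&gt;summarize&lt;/code&gt; name and the &lt;code&gt;{ source, fields }&lt;/code&gt; result shape are hypothetical; the status strings are the ones &lt;code&gt;scoreField&lt;/code&gt; returns:&lt;/p&gt;

```javascript
// Hypothetical roll-up: one entry per directory, each with per-field verdicts
// whose status is "match", "mismatch" or "not_found".
function summarize(results) {
  const summary = { coverageGaps: [], conflicts: [] };
  for (const { source, fields } of results) {
    const entries = Object.entries(fields);
    // A directory where nothing was found is a coverage gap, not a conflict.
    if (entries.every(([, v]) => v.status === "not_found")) {
      summary.coverageGaps.push(source);
      continue;
    }
    for (const [field, v] of entries) {
      if (v.status === "mismatch") {
        summary.conflicts.push({ source, field, theirs: v.theirs });
      }
    }
  }
  return summary;
}

console.log(
  summarize([
    {
      source: "directory-a",
      fields: {
        name: { status: "match" },
        phone: { status: "mismatch", theirs: "4155550100" },
        address: { status: "not_found" },
      },
    },
    {
      source: "directory-b",
      fields: {
        name: { status: "not_found" },
        phone: { status: "not_found" },
        address: { status: "not_found" },
      },
    },
  ])
);
```

&lt;p&gt;Here &lt;code&gt;directory-b&lt;/code&gt; lands in &lt;code&gt;coverageGaps&lt;/code&gt; and only the phone on &lt;code&gt;directory-a&lt;/code&gt; is reported as a conflict, so the two lists can drive different advice.&lt;/p&gt;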

&lt;h2&gt;
  
  
  The data sources are the actual hard part
&lt;/h2&gt;

&lt;p&gt;The algorithms above are a weekend. Getting &lt;em&gt;clean field data to compare against, for free, at scale&lt;/em&gt; is the part that never ends.&lt;/p&gt;

&lt;p&gt;What worked, server-side and free:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OpenStreetMap / Nominatim&lt;/strong&gt; is the workhorse. Geocode the address, get structured &lt;code&gt;address&lt;/code&gt; parts back. Critically, request &lt;code&gt;extratags&lt;/code&gt;: that is where &lt;code&gt;phone&lt;/code&gt; / &lt;code&gt;contact:phone&lt;/code&gt; and &lt;code&gt;website&lt;/code&gt; live, so you can compare more than just the address. Respect the usage policy: one request per second, a real &lt;code&gt;User-Agent&lt;/code&gt;, and cache aggressively.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;iBegin&lt;/strong&gt; and &lt;strong&gt;Lacartes&lt;/strong&gt; gave usable structured listings for US/CA/UK without a login wall.&lt;/li&gt;
&lt;/ul&gt;
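
&lt;p&gt;For the Nominatim case, a request might be assembled roughly like this. The helper names are hypothetical and the base URL assumes the public instance; &lt;code&gt;format&lt;/code&gt;, &lt;code&gt;addressdetails&lt;/code&gt;, &lt;code&gt;extratags&lt;/code&gt; and &lt;code&gt;limit&lt;/code&gt; are parameters of the Nominatim search endpoint:&lt;/p&gt;

```javascript
// Build a Nominatim search URL that asks for structured address parts and extratags.
// Assumes the public instance; swap the base URL if you self-host.
function nominatimSearchUrl(query) {
  const url = new URL("https://nominatim.openstreetmap.org/search");
  url.searchParams.set("q", query);
  url.searchParams.set("format", "jsonv2");
  url.searchParams.set("addressdetails", "1"); // include the structured address object
  url.searchParams.set("extratags", "1");      // include phone / website tags when present
  url.searchParams.set("limit", "1");
  return url.toString();
}

// Pull a phone number out of one search result; mappers use either tag key.
function phoneFromResult(result) {
  const tags = (result ? result.extratags : null) || {};
  return tags["phone"] || tags["contact:phone"] || null;
}

console.log(nominatimSearchUrl("Joe's Plumbing, 123 N Main St, San Francisco"));
console.log(phoneFromResult({ extratags: { "contact:phone": "+1 415-555-0199" } }));
```

&lt;p&gt;The fetch itself is left out on purpose: it should send a descriptive &lt;code&gt;User-Agent&lt;/code&gt;, stay at or under one request per second, and go through a cache, as the usage policy requires.&lt;/p&gt;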

&lt;p&gt;What did &lt;strong&gt;not&lt;/strong&gt; work, and why it's worth knowing before you waste an afternoon:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Yelp and Yellow Pages&lt;/strong&gt;: hard &lt;code&gt;403&lt;/code&gt; to anything from a datacenter IP. Their bot detection assumes residential traffic; your server is not that. Scraping them at scale means a residential proxy budget, which kills "free."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hotfrog&lt;/strong&gt;: same datacenter &lt;code&gt;403&lt;/code&gt; story.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;hub.biz&lt;/strong&gt;: echoed my query back at me instead of returning its own stored record, which is worthless for a &lt;em&gt;consistency&lt;/em&gt; check (you'd be comparing the input to itself).&lt;/li&gt;
&lt;li&gt;A long tail of directories sit behind Cloudflare challenges or render NAP only after client-side JS, so a plain server fetch sees an empty shell.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The lesson is that "check across all directories" is marketing; "check across the directories that expose structured data to an anonymous server request" is the real, shippable scope. Be upfront about coverage instead of pretending a &lt;code&gt;not_found&lt;/code&gt; from a bot-wall means the listing is absent.&lt;/p&gt;
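
&lt;p&gt;One way to avoid that conflation is to classify the HTTP outcome before any NAP extraction runs. This is only a sketch: the &lt;code&gt;unreachable&lt;/code&gt; state, the function name, and the challenge marker strings are assumptions that would need tuning against real responses:&lt;/p&gt;

```javascript
// Placeholder markers for challenge or interstitial pages; these are guesses.
const CHALLENGE_MARKERS = ["just a moment", "captcha"];

// Hypothetical classifier: separate "we were blocked" from "the listing is absent".
function classifyResponse(status, body) {
  const text = (body || "").toLowerCase();
  const challenged = CHALLENGE_MARKERS.some((m) => text.includes(m));
  if (status === 404) return "not_found"; // the source answered and has no such listing
  if (status === 200) return challenged ? "unreachable" : "fetched";
  return "unreachable"; // 403, 429, server errors: says nothing about the listing
}

console.log(classifyResponse(403, ""));
console.log(classifyResponse(404, ""));
console.log(classifyResponse(200, "Just a moment..."));
```

&lt;p&gt;Only &lt;code&gt;fetched&lt;/code&gt; responses should go on to field scoring; &lt;code&gt;unreachable&lt;/code&gt; sources belong in the coverage disclosure rather than in the consistency report.&lt;/p&gt;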

&lt;h2&gt;
  
  
  Putting it together
&lt;/h2&gt;

&lt;p&gt;The shape that worked:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Normalize the user's own NAP once.&lt;/li&gt;
&lt;li&gt;For each reachable source, fetch, extract NAP (incl. &lt;code&gt;extratags&lt;/code&gt;), normalize the same way.&lt;/li&gt;
&lt;li&gt;Score &lt;strong&gt;per field&lt;/strong&gt;, with &lt;code&gt;match&lt;/code&gt; / &lt;code&gt;mismatch&lt;/code&gt; / &lt;code&gt;not_found&lt;/code&gt; as distinct states.&lt;/li&gt;
&lt;li&gt;Treat reverse-geocoded addresses as advisory and gate mismatches behind a distance check.&lt;/li&gt;
&lt;li&gt;Report coverage and consistency as two separate things.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I packaged this into &lt;a href="https://localseocitationbuilder.com/nap-checker" rel="noopener noreferrer"&gt;a free checker I built&lt;/a&gt; if you want to see the normalization behavior on a real business without writing the code. But the genuinely portable lesson is the framing: NAP consistency is a &lt;strong&gt;per-field normalization-and-matching&lt;/strong&gt; problem with a nasty data-acquisition tail, and the single biggest correctness bug is trusting a reverse-geocode to tell you two addresses disagree.&lt;/p&gt;

&lt;p&gt;If you build your own, start with phone (real spec, clean wins), get your name and address normalizers right, and spend your remaining time on the data sources, because that is where the project actually lives.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>seo</category>
      <category>api</category>
    </item>
  </channel>
</rss>
