Curt Kloc

How I Built Software to Show Business Owners What Google and AI Understand About Their Company

I was frustrated.

Not because local SEO is impossible.

Not because websites are hard to build.

I was frustrated because business owners could not see the thing that was costing them visibility.

A local business owner would have a website, a Google Business Profile, some reviews, a service page, a contact page, and maybe even a decent-looking homepage.

To them, the site looked finished.

But when I looked at it through the lens of Google, Google Maps, and AI search, the picture was usually very different.

The business existed online, but it was not clearly explained.

Google had to guess what the company did.

Google had to guess where it operated.

Google had to guess which services mattered most.

Google had to guess which pages supported which local searches.

AI systems had to piece together the company from scattered fragments.

And when search systems have to guess, they often choose a competitor that is easier to understand.

That is the problem we built software to expose at Firm IQ.

Not another generic SEO audit.

Not a page-speed report.

Not a keyword list.

We built a Google & AI Visibility Infrastructure Audit.

The goal was simple:

Show a business owner what Google, Maps, and AI can likely understand about their company — and what they still have to guess.


TL;DR

At Firm IQ, we built audit software that analyzes how a local business appears to Google, Google Maps, and AI search systems.

The software reviews:

  • homepage clarity
  • service pages
  • location pages
  • service-location gaps
  • Google Business Profile landing-page alignment
  • schema and structured data
  • internal linking
  • site footprint
  • competitor contrast
  • map visibility
  • authority signals
  • AI understanding infrastructure

The report translates technical visibility gaps into plain-English business consequences:

  • why the business may not be showing up
  • why competitors may be easier for Google to recommend
  • where local demand is leaking
  • what pages or proof are missing
  • what should be fixed first

Crawling websites was not the hardest part.

The hardest parts were:

  1. Turning messy website data into a business-owner-friendly diagnosis.
  2. Comparing a business against map competitors without making it a full competitor audit.
  3. Explaining AI visibility as a documentation problem, not just an SEO problem.
  4. Writing a report that creates urgency without exaggeration.

Stack

The first working version was built as a Flask-based web app, developed with AI-assisted coding in Antigravity, deployed on Render, and connected to external visibility data where useful.

The system includes:

  • Flask web app
  • website crawler
  • page classifier
  • schema detector
  • service/location matching logic
  • Google Business Profile landing-page alignment check
  • competitor contrast module
  • Ahrefs authority/context integration
  • AI understanding infrastructure diagnostic
  • PDF report generator
  • plain-English recommendation layer
  • Render deployment

The app is not meant to be a traditional SEO crawler.

The goal is not just to say:

This page has a missing meta description.

The goal is to say:

This business may not be giving Google and AI enough structured evidence to understand, trust, and recommend it.

That changed how we designed the whole product.


The Problem I Was Actually Solving

Most business owners think visibility works like this:

I have a website, so Google should know what I do.

But that is not how modern search works.

Having a website is not the same as being understood.

A website can look professional to a human and still be weak evidence for Google.

A business can be great at what it does and still be difficult for AI systems to summarize, classify, or recommend.

This is especially obvious in local search.

A buyer does not usually search for vague categories.

They search for specific services in specific places.

Examples:

  • handyman Fort Lauderdale
  • bathroom renovation Fort Lauderdale
  • roof repair Chandler
  • commercial HVAC Dallas
  • kitchen remodeling near me
  • personal injury lawyer Phoenix

Those searches combine service intent with location intent.

If a business has one generic services page, no dedicated location pages, no service-location pages, weak schema, and a Google Business Profile pointing to a generic homepage, Google has to infer too much.

Meanwhile, a competitor may have:

  • a strong homepage
  • dedicated service pages
  • dedicated city pages
  • local proof
  • reviews
  • project examples
  • FAQs
  • schema markup
  • internal links
  • a stronger landing page connected to the Google Business Profile

That competitor may not be better.

They may simply be easier for Google to understand.

That became one of the core ideas behind the software:

The current map winners are not always unbeatable. They are often just giving Google a clearer picture.


Why Existing SEO Audits Weren’t Enough

There are plenty of SEO audit tools.

Many of them are useful.

But most of them output technical issues, not business clarity.

They tell you:

  • missing title tag
  • meta description too short
  • missing H1
  • no alt text
  • low word count
  • broken link
  • schema missing
  • low domain authority

Those things can matter.

But a business owner usually does not care about a missing H1 by itself.

They care about calls.

They care about estimates.

They care about appointments.

They care about whether the competitor down the street is showing up while they are buried.

So the report could not just say:

Missing location page.

It needed to say:

Google does not have a dedicated local page that supports this target market, which can make competitors with clearer location assets easier to rank and recommend.

It could not just say:

Thin content.

It needed to say:

The site may not provide enough service-specific evidence for Google and AI systems to confidently match this business to high-intent buyer searches.

That was the real challenge.

The crawler had to find the data.

But the report had to explain the consequence.


Part 1: Building the Website Crawler

The first layer was the crawler.

The software needed to accept a business website URL, normalize it, crawl the site, classify pages, and extract signals.

Simple in theory.

Messy in practice.

Real-world local business websites are unpredictable.

Some use WordPress.

Some use Wix.

Some use Squarespace.

Some use custom builders.

Some have no sitemap.

Some have multiple sitemaps.

Some block automated requests.

Some redirect between www and non-www.

Some have modern domains like .realtor, .construction, or .ai.

Some return unusual HTTP statuses.

Some sites are indexed by Google but still block a normal server-side crawler.

So the crawler had to become practical.

It needed to handle:

  • http and https
  • www and non-www
  • trailing slashes
  • redirects
  • modern TLDs
  • sitemap discovery
  • timeout failures
  • crawler blocking
  • empty HTML
  • non-HTML responses
  • shallow websites
  • large websites

At a high level, URL normalization looked something like this:

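from urllib.parse import urlparse

def normalize_url(raw_url):
    raw_url = raw_url.strip()

    # Assume HTTPS when someone types a bare domain like "example.com".
    if not raw_url.lower().startswith(("http://", "https://")):
        raw_url = "https://" + raw_url

    parsed = urlparse(raw_url)

    # Only require a dotted host. No TLD whitelist, so .realtor or .ai pass.
    if not parsed.netloc or "." not in parsed.netloc:
        raise ValueError("Invalid domain or URL")

    # Normalize to exactly one trailing slash.
    return raw_url.rstrip("/") + "/"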

That sounds basic, but it matters.

If your validator rejects modern TLDs like .realtor, the crawler never even gets a chance.

If your crawler only tries one URL variant, you miss sites that resolve only on www, redirect weirdly, or respond differently over HTTP/HTTPS.

So the crawler eventually needed fallback attempts:

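def build_url_variants(domain):
    # Strip the scheme and slashes so we start from a bare host.
    clean = domain.strip().replace("https://", "").replace("http://", "").strip("/")

    # Drop a leading "www." so both www and non-www variants are always tried,
    # even when the input already included "www.".
    bare = clean[4:] if clean.startswith("www.") else clean

    variants = [
        f"https://{bare}/",
        f"https://www.{bare}/",
        f"http://{bare}/",
        f"http://www.{bare}/",
    ]

    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(variants))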

The goal was not perfection.

The goal was reliability across messy local business websites.
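
Here is a minimal sketch of how those fallback attempts might be consumed. It assumes a hypothetical `fetch` callable that returns a status code and body text (in practice it could wrap whatever HTTP client the crawler uses). The name `first_working_variant` and the success criteria are illustrative, not the production logic:

```python
def first_working_variant(variants, fetch):
    """Return (url, html) for the first variant that yields usable HTML.

    `fetch` is a hypothetical callable: fetch(url) -> (status_code, body_text).
    It may raise on timeouts or connection errors.
    """
    errors = {}

    for url in variants:
        try:
            status, body = fetch(url)
        except Exception as exc:  # timeouts, DNS failures, TLS errors, etc.
            errors[url] = f"request failed: {exc}"
            continue

        if status != 200:
            errors[url] = f"unexpected status {status}"
            continue

        if not body or "<html" not in body.lower():
            errors[url] = "empty or non-HTML response"
            continue

        return url, body

    # Surface every failure so the report can explain why the crawl was blocked.
    raise RuntimeError(f"No URL variant returned usable HTML: {errors}")
```

Passing the fetcher in as an argument keeps the fallback logic easy to test without touching the network.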


Part 2: Extracting Page-Level Signals

Once a page is fetched, the software extracts the basic signals:

  • title tag
  • meta description
  • H1
  • word count
  • internal links
  • schema presence
  • page type
  • contact/NAP signals
  • Google Maps or GBP evidence
  • review/proof signals
  • service terms
  • location terms

A simplified extraction pass might look like this:

def extract_page_signals(url, soup):
    # `soup` is the parsed HTML document (for example, a BeautifulSoup object).
    title = soup.title.get_text(strip=True) if soup.title else ""
    h1_tag = soup.find("h1")
    h1 = h1_tag.get_text(strip=True) if h1_tag else ""
    body_text = soup.get_text(" ", strip=True)

    # extract_internal_links is a helper defined elsewhere in the crawler.
    internal_links = extract_internal_links(url, soup)
    schema_blocks = soup.find_all("script", type="application/ld+json")

    return {
        "url": url,
        "title": title,
        "h1": h1,
        "word_count": len(body_text.split()),
        "internal_link_count": len(internal_links),
        "schema_present": len(schema_blocks) > 0,
        "body_text": body_text,
    }
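
The `extract_internal_links` helper is not shown in this post. As a rough sketch of what the core of such a helper could do, the function below takes raw `href` values and keeps only same-site links. The name `internal_links_from_hrefs` and the decision to treat www and non-www as the same site are assumptions for illustration:

```python
from urllib.parse import urljoin, urlparse

def internal_links_from_hrefs(page_url, hrefs):
    """Resolve raw href values and keep only links on the same host."""
    def host(url):
        netloc = urlparse(url).netloc.lower()
        return netloc[4:] if netloc.startswith("www.") else netloc

    site_host = host(page_url)
    seen = {}

    for href in hrefs:
        href = (href or "").strip()
        if not href or href.startswith(("#", "mailto:", "tel:", "javascript:")):
            continue

        # Resolve relative links against the page they were found on.
        absolute = urljoin(page_url, href)
        parsed = urlparse(absolute)

        if parsed.scheme not in ("http", "https") or host(absolute) != site_host:
            continue

        # Drop fragments so /services and /services#faq count once.
        seen[parsed._replace(fragment="").geturl()] = None

    return list(seen)
```

With a BeautifulSoup document, the `hrefs` argument could come from `[a["href"] for a in soup.find_all("a", href=True)]`.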

But raw extraction is only the first step.

The bigger challenge is interpretation.

A page with 300 words is not automatically bad.

A page with schema is not automatically strong.

A page that mentions a city is not automatically a location page.

That is where classification came in.


Part 3: Classifying Pages by Visibility Role

The software needed to classify pages by their visibility role.

A page could be:

  • Homepage
  • Service Page
  • Location Page
  • Service-Location Page
  • FAQ Page
  • Contact Page
  • About Page
  • Gallery/Proof Page
  • Blog/Resource Page
  • Supporting Page

This matters because local visibility depends on structure.

A website with ten random pages is different from a website with:

  • one homepage
  • five dedicated service pages
  • three dedicated location pages
  • ten service-location pages
  • pricing page
  • process page
  • FAQ page
  • reviews page

The second site gives Google more evidence.

Not just more content.

More structured evidence.

A simplified classifier might look like this:

def classify_page(url, title, h1, body_text, services, locations):
    url_lower = url.lower()
    text = f"{url} {title} {h1} {body_text}".lower()

    service_matches = [s for s in services if s.lower() in text]
    location_matches = [loc for loc in locations if loc.lower() in text]

    # is_homepage is a helper defined elsewhere (for example, a root-path check).
    if is_homepage(url):
        return "Homepage"

    if "contact" in url_lower:
        return "Contact Page"

    # Check headings, not body text: "questions" appears on too many pages.
    if "faq" in url_lower or "questions" in f"{title} {h1}".lower():
        return "FAQ Page"

    if service_matches and location_matches:
        return "Service-Location Page"

    if service_matches:
        return "Service Page"

    if location_matches:
        return "Location Page"

    return "Supporting Page"

Of course, the real logic needs more nuance.

A city in the footer should not turn a page into a true location page.

A generic services page should not count as a dedicated service page.

A single mention of “kitchen” should not mean the page supports “kitchen renovation Fort Lauderdale.”

So we created stricter status labels:

  • True Dedicated Page
  • Broad Supporting Page
  • Related Supporting Page
  • Mentioned Only
  • Missing
  • Unclear

That distinction matters.

The report should not flatter the website.

It should tell the truth.

If the page does not clearly support the buyer intent, the report should say so.
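
To make the idea of stricter labels concrete, here is a minimal sketch that separates prominent placement (URL, title, H1) from a passing mention in the body. The function name, the `page` dictionary shape, and the rules themselves are assumptions for illustration. It covers only four of the six labels and leaves out "Related Supporting Page" and "Unclear":

```python
def _slug(term):
    # "fort lauderdale" -> "fort-lauderdale", to match URL slugs.
    return term.replace(" ", "-")

def support_status(service, location, page):
    """Label how strongly one page supports a service + location search.

    `page` is a dict with "url", "title", "h1", and "body_text" keys.
    """
    service, location = service.lower(), location.lower()

    # Prominent fields: where a dedicated page would normally name its topic.
    prominent = " ".join([page["url"], page["title"], page["h1"]]).lower()
    body = page["body_text"].lower()

    def named_in(term, text):
        return term in text or _slug(term) in text

    service_prominent = named_in(service, prominent)
    location_prominent = named_in(location, prominent)

    if service_prominent and location_prominent:
        return "True Dedicated Page"
    if service_prominent or location_prominent:
        return "Broad Supporting Page"
    if named_in(service, body) and named_in(location, body):
        return "Mentioned Only"
    return "Missing"
```

Under these rules, a city name that appears only in the footer text can never lift a page above "Mentioned Only".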


Part 4: Turning Missing Pages Into Local Demand Gaps

The next layer was translating missing structure into local demand.

This became the “Local Demand Capture Gaps” section.

Instead of saying:

You are missing service-location pages.

The software shows buyer-intent searches like:

  • handyman Fort Lauderdale
  • general contractor Fort Lauderdale
  • home improvement Fort Lauderdale
  • kitchen renovation Fort Lauderdale
  • bathroom renovation Fort Lauderdale
  • painting contractor Fort Lauderdale

Then it explains why those searches matter.

This is where the report becomes more useful for a business owner.

They can immediately understand:

These are the kinds of searches buyers use when they are ready to call.

The audit is not pretending to be a full keyword research report.

It is saying:

These are real-world search patterns, and your website does not currently provide clear assets to capture them.

That connects the technical gap to revenue.

Missing page becomes missed demand.

Missed demand becomes missed calls.

Missed calls become missed jobs.

That is the language business owners understand.

A simplified service-location matrix might look like this:

def build_missing_asset_matrix(services, locations, crawled_pages):
    matrix = []

    for service in services:
        for location in locations:
            match = find_best_matching_page(service, location, crawled_pages)

            matrix.append({
                "service": service,
                "location": location,
                "status": match.status if match else "Missing",
                "matching_page": match.url if match else None,
            })

    return matrix

The technical output is a matrix.

The business output is a revenue-risk story.

That translation layer is what makes the audit useful.
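
A rough sketch of that translation step, assuming matrix rows shaped like the ones built above. The function name `describe_gap` and the wording of each message are illustrative, not the report's actual copy:

```python
def describe_gap(row):
    """Turn one matrix row into a buyer-intent query and an owner-facing sentence."""
    query = f"{row['service']} {row['location']}".lower()

    if row["status"] == "Missing":
        message = (
            f'No page clearly supports "{query}", so buyers using that search '
            "may be more likely to find a competitor with a dedicated page."
        )
    elif row["status"] == "True Dedicated Page":
        message = f'"{query}" is supported by a dedicated page: {row["matching_page"]}'
    else:
        message = (
            f'"{query}" is only weakly supported ({row["status"]}) by '
            f'{row["matching_page"]}.'
        )

    return {"query": query, "message": message}
```

The hedged wording ("may be more likely") is deliberate. The matrix shows a missing asset, not a measured loss.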


Part 5: Google Business Profile Landing-Page Alignment

For local businesses, the Google Business Profile is often one of the most important visibility assets.

But most audits either ignore it or overstate what they can check.

We decided to check something narrow and useful:

Does the Google Business Profile website link point to the expected URL?

That is a link alignment check.

It does not mean the Google Business Profile is optimized.

It does not mean the categories are correct.

It does not mean the services are built out.

It does not mean the reviews support the target services.

It simply means:

The GBP points to this page.

Then the report asks a more important question:

Is that landing page strong enough?

Many businesses point their Google Business Profile to a generic homepage.

That can be okay if the homepage is strong.

But if the homepage lacks local proof, clear service signals, contact evidence, schema, review support, map evidence, and links to service/location pages, the GBP is pointing to a weak local visibility asset.

That led to one of the report’s plain-English lines:

Google Business Profiles pointing to a generic homepage often make Google rely on weaker local/service signals.

That is the kind of sentence a business owner can understand.
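
The narrow link alignment check described above could be sketched like this. The function name and the normalization rules (ignore scheme, `www.`, trailing slash, and query string) are assumptions. The query string is ignored because profile links may carry tracking parameters:

```python
from urllib.parse import urlparse

def gbp_link_matches(gbp_url, expected_url):
    """Compare two URLs while ignoring scheme, www, trailing slash, and query string."""
    def canonical(url):
        parsed = urlparse(url.strip())
        host = parsed.netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        # Treat "" and "/" as the same root path.
        path = parsed.path.rstrip("/") or "/"
        return host, path

    return canonical(gbp_url) == canonical(expected_url)
```

A `True` result only confirms where the profile points. Whether that page is strong enough is a separate question for the rest of the audit.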


Part 6: Competitor Contrast

The competitor section changed the entire feel of the report.

Before competitor contrast, the audit could show:

You have a weak structure.

Useful, but not urgent enough.

After competitor contrast, it could show:

Here are the businesses Google is surfacing ahead of you.

That creates a different kind of pain.

The software allows us to enter top map competitors and compare them against the target business.

The comparison looks at:

  • observed map rank
  • site footprint
  • service pages
  • location pages
  • service-location evidence
  • local proof
  • schema
  • overall visibility picture

The goal is not to run a full competitor audit.

The goal is to show the business owner why competitors may be easier for Google to surface.

One key addition was site footprint.

If the target business has 6 pages and a competitor has 50+ pages, that tells a story.

It does not mean the competitor has better SEO automatically.

But it does show that the competitor may have built more evidence.

More pages do not always mean better visibility.

But a larger, better-structured site often gives Google more information about services, locations, proof, and relevance.

The table puts competitors first, then the target business.

That matters psychologically.

The report reads like:

Here are the map winners. Here is where you are.

Then it explains:

The problem is not that competitors have perfect websites. The problem is that even imperfect competitors are still easier for Google to surface in the map results today.

That line became one of the strongest parts of the report.
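
As a small sketch of the "competitors first, then the target" ordering, assuming each business is a dictionary with hypothetical `name`, `map_rank`, and `page_count` fields:

```python
def build_contrast_rows(target, competitors):
    """Order rows for the contrast table: competitors by map rank, target last."""
    ranked = sorted(competitors, key=lambda c: c["map_rank"])
    rows = []

    for business in ranked + [target]:
        rows.append({
            "name": business["name"],
            # The target may not appear in the observed map results at all.
            "map_rank": business.get("map_rank") or "Not observed",
            "page_count": business["page_count"],
            "is_target": business is target,
        })

    return rows
```

Keeping `page_count` in every row is what lets the table show a 6-page site next to a 50-page competitor without any extra commentary.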


Part 7: Schema and Structured Data

The software also checks schema markup and structured data.

Schema helps search engines understand what kind of entity they are looking at.

A local business website may include:

  • Organization schema
  • LocalBusiness schema
  • Service schema
  • FAQPage schema
  • Review schema
  • Person schema
  • BreadcrumbList schema
  • WebSite schema

A simplified detector might look like this:

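import json

def detect_schema_types(soup):
    schema_types = set()

    scripts = soup.find_all("script", type="application/ld+json")

    for script in scripts:
        try:
            data = json.loads(script.string or "{}")
        except json.JSONDecodeError:
            # Skip malformed JSON-LD instead of aborting the whole page.
            continue

        items = data if isinstance(data, list) else [data]

        for item in items:
            if not isinstance(item, dict):
                continue

            # Some sites nest their entities under a top-level "@graph" array.
            graph = item.get("@graph")
            nodes = graph if isinstance(graph, list) else [item]

            for node in nodes:
                if not isinstance(node, dict):
                    continue

                schema_type = node.get("@type")
                if isinstance(schema_type, list):
                    schema_types.update(schema_type)
                elif schema_type:
                    schema_types.add(schema_type)

    return list(schema_types)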

But we had to be careful not to overvalue schema.

Schema is useful, but it is not magic.

A website can have schema on every page and still have weak content.

A site can technically pass a schema check while still failing to clearly explain services, locations, pricing, process, proof, and customer fit.

So the report treats schema as one layer of understanding.

Not the whole story.

The deeper question is:

Does the website give search engines and AI systems enough structured explanation to understand and recommend the business?

That led to the next section.


Part 8: AI Understanding Infrastructure

This is the part I think will matter more over time.

Most businesses do not have a single clear source of truth about the company.

They have scattered information:

  • homepage
  • about page
  • service page
  • reviews
  • social profiles
  • directory listings
  • maybe a few blog posts

But there is usually no structured knowledge layer that explains:

  • what the company is
  • what it does
  • where it operates
  • who it serves
  • what problems it solves
  • what the process looks like
  • what pricing factors exist
  • what proof supports it
  • who the owner or team is
  • how services connect to locations
  • how all the business entities relate

That matters because AI systems need a coherent explanation.

They are not just ranking pages.

They are generating answers.

They are summarizing companies.

They are deciding which businesses to mention, compare, or recommend.

So we added an AI Understanding Infrastructure diagnostic.

It checks for:

  • company/entity explanation
  • service catalog clarity
  • location catalog clarity
  • service-location relationships
  • pricing/cost explanation
  • process explanation
  • FAQs/customer questions
  • proof/project evidence
  • structured data depth
  • entity/sameAs links
  • knowledge/resource layer
  • AI Knowledge Catalog or subdomain

This section does not just ask:

Does the site have content?

It asks:

Does the site teach Google and AI what this business is?

That is a more important question.
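
One crude way to sketch part of that diagnostic is to map each check to the page roles from Part 3 that could satisfy it. The mapping below is hypothetical and covers only the checks that page roles can approximate. Page presence is a proxy, not proof that the page explains anything well:

```python
# Hypothetical mapping from diagnostic checks to page roles that could satisfy them.
CHECK_TO_PAGE_TYPES = {
    "company/entity explanation": {"About Page"},
    "service catalog clarity": {"Service Page"},
    "location catalog clarity": {"Location Page"},
    "service-location relationships": {"Service-Location Page"},
    "FAQs/customer questions": {"FAQ Page"},
    "proof/project evidence": {"Gallery/Proof Page"},
    "knowledge/resource layer": {"Blog/Resource Page"},
}

def ai_understanding_checklist(page_types_found):
    """Report which checks have at least one page of a matching role."""
    found = set(page_types_found)

    results = {
        check: bool(accepted & found)
        for check, accepted in CHECK_TO_PAGE_TYPES.items()
    }
    missing = [check for check, ok in results.items() if not ok]

    return {"results": results, "missing": missing}
```

The `missing` list is the useful output here, since each entry is something Google and AI systems still have to guess.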


Part 9: The AI Knowledge Catalog Idea

One of the recommendations that came out of this work is what we call an AI Knowledge Catalog.

This is not a blog.

A blog is chronological.

A knowledge catalog is structural.

A blog says:

Here are posts we published over time.

A knowledge catalog says:

Here is what this company is, what it does, who it helps, where it operates, what proof supports it, and how all of those pieces connect.

The main website can remain conversion-focused.

The knowledge catalog can be understanding-focused.

It could live on a subdomain like:

  • knowledge.company.com
  • data.company.com
  • answers.company.com

The purpose is not to replace the main website.

The purpose is to give search engines and AI systems a clean, crawlable company reference layer.

For example, a knowledge catalog might include:

  • company/entity page
  • service catalog
  • location catalog
  • service-location relationship pages
  • pricing factor pages
  • process pages
  • FAQ catalog
  • proof/case study catalog
  • founder/team pages
  • glossary
  • sameAs/entity links
  • schema markup
  • sitemap
  • internal linking map

Most businesses have never heard of this.

That is exactly why it is interesting.

For years, businesses were told to publish blogs.

But in the AI search era, random blog posts may be less useful than a structured company knowledge layer.

AI cannot recommend what it does not understand.

The knowledge catalog is designed to reduce guessing.
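
As an illustration of what a catalog's company/entity page might expose, here is a minimal sketch that builds JSON-LD using schema.org's `LocalBusiness`, `areaServed`, and `sameAs` vocabulary. The business name, URLs, and the helper name `build_entity_jsonld` are placeholders, and a real entity page would carry far more detail:

```python
import json

def build_entity_jsonld(name, url, description, cities, profile_urls):
    """Build a small schema.org LocalBusiness description as a Python dict."""
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "url": url,
        "description": description,
        # Where the business operates, one City entry per service area.
        "areaServed": [{"@type": "City", "name": city} for city in cities],
        # Profiles on other sites that describe the same entity.
        "sameAs": list(profile_urls),
    }

markup = build_entity_jsonld(
    name="Example Renovations",
    url="https://knowledge.example.com/",
    description="Kitchen and bathroom renovation contractor.",
    cities=["Fort Lauderdale"],
    profile_urls=["https://www.facebook.com/example-renovations"],
)

# This string would go inside a <script type="application/ld+json"> tag.
print(json.dumps(markup, indent=2))
```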


Part 10: Making the Report Useful

The hardest part was making the report readable.

It is easy to generate data.

It is harder to create clarity.

The report had to answer the business owner’s real questions:

  • Is my website broken?
  • Are we likely losing customers?
  • What is our biggest visibility gap?
  • Do we need a new website?
  • What should we fix first?
  • Why are competitors showing up?
  • Why does this matter now?

So the report structure became:

  1. Headline scores
  2. Owner summary
  3. Plain-English diagnosis
  4. Revenue risk summary
  5. Local demand capture gaps
  6. Local map visibility snapshot
  7. Competitor visibility contrast
  8. Top problems costing visibility
  9. What Google/AI sees vs. what it still has to guess
  10. AI Understanding Infrastructure
  11. Google & AI Visibility Scorecard
  12. 90-day roadmap
  13. Expected business impact
  14. Technical appendix

The technical appendix still matters.

It contains page inventories, schema checks, crawl evidence, authority context, and detailed diagnostics.

But the owner does not need to start there.

The owner needs the story first.

Then the evidence.

That was a major design principle.


Part 11: Scoring Without Overclaiming

Scoring is tricky.

A score creates urgency.

But a bad score can destroy trust if it feels arbitrary.

So we made the scoring painful but explainable.

The report evaluates areas like:

  • homepage foundation
  • website structure/content
  • service page coverage
  • location page coverage
  • service-location coverage
  • thin page risk
  • schema coverage
  • internal linking
  • authority/search context

We also added caps so weak sites do not accidentally score too high.

For example, if a site has no schema, no search visibility, no service-location assets, and only a handful of thin pages, it should not receive a comfortable score.

A site with six pages, no organic keywords, no organic traffic, no true location pages, no service-location pages, and thin content should feel urgent.

Not exaggerated.

Urgent.

The label matters too.

A score like 35/100 labeled "Severe local visibility gap" is much clearer than a vague "needs improvement."

Business owners need to feel the size of the gap.

But they also need to believe it is fixable.

That balance became central to the report.
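
Here is a minimal sketch of weighted scoring with caps. The weights, cap values, thresholds, and every label except "Severe local visibility gap" are invented for illustration and are not the audit's real numbers:

```python
# Hypothetical weights (percentage points, summing to 100) for each scored area.
WEIGHTS = {
    "homepage foundation": 15,
    "website structure/content": 10,
    "service page coverage": 15,
    "location page coverage": 10,
    "service-location coverage": 15,
    "thin page risk": 10,
    "schema coverage": 10,
    "internal linking": 5,
    "authority/search context": 10,
}

def visibility_score(area_scores, facts):
    """Combine 0-100 area scores, then apply caps so weak sites cannot score high."""
    raw = sum(weight * area_scores.get(area, 0) for area, weight in WEIGHTS.items()) / 100

    cap = 100
    if not facts.get("has_schema"):
        cap = min(cap, 60)
    if not facts.get("has_service_location_pages"):
        cap = min(cap, 50)
    if facts.get("page_count", 0) <= 6 and not facts.get("has_organic_traffic"):
        cap = min(cap, 35)

    score = round(min(raw, cap))

    if score < 40:
        label = "Severe local visibility gap"
    elif score < 70:
        label = "Significant visibility gap"
    else:
        label = "Competitive foundation"

    return score, label
```

Because each cap is tied to a named fact, the report can state exactly why a score was held down, which keeps a painful number explainable.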


Part 12: The 90-Day Roadmap

The report does not stop at diagnosis.

It gives a practical roadmap.

A typical roadmap might include:

Days 1–30: Homepage and GBP Landing Page Strengthening

Clarify title tags, H1s, metadata, primary service, primary location, contact signals, local proof, map evidence, schema, and internal links.

Days 31–60: Build Service-Location Assets

Create dedicated pages for high-value service-location combinations.

Add localized proof, FAQs, internal links, and structured data.

Days 61–90: Expand Proof, Schema, and AI Understanding

Add project evidence, customer questions, process explanations, pricing factors, local proof, schema, and potentially an AI Knowledge Catalog.

That roadmap is important because it turns the report from criticism into a plan.

The business owner does not just learn what is wrong.

They see what to do next.


Lessons Learned

After building and testing this software, a few lessons stood out.

1. Business owners do not need more jargon.

They need translation.

A missing location page is not just a missing page.

It is a missing local visibility asset.

Thin content is not just a content issue.

It is weak evidence.

A generic homepage is not just a design choice.

It may be the page Google is relying on to understand the entire business.

2. Competitor contrast creates urgency.

A score is useful.

But seeing named competitors at #1 and #2 while the prospect is buried creates a different level of clarity.

It turns the report from theoretical to real.

3. AI visibility is really an understanding problem.

People talk about AI search like it is a completely separate thing.

But the foundation is similar:

Can the system understand the business?

Can it classify the entity?

Can it connect services, locations, proof, and authority?

Can it recommend the business without guessing?

4. More content is not the answer.

Better structured evidence is the answer.

Random blogs are not enough.

A company needs clear service pages, location pages, proof pages, FAQs, process pages, pricing explanations, schema, and internal links.

5. A good-looking website can still be underbuilt.

This may be the most important lesson.

Design and visibility are not the same thing.

A website can look beautiful and still fail to explain the business in a way Google and AI can use.


The Outcome

The final software gives business owners a report that explains how their company appears to Google, Google Maps, and AI search systems.

It shows:

  • where local demand may be leaking
  • why competitors may be easier to recommend
  • what pages are missing
  • where the homepage is weak
  • whether the Google Business Profile points to a strong landing page
  • whether the site has enough structured data
  • whether the company has an AI understanding layer
  • what should be fixed first

That is the benefit.

Not a generic SEO audit.

A visibility infrastructure diagnosis.

The goal is not to scare business owners.

The goal is to show them what they have never been able to see before.

Why they are not getting found.

Why competitors may be easier to understand.

Why Google and AI still have to guess.

And what can be built to close the gap.

At Firm IQ, we are making these Google & AI Visibility Infrastructure Audits available for a limited time.

You can request one here:

https://firmiq.io/free-search-ai-audit/

You can also view an example report here:

https://drive.google.com/file/d/12CbK11O35tYAw9SIRCWKXY-o7zUwCp7f/view?usp=sharing

The search problem is changing.

It is no longer just:

Can your website rank?

It is becoming:

Do Google and AI understand your business well enough to recommend it?

That is the problem this software was built to show.
