Mike

Posted on Jun 4 • Originally published at brandswarm.io

Schema markup for AI search: the complete 2026 reference

#ai #webdev #seo #tutorial

Originally published at brandswarm.io/blog/schema-markup-for-ai-search/.

AI engines retrieve structured pages more reliably than unstructured ones. Not
in the abstract — measurably, in side-by-side tests we and others have run. A
product page with proper SoftwareApplication JSON-LD gets cited in
AI Overviews and Perplexity 2–3× more often than the same page without it. A
documentation page with FAQPage gets snippet-quoted by ChatGPT
directly. The marginal effort is 30 minutes; the marginal value is large.

This is the reference. We've put together the four Schema.org types that
actually matter for AI search in 2026, with copy-pasteable JSON-LD, the
validation step that catches 80% of mistakes, and the three structured-data
patterns that quietly tank visibility even when they look right.

The 4 schemas that matter

Of the 800+ types in the Schema.org vocabulary, four cover almost every situation a B2B or B2C brand cares about for AI search:

Organization — your identity. Goes on the homepage; usually one per site.
SoftwareApplication or Product — what you sell. Goes on product pages and pricing.
FAQPage — question-answer content. Goes on docs, billing FAQ, help pages.
HowTo — step-by-step instructions. Goes on tutorials and onboarding guides.

Two more sometimes worth adding: Article / BlogPosting
(for blog posts; reasonable but smaller AI-retrieval lift than the above) and
BreadcrumbList (small lift, near-zero effort if your URLs are clean).
Skip the rest for now.

1. Organization schema (every site needs this)

The Organization block is doing one specific job: connecting your
domain to your other web presences so AI engines can build a coherent identity
graph. The sameAs array is what does the connecting.

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Brandswarm",
  "alternateName": "Brandswarm.io",
  "url": "https://brandswarm.io",
  "logo": "https://brandswarm.io/static/logo/brandswarm-wordmark-dark-md.png",
  "description": "AI brand visibility tracking across ChatGPT, Claude, Perplexity, Gemini, and AI Overviews.",
  "foundingDate": "2024-09-01",
  "email": "hello@brandswarm.io",
  "sameAs": [
    "https://twitter.com/brandswarm",
    "https://linkedin.com/company/brandswarm",
    "https://github.com/brandswarm",
    "https://www.crunchbase.com/organization/brandswarm",
    "https://en.wikipedia.org/wiki/Brandswarm"
  ]
}

What to omit: aspirational sameAs URLs that don't exist yet. Validators such
as Google's Rich Results Test check the markup, not whether each sameAs
target resolves, so a dead link passes validation and can still undermine
trust in the whole block. Only list what's live.

Where to put it: in the <head> of your
homepage as <script type="application/ld+json">. Repeating
it on every page is allowed but unnecessary — the homepage is enough.
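If your pages come out of a template or build script, the tag can be generated rather than hand-pasted. Here is a minimal Python sketch, assuming you hold the block as a plain dict; the helper name jsonld_script_tag is ours, not part of any library.

```python
import json


def jsonld_script_tag(data: dict) -> str:
    """Serialize a dict as a JSON-LD <script> tag for the page <head>."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value can never close the <script> element early.
    # "\/" is a valid JSON escape for "/", so parsers read the same value back.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Trimmed-down version of the Organization block above.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Brandswarm",
    "url": "https://brandswarm.io",
    "sameAs": ["https://github.com/brandswarm"],
}

print(jsonld_script_tag(organization))
```

Generating the tag from data keeps the JSON syntactically valid by construction, which removes the most common class of hand-editing mistakes (trailing commas, unescaped quotes).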

2. SoftwareApplication / Product schema

This block tells AI engines what you sell, what category, and roughly what it
costs. Goes on the product page and pricing page.

{
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  "name": "Brandswarm",
  "description": "Track how AI assistants describe your brand. 5 surfaces, daily monitoring.",
  "applicationCategory": "BusinessApplication",
  "applicationSubCategory": "AI Brand Monitoring",
  "operatingSystem": "Web",
  "url": "https://brandswarm.io",
  "offers": [
    {"@type": "Offer", "name": "Starter", "price": "49", "priceCurrency": "USD"},
    {"@type": "Offer", "name": "Growth",  "price": "149", "priceCurrency": "USD"},
    {"@type": "Offer", "name": "Enterprise", "price": "399", "priceCurrency": "USD"}
  ],
  "creator": {"@type": "Organization", "name": "Brandswarm", "url": "https://brandswarm.io/"}
}

About aggregateRating: if you've read the spec, you've seen aggregateRating
blocks with star averages. Don't add one unless the rating comes from a real
source (Trustpilot, G2, etc.) with an actual review count behind it. Google's
structured data guidelines prohibit fabricated or misleading ratings, and a
violation can cost the page its rich results. The risk-to-reward ratio is bad.

For B2C / physical products: use Product instead of SoftwareApplication. The
structure is the same: change @type to Product and add brand, sku, gtin, and
image.
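For a physical product, the block might be assembled like this. It is a sketch only: the function name and every value in the example call (product name, SKU, GTIN, image URL, price) are placeholders we made up for illustration, not real catalog data.

```python
import json


def product_jsonld(name, brand, sku, gtin, image, price, currency="USD"):
    """Build a minimal Product block with the fields named above."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "brand": {"@type": "Brand", "name": brand},
        "sku": sku,
        "gtin": gtin,
        "image": image,
        "offers": {"@type": "Offer", "price": price, "priceCurrency": currency},
    }


# Placeholder values, purely illustrative.
example = product_jsonld(
    name="Example Trail Mug",
    brand="Example Brand",
    sku="MUG-001",
    gtin="00012345678905",
    image="https://example.com/images/mug.png",
    price="19",
)
print(json.dumps(example, indent=2))
```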

3. FAQPage schema (the highest-lift block per page)

AI Overviews quote FAQ schemas directly. Perplexity cites them. ChatGPT
retrieves them when answering similar questions. Pound for pound, the
best-converting structured-data type for AI visibility.

{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What does Brandswarm track?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Brandswarm tracks how your brand appears in ChatGPT, Claude, Perplexity, Gemini, and AI Overviews. For each surface we capture mention rate, position, sentiment, sample quotes, and citation sources."
      }
    },
    {
      "@type": "Question",
      "name": "Do I need a credit card to start?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "No. The free instant scan runs without signup or card. Paid trial also starts without a card."
      }
    }
  ]
}

What works: 3–10 question/answer pairs per page. Each
question should be one a real user would type. Each answer should be a
complete, standalone sentence (50–250 chars is the sweet spot — long enough
to be useful, short enough to be quotable).
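These rules of thumb are easy to check mechanically before you ship. Below is a rough lint, assuming the FAQPage block is already loaded as a Python dict; lint_faq is a hypothetical helper, and the thresholds are simply the numbers from the paragraph above.

```python
def lint_faq(faq: dict, min_pairs=3, max_pairs=10,
             min_chars=50, max_chars=250) -> list[str]:
    """Return human-readable warnings for a FAQPage block (empty list = fine)."""
    problems = []
    questions = faq.get("mainEntity", [])

    # Check the number of question/answer pairs.
    if not min_pairs <= len(questions) <= max_pairs:
        problems.append(
            f"{len(questions)} Q&A pairs (expected {min_pairs}-{max_pairs})"
        )

    # Check that each answer falls inside the quotable length range.
    for question in questions:
        name = question.get("name", "")
        answer = question.get("acceptedAnswer", {}).get("text", "")
        if not min_chars <= len(answer) <= max_chars:
            problems.append(
                f"{name!r}: answer is {len(answer)} chars "
                f"(expected {min_chars}-{max_chars})"
            )
    return problems
```

Treat the output as warnings rather than hard failures: a 260-character answer that reads well is better than one trimmed to fit.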

What doesn't: stuffing 30 Q&As to "win more snippets." Google sharply cut
back FAQ rich results in 2023, and AI engines appear to discount padded FAQ
blocks too. Quality > quantity.

4. HowTo schema (the most under-used)

Tutorial pages with HowTo schema get retrieved disproportionately
by Gemini and AI Overviews because the structure maps cleanly to a step-by-step
answer format. If you have a "getting started" doc or onboarding guide, this
is your easiest win.

{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "Get your first AI brand-visibility report",
  "description": "Track your brand in ChatGPT, Claude, Perplexity, Gemini, and AI Overviews in under 5 minutes.",
  "totalTime": "PT5M",
  "step": [
    {"@type": "HowToStep", "position": 1, "name": "Run an instant report",
     "text": "Visit brandswarm.io/scan/, enter your domain. Free, no signup."},
    {"@type": "HowToStep", "position": 2, "name": "Claim the report",
     "text": "Enter your email to save the scan and unlock the full report."},
    {"@type": "HowToStep", "position": 3, "name": "Add your prompts",
     "text": "Add the competitive queries your buyers actually search."}
  ]
}

Keep each text field a single, instructive sentence. Avoid links
inside step text (they don't help; sometimes they hurt). Include
image URLs if you have step illustrations — both Google and
Perplexity surface them.
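If your tutorial steps already live in a CMS or a list, the HowTo block can be built from them so positions never drift out of order. A minimal sketch, assuming steps arrive as (name, text) pairs; build_howto and the link pattern are our own illustration, and the duration is expressed in whole minutes for simplicity.

```python
import re

# Rough heuristic for "this step text contains a link".
LINK_PATTERN = re.compile(r"https?://|<a\s", re.IGNORECASE)


def build_howto(name, description, minutes, steps):
    """Build a HowTo block; steps is an ordered list of (name, text) tuples."""
    for step_name, text in steps:
        if LINK_PATTERN.search(text):
            raise ValueError(f"step {step_name!r} contains a link in its text")
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "description": description,
        # ISO 8601 duration, e.g. PT5M for five minutes.
        "totalTime": f"PT{minutes}M",
        "step": [
            {"@type": "HowToStep", "position": position, "name": step_name, "text": text}
            for position, (step_name, text) in enumerate(steps, start=1)
        ],
    }


howto = build_howto(
    "Get your first AI brand-visibility report",
    "Track your brand in AI search in under 5 minutes.",
    5,
    [
        ("Run an instant report", "Visit brandswarm.io/scan/, enter your domain."),
        ("Claim the report", "Enter your email to save the scan."),
    ],
)
```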

The 3 structured-data mistakes that quietly tank visibility

Mistake #1: Mismatched URL in `url` vs canonical

Your JSON-LD says "url": "https://www.brandswarm.io/" and your
canonical link tag says https://brandswarm.io/. Or vice versa.
Google penalizes the inconsistency; AI engines silently discount the page.
Fix: keep the URL identical to the canonical, including the trailing slash
policy.
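This mismatch can be caught with a few lines of standard-library Python. The sketch below assumes you already have the page HTML as a string and only compares the top-level url of each JSON-LD block against the canonical link; the class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser


class _HeadScanner(HTMLParser):
    """Collects the canonical href and the raw text of every JSON-LD script."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.jsonld_blocks = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.jsonld_blocks.append("".join(self._buffer))
            self._in_jsonld = False


def url_mismatches(html: str) -> list[str]:
    """Report JSON-LD blocks whose top-level url differs from the canonical."""
    scanner = _HeadScanner()
    scanner.feed(html)
    mismatches = []
    for raw in scanner.jsonld_blocks:
        data = json.loads(raw)
        url = data.get("url") if isinstance(data, dict) else None
        if url and scanner.canonical and url != scanner.canonical:
            mismatches.append(
                f"JSON-LD url {url!r} != canonical {scanner.canonical!r}"
            )
    return mismatches
```

The comparison is a strict string match on purpose, so a www prefix or a missing trailing slash shows up as a mismatch.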

Mistake #2: `sameAs` URLs that 404

You listed https://twitter.com/brandswarm in sameAs
but the handle is actually @brandswarm_io, or the LinkedIn page
was deleted, or the GitHub org never existed. Each broken URL nudges your
block toward "untrusted." Audit quarterly. Validators check the markup, not
the link targets, so request each sameAs URL yourself and confirm it resolves.
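A link audit is easy to script. Here is a minimal sketch using only the standard library; http_status and broken_same_as are hypothetical helpers, and the status fetcher is passed in as a parameter so the logic can be exercised without network access. It assumes sameAs is a list.

```python
import urllib.error
import urllib.request


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status for url, or 0 if the request failed."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "sameas-audit/0.1"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # e.g. 404 or 410
    except OSError:
        return 0  # DNS failure, timeout, refused connection


def broken_same_as(block: dict, fetch_status=http_status) -> list[str]:
    """List the sameAs URLs that do not resolve to a 2xx or 3xx status."""
    return [
        url
        for url in block.get("sameAs", [])
        if not 200 <= fetch_status(url) < 400
    ]
```

Some social sites reject HEAD requests or automated clients outright, so treat anything this flags as a candidate for a manual check rather than a confirmed dead link.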

Mistake #3: FAQ schema where the visible page doesn't show the questions

You put FAQPage JSON-LD on a page where the questions and
answers aren't visibly rendered (they're hidden in an accordion that loads
via JS, or they're in the schema but not on the page at all). Google's spam
team called this out specifically in late 2023: schema content must match
what users actually see. AI engines now check the same way. Fix: render the
Q/A pairs as visible <h3> + <p> on the
page, then mirror in JSON-LD.
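You can approximate this check yourself against the HTML your server actually returns. The sketch below is an assumption-laden heuristic: it treats any text outside script, style, template, and noscript elements as visible, so it catches questions that exist only in the schema or are injected later by JS, but it cannot see CSS-hidden content. The names are ours.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collects text that is not inside script/style/template/noscript."""

    HIDDEN = {"script", "style", "template", "noscript"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._hidden_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.HIDDEN:
            self._hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in self.HIDDEN and self._hidden_depth:
            self._hidden_depth -= 1

    def handle_data(self, data):
        if not self._hidden_depth:
            self.chunks.append(data)


def _normalize(text: str) -> str:
    # Collapse whitespace and ignore case so formatting differences don't matter.
    return " ".join(text.split()).lower()


def missing_from_page(html: str, faq: dict) -> list[str]:
    """Return FAQ questions from the schema that don't appear in the page text."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    page_text = _normalize(" ".join(parser.chunks))
    return [
        question["name"]
        for question in faq.get("mainEntity", [])
        if _normalize(question.get("name", "")) not in page_text
    ]
```

Run it on the raw server response, not on a browser-rendered DOM, since the point is to see what is on the page before any script executes.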

The validation step that catches 80% of mistakes

Two free tools, in this order:

1. Google's Rich Results Test (search.google.com/test/rich-results): paste a URL or the raw JSON-LD. Catches syntax errors and missing required fields.
2. Schema.org's own validator (validator.schema.org): stricter than Google's; catches type mismatches that Google tolerates.

Run both before shipping. Then sample 5–10 of your most important URLs in
Google Search Console under Enhancements → check that the structured data
types you expect are reported as discovered. If they're not, Google didn't
recognize what you shipped — usually a syntax issue.
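Since syntax is the usual culprit, a local pre-check before you even open the validators can save a round trip. This is a minimal sketch, not a replacement for either tool: precheck_jsonld is a hypothetical helper that only confirms the block parses as JSON and carries @context and @type.

```python
import json


def precheck_jsonld(raw: str) -> list[str]:
    """Cheap local checks on a raw JSON-LD string (empty list = passed)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as error:
        return [
            f"invalid JSON at line {error.lineno}, "
            f"column {error.colno}: {error.msg}"
        ]

    # A script tag may hold a single object or an array of objects.
    blocks = data if isinstance(data, list) else [data]
    problems = []
    for index, block in enumerate(blocks):
        if not isinstance(block, dict):
            problems.append(f"block {index}: expected an object")
            continue
        for key in ("@context", "@type"):
            if key not in block:
                problems.append(f"block {index}: missing {key}")
    return problems
```

A trailing comma after the last property is the classic failure this catches: browsers and validators reject it, but it is easy to miss by eye.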

What you don't need

Speakable — niche, voice-search only, AI engines don't use.
VideoObject on every page with embedded video — only useful if the video is the page's main content.
Three different schemas on the same page just to be thorough — pick the most specific type that matches the page's main content. Multiple non-conflicting types are fine, but redundant ones add noise.
Schema generators that produce 200-line blocks — most fields are optional. Ship the minimum and add only when you have content to fill it.

Quick checklist

If you do nothing else, do this in this order:

1. Add Organization with a complete sameAs array to your homepage.
2. Add SoftwareApplication (or Product) to your product/pricing page.
3. Add FAQPage to any page with question-answer content (billing FAQ, docs, support).
4. Add HowTo to your getting-started or onboarding guide.
5. Validate everything in Google's Rich Results Test before shipping.
6. Re-validate quarterly; sameAs URLs especially go stale.

This is two hours of work for most sites. The retrieval-layer lift you get is
measurable within a month. If you want to see where structured data is moving
your AI visibility specifically, the free Brandswarm scan flags structured-data
gaps per surface as part of the recommendations engine.
