Google quietly retired the Structured Data Testing Tool a while back, and its replacement, the Rich Results Test, is a manual web form: rate-limited, one page at a time, and in practice pointed at URLs that are already live. So the usual workflow is "ship the page, then go find out the JSON-LD was broken." That is backwards.
I wanted the check to live where every other quality gate lives: in CI, on every push, failing the build before the broken markup reaches production. I couldn't find a clean zero-dependency tool for that, so I wrote one and open-sourced it.
What it does
structured-data-action validates the JSON-LD / Schema.org structured data on a page (or in a file, or a raw snippet) and fails the job when a required rich-result property is missing.
- One Python file, standard library only (3.8+). No pip install, no node_modules.
- Validates 20+ types Google uses for rich results: Article, FAQPage, Product, Offer, Organization, LocalBusiness, BreadcrumbList, HowTo, Review, Recipe, Event, VideoObject, JobPosting, WebSite, and more.
- Errors vs. warnings. Missing a required prop = error (the rich result is broken). Missing a recommended prop = warning (eligible, but weaker).
- Extracts every <script type="application/ld+json"> block on the page, including objects nested inside @graph.
- CI-native: GitHub annotations on the exact file, a job-summary table, and meaningful exit codes.
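To make the extraction step concrete, here is a minimal standard-library sketch of how JSON-LD blocks could be pulled out of a page and flattened, including objects inside @graph. It is not the action's actual code: `JsonLdExtractor`, `iter_typed_objects`, and `extract_objects` are hypothetical names chosen for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        # The type attribute may be absent or valueless, hence the "or ''".
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def iter_typed_objects(node):
    """Yield every dict that has an @type, descending into @graph and nested values."""
    if isinstance(node, list):
        for item in node:
            yield from iter_typed_objects(item)
    elif isinstance(node, dict):
        if "@type" in node:
            yield node
        for value in node.values():
            yield from iter_typed_objects(value)


def extract_objects(html):
    """Return all typed JSON-LD objects found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    objects = []
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # assumption: a real validator would report this instead
        objects.extend(iter_typed_objects(data))
    return objects
```

Walking every nested value, rather than only the top level, is what lets a nested Offer inside a Product be checked on its own.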
Drop it in a workflow
name: SEO checks
on: [push, pull_request]
jobs:
  structured-data:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: atlashey-collab/structured-data-action@v1
        with:
          target: 'dist/**/*.html'    # a URL, an HTML file, or a glob
          fail-on-error: 'true'       # break the build if required props are missing
          fail-on-warning: 'false'    # flip to true to also fail on missing recommended props
Point target at a live URL instead of a glob and it'll fetch and check that page directly.
Or just run it locally
python3 validate_schema.py https://example.com
python3 validate_schema.py "dist/**/*.html" --fail-on-error
python3 validate_schema.py snippet.json --json
• https://example.com/product
✓ Product / merchant listing — complete
• Offer
✗ missing required: priceCurrency
⚠ missing recommended: availability
✓ FAQ rich result — complete
Summary: 1 required (error) and 1 recommended (warning) properties missing.
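For a sense of how findings like these could surface in CI, here is a small sketch that formats a finding as a GitHub Actions workflow command and derives an exit code from the fail-on-error and fail-on-warning settings. The `::error file=...::message` syntax is GitHub's documented annotation format; the function names, the finding tuple shape, and the choice of exit code 1 are assumptions for illustration, not the action's real internals.

```python
def _escape_data(text):
    # Workflow command messages need %, CR, and LF percent-encoded.
    return text.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def _escape_property(text):
    # Property values additionally need ':' and ',' encoded.
    return _escape_data(text).replace(":", "%3A").replace(",", "%2C")


def to_annotation(level, path, message):
    """Format one finding ('error' or 'warning') as a workflow command line."""
    return f"::{level} file={_escape_property(path)}::{_escape_data(message)}"


def exit_code(findings, fail_on_error=True, fail_on_warning=False):
    """Return 1 if the findings should fail the job, else 0 (hypothetical mapping)."""
    levels = {level for level, _path, _message in findings}
    if fail_on_error and "error" in levels:
        return 1
    if fail_on_warning and "warning" in levels:
        return 1
    return 0


if __name__ == "__main__":
    findings = [
        ("error", "dist/product.html", "Offer: missing required: priceCurrency"),
        ("warning", "dist/product.html", "Offer: missing recommended: availability"),
    ]
    for level, path, message in findings:
        print(to_annotation(level, path, message))
    raise SystemExit(exit_code(findings))
```

Printing these lines to stdout inside a GitHub Actions job is enough for the runner to attach them to the named file.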
How it decides
For each typed object, it checks the properties Google documents for that type. Required missing → error (ineligible). Recommended missing → warning (qualifies, shown with less detail). It also enforces a few high-value extras, e.g. a Product must carry at least one of offers, review, or aggregateRating, which is the single most common reason a product snippet silently doesn't show.
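The decision logic above can be sketched as a table lookup. This is a minimal illustration, not the tool's source: `RULES`, `is_present`, and `check` are hypothetical names, and the property lists are a tiny illustrative subset rather than Google's full documented requirements.

```python
# Hypothetical rule table; property lists are illustrative, not exhaustive.
RULES = {
    "Offer": {
        "required": ["price", "priceCurrency"],
        "recommended": ["availability"],
    },
    "Product": {
        "required": ["name"],
        "recommended": ["image"],
        # At least one of these must be present.
        "any_of": ["offers", "review", "aggregateRating"],
    },
}


def is_present(obj, prop):
    """Treat missing keys and empty values as absent."""
    return obj.get(prop) not in (None, "", [], {})


def check(obj):
    """Return (level, type, message) findings for one typed JSON-LD object."""
    findings = []
    types = obj.get("@type")
    type_names = types if isinstance(types, list) else [types]
    for type_name in type_names:
        rule = RULES.get(type_name)
        if rule is None:
            continue  # unknown type: nothing to check in this sketch
        for prop in rule.get("required", []):
            if not is_present(obj, prop):
                findings.append(("error", type_name, f"missing required: {prop}"))
        any_of = rule.get("any_of", [])
        if any_of and not any(is_present(obj, p) for p in any_of):
            findings.append(
                ("error", type_name, "needs at least one of: " + ", ".join(any_of))
            )
        for prop in rule.get("recommended", []):
            if not is_present(obj, prop):
                findings.append(("warning", type_name, f"missing recommended: {prop}"))
    return findings
```

Keeping the rules as data rather than code means supporting another type is a new table entry, which fits the single-file, no-dependency design.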
It's a fast structural pre-flight, not a renderer. For final "will Google actually draw the rich result on this live URL" confirmation, still run Google's Rich Results Test once before launch. Use this on every commit; use Google's tool for the final sign-off.
One more reason it matters in 2026
The same JSON-LD that earns Google rich results also helps AI answer engines (ChatGPT, Perplexity, Gemini) parse and cite your pages. Broken schema is a silent tax on both classic SEO and AI-search visibility. A CI gate keeps it honest.
It's MIT licensed, issues and PRs welcome: https://github.com/atlashey-collab/structured-data-action
There's also a free in-browser version if you just want to paste a snippet and eyeball it: schema-validator. The Action keeps the same rule set, just wired into your pipeline.