DEV Community

Joshua Gutierrez

ChatGPT Can't Render Your Site. Here's What Happens When You Ask It to Audit One Anyway.

We wanted to know how reliable ChatGPT is as an SEO auditor, so we asked it to audit our own agency site, axiondeepdigital.com. Four prompts. Fresh thread. No setup, no priming. We then verified every concrete claim against the actual page, against PageSpeed Insights, and against our own SEO workbench.

The audit came back at roughly 2,000 words. It included a numeric grade rubric ("Technical SEO 9/10, Architecture 8.8/10"), a prioritized fix list, named architectural risks, and an opinion on Googlebot rendering behavior. It looked like the kind of audit a senior SEO consultant might produce after a real review.

Most of it was confabulation dressed as analysis. One of its predictions was exactly backwards. And in the same thread, ChatGPT contradicted itself in a way neither it nor a casual reader would catch.

Here's the receipt for each prompt.

Prompt 1: The grade rubric was assigned without measurement

We asked: "Can you do a full technical SEO audit of [our URL]? I want specific issues, not a generic checklist."

ChatGPT returned an extensive audit with a grade rubric:

  • Technical SEO: 9/10
  • Architecture: 8.8/10
  • Content Depth: 6.8/10
  • Topical Authority: 6.5/10
  • Local SEO Structure: 8.5/10
  • Conversion SEO: 8.7/10

These numbers feel authoritative. They map to nothing measurable. Lighthouse outputs a Performance score and a separate SEO score. Search Console reports impressions, clicks, and CTR. None of these tools produce a "Technical SEO" or "Architecture" or "Conversion SEO" number. The grades exist nowhere in any real auditing tool. ChatGPT invented the rubric and assigned scores against it.

We then checked the factual claims. ChatGPT identified eight technology brands as "mentioned" on our homepage:

Vercel, Stripe, Netflix, Next.js, React, FastAPI, OpenAI, AWS

Of those eight, three (FastAPI, OpenAI, AWS) are not mentioned on our homepage at all. That is a 37.5% hallucination rate on a claim that takes less than a minute to verify in a browser.
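That kind of check is also easy to script. The following is a minimal sketch, not the tooling used for this post: it assumes you already have the page's visible text as a string (the `page_text` sample is made up for illustration), and `check_mentions` is a hypothetical helper that does a case-insensitive whole-token match for each claimed brand.

```python
import re

def check_mentions(page_text, claimed_terms):
    """Split claimed terms into those found in the page text and those missing."""
    found, missing = [], []
    for term in claimed_terms:
        # Lookarounds keep a short term like "AWS" from matching inside a longer word.
        pattern = r"(?<![A-Za-z0-9])" + re.escape(term) + r"(?![A-Za-z0-9])"
        if re.search(pattern, page_text, re.IGNORECASE):
            found.append(term)
        else:
            missing.append(term)
    return found, missing

# Hypothetical sample text standing in for real homepage copy.
page_text = (
    "We build with Next.js and React, deploy on Vercel, "
    "and take payments with Stripe for Netflix-scale traffic."
)
claimed = ["Vercel", "Stripe", "Netflix", "Next.js", "React", "FastAPI", "OpenAI", "AWS"]

found, missing = check_mentions(page_text, claimed)
rate = len(missing) / len(claimed)
print(f"missing: {missing} ({rate:.1%} of claimed mentions)")
```

With three of eight claimed terms absent, the share works out to 3 / 8 = 37.5%, the same arithmetic as above.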

Same pattern with the city pages it cited. It listed four geo pages by URL: /albuquerque, /las-cruces, /new-mexico, /web-development-new-mexico. The first two exist. The last two don't.
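Checking whether a cited URL exists can be automated the same way. This is a rough sketch under stated assumptions: `http_status` and `check_paths` are hypothetical helper names, only the Python standard library is used, and a 200 response is treated as "exists." A site that redirects unknown paths to the homepage, or serves a soft 404, would fool it, so treat the result as a first pass.

```python
import urllib.error
import urllib.request

def http_status(url, timeout=10):
    """Return the HTTP status code for a GET request to url."""
    req = urllib.request.Request(url, headers={"User-Agent": "link-check-sketch"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a page that does not exist

def check_paths(base_url, paths, get_status=http_status):
    """Map each path to True if it answers 200, else False."""
    base = base_url.rstrip("/")
    return {path: get_status(base + path) == 200 for path in paths}

# Example call (performs real network requests, so it is left commented out):
# check_paths("https://axiondeepdigital.com",
#             ["/albuquerque", "/las-cruces", "/new-mexico", "/web-development-new-mexico"])
```

The `get_status` parameter exists so the lookup can be swapped for a stub when testing without network access.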

When ChatGPT mixes real observations with confabulated ones inside the same paragraph, the result is harder to catch than pure fabrication. A reader assumes the whole list was verified the same way.

Prompt 2: ChatGPT contradicted itself two messages later

We asked: "What are the current Core Web Vitals (LCP, CLS, INP) for [URL] on mobile? Give me the actual numbers."

ChatGPT replied:

I cannot reliably give you the real current field numbers for LCP, CLS, and INP from Google's CrUX dataset without direct access to PageSpeed Insights or the Chrome UX Report API for that URL. The web search environment here can access documentation and indexed pages, but it cannot execute a live Lighthouse/PageSpeed test against your domain or pull the live CrUX metrics feed.

This is correct. But ChatGPT had just spent 2,000 words in the previous message grading our site as if it had run Lighthouse against it. If it cannot execute a Lighthouse test, the grades in prompt 1 came from somewhere other than measurement. The two responses contradict each other inside the same conversation. ChatGPT did not flag the contradiction. A casual reader wouldn't either.

Then it guessed anyway:

I would expect CLS to already be very good, LCP to be decent but hero/media dependent, INP to likely be your weakest metric if third party scripts or animations are present.

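We ran the actual Lighthouse mobile audit the same day. Lighthouse builds its Performance score from weighted lab metrics and reports each metric's contribution out of a maximum: 25 points each for LCP and CLS, and 30 for Total Blocking Time (TBT), the lab stand-in for INP, since a lab run has no real user interactions to measure. Here is the prediction next to the measurement: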

| Metric | ChatGPT predicted | Lighthouse measured (mobile) |
|---|---|---|
| CLS | "already very good" | +25 / 25 (perfect) |
| LCP | "decent but hero/media dependent" | +2 / 25 (lowest tier) |
| INP / TBT (interaction) | "likely your weakest metric" | +27 / 30 (one of our best) |

CLS was a gimme. CLS is good on most modern sites, and predicting that requires no information at all. LCP, the metric ChatGPT softened with "decent," scored 2 out of a possible 25 on mobile, which is the bottom tier Lighthouse reports. The metric ChatGPT confidently named as weakest scored 27 out of 30, or 90% of its available points, second only to the perfect CLS. For the two metrics where the prediction carried any information, the order was exactly backwards.
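To make the comparison concrete, here is a small sketch that ranks the three metrics by the share of available points each one earned. The point values are the ones reported above; `predicted_order` is one reading of ChatGPT's wording (best to worst), and `rank_by_share` is a hypothetical helper.

```python
# Points from the Lighthouse mobile run described above: (earned, maximum).
measured = {
    "CLS": (25, 25),
    "LCP": (2, 25),
    "INP/TBT": (27, 30),
}

# One reading of ChatGPT's prediction, ordered best to worst (an interpretation).
predicted_order = ["CLS", "LCP", "INP/TBT"]

def rank_by_share(points):
    """Order metrics from best to worst by share of available points earned."""
    return sorted(points, key=lambda m: points[m][0] / points[m][1], reverse=True)

measured_order = rank_by_share(measured)
for metric in measured_order:
    earned, maximum = measured[metric]
    print(f"{metric}: {earned}/{maximum} = {earned / maximum:.0%}")
print("predicted:", predicted_order)
print("measured: ", measured_order)
```

LCP earns 8% of its points and the interaction metric earns 90%, so the two metrics ChatGPT actually had to guess at come out in the opposite order from the prediction.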

There is one more thing in this answer worth flagging. ChatGPT recommended PageSpeed Insights to retrieve the real CrUX field data. We checked. Our site does not have CrUX field data at all. Sites below Google's traffic threshold do not get field data, and the CrUX dataset returns "does not have sufficient real-world speed data" for our origin. A reader following ChatGPT's advice would arrive at PageSpeed, look for the field numbers ChatGPT promised, find nothing, and have no explanation for why.
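If you script against the PageSpeed Insights API instead of the web UI, the missing-field-data case is worth handling explicitly. The sketch below assumes the v5 response shape as we understand it, with field data under `loadingExperience` or `originLoadingExperience` in a `metrics` object; confirm the key names against the API reference before relying on them. The sample dictionaries are hypothetical, not our real responses.

```python
def field_data_summary(psi_response):
    """Return field (CrUX) percentiles if present, or None for a lab-only result."""
    for key in ("loadingExperience", "originLoadingExperience"):
        metrics = psi_response.get(key, {}).get("metrics")
        if metrics:
            return {name: data.get("percentile") for name, data in metrics.items()}
    return None  # no real-user data: only the Lighthouse lab run is available

# Hypothetical, trimmed-down responses for illustration only.
lab_only = {"loadingExperience": {}, "lighthouseResult": {"categories": {}}}
with_field = {
    "loadingExperience": {
        "metrics": {"LARGEST_CONTENTFUL_PAINT_MS": {"percentile": 2400, "category": "FAST"}}
    }
}

print(field_data_summary(lab_only))
print(field_data_summary(with_field))
```

Returning `None` rather than an empty result makes the low-traffic case obvious to the caller: there is no field data to read, only lab numbers.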

Prompt 3: A confident render analysis without rendering anything

We asked: "Does Google see the same content on [URL] that a human visitor sees? Walk me through what Googlebot renders vs what the user renders."

ChatGPT delivered a careful, technically literate answer:

Based on the rendered structure, Googlebot is seeing substantially the same primary content that human users see on axiondeepdigital.com.

The phrase "based on the rendered structure" is doing all the work in that sentence. ChatGPT did not render anything. It cannot render anything. Its earlier message in this same conversation said so explicitly. The entire walkthrough of what Googlebot sees vs. what the user sees is pattern-matching on what a "modern Next.js site" typically does, not measurement of what our site actually does.

If we had a broken hydration path, if our hero was client-only, if our city pages were rendering empty divs to the bot, the answer would have looked identical. That is the worst possible failure mode for an auditing tool: confident output that is invariant to the actual state of the thing being audited.
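An answer that does depend on the site's actual state needs two inputs: the raw HTML the server returns and the DOM after JavaScript has run. The sketch below assumes you have captured both as strings, for example from a plain HTTP fetch and from a headless browser or Search Console's rendered HTML. `VisibleText` and `render_only_text` are hypothetical names, the sample markup is invented, and the comparison is deliberately crude: exact text-chunk matching with the standard library parser.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect text nodes, skipping script, style, and noscript contents."""
    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.depth = 0    # nesting level inside skipped elements
        self.chunks = []  # whitespace-normalized text nodes, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if not self.depth and data.strip():
            self.chunks.append(" ".join(data.split()))

def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return parser.chunks

def render_only_text(raw_html, rendered_html):
    """Text present in the rendered DOM but absent from the raw server response."""
    raw = set(visible_text(raw_html))
    return [chunk for chunk in visible_text(rendered_html) if chunk not in raw]

# Invented sample: an empty client-side shell versus its rendered result.
raw_html = '<body><div id="root"></div><script>var data = {"h": "Hi"};</script></body>'
rendered_html = '<body><div id="root"><h1>Web design in Albuquerque</h1><p>Get a quote</p></div></body>'

print(render_only_text(raw_html, rendered_html))
```

Anything this returns exists only after rendering, so it is exactly the content whose visibility to a crawler has to be checked rather than assumed.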

Prompt 4: Three priorities that have nothing to do with the actual problem

We asked: "If I could only fix three things on [URL] this week to improve search rankings, what should they be and why those three?"

ChatGPT's answer: build a topical authority cluster, convert portfolio items into standalone case studies, improve internal semantic linking. Each recommendation came with an explanation that would apply word-for-word to any well-built Next.js agency site on the planet.

What the answer did not mention: our mobile Performance score is 69, our mobile LCP scores 2 out of 25, and Google has been weighting Core Web Vitals into ranking since 2021. A real "top three this week" list, given the data, starts with "fix mobile LCP." ChatGPT could not start there because it did not know.

It also credited us with a "fast rendering stack using Next.js" as a strength in prompt 1. With a mobile Performance score of 69 and LCP earning 2 of a possible 25 points, "fast" is generous.

The deeper problem

Our SEO category in Lighthouse scores 100. Best Practices is 100. Our own SEO workbench gives the site 96/100 overall. ChatGPT's gut call of "9/10 Technical SEO" is, by accident, close to the truth.

That is the trap.

ChatGPT arrived at a roughly correct grade not by measuring our site, but by pattern-matching what a well-built Next.js agency site usually looks like. We happen to be one. If we had been one of the 96.9% of small business sites our 292-site study found failing mobile Core Web Vitals, the audit would have read identically. Same grades. Same praise for "fast rendering stack." Same "your bottleneck is authority, not technical." Same generic top three.

You cannot use ChatGPT to find out whether your site is broken, because ChatGPT will tell you it is fine either way.

What to use instead

For SEO auditing, use tools that actually load your page in a real browser:

  • PageSpeed Insights: synthetic Lighthouse run, free, gives you real numbers.
  • Google Search Console URL Inspection: shows you the rendered HTML Google sees.
  • DeepAudit AI: our free real browser SEO audit, 100+ checks, full Chromium rendering. Built precisely because pattern-matching auditors lie. axiondeepdigital.com/free-seo-audit

ChatGPT is excellent at writing, summarizing, and explaining concepts. It is not an auditor. Asking it to audit a website is asking it to do a job it cannot do, and the danger is not that it refuses, but that it answers anyway.


Originally published on axiondeepdigital.com.
