<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jan-Willem Bobbink</title>
    <description>The latest articles on DEV Community by Jan-Willem Bobbink (@jbobbink).</description>
    <link>https://dev.to/jbobbink</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F30113%2F572dbcfb-86d0-4511-b1bc-0f91ad4021bf.png</url>
      <title>DEV Community: Jan-Willem Bobbink</title>
      <link>https://dev.to/jbobbink</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jbobbink"/>
    <language>en</language>
    <item>
      <title>LLM trackers are quietly breaking their users' own analytics</title>
      <dc:creator>Jan-Willem Bobbink</dc:creator>
      <pubDate>Wed, 29 Apr 2026 09:40:32 +0000</pubDate>
      <link>https://dev.to/jbobbink/llm-trackers-are-quietly-breaking-their-users-own-analytics-20o3</link>
      <guid>https://dev.to/jbobbink/llm-trackers-are-quietly-breaking-their-users-own-analytics-20o3</guid>
<description>&lt;h2&gt;And nobody in SEO is talking about it yet&lt;/h2&gt;

&lt;p&gt;There is a measurement problem sitting at the centre of the AI visibility industry, and the tools sold to measure AI search visibility are the ones causing it. The pollution is silent, it is structural, and it is showing up in exactly the dataset that GEO practitioners now rely on most.&lt;/p&gt;

&lt;p&gt;This post lays out the mechanism, why it matters more than the rank tracker problem that came before it, and the simple product fix that would clear it up.&lt;/p&gt;

&lt;h2&gt;The mechanism&lt;/h2&gt;

&lt;p&gt;When you run daily prompt tracking across ChatGPT, Perplexity, Google AI Mode, Claude and the other LLM surfaces, the model sometimes decides it needs fresh information to answer the prompt. It fires off a retrieval-augmented generation request, often through a search index, and your pages get fetched. RAG systems work by injecting retrieved content into the LLM's context window before generation, anchoring the answer to external sources rather than relying purely on training data [1]. Not every tracked prompt triggers retrieval. Models cache, they reuse prior context, and they only ground when they judge it necessary. But when you are tracking hundreds or thousands of prompts per day across multiple engines, the cumulative volume of triggered fetches is significant.&lt;/p&gt;

&lt;p&gt;Those fetches land in your server logs as bot hits. They inflate your crawl data. They dirty your content performance analysis. And they are caused by the tool you bought to measure your AI visibility in the first place.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa57lpx6pjjion1qb04d6.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa57lpx6pjjion1qb04d6.jpg" alt=" " width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The scale of legitimate AI crawler activity already makes this hard to disentangle. Cloudflare's 2025 Year in Review reports that AI "user action" crawling, the category that includes pages fetched in response to user prompts, grew more than 15 times across 2025 [2]. Botify's analysis of more than 7 billion log files found that OpenAI's combined crawl of the web tripled between August 2025 and March 2026, with OAI-SearchBot and GPTBot both at all-time highs [3]. Single Grain reported GPTBot traffic growing 305% between May 2024 and May 2025 [4]. Tracker-induced fetches are sitting on top of an already noisy baseline.&lt;/p&gt;

&lt;h2&gt;Why this matters more than it did two years ago&lt;/h2&gt;

&lt;p&gt;GA4 automatically excludes traffic from known bots and spiders, and according to Google's own documentation you cannot disable this filter or see how much was excluded [5]. If your AI visibility analysis lives in GA4, the pollution does not show up there in any obvious way, which is part of why the issue has stayed invisible.&lt;/p&gt;

&lt;p&gt;But GA4 is not where the real SEO (or any of the new abbreviations!) work is happening anymore. Log file analysis is. As Search Engine Land notes, while crawl tools like SEMrush or Screaming Frog simulate bot behaviour, log files capture what crawlers actually do in real time, including for bots that GSC and GA4 will never report on [6]. That is the only honest record of what AI systems are doing on your site.&lt;/p&gt;

&lt;p&gt;Tools like &lt;a href="https://www.botsanalyser.com" rel="noopener noreferrer"&gt;botsanalyser.com&lt;/a&gt; have made server log parsing accessible to any SEO without needing to set up a data pipeline from scratch. Practitioners are increasingly using logs to answer questions that GA4 cannot: which AI crawlers visit, how often, which pages they fetch, how deep they go, and how that behaviour correlates with citations and visibility in AI answers. Search Engine Land's recent coverage of log file analysis for AI crawlers explicitly frames logs as the closest substitute for the missing feedback loop in AI search, where impressions, clicks, and indexing data simply do not exist the way they do in traditional SEO [7].&lt;/p&gt;

&lt;p&gt;This is the dataset where the signal lives for AI search optimisation. And this is exactly the dataset that LLM tracker traffic is contaminating.&lt;/p&gt;

&lt;h2&gt;A familiar pattern, with a worse outcome&lt;/h2&gt;

&lt;p&gt;This is not the first time the SEO industry has bought a measurement tool that quietly polluted its own data. Rank trackers did the same thing to Google Search Console for years. Every time a rank tracker checked position 37 for a keyword, Google counted an impression. The more keywords you tracked, the noisier your GSC impression data became.&lt;/p&gt;

&lt;p&gt;The proof showed up clearly when Google stopped supporting the &amp;amp;num=100 parameter on 12 September 2025. Within days, GSC impressions dropped sharply across the industry, with some sites reporting declines of 20 to 50 percent. The "alligator effect" graphs that many SEOs had attributed to AI Overviews snapped shut almost overnight. Search Engine Land's analysis concluded that automated crawlers had been inflating impression counts, and that the post-change baseline reflected real user activity rather than scraper noise [8]. Smith Digital framed those vanished impressions as "ghost impressions" generated by machine activity that never represented a real human seeing a result [9].&lt;/p&gt;

&lt;p&gt;Google itself later confirmed a separate logging error that had been over-reporting GSC impressions from 13 May 2025 onwards. The fix was rolled out in April 2026, almost a full year after the bug began [10]. Between rank tracker pollution and Google's own logging bug, GSC impression data was structurally unreliable for most of 2025.&lt;/p&gt;

&lt;p&gt;The log file version of this same problem is worse for three reasons.&lt;/p&gt;

&lt;p&gt;First, the noise is harder to identify. Rank tracker traffic in GSC was at least bundled into a single metric you could mentally discount. LLM tracker traffic in your logs arrives with rotating user agents, sometimes through Bing's infrastructure, sometimes through Google, sometimes direct from OpenAI or Anthropic. Seer Interactive has documented how stealth AI crawling, where bots reappear under generic browser headers and unrelated IPs, makes traditional bot detection unreliable [11]. There is no clean way to label this traffic after the fact.&lt;/p&gt;

&lt;p&gt;Second, the noise is harder to filter. You cannot simply exclude a known IP range or user agent string. The same fetches that come from real LLM grounding for real user prompts arrive through the same infrastructure as the fetches caused by your tracker. They are mechanically identical from the server's perspective. Passion Digital flagged this exact problem when it noted that misidentifying bot traffic is one of the most common errors in LLM bot tracking, particularly because not all bots clearly identify themselves and user agent strings can be spoofed [12].&lt;/p&gt;

&lt;p&gt;Third, the dataset is being used to drive decisions, not just reporting. Log data is feeding content prioritisation, internal linking strategy, technical SEO fixes for AI crawlers, and conversations with leadership about which AI surfaces are sending qualified bot traffic. Every one of those decisions is being made on top of polluted data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ozcyjtn1rrjai3mhru6.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ozcyjtn1rrjai3mhru6.PNG" alt=" " width="800" height="170"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;What "polluted" actually looks like&lt;/h2&gt;

&lt;p&gt;Imagine a mid-sized site running daily prompt tracking on 500 prompts across five LLM engines. Even if only a fraction of those prompt executions trigger retrieval, you are looking at potentially hundreds of additional fetches per day attributable to the tracker, on top of organic AI crawler activity.&lt;/p&gt;
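&lt;p&gt;A back-of-the-envelope sketch of that estimate (the 20 percent retrieval rate below is an assumed figure for illustration only; real grounding rates vary by engine and prompt type):&lt;/p&gt;

```javascript
// Rough estimate of tracker-induced fetches per day.
// The retrievalRate of 0.2 is an assumed illustrative figure,
// not a measured value; real grounding rates vary by engine.
function trackerFetchesPerDay(prompts, engines, retrievalRate) {
  return Math.round(prompts * engines * retrievalRate);
}

// 500 prompts across 5 engines, assuming 1 in 5 executions grounds:
trackerFetchesPerDay(500, 5, 0.2); // 500 fetches/day from the tracker alone
```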

&lt;p&gt;Those fetches will tend to cluster around the pages your tracker considers most relevant to the prompts you set up, which are the same pages you are trying to evaluate. So the pollution is not evenly distributed. It is concentrated on exactly the URLs you most want clean data for.&lt;/p&gt;

&lt;p&gt;The result is that pages with strong tracked-prompt coverage look healthier in your log analysis than they actually are, and pages outside your tracked prompt set look quieter than they actually are. The measurement is structurally biased toward the prompts you chose to monitor.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft903tzpz962vyoqum532.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft903tzpz962vyoqum532.png" alt=" " width="800" height="356"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This bias matters more given how skewed the underlying crawl-to-referral economics already are. Cloudflare's crawl-to-refer ratio metric, which compares how much a platform crawls versus how much referral traffic it sends back, showed Anthropic peaking at roughly 500,000 to 1 and OpenAI peaking at around 3,700 to 1 during 2025 [13]. Practitioners are already trying to read meaningful signal out of fetch volumes that dwarf any human traffic those platforms send back. Adding tracker noise on top of that makes the signal even harder to extract.&lt;/p&gt;

&lt;h2&gt;The fix is a product decision, but it requires more than one party&lt;/h2&gt;

&lt;p&gt;There is a path out of this for vendors who care about giving practitioners clean data, and it is worth being precise about who needs to do what. The architectural reality is that the page fetches that pollute server logs are not made by the tracker itself in the most common case. They are made by the LLM provider's own crawler in response to a prompt the tracker submitted to the API. The tracker can attach any header it likes to its API call, but that header does not propagate down into the RAG fetch the LLM subsequently fires off. So a header-only fix solves the wrong half of the problem.&lt;/p&gt;

&lt;p&gt;A clean solution stacks three mechanisms.&lt;/p&gt;

&lt;p&gt;The first is scheduling. Trackers should run prompts in a declared time window, ideally outside peak hours for their target audience, and publish that schedule. Practitioners can then filter logs by removing crawler hits during the declared window. This works without any LLM cooperation at all and is the easiest mitigation to deploy. It is not perfect because real users prompt at all hours, but it produces a meaningful baseline correction at very low cost.&lt;/p&gt;
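&lt;p&gt;The log-side filter is then a few lines. A minimal sketch, assuming a vendor-declared 02:00 to 04:00 UTC window and a simplified log record shape (both hypothetical):&lt;/p&gt;

```javascript
// Drop crawler hits that fall inside a vendor-declared tracking window.
// The 02:00-04:00 UTC window and the log record shape are hypothetical;
// a real vendor would publish its own schedule.
function outsideTrackerWindow(entry, startHour, endHour) {
  const hour = new Date(entry.timestamp).getUTCHours();
  if (hour >= startHour) {
    if (endHour > hour) return false; // inside the declared window
  }
  return true;
}

const logEntries = [
  { timestamp: "2026-04-29T02:15:00Z", userAgent: "OAI-SearchBot/1.0", path: "/pricing" },
  { timestamp: "2026-04-29T14:02:00Z", userAgent: "OAI-SearchBot/1.0", path: "/pricing" },
];

// Keep only hits outside the declared 02:00-04:00 UTC tracker window.
const filtered = logEntries.filter(function (e) {
  return outsideTrackerWindow(e, 2, 4);
});
// filtered keeps only the 14:02 hit
```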

&lt;p&gt;The second is the tracker activity feed. LLM tracking platforms know exactly when each prompt was executed, against which engine, and in many cases they can infer or directly observe whether a retrieval call was triggered. A timestamped export of that activity, ideally as an API endpoint, with at minimum the timestamp, engine, prompt identifier, and where possible the URLs that were fetched as part of grounding, lets practitioners reconcile log entries against tracker activity with more precision than the time window alone allows.&lt;/p&gt;
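&lt;p&gt;Reconciliation against such a feed could look like the sketch below. The feed field names and the 30-second matching tolerance are assumptions, since no vendor publishes this endpoint today:&lt;/p&gt;

```javascript
// Match a log hit to tracker activity-feed events by URL and proximity
// in time. The feed shape (timestamp, engine, url) and the 30-second
// tolerance are assumptions; no tracker vendor ships this feed yet.
function isTrackerInduced(logEntry, feedEvents, toleranceMs) {
  const hitTime = Date.parse(logEntry.timestamp);
  return feedEvents.some(function (event) {
    if (event.url !== logEntry.path) return false;
    const delta = Math.abs(Date.parse(event.timestamp) - hitTime);
    return toleranceMs >= delta;
  });
}

const feed = [
  { timestamp: "2026-04-29T02:15:10Z", engine: "chatgpt", url: "/pricing" },
];
const hit = { timestamp: "2026-04-29T02:15:04Z", path: "/pricing" };

isTrackerInduced(hit, feed, 30000); // true: within 30s of a feed event
```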

&lt;p&gt;The third is LLM cooperation, and this is the one that closes the loop. When a tracker calls the API, the LLM provider should mark the resulting RAG crawler fetches in a way that downstream log analysis can identify. This could be a custom User-Agent suffix on the OAI-SearchBot or ClaudeBot or PerplexityBot request, an extra HTTP header passed through from the originating API call, or a published list of IP ranges used specifically for API-originated retrieval. Without this, no amount of tracker discipline cleans up the actual problematic fetches, because the tracker is not the party making them.&lt;/p&gt;
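&lt;p&gt;If a provider shipped such a marker, the log-side classification would be trivial. The &lt;code&gt;+api-origin&lt;/code&gt; token below is invented for illustration; no provider emits anything like it today:&lt;/p&gt;

```javascript
// Split crawler hits into organic vs API-originated, assuming a provider
// appended a hypothetical "+api-origin" token to its User-Agent string.
// No LLM provider actually emits such a marker today.
function classifyFetch(userAgent) {
  if (userAgent.includes("+api-origin")) return "api-originated";
  return "organic";
}

classifyFetch("OAI-SearchBot/1.0; +https://openai.com/searchbot +api-origin");
// "api-originated"
classifyFetch("OAI-SearchBot/1.0; +https://openai.com/searchbot");
// "organic"
```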

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhkhh3gcfrx77vpc34829.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhkhh3gcfrx77vpc34829.png" alt="LLM tracker solution diagram" width="800" height="329"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The combination is what gets you clean data. The time window is the easy first cut, the activity feed is the cross-reference, and LLM cooperation is what makes the filtering precise. None of the three is technically difficult. All three are product decisions about whether to give practitioners visibility into a measurement system that currently obscures itself.&lt;/p&gt;

&lt;p&gt;The harder question is whether LLM providers will cooperate. They have less commercial incentive than tracker vendors do, and they are the bottleneck on the cleanest part of the fix.&lt;/p&gt;

&lt;h2&gt;Why no vendor has shipped this yet&lt;/h2&gt;

&lt;p&gt;Probably because the easy parts make the product look smaller and the hard part requires someone else's cooperation. A tracker that shows you "your site was fetched 40,000 times by AI crawlers last month" reads differently than a tracker that shows you "your site was fetched 40,000 times, of which 12,000 were caused by us, leaving 28,000 organic AI crawler hits." The honest version is more useful. It is also less impressive on a dashboard.&lt;/p&gt;

&lt;p&gt;There is also a competitive logic. The first tracker vendor to publish a schedule and an activity feed effectively concedes that their measurement creates noise. No vendor wants to be the first to admit that, even though every vendor in the category has the same problem. And LLM providers, who hold the cleanest part of the fix, have even less incentive: they get the value of training data and answer grounding from those crawls, and the cost of the noise lands on publishers and SEO practitioners, not on them.&lt;/p&gt;

&lt;h2&gt;What practitioners can do in the meantime&lt;/h2&gt;

&lt;p&gt;Until vendors ship a clean activity feed, there are partial mitigations worth considering. Time-of-day patterns can sometimes isolate tracker traffic if your tracker runs on a fixed schedule. Cross-referencing fetch volumes against your tracked prompt list can flag suspiciously consistent crawl patterns on covered URLs. Comparing log data from before and after enabling a tracker, on the same site, gives you a rough estimate of the baseline shift.&lt;/p&gt;
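&lt;p&gt;The before-and-after comparison is the simplest of these to sketch. The daily hit counts here are invented purely for illustration:&lt;/p&gt;

```javascript
// Estimate the tracker's contribution by comparing mean daily AI-crawler
// hits before and after the tracker was enabled on the same site.
// The daily counts below are invented for illustration.
function meanDaily(counts) {
  let total = 0;
  for (const c of counts) total += c;
  return total / counts.length;
}

const before = [820, 790, 845, 805];   // daily AI-crawler hits, pre-tracker
const after = [1310, 1275, 1340, 1295]; // same site, tracker enabled

const shift = meanDaily(after) - meanDaily(before);
// shift is a rough per-day fetch volume attributable to the tracker
```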

&lt;p&gt;None of these are substitutes for vendor-provided data. They are workarounds for a problem that should not be the customer's to solve.&lt;/p&gt;

&lt;h2&gt;The bigger point&lt;/h2&gt;

&lt;p&gt;Every AI visibility report being published right now, including the ones used to set strategy at large brands, is sitting on top of log data that has been contaminated by the measurement tools themselves. The industry is making decisions on a dataset it has not properly cleaned, because the only people who can clean it have a commercial reason not to.&lt;/p&gt;

&lt;p&gt;That is the real story. Not that LLM trackers are bad, they are useful, but that the standard practice of evaluating AI search performance from log data is currently broken in a way that vendors could fix tomorrow and have chosen not to. The easy parts sit with tracker vendors. The hardest and most important part sits with the LLM providers themselves.&lt;/p&gt;

&lt;p&gt;The first tracker vendor to publish a schedule and an activity feed will, briefly, look like the one with the noisier product. They will also be the only one giving practitioners data they can actually trust at the tracker layer. The first LLM provider to mark API-originated RAG fetches will give the entire industry the missing piece. None of this is hard. All of it is overdue.&lt;/p&gt;

&lt;p&gt;Not going to start the debate about involving LLMs themselves :)&lt;/p&gt;

&lt;h2&gt;Sources&lt;/h2&gt;

&lt;p&gt;[1] Firecrawl, "What is RAG grounding?" (accessed 29-04-2026), Firecrawl Glossary.&lt;br&gt;
&lt;a href="https://www.firecrawl.dev/glossary/web-search-apis/rag-grounding" rel="noopener noreferrer"&gt;https://www.firecrawl.dev/glossary/web-search-apis/rag-grounding&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[2] David Belson, "The 2025 Cloudflare Radar Year in Review: The rise of AI, post-quantum, and record-breaking DDoS attacks" (29-01-2026), Cloudflare Blog.&lt;br&gt;
&lt;a href="https://blog.cloudflare.com/radar-2025-year-in-review/" rel="noopener noreferrer"&gt;https://blog.cloudflare.com/radar-2025-year-in-review/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[3] Chris Long, "OpenAI Has Tripled Their Crawl of the Web: An Analysis of 7B+ Log Files" (23-04-2026), Botify Blog.&lt;br&gt;
&lt;a href="https://www.botify.com/blog/openai-tripled-web-crawl" rel="noopener noreferrer"&gt;https://www.botify.com/blog/openai-tripled-web-crawl&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[4] Single Grain, "Log File Analysis for Understanding AI Crawling Behavior" (28-12-2025), Single Grain Blog.&lt;br&gt;
&lt;a href="https://www.singlegrain.com/blog-posts/analytics/log-file-analysis-for-understanding-ai-crawling-behavior/" rel="noopener noreferrer"&gt;https://www.singlegrain.com/blog-posts/analytics/log-file-analysis-for-understanding-ai-crawling-behavior/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[5] Google, "[GA4] Filter incoming data: Known bot-traffic exclusion" (accessed 29-04-2026), Google Analytics Help.&lt;br&gt;
&lt;a href="https://support.google.com/analytics/answer/9888366" rel="noopener noreferrer"&gt;https://support.google.com/analytics/answer/9888366&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[6] Search Engine Land, "Log file analysis for SEO: Find crawl issues &amp;amp; fix them fast" (27-11-2025), Search Engine Land.&lt;br&gt;
&lt;a href="https://searchengineland.com/guide/log-file-analysis" rel="noopener noreferrer"&gt;https://searchengineland.com/guide/log-file-analysis&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[7] Lauren Busby, "Why log file analysis matters for AI crawlers and search visibility" (16-04-2026), Search Engine Land.&lt;br&gt;
&lt;a href="https://searchengineland.com/log-file-analysis-ai-crawlers-search-visibility-474428" rel="noopener noreferrer"&gt;https://searchengineland.com/log-file-analysis-ai-crawlers-search-visibility-474428&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[8] Search Engine Land, "Why Google Search Console impressions fell (and why that's good)" (23-10-2025), Search Engine Land.&lt;br&gt;
&lt;a href="https://searchengineland.com/why-google-search-console-impressions-dropped-interpret-data-463677" rel="noopener noreferrer"&gt;https://searchengineland.com/why-google-search-console-impressions-dropped-interpret-data-463677&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[9] Smith Digital, "Why Google Search Console Impressions Dropped in Sept 2025" (16-12-2025), Smith Digital Blog.&lt;br&gt;
&lt;a href="https://smithdigital.io/blog/google-search-console-impression-drop-sept-2025" rel="noopener noreferrer"&gt;https://smithdigital.io/blog/google-search-console-impression-drop-sept-2025&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[10] Danny Goodwin, "Google is fixing a Search Console bug that inflated impression counts" (03-04-2026), Search Engine Land.&lt;br&gt;
&lt;a href="https://searchengineland.com/google-search-console-bug-inflated-impression-counts-473530" rel="noopener noreferrer"&gt;https://searchengineland.com/google-search-console-bug-inflated-impression-counts-473530&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[11] Seer Interactive, "Perplexity, Stealth AI Crawling, and the Impacts on GEO and Log File Analysis" (30-10-2025), Seer Interactive Insights.&lt;br&gt;
&lt;a href="https://www.seerinteractive.com/insights/perplexity-stealth-ai-crawling-and-the-impacts-on-geo-and-log-file-analysis" rel="noopener noreferrer"&gt;https://www.seerinteractive.com/insights/perplexity-stealth-ai-crawling-and-the-impacts-on-geo-and-log-file-analysis&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[12] Passion Digital, "Tracking LLMs Bots on Your Site using Log File Analysis" (15-07-2025), Passion Digital Blog.&lt;br&gt;
&lt;a href="https://passion.digital/blog/tracking-llms-bots-on-your-site-using-log-file-analysis/" rel="noopener noreferrer"&gt;https://passion.digital/blog/tracking-llms-bots-on-your-site-using-log-file-analysis/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[13] Cloudflare, "The crawl before the fall... of referrals: understanding AI's impact on content providers" (01-07-2025), Cloudflare Blog.&lt;br&gt;
&lt;a href="https://blog.cloudflare.com/ai-search-crawl-refer-ratio-on-radar/" rel="noopener noreferrer"&gt;https://blog.cloudflare.com/ai-search-crawl-refer-ratio-on-radar/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>llm</category>
      <category>seo</category>
      <category>geo</category>
      <category>ai</category>
    </item>
    <item>
      <title>Building Glippy MCP: giving Claude the power to audit a site's AI readiness</title>
      <dc:creator>Jan-Willem Bobbink</dc:creator>
      <pubDate>Tue, 21 Apr 2026 08:30:55 +0000</pubDate>
      <link>https://dev.to/jbobbink/building-glippy-mcp-giving-claude-the-power-to-audit-a-sites-ai-readiness-2db</link>
      <guid>https://dev.to/jbobbink/building-glippy-mcp-giving-claude-the-power-to-audit-a-sites-ai-readiness-2db</guid>
      <description>&lt;p&gt;I spent the last few weekends turning &lt;a href="https://www.glippy.dev/mcp" rel="noopener noreferrer"&gt;Glippy&lt;/a&gt;, a desktop app and browser extension that scores a site's readiness for AI crawlers, into a Model Context Protocol (MCP) server. The result is &lt;code&gt;glippy-mcp&lt;/code&gt;: a Node.js binary that plugs into Claude Desktop, Claude Code, Cursor, Windsurf, and anything else that speaks MCP, then exposes nine tools for analysing, comparing, and exporting GEO reports on any domain.&lt;/p&gt;

&lt;p&gt;This post walks through why I built it, what it does, and the handful of design decisions that actually mattered.&lt;/p&gt;

&lt;h2&gt;Why an MCP server at all?&lt;/h2&gt;

&lt;p&gt;Glippy already had a perfectly good desktop app. The engine &lt;code&gt;geo-checker.js&lt;/code&gt; fetches &lt;code&gt;robots.txt&lt;/code&gt;, &lt;code&gt;llms.txt&lt;/code&gt;, the homepage HTML, &lt;code&gt;sitemap.xml&lt;/code&gt;, and a few security headers, then runs 10 weighted scoring categories (Structured Data, Semantic HTML, Machine Readability, Citability &amp;amp; Answer-Readiness, and so on). You paste a domain, you get a report.&lt;/p&gt;

&lt;p&gt;The problem is that I kept finding myself copy-pasting that report back into Claude to ask follow-up questions like &lt;em&gt;"which of these issues should I fix first for a Shopify site?"&lt;/em&gt; or &lt;em&gt;"compare this to the three competitors in the report I ran yesterday."&lt;/em&gt; The conversation loop was slow and lossy — Claude was operating on stale text instead of live crawls.&lt;/p&gt;

&lt;p&gt;MCP fixes that. Instead of me being a ferry between two tools, Claude can call &lt;code&gt;analyze_domain&lt;/code&gt;, &lt;code&gt;compare_domains&lt;/code&gt;, or &lt;code&gt;analyze_sitemap&lt;/code&gt; directly during the conversation. The model decides &lt;em&gt;when&lt;/em&gt; a fresh crawl is needed, and my job shrinks to asking good questions.&lt;/p&gt;

&lt;h2&gt;What the server actually exposes&lt;/h2&gt;

&lt;p&gt;Nine tools, all stdio-transport JSON-RPC 2.0 under the hood:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;analyze_domain&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Full 10-category GEO analysis of one domain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;check_robots_txt&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Which AI crawlers (GPTBot, ClaudeBot, …) are blocked&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;check_llms_txt&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Is there an &lt;code&gt;llms.txt&lt;/code&gt;? Show the contents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;get_geo_summary&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Quick score + top 3 strengths and weaknesses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;compare_domains&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Run 2–10 domains in parallel, rank them&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;analyze_sitemap&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fetch a sitemap, score every page&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;analyze_urls&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Same, but for an arbitrary URL list&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;export_report&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Styled Markdown or HTML report for one domain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;export_bulk_report&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Same, for comparisons / sitemaps / URL sets&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Everything interesting lives in &lt;code&gt;src/geo-checker.js&lt;/code&gt; (the scoring engine reused from the desktop app) and &lt;code&gt;src/index.js&lt;/code&gt; (the MCP wrapper).&lt;/p&gt;

&lt;h2&gt;The skeleton: less code than you'd think&lt;/h2&gt;

&lt;p&gt;The MCP SDK does most of the heavy lifting. A minimal version of the server is about twenty lines:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;McpServer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@modelcontextprotocol/sdk/server/mcp.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;StdioServerTransport&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@modelcontextprotocol/sdk/server/stdio.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;zod&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;checkGEO&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./geo-checker.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;McpServer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;glippy-mcp&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;0.1.0&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;analyze_domain&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Full GEO analysis of a domain&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;e.g. example.com — no https:// prefix&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;max_pages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;number&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;optional&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;max_pages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;checkGEO&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;maxPages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;max_pages&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;formatReport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;StdioServerTransport&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Zod schemas double as both runtime validation and the JSON schema Claude sees when deciding &lt;em&gt;how&lt;/em&gt; to call the tool, so clear &lt;code&gt;.describe()&lt;/code&gt; text matters more than the parameter name. "Do not include &lt;code&gt;https://&lt;/code&gt;" in the description saves a lot of round-trips where the model would otherwise guess wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  The decisions that actually mattered
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Reuse the engine, don't rewrite it
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;geo-checker.js&lt;/code&gt; is ~4,800 lines of &lt;code&gt;cheerio&lt;/code&gt;-based HTML inspection that has already been battle-tested against thousands of real-world sites. The MCP wrapper imports its public functions (&lt;code&gt;checkGEO&lt;/code&gt;, &lt;code&gt;analyseHTML&lt;/code&gt;, &lt;code&gt;analyseRobotsTxt&lt;/code&gt;, &lt;code&gt;parseSitemapUrls&lt;/code&gt;, &lt;code&gt;throttledFetchUrl&lt;/code&gt;, &lt;code&gt;aggregatePageScores&lt;/code&gt;) and does zero scoring itself. Every bug fix in the desktop app flows through to the MCP server for free.&lt;/p&gt;

&lt;p&gt;If you're MCP-ifying an existing tool, resist the urge to "do it properly this time." Wrap what you have.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Keep everything local except the license check
&lt;/h3&gt;

&lt;p&gt;A Glippy MCP license key (&lt;code&gt;GLMCP-XXXX-XXXX-XXXX&lt;/code&gt;) hits a Cloudflare Worker (&lt;code&gt;mcp-worker/&lt;/code&gt;) on first use and caches the result for 24 hours. Actual crawling and scoring run on the user's machine: no domains, results, or HTML ever leave their box.&lt;/p&gt;

&lt;p&gt;That choice kept the server very cheap to run (my Worker handles only verify/deactivate/Stripe-webhook traffic) and kept privacy-sensitive users happy. The validation logic falls back gracefully: if the license server is unreachable but a cached valid license exists, the tool keeps working.&lt;/p&gt;
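
&lt;p&gt;The fallback logic is roughly this shape. This is a sketch, not the shipped code: names like &lt;code&gt;validateLicense&lt;/code&gt; and &lt;code&gt;verifyRemote&lt;/code&gt;, and the cache layout, are illustrative.&lt;br&gt;
&lt;/p&gt;

```javascript
// Sketch of the license fallback described above. Names (validateLicense,
// verifyRemote, the cache shape) are illustrative, not the actual implementation.
const CACHE_TTL_MS = 24 * 60 * 60 * 1000; // a successful check is cached for 24 hours

async function validateLicense(key, cache, verifyRemote, now = Date.now()) {
  const cached = cache.get(key);
  if (cached) {
    // Fresh cached result: skip the network entirely.
    if (CACHE_TTL_MS > now - cached.checkedAt) return cached.valid;
  }
  try {
    const valid = await verifyRemote(key); // one round-trip to the Cloudflare Worker
    cache.set(key, { valid, checkedAt: now });
    return valid;
  } catch {
    // License server unreachable: honour any previously cached valid result.
    return cached ? cached.valid : false;
  }
}
```

&lt;p&gt;Passing &lt;code&gt;verifyRemote&lt;/code&gt; in as a parameter keeps the network edge swappable, which is also what makes the offline fallback trivial to test.&lt;/p&gt;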

&lt;h3&gt;
  
  
  3. Two tricks to avoid re-crawling
&lt;/h3&gt;

&lt;p&gt;Crawling a sitemap of 500 pages is expensive and rude. I added two layers of deduplication.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In-memory cache, 5-minute TTL.&lt;/strong&gt; Keyed on &lt;code&gt;domain + maxPages&lt;/code&gt;. The clever bit: if you ask for &lt;code&gt;max_pages=3&lt;/code&gt; and there's already a cached run at &lt;code&gt;max_pages=5&lt;/code&gt;, the cache hits. Subsequent tools in the same conversation (&lt;code&gt;get_geo_summary&lt;/code&gt;, &lt;code&gt;export_report&lt;/code&gt;) reuse the crawl automatically.&lt;/p&gt;
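
&lt;p&gt;The superset rule is a few lines of lookup. A sketch with made-up names (&lt;code&gt;findCached&lt;/code&gt;, the entry shape), assuming the cache is a small array of recent runs:&lt;br&gt;
&lt;/p&gt;

```javascript
// Sketch of the cache-hit rule described above; findCached and the entry
// shape are made up for illustration.
const TTL_MS = 5 * 60 * 1000; // 5-minute TTL

function findCached(cache, domain, maxPages, now = Date.now()) {
  for (const entry of cache) {
    if (entry.domain !== domain) continue;
    if (now - entry.storedAt > TTL_MS) continue; // expired entry
    // A run that crawled at least as many pages can answer a smaller request.
    if (entry.maxPages >= maxPages) return entry.result;
  }
  return null; // miss: caller crawls and stores a fresh entry
}
```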

&lt;p&gt;&lt;strong&gt;Explicit JSON output mode.&lt;/strong&gt; For workflows where the model needs to generate &lt;em&gt;multiple&lt;/em&gt; report formats, every analysis tool accepts &lt;code&gt;output_format="json"&lt;/code&gt;. The raw result object can then be handed to &lt;code&gt;export_report&lt;/code&gt; or &lt;code&gt;export_bulk_report&lt;/code&gt; via an &lt;code&gt;analysis_result&lt;/code&gt; parameter, bypassing the cache entirely. This shows up in practice as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;analyze_domain  domain="example.com" max_pages=5 output_format="json"
→ export_report format="html" analysis_result=&amp;lt;from above&amp;gt;
→ export_report format="markdown_full" analysis_result=&amp;lt;from above&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One crawl, three artifacts.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Per-domain rate limiting, not global
&lt;/h3&gt;

&lt;p&gt;The batch tools (&lt;code&gt;analyze_sitemap&lt;/code&gt;, &lt;code&gt;analyze_urls&lt;/code&gt;, &lt;code&gt;compare_domains&lt;/code&gt;) fan out concurrent fetches. Naively doing this against a single origin will get you rate-limited — or worse, blocked. The &lt;code&gt;throttledFetchUrl&lt;/code&gt; helper in &lt;code&gt;geo-checker.js&lt;/code&gt; keeps a per-host queue with a configurable rate (default 5 requests per second, tunable via the &lt;code&gt;GLIPPY_RATE_LIMIT&lt;/code&gt; env var or a &lt;code&gt;rate_limit&lt;/code&gt; parameter), while a global semaphore caps total in-flight requests at 10.&lt;/p&gt;

&lt;p&gt;Result: comparing &lt;code&gt;example.com&lt;/code&gt; and &lt;code&gt;competitor.com&lt;/code&gt; runs effectively in parallel because they're different origins, but hammering a single sitemap stays polite.&lt;/p&gt;
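
&lt;p&gt;The shape of that throttle can be sketched as a per-host promise chain plus a counting semaphore. Everything below is illustrative; &lt;code&gt;makeThrottler&lt;/code&gt; is not the real &lt;code&gt;throttledFetchUrl&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

```javascript
// Illustrative sketch: same-host requests queue behind each other with a delay,
// while a global semaphore caps total concurrency across all hosts.
function makeThrottler({ delayMs = 200, maxInFlight = 10 } = {}) {
  const hostQueues = new Map(); // host -> tail of that host's promise chain
  let inFlight = 0;
  const waiters = [];

  const acquire = () =>
    maxInFlight > inFlight
      ? (inFlight++, Promise.resolve())
      : new Promise((resolve) => waiters.push(resolve));
  const release = () => {
    const next = waiters.shift();
    if (next) next(); // hand the slot straight to a waiter
    else inFlight--;
  };

  return async function throttled(url, doFetch) {
    const host = new URL(url).host;
    const prev = hostQueues.get(host) ?? Promise.resolve();
    // Chain onto this host's queue so same-origin requests stay spaced out.
    const run = prev.then(async () => {
      await acquire();
      try {
        return await doFetch(url);
      } finally {
        release();
        await new Promise((r) => setTimeout(r, delayMs)); // per-host spacing
      }
    });
    hostQueues.set(host, run.catch(() => {})); // keep the chain alive on errors
    return run;
  };
}
```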

&lt;h3&gt;
  
  
  5. Stderr for logs, stdout is sacred
&lt;/h3&gt;

&lt;p&gt;MCP over stdio means stdout is a JSON-RPC channel. A stray &lt;code&gt;console.log&lt;/code&gt; anywhere in the engine will corrupt the frame and the client will disconnect with a cryptic parse error. Route every log through &lt;code&gt;console.error&lt;/code&gt;, and audit third-party dependencies for chatty output. I caught one cheerio helper printing a deprecation warning to stdout in an older version; pinning fixed it.&lt;/p&gt;
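
&lt;p&gt;A tiny logger wrapper makes the rule hard to break by accident. Illustrative sketch:&lt;br&gt;
&lt;/p&gt;

```javascript
// Pattern sketch: every diagnostic line goes to stderr so stdout stays a
// clean JSON-RPC channel. makeLogger is a made-up name for illustration.
function makeLogger(stream = process.stderr) {
  return (level, ...parts) => {
    const line = `[${level}] ${parts.join(" ")}`;
    stream.write(line + "\n"); // never console.log: that would corrupt a stdout frame
    return line;
  };
}

const log = makeLogger();
log("info", "crawl started");
```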

&lt;h2&gt;
  
  
  Client config: the part users actually interact with
&lt;/h2&gt;

&lt;p&gt;Getting an MCP server installed is still the single biggest adoption barrier. I wrote one config block and then copy-pasted it across every guide:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"glippy-geo"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"glippy-mcp"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"GLIPPY_LICENSE_KEY"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"GLMCP-XXXX-XXXX-XXXX"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same JSON works for Claude Desktop, Claude Code (&lt;code&gt;.mcp.json&lt;/code&gt;), Cursor (&lt;code&gt;.cursor/mcp.json&lt;/code&gt;), Windsurf (&lt;code&gt;.windsurf/mcp.json&lt;/code&gt;), and Continue.dev. Using &lt;code&gt;npx -y&lt;/code&gt; means users don't manage a global install; they always get the latest published version.&lt;/p&gt;

&lt;p&gt;For ChatGPT / OpenAI, which doesn't speak MCP natively yet, a small bridge does the job, but that's a post for another day.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd do differently
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ship JSON-mode from day one.&lt;/strong&gt; I added it in v0.1 after realising chained exports were the most common workflow. Cache-hit logic is fine, but explicit result passing is faster and more predictable for agents.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fewer tools, sharper descriptions.&lt;/strong&gt; Nine is on the edge of "too many to reason about." In hindsight, &lt;code&gt;analyze_domain&lt;/code&gt; with rich options subsumes half of the others. Next major version might consolidate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streaming responses.&lt;/strong&gt; A full sitemap crawl can take a minute. Right now it's a single tool call; a streaming update ("scored 42/500 pages…") would be a nicer UX once MCP clients support progress notifications more widely.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx &lt;span class="nt"&gt;-y&lt;/span&gt; glippy-mcp   &lt;span class="c"&gt;# needs a license key — grab one at glippy.dev&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Drop the config block into your MCP client of choice and ask Claude something like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Give me a GEO readiness summary for stripe.com and explain the top three issues in plain English.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The whole project ended up being a small reminder that MCP is mostly just "expose your existing tool well." The SDK is thin, the protocol is boring in a good way, and the hard problems (caching, rate limiting, clean log separation) are the same ones you already know from building any CLI.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Why Lovable.dev sites struggle with search engine and LLM indexing</title>
      <dc:creator>Jan-Willem Bobbink</dc:creator>
      <pubDate>Sun, 01 Feb 2026 12:42:16 +0000</pubDate>
      <link>https://dev.to/jbobbink/why-lovabledev-sites-struggle-with-search-engine-and-llm-indexing-36kp</link>
      <guid>https://dev.to/jbobbink/why-lovabledev-sites-struggle-with-search-engine-and-llm-indexing-36kp</guid>
      <description>&lt;p&gt;&lt;strong&gt;&lt;a href="https://lovable.dev/" rel="noopener noreferrer"&gt;Lovable.dev&lt;/a&gt;'s pure client-side rendering architecture creates significant SEO challenges&lt;/strong&gt; because search engines receive only an empty HTML shell when crawling these React applications. Google takes approximately &lt;strong&gt;9x longer&lt;/strong&gt; to index JavaScript-heavy pages compared to static HTML, and other search engines—including AI crawlers—often cannot render the content at all. The platform itself &lt;a href="https://docs.lovable.dev/tips-tricks/seo" rel="noopener noreferrer"&gt;acknowledges these limitations&lt;/a&gt;, noting that indexing can take "days instead of hours" and that social media previews are broken by default.&lt;/p&gt;

&lt;p&gt;This problem isn't unique to Lovable.dev—it affects most single-page applications (SPAs) built with React, Vue, or Angular that rely on client-side JavaScript to render content. The solutions range from implementing server-side rendering to using prerendering services, with SEO experts like &lt;a href="https://notprovided.eu/" rel="noopener noreferrer"&gt;Jan-Willem Bobbink&lt;/a&gt; consistently recommending SSR as the safest approach for SEO-critical sites.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lovable.dev's technical architecture creates an empty-shell problem
&lt;/h2&gt;

&lt;p&gt;Lovable.dev generates React applications using a modern but SEO-problematic stack: &lt;strong&gt;React with TypeScript, &lt;a href="https://vitejs.dev/" rel="noopener noreferrer"&gt;Vite&lt;/a&gt; for builds, &lt;a href="https://tailwindcss.com/" rel="noopener noreferrer"&gt;Tailwind CSS&lt;/a&gt; with &lt;a href="https://ui.shadcn.com/" rel="noopener noreferrer"&gt;shadcn/ui&lt;/a&gt; components, and &lt;a href="https://reactrouter.com/" rel="noopener noreferrer"&gt;React Router&lt;/a&gt; for client-side navigation&lt;/strong&gt;. The platform exclusively produces &lt;a href="https://docs.lovable.dev/features/tech-stack" rel="noopener noreferrer"&gt;client-side rendered (CSR) single-page applications&lt;/a&gt; with no built-in server-side rendering options.&lt;/p&gt;

&lt;p&gt;When a search engine crawler visits a Lovable.dev site, it receives HTML that looks essentially like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;html&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;head&amp;gt;&amp;lt;title&amp;gt;&lt;/span&gt;Loading...&lt;span class="nt"&gt;&amp;lt;/title&amp;gt;&amp;lt;/head&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;body&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"root"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;script &lt;/span&gt;&lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"/bundle.js"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/body&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/html&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All meaningful content—text, images, navigation, metadata—exists only after JavaScript executes in the browser. &lt;a href="https://docs.lovable.dev/tips-tricks/seo" rel="noopener noreferrer"&gt;Lovable's own documentation&lt;/a&gt; acknowledges this limitation: "Platforms like Facebook, X/Twitter, and LinkedIn do not wait for content to render, so they only see the initial HTML page structure."&lt;/p&gt;
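
&lt;p&gt;You can spot the symptom from the raw HTML alone. Here is a rough heuristic (illustrative only; a real audit would fetch the page with JavaScript disabled and diff it against the rendered version):&lt;br&gt;
&lt;/p&gt;

```javascript
// Rough heuristic for the empty-shell symptom: does the raw HTML contain any
// visible body text, or only a mount point plus script tags? Illustrative only.
// (\u003c is the escaped less-than sign, used so the snippet stays feed-safe;
// the regexes match literal HTML tags.)
function looksLikeEmptyShell(html) {
  const bodyMatch = html.match(/\u003cbody[^>]*>([\s\S]*)\u003c\/body>/i);
  const body = bodyMatch ? bodyMatch[1] : "";
  const withoutScripts = body.replace(/\u003cscript[\s\S]*?\u003c\/script>/gi, "");
  const visibleText = withoutScripts.replace(/\u003c[^>]+>/g, "").trim();
  return visibleText.length === 0;
}
```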

&lt;p&gt;The platform offers workarounds but no native fix. Users can &lt;a href="https://docs.lovable.dev/features/deploy" rel="noopener noreferrer"&gt;export their code to GitHub&lt;/a&gt; and deploy elsewhere, use prerendering services like &lt;a href="https://prerender.io/" rel="noopener noreferrer"&gt;Prerender.io&lt;/a&gt; or &lt;a href="https://lovablehtml.com/" rel="noopener noreferrer"&gt;LovableHTML&lt;/a&gt; ($9+/month), or migrate entirely to &lt;a href="https://nextjs.org/" rel="noopener noreferrer"&gt;Next.js&lt;/a&gt;—though this breaks Lovable's visual editor functionality.&lt;/p&gt;

&lt;h2&gt;
  
  
  Google's two-wave indexing creates multi-day delays
&lt;/h2&gt;

&lt;p&gt;Google processes JavaScript websites through a &lt;strong&gt;three-phase pipeline: crawling, rendering, and indexing&lt;/strong&gt;. When Googlebot first visits a CSR page, it captures the raw HTML immediately but places the page in a rendering queue for JavaScript execution. Google's &lt;a href="https://www.youtube.com/watch?v=PFEakcD3CSs" rel="noopener noreferrer"&gt;Tom Greenaway confirmed at Google I/O&lt;/a&gt; that "the final render can actually arrive several days later."&lt;/p&gt;

&lt;p&gt;This creates what researchers call "flaky indexing." The same page might appear differently on different crawl attempts. Some pages get fully indexed while others remain partially indexed or show errors like "Crawled – currently not indexed" in Search Console. A &lt;a href="https://www.onely.com/blog/javascript-seo-experiment/" rel="noopener noreferrer"&gt;study by Onely&lt;/a&gt; demonstrated that Google takes &lt;strong&gt;up to 9 times longer&lt;/strong&gt; to properly render JavaScript pages than static HTML.&lt;/p&gt;

&lt;h3&gt;
  
  
  The crawl and render budget problem
&lt;/h3&gt;

&lt;p&gt;Search engines allocate finite computational resources for JavaScript execution. Sites that exceed their rendering budget experience &lt;strong&gt;up to 40% lower indexation rates&lt;/strong&gt; and 23% decreased organic traffic. Heavy JavaScript bundles—particularly common in React applications—can cause Google to abandon rendering entirely before completion.&lt;/p&gt;

&lt;p&gt;Each JavaScript file competes for crawl budget allocation. Framework files (React, Redux), third-party scripts (analytics, ads), and component libraries all require HTTP requests and processing time. Failed rendering attempts waste budget without producing successful indexing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Beyond Google: other crawlers struggle more
&lt;/h3&gt;

&lt;p&gt;While Googlebot uses an evergreen Chromium browser for rendering, other crawlers have more limited capabilities:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Crawler&lt;/th&gt;
&lt;th&gt;JavaScript Support&lt;/th&gt;
&lt;th&gt;Implication&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Googlebot&lt;/td&gt;
&lt;td&gt;Full (with delays)&lt;/td&gt;
&lt;td&gt;Content eventually indexed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bingbot&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;Microsoft recommends dynamic rendering&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DuckDuckBot&lt;/td&gt;
&lt;td&gt;Minimal&lt;/td&gt;
&lt;td&gt;Requires static content&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI Crawlers (GPTBot, ClaudeBot)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;None&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Completely miss CSR content&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Social Media Bots&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Broken link previews&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://www.bing.com/webmasters/help/webmasters-guidelines-30fba23a" rel="noopener noreferrer"&gt;Bing's official documentation&lt;/a&gt; explicitly states: "In order to increase the predictability of crawling and indexing by Bing, we recommend dynamic rendering as a great alternative for websites relying heavily on JavaScript." Tests by &lt;a href="https://www.screamingfrog.co.uk/javascript-seo/" rel="noopener noreferrer"&gt;Screaming Frog&lt;/a&gt; found that Angular.io—a JavaScript-heavy site—shows "problematic indexing issues" in Bing with missing canonical tags, meta descriptions, and H1 elements.&lt;/p&gt;

&lt;h2&gt;
  
  
  Six specific indexing challenges affect Lovable.dev sites
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Metadata and title tags aren't visible to crawlers
&lt;/h3&gt;

&lt;p&gt;Meta tags generated client-side may not be processed during the first crawl. The &lt;code&gt;&amp;lt;title&amp;gt;&lt;/code&gt; tag must exist before JavaScript execution for proper indexing. Social media crawlers don't execute JavaScript at all, which is why Lovable sites often display generic or incorrect information when shared on Facebook, Twitter, or LinkedIn.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/nfl/react-helmet" rel="noopener noreferrer"&gt;React Helmet&lt;/a&gt; can manage meta tags dynamically, but &lt;strong&gt;must be combined with SSR&lt;/strong&gt; for full effectiveness with search engines.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Structured data often goes unseen
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.searchenginejournal.com/structured-data-javascript-ai-crawlers/" rel="noopener noreferrer"&gt;Search Engine Journal&lt;/a&gt; reports that "structured data added only through client-side JavaScript is invisible to most AI crawlers." While Googlebot can eventually process JavaScript-generated JSON-LD, the rendering delays and potential failures create inconsistency. Rich results may not appear if schema markup isn't in the initial HTML.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Internal links may not be crawlable
&lt;/h3&gt;

&lt;p&gt;Links created via &lt;code&gt;onclick&lt;/code&gt; events or &lt;code&gt;addEventListener&lt;/code&gt; are &lt;a href="https://developers.google.com/search/docs/crawling-indexing/links-crawlable" rel="noopener noreferrer"&gt;not crawlable&lt;/a&gt;. Google ignores URL fragments (&lt;code&gt;#&lt;/code&gt;), meaning SPAs using hash-based routing appear as a single URL. A &lt;a href="https://www.momentic.ai/blog/react-seo-case-study" rel="noopener noreferrer"&gt;case study documented by Momentic&lt;/a&gt; found that a React website lost &lt;strong&gt;51% of traffic&lt;/strong&gt; partly because "link types that were not crawlable" were implemented as click events rather than proper &lt;code&gt;&amp;lt;a href&amp;gt;&lt;/code&gt; elements.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Core Web Vitals suffer under client-side rendering
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://web.dev/lcp/" rel="noopener noreferrer"&gt;Largest Contentful Paint (LCP)&lt;/a&gt;&lt;/strong&gt; typically performs poorly with CSR because content loads only after JavaScript execution. With pure client-side rendering, the LCP element doesn't exist in initial HTML—JavaScript must build the DOM first, creating significant render delays. The target is 2.5 seconds or less; CSR sites often exceed this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://web.dev/cls/" rel="noopener noreferrer"&gt;Cumulative Layout Shift (CLS)&lt;/a&gt;&lt;/strong&gt; increases as JavaScript-rendered content causes elements to shift during load. Brands optimizing their rendering approach report &lt;strong&gt;67% reduction in layout shifts&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Mobile-first indexing amplifies the problem
&lt;/h3&gt;

&lt;p&gt;Google primarily uses mobile Googlebot for indexing. Mobile devices have slower processors and limited bandwidth, making JavaScript execution significantly slower. Industry guidelines recommend keeping JavaScript bundles under &lt;strong&gt;100-170KB minified and gzipped&lt;/strong&gt; for initial load—a threshold many React applications exceed.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. AI search visibility is nearly zero
&lt;/h3&gt;

&lt;p&gt;Modern AI assistants like ChatGPT, Claude, and Perplexity rely on crawlers that don't execute JavaScript. &lt;a href="https://vercel.com/blog/how-search-engines-and-llms-see-your-site" rel="noopener noreferrer"&gt;Vercel research&lt;/a&gt; found that most AI crawlers "only fetch static HTML and bypass JavaScript." &lt;a href="https://docs.lovable.dev/tips-tricks/seo" rel="noopener noreferrer"&gt;Lovable's documentation&lt;/a&gt; acknowledges: "Many AI systems don't reliably see dynamically rendered content, so they may miss your pages or only see partial content."&lt;/p&gt;

&lt;h2&gt;
  
  
  Jan-Willem Bobbink's framework for JavaScript SEO
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://notprovided.eu/" rel="noopener noreferrer"&gt;Jan-Willem Bobbink&lt;/a&gt;, founder of notprovided.eu and an SEO consultant with &lt;strong&gt;30 years of web development experience&lt;/strong&gt;, has become a leading voice on JavaScript SEO. At &lt;a href="https://www.brightonseo.com/" rel="noopener noreferrer"&gt;BrightonSEO&lt;/a&gt; 2019, he presented findings from building 10 websites using the 10 most popular JavaScript frameworks—conducting hands-on testing rather than relying on client data alone.&lt;/p&gt;

&lt;p&gt;His observation that JavaScript framework adoption among clients jumped from &lt;strong&gt;28% in 2016 to 65% in 2019&lt;/strong&gt; underscores why this expertise matters. His &lt;a href="https://notprovided.eu/javascript-seo-recommendations/" rel="noopener noreferrer"&gt;ten core recommendations&lt;/a&gt; provide a practical framework for addressing JavaScript SEO challenges.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bobbink's primary recommendation: server-side rendering
&lt;/h3&gt;

&lt;p&gt;"Server Side Rendering (SSR) is just the safest way to go," Bobbink states. "For SEO you just don't want to take a risk Google sees anything else than a fully SEO optimized page in the initial crawl." He specifically recommends &lt;strong&gt;&lt;a href="https://nextjs.org/" rel="noopener noreferrer"&gt;Next.js&lt;/a&gt;&lt;/strong&gt; as an SEO-friendly framework for React development.&lt;/p&gt;

&lt;p&gt;His preferred approach is a &lt;strong&gt;hybrid model&lt;/strong&gt;: "Content and important elements for SEO are delivered as Server Side Rendered and then you sprinkle all the UX/CX improvements for the visitors as a Client Side Rendered 'layer.'"&lt;/p&gt;
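
&lt;p&gt;A toy rendering of that hybrid split: the server emits complete, crawlable HTML, and a small deferred script layers on client-side interactivity afterwards. Hand-rolled here for clarity; a real stack would use Next.js or similar.&lt;br&gt;
&lt;/p&gt;

```javascript
// Toy illustration of the hybrid model: SEO-critical markup is fully formed
// before any JavaScript runs; the CSR "layer" only enhances UX.
// (\u003c is the escaped less-than sign, used so the snippet stays feed-safe.)
function renderProductPage({ title, description }) {
  return [
    `\u003c!doctype html>\u003chtml>\u003chead>`,
    `\u003ctitle>${title}\u003c/title>`,
    `\u003cmeta name="description" content="${description}">`,
    `\u003c/head>\u003cbody>`,
    `\u003cmain>\u003ch1>${title}\u003c/h1>\u003cp>${description}\u003c/p>\u003c/main>`,
    `\u003cscript src="/enhance.js" defer>\u003c/script>`, // client-side layer, UX only
    `\u003c/body>\u003c/html>`,
  ].join("");
}
```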

&lt;h3&gt;
  
  
  Critical technical warnings from Bobbink
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Data persistence creates ranking risks.&lt;/strong&gt; "Googlebot is crawling with a headless browser, not passing anything to the next successive URL request." Sites using cookies, local storage, or session data to populate SEO elements—like personalized product links—have lost rankings because crawlers don't carry this data between requests.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unit test your SSR implementation.&lt;/strong&gt; Bobbink shared a case where broken SSR caused &lt;strong&gt;two weeks of visibility loss&lt;/strong&gt;. He recommends &lt;a href="https://jestjs.io/" rel="noopener noreferrer"&gt;Jest&lt;/a&gt; for Angular and React testing, and &lt;a href="https://test-utils.vuejs.org/" rel="noopener noreferrer"&gt;vue-test-utils&lt;/a&gt; for Vue applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monitor prerendering services for failures.&lt;/strong&gt; Services like &lt;a href="https://prerender.io/" rel="noopener noreferrer"&gt;Prerender.io&lt;/a&gt; can fail silently. He advocates monitoring tools like &lt;a href="https://www.contentkingapp.com/" rel="noopener noreferrer"&gt;ContentKing&lt;/a&gt;, &lt;a href="https://littlewarden.com/" rel="noopener noreferrer"&gt;Little Warden&lt;/a&gt;, &lt;a href="https://pagemodified.com/" rel="noopener noreferrer"&gt;PageModified&lt;/a&gt;, and &lt;a href="https://seoradar.com/" rel="noopener noreferrer"&gt;SEORadar&lt;/a&gt; to detect when rendered pages differ from expectations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Bobbink advises against dynamic rendering
&lt;/h3&gt;

&lt;p&gt;Despite Google historically promoting &lt;a href="https://developers.google.com/search/docs/crawling-indexing/javascript/dynamic-rendering" rel="noopener noreferrer"&gt;dynamic rendering&lt;/a&gt;, Bobbink advises against it due to &lt;strong&gt;outdated content issues&lt;/strong&gt;. Cached rendered pages can serve stale prices, ratings, or stock information in rich snippets—creating poor user experiences and potential policy violations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Solutions for improving Lovable.dev site indexability
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Option 1: Migrate to Next.js for proper SSR
&lt;/h3&gt;

&lt;p&gt;The most comprehensive solution involves exporting Lovable code to GitHub and converting to &lt;a href="https://nextjs.org/" rel="noopener noreferrer"&gt;Next.js&lt;/a&gt;. Tools like "&lt;a href="https://vitetonext.ai/" rel="noopener noreferrer"&gt;ViteToNext.AI&lt;/a&gt;" and "&lt;a href="https://github.com/engrafstudio/next-lovable" rel="noopener noreferrer"&gt;next-lovable&lt;/a&gt;" facilitate this migration. Next.js provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Server-side rendering&lt;/strong&gt; via &lt;code&gt;getServerSideProps&lt;/code&gt; for dynamic content&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Static site generation&lt;/strong&gt; via &lt;code&gt;getStaticProps&lt;/code&gt; for content that doesn't change frequently&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Incremental Static Regeneration (ISR)&lt;/strong&gt; for automatic page updates without full rebuilds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Built-in metadata API&lt;/strong&gt; for proper SEO tags in initial HTML&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Native sitemap and robots.txt generation&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The trade-off: Lovable's visual editor no longer functions after migration.&lt;/p&gt;

&lt;h3&gt;
  
  
  Option 2: Implement prerendering services
&lt;/h3&gt;

&lt;p&gt;Prerendering services intercept crawler requests and serve pre-rendered HTML while users receive the normal JavaScript application.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://prerender.io/" rel="noopener noreferrer"&gt;Prerender.io&lt;/a&gt;&lt;/strong&gt; (industry leader): Starts at $9/month for 3,000 renders, with average delivery time of 0.03 seconds. Supports Google, Bing, and AI crawlers. Requires Cloudflare Workers or similar proxy configuration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://lovablehtml.com/" rel="noopener noreferrer"&gt;LovableHTML&lt;/a&gt;&lt;/strong&gt;: Built specifically for Lovable.dev sites at $9+/month.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/GoogleChrome/rendertron" rel="noopener noreferrer"&gt;Rendertron&lt;/a&gt;&lt;/strong&gt;: Google's open-source solution. Free but requires self-hosting and DevOps expertise.&lt;/p&gt;

&lt;h3&gt;
  
  
  Option 3: Add SSR via Vike
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://vike.dev/" rel="noopener noreferrer"&gt;Vike&lt;/a&gt; (formerly vite-plugin-ssr) can add server-side rendering to existing Vite projects. This preserves the React Router structure but requires VPS deployment rather than Lovable's built-in hosting.&lt;/p&gt;

&lt;h3&gt;
  
  
  Option 4: Islands architecture with Astro
&lt;/h3&gt;

&lt;p&gt;For content-heavy sites, &lt;a href="https://astro.build/" rel="noopener noreferrer"&gt;Astro&lt;/a&gt; provides an alternative approach: render pages as static HTML with isolated "islands" of interactivity that hydrate independently. This ships zero JavaScript by default, adding client-side code only where interactivity is required.&lt;/p&gt;

&lt;h2&gt;
  
  
  Google's official recommendations for JavaScript sites
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://developers.google.com/search/docs/crawling-indexing/javascript/javascript-seo-basics" rel="noopener noreferrer"&gt;Google Search Central documentation&lt;/a&gt;, updated in December 2025, provides clear guidance for JavaScript-heavy websites.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dynamic rendering is now deprecated as a long-term strategy.&lt;/strong&gt; Google &lt;a href="https://developers.google.com/search/docs/crawling-indexing/javascript/dynamic-rendering" rel="noopener noreferrer"&gt;explicitly states&lt;/a&gt;: "Dynamic rendering was a workaround and not a long-term solution for problems with JavaScript-generated content in search engines. Instead, we recommend that you use server-side rendering, static rendering, or hydration as a solution."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don't block JavaScript resources.&lt;/strong&gt; Ensure robots.txt allows all JavaScript files, CSS files, and API endpoints needed for rendering. Blocking these prevents Google from understanding pages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use proper HTML links.&lt;/strong&gt; Links must be implemented as &lt;code&gt;&amp;lt;a href&amp;gt;&lt;/code&gt; elements, not &lt;code&gt;&amp;lt;span onclick&amp;gt;&lt;/code&gt; or JavaScript event handlers. Google &lt;a href="https://developers.google.com/search/docs/crawling-indexing/links-crawlable" rel="noopener noreferrer"&gt;may not follow&lt;/a&gt; programmatically triggered navigation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Place metadata in initial HTML.&lt;/strong&gt; Canonical URLs and robots directives should exist in server-rendered HTML. Google &lt;a href="https://developers.google.com/search/docs/crawling-indexing/javascript/fix-search-javascript" rel="noopener noreferrer"&gt;advises&lt;/a&gt;: "You shouldn't use JavaScript to change the canonical URL to something else than the URL you specified as the canonical URL in the original HTML."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;HTTP status codes matter.&lt;/strong&gt; Google may skip JavaScript rendering entirely on pages that return non-200 status codes. Return a proper 404 for missing pages rather than serving a soft 404.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical implementation priorities for Lovable.dev users
&lt;/h2&gt;

&lt;p&gt;For sites where SEO is a primary growth channel, the recommended approach depends on project scale and resources:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Recommended Solution&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;New SEO-critical project&lt;/td&gt;
&lt;td&gt;Build with &lt;a href="https://nextjs.org/" rel="noopener noreferrer"&gt;Next.js&lt;/a&gt; instead of Lovable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Existing Lovable site, limited budget&lt;/td&gt;
&lt;td&gt;Implement &lt;a href="https://prerender.io/" rel="noopener noreferrer"&gt;Prerender.io&lt;/a&gt; or &lt;a href="https://lovablehtml.com/" rel="noopener noreferrer"&gt;LovableHTML&lt;/a&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Large site with development resources&lt;/td&gt;
&lt;td&gt;Migrate to Next.js with SSR/SSG hybrid&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Content marketing focus&lt;/td&gt;
&lt;td&gt;Consider &lt;a href="https://astro.build/" rel="noopener noreferrer"&gt;Astro&lt;/a&gt; for static generation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://docs.lovable.dev/tips-tricks/seo" rel="noopener noreferrer"&gt;Lovable acknowledges&lt;/a&gt; that SSR "may help" for very large sites, projects where organic search is the primary growth channel, highly competitive verticals, and sites prioritizing AI/LLM visibility. For applications where SEO matters less than rapid development—internal tools, authenticated dashboards, or apps primarily shared via direct links—Lovable's CSR architecture presents fewer concerns.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The core tension with &lt;a href="https://lovable.dev/" rel="noopener noreferrer"&gt;Lovable.dev&lt;/a&gt; is architectural: the platform optimizes for rapid full-stack application development using client-side rendering, while search engines and AI crawlers work best with server-rendered content. This isn't a bug but a fundamental trade-off inherent to the platform's design.&lt;/p&gt;

&lt;p&gt;The practical path forward depends on priorities. Teams needing strong SEO should either avoid Lovable.dev for those projects, implement prerendering services immediately, or plan for eventual migration to SSR frameworks like &lt;a href="https://nextjs.org/" rel="noopener noreferrer"&gt;Next.js&lt;/a&gt;. &lt;a href="https://notprovided.eu/" rel="noopener noreferrer"&gt;Jan-Willem Bobbink&lt;/a&gt;'s hybrid approach—server-rendered SEO elements with client-side UX enhancements—represents the industry consensus on balancing searchability with interactivity.&lt;/p&gt;

&lt;p&gt;As AI-powered search grows in importance, the inability of AI crawlers to execute JavaScript makes this problem increasingly urgent. Sites invisible to ChatGPT, Claude, and Perplexity miss a growing discovery channel. Google's &lt;a href="https://developers.google.com/search/docs/crawling-indexing/javascript/dynamic-rendering" rel="noopener noreferrer"&gt;December 2025 deprecation of dynamic rendering&lt;/a&gt; as a long-term strategy signals that the search giant expects sites to solve JavaScript SEO at the source through proper SSR implementation rather than workarounds.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Lovable.dev Documentation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.lovable.dev/tips-tricks/seo" rel="noopener noreferrer"&gt;Lovable Docs – SEO&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.lovable.dev/features/tech-stack" rel="noopener noreferrer"&gt;Lovable Docs – Tech Stack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.lovable.dev/features/deploy" rel="noopener noreferrer"&gt;Lovable Docs – Deploying&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Google Official Sources
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://developers.google.com/search/docs/crawling-indexing/javascript/javascript-seo-basics" rel="noopener noreferrer"&gt;Google Search Central – JavaScript SEO Basics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.google.com/search/docs/crawling-indexing/javascript/fix-search-javascript" rel="noopener noreferrer"&gt;Google Search Central – Fix JavaScript Issues&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.google.com/search/docs/crawling-indexing/javascript/dynamic-rendering" rel="noopener noreferrer"&gt;Google Search Central – Dynamic Rendering (Deprecated)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.google.com/search/docs/crawling-indexing/links-crawlable" rel="noopener noreferrer"&gt;Google Search Central – Links Best Practices&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/watch?v=PFEakcD3CSs" rel="noopener noreferrer"&gt;Tom Greenaway – Google I/O Talk on JavaScript Rendering&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Research &amp;amp; Studies
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.onely.com/blog/javascript-seo-experiment/" rel="noopener noreferrer"&gt;Onely – JavaScript SEO Experiment (9x Rendering Delay)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.momentic.ai/blog/react-seo-case-study" rel="noopener noreferrer"&gt;Momentic – React Website Lost 51% of Traffic (Case Study)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.screamingfrog.co.uk/javascript-seo/" rel="noopener noreferrer"&gt;Screaming Frog – JavaScript &amp;amp; Bing Indexing Issues&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Bing &amp;amp; Other Search Engines
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.bing.com/webmasters/help/webmasters-guidelines-30fba23a" rel="noopener noreferrer"&gt;Bing Webmaster Guidelines – Dynamic Rendering&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blogs.bing.com/webmaster/2024/javascript-seo-recommendations" rel="noopener noreferrer"&gt;Bing Webmaster Blog – JavaScript SEO&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  AI Search &amp;amp; Crawlers
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://vercel.com/blog/how-search-engines-and-llms-see-your-site" rel="noopener noreferrer"&gt;Vercel – AI Crawlers and Static HTML&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.searchenginejournal.com/structured-data-javascript-ai-crawlers/" rel="noopener noreferrer"&gt;Search Engine Journal – Structured Data &amp;amp; AI Crawlers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Tools &amp;amp; Solutions
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://prerender.io/" rel="noopener noreferrer"&gt;Prerender.io&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://lovablehtml.com/" rel="noopener noreferrer"&gt;LovableHTML&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/GoogleChrome/rendertron" rel="noopener noreferrer"&gt;Rendertron (Google Open Source)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://nextjs.org/" rel="noopener noreferrer"&gt;Next.js&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://vike.dev/" rel="noopener noreferrer"&gt;Vike (formerly vite-plugin-ssr)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://astro.build/" rel="noopener noreferrer"&gt;Astro&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://vitetonext.ai/" rel="noopener noreferrer"&gt;ViteToNext.AI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/engrafstudio/next-lovable" rel="noopener noreferrer"&gt;next-lovable&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  SEO Monitoring Tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.contentkingapp.com/" rel="noopener noreferrer"&gt;ContentKing (now Conductor)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://littlewarden.com/" rel="noopener noreferrer"&gt;Little Warden&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pagemodified.com/" rel="noopener noreferrer"&gt;PageModified&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://seoradar.com/" rel="noopener noreferrer"&gt;SEORadar&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Jan-Willem Bobbink
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://notprovided.eu/" rel="noopener noreferrer"&gt;notprovided.eu&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.brightonseo.com/" rel="noopener noreferrer"&gt;BrightonSEO 2019 – JavaScript SEO Presentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://notprovided.eu/javascript-seo-recommendations/" rel="noopener noreferrer"&gt;10 Recommendations for JavaScript SEO&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Core Web Vitals
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://web.dev/vitals/" rel="noopener noreferrer"&gt;Google Web Vitals&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://web.dev/lcp/" rel="noopener noreferrer"&gt;Largest Contentful Paint (LCP)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://web.dev/cls/" rel="noopener noreferrer"&gt;Cumulative Layout Shift (CLS)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>architecture</category>
      <category>javascript</category>
      <category>llm</category>
      <category>react</category>
    </item>
    <item>
      <title>17 common SEO mistakes LLMs and vibecoders make</title>
      <dc:creator>Jan-Willem Bobbink</dc:creator>
      <pubDate>Tue, 13 Jan 2026 17:19:54 +0000</pubDate>
      <link>https://dev.to/jbobbink/17-common-seo-mistakes-llms-and-vibecoders-make-2h9j</link>
      <guid>https://dev.to/jbobbink/17-common-seo-mistakes-llms-and-vibecoders-make-2h9j</guid>
      <description>&lt;p&gt;The rise of AI-assisted development has democratized coding like never before. Anyone can spin up a SaaS, build a landing page, or create a web app by prompting their way to a working product. But here's the uncomfortable truth: most of these projects are SEO disasters waiting to happen.&lt;/p&gt;

&lt;p&gt;LLMs don't inherently understand SEO. They generate code that &lt;em&gt;works&lt;/em&gt;, not code that &lt;em&gt;ranks&lt;/em&gt;. And vibecoders (developers who ship by feel rather than fundamentals) often lack the technical SEO knowledge to catch these issues before they tank their organic traffic.&lt;/p&gt;

&lt;p&gt;After analyzing countless AI-generated codebases and vibe-coded projects, here are the 17 most common SEO mistakes I see repeatedly.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Client-side rendering without SSR or SSG
&lt;/h2&gt;

&lt;p&gt;This is the big one. LLMs default to whatever framework is most popular in their training data, which often means React SPAs with client-side rendering.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// What the LLM generates&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;BlogPost&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;slug&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setPost&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`/api/posts/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setPost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()));&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;article&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;article&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Googlebot will see an empty &lt;code&gt;&amp;lt;article&amp;gt;&lt;/code&gt; tag. Your content doesn't exist until JavaScript executes, and while Google &lt;em&gt;can&lt;/em&gt; render JavaScript, it's slow, unreliable, and puts you at a significant disadvantage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Use Next.js with &lt;code&gt;getStaticProps&lt;/code&gt; or &lt;code&gt;getServerSideProps&lt;/code&gt;, Nuxt with SSR, or Astro for content-heavy sites. If you must use a SPA, implement pre-rendering or dynamic rendering for crawlers.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Hash-based or query parameter routing
&lt;/h2&gt;

&lt;p&gt;LLMs often generate routing patterns that are technically functional but SEO-hostile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Terrible for SEO&lt;/span&gt;
&lt;span class="nx"&gt;yoursite&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;blog&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;my&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt;
&lt;span class="nx"&gt;yoursite&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nx"&gt;blog&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;123&lt;/span&gt;

&lt;span class="c1"&gt;// What you actually need&lt;/span&gt;
&lt;span class="nx"&gt;yoursite&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;com&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;blog&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;my&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hash fragments (&lt;code&gt;#&lt;/code&gt;) are completely ignored by search engines. Query parameters create duplicate content issues and look spammy to users.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Always use clean, semantic URL paths. Configure your framework's router for history-based navigation, not hash-based.&lt;/p&gt;
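&lt;p&gt;A minimal sketch of the history-based setup, assuming React Router v6.4+ (the route paths and module names are illustrative):&lt;/p&gt;

```javascript
// createBrowserRouter uses the History API and clean URL paths;
// createHashRouter is what produces the crawler-hostile /#/ URLs above.
import { createBrowserRouter } from 'react-router-dom';

const router = createBrowserRouter([
  { path: '/', lazy: () => import('./routes/home') },
  { path: '/blog/:slug', lazy: () => import('./routes/blog-post') },
]);
```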

&lt;h2&gt;
  
  
  3. Auto-generated slugs without human review
&lt;/h2&gt;

&lt;p&gt;When LLMs generate content management systems, they typically create slugs from titles automatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;slug&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="sr"&gt;+/g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;-&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// "10 Best Ways to Optimize Your Website!!!" becomes&lt;/span&gt;
&lt;span class="c1"&gt;// "10-best-ways-to-optimize-your-website!!!"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This produces slugs with special characters, excessive length, and no keyword optimization. Worse, if you change a title, many systems regenerate the slug, breaking existing links without redirects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Generate slugs as suggestions, but require human approval. Strip special characters, limit length to 60 characters, and implement automatic 301 redirects when slugs change.&lt;/p&gt;
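&lt;p&gt;A minimal sketch of such a helper (the character whitelist and 60-character cap mirror the advice above; names are illustrative):&lt;/p&gt;

```javascript
// Hypothetical slug helper: lowercases, strips special characters,
// collapses whitespace to hyphens, and caps length at 60 characters.
function slugify(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, '')  // drop punctuation and symbols
    .trim()
    .replace(/[\s-]+/g, '-')       // collapse runs of spaces/hyphens
    .slice(0, 60)
    .replace(/-$/, '');            // no trailing hyphen after truncation
}

console.log(slugify('10 Best Ways to Optimize Your Website!!!'));
// logs: 10-best-ways-to-optimize-your-website
```

&lt;p&gt;Treat the result as a suggestion for an editor to approve, and keep the old slug in a redirect table whenever it changes.&lt;/p&gt;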

&lt;h2&gt;
  
  
  4. Missing or duplicate meta tags
&lt;/h2&gt;

&lt;p&gt;Ask an LLM to build you a blog and you'll often get pages with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No meta description at all&lt;/li&gt;
&lt;li&gt;The same title tag on every page&lt;/li&gt;
&lt;li&gt;Titles that exceed 60 characters and get truncated&lt;/li&gt;
&lt;li&gt;Meta descriptions that are either missing or auto-truncated content
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- What you get --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;title&amp;gt;&lt;/span&gt;My Blog&lt;span class="nt"&gt;&amp;lt;/title&amp;gt;&lt;/span&gt;

&lt;span class="c"&gt;&amp;lt;!-- What you need --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;title&amp;gt;&lt;/span&gt;How to fix Core Web Vitals issues in 2024 | Your Brand&lt;span class="nt"&gt;&amp;lt;/title&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"description"&lt;/span&gt; &lt;span class="na"&gt;content=&lt;/span&gt;&lt;span class="s"&gt;"Learn the 7 most effective techniques to improve LCP, CLS, and INP scores. Includes code examples and before/after case studies."&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Build meta tag management into your content model from day one. Every page needs a unique, optimized title (50-60 chars) and description (150-160 chars).&lt;/p&gt;
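&lt;p&gt;A build-time check along these lines catches the most common failures (missing or over-long tags); the function and thresholds are a sketch based on the ranges above:&lt;/p&gt;

```javascript
// Hypothetical lint step: flag pages whose title or description
// is missing or exceeds the commonly cited length limits.
function checkMeta({ title, description }) {
  const issues = [];
  if (title.length === 0) issues.push('missing title');
  if (title.length > 60) issues.push('title over 60 chars');
  if (description.length === 0) issues.push('missing description');
  if (description.length > 160) issues.push('description over 160 chars');
  return issues;
}

console.log(checkMeta({ title: 'My Blog', description: '' }));
// logs: [ 'missing description' ]
```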

&lt;h2&gt;
  
  
  5. No canonical URLs
&lt;/h2&gt;

&lt;p&gt;Duplicate content is the silent killer of SEO. LLMs rarely implement canonical tags, which means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;yoursite.com/blog/post&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;yoursite.com/blog/post/&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;yoursite.com/blog/post?utm_source=twitter&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;www.yoursite.com/blog/post&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All compete against each other, diluting your ranking signals.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;"canonical"&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"https://yoursite.com/blog/post"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Implement canonical tags on every page. Pick one URL format (with or without trailing slash) and stick to it. Configure your server to redirect all variations to the canonical version.&lt;/p&gt;
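&lt;p&gt;A sketch of a normalizer that enforces one canonical form per URL (the tracking-parameter list is illustrative; this example picks the no-trailing-slash convention):&lt;/p&gt;

```javascript
// Hypothetical normalizer: https, no trailing slash, no fragment,
// tracking parameters stripped. Use its output in the canonical tag
// and as the target of your server-side 301 redirects.
function canonicalUrl(input) {
  const url = new URL(input);
  url.protocol = 'https:';
  url.hash = '';
  // strip common tracking parameters (illustrative list)
  for (const key of ['utm_source', 'utm_medium', 'utm_campaign']) {
    url.searchParams.delete(key);
  }
  if (url.pathname.length > 1) {
    url.pathname = url.pathname.replace(/\/+$/, '');
  }
  return url.toString();
}

console.log(canonicalUrl('http://yoursite.com/blog/post/?utm_source=twitter'));
// logs: https://yoursite.com/blog/post
```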

&lt;h2&gt;
  
  
  6. Completely ignoring internal linking
&lt;/h2&gt;

&lt;p&gt;LLMs generate isolated pages. They don't understand your content architecture or how pages should relate to each other. You end up with blog posts that link nowhere, category pages that don't link to their children, and pillar content that doesn't establish topical authority.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Design your internal linking architecture deliberately. Every piece of content should link to 3-5 related pieces. Important pages should receive more internal links. Use descriptive anchor text, not "click here."&lt;/p&gt;
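&lt;p&gt;One simple way to make the 3-5 related links automatic is tag overlap; a sketch (field names are illustrative):&lt;/p&gt;

```javascript
// Hypothetical helper: rank other posts by how many tags they share
// with the current one, so every article links to its closest siblings.
function relatedPosts(current, allPosts, max = 5) {
  return allPosts
    .filter(p => p.slug !== current.slug)
    .map(p => ({
      post: p,
      shared: p.tags.filter(t => current.tags.includes(t)).length,
    }))
    .filter(entry => entry.shared > 0)
    .sort((a, b) => b.shared - a.shared)
    .slice(0, max)
    .map(entry => entry.post);
}

const posts = [
  { slug: 'a', tags: ['seo', 'react'] },
  { slug: 'b', tags: ['seo'] },
  { slug: 'c', tags: ['cooking'] },
];
console.log(relatedPosts(posts[0], posts).map(p => p.slug));
// logs: [ 'b' ]
```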

&lt;h2&gt;
  
  
  7. Invalid or incorrect schema markup
&lt;/h2&gt;

&lt;p&gt;When LLMs attempt structured data, they often produce schema that is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Syntactically invalid JSON-LD&lt;/li&gt;
&lt;li&gt;Using deprecated schema types&lt;/li&gt;
&lt;li&gt;Missing required properties&lt;/li&gt;
&lt;li&gt;Semantically incorrect (marking a blog post as a Product)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// LLM-generated mess&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Article&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;John Doe&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;// Wrong: should be a Person object&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;datePublished&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;January 5, 2024&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;// Wrong: needs ISO 8601 format&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Validate all schema markup with Google's Rich Results Test. Use the correct types: &lt;code&gt;BlogPosting&lt;/code&gt; for blog posts, &lt;code&gt;Article&lt;/code&gt; for news, &lt;code&gt;Product&lt;/code&gt; for products. Include all required and recommended properties.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. Hallucinated facts and statistics
&lt;/h2&gt;

&lt;p&gt;This is an LLM-specific problem that creates both credibility and potential legal issues. LLMs confidently generate statistics, quotes, and "studies" that don't exist:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"According to a 2023 Stanford study, 73% of websites with proper schema markup see a 45% increase in click-through rates."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That study doesn't exist. That statistic was invented. And when your content is full of hallucinated facts, it destroys E-E-A-T signals and can get you penalized for misinformation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Fact-check every statistic, quote, and claim in AI-generated content. Link to primary sources. Remove anything you can't verify.&lt;/p&gt;

&lt;h2&gt;
  
  
  9. No robots.txt or sitemap.xml
&lt;/h2&gt;

&lt;p&gt;LLMs build features, not infrastructure. They won't remind you that search engines need a roadmap to your site.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// robots.txt you need
User-agent: *
Disallow: /admin/
Disallow: /api/
Sitemap: https://yoursite.com/sitemap.xml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without a sitemap, Google has to discover pages through crawling alone, which may never find your deeper pages. Without robots.txt, you might be letting bots crawl your API endpoints and admin panels.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Generate a dynamic sitemap.xml that updates when content changes. Include lastmod dates. Create a robots.txt that guides crawlers to what matters.&lt;/p&gt;
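&lt;p&gt;The sitemap itself is easiest to keep fresh if entries are built as data from your content store and serialized at request time; a sketch (the serializer is omitted, and field names mirror the sitemap protocol's &lt;code&gt;loc&lt;/code&gt; and &lt;code&gt;lastmod&lt;/code&gt;):&lt;/p&gt;

```javascript
// Sketch: build sitemap entries from the content store; a serializer
// (omitted here) renders them to XML on each request, so lastmod
// always reflects the latest edit.
function sitemapEntries(posts, baseUrl) {
  return posts.map(post => ({
    loc: baseUrl + '/blog/' + post.slug,
    lastmod: post.updatedAt.slice(0, 10), // ISO date portion, e.g. 2024-01-05
  }));
}

console.log(sitemapEntries(
  [{ slug: 'my-post', updatedAt: '2024-01-05T10:00:00Z' }],
  'https://yoursite.com'
));
// logs: [ { loc: 'https://yoursite.com/blog/my-post', lastmod: '2024-01-05' } ]
```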

&lt;h2&gt;
  
  
  10. Images without alt text or optimization
&lt;/h2&gt;

&lt;p&gt;AI-generated code typically handles images like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;img&lt;/span&gt; &lt;span class="na"&gt;src&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;image&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No alt text. No width/height (causing CLS). No lazy loading. Massive unoptimized files. No next-gen formats.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Every image needs descriptive alt text for accessibility and image search. Specify dimensions to prevent layout shift. Use WebP/AVIF formats. Implement lazy loading for below-the-fold images.&lt;/p&gt;

&lt;h2&gt;
  
  
  11. Broken heading hierarchy
&lt;/h2&gt;

&lt;p&gt;LLMs choose heading levels based on visual size, not document structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;h3&amp;gt;&lt;/span&gt;Main Page Title&lt;span class="nt"&gt;&amp;lt;/h3&amp;gt;&lt;/span&gt;  &lt;span class="c"&gt;&amp;lt;!-- Should be h1 --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;h1&amp;gt;&lt;/span&gt;A Section Header&lt;span class="nt"&gt;&amp;lt;/h1&amp;gt;&lt;/span&gt; &lt;span class="c"&gt;&amp;lt;!-- Should be h2 --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;h4&amp;gt;&lt;/span&gt;Subsection&lt;span class="nt"&gt;&amp;lt;/h4&amp;gt;&lt;/span&gt;       &lt;span class="c"&gt;&amp;lt;!-- Should be h3 --&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or worse, multiple H1 tags on a single page because the developer wanted multiple "big text" elements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Every page gets exactly one H1. Headings follow logical order: H1 → H2 → H3. Never skip levels. Use CSS for styling, not heading tags.&lt;/p&gt;

&lt;h2&gt;
  
  
  12. Ignoring Core Web Vitals
&lt;/h2&gt;

&lt;p&gt;Vibecoders ship features. Core Web Vitals are an afterthought, if they're thought of at all. Common issues:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LCP (Largest Contentful Paint):&lt;/strong&gt; Hero images that take 8 seconds to load because nobody optimized them&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CLS (Cumulative Layout Shift):&lt;/strong&gt; Ads, images, and fonts that shift content as they load&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;INP (Interaction to Next Paint):&lt;/strong&gt; JavaScript bundles so large that clicks take 500ms to register&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Test with PageSpeed Insights before shipping. Lazy load below-the-fold content. Optimize your critical rendering path. Reserve space for dynamic content.&lt;/p&gt;
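&lt;p&gt;To build intuition for the CLS number, here is a deliberately simplified sketch of how layout-shift entries combine into a score. Real CLS groups shifts into session windows and reports the worst window; this version just sums shifts that were not caused by recent user input:&lt;/p&gt;

```javascript
// Simplified CLS-style score: sum layout-shift values, excluding
// shifts that follow recent user input. (The real metric uses
// session windows and takes the worst one.)
function clsScore(entries) {
  return entries
    .filter(e => !e.hadRecentInput)
    .reduce((sum, e) => sum + e.value, 0);
}

console.log(clsScore([
  { value: 0.25, hadRecentInput: false },
  { value: 0.5, hadRecentInput: true },  // user-initiated, excluded
  { value: 0.25, hadRecentInput: false },
]));
// logs: 0.5
```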

&lt;h2&gt;
  
  
  13. JavaScript-dependent critical content
&lt;/h2&gt;

&lt;p&gt;Beyond the CSR problem, LLMs often put critical content behind JavaScript interactions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Your important content is hidden until user clicks&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Accordion&lt;/span&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Product Features"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;All your keyword-rich content here&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Accordion&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Content inside collapsed accordions, tabs, or "read more" sections may be deprioritized or ignored by search engines.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Important content should be visible in the initial HTML. If you must use interactive elements, ensure the content is in the DOM on page load, just visually hidden.&lt;/p&gt;

&lt;h2&gt;
  
  
  14. No mobile optimization
&lt;/h2&gt;

&lt;p&gt;LLMs are trained on desktop-centric code. Mobile is an afterthought:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fixed widths instead of responsive layouts&lt;/li&gt;
&lt;li&gt;Tiny tap targets&lt;/li&gt;
&lt;li&gt;Horizontal scrolling on mobile&lt;/li&gt;
&lt;li&gt;Text too small to read without zooming&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Google uses mobile-first indexing. If your mobile experience is broken, your rankings suffer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Design mobile-first. Test on real devices. Use responsive images. Ensure tap targets are at least 48x48px.&lt;/p&gt;

&lt;h2&gt;
  
  
  15. Missing or wrong hreflang tags
&lt;/h2&gt;

&lt;p&gt;When LLMs build multilingual sites, they either ignore hreflang entirely or implement it incorrectly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- Common mistakes --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;"alternate"&lt;/span&gt; &lt;span class="na"&gt;hreflang=&lt;/span&gt;&lt;span class="s"&gt;"english"&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"..."&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;  &lt;span class="c"&gt;&amp;lt;!-- Wrong: should be "en" --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;"alternate"&lt;/span&gt; &lt;span class="na"&gt;hreflang=&lt;/span&gt;&lt;span class="s"&gt;"en"&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"..."&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;       &lt;span class="c"&gt;&amp;lt;!-- Missing: x-default --&amp;gt;&lt;/span&gt;
&lt;span class="c"&gt;&amp;lt;!-- Also missing: the return links on the other language versions --&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Use proper language codes (en, en-US, de-DE). Always include x-default. Ensure every page in the hreflang set references all other pages, including itself.&lt;/p&gt;
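&lt;p&gt;To make those rules concrete, here is a small sketch (not from any library; the entry shape is my own) that sanity-checks a single page's hreflang set for invalid language codes and a missing x-default. Reciprocity across the whole set still has to be verified per page.&lt;/p&gt;

```javascript
// Sketch: sanity-check one page's hreflang entries.
// Entry shape is illustrative: { code: 'en-US', href: 'https://example.com/en-us/' }
const VALID_CODE = /^([a-z]{2}(-[A-Z]{2})?|x-default)$/;

function validateHreflangSet(entries) {
  const problems = [];
  for (const e of entries) {
    // 'english' fails here; 'en', 'en-US', 'de-DE' and 'x-default' pass
    if (!VALID_CODE.test(e.code)) problems.push('invalid language code: ' + e.code);
  }
  if (!entries.some(e => e.code === 'x-default')) {
    problems.push('missing x-default entry');
  }
  return problems;
}
```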

&lt;h2&gt;
  
  
  16. Pagination done wrong
&lt;/h2&gt;

&lt;p&gt;LLMs generate infinite scroll because it's trendy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Infinite scroll that search engines can't follow&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;InfiniteScroll&lt;/span&gt; &lt;span class="na"&gt;loadMore&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;fetchNextPage&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;PostCard&lt;/span&gt; &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;InfiniteScroll&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or they create paginated content without proper linking:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- What's missing --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;"next"&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"/blog?page=2"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;"prev"&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"/blog"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Provide crawlable pagination with static links. Consider a "load more" button that appends to existing content rather than replacing it. Ensure all pages are accessible via links, not just JavaScript.&lt;/p&gt;
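&lt;p&gt;As a minimal sketch of the fix (URL scheme assumed, not prescribed): generate plain page URLs server-side so every page is reachable through static links, and let any infinite scroll enhance rather than replace them.&lt;/p&gt;

```javascript
// Sketch: crawlable pagination URLs for a listing with a known item count.
function pageUrls(basePath, totalItems, perPage) {
  const pages = Math.max(1, Math.ceil(totalItems / perPage));
  const urls = [];
  for (let i = 1; i !== pages + 1; i++) {
    // First page lives at the base path itself; others get ?page=N
    urls.push(i === 1 ? basePath : basePath + '?page=' + i);
  }
  return urls;
}
```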

&lt;h2&gt;
  
  
  17. Zero consideration for page speed
&lt;/h2&gt;

&lt;p&gt;The default LLM stack is bloated:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Import the entire Lodash library for one function&lt;/li&gt;
&lt;li&gt;Include three animation libraries&lt;/li&gt;
&lt;li&gt;Bundle fonts you're not using&lt;/li&gt;
&lt;li&gt;No code splitting&lt;/li&gt;
&lt;li&gt;No tree shaking&lt;/li&gt;
&lt;li&gt;Synchronous third-party scripts blocking render
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// What LLMs generate&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;_&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;lodash&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sorted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sortBy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;date&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// What you need&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;sortBy&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;lodash/sortBy&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;// Or just: items.sort((a, b) =&amp;gt; a.date - b.date);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Audit your bundle size regularly. Use dynamic imports for heavy components. Lazy load third-party scripts. Question every dependency.&lt;/p&gt;

&lt;h2&gt;
  
  
  The root cause
&lt;/h2&gt;

&lt;p&gt;These mistakes share a common origin: LLMs optimize for "does it work?" not "will it rank?"&lt;/p&gt;

&lt;p&gt;SEO isn't a feature you bolt on later. It's architectural. By the time you realize your React SPA isn't indexing properly, you're looking at a significant rewrite, not a quick fix.&lt;/p&gt;

&lt;p&gt;The vibecoders shipping MVPs without SEO fundamentals are building on sand. They'll get traffic from Product Hunt and Hacker News, wonder why organic never materializes, and blame "SEO takes time" rather than examining their technical foundation.&lt;/p&gt;

&lt;h2&gt;
  
  
  The solution
&lt;/h2&gt;

&lt;p&gt;If you're using AI to build web projects:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Specify SEO requirements upfront.&lt;/strong&gt; Tell the LLM you need SSR, semantic URLs, and proper meta tags before it generates code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use SEO-first frameworks.&lt;/strong&gt; Next.js, Nuxt, Astro, and SvelteKit have good defaults. Vanilla React SPAs don't.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit before launch.&lt;/strong&gt; Run Lighthouse, check your rendered HTML, validate your schema, test your mobile experience.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor continuously.&lt;/strong&gt; Set up Google Search Console. Track your Core Web Vitals. Watch for indexing issues.&lt;/li&gt;
&lt;/ol&gt;
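&lt;p&gt;The audit step can be as small as a script in CI. The sketch below is illustrative (the field names are mine, not from any tool); feed it whatever your crawler or build step extracts from the rendered page.&lt;/p&gt;

```javascript
// Sketch: fail the build when basic SEO elements are missing from a rendered page.
function auditPage(page) {
  const problems = [];
  if (!page.title) problems.push('missing title');
  if (!page.metaDescription) problems.push('missing meta description');
  if (!page.h1) problems.push('missing h1');
  if (!page.canonical) problems.push('missing canonical');
  return problems;
}
```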

&lt;p&gt;The bar for "working websites" is low. The bar for "websites that rank" or "websites that show up in LLMs" is much higher. Know the difference.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>WordPress plugin: Track ChatGPT Hits</title>
      <dc:creator>Jan-Willem Bobbink</dc:creator>
      <pubDate>Mon, 11 Dec 2023 17:44:13 +0000</pubDate>
      <link>https://dev.to/jbobbink/wordpress-plugin-track-chatgpt-hits-1h4</link>
      <guid>https://dev.to/jbobbink/wordpress-plugin-track-chatgpt-hits-1h4</guid>
      <description>&lt;p&gt;Due to high demand I decided to make a user friendly version of tracking known OpenAI bot hits. This WordPress plugin tracks URL requests by the ChatGPT / OpenAI bots and direct user actions by tracking request made by specific user agents.&lt;/p&gt;

&lt;p&gt;You can download the plugin from &lt;a href="https://www.notprovided.eu/track-chatgpt/" rel="noopener noreferrer"&gt;my blog&lt;/a&gt;. Simply upload the track-chatgpt folder to your plugin directory via sFTP, or use the WordPress plugin interface to upload the ZIP.&lt;/p&gt;

&lt;p&gt;There are currently two known user agents and a small set of IP addresses that can be used to check whether requests genuinely come from OpenAI. The plugin shows whether each request came from a verified source (a valid request) or not.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp3ro47kt2jev1y54vw6h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp3ro47kt2jev1y54vw6h.png" alt="WordPress plugin to track ChatGPT behaviour" width="800" height="418"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The plugin shows a graph of hits over the past 28 days and includes a download function to export the full dataset at once.&lt;/p&gt;

&lt;p&gt;A REST API is also enabled, so you can connect via &lt;em&gt;/wp-json/chatgpt-tracker/v1/download-data/&lt;/em&gt; and set up automated exports to an external database to feed the data into your monitoring dashboards. It's a simple dump of the full dataset; I may extend this feature in the future if there is enough interest.&lt;/p&gt;
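&lt;p&gt;A small export script could start like this; only the endpoint path comes from the plugin, the rest is illustrative.&lt;/p&gt;

```javascript
// Sketch: build the plugin's REST endpoint URL for a given site.
function trackerEndpoint(siteUrl) {
  return siteUrl.replace(/\/$/, '') + '/wp-json/chatgpt-tracker/v1/download-data/';
}

// Usage, in an environment where fetch is available:
// const data = await fetch(trackerEndpoint('https://example.com')).then(r => r.json());
```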

&lt;p&gt;Any feature requests? PM me on &lt;a href="https://www.linkedin.com/in/jbobbink/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or &lt;a href="https://twitter.com/jbobbink" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt; or leave a comment at the bottom.&lt;/p&gt;

&lt;p&gt;For updates (for example, if user agents or IPs change), follow me on LinkedIn or Twitter. I'm trying to get the plugin into the official WordPress repository as soon as possible, which will enable auto-updates too.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;How can I test if the tracker works?&lt;/strong&gt;&lt;br&gt;
The easiest way to test the plugin is to go to the GPT-4 interface in ChatGPT and ask it to summarize one of the latest URLs on your website. Make sure it actually requests the URL: Bing may already have crawled and stored the contents, in which case that copy will be used instead of visiting the live URL. Confirm that it actually shows it is browsing:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff3x54aln34po8v90zyz1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff3x54aln34po8v90zyz1.png" alt="ChatGPT browsing functionality" width="800" height="175"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can also set your own browser's user agent to a string containing GPTBot or ChatGPT. You will notice those hits are documented as invalid, since the IP address will not match OpenAI's.&lt;/p&gt;
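&lt;p&gt;Conceptually, the verification boils down to something like the sketch below. The IP prefixes shown are documentation placeholders, not OpenAI's real ranges; always check against the currently published list.&lt;/p&gt;

```javascript
// Sketch: a hit is "verified" only if both the user agent and the IP match.
const BOT_UA = /(GPTBot|ChatGPT-User)/;
const KNOWN_PREFIXES = ['192.0.2.', '198.51.100.']; // placeholders, not real ranges

function isVerifiedOpenAIHit(userAgent, ip) {
  if (!BOT_UA.test(userAgent)) return false;
  return KNOWN_PREFIXES.some(p => ip.startsWith(p));
}
```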

&lt;p&gt;&lt;strong&gt;Does it have any privacy related impact?&lt;/strong&gt;&lt;br&gt;
No, it doesn't impact anything privacy-related, since the plugin only tracks and documents user agents and IP addresses from validated sources.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is the difference between crawling and browsing behaviour?&lt;/strong&gt;&lt;br&gt;
More information about the behaviour of the different bots can be found in the documentation for &lt;a href="https://platform.openai.com/docs/plugins/bot" rel="noopener noreferrer"&gt;ChatGPT-User&lt;/a&gt; (browsing) and &lt;a href="https://platform.openai.com/docs/gptbot" rel="noopener noreferrer"&gt;GPTBot&lt;/a&gt; (crawling).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I use this data in external reporting?&lt;/strong&gt;&lt;br&gt;
Yes, you definitely can: use the REST API. The plugin has a dedicated endpoint enabled for exactly this.&lt;/p&gt;

&lt;h2&gt;
  
  
  Known issues / Feature requests
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Feature: referral traffic: do people click through to your website from the URLs mentioned on chat.openai.com?&lt;/li&gt;
&lt;li&gt;Issue: when your site uses a CDN like Cloudflare, the plugin reports the CDN's IP addresses&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Changelog
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;= 1.0 =&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Added a download functionality&lt;/li&gt;
&lt;li&gt;Added a simple graph plotting the last 28 days of hits.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;= 0.5 =&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Basic functionality&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>wordpress</category>
      <category>seo</category>
      <category>chatgpt</category>
      <category>bing</category>
    </item>
    <item>
      <title>What I learned about SEO from using the 10 most used JS frameworks</title>
      <dc:creator>Jan-Willem Bobbink</dc:creator>
      <pubDate>Thu, 06 Feb 2020 15:32:20 +0000</pubDate>
      <link>https://dev.to/jbobbink/what-i-learned-about-seo-from-using-the-10-most-used-js-frameworks-4alk</link>
      <guid>https://dev.to/jbobbink/what-i-learned-about-seo-from-using-the-10-most-used-js-frameworks-4alk</guid>
      <description>&lt;p&gt;JavaScript will define and impact the future of most SEO consultants. A big chunk of websites has, is or will move over to a JS framework driven platform. Stack Overflow published an extensive study about the data gathered from an enquiry amongst more than 100.000 professional programmers’ most used Programming, Scripting and Markup languages: read more at &lt;a href="https://insights.stackoverflow.com/survey/2018#most-popular-technologies" rel="noopener noreferrer"&gt;Most Popular Technologies&lt;/a&gt; The outcome is quite clear, it’s all about JavaScript today:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic3i1fv0x233kadhzr5b.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic3i1fv0x233kadhzr5b.jpg" alt="Programming, Scripting and Markup languages" width="800" height="507"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But JavaScript and search engines are a tricky combination. It turns out there is a fine line between successful and disastrous implementations. Below I share 10 tips to prevent SEO disasters on your own or your clients' sites.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Always go for Server Side Rendering (SSR)
&lt;/h2&gt;

&lt;p&gt;As Google shared earlier this year at Google I/O, the pipeline for crawling, indexing and rendering is somewhat different from the original pipeline. Check out &lt;a href="https://web.dev/javascript-and-google-search-io-2019" rel="noopener noreferrer"&gt;https://web.dev/javascript-and-google-search-io-2019&lt;/a&gt; for more context, but the diagram below is clear enough to start with: there is a separate track, also known as the second wave, where the rendering of JavaScript takes place. To make sure Google has URLs to process and return to the crawl queue, the initial HTML response needs to include all HTML elements relevant for SEO: at minimum, the basic page elements that show up in SERPs, plus links. It's always about links, right? 🙂&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fan9n1ahlwswoe1l3oguh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fan9n1ahlwswoe1l3oguh.png" alt="JavaScript and Google" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Google showed numerous setups in their article about rendering on the web, but forgot to include the SEO perspective. That made me publish an alternative table: read more at &lt;a href="https://www.notprovided.eu/rendering-on-the-web-the-seo-version/" rel="noopener noreferrer"&gt;https://www.notprovided.eu/rendering-on-the-web-the-seo-version/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw8yhjnm6aafj9ky1bpdh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw8yhjnm6aafj9ky1bpdh.png" alt="Real version: JavaScript and Google" width="660" height="478"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Server Side Rendering (SSR) is simply the safest way to go. There are cons, but for SEO you don't want to risk Google seeing anything other than a fully optimized page in the initial crawl. And don't forget that even the most advanced search engine, Google, can't handle client-side JavaScript flawlessly; how about all the other search engines like Baidu, Naver and Bing?&lt;/p&gt;

&lt;p&gt;Since Google openly admits there are some challenges ahead, they have been sharing dynamic rendering setups: pick the most suitable rendering scenario for a specific group of users (users on low-CPU mobile phones, for example) or for bots. An example setup: serve the client-side rendered version to most users (but not to old browsers, non-JS users, slow mobile phones, etcetera) and send search engine bots and social media crawlers the fully static, rendered HTML version.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F63tgd04f7nhfszfysq6f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F63tgd04f7nhfszfysq6f.png" alt="Dynamic Renderer" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Whatever Google tells us, read &lt;a href="https://ja.dev/entry/blog/nagayama/render-budget-en" rel="noopener noreferrer"&gt;Render Budget, or: How I Stopped Worrying and Learned to Render Server-Side&lt;/a&gt; by a former Google engineer.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Tools for checking what search engines do and don’t see
&lt;/h2&gt;

&lt;p&gt;Since most dynamic rendering setups match on user agents, changing the user agent directly in Chrome is the first thing I always do. Is this 100% proof? No; some setups also match on IPs. But I would target the SSR as broadly as possible; also think about social media crawlers wanting to capture OpenGraph tags, for example. Matching only a narrow combination of IPs and user agents will not cover enough. Better to cover too many requests, and spend some more money on servers pushing out rendered HTML, than to miss out on what specific platforms could do with your pages.&lt;/p&gt;

&lt;p&gt;The next thing to check is whether users, bots and other requests get back the same content elements and directives. I've seen examples where Googlebot got different titles, H1 headings and content blocks than what users got to see. A nice Chrome plugin is &lt;a href="https://chrome.google.com/webstore/detail/view-rendered-source/ejgngohbdedoabanmclafpkoogegdpob/" rel="noopener noreferrer"&gt;View Rendered Source&lt;/a&gt;, which compares the fetched and rendered versions directly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fepv74w809d0y9il1s26g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fepv74w809d0y9il1s26g.png" alt="View Rendered Source" width="800" height="390"&gt;&lt;/a&gt;&lt;/p&gt;
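&lt;p&gt;If you script this check yourself, the comparison itself is trivial: extract the same elements from both responses and diff them (field names here are illustrative).&lt;/p&gt;

```javascript
// Sketch: list the SEO elements that differ between the user-facing and
// bot-facing versions of a page.
function diffSeoElements(userVersion, botVersion) {
  const keys = new Set([...Object.keys(userVersion), ...Object.keys(botVersion)]);
  return [...keys].filter(k => userVersion[k] !== botVersion[k]);
}
```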

&lt;p&gt;If you have access to a domain in Google Search Console, of course use the inspection tool. It now also uses an evergreen Googlebot version (like all other Google Search tools) so it represents what Google will actually see during crawling. Check the HTML and screenshots to be sure every important element is covered and is filled with the correct information.&lt;/p&gt;

&lt;p&gt;Want to check URLs you don't own? Use the Rich Results Test at &lt;a href="https://search.google.com/test/rich-results" rel="noopener noreferrer"&gt;https://search.google.com/test/rich-results&lt;/a&gt;, which also shows the rendered HTML. You can check the mobile and desktop versions separately to double-check that there are no differences.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. The minimal requirement for initial HTML response
&lt;/h2&gt;

&lt;p&gt;It is a simple list of search engine optimization basics, but important for SEO results:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Title and meta tags&lt;/li&gt;
&lt;li&gt;Indexing and crawling directives, canonical references and hreflang annotations&lt;/li&gt;
&lt;li&gt;All textual content, including a semantically structured set of Hx headings&lt;/li&gt;
&lt;li&gt;Structured data markup&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Lazy loading: surely a best practice in modern performance optimization, but it turns out that for things like mobile SERP thumbnails and the Google Discover feed, Googlebot likes to have a noscript version. Make sure Google can find a clean &lt;code&gt;&amp;lt;img&amp;gt;&lt;/code&gt; tag without the need for any JavaScript.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Data persistence risks
&lt;/h2&gt;

&lt;p&gt;Googlebot crawls with a headless browser and passes nothing on to the next, successive URL request. So don't use cookies, local storage or session data to fill in any important SEO elements. I've seen examples where products were personalized within category pages and product links were only loaded based on a specific cookie. Don't do that, or accept a ranking loss.&lt;/p&gt;
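&lt;p&gt;If you do personalize, reorder rather than gate. A sketch (names are mine) of the safe pattern:&lt;/p&gt;

```javascript
// Sketch: personalization may reorder links, but must never drop any.
// A cookieless crawler simply gets the default order with the complete set.
function orderedLinks(allLinks, preferredFirst) {
  const preferred = preferredFirst.filter(l => allLinks.includes(l));
  const rest = allLinks.filter(l => !preferred.includes(l));
  return preferred.concat(rest);
}
```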

&lt;h2&gt;
  
  
  5. Unit test SSR
&lt;/h2&gt;

&lt;p&gt;Whatever developers tell you, things can break. Things can go offline due to network failures. It could be a new release, or just some unknown bug introduced while working on completely different things. Below is an example of a site where the SSR broke (just after last year's #BrightonSEO), causing two weeks of trouble internally.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd20badmnon5go18qxrqe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd20badmnon5go18qxrqe.png" alt="Visibility in Google is down!" width="800" height="280"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Make sure you setup unit testing for server side rendering. Testing setups for the most used JavaScript frameworks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Angular &amp;amp; React testing: &lt;a href="https://jestjs.io/" rel="noopener noreferrer"&gt;https://jestjs.io/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Vue testing &lt;a href="https://github.com/vuejs/vue-test-utils" rel="noopener noreferrer"&gt;https://github.com/vuejs/vue-test-utils&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  6. Third-party rendering: set up monitoring
&lt;/h2&gt;

&lt;p&gt;Third-party rendering services like prerender.io are not flawless either; they can break too. If Amazon's infrastructure crashes, most of the third parties you use will be offline. Use third-party (haha!) monitoring tools like ContentKing, Little Warden or PageModified, and do consider where they host their services 🙂&lt;/p&gt;

&lt;p&gt;Another tactic to make sure Google doesn't index empty pages is to serve a 503 status, load the page, signal the server once the content has loaded, and then update the status. This is quite tricky, and you need to tune it carefully so you don't ruin your rankings completely; it is more of a band-aid for unfinished setups.&lt;/p&gt;
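&lt;p&gt;In rough, Express-style terms the band-aid looks like this (the shapes are illustrative; treat it as a stopgap, not a design):&lt;/p&gt;

```javascript
// Sketch: serve 503 with Retry-After until the renderer has produced the page,
// so crawlers back off instead of indexing an empty shell.
function responseFor(path, renderedPages) {
  if (renderedPages.has(path)) {
    return { status: 200, body: renderedPages.get(path) };
  }
  return { status: 503, headers: { 'Retry-After': '120' } };
}
```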

&lt;h2&gt;
  
  
  7. Performance: reduce JS
&lt;/h2&gt;

&lt;p&gt;Even if every element relevant for SEO is available in the initial HTML response, I have had clients lose traffic because performance got worse for both users and search engine bots. First of all, think of real users' experiences. The Chrome UX Report is a great way of monitoring actual performance. And Google can freely feed that data to its monstrous algorithms, haha!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5yqeb1iba90no4yxzb6l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5yqeb1iba90no4yxzb6l.png" alt="JavaScript impacts user experience heavily" width="606" height="275"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The most effective tip is tree shaking: simply reduce the number of JavaScript bytes that need to be loaded. Cleaning up your scripts can also speed up processing, which helps a lot on older, slower CPUs; for older mobile phones in particular, this can noticeably improve the user experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. Can Google load all JS scripts?
&lt;/h2&gt;

&lt;p&gt;Make sure you monitor and analyze log files to see whether any static JS files are generating errors. &lt;a href="https://www.botify.com/" rel="noopener noreferrer"&gt;Botify is perfect&lt;/a&gt; for this, with a separate section monitoring static file responses. The brown 404 trend clearly shows an issue with files not being accessible at the moment Google requested them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foxzea4uto1wdpjs3o19e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foxzea4uto1wdpjs3o19e.png" alt="Use Botify to monitor server logs for static files" width="606" height="343"&gt;&lt;/a&gt;&lt;/p&gt;
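&lt;p&gt;You don't need a paid tool to get a first signal, though: a few lines over your raw access logs will surface failing static assets (the log format below is an assumption; adapt the pattern to your own logs).&lt;/p&gt;

```javascript
// Sketch: count 404/5xx responses for .js files in access log lines.
function countStaticErrors(lines) {
  const counts = {};
  for (const line of lines) {
    const m = line.match(/"GET (\S+\.js) HTTP[^"]*" (404|5\d\d)/);
    if (m) counts[m[1]] = (counts[m[1]] || 0) + 1;
  }
  return counts;
}
```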

&lt;h2&gt;
  
  
  9. Prevent analytics pageviews triggered during pre-rendering
&lt;/h2&gt;

&lt;p&gt;Make sure you don't send pageviews into your analytics while pre-rendering. The easy way is to block all requests to the tracking pixel domain; as simple as it gets. Noticed an uplift in traffic? Check your SSR before reporting massive traffic gains.&lt;/p&gt;
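&lt;p&gt;In a request-interception setup (Puppeteer-style prerenderers support this), the blocking decision is a simple hostname check. The domain list below is an example, not exhaustive:&lt;/p&gt;

```javascript
// Sketch: decide whether a request fired during prerendering should be dropped
// so it never reaches the analytics endpoint.
const TRACKING_HOSTS = ['www.google-analytics.com', 'stats.g.doubleclick.net'];

function shouldBlockDuringPrerender(requestUrl) {
  const host = new URL(requestUrl).hostname;
  return TRACKING_HOSTS.some(t => host === t || host.endsWith('.' + t));
}
```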

&lt;h2&gt;
  
  
  10. Some broader SSR risks
&lt;/h2&gt;

&lt;p&gt;Cloaking: search engines still don't like it, so make sure you don't accidentally cloak. In the case of server-side rendering, that would mean showing users different content than search engines.&lt;/p&gt;

&lt;p&gt;Caching rendered pages can be cost-effective, but think about the effect on the data points sent to Google: you don't want structured data, like product markup, to be outdated.&lt;/p&gt;

&lt;p&gt;Check the differences between the mobile and desktop Googlebots; a tool like SEO Radar can help you quickly identify differences between the two user agents.&lt;/p&gt;

&lt;p&gt;&lt;iframe src="https://www.slideshare.net/slideshow/embed_code/key/5fxpn2fRz8344t" alt="5fxpn2fRz8344t on slideshare.net" width="100%" height="487"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h3&gt;
  
  
  Questions? Just let me know!
&lt;/h3&gt;

</description>
      <category>seo</category>
      <category>javascript</category>
      <category>react</category>
    </item>
  </channel>
</rss>
