<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jan Michalík</title>
    <description>The latest articles on DEV Community by Jan Michalík (@pagecoder).</description>
    <link>https://dev.to/pagecoder</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3918760%2F4e9c2673-f502-4a09-bcbd-0a5b94d3d58e.jpg</url>
      <title>DEV Community: Jan Michalík</title>
      <link>https://dev.to/pagecoder</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pagecoder"/>
    <language>en</language>
    <item>
      <title>WordPress AI chat plugins make 6–11 outbound requests per visitor question. Architecture writeup of an alternative.</title>
      <dc:creator>Jan Michalík</dc:creator>
      <pubDate>Thu, 07 May 2026 21:35:54 +0000</pubDate>
      <link>https://dev.to/pagecoder/wordpress-ai-chat-plugins-make-6-11-outbound-requests-per-visitor-question-architecture-writeup-of-3a6a</link>
      <guid>https://dev.to/pagecoder/wordpress-ai-chat-plugins-make-6-11-outbound-requests-per-visitor-question-architecture-writeup-of-3a6a</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pagecoder.ai/blog/why-your-ai-chatbot-is-a-tracker/" rel="noopener noreferrer"&gt;pagecoder.ai/blog/why-your-ai-chatbot-is-a-tracker&lt;/a&gt;. Cross-posted here for the dev.to community.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;Last week we sat in a café in Vienna with a friend, a freelance dev who's been shipping WordPress sites for a decade. He'd just installed an AI chat plugin for a client, a small cosmetics brand. It looked nice. Brand colors, custom name.&lt;/p&gt;

&lt;p&gt;Then he opened the network tab.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Three. Four. Five." He scrolled.
"Eight. Wait — eleven? Where is this one going?"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Eleven outbound requests. Per visitor question. Before the bot's reply finishes rendering.&lt;/p&gt;

&lt;p&gt;If you ship WordPress sites for clients and you're considering an AI chatbot, this post is the architecture audit you probably haven't done yourself yet.&lt;/p&gt;

&lt;h2&gt;
  
  
  The architecture of the leak
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;visitor question
  ├─→ third-party AI provider (LLM call)
  │     - prompt logged for "abuse monitoring"
  │     - retention window = vendor-defined
  ├─→ chatbot vendor's own backend (the "data product")
  │     - browser fingerprint + IP + geolocation
  │     - full conversation + page URL
  │     - this IS the actual product the vendor sells
  └─→ widget's embedded CDNs / analytics endpoints (~3-9 of these)
        - CDN providers owe no privacy policy
        - implicit consent via embed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A single visitor question on a small WordPress site running a popular AI chatbot can leak to &lt;strong&gt;6–11 different companies&lt;/strong&gt; before the page renders the reply. None of them are in the site's cookie banner. None appear in any data-subject access request. None know who the brand is — all of them know who its visitors are.&lt;/p&gt;
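
&lt;p&gt;To repeat the count on your own site, one option is to export the network tab as a HAR file and tally the hosts that are not yours. The sketch below is in Python and is not part of any plugin; the function name &lt;code&gt;third_party_hosts&lt;/code&gt; and the subdomain rule are illustrative assumptions, and distinct hosts are only a rough proxy for distinct companies.&lt;/p&gt;

```python
import json
from urllib.parse import urlparse


def third_party_hosts(har_text, site_host):
    """Return the sorted hosts contacted that are not the site itself.

    har_text is the JSON text of a HAR export; site_host is the site's own
    hostname. Subdomains of site_host count as first-party, which is a
    simplification (a stricter audit would compare registrable domains).
    """
    har = json.loads(har_text)
    hosts = set()
    for entry in har["log"]["entries"]:
        host = urlparse(entry["request"]["url"]).hostname
        if host is None:
            continue  # data: URLs and similar have no host
        if host == site_host or host.endswith("." + site_host):
            continue  # first-party request
        hosts.add(host)
    return sorted(hosts)
```

&lt;p&gt;Record one chat question, export the HAR, and pass its contents in along with your own hostname. Every host in the returned list is a party you should be able to name in your privacy policy.&lt;/p&gt;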

&lt;p&gt;EU regulators are catching up. The default architecture isn't.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why we started over
&lt;/h2&gt;

&lt;p&gt;We'd been installing those plugins for clients for years. Ticked the privacy-policy boxes. Stopped noticing. Then a client asked, casually, "where exactly does that go when someone types it in?"&lt;/p&gt;

&lt;p&gt;We didn't have a clean answer.&lt;/p&gt;

&lt;p&gt;So we started over. Three architectural rules:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Vectors live on the user's server
&lt;/h3&gt;

&lt;p&gt;Standard SaaS playbook says "use Pinecone / Weaviate / our managed vector DB." We store embeddings in the customer's WordPress database (custom post type with &lt;code&gt;vector&lt;/code&gt; + &lt;code&gt;chunk_id&lt;/code&gt; + &lt;code&gt;source_post_ref&lt;/code&gt;). Lookup is a single SQL query with cosine similarity computed in PHP — yes, not as fast as a dedicated vector DB, but fast enough for the typical 10K-20K-chunk corpus a small WP site has.&lt;/p&gt;

&lt;p&gt;Tradeoff: scaling. Customers with 1M+ chunks would need a real vector DB. We bet 99% of WP customers won't hit that wall.&lt;/p&gt;
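
&lt;p&gt;Here is the lookup described above, reduced to its math. This is a Python sketch rather than the plugin's PHP, and the function names and the shape of &lt;code&gt;rows&lt;/code&gt; are assumptions; only the field names &lt;code&gt;vector&lt;/code&gt;, &lt;code&gt;chunk_id&lt;/code&gt; and &lt;code&gt;source_post_ref&lt;/code&gt; come from the description.&lt;/p&gt;

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as matching nothing
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vector, rows, k=3):
    """Rank stored rows against the query and keep the k best chunk ids.

    rows stands in for the result of the SQL query: a list of dicts with
    the keys vector, chunk_id and source_post_ref (hypothetical shape).
    """
    scored = [
        (cosine_similarity(query_vector, row["vector"]), row["chunk_id"])
        for row in rows
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk_id for _score, chunk_id in scored[:k]]
```

&lt;p&gt;A brute-force scan like this is linear in the number of chunks, which is why it suits a corpus in the tens of thousands and not one in the millions.&lt;/p&gt;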

&lt;h3&gt;
  
  
  2. The math is stateless
&lt;/h3&gt;

&lt;p&gt;The flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;embedding (request) → similarity_search() → top_k chunks → response → done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No log. No "queries collection" admin tab. The backend literally forgets the request happened.&lt;/p&gt;

&lt;p&gt;This costs us product features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Can't show a "your top 10 most-asked questions" dashboard&lt;/li&gt;
&lt;li&gt;Can't do per-user conversation history&lt;/li&gt;
&lt;li&gt;Can't optimize answers based on aggregate signals&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The only honest way to promise we won't lose your visitors' data is to never have it.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Zero third-party calls from the widget
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Fonts subset and self-hosted (no Google Fonts CDN)&lt;/li&gt;
&lt;li&gt;JS bundle has no external script tags&lt;/li&gt;
&lt;li&gt;No analytics pixel (we built our own analytics backend on the same server, with daily-rotating salts and no IP storage)&lt;/li&gt;
&lt;li&gt;No social tracking pixels for share buttons (intent URLs only)&lt;/li&gt;
&lt;/ul&gt;
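
&lt;p&gt;The post does not show how the daily-rotating salts are implemented. One common pattern, sketched here in Python with a hypothetical &lt;code&gt;DailySaltedCounter&lt;/code&gt; class, is to hash the IP and user agent with a random salt that is replaced every day, and to keep only the hash.&lt;/p&gt;

```python
import hashlib
import secrets
from datetime import date


class DailySaltedCounter:
    """Counts unique visitors per day without storing IP addresses."""

    def __init__(self):
        self._salt_day = None
        self._salt = b""
        self._seen = set()

    def _rotate_if_needed(self, today):
        # A new random salt each day; the old salt and old hashes are dropped,
        # so yesterday's identifiers cannot be linked to today's.
        if today != self._salt_day:
            self._salt_day = today
            self._salt = secrets.token_bytes(32)
            self._seen = set()

    def record(self, ip, user_agent, today=None):
        """Record a visit and return the day's unique-visitor count so far."""
        self._rotate_if_needed(today or date.today())
        raw = self._salt + ip.encode("utf-8") + b"|" + user_agent.encode("utf-8")
        digest = hashlib.sha256(raw).hexdigest()
        self._seen.add(digest)  # only the salted hash is kept, never the IP
        return len(self._seen)
```

&lt;p&gt;Because the salt is discarded at rotation, the same visitor produces an unrelated hash the next day, so the counter can report daily uniques but cannot follow anyone across days.&lt;/p&gt;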

&lt;p&gt;Open the network tab on a site running our plugin and you'll count to two: WP itself, and one stateless math endpoint at our backend.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Loop
&lt;/h2&gt;

&lt;p&gt;Standard chatbots forget the conversation. Visitor asks, bot answers, session closes, the site never learns. The chatbot vendor learns — they aggregate questions across all customers — but the site owner sees nothing.&lt;/p&gt;

&lt;p&gt;We made that learning the actual product. The plugin clusters incoming question variations, drafts a clean FAQ entry, and shows the admin two buttons: publish or discard. AI proposes; human curates. The output is a real indexed page at &lt;code&gt;/faq/your-question/&lt;/code&gt; with &lt;code&gt;FAQPage&lt;/code&gt; schema markup.&lt;/p&gt;

&lt;p&gt;Visitors who type the question into the bot get the answer instantly. Visitors who Google the question land on the same page. One piece of content, two jobs.&lt;/p&gt;
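
&lt;p&gt;For readers who have not written &lt;code&gt;FAQPage&lt;/code&gt; markup before, this is roughly what the structured data for one published entry looks like. The Python helper &lt;code&gt;faq_page_jsonld&lt;/code&gt; is a hypothetical name used for illustration, not the plugin's code; the property names follow the schema.org vocabulary.&lt;/p&gt;

```python
import json


def faq_page_jsonld(question, answer):
    """Build schema.org FAQPage markup for a single question/answer pair."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
        ],
    }
    # ensure_ascii=False keeps accented characters readable in the page source
    return json.dumps(data, indent=2, ensure_ascii=False)
```

&lt;p&gt;The returned string is what would be embedded in the page inside a script element of type &lt;code&gt;application/ld+json&lt;/code&gt;.&lt;/p&gt;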

&lt;h2&gt;
  
  
  Three questions before installing any AI chatbot
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Where do my visitors' questions go?&lt;/strong&gt; If the answer involves any company name other than yours, the answer is "to that company too".&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Where do you store my content?&lt;/strong&gt; "Our cloud" = your content is part of their dataset. "Your database" — ask to see the table.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What happens if I uninstall?&lt;/strong&gt; "Your data stays with us forever" vs "the data is gone, because it was always yours".&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We pass all three. We're one of the few that do.&lt;/p&gt;




&lt;p&gt;Full manifesto with the cosmetics-brand callback and the closing scene: &lt;a href="https://pagecoder.ai/blog/why-your-ai-chatbot-is-a-tracker/" rel="noopener noreferrer"&gt;pagecoder.ai/blog/why-your-ai-chatbot-is-a-tracker&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Plugin: &lt;a href="https://pagecoder.ai/products/rag-chat" rel="noopener noreferrer"&gt;pagecoder.ai/products/rag-chat&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(&lt;em&gt;Disclosure: I'm a co-founder. The audit pattern in this post applies regardless of which plugin you pick — even if you never use ours.&lt;/em&gt;)&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>privacy</category>
      <category>wordpress</category>
    </item>
  </channel>
</rss>
