<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jan Michalík</title>
    <description>The latest articles on DEV Community by Jan Michalík (@pagecoder).</description>
    <link>https://dev.to/pagecoder</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3918760%2F4e9c2673-f502-4a09-bcbd-0a5b94d3d58e.jpg</url>
      <title>DEV Community: Jan Michalík</title>
      <link>https://dev.to/pagecoder</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pagecoder"/>
    <language>en</language>
    <item>
      <title>Search with no AI in the answer, and why I chose plain chunks over tree-RAG</title>
      <dc:creator>Jan Michalík</dc:creator>
      <pubDate>Sun, 07 Jun 2026 20:28:39 +0000</pubDate>
      <link>https://dev.to/pagecoder/search-with-no-ai-in-the-answer-and-why-i-chose-plain-chunks-over-tree-rag-184c</link>
      <guid>https://dev.to/pagecoder/search-with-no-ai-in-the-answer-and-why-i-chose-plain-chunks-over-tree-rag-184c</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;I develop a privacy-first RAG chatbot for WordPress. This post combines two writeups from pagecoder.ai - on &lt;a href="https://pagecoder.ai/blog/search-without-ai-answers/" rel="noopener noreferrer"&gt;search&lt;/a&gt; and on &lt;a href="https://pagecoder.ai/blog/chunks-vs-tree-rag-the-boring-one-won/" rel="noopener noreferrer"&gt;chunking&lt;/a&gt; - and continues &lt;a href="https://dev.to/pagecoder/wordpress-ai-chat-plugins-make-6-11-outbound-requests-per-visitor-question-architecture-writeup-of-3a6a"&gt;my earlier post on what AI chat plugins leak per question&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;Last cycle I added two things to my WordPress RAG plugin that people kept asking for: visitor-facing search, and indexing big PDFs. Each one came down to a retrieval decision worth writing down.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Search: retrieve, but don't generate
&lt;/h2&gt;

&lt;p&gt;My chatbot answers in two stages - retrieve the most relevant content, then hand it to a model that writes a reply. Search is just the first stage, stopped before the second.&lt;/p&gt;

&lt;p&gt;When you type into the search box, I run the same retrieval the chatbot uses and then stop. No model is asked to compose an answer. You get the raw matches back: ranked results with your terms highlighted.&lt;/p&gt;

&lt;p&gt;Stopping early is the point. It buys three things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No hallucination.&lt;/strong&gt; Nothing is generated, so nothing can be invented. Every result links to a real page.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You land on the source.&lt;/strong&gt; A ranked link, not a paraphrase that may or may not match the page.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It's lighter.&lt;/strong&gt; The slow, expensive part of a chat reply is the model writing a few hundred words. Skip it and the same lookup gets cheaper and faster.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One honest caveat, because the whole product is about not overclaiming: search is not &lt;em&gt;zero&lt;/em&gt; AI. To match on meaning and not just keywords, the query is still turned into an embedding by the same backend the chat uses, and that embedding is discarded once the lookup is done. What search does not do is the thing people actually worry about: no model writes text about your content.&lt;/p&gt;
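
&lt;p&gt;The split between the two stages can be sketched in a few lines of Python. This is an illustration, not the plugin's code: &lt;code&gt;retrieve&lt;/code&gt; and &lt;code&gt;generate&lt;/code&gt; are hypothetical callables passed in from outside, and the &lt;code&gt;[[...]]&lt;/code&gt; highlight markers are an assumption made for the example.&lt;/p&gt;

```python
import re
from typing import Callable

# A retrieved piece of content: (page URL, text snippet, relevance score).
Hit = tuple[str, str, float]


def highlight(snippet: str, query: str) -> str:
    """Wrap each whole-word query term found in the snippet in [[...]]."""
    terms = [re.escape(term) for term in query.split() if term]
    if not terms:
        return snippet
    pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
    return pattern.sub(r"[[\1]]", snippet)


def search(query: str, retrieve: Callable[[str], list[Hit]]) -> list[dict]:
    """Stage one only: ranked matches with terms highlighted, no model call."""
    hits = sorted(retrieve(query), key=lambda hit: hit[2], reverse=True)
    return [
        {"url": url, "snippet": highlight(text, query), "score": score}
        for url, text, score in hits
    ]


def chat(query: str, retrieve: Callable[[str], list[Hit]],
         generate: Callable[[str, list[str]], str]) -> str:
    """Both stages: the same retrieval, then a model writes the reply."""
    hits = sorted(retrieve(query), key=lambda hit: hit[2], reverse=True)
    return generate(query, [text for _, text, _ in hits])
```

&lt;p&gt;Note that &lt;code&gt;search&lt;/code&gt; has no &lt;code&gt;generate&lt;/code&gt; parameter at all, so there is no code path on which it could produce generated text.&lt;/p&gt;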

&lt;h2&gt;
  
  
  2. Chunking: the boring method beat the clever one
&lt;/h2&gt;

&lt;p&gt;To index a big PDF you first have to cut it into pieces. There's a boring way and a clever way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Boring:&lt;/strong&gt; fixed-size chunks. Walk the document, cut it into roughly equal pieces of a few paragraphs each, with a little overlap so a sentence on a boundary isn't lost. No idea what a heading is. Just consistent slices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Clever:&lt;/strong&gt; tree-RAG (e.g. PageIndex). Build a tree from the table of contents, with sections as nodes. At query time you walk down to the most relevant branch and pull that whole section. On paper it's obviously better for long structured documents.&lt;/p&gt;
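
&lt;p&gt;For reference, the boring method fits in a dozen lines. A minimal sketch, assuming chunks are measured in words (the real unit could just as well be paragraphs or tokens); the function name and the default sizes are made up for the example.&lt;/p&gt;

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Cut text into chunks of `size` words; neighbours share `overlap` words."""
    if not (size > overlap >= 0):
        raise ValueError("need size > overlap >= 0")
    words = text.split()
    step = size - overlap  # how far the window moves between chunks
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks
```

&lt;p&gt;The overlap is what keeps a sentence that straddles a boundary intact in at least one chunk.&lt;/p&gt;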

&lt;p&gt;I wanted the clever one - it's the more impressive thing to say you built. So I tested it properly instead of guessing: a graded eval (every answer scored correct / partial / wrong), run on ordinary pages &lt;em&gt;and&lt;/em&gt; on the long, table-of-contents-heavy PDFs that are the tree's home turf.&lt;/p&gt;

&lt;p&gt;It didn't win. Directional results - I'd rather you run your own eval than trust my numbers:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What I measured&lt;/th&gt;
&lt;th&gt;Fixed-size chunks&lt;/th&gt;
&lt;th&gt;Tree-RAG&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Accuracy on ordinary pages&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Higher&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Lower&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Accuracy on long structured PDFs&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;More reliable&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Mixed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost per question&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Cheaper&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Noticeably more&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tokens fed to the model&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Lean&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Much heavier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The biggest PDF I threw at it&lt;/td&gt;
&lt;td&gt;Indexed fine&lt;/td&gt;
&lt;td&gt;Failed to build its tree at all&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Where each one actually won&lt;/td&gt;
&lt;td&gt;Most question types&lt;/td&gt;
&lt;td&gt;Broad "summarize the whole thing" questions&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Why chunks won: tokens and cost
&lt;/h3&gt;

&lt;p&gt;It comes down to how much you hand the model. When the tree pulls "the most relevant section," that section can be pages long and all of it goes into the prompt. Fixed-size chunks hand over a few tight pieces and nothing else. That shows up on the bill (you pay for what the model reads) and in quality (burying the answer in surrounding text gives the model room to wander). Lean retrieval is often &lt;em&gt;more&lt;/em&gt; accurate, not just cheaper.&lt;/p&gt;

&lt;h3&gt;
  
  
  The honest caveat
&lt;/h3&gt;

&lt;p&gt;The tree wasn't useless. For "what is this entire document about?" it was genuinely better - it can feed the model a whole structured section at once. If your use case is almost entirely whole-document summarization, it might be worth the cost. For the specific "where does it say X?" questions real visitors ask, the boring slices won. So I shipped chunks and parked the tree.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this means for PDFs (and your data)
&lt;/h2&gt;

&lt;p&gt;I index big PDFs with the method that proved itself. Text is extracted, sliced, and searched - and on the privacy side it works like the rest of the product: extraction happens in memory, the file itself is never stored on my side, and the searchable pieces live in your own database.&lt;/p&gt;

&lt;h2&gt;
  
  
  Take it as a nudge, not gospel
&lt;/h2&gt;

&lt;p&gt;The reason I tested instead of reading opinions is that this stuff is easy to assert and easy to check. If you're choosing a retrieval or chunking strategy, build a small graded eval on &lt;em&gt;your&lt;/em&gt; documents before adopting the fancy thing. It's the most useful afternoon you'll spend on RAG.&lt;/p&gt;
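
&lt;p&gt;The bookkeeping for such an eval is small. A sketch under stated assumptions: you grade each answer by hand as correct, partial or wrong, and the weights below (partial counts as half) are my choice for the example, not a standard.&lt;/p&gt;

```python
from collections import Counter

# Assumed weights: a partial answer counts as half a correct one.
WEIGHTS = {"correct": 1.0, "partial": 0.5, "wrong": 0.0}


def summarize(grades: list[str]) -> dict:
    """Count the grades for one strategy and reduce them to a 0..1 score."""
    counts = Counter(grades)
    unknown = set(counts) - set(WEIGHTS)
    if unknown:
        raise ValueError(f"unknown grades: {sorted(unknown)}")
    total = len(grades)
    weighted = sum(WEIGHTS[grade] * n for grade, n in counts.items())
    return {
        "n": total,
        "counts": {grade: counts.get(grade, 0) for grade in WEIGHTS},
        "score": weighted / total if total else 0.0,
    }


def compare(results: dict[str, list[str]]) -> list[tuple[str, float]]:
    """Rank strategies by weighted score, best first."""
    scored = [(name, summarize(grades)["score"]) for name, grades in results.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

&lt;p&gt;Feed it the same question set for every strategy you are comparing, and keep the per-grade counts next to the score so a single number doesn't hide where the failures are.&lt;/p&gt;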




&lt;p&gt;&lt;em&gt;I build &lt;a href="https://pagecoder.ai/products/rag-chat" rel="noopener noreferrer"&gt;RAG Chat&lt;/a&gt; - a privacy-first AI chatbot + search for WordPress. 7-day free trial, no card. Need something custom built? &lt;a href="https://pagecoder.ai/build" rel="noopener noreferrer"&gt;Tell me what you need&lt;/a&gt;. No tracking pixels were used in this post.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>webdev</category>
      <category>wordpress</category>
    </item>
    <item>
      <title>WordPress AI chat plugins make 6–11 outbound requests per visitor question. Architecture writeup of an alternative.</title>
      <dc:creator>Jan Michalík</dc:creator>
      <pubDate>Thu, 07 May 2026 21:35:54 +0000</pubDate>
      <link>https://dev.to/pagecoder/wordpress-ai-chat-plugins-make-6-11-outbound-requests-per-visitor-question-architecture-writeup-of-3a6a</link>
      <guid>https://dev.to/pagecoder/wordpress-ai-chat-plugins-make-6-11-outbound-requests-per-visitor-question-architecture-writeup-of-3a6a</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pagecoder.ai/blog/why-your-ai-chatbot-is-a-tracker/" rel="noopener noreferrer"&gt;pagecoder.ai/blog/why-your-ai-chatbot-is-a-tracker&lt;/a&gt;. Cross-posted here for the dev.to community.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;Last week we sat in a café in Vienna with a friend, a freelance dev who's been shipping WordPress sites for a decade. He'd just installed an AI chat plugin for a client, a small cosmetics brand. It looked nice. Brand colors, custom name.&lt;/p&gt;

&lt;p&gt;Then he opened the network tab.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Three. Four. Five." He scrolled.
"Eight. Wait — eleven? Where is this one going?"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Eleven outbound requests. Per visitor question. Before the bot's reply finishes rendering.&lt;/p&gt;

&lt;p&gt;If you ship WordPress sites for clients and you're considering an AI chatbot, this post is the architecture audit you probably haven't done yourself yet.&lt;/p&gt;

&lt;h2&gt;
  
  
  The architecture of the leak
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;visitor question
  ├─→ third-party AI provider (LLM call)
  │     - prompt logged for "abuse monitoring"
  │     - retention window = vendor-defined
  ├─→ chatbot vendor's own backend (the "data product")
  │     - browser fingerprint + IP + geolocation
  │     - full conversation + page URL
  │     - this IS the actual product the vendor sells
  └─→ widget's embedded CDNs / analytics endpoints (~3-9 of these)
        - CDN providers owe no privacy policy
        - implicit consent via embed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A single visitor question on a small WordPress site running a popular AI chatbot can leak to &lt;strong&gt;6–11 different companies&lt;/strong&gt; before the page renders the reply. None of them are in the site's cookie banner. None appear in any data-subject access request. None know who the brand is — all of them know who its visitors are.&lt;/p&gt;

&lt;p&gt;EU regulators are catching up. Default architecture isn't.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why we started over
&lt;/h2&gt;

&lt;p&gt;We'd been installing those plugins for clients for years. Ticked the privacy-policy boxes. Stopped noticing. Then a client asked, casually, "where exactly does that go when someone types it in?"&lt;/p&gt;

&lt;p&gt;We didn't have a clean answer.&lt;/p&gt;

&lt;p&gt;So we started over. Three architectural rules:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Vectors live on the user's server
&lt;/h3&gt;

&lt;p&gt;Standard SaaS playbook says "use Pinecone / Weaviate / our managed vector DB." We store embeddings in the customer's WordPress database (custom post type with &lt;code&gt;vector&lt;/code&gt; + &lt;code&gt;chunk_id&lt;/code&gt; + &lt;code&gt;source_post_ref&lt;/code&gt;). Lookup is a single SQL query with cosine similarity computed in PHP — yes, not as fast as a dedicated vector DB, but fast enough for the typical 10K-20K-chunk corpus a small WP site has.&lt;/p&gt;

&lt;p&gt;Tradeoff: scaling. Customers with 1M+ chunks would need a real vector DB. We bet 99% of WP customers won't hit that wall.&lt;/p&gt;
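
&lt;p&gt;The lookup itself is plain arithmetic. The plugin does this in PHP; the sketch below shows the same idea in Python and is not our production code. The field names come from the post, while the row layout &lt;code&gt;(chunk_id, source_post_ref, vector)&lt;/code&gt; and the function names are assumptions for illustration.&lt;/p&gt;

```python
import math
from typing import Iterable, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float],
          rows: Iterable[tuple[int, str, Sequence[float]]],
          k: int = 5) -> list[tuple[float, int, str]]:
    """Score every stored row against the query and keep the k best."""
    scored = [
        (cosine(query_vec, vector), chunk_id, source_post_ref)
        for chunk_id, source_post_ref, vector in rows
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]
```

&lt;p&gt;This is a full scan, so the cost grows linearly with the number of chunks, which is exactly the scaling tradeoff described above.&lt;/p&gt;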

&lt;h3&gt;
  
  
  2. The math is stateless
&lt;/h3&gt;

&lt;p&gt;The flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;embedding (request) → similarity_search() → top_k chunks → response → done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No log. No "queries collection" admin tab. The backend literally forgets the request happened.&lt;/p&gt;

&lt;p&gt;This costs us product features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Can't show a "your top 10 most-asked questions" dashboard&lt;/li&gt;
&lt;li&gt;Can't do per-user conversation history&lt;/li&gt;
&lt;li&gt;Can't optimize answers based on aggregate signals&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The only honest way to promise we won't lose your visitors' data is to never have it.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Zero third-party calls from the widget
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Fonts subset and self-hosted (no Google Fonts CDN)&lt;/li&gt;
&lt;li&gt;JS bundle has no external script tags&lt;/li&gt;
&lt;li&gt;No analytics pixel (we built our own analytics backend on the same server, with daily-rotating salts and no IP storage)&lt;/li&gt;
&lt;li&gt;No social tracking pixels for share buttons (intent URLs only)&lt;/li&gt;
&lt;/ul&gt;
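
&lt;p&gt;One common way to count visitors with daily-rotating salts and no stored IPs looks like the sketch below. It is an assumption about the general technique, not a description of our implementation: the class name, the in-memory random salt and the choice of HMAC-SHA-256 are all illustrative.&lt;/p&gt;

```python
import hashlib
import hmac
import secrets
from datetime import date
from typing import Optional


class DailySaltHasher:
    """Turns an IP into a per-day pseudonym; the salt lives only in memory."""

    def __init__(self) -> None:
        self._day: Optional[date] = None
        self._salt = b""

    def _salt_for(self, day: date) -> bytes:
        # A fresh random salt replaces the old one when the date changes,
        # so earlier pseudonyms can no longer be recomputed or linked.
        if day != self._day:
            self._day = day
            self._salt = secrets.token_bytes(32)
        return self._salt

    def visitor_key(self, ip: str, day: Optional[date] = None) -> str:
        """Return a hex digest that stands in for the IP for one day."""
        salt = self._salt_for(day or date.today())
        return hmac.new(salt, ip.encode("utf-8"), hashlib.sha256).hexdigest()
```

&lt;p&gt;Only the digest is ever written down. Within a day the same visitor maps to the same key, which is enough for counting; across days the keys are unrelated.&lt;/p&gt;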

&lt;p&gt;Open the network tab on a site running our plugin and you'll count to two: WP itself, and one stateless math endpoint at our backend.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Loop
&lt;/h2&gt;

&lt;p&gt;Standard chatbots forget the conversation. Visitor asks, bot answers, session closes, the site never learns. The chatbot vendor learns — they aggregate questions across all customers — but the site owner sees nothing.&lt;/p&gt;

&lt;p&gt;We made that learning loop the actual product, and pointed it at the site owner. The plugin clusters incoming question variations, drafts a clean FAQ entry, and shows the admin two buttons: publish or discard. AI proposes; human curates. The output is a real indexed page at &lt;code&gt;/faq/your-question/&lt;/code&gt; with &lt;code&gt;FAQPage&lt;/code&gt; schema markup.&lt;/p&gt;

&lt;p&gt;Visitors who type the question into the bot get the answer instantly. Visitors who Google the question land on the same page. One piece of content, two jobs.&lt;/p&gt;
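
&lt;p&gt;For readers who haven't written &lt;code&gt;FAQPage&lt;/code&gt; markup before, here is roughly what the structured data for one published entry looks like, built in Python. The helper name is hypothetical and the plugin's own output may differ; the property names follow the schema.org &lt;code&gt;FAQPage&lt;/code&gt;, &lt;code&gt;Question&lt;/code&gt; and &lt;code&gt;Answer&lt;/code&gt; types.&lt;/p&gt;

```python
import json


def faq_page_jsonld(question: str, answer: str) -> str:
    """Build schema.org FAQPage JSON-LD for a single question and answer."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)
```

&lt;p&gt;The resulting string goes into a script element of type &lt;code&gt;application/ld+json&lt;/code&gt; on the FAQ page.&lt;/p&gt;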

&lt;h2&gt;
  
  
  Three questions before installing any AI chatbot
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Where do my visitors' questions go?&lt;/strong&gt; If the answer involves any company name other than yours, the answer is "to that company too".&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Where do you store my content?&lt;/strong&gt; "Our cloud" = your content is part of their dataset. "Your database" — ask to see the table.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What happens if I uninstall?&lt;/strong&gt; "Your data stays with us forever" vs "the data is gone, because it was always yours".&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We pass all three. We're one of the few that do.&lt;/p&gt;




&lt;p&gt;Full manifesto with the cosmetics-brand callback and the closing scene: &lt;a href="https://pagecoder.ai/blog/why-your-ai-chatbot-is-a-tracker/" rel="noopener noreferrer"&gt;pagecoder.ai/blog/why-your-ai-chatbot-is-a-tracker&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Plugin: &lt;a href="https://pagecoder.ai/products/rag-chat" rel="noopener noreferrer"&gt;pagecoder.ai/products/rag-chat&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(&lt;em&gt;Disclosure: I'm a co-founder. The audit pattern in this post applies regardless of which plugin you pick — even if you never use ours.&lt;/em&gt;)&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>privacy</category>
      <category>wordpress</category>
    </item>
  </channel>
</rss>
