<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Alexandre Caramaschi</title>
    <description>The latest articles on DEV Community by Alexandre Caramaschi (@alexandrebrt14sys).</description>
    <link>https://dev.to/alexandrebrt14sys</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3835714%2F8a5b0d12-7104-4cef-81f3-7a8eac04c5fb.jpeg</url>
      <title>DEV Community: Alexandre Caramaschi</title>
      <link>https://dev.to/alexandrebrt14sys</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alexandrebrt14sys"/>
    <language>en</language>
    <item>
      <title>I Built a Deterministic Crosslink Engine for 117 Pages Using Jaccard Similarity</title>
      <dc:creator>Alexandre Caramaschi</dc:creator>
      <pubDate>Fri, 10 Apr 2026 02:30:11 +0000</pubDate>
      <link>https://dev.to/alexandrebrt14sys/i-built-a-deterministic-crosslink-engine-for-117-pages-using-jaccard-similarity-3mkn</link>
      <guid>https://dev.to/alexandrebrt14sys/i-built-a-deterministic-crosslink-engine-for-117-pages-using-jaccard-similarity-3mkn</guid>
      <description>&lt;p&gt;A content site with 117 pages and zero internal linking strategy is a site where visitors bounce after reading one page. That was my site two weeks ago.&lt;/p&gt;

&lt;p&gt;Today, every page on &lt;a href="https://alexandrecaramaschi.com" rel="noopener noreferrer"&gt;alexandrecaramaschi.com&lt;/a&gt; has 6 contextual crosslinks generated by a deterministic engine that runs in 200ms, costs nothing, and lives in a single Node.js script — no embeddings, no vector databases, no API calls.&lt;/p&gt;

&lt;p&gt;Here is exactly how I built it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem: 117 Pages, Manual Linking
&lt;/h2&gt;

&lt;p&gt;The site has 41 long-form articles, 38 courses (388 modules), 26 strategic insights, and 14 service/tool pages. All built with Next.js 16 App Router.&lt;/p&gt;

&lt;p&gt;The existing &lt;code&gt;relatedArticles&lt;/code&gt; field in my CMS was manually curated — and covered maybe 15% of pages. Course pages had zero outbound links to articles. Articles never pointed to courses. The result: visitors arrived via search, consumed one page, and left.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture: Faceted Taxonomy + Weighted Scoring
&lt;/h2&gt;

&lt;p&gt;Instead of reaching for OpenAI embeddings, I designed a controlled vocabulary with 4 semantic facets:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Topics&lt;/strong&gt; — 26 canonical terms with synonym normalization:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;TOPICS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;geo&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;geo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;generative engine optimization&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;motor generativo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;seo&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;seo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;search engine optimization&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ia-generativa&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ia generativa&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;llm&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;chatgpt&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;claude&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gemini&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;vscode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;vscode&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;vs code&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;visual studio code&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;editor&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ide&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="c1"&gt;// ... 22 more&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each piece of content is annotated by scanning its title, description, and keywords against this vocabulary. Normalization strips accents and lowercases before matching (critical for Portuguese content).&lt;/p&gt;
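&lt;p&gt;The normalization step looks roughly like this (a minimal Python sketch of the idea; the actual implementation is a Node.js helper, and the function name here is illustrative):&lt;/p&gt;

```python
import unicodedata

def normalize(term: str) -> str:
    """Lowercase and strip accents so 'Otimização' matches 'otimizacao'."""
    # NFD splits each accented character into a base character plus combining marks
    decomposed = unicodedata.normalize("NFD", term.lower())
    # Keep only the base characters, dropping the combining marks
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(normalize("Otimização de Motores Generativos"))  # otimizacao de motores generativos
```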

&lt;p&gt;&lt;strong&gt;2. Audience&lt;/strong&gt; — 7 profiles (beginner, dev, marketing-pro, executive, etc.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Intent&lt;/strong&gt; — 4 journey stages: &lt;code&gt;discover → learn → apply → decide&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Vertical&lt;/strong&gt; — 12 industry sectors (healthcare, legal, tourism, etc.)&lt;/p&gt;

&lt;h2&gt;
  
  
  The Scoring Function
&lt;/h2&gt;

&lt;p&gt;For each pair of content items (A, B), the score is a weighted sum across facets:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;jaccard&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;topics_A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;topics_B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;audienceOverlap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;intentFlow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;1.2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;verticalBridge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;1.3&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;crossDomainBonus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.6&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;trackAffinity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;A&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;B&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Jaccard similarity&lt;/strong&gt; handles topic matching. Two items sharing 3 of 5 topics score 0.6 — high enough to be relevant, low enough to avoid duplicates.&lt;/p&gt;
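&lt;p&gt;For reference, the whole primitive fits in a few lines (Python sketch; the production version lives in the Node.js script):&lt;/p&gt;

```python
def jaccard(a: set, b: set) -> float:
    """Intersection over union of two topic sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Sharing 3 topics out of 5 total scores 0.6
print(jaccard({"geo", "seo", "llm", "vscode"}, {"geo", "seo", "llm", "python"}))  # 0.6
```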

&lt;p&gt;&lt;strong&gt;Intent flow&lt;/strong&gt; rewards linking from discovery content (articles) to learning content (courses) to action pages (tools) — guiding visitors deeper.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cross-domain bonus&lt;/strong&gt; is the key retention driver: an article about "zero-click economy" linking to the "SEO + GEO Fundamentals" course is more valuable than linking to another article about zero-click. Different content &lt;em&gt;types&lt;/em&gt; with shared topics are weighted at 1.3 — the highest coefficient in the sum.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Track affinity&lt;/strong&gt; ensures courses in the same learning path (e.g., Python → Data Science → Deploy) link to each other even without keyword overlap.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anti-Bubble Mixing
&lt;/h2&gt;

&lt;p&gt;Raw scoring produces homogeneous results — a course page would only suggest other courses. The mixer enforces quotas:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;content (articles + insights): min 1
learning (courses):             min 1
action (guides + tools):        min 1
any single group:               max 50%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three phases:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Fill mandatory quotas from each group&lt;/li&gt;
&lt;li&gt;Complete by score, respecting group caps&lt;/li&gt;
&lt;li&gt;Fallback by supercategory for edge cases&lt;/li&gt;
&lt;/ol&gt;
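&lt;p&gt;The first two phases can be sketched like this (Python for brevity; the group names follow the quotas above, everything else is illustrative):&lt;/p&gt;

```python
def mix(candidates, groups, total=6, cap=0.5):
    """candidates: (slug, group, score) tuples. Phase 1 fills one mandatory
    slot per group; phase 2 tops up by score while no group exceeds the cap."""
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    picked, counts = [], {g: 0 for g in groups}
    # Phase 1: the best-scoring candidate from each group is mandatory
    for group in groups:
        for cand in ranked:
            if cand[1] == group and cand not in picked:
                picked.append(cand)
                counts[group] += 1
                break
    # Phase 2: fill the remaining slots by raw score, respecting the 50% cap
    for cand in ranked:
        if len(picked) >= total:
            break
        if cand not in picked and counts.get(cand[1], 0) < total * cap:
            picked.append(cand)
            counts[cand[1]] = counts.get(cand[1], 0) + 1
    return [slug for slug, _, _ in picked]
```

Phase 3 (the supercategory fallback) only kicks in when a group has no candidates at all, so it is omitted here.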

&lt;h2&gt;
  
  
  Injection Without Editing 63 Static Pages
&lt;/h2&gt;

&lt;p&gt;The site has 38 static course pages and 26 static insight pages — all individual &lt;code&gt;page.tsx&lt;/code&gt; files. Editing each one was not viable.&lt;/p&gt;

&lt;p&gt;Solution: &lt;strong&gt;middleware + headers + layout injection&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The middleware sets an &lt;code&gt;x-pathname&lt;/code&gt; header:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// middleware.ts&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;requestHeaders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Headers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;requestHeaders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;x-pathname&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;pathname&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;NextResponse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;requestHeaders&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A server component reads it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// SmartRelated.tsx&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;h&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;x-pathname&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getCrosslinksFor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Injected via &lt;code&gt;educacao/layout.tsx&lt;/code&gt; and &lt;code&gt;insights/layout.tsx&lt;/code&gt;, the component automatically appears below every course and insight page. For articles (the dynamic &lt;code&gt;[slug]&lt;/code&gt; route), the pathname is passed explicitly as a prop.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Pages with crosslinks&lt;/td&gt;
&lt;td&gt;~15%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;100%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Total crosslinks&lt;/td&gt;
&lt;td&gt;~40 manual&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;700 generated&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-type links&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;116 of 117 pages&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Badge types per page&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.3 average&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Build time delta&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+200ms&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API costs&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The generator runs as part of &lt;code&gt;prebuild&lt;/code&gt; and outputs a static JSON map consumed at render time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Not Embeddings?
&lt;/h2&gt;

&lt;p&gt;At 117 pages, embeddings are overkill. The controlled vocabulary approach is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deterministic&lt;/strong&gt; — same input, same output, every time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auditable&lt;/strong&gt; — grep the vocabulary file to understand any link&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Free&lt;/strong&gt; — no API calls, no vector DB&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fast&lt;/strong&gt; — 200ms to generate the entire map&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Versionable&lt;/strong&gt; — the JSON map is committed to git&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When the site crosses ~500 pages, I will migrate to pgvector. The architecture was designed for this: consumers only read &lt;code&gt;crosslink-map.json&lt;/code&gt; — they do not care how it was generated.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;The full source is at &lt;a href="https://alexandrecaramaschi.com" rel="noopener noreferrer"&gt;alexandrecaramaschi.com&lt;/a&gt;. Navigate any course, scroll to the bottom, and you will see the crosslinks in action.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Alexandre Caramaschi — CEO at Brasil GEO, former CMO at Semantix (Nasdaq), co-founder of AI Brasil. Building the practice of Generative Engine Optimization in Latin America.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>seo</category>
      <category>webdev</category>
      <category>nextjs</category>
    </item>
    <item>
      <title>12 days of 'success' collecting zero data: the silent bug that killed my 90-day study</title>
      <dc:creator>Alexandre Caramaschi</dc:creator>
      <pubDate>Wed, 08 Apr 2026 00:24:00 +0000</pubDate>
      <link>https://dev.to/alexandrebrt14sys/12-dias-de-success-coletando-zero-dados-o-bug-silencioso-que-matou-minha-pesquisa-de-90-dias-3pfn</link>
      <guid>https://dev.to/alexandrebrt14sys/12-dias-de-success-coletando-zero-dados-o-bug-silencioso-que-matou-minha-pesquisa-de-90-dias-3pfn</guid>
      <description>&lt;p&gt;&lt;strong&gt;8 days. 0 observations. 12 GitHub Actions workflows marked green.&lt;/strong&gt; That is what I discovered six hours ago, on April 7, 2026, when I looked at my research dashboard at alexandrecaramaschi.com/research and saw &lt;code&gt;overall_rate: 0&lt;/code&gt;, &lt;code&gt;total_observations: 0&lt;/code&gt;, &lt;code&gt;days_collecting: 0&lt;/code&gt; across all four verticals.&lt;/p&gt;

&lt;p&gt;GitHub Actions told me it had been running successfully since March 30. The commits were there, dated, with perfect automated messages: &lt;code&gt;data: daily collection 4 verticals 2026-04-07&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I had a workflow calling &lt;code&gt;python -m src.cli collect citation&lt;/code&gt; for 4 verticals (fintech, retail, healthcare, technology), 4 LLMs (ChatGPT, Claude, Gemini, Perplexity), every day at 06:00 BRT.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;output/&lt;/code&gt; folder had up-to-date checkpoints. The dashboard was live. And there was not a single line of real data since March 30.&lt;/p&gt;

&lt;h2&gt;
  
  
  The counterintuitive thesis
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Green CI workflows lie.&lt;/strong&gt; Especially when your code has &lt;code&gt;continue-on-error: true&lt;/code&gt; scattered everywhere and your only success criterion is "the process did not throw".&lt;/p&gt;

&lt;p&gt;The story I am about to tell is a combination of three failures that reinforce each other: API keys rotated externally without being propagated to the repository, silent HTTP 401 responses because the collector caught the exception and moved on, and a workflow YAML that treated "finished without crashing" as "ran fine". The result is the worst kind of pipeline bug: the kind that keeps every indicator green while the dataset goes stale.&lt;/p&gt;

&lt;h2&gt;
  
  
  The context: a 90-day empirical study
&lt;/h2&gt;

&lt;p&gt;I am running a longitudinal study on how LLMs cite Brazilian companies. The design has 4 verticals, 69 entities (61 real + 8 fictitious for false-positive calibration), 4 version-pinned models (&lt;code&gt;gpt-4o-mini-2024-07-18&lt;/code&gt;, &lt;code&gt;claude-haiku-4-5-20251001&lt;/code&gt;, &lt;code&gt;sonar&lt;/code&gt;, &lt;code&gt;gemini-2.5-pro&lt;/code&gt;) and ~288 observations per day. The target was 90 continuous days, ~25,920 observations, and three papers planned for ArXiv + SIGIR/WWW + a Q1 Information Sciences journal.&lt;/p&gt;

&lt;p&gt;Collection started on March 24. Everything worked on day 1. On the 25th and 26th, a &lt;code&gt;SyntaxError&lt;/code&gt; under the CI's Python 3.11 (valid on my local 3.12) killed collection: an incident already documented, fixed, and written up as a post-mortem.&lt;/p&gt;

&lt;p&gt;On March 29, everything was working again: 256 real observations, a healthy distribution across the 4 LLMs.&lt;/p&gt;

&lt;p&gt;On March 30, something broke.&lt;/p&gt;

&lt;h2&gt;
  
  
  The root cause
&lt;/h2&gt;

&lt;p&gt;Somewhere between March 29 and 30, I rotated the 5 API keys in my local workspace, probably during a FinOps audit I was doing on the multi-LLM orchestrator. I updated the &lt;code&gt;.env&lt;/code&gt; of the main repository. I ran a smoke test, confirmed everything returned HTTP 200. I moved on.&lt;/p&gt;

&lt;p&gt;What I did not do: propagate the new keys to the GitHub Secrets of the &lt;code&gt;papers&lt;/code&gt; repository. The keys there were still dated March 24, pointing at the old, now-invalid set.&lt;/p&gt;

&lt;p&gt;From March 30 on, every day at 06:00 BRT, the workflow ran. Every call to OpenAI returned &lt;code&gt;HTTP 401 invalid_api_key&lt;/code&gt;. Every call to Anthropic returned &lt;code&gt;HTTP 401 invalid x-api-key&lt;/code&gt;. Every call to Perplexity, same thing. Gemini returned &lt;code&gt;HTTP 400&lt;/code&gt; for a different reason (the 2.5 Pro response structure with thinking mode was incompatible with my parser: another bug I cover below).&lt;/p&gt;

&lt;p&gt;And the collector kept going. Because the &lt;code&gt;collect()&lt;/code&gt; function caught the exceptions, logged them to stderr, and returned an empty list. The CLI function checked &lt;code&gt;if results:&lt;/code&gt; before inserting into the database: an empty list simply meant "nothing to insert, fine, next vertical". No non-zero exit code. No &lt;code&gt;raise&lt;/code&gt;. No alert.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;finalize&lt;/code&gt; job downloaded the previous day's &lt;code&gt;papers-db-latest&lt;/code&gt; artifact, ran the &lt;code&gt;sync_to_supabase.py&lt;/code&gt; aggregation (zero rows → every KPI zeroed), updated the &lt;code&gt;papers_dashboard_data&lt;/code&gt; table snapshot in Supabase with &lt;code&gt;total_observations: 0&lt;/code&gt;, &lt;code&gt;overall_rate: 0&lt;/code&gt;, &lt;code&gt;days_collecting: 0&lt;/code&gt;, uploaded the same unchanged artifact, committed &lt;code&gt;data/daily_*.csv&lt;/code&gt; (empty), &lt;code&gt;data/finops_checkpoint.json&lt;/code&gt; and &lt;code&gt;docs/&lt;/code&gt;. And exited with code 0.&lt;/p&gt;

&lt;p&gt;12 days like this. Workflow status: &lt;code&gt;completed/success&lt;/code&gt;. Actual database: 186 observations frozen at March 24. Live dashboard: zeros across every vertical.&lt;/p&gt;

&lt;h2&gt;
  
  
  How I found out
&lt;/h2&gt;

&lt;p&gt;It was not an alert. It should have been. There was none.&lt;/p&gt;

&lt;p&gt;It was a question. "Is collection running consistently enough for us to reach critical mass in 90 days?"&lt;/p&gt;

&lt;p&gt;Five minutes later, after pulling the logs of the latest run with &lt;code&gt;gh run view --log&lt;/code&gt; and filtering for &lt;code&gt;ERROR&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ERROR: [ChatGPT] HTTP 401: invalid_api_key
ERROR: [Claude] HTTP 401: invalid x-api-key
ERROR: [Gemini] HTTP 400: ...
ERROR: [Perplexity] HTTP 401: Invalid API key provided
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Repeated in a loop across all 18 queries for each of the 4 verticals. More than 200 error lines. And the workflow at the top of the page said &lt;code&gt;success&lt;/code&gt; in green.&lt;/p&gt;

&lt;h2&gt;
  
  
  The decision to reset
&lt;/h2&gt;

&lt;p&gt;I had a choice: backfill manually with altered dates to preserve the streak (but with every timestamp from the current day, contaminating temporal analyses), or accept that I lost 8 days and restart the counter.&lt;/p&gt;

&lt;p&gt;I restarted. The temporal integrity of a longitudinal study is worth more than the vanity of a "continuous days" number. Ninety days with real timestamps is evidence. Ninety days with 8 of them fabricated is methodological fraud.&lt;/p&gt;

&lt;p&gt;Day 1 of the new window is April 8. Day 90 will be July 6, 2026. ~256 observations per day × 90 days ≈ 23,000 total observations with temporal integrity preserved.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 5 fixes that will make sure this never happens again
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Fail loud in the collection command
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;src/cli.py::collect_citation&lt;/code&gt; now sums the total citations collected across all verticals. If it is zero when at least one vertical was attempted, the command raises &lt;code&gt;SystemExit(1)&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;total_attempted&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;total_collected&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FAIL-LOUD: 0 citacoes em &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;total_attempted&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; verticais. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Provavel causa: API keys invalidas/expiradas, rate limiting, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ou erro de configuracao.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;SystemExit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This guarantees the workflow genuinely fails when 100% of the calls error out. No &lt;code&gt;|| true&lt;/code&gt; on top. No &lt;code&gt;continue-on-error&lt;/code&gt;. The job ends red.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Granular retry policy in the collector
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;src/collectors/base.py&lt;/code&gt; used to handle only HTTP 429. It now handles five different categories:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Erro&lt;/th&gt;
&lt;th&gt;Comportamento&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;HTTP 401/403&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Circuit break imediato. Não retenta. Loga "rotacionar key no GitHub Secrets".&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;HTTP 429&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Retry com backoff exponencial. Após max retries, circuit break.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;HTTP 5xx&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Retry com backoff exponencial.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;ConnectError&lt;/code&gt;, &lt;code&gt;ReadTimeout&lt;/code&gt;, &lt;code&gt;WriteTimeout&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Retry com backoff.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;HTTP 4xx fatais&lt;/code&gt; (400, 404, 422)&lt;/td&gt;
&lt;td&gt;Log e segue para a próxima query.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The separation matters: a 401 is not transient, it is configuration. Retrying does not fix it. The fix is rotating the key. Logging that explicitly makes the failure surface in diagnostics instead of being buried under useless retries.&lt;/p&gt;
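&lt;p&gt;The dispatch reduces to a small classifier (illustrative sketch; the real &lt;code&gt;base.py&lt;/code&gt; also carries the backoff state):&lt;/p&gt;

```python
def classify_error(status=None, exc=None):
    """Map an HTTP status (or a transport exception) to a retry decision."""
    if status in (401, 403):
        return "circuit_break"   # configuration problem: rotate the key, retrying is useless
    if status == 429:
        return "retry_backoff"   # rate limit: exponential backoff, circuit break after max retries
    if status is not None and status >= 500:
        return "retry_backoff"   # server side, likely transient
    if status in (400, 404, 422):
        return "skip_query"      # fatal for this request only: log it and move on
    if exc is not None:
        return "retry_backoff"   # ConnectError / ReadTimeout / WriteTimeout
    return "skip_query"

print(classify_error(status=401))  # circuit_break
```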

&lt;h3&gt;
  
  
  3. A 14-dimension health check with WhatsApp + email alerts
&lt;/h3&gt;

&lt;p&gt;I created &lt;code&gt;scripts/health_check.py&lt;/code&gt; in the style of the &lt;code&gt;geo-finops/health_check.py&lt;/code&gt; that already exists in my ecosystem. The script runs 14 end-to-end checks:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;papers.db&lt;/code&gt; exists&lt;/li&gt;
&lt;li&gt;Schema with the 21 required tables&lt;/li&gt;
&lt;li&gt;All 4 API keys loaded in the environment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real smoke test of the 4 keys&lt;/strong&gt; (makes a minimal call to each provider)&lt;/li&gt;
&lt;li&gt;At least 200 observations in the last 24h&lt;/li&gt;
&lt;li&gt;All 4 verticals collected in the last 24h&lt;/li&gt;
&lt;li&gt;All 4 LLMs responded in the last 24h&lt;/li&gt;
&lt;li&gt;No gap larger than 1 day between collections (warning)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;papers_dashboard_data&lt;/code&gt; in Supabase with &lt;code&gt;total_observations &amp;gt; 0&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;FinOps spend &amp;lt; 90% of the monthly budget&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;/research&lt;/code&gt; endpoint returning HTTP 200&lt;/li&gt;
&lt;li&gt;Models pinned in the database (specific versions)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;raw_text&lt;/code&gt; preserved for reprocessing&lt;/li&gt;
&lt;li&gt;Fictitious entities present in the cohort (false-positive calibration)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Exit code 1 if any critical check fails. On failure, &lt;code&gt;send_alert()&lt;/code&gt; fires two channels in parallel: the WhatsApp Business API to &lt;code&gt;+5562998141505&lt;/code&gt; and email via Resend to &lt;code&gt;caramaschiai@caramaschiai.io&lt;/code&gt;. The message body includes a summary of the failures, the relevant metrics, and a basic recovery runbook.&lt;/p&gt;

&lt;p&gt;Smoke test run: &lt;code&gt;whatsapp: OK&lt;/code&gt;. A real message arrived on my phone.&lt;/p&gt;
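&lt;p&gt;The freshness checks (5 through 8) all reduce to one query against the local database. A minimal sketch (table and column names here are illustrative, not the real schema):&lt;/p&gt;

```python
import sqlite3
from datetime import datetime, timedelta, timezone

def check_freshness(db_path, min_obs=200):
    """True iff at least min_obs observations landed in the last 24 hours."""
    cutoff = (datetime.now(timezone.utc) - timedelta(hours=24)).isoformat()
    con = sqlite3.connect(db_path)
    try:
        # ISO-8601 UTC timestamps compare lexicographically, so >= works directly
        (count,) = con.execute(
            "SELECT COUNT(*) FROM observations WHERE collected_at >= ?", (cutoff,)
        ).fetchone()
    finally:
        con.close()
    return count >= min_obs
```

The design point is that the boolean feeds a non-zero exit code: a check that only logs is exactly the bug this incident exposed.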

&lt;h3&gt;
  
  
  4. Health check as a gate in daily-collect
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;daily-collect.yml&lt;/code&gt; gained a new step at the end of the &lt;code&gt;finalize&lt;/code&gt; job:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Health check (gating)&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python scripts/health_check.py --min-obs-per-day &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No &lt;code&gt;continue-on-error&lt;/code&gt;. If the health check fails, the workflow fails. If the workflow fails, &lt;code&gt;daily-collect-alert.yml&lt;/code&gt; (a separate workflow listening for &lt;code&gt;workflow_run.failure&lt;/code&gt;) fires WhatsApp + email.&lt;/p&gt;

&lt;p&gt;One more scheduled workflow (&lt;code&gt;health-check-daily.yml&lt;/code&gt;) runs 4 hours later, at 13:00 UTC, as a redundant layer in case daily-collect failed in some way the gate did not catch. Defense in depth.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Tighter FinOps
&lt;/h3&gt;

&lt;p&gt;The default budgets were far too loose ($35/month global) for the real observed cost (~$1/month). If some bug made queries explode for hours before I noticed, the damage could be two orders of magnitude above anything worth paying.&lt;/p&gt;

&lt;p&gt;I tightened everything to a 5x margin over the observed average cost:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Antes&lt;/th&gt;
&lt;th&gt;Depois&lt;/th&gt;
&lt;th&gt;Hard stop&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;openai&lt;/td&gt;
&lt;td&gt;$10/mês&lt;/td&gt;
&lt;td&gt;$3/mês&lt;/td&gt;
&lt;td&gt;95%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;anthropic&lt;/td&gt;
&lt;td&gt;$10/mês&lt;/td&gt;
&lt;td&gt;$3/mês&lt;/td&gt;
&lt;td&gt;95%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;google&lt;/td&gt;
&lt;td&gt;$5/mês&lt;/td&gt;
&lt;td&gt;$2/mês&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;perplexity&lt;/td&gt;
&lt;td&gt;$10/mês&lt;/td&gt;
&lt;td&gt;$3/mês&lt;/td&gt;
&lt;td&gt;95%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;groq&lt;/td&gt;
&lt;td&gt;$5/mês&lt;/td&gt;
&lt;td&gt;$1/mês&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;global&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$35/mês&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$10/mês&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;95%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A hard stop at 95% per provider means that when spend reaches that level, the tracker blocks new calls to that provider until the daily/monthly reset. Bill shock is prevented with a cap, not with trust.&lt;/p&gt;
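&lt;p&gt;The hard-stop semantics can be sketched in a few lines (illustrative only; &lt;code&gt;call_allowed&lt;/code&gt; and the budget table literal are assumptions, not the tracker's real API):&lt;/p&gt;

```python
# Minimal sketch of the hard-stop rule described above: once spend reaches
# the stop fraction of the monthly cap, new calls to that provider are
# blocked until the budget resets. Names are illustrative.
BUDGETS = {  # provider -> (monthly cap in USD, hard-stop fraction)
    "openai": (3.0, 0.95),
    "anthropic": (3.0, 0.95),
    "google": (2.0, 1.00),
    "perplexity": (3.0, 0.95),
    "groq": (1.0, 1.00),
}

def call_allowed(provider: str, spent_usd: float) -> bool:
    cap, stop = BUDGETS[provider]
    return spent_usd < cap * stop

# call_allowed("openai", 2.80) -> True  (below the $2.85 hard stop)
# call_allowed("openai", 2.85) -> False (95% of the $3 cap reached)
```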

&lt;h2&gt;
  
  
  The bugs I discovered by accident along the way
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Gemini 2.5 Pro thinking mode
&lt;/h3&gt;

&lt;p&gt;While debugging the collection, I discovered that even with fresh keys Gemini was returning empty data. The &lt;code&gt;gemini-2.5-pro&lt;/code&gt; model spends internal thinking tokens before generating output. With &lt;code&gt;max_output_tokens = 300&lt;/code&gt;, the thinking budget exhausted the tokens and the response came back with &lt;code&gt;candidates[0].content&lt;/code&gt; missing the &lt;code&gt;parts&lt;/code&gt; field. The parser did &lt;code&gt;data["candidates"][0]["content"]["parts"][0]["text"]&lt;/code&gt; and raised &lt;code&gt;KeyError: 'parts'&lt;/code&gt;. But the &lt;code&gt;KeyError&lt;/code&gt; turned into a log warning and the function returned None: another silent failure.&lt;/p&gt;

&lt;p&gt;Fix: 4x the &lt;code&gt;max_output_tokens&lt;/code&gt; for &lt;code&gt;*-pro&lt;/code&gt; models (to compensate for the thinking budget), plus graceful handling of responses without &lt;code&gt;parts&lt;/code&gt; (treat them as an empty string instead of raising).&lt;/p&gt;
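&lt;p&gt;Both halves of the fix can be sketched like this (hedged: &lt;code&gt;extract_text&lt;/code&gt; and &lt;code&gt;output_budget&lt;/code&gt; are illustrative names, not the repo's actual parser):&lt;/p&gt;

```python
# Sketch of the fix: treat a Gemini response whose content has no "parts"
# (thinking budget exhausted) as an empty string instead of letting the
# KeyError escape into a silent warning-and-None path.
def extract_text(data: dict) -> str:
    candidates = data.get("candidates") or []
    if not candidates:
        return ""
    content = candidates[0].get("content") or {}
    parts = content.get("parts") or []
    if not parts:
        return ""  # no parts: thinking tokens ate the output budget
    return parts[0].get("text", "")

def output_budget(model: str, base: int = 300) -> int:
    # 4x the token budget for *-pro models to compensate for thinking tokens
    return base * 4 if model.endswith("-pro") else base
```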

&lt;h3&gt;
  
  
  Idempotency requires deterministic normalization of the key schema
&lt;/h3&gt;

&lt;p&gt;This one comes from a sibling bug in my &lt;code&gt;geo-finops&lt;/code&gt; package (unified LLM tracking for my ecosystem). When two callers recorded the same logical call in different formats (local Python with microseconds, the Next.js server with milliseconds), they slipped past dedup as "different rows". The UNIQUE constraint matches the literal timestamp string, not the semantic instant.&lt;/p&gt;

&lt;p&gt;Fix: a &lt;code&gt;_normalize_timestamp()&lt;/code&gt; that runs &lt;code&gt;datetime.fromisoformat(...).astimezone(timezone.utc).isoformat()&lt;/code&gt; before any INSERT. If you expose a key schema that includes a timestamp, normalizing it is mandatory. The PostgreSQL documentation will not remind you of this.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned (and am taking to all my other pipelines)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Green workflows lie.&lt;/strong&gt; Rephrasing: green workflows do not mean healthy pipelines. They mean the process finished. The difference between the two cost me 8 days of collection and nearly compromised a 90-day study.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;continue-on-error: true&lt;/code&gt; is technical debt disguised as resilience.&lt;/strong&gt; Use it extremely sparingly, and never on steps that produce data. Cleanup steps, yes. Collection steps, never.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A key smoke test ≠ a "key exists in env" check.&lt;/strong&gt; Verifying that &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; is set says nothing about whether it is valid. Check 4 of my health check makes a minimal call to each provider: total cost ~$0.0001, priceless value.&lt;/p&gt;
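&lt;p&gt;The distinction can be captured in a tiny harness (a sketch; the real check 4 sends a one-token request per provider, stubbed here as plain callables):&lt;/p&gt;

```python
# Hedged sketch of "smoke test, not env check": actually exercise each key
# with a minimal probe call and report validity per provider. The probe
# callables stand in for one-token requests to each real API.
def smoke_test_keys(probes: dict) -> dict:
    results = {}
    for provider, probe in probes.items():
        try:
            probe()            # cheapest possible real call
            results[provider] = True
        except Exception:      # invalid/expired key, quota, network error
            results[provider] = False
    return results
```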

&lt;p&gt;&lt;strong&gt;Defense in depth &amp;gt; a single check.&lt;/strong&gt; Health check in daily-collect (layer 1) + a separate workflow 4h later (layer 2) + a WhatsApp alert on any failure (layer 3) + granular retry in the collector (layer 4) + a tight budget with a hard stop (layer 5). If one layer fails, the next one catches it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Double-checking requires real data, not mocks.&lt;/strong&gt; The 409 Conflict bug in &lt;code&gt;geo-finops&lt;/code&gt; (and the non-normalized timestamp one) only surfaced when I ran real end-to-end tests. Mocks would have passed every check. The right path: run the real caller, validate each stage of the pipeline, re-run to validate idempotency, clean up after the test, add an automated regression test.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Backfill with altered timestamps is fraud.&lt;/strong&gt; If you are building longitudinal evidence, prefer the honest reset to the inflated sequence. Nine lost days hurt. Nine invented days invalidate the entire paper.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this goes
&lt;/h2&gt;

&lt;p&gt;The new window starts tomorrow, April 8. In 90 days I should have ~23,000 real observations with temporal integrity, all with &lt;code&gt;raw_text&lt;/code&gt; preserved for reprocessing, models pinned for reproducibility, and false-positive calibration built in via 8 fictitious entities.&lt;/p&gt;

&lt;p&gt;The live dashboard is at &lt;a href="https://alexandrecaramaschi.com/research" rel="noopener noreferrer"&gt;https://alexandrecaramaschi.com/research&lt;/a&gt;. The code (including all of tonight's fixes) is at &lt;a href="https://github.com/alexandrebrt14-sys/papers" rel="noopener noreferrer"&gt;https://github.com/alexandrebrt14-sys/papers&lt;/a&gt;. The health check is executable and auditable at &lt;code&gt;scripts/health_check.py&lt;/code&gt;; anyone who wants to replicate the methodology can run the 14 checks on their own fork.&lt;/p&gt;

&lt;p&gt;If you are building a longitudinal collection pipeline and still have no fail-loud behavior on any step, add it today. Not tomorrow. The difference between finding the bug in an hour and finding it in 12 days is the difference between a post like this and a dead paper.&lt;/p&gt;

&lt;p&gt;I am counting down to reaching July 6 with critical mass. Reports of similar bugs are welcome: my post-mortem is yours too.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Alexandre Caramaschi&lt;/strong&gt; is the CEO of Brasil GEO, former CMO of Semantix (Nasdaq), and co-founder of AI Brasil. He writes about Generative Engine Optimization, empirical LLM research, and pipeline infrastructure at &lt;a href="https://alexandrecaramaschi.com" rel="noopener noreferrer"&gt;https://alexandrecaramaschi.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>postmortem</category>
      <category>observability</category>
      <category>llm</category>
      <category>finops</category>
    </item>
    <item>
      <title>847 commits in 3 weeks: how vibe coding turned a marketing executive into a builder</title>
      <dc:creator>Alexandre Caramaschi</dc:creator>
      <pubDate>Sat, 04 Apr 2026 20:11:25 +0000</pubDate>
      <link>https://dev.to/alexandrebrt14sys/847-commits-em-3-semanas-como-vibe-coding-transformou-um-executivo-de-marketing-em-builder-bhl</link>
      <guid>https://dev.to/alexandrebrt14sys/847-commits-em-3-semanas-como-vibe-coding-transformou-um-executivo-de-marketing-em-builder-bhl</guid>
      <description>&lt;p&gt;Ha 3 semanas eu nao tinha uma unica linha de codigo publicada. Zero.&lt;/p&gt;

&lt;p&gt;I was a marketing executive with 20 years in the industry (ex-CMO of Semantix on the Nasdaq, co-founder of AI Brasil), but I had never written production code.&lt;/p&gt;

&lt;p&gt;Today I have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;13 GitHub repositories&lt;/strong&gt; with 847 commits&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;2 sites in production&lt;/strong&gt; with 100% uptime&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;40 free educational courses&lt;/strong&gt; published&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A pipeline orchestrating 5 AIs simultaneously&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;29,400 lines of Python&lt;/strong&gt; in a personal governance system&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;653 academic citations monitored&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;An OWASP audit&lt;/strong&gt; with 34 findings and 11 fixes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Monthly cost: zero dollars.&lt;/p&gt;

&lt;h2&gt;
  
  
  What vibe coding is in practice
&lt;/h2&gt;

&lt;p&gt;It is not asking an AI to build a website. It is a continuous technical conversation. You bring the business vision and the strategic decisions. The AI brings execution at a speed impossible for traditional teams.&lt;/p&gt;

&lt;p&gt;My flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Define what needs to exist and why&lt;/li&gt;
&lt;li&gt;Claude Code writes, tests, and deploys&lt;/li&gt;
&lt;li&gt;Validate, adjust, course-correct&lt;/li&gt;
&lt;li&gt;Next step&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each iteration took minutes, not days.&lt;/p&gt;

&lt;h2&gt;
  
  
  The numbers
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Personal site (alexandrecaramaschi.com):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;429 commits, 124 pages, 27 API routes&lt;/li&gt;
&lt;li&gt;Full gamification (XP, streaks, badges, certificates)&lt;/li&gt;
&lt;li&gt;Semantic search with pgvector&lt;/li&gt;
&lt;li&gt;29 Schema.org JSON-LD types&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Personal governance (29,400 lines of Python):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;WhatsApp answers 85% of messages without AI (deterministic, zero cost)&lt;/li&gt;
&lt;li&gt;An Itaú PDF parser classifies 711 transactions into 30 categories&lt;/li&gt;
&lt;li&gt;6 synchronized calendars with a gap detector&lt;/li&gt;
&lt;li&gt;Automatic morning briefing at 7 a.m.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Academic pipeline:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;7,010 lines, daily collection across 4 verticals&lt;/li&gt;
&lt;li&gt;653 citations about LLMs and Brazilian companies&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  5 lessons
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Start with a real problem, not with a technology&lt;/li&gt;
&lt;li&gt;Document everything from day 1&lt;/li&gt;
&lt;li&gt;Security is not optional&lt;/li&gt;
&lt;li&gt;The cost of being wrong has dropped drastically&lt;/li&gt;
&lt;li&gt;The market will not wait for you to be ready&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;847 commits. 3 weeks. No team. Zero cost. And this is just the beginning.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Alexandre Caramaschi, CEO of Brasil GEO, ex-CMO of Semantix (Nasdaq), co-founder of AI Brasil&lt;/em&gt;&lt;/p&gt;

</description>
      <category>vibecoding</category>
      <category>ai</category>
      <category>productivity</category>
      <category>beginners</category>
    </item>
    <item>
      <title>From 60 Issues to 14: How I Refactored 194K Lines with 5 AIs via Vibecoding</title>
      <dc:creator>Alexandre Caramaschi</dc:creator>
      <pubDate>Sat, 04 Apr 2026 19:00:08 +0000</pubDate>
      <link>https://dev.to/alexandrebrt14sys/de-60-issues-para-14-como-refatorei-194k-linhas-com-5-ias-via-vibecoding-4a0l</link>
      <guid>https://dev.to/alexandrebrt14sys/de-60-issues-para-14-como-refatorei-194k-linhas-com-5-ias-via-vibecoding-4a0l</guid>
      <description>&lt;p&gt;Uma sessao de trabalho. 70 commits. 10 repositorios. 194 mil linhas de codigo auditadas. 5 modelos de linguagem orquestrados. Custo total: US$60.&lt;/p&gt;

&lt;p&gt;This is the technical account of how I used Vibecoding to turn an ecosystem of personal automations into a digital governance platform ready to scale with Google Ads.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Starting Point
&lt;/h2&gt;

&lt;p&gt;My project started with 21 thousand lines of Python, 6 synchronized sub-calendars, a WhatsApp Business webhook, and an SQLite database with 1,831 records. The system said I DO NOT HAVE ACCESS when the data was one SELECT away.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting the AI Out of the Way
&lt;/h2&gt;

&lt;p&gt;For deterministic data, generative AI is the problem. I implemented a three-layer pipeline: LLM-free keyword answers in under 100ms, LLM classification as a fallback, LLM generation as a last resort. 85 percent of queries never touch an LLM.&lt;/p&gt;
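&lt;p&gt;The three layers can be sketched as a single router (illustrative; the real pipeline's names differ, and &lt;code&gt;classify&lt;/code&gt;/&lt;code&gt;generate&lt;/code&gt; stand in for LLM calls):&lt;/p&gt;

```python
# Sketch of the three-layer routing described above: deterministic keyword
# answers first, an LLM classifier as fallback, LLM generation last.
def route(query: str, keyword_table: dict, classify, generate) -> str:
    # Layer 1: LLM-free keyword lookup (sub-100ms, zero cost)
    for keyword, answer in keyword_table.items():
        if keyword in query.lower():
            return answer
    # Layer 2: LLM classification mapped back to a canned answer
    label = classify(query)
    if label in keyword_table:
        return keyword_table[label]
    # Layer 3: LLM generation, last resort only
    return generate(query)
```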

&lt;h2&gt;
  
  
  70 Commits Across 10 Repos
&lt;/h2&gt;

&lt;p&gt;GitHub issues: from 60 down to 14. Tests: from 7 to 13. WhatsApp response time: from 3-8s to under 100ms. Documented tables: from 0 to 64.&lt;/p&gt;

&lt;h2&gt;
  
  
  5 Orchestrated LLMs
&lt;/h2&gt;

&lt;p&gt;Perplexity researches. GPT-4o drafts. Gemini analyzes. Groq classifies. Claude architects. 10 runs, 5 RFCs, US$60 total.&lt;/p&gt;

&lt;h2&gt;
  
  
  7 Lessons
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;For deterministic data, take the LLM out of the path&lt;/li&gt;
&lt;li&gt;Never trust the prompt to forbid behaviors&lt;/li&gt;
&lt;li&gt;A deploy is not a commit&lt;/li&gt;
&lt;li&gt;Document the data before scaling&lt;/li&gt;
&lt;li&gt;Orchestrate LLMs instead of depending on a single one&lt;/li&gt;
&lt;li&gt;Set up tracking before running ads&lt;/li&gt;
&lt;li&gt;Use Vibecoding to accelerate, not to replace thinking&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;Alexandre Caramaschi, CEO of Brasil GEO, ex-CMO of Semantix (Nasdaq), co-founder of AI Brasil.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>vibecoding</category>
      <category>python</category>
      <category>productivity</category>
    </item>
    <item>
      <title>How I Implemented 30 JSON-LD Schema Types and llms.txt to Get Cited by ChatGPT, Gemini, and Claude</title>
      <dc:creator>Alexandre Caramaschi</dc:creator>
      <pubDate>Tue, 31 Mar 2026 08:54:17 +0000</pubDate>
      <link>https://dev.to/alexandrebrt14sys/como-implementei-30-tipos-de-schema-json-ld-e-llmstxt-para-ser-citado-por-chatgpt-gemini-e-claude-3ooc</link>
      <guid>https://dev.to/alexandrebrt14sys/como-implementei-30-tipos-de-schema-json-ld-e-llmstxt-para-ser-citado-por-chatgpt-gemini-e-claude-3ooc</guid>
      <description>&lt;h1&gt;
  
  
  How I Implemented 30 JSON-LD Schema Types and llms.txt to Get Cited by ChatGPT, Gemini, and Claude
&lt;/h1&gt;

&lt;p&gt;When I decided my site needed to be understood by AIs, not just by humans, I realized I was facing a problem almost nobody was solving. Most developers still optimize exclusively for Google. But traffic from AI-generated answers (ChatGPT, Gemini, Perplexity, Claude) is already a reality. And those engines do not read your site the way Googlebot does.&lt;/p&gt;

&lt;p&gt;I needed two things: a &lt;strong&gt;structured identity card&lt;/strong&gt; that any machine could interpret (JSON-LD Schema) and a &lt;strong&gt;readable résumé&lt;/strong&gt; I would hand directly to the LLMs (llms.txt). This article documents exactly how I implemented both in the &lt;a href="https://alexandrecaramaschi.com" rel="noopener noreferrer"&gt;alexandrecaramaschi.com&lt;/a&gt; and &lt;a href="https://brasilgeo.ai" rel="noopener noreferrer"&gt;brasilgeo.ai&lt;/a&gt; projects, with real code and verifiable results.&lt;/p&gt;

&lt;h2&gt;
  
  
  What JSON-LD Schema Is (and Why AIs Need It)
&lt;/h2&gt;

&lt;p&gt;Think of JSON-LD Schema as &lt;strong&gt;your web page's identity card&lt;/strong&gt;. When you meet someone, they tell you their name, where they work, what they do. JSON-LD does exactly that, only for machines.&lt;/p&gt;

&lt;p&gt;It is a block of structured data in JSON format that you insert into your page's &lt;code&gt;&amp;lt;head&amp;gt;&lt;/code&gt;. It never renders for the user; it is invisible. But for search crawlers and the RAG (Retrieval-Augmented Generation) pipelines that feed LLMs, it is pure gold.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"@context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://schema.org"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"@graph"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"@type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Organization"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Brasil GEO"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://brasilgeo.ai"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"founder"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"@type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Person"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Alexandre Caramaschi"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"jobTitle"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CEO"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"@type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"WebSite"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Alexandre Caramaschi"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://alexandrecaramaschi.com"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key is the &lt;code&gt;@graph&lt;/code&gt;: instead of scattering multiple JSON-LD scripts across the page, I consolidate everything into a single &lt;strong&gt;graph&lt;/strong&gt;. That makes interpretation easier both for traditional search engines and for AI systems that assemble context for answer generation.&lt;/p&gt;
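&lt;p&gt;The consolidation idea is small enough to sketch (Python for brevity; &lt;code&gt;consolidate&lt;/code&gt; is a hypothetical helper, not code from the site):&lt;/p&gt;

```python
import json

# Sketch of the consolidation idea: collect per-page schema nodes into a
# single @graph payload instead of emitting one script tag per node.
# Node shapes follow schema.org; the helper name is illustrative.
def consolidate(nodes: list) -> str:
    graph = {"@context": "https://schema.org", "@graph": nodes}
    return json.dumps(graph, ensure_ascii=False)

payload = consolidate([
    {"@type": "Organization", "name": "Brasil GEO", "url": "https://brasilgeo.ai"},
    {"@type": "WebSite", "name": "Alexandre Caramaschi",
     "url": "https://alexandrecaramaschi.com"},
])
```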

&lt;h2&gt;
  
  
  The 30 Schema Types I Implemented
&lt;/h2&gt;

&lt;p&gt;On &lt;a href="https://alexandrecaramaschi.com" rel="noopener noreferrer"&gt;alexandrecaramaschi.com&lt;/a&gt; (built with Next.js 16 + React 19, with 41 published articles) I implemented 30 Schema.org types organized into a single &lt;code&gt;@graph&lt;/code&gt;. Here is the full list with each one's role:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Identity and Main Entity&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Organization&lt;/strong&gt;: defines Brasil GEO as an entity, with sameAs pointing to Wikidata (Q138755989)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Person&lt;/strong&gt;: Alexandre Caramaschi with credentials, affiliations, and Wikidata (Q138755507)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WebSite&lt;/strong&gt;: site metadata, SearchAction for internal search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ProfilePage&lt;/strong&gt;: the "About" page as the entity's canonical profile&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Editorial Content&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Article&lt;/strong&gt;: each of the 41 articles with author, date, image&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BlogPosting&lt;/strong&gt;: blog posts with datePublished and dateModified&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TechArticle&lt;/strong&gt;: technical articles with proficiencyLevel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NewsArticle&lt;/strong&gt;: news-style content&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HowTo&lt;/strong&gt;: step-by-step guides with structured steps&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FAQPage&lt;/strong&gt;: frequently asked questions with mainEntity as an array&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Education and Courses&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Course&lt;/strong&gt;: GEO courses with provider and hasCourseInstance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CourseInstance&lt;/strong&gt;: specific instances with dates and delivery mode&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EducationalOrganization&lt;/strong&gt;: Brasil GEO as an education provider&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LearningResource&lt;/strong&gt;: complementary educational resources&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Media&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;VideoObject&lt;/strong&gt;: videos with thumbnailUrl, duration, uploadDate&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ImageObject&lt;/strong&gt;: structured images with contentUrl and caption&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MediaObject&lt;/strong&gt;: generic media objects&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Navigation and Structure&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;BreadcrumbList&lt;/strong&gt;: hierarchical navigation trail on every page&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SiteNavigationElement&lt;/strong&gt;: structured main menu&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ItemList&lt;/strong&gt;: ordered content lists (e.g. top articles)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CollectionPage&lt;/strong&gt;: collection pages (categories, tags)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Events and Interaction&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Event&lt;/strong&gt;: webinars, talks, and workshops&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review&lt;/strong&gt;: structured reviews of services&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ContactPoint&lt;/strong&gt;: contact channels with type and language&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Advanced SEO and AI&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Service&lt;/strong&gt;: services offered by Brasil GEO&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Offer&lt;/strong&gt;: offers linked to courses and services&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AggregateRating&lt;/strong&gt;: aggregate rating of services&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SpeakableSpecification&lt;/strong&gt;: passages optimized for voice reading&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ClaimReview&lt;/strong&gt;: claim verification (fact-checking)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DefinedTerm&lt;/strong&gt;: GEO glossary terms with formal definitions&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What llms.txt Is (the Résumé for AIs)
&lt;/h2&gt;

&lt;p&gt;If JSON-LD Schema is the identity card, &lt;strong&gt;llms.txt is the résumé you hand directly to the AI&lt;/strong&gt;. It is a plain-text file hosted at the root of your domain (&lt;code&gt;/llms.txt&lt;/code&gt;) that summarizes your site's entire structure in a format LLMs can consume efficiently.&lt;/p&gt;

&lt;p&gt;While &lt;code&gt;robots.txt&lt;/code&gt; tells a crawler what it &lt;em&gt;may&lt;/em&gt; access, &lt;code&gt;llms.txt&lt;/code&gt; tells an LLM what it &lt;em&gt;should&lt;/em&gt; read and how your content is organized.&lt;/p&gt;

&lt;p&gt;No &lt;a href="https://brasilgeo.ai" rel="noopener noreferrer"&gt;brasilgeo.ai&lt;/a&gt; — construído com Cloudflare Workers e 28 artigos HTML — mantenho dois arquivos:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;llms.txt&lt;/strong&gt;: 258 lines, 23KB, a concise map with links and descriptions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;llms-full.txt&lt;/strong&gt;: 42KB, expanded content for LLMs with large context windows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The structure follows a simplified markdown format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Brasil GEO&lt;/span&gt;
&lt;span class="gt"&gt;
&amp;gt; Consultoria especializada em Generative Engine Optimization (GEO).&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; Ajudamos empresas a ganhar visibilidade em ChatGPT, Gemini,&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; Perplexity e outros motores de IA generativa.&lt;/span&gt;

&lt;span class="gu"&gt;## Artigos&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;O Guia Completo de GEO&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="sx"&gt;https://brasilgeo.ai/artigos/guia-completo-geo&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;: Estratégias para otimizar conteúdo para motores de IA generativa.
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Schema JSON-LD para IA&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="sx"&gt;https://brasilgeo.ai/artigos/schema-jsonld-ia&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;: Como estruturar dados para visibilidade em LLMs.

&lt;span class="gu"&gt;## Cursos&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;Fundamentos de GEO&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="sx"&gt;https://brasilgeo.ai/cursos/fundamentos-geo&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;: Curso introdutório sobre Generative Engine Optimization.

&lt;span class="gu"&gt;## Repositórios Open-Source&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;geo-checklist&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="sx"&gt;https://github.com/alexandrebrt14-sys/geo-checklist&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;: Checklist completo de GEO com 80+ itens verificáveis.
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;llms-txt-templates&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="sx"&gt;https://github.com/alexandrebrt14-sys/llms-txt-templates&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;: Templates reutilizáveis para llms.txt.
&lt;span class="p"&gt;-&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;entity-consistency-playbook&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="sx"&gt;https://github.com/alexandrebrt14-sys/entity-consistency-playbook&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;: Playbook para consistência de entidades em GEO.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
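&lt;p&gt;A file like the one above can be generated from structured data (a hedged sketch; &lt;code&gt;render_llms_txt&lt;/code&gt; is hypothetical, not the actual brasilgeo.ai build step):&lt;/p&gt;

```python
# Hypothetical generator for an llms.txt map in the simplified markdown
# format shown above: an H1 site name, a blockquote summary, then one
# H2 section per content group with "[title](url): description" entries.
def render_llms_txt(site_name: str, summary: str, sections: dict) -> str:
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, entries in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, desc in entries:
            lines.append(f"- [{title}]({url}): {desc}")
        lines.append("")
    return "\n".join(lines)
```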



&lt;h2&gt;
  
  
  Practical Implementation in Next.js
&lt;/h2&gt;

&lt;p&gt;On alexandrecaramaschi.com, I created a &lt;code&gt;JsonLd.tsx&lt;/code&gt; component that renders the full graph into the &lt;code&gt;&amp;lt;head&amp;gt;&lt;/code&gt; via &lt;code&gt;layout.tsx&lt;/code&gt;. Here is a simplified version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// components/JsonLd.tsx&lt;/span&gt;
&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;JsonLdProps&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;JsonLd&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;JsonLdProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;jsonLd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@context&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://schema.org&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@graph&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;script&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"application/ld+json"&lt;/span&gt;
      &lt;span class="na"&gt;dangerouslySetInnerHTML&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;__html&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;jsonLd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In &lt;code&gt;layout.tsx&lt;/code&gt;, the component receives a graph assembled dynamically based on the route:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/layout.tsx&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;JsonLd&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/components/JsonLd&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;buildGraph&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/lib/schema&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;RootLayout&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;children&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;buildGraph&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// monta Organization, Person, WebSite&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;html&lt;/span&gt; &lt;span class="na"&gt;lang&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"pt-BR"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;head&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;JsonLd&lt;/span&gt; &lt;span class="na"&gt;graph&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;graph&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;head&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;body&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;children&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;body&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;html&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each article page adds its own types to the graph (Article, BreadcrumbList, FAQPage), and the component consolidates everything into a single &lt;code&gt;&amp;lt;script type="application/ld+json"&amp;gt;&lt;/code&gt;.&lt;/p&gt;
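&lt;p&gt;A minimal sketch of how per-page nodes can be merged into the shared graph (the node shapes and this &lt;code&gt;buildPageGraph&lt;/code&gt; helper are assumptions for illustration, not the production code):&lt;/p&gt;

```javascript
// Base graph shared by every page (site-wide entities).
function buildGraph() {
  return [
    { "@type": "Organization", name: "Brasil GEO", url: "https://brasilgeo.ai" },
    { "@type": "Person", name: "Alexandre Caramaschi" },
    { "@type": "WebSite", url: "https://alexandrecaramaschi.com" },
  ];
}

// Per-page nodes (e.g. an Article) are appended before rendering,
// so each page still emits a single consolidated @graph.
function buildPageGraph(pageNodes) {
  return [...buildGraph(), ...pageNodes];
}

const graph = buildPageGraph([
  { "@type": "Article", headline: "Schema JSON-LD and llms.txt" },
]);
console.log(graph.length); // 4 nodes in one graph
```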

&lt;h2&gt;
  
  
  Implementation on Cloudflare Workers
&lt;/h2&gt;

&lt;p&gt;On &lt;a href="https://brasilgeo.ai" rel="noopener noreferrer"&gt;brasilgeo.ai&lt;/a&gt;, llms.txt and llms-full.txt are served directly by the Cloudflare Worker. The logic is simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// worker.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handleRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pathname&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/llms.txt&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;LLMS_TXT_CONTENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text/plain; charset=utf-8&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Cache-Control&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;public, max-age=86400&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pathname&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/llms-full.txt&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;LLMS_FULL_TXT_CONTENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text/plain; charset=utf-8&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Cache-Control&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;public, max-age=86400&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// ... demais rotas&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The 24-hour cache (&lt;code&gt;max-age=86400&lt;/code&gt;) keeps responses fast without letting the content go stale for long.&lt;/p&gt;

&lt;h2&gt;
  
  
  Verifiable Results
&lt;/h2&gt;

&lt;p&gt;Implementing Schema JSON-LD and llms.txt is not a theoretical exercise. Here are the concrete results:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Entity consistency score&lt;/strong&gt; validated automatically by &lt;code&gt;lint-content.js&lt;/code&gt;, which runs 44+ checks per execution to verify that names, credentials, and affiliations stay consistent across all content&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wikidata presence&lt;/strong&gt;: Person (Q138755507) and Organization (Q138755989) linked via &lt;code&gt;sameAs&lt;/code&gt; in the Schema, creating an entity anchor that LLMs recognize&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;6 open-source repositories&lt;/strong&gt; on GitHub referenced in llms.txt, creating distributed authority signals: &lt;a href="https://github.com/alexandrebrt14-sys/geo-checklist" rel="noopener noreferrer"&gt;geo-checklist&lt;/a&gt;, &lt;a href="https://github.com/alexandrebrt14-sys/llms-txt-templates" rel="noopener noreferrer"&gt;llms-txt-templates&lt;/a&gt;, &lt;a href="https://github.com/alexandrebrt14-sys/entity-consistency-playbook" rel="noopener noreferrer"&gt;entity-consistency-playbook&lt;/a&gt;, geo-taxonomy, geo-orchestrator, and landing-page-geo&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-LLM pipeline&lt;/strong&gt; with geo-orchestrator coordinating 5 LLMs (Perplexity for research, GPT-4o for writing, Gemini for analysis, Groq for classification, Claude for review), so the content it produces is optimized for multiple engines from the start&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Structured crosslinks&lt;/strong&gt; between the 41 articles on alexandrecaramaschi.com and the 28 on brasilgeo.ai, with mutual references that reinforce topical authority&lt;/li&gt;
&lt;/ul&gt;
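&lt;p&gt;A minimal sketch of what one such consistency check can look like (the canonical values, page list, and function name here are hypothetical; the real &lt;code&gt;lint-content.js&lt;/code&gt; runs 44+ checks):&lt;/p&gt;

```javascript
// Canonical entity facts every page must agree on (hypothetical values).
const CANON = {
  name: "Alexandre Caramaschi",
  role: "CEO",
};

// Flags pages that mention the canonical name without the canonical role,
// e.g. the same person listed as "Columnist" on one page and "CEO" on another.
function checkEntityConsistency(pages) {
  const errors = [];
  for (const page of pages) {
    const mentionsName = page.text.includes(CANON.name);
    const mentionsRole = page.text.includes(CANON.role);
    if (mentionsName) {
      if (!mentionsRole) {
        errors.push(page.path + ': name mentioned without role "' + CANON.role + '"');
      }
    }
  }
  return errors;
}

const errors = checkEntityConsistency([
  { path: "/about", text: "Alexandre Caramaschi, CEO of Brasil GEO" },
  { path: "/blog/post", text: "Alexandre Caramaschi, Columnist" },
]);
console.log(errors.length); // 1 inconsistent page
```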

&lt;h2&gt;
  
  
  A Step-by-Step Guide to Start Today
&lt;/h2&gt;

&lt;p&gt;If you want to implement Schema JSON-LD and llms.txt in your project, follow these 5 steps:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Define your main entity
&lt;/h3&gt;

&lt;p&gt;Create an &lt;code&gt;Organization&lt;/code&gt; or &lt;code&gt;Person&lt;/code&gt; Schema with &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;url&lt;/code&gt;, &lt;code&gt;description&lt;/code&gt;, and &lt;code&gt;sameAs&lt;/code&gt; (LinkedIn, GitHub, Wikidata). This is the foundation for everything else.&lt;/p&gt;
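&lt;p&gt;A minimal example of such an entity; every value below is a placeholder (the Wikidata ID shown is not a real one):&lt;/p&gt;

```javascript
// A minimal Person entity: name, url, description, and sameAs links
// (LinkedIn, GitHub, Wikidata) form the anchor everything else references.
const personEntity = {
  "@type": "Person",
  name: "Jane Doe",
  url: "https://example.com",
  description: "Specialist in Generative Engine Optimization.",
  sameAs: [
    "https://www.linkedin.com/in/janedoe",
    "https://github.com/janedoe",
    "https://www.wikidata.org/wiki/Q000000",
  ],
};

console.log(personEntity.sameAs.length); // 3 entity anchors
```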

&lt;h3&gt;
  
  
  2. Implement a single &lt;code&gt;@graph&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Instead of multiple &lt;code&gt;&amp;lt;script type="application/ld+json"&amp;gt;&lt;/code&gt; tags, consolidate everything into one &lt;code&gt;@graph&lt;/code&gt;. This avoids conflicts and simplifies maintenance.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Add per-page types
&lt;/h3&gt;

&lt;p&gt;Each page should declare its specific types: &lt;code&gt;Article&lt;/code&gt; for posts, &lt;code&gt;FAQPage&lt;/code&gt; for FAQs, &lt;code&gt;Course&lt;/code&gt; for courses. Use the &lt;a href="https://validator.schema.org/" rel="noopener noreferrer"&gt;Schema.org Validator&lt;/a&gt; to verify the output.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Create your llms.txt
&lt;/h3&gt;

&lt;p&gt;Start with the basic structure: a title, a description in a blockquote, and sections with links. Use the template from the &lt;a href="https://github.com/alexandrebrt14-sys/llms-txt-templates" rel="noopener noreferrer"&gt;llms-txt-templates&lt;/a&gt; repository as a starting point.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Automate validation
&lt;/h3&gt;

&lt;p&gt;Implement a lint script that checks entity consistency. The &lt;a href="https://github.com/alexandrebrt14-sys/entity-consistency-playbook" rel="noopener noreferrer"&gt;entity-consistency-playbook&lt;/a&gt; has a complete guide on how to do this.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Schema JSON-LD and llms.txt are not passing trends; they are the visibility infrastructure for the generative AI era. If your site has no structured data that LLMs can interpret, you are invisible to a growing share of digital traffic.&lt;/p&gt;

&lt;p&gt;I started with one Schema type. Today I have 30. I started without llms.txt. Today I have two files totaling 65KB of structured context. Every addition was incremental, testable, and verifiable.&lt;/p&gt;

&lt;p&gt;If you want a complete roadmap, the &lt;a href="https://github.com/alexandrebrt14-sys/geo-checklist" rel="noopener noreferrer"&gt;geo-checklist&lt;/a&gt; has more than 80 verifiable items for GEO. And the &lt;a href="https://github.com/alexandrebrt14-sys/entity-consistency-playbook" rel="noopener noreferrer"&gt;entity-consistency-playbook&lt;/a&gt; shows how to maintain the kind of entity consistency that makes a real difference in AI citations.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Alexandre Caramaschi is the CEO of &lt;a href="https://brasilgeo.ai" rel="noopener noreferrer"&gt;Brasil GEO&lt;/a&gt;, former CMO at Semantix (Nasdaq), and co-founder of AI Brasil. A specialist in Generative Engine Optimization, he helps companies get cited by ChatGPT, Gemini, Perplexity, and Claude.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>seo</category>
      <category>schema</category>
      <category>jsonld</category>
      <category>ai</category>
    </item>
    <item>
      <title>How We Built a 36-Course Educational Platform in 10 Days, and What We Learned Along the Way</title>
      <dc:creator>Alexandre Caramaschi</dc:creator>
      <pubDate>Sun, 29 Mar 2026 18:43:46 +0000</pubDate>
      <link>https://dev.to/alexandrebrt14sys/como-construimos-uma-plataforma-educacional-de-36-cursos-em-10-dias-e-o-que-aprendemos-no-caminho-4flm</link>
      <guid>https://dev.to/alexandrebrt14sys/como-construimos-uma-plataforma-educacional-de-36-cursos-em-10-dias-e-o-que-aprendemos-no-caminho-4flm</guid>
      <description>&lt;p&gt;Em 19 de março de 2026, commitamos a primeira linha de código do que viria a se tornar a plataforma educacional da Brasil GEO. Dez dias depois, tínhamos 36 cursos, 401 módulos, um sistema de gamificação completo e um painel administrativo com auditoria de segurança feita por cinco modelos de linguagem simultaneamente.&lt;/p&gt;

&lt;p&gt;Este artigo documenta o processo — não como vitrine, mas como estudo de caso. Cada decisão arquitetural carregou consequências. Cada incidente revelou premissas erradas. E cada correção ensinou algo que manuais de engenharia raramente cobrem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The initial thesis: education as authority infrastructure
&lt;/h2&gt;

&lt;p&gt;Brasil GEO was born as a consultancy in Generative Engine Optimization, the discipline of making brands citable by ChatGPT, Gemini, and Perplexity. But consulting scales linearly. Education scales exponentially.&lt;/p&gt;

&lt;p&gt;The hypothesis was direct: by creating a free, open educational platform about GEO, AI, and development, we would build three assets at once: technical authority in the eyes of LLMs, an engaged user base, and a pipeline of qualified consulting leads.&lt;/p&gt;

&lt;p&gt;The roadmap was structured in five sequential stages, each unlocking the next:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 1: Solve Invisibility (60%)&lt;/strong&gt;&lt;br&gt;
Indexing, sitemap, IndexNow, security headers. We went from zero to 78 URLs submitted to three search engines.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 2: Eliminate Violations (70%)&lt;/strong&gt;&lt;br&gt;
Entity consistency. The same professional appeared as "Columnist" in one place and "CEO" in another, with divergent biographies across eight platforms. We fixed every one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 3: Content Engine (80%)&lt;/strong&gt;&lt;br&gt;
The educational platform itself. 36 courses spanning basic Python to autonomous AI agents. 401 modules. 51 interactive questions. An XP system, 13 badges, streaks, and certificates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 4: External Authority (20%)&lt;/strong&gt;&lt;br&gt;
Press, academia, backlinks. Five pitches written, one academic working paper in preparation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 5: Dominate the Niche (15%)&lt;/strong&gt;&lt;br&gt;
Knowledge Panel, SERP rankings, monetization. The long-term horizon.&lt;/p&gt;

&lt;h2&gt;
  
  
  The platform by the numbers
&lt;/h2&gt;

&lt;p&gt;After 10 days of intensive development and 367 commits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;115,000 lines of TypeScript in production&lt;/li&gt;
&lt;li&gt;344 TypeScript/TSX files&lt;/li&gt;
&lt;li&gt;36 courses with certification&lt;/li&gt;
&lt;li&gt;401 learning modules&lt;/li&gt;
&lt;li&gt;An estimated 140 hours of content&lt;/li&gt;
&lt;li&gt;51 interactive questions (QuizEngine)&lt;/li&gt;
&lt;li&gt;13 gamification badges&lt;/li&gt;
&lt;li&gt;46 articles published across 5 platforms&lt;/li&gt;
&lt;li&gt;16 admin routes (7 pages + 9 APIs)&lt;/li&gt;
&lt;li&gt;13 live data sources in the metrics dashboard&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The stack: Next.js 16, React 19, Tailwind CSS 4, Supabase (auth + database), Vercel (deploy), Resend (transactional email).&lt;/p&gt;

&lt;h2&gt;
  
  
  What broke, and what we learned
&lt;/h2&gt;

&lt;p&gt;No ambitious project survives contact with production without scars. We documented three significant incidents.&lt;/p&gt;

&lt;h3&gt;
  
  
  Incident 1: The silent accent corruption (March 27)
&lt;/h3&gt;

&lt;p&gt;We wrote a script to fix accents in PT-BR text. It worked perfectly on visible text. But it also "fixed" URLs, turning &lt;code&gt;/educacao&lt;/code&gt; into &lt;code&gt;/educação&lt;/code&gt; (with cedilla and tilde). Fifty-five internal links broke simultaneously.&lt;/p&gt;

&lt;p&gt;The lesson: automation without scope limits is a gun pointed at your own foot. We made URL protection a permanent rule: slugs are always ASCII; accents appear only in rendered text.&lt;/p&gt;
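&lt;p&gt;A minimal sketch of the resulting guard (the function name is an assumption; the rule it encodes is the one above, that slugs must stay pure ASCII):&lt;/p&gt;

```javascript
// Rejects any slug containing non-ASCII characters, so an accent-fixing
// pass can be blocked from ever touching URL paths.
function isSafeSlug(slug) {
  return /^[\x00-\x7F]*$/.test(slug);
}

console.log(isSafeSlug("/educacao"));  // true  (ASCII only)
console.log(isSafeSlug("/educação")); // false (accented characters)
```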

&lt;h3&gt;
  
  
  Incident 2: The rate limiter that blocked the entire site (March 29)
&lt;/h3&gt;

&lt;p&gt;We implemented rate limiting of 30 requests per minute as protection against abuse. The problem: we applied the limit to every route, including HTML pages, CSS, and JavaScript. A single page visit fires 15-20 asset requests. Two consecutive visits already blew past the limit.&lt;/p&gt;

&lt;p&gt;Real users were served error JSON instead of the page. The site was inaccessible for 30 minutes until we diagnosed the cause.&lt;/p&gt;

&lt;p&gt;The fix: rate limiting exclusively on &lt;code&gt;/api/*&lt;/code&gt; routes, with the limit raised to 120 requests per minute.&lt;/p&gt;
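&lt;p&gt;A minimal sketch of the scoping rule (the function name is hypothetical; in a Next.js app the middleware &lt;code&gt;matcher&lt;/code&gt; config can express the same restriction declaratively):&lt;/p&gt;

```javascript
// Only API routes are counted against the rate limit;
// pages and static assets always pass through untouched.
function shouldRateLimit(pathname) {
  return pathname.startsWith("/api/");
}

console.log(shouldRateLimit("/api/metrics")); // true
console.log(shouldRateLimit("/educacao"));    // false: pages are exempt
console.log(shouldRateLimit("/styles.css"));  // false: assets are exempt
```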

&lt;h3&gt;
  
  
  Incident 3: The infinite admin login loop (March 29)
&lt;/h3&gt;

&lt;p&gt;The admin panel had a layout that checked the user session and redirected to &lt;code&gt;/admin/login&lt;/code&gt; if unauthenticated. The problem: &lt;code&gt;/admin/login&lt;/code&gt; was a child of &lt;code&gt;/admin&lt;/code&gt;, so it inherited the same layout. The layout checked the session, found none, redirected to login, which triggered the layout again. An infinite loop.&lt;/p&gt;

&lt;p&gt;The solution required restructuring the directory layout using Next.js Route Groups: a &lt;code&gt;(protected)&lt;/code&gt; folder for routes that require authentication, with the login page outside that structure.&lt;/p&gt;

&lt;h2&gt;
  
  
  The security audit with five LLMs
&lt;/h2&gt;

&lt;p&gt;We put the admin panel through a full audit using five language models in parallel: Claude Opus for architecture, GPT-4o for writing, Gemini for analysis, Perplexity for research into known vulnerabilities, and Groq for fast classification.&lt;/p&gt;

&lt;p&gt;The result was revealing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A critical authentication-bypass vulnerability: an old endpoint that checked the email without validating the password&lt;/li&gt;
&lt;li&gt;No CSRF protection on any admin endpoint&lt;/li&gt;
&lt;li&gt;In-memory rate limiters that reset on every deploy (ineffective in serverless)&lt;/li&gt;
&lt;li&gt;A logout that did not invalidate session cookies on the server&lt;/li&gt;
&lt;li&gt;Input validation based on manual &lt;code&gt;typeof&lt;/code&gt; checks, with no formal schema&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We fixed everything in a single session: removed the vulnerable endpoint, implemented CSRF protection via Origin/Referer validation, migrated the rate limiter to distributed Redis (Upstash), built a server-side logout that clears SSR cookies, and replaced all manual validation with Zod schemas.&lt;/p&gt;
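&lt;p&gt;A minimal sketch of the Origin check behind that CSRF protection (the allowed origin is a placeholder; the production version presumably also handles the Referer fallback and non-browser clients):&lt;/p&gt;

```javascript
// Rejects state-changing requests whose Origin header does not match
// the site's own origin: a lightweight CSRF defense.
const ALLOWED_ORIGIN = "https://alexandrecaramaschi.com";

function isSameOrigin(headers) {
  const origin = headers.origin || "";
  return origin === ALLOWED_ORIGIN;
}

console.log(isSameOrigin({ origin: "https://alexandrecaramaschi.com" })); // true
console.log(isSameOrigin({ origin: "https://evil.example" }));            // false
```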

&lt;h2&gt;
  
  
  What learners get
&lt;/h2&gt;

&lt;p&gt;The platform is entirely free. Anyone can create an account, access all 36 courses, and track their progress. The gamification system is not cosmetic: badges, XP, and streaks create retention loops based on positive reinforcement.&lt;/p&gt;

&lt;p&gt;The courses cover an arc from basic to advanced: setting up a development environment, Python, Node.js, GitHub, Claude Code, MCP (Model Context Protocol), advanced prompt engineering, SEO and GEO, autonomous AI agents, data with Python, plus vertical courses for sectors such as healthcare, agribusiness, tourism, and law.&lt;/p&gt;

&lt;p&gt;Each course has a digital certificate issued automatically via API and delivered by email. The interactive quizzes validate real comprehension, not just attendance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Next steps
&lt;/h2&gt;

&lt;p&gt;Three immediate priorities define the next quarter:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-factor authentication for admins.&lt;/strong&gt; The TOTP infrastructure already exists as a stub. What remains is integrating the otplib library and generating QR codes for enrollment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Content scale via automated cross-posting.&lt;/strong&gt; A pipeline that publishes articles simultaneously to DEV.to, Medium, and Hashnode, with canonical URLs pointing to the main site.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;External authority.&lt;/strong&gt; Publishing the academic working paper on SSRN and Preprints.org. Sending pitches to Meio e Mensagem and technology outlets.&lt;/p&gt;

&lt;p&gt;The platform lives at &lt;a href="https://alexandrecaramaschi.com/educacao" rel="noopener noreferrer"&gt;alexandrecaramaschi.com/educacao&lt;/a&gt;. The full roadmap, with live metrics from 13 data sources, is at &lt;a href="https://alexandrecaramaschi.com/roadmap" rel="noopener noreferrer"&gt;alexandrecaramaschi.com/roadmap&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Building in public means accepting that the process is as valuable as the product. The three incidents documented above taught more about production engineering than any tutorial could.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Alexandre Caramaschi is the CEO of Brasil GEO and former CMO at Semantix (Nasdaq). He writes about GEO, AI, and algorithmic visibility.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>brazildevs</category>
      <category>education</category>
      <category>nextjs</category>
      <category>security</category>
    </item>
    <item>
      <title>How 5 LLMs Built 9 Free Courses in One Afternoon: Multi-LLM Orchestration for Education</title>
      <dc:creator>Alexandre Caramaschi</dc:creator>
      <pubDate>Thu, 26 Mar 2026 20:24:53 +0000</pubDate>
      <link>https://dev.to/alexandrebrt14sys/how-5-llms-built-9-free-courses-in-one-afternoon-multi-llm-orchestration-for-education-4nl0</link>
      <guid>https://dev.to/alexandrebrt14sys/how-5-llms-built-9-free-courses-in-one-afternoon-multi-llm-orchestration-for-education-4nl0</guid>
      <description>&lt;p&gt;Last week, I published 9 free educational courses with 91 modules and approximately 19 hours of hands-on content. The total cost in AI APIs was $10.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;There is no free, integrated, Portuguese-language material that takes someone from absolute zero to mastering AI tools like Claude Code, MCP, and GEO (Generative Engine Optimization). Existing tutorials are fragmented, mostly in English, and lack practical context.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture: Multi-LLM Orchestration
&lt;/h2&gt;

&lt;p&gt;I built a Python orchestrator that coordinates 5 language models working in parallel:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude Opus (Anthropic)&lt;/strong&gt; — task decomposition, architecture, and code generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GPT-4o (OpenAI)&lt;/strong&gt; — long-form writing and copywriting&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini 2.5 Flash (Google)&lt;/strong&gt; — fast analysis and classification&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Perplexity Sonar&lt;/strong&gt; — live research with source citations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Llama 3.3 70B (Groq)&lt;/strong&gt; — ultra-fast summarization&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pipeline operates in sequential waves: research, analysis, parallel writing, classification, architecture, code generation, and review.&lt;/p&gt;

&lt;p&gt;Each LLM has an adaptive score based on success rate (weight 0.6), cost (0.2), and latency (0.2). The system learns which model performs best for each task type.&lt;/p&gt;
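&lt;p&gt;A minimal sketch of such an adaptive score with the stated weights (written here in JavaScript for readability; the normalization of cost and latency against per-task maxima is an assumption about the orchestrator's internals):&lt;/p&gt;

```javascript
// Scores a model from its rolling stats: higher success rate is better,
// higher cost and latency are worse. Weights: 0.6 / 0.2 / 0.2.
function adaptiveScore(stats) {
  const successTerm = stats.successRate; // already in [0, 1]
  const costTerm = 1 - Math.min(stats.costUsd / stats.maxCostUsd, 1);
  const latencyTerm = 1 - Math.min(stats.latencyMs / stats.maxLatencyMs, 1);
  return 0.6 * successTerm + 0.2 * costTerm + 0.2 * latencyTerm;
}

const score = adaptiveScore({
  successRate: 0.9,
  costUsd: 0.02, maxCostUsd: 0.10,
  latencyMs: 1200, maxLatencyMs: 6000,
});
console.log(score.toFixed(2)); // "0.86"
```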

&lt;h2&gt;
  
  
  The Real Numbers
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;6 courses created simultaneously by 6 parallel Claude Code CLI agents&lt;/li&gt;
&lt;li&gt;6,439 lines of code in approximately 15 minutes&lt;/li&gt;
&lt;li&gt;Build verified automatically before each deploy&lt;/li&gt;
&lt;li&gt;Automatic deployment via Vercel in under 90 seconds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The 9 courses cover: VS Code, GitHub, Python, Node.js, Claude Code CLI, MCP with Chrome, Complete Setup, From SEO to GEO (with real data: 58.5% of searches are zero-click in 2025), and Technical Behind-the-Scenes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend&lt;/strong&gt;: Next.js 16 + React 19 + Tailwind CSS 4&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploy&lt;/strong&gt;: Vercel (auto on push to master)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Progress tracking&lt;/strong&gt;: localStorage (no database needed)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Certificates&lt;/strong&gt;: Resend API for email delivery&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Design system&lt;/strong&gt;: Salesforce-inspired (accent #0176d3, radius 8px)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  FinOps and Cost Control
&lt;/h2&gt;

&lt;p&gt;The orchestrator includes built-in financial governance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Budget guards: $5 per execution limit&lt;/li&gt;
&lt;li&gt;Rate limiting per provider (token bucket algorithm)&lt;/li&gt;
&lt;li&gt;Circuit breakers for provider resilience&lt;/li&gt;
&lt;li&gt;Daily limits per provider&lt;/li&gt;
&lt;li&gt;Total cost for all content: approximately $10&lt;/li&gt;
&lt;/ul&gt;
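&lt;p&gt;A minimal sketch of the token bucket behind the per-provider rate limiting (written in JavaScript for readability; capacity and refill rate are placeholders):&lt;/p&gt;

```javascript
// Token bucket: each request consumes one token; tokens refill at a fixed
// rate up to the bucket's capacity. An empty bucket means "throttle".
class TokenBucket {
  constructor(capacity, refillPerSecond) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillPerSecond = refillPerSecond;
    this.lastRefill = Date.now();
  }

  tryConsume() {
    // Refill proportionally to the time elapsed since the last call.
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSecond);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

const bucket = new TokenBucket(2, 1); // 2-token burst, 1 token/second refill
console.log(bucket.tryConsume()); // true
console.log(bucket.tryConsume()); // true
console.log(bucket.tryConsume()); // false: drained faster than the refill
```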

&lt;h2&gt;
  
  
  Gamification
&lt;/h2&gt;

&lt;p&gt;Each course features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Collectible badges (unique per course)&lt;/li&gt;
&lt;li&gt;Email-delivered certificates via Resend API&lt;/li&gt;
&lt;li&gt;Global cross-course progress bar&lt;/li&gt;
&lt;li&gt;CSS-only celebration animations&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Implications
&lt;/h2&gt;

&lt;p&gt;The cost of $10 to generate 19 hours of structured educational content redefines the economics of corporate education. The same process that created 9 courses could create 90. The limitation is no longer production capacity — it is curation and editorial quality.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Full portal: &lt;a href="https://alexandrecaramaschi.com/educacao" rel="noopener noreferrer"&gt;alexandrecaramaschi.com/educacao&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Behind-the-scenes course: &lt;a href="https://alexandrecaramaschi.com/educacao/bastidores" rel="noopener noreferrer"&gt;alexandrecaramaschi.com/educacao/bastidores&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All courses are 100% free. No paywall. No mandatory registration.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Alexandre Caramaschi — CEO at Brasil GEO | Former CMO at Semantix (Nasdaq) | Co-founder of AI Brasil&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>education</category>
      <category>llm</category>
      <category>webdev</category>
    </item>
    <item>
      <title>How We Used 5 LLM APIs and 25 AI Agents to Write a 60-Page Book in One Session</title>
      <dc:creator>Alexandre Caramaschi</dc:creator>
      <pubDate>Wed, 25 Mar 2026 22:55:38 +0000</pubDate>
      <link>https://dev.to/alexandrebrt14sys/how-we-used-5-llm-apis-and-25-ai-agents-to-write-a-60-page-book-in-one-session-39ei</link>
      <guid>https://dev.to/alexandrebrt14sys/how-we-used-5-llm-apis-and-25-ai-agents-to-write-a-60-page-book-in-one-session-39ei</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;We wanted to produce a 60-page, 30,000-word book in Portuguese about four Brazilian fintech founders -- Augusto Lins (Stone), Andre Street (Stone/Teya), David Velez (Nubank), and Guilherme Benchimol (XP) -- told through their own reconstructed voices, narrated by Ram Charan. The book needed to feel like four real humans speaking, not like a chatbot paraphrasing Wikipedia.&lt;/p&gt;

&lt;p&gt;A single LLM call cannot do this. You get voice blending (everyone sounds the same by chapter three), factual hallucinations in biographical data, and zero structural coherence across 30k words. We needed an orchestration layer.&lt;/p&gt;

&lt;p&gt;The result: &lt;strong&gt;"5 Fundadores, 5 Segundos, 1 Futuro"&lt;/strong&gt; -- 30,329 words, 4 distinguishable voices, 8 chapters, 7 analytical notes, fact-checked against primary sources, published at &lt;a href="https://alexandrecaramaschi.com/founders" rel="noopener noreferrer"&gt;alexandrecaramaschi.com/founders&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Here is what the pipeline looked like, what broke, and what we learned.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture: The 6-Engine Model
&lt;/h2&gt;

&lt;p&gt;The core insight: &lt;strong&gt;use each model for what it does best&lt;/strong&gt;, not one model for everything.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+-------------------+----------------------------------------+
|  ENGINE           |  ROLE                                  |
+-------------------+----------------------------------------+
|  Claude Opus      |  Orchestrator + narrative writing      |
|                   |  Voice personas, assembly, QA          |
|  Perplexity       |  Real-time web research                |
|  (Sonar Pro)      |  Fact-checking with verifiable sources |
|  Gemini 2.5 Pro   |  Full-manuscript coherence analysis    |
|                   |  (1M+ context window)                  |
|  ChatGPT GPT-4o   |  Creative variations: openings,        |
|                   |  titles, dialogue scenes               |
|  Groq/Llama 3.3   |  Fast rough drafts, PT-BR accent fix,  |
|                   |  rapid iteration                       |
|  Claude Sonnet    |  HTML/PDF formatting, React component, |
|                   |  Schema.org, deploy pipeline           |
+-------------------+----------------------------------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why not just Claude for everything? Three reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Perplexity's web search&lt;/strong&gt; returns sources you can verify. LLMs trained on static data fabricate citations -- Perplexity anchors facts to real URLs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini's 1M+ context window&lt;/strong&gt; can read the entire manuscript in one pass and detect cross-chapter redundancies that no other model can see.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Groq's speed&lt;/strong&gt; (thousands of tokens/second) makes iteration cheap. Rough drafts that take Opus 90 seconds take Groq 3 seconds.&lt;/li&gt;
&lt;/ol&gt;
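&lt;p&gt;The routing logic behind this division of labor fits in a dictionary. A minimal sketch in Python -- the task labels and model identifiers below are illustrative, not the orchestrator's actual names:&lt;/p&gt;

```python
# Minimal task router for the 6-engine model. Task labels and model
# identifiers are illustrative, not the orchestrator's actual names.
ROUTING = {
    "narrative":   "claude-opus",       # depth, voice, assembly, QA
    "research":    "perplexity-sonar",  # web-grounded, verifiable sources
    "coherence":   "gemini-2.5-pro",    # 1M+ context, whole-manuscript reads
    "variations":  "gpt-4o",            # openings, titles, dialogue
    "rough_draft": "groq-llama-3.3",    # thousands of tokens/second
    "formatting":  "claude-sonnet",     # HTML/PDF, Schema.org, deploy
}

def pick_engine(task_type: str) -> str:
    """Return the engine for a task type, defaulting to the orchestrator."""
    return ROUTING.get(task_type, "claude-opus")
```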




&lt;h2&gt;
  
  
  The Pipeline: 10 Phases, 43 Agent Calls
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PHASE 0: BOOTSTRAP (Orchestrator)
  |  Generate 5 system prompts (1 per persona)
  |  Generate 8 chapter briefs
  |  Generate global style guide
  v
PHASE 1: DEEP RESEARCH (7 agents in PARALLEL)
  |  6x Perplexity: one dossier per founder + Charan + 2026 context
  |  1x Gemini: cross-analysis of all 6 dossiers -&amp;gt; convergence map
  v
PHASE 2: WRITING WAVE 1 -- Chapters 1-4 (9 agents in PARALLEL)
  |  4x Opus: each writes ONE founder's voice for chapters 1-4
  |  1x Opus: Charan writes Preface + Prologue + Notes #1-2
  |  1x GPT-4o: 12 alternative openings + 4 epigraphs
  |  2x Groq: fast rough drafts as raw material
  |  1x Gemini: real-time coherence monitor
  v
PHASE 3: WRITING WAVE 2 -- Chapters 5-8 (9 agents in PARALLEL)
  |  Same structure as Phase 2
  |  + Charan assembles chapters 1-4 (interleaving 4 voices)
  v
PHASE 4: MANUSCRIPT ASSEMBLY (1 Opus agent -- Charan)
  |  Interleave voices, write transitions, write Epilogue
  |  -&amp;gt; manuscrito_v1.md (~48,000 words raw)
  v
PHASE 5: CROSS-MODEL REVIEW (7 agents in PARALLEL)
  |  4x Opus: each founder-persona reads FULL manuscript
  |           "Does this sound like me? Any data wrong?"
  |  1x Perplexity: fact-check every number against live web
  |  1x Gemini: structural analysis (pacing, arcs, redundancy)
  |  1x Groq: fast PT-BR accent/grammar sweep
  v
PHASE 6: INTEGRATED REWRITE (1 Opus agent)
  |  Incorporate all 7 review reports
  |  Fix 19 factual errors, remove fabricated citations
  |  Resolve redundancies, equalize founder presence
  |  -&amp;gt; manuscrito_v2.md
  v
PHASE 7: MULTI-SPECIALIST POLISH (4 agents in PARALLEL)
  |  Opus: narrative flow + chapter hooks
  |  Groq: PT-BR final accent check
  |  Sonnet: Markdown formatting + metadata
  |  GPT-4o: final title selection + back-cover copy
  v
PHASE 8: FINAL QA (1 Opus agent)
  |  Full read-through simulating first-time reader
  |  13-point checklist (voices, hooks, Charan, accents, entities)
  |  -&amp;gt; manuscrito_final.md (30,329 words)
  v
PHASE 9: PUBLISH (3 Sonnet agents in PARALLEL)
  |  HTML + PDF generation
  |  React/Next.js component for /founders
  |  SEO: Schema.org Book markup, OG tags, sitemap
  v
PHASE 10: DEPLOY
  |  Vercel deploy + IndexNow
  |  Health check: /founders returns 200
  |  DONE
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Total: &lt;strong&gt;43 agent calls across 6 APIs, with up to 9 agents running simultaneously&lt;/strong&gt;.&lt;/p&gt;
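&lt;p&gt;The fan-out/fan-in shape of each phase can be sketched with asyncio. Everything here is illustrative -- &lt;code&gt;run_agent&lt;/code&gt; stands in for a real provider SDK call, and the agent names are made up:&lt;/p&gt;

```python
import asyncio

async def run_agent(name: str, prompt: str) -> str:
    """Stand-in for a provider SDK call; real code would await an API."""
    await asyncio.sleep(0)  # simulate network I/O
    return f"{name}: drafted from {len(prompt)} chars of brief"

async def run_phase(agents) -> list:
    """Fan out all agents of a phase concurrently, fan in their outputs."""
    return await asyncio.gather(*(run_agent(n, p) for n, p in agents))

# Phase 2 shape (truncated to three of the nine agents; names illustrative)
agents = [
    ("opus-augusto", "chapters 1-4 brief"),
    ("opus-andre", "chapters 1-4 brief"),
    ("gpt4o-variations", "12 alternative openings"),
]
results = asyncio.run(run_phase(agents))
```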




&lt;h2&gt;
  
  
  Quality Gates Between Phases
&lt;/h2&gt;

&lt;p&gt;Not every phase transition was automatic. We implemented quality gates -- checkpoints where the orchestrator evaluates whether output meets minimum criteria before proceeding.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GATE 1 (after Phase 1 -&amp;gt; Phase 2):
  CHECK: Each dossier has &amp;gt;= 15 verified citations with sources
  CHECK: Convergence map identifies &amp;gt;= 5 shared patterns
  CHECK: No founder dossier is &amp;lt; 3,000 words
  FAIL ACTION: Re-run Perplexity with expanded queries

GATE 2 (after Phase 2 -&amp;gt; Phase 3):
  CHECK: Voice distinctiveness score (Gemini evaluates)
  CHECK: No two founders share &amp;gt; 30% identical phrasing
  CHECK: Each founder section is within 20% of target word count
  FAIL ACTION: Re-prompt specific founder agents with
               reinforced persona instructions

GATE 3 (after Phase 5 -&amp;gt; Phase 6):
  CHECK: Zero critical factual errors remaining
  CHECK: Fabricated citation count = 0
  CHECK: Redundancy score below threshold
  FAIL ACTION: Return to Phase 5 with targeted re-checks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The gates prevented cascading errors. Without them, a weak dossier in Phase 1 would produce a weak chapter in Phase 2, which would produce a weak review in Phase 5. By catching problems early, we avoided expensive rewrites downstream.&lt;/p&gt;
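&lt;p&gt;A gate reduces to a list of named predicates over the phase output. A minimal sketch of Gate 1 -- the field names and sample records are illustrative, but the thresholds are the ones from the spec above:&lt;/p&gt;

```python
# Gate 1 as a list of named predicates over Phase 1 output.
def gate_1(dossiers) -> list:
    """Return failed checks; an empty list means the gate passes."""
    failures = []
    for d in dossiers:
        if 15 > d["citations"]:
            failures.append(f"{d['founder']}: only {d['citations']} citations")
        if 3000 > d["word_count"]:
            failures.append(f"{d['founder']}: dossier under 3,000 words")
    return failures

dossiers = [
    {"founder": "velez",  "citations": 18, "word_count": 4200},
    {"founder": "street", "citations": 12, "word_count": 3500},
]
# street's dossier fails the citation check, so the FAIL ACTION
# (re-run Perplexity with expanded queries) would trigger for him only.
```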




&lt;h2&gt;
  
  
  The System Prompt Architecture
&lt;/h2&gt;

&lt;p&gt;Each persona's system prompt was not a simple instruction -- it was a layered document with five components:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LAYER 1: IDENTITY
  Who you are, your archetype, your emotional core

LAYER 2: VOICE RULES
  Sentence length distribution, vocabulary whitelist,
  vocabulary blacklist, rhetorical patterns

LAYER 3: ANTI-CONTAMINATION
  "You are NOT [other founder]. If you find yourself
   using [specific phrases], stop and rewrite."

LAYER 4: CHAPTER BRIEF
  What this specific chapter is about, what angle
  this founder brings, what tension to explore

LAYER 5: CONTEXT INJECTION
  Research dossier, convergence map, previous chapters
  (for Wave 2), coherence report
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The anti-contamination layer (Layer 3) was crucial. Without it, Augusto and Guilherme's voices converged within three chapters. With it, convergence was reduced but not eliminated -- which is why we still needed the cross-voice review in Phase 5.&lt;/p&gt;
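&lt;p&gt;Mechanically, the five layers reduce to ordered concatenation. A minimal sketch, assuming each layer arrives as prose:&lt;/p&gt;

```python
def build_system_prompt(identity, voice_rules, anti_contamination,
                        chapter_brief, context):
    """Concatenate the five layers, in order, into one system prompt."""
    layers = [
        ("IDENTITY", identity),
        ("VOICE RULES", voice_rules),
        ("ANTI-CONTAMINATION", anti_contamination),
        ("CHAPTER BRIEF", chapter_brief),
        ("CONTEXT INJECTION", context),
    ]
    return "\n\n".join(f"## {name}\n{body}" for name, body in layers)
```

&lt;p&gt;The ordering matters: identity and voice rules come first so the later, chapter-specific layers are interpreted through them.&lt;/p&gt;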




&lt;h2&gt;
  
  
  Voice Persona Engineering
&lt;/h2&gt;

&lt;p&gt;Each founder got a dedicated system prompt with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PERSONA: Augusto Lins
ARCHETYPE: The Engineer Who Became a Humanist
VOICE: Measured, deep, quiet authority. Longer sentences.
VOCABULARY: "five seconds", "loyalty moat", "the Angels",
            "the most complex component is the human being"
THEMES: Obsessive service, late-career leap, NPS as compass
TENSION: The engineer who discovered the differentiator is not technology
FORBIDDEN: Never sound aggressive. Never use war metaphors.
           That is Andre's register, not yours.
MODEL: Claude Opus
CONTEXT: Full research dossier + ebook "5 Seconds for the Future"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Four personas, four distinct registers:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Founder&lt;/th&gt;
&lt;th&gt;Voice Signature&lt;/th&gt;
&lt;th&gt;Key Markers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Augusto Lins&lt;/td&gt;
&lt;td&gt;Measured, reflective&lt;/td&gt;
&lt;td&gt;Engineering metaphors, domestic imagery, NPS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Andre Street&lt;/td&gt;
&lt;td&gt;Aggressive, percussive&lt;/td&gt;
&lt;td&gt;Short sentences, war language, "fire your ego"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;David Velez&lt;/td&gt;
&lt;td&gt;Analytical, contained&lt;/td&gt;
&lt;td&gt;VC vocabulary, "infinite game", strategic distance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Guilherme Benchimol&lt;/td&gt;
&lt;td&gt;Vulnerable, confessional&lt;/td&gt;
&lt;td&gt;Marathon metaphors, admission of pain/shame&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The QA report confirmed all four voices were distinguishable without reading the founder's name -- which was our acceptance criterion.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Fact-Checking Pipeline
&lt;/h2&gt;

&lt;p&gt;This was the most sobering part of the project.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Perplexity found
&lt;/h3&gt;

&lt;p&gt;The fact-checker verified &lt;strong&gt;87 items&lt;/strong&gt; across the manuscript and found &lt;strong&gt;19 errors&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;7 critical&lt;/strong&gt; (wrong data that would embarrass the author)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;8 moderate&lt;/strong&gt; (imprecise data that could mislead)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;4 minor&lt;/strong&gt; (missing context, not wrong)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5 fabricated citations
&lt;/h3&gt;

&lt;p&gt;The most dangerous failure mode: LLMs fabricate convincing quotes and attribute them to real people.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FABRICATED CITATION #1:
  Text: "Give me thirty days. If you're not satisfied,
        I'll come here personally to pick up the machine."
  Attribution: Augusto Lins (at a bakery in Copacabana)
  Status: NOT VERIFIED. The bakery scene does not appear
          in any research dossier. Likely LLM fabrication.

FABRICATED CITATION #2:
  Text: "These people aren't asking for a credit card.
        They're asking to be treated like human beings."
  Attribution: Cristina Junqueira (Nubank co-founder)
  Status: NOT VERIFIED. Not in any dossier. Probably
          fabricated as "narrative reconstruction."

FABRICATED CITATION #5:
  Entire scene: "shopkeeper in rural Minas Gerais"
  (sick wife, 20 minutes on the line, microcredit)
  Status: NOT IN ANY DOSSIER. Fabricated anecdote.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The pattern: LLMs generate "too perfect" anecdotes that fit the narrative thesis exactly. They feel real because they are structurally plausible -- but they have no source.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lesson: every quote attributed to a real person must be cross-referenced against primary sources. LLMs cannot be trusted with attribution.&lt;/strong&gt;&lt;/p&gt;
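&lt;p&gt;Part of that cross-referencing can be mechanized before the web pass: any attributed quote that appears in no research dossier gets escalated. A rough containment check (accent- and punctuation-insensitive); exact matching is an assumption -- real quotes are often paraphrased, so treat misses as flags for human review, not verdicts:&lt;/p&gt;

```python
import re
import unicodedata

def normalize(text: str) -> str:
    """Lowercase, strip accents and punctuation for loose containment."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(c for c in text if not unicodedata.combining(c))
    text = re.sub(r"[^a-z0-9 ]", " ", text.lower())
    return " ".join(text.split())

def unsourced_quotes(quotes, dossiers) -> list:
    """Quotes whose normalized text appears in no dossier: flag these."""
    corpus = normalize(" ".join(dossiers))
    return [q for q in quotes if normalize(q) not in corpus]
```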

&lt;h3&gt;
  
  
  The David Velez education error
&lt;/h3&gt;

&lt;p&gt;One critical factual error: the manuscript stated Velez graduated from "Universidad de los Andes" in Colombia. The research dossier shows his undergraduate degree was from &lt;strong&gt;Stanford&lt;/strong&gt; (Management Science and Engineering, class of 2005). This is the kind of error that destroys credibility -- and it passed through multiple writing agents before the fact-checker caught it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Redundancy Problem
&lt;/h2&gt;

&lt;p&gt;This was the &lt;strong&gt;hardest engineering challenge&lt;/strong&gt; -- harder than voice distinction, harder than fact-checking.&lt;/p&gt;

&lt;h3&gt;
  
  
  What happens when 4 agents write independently
&lt;/h3&gt;

&lt;p&gt;Four Opus instances, each writing as a different founder about the same themes, produce &lt;strong&gt;remarkably similar strong points&lt;/strong&gt;. The structural analysis (run by Gemini on the full manuscript) found:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;REDUNDANCY REPORT (selected):

"Fire your ego every morning" (Andre Street)
  -&amp;gt; Appears in: Ch.3, Ch.4, Ch.6, Ch.8
  -&amp;gt; Verdict: EXCESSIVE -- 4 occurrences

"Educate before you sell" (Guilherme Benchimol)
  -&amp;gt; Appears in: Ch.2, Ch.3, Ch.5, Ch.8
  -&amp;gt; Verdict: EXCESSIVE -- 4 occurrences

Angel traveling 50km at night to deliver a card machine:
  -&amp;gt; Appears in: Ch.3 AND Ch.5 with nearly identical details
  -&amp;gt; Verdict: DUPLICATE -- keep in Ch.3 only

Medellin kidnapping + shopping mall bomb (David Velez):
  -&amp;gt; Appears in: Prologue, Ch.1, Ch.6
  -&amp;gt; Verdict: 3 occurrences -- reduce to 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why this happens
&lt;/h3&gt;

&lt;p&gt;Each agent receives the same chapter brief and dossier. The strongest anecdotes -- the ones with the most narrative power -- get selected by every agent independently. The redundancy is not a bug in any single agent; it is an emergent property of parallel writing.&lt;/p&gt;

&lt;h3&gt;
  
  
  The fix
&lt;/h3&gt;

&lt;p&gt;We implemented a &lt;strong&gt;redundancy budget&lt;/strong&gt;: each catchphrase gets a maximum of 2 appearances in the book (first occurrence as revelation, second as deliberate callback). The third and fourth occurrences were cut or paraphrased during Phase 6.&lt;/p&gt;

&lt;p&gt;The broader principle: &lt;strong&gt;multi-agent writing requires a deduplication pass that no single agent can do alone&lt;/strong&gt;. Gemini's 1M+ context window was essential here -- it could read the entire manuscript and identify cross-chapter repetitions that individual agents, writing in isolation, could never see.&lt;/p&gt;
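&lt;p&gt;The budget itself is easy to enforce mechanically once the catchphrases are listed. A sketch, assuming chapters arrive as plain text keyed by chapter id (the sample texts are illustrative):&lt;/p&gt;

```python
BUDGET = 2  # first occurrence as revelation, second as deliberate callback

def over_budget(chapters: dict, phrases) -> dict:
    """Map each over-used catchphrase to the chapters it appears in."""
    report = {}
    for phrase in phrases:
        hits = [ch for ch, text in chapters.items()
                if phrase.lower() in text.lower()]
        if len(hits) > BUDGET:
            report[phrase] = hits
    return report

chapters = {
    "ch3": "Fire your ego every morning, he said.",
    "ch4": "Again: fire your ego every morning.",
    "ch6": "Fire your ego every morning -- the mantra returns.",
    "ch8": "A quieter close.",
}
```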




&lt;h2&gt;
  
  
  The Voice Confusion Problem
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Chapters where two founders became indistinguishable
&lt;/h3&gt;

&lt;p&gt;The structural analysis flagged Chapters 3 and 5 as problem zones. In these chapters, Augusto Lins and Guilherme Benchimol's voices converged -- both reflective, both talking about customer service, both using similar vocabulary.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;VOICE ANALYSIS:

Augusto: Partially distinguishable
  Markers: engineer vocabulary, domestic imagery, longer sentences
  PROBLEM: In Ch.3 and Ch.5, sounds too much like Guilherme

Guilherme: Partially distinguishable
  Markers: marathon metaphors, confession of shame, financial refs
  PROBLEM: In Ch.3 and Ch.5, sounds too much like Augusto

Andre: Clearly distinguishable (always)
David: Clearly distinguishable (always)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fix: intensify each persona's unique markers. Augusto gets more engineering language and NPS references. Guilherme gets more marathon/running metaphors and admissions of vulnerability. The rewrite in Phase 6 sharpened these distinctions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lesson: voice persona prompts are necessary but not sufficient. You need a cross-voice review pass where each persona reads the other three and flags convergence.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Accent Pipeline Bug
&lt;/h2&gt;

&lt;p&gt;One assembly agent (responsible for merging four voices into interleaved chapters) &lt;strong&gt;dropped all Portuguese diacritical marks&lt;/strong&gt; from the output: "producao" instead of "produção", "nao" instead of "não". The entire Part 1 manuscript came out accent-free.&lt;/p&gt;

&lt;p&gt;The fix was trivial (run &lt;code&gt;fix_accents.py&lt;/code&gt;), but the root cause was interesting: the assembly agent was processing so much text that its output quality degraded on surface-level features (accents, em-dashes) even as the narrative content remained good.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lesson: always run a dedicated accent/encoding check as a separate pipeline step, not as part of the writing agent's responsibilities.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The final QA report confirmed: &lt;strong&gt;zero words without proper PT-BR accents&lt;/strong&gt; in the published manuscript.&lt;/p&gt;
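&lt;p&gt;One cheap way to build that dedicated check: PT-BR prose carries a fairly stable share of accented characters, so a chapter whose share collapses toward zero has almost certainly been stripped. A heuristic sketch -- the 1% floor is an illustrative threshold, not a calibrated one:&lt;/p&gt;

```python
PT_ACCENTS = set("áàâãéêíóôõúüçÁÀÂÃÉÊÍÓÔÕÚÜÇ")

def accent_ratio(text: str) -> float:
    """Share of alphabetic characters carrying PT-BR diacritics."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c in PT_ACCENTS for c in letters) / len(letters)

def flag_stripped(chapters: dict, floor: float = 0.01) -> list:
    """Chapters whose accent ratio falls below the floor: likely stripped."""
    return [ch for ch, text in chapters.items()
            if floor > accent_ratio(text)]
```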




&lt;h2&gt;
  
  
  Chapter 7: The "Everyone Agrees" Problem
&lt;/h2&gt;

&lt;p&gt;The structural analysis flagged Chapter 7 (about AI) as lacking narrative tension:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Chapter 7 (AI): MEDIUM intensity
  Content relevant, but tone more essayistic than narrative.

  PROBLEM: All four founders say essentially the same thing:
  "AI is a tool, not a replacement." No tension, no disagreement,
  no risk. The chapter needs a moment of doubt or real failure.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When four agents are told "write what this founder thinks about AI," and all four founders are publicly optimistic about AI, you get four versions of the same optimistic take. The emergent pattern: &lt;strong&gt;multi-agent systems amplify consensus and suppress dissent&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The fix: we manually introduced a moment of doubt -- a concrete failure anecdote -- to create the tension the agents could not generate on their own.&lt;/p&gt;




&lt;h2&gt;
  
  
  The "Street Always Delivers First" Pattern
&lt;/h2&gt;

&lt;p&gt;An unexpected observation from the pipeline: Andre Street's persona consistently produced output faster and with more energy than the other three. His system prompt specified "aggressive, percussive, short sentences, urgency" -- and the writing agent internalized this as raw speed.&lt;/p&gt;

&lt;p&gt;The agents writing Augusto (measured, reflective) and David (analytical, strategic) produced longer, more deliberate text. Guilherme's agent produced the most emotionally charged text but took the longest to reach the word count.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The persona's urgency mapped to the agent's behavior.&lt;/strong&gt; We did not design this. The writing model (Opus) treated the persona's emotional register as an instruction about pacing. This has implications for agent design: persona engineering affects not just output quality but output characteristics like length, density, and generation speed.&lt;/p&gt;




&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Final word count&lt;/td&gt;
&lt;td&gt;30,329&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Total agent calls&lt;/td&gt;
&lt;td&gt;43&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;APIs used&lt;/td&gt;
&lt;td&gt;6 (Claude Opus, Claude Sonnet, Perplexity, Gemini, GPT-4o, Groq/Llama)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Max parallel agents&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pipeline phases&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Factual errors caught&lt;/td&gt;
&lt;td&gt;19 (7 critical, 8 moderate, 4 minor)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fabricated citations caught&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Duplicate anecdotes removed&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Voice confusion zones fixed&lt;/td&gt;
&lt;td&gt;2 chapters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Accent bug: words without diacriticals&lt;/td&gt;
&lt;td&gt;0 (after fix)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Total API cost&lt;/td&gt;
&lt;td&gt;Under $10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Published at&lt;/td&gt;
&lt;td&gt;&lt;a href="https://alexandrecaramaschi.com/founders" rel="noopener noreferrer"&gt;alexandrecaramaschi.com/founders&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The estimated cost from the orchestration plan was $110-165 for the full 48,000-word target. The actual book came in at 30,329 words (we cut aggressively for quality), and the actual API spend was under $10.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Redundancy is the primary failure mode of parallel multi-agent writing
&lt;/h3&gt;

&lt;p&gt;Not hallucination, not voice confusion -- redundancy. When N agents write about the same topic independently, they converge on the same strong points. You need a deduplication pass with a model that can see the entire manuscript at once.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Fact-checking must be a separate agent with web access
&lt;/h3&gt;

&lt;p&gt;LLMs hallucinate citations with high confidence. Perplexity's web-grounded search was the only reliable way to verify quotes and data points. 5 fabricated citations in 30,000 words is a 0.016% rate -- small in percentage, catastrophic in credibility.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Voice personas need cross-validation, not just prompts
&lt;/h3&gt;

&lt;p&gt;System prompts create initial voice distinction. But over 30,000 words, voices drift toward the mean. The fix is a review pass where each persona reads the full manuscript and flags where it sounds like another founder.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Use each model for its strength
&lt;/h3&gt;

&lt;p&gt;Opus for narrative depth. Perplexity for verified facts. Gemini for manuscript-level coherence. Groq for speed. GPT-4o for creative variations. Sonnet for code and formatting. No single model excels at all of these.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Multi-agent systems amplify consensus
&lt;/h3&gt;

&lt;p&gt;If all sources agree, all agents will agree, and the output will lack tension. Editorial judgment -- the decision to introduce conflict where the data shows none -- remains a human responsibility.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Persona urgency maps to agent behavior
&lt;/h3&gt;

&lt;p&gt;An aggressive, urgent persona prompt produces faster, shorter output. A reflective, measured persona prompt produces slower, longer output. This is not documented anywhere -- it is emergent behavior worth designing for.&lt;/p&gt;

&lt;h3&gt;
  
  
  7. Surface-level quality degrades under load
&lt;/h3&gt;

&lt;p&gt;An agent handling complex narrative assembly may drop accents, formatting, or em-dashes. Always run dedicated quality passes for surface features as separate pipeline steps.&lt;/p&gt;

&lt;h3&gt;
  
  
  8. The cost is negligible; the architecture is everything
&lt;/h3&gt;

&lt;p&gt;Under $10 in API calls for a 30,000-word, fact-checked, multi-voice book. The engineering cost is in the orchestration design, not the API spend.&lt;/p&gt;




&lt;h2&gt;
  
  
  The FinOps Perspective
&lt;/h2&gt;

&lt;p&gt;The original orchestration plan estimated $110-165 for the full 48,000-word target across 43 agent calls. Here is the breakdown by API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;API                    Calls  Est. Tokens   Est. Cost
----------------------------------------------------
Claude Opus              19    ~1,500,000   $80-120
Perplexity Sonar Pro      7      ~350,000   $8-12
Gemini 2.5 Pro            4      ~800,000   $10-15
ChatGPT GPT-4o            3      ~200,000   $3-5
Groq Llama 3.3 70B        6      ~600,000   $1-2
Claude Sonnet             4      ~400,000   $8-10
----------------------------------------------------
TOTAL                    43    ~3,850,000   $110-165
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The actual spend came in under $10. Why the 10x difference?&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Aggressive editing cut 18,000 words.&lt;/strong&gt; The manuscript went from a 48,000-word target to 30,329 published words. Less text = fewer generation tokens.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Groq is nearly free.&lt;/strong&gt; At $0.59/M input tokens, the 6 Groq calls cost pennies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini's free tier covered our usage.&lt;/strong&gt; The 4 Gemini calls fit within Google's generous free allocation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;We reused outputs aggressively.&lt;/strong&gt; Dossiers from Phase 1 were passed to every subsequent phase without regeneration.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The cost per word of the final manuscript: approximately $0.0003. For context, a human ghostwriter charges $0.50-$2.00 per word for this type of work.&lt;/p&gt;




&lt;h2&gt;
  
  
  What We Would Do Differently
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Anti-redundancy briefs&lt;/strong&gt;: give each agent a list of anecdotes already claimed by other agents, updated in real-time as they write.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Adversarial voice testing&lt;/strong&gt;: before the full pipeline, run a blind test where a reviewer tries to identify which founder is speaking from unmarked excerpts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tension injection&lt;/strong&gt;: explicitly assign one agent the role of "dissenter" -- someone whose job is to find disagreements and introduce doubt.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streaming coherence monitor&lt;/strong&gt;: instead of checking coherence after each wave, stream outputs to Gemini in real-time and get incremental feedback.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Stack Reference
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Orchestrator&lt;/strong&gt;: geo-orchestrator (custom multi-model pipeline)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Primary writing&lt;/strong&gt;: Claude Opus 4.6 (Anthropic)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Research + fact-check&lt;/strong&gt;: Perplexity Sonar Pro&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coherence analysis&lt;/strong&gt;: Gemini 2.5 Pro (Google)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Creative variations&lt;/strong&gt;: ChatGPT GPT-4o (OpenAI)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fast iteration&lt;/strong&gt;: Groq (Llama 3.3 70B)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Formatting + deploy&lt;/strong&gt;: Claude Sonnet (Anthropic)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend&lt;/strong&gt;: Next.js 16 + React 19 + Tailwind 4&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hosting&lt;/strong&gt;: Vercel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Published&lt;/strong&gt;: &lt;a href="https://alexandrecaramaschi.com/founders" rel="noopener noreferrer"&gt;alexandrecaramaschi.com/founders&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Alexandre Caramaschi is CEO of Brasil GEO, former CMO of Semantix (Nasdaq), and co-founder of AI Brasil. This article documents the technical pipeline behind "5 Fundadores, 5 Segundos, 1 Futuro," a multi-agent editorial production experiment.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>agents</category>
      <category>architecture</category>
    </item>
    <item>
      <title>The Invisible Brand Paradox: How Companies With Great Products Disappear in AI Search</title>
      <dc:creator>Alexandre Caramaschi</dc:creator>
      <pubDate>Tue, 24 Mar 2026 00:50:49 +0000</pubDate>
      <link>https://dev.to/alexandrebrt14sys/the-invisible-brand-paradox-how-companies-with-great-products-disappear-in-ai-search-307c</link>
      <guid>https://dev.to/alexandrebrt14sys/the-invisible-brand-paradox-how-companies-with-great-products-disappear-in-ai-search-307c</guid>
      <description>&lt;p&gt;There is a new kind of corporate crisis emerging -- one that does not show up in quarterly reports until it is too late. Companies with excellent products, strong customer satisfaction, and healthy revenue are discovering that they simply do not exist in the eyes of AI.&lt;/p&gt;

&lt;p&gt;Ask ChatGPT, Gemini, or Perplexity about their market category, and they are nowhere in the response. Their competitors -- sometimes with inferior products -- are cited, recommended, and explained in detail. The invisible company has better NPS scores, better retention rates, and better technology. But the AI does not know that.&lt;/p&gt;

&lt;p&gt;This is the Invisible Brand Paradox.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Paradox Defined
&lt;/h2&gt;

&lt;p&gt;The Invisible Brand Paradox occurs when a company with demonstrably strong products or services has zero or near-zero visibility in AI-generated responses. The paradox is that traditional success metrics -- revenue growth, customer satisfaction, market share -- do not correlate with AI visibility. A company can be the market leader by every conventional measure and still be completely absent from AI search results.&lt;/p&gt;

&lt;p&gt;This matters because AI-mediated discovery is rapidly becoming the primary channel through which buyers find solutions. According to Gartner's 2025 research, over 70% of B2B technology buyers use AI assistants during their purchasing research. If your brand is invisible to these AI systems, you are invisible to a growing majority of your potential customers.&lt;/p&gt;

&lt;p&gt;The paradox is particularly cruel because the companies most likely to suffer from it are often the ones most confident they do not need to worry. They have strong brands -- in the human sense. They rank well on Google. They win industry awards. But none of these achievements translate automatically into AI visibility.&lt;/p&gt;

&lt;h2&gt;
  
  
  Five Reasons AI Engines Ignore Good Brands
&lt;/h2&gt;

&lt;p&gt;After conducting over 50 entity audits across industries, we have identified five root causes that explain why strong brands become invisible in AI search.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reason 1: No Structured Data
&lt;/h3&gt;

&lt;p&gt;AI models process structured data orders of magnitude more efficiently than unstructured prose. A company that presents its expertise, offerings, and authority exclusively through marketing copy on web pages is making it extremely difficult for AI to understand what it does.&lt;/p&gt;

&lt;p&gt;Structured data includes Schema.org markup (Organization, Product, Service, Person schemas), JSON-LD, OpenAPI specifications for any APIs, and the emerging llms.txt standard -- a file specifically designed to help AI systems understand your organization.&lt;/p&gt;

&lt;p&gt;The absence of structured data is the single most common cause of AI invisibility. It is also the easiest to fix. Yet most companies, even technically sophisticated ones, have incomplete or absent Schema.org markup. Their websites look beautiful to humans but are semantically opaque to machines.&lt;/p&gt;
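&lt;p&gt;A minimal Schema.org Organization block shows how little it takes to start. The sketch below builds the JSON-LD in Python; every field value is hypothetical and should be replaced with your organization's canonical facts:&lt;/p&gt;

```python
import json

# Every field value below is hypothetical -- replace with your
# organization's canonical, everywhere-consistent facts.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Acme Corp",
    "url": "https://example.com",
    "foundingDate": "2015",
    "sameAs": [  # ties the entity to its profiles on other platforms
        "https://www.linkedin.com/company/acme",
        "https://www.crunchbase.com/organization/acme",
    ],
}

jsonld = json.dumps(organization, indent=2)
# Embed jsonld in every page inside a script tag whose type attribute
# is application/ld+json, so crawlers and AI systems can parse it.
```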

&lt;h3&gt;
  
  
  Reason 2: Entity Fragmentation
&lt;/h3&gt;

&lt;p&gt;AI models build internal representations of entities -- companies, people, products, concepts. These representations are constructed by aggregating information from multiple sources. When the information is inconsistent, the model's entity representation becomes fragmented or ambiguous.&lt;/p&gt;

&lt;p&gt;Entity fragmentation occurs when your company name is rendered differently across platforms (Acme Corp on LinkedIn, ACME Corporation on Crunchbase, Acme on your website). When your founding date differs between sources. When your CEO's title is listed differently. When your product descriptions vary.&lt;/p&gt;

&lt;p&gt;Each inconsistency does not just create confusion -- it dilutes the strength of your entity signal. AI models that encounter ambiguous entities handle them by reducing confidence, which means reducing citation frequency. In practice, this means your brand gets mentioned less often or not at all.&lt;/p&gt;

&lt;p&gt;I have seen companies where three different founding years appeared across their web properties and directories. The AI model, unable to determine which was correct, simply avoided mentioning the company's history -- and by extension, reduced its overall authority signal.&lt;/p&gt;
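&lt;p&gt;The fragmentation audit itself is mechanical: collect each field's value from every source and flag fields where the values disagree. A sketch with hypothetical source records:&lt;/p&gt;

```python
def entity_inconsistencies(sources: dict) -> dict:
    """For each entity field, the set of conflicting values across sources."""
    conflicts = {}
    fields = {f for record in sources.values() for f in record}
    for field in sorted(fields):
        values = {record[field] for record in sources.values() if field in record}
        if len(values) > 1:
            conflicts[field] = values
    return conflicts

# Hypothetical records pulled from three properties of the same company.
sources = {
    "website":    {"name": "Acme Corp",        "founded": "2015"},
    "linkedin":   {"name": "Acme Corp",        "founded": "2016"},
    "crunchbase": {"name": "ACME Corporation", "founded": "2015"},
}
```

&lt;p&gt;Every field this function flags is a place where an AI model will lower its confidence in your entity.&lt;/p&gt;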

&lt;h3&gt;
  
  
  Reason 3: No Information Gain
&lt;/h3&gt;

&lt;p&gt;Information gain is a concept from information theory that, in the GEO context, refers to whether your content provides knowledge that cannot be found elsewhere. AI models are trained on vast corpora. If your content merely restates what dozens of other sources already say, it provides zero incremental value to the model.&lt;/p&gt;

&lt;p&gt;Content with high information gain includes: original research with proprietary data, novel frameworks or methodologies, unique case studies with specific metrics, contrarian perspectives backed by evidence, and first-person expert analysis that synthesizes experience into actionable insight.&lt;/p&gt;

&lt;p&gt;Content with zero information gain includes: generic industry overviews, rephrased competitor content, listicles compiled from other listicles, and thought leadership that leads no thoughts.&lt;/p&gt;

&lt;p&gt;The irony is that many companies invest heavily in content marketing but produce exclusively low-information-gain content. They publish three blog posts per week, each one a variation of what every other company in their space is publishing. The volume is impressive. The AI impact is zero.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reason 4: No External Authority
&lt;/h3&gt;

&lt;p&gt;AI models do not just assess your content -- they assess what others say about you. External authority includes mentions in recognized publications, citations in academic or industry research, entries in authoritative directories (Wikipedia, Wikidata, Crunchbase), backlinks from high-authority domains, and consistent presence in industry analyst reports.&lt;/p&gt;

&lt;p&gt;A company that exists only on its own website and social media profiles has weak external authority. The AI model has only the company's self-description to work with, and self-descriptions are inherently less trustworthy than third-party validation.&lt;/p&gt;

&lt;p&gt;Building external authority is the new link building. But instead of optimizing for PageRank, you are optimizing for what we call Entity Authority -- the density and consistency of third-party references that confirm your expertise, existence, and relevance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reason 5: No Freshness Signals
&lt;/h3&gt;

&lt;p&gt;AI models increasingly incorporate recency as a ranking factor, especially for topics that evolve rapidly (which includes most technology and business categories). A company whose most recent blog post is from 2024, whose press releases stopped in 2023, and whose social media has been dormant for months sends a clear signal: this entity may no longer be active or relevant.&lt;/p&gt;

&lt;p&gt;Freshness does not mean publishing daily. It means maintaining a consistent cadence of new, substantive content that signals ongoing expertise and activity. Companies that publish one genuinely original piece per month outperform those that published 100 derivative pieces two years ago.&lt;/p&gt;

&lt;h2&gt;
  
  
  Case Study Framework: A Step-by-Step Audit Process
&lt;/h2&gt;

&lt;p&gt;To diagnose whether your company suffers from the Invisible Brand Paradox, we recommend a systematic audit process:&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 1: AI Visibility Baseline (Day 1-3)
&lt;/h3&gt;

&lt;p&gt;Query the five major AI platforms (ChatGPT, Gemini, Perplexity, Copilot, Claude) with 20 questions that your target customers would ask. Document:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Does your brand appear in any response?&lt;/li&gt;
&lt;li&gt;When it appears, is the information accurate?&lt;/li&gt;
&lt;li&gt;Which competitors appear instead?&lt;/li&gt;
&lt;li&gt;What specific claims do the AI models make about your category?&lt;/li&gt;
&lt;/ul&gt;
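
&lt;p&gt;A minimal sketch of how the Phase 1 results can be tallied, assuming you log each platform/question answer by hand (the helper names and the sample log are illustrative; no platform APIs are implied):&lt;/p&gt;

```javascript
// Track whether the brand appeared in each logged AI answer, and whether
// the description was accurate when it did appear.
function visibilityRate(results) {
  const appeared = results.filter((r) => r.brandAppeared).length;
  return results.length === 0 ? 0 : appeared / results.length;
}

function accuracyRate(results) {
  const cited = results.filter((r) => r.brandAppeared);
  const accurate = cited.filter((r) => r.accurate).length;
  return cited.length === 0 ? 0 : accurate / cited.length;
}

// Example log: 2 of 4 answers mention the brand; 1 of those 2 is accurate.
const log = [
  { platform: "ChatGPT", question: "Best GEO consultancy in Brazil?", brandAppeared: true, accurate: true },
  { platform: "Gemini", question: "Best GEO consultancy in Brazil?", brandAppeared: false, accurate: false },
  { platform: "Perplexity", question: "What is GEO?", brandAppeared: true, accurate: false },
  { platform: "Claude", question: "What is GEO?", brandAppeared: false, accurate: false },
];
console.log(visibilityRate(log)); // 0.5
console.log(accuracyRate(log));   // 0.5
```

&lt;p&gt;Repeating the same tally in the Phase 5-style second audit gives you a before/after delta per platform.&lt;/p&gt;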

&lt;h3&gt;
  
  
  Phase 2: Entity Consistency Scan (Day 4-7)
&lt;/h3&gt;

&lt;p&gt;Catalog every platform and directory where your company appears. For each, document: company name (exact rendering), description, founding date, leadership names and titles, key metrics, and product/service descriptions. Flag every inconsistency. Quantify the fragmentation score: number of inconsistencies divided by total data points checked.&lt;/p&gt;
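
&lt;p&gt;The fragmentation score defined above can be computed mechanically. A sketch, with an invented three-listing catalog (field names and values are placeholders):&lt;/p&gt;

```javascript
// Fragmentation score: fields that disagree across listings, divided by
// the total fields checked. A field counts as inconsistent if the set of
// distinct values collected across all platforms has more than one member.
function fragmentationScore(listings, fields) {
  let inconsistencies = 0;
  for (const field of fields) {
    const values = new Set(listings.map((l) => l[field]));
    if (values.size > 1) inconsistencies += 1;
  }
  return fields.length === 0 ? 0 : inconsistencies / fields.length;
}

const listings = [
  { name: "Acme Data", founded: "2015", ceo: "J. Silva" },      // website
  { name: "Acme Data", founded: "2016", ceo: "J. Silva" },      // LinkedIn
  { name: "ACME Data Inc.", founded: "2015", ceo: "J. Silva" }, // Crunchbase
];
// "name" and "founded" disagree, "ceo" is consistent: 2 of 3 fields flagged.
console.log(fragmentationScore(listings, ["name", "founded", "ceo"]));
```

&lt;p&gt;Note that exact string comparison is deliberate: to a model, "Acme Data" and "ACME Data Inc." are two different renderings of the entity.&lt;/p&gt;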

&lt;h3&gt;
  
  
  Phase 3: Content Information Gain Assessment (Day 8-14)
&lt;/h3&gt;

&lt;p&gt;Review your 20 most recent published pieces. For each, answer: Does this contain any data, framework, or insight that cannot be found elsewhere? If the answer is no for more than 70% of your content, you have an information gain deficit.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 4: Structured Data Audit (Day 15-18)
&lt;/h3&gt;

&lt;p&gt;Run your website through Schema.org validators. Check for: Organization schema, Person schemas for leadership, Product/Service schemas, Article schemas for blog content, FAQ schemas where applicable. Check whether you have an llms.txt file. Test your APIs (if any) for documentation quality.&lt;/p&gt;
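
&lt;p&gt;One way to turn Phase 4 into a single number is a coverage ratio over the checks listed above. A sketch, assuming you fill in the booleans manually or from validator output (the check names are illustrative):&lt;/p&gt;

```javascript
// Structured-data coverage: fraction of audit checks that pass.
const CHECKS = [
  "organizationSchema",
  "personSchemas",
  "productSchemas",
  "articleSchemas",
  "faqSchemas",
  "llmsTxtPresent",
];

function structuredDataCoverage(audit) {
  const passed = CHECKS.filter((c) => audit[c] === true).length;
  return passed / CHECKS.length;
}

// Example: only Organization schema and Article schemas are in place.
console.log(structuredDataCoverage({
  organizationSchema: true,
  articleSchemas: true,
  llmsTxtPresent: false,
})); // 2 of 6 checks pass
```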

&lt;h3&gt;
  
  
  Phase 5: External Authority Map (Day 19-21)
&lt;/h3&gt;

&lt;p&gt;Document all third-party sources that mention your brand. Categorize by authority level: Tier 1 (Wikipedia, major publications, academic citations), Tier 2 (industry directories, analyst reports), Tier 3 (blogs, minor publications). Calculate your authority density: Tier 1 mentions divided by total mentions.&lt;/p&gt;
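
&lt;p&gt;The authority-density ratio from Phase 5, as a one-liner (the mention list is invented for illustration):&lt;/p&gt;

```javascript
// Authority density: Tier 1 mentions divided by total mentions,
// using the tier definitions from the audit above.
function authorityDensity(mentions) {
  const tier1 = mentions.filter((m) => m.tier === 1).length;
  return mentions.length === 0 ? 0 : tier1 / mentions.length;
}

const mentions = [
  { source: "Wikipedia", tier: 1 },
  { source: "Industry directory", tier: 2 },
  { source: "Niche blog", tier: 3 },
  { source: "Major publication", tier: 1 },
];
console.log(authorityDensity(mentions)); // 0.5
```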

&lt;h2&gt;
  
  
  The 90-Day Turnaround: From Invisible to Cited
&lt;/h2&gt;

&lt;p&gt;Based on our experience with entity remediation across multiple industries, a focused 90-day program can move a company from AI invisibility to consistent citation. Here is the framework:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Days 1-30: Foundation&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fix all entity inconsistencies across platforms&lt;/li&gt;
&lt;li&gt;Implement complete Schema.org markup&lt;/li&gt;
&lt;li&gt;Deploy llms.txt file&lt;/li&gt;
&lt;li&gt;Publish 4 high-information-gain pieces (one per week)&lt;/li&gt;
&lt;li&gt;Submit Wikidata entry if not present&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Days 31-60: Authority&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Secure 3-5 mentions in recognized industry publications&lt;/li&gt;
&lt;li&gt;Publish original research with proprietary data&lt;/li&gt;
&lt;li&gt;Update all directory listings for consistency&lt;/li&gt;
&lt;li&gt;Begin structured outreach to analysts and journalists&lt;/li&gt;
&lt;li&gt;Create comprehensive FAQ content addressing every question your customers ask&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Days 61-90: Amplification&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Publish contrarian thought leadership backed by your proprietary data&lt;/li&gt;
&lt;li&gt;Ensure all new content has maximum structured data markup&lt;/li&gt;
&lt;li&gt;Monitor AI citations weekly and adjust strategy&lt;/li&gt;
&lt;li&gt;Build programmatic accessibility (API documentation, structured catalogs)&lt;/li&gt;
&lt;li&gt;Conduct second AI visibility audit to measure progress&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Companies that execute this program consistently see a 40-60% improvement in AI citation frequency within the 90-day window. The improvement compounds: once AI models begin citing you, each new piece of authoritative content reinforces the citation pattern.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Urgency
&lt;/h2&gt;

&lt;p&gt;The Invisible Brand Paradox is solvable -- but the window for easy solutions is closing. As AI-mediated discovery becomes the default, the companies that establish entity authority early will enjoy compounding advantages. The models learn patterns: once they associate your brand with authoritative answers in your category, that association becomes self-reinforcing.&lt;/p&gt;

&lt;p&gt;Conversely, companies that remain invisible face a compounding disadvantage. Every month of AI invisibility is a month of training data where competitors are cited and you are not. The longer you wait, the deeper the deficit.&lt;/p&gt;

&lt;p&gt;You may have the best product. You may have the happiest customers. But if the AI does not know you exist, none of that matters to the buyers who are asking the AI what to buy.&lt;/p&gt;

&lt;p&gt;The invisible brand does not lose a competition. It never enters one.&lt;/p&gt;




&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://brasilgeo.ai/v1/auditoria-entidade-digital" rel="noopener noreferrer"&gt;Digital Entity Audit: A Complete Guide&lt;/a&gt; -- Brasil GEO&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://brasilgeo.ai/v1/divida-dados" rel="noopener noreferrer"&gt;Data Debt: The Hidden Cost of Inconsistent Information&lt;/a&gt; -- Brasil GEO&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://caramaschi.hashnode.dev/entity-consistency-why-it-matters-for-ai-visibility" rel="noopener noreferrer"&gt;Entity Consistency: Why It Matters for AI Visibility&lt;/a&gt; -- Hashnode&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://medium.com/@alexandrecaramaschi/digital-entity-audit-the-complete-process" rel="noopener noreferrer"&gt;Digital Entity Audit: The Complete Process&lt;/a&gt; -- Medium&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Alexandre Caramaschi is CEO of Brasil GEO (brasilgeo.ai), the first Brazilian GEO consultancy. Former CMO at Semantix (Nasdaq), co-founder of AI Brasil. More at alexandrecaramaschi.com&lt;/em&gt;&lt;/p&gt;

</description>
      <category>geo</category>
      <category>ai</category>
      <category>branding</category>
      <category>marketing</category>
    </item>
    <item>
      <title>Business-to-Agent: When Your Next Customer Isn't Human</title>
      <dc:creator>Alexandre Caramaschi</dc:creator>
      <pubDate>Tue, 24 Mar 2026 00:49:53 +0000</pubDate>
      <link>https://dev.to/alexandrebrt14sys/business-to-agent-when-your-next-customer-isnt-human-e85</link>
      <guid>https://dev.to/alexandrebrt14sys/business-to-agent-when-your-next-customer-isnt-human-e85</guid>
      <description>&lt;p&gt;The customer of the future does not have a LinkedIn profile. It does not attend trade shows, read email newsletters, or respond to cold calls. It is an AI agent -- and it is already making purchasing decisions on behalf of the humans it serves.&lt;/p&gt;

&lt;p&gt;Welcome to the era of Business-to-Agent commerce.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Agent Economy
&lt;/h2&gt;

&lt;p&gt;We are witnessing the emergence of a third commercial paradigm. For decades, business strategy revolved around two models: B2C (selling to individual consumers) and B2B (selling to organizations through human decision-makers). Both assumed that the buyer -- the entity evaluating options and making choices -- was human.&lt;/p&gt;

&lt;p&gt;That assumption is becoming obsolete.&lt;/p&gt;

&lt;p&gt;AI agents now research, compare, and in increasingly advanced scenarios, purchase on behalf of humans. Procurement platforms use AI to pre-select vendors. Enterprise assistants integrated with ERPs evaluate suppliers against predefined criteria. Personal AI agents book services, compare subscription plans, and recommend professional service providers -- all without human intervention in the research phase.&lt;/p&gt;

&lt;p&gt;Gartner projects that by 2028, a substantial portion of B2B commercial interactions will involve AI agents at some stage of the buying cycle. The question is no longer whether agents will mediate commerce -- it is whether your business is ready to be found, understood, and selected by them.&lt;/p&gt;

&lt;h2&gt;
  
  
  B2A vs B2B vs B2C: A New Commercial Model
&lt;/h2&gt;

&lt;p&gt;The distinctions between these models are not merely semantic. They reflect fundamentally different requirements for visibility, communication, and transaction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;B2C&lt;/strong&gt; optimizes for human emotion: branding, visual design, social proof, impulse triggers. The buyer is influenced by aesthetics, peer recommendations, and emotional resonance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;B2B&lt;/strong&gt; optimizes for human committees: case studies, ROI calculators, relationship building, trust signals. The buyer is influenced by risk mitigation, peer validation, and demonstrated expertise.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;B2A&lt;/strong&gt; optimizes for algorithmic processing: structured data, semantic clarity, programmatic accessibility, entity consistency. The buyer (an AI agent) is influenced by none of the above. It cannot see your logo, does not care about your office design, and is immune to your brand storytelling. It processes data.&lt;/p&gt;

&lt;p&gt;This creates a fundamental challenge for companies built around human persuasion. Your beautiful website, your compelling brand narrative, your carefully crafted sales deck -- none of these register with an AI agent. What registers is whether your information is structured, consistent, accessible, and verifiable.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Agents Need to See Your Brand
&lt;/h2&gt;

&lt;p&gt;An AI agent selecting a vendor operates on a logic that combines four factors:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Structured Data
&lt;/h3&gt;

&lt;p&gt;Agents process information organized in machine-readable formats. This means complete Schema.org markup on your website, well-documented APIs, JSON-LD structured data, and increasingly, llms.txt files -- an emerging standard that provides AI-friendly summaries of what your organization does, offers, and has expertise in.&lt;/p&gt;

&lt;p&gt;Companies that still present their offerings exclusively through unstructured prose on web pages are invisible to agents. The information might be excellent, but if it is locked in paragraphs of marketing copy, an agent cannot efficiently extract and compare it.&lt;/p&gt;
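
&lt;p&gt;For reference, the llms.txt proposal (llmstxt.org) uses plain Markdown: an H1 with the entity name, a blockquote summary, and link lists grouped under H2 headings. A minimal sketch with placeholder names and URLs:&lt;/p&gt;

```markdown
# Example Corp

> Example Corp builds data-integration tooling for mid-market companies
> in Latin America. Founded 2015. One sentence, reused everywhere.

## Products

- [Widget API](https://example.com/docs/widget): REST API for ingesting
  and normalizing customer data

## Company

- [About](https://example.com/about): founding date, leadership, key facts
- [Research](https://example.com/research): original studies and data
```

&lt;p&gt;The point is not the format's sophistication -- it is that an agent can extract the whole value proposition in one pass, without parsing marketing copy.&lt;/p&gt;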

&lt;h3&gt;
  
  
  2. Entity Consistency
&lt;/h3&gt;

&lt;p&gt;AI agents cross-reference information about your company across multiple sources. If your company description on LinkedIn says one thing, your Crunchbase profile says another, and your website says something else, the agent faces ambiguity. Agents resolve ambiguity conservatively -- by deprioritizing or excluding the ambiguous entity.&lt;/p&gt;

&lt;p&gt;Entity consistency means that your company name, description, founding date, leadership team, offerings, and key metrics are identical across every platform and directory where you appear. This is not a branding exercise -- it is an algorithmic requirement.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Algorithmic Reputation
&lt;/h3&gt;

&lt;p&gt;How is your brand described and referenced in the sources that AI models use for training and retrieval? This includes not just your own content, but third-party mentions in publications, directories, academic papers, and industry reports. An agent assessing your credibility does not rely on your self-description -- it looks for external validation.&lt;/p&gt;

&lt;p&gt;This is where Generative Engine Optimization (GEO) becomes the technical foundation of B2A readiness. GEO ensures that AI models -- the same models that power agents -- associate your brand with the problems you solve and the expertise you bring.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Programmatic Accessibility
&lt;/h3&gt;

&lt;p&gt;Can an agent interact with your business without human mediation? This ranges from basic (accessing your product catalog via API) to advanced (requesting a quote, scheduling a demo, or initiating a purchase programmatically). Companies that require a human to fill out a Contact Us form create friction that agents will route around -- by selecting a competitor with better programmatic accessibility.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agent Readiness Score: A Framework for Measurement
&lt;/h2&gt;

&lt;p&gt;At Brasil GEO, we have developed an Agent Readiness Score (ARS) framework that assesses B2A preparedness across five dimensions, each scored 0-20:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dimension 1 -- AI Visibility (0-20):&lt;/strong&gt; Is your brand cited by the major AI platforms (ChatGPT, Gemini, Perplexity, Copilot, Claude) when relevant questions are asked? Are the citations accurate?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dimension 2 -- Data Structure (0-20):&lt;/strong&gt; Do you have complete Schema.org markup, documented APIs, llms.txt, and machine-readable product/service catalogs?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dimension 3 -- Entity Consistency (0-20):&lt;/strong&gt; Is your brand information identical across all platforms and directories? Can an agent unambiguously identify your company?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dimension 4 -- Content Authority (0-20):&lt;/strong&gt; Does your published content contain original research, proprietary data, or novel frameworks that provide information gain? Are you cited as a source by other publications?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dimension 5 -- Transaction Readiness (0-20):&lt;/strong&gt; Can an agent access pricing, availability, and initiate a commercial interaction programmatically?&lt;/p&gt;

&lt;p&gt;Most companies we audit score between 15 and 30 out of 100. The leaders in B2A readiness -- typically SaaS companies with mature API ecosystems -- score between 60 and 75. No company we have assessed has scored above 80, which indicates how early we are in this transformation.&lt;/p&gt;
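
&lt;p&gt;The ARS arithmetic is deliberately simple: five dimensions, each on a 0-20 scale, summed to a 0-100 total. A sketch (the dimension keys and sample scores are illustrative):&lt;/p&gt;

```javascript
// Agent Readiness Score: sum of five dimensions, each capped at 20.
const DIMENSIONS = [
  "visibility",
  "dataStructure",
  "entityConsistency",
  "contentAuthority",
  "transactionReadiness",
];

function agentReadinessScore(scores) {
  return DIMENSIONS.reduce((total, d) => {
    const s = scores[d] ?? 0; // missing dimensions score zero
    if (s > 20) throw new RangeError(`${d} exceeds the 0-20 scale`);
    return total + s;
  }, 0);
}

// A typical audit result from the 15-30 range described above:
console.log(agentReadinessScore({
  visibility: 4,
  dataStructure: 6,
  entityConsistency: 5,
  contentAuthority: 3,
  transactionReadiness: 2,
})); // 20
```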

&lt;h2&gt;
  
  
  The Window of Opportunity: 2026-2027
&lt;/h2&gt;

&lt;p&gt;We are in what I consider the equivalent of 1999 for e-commerce. The companies that built e-commerce capabilities between 1999 and 2003 captured decades of competitive advantage. The companies that dismissed e-commerce as a fad spent the next 15 years catching up.&lt;/p&gt;

&lt;p&gt;The B2A window is similar but compressed. AI agent adoption is accelerating faster than e-commerce did, driven by enterprise AI budgets that exceeded $200 billion globally in 2025. The agents being deployed today are being trained on data that exists right now. Companies that build B2A readiness in 2026-2027 will be the ones these agents learn to recommend.&lt;/p&gt;

&lt;p&gt;I recommend a three-wave approach:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wave 1 -- Visibility (Immediate):&lt;/strong&gt; Implement GEO. Ensure your brand is cited by LLMs. Deploy llms.txt, complete Schema.org markup, and create citable content with high information gain. Most companies should be executing this now.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wave 2 -- Structuring (3-6 months):&lt;/strong&gt; Organize your offering data in structured formats. Document existing APIs. Create programmatically accessible information endpoints.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wave 3 -- Transaction (6-12 months):&lt;/strong&gt; Implement automated interaction mechanisms. Allow agents to request quotes, access demos, or initiate purchase processes via API.&lt;/p&gt;

&lt;p&gt;The parallel I draw frequently at AI Brasil -- the community I co-founded, now more than 14,000 members strong -- is that B2A will be as transformative for commerce as e-commerce was in the 2000s. It will not eliminate human sales -- but it will control access to them. If your company is not on the agent's shortlist, your sales team never gets the chance to have the conversation.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;B2A does not replace B2B or B2C -- it adds a layer that increasingly mediates both. The agent does not eliminate the human decision-maker; it curates the options that the human sees. And curation is power.&lt;/p&gt;

&lt;p&gt;The companies that understand this are already building their agent-facing infrastructure. The companies that do not will wonder, in two years, why their pipeline dried up despite having the best product in the market.&lt;/p&gt;

&lt;p&gt;The best product does not win. The most agent-visible product wins.&lt;/p&gt;




&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://brasilgeo.ai/v1/formacao-geo" rel="noopener noreferrer"&gt;GEO Training: The Complete Formation&lt;/a&gt; -- Brasil GEO&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://brasilgeo.ai/v1/bing-copilot" rel="noopener noreferrer"&gt;How Bing Copilot Changes Search Visibility&lt;/a&gt; -- Brasil GEO&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://caramaschi.hashnode.dev/geo-engineering-building-for-ai-discovery" rel="noopener noreferrer"&gt;GEO Engineering: Building for AI Discovery&lt;/a&gt; -- Hashnode&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Alexandre Caramaschi is CEO of Brasil GEO (brasilgeo.ai), the first Brazilian GEO consultancy. Former CMO at Semantix (Nasdaq), co-founder of AI Brasil. More at alexandrecaramaschi.com&lt;/em&gt;&lt;/p&gt;

</description>
      <category>geo</category>
      <category>ai</category>
      <category>ecommerce</category>
      <category>agents</category>
    </item>
    <item>
      <title>The Zero-Click Economy: Why Your Content Strategy Is Optimized for a World That No Longer Exists</title>
      <dc:creator>Alexandre Caramaschi</dc:creator>
      <pubDate>Tue, 24 Mar 2026 00:49:47 +0000</pubDate>
      <link>https://dev.to/alexandrebrt14sys/the-zero-click-economy-why-your-content-strategy-is-optimized-for-a-world-that-no-longer-exists-482i</link>
      <guid>https://dev.to/alexandrebrt14sys/the-zero-click-economy-why-your-content-strategy-is-optimized-for-a-world-that-no-longer-exists-482i</guid>
      <description>&lt;p&gt;The rules of digital visibility have changed -- and most companies have not noticed.&lt;/p&gt;

&lt;p&gt;For two decades, the content marketing playbook was straightforward: create content, optimize for search engines, earn clicks, convert visitors. That playbook assumed a world where humans typed queries into Google, scanned blue links, and clicked through to websites. That world is rapidly disappearing.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Shift: The End of the Click
&lt;/h2&gt;

&lt;p&gt;According to SparkToro and Datos research published in 2024, over 60% of Google searches now end without a single click. The user gets their answer directly from the search results page -- featured snippets, knowledge panels, AI overviews. And that was before generative AI search became mainstream.&lt;/p&gt;

&lt;p&gt;In 2026, the picture is even more dramatic. ChatGPT, Perplexity, Google Gemini, Microsoft Copilot, and Claude now synthesize answers from multiple sources, delivering comprehensive responses without requiring the user to visit any website. The click -- the fundamental unit of digital marketing for 25 years -- is becoming optional.&lt;/p&gt;

&lt;p&gt;This is not a temporary disruption. It is a structural transformation of how information flows from producers to consumers.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Zero-Click Means for B2B
&lt;/h2&gt;

&lt;p&gt;The implications for B2B companies are particularly severe. Consider the typical B2B buying journey: a decision-maker identifies a need, researches solutions, evaluates vendors, and makes a selection. Traditionally, this process involved extensive Googling, visiting vendor websites, downloading whitepapers, and attending demos.&lt;/p&gt;

&lt;p&gt;Today, that same decision-maker increasingly turns to AI assistants. They ask: &lt;em&gt;What are the best enterprise data platforms for mid-market companies in Latin America?&lt;/em&gt; or &lt;em&gt;Compare the top three GRC solutions for financial services.&lt;/em&gt; The AI synthesizes information from hundreds of sources and delivers a curated answer -- often including specific vendor recommendations.&lt;/p&gt;

&lt;p&gt;If your company is not part of that synthesized answer, you do not exist in the buyer's consideration set. No amount of Google Ads spending or SEO optimization will help you if the AI models that decision-makers rely on have never learned to associate your brand with the problem you solve.&lt;/p&gt;

&lt;p&gt;McKinsey's 2025 B2B Pulse Survey found that 72% of B2B buyers now use AI tools at some stage of their purchasing process. For technology purchases, that number exceeds 85%. The buyers have not abandoned research -- they have automated it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The New Funnel: Discovery Without Clicks
&lt;/h2&gt;

&lt;p&gt;The traditional marketing funnel (Awareness, Interest, Consideration, Decision) assumed human attention at every stage. The new funnel looks fundamentally different:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 1 -- Discovery:&lt;/strong&gt; A human or AI agent poses a question. The AI engine searches its training data and, in some cases, performs real-time retrieval from the web.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 2 -- AI Synthesis:&lt;/strong&gt; The engine aggregates information from multiple sources, weighing authority, consistency, recency, and information density. It constructs a narrative answer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 3 -- Shortlist:&lt;/strong&gt; The AI presents a curated set of options -- typically 3-5 brands or solutions. This is the new search results page, except there are no positions to bid on and no ads to run.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 4 -- Contact:&lt;/strong&gt; The human decision-maker (or, increasingly, an AI agent acting on their behalf) contacts the shortlisted vendors directly. The website visit, if it happens at all, comes after the brand has already been pre-selected.&lt;/p&gt;

&lt;p&gt;Notice what is missing: the click. The entire discovery and evaluation process can happen without anyone visiting your website. Your content still matters enormously -- but not because it drives traffic. It matters because it trains the AI models that curate the shortlists.&lt;/p&gt;

&lt;h2&gt;
  
  
  From Click Optimization to Citation Optimization
&lt;/h2&gt;

&lt;p&gt;This is where Generative Engine Optimization (GEO) enters the picture. GEO is the practice of ensuring your brand, expertise, and solutions are accurately represented and recommended by AI systems -- large language models, AI search engines, and autonomous agents.&lt;/p&gt;

&lt;p&gt;Where SEO asked &lt;em&gt;How do I rank for this keyword?&lt;/em&gt;, GEO asks &lt;em&gt;How do I get cited for this topic?&lt;/em&gt; The distinction is not semantic -- it is structural.&lt;/p&gt;

&lt;p&gt;Citation optimization requires a fundamentally different approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Entity consistency:&lt;/strong&gt; Your brand information must be identical across all sources the AI might reference -- your website, Wikipedia, Crunchbase, LinkedIn, industry directories, Schema.org markup. Any inconsistency creates ambiguity, and AI models resolve ambiguity by omitting you entirely.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Information gain:&lt;/strong&gt; Your content must contain original insights, data, or frameworks that the AI cannot find elsewhere. Derivative content that rephrases what everyone else says provides zero incremental value to an AI model. Original research, proprietary data, and novel frameworks are what earn citations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Structured data:&lt;/strong&gt; AI engines process structured data (Schema.org, JSON-LD, llms.txt files) far more efficiently than unstructured prose. Companies that expose their expertise, offerings, and authority in machine-readable formats gain a significant advantage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Freshness signals:&lt;/strong&gt; AI models increasingly incorporate real-time or near-real-time data. Companies that publish consistently and recently signal ongoing relevance. A company blog last updated in 2023 is a liability, not an asset.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
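
&lt;p&gt;To make the structured-data point concrete: an Organization record in JSON-LD, embedded in a page inside a &lt;code&gt;script type="application/ld+json"&lt;/code&gt; tag, might look like this (every value below is a placeholder):&lt;/p&gt;

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Corp",
  "url": "https://example.com",
  "foundingDate": "2015",
  "founder": { "@type": "Person", "name": "Jane Founder" },
  "sameAs": [
    "https://www.linkedin.com/company/example-corp",
    "https://www.crunchbase.com/organization/example-corp"
  ],
  "description": "One consistent description, reused verbatim across every property."
}
```

&lt;p&gt;The &lt;code&gt;sameAs&lt;/code&gt; array is where entity consistency becomes machine-checkable: it tells a model exactly which external profiles refer to the same entity.&lt;/p&gt;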

&lt;h2&gt;
  
  
  Three Practical Steps to Adapt Your Content Strategy
&lt;/h2&gt;

&lt;p&gt;The transition from click-optimized to citation-optimized content does not require abandoning everything you have built. It requires reorienting your efforts around new objectives.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Audit Your AI Visibility (Week 1-2)
&lt;/h3&gt;

&lt;p&gt;Ask the five major AI engines (ChatGPT, Gemini, Perplexity, Copilot, Claude) the questions your customers ask. Document whether your brand appears in the responses, how it is described, and whether the information is accurate. This AI visibility audit is the equivalent of checking your Google rankings -- except most companies have never done it.&lt;/p&gt;

&lt;p&gt;At Brasil GEO, we have developed a systematic audit framework that tests entity recognition across these five platforms. The results are often sobering: companies with strong Google rankings frequently have zero AI visibility.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Create Citation-Worthy Content (Month 1-3)
&lt;/h3&gt;

&lt;p&gt;Restructure your content calendar around information gain. Every piece of content should contain at least one element that cannot be found elsewhere: original data, a proprietary framework, a unique case study, or a contrarian insight backed by evidence.&lt;/p&gt;

&lt;p&gt;Format this content for both human and machine consumption. Include structured data markup. Create an llms.txt file that provides AI-friendly summaries of your expertise. Publish in formats that AI engines can easily parse and cite.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Build Entity Authority (Month 3-6)
&lt;/h3&gt;

&lt;p&gt;Ensure your brand has consistent, authoritative presence across the sources that AI models trust. This includes Wikipedia (or Wikidata for emerging companies), industry directories, academic citations, press coverage in recognized publications, and consistent Schema.org markup across all your web properties.&lt;/p&gt;

&lt;p&gt;The goal is not traffic -- it is trustworthiness. AI models cite brands they can verify across multiple independent sources. Building this verification infrastructure is the new link building.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Window Is Now
&lt;/h2&gt;

&lt;p&gt;The zero-click economy is not coming -- it is here. Companies that continue to optimize exclusively for clicks are investing in a depreciating asset. The companies that will dominate their categories in 2027 and beyond are the ones building citation equity today.&lt;/p&gt;

&lt;p&gt;The good news: most of your competitors have not started. The bad news: the AI models that will shape buying decisions in 18 months are being trained on data that exists right now. Every month you delay is a month of training data where your brand is absent.&lt;/p&gt;

&lt;p&gt;The zero-click economy does not mean content is dead. It means content has a new purpose: not to attract visitors, but to train the algorithms that choose winners.&lt;/p&gt;




&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://brasilgeo.ai/v1/custo-velocidade" rel="noopener noreferrer"&gt;The Speed-Cost Equation in GEO Implementation&lt;/a&gt; -- Brasil GEO&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://caramaschi.hashnode.dev/information-gain-the-currency-of-ai-visibility" rel="noopener noreferrer"&gt;Information Gain: The Currency of AI Visibility&lt;/a&gt; -- Hashnode&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://medium.com/@alexandrecaramaschi/information-gain-why-original-content-wins-in-ai-search" rel="noopener noreferrer"&gt;Information Gain: Why Original Content Wins in AI Search&lt;/a&gt; -- Medium&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Alexandre Caramaschi is CEO of Brasil GEO (brasilgeo.ai), the first Brazilian GEO consultancy. Former CMO at Semantix (Nasdaq), co-founder of AI Brasil. More at alexandrecaramaschi.com&lt;/em&gt;&lt;/p&gt;

</description>
      <category>geo</category>
      <category>ai</category>
      <category>marketing</category>
      <category>seo</category>
    </item>
    <item>
      <title>Information Gain: The Hidden Variable That Determines Whether AI Cites Your Brand</title>
      <dc:creator>Alexandre Caramaschi</dc:creator>
      <pubDate>Tue, 24 Mar 2026 00:14:16 +0000</pubDate>
      <link>https://dev.to/alexandrebrt14sys/information-gain-the-hidden-variable-that-determines-whether-ai-cites-your-brand-33p4</link>
      <guid>https://dev.to/alexandrebrt14sys/information-gain-the-hidden-variable-that-determines-whether-ai-cites-your-brand-33p4</guid>
<description>&lt;p&gt;Large language models are rewriting how brands earn visibility. Yet most companies still approach AI optimization with the same playbook they used for traditional search: keyword density, backlink profiles, and domain authority. The result is predictable -- billions of dollars in content that AI systems quietly ignore.&lt;/p&gt;

&lt;p&gt;The missing variable is &lt;strong&gt;information gain&lt;/strong&gt; -- and understanding it may be the single most consequential shift in digital strategy since the advent of PageRank.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Excellent Content, Zero AI Visibility
&lt;/h2&gt;

&lt;p&gt;Consider a scenario that has become disturbingly common. A Fortune 500 company publishes a well-researched article on cloud migration best practices. The piece ranks on Google's first page. It earns backlinks from reputable publications. By every traditional SEO metric, it succeeds.&lt;/p&gt;

&lt;p&gt;But when a decision-maker asks ChatGPT, Perplexity, or Google's AI Overview about cloud migration strategies, the company's brand never appears in the response.&lt;/p&gt;

&lt;p&gt;This is not an edge case. Research from multiple sources suggests that &lt;strong&gt;fewer than 10% of brands that rank well in traditional search are consistently cited by generative AI systems&lt;/strong&gt;. The disconnect is not about content quality in the conventional sense. It is about whether your content teaches the model something it cannot learn elsewhere.&lt;/p&gt;

&lt;p&gt;This is the domain of Generative Engine Optimization (GEO) --- the discipline of optimizing content for visibility within AI-generated responses. And at its core, GEO is fundamentally about information gain.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is Information Gain?
&lt;/h2&gt;

&lt;p&gt;In information theory, information gain measures the reduction in uncertainty that a piece of data provides. Applied to the context of AI and content strategy, &lt;strong&gt;information gain refers to the unique informational value that a piece of content adds beyond what is already available across the model's training corpus&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Put simply: if your content restates what 500 other pages already say, its information gain is effectively zero. The model has no reason to surface it, cite it, or reference it. It is redundant.&lt;/p&gt;

&lt;p&gt;Conversely, content with high information gain introduces data points, perspectives, frameworks, or evidence that the model cannot easily find --- or synthesize --- from other sources. This is what triggers citation, attribution, and brand mention in AI-generated outputs.&lt;/p&gt;
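&lt;p&gt;To make the information-theoretic definition concrete, here is a minimal Python sketch (purely illustrative, not part of any GEO tooling) that computes information gain as the reduction in Shannon entropy between a prior and a posterior distribution over possible answers to a question:&lt;/p&gt;

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Prior: four equally likely answers to a question (maximal uncertainty, 2 bits).
prior = [0.25, 0.25, 0.25, 0.25]

# Posterior: after reading a document that strongly favors one answer.
posterior = [0.85, 0.05, 0.05, 0.05]

# Information gain = how much the document reduced our uncertainty, in bits.
info_gain = entropy(prior) - entropy(posterior)
print(f"information gain: {info_gain:.2f} bits")  # about 1.15 bits
```

&lt;p&gt;A document that merely restates what the reader (or the model) already believes leaves the posterior equal to the prior and scores zero bits --- the formal version of redundancy.&lt;/p&gt;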

&lt;p&gt;Google itself has filed patents related to information gain scoring (US patent application 2020/0349186), indicating that even traditional search engines are moving toward rewarding content that adds net-new value to the information ecosystem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Evidence: What Research Tells Us
&lt;/h2&gt;

&lt;p&gt;The most rigorous study on GEO to date comes from Princeton, Georgia Tech, The Allen Institute, and IIT Delhi. The paper &lt;strong&gt;GEO: Generative Engine Optimization&lt;/strong&gt; (Aggarwal et al., 2023) systematically tested which content optimization strategies increase visibility in generative engine responses.&lt;/p&gt;

&lt;p&gt;The findings are striking:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Adding citations and quotations from authoritative sources&lt;/strong&gt; improved visibility by approximately 30-40%.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Including relevant statistics&lt;/strong&gt; boosted visibility by up to 36.7%.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Technical terminology and specificity&lt;/strong&gt; outperformed generic, conversational content.&lt;/li&gt;
&lt;li&gt;Traditional SEO signals (keyword optimization, backlinks) showed &lt;strong&gt;minimal correlation&lt;/strong&gt; with generative engine visibility.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The researchers tested across multiple query categories --- factual, navigational, and informational --- and the pattern held consistently. Content that provided &lt;strong&gt;specific, verifiable, and unique information&lt;/strong&gt; was disproportionately favored by generative AI systems.&lt;/p&gt;

&lt;p&gt;This aligns with how large language models work at a fundamental level. During training, models develop representations of common knowledge. At inference time, they are more likely to cite sources that fill &lt;strong&gt;gaps&lt;/strong&gt; in that common knowledge --- sources with measurable information gain.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Framework for Measuring Information Gain
&lt;/h2&gt;

&lt;p&gt;Information gain is not a binary attribute. It exists on a spectrum, and it can be assessed systematically. At Brasil GEO, we evaluate content across four dimensions:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Originality --- Proprietary Data
&lt;/h3&gt;

&lt;p&gt;Does the content include data that exists nowhere else? First-party research, proprietary benchmarks, survey results from your customer base, or performance metrics from your own operations all qualify. A report stating "companies should invest in AI" has zero originality. A report stating "our analysis of 2,400 mid-market companies shows that AI adoption correlates with 23% faster revenue growth in the $50M-$200M segment" has high originality.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Specificity --- Concrete Numbers Over Generic Claims
&lt;/h3&gt;

&lt;p&gt;LLMs are trained to recognize and favor precision. Content that states "significant improvement" is less likely to be cited than content stating "41% reduction in customer acquisition cost over 8 months." Specificity is a proxy for credibility in the model's evaluation, and it dramatically increases the probability of citation.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Attribution --- Verifiable Sources
&lt;/h3&gt;

&lt;p&gt;The Princeton GEO study demonstrated that content with proper citations and source attribution outperforms unattributed claims by a wide margin. This is not merely about adding footnotes --- it is about creating a verifiable chain of evidence that the model can validate against its training data. When your content references a specific study, dataset, or expert, the model can cross-reference that claim, increasing its confidence in your source.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Freshness --- Current Data (2025-2026)
&lt;/h3&gt;

&lt;p&gt;Models have training cutoffs, but they increasingly incorporate retrieval-augmented generation (RAG) and real-time search. Content that includes recent data --- Q4 2025 results, 2026 projections, post-regulation analyses --- occupies a temporal niche that older content cannot fill. This is particularly valuable because the model's training data has a built-in recency gap that fresh content can exploit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scoring these four dimensions on a 1-5 scale gives you a practical Information Gain Score (IGS) for any piece of content.&lt;/strong&gt; In our experience, content scoring 16 or above (out of 20) consistently earns AI citations within 60-90 days of publication.&lt;/p&gt;
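&lt;p&gt;That scoring rubric is simple enough to automate. The sketch below takes the four dimension names and the 16-out-of-20 threshold directly from the framework described above; the function and variable names are illustrative, not an existing tool:&lt;/p&gt;

```python
# Information Gain Score (IGS): four dimensions, each scored 1-5, summed to a
# total out of 20, with 16+ as the threshold for citation-worthy content.
DIMENSIONS = ("originality", "specificity", "attribution", "freshness")

def information_gain_score(scores):
    """Sum the four 1-5 dimension scores into an IGS out of 20."""
    for dim in DIMENSIONS:
        value = scores[dim]
        if not 1 <= value <= 5:
            raise ValueError(f"{dim} must be scored 1-5, got {value}")
    return sum(scores[dim] for dim in DIMENSIONS)

def likely_to_earn_citations(scores, threshold=16):
    """Apply the 16-out-of-20 threshold for content expected to earn citations."""
    return information_gain_score(scores) >= threshold

draft = {"originality": 5, "specificity": 4, "attribution": 4, "freshness": 4}
print(information_gain_score(draft), likely_to_earn_citations(draft))  # 17 True
```

&lt;p&gt;Keeping each dimension on the same 1-5 scale means no single dimension can compensate for the others: a piece with perfect freshness but no originality still falls short of the threshold.&lt;/p&gt;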

&lt;h2&gt;
  
  
  Practical Implementation: Five Tactics That Work
&lt;/h2&gt;

&lt;p&gt;Moving from theory to execution, here are five approaches that reliably increase information gain:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Conduct Original Research
&lt;/h3&gt;

&lt;p&gt;Survey your customers, analyze your transaction data, or benchmark your industry. A SaaS company that publishes "The State of API Security: Analysis of 1.2 Billion API Calls in 2025" creates a citation magnet that no competitor can replicate. The investment is significant; the information gain is unmatched.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Publish Proprietary Benchmarks
&lt;/h3&gt;

&lt;p&gt;If you operate at scale, you sit on data that the rest of the market can only estimate. Companies like Cloudflare (internet traffic reports), Stripe (economic indices), and HubSpot (marketing benchmarks) have turned operational data into authoritative references that LLMs cite repeatedly. You do not need to be a tech giant --- any company with meaningful operational data can establish benchmark authority in its vertical.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Document Case Studies with Real Metrics
&lt;/h3&gt;

&lt;p&gt;Generic case studies ("Client X improved performance") are invisible to AI. Detailed case studies with specific metrics ("Client X reduced infrastructure costs by $2.3M annually by migrating 847 microservices to a serverless architecture over 14 months") become reference material. The specificity is what creates information gain.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Develop Frameworks with Original Nomenclature
&lt;/h3&gt;

&lt;p&gt;When you create a named framework --- a structured methodology with a distinctive label --- you create something the model can reference as a discrete concept. Examples include Brasil GEO's Score 6D evaluation model, McKinsey's Three Horizons of Growth, or Gartner's Magic Quadrant. Named frameworks become citable entities in their own right.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Share First-Hand Operational Data
&lt;/h3&gt;

&lt;p&gt;Nothing has higher information gain than data from direct experience. If you ran an A/B test, share the numbers. If you managed a transformation program, share the timeline, costs, and outcomes. First-hand data is, by definition, unique --- and uniqueness is the foundation of information gain.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Paradox of Optimization
&lt;/h2&gt;

&lt;p&gt;Here is the counterintuitive truth that many organizations miss: &lt;strong&gt;optimizing for AI without information gain is worse than not optimizing at all.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When companies attempt GEO by simply reformatting existing generic content --- adding structured data, tweaking headers, inserting keywords --- they produce content that is technically optimized but informationally empty. LLMs are remarkably effective at detecting this pattern. A model trained on billions of documents develops an implicit sense of redundancy. Content that reads like a synthesis of existing sources, regardless of how well it is formatted, will be treated as a synthesis --- not as a source.&lt;/p&gt;

&lt;p&gt;This creates a paradox: the more companies optimize without investing in original insight, the more they train AI systems to associate their brand with derivative content. Over time, this can actually &lt;strong&gt;decrease&lt;/strong&gt; brand visibility in AI responses.&lt;/p&gt;

&lt;p&gt;The solution is not to optimize less, but to &lt;strong&gt;invest in having something worth optimizing&lt;/strong&gt;. Information gain is not a content format --- it is a content strategy. It requires investment in research, data collection, and genuine expertise.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion: The Most Undervalued Metric in GEO
&lt;/h2&gt;

&lt;p&gt;Information gain is, in our assessment, the most undervalued metric in Generative Engine Optimization today. While the industry debates technical signals --- schema markup, citation formats, content structure --- the fundamental question remains deceptively simple: &lt;strong&gt;does your content teach the AI something new?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The companies that will dominate AI visibility over the next three to five years are not those with the largest content teams or the most sophisticated technical SEO. They are the companies that systematically invest in proprietary data, original research, and first-hand evidence.&lt;/p&gt;

&lt;p&gt;This is an asymmetric opportunity. The barrier to entry is not technical --- it is organizational. Most companies have proprietary data they never publish, expertise they never formalize, and insights they never quantify. The gap between having information gain and deploying it for AI visibility is primarily a strategic one.&lt;/p&gt;

&lt;p&gt;For leaders evaluating their GEO strategy, the first question should not be "how do we optimize our content for AI?" It should be: &lt;strong&gt;what do we know that no one else does --- and how do we make it findable?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The answer to that question is your information gain. And increasingly, it is the single variable that determines whether AI cites your brand or cites your competitor.&lt;/p&gt;




&lt;h3&gt;
  
  
  References
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Aggarwal, P., Murahari, V., Rajpurohit, T., Kalyan, A., Narasimhan, K., &amp;amp; Deshpande, A. (2023). &lt;em&gt;GEO: Generative Engine Optimization&lt;/em&gt;. arXiv:2311.09735.&lt;/li&gt;
&lt;li&gt;Google, US patent application 2020/0349186 --- Information Gain Scoring for Search Results.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Alexandre Caramaschi is CEO of &lt;a href="https://brasilgeo.ai" rel="noopener noreferrer"&gt;Brasil GEO&lt;/a&gt;, the first Brazilian consultancy specializing in Generative Engine Optimization. Former CMO at Semantix (Nasdaq), co-founder of AI Brasil. More at &lt;a href="https://alexandrecaramaschi.com" rel="noopener noreferrer"&gt;alexandrecaramaschi.com&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://brasilgeo.ai/conteudos/artigos/comparativo-geo-panel-rank-semrush-profound-2026.html" rel="noopener noreferrer"&gt;GEO platform comparison 2026&lt;/a&gt; — Full article on Brasil GEO&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://medium.com/@alexandre.brt14/17c9c2e68da1" rel="noopener noreferrer"&gt;Information Gain: A Variável Oculta (PT-BR)&lt;/a&gt; — Portuguese version on Medium&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>geo</category>
      <category>ai</category>
      <category>seo</category>
      <category>marketing</category>
    </item>
  </channel>
</rss>
