Bright U. Emmanuel

Posted on Mar 10

What Does an AI Actually See When It Reads Your Website? I Built a Tool to Find Out

#go #ai #beginners #webdev

You've spent hours perfecting your portfolio. The colors, the layout, the animations.

But when an AI recruiter bot visits it, it doesn't see your CSS. It sees a mangled wall of raw text.

The Gap Nobody Talks About

As developers, we build for human eyes. We see rendered DOMs, perfectly spaced flexboxes, and interactive state changes.

But a massive portion of web traffic today isn't human. It’s AI scrapers, summary bots, and automated recruiters. These agents don't see your animations. They see raw text nodes, aria labels, and mangled markup.

This gap between human rendering and machine reading is why "optimizing for AI" is becoming its own discipline. If an AI agent can't understand your personal site, it can't recommend you.

So, I built a Go server that shows you exactly what an AI sees when it visits any URL—and streams the AI's honest, brutal summary back to you live.

How AI Bots Actually Read Websites

Many AI scrapers do not execute JavaScript. Firing up a headless Chromium instance for every URL is expensive, so they fetch the raw HTML instead.

What survives this scrape? Text nodes, alt attributes, meta tags, and semantic headings.

What disappears? CSS, animations, and crucially, client-side rendered content.

If your portfolio is a React Single Page Application (SPA) with no Server-Side Rendering (SSR), an AI bot essentially sees a blank page with a <div id="root"></div> and a script tag. All your hard work is invisible to the machine.
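One rough way to flag this situation, sketched under assumptions: count the words that survive tag stripping and treat a near-empty result as a likely client-rendered shell. The function name `looksLikeEmptyShell` and the 20-word threshold are illustrative choices, not part of the tool described here.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeEmptyShell reports whether the text left after tag stripping
// is too short to describe a real page. minWords is a hypothetical
// threshold; tune it against pages you know.
func looksLikeEmptyShell(extracted string, minWords int) bool {
	return len(strings.Fields(extracted)) < minWords
}

func main() {
	// Typical fallback text of a client-rendered app (9 words).
	spa := "You need to enable JavaScript to run this app."
	// Stand-in for a server-rendered page (70 words).
	static := strings.Repeat("Quotes by famous authors, tagged by topic. ", 10)

	fmt.Println(looksLikeEmptyShell(spa, 20))    // true
	fmt.Println(looksLikeEmptyShell(static, 20)) // false
}
```

A word count is a blunt signal. A short landing page can trip it too, so treat a positive result as a hint to inspect the raw HTML rather than as a verdict.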

What We Are Building

We are building a lightweight Go server to simulate this exact process.

The Architecture:

Browser → GET /analyze?url=<target> (EventSource)
             ↓
        Go fetches raw HTML
             ↓
        Strip tags, extract text
             ↓
        Pipe to OpenRouter API (Free LLMs)
             ↓
        Stream response via SSE
             ↓
        Browser renders live

We will use the standard net/http package to fetch the target URL, golang.org/x/net/html to strip the noise, and OpenRouter's free API to act as our AI agent. To make it feel alive, we'll stream the AI's judgment back to the browser using Server-Sent Events (SSE).

Step 1 — Fetching and Stripping the Target

First, we need to fetch the target URL and rip out the visual noise.

An AI doesn't care about your <nav>, your inline <style>, or your Google Analytics <script>. We use an HTML parser to walk the DOM tree recursively and extract only the meaningful text nodes.

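import (
    "strings"

    "golang.org/x/net/html"
)

// extractText walks the HTML tree and collects text nodes, skipping
// subtrees that carry no page content (scripts, styles, nav, footers).
func extractText(n *html.Node) string {
    var sb strings.Builder
    collectText(n, &sb)
    return strings.TrimSpace(sb.String())
}

// collectText appends the text under n to sb, depth first.
func collectText(n *html.Node, sb *strings.Builder) {
    switch n.Type {
    case html.TextNode:
        // Skip whitespace-only nodes so source indentation doesn't pad the output
        if t := strings.TrimSpace(n.Data); t != "" {
            sb.WriteString(t)
            sb.WriteByte(' ')
        }
        return
    case html.ElementNode:
        // Strip the noise: scripts, styles, navigation, footers
        switch n.Data {
        case "script", "style", "nav", "footer":
            return
        }
    }
    for c := n.FirstChild; c != nil; c = c.NextSibling {
        collectText(c, sb)
    }
}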

By the end of this function, a beautiful website is reduced to a single, dense string of raw text.
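The fetch half of this step isn't shown in the article, so here is a minimal sketch of how it might look using only the standard library. The names `validateTarget` and `fetchHTML`, the 2 MiB body cap, and the User-Agent string are all assumptions made for illustration; the actual tool may do this differently. The demo in `main` points at a local test server so it runs offline.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
	"time"
)

// maxBodyBytes caps how much HTML is read; 2 MiB is an arbitrary choice.
const maxBodyBytes = 2 << 20

// validateTarget accepts only absolute http(s) URLs that name a host.
func validateTarget(raw string) (*url.URL, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, err
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return nil, errors.New("only http and https URLs are supported")
	}
	if u.Host == "" {
		return nil, errors.New("URL has no host")
	}
	return u, nil
}

// fetchHTML downloads the raw markup. No JavaScript runs, which is the point.
func fetchHTML(ctx context.Context, raw string) ([]byte, error) {
	u, err := validateTarget(raw)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("User-Agent", "ai-web-reader-sketch/0.1") // hypothetical value

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	// LimitReader stops a huge page from exhausting memory.
	return io.ReadAll(io.LimitReader(resp.Body, maxBodyBytes))
}

func main() {
	// A local test server stands in for the target site.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "<html><body><h1>Hello</h1></body></html>")
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	body, err := fetchHTML(ctx, srv.URL)
	if err != nil {
		fmt.Println("fetch failed:", err)
		return
	}
	fmt.Printf("fetched %d bytes\n", len(body))
}
```

The returned bytes can be handed to an HTML parser and then to extractText. A publicly hosted version would probably also want to reject private and loopback addresses, which this sketch does not attempt.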

Step 2 — Piping to the AI

Next, we hand that stripped text over to an LLM.

Because we accept arbitrary URLs, we also truncate the text before sending it to the API, so a massive page doesn't blow past the free-tier token limits.

I'm using OpenRouter because they offer free tiers for powerful models like Llama 3 and Mistral. Because free APIs can occasionally time out, we pass an array of models, which tells OpenRouter to fall back to the next model automatically if one fails.

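import (
    "bytes"
    "encoding/json"
    "net/http"
    "os"
    "time"
    "unicode/utf8"
)

// openRouterClient bounds only the wait for response headers. Setting
// Client.Timeout would also cut off the streamed body once it elapsed.
var openRouterClient = func() *http.Client {
    t := http.DefaultTransport.(*http.Transport).Clone()
    t.ResponseHeaderTimeout = 15 * time.Second
    return &http.Client{Transport: t}
}()

func callOpenRouter(text string) (*http.Response, error) {
    // Truncate to avoid exceeding LLM token limits (byte length is a rough
    // proxy), backing up so we never split a multi-byte UTF-8 character
    const maxBytes = 4000
    if len(text) > maxBytes {
        cut := maxBytes
        for cut > 0 && !utf8.RuneStart(text[cut]) {
            cut--
        }
        text = text[:cut]
    }

    url := "https://openrouter.ai/api/v1/chat/completions"
    prompt := "Analyze this raw website text as an AI scraper. Summarize the purpose, who it is for, and note if it looks like a blank JS/React app. Be direct. Text: " + text

    reqBody := map[string]interface{}{
        "models": []string{
            "google/gemma-3-27b-it:free",
            "meta-llama/llama-3.3-70b-instruct:free",
            "mistralai/mistral-7b-instruct:free",
        },
        "messages": []map[string]string{{"role": "user", "content": prompt}},
        "stream":   true, // We want the data live!
    }

    jsonBody, err := json.Marshal(reqBody)
    if err != nil {
        return nil, err
    }
    req, err := http.NewRequest(http.MethodPost, url, bytes.NewBuffer(jsonBody))
    if err != nil {
        return nil, err
    }

    // Ensure you set your API key in your environment variables!
    req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENROUTER_API_KEY"))
    req.Header.Set("Content-Type", "application/json")

    return openRouterClient.Do(req) // The handler closes the body when streaming ends
}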

Step 3 — Streaming the Response (SSE)

We don't want the user staring at a loading spinner for 10 seconds. We want the AI's text to type out live on the screen as it reads the site.

To do this, we use Server-Sent Events (SSE).

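We set the required headers, assert that our ResponseWriter is an http.Flusher, and relay each event from the OpenRouter response to the client. OpenRouter's stream is already SSE-formatted, so every payload line arrives prefixed with data:. We strip that prefix before re-wrapping the payload; otherwise the browser would receive data: data: {...} and JSON.parse would fail on every chunk.

func analyzeHandler(w http.ResponseWriter, r *http.Request) {
    // Check streaming support before doing any work
    flusher, ok := w.(http.Flusher)
    if !ok {
        http.Error(w, "Streaming unsupported", http.StatusInternalServerError)
        return
    }

    // ... Fetch and clean target URL text here ...

    aiResponse, err := callOpenRouter(cleanText)
    if err != nil || aiResponse == nil {
        http.Error(w, "Failed to reach AI API", http.StatusBadGateway)
        return
    }
    defer aiResponse.Body.Close()
    if aiResponse.StatusCode != http.StatusOK {
        http.Error(w, "AI API returned an error", http.StatusBadGateway)
        return
    }

    w.Header().Set("Content-Type", "text/event-stream")
    w.Header().Set("Cache-Control", "no-cache")
    w.Header().Set("Connection", "keep-alive")
    w.Header().Set("X-Accel-Buffering", "no") // Prevents Nginx from holding the stream

    // Read the AI's streaming response and push it to the browser
    scanner := bufio.NewScanner(aiResponse.Body)
    for scanner.Scan() {
        line := scanner.Text()
        // Skip blank separators, SSE comments, and any non-data fields
        if !strings.HasPrefix(line, "data:") {
            continue
        }
        payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
        if payload == "[DONE]" {
            break // End-of-stream sentinel, not JSON
        }

        fmt.Fprintf(w, "data: %s\n\n", payload)
        flusher.Flush() // Push immediately, don't wait!
    }
}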
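For a closer look at what travels over that stream, here is a minimal sketch of decoding a single upstream line in Go. It assumes the OpenAI-style chunk shape that the frontend in Step 4 reads (`choices[0].delta.content`) and a `data: [DONE]` sentinel at the end of the stream. The `chunk` type and `parseSSELine` function are names invented for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// chunk mirrors only the fields of a streamed payload that we read.
type chunk struct {
	Choices []struct {
		Delta struct {
			Content string `json:"content"`
		} `json:"delta"`
	} `json:"choices"`
}

// parseSSELine decodes one line of the upstream stream. It returns the text
// delta (possibly empty) and done=true once the "[DONE]" sentinel arrives.
func parseSSELine(line string) (content string, done bool, err error) {
	line = strings.TrimSpace(line)
	// Blank lines separate events, lines starting with ":" are SSE comments,
	// and fields other than "data:" carry nothing we need here.
	if !strings.HasPrefix(line, "data:") {
		return "", false, nil
	}
	payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
	if payload == "[DONE]" {
		return "", true, nil
	}
	var c chunk
	if err := json.Unmarshal([]byte(payload), &c); err != nil {
		return "", false, err
	}
	if len(c.Choices) == 0 {
		return "", false, nil
	}
	return c.Choices[0].Delta.Content, false, nil
}

func main() {
	// Hand-written sample lines in the assumed wire format.
	stream := []string{
		": keep-alive comment",
		`data: {"choices":[{"delta":{"content":"Hel"}}]}`,
		"",
		`data: {"choices":[{"delta":{"content":"lo"}}]}`,
		"",
		"data: [DONE]",
	}

	var sb strings.Builder
	for _, line := range stream {
		content, done, err := parseSSELine(line)
		if err != nil {
			fmt.Println("bad chunk:", err)
			continue
		}
		if done {
			break
		}
		sb.WriteString(content)
	}
	fmt.Println(sb.String()) // Hello
}
```

Decoding on the server like this is optional. It would let the handler send plain text to the browser instead of raw JSON chunks, at the cost of a little more Go code.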

Step 4 — The Minimal Frontend

Because we are using SSE, the frontend doesn't need React, npm, or heavy libraries. We just need a single HTML file using the browser's native EventSource API.

<script>
    function analyze() {
        const url = document.getElementById('urlInput').value;
        const output = document.getElementById('output');
        output.innerText = "Analyzing site content...\n\n";

        // Native browser API for Server-Sent Events
        const evtSource = new EventSource('/analyze?url=' + encodeURIComponent(url));

        evtSource.onmessage = function(event) {
            // Append chunks to the DOM as they arrive
            let data;
            try {
                data = JSON.parse(event.data);
            } catch (e) {
                return; // Skip anything that isn't a JSON chunk
            }
            const content = data.choices?.[0]?.delta?.content;
            if (content) {
                output.innerText += content;
            }
        };

        evtSource.onerror = function() {
            // The server closing the connection surfaces as an error event.
            // Closing here stops EventSource from reconnecting and re-running the analysis.
            evtSource.close();
        };
    }
</script>

The "Aha" Moment — Testing Real Sites

I ran this tool against two different types of websites to see what the AI would report.

Test 1: A Plain HTML Site (`quotes.toscrape.com`)

Because this site serves standard HTML, the AI easily read the structure. It accurately identified it as a collection of famous quotes, recognized the tags and authors, and deduced that there might be user accounts. It perfectly understood the context of the page.

Test 2: A Client-Side React SPA (`discord.com/app`)

I pointed the scraper at the Discord web app. Because it relies heavily on client-side rendering, the AI absolutely roasted it. The summary read: "This looks like a very basic web page... it's possible someone started a project intending to integrate with Discord." The AI confidently concluded it was a blank app showing a default JavaScript error.

What This Means for Your Portfolio

You don't need to rewrite your entire stack, but you do need to ensure your content survives the scrape.

Here is the 2026 checklist for AI visibility:

[ ] Meta description: This survives every scraper. Make it highly descriptive.
[ ] Semantic HTML: Use <h1>, <article>, and <section>. AI understands structure.
[ ] Alt text: Your image alt attributes are doing heavy lifting for your project screenshots.
[ ] SSR / Static Fallbacks: If you use React or Vue, ensure your critical text (your name, your skills) is rendered on the server, not just in the client.
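[ ] Don't put content in CSS: text generated by CSS-only pseudo-elements (content: "...") never appears in the HTML, so a scraper that skips stylesheets misses it. Text hidden with display: none is the opposite case: it is still in the markup, so a simple scraper may read it even though visitors never see it.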
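
Part of this checklist can be spot-checked automatically. The sketch below runs a few regular expressions over raw HTML to look for a meta description, an h1, and images without alt text. It is a heuristic under stated assumptions: the `audit` type and `auditHTML` function are invented for this example, and regex matching can misjudge unusual markup, so a real checker would use an HTML parser.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	imgTag   = regexp.MustCompile(`(?i)<img\b[^>]*>`)
	altAttr  = regexp.MustCompile(`(?i)\balt\s*=`)
	metaDesc = regexp.MustCompile(`(?i)<meta\b[^>]*name\s*=\s*["']?description`)
	h1Tag    = regexp.MustCompile(`(?i)<h1\b`)
)

// audit holds the outcome of a few checklist items.
type audit struct {
	HasMetaDescription bool
	HasH1              bool
	ImagesMissingAlt   int
}

// auditHTML inspects raw markup, the same input a non-rendering scraper gets.
func auditHTML(raw string) audit {
	a := audit{
		HasMetaDescription: metaDesc.MatchString(raw),
		HasH1:              h1Tag.MatchString(raw),
	}
	// Count <img> tags that carry no alt attribute at all.
	for _, tag := range imgTag.FindAllString(raw, -1) {
		if !altAttr.MatchString(tag) {
			a.ImagesMissingAlt++
		}
	}
	return a
}

func main() {
	// Placeholder page; the name and file names are made up.
	page := `<html><head><meta name="description" content="Go developer portfolio"></head>
<body><h1>Jane Doe</h1><img src="a.png" alt="Project screenshot"><img src="b.png"></body></html>`

	fmt.Printf("%+v\n", auditHTML(page))
	// {HasMetaDescription:true HasH1:true ImagesMissingAlt:1}
}
```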

Try It Yourself!

The web has always been a mix of human and machine readers. AI just made the machine reader smarter and much more consequential. Building with AI agents in mind isn't about "gaming the system." It’s just fundamentally good, accessible web development.

Want to see what an AI thinks of your portfolio?

Live Demo: https://ai-web-reader.pxxl.click

(Note: The AI in this live demo is powered by a free-tier API. If it times out or errors, it means my free credits are currently fighting for their life. Just try again in a few seconds, or clone the GitHub repo below!)

Full Source Code:

brighto7700 / ai-web-reader

A lightweight Go server that fetches raw HTML and uses LLMs to show exactly what bots and scrapers see when they visit your website.

View on GitHub

(How did your site do? Did the AI completely miss your projects? Let me know below!)

Top comments (3)

Bright U. Emmanuel • Mar 10

I’m genuinely curious: has anyone here ever actually optimized their personal site specifically for AI recruiter bots before? Or are we all just building for human eyes and hoping for the best? 🤔
Let me know how your site holds up in the live demo.

Godwin Edward • Mar 10

Great article, Bright. We talk so much about SEO, but optimizing for AI agents is clearly the new standard. Love that the 'fix' is really just a return to good, accessible web fundamentals.

Bright U. Emmanuel • Mar 10

Spot on, Godwin! 💯 That was my biggest realization while building this tool. It's ironic that after years of complex SEO tricks, the best way to impress an AI is just writing proper semantic HTML like it's 2005 again. 😂 Thanks for reading!
