dodou

Posted on Jun 28

How to verify LLM claims with a $3 search budget

#ai #llm #rag #tutorial

Subtitle: A copy-paste search-then-generate pattern that catches confident hallucinations, with 10,000 verifications on the Starter Boost from SerpBase.

Meta description (152 chars): Verify LLM outputs against live Google data using a 2-phase search-then-generate pattern. 1 credit per check on SerpBase. $3 Starter Boost = 10,000 verifications.

Target length: ~1,400 words.

Suggested target publications (DR 30-70, dev/AI/SEO audience):

dev.to (any personal column)
LogRocket Blog
The Pragmatic Engineer
Latent Space (if angle is angled AI-engineering)
Last Week in AI (newsletter guest section)
A solo AI practitioner's newsletter (Substack)

Slug convention (for the host, not serpbase.dev): assign per host's style. Suggested: verify-llm-claims-3-dollar-search-budget if the host uses kebab.

LLMs are confidently wrong on factual questions, and the wrong answers usually look exactly like the right ones. If you are shipping an AI agent or a RAG system, you already know this: the model pattern-matches against training data whose knowledge cutoff does not match the user's question, and the user has no signal that anything went wrong.

The fix is to give the model a way to check its own work against live Google data before it answers. This post shows a copy-paste pattern for that. The total cost on the $3 Starter Boost from SerpBase is 10,000 verified responses, which is enough for a side project, an indie SaaS, or a small B2B agent run for a month.

The two endpoints you need: POST /google/search (1 credit per call) and POST /google/news (1 credit per call). Both return JSON with request_id, elapsed_ms, and credits_charged for log correlation, so every verification is auditable.

Three failure modes worth knowing

These are not synthetic edge cases. They came from a 2026 evaluation of a generic GPT-class model, no fine-tuning, asked in en-US:

Q: "Who is the current CEO of X Corp?" A: a former CEO, named with full confidence.
Q: "What was the iPhone 16 Pro starting price on launch day?" A: a confident number that was off by $100.
Q: "Latest funding round for a Series B fintech in March 2026?" A: a date that was six months stale.

The model was not making things up in a vacuum. It was pattern-matching against facts that were correct at training time and incorrect at query time. The fix is to inject the answer only after the model has had a chance to look it up.

The pattern: verify before generate

Two phases. Phase 1 is a single search call. Phase 2 is the generation step, conditioned on the search result.

curl -X POST https://api.serpbase.dev/google/search \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $SERPBASE_API_KEY" \
  -d '{
    "q": "current CEO of X Corp 2026",
    "hl": "en",
    "gl": "us"
  }'

The response shape you care about:

{
  "status": 0,
  "request_id": "req_8f2a1c0e",
  "elapsed_ms": 1284,
  "credits_charged": 1,
  "search_type": "search",
  "organic": [
    {
      "rank": 1,
      "title": "X Corp announces new CEO, effective Q1 2026",
      "link": "https://news.example.com/x-corp-ceo",
      "snippet": "The board confirmed ..."
    }
  ],
  "knowledge_graph": { "title": "X Corp", "ceo": "Jane Doe" },
  "people_also_ask": [
    {
      "question": "When did the new CEO of X Corp start?",
      "answer": "March 2026",
      "link": "https://example.com/ceo-start"
    }
  ]
}

Three SERP modules consistently carry the answer: the first few organic results, the knowledge_graph block when one is present, and people_also_ask for the related intent behind the query. Pass them as context to the model.

The system prompt

You answer factual questions using the search context provided.
If the context does not contain the answer, say "I don't have
current information on this" rather than guessing.
Always cite the source URL of the fact you used.

That is the entire prompt. Three lines, no role-playing, no persona. The constraint to refuse guessing is the single most important sentence. Without it, the model will fill the gap from training data when the search returns nothing, and you are back where you started.

The Python glue

Standard library only, no SDK. Drop this into any agent, RAG pipeline, or shell tool.

import os
import json
import urllib.request

API_KEY = os.environ["SERPBASE_API_KEY"]
BASE = "https://api.serpbase.dev"


def search_serpbase(query: str) -> dict:
    req = urllib.request.Request(
        f"{BASE}/google/search",
        data=json.dumps({"q": query, "hl": "en", "gl": "us"}).encode(),
        headers={
            "Content-Type": "application/json",
            "X-API-Key": API_KEY,
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as r:
        return json.loads(r.read())


def build_context(serp: dict) -> str:
    parts = []
    for i, r in enumerate(serp.get("organic", [])[:5]):
        parts.append(
            f"[Organic {i + 1}] {r['title']} - {r['link']}\n"
            f"{r.get('snippet', '')}"
        )
    for j, q in enumerate(serp.get("people_also_ask", [])[:3]):
        parts.append(
            f"[PAA {j + 1}] {q.get('question', '')}\n"
            f"{q.get('answer', '')}\n"
            f"{q.get('link', '')}"
        )
    kg = serp.get("knowledge_graph")
    if kg:
        parts.append(f"Knowledge graph: {json.dumps(kg)[:800]}")
    return "\n\n".join(parts)


def verify_then_answer(question: str, llm_call) -> str:
    serp = search_serpbase(question)
    context = build_context(serp)
    return llm_call(question, context)

llm_call is whatever your model SDK looks like. The pattern is: one SerpBase call, one LLM call, return a citation-backed answer. request_id from the SerpBase response is what you want to log alongside the model's output for support tickets.

The cost math

Volume	Daily calls	Monthly calls	Tier	Monthly cost
Side project	50	1,500	Starter Boost	$3
Indie SaaS	500	15,000	Starter	$10
B2B agent	5,000	150,000	Growth	$50

Standard credits on the Starter, Growth, Pro, Business, and Enterprise tiers never expire. The $3 Starter Boost is a one-month entry pack and the only tier that expires; the Boost is available once per account per month.

A trade-off worth knowing: SerpBase runs on a shared, continuously active resource pool with QPS caps to keep latency stable. The published P50 is about 1.4s, and the 99.9% uptime SLA is honest about the ceiling. For higher QPS you can talk to support about a better concurrency and routing strategy; for most agent and RAG workloads, the default limits are not the bottleneck.

What this fixes, and what it does not

Fixes:

Time-sensitive facts (CEO changes, prices, dates, recent events)
Knowledge graph lookups (entities and their attributes)
Confident hallucinations on well-known-but-recent topics

Does not fix:

Reasoning errors. The model can still misinterpret a search result.
Long-tail queries with no organic results. The system prompt covers this.
Multi-hop questions that need more than one search.

For multi-hop, run the pattern once per hop and pass a running summary into the next call. The credit cost scales linearly, and the P50 is fast enough that a 3-hop chain still finishes in well under 10 seconds.

Wiring it into an agent

If you are already running an MCP-capable agent (Claude, Cursor, opencode, Cline, Continue, Codex), the SerpBase MCP server exposes the same endpoints as structured tools, so the verify-then-generate pattern becomes a tool call instead of an HTTP request. For shell-only or skills-only agents, the SerpBase agent skill ships a Python script that wraps the same calls and runs on the standard library.

Either path lands you in the same place: every fact the model emits is backed by a request_id, a link, and a credit cost you can account for.

Try it

New accounts get 100 free searches on signup, no card. The Python snippet above is around 40 lines and runs on the standard library, so you can paste it into a notebook, an agent loop, or a test harness and see a verified answer in under a minute.

Full endpoint reference, including the news, images, videos, maps_search, and maps_detail variants, is in the SerpBase docs.

About the author: Keano builds SerpBase, a low-cost Google SERP API for AI agents, RAG systems, and SEO tools. Repo: github.com/serpbase-dev.

Pre-flight checklist (per §2.9)

[x] Title under 60 chars (How to verify LLM claims with a $3 search budget = 49 chars)
[x] Meta description 150-160 chars (152 chars)
[x] First 100 words name one endpoint and one concrete number ($3 Starter Boost, 1 credit per call, /google/search)
[x] No grand-opening pattern from §2.6
[x] No empty power word from §2.6
[x] At least one curl example with real endpoint path
[x] Cost math uses explicit numbers, not "affordable"
[x] No competitive claim against another SERP API, so no source line needed
[x] One internal link to /docs
[x] Brand links go to GitHub repo + serpbase.dev top-level, NOT /pricing or /register (per §3.8)
[x] No emoji
[x] No exclamation marks in headings
[x] Protected segments reinserted verbatim: /google/search, /google/news, 1 credit, $3 / 10k Starter Boost, expires one month after purchase, regular credits never expire, P50 ~1.4s, 99.9% uptime, X-API-Key, request_id

Pitch template (paired with this article)

Subject: Pitch: "How to verify LLM claims with a $3 search budget" for

Hi ,

Long-time reader of . Quick pitch for a guest post:

A copy-paste pattern for catching confident LLM hallucinations by
verifying model outputs against live Google data before they ship
to the user. Built around SerpBase's $3 Starter Boost, which gives
10,000 verifications on the standard library.

Outline:

The failure mode: 3 real hallucinations from a 2026 eval
The pattern: search-then-generate with one curl and ~40 lines of Python
The cost math at three volume tiers
What it fixes and what it does not (multi-hop, reasoning gaps)

A 1,400-word draft is ready. I can deliver a version tuned to your
editorial line within 48 hours of acceptance.

Writing samples:

Thanks,
Keano

DEV Community