The problem: I needed my LLM to answer questions about current events, not just repeat what it knew at its training cutoff.
The constraint: I didn't want to manage proxy pools, CAPTCHAs, or HTML parsers.
The result: A simple research agent that searches the web in real-time and synthesizes answers. Eighty lines. Two API keys. Zero infrastructure.
Here's how it works.
The Architecture
User Question → [Search needed?] → No  → LLM answers directly
                                 → Yes → SERP API → structured results → LLM synthesis
Three pieces: a search API, a decision layer, and the LLM.
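The flow above can be sketched as one small routing function. This is an illustration rather than the code from the project: `route_question` and the three callables it receives are hypothetical names, and the prompt wording is an assumption.

```python
from typing import Callable, Dict, List

SearchResult = Dict[str, str]


def route_question(
    question: str,
    needs_search: Callable[[str], bool],
    search: Callable[[str], List[SearchResult]],
    llm: Callable[[str], str],
) -> str:
    """Answer directly, or search first and let the LLM synthesize.

    All three callables are placeholders (assumptions for this sketch):
    needs_search is the decision layer, search wraps the SERP API,
    and llm sends a prompt to the model and returns its text.
    """
    if not needs_search(question):
        # "No" branch: the model answers from its own knowledge.
        return llm(question)

    # "Yes" branch: fetch structured results, then ask for a synthesis.
    results = search(question)
    context = "\n".join(
        f"[{n}] {r['title']} ({r['link']}): {r['snippet']}"
        for n, r in enumerate(results, start=1)
    )
    prompt = (
        "Answer the question using only these search results.\n"
        f"{context}\n\nQuestion: {question}"
    )
    return llm(prompt)
```

Passing the pieces in as callables keeps each one swappable, and it lets the routing be exercised with stubs before any API key is involved.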
The SERP API Piece
I needed structured Google results. I evaluated a few SERP APIs and landed on Talordata: the deciding factor was that it uses the same interface spec as SerpApi. I had existing code from a previous project; switching meant changing one URL and one API key, with no other code changes.
The integration is a single function:
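import requests

def search_web(query, api_key):
    """Return the top five organic Google results as title/link/snippet dicts."""
    params = {"q": query, "engine": "google", "api_key": api_key}
    resp = requests.get("https://serpapi.talordata.net/serp/v1/request", params=params, timeout=10)
    resp.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
    return [{"title": i["title"], "link": i["link"], "snippet": i.get("snippet", "")}  # some results have no snippet
            for i in resp.json().get("organic_results", [])[:5]]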
Returns clean JSON. No HTML parsing, no proxy management, no CAPTCHAs.
The Decision Layer
Not every question needs a live search. Asking "what's a Python decorator" should skip search entirely.
I added a simple heuristic: if the query contains keywords like 2026, latest, compare, best, price, it triggers search. Otherwise, the LLM answers directly.
This catches about 85% of search-worthy queries with near-zero false positives on factual questions.
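A keyword heuristic like this might look as follows. It is a minimal sketch, assuming whole-word, case-insensitive matching; `needs_search` and `SEARCH_KEYWORDS` are hypothetical names, and the keyword list contains only the examples given above.

```python
import re

# Keywords taken from the examples in the text; extend as needed.
SEARCH_KEYWORDS = ("2026", "latest", "compare", "best", "price")

# \b keeps "best" from matching inside words such as "bestow".
_SEARCH_PATTERN = re.compile(
    r"\b(?:" + "|".join(map(re.escape, SEARCH_KEYWORDS)) + r")\b",
    re.IGNORECASE,
)


def needs_search(query: str) -> bool:
    """Return True when the query contains a time-sensitive keyword."""
    return _SEARCH_PATTERN.search(query) is not None
```

With only these five keywords, "Best AI coding tools 2026" triggers a search and "What's a Python decorator" does not. A query such as "GPT-4o vs Claude 4" would only trigger one if a term like "vs" were added to the list, and whole-word matching means "prices" does not match "price" unless it is listed too.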
The Agent Wiring
Using LangChain's ReAct agent, I wrapped the search function as a tool and attached it to the LLM. The system prompt includes one critical instruction: only call search when the heuristic says it's needed.
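import json
from langchain.agents import AgentExecutor, Tool, create_react_agent
search_tool = Tool(name="web_search", func=lambda q: json.dumps(search_web(q, API_KEY)),
                   description="Search Google for current information. Input: a search query.")
agent = create_react_agent(llm, [search_tool], prompt)  # llm and prompt are defined elsewhere
executor = AgentExecutor(agent=agent, tools=[search_tool])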
That's it. The agent decides when to search, integrates the results, and produces a synthesized answer.
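The prompt itself is not shown here, so the following is only a guess at its shape. LangChain's `create_react_agent` expects the prompt to expose the `tools`, `tool_names`, and `agent_scratchpad` variables; the wording of the search instruction and the name `REACT_PROMPT_TEXT` are assumptions for illustration.

```python
# Hypothetical prompt text; the original prompt is not shown in the article.
REACT_PROMPT_TEXT = """Answer the following question as well as you can.

You have access to these tools:
{tools}

Only call web_search when the question depends on recent or changing
information (dates, prices, releases, comparisons). Otherwise answer
directly without using a tool.

Use this format:
Question: the input question
Thought: reason about whether a search is needed
Action: the tool to use, one of [{tool_names}]
Action Input: the input to the tool
Observation: the result of the tool
... (Thought/Action/Action Input/Observation can repeat)
Thought: I now know the final answer
Final Answer: the answer to the original question

Question: {input}
Thought:{agent_scratchpad}"""

# Variables create_react_agent needs the prompt to expose, plus the user input.
REQUIRED_VARIABLES = ("tools", "tool_names", "agent_scratchpad", "input")
missing = [v for v in REQUIRED_VARIABLES if "{" + v + "}" not in REACT_PROMPT_TEXT]
```

With LangChain installed, a string like this could be turned into the `prompt` object via `PromptTemplate.from_template(REACT_PROMPT_TEXT)`. Checking `missing` first is a cheap way to catch a dropped placeholder before the agent is built.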
What I Learned from the Behavior
| Query Type | Behavior | Result |
| --- | --- | --- |
| "Explain Python decorators" | LLM only | Instant, accurate |
| "Best AI coding tools 2026" | Searches + synthesizes | Cites real 2026 reviews |
| "GPT-4o vs Claude 4" | Searches benchmarks | Fresh comparison data |
| "React 19 new features" | Searches changelog | Up-to-date info |
The difference is stark on time-sensitive questions. Without search, the LLM speculates. With search, it references real articles published this year.
The Cost
SERP API: Talordata at $27/30K requests. At ~500 searches/day (about 15,000 requests a month), that is ≈ $13.50/month.
LLM: GPT-4o at ~$60/month for this volume.
The SERP API portion is trivial. For thirteen bucks a month, I eliminated all proxy management, CAPTCHA handling, and HTML parsing. That's less than the cost of a single hour debugging a broken scraper.
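The SERP figure is easy to re-derive for other volumes. The helper below is a sketch that assumes pricing scales linearly with usage and that a month has 30 days; `monthly_serp_cost` is a hypothetical name.

```python
def monthly_serp_cost(
    searches_per_day: float,
    price_per_block: float = 27.0,      # USD, from the pricing quoted above
    requests_per_block: int = 30_000,   # requests covered by that price
    days_per_month: int = 30,           # assumption: a 30-day month
) -> float:
    """Estimate monthly SERP API spend in USD, assuming linear pricing."""
    monthly_requests = searches_per_day * days_per_month
    return monthly_requests * price_per_block / requests_per_block


print(f"${monthly_serp_cost(500):.2f}/month")  # prints "$13.50/month"
```

If the plan is sold in fixed blocks rather than pro rata, the real bill would round up to the next block, so treat this as a lower bound.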
What's Next
This v1 works, but there's clear room to improve:
Smarter search/no-search classification. The keyword regex works, but a lightweight classifier would be cleaner.
Parallel searches. For complex questions, running multiple queries in parallel would produce richer results.
Richer data. I'm only using organic results right now. Knowledge graph and related questions are sitting in the API response — just need to consume them.
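The parallel-search and richer-data ideas could be sketched roughly as below. This is not part of the current agent: `search_many` and `extract_extras` are hypothetical helpers, `fetch` stands in for a function that returns the raw JSON payload for one query, and the `knowledge_graph` and `related_questions` key names are assumed to follow SerpApi's response format.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def search_many(
    queries: List[str],
    fetch: Callable[[str], Dict[str, Any]],
    max_workers: int = 4,
) -> Dict[str, Dict[str, Any]]:
    """Run one fetch per query on a thread pool and key results by query.

    fetch is a placeholder for a function that returns the raw JSON
    payload for one query (for example, a variant of search_web that
    returns resp.json() unchanged).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip lines up with queries.
        payloads = list(pool.map(fetch, queries))
    return dict(zip(queries, payloads))


def extract_extras(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the non-organic sections out of one payload.

    The key names are assumed to follow SerpApi's response format;
    check the provider's documentation before relying on them.
    """
    return {
        "knowledge_graph": payload.get("knowledge_graph", {}),
        "related_questions": [
            item.get("question", "")
            for item in payload.get("related_questions", [])
        ],
    }
```

Threads are a reasonable fit here because each search spends most of its time waiting on the network. Keep in mind that every parallel query still counts against the request quota.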
The Takeaway
The most interesting thing about this project isn't the code. It's how much complexity disappears when you choose the right services.
Five years ago, building this required crawling infrastructure, proxy rotation, CAPTCHA solving, and ongoing maintenance. Today it's a short script and two API calls.
The real skill isn't building infrastructure anymore. It's knowing which pieces to compose — and what to leave to the experts.