Cecilia Hill

Posted on Jun 8

How to Get Google Search Results in JSON for an AI Agent

#python #webscraping #api #ai

So you’re building an AI agent.

At first, everything feels simple. The user asks a question, the model thinks, and the agent returns an answer.

Then you hit the obvious problem:

The agent needs fresh information.

Maybe it needs today’s search results.

Maybe it needs current competitors.

Maybe it needs recent product pages.

Maybe it needs local business results from a specific city.

Maybe it needs sources before writing a research summary.

A language model can reason well, but its knowledge stops at its training data, so it does not know what is happening right now. If the task depends on current search results, you need a search layer.

For many agent workflows, that search layer starts with Google results in JSON.

In this post, we’ll walk through a simple way to think about it:

User task → search query → SERP API → JSON results → AI agent response

What we are building

Let’s say we want to build a basic research agent.

The user asks:

Find the top competitors for email marketing software and summarize what appears in Google.

The agent should:

Generate a search query
Fetch Google search results
Parse titles, links, snippets, and rankings
Pass clean search context into an LLM
Return a useful summary

The important part is that we do not want raw HTML.

We want structured data like this:

{
  "query": "best email marketing software",
  "organic_results": [
    {
      "position": 1,
      "title": "Best Email Marketing Software Tools",
      "link": "https://example.com",
      "snippet": "Compare email marketing platforms, pricing, and features..."
    }
  ]
}

This is much easier for an agent to use than a full search result page.
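If you want the rest of your code to rely on those fields, one option is to parse the JSON into small typed records first. This is a minimal sketch that assumes the response shape shown above; the `OrganicResult` class and `parse_organic_results` function are names I made up for illustration, not part of any provider's SDK.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class OrganicResult:
    """One organic result, limited to the fields the agent actually uses."""
    position: Optional[int]
    title: str
    link: str
    snippet: str


def parse_organic_results(payload: dict[str, Any]) -> list[OrganicResult]:
    """Turn the raw JSON payload into typed records, skipping entries without a link."""
    parsed = []
    for item in payload.get("organic_results", []):
        link = item.get("link")
        if not link:
            continue  # a result without a URL cannot be cited as a source
        parsed.append(
            OrganicResult(
                position=item.get("position"),
                title=item.get("title", ""),
                link=link,
                snippet=item.get("snippet", ""),
            )
        )
    return parsed
```

Dropping results that have no link is a design choice for this sketch: the agent is asked to return source URLs later, so an entry without one is not much use as context.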

Why not scrape Google directly?

You can scrape Google yourself.

For a small experiment, it may work. Send a request, parse the HTML, extract the links, and move on.

But production is different.

Search pages change. Layouts vary by query. Some results include ads, local packs, images, videos, shopping blocks, People Also Ask, or news results. Results also change by country, language, device, and location.

Then there are the operational problems:

blocked requests
CAPTCHA
inconsistent HTML
proxy maintenance
parser updates
retry logic
location mismatch
broken selectors

If your goal is to build an AI agent, maintaining a search scraper is usually not the core product.

The agent needs reliable search context. It does not care how much work went into collecting it.

That is why a SERP API is often a better fit.

What a SERP API does

A SERP API collects search engine results and returns them in a structured format, usually JSON.

Instead of asking your agent to deal with raw HTML, you give it clean fields:

result position
title
link
snippet
domain
result type
location
search engine
related results or SERP features

The workflow becomes much simpler:

Search query → SERP API request → JSON response → agent context
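Not every response will include every field in the list above. If a provider returns link but no domain, the domain can be derived locally. This is a small sketch using only the standard library; `extract_domain` and `add_domain` are hypothetical helper names, and the "domain" key is an assumption about how you want to store it.

```python
from urllib.parse import urlsplit


def extract_domain(link: str) -> str:
    """Return the lowercase host of a URL without a leading 'www.'."""
    host = urlsplit(link).hostname or ""
    return host[4:] if host.startswith("www.") else host


def add_domain(result: dict) -> dict:
    """Return a copy of the result with a 'domain' field, keeping any provider value."""
    enriched = dict(result)
    enriched.setdefault("domain", extract_domain(result.get("link", "")))
    return enriched
```

Note that a link without a scheme, such as example.com/pricing, yields an empty domain here, because urlsplit cannot identify a host without the leading https://.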

You can test this workflow with providers like SerpApi, SearchAPI, Bright Data, or Talordata. The provider matters less than the response quality for your actual use case.

For this example, I’ll use a generic SERP API request structure. Replace the endpoint and parameter names with those of the provider you are using.

Basic Python example

First, install requests:

pip install requests

Then create a small script:

import os
import requests


SERP_API_KEY = os.getenv("SERP_API_KEY")
SERP_API_URL = os.getenv("SERP_API_URL")


def search_google(query, location="United States", language="en"):
    """Fetch Google results for a query and return the parsed JSON response."""
    if not SERP_API_KEY:
        raise ValueError("Missing SERP_API_KEY environment variable")

    if not SERP_API_URL:
        raise ValueError("Missing SERP_API_URL environment variable")

    # Generic placeholder parameter names; match them to your provider's docs.
    params = {
        "q": query,
        "engine": "google",
        "location": location,
        "language": language,
        "output": "json",
        "api_key": SERP_API_KEY,
    }

    # Time out instead of hanging, and raise on 4xx/5xx responses.
    response = requests.get(SERP_API_URL, params=params, timeout=30)
    response.raise_for_status()

    return response.json()


if __name__ == "__main__":
    data = search_google("best email marketing software")

    for result in data.get("organic_results", [])[:5]:
        print(result.get("position"))
        print(result.get("title"))
        print(result.get("link"))
        print(result.get("snippet"))
        print("---")

This script does three things:

Sends a search query to a SERP API
Gets structured JSON back
Prints the top organic results

In a real agent workflow, you probably would not print the results. You would convert them into context for the model.
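A real workflow also has to cope with requests that fail now and then. Here is one possible retry wrapper with exponential backoff. It is a sketch, not a recommendation for any specific provider: `call_with_retries` is a name I invented, and it catches every exception for brevity, where real code would probably catch only `requests.RequestException`.

```python
import time


def call_with_retries(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or attempts run out, doubling the delay each time."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Wait base_delay, then 2x, then 4x, ... before the next try.
            sleep(base_delay * (2 ** attempt))
```

With the script above, usage could look like `call_with_retries(lambda: search_google("best email marketing software"))`. The `sleep` argument exists only so the delay can be replaced in tests.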

Convert SERP results into agent context

An LLM does not need the full API response.

It needs a clean summary of the useful parts.

Here is a simple helper function:

def build_search_context(serp_data, max_results=5):
    results = serp_data.get("organic_results", [])[:max_results]

    context_blocks = []

    for result in results:
        position = result.get("position", "N/A")
        title = result.get("title", "")
        link = result.get("link", "")
        snippet = result.get("snippet", "")

        block = f"""
Position: {position}
Title: {title}
URL: {link}
Snippet: {snippet}
""".strip()

        context_blocks.append(block)

    return "\n\n".join(context_blocks)

Now you can use it like this:

serp_data = search_google("best email marketing software")
search_context = build_search_context(serp_data)

print(search_context)

Example output:

Position: 1
Title: Best Email Marketing Software Tools
URL: https://example.com
Snippet: Compare email marketing platforms, pricing, and features...

Position: 2
Title: Top Email Marketing Services for Small Businesses
URL: https://example.org
Snippet: A guide to email marketing tools for startups and growing teams...

This is the kind of context an agent can use.

Add the search context to an agent prompt

Once you have structured search context, you can pass it into your LLM prompt.

Example:

def build_agent_prompt(user_task, search_context):
    return f"""
You are a research assistant.

Use the search results below to answer the user's task.
Do not invent sources.
If the search results are not enough, say what is missing.

User task:
{user_task}

Search results:
{search_context}

Write a concise answer with:
- key findings
- important domains mentioned
- possible competitors
- source URLs
""".strip()

Usage:

user_task = "Find the top competitors for email marketing software and summarize what appears in Google."

serp_data = search_google("best email marketing software")
search_context = build_search_context(serp_data)

prompt = build_agent_prompt(user_task, search_context)

print(prompt)

At this point, the prompt can be sent to your LLM of choice.

The important part is that the agent is no longer answering only from static knowledge. It has fresh search context.

A simple agent flow

Here is the full simplified flow:

def run_search_agent(user_task):
    # In a real agent, this query could be generated by the LLM.
    query = "best email marketing software"

    serp_data = search_google(query)
    search_context = build_search_context(serp_data)

    prompt = build_agent_prompt(user_task, search_context)

    return prompt


task = "Find the top competitors for email marketing software and summarize what appears in Google."

agent_prompt = run_search_agent(task)

print(agent_prompt)

This is not a full production agent yet.

But it gives you the core search layer:

task → query → SERP JSON → clean context → LLM prompt

From here, you can improve it by adding:

query generation
multiple searches
source filtering
domain extraction
location targeting
result caching
ranking comparison
citations
retry logic
scheduled monitoring
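As one example from that list, result caching can be sketched in a few lines. This version keeps responses in memory for a fixed time so repeated agent runs do not spend API requests on the same query. The `SearchCache` class, its 15-minute default, and the cache key are all assumptions for illustration; `fetch` is expected to be any function with the same arguments as `search_google`.

```python
import time


class SearchCache:
    """In-memory cache keyed by (query, location, language) with a time-to-live."""

    def __init__(self, fetch, ttl_seconds=900, clock=time.monotonic):
        self._fetch = fetch      # callable(query, location, language) -> dict
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so tests can control time
        self._entries = {}       # key -> (stored_at, data)

    def get(self, query, location="United States", language="en"):
        # Normalize the query so trivial variations share one cache entry.
        key = (query.strip().lower(), location, language)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]     # still fresh: skip the API call
        data = self._fetch(query, location, language)
        self._entries[key] = (now, data)
        return data
```

Location and language are part of the key on purpose. The same query in two markets is two different searches, so they should not share a cached response.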

Location matters for AI agents

One easy mistake is assuming Google results are the same everywhere.

They are not.

A query like:

best CRM software

may produce different results depending on country, language, or device.

A local query like:

coffee shop near me

can change completely by city or neighborhood.

If your AI agent is doing local SEO, market research, competitor tracking, travel research, or e-commerce analysis, location targeting matters.

That means your SERP API should support parameters such as:

country
city
language
device
search engine
output format

Without this, your agent may summarize results from the wrong market.
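A quick way to see how much location matters for your own queries is to run the same query in several locations and compare the links. This is a rough sketch: `compare_locations` is a hypothetical helper, and it assumes `search` accepts a `location` keyword and returns the organic_results structure used earlier in this post.

```python
def compare_locations(search, query, locations, top_n=5):
    """Run one query per location and report which links are shared or unique."""
    links_by_location = {}
    for location in locations:
        data = search(query, location=location)
        results = data.get("organic_results", [])[:top_n]
        links_by_location[location] = [r["link"] for r in results if r.get("link")]

    # Links that appear in the top results of every location.
    link_sets = [set(links) for links in links_by_location.values()]
    shared = set.intersection(*link_sets) if link_sets else set()

    # Links that are missing from at least one other location, in ranking order.
    unique = {
        location: [link for link in links if link not in shared]
        for location, links in links_by_location.items()
    }
    return {"shared": sorted(shared), "unique": unique}
```

If the unique lists are long for the markets you care about, the agent should probably be told which location a set of results came from.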

What to check before choosing a SERP API

Before choosing a provider, test it with real agent tasks.

Do not run only one demo query.

Try the actual searches your agent will run. Check whether the response contains the fields your workflow needs.

A few practical questions:

Does it return clean JSON?
Does it also offer HTML if needed?
Does it support location-based results?
Does it include organic results, ads, maps, shopping, news, or other SERP blocks?
Are the fields stable enough for automation?
Are failed requests billed?
How much cleanup is needed before the data can be passed into an LLM?

This is where testing matters more than marketing pages.

For example, if you are comparing SerpApi, SearchAPI, Bright Data, Talordata, or other SERP API providers, send the same real queries to each one and compare the response structure.

The best API is usually the one that gives your agent usable context with the least extra work.
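To make that comparison less subjective, you can measure how often each response actually fills the fields your workflow needs. The sketch below assumes every provider's response has already been mapped to the organic_results shape used in this post; `REQUIRED_FIELDS` and `field_coverage` are illustrative names, and real providers may use different keys.

```python
REQUIRED_FIELDS = ("position", "title", "link", "snippet")


def field_coverage(payload, required=REQUIRED_FIELDS):
    """Return, per required field, the share of organic results with a non-empty value."""
    results = payload.get("organic_results", [])
    if not results:
        return {field: 0.0 for field in required}
    return {
        field: sum(1 for r in results if r.get(field) not in (None, "")) / len(results)
        for field in required
    }
```

Running this over the same set of real queries for each provider gives you a simple number per field, which is easier to compare than reading raw JSON side by side.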

Where this fits in real projects

This pattern works for many AI agent use cases:

research agents
SEO copilots
competitor monitoring tools
market research assistants
e-commerce intelligence tools
local search analysis
automated report generation
search-driven RAG workflows

The agent does not need to “browse the web” like a human.

It needs reliable search data in a structure it can reason over.

That is the real value of a SERP API for AI agents.

Final thoughts

Getting Google search results in JSON is not only a scraping problem.

For AI agents, it is a context problem.

The model needs fresh, relevant, structured search data. Raw HTML is messy. Static model knowledge may be outdated. Manual search does not scale.

A SERP API gives your agent a cleaner search layer.

You send a query, get structured results, extract the useful fields, and pass them into the model as context.

If you are testing this kind of workflow, Talordata is one option worth comparing. It supports structured SERP data, JSON / HTML response formats, geo-targeted results, and search workflows for AI agents, SEO monitoring, competitor tracking, and market research.

Talordata also offers 1,000 free API requests after signup, which is enough to test the response format with real agent queries before committing to a provider.
