DEV Community: James Collins

MCP Apps with FastMCP: Turning Tool Output Into Interactive UI

James Collins — Wed, 01 Jul 2026 10:21:13 +0000

If you have built anything on the Model Context Protocol, you already know the shape of a tool. A request goes in, JSON comes back, and the model works through that JSON so it can tell you what it found. The design is simple and it has served the protocol well, but it quietly wastes a lot of tokens, because a single search can return tens of kilobytes of results that exist only so the model can boil them down to a sentence or two.

This post walks through what MCP Apps are and how to add them to a FastMCP server, starting with a table that takes about ten lines and ending with a dashboard that picks a different layout for each search engine. The screenshots are real rendered output, not mockups.

If you are new to the protocol itself, our overview — Model Context Protocol (MCP): A Unified Standard for AI Agents and Tools — is the place to start. The rest of this post assumes you have used an MCP server before.

What MCP Apps actually are

MCP Apps are the first official extension to the Model Context Protocol, introduced as SEP-1865 and developed jointly by Anthropic, OpenAI, and the team behind MCP-UI. The idea is modest: alongside the structured result a tool already returns, the server can attach a small HTML interface, which the host then renders inside a sandboxed iframe in the chat.

That small change has a few effects worth spelling out. The person using the client gets a real interface — a sortable table, a chart, a set of cards — instead of a block of JSON, and because the data behind it lives in the browser, the model only ever receives a summary rather than the full payload. Routine interactions like sorting a column or filtering a list also stay on the client, so they cost neither a tool call nor a round trip to the model.

Support is already broad enough to rely on. Claude, ChatGPT, Goose, and VS Code can all render MCP Apps today, and any host that cannot simply falls back to the structured result and ignores the interface, so adding one never breaks an existing client.

Setup

FastMCP provides MCP Apps support through Prefab, a Python component library that you install with the apps extra:

uv add "fastmcp[apps]"

Prefab is still young, and it changes between releases without always preserving backward compatibility. FastMCP does not cap its version, so it is worth pinning Prefab yourself to keep a future release from changing your output unexpectedly:

uv add "fastmcp[apps]" "prefab-ui>=0.20,<0.21"

The most useful part of the workflow is the local preview, which lets you build an App without Claude or any other host:

uv run fastmcp dev apps app.py

That command starts your server and opens a browser with a tool picker. You select a tool, fill in its arguments, and see the interface exactly as a host would render it.

The four ways to build one, and which actually matter

FastMCP offers four levels of MCP App. They increase in both capability and complexity, and most servers need only the first.

Interactive tools are the common case. You mark a tool with app=True and return a Prefab component, which renders as the tool's interface, and sorting, searching, and filtering all happen on the client. This covers the large majority of real use cases, so it is where you should start and, usually, where you should stop.

FastMCPApp is the next step up, for when the interface needs to call back into the server, such as a button that triggers an action or a form that submits data. It lets the interface invoke private backend tools that never appear in the model's tool list. That is worth the extra machinery for write actions, but it is overkill for a read-only view.

Generative UI lets the model write the Prefab layout itself, tailored to the data it has just received. It is flexible and occasionally impressive, but it is also unpredictable, so it is best treated as an experiment rather than a default.

Custom HTML lets you return raw HTML and own the entire surface. That gives you full control along with full responsibility, and it is an escape hatch rather than a starting point.

The rest of this post uses the first pattern, since it is the one you are most likely to reach for.

A results table in about ten lines

Here is the smallest App that does something useful. It takes search results and renders them as an interactive table.

from typing import Any

from fastmcp import FastMCP
from prefab_ui.app import PrefabApp
from prefab_ui.components import Column, DataTable, DataTableColumn

mcp = FastMCP("MCP Apps demo")

COLUMNS = [
    DataTableColumn(key="position", header="#", sortable=True, width="56px"),
    DataTableColumn(key="title", header="Title", sortable=True),
    DataTableColumn(key="source", header="Source", sortable=True),
    DataTableColumn(key="snippet", header="Snippet"),
]


@mcp.tool(app=True)
def search_table(params: dict[str, Any] | None = None) -> PrefabApp:
    rows = organic_rows(fetch_search_data(params))  # plain list[dict]
    with PrefabApp(title="Search results") as app:
        with Column(gap=4, css_class="p-4"):
            DataTable(columns=COLUMNS, rows=rows, search=True, paginated=True, page_size=10)
    return app

Two things are worth noticing here. The app=True flag does all the work, since it tells FastMCP to register a UI renderer for the tool and advertise it to the host; without it, the component would simply serialize back to JSON. The data, meanwhile, is an ordinary list of dictionaries, so there is no special serialization step. You shape your data to match what the component expects and let Prefab handle the rest.

The result is a live table, with headers that sort, a filter box that searches, and working pagination, none of which costs another tool call.

One design decision: a new tool, not a mode flag

A natural first instinct is to add a mode="ui" argument to an existing search tool and return an interface when it is set. It is worth resisting that.

The app=True flag is static metadata. FastMCP advertises a tool as a UI tool when it lists the available tools, before any call is made, so there is no per-call switch to flip. Adding UI to your existing search tool would change its behavior for every client, including the ones that only want JSON.

A separate, opt-in tool avoids the problem entirely. You leave search untouched and add search_table alongside it. Hosts that support Apps can offer the richer tool, and every other client keeps the one it already had.

A real dashboard

A single table is a good start, but Prefab can compose a complete layout out of metric cards, a chart, and a table that responds to clicks. The dashboard below summarizes a search, shows where its results come from, and reveals a detail panel when you select a row.

from prefab_ui.actions import SetState
from prefab_ui.components import (
    Card, CardContent, CardHeader, Column, DataTable, Grid, H3, If, Link, Metric, Small, Text,
)
from prefab_ui.components.charts import PieChart
from prefab_ui.rx import STATE, Rx


def build_dashboard_app(data: dict[str, Any]) -> PrefabApp:
    params = data.get("search_parameters") or {}
    rows = organic_rows(data)
    sources = source_breakdown(rows)            # [{"source": ..., "count": ...}]
    total = (data.get("search_information") or {}).get("total_results")

    with PrefabApp(title="Search dashboard", state={"selected": None}) as app:
        with Column(gap=4, css_class="p-4"):
            with Grid(columns=[1, 1, 1], gap=4):
                Metric(label="Query", value=params.get("q", "—"))
                Metric(label="Results shown", value=str(len(rows)))
                Metric(label="Total matches", value=f"{total:,}" if isinstance(total, int) else "—")

            with Grid(columns=[1, 2], gap=4):
                PieChart(data=sources, data_key="count", name_key="source", show_legend=True)
                DataTable(
                    columns=COLUMNS,
                    rows=rows,
                    search=True,
                    on_row_click=SetState("selected", Rx("$event")),
                )

            # Renders only after a row is clicked — entirely client-side.
            with If(STATE.selected):
                with Card():
                    with CardHeader():
                        H3(Rx("selected.title"))
                        Small(content=Rx("selected.source"))
                    with CardContent():
                        with Column(gap=2):
                            Text(content=Rx("selected.snippet"))
                            Link(content=Rx("selected.link"), href=Rx("selected.link"), target="_blank")
    return app

The reactive state is the part worth understanding. PrefabApp(state={"selected": None}) declares a piece of interface state, the table's on_row_click writes the selected row into it with SetState, and the If(STATE.selected) block renders the detail card once a row is chosen. Clicking a row updates the panel immediately, with no server involvement and no tokens spent.

Scaling across engines

Once you have a dashboard, it becomes clear that different searches call for different views. A list of organic links has little in common with a set of products that carry prices and ratings, and forcing both through one generic layout serves neither of them well.

A small map from engine to builder function handles this cleanly:

ENGINE_APP_BUILDERS: dict[str, Callable[[dict], PrefabApp]] = {
    "google_shopping": build_shopping_app,
}


@mcp.tool(app=True)
def search_dashboard(params: dict[str, Any] | None = None) -> PrefabApp:
    data = fetch_search_data(params)
    engine = (params or {}).get("engine", "google")
    builder = ENGINE_APP_BUILDERS.get(engine, build_dashboard_app)
    return builder(data)

With the map in place, adding an engine-specific dashboard is a single line, and any engine you have not registered falls back to the generic dashboard, so there is no gap in coverage. The shopping build below uses the same pieces arranged for products, with a price bar chart and currency formatting.

build_shopping_app is simply another function that returns a PrefabApp. It reaches for BarChart instead of PieChart and adds a currency column format, but its structure matches the generic dashboard. That is the point of the pattern, since a new vertical means shaping new data rather than building new plumbing.

A few things to keep in mind

The first is to keep your plain JSON tool. MCP Apps are additive, so the search tool that returns JSON should stay exactly as it is, both for hosts without App support and for agents that genuinely want the raw data.

The second is to pin Prefab, for the reason mentioned earlier. It moves quickly, and an unpinned dependency will eventually surprise you.

The third is to test the data rather than the rendering. The functions that shape your rows, such as organic_rows and source_breakdown, are pure and easy to test, while the rendering itself is Prefab's responsibility. Keeping your logic in small, testable helpers lets the components stay thin.

Wrapping up

MCP Apps change what a tool is allowed to return. With a modest amount of FastMCP and Prefab, a search tool can hand back a sortable table, a dashboard, or a layout tailored to a particular engine, and the data behind it never has to pass through the model's context window.

The SerpApi MCP server exposes exactly these kinds of search results, which makes it a natural place to try this out. You can connect to the hosted server at https://mcp.serpapi.com/YOUR_API_KEY/mcp, or run it yourself from the open-source repository. If you have not set it up before, Introducing SerpApi's MCP Server covers the basics, and How to use SerpApi engine schemas in SerpApi MCP goes deeper on getting good tool calls.

The State of MCP: Everything That Changed in H1 2026

James Collins — Thu, 04 Jun 2026 17:25:39 +0000

Six months ago, the Model Context Protocol was the exciting new thing. By the middle of 2026, it had become something less exciting but far more important: a piece of infrastructure that much of the AI ecosystem now depends on.

Over the first half of the year, MCP changed hands, gained a user interface, had its core protocol rewritten, and ran into the security problems that tend to appear once millions of people rely on something. None of those are the problems of a promising idea. They are the problems of working infrastructure, and the first half of 2026 is when MCP started dealing with them.

If you are new to the protocol, our overview — Model Context Protocol (MCP): A Unified Standard for AI Agents and Tools — is the place to start. The rest of this post assumes the basics and focuses on what actually changed.

Date	Event
Dec 9, 2025	Anthropic donates MCP to the Agentic AI Foundation (Linux Foundation)
Jan 26, 2026	MCP Apps ships as the first official extension, bringing UI into the chat
Jan–Feb 2026	More than 30 MCP-related CVEs filed; tool poisoning becomes a mainstream concern
Mar 9, 2026	The 2026 roadmap is published, with enterprise readiness as the top priority
May 21, 2026	The `2026-07-28` spec release candidate locks — the largest revision since launch

Governance moved to a neutral foundation

The most important change happened right at the start of the period, and it shaped everything that followed.

On December 9, 2025, Anthropic donated MCP to the newly formed Agentic AI Foundation, a fund under the Linux Foundation. Block and OpenAI were co-founders, and AWS, Google, Microsoft, Cloudflare, and Bloomberg joined as platinum members.

This matters more than it might sound. A protocol controlled by a single AI lab is a risk for every competing lab that adopts it, because that lab could change the standard in its own favor. Handing MCP to a neutral foundation removes the concern, since changes now go through a public proposal process in which every major lab and cloud provider has a seat, and none of them benefits from breaking the protocol for the others.

Google’s A2A protocol moved under the same foundation, with a clear division of responsibility: MCP connects agents to tools, while A2A coordinates communication between agents. Together, these two moves settled much of the fragmentation the ecosystem had worried about throughout 2025.

Adoption kept climbing

Metric	Where it stood in H1 2026
Monthly SDK downloads	~97 million, up from ~2 million at launch (roughly 4,750% growth in 16 months)
Public MCP servers	~9,400–17,000+ across the major registries
AI clients with native MCP	Claude, Cursor, Windsurf, Codex CLI, and VS Code with Copilot, plus ChatGPT and Gemini

To put that in perspective, the React npm package took about three years to reach the download numbers MCP reached in sixteen months.

The composition of that growth is as telling as its size. Fewer than 5% of public servers are monetized, so the ecosystem remains largely open source, and the momentum is shifting from community experiments toward vendor-built connectors and private, enterprise-internal servers. In short, MCP is moving from something developers try out to something companies run in production.

The protocol matured

Two releases turned MCP from a promising design into something built for scale.

The first was MCP Apps, which shipped in January as the protocol’s first official extension. It lets a tool return an interactive HTML interface that renders directly inside the conversation, such as a dashboard, a form, or a live visualization, all running in a sandboxed iframe. Figma used it to offer inline component editing, and Hex used it to render a filterable dashboard. It is worth noting that Anthropic and OpenAI developed the extension together, and it launched with support in Claude, ChatGPT, Goose, and VS Code on the first day.

The second was the 2026-07-28 release candidate, locked in May, which is the largest revision to the protocol since it launched. Its central change is that MCP becomes stateless.

The initialize handshake and the Mcp-Session-Id header are gone. Protocol version, client information, and capabilities now travel in the _meta field of every request.
This matters because a stateless server can sit behind an ordinary load balancer, with no sticky routing, no session store, and no long-lived streams to maintain. That is what allows MCP to scale on standard HTTP infrastructure.
The release also adds operational features, including required Mcp-Method and Mcp-Name headers that let gateways route traffic, and HTTP-style caching fields on discovery responses.
Finally, it introduces a formal deprecation policy. Roots, Sampling, and Logging are deprecated, but nothing is removed for at least twelve months.

Most existing servers continue to work unchanged. But the intent behind the release is clear: MCP is being re-engineered to be predictable and unremarkable, which is exactly what dependable infrastructure should be.

Security caught up with adoption

Rapid growth came with a cost. In January and February alone, more than thirty MCP-related CVEs were filed, including a remote-code-execution flaw rated 9.6 on the CVSS scale.

The defining issue was tool poisoning, and the problem is structural rather than a bug in any single server. A tool’s description is text the server controls, and it lands in the model’s context as if it were a trusted instruction. A malicious server can hide a directive in that description — for example, telling the model to read the user’s SSH key and pass it as a parameter — and an instruction-following model will comply. The user never sees the directive, because tool descriptions are not shown in the interface.

The underlying weakness is a gap in trust. A tool’s description is reviewed once, when the agent first connects, but the tool’s responses flow into the model afterward without any equivalent check. The defenses that emerged are layered: sanitizing tool metadata and responses, isolating high-privilege tools, scanning server manifests, and disabling the “always allow” settings that run tools without confirmation.

These are the problems of a widely used protocol rather than a niche one. The March roadmap responded to them directly by making enterprise readiness — audit trails, single sign-on, and gateway support — its top priority, to be delivered mostly through extensions rather than changes to the core.

What this adds up to

Taken together, the first half of 2026 tells a single story. MCP finished moving from a clever idea owned by one company to shared infrastructure governed by a foundation, and it began to behave accordingly.

What makes the period unusual is that all of this happened at once: record adoption alongside a candid acknowledgment, through both the rewrite and the security work, that the original design was not built for the scale it had reached. That is less a contradiction than a sign of maturity. MCP has moved past the stage of being interesting and into the more demanding stage of being relied upon.

Where SerpApi fits

If you are building agents on top of MCP, live search is one of the most useful capabilities you can give them, and you do not need to build the connector yourself.

The SerpApi MCP server exposes a single search tool that any MCP client — Claude Desktop, Cursor, VS Code, or your own agent — can call to query more than 100 engines and receive structured JSON in return. We have also exposed each engine’s schema as an MCP resource, which improves the accuracy of tool calls and aligns well with the discovery and caching direction the new specification is taking.

If you would prefer a lighter setup inside Claude Code, the SerpApi Claude Code plugin gives Claude native search through a single skill, with no server to run. MCP is the better choice when you work across several clients, while the plugin is simpler when you work primarily in one.

5 Things You Can Build with Claude Code and Live Search Data

James Collins — Tue, 19 May 2026 14:26:36 +0000

The SerpApi Claude Code plugin gives Claude native access to 100+ search engines. But what does that actually look like in practice? Here are five workflows you can run today — each takes a single prompt, chains one or more search engines, and produces structured output you can use immediately.

All of the outputs below come from actual API calls made through the plugin during a single Claude Code session. You describe what you need, and the plugin handles engine selection, parameter construction, and result parsing.

1. Cross-Platform Price Comparison

Prompt:

Compare PS5 console prices across Amazon, Walmart, and eBay.

What happens: Claude queries three engines — amazon, walmart, and ebay — extracts pricing and product details from each, and produces a side-by-side comparison.

Output:

Platform	Product	Price	Rating
Amazon	PlayStation®5 console – 1TB	$649.99	4.7 / 5
Walmart	PlayStation 5 Pro Console	$899.00	4.6 / 5
eBay	Sony PlayStation 5 Slim Disc (Brand New)	$199.98	—
eBay	Sony PlayStation 5 Slim Digital Edition 1TB (Brand New)	$549.99	—

What's useful here: You get structured, comparable data from three different marketplaces in one pass. No tab switching, no manual extraction. Claude can also sort by price, flag the best deal, or export the table to CSV — all in the same conversation.

Engines used: amazon, walmart, ebay

2. Academic Literature Review

Prompt:

Find the most-cited papers on retrieval-augmented generation published since 2024. Summarize each one.

What happens: Claude queries the google_scholar engine with as_ylo=2024 to filter by year, then extracts titles, authors, citation counts, and publication details.

Output:

#	Paper	Authors	Citations
1	Retrieval-Augmented Generation for AI-Generated Content: A Survey	Zhao, Zhang, Yu et al.	837
2	Corrective Retrieval Augmented Generation	Yan, Gu, Zhu, Ling	527
3	Retrieval-Augmented Generation for NLP: A Survey	Wu, Xiong, Cui et al.	218
4	Chain-of-Retrieval Augmented Generation	Wang, Chen, Yang, Huang	99
5	RAG and Understanding in Vision: A Survey and New Outlook	Zheng, Weng, Lyu et al.	51

From here, Claude can summarize each paper's contribution based on the snippets returned by Scholar, or you can ask it to identify common themes across the top results. The structured citation counts let you quickly assess which papers are most influential.

What's useful here: A literature review that would normally take an afternoon of Google Scholar tab-hopping becomes a single prompt. You get ranked results with citation metrics, and Claude can synthesize them into a narrative summary or a LaTeX bibliography.

Engine used: google_scholar

3. Local Business Audit

Prompt:

Find coffee shops near Union Square, San Francisco. Give me ratings, review counts, and hours.

What happens: Claude queries the google_maps engine and returns structured local results with ratings, review volume, addresses, and operating hours.

Output:

Name	Rating	Reviews	Address	Hours (Mon–Fri)
Caffe Central	4.5	1,511	133 O'Farrell St	7 AM – 7 PM
The Coffee Movement	4.7	1,320	1030 Washington St	7 AM – 2 PM
Scullery	4.7	572	687 Geary St	7 AM – 2 PM
Cable Car CoffeeSF	4.8	141	902 Market St	6 AM – 4:30 PM
Origin Lab Coffee & Matcha	4.8	102	211 Sutter St	6 AM – 6 PM

What's useful here: This is the kind of data you would normally compile by hand for a competitive analysis, a real estate evaluation, or a relocation decision. Claude returns it structured and ready to use. You can ask follow-up questions — filter by rating, sort by review count, or expand the search radius — without leaving the terminal.

Engine used: google_maps

4. Trend-Informed Content Brief

Prompt:

What's trending around "AI agents" right now? Check Google Trends for interest over time and Google News for recent coverage. Draft a content brief based on what you find.

What happens: Claude chains two engines — google_trends for search interest data and google_news for current headlines — then synthesizes the results into an actionable content brief.

Trends data (sample):

The term "AI agents" shows consistent search interest between 33–43 (normalized index) over the past three months, with no seasonal dips — indicating sustained, growing attention rather than a spike-and-fade pattern.

Recent headlines:

"AI tool poisoning exposes a major flaw in enterprise agent security" — VentureBeat
"Qualcomm CEO says 2026 is the year AI agents go mainstream" — Fortune
"The barista is human but an AI agent runs this experimental Swedish cafe" — AP News
"AI agents fueled a frenzy of startup building at the Consensus Miami hackathon" — CoinDesk

Content brief (generated by Claude):

Topic: AI Agents in 2026 — From Hype to Production

Angle: The conversation has shifted from "what are AI agents?" to "how do we secure and deploy them?" Coverage focuses on enterprise security concerns, real-world deployments, and hardware enablement.

Suggested keywords: AI agents, AI agent security, agentic AI, enterprise AI agents, AI automation 2026

Content gap: Most coverage is news-driven. There is room for practical developer tutorials (how to build, test, deploy agents) and security-focused guides.

What's useful here: Combining trend data with live news gives you a research-backed content brief in under a minute. Claude identifies the narrative shift, suggests angles, and spots content gaps — all grounded in current data rather than training-set knowledge.

Engines used: google_trends, google_news

5. Job Market Snapshot

Prompt:

What do senior ML engineer roles look like in NYC right now? Show me titles, companies, salary ranges, and how recently they were posted.

What happens: Claude queries the google_jobs engine and extracts structured listing data including company names, salary information, and posting recency.

Output:

Title	Company	Salary	Posted
Sr AI/ML Engineer	Phaxis	$175K – $250K/yr	5 days ago
Senior Machine Learning Engineer - Ads	Uber	—	12 days ago
Senior Machine Learning Engineer, Content	Paramount	—	12 days ago
Senior MLOps Engineer	Paramount	—	10 days ago
Senior Machine Learning Engineer (Remote)	Zencastr	—	4 days ago

What's useful here: Whether you are hiring, job hunting, or benchmarking compensation, this gives you a real-time snapshot of the market. Claude can go further — aggregate salary ranges across many listings, identify the most common required skills, or compare markets across cities — all by adjusting the query parameters.

Engine used: google_jobs

The Pattern Behind All Five

If you look at these workflows side by side, they share a structure: a natural-language prompt leads to one or more engine calls, and the structured JSON that comes back gets shaped into something readable — a table, a brief, a summary. You never touch a query parameter directly unless you want to.

The interesting part is what happens after the search. Because this runs inside Claude Code, the results do not disappear into a browser tab. Claude can write them to a file, pipe them into a script, use them as context for a follow-up question, or chain another search on top. A price comparison can feed into a spreadsheet. A literature review can become a bibliography. A job market snapshot can inform a salary negotiation email. Search becomes a composable step in a longer workflow rather than a dead end.

Getting Started

Install the plugin and try any of these prompts yourself:

/plugin marketplace add serpapi/serpapi-claude-plugin
/plugin install serpapi@serpapi-plugins

You will need a SerpApi API key — the free tier includes 250 searches per month. Set it as an environment variable:

export SERPAPI_API_KEY="your_key_here"

Then just ask Claude what you need. The plugin handles the rest.

What's Next

These five examples are starting points. The plugin covers 107 search engines — flights, hotels, patents, finance, app stores, and more. Any workflow that starts with "search for..." or "find me..." or "compare..." is a candidate.

The plugin is open source at github.com/serpapi/serpapi-claude-plugin. If you build something interesting with it, we would like to hear about it.

Introducing SerpApi's Claude Code Plugin

James Collins — Fri, 17 Apr 2026 15:31:33 +0000

Last year, we released the SerpApi MCP Server to give AI agents structured access to live search data through the Model Context Protocol. MCP works well across a range of clients — Claude Desktop, VS Code, Cursor, and others — but it requires running a server process, either locally or via our hosted endpoint.

For developers who work primarily inside Claude Code, Anthropic's agentic coding tool, there is now a lighter option. Today, we are releasing an open-source Claude Code plugin that gives Claude native access to 100+ search engines through the SerpApi REST API. No server to run, no SDK to install — just a plugin that teaches Claude how to search.

What Is a Claude Code Plugin?

Claude Code plugins are packages that extend what Claude can do inside its coding environment. A plugin can include skills — markdown instruction files that tell Claude how to use a specific tool or API — along with configuration for tool access and parameter schemas.

When a skill is relevant to what you are asking, Claude loads it automatically. The skill tells Claude which tools it can use, how to construct requests, and how to interpret results. In our case, the skill teaches Claude to call the SerpApi REST API directly using curl, pick the right search engine based on your intent, and summarize the results.

This is different from MCP. With MCP, the AI client communicates with a running server that wraps the API. With a Claude Code plugin, there is no intermediary — Claude reads the skill, understands the API, and calls it directly. Both approaches give you the same search data; they just differ in how the connection is made.

Key Features

Single skill, all engines. One skill — serpapi:search — covers every engine SerpApi supports. Claude selects the right one based on what you ask. Web search, shopping, maps, news, scholar, flights, jobs, finance — all through one interface.
Schema-driven accuracy. The plugin ships 107 JSON parameter schemas, one per engine. When Claude needs to construct a search for a specific engine, it reads the relevant schema to get the exact parameters, types, and options. No guessing.
Auto-invocation. You do not need to remember slash commands. Claude detects search-related requests and loads the skill automatically. You can also invoke it explicitly with /serpapi:search if you prefer.
Cost-aware defaults. The plugin defaults to Google Light for general web searches — it is faster and cheaper than the full Google endpoint. Claude also confirms before making API calls, since each non-cached call costs one search credit.
Always up to date. Engine schemas are regenerated weekly by CI and committed automatically, so the plugin stays in sync as SerpApi adds new engines or parameters.
Open source. The full plugin is available on GitHub. Contributions are welcome.

Getting Started

1. Get an API key

Sign up at serpapi.com — the free tier includes 250 searches per month, no credit card required. Then set your key as an environment variable:

export  SERPAPI_API_KEY="your_key_here"

2. Install the plugin

/plugin marketplace add serpapi/serpapi-claude-plugin
/plugin install serpapi@serpapi-plugins

Or, if you prefer to work from source:

git  clone  https://github.com/serpapi/serpapi-claude-plugin.git
claude  --plugin-dir  /path/to/serpapi-claude-plugin

3. Search

Once installed, just ask Claude what you need in natural language:

Search for the best noise-cancelling headphones under $300

Find recent academic papers on retrieval-augmented generation

Compare prices for a PS5 on Amazon, Walmart, and eBay

Look up flights from SFO to Tokyo next month

Claude will pick the appropriate engine, construct the API call, execute it, and summarize the results — all without leaving your coding environment.

How It Works

Under the hood, the plugin is a single skill file backed by a directory of engine schemas. When Claude receives a search-related request, it:

Loads the skill, which contains an engine selection table mapping intents to engines (e.g., "academic papers" → Google Scholar, "local restaurants" → Google Maps, "product prices" → Amazon).
Reads the engine schema for the selected engine to determine the correct query parameter name and any engine-specific options.
Constructs and executes the API call to the SerpApi REST endpoint.
Summarizes the results in a readable format with source links.

Every SerpApi engine follows the same REST pattern — https://serpapi.com/search.json?engine=ENGINE&... — so the skill teaches one pattern and parameterizes it across all engines. Claude can also chain multiple engines together for tasks like cross-platform price comparisons or multi-source research.

Supported Engines

The plugin covers the full range of SerpApi's search engine APIs:

Category	Engines
Web Search	Google, Google Light, Bing, DuckDuckGo, Yahoo, Yandex, Baidu, Naver
AI Search	Google AI Mode, Google AI Overview, Bing Copilot, Brave AI Mode
Shopping	Amazon, Walmart, eBay, Google Shopping, Home Depot
Local / Maps	Google Maps, Google Local, Yelp, TripAdvisor, OpenTable
Research	Google Scholar, Google Patents, Google Trends
News	Google News, Bing News, DuckDuckGo News, Baidu News
Media	Google Images, Google Videos, YouTube, Google Lens
Travel	Google Flights, Google Hotels, Google Travel Explore
Jobs	Google Jobs
Finance	Google Finance
Apps	Google Play, Apple App Store

MCP or Plugin — Which Should You Use?

If you are already using our MCP server, you do not need to switch. MCP is the right choice when you work across multiple AI clients — Claude Desktop, VS Code Copilot, Cursor, or custom agents — since it provides a single server that any MCP-compatible client can connect to.

The Claude Code plugin is a better fit if Claude Code is your primary environment and you want a simpler setup. There is no server process to manage, no Docker container to run, and no port to configure. You install the plugin and it works.

Both options return the same structured JSON from the same SerpApi endpoints. They cover the same engines and support the same parameters. Choose the one that fits your workflow.

What's Next

The plugin is open source and available now at github.com/serpapi/serpapi-claude-plugin. We are continuing to expand our AI integrations to make sure developers can access live search data wherever they work — whether that is through MCP, Claude Code plugins, or the REST API directly.

If you run into issues or have ideas for improvement, open an issue or pull request on the repository. And if you haven't tried SerpApi yet, create a free account to get started with 250 searches per month.

Build an AI Research Agent for Slack and Linear with SerpApi

James Collins — Tue, 24 Mar 2026 17:48:29 +0000

Context switching is the silent killer of developer productivity. You are deep in a Linear ticket or debugging a Slack thread, you hit a technical question, and suddenly you are in a browser tab reading through Stack Overflow and pasting links back to your team. Multiply that by every developer on your team, several times a day, and you start to see how much focused time disappears.

What if the tools you already use could do that research for you?

In this tutorial, we build a cross-platform *Research Agent* — a single Python backend that plugs into both Slack and Linear. You can mention it in a channel or assign it to a ticket, and it will search the web using SerpApi, synthesize the results with an LLM, and post a cited answer directly in your workspace. No tab switching required. We will mostly focus on Linear example since that's where the assignment happens explicitly, but Slack extension is supported as well.

Why Build a Workspace Research Agent

Project management tools and chat apps are excellent at tracking internal context — who owns a task, what the status is, what decisions were made. But they have a blind spot when it comes to external information: updated documentation, recent CVEs, syntax examples, framework comparisons, or trending approaches to a problem.

Developers bridge that gap manually, dozens of times a day, mostly submitting prompts to the LLM and then coming back and hour later to pick up on the necessary context. There is no task -> in progress -> done update loop here.

An AI research agent solves this by bringing live web data into the conversation. Instead of leaving Slack or Linear, a developer tags the agent with a question. The agent fetches structured search results via SerpApi, passes them to an LLM for synthesis, and posts a concise, cited answer right where the question was asked.

The key insight is that SerpApi returns clean, structured JSON — titles, snippets, links — rather than raw HTML. Our agent never scrapes or parses web pages. It reasons over high-quality, normalized data from Google, Google Scholar, YouTube, and dozens of other engines. The output is a complete research with references and an extendable engine to back it.

Architecture: One Backend, Two Platforms

Building a cross-platform bot means dealing with different platform lifecycles. Slack requires you to acknowledge an event within 3 seconds. Linear’s Agent API requires your agent to emit a “thought” activity within 10 seconds of receiving an assignment. If you miss either deadline and the platform marks your bot as unresponsive.

To handle this cleanly, we built a single FastAPI server with two webhook endpoints — /slack/events and /linear/webhooks — and a shared research engine behind them. Both endpoints immediately satisfy their platform’s timeout constraint and then hand the actual work to a background task.

Here is what the flow looks like:

*Ingest*: A webhook arrives from Slack (app mention) or Linear (agent session event).
*Acknowledge*: The endpoint responds instantly — Slack Bolt calls ack() automatically; the Linear handler returns HTTP 200 and spawns a background coroutine.
*Search*: The research engine calls SerpApi with the user’s query and gets structured JSON results.
*Synthesize*: An LLM reads the search results and produces a markdown summary with source citations.
*Respond*: The answer is posted back — as a threaded Slack message or a Linear agent activity.

The entire backend is async Python. FastAPI handles routing, Slack Bolt’s AsyncApp handles Slack-specific event parsing, and a lightweight httpx-based client handles Linear’s GraphQL API. Everything shares the same perform_research() function, platform-specific code only exists at the edges.

The Research Engine: SerpApi + LLM

The core of the agent is surprisingly simple. Two functions do all the work: one fetches search results, the other synthesizes them.

For search, we use SerpApi’s Python client. The agent automatically picks the right engine based on the query. For this article, we keep the search loop simple, but you can read about more advanced approaches on the prior blog posts. If someone asks about a “research paper,” it routes to Google Scholar; if they want a “video tutorial,” it routes to YouTube; otherwise it defaults to Google Search:

client = serpapi.Client(api_key=settings.serpapi_api_key)

results = client.search({"q": query,  "engine": _pick_engine(query),  "num": 5})

Because SerpApi returns normalized JSON regardless of the engine, the rest of the pipeline does not need to care which search backend was used. We extract title, snippet, and link from each result and format them into a context block.

For synthesis, we pass those results to an LLM with a system prompt that enforces citation discipline, summarize the findings, always cite sources using the provided URLs, never fabricate beyond what the search results contain:

response =  await openai_client.chat.completions.create(
  model="gpt-5.4",
  messages=[
    {"role": "system",  "content": RESEARCH_SYSTEM_PROMPT},
    {"role": "user",  "content": f"Answer this query: {query}\n\nSearch results:\n{context}"},
  ],
  temperature=0.3,
)

The low temperature keeps the output factual. SerpApi handles proxy rotation, CAPTCHA solving, and result parsing behind the scenes, so our code never touches raw HTML.

Connecting to Slack

For Slack, we use the Slack Bolt framework. Bolt handles event parsing, signature verification, and the 3-second acknowledgment automatically — we just write a handler for the app_mention event:

@app.event("app_mention")
async  def  handle_mention(event, say):
  thread_ts = event.get("thread_ts", event["ts"])

  await  say("Researching...",  thread_ts=thread_ts)

  answer =  await  perform_research(_extract_query(event["text"]))

  await  say(_slack_format(answer),  thread_ts=thread_ts)

When someone types @ResearchBot What is the best way to handle database migrations in Django?, the handler strips out the bot mention to get the raw question, posts a “Researching…” message so the user knows something is happening, runs the SerpApi + LLM pipeline, and replies in the same thread with the cited answer.

One detail worth noting: Slack uses its own link format (<url|text>) instead of standard markdown. Since our research engine always outputs standard markdown, we convert links at the boundary with a simple regex. The engine itself stays platform-agnostic.

Connecting to Linear’s Agent API

Linear’s Agent API is the more interesting integration. Unlike a basic webhook that creates comments, Linear’s agent system gives your bot a first-class identity in the workspace. It shows up in the assignee dropdown, it can be @mentioned in issues and documents, and its responses appear as structured *agent activities* with visible thinking states, not just plain comments.

When a user assigns your agent to an issue or mentions it in a comment, Linear creates an AgentSession and sends a webhook to your server. Your agent communicates back through typed activities:

*thought*: Shows a thinking indicator (e.g., “Performing web research via SerpApi…”)
*response*: The final answer, rendered as rich markdown in the issue
*error*: Displayed with error styling if something goes wrong

The webhook handler needs to respect two constraints: return HTTP 200 within 5 seconds, and emit a thought within 10 seconds. We handle this by returning immediately and spawning the research work as a background coroutine:

if event_type ==  "AgentSessionEvent"  and action in  ("created",  "prompted"):
  asyncio.create_task(_handle_session(payload))
  return  Response(status_code=200)

The background task sends the thought activity right away — well within the 10-second window — then runs the research pipeline and posts the result as a response. The query is extracted from Linear’s promptContext field, which contains the issue title, description, and any relevant context:

await linear_client.send_thought(session_id,  "Performing web research via SerpApi...")

answer =  await perform_research(query)

await linear_client.send_response(session_id, answer)

For GraphQL communication, we wrote a thin async client using httpx rather than pulling in a full GraphQL library. All we need are two mutations — agentActivityCreate for posting activities and agentSessionUpdate for managing session state — so raw query strings are more than sufficient.

Linear also signs every webhook with HMAC-SHA256, which we verify against the raw request body before processing anything. And if a user sends a follow-up message within an existing session, Linear dispatches a prompted event (instead of created), making the interaction conversational, the agent picks up the new message and researches again.

Running It Locally

Both Slack and Linear support private development without any app directory submission or review process. You can have the full end-to-end flow running in minutes.

The project uses uv for dependency management and Python 3.12+:

git  clone  https://github.com/serpapi/serpapi-research-bot.git
cd  serpapi-research-agent
uv  sync
cp  .env.example  .env

Fill in your .env with API keys for SerpApi, OpenAI, and the Slack/Linear credentials from your dev app configurations. Then start the server and expose it via ngrok:

uv  run  uvicorn  serpapi_research_agent.main:app  --reload  --port  3000

# In another terminal:
ngrok  http  3000

Point your Slack Event Subscriptions Request URL and Linear app webhook URL to the ngrok tunnel, and you are ready to test. For Slack, invite the bot to a channel and mention it. For Linear, create an issue and assign the agent as a delegate.

Seeing It in Action

We will focus on Linear example, since it is more interesting and has a better Agent integration, but setting up Slack is straightforward as well.

Linear

For Linear, first we will need to create a new Linear Oauth app. Sample settings below, you get the URL from ngrok, add /linear/webhooks to it, and add it to the webhook section. You will also need to get a Linear ouath access token, you can take it be submitting a request with your client ID, client secret, and grant_type=client_credentials (note that client credentials must be enabled during the auth) setup. Then save all the received credentials into the .env file.

Once the agent is properly configured and everything is running, you can open a ticket and assign it an issue. The Research Bot will appear as a first-class Linear member:

And once the research is actually complete, then the agent will follow with a response activity and a full response content.

Conclusion

By integrating SerpApi and an LLM directly into Slack and Linear, we built a research assistant that meets developers exactly where they work. The architecture is deliberately simple — a single FastAPI server with a shared research engine and thin platform-specific adapters at the edges.

SerpApi handles the hard parts of web search: proxy rotation, CAPTCHA solving, and result normalization across multiple engines. The LLM handles synthesis and citation. And the platform integrations handle the last mile — getting the answer back into the right thread or ticket, formatted correctly, within each platform’s timeout constraints.

The full source code is available on GitHub. You can have it running against your own Slack workspace and Linear instance in under 15 minutes.

Ready to build your own workspace agent? Create a free SerpApi account to get started.

Competitive Intelligence Agent: From Slides to Live Signals

James Collins — Mon, 02 Mar 2026 14:13:14 +0000

In many B2B organizations, "competitive intelligence" still means hunting through old slide decks, ad‑hoc notes, and a long history of Slack threads. Even if you track competitors somewhere in your CRM, the reality on the ground changes faster than those assets: pricing pages move, job descriptions hint at new bets, and funding announcements reset roadmaps overnight. When your analysis or enablement material lags behind, it becomes background noise instead of a strategic guide.

Historically, closing this gap meant a lot of copy‑paste work: open a few tabs, search for news, skim job boards, and then try to reconcile that with whatever your CRM says about the account. With modern tooling, this workflow can be turned into a repeatable pattern. By combining OpenAI, optional HubSpot CRM access, and SerpApi, a Competitive Intelligence Agent can pull live web signals, overlay your internal context, and return a sourced briefing on demand.

The Challenge: Stale Assets and Fragmented Context

Competitive landscapes do not stand still, but the artifacts we use to describe them often do. Common issues include:

Static collateral in a moving market: Competitive decks are produced for a launch, a QBR, or a training session, then slowly drift away from reality.
Many sources, no single view: Product marketers and PMs check pricing pages, comparison sites, review platforms, press releases, and job listings—but there is no unified answer to "What is happening with this competitor right now?"
Internal knowledge is siloed: Your CRM may already show where a competitor appears in deals and which stakeholders you know, yet that insight rarely shows up in market analysis.

Bringing all of this into a single, timely briefing used to require custom glue code. Standards like the Model Context Protocol (MCP) aim to give models a consistent way to talk to APIs and databases, but in this project we use a straightforward approach: direct integrations with SerpApi and, optionally, the HubSpot SDK.

How the Competitive Intelligence Agent Behaves

The agent follows a simple loop—Plan → Gather → Summarize—to turn scattered data into a coherent briefing.

1. Plan

Given a request such as "Give me a competitive snapshot of acme.com, including what we know in HubSpot," the model decides which tools to use:

External: SerpApi‑backed web, news, and jobs search to understand positioning, recent milestones, and hiring focus.
Internal (optional): HubSpot company, contact, and activity lookups to see how this competitor (or customer with a known competitor) appears in your own pipeline.

2. Gather

The agent then calls the tools it selected:

SerpApi:
Web search for pricing pages, comparison articles, and review sites
News search for launches, funding, and partnerships within a defined time window
Jobs search for roles and locations that hint at strategic priorities
HubSpot SDK (optional):
Find a company by domain if it already exists in your CRM
Look up specific contacts by email
Retrieve a timeline of notes, calls, and emails associated with those contacts

Each tool returns structured JSON that can be reasoned about and cited.

3. Summarize

Once enough information has been collected, the model stops calling tools and produces a concise briefing. A typical output might look like:

"Acme emphasizes low‑latency analytics for operational workloads [1], [2]. Over the past month they announced an EU expansion and a partnership with a major cloud provider [3]. In our HubSpot data, we see one open deal where Acme is listed as a competing vendor, plus a Director‑level contact who raised concerns about migration cost three months ago [4]. For upcoming conversations, we should highlight data residency guarantees and total cost of ownership versus their current stack."

The result is something closer to a one‑page memo than a raw list of links.

Key Building Blocks

The repository contains a complete implementation. At a high level, the Competitive Intelligence Agent is assembled from three layers.

1. SerpApi for Web, News, and Jobs

SerpApi handles live search across several Google surfaces:

Web search: Company sites, competitor comparison pages, and review content.
News search: Time‑bounded news about funding, launches, acquisitions, or leadership changes.
Jobs search: Open roles and locations that reveal where and how a company is investing.

Because SerpApi returns normalized JSON instead of HTML, the agent can focus on reasoning over titles, snippets, links, and timestamps rather than scraping and parsing.\n+

2. Optional HubSpot SDK for Internal Context

If you provide a HUBSPOT_ACCESS_TOKEN, the agent also exposes HubSpot‑backed tools that answer "what do we already know?":

hubspot_search_company_by_domain — discover whether the company is already in your CRM and retrieve its basic record.
hubspot_get_contact_by_email — fetch contact details for known stakeholders.
hubspot_get_contact_activity_history — pull a summarized history of notes, calls, and emails tied to that contact.

This turns the agent from a pure external‑research helper into a bridge between the market and your existing pipeline.

3. OpenAI for Orchestration and Writing

An OpenAI model coordinates the process by:

Selecting which tools to invoke based on the question.
Interpreting the structured JSON returned by SerpApi and HubSpot.
Drafting a narrative answer with clear sections and numbered citations.

The Python layer is responsible for defining tools, handling authentication, and streaming tool results back into the conversation; the model is responsible for the reasoning and writing.

Practices for Reliable Briefings

To keep outputs trustworthy and useful, the implementation follows a few simple practices:

Explicit citations: External claims—such as funding rounds, launch dates, or pricing changes—are tied to numbered references so readers can verify them.
Time‑aware queries: Loose phrases like "recent" or "last month" are translated into date filters for SerpApi, ensuring news queries match the intended window.
Read‑only CRM access: HubSpot integration in this project is observational. The agent reads companies, contacts, and activities but does not create or update records.

These constraints make it easier to embed the agent into real workflows without surprising side effects.

Where MCP Fits In

If you are exploring the Model Context Protocol (MCP), this project offers a concrete pattern you could later port into that world. MCP aims to give language models a unified way to talk to tools and data sources. In contrast, this agent wires SerpApi and HubSpot directly via their SDKs and APIs.

Functionally, the shape is similar: the model chooses tools, the runtime executes them, and the model synthesizes an answer. A future variant could swap the direct integrations for SerpApi MCP and a HubSpot MCP server while preserving the same Plan → Gather → Summarize behavior.

Conclusion: Continuous Competitive Sensing

Instead of occasionally updating a competitive deck, teams can move toward continuous competitive sensing—asking for a current snapshot of a market or rival whenever they need it.\n+

By pairing SerpApi, OpenAI, and optional HubSpot CRM, the Competitive Intelligence Agent turns scattered signals into a single, sourced briefing. Strategy, product, and revenue teams get fast context without tab‑sprawl, and they can make decisions with both public information and their own CRM history in view.

How to use SerpApi engine schemas in SerpApi MCP to improve tool call quality

James Collins — Tue, 03 Feb 2026 13:16:43 +0000

The Model Context Protocol (MCP) makes it possible for AI assistants to call external tools in a structured, standardized way. If you’re new to MCP, we recommend starting with our overview: Model Context Protocol (MCP): A Unified Standard for AI Agents and Tools.

SerpApi’s MCP server already exposes a unified search tool that can query dozens of SerpApi engines - Google Search, News, Flights, Shopping, and more. Now, we’ve introduced a major upgrade: engine schemas are exposed as MCP resources.

This means that for every SerpApi engine, you can now programmatically discover exactly which parameters it accepts, their types, and how they’re meant to be used - directly from MCP. This unlocks a new level of accuracy, reliability, and automation for AI-powered developer workflows.

In this post, we’ll explain how engine schemas work, why they matter, and how to use them in practice - including a hands-on example using Google Flights from VS Code.

Quick recap: SerpApi MCP in your workflow

SerpApi MCP lets you connect AI tools (like Claude Desktop, Cursor, VS Code MCP-compatible extensions or your custom AI agents) directly to SerpApi’s live search infrastructure.

Once configured, your AI assistant can:

Discover available search engines
Call the search tool with structured parameters
Receive normalized JSON results

If you haven’t set it up yet, you’ll need a SerpApi API key. You can obtain one at here.

Then add SerpApi MCP to your MCP configuration:

{
  "mcpServers": {
    "serpapi": {
      "url": "https://mcp.serpapi.com/YOUR_SERPAPI_API_KEY/mcp"
    }
  }
}

Once connected, your AI assistant can immediately start interacting with SerpApi tools and resources.

What’s new: Engine schemas as MCP resources

Previously, the LLM had to guess which parameters each SerpApi engine accepted or rely on hardcoded knowledge.

Now, each engine exposes its own schema as an MCP resource.

What this means

With the new feature, MCP exposes:

A list of available engines
A detailed parameter schema for each engine

Conceptually, this looks like:

serpapi://engines → lists all supported engines
serpapi://engines/google_flights → returns the schema for Google Flights
serpapi://engines/google_shopping → returns the schema for Google Shopping

Each schema describes:

Parameter names
Data types
Optional vs required fields
Parameter descriptions

This gives AI agents a real-time contract for every engine.

How engine schemas improve tool call quality

This upgrade solves several real-world problems when building AI-powered integrations.

1. No more guessing parameter names

Instead of inventing fields like origin or destination, the agent can see the real API fields such as:

departure_id
arrival_id
departure_date

2. Engine-specific optimization

Each SerpApi engine has its own unique features. Schemas allow the assistant to:

Use flight-specific filters for Google Flights
Use price ranges for Google Shopping
Use topic tokens for Google News

3. Fewer failed requests

When parameters are generated directly from the schema:

Invalid arguments are avoided
Missing required fields are detected earlier
Requests become deterministic and repeatable

4. Better developer experience

In IDEs like VS Code, schemas unlock:

Better prompt grounding
Smarter auto-completion
Self-documenting integrations

Example: Using Google Flights with MCP engine schemas

Let’s walk through a real example.

Imagine you want your AI assistant to search for round-trip flights from San Francisco (SFO) to New York (JFK).

Step 1: Discover available engines

The assistant can query the engine index resource:

serpapi://engines

This returns a list including:

google_search
google_news
google_shopping
google_flights

The agent selects google_flights.

Step 2: Load the Google Flights schema

Next, the assistant fetches:

serpapi://engines/google_flights

This returns a schema describing supported parameters such as:

departure_id
arrival_id
departure_date
return_date
currency
travel_class

Now the assistant knows exactly what Google Flights accepts.

Step 3: Make a structured search call

Using the schema, the AI can generate a correct MCP tool call:

{
  "name": "search",
  "arguments": {
    "params": {
      "engine": "google_flights",
      "departure_id": "SFO",
      "arrival_id": "JFK",
      "departure_date": "2026-03-15",
      "return_date": "2026-03-20",
      "currency": "USD"
    }
  }
}

Because the parameters match the schema:

The request is valid
No field guessing is involved
Results are returned immediately

The response contains structured flight data including prices, airlines, durations, and layovers.

Example: Shopping search with Google Shopping

Schemas are just as powerful for product search.

Let’s say you want to find wireless earbuds under $100.

Load the engine schema

serpapi://engines/google_shopping

You’ll discover parameters such as:

q
min_price
max_price
gl
hl

Perform the search

{
  "name": "search",
  "arguments": {
    "params": {
      "engine": "google_shopping",
      "q": "wireless earbuds",
      "min_price": 10,
      "max_price": 100,
      "gl": "us",
      "hl": "en"
    }
  }
}

The schema ensures:

The right filter fields are used
The query structure matches the engine
The response is clean and predictable

Using schemas inside VS Code

VS Code is one of the most popular environments for MCP-powered workflows.

With SerpApi MCP configured:

Your AI assistant can dynamically load engine schemas
Generate valid tool calls
Iterate on search logic without switching tabs

This enables workflows like:

"Find the cheapest flights under $500"
"Track shopping price drops"
"Monitor breaking news by topic"

All powered by real-time schema-driven requests.

Summary: Why this feature matters

By exposing engine schemas as MCP resources, SerpApi MCP becomes:

More reliable
More developer-friendly
More automation-ready

You get:

Accurate parameter generation
Engine-aware AI behavior
Lower error rates
Faster integration cycles

This turns SerpApi MCP into a self-describing API layer for AI agents.

Get started today

To try engine schemas with SerpApi MCP:

Create a SerpApi account: https://serpapi.com
Enable MCP using your API key
Connect from VS Code, Claude Desktop, or Cursor
Start exploring serpapi://engines

If you’re building AI-powered tools that rely on search, shopping data, travel data, or news intelligence - engine schemas will dramatically improve the quality and reliability of your integrations.

Building a Sales Assistant with real-time market awareness and CRM insights

James Collins — Wed, 21 Jan 2026 10:50:40 +0000

In modern B2B sales, the “Context Gap” can limit deal effectiveness. Even with comprehensive CRM records—emails, call notes, and deal stages—sales teams often miss critical real-time developments in their prospects’ businesses. If a lead recently raised funding or launched a new product and outreach does not reflect it, it risks being overlooked in a crowded inbox.

Traditionally, closing this gap required manual research: searching online, reviewing news, and checking CRM records. Today, this process can be automated. By integrating OpenAI, the HubSpot CRM SDK, and SerpApi, a Sales Assistant can combine internal CRM data with real-time market signals, producing actionable insights and personalized outreach.

The Problem: Static CRMs and the Context Gap

Most CRMs are retrospective—they record historical interactions—but sales are forward-looking. Identifying high-potential leads requires real-time insights:

Is the company hiring or expanding?
Are there recent funding rounds or product launches?
What new market trends might impact their priorities?

Previously, integrating live external signals into a CRM workflow required custom, complex code. While concepts like the Model Context Protocol (MCP) aim to standardize such integrations, our Sales Assistant achieves the same outcomes using direct integrations with HubSpot and SerpApi, without relying on MCP.

How the Sales Assistant Works

The assistant operates in a continuous Plan → Execute → Synthesize loop, combining CRM data with external market intelligence.

1. Plan

When asked, “Prepare a briefing for my meeting with InnovateTech,” the agent evaluates available tools:

Internal: Query HubSpot for interaction history, primary contacts, and deal stages.
External: Search the web and news via SerpApi for recent company developments.

2. Execute

The agent retrieves raw data from multiple sources:

HubSpot SDK: Access company records, contact details, and engagement history.
SerpApi: Pull recent news articles, funding announcements, and product updates.

3. Synthesize

Using internal and external data, the agent generates personalized recommendations or outreach drafts:

“Congratulations on your recent Series B funding. During our last conversation about cloud migration, you mentioned scaling challenges. With this new investment, we’d like to demonstrate how our solution can help support your upcoming initiatives.”

Core Technical Components

The complete implementation is available on GitHub. The Sales Assistant relies on three main components:

1. HubSpot Native SDK

Provides programmatic access to company and contact data, including:

hubspot_search_company_by_domain
hubspot_get_contact_by_email
hubspot_get_contact_activity_history

2. SerpApi Web and News Search

Enables structured retrieval of relevant web content and news, with date filtering and normalized results for AI consumption.

3. OpenAI LLM Integration

Coordinates tool calls, synthesizes results, and drafts recommendations or outreach messages.

Operational Guidelines

To maintain accuracy and reliability:

Source Attribution: All external research is linked to original sources.
Date Normalization: Phrases like “last week” are converted to ISO date ranges before querying.
Scope Limitation: HubSpot access is read-only for safety.

MCP as an Alternative

While the current implementation does not use MCP, it is a promising concept for standardizing AI access to multiple data sources. MCP would allow models to interface with any API or database in a consistent manner.

In this case, the agent achieves similar functionality through direct SDK and API integrations with HubSpot and SerpApi.

The alternative implementation would rely on SerpApi MCP and HubSpot MCP.

Conclusion: Observational CRMs

Sales teams are moving from transactional CRMs, where humans manually enter data, to observational CRMs, where AI agents monitor the market and update pipelines proactively.

By integrating HubSpot, SerpApi, and OpenAI, research becomes part of the sales workflow. Sales teams can focus on building relationships with real-time intelligence, improving responsiveness, and reducing context-switching.

Integrating SerpApi MCP into Your Developer Workflow

James Collins — Thu, 08 Jan 2026 15:47:43 +0000

AI-assisted development tools such as Cursor and Claude have changed how developers interact with their codebases. These tools excel at generating, refactoring, and explaining code by leveraging the local project context. However, they are still limited when it comes to retrieving fresh, external information such as updated documentation, security advisories, or real-world usage examples.

The Model Context Protocol (MCP) enables developers to extend AI tools beyond static knowledge by connecting them to external services. By integrating custom MCP servers, developers can allow their AI assistants to query live data sources directly from the IDE. SerpApi provides SerpApi MCP server, which adds real-time web search capabilities to MCP-compatible AI tools.

Extending AI Tools with MCP

MCP defines a standard way for AI clients to call tools using structured requests. Instead of hardcoding integrations, MCP allows tools like search, databases, or internal APIs to be plugged into an AI assistant dynamically.

In practice, this means:

Your AI assistant can decide when it needs external information
It can call an MCP tool with a structured query
The tool returns machine-readable results
The assistant uses those results to improve its response

For developers using VS Code, Cursor, or Claude Desktop, this turns the IDE into a live research environment rather than a closed system.

What the SerpApi MCP Server Provides

The SerpApi MCP server exposes a single tool called search. This tool allows AI assistants to perform live web searches using SerpApi and receive structured results. These results can include:

Documentation excerpts
Code examples
Security advisories
API usage discussions
Tooling and configuration references

The AI assistant can then reason over this data instead of guessing or relying on outdated training information.

Setting Up SerpApi MCP

Step 1: Obtain a SerpApi API Key

Create an account on the SerpApi website and get an API key from your dashboard. This key is required to authenticate requests made through the MCP server.

Step 2: Add the MCP Server to Your AI Client

SerpApi provides a hosted MCP endpoint, which is the simplest way to get started.

For MCP-compatible clients such as Claude Desktop or Cursor, add the following configuration:

{
  "mcpServers": {
    "serpapi": {
      "url": "https://mcp.serpapi.com/YOUR_SERPAPI_API_KEY/mcp"
    }
  }
}

After saving the configuration, restart the AI tool. The SerpApi MCP server will now be available to the assistant.

Alternatively, developers who prefer local control can clone the SerpApi MCP repository and run the server locally, then point their AI client to http://localhost:8000.

Using the SerpApi Search Tool

Once integrated, the AI assistant can call the SerpApi MCP Search tool using structured input. Below are hands-on examples that reflect real developer workflows.

Example 1: Database Syntax Lookup

A developer working with PostgreSQL wants to implement an UPSERT operation.

MCP tool call:

{
  "name": "search",
  "arguments": {
    "params": {
      "q": "PostgreSQL UPSERT ON CONFLICT example"
    }
  }
}

Result usage:

The assistant retrieves current syntax examples and produces a valid SQL snippet using INSERT ... ON CONFLICT DO UPDATE, tailored to the developer’s schema.

Example 2: Security Vulnerability Investigation

A dependency audit flags a potential vulnerability.

MCP tool call:

{
  "name": "search",
  "arguments": {
    "params": {
      "q": "CVE-2023-4863 vulnerability details mitigation"
    }
  }
}

Result usage:

The assistant summarizes the vulnerability, affected versions, and mitigation steps, helping the developer decide whether an upgrade or patch is required.

Example 3: Language Syntax and Code Patterns

A developer needs to set a timeout for a JavaScript fetch request.

MCP tool call:

{
  "name": "search",
  "arguments": {
    "params": {
      "q": "JavaScript fetch timeout AbortController example"
    }
  }
}

Result usage:

The assistant composes a correct code example using AbortController, based on real-world usage patterns.

const controller = new AbortController();
setTimeout(() => controller.abort(), 5000);

fetch(url, { signal: controller.signal })
  .then(res => res.json())
  .catch(err => console.error(err));

Example 4: Tooling and Configuration Questions

A developer is configuring Docker and needs clarification on CMD vs ENTRYPOINT.

MCP tool call:

{
  "name": "search",
  "arguments": {
    "params": {
      "q": "Docker CMD vs ENTRYPOINT differences"
    }
  }
}

Result usage:

The assistant explains the behavioral differences and provides examples of when to use each directive.

Example 5: Ecosystem and Version Awareness

Before upgrading a dependency, a developer wants to understand recent changes.

MCP tool call:

{
  "name": "search",
  "arguments": {
    "params": {
      "q": "React 19 breaking changes"
    }
  }
}

Result usage:

The assistant summarizes notable changes and highlights potential migration concerns relevant to the existing codebase.

Conclusion

By integrating the SerpApi MCP server into AI-assisted development tools, developers unlock real-time information retrieval directly inside their IDEs. This approach reduces context switching, improves accuracy, and allows AI assistants to operate with up-to-date knowledge.

As MCP tooling becomes more widely adopted, AI-driven development is likely to shift from static code generation toward more adaptive, tool-augmented workflows. Search-enabled MCP servers such as one provided by SerpApi are an early example of how external knowledge can be seamlessly integrated into everyday software development, ultimately improving productivity and decision-making across the development lifecycle.

AI-Powered SEO Research Agent with OpenAI & SerpApi

James Collins — Wed, 10 Sep 2025 13:10:41 +0000

Search engines and AI are rapidly reshaping how businesses find opportunities online. Today’s “AI agents” – systems that autonomously browse and query the web on a user’s behalf – are already changing SEO best practices. For example, industry data shows that ChatGPT’s user agents doubled their web-search activity in July 2025, fundamentally altering how sites need to be discovered and indexed . At the same time, language models alone cannot know the latest trends or live keyword data.

To bridge this gap, we built a SEO Research Agent: a chat-based assistant that combines OpenAI’s new function-calling with SerpApi’s Google search tools. It plans queries, gathers live SERP data, and synthesizes a full cited SEO report – giving marketers up-to-date insights into keywords, competitors, and news-driven content ideas.

Why Use an AI Agent for SEO Research

SEO research often means chasing fresh, authoritative information. For instance, effective keyword discovery relies on current autocomplete suggestions and competitor SERPs . Google Autocomplete is one of the most accurate keyword research tools for real-time ideas . Likewise, staying ahead in content strategy requires monitoring industry news. However, manually collecting this data is tedious. That’s where an agentic AI comes in.

By combining a language model’s reasoning with automated search tools, our agent can batch together keyword queries, competitor analysis, and news searches, then distill the results into a concise report. This “plan → execute → synthesize” workflow is an established pattern for multi-tool AI agents .

The model first plans all needed searches, the system executes them in parallel (using SerpApi to avoid captchas and proxy issues), and then the agent synthesizes the findings into a structured answer . In practice, this means our SEO agent automatically collects live SERP snippets, autocomplete suggestions, ranking positions, and recent headlines – all without extra work from the user.

How the SEO Research Agent Works

The agent is implemented in Python as a conversational assistant. It uses OpenAI’s API with function-calling tools. We’ve defined four main tools based on SerpApi endpoints:

search_web – Runs a Google organic search for a query, returning the top titles and snippets.
search_autocomplete – Calls Google Autocomplete to list related keyword suggestions (long-tail terms).
search_news – Queries Google News for top headlines and snippets on a topic.
check_rank – Finds the ranking position of a given domain for a specific keyword (scanning Google’s top 100 results).

These tools are exposed to the language model through a system prompt that enforces an iterative research loop. In essence:

Phase 1: the model writes a natural-language plan, describing which keywords and data it needs (for example, “we’ll start by gathering autocomplete suggestions for the brand name, then list competitors from the SERP, and finally check our rank on those terms.”).
Phase 2: the model emits a batch of structured tool calls in one JSON object. Each call has an ID, name, and arguments (e.g. {id: "c1", type: "function", function: { name: "search_autocomplete", arguments: {"query": "ai SEO tools"} }}), which the host executes in parallel. This approach dramatically reduces latency and ensures comprehensive coverage. Once all tool calls have run, their results (concise title:snippet lists, or ranking positions) are returned to the model as “tool” messages. The agent may iterate: if the first set of results suggests new queries (perhaps additional keywords or competitor sites), the model can plan a second round of tool calls. This loop continues until the model has gathered enough information.
Phase 3: the model generates the final SEO Report in Markdown. This report includes sections like Keyword Opportunities, SERP Insights, Domain Ranking, and News & Topical Opportunities, each formatted with bullets and embedded citations from the tool outputs. By following this explicit plan–execute–synthesize cycle, the agent provides a transparent, auditable research process .

Key Tools & Capabilities

The SEO agent’s power comes from integrating SerpApi’s search data:

Real-time SERP Data: By calling SerpApi’s Google Search API, the agent gets organic results, snippets, and related searches. As Nimbleway explains, a SERP API “scrapes and retrieves data from search engine results pages” and can return organic results, ads, snippets, URLs, knowledge graph data, and related searches . We use this to identify competitor domains, featured snippets, and keyword contexts. For example, search_web might reveal that “brightdata.com” and “apify.com” are top competitors for web scraping APIs, with specific snippet text that the report can cite.
Keyword Suggestions: The search_autocomplete tool taps Google Autocomplete for suggestions on each seed keyword or brand name. This yields dozens of long-tail and related keyword ideas. Experts note that starting with Google suggestions is a core keyword research technique . In fact, “Google Autocomplete is the most accurate keyword research tool” for uncovering current search queries . Our agent formats these suggestions into bullet lists for the Keyword Opportunities section.
News & Trends: The search_news function uses SerpApi’s Google News API to fetch recent headlines relevant to the topic. Monitoring news is crucial for timely SEO content: as RapidSeedBox points out, you can “keep up with industry news” by pulling data from Google News and spotting brand mentions in real time . In the report’s News & Topical Opportunities section, the agent highlights any breaking stories or trending angles related to the keywords (with source citations).
Rank Checking: Finally, the check_rank tool runs a Google search for a keyword and scans the results for the target domain. It reports the rank position (or notes if the domain is not in the top results). According to SEO guides, rank tracking is one of the most common uses of SERP APIs, since it automates the tedious task of checking positions . Our agent uses this to populate the Domain Ranking section: e.g. “example.com ranks #4 for ‘best coffee grinder’ and is not in the top 50 for ‘grinder maintenance tips’.”

Behind the scenes, the code parallelizes these API calls for speed. SerpApi handles proxy rotation and CAPTCHAs, so we get reliable, structured JSON results for each query. The agent then extracts just titles, snippets, or rank numbers, keeping the returned context concise for the model to digest. This combination of tools means the agent covers broad SEO research tasks (keyword ideation, competitor scan, ranking analysis, trend spotting) in one multi-step workflow.

Example SEO Report Output

After gathering data, the agent writes a cohesive report in Markdown. The report generally follows this structure:

Keyword Opportunities: A bullet list of long-tail keywords and related terms from autocomplete. For example: - "brand X tutorial", - "brand X vs competitors", etc. These come from the search_autocomplete results, and focus on phrases that real users are searching for.
SERP Insights: Highlights of the web search results. It might say, for example, “Competitor: apify.com – offers web scraping & automation tools .” (Here we might cite a relevant snippet from the search results). The agent points out high-ranking competitors, identifies whether featured snippets or rich cards appear, and notes any content gaps. As one AI-SEO guide explains, specialized agents can “analyze the data by themselves and provide...actionable insights” rather than just raw numbers . Our agent does the same for SERP analysis.
Domain Ranking: A list showing where the target domain ranks for each keyword. E.g. - example.com ranks #1 for "primary keyword" or “not in top 50” for weaker terms. These come from the check_rank outputs. The report may include all keywords checked (often 5–10 or more) to give a clear picture of current SEO standing.
News & Topical Opportunities: Any timely articles or news items. For instance: "New AI agents in search workflows" – recent news item (source). This section is drawn from the search_news results. The agent treats emerging trends as SEO opportunities, noting angles that content creators could exploit. (RapidSeedBox notes that pulling Google News is great for tracking brand mentions and breaking stories in real time .)
Recommendations: Based on the analysis, the agent may add suggested actions (e.g. target long-tail keywords, improve content around topics where competitors rank, etc.). These are generated by the model using the collected data. Everything is written in a clear, business-friendly tone with inline citations back to the search snippets.

The final report is easy to read and can even be published internally or shared. All key claims are traceable to the underlying SERP data (for example, competitor mentions are accompanied by “[Source]” linking to the snippet’s origin). This makes the process auditable and transparent – a direct benefit of using an agentic approach where each fact comes from a known search query .

Getting Started

The SEO Research Agent runs locally via Python. Setup is straightforward:

Install requirements: Python 3.9+ and pip install openai serpapi.
Set API keys: Provide your OPENAI_API_KEY and SERPAPI_API_KEY (SerpApi has a free tier with 250 searches/month for testing).
Run the agent: You can use the CLI (python seo_agent.py -q "SEO analysis for example.com") or import the SEOResearchAgent class in your code. It supports interactive mode as well: just run without -q to chat.

For example, running:

python seo_agent.py -q "SEO report for vanta.com"

will prompt the agent to produce a full SEO report for the given domain. The conversation trace can be saved in JSON (--outfile trace.json) for debugging or auditing purposes.

Under the hood, the code follows the plan–tool–report loop described above. The system prompt guides the model to first plan the research (listing data to collect), then emit all tool calls together, then refine if needed, and finally output the Markdown report. This matches best practices for multi-tool AI agents and ensures the agent doesn’t stop at shallow answers. Instead, it iterates until it confidently has comprehensive data for all sections.

Conclusion

By combining OpenAI’s latest models with SerpApi’s real-time search APIs, the SEO Research Agent brings cutting-edge AI to everyday SEO workflows. It automates keyword brainstorming, competitor audits, rank tracking, and news monitoring – tasks that normally take hours of manual search and curation.

Building Ahead: a multi-tool AI Travel Agent with OpenAI + SerpApi

James Collins — Fri, 29 Aug 2025 12:16:40 +0000

Modern travel planning demands fresh, structured signals and predictable automation. Flight availability, hotel inventory, and local recommendations are time-sensitive and often encoded in vertical search outputs. AI agents can directly assist in solving this problem. A reliable AI travel agent will combine the reasoning capabilities of a language model with direct access to different search APIs.

This post describes a follow-up to the research agent pattern (covered in prior AI agent blog): a travel planning agent that plans its retrievals, verifies structured inputs (IATA codes), runs targeted vertical searches (Flights, Hotels, Local/Maps, Web), and synthesizes concise, cited itineraries.

Full code available here: https://github.com/serpapi/travel-planning-agent

Use cases

The travel agent’s goal is to help travelers discover new destinations faster and plan trips better. It also very cheap to run in comparison to hiring a travel agency or travel advisors.

Itinerary assembly: curate flight options, hotel availability, possible travel experiences and return a brief, actionable plan with source links.
Discovery workflows: surface destination ideas when only soft constraints are provided (season, tone, budget).
Constraint-driven planning: honor explicit limits (max price, cabin class, passenger mix, all-inclusive) and produce ranked options.
Audit and reproducibility: save a JSON trace of the model’s planned tool calls and returned snippets for debugging, compliance, or user review.

Each use case benefits from the same core pattern: explicit planning by the model, concurrent retrieval by the host, and a single synthesis step that ties everything together with citations.

How it works — high level

Three coordinated stages form the backbone of the agent:

Plan. The model emits a batch of structured tool calls that enumerate the data required — IATA lookups, candidate flight date windows, hotel queries for target neighborhoods, and local POI searches. The set of tool calls is produced before any external requests are executed so coverage is explicit and auditable.
Execute. The host executes the model’s tool calls. Independent calls are dispatched concurrently to reduce latency (thread pool or async execution). Each tool returns compact, structured snippets (title/snippet/price/link) which are appended to the conversation as tool messages and associated with the originating tool_call_id.
Synthesize. With the retrieved snippets in context the model composes a final answer: a concise itinerary, ranked flight/hotel options, local recommendations, and footnote-style citations that map directly to the returned URLs.

This plan → execute → synthesize loop preserves an auditable trace and minimizes token waste while ensuring the model reasons over current, structured data.

Minimal setup

Python 3.9+
Environment variables: OPENAI_API_KEY, SERPAPI_API_KEY
Libraries: openai (or the official OpenAI SDK used for function/tool calls) and serpapi for vertical search access

You can install necessary libraries using:

pip install -r requirements.txt

export OPENAI_API_KEY="..."
export SERPAPI_API_KEY="..."

You can obtain SerpApi API key on the SerpApi website. There is a free plan that offers 250 searches / month, so you can freely test the agent.

A CLI wrapper supports interactive chat and one-shot queries and can optionally persist the full JSON trace for auditing.

Key operational rules

The agent follows a small set of explicit operational rules encoded in the system prompt and enforced by the host:

IATA verification: the flights tool requires 3-letter IATA codes. Before calling the flights endpoint the model must resolve and verify airport codes via a web lookup (e.g., "Warsaw IATA code → WAW"). This prevents invalid flight queries and reduces tool errors.
Date disambiguation: ambiguous dates are interpreted as future travel. If a referenced month has already passed in the current year, the agent interprets it as next year. For vague phrasing such as “mid-May,” the agent probes a small date window (for example 13–17 May) and searches multiple candidate windows rather than a single day.
Reasonable defaults: when the user omits details, the agent assumes sensible defaults and surfaces them in the response (example defaults: economy cabin, up to one stop, 2 adults, ±3 days flexibility). Tone in the query (e.g., “luxury”) adjusts defaults (premium cabins, higher star hotels).
Batching and parallelism: the model should emit all needed tool calls in a single assistant message when multiple external queries are required. The host executes independent calls concurrently to reduce latency.
Transparency and citations: final outputs include footnote-style citations that link back to the URLs returned by the vertical APIs. The JSON trace preserves tool_call_id → result associations for reproducibility.

Encoding these rules as part of the system prompt plus light host-side validation produces predictable, auditable behavior and fewer failed calls.

Tooling integration

The model is equipped with multiple tools that it has access to. For each tool a schema is defined that the model should output in order to request results from a certain tool. Below we show Google Flights tool integration, since it's one of the core tools to plan the travel. Other tools are integrated in a similar fashion.

# Representative tool schema (flights) and host-side mapping (conceptual)
# Model-facing function schema (what the LLM can call)
{
  "type": "function",
  "function": {
    "name": "search_flights",
    "description": "Find flights given IATA airport codes and dates (departure_date required).",
    "parameters": {
      "type": "object",
      "properties": {
        "departure": {"type": "string"},
        "destination": {"type": "string"},
        "departure_date": {"type": "string", "description": "YYYY-MM-DD"},
        "return_date": {"type": "string", "description": "YYYY-MM-DD (optional)"},
        "cabin": {"type": "string"},
        "max_price": {"type": "string"}
      },
      "required": ["departure", "destination", "departure_date"]
    }
  }
}

# Host-side behavior (conceptual):
# 1) Validate IATA codes (3 letters). If not present, run search_web to resolve.
# 2) Normalize dates and params.
# 3) Call SerpApi Flights engine:
#    params = {"engine":"google_flights","api_key":SERPAPI_API_KEY,
#              "departure_id":dep_code,"arrival_id":arr_code,"outbound_date":departure_date, ...}
# 4) Normalize returned results to: {price, total_duration_min, legs, link}
# 5) Append normalized JSON to conversation as a `tool` message with the tool_call_id.

This integration pattern enforces a clear contract: the model requests flights using the declared schema, the host validates and materializes the request against SerpApi, and the model receives compact, normalized results suitable for synthesis and citation.

The following tools are integrated. Each tool is mapped to a specific SerpApi endpoint.

Google Flights (engine=google_flights) yields structured flight listings (price, duration, legs) and a canonical flights result URL.
Google Hotels (engine=google_hotels) returns property metadata, ratings, and snippet links to booking surfaces.
Google Local/Maps (engine=google_local) supplies POIs, ratings, types, and map links.
Google organic (engine=google) supports general web lookups (IATA lookups, policy pages, travel advisories).

This approach to integration keeps token consumption predictable; an optional “deep scrape” tool can be added later for full-page context when necessary.

Inference loop

The inference loop is structured in the following sequence:

Append system prompt and user message to the conversation.
Request a model completion with the function/tool schema.
If the model returns a tool_calls assistant message, append it and execute those calls:
- Validate dependent calls (for example, IATA resolution) before invoking dependent tools.
- Run non-dependent calls concurrently; append results as tool messages associated with their tool_call_id.
- Reinvoke the model to synthesize a final answer using the returned snippets.
- If the model requests further tool calls, repeat; otherwise return the final synthesis.

Preserving the ordering and identifiers between tool calls and results is essential for accurate citations and for saving a reproducible JSON trace.

Behavior, defaults and error handling

Assumptions must be surfaced. Any default assumptions used to produce results are shown in the final answer (for example: “assumed economy, 2 adults, flexible ±3 days”). If the missing detail would materially change results (passenger mix, max price), the agent asks a concise clarifying question rather than guessing.
Cache small, stable lookups. IATA lookups, frequent POIs, and static data should be cached locally to reduce API usage and speed repetitive queries.
Quota and rate limiting. Batched concurrent retrievals reduce latency but increase burst usage. Implement simple rate limiting or token bucket strategies around SerpApi calls to avoid quota issues.
Graceful tool errors. If a tool returns an error (invalid arguments, rate limit), the host returns a short structured error snippet to the model and allows the model to either retry with corrected arguments or ask the user for clarification.

Example flow

User query: Weekend in Rome mid-May, leaving Lisabon.

Planner (model) emits a small batch of tool calls:

search_web("Lisabon IATA code")
search_web("Rome IATA code")
search_hotels(destination="Rome", check_in=2026-05-15, check_out=2026-05-17) (candidate window)
search_places(query="Colosseum Rome", location="Rome", limit=5)
several search_flights calls for 2–3 Friday→Sunday windows once IATA codes are resolved

Executor runs IATA lookups first, then runs hotels and places concurrently while issuing the validated flight calls. Results are returned as compact snippets. Synthesizer writes a short itinerary with three flight options, two hotel picks, and local highlights, each with footnote links to the SerpApi results. The response explicitly notes assumed defaults and suggests refinement steps.

CLI and programmatic usage

A small CLI wrapper supports two modes:

Interactive chat for iterative discovery and follow-up refinement.
One-shot query for quick plans, with an option to persist the JSON trace (--outfile) for auditing.

Typical commands:

# interactive
python travel_planning_agent.py

# one-shot, save trace
python travel_planning_agent.py -q "Luxury honeymoon Bali December" -o trace.json

--debug exposes intermediate tool calls and SerpApi responses for prompt tuning and debugging.

Limitations and future extension points

Booking flow: moving from planning to booking requires partner APIs and additional compliance considerations (payment, PII handling).
Richer passenger modeling: support for infants, children, special assistance and multi-party trips increases complexity in passenger constraints and pricing models.
Personalization: persist user preferences (airlines, seat class, hotel loyalty numbers) to bias search and filter results.
Explainability UI: render the JSON trace with click-through snippets so end users or auditors can inspect every tool call and its returned evidence.
Broader data sources and destinations: to create a really powerful agent capable of planning your next trip, expansion of data sources will be necessary. Integration of cruise lines, private transfers, travel perk / discount discovery, better coverage of non-default travel options such as road trips, themed parks, natural parks, etc. Coverage of additional options like travel insurance, upgrades, group travel, etc. are all necessary to create a high-quality getaway planner.

Conclusion

The research-agent pattern (plan → execute → synthesize) adapts naturally to travel planning when combined with vertical search APIs. Encoding a small set of operational rules (IATA verification, date logic, batching) and returning compact, cited snippets produces reliable, actionable itineraries with an auditable trace. Batched planning plus concurrent execution reduces latency and improves coverage, while proper validation and caching reduce failures and API costs. The resulting travel planner agent provides a practical foundation for booking integrations, personalization, and richer optimization in future work.

Building a fast, self‑hosted research agent with OpenAI models + SerpAPI

James Collins — Thu, 14 Aug 2025 10:59:52 +0000

Modern language models are effective at synthesis but do not inherently provide fresh, verifiable information. Connecting a model to the web search creates an autonomous AI agent that closes the gap: it enables current sources, systematic coverage of a topic, and traceable answers. This post describes a compact research agent that plans its searches, executes them concurrently via SerpAPI, and produces a cited synthesis—designed to be readable, auditable, and simple to run locally.

Full code available here: link.

Use cases for AI research agents

The AI agent targets questions where recency and attribution matter. It streamline research and can automate many repeatable workflows and complex tasks: market scans, literature overviews, competitive comparisons, and quick technical surveys. Instead of incremental browsing, the model first enumerates all searches it needs, the system executes those queries in parallel, and the model then writes a grounded answer using the returned snippets. Since the agent has access to web on the fly, the results are close to real-time.

Among different use cases, the web research agents in particular allow for scalable data collection of online datasets. The agent’s outputs could be streamed into natural language processing pipelines for further generation of insights. Those insights could be stored in a knowledge base or further refined via more complex systems, such as multi-agent systems where several agents collect data and other agents conduct it’s review.

Organizations could use web research agents for internal uses too, such as using them to conduct research on customer data to enhance customer experience and improve customer support.

How it works (overview)

Planning: Large language model emits a batch of structured tool calls, each containing a Google query.
Execution: The agent runs those queries concurrently through SerpAPI and returns concise title:snippet pairs as tool messages.
Synthesis: With the results in context, the model produces the final answer, including inline citations derived from the snippets.

This pattern keeps latency low (parallel requests), improves coverage (the plan precedes the data), and preserves a step-by-step trace for auditing.

Minimal setup

Python 3.9+ (3.10+ recommended)
Environment variables: OPENAI_API_KEY and SERPAPI_API_KEY

You can obtain OpenAI API key at OpenAI platform website. For SerpAPI key, you can register at SerpApi website. There is a free plan so you can test the agent first. Then install the repository run the agent:

git clone https://github.com/vladm-serpapi/web-research-agent
cd web-research-agent
pip install -r requirements.txt

# set keys in shell (recommended)
export OPENAI_API_KEY="sk-..."
export SERPAPI_API_KEY="..."

python research_agent.py -q "What are the latest approaches to retrieval‑augmented generation in 2025?"

Implementation

The agent is implemented as a single class with a run method. The constructor initializes model configuration, API clients, the tool schema, and the system prompt. Model configuration accepts different AI models, but primarily the ones supported by OpenAI. The code could be extended to optimize this and allow for provider-agnostic inference (e.g. using OpenRouter). Once the agent is built, the run method executes the inference loop until a final answer is produced.

# research_agent.py
class ResearchAgent:
    """LLM‑powered researcher that combines OpenAI o‑series model with SerpAPI."""

    def __init__(
        self,
        model: str = "o3",
        topn: int = 10,
        debug: bool = False,
        openai_key: t.Optional[str] = None,
        serpapi_key: t.Optional[str] = None,
    ) -> None:
        self.model = model
        self.topn = topn
        self.debug = debug
        self.openai_key = openai_key or os.getenv("OPENAI_API_KEY")
        self.serp_key = serpapi_key or os.getenv("SERPAPI_API_KEY")
        if not self.openai_key or not self.serp_key:
            raise RuntimeError("OPENAI_API_KEY and SERPAPI_API_KEY must be set.")

        self.client = OpenAI(api_key=self.openai_key)
        # tools + prompt initialized below ...

In order to actually connect to the Web, we need to provide the model with a tool to do so. self.tools field is initialized with a tool schema that the model will generate when it needs to get the web data. Tool schema definition includes the function name, tool description and the parameter that we want the model to provide. In this case we ask the model to just provide query: string parameter for Google Search. Code:

self.tools = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search Google and return the top result snippets.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Google search string"}
                },
                "required": ["query"],
            },
        },
    }
]

The system prompt is defined in a way that encourages the model to research the user's question using the search_web tool. The prompt requires the model to generate all necessary tool calls in a batch and return them together. This tool batching is necessary to improve the total processing latency. If the model needs to make say 10 requests, then each request will take 1 second, then it will take a total of 10 seconds. If we use batching and run all the calls concurrently, then we will reduce the total run time to just 1 second (ideal case).

self.sys_prompt = (
    "You are a meticulous research assistant.\n"
    "When outside knowledge is needed, you must emit ALL `search_web` tool calls "
    "in a SINGLE assistant message before reading any results.\n\n"
    "You must return them in the exact JSON structure the API expects for `tool_calls`,\n"
    "with each having its own `id`, `type`, and `function` fields.\n"
    "Do not write explanations, just the tool calls.\n\n"
    "Always batch between 2 and 50 calls in a single turn if you need external data.\n"
    "Only after all tool outputs are returned should you write your final, well-cited answer."
)

In practice, this nudges the model to enumerate queries that span the topic (overview, specifics, recent updates) before seeing any retrieved snippets.

The retrieval backend (SerpAPI integration)

Each tool call is executed against SerpAPI’s Google endpoint. The agent returns compact title:snippet pairs, which are sufficient for grounding and efficient on tokens. We use SerpAPI Python SDK in order to make requests and obtain the result snippets.

def _search_web(self, query: str) -> str:
    if self.debug:
        print(f"[DEBUG] → SerpAPI query: '{query}'")
    search = GoogleSearch({"q": query, "api_key": self.serp_key, "num": self.topn})
    org = search.get_dict().get("organic_results", [])[: self.topn]
    return "".join(
        f"- {r.get('title','(untitled)')}: {r.get('snippet','(no snippet)')}"
        for r in org
    ) or "No results found."

Returning concise strings instead of full pages keeps interactions predictable and reduces overhead. It is possible to extend this implementation in the future, to include a scraper tool that will fetch the complete pages. That could allow the model to generate better insights by getting access to more context.

Agentic inference loop

At the top level, run(question: str) builds the message context (system + user), calls the chat completions API, and branches depending on whether the model returned tool calls or a final answer.

def run(self, question: str) -> dict[str, t.Any]:
    messages = [
        {"role": "system", "content": self.sys_prompt},
        {"role": "user", "content": question},
    ]
    steps: list[dict[str, t.Any]] = []

    while True:
        if self.debug:
            print("[DEBUG] → OpenAI chat.completions.create request …")

        resp = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            tools=self.tools,
            tool_choice="auto",
        )
        msg = resp.choices[0].message

        if msg.tool_calls:
            # append assistant message FIRST (per API contract)
            messages.append(msg)

            # fetch all tool results concurrently
            def fetch(call):
                args = json.loads(call.function.arguments)
                q = args["query"]
                steps.append({"type": "tool_call", "query": q})
                return call.id, q, self._search_web(q)

            with ThreadPoolExecutor() as pool:
                results = list(pool.map(fetch, msg.tool_calls))

            # append tool results in the same order as tool_calls
            for call_id, q, result in results:
                steps.append({"type": "tool_result", "content": result})
                messages.append({
                    "role": "tool",
                    "tool_call_id": call_id,
                    "content": result,
                })
            continue  # next iteration → model now has snippets; produce final answer

        # no tool calls → final answer
        answer = msg.content.strip()
        steps.append({"type": "assistant_answer", "content": answer})
        return {"question": question, "answer": answer, "steps": steps}

Key points in the loop:

The agent always appends the assistant’s tool_calls message before returning tool outputs (API contract).
Tool execution is concurrent via ThreadPoolExecutor to reduce latency.
Tool outputs are appended in order and associated with tool_call_id; the next model call then has everything needed to synthesize the answer.

Putting it all together

Finally, a small wrapper exposes the agent for use in the terminal. It parses flags, instantiates ResearchAgent, runs it, prints the final answer, and optionally writes a JSON trace (steps + answer) for auditing. The common argparse library is used to do that. The wrapper exposes then necessary options for the client, such as -q - provide a question, -m - provide the model name, --outfile - specifies file for results, --debug - enables debug logging to trace intermediate execution steps.

The result is a compact, auditable pipeline: the model is doing the planning and generates search requests, search requests are retrieved in parallel via SerpAPI, the model generates an answer with citations - all within a few clear components.

Usage

The code could be used either via the terminal or imported and used directly as a Python module. Examples:

# basic
python research_agent.py -q "State of LLM reasoning benchmarks in 2025"

# save a JSON trace (tool calls, results, final answer)
python research_agent.py -q "Compare FAISS vs. Milvus vs. Qdrant for RAG (2025)" --outfile trace.json

# control model and results per search
python research_agent.py -q "Airline industry trends in 2025" -m gpt-4o -n 8

Using in Python directly:

from research_agent import ResearchAgent

agent = ResearchAgent(model="gpt-4o", topn=10, debug=False)
result = agent.run("Summarize the most cited papers on RAG.")
print(result["answer"])  # final, cited summary
print(len(result["steps"]))  # trace length

Notes on the usage:

Keys: ensure both OPENAI_API_KEY and SERPAPI_API_KEY are available in the environment before running the scripts.
Model behavior: o3 / o4‑mini may prefer fewer tool calls per turn; gpt‑4o often batches more queries when broad coverage is required.
Model’s output: as usual with all AI models hallucination is possible. Human oversight is required to ensure the output quality.

Conclusion and what's next

In this blog post, we showed how to design an Agent capable of running multiple complex research workflows, such as: news scans, literature reviews, vendor/technology comparisons, and quick technical surveys where recency and traceability matter. It runs locally via a CLI or as a library, with optional JSON traces to audit tool calls and outputs.

For future blog posts, we plan to expand the Agent's functionality to include multiple tools (maps, flights, hotels, domain APIs, etc.) to support more complex, goal-directed workflows. We will experiment with different AI tools and generative AI frameworks to build an Agent capable of a more complex decision making.

DEV Community: James Collins

MCP Apps with FastMCP: Turning Tool Output Into Interactive UI

What MCP Apps actually are

Setup

The four ways to build one, and which actually matter

A results table in about ten lines

One design decision: a new tool, not a mode flag

A real dashboard

Scaling across engines

A few things to keep in mind

Wrapping up

The State of MCP: Everything That Changed in H1 2026

Governance moved to a neutral foundation

Adoption kept climbing

The protocol matured

Security caught up with adoption

What this adds up to

Where SerpApi fits

5 Things You Can Build with Claude Code and Live Search Data

1. Cross-Platform Price Comparison

2. Academic Literature Review

3. Local Business Audit

4. Trend-Informed Content Brief

5. Job Market Snapshot

The Pattern Behind All Five

Getting Started

What's Next

Introducing SerpApi's Claude Code Plugin

What Is a Claude Code Plugin?

Key Features

Getting Started

1. Get an API key

2. Install the plugin

3. Search

How It Works

Supported Engines

MCP or Plugin — Which Should You Use?

What's Next

Build an AI Research Agent for Slack and Linear with SerpApi

Table of Contents

Why Build a Workspace Research Agent

Architecture: One Backend, Two Platforms

The Research Engine: SerpApi + LLM

Connecting to Slack

Connecting to Linear’s Agent API

Running It Locally

Seeing It in Action

Linear

Conclusion

Competitive Intelligence Agent: From Slides to Live Signals

The Challenge: Stale Assets and Fragmented Context

How the Competitive Intelligence Agent Behaves

1. Plan

2. Gather

3. Summarize

Key Building Blocks

1. SerpApi for Web, News, and Jobs

2. Optional HubSpot SDK for Internal Context

3. OpenAI for Orchestration and Writing

Practices for Reliable Briefings

Where MCP Fits In

Conclusion: Continuous Competitive Sensing

How to use SerpApi engine schemas in SerpApi MCP to improve tool call quality

Quick recap: SerpApi MCP in your workflow

What’s new: Engine schemas as MCP resources

What this means

How engine schemas improve tool call quality

1. No more guessing parameter names

2. Engine-specific optimization

3. Fewer failed requests

4. Better developer experience

Example: Using Google Flights with MCP engine schemas

Step 1: Discover available engines

Step 2: Load the Google Flights schema

Step 3: Make a structured search call

Example: Shopping search with Google Shopping

Load the engine schema

Perform the search

Using schemas inside VS Code

Summary: Why this feature matters