Mox Loop

Adding Real-Time Google AI Mode Data to Your OpenClaw Agent with Pangolinfo's AI Search Skill

TL;DR

Google AI Mode generates dynamic AI summaries at the top of search results. Most Agent frameworks can't see them. Pangolinfo's AI Mode API (now packaged as an official OpenClaw Skill) fixes this in a single POST request. This post covers the complete integration — API spec, Python code, error handling, and production tips.


OpenClaw AI Search Skill architecture: Pangolinfo AI Mode API delivering real-time Google AI Overview data to your Agent

The Problem

If you're building Agent workflows that depend on Google search data, there's a systematic gap you may not have noticed yet: Google AI Overview content is invisible to standard scrapers.

AI Overview (part of Google's AI Mode, triggered by the udm=50 URL parameter) generates summaries server-side via LLM inference and injects them into the page via JavaScript after initial render. Consequences:

  • Static HTTP requests: don't see it
  • Headless browsers: need precise render timing + anti-detection to capture it reliably
  • DIY maintenance: proxy pools + fingerprint rotation + Google update tracking (ongoing)

For an OpenClaw Agent doing keyword research, competitor monitoring, or content strategy, this means a meaningful slice of what users actually see on Google is systematically absent from Agent intelligence.


The Solution: Pangolinfo AI Mode API as an OpenClaw Skill

Pangolinfo's AI Overview SERP API handles the infrastructure complexity server-side. You send a standard POST request; you get back structured JSON with the full AI Overview content.

API Specification

AI Mode API call flow: OpenClaw Skill → Pangolinfo API → Google AI Overview → structured JSON response

POST https://scrapeapi.pangolinfo.com/api/v2/scrape

Headers:
  Content-Type: application/json
  Authorization: Bearer {YOUR_API_KEY}

Body:
  url         (string, required)    → Google Search URL with udm=50
  parserName  (string, required)    → "googleAISearch"  ← this selects AI Mode parser
  screenshot  (boolean, optional)   → return page screenshot URL
  param       (string[], optional)  → multi-turn prompts, max 5 (>5 = slower response)

Key point: The udm=50 parameter in the URL is what triggers Google's AI Mode interface. Without it, you get standard SERP results.
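For illustration, here's a tiny helper (not part of the Skill class below) that builds an AI Mode URL with proper encoding:

```python
from urllib.parse import quote


def ai_mode_url(query: str, num_results: int = 10) -> str:
    """Build a Google Search URL that triggers AI Mode via udm=50."""
    return f"https://www.google.com/search?num={num_results}&udm=50&q={quote(query)}"


url = ai_mode_url("javascript async await")
# → https://www.google.com/search?num=10&udm=50&q=javascript%20async%20await
```

`quote` percent-encodes spaces and other unsafe characters, so multi-word queries survive the round trip intact.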

Complete curl Example

curl --request POST \
  --url https://scrapeapi.pangolinfo.com/api/v2/scrape \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "url": "https://www.google.com/search?num=10&udm=50&q=javascript+async+await",
    "parserName": "googleAISearch",
    "screenshot": false,
    "param": ["explain with examples", "compare to promises"]
  }'

Response Structure

{
  "code": 0,
  "message": "ok",
  "data": {
    "ai_overview": 1,
    "json": {
      "type": "organic",
      "items": [{
        "type": "ai_overview",
        "items": [{
          "type": "ai_overview_elem",
          "content": [
            "async/await is syntactic sugar over Promises...",
            "An async function always returns a Promise..."
          ]
        }],
        "references": [{
          "type": "ai_overview_reference",
          "title": "MDN Web Docs - async function",
          "url": "https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/async_function",
          "domain": "MDN Web Docs"
        }]
      }]
    },
    "screenshot": null,
    "taskId": "1768988520324-example",
    "url": "https://www.google.com/search?num=10&udm=50&q=javascript+async+await"
  }
}

Credit consumption: 2 credits per call when ai_overview: 1 (AI Overview successfully retrieved).
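To make the nesting concrete, here's a minimal standalone sketch of walking that structure — field names match the sample response above, and the full Skill class does the same thing more robustly:

```python
def extract_overview(resp: dict) -> tuple[list[str], list[dict]]:
    """Pull AI Overview bullet points and source references from the API response."""
    content: list[str] = []
    refs: list[dict] = []
    # Bail out early on API errors or queries that triggered no AI Overview.
    if resp.get("code") != 0 or not resp.get("data", {}).get("ai_overview"):
        return content, refs
    for item in resp["data"].get("json", {}).get("items", []):
        if item.get("type") != "ai_overview":
            continue
        for sub in item.get("items", []):
            if sub.get("type") == "ai_overview_elem":
                content.extend(sub.get("content", []))
        refs.extend(item.get("references", []))
    return content, refs
```

The early return means callers can treat "no overview" and "API error" identically as an empty result, then inspect `code` separately if they need to distinguish the two.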


OpenClaw Skill Implementation

Core Skill Class (Production-Ready)

"""
pangolin_ai_search_skill.py

OpenClaw AI Search Skill — wraps Pangolinfo AI Mode API.
Provides real-time Google AI Overview access for your Agent.
"""

from __future__ import annotations

import asyncio
import aiohttp
import requests
from urllib.parse import quote
from typing import Optional
from dataclasses import dataclass, field


@dataclass
class AIOverviewData:
    """Structured result from Google AI Mode query."""
    query: str
    has_ai_overview: bool
    ai_content: list[str] = field(default_factory=list)
    references: list[dict] = field(default_factory=list)
    screenshot_url: str = ""
    task_id: str = ""

    def summary(self) -> str:
        """Agent-friendly text summary of the result."""
        if not self.has_ai_overview:
            return f"No AI Overview found for: {self.query}"
        lines = [f"## Google AI Overview: {self.query}\n"]
        lines += [f"- {c}" for c in self.ai_content]
        if self.references:
            lines.append("\n**Sources:**")
            lines += [f"  - [{r['domain']}] {r['title']}" for r in self.references]
        return "\n".join(lines)


class PangolinAISearchSkill:
    """
    OpenClaw AI Search Skill

    Integrates Pangolinfo AI Mode API for real-time Google AI Overview data.
    Supports both sync (single queries) and async (batch queries) interfaces.

    Credit cost: 2 credits per successful AI Overview retrieval.
    """

    _ENDPOINT = "https://scrapeapi.pangolinfo.com/api/v2/scrape"
    _GOOGLE_AI_MODE = "https://www.google.com/search?num=10&udm=50&q={q}"
    _MAX_PARAMS = 5  # Official limit; exceeding reduces response efficiency

    def __init__(self, api_key: str, timeout: int = 30) -> None:
        if not api_key:
            raise ValueError("Pangolinfo API key is required.")
        self._timeout = timeout
        self._headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        }

    # ------------------------------------------------------------------
    # Sync interface
    # ------------------------------------------------------------------

    def search(
        self,
        query: str,
        follow_up: Optional[list[str]] = None,
        screenshot: bool = False,
    ) -> AIOverviewData:
        """Synchronous single-query search."""
        payload = self._build_payload(query, follow_up, screenshot)
        resp = requests.post(
            self._ENDPOINT, headers=self._headers, json=payload, timeout=self._timeout
        )
        resp.raise_for_status()
        return self._parse(query, resp.json())

    # ------------------------------------------------------------------
    # Async interface (recommended for batch workloads)
    # ------------------------------------------------------------------

    async def search_async(
        self,
        query: str,
        session: aiohttp.ClientSession,
        follow_up: Optional[list[str]] = None,
        screenshot: bool = False,
    ) -> AIOverviewData:
        """Async single-query search."""
        payload = self._build_payload(query, follow_up, screenshot)
        async with session.post(
            self._ENDPOINT,
            headers=self._headers,
            json=payload,
            timeout=aiohttp.ClientTimeout(total=self._timeout),
        ) as resp:
            data = await resp.json()
            return self._parse(query, data)

    async def batch_search(
        self,
        queries: list[str],
        max_concurrent: int = 5,
    ) -> list[AIOverviewData | Exception]:
        """
        Async batch search with bounded concurrency.

        Args:
            queries: List of search queries
            max_concurrent: Max simultaneous requests (5–10 is safe for most use cases)

        Returns:
            List of AIOverviewData results (or Exceptions for failed queries)
        """
        sem = asyncio.Semaphore(max_concurrent)

        async def bounded(q: str, session: aiohttp.ClientSession):
            async with sem:
                return await self.search_async(q, session)

        connector = aiohttp.TCPConnector(limit=max_concurrent)
        async with aiohttp.ClientSession(connector=connector) as session:
            tasks = [bounded(q, session) for q in queries]
            return await asyncio.gather(*tasks, return_exceptions=True)

    # ------------------------------------------------------------------
    # Private helpers
    # ------------------------------------------------------------------

    def _build_payload(
        self, query: str, follow_up: Optional[list[str]], screenshot: bool
    ) -> dict:
        payload: dict = {
            "url": self._GOOGLE_AI_MODE.format(q=quote(query)),
            "parserName": "googleAISearch",
            "screenshot": screenshot,
        }
        if follow_up:
            payload["param"] = follow_up[: self._MAX_PARAMS]
        return payload

    def _parse(self, query: str, raw: dict) -> AIOverviewData:
        if raw.get("code") != 0:
            raise RuntimeError(f"API error [{raw.get('code')}]: {raw.get('message')}")
        data = raw["data"]
        result = AIOverviewData(
            query=query,
            has_ai_overview=bool(data.get("ai_overview")),
            screenshot_url=data.get("screenshot") or "",
            task_id=data.get("taskId") or "",
        )
        for item in data.get("json", {}).get("items", []):
            if item.get("type") != "ai_overview":
                continue
            for sub in item.get("items", []):
                if sub.get("type") == "ai_overview_elem":
                    result.ai_content.extend(sub.get("content", []))
            result.references.extend(
                {"title": r["title"], "url": r["url"], "domain": r["domain"]}
                for r in item.get("references", [])
            )
        return result

Usage Examples

# Initialize
skill = PangolinAISearchSkill(api_key="YOUR_PANGOLINFO_API_KEY")

# --- Single sync query ---
result = skill.search("best Python web scraping library 2026")
print(result.summary())

# --- With multi-turn follow-up ---
result = skill.search(
    "asyncio tutorial python",
    follow_up=["show me error handling", "compare to threading"]
)
if result.has_ai_overview:
    for point in result.ai_content:
        print(f"{point}")

# --- Async batch (recommended for bulk keyword research) ---
keywords = [
    "OpenClaw agent configuration",
    "Google AI Mode scraping",
    "Pangolinfo API review",
    "SERP data extraction tools",
]

results = asyncio.run(skill.batch_search(keywords, max_concurrent=3))

for r in results:
    if isinstance(r, Exception):
        print(f"Error: {r}")
    else:
        status = "✅" if r.has_ai_overview else "⚠️ no AI Overview:"
        print(f"{status} {r.query}")

Production Considerations

Concurrency: Keep max_concurrent at 5–10 for most workloads. Higher values can cause timeout clustering under load.

Multi-turn prompts: The param array supports up to 5 items per the API docs — beyond that, response efficiency degrades. Design multi-turn scenarios as sequential calls rather than single large batches.
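One way to sketch that sequential pattern — split a long prompt list into batches of at most five per call (`chunk_follow_ups` is a hypothetical helper of mine, not part of the API):

```python
def chunk_follow_ups(prompts: list[str], max_per_call: int = 5) -> list[list[str]]:
    """Split a long follow-up list into API-sized batches of at most 5 prompts each."""
    return [prompts[i:i + max_per_call] for i in range(0, len(prompts), max_per_call)]


# Usage sketch: issue one API call per batch instead of overloading a single call.
# for batch in chunk_follow_ups(all_prompts):
#     result = skill.search(query, follow_up=batch)
```

Each batch becomes its own request, so no single call exceeds the documented five-prompt limit.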

AI Overview trigger rate: Not every query generates an AI Overview. Informational queries (how/what/why) trigger at much higher rates than navigational (brand names) or transactional ones. Always check has_ai_overview before processing so an absent overview isn't mistaken for an API failure.

Caching: For repeated queries on hot keywords, a simple TTL cache (1–2 hours) significantly reduces credit consumption without meaningfully affecting data freshness.
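A minimal sketch of such a cache — in-process and not thread-safe, so it assumes a single-worker setup:

```python
import time


class TTLCache:
    """Minimal TTL cache for AI Overview results (sketch; not thread-safe)."""

    def __init__(self, ttl_seconds: float = 3600.0) -> None:
        self._ttl = ttl_seconds
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str):
        """Return the cached value, or None if missing or expired."""
        entry = self._store.get(key)
        if entry and time.monotonic() - entry[0] < self._ttl:
            return entry[1]
        self._store.pop(key, None)  # drop stale entries lazily
        return None

    def put(self, key: str, value) -> None:
        self._store[key] = (time.monotonic(), value)


# Usage sketch: check the cache before spending credits.
# cached = cache.get(query)
# result = cached if cached else skill.search(query)
# if not cached:
#     cache.put(query, result)
```

With the default one-hour TTL, a hot keyword queried 50 times an hour costs 2 credits instead of 100.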

Credit budgeting: At 2 credits per AI Overview call, budget planning is straightforward. Consider routing queries that don't need AI Overview to standard parsers to conserve credits.
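One possible routing sketch — the keyword cues below are my own rough heuristic, not an official query classification:

```python
# Rough informational-intent cues; tune for your own keyword set.
INFORMATIONAL_CUES = ("how", "what", "why", "best", "vs", "tutorial", "guide")


def needs_ai_overview(query: str) -> bool:
    """Heuristic router: send likely-informational queries to the AI Mode
    parser (2 credits), everything else to a cheaper standard SERP parser."""
    tokens = query.lower().split()
    return any(cue in tokens for cue in INFORMATIONAL_CUES)


# Usage sketch:
# result = skill.search(q) if needs_ai_overview(q) else standard_serp_search(q)
```

Even a crude filter like this can cut credit spend substantially on mixed keyword lists, since navigational queries rarely trigger an AI Overview anyway.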


Where to Go From Here

Questions about the integration? Drop them in the comments — happy to help debug edge cases.


Tags: #api #python #aiagent #webdev #googlesearch #webparsing #automation #llm
