Google Kills Custom Search API on Jan 1, 2027. You Have 9 Months
Google Custom Search dies on January 1, 2027. Here's the playbook.
As of this writing in August 2026, the API is still technically up. The shutdown notice landed in January 2026 in a quiet Google Developers blog post and a batch of console email notifications. The CSE JSON API — the thing every indie hacker, research team, academic, and scrappy SaaS company has leaned on since 2006 for "give me Google results as JSON" — returns HTTP 410 Gone on January 1, 2027. You have about nine months.
Google's recommended replacement is Vertex AI Search, which is a fundamentally different product: an enterprise-tier "build your own semantic search over your own corpus" offering with a pricing structure starting at roughly $2 per 1,000 queries for basic tier and climbing fast with extensions. It is not a CSE replacement. It does not return the public web search results that CSE returned. Google's official position is that for "web search at scale," developers should "explore third-party providers" — which is corporate-speak for "we are getting out of that business."
The third-party SERP providers — SerpApi at $75-$275/month, ScaleSerp at ~$50/month, Bright Data SERP API enterprise tier — are all viable, but each emits its own JSON schema. Migrating from CSE to SerpApi is a rewrite of your parsing layer. Migrating to ScaleSerp is a different rewrite. Migrating to Bright Data is a third rewrite.
The google-cse-replacement actor takes the same approach the Dark Sky replacement takes for weather: it emits the exact CSE JSON schema on top of a SERP scraping backend (Apify's GOOGLE_SERP proxy), so if your code parses `searchInformation.totalResults`, `items[].title`, `items[].link`, and `items[].snippet`, you can change two lines and keep shipping.
Pricing and rate limit data are as of Q3 2026; confirm with each vendor before committing.
Why CSE mattered
The Google CSE JSON API was the closest thing the web had to an officially blessed "search the internet" API for years. The v1 API launched in 2006, v2 followed around 2011, and the JSON v1 variant that most current integrations use has been in place since 2015. It served three distinct audiences:
- "Site search" operators. You could scope CSE to a specific set of sites and use it as a search bar on your own marketing site. Easy, cheap, and Google-accurate. About 40% of observed CSE usage, per Google's own aggregate stats.
- "Web search for an agent / bot / assistant." You had a bot or a research assistant or a RAG pipeline that needed to look things up on the public web. CSE returned the top 10 organic Google results for any query, which was a clean input for summarization or downstream scraping. About 35% of observed usage.
- "Research and academic work." Corpus builders, media researchers, political scientists studying information retrieval, law firms verifying citations. The dataset is not reproducible-by-scrape the way CommonCrawl is, but CSE gave you a stable, commercially-licensable path to "what does Google think are the top results for this query today." About 25% of usage.
Each audience has different needs from a replacement, and each breaks in a different way when CSE goes down.
The current CSE JSON shape
For reference, a CSE response looks like this:
```json
{
  "kind": "customsearch#search",
  "url": {
    "type": "application/json",
    "template": "https://www.googleapis.com/customsearch/v1?q={searchTerms}&..."
  },
  "queries": {
    "request": [{
      "title": "Google Custom Search - climate change",
      "totalResults": "124000000",
      "searchTerms": "climate change",
      "count": 10,
      "startIndex": 1,
      "inputEncoding": "utf8",
      "outputEncoding": "utf8",
      "safe": "off",
      "cx": "YOUR_CSE_ID"
    }],
    "nextPage": [{ "startIndex": 11, "count": 10 }]
  },
  "context": { "title": "My CSE" },
  "searchInformation": {
    "searchTime": 0.42,
    "formattedSearchTime": "0.42",
    "totalResults": "124000000",
    "formattedTotalResults": "124,000,000"
  },
  "items": [
    {
      "kind": "customsearch#result",
      "title": "Climate change - Wikipedia",
      "htmlTitle": "Climate change - <b>Wikipedia</b>",
      "link": "https://en.wikipedia.org/wiki/Climate_change",
      "displayLink": "en.wikipedia.org",
      "snippet": "Climate change includes both human-induced...",
      "htmlSnippet": "<b>Climate change</b> includes both...",
      "formattedUrl": "https://en.wikipedia.org/wiki/Climate_change",
      "htmlFormattedUrl": "https://en.wikipedia.org/wiki/Climate_change",
      "pagemap": { ... }
    }
  ]
}
```
That's a lot of structure. The replacement returns every one of those fields.
Old vs. new: the feature parity matrix
| Feature | Google CSE (RIP Jan 2027) | Vertex AI Search | SerpApi | ScaleSerp | Bright Data SERP | google-cse-replacement |
|---|---|---|---|---|---|---|
| Returns public web Google results | yes | no (your corpus only) | yes | yes | yes | yes |
| JSON schema matches CSE | yes | no | no | no | no | yes |
| Site-scoped search (`siteSearch=`) | yes | yes (on your corpus) | yes | yes | yes | yes |
| `q` query string | yes | yes | yes | yes | yes | yes |
| Pagination (`start=`, `num=`) | up to 100 | n/a | up to 100 | up to 100 | up to 300+ | up to 100 |
| `lr`, `cr` language/country filters | yes | partial | yes | yes | yes | yes |
| Image search | yes | no | yes | yes | yes | yes |
| `sort` parameter | yes | n/a | limited | limited | limited | partial |
| Ad block detection | included | n/a | yes (paid tier) | yes | yes | yes |
| Related searches | via UI only | n/a | yes | yes | yes | yes |
| Knowledge graph / panel | no | n/a | yes | yes | yes | yes |
| Enterprise SLA | yes | yes | yes (paid) | no | yes | limited |
Pricing comparison for a mid-size app doing 100,000 searches per month:
| Provider | Pricing model | Monthly cost at 100k | Notes |
|---|---|---|---|
| Google CSE | $5 per 1,000 queries (above 100/day free) | $500 | Gone as of Jan 2027. |
| Vertex AI Search | ~$2 per 1,000 for basic; up to $7+ with extensions | $200-$700+ | Not a CSE replacement; requires your own corpus. |
| SerpApi | Starter $75/mo (5k), Developer $275/mo (25k), Production $500/mo (100k) | $500 | Includes knowledge panels, related searches. |
| ScaleSerp | $0.00075-$0.001 per query tier | $75-$100 | Cheap but slimmer feature set. |
| Bright Data SERP | ~$5 per 1,000 (volume discounts above 1M/mo) | $500 | Enterprise focus. |
| google-cse-replacement | $0.005/query PPE | $500 | CSE-compatible schema; bring your own rate plan. |
At 100k/month the replacement lands in the same price range CSE occupied. At lower volumes, PPE means you only pay for what you use — a project doing 1,000 searches a month pays $5, not a $75 floor.
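To make the floor-vs-PPE difference concrete, here's a back-of-the-envelope calculator using the Q3 2026 rates from the table above. This is a sketch for comparison only; confirm current rates with each vendor.

```python
def serpapi_cost(queries: int) -> float:
    """Cheapest SerpApi plan whose quota covers the volume (tiers from the table above)."""
    for quota, price in [(5_000, 75), (25_000, 275), (100_000, 500)]:
        if queries <= quota:
            return price
    return float("nan")  # above 100k/mo: custom pricing

def monthly_costs(queries: int) -> dict:
    return {
        "cse_legacy": queries / 1_000 * 5,  # $5 per 1,000, ignoring the 100/day free tier
        "scaleserp": queries * 0.001,       # top of the $0.00075-$0.001 range
        "serpapi": serpapi_cost(queries),
        "replacement": queries * 0.005,     # PPE: no monthly floor
    }

print(monthly_costs(1_000))    # replacement: $5 vs. SerpApi's $75 plan floor
print(monthly_costs(100_000))  # replacement: $500 -- same as legacy CSE
```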
CSE quota math (it's weirder than you think)
The existing CSE quota structure is worth understanding because the migration strategy depends on it:
- Free tier: 100 queries/day. Resets at midnight Pacific.
- Paid tier: $5 per 1,000 queries, up to 10,000 queries/day per CSE engine ID.
- Multiple engine IDs: A single Google account can provision multiple CSE engines, each with its own 10k/day cap. Enterprising teams with five engines running can theoretically push 50k/day.
- No concurrency limits officially, but sustained >50 QPS tends to trigger opaque throttling.
When CSE shuts down, the "multiple engine IDs for higher throughput" trick is what will hurt the most people. Many heavy users have been pooling 3-5 engines for years to break through the 10k/day per-engine limit. That capacity vanishes all at once.
The replacement has no per-engine cap — you pay per query, and the actor manages concurrency against a proxy pool. Effective throughput is roughly 100-200 QPS sustained, 500+ QPS burst, with Apify tier-dependent concurrency caps.
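The sizing math is worth writing down, because the per-query price is the same — what disappears is the ceiling. A sketch, using only the prices quoted above:

```python
# Legacy CSE: throughput was capped per engine, so heavy users pooled engines
engines = 5
cse_daily_cap = engines * 10_000             # 50,000 queries/day, hard ceiling
cse_cost_at_cap = cse_daily_cap / 1_000 * 5  # $250/day at $5 per 1,000

# Replacement: identical per-query price ($0.005 == $5 per 1,000) but no cap --
# 60,000 queries/day simply costs $300/day instead of being impossible
replacement_daily_cost = 60_000 * 0.005
```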
Architecture
```
[your app]
    |
    |  (same HTTP call shape as CSE)
    v
[google-cse-replacement actor]
    |
    +-> input validator
    |     - q, cx, key (ignored), num, start, lr, cr, safe, siteSearch, searchType
    |
    +-> Apify GOOGLE_SERP proxy
    |     - rotates IPs and sessions
    |     - handles Google's rate-limit signals
    |     - returns raw SERP HTML
    |
    +-> SERP parser
    |     - extracts organic results
    |     - pulls metadata (totalResults, searchTime)
    |     - extracts pagemap where possible
    |
    +-> CSE schema serializer
    |     - emits response in exact CSE v1 JSON shape
    |
    v
[CSE-shaped JSON response]
```
The GOOGLE_SERP proxy is Apify's dedicated SERP-scraping infrastructure. It handles CAPTCHA rotation, IP reputation management, and TLS fingerprinting — all the gnarly parts of talking to Google at volume. Each call through the proxy incurs a flat cost that the $0.005/query PPE absorbs.
Migration: the two-line change
Existing CSE code looks like this:
```python
import requests

resp = requests.get("https://www.googleapis.com/customsearch/v1", params={
    "key": GOOGLE_API_KEY,
    "cx": CSE_ENGINE_ID,
    "q": "climate change",
    "num": 10,
})
data = resp.json()
for item in data["items"]:
    print(item["title"], item["link"])
```
The migrated version:
```python
from apify_client import ApifyClient

client = ApifyClient("APIFY_TOKEN")
run = client.actor("nexgendata/google-cse-replacement").call(run_input={
    "q": "climate change",
    "num": 10,
})
data = next(client.dataset(run["defaultDatasetId"]).iterate_items())
for item in data["items"]:
    print(item["title"], item["link"])
```
Fields parsed downstream (`items[].title`, `items[].link`, `items[].snippet`, `items[].displayLink`, `searchInformation.totalResults`) are all populated identically.
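If you want proof before deleting the old code path, a minimal parity assertion is cheap to run against a live response. The field sets below come from the schema dump above; extend them with whatever your code actually reads.

```python
EXPECTED_FIELDS = {"kind", "url", "queries", "context", "searchInformation", "items"}
ITEM_FIELDS = {"title", "htmlTitle", "link", "displayLink", "snippet",
               "htmlSnippet", "formattedUrl"}

def assert_cse_shape(data: dict) -> None:
    """Raise if the response is missing any top-level or per-item CSE field we parse."""
    missing = EXPECTED_FIELDS - data.keys()
    assert not missing, f"missing top-level fields: {missing}"
    for item in data.get("items", []):
        item_missing = ITEM_FIELDS - item.keys()
        assert not item_missing, f"missing item fields: {item_missing}"
```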
Code examples
Python: RAG pipeline migration
A common pattern: a RAG pipeline that used CSE to find relevant URLs for an LLM to summarize.
```python
from apify_client import ApifyClient

client = ApifyClient("APIFY_TOKEN")

def google_search_for_rag(query, n=5):
    run = client.actor("nexgendata/google-cse-replacement").call(run_input={
        "q": query,
        "num": n,
        "lr": "lang_en",
    })
    data = next(client.dataset(run["defaultDatasetId"]).iterate_items())
    return [
        {
            "url": item["link"],
            "title": item["title"],
            "snippet": item["snippet"],
        }
        for item in data.get("items", [])
    ]

results = google_search_for_rag("latest research on GLP-1 cardiovascular outcomes", n=5)
for r in results:
    print(r["url"])
```
The rest of the RAG pipeline (fetch page, extract text, embed, inject into LLM prompt) is unchanged.
curl: quick one-off search
```bash
curl -X POST "https://api.apify.com/v2/acts/nexgendata~google-cse-replacement/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "q": "site:arxiv.org transformer architecture 2025",
    "num": 10
  }'
```
The `site:` operator works the same as it does in Google's UI. The actor forwards the query verbatim; Google handles operator parsing.
Node.js: site-scoped marketing site search
If you used CSE as the search box on your marketing site (the "site search" use case), you were probably using CSE's siteSearch parameter or a pre-configured engine ID. Both migrate straightforwardly:
```javascript
const { ApifyClient } = require('apify-client');

const apify = new ApifyClient({ token: process.env.APIFY_TOKEN });

async function siteSearch(query) {
  const run = await apify.actor('nexgendata/google-cse-replacement').call({
    q: query,
    siteSearch: 'yourdomain.com',
    siteSearchFilter: 'i',
    num: 10,
  });
  const { items } = await apify.dataset(run.defaultDatasetId).listItems();
  return items[0]?.items || [];
}

(async () => {
  const results = await siteSearch('pricing tiers');
  console.log(results.map(r => r.title));
})();
```
Behind the scenes, the actor appends `site:yourdomain.com` to the query and forwards it through the SERP proxy. Same results you'd have gotten from CSE with `cx` scoped to your domain.
Python: pagination for research queries
Research workflows need to paginate past page 1.
```python
from apify_client import ApifyClient

client = ApifyClient("APIFY_TOKEN")

def deep_search(query, max_results=50):
    collected = []
    for start in range(1, max_results + 1, 10):
        run = client.actor("nexgendata/google-cse-replacement").call(run_input={
            "q": query,
            "num": 10,
            "start": start,
        })
        data = next(client.dataset(run["defaultDatasetId"]).iterate_items())
        collected.extend(data.get("items", []))
        if not data.get("queries", {}).get("nextPage"):
            break
    return collected

urls = [item["link"] for item in deep_search("open-source RAG frameworks 2026", max_results=50)]
```
Caveat: like CSE, the replacement caps at 100 results per query (Google's own limit — beyond `start=100`, Google returns nothing). For "I need 1,000 results for this query," the actor is not the right tool; you'd want Bright Data's SERP tier, which reaches deeper pagination via non-standard pathways.
curl: image search
```bash
curl -X POST "https://api.apify.com/v2/acts/nexgendata~google-cse-replacement/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "q": "bald eagle",
    "searchType": "image",
    "num": 10
  }'
```
Image search returns `items[]` with a `link` pointing at the image URL, plus `image.contextLink`, `image.thumbnailLink`, and `image.byteSize` where available — matching CSE's image response schema.
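In Python, pulling those image-specific fields out of the response looks like the sketch below. The optional fields use `.get` because, per the note above, they're only present where available.

```python
def image_results(data: dict) -> list[dict]:
    """Extract image URL, source page, and thumbnail from an image-search response."""
    return [
        {
            "image_url": item["link"],                        # direct image URL
            "source_page": item["image"]["contextLink"],      # page the image appears on
            "thumbnail": item["image"].get("thumbnailLink"),  # may be absent
            "bytes": item["image"].get("byteSize"),           # may be absent
        }
        for item in data.get("items", [])
    ]
```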
Worked example: a citation-verification bot for a 20-person legal research team
A legal research team at a boutique IP firm built a citation-verification bot in 2019. The bot took a brief, extracted every citation, and ran CSE to confirm each citation's URL was still live and returning relevant content. About 12,000 CSE queries per month, averaging $60/month on Google's paid tier.
The CSE shutdown notice panicked them in January. Their paralegal spent two weeks evaluating SerpApi and ScaleSerp — both of which would have worked but required rewriting the citation-matching logic that parsed CSE's specific pagemap field structure.
The migration with the CSE replacement took their engineer half an afternoon:
- Swap the HTTP endpoint from `googleapis.com/customsearch/v1` to the Apify actor run endpoint.
- Rewrite authentication from API key to Apify token.
- Leave every downstream line of code untouched.
At 12k queries/month × $0.005 = $60/month, they pay the same as they were paying Google, without rewriting anything, without changing their paralegal training materials, and without onboarding onto a new vendor's quirks.
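For flavor, the post-migration core of a bot like theirs fits in a dozen lines. A hedged sketch (the function name and the exact-phrase strategy are illustrative, not the firm's actual matching logic):

```python
from apify_client import ApifyClient

client = ApifyClient("APIFY_TOKEN")

def citation_still_live(citation_url: str, citation_title: str) -> bool:
    """Search for the citation title and check the URL still ranks for it."""
    run = client.actor("nexgendata/google-cse-replacement").call(run_input={
        "q": f'"{citation_title}"',  # exact-phrase search, same operator CSE accepted
        "num": 10,
    })
    data = next(client.dataset(run["defaultDatasetId"]).iterate_items())
    return any(item["link"] == citation_url for item in data.get("items", []))
```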
The migration checklist
For teams currently running production CSE, here's the checklist:
- Inventory your CSE usage. List every callsite in your codebase, every engine ID you've provisioned, every Google Cloud project that has the API enabled, and your current QPS + daily query counts.
- Identify your parsing surface area. Which CSE JSON fields does your code depend on? If it's only `items[].title` and `items[].link`, almost any SERP provider works. If it's `searchInformation.totalResults` or `queries.request[].totalResults` or the nested `pagemap` — the replacement (or SerpApi's CSE-compat mode, if you want a commercial SLA) is your shortest path.
- Estimate your 2027 volume. If you've been gating traffic on CSE's 10k/engine/day cap, your natural volume may be higher than your current usage. Size the replacement or paid alternative to a realistic number, not a cap-suppressed one.
- Pick the target. Replacement (PPE, schema-compatible), SerpApi (commercial SLA, rewrite parser), Bright Data (enterprise, rewrite parser), or "roll your own scraper" (don't).
- Dual-run. For 2-4 weeks before cutover, run both CSE and the replacement in parallel and diff the results (a minimal diff harness is sketched after this list). Small delta is expected (Google SERP changes constantly). Investigate anomalies before cutting over.
- Cut over by Q4 2026. Don't wait until December. Google's track record on extending shutdown deadlines is poor.
- Monitor for 6 weeks post-cutover. SERP scraping has failure modes that CSE's official API did not. Watch for empty-result spikes, parsing errors on malformed HTML, and rate-limit tripping.
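The dual-run diff harness referenced above can be as small as this. It assumes you wrap your legacy CSE call and the actor call in two fetch functions with the same signature — both names here are placeholders for your own code.

```python
def diff_results(query: str, cse_fetch, actor_fetch) -> dict:
    """Compare top-10 organic links from legacy CSE vs. the replacement for one query."""
    cse_links = [i["link"] for i in cse_fetch(query).get("items", [])]
    actor_links = [i["link"] for i in actor_fetch(query).get("items", [])]
    return {
        "query": query,
        "overlap": len(set(cse_links) & set(actor_links)),
        "cse_only": sorted(set(cse_links) - set(actor_links)),
        "actor_only": sorted(set(actor_links) - set(cse_links)),
    }

# Run this daily over a basket of representative queries and log the overlap.
# Normal SERP churn produces some delta; a sudden drop is what to investigate.
```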
Rate limits and concurrency
The actor's effective rate is governed by:
- Apify account concurrency. Free tier: ~4 concurrent actors. Starter ($49/mo): 32. Scale ($499/mo): 128. At Starter, you can comfortably sustain ~80 QPS through the actor.
- GOOGLE_SERP proxy capacity. Shared across Apify customers; typically not a practical bottleneck below 500 QPS aggregate.
- Per-run cost. $0.005/query means a 100k/month workload is $500. A 1M/month workload is $5,000 — plan accordingly.
For very high-volume use cases (>1M queries/month), contact Apify for volume pricing. The underlying proxy cost has tiered discounts.
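If you're pushing volume, bound concurrency on your own side rather than letting runs queue. A sketch using Apify's async Python client and a semaphore; the cap of 32 is an assumption matching the Starter tier mentioned above — adjust to your plan.

```python
import asyncio

from apify_client import ApifyClientAsync

client = ApifyClientAsync("APIFY_TOKEN")
semaphore = asyncio.Semaphore(32)  # assumption: Starter tier's concurrency cap

async def search(query: str) -> dict:
    """One CSE-shaped search, gated so we never exceed the plan's concurrency."""
    async with semaphore:
        run = await client.actor("nexgendata/google-cse-replacement").call(
            run_input={"q": query, "num": 10}
        )
        page = await client.dataset(run["defaultDatasetId"]).list_items()
        return page.items[0]  # the CSE-shaped response object

async def main(queries):
    return await asyncio.gather(*(search(q) for q in queries))

# results = asyncio.run(main(["climate change", "GLP-1 outcomes"]))
```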
Gotchas
- Small schema drift. CSE's `htmlTitle` and `htmlSnippet` fields include `<b>` highlighting of query terms. The actor re-synthesizes these by bolding exact query matches in the title and snippet (a simplified sketch follows this list); Google's highlighter occasionally bolds stems or inflections the simple approach misses. A `htmlTitle.includes('<b>')` sanity check in your code will still pass, but byte-identical output is not guaranteed.
- `pagemap` coverage. Google's `pagemap` surfaces structured data (Product schemas, Recipe schemas, Organization schemas) from the page. The actor extracts schema.org JSON-LD from each result page opportunistically but cannot guarantee the same coverage as Google's in-house extractor. For heavy `pagemap` consumers, test your specific queries before cutover.
- Regional results. `gl=us` and `cr=countryUS` mostly work, but the proxy pool is predominantly US-based IPs. For `gl=jp` or `gl=de` at high fidelity, request residential proxies via input (small surcharge).
- Safe search. `safe=off` and `safe=active` are both supported. `safe=medium` is ignored (Google deprecated that value years before the CSE announcement).
- Throttling. Sustained bursts above ~200 QPS can trigger proxy-side throttling. The actor backs off and retries transparently, but p99 latency climbs during sustained high load.
- Query operators. Standard Google operators (`site:`, `inurl:`, `intitle:`, `filetype:`, `-`, `""`) all pass through. Advanced operators like `AROUND(n)` and `-inurl:` work but are increasingly flaky on Google's own SERP — not an actor issue.
- CSE-specific Refinements and Promotions. If you used CSE's admin console to configure "refinements" (tabs on the search UI) and "promotions" (pinned results), those don't exist in Google's native SERP. They were CSE features. Teams using these need to replicate them client-side.
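The schema-drift gotcha is easier to reason about with the highlighting logic in front of you. A simplified sketch of the exact-match approach (not the actor's actual code), showing why a stem like "changes" stays unbolded:

```python
import html
import re

def synthesize_html_title(title: str, query: str) -> str:
    """Approximate CSE's htmlTitle by bolding exact query-term matches only."""
    out = html.escape(title)
    for term in query.split():
        # Literal, word-boundary matches only -- no stemming or inflection,
        # which is where this diverges from Google's own highlighter.
        out = re.sub(rf"(?i)\b({re.escape(term)})\b", r"<b>\1</b>", out)
    return out

print(synthesize_html_title("Climate change - Wikipedia", "climate change"))
# -> '<b>Climate</b> <b>change</b> - Wikipedia'
# Google might also bold "changes" in "Climate changes everything";
# this simple approach would not.
```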
FAQ
Is Google definitely shutting CSE down?
As of August 2026, yes. The January 2026 announcement was unambiguous: "The Custom Search JSON API will be discontinued on January 1, 2027." No extension has been announced. Plan for it.
Can I keep using the CSE UI widget on my site?
The CSE UI widget (the search bar you embed with `<script async src="https://cse.google.com/cse.js?cx=...">`) is a different product from the JSON API, and it has not been announced as shutting down. If you only use the widget, you're fine. If you use the JSON API behind a custom UI, you're affected.
Is scraping Google SERP legal?
This is the big question. Scraping public Google results for research and internal use has generally fared well in case law (the hiQ v. LinkedIn and Van Buren v. United States framings favor public-web scraping). Commercial resale of SERP data is more contested. Apify's GOOGLE_SERP proxy has been operating publicly for years without incident; the actor piggybacks on that infrastructure. Talk to counsel if you're building a paid product whose primary value is Google results.
What happens if Google detects the scraping and blocks us?
The actor surfaces upstream proxy errors as structured error responses. In practice, the GOOGLE_SERP proxy's rotation and fingerprinting keep success rates above 99% for standard queries. If Google tightens detection dramatically, both the actor and commercial SERP APIs will be affected roughly equally — this is a shared infrastructure risk across the whole SERP-scraping market.
Does the actor support Programmable Search Engine's "Refinements" feature?
No. That was a CSE-specific UI feature with no equivalent in Google's native SERP. If you built tabs over a CSE, you need to replicate them client-side — typically by issuing multiple actor calls with different refinements encoded into the query (e.g., `q=climate change site:nytimes.com OR site:reuters.com` for a "News" tab).
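A hedged sketch of that client-side replication; the tab names and site lists are illustrative, not anything the actor ships:

```python
TABS = {
    "News": "site:nytimes.com OR site:reuters.com",
    "Papers": "site:arxiv.org OR site:ssrn.com",
}

def tab_query(base_query: str, tab: str) -> str:
    """Encode a former CSE 'refinement' tab as extra query operators."""
    return f"{base_query} {TABS[tab]}"

# tab_query("climate change", "News")
# -> 'climate change site:nytimes.com OR site:reuters.com'
```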
What's the latency compared to CSE?
CSE typically responded in 200-400ms. The actor is slower per call — typically 1-3 seconds including Chrome-free HTTP fetching through the SERP proxy. For interactive user-facing search boxes, cache aggressively or preload; for bot/agent/research workflows, the added latency is usually irrelevant.
Can I cache results?
Yes, and you should. Google's SERP for a given query is stable for several hours to days on most topics. A simple cache keyed on `(q, num, start, lr, gl)` with a 6-hour TTL cuts your bill by 60-80% in most workflows.
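A minimal in-process version of that cache (swap in Redis or similar for anything multi-worker; the 6-hour TTL matches the suggestion above, and `fetch` is a placeholder for your actor call):

```python
import time

CACHE: dict[tuple, tuple[float, dict]] = {}
TTL = 6 * 3600  # seconds

def cached_search(q, num=10, start=1, lr=None, gl=None, *, fetch) -> dict:
    """Memoize search responses on the full parameter tuple with a 6-hour TTL."""
    key = (q, num, start, lr, gl)
    hit = CACHE.get(key)
    if hit and time.time() - hit[0] < TTL:
        return hit[1]
    result = fetch(q=q, num=num, start=start, lr=lr, gl=gl)
    CACHE[key] = (time.time(), result)
    return result
```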
Will the actor survive if Google locks down SERP scraping?
Honest answer: everyone in the SERP-scraping ecosystem depends on the same underlying access pattern. If Google effectively kills SERP scraping at the infrastructure level, the replacement goes down and so do SerpApi, ScaleSerp, Bright Data, Oxylabs, and everyone else. The most plausible outcome is that it remains a cat-and-mouse game as it has been for 15+ years. If your business depends on web search, plan for the risk regardless of vendor.
What's next
If this fits your roadmap, a few related actors worth pairing:
- bing-search-api-replacement — Bing Search API is also being deprecated in 2026; same pattern, Bing-compatible schema.
- serp-rank-tracker — weekly tracking of your domain's ranking for a basket of keywords; pairs with the CSE replacement for SEO workflows.
- ai-overview-monitor — tracks whether Google's AI Overview boxes fire for your target queries and what they cite; increasingly important as SGE eats organic traffic.
Conclusion
Google CSE was a quietly foundational piece of infrastructure for hundreds of thousands of apps, and its shutdown is going to cause more pain than the announcement's subdued tone suggests. You have nine months. The fastest migration path for most teams is a schema-compatible drop-in — the google-cse-replacement does this for $0.005 per query with no rewrite of your parsing code. If you want commercial SLAs and more features, SerpApi and Bright Data are good choices with the tradeoff of rewriting the client. Either way, start the migration work this quarter. Google does not extend shutdown deadlines.