DEV Community

Ken-Mutisya

Google Publishes Every Advertiser's Ads. Here Is the Keyless API Behind the Page

If you have ever wanted to see every Google ad a competitor is running, Google already gives you the data: the Ads Transparency Center lists the live creatives of every verified advertiser, across Search, Display, and YouTube. What it does not give you is an export button.

The page itself, though, runs on a small set of JSON RPC endpoints that work without a login, a cookie, or an API key. Here is how they fit together, including the one undocumented field that cost me an hour.

Finding an advertiser

Advertisers are identified by IDs that start with AR. You resolve a brand name to IDs with a POST:

POST https://adstransparency.google.com/anji/_/rpc/SearchService/SearchSuggestions?authuser=0
content-type: application/x-www-form-urlencoded

f.req={"1":"Nike","2":20,"3":20}

The body is a protobuf message rendered as JSON, so the keys are field numbers. Field 1 is your query; fields 2 and 3 are result counts. Send the counts as plain integers: pass an object instead and the service answers with a BadRequestException about failing to convert your request.
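
As a minimal sketch of that body, the TypeScript helper below serializes the numbered fields and percent-encodes them into the f.req form parameter. The name buildSuggestionsBody and the default count of 20 (copied from the example request) are illustrative choices, not part of any documented API.

```typescript
// Hypothetical helper: builds the form-encoded body for SearchSuggestions.
// Field numbers follow the article: 1 = query, 2 and 3 = result counts.
function buildSuggestionsBody(query: string, count: number = 20): string {
  if (!Number.isInteger(count)) {
    // The article reports a BadRequestException when the counts are not plain integers.
    throw new RangeError("count must be an integer");
  }
  const payload = { "1": query, "2": count, "3": count };
  // The content type is application/x-www-form-urlencoded,
  // so the JSON text is percent-encoded before it goes on the wire.
  return "f.req=" + encodeURIComponent(JSON.stringify(payload));
}
```

The returned string is meant to be sent as the POST body together with the content-type header shown in the request above.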

The response uses the same numbered style. Each suggestion carries the advertiser name (1), the AR ID (2), the country (3), and an ad count. Searching "Nike" returns a long tail of small verified advertisers named Nike-something, with the real Nike, Inc. sitting among them, so match on the name you actually want.
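
One way to do that matching is an exact, case-insensitive comparison on field 1. The sketch below assumes the suggestion records have already been pulled out of the response into an array, since the envelope around them is not described here. The Suggestion type, the string type for the country field, and the matchAdvertisers name are all assumptions for illustration.

```typescript
// One suggestion record, keyed by field number as described in the article.
// Only fields 1 to 3 are modeled; the field number of the ad count is not given.
interface Suggestion {
  "1": string; // advertiser name
  "2": string; // advertiser ID (starts with "AR")
  "3": string; // country (assumed to be a string)
}

// Hypothetical helper: keep only suggestions whose name matches exactly
// (ignoring case and surrounding whitespace), dropping look-alike advertisers.
function matchAdvertisers(suggestions: Suggestion[], wantedName: string): Suggestion[] {
  const wanted = wantedName.trim().toLowerCase();
  return suggestions.filter((s) => s["1"].trim().toLowerCase() === wanted);
}
```

If more than one record survives the filter, the country in field 3 is the remaining field available to tell them apart.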

Pulling the creatives

POST https://adstransparency.google.com/anji/_/rpc/SearchService/SearchCreatives?authuser=

f.req={"2":40,"3":{"13":{"1":["AR04119126533128323073"]}},"7":{"1":1}}

Field 2 is the page size (40 is what the site uses), and 3.13.1 is the list of advertiser IDs.

The part that will bite you: the "7":{"1":1} member is mandatory. Leave it out and the endpoint does not error. It returns {}, which looks exactly like "this advertiser has no ads". I chased that empty object through several request variations before finding the missing field.
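
A small guard against that trap, sketched in TypeScript: build the payload in one place so field 7 can never be forgotten, and flag a bare {} response for a second look. Both function names are hypothetical, and the default page size of 40 is simply the value from the example request.

```typescript
// Hypothetical helper: builds the SearchCreatives payload from the article's example.
// Field 2 = page size, 3.13.1 = advertiser IDs, 7 = {"1":1} (mandatory).
function buildCreativesPayload(
  advertiserIds: string[],
  pageSize: number = 40,
): Record<string, unknown> {
  return {
    "2": pageSize,
    "3": { "13": { "1": advertiserIds } },
    "7": { "1": 1 }, // leave this out and the endpoint silently answers {}
  };
}

// True when a parsed response is the bare {} the article warns about.
function isEmptyObject(response: unknown): boolean {
  return (
    typeof response === "object" &&
    response !== null &&
    !Array.isArray(response) &&
    Object.keys(response).length === 0
  );
}
```

An empty object is ambiguous: it can mean the advertiser has no ads or that the request was malformed, so it is worth checking the payload before trusting it.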

Each creative row includes the creative ID, the advertiser display name, preview markup you can pull the image or video asset URL out of, and two timestamp pairs: first shown and last shown. The response also carries a cursor in field 2; pass it back as field 4 to page through the full set.
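
The cursor hand-off can be sketched as a pure function that turns the payload you just sent plus the parsed response into the payload for the next page. This assumes the cursor is a string and that a missing or empty cursor marks the last page; neither detail is confirmed above, and the nextPagePayload name is made up for this example.

```typescript
type Payload = Record<string, unknown>;

// Hypothetical helper: returns the payload for the next page, or null when done.
// Per the article, the cursor arrives in response field 2 and goes back as request field 4.
function nextPagePayload(sent: Payload, response: Payload): Payload | null {
  const cursor = response["2"];
  if (typeof cursor !== "string" || cursor.length === 0) {
    return null; // assumption: no cursor means there are no further pages
  }
  // Copy instead of mutating so the original payload can be reused for a fresh run.
  return { ...sent, "4": cursor };
}
```

A paging loop then sends a payload, collects the rows, calls nextPagePayload, and stops when it returns null.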

Filtering by country

Region filtering is field 8 inside member 3, and the values are a small puzzle: the enum is 2000 plus the ISO 3166-1 numeric country code. The United States is 840, so:

"3":{"13":{"1":["AR..."]},"8":[2840]}

Omit field 8 entirely and you get ads from anywhere.
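
The arithmetic is simple enough to wrap in a helper. In this sketch regionCode and buildFilter are hypothetical names, and the range check only enforces that ISO 3166-1 numeric codes are three-digit integers. Field 8 is an array in the example, which suggests several regions can be listed at once, but that is an assumption rather than something tested here.

```typescript
// Region enum per the article: 2000 + ISO 3166-1 numeric country code.
function regionCode(isoNumeric: number): number {
  if (!Number.isInteger(isoNumeric) || isoNumeric < 1 || isoNumeric > 999) {
    throw new RangeError("ISO 3166-1 numeric codes are integers from 1 to 999");
  }
  return 2000 + isoNumeric;
}

// Hypothetical helper: builds member 3 of the SearchCreatives payload.
// With no country codes, field 8 is left out entirely (ads from anywhere).
function buildFilter(
  advertiserIds: string[],
  isoNumericCodes: number[] = [],
): Record<string, unknown> {
  const filter: Record<string, unknown> = { "13": { "1": advertiserIds } };
  if (isoNumericCodes.length > 0) {
    filter["8"] = isoNumericCodes.map((code) => regionCode(code));
  }
  return filter;
}
```

For the United States, regionCode(840) gives 2840, matching the example above.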

Rate limits

Plain fetch from Node works fine; no TLS fingerprint tricks are needed. But the surface has a per-IP burst quota: hammer it with several full runs inside a minute or two and you start seeing HTTP 429 responses with an HTML error page, which clear on their own shortly after. A retry with growing backoff (2s, 6s, 15s) absorbs it. Space your paging requests a few hundred milliseconds apart and single runs never hit it.
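
A minimal sketch of that retry loop follows. The request and the timer are passed in as functions so the logic can be exercised without a network or real delays. The HttpResult shape and the sendWithBackoff name are assumptions for illustration; only the 429 status and the 2s, 6s, 15s delays come from the description above.

```typescript
// Minimal shape of an HTTP result; hypothetical, so the sketch does not depend on fetch types.
interface HttpResult {
  status: number;
  text: string;
}

// Retries on HTTP 429 using the article's delays (2s, 6s, 15s).
// `send` performs one request; `sleep` waits the given number of milliseconds.
async function sendWithBackoff(
  send: () => Promise<HttpResult>,
  sleep: (ms: number) => Promise<void>,
  delaysMs: number[] = [2000, 6000, 15000],
): Promise<HttpResult> {
  let result = await send();
  for (const delay of delaysMs) {
    if (result.status !== 429) break; // anything other than a throttle ends the loop
    await sleep(delay);
    result = await send();
  }
  return result; // may still be a 429 if every retry was throttled
}
```

In Node, sleep could be `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`, and send would wrap the fetch call and read the response text. Because the 429 body is HTML rather than JSON, check the status before parsing.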

What you can build with it

The obvious one is competitor monitoring: snapshot an advertiser's creative set on a schedule, diff the creative IDs, and you know the day a new campaign launches. First and last shown dates also tell you which creatives have survived the longest, which is a decent proxy for what is working.
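
The diff step might look like the sketch below, which compares two snapshots of creative IDs and reports what appeared and what vanished. Where the snapshots are stored and the diffCreativeIds name are left as assumptions.

```typescript
// Hypothetical helper: compares two snapshots of creative IDs for one advertiser.
function diffCreativeIds(
  previous: string[],
  current: string[],
): { added: string[]; removed: string[] } {
  const before = new Set(previous);
  const after = new Set(current);
  return {
    added: current.filter((id) => !before.has(id)), // present now, absent last time
    removed: previous.filter((id) => !after.has(id)), // present last time, absent now
  };
}
```

A non-empty added list on a scheduled run is the signal that new creatives went live since the previous snapshot.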

I packaged this whole flow (name search, pagination, region filter, format detection, retries) into an actor you can run on Apify, part of a growing set of keyless scrapers I have been shipping. But the endpoints above are all you need to roll your own. The data is public; Google just forgot the download button.
