Skip the Setup — Use a Ready-Made Facebook Ads Scraper
Building a Facebook Ads Library scraper from scratch means handling React rendering, anti-bot detection, and constant UI changes. Our Facebook Ads Scraper on Apify is production-ready: political ads, commercial ads, spend data, creatives, and advertiser profiles — all extracted to structured JSON.
Free plan included. No credit card required.
What Is the Facebook Ads Library?
The Facebook Ads Library is Meta's publicly accessible archive of advertisements. It launched in 2018 as the Ad Archive for political ads and was expanded and renamed the Ad Library in 2019, with the goal of increasing transparency around advertising on Meta's platforms, particularly political and issue-based ads.
Key features of the library:
- Political ads are archived long-term: Unlike commercial ads, political and issue ads remain in the library for 7 years after last being shown
- Spend and impression ranges: Political ads include declared spending ranges and estimated impression counts
- Advertiser information: Including page name, disclaimers, and "Paid for by" disclosures
- Ad creative access: The actual text, images, and videos used in ads
- Targeting information: For political ads, some demographic targeting data is disclosed
- Global coverage: Ads from all countries where Meta operates
Why Scrape the Facebook Ads Library?
While Meta provides a web interface and a basic API, there are compelling reasons to build automated scraping solutions:
- The official API is limited: The Ad Library API has strict rate limits and limited search capabilities, and its behavior can change between versions with little notice
- Bulk data export isn't available: You can't simply download all ads for a given country or time period
- Research at scale: Academic researchers studying political advertising need thousands or millions of records
- Competitive intelligence: Marketers want to analyze competitor ad strategies, creatives, and messaging patterns
- Journalism: Investigative journalists tracking dark money, misleading claims, or coordinated campaigns need comprehensive data
- Campaign monitoring: Political watchdogs monitoring election integrity need real-time ad tracking
Understanding the Data Structure
Before scraping, let's understand what data is available for different ad types:
Political and Issue Ads
These are ads about social issues, elections, or politics. They contain the richest data:
// Typical political ad data structure
const politicalAdSchema = {
  adId: "string",                    // Unique ad identifier
  pageId: "string",                  // Advertiser's Facebook page ID
  pageName: "string",                // Page name
  disclaimer: "string",              // "Paid for by" disclosure
  adCreativeBody: "string",          // Ad text content
  adCreativeLinkCaption: "string",
  adCreativeLinkTitle: "string",
  adCreativeLinkDescription: "string",
  adDeliveryStartTime: "date",       // When the ad started running
  adDeliveryStopTime: "date",        // When it stopped (null if active)
  currency: "string",                // Currency code
  spendLower: "number",              // Minimum spend range
  spendUpper: "number",              // Maximum spend range
  impressionsLower: "number",        // Minimum impression range
  impressionsUpper: "number",        // Maximum impression range
  demographicDistribution: [         // Age/gender breakdown
    { percentage: "1.5%", age: "18-24", gender: "female" }
  ],
  deliveryByRegion: [                // Geographic distribution
    { percentage: "25%", region: "California" }
  ],
  publisherPlatforms: ["facebook", "instagram"],
  adSnapshot: {                      // Visual creative data
    images: ["url"],
    videos: ["url"],
    cards: []                        // Carousel card data
  }
};
Commercial Ads
Regular commercial ads contain less data (no spend/impressions):
const commercialAdSchema = {
  adId: "string",
  pageId: "string",
  pageName: "string",
  adCreativeBody: "string",
  adDeliveryStartTime: "date",
  adDeliveryStopTime: "date",
  publisherPlatforms: ["facebook", "instagram"],
  adSnapshot: {
    images: ["url"],
    videos: ["url"]
  }
  // No spend, impressions, or demographic data
};
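To make the difference between the two record shapes concrete, here is a minimal sketch that tells them apart by the presence of spend fields and derives a midpoint estimate. The helper names `hasSpendData` and `estimateSpend` are illustrative, not part of any library, and the field names assume the schemas above.

```javascript
// Hypothetical helpers; field names follow the schemas shown above.
function hasSpendData(ad) {
  return typeof ad.spendLower === 'number' &&
    typeof ad.spendUpper === 'number';
}

// Midpoint of the declared spend range, or null when the record
// carries no spend data (as with commercial ads).
function estimateSpend(ad) {
  if (!hasSpendData(ad)) return null;
  return (ad.spendLower + ad.spendUpper) / 2;
}

const political = { adId: '1', spendLower: 100, spendUpper: 199, currency: 'USD' };
const commercial = { adId: '2', pageName: 'Example Shop' };

console.log(estimateSpend(political));  // 149.5
console.log(estimateSpend(commercial)); // null
```

Because the library only publishes ranges, any single figure derived this way is an estimate and should be reported alongside the lower and upper bounds.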
Setting Up Your Scraping Environment
The Facebook Ads Library is a dynamic web application built with React. Scraping it effectively requires a browser automation approach. Let's set up a robust scraping pipeline:
// facebook-ads-scraper.js
const { chromium } = require('playwright');
class FacebookAdsLibraryScraper {
constructor(options = {}) {
this.browser = null;
this.context = null;
this.page = null;
this.baseUrl = 'https://www.facebook.com/ads/library/';
this.delay = options.delay || 2000;
this.results = [];
}
async init() {
this.browser = await chromium.launch({
headless: true,
args: [
'--no-sandbox',
'--disable-setuid-sandbox',
'--disable-dev-shm-usage'
]
});
this.context = await this.browser.newContext({
viewport: { width: 1440, height: 900 },
userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) ' +
'AppleWebKit/537.36 (KHTML, like Gecko) ' +
'Chrome/120.0.0.0 Safari/537.36',
locale: 'en-US'
});
this.page = await this.context.newPage();
}
async sleep(ms) {
return new Promise(resolve => setTimeout(resolve, ms));
}
async close() {
if (this.browser) await this.browser.close();
}
}
Searching for Political Ads
The first step is navigating the search interface to find relevant ads:
async function searchPoliticalAds(scraper, params) {
const {
query = '',
country = 'US',
adType = 'political_and_issue_ads',
dateFrom = null,
dateTo = null
} = params;
// Build the search URL with query parameters
const searchUrl = new URL(scraper.baseUrl);
searchUrl.searchParams.set('active_status', 'all');
searchUrl.searchParams.set('ad_type', adType);
searchUrl.searchParams.set('country', country);
searchUrl.searchParams.set('media_type', 'all');
if (query) {
searchUrl.searchParams.set('q', query);
}
await scraper.page.goto(searchUrl.toString(), {
waitUntil: 'networkidle',
timeout: 30000
});
// Wait for results to load
await scraper.page.waitForSelector(
'[class*="adLibrary"]',
{ timeout: 15000 }
).catch(() => {
console.log('Initial selector not found, checking for results...');
});
// Apply date filters if specified
if (dateFrom || dateTo) {
await applyDateFilter(scraper, dateFrom, dateTo);
}
return scraper;
}
async function applyDateFilter(scraper, dateFrom, dateTo) {
// Click the date filter button
const filterButtons = await scraper.page.$$('button');
for (const btn of filterButtons) {
// textContent() can resolve to null, so fall back to an empty string
const text = (await btn.textContent()) || '';
if (text.includes('Date') || text.includes('Start date')) {
await btn.click();
await scraper.sleep(1000);
break;
}
}
// Fill in date inputs
if (dateFrom) {
const fromInput = await scraper.page.$(
'input[placeholder*="From"], input[aria-label*="start"]'
);
if (fromInput) {
await fromInput.fill(dateFrom);
}
}
if (dateTo) {
const toInput = await scraper.page.$(
'input[placeholder*="To"], input[aria-label*="end"]'
);
if (toInput) {
await toInput.fill(dateTo);
}
}
// Apply the filter
const applyBtn = await scraper.page.$('button:has-text("Apply")');
if (applyBtn) await applyBtn.click();
await scraper.sleep(2000);
}
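The URL-building step in searchPoliticalAds can also be pulled into a pure function, which makes it easy to check the query string without launching a browser. This is a sketch: `buildSearchUrl` is a hypothetical helper name, and the parameter names simply mirror the ones used above.

```javascript
// Hypothetical pure helper: builds the Ad Library search URL
// from the same parameters used by searchPoliticalAds.
function buildSearchUrl({
  query = '',
  country = 'US',
  adType = 'political_and_issue_ads'
} = {}) {
  const url = new URL('https://www.facebook.com/ads/library/');
  url.searchParams.set('active_status', 'all');
  url.searchParams.set('ad_type', adType);
  url.searchParams.set('country', country);
  url.searchParams.set('media_type', 'all');
  // searchParams handles escaping, so spaces and symbols are safe
  if (query) url.searchParams.set('q', query);
  return url.toString();
}

console.log(buildSearchUrl({ query: 'climate change' }));
// https://www.facebook.com/ads/library/?active_status=all&ad_type=political_and_issue_ads&country=US&media_type=all&q=climate+change
```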
Extracting Ad Data
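Once we have search results, we need to extract structured data from each ad card. The selectors below are illustrative: Meta ships obfuscated class names that change often, so confirm them against the live page before relying on them: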
async function extractAdCards(scraper, maxAds = 100) {
const ads = [];
let loadMoreAttempts = 0;
const maxLoadMoreAttempts = 50;
while (ads.length < maxAds && loadMoreAttempts < maxLoadMoreAttempts) {
// Extract visible ad cards
const newAds = await scraper.page.evaluate(() => {
const adCards = document.querySelectorAll(
'[class*="adCard"], [data-testid="ad_card"],' +
' [class*="SearchResultCard"]'
);
const extracted = [];
adCards.forEach(card => {
const bodyEl = card.querySelector(
'[class*="adText"], [class*="body"]'
);
const pageNameEl = card.querySelector(
'[class*="pageName"], a[href*="/ads/library/?view_all_page_id"]'
);
const disclaimerEl = card.querySelector(
'[class*="disclaimer"]'
);
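// :has-text() is a Playwright-only selector and throws inside
// page.evaluate(), so match the fallback span by its text instead
const dateEl = card.querySelector('[class*="startDate"]') ||
Array.from(card.querySelectorAll('span')).find(
el => el.textContent.includes('Started running')
);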
const platformEls = card.querySelectorAll(
'[class*="platform"] img, [aria-label*="Facebook"],' +
' [aria-label*="Instagram"]'
);
const statusEl = card.querySelector(
'[class*="status"]'
);
// Extract spend data (political ads only)
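// querySelector() does not support :has-text(), so fall back to
// a plain text search over the card's spans
const spendEl = card.querySelector('[class*="spend"]') ||
Array.from(card.querySelectorAll('span')).find(
el => el.textContent.includes('Spent')
);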
const impressionEl = card.querySelector(
'[class*="impression"]'
);
// Extract image URLs
const images = [];
card.querySelectorAll('img[src*="scontent"]').forEach(img => {
images.push(img.src);
});
extracted.push({
pageName: pageNameEl
? pageNameEl.textContent.trim() : '',
disclaimer: disclaimerEl
? disclaimerEl.textContent.trim() : '',
adBody: bodyEl
? bodyEl.textContent.trim() : '',
startDate: dateEl
? dateEl.textContent.trim() : '',
status: statusEl
? statusEl.textContent.trim() : 'Unknown',
spend: spendEl
? spendEl.textContent.trim() : '',
impressions: impressionEl
? impressionEl.textContent.trim() : '',
platforms: Array.from(platformEls).map(
el => el.getAttribute('aria-label') || ''
),
imageUrls: images,
scrapedAt: new Date().toISOString()
});
});
return extracted;
});
// Deduplicate and add new ads
for (const ad of newAds) {
const isDuplicate = ads.some(
a => a.adBody === ad.adBody && a.pageName === ad.pageName
);
if (!isDuplicate) ads.push(ad);
}
// Scroll to trigger loading more results
await scraper.page.evaluate(
() => window.scrollTo(0, document.documentElement.scrollHeight)
);
await scraper.sleep(scraper.delay);
// Check for "See more" or "Load more" buttons
const loadMoreBtn = await scraper.page.$(
'button:has-text("See more results"),' +
' button:has-text("Load more")'
);
if (loadMoreBtn) {
await loadMoreBtn.click();
await scraper.sleep(2000);
}
loadMoreAttempts++;
console.log(
`Collected ${ads.length}/${maxAds} ads ` +
`(attempt ${loadMoreAttempts})`
);
}
return ads.slice(0, maxAds);
}
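The deduplication step above compares each new ad against every ad already collected, which gets slow as the list grows. One possible alternative, sketched here, tracks composite keys in a Set. The names `adKey` and `mergeNewAds` are hypothetical, and the key still assumes that page name plus ad text identifies an ad, just like the original check.

```javascript
// Composite key; the NUL separator avoids accidental collisions
// between a page name and the start of an ad body.
function adKey(ad) {
  return `${ad.pageName}\u0000${ad.adBody}`;
}

// Appends unseen ads from `batch` to `collected` and returns
// how many were actually new.
function mergeNewAds(seenKeys, collected, batch) {
  let added = 0;
  for (const ad of batch) {
    const key = adKey(ad);
    if (seenKeys.has(key)) continue;
    seenKeys.add(key);
    collected.push(ad);
    added++;
  }
  return added;
}

const seen = new Set();
const ads = [];
const firstPass = [
  { pageName: 'Page A', adBody: 'Vote early' },
  { pageName: 'Page B', adBody: 'Vote early' }
];
console.log(mergeNewAds(seen, ads, firstPass)); // 2
console.log(mergeNewAds(seen, ads, firstPass)); // 0
```

The return value doubles as a stop signal: if several consecutive scroll passes add zero ads, the results are probably exhausted and the loop can exit early instead of running to the attempt limit.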
Extracting Advertiser Profiles
To build a complete picture of political advertising, we need to analyze advertiser pages:
async function scrapeAdvertiserProfile(scraper, pageId) {
const profileUrl =
`${scraper.baseUrl}?active_status=all&ad_type=political_and_issue_ads` +
`&country=US&view_all_page_id=${pageId}&media_type=all`;
await scraper.page.goto(profileUrl, {
waitUntil: 'networkidle',
timeout: 30000
});
await scraper.sleep(3000);
const profile = await scraper.page.evaluate(() => {
// Extract page-level summary data
const pageNameEl = document.querySelector('h2, [class*="pageTitle"]');
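// :has-text() only works in Playwright selectors, not in
// document.querySelector(), so search the span text directly
const totalAdsEl = Array.from(document.querySelectorAll('span'))
.find(el => el.textContent.includes('total ads'));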
// Extract spending summary if available
const spendingSection = document.querySelector(
'[class*="spendingSummary"]'
);
let totalSpend = '';
if (spendingSection) {
totalSpend = spendingSection.textContent.trim();
}
// Get all visible ad previews
const adPreviews = [];
document.querySelectorAll('[class*="adCard"]').forEach(card => {
const body = card.querySelector('[class*="adText"]');
const date = card.querySelector('[class*="startDate"]');
adPreviews.push({
text: body ? body.textContent.trim().slice(0, 200) : '',
date: date ? date.textContent.trim() : ''
});
});
return {
pageId: new URLSearchParams(window.location.search)
.get('view_all_page_id'),
pageName: pageNameEl
? pageNameEl.textContent.trim() : '',
totalAds: totalAdsEl
? totalAdsEl.textContent.trim() : '',
totalSpend: totalSpend,
recentAds: adPreviews.slice(0, 10)
};
});
return profile;
}
Extracting Spend and Targeting Data
Political ad spending data is crucial for election transparency research:
async function extractSpendData(scraper, adDetailUrl) {
await scraper.page.goto(adDetailUrl, {
waitUntil: 'networkidle',
timeout: 30000
});
await scraper.sleep(3000);
const spendData = await scraper.page.evaluate(() => {
const data = {
spend: { lower: null, upper: null, currency: 'USD' },
impressions: { lower: null, upper: null },
demographics: [],
regions: []
};
// Parse spend range
const spendText = document.querySelector(
'[class*="spend"]'
)?.textContent || '';
const spendMatch = spendText.match(
/\$?([\d,]+)\s*-\s*\$?([\d,]+)/
);
if (spendMatch) {
data.spend.lower = parseInt(
spendMatch[1].replace(/,/g, '')
);
data.spend.upper = parseInt(
spendMatch[2].replace(/,/g, '')
);
}
// Parse impression range
const impText = document.querySelector(
'[class*="impression"]'
)?.textContent || '';
const impMatch = impText.match(
/([\d,]+)\s*-\s*([\d,]+)/
);
if (impMatch) {
data.impressions.lower = parseInt(
impMatch[1].replace(/,/g, '')
);
data.impressions.upper = parseInt(
impMatch[2].replace(/,/g, '')
);
}
// Parse demographic distribution
const demoRows = document.querySelectorAll(
'[class*="demographic"] tr,' +
' [class*="ageGender"] [class*="row"]'
);
demoRows.forEach(row => {
const cells = row.querySelectorAll('td, span');
if (cells.length >= 3) {
data.demographics.push({
age: cells[0]?.textContent?.trim(),
gender: cells[1]?.textContent?.trim(),
percentage: cells[2]?.textContent?.trim()
});
}
});
// Parse regional distribution
const regionRows = document.querySelectorAll(
'[class*="region"] tr,' +
' [class*="deliveryRegion"] [class*="row"]'
);
regionRows.forEach(row => {
const cells = row.querySelectorAll('td, span');
if (cells.length >= 2) {
data.regions.push({
region: cells[0]?.textContent?.trim(),
percentage: cells[1]?.textContent?.trim()
});
}
});
return data;
});
return spendData;
}
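The regular expressions above only match plain "lower - upper" numbers. If the page shows abbreviated or open-ended buckets, a slightly more tolerant parser helps. The sketch below assumes display formats such as "$1K - $1.5K", "<$100", and ">$1M"; these formats and the helper names `parseAbbreviated` and `parseRange` are assumptions, so check them against the text you actually scrape.

```javascript
// Converts "1.5K", "2M", or "1,200" into a number; null if unparseable.
function parseAbbreviated(token) {
  const match = /^([\d.,]+)\s*([KMB])?$/i.exec(token.trim());
  if (!match) return null;
  const base = parseFloat(match[1].replace(/,/g, ''));
  if (Number.isNaN(base)) return null;
  const scale = { K: 1e3, M: 1e6, B: 1e9 }[(match[2] || '').toUpperCase()] || 1;
  return Math.round(base * scale);
}

// Parses a bare range string into { lower, upper }.
function parseRange(text) {
  const cleaned = text.replace(/[$€£]/g, '').trim();
  // Open-ended lowest bucket, e.g. "<100"
  if (cleaned.startsWith('<')) {
    return { lower: 0, upper: parseAbbreviated(cleaned.slice(1)) };
  }
  // Open-ended highest bucket, e.g. ">1M"
  if (cleaned.startsWith('>')) {
    return { lower: parseAbbreviated(cleaned.slice(1)), upper: null };
  }
  const parts = cleaned.split(/\s*[-–]\s*/);
  if (parts.length !== 2) return { lower: null, upper: null };
  return {
    lower: parseAbbreviated(parts[0]),
    upper: parseAbbreviated(parts[1])
  };
}

console.log(parseRange('$1K - $1.5K')); // { lower: 1000, upper: 1500 }
console.log(parseRange('<$100'));       // { lower: 0, upper: 100 }
```

Returning null for anything unparseable, rather than guessing, keeps bad values out of later spend totals.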
Scaling with Apify
For production-scale Facebook Ads Library scraping, Apify provides the infrastructure you need. Here's how to set up and run a comprehensive scraping operation:
const { ApifyClient } = require('apify-client');
const client = new ApifyClient({
token: 'YOUR_APIFY_TOKEN'
});
async function runFacebookAdsActor() {
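// Placeholder actor ID: use the "username/actor-name" ID from the actor's
// Apify Store page; the input fields below depend on that actor's schema
const run = await client.actor('facebook-ads-library-scraper').call({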
searchTerms: [
'climate change',
'immigration',
'election 2024',
'healthcare reform'
],
country: 'US',
adType: 'political_and_issue_ads',
maxAdsPerSearch: 500,
dateRange: {
from: '2024-01-01',
to: '2024-12-31'
},
extractCreatives: true,
extractDemographics: true,
proxy: {
useApifyProxy: true,
apifyProxyGroups: ['RESIDENTIAL']
}
});
const { items } = await client
.dataset(run.defaultDatasetId)
.listItems();
console.log(`Collected ${items.length} political ads`);
return items;
}
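For large runs it is usually safer to read the dataset in pages than in one call. The sketch below shows offset-based pagination against any function that accepts `{ offset, limit }` and resolves to `{ items }`, which is the shape `listItems()` works with. The helper name `collectAllItems` and the in-memory `fakeFetch` are illustrative only.

```javascript
// fetchPage({ offset, limit }) must resolve to { items: [...] }.
async function collectAllItems(fetchPage, pageSize = 1000) {
  const all = [];
  let offset = 0;
  while (true) {
    const { items } = await fetchPage({ offset, limit: pageSize });
    all.push(...items);
    // A short page means the dataset is exhausted
    if (items.length < pageSize) break;
    offset += items.length;
  }
  return all;
}

// In-memory stand-in for a dataset, used only for this demo
const fakeRows = Array.from({ length: 25 }, (_, i) => ({ adId: String(i) }));
const fakeFetch = async ({ offset, limit }) => ({
  items: fakeRows.slice(offset, offset + limit)
});

collectAllItems(fakeFetch, 10).then(items => {
  console.log(`Fetched ${items.length} items`); // Fetched 25 items
});
```

With the real client, the call would look like `collectAllItems(opts => client.dataset(run.defaultDatasetId).listItems(opts))`.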
Building a Custom Apify Actor
For fine-tuned control over the scraping process:
// src/main.js - Custom Facebook Ads Library Actor
// Uses the Apify SDK v3 API (Actor.*), the version that pairs with crawlee
const { Actor } = require('apify');
const { PlaywrightCrawler } = require('crawlee');
Actor.main(async () => {
    // getInput() resolves to null when the run has no input
    const input = (await Actor.getInput()) || {};
    const {
        searchTerms = [],
        country = 'US',
        adType = 'political_and_issue_ads',
        maxAdsPerSearch = 100,
        extractCreatives = false
    } = input;
    const proxyConfiguration = await Actor.createProxyConfiguration({
        useApifyProxy: true,
        apifyProxyGroups: ['RESIDENTIAL']
    });
    // Build initial request list from search terms
    const requests = searchTerms.map(term => ({
        url: `https://www.facebook.com/ads/library/?active_status=all` +
            `&ad_type=${adType}&country=${country}` +
            `&q=${encodeURIComponent(term)}&media_type=all`,
        userData: { searchTerm: term, adsCollected: 0 }
    }));
    const crawler = new PlaywrightCrawler({
        proxyConfiguration,
        maxConcurrency: 3,
        navigationTimeoutSecs: 60,
        requestHandlerTimeoutSecs: 300,
        async requestHandler({ page, request, log }) {
            const { searchTerm, adsCollected } = request.userData;
            log.info(
                `Scraping "${searchTerm}" - ${adsCollected} ads so far`
            );
            // Wait for ad cards to load
            await page.waitForSelector(
                '[class*="adCard"], [class*="SearchResult"]',
                { timeout: 20000 }
            ).catch(() => log.warning('No ad cards found'));
            // Scroll and collect
            const seen = new Set();
            let collected = adsCollected;
            let scrollAttempts = 0;
            while (
                collected < maxAdsPerSearch &&
                scrollAttempts < 100
            ) {
                const ads = await page.evaluate(() => {
                    // Extraction logic
                    const cards = document.querySelectorAll(
                        '[class*="adCard"]'
                    );
                    return Array.from(cards).map(card => ({
                        body: card.querySelector(
                            '[class*="adText"]'
                        )?.textContent?.trim() || '',
                        pageName: card.querySelector(
                            '[class*="pageName"]'
                        )?.textContent?.trim() || ''
                    }));
                });
                for (const ad of ads) {
                    // Every pass re-reads all cards on the page, so skip
                    // ads that were already stored and respect the cap
                    const key = `${ad.pageName}|${ad.body}`;
                    if (seen.has(key) || collected >= maxAdsPerSearch) {
                        continue;
                    }
                    seen.add(key);
                    await Actor.pushData({ ...ad, searchTerm, country });
                    collected++;
                }
                await page.evaluate(() =>
                    window.scrollTo(
                        0, document.documentElement.scrollHeight
                    )
                );
                await page.waitForTimeout(2000);
                scrollAttempts++;
            }
            log.info(
                `Finished "${searchTerm}": ${collected} ads collected`
            );
        }
    });
    await crawler.run(requests);
});
Analyzing the Collected Data
Once you have a substantial dataset, analysis reveals powerful insights:
function analyzeSpendingPatterns(ads) {
// Group by advertiser
const byAdvertiser = {};
ads.forEach(ad => {
const name = ad.pageName || 'Unknown';
if (!byAdvertiser[name]) {
byAdvertiser[name] = {
totalAds: 0,
totalSpendLower: 0,
totalSpendUpper: 0,
platforms: new Set(),
dateRange: { earliest: null, latest: null }
};
}
const entry = byAdvertiser[name];
entry.totalAds++;
if (ad.spendLower) entry.totalSpendLower += ad.spendLower;
if (ad.spendUpper) entry.totalSpendUpper += ad.spendUpper;
if (ad.platforms) {
ad.platforms.forEach(p => entry.platforms.add(p));
}
const startDate = new Date(ad.adDeliveryStartTime);
// Skip missing or unparseable dates so an Invalid Date never enters the range
if (!Number.isNaN(startDate.getTime())) {
if (!entry.dateRange.earliest ||
startDate < entry.dateRange.earliest) {
entry.dateRange.earliest = startDate;
}
if (!entry.dateRange.latest ||
startDate > entry.dateRange.latest) {
entry.dateRange.latest = startDate;
}
}
});
// Sort by estimated total spend
const sorted = Object.entries(byAdvertiser)
.map(([name, data]) => ({
advertiser: name,
...data,
platforms: Array.from(data.platforms),
estimatedAvgSpend: (
data.totalSpendLower + data.totalSpendUpper
) / 2
}))
.sort((a, b) => b.estimatedAvgSpend - a.estimatedAvgSpend);
return sorted;
}
function analyzeAdCreatives(ads) {
// Common messaging themes
const wordFrequency = {};
ads.forEach(ad => {
if (!ad.adBody) return;
const words = ad.adBody.toLowerCase()
.replace(/[^\w\s]/g, '')
.split(/\s+/)
.filter(w => w.length > 4);
words.forEach(word => {
wordFrequency[word] = (wordFrequency[word] || 0) + 1;
});
});
const topWords = Object.entries(wordFrequency)
.sort((a, b) => b[1] - a[1])
.slice(0, 50);
return { topWords };
}
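Spending over time is another common view of this data. As a sketch, the function below buckets midpoint spend estimates by the month each ad started running. The name `spendByMonth` is hypothetical, and it assumes records carry `adDeliveryStartTime`, `spendLower`, and `spendUpper` as in the political ad schema.

```javascript
// Sums midpoint spend estimates per "YYYY-MM" (UTC) start month.
function spendByMonth(ads) {
  const totals = new Map();
  for (const ad of ads) {
    const start = new Date(ad.adDeliveryStartTime);
    if (Number.isNaN(start.getTime())) continue; // no usable date
    if (typeof ad.spendLower !== 'number' ||
        typeof ad.spendUpper !== 'number') continue; // no spend data
    const month = start.toISOString().slice(0, 7);
    const midpoint = (ad.spendLower + ad.spendUpper) / 2;
    totals.set(month, (totals.get(month) || 0) + midpoint);
  }
  // Return months in chronological order
  return Object.fromEntries(
    [...totals].sort(([a], [b]) => a.localeCompare(b))
  );
}

const sample = [
  { adDeliveryStartTime: '2024-02-10T00:00:00Z', spendLower: 100, spendUpper: 200 },
  { adDeliveryStartTime: '2024-01-15T00:00:00Z', spendLower: 0, spendUpper: 100 },
  { adDeliveryStartTime: '2024-01-20T00:00:00Z', spendLower: 200, spendUpper: 400 },
  { adDeliveryStartTime: 'not a date', spendLower: 1, spendUpper: 2 }
];
console.log(spendByMonth(sample)); // { '2024-01': 350, '2024-02': 150 }
```

Attributing an ad's whole spend to its start month is a simplification; ads that run for several months would need their spend spread across the delivery window.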
Legal and Ethical Considerations
Facebook Ads Library scraping occupies a unique legal position:
- The data is intentionally public: Meta created the Ad Library specifically for transparency. Many jurisdictions require political ads to be publicly disclosed.
- Respect rate limits: Even though the data is public, aggressive scraping can affect service availability. Use reasonable delays (2-5 seconds between requests).
- Academic and journalistic use: Some courts have been receptive to scraping publicly available data for research and journalism, especially data published for transparency, but rulings vary by jurisdiction and Meta's terms restrict automated collection.
- Commercial use considerations: Using scraped ad data for commercial purposes may face additional scrutiny. Consult legal counsel if you plan to commercialize insights derived from the data.
- Don't scrape user data: The Ads Library shows advertiser information, not user data. Never attempt to correlate ad targeting data with individual users.
- Credit your sources: When publishing research based on Ad Library data, cite Meta's Ad Library as your source.
- GDPR compliance: If you're collecting data about EU-based advertisers, ensure your data handling practices comply with GDPR requirements.
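The rate-limit advice above can be sketched as two small helpers: a randomized delay in the suggested 2-5 second window, and a retry wrapper with exponential backoff. Both `jitteredDelay` and `withBackoff` are hypothetical names, and the default timings are assumptions you should tune.

```javascript
// Random delay in [minMs, maxMs) so requests are not evenly spaced.
function jitteredDelay(minMs = 2000, maxMs = 5000) {
  return minMs + Math.floor(Math.random() * (maxMs - minMs));
}

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Retries `task` with waits of baseMs, 2*baseMs, 4*baseMs, ...
// `wait` is injectable so the logic can be tested without real delays.
async function withBackoff(task, { retries = 3, baseMs = 2000, wait = sleep } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) await wait(baseMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```

A page visit could then be wrapped as `await withBackoff(() => page.goto(url))`, followed by `await sleep(jitteredDelay())` before the next request.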
Automating Ongoing Monitoring
For continuous political ad monitoring, set up scheduled scraping runs:
const { ApifyClient } = require('apify-client');
async function setupScheduledMonitoring() {
const client = new ApifyClient({ token: 'YOUR_TOKEN' });
// Create a scheduled task that runs daily
const schedule = await client.schedules().create({
name: 'Daily Political Ad Monitor',
cronExpression: '0 6 * * *', // Every day at 6 AM UTC
actions: [{
type: 'RUN_ACTOR',
actorId: 'your-fb-ads-actor-id',
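// The schedules API expects the actor input as a serialized body
// plus a content type rather than a plain object
runInput: {
contentType: 'application/json; charset=utf-8',
body: JSON.stringify({
searchTerms: [
'election',
'vote',
'candidate names...'
],
country: 'US',
adType: 'political_and_issue_ads',
maxAdsPerSearch: 200,
dateRange: {
from: 'yesterday',
to: 'today'
}
})
}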
}]
});
console.log(`Schedule created: ${schedule.id}`);
}
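The 'yesterday' and 'today' values in the schedule input read as placeholders, and whether an actor understands them depends on that actor. One option is to have the actor compute concrete dates at run time. The sketch below assumes the actor wants ISO dates (YYYY-MM-DD) in UTC; `dailyDateRange` is a hypothetical helper name.

```javascript
// Returns yesterday and today as ISO dates, based on UTC.
function dailyDateRange(now = new Date()) {
  const dayMs = 24 * 60 * 60 * 1000;
  const toIsoDate = d => d.toISOString().slice(0, 10);
  return {
    from: toIsoDate(new Date(now.getTime() - dayMs)),
    to: toIsoDate(now)
  };
}

console.log(dailyDateRange(new Date('2024-03-01T06:00:00Z')));
// { from: '2024-02-29', to: '2024-03-01' }
```

Working in UTC keeps the window aligned with the schedule's 6 AM UTC cron expression regardless of where the run executes.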
Extract Facebook Ads Data Without Building Infrastructure
Skip the Playwright setup, proxy management, and maintenance. The Facebook Ads Scraper by cryptosignals handles political ads, commercial ads, spend data, and creatives with residential proxies built in.
Try it free on Apify: no credit card, free plan included.
Conclusion
The Facebook Ads Library is a goldmine of transparency data that enables powerful analysis of political advertising, competitive intelligence, and market research. While Meta provides a web interface and basic API, automated scraping with tools like Playwright and Apify unlocks the ability to collect and analyze data at scale.
By combining the techniques in this guide — from basic ad card extraction to sophisticated spend analysis and ongoing monitoring — you can build comprehensive datasets that reveal how organizations use political advertising to influence public opinion.
Whether you're a journalist investigating campaign spending, a researcher studying political communication, or a marketer analyzing competitive strategies, the Facebook Ads Library contains insights waiting to be extracted. Start with small, focused scraping runs, validate your data quality, and scale up with Apify when you're ready to go big.
Remember: this data was made public for a reason. Use it responsibly, respect rate limits, and contribute to the transparency that makes democratic discourse healthier.