Ad libraries tell you what the ad says.
The landing page tells you what the company is actually trying to make you do.
That difference matters.
I have seen teams obsess over competitor headlines while ignoring the part that changed first: the page behind the ad.
That is where you notice things like:
- a new self-serve CTA
- a new pricing angle
- a comparison page replacing a generic feature page
- a trial push replacing a demo flow
- a new proof block added to support a fresh campaign
If you are tracking competitors seriously, ad monitoring without landing-page tracking is incomplete.
So this is the workflow I would build: use public ad data to discover the active landing pages, save structured snapshots of those pages, and diff the important parts over time.
Why Ad Pages Are Better Than General Site Crawling
You can crawl a competitor site broadly.
Sometimes that is useful.
But if your real goal is paid intel and offer monitoring, ad-linked landing pages are much higher signal.
They tell you what the company is actively paying to promote.
That means the workflow is simpler and the insight is usually sharper.
The Three Things I Track First
I start with only three fields, keyed by the page's canonical URL.
- page title
- primary H1 or hero headline
- primary CTA text
That is enough to detect most meaningful competitor landing-page changes.
You do not need a full DOM diff to start getting value.
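To make that concrete, here is a minimal sketch of the snapshot record I mean, keyed by canonical URL. The example values are made up; the field names match the scripts below, and the storage format itself is up to you.

# One small snapshot per canonical URL, captured on each run.
snapshot = {
    'https://example.com/pricing': {
        'title': 'Pricing | Example',
        'h1': 'Simple plans for growing teams',
        'cta': 'Start free',
    },
}

A diff between two runs is then just a field-by-field comparison of those records, which is exactly what the scripts below do.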
JavaScript Version: Pull Ad URLs, Then Diff Key Page Elements
This version uses Facebook and LinkedIn ad data to collect landing page URLs, then fetches and compares a few key HTML elements.
const headers = { 'X-API-Key': process.env.SOCIAVAULT_API_KEY };

async function fetchJson(url) {
  const response = await fetch(url, { headers });
  if (!response.ok) {
    throw new Error(`Request failed with ${response.status}`);
  }
  return response.json();
}

async function fetchHtml(url) {
  const response = await fetch(url, {
    headers: {
      'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
    },
  });
  if (!response.ok) {
    throw new Error(`HTML fetch failed with ${response.status}`);
  }
  return response.text();
}

// Pull the first capture group out of the HTML, collapsing whitespace.
function extractBetween(html, regex) {
  const match = html.match(regex);
  return match ? match[1].replace(/\s+/g, ' ').trim() : null;
}

// Reduce a landing page to the few fields worth diffing.
function parsePage(html, url) {
  return {
    url,
    title: extractBetween(html, /<title>(.*?)<\/title>/is),
    h1: extractBetween(html, /<h1[^>]*>(.*?)<\/h1>/is),
    cta: extractBetween(html, /<a[^>]*>(Start free|Book a demo|Talk to sales|Try free|Get started)<\/a>/is),
  };
}

// The two ad sources use different field names for the destination URL.
function normalizeAds(items = []) {
  return (items || [])
    .map(item => item.url || item.landingPageUrl || item.snapshot?.link_url)
    .filter(Boolean);
}

function diffPage(previous, current) {
  const changes = [];
  for (const field of ['title', 'h1', 'cta']) {
    if ((previous?.[field] || null) !== (current?.[field] || null)) {
      changes.push({
        field,
        previous: previous?.[field] || null,
        current: current?.[field] || null,
      });
    }
  }
  return changes;
}

async function collectLandingPages(company) {
  const [facebookJson, linkedinJson] = await Promise.all([
    fetchJson(
      `https://api.sociavault.com/v1/scrape/facebook-ad-library/company-ads?companyName=${encodeURIComponent(company)}&status=ACTIVE&trim=true`
    ),
    fetchJson(
      `https://api.sociavault.com/v1/scrape/linkedin-ad-library/search?company=${encodeURIComponent(company)}`
    ),
  ]);

  const urls = new Set([
    ...normalizeAds(facebookJson.data),
    ...normalizeAds(linkedinJson.data),
  ]);

  const pages = [];
  for (const url of urls) {
    try {
      const html = await fetchHtml(url);
      pages.push(parsePage(html, url));
    } catch (error) {
      console.error(`Failed to fetch ${url}:`, error.message);
    }
  }
  return pages;
}

// Previous snapshots would normally come from storage; this is a hardcoded example.
const previousPages = {
  'https://example.com/demo': {
    title: 'Book a Demo | Example',
    h1: 'See Example in action',
    cta: 'Book a demo',
  },
};

// Top-level await: run this as an ES module.
const currentPages = await collectLandingPages('HubSpot');

for (const page of currentPages) {
  const changes = diffPage(previousPages[page.url], page);
  if (changes.length > 0) {
    console.log(`\nChanges detected for ${page.url}`);
    console.log(changes);
  }
}
This is enough to build a daily tracker that tells you when an ad-linked page changes its core message.
And if you want the public ad layer feeding that workflow without managing it manually, SociaVault makes that part much easier.
Python Version: Good for Scheduled Tracking Jobs
Python works very well for this kind of scheduled page-diff script.
import os
import re
from urllib.parse import quote

import requests

HEADERS = {'X-API-Key': os.environ['SOCIAVAULT_API_KEY']}


def fetch_json(url):
    response = requests.get(url, headers=HEADERS, timeout=30)
    response.raise_for_status()
    return response.json()


def fetch_html(url):
    response = requests.get(
        url,
        headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'},
        timeout=30,
    )
    response.raise_for_status()
    return response.text


def extract_between(html, pattern):
    # Pull the first capture group, collapsing whitespace.
    match = re.search(pattern, html, re.IGNORECASE | re.DOTALL)
    return re.sub(r'\s+', ' ', match.group(1)).strip() if match else None


def parse_page(html, url):
    # Reduce a landing page to the few fields worth diffing.
    return {
        'url': url,
        'title': extract_between(html, r'<title>(.*?)</title>'),
        'h1': extract_between(html, r'<h1[^>]*>(.*?)</h1>'),
        'cta': extract_between(html, r'<a[^>]*>(Start free|Book a demo|Talk to sales|Try free|Get started)</a>'),
    }


def normalize_ads(items=None):
    # The two ad sources use different field names for the destination URL.
    items = items or []
    urls = []
    for item in items:
        url = item.get('url') or item.get('landingPageUrl') or item.get('snapshot', {}).get('link_url')
        if url:
            urls.append(url)
    return urls


def diff_page(previous, current):
    changes = []
    for field in ['title', 'h1', 'cta']:
        old_value = (previous or {}).get(field)
        new_value = current.get(field)
        if old_value != new_value:
            changes.append({'field': field, 'previous': old_value, 'current': new_value})
    return changes


def collect_landing_pages(company):
    # URL-encode the company name, mirroring encodeURIComponent in the JS version.
    encoded = quote(company)
    facebook = fetch_json(
        f'https://api.sociavault.com/v1/scrape/facebook-ad-library/company-ads?companyName={encoded}&status=ACTIVE&trim=true'
    )
    linkedin = fetch_json(
        f'https://api.sociavault.com/v1/scrape/linkedin-ad-library/search?company={encoded}'
    )

    urls = set(normalize_ads(facebook.get('data')) + normalize_ads(linkedin.get('data')))

    pages = []
    for url in urls:
        try:
            html = fetch_html(url)
            pages.append(parse_page(html, url))
        except Exception as error:
            print(f'Failed to fetch {url}: {error}')
    return pages


# Previous snapshots would normally come from storage; this is a hardcoded example.
previous_pages = {
    'https://example.com/demo': {
        'title': 'Book a Demo | Example',
        'h1': 'See Example in action',
        'cta': 'Book a demo',
    }
}

current_pages = collect_landing_pages('HubSpot')

for page in current_pages:
    changes = diff_page(previous_pages.get(page['url']), page)
    if changes:
        print(f'\nChanges detected for {page["url"]}')
        print(changes)
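For a real scheduled job, the previous snapshots should come from storage rather than a hardcoded dict. Here is a minimal persistence sketch, assuming a local snapshots.json file next to the script; the file path and helper names are illustrative, not part of any API.

import json
from pathlib import Path

SNAPSHOT_FILE = Path('snapshots.json')  # illustrative location


def load_snapshots():
    # Previous run's snapshots, or an empty dict on the first run.
    if SNAPSHOT_FILE.exists():
        return json.loads(SNAPSHOT_FILE.read_text())
    return {}


def save_snapshots(pages):
    # Store the current run keyed by URL so the next run can diff against it.
    SNAPSHOT_FILE.write_text(json.dumps({page['url']: page for page in pages}, indent=2))


previous_pages = load_snapshots()
current_pages = collect_landing_pages('HubSpot')

for page in current_pages:
    changes = diff_page(previous_pages.get(page['url']), page)
    if changes:
        print(f'\nChanges detected for {page["url"]}')
        print(changes)

save_snapshots(current_pages)

Run that from cron or any scheduler and each day's pages are compared against the last saved run instead of a hardcoded example.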
The Best Use Cases for This
This kind of tracker is especially useful for:
- spotting pricing pivots earlier
- seeing when a competitor moves from demo-led to self-serve
- catching new proof points or trust elements in the hero section
- identifying when a category starts leaning harder into comparison pages
That is much more actionable than just saving screenshots of ads.
Honest Alternatives
There are a few other ways to do this.
General site crawlers
Useful if you care about broad site changes.
Less focused if your real interest is what paid traffic is being sent to.
Visual diff tools
Great when design changes matter.
Often noisier and heavier than necessary if you only care about message shifts.
Manual weekly review
Fine for small lists.
Easy to forget, and very hard to compare consistently over time.
That is why I usually start with simple structured text diffs and only add visual diffing later if needed.
Final Take
If you want to understand what a competitor is actually pushing, look past the ad and track the page.
That is where the real shift often shows up first.
Use public ad data to find the current destination URLs. Save a few meaningful fields. Diff them over time. That alone gives you a much more useful competitor signal than casual ad-library browsing.
And if you want the ad-side data layer without wiring it all manually, SociaVault is a good place to start.
Then keep the rest simple and boring: fetch, extract, diff, report.
That is enough to make a landing-page change tracker genuinely useful.