Ad libraries tell you what the ad says.
The landing page tells you what the company is actually trying to make you do.
That difference matters.
I have seen teams obsess over competitor headlines while ignoring the part that changed first: the page behind the ad.
That is where you notice things like:
- a new self-serve CTA
- a new pricing angle
- a comparison page replacing a generic feature page
- a trial push replacing a demo flow
- a new proof block added to support a fresh campaign
If you are tracking competitors seriously, ad monitoring without landing-page tracking is incomplete.
So this is the workflow I would build: use public ad data to discover the active landing pages, save structured snapshots of those pages, and diff the important parts over time.
Why Ad Pages Are Better Than General Site Crawling
You can crawl a competitor site broadly.
Sometimes that is useful.
But if your real goal is paid intel and offer monitoring, ad-linked landing pages are much higher signal.
They tell you what the company is actively paying to promote.
That means the workflow is simpler and the insight is usually sharper.
The Three Things I Track First
I start with only three fields, keyed by the page's canonical URL.
- page title
- primary H1 or hero headline
- primary CTA text
That is enough to detect most meaningful competitor landing-page changes.
You do not need a full DOM diff to start getting value.
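To make that concrete, here is a minimal sketch of the snapshot record I mean, keyed by canonical URL. The example values are made up; the field names match the scripts below, and the storage format itself is up to you.

# One small snapshot per canonical URL, captured on each run.
snapshot = {
    'https://example.com/pricing': {
        'title': 'Pricing | Example',
        'h1': 'Simple plans for growing teams',
        'cta': 'Start free',
    },
}

A diff between two runs is then just a field-by-field comparison of those records, which is exactly what the scripts below do.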
JavaScript Version: Pull Ad URLs, Then Diff Key Page Elements
This version uses Facebook and LinkedIn ad data to collect landing page URLs, then fetches and compares a few key HTML elements.
const headers = { 'X-API-Key': process.env.SOCIAVAULT_API_KEY };

async function fetchJson(url) {
  const response = await fetch(url, { headers });
  if (!response.ok) {
    throw new Error(`Request failed with ${response.status}`);
  }
  return response.json();
}

async function fetchHtml(url) {
  const response = await fetch(url, {
    headers: {
      'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
    },
  });
  if (!response.ok) {
    throw new Error(`HTML fetch failed with ${response.status}`);
  }
  return response.text();
}

// Pull the first capture group out of the HTML, collapsing whitespace.
function extractBetween(html, regex) {
  const match = html.match(regex);
  return match ? match[1].replace(/\s+/g, ' ').trim() : null;
}

// Reduce a landing page to the few fields worth diffing.
function parsePage(html, url) {
  return {
    url,
    title: extractBetween(html, /<title>(.*?)<\/title>/is),
    h1: extractBetween(html, /<h1[^>]*>(.*?)<\/h1>/is),
    cta: extractBetween(html, /<a[^>]*>(Start free|Book a demo|Talk to sales|Try free|Get started)<\/a>/is),
  };
}

// The two ad sources use different field names for the destination URL.
function normalizeAds(items = []) {
  return (items || [])
    .map(item => item.url || item.landingPageUrl || item.snapshot?.link_url)
    .filter(Boolean);
}

function diffPage(previous, current) {
  const changes = [];
  for (const field of ['title', 'h1', 'cta']) {
    if ((previous?.[field] || null) !== (current?.[field] || null)) {
      changes.push({
        field,
        previous: previous?.[field] || null,
        current: current?.[field] || null,
      });
    }
  }
  return changes;
}

async function collectLandingPages(company) {
  const [facebookJson, linkedinJson] = await Promise.all([
    fetchJson(
      `https://api.sociavault.com/v1/scrape/facebook-ad-library/company-ads?companyName=${encodeURIComponent(company)}&status=ACTIVE&trim=true`
    ),
    fetchJson(
      `https://api.sociavault.com/v1/scrape/linkedin-ad-library/search?company=${encodeURIComponent(company)}`
    ),
  ]);

  const urls = new Set([
    ...normalizeAds(facebookJson.data),
    ...normalizeAds(linkedinJson.data),
  ]);

  const pages = [];
  for (const url of urls) {
    try {
      const html = await fetchHtml(url);
      pages.push(parsePage(html, url));
    } catch (error) {
      console.error(`Failed to fetch ${url}:`, error.message);
    }
  }
  return pages;
}

// Previous snapshots would normally come from storage; this is a hardcoded example.
const previousPages = {
  'https://example.com/demo': {
    title: 'Book a Demo | Example',
    h1: 'See Example in action',
    cta: 'Book a demo',
  },
};

// Top-level await: run this as an ES module.
const currentPages = await collectLandingPages('HubSpot');

for (const page of currentPages) {
  const changes = diffPage(previousPages[page.url], page);
  if (changes.length > 0) {
    console.log(`\nChanges detected for ${page.url}`);
    console.log(changes);
  }
}
This is enough to build a daily tracker that tells you when an ad-linked page changes its core message.
And if you want the public ad layer feeding that workflow without managing it manually, SociaVault makes that part much easier.
Python Version: Good for Scheduled Tracking Jobs
Python works very well for this kind of scheduled page-diff script.
import os
import re
from urllib.parse import quote

import requests

HEADERS = {'X-API-Key': os.environ['SOCIAVAULT_API_KEY']}


def fetch_json(url):
    response = requests.get(url, headers=HEADERS, timeout=30)
    response.raise_for_status()
    return response.json()


def fetch_html(url):
    response = requests.get(
        url,
        headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'},
        timeout=30,
    )
    response.raise_for_status()
    return response.text


def extract_between(html, pattern):
    # Pull the first capture group, collapsing whitespace.
    match = re.search(pattern, html, re.IGNORECASE | re.DOTALL)
    return re.sub(r'\s+', ' ', match.group(1)).strip() if match else None


def parse_page(html, url):
    # Reduce a landing page to the few fields worth diffing.
    return {
        'url': url,
        'title': extract_between(html, r'<title>(.*?)</title>'),
        'h1': extract_between(html, r'<h1[^>]*>(.*?)</h1>'),
        'cta': extract_between(html, r'<a[^>]*>(Start free|Book a demo|Talk to sales|Try free|Get started)</a>'),
    }


def normalize_ads(items=None):
    # The two ad sources use different field names for the destination URL.
    items = items or []
    urls = []
    for item in items:
        url = item.get('url') or item.get('landingPageUrl') or item.get('snapshot', {}).get('link_url')
        if url:
            urls.append(url)
    return urls


def diff_page(previous, current):
    changes = []
    for field in ['title', 'h1', 'cta']:
        old_value = (previous or {}).get(field)
        new_value = current.get(field)
        if old_value != new_value:
            changes.append({'field': field, 'previous': old_value, 'current': new_value})
    return changes


def collect_landing_pages(company):
    # URL-encode the company name, mirroring encodeURIComponent in the JS version.
    encoded = quote(company)
    facebook = fetch_json(
        f'https://api.sociavault.com/v1/scrape/facebook-ad-library/company-ads?companyName={encoded}&status=ACTIVE&trim=true'
    )
    linkedin = fetch_json(
        f'https://api.sociavault.com/v1/scrape/linkedin-ad-library/search?company={encoded}'
    )

    urls = set(normalize_ads(facebook.get('data')) + normalize_ads(linkedin.get('data')))

    pages = []
    for url in urls:
        try:
            html = fetch_html(url)
            pages.append(parse_page(html, url))
        except Exception as error:
            print(f'Failed to fetch {url}: {error}')
    return pages


# Previous snapshots would normally come from storage; this is a hardcoded example.
previous_pages = {
    'https://example.com/demo': {
        'title': 'Book a Demo | Example',
        'h1': 'See Example in action',
        'cta': 'Book a demo',
    }
}

current_pages = collect_landing_pages('HubSpot')

for page in current_pages:
    changes = diff_page(previous_pages.get(page['url']), page)
    if changes:
        print(f'\nChanges detected for {page["url"]}')
        print(changes)
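For a real scheduled job, the previous snapshots should come from storage rather than a hardcoded dict. Here is a minimal persistence sketch, assuming a local snapshots.json file next to the script; the file path and helper names are illustrative, not part of any API.

import json
from pathlib import Path

SNAPSHOT_FILE = Path('snapshots.json')  # illustrative location


def load_snapshots():
    # Previous run's snapshots, or an empty dict on the first run.
    if SNAPSHOT_FILE.exists():
        return json.loads(SNAPSHOT_FILE.read_text())
    return {}


def save_snapshots(pages):
    # Store the current run keyed by URL so the next run can diff against it.
    SNAPSHOT_FILE.write_text(json.dumps({page['url']: page for page in pages}, indent=2))


previous_pages = load_snapshots()
current_pages = collect_landing_pages('HubSpot')

for page in current_pages:
    changes = diff_page(previous_pages.get(page['url']), page)
    if changes:
        print(f'\nChanges detected for {page["url"]}')
        print(changes)

save_snapshots(current_pages)

Run that from cron or any scheduler and each day's pages are compared against the last saved run instead of a hardcoded example.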
The Best Use Cases for This
This kind of tracker is especially useful for:
- spotting pricing pivots earlier
- seeing when a competitor moves from demo-led to self-serve
- catching new proof points or trust elements in the hero section
- identifying when a category starts leaning harder into comparison pages
That is much more actionable than just saving screenshots of ads.
Honest Alternatives
There are a few other ways to do this.
General site crawlers
Useful if you care about broad site changes.
Less focused if your real interest is what paid traffic is being sent to.
Visual diff tools
Great when design changes matter.
Often noisier and heavier than necessary if you only care about message shifts.
Manual weekly review
Fine for small lists.
Easy to forget, and very hard to compare consistently over time.
That is why I usually start with simple structured text diffs and only add visual diffing later if needed.
Final Take
If you want to understand what a competitor is actually pushing, look past the ad and track the page.
That is where the real shift often shows up first.
Use public ad data to find the current destination URLs. Save a few meaningful fields. Diff them over time. That alone gives you a much more useful competitor signal than casual ad-library browsing.
And if you want the ad-side data layer without wiring it all manually, SociaVault is a good place to start.
Then keep the rest simple and boring: fetch, extract, diff, report.
That is enough to make a landing-page change tracker genuinely useful.