A client once asked me to prove that a competitor copied their landing page design. I needed to show what both sites looked like 2 years ago.
The Internet Archive's Wayback Machine has a free API that lets you retrieve any website's snapshot history — programmatically.
## The API

Base URL: `https://archive.org/wayback/available`

No API key. No auth. No documented rate limits (but be reasonable).
## 1. Check If a URL Has Been Archived

```python
import requests

def check_archive(url):
    resp = requests.get(
        'https://archive.org/wayback/available',
        params={'url': url}
    ).json()
    snapshot = resp.get('archived_snapshots', {}).get('closest', {})
    if snapshot:
        return {
            'available': True,
            'url': snapshot['url'],
            'timestamp': snapshot['timestamp'],
            'status': snapshot['status']
        }
    return {'available': False}

# Check any website
result = check_archive('github.com')
print(result)
# {'available': True, 'url': 'http://web.archive.org/web/20260324.../github.com', ...}
```
## 2. Get a Specific Date's Snapshot

```python
def get_snapshot(url, date='20200101'):
    """Get the closest snapshot to a specific date. Format: YYYYMMDD."""
    resp = requests.get(
        'https://archive.org/wayback/available',
        params={'url': url, 'timestamp': date}
    ).json()
    snapshot = resp.get('archived_snapshots', {}).get('closest', {})
    return snapshot.get('url', 'Not found')

# What did Google look like on Jan 1, 2015?
print(get_snapshot('google.com', '20150101'))
```
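Note that while you query with an 8-digit date, the API returns full 14-digit timestamps (YYYYMMDDhhmmss). Here is a small sketch for converting between them with the standard library — the helper names are mine, not part of the API:

```python
from datetime import datetime

def parse_wayback_ts(ts):
    """Parse a Wayback timestamp prefix (YYYY, YYYYMMDD, or the full
    14-digit YYYYMMDDhhmmss) into a datetime."""
    formats = {4: '%Y', 8: '%Y%m%d', 14: '%Y%m%d%H%M%S'}
    return datetime.strptime(ts, formats[len(ts)])

def to_wayback_date(dt):
    """Format a datetime as the YYYYMMDD string get_snapshot() expects."""
    return dt.strftime('%Y%m%d')

print(parse_wayback_ts('20150101120030'))   # 2015-01-01 12:00:30
print(to_wayback_date(datetime(2020, 6, 1)))  # 20200601
```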
## 3. Track Design Changes Over Time

```python
def track_changes(url, years=range(2015, 2027)):
    """Get one snapshot per year to track a site's evolution."""
    snapshots = []
    for year in years:
        resp = requests.get(
            'https://archive.org/wayback/available',
            params={'url': url, 'timestamp': f'{year}0601'}
        ).json()
        snap = resp.get('archived_snapshots', {}).get('closest', {})
        if snap:
            snapshots.append({
                'year': year,
                'url': snap['url'],
                'timestamp': snap['timestamp'][:8]
            })
    return snapshots

# Track how a company's site evolved
for s in track_changes('stripe.com'):
    print(f"  {s['year']}: {s['url'][:70]}...")
```
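If you're putting the results into a report, the list that `track_changes` returns is easy to render. A small helper of my own (not part of any API) that turns it into a Markdown table:

```python
def to_markdown_table(snapshots):
    """Render track_changes() output as a Markdown table of yearly links."""
    lines = ['| Year | Snapshot |', '|---|---|']
    lines += [f"| {s['year']} | {s['url']} |" for s in snapshots]
    return '\n'.join(lines)

# Toy input in the same shape track_changes() produces
sample = [
    {'year': 2015, 'url': 'http://web.archive.org/web/20150601/http://stripe.com/',
     'timestamp': '20150601'},
]
print(to_markdown_table(sample))
```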
## 4. CDX API for Bulk Queries

For serious research, use the CDX Server API:

```python
def get_all_snapshots(url, limit=20):
    """Get archived snapshots of a URL (up to limit)."""
    resp = requests.get(
        'https://web.archive.org/cdx/search/cdx',
        params={
            'url': url,
            'output': 'json',
            'limit': limit,
            'fl': 'timestamp,statuscode,original'
        }
    ).json()
    if len(resp) <= 1:  # First row is the header
        return []
    headers = resp[0]
    return [dict(zip(headers, row)) for row in resp[1:]]

# Sample python.org's archive history
snaps = get_all_snapshots('python.org')
print(f"Found {len(snaps)} snapshots")
for s in snaps[:5]:
    print(f"  {s['timestamp'][:8]} — status {s['statuscode']}")
```
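The CDX JSON output is simple to post-process. For example, counting snapshots per year straight from the raw rows (header first, then data) — this helper is my own, not part of the API:

```python
from collections import Counter

def snapshots_per_year(cdx_rows):
    """Count snapshots per year from a CDX JSON response.
    The first row is the header; 'timestamp' values are 14-digit strings."""
    if len(cdx_rows) <= 1:
        return {}
    headers = cdx_rows[0]
    ts_idx = headers.index('timestamp')
    return dict(Counter(row[ts_idx][:4] for row in cdx_rows[1:]))

# Toy response in the CDX JSON shape
sample = [
    ['timestamp', 'statuscode', 'original'],
    ['20150101000000', '200', 'http://python.org/'],
    ['20150601000000', '200', 'http://python.org/'],
    ['20160101000000', '200', 'http://python.org/'],
]
print(snapshots_per_year(sample))  # {'2015': 2, '2016': 1}
```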
## 5. Monitor Competitor Changes

```python
def compare_snapshots(url, date1, date2):
    """Get two snapshots for comparison."""
    snap1 = get_snapshot(url, date1)
    snap2 = get_snapshot(url, date2)
    print(f"Before ({date1}): {snap1}")
    print(f"After ({date2}): {snap2}")
    print("\nOpen both in browser to compare visually.")

compare_snapshots('airbnb.com', '20200101', '20250101')
```
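Eyeballing both in a browser works, but for a rough automated signal you could download each snapshot's HTML (e.g. with `requests.get(snapshot_url).text`) and compare the two strings. A minimal sketch using `difflib` from the standard library:

```python
from difflib import SequenceMatcher

def html_similarity(html1, html2):
    """Rough 0..1 similarity ratio between two HTML strings.
    Identical pages score 1.0; a full redesign scores much lower."""
    return SequenceMatcher(None, html1, html2).ratio()

# Toy strings; in practice pass each snapshot's downloaded HTML.
before = '<h1>Welcome</h1><p>Book a home</p>'
after = '<h1>Welcome</h1><p>Book an experience</p>'
print(f"{html_similarity(before, after):.2f}")
```

This is a crude character-level measure — boilerplate and injected archive toolbars inflate it — but it's enough to flag which dates are worth a visual diff.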
## Use Cases
- Legal: Prove what a website showed on a specific date
- Competitive analysis: Track competitor redesigns and messaging changes
- Due diligence: Verify claims about company history
- Research: Study how the web evolves over time
- SEO: Check if a domain previously hosted different content
- Journalism: Find deleted pages or changed statements
## Tips
- Add 1-second delays between requests — be nice to the Archive
- Not every page is archived — popular sites have more snapshots
- Some sites block archiving via robots.txt
- Snapshots may not include all images/CSS (pages may look broken)
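The first tip is easy to bake into your code. A sketch of a throttled wrapper around `requests.get` — the User-Agent string is a placeholder, identify your own script:

```python
import time
import requests

_last_call = [0.0]

def throttle(min_interval=1.0):
    """Sleep just enough to keep at least min_interval seconds between calls."""
    wait = min_interval - (time.monotonic() - _last_call[0])
    if wait > 0:
        time.sleep(wait)
    _last_call[0] = time.monotonic()

def polite_get(url, params=None):
    """requests.get with a 1-second throttle and an identifying User-Agent."""
    throttle(1.0)
    return requests.get(
        url, params=params,
        headers={'User-Agent': 'wayback-research-script'},  # placeholder
        timeout=30,
    )
```

Swap `polite_get` in wherever the snippets above call `requests.get` and you won't hammer the Archive inside a loop.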
I explore free APIs that most developers miss. More: GitHub | Writing inquiries: Spinov001@gmail.com
More free tools: 77 Web Scraping Tools & APIs
What's the oldest or most interesting website snapshot you've found on the Wayback Machine? I love finding how big sites looked in their early days. 👇