Alex Spinov
Archive.org Has a Free API — Check How Any Website Looked 20 Years Ago (Wayback Machine)

The Wayback Machine has been archiving the internet since 1996 and now stores more than 800 billion web pages.

What most developers don't know: it has a free API. No key needed. You can programmatically check how any URL looked on any date.


Quick Start

import requests

def wayback_check(url):
    """Check if a URL exists in the Wayback Machine."""
    api = 'https://archive.org/wayback/available'
    # params handles URL encoding; timeout avoids hanging on slow responses
    resp = requests.get(api, params={'url': url}, timeout=30)
    data = resp.json()
    snapshot = data.get('archived_snapshots', {}).get('closest', {})
    if snapshot:
        return {
            'archived': True,
            'url': snapshot['url'],
            'timestamp': snapshot['timestamp'],
            'status': snapshot['status']
        }
    return {'archived': False}

result = wayback_check('google.com')
print(f'Archived: {result["archived"]}')
if result['archived']:
    print(f'Snapshot URL: {result["url"]}')

Use Cases

1. Competitive Intelligence

See how a competitor's pricing page changed over time:

def get_snapshots(url, limit=10):
    """Get all archived snapshots of a URL."""
    cdx_api = f'https://web.archive.org/cdx/search/cdx?url={url}&output=json&limit={limit}'
    resp = requests.get(cdx_api, timeout=30)
    rows = resp.json()
    if len(rows) < 2:
        return []
    headers = rows[0]
    return [dict(zip(headers, row)) for row in rows[1:]]

# See every version of a competitor's pricing page
snapshots = get_snapshots('competitor.com/pricing', limit=20)
for s in snapshots:
    print(f'{s["timestamp"]}  {s["statuscode"]}  {s["length"]} bytes')

2. Check If a Domain Was Previously Used

Buying a domain? Check what it was used for before:

snapshots = get_snapshots('domain-you-want-to-buy.com', limit=5)
for s in snapshots:
    archive_url = f'https://web.archive.org/web/{s["timestamp"]}/{s["original"]}'
    print(f'{s["timestamp"][:4]}: {archive_url}')

3. Recover Lost Content

Website went down? Blog post deleted? Check if the Wayback Machine has it:

result = wayback_check('deleted-blog.com/important-article')
if result['archived']:
    print(f'Found it! {result["url"]}')
    # Download the page
    page = requests.get(result['url'])
    with open('recovered.html', 'w') as f:
        f.write(page.text)
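One detail the snippet above glosses over: the URL returned by the availability API serves Wayback's playback page, which injects a toolbar and rewrites links. The Wayback Machine also accepts an id_ flag appended to the 14-digit timestamp to return the original archived bytes instead. This isn't covered by the APIs above, so treat it as a sketch of that URL convention:

```python
import re

def raw_snapshot_url(archive_url):
    """Convert a Wayback playback URL into its raw-content form by appending
    'id_' to the 14-digit timestamp, which asks the server for the original
    bytes without the Wayback toolbar or rewritten links."""
    return re.sub(r'(/web/\d{14})/', r'\1id_/', archive_url, count=1)

print(raw_snapshot_url(
    'https://web.archive.org/web/20200101000000/http://deleted-blog.com/important-article'
))
# https://web.archive.org/web/20200101000000id_/http://deleted-blog.com/important-article
```

Useful when you want to recover the page exactly as it was served, not as Wayback renders it.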

4. SEO: Track How a Site Evolved

# How many times was a URL archived per year?
from collections import Counter

snapshots = get_snapshots('example.com', limit=1000)
years = Counter(s['timestamp'][:4] for s in snapshots)
for year, count in sorted(years.items()):
    bar = '█' * (count // 5)
    print(f'{year}: {count:4d} snapshots {bar}')

CDX API (The Power User API)

The CDX API is more powerful than the basic availability API:

https://web.archive.org/cdx/search/cdx?url=example.com&output=json

Parameters:

  • url — URL to search
  • matchType — exact, prefix, host, domain
  • from / to — date range (YYYYMMDD)
  • limit — max results
  • output — json, text, csv
  • filter — statuscode, mimetype, etc.
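Combining these parameters by hand gets error-prone, so here's a small helper that builds a CDX query URL from the parameters listed above (no request is made, so you can inspect the URL before sending it):

```python
from urllib.parse import urlencode

def build_cdx_url(url, match_type='exact', from_date=None, to_date=None,
                  limit=100, status_filter=None):
    """Build a CDX API query URL from the documented parameters.
    from_date/to_date use YYYYMMDD; status_filter is an HTTP status code."""
    params = {'url': url, 'output': 'json',
              'matchType': match_type, 'limit': limit}
    if from_date:
        params['from'] = from_date
    if to_date:
        params['to'] = to_date
    if status_filter:
        params['filter'] = f'statuscode:{status_filter}'
    return 'https://web.archive.org/cdx/search/cdx?' + urlencode(params)

# Only successful (200) captures of example.com during 2015
print(build_cdx_url('example.com', from_date='20150101',
                    to_date='20151231', status_filter='200'))
```

Pass the result to requests.get() as in the earlier examples; urlencode takes care of escaping the colon in the filter expression.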

Rate Limits

  • No API key needed
  • Be respectful: 1-2 requests/second
  • Large queries may time out; use the limit parameter
  • Full dataset available for bulk download
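To stay under the 1-2 requests/second guideline when looping over many URLs, a tiny throttle is enough. A minimal sketch (the interval value is a suggestion, not an Archive.org requirement):

```python
import time

class Throttle:
    """Ensure at least min_interval seconds between successive calls."""
    def __init__(self, min_interval=0.5):
        self.min_interval = min_interval
        self._last = 0.0

    def wait(self):
        elapsed = time.monotonic() - self._last
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last = time.monotonic()

throttle = Throttle(min_interval=0.5)  # ~2 requests/second
# for url in urls:
#     throttle.wait()
#     result = wayback_check(url)
```

Call throttle.wait() before each request; the first call returns immediately and later calls sleep only for the remaining gap.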

What do you use the Wayback Machine for?

I've used it for competitive analysis and recovering deleted blog posts. What's the most creative use you've seen? Share in the comments.


More free APIs: Awesome Free APIs 2026 — 300+ APIs, no key needed

Web scraping tools: Awesome Web Scraping 2026
