DEV Community

agenthustler

Twitter API v2 vs Web Scraping in 2026: Which Should You Use?

Twitter (now X) data is still one of the most valuable sources for sentiment analysis, trend monitoring, and competitive intelligence. But accessing that data has gotten expensive. The API pricing changes in 2023 priced out most small teams, and the landscape has shifted further in 2026.

Let's break down your options: the official API vs. web scraping, with a practical comparison to help you decide.

Twitter API v2: The Official Route

Current Pricing (2026)

| Tier | Monthly Cost | Tweet Cap | Features |
|------|--------------|-----------|----------|
| Free | $0 | 1,500 tweets/mo (write) | Post only, 1 app |
| Basic | $200/mo | 10,000 reads/mo | Read + write, 2 apps |
| Pro | $5,000/mo | 1M reads/mo | Full archive search, analytics |
| Enterprise | Custom | Custom | Firehose access |

What the Free Tier Actually Gets You

Almost nothing for data extraction:

  • Write-only — you can post tweets, but can't read/search them
  • 1,500 tweets per month posting limit
  • No search endpoint access
  • No user lookup
  • Useful only if you're building a posting bot
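If a posting bot is all you need, the free tier covers it. Here's a minimal sketch of the `POST /2/tweets` call — note this assumes you've already completed the OAuth 2.0 user-context flow and have an access token with the `tweet.write` scope (an app-only bearer token can read but not post):

```python
import requests

def build_post_request(text: str) -> tuple[str, dict]:
    """Compose the endpoint URL and JSON body for creating a tweet."""
    return "https://api.twitter.com/2/tweets", {"text": text}

def post_tweet(user_access_token: str, text: str) -> dict:
    """Create a tweet using an OAuth 2.0 user-context access token."""
    url, payload = build_post_request(text)
    response = requests.post(
        url,
        headers={"Authorization": f"Bearer {user_access_token}"},
        json=payload,
        timeout=30,
    )
    response.raise_for_status()  # surface auth and rate-limit errors
    return response.json()
```

That's the entire write surface the free tier gives you — no reading back what you or anyone else posted.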

Basic Tier ($200/mo)

The entry point for data access:

  • 10,000 tweet reads per month
  • Basic search (recent tweets, last 7 days)
  • User lookup and follower data
  • No historical/archive search

At $200/month for 10,000 tweets, that's $0.02 per tweet. For many research and monitoring use cases, that's too expensive.
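The per-tweet math across the paid tiers (list prices from the table above) is quick to check:

```python
# Read cost per tweet at list price for each paid data tier
tiers = {
    "Basic": (200, 10_000),     # $200/mo, 10K reads
    "Pro": (5_000, 1_000_000),  # $5,000/mo, 1M reads
}

for name, (monthly_cost, reads) in tiers.items():
    print(f"{name}: ${monthly_cost / reads:.4f} per tweet read")
# Basic: $0.0200 per tweet read
# Pro: $0.0050 per tweet read
```

So even at Pro pricing, a full month's quota works out to half a cent per tweet — and you only hit that rate if you actually consume the full million reads.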

API Code Example

```python
import requests

BEARER_TOKEN = "YOUR_BEARER_TOKEN"
HEADERS = {"Authorization": f"Bearer {BEARER_TOKEN}"}

def search_tweets(query: str, max_results: int = 10) -> dict:
    """Search recent tweets (last 7 days on the Basic tier)."""
    url = "https://api.twitter.com/2/tweets/search/recent"
    params = {
        "query": query,
        "max_results": max_results,
        "tweet.fields": "created_at,public_metrics,author_id",
        "expansions": "author_id",
        "user.fields": "username,name,public_metrics",
    }
    response = requests.get(url, headers=HEADERS, params=params, timeout=30)
    response.raise_for_status()  # fail loudly on auth or rate-limit errors
    return response.json()

def get_user_tweets(username: str, max_results: int = 10) -> dict:
    """Fetch a user's recent tweets (two requests: ID lookup, then timeline)."""
    # First resolve the username to a user ID
    url = f"https://api.twitter.com/2/users/by/username/{username}"
    user_resp = requests.get(url, headers=HEADERS, timeout=30)
    user_resp.raise_for_status()
    user_id = user_resp.json()["data"]["id"]

    # Then fetch their tweets
    url = f"https://api.twitter.com/2/users/{user_id}/tweets"
    params = {
        "max_results": max_results,
        "tweet.fields": "created_at,public_metrics",
    }
    response = requests.get(url, headers=HEADERS, params=params, timeout=30)
    response.raise_for_status()
    return response.json()

# Search for tweets about Python
results = search_tweets("python programming", max_results=10)
for tweet in results.get("data", []):
    metrics = tweet["public_metrics"]
    print(f"Tweet: {tweet['text']}")
    print(f"Likes: {metrics['like_count']} | RTs: {metrics['retweet_count']}")
    print("---")
```

Web Scraping: The Alternative

When Scraping Makes More Sense

  • Budget under $200/mo — the API's cheapest data tier costs $200
  • Need more than 10K tweets — Basic tier caps out quickly
  • Historical data — API archive search requires Pro ($5K/mo)
  • Specific data needs — the API doesn't expose everything (e.g., view counts were added late)
  • One-time research — paying $200/mo for a one-off analysis doesn't make sense

Challenges with Scraping Twitter

Twitter/X actively fights scraping:

  • Aggressive rate limiting
  • Login walls for search results
  • Frequent frontend changes
  • Legal threats (though enforcement is rare for research)

Building and maintaining your own Twitter scraper is a full-time job. That's where managed scraping tools come in.

Using an Apify Actor

I built a Twitter/X Scraper on Apify that handles the anti-scraping protections and outputs structured data.

```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

run_input = {
    "searchQueries": ["python developer"],
    "maxTweets": 100,
    "sortBy": "Top",
}

# Start the actor and block until the run finishes
run = client.actor("cryptosignals/twitter-scraper").call(run_input=run_input)

# Iterate over the structured results in the run's default dataset
for tweet in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"@{tweet['username']}")
    print(f"Tweet: {tweet['text']}")
    print(f"Likes: {tweet['likeCount']} | RTs: {tweet['retweetCount']}")
    print(f"Views: {tweet['viewCount']}")
    print(f"Posted: {tweet['createdAt']}")
    print("---")
```

Sample Output

```json
{
  "text": "Just shipped my first FastAPI app. Python devs, what framework do you use?",
  "username": "devperson",
  "likeCount": 342,
  "retweetCount": 45,
  "replyCount": 89,
  "viewCount": 52000,
  "createdAt": "2026-03-05T14:23:00Z",
  "hashtags": ["python", "fastapi"],
  "url": "https://x.com/devperson/status/..."
}
```

Decision Framework: API vs Scraping

| Factor | Twitter API v2 | Web Scraping |
|--------|----------------|--------------|
| Cost (small scale) | $200/mo minimum | Pay per result |
| Cost (large scale) | $5,000/mo for archive | Still pay per result |
| Data freshness | Real-time | Near real-time |
| Historical data | Pro tier only ($5K/mo) | Available |
| Rate limits | Strict (10K reads/mo on Basic) | Managed by tool |
| Data structure | Well-documented JSON | Depends on tool |
| Reliability | High (official) | Medium (can break) |
| Setup time | Minutes | Minutes (with Apify) |
| Legal risk | None | Low for public data |
| View counts | Yes | Yes |
| Best for | Production apps, real-time | Research, analysis, one-offs |

When to Use the API

  • You're building a production application that needs guaranteed uptime
  • You need real-time streaming (filtered stream endpoint)
  • Your company can justify $200-5,000/month in API costs
  • You're working at a company with a compliance requirement for official data sources
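For the real-time case, the v2 filtered stream works in two steps: register rules, then hold open a streaming connection that delivers matching tweets as they arrive. A minimal sketch (note: access tiers for the streaming endpoints have shifted over time, and streaming generally requires a paid tier — check your plan before building on this):

```python
import json
import requests

BEARER_TOKEN = "YOUR_BEARER_TOKEN"
HEADERS = {"Authorization": f"Bearer {BEARER_TOKEN}"}
RULES_URL = "https://api.twitter.com/2/tweets/search/stream/rules"
STREAM_URL = "https://api.twitter.com/2/tweets/search/stream"

def build_rules(*queries: str) -> dict:
    """Compose the JSON body that registers one rule per query."""
    return {"add": [{"value": q, "tag": q} for q in queries]}

def stream_mentions(*queries: str):
    """Register filter rules, then yield matching tweets as they arrive."""
    resp = requests.post(RULES_URL, headers=HEADERS,
                         json=build_rules(*queries), timeout=30)
    resp.raise_for_status()

    with requests.get(STREAM_URL, headers=HEADERS, stream=True, timeout=90) as stream:
        stream.raise_for_status()
        for line in stream.iter_lines():
            if line:  # the stream sends blank keep-alive lines
                yield json.loads(line)["data"]
```

Scraping can't replicate this push model — a scraper polls, so "near real-time" in the comparison table means minutes of lag, not milliseconds.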

When to Use Scraping

  • You're a small team or solo developer on a budget
  • You need historical data without paying $5K/mo
  • You're doing one-off research or periodic analysis
  • You need more than 10K tweets without upgrading to Pro
  • You want pay-per-result pricing instead of a monthly subscription

Hybrid Approach

Many teams use both:

  1. API for real-time — stream mentions of your brand as they happen
  2. Scraping for research — bulk extract historical data for analysis

This gives you the reliability of the official API for production features while using scraping for cost-effective research.
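The split can be encoded as a simple routing rule. This is a hypothetical dispatcher, not part of either tool — the thresholds come from the API's 7-day recent-search window and the Basic tier's 10K monthly read cap:

```python
def choose_source(days_back: int, tweet_count: int,
                  api_monthly_budget: int = 10_000) -> str:
    """Route a data pull: recent, low-volume queries fit the API's
    recent-search window and Basic-tier read cap; everything else
    (historical or bulk) goes to the scraper."""
    if days_back <= 7 and tweet_count <= api_monthly_budget:
        return "api"
    return "scraper"

print(choose_source(days_back=1, tweet_count=500))     # api
print(choose_source(days_back=90, tweet_count=500))    # scraper
print(choose_source(days_back=3, tweet_count=50_000))  # scraper
```

In practice you'd also track reads consumed so far this month, since the budget is a monthly pool rather than a per-query limit.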

Building a Monitoring Dashboard

Here's a practical example combining data extraction with analysis:

```python
from datetime import datetime

import pandas as pd
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

def monitor_brand_mentions(brand: str, max_tweets: int = 200) -> dict:
    run_input = {
        "searchQueries": [brand],
        "maxTweets": max_tweets,
        "sortBy": "Latest",
    }

    run = client.actor("cryptosignals/twitter-scraper").call(run_input=run_input)
    tweets = list(client.dataset(run["defaultDatasetId"]).iterate_items())

    df = pd.DataFrame(tweets)
    if df.empty:  # avoid crashing on idxmax/mean when nothing matched
        return {"brand": brand, "total_mentions": 0,
                "checked_at": datetime.now().isoformat()}

    return {
        "brand": brand,
        "total_mentions": len(df),
        "total_impressions": int(df["viewCount"].sum()),
        "avg_likes": round(df["likeCount"].mean(), 1),
        "avg_retweets": round(df["retweetCount"].mean(), 1),
        "top_tweet": df.loc[df["viewCount"].idxmax(), "text"][:100],
        "checked_at": datetime.now().isoformat(),
    }

report = monitor_brand_mentions("@yourbrand")
for key, value in report.items():
    print(f"{key}: {value}")
```

Conclusion

The Twitter API v2 is the right choice for well-funded teams building production applications. For everyone else — researchers, small teams, indie developers — scraping tools like the Twitter/X Scraper on Apify offer a more cost-effective path to the same data.

The best approach depends on your budget, scale, and use case. Start with scraping for research, and upgrade to the API when you need real-time reliability in production.
