Twitter (now X) data is still one of the most valuable sources for sentiment analysis, trend monitoring, and competitive intelligence. But accessing that data has gotten expensive. The API pricing changes in 2023 priced out most small teams, and the landscape has shifted further in 2026.
Let's break down your options: the official API vs. web scraping, with a practical comparison to help you decide.
## Twitter API v2: The Official Route

### Current Pricing (2026)
| Tier | Monthly Cost | Tweet Cap | Features |
|---|---|---|---|
| Free | $0 | 1,500 tweets/mo (write) | Post only, 1 app |
| Basic | $200/mo | 10,000 reads/mo | Read + write, 2 apps |
| Pro | $5,000/mo | 1M reads/mo | Full archive search, analytics |
| Enterprise | Custom | Custom | Firehose access |
### What the Free Tier Actually Gets You
Almost nothing for data extraction:
- Write-only — you can post tweets, but can't read/search them
- 1,500 tweets per month posting limit
- No search endpoint access
- No user lookup
- Useful only if you're building a posting bot
### Basic Tier ($200/mo)
The entry point for data access:
- 10,000 tweet reads per month
- Basic search (recent tweets, last 7 days)
- User lookup and follower data
- No historical/archive search
At $200/month for 10,000 tweets, that's $0.02 per tweet. For many research and monitoring use cases, that's too expensive.
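For a quick sanity check on that math, here's the effective per-tweet cost at each paid tier at full utilization. The numbers are the ones from the pricing table above, not an official X rate card:

```python
# Per-tweet cost at each paid tier, using the figures from the table above.
TIERS = {
    "Basic": (200, 10_000),      # (monthly cost USD, tweet reads/mo)
    "Pro": (5_000, 1_000_000),
}

def cost_per_tweet(tier: str) -> float:
    """Effective USD cost per tweet read if you use the whole monthly cap."""
    monthly_cost, tweet_cap = TIERS[tier]
    return monthly_cost / tweet_cap

for tier in TIERS:
    print(f"{tier}: ${cost_per_tweet(tier):.4f} per tweet")
```

Pro works out cheaper per tweet ($0.005 vs $0.02), but only if you actually need a million reads a month.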
### API Code Example

```python
import requests

BEARER_TOKEN = "YOUR_BEARER_TOKEN"

def search_tweets(query: str, max_results: int = 10):
    """Search recent tweets (last 7 days on the Basic tier)."""
    url = "https://api.twitter.com/2/tweets/search/recent"
    headers = {"Authorization": f"Bearer {BEARER_TOKEN}"}
    params = {
        "query": query,
        "max_results": max_results,
        "tweet.fields": "created_at,public_metrics,author_id",
        "expansions": "author_id",
        "user.fields": "username,name,public_metrics"
    }
    response = requests.get(url, headers=headers, params=params)
    response.raise_for_status()
    return response.json()

def get_user_tweets(username: str, max_results: int = 10):
    """Fetch a user's recent tweets by username."""
    headers = {"Authorization": f"Bearer {BEARER_TOKEN}"}
    # First resolve the username to a user ID
    url = f"https://api.twitter.com/2/users/by/username/{username}"
    user = requests.get(url, headers=headers).json()
    user_id = user["data"]["id"]
    # Then fetch that user's timeline
    url = f"https://api.twitter.com/2/users/{user_id}/tweets"
    params = {
        "max_results": max_results,
        "tweet.fields": "created_at,public_metrics"
    }
    response = requests.get(url, headers=headers, params=params)
    response.raise_for_status()
    return response.json()

# Search for tweets about Python
results = search_tweets("python programming", max_results=10)
for tweet in results.get("data", []):
    metrics = tweet["public_metrics"]
    print(f"Tweet: {tweet['text']}")
    print(f"Likes: {metrics['like_count']} | RTs: {metrics['retweet_count']}")
    print("---")
```
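One gap in a single-request search: the recent-search endpoint returns at most 100 tweets per call, so larger pulls need pagination. Here's a sketch of following `meta.next_token`, which is how API v2 recent search pages through results (error handling kept minimal for brevity):

```python
import requests

BEARER_TOKEN = "YOUR_BEARER_TOKEN"
SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"

def search_all_pages(query: str, pages: int = 3, page_size: int = 100):
    """Collect multiple pages of recent-search results by passing
    each response's meta.next_token back as the next_token param."""
    headers = {"Authorization": f"Bearer {BEARER_TOKEN}"}
    params = {"query": query, "max_results": page_size}
    tweets, next_token = [], None
    for _ in range(pages):
        if next_token:
            params["next_token"] = next_token
        response = requests.get(SEARCH_URL, headers=headers, params=params)
        response.raise_for_status()
        payload = response.json()
        tweets.extend(payload.get("data", []))
        next_token = payload.get("meta", {}).get("next_token")
        if not next_token:
            break  # no more pages available
    return tweets
```

Keep in mind every page counts against the monthly read cap, so a deep pagination run burns through the Basic tier's 10K quota fast.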
## Web Scraping: The Alternative

### When Scraping Makes More Sense
- Budget under $200/mo — the API's cheapest data tier costs $200
- Need more than 10K tweets — Basic tier caps out quickly
- Historical data — API archive search requires Pro ($5K/mo)
- Specific data needs — the API doesn't expose everything (e.g., view counts were added late)
- One-time research — paying $200/mo for a one-off analysis doesn't make sense
### Challenges with Scraping Twitter
Twitter/X actively fights scraping:
- Aggressive rate limiting
- Login walls for search results
- Frequent frontend changes
- Legal threats (though enforcement is rare for research)
Building and maintaining your own Twitter scraper is a full-time job. That's where managed scraping tools come in.
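To give a flavor of what "full-time job" means: the rate limiting alone forces retry logic onto every request path. A minimal exponential-backoff sketch, generic to any fetch callable and not tied to a particular scraping library:

```python
import random
import time

def fetch_with_backoff(fetch, max_retries: int = 5, base_delay: float = 1.0):
    """Retry a flaky zero-argument callable with exponential backoff.

    Doubles the delay on each failure and adds jitter so many workers
    don't retry in lockstep; re-raises after the final attempt.
    """
    for attempt in range(max_retries):
        try:
            return fetch()
        except Exception:
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
```

And backoff is just the start: you'd still need proxy rotation, session/login handling, and selector maintenance every time the frontend changes.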
### Using an Apify Actor

I built a Twitter/X Scraper on Apify that handles the anti-scraping protections and outputs structured data.

```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

run_input = {
    "searchQueries": ["python developer"],
    "maxTweets": 100,
    "sortBy": "Top"
}

run = client.actor("cryptosignals/twitter-scraper").call(run_input=run_input)

for tweet in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"@{tweet['username']}")
    print(f"Tweet: {tweet['text']}")
    print(f"Likes: {tweet['likeCount']} | RTs: {tweet['retweetCount']}")
    print(f"Views: {tweet['viewCount']}")
    print(f"Posted: {tweet['createdAt']}")
    print("---")
```
### Sample Output

```json
{
  "text": "Just shipped my first FastAPI app. Python devs, what framework do you use?",
  "username": "devperson",
  "likeCount": 342,
  "retweetCount": 45,
  "replyCount": 89,
  "viewCount": 52000,
  "createdAt": "2026-03-05T14:23:00Z",
  "hashtags": ["python", "fastapi"],
  "url": "https://x.com/devperson/status/..."
}
```
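Once you have records in that shape, loading them into pandas for analysis or CSV export is a few lines. A sketch using two hypothetical tweet dicts in place of real scraper output:

```python
import pandas as pd

# Hypothetical records shaped like the sample output above.
tweets = [
    {"username": "devperson", "text": "Just shipped my first FastAPI app.",
     "likeCount": 342, "retweetCount": 45, "viewCount": 52000},
    {"username": "otherdev", "text": "Type hints changed how I write Python.",
     "likeCount": 120, "retweetCount": 12, "viewCount": 8000},
]

df = pd.DataFrame(tweets)
# Simple engagement metric: (likes + retweets) per view
df["engagement_rate"] = (df["likeCount"] + df["retweetCount"]) / df["viewCount"]
df.to_csv("tweets.csv", index=False)
print(df[["username", "engagement_rate"]])
```

The same pattern works directly on `iterate_items()` output, since each item is already a flat dict.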
## Decision Framework: API vs Scraping
| Factor | Twitter API v2 | Web Scraping |
|---|---|---|
| Cost (small scale) | $200/mo minimum | Pay per result |
| Cost (large scale) | $5,000/mo for archive | Still pay per result |
| Data freshness | Real-time | Near real-time |
| Historical data | Pro tier only ($5K) | Available |
| Rate limits | Strict (10K tweets/mo on Basic) | Managed by tool |
| Data structure | Well-documented JSON | Depends on tool |
| Reliability | High (official) | Medium (can break) |
| Setup time | Minutes | Minutes (with Apify) |
| Legal risk | Minimal (official access) | Low for public data |
| View counts | Yes (API v2) | Yes |
| Best for | Production apps, real-time | Research, analysis, one-off |
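If you want the table as executable logic, here's a toy helper encoding its thresholds. The cutoffs are this article's numbers, not universal rules:

```python
def recommend_approach(monthly_budget: float, tweets_per_month: int,
                       needs_historical: bool, needs_realtime: bool) -> str:
    """Toy encoding of the decision table above."""
    if needs_realtime:
        return "API (filtered stream)"   # streaming is API-only
    if needs_historical and monthly_budget < 5_000:
        return "Scraping"                # archive search needs the Pro tier
    if monthly_budget < 200 or tweets_per_month > 10_000:
        return "Scraping"                # below Basic pricing or above its cap
    return "API"
```

It's deliberately crude, but it captures the two hard gates: real-time streaming pushes you to the API, and historical data on a sub-$5K budget pushes you to scraping.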
### When to Use the API
- You're building a production application that needs guaranteed uptime
- You need real-time streaming (filtered stream endpoint)
- Your company can justify $200-5,000/month in API costs
- You're working at a company with a compliance requirement for official data sources
### When to Use Scraping
- You're a small team or solo developer on a budget
- You need historical data without paying $5K/mo
- You're doing one-off research or periodic analysis
- You need more than 10K tweets without upgrading to Pro
- You want pay-per-result pricing instead of a monthly subscription
### Hybrid Approach
Many teams use both:
- API for real-time — stream mentions of your brand as they happen
- Scraping for research — bulk extract historical data for analysis
This gives you the reliability of the official API for production features while using scraping for cost-effective research.
## Building a Monitoring Dashboard
Here's a practical example combining data extraction with analysis:
```python
from apify_client import ApifyClient
import pandas as pd
from datetime import datetime

client = ApifyClient("YOUR_APIFY_TOKEN")

def monitor_brand_mentions(brand: str, max_tweets: int = 200):
    """Pull the latest mentions of a brand and summarize engagement."""
    run_input = {
        "searchQueries": [brand],
        "maxTweets": max_tweets,
        "sortBy": "Latest"
    }
    run = client.actor("cryptosignals/twitter-scraper").call(run_input=run_input)
    tweets = list(client.dataset(run["defaultDatasetId"]).iterate_items())
    df = pd.DataFrame(tweets)
    report = {
        "brand": brand,
        "total_mentions": len(df),
        "total_impressions": df["viewCount"].sum(),
        "avg_likes": df["likeCount"].mean(),
        "avg_retweets": df["retweetCount"].mean(),
        "top_tweet": df.loc[df["viewCount"].idxmax()]["text"][:100],
        "checked_at": datetime.now().isoformat()
    }
    return report

report = monitor_brand_mentions("@yourbrand")
for key, value in report.items():
    print(f"{key}: {value}")
```
## Conclusion
The Twitter API v2 is the right choice for well-funded teams building production applications. For everyone else — researchers, small teams, indie developers — scraping tools like the Twitter/X Scraper on Apify offer a more cost-effective path to the same data.
The best approach depends on your budget, scale, and use case. Start with scraping for research, and upgrade to the API when you need real-time reliability in production.