Last month, a friend working in biotech asked me: "How can I quickly see what trials Pfizer is running right now?"
He was spending hours on ClinicalTrials.gov, clicking through pages manually.
I said: "Give me 20 minutes."
The Problem
ClinicalTrials.gov has 500,000+ trials. The website works, but if you want to:
- Monitor a specific company's pipeline
- Track a disease area over time
- Export data for analysis
...you're stuck clicking around.
The Solution: 15 Lines of Python
import requests

def track_company(sponsor, status='RECRUITING'):
    """Print and return the first page of a sponsor's trials with the given status."""
    resp = requests.get('https://clinicaltrials.gov/api/v2/studies', params={
        'query.spons': sponsor,  # the v2 API's sponsor search parameter
        'filter.overallStatus': status,
        'pageSize': 20,  # first page only, so at most 20 results
        'format': 'json'
    }, timeout=30)
    resp.raise_for_status()
    trials = resp.json().get('studies', [])
    print(f"\n{sponsor}: first {len(trials)} {status.lower()} trials\n")
    for s in trials:
        p = s['protocolSection']
        title = p['identificationModule']['briefTitle']
        phase = ', '.join(p.get('designModule', {}).get('phases', ['N/A']))
        print(f"  [{phase}] {title}")
    return trials

# Check what the big players are testing
for company in ['Pfizer', 'Moderna', 'Novartis', 'Roche']:
    track_company(company)
Output:
Pfizer: first 20 recruiting trials
  [PHASE3] Study of PF-07321332 in Non-Hospitalized Adults
  [PHASE2] Novel mRNA Cancer Vaccine + Pembrolizumab
  [PHASE1] Gene Therapy for Hemophilia B
  ...
Moderna: first 20 recruiting trials
  [PHASE3] mRNA-1283 COVID-19 Next-Gen Vaccine
  [PHASE2] Personalized Cancer Vaccine (mRNA-4157)
  ...
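With 'pageSize': 20 the script only ever sees the first page, so a sponsor with hundreds of recruiting trials gets cut off. The v2 API returns a nextPageToken field when more results exist, and accepts it back as the pageToken parameter. Here is a minimal sketch of following those tokens. The names fetch_all, get_page and max_pages are my own for illustration, and the page cap is an arbitrary safety limit, not something the API requires.

```python
API_URL = 'https://clinicaltrials.gov/api/v2/studies'

def _get_page_http(params):
    """Fetch one page of results over HTTP and return the decoded JSON."""
    # Imported here so the paging logic below can be exercised without network access.
    import requests
    resp = requests.get(API_URL, params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()

def fetch_all(params, get_page=None, max_pages=10):
    """Collect studies across pages by following nextPageToken.

    `get_page` is any callable that takes a params dict and returns the
    decoded JSON for one page; it defaults to a real HTTP request.
    """
    if get_page is None:
        get_page = _get_page_http
    studies, token = [], None
    for _ in range(max_pages):
        page_params = dict(params)
        if token:
            page_params['pageToken'] = token
        data = get_page(page_params)
        studies.extend(data.get('studies', []))
        token = data.get('nextPageToken')
        if not token:  # no token means this was the last page
            break
    return studies

# Example call (makes real requests):
# trials = fetch_all({'query.term': 'cancer immunotherapy',
#                     'filter.overallStatus': 'RECRUITING',
#                     'pageSize': 100, 'format': 'json'})
```

Passing the page fetcher in as an argument keeps the loop easy to test with canned responses, and max_pages stops a broad query from quietly pulling thousands of records.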
Making It Useful: Daily Email Alert
import smtplib  # for sending the alert email
from email.mime.text import MIMEText
from datetime import datetime, timedelta

import requests

def get_new_trials(query, days=1):
    """Return studies matching `query` that were first posted in the last `days` days."""
    since = (datetime.now() - timedelta(days=days)).strftime('%Y-%m-%d')
    resp = requests.get('https://clinicaltrials.gov/api/v2/studies', params={
        'query.term': query,
        'filter.advanced': f'AREA[StudyFirstPostDate]RANGE[{since}, MAX]',
        'pageSize': 50,  # at most 50 new trials per run
        'format': 'json'
    }, timeout=30)
    resp.raise_for_status()
    return resp.json().get('studies', [])

# Run daily via cron
new_trials = get_new_trials('artificial intelligence', days=1)
if new_trials:
    print(f"{len(new_trials)} new AI trials today!")
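The snippet above stops at a print, so here is a rough sketch of the email step using the same smtplib and MIMEText imports. The helper names build_alert and send_alert are hypothetical, and the SMTP host, port and credentials are placeholders you would swap for your own mail provider's settings. It also assumes your provider accepts STARTTLS on the port you give it.

```python
import smtplib
from email.mime.text import MIMEText

def build_alert(trials, query, sender, recipient):
    """Build a plain-text email listing the NCT ID and title of each trial."""
    lines = []
    for s in trials:
        ident = s['protocolSection']['identificationModule']
        lines.append(f"{ident.get('nctId', 'N/A')}: {ident.get('briefTitle', 'Untitled')}")
    msg = MIMEText('\n'.join(lines))
    msg['Subject'] = f"{len(trials)} new trials for '{query}'"
    msg['From'] = sender
    msg['To'] = recipient
    return msg

def send_alert(msg, host, port=587, user=None, password=None):
    """Send the message over SMTP, upgrading the connection with STARTTLS."""
    with smtplib.SMTP(host, port, timeout=30) as smtp:
        smtp.starttls()
        if user:
            smtp.login(user, password)
        smtp.send_message(msg)

# Example wiring (placeholder addresses and host):
# msg = build_alert(new_trials, 'artificial intelligence',
#                   'alerts@example.com', 'me@example.com')
# send_alert(msg, 'smtp.example.com', user='alerts@example.com', password='...')
```

Keeping message construction separate from sending means you can check the email body locally before pointing the script at a real mail server.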
What My Friend Said
"I was paying $200/month for a clinical trial monitoring service. This does 80% of what it does."
That's $2,400/year saved with a Python script.
The Full Toolkit
I packaged this into a proper CLI tool:
python search_trials.py 'cancer immunotherapy' --status RECRUITING --format csv --output trials.csv
It's part of my Research API Suite — 9 free API toolkits for research automation.
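The script itself isn't reproduced in this post, so the following is not its actual code. It is only a small sketch of how a CSV export step like that could work: flatten each study's nested JSON into one row and write it with the standard csv module. The column names and the study_to_row and write_csv helpers are my own choices for illustration.

```python
import csv

FIELDS = ['nct_id', 'title', 'status', 'phases']

def study_to_row(study):
    """Flatten one study's nested JSON into a flat dict, tolerating missing modules."""
    p = study.get('protocolSection', {})
    ident = p.get('identificationModule', {})
    return {
        'nct_id': ident.get('nctId', ''),
        'title': ident.get('briefTitle', ''),
        'status': p.get('statusModule', {}).get('overallStatus', ''),
        # A study can list several phases, so join them into a single cell.
        'phases': ';'.join(p.get('designModule', {}).get('phases', [])),
    }

def write_csv(studies, out):
    """Write a header plus one row per study to an open text file object."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for s in studies:
        writer.writerow(study_to_row(s))

# Example (the csv module expects files opened with newline=''):
# with open('trials.csv', 'w', newline='', encoding='utf-8') as f:
#     write_csv(trials, f)
```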
What data would you track if you had easy access to 500K clinical trials? I'm curious about non-obvious use cases.
More tools: Apify scrapers | GitHub