
Alex Spinov


PubMed Has a Free API — Search 36M+ Medical Papers Programmatically

I was building a health-tech prototype last week. Needed medical research data. Expected paywalls everywhere.

Then I found PubMed's E-utilities API. 36 million biomedical papers. Free. No API key. No signup.

What Is PubMed?

PubMed is the U.S. National Library of Medicine's database — the world's largest collection of biomedical literature. It's run by NIH (National Institutes of Health), and they provide free programmatic access to everything.

If you work with health data, drug research, clinical trials, or biomedical NLP — this is your goldmine.

Your First API Call (Zero Setup)

curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=artificial+intelligence&retmode=json&retmax=3"

The response reports 364,000+ matches for "artificial intelligence" in the biomedical literature and, because of retmax=3, lists the PMIDs of the first three.
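To make the response shape concrete, here is a minimal sketch that parses an ESearch JSON body offline. The payload below is a trimmed, made-up sample (the PMIDs are placeholders), and `parse_esearch` is a hypothetical helper name, not part of any library.

```python
import json

# Trimmed, made-up payload in the shape ESearch returns with retmode=json.
# The PMIDs are placeholders, not real records.
SAMPLE_RESPONSE = """
{
  "esearchresult": {
    "count": "364000",
    "retmax": "3",
    "retstart": "0",
    "idlist": ["11111111", "22222222", "33333333"]
  }
}
"""


def parse_esearch(raw: str) -> tuple[int, list[str]]:
    """Return (total_hits, pmids) from an ESearch JSON body."""
    result = json.loads(raw)["esearchresult"]
    # ESearch encodes numbers as strings, so convert before doing arithmetic.
    return int(result["count"]), result["idlist"]


total, pmids = parse_esearch(SAMPLE_RESPONSE)
print(f"{total:,} total hits, first {len(pmids)} PMIDs: {pmids}")
```

Note that `count` is the size of the full result set, while `idlist` holds only as many IDs as `retmax` allows.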

Full Python Example: Search and Fetch Papers

import requests

# Step 1: Search for papers
search_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
search_params = {
    "db": "pubmed",
    "term": "machine learning cancer diagnosis",
    "retmode": "json",
    "retmax": 5,
    "sort": "relevance"
}

search = requests.get(search_url, params=search_params).json()
ids = search["esearchresult"]["idlist"]
total = search["esearchresult"]["count"]
print(f"Found {total} papers. Fetching top {len(ids)}...")

# Step 2: Fetch paper details
fetch_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"
fetch_params = {
    "db": "pubmed",
    "id": ",".join(ids),
    "retmode": "json"
}

details = requests.get(fetch_url, params=fetch_params).json()["result"]

for pmid in ids:
    paper = details[pmid]
    title = paper["title"]
    authors = ", ".join(a["name"] for a in paper.get("authors", [])[:3])
    journal = paper.get("source", "Unknown")
    date = paper.get("pubdate", "Unknown")
    print(f"\n[{pmid}] {title}")
    print(f"  Authors: {authors}")
    print(f"  Journal: {journal} | Date: {date}")

Output:

Found 45231 papers. Fetching top 5...

[39187234] Machine Learning in Cancer Diagnosis: Current State and Future
  Authors: Smith J, Chen L, Kumar R
  Journal: Nature Reviews | Date: 2024 Mar

5 Things You Can Build

1. Drug Research Tracker

# Track publications about a specific drug
drugs = ["ozempic", "metformin", "ivermectin"]
for drug in drugs:
    r = requests.get(search_url, params={"db": "pubmed", "term": drug, "retmode": "json"})
    count = r.json()["esearchresult"]["count"]
    print(f"{drug}: {count} papers")

2. Clinical Trial Monitor

# Find recent clinical trials
params = {
    "db": "pubmed",
    "term": "clinical trial[pt] AND 2024[dp] AND diabetes",
    "retmode": "json",
    "retmax": 10
}
trials = requests.get(search_url, params=params).json()
print(f"Diabetes clinical trials in 2024: {trials['esearchresult']['count']}")
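Queries like the one above are plain strings with field tags such as [pt] (publication type) and [dp] (date of publication) joined by AND. A small builder keeps them readable as they grow. This is only a sketch: `build_query` and its parameter names are hypothetical, and it covers just the tags used in this article.

```python
from typing import Optional


def build_query(text: str,
                publication_type: Optional[str] = None,
                year: Optional[int] = None,
                mesh: Optional[str] = None) -> str:
    """Join free text and optional field-tagged clauses with AND."""
    clauses = [text]
    if mesh:
        # MeSH headings with spaces need quotes around the phrase.
        clauses.append(f'"{mesh}"[MeSH]')
    if publication_type:
        clauses.append(f"{publication_type}[pt]")
    if year:
        clauses.append(f"{year}[dp]")
    return " AND ".join(clauses)


print(build_query("diabetes", publication_type="clinical trial", year=2024))
```

The returned string can be passed as the `term` parameter of an ESearch request.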

3. Author Publication Tracker

# Find all papers by a specific author
params = {
    "db": "pubmed",
    "term": "Hinton GE[author]",
    "retmode": "json"
}
result = requests.get(search_url, params=params).json()
print(f"Geoffrey Hinton: {result['esearchresult']['count']} papers in PubMed")

4. Research Trend Analyzer

# Publications per year for a topic
for year in range(2019, 2026):
    params = {
        "db": "pubmed",
        "term": f"large language model AND {year}[dp]",
        "retmode": "json",
    }
    count = requests.get(search_url, params=params).json()["esearchresult"]["count"]
    print(f"{year}: {count} LLM papers")

5. Abstract Downloader for NLP Training

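# Get abstracts for NLP/ML training.
# Reuses `ids` (the PMID list) from the search example above.
fetch_abstract_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
params = {
    "db": "pubmed",
    "id": ",".join(ids),
    "rettype": "abstract",  # abstract view of each record
    "retmode": "text",      # plain text rather than XML
}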
abstracts = requests.get(fetch_abstract_url, params=params).text
print(abstracts[:500])
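For a training set you usually want one row per paper rather than a text blob. One option is to request `retmode=xml` from EFetch and pull out the PMID, title, and abstract. The sketch below runs offline on a made-up XML sample that mirrors the PubmedArticleSet layout; `parse_abstracts` and `to_csv` are hypothetical helper names, and real records carry many more fields.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample in the shape EFetch returns for db=pubmed, retmode=xml.
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>11111111</PMID>
      <Article>
        <ArticleTitle>A made-up title</ArticleTitle>
        <Abstract>
          <AbstractText Label="BACKGROUND">First part.</AbstractText>
          <AbstractText Label="RESULTS">Second part.</AbstractText>
        </Abstract>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def _text(element):
    """Flatten an element's text, including any inline markup children."""
    return "".join(element.itertext()).strip() if element is not None else ""


def parse_abstracts(xml_text):
    """Return one dict per article with pmid, title, and abstract."""
    rows = []
    for article in ET.fromstring(xml_text).iter("PubmedArticle"):
        citation = article.find("MedlineCitation")
        # Structured abstracts are split into several AbstractText sections.
        sections = citation.findall("Article/Abstract/AbstractText")
        rows.append({
            "pmid": _text(citation.find("PMID")),
            "title": _text(citation.find("Article/ArticleTitle")),
            "abstract": " ".join(_text(s) for s in sections),
        })
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["pmid", "title", "abstract"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = parse_abstracts(SAMPLE_XML)
print(to_csv(rows))
```

Papers without an abstract simply produce an empty `abstract` field, so filter those rows out before training if needed.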

PubMed vs Other Medical APIs

| Feature | PubMed (E-utilities) | Semantic Scholar | Google Scholar | Scopus |
| --- | --- | --- | --- | --- |
| API key | ❌ Not required | Optional | No API | ✅ Required |
| Papers | 36M+ biomedical | 200M+ all fields | Unknown | 84M+ |
| Free | ✅ Completely | ✅ Basic tier | N/A | ❌ Paid |
| Abstracts | ✅ Yes (abstracts, not full text) | ✅ Yes | ❌ No | ✅ Yes |
| MeSH terms | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Clinical trials | ✅ Filterable | ❌ No | ❌ No | Limited |

Pro Tips

  1. Use MeSH terms for precise medical queries: "diabetes mellitus"[MeSH] is more accurate than just diabetes
  2. Add &tool=your_app_name&email=you@example.com to your requests. NCBI asks for both so it can contact you about a problem before blocking your traffic
  3. Rate limit: 3 requests/second without API key, 10/sec with a free key from NCBI
  4. Get a free API key at NCBI — optional but recommended
  5. Use retstart for pagination: &retstart=10&retmax=10 for page 2
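Tips 2, 3, and 5 can be combined into a small helper. The sketch below builds ESearch parameters for a given page and spaces out calls to stay under the keyless limit of 3 requests per second. `page_params` and `Throttle` are hypothetical names; `email`, `api_key`, `retstart`, and `retmax` are the E-utilities parameter names. No network call is made here.

```python
import time


def page_params(term, page, page_size=10, email=None, api_key=None):
    """Build ESearch parameters for a zero-based page index."""
    params = {
        "db": "pubmed",
        "term": term,
        "retmode": "json",
        "retstart": page * page_size,  # page 1 (the second page) starts at 10
        "retmax": page_size,
    }
    if email:
        params["email"] = email
    if api_key:
        params["api_key"] = api_key
    return params


class Throttle:
    """Sleep as needed so successive calls stay under max_per_second."""

    def __init__(self, max_per_second=3):
        self.min_interval = 1.0 / max_per_second
        self._last = None

    def wait(self):
        if self._last is not None:
            remaining = self.min_interval - (time.monotonic() - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


throttle = Throttle(max_per_second=3)
for page in range(3):
    throttle.wait()
    # A real client would call requests.get(search_url, params=...) here.
    print(page_params('"diabetes mellitus"[MeSH]', page))
```

With an API key, raising `max_per_second` to 10 matches the higher limit mentioned above.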

Combine with OpenAlex

PubMed is biomedical-only. For broader academic search (CS, physics, social science), check out my OpenAlex tutorial — 250M+ papers, also free.


What medical data would you extract from 36M papers? Share your use case in the comments.

More free API tutorials: My API series on Dev.to

