DEV Community

Alex Spinov


How to Find Free PDFs of Research Papers (Legally) With One API Call

You find a paper you need. It costs $35. You check another journal. $49.

But here's the thing: most paywalled papers have a free, legal version somewhere. It might be on the author's personal website, in a university repository, or on a preprint server.

The problem is finding it. That's what Unpaywall does.

One API Call Per Paper

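import requests

doi = "10.1038/nature12373"  # A Nature paper
# Use your real email address here; Unpaywall asks for one on every request
resp = requests.get(f"https://api.unpaywall.org/v2/{doi}?email=you@email.com", timeout=10)
resp.raise_for_status()  # raises on HTTP errors, such as a DOI Unpaywall doesn't know
data = resp.json()

if data["is_oa"]:
    loc = data["best_oa_location"]
    # url_for_pdf can be null; fall back to the landing page in url
    print(f"FREE: {loc['url_for_pdf'] or loc['url']}")
else:
    print("No free version found")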

That's it. One request, one answer.
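
One wrinkle worth knowing: the best location doesn't always carry a direct PDF link, while another location in the same record sometimes does. Here is a minimal sketch of picking the most direct link, assuming the `best_oa_location`, `oa_locations`, `url_for_pdf`, and `url` fields of the Unpaywall response. `best_url` is a hypothetical helper name, not part of any library:

```python
def best_url(record):
    """Return the most direct free link in a parsed Unpaywall record, or None."""
    if not record.get("is_oa"):
        return None
    best = record.get("best_oa_location") or {}
    others = record.get("oa_locations") or []
    # Prefer a direct PDF from any location, starting with Unpaywall's own pick.
    for loc in [best, *others]:
        if loc.get("url_for_pdf"):
            return loc["url_for_pdf"]
    # No direct PDF anywhere: settle for the landing page of the best location.
    return best.get("url")
```

You would call it on the parsed JSON, for example `best_url(resp.json())`, and treat a `None` result as "no free version found".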

How Does Unpaywall Work?

Unpaywall checks:

  1. Publisher websites — some papers become free after embargo
  2. Preprint servers — arXiv, bioRxiv, medRxiv
  3. Institutional repositories — university hosting
  4. Author pages — self-archived copies
  5. Government mandates — NIH-funded papers must be free

All of these are legal. Unpaywall doesn't do anything shady — it just knows where to look.
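
Each record also tells you where the free copies were found. Below is a rough sketch of grouping them, assuming the `oa_locations`, `host_type`, and `version` fields of the Unpaywall response. `sources_by_host` is a hypothetical helper, and the record is hand-written for illustration rather than real API output:

```python
from collections import defaultdict


def sources_by_host(record):
    """Group a record's free copies as {host_type: [(version, url), ...]}."""
    grouped = defaultdict(list)
    for loc in record.get("oa_locations") or []:
        host = loc.get("host_type") or "unknown"
        grouped[host].append((loc.get("version"), loc.get("url")))
    return dict(grouped)


# Made-up record: one publisher copy and two repository copies.
example = {
    "oa_locations": [
        {"host_type": "publisher", "version": "publishedVersion",
         "url": "https://publisher.example/article"},
        {"host_type": "repository", "version": "acceptedVersion",
         "url": "https://repo.example/handle/123"},
        {"host_type": "repository", "version": "submittedVersion",
         "url": "https://preprints.example/abs/456"},
    ]
}

for host, copies in sources_by_host(example).items():
    print(host, len(copies))
```

The `version` value is handy when you care whether you are reading the final published text or an earlier manuscript.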

Batch Check Your Reading List

Got 50 DOIs from a Crossref search? Check them all:

import time

import requests

dois = [
    "10.1038/nature12373",
    "10.1126/science.1252229",
    "10.1016/j.cell.2014.11.021",
    "10.1073/pnas.1318679111",
    "10.1038/nbt.3122",
]

for doi in dois:
    resp = requests.get(f"https://api.unpaywall.org/v2/{doi}?email=you@email.com", timeout=10)
    data = resp.json()
    status = "FREE" if data.get("is_oa") else "PAID"
    title = (data.get("title") or "")[:50]  # title can be null
    print(f"[{status}] {title}")
    if data.get("is_oa"):
        loc = data["best_oa_location"]
        # url_for_pdf is null when there is no direct PDF; fall back to the landing page
        print(f"  → {loc.get('url_for_pdf') or loc.get('url')}")
    time.sleep(1)  # be polite

Sample output:

[FREE] RNA-guided human genome engineering via Cas9
  → https://europepmc.org/articles/pmc3969860?pdf=render
[FREE] Programmable editing of a target base in genomic DNA
  → https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4873371
[PAID] Genome engineering using the CRISPR-Cas9 system
[FREE] Highly efficient Cas9-mediated transcriptional progr
  → https://www.biorxiv.org/content/10.1101/005967v1.full.pdf

3 out of 5 papers are free. At $35 each, that's $105 saved.
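
The savings figure is just the number of free papers times a per-paper price. Here is a tiny sketch of that arithmetic. `summarize` is a hypothetical helper, and the $35 default is the price quoted at the top of this post, not something Unpaywall reports:

```python
def summarize(records, price_per_paper=35):
    """Count free papers and estimate what buying them would have cost."""
    free = sum(1 for r in records if r.get("is_oa"))
    return {
        "checked": len(records),
        "free": free,
        "saved_usd": free * price_per_paper,
    }
```

To use it, collect each `data` dict from the loop into a list and pass that list in.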

Combine With Crossref

Real workflow: search Crossref → check Unpaywall → download free PDFs:

import time

import requests

# Step 1: Find papers via Crossref
results = requests.get("https://api.crossref.org/works", params={
    "query": "CRISPR gene therapy", "rows": 10, "sort": "is-referenced-by-count"
}).json()["message"]["items"]

# Step 2: Check each DOI against Unpaywall
for item in results:
    doi = item["DOI"]
    title = item["title"][0][:50] if item.get("title") else "Untitled"

    oa = requests.get(f"https://api.unpaywall.org/v2/{doi}?email=you@email.com").json()
    status = "FREE" if oa.get("is_oa") else "PAID"
    print(f"[{status}] {title}")
    time.sleep(1)
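
The last step of that workflow, downloading the free PDFs, could look something like this. It's only a sketch: `doi_to_filename` and `download_pdf` are hypothetical helpers, the `papers` folder and the `User-Agent` string are arbitrary choices, and it uses only the standard library, though `requests` would work just as well:

```python
import re
import shutil
import urllib.request
from pathlib import Path


def doi_to_filename(doi):
    """Turn a DOI into a safe file name by replacing characters like '/'."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", doi) + ".pdf"


def download_pdf(url, doi, folder="papers", timeout=30):
    """Save `url` under `folder`; return the path, or None if it isn't a PDF."""
    dest = Path(folder) / doi_to_filename(doi)
    dest.parent.mkdir(parents=True, exist_ok=True)
    req = urllib.request.Request(url, headers={"User-Agent": "unpaywall-demo"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # Some links resolve to an HTML landing page rather than a PDF; skip those.
        if "pdf" not in resp.headers.get("Content-Type", "").lower():
            return None
        with open(dest, "wb") as fh:
            shutil.copyfileobj(resp, fh)
    return dest
```

Inside the loop you could call `download_pdf(url, doi)` whenever a record has a `url_for_pdf`, keeping the one-second pause between requests.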

The Numbers

According to Unpaywall's own data:

  • ~30% of all papers have a free version somewhere
  • For recent papers (2020+), it's closer to 50%
  • NIH-funded papers: >90% are free (by mandate)

I Built a Toolkit

unpaywall-toolkit — Python wrapper for the Unpaywall API:

from unpaywall_toolkit import UnpaywallClient

client = UnpaywallClient(email="you@email.com")

# Single check
result = client.check("10.1038/nature12373")

# Batch check + export to CSV
client.export_csv(["10.1038/nature12373", "10.1126/science.1252229"], "results.csv")

Part of the Research API Suite

All on GitHub. All free and open source.


How do you access paywalled papers? And do you use any tools to find free versions?
