Last week I shared 5 Free APIs That Changed How I Build Side Projects. It got way more attention than I expected.
Several people asked about research and academic APIs — so here are 5 more free APIs, focused on scholarly data.
All of them are completely free, four of the five need no API key (CORE needs a free one), and all return real, verified data (not AI-generated).
1. Crossref — 150M+ Scholarly Articles
Crossref is the largest DOI registration agency. Every Crossref-registered DOI has its metadata here.
```python
import requests

resp = requests.get("https://api.crossref.org/works", params={
    "query": "machine learning",
    "rows": 3,
    "mailto": "you@email.com"  # joins the "polite pool": faster, more reliable service
})
for item in resp.json()["message"]["items"]:
    print(item["title"][0])
    print(f"  Cited by {item['is-referenced-by-count']} papers")
```
Best for: Citation analysis, bibliography generation, DOI resolution
Docs: https://api.crossref.org
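Since DOI resolution is one of the headline uses, here's a minimal sketch of looking up a single DOI directly. The `crossref_work_url` helper is my own, and the DOI shown is Crossref's long-standing test DOI, used here purely as an illustration:

```python
import requests

def crossref_work_url(doi: str) -> str:
    """Single-work endpoint: metadata for one registered DOI."""
    return f"https://api.crossref.org/works/{doi}"

doi = "10.5555/12345678"  # Crossref's test DOI; swap in a real one
try:
    resp = requests.get(crossref_work_url(doi),
                        params={"mailto": "you@email.com"}, timeout=10)
    resp.raise_for_status()
    work = resp.json()["message"]
    print(work["title"][0])
except requests.RequestException as e:
    print("lookup failed:", e)
```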
2. OpenAlex — 250M+ Academic Works
The successor to Microsoft Academic Graph. Completely open.
```python
import requests

resp = requests.get("https://api.openalex.org/works", params={
    "search": "CRISPR gene therapy",
    "per-page": 3,
    "sort": "cited_by_count:desc"
})
for work in resp.json()["results"]:
    print(work["title"])
    print(f"  {work['cited_by_count']} citations | {work['publication_year']}")
```
Best for: Large-scale bibliometrics, author networks, institutional research
Docs: https://docs.openalex.org
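Beyond free-text search, OpenAlex supports structured queries through its documented `filter` parameter (comma-separated `key:value` pairs). A sketch, with a small helper of my own to build the filter string; the filter names come from the OpenAlex docs:

```python
import requests

def openalex_filter(filters: dict) -> str:
    """Join filters into OpenAlex's comma-separated key:value syntax."""
    return ",".join(f"{k}:{v}" for k, v in filters.items())

params = {
    "filter": openalex_filter({"publication_year": 2023, "is_oa": "true"}),
    "per-page": 3,
}
try:
    resp = requests.get("https://api.openalex.org/works", params=params, timeout=10)
    for work in resp.json()["results"]:
        print(work["display_name"])
except requests.RequestException as e:
    print("request failed:", e)
```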
3. PubMed (E-utilities) — 36M+ Medical Papers
The gold standard for biomedical literature. Run by the NIH's National Library of Medicine.
```python
import requests

BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# Step 1: search for matching PMIDs
search = requests.get(f"{BASE}/esearch.fcgi", params={
    "db": "pubmed", "term": "covid vaccine efficacy",
    "retmax": 3, "retmode": "json"
})
ids = search.json()["esearchresult"]["idlist"]

# Step 2: fetch summaries for those PMIDs
details = requests.get(f"{BASE}/esummary.fcgi", params={
    "db": "pubmed", "id": ",".join(ids), "retmode": "json"
})
result = details.json()["result"]
for uid in ids:
    print(f"{result[uid]['title'][:80]}...")
```
Best for: Medical research, clinical trials, drug discovery
Docs: https://www.ncbi.nlm.nih.gov/books/NBK25501/
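esummary returns titles and metadata only. For the abstracts themselves you can follow up with efetch (`rettype=abstract`, `retmode=text` are documented in the E-utilities guide). The helper below is my own sketch, and the PMID is a placeholder; substitute IDs from an esearch call:

```python
import requests

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def abstract_params(pmids):
    """E-utilities parameters for plain-text abstracts of the given PMIDs."""
    return {"db": "pubmed", "id": ",".join(pmids),
            "rettype": "abstract", "retmode": "text"}

pmids = ["00000000"]  # placeholder PMID: use real IDs from esearch
try:
    resp = requests.get(EFETCH, params=abstract_params(pmids), timeout=10)
    print(resp.text[:300])
except requests.RequestException as e:
    print("fetch failed:", e)
```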
4. arXiv — 2M+ Preprints
Where cutting-edge research appears first — before peer review.
```python
import urllib.parse
import xml.etree.ElementTree as ET

import requests

query = urllib.parse.quote("all:transformer attention mechanism")
resp = requests.get(f"http://export.arxiv.org/api/query?search_query={query}&max_results=3")

# The response is an Atom XML feed
ns = {"atom": "http://www.w3.org/2005/Atom"}
root = ET.fromstring(resp.text)
for entry in root.findall("atom:entry", ns):
    title = entry.find("atom:title", ns).text.strip()
    print(title[:80])
```
Best for: AI/ML papers, physics, math, CS — latest research before publication
Docs: https://info.arxiv.org/help/api
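Since the point of arXiv is seeing research first, you'll usually want the newest submissions rather than the default relevance order. The API documents `sortBy=submittedDate` and `sortOrder=descending` for exactly this; the URL-building helper is my own, and `cat:cs.LG` is arXiv's machine-learning category:

```python
import urllib.parse
import xml.etree.ElementTree as ET

import requests

def arxiv_query_url(query: str, max_results: int = 3) -> str:
    """Build an arXiv API URL sorted by newest submissions first."""
    params = urllib.parse.urlencode({
        "search_query": query,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    })
    return f"http://export.arxiv.org/api/query?{params}"

try:
    resp = requests.get(arxiv_query_url("cat:cs.LG"), timeout=10)
    ns = {"atom": "http://www.w3.org/2005/Atom"}
    for entry in ET.fromstring(resp.text).findall("atom:entry", ns):
        print(entry.find("atom:title", ns).text.strip()[:80])
except requests.RequestException as e:
    print("request failed:", e)
```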
5. CORE — 300M+ Open Access Articles
The world's largest collection of open access research papers, with full text for many of them.
```python
import requests

# CORE requires a free API key (quick signup)
resp = requests.get(
    "https://api.core.ac.uk/v3/search/works",
    params={"q": "renewable energy storage", "limit": 3},
    headers={"Authorization": "Bearer YOUR_KEY"},
)
for work in resp.json()["results"]:
    print(work["title"])
    print(f"  Full text: {'Yes' if work.get('fullText') else 'Abstract only'}")
```
Best for: Full-text analysis, open access research, text mining
Docs: https://core.ac.uk/services/api
Quick Comparison
| API | Records | Key? | Best For |
|---|---|---|---|
| Crossref | 150M+ | No | DOIs, citations |
| OpenAlex | 250M+ | No | Bibliometrics |
| PubMed | 36M+ | No | Medical research |
| arXiv | 2M+ | No | Preprints (AI/Physics) |
| CORE | 300M+ | Free key | Full text |
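All five speak plain HTTPS (JSON everywhere except arXiv's Atom XML), so one small shared helper covers the polite-usage advice above: set a timeout, and back off and retry when a service rate-limits you with HTTP 429. This is my own sketch, not part of any of these APIs:

```python
import time

import requests

def polite_get(url, params=None, headers=None, retries=3, backoff=2.0):
    """GET with a timeout, plus exponential backoff on errors or HTTP 429."""
    for attempt in range(retries):
        try:
            resp = requests.get(url, params=params, headers=headers, timeout=10)
            if resp.status_code != 429:  # not rate-limited: succeed or raise
                resp.raise_for_status()
                return resp
        except requests.RequestException:
            if attempt == retries - 1:
                raise
        time.sleep(backoff ** attempt)  # wait before the next try
    raise requests.HTTPError(f"still rate-limited after {retries} tries")
```

Use it as a drop-in replacement for `requests.get` in any of the snippets above.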
Open Source Toolkits
I built Python toolkits for each of these (links on my GitHub). All MIT licensed; use them however you want.
Which academic API are you using? Or is there one I missed? Drop a comment — I'm building more toolkits based on what people need.
Need custom research data pipelines? Check my GitHub or email me.