
Alex Spinov

8 Free APIs to Automate Your Research (With Python Examples)

Have you ever spent hours manually searching for papers, articles, or data?

I did — until I discovered these 8 free APIs that let you automate research with a few lines of Python. No API keys needed for most of them.

1. OpenAlex — 250M+ Academic Works

OpenAlex indexes 250 million scholarly works from every discipline.

import requests

resp = requests.get('https://api.openalex.org/works', params={
    'search': 'machine learning healthcare',
    'per-page': 5  # note: OpenAlex uses a hyphen, not an underscore
})
for work in resp.json()['results']:
    print(f"{work['title']}")
    print(f"  Cited by: {work['cited_by_count']}")
    print(f"  DOI: {work.get('doi', 'N/A')}\n")

No API key required. Just send requests.

Full toolkit on GitHub


2. Crossref — 150M+ Scholarly Articles

Crossref powers DOI resolution. Their API gives you metadata on 150M+ articles.

resp = requests.get('https://api.crossref.org/works', params={
    'query': 'artificial intelligence drug discovery',
    'rows': 5
})
for item in resp.json()['message']['items']:
    title = item.get('title', ['No title'])[0]
    print(f"{title}")
    print(f"  Publisher: {item.get('publisher')}")
    print(f"  Citations: {item.get('is-referenced-by-count', 0)}\n")

Free, no key needed. Add a mailto param to your requests to join Crossref's "polite pool" for faster, more reliable responses.



3. PubMed (NCBI E-utilities) — 36M+ Medical Papers

The gold standard for biomedical literature.

# Search PubMed
resp = requests.get('https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi', params={
    'db': 'pubmed',
    'term': 'CRISPR gene therapy 2024',
    'retmax': 5,
    'retmode': 'json'
})
ids = resp.json()['esearchresult']['idlist']
print(f"Top {len(ids)} PMIDs: {ids}")  # capped at retmax, not the total hit count

Free, no key needed (registering for an API key raises the rate limit from 3 to 10 requests per second).
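Note that esearch only returns PubMed IDs (PMIDs). To get the actual records, you pass those IDs to the companion efetch endpoint. A minimal sketch; fetch_abstracts and efetch_params are my own helper names, not part of E-utilities:

```python
import requests

EUTILS = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils'

def efetch_params(ids):
    """Build the efetch query for a list of PubMed IDs (PMIDs)."""
    return {
        'db': 'pubmed',
        'id': ','.join(ids),     # efetch takes comma-separated IDs
        'rettype': 'abstract',
        'retmode': 'text',
    }

def fetch_abstracts(ids):
    """Fetch plain-text abstracts for the PMIDs returned by esearch."""
    resp = requests.get(f'{EUTILS}/efetch.fcgi', params=efetch_params(ids))
    resp.raise_for_status()
    return resp.text
```

Pass the idlist from the esearch call above straight into fetch_abstracts(ids).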



4. Semantic Scholar — AI-Powered Paper Search

Built by AI2, it gives you AI-generated summaries (TLDRs) of papers.

resp = requests.get('https://api.semanticscholar.org/graph/v1/paper/search', params={
    'query': 'large language models reasoning',
    'limit': 5,
    'fields': 'title,abstract,citationCount,tldr'
})
for paper in resp.json().get('data', []):
    print(f"{paper['title']}")
    if paper.get('tldr'):
        print(f"  TLDR: {paper['tldr']['text'][:100]}...")
    print(f"  Citations: {paper.get('citationCount', 0)}\n")

Free tier: 100 requests/5min. No key for basic access.



5. arXiv — 2.4M+ Preprints

Physics, CS, math, biology — fresh preprints before peer review.

import urllib.parse

query = urllib.parse.quote('all:transformer attention mechanism')
resp = requests.get(f'http://export.arxiv.org/api/query?search_query={query}&max_results=3')
print(resp.text[:500])  # XML response with paper metadata

Completely free. Returns XML (use feedparser for easy parsing).
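If you'd rather skip the feedparser dependency, the Atom XML parses fine with the standard library. A sketch (parse_arxiv_feed is my own helper name):

```python
import xml.etree.ElementTree as ET

# Atom namespace used by the arXiv API feed
ATOM = '{http://www.w3.org/2005/Atom}'

def parse_arxiv_feed(xml_text):
    """Pull title, summary, and link out of an arXiv API Atom response."""
    root = ET.fromstring(xml_text)
    return [
        {
            # titles often contain hard line breaks; collapse the whitespace
            'title': ' '.join(entry.findtext(f'{ATOM}title', '').split()),
            'summary': entry.findtext(f'{ATOM}summary', '').strip(),
            'link': entry.findtext(f'{ATOM}id', '').strip(),
        }
        for entry in root.iter(f'{ATOM}entry')
    ]
```

Then papers = parse_arxiv_feed(resp.text) gives you a list of dicts instead of raw XML.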



6. CORE — 300M+ Open Access Papers

The world's largest collection of open access research.

resp = requests.get('https://api.core.ac.uk/v3/search/works', params={
    'q': 'climate change renewable energy',
    'limit': 5
}, headers={'Authorization': 'Bearer YOUR_FREE_KEY'})
# Free API key from core.ac.uk
for work in resp.json().get('results', []):
    print(f"{work['title']}")

Free API key — register at core.ac.uk/services/api.



7. Unpaywall — Find Free PDFs Legally

Given a DOI, tells you if a legal free PDF exists.

doi = '10.1038/s41586-021-03819-2'
resp = requests.get(f'https://api.unpaywall.org/v2/{doi}', params={
    'email': 'your@email.com'
})
data = resp.json()
if data.get('is_oa'):
    loc = data['best_oa_location']
    # url_for_pdf can be null when only an HTML version is free
    print(f"Free copy: {loc['url_for_pdf'] or loc['url']}")
else:
    print('No free version available')

Free; it just needs your email as a query parameter.



8. Wikipedia — Structured Knowledge

Often overlooked, Wikipedia's API is incredibly powerful for quick facts.

resp = requests.get('https://en.wikipedia.org/api/rest_v1/page/summary/Machine_learning')
data = resp.json()
print(f"{data['title']}")
print(f"{data['extract'][:200]}...")

No key needed and generous limits; be reasonable and send a descriptive User-Agent header.


Bonus: Combine Them All

The real power is combining APIs:

  1. Search with OpenAlex/Semantic Scholar
  2. Get metadata from Crossref
  3. Find free PDF via Unpaywall
  4. Get context from Wikipedia

I built toolkits for each of these — all open source on my GitHub.


Which API Would You Try First?

I'm curious — what research tasks would you automate? Drop a comment below!

If you found this useful, check out my full collection of research API toolkits.


Need custom web scraping or data extraction? Check my Apify actors or DM me.
