
Alex Spinov

8 Free APIs to Automate Your Research (With Python Examples)

Have you ever spent hours manually searching for papers, articles, or data?

I did — until I discovered these 8 free APIs that let you automate research with a few lines of Python. No API keys needed for most of them.

1. OpenAlex — 250M+ Academic Works

OpenAlex indexes 250 million scholarly works from every discipline.

import requests

resp = requests.get('https://api.openalex.org/works', params={
    'search': 'machine learning healthcare',
    'per-page': 5  # note: OpenAlex uses a hyphen, not an underscore
})
for work in resp.json()['results']:
    print(f"{work['title']}")
    print(f"  Cited by: {work['cited_by_count']}")
    print(f"  DOI: {work.get('doi', 'N/A')}\n")

No API key required. Just send requests.

Full toolkit on GitHub


2. Crossref — 150M+ Scholarly Articles

Crossref powers DOI resolution. Their API gives you metadata on 150M+ articles.

resp = requests.get('https://api.crossref.org/works', params={
    'query': 'artificial intelligence drug discovery',
    'rows': 5
})
for item in resp.json()['message']['items']:
    title = item.get('title', ['No title'])[0]
    print(f"{title}")
    print(f"  Publisher: {item.get('publisher')}")
    print(f"  Citations: {item.get('is-referenced-by-count', 0)}\n")

Free, no key needed. Add a mailto param to your requests to join Crossref's "polite pool" for faster, more reliable responses.



3. PubMed (NCBI E-utilities) — 36M+ Medical Papers

The gold standard for biomedical literature.

# Search PubMed
resp = requests.get('https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi', params={
    'db': 'pubmed',
    'term': 'CRISPR gene therapy 2024',
    'retmax': 5,
    'retmode': 'json'
})
ids = resp.json()['esearchresult']['idlist']
print(f"Top {len(ids)} PMIDs: {ids}")  # capped at retmax, not the total hit count

Free, no key needed (registering for an API key raises the rate limit from 3 to 10 requests per second).
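Note that esearch only returns PubMed IDs (PMIDs). To get the actual records, you pass those IDs to the companion efetch endpoint. A minimal sketch; fetch_abstracts and efetch_params are my own helper names, not part of E-utilities:

```python
import requests

EUTILS = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils'

def efetch_params(ids):
    """Build the efetch query for a list of PubMed IDs (PMIDs)."""
    return {
        'db': 'pubmed',
        'id': ','.join(ids),     # efetch takes comma-separated IDs
        'rettype': 'abstract',
        'retmode': 'text',
    }

def fetch_abstracts(ids):
    """Fetch plain-text abstracts for the PMIDs returned by esearch."""
    resp = requests.get(f'{EUTILS}/efetch.fcgi', params=efetch_params(ids))
    resp.raise_for_status()
    return resp.text
```

Pass the idlist from the esearch call above straight into fetch_abstracts(ids).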



4. Semantic Scholar — AI-Powered Paper Search

Built by AI2, it gives you AI-generated summaries (TLDRs) of papers.

resp = requests.get('https://api.semanticscholar.org/graph/v1/paper/search', params={
    'query': 'large language models reasoning',
    'limit': 5,
    'fields': 'title,abstract,citationCount,tldr'
})
for paper in resp.json().get('data', []):
    print(f"{paper['title']}")
    if paper.get('tldr'):
        print(f"  TLDR: {paper['tldr']['text'][:100]}...")
    print(f"  Citations: {paper.get('citationCount', 0)}\n")

Free tier: 100 requests/5min. No key for basic access.



5. arXiv — 2.4M+ Preprints

Physics, CS, math, biology — fresh preprints before peer review.

import urllib.parse

query = urllib.parse.quote('all:transformer attention mechanism')
resp = requests.get(f'http://export.arxiv.org/api/query?search_query={query}&max_results=3')
print(resp.text[:500])  # XML response with paper metadata

Completely free. Returns XML (use feedparser for easy parsing).
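If you'd rather skip the feedparser dependency, the Atom XML parses fine with the standard library. A sketch (parse_arxiv_feed is my own helper name):

```python
import xml.etree.ElementTree as ET

# Atom namespace used by the arXiv API feed
ATOM = '{http://www.w3.org/2005/Atom}'

def parse_arxiv_feed(xml_text):
    """Pull title, summary, and link out of an arXiv API Atom response."""
    root = ET.fromstring(xml_text)
    return [
        {
            # titles often contain hard line breaks; collapse the whitespace
            'title': ' '.join(entry.findtext(f'{ATOM}title', '').split()),
            'summary': entry.findtext(f'{ATOM}summary', '').strip(),
            'link': entry.findtext(f'{ATOM}id', '').strip(),
        }
        for entry in root.iter(f'{ATOM}entry')
    ]
```

Then papers = parse_arxiv_feed(resp.text) gives you a list of dicts instead of raw XML.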



6. CORE — 300M+ Open Access Papers

The world's largest collection of open access research.

resp = requests.get('https://api.core.ac.uk/v3/search/works', params={
    'q': 'climate change renewable energy',
    'limit': 5
}, headers={'Authorization': 'Bearer YOUR_FREE_KEY'})
# Free API key from core.ac.uk
for work in resp.json().get('results', []):
    print(f"{work['title']}")

Free API key — register at core.ac.uk/services/api.



7. Unpaywall — Find Free PDFs Legally

Given a DOI, tells you if a legal free PDF exists.

doi = '10.1038/s41586-021-03819-2'
resp = requests.get(f'https://api.unpaywall.org/v2/{doi}', params={
    'email': 'your@email.com'
})
data = resp.json()
if data.get('is_oa'):
    loc = data['best_oa_location']
    # url_for_pdf can be null when only an HTML version is free
    print(f"Free copy: {loc['url_for_pdf'] or loc['url']}")
else:
    print('No free version available')

Free; it just needs your email as a query parameter.



8. Wikipedia — Structured Knowledge

Often overlooked, Wikipedia's API is incredibly powerful for quick facts.

resp = requests.get('https://en.wikipedia.org/api/rest_v1/page/summary/Machine_learning')
data = resp.json()
print(f"{data['title']}")
print(f"{data['extract'][:200]}...")

No key needed and generous limits; be reasonable and send a descriptive User-Agent header.


Bonus: Combine Them All

The real power is combining APIs:

  1. Search with OpenAlex/Semantic Scholar
  2. Get metadata from Crossref
  3. Find free PDF via Unpaywall
  4. Get context from Wikipedia

I built toolkits for each of these — all open source on my GitHub.


Which API Would You Try First?

I'm curious — what research tasks would you automate? Drop a comment below!

If you found this useful, check out my full collection of research API toolkits.


Need custom web scraping or data extraction? Check my Apify actors or DM me.
