Alex Spinov

OpenAlex API: Search 250M+ Research Papers for Free (No API Key Needed)

Most researchers pay for Scopus ($10,000+/year) or Web of Science ($50,000+/year) to search academic literature. But OpenAlex, a free, fully open index of 250M+ scholarly works, gives you much of the same data through a simple REST API. No API key. Generous rate limits. No paywall.

I replaced a $500/month research data pipeline with 30 lines of Python using this API. Here's how.

What Is OpenAlex?

OpenAlex is a free, open catalog of the global research system. It indexes:

  • 250M+ works (papers, books, datasets)
  • 100M+ authors with citation metrics
  • 100K+ institutions worldwide
  • 50K+ journals and conferences
  • Topics, concepts, and citation graphs

Think of it as a free alternative to Scopus, Web of Science, and Google Scholar — but with a proper API.
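Search isn't the only entry point. Per the OpenAlex docs, a single record can also be fetched directly by its OpenAlex ID (e.g. `/works/W2741809807`) or by DOI. A minimal sketch; the DOI in the usage note is a made-up placeholder:

```python
import json
import urllib.request

BASE = "https://api.openalex.org"

def work_url(doi):
    """Build the detail-lookup URL for one work, addressed by its DOI."""
    return f"{BASE}/works/https://doi.org/{doi}"

def fetch_work(doi):
    """Fetch the full metadata record for a single paper."""
    with urllib.request.urlopen(work_url(doi)) as response:
        return json.loads(response.read())
```

Calling `fetch_work("10.1234/example")` (hypothetical DOI) returns the same JSON shape as one entry of the `results` array you'll see in the search examples below.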

Quick Start: Search Papers in 5 Lines

import urllib.request
import urllib.parse
import json

url = "https://api.openalex.org/works?search=large+language+models&sort=cited_by_count:desc&per_page=5"
response = urllib.request.urlopen(url)
papers = json.loads(response.read())

for paper in papers["results"]:
    print(f"{paper['title']}")
    print(f"  Citations: {paper['cited_by_count']}")
    print(f"  Year: {paper['publication_year']}")
    print(f"  DOI: {paper.get('doi')}\n")

No pip install. No API key. Just stdlib Python.

5 Practical Use Cases

1. Find the Most-Cited Papers on Any Topic

def top_papers(topic, limit=10):
    query = urllib.parse.quote_plus(topic)  # spaces etc. must be URL-encoded
    url = f"https://api.openalex.org/works?search={query}&sort=cited_by_count:desc&per_page={limit}"
    response = urllib.request.urlopen(url)
    data = json.loads(response.read())
    return [(p["title"], p["cited_by_count"], p["publication_year"]) for p in data["results"]]

for title, citations, year in top_papers("artificial intelligence"):
    print(f"[{year}] {title} - {citations} citations")

2. Track Research Trends Over Time

def trend(topic, start_year=2020, end_year=2025):
    query = urllib.parse.quote_plus(topic)
    for year in range(start_year, end_year + 1):
        url = f"https://api.openalex.org/works?search={query}&filter=publication_year:{year}&per_page=1"
        response = urllib.request.urlopen(url)
        data = json.loads(response.read())
        print(f"{year}: {data['meta']['count']:,} papers")

trend("large language models")
# 2020: 1,204 papers
# 2021: 2,891 papers
# 2022: 8,445 papers
# 2023: 28,102 papers  <- explosion
# 2024: 61,334 papers

This alone replaces expensive research analytics tools.
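If firing one request per year feels wasteful, the API's `group_by` parameter returns all the yearly counts in a single call. Here's a sketch under that assumption; `counts_by_year` and `trend_one_request` are my own helper names, not part of the API:

```python
import json
import urllib.parse
import urllib.request

def counts_by_year(data):
    """Convert a group_by response payload into a {year: count} dict."""
    return {int(g["key"]): g["count"] for g in data["group_by"]}

def trend_one_request(topic):
    """Yearly paper counts for a topic, fetched in a single API call."""
    query = urllib.parse.quote_plus(topic)
    url = f"https://api.openalex.org/works?search={query}&group_by=publication_year"
    data = json.loads(urllib.request.urlopen(url).read())
    return counts_by_year(data)
```

One request instead of one per year, which also keeps you well inside the rate limits when you track many topics.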

3. Map Author Networks

def author_info(name):
    query = urllib.parse.quote_plus(name)
    url = f"https://api.openalex.org/authors?search={query}&per_page=1"
    response = urllib.request.urlopen(url)
    data = json.loads(response.read())
    if data["results"]:
        a = data["results"][0]
        return {
            "name": a["display_name"],
            "works": a["works_count"],
            "citations": a["cited_by_count"],
            "h_index": a["summary_stats"]["h_index"],
        }
    return None  # no matching author

print(author_info("Yann LeCun"))

4. Competitive Intelligence for Startups

def company_research(name):
    query = urllib.parse.quote_plus(name)
    url = f"https://api.openalex.org/institutions?search={query}&per_page=1"
    response = urllib.request.urlopen(url)
    data = json.loads(response.read())
    if data["results"]:
        inst = data["results"][0]
        inst_id = inst["id"].split("/")[-1]  # bare ID from the full OpenAlex URL
        papers_url = f"https://api.openalex.org/works?filter=authorships.institutions.id:{inst_id},publication_year:2024-2025&sort=cited_by_count:desc&per_page=5"
        response = urllib.request.urlopen(papers_url)
        papers = json.loads(response.read())
        print(f"{inst['display_name']}: {inst['works_count']:,} total works")
        for p in papers["results"]:
            print(f"  {p['title']} ({p['cited_by_count']} citations)")

company_research("Google DeepMind")

5. Build a Research Dashboard

def dashboard(topics):
    for topic in topics:
        query = urllib.parse.quote_plus(topic)
        url = f"https://api.openalex.org/works?search={query}&filter=publication_year:2025&per_page=1"
        response = urllib.request.urlopen(url)
        data = json.loads(response.read())
        print(f"{topic:30} | {data['meta']['count']:>8,} papers in 2025")

dashboard(["large language models", "computer vision", "quantum computing", "web scraping"])

API Endpoints Cheat Sheet

| Endpoint | What it returns | Example |
| --- | --- | --- |
| `/works` | Papers, books, datasets | `?search=topic&sort=cited_by_count:desc` |
| `/authors` | Researchers with h-index | `?search=name` |
| `/institutions` | Universities, companies | `?search=MIT` |
| `/topics` | Research topics with trends | `?search=machine+learning` |
| `/sources` | Journals, conferences | `?search=Nature` |
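One more thing worth knowing before you script against these list endpoints: plain `page=N` paging only reaches so deep (the docs cap it around 10,000 results), so for large pulls OpenAlex offers cursor paging. You start with `cursor=*` and follow `meta.next_cursor` until it comes back empty. A sketch of the plumbing; `next_page_url` is my own helper name:

```python
import urllib.parse

def next_page_url(base_url, meta):
    """Build the URL for the next page of results, or None when exhausted."""
    cursor = meta.get("next_cursor")
    if not cursor:
        return None
    return f"{base_url}&cursor={urllib.parse.quote_plus(cursor)}"
```

Make the first request with `&cursor=*` appended, then keep calling `next_page_url(base_url, data["meta"])` and fetching until it returns `None`.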

Rate Limits and Best Practices

  • No API key needed for basic usage
  • Add a mailto=your@email.com parameter to join the "polite pool" (faster, more consistent response times)
  • Limits: roughly 10 requests/second and 100,000 requests/day
  • All responses are JSON
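Joining the polite pool is just one query parameter, so it's worth wiring into every request. A small helper sketch; the address is a placeholder you'd swap for your own:

```python
import urllib.parse

def polite(url, mailto="you@example.com"):
    """Append the mailto parameter that puts requests in the polite pool."""
    sep = "&" if "?" in url else "?"
    return f"{url}{sep}mailto={urllib.parse.quote_plus(mailto)}"
```

Wrap any URL from the examples above, e.g. `urllib.request.urlopen(polite(url))`.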

When NOT to Use OpenAlex

  • You need full-text PDFs (metadata only)
  • You need real-time data (1-2 day delay)
  • You need patent data (use Google Patents instead)
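A partial workaround for the first point: OpenAlex doesn't host PDFs, but each work record carries an `open_access` block whose `oa_url` field (when `is_oa` is true) links to a free copy hosted elsewhere. A small sketch, assuming that field layout:

```python
def oa_link(work):
    """Return a link to a free full-text copy, if OpenAlex knows of one."""
    oa = work.get("open_access") or {}
    return oa.get("oa_url") if oa.get("is_oa") else None
```

You still have to download and parse the PDF yourself, but at least the pointer is free.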

Conclusion

OpenAlex gives you what Scopus charges $10,000/year for — free. The API is clean, fast, and requires zero setup.

Full code examples: github.com/spinov001-art


What research API do you use? Have you tried OpenAlex? Let me know in the comments — I am building a collection of free research data tools.
