DEV Community

Alex Spinov

OpenAlex Has a Free API — Search 250M+ Academic Works Without Any Key

Ever tried to build something with academic data? Google Scholar blocks you, Scopus costs thousands, and Web of Science requires institutional access.

Then I found OpenAlex — a completely free, open-source catalog of 250M+ academic works, 90M+ authors, and 100K+ institutions. No API key. No authentication. And the rate limits are generous: around 100,000 requests per day.

Let me show you what you can actually build with it.

What Is OpenAlex?

OpenAlex is the open replacement for Microsoft Academic Graph (which shut down in 2021). It indexes:

  • 250M+ works (papers, books, datasets)
  • 90M+ authors with disambiguation
  • 100K+ institutions
  • 65K+ journals and venues
  • Citation graphs, concepts, topics — all linked

And it is 100% free. No API key needed.

Quick Start: Search Papers by Topic

import requests

# Search for papers about "transformer architecture"
url = "https://api.openalex.org/works"
params = {
    "search": "transformer architecture neural networks",
    "per-page": 5,
    "sort": "cited_by_count:desc"
}

response = requests.get(url, params=params)
data = response.json()

for work in data["results"]:
    title = work["title"]
    citations = work["cited_by_count"]
    year = work["publication_year"]
    doi = work.get("doi") or "No DOI"  # "doi" can be present but null
    print(f"[{year}] {title}")
    print(f"  Citations: {citations:,} | DOI: {doi}")
    print()

Output:

[2017] Attention Is All You Need
  Citations: 120,000+ | DOI: https://doi.org/10....

[2018] BERT: Pre-training of Deep Bidirectional Transformers
  Citations: 85,000+ | DOI: https://doi.org/10..

No key. No signup. Just works.

Find the Most-Cited Authors in Any Field

# Top authors in "machine learning"
url = "https://api.openalex.org/authors"
params = {
    "search": "machine learning",
    "sort": "cited_by_count:desc",
    "per-page": 5
}

response = requests.get(url, params=params)
for author in response.json()["results"]:
    name = author["display_name"]
    citations = author["cited_by_count"]
    works = author["works_count"]
    inst = author.get("last_known_institutions") or []
    inst_name = inst[0].get("display_name", "Unknown") if inst else "Unknown"
    print(f"{name} ({inst_name})")
    print(f"  {works:,} works | {citations:,} citations")

Track Citation Trends Over Time

# How many papers mention "large language models" per year?
url = "https://api.openalex.org/works"
params = {
    "search": "large language models",
    "group_by": "publication_year"
}

response = requests.get(url, params=params)
for group in response.json()["group_by"]:
    year = group["key"]
    count = group["count"]
    if int(year) >= 2019:
        bar = "█" * (count // 500)
        print(f"{year}: {count:>6,} papers {bar}")

This reveals the explosion of LLM research:

2019:  1,200 papers ██
2020:  2,800 papers █████
2021:  5,400 papers ██████████
2022: 12,000 papers ████████████████████████
2023: 35,000 papers ██████████████████████████████████████████

Build a Research Dashboard

Combine OpenAlex with other free APIs:

def research_landscape(topic):
    """Get a complete overview of any research topic."""
    base = "https://api.openalex.org"

    # Total papers
    works = requests.get(f"{base}/works", params={"search": topic}).json()
    total = works["meta"]["count"]

    # Top institutions
    inst = requests.get(f"{base}/institutions", 
                       params={"search": topic, "sort": "cited_by_count:desc", "per-page": 3}).json()

    # Growth trend
    trend = requests.get(f"{base}/works", 
                        params={"search": topic, "group_by": "publication_year"}).json()

    print(f"Topic: {topic}")
    print(f"Total papers: {total:,}")
    print(f"\nTop institutions:")
    for i in inst["results"][:3]:
        print(f"  - {i['display_name']} ({i['cited_by_count']:,} citations)")

    recent = [g for g in trend["group_by"] if int(g["key"]) >= 2023]
    if recent:
        print(f"\n2023+ papers: {sum(g['count'] for g in recent):,}")

research_landscape("artificial intelligence safety")

Why OpenAlex Over Alternatives?

| Feature          | OpenAlex     | Google Scholar | Scopus     | Semantic Scholar |
| ---------------- | ------------ | -------------- | ---------- | ---------------- |
| API key required | No           | No API         | Yes ($$$)  | Yes (free)       |
| Rate limits      | 100K/day     | Blocked        | Strict     | 100/5min         |
| Data download    | Full dump    | No             | No         | Yes              |
| Author IDs       | ORCID-linked | No             | Scopus ID  | S2 ID            |
| Open source      | Yes          | No             | No         | Partial          |
| Works indexed    | 250M+        | Unknown        | 90M+       | 200M+            |

OpenAlex wins on openness and scale.

What You Can Build

  1. Research trend tracker — monitor any field in real-time
  2. Author discovery tool — find experts in niche topics
  3. Citation network visualizer — map how ideas spread
  4. Literature review automation — find related papers systematically
  5. Grant landscape analyzer — see where funding goes
  6. Competitor intelligence for R&D — track what rival labs publish
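
A citation network (idea 3 above) is easy to seed, because every work record carries a referenced_works list of OpenAlex IDs. A minimal sketch — the work ID in the comment is a placeholder, substitute a real one:

```python
import requests

BASE = "https://api.openalex.org/works"

def fetch_work(openalex_id):
    # One GET per work, e.g. fetch_work("W...") with a real OpenAlex work ID
    resp = requests.get(f"{BASE}/{openalex_id}")
    resp.raise_for_status()
    return resp.json()

def outgoing_edges(work):
    # "id" and "referenced_works" hold full OpenAlex URLs; keep just the W-number
    src = work["id"].rsplit("/", 1)[-1]
    return [(src, ref.rsplit("/", 1)[-1]) for ref in work.get("referenced_works", [])]

# edges = outgoing_edges(fetch_work("W..."))  # feed these tuples into networkx, etc.
```

Repeat over the referenced IDs for a second hop and you have a small citation graph.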

The API Endpoints

  • /works — papers, articles, books, datasets
  • /authors — researchers with disambiguation
  • /institutions — universities, labs, companies
  • /sources — journals, conferences, repositories
  • /concepts — hierarchical topic taxonomy
  • /topics — fine-grained research topics
  • /funders — funding organizations
  • /publishers — publishing companies

All support filtering, sorting, grouping, and pagination.
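
The pagination pattern is the same for every endpoint: pass cursor=* to start, then follow meta.next_cursor until it is empty. A sketch, assuming the /works endpoint and an example filter string (check the docs for the full filter list):

```python
import requests

def page_params(filters, cursor="*", per_page=200):
    # filter syntax: comma-separated key:value pairs (comma means AND)
    return {"filter": filters, "per-page": per_page, "cursor": cursor}

def iter_works(filters, max_pages=5):
    cursor = "*"
    for _ in range(max_pages):
        data = requests.get("https://api.openalex.org/works",
                            params=page_params(filters, cursor)).json()
        yield from data["results"]
        cursor = data["meta"].get("next_cursor")
        if not cursor:  # no more pages
            break

# for w in iter_works("publication_year:2023,is_oa:true"):
#     print(w["title"])
```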

Pro Tips

  1. Add mailto=your@email.com to get into the polite pool (faster responses)
  2. Use filter instead of search for exact matches
  3. Download monthly snapshots from AWS S3 for bulk analysis
  4. Combine with Semantic Scholar for citation context
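
Tips 1 and 2 in code — a minimal sketch (the email is a placeholder for your own, and the filter string is just an example):

```python
import requests

def polite(params, email="you@example.com"):
    # "mailto" opts your requests into the polite pool; use your real address
    return {**params, "mailto": email}

# filter = exact matching; search = relevance-ranked full-text matching
params = polite({"filter": "publication_year:2024,is_oa:true", "per-page": 5})
# resp = requests.get("https://api.openalex.org/works", params=params).json()
```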

I have been building data collection tools for 2+ years. OpenAlex is hands-down the best free academic API I have found.

Have you used OpenAlex? What did you build with it? I would love to hear about your projects in the comments.


If you need custom data pipelines for academic research, check out my data collection tools on GitHub — I build scrapers and API wrappers for research workflows.

More Free Research APIs

This is part of my series on free APIs for researchers and data scientists:

  • OpenAlex API — 250M+ Academic Works
  • CORE API — 260M+ Scientific Papers
  • Crossref API — DOI Metadata for 150M+ Papers
  • Unpaywall API — Find Free Paper Versions
  • Europe PMC — 40M+ Biomedical Papers
  • World Bank API — GDP & Economic Data
  • ORCID API — 18M+ Researcher Profiles
  • DBLP API — 6M+ CS Publications
  • NASA APIs — 20+ Free Space Data APIs
  • FRED API — 800K+ US Economic Time Series
  • All 30+ Research APIs Mapped

Tools: Academic Research Toolkit on GitHub
