Alex Spinov

Posted on Mar 25 • Edited on Mar 26

OpenAlex Has a Free API — Search 250M+ Academic Works Without Any Key

#api #python #research #opensource

Ever tried to build something with academic data? Google Scholar blocks you, Scopus costs thousands, and Web of Science requires institutional access.

Then I found OpenAlex — a completely free, open-source catalog of 250M+ academic works, 90M+ authors, and 100K+ institutions. No API key. No authentication. The only limit is a generous 100K requests per day.

Let me show you what you can actually build with it.

What Is OpenAlex?

OpenAlex is the open replacement for Microsoft Academic Graph (which shut down in 2021). It indexes:

250M+ works (papers, books, datasets)
90M+ authors with disambiguation
100K+ institutions
65K+ journals and venues
Citation graphs, concepts, topics — all linked

And it is 100% free. No API key needed.

Quick Start: Search Papers by Topic

import requests

# Search for papers about "transformer architecture"
url = "https://api.openalex.org/works"
params = {
    "search": "transformer architecture neural networks",
    "per_page": 5,
    "sort": "cited_by_count:desc"
}

response = requests.get(url, params=params)
data = response.json()

for work in data["results"]:
    title = work["title"]
    citations = work["cited_by_count"]
    year = work["publication_year"]
    doi = work.get("doi") or "No DOI"  # the key is present but can be null
    print(f"[{year}] {title}")
    print(f"  Citations: {citations:,} | DOI: {doi}")
    print()

Output (illustrative, counts rounded):

[2017] Attention Is All You Need
  Citations: 120,000+ | DOI: https://doi.org/10....

[2018] BERT: Pre-training of Deep Bidirectional Transformers
  Citations: 85,000+ | DOI: https://doi.org/10....

No key. No signup. Just works.

Find the Most-Cited Authors in Any Field

# Top-cited authors matching "machine learning"
# (note: on /authors, search matches author names, not research fields)
url = "https://api.openalex.org/authors"
params = {
    "search": "machine learning",
    "sort": "cited_by_count:desc",
    "per_page": 5
}

response = requests.get(url, params=params)
for author in response.json()["results"]:
    name = author["display_name"]
    citations = author["cited_by_count"]
    works = author["works_count"]
    inst = author.get("last_known_institutions") or []  # may be missing, null, or empty
    inst_name = inst[0]["display_name"] if inst else "Unknown"
    print(f"{name} ({inst_name})")
    print(f"  {works:,} works | {citations:,} citations")

Track Citation Trends Over Time

# How many papers mention "large language models" per year?
url = "https://api.openalex.org/works"
params = {
    "search": "large language models",
    "group_by": "publication_year"
}

response = requests.get(url, params=params)
for group in response.json()["group_by"]:
    year = group["key"]
    count = group["count"]
    if int(year) >= 2019:
        bar = "█" * (count // 500)
        print(f"{year}: {count:>6,} papers {bar}")

This reveals the explosion of LLM research:

2019:  1,200 papers ██
2020:  2,800 papers █████
2021:  5,400 papers ██████████
2022: 12,000 papers ████████████████████████
2023: 35,000 papers ██████████████████████████████████████████
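
The bar chart shows the shape of the curve, but the year-over-year growth rate is often the number you actually want. Here is a minimal sketch that computes it from a group_by response. The helper name yearly_growth is my own, and the sample input simply reuses the illustrative counts printed above rather than live API data.

```python
def yearly_growth(groups, start_year=2019):
    """Return (year, count, pct_change_vs_previous_year) tuples, oldest first."""
    # Keep only numeric year keys at or after start_year, sorted ascending.
    counts = sorted(
        (int(g["key"]), g["count"])
        for g in groups
        if str(g["key"]).isdigit() and int(g["key"]) >= start_year
    )
    rows = []
    previous = None
    for year, count in counts:
        # Growth is undefined for the first year or when the previous count is zero.
        pct = None if not previous else round((count - previous) / previous * 100, 1)
        rows.append((year, count, pct))
        previous = count
    return rows


# Illustrative input shaped like response.json()["group_by"],
# using the sample counts from the output above (not live data).
sample = [
    {"key": "2023", "count": 35000},
    {"key": "2022", "count": 12000},
    {"key": "2021", "count": 5400},
    {"key": "2020", "count": 2800},
    {"key": "2019", "count": 1200},
]

for year, count, pct in yearly_growth(sample):
    change = "n/a" if pct is None else f"{pct:+.1f}%"
    print(f"{year}: {count:>6,} papers ({change})")
```

With real data you would pass response.json()["group_by"] straight into the function.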

Build a Research Dashboard

Combine several OpenAlex endpoints into a single overview:

def research_landscape(topic):
    """Get a complete overview of any research topic."""
    base = "https://api.openalex.org"

    # Total papers
    works = requests.get(f"{base}/works", params={"search": topic}).json()
    total = works["meta"]["count"]

    # Top institutions
    inst = requests.get(f"{base}/institutions", 
                       params={"search": topic, "sort": "cited_by_count:desc", "per_page": 3}).json()

    # Growth trend
    trend = requests.get(f"{base}/works", 
                        params={"search": topic, "group_by": "publication_year"}).json()

    print(f"Topic: {topic}")
    print(f"Total papers: {total:,}")
    print(f"\nTop institutions:")
    for i in inst["results"][:3]:
        print(f"  - {i['display_name']} ({i['cited_by_count']:,} citations)")

    recent = [g for g in trend["group_by"] if int(g["key"]) >= 2023]
    if recent:
        print(f"\n2023+ papers: {sum(g['count'] for g in recent):,}")

research_landscape("artificial intelligence safety")

Why OpenAlex Over Alternatives?

Feature	OpenAlex	Google Scholar	Scopus	Semantic Scholar
API Key Required	No	No API	Yes ($$$)	Yes (free)
Rate Limits	100K/day	Blocked	Strict	100/5min
Data Download	Full dump	No	No	Yes
Author IDs	ORCID-linked	No	Scopus ID	S2 ID
Open Source	Yes	No	No	Partial
Works Indexed	250M+	Unknown	90M+	200M+

OpenAlex wins on openness and scale.

What You Can Build

Research trend tracker — monitor any field in real-time
Author discovery tool — find experts in niche topics
Citation network visualizer — map how ideas spread
Literature review automation — find related papers systematically
Grant landscape analyzer — see where funding goes
Competitor intelligence for R&D — track what rival labs publish
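
To make the citation network idea concrete, here is a rough sketch of turning a batch of work records into an edge list. It assumes each record carries an id and a referenced_works list of OpenAlex URLs, which is how /works results look to me. The helper names and the sample IDs are made up for illustration.

```python
def short_id(openalex_url):
    """Turn 'https://openalex.org/W123' into 'W123'."""
    return openalex_url.rsplit("/", 1)[-1]


def citation_edges(works):
    """Return (citing, cited) pairs, keeping only edges between the given works."""
    known = {short_id(w["id"]) for w in works}
    edges = []
    for w in works:
        source = short_id(w["id"])
        # referenced_works may be missing or null, so fall back to an empty list.
        for ref in w.get("referenced_works") or []:
            target = short_id(ref)
            if target in known:
                edges.append((source, target))
    return edges


# Hypothetical records with made-up IDs, shaped like items in data["results"].
sample_works = [
    {"id": "https://openalex.org/W3",
     "referenced_works": ["https://openalex.org/W1", "https://openalex.org/W2"]},
    {"id": "https://openalex.org/W2",
     "referenced_works": ["https://openalex.org/W1", "https://openalex.org/W9"]},
    {"id": "https://openalex.org/W1", "referenced_works": []},
]

print(citation_edges(sample_works))
```

The resulting pairs can be fed into whatever graph library you prefer. Edges pointing outside the batch (W9 here) are dropped so the graph stays closed.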

The API Endpoints

/works — papers, articles, books, datasets
/authors — researchers with disambiguation
/institutions — universities, labs, companies
/sources — journals, conferences, repositories
/concepts — hierarchical topic taxonomy
/topics — fine-grained research topics
/funders — funding organizations
/publishers — publishing companies

All support filtering, sorting, grouping, and pagination.
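
Pagination is the one you will hit first once you need more than a single page. Below is a sketch of cursor paging as I understand it: send cursor=* on the first request, then keep passing back meta.next_cursor until it comes back empty. The function names and the max_pages safety cap are my own choices, and I use the standard library here so the snippet runs without extra dependencies (requests.get works just as well).

```python
import json
import urllib.parse
import urllib.request

BASE = "https://api.openalex.org"


def fetch_json(endpoint, params):
    """GET one page from the API and decode the JSON body."""
    url = f"{BASE}/{endpoint}?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_results(endpoint, params, fetch=fetch_json, max_pages=5):
    """Yield results one by one, following the cursor across pages."""
    cursor = "*"  # "*" asks for the first page
    for _ in range(max_pages):  # safety cap so a broad query cannot loop forever
        page = fetch(endpoint, {**params, "cursor": cursor})
        yield from page["results"]
        cursor = page["meta"].get("next_cursor")
        if not cursor:  # no cursor means there are no more pages
            break
```

Usage would look like: for work in iter_results("works", {"search": "graph neural networks", "per_page": 200}): print(work["title"]). The fetch argument exists mainly so you can swap in a stub when testing.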

Pro Tips

Add mailto=your@email.com to get into the polite pool (faster responses)
Use filter instead of search for exact matches
Download monthly snapshots from AWS S3 for bulk analysis
Combine with Semantic Scholar for citation context
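
The first two tips can be combined in one small helper. This is a sketch rather than an official client: build_params is a name I made up, the email address is a placeholder, and the two filter keys shown (publication_year and open_access.is_oa) are the ones I would reach for on /works.

```python
def build_params(filters=None, mailto=None, **extra):
    """Build a query-parameter dict with an optional filter string and mailto."""
    params = dict(extra)
    if filters:
        # Filters are sent as one comma-separated string of key:value pairs.
        params["filter"] = ",".join(f"{key}:{value}" for key, value in filters.items())
    if mailto:
        # Identifying yourself by email is what opts you into the polite pool.
        params["mailto"] = mailto
    return params


params = build_params(
    filters={"publication_year": 2023, "open_access.is_oa": "true"},
    mailto="you@example.com",  # placeholder address
    per_page=5,
)
print(params)
```

The resulting dict can be passed as params= to requests.get("https://api.openalex.org/works", ...) exactly like the earlier examples.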

I have been building data collection tools for 2+ years. OpenAlex is hands-down the best free academic API I have found.

Have you used OpenAlex? What did you build with it? I would love to hear about your projects in the comments.

If you need custom data pipelines for academic research, check out my data collection tools on GitHub. I build scrapers and API wrappers for research workflows.

More Free Research APIs

This is part of my series on free APIs for researchers and data scientists:

OpenAlex API — 250M+ Academic Works
CORE API — 260M+ Scientific Papers
Crossref API — DOI Metadata for 150M+ Papers
Unpaywall API — Find Free Paper Versions
Europe PMC — 40M+ Biomedical Papers
World Bank API — GDP & Economic Data
ORCID API — 18M+ Researcher Profiles
DBLP API — 6M+ CS Publications
NASA APIs — 20+ Free Space Data APIs
FRED API — 800K+ US Economic Time Series
All 30+ Research APIs Mapped

Tools: Academic Research Toolkit on GitHub

Need web scraping or data extraction? I've built 77+ production scrapers. Email spinov001@gmail.com — quote in 2 hours. Or try my ready-made Apify actors — no code needed.
