## Why OpenAlex?
Most developers know about Google Scholar, but few know about OpenAlex — a completely free, open API covering 250 million+ academic works, 100K+ journals, and 90M+ authors.

No API key required. Generous rate limits, with a faster "polite pool" if you identify yourself. No paywall.
I used it to build a research assistant that finds papers by topic, tracks citation networks, and discovers trending research areas — all with simple HTTP requests.
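A note on that "polite pool": per the OpenAlex docs, you opt in by sending your email address in a `mailto` query parameter (or in your User-Agent header), which gets you faster, more consistent response times. A minimal sketch (the email address is a placeholder):

```python
import requests

def polite_params(email, **params):
    """Add the mailto parameter that routes you to OpenAlex's polite pool."""
    return {**params, "mailto": email}

# Usage (replace with your real email):
# resp = requests.get("https://api.openalex.org/works",
#                     params=polite_params("you@example.com", search="transformers"))
```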
## Quick Start: Search Papers
```python
import requests

def search_papers(query, per_page=5):
    url = "https://api.openalex.org/works"
    params = {
        "search": query,
        "per_page": per_page,
        "sort": "cited_by_count:desc",  # most-cited first
    }
    resp = requests.get(url, params=params)
    data = resp.json()
    for work in data["results"]:
        title = work["title"]
        year = work["publication_year"]
        citations = work["cited_by_count"]
        doi = work.get("doi") or "N/A"  # "doi" can be present but null
        print(f"[{year}] {title}")
        print(f"  Citations: {citations} | DOI: {doi}")
        print()

search_papers("large language models")
```
Output:

```text
[2020] Language Models are Few-Shot Learners
  Citations: 28,451 | DOI: https://doi.org/10.48550/arxiv.2005.14165

[2023] LLaMA: Open and Efficient Foundation Language Models
  Citations: 8,234 | DOI: https://doi.org/10.48550/arxiv.2302.13971
```
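That call returns a single page. If you want everything a query matches, OpenAlex supports cursor pagination: pass `cursor=*` on the first request, then follow `meta.next_cursor` until it comes back empty. A sketch along those lines (the `max_pages` cap is just a safety valve for this example):

```python
import requests

def fetch_all(url, params, max_pages=3):
    """Walk an OpenAlex list endpoint with cursor pagination."""
    params = {**params, "per_page": 200, "cursor": "*"}
    results = []
    for _ in range(max_pages):
        data = requests.get(url, params=params).json()
        results.extend(data["results"])
        cursor = data["meta"].get("next_cursor")
        if not cursor:  # no more pages
            break
        params["cursor"] = cursor
    return results
```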
## Find Trending Topics
```python
def trending_concepts():
    url = "https://api.openalex.org/concepts"
    params = {
        "filter": "works_count:>1000",  # skip tiny niche concepts
        "sort": "works_count:desc",
        "per_page": 10,
    }
    resp = requests.get(url, params=params)
    for concept in resp.json()["results"]:
        name = concept["display_name"]
        count = concept["works_count"]
        print(f"{name}: {count:,} works")

trending_concepts()
```
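Sorting by `works_count` ranks cumulative popularity, not growth. To see an actual trajectory, you can count a concept's works per year using the API's `group_by` parameter on `/works`; the sketch below assumes you already have a concept ID (every result from `/concepts` carries an `id` field):

```python
import requests

def works_per_year(concept_id):
    """Group a concept's works by publication year to chart its growth."""
    resp = requests.get(
        "https://api.openalex.org/works",
        params={
            "filter": f"concepts.id:{concept_id}",
            "group_by": "publication_year",
        },
    )
    groups = resp.json()["group_by"]  # each entry: {"key": year, "count": n}
    return sorted((g["key"], g["count"]) for g in groups)
```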
## Track an Author's Full Publication History
```python
def author_profile(name):
    url = "https://api.openalex.org/authors"
    params = {"search": name, "per_page": 1}
    resp = requests.get(url, params=params)
    results = resp.json()["results"]
    if not results:  # avoid an IndexError on a zero-hit search
        print(f"No author found for {name!r}")
        return
    author = results[0]
    print(f"Name: {author['display_name']}")
    print(f"Works: {author['works_count']}")
    print(f"Citations: {author['cited_by_count']}")
    print(f"h-index: {author['summary_stats']['h_index']}")

author_profile("Geoffrey Hinton")
```
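The author record also carries an `id`, which you can feed back into `/works` as an `author.id` filter to pull the actual publication list. A sketch, sorted newest-first:

```python
import requests

def author_works(author_id, n=5):
    """List an author's most recent works via the works endpoint."""
    resp = requests.get(
        "https://api.openalex.org/works",
        params={
            "filter": f"author.id:{author_id}",
            "sort": "publication_date:desc",  # newest first
            "per_page": n,
        },
    )
    return [(w["publication_year"], w["title"]) for w in resp.json()["results"]]
```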
## Build a Citation Network
```python
def citation_network(work_id):
    url = "https://api.openalex.org/works"
    params = {
        "filter": f"cites:{work_id}",  # works whose references include work_id
        "sort": "cited_by_count:desc",
        "per_page": 5,
    }
    resp = requests.get(url, params=params)
    papers = resp.json()["results"]
    print("Top papers citing this work:")
    for p in papers:
        print(f"  [{p['publication_year']}] {p['title']} ({p['cited_by_count']} citations)")
    return papers

# Find papers citing "Attention Is All You Need"
citation_network("W2741809807")
```
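The graph runs in both directions: besides asking who cites a work, you can fetch a single work by ID and read its `referenced_works` field to see what it cites. A minimal sketch:

```python
import requests

def references_of(work_id):
    """Return the OpenAlex IDs a given work cites (its outbound references)."""
    resp = requests.get(f"https://api.openalex.org/works/{work_id}")
    return resp.json().get("referenced_works", [])
```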
## Why This Beats Google Scholar Scraping
| Feature | OpenAlex API | Google Scholar |
|---|---|---|
| API Access | Free, open | No official API |
| Rate Limits | Generous (polite pool) | Will block you |
| Data Fields | 50+ per work | Title + snippet |
| Bulk Download | Yes (full dataset) | No |
| Citation Graph | Built-in | Scraping required |
| Author Profiles | Full h-index, institutions | Limited |
## Real Use Cases
- Literature Reviews — Find the most-cited papers in any field in seconds
- Trend Analysis — Track which research topics are growing fastest
- Due Diligence — Verify a researcher's publication record and impact
- Content Research — Find data and statistics backed by peer-reviewed sources
- Competitive Intelligence — See what research companies are publishing
## Endpoints Cheatsheet
| Endpoint | Returns | Example Filter |
|---|---|---|
| `/works` | Papers, articles | `search=machine+learning` |
| `/authors` | Researchers | `search=Yann+LeCun` |
| `/institutions` | Universities, labs | `country_code=US` |
| `/concepts` | Research topics | `works_count:>10000` |
| `/venues` | Journals, conferences | `search=Nature` |
| `/funders` | Grant organizations | `search=NSF` |
Full docs: docs.openalex.org
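Every row in the table above follows the same request shape, so a single helper covers all six endpoints. A sketch (`raise_for_status` just surfaces HTTP errors early):

```python
import requests

BASE = "https://api.openalex.org"

def openalex(endpoint, **params):
    """Hit any OpenAlex list endpoint with the same calling convention."""
    resp = requests.get(f"{BASE}/{endpoint.strip('/')}", params=params)
    resp.raise_for_status()
    return resp.json()

# e.g. openalex("institutions", filter="country_code:US", per_page=5)
```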
What academic APIs do you use? I'm building a collection of free research tools — drop your favorites in the comments.
If you work with data and APIs, I write practical tutorials every week. Follow for more.
More from me: 10 Dev Tools I Use Daily | 77 Scrapers on a Schedule | 150+ Free APIs