DEV Community

Alex Spinov

OpenAlex API: Search 250M+ Academic Papers for Free (No Key Required)

Last month I needed citation data for a research project. Google Scholar blocks scraping. Semantic Scholar has rate limits. Then I found OpenAlex — and it changed everything.

What Is OpenAlex?

OpenAlex is a free, open catalog of 250M+ academic papers, authors, institutions, and concepts. No API key. No authentication. No rate limits (well, 100K/day, but that's generous). It's maintained by the nonprofit OurResearch (the folks behind Unpaywall).

Think of it as the Wikipedia of academic metadata.

Why Should You Care?

If you work with:

  • Academic research — find papers, citations, co-author networks
  • Market research — track R&D trends by analyzing publication patterns
  • AI/ML — build training datasets from paper abstracts
  • Competitive intelligence — see what universities/companies are publishing

...then OpenAlex is your new best friend.

Quick Start: Your First API Call

No setup. No signup. Just curl:

curl "https://api.openalex.org/works?search=machine+learning&per-page=3"

That query matches about 3.7 million works; with per-page=3 the response returns just the first three. Each paper includes title, authors, DOI, citation count, abstract, topics, and more.
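One gotcha worth knowing up front: OpenAlex delivers abstracts as an abstract_inverted_index (a mapping of word → list of positions) rather than as plain text, so you rebuild the string yourself. A minimal sketch, assuming that field shape:

```python
def reconstruct_abstract(inverted_index):
    """Rebuild plain text from OpenAlex's abstract_inverted_index."""
    if not inverted_index:
        return ""
    # The index maps each word to every position it occupies in the abstract.
    positions = [(i, word) for word, idxs in inverted_index.items() for i in idxs]
    return " ".join(word for _, word in sorted(positions))

# Tiny sample in the same shape the API returns:
sample = {"Deep": [0], "learning": [1], "is": [2], "everywhere": [3]}
print(reconstruct_abstract(sample))  # Deep learning is everywhere
```

In a real response this is paper["abstract_inverted_index"], which can be None for works without an abstract — hence the guard.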

Real-World Example: Finding the Most-Cited ML Papers

import requests

# Search for machine learning papers, sorted by citations
url = "https://api.openalex.org/works"
params = {
    "search": "machine learning",
    "sort": "cited_by_count:desc",
    "per-page": 10
}

response = requests.get(url, params=params)
data = response.json()

print(f"Total papers found: {data['meta']['count']:,}")
print()

for paper in data["results"]:
    title = paper["title"]
    citations = paper["cited_by_count"]
    year = paper["publication_year"]
    print(f"{citations:,} citations | {year} | {title[:80]}")

Output:

Total papers found: 3,750,848

63,373 citations | 2011 | Scikit-learn: Machine Learning in Python
49,282 citations | 1989 | Genetic algorithms in search, optimization, and machine learning
42,418 citations | 2015 | Deep Learning

That took 0.3 seconds. No API key needed.

5 Powerful Things You Can Do

1. Track Research Trends Over Time

# How many AI papers per year?
for year in range(2020, 2027):
    url = f"https://api.openalex.org/works?filter=concepts.id:C154945302,publication_year:{year}"
    count = requests.get(url).json()["meta"]["count"]
    print(f"{year}: {count:,} AI papers")
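A single request can replace that whole loop: OpenAlex's group_by parameter returns per-year counts directly (/works?group_by=publication_year&filter=concepts.id:C154945302). The response swaps "results" for a "group_by" list of buckets. A sketch of parsing that shape — the counts below are placeholders, not real OpenAlex numbers:

```python
# Shape of a group_by response (counts are made up for illustration):
sample_response = {
    "group_by": [
        {"key": "2024", "key_display_name": "2024", "count": 295_000},
        {"key": "2023", "key_display_name": "2023", "count": 310_000},
    ]
}

def yearly_counts(resp):
    """Collapse an OpenAlex group_by response into {year: count}."""
    return {int(g["key"]): g["count"] for g in resp["group_by"]}

for year, count in sorted(yearly_counts(sample_response).items()):
    print(f"{year}: {count:,} AI papers")
```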

2. Find an Author's Full Publication List

# Search by author name
url = "https://api.openalex.org/authors?search=yann+lecun"
author = requests.get(url).json()["results"][0]
print(f"{author['display_name']}: {author['works_count']} papers, {author['cited_by_count']:,} citations")

3. Map Institution Research Output

# MIT's publications
url = "https://api.openalex.org/institutions?search=MIT"
mit = requests.get(url).json()["results"][0]
print(f"{mit['display_name']}: {mit['works_count']:,} papers")

4. Build a Citation Network

# Get papers that cite a specific paper
paper_id = "W2741809807"  # "Attention Is All You Need"
url = f"https://api.openalex.org/works?filter=cites:{paper_id}&per-page=5"
citing = requests.get(url).json()
print(f'{citing["meta"]["count"]:,} papers cite "Attention Is All You Need"')
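The cites filter walks citations inward (who cites this paper). Each work record also carries the outward direction: a referenced_works list of everything it cites, as full URL-style IDs (https://openalex.org/W...), which need trimming before they can be fed back into a cites filter. A small helper for that, assuming that ID shape:

```python
def strip_openalex_id(url_or_id):
    """OpenAlex responses use full-URL IDs; filters want the bare W... part."""
    return url_or_id.rsplit("/", 1)[-1]

def cites_url(work_id, per_page=5):
    """Build the query URL for papers that cite work_id (URL or bare ID)."""
    return (f"https://api.openalex.org/works"
            f"?filter=cites:{strip_openalex_id(work_id)}&per-page={per_page}")

print(cites_url("https://openalex.org/W2741809807"))
# https://api.openalex.org/works?filter=cites:W2741809807&per-page=5
```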

5. Export Data for Analysis

import csv

url = "https://api.openalex.org/works?search=web+scraping&per-page=50"
papers = requests.get(url).json()["results"]

with open("papers.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Title", "Year", "Citations", "DOI"])
    for p in papers:
        writer.writerow([p["title"], p["publication_year"], p["cited_by_count"], p.get("doi", "")])

print(f"Exported {len(papers)} papers to papers.csv")

OpenAlex vs Other Academic APIs

| Feature | OpenAlex | Google Scholar | Semantic Scholar | Scopus |
|---|---|---|---|---|
| API Key Required | ❌ No | ❌ No API | ✅ Yes | ✅ Yes |
| Rate Limit | 100K/day | Blocked | 100/5min | Varies |
| Papers | 250M+ | Unknown | 200M+ | 84M+ |
| Free | ✅ Yes | N/A | ✅ Yes | ❌ Paid |
| Open Source | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Bulk Download | ✅ Yes | ❌ No | ✅ Yes | ❌ No |

Pro Tips

  1. Add your email to get into the "polite pool" (faster responses):
    ?mailto=you@example.com

  2. Use filters instead of search for precise queries:
    ?filter=concepts.id:C41008148,publication_year:2024 (Computer Science papers from 2024)

  3. Pagination — use cursor paging for large result sets (basic page/offset paging caps out at 10,000 results)

  4. Group by — get aggregated stats:
    /works?group_by=publication_year&filter=concepts.id:C154945302
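Tip 3 in code: a generator that follows meta.next_cursor until it runs out. The fetcher argument exists only so the loop can be exercised without hitting the network; by default it falls back to requests.get. A sketch assuming the cursor shape from the docs:

```python
def fetch_all(url, params, fetcher=None):
    """Yield every result from an OpenAlex list endpoint via cursor paging."""
    if fetcher is None:
        import requests  # default to a live HTTP fetcher
        fetcher = requests.get
    params = dict(params, cursor="*")      # "*" asks for the first page
    while True:
        page = fetcher(url, params=params).json()
        yield from page["results"]
        cursor = page["meta"].get("next_cursor")
        if not cursor:                     # missing/null cursor: last page
            return
        params["cursor"] = cursor

# Live usage:
# for work in fetch_all("https://api.openalex.org/works",
#                       {"search": "web scraping", "per-page": 200}):
#     ...
```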

When NOT to Use OpenAlex

  • You need full-text PDFs → Use Unpaywall or Sci-Hub
  • You need patent data → Use Google Patents or Lens.org
  • You need real-time preprints → Use arXiv API (my previous article covers this)

Build Something With It

I used OpenAlex to build a research trend analyzer that tracks how AI subfields grow year over year. The entire thing is 50 lines of Python.

The data is there. It's free. No gatekeepers. Go build.


What would you build with 250M papers? Drop your idea in the comments — I'll pick the most interesting one and build a prototype.

More free API tutorials: My API series on Dev.to

Need custom data extraction? I build scrapers professionally. Contact me or check my 77 ready-made scrapers on Apify.
