## Why OpenAlex?
Most developers know about Google Scholar, but few know about OpenAlex — a completely free, open API covering 250 million+ academic works, 100K+ journals, and 90M+ authors.

No API key required. Generous rate limits, with a faster "polite pool" if you identify yourself. No paywall.
I used it to build a research assistant that finds papers by topic, tracks citation networks, and discovers trending research areas — all with simple HTTP requests.
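A note on that "polite pool": per the OpenAlex docs, you opt in by sending your email address in a `mailto` query parameter (or in your User-Agent header), which gets you faster, more consistent response times. A minimal sketch (the email address is a placeholder):

```python
import requests

def polite_params(email, **params):
    """Add the mailto parameter that routes you to OpenAlex's polite pool."""
    return {**params, "mailto": email}

# Usage (replace with your real email):
# resp = requests.get("https://api.openalex.org/works",
#                     params=polite_params("you@example.com", search="transformers"))
```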
## Quick Start: Search Papers
```python
import requests

def search_papers(query, per_page=5):
    url = "https://api.openalex.org/works"
    params = {
        "search": query,
        "per_page": per_page,
        "sort": "cited_by_count:desc",  # most-cited first
    }
    resp = requests.get(url, params=params)
    data = resp.json()
    for work in data["results"]:
        title = work["title"]
        year = work["publication_year"]
        citations = work["cited_by_count"]
        doi = work.get("doi") or "N/A"  # "doi" can be present but null
        print(f"[{year}] {title}")
        print(f"  Citations: {citations} | DOI: {doi}")
        print()

search_papers("large language models")
```
Output:

```text
[2020] Language Models are Few-Shot Learners
  Citations: 28,451 | DOI: https://doi.org/10.48550/arxiv.2005.14165

[2023] LLaMA: Open and Efficient Foundation Language Models
  Citations: 8,234 | DOI: https://doi.org/10.48550/arxiv.2302.13971
```
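That call returns a single page. If you want everything a query matches, OpenAlex supports cursor pagination: pass `cursor=*` on the first request, then follow `meta.next_cursor` until it comes back empty. A sketch along those lines (the `max_pages` cap is just a safety valve for this example):

```python
import requests

def fetch_all(url, params, max_pages=3):
    """Walk an OpenAlex list endpoint with cursor pagination."""
    params = {**params, "per_page": 200, "cursor": "*"}
    results = []
    for _ in range(max_pages):
        data = requests.get(url, params=params).json()
        results.extend(data["results"])
        cursor = data["meta"].get("next_cursor")
        if not cursor:  # no more pages
            break
        params["cursor"] = cursor
    return results
```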
## Find Trending Topics
```python
def trending_concepts():
    url = "https://api.openalex.org/concepts"
    params = {
        "filter": "works_count:>1000",  # skip tiny niche concepts
        "sort": "works_count:desc",
        "per_page": 10,
    }
    resp = requests.get(url, params=params)
    for concept in resp.json()["results"]:
        name = concept["display_name"]
        count = concept["works_count"]
        print(f"{name}: {count:,} works")

trending_concepts()
```
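Sorting by `works_count` ranks cumulative popularity, not growth. To see an actual trajectory, you can count a concept's works per year using the API's `group_by` parameter on `/works`; the sketch below assumes you already have a concept ID (every result from `/concepts` carries an `id` field):

```python
import requests

def works_per_year(concept_id):
    """Group a concept's works by publication year to chart its growth."""
    resp = requests.get(
        "https://api.openalex.org/works",
        params={
            "filter": f"concepts.id:{concept_id}",
            "group_by": "publication_year",
        },
    )
    groups = resp.json()["group_by"]  # each entry: {"key": year, "count": n}
    return sorted((g["key"], g["count"]) for g in groups)
```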
## Track an Author's Full Publication History
```python
def author_profile(name):
    url = "https://api.openalex.org/authors"
    params = {"search": name, "per_page": 1}
    resp = requests.get(url, params=params)
    results = resp.json()["results"]
    if not results:  # avoid an IndexError on a zero-hit search
        print(f"No author found for {name!r}")
        return
    author = results[0]
    print(f"Name: {author['display_name']}")
    print(f"Works: {author['works_count']}")
    print(f"Citations: {author['cited_by_count']}")
    print(f"h-index: {author['summary_stats']['h_index']}")

author_profile("Geoffrey Hinton")
```
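The author record also carries an `id`, which you can feed back into `/works` as an `author.id` filter to pull the actual publication list. A sketch, sorted newest-first:

```python
import requests

def author_works(author_id, n=5):
    """List an author's most recent works via the works endpoint."""
    resp = requests.get(
        "https://api.openalex.org/works",
        params={
            "filter": f"author.id:{author_id}",
            "sort": "publication_date:desc",  # newest first
            "per_page": n,
        },
    )
    return [(w["publication_year"], w["title"]) for w in resp.json()["results"]]
```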
## Build a Citation Network
```python
def citation_network(work_id):
    url = "https://api.openalex.org/works"
    params = {
        "filter": f"cites:{work_id}",  # works whose references include work_id
        "sort": "cited_by_count:desc",
        "per_page": 5,
    }
    resp = requests.get(url, params=params)
    papers = resp.json()["results"]
    print("Top papers citing this work:")
    for p in papers:
        print(f"  [{p['publication_year']}] {p['title']} ({p['cited_by_count']} citations)")
    return papers

# Find papers citing "Attention Is All You Need"
citation_network("W2741809807")
```
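The graph runs in both directions: besides asking who cites a work, you can fetch a single work by ID and read its `referenced_works` field to see what it cites. A minimal sketch:

```python
import requests

def references_of(work_id):
    """Return the OpenAlex IDs a given work cites (its outbound references)."""
    resp = requests.get(f"https://api.openalex.org/works/{work_id}")
    return resp.json().get("referenced_works", [])
```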
## Why This Beats Google Scholar Scraping
| Feature | OpenAlex API | Google Scholar |
|---|---|---|
| API Access | Free, open | No official API |
| Rate Limits | Generous (polite pool) | Will block you |
| Data Fields | 50+ per work | Title + snippet |
| Bulk Download | Yes (full dataset) | No |
| Citation Graph | Built-in | Scraping required |
| Author Profiles | Full h-index, institutions | Limited |
## Real Use Cases
- Literature Reviews — Find the most-cited papers in any field in seconds
- Trend Analysis — Track which research topics are growing fastest
- Due Diligence — Verify a researcher's publication record and impact
- Content Research — Find data and statistics backed by peer-reviewed sources
- Competitive Intelligence — See what research companies are publishing
## Endpoints Cheatsheet
| Endpoint | Returns | Example Filter |
|---|---|---|
| `/works` | Papers, articles | `search=machine+learning` |
| `/authors` | Researchers | `search=Yann+LeCun` |
| `/institutions` | Universities, labs | `country_code=US` |
| `/concepts` | Research topics | `works_count:>10000` |
| `/venues` | Journals, conferences | `search=Nature` |
| `/funders` | Grant organizations | `search=NSF` |
Full docs: docs.openalex.org
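Every row in the table above follows the same request shape, so a single helper covers all six endpoints. A sketch (`raise_for_status` just surfaces HTTP errors early):

```python
import requests

BASE = "https://api.openalex.org"

def openalex(endpoint, **params):
    """Hit any OpenAlex list endpoint with the same calling convention."""
    resp = requests.get(f"{BASE}/{endpoint.strip('/')}", params=params)
    resp.raise_for_status()
    return resp.json()

# e.g. openalex("institutions", filter="country_code:US", per_page=5)
```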
What academic APIs do you use? I'm building a collection of free research tools — drop your favorites in the comments.
If you work with data and APIs, I write practical tutorials every week. Follow for more.
More from me: 10 Dev Tools I Use Daily | 77 Scrapers on a Schedule | 150+ Free APIs