Semantic Scholar Has a Free API — It Gives You AI Summaries of Research Papers

#python #ai #api #machinelearning

Most academic search tools give you a title, an abstract, and a DOI. Semantic Scholar gives you something different: AI-generated TLDR summaries for millions of papers.

What Makes It Different

Semantic Scholar is built by the Allen Institute for AI (AI2). Unlike Crossref or OpenAlex, it doesn't just store metadata: it also runs NLP models over the papers themselves to summarize them, weigh their citations, and link them to related work.

Features you won't find elsewhere:

TLDR summaries — one-sentence AI summaries for millions of papers
Influence scores — not just citation count, but citation quality
Paper recommendations — "papers like this one"
Author disambiguation — AI resolves which "J. Smith" is which
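
As a rough sketch of how the first two features surface in the API: they are ordinary fields on a paper record, requested through the `fields` parameter. The helper name `paper_request` is hypothetical; `tldr` and `influentialCitationCount` are Graph API field names, and the paper ID is the "Attention Is All You Need" record.

```python
# Sketch: build a single-paper lookup that asks for the TLDR and the
# influential-citation count alongside the plain citation count.
GRAPH_BASE = "https://api.semanticscholar.org/graph/v1"

DEFAULT_FIELDS = ("title", "tldr", "citationCount", "influentialCitationCount")


def paper_request(paper_id, fields=DEFAULT_FIELDS):
    """Return (url, params) for a paper lookup. Hypothetical helper, no network call."""
    url = f"{GRAPH_BASE}/paper/{paper_id}"
    params = {"fields": ",".join(fields)}
    return url, params


url, params = paper_request("204e3073870fae3d05bcbc2f6a8e263d9b72e776")
print(url)
print(params["fields"])
```

The pair can be passed straight to `requests.get(url, params=params)`.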

Quick Example

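import requests

# Search papers and request only the fields we need, including the TLDR
resp = requests.get(
    "https://api.semanticscholar.org/graph/v1/paper/search",
    params={
        "query": "large language models reasoning",
        "limit": 3,
        "fields": "title,year,citationCount,tldr",
    },
    timeout=30,
)
resp.raise_for_status()  # surface rate-limit (429) and other HTTP errors

# Fall back to an empty list if the response carries no "data" key
for paper in resp.json().get("data", []):
    print(f"\n{paper['title']} ({paper['year']})")
    print(f"  Citations: {paper['citationCount']}")
    # tldr is null for papers that have no generated summary
    if paper.get("tldr"):
        print(f"  TLDR: {paper['tldr']['text']}")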

Sample output:

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (2022)
  Citations: 2847
  TLDR: Chain of thought prompting enables LLMs to solve complex reasoning tasks by generating intermediate steps.

Self-Consistency Improves Chain of Thought Reasoning in Language Models (2023)
  Citations: 1203
  TLDR: Sampling multiple reasoning paths and taking majority vote significantly improves accuracy.

That TLDR saved me from reading the abstract. For scanning 100 papers, this is invaluable.
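
Scanning 100 papers means paging through the search endpoint. Here is a minimal sketch, assuming the endpoint accepts `offset` and `limit` and returns results under `data`. The names `search_pages`, `fetch_page`, and `tldr_digest` are hypothetical, and the fetcher is injectable so the paging logic can be tried without touching the network.

```python
SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def fetch_page(query, offset, limit):
    """Fetch one page of search results from the live API."""
    import requests  # imported lazily so the helpers below work without it

    resp = requests.get(
        SEARCH_URL,
        params={"query": query, "offset": offset, "limit": limit,
                "fields": "title,year,tldr"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()


def search_pages(query, total=100, page_size=50, fetch=fetch_page):
    """Collect up to `total` papers by advancing the offset page by page."""
    papers = []
    while len(papers) < total:
        limit = min(page_size, total - len(papers))
        batch = fetch(query, len(papers), limit).get("data") or []
        papers.extend(batch)
        if len(batch) < limit:  # a short page means there are no more results
            break
    return papers


def tldr_digest(papers):
    """Return 'title (year): tldr' lines, skipping papers without a TLDR."""
    lines = []
    for paper in papers:
        tldr = paper.get("tldr")
        if tldr and tldr.get("text"):
            lines.append(f"{paper['title']} ({paper.get('year')}): {tldr['text']}")
    return lines
```

Calling `tldr_digest(search_pages("large language models reasoning"))` would then give a one-line-per-paper reading list.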

Get Paper Recommendations

# Find papers similar to "Attention Is All You Need"
paper_id = "204e3073870fae3d05bcbc2f6a8e263d9b72e776"
resp = requests.get(
    f"https://api.semanticscholar.org/recommendations/v1/papers/forpaper/{paper_id}",
    params={"limit": 5, "fields": "title,year,citationCount"},
    timeout=30,
)
resp.raise_for_status()  # surface rate-limit (429) and other HTTP errors

for rec in resp.json()["recommendedPapers"]:
    print(f"{rec['title']} ({rec['year']}) — {rec['citationCount']} citations")

This is like "Related Papers" on steroids — it uses AI to find semantically similar work, not just papers that cite each other.
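
The Recommendations API also has a POST variant that takes several seed papers at once. The sketch below assumes its JSON body uses `positivePaperIds` and `negativePaperIds`, so check the API reference before relying on it. The helper names `recommendation_body` and `recommend` are hypothetical.

```python
import json

RECS_URL = "https://api.semanticscholar.org/recommendations/v1/papers/"


def recommendation_body(liked, disliked=()):
    """Build the JSON body: papers to steer toward and papers to steer away from."""
    return {"positivePaperIds": list(liked), "negativePaperIds": list(disliked)}


def recommend(liked, disliked=(), limit=5):
    """POST the seed lists to the live API and return the recommended papers."""
    import requests  # imported lazily; only needed for the network call

    resp = requests.post(
        RECS_URL,
        params={"limit": limit, "fields": "title,year,citationCount"},
        json=recommendation_body(liked, disliked),
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["recommendedPapers"]


body = recommendation_body(["204e3073870fae3d05bcbc2f6a8e263d9b72e776"])
print(json.dumps(body, indent=2))
```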

Get Author Profiles

# Oren Etzioni's profile (author ID 1741101)
resp = requests.get(
    "https://api.semanticscholar.org/graph/v1/author/1741101",
    params={"fields": "name,hIndex,paperCount,citationCount"},
    timeout=30,
)
resp.raise_for_status()  # surface rate-limit (429) and other HTTP errors
author = resp.json()
print(f"{author['name']}")
print(f"  h-index: {author['hIndex']}")
print(f"  Papers: {author['paperCount']}")
print(f"  Citations: {author['citationCount']}")
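
To make the `hIndex` field less of a black box, here is the textbook h-index definition as a small function: the largest h such that h papers have at least h citations each. This is a sketch of the standard definition, not a claim about how Semantic Scholar computes its value. The function name `h_index` is hypothetical.

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break  # counts only get smaller from here
    return h


# Five papers with these citation counts give an h-index of 4:
# four of them have at least 4 citations, but not five with at least 5.
print(h_index([10, 8, 5, 4, 3]))
```

Per-paper counts to feed into it could come from the author's paper list, requested with `fields=citationCount`.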

Free vs Paid

Semantic Scholar offers:

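No key: works immediately, but all unauthenticated traffic shares one rate-limit pool, so HTTP 429 responses are possible at busy times
Free key (request form): your own limit, starting at about 1 request per second
Higher limits: available on request

For small scripts and prototypes, the no-key tier is usually enough.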
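
Whichever tier you are on, the API signals an exhausted limit with HTTP 429, so a retry wrapper is worth having. Below is a minimal sketch with exponential backoff. The function name `get_with_backoff` and its parameters are hypothetical, the delays are arbitrary starting points, and the key (if you have one) goes in the `x-api-key` header.

```python
import time


def get_with_backoff(url, params=None, api_key=None, max_tries=5,
                     get=None, sleep=time.sleep):
    """GET a URL, waiting 1s, 2s, 4s, ... between attempts while the API answers 429."""
    if get is None:
        import requests  # imported lazily; only needed for real network calls
        get = requests.get
    headers = {"x-api-key": api_key} if api_key else {}
    delay = 1.0
    for attempt in range(max_tries):
        resp = get(url, params=params, headers=headers, timeout=30)
        if resp.status_code != 429:
            resp.raise_for_status()  # any other HTTP error is not retried
            return resp
        if attempt < max_tries - 1:
            sleep(delay)
            delay *= 2
    raise RuntimeError(f"still rate limited after {max_tries} attempts")
```

It is a drop-in replacement for the `requests.get` calls in this post, for example `get_with_backoff(url, params={"fields": "title,tldr"})`.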

Academic API Comparison

API	Papers	TLDR	Recommendations	Full Text
Semantic Scholar	200M+	✅ Yes	✅ Yes	No
OpenAlex	250M+	No	No	No
Crossref	150M+	No	No	No
CORE	300M+	No	No	✅ Yes
PubMed	36M+	No	Similar articles	No

Each has its strength. I use Semantic Scholar for discovery, Crossref for DOI metadata, and CORE for full text.
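
The glue between these services is usually the DOI. As a sketch of the hand-off, assuming the Semantic Scholar record was requested with the `externalIds` field: look a paper up by DOI on one side and build the matching Crossref works URL on the other. The helper names and the DOI `10.1234/example` are placeholders, and CORE is left out because its API needs a key.

```python
from urllib.parse import quote


def s2_url_for_doi(doi):
    """Semantic Scholar accepts a DOI-prefixed identifier in place of its own paper ID."""
    return f"https://api.semanticscholar.org/graph/v1/paper/DOI:{doi}"


def crossref_url_for(paper):
    """Build the Crossref works URL for a Semantic Scholar record, or None without a DOI."""
    doi = (paper.get("externalIds") or {}).get("DOI")
    if not doi:
        return None
    return "https://api.crossref.org/works/" + quote(doi, safe="/")


# Placeholder record shaped like a Graph API response with fields=title,externalIds
paper = {"title": "Example", "externalIds": {"DOI": "10.1234/example"}}
print(s2_url_for_doi("10.1234/example"))
print(crossref_url_for(paper))
```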

All toolkits are on my GitHub.

Do you use Semantic Scholar? What's your workflow for finding relevant papers?
