DEV Community

Алексей Спинов

arXiv API: Search 2M+ Research Papers Programmatically (No Key)

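arXiv hosts more than 2 million papers and offers a completely free API. It needs no authentication or API key and returns structured Atom XML. There is no hard quota, but arXiv's API documentation asks clients to pause about 3 seconds between consecutive requests.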

Basic Search

curl 'http://export.arxiv.org/api/query?search_query=all:machine+learning&max_results=5'

The response is an Atom XML feed. Each entry includes the title, abstract, authors, categories, published date, and PDF link.
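The basic query can be narrowed with arXiv's field prefixes (for example `ti:` for title, `au:` for author, `cat:` for category) and paged with `start` and `max_results`. Here is a minimal sketch of a URL builder. The `buildArxivUrl` name and its options object are hypothetical and not part of the original post; it assumes every term should be combined with `AND`.

```javascript
// Hypothetical helper (not from the original post): builds an arXiv query URL.
// terms: array of { field, value } pairs, e.g. { field: 'ti', value: 'attention' }.
function buildArxivUrl({
  terms,
  start = 0,
  maxResults = 10,
  sortBy = 'submittedDate',
  sortOrder = 'descending',
}) {
  // Multi-word values are wrapped in double quotes so they are searched as a phrase.
  // Assumption: all terms are joined with AND.
  const searchQuery = terms
    .map(({ field, value }) => `${field}:${/\s/.test(value) ? `"${value}"` : value}`)
    .join(' AND ');

  // URLSearchParams takes care of percent-encoding the query string.
  const params = new URLSearchParams({
    search_query: searchQuery,
    start: String(start),
    max_results: String(maxResults),
    sortBy,
    sortOrder,
  });
  return `http://export.arxiv.org/api/query?${params.toString()}`;
}

// Usage: third page (5 results per page) of cs.LG papers with a phrase in the title.
const pageUrl = buildArxivUrl({
  terms: [
    { field: 'cat', value: 'cs.LG' },
    { field: 'ti', value: 'graph neural network' },
  ],
  start: 10,
  maxResults: 5,
});
console.log(pageUrl);
```

When fetching several pages in a row, it is worth waiting a few seconds between requests to stay within arXiv's guidance.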

Node.js Example

async function searchArxiv(query, maxResults = 10) {
  const url = `http://export.arxiv.org/api/query?search_query=all:${encodeURIComponent(query)}&max_results=${maxResults}&sortBy=submittedDate&sortOrder=descending`;

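  // Uses the global fetch available in Node.js 18+
  const res = await fetch(url);
  if (!res.ok) throw new Error(`arXiv API request failed: HTTP ${res.status}`);
  const xml = await res.text();

  // Dependency-free parsing with regexes: fine for a quick script, but it does not
  // decode XML entities such as &amp; and can break if the feed layout changes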
  const entries = xml.split('<entry>').slice(1);
  return entries.map(entry => ({
    title: entry.match(/<title>(.*?)<\/title>/s)?.[1]?.trim(),
    abstract: entry.match(/<summary>(.*?)<\/summary>/s)?.[1]?.trim().substring(0, 200),
    published: entry.match(/<published>(.*?)<\/published>/)?.[1],
    url: entry.match(/<id>(.*?)<\/id>/)?.[1],
    authors: [...entry.matchAll(/<name>(.*?)<\/name>/g)].map(m => m[1])
  }));
}

// Top-level await needs an ES module (a .mjs file or "type": "module" in package.json)
const papers = await searchArxiv('transformer attention');
console.table(papers);
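The example above skips two of the fields the feed provides: categories and the PDF link. Here is a rough sketch of how they could be pulled from one `<entry>` chunk with the same regex approach. The helper names (`decodeXmlEntities`, `getAttr`, `extractEntryDetails`) are hypothetical, and the sketch assumes categories appear as `<category term="..."/>` elements, the PDF link is the `<link>` tagged `title="pdf"`, and attribute values use double quotes.

```javascript
// Decode the five predefined XML entities; anything else is left untouched.
function decodeXmlEntities(text) {
  return text
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, '&'); // last, so "&amp;lt;" becomes "&lt;" rather than "<"
}

// Read one attribute value from a single tag string, regardless of attribute order.
// Assumption: attribute values are wrapped in double quotes.
function getAttr(tag, name) {
  return tag.match(new RegExp(`\\b${name}="([^"]*)"`))?.[1];
}

// Hypothetical helper: extract title, categories, and PDF link from one entry chunk.
function extractEntryDetails(entry) {
  const categories = [...entry.matchAll(/<category\b[^>]*>/g)]
    .map(m => getAttr(m[0], 'term'))
    .filter(Boolean);

  // Pick the <link> element whose title attribute is "pdf".
  const pdfTag = [...entry.matchAll(/<link\b[^>]*>/g)]
    .map(m => m[0])
    .find(tag => getAttr(tag, 'title') === 'pdf');

  const rawTitle = entry.match(/<title>(.*?)<\/title>/s)?.[1] ?? '';
  return {
    // Titles can wrap across lines in the feed, so collapse runs of whitespace.
    title: decodeXmlEntities(rawTitle).replace(/\s+/g, ' ').trim(),
    categories,
    pdfUrl: pdfTag ? decodeXmlEntities(getAttr(pdfTag, 'href') ?? '') : undefined,
  };
}
```

Inside `searchArxiv`, the result could be merged into each paper object with `...extractEntryDetails(entry)`. For anything beyond a quick script, a real XML parser is the safer choice.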

Why arXiv Data Matters

arXiv papers can serve as a leading indicator: what researchers publish today often shows up in commercial products two to three years later.

  • Tracking arXiv = tracking future markets
  • Paper volume in a topic = research interest = future investment
  • New categories appearing = emerging industries

Use Cases

  1. Market research — how much academic activity is there in your industry?
  2. Competitive intelligence — what are competitors' research teams publishing?
  3. Trend analysis — which topics are growing fastest?
  4. AI training — curate papers for domain-specific fine-tuning
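For trend analysis, one simple starting point is to bucket results by publication month and watch how the counts change. This sketch assumes paper objects shaped like the ones `searchArxiv` returns, with a `published` ISO 8601 timestamp. The `countByMonth` helper is hypothetical.

```javascript
// Hypothetical helper: count papers per publication month.
// Expects objects with a `published` ISO 8601 string, e.g. "2024-03-15T12:00:00Z".
function countByMonth(papers) {
  const counts = new Map();
  for (const { published } of papers) {
    if (!published) continue;            // skip entries where no date was parsed
    const month = published.slice(0, 7); // keep only the "YYYY-MM" prefix
    counts.set(month, (counts.get(month) ?? 0) + 1);
  }
  // "YYYY-MM" strings sort chronologically when compared as plain text.
  return [...counts.entries()].sort(([a], [b]) => a.localeCompare(b));
}

// Usage with made-up timestamps:
const monthly = countByMonth([
  { published: '2024-03-15T12:00:00Z' },
  { published: '2024-01-02T00:00:00Z' },
  { published: '2024-03-01T08:30:00Z' },
]);
console.log(monthly); // [ [ '2024-01', 1 ], [ '2024-03', 2 ] ]
```

The counts only reflect the pages actually fetched, so a fair comparison between topics needs the same date range and paging depth for each query.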

Need academic research data extracted? $20 flat rate. Any topic, any timeframe. Email: Spinov001@gmail.com | Hire me
