The Hugging Face Hub Is a Free JSON API: Rank Trending AI Models Without a Key

#ai #api #javascript #showdev

Everyone reads the Hugging Face trending page in a browser. Almost nobody knows the whole Hub sits behind a plain JSON API with no key, no login, and cursor pagination. If you want a weekly report of what the AI community is actually adopting, you can build it with fetch.

The endpoints

GET https://huggingface.co/api/models
GET https://huggingface.co/api/datasets
GET https://huggingface.co/api/spaces

Useful parameters, same across all three:

sort ranks results by trendingScore, downloads, likes, createdAt, or lastModified
direction=-1 sorts descending
search matches repo names
author restricts results to one org, such as meta-llama
filter matches Hub tags, such as text-generation, license:mit, or even arxiv:2606.23050
limit sets the page size, up to 100 rows

So the top trending models right now:

https://huggingface.co/api/models?sort=trendingScore&direction=-1&limit=100
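The same parameters can be assembled programmatically. Here is a minimal sketch; hubListUrl is a hypothetical helper name, and sending several tags as repeated filter parameters is an assumption rather than something confirmed above, so check it against a live response before relying on it.

```javascript
// Hypothetical helper: builds a Hub listing URL from the parameters listed above.
// kind is one of 'models', 'datasets', or 'spaces'.
function hubListUrl(kind, { sort, direction, limit, search, author, filters = [] } = {}) {
  const params = new URLSearchParams();
  if (sort) params.set('sort', sort);
  if (direction !== undefined) params.set('direction', String(direction));
  if (limit !== undefined) params.set('limit', String(limit));
  if (search) params.set('search', search);
  if (author) params.set('author', author);
  // Assumption: multiple tags are passed as repeated filter parameters.
  for (const tag of filters) params.append('filter', tag);
  return `https://huggingface.co/api/${kind}?${params}`;
}

// Text-to-speech models under an MIT license, ranked by downloads.
const url = hubListUrl('models', {
  sort: 'downloads',
  direction: -1,
  limit: 100,
  filters: ['text-to-speech', 'license:mit'],
});
```

URLSearchParams handles the percent-encoding, so tags that contain a colon are safe to pass as-is.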

trendingScore is the interesting one. Downloads and likes mostly reward established models, so those rankings are dominated by the same old names. Trending score is Hugging Face's own measure of current momentum, and it moves daily. Today it puts a four-day-old OCR model from Baidu at the top, which no downloads sort would surface for weeks.

Slim payloads with expand

By default the models endpoint returns a siblings array listing every file in the repo, which bloats a 100-item page. Ask for exactly the fields you want instead:

// Request only the fields the report needs.
const fields = ['downloads', 'likes', 'trendingScore', 'pipeline_tag', 'tags', 'createdAt'];
const params = new URLSearchParams({ sort: 'trendingScore', direction: '-1', limit: '100' });
// expand[] is repeated once per field.
for (const f of fields) params.append('expand[]', f);
const res = await fetch(`https://huggingface.co/api/models?${params}`);
if (!res.ok) throw new Error(`Hub request failed: ${res.status}`);
const models = await res.json();

Pagination is a Link header

There is no page parameter. Each response carries a Link header with a cursor for the next page, GitHub-style:

// Returns the rel="next" URL from the Link header, or null on the last page.
function nextUrl(res) {
  const m = (res.headers.get('link') || '').match(/<([^>]+)>;\s*rel="next"/);
  return m ? m[1] : null;
}

Loop until it returns null or you have enough rows.
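A minimal sketch of that loop follows. fetchAllRows, maxRows, and the injectable fetchFn argument are hypothetical names introduced for illustration. The block repeats nextUrl so it stands on its own.

```javascript
// Returns the rel="next" URL from the Link header, or null on the last page.
function nextUrl(res) {
  const m = (res.headers.get('link') || '').match(/<([^>]+)>;\s*rel="next"/);
  return m ? m[1] : null;
}

// Hypothetical helper: follows next links until they run out or maxRows is reached.
// fetchFn defaults to the global fetch and can be swapped for a stub in tests.
async function fetchAllRows(startUrl, maxRows = 300, fetchFn = fetch) {
  const rows = [];
  let url = startUrl;
  while (url && rows.length < maxRows) {
    const res = await fetchFn(url);
    if (!res.ok) throw new Error(`Hub request failed: ${res.status}`);
    rows.push(...(await res.json()));
    url = nextUrl(res);
  }
  // The last page may overshoot the cap, so trim to maxRows.
  return rows.slice(0, maxRows);
}
```

Calling fetchAllRows with the trending URL from earlier and a cap of 300 would request at most three pages of 100 rows.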

What a row looks like

{
  "id": "baidu/Unlimited-OCR",
  "trendingScore": 701,
  "downloads": 758489,
  "likes": 1643,
  "pipeline_tag": "image-text-to-text",
  "tags": ["transformers", "ocr", "arxiv:2606.23050", "license:mit"],
  "createdAt": "2026-06-19T09:40:33.000Z"
}

Two details are worth knowing. License and paper references arrive as tags, so license:mit and arxiv:2606.23050 have to be recognized by their prefix and split. And Spaces have no download counter, only likes.
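The prefix parse can be as small as this. parseTags is a hypothetical helper name, and the sketch assumes license: and arxiv: are the only prefixes you care about.

```javascript
// Hypothetical helper: separates license and arXiv tags from the rest of a row's tags.
function parseTags(tags = []) {
  const licenses = [];
  const arxivIds = [];
  const other = [];
  for (const tag of tags) {
    if (tag.startsWith('license:')) licenses.push(tag.slice('license:'.length));
    else if (tag.startsWith('arxiv:')) arxivIds.push(tag.slice('arxiv:'.length));
    else other.push(tag);
  }
  return { licenses, arxivIds, other };
}

const parsed = parseTags(['transformers', 'ocr', 'arxiv:2606.23050', 'license:mit']);
// parsed.licenses is ['mit'], parsed.arxivIds is ['2606.23050'],
// and parsed.other is ['transformers', 'ocr'].
```

Arrays are used for both fields because a repo can carry more than one paper reference.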

What you can build with it

A weekly digest of new trending models, filtered to the niches you care about
A tracker for one lab: everything deepseek-ai or meta-llama releases, with adoption numbers a day later
Model selection with evidence: every text-to-speech model under an MIT license, ranked by real downloads
Dataset discovery for training runs, sorted by what researchers are actually using
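As one illustration, the first idea reduces to a filter over rows you have already fetched. This is a sketch under stated assumptions: weeklyDigest is a hypothetical helper, "new" is taken to mean created within the last seven days, and a niche is taken to mean a pipeline_tag value.

```javascript
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Hypothetical helper: keeps rows created within the last week whose pipeline_tag
// is in the wanted list, ordered by trendingScore, highest first.
// now is a parameter so the result is reproducible for a fixed timestamp.
function weeklyDigest(rows, wantedPipelines, now = Date.now()) {
  const wanted = new Set(wantedPipelines);
  return rows
    .filter((r) => wanted.has(r.pipeline_tag))
    .filter((r) => now - Date.parse(r.createdAt) <= WEEK_MS)
    .sort((a, b) => (b.trendingScore ?? 0) - (a.trendingScore ?? 0));
}
```

The rows need createdAt, pipeline_tag, and trendingScore, so request those fields when fetching.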

If you would rather not write the pagination and parsing yourself, I packaged this as a pay-per-use actor: Hugging Face Scraper: Trending AI Models, Datasets and Spaces. Pick models, datasets or Spaces, sort and filter, and get clean rows with license and arXiv ids already parsed. The first 20 rows of every run are free.

This is one more in a growing portfolio of keyless scrapers. The lesson that keeps repeating: before you reach for a headless browser, check whether the site already ships a JSON API. The Hub does, and it is a good one.

Top comments (4)

Nazar Boyko • Jul 3

Quick question on building the weekly digest around trendingScore: since it moves daily and reflects current momentum, wouldn't a Monday snapshot look pretty different from a Thursday one for the same week? I'm wondering whether you sample it once or average a few pulls across the week to get something stable enough to call a "weekly" trend. The keyless API is a great find either way, and the Link header cursor especially, since that's the part people always trip on when they expect a page param.

Ken-Mutisya • Jul 3

Good question. trendingScore decays by design, so a Monday pull and a Thursday pull will rank differently. I do not average it. I take two snapshots a week apart and compute my own deltas from downloads and likes, which are cumulative and stable no matter which weekday you sample. Then I use trendingScore only to spot fresh spikes within the current pull, like a model released three days ago that a weekly delta would miss. And agreed on the Link header cursor, that tripped me up too until I stopped looking for a page param.
