Ken-Mutisya

The Hugging Face Hub Is a Free JSON API: Rank Trending AI Models Without a Key

Everyone reads the Hugging Face trending page in a browser. Almost nobody knows the whole Hub sits behind a plain JSON API with no key, no login, and cursor pagination. If you want a weekly report of what the AI community is actually adopting, you can build it with fetch.

The endpoints

GET https://huggingface.co/api/models
GET https://huggingface.co/api/datasets
GET https://huggingface.co/api/spaces

Useful parameters, same across all three:

  • sort ranks results: trendingScore, downloads, likes, createdAt, lastModified
  • direction=-1 for descending
  • search matches names
  • author restricts results to one org, such as meta-llama
  • filter matches Hub tags: text-generation, license:mit, even arxiv:2606.23050
  • limit up to 100 per page

So the top trending models right now:

https://huggingface.co/api/models?sort=trendingScore&direction=-1&limit=100
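
If you end up assembling these URLs in more than one place, a small builder keeps the parameters straight. This is only a sketch: buildHubUrl and its option names are made up for this post, and it assumes that several tags can be sent as repeated filter parameters.

```javascript
// Hypothetical helper (not part of any Hub client): builds a listing URL
// from the query parameters described in the list above.
function buildHubUrl(kind, options = {}) {
  const kinds = ['models', 'datasets', 'spaces'];
  if (!kinds.includes(kind)) {
    throw new Error(`kind must be one of: ${kinds.join(', ')}`);
  }
  const { sort = 'trendingScore', limit = 100, author, search, filter = [] } = options;
  const params = new URLSearchParams({ sort, direction: '-1', limit: String(limit) });
  if (author) params.set('author', author);
  if (search) params.set('search', search);
  // Assumption: each tag is sent as its own repeated "filter" parameter.
  for (const tag of [].concat(filter)) params.append('filter', tag);
  return `https://huggingface.co/api/${kind}?${params}`;
}
```

For example, buildHubUrl('models', { author: 'meta-llama', filter: ['text-generation', 'license:mit'] }) gives a URL for one org's MIT-licensed text generation models, sorted by trending score.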

trendingScore is the interesting one. Downloads and likes rank accumulated popularity, which is dominated by the same old models. Trending score is Hugging Face's own measure of current momentum, and it moves daily. Today it puts a four-day-old OCR model from Baidu at the top, which no downloads sort would surface for weeks.

Slim payloads with expand

By default the models endpoint returns a siblings array listing every file in the repo, which bloats a 100-item page. Ask for exactly the fields you want instead:

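// Request only the fields the report needs, one expand[] parameter per field.
const fields = ['downloads', 'likes', 'trendingScore', 'pipeline_tag', 'tags', 'createdAt'];
const params = new URLSearchParams({ sort: 'trendingScore', direction: '-1', limit: '100' });
for (const f of fields) params.append('expand[]', f);
// Top-level await: run this as an ES module.
const res = await fetch(`https://huggingface.co/api/models?${params}`);
if (!res.ok) throw new Error(`Hub request failed: ${res.status}`);
const models = await res.json();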

Pagination is a Link header

There is no page parameter. Each response carries a Link header with a cursor for the next page, GitHub-style:

function nextUrl(res) {
  const m = (res.headers.get('link') || '').match(/<([^>]+)>;\s*rel="next"/);
  return m ? m[1] : null;
}

Loop until it returns null or you have enough rows.
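
One way to write that loop, as a minimal sketch. The name collectRows, the maxRows cap and the injectable fetchImpl argument are my own choices for illustration (the last one just makes the function easy to test without touching the network). The Link parsing is the same regex as above, inlined so the block runs on its own.

```javascript
// Hypothetical helper: follows rel="next" links until the Hub stops
// sending one or maxRows rows have been collected.
async function collectRows(startUrl, maxRows = 300, fetchImpl = fetch) {
  const rows = [];
  let url = startUrl;
  while (url && rows.length < maxRows) {
    const res = await fetchImpl(url);
    if (!res.ok) throw new Error(`Hub request failed: ${res.status} for ${url}`);
    rows.push(...(await res.json()));
    // Same Link header parse as nextUrl above.
    const m = (res.headers.get('link') || '').match(/<([^>]+)>;\s*rel="next"/);
    url = m ? m[1] : null;
  }
  // The last page may overshoot the cap, so trim.
  return rows.slice(0, maxRows);
}
```

Called as collectRows(firstPageUrl, 500), it should stop after five full pages of 100, or earlier if the Hub runs out of results.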

What a row looks like

{
  "id": "baidu/Unlimited-OCR",
  "trendingScore": 701,
  "downloads": 758489,
  "likes": 1643,
  "pipeline_tag": "image-text-to-text",
  "tags": ["transformers", "ocr", "arxiv:2606.23050", "license:mit"],
  "createdAt": "2026-06-19T09:40:33.000Z"
}

Two details worth knowing. License and paper references arrive as tags, so license:mit and arxiv:2606.23050 need a prefix parse. And Spaces have no download counter, only likes.
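
The prefix parse can be as small as this. It is a sketch rather than anything official: parseTags and the shape of its return value are invented here, and it keeps only the last license tag if a repo somehow carries more than one.

```javascript
// Hypothetical helper: splits a row's tags into license, arXiv ids and the rest.
function parseTags(tags = []) {
  const parsed = { license: null, arxiv: [], other: [] };
  for (const tag of tags) {
    if (tag.startsWith('license:')) {
      parsed.license = tag.slice('license:'.length);
    } else if (tag.startsWith('arxiv:')) {
      parsed.arxiv.push(tag.slice('arxiv:'.length));
    } else {
      parsed.other.push(tag);
    }
  }
  return parsed;
}
```

Fed the tags from the row above, it separates mit and 2606.23050 from the plain transformers and ocr tags.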

What you can build with it

  • A weekly digest of new trending models, filtered to the niches you care about
  • A tracker for one lab: everything deepseek-ai or meta-llama releases, with adoption numbers a day later
  • Model selection with evidence: every text-to-speech model under an MIT license, ranked by real downloads
  • Dataset discovery for training runs, sorted by what researchers are actually using
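
As a rough sketch of the first idea, here is a filter that keeps only rows created in the last week, optionally narrowed to a few pipeline tags. The function name newThisWeek and its options are hypothetical, and it assumes the rows carry the createdAt, pipeline_tag and trendingScore fields requested with expand earlier.

```javascript
// Hypothetical digest filter: recent rows only, highest trending score first.
function newThisWeek(rows, { days = 7, now = new Date(), pipelineTags = [] } = {}) {
  const cutoff = now.getTime() - days * 24 * 60 * 60 * 1000;
  return rows
    .filter((row) => new Date(row.createdAt).getTime() >= cutoff)
    // An empty pipelineTags list means "keep every niche".
    .filter((row) => pipelineTags.length === 0 || pipelineTags.includes(row.pipeline_tag))
    .sort((a, b) => (b.trendingScore ?? 0) - (a.trendingScore ?? 0));
}
```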

If you would rather not write the pagination and parsing yourself, I packaged this as a pay-per-use actor: Hugging Face Scraper: Trending AI Models, Datasets and Spaces. Pick models, datasets or Spaces, sort and filter, and get clean rows with license and arXiv ids already parsed. The first 20 rows of every run are free.

This is one more in a growing portfolio of keyless scrapers. The lesson that keeps repeating: before you reach for a headless browser, check whether the site already ships a JSON API. The Hub does, and it is a good one.
