Omar Eldeeb

Posted on Jun 1 • Originally published at datatooly.xyz

The Hacker News Search API: Free, No-Key, and Surprisingly Powerful

#api #javascript #webdev #datascience

The Hacker News search API you don't need a key for

If you've ever wanted to search Hacker News programmatically (pull every "Show HN" above 100 points, mine the monthly "Who is hiring?" thread, or track mentions of your project), there's a Hacker News search API that is free and requires no key and no OAuth dance. It lives at https://hn.algolia.com/api/v1/ and is powered by Algolia, the search company HN uses for its own on-site search.

This post walks through how it actually works, with code you can paste and run right now. Everything below is verified against the live endpoint, not copied from stale docs.

Two endpoints: relevance vs. recency

There are two search endpoints, and the difference matters more than people expect:

/search — ranked by relevance (Algolia's text-relevance scoring, weighted by points/comments). Use this when you're searching for a topic.
/search_by_date — ranked by recency (newest first). Use this when you're building a feed, a monitor, or anything time-sensitive.

A subtle gotcha: /search always orders by relevance, so adding a numeric filter like created_at_i>... narrows the results but won't give you a clean chronological list. If you want "the newest N items matching X," reach for /search_by_date.
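
One way to keep that choice explicit in code is a tiny URL builder that picks the endpoint from a flag. This is only a sketch: buildSearchUrl and its sort option are names invented for this example, not part of the API.

```javascript
const HN_SEARCH_BASE = "https://hn.algolia.com/api/v1";

// Hypothetical helper: pick /search or /search_by_date from a `sort` flag
// and pass every other option through as a query-string parameter.
function buildSearchUrl({ sort = "relevance", ...params } = {}) {
  const endpoint = sort === "date" ? "search_by_date" : "search";
  const qs = new URLSearchParams(params);
  return `${HN_SEARCH_BASE}/${endpoint}?${qs}`;
}

console.log(buildSearchUrl({ query: "rust" }));
console.log(buildSearchUrl({ sort: "date", query: "rust", tags: "story" }));
```

Defaulting to relevance mirrors the plain /search endpoint; anything feed-like should pass sort: "date" deliberately.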

A runnable example

Here's a real fetch() call. It uses top-level await, so run it as an ES module or paste it into a browser console. It finds story-type posts mentioning "rust" with more than 100 points, newest first:

const params = new URLSearchParams({
  query: "rust",
  tags: "story",
  numericFilters: "points>100",
  hitsPerPage: "20",
});

const url = `https://hn.algolia.com/api/v1/search_by_date?${params}`;
const res = await fetch(url);
const data = await res.json();

console.log(`${data.nbHits} total matches, showing ${data.hits.length}`);
for (const hit of data.hits) {
  console.log(`${hit.points}pts  ${hit.title}`);
  console.log(`  https://news.ycombinator.com/item?id=${hit.objectID}`);
}

And the same thing as a one-liner with curl:

curl "https://hn.algolia.com/api/v1/search_by_date?query=rust&tags=story&numericFilters=points%3E100&hitsPerPage=20"

(%3E is just a URL-encoded >. In a browser/fetch, URLSearchParams encodes it for you.)

Each hit in the hits array contains the fields you'd want: objectID (the HN item id), title, url, author, points, num_comments, created_at, and created_at_i (the Unix timestamp — handy for filtering). The response envelope also gives you nbHits, page, nbPages, and hitsPerPage for pagination.
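
If you only need a few of those fields downstream, a small mapper keeps the rest of your code independent of the raw hit shape. A minimal sketch: summarizeHit is a made-up helper, the sample hit is invented data (not real API output), and the fallback link assumes url can be missing or null for text-only posts.

```javascript
// Hypothetical helper: reduce a search hit to the fields listed above.
function summarizeHit(hit) {
  return {
    id: Number(hit.objectID),
    title: hit.title,
    // Assumption: text-only posts have no url, so link to the HN item page.
    link: hit.url ?? `https://news.ycombinator.com/item?id=${hit.objectID}`,
    author: hit.author,
    points: hit.points,
    comments: hit.num_comments,
    postedAt: new Date(hit.created_at_i * 1000), // Unix seconds -> Date
  };
}

// Invented sample hit, for illustration only.
const sampleHit = {
  objectID: "123",
  title: "Example story",
  url: null,
  author: "someone",
  points: 150,
  num_comments: 42,
  created_at: "2024-01-01T00:00:00Z",
  created_at_i: 1704067200,
};

console.log(summarizeHit(sampleHit));
```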

Tags: the most useful parameter

The tags parameter is how you scope what kind of item you want. The supported values:

story — link/text submissions
comment — individual comments
ask_hn — Ask HN posts
show_hn — Show HN posts
poll — polls
author_<username> — items by a specific user, e.g. author_pg

Tags combine with boolean logic. A comma means AND; parentheses mean OR. So:

tags=story,author_pg            → stories by pg
tags=(story,poll),author_pg     → stories OR polls by pg
tags=show_hn,(story,comment)    → Show HN items that are stories or comments

This is genuinely powerful. Want every Ask HN post by a particular user? tags=ask_hn,author_jl. Want only top-level submissions and never comments? Just tags=story.
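
If you build tag expressions dynamically, string concatenation gets error-prone fast. Here is a small sketch of helpers that mirror the syntax above; allOf, anyOf, and byAuthor are hypothetical names, not anything the API provides.

```javascript
// Hypothetical helpers mirroring the tag syntax: comma = AND, parentheses = OR.
const allOf = (...tags) => tags.join(",");
const anyOf = (...tags) => `(${tags.join(",")})`;
const byAuthor = (username) => `author_${username}`;

console.log(allOf("story", byAuthor("pg")));
// story,author_pg
console.log(allOf(anyOf("story", "poll"), byAuthor("pg")));
// (story,poll),author_pg
```

The result goes straight into the tags parameter; URLSearchParams takes care of encoding the parentheses and commas.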

Numeric filters: points, comments, and time ranges

numericFilters lets you filter on numeric fields server-side, so you don't pull 1,000 rows just to discard 980. Supported operators are <, <=, =, >, >=, and you can comma-separate multiple conditions (AND):

numericFilters=points>500
numericFilters=num_comments>50
numericFilters=points>100,num_comments>20
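
When the conditions come from user input or config, it can help to validate them before building the string. A sketch under that assumption; buildNumericFilters is an illustrative helper that accepts only the operators listed above.

```javascript
const NUMERIC_OPERATORS = new Set(["<", "<=", "=", ">", ">="]);

// Hypothetical helper: turn [field, operator, value] triples into a
// comma-separated (AND) numericFilters string, rejecting bad input early.
function buildNumericFilters(...conditions) {
  return conditions
    .map(([field, op, value]) => {
      if (!NUMERIC_OPERATORS.has(op)) {
        throw new Error(`Unsupported operator: ${op}`);
      }
      if (!Number.isFinite(value)) {
        throw new Error(`Value for ${field} must be a finite number`);
      }
      return `${field}${op}${value}`;
    })
    .join(",");
}

console.log(buildNumericFilters(["points", ">", 100], ["num_comments", ">", 20]));
// points>100,num_comments>20
```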

The time field created_at_i is a Unix timestamp, which makes date-range queries easy. To get high-signal stories from a specific window:

const since = Math.floor(Date.now() / 1000) - 7 * 24 * 3600; // last 7 days
const params = new URLSearchParams({
  tags: "story",
  numericFilters: `points>200,created_at_i>${since}`,
  hitsPerPage: "30",
});
const res = await fetch(
  `https://hn.algolia.com/api/v1/search?${params}`
);
const { hits } = await res.json();

This pattern — points>N plus a created_at_i floor — is the backbone of most "what's hot this week" dashboards built on HN.

Pagination and the limits to know about

Pagination is straightforward: pass page (zero-indexed) and hitsPerPage (max 1000, though smaller pages are kinder). Read nbPages from the response to know when to stop.
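
As a rough sketch, the page loop can look like this. fetchAllPages and fetchPageJson are names made up for the example, and the maxPages guard is my own addition so a bad response can't make the loop run forever.

```javascript
// Default JSON fetcher; swapped out in tests or when adding caching/backoff.
async function fetchPageJson(url) {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
  return res.json();
}

// Hypothetical helper: collect hits page by page until nbPages is reached.
async function fetchAllPages(baseUrl, params, { maxPages = 10, fetchJson = fetchPageJson } = {}) {
  const all = [];
  for (let page = 0; page < maxPages; page++) {
    const qs = new URLSearchParams({ ...params, page: String(page) });
    const data = await fetchJson(`${baseUrl}?${qs}`);
    all.push(...data.hits);
    if (page + 1 >= data.nbPages) break; // that was the last page
  }
  return all;
}

// Usage (inside an async function or ES module):
// const hits = await fetchAllPages(
//   "https://hn.algolia.com/api/v1/search_by_date",
//   { query: "rust", tags: "story", hitsPerPage: "100" }
// );
```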

Two limits are worth internalizing so you don't design something that quietly breaks:

~1,000 retrievable results per query. This is Algolia's standard pagination ceiling — you can page through results, but only down to roughly the first 1,000. If you need everything matching a broad query, you can't just deep-paginate; you have to slice by time instead. Run several narrower created_at_i ranges and stitch the results together.
A rough rate ceiling of ~10,000 requests/hour/IP. Important caveat: this is a community / Algolia-staff figure that's been cited over the years, not a published SLA. Treat it as a courtesy budget, not a guarantee — add backoff, cache responses, and don't hammer it.

Neither limit is a problem for normal use, but both shape how you architect a large backfill.
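
The time-slicing idea can be sketched as a function that turns one big range into per-window filter strings. timeSlices is a hypothetical helper, and the one-day window is just an example value; pick whatever width keeps each slice under the cap for your query.

```javascript
// Hypothetical helper: split [startUnix, endUnix) into fixed-width windows
// and return one numericFilters string per window.
function timeSlices(startUnix, endUnix, windowSeconds) {
  if (!(windowSeconds > 0)) throw new RangeError("windowSeconds must be positive");
  const slices = [];
  for (let from = startUnix; from < endUnix; from += windowSeconds) {
    const to = Math.min(from + windowSeconds, endUnix);
    // Half-open window so adjacent slices never overlap.
    slices.push(`created_at_i>=${from},created_at_i<${to}`);
  }
  return slices;
}

const DAY = 24 * 3600;
console.log(timeSlices(1704067200, 1704067200 + 3 * DAY, DAY));
```

Run one paginated query per slice and concatenate the hits. If a slice reports an nbHits near the cap, that window is still too wide and is worth splitting again.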

Drilling into a single item (and the "Who is hiring?" thread)

The search index returns flat hits. To get the full nested comment tree for any item, use the items endpoint:

curl "https://hn.algolia.com/api/v1/items/42000000"

This returns the post and a recursive children array of comments — perfect for the monthly "Who is hiring?" thread, which typically carries ~400–900 job-posting comments. Grab the thread's objectID, hit /items/:id, and walk children to pull every job comment in one shot.
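
Walking children is a short recursive function. The sketch below assumes each node exposes id, author, and text alongside children; the sample thread is invented data, and flattenComments and topLevelComments are illustrative names.

```javascript
// Sketch: depth-first walk over the recursive `children` array.
// Assumption: nodes carry `id`, `author`, and `text` fields.
function flattenComments(item, depth = 0, out = []) {
  for (const child of item.children ?? []) {
    out.push({ id: child.id, author: child.author, text: child.text, depth });
    flattenComments(child, depth + 1, out);
  }
  return out;
}

// In a hiring thread, the job postings are the top-level comments (depth 0).
const topLevelComments = (item) => flattenComments(item).filter((c) => c.depth === 0);

// Invented sample thread, for illustration only.
const sampleThread = {
  id: 1,
  children: [
    { id: 2, author: "a", text: "Acme | Remote", children: [
      { id: 3, author: "b", text: "Is this still open?", children: [] },
    ] },
    { id: 4, author: "c", text: "Globex | Onsite", children: [] },
  ],
};

console.log(topLevelComments(sampleThread).map((c) => c.text));
```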

Don't forget: the Firebase API has no search

Hacker News also publishes an official Firebase API at https://hacker-news.firebaseio.com/v0/. It's great for live data — top stories, new stories, individual item lookups by id, user profiles — but it has no search capability whatsoever. You can't query it by keyword, points, or date.

The practical move is to combine the two: use the Algolia search API to discover item ids matching your criteria, then optionally hit Firebase for the freshest real-time state of those items. Search where you need search; go to Firebase where you need authority and freshness.
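
A rough sketch of that two-step flow: search with Algolia, then re-read each id from Firebase's item endpoint. searchThenRefresh and getJson are made-up names, the hitsPerPage of 5 is arbitrary, and the score and title fields are what I'd expect on a Firebase story item, so treat them as assumptions.

```javascript
const FIREBASE_BASE = "https://hacker-news.firebaseio.com/v0";

// Firebase serves single items at /item/<id>.json.
const firebaseItemUrl = (id) => `${FIREBASE_BASE}/item/${id}.json`;

async function getJson(url) {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
  return res.json();
}

// Hypothetical helper: discover ids via Algolia, refresh state via Firebase.
// `fetchJson` is injectable so the flow can be exercised without the network.
async function searchThenRefresh(query, fetchJson = getJson) {
  const qs = new URLSearchParams({ query, tags: "story", hitsPerPage: "5" });
  const { hits } = await fetchJson(`https://hn.algolia.com/api/v1/search?${qs}`);
  const live = await Promise.all(
    hits.map((hit) => fetchJson(firebaseItemUrl(hit.objectID)))
  );
  return live
    .filter(Boolean) // skip ids Firebase returns null for
    .map((item) => ({ id: item.id, title: item.title, score: item.score }));
}

// Usage (inside an async function or ES module):
// console.log(await searchThenRefresh("rust"));
```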

Try it without writing code first

If you just want to poke at queries and see real results before wiring anything up, I built a free browser tool that runs this exact API live: datatooly.xyz/hacker-news-search. It's not a canned demo — it fires the request straight from your browser (the Algolia endpoint echoes the request origin for CORS), so the results are the live index. Tweak the query, tags, and filters and watch the JSON come back.

When you need the heavy version

The raw API is perfect for targeted queries. But once you're doing serious extraction — full nested comment trees across thousands of items, a parsed "Who is hiring?" feed, user profiles, or export to CSV — the pagination cap and rate budget start to bite, and you end up rebuilding the same plumbing.

That's what pushed me to package it as the Hacker News Scraper actor on Apify. It has 9 modes (top / new / best / ask / show / jobs / search / user / hiring_threads), pulls full nested comment trees and user profiles, includes a dedicated "Who is hiring?" parser, supports date/score/domain filters, and exports JSON, CSV, or via API. It's free to start, then pay-as-you-go — the first 50 events of every run are free, so small jobs cost nothing.

Disclosure: I built both the free tool and the actor.

TL;DR

Base URL: https://hn.algolia.com/api/v1/ — no key, no auth.
/search = relevance, /search_by_date = newest first.
tags scopes type (story, comment, ask_hn, show_hn, poll, author_X); comma = AND, parentheses = OR.
numericFilters filters on points, num_comments, created_at_i.
Watch the ~1,000-results-per-query cap (slice by time) and the unofficial ~10k req/hr/IP courtesy budget.
Use /items/:id for full comment trees; combine with the search-less Firebase API for live state.

Go build something. The index is wide open.
