Shen Huang

Posted on May 24

How I Indexed 2,000 Claude Code Skills (And What the Install Data Says About AI Coding in 2026)

#seo #webdev #ai #claude

The problem: 2,000 skills, no map

Claude Code skills exploded across 2025 and into 2026. The official registry at skills.sh and the parallel ecosystem on GitHub now expose well over 2,000 public skills — covering everything from frontend-design patterns to azure-deploy to nano-banana-pro image generation. The format won. Every serious AI coding agent — Claude Code, Cursor, OpenSeek, Codex — now ships some flavor of "loadable context package".

But there is no ranked, install-weighted, searchable index. The registry lists skills alphabetically, paginates them across dozens of pages, and doesn't surface install volume in a way you can sort. You discover a useful skill one of three ways: someone tweets it, you scroll forever, or you already know the author/repo combo. None of those scale past a few hundred skills.

So I built one. orangebot.ai/skills is a ranked, filterable index of 1,998 public Claude Code skills sorted by weekly install volume, tagged by domain, with individual detail pages for the top 50 highest-installed entries. It is the page I wanted six months ago and didn't have.

This post is the build log. Three parts: the stack (boring on purpose), the SEO crisis I walked into and the fix that shipped today, and what the install data actually says about where AI coding tools are headed in 2026.

Source registry: skills.sh. The catalog: orangebot.ai/skills.

The stack (zero infrastructure surprises)

I picked the most boring stack I could justify, on purpose. The catalog is not where I want to spend operational attention — the data pipeline and the SEO are.

Next.js 16 App Router. Server Components by default so Google sees real HTML, not a JS shell. Routing is filesystem-based, the metadata API is built in.
Firebase App Hosting. git push deploys. No manual ops, no Docker, no Kubernetes. Free tier so far.
Cron on a Linux box at home. A Python scraper runs daily on lich-ubuntu, normalizes install counts and author attribution, and writes a single ~2.5MB skills_index.json to the repo's public/data/ directory. When the file changes, the next deploy picks it up.
No database at request time. The Next.js page reads the JSON at revalidate time (every 6 hours). Firebase serves the rendered HTML. No Postgres, no Redis, no Firestore call on the hot path.

That's the whole architecture. One JSON file, one Next.js page, one daily cron. The catalog routes in the repo:

src/app/skills/page.js              # SSR catalog (server component)
src/app/skills/SkillsClient.js      # interactive island (search/filter/sort)
src/app/skills/[category]/page.js   # domain filter pages
src/app/s/[id]/page.js              # individual skill detail page (top 50)
public/data/skills_index.json       # 1,998 entries, ~2.5MB

Honest disclosure on the data freshness: skills_index.json is regenerated weekly-ish during active development as I iterate on the parser. The production cron — separate Python service on lich-ubuntu — is what stabilizes it to daily; that part is still rolling out. So treat the snapshot as recent, not "fresh this morning".

The decision I keep being asked about is "why not a database". Two reasons. First, the catalog is read-mostly and changes once a day — that is the canonical shape of a static JSON file, not a row store. Second, every database I'd add is a thing that can break at 2am while I'm asleep. A JSON file in public/ cannot break. It can be wrong, but it cannot be down.
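Since the file "can be wrong", a cheap guard before committing it seems worthwhile. The following is a minimal sketch, not the actual pipeline: validateIndex is a hypothetical helper, and it assumes each entry carries the name, source, skillId, repository, and installs fields that the catalog page code reads.

```javascript
// Hypothetical pre-commit check for skills_index.json entries.
// Field names mirror those read by the catalog page; the rules are assumptions.
function validateIndex(entries) {
  const errors = [];
  const seen = new Set();
  entries.forEach((entry, i) => {
    const key = `${entry.source}-${entry.skillId}`;
    if (!entry.name || !entry.source || !entry.skillId) {
      errors.push(`entry ${i}: missing name, source, or skillId`);
    }
    if (typeof entry.repository !== 'string' || !entry.repository.startsWith('https://')) {
      errors.push(`entry ${i}: repository is not an https URL`);
    }
    if (!Number.isInteger(entry.installs) || entry.installs < 0) {
      errors.push(`entry ${i}: installs must be a non-negative integer`);
    }
    if (seen.has(key)) {
      errors.push(`entry ${i}: duplicate key ${key}`);
    }
    seen.add(key);
  });
  return errors;
}
```

An empty result means the snapshot passed these particular checks. It says nothing about whether the install counts themselves are accurate.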

The pillar guide that links into the catalog lives at orangebot.ai/blog/claude-code-skills-guide. It's the long-form companion — what skills are, how to install them, when to pick a skill vs an MCP server.

The SEO crisis I didn't see coming

Six months after the catalog went live, I finally opened Google Search Console. The numbers were ugly.

21 of 186 known URLs indexed. 11.3%.
151 URLs sitting in "Discovered – currently not indexed". Their last_crawled timestamps showed 1969-12-31, which is the Unix epoch (timestamp 0) rendered in a timezone west of UTC. In other words, "never crawled": Google had registered the URLs from my sitemap and then never fetched them.
The /skills page alone was getting 5,651 impressions per week but a 0.12% CTR. People were seeing the page in search results, then bouncing past it.

That is the failure mode for a content site. I had built the catalog, written the pillar guide, submitted the sitemap, and Google's verdict was a polite "no thanks". The 5,651 impressions were good news in disguise — the demand was real; the page just couldn't convert it because the SERP snippet was the generic brand tagline (no generateMetadata) and the rendered HTML was an empty client shell.
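For scale, the two headline numbers work out as follows. This is plain arithmetic on the Search Console figures quoted above, with nothing measured beyond them.

```javascript
// Figures quoted from Search Console in this post.
const indexedShare = 21 / 186;      // indexed URLs / known URLs
const weeklyClicks = 5651 * 0.0012; // weekly impressions * 0.12% CTR

console.log((indexedShare * 100).toFixed(1) + '%'); // 11.3%
console.log(Math.round(weeklyClicks));              // 7
```

Roughly seven clicks a week from more than five thousand impressions is why the snippet and title were worth fixing before writing more content.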

I spent a day diagnosing it instead of pushing more content. Three root causes:

Root cause 1 — the catalog page was a client component. The top of src/app/skills/page.js had 'use client'. The server-rendered HTML was an empty <div>. Googlebot does run JS, eventually, but it queues JS-heavy pages on a much slower second pass, and for low-authority sites it often skips the second pass entirely. So Google saw a blank page, picked up the brand-tagline title from the root layout, and ranked accordingly.

Root cause 2 — sitemap pollution. Next.js App Router auto-generates routes from metadata files like opengraph-image.js, twitter-image.js, icon.js. My next-sitemap.config.js was pulling all of them into sitemap.xml, plus eight stale scaffold routes I never finished (/cjobs, /mnews, /jobmatch, /jobs, /datajobs, /photographer, /prop, /startups). Google saw 173 sitemap entries, found ~18 weren't HTML and another 8 returned thin/empty pages, and started treating the whole sitemap as low-trust.

Root cause 3 — the homepage didn't link to /skills, /tools, or /digest in visible HTML. The nav was inside a <details> element that collapsed by default. Googlebot reads <details> content fine, but the absence of a prominent above-the-fold internal link to /skills meant the catalog was getting near-zero internal PageRank from the homepage.
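A quick way to audit this kind of problem is to look at the raw server response and list which hub paths appear as plain anchors. The sketch below is a rough regex check, not a full HTML parse, and missingHubLinks is a hypothetical helper. It also cannot tell whether a link sits inside a collapsed <details>, so it only catches links that are absent altogether.

```javascript
// Rough check: which required paths are missing as <a href="..."> in raw HTML?
// Regex-based on purpose; it approximates what is in the response before any JS runs.
function missingHubLinks(html, requiredPaths) {
  const hrefs = new Set();
  const anchorPattern = /<a\b[^>]*\bhref=["']([^"']+)["']/gi;
  for (const match of html.matchAll(anchorPattern)) {
    hrefs.add(match[1]);
  }
  return requiredPaths.filter((path) => !hrefs.has(path));
}

// Sample markup is invented for illustration.
const sampleHtml = '<nav><a href="/skills">Skills</a> <a href="/tools">Tools</a></nav>';
console.log(missingHubLinks(sampleHtml, ['/skills', '/tools', '/digest']));
// [ '/digest' ]
```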

Three causes, four commits.

The fix: 4 commits, ~180 file-changes (shipped 2026-05-23)

Four commits today, between 11:15 and 18:59 PDT: e9b4ba0 (SSR-ify /skills /tools /news, clean sitemap, add hub nav), d09e0e4 (CTR titles, EEAT author page, 5 tools SSR, news sitemap), 4442629 (P1+P2 push: 5 more tools SSR, deep compare pages, blog framework + 8 posts), 505ec1d (P3 push: 23 tools SSR, 50 skill detail pages, hub depth, OG image, newsletter). Per the P3 commit message, the four commits together account for roughly 180 file changes. The walkthrough below groups the work by theme, so its numbering does not map one-to-one onto these hashes.

Commit 1 — convert /skills and the other hub routes from client components to a server-shell + client-island pattern. This is the generally useful pattern I'd recommend to anyone running into the same problem. The shape is:

// src/app/skills/page.js  — Server Component (NO 'use client')
import fs from 'node:fs/promises';
import path from 'node:path';
import SkillsClient from './SkillsClient';

export const revalidate = 21600; // 6 hours

export const metadata = {
  title: '2,000+ Best Claude Code Skills (2026): Top Skills by Installs & Stars',
  description: 'A working library of Claude Code skills from across GitHub...',
  alternates: { canonical: '/skills' },
};

export default async function SkillsPage() {
  const raw = await fs.readFile(
    path.join(process.cwd(), 'public', 'data', 'skills_index.json'),
    'utf-8'
  );
  const allSkills = JSON.parse(raw);
  const topSkills = [...allSkills]
    .sort((a, b) => (b.installs || 0) - (a.installs || 0))
    .slice(0, 60);

  const itemListJsonLd = {
    '@context': 'https://schema.org',
    '@type': 'ItemList',
    numberOfItems: topSkills.length,
    itemListElement: topSkills.map((s, i) => ({
      '@type': 'ListItem',
      position: i + 1,
      name: s.name,
      url: s.repository,
    })),
  };

  return (
    <>
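      <script
        type="application/ld+json"
        // Escape "<" so a skill name containing "</script>" cannot close the tag early
        dangerouslySetInnerHTML={{
          __html: JSON.stringify(itemListJsonLd).replace(/</g, '\\u003c'),
        }}
      />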
      <details>
        <summary>Text view · {allSkills.length} skills</summary>
        <h1>Claude Code Skills Index</h1>
        <ol>
          {topSkills.map((s) => (
            <li key={`${s.source}-${s.skillId}`}>
              <a href={s.repository}><strong>{s.name}</strong></a>
              {/* Show the same installs field that the sort above uses */}
              {' '}by @{s.source}, {(s.installs || 0).toLocaleString('en-US')} installs/wk
            </li>
          ))}
        </ol>
      </details>
      <SkillsClient />
    </>
  );
}

The trick: the server emits the real top-60 list inside a <details> block (collapsed for users, fully readable for crawlers) plus a JSON-LD ItemList schema, and only then mounts the interactive <SkillsClient />. Google indexes the static list. Users get the interactive shell. Both reads are served from the same render.

Commit 2 — clean the sitemap. Excluded /*/opengraph-image, /*/twitter-image, /*/icon, /feed.xml, /feed.json, /feed/*, /api/*, and the eight stale scaffold routes from next-sitemap.config.js. Sitemap went 173 → 154 (cleanup), then later commits added /blog/* posts and 50 /s/[id] detail pages, bringing it to 219 URLs today. All current entries return HTML with <title> and <meta description>.
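The exclusions described above would look roughly like this in next-sitemap.config.js. Treat it as a sketch rather than the file that shipped: the siteUrl value is an assumption based on the domain used in this post, and the patterns are copied from the list above without checking how next-sitemap matches each one.

```javascript
// next-sitemap.config.js (sketch; siteUrl is assumed, patterns come from the post)
const config = {
  siteUrl: 'https://orangebot.ai',
  exclude: [
    // Non-HTML routes generated from metadata files, feeds, and API handlers
    '/*/opengraph-image',
    '/*/twitter-image',
    '/*/icon',
    '/feed.xml',
    '/feed.json',
    '/feed/*',
    '/api/*',
    // Stale scaffold routes that returned thin or empty pages
    '/cjobs',
    '/mnews',
    '/jobmatch',
    '/jobs',
    '/datajobs',
    '/photographer',
    '/prop',
    '/startups',
  ],
};

module.exports = config;
```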

Commit 3 — visible homepage nav. Added a <nav aria-label="Site sections"> block at the top of the homepage with plain anchor tags to /skills, /tools, /digest, /news, /blog, /sources, /topics, /year/2026, /compare. No <details>, no JS, no collapse. Boring. Crawlable.

Commit 4 — depth on /tools, /blog, /s/[id], OG images, newsletter. SSR'd 33 of 36 tool pages (each with HowTo / FAQPage / SoftwareApplication / BreadcrumbList JSON-LD), shipped 13 blog posts as the editorial layer, built 50 individual /s/[id] skill detail pages, added a dynamic next/og card per blog post, and embedded Substack signup on every post page.
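For the /s/[id] pages, the structured data can be built from a single index entry. The following is a minimal sketch under assumptions: buildSkillJsonLd is a hypothetical helper, the entry fields (name, skillId, repository) are the ones used by the catalog page code, and the choice of applicationCategory and sameAs is illustrative, not necessarily what the live pages emit.

```javascript
// Sketch: SoftwareApplication + BreadcrumbList JSON-LD for one skill detail page.
function buildSkillJsonLd(skill, siteUrl) {
  const pageUrl = `${siteUrl}/s/${skill.skillId}`;
  return [
    {
      '@context': 'https://schema.org',
      '@type': 'SoftwareApplication',
      name: skill.name,
      url: pageUrl,
      applicationCategory: 'DeveloperApplication', // assumed category
      sameAs: skill.repository,                    // link back to the source repo
    },
    {
      '@context': 'https://schema.org',
      '@type': 'BreadcrumbList',
      itemListElement: [
        { '@type': 'ListItem', position: 1, name: 'Skills', item: `${siteUrl}/skills` },
        { '@type': 'ListItem', position: 2, name: skill.name, item: pageUrl },
      ],
    },
  ];
}

const jsonLd = buildSkillJsonLd(
  { name: 'find-skills', skillId: 'find-skills', repository: 'https://github.com/vercel-labs/skills' },
  'https://orangebot.ai'
);
console.log(jsonLd[0].url); // https://orangebot.ai/s/find-skills
```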

Expected D7 trajectory (forecast, not measured — post deploys today): indexed URLs 30-40 by 2026-05-30, 60-80 by 2026-06-06. The new /skills title — "2,000+ Best Claude Code Skills (2026): Top Skills by Installs & Stars" — replaces the generic brand tagline that was driving the 0.12% CTR. I'll update this post with measured GSC numbers once the D7 pull lands.

What the install data says about AI coding in 2026

The catalog gives a regularly refreshed view (daily once the production cron finishes rolling out) into which skills people actually install. Here is the top 10 by weekly installs, pulled directly from skills_index.json:

Rank	Skill	Author	Weekly installs
1	find-skills	vercel-labs/skills	753,732
2	vercel-react-best-practices	vercel-labs/agent-skills	256,738
3	frontend-design	anthropics/skills	212,072
4	web-design-guidelines	vercel-labs/agent-skills	206,584
5	remotion-best-practices	remotion-dev/skills	182,063
6	azure-ai	microsoft/github-copilot-for-azure	146,196
7	azure-deploy	microsoft/github-copilot-for-azure	145,787
8	azure-storage	microsoft/github-copilot-for-azure	145,752
9	azure-cost-optimization	microsoft/github-copilot-for-azure	145,752
10	azure-diagnostics	microsoft/github-copilot-for-azure	145,697

Four things jump out.

find-skills is #1 with 753K weekly installs. A meta-skill, a skill whose only job is to discover other skills, has roughly 3x the installs of the #2 entry (753,732 vs 256,738) and more than any domain-specific skill in the index. The discovery layer is the actual moat. That's the thesis behind this entire catalog. The #1 skill is published by vercel-labs/skills, not Anthropic, which is itself a tell about who is racing to own the discovery primitive. Detail page: orangebot.ai/s/find-skills.

Microsoft owns the platform-skill volume. The microsoft/github-copilot-for-azure family ships 22 skills totaling 2.86M installs — the largest publisher in the index by total installs. The next-largest publisher is the parallel microsoft/azure-skills family at 2.29M across more skills. Combined: ~5.15M installs, all curated, all opinionated, all shipped as part of GitHub Copilot for Azure's default surface. If you're a cloud provider in 2026 and you don't have a curated skill family, you are losing this distribution channel.

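Vercel is quietly one of the highest-leverage publishers by reach. vercel-labs/skills (the find-skills repo) plus vercel-labs/agent-skills total 1.42M installs across very few skills. That combined total trails Microsoft, inferen-sh, and github/awesome-copilot in raw volume, but the per-skill leverage is far higher: find-skills alone accounts for 753,732 installs, against an average of about 130K per skill for microsoft/github-copilot-for-azure (2,858,606 across 22). Vercel is shipping skills that get installed almost universally rather than making the broad-Azure-surface bet.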

Inferen-sh and the open-source long tail are real. inferen-sh/skills totals 1.94M installs across 79 skills, mostly multimodal (image/video gen, OCR, Qwen). Six months ago "coding skills" meant test runners and migration helpers. Today image/video generation skills are colonizing the surface — because the same developer building a Next.js app is also generating OG images, hero shots, and product demos. The agent is the unified tool.

Top publishers by total installs in the current index:

Rank	Publisher	Skills	Total Installs
1	microsoft/github-copilot-for-azure	22	2,858,606
2	microsoft/azure-skills	(multiple)	2,291,209
3	inferen-sh/skills	79	1,941,845
4	github/awesome-copilot	209	1,598,674
5	coreyhaines31/marketingskills	(multiple)	873,458
6	anthropics/skills	18	757,840
7	vercel-labs/skills	1	753,732
8	supercent-io/skills-template	77	745,913

Total index: 1,998 skills, 401 unique publishers, ~19.2M cumulative weekly installs. Top 8 publishers hold roughly 60% of total volume — the power law is sharper than I expected, but the long tail of ~390 smaller publishers still matters.
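The concentration figure can be reproduced from the publisher table. The sketch below uses the eight totals exactly as listed and the rounded ~19.2M index total, so the resulting share is approximate.

```javascript
// Publisher totals copied from the table in this post.
const topPublishers = [
  ['microsoft/github-copilot-for-azure', 2858606],
  ['microsoft/azure-skills', 2291209],
  ['inferen-sh/skills', 1941845],
  ['github/awesome-copilot', 1598674],
  ['coreyhaines31/marketingskills', 873458],
  ['anthropics/skills', 757840],
  ['vercel-labs/skills', 753732],
  ['supercent-io/skills-template', 745913],
];
const INDEX_TOTAL = 19.2e6; // rounded total quoted in the post

const top8Total = topPublishers.reduce((sum, [, installs]) => sum + installs, 0);
const top8Share = top8Total / INDEX_TOTAL;

console.log(top8Total);                           // 11821277
console.log((top8Share * 100).toFixed(1) + '%');  // 61.6%
```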

The thesis I'd commit to: by Q4 2026 the meaningful divide will be "agents that ship and run skills" (Claude Code, Cursor, OpenSeek) vs "agents that just call LLMs" (everyone else). Skills become the unit of teaching an agent your stack. If your agent can't load a community skill in one command, it's a closed garden, and closed gardens lose to ecosystems.

A note on install-count interpretability: the numbers above include CI installs, scripted bulk installs, and probably some bot traffic. Treat them as a relative-popularity proxy, not a clean user count. The signal is noisy at the absolute level but the relative ranking holds up across multiple pulls.
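One simple way to check that a relative ranking "holds up" is to compare the top-k sets from two pulls. This is a sketch of that idea, not the check actually used for this post: topKOverlap is a hypothetical helper, the entry fields are assumed to match the index, and the sample data is invented.

```javascript
// Fraction of the top-k skills (by installs) that two pulls have in common.
// Assumes both pulls contain at least k entries.
function topKOverlap(pullA, pullB, k) {
  const topIds = (pull) =>
    [...pull]
      .sort((a, b) => (b.installs || 0) - (a.installs || 0))
      .slice(0, k)
      .map((s) => `${s.source}-${s.skillId}`);
  const idsInA = new Set(topIds(pullA));
  const shared = topIds(pullB).filter((id) => idsInA.has(id));
  return shared.length / k;
}

// Invented sample pulls for illustration only.
const monday = [
  { source: 'x', skillId: 'a', installs: 10 },
  { source: 'x', skillId: 'b', installs: 5 },
  { source: 'x', skillId: 'c', installs: 1 },
];
const tuesday = [
  { source: 'x', skillId: 'a', installs: 9 },
  { source: 'x', skillId: 'b', installs: 2 },
  { source: 'x', skillId: 'c', installs: 6 },
];
console.log(topKOverlap(monday, tuesday, 2)); // 0.5
```

A value near 1 across several pulls supports reading the list as a stable relative ranking even when the absolute counts are noisy.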

For reference, the actual install command shape from skills_index.json:

# Install find-skills (the #1 skill, by Vercel Labs)
npx skills add https://github.com/vercel-labs/skills --skill find-skills

# Install frontend-design (by Anthropic)
npx skills add https://github.com/anthropics/skills --skill frontend-design

# Install azure-deploy (Microsoft)
npx skills add https://github.com/microsoft/github-copilot-for-azure --skill azure-deploy

One command. No package.json, no virtualenv, no Docker. That low-friction install path is what makes install counts a usable popularity signal in the first place.
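Generating that command from an index entry is a one-liner. This sketch assumes the entry exposes repository and skillId fields, as the catalog page code does. installCommand is a hypothetical helper name.

```javascript
// Build the install command shown above from one index entry.
function installCommand(skill) {
  return `npx skills add ${skill.repository} --skill ${skill.skillId}`;
}

console.log(
  installCommand({ repository: 'https://github.com/vercel-labs/skills', skillId: 'find-skills' })
);
// npx skills add https://github.com/vercel-labs/skills --skill find-skills
```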

What's next

The 50 highest-installed skills now have individual SEO pages at /s/[id], searchable on Google by skill ID, with SoftwareApplication + BreadcrumbList structured data and inline install instructions. That widens the catalog's surface area for long-tail queries like "claude code find-skills install" or "azure-deploy skill setup".

The daily digest email is up next — a curated 5-skill brief delivered each morning, weighted toward newly-published skills with unusually high first-week install velocity. The signal I'm hunting: which skills are about to break out before they hit the top 50.
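The selection logic for that digest could be as simple as ranking recent skills by installs per day. The sketch below is speculative: pickDigest is a hypothetical helper, the firstSeen field does not appear anywhere in this post, and the 7-day window and 5-item limit are just the "first-week" and "5-skill" figures mentioned above.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Pick up to `limit` skills first seen within the last 7 days,
// ranked by installs per day since first seen (firstSeen is an assumed field).
function pickDigest(skills, nowMs, limit = 5) {
  return skills
    .map((s) => {
      const ageDays = Math.max(1, (nowMs - Date.parse(s.firstSeen)) / DAY_MS);
      return { ...s, ageDays, velocity: (s.installs || 0) / ageDays };
    })
    .filter((s) => s.ageDays <= 7)
    .sort((a, b) => b.velocity - a.velocity)
    .slice(0, limit);
}

// Invented sample data for illustration only.
const now = Date.parse('2026-05-24T00:00:00Z');
const picks = pickDigest(
  [
    { skillId: 'new-a', firstSeen: '2026-05-22T00:00:00Z', installs: 200 },
    { skillId: 'new-b', firstSeen: '2026-05-20T00:00:00Z', installs: 1200 },
    { skillId: 'old-c', firstSeen: '2026-01-01T00:00:00Z', installs: 1000000 },
  ],
  now
);
console.log(picks.map((s) => s.skillId)); // [ 'new-b', 'new-a' ]
```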

Open invitation: if you publish a Claude Code skill — public Git repo with a valid SKILL.md, listed on skills.sh or installable via npx skills add — the next scrape will pick it up. No submission form, no review queue, no gatekeeping. If your skill is good, the install counts will surface it on their own; if it isn't, no one will install it and that's also fine.

The live catalog is at orangebot.ai/skills. The long-form pillar — what skills are, how they compare to MCP servers, how to write one — is at orangebot.ai/blog/claude-code-skills-guide.

Go publish a skill. I'll see it on the next scrape.
