GitHub's search API has a hard limit: 1,000 results per query.
We have 172,000+ skills indexed.
Here's how we built a discovery system that found them all—without breaking any rules.
The Problem: Skills Are Everywhere
AI agents like Claude Code, OpenAI Codex, and GitHub Copilot use SKILL.md files to learn new capabilities. These skills teach agents how to handle PDFs, write Excel formulas, follow brand guidelines, and much more.
The problem? These skills are scattered across thousands of GitHub repositories:
- Some live in ~/.claude/skills/
- Others in .github/skills/
- Many in random skills/ folders
- And countless more in personal dotfiles repos
Finding the right skill is like searching for a needle in a haystack of haystacks.
I tried GitHub's search: filename:SKILL.md. It returned results, but never more than 1,000. The GitHub API documentation confirms this limit—and there's no way around it with a single query.
So I built something different.
Our Approach: Multi-Strategy Discovery
Instead of fighting the 1,000-result limit, we work with it by running multiple specialized searches. Each strategy targets a different slice of the skill ecosystem.
Strategy 1: Path-Based Search
Skills follow predictable directory patterns. We search each path separately:
filename:SKILL.md path:skills
filename:SKILL.md path:.claude
filename:SKILL.md path:.github
filename:SKILL.md path:.codex
Each query can return up to 1,000 results, so four queries yield up to 4,000 potential discoveries before deduplication.
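To make the idea concrete, here is a minimal sketch of how the per-path queries could be generated and their results merged. The `SkillHit` shape and the function names are assumptions made for illustration, not our actual indexer code. A file may match more than one query, so the merge step keeps one entry per repository and path.

```typescript
// Hypothetical shape for one search hit; only the fields this sketch needs.
interface SkillHit {
  repo: string; // "owner/name"
  path: string; // file path inside the repository
}

// The directory patterns listed above.
const SKILL_PATHS: string[] = ["skills", ".claude", ".github", ".codex"];

// Build one search query per directory pattern.
function buildPathQueries(paths: string[]): string[] {
  return paths.map((p) => `filename:SKILL.md path:${p}`);
}

// Merge the result batches of several queries, keeping the first hit
// seen for each repo + path combination.
function mergeHits(batches: SkillHit[][]): SkillHit[] {
  const seen = new Map<string, SkillHit>();
  for (const batch of batches) {
    for (const hit of batch) {
      const key = `${hit.repo}:${hit.path}`;
      if (!seen.has(key)) {
        seen.set(key, hit);
      }
    }
  }
  return Array.from(seen.values());
}
```

Keying on repository plus path rather than on the path alone matters because many repositories use the same relative path for their skills.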
Strategy 2: File Size Segmentation
GitHub lets you filter by file size (in bytes). We segment our searches:
filename:SKILL.md size:<1000 # Small skills
filename:SKILL.md size:1000..5000 # Medium skills
filename:SKILL.md size:>5000 # Large skills
Same filename, different queries, different result sets.
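The size buckets can be generated from a list of boundaries instead of being written by hand. The sketch below is one way to do that; `buildSizeQueries` is a hypothetical helper name, and the boundaries are whatever split points you choose. With `[1000, 5000]` it produces the three queries shown above.

```typescript
// Turn a list of byte boundaries into non-overlapping size-segmented queries.
// The first bucket is "<first", the last is ">last", and the ranges in
// between are inclusive.
function buildSizeQueries(boundaries: number[]): string[] {
  const sorted = boundaries.slice().sort((a, b) => a - b);
  if (sorted.length === 0) {
    return ["filename:SKILL.md"]; // no segmentation requested
  }
  const qualifiers: string[] = [];
  let start: number | null = null; // inclusive lower bound of the next range
  let last = 0; // most recent boundary seen
  for (const bound of sorted) {
    qualifiers.push(start === null ? `size:<${bound}` : `size:${start}..${bound}`);
    start = start === null ? bound : bound + 1;
    last = bound;
  }
  qualifiers.push(`size:>${last}`);
  return qualifiers.map((q) => `filename:SKILL.md ${q}`);
}
```

If a bucket still comes back with exactly 1,000 results, that is a hint it is saturated, and adding a boundary inside it splits it into two smaller queries.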
Strategy 3: Topic-Based Discovery
Many skill repositories use GitHub topics. We search for repos tagged with:
- claude-skills
- agent-skills
- ai-skills
- mcp-skills
- llm-skills
Then deep-scan each repository for SKILL.md files.
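As a rough sketch of the two steps, the code below builds one repository search query per topic and then filters a repository's file listing for skill files. The `TreeEntry` shape is an assumption modeled on the `path` and `type` fields of a recursive Git tree listing; the function names are made up for this example.

```typescript
// Topics we search for, as listed above.
const SKILL_TOPICS: string[] = [
  "claude-skills",
  "agent-skills",
  "ai-skills",
  "mcp-skills",
  "llm-skills",
];

// One repository search query per topic.
function buildTopicQueries(topics: string[]): string[] {
  return topics.map((t) => `topic:${t}`);
}

// Assumed shape of one entry in a recursive file listing of a repository.
interface TreeEntry {
  path: string;
  type: string; // "blob" for files, "tree" for directories
}

// Deep scan: return the paths of every SKILL.md file at any depth.
function findSkillFiles(tree: TreeEntry[]): string[] {
  return tree
    .filter((entry) => entry.type === "blob")
    .map((entry) => entry.path)
    .filter((path) => /(^|\/)SKILL\.md$/.test(path));
}
```

The regular expression anchors on a path separator so that a file such as `MY-SKILL.md` is not mistaken for a skill definition.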
Strategy 4: Awesome List Crawling
The community maintains curated lists of skills:
- awesome-claude-skills
- awesome-agent-skills
- awesome-copilot
We parse these lists and index every linked repository.
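Parsing an awesome list mostly comes down to pulling repository links out of Markdown. The sketch below is a simplified, regex-based take on that step. The `extractRepos` name and the small set of skipped non-repository paths are assumptions for illustration, and a real parser would need a longer skip list.

```typescript
// First path segments on github.com that are not user or organization names.
// Illustrative only; this list is not exhaustive.
const NON_REPO_OWNERS = new Set<string>(["topics", "sponsors", "orgs", "features"]);

// Extract unique "owner/name" pairs from GitHub links in a Markdown document.
function extractRepos(markdown: string): string[] {
  const pattern = /https?:\/\/github\.com\/([A-Za-z0-9-]+)\/([A-Za-z0-9._-]+)/g;
  const repos = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    const owner = match[1];
    // Drop trailing sentence punctuation and a ".git" suffix if present.
    const name = match[2].replace(/\.+$/, "").replace(/\.git$/, "");
    if (name.length > 0 && !NON_REPO_OWNERS.has(owner)) {
      repos.add(`${owner}/${name}`);
    }
  }
  return Array.from(repos);
}
```

Links that point deep into a repository, such as a `tree/main/...` URL, collapse to the same `owner/name` entry, so each linked repository is queued once.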
Strategy 5: Fork Network Traversal
When we find a popular skills repository, we also check its forks. Forks often contain additional or modified skills that never made it back to the original repo.
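One way to picture the traversal is a breadth-first walk over the fork graph with a hard cap on how many repositories get visited. In the sketch below the fork lookup is injected as a function, so nothing here depends on a specific API client. `collectForkNetwork`, `ForkLister`, and the default cap of 500 are all assumptions made for this example.

```typescript
// Hypothetical callback that returns the direct forks of a repository,
// for example by calling the "list forks" endpoint of the GitHub REST API.
type ForkLister = (repo: string) => Promise<string[]>;

// Breadth-first walk over a fork network, visiting at most maxRepos repositories.
async function collectForkNetwork(
  root: string,
  listForks: ForkLister,
  maxRepos: number = 500
): Promise<string[]> {
  const visited = new Set<string>([root]);
  const queue: string[] = [root];
  while (queue.length > 0 && visited.size < maxRepos) {
    const current = queue.shift() as string;
    for (const fork of await listForks(current)) {
      if (visited.size >= maxRepos) break;
      if (!visited.has(fork)) {
        visited.add(fork);
        queue.push(fork);
      }
    }
  }
  return Array.from(visited);
}
```

The cap matters because popular repositories can have thousands of forks, most of them unmodified copies, and each lookup costs an API request.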
The Stack
Here's what powers the discovery and search:
| Component | Technology | Purpose |
|---|---|---|
| Web App | Next.js 15 | Marketplace UI |
| Database | PostgreSQL | Skill metadata, ratings |
| Search | Meilisearch | Full-text search with typo tolerance |
| Queue | Redis + BullMQ | Background crawl jobs |
| CLI | Node.js | Install skills from terminal |
The indexer runs on a schedule:
- Daily: Incremental crawl (new/updated skills)
- Weekly: Full discovery (all strategies)
- On-demand: Process user-submitted repositories
All queries use authenticated GitHub API requests with proper rate limit handling. We rotate between multiple tokens to stay well within limits.
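The rate limit handling can be reduced to one question: how long should a worker wait before its next request? Here is a minimal sketch of that calculation based on GitHub's `x-ratelimit-remaining`, `x-ratelimit-reset`, and `retry-after` response headers. The `RateLimitInfo` type and the function name are assumptions, and the header values are taken as already parsed into numbers.

```typescript
// Rate limit state parsed from the headers of the last response.
interface RateLimitInfo {
  remaining: number; // x-ratelimit-remaining
  resetEpochSec: number; // x-ratelimit-reset, in UTC epoch seconds
  retryAfterSec?: number; // retry-after, when the server sends it
}

// Milliseconds to wait before sending the next request.
function msUntilNextRequest(info: RateLimitInfo, nowMs: number): number {
  // An explicit retry-after always wins.
  if (info.retryAfterSec !== undefined) {
    return info.retryAfterSec * 1000;
  }
  // Quota left: no need to wait.
  if (info.remaining > 0) {
    return 0;
  }
  // Quota exhausted: wait until the window resets (never a negative delay).
  return Math.max(0, info.resetEpochSec * 1000 - nowMs);
}
```

Passing the current time in as `nowMs` instead of calling `Date.now()` inside the function keeps the logic easy to test.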
Results
After running our multi-strategy discovery:
| Metric | Count |
|---|---|
| Skills Indexed | 172,000+ |
| Contributors | 4,000+ |
| Categories | 30 |
| Platforms | Claude, Codex, Copilot |
The search is fast. Type "pdf" and get relevant results in milliseconds, ranked by GitHub stars, download count, and security status.
Every skill is scanned for:
- Dangerous shell commands
- Prompt injection patterns
- Data exfiltration attempts
Skills that pass get a green checkmark. Those with issues get flagged.
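To give a feel for what such a scan looks like, here is a deliberately small, regex-based sketch. The three rules are hypothetical examples written for this post, not our production rule set, and pattern matching like this will always produce both false positives and misses.

```typescript
interface Finding {
  rule: string; // name of the rule that matched
  line: number; // 1-based line number in the skill file
}

// Hypothetical example rules, one per category listed above.
const SCAN_RULES: { name: string; pattern: RegExp }[] = [
  { name: "dangerous-shell", pattern: /rm\s+-rf\s+[\/~]|curl[^|\n]*\|\s*(ba)?sh/i },
  { name: "prompt-injection", pattern: /ignore (all )?(previous|prior) instructions/i },
  {
    name: "data-exfiltration",
    pattern: /(curl|wget)[^\n]*(\$\{?[A-Z_]*(TOKEN|SECRET|KEY)|\.ssh|\.env)/i,
  },
];

// Check every line of a skill file against every rule.
function scanSkill(content: string): Finding[] {
  const findings: Finding[] = [];
  content.split("\n").forEach((text, index) => {
    for (const rule of SCAN_RULES) {
      if (rule.pattern.test(text)) {
        findings.push({ rule: rule.name, line: index + 1 });
      }
    }
  });
  return findings;
}

// A skill passes when no rule matched.
function passesScan(content: string): boolean {
  return scanSkill(content).length === 0;
}
```

Reporting the rule name and line number, rather than a bare pass or fail, lets a flagged skill show readers exactly what triggered the warning.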
Try It Now
Install the CLI:
npm install -g skillhub
Search for skills:
skillhub search pdf
Install a skill:
skillhub install anthropics/skills/pdf
Or browse all 172,000+ skills on the web:
What's Next
We're working on:
- Native Claude Code integration via MCP
- Skill verification with author confirmation
- Usage analytics so you know which skills actually work
The entire project is open source under the MIT license.
Your Turn
What skills would you like to see indexed? Any repositories we should add?
Drop a comment below—I read every one.
Built with Next.js, PostgreSQL, Meilisearch, and way too much coffee.


