I’m experimenting with a semantic search workflow for discovering agent skills from natural-language task descriptions.
Many skill lists are still keyword-based, which makes it hard to compare similar skills before trying them. I indexed ~2.5k skills and use semantic retrieval to surface candidates for a given scenario.
1. Website mode (baseline semantic search)
You can type a scenario like:
“I’d like to conduct a market analysis”
…and get a ranked list of candidate skills.
You can click a skill card to view details and inspect its SKILL.md / manifest.
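
Under the hood, the baseline is plain dense retrieval over the skill index. Here’s a minimal sketch (sentence-transformers is used purely for illustration, and the skill records below are made up):

```python
# Minimal baseline: one vector per skill, brute-force cosine similarity.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model would do

# Hypothetical skill records distilled from each SKILL.md
skills = [
    {"name": "market-research", "summary": "Collect and summarize market data for a product segment."},
    {"name": "notes-organizer", "summary": "Cluster and file scattered meeting notes into a structured archive."},
]

skill_vecs = model.encode([s["summary"] for s in skills], normalize_embeddings=True)

def search(scenario: str, top_k: int = 5):
    q = model.encode([scenario], normalize_embeddings=True)[0]
    scores = skill_vecs @ q                      # cosine similarity (vectors are normalized)
    order = np.argsort(-scores)[:top_k]
    return [(skills[i]["name"], float(scores[i])) for i in order]

print(search("I'd like to conduct a market analysis"))
```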
2. Agent-native mode: let an agent turn vague prompts into structured search queries
This is the part I personally use the most.
Instead of going to a website and trying to craft the “right keywords”, I use an agent-side helper (a small “discover” prompt) to convert a vague request into a search goal + keywords, then query the index. This fits CLI-style agent workflows.
After installation, the agent can:
- Ask a couple of simple questions (e.g., install scope/path)
- Then you just describe your scenario in plain English — even if it’s abstract, vague, or messy
- `discover-skills` will translate that into a structured search (task goal + keywords), query the index, and return candidates with short match reasons
Here’s an example with a very “vague” need:
“I have a bunch of meeting notes scattered everywhere and I want to organize them better. Is there a skill for that?”
The agent turns it into a query + keywords, retrieves candidates, and suggests what to install next.
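
The structured query itself is small. Roughly this shape (the field names here are illustrative, not the exact discover-skills output):

```python
# Hypothetical shape of what the "discover" step hands to the index.
from dataclasses import dataclass

@dataclass
class SkillQuery:
    goal: str                # one-sentence restatement of the task
    keywords: list[str]      # expansion terms the agent extracted

def to_search_text(q: SkillQuery) -> str:
    # Concatenate goal and keywords into a single string for the embedding model;
    # an alternative is embedding goal and keywords separately and merging scores.
    return q.goal + " | " + ", ".join(q.keywords)

q = SkillQuery(
    goal="Organize scattered meeting notes into a consistent structure",
    keywords=["meeting notes", "note taking", "knowledge base", "summarization", "filing"],
)
print(to_search_text(q))  # feed this to the same search() used for website mode
```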
Question (embeddings for skill retrieval)
I’d love advice on how you’d embed and index a SKILL.md-style skill definition for semantic retrieval.
Right now I’m thinking about embedding each skill from multiple “views” (e.g., what it does, use cases, inputs/outputs, examples, constraints), but I’m not fully sure what structure works best.
- How would you chunk/structure SKILL.md (by section, by template fields, or by examples)?
- Single vector per skill vs multi-vector per section/view — and how do you aggregate scores at query time?
- Which fields usually move retrieval quality most (examples, tool/actions, constraints, tags, or “when not to use”)?
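
For reference, here’s the kind of multi-view setup I’m considering: one vector per (skill, view) pair, with per-skill max aggregation at query time. The view names and the toy skill below are illustrative:

```python
# Multi-view sketch: embed several "views" per skill, aggregate with max per skill.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical views distilled from one SKILL.md
skill_views = {
    "notes-organizer": {
        "what_it_does": "Clusters and files scattered meeting notes into a structured archive.",
        "use_cases": "Tidying up meeting notes, building a personal knowledge base.",
        "when_not_to_use": "Real-time transcription or calendar scheduling.",
    },
    # ... more skills
}

# One vector per (skill, view); remember which skill each row belongs to
rows, owners = [], []
for name, views in skill_views.items():
    for view_text in views.values():
        rows.append(view_text)
        owners.append(name)
view_vecs = model.encode(rows, normalize_embeddings=True)

def search(scenario: str, top_k: int = 5):
    q = model.encode([scenario], normalize_embeddings=True)[0]
    sims = view_vecs @ q
    best = {}                      # max over a skill's views = that skill's score
    for owner, s in zip(owners, sims):
        best[owner] = max(best.get(owner, -1.0), float(s))
    return sorted(best.items(), key=lambda kv: -kv[1])[:top_k]

print(search("I have meeting notes everywhere and want to organize them"))
```

Max keeps a skill competitive if any single view matches well; mean or weighted sums are the obvious alternatives, which is part of what I’m asking about.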