DEV Community

capman

I built a tool that lets AI agents interact with your app without navigating the UI

When I started thinking about how AI agents interact with applications, something bothered me.

Today, when an AI agent needs to answer "are there seats available for Friday?" — it navigates your app like a tourist with no map:

AI clicks → Home → Explore → Events → Category → Availability

That's slow, wasteful, and it exposes parts of your app the AI was never supposed to see.

So I built capman — a Capability Manifest Engine that gives AI agents a structured map of what your app can do, and shows you exactly why it made every decision.


The core idea

Your app publishes a capability manifest — a machine-readable list of everything it can do, what API to call, and what data scope is allowed.

// capman.config.js
module.exports = {
  app: 'my-app',
  baseUrl: 'https://api.my-app.com',
  capabilities: [
    {
      id: 'check_availability',
      name: 'Check availability',
      description: 'Check if a product or slot is available on a given date.',
      examples: [
        'Are there seats available Friday?',
        'Check availability for blue jacket',
      ],
      params: [
        { name: 'item', description: 'Item name or ID', required: true,  source: 'user_query' },
        { name: 'date', description: 'Date to check',   required: false, source: 'user_query' },
      ],
      returns: ['available', 'count', 'price'],
      resolver: {
        type: 'api',
        endpoints: [{ method: 'GET', path: '/availability/{item}' }],
      },
      privacy: { level: 'public' },
    },
  ],
}

The AI reads this manifest and goes directly to the answer — no navigation, no guessing.
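Conceptually, once a capability is matched, resolution is just filling the endpoint template with the extracted params. A minimal sketch of that idea (`buildRequest` is hypothetical, not capman's actual internals):

```javascript
// Hypothetical sketch: turn a matched capability plus extracted params
// into a concrete HTTP request. Not capman's real resolver code.
function buildRequest(baseUrl, capability, params) {
  const { method, path } = capability.resolver.endpoints[0]
  // Replace each {placeholder} in the path with its extracted value
  const resolvedPath = path.replace(/\{(\w+)\}/g, (_, name) =>
    encodeURIComponent(params[name]),
  )
  return { method, url: baseUrl + resolvedPath }
}

const capability = {
  id: 'check_availability',
  resolver: {
    type: 'api',
    endpoints: [{ method: 'GET', path: '/availability/{item}' }],
  },
}

const req = buildRequest('https://api.my-app.com', capability, { item: 'seats' })
console.log(req)
// → { method: 'GET', url: 'https://api.my-app.com/availability/seats' }
```

No crawling through screens: the manifest already encodes where the answer lives.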


The part I'm most proud of: the execution trace

Every query returns a full execution trace. No black box.

const engine = new CapmanEngine({ manifest, baseUrl: 'https://api.my-app.com' })
const result = await engine.ask('Are there seats available Friday?')

console.log(result.trace)
// {
//   query: 'Are there seats available Friday?',
//   candidates: [
//     { capabilityId: 'check_availability', score: 100, matched: true  },
//     { capabilityId: 'get_orders',         score: 12,  matched: false },
//     { capabilityId: 'navigate_to_screen', score: 0,   matched: false },
//   ],
//   reasoning: [
//     'Matched "check_availability" with 100% confidence',
//     'Rejected: get_orders (12%)',
//     'Resolved via: keyword',
//     'Extracted params: item=seats, date=Friday',
//   ],
//   steps: [
//     { type: 'cache_check',   status: 'miss', durationMs: 0 },
//     { type: 'keyword_match', status: 'pass', durationMs: 1, detail: 'confidence: 100%' },
//     { type: 'privacy_check', status: 'pass', durationMs: 0, detail: 'level: public'    },
//     { type: 'resolve',       status: 'pass', durationMs: 2, detail: 'via api'          },
//   ],
//   totalMs: 4,
// }

This answers the question every AI developer eventually asks:

"Why did it pick this capability?"
"Why did it extract the wrong parameters?"
"Why did it fail?"

Today with LangChain, OpenAI function calling, or custom agents — there's no clear answer. Logs are scattered, debugging is painful.

capman makes AI execution debuggable like backend code.
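Because the trace is plain data, debugging helpers are a few lines of ordinary code. For example, a helper that surfaces the first step that didn't pass (hypothetical, assuming the trace shape shown above; not part of capman's API):

```javascript
// Hypothetical helper: find the first step in a trace that failed.
// Assumes the { steps: [{ type, status, ... }] } shape shown above.
// A 'miss' (e.g. cache miss) is normal, so only real failures count.
function firstFailure(trace) {
  return trace.steps.find((s) => s.status !== 'pass' && s.status !== 'miss') ?? null
}

const trace = {
  steps: [
    { type: 'cache_check',   status: 'miss', durationMs: 0 },
    { type: 'keyword_match', status: 'pass', durationMs: 1 },
    { type: 'privacy_check', status: 'fail', durationMs: 0, detail: 'auth required' },
  ],
}

console.log(firstFailure(trace))
// → { type: 'privacy_check', status: 'fail', durationMs: 0, detail: 'auth required' }
```

The same data can feed dashboards, alerts, or regression tests on agent behavior.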


Three matching modes — control cost vs accuracy

// cheap — keyword only, free, fast
const engine = new CapmanEngine({ manifest, mode: 'cheap' })

// balanced — keyword first, LLM fallback if confidence < 50% (default)
const engine = new CapmanEngine({ manifest, mode: 'balanced', llm: myLLM })

// accurate — LLM first, best accuracy at higher cost
const engine = new CapmanEngine({ manifest, mode: 'accurate', llm: myLLM })
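The balanced mode boils down to: try keyword matching first, and only pay for an LLM call when confidence is low. A sketch of that control flow under the 50% threshold mentioned above (hypothetical code; `pickCapability`, `kwMatch`, and `stubLLM` are illustrative names, not capman's source):

```javascript
// Hypothetical sketch of balanced-mode dispatch: keyword first,
// LLM fallback only when keyword confidence is below the threshold.
async function pickCapability(query, keywordMatch, llmMatch, threshold = 50) {
  const kw = keywordMatch(query) // { capabilityId, score }
  if (kw.score >= threshold) return { ...kw, via: 'keyword' }
  const id = await llmMatch(query) // only invoked on low confidence
  return { capabilityId: id, score: null, via: 'llm' }
}

// Toy matcher and model stub to exercise both branches:
const kwMatch = (q) =>
  q.toLowerCase().includes('available')
    ? { capabilityId: 'check_availability', score: 100 }
    : { capabilityId: null, score: 0 }
const stubLLM = async () => 'get_orders'

pickCapability('Are there seats available Friday?', kwMatch, stubLLM)
  .then((r) => console.log(r.via)) // → 'keyword' (no LLM call made)
```

The cost profile falls out naturally: common phrasings stay free, odd ones escalate to the model.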

The LLM function is plug-and-play — works with Anthropic, OpenAI, or any model:

const engine = new CapmanEngine({
  manifest,
  mode: 'balanced',
  llm: async (prompt) => {
    const res = await anthropic.messages.create({
      model: 'claude-sonnet-4-20250514',
      max_tokens: 500,
      messages: [{ role: 'user', content: prompt }],
    })
    return res.content[0].text
  },
})
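Since `llm` is just `async (prompt) => string`, a stub works in unit tests too, with no network calls or API keys. The exact response format the engine expects back from the model is capman's concern, so treat this as a shape sketch only:

```javascript
// Illustrative stub: any async (prompt) => string satisfies the llm option.
// The canned return value here is arbitrary; what the engine actually
// expects from the model is defined by capman, not by this sketch.
const stubLLM = async (prompt) => 'check_availability'

// Drop-in wherever a real adapter would go, e.g.:
// new CapmanEngine({ manifest, mode: 'balanced', llm: stubLLM })
stubLLM('Which capability matches: "seats Friday"?').then((reply) => {
  console.log(reply) // → 'check_availability'
})
```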

Privacy enforcement — per capability, not per request

{
  id: 'get_my_orders',
  privacy: { level: 'user_owned' },  // blocked without auth
}

{
  id: 'get_admin_report',
  privacy: { level: 'admin' },       // blocked without admin role
}

Pass auth context once — capman enforces it before resolution:

const engine = new CapmanEngine({
  manifest,
  auth: {
    isAuthenticated: true,
    role: 'user',
    userId: 'user-123',  // auto-injected into session params
  },
})
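The enforcement rule implied by the levels above is simple enough to sketch in a few lines. This is a hypothetical reconstruction (`isAllowed` is not capman's API), and unknown levels fail closed:

```javascript
// Hypothetical sketch of a per-capability privacy gate, based on the
// three levels shown above. Not capman's actual enforcement code.
function isAllowed(capability, auth) {
  switch (capability.privacy.level) {
    case 'public':     return true
    case 'user_owned': return auth.isAuthenticated
    case 'admin':      return auth.isAuthenticated && auth.role === 'admin'
    default:           return false // unknown levels fail closed
  }
}

const auth = { isAuthenticated: true, role: 'user', userId: 'user-123' }

console.log(isAllowed({ privacy: { level: 'user_owned' } }, auth)) // → true
console.log(isAllowed({ privacy: { level: 'admin' } }, auth))      // → false
```

Because the check runs before resolution, a blocked capability never produces an API call at all.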

See it live in 30 seconds

npx capman demo

That runs a live demo against a sample e-commerce app — no config needed. You'll see matched capabilities, execution traces, candidate scores, and API calls constructed in real time.

Or run a query against your own manifest:

npx capman init          # creates capman.config.js
npx capman generate      # generates manifest.json
npx capman run "are there seats available Friday?" --debug

Get started

npm install capman

What I'm working on next

The v0.5.0 roadmap includes:

  • Wiring the learning index back into the keyword matcher — usage data improves matching over time
  • CapmanEngine.explain(query) — understand why a query would match without executing
  • Redis adapter for multi-instance deployments
  • mode: 'adaptive' — uses learning data to boost confidence on proven patterns

If you're building AI agents and you've ever stared at logs wondering why the AI did what it did — capman is for you.

Would love feedback. What would make this useful for your stack?
