I built a self-inspection tool for AI agents with no AI inside it

#ai #mcp #llm #open

There's a small voice that asks "wait, are you sure?" right before you do something dumb. AI agents don't have that voice.

So they commit to the first read of a task and never reopen it. They pile up assumptions they never name. They drift from the goal over a long chain, stop at the first answer that looks right, and get more confident without getting more evidence. None of it is a knowledge failure. The model already knows better. It just never stops to ask.

And here's the catch: an agent can't reliably grow that voice on its own. The part that decides what to reflect on is the same part that's already committed, so prompting a model to "double-check itself" mostly re-runs the same bias and calls it confidence. The question has to come from outside the model.

That's what I built. It's called Self-Inspect.

What it does
The agent sends a thought, a sentence about what it's doing or about to do. It gets back one metathought: a short question that makes it inspect its own task and assumptions before continuing.

thought: "I'm committing to this architecture and treating it as fixed"
metathought: "What is fixed?" (lens: commitment)

thought: "what depends on this being true?"
metathought: "What depends on being true?" (lens: assumption)
Not advice. Not an answer. A question. The agent still does the thinking; the tool just makes it look.

The part people don't expect: there's no model in it
No LLM, no embeddings, no semantic similarity. Selection is a small deterministic heuristic over an open CSV of ~50 "inspection lenses" (137 questions: assumption, confidence, scope, drift, completeness, and so on).

Given a thought, it normalizes the text, scores each lens by keyword overlap with the lens name and that lens's questions, picks the best lens, and returns its canonical question. Same input, same question, every time. You can read the selector (one small file) and the CSV and know exactly why it returned what it did. It can't hallucinate its own critique, because there's nothing in it that hallucinates.

If nothing matches, it doesn't fail. It returns a universal question about task and assumptions, because a tool called Self-Inspect should always give the agent something to question.

Use it (keyless, free)
REST, from anything:

curl -s -X POST https://api.ejentum.com/self-inspect \
-H "Content-Type: application/json" \
-d '{"thought":"I am committing to this architecture and treating it as fixed"}'

-> [{ "label": "commitment", "metathought": "What is fixed?" }]

As an MCP server (Claude Code, Cursor, n8n, any HTTP-MCP client), no install, no key:

{
"mcpServers": {
"self-inspect": {
"type": "http",
"url": "https://api.ejentum.com/self-inspect-mcp"
}
}
}
There's also a stdio package that runs the whole thing offline (SELF_INSPECT_LOCAL=1), no network at all.

Open all the way down
The repo is the tool. The CSV and the selector you read there are the exact logic the live endpoint runs. A drift test fails the build if the deployed engine isn't byte-identical to the published source, so "published == deployed" isn't a promise, it's enforced.

It's deliberately dumb so it stays fully auditable. Whether keyword routing is good enough versus an embedding-based router is a fair question, and one I'd genuinely like feedback on. The lens set is a CSV; adding to it is a pull request.

Repo: https://github.com/ejentum/self-inspect-mcp

It's the first open-source tool I've shipped from Ejentum. Drop it in your agent loop right before it commits to something, and tell me what it asks of yours.

DEV Community

I built a self-inspection tool for AI agents with no AI inside it

-> [{ "label": "commitment", "metathought": "What is fixed?" }]

Top comments (0)