Built an open-source static analyzer for AI agent code. Not evals, not runtime monitoring — actual source code analysis.
Pointed it at 53 popular repos (LangChain, CrewAI, AutoGen, OpenHands, MetaGPT, etc.). Results:
- 42 out of 53 had at least one finding
- 20 had CRITICAL severity
- Most common: missing human oversight on tool calls that execute code
- Runner-up: agent loops with no exit condition
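For context, the two most common findings tend to look something like this. This is an illustrative sketch, not code from any scanned repo; `fake_llm` and `fake_run_shell` are hypothetical stand-ins so the snippet is runnable:

```python
def agent_loop(task: str) -> str:
    history = [task]
    while True:  # Finding 2: no iteration cap; only exit is the model's own output
        action = fake_llm(history)  # model decides the next step
        if action.startswith("run:"):
            # Finding 1: model-chosen command executed with no human approval gate
            result = fake_run_shell(action[4:].strip())
            history.append(result)
        elif action.startswith("done:"):
            return action[5:].strip()

# Stand-ins for an LLM call and a shell tool (hypothetical).
def fake_llm(history):
    return "done: ok" if len(history) > 1 else "run: echo hello"

def fake_run_shell(cmd):
    return f"(pretend output of {cmd!r})"
```

A bounded loop (`for _ in range(max_steps)`) plus a confirmation prompt before the `run:` branch addresses both findings.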
The scanner builds an intermediate representation of agent logic (like LLVM but for agents), then runs taint tracking from user input → LLM call → tool execution.
Works with 11 frameworks. No AI in the scanner. Apache 2.0.
Try it: app.inkog.io/scan — paste any GitHub URL, 60 seconds, no signup.
GitHub: github.com/inkog-io/inkog
Feedback welcome — what patterns are we missing?