I built Flow-Guard because I keep seeing the same failure mode with AI coding agents: a local change looks correct, but the larger workflow or architecture is still wrong.
Flow-Guard is an open-source Python tool that acts like a lightweight workflow / architecture simulator before an agent writes or changes code.
It is not an LLM wrapper, and it is not prompt-only. The core is an explicit finite-state model: you define the workflow's states, transitions, inputs, and outputs, and Flow-Guard explores that model and reports concrete counterexample traces.
In other words, it uses a state-machine-style simulation to reason about the workflow before any code is generated or changed.
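To make that concrete, here is a minimal sketch of the idea. This is not Flow-Guard's actual API: the states, events, and `find_counterexample` helper are all hypothetical. The sketch models a retry workflow as explicit transitions, then searches every reachable state for one the workflow can never leave.

```python
from collections import deque

# Hypothetical model of a retry workflow as explicit transitions:
# (state, event) -> next state.
TRANSITIONS = {
    ("queued",  "start"):    "running",
    ("running", "succeed"):  "done",
    ("running", "fail"):     "failed",
    ("failed",  "retry"):    "running",
    ("failed",  "escalate"): "manual_review",  # bug: nothing ever leaves this state
}
GOAL = "done"

def find_counterexample(initial):
    """BFS over reachable states; return the shortest event trace that
    reaches a stuck state (no outgoing transitions and not the goal)."""
    queue = deque([(initial, [])])
    seen = {initial}
    while queue:
        state, trace = queue.popleft()
        outgoing = [(ev, dst) for (src, ev), dst in TRANSITIONS.items() if src == state]
        if state != GOAL and not outgoing:
            return trace  # concrete counterexample: how the workflow got stuck
        for event, dst in outgoing:
            if dst not in seen:
                seen.add(dst)
                queue.append((dst, trace + [event]))
    return None

print(find_counterexample("queued"))  # ['start', 'fail', 'escalate']
```

Running this prints `['start', 'fail', 'escalate']`: a concrete trace showing how the workflow reaches a state it can never leave, which is the kind of counterexample trace Flow-Guard is meant to report.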
What it tries to catch
Right now I am focusing on three broad classes of problems:
- duplicate side effects or repeated actions (see the sketch after this list)
- state / cache / source-of-truth drift
- stuck loops, retry paths, or progress failures
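For the first class, a similarly hedged sketch (again with hypothetical names, not the real API): a bounded exploration of every event trace, flagging any trace in which a side-effecting event fires twice.

```python
from collections import deque

# Hypothetical payment model illustrating the duplicate-side-effect class:
# the timeout path returns to "cart", where "charge" can fire a second time.
TRANSITIONS = {
    ("cart",    "charge"):  "charged",
    ("charged", "ship"):    "shipped",
    ("charged", "timeout"): "cart",  # bug: retry re-enters the charging state
}
SIDE_EFFECT = "charge"

def find_duplicate_side_effect(initial, limit=6):
    """Explore every event trace up to `limit` steps; return the first
    trace in which the side-effecting event fires more than once."""
    queue = deque([(initial, [])])
    while queue:
        state, trace = queue.popleft()
        if trace.count(SIDE_EFFECT) > 1:
            return trace
        if len(trace) >= limit:
            continue  # bound the search; no `seen` set here, since trace history matters
        for (src, event), dst in TRANSITIONS.items():
            if src == state:
                queue.append((dst, trace + [event]))
    return None

print(find_duplicate_side_effect("cart"))  # ['charge', 'timeout', 'charge']
```

The returned trace, `['charge', 'timeout', 'charge']`, is exactly the shape of bug this class describes: a customer charged twice because the retry path silently re-enters the charging state.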
The goal is not to replace tests or claim full formal verification. I am trying to add a design-time guardrail: simulate the logic first, inspect the failures, then let the coding agent implement or modify the real code.
The easiest way to try it is to hand the GitHub URL to your coding agent and ask it to install the project and run the examples.
Repo:
https://github.com/liuyingxuvka/FlowGuard
Feedback:
https://github.com/liuyingxuvka/FlowGuard/discussions/1
I would especially like feedback on:
- whether the concept is understandable
- whether the "workflow / architecture simulator" framing describes it well
- what real AI-agent workflow bugs would be worth modeling next