The Code Exists, but the Container Is Still Old: A Real Runtime Drift Failure in Docker Operations
We recently hit a common but easy-to-miss failure in OpenClaw / AI Back Office operations. The conclusion was simple: a feature existing in the source code is not the same thing as that feature existing in the running container.
The target was the workflow module in ai-backoffice-pack. In the repository, the workflow implementation was clearly present. But in the actual UI, the feature behaved as if it did not exist. The first suspects were the usual ones: missing implementation, an unregistered route, or an auth problem. None of those were the root cause. The real problem was that the production api and dashboard containers were still running with old build artifacts.
In other words, the source had the workflow module, but the running container’s /app/dist/modules directory did not. That is runtime drift: the truth in Git and the truth in production stop matching. If you only read the code, it is easy to miss.
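To make the idea of drift concrete, here is a minimal sketch that compares the module names expected from source against a directory listing taken from inside the running container. The function name `missing_in_runtime` and the module names `auth` and `billing` are hypothetical; only `workflow` and `/app/dist/modules` come from the incident.

```python
from typing import Iterable, List


def missing_in_runtime(source_modules: Iterable[str], runtime_listing: str) -> List[str]:
    """Return module names that exist in source but not in the runtime listing.

    runtime_listing is the raw text of a directory listing (one entry per line),
    for example the output of listing /app/dist/modules inside the container.
    """
    runtime_modules = {
        line.strip().rstrip("/")
        for line in runtime_listing.splitlines()
        if line.strip()
    }
    return sorted(set(source_modules) - runtime_modules)


# Hypothetical data: source has three modules, the old container only two.
source_modules = ["auth", "billing", "workflow"]
runtime_listing = "auth\nbilling\n"
print(missing_in_runtime(source_modules, runtime_listing))  # ['workflow']
```

An empty result means source and runtime agree on the set of modules. It says nothing about whether the contents of each module are current.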
What helped most was not expanding the investigation too early. We kept the verification order tight:
- Confirm the workflow implementation exists in source.
- Confirm the workflow module is included in the build artifact.
- Confirm that artifact is actually present inside the running container.
- Confirm the route is exposed.
- Confirm how the response changes after authentication.
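The checklist above can be sketched as an ordered list of checks that stops at the first failing layer. This is an illustration under assumptions, not the tooling we used: `first_failing_layer` is a hypothetical helper, and the lambdas are stubs standing in for real checks.

```python
from typing import Callable, List, Optional, Tuple

Check = Tuple[str, Callable[[], bool]]


def first_failing_layer(checks: List[Check]) -> Optional[str]:
    """Run checks in order and return the name of the first one that fails.

    Returns None when every layer passes. Later checks are not executed once
    a failure is found, which keeps the investigation from expanding early.
    """
    for name, check in checks:
        if not check():
            return name
    return None


# Stubbed results for illustration: source and artifact are fine,
# but the running container does not contain the module.
checks: List[Check] = [
    ("source", lambda: True),
    ("artifact", lambda: True),
    ("container", lambda: False),
    ("route", lambda: False),
    ("auth", lambda: False),
]
print(first_failing_layer(checks))  # container
```

Stopping at the first failure is the point of the design: a missing route is not worth debugging while the container is known to be stale.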
That order turned one detail into strong evidence: at first the endpoint returned 404, and after a rebuild it returned 401. A 404 strongly suggests the route itself is not there. Once it changes to 401, you know the route is alive and the next layer to inspect is authentication. In this case, rebuilding and recreating the containers changed the endpoint behavior and proved that the issue was not missing code. It was an old container still serving stale artifacts.
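The 404-to-401 reading can be written down as a small lookup. This is a rough heuristic, not a rule: `interpret_status` is a hypothetical helper, and a 404 can also come from an application that hides resources from unauthenticated callers, so treat the result as a hint about where to look next.

```python
def interpret_status(status: int) -> str:
    """Map an HTTP status code to the next layer worth inspecting."""
    if status == 404:
        return "route not found: check the artifact in the running container"
    if status == 401:
        return "route is exposed: inspect authentication next"
    if 200 <= status < 300:
        return "route is exposed and the request was accepted"
    return "inconclusive: check service logs"


# Before and after the rebuild in this incident.
print(interpret_status(404))
print(interpret_status(401))
```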
The fix itself was not dramatic. On the infra node, we ran docker compose build api dashboard, then docker compose up -d api dashboard, and finally rechecked /app/dist/modules/workflow inside the container. After that, the workflow module was present in the runtime as expected.
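The same three steps can be sketched as a small script. The `docker compose build` and `docker compose up -d` invocations are the ones described above; using `docker compose exec -T api ls ...` for the recheck is an assumption, since other ways of inspecting the container work equally well, and `recovery_commands` and `run_all` are hypothetical helper names.

```python
import subprocess
from typing import List


def recovery_commands(services: List[str], check_service: str, module_path: str) -> List[List[str]]:
    """Build the rebuild, recreate, and recheck commands as argument lists."""
    return [
        ["docker", "compose", "build", *services],
        ["docker", "compose", "up", "-d", *services],
        # Assumption: -T disables TTY allocation so this also works from scripts.
        ["docker", "compose", "exec", "-T", check_service, "ls", module_path],
    ]


def run_all(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first non-zero exit code."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Print the commands instead of running them, so the sketch is safe to execute.
for argv in recovery_commands(["api", "dashboard"], "api", "/app/dist/modules/workflow"):
    print(" ".join(argv))
```

Calling `run_all` on the returned list would execute the steps on the infra node. The final `ls` fails with a non-zero exit code if the directory is still missing, which makes the recheck an explicit pass/fail step.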
The operational lesson was straightforward:
- Do not conclude “it is there” just because you saw it in source.
- In Docker-based systems, always separate source, build artifact, and running container in your checks.
- A 404 changing into 401 is an important observation point during recovery.
- Even when the problem looks like a UI issue, the real cause may be deployment drift.
AI and multi-agent systems have many layers: configuration, containers, routing, and authentication. That is why the sequence source → artifact → container → route → auth is so effective. If you stay at the vague level of “the code exists, so why is it broken?”, you can lose hours. That was the real lesson from this incident.