A trust wobble hits AI coding tools: hidden reasoning and a runaway bug

#codingagents #trust #security #openai

Two separate incidents this week exposed a trust deficit in AI coding assistants. A widely shared analysis argued that one popular tool's "thinking" output is a polished summary, not the model's raw reasoning, and OpenAI's Codex tool quietly wrote enormous log files to users' local drives, pegging hardware even while idle. Both incidents converged on the same question: whether developers can actually trust what these tools show them and what they do behind the scenes.

Key facts

What: Two heated developer threads converge on one worry -- whether you can trust what an AI coding assistant shows you it's thinking, and what it quietly does to your machine.
When: 2026-06-22
Primary source: read the source

The first incident concerns reasoning transparency. Many AI coding tools now display a "thinking" panel: a stream of text that looks like the model reasoning its way to an answer. A widely shared post argued that the text in this thinking output is not authentic: it is not the model's real, raw thought process but a cleaned-up summary produced after the fact. The concern isn't just that it's a summary. Treating that visible text as the model's genuine, trustworthy inner monologue could mislead you, and the gap could even be a target for manipulation if a malicious input managed to influence the hidden reasoning while the polished summary looked perfectly innocent.

The second incident proved more visceral. Developers using OpenAI's Codex tool reported a bug where it quietly wrote enormous volumes of log data to their local drives and pegged their hardware even while sitting idle (Codex issue #28224). To people already half-joking that AI is writing sloppy code, the irony was irresistible: the company's own coding tool appeared to be hurting the machines of the people using it. OpenAI acknowledged and fixed the issue the same day, but not before it became a lightning rod for broader frustration.
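For readers who want to check whether any tool is quietly filling their disk, a minimal sketch follows. It is not taken from the Codex issue and does not assume where Codex writes its logs: the function names are hypothetical, and you would point it at whatever directory you suspect. The demo uses a throwaway temporary directory.

```python
import os
import tempfile


def dir_size_bytes(root: str) -> int:
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Skip symlinks so linked files are not counted twice.
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; ignore it.
                continue
    return total


def exceeds_budget(root: str, budget_bytes: int) -> bool:
    """True if the directory tree under root is larger than budget_bytes."""
    return dir_size_bytes(root) > budget_bytes


if __name__ == "__main__":
    # Demo on a temporary directory; replace with a directory you want to watch.
    with tempfile.TemporaryDirectory() as demo_dir:
        with open(os.path.join(demo_dir, "demo.log"), "wb") as fh:
            fh.write(b"x" * 2048)
        print(dir_size_bytes(demo_dir))
        print(exceeds_budget(demo_dir, 1024))
```

Running a check like this on a schedule, and comparing successive totals, is one simple way to notice runaway growth before the drive fills up.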

These flare-ups share a common thread. When a tool was a novelty you tried for fun, you didn't much care how transparent its reasoning was or how tidy it was with your disk. When the same tool becomes the thing you rely on to write production code all day, every detail of its behavior becomes a question of trust, and trust has layers. Do I understand what it's actually doing? (The reasoning-transparency worry.) Is it safe to run on my machine and my codebase? (The runaway-bug worry.) Both surfaced at once, and that's why a single week's grumbling reads as a genuine mood shift rather than two unrelated complaints.

The dynamic is straightforward: depend on something, and you start auditing it. The questions developers are now asking of AI coding assistants are the same ones you'd ask of anyone you've given the keys to your house. When you explain what you did, is that the real story or a tidy version? Did you leave the place in good shape, or track mud everywhere while I wasn't looking? Those aren't signs the tool is useless — they're the questions you ask precisely because you've come to depend on it. For the bigger picture of how these self-directed tools work, see our explainer on AI agents.

The value of an AI coding agent is bounded by how much you can trust it unsupervised, and these incidents poke at exactly that ceiling. If you can't trust the reasoning it shows you, you have to double-check everything, which erodes the time savings that made it worth using. If you can't trust it to behave well on your system, you have to babysit it — same problem. The tools are getting more capable; this week was a reminder that capability and trustworthiness are different axes, and the second one is now getting scrutiny.

Honest caveats apply. The "reasoning isn't authentic" critique is contested — summarizing a model's thinking for readability isn't automatically deception, and many would argue a clean summary is more useful than a raw firehose; the sharper, more defensible point is the security one, that you shouldn't treat hidden reasoning as a safe, trusted channel. And the Codex bug, while real and embarrassing, was a logging mistake that got patched quickly, not evidence the tool is fundamentally broken. The durable takeaway isn't "these tools are bad" — it's that the developer community has started holding them to the higher standard you apply to things you actually depend on.

Originally published on Ground Truth, where every claim is checked against the primary source.