We spend a lot of time securing the inputs to our LLMs—filtering prompts, checking for injections.
But in the world of AI Agents, we have a new blind spot: Tool Outputs.
When an agent calls get_jira_ticket, the response often contains a dump of raw text. In my case, that text contained user emails and internal secrets.
If I logged that context window to an observability tool, I was essentially persisting secrets in a dashboard.
So, I built QuiGuard to solve this. Here is how it works under the hood.
The Architecture
I didn't want to rewrite the agent frameworks (LangChain/AutoGen). I needed something that sat transparently in the middle.
The solution was a Reverse Proxy.
Interception: The proxy accepts the OpenAI-compatible API request.
Traversal: It recursively walks through the messages array.
The Gatekeeper Logic: If it sees a message with role: "tool", it knows this is data coming back from an API.
The Challenge: Recursive JSON
Tool responses aren't always clean strings. Sometimes they are stringified JSON inside JSON.
To handle this, I wrote a recursive scrubber:
def _recursive_scrub(data):
if isinstance(data, dict):
return {k: _recursive_scrub(v) for k, v in data.items()}
elif isinstance(data, list):
return [_recursive_scrub(item) for item in data]
elif isinstance(data, str):
# It's a string. Is it stringified JSON? Try to parse.
try:
nested_data = json.loads(data)
scrubbed_nested = _recursive_scrub(nested_data)
return json.dumps(scrubbed_nested)
except json.JSONDecodeError:
# Not JSON, just a normal string. Scrub PII.
return sanitize_text(data)
else:
return data
This ensures that even if a tool returns {"body": "{\"user\": \"secret@...\"}"}, we catch the secret.
The Result
Clean Logs: My LangSmith traces now show instead of real emails.
Safe Context: The LLM processes the logic without "seeing" the sensitive data.
Restoration: The user sees the real data in the final reply.
I open-sourced the project (MIT).
Repo: https://github.com/somegg90-blip/quiguard-gateway
Curious if others have run into the "messy tool output" problem? Let me know in the comments!
Top comments (0)