On April 2, 2026, an external contributor filed LangChain issue #35357: "Feature: Structured compliance audit logging for EU AI Act (Article 12)."
The request was specific: a ComplianceCallbackHandler that captures execution traces, inputs/outputs, model identifiers, timestamps, human oversight decisions, and risk classifications as structured, tamper-evident logs — the things Article 12 of the EU AI Act requires high-risk AI systems to log automatically.
The issue is closed. No maintainer comment. No existing solution referenced.
That's a problem, because the Article 12 deadline is August 2, 2026. Every team shipping a LangChain agent into a high-risk category between now and then has to answer the same question the filer was asking. And today, if they search GitHub for "LangChain EU AI Act Article 12", they land on a closed ticket with no answer.
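The "tamper-evident" part of that request is worth making concrete. A minimal sketch in plain Python (no LangChain dependency; field names are illustrative, not the issue's proposed API) of hash-chained log records, where each record commits to the previous record's hash so any after-the-fact edit is detectable:

```python
import hashlib
import json
from datetime import datetime, timezone

def append_record(log: list, event: dict) -> dict:
    """Append an event as a hash-chained record: each record stores the
    previous record's hash, so editing any earlier record breaks the chain."""
    prev_hash = log[-1]["record_hash"] if log else "0" * 64
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prev_hash": prev_hash,
        **event,
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["record_hash"] = hashlib.sha256(payload).hexdigest()
    log.append(record)
    return record

def chain_is_intact(log: list) -> bool:
    """Recompute every hash; returns False if any record was altered."""
    prev = "0" * 64
    for rec in log:
        body = {k: v for k, v in rec.items() if k != "record_hash"}
        if body["prev_hash"] != prev:
            return False
        payload = json.dumps(body, sort_keys=True).encode()
        if hashlib.sha256(payload).hexdigest() != rec["record_hash"]:
            return False
        prev = rec["record_hash"]
    return True

log = []
append_record(log, {"model": "gpt-4o", "tool": "lookup_regulation", "outcome": "success"})
append_record(log, {"model": "gpt-4o", "tool": "lookup_regulation", "outcome": "error"})
assert chain_is_intact(log)
log[0]["outcome"] = "success-edited"   # tampering is detectable
assert not chain_is_intact(log)
```

This is the property an auditor can verify mechanically: re-hash the chain and any silent edit shows up as a mismatch.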
Why the existing LangChain logging story falls short
LangChain's BaseCallbackHandler already emits events for tool starts, tool ends, errors, LLM calls, and chain transitions. The problem is not events — the problem is shape. Article 12 doesn't ask for generic telemetry. It asks for:
- Logs that make the system's operation traceable end-to-end, including tool invocations against external systems
- Sufficient information to identify malfunctions, performance drift, and unexpected behavior patterns
- Automatic logging, i.e. turned on by default, not bolted on
- Post-market monitoring support
A hand-rolled callback handler can capture these fields, but it still leaves you exposed to the one question that breaks most home-grown solutions when a compliance auditor asks it: "what's your behavioral baseline for the external tools this agent calls?" In other words, you can log what happened, but you can't show what normal looks like. Without a notion of normal you can't show drift, and without drift detection, post-market monitoring is just paperwork.
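To make the baseline point concrete, here is a deliberately naive sketch (plain Python; all names and thresholds are hypothetical, not part of any package) of what "showing drift" requires: a stored baseline of per-tool error rate and latency, and a comparison against the recent window. Without the baseline half, the comparison is impossible no matter how good your logs are.

```python
from statistics import mean

# Hypothetical baseline: what "normal" looked like for a tool over past traffic.
baseline = {"lookup_regulation": {"p_error": 0.02, "mean_latency_ms": 180.0}}

def drifted(tool: str, recent: list, max_error_ratio: float = 3.0,
            max_latency_ratio: float = 2.0) -> bool:
    """Flag drift when the recent window's error rate or mean latency exceeds
    the baseline by a fixed ratio. Real monitoring would use proper statistics;
    the point is that you need the baseline at all."""
    base = baseline[tool]
    p_error = sum(1 for r in recent if not r["ok"]) / len(recent)
    latency = mean(r["latency_ms"] for r in recent)
    return (p_error > base["p_error"] * max_error_ratio
            or latency > base["mean_latency_ms"] * max_latency_ratio)

calm = [{"ok": True, "latency_ms": 170.0}] * 50
assert not drifted("lookup_regulation", calm)

# 20% errors and doubled latency against a 2% / 180ms baseline: drift.
degraded = [{"ok": i % 5 != 0, "latency_ms": 400.0} for i in range(50)]
assert drifted("lookup_regulation", degraded)
```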
The third-party package that solves it
dominion-observatory-langchain is a PyPI package that subclasses BaseCallbackHandler and hooks the tool lifecycle. For every tool whose metadata carries observatory.server_url, it emits per-call telemetry (server URL, latency, success, error class) to the Dominion Observatory — a cross-ecosystem behavioral baseline for MCP servers that exposes an Article 12-shaped compliance export at /api/compliance.
The observatory is the part that matters. A callback handler alone gets you logs. A callback handler wired to a cross-ecosystem reliability dataset gets you baselines, which is what the drift clause of Article 12 assumes you have.
Install:
```shell
pip install dominion-observatory-langchain
```
Minimal integration:
```python
from langchain_core.tools import Tool
from dominion_observatory_langchain import (
    ObservatoryCallbackHandler,
    trust_gate,
    TrustGateError,
)

handler = ObservatoryCallbackHandler(agent_id="your-agent-uuid")

# Optional pre-flight: refuse to call a tool below a trust floor
try:
    trust_gate("https://some-mcp-server.example.com/mcp", min_trust=70)
except TrustGateError as e:
    # The server's observed reliability is below the floor; reroute or escalate
    print(f"Blocked: {e}")

# agent_executor is your existing LangChain AgentExecutor
agent_executor.invoke(
    {"input": "..."},
    config={"callbacks": [handler]},
)
```
Attach observatory.server_url to any tool whose calls you want traced:
```python
tool = Tool(
    name="lookup_regulation",
    func=my_func,
    description="...",
    metadata={"observatory.server_url": "https://my-mcp-server.example.com/mcp"},
)
```
That's it. Every tool invocation now emits a structured telemetry row whose shape is compatible with Article 12's log-content requirements and whose provenance is explicit: agent ID, tool name, server URL, latency, outcome, timestamp.
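For reference, a row with that shape can be sketched as a frozen dataclass. This is illustrative only: the package's actual serialization and field names may differ; the fields themselves are the ones listed above.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass(frozen=True)
class TelemetryRow:
    # The provenance fields named in the text; exact names are illustrative.
    agent_id: str
    tool_name: str
    server_url: str
    latency_ms: float
    outcome: str    # e.g. "success" or an error class name
    timestamp: str  # ISO 8601, UTC

row = TelemetryRow(
    agent_id="your-agent-uuid",
    tool_name="lookup_regulation",
    server_url="https://my-mcp-server.example.com/mcp",
    latency_ms=212.4,
    outcome="success",
    timestamp=datetime.now(timezone.utc).isoformat(),
)
assert set(asdict(row)) == {
    "agent_id", "tool_name", "server_url", "latency_ms", "outcome", "timestamp"
}
```

Frozen rows are a deliberate choice here: an audit record that can be mutated in place after emission undercuts the tamper-evidence story.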
What this is not
It's not a static code scanner. There are good ones — ark-forge/mcp-eu-ai-act is one — that read your source and flag compliance gaps in your dependencies. That's a different layer. Static scanners tell you what you shipped. Runtime telemetry tells you how it actually behaves once strangers use it. Article 12 is primarily about the second one.
It's also not a replacement for your own application logging. It's a targeted layer for the part of the compliance story that's hardest to produce on your own: a behavioral baseline for the external surfaces your agent talks to.
Why I'm writing this
I built the Observatory and the callback handler. I'm posting this because the GitHub issue that would be the natural home for a pointer to the package is closed, and developers hitting the Article 12 deadline shouldn't have to re-derive the answer.
If you're shipping LangChain into a regulated context before August 2, 2026, or you maintain a framework that would benefit from a reliability baseline primitive, I'd love to hear what's missing. The package is MIT-licensed and the Observatory's free tier isn't going anywhere.
Package on PyPI · Observatory live endpoint · Underlying SDK