AIvora Labs for AgentTrail

Posted on Jun 13

Why AgentTrail Exists: Building Open-Source Audit Trails for AI Agents

#euaiact #opensource #ai #security

The EU AI Act is now in force, and compliance deadlines for high-risk AI systems are approaching. Many mid-market organizations are still figuring out what "record-keeping" actually means in practice. This is why we built AgentTrail: an open-source SDK designed to make AI decision traceability practical, transparent, and affordable.

What Article 12 Actually Requires

The European Union's Artificial Intelligence Act (EU AI Act, Regulation (EU) 2024/1689) entered into force on 1 August 2024. Article 12, Record-Keeping, requires providers of high-risk AI systems to design those systems so that they automatically generate logs throughout their lifecycle.

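These logs must be sufficient to support monitoring of the system's operation, post-market monitoring, incident investigation, and regulatory compliance. The Act also requires that logs be retained for a period appropriate to the system's intended purpose, at least six months unless other Union or national law provides otherwise (Articles 19 and 26(6)), and made available to competent authorities when required.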

High-risk AI systems are defined primarily in Annex III of the Regulation and include use cases such as:

Recruitment and employment decisions (HR Tech)
Creditworthiness assessment (Fintech)
Certain insurance-related risk assessments (InsurTech)
Access to essential services and other regulated decision-making processes

Important context on deadlines: The original framework set 2 August 2026 as the key compliance date for most high-risk AI systems. However, in May 2026, EU co-legislators reached political agreement on the so-called AI Omnibus (Digital Omnibus package), which amended certain provisions and adjusted enforcement timelines. For some categories of high-risk systems, obligations now align with a later timeline, with 2 December 2027 referenced for specific implementation steps. Organizations should verify which deadline applies to their specific system category rather than assuming a single universal date.

What the law says about integrity: Article 12 mandates automatic logging, and Articles 19 and 26(6) add retention duties, but the Act does not prescribe specific technical formats or explicitly mandate cryptographic signatures. The regulatory requirement is evidence of what the system did and when. In practice, however, logs held in traditional observability tools (Splunk, Datadog, ELK) can typically be modified, deleted, or reordered by anyone with sufficient access, without leaving evidence in the log itself. For organizations that need to demonstrate integrity to an auditor or regulator, cryptographic tamper-evidence is a strong technical implementation. The Act never spells out "SHA-256"; the point is that a hash-chained, signed log gives a verifier an independent way to check that records have not been altered.

The Market Gap

Solution	Typical Cost (indicative)	Target Audience
OneTrust / ServiceNow GRC	$50,000+ annually*	Large enterprises
Big Four consulting firms	£1,400–£2,600 per day	Enterprise and government
Boutique compliance consultancies	€5,000–€15,000 per project	Mid-market

* Enterprise GRC suites; smaller-scope plans may start at lower tiers.

The European mid-market segment—companies with roughly 50–500 employees—often sits between enterprise-grade governance platforms and one-off consulting engagements.

Many of these organizations are already experimenting with AI-powered workflows but lack dedicated compliance teams or six-figure governance budgets. This creates a practical gap between regulatory requirements and affordable implementation.

How AgentTrail Works

AgentTrail is an open-source TypeScript SDK released under the MIT License.

It is built around three core primitives designed to satisfy the spirit of Article 12 through robust technical evidence:

1. SHA-256 Hash Chains

Each event incorporates the hash of the previous event, creating a tamper-evident chain of records.
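As a rough illustration of the idea, the sketch below builds and checks a hash chain with Node's built-in `node:crypto` module. The record shape (`ChainedEvent`), the `GENESIS` constant, and the function names are assumptions made for this example; they are not AgentTrail's actual schema or API.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape for illustration; not AgentTrail's real schema.
interface ChainedEvent {
  payload: string;  // serialized event body
  prevHash: string; // hash of the previous record, or GENESIS for the first
  hash: string;     // SHA-256 over prevHash + "\n" + payload
}

// Assumed starting value for the first record's prevHash.
const GENESIS = "0".repeat(64);

function sha256Hex(input: string): string {
  return createHash("sha256").update(input, "utf8").digest("hex");
}

// Returns a new chain with one more record linked to the current tail.
function appendEvent(chain: ChainedEvent[], payload: string): ChainedEvent[] {
  const prevHash = chain.length > 0 ? chain[chain.length - 1].hash : GENESIS;
  const hash = sha256Hex(prevHash + "\n" + payload);
  return [...chain, { payload, prevHash, hash }];
}

// Returns the index of the first record that fails verification,
// or -1 if every link and every hash checks out.
function firstBrokenIndex(chain: ChainedEvent[]): number {
  let expectedPrev = GENESIS;
  for (let i = 0; i < chain.length; i++) {
    const e = chain[i];
    const recomputed = sha256Hex(e.prevHash + "\n" + e.payload);
    if (e.prevHash !== expectedPrev || e.hash !== recomputed) {
      return i;
    }
    expectedPrev = e.hash;
  }
  return -1;
}
```

Editing, removing, or reordering a record in the middle breaks a link, so the verifier can point at the first inconsistent position. A bare chain like this one does not detect records dropped from the end, which is one reason to pair it with signatures or an externally recorded head hash.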

2. Ed25519 Digital Signatures

Every receipt can be cryptographically signed and independently verified using a public key.
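A minimal sketch of what signing and verification can look like, using the Ed25519 support in Node's `node:crypto` module. The helper names (`newSigningKeys`, `signReceipt`, `verifyReceipt`) and the base64 encoding of the signature are assumptions for illustration; AgentTrail's own signing API and receipt format may differ.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical helper: creates a fresh Ed25519 key pair.
function newSigningKeys(): { publicKey: KeyObject; privateKey: KeyObject } {
  return generateKeyPairSync("ed25519");
}

// Signs the serialized receipt and returns the signature as base64.
// Ed25519 hashes internally, so the algorithm argument is null.
function signReceipt(receipt: string, privateKey: KeyObject): string {
  return sign(null, Buffer.from(receipt, "utf8"), privateKey).toString("base64");
}

// Checks a base64 signature against the receipt using only the public key.
function verifyReceipt(
  receipt: string,
  signatureB64: string,
  publicKey: KeyObject
): boolean {
  return verify(
    null,
    Buffer.from(receipt, "utf8"),
    publicKey,
    Buffer.from(signatureB64, "base64")
  );
}
```

Because verification needs only the public key, an auditor can check receipts without ever holding the signing key. The signature covers the exact bytes passed in, so signer and verifier have to agree on how the receipt is serialized.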

3. Canonical JSON

Deterministic serialization ensures that the same event always produces the same hash, regardless of platform or environment.
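To show why this matters, here is a simplified canonicalizer that sorts object keys recursively and emits no whitespace before hashing. It is a sketch under stated assumptions: the `Json` type and the `canonicalize` and `hashEvent` names are invented for this example, and it is not presented as AgentTrail's serializer or as a complete implementation of a canonicalization standard such as RFC 8785.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serializes a JSON value with object keys in sorted order and no whitespace,
// so that key insertion order does not affect the output.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") {
    if (typeof value === "number" && !Number.isFinite(value)) {
      throw new Error("non-finite numbers cannot be represented in JSON");
    }
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    // Array order is meaningful, so it is preserved.
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  const keys = Object.keys(value).sort();
  const members = keys.map(
    (k) => JSON.stringify(k) + ":" + canonicalize(value[k] as Json)
  );
  return "{" + members.join(",") + "}";
}

// Hypothetical helper: SHA-256 hex digest of the canonical form.
function hashEvent(event: Json): string {
  return createHash("sha256").update(canonicalize(event), "utf8").digest("hex");
}
```

With this in place, `hashEvent({ b: 1, a: 2 })` and `hashEvent({ a: 2, b: 1 })` return the same digest, whereas hashing the output of a plain `JSON.stringify` call would give two different values because the key order differs.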

Privacy by Design

AgentTrail does not require centralized storage of audit data. Receipts remain within your infrastructure—whether stored in Amazon S3, a local filesystem, or another storage backend. Verification can be performed offline using the CLI:

npx @aivoralabs/agenttrail-cli audit-receipt verify audit-log.jsonl

Early Validation (Internal Signals)

Our initial outreach is still in a very early phase. These are internal metrics from our first conversations, not market validation:

Channel	Metric
Emails sent	23
Open rate	35%
LinkedIn connections	17
Landing page clicks	1

These numbers are small: a 35% open rate on 23 emails is roughly eight opens. At most, it hints that traceability and AI compliance are topics decision-makers are willing to engage with. Our immediate goal is to convert that interest into customer discovery interviews and concrete product feedback.

What's Next

The roadmap is straightforward and transparent:

Continue improving the open-source SDK
Validate compliance requirements with practitioners and auditors
Conduct customer discovery interviews
Launch AgentTrail Cloud as a managed offering (currently in development)

Our planned model is open-core:

AgentTrail SDK: Free and open source (MIT)
AgentTrail Cloud: Planned to start at $99 per agent per month (pricing and availability TBD; no payment system is active yet)

AI governance is becoming a business requirement. Organizations need auditability, but they should not need a six-figure budget to implement it technically.

GitHub: https://github.com/AIvoraLabs/AgentTrail

Landing Page: https://agenttrail.aivoralabs.org

Top comments (1)

Mehmet Can Farsak • Jun 13

Great write-up on AI agent audit trails — the gap between compliance requirements and practical tooling is real. I've noticed a related blind spot: agents don't distinguish between thinking time and action time, so audit logs end up mixing brainstorming with execution. Put together Brainstorm-Mode (mehmetcanfarsak/Brainstorm-Mode on GitHub) that uses PreToolUse hooks to keep agents in ideation mode when they should be, which also makes the audit trail much cleaner since you can see where the agent was supposed to be thinking vs. acting.