You are about to deploy an AI chatbot. Your backlog has 40 items on it: better prompts, streaming responses, multi-language support, analytics dashboard, Slack integration. Governance is item 38, somewhere between "dark mode" and "maybe add a mascot."
This is the wrong order. And the consequences are not theoretical.
The feature trap
In March 2023, Samsung engineers pasted proprietary source code into ChatGPT on three separate occasions. The company banned the tool within weeks, but the data had already been sent to OpenAI's servers. In 2022, Air Canada's chatbot invented a bereavement discount policy that did not exist, and in early 2024 a tribunal ruled the airline had to honour it. In 2023, a New York law firm submitted a court filing with six fabricated case citations generated by ChatGPT. The lawyers were sanctioned.
These are not edge cases from careless teams. These are what happens when capable AI is deployed without governance. The chatbot works. It sounds confident. It handles 80% of queries well. And then it stores a customer's ID number in plaintext, invents a policy, or leaks internal data — and the feature backlog becomes irrelevant because you are now managing a crisis.
Every team that has shipped an ungoverned AI chatbot believed they would "add governance later." Later does not arrive until after the incident.
What AI governance actually means
Governance is not a committee. It is not a PDF that legal signs off on. It is a middleware layer — running code that sits between the user and the LLM, inspecting every message in both directions.
Every input is scanned before the model sees it. Every output is checked before the customer sees it. Every interaction generates a cryptographic receipt that proves what happened.
Think of governance as a seatbelt, not a speed limit. A speed limit slows you down. A seatbelt lets you drive at full speed with a plan for when things go wrong. Governed AI is not slower AI. It is AI that you can defend, audit, and trust.
The 3 non-negotiable capabilities
1. PII detection
A customer types: "My ID number is 9502015800086 and my card is 4532-XXXX-XXXX-1234."
An ungoverned chatbot stores this in the conversation log, sends it to the LLM provider's API, and maybe writes it to an analytics database. That is three copies of sensitive data created in under a second.
A governed chatbot detects the PII before it leaves your infrastructure. The ID number is redacted. The card number is masked. The LLM receives the sanitised version. The original is never stored.
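In practice, the detection step can be as small as a scan-and-substitute pass that runs before anything is stored or sent upstream. Here is a minimal Python sketch; the patterns are illustrative only, and a real deployment would use a dedicated PII detection library rather than hand-rolled regexes:

```python
import re

# Illustrative patterns only; a production system would use a proper
# PII detection library, not hand-rolled regexes.
PATTERNS = {
    "za_id": re.compile(r"\b\d{13}\b"),                  # 13-digit South African ID
    "card":  re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),  # 16-digit card number
}

def redact(text: str) -> tuple[str, list[str]]:
    """Replace detected PII with placeholders before the text leaves
    your infrastructure; return the sanitised text and what was found."""
    found = []
    for label, pattern in PATTERNS.items():
        if pattern.search(text):
            found.append(label)
            text = pattern.sub(f"[REDACTED:{label}]", text)
    return text, found

sanitised, found = redact(
    "My ID number is 9502015800086 and my card is 4532 8976 1234 5678."
)
# Only the sanitised version is stored or forwarded; the original is dropped.
```

The key design point is where this runs: in the request path, before logging, before the LLM API call, before analytics. Detection after storage is already a breach.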
This is not optional. South Africa's POPIA requires that personal information be processed only for the purpose it was collected. Europe's GDPR requires data minimisation: you cannot store what you do not need. California's CCPA gives consumers the right to know what data you have collected. If your AI chatbot is hoovering up PII without detection, you are violating all of them simultaneously.
Real-time PII detection is the baseline. Everything else is built on top of it.
2. Audit receipts
When a regulator, a customer, or a lawyer asks "what did your AI say to this person on Tuesday at 14:32?", you need an answer. Not a log file. Not "we think it said something like this." A tamper-proof receipt.
An audit receipt is a record of every interaction: what the user sent, what the AI received (after redaction), what the AI responded, and what governance actions were taken. Each receipt has a unique ID, a timestamp, and is stored independently of the conversation log.
This is the difference between "we have logs" and "we have evidence." Logs can be edited. Receipts are cryptographically verifiable.
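One common way to make receipts verifiable is hash chaining: each receipt carries a hash of its own contents plus the previous receipt's hash, so editing any historical record breaks the chain from that point on. A minimal sketch, with illustrative field names:

```python
import hashlib
import json
import time
import uuid

def make_receipt(prev_hash: str, user_input: str, sanitised: str,
                 response: str, actions: list[str]) -> dict:
    """Build a tamper-evident receipt linked to the previous one."""
    body = {
        "id": str(uuid.uuid4()),
        "ts": time.time(),
        "user_input": user_input,   # what the user sent
        "model_input": sanitised,   # what the LLM actually received
        "response": response,
        "actions": actions,         # e.g. ["pii_redacted"]
        "prev": prev_hash,
    }
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}

def verify_chain(receipts: list[dict]) -> bool:
    """Recompute every hash and check each link to the previous receipt."""
    for r in receipts:
        body = {k: v for k, v in r.items() if k != "hash"}
        expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if expected != r["hash"]:
            return False
    return all(receipts[i + 1]["prev"] == receipts[i]["hash"]
               for i in range(len(receipts) - 1))
```

Stored independently of the conversation log, a chain like this is what turns "we have logs" into "we have evidence": a single edited record fails verification.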
The Air Canada case would have gone differently if the company could have produced an audit trail showing that the chatbot's bereavement policy response was a hallucination that governance should have caught. Instead, they had no trail at all, and the tribunal treated the chatbot's statement as if it were company policy.
3. Policy enforcement
Your AI should have boundaries that are defined in code, not in hope.
Policy enforcement means defining what your AI can and cannot do: which topics it can discuss, which claims it can make, when it must escalate to a human, and what happens when it does not know the answer. These rules are evaluated on every message, not as a system prompt suggestion that the model might ignore.
A system prompt that says "do not discuss competitors" is a request. A governance policy that scans the output and blocks competitor mentions is enforcement. The difference matters when your chatbot decides to be helpful and recommends a rival's product.
Escalation triggers are part of policy enforcement. When a customer says "I want to speak to a manager" or "this is completely unacceptable," the governed response is not another AI-generated apology. It is a structured handoff to a human, with the conversation context attached. The ungoverned response is three more paragraphs of synthetic empathy that makes the customer angrier.
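Both rules can be expressed as plain code that runs on every message rather than as a prompt the model might ignore. A minimal sketch, with hypothetical blocked terms and trigger phrases:

```python
from typing import Optional

# Hypothetical rules; real deployments would load these per tenant.
BLOCKED_TERMS = {"rivalco"}  # e.g. competitor names the AI must not mention
ESCALATION_PHRASES = ("speak to a manager", "completely unacceptable")

def check_output(response: str) -> Optional[str]:
    """Enforce output policy in code: return None if the response may be
    sent, or a reason string if it must be blocked."""
    lowered = response.lower()
    for term in BLOCKED_TERMS:
        if term in lowered:
            return f"blocked: mentions '{term}'"
    return None

def needs_escalation(user_message: str) -> bool:
    """Trigger a structured human handoff instead of another AI reply."""
    lowered = user_message.lower()
    return any(phrase in lowered for phrase in ESCALATION_PHRASES)
```

Substring matching is deliberately crude here; the point is the placement. These checks run in the request path, so a violating response is blocked before the customer sees it, not flagged in a report next week.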
The cost of retrofitting
Adding governance after launch is not twice as hard. It is an order of magnitude harder.
Your database already has six months of unscanned conversations. Some contain PII. You do not know which ones. A full scan of historical data is a project in itself — and it reveals problems you now have to report under data protection law.
Your audit trail has gaps. For every conversation that happened before governance was added, you cannot prove what the AI said. If a customer dispute arises from that period, your legal position is "we don't know."
Your architecture was not designed for middleware. Adding an inspection layer between the user and the LLM means reworking your request pipeline, your streaming implementation, your error handling, and your tests. Features that were built assuming direct LLM access now need to account for governance latency, redaction, and denial responses.
And your customers have already formed trust expectations. If your chatbot has been freely accepting ID numbers for six months and suddenly starts redacting them, customers notice. The transition itself becomes a support issue.
Build governance into the foundation. Then build features on top of a system you can trust.
How we did it at Tork
At Tork, governance is not a wrapper around the AI. It is a node in the state machine.
Every message follows the same path: tenant resolution, then governance input scan, then intent classification, then agent routing, then response generation, then governance output scan, then storage. The LLM never sees raw PII. The response never reaches the customer without an output scan. Every step produces a receipt.
User message
→ Tenant resolution
→ Governance input scan (PII detect, policy check, receipt generated)
→ Intent classification
→ Specialist agent (fleet, policy, booking, etc.)
→ Response generation
→ Governance output scan (data leak check, policy check, receipt generated)
→ Customer receives response
If governance denies the input, the graph short-circuits. No LLM call, no agent routing, no response generation. The denial is recorded with a receipt, and the customer gets a safe fallback message. The system does less work, not more.
If governance is temporarily unreachable, the system degrades gracefully — messages are allowed through with a logged warning, and the governance scan is retried asynchronously. No single point of failure blocks the customer experience.
This architecture means governance cannot be bypassed by a new feature, a new endpoint, or a developer who forgets to add the middleware. It is structural, not procedural.
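The flow above can be sketched as a single request handler in which the governance scans are ordinary function calls. Everything below is a stand-in (the stubs exist only to show the control flow), but the short-circuit on denial and the fail-open degradation match the behaviour described:

```python
# Stand-ins for the real services: scan_input, scan_output, and call_llm
# are stubs that only illustrate the control flow.

SAFE_FALLBACK = "Sorry, I can't help with that. A human agent will follow up."

class GovernanceUnavailable(Exception):
    pass

def scan_input(msg: str, tenant: str) -> dict:
    # stub: deny messages containing a blocked phrase
    return {"allowed": "forbidden" not in msg.lower(), "sanitised": msg}

def scan_output(draft: str, tenant: str):
    return None  # stub: None means no policy violation found

def call_llm(sanitised: str) -> str:
    return f"echo: {sanitised}"  # stub for agent routing + generation

receipts = []

def record_receipt(tenant: str, action: str) -> None:
    receipts.append({"tenant": tenant, "action": action})

def handle_message(msg: str, tenant: str) -> str:
    try:
        verdict = scan_input(msg, tenant)        # governance input scan
    except GovernanceUnavailable:
        # fail open: allow with a logged warning, retry the scan async
        verdict = {"allowed": True, "sanitised": msg}

    if not verdict["allowed"]:
        record_receipt(tenant, action="denied")  # denial still gets a receipt
        return SAFE_FALLBACK                     # short-circuit: no LLM call

    draft = call_llm(verdict["sanitised"])       # routing + generation
    if scan_output(draft, tenant) is not None:   # output scan found a violation
        record_receipt(tenant, action="output_blocked")
        return SAFE_FALLBACK
    record_receipt(tenant, action="delivered")
    return draft
```

Because every message enters through this one path, a new feature or endpoint cannot skip the scans; it either goes through `handle_message` or it does not ship.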
The practical checklist
Before you deploy any AI system that talks to customers, answer these seven questions:
Can it detect PII in real-time?
Not in a batch job overnight. In the request path, before the data is stored or sent to a third-party API.
Does it generate audit receipts?
Not log lines. Structured, immutable records with unique IDs that can be retrieved by conversation, by user, or by time range.
Can you prove what it said to a regulator?
If a data protection authority asks for the full interaction history for a specific customer, can you produce it? GDPR gives you one month to answer a subject access request, and only 72 hours to notify the authority of a breach. Neither deadline is survivable if your only record is an unstructured log.
Does it have escalation rules?
When the AI is out of its depth or the customer is frustrated, does it hand off to a human? Or does it keep generating responses and hoping for the best?
Is there a human override?
Can a supervisor intervene in a live conversation? Can you disable the AI for a specific tenant, a specific topic, or globally — without a deployment?
Is it compliant with local data protection law?
POPIA in South Africa. GDPR in Europe. CCPA in California. LGPD in Brazil. The law your AI needs to comply with depends on where your customers are, not where your servers are.
Can you turn it off in 5 seconds?
Not "start a deployment." Not "merge a PR." A kill switch. If the AI starts generating harmful content at scale, how fast can you stop it?
If you answered "no" to more than two of these, you are not ready to deploy. Fix governance first. The features can wait.
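The kill-switch question from the checklist is worth making concrete: it is just a flag in shared storage, checked on every request and flippable without a deployment. A minimal sketch; in production the dict would be a feature-flag service or Redis, and the names here are hypothetical:

```python
# Hypothetical kill switch: a flag consulted on every request.
# Flipping it stops the AI instantly, globally or per tenant, no deploy.
KILL_SWITCHES = {"global": False}  # in production: Redis / feature-flag service

def ai_enabled(tenant: str) -> bool:
    """Check the global switch first, then any per-tenant override."""
    if KILL_SWITCHES.get("global"):
        return False
    return not KILL_SWITCHES.get(tenant, False)

def set_kill_switch(scope: str, on: bool) -> None:
    """scope is 'global' or a tenant ID."""
    KILL_SWITCHES[scope] = on
```

The five-second target is the real requirement: the check must sit in the request path, and the flag must live somewhere an operator can flip it directly.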
Start governed
We built Tork Chat because we needed a customer assistant that we could defend to a regulator, explain to a compliance officer, and trust with real customer data. The governance layer is not a premium add-on. It is the first thing that runs.
If you are building AI for customer-facing use cases, start with governance. Not because it is the responsible thing to do — though it is — but because retrofitting it later will cost you more in engineering time, legal exposure, and customer trust than building it right from the start.
Try Tork's governance-first approach at tork.network/chat. Free to start.
If you want the deeper thesis on why governed AI agents are the next frontier, and why most current deployments are dangerously ungoverned, we wrote a book about it. The Agent Crisis is available free at tork.network.
Built by the Tork team. Governance-first AI for customer-facing deployments. tork.network