Mathew Pregasen

What is Governance for AI and AI Agents?

The term “AI Governance” has recently gained traction, and not without reason: enterprises rolling out AI systems into real-world applications need governance to keep those systems safe. Even so, there remains significant uncertainty around what governance actually entails when it comes to AI agents.

In this article, we’ll unpack what AI governance really means and why getting it right is such a complex problem. We’ll also take a look at how AI orchestration platforms like Credal can help teams simplify their governance approach.

What is AI Governance?

To start, let’s establish a clear definition of AI Governance. At its core, AI Governance refers to a collection of policies, processes, and controls to guide how AI systems (models, applications, and agents) should be built, rolled out, and operated in a safe and compliant manner. These frameworks exist so that AI is introduced and scaled responsibly, without security vulnerabilities, compliance violations, or reputational harm for the organization.

In theory, this is straightforward; in execution, it is not. AI Governance focuses on resolving a handful of sub-problems, most of which only surfaced with the rapid rise of AI agents. Because AI governance is still relatively new, existing governance frameworks (e.g. SOC 2) address it only partially, and largely just in relation to data.

The Sudden Popularity of AI Agents (and their risks)

Developers and users alike have embraced AI agents at a rapid pace. Their ability to operate autonomously makes them feel like a natural next step for AI, and their high level of customization makes them easy to tailor to different use cases. From a security and risk standpoint, however, these same qualities introduce a host of new problems.

What makes agents so risky? There are two categories with distinct consequences:

  • Data Risk. AI agents may expose sensitive information to employees who don’t actually have the appropriate authorization or permissions. For organizations subject to strict data custody compliance requirements, such incidents can lead to regulatory penalties or jeopardize customer contracts.
  • Mutation Risk. AI agents usually have write access to systems. This means an AI agent could incorrectly update an external system (e.g. send an email or an unauthorized Slack message, delete a ticket, or make a payment).

Addressing these risks requires a governance framework that clearly defines and enforces how access is provisioned across an agent ecosystem.
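To make these two risk categories concrete, here is a minimal sketch of how an enterprise might declare an agent's read scopes (its data risk surface) separately from its write scopes (its mutation risk surface). The class and connector names are purely illustrative assumptions, not any particular product's API.

```python
from dataclasses import dataclass, field

@dataclass
class AgentScope:
    """Hypothetical per-agent policy: what the agent may read vs. what it may write."""
    read_sources: set[str] = field(default_factory=set)   # data risk surface
    write_targets: set[str] = field(default_factory=set)  # mutation risk surface

    def can_read(self, source: str) -> bool:
        return source in self.read_sources

    def can_write(self, target: str) -> bool:
        return target in self.write_targets

# A support agent that may read the knowledge base but only write to Slack.
support_agent = AgentScope(
    read_sources={"confluence", "zendesk"},
    write_targets={"slack"},
)

assert support_agent.can_read("confluence")   # allowed read
assert not support_agent.can_write("jira")    # mutation risk stays bounded
```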

Vendors do not bear the risk. Enterprises do.

Responsibility for implementing these principles rests squarely with the customer, not the vendor. Vendors are generally unwilling to assume liability for mistakes made by their applications (and now, agents). Because agent behavior is non-deterministic and responses can vary dramatically depending on prompts, enterprises are left to take ownership of safeguards.

For instance, numerous vendors offer agents capable of sending emails or creating Jira tickets. However, none of them will pay your legal fees if their agent accidentally leaks sensitive data to a public Jira board or sends an email with customer PII to the wrong recipient.

Rather than relying on vendors, enterprises need to deploy the right tooling to manage agent risk. This is especially true for regulated companies that face significant penalties if sensitive data is exposed. This is why an entire market of third-party solutions, such as Credal, has emerged between vendors and enterprises.

The Three Tenets of Governance

This leads us back to the question of governance. Enterprises must safeguard themselves against agent errors, but how? It comes down to three core tenets:

  • Access. Agents must be granted permissions that do not bypass the controls applied to humans, other servers, or devices. For the most part, each agent should have a designated owner and inherit the same permissions as that owner (sometimes even less). This approach aligns with the principle of least privilege, ensuring the agent receives only the permissions it needs for its current task.
  • Auditing. Agent activity must be tracked so that any errors or breaches can be investigated and reproduced by developers. Unlike humans, where you might simply ask a colleague “Who deleted this table?”, agents require deterministic monitoring to maintain a useful history.
  • Human-in-the-Loop. For critical operations, a human should explicitly approve the agent’s access after reviewing a concise summary of the intended action. This approach reduces the risk of catastrophic mistakes (e.g. a full database drop). A minimal sketch of how these three tenets compose follows this list.
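As a rough illustration, and assuming a hypothetical in-house wrapper rather than any vendor's actual API, the sketch below shows the three tenets working together: the agent inherits its owner's permissions, every action lands in an audit log, and high-risk actions are blocked until a human approves them.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class Owner:
    name: str
    permissions: set[str]  # permissions already granted to the human owner

@dataclass
class GovernedAgent:
    owner: Owner
    audit_log: list[dict] = field(default_factory=list)

    def execute(self, action: str, high_risk: bool, approved_by: Optional[str] = None) -> None:
        # Access: the agent can never exceed its owner's permissions.
        if action not in self.owner.permissions:
            raise PermissionError(f"'{action}' exceeds the owner's permissions")

        # Human-in-the-loop: high-risk actions require an explicit approver.
        if high_risk and approved_by is None:
            raise RuntimeError(f"'{action}' is high risk and needs human approval")

        # Auditing: record who, what, and when for later investigation.
        self.audit_log.append({
            "action": action,
            "owner": self.owner.name,
            "approved_by": approved_by,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
        # ... call the external system here (email, Jira, CRM, etc.) ...

agent = GovernedAgent(Owner("alice", {"update_crm", "send_email"}))
agent.execute("update_crm", high_risk=False)                      # proceeds, logged
agent.execute("send_email", high_risk=True, approved_by="alice")  # gated, then logged
```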

Let’s focus specifically on the last tenet, since the risks and controls vary depending on the specific type of action being performed.

Categorizing Risk for Agents

How should we determine what actions an agent is allowed to execute?

First, keep in mind that every action carries a different degree of risk. Some barely matter, some might disrupt operations, and others can lead to significant financial, legal, or compliance problems.

We can classify three categories of actions:

  1. Read-only (Lowest Risk)
  2. Low risk write
  3. High risk write

Let’s examine how we should manage each of these categories.

Read-Only Actions

For read-only actions, the human owner should take responsibility. Through a governance framework, the owner must grant the agent access within their own scope of permissions.

Low Risk Write Actions

For low risk write actions, it’s usually fine for agents to proceed without waiting for human approval. Provided that permissions and auditing are properly set, requiring human approval for every action would be more obstructive than beneficial.

High Risk Write Actions

For high risk write actions, however, enterprises should consider mandating explicit human approval.

Determining Low Risk versus High Risk

It is up to each enterprise to define the boundary between low risk and high risk actions. For example, updating a Salesforce record could be considered low risk, while sending payments would be high risk. In high risk scenarios, the human owner providing approval should assume accountability. In low risk scenarios, responsibility rests with the agent developer.
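One way to codify that boundary is a simple action-to-risk-tier mapping that the agent runtime consults before executing anything. This is a hedged sketch under the assumption of an in-house policy module; the action names and the `requires_human_approval` helper are hypothetical.

```python
from enum import Enum

class Risk(Enum):
    READ_ONLY = "read_only"
    LOW_WRITE = "low_risk_write"
    HIGH_WRITE = "high_risk_write"

# Enterprise-defined policy: each organization draws this line for itself.
ACTION_RISK = {
    "search_knowledge_base":    Risk.READ_ONLY,
    "update_salesforce_record": Risk.LOW_WRITE,
    "send_payment":             Risk.HIGH_WRITE,
    "drop_database_table":      Risk.HIGH_WRITE,
}

def requires_human_approval(action: str) -> bool:
    # Unknown actions default to the most restrictive tier.
    return ACTION_RISK.get(action, Risk.HIGH_WRITE) is Risk.HIGH_WRITE

print(requires_human_approval("update_salesforce_record"))  # False: proceeds with auditing only
print(requires_human_approval("send_payment"))              # True: routed to a human approver
```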

Centralized agent governance becomes crucial in larger or regulated enterprises. Codifying practices, such as defining high and low risk actions, helps demonstrate defensibility to a regulator.

Setting up these categories helps enterprises create a defensible, structured framework for governing AI agents. By clearly distinguishing between read-only, low risk, and high risk writes, organizations can align oversight with risk, avoid unnecessary friction for users, and intervene with human judgment when it matters.

What is Credal?

Credal is an AI governance and orchestration platform with ready-to-use managed agents, complete with built-in auditing, human-in-the-loop, and permissions inheritance. Credal sets the environment and rules for agents without defining low risk versus high risk actions or dictating human-in-the-loop workflows. Those decisions are still made by the enterprise.

If you are interested in learning more about Credal, sign up for a demo today.
