Governing AI on Every Company Computer: A Reference Architecture

#aioverview #security #architecture #enterprise

To govern AI effectively, organizations need a reference architecture that extends from a central gateway to every endpoint. This post outlines a three-plane model, comprising the control plane (gateway), the enforcement plane (endpoint agent), and the management plane (MDM), that closes the governance gaps left by shadow AI.

The adoption of generative AI has created a significant governance gap in most organizations. While security and platform teams focus on controlling AI through centrally managed gateways, employees are using unapproved desktop apps, browser-based tools, and coding agents on their company computers. This "shadow AI" operates outside of established controls, creating unmonitored pathways for data exfiltration, compliance violations, and other security risks.

An effective AI governance strategy cannot stop at the network edge; it must extend to the endpoint where AI tools are actually used. A policy document is not a technical control. Real governance requires an architecture that ensures the rules defined centrally are enforced on every device. This article presents a vendor-agnostic reference architecture for achieving this, composed of three distinct but interconnected planes: a control plane, an enforcement plane, and a management plane.

The Challenge: Shadow AI and the Endpoint Blind Spot

Shadow AI refers to the use of AI tools within an organization without IT approval or oversight. It's a modern variant of shadow IT, but with higher stakes because AI services can process and retain sensitive data. When an employee pastes proprietary code or confidential customer data into a public AI chatbot, that information may be retained by the provider and, depending on the service's terms, used for model training, which effectively leaks it.

The core problem is that a gateway-only approach to AI governance is incomplete. An AI gateway can only enforce policies on traffic that is explicitly configured to pass through it. The tools employees download and use on their own—from the ChatGPT desktop app to coding agents like Claude Code—bypass these gateways by default, leaving a massive blind spot. This is where a comprehensive architecture becomes necessary.

A Three-Plane Reference Architecture for Endpoint Governance

A robust solution for governing AI across a fleet of computers relies on the clear separation of duties across three architectural layers. This model draws inspiration from established frameworks like the NIST AI Risk Management Framework and Zero Trust principles, applying them to the specific problem of endpoint AI control.

The Control Plane (AI Gateway): Where policy is defined.
The Enforcement Plane (Endpoint Agent): Where policy is applied.
The Management Plane (MDM): How the enforcement agent is deployed and managed.

This separation ensures that policy decisions are centralized and consistent, while enforcement is distributed and universally applied.

Plane 1: The Control Plane (The AI Gateway)

The control plane is the centralized policy engine for all AI traffic. This is typically an AI gateway, a specialized proxy that sits between AI applications and the various model providers (like OpenAI, Anthropic, or Google).

The gateway serves as the single point for defining and managing the rules of AI engagement for the entire organization. Its key responsibilities in this architecture include:

Unified Policy Management: Central definition of all AI policies, such as which models are approved, which users or teams have access, and what data is allowed.
Virtual Keys and Budgets: Instead of scattering provider API keys across applications, the gateway uses virtual keys to abstract them. These keys are tied to specific projects, teams, or users and can have granular budgets and rate limits attached.
Guardrails: The gateway inspects prompts and responses in real-time to block sensitive data (like PII or API keys) from leaving the organization and to prevent harmful or non-compliant content from being returned by the model.
Audit Logging: It creates an immutable, centralized log of all AI interactions, which is essential for security audits and for compliance with frameworks and regulations such as SOC 2, HIPAA, or GDPR.
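
To make the virtual key idea concrete, here is a minimal sketch in Python. It is not any particular gateway's API: the `VirtualKey` class, its field names, and the per-key dollar budget are assumptions chosen for illustration. The point it shows is that applications hold only a virtual key, while the gateway maps it to a provider secret and refuses requests once the attached budget is spent.

```python
from dataclasses import dataclass


@dataclass
class VirtualKey:
    """Hypothetical virtual key: ties a team to a provider secret and a spend cap."""
    key_id: str
    team: str
    provider_key_ref: str  # a reference to the real secret, never the secret itself
    budget_usd: float
    spent_usd: float = 0.0

    def can_spend(self, cost_usd: float) -> bool:
        # A request is allowed only if it keeps total spend within the budget.
        return self.spent_usd + cost_usd <= self.budget_usd

    def charge(self, cost_usd: float) -> None:
        # Record spend, or refuse the request if it would exceed the budget.
        if not self.can_spend(cost_usd):
            raise PermissionError(f"budget exceeded for {self.key_id}")
        self.spent_usd += cost_usd
```

Because the provider secret lives only behind `provider_key_ref`, rotating it or revoking a team's access would not require touching any application in this design.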

Platforms like Bifrost, which combine a gateway with an endpoint component, exemplify this model where the gateway acts as the "brain" of the operation.
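
The guardrail responsibility listed above can be sketched as a simple redaction pass over outgoing prompts. This is an illustration under stated assumptions, not how Bifrost or any other product implements guardrails: the two regular expressions (an email address and an `sk-`-prefixed token) are deliberately simplistic stand-ins, and the `redact` function name is hypothetical. Production guardrails typically use far broader detectors.

```python
import re

# Illustrative patterns only; real detectors cover many more data types.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "api_key": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),
}


def redact(prompt: str) -> tuple[str, list[str]]:
    """Replace sensitive matches with placeholders and report what was found."""
    findings = []
    for label, pattern in PATTERNS.items():
        if pattern.search(prompt):
            findings.append(label)
            prompt = pattern.sub(f"[REDACTED:{label}]", prompt)
    return prompt, findings
```

A gateway could either forward the redacted prompt or reject the request outright when `findings` is non-empty; which one applies is a policy choice.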

Plane 2: The Enforcement Plane (The Endpoint Agent)

The enforcement plane's job is to ensure the policies defined in the control plane are applied to every AI tool running on a user's machine. This is accomplished by a lightweight, always-on agent that is installed on each company computer.

This agent is not a second policy engine. Its sole purpose is to intercept AI-related traffic on the device and route it through the central AI gateway (the control plane). This transparent routing is the mechanism that closes the shadow AI gap.

Key responsibilities of the enforcement agent include:

Application and Tool Discovery: The agent identifies all known AI desktop applications, browser-based AI tools, and CLI coding agents running on the machine. This provides administrators with a real-time inventory of AI usage across the fleet.
Transparent Traffic Routing: It automatically and transparently routes traffic from these applications to the organization's AI gateway. The user does not need to configure anything; the tools they already use are simply brought under governance.
Policy Enforcement: By forcing traffic through the gateway, the agent ensures all policies—budgets, guardrails, audit requirements—are applied, even to previously ungoverned tools.
Block/Allow Capabilities: Based on policy from the control plane, the agent can either route an application's traffic or block it entirely, preventing the use of unapproved AI tools on company devices.
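
The route-or-block decision described above can be sketched as a lookup against a policy table that the agent receives from the control plane. Everything here is an assumption made for illustration: the `Action` enum, the `decide` function, the hostname-based matching, and the `unapproved-ai.example` entry are hypothetical, and a real agent would intercept traffic at the operating-system level rather than just classify hostnames. Note that the agent only applies the table; it does not author it, which keeps it from becoming a second policy engine.

```python
from enum import Enum


class Action(Enum):
    ROUTE = "route"  # send the request through the AI gateway
    BLOCK = "block"  # refuse the connection entirely
    PASS = "pass"    # not recognized as AI traffic; leave untouched


# Hypothetical policy as it might be delivered by the control plane.
POLICY = {
    "api.openai.com": Action.ROUTE,
    "api.anthropic.com": Action.ROUTE,
    "unapproved-ai.example": Action.BLOCK,
}


def decide(host: str, policy: dict[str, Action] = POLICY) -> Action:
    """Return the action for a destination host, matching parent domains too."""
    parts = host.lower().rstrip(".").split(".")
    # Try the full host first, then each parent domain in turn.
    for i in range(len(parts)):
        candidate = ".".join(parts[i:])
        if candidate in policy:
            return policy[candidate]
    return Action.PASS
```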

This architecture ensures that governance follows the user and the application, regardless of how or where the AI tool is run.

Plane 3: The Management Plane (MDM Platform)

The management plane is the mechanism for deploying, configuring, and maintaining the enforcement agent across the entire fleet of company devices. This is the role of Mobile Device Management (MDM) or Unified Endpoint Management (UEM) platforms like Jamf, Microsoft Intune, or Kandji.

MDM platforms are critical for operationalizing endpoint governance at scale. Their responsibilities include:

Silent, Fleet-Wide Deployment: Pushing the enforcement agent to all managed devices without requiring any user interaction.
Managed Configuration: Securely delivering the initial configuration to the agent, such as the address of the organization's AI gateway. This avoids hardcoding sensitive information and allows for centralized updates.
Health and Status Monitoring: Reporting on the installation status and health of the enforcement agent across all devices.
Lifecycle Management: Handling updates and uninstallation of the agent as needed.
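
As a rough sketch of the managed configuration step, the agent might read a small JSON payload delivered by the MDM platform and validate it before use. The payload format, the `gateway_url` and `org_id` field names, and the `load_managed_config` function are assumptions for illustration; actual MDM platforms deliver managed settings through their own mechanisms and formats.

```python
import json
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical settings an MDM platform might push to the agent."""
    gateway_url: str
    org_id: str


def load_managed_config(raw: str) -> AgentConfig:
    """Parse and validate an MDM-delivered JSON payload."""
    data = json.loads(raw)
    url = urlparse(data["gateway_url"])
    # Refuse to send AI traffic to a gateway over an unencrypted channel.
    if url.scheme != "https" or not url.hostname:
        raise ValueError("gateway_url must be an https URL")
    return AgentConfig(gateway_url=data["gateway_url"], org_id=data["org_id"])
```

Keeping the gateway address in managed configuration rather than in the agent binary means it can be changed fleet-wide without shipping a new build.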

Using an MDM platform turns the endpoint agent from a tool that must be manually installed into a standard, centrally enforced component of the corporate software stack that users cannot simply opt out of.

Putting It All Together: The Governed AI Request Flow

When these three planes work in concert, the flow for governing a request from an employee's laptop looks like this:

Deployment: The MDM platform (Management Plane) deploys the endpoint agent (Enforcement Plane) to a new company laptop with a managed configuration pointing it to the corporate AI gateway (Control Plane).
User Action: An employee opens an AI tool on their laptop—for example, the Claude Desktop app.
Interception & Routing: The endpoint agent detects the network request from the app, intercepts it, and securely routes it to the AI gateway.
Policy Adjudication: The AI gateway receives the request. It checks the user's identity, attaches the correct virtual key, verifies the request against assigned budgets and guardrails, and logs the entire transaction.
Execution: If the request is compliant, the gateway forwards it to the upstream LLM provider. The response is routed back through the gateway, inspected by guardrails again, and then sent to the user's application.
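
The policy adjudication step in this flow can be sketched as a single function that runs its checks in order and logs every outcome. This is a simplified illustration, not a real gateway: the `Gateway` class, the in-memory dictionaries, the substring-based guardrail, and the decision strings are all assumptions, and the upstream provider call is reduced to a `"forward"` result.

```python
from dataclasses import dataclass, field


@dataclass
class Gateway:
    """Toy gateway: identity check, budget check, guardrail, audit log."""
    keys: dict[str, str]       # user -> virtual key id
    budgets: dict[str, float]  # virtual key id -> remaining budget in USD
    banned_terms: tuple[str, ...] = ("CONFIDENTIAL",)
    audit_log: list[dict] = field(default_factory=list)

    def handle(self, user: str, prompt: str, est_cost: float) -> str:
        decision = self._adjudicate(user, prompt, est_cost)
        # Every request is logged, whether it was forwarded or denied.
        self.audit_log.append({"user": user, "decision": decision})
        return decision

    def _adjudicate(self, user: str, prompt: str, est_cost: float) -> str:
        key = self.keys.get(user)
        if key is None:
            return "deny:unknown_user"
        if self.budgets.get(key, 0.0) < est_cost:
            return "deny:budget"
        if any(term in prompt for term in self.banned_terms):
            return "deny:guardrail"
        # Only compliant requests consume budget and go upstream.
        self.budgets[key] -= est_cost
        return "forward"
```

For example, with `Gateway(keys={"alice": "vk-1"}, budgets={"vk-1": 1.0})`, a request from an unknown user is denied before any budget or guardrail check runs, and the denial still appears in `audit_log`.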

This architecture creates a system where policy is managed centrally but enforced everywhere. It makes AI usage safer and more visible, reducing the risks of shadow AI without having to block tools that make employees more productive.