Ramon Marrero

Stop Shipping Ungoverned AI: Add Policy Gates, Audit Trails, and Compliance to Every LLM Call

Your AI Solution Works. But Can You Prove What It Did?

You shipped the chatbot. The coding assistant is saving your team hours. The RAG workflow is answering questions from internal docs. Product is happy because the demo works.

Then the harder questions show up:

  • Can you show which policy checks ran before a model call?
  • Can you prove what happened for a specific runId last week?
  • Can you redact sensitive input before it reaches the provider?
  • Can you generate evidence for audits instead of screenshotting dashboards?

That is where most AI solutions fall apart.

A successful model response is not the same thing as governed AI.

The Real Problem in Production AI

Most teams still ship LLM features with a thin wrapper around the provider SDK:

  1. Accept a prompt.
  2. Send it to OpenAI, Anthropic, Gemini, or Bedrock.
  3. Return the response.
  4. Hope logs are enough later.
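In code, that thin wrapper is roughly the following sketch, with a stubbed `callProvider` standing in for the real provider SDK call (OpenAI, Anthropic, Gemini, Bedrock):

```typescript
// The ungoverned thin wrapper, end to end. callProvider is a stub here;
// in a real service it would be an actual provider SDK call.
async function callProvider(prompt: string): Promise<string> {
  return `model response for: ${prompt}`; // a real SDK call goes here
}

async function handlePrompt(prompt: string): Promise<string> {
  // 1. Accept a prompt. 2. Send it to the provider. 3. Return the response.
  const response = await callProvider(prompt);
  // 4. Hope logs are enough later.
  console.log("prompt:", prompt);
  return response;
}
```

Nothing in that path decides whether the call should happen, redacts the input, or records evidence you could verify later.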

That works until you need to answer operational or compliance questions:

  • What exactly was sent to the model?
  • Was sensitive data redacted first?
  • Which policy decided the call was allowed?
  • What other artifacts are linked to that run?
  • Can you replay or verify the evidence later?

You do not solve that with another console.log.

You solve it by putting governance in the execution path.

What the AI Governance SDK Actually Gives You

The AI Governance SDK exposes three complementary surfaces:

  • createArelis: the recommended unified SDK for governed model calls and agent runs.
  • ArelisPlatform: the platform API client for /api/v1/* resources like events, policies, proofs, replay, governance snapshots, jobs, and usage.
  • createArelisClient: the in-process runtime SDK for models, agents, MCP, knowledge, memory, quotas, approvals, secrets, compliance, and governance helpers.

If you want the fastest path to governed LLM calls, start with createArelis.

If you need centrally managed policy lifecycle, proof generation, replay, or governance snapshots, add ArelisPlatform.

The 5-Minute Path: Govern a Real Model Call

The public runtime guide positions governedInvoke() as the main entry point for governed model calls: PII scanning, policy evaluation, model invocation, risk scoring, and audit logging in one flow.

Install the SDK first:

# TypeScript
npm install @arelis-ai/ai-governance-sdk @google/genai

# Python
pip install ai-governance-sdk google-genai

In Python, the package installs as ai-governance-sdk and is imported as arelis.

TypeScript

import { createArelis } from "@arelis-ai/ai-governance-sdk";
import { GoogleGenAI } from "@google/genai";

const arelis = createArelis({
  platform: { apiKey: process.env.ARELIS_API_KEY },
});

const gemini = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY! });

async function main() {
  const result = await arelis.governedInvoke({
    runId: `run-${crypto.randomUUID()}`,
    model: "gemini-2.5-flash",
    prompt: "Explain the three most important AI governance controls for a support bot.",
    denyMode: "return",
    invoke: async (sanitizedPrompt) => {
      const completion = await gemini.models.generateContent({
        model: "gemini-2.5-flash",
        contents: sanitizedPrompt,
      });

      return completion.text ?? "";
    },
  });

  console.log("runId:", result.runId);
  console.log("invoked:", result.invoked);
  console.log("decision:", result.decision.decision);
  console.log("sanitizedPrompt:", result.sanitizedPrompt);
  console.log("result:", result.result);
}

main();

Python

import asyncio
import os
import uuid

from arelis import GovernedInvokeInput, create_arelis
from google import genai

arelis = create_arelis({
    "platform": {"apiKey": os.environ["ARELIS_API_KEY"]},
})

gemini = genai.Client(api_key=os.environ["GEMINI_API_KEY"])


def invoke_model(sanitized_prompt: str) -> str:
    completion = gemini.models.generate_content(
        model="gemini-2.5-flash",
        contents=sanitized_prompt,
    )
    return completion.text or ""


async def main() -> None:
    result = await arelis.governed_invoke(
        GovernedInvokeInput(
            run_id=f"run-{uuid.uuid4()}",
            model="gemini-2.5-flash",
            prompt="Explain the three most important AI governance controls for a support bot.",
            deny_mode="return",
            invoke=invoke_model,
        )
    )

    print("invoked:", result.invoked)
    print("decision:", result.decision.decision)
    print("result:", result.result)


asyncio.run(main())

Once you have centrally managed policies on the platform, add policyIds to the governed call to enforce those policies in the runtime path.

What Happens on That Call

The useful part is not just that you called the model. It is what the SDK does around it.

With governedInvoke():

  • Your original prompt is redacted into sanitizedPrompt before your callback runs.
  • The governance gate evaluates whether the call should proceed.
  • If the gate denies the request and denyMode is "return", the model is not called.
  • The run is linked by runId, which you can reuse across events, snapshots, proofs, replay, and investigations.
  • Risk scoring and platform-side audit reporting are part of the unified flow documented in the runtime guide.

That is the main shift: your provider SDK stays the same, but the boundary around it becomes governed.
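To make "redacted into sanitizedPrompt" concrete, here is a deliberately simplified sketch of prompt sanitization. The SDK's PII scanning is richer than this; the two regexes and the `sanitizePrompt` name are illustrative assumptions, not the SDK's implementation.

```typescript
// Illustrative PII redaction, not the SDK's actual scanner.
// Real scanners use far richer detection than these two patterns.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const SSN = /\b\d{3}-\d{2}-\d{4}\b/g;

function sanitizePrompt(prompt: string): string {
  // Replace each detected entity with a typed placeholder so the model
  // still sees the sentence structure, but never the raw value.
  return prompt.replace(EMAIL, "[EMAIL]").replace(SSN, "[SSN]");
}
```

Whatever the detection logic, the guarantee is the same: your `invoke` callback only ever receives the sanitized text.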

Managed Policies: Use the Platform for Central Control

The platform docs make an important distinction: centrally managed policies live on the platform surface.

That means policy lifecycle belongs in ArelisPlatform:

  • create policies
  • version them
  • simulate them
  • activate or roll them back
  • evaluate them against explicit checkpoints
  • query evaluation history later

Create a managed policy

This example matches the public governance docs:

import { ArelisPlatform } from "@arelis-ai/ai-governance-sdk";

const platform = new ArelisPlatform({
  baseUrl: "https://api.arelis.digital",
  apiKey: process.env.ARELIS_API_KEY!,
});

const policy = await platform.governance.policies.create({
  key: "require-eu-routing",
  name: "Require EU routing for PII",
  description: "Deny calls routed outside the EU when handling PII data.",
  condition: {
    operator: "and",
    conditions: [
      { field: "routing.region", operator: "neq", value: "eu-west-1" },
      { field: "content.classification", operator: "eq", value: "pii" },
    ],
  },
  action: "deny",
  severity: "high",
  priority: 10,
});

console.log(policy.id, policy.activeVersion);
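The nested condition object above is easier to reason about if you picture a small recursive evaluator over checkpoint fields. This sketch is not the platform's policy engine; it assumes only the `and`, `eq`, and `neq` operators that appear in the example.

```typescript
// Hypothetical evaluator for the condition shape shown above.
// The real platform supports more operators; this covers and/eq/neq only.
type Condition =
  | { operator: "and"; conditions: Condition[] }
  | { field: string; operator: "eq" | "neq"; value: unknown };

// Resolve a dotted path like "routing.region" inside a checkpoint object.
function resolve(checkpoint: Record<string, unknown>, path: string): unknown {
  return path
    .split(".")
    .reduce<unknown>(
      (node, key) => (node as Record<string, unknown> | undefined)?.[key],
      checkpoint,
    );
}

function matches(cond: Condition, checkpoint: Record<string, unknown>): boolean {
  if (cond.operator === "and") {
    return cond.conditions.every((c) => matches(c, checkpoint));
  }
  const actual = resolve(checkpoint, cond.field);
  return cond.operator === "eq" ? actual === cond.value : actual !== cond.value;
}
```

With the `require-eu-routing` policy above, a checkpoint routed to `eu-west-1` does not match the condition, so the `deny` action never fires; a PII call routed anywhere else does match, and the policy denies it.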

Evaluate a policy against a checkpoint

Also from the public docs:

const evaluated = await platform.governance.evaluatePolicy({
  runId: "run_checkout_001",
  checkpoint: {
    model: { provider: "openai", name: "gpt-4o" },
    routing: { region: "eu-west-1" },
  },
  policyIds: ["pol_critical_region", "pol_pii_guard"],
});

console.log(evaluated.decisions);

This is the right mental model:

  • governedInvoke() is the fastest path for governed model execution.
  • ArelisPlatform is the control plane for centrally managed policy definitions and audit history.

If you need custom checkpoint fields beyond the defaults your runtime path provides, evaluate them explicitly through the platform or wire them into your runtime governance layer.

Audit Trails That Are Actually Useful

The public runtime docs emphasize runtime-to-platform interoperability and recommend persisting runId as the primary join key across artifacts.

That is the difference between “we have logs” and “we can investigate an incident without guessing.”

With Arelis, the practical pattern is:

  1. Use runId for the governed call.
  2. Persist or sync related events under that same runId.
  3. Retrieve governance history and snapshots by runId.
  4. Generate proofs and replay artifacts for the same run later.
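The four steps above amount to indexing every artifact by runId. A minimal in-memory sketch of that join (in practice these artifacts live on the platform, not in process memory):

```typescript
// Hypothetical in-memory store illustrating runId as the join key.
type Artifact =
  | { kind: "event"; payload: string }
  | { kind: "snapshot"; payload: string }
  | { kind: "proof"; payload: string };

const byRun = new Map<string, Artifact[]>();

function record(runId: string, artifact: Artifact): void {
  const list = byRun.get(runId) ?? [];
  list.push(artifact);
  byRun.set(runId, list);
}

// An investigation starts from a single runId and fans out to every artifact.
function investigate(runId: string): Artifact[] {
  return byRun.get(runId) ?? [];
}
```

The moment two artifacts carry different run identifiers for the same logical call, this join breaks, which is why persisting the same runId everywhere is the whole pattern.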

On the platform side, the relevant APIs are already documented: events, governance snapshots, proofs, and replay all accept the same runId.

That gives you a real investigation workflow instead of a pile of provider logs.

Compliance Proofs Are a First-Class Operation

One of the strongest parts of the public surface is that compliance proofs are explicit, documented operations, not vague “trust us” marketing.

From the proofs docs:

const proof = await platform.proofs.create({
  runId: "run_checkout_001",
  schemaVersion: "v2",
  composed: {
    layers: ["event_integrity", "causal_consistency", "policy_compliance"],
  },
  async: true,
});

if ("proofId" in proof) {
  console.log("Proof ready:", proof.proofId);
} else {
  console.log("Queued proof job:", proof.jobId);
}

That matters because “audit trail” and “compliance artifact” are not the same thing.

An audit trail helps your engineers debug.

A proof is the thing you generate when you need verifiable evidence tied to a governed run.
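To see why a proof is stronger than a log, consider the idea behind an event-integrity layer: each event commits to the hash of its predecessor, so tampering anywhere in the sequence is detectable. This is a conceptual sketch of that idea only, not the platform's actual proof format or algorithm.

```typescript
import { createHash } from "node:crypto";

// Conceptual hash chain: each event commits to the previous event's hash,
// so editing or deleting any event breaks verification from that point on.
interface ChainedEvent {
  data: string;
  prevHash: string;
  hash: string;
}

function sha256(input: string): string {
  return createHash("sha256").update(input).digest("hex");
}

function append(chain: ChainedEvent[], data: string): ChainedEvent[] {
  const prevHash = chain.length > 0 ? chain[chain.length - 1].hash : "genesis";
  return [...chain, { data, prevHash, hash: sha256(prevHash + data) }];
}

function verify(chain: ChainedEvent[]): boolean {
  return chain.every((event, i) => {
    const expectedPrev = i === 0 ? "genesis" : chain[i - 1].hash;
    return (
      event.prevHash === expectedPrev &&
      event.hash === sha256(event.prevHash + event.data)
    );
  });
}
```

A plain log can be edited after the fact without anyone noticing; a construction like this is what makes evidence verifiable rather than merely recorded.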

This Is Not Just for Chatbots

The AI Governance SDK is broader than one-off prompt calls.

Agents

The runtime guide exposes agents.run() for governed agent loops:

const result = await arelis.agents.run({
  runId: `run-agent-${crypto.randomUUID()}`,
  model: "gemini-2.5-flash",
  prompt: "Investigate this billing anomaly and produce a short incident summary.",
  tools: [
    { name: "fetchUsage", description: "Get recent usage records", schema: {} },
    { name: "fetchPolicies", description: "Get active policy metadata", schema: {} },
  ],
  maxSteps: 6,
  invokeModel: async ({ messages, tools }) => {
    // Call your model here
    return { text: "..." };
  },
  executeToolCall: async ({ tool }) => {
    // Execute tool here
    return { ok: true };
  },
  mapOutput: ({ finalResponse }) => finalResponse.text ?? "",
});

The public runtime docs describe this path as governed agent execution with event, graph, and proof enrichment. That is a much stronger story than bolting logging onto a hand-rolled agent loop after the fact.

Why This Matters for Regulation

This is not only about engineering hygiene.

The EU AI Act entered into force on August 1, 2024. Key obligations for general-purpose AI models start applying on August 2, 2025. Major obligations for high-risk AI systems apply on August 2, 2026.

If your system touches hiring, finance, healthcare, education, critical infrastructure, or other regulated workflows, the questions get sharper:

  • Can you show what happened for a specific run?
  • Can you demonstrate what safeguards were in place?
  • Can you trace policy decisions over time?
  • Can you produce evidence without rebuilding the timeline by hand?

You do not want to start inventing governance architecture after those questions are already live.

What Makes This Worth Using

The strongest case for Arelis is not “it has a lot of features.”

It is this:

  • You keep your provider SDK.
  • You wrap the execution boundary instead of rewriting your app.
  • You get a documented split between runtime enforcement and platform control plane.
  • You can move from governed calls to policy lifecycle, proofs, replay, and snapshots without changing vendors halfway through.

That is a much better story than “we have prompt logs somewhere.”

Where to Start

If you want the shortest path to value:

  1. Read the runtime governance guide.
  2. Install the SDK from the SDK docs.
  3. Wrap one real model call with createArelis() and governedInvoke().
  4. Reuse the runId everywhere.
  5. Add platform-managed policies and proofs once the first governed path is working.


If your AI feature is already in production, the real question is no longer whether the model works.

It is whether the call is governed.
