Anna Jambhulkar

I Built an AI Governance Runtime Layer for Production AI Apps

Most AI apps today follow a very simple pattern:

User → App → LLM → Response

That pattern works well for demos.

It works for prototypes.
It works for simple assistants.
It works when the workflow is clean and the risk is low.

But once AI starts moving into real products, the problem changes.

The question is no longer only:

Can the model generate a good answer?

The real production questions become:

What was the AI allowed to do?
What context did it use?
What memory was active?
Which policy applied?
Why did it respond this way?
Can this interaction be reviewed later?

That is the problem I am trying to solve with NEES Core Engine.


What is NEES Core Engine?

NEES Core Engine is a governed AI runtime layer for production AI applications.

It sits between your application and the model provider.

User
  ↓
Application
  ↓
NEES Core Engine
  ↓
Governance Runtime
  ↓
Model Provider
  ↓
Governed Response

The goal is not to build another chatbot.

The goal is to give AI applications a runtime control layer for:

  • policy awareness
  • identity consistency
  • memory boundaries
  • runtime modes
  • traceability
  • explainability metadata
  • safer production behavior

In simple terms:

NEES helps AI apps become more controlled, traceable, and reviewable before the response reaches the user.


Why prompts are not enough

A prompt can guide behavior.

But a prompt is not governance.

A prompt cannot reliably answer:

  • Which policy was active?
  • What memory scope was allowed?
  • What should happen if two instructions conflict?
  • When should the AI escalate?
  • What response path was used?
  • How do we debug this response later?

Most production AI problems do not happen because the model is fundamentally incapable.

They happen because the system around the model is weak.

The workflow is unclear.
The context is messy.
The memory boundary is undefined.
The role is inconsistent.
The decision path is not visible.

So the model is forced to guess.

That is where governance becomes necessary.


What NEES adds to the AI stack

A direct model call usually gives you:

Prompt → Model → Text Response

A governed NEES call gives you:

Request
  ↓
Runtime governance
  ↓
Model response
  ↓
Governance metadata
  ↓
Traceable output

That means the response is not only text.

It can also carry metadata such as:

{
  "reply": "Governed assistant response...",
  "trace_id": "trace_xxxxx",
  "engine_source": "core_engine",
  "governance": {
    "status": "allowed",
    "mode_used": "supportive",
    "policy_applied": true,
    "memory_scope": "session"
  }
}

The exact response fields may evolve during the developer preview, but the principle is the same:

Every AI response should be easier to understand, debug, and review.
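As a sketch of what that looks like on the client side, here is how an app could turn the metadata into a one-line log entry. The field names follow the sample payload above and may change during the preview:

```python
def summarize_governance(payload: dict) -> str:
    """Build a one-line log summary from a NEES-style response payload.

    Field names follow the sample above and may evolve during the preview.
    """
    gov = payload.get("governance", {})
    return (
        f"trace={payload.get('trace_id', 'n/a')} "
        f"status={gov.get('status', 'unknown')} "
        f"mode={gov.get('mode_used', 'unknown')} "
        f"memory={gov.get('memory_scope', 'none')}"
    )

sample = {
    "reply": "Governed assistant response...",
    "trace_id": "trace_xxxxx",
    "engine_source": "core_engine",
    "governance": {
        "status": "allowed",
        "mode_used": "supportive",
        "policy_applied": True,
        "memory_scope": "session",
    },
}

print(summarize_governance(sample))
# trace=trace_xxxxx status=allowed mode=supportive memory=session
```

The point is that the response carries enough structure to log, not just a string to display.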


A simple example

Here is a basic Python request:

import requests

# Send one governed chat request through the NEES runtime.
response = requests.post(
    "https://api.nees.cloud/chat",
    headers={
        "Authorization": "Bearer YOUR_NEES_API_KEY"
    },
    json={
        "message": "Explain why AI apps need runtime governance in simple terms.",
        "mode": "supportive",
        "session_id": "demo-session"
    },
    timeout=45
)

response.raise_for_status()  # fail fast on HTTP errors
print(response.json())

This is still a simple API call.

But instead of treating the model response as a black box, NEES routes the request through a governed runtime layer.


Why traceability matters

When an AI response goes wrong in production, teams need more than:

“The model said this.”

They need to know:

  • what request came in
  • what mode was active
  • what policy applied
  • what memory scope was used
  • what provider/model path handled the request
  • whether the response was allowed, modified, or blocked
  • how the interaction can be reviewed later

That is why trace IDs matter.

A trace ID acts like a reference point for debugging and review.

Without traceability, AI debugging becomes guesswork.
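To make the idea concrete, here is a toy in-memory trace store, a hypothetical sketch rather than how NEES stores traces internally. Each trace ID maps back to the request context, so a bad response can be looked up later:

```python
import uuid

class TraceLog:
    """Hypothetical in-memory trace store: trace_id -> request context."""

    def __init__(self):
        self._entries = {}

    def record(self, request: str, mode: str, memory_scope: str) -> str:
        # Mint an ID and capture what the runtime knew at request time.
        trace_id = f"trace_{uuid.uuid4().hex[:8]}"
        self._entries[trace_id] = {
            "request": request,
            "mode": mode,
            "memory_scope": memory_scope,
        }
        return trace_id

    def review(self, trace_id: str) -> dict:
        # Later, look up the full context behind a response.
        return self._entries.get(trace_id, {})

log = TraceLog()
tid = log.record("Refund order #123", mode="supportive", memory_scope="session")
print(log.review(tid)["mode"])  # supportive
```

Even this toy version shows the shift: debugging starts from a stored context, not from guessing what the model saw.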


Memory boundaries matter too

Memory is powerful.

But uncontrolled memory can create serious problems.

If every past interaction can influence every future response, the system becomes harder to reason about.

So memory should not be treated as unlimited context.

It should be governed.

A production AI system should be able to reason about:

  • what belongs only to the current session
  • what can be reused across sessions
  • what requires explicit consent
  • what should never influence a response
  • when memory usage should be visible or traceable

The goal is not simply:

Give the AI more memory.

The goal is:

Control when memory is used, why it is used, and how that usage can be reviewed.
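One minimal way to express those boundaries in code, a hypothetical sketch with made-up scope names, not the NEES implementation, is a scope enum plus a gate that decides whether a stored item may influence the current response:

```python
from enum import Enum

class MemoryScope(Enum):
    SESSION = "session"        # belongs only to the current session
    PERSISTENT = "persistent"  # may be reused across sessions
    CONSENT = "consent"        # requires explicit user consent
    BLOCKED = "blocked"        # must never influence a response

def may_use(scope: MemoryScope, session_matches: bool, has_consent: bool) -> bool:
    """Decide whether a memory item may influence the current response."""
    if scope is MemoryScope.BLOCKED:
        return False
    if scope is MemoryScope.SESSION:
        return session_matches
    if scope is MemoryScope.CONSENT:
        return has_consent
    return True  # PERSISTENT

# A session-scoped item from a different session is ignored.
print(may_use(MemoryScope.SESSION, session_matches=False, has_consent=True))  # False
```

The gate is trivial, but it makes every memory decision explicit and therefore reviewable.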


Runtime governance vs another AI agent

I do not think the answer to every AI problem is “add another agent.”

Sometimes the missing layer is not another AI.

Sometimes the missing layer is control.

AI agents become useful when the system around them is designed properly:

  • clear workflow boundaries
  • role permissions
  • escalation rules
  • memory scope
  • policy checks
  • observability
  • fallback behavior
  • human review when needed
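A toy version of that control layer looks something like this: a hypothetical gate (the mode names and policy function are illustrative, not the NEES API) that runs checks before the model is ever called, and escalates instead of answering when a check fails:

```python
ALLOWED_MODES = {"supportive", "strict"}  # hypothetical mode registry

def govern_request(message: str, mode: str, policy_ok) -> dict:
    """Run pre-model checks and return a decision, not just a model call."""
    if mode not in ALLOWED_MODES:
        return {"status": "blocked", "reason": f"unknown mode: {mode}"}
    if not policy_ok(message):
        return {"status": "escalated", "reason": "policy check failed"}
    return {"status": "allowed", "reason": None}

def no_secrets(msg: str) -> bool:
    # Example policy: refuse requests that mention raw credentials.
    return "password" not in msg.lower()

print(govern_request("Hi there", "supportive", no_secrets)["status"])           # allowed
print(govern_request("Send me the password", "supportive", no_secrets)["status"])  # escalated
print(govern_request("Hi", "rogue", no_secrets)["status"])                      # blocked
```

No extra agent is involved here, only plain control flow around the model call, which is the point.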

NEES is focused on that runtime layer.

It is not trying to replace the model.

It is trying to make AI behavior easier to govern before it reaches users.


Where this can be useful

NEES Core Engine can be useful for teams building:

  • AI assistants
  • AI agents
  • customer support bots
  • education apps
  • workflow automation
  • internal company copilots
  • AI content pipelines
  • production AI tools that need auditability

The common thread is simple:

If AI behavior affects real users, real workflows, or real decisions, it should be controlled and traceable.


Developer preview is now open

I recently opened a public developer preview repo for NEES Core Engine.

The repo includes:

  • Python quickstart
  • Node.js quickstart
  • cURL and PowerShell examples
  • API reference
  • governance flow documentation
  • 15-minute integration guide
  • API key request template
  • developer feedback template

Developer preview repo:

https://github.com/NEES-Anna/nees-core-developer-preview

There is also a live sample app connected to the governed runtime:

https://naina.nees.cloud

The sample app is useful for seeing the governed response flow in a real interface.


What I am looking for

This is still early.

I am not looking for generic traffic.

I am looking for honest feedback from developers, AI builders, founders, and teams working with production AI systems.

I would especially like feedback on:

  • Is the API approach clear?
  • Does the governance metadata feel useful?
  • Would trace IDs help you debug AI behavior?
  • How would you expect memory boundaries to work?
  • Would this fit better as a hosted API, SDK, or both?
  • What would you need before using this in a real product?
  • What integration docs should come next?

The first goal is not to make the system complex.

The first goal is to make the first 15 minutes useful.

A developer should be able to send one governed request and immediately understand:

This is different from a direct model call because I can see how the response was governed.


Final thought

AI is moving from demos into production.

That shift changes the infrastructure requirement.

In demos, a good answer is enough.

In production, teams need control.

They need to know what the AI was allowed to do, what context it used, what policy applied, and how the decision can be reviewed later.

That is the layer I am building with NEES Core Engine.

Not another chatbot.

A governance runtime for production AI.
