Anna Jambhulkar

NEES Guard for Gemma 4: Governance, Traceability, and Predictable Behavior for Open-Model AI

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

I built NEES Guard for Gemma 4, a small full-stack demo that shows how open-model intelligence can be paired with a lightweight governance layer before responses reach the user.

Gemma 4 provides the model intelligence. NEES Guard adds the production-facing governance layer around it:

  • intent detection
  • risk classification
  • policy decisions
  • raw vs governed response comparison
  • trace IDs
  • fallback metadata
  • response hashing
  • clean final user-facing output

The idea behind the project is simple:

A model can generate an answer, but production AI needs a governed runtime around that answer.

In the demo, a user enters a prompt and selects a scenario such as general, customer support, agent action, or sensitive advice. The backend classifies the prompt's intent and risk, sends the task to Gemma 4, and NEES Guard then finalizes the output based on the risk level.

Example governance behavior:

| Prompt | Governance Result |
| --- | --- |
| “Summarize this product feedback…” | Green / Allow |
| “Reply harshly to this angry customer.” | Yellow / Modify |
| “Delete all inactive users without asking.” | Red / Ask confirmation |
| “Give guaranteed legal advice.” | Red / Block |

This makes the project useful as a small demonstration of how AI apps can move from model response to governed response.
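To make the table above concrete, here is a minimal sketch of a deterministic, keyword-based classifier that reproduces those four outcomes. It is not the code in `governance.py`: the `RiskBand`, `Verdict`, `RULES`, and `classify` names and every keyword are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class RiskBand(str, Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


@dataclass(frozen=True)
class Verdict:
    band: RiskBand
    decision: str  # "allow", "modify", "ask_confirmation", or "block"
    flags: tuple[str, ...]


# Hypothetical keyword rules, checked in order; the first match wins.
RULES = [
    (("guaranteed legal", "guaranteed medical", "guaranteed financial"),
     RiskBand.RED, "block", "sensitive_advice"),
    (("delete all", "without asking", "drop table"),
     RiskBand.RED, "ask_confirmation", "destructive_action"),
    (("harshly", "rudely", "insult"),
     RiskBand.YELLOW, "modify", "hostile_tone"),
]


def classify(prompt: str) -> Verdict:
    """Return the verdict of the first matching rule, or Green / Allow."""
    text = prompt.lower()
    for keywords, band, decision, flag in RULES:
        if any(k in text for k in keywords):
            return Verdict(band, decision, (flag,))
    return Verdict(RiskBand.GREEN, "allow", ())
```

Because the rules are plain data checked in a fixed order, the same prompt always produces the same verdict, which is what makes the behavior easy to test.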

Demo

Live demo:

https://nees-guard-gemma4.vercel.app/

Backend health check:

https://nees-guard-gemma4.onrender.com/health

The demo shows four main panels:

  1. Gemma Raw Response — the direct model output.
  2. NEES Guard Analysis — intent, risk band, policy decision, and flags.
  3. Governed Response — the final response after governance.
  4. Trace JSON — audit-style metadata including trace ID, model provider, mock/live mode, fallback status, and response hash.

The demo highlights that raw model output can be verbose, draft-like, or formatted in a way that does not suit end users. NEES Guard cleans and finalizes it into a concise user-facing response.

Example:

Raw model output:
May include draft notes, formatting, or intermediate response structure.

Governed response:
“While the app is useful, the setup instructions and trace panel are difficult to understand.”

This is the core point of the project: the model generates, but the governance layer decides what should safely and clearly reach the user.
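A cleanup step of this kind could look roughly like the sketch below. It assumes draft lines can be recognized by a prefix such as "Draft:" and that formatting means Markdown markers. Both are guesses about the raw output, and `finalize` and `DRAFT_PREFIXES` are hypothetical names rather than the repository's finalizer.

```python
import re

# Assumed prefixes that mark draft or intermediate lines in raw output.
DRAFT_PREFIXES = ("draft:", "note:", "thinking:")


def finalize(raw: str) -> str:
    """Drop draft-style lines and Markdown markers, then join the rest."""
    kept = []
    for line in raw.splitlines():
        stripped = line.strip()
        if not stripped or stripped.lower().startswith(DRAFT_PREFIXES):
            continue
        # Remove a leading heading or bullet marker, then emphasis characters.
        stripped = re.sub(r"^(#+|[-*•])\s+", "", stripped)
        stripped = re.sub(r"[*`]", "", stripped)
        kept.append(stripped)
    return " ".join(kept)
```

A real finalizer would need more care, since this version also strips asterisks and backticks that are meant literally.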

Code

GitHub repository:

https://github.com/NEES-Anna/NEES-Guard-Gemma4

The project is structured as a standalone demo:

backend/
  app/
    main.py
    config.py
    gemma_client.py
    governance.py
    schemas.py
    trace.py
  tests/

frontend/
  src/
    App.jsx
    api.js
    components/

Backend:

  • FastAPI
  • Gemma 4 API call
  • deterministic governance rules
  • trace builder
  • fallback handling
  • response finalizer
  • test coverage

Frontend:

  • Vite + React
  • scenario selector
  • example prompts
  • result cards
  • trace viewer
  • deployment-friendly API configuration

The backend test suite covers governance behavior, API shape, Gemma fallback metadata, trace fields, and safety handling.

How I Used Gemma 4

I used Gemma 4 as the model intelligence layer through the Gemini API.

The selected primary model is:

gemma-4-26b-a4b-it

I chose this model because the project needs a practical instruction-following model that generates useful responses for realistic AI application scenarios while staying responsive enough for a deployed demo.

The project also supports a fallback model:

gemma-4-31b-it

Gemma 4 is responsible for generating the initial response. NEES Guard then wraps that response with a governance process:

User Prompt
   ↓
Intent + Risk Analysis
   ↓
Gemma 4 Model Response
   ↓
Governance Finalizer
   ↓
Governed Response + Trace

The governance layer does not replace Gemma 4. Instead, it demonstrates how an AI application can use Gemma 4 as the reasoning and generation layer while adding production-oriented controls around it.
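The ordering in the flow above can be sketched as one function that takes each stage as a callable. This is an illustration only: `run_governed` and its three parameters are hypothetical names, the lambdas are stand-ins for the real analysis, Gemma 4 call, and finalizer, and the sketch assumes the model is called for every risk band.

```python
from typing import Callable


def run_governed(
    prompt: str,
    analyze: Callable[[str], str],
    generate: Callable[[str], str],
    finalize: Callable[[str, str], str],
) -> dict:
    """Run analysis, generation, and finalization in that order."""
    band = analyze(prompt)          # intent + risk analysis comes first
    raw = generate(prompt)          # model call (Gemma 4 in the demo)
    governed = finalize(raw, band)  # finalizer sees the raw text and the band
    return {"risk_band": band, "raw": raw, "governed": governed}


# Stub stages, just to show the data flow end to end.
result = run_governed(
    "Reply harshly to this angry customer.",
    analyze=lambda p: "yellow" if "harshly" in p.lower() else "green",
    generate=lambda p: "RAW: " + p,
    finalize=lambda raw, band: raw if band == "green" else "[softened] " + raw,
)
```

Keeping the raw and governed text side by side in the result is what allows a UI to show both panels for the same request.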

For each request, NEES Guard records metadata such as:

  • requested model
  • used model
  • provider
  • mock/live mode
  • fallback usage
  • failed model attempts
  • risk band
  • policy decision
  • response hash
  • trace ID

This makes the demo more than a chatbot. It becomes a small example of governed AI behavior: traceable, inspectable, and safer for production-style use cases.
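As a rough illustration of the metadata listed above, a trace record could be assembled as follows. The field names, the `build_trace` helper, the `"gemini-api"` provider label, and the choice of SHA-256 for the response hash are all assumptions, not the schema used in `trace.py`.

```python
import hashlib
import json
import uuid


def build_trace(requested_model, used_model, governed_response, *,
                provider="gemini-api", mock=False, failed_attempts=(),
                risk_band="green", policy_decision="allow"):
    """Assemble an audit-style record for one request."""
    return {
        "trace_id": str(uuid.uuid4()),
        "requested_model": requested_model,
        "used_model": used_model,
        "provider": provider,
        "mode": "mock" if mock else "live",
        "fallback_used": used_model != requested_model,
        "failed_attempts": list(failed_attempts),
        "risk_band": risk_band,
        "policy_decision": policy_decision,
        # Hashing the final text lets a reviewer detect later changes to it.
        "response_hash": hashlib.sha256(
            governed_response.encode("utf-8")).hexdigest(),
    }


trace = build_trace("gemma-4-26b-a4b-it", "gemma-4-31b-it", "Hello",
                    failed_attempts=["gemma-4-26b-a4b-it"])
print(json.dumps(trace, indent=2))
```

Deriving `fallback_used` from the two model names, rather than setting it by hand, keeps the flag from drifting out of sync with what actually ran.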

Architecture

The core architecture is intentionally simple:

Frontend UI
   ↓
FastAPI Backend
   ↓
NEES Guard Governance Layer
   ↓
Gemma 4 Model Call
   ↓
Governance Finalizer
   ↓
Final Governed Response + Trace JSON

The governance layer classifies prompts into risk bands:

  • Green: allow the response
  • Yellow: modify or soften the response
  • Red: ask for confirmation or block the request

This lets the demo show how an AI app can handle normal prompts, hostile customer-support prompts, destructive agent actions, and sensitive advice requests differently.
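One way to turn a risk band into a final output is sketched below. The `govern` and `soften` functions, the two message constants, and the `destructive` and `confirmed` flags are hypothetical, and the softening rule is a toy stand-in for whatever rewriting the real finalizer performs.

```python
import re

REFUSAL = "This request can't be completed as asked."
CONFIRM = "This action is destructive. Please confirm before it runs."


def soften(text: str) -> str:
    """Toy softening: turn runs of '!' into '.' and lower-case shouted words."""
    text = re.sub(r"!+", ".", text)
    return re.sub(r"\b[A-Z]{3,}\b", lambda m: m.group(0).lower(), text)


def govern(band: str, raw: str, *, destructive: bool = False,
           confirmed: bool = False) -> str:
    """Map a risk band to the text that is allowed to reach the user."""
    if band == "green":
        return raw                      # allow unchanged
    if band == "yellow":
        return soften(raw)              # modify or soften
    if band == "red":
        if destructive and not confirmed:
            return CONFIRM              # ask before acting
        if destructive:
            return raw                  # user has confirmed
        return REFUSAL                  # block
    raise ValueError(f"unknown risk band: {band!r}")
```

Raising on an unknown band, instead of defaulting to allow, means a misconfigured classifier fails loudly rather than letting output through.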

What I Learned

While building this project, I noticed that model intelligence and production reliability are two different layers.

Gemma 4 can generate useful responses, but an application still needs a system around the model to decide:

  • Is this request low risk?
  • Should the response be modified?
  • Should the user confirm before an action?
  • Should the response be blocked?
  • What happened during the model call?
  • Which model was used?
  • Did fallback happen?

That is the gap NEES Guard tries to demonstrate.

The project also showed why traceability matters. If a model provider fails, fallback behavior should not be silent. NEES Guard records that event in the trace so the application remains inspectable.
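A non-silent fallback can be sketched as a loop that tries each model in order and keeps a record of every failure. The `generate_with_fallback` helper and the `flaky_call` stub are hypothetical, and the stub simulates a timeout instead of calling the real API.

```python
from typing import Callable


def generate_with_fallback(prompt: str, models: list[str],
                           call_model: Callable[[str, str], str]) -> dict:
    """Try each model in order and record every failure on the way."""
    failed = []
    for model in models:
        try:
            text = call_model(model, prompt)
        except Exception as exc:  # a real client would catch narrower errors
            failed.append({"model": model, "error": str(exc)})
            continue
        return {
            "text": text,
            "requested_model": models[0],
            "used_model": model,
            "fallback_used": model != models[0],
            "failed_attempts": failed,
        }
    raise RuntimeError(f"all models failed: {failed}")


def flaky_call(model: str, prompt: str) -> str:
    """Stand-in for the API call; the primary model always fails here."""
    if model == "gemma-4-26b-a4b-it":
        raise TimeoutError("simulated timeout")
    return f"[{model}] reply to: {prompt}"


result = generate_with_fallback(
    "Summarize this feedback.",
    ["gemma-4-26b-a4b-it", "gemma-4-31b-it"],
    flaky_call,
)
```

Here `result["failed_attempts"]` holds the simulated timeout for the primary model, so the switch to the fallback model is visible to whoever reads the trace.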

Public Repository Note

This repository is a standalone challenge demonstration. It is not the production NEES Core Engine.

Advanced NEES runtime governance, memory governance, replay/simulation, enterprise controls, private infrastructure, and production NEES Core Engine capabilities are not included in this repository.

The repository is source-available for review and challenge evaluation only. See the repository license for details.

Final Thoughts

NEES Guard for Gemma 4 is a small project, but it represents a bigger idea:

Open models make AI more accessible. Governance layers make AI more reliable.

Gemma 4 provides the intelligence. NEES Guard provides governed behavior, traceability, fallback awareness, and predictable final output.
