praveen sinha

Posted on May 22

GemmaOps Edge: From 373 Alarms to 1 Root Cause Using Local AI (Gemma 4)

#devchallenge #gemmachallenge #gemma


This is a submission for the Gemma 4 Challenge: Build with Gemma 4

🚨 From 373 alarms to 1 root cause in seconds

A production-grade AI reasoning agent that turns a wall of network alarms into clear root-cause analysis — running entirely on your own hardware.

What I Built

The Problem

It is 3 AM. A NOC engineer receives an alert:

"North region customers reporting intermittent connectivity drops. Possible fiber cut or BGP flap."

The system shows:

373 alarms
45 active
6 CRITICAL

The challenge:

Identify root cause
Determine blast radius
Estimate impact and resolution

Working through this manually typically takes 20–120 minutes, depending on the engineer's expertise.

The Solution

GemmaOps Edge is a fully local AI reasoning agent that enables operators to query network state in natural language and receive precise, actionable insights.

While GemmaOps Edge is demonstrated using telecom NOC scenarios, the same architecture applies to any high-volume event-driven system — including cloud observability, microservices monitoring, and enterprise infrastructure platforms.

🚨 This is not alert summarization — it is reasoning-driven root cause analysis.

Key Capabilities

Topology-aware Root Cause Analysis
Multi-condition Correlation (alarms + topology + history + traffic)
Service Impact Propagation
Historical Incident Matching with MTTR estimation
Natural Language Query Interface
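
To make "Service Impact Propagation" concrete, here is a minimal sketch of walking a topology graph downstream from a failed node. The project itself uses NetworkX (where `nx.descendants` returns the same reachable set for a directed graph); this version uses a plain dictionary to stay dependency-free. The names `blast_radius` and `customers_affected`, every node other than CR-NOR-01 and CE-NOR-02, and the customer counts are hypothetical.

```python
from collections import deque

# Hypothetical topology: each node maps to the nodes that depend on it.
# Only CR-NOR-01 and CE-NOR-02 appear in the post; the rest are made up.
DOWNSTREAM = {
    "CR-NOR-01": ["CE-NOR-02", "AGG-NOR-A"],
    "CE-NOR-02": ["ACC-NOR-1"],
    "AGG-NOR-A": ["ACC-NOR-2"],
    "ACC-NOR-1": [],
    "ACC-NOR-2": [],
}

# Illustrative customer counts per access node (not real data).
CUSTOMERS = {"ACC-NOR-1": 100, "ACC-NOR-2": 250}


def blast_radius(root, downstream):
    """Return every node reachable from `root`, in breadth-first order."""
    seen = {root}
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in downstream.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


def customers_affected(root, downstream, customers):
    """Sum customers on the root and on every node it impacts."""
    impacted = [root] + blast_radius(root, downstream)
    return sum(customers.get(node, 0) for node in impacted)


print(blast_radius("CR-NOR-01", DOWNSTREAM))
print(customers_affected("CR-NOR-01", DOWNSTREAM, CUSTOMERS))
```

Breadth-first order is a deliberate choice here: it lists the nodes closest to the fault first, which is usually the order an operator wants to check them in.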

Example Interaction

Operator: Why is the North region experiencing outages?

Agent:

BGP SESSION DOWN on CR-NOR-01 (ALM-00196)
CE-NOR-02 (ALM-00199) — 1,252+ prefixes withdrawn
SERVICE_OUTAGE affecting 2,560 customers

Historical match:

INC-2026-017 (BGP failure, MTTR 53 min)

Recommended actions:

Check BGP config changes
Rollback recent changes
Initiate incident bridge
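
The "Historical match" line in the interaction above could come from a similarity search over past incidents. The project uses a FAISS vector index for this; the sketch below swaps in plain keyword overlap (Jaccard similarity) so it runs without dependencies. The incident ID and 53-minute MTTR come from the example above; the keyword lists, the second incident, and the function names are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two keyword collections, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_incident(symptoms, incidents):
    """Return (best incident, score) by keyword overlap, or (None, 0.0)."""
    best, best_score = None, 0.0
    for incident in incidents:
        score = jaccard(symptoms, incident["keywords"])
        if score > best_score:
            best, best_score = incident, score
    return best, best_score


# Illustrative incident history. Only INC-2026-017 and its MTTR are from
# the post; the keywords and the second record are invented.
INCIDENTS = [
    {"id": "INC-2026-017", "keywords": ["bgp", "session", "down"], "mttr_min": 53},
    {"id": "INC-EXAMPLE-1", "keywords": ["fiber", "cut", "north"], "mttr_min": 120},
]

best, score = match_incident(["bgp", "session", "down", "north"], INCIDENTS)
print(best["id"], best["mttr_min"], score)
```

The matched incident's MTTR then serves as a first estimate for the current outage. An embedding-based index would also catch paraphrases that keyword overlap misses.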

Architecture

┌──────────────────────────────────────────────────────────────────────┐
│                            GemmaOps Edge                             │
│              AI-Powered NOC Assistant · Edge Deployment              │
│                                                                      │
│   ┌───────────────────┐                 ┌─────────────────────┐      │
│   │   NOC Dashboard   │◄──────REST─────►│   FastAPI Backend   │      │
│   │    React · TW     │                 │                     │      │
│   └───────────────────┘                 └──────────┬──────────┘      │
│                                                    │                 │
│                                         ┌──────────▼──────────┐      │
│                                         │     Agent Loop      │      │
│                                         │   ReAct · 128K ctx  │      │
│                                         └──────────┬──────────┘      │
│                                                    │                 │
│                                 ┌──────────────────┼───────────────┐ │
│                                 │                  │               │ │
│                       ┌─────────▼────────┐   ┌─────▼──────────┐    │ │
│                       │ Context Builder  │   │Reasoning Engine│    │ │
│                       └──────────────────┘   └────────┬───────┘    │ │
│                                 │                     │            │ │
│                       ┌─────────▼────────┐            │            │ │
│                       │  Tool Registry   │◄───────────┘            │ │
│                       └──────────────────┘                         │ │
│                                 └──────────────────────────────────┘ │
│                                                                      │
│   ┌──────────────────────────────┐     ┌─────────────────────────┐   │
│   │          Data Layer          │     │      Memory Layer       │   │
│   │                              │     │                         │   │
│   │   • NetworkX graph           │     │ • Redis     · short-term│   │
│   │   • FAISS vector index       │     │ • ChromaDB  · long-term │   │
│   │   • Live alarm store         │     │                         │   │
│   └──────────────────────────────┘     └─────────────────────────┘   │
│                                                                      │
│   ┌──────────────────────────────────────────────────────────────┐   │
│   │            Ollama · gemma4:e4b · localhost:11434             │   │
│   └──────────────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────────────┘

✔ Fully local deployment

✔ No cloud/API dependency

✔ Runs on commodity hardware

How It Works

ReAct Agent (Reasoning + Acting)

The agent dynamically:

Reads summarized network state
Calls tools based on need
Correlates multiple data sources
Produces precise RCA output
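
The loop described above can be sketched in a few lines. This is a minimal illustration, not the project's actual agent: the step format (`{"action": ..., "input": ...}` or `{"final": ...}`), the `scripted_model` stand-in for the LLM, and `fake_alarm_search` are all assumptions made for the example.

```python
def run_react(question, model, tools, max_steps=5):
    """Alternate between model decisions and tool calls until a final answer."""
    transcript = [("question", question)]
    for _ in range(max_steps):
        step = model(transcript)
        if "final" in step:
            return step["final"], transcript
        tool = tools.get(step["action"])
        if tool is None:
            # Feed the error back so the model can pick a different tool.
            observation = f"unknown tool: {step['action']}"
        else:
            observation = tool(step["input"])
        transcript.append(("action", step["action"], step["input"]))
        transcript.append(("observation", observation))
    return "step limit reached without a final answer", transcript


def scripted_model(transcript):
    """Stand-in for the LLM: request alarms once, then answer."""
    observations = [entry for entry in transcript if entry[0] == "observation"]
    if not observations:
        return {"action": "alarm_search", "input": "region=North"}
    return {"final": f"Root cause candidate: {observations[-1][1]}"}


def fake_alarm_search(query):
    # Hypothetical tool body; a real one would query the live alarm store.
    return "BGP SESSION DOWN on CR-NOR-01 (ALM-00196)"


answer, transcript = run_react(
    "Why is the North region experiencing outages?",
    scripted_model,
    {"alarm_search": fake_alarm_search},
)
print(answer)
```

The `max_steps` cap matters in practice: without it, a model that keeps requesting tools would never return control to the operator.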

NOC Tools

Tool	Purpose
`alarm_search`	Fetch active alarms
`topology_lookup`	Get network relationships
`path_finder`	Analyze routes
`incident_search`	Retrieve historical incidents
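
One way to expose tools like these to an agent is a small registry that maps each name to a function and a purpose string. The sketch below registers two of the four tools from the table; the decorator, the in-memory data, and the function signatures are assumptions for illustration, not the project's actual code.

```python
TOOLS = {}


def tool(name, purpose):
    """Register a function under `name` together with a one-line purpose."""
    def decorator(fn):
        TOOLS[name] = {"fn": fn, "purpose": purpose}
        return fn
    return decorator


# Illustrative in-memory data. The first two alarm IDs and nodes appear in
# the post; severities, the third alarm, and the adjacency are made up.
ALARMS = [
    {"id": "ALM-00196", "node": "CR-NOR-01", "severity": "CRITICAL", "active": True},
    {"id": "ALM-00199", "node": "CE-NOR-02", "severity": "CRITICAL", "active": True},
    {"id": "ALM-EXAMPLE", "node": "CR-SOU-01", "severity": "MINOR", "active": False},
]
NEIGHBORS = {"CR-NOR-01": ["CE-NOR-02"]}


@tool("alarm_search", "Fetch active alarms")
def alarm_search(severity=None):
    return [
        alarm for alarm in ALARMS
        if alarm["active"] and (severity is None or alarm["severity"] == severity)
    ]


@tool("topology_lookup", "Get network relationships")
def topology_lookup(node):
    return NEIGHBORS.get(node, [])


def describe_tools():
    """Render the registry as text that can be placed in the prompt."""
    return "\n".join(f"{name}: {spec['purpose']}" for name, spec in sorted(TOOLS.items()))


def call_tool(name, **kwargs):
    """Dispatch a tool call by name, failing loudly on unknown tools."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name]["fn"](**kwargs)


print(describe_tools())
print([alarm["id"] for alarm in call_tool("alarm_search", severity="CRITICAL")])
```

Keeping the purpose string next to the function means the tool list shown to the model cannot drift from the tools that are actually callable.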

Context Engineering (Critical Innovation)

Priority-based prompt construction:

KEY FACTS (highest impact)
Query intent
Active alarms
Topology graph
Historical incidents

➡ Improved accuracy from ~40% to ~90%
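
Priority-based construction can be sketched as filling a token budget in the order listed above, so that lower-priority sections are the first to be dropped. This is an assumption about how such a builder might work, not the project's implementation: the section titles mirror the list, while `estimate_tokens` (a crude word count), the skip-if-too-large policy, and the `## ` heading format inside the prompt are all hypothetical.

```python
PRIORITY = [
    "KEY FACTS",
    "QUERY INTENT",
    "ACTIVE ALARMS",
    "TOPOLOGY GRAPH",
    "HISTORICAL INCIDENTS",
]


def estimate_tokens(text):
    """Crude estimate: whitespace-separated words. A real tokenizer differs."""
    return len(text.split())


def build_prompt(sections, budget):
    """Add sections in priority order, skipping any that would exceed the budget."""
    parts = []
    used = 0
    for title in PRIORITY:
        body = sections.get(title)
        if not body:
            continue
        block = f"## {title}\n{body}"
        cost = estimate_tokens(block)
        if used + cost > budget:
            continue  # too large; a smaller, lower-priority section may still fit
        parts.append(block)
        used += cost
    return "\n\n".join(parts), used


sections = {
    "KEY FACTS": "BGP session down on CR-NOR-01",
    "ACTIVE ALARMS": "a b c d e f g h i j",  # placeholder for a long alarm dump
    "HISTORICAL INCIDENTS": "INC-2026-017 BGP failure",
}
prompt, used = build_prompt(sections, budget=15)
print(prompt)
```

With a budget of 15 the long alarm section is skipped while KEY FACTS and the incident summary survive, which is the behavior the priority order is meant to guarantee.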

The 128K Advantage

Two Operating Modes

Mode	Description
ReAct (6K)	Fast, tool-driven RCA analysis
Full Context (128K)	Whole-network reasoning in one pass

Why It Matters

Questions like:

"Which nodes appear in both CRITICAL alarms AND past P1 incidents?"

❌ Hard to answer reliably with RAG or smaller-context models: retrieval returns only the top matches, but the answer needs both complete sets

✅ Solved using full-context reasoning
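
The question above is a join across two complete sets, which is why every alarm and every incident has to be visible at once. A deterministic version of the same query is short, and it can serve as a cross-check on the model's answer. The field names (`severity`, `priority`, `nodes`) and all sample records are hypothetical.

```python
def nodes_in_both(alarms, incidents):
    """Nodes that have a CRITICAL alarm and also appear in a past P1 incident."""
    critical_nodes = {a["node"] for a in alarms if a["severity"] == "CRITICAL"}
    p1_nodes = {
        node
        for incident in incidents
        if incident["priority"] == "P1"
        for node in incident["nodes"]
    }
    return sorted(critical_nodes & p1_nodes)


# Illustrative records only; not taken from the project's dataset.
SAMPLE_ALARMS = [
    {"node": "CR-NOR-01", "severity": "CRITICAL"},
    {"node": "CE-NOR-02", "severity": "CRITICAL"},
    {"node": "AGG-NOR-A", "severity": "MAJOR"},
]
SAMPLE_INCIDENTS = [
    {"id": "INC-EXAMPLE-1", "priority": "P1", "nodes": ["CR-NOR-01", "AGG-NOR-A"]},
    {"id": "INC-EXAMPLE-2", "priority": "P3", "nodes": ["CE-NOR-02"]},
]

print(nodes_in_both(SAMPLE_ALARMS, SAMPLE_INCIDENTS))
```

If retrieval had surfaced only one of the two incidents, or only some of the alarms, the intersection would silently come out wrong. That is the failure mode full-context mode avoids.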

Benchmark Results

Model	Context	Performance
Gemma 4 E4B	128K	✅ 5/5 (Best)
Mistral 7B	32K	⚠️ 2/5 (Partial)
Gemma 2B	8K	❌ 1/5 (Limited)

➡ The limitation is context window, not model size

Demo

Code

https://github.com/praveen-sinha-ai/gemmaops-edge

How I Used Gemma 4

Model Selected

gemma4:e4b (4B)

Why This Model

Edge Deployment Requirement
- Runs locally (no GPU required)
- < 3GB footprint
- 1–4s response time

Reasoning Capability
- Handles multi-condition correlation: alarms, topology, incidents, traffic, config

Accuracy vs Efficiency Balance
- E2B → insufficient reasoning
- 31B → impractical for edge deployment
- E4B → optimal trade-off

Two Usage Modes

ReAct Agent Mode (6K)
- Multi-step reasoning
- Tool-based retrieval
- Fast responses

Full Context Mode (128K)
- Entire dataset in prompt (~43K tokens)
- No retrieval needed
- Enables deep correlation queries
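
Switching between the two modes can come down to the context size requested from Ollama. The sketch below builds a request for Ollama's `/api/generate` endpoint and sets `num_ctx` under `options`. The model tag and port come from this post. The exact `num_ctx` integers, the function names, and the choice of `/api/generate` over `/api/chat` are assumptions.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "gemma4:e4b"

# Context window per mode. "6K" and "128K" come from the post; the exact
# integers below are an assumption.
NUM_CTX = {"react": 6 * 1024, "full": 128 * 1024}


def build_request(prompt, mode):
    """Build (but do not send) a non-streaming generate request for a mode."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": NUM_CTX[mode]},
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, mode="react", timeout=120):
    """Send the prompt to a locally running Ollama server and return its text."""
    request = build_request(prompt, mode)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Calling `ask(full_dataset_prompt, mode="full")` requires a running Ollama server with the model pulled. A larger `num_ctx` also increases memory use, so the 6K mode remains the cheaper default for routine queries.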

Key Insight

The biggest differentiator was not model size —

it was how much data the model could see at once.

What Makes This Different

Not a basic RAG system or generic LLM wrapper
Performs multi-step reasoning with tool execution (ReAct)
Understands network topology as a graph, not just text
Combines alarms, topology, and incident history in one reasoning flow
Supports full-network reasoning using 128K context
Runs fully local — no cloud, no data exposure
Produces specific, verifiable outputs (IDs, nodes, incidents) — not vague summaries

What's Next

Graph Neural Networks (GNN-based RCA)
Predictive failure detection
Automated remediation workflows
Larger Gemma models (26B, 31B)
Domain fine-tuning (3GPP, TM Forum)

Closing

The biggest insight from building GemmaOps Edge:

The limitation is not model intelligence — it is how much of the system the model can see at once.

By combining:

Structured context engineering
Topology-aware reasoning
Large context windows (128K)

…it becomes possible to move from alert noise → precise root cause in seconds.

In a real NOC, that difference is not theoretical:

2 hours MTTR → 20 minutes
Fewer escalations
Faster recovery
Better customer experience

Local AI for enterprise operations is no longer a future concept.

With Gemma 4, it is practical today.

Tech Stack: Python, FastAPI, React, NetworkX, FAISS, Redis, ChromaDB, Ollama, Gemma 4

Tags: gemma ai telecom llm fastapi

Feedback & Discussion

I built GemmaOps Edge to solve a very real problem I’ve seen repeatedly in telecom NOCs — too many alarms, too little clarity.

If you're working on similar problems (telecom, observability, AI agents), I’d genuinely like to hear your thoughts.

What would you improve in this approach?
Would you trust this in a real NOC?
Any ideas for scaling this further?

Feel free to drop your questions or suggestions in the comments.

DEV Community