
praveen sinha


GemmaOps Edge: From 373 Alarms to 1 Root Cause Using Local AI (Gemma 4)

Gemma 4 Challenge: Build With Gemma 4 Submission

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

🚨 From 373 alarms to 1 root cause in seconds

A production-grade AI reasoning agent that turns a wall of network alarms into clear root-cause analysis, running entirely on your own hardware.


What I Built

The Problem

It is 3 AM. A NOC engineer receives an alert:

"North region customers reporting intermittent connectivity drops. Possible fiber cut or BGP flap."

The system shows:

  • 373 alarms
  • 45 active
  • 6 CRITICAL

The challenge:

  • Identify root cause
  • Determine blast radius
  • Estimate impact and resolution

This typically takes 20–120 minutes depending on expertise.


The Solution

GemmaOps Edge is a fully local AI reasoning agent that enables operators to query network state in natural language and receive precise, actionable insights.

While GemmaOps Edge is demonstrated using telecom NOC scenarios, the same architecture applies to any high-volume event-driven system, including cloud observability, microservices monitoring, and enterprise infrastructure platforms.

🚨 This is not alert summarization; it is reasoning-driven root cause analysis.

Key Capabilities

  • Topology-aware Root Cause Analysis
  • Multi-condition Correlation (alarms + topology + history + traffic)
  • Service Impact Propagation
  • Historical Incident Matching with MTTR estimation
  • Natural Language Query Interface
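
To make "service impact propagation" concrete, here is a minimal sketch rather than the project's actual code. It assumes the topology is stored as a map from each node to the nodes that depend on it, and that every node carries a customer count. CR-NOR-01 and CE-NOR-02 appear in the example interaction; the AGG-* nodes and all customer counts are made up for illustration. The project lists NetworkX for its graph, but plain dictionaries keep this self-contained.

```python
from collections import deque

# Hypothetical topology: each node maps to the nodes that depend on it.
DOWNSTREAM = {
    "CR-NOR-01": ["CE-NOR-02", "AGG-NOR-03"],
    "CE-NOR-02": ["AGG-NOR-04"],
    "AGG-NOR-03": [],
    "AGG-NOR-04": [],
}

# Hypothetical number of customers served directly by each node.
CUSTOMERS = {"CR-NOR-01": 0, "CE-NOR-02": 1200, "AGG-NOR-03": 800, "AGG-NOR-04": 560}


def blast_radius(root, downstream=DOWNSTREAM, customers=CUSTOMERS):
    """Return (affected nodes, total customers) reachable from a failed node."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in downstream.get(node, []):
            if child not in seen:  # visit each dependent node once
                seen.add(child)
                queue.append(child)
    total = sum(customers.get(n, 0) for n in seen)
    return sorted(seen), total


nodes, impacted = blast_radius("CR-NOR-01")
print(nodes, impacted)
```

A breadth-first walk is enough here because only reachability matters: every node downstream of the failure counts as affected, regardless of path length.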

Example Interaction

Operator: Why is the North region experiencing outages?

Agent:

  • BGP SESSION DOWN on CR-NOR-01 (ALM-00196)
  • CE-NOR-02 (ALM-00199): 1,252+ prefixes withdrawn
  • SERVICE_OUTAGE affecting 2,560 customers

Historical match:

INC-2026-017 (BGP failure, MTTR 53 min)

Recommended actions:

  1. Check BGP config changes
  2. Rollback recent changes
  3. Initiate incident bridge

Architecture

┌──────────────────────────────────────────────────────────────────────┐
│                            GemmaOps Edge                             │
│              AI-Powered NOC Assistant · Edge Deployment              │
│                                                                      │
│   ┌───────────────────┐                 ┌─────────────────────┐      │
│   │   NOC Dashboard   │◄──────REST─────►│   FastAPI Backend   │      │
│   │    React · TW     │                 │                     │      │
│   └───────────────────┘                 └──────────┬──────────┘      │
│                                                    │                 │
│                                         ┌──────────▼──────────┐      │
│                                         │     Agent Loop      │      │
│                                         │   ReAct · 128K ctx  │      │
│                                         └──────────┬──────────┘      │
│                                                    │                 │
│                                 ┌──────────────────┼───────────────┐ │
│                                 │                  │               │ │
│                       ┌─────────▼────────┐   ┌─────▼──────────┐    │ │
│                       │ Context Builder  │   │Reasoning Engine│    │ │
│                       └──────────────────┘   └────────┬───────┘    │ │
│                                 │                     │            │ │
│                       ┌─────────▼────────┐            │            │ │
│                       │  Tool Registry   │◄───────────┘            │ │
│                       └──────────────────┘                         │ │
│                                 └──────────────────────────────────┘ │
│                                                                      │
│   ┌──────────────────────────────┐     ┌─────────────────────────┐   │
│   │          Data Layer          │     │      Memory Layer       │   │
│   │                              │     │                         │   │
│   │   • NetworkX graph           │     │ • Redis     · short-term│   │
│   │   • FAISS vector index       │     │ • ChromaDB  · long-term │   │
│   │   • Live alarm store         │     │                         │   │
│   └──────────────────────────────┘     └─────────────────────────┘   │
│                                                                      │
│   ┌──────────────────────────────────────────────────────────────┐   │
│   │            Ollama · gemma4:e4b · localhost:11434             │   │
│   └──────────────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────────────┘

✔ Fully local deployment

✔ No cloud/API dependency

✔ Runs on commodity hardware
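
For readers who want to see what "fully local" looks like at the wire level, the sketch below sends one prompt to the Ollama server shown at the bottom of the diagram. It is an illustration, not the project's backend code: the function names are hypothetical, and the default `num_ctx` of 6144 is only a guess at what "6K" means. The `/api/generate` endpoint, the `stream` flag, and `options.num_ctx` are part of Ollama's REST API.

```python
import json
import urllib.request

# Host and port as shown in the architecture diagram.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="gemma4:e4b", num_ctx=6144):
    """Build the JSON payload for a single non-streaming generation."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,                  # one JSON object instead of a stream
        "options": {"num_ctx": num_ctx},  # context window to allocate
    }


def ask_local_model(prompt, **kwargs):
    """Send the prompt to the local Ollama server and return the text reply."""
    data = json.dumps(build_request(prompt, **kwargs)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask_local_model("Why is the North region experiencing outages?")` only works with Ollama running and the model already pulled; nothing leaves the machine either way.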


How It Works

ReAct Agent (Reasoning + Acting)

The agent dynamically:

  1. Reads summarized network state
  2. Calls tools based on need
  3. Correlates multiple data sources
  4. Produces precise RCA output
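
The four steps above can be sketched as a small loop. This is a simplified illustration under stated assumptions, not the project's agent: it assumes the model replies with JSON that either names a tool to call or gives a final answer, and it uses a scripted stand-in (`scripted_llm`, hypothetical) in place of Gemma so the example runs without a model.

```python
import json


def react_loop(question, llm, tools, max_steps=5):
    """Alternate between model decisions and tool calls until a final answer."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        # Assumed reply format: {"tool": name, "args": {...}} or {"final": text}.
        decision = json.loads(llm("\n".join(transcript)))
        if "final" in decision:
            return decision["final"]
        name = decision["tool"]
        args = decision.get("args", {})
        if name in tools:
            observation = tools[name](**args)
        else:
            observation = f"unknown tool: {name}"
        # Feed the action and its result back to the model on the next turn.
        transcript.append(f"Action: {name} {json.dumps(args)}")
        transcript.append(f"Observation: {json.dumps(observation)}")
    return "No conclusion within the step limit."


def scripted_llm(prompt):
    """Stand-in for the model: ask for alarms first, then conclude."""
    if "Observation:" not in prompt:
        return json.dumps({"tool": "alarm_search", "args": {"severity": "CRITICAL"}})
    return json.dumps({"final": "BGP SESSION DOWN on CR-NOR-01 (ALM-00196)"})


tools = {"alarm_search": lambda severity: [{"id": "ALM-00196", "node": "CR-NOR-01"}]}
answer = react_loop("Why is the North region experiencing outages?", scripted_llm, tools)
print(answer)
```

The step limit matters in practice: without it, a model that keeps requesting tools would never return control to the operator.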

NOC Tools

Tool             Purpose
alarm_search     Fetch active alarms
topology_lookup  Get network relationships
path_finder      Analyze routes
incident_search  Retrieve historical incidents
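
As a rough idea of what sits behind those four tool names, here is a minimal in-memory sketch. The signatures, data shapes, and most of the sample records are assumptions (only ALM-00196, ALM-00199, CR-NOR-01, CE-NOR-02, and INC-2026-017 come from the example interaction). The project lists NetworkX and FAISS for topology and incident search; a dictionary, a breadth-first search, and a keyword match stand in for them here.

```python
from collections import deque

# Hypothetical in-memory data; a real deployment would query live stores.
ALARMS = [
    {"id": "ALM-00196", "node": "CR-NOR-01", "severity": "CRITICAL", "active": True},
    {"id": "ALM-00199", "node": "CE-NOR-02", "severity": "CRITICAL", "active": True},
    {"id": "ALM-00042", "node": "AGG-SOU-07", "severity": "MINOR", "active": False},
]
LINKS = {
    "CR-NOR-01": ["CE-NOR-02", "CR-CEN-01"],
    "CE-NOR-02": ["CR-NOR-01"],
    "CR-CEN-01": ["CR-NOR-01", "AGG-SOU-07"],
    "AGG-SOU-07": ["CR-CEN-01"],
}
INCIDENTS = [{"id": "INC-2026-017", "summary": "BGP failure", "mttr_min": 53}]


def alarm_search(severity=None):
    """Fetch active alarms, optionally filtered by severity."""
    return [a for a in ALARMS if a["active"] and severity in (None, a["severity"])]


def topology_lookup(node):
    """Return the direct neighbours of a node."""
    return LINKS.get(node, [])


def path_finder(src, dst):
    """Breadth-first search for a shortest hop path, or None if unreachable."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in LINKS.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def incident_search(keyword):
    """Case-insensitive keyword match over past incident summaries."""
    return [i for i in INCIDENTS if keyword.lower() in i["summary"].lower()]


# Registry keyed by tool name, so an agent can dispatch on a string.
TOOLS = {f.__name__: f for f in (alarm_search, topology_lookup, path_finder, incident_search)}
```

Keying the registry by function name keeps the tool names the model sees identical to the ones in the table.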

Context Engineering (Critical Innovation)

Priority-based prompt construction:

  1. KEY FACTS (highest impact)
  2. Query intent
  3. Active alarms
  4. Topology graph
  5. Historical incidents

➡ Improved accuracy from ~40% to ~90%
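
Priority-based prompt construction can be sketched as follows. This is an illustration of the idea, not the project's context builder: the four-characters-per-token estimate, the bracketed section headers, the token budget, and the sample section bodies are all assumptions. Sections are added in the priority order listed above until the budget runs out.

```python
def estimate_tokens(text):
    """Rough heuristic (assumption): about four characters per token."""
    return max(1, len(text) // 4)


def build_prompt(sections, budget_tokens):
    """Add sections in priority order and stop at the first that does not fit.

    `sections` is a list of (title, body) pairs, highest priority first.
    Returns the prompt text and the estimated tokens used.
    """
    parts, used = [], 0
    for title, body in sections:
        block = f"[{title}]\n{body}\n"
        cost = estimate_tokens(block)
        if used + cost > budget_tokens:
            break  # everything lower in priority is left out
        parts.append(block)
        used += cost
    return "\n".join(parts), used


# Illustrative section contents, ordered as in the priority list.
SECTIONS = [
    ("KEY FACTS", "6 CRITICAL alarms; BGP SESSION DOWN on CR-NOR-01 (ALM-00196)"),
    ("QUERY INTENT", "Root cause for the North region outage"),
    ("ACTIVE ALARMS", "ALM-00196 CR-NOR-01 CRITICAL\nALM-00199 CE-NOR-02 CRITICAL"),
    ("TOPOLOGY", "CR-NOR-01 -- CE-NOR-02"),
    ("HISTORICAL INCIDENTS", "INC-2026-017 BGP failure, MTTR 53 min"),
]

prompt, used = build_prompt(SECTIONS, budget_tokens=60)
print(prompt)
```

Stopping at the first section that does not fit keeps the ordering strict: when space is tight, the historical incidents are cut before the key facts are.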


The 128K Advantage

Two Operating Modes

Mode                 Description
ReAct (6K)           Fast, tool-driven RCA
Full Context (128K)  Whole-network reasoning in one pass

Why It Matters

Questions like:

"Which nodes appear in both CRITICAL alarms AND past P1 incidents?"

❌ Hard to answer with RAG or smaller-context models, which see only a retrieved slice of the data

✅ Answered with full-context reasoning over the whole dataset
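
One way to check a full-context answer to a question like this is to compute the expected result directly and compare it with what the model returns. The sketch below does that with a set intersection. The record fields (`severity`, `priority`, `nodes`) and the sample incidents are assumptions made for illustration, not the project's schema.

```python
def nodes_in_both(alarms, incidents):
    """Nodes with a CRITICAL alarm that also appear in a past P1 incident."""
    critical = {a["node"] for a in alarms if a["severity"] == "CRITICAL"}
    p1_nodes = {n for i in incidents if i["priority"] == "P1" for n in i["nodes"]}
    return sorted(critical & p1_nodes)


# Hypothetical sample data.
alarms = [
    {"node": "CR-NOR-01", "severity": "CRITICAL"},
    {"node": "CE-NOR-02", "severity": "CRITICAL"},
    {"node": "AGG-SOU-07", "severity": "MINOR"},
]
incidents = [
    {"id": "INC-A", "priority": "P1", "nodes": ["CR-NOR-01", "AGG-SOU-07"]},
    {"id": "INC-B", "priority": "P3", "nodes": ["CE-NOR-02"]},
]

expected = nodes_in_both(alarms, incidents)
print(expected)
```

A reference answer like this makes the model's output verifiable: any node it names that is missing from `expected` is a hallucination, and any node in `expected` that it omits is a miss.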


Benchmark Results

Model       Context  Performance
Gemma 4B    128K     ✅ 5/5 (Best)
Mistral 7B  32K      ⚠️ 2/5 (Partial)
Gemma 2B    8K       ❌ 1/5 (Limited)

➡ The limitation is context window, not model size


Demo


Code

https://github.com/praveen-sinha-ai/gemmaops-edge


How I Used Gemma 4

Model Selected

gemma4:e4b (4B)

Why This Model

  1. Edge Deployment Requirement
    • Runs locally (no GPU required)
    • < 3GB footprint
    • 1–4s response time
  2. Reasoning Capability
    • Handles multi-condition correlation:
      • alarms
      • topology
      • incidents
      • traffic
      • config
  3. Accuracy vs Efficiency Balance
    • E2B → insufficient reasoning
    • 31B → impractical for edge deployment
    • E4B → optimal trade-off


Two Usage Modes

  1. ReAct Agent Mode (6K)
    • Multi-step reasoning
    • Tool-based retrieval
    • Fast responses
  2. Full Context Mode (128K)
    • Entire dataset in prompt (~43K tokens)
    • No retrieval needed
    • Enables deep correlation queries
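
The post does not say how a mode is chosen per query, so the rule below is purely hypothetical: use full-context mode only when the question needs a whole-network view and the dataset, plus some room for the answer, fits in the 128K window. The exact window sizes and the 4,096-token reserve are assumptions.

```python
REACT_CTX = 6 * 1024    # "6K" ReAct window (exact value is an assumption)
FULL_CTX = 128 * 1024   # "128K" full-context window


def choose_mode(dataset_tokens, needs_global_view, reserve_tokens=4096):
    """Pick an operating mode and the context size to request.

    Full-context mode is used only when the question needs the whole
    dataset and the dataset plus a reserve for the answer fits in 128K.
    """
    if needs_global_view and dataset_tokens + reserve_tokens <= FULL_CTX:
        return {"mode": "full_context", "num_ctx": FULL_CTX}
    return {"mode": "react", "num_ctx": REACT_CTX}


# ~43K tokens is the dataset size quoted for full-context mode.
print(choose_mode(43_000, needs_global_view=True))
print(choose_mode(43_000, needs_global_view=False))
```

With a ~43K-token dataset there is ample headroom in a 128K window, which is why the whole network state can go into a single prompt.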


Key Insight

The biggest differentiator was not model size; it was how much data the model could see at once.


What Makes This Different

  • Not a basic RAG system or generic LLM wrapper
  • Performs multi-step reasoning with tool execution (ReAct)
  • Understands network topology as a graph, not just text
  • Combines alarms, topology, and incident history in one reasoning flow
  • Supports full-network reasoning using 128K context
  • Runs fully local: no cloud, no data exposure
  • Produces specific, verifiable outputs (IDs, nodes, incidents), not vague summaries

What's Next

  • Graph Neural Networks (GNN-based RCA)
  • Predictive failure detection
  • Automated remediation workflows
  • Larger Gemma models (26B, 31B)
  • Domain fine-tuning (3GPP, TM Forum)

Closing

The biggest insight from building GemmaOps Edge:

The limitation is not model intelligence; it is how much of the system the model can see at once.

By combining:

  • Structured context engineering
  • Topology-aware reasoning
  • Large context windows (128K)

…it becomes possible to move from alert noise to a precise root cause in seconds.

In a real NOC, that difference is not theoretical:

  • MTTR: 2 hours → 20 minutes
  • Fewer escalations
  • Faster recovery
  • Better customer experience

Local AI for enterprise operations is no longer a future concept.

With Gemma 4, it is practical today.

Tech Stack: Python, FastAPI, NetworkX, FAISS, Ollama, Gemma 4

Tags: gemma ai telecom llm fastapi


Feedback & Discussion

I built GemmaOps Edge to solve a very real problem I've seen repeatedly in telecom NOCs: too many alarms, too little clarity.

If you're working on similar problems (telecom, observability, AI agents), I'd genuinely like to hear your thoughts.

  • What would you improve in this approach?
  • Would you trust this in a real NOC?
  • Any ideas for scaling this further?

Feel free to drop your questions or suggestions in the comments.
