This is a submission for the Gemma 4 Challenge: Build with Gemma 4
π¨ From 373 alarms to 1 root cause in seconds
A production-grade AI reasoning agent that turns a wall of network alarms into clear root-cause analysis β running entirely on your own hardware.
What I Built
The Problem
It is 3 AM. A NOC engineer receives an alert:
"North region customers reporting intermittent connectivity drops. Possible fiber cut or BGP flap."
The system shows:
- 373 alarms
- 45 active
- 6 CRITICAL
The challenge:
- Identify root cause
- Determine blast radius
- Estimate impact and resolution
This typically takes 20β120 minutes depending on expertise.
The Solution
GemmaOps Edge is a fully local AI reasoning agent that enables operators to query network state in natural language and receive precise, actionable insights.
While GemmaOps Edge is demonstrated using telecom NOC scenarios, the same architecture applies to any high-volume event-driven system β including cloud observability, microservices monitoring, and enterprise infrastructure platforms.
π¨ This is not alert summarization β it is reasoning-driven root cause analysis.
Key Capabilities
- Topology-aware Root Cause Analysis
- Multi-condition Correlation (alarms + topology + history + traffic)
- Service Impact Propagation
- Historical Incident Matching with MTTR estimation
- Natural Language Query Interface
Example Interaction
Operator: Why is the North region experiencing outages?
Agent:
- BGP SESSION DOWN on CR-NOR-01 (ALM-00196)
- CE-NOR-02 (ALM-00199) β 1,252+ prefixes withdrawn
- SERVICE_OUTAGE affecting 2,560 customers
Historical match:
INC-2026-017 (BGP failure, MTTR 53 min)
Recommended actions:
- Check BGP config changes
- Rollback recent changes
- Initiate incident bridge
Architecture
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β GemmaOps Edge β
β AI-Powered NOC Assistant Β· Edge Deployment β
β β
β βββββββββββββββββββββ βββββββββββββββββββββββ β
β β NOC Dashboard ββββββββRESTββββββΊβ FastAPI Backend β β
β β React Β· TW β β β β
β βββββββββββββββββββββ ββββββββββββ¬βββββββββββ β
β β β
β ββββββββββββΌβββββββββββ β
β β Agent Loop β β
β β ReAct Β· 128K ctx β β
β ββββββββββββ¬βββββββββββ β
β β β
β ββββββββββββββββββββΌββββββββββββββββ β
β β β β β
β βββββββββββΌβββββββββ βββββββΌβββββββββββ β β
β β Context Builder β βReasoning Engineβ β β
β ββββββββββββββββββββ ββββββββββ¬ββββββββ β β
β β β β β
β βββββββββββΌβββββββββ β β β
β β Tool Registry ββββββββββββββ β β
β ββββββββββββββββββββ β β
β ββββββββββββββββββββββββββββββββββββ β
β β
β ββββββββββββββββββββββββββββββββ βββββββββββββββββββββββββββ β
β β Data Layer β β Memory Layer β β
β β β β β β
β β β’ NetworkX graph β β β’ Redis Β· short-termβ β
β β β’ FAISS vector index β β β’ ChromaDB Β· long-term β β
β β β’ Live alarm store β β β β
β ββββββββββββββββββββββββββββββββ βββββββββββββββββββββββββββ β
β β
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β Ollama Β· gemma4:e4b Β· localhost:11434 β β
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Fully local deployment
β No cloud/API dependency
β Runs on commodity hardware
How It Works
ReAct Agent (Reasoning + Acting)
The agent dynamically:
- Reads summarized network state
- Calls tools based on need
- Correlates multiple data sources
- Produces precise RCA output
NOC Tools
| Tool | Purpose |
|---|---|
alarm_search |
Fetch active alarms |
topology_lookup |
Get network relationships |
path_finder |
Analyze routes |
incident_search |
Retrieve historical incidents |
Context Engineering (Critical Innovation)
Priority-based prompt construction:
- KEY FACTS (highest impact)
- Query intent
- Active alarms
- Topology graph
- Historical incidents
β‘ Improved accuracy from ~40% to ~90%
The 128K Advantage
Two Operating Modes
| Mode | Description |
|---|---|
| ReAct (6K) | Fast, tool-driven RCA analysis |
| Full Context (128K) | Whole-network reasoning in one pass |
Why It Matters
Questions like:
"Which nodes appear in both CRITICAL alarms AND past P1 incidents?"
β Cannot be solved by RAG or smaller-context models
β
Solved using full-context reasoning
Benchmark Results
| Model | Context | Performance |
|---|---|---|
| Gemma 4B | 128K | β 5/5 (Best) |
| Mistral 7B | 32K | β οΈ 2/5 (Partial) |
| Gemma 2B | 8K | β 1/5 (Limited) |
β‘ The limitation is context window, not model size
Demo
Code
https://github.com/praveen-sinha-ai/gemmaops-edge
How I Used Gemma 4
Model Selected
gemma4:e4b (4B)
Why This Model
- Edge Deployment Requirement
- Runs locally (no GPU required)
- < 3GB footprint
1β4s response time
Reasoning Capability
-
Handles multi-condition correlation:
- alarms
- topology
- incidents
- traffic
- config
Accuracy vs Efficiency Balance
E2B β insufficient reasoning
31B β impractical for edge deployment
E4B β optimal trade-off
Two Usage Modes
- ReAct Agent Mode (6K)
- Multi-step reasoning
- Tool-based retrieval
Fast responses
Full Context Mode (128K)
Entire dataset in prompt (~43K tokens)
No retrieval needed
Enables deep correlation queries
Key Insight
The biggest differentiator was not model size β
it was how much data the model could see at once.
What Makes This Different
- Not a basic RAG system or generic LLM wrapper
- Performs multi-step reasoning with tool execution (ReAct)
- Understands network topology as a graph, not just text
- Combines alarms, topology, and incident history in one reasoning flow
- Supports full-network reasoning using 128K context
- Runs fully local β no cloud, no data exposure
- Produces specific, verifiable outputs (IDs, nodes, incidents) β not vague summaries
What's Next
- Graph Neural Networks (GNN-based RCA)
- Predictive failure detection
- Automated remediation workflows
- Larger Gemma models (26B, 31B)
- Domain fine-tuning (3GPP, TM Forum)
Closing
The biggest insight from building GemmaOps Edge:
The limitation is not model intelligence β it is how much of the system the model can see at once.
By combining:
- Structured context engineering
- Topology-aware reasoning
- Large context windows (128K)
β¦it becomes possible to move from alert noise β precise root cause in seconds.
In a real NOC, that difference is not theoretical:
- 2 hours MTTR β 20 minutes
- Fewer escalations
- Faster recovery
- Better customer experience
Local AI for enterprise operations is no longer a future concept.
With Gemma 4, it is practical today.
Tech Stack: Python, FastAPI, NetworkX, FAISS, Ollama, Gemma 4
Tags: gemma ai telecom llm fastapi
Feedback & Discussion
I built GemmaOps Edge to solve a very real problem Iβve seen repeatedly in telecom NOCs β too many alarms, too little clarity.
If you're working on similar problems (telecom, observability, AI agents), Iβd genuinely like to hear your thoughts.
- What would you improve in this approach?
- Would you trust this in a real NOC?
- Any ideas for scaling this further?
Feel free to drop your questions or suggestions in the comments.
Top comments (0)