DEV Community

shashank ms

A Practical Guide to Using LLMs in Telecommunications

In this guide we build a network incident triage agent that ingests raw syslog and alarm text from cell towers and core routers, then returns a structured severity rating, a probable root cause, and next-step runbook actions. It is aimed at NOC engineers who need to cut through alert noise during outages. Because the agent often processes large syslog dumps, Oxlo.ai's request-based pricing means the cost of a call does not grow with the number of log lines you feed into the context window.

What you'll need

Python 3.10 or newer, an Oxlo.ai API key from https://portal.oxlo.ai, and the OpenAI SDK and Pydantic installed with pip install openai pydantic.

Step 1: Design the incident schema

Start with a Pydantic model that defines the structure downstream automation can rely on. Validating the LLM's reply against it removes fragile string parsing from your pipeline.

from pydantic import BaseModel, Field
from typing import List

class NetworkIncident(BaseModel):
    severity: str = Field(description="One of: CRITICAL, HIGH, MEDIUM, LOW")
    category: str = Field(description="e.g., BGP, RF, Backhaul, Power, Core")
    root_cause: str = Field(description="Short technical summary, max 20 words")
    affected_services: List[str] = Field(description="Impacted cell sectors, VLANs, or service IDs")
    next_steps: List[str] = Field(description="Ordered remediation actions")

Step 2: Write the system prompt

The system prompt carries the SRE knowledge we inject by hand. It instructs the model to emit only valid JSON and constrains it to telecom-specific categories.

SYSTEM_PROMPT = """You are a senior telecom SRE with 10 years of experience in RAN and core networks.
Analyze the provided syslog or alarm text and produce a structured incident assessment.

Rules:
- Output ONLY valid JSON. No markdown, no preamble.
- severity must be one of: CRITICAL, HIGH, MEDIUM, LOW.
- category must be one of: BGP, RF, Backhaul, Power, Core, Transport, Other.
- root_cause must be a single sentence.
- affected_services must be a list of strings.
- next_steps must be an ordered list of concrete remediation actions.

If the log is ambiguous, mark severity as MEDIUM and set root_cause to \"Ambiguous - manual triage required\"."""
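
A prompt can only ask for these rules, so it helps to check them again once the reply is parsed. Here is a minimal, dependency-free sketch of that check. The name check_incident and the error wording are my own, not part of any library, and the allowed values are copied from the rules above.

```python
# Allowed values copied from the system prompt rules.
# Tuples (not sets) so an unexpected unhashable value cannot raise TypeError.
ALLOWED_SEVERITIES = ("CRITICAL", "HIGH", "MEDIUM", "LOW")
ALLOWED_CATEGORIES = ("BGP", "RF", "Backhaul", "Power", "Core", "Transport", "Other")


def check_incident(data: dict) -> list[str]:
    """Return a list of rule violations; an empty list means the reply passed."""
    problems = []
    if data.get("severity") not in ALLOWED_SEVERITIES:
        problems.append(f"invalid severity: {data.get('severity')!r}")
    if data.get("category") not in ALLOWED_CATEGORIES:
        problems.append(f"invalid category: {data.get('category')!r}")
    root_cause = data.get("root_cause")
    if not isinstance(root_cause, str) or not root_cause.strip():
        problems.append("root_cause must be a non-empty string")
    for key in ("affected_services", "next_steps"):
        value = data.get(key)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{key} must be a list of strings")
    return problems
```

If the list comes back non-empty, you could retry the request or route the alarm to manual triage rather than pass a malformed record downstream.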

Step 3: Build the triage client

Wire the prompt to Oxlo.ai using the OpenAI SDK. I use Llama 3.3 70B here because it follows structured instructions reliably and handles technical jargon well.

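import json
import os
from openai import OpenAI

# Read the key from the environment rather than hardcoding it in source.
# OXLO_API_KEY is a variable name chosen for this guide; export it before running.
client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key=os.environ["OXLO_API_KEY"])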

def triage_incident(syslog_text: str) -> dict:
    user_message = f"Syslog:\n{syslog_text}"

    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
        temperature=0.1,
        max_tokens=1024,
    )

    raw = response.choices[0].message.content.strip()
    # Strip accidental markdown fences
    raw = raw.removeprefix("```json").removeprefix("```").removesuffix("```").strip()
    return json.loads(raw)
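
Models sometimes wrap the JSON in a fence or add a sentence before it, even when told not to. The sketch below shows one way to make the parsing step more tolerant. parse_model_json is a hypothetical helper name, and the fallback to the outermost braces is a heuristic I am assuming is acceptable for this use case, not a guarantee.

```python
import json
import re

# Optional markdown fence (three backticks, optionally tagged json) around the payload.
_FENCE = re.compile(r"^`{3}(?:json)?\s*(.*?)\s*`{3}$", re.DOTALL | re.IGNORECASE)


def parse_model_json(raw: str) -> dict:
    """Parse a model reply that may be wrapped in a markdown fence or prose."""
    text = raw.strip()
    match = _FENCE.match(text)
    if match:
        text = match.group(1).strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span in case the model added a preamble.
        start, end = text.find("{"), text.rfind("}")
        if start == -1 or end <= start:
            raise
        return json.loads(text[start : end + 1])
```

You could call parse_model_json(response.choices[0].message.content) in place of the manual strip-and-load lines. If no JSON object can be found, the original json.JSONDecodeError propagates, so the caller can decide whether to retry.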

Step 4: Add context from a knowledge base

To reduce hallucination, prepend relevant runbook snippets based on keyword matches. In production you would replace this with a vector store, but a simple lookup is enough to show the pattern.

KB = {
    "BGP": "BGP peer flaps often indicate upstream provider maintenance or a failing optic on port xe-0/0/1.",
    "RF": "High VSWR alarms usually trace to loose antenna connectors or water ingress in the jumper cable.",
    "POWER": "Rectifier faults at remote sites typically follow battery degradation or AC input swings.",
}

def triage_with_context(syslog_text: str) -> dict:
    hits = [f"{k}: {v}" for k, v in KB.items() if k in syslog_text.upper()]
    kb_block = "\n".join(hits) if hits else "No matching runbook entries."

    user_message = f"""Relevant runbook context:
{kb_block}

Syslog:
{syslog_text}"""

    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
        temperature=0.1,
        max_tokens=1024,
    )

    raw = response.choices[0].message.content.strip()
    raw = raw.removeprefix("```json").removeprefix("```").removesuffix("```").strip()
    return json.loads(raw)
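
One caveat with plain substring matching: a short key such as RF also matches inside unrelated words (the letters "RF" appear in "PERFORMANCE"). A small sketch of token-based matching follows. match_kb is a hypothetical helper, and it assumes KB keys are single alphanumeric tokens like the ones above.

```python
import re


def match_kb(syslog_text: str, kb: dict[str, str]) -> list[str]:
    """Return 'KEY: snippet' lines for KB keys that appear as whole tokens."""
    # Split on anything that is not a letter or digit, so RF_LINK_FAIL
    # yields the tokens RF, LINK and FAIL.
    tokens = set(re.split(r"[^A-Z0-9]+", syslog_text.upper()))
    return [f"{key}: {snippet}" for key, snippet in kb.items() if key.upper() in tokens]
```

This trades recall for precision: it will not match a key embedded in a longer token, so alarms that only mention, say, a rectifier would still need their own KB key or the vector store mentioned earlier.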

Step 5: Wrap it in a CLI

Add a small argparse interface so you can pass raw alarm files to the agent straight from your NOC workstation.

import argparse
import json

def main():
    parser = argparse.ArgumentParser(description="Triage a telecom syslog file via Oxlo.ai")
    parser.add_argument("file", help="Path to syslog or alarm text")
    args = parser.parse_args()

    # Use a context manager so the file handle is closed promptly.
    with open(args.file, "r", encoding="utf-8") as f:
        log_text = f.read()
    result = triage_with_context(log_text)
    print(json.dumps(result, indent=2))

if __name__ == "__main__":
    main()
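
The script above expects a file path. If you would rather pipe alarms in from another command, one possible variant accepts "-" (or no argument) to mean stdin. read_log and build_parser are names I made up for this sketch, and the "-" convention is a common Unix habit rather than anything Oxlo.ai requires.

```python
import argparse
import sys


def read_log(path: str) -> str:
    """Read alarm text from a file, or from stdin when path is '-'."""
    if path == "-":
        return sys.stdin.read()
    # errors="replace" keeps the CLI from crashing on stray non-UTF-8 bytes.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return handle.read()


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Triage a telecom syslog file via Oxlo.ai")
    parser.add_argument(
        "file",
        nargs="?",
        default="-",
        help="Path to syslog or alarm text, or '-' to read from stdin",
    )
    return parser
```

With this in place, main() would call read_log(build_parser().parse_args().file), and something like tail -n 200 /var/log/syslog | python triage.py would work alongside the file form.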

Run it

Create a sample alarm and invoke the script. The agent returns structured JSON you can feed directly into a ticketing system.

$ cat alarm.txt
2024-05-21T14:03:11Z cell-tower-42 RF_LINK_FAIL VSWR=3.4 Antenna-3G-SEC-07

$ python triage.py alarm.txt

Example output:

{
  "severity": "HIGH",
  "category": "RF",
  "root_cause": "High VSWR on Antenna-3G-SEC-07 indicates a physical connector or cable fault.",
  "affected_services": [
    "3G Sector 07"
  ],
  "next_steps": [
    "Dispatch field team to inspect antenna connector on SEC-07",
    "Check jumper cable for water ingress or kinks",
    "Verify torque specs on all RF connectors",
    "Escalate to RAN engineering if VSWR persists after reseating"
  ]
}

Next steps

Wire the JSON output into a PagerDuty or Slack webhook so your NOC receives structured alerts with severity already ranked. If you later need to analyze hours of continuous syslog history in a single shot, switch the model to Kimi K2.6 on Oxlo.ai and take advantage of its 131K context window without worrying about per-token cost escalation.
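
As a starting point for the webhook idea, here is a rough sketch that turns a triage result into a Slack incoming-webhook message using only the standard library. The message layout and both function names are my own choices, and the webhook URL is something you would create in your Slack workspace.

```python
import json
import urllib.request


def build_slack_payload(incident: dict) -> dict:
    """Format a triage result as a Slack incoming-webhook message body."""
    steps = "\n".join(
        f"{i}. {step}" for i, step in enumerate(incident["next_steps"], start=1)
    )
    text = (
        f"[{incident['severity']}] {incident['category']}: {incident['root_cause']}\n"
        f"Affected: {', '.join(incident['affected_services'])}\n"
        f"Next steps:\n{steps}"
    )
    return {"text": text}


def post_to_slack(webhook_url: str, incident: dict) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    body = json.dumps(build_slack_payload(incident)).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping the payload builder separate from the network call makes the formatting easy to test without hitting Slack.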
