IT service management generates high volumes of structured and unstructured data, from incident tickets and change requests to CMDB relationships and runbook documentation. Traditional automation handles routing and categorization through rigid rule engines, but these break down when context spans multiple systems or requires nuanced interpretation. Large language models offer a more flexible interface for reasoning over this fragmented data, resolving tickets, and orchestrating remediation workflows without replacing existing ITSM platforms.
Why LLMs Fit ITSM Workflows
ITSM processes are conversational and context-heavy. A single incident might include a user description, stack traces, metric snapshots, related configuration items, and historical resolution notes. LLMs can synthesize this mixed context to classify severity, extract affected services, propose root causes, and draft stakeholder updates. Models that support function calling can query monitoring APIs, search knowledge bases, and create follow-up tasks directly through your service desk interfaces, turning a static playbook into an interactive reasoning agent.
Core Use Cases
- Automated triage: Classify and route incoming tickets based on free-form descriptions and attached logs.
- Root cause analysis: Correlate incident text with change records and monitoring data to surface likely causes.
- Knowledge base Q&A: Let engineers query runbooks and past resolutions using natural language.
- Change risk scoring: Summarize change plans and compare against historical failures to flag high-risk deployments.
- Post-incident summarization: Compress long ticket threads and chat logs into structured timelines and lessons learned.
Architecture Patterns
Production ITSM integrations usually combine retrieval-augmented generation (RAG) with tool use. A pipeline ingests a ticket, enriches it with CMDB data, retrieves relevant runbook sections via an embedding model, then prompts the LLM to answer or invoke an action. Because ITSM contexts can grow quickly, input length becomes a major cost driver on token-based inference platforms.
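The enrich, retrieve, and prompt steps described above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the three-dimensional vectors are toy stand-ins for real embedding output, and the runbook chunks, the CMDB record, and the helper names (cosine, top_k, build_prompt) are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunks, k=2):
    """Return the k runbook chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vector"]), reverse=True)
    return ranked[:k]

def build_prompt(ticket_text, ci_record, retrieved):
    """Assemble ticket, CMDB enrichment, and retrieved runbook text into one prompt."""
    runbook = "\n".join(f"- {c['text']}" for c in retrieved)
    return (
        f"Ticket:\n{ticket_text}\n\n"
        f"Configuration item:\n{ci_record}\n\n"
        f"Relevant runbook sections:\n{runbook}"
    )

# Toy 3-dimensional vectors stand in for real embedding output (assumption).
RUNBOOK_CHUNKS = [
    {"text": "Roll back the last SAP transport if 503s follow a deployment.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Rotate expired TLS certificates on the load balancer.", "vector": [0.1, 0.9, 0.0]},
    {"text": "Restart the print spooler service.", "vector": [0.0, 0.1, 0.9]},
]

query_vector = [0.8, 0.2, 0.0]  # hypothetical embedding of the ticket text
hits = top_k(query_vector, RUNBOOK_CHUNKS, k=2)
prompt = build_prompt(
    "SAP production returns 503 errors after last night's deployment.",
    {"ci_name": "sap-prod", "owner_team": "erp-ops"},  # hypothetical CMDB record
    hits,
)
print(prompt)
```

In a real deployment the vectors would come from an embedding model and a vector store would replace the in-memory list. Capping k is also the simplest lever for keeping input length, and therefore cost, under control.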
Implementing Ticket Triage with Function Calling
The example below uses the OpenAI SDK pointed at Oxlo.ai to classify a ticket, query a CMDB, and assign priority. Oxlo.ai is fully OpenAI SDK compatible and supports tool use on models such as Llama 3.3 70B and Qwen 3 32B.
import os

import openai

client = openai.OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key=os.environ["OXLO_API_KEY"]
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "query_cmdb",
            "description": "Query the CMDB for configuration item details",
            "parameters": {
                "type": "object",
                "properties": {
                    "ci_name": {"type": "string"}
                },
                "required": ["ci_name"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "assign_ticket",
            "description": "Assign ticket to a team and set priority",
            "parameters": {
                "type": "object",
                "properties": {
                    "team_id": {"type": "string"},
                    "priority": {"type": "string", "enum": ["P1", "P2", "P3", "P4"]}
                },
                "required": ["team_id", "priority"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "You are an ITSM assistant. Analyze tickets, query the CMDB when needed, and assign them to the correct team with appropriate priority."},
        {"role": "user", "content": "Users report that the SAP production instance is returning 503 errors after last night's deployment."}
    ],
    tools=tools,
    tool_choice="auto"
)

print(response.choices[0].message)
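Because Oxlo.ai offers no cold starts on popular models, this synchronous triage step avoids warmup latency. Note that the first response will often contain tool calls, such as a query_cmdb request, rather than a final routing decision; your code is expected to execute those calls and send the results back to the model.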
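The request above only exposes the tools; executing them is left to the caller. The sketch below shows one way to dispatch a tool call and build the reply message, assuming the response follows the OpenAI chat completions shape (tool calls carrying an id, a function name, and a JSON string of arguments). The handler bodies are hypothetical stubs, not real CMDB or service desk clients.

```python
import json

def query_cmdb(ci_name):
    """Hypothetical stub; a real version would call your CMDB's API."""
    return {"ci_name": ci_name, "owner_team": "erp-ops", "environment": "production"}

def assign_ticket(team_id, priority):
    """Hypothetical stub; a real version would update the service desk."""
    return {"assigned_to": team_id, "priority": priority, "status": "ok"}

# Map tool names from the schema to local Python callables.
HANDLERS = {"query_cmdb": query_cmdb, "assign_ticket": assign_ticket}

def run_tool_call(call_id, name, arguments_json):
    """Execute one tool call and return the 'tool' message to send back to the model."""
    handler = HANDLERS.get(name)
    if handler is None:
        result = {"error": f"unknown tool: {name}"}
    else:
        try:
            result = handler(**json.loads(arguments_json))
        except (json.JSONDecodeError, TypeError) as exc:
            # Malformed or mismatched arguments are reported to the model, not raised.
            result = {"error": str(exc)}
    return {"role": "tool", "tool_call_id": call_id, "content": json.dumps(result)}

# Simulated tool call, shaped like an entry in response.choices[0].message.tool_calls.
message = run_tool_call("call_1", "query_cmdb", '{"ci_name": "sap-prod"}')
print(message)
```

In a live loop you would append the assistant message and each tool message to the messages list, then call client.chat.completions.create again until the model replies without tool calls. Returning errors as tool output, rather than raising, gives the model a chance to correct its own arguments.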
Choosing Models for ITSM Workloads
Oxlo.ai provides more than 45 models across 7 categories, many of which map directly to ITSM tasks:
- Llama 3.3 70B works well for general ticket classification, routing, and stakeholder communication.
- DeepSeek R1 671B MoE and Kimi K2.6 excel at deep reasoning tasks such as cross-referencing logs, change records, and CMDB topology for root cause analysis.
- Qwen 3 32B is suited for multilingual service desks and agent workflows that span multiple tools.
- Qwen 3 Coder 30B and DeepSeek Coder can review infrastructure-as-code changes or generate remediation scripts during change management.
- BGE-Large and E5-Large embeddings power retrieval over internal knowledge bases and past incident writeups.
Cost Considerations for Long-Context ITSM Tasks
ITSM automation frequently processes long inputs: threaded email chains, log files, JSON configuration dumps, and extensive runbooks. On token-based providers, cost scales with input length, so these long contexts make spend hard to predict. Oxlo.ai uses request-based pricing with one flat cost per API request regardless of prompt length. For ITSM workloads that ingest lengthy incident histories or large documents, this model can be significantly cheaper than token-based alternatives. See https://oxlo.ai/pricing for current plan details.
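To see where a flat per-request price starts to win, it helps to compute a break-even input length. The sketch below does that arithmetic. Every price in it is an arbitrary placeholder chosen for round numbers, not an actual rate from Oxlo.ai or any other provider, so substitute real figures from the relevant pricing pages before drawing conclusions.

```python
def token_based_cost(input_tokens, output_tokens, usd_per_m_input, usd_per_m_output):
    """Cost of one request when billed per million input and output tokens."""
    return (
        input_tokens / 1_000_000 * usd_per_m_input
        + output_tokens / 1_000_000 * usd_per_m_output
    )

def break_even_input_tokens(usd_per_request, output_tokens, usd_per_m_input, usd_per_m_output):
    """Input length above which a flat per-request price is the cheaper option."""
    output_cost = output_tokens / 1_000_000 * usd_per_m_output
    return max(0.0, (usd_per_request - output_cost) / usd_per_m_input * 1_000_000)

# Placeholder prices for illustration only (assumptions, not real rates).
USD_PER_M_INPUT = 1.0
USD_PER_M_OUTPUT = 2.0
USD_PER_REQUEST = 0.01

# A long incident thread: 40,000 input tokens, 500 output tokens.
per_token = token_based_cost(40_000, 500, USD_PER_M_INPUT, USD_PER_M_OUTPUT)
threshold = break_even_input_tokens(USD_PER_REQUEST, 500, USD_PER_M_INPUT, USD_PER_M_OUTPUT)

print(f"token-based: ${per_token:.4f} vs flat: ${USD_PER_REQUEST:.4f}")
print(f"flat pricing is cheaper above roughly {threshold:.0f} input tokens")
```

Under these placeholder numbers, short tickets would be cheaper on token-based billing and long incident histories cheaper on flat billing, which is why the input-length distribution of your own workload matters more than any single example.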
Getting Started with Oxlo.ai
If you already use the OpenAI SDK, switching to Oxlo.ai mainly means changing the base URL, the API key, and the model name. The snippet below streams a knowledge base answer using a model from Oxlo.ai:
import os

import openai

client = openai.OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key=os.environ["OXLO_API_KEY"]
)

stream = client.chat.completions.create(
    model="qwen3-32b",
    messages=[
        {"role": "system", "content": "You are a senior infrastructure engineer. Answer questions using the internal knowledge base."},
        {"role": "user", "content": "How do we roll back a failed Kubernetes deployment in cluster prod-us-east?"}
    ],
    stream=True
)

# Print text as it arrives; some chunks may carry no choices or no content.
for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
print()
Oxlo.ai supports streaming, JSON mode, multi-turn conversations, and vision input where needed, so you can extend the integration to cover chatbots, structured output parsing, and image-based incident evidence without changing your client code.
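For structured output parsing, it is worth validating the model's reply before acting on it. The sketch below assumes a JSON-mode reply (in the OpenAI SDK this is requested with response_format={"type": "json_object"}; whether a given Oxlo.ai model honors it is an assumption to verify) and a hypothetical three-field schema of team_id, priority, and summary that the prompt would need to ask for.

```python
import json
from dataclasses import dataclass

ALLOWED_PRIORITIES = {"P1", "P2", "P3", "P4"}
REQUIRED_FIELDS = {"team_id", "priority", "summary"}  # hypothetical schema

@dataclass
class TriageResult:
    team_id: str
    priority: str
    summary: str

def parse_triage(raw):
    """Validate a JSON-mode reply; raise ValueError on anything unexpected."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    priority = data["priority"]
    if not isinstance(priority, str) or priority not in ALLOWED_PRIORITIES:
        raise ValueError(f"unexpected priority: {priority!r}")
    return TriageResult(str(data["team_id"]), priority, str(data["summary"]))

# Stand-in for response.choices[0].message.content from a JSON-mode request.
sample = '{"team_id": "erp-ops", "priority": "P1", "summary": "SAP prod 503s after deploy"}'
result = parse_triage(sample)
print(result)
```

Rejecting unknown priorities and missing fields at this boundary keeps a malformed reply from silently creating a misrouted ticket; on a ValueError you might retry the request or fall back to manual triage.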
Conclusion
LLMs are becoming standard infrastructure for ITSM teams that need to reason over unstructured tickets, logs, and documentation. By pairing retrieval and function calling with a compatible inference backend, you can automate triage, accelerate root cause analysis, and keep knowledge accessible without replatforming your entire service desk. Oxlo.ai offers an OpenAI-compatible API, request-based pricing that favors long-context ITSM workloads, and a broad model catalog that covers reasoning, coding, and embeddings. It is a practical backend choice for teams building agentic ITSM automation today.