Speaker: Jimmy Soh @ AWS Amarathon 2025
Summary by Amazon Nova
Key Challenges in LLM Operations
- Tracking usage across tenants and models
- Preventing abuse and prompt injection
- Optimizing cost without sacrificing SLAs
Blind spots in usage, security, and cost can sink LLM operations at scale.
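As a minimal sketch of the first challenge, here is one way per-tenant, per-model token accounting could look; the `UsageTracker` class, tenant names, and model names are illustrative assumptions, not the speaker's implementation:

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical sketch: per-tenant, per-model token accounting.
@dataclass
class UsageTracker:
    # (tenant_id, model_id) -> running input/output token totals
    totals: dict = field(
        default_factory=lambda: defaultdict(lambda: {"input": 0, "output": 0})
    )

    def record(self, tenant_id: str, model_id: str,
               input_tokens: int, output_tokens: int) -> None:
        bucket = self.totals[(tenant_id, model_id)]
        bucket["input"] += input_tokens
        bucket["output"] += output_tokens

    def report(self) -> None:
        for (tenant, model), t in sorted(self.totals.items()):
            print(f"{tenant}/{model}: in={t['input']} out={t['output']}")

tracker = UsageTracker()
tracker.record("acme", "nova-pro", 1200, 350)
tracker.record("acme", "nova-lite", 400, 90)
tracker.report()
```

In production this accounting would typically be emitted as metrics to the monitoring stack rather than held in process memory.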
Real-Time Insights Driving AIOps Decisions
- Monitor every prompt, token, and latency
- Detect anomalies and abuse patterns
- Drive intelligent automation
Live metrics turn anomalies into instant, automated fixes.
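One hedged illustration of anomaly detection over live metrics is a rolling z-score on token throughput; the window size, threshold, and sample values below are assumptions made for the sketch:

```python
import statistics
from collections import deque

# Hypothetical sketch: flag anomalous token rates with a rolling z-score.
class TokenRateDetector:
    def __init__(self, window: int = 60, threshold: float = 3.0):
        self.samples = deque(maxlen=window)  # recent tokens/minute samples
        self.threshold = threshold

    def observe(self, tokens_per_minute: float) -> bool:
        """Return True if this sample looks anomalous versus recent history."""
        anomalous = False
        if len(self.samples) >= 10:  # wait for a minimal baseline
            mean = statistics.fmean(self.samples)
            stdev = statistics.pstdev(self.samples) or 1e-9
            anomalous = abs(tokens_per_minute - mean) / stdev > self.threshold
        self.samples.append(tokens_per_minute)
        return anomalous

detector = TokenRateDetector()
for rate in [900, 950, 920, 980, 940, 910, 930, 960, 945, 925, 15000]:
    if detector.observe(rate):
        print(f"anomaly: {rate} tokens/min")  # e.g. a runaway or abusive client
```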
Fair Pricing Through Smart Observability
- Align cloud spend with true LLM usage
- Identify under-utilized resources and right-size automatically
- Trigger cost-saving actions (scale-to-zero, burst capacity)
Pay only for value – usage-driven metering that right-sizes itself.
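A sketch of what usage-driven metering plus a scaling decision could look like; the prices and thresholds are placeholder assumptions, not real AWS or Toby AI rates:

```python
# Hypothetical sketch: usage-driven metering with a scale-to-zero decision.
PRICE_PER_1K_INPUT = 0.0008   # assumed $/1K input tokens
PRICE_PER_1K_OUTPUT = 0.0032  # assumed $/1K output tokens

def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Meter spend directly from observed token usage."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT \
         + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

def scaling_action(requests_last_hour: int) -> str:
    """Map observed demand to a capacity action."""
    if requests_last_hour == 0:
        return "scale-to-zero"   # idle tenant: release capacity entirely
    if requests_last_hour > 10_000:
        return "burst-capacity"  # spike: add replicas ahead of the queue
    return "steady-state"

print(monthly_cost(12_000_000, 3_000_000))  # -> 19.2 (illustrative)
print(scaling_action(0))                    # -> scale-to-zero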
From Observability to Optimization with AIOps
- Smarter automation drives faster incident resolution
- Continuous cost efficiency without manual tuning
- High-performing AI workloads
Transform LLM observability into intelligent AIOps actions.
Architecture
1. Chat / AI services usage: customers interact with Toby AI
SaaS Cluster:
- Toby AI Services
- Application Performance Monitoring
- Real User Monitoring
- Logs and Metrics Analytics
- Synthetic Monitoring
- Monitoring Rules and Alerts
2. Telemetry data flows to the SaaS self-monitoring cluster, which runs the same stack: Toby AI Services, Application Performance Monitoring, Real User Monitoring, Logs and Metrics Analytics, Synthetic Monitoring, and Monitoring Rules and Alerts
3. Subscription check with the AIOps Orchestrator (sketched below):
- Context Enrichment
- Policy Decision
- Trigger Actions
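A hedged sketch of the orchestrator's enrich, decide, act loop; the alert shape, subscription tiers, and action names are assumptions for illustration, not the talk's actual API:

```python
from dataclasses import dataclass

# Hypothetical sketch of the AIOps Orchestrator pipeline.
@dataclass
class Alert:
    tenant: str
    kind: str          # e.g. "token-spike", "latency-breach"
    value: float

SUBSCRIPTIONS = {"acme": "enterprise", "initech": "free"}  # assumed lookup table

def enrich(alert: Alert) -> dict:
    """Context enrichment: attach the tenant's subscription tier to the alert."""
    return {"alert": alert, "tier": SUBSCRIPTIONS.get(alert.tenant, "free")}

def decide(ctx: dict) -> str:
    """Policy decision: pick an action based on alert kind and tier."""
    alert, tier = ctx["alert"], ctx["tier"]
    if alert.kind == "token-spike":
        return "burst-capacity" if tier == "enterprise" else "throttle"
    if alert.kind == "latency-breach":
        return "run-runbook"
    return "notify-only"

def trigger(action: str, alert: Alert) -> None:
    """Trigger actions: hand off to the DevOps orchestrator (stubbed here)."""
    print(f"[{alert.tenant}] {alert.kind} -> {action}")

for a in (Alert("acme", "token-spike", 15000), Alert("initech", "token-spike", 9000)):
    trigger(decide(enrich(a)), a)
```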
4. Runbook, scale, throttle, and optimize via the DevOps Orchestrator
- GitOps Runbooks (see the sketch below)
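To ground the GitOps runbook idea: remediation is expressed as a commit to a config repository, which a GitOps agent (such as Argo CD or Flux) reconciles into the cluster. The repo path, file layout, and replica field below are assumptions for the sketch:

```python
import pathlib
import subprocess

# Hypothetical sketch of a GitOps runbook step: the fix is a commit,
# not an imperative API call.
def gitops_scale(repo: str, deployment_file: str, replicas: int, reason: str) -> None:
    """Rewrite the desired replica count in the manifest and commit it;
    the GitOps agent reconciles the live cluster to match."""
    path = pathlib.Path(repo) / deployment_file
    updated = []
    for line in path.read_text().splitlines():
        if line.strip().startswith("replicas:"):
            indent = line[: len(line) - len(line.lstrip())]
            updated.append(f"{indent}replicas: {replicas}")
        else:
            updated.append(line)
    path.write_text("\n".join(updated) + "\n")
    subprocess.run(["git", "-C", repo, "commit", "-am",
                    f"aiops: scale to {replicas} ({reason})"], check=True)
    subprocess.run(["git", "-C", repo, "push"], check=True)

# Example (assumed repo and file):
# gitops_scale("/srv/platform-config", "toby-ai/deployment.yaml", 0,
#              "scale-to-zero: idle tenant")
```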