Monitoring: From Black Box to Glass Box

You've built your AI agent. You've configured the tools, crafted a thoughtful system prompt, and deployed it to your users. Job done, right?

Not quite. Once your agent is live, a whole new set of questions emerges: Is it actually working? How fast is it responding? How many tokens is it burning through — and what does that mean for your costs?

That's where the Monitoring and Evaluation tab in Oracle AI Agent Studio comes in. Think of it as your agent's mission control — minus the dramatic countdowns.

Let's walk through how it works.


Before You Start: Run the ESS Job

Monitoring data doesn't appear by magic. Before you can view any meaningful metrics, you need to run the ESS job: Aggregate AI Agent Usage and Metrics.

This job does exactly what it says — it aggregates the usage data and metrics displayed in the Monitoring and Evaluation tab. Oracle recommends running it once or twice per day, so it's worth scheduling it on a regular cadence rather than remembering to kick it off manually.

Set it and (mostly) forget it.
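If you'd rather script the submission than click through Scheduled Processes each time, a REST call along these lines could work. A word of caution: the job package and definition names below are placeholders I've made up for illustration, not documented identifiers, so verify everything against your own pod before relying on this.

```python
import requests
from requests.auth import HTTPBasicAuth

# Sketch only: submitting an ESS job over REST instead of via the
# Scheduled Processes UI. The package/definition names for "Aggregate
# AI Agent Usage and Metrics" are NOT documented here -- treat every
# identifier below as a placeholder to verify on your own pod.

FUSION_HOST = "https://your-pod.fa.ocs.oraclecloud.com"  # placeholder host

payload = {
    "OperationName": "submitESSJobRequest",
    "JobPackageName": "/oracle/apps/ess/path/to/job",   # assumed placeholder
    "JobDefName": "AggregateAIAgentUsageAndMetrics",    # assumed placeholder
}

resp = requests.post(
    f"{FUSION_HOST}/fscmRestApi/resources/latest/erpintegrations",
    json=payload,
    auth=HTTPBasicAuth("integration.user", "secret"),  # use a vaulted credential in practice
    timeout=30,
)
resp.raise_for_status()
print("Submitted ESS request:", resp.json().get("ReqstId"))
```

That said, a recurring schedule set up once in the Scheduled Processes UI remains the simplest option and lines up neatly with Oracle's once-or-twice-daily recommendation.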


What the Monitoring Tab Shows You

Once the ESS job has run, the Monitoring tab gives you an aggregated view across all your agents, published and draft alike. The draft part is worth noting: you get the same metrics while you're still testing, before anything goes live.

You can filter the view by time period: last 1 day, 7 days, 1 month, or 3 months. This flexibility lets you spot trends over time, not just point-in-time snapshots.

At the top level, the dashboard answers the big-picture questions:

  • How many users are engaging with my agents?
  • How many sessions are being initiated?
  • How much latency are users experiencing?
  • How many tokens are being consumed — and what does that mean for cost?

Speaking of tokens: pay close attention to this number. Oracle's pricing is tied to token consumption, so the token count is more than a technical metric; it directly informs your cost management and capacity planning decisions. (No one wants a surprise on the bill. 😒)
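To make that concrete, here's a quick back-of-the-envelope cost model. Both the daily session volume and the per-1,000-token rate are invented placeholders (actual Oracle pricing depends on your contract); the 20,000-tokens-per-session figure comes from the drill-down example later in this post.

```python
# Back-of-the-envelope token cost model. The session volume and the
# per-1k-token rate are made-up placeholders; plug in your own numbers.

TOKENS_PER_SESSION = 20_000   # ~20k tokens for a 2-turn session (example below)
SESSIONS_PER_DAY = 500        # assumed daily volume
COST_PER_1K_TOKENS = 0.01     # placeholder rate per 1,000 tokens

daily_tokens = TOKENS_PER_SESSION * SESSIONS_PER_DAY
monthly_cost = daily_tokens / 1_000 * COST_PER_1K_TOKENS * 30

print(f"Daily tokens:  {daily_tokens:,}")      # 10,000,000
print(f"Monthly cost: ~{monthly_cost:,.2f}")   # ~3,000.00 at the placeholder rate
```

Even at a modest rate, ten million tokens a day is a number worth watching.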


Drilling Down: From Agent Team to Individual Session

The real power of the monitoring view comes from its drill-down capability. Here's how the layers stack up:

Level 1 — Agent Team View

Click on an agent team to see its detailed runs. Each row represents a session — a single end-to-end interaction between a user and the agent.

Key metrics at this level include:

  • Turns — The number of back-and-forth exchanges within a session. Two turns means the user asked two questions and the agent responded twice. Simple, but useful for understanding conversation depth.
  • Session Status — Whether the session completed successfully or hit an error. Keep an eye on error rates; a spike usually means something upstream has changed.
  • Total Tokens Used — As mentioned, this is your cost signal. In a typical example, just 2 turns can consume around 20,000 tokens. That adds up quickly at scale.
  • P99 Latency — The latency (in milliseconds) that 99% of responses fall at or under. A P99 of 16,375ms means 99% of users received their response within about 16.4 seconds. It's a practical measure of worst-case user experience, not just average performance. (See the sketch after this list for how P99 and error rate can be derived from raw session data.)
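For the curious, here's a minimal sketch of how P99 latency and error rate fall out of raw session rows, using the nearest-rank method. The row structure and field names are hypothetical; the Monitoring tab computes and displays these for you.

```python
import math

# Minimal sketch, assuming session rows like those in the agent team view.
# The field names here are hypothetical, invented for illustration.

sessions = [
    {"turns": 2, "status": "COMPLETED", "latency_ms": 4_210,  "tokens": 19_800},
    {"turns": 1, "status": "COMPLETED", "latency_ms": 2_950,  "tokens": 9_400},
    {"turns": 3, "status": "ERROR",     "latency_ms": 16_375, "tokens": 31_200},
]

# Nearest-rank P99: the smallest latency that at least 99% of sessions meet.
latencies = sorted(s["latency_ms"] for s in sessions)
p99 = latencies[math.ceil(0.99 * len(latencies)) - 1]

error_rate = sum(s["status"] == "ERROR" for s in sessions) / len(sessions)

print(f"P99 latency: {p99} ms")          # 16375 ms with this toy data
print(f"Error rate:  {error_rate:.0%}")  # 33%
```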

Level 2 — Session Trace View

Drill into any individual session and you get a detailed trace view — a timeline of exactly what happened, in what order, and how long each step took.

This is where troubleshooting becomes genuinely useful. You can see:

  • Which tools were called and how long each tool execution took
  • When the LLM was invoked and how long the model took to process each request
  • Token usage and latency broken down at the individual LLM call or tool level

This level of granularity is invaluable when you're optimising agent performance. If a tool is consistently slow, it shows up here. If the LLM is the bottleneck, the trace makes it obvious. No guessing required.
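To illustrate the idea, here's a hypothetical trace walker. I'm not aware of a documented trace export format, so the step structure below is invented; the trace view in the Studio answers the same question visually.

```python
# Hypothetical trace walker: given a session trace as a list of timed
# steps, find the dominant latency contributor. The structure is an
# assumption for illustration, not a documented export format.

trace = [
    {"step": "llm_call",  "name": "plan",           "duration_ms": 3_900, "tokens": 8_200},
    {"step": "tool_call", "name": "lookup_invoice", "duration_ms": 9_400, "tokens": 0},
    {"step": "llm_call",  "name": "compose_answer", "duration_ms": 3_075, "tokens": 11_600},
]

slowest = max(trace, key=lambda s: s["duration_ms"])
total_ms = sum(s["duration_ms"] for s in trace)
total_tokens = sum(s["tokens"] for s in trace)

print(f"Slowest step: {slowest['name']} ({slowest['duration_ms']} ms, "
      f"{slowest['duration_ms'] / total_ms:.0%} of the session)")
print(f"Session total: {total_ms} ms, {total_tokens} tokens")
```

In this toy trace, a single tool call accounts for more than half of the session's 16-second wall time, exactly the kind of bottleneck the trace view is designed to surface.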


Why This Matters for Enterprise AI

Monitoring isn't just a nice-to-have — it's a governance and cost control requirement in any serious enterprise deployment. Oracle AI Agent Studio's built-in monitoring gives you the visibility to:

  • Manage costs proactively by tracking token consumption before it becomes a budget conversation
  • Ensure reliability by catching session errors early and resolving them before users notice
  • Optimise performance by identifying slow tools or LLM calls that degrade the user experience
  • Support adoption reporting by providing concrete usage data across your agent portfolio

Whether you're reporting upwards to a CIO or fine-tuning agent logic with your development team, the Monitoring tab speaks both languages — executive summary at the top, engineering detail at the bottom.


Summary

The Monitoring and Evaluation capability in Oracle AI Agent Studio gives you a clear, layered view of how your agents are performing in production. From aggregate usage trends down to individual tool traces, the data is there — you just need to run the ESS job first.

Build it, deploy it, monitor it. That's the full loop.
