Why Agentic AI Needs Observability
AI systems have changed fundamentally.
We are no longer building single-prompt chatbots. We are building agents: systems that plan, call tools, query databases, and make decisions over multiple steps.
The hard problem is no longer “Can the model answer?”
The hard problem is:
“What happened when the agent failed?”
Without observability:
- Tool failures look like hallucinations
- Latency spikes are invisible
- SQL errors are buried
- Partial failures silently corrupt results
This post introduces TalentScout AI, a reference implementation of an observable, secure, and resilient agent system. The use case is simple, but the architecture is realistic and production-oriented.
Table of Contents
- What Observability Means for AI Agents
- Architecture
- Local Development Prerequisites
- Installing Docker
- Running Oracle AI Database Locally
- Installing Oracle SQLcl (MCP Server)
- Installing Ollama (Local LLM Runtime)
- Setting Up the Python Environment
- Database Schema and Seed Data
- Secure Database Access with MCP
- Agent Architecture with LangGraph
- Implementing the Agents
- Adding OpenTelemetry Observability
- Running the Agents
- Viewing Traces in Phoenix
- Troubleshooting and Failure Modes
- Resources
What Observability Means for AI Agents
Observability is not logging.
In agentic systems, observability means being able to reconstruct an execution after the fact, without rerunning it.
An observable agent must capture:
| Signal | Purpose |
| --- | --- |
| Execution order | Understand control flow |
| Tool calls | Identify external dependencies |
| LLM prompts | Debug reasoning errors |
| Generated SQL | Catch unsafe or invalid queries |
| Latency | Find bottlenecks |
| Errors and warnings | Diagnose partial failures |
If you cannot answer “what happened inside the agent?”, you cannot operate it safely.
Architecture
TalentScout AI is built as a graph of agents, not a monolith.
Each agent has one responsibility:
- Web Agent → gathers external context
- DB Agent → queries enterprise data
- Orchestrator → makes the final decision
This separation makes failures visible and traceable.
Local Development Prerequisites
Hardware
- macOS, Linux, or Windows (WSL2 recommended)
- Minimum 16GB RAM
- Docker-capable CPU
Software
- Docker
- Python 3.12+
- Git
Installing Docker
macOS / Windows
Download Docker Desktop: https://www.docker.com/products/docker-desktop
Verify installation:
docker --version
docker compose version
Running Oracle AI Database Locally
Oracle provides a free container image suitable for local development.
Pull the Image
docker pull container-registry.oracle.com/database/free:latest
Run the Container
docker run -d \
--name oracle-ai \
-p 1521:1521 \
-e ORACLE_PWD=oracle \
container-registry.oracle.com/database/free:latest
Verify Startup
docker logs oracle-ai
Wait until you see:
DATABASE IS READY TO USE!
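Optionally, confirm you can connect from inside the container. In the free image, the password set via ORACLE_PWD applies to SYS, SYSTEM, and PDBADMIN, and the default pluggable database service is FREEPDB1 (adjust if your setup differs):
docker exec -it oracle-ai sqlplus system/oracle@//localhost:1521/FREEPDB1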
Installing Oracle SQLcl (MCP Server)
SQLcl acts as the Model Context Protocol server.
Download SQLcl
Install
unzip sqlcl-25.x.x.zip
export PATH=$PATH:/path/to/sqlcl/bin
Verify:
sql -v
Installing Ollama (Local LLM Runtime)
Ollama allows you to run LLMs locally without cloud APIs.
Install Ollama
https://ollama.com/download
Verify:
ollama --version
Pull a Model
ollama pull gemma3:12b
Test:
ollama run gemma3:12b "Hello"
Setting Up the Python Environment
Clone the Repository
git clone https://github.com/harishkotra/talentscoutai/
cd talentscoutai
Create a Virtual Environment
python3.12 -m venv .venv
source .venv/bin/activate
Install Dependencies
pip install -r requirements.txt
Example requirements.txt:
langchain
langchain-community
langchain-ollama
langchain-tavily
langchain-mcp-adapters
langgraph
oracledb
arize-phoenix
openinference-instrumentation-langchain
opentelemetry-sdk
opentelemetry-exporter-otlp
rich
python-dotenv
mcp
Database Schema and Seed Data
Create the Table
CREATE TABLE talent_roster (
id NUMBER GENERATED BY DEFAULT AS IDENTITY,
actor_name VARCHAR2(100),
availability_status VARCHAR2(20)
);
Insert Sample Data
INSERT INTO talent_roster VALUES (DEFAULT, 'Pedro Pascal', 'AVAILABLE');
INSERT INTO talent_roster VALUES (DEFAULT, 'Cillian Murphy', 'AVAILABLE');
COMMIT;
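To confirm the seed data is in place before wiring up the agents, run a quick check in the same session:
SELECT actor_name, availability_status FROM talent_roster;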
Secure Database Access with MCP
Why MCP Exists
Direct database access from agents:
- Leaks credentials
- Breaks auditability
- Grants too much authority to LLMs
MCP solves this by separating reasoning from execution.
The agent requests execution; SQLcl owns execution.
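As a concrete sketch of that boundary: recent SQLcl releases (25.2+) can run as an MCP server via sql -mcp, and langchain-mcp-adapters can spawn it over stdio. The server name and arguments below are illustrative; check the repository's configuration for the exact values.
from langchain_mcp_adapters.client import MultiServerMCPClient

mcp_client = MultiServerMCPClient(
    {
        "oracle": {
            "command": "sql",      # SQLcl binary on PATH
            "args": ["-mcp"],      # start SQLcl in MCP server mode (SQLcl 25.2+)
            "transport": "stdio",  # the agent talks to SQLcl, never to the database directly
        }
    }
)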
Agent Architecture with LangGraph
Agent state is explicit and typed:
from typing import TypedDict

class AgentState(TypedDict):
    request: str
    research_data: str
    db_data: str
    final_report: str
This makes transitions observable and debuggable.
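For orientation, here is a minimal sketch of how the nodes from the next section could be wired into a LangGraph graph; the node names and the db_lookup_node wrapper around the MCP call are illustrative, and the repository may wire things differently:
from langgraph.graph import StateGraph, END

graph = StateGraph(AgentState)
graph.add_node("web_search", web_search_node)      # defined below
graph.add_node("db_lookup", db_lookup_node)        # hypothetical wrapper around the MCP call below
graph.add_node("orchestrator", orchestrator_node)  # defined below

graph.set_entry_point("web_search")
graph.add_edge("web_search", "db_lookup")
graph.add_edge("db_lookup", "orchestrator")
graph.add_edge("orchestrator", END)

app = graph.compile()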
Implementing the Agents
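The node functions below assume a shared local model handle and a Tavily search tool. A minimal setup sketch using the packages from requirements.txt (the model name and result count are assumptions, not the repository's exact values):
from langchain_core.output_parsers import StrOutputParser
from langchain_ollama import ChatOllama
from langchain_tavily import TavilySearch

llm = ChatOllama(model="gemma3:12b")   # the model pulled via Ollama earlier
tavily = TavilySearch(max_results=3)   # reads TAVILY_API_KEY from the environment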
Web Research Agent
async def web_search_node(state):
    # llm and tavily are the shared objects initialized above
    query_prompt = f"Provide actors suitable for: {state['request']}"
    search_query = await (llm | StrOutputParser()).ainvoke(query_prompt)
    result = tavily.invoke({"query": search_query})
    return {"research_data": result}
This agent only collects context; it does not decide.
Database Agent (MCP)
# mcp_client is the configured MCP client (see the MCP section above)
async with mcp_client.session("oracle") as session:
    await session.initialize()
    result = await session.call_tool(
        "run-sqlcl",
        arguments={"sqlcl": sql_script}
    )
Key properties:
- No JDBC in Python
- No credentials in prompts
- SQL execution is mediated
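For context, sql_script in the call above is plain SQLcl input. A minimal illustrative script (the repository's actual script and its connection handling may differ):
sql_script = """
SELECT actor_name, availability_status
FROM talent_roster
WHERE availability_status = 'AVAILABLE';
"""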
Orchestrator Agent
async def orchestrator_node(state):
    prompt = f"""
    Context: {state['research_data']}
    DB Results: {state['db_data']}
    Recommend an available actor.
    """
    report = await (llm | StrOutputParser()).ainvoke(prompt)
    return {"final_report": report}
Adding OpenTelemetry Observability
Instrument LangChain
LangChainInstrumentor().instrument(
tracer_provider=tracer_provider
)
Export Traces
OTLPSpanExporter(
endpoint="http://localhost:6006/v1/traces"
)
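Putting the two snippets together, a minimal end-to-end setup looks roughly like this; the module paths follow the standard OpenTelemetry and OpenInference packages from requirements.txt, and the repository's wiring may differ slightly:
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from openinference.instrumentation.langchain import LangChainInstrumentor

# Send spans to the local Phoenix collector over OTLP/HTTP
tracer_provider = TracerProvider()
tracer_provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:6006/v1/traces"))
)

# Instrument every LangChain / LangGraph call with that provider
LangChainInstrumentor().instrument(tracer_provider=tracer_provider)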
This captures:
- Agent execution order
- Prompts
- SQL
- Errors
- Latency
Running the Agents
python main.py
You should see:
- Web research
- SQL execution
- Final recommendation
Viewing Traces in Phoenix (OpenTelemetry)
Open:
http://localhost:6006
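If the page does not load, make sure Phoenix is actually running; the arize-phoenix package installed earlier ships a standalone server you can launch in a separate terminal (command as commonly documented; adjust if your version differs):
phoenix serve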
You can inspect:
- Each agent span
- Generated SQL
- MCP warnings
- Latency breakdowns
This is the agent’s internal state, preserved.
Troubleshooting and Failure Modes
SQLcl Banner Noise
SQLcl's startup banner and login messages otherwise end up in the tool output the agent parses. Always start it with:
-nolog -silent
MCP Protocol Mismatch
If the MCP client and the SQLcl server disagree on protocol versions, catch the error and recover (for example, by recording the failure in agent state) instead of crashing the whole run.
Silent SQL Output
SQLcl reports errors as plain text in its output rather than raising exceptions, so scan the returned text for Oracle error prefixes:
ORA-
SP2-
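A minimal sketch of such a check (function name and truncation length are illustrative, not from the repository):
def check_sqlcl_output(output: str) -> str:
    # Treat Oracle (ORA-) and SQL*Plus-style (SP2-) error markers as hard failures
    # instead of passing them downstream as "data".
    for marker in ("ORA-", "SP2-"):
        if marker in output:
            raise RuntimeError(f"SQLcl reported an error: {output.strip()[:500]}")
    return output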
Resources
👉 GitHub: https://github.com/harishkotra/talentscoutai
👉 YouTube Video
Agentic AI does not fail because models are weak.
It fails because:
- Systems are opaque
- Failures are invisible
- Security is bolted on too late
By combining:
- Structured agents
- Secure tool boundaries
- End-to-end observability
we move from impressive demos to operable systems.
That is how agentic AI becomes production-ready.

