DEV Community

Syeed Talha

How to Inject Hidden Runtime Context into AI Agent Tools (LangChain + LangGraph)

When you build a multi-tenant AI agent — one that serves multiple users across multiple companies — you face a tricky problem: how do you make sure each user only sees their own data?

You can't just let the LLM decide what data to fetch. You need to inject trusted, server-side context — like user_id and company_id — directly into the tool at runtime, completely hidden from the user's input.

That's exactly what this article is about. We'll build a document search agent that scopes results to the authenticated user's company, using LangChain's ToolRuntime and context_schema pattern.


The Problem: LLMs Can't Be Trusted With Authorization

Imagine a user types:

"Search documents about invoices"

A naive agent might pass this straight to a database query — but with what company ID? If the user can influence that, they might search another company's documents. That's a serious security hole.

The right approach: inject user_id and company_id from the backend (your auth system, session, JWT token, etc.) — never from the user's message.
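To make the risk concrete, here is a minimal plain-Python sketch (no LangChain involved) contrasting the two designs. The function names, the sample data, and the model_args dict are all hypothetical, made up for illustration.

```python
# Hypothetical in-memory store, for illustration only.
DOCS = {
    "company_abc": ["Invoice system migration guide"],
    "company_xyz": ["Financial audit report"],
}


def naive_search(query: str, company_id: str) -> list[str]:
    """Insecure: company_id is an ordinary argument that the model fills in."""
    return [d for d in DOCS.get(company_id, []) if query.lower() in d.lower()]


def scoped_search(query: str, *, trusted_company_id: str) -> list[str]:
    """Safer: the tenant comes from the backend, never from model output."""
    return [d for d in DOCS.get(trusted_company_id, []) if query.lower() in d.lower()]


# Arguments as a model might emit them after a prompt-injection attempt
# by a user who actually belongs to company_abc.
model_args = {"query": "audit", "company_id": "company_xyz"}

leaked = naive_search(**model_args)
safe = scoped_search(model_args["query"], trusted_company_id="company_abc")

print(leaked)  # another tenant's document comes back
print(safe)    # nothing outside company_abc is reachable
```

The only difference is where the tenant identifier comes from, which is the whole point of the rest of this article.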


What We're Building

A LangChain agent with a search_company_docs tool that:

  • Takes a plain search query from the user
  • Automatically receives user_id and company_id from a hidden runtime context
  • Only returns documents belonging to that company
  • Never exposes the context injection mechanism to the LLM or the user

Prerequisites

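pip install -U langchain langgraph langchain-openai

create_agent and ToolRuntime are part of the LangChain 1.x API, so an older 0.x install will not work. The OpenAI models also expect an OPENAI_API_KEY environment variable.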

Step 1: Define the Models

We define two models: a lightweight one for the agent's reasoning loop and a more capable one for complex tasks. LangChain's init_chat_model initializes any supported model from a "provider:model" string.

from langchain.chat_models import init_chat_model

# Fast, lightweight model for the agent loop
basic_model = init_chat_model("openai:gpt-4o-mini")

# More powerful model for complex reasoning (optional — swap as needed)
advance_model = init_chat_model("openai:gpt-4o")

In this example the agent uses only basic_model. advance_model is never called in the code below; it is defined for cases where you want to escalate to heavier reasoning, e.g. a second-pass summarization step.


Step 2: Set Up a Dummy Database

For this tutorial, we simulate a multi-tenant document store as a plain Python dict. Each key is a company_id, and its value is a list of document names.

COMPANY_DOCUMENTS = {
    "company_abc": [
        "Invoice system migration guide",
        "HR leave policy 2026",
        "AI roadmap document",
    ],
    "company_xyz": [
        "Financial audit report",
        "Marketing strategy Q2",
        "Sales onboarding manual",
    ],
}

In a real application this would be a database query filtered by company_id at the SQL/ORM level.
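As a rough idea of what that could look like, here is a sketch using Python's built-in sqlite3 module. The documents table, its columns, and the search_documents helper are assumptions for illustration, not part of the tutorial's code.

```python
import sqlite3

# In-memory database with a hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (company_id TEXT NOT NULL, title TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO documents (company_id, title) VALUES (?, ?)",
    [
        ("company_abc", "Invoice system migration guide"),
        ("company_abc", "HR leave policy 2026"),
        ("company_xyz", "Financial audit report"),
    ],
)


def search_documents(conn: sqlite3.Connection, company_id: str, query: str) -> list[str]:
    """Return matching titles for exactly one tenant.

    company_id is bound as a query parameter, so the tenant filter is
    applied by the database and cannot be altered by the search text.
    """
    rows = conn.execute(
        "SELECT title FROM documents "
        "WHERE company_id = ? AND title LIKE ? "
        "ORDER BY title",
        (company_id, f"%{query}%"),
    )
    return [title for (title,) in rows]


abc_hits = search_documents(conn, "company_abc", "invoice")
xyz_hits = search_documents(conn, "company_xyz", "invoice")
print(abc_hits)
print(xyz_hits)
```

SQLite's LIKE is case-insensitive for ASCII text by default, which mirrors the lowercase comparison used in the dict-based version. A user-supplied % or _ in the query would widen the match, but only within the same tenant's rows.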


Step 3: Define the Runtime Context

This is the key piece. We define a Context dataclass that holds the values we want to inject into every tool call — server-side, not from user input.

from dataclasses import dataclass

@dataclass
class Context:
    user_id: str
    company_id: str

Think of this as your session payload — the kind of data you'd extract from a verified JWT token or a session cookie on every request. The user never touches this; your backend populates it.
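One way to populate it, sketched under assumptions: the helper below takes claims that your auth layer has already verified and refuses to build a Context if either field is missing. The claim names "sub" and "company_id" and the context_from_claims helper are hypothetical, and frozen=True is an optional hardening choice of this sketch rather than something the tutorial's dataclass uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: tool code cannot reassign the fields
class Context:
    user_id: str
    company_id: str


def context_from_claims(claims: dict) -> Context:
    """Build a Context from claims that were ALREADY verified upstream.

    This function does no signature checking itself; it only validates
    that the fields we depend on are present and are non-empty strings.
    """
    user_id = claims.get("sub")
    company_id = claims.get("company_id")
    if not isinstance(user_id, str) or not user_id:
        raise ValueError("verified claims are missing a user id")
    if not isinstance(company_id, str) or not company_id:
        raise ValueError("verified claims are missing a company id")
    return Context(user_id=user_id, company_id=company_id)


ctx = context_from_claims({"sub": "user_123", "company_id": "company_abc"})
print(ctx)
```

Failing loudly on a missing company_id is deliberate: a request without a tenant should never fall through to an unscoped query.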


Step 4: Build the Tool with ToolRuntime

Here's where the magic happens. Our tool accepts two parameters:

  • query: visible to the LLM; comes from the user's message
  • runtime (typed as ToolRuntime[Context]): invisible to the LLM; injected by the framework at call time

from langchain.tools import ToolRuntime, tool

@tool
def search_company_docs(query: str, runtime: ToolRuntime[Context]) -> str:
    """
    Search company documents.
    """
    # These values come from the runtime context — not from the user
    user_id = runtime.context.user_id
    company_id = runtime.context.company_id

    print("\n===== RUNTIME INFO =====")
    print("User ID:", user_id)
    print("Company ID:", company_id)

    # Fetch only this company's documents
    docs = COMPANY_DOCUMENTS.get(company_id, [])

    # Simple keyword search
    matched_docs = [doc for doc in docs if query.lower() in doc.lower()]

    if matched_docs:
        return (
            f"User {user_id} searched company documents.\n"
            f"Matched documents:\n- " + "\n- ".join(matched_docs)
        )

    return f"No documents found for query: '{query}'"

What makes ToolRuntime special?

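The runtime parameter is stripped from the tool's schema before it's sent to the LLM, so the LLM only sees query: str in the tool definition. It cannot set or override user_id or company_id; the framework injects those values at execution time from the context you provide. One caveat: whatever the tool returns is fed back to the model, so the demo above, which echoes user_id in its result string, does reveal that value. Drop it from the return value if the ID itself is sensitive.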

This keeps authorization in code you control instead of in a prompt the model can be talked out of following.
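The idea can be illustrated without any framework. The sketch below is a simplified stand-in, not LangChain's actual implementation: FakeRuntime, model_visible_params, and execute are hypothetical names. It derives the model-visible parameter list from the function signature, then discards anything the model tries to pass for the hidden parameter.

```python
import inspect
from dataclasses import dataclass


@dataclass
class Context:
    user_id: str
    company_id: str


class FakeRuntime:
    """Hypothetical stand-in for a framework-provided runtime object."""

    def __init__(self, context: Context) -> None:
        self.context = context


def search(query: str, runtime: FakeRuntime) -> str:
    return f"{runtime.context.company_id}:{query}"


def model_visible_params(func) -> list[str]:
    """Names the model would be shown: every parameter except the runtime."""
    sig = inspect.signature(func)
    return [name for name, p in sig.parameters.items() if p.annotation is not FakeRuntime]


def execute(func, model_args: dict, context: Context) -> str:
    """Keep only model-visible arguments, then inject the trusted runtime."""
    allowed = set(model_visible_params(func))
    clean_args = {k: v for k, v in model_args.items() if k in allowed}
    return func(**clean_args, runtime=FakeRuntime(context))


visible = model_visible_params(search)
result = execute(
    search,
    {"query": "invoice", "runtime": "attacker-controlled"},
    Context(user_id="user_123", company_id="company_abc"),
)
print(visible)  # ['query']
print(result)   # company_abc:invoice
```

Even when the model output contains a "runtime" key, it is dropped before the call, so the tenant always comes from the trusted Context.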


Step 5: Create the Agent with context_schema

We wire everything together with create_agent, passing context_schema=Context so the agent knows what shape the runtime context will take.

from langchain.agents import create_agent

agent = create_agent(
    model=basic_model,
    tools=[search_company_docs],
    context_schema=Context,       # <-- tells the agent about our runtime context
)

Step 6: Invoke with Hidden Context

When we call agent.invoke, we pass the user's message in messages (as usual) and the trusted server-side data in context (completely separate).

response = agent.invoke(
    {
        "messages": [
            {
                "role": "user",
                "content": "Search documents about invoice",
            }
        ]
    },
    # This comes from your auth system — never from the user's input
    context=Context(
        user_id="user_123",
        company_id="company_abc",
    ),
)

print("\n===== FINAL RESPONSE =====")
print(response["messages"][-1].content)

The context argument is passed at the framework level, not as part of the LLM conversation. The model sees none of it unless a tool includes it in its return value, and the user cannot tamper with it.


Full Code

from dataclasses import dataclass

from langchain.agents import create_agent
from langchain.chat_models import init_chat_model
from langchain.tools import ToolRuntime, tool

# ── Models ─────────────────────────────────────────────────────────────────────
basic_model = init_chat_model("openai:gpt-4o-mini")
advance_model = init_chat_model("openai:gpt-4o")

# ── Dummy Database ─────────────────────────────────────────────────────────────
COMPANY_DOCUMENTS = {
    "company_abc": [
        "Invoice system migration guide",
        "HR leave policy 2026",
        "AI roadmap document",
    ],
    "company_xyz": [
        "Financial audit report",
        "Marketing strategy Q2",
        "Sales onboarding manual",
    ],
}

# ── Runtime Context ────────────────────────────────────────────────────────────
@dataclass
class Context:
    user_id: str
    company_id: str

# ── Tool ───────────────────────────────────────────────────────────────────────
@tool
def search_company_docs(query: str, runtime: ToolRuntime[Context]) -> str:
    """Search company documents."""
    user_id = runtime.context.user_id
    company_id = runtime.context.company_id

    print("\n===== RUNTIME INFO =====")
    print("User ID:", user_id)
    print("Company ID:", company_id)

    docs = COMPANY_DOCUMENTS.get(company_id, [])
    matched_docs = [doc for doc in docs if query.lower() in doc.lower()]

    if matched_docs:
        return (
            f"User {user_id} searched company documents.\n"
            f"Matched documents:\n- " + "\n- ".join(matched_docs)
        )
    return f"No documents found for query: '{query}'"

# ── Agent ──────────────────────────────────────────────────────────────────────
agent = create_agent(
    model=basic_model,
    tools=[search_company_docs],
    context_schema=Context,
)

# ── Invoke ─────────────────────────────────────────────────────────────────────
response = agent.invoke(
    {
        "messages": [
            {
                "role": "user",
                "content": "Search documents about invoice",
            }
        ]
    },
    context=Context(
        user_id="user_123",
        company_id="company_abc",
    ),
)

print("\n===== FINAL RESPONSE =====")
print(response["messages"][-1].content)

Sample Output

===== RUNTIME INFO =====
User ID: user_123
Company ID: company_abc

===== FINAL RESPONSE =====
I found the following document related to "invoice" in your company's knowledge base:

- Invoice system migration guide

Let me know if you'd like more details about this document!

Notice that company_xyz's documents (Financial audit report, etc.) never appear — even though they exist in the database. The company scoping happens silently at the tool level.
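Because the scoping lives in ordinary code, it can be checked without calling a model at all. The sketch below copies the tool's filtering logic into a plain function (match_docs is a hypothetical name, not part of the tutorial) and asserts the isolation properties directly.

```python
COMPANY_DOCUMENTS = {
    "company_abc": [
        "Invoice system migration guide",
        "HR leave policy 2026",
        "AI roadmap document",
    ],
    "company_xyz": [
        "Financial audit report",
        "Marketing strategy Q2",
        "Sales onboarding manual",
    ],
}


def match_docs(company_id: str, query: str) -> list[str]:
    """Same filtering as the tool body, minus the LangChain wrapper."""
    docs = COMPANY_DOCUMENTS.get(company_id, [])
    return [doc for doc in docs if query.lower() in doc.lower()]


# A tenant finds its own documents.
assert match_docs("company_abc", "invoice") == ["Invoice system migration guide"]
# The same query from another tenant finds nothing.
assert match_docs("company_xyz", "invoice") == []
# A tenant cannot reach another tenant's titles by guessing keywords.
assert match_docs("company_abc", "audit") == []
# An unknown tenant gets an empty result rather than an error.
assert match_docs("unknown_company", "invoice") == []
# Substring matching is literal: the plural does not match the singular title.
assert match_docs("company_abc", "invoices") == []

print("tenant isolation checks passed")
```

The last assertion is worth noting: with this simple substring search, the result depends on the exact query string the model chooses to pass, so "invoices" finds nothing where "invoice" finds a match.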


Why This Architecture Matters

| Concern | Naive Approach | Runtime Context Approach |
| --- | --- | --- |
| Authorization | LLM decides what to fetch | Backend enforces it via context |
| Data isolation | Relies on prompt instructions | Enforced in code, not in prompts |
| Auditability | Hard to trace who fetched what | user_id is always available in the tool |
| Security surface | User input can influence data scope | Context is injected server-side only |
| Multi-tenancy | Fragile, prompt-based | Structurally enforced |

The core insight: never trust the LLM for authorization. The LLM's job is to reason about what the user wants. Your backend's job is to enforce what the user is allowed to access. These are separate concerns and should live in separate layers.
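On the auditability point, having user_id and company_id available inside the tool means every lookup can be recorded. Here is a minimal sketch using the standard logging module in place of the tutorial's print calls; the audit_search helper, the "audit" logger name, and the log line format are assumptions for illustration.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(name)s %(message)s")
audit_log = logging.getLogger("audit")


def audit_search(user_id: str, company_id: str, query: str, hits: int) -> str:
    """Record who searched what, and return the line that was logged."""
    line = f"user={user_id} company={company_id} query={query!r} hits={hits}"
    audit_log.info(line)
    return line


# Inside a tool, the first two values would come from runtime.context.
entry = audit_search("user_123", "company_abc", "invoice", 1)
```

Since the identifiers come from the injected context and not from the conversation, the audit trail reflects the authenticated caller, whatever the prompt said.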


Real-World Integration

In production, you'd populate Context from your auth middleware:

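# FastAPI example (get_current_user is your own auth dependency, not shown here)
from fastapi import Depends, FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    message: str

@app.post("/chat")
async def chat(request: ChatRequest, user=Depends(get_current_user)):
    # ainvoke keeps the event loop free while the agent runs
    response = await agent.ainvoke(
        {"messages": [{"role": "user", "content": request.message}]},
        context=Context(
            user_id=user.id,
            company_id=user.company_id,   # from your auth system
        ),
    )
    return {"reply": response["messages"][-1].content}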

The context is assembled from a verified token — never from the request body.


Wrapping Up

The ToolRuntime + context_schema pattern gives you a clean way to inject trusted server-side data into your AI agent's tools without making it an argument the LLM can set or the user can influence.

This is one of those patterns that seems like a small detail but becomes critical the moment you build anything multi-tenant. Get it right from the start.

If this helped you, drop a ❤️ below. And if you're building multi-tenant agents and running into other challenges, let's talk in the comments!

