When you build a multi-tenant AI agent — one that serves multiple users across multiple companies — you face a tricky problem: how do you make sure each user only sees their own data?
You can't just let the LLM decide what data to fetch. You need to inject trusted, server-side context — like user_id and company_id — directly into the tool at runtime, completely hidden from the user's input.
That's exactly what this article is about. We'll build a document search agent that scopes results to the authenticated user's company, using LangChain's ToolRuntime and context_schema pattern.
The Problem: LLMs Can't Be Trusted With Authorization
Imagine a user types:
"Search documents about invoices"
A naive agent might pass this straight to a database query — but with what company ID? If the user can influence that, they might search another company's documents. That's a serious security hole.
The right approach: inject user_id and company_id from the backend (your auth system, session, JWT token, etc.) — never from the user's message.
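To make the risk concrete, here is a minimal sketch contrasting the two designs in plain Python. The names DOCS, naive_search, and scoped_search are hypothetical and exist only for this illustration; no LangChain is involved yet.

```python
# Hypothetical in-memory store used only for this illustration.
DOCS = {
    "company_abc": ["Invoice system migration guide"],
    "company_xyz": ["Financial audit report"],
}


def naive_search(query: str, company_id: str) -> list[str]:
    """Unsafe: company_id is a tool argument, so the model (and therefore
    the user's prompt) can choose it."""
    return [d for d in DOCS.get(company_id, []) if query.lower() in d.lower()]


def scoped_search(query: str, *, trusted_company_id: str) -> list[str]:
    """Safer: the backend supplies the tenant; the model only controls `query`."""
    docs = DOCS.get(trusted_company_id, [])
    return [d for d in docs if query.lower() in d.lower()]


# A prompt like "search company_xyz for audit" could steer the model into
# producing these arguments for the naive tool:
leaked = naive_search("audit", company_id="company_xyz")

# With the scoped version the tenant comes from the session, whatever the prompt says.
safe = scoped_search("audit", trusted_company_id="company_abc")
```

The rest of the article is about getting the second shape without threading trusted_company_id through every call by hand.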
What We're Building
A LangChain agent with a search_company_docs tool that:
- Takes a plain search query from the user
- Automatically receives user_id and company_id from a hidden runtime context
- Only returns documents belonging to that company
- Never exposes the context injection mechanism to the LLM or the user
Prerequisites
pip install langchain langgraph langchain-openai
Step 1: Define the Models
We use two models here — a lightweight one for the agent's reasoning loop, and a more capable one for complex tasks. With langchain's init_chat_model, you can initialize any supported model by name.
from langchain.chat_models import init_chat_model
# Fast, lightweight model for the agent loop
basic_model = init_chat_model("openai:gpt-4o-mini")
# More powerful model for complex reasoning (optional — swap as needed)
advance_model = init_chat_model("openai:gpt-4o")
In this example the agent uses basic_model. The advance_model is defined for cases where you want to escalate to heavier reasoning, e.g. for a second-pass summarization step.
Step 2: Set Up a Dummy Database
For this tutorial, we simulate a multi-tenant document store as a plain Python dict. Each key is a company_id, and its value is a list of document names.
COMPANY_DOCUMENTS = {
    "company_abc": [
        "Invoice system migration guide",
        "HR leave policy 2026",
        "AI roadmap document",
    ],
    "company_xyz": [
        "Financial audit report",
        "Marketing strategy Q2",
        "Sales onboarding manual",
    ],
}
In a real application this would be a database query filtered by company_id at the SQL/ORM level.
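As a rough sketch of what that could look like, the snippet below uses Python's built-in sqlite3 module with an in-memory database. The documents table, its columns, and the search_documents helper are assumptions made for this example, not part of the tutorial's code.

```python
import sqlite3

# In-memory database standing in for a real multi-tenant store (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (company_id TEXT NOT NULL, title TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO documents (company_id, title) VALUES (?, ?)",
    [
        ("company_abc", "Invoice system migration guide"),
        ("company_abc", "HR leave policy 2026"),
        ("company_xyz", "Financial audit report"),
    ],
)


def search_documents(conn: sqlite3.Connection, company_id: str, query: str) -> list[str]:
    """Return titles for one tenant whose title contains `query`.

    The tenant filter is part of the WHERE clause and both values are bound
    parameters, so neither can be altered by string tricks in `query`.
    Note: `%` and `_` inside `query` still act as LIKE wildcards here.
    """
    rows = conn.execute(
        "SELECT title FROM documents "
        "WHERE company_id = ? AND title LIKE ? "
        "ORDER BY title",
        (company_id, f"%{query}%"),
    ).fetchall()
    return [title for (title,) in rows]
```

SQLite's LIKE is case-insensitive for ASCII by default, which mirrors the lower-cased comparison used later in the tool.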
Step 3: Define the Runtime Context
This is the key piece. We define a Context dataclass that holds the values we want to inject into every tool call — server-side, not from user input.
from dataclasses import dataclass


@dataclass
class Context:
    user_id: str
    company_id: str
Think of this as your session payload — the kind of data you'd extract from a verified JWT token or a session cookie on every request. The user never touches this; your backend populates it.
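Here is one hedged way to turn already-verified token claims into that payload. The claim names are assumptions: "sub" is the conventional JWT subject claim, while "company_id" is a custom claim your identity provider would have to issue. Signature verification is assumed to have happened before this function is called, and context_from_claims is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class Context:
    user_id: str
    company_id: str


def context_from_claims(claims: dict) -> Context:
    """Build a Context from claims that were ALREADY verified upstream.

    Fails closed: if either claim is missing or empty, no Context is created,
    so the agent is never invoked with a blank tenant.
    """
    missing = [key for key in ("sub", "company_id") if not claims.get(key)]
    if missing:
        raise PermissionError(f"token is missing required claims: {missing}")
    return Context(user_id=str(claims["sub"]), company_id=str(claims["company_id"]))


ctx = context_from_claims({"sub": "user_123", "company_id": "company_abc"})
```

Raising instead of defaulting to an empty company_id is the design choice that matters here: a missing tenant should stop the request, not widen the search.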
Step 4: Build the Tool with ToolRuntime
Here's where the magic happens. Our tool accepts two parameters:
- query: visible to the LLM, comes from the user's message
- runtime: ToolRuntime[Context]: invisible to the LLM, injected by the framework at call time
from langchain.tools import ToolRuntime, tool


@tool
def search_company_docs(query: str, runtime: ToolRuntime[Context]) -> str:
    """
    Search company documents.
    """
    # These values come from the runtime context, not from the user
    user_id = runtime.context.user_id
    company_id = runtime.context.company_id

    print("\n===== RUNTIME INFO =====")
    print("User ID:", user_id)
    print("Company ID:", company_id)

    # Fetch only this company's documents
    docs = COMPANY_DOCUMENTS.get(company_id, [])

    # Simple keyword search: case-insensitive substring match on the title
    matched_docs = [doc for doc in docs if query.lower() in doc.lower()]

    if matched_docs:
        return (
            f"User {user_id} searched company documents.\n"
            f"Matched documents:\n- " + "\n- ".join(matched_docs)
        )
    return f"No documents found for query: '{query}'"
What makes ToolRuntime special?
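The runtime parameter is stripped from the tool's schema before it's sent to the LLM. The LLM only sees query: str in the tool definition, so it cannot set or modify user_id or company_id. The framework injects those values at execution time from the context you provide. One caveat: whatever the tool returns goes back to the model, so the "User {user_id} searched..." line in this example does reveal the user ID to the LLM. Drop it from the return value if that matters in your setup.
Keeping authorization data out of the model's reach like this is the secure way to handle it in agentic systems.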
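The idea can be modelled in a few lines of plain Python. This is a toy illustration of the concept only, not how LangChain implements it: FakeRuntime, visible_params, and call_tool are all hypothetical names invented for this sketch.

```python
import inspect
from dataclasses import dataclass


@dataclass
class FakeRuntime:
    """Stand-in for the framework's runtime object (hypothetical)."""
    context: dict


def visible_params(func) -> list[str]:
    """Parameter names that would be advertised to the model."""
    return [name for name in inspect.signature(func).parameters if name != "runtime"]


def call_tool(func, llm_args: dict, runtime: FakeRuntime):
    """Run a tool with model-supplied args plus the trusted runtime."""
    allowed = set(visible_params(func))
    # Drop anything the model sends that is not an advertised parameter,
    # including an attempt to smuggle in its own `runtime` or `company_id`.
    clean = {key: value for key, value in llm_args.items() if key in allowed}
    return func(**clean, runtime=runtime)


def search(query: str, runtime: FakeRuntime) -> str:
    return f"{runtime.context['company_id']}:{query}"


# The model tries to override the tenant; the dispatcher ignores it.
result = call_tool(
    search,
    {"query": "invoice", "runtime": "spoofed", "company_id": "company_xyz"},
    FakeRuntime(context={"company_id": "company_abc"}),
)
```

The point of the sketch is the separation: one channel carries what the model asked for, the other carries what the backend knows, and only the dispatcher joins them.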
Step 5: Create the Agent with context_schema
We wire everything together with create_agent, passing context_schema=Context so the agent knows what shape the runtime context will take.
from langchain.agents import create_agent

agent = create_agent(
    model=basic_model,
    tools=[search_company_docs],
    context_schema=Context,  # <-- tells the agent about our runtime context
)
Step 6: Invoke with Hidden Context
When we call agent.invoke, we pass the user's message in messages (as usual) and the trusted server-side data in context (completely separate).
response = agent.invoke(
    {
        "messages": [
            {
                "role": "user",
                "content": "Search documents about invoice",
            }
        ]
    },
    # This comes from your auth system, never from the user's input
    context=Context(
        user_id="user_123",
        company_id="company_abc",
    ),
)

print("\n===== FINAL RESPONSE =====")
print(response["messages"][-1].content)
The context argument is passed at the framework level, not as part of the LLM conversation. The model never sees it, and the user cannot tamper with it.
Full Code
from dataclasses import dataclass

from langchain.agents import create_agent
from langchain.chat_models import init_chat_model
from langchain.tools import ToolRuntime, tool

# ── Models ─────────────────────────────────────────────────────────────────────
basic_model = init_chat_model("openai:gpt-4o-mini")
advance_model = init_chat_model("openai:gpt-4o")

# ── Dummy Database ─────────────────────────────────────────────────────────────
COMPANY_DOCUMENTS = {
    "company_abc": [
        "Invoice system migration guide",
        "HR leave policy 2026",
        "AI roadmap document",
    ],
    "company_xyz": [
        "Financial audit report",
        "Marketing strategy Q2",
        "Sales onboarding manual",
    ],
}


# ── Runtime Context ────────────────────────────────────────────────────────────
@dataclass
class Context:
    user_id: str
    company_id: str


# ── Tool ───────────────────────────────────────────────────────────────────────
@tool
def search_company_docs(query: str, runtime: ToolRuntime[Context]) -> str:
    """Search company documents."""
    user_id = runtime.context.user_id
    company_id = runtime.context.company_id

    print("\n===== RUNTIME INFO =====")
    print("User ID:", user_id)
    print("Company ID:", company_id)

    docs = COMPANY_DOCUMENTS.get(company_id, [])
    matched_docs = [doc for doc in docs if query.lower() in doc.lower()]

    if matched_docs:
        return (
            f"User {user_id} searched company documents.\n"
            f"Matched documents:\n- " + "\n- ".join(matched_docs)
        )
    return f"No documents found for query: '{query}'"


# ── Agent ──────────────────────────────────────────────────────────────────────
agent = create_agent(
    model=basic_model,
    tools=[search_company_docs],
    context_schema=Context,
)

# ── Invoke ─────────────────────────────────────────────────────────────────────
response = agent.invoke(
    {
        "messages": [
            {
                "role": "user",
                "content": "Search documents about invoice",
            }
        ]
    },
    context=Context(
        user_id="user_123",
        company_id="company_abc",
    ),
)

print("\n===== FINAL RESPONSE =====")
print(response["messages"][-1].content)
Sample Output
===== RUNTIME INFO =====
User ID: user_123
Company ID: company_abc
===== FINAL RESPONSE =====
I found the following document related to "invoice" in your company's knowledge base:
- Invoice system migration guide
Let me know if you'd like more details about this document!
Notice that company_xyz's documents (Financial audit report, etc.) never appear — even though they exist in the database. The company scoping happens silently at the tool level.
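One way to guard this behaviour against regressions is to pull the filtering into a plain function and assert isolation directly, with no model call involved. The search_for_company helper and the test below are hypothetical; they reuse the tutorial's sample data and the same substring match as the tool.

```python
COMPANY_DOCUMENTS = {
    "company_abc": [
        "Invoice system migration guide",
        "HR leave policy 2026",
        "AI roadmap document",
    ],
    "company_xyz": [
        "Financial audit report",
        "Marketing strategy Q2",
        "Sales onboarding manual",
    ],
}


def search_for_company(company_id: str, query: str) -> list[str]:
    """Same filtering as the tool body, minus the framework."""
    docs = COMPANY_DOCUMENTS.get(company_id, [])
    return [doc for doc in docs if query.lower() in doc.lower()]


def test_tenant_isolation() -> None:
    # The same query gives different answers depending on the tenant.
    assert search_for_company("company_xyz", "audit") == ["Financial audit report"]
    assert search_for_company("company_abc", "audit") == []
    # An unknown tenant gets nothing rather than everything.
    assert search_for_company("unknown_company", "audit") == []
```

A test runner such as pytest would pick up test_tenant_isolation by name; calling it directly works just as well.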
Why This Architecture Matters
| Concern | Naive Approach | Runtime Context Approach |
|---|---|---|
| Authorization | LLM decides what to fetch | Backend enforces it via context |
| Data isolation | Relies on prompt instructions | Enforced in code, not in prompts |
| Auditability | Hard to trace who fetched what | user_id is always available in the tool |
| Security surface | User input can influence data scope | Context is injected server-side only |
| Multi-tenancy | Fragile, prompt-based | Structurally enforced |
The core insight: never trust the LLM for authorization. The LLM's job is to reason about what the user wants. Your backend's job is to enforce what the user is allowed to access. These are separate concerns and should live in separate layers.
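The auditability row is easy to act on, because the tool already holds both identifiers. Below is a minimal sketch using Python's standard logging module; the logger name "audit", the message format, and the audited_search helper are assumptions for illustration.

```python
import logging

# Dedicated logger so audit records can be routed separately from app logs.
audit_log = logging.getLogger("audit")
audit_log.setLevel(logging.INFO)


def audited_search(user_id: str, company_id: str, query: str, docs: list[str]) -> list[str]:
    """Filter `docs` by `query` and record who searched what, and for which tenant."""
    matches = [doc for doc in docs if query.lower() in doc.lower()]
    audit_log.info(
        "doc_search user_id=%s company_id=%s query=%r hits=%d",
        user_id,
        company_id,
        query,
        len(matches),
    )
    return matches
```

Inside the real tool, user_id and company_id would come from runtime.context, so every audit record is tied to the authenticated identity and not to anything the model said.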
Real-World Integration
In production, you'd populate Context from your auth middleware:
# FastAPI example (app, ChatRequest, and get_current_user come from your own code)
@app.post("/chat")
async def chat(request: ChatRequest, user=Depends(get_current_user)):
    # Use the async entry point so the model call doesn't block the event loop
    response = await agent.ainvoke(
        {"messages": [{"role": "user", "content": request.message}]},
        context=Context(
            user_id=user.id,
            company_id=user.company_id,  # from your auth system
        ),
    )
    return {"reply": response["messages"][-1].content}
The context is assembled from a verified token — never from the request body.
Wrapping Up
The ToolRuntime + context_schema pattern gives you a clean, secure way to inject trusted server-side data into your AI agent's tools — without leaking it to the LLM or exposing it to users.
This is one of those patterns that seems like a small detail but becomes critical the moment you build anything multi-tenant. Get it right from the start.
If this helped you, drop a ❤️ below. And if you're building multi-tenant agents and running into other challenges, let's talk in the comments!