DEV Community

RAKESH THERANI

Building a Full-Stack Agentic AI Data Platform on ClickHouse: A Complete Architecture Guide

A production-grade, end-to-end agentic AI platform — chat UI, self-hosted LLM, MCP server, LLM observability, medallion data architecture, security guardrails, HA, and cost analysis. Same stack ClickHouse uses internally (DWAINE: used by 250+ employees, ~70% of internal analytics use cases covered, 50-70% workload reduction for the data-warehouse team). Adapted for a crypto-derivatives exchange.


Why I'm writing this

The phrase "AI platform" gets thrown around to describe everything from a single chatbot to a set of disconnected APIs. An actual production AI platform requires far more than a model and an interface — it needs data plumbing, semantic grounding, query safety, observability, role-based access, high availability, and a real cost model.

This is the architecture I shipped for a crypto exchange — covering all of those layers. Every component is open-source and battle-tested. Nothing is theoretical:

  • Data layer: PostgreSQL → CDC (Debezium + Kafka) → ClickHouse with medallion architecture (Raw → Staging → Marts)
  • LLM layer: Self-hosted Qwen 2.5 72B (Apache 2.0, all data stays on-prem) + business glossary for domain grounding
  • Tool layer: ClickHouse's official MCP server with bearer-token auth, SSRF protection, schema discovery + safe SQL execution
  • UX layer: LibreChat (open-source, SSO, role-based) — same chat UI ClickHouse acquired and uses for DWAINE
  • Observability layer: Langfuse (open-source, runs on ClickHouse) — every query, response, latency, cost tracked
  • Operations layer: HA cluster, query timeouts, memory caps, RBAC, full audit trail
  • JOIN strategy: tiered approach (pre-joined marts, dictionaries, UNION ALL, runtime fallback) so the LLM never writes a slow query

This is the same Agentic Data Stack ClickHouse open-sourced and uses internally. I'm documenting how to deploy it on top of any production database (this guide uses a crypto exchange as the worked example, but the architecture transfers to retail, fintech, marketing, ops).

If you're building or evaluating an AI platform for your own data — and you want something more substantial than "GPT-4 with a database connection" — this is the deep-dive.

Note: Costs and table examples reference a crypto-derivatives platform (large trade tables, real-time wallet flows). The architectural patterns and effort estimates apply universally. Numbers are real; mileage will vary.


Table of Contents

  1. Executive Summary
  2. What We Are Building
  3. Why ClickHouse
  4. Platform Architecture
  5. Data Consolidation — All DBs Into ClickHouse
  6. How the AI Agent Works
  7. What Questions Can Be Asked — By Team
  8. How Multi-Table Queries Work
  9. ClickHouse MCP Server
  10. LLM Selection
  11. How the LLM Learns Our Tables (No Training Required)
  12. Chat Interface — LibreChat
  13. Observability — Langfuse
  14. Medallion Architecture — Data Organization
  15. Security and Governance
  16. High Availability Architecture
  17. Implementation Roadmap
  18. Cost Analysis
  19. Risk Assessment
  20. Success Metrics
  21. Reference Links

1. Executive Summary

Objective

Build a unified data platform on ClickHouse that consolidates tables from all the crypto exchange databases (trading, wallets, users, risk, compliance) into one place, then layer an AI agent on top so any team member can ask questions in plain English and get instant answers.

The Problem Today

  • Data lives in multiple PostgreSQL databases across different services
  • Getting answers requires writing SQL, knowing which DB to query, or filing a ticket to engineering
  • Analytical queries on production PostgreSQL compete with live trading workloads
  • No single place to query across trading, wallet, user, and compliance data together

The Solution

Any team member types: "Show me top 10 users by futures volume this week who also had withdrawals > $50K"

The AI agent:
1. Understands the question (via business glossary)
2. Generates SQL across trade_future + wallet_transactions tables (both in ClickHouse)
3. Executes via MCP server (read-only, safe)
4. Returns formatted answer with table/chart

This works because all data from different databases is consolidated into ClickHouse tables — the LLM queries structured tables directly, not embeddings or vectors.

What ClickHouse Already Provides

ClickHouse has built a complete open-source Agentic Data Stack with a ready-to-deploy GitHub repo.

We are not building from scratch. We are deploying ClickHouse's proven stack with our data.

Expected Outcomes

| Metric | Target |
|---|---|
| Analytical query speed | 10-100x faster than PostgreSQL for aggregations |
| Production DB load reduction | 60-80% fewer analytical queries hitting production |
| Self-service data access | Any team member can query data without SQL knowledge |
| Cross-database queries | Single query across trading, wallets, users, risk data |
| Time to insight | Seconds instead of hours |

2. What We Are Building

Before (Today)

Trading Ops needs a report:
  → Asks engineering → Engineer writes SQL → Queries prod DB → Formats result → Sends back
  → Time: hours to days
  → Impact: Analytical query slows down production

Compliance needs user transaction history:
  → Files ticket → Engineer joins data from 3 DBs → Manual export → Sends CSV
  → Time: 1-3 days

After (With This Platform)

Trading Ops opens LibreChat:
  → Types: "Top 10 markets by volume this week with liquidation count"
  → AI agent generates SQL, queries ClickHouse, returns answer in 5 seconds
  → No engineering involvement, no production DB impact

Compliance opens LibreChat:
  → Types: "All transactions for user X in last 90 days including spot, futures, and withdrawals"
  → AI agent queries across all consolidated tables, returns complete history
  → Instant, audited, repeatable

Who Uses It

| Team | What They Ask |
|---|---|
| Trading Ops | Volume, market metrics, liquidations, position summaries |
| Risk | Exposure, liquidation cascades, concentration risk |
| Compliance | User transaction history, suspicious patterns, audit trails |
| Finance | Fee revenue, P&L, trading volume trends |
| Product | User activity, feature usage, market popularity |
| Engineering | System metrics, query performance, data quality checks |

3. Why ClickHouse

Production Proof Points in Crypto

Coinhall — Real-time DEX aggregator across 23 blockchains:

  • Processes billions of trade events
  • 20ms candlestick chart queries on massive datasets
  • Replaced PostgreSQL for analytics
  • Similar data model to the crypto exchange (trades, orders, market data)

CryptoHouse — Public ClickHouse-powered crypto analytics:

  • Bitcoin/Ethereum blockchain data at scale

DWAINE — ClickHouse's own internal AI agent (production):

  • 250+ internal users querying data via natural language
  • >200 daily messages across 50-70 daily conversations
  • ~70% of internal analytics use cases covered by the agent
  • ~50-70% workload reduction on the 3-person DWH team
  • Source: ClickHouse — How we made our data warehouse AI-first
  • This is exactly what we are building for the crypto exchange

ClickHouse vs PostgreSQL for Analytics

| Capability | PostgreSQL | ClickHouse |
|---|---|---|
| Storage model | Row-based (OLTP) | Columnar (OLAP) |
| Compression | 3-5x | 10-30x |
| Aggregation speed | Seconds-minutes | Milliseconds-seconds |
| Concurrent analytics | Impacts live trading | Isolated, zero production impact |
| Partitioning | Manual (daily) | Automatic (MergeTree) |
| Materialized views | Periodic refresh | Incremental (real-time triggers) |
| Cross-DB queries | Not possible | All data in one place |

Key Advantage — All Data In One Place

Today, querying across trading DB + wallet DB + user DB requires application-level joins or manual exports. In ClickHouse, all tables from all databases are consolidated — the AI agent (or any analyst) can query across everything in a single SQL statement.

Our largest trading table runs to hundreds of GB in PostgreSQL. With ClickHouse's columnar compression (10-30x), this drops by an order of magnitude, and column-specific queries (e.g., SUM(quantity) WHERE market_name = 'BTCPFC') only read the relevant columns.


4. Platform Architecture

This is ClickHouse's Agentic Data Stack — three layers, all open-source:

                         +------------------+
                         |   ALL TEAMS      |
                         |  (Trading Ops,   |
                         |   Risk, Finance, |
                         |   Compliance,    |
                         |   Product)       |
                         +--------+---------+
                                  |
                  ================|================
                  |     LAYER 1: CHAT (LibreChat)  |
                  |     - Web-based chat UI         |
                  |     - SSO login                 |
                  |     - Role-based access          |
                  |     - Conversation history       |
                  =================|================
                                  |
                  ================|=================
                  |     LAYER 2: LLM + MCP          |
                  |                                  |
                  |  User question (plain English)   |
                  |         |                        |
                  |         v                        |
                  |  Qwen 2.5 LLM (self-hosted)     |
                  |  + Business Glossary             |
                  |         |                        |
                  |         v                        |
                  |  ClickHouse MCP Server           |
                  |  (schema discovery + SQL exec)   |
                  |         |                        |
                  ===========|======================
                             |
                  ===========|======================
                  |  LAYER 3: DATA (ClickHouse)     |
                  |                                  |
                  |  +---------------------------+   |
                  |  | Trading DB tables:        |   |
                  |  |  trade_future             |   |
                  |  |  trade_spot               |   |
                  |  |  orders                   |   |
                  |  +---------------------------+   |
                  |  | Wallet DB tables:         |   |
                  |  |  wallet_transactions      |   |
                  |  |  deposits / withdrawals   |   |
                  |  +---------------------------+   |
                  |  | User DB tables:           |   |
                  |  |  users, kyc_status        |   |
                  |  +---------------------------+   |
                  |  | Risk DB tables:           |   |
                  |  |  positions, margins       |   |
                  |  +---------------------------+   |
                  |  | Pre-aggregated views:     |   |
                  |  |  volume_daily             |   |
                  |  |  pnl_daily                |   |
                  |  |  liquidation_metrics      |   |
                  |  |  candlestick_1m/1h/1d     |   |
                  |  +---------------------------+   |
                  |                                  |
                  ==================================+
                             ^
                             |
                  ===========|======================
                  |  CDC PIPELINES (Debezium+Kafka) |
                  |  Real-time sync from all DBs    |
                  ==================================+
                        ^    ^    ^    ^
                        |    |    |    |
                   +-------+ +------+ +----+ +----+
                  |Trading| |Wallet| |User| |Risk|
                  |  DB   | |  DB  | | DB | | DB |
                  |  (PG) | | (PG) | |(PG)| |(PG)|
                  +-------+ +------+ +----+ +----+

Observability (Langfuse)

Langfuse runs alongside and tracks every LLM interaction:

  • What question was asked, by whom
  • What SQL was generated
  • Whether the answer was correct
  • Token usage and cost per query
  • Langfuse stores its data in ClickHouse itself — no extra DB needed

5. Data Consolidation — All DBs Into ClickHouse

The Core Idea

Every PostgreSQL database at the crypto exchange feeds its tables into ClickHouse via CDC (Change Data Capture). Once in ClickHouse, the AI agent can query across all of them in a single query.

CDC Pipeline

PostgreSQL (WAL) → Debezium Connector → Kafka → ClickHouse Kafka Engine → MergeTree Tables
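The Debezium leg of this pipeline is configured per source database. Here is a minimal sketch of a connector config for the Trading DB — hostnames, credentials, and table lists are illustrative placeholders, not the exchange's real values:

```python
import json

# Illustrative Debezium connector config (placeholder names throughout).
# POSTed to Kafka Connect's /connectors endpoint to start streaming
# PostgreSQL WAL changes into Kafka topics.
connector = {
    "name": "trading-db-cdc",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",                 # built-in logical decoding
        "database.hostname": "trading-db.internal",
        "database.port": "5432",
        "database.user": "cdc_reader",
        "database.dbname": "trading",
        "topic.prefix": "trading",                 # topics become trading.public.<table>
        "table.include.list": "public.trade_future,public.trade_spot,public.orders",
        "snapshot.mode": "initial",                # full snapshot first, then stream WAL
    },
}

print(json.dumps(connector, indent=2))
```

On the ClickHouse side, a Kafka Engine table consumes `trading.public.*` topics and a materialized view writes the rows into the ReplacingMergeTree tables below.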

Tables to Consolidate

| Source DB | Table | Est. Size | ClickHouse Engine | Sync |
|---|---|---|---|---|
| Trading DB | trade_future | hundreds of GB | ReplacingMergeTree | CDC real-time |
| | trade_spot | TBD | ReplacingMergeTree | CDC real-time |
| | orders | TBD | ReplacingMergeTree | CDC real-time |
| Wallet DB | wallet_transactions | TBD | ReplacingMergeTree | CDC real-time |
| | deposits | TBD | MergeTree | CDC real-time |
| | withdrawals | TBD | MergeTree | CDC real-time |
| User DB | users | Small | ReplacingMergeTree | CDC |
| | kyc_status | Small | ReplacingMergeTree | CDC |
| Risk DB | positions | TBD | ReplacingMergeTree | CDC real-time |
| | margins | TBD | ReplacingMergeTree | CDC real-time |
| Config | market_config | Small | ReplacingMergeTree | Daily batch |
| | fee_tiers | Small | ReplacingMergeTree | Daily batch |

Why ReplacingMergeTree

PostgreSQL rows get UPDATEd (e.g., order status changes, position updates). ClickHouse's ReplacingMergeTree deduplicates on primary key during background merges, handling CDC update events correctly.
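A toy model of that behavior, assuming a `_version` column derived from the CDC log position (an assumption for illustration; any monotonically increasing version column works):

```python
def replacing_merge(rows, key_cols=("order_id",), version_col="_version"):
    """Toy model of ReplacingMergeTree semantics: per sorting key, keep only
    the row with the highest version, as a background merge (or FINAL) would."""
    latest = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        if key not in latest or row[version_col] >= latest[key][version_col]:
            latest[key] = row
    return sorted(latest.values(), key=lambda r: tuple(r[c] for c in key_cols))

# An order is INSERTed then UPDATEd twice in PostgreSQL; CDC delivers three
# versions of order 1, and only the newest survives the merge.
cdc_events = [
    {"order_id": 1, "status": "open",    "_version": 100},
    {"order_id": 1, "status": "filled",  "_version": 101},
    {"order_id": 2, "status": "open",    "_version": 102},
    {"order_id": 1, "status": "settled", "_version": 103},
]
print(replacing_merge(cdc_events))
```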

Pre-Aggregated Materialized Views

ClickHouse materialized views are incremental triggers — they process each new row as it arrives (unlike PostgreSQL materialized views, which require a periodic REFRESH):

| Materialized View | Source | What It Computes |
|---|---|---|
| volume_daily | trade_future, trade_spot | Volume per market per day |
| pnl_daily | trades + positions | P&L per user per market per day |
| liquidation_metrics | trade_future (order_type=1003) | Liquidation count, volume, frequency |
| candlestick_1m | trade_future, trade_spot | OHLCV per market per minute |
| candlestick_1h | chart_db.candlesticks (CDC) | OHLCV per market per hour |
| candlestick_1d | chart_db.candlesticks (CDC) | OHLCV per market per day |
| fee_revenue | trades | Fee revenue per market per tier per day |
| withdrawal_summary | withdrawals | Daily withdrawal volume per user |
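To make the "incremental trigger" idea concrete, here is a toy Python model of how a view like volume_daily stays current: each inserted trade updates the aggregate immediately, with no refresh job. (Illustrative only — in ClickHouse this is a CREATE MATERIALIZED VIEW, not application code.)

```python
from collections import defaultdict
from datetime import date

# (market, day) -> notional volume; stands in for the volume_daily view.
volume_daily = defaultdict(float)

def on_insert(trade):
    """Fires once per inserted row, like an MV attached to the trades table."""
    key = (trade["market_name"], trade["day"])
    volume_daily[key] += trade["quantity"] * trade["price"]

for t in [
    {"market_name": "BTCPFC", "day": date(2024, 6, 1), "quantity": 2.0,  "price": 65000.0},
    {"market_name": "BTCPFC", "day": date(2024, 6, 1), "quantity": 0.5,  "price": 64000.0},
    {"market_name": "ETHPFC", "day": date(2024, 6, 1), "quantity": 10.0, "price": 3400.0},
]:
    on_insert(t)

print(dict(volume_daily))  # BTCPFC: 162000.0, ETHPFC: 34000.0
```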

Cross-Database Query Example

Something impossible today (requires joining across separate PostgreSQL databases):

-- "Show users with >$100K futures volume today who also withdrew >$10K"
SELECT
    t.username,
    SUM(t.quantity * t.price) as futures_notional,
    w.total_withdrawn
FROM trade_future t
JOIN (
    SELECT username, SUM(amount) as total_withdrawn
    FROM withdrawals
    WHERE created_at >= today()
    GROUP BY username
    HAVING SUM(amount) > 10000
) w ON t.username = w.username
WHERE t.transaction_time >= today()
GROUP BY t.username, w.total_withdrawn
HAVING futures_notional > 100000
ORDER BY futures_notional DESC

The AI agent generates this SQL automatically from plain English.


6. How the AI Agent Works

The Flow

1. User types question in LibreChat
     "Which users had the most liquidations today?"

2. LLM receives:
   - The question
   - Business glossary (liquidation = order_type 1003, etc.)
   - ClickHouse schema (via MCP list_tables)

3. LLM generates SQL:
     SELECT username, COUNT(*) as liq_count, SUM(quantity * price) as liq_volume
     FROM trade_future
     WHERE order_type = 1003
       AND transaction_time >= today()
       AND username NOT LIKE 'bot_%'
       AND username NOT LIKE 'market_maker_%'
       AND username NOT LIKE 'liquidity_%'
     GROUP BY username
     ORDER BY liq_count DESC
     LIMIT 20

4. MCP Server executes SQL (read-only) against ClickHouse

5. LLM formats result:
     "Today's top users by liquidation count:
      1. user_abc — 12 liquidations, $234K total
      2. user_xyz — 8 liquidations, $156K total
      ..."

6. Langfuse logs: question, SQL, result, latency, tokens used

Key Point — It's All Structured Table Queries

The LLM is not doing anything magical with embeddings or vectors. It is:

  1. Reading the ClickHouse table schema (column names, types)
  2. Using the business glossary to understand domain terms
  3. Writing standard SQL
  4. Executing via MCP server

All our data is in regular ClickHouse tables. The "intelligence" comes from the LLM knowing how to translate English into SQL for our specific schema.
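The six steps can be sketched as a short loop. The LLM and MCP calls are stubbed here, and every function name is illustrative — not LibreChat's or the MCP server's actual API:

```python
GLOSSARY = "liquidation = trade_future rows where order_type = 1003"
SCHEMA = ("trade_future(username String, order_type Int32, "
          "quantity Float64, price Float64, transaction_time DateTime)")

def fake_llm(prompt: str) -> str:
    """Stand-in for the self-hosted Qwen 2.5 call (step 3: English -> SQL)."""
    return ("SELECT username, COUNT(*) AS liq_count FROM trade_future "
            "WHERE order_type = 1003 AND transaction_time >= today() "
            "GROUP BY username ORDER BY liq_count DESC LIMIT 20")

def fake_mcp_run_query(sql: str):
    """Stand-in for the MCP server's run_query tool (step 4: read-only exec)."""
    return [{"username": "user_abc", "liq_count": 12}]

def answer(question: str):
    # Step 2: the prompt bundles the glossary and schema with the question.
    prompt = f"Glossary:\n{GLOSSARY}\n\nSchema:\n{SCHEMA}\n\nQuestion: {question}\nSQL:"
    sql = fake_llm(prompt)
    rows = fake_mcp_run_query(sql)
    return sql, rows            # step 5 would format rows into a chat answer

sql, rows = answer("Which users had the most liquidations today?")
print(rows)
```

In production the loop also logs the prompt, SQL, result, and latency to Langfuse (step 6).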


7. What Questions Can Be Asked — By Team

Trading Operations

| Question | Tables Used | Complexity |
|---|---|---|
| "What is BTC perpetual futures volume today?" | trade_future | Single table |
| "Show top 10 markets by volume this week" | trade_future, trade_spot | Single table each, union |
| "Compare BTC futures volume this week vs last week" | trade_future | Single table, time comparison |
| "Show 1-hour candlestick chart for ETHPFC last 24 hours" | candlestick_1h (view) | Pre-aggregated view |
| "Which markets had the highest spread in the last hour?" | orders | Single table |
| "Show order book depth for BTCPFC right now" | orders | Single table, latest snapshot |
| "Average fill time for limit orders by market today" | orders | Single table |
| "How does today's volume compare to 30-day average?" | volume_daily (view) | Pre-aggregated view |

Risk Management

| Question | Tables Used | Complexity |
|---|---|---|
| "Which users have the largest open positions?" | positions | Single table |
| "Show all liquidation events in the last hour" | trade_future | Single table (order_type=1003) |
| "What percentage of trades were liquidations by market today?" | trade_future | Single table, grouping |
| "List users with concentrated exposure (>80% in one market)" | positions | Single table, aggregation |
| "Show liquidation cascade events — markets where >5 liquidations happened within 1 minute" | trade_future | Single table, window function |
| "Which users traded within 1 minute of a liquidation event?" | trade_future (self-join) | Single table, self-join |
| "Show users with margin utilization above 90%" | positions, margins | Multi-table join |
| "Correlation between liquidation volume and price movement for BTCPFC this week" | trade_future, candlestick_1h | Multi-table join |

Compliance & Audit

| Question | Tables Used | Complexity |
|---|---|---|
| "Show all transactions for user X in the last 90 days" | trade_future, trade_spot, withdrawals, deposits | Multi-table union |
| "List accounts with withdrawal volume > $100K today" | withdrawals | Single table |
| "Find users who deposited and withdrew within 24 hours" | deposits, withdrawals | Multi-table join |
| "Show all KYC-pending users who traded this week" | users, kyc_status, trade_future | Multi-table join |
| "List accounts that received deposits from the same source address" | deposits | Single table, grouping |
| "Show transaction patterns for user X — frequency, volume, timing" | trade_future, trade_spot, withdrawals | Multi-table union + aggregation |
| "Which accounts had unusual withdrawal patterns this week?" | withdrawals | Single table, anomaly detection |
| "Full audit trail: every action by user X since account creation" | trade_future, trade_spot, orders, withdrawals, deposits | Multi-table union (5 tables) |

Finance & Revenue

| Question | Tables Used | Complexity |
|---|---|---|
| "What was total fee revenue yesterday?" | fee_revenue (view) | Pre-aggregated view |
| "Monthly fee revenue breakdown by market" | fee_revenue (view) | Pre-aggregated view |
| "Compare spot vs futures fee revenue this month" | fee_revenue (view) | Pre-aggregated view, grouping |
| "Daily P&L trend for the last 30 days" | pnl_daily (view) | Pre-aggregated view |
| "Which fee tier generates the most revenue?" | trade_future, trade_spot, fee_tiers | Multi-table join |
| "Revenue impact if we reduce maker fees by 10%" | trade_future, trade_spot, fee_tiers | Multi-table join + simulation |
| "Show monthly trading volume trend with MoM growth %" | volume_daily (view) | Pre-aggregated view, window function |

Product & Growth

| Question | Tables Used | Complexity |
|---|---|---|
| "How many unique users traded futures this week?" | trade_future | Single table, COUNT DISTINCT |
| "What is our DAU trend for the last 30 days?" | trade_future, trade_spot | Multi-table union |
| "Which markets have growing vs declining user counts?" | trade_future, trade_spot | Single table, time comparison |
| "New user retention — users who signed up last month, how many traded this month?" | users, trade_future | Multi-table join |
| "Top 10 users by total trading volume all-time" | trade_future, trade_spot | Multi-table union |
| "Average trades per user per day — trend over last 90 days" | trade_future, trade_spot | Single table, aggregation |
| "Geographic distribution of active traders this month" | users, trade_future | Multi-table join |

Question Complexity Summary

| Type | % of Questions | How LLM Handles |
|---|---|---|
| Single table | ~40% | Direct SELECT with WHERE/GROUP BY |
| Pre-aggregated view | ~20% | Query materialized view (fastest) |
| Multi-table JOIN | ~25% | LLM generates JOIN across tables |
| Multi-table UNION | ~10% | LLM unions results from multiple tables |
| Complex (window/self-join) | ~5% | LLM uses advanced SQL patterns |

8. How Multi-Table Queries Work (JOIN Problem & Solution)

The Challenge

Many real questions require data from multiple sources:

"Show me users who had futures liquidations today AND also made withdrawals"

This needs: trade_future (Trading DB) + withdrawals (Wallet DB) + users (User DB)

Problem: ClickHouse is not optimized for runtime JOINs. Unlike PostgreSQL, JOINs in ClickHouse are memory-heavy, single-threaded (hash build phase), and can be slow on large tables. This is a well-known ClickHouse limitation.

Solution: We avoid runtime JOINs with three strategies — the same approach DWAINE (ClickHouse's own AI agent) uses.

Strategy 1 — Pre-Joined Denormalized Marts (Primary Approach)

JOINs are computed at data load time, not at query time. Refreshable Materialized Views periodically run the JOINs in the background and write the results into flat, wide "mart" tables. The AI agent queries these pre-joined tables directly — no JOINs needed at query time.

How it works:

  Staging Layer (normalized tables):
    staging.trade_future  — raw trades from CDC
    staging.users         — user profiles from CDC
    staging.wallets       — wallet transactions from CDC

         | Refreshable Materialized View (runs every 1-5 minutes)
         | Executes the JOINs in background
         v

  Marts Layer (denormalized, pre-joined):
    mart_user_trading_activity
      — username, kyc_status, registered_at,     ← from users
        futures_volume_today, spot_volume_today,  ← from trades
        withdrawal_total_today, deposit_total,    ← from wallets
        liq_count_today, liq_volume_today         ← from trades (order_type=1003)

The AI agent only sees the Marts layer. When someone asks "Show users with >$50K volume who also withdrew >$10K", the LLM queries one flat table — no JOIN:

-- NO JOIN needed — all data is already in one table
SELECT username, futures_volume_today, withdrawal_total_today
FROM mart_user_trading_activity
WHERE futures_volume_today > 50000
  AND withdrawal_total_today > 10000
ORDER BY futures_volume_today DESC

This is ClickHouse's official recommendation for AI agents:

"LLMs should only be exposed to the Marts layer — which is already pre-joined and denormalized. The agent has no need to write JOINs."
How to set up ClickHouse for agentic analytics

Strategy 2 — Dictionaries (For Dimension Lookups)

Small, stable lookup tables (users, markets, fee_tiers) are loaded into in-memory Dictionaries. Instead of a JOIN, the LLM uses dictGet() — an O(1) key-value lookup that is 25x faster than a hash join.

-- INSTEAD OF THIS (slow JOIN):
SELECT t.username, t.quantity, u.kyc_status
FROM mart_trades_futures t
JOIN mart_users u ON t.username = u.username
WHERE t.transaction_time >= today()

-- THE LLM GENERATES THIS (fast dictionary lookup):
SELECT
    username,
    quantity,
    dictGet('users_dict', 'kyc_status', username) AS kyc_status
FROM mart_trades_futures
WHERE transaction_time >= today()

Performance comparison (from ClickHouse benchmarks):

| Method | Time | Memory |
|---|---|---|
| dictGet() with flat layout | 0.044 sec | 84 MB |
| dictGet() with hashed layout | 0.113 sec | 103 MB |
| Parallel Hash JOIN | 0.690 sec | 4.8 GB |
| Standard Hash JOIN | 1.133 sec | 4.4 GB |

Which tables become Dictionaries:

| Dictionary | Source | Size | Lookup Key |
|---|---|---|---|
| users_dict | users table | Small | username |
| markets_dict | market_config | Small | market_name |
| fee_tiers_dict | fee_tiers | Small | tier_id |
| kyc_dict | kyc_status | Small | username |

The business glossary teaches the LLM to use dictGet() instead of JOINs for these dimension tables.

Strategy 3 — Runtime JOINs (Fallback for Ad-Hoc Queries)

For truly ad-hoc questions that don't fit a pre-built mart table, ClickHouse does support runtime JOINs with optimizations:

| Optimization | What It Does |
|---|---|
| join_algorithm = 'auto' | ClickHouse picks the best algorithm automatically |
| Automatic table reordering | Puts the smaller table on the right side (since v24.12) |
| Global join reordering | Optimizes 3+ table JOIN chains (since v25.9) |
| ANY JOIN | Stops at first match — faster when you only need one row |
| Pre-filter with WHERE | Reduce table size before joining |

When this is acceptable:

  • Small-to-medium result sets (both sides < few million rows after filtering)
  • Ad-hoc questions that don't match any existing mart table
  • One-time analysis queries where latency is less critical

How Each Question Type Is Handled

| Question Type | Strategy | Example |
|---|---|---|
| Single-table query | Direct query on mart table | "Top 10 markets by volume today" |
| Pre-aggregated metric | Query materialized view | "Daily volume trend last 30 days" |
| Cross-domain (trading + wallets) | Pre-joined mart table | "Users with high volume AND large withdrawals" |
| Dimension lookup (user info) | Dictionary (dictGet) | "Show KYC status for top traders" |
| Full audit trail (multiple activity types) | UNION ALL (no JOIN needed) | "All activity for user X in 30 days" |
| Rare ad-hoc correlation | Runtime JOIN (fallback) | "Correlation between margin % and liquidation frequency" |

Pre-Joined Mart Tables We Build

| Mart Table | Pre-Joined Sources | Refreshes | Use Case |
|---|---|---|---|
| mart_user_trading_activity | trades + users + wallets | Every 5 min | "Users who traded AND withdrew" |
| mart_user_risk_profile | positions + margins + trades (liquidations) | Every 1 min | "Users at risk of liquidation" |
| mart_liquidation_detail | trades + users + positions | Every 1 min | "Liquidation events with user context" |
| mart_user_full_activity | trades + spot + orders + wallets | Every 5 min | "Complete activity for user X" |
| mart_market_daily_summary | trades + orders + market_config | Every 1 min | "Market performance today" |
| mart_compliance_view | trades + wallets + users + kyc | Every 5 min | "Compliance team queries" |

Example — How Pre-Joining Works End-to-End

Question: "Show KYC-verified users who had liquidations this week with their withdrawal history"

Without pre-joining (SLOW — 3-table runtime JOIN):

-- BAD: Runtime JOIN across 3 large tables
SELECT u.username, u.kyc_status, liq.liq_count, wd.withdrawal_total
FROM staging.users u
JOIN staging.trade_future liq ON u.username = liq.username
JOIN staging.wallets wd ON u.username = wd.username
WHERE ...
-- Slow, memory-heavy, may timeout

With pre-joining (FAST — single table scan):

-- GOOD: Query pre-joined mart table, no JOIN needed
SELECT username, kyc_status, liq_count_week, withdrawal_total_week
FROM mart_user_trading_activity
WHERE kyc_status = 'verified'
  AND liq_count_week > 0
ORDER BY liq_count_week DESC

The JOIN happened in the background when the Refreshable Materialized View populated mart_user_trading_activity. The AI agent never writes a JOIN.

Example — Full Audit Trail (UNION ALL, No JOIN)

For "show all activity for user X", no JOIN is needed — we use UNION ALL across tables, which ClickHouse handles efficiently since each branch scans one table independently:

SELECT * FROM (
    SELECT 'futures_trade' AS activity, transaction_time AS event_time,
           market_name, side, quantity, price, quantity * price AS value_usd
    FROM mart_trades_futures
    WHERE username = 'john_doe' AND transaction_time >= today() - 30

    UNION ALL

    SELECT 'spot_trade', transaction_time, market_name, side,
           quantity, price, quantity * price
    FROM mart_trades_spot
    WHERE username = 'john_doe' AND transaction_time >= today() - 30

    UNION ALL

    SELECT 'withdrawal', created_at, currency, 'out',
           amount, 0, amount
    FROM mart_wallets
    WHERE username = 'john_doe' AND transaction_type = 'withdrawal'
      AND created_at >= today() - 30

    UNION ALL

    SELECT 'deposit', created_at, currency, 'in',
           amount, 0, amount
    FROM mart_wallets
    WHERE username = 'john_doe' AND transaction_type = 'deposit'
      AND created_at >= today() - 30
)
ORDER BY event_time DESC

UNION ALL is not a JOIN — each branch runs independently and results are concatenated. This is fast in ClickHouse.

Summary — JOIN Avoidance Strategy

Question comes in from user
        |
        v
Does a pre-joined mart table cover this?
  YES → Query the mart table directly (no JOIN, fastest)
        |
        NO
        v
Does it need a small dimension lookup (user info, market name)?
  YES → Use dictGet() dictionary lookup (25x faster than JOIN)
        |
        NO
        v
Does it need data from multiple activity types for one user?
  YES → Use UNION ALL across tables (no JOIN, each branch independent)
        |
        NO
        v
Truly ad-hoc cross-table correlation?
  → Use runtime JOIN with auto algorithm (fallback, acceptable for small result sets)
  → Log in Langfuse — if this pattern repeats, build a new mart table for it
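The same routing flow, expressed as a function. In practice the business glossary encodes these rules for the LLM rather than application code; the classification labels here are illustrative:

```python
def route(question_kind: str) -> str:
    """Pick a query strategy for a coarsely classified question.
    Mirrors the decision flow above; labels are placeholders."""
    if question_kind == "covered_by_mart":
        return "query pre-joined mart table (no JOIN)"
    if question_kind == "dimension_lookup":
        return "dictGet() dictionary lookup"
    if question_kind == "multi_activity_single_user":
        return "UNION ALL across activity tables"
    # Fallback: truly ad-hoc cross-table correlation.
    return "runtime JOIN with join_algorithm='auto'; log pattern in Langfuse"

print(route("covered_by_mart"))
print(route("ad_hoc_correlation"))
```

The Langfuse log on the fallback branch is what drives the feedback loop: a repeated ad-hoc pattern becomes the next mart table.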

Performance Comparison

| Approach | Query Time | Memory | When to Use |
|---|---|---|---|
| Pre-joined mart table | Milliseconds | Minimal | ~60% of questions (cross-domain) |
| Dictionary (dictGet) | Milliseconds | Minimal | ~15% of questions (dimension lookup) |
| UNION ALL | 1-2 seconds | Low | ~15% of questions (audit trails) |
| Runtime JOIN (fallback) | 2-10 seconds | High | ~10% of questions (rare ad-hoc) |



9. ClickHouse MCP Server

What It Is

The MCP (Model Context Protocol) server is an open standard that lets LLMs interact with databases safely. ClickHouse has an official MCP server:

What It Does (and Does NOT Do)

| | Description |
|---|---|
| Does | Exposes ClickHouse schema to the LLM (tables, columns, types) |
| Does | Executes read-only SQL queries generated by the LLM |
| Does | Returns results in LLM-readable format |
| Does | Enforces read-only access (no INSERT/UPDATE/DELETE by default) |
| Does NOT | Convert natural language to SQL (that's the LLM's job) |
| Does NOT | Store data or state |

Tools Exposed

| MCP Tool | What It Does |
|---|---|
| list_databases | Lists all databases in ClickHouse |
| list_tables | Lists tables with columns, types, pagination |
| run_query | Executes read-only SQL, returns results |
| run_chdb_select_query | Embedded chDB queries against files/URLs |

Integration Flow

Qwen LLM ←— MCP Protocol (JSON-RPC) —→ ClickHouse MCP Server ←— Native Protocol —→ ClickHouse Tables

The MCP server is the controlled gateway — the LLM can discover schema and run queries, but cannot modify data or access restricted tables.
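As a defense-in-depth illustration, a pre-execution guard on LLM-generated SQL might look like the sketch below. The real enforcement is the MCP server's read-only mode plus a read-only ClickHouse user; this is not the MCP server's actual code:

```python
import re

# Statement keywords that must never appear in agent-generated SQL.
FORBIDDEN = re.compile(
    r"\b(INSERT|ALTER|DROP|TRUNCATE|DELETE|UPDATE|CREATE|GRANT|RENAME|SYSTEM)\b",
    re.IGNORECASE,
)

def is_read_only(sql: str) -> bool:
    """Conservative check: single statement, read-only verb, no DDL/DML keywords."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:            # reject multi-statement payloads outright
        return False
    if FORBIDDEN.search(stripped):
        return False
    return stripped.upper().startswith(("SELECT", "WITH", "SHOW", "DESCRIBE", "EXPLAIN"))

print(is_read_only("SELECT count() FROM trade_future"))            # True
print(is_read_only("DROP TABLE trade_future"))                     # False
print(is_read_only("SELECT 1; DROP TABLE trade_future"))           # False
```

A keyword filter alone is bypassable, which is why it sits in front of, never instead of, database-level read-only grants.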

How the MCP Server Actually Works (Transport & Security)

Transport — Streamable HTTP (not stdio):
The MCP server runs as a persistent HTTP service. LibreChat sends HTTP POST requests to it when the LLM needs to call a tool (list_tables, run_query, etc.). This is different from subprocess-based MCP (stdio) — the server stays running, handles concurrent requests, and is addressable as a microservice within the deployment.

Authentication — Bearer Token:
LibreChat authenticates to the MCP server using a bearer token in the Authorization header. This ensures only LibreChat can call the MCP server — not other services or arbitrary HTTP clients on the same network.
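Concretely, a tool call over streamable HTTP is a JSON-RPC POST with the bearer token in the Authorization header. The endpoint, token, and SQL below are placeholders:

```python
import json

# Illustrative streamable-HTTP MCP tool call, as LibreChat would issue it.
MCP_URL = "http://mcp-clickhouse:8000/mcp"     # placeholder endpoint
TOKEN = "example-bearer-token"                 # placeholder secret

headers = {
    "Authorization": f"Bearer {TOKEN}",
    "Content-Type": "application/json",
}

payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",                    # MCP's JSON-RPC tool-invocation method
    "params": {
        "name": "run_query",
        "arguments": {"query": "SELECT count() FROM trade_future"},
    },
}

# e.g. with requests:  requests.post(MCP_URL, headers=headers, json=payload)
print(json.dumps(payload))
```

Requests without a valid bearer token are rejected before any tool runs.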

SSRF Protection — Domain Whitelist:
LibreChat enforces an allowlist of domains it will make outbound HTTP requests to. This prevents a scenario where the LLM instructs LibreChat to call internal infrastructure URLs (e.g., http://internal-service/admin). Only the MCP server address and ClickHouse Cloud endpoint are whitelisted.

chDB — Query Files and URLs Without ETL:
The run_chdb_select_query tool uses chDB (embedded ClickHouse engine) to query local files, S3 URLs, or Parquet files directly — without loading them into ClickHouse tables first. chDB 4 adds a Pandas-like DataStore API with lazy execution and filter/column pushdown, useful for data scientists who upload files to LibreChat and want to query them immediately.
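To make the chDB path concrete, here is a small helper that builds the kind of SQL `run_chdb_select_query` executes. The Parquet URL and filter below are made-up placeholders; in a real session the resulting string would be handed to chDB's embedded engine (e.g. `chdb.query(sql)`).

```python
def chdb_parquet_query(parquet_url: str, where: str = "1") -> str:
    """Build a ClickHouse SQL statement that reads a Parquet file directly
    via the url() table function -- no table creation or ETL step required."""
    return (
        f"SELECT * FROM url('{parquet_url}', 'Parquet') "
        f"WHERE {where} LIMIT 100"
    )

# Hypothetical uploaded file -- not a real dataset.
sql = chdb_parquet_query(
    "https://example.com/uploads/trades.parquet",
    where="side = 'sell'",
)
```

The same `url()` / `file()` table functions work for CSV and JSON, which is what makes the "upload a file and query it immediately" workflow possible.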

Remote MCP (ClickHouse Cloud):
For ClickHouse Cloud deployments, ClickHouse operates a hosted Remote MCP Server — agents connect directly without self-hosting the MCP service. For self-managed ClickHouse (our case), the MCP server runs as a container alongside ClickHouse.

Compatible Agent Frameworks

ClickHouse has tested its MCP server with 12 agent frameworks:
Agno, DSPy, LangChain, LlamaIndex, PydanticAI, Claude Agent SDK, OpenAI Agents SDK, CrewAI, Google ADK, Microsoft Agent Framework, mcp-agent, Upsonic


10. LLM Selection

Primary: Qwen 2.5 72B (Self-Hosted)

| Factor | Detail |
|---|---|
| License | Apache 2.0 (open source, no vendor lock-in) |
| SQL generation | Top-tier on text-to-SQL benchmarks |
| Multilingual | English, Chinese, Japanese |
| Size | 72B params — accurate enough, runs on a single GPU node |
| Data privacy | All data stays on our infrastructure |
| Cost | Fixed GPU cost, not per-token |

Infrastructure

| Component | Specification |
|---|---|
| GPU instance | g5.12xlarge (4x A10G, 96 GB VRAM) |
| VRAM needed | ~40 GB for 72B INT4 quantized |
| Framework | vLLM (high-throughput inference) |
| Throughput | 30-50 tokens/second |
| Fallback | Qwen 2.5 32B on a smaller GPU |

Alternatives

| Model | Size | Notes |
|---|---|---|
| Qwen 2.5 32B | 32B | Lighter, cheaper GPU, slightly less accurate |
| Llama 3.1 70B | 70B | Meta, strong general purpose |
| DeepSeek V3 | 671B MoE (~37B active params) | Strong on code/SQL |
| Claude Opus 4.7 | API | Best for complex multi-step analysis; use only when data is not restricted to internal infra |

Model selection rule: Use self-hosted Qwen for all queries involving the exchange's trading/user/wallet data (data stays on-prem, zero external API calls). Use Claude Opus 4.7 via API only for non-sensitive analytical work, development, or testing where data residency is not a constraint.
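The routing rule above can be sketched as a small dispatcher. The table prefixes and backend labels here are illustrative placeholders, not part of any framework API:

```python
# Any mart derived from exchange data routes to the self-hosted model.
# Prefixes and backend names are assumptions for this sketch.
SENSITIVE_PREFIXES = ("mart_trades", "mart_wallets", "mart_users", "mart_positions")

def pick_backend(tables_referenced: list[str]) -> str:
    """Route to self-hosted Qwen whenever exchange data is involved;
    an external API model is allowed only for non-sensitive work."""
    if any(t.startswith(SENSITIVE_PREFIXES) for t in tables_referenced):
        return "qwen-2.5-72b-selfhosted"
    return "external-api-model"
```

A check like this belongs in the gateway in front of the LLM, so the residency rule is enforced by code rather than by convention.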

AGENTS.md — Per-Environment Agent Instructions

ClickHouse Assistant and agent frameworks support an AGENTS.md file — a plain text file placed in the agent's working context that injects domain-specific instructions into every session. This is complementary to the business glossary:

| | Business Glossary | AGENTS.md |
|---|---|---|
| Purpose | Maps domain terms to SQL | Sets agent behavior rules |
| Example | "liquidation" = order_type 1003 | "Always exclude market makers from results" |
| Example | "volume" = SUM(quantity) | "Default time window is today() unless specified" |
| Format | YAML key-value pairs | Plain text instructions |
| Effect | Teaches SQL mappings | Shapes how the agent reasons and responds |

Both are injected into the LLM's system prompt at query time — no retraining needed.
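For concreteness, a minimal AGENTS.md might look like the following. Every rule here is an illustrative assumption for this deployment, not prescribed content:

```
# AGENTS.md (illustrative sketch -- adapt rules to your deployment)
You are a read-only SQL analyst for a crypto-derivatives exchange.
- Query only tables in the marts schema.
- Always exclude market makers (username LIKE 'bot_%', 'market_maker_%',
  'liquidity_%') unless the user explicitly asks to include them.
- Default time window is today() unless the user specifies one.
- Always add a LIMIT to exploratory queries.
- If a term is not in the glossary, ask for clarification instead of guessing.
```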

DWAINE Reference

ClickHouse's own DWAINE agent uses Claude via AWS Bedrock. We choose self-hosted Qwen for data privacy and cost control, but the architecture supports swapping LLM backends.


11. How the LLM Learns Our Tables (No Training Required)

Key Point — No Model Training Needed

The LLM does not need to be trained or fine-tuned to understand the exchange's tables. It learns everything it needs at query time through two mechanisms: automatic schema discovery and a business glossary.

Mechanism 1 — MCP Schema Discovery (Automatic)

Every time the LLM receives a question, it calls list_tables via the MCP server. ClickHouse returns the full schema automatically — table names, column names, data types:

MCP returns to LLM automatically:

```
Table: mart_trades_futures
  - username           String
  - market_name        String
  - transaction_time   DateTime
  - side               String
  - quantity           Float64
  - price              Float64
  - order_type         Int32        ← LLM sees this but doesn't know 1003 = liquidation

Table: mart_wallets
  - username           String
  - transaction_type   String       ('deposit', 'withdrawal')
  - amount             Float64
  - currency           String
  - created_at         DateTime

Table: mart_user_trading_activity
  - username           String
  - kyc_status         String
  - futures_volume_today   Float64
  - withdrawal_total_today Float64
  - liq_count_today    Int32
  ...
```

No configuration needed for this. When a new table is added to ClickHouse, the LLM sees it automatically on the next query.

Mechanism 2 — Business Glossary (You Configure This)

The schema alone is not enough. The LLM sees order_type Int32 but doesn't know what 1003 means. The business glossary is a configuration file (not model training) that gets injected into the LLM's prompt:

Business Glossary (business_glossary.yaml):

```
- "liquidation" = trade_future WHERE order_type = 1003
- "market maker" = username LIKE 'bot_%' OR 'market_maker_%' OR 'liquidity_%'
- "perpetual futures" = market_name LIKE '%PFC%'
- "volume" = SUM(quantity) from mart_trades_futures
- "notional" = SUM(quantity * price)
- "BTCPFC" = Bitcoin perpetual futures contract
- "whitelisted user" = user in internal_account_whitelist table
- "P&L" = pnl_daily materialized view
- "candlestick" = candlestick_1m/1h/1d materialized views
- "spot trade" = mart_trades_spot table
- "deposit" = mart_wallets WHERE transaction_type = 'deposit'
- "withdrawal" = mart_wallets WHERE transaction_type = 'withdrawal'
```

This is just a text file that gets added to the LLM's system prompt. You update it anytime — no retraining, no downtime.
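A minimal sketch of that injection step, assuming the glossary has already been parsed into a dict (in production you would load business_glossary.yaml) and the schema text has already come back from MCP's list_tables; names and wording here are illustrative:

```python
# Two illustrative glossary entries; the real file has hundreds.
GLOSSARY = {
    "liquidation": "trade_future WHERE order_type = 1003",
    "volume": "SUM(quantity) from mart_trades_futures",
}

def build_system_prompt(glossary: dict, schema_text: str) -> str:
    """Concatenate instructions, glossary mappings, and the live schema
    into one system prompt -- context injection, no retraining."""
    rules = "\n".join(f'- "{term}" = {sql}' for term, sql in glossary.items())
    return (
        "You are a SQL analyst. Generate read-only ClickHouse SQL.\n\n"
        f"Business glossary:\n{rules}\n\n"
        f"Available tables:\n{schema_text}"
    )

prompt = build_system_prompt(
    GLOSSARY,
    "mart_trades_futures(username, market_name, transaction_time, order_type, ...)",
)
```

Editing the glossary dict (or the YAML file behind it) changes the agent's behavior on the very next query.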

Full Glossary Entries

| Term | Definition | SQL Mapping |
|---|---|---|
| Liquidation | Forced position closure | order_type = 1003 in trade_future |
| Perpetual futures | Non-expiring futures | market_name LIKE '%PFC%' |
| Market maker | Internal trading bots | username LIKE 'bot_%' OR 'market_maker_%' OR 'liquidity_%' |
| Trading volume | Total quantity | SUM(quantity) on trade_future/trade_spot |
| Notional volume | USD value | SUM(quantity * price) |
| Whitelisted user | Excluded from checks | internal_account_whitelist table |
| Deposit | Fiat/crypto incoming | mart_wallets WHERE transaction_type = 'deposit' |
| Withdrawal | Fiat/crypto outgoing | mart_wallets WHERE transaction_type = 'withdrawal' |
| P&L | Realized profit/loss | pnl_daily materialized view |
| Candlestick | OHLCV price bars | candlestick_1m/1h/1d views |
| Spot trade | Immediate exchange | mart_trades_spot table |
| Futures trade | Derivatives trade | mart_trades_futures table |
| ADL | Auto-deleveraging | order_type = 1004 in trade_future |
| Funding fee | Periodic futures fee | mart_funding_payments table |
| Open interest | Total open positions | SUM(quantity) on mart_positions |

What Actually Happens Per Query

```
User types: "Show top liquidated users today"
                    |
                    v
LLM receives in its prompt:
  ┌─────────────────────────────────────────────────┐
  │ 1. System instructions                          │
  │    "You are a SQL analyst for the crypto        │
  │     exchange. Generate ClickHouse SQL.          │
  │     Read-only only."                            │
  │                                                 │
  │ 2. Business glossary (text file)                │
  │    "liquidation = order_type 1003"              │
  │    "market maker = bot_%, market_maker_%,       │
  │     liquidity_%"                                │
  │    ... (all glossary entries)                   │
  │                                                 │
  │ 3. ClickHouse schema (from MCP list_tables)     │
  │    mart_trades_futures: username, market_name,  │
  │    transaction_time, order_type, quantity, ...  │
  │    mart_wallets: username, amount, ...          │
  │    ... (all mart tables)                        │
  │                                                 │
  │ 4. The user's question                          │
  │    "Show top liquidated users today"            │
  └─────────────────────────────────────────────────┘
                    |
                    v
LLM generates SQL using all 3 context sources:
  SELECT username, COUNT(*) AS liq_count, SUM(quantity * price) AS liq_volume
  FROM mart_trades_futures
  WHERE order_type = 1003                    ← from glossary
    AND transaction_time >= today()          ← from schema (knows column type)
    AND username NOT LIKE 'bot_%'            ← from glossary (exclude market makers)
  GROUP BY username
  ORDER BY liq_count DESC LIMIT 20
```

Training vs Context Injection

| | Training / Fine-tuning | Context Injection (What We Do) |
|---|---|---|
| What it is | Modify the model's internal weights | Pass information in the prompt at query time |
| Time to update | Hours-days (retrain model) | Seconds (edit a text file) |
| Cost | Expensive (GPU hours) | Free |
| When schema changes | Must retrain | Automatic (MCP reads live schema) |
| When business terms change | Must retrain | Edit glossary file, instant effect |
| Risk | Can degrade model quality | None |
| What DWAINE uses | No training | Context injection (glossary + MCP) |

When Would You Fine-Tune? (Optional, Phase 5+)

Fine-tuning is not required but can be done later to push accuracy higher:

| Phase | Approach | Expected Coverage |
|---|---|---|
| Weeks 1-12 | MCP schema + business glossary only (no training) | Aim for DWAINE-class: agent handles ~70% of analytics use cases (the other 30% still go to a data engineer) |
| Post Week 12 (optional) | Fine-tune Qwen on successful query pairs from Langfuse logs | Higher coverage + lower per-query latency |

Fine-tuning would train the model on patterns like:

```
Input:  "top liquidated users today"
Output: "SELECT username, COUNT(*) FROM mart_trades_futures WHERE order_type = 1003
         AND transaction_time >= today() GROUP BY username ORDER BY COUNT(*) DESC LIMIT 10"
```

Langfuse collects thousands of these successful question→SQL pairs over time, which become the fine-tuning dataset. But this is an optimization, not a requirement.

How the Glossary Improves Over Time

Coverage and quality grow as the glossary fills in. The exact accuracy curve depends on workload — what we observed (and what ClickHouse reports for DWAINE) is that glossary growth correlates strongly with the share of questions the agent can answer end-to-end without analyst involvement:

```
Week 1:  Glossary has 50 terms   → narrow coverage; agent fields easy single-table questions
Week 4:  Glossary has 100 terms  → most common cross-domain questions handled
Week 8:  Glossary has 200 terms  → DWAINE-class — ~70% of analytics use cases covered
Week 12: Glossary has 300+ terms → covers long-tail domain language
Post-12: Fine-tune on Langfuse logs → tighter SQL, lower latency, higher first-pass success
```

The improvement cycle:

  1. User asks a question → LLM generates wrong SQL
  2. Langfuse logs the failure (question, wrong SQL, error)
  3. Team reviews and adds missing glossary entry
  4. Next time the same question is asked → correct SQL

ClickHouse's published takeaway for DWAINE is that business glossary quality is the #1 driver of agent usefulness — they reached the ~70% use-case-coverage milestone primarily through glossary expansion, not LLM fine-tuning.


12. Chat Interface — LibreChat

What It Is

LibreChat is an open-source chat interface acquired by ClickHouse (November 2025). It is the front-end of the Agentic Data Stack.

  • Self-hosted — no data leaves the exchange's infrastructure
  • Web-based — accessible from any browser
  • Connects to self-hosted Qwen LLM backend
  • MCP plugin connects LLM to ClickHouse
  • Conversation history and bookmarking
  • Supports tables and chart rendering in responses

Role-Based Access

| Role | Sees | Example Questions |
|---|---|---|
| Trading Ops | All trading data, metrics | "Top markets by volume this week" |
| Risk | Positions, liquidations, exposure | "Users with largest open positions" |
| Compliance | User transactions, audit trails | "All transactions for user X in Q1" |
| Finance | Revenue, fees, P&L | "Monthly fee revenue by market" |
| Product | User activity, market stats | "DAU trend for futures last 30 days" |

Supporting Services — Why Each Exists

LibreChat is not a standalone service — it depends on four supporting components:

MongoDB — Conversation History Store:
LibreChat is stateless by design. MongoDB stores everything that needs to persist: all conversation threads, message history, user accounts, bookmarks, and presets. Without MongoDB, every browser refresh loses the conversation. MongoDB is chosen because LibreChat's data model is document-oriented (conversations are nested JSON objects with variable structure), not relational.

Meilisearch — Conversation Search Index:
Meilisearch provides full-text search over conversation history. When a user types in the LibreChat search bar to find a past query ("find my liquidation analysis from last week"), Meilisearch returns results in milliseconds. It maintains a search index that mirrors the MongoDB conversations.

pgvector + RAG API — Document Query:
LibreChat supports document uploads (PDFs, CSVs, text files). The RAG API processes uploaded files: it chunks the text, generates embeddings using an embedding model, and stores them in pgvector (PostgreSQL with vector extension). When a user asks a question while a document is attached, the RAG API retrieves the most relevant chunks from pgvector and injects them into the LLM prompt — enabling "ask questions about this document" without fine-tuning.

Qwen Connection — OpenAI-Compatible API:
LibreChat connects to Qwen via the OpenAI-compatible API that vLLM exposes. vLLM serves Qwen 2.5 72B with an endpoint that speaks the same protocol as OpenAI's API (/v1/chat/completions). From LibreChat's perspective, Qwen is just another OpenAI-compatible provider — no custom integration needed. Switching to a different model (e.g., Llama, DeepSeek) only requires changing the endpoint URL.
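A hedged sketch of the request shape LibreChat sends to that endpoint. The host, port, and model identifier below are placeholders for this deployment; the payload structure follows the standard OpenAI chat-completions format that vLLM implements:

```python
import json

# Placeholder endpoint -- whatever host/port your vLLM server listens on.
VLLM_URL = "http://qwen-gpu-node:8000/v1/chat/completions"

payload = {
    "model": "Qwen/Qwen2.5-72B-Instruct",   # assumed model id served by vLLM
    "messages": [
        {"role": "system", "content": "You are a SQL analyst. Read-only only."},
        {"role": "user", "content": "Show top liquidated users today"},
    ],
    "temperature": 0.1,    # low temperature: more deterministic SQL
    "max_tokens": 1024,
}
body = json.dumps(payload)
# This body is POSTed to VLLM_URL with Content-Type: application/json.
# Swapping to Llama or DeepSeek only changes VLLM_URL and the "model" field.
```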

Live Demo

ClickHouse operates a public demo at https://llm.clickhouse.com (AgentHouse) showing this exact flow on 37 public datasets. Our deployment is the same architecture with the exchange's private data.


13. Observability — Langfuse

What It Is

Langfuse (acquired by ClickHouse, January 2026) is an LLM observability platform. It tracks every interaction between users and the AI agent.

What It Tracks

| Metric | Why It Matters |
|---|---|
| Every question asked | Know what teams need |
| SQL generated | Debug wrong answers |
| Latency per query | Monitor performance |
| Token usage | Track compute cost |
| Accuracy (correct/incorrect) | Measure improvement |
| User identity | Compliance audit trail |

Why This Matters

Without Langfuse, the AI agent is a black box. With it:

  • We know when answers are wrong and can fix the glossary
  • We measure accuracy improvement over time (target: 70% → 85%)
  • Compliance can audit every question and data access
  • We track cost per team/query type

How Traces Actually Flow (LibreChat → Langfuse → ClickHouse)

Langfuse has a built-in integration in LibreChat — no code changes or custom instrumentation needed. LibreChat automatically sends every LLM interaction to Langfuse via three environment variables (LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, LANGFUSE_BASE_URL). Once set, every prompt and response is traced automatically.
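For reference, the wiring is just those three variables in LibreChat's environment (the key values below are placeholders; the variable names are the real ones):

```
# LibreChat .env -- values are placeholders
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_BASE_URL=http://langfuse-web:3000
```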

Async trace pipeline — why Redis is needed:
LibreChat does not write traces to ClickHouse directly. It writes trace events to a Redis queue. A separate Langfuse worker process reads from the Redis queue and writes the processed traces to ClickHouse. This decouples trace ingestion from query execution — a slow Langfuse write never adds latency to the user's chat response.

Why Langfuse uses ClickHouse as its own backend:
Langfuse stores trace events (prompts, responses, latency, token counts) in ClickHouse because traces are time-series append-only data — exactly the workload ClickHouse is optimized for. The Langfuse analytics dashboard (cost per team, accuracy trends, slow queries) runs SQL aggregations over these traces in milliseconds. Metadata (projects, users, API keys) is stored in PostgreSQL. Binary assets (uploaded files, large trace payloads) are stored in MinIO (S3-compatible object storage).

The Shared ClickHouse Instance

The same ClickHouse instance serves two roles simultaneously:

| Role | Database | What It Stores |
|---|---|---|
| Agent data warehouse | marts, staging, raw | Business data — trades, wallets, users, risk |
| Langfuse event store | langfuse | LLM traces — prompts, SQL, latency, token counts |

This is intentional. It means you can write SQL that crosses both roles — for example: "which teams generated the most incorrect SQL this week, and what business tables were they querying?" — joining Langfuse trace data with business context in a single query.
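A sketch of what such a cross-role query could look like. Langfuse's actual trace schema differs by version, so the table and column names below are illustrative placeholders, not the real schema:

```sql
-- Illustrative only: treat langfuse.traces and its columns as placeholders.
SELECT
    t.user_id,
    count() AS failed_queries
FROM langfuse.traces AS t
WHERE t.timestamp >= now() - INTERVAL 7 DAY
  AND t.metadata['sql_status'] = 'error'
GROUP BY t.user_id
ORDER BY failed_queries DESC
LIMIT 10
```

Because both databases live on the same ClickHouse instance, this runs as one ordinary query with no cross-system export step.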


14. Medallion Architecture — Data Organization

ClickHouse recommends a three-layer data organization for AI agents (from their setup guide):

```
+------------------+     +------------------+     +------------------+
|   RAW DATABASE   | --> | STAGING DATABASE | --> |  MARTS DATABASE  |
|                  |     |                  |     |                  |
| Untransformed    |     | Deduplicated     |     | Curated tables   |
| CDC events from  |     | Normalized       |     | Aggregated views |
| all PostgreSQL   |     | Cleaned          |     | AI agent queries |
| databases        |     |                  |     | THIS layer only  |
+------------------+     +------------------+     +------------------+
   (ingestion)            (transformation)           (consumption)
```

Why Three Layers

| Layer | Purpose | Who Accesses |
|---|---|---|
| Raw | Unmodified CDC events, full history | Data engineers only |
| Staging | Deduplicated, normalized, cleaned | Data engineers, advanced analysts |
| Marts | Curated, aggregated, business-ready | AI agent and all users |

The AI agent only accesses the Marts layer — curated, pre-validated tables with clear column names and business meaning. This reduces LLM errors and prevents access to raw/sensitive data.

Marts Tables (What the AI Agent Sees)

| Marts Table | Source | Description |
|---|---|---|
| mart_trades_futures | trade_future (staging) | Clean futures trades, no internal accounts |
| mart_trades_spot | trade_spot (staging) | Clean spot trades |
| mart_orders | orders (staging) | Order history with status |
| mart_wallets | wallet_transactions (staging) | Deposits, withdrawals, transfers |
| mart_users | users + kyc (staging) | User profiles (PII masked) |
| mart_positions | positions (staging) | Current open positions |
| mart_volume_daily | Materialized view | Daily volume by market |
| mart_pnl_daily | Materialized view | Daily P&L by user/market |
| mart_liquidations | Materialized view | Liquidation events and metrics |
| mart_candlesticks | Materialized view | OHLCV candles (1m/1h/1d) |
| mart_fee_revenue | Materialized view | Fee revenue by market/tier |

15. Security and Governance

Data Access Controls

| Control | Implementation |
|---|---|
| Authentication | SSO integration (existing corporate identity) |
| Authorization | Role-based — teams see only permitted mart tables |
| Query restrictions | MCP server enforces read-only (no mutations) |
| AI agent access | Marts layer only (no raw or staging data) |
| PII protection | User PII masked in marts tables |
| Audit trail | Every query logged in Langfuse with user identity |
| Network | ClickHouse in private VPC, no public access |

LLM Role Resource Guardrails

The LLM database user must have hard resource limits to prevent runaway agent queries from affecting the cluster. ClickHouse's own agentic analytics setup guide prescribes these exact values:

| Guardrail | Limit | Why |
|---|---|---|
| Query timeout | 30 seconds | Agent queries that scan too much abort cleanly |
| Memory per query | 2 GB | Prevents a single agent query from exhausting node memory |
| Max rows scanned | 100 million | Forces the LLM to use indexes and partitions — catches unfiltered full scans |
| Max bytes scanned | 5 GB | Secondary byte-level cap alongside the row limit |
| Max threads | 4 CPUs | Limits CPU contention with other workloads |
| Access type | SELECT-only on marts.* | No mutations, no access to raw or staging layers |

The LLM role gets SELECT access only on the marts schema — never on raw or staging. This is a hard access control, not just a glossary convention. If the LLM generates a query against a staging table, the database rejects it.
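These guardrails map to ordinary ClickHouse DDL. A sketch, with the numeric limits taken from the table above; the profile name, user name, and password are placeholders:

```sql
-- Sketch: settings profile + restricted user for the agent (names are placeholders).
CREATE SETTINGS PROFILE llm_agent_profile SETTINGS
    max_execution_time = 30,            -- 30 s query timeout
    max_memory_usage   = 2000000000,    -- ~2 GB per query
    max_rows_to_read   = 100000000,     -- 100 M row scan cap
    max_bytes_to_read  = 5000000000,    -- ~5 GB byte cap
    max_threads        = 4;             -- CPU contention limit

CREATE USER llm_agent IDENTIFIED BY '...'
    SETTINGS PROFILE 'llm_agent_profile';

GRANT SELECT ON marts.* TO llm_agent;   -- nothing on raw.* or staging.*, no mutations
```

With only this grant in place, any query the LLM emits against raw or staging fails at the database layer regardless of what the prompt says.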

LLM Data Safety

  • Self-hosted Qwen 2.5 — no data sent to external APIs
  • Business glossary contains schema mappings, not actual data
  • LLM sees query results only — no direct database access
  • All interactions logged and auditable via Langfuse

Compliance

  • Data residency: ClickHouse deployed in the same region as production OLTP databases
  • Retention: Configurable TTL per table
  • Right to erasure: CDC propagates DELETE events from PostgreSQL
  • Access logs: Full audit trail via Langfuse

16. High Availability Architecture

ClickHouse Cluster (Production Region)

```
                +-------------------+
                |   Load Balancer   |
                +----+---------+----+
                     |         |
            +--------v---+ +---v--------+
            | CH Node 1  | | CH Node 2  |
            | (Shard 1,  | | (Shard 1,  |
            |  Replica 1)| |  Replica 2)|
            +------------+ +------------+

            ClickHouse Keeper (3 nodes for consensus)
```

Deployment Options

| Option | Description | Est. Monthly Cost |
|---|---|---|
| ClickHouse Cloud | Fully managed, auto-scaling | ~$2,000-5,000/mo |
| Self-hosted on EC2 | 2x m7g.2xlarge (Graviton3) | ~$1,500-3,000/mo |

Recommendation: Start with ClickHouse Cloud for faster deployment. Migrate to self-hosted if cost optimization is needed later.


17. Implementation Roadmap

Phase 1 — Data Foundation (Weeks 1-3)

| Task | Details |
|---|---|
| Deploy ClickHouse | ClickHouse Cloud or 2-node self-hosted cluster |
| Identify all source tables | Catalog tables from all PostgreSQL databases |
| Set up CDC pipelines | Debezium + Kafka for each source database |
| Design medallion schema | Raw → Staging → Marts table definitions |
| Initial data load | Backfill historical data from all PostgreSQL databases |
| Validation | Row counts and checksum verification |

Phase 2 — Marts + LLM (Weeks 4-6)

| Task | Details |
|---|---|
| Build marts tables | Curated, aggregated, AI-ready tables |
| Materialized views | Volume, P&L, liquidations, candlesticks, revenue |
| Deploy Qwen 2.5 72B | Self-hosted on a GPU instance with vLLM |
| MCP server setup | Connect ClickHouse MCP to Qwen |
| Business glossary v1 | 50-100 domain-specific terms |
| Test SQL generation | Validate against known queries |

Phase 3 — Chat UI + Observability (Weeks 7-9)

| Task | Details |
|---|---|
| Deploy LibreChat | Self-hosted, SSO integration |
| Connect to Qwen backend | MCP plugin configuration |
| Langfuse setup | LLM tracing and accuracy monitoring |
| Role-based access | Team-specific table permissions |
| Beta launch | Trading Ops team as first users |
| Feedback loop | Review Langfuse failures, expand glossary |

Phase 4 — Production Hardening (Weeks 10-12)

| Task | Details |
|---|---|
| Glossary expansion | Target 200+ terms based on real usage |
| Accuracy tuning | Fix common failure patterns from Langfuse |
| Performance tuning | Query caching, materialized view optimization |
| Security audit | Access review, penetration testing |
| GA launch | Roll out to all teams |

Phase 5 — Advanced (Post Week 12)

  • Scheduled reports (daily P&L email, weekly risk summary)
  • Alert-driven queries (agent triggered by anomaly detection)
  • Multi-step analysis (agent chains multiple queries)
  • Fine-tune Qwen on the exchange's query patterns for higher accuracy
  • Add more source databases as needed

18. Cost Analysis

Monthly Infrastructure Costs

| Component | Est. Monthly Cost |
|---|---|
| ClickHouse Cloud | $3,000-5,000 |
| Kafka/CDC (Amazon MSK) | $500 |
| GPU for Qwen 2.5 (g5.12xlarge) | $7,000 (1-yr reserved) / $11,700 (on-demand) |
| LibreChat (t3.medium) | $100 |
| Langfuse | Included in ClickHouse |
| Total | ~$11,000-14,000 (reserved) / ~$15,000-19,000 (on-demand) |

Cost Optimization Path

| Optimization | Savings |
|---|---|
| Self-hosted ClickHouse instead of Cloud | -$1,500-3,000/mo |
| Qwen 2.5 32B on a smaller GPU | -$2,000/mo |
| Reserved instances (1-year) | -30% on GPU |
| Spot instances for dev/test | -60% |

ROI Justification

| Benefit | Value |
|---|---|
| Engineering time saved (ad-hoc queries) | 20-40 hours/month |
| Faster incident response | 50% reduction in MTTR |
| Production DB load reduction | Fewer analytical queries on the live trading DB |
| Self-service analytics | Teams don't need SQL knowledge |
| Cross-database insights | Previously impossible queries now instant |
| Compliance efficiency | Instant audit queries vs. manual extraction |

19. Risk Assessment

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| LLM generates incorrect SQL | Medium | Medium | Business glossary, Langfuse monitoring, human review |
| CDC lag (stale data) | Low | Medium | Monitor Kafka consumer lag, alert if >5 min |
| ClickHouse cluster failure | Low | High | 2-replica HA, automated failover, daily backups |
| GPU instance unavailable | Low | Medium | Reserved capacity, fallback to smaller model |
| Low user adoption | Medium | Medium | Start with power users, iterate on feedback |
| Schema drift (PG changes not in CH) | Low | Medium | Automated schema sync in CDC pipeline |
| Wrong answers not caught | Medium | High | Langfuse accuracy tracking, glossary reviews |
| PostgreSQL replication slot bloat | Medium | High | If Debezium falls behind, PG WAL accumulates indefinitely and can crash the production DB. Monitor slot lag, set max_slot_wal_keep_size, alert if lag >10 GB |
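The replication-slot mitigation in the last row can be monitored with stock PostgreSQL features — pg_replication_slots, pg_wal_lsn_diff(), and max_slot_wal_keep_size (PostgreSQL 13+) are all standard. A monitoring sketch:

```sql
-- How much WAL each slot is forcing PostgreSQL to retain.
SELECT
    slot_name,
    active,
    pg_size_pretty(
        pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
    ) AS retained_wal
FROM pg_replication_slots;

-- Cap retained WAL so a stalled Debezium slot cannot fill the disk
-- (the slot is invalidated instead of crashing production):
-- ALTER SYSTEM SET max_slot_wal_keep_size = '10GB';
```

Alert on retained_wal growth for the Debezium slot; a slot that is inactive and growing is the failure mode described above.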

20. Success Metrics

90-Day Targets

| Metric | Target | Measurement |
|---|---|---|
| Query accuracy | >70% correct SQL | Langfuse evaluation |
| Active users | 20+ weekly | LibreChat usage logs |
| Query volume | 500+ queries/week | Langfuse trace count |
| Response time | <10 seconds | Langfuse latency p50 |
| Source databases consolidated | 4+ databases in ClickHouse | Table count |
| Business glossary | 200+ terms | Glossary config |
| Production DB offload | 50%+ analytical queries moved | PG query log |

6-Month Targets

| Metric | Target |
|---|---|
| Query accuracy | >85% |
| Active users | 50+ |
| Self-service resolution | 70% of questions answered without engineering |
| Automated reports | 10+ scheduled reports |

21. Reference Links

ClickHouse Agentic Data Stack

| Resource | URL |
|---|---|
| Agentic Data Stack GitHub repo | https://github.com/ClickHouse/agentic-data-stack |
| MCP Server GitHub repo | https://github.com/ClickHouse/mcp-clickhouse |
| AgentHouse live demo | https://llm.clickhouse.com |

ClickHouse Blog Posts

| Post | URL |
|---|---|
| The Agentic Data Stack | https://clickhouse.com/blog/the-agentic-data-stack |
| How we made our data warehouse AI-first (DWAINE) | https://clickhouse.com/blog/ai-first-data-warehouse |
| How to set up ClickHouse for agentic analytics | https://clickhouse.com/blog/how-to-set-up-clickhouse-for-agentic-analytics |
| Building a data platform for agents | https://clickhouse.com/blog/building-a-data-platform-for-agents |
| Integrating with ClickHouse MCP (5 frameworks) | https://clickhouse.com/blog/integrating-clickhouse-mcp |
| 12 agent framework comparison | https://clickhouse.com/blog/how-to-build-ai-agents-mcp-12-frameworks |
| Introducing AgentHouse | https://clickhouse.com/blog/agenthouse-demo-clickhouse-llm-mcp |
| Ask AI agent & Remote MCP beta | https://clickhouse.com/blog/agentic-analytics-ask-ai-agent-and-remote-mcp-server-beta-launch |
| ClickHouse acquires LibreChat | https://clickhouse.com/blog/clickhouse-acquires-librechat |
| LibreChat + Agentic Data Stack | https://clickhouse.com/blog/librechat-open-source-agentic-data-stack |
| ClickHouse acquires Langfuse | https://clickhouse.com/blog/clickhouse-acquires-langfuse-open-source-llm-observability |
| Langfuse scaling with ClickHouse | https://clickhouse.com/blog/langfuse-llm-analytics |
| Agent-facing analytics | https://clickhouse.com/blog/agent-facing-analytics |
| LLM chat UIs that support MCP | https://clickhouse.com/blog/llm-chat-mcp-support |
| Wix Wild Moose — AI incident response on ClickHouse | https://clickhouse.com/blog/wix-wild-moose |
| Building an agentic coding app (retail demo) | https://clickhouse.com/blog/agentic-coding-app |
| Agentic coding at scale — CTO perspective | https://clickhouse.com/blog/agentic-coding |
| ClickHouse AI Policy (public) | https://github.com/ClickHouse/ClickHouse/blob/master/AI_POLICY.md |


Appendix A — Technology Stack Summary

| Component | Technology | Purpose |
|---|---|---|
| Analytical DB | ClickHouse | All exchange data consolidated; columnar storage, fast aggregations |
| CDC | Debezium + Kafka | Real-time sync from all PostgreSQL databases |
| LLM | Qwen 2.5 72B (self-hosted) | Natural language to SQL generation |
| MCP Server | ClickHouse MCP | Schema discovery + safe SQL execution |
| Chat UI | LibreChat (self-hosted) | Web-based chat interface for all teams |
| Observability | Langfuse | LLM tracing, accuracy tracking, cost monitoring |
| Source of Truth | PostgreSQL databases | Production OLTP (unchanged) |

Appendix B — DWAINE Reference (ClickHouse Internal Agent)

ClickHouse built DWAINE (Data Warehouse AI Natural Expert) for their own internal use:

| Metric | Value |
|---|---|
| Users | 250+ employees |
| Daily messages | 200+ |
| Daily conversations | 50-70 |
| Internal analytics use cases covered | ~70% |
| DWH team workload reduction | 50-70% |
| Stack | LibreChat + Claude (Bedrock) + ClickHouse MCP |
| Key learning | Business glossary is the #1 factor for accuracy |
| Source | ClickHouse — How we made our data warehouse AI-first |

Our platform follows the same architecture, with Qwen replacing Claude for data privacy.


Closing thoughts — the parts that actually matter

After shipping this platform, the things that determine success are not the parts you'd expect:

  1. The business glossary is everything. ClickHouse's internal post-mortem on DWAINE concluded the same: glossary quality is the #1 driver of query accuracy. You don't fine-tune the LLM — you write a YAML file that maps domain terms to SQL fragments, and the LLM uses it via in-context learning. Budget more time for the glossary than for the LLM.

  2. The Marts layer is non-negotiable. Exposing the LLM to staging or raw data is how you get hallucinated SQL. Marts are pre-joined, denormalized, semantically clean, and explicitly access-controlled. The LLM should only ever see Marts.

  3. JOIN avoidance dominates query performance. The 3-strategy approach — pre-joined marts, dictionaries (dictGet), UNION ALL — handles ~90% of cross-table questions without a single runtime JOIN. The other 10% gets ClickHouse's auto-algorithm fallback. This was the single biggest architectural decision.

  4. Langfuse observability is what makes the platform improvable over time. Without per-query traces, accuracy scores, and cost-per-question, you have no idea where to invest. With it, you can see exactly which question patterns fail and feed that back into the glossary.

  5. The PostgreSQL replication slot bloat risk is the one most teams forget. If Debezium falls behind, PostgreSQL accumulates WAL indefinitely and can crash production. Set max_slot_wal_keep_size and alert on slot lag. Don't learn this the hard way.

  6. Security guardrails matter more than people think. A 30-second query timeout, 2GB memory cap, 100M-row scan limit, and SELECT-only access on the Marts schema together mean the LLM cannot accidentally take down the cluster. ClickHouse's official agentic-analytics setup guide prescribes these exact values for a reason.

The full implementation — including the ClickHouse AI DBA MCP server (152 tools for cluster operations) referenced in this guide, knowledge base with 47 production-tested rules, and 16 recovery runbooks — lives in the companion repo on GitHub.

If you've shipped something similar (or are building it now), I'd love to compare notes. Drop a comment below or reach out on LinkedIn.

The companion piece — Replacing Elasticsearch with ClickHouse for OTEL Logs, Traces & Metrics: A 90% Cost-Reduction Migration — covers the observability layer of this same architecture in more depth.
