<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: zeevgg</title>
    <description>The latest articles on DEV Community by zeevgg (@zeevgerstner).</description>
    <link>https://dev.to/zeevgerstner</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F305358%2F76bd4c8f-1e9f-4c07-9570-35c2829c082f.png</url>
      <title>DEV Community: zeevgg</title>
      <link>https://dev.to/zeevgerstner</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/zeevgerstner"/>
    <language>en</language>
    <item>
      <title>Building an Intelligent ClickHouse Investigation Agent (The Official MCP Isn't Enough)</title>
      <dc:creator>zeevgg</dc:creator>
      <pubDate>Mon, 09 Feb 2026 20:47:00 +0000</pubDate>
      <link>https://dev.to/zeevgerstner/building-an-intelligent-clickhouse-investigation-agent-the-official-mcp-isnt-enough-a34</link>
      <guid>https://dev.to/zeevgerstner/building-an-intelligent-clickhouse-investigation-agent-the-official-mcp-isnt-enough-a34</guid>
      <description>&lt;p&gt;For the last six months, I've been using the &lt;a href="https://github.com/ClickHouse/mcp-clickhouse" rel="noopener noreferrer"&gt;official ClickHouse MCP&lt;/a&gt; every day to investigate sports data incidents at LSports. I use it primarily through Cursor (the AI-powered code editor), though it works with Claude Desktop and other MCP clients too.&lt;/p&gt;

&lt;p&gt;The official MCP is excellent at what it does: executing SQL queries and providing schema introspection.&lt;/p&gt;

&lt;p&gt;But every day, I hit the same wall: &lt;strong&gt;I still write SQL manually.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So I'm building the intelligence layer on top. In the open. Here's why.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Official ClickHouse MCP is Good (Really)
&lt;/h2&gt;

&lt;p&gt;ClickHouse Inc built a solid MCP server. It does exactly what a database connector should do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Executes SQL queries reliably&lt;/li&gt;
&lt;li&gt;✅ Provides schema introspection (list databases, tables, columns)&lt;/li&gt;
&lt;li&gt;✅ Proper security (readonly mode)&lt;/li&gt;
&lt;li&gt;✅ Active maintenance and updates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For generic database access, it works great. I use it every day.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But it's missing the intelligence layer.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Daily Pain Points
&lt;/h2&gt;

&lt;p&gt;Let me show you what a typical investigation looks like:&lt;/p&gt;

&lt;h3&gt;
  
  
  Pain #1: Manual SQL Writing (Every. Single. Time.)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Me in Cursor:&lt;/strong&gt; "Show me NBA incidents from yesterday"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cursor (using official ClickHouse MCP):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Cursor writes this from scratch every time&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;incidents&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;sport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Basketball'&lt;/span&gt; 
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;league&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'NBA'&lt;/span&gt; 
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="nb"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;yesterday&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What I actually want:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Template already exists&lt;/li&gt;
&lt;li&gt;Just fill in parameters: &lt;code&gt;sport=NBA&lt;/code&gt;, &lt;code&gt;timeframe=yesterday&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Instant execution&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pain #2: No Safety Checks
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Bad query generated:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;incidents&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'%timeout%'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; 💥 Full table scan, times out, wastes cluster resources&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I need:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Validate it uses ORDER BY columns (ClickHouse optimization requirement)&lt;/li&gt;
&lt;li&gt;Enforce LIMIT clause&lt;/li&gt;
&lt;li&gt;Check query won't cause full table scan&lt;/li&gt;
&lt;li&gt;Timeout enforcement&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pain #3: No Business Context
&lt;/h3&gt;

&lt;p&gt;The AI (whether in Cursor or Claude) doesn't understand my domain:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What is an "incident" in my schema?&lt;/li&gt;
&lt;li&gt;How do I calculate "settlement quality"?&lt;/li&gt;
&lt;li&gt;Which tables should be JOINed for investigations?&lt;/li&gt;
&lt;li&gt;What are the common investigation patterns?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Every query starts from zero knowledge.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Pain #4: No Investigation Workflow
&lt;/h3&gt;

&lt;p&gt;Complex questions require 5-10 queries:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Query match events around specific time&lt;/li&gt;
&lt;li&gt;Query market changes&lt;/li&gt;
&lt;li&gt;Query incidents&lt;/li&gt;
&lt;li&gt;Manually correlate the results&lt;/li&gt;
&lt;li&gt;Ask follow-up questions&lt;/li&gt;
&lt;li&gt;Repeat&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;No context retention. No correlation. Manual synthesis every time.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'm Building
&lt;/h2&gt;

&lt;p&gt;An &lt;strong&gt;intelligent investigation agent&lt;/strong&gt; that sits on top of the official ClickHouse MCP.&lt;/p&gt;

&lt;p&gt;Think of it this way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Official ClickHouse MCP  →  My Investigation Agent
(Database Driver)            (Intelligent Engine)
     ↓                             ↓
  psycopg2                    Django ORM
  AWS SDK                     Serverless Framework
  HTTP client                 GraphQL client
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Architecture (High-Level)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User Question (Natural Language)
    ↓
Intent Parser (LLM - Claude, GPT-4, or other)
    ↓
Semantic Layer (maps business concepts to ClickHouse)
    ↓
Query Template Selection (30-50 pre-optimized patterns)
    ↓
Parameter Extraction &amp;amp; Validation
    ↓
Official ClickHouse MCP (executes the SQL)
    ↓
ClickHouse Database
    ↓
Result Formatting &amp;amp; Insights
    ↓
Response (via Cursor, Claude Desktop, or any MCP client)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Four Intelligence Layers
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Layer 1: Semantic Understanding&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Maps "incidents" to specific tables/columns&lt;/li&gt;
&lt;li&gt;Knows business metrics and how to calculate them&lt;/li&gt;
&lt;li&gt;Understands investigation patterns&lt;/li&gt;
&lt;li&gt;Domain-specific knowledge baked in&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Layer 2: Query Intelligence&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;30-50 pre-optimized ClickHouse query templates&lt;/li&gt;
&lt;li&gt;LLM picks the right template and fills parameters&lt;/li&gt;
&lt;li&gt;Only generates SQL from scratch for edge cases&lt;/li&gt;
&lt;li&gt;Learns from usage to create new templates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Layer 3: Safety &amp;amp; Performance&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Validates queries use ORDER BY columns (ClickHouse requirement)&lt;/li&gt;
&lt;li&gt;Enforces LIMIT clauses (prevents unbounded results)&lt;/li&gt;
&lt;li&gt;Checks query cost before execution&lt;/li&gt;
&lt;li&gt;Guarantees &amp;lt; 5 second execution time&lt;/li&gt;
&lt;li&gt;Timeout enforcement at 4.5 seconds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Layer 4: Investigation Features&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-turn conversation state (remembers context)&lt;/li&gt;
&lt;li&gt;Automatic correlation between queries&lt;/li&gt;
&lt;li&gt;Proactive insights ("I noticed high incident rate at 23:42. Want me to check affected markets?")&lt;/li&gt;
&lt;li&gt;Rich result formatting&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why ClickHouse Needs Special Handling
&lt;/h2&gt;

&lt;p&gt;You can't just let an LLM generate ClickHouse queries naively:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Performance Gotchas:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;❌ Wrong ORDER BY → 100x slower query&lt;/li&gt;
&lt;li&gt;❌ Missing LIMIT → out of memory&lt;/li&gt;
&lt;li&gt;❌ Bad JOIN → cluster-wide performance impact&lt;/li&gt;
&lt;li&gt;❌ High-cardinality GROUP BY → resource exhaustion&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;ClickHouse-Specific Optimizations:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Must use ORDER BY columns in WHERE clause&lt;/li&gt;
&lt;li&gt;PREWHERE for heavy filtering&lt;/li&gt;
&lt;li&gt;Materialized views for aggregations&lt;/li&gt;
&lt;li&gt;Skip indexes for selective queries&lt;/li&gt;
&lt;li&gt;Different engine types (MergeTree, ReplacingMergeTree, etc.)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;This is why no one has built a production-grade ClickHouse agent yet. It's too risky without proper guardrails.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Template-First Approach (Why It's Actually Perfect for Agents)
&lt;/h2&gt;

&lt;p&gt;Despite the agent hype, I'm going &lt;strong&gt;template-first&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Fast Path (80% of queries)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;incidents for {sport} {timeframe}&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;sql&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`
    SELECT 
      timestamp, sport, league, 
      incident_type, severity
    FROM incidents
    WHERE sport = {sport}
      AND timestamp &amp;gt;= {timeframe_start}
    ORDER BY timestamp DESC
    LIMIT 100
  `&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why templates?&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;✅ Guaranteed ClickHouse optimization (pre-tuned)&lt;/li&gt;
&lt;li&gt;✅ Predictable performance (&amp;lt; 5 sec guaranteed)&lt;/li&gt;
&lt;li&gt;✅ Safe by default (no dangerous queries)&lt;/li&gt;
&lt;li&gt;✅ Handles 80% of use cases&lt;/li&gt;
&lt;li&gt;✅ Perfect for autonomous agents (stronger guardrails)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;For the 20% edge cases:&lt;/strong&gt; LLM generates SQL, but with heavy validation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This isn't limiting—it's intelligent:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Template = muscle memory for common patterns&lt;/li&gt;
&lt;li&gt;LLM = creativity for novel questions&lt;/li&gt;
&lt;li&gt;Best of both worlds&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why Build This?
&lt;/h2&gt;

&lt;p&gt;Three reasons:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. I Need It (Real Pain, Daily)
&lt;/h3&gt;

&lt;p&gt;This isn't theoretical. I use ClickHouse + Claude for data investigation every day. The official MCP gets me 80% there. I'm building the missing 20%.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Market Gap
&lt;/h3&gt;

&lt;p&gt;There is &lt;strong&gt;no production-grade intelligent agent for ClickHouse&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generic text-to-SQL tools (Vanna, Defog) don't handle ClickHouse optimization&lt;/li&gt;
&lt;li&gt;Other database agents (for Postgres/MySQL) don't transfer&lt;/li&gt;
&lt;li&gt;The official MCP is execution only, not intelligence&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Someone needs to build this. Might as well be me.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Technical Challenge
&lt;/h3&gt;

&lt;p&gt;ClickHouse + LLM is genuinely hard:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Different SQL dialect (arrayJoin, PREWHERE, engine-specific syntax)&lt;/li&gt;
&lt;li&gt;Query optimization is critical (ORDER BY, partitions, indexes)&lt;/li&gt;
&lt;li&gt;Performance unpredictable without templates&lt;/li&gt;
&lt;li&gt;Safety requirements for production use&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;This is the kind of problem that's fun to solve.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Building in Public
&lt;/h2&gt;

&lt;p&gt;I'm building this &lt;strong&gt;completely in the open&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Open source from day 1 (Apache 2.0)&lt;/li&gt;
&lt;li&gt;✅ Weekly blog posts on progress&lt;/li&gt;
&lt;li&gt;✅ GitHub repo with full code&lt;/li&gt;
&lt;li&gt;✅ Community feedback welcomed&lt;/li&gt;
&lt;li&gt;✅ LSports as design partner (validates with real data)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The problem is real and affects many teams&lt;/li&gt;
&lt;li&gt;Building in public gets better feedback faster&lt;/li&gt;
&lt;li&gt;Open source means others can contribute and improve it&lt;/li&gt;
&lt;li&gt;Someone needs to solve this—might as well share the solution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Not trying to:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compete with ClickHouse Inc (building on their foundation)&lt;/li&gt;
&lt;li&gt;Keep it proprietary&lt;/li&gt;
&lt;li&gt;Build in secret&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Roadmap
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Week 1-2: POC (Proof of Concept)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;10 investigation query templates&lt;/li&gt;
&lt;li&gt;Basic LLM integration&lt;/li&gt;
&lt;li&gt;Validation layer&lt;/li&gt;
&lt;li&gt;Integration with official ClickHouse MCP&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Success:&lt;/strong&gt; Handles 40% of my daily investigations&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Month 2: Core Engine
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;30-50 query templates (80% coverage)&lt;/li&gt;
&lt;li&gt;Complete semantic layer&lt;/li&gt;
&lt;li&gt;Multi-turn conversation state&lt;/li&gt;
&lt;li&gt;Performance monitoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Success:&lt;/strong&gt; Replaces manual SQL for most use cases&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Month 3: Launch
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Production-ready release&lt;/li&gt;
&lt;li&gt;MCP server implementation (works in Cursor, Claude Desktop, ChatGPT Apps)&lt;/li&gt;
&lt;li&gt;REST API + SDKs&lt;/li&gt;
&lt;li&gt;Documentation and examples&lt;/li&gt;
&lt;li&gt;Conference talk submissions&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What I'm NOT Building
&lt;/h2&gt;

&lt;p&gt;Important to be clear:&lt;/p&gt;

&lt;p&gt;❌ &lt;strong&gt;Not replacing the official ClickHouse MCP&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The official MCP is great at execution&lt;/li&gt;
&lt;li&gt;I'm building the intelligence layer on top&lt;/li&gt;
&lt;li&gt;Complementary, not competitive&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;❌ &lt;strong&gt;Not a generic text-to-SQL tool&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Purpose-built for ClickHouse&lt;/li&gt;
&lt;li&gt;Optimized for real-time analytics and investigations&lt;/li&gt;
&lt;li&gt;Won't support other databases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;❌ &lt;strong&gt;Not a consumer product (yet)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Starting as open source&lt;/li&gt;
&lt;li&gt;LSports is design partner&lt;/li&gt;
&lt;li&gt;May evolve into hosted service later&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Initial Tech Direction (Subject to Change)
&lt;/h2&gt;

&lt;p&gt;I'm starting with these technologies, but may adjust as I build:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Agent Core:
- Language: TypeScript or Python (TBD based on MCP integration)
- LLM: Claude or GPT-4 (testing both)
- MCP Integration: Official ClickHouse MCP

ClickHouse Access:
- Primary: Official ClickHouse MCP
- Fallback: Direct client if needed

Semantic Layer:
- Configuration files (format TBD)
- Schema caching (Redis or in-memory)

Observability:
- Query logging and performance tracking
- Error monitoring
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;This will evolve as I learn what actually works.&lt;/strong&gt; Building in public means sharing the journey, including what doesn't work.&lt;/p&gt;




&lt;h2&gt;
  
  
  How You Can Help
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;If you're using ClickHouse + AI agents:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;What questions are you trying to answer with your data?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I want to understand common investigation patterns&lt;/li&gt;
&lt;li&gt;What templates would be most useful?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;What's broken in current solutions?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Official MCP pain points?&lt;/li&gt;
&lt;li&gt;LLM SQL generation issues?&lt;/li&gt;
&lt;li&gt;Performance problems?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;What would make this 10x better for you?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Features you need?&lt;/li&gt;
&lt;li&gt;Integration requirements?&lt;/li&gt;
&lt;li&gt;Use cases I'm missing?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; [Coming this week - will update post]&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Follow progress:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Twitter: &lt;a href="https://x.com/zevikgg" rel="noopener noreferrer"&gt;https://x.com/zevikgg&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;LinkedIn: &lt;a href="https://linkedin.com/in/zeevgerstner" rel="noopener noreferrer"&gt;https://linkedin.com/in/zeevgerstner&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Or just comment below.&lt;/strong&gt; I read everything.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where This is Heading
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Product Vision:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In 12 months, this should be the go-to solution for teams running ClickHouse + AI agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Production-ready and battle-tested&lt;/li&gt;
&lt;li&gt;Handles 80%+ of investigation patterns out of the box&lt;/li&gt;
&lt;li&gt;Active community contributing templates&lt;/li&gt;
&lt;li&gt;Used by teams beyond just sports analytics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Personal Goals:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If I become known as a ClickHouse + AI expert along the way—great. If I end up speaking at conferences about this—even better.&lt;/p&gt;

&lt;p&gt;But the real measure of success is: &lt;strong&gt;Did we solve a real problem that others have too?&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;The official ClickHouse MCP is excellent at execution.&lt;/p&gt;

&lt;p&gt;But execution alone isn't enough for production investigations. You need intelligence:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understanding of business context&lt;/li&gt;
&lt;li&gt;Query optimization for ClickHouse&lt;/li&gt;
&lt;li&gt;Safety guarantees&lt;/li&gt;
&lt;li&gt;Investigation workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;That's what I'm building.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Standing on the shoulders of giants (the official MCP), building the intelligence layer that's missing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Want to follow along?&lt;/strong&gt; Drop a comment, star the repo (coming this week), or reach out directly.&lt;/p&gt;

&lt;p&gt;Let's build something useful. Together. In the open.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Zevik - Group Manager, Data Intelligence @ LSports&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Building ClickHouse investigation agents in public&lt;/em&gt;&lt;/p&gt;

</description>
      <category>clickhouse</category>
      <category>ai</category>
      <category>agents</category>
      <category>buildinpublic</category>
    </item>
  </channel>
</rss>
