<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Siddhant Rai</title>
    <description>The latest articles on DEV Community by Siddhant Rai (@siiddhantt).</description>
    <link>https://dev.to/siiddhantt</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F952096%2F3377ef4b-e695-47a9-a7d7-7eff30c6f5e0.jpeg</url>
      <title>DEV Community: Siddhant Rai</title>
      <link>https://dev.to/siiddhantt</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/siiddhantt"/>
    <language>en</language>
    <item>
      <title>Building ReefWatch, a Coral-Powered Production Triage Agent</title>
      <dc:creator>Siddhant Rai</dc:creator>
      <pubDate>Sat, 30 May 2026 06:43:00 +0000</pubDate>
      <link>https://dev.to/siiddhantt/building-reefwatch-a-coral-powered-production-triage-agent-23hf</link>
      <guid>https://dev.to/siiddhantt/building-reefwatch-a-coral-powered-production-triage-agent-23hf</guid>
      <description>&lt;p&gt;Production incidents almost never break in one place.&lt;/p&gt;

&lt;p&gt;The alert fires in one tool. The broken deploy is in Netlify. The suspicious&lt;br&gt;
change is in GitHub. The stack trace is in Sentry. The human context is in&lt;br&gt;
Slack. The runbook is in Notion. The "is this actually paging someone?" answer&lt;br&gt;
is in PagerDuty.&lt;/p&gt;

&lt;p&gt;A normal chatbot can sound helpful in that situation. It can say things like&lt;br&gt;
"you should check your recent deployments" and "look for related errors in&lt;br&gt;
Sentry."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But that is not triage. That is a polished to-do list.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I wanted something more useful: an agent that could go get the evidence, connect&lt;br&gt;
the dots across sources, show its work, and give an operator-grade answer&lt;br&gt;
grounded in real system data.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The design constraint from the start was simple: no evidence, no answer.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That became &lt;a href="https://github.com/siiddhantt/reefwatch" rel="noopener noreferrer"&gt;&lt;strong&gt;ReefWatch&lt;/strong&gt;&lt;/a&gt;, a Coral-powered production triage agent built to&lt;br&gt;
investigate instead of improvise.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4kfs0hu26c8gtcs36bu6.gif" class="article-body-image-wrapper"&gt;&lt;img width="559" height="339" alt="GH" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4kfs0hu26c8gtcs36bu6.gif"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It discovers the tools connected to a workspace at runtime, queries them as&lt;br&gt;
evidence, correlates records across systems, and produces a compact answer only&lt;br&gt;
when the facts support one.&lt;/p&gt;

&lt;p&gt;Coral became the backbone because it turns the messiest part of agent tooling&lt;br&gt;
into something the model can actually reason about: &lt;strong&gt;SQL&lt;/strong&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  What This Guide Builds
&lt;/h2&gt;

&lt;p&gt;By the end of this route, you will have a blueprint for an agent that can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;discover connected &lt;strong&gt;Coral&lt;/strong&gt; sources at runtime&lt;/li&gt;
&lt;li&gt;query production systems through read-only &lt;strong&gt;SQL&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;correlate evidence across code, deploys, errors, alerts, chats, and runbooks&lt;/li&gt;
&lt;li&gt;stream every query and row count into an inspectable UI&lt;/li&gt;
&lt;li&gt;run the same investigation workflow from a CLI when you want a scriptable path&lt;/li&gt;
&lt;li&gt;generate an incident report only when the evidence supports one&lt;/li&gt;
&lt;li&gt;stay focused with policy layers instead of a giant prompt blob&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fel13nah6vxx5pevhcsaw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fel13nah6vxx5pevhcsaw.png" alt="Investigation Workspace" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In one sentence:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;ReefWatch is a Coral-powered investigation workspace that lets an agent discover connected tools at runtime, query them with read-only SQL, stream the evidence trail, and generate an incident report only when the facts actually support one.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3de5x18u8cyyvnypajqf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3de5x18u8cyyvnypajqf.png" alt="The ReefWatch Flow" width="800" height="295"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Why Coral Belongs At The Center
&lt;/h2&gt;

&lt;p&gt;MCP is excellent as an integration layer. It gives models a way to call tools&lt;br&gt;
with schemas instead of scraping humans through UI glue.&lt;/p&gt;

&lt;p&gt;But if every source becomes a separate collection of bespoke tools, a new&lt;br&gt;
problem appears:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the model has to learn many tool shapes&lt;/li&gt;
&lt;li&gt;every API has different pagination and filters&lt;/li&gt;
&lt;li&gt;joining records across sources becomes a prompt exercise&lt;/li&gt;
&lt;li&gt;source-specific errors leak straight into the agent loop&lt;/li&gt;
&lt;li&gt;every new integration asks the app to own more integration logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Coral changes the abstraction.&lt;/p&gt;

&lt;p&gt;It still uses MCP, but the agent mostly sees a small set of stable capabilities:&lt;br&gt;
discover catalog, inspect schema, read the guide, and run SQL.&lt;/p&gt;

&lt;p&gt;That means a new source is not:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;teach ReefWatch another SDK
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It becomes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;install a Coral source -&amp;gt; discover the tables -&amp;gt; query evidence with SQL
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frb57ip1ivtepqn4cgc8p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frb57ip1ivtepqn4cgc8p.png" alt="Tool-first flow" width="799" height="413"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxz7i6m6kx5cc2cy9mfwl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxz7i6m6kx5cc2cy9mfwl.png" alt="ReefWatch flow using Coral" width="796" height="73"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The practical win is boring in the best way: &lt;strong&gt;ReefWatch can stay small.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The app does not own GitHub pagination, Sentry auth, Slack table shapes, or&lt;br&gt;
Netlify deploy schemas.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Coral owns that. ReefWatch owns the investigation behavior.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That split also maps well to how I think reliable agents should be built:&lt;br&gt;
ground the model in real environment feedback, keep tools composable, trace the&lt;br&gt;
work, and wrap the loop with small guardrails instead of hoping one perfect&lt;br&gt;
prompt behaves forever.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;MCP gives the agent hands. Coral gives it a map and a query language.&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h2&gt;
  
  
  The Build Path
&lt;/h2&gt;

&lt;p&gt;If I were rebuilding ReefWatch from scratch, I would not start with the UI.&lt;/p&gt;

&lt;p&gt;I would start with the investigation pipeline and make each layer earn its&lt;br&gt;
place.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Remember, it is tempting to start with the surface, but you should make the surface reflect a system that was already worth trusting.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The project came together in eight slices:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Slice&lt;/th&gt;
&lt;th&gt;What I Built&lt;/th&gt;
&lt;th&gt;Why It Mattered&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Coral MCP client&lt;/td&gt;
&lt;td&gt;Proved Coral could be the data plane&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Warm Coral session&lt;/td&gt;
&lt;td&gt;Removed repeated MCP startup cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Schema context&lt;/td&gt;
&lt;td&gt;Kept prompts aligned with live Coral metadata&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Minimal agent loop&lt;/td&gt;
&lt;td&gt;Exposed the real model failure modes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Policy modules&lt;/td&gt;
&lt;td&gt;Made the agent reliable without hardcoding a demo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Persistence&lt;/td&gt;
&lt;td&gt;Made runs debuggable and conversations durable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Streaming UI&lt;/td&gt;
&lt;td&gt;Made the investigation inspectable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Source profiles&lt;/td&gt;
&lt;td&gt;Made setup reproducible without requiring every token&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8hq47mq5psev54bz4g2f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8hq47mq5psev54bz4g2f.png" alt="Build Path" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The final project shape looks roughly like this:&lt;/p&gt;

&lt;p&gt;(a FastAPI backend, with SQLite store and React frontend)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;reefwatch/
|-- src/
|   |-- api/
|   |   |-- routes/
|   |   |   |-- coral.py              # Coral health and source setup
|   |   |   |-- conversations.py      # persisted investigation threads
|   |   |   |-- investigations.py     # REST + SSE investigation runs
|   |   |   `-- schema.py             # schema visibility for the UI
|   |   |-- dependencies.py           # shared Settings, store, agent, session
|   |   |-- mappers.py                # domain models to API responses
|   |   `-- schemas.py                # API contracts
|   `-- app/
|       |-- adapters/
|       |   |-- coral_session.py      # long-lived Coral process + warm cache
|       |   |-- mcp_client.py         # JSON-RPC over coral mcp-stdio
|       |   `-- store.py              # SQLite run/conversation persistence
|       |-- agent/
|       |   |-- context.py            # conversation compression
|       |   |-- coverage.py           # evidence lane policy
|       |   |-- events.py             # streamable trace event contracts
|       |   |-- guardrails.py         # evidence-first retries
|       |   |-- intent.py             # structured artifact routing
|       |   |-- execution_policy.py   # duplicate and SQL-shape hygiene
|       |   |-- loop.py               # LLM/tool loop
|       |   |-- policy.py             # budgets and finalization
|       |   |-- prompts.py            # schema-aware operating contract
|       |   |-- schema.py             # Coral table/column context builder
|       |   |-- source_guidance.py    # compact source idiom hints
|       |   |-- taxonomy.py           # source lanes and shared intent vocabulary
|       |   |-- workflow.py           # coverage, correlation, and stop checkpoints
|       |   `-- synthesis.py          # optional incident report synthesis
|       |-- config.py                 # centralized runtime knobs
|       |-- coral_setup.py            # install/test Coral source profiles
|       `-- source_profiles.py        # triage, demo, enterprise profiles
`-- frontend/
    `-- src/
        |-- components/chat/          # chat surface, markdown, evidence trail
        |-- store/                    # conversation/run state
        `-- api/                      # backend client

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That structure came from the order of problems I solved.&lt;/p&gt;




&lt;h2&gt;
  
  
  Slice 1: Prove Coral Can Be The Data Plane
&lt;/h2&gt;

&lt;p&gt;The first backend slice was deliberately small.&lt;/p&gt;

&lt;p&gt;I wanted to answer one question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can ReefWatch treat Coral as the source of operational truth?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The first proof was:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Launch &lt;code&gt;coral mcp-stdio&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Initialize MCP over JSON-RPC.&lt;/li&gt;
&lt;li&gt;Read &lt;code&gt;coral://tables&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Call the &lt;code&gt;sql&lt;/code&gt; tool.&lt;/li&gt;
&lt;li&gt;Return rows to a plain API endpoint.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;At that point, ReefWatch was not an agent yet. It was a thin Coral client.&lt;/p&gt;

&lt;p&gt;That was useful because it proved the most important bet: &lt;strong&gt;the app could treat&lt;br&gt;
Coral as the data plane&lt;/strong&gt; instead of building direct SDK integrations for&lt;br&gt;
GitHub, Slack, Sentry, and every other source.&lt;/p&gt;

&lt;p&gt;The first reusable module was &lt;code&gt;mcp_client.py&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It owns the boring but essential transport details:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;spawn the Coral binary&lt;/li&gt;
&lt;li&gt;speak JSON-RPC over stdio&lt;/li&gt;
&lt;li&gt;convert MCP tools to OpenAI-compatible function tools&lt;/li&gt;
&lt;li&gt;read Coral resources such as &lt;code&gt;coral://guide&lt;/code&gt; and &lt;code&gt;coral://tables&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;surface stderr and decoding errors clearly&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Design decision:&lt;/strong&gt; keep transport boring. Once &lt;code&gt;mcp_client.py&lt;/code&gt; worked, the&lt;br&gt;
rest of the app could stop thinking about processes and start thinking about&lt;br&gt;
investigations.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  Slice 2: Keep Coral Warm
&lt;/h2&gt;

&lt;p&gt;The naive approach would be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;user asks question -&amp;gt; spawn Coral -&amp;gt; discover schema -&amp;gt; ask model -&amp;gt; run SQL
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is fine for a script.&lt;/p&gt;

&lt;p&gt;It feels rough in a &lt;strong&gt;product&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;So the second slice was &lt;code&gt;coral_session.py&lt;/code&gt;. It keeps one Coral process alive,&lt;br&gt;
warms the schema/guide/tool cache, and recreates the process if it dies.&lt;/p&gt;

&lt;p&gt;That gave ReefWatch a cleaner runtime shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;app starts -&amp;gt; warm Coral once -&amp;gt; investigations reuse the session
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fis4rcd498w66bbc9w2w4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fis4rcd498w66bbc9w2w4.png" alt="Agent loop" width="800" height="646"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The session cache stores three things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the Coral source schema&lt;/li&gt;
&lt;li&gt;the Coral guide&lt;/li&gt;
&lt;li&gt;the OpenAI-compatible tool definitions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That one decision made the product feel different.&lt;/p&gt;

&lt;p&gt;Instead of every user prompt waiting on MCP bootstrapping and catalog discovery,&lt;br&gt;
&lt;strong&gt;ReefWatch starts from a warm map of the available sources.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There is still a fallback path. If the process dies, &lt;code&gt;CoralSession&lt;/code&gt; can recreate&lt;br&gt;
the client and warm the cache again.&lt;/p&gt;

&lt;p&gt;The MCP client reads and writes JSON-RPC over stdio with UTF-8 decoding, drains&lt;br&gt;
stderr on a background thread, and reports useful transport errors rather than&lt;br&gt;
hanging silently.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A production triage agent that randomly waits on its own plumbing is not a production triage agent.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  Slice 3: Build Schema Context From Coral, Not Memory
&lt;/h2&gt;

&lt;p&gt;Hardcoding source schemas would defeat the point of using Coral.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The agent must discover what is installed right now.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The temptation was to write a hand-authored prompt like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GitHub has these tables. Sentry has these tables. Slack uses channel ids.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That would have made ReefWatch brittle and less Coral-native.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This was one of those small choices where the architecture either respects the&lt;br&gt;
tool it is built on, or quietly works around it.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Instead, ReefWatch builds its prompt context from Coral itself.&lt;/p&gt;

&lt;p&gt;It reads &lt;code&gt;coral://tables&lt;/code&gt;, enriches the result with &lt;code&gt;coral.columns&lt;/code&gt;, groups&lt;br&gt;
tables by source, and includes only a compact slice of each source in the&lt;br&gt;
prompt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;schema_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;column_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data_type&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;coral&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;schema_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ordinal_position&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The key idea:&lt;/strong&gt; the model gets a map, not a maze.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If the source catalog is small, the model sees most of it.&lt;/p&gt;

&lt;p&gt;If the catalog is large, the model gets enough to start and can use Coral&lt;br&gt;
discovery tools for the rest.&lt;/p&gt;

&lt;p&gt;That keeps the prompt useful without pretending the app has permanent knowledge of every source.&lt;/p&gt;
&lt;h2&gt;
  
  
  Slice 4: Start With The Smallest Useful Agent Loop
&lt;/h2&gt;

&lt;p&gt;Only after Coral transport and schema context worked did I build the agent loop.&lt;/p&gt;

&lt;p&gt;The first version of &lt;code&gt;agent/loop.py&lt;/code&gt; had one job:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;messages -&amp;gt; LLM tool call -&amp;gt; Coral SQL -&amp;gt; tool result -&amp;gt; final answer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That version was intentionally plain.&lt;/p&gt;

&lt;p&gt;It let me see the raw failure modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;advice instead of action:&lt;/strong&gt; the model answered with instructions instead of querying&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;catalog instead of evidence:&lt;/strong&gt; it listed tables instead of using them&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;unnecessary clarification:&lt;/strong&gt; it asked for repo names GitHub auth could reveal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;single-lane tunnel vision:&lt;/strong&gt; it queried one source and claimed it investigated everything&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;false negatives:&lt;/strong&gt; it treated a filtered zero-row query as proof of no evidence&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those failures were useful.&lt;/p&gt;

&lt;p&gt;They showed which parts belonged in the prompt and which parts deserved&lt;br&gt;
code-level policy.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A bad first agent run is not wasted time if it tells you where the system needs structure.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  Slice 5: Add Policy Around The Loop
&lt;/h2&gt;

&lt;p&gt;This was the real turning point.&lt;/p&gt;

&lt;p&gt;I &lt;strong&gt;stopped trying to make one heroic system prompt do everything&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead, I split agent behavior into focused modules:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;policy.py&lt;/code&gt; decides query budgets and finalization behavior.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;guardrails.py&lt;/code&gt; handles evidence-first retries and missing-source retries.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;coverage.py&lt;/code&gt; decides which evidence lanes matter for a request.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;workflow.py&lt;/code&gt; turns coverage and correlation into small checkpoint prompts.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;execution_policy.py&lt;/code&gt; skips duplicate/noisy query shapes and catches table/function syntax mistakes before they hit Coral.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;context.py&lt;/code&gt; compresses conversation history.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;synthesis.py&lt;/code&gt; decides whether a structured report is appropriate.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This was better than making one giant prompt because each module has a clear&lt;br&gt;
reason to exist and can be tested:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Failure Mode&lt;/th&gt;
&lt;th&gt;Layer That Handles It&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Agent answers without querying&lt;/td&gt;
&lt;td&gt;&lt;code&gt;guardrails.py&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent stops after one source&lt;/td&gt;
&lt;td&gt;&lt;code&gt;coverage.py&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent ignores missing evidence lanes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;workflow.py&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent skips cross-source correlation&lt;/td&gt;
&lt;td&gt;&lt;code&gt;workflow.py&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent repeats query shapes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;execution_policy.py&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent loops too long&lt;/td&gt;
&lt;td&gt;&lt;code&gt;policy.py&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Conversation gets too large&lt;/td&gt;
&lt;td&gt;&lt;code&gt;context.py&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Report appears for ordinary questions&lt;/td&gt;
&lt;td&gt;&lt;code&gt;synthesis.py&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The model still has agency. The code does not prescribe exact SQL for a demo&lt;br&gt;
scenario.&lt;/p&gt;

&lt;p&gt;The policy layer just &lt;strong&gt;keeps the model inside the kind of investigation a&lt;br&gt;
human operator would expect.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Slice 6: Persist Runs Before Building A Fancy Chat
&lt;/h2&gt;

&lt;p&gt;The next slice was persistence.&lt;/p&gt;

&lt;p&gt;I started with SQLite because this is a proof-of-concept and local operator&lt;br&gt;
tool, not a multi-tenant SaaS backend.&lt;/p&gt;

&lt;p&gt;The important part was not Postgres. The important part was recording:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;conversation IDs&lt;/li&gt;
&lt;li&gt;user questions&lt;/li&gt;
&lt;li&gt;model used&lt;/li&gt;
&lt;li&gt;final answer&lt;/li&gt;
&lt;li&gt;report payload&lt;/li&gt;
&lt;li&gt;every SQL execution&lt;/li&gt;
&lt;li&gt;row counts and errors&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That made debugging dramatically easier.&lt;/p&gt;

&lt;p&gt;When a run looked bad, I could inspect the exact queries and decide whether the&lt;br&gt;
failure was &lt;strong&gt;prompt, policy, schema, model, or source setup&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This is also why the frontend can hydrate conversations and show evidence&lt;br&gt;
instead of keeping everything only in Redux memory.&lt;/p&gt;
&lt;h2&gt;
  
  
  Slice 7: Build The UI Around Evidence
&lt;/h2&gt;

&lt;p&gt;Only then did the chat UI become valuable.&lt;/p&gt;

&lt;p&gt;The UI was not designed as &lt;strong&gt;"talk to an AI."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It was designed as an &lt;strong&gt;investigation workspace&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The input starts the investigation.&lt;/li&gt;
&lt;li&gt;The agent streams progress.&lt;/li&gt;
&lt;li&gt;SQL queries appear as evidence.&lt;/li&gt;
&lt;li&gt;The trail collapses when the final answer arrives.&lt;/li&gt;
&lt;li&gt;The final answer renders as Markdown.&lt;/li&gt;
&lt;li&gt;Conversations can be revisited because runs are persisted.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That UI decision matters because Coral is visualizable.&lt;/p&gt;

&lt;p&gt;The user can see source counts, SQL queries, row counts, and the final&lt;br&gt;
synthesis.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;ReefWatch shows the route instead of hiding it behind one polished&lt;br&gt;
paragraph.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  Slice 8: Add Source Profiles Last
&lt;/h2&gt;

&lt;p&gt;The last piece was source profiles.&lt;/p&gt;

&lt;p&gt;I did not want the default setup to require every possible token. That creates a&lt;br&gt;
bad demo path.&lt;/p&gt;

&lt;p&gt;Instead, ReefWatch has profiles:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Profile&lt;/th&gt;
&lt;th&gt;Sources&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;triage&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;GitHub, Sentry, Slack, Netlify&lt;/td&gt;
&lt;td&gt;lightweight production triage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;demo&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;triage + PagerDuty&lt;/td&gt;
&lt;td&gt;richer incident response demo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;security&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;GitHub, Slack, Notion, OSV&lt;/td&gt;
&lt;td&gt;compliance/security route&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;enterprise&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;demo + Notion + OSV&lt;/td&gt;
&lt;td&gt;default hackathon showcase&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;observability&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;demo + Datadog + StatusGator&lt;/td&gt;
&lt;td&gt;deeper ops setup&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This keeps the build reproducible.&lt;/p&gt;

&lt;p&gt;A reader can start with &lt;code&gt;triage&lt;/code&gt;, get a real agent working, then add&lt;br&gt;
Notion/OSV/PagerDuty when they want a stronger story.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Agent Loop That Made It Work
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F82jtnm1jno9d24dh9bl1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F82jtnm1jno9d24dh9bl1.png" alt="Loop" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The main loop is intentionally simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Build messages from the user prompt, schema context, Coral guide, shared
source taxonomy, and compressed conversation memory.&lt;/li&gt;
&lt;li&gt;Ask the LLM for tool calls.&lt;/li&gt;
&lt;li&gt;Execute Coral MCP tools.&lt;/li&gt;
&lt;li&gt;Record SQL executions.&lt;/li&gt;
&lt;li&gt;Stream trace events to the UI.&lt;/li&gt;
&lt;li&gt;Apply workflow checkpoints and lightweight execution hygiene.&lt;/li&gt;
&lt;li&gt;Stop when evidence is sufficient or the configured budget is reached.&lt;/li&gt;
&lt;li&gt;Classify the artifact the current request deserves.&lt;/li&gt;
&lt;li&gt;Optionally synthesize a structured incident report.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv9ptt3tqpz1to89yfdx8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv9ptt3tqpz1to89yfdx8.png" alt="Agent loop" width="800" height="825"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The important part is not that the loop is complicated.&lt;/p&gt;

&lt;p&gt;It is that the loop is surrounded by small pieces of judgment.&lt;/p&gt;
&lt;h3&gt;
  
  
  Evidence-first guardrail
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe7p35ehe6anmluadtpwv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe7p35ehe6anmluadtpwv.png" alt="Guardrail" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If the user asks an operational question like "what issues are on my GitHub?"&lt;br&gt;
and the model tries to answer without querying Coral, ReefWatch injects a retry&lt;br&gt;
message:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"You have not queried Coral yet. Do not answer with table recommendations or ask&lt;br&gt;
for repo/org names until you first run metadata/source SQL queries to infer them.""&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This fixed the first embarrassing failure mode: the agent giving me instructions&lt;br&gt;
instead of doing the investigation.&lt;/p&gt;
&lt;h3&gt;
  
  
  Source coverage policy
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ci12fdy25t17gs64s9a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ci12fdy25t17gs64s9a.png" alt="Coverage" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For production triage, one source is almost never enough.&lt;/p&gt;

&lt;p&gt;ReefWatch treats sources as evidence lanes:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Sources&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Ops&lt;/td&gt;
&lt;td&gt;GitHub, Sentry, Netlify, Slack, PagerDuty, StatusGator&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Knowledge&lt;/td&gt;
&lt;td&gt;Notion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security&lt;/td&gt;
&lt;td&gt;GitHub, OSV, Notion, Slack&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Observability&lt;/td&gt;
&lt;td&gt;Datadog&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The policy does not say "always query everything."&lt;/p&gt;

&lt;p&gt;It checks what is actually installed and what the user asked. If the user asks&lt;br&gt;
specifically about GitHub, the coverage stays GitHub-scoped. If the user asks&lt;br&gt;
for production triage, the agent should cover the available ops lanes before&lt;br&gt;
finalizing.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;You have only checked GitHub, but Sentry and Netlify are available, so prefer those lanes next.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That is the kind of judgment I wanted outside the model.&lt;/p&gt;

&lt;p&gt;The important refinement: coverage is a &lt;strong&gt;guide, not a cage&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If the model just discovered the right Sentry project ID or hit a column error,&lt;br&gt;
it is allowed to inspect Coral metadata and correct that source query before&lt;br&gt;
moving on. That matters because real triage has tiny detours:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;find the ID&lt;/li&gt;
&lt;li&gt;inspect the columns&lt;/li&gt;
&lt;li&gt;fix the table/function shape&lt;/li&gt;
&lt;li&gt;then continue the lane plan&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hard-blocking those detours made the agent worse. ReefWatch now nudges the&lt;br&gt;
investigation path without preventing useful schema correction.&lt;/p&gt;

&lt;p&gt;The source lane definitions and shared intent vocabulary live in &lt;code&gt;taxonomy.py&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That small file exists for a boring but important reason: &lt;strong&gt;coverage, budgets,&lt;br&gt;
and intent classification should not each carry their own slightly different&lt;br&gt;
definition of what "incident" means.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The agent is still dynamic. &lt;code&gt;taxonomy.py&lt;/code&gt; does not contain TraceChat queries,&lt;br&gt;
table names for a demo, or source-specific SQL recipes. It only describes the&lt;br&gt;
categories ReefWatch can reason about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ops evidence&lt;/li&gt;
&lt;li&gt;knowledge evidence&lt;/li&gt;
&lt;li&gt;security evidence&lt;/li&gt;
&lt;li&gt;observability evidence&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Coral still discovers the actual tables, functions, filters, and columns at&lt;br&gt;
runtime.&lt;/p&gt;
&lt;h3&gt;
  
  
  Cross-source correlation checkpoint
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvhyj758omo5ae0egip34.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvhyj758omo5ae0egip34.png" alt="Cross-source" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This was the final thing I tightened before the demo.&lt;/p&gt;

&lt;p&gt;Once multiple evidence lanes return concrete anchors, ReefWatch asks for a&lt;br&gt;
Coral-side correlation query instead of letting the model stitch everything&lt;br&gt;
together in prose.&lt;/p&gt;

&lt;p&gt;The preferred shape is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;deploy&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(...),&lt;/span&gt;
     &lt;span class="n"&gt;errors&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(...),&lt;/span&gt;
     &lt;span class="n"&gt;notes&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;deploy&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;notes&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;or, when the relationship is time-based instead of key-based:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;deploy&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(...),&lt;/span&gt;
     &lt;span class="n"&gt;errors&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(...),&lt;/span&gt;
     &lt;span class="n"&gt;notes&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;deploy&lt;/span&gt;
&lt;span class="k"&gt;CROSS&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;notes&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;notes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first_seen&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first_seen&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;deploy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That checkpoint is still source-agnostic. It does not say "for TraceChat, run&lt;br&gt;
this SQL." It says: &lt;strong&gt;if the evidence exposes IDs, URLs, releases, commits,&lt;br&gt;
service names, channel IDs, or timestamps, prove the relationship inside Coral.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If a correlation query fails because of SQL shape, the next instruction is not&lt;br&gt;
"give up." It is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;inspect Coral metadata,&lt;/li&gt;
&lt;li&gt;correct the table, function, column, or filter shape,&lt;/li&gt;
&lt;li&gt;retry with a smaller join.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This made ReefWatch feel much less like a chatbot and much more like an&lt;br&gt;
investigation workbench.&lt;/p&gt;
&lt;h3&gt;
  
  
  Decisive evidence, not accidental emptiness
&lt;/h3&gt;

&lt;p&gt;One subtle failure: a model can run a query with a hallucinated timestamp column,&lt;br&gt;
get zero rows, and conclude "Slack had no evidence."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;That is bad triage.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;ReefWatch treats a filtered zero-row evidence query as not fully decisive until&lt;br&gt;
the model relaxes the filter or inspects the schema.&lt;/p&gt;

&lt;p&gt;A broad zero-row data query can satisfy a lane. A narrow zero-row query with&lt;br&gt;
extra &lt;code&gt;WHERE&lt;/code&gt; filters cannot automatically close the book.&lt;/p&gt;

&lt;p&gt;That small distinction protects against false negatives without hardcoding&lt;br&gt;
Slack or any other source.&lt;/p&gt;
&lt;h3&gt;
  
  
  Scope discipline
&lt;/h3&gt;

&lt;p&gt;Another failure mode showed up with quiet repositories.&lt;/p&gt;

&lt;p&gt;The model would discover the correct repo, then drift into global GitHub&lt;br&gt;
searches anyway.&lt;/p&gt;

&lt;p&gt;The fix was not "hardcode this repository"&lt;/p&gt;

&lt;p&gt;The fix was a &lt;strong&gt;general scope policy.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If ReefWatch has discovered a concrete &lt;code&gt;owner/repo&lt;/code&gt;, and the agent keeps&lt;br&gt;
running broad GitHub searches without &lt;code&gt;repo:owner/repo&lt;/code&gt;, it nudges the agent&lt;br&gt;
back to scoped checks.&lt;/p&gt;

&lt;p&gt;This now lives as workflow guidance rather than a hard execution block. The&lt;br&gt;
point is the same: once a concrete anchor exists, prefer scoped evidence over&lt;br&gt;
another broad search, but still allow a corrective metadata query when the model&lt;br&gt;
needs to fix the route.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhc4du75b43x81xhjvh5f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhc4du75b43x81xhjvh5f.png" alt="Scope policy" width="800" height="285"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Query budgets
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhmmnhbypk7p5mkrh6ak2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhmmnhbypk7p5mkrh6ak2.png" alt="Budget" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The budget is not about limiting Coral.&lt;/p&gt;

&lt;p&gt;Coral SQL queries are cheap compared to LLM loops.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The budget is about preventing agent drift and making the product predictable.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;ReefWatch uses different budgets by request type:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;health checks get a smaller budget&lt;/li&gt;
&lt;li&gt;general triage gets a medium budget&lt;/li&gt;
&lt;li&gt;incident/root-cause prompts get a larger budget&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When the budget is reached, the model must stop querying and produce the best&lt;br&gt;
evidence-backed answer it can, explicitly naming unknowns.&lt;/p&gt;
&lt;h3&gt;
  
  
  Conversation memory
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxcl1e4gbqxijjm91ub9i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxcl1e4gbqxijjm91ub9i.png" alt="Memory" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The UI is conversational, but the product is not trying to become a general chat&lt;br&gt;
companion.&lt;/p&gt;

&lt;p&gt;The conversation flow exists for follow-up investigations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"check the same repo again"&lt;/li&gt;
&lt;li&gt;"what about Sentry?"&lt;/li&gt;
&lt;li&gt;"show me the deployment angle"&lt;/li&gt;
&lt;li&gt;"now make that an incident report"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;ReefWatch persists conversations and runs in SQLite.&lt;/p&gt;

&lt;p&gt;For the agent prompt, it builds a compact context from recent runs and SQL&lt;br&gt;
executions. If the message history gets too large, &lt;code&gt;ContextWindow&lt;/code&gt; compresses&lt;br&gt;
older tool chatter into an execution summary and keeps the latest turns.&lt;/p&gt;

&lt;p&gt;That gives the model continuity without stuffing every old row into the prompt.&lt;/p&gt;
&lt;h3&gt;
  
  
  Intent classification
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkgq5y0y5cofz9577lrx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkgq5y0y5cofz9577lrx.png" alt="Intent" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The first version of ReefWatch used a small keyword policy to decide whether a&lt;br&gt;
run should produce an incident report.&lt;/p&gt;

&lt;p&gt;That was useful as a fallback, but it was too blunt for a real conversation.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;What did it find on Slack?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That follow-up might mention "incident chatter" or "deploy errors" in the&lt;br&gt;
answer, but the user did not ask for a new incident report. They asked for a&lt;br&gt;
source-specific explanation.&lt;/p&gt;

&lt;p&gt;The fix was a &lt;strong&gt;structured intent classifier&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;After the evidence answer is drafted, ReefWatch asks a lightweight structured&lt;br&gt;
LLM step to classify the artifact:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;answer_only&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;incident_report&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;audit_report&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;follow_up&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The prompt is intentionally narrow. It classifies the &lt;strong&gt;current user request&lt;/strong&gt;,&lt;br&gt;
not random words that appear in the answer draft or previous conversation&lt;br&gt;
context.&lt;/p&gt;

&lt;p&gt;There are still deterministic policy boundaries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;report_policy=never&lt;/code&gt; always disables reports&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;report_policy=always&lt;/code&gt; always enables an incident report&lt;/li&gt;
&lt;li&gt;no evidence queries means no report&lt;/li&gt;
&lt;li&gt;if the classifier fails, ReefWatch falls back to a conservative heuristic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the pattern I ended up liking most: let the model handle semantic&lt;br&gt;
intent, but keep product policy outside the model.&lt;/p&gt;
&lt;h3&gt;
  
  
  Report synthesis
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Not every question deserves a report.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If I ask "are there any open issues on my GitHub?", an incident report would be&lt;br&gt;
the wrong artifact.&lt;/p&gt;

&lt;p&gt;If I ask "investigate the production regression," a report is useful.&lt;/p&gt;

&lt;p&gt;The intent classifier decides the artifact. Report synthesis only runs when the&lt;br&gt;
mode is &lt;code&gt;incident_report&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The structured synthesizer gets only the findings, SQL summary, and sources&lt;br&gt;
used.&lt;/p&gt;

&lt;p&gt;It has to stay grounded in the evidence already collected. If evidence is weak,&lt;br&gt;
it must lower confidence rather than invent a root cause.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08jfdc01hgul4kcfltoh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08jfdc01hgul4kcfltoh.png" alt="Mode selection" width="800" height="556"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  The CLI Path
&lt;/h2&gt;

&lt;p&gt;The UI is the best place to watch the investigation unfold.&lt;/p&gt;

&lt;p&gt;The CLI is the best place to prove the plumbing works.&lt;/p&gt;

&lt;p&gt;That split matters for a production agent. Before I ask the model to connect&lt;br&gt;
GitHub, Sentry, Netlify, Slack, PagerDuty, Notion, and OSV into one answer, I&lt;br&gt;
want a boring setup path that can validate each lane by itself.&lt;/p&gt;

&lt;p&gt;ReefWatch exposes that through &lt;code&gt;reefwatch coral&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;uv&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;reefwatch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;coral&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;doctor&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;uv&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;reefwatch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;coral&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;build&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;uv&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;reefwatch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;coral&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;install-profile&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;uv&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;reefwatch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;coral&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;test-source&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;github&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;uv&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;reefwatch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;coral&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;test-source&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;sentry&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;uv&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;reefwatch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;coral&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;test-source&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;netlify&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;uv&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;reefwatch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;coral&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;test-source&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;slack&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;uv&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;reefwatch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;coral&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;test-source&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;pagerduty&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;uv&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;reefwatch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;coral&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;test-source&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;notion&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;uv&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;reefwatch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;coral&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;sql&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"SELECT * FROM pagerduty.abilities LIMIT 5"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The important detail is that the CLI does not invent another integration layer.&lt;/p&gt;

&lt;p&gt;It uses the same Coral configuration and the same MCP transport that the agent&lt;br&gt;
uses. The difference is intent: the CLI is for setup, validation, and scripted&lt;br&gt;
investigations; the web workspace is for watching evidence appear and reading&lt;br&gt;
the final answer.&lt;/p&gt;

&lt;p&gt;For example, a teammate can run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;uv&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;reefwatch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;investigate&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Investigate the current production issue for tracechat-ledger and tell me what needs attention now."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;--trace&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That gives the project a second interface without splitting the product in two.&lt;/p&gt;




&lt;h2&gt;
  
  
  Reproducing The Demo
&lt;/h2&gt;

&lt;p&gt;Here is the practical route another developer can follow.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Clone Coral and ReefWatch
&lt;/h3&gt;

&lt;p&gt;Build Coral locally and point ReefWatch at the binary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;git&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;clone&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;https://github.com/withcoral/coral.git&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;coral&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;cargo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;build&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then configure ReefWatch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;RW_CORAL_EXECUTABLE&lt;/span&gt;&lt;span class="o"&gt;=..&lt;/span&gt;&lt;span class="n"&gt;/coral/target/debug/coral.exe&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nx"&gt;RW_CORAL_REPO_PATH&lt;/span&gt;&lt;span class="o"&gt;=..&lt;/span&gt;&lt;span class="n"&gt;/coral&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nx"&gt;RW_CORAL_CONFIG_DIR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;state/coral&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nx"&gt;RW_SOURCE_PROFILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;enterprise&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Choosing a capable LLM
&lt;/h3&gt;

&lt;p&gt;The LLM I went for at the time of making and testing ReefWatch was &lt;a href="https://openrouter.ai/deepseek/deepseek-v4-pro" rel="noopener noreferrer"&gt;DeepSeek v4 Pro&lt;/a&gt; as it is quite a powerful model for agentic workflows and is very cost efficient for the amount of work it does.&lt;/p&gt;

&lt;p&gt;ReefWatch supports multi-modal LLM requests for the different stages, i.e inference, the main agent loop and the synthesis, so depending on your budget and use-case you can customise it!&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Install the first source set
&lt;/h3&gt;

&lt;p&gt;Start with the sources that give the best incident story without too much setup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub for code, issues, PRs, workflows&lt;/li&gt;
&lt;li&gt;Sentry for runtime errors&lt;/li&gt;
&lt;li&gt;Netlify for deployments&lt;/li&gt;
&lt;li&gt;Slack for human context&lt;/li&gt;
&lt;li&gt;PagerDuty if available&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For the security/compliance variant, add:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Notion for runbooks and policies&lt;/li&gt;
&lt;li&gt;OSV for vulnerability intelligence&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The important UX decision is profiles.&lt;/p&gt;

&lt;p&gt;ReefWatch does not force every source into every demo. It has &lt;code&gt;triage&lt;/code&gt;, &lt;code&gt;demo&lt;/code&gt;,&lt;br&gt;
&lt;code&gt;security&lt;/code&gt;, &lt;code&gt;enterprise&lt;/code&gt;, and &lt;code&gt;observability&lt;/code&gt; profiles so the setup can match&lt;br&gt;
the story.&lt;/p&gt;
&lt;h3&gt;
  
  
  4. Ask one strong prompt
&lt;/h3&gt;

&lt;p&gt;Use a prompt that gives the agent enough intent but not a scripted path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Investigate the current production issue for tracechat-ledger and tell me what
needs attention now.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A good run should show:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;schema discovery from Coral&lt;/li&gt;
&lt;li&gt;GitHub repo/issue resolution&lt;/li&gt;
&lt;li&gt;Sentry project and event lookup&lt;/li&gt;
&lt;li&gt;Netlify site/deploy lookup&lt;/li&gt;
&lt;li&gt;Slack channel/message lookup&lt;/li&gt;
&lt;li&gt;optional Notion runbook lookup&lt;/li&gt;
&lt;li&gt;an answer that labels lanes as confirmed, checked-empty, partial, blocked, or
not-linked&lt;/li&gt;
&lt;li&gt;a report only if the incident shape is present&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What A Quiet Result Should Look Like
&lt;/h2&gt;

&lt;p&gt;Quiet repos are harder than they look.&lt;/p&gt;

&lt;p&gt;A lazy agent says "no issues" after one empty query. A paranoid agent runs 30&lt;br&gt;
searches and still sounds unsure.&lt;/p&gt;

&lt;p&gt;The ReefWatch answer I want is calmer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;I did not find an active issue for &amp;lt;repo name&amp;gt;.

GitHub is checked-empty for open issues and PRs on that repository. I did not
find linked deployment/runtime evidence in the installed sources. No incident
report was generated because this looks like a quiet repository check, not an
active production incident.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the product philosophy in miniature:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;useful&lt;/li&gt;
&lt;li&gt;scoped&lt;/li&gt;
&lt;li&gt;evidence-backed&lt;/li&gt;
&lt;li&gt;not dramatic for no reason&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Some Highlights
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Coral is not a checkbox
&lt;/h3&gt;

&lt;p&gt;ReefWatch depends on Coral's core strengths:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;runtime source discovery&lt;/li&gt;
&lt;li&gt;SQL-first querying&lt;/li&gt;
&lt;li&gt;source manifests&lt;/li&gt;
&lt;li&gt;MCP tool exposure&lt;/li&gt;
&lt;li&gt;local execution&lt;/li&gt;
&lt;li&gt;cacheable schema/guide/tool metadata&lt;/li&gt;
&lt;li&gt;cross-source correlation through common identifiers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The agent does not just "call Coral once."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Coral is the investigation substrate.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The agent is layered
&lt;/h3&gt;

&lt;p&gt;The code is intentionally split:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Responsibility&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MCP adapter&lt;/td&gt;
&lt;td&gt;JSON-RPC over Coral stdio, UTF-8 safety, guide/resources/tools&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Coral session&lt;/td&gt;
&lt;td&gt;Long-lived process and warm cache&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Schema model&lt;/td&gt;
&lt;td&gt;Compact source/table/column context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prompt builder&lt;/td&gt;
&lt;td&gt;Operating contract and live schema context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent loop&lt;/td&gt;
&lt;td&gt;LLM/tool loop and execution recording&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Policy&lt;/td&gt;
&lt;td&gt;Budgets and finalization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Coverage&lt;/td&gt;
&lt;td&gt;Evidence lane requirements and source-level completeness&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Workflow&lt;/td&gt;
&lt;td&gt;Coverage, correlation, correction, and stop checkpoints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Taxonomy&lt;/td&gt;
&lt;td&gt;Shared source lanes and investigation vocabulary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Guardrails&lt;/td&gt;
&lt;td&gt;Evidence-first and missing-source retries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Context&lt;/td&gt;
&lt;td&gt;Conversation compression&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Synthesis&lt;/td&gt;
&lt;td&gt;Optional structured report&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API&lt;/td&gt;
&lt;td&gt;Persistence and SSE streaming&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The model is guided, not spoon-fed
&lt;/h3&gt;

&lt;p&gt;ReefWatch does not hardcode "for tracechat, query these exact tables."&lt;/p&gt;

&lt;p&gt;It gives the model a source-agnostic investigation workflow, then lets Coral's&lt;br&gt;
live catalog expose the actual tables, functions, filters, and source idioms.&lt;/p&gt;


&lt;h2&gt;
  
  
  Closing The Log
&lt;/h2&gt;

&lt;p&gt;Thanks for reading, if you've reached this part!&lt;br&gt;
My teammate and I built ReefWatch for the Coral Hackathon. The experience taught me so much about building autonomous agents from scratch and shaping ReefWatch into a helpful tool.&lt;/p&gt;

&lt;p&gt;The most useful thing Coral gave ReefWatch was not just another integration.&lt;/p&gt;

&lt;p&gt;It gave the agent a way to move through operational data with a consistent&lt;br&gt;
mental model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;discover -&amp;gt; inspect -&amp;gt; query -&amp;gt; correlate -&amp;gt; report
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the difference between a chatbot that knows what tools exist and an&lt;br&gt;
agent that can actually investigate.&lt;/p&gt;

&lt;p&gt;ReefWatch is still a proof-of-concept, but the shape feels right: Coral handles&lt;br&gt;
the source layer, ReefWatch handles the investigation behavior, and the UI shows&lt;br&gt;
the route clearly enough that an operator can trust or challenge the answer.&lt;/p&gt;

&lt;p&gt;That is the kind of agent I wanted to build.&lt;/p&gt;

&lt;p&gt;Not a narrator.&lt;/p&gt;

&lt;p&gt;An investigator.&lt;/p&gt;




&lt;h2&gt;
  
  
  Reference Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/siiddhantt/reefwatch" rel="noopener noreferrer"&gt;ReefWatch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://withcoral.com/docs" rel="noopener noreferrer"&gt;Coral docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/withcoral/coral" rel="noopener noreferrer"&gt;Coral GitHub repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/engineering/building-effective-agents" rel="noopener noreferrer"&gt;Anthropic: Building effective agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/engineering/writing-tools-for-agents" rel="noopener noreferrer"&gt;Anthropic: Writing effective tools for AI agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://openai.github.io/openai-agents-python/tracing/" rel="noopener noreferrer"&gt;OpenAI Agents SDK tracing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://openai.github.io/openai-agents-python/guardrails/" rel="noopener noreferrer"&gt;OpenAI Agents SDK guardrails&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>showdev</category>
      <category>sre</category>
    </item>
  </channel>
</rss>
