DEV Community

Cover image for Building ReefWatch, a Coral-Powered Production Triage Agent
Siddhant Rai
Siddhant Rai

Posted on

Building ReefWatch, a Coral-Powered Production Triage Agent

Production incidents almost never break in one place.

The alert fires in one tool. The broken deploy is in Netlify. The suspicious
change is in GitHub. The stack trace is in Sentry. The human context is in
Slack. The runbook is in Notion. The "is this actually paging someone?" answer
is in PagerDuty.

A normal chatbot can sound helpful in that situation. It can say things like
"you should check your recent deployments" and "look for related errors in
Sentry."

But that is not triage. That is a polished to-do list.

I wanted something more useful: an agent that could go get the evidence, connect
the dots across sources, show its work, and give an operator-grade answer
grounded in real system data.

The design constraint from the start was simple: no evidence, no answer.

That became ReefWatch, a Coral-powered production triage agent built to
investigate instead of improvise.

GH

It discovers the tools connected to a workspace at runtime, queries them as
evidence, correlates records across systems, and produces a compact answer only
when the facts support one.

Coral became the backbone because it turns the messiest part of agent tooling
into something the model can actually reason about: SQL.


What This Guide Builds

By the end of this route, you will have a blueprint for an agent that can:

  • discover connected Coral sources at runtime
  • query production systems through read-only SQL
  • correlate evidence across code, deploys, errors, alerts, chats, and runbooks
  • stream every query and row count into an inspectable UI
  • run the same investigation workflow from a CLI when you want a scriptable path
  • generate an incident report only when the evidence supports one
  • stay focused with policy layers instead of a giant prompt blob

Investigation Workspace

In one sentence:

ReefWatch is a Coral-powered investigation workspace that lets an agent discover connected tools at runtime, query them with read-only SQL, stream the evidence trail, and generate an incident report only when the facts actually support one.

The ReefWatch Flow


Why Coral Belongs At The Center

MCP is excellent as an integration layer. It gives models a way to call tools
with schemas instead of scraping humans through UI glue.

But if every source becomes a separate collection of bespoke tools, a new
problem appears:

  • the model has to learn many tool shapes
  • every API has different pagination and filters
  • joining records across sources becomes a prompt exercise
  • source-specific errors leak straight into the agent loop
  • every new integration asks the app to own more integration logic

Coral changes the abstraction.

It still uses MCP, but the agent mostly sees a small set of stable capabilities:
discover catalog, inspect schema, read the guide, and run SQL.

That means a new source is not:

teach ReefWatch another SDK
Enter fullscreen mode Exit fullscreen mode

It becomes:

install a Coral source -> discover the tables -> query evidence with SQL
Enter fullscreen mode Exit fullscreen mode

Tool-first flow

ReefWatch flow using Coral

The practical win is boring in the best way: ReefWatch can stay small.

The app does not own GitHub pagination, Sentry auth, Slack table shapes, or
Netlify deploy schemas.

Coral owns that. ReefWatch owns the investigation behavior.

That split also maps well to how I think reliable agents should be built:
ground the model in real environment feedback, keep tools composable, trace the
work, and wrap the loop with small guardrails instead of hoping one perfect
prompt behaves forever.

MCP gives the agent hands. Coral gives it a map and a query language.


The Build Path

If I were rebuilding ReefWatch from scratch, I would not start with the UI.

I would start with the investigation pipeline and make each layer earn its
place.

Remember, it is tempting to start with the surface, but you should make the surface reflect a system that was already worth trusting.

The project came together in eight slices:

Slice What I Built Why It Mattered
1 Coral MCP client Proved Coral could be the data plane
2 Warm Coral session Removed repeated MCP startup cost
3 Schema context Kept prompts aligned with live Coral metadata
4 Minimal agent loop Exposed the real model failure modes
5 Policy modules Made the agent reliable without hardcoding a demo
6 Persistence Made runs debuggable and conversations durable
7 Streaming UI Made the investigation inspectable
8 Source profiles Made setup reproducible without requiring every token

Build Path

The final project shape looks roughly like this:

(a FastAPI backend, with SQLite store and React frontend)

reefwatch/
|-- src/
|   |-- api/
|   |   |-- routes/
|   |   |   |-- coral.py              # Coral health and source setup
|   |   |   |-- conversations.py      # persisted investigation threads
|   |   |   |-- investigations.py     # REST + SSE investigation runs
|   |   |   `-- schema.py             # schema visibility for the UI
|   |   |-- dependencies.py           # shared Settings, store, agent, session
|   |   |-- mappers.py                # domain models to API responses
|   |   `-- schemas.py                # API contracts
|   `-- app/
|       |-- adapters/
|       |   |-- coral_session.py      # long-lived Coral process + warm cache
|       |   |-- mcp_client.py         # JSON-RPC over coral mcp-stdio
|       |   `-- store.py              # SQLite run/conversation persistence
|       |-- agent/
|       |   |-- context.py            # conversation compression
|       |   |-- coverage.py           # evidence lane policy
|       |   |-- events.py             # streamable trace event contracts
|       |   |-- guardrails.py         # evidence-first retries
|       |   |-- intent.py             # structured artifact routing
|       |   |-- execution_policy.py   # duplicate and SQL-shape hygiene
|       |   |-- loop.py               # LLM/tool loop
|       |   |-- policy.py             # budgets and finalization
|       |   |-- prompts.py            # schema-aware operating contract
|       |   |-- schema.py             # Coral table/column context builder
|       |   |-- source_guidance.py    # compact source idiom hints
|       |   |-- taxonomy.py           # source lanes and shared intent vocabulary
|       |   |-- workflow.py           # coverage, correlation, and stop checkpoints
|       |   `-- synthesis.py          # optional incident report synthesis
|       |-- config.py                 # centralized runtime knobs
|       |-- coral_setup.py            # install/test Coral source profiles
|       `-- source_profiles.py        # triage, demo, enterprise profiles
`-- frontend/
    `-- src/
        |-- components/chat/          # chat surface, markdown, evidence trail
        |-- store/                    # conversation/run state
        `-- api/                      # backend client

Enter fullscreen mode Exit fullscreen mode

That structure came from the order of problems I solved.


Slice 1: Prove Coral Can Be The Data Plane

The first backend slice was deliberately small.

I wanted to answer one question:

Can ReefWatch treat Coral as the source of operational truth?

The first proof was:

  1. Launch coral mcp-stdio.
  2. Initialize MCP over JSON-RPC.
  3. Read coral://tables.
  4. Call the sql tool.
  5. Return rows to a plain API endpoint.

At that point, ReefWatch was not an agent yet. It was a thin Coral client.

That was useful because it proved the most important bet: the app could treat
Coral as the data plane
instead of building direct SDK integrations for
GitHub, Slack, Sentry, and every other source.

The first reusable module was mcp_client.py.

It owns the boring but essential transport details:

  • spawn the Coral binary
  • speak JSON-RPC over stdio
  • convert MCP tools to OpenAI-compatible function tools
  • read Coral resources such as coral://guide and coral://tables
  • surface stderr and decoding errors clearly

Design decision: keep transport boring. Once mcp_client.py worked, the
rest of the app could stop thinking about processes and start thinking about
investigations.

Slice 2: Keep Coral Warm

The naive approach would be:

user asks question -> spawn Coral -> discover schema -> ask model -> run SQL
Enter fullscreen mode Exit fullscreen mode

That is fine for a script.

It feels rough in a product.

So the second slice was coral_session.py. It keeps one Coral process alive,
warms the schema/guide/tool cache, and recreates the process if it dies.

That gave ReefWatch a cleaner runtime shape:

app starts -> warm Coral once -> investigations reuse the session
Enter fullscreen mode Exit fullscreen mode

Agent loop

The session cache stores three things:

  • the Coral source schema
  • the Coral guide
  • the OpenAI-compatible tool definitions

That one decision made the product feel different.

Instead of every user prompt waiting on MCP bootstrapping and catalog discovery,
ReefWatch starts from a warm map of the available sources.

There is still a fallback path. If the process dies, CoralSession can recreate
the client and warm the cache again.

The MCP client reads and writes JSON-RPC over stdio with UTF-8 decoding, drains
stderr on a background thread, and reports useful transport errors rather than
hanging silently.

A production triage agent that randomly waits on its own plumbing is not a production triage agent.

Slice 3: Build Schema Context From Coral, Not Memory

Hardcoding source schemas would defeat the point of using Coral.

The agent must discover what is installed right now.

The temptation was to write a hand-authored prompt like:

GitHub has these tables. Sentry has these tables. Slack uses channel ids.
Enter fullscreen mode Exit fullscreen mode

That would have made ReefWatch brittle and less Coral-native.

This was one of those small choices where the architecture either respects the
tool it is built on, or quietly works around it.

Instead, ReefWatch builds its prompt context from Coral itself.

It reads coral://tables, enriches the result with coral.columns, groups
tables by source, and includes only a compact slice of each source in the
prompt.

SELECT schema_name, table_name, column_name, data_type
FROM coral.columns
ORDER BY schema_name, table_name, ordinal_position
Enter fullscreen mode Exit fullscreen mode

The key idea: the model gets a map, not a maze.

If the source catalog is small, the model sees most of it.

If the catalog is large, the model gets enough to start and can use Coral
discovery tools for the rest.

That keeps the prompt useful without pretending the app has permanent knowledge of every source.

Slice 4: Start With The Smallest Useful Agent Loop

Only after Coral transport and schema context worked did I build the agent loop.

The first version of agent/loop.py had one job:

messages -> LLM tool call -> Coral SQL -> tool result -> final answer
Enter fullscreen mode Exit fullscreen mode

That version was intentionally plain.

It let me see the raw failure modes:

  • advice instead of action: the model answered with instructions instead of querying
  • catalog instead of evidence: it listed tables instead of using them
  • unnecessary clarification: it asked for repo names GitHub auth could reveal
  • single-lane tunnel vision: it queried one source and claimed it investigated everything
  • false negatives: it treated a filtered zero-row query as proof of no evidence

Those failures were useful.

They showed which parts belonged in the prompt and which parts deserved
code-level policy.

A bad first agent run is not wasted time if it tells you where the system needs structure.

Slice 5: Add Policy Around The Loop

This was the real turning point.

I stopped trying to make one heroic system prompt do everything.

Instead, I split agent behavior into focused modules:

  • policy.py decides query budgets and finalization behavior.
  • guardrails.py handles evidence-first retries and missing-source retries.
  • coverage.py decides which evidence lanes matter for a request.
  • workflow.py turns coverage and correlation into small checkpoint prompts.
  • execution_policy.py skips duplicate/noisy query shapes and catches table/function syntax mistakes before they hit Coral.
  • context.py compresses conversation history.
  • synthesis.py decides whether a structured report is appropriate.

This was better than making one giant prompt because each module has a clear
reason to exist and can be tested:

Failure Mode Layer That Handles It
Agent answers without querying guardrails.py
Agent stops after one source coverage.py
Agent ignores missing evidence lanes workflow.py
Agent skips cross-source correlation workflow.py
Agent repeats query shapes execution_policy.py
Agent loops too long policy.py
Conversation gets too large context.py
Report appears for ordinary questions synthesis.py

The model still has agency. The code does not prescribe exact SQL for a demo
scenario.

The policy layer just keeps the model inside the kind of investigation a
human operator would expect.

Slice 6: Persist Runs Before Building A Fancy Chat

The next slice was persistence.

I started with SQLite because this is a proof-of-concept and local operator
tool, not a multi-tenant SaaS backend.

The important part was not Postgres. The important part was recording:

  • conversation IDs
  • user questions
  • model used
  • final answer
  • report payload
  • every SQL execution
  • row counts and errors

That made debugging dramatically easier.

When a run looked bad, I could inspect the exact queries and decide whether the
failure was prompt, policy, schema, model, or source setup.

This is also why the frontend can hydrate conversations and show evidence
instead of keeping everything only in Redux memory.

Slice 7: Build The UI Around Evidence

Only then did the chat UI become valuable.

The UI was not designed as "talk to an AI."

It was designed as an investigation workspace:

  1. The input starts the investigation.
  2. The agent streams progress.
  3. SQL queries appear as evidence.
  4. The trail collapses when the final answer arrives.
  5. The final answer renders as Markdown.
  6. Conversations can be revisited because runs are persisted.

That UI decision matters because Coral is visualizable.

The user can see source counts, SQL queries, row counts, and the final
synthesis.

ReefWatch shows the route instead of hiding it behind one polished
paragraph.

Slice 8: Add Source Profiles Last

The last piece was source profiles.

I did not want the default setup to require every possible token. That creates a
bad demo path.

Instead, ReefWatch has profiles:

Profile Sources Use Case
triage GitHub, Sentry, Slack, Netlify lightweight production triage
demo triage + PagerDuty richer incident response demo
security GitHub, Slack, Notion, OSV compliance/security route
enterprise demo + Notion + OSV default hackathon showcase
observability demo + Datadog + StatusGator deeper ops setup

This keeps the build reproducible.

A reader can start with triage, get a real agent working, then add
Notion/OSV/PagerDuty when they want a stronger story.


The Agent Loop That Made It Work

Loop

The main loop is intentionally simple:

  1. Build messages from the user prompt, schema context, Coral guide, shared source taxonomy, and compressed conversation memory.
  2. Ask the LLM for tool calls.
  3. Execute Coral MCP tools.
  4. Record SQL executions.
  5. Stream trace events to the UI.
  6. Apply workflow checkpoints and lightweight execution hygiene.
  7. Stop when evidence is sufficient or the configured budget is reached.
  8. Classify the artifact the current request deserves.
  9. Optionally synthesize a structured incident report.

Agent loop

The important part is not that the loop is complicated.

It is that the loop is surrounded by small pieces of judgment.

Evidence-first guardrail

Guardrail

If the user asks an operational question like "what issues are on my GitHub?"
and the model tries to answer without querying Coral, ReefWatch injects a retry
message:

"You have not queried Coral yet. Do not answer with table recommendations or ask
for repo/org names until you first run metadata/source SQL queries to infer them.""

This fixed the first embarrassing failure mode: the agent giving me instructions
instead of doing the investigation.

Source coverage policy

Coverage

For production triage, one source is almost never enough.

ReefWatch treats sources as evidence lanes:

Category Sources
Ops GitHub, Sentry, Netlify, Slack, PagerDuty, StatusGator
Knowledge Notion
Security GitHub, OSV, Notion, Slack
Observability Datadog

The policy does not say "always query everything."

It checks what is actually installed and what the user asked. If the user asks
specifically about GitHub, the coverage stays GitHub-scoped. If the user asks
for production triage, the agent should cover the available ops lanes before
finalizing.

You have only checked GitHub, but Sentry and Netlify are available, so prefer those lanes next.

That is the kind of judgment I wanted outside the model.

The important refinement: coverage is a guide, not a cage.

If the model just discovered the right Sentry project ID or hit a column error,
it is allowed to inspect Coral metadata and correct that source query before
moving on. That matters because real triage has tiny detours:

  • find the ID
  • inspect the columns
  • fix the table/function shape
  • then continue the lane plan

Hard-blocking those detours made the agent worse. ReefWatch now nudges the
investigation path without preventing useful schema correction.

The source lane definitions and shared intent vocabulary live in taxonomy.py.

That small file exists for a boring but important reason: coverage, budgets,
and intent classification should not each carry their own slightly different
definition of what "incident" means.

The agent is still dynamic. taxonomy.py does not contain TraceChat queries,
table names for a demo, or source-specific SQL recipes. It only describes the
categories ReefWatch can reason about:

  • ops evidence
  • knowledge evidence
  • security evidence
  • observability evidence

Coral still discovers the actual tables, functions, filters, and columns at
runtime.

Cross-source correlation checkpoint

Cross-source

This was the final thing I tightened before the demo.

Once multiple evidence lanes return concrete anchors, ReefWatch asks for a
Coral-side correlation query instead of letting the model stitch everything
together in prose.

The preferred shape is:

WITH deploy AS (...),
     errors AS (...),
     notes AS (...)
SELECT ...
FROM deploy
JOIN errors ON ...
LEFT JOIN notes ON ...
Enter fullscreen mode Exit fullscreen mode

or, when the relationship is time-based instead of key-based:

WITH deploy AS (...),
     errors AS (...),
     notes AS (...)
SELECT ...
FROM deploy
CROSS JOIN errors
LEFT JOIN notes ON notes.ts <= errors.first_seen
WHERE errors.first_seen >= deploy.created_at
Enter fullscreen mode Exit fullscreen mode

That checkpoint is still source-agnostic. It does not say "for TraceChat, run
this SQL." It says: if the evidence exposes IDs, URLs, releases, commits,
service names, channel IDs, or timestamps, prove the relationship inside Coral.

If a correlation query fails because of SQL shape, the next instruction is not
"give up." It is:

  1. inspect Coral metadata,
  2. correct the table, function, column, or filter shape,
  3. retry with a smaller join.

This made ReefWatch feel much less like a chatbot and much more like an
investigation workbench.

Decisive evidence, not accidental emptiness

One subtle failure: a model can run a query with a hallucinated timestamp column,
get zero rows, and conclude "Slack had no evidence."

That is bad triage.

ReefWatch treats a filtered zero-row evidence query as not fully decisive until
the model relaxes the filter or inspects the schema.

A broad zero-row data query can satisfy a lane. A narrow zero-row query with
extra WHERE filters cannot automatically close the book.

That small distinction protects against false negatives without hardcoding
Slack or any other source.

Scope discipline

Another failure mode showed up with quiet repositories.

The model would discover the correct repo, then drift into global GitHub
searches anyway.

The fix was not "hardcode this repository"

The fix was a general scope policy.

If ReefWatch has discovered a concrete owner/repo, and the agent keeps
running broad GitHub searches without repo:owner/repo, it nudges the agent
back to scoped checks.

This now lives as workflow guidance rather than a hard execution block. The
point is the same: once a concrete anchor exists, prefer scoped evidence over
another broad search, but still allow a corrective metadata query when the model
needs to fix the route.

Scope policy

Query budgets

Budget

The budget is not about limiting Coral.

Coral SQL queries are cheap compared to LLM loops.

The budget is about preventing agent drift and making the product predictable.

ReefWatch uses different budgets by request type:

  • health checks get a smaller budget
  • general triage gets a medium budget
  • incident/root-cause prompts get a larger budget

When the budget is reached, the model must stop querying and produce the best
evidence-backed answer it can, explicitly naming unknowns.

Conversation memory

Memory

The UI is conversational, but the product is not trying to become a general chat
companion.

The conversation flow exists for follow-up investigations:

  • "check the same repo again"
  • "what about Sentry?"
  • "show me the deployment angle"
  • "now make that an incident report"

ReefWatch persists conversations and runs in SQLite.

For the agent prompt, it builds a compact context from recent runs and SQL
executions. If the message history gets too large, ContextWindow compresses
older tool chatter into an execution summary and keeps the latest turns.

That gives the model continuity without stuffing every old row into the prompt.

Intent classification

Intent

The first version of ReefWatch used a small keyword policy to decide whether a
run should produce an incident report.

That was useful as a fallback, but it was too blunt for a real conversation.

For example:

What did it find on Slack?
Enter fullscreen mode Exit fullscreen mode

That follow-up might mention "incident chatter" or "deploy errors" in the
answer, but the user did not ask for a new incident report. They asked for a
source-specific explanation.

The fix was a structured intent classifier.

After the evidence answer is drafted, ReefWatch asks a lightweight structured
LLM step to classify the artifact:

  • answer_only
  • incident_report
  • audit_report
  • follow_up

The prompt is intentionally narrow. It classifies the current user request,
not random words that appear in the answer draft or previous conversation
context.

There are still deterministic policy boundaries:

  • report_policy=never always disables reports
  • report_policy=always always enables an incident report
  • no evidence queries means no report
  • if the classifier fails, ReefWatch falls back to a conservative heuristic

This is the pattern I ended up liking most: let the model handle semantic
intent, but keep product policy outside the model.

Report synthesis

Not every question deserves a report.

If I ask "are there any open issues on my GitHub?", an incident report would be
the wrong artifact.

If I ask "investigate the production regression," a report is useful.

The intent classifier decides the artifact. Report synthesis only runs when the
mode is incident_report.

The structured synthesizer gets only the findings, SQL summary, and sources
used.

It has to stay grounded in the evidence already collected. If evidence is weak,
it must lower confidence rather than invent a root cause.

Mode selection


The CLI Path

The UI is the best place to watch the investigation unfold.

The CLI is the best place to prove the plumbing works.

That split matters for a production agent. Before I ask the model to connect
GitHub, Sentry, Netlify, Slack, PagerDuty, Notion, and OSV into one answer, I
want a boring setup path that can validate each lane by itself.

ReefWatch exposes that through reefwatch coral:

uv run reefwatch coral doctor
uv run reefwatch coral build
uv run reefwatch coral install-profile
uv run reefwatch coral test-source github
uv run reefwatch coral test-source sentry
uv run reefwatch coral test-source netlify
uv run reefwatch coral test-source slack
uv run reefwatch coral test-source pagerduty
uv run reefwatch coral test-source notion
uv run reefwatch coral sql "SELECT * FROM pagerduty.abilities LIMIT 5"
Enter fullscreen mode Exit fullscreen mode

The important detail is that the CLI does not invent another integration layer.

It uses the same Coral configuration and the same MCP transport that the agent
uses. The difference is intent: the CLI is for setup, validation, and scripted
investigations; the web workspace is for watching evidence appear and reading
the final answer.

For example, a teammate can run:

uv run reefwatch investigate "Investigate the current production issue for tracechat-ledger and tell me what needs attention now." --trace
Enter fullscreen mode Exit fullscreen mode

That gives the project a second interface without splitting the product in two.


Reproducing The Demo

Here is the practical route another developer can follow.

1. Clone Coral and ReefWatch

Build Coral locally and point ReefWatch at the binary:

git clone https://github.com/withcoral/coral.git
cd coral
cargo build
Enter fullscreen mode Exit fullscreen mode

Then configure ReefWatch:

RW_CORAL_EXECUTABLE=../coral/target/debug/coral.exe
RW_CORAL_REPO_PATH=../coral
RW_CORAL_CONFIG_DIR=state/coral
RW_SOURCE_PROFILE=enterprise
Enter fullscreen mode Exit fullscreen mode

2. Choosing a capable LLM

The LLM I went for at the time of making and testing ReefWatch was DeepSeek v4 Pro as it is quite a powerful model for agentic workflows and is very cost efficient for the amount of work it does.

ReefWatch supports multi-modal LLM requests for the different stages, i.e inference, the main agent loop and the synthesis, so depending on your budget and use-case you can customise it!

3. Install the first source set

Start with the sources that give the best incident story without too much setup:

  • GitHub for code, issues, PRs, workflows
  • Sentry for runtime errors
  • Netlify for deployments
  • Slack for human context
  • PagerDuty if available

For the security/compliance variant, add:

  • Notion for runbooks and policies
  • OSV for vulnerability intelligence

The important UX decision is profiles.

ReefWatch does not force every source into every demo. It has triage, demo,
security, enterprise, and observability profiles so the setup can match
the story.

4. Ask one strong prompt

Use a prompt that gives the agent enough intent but not a scripted path:

Investigate the current production issue for tracechat-ledger and tell me what
needs attention now.
Enter fullscreen mode Exit fullscreen mode

A good run should show:

  • schema discovery from Coral
  • GitHub repo/issue resolution
  • Sentry project and event lookup
  • Netlify site/deploy lookup
  • Slack channel/message lookup
  • optional Notion runbook lookup
  • an answer that labels lanes as confirmed, checked-empty, partial, blocked, or not-linked
  • a report only if the incident shape is present

What A Quiet Result Should Look Like

Quiet repos are harder than they look.

A lazy agent says "no issues" after one empty query. A paranoid agent runs 30
searches and still sounds unsure.

The ReefWatch answer I want is calmer:

I did not find an active issue for <repo name>.

GitHub is checked-empty for open issues and PRs on that repository. I did not
find linked deployment/runtime evidence in the installed sources. No incident
report was generated because this looks like a quiet repository check, not an
active production incident.
Enter fullscreen mode Exit fullscreen mode

That is the product philosophy in miniature:

  • useful
  • scoped
  • evidence-backed
  • not dramatic for no reason

Some Highlights

Coral is not a checkbox

ReefWatch depends on Coral's core strengths:

  • runtime source discovery
  • SQL-first querying
  • source manifests
  • MCP tool exposure
  • local execution
  • cacheable schema/guide/tool metadata
  • cross-source correlation through common identifiers

The agent does not just "call Coral once."

Coral is the investigation substrate.

The agent is layered

The code is intentionally split:

Layer Responsibility
MCP adapter JSON-RPC over Coral stdio, UTF-8 safety, guide/resources/tools
Coral session Long-lived process and warm cache
Schema model Compact source/table/column context
Prompt builder Operating contract and live schema context
Agent loop LLM/tool loop and execution recording
Policy Budgets and finalization
Coverage Evidence lane requirements and source-level completeness
Workflow Coverage, correlation, correction, and stop checkpoints
Taxonomy Shared source lanes and investigation vocabulary
Guardrails Evidence-first and missing-source retries
Context Conversation compression
Synthesis Optional structured report
API Persistence and SSE streaming

The model is guided, not spoon-fed

ReefWatch does not hardcode "for tracechat, query these exact tables."

It gives the model a source-agnostic investigation workflow, then lets Coral's
live catalog expose the actual tables, functions, filters, and source idioms.


Closing The Log

Thanks for reading, if you've reached this part!
My teammate and I built ReefWatch for the Coral Hackathon. The experience taught me so much about building autonomous agents from scratch and shaping ReefWatch into a helpful tool.

The most useful thing Coral gave ReefWatch was not just another integration.

It gave the agent a way to move through operational data with a consistent
mental model:

discover -> inspect -> query -> correlate -> report
Enter fullscreen mode Exit fullscreen mode

That is the difference between a chatbot that knows what tools exist and an
agent that can actually investigate.

ReefWatch is still a proof-of-concept, but the shape feels right: Coral handles
the source layer, ReefWatch handles the investigation behavior, and the UI shows
the route clearly enough that an operator can trust or challenge the answer.

That is the kind of agent I wanted to build.

Not a narrator.

An investigator.


Reference Links

Top comments (0)