Building an AI Agent That Queries Operational Data (Not Just Chats)

#webdev #react #opensource

There are two kinds of AI integrations in operational software. The first wraps a language model around documentation and lets users ask questions about how the product works. That is a support chatbot. It is useful, but it does not touch operational data.

The second gives the AI agent structured access to the operational database through purpose-built query tools. The agent can look up orders, search samples, check attribution data, and surface patterns. That is what I built for SampleHQ's AI assistant, and the architecture behind it is worth explaining because the difference between these two approaches is the difference between a toy and a tool.

Why Generic Chat Fails for Operational Data

A generic chatbot connected to a database has two failure modes. Either it generates SQL queries directly, which risks both incorrect results and security vulnerabilities, or it answers from patterns absorbed during training instead of querying anything, which produces numbers that look plausible but are fabricated.

When a sales leader asks "how many samples did we send to Acme Packaging this quarter?" the answer must be exact. Not approximately. Not "based on typical patterns." The answer needs to come from the actual database, reflect the actual data, and be verifiable. A language model generating SQL might join the wrong tables, misinterpret a date range, or return results from the wrong tenant in a multi-tenant system.

The solution is to never let the AI touch the database directly. Instead, give it tools.

The Tool-Based Architecture

Modern language models support function calling, also called tool use. Instead of generating SQL, the model emits a structured request that names a tool and supplies its arguments, and the system executes the matching pre-built function and hands back the result. The model then formats that result into a natural language response.
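To make that flow concrete, here is a minimal sketch of the server side: a tool description shown to the model, and a dispatcher that runs the matching function when the model requests it. The tool name search_orders, its parameters, and the stubbed handler are hypothetical, and each provider has its own wire format for tool definitions, so treat the dictionary shape as illustrative rather than any vendor's exact schema.

```python
import json
from typing import Any, Callable, Dict, List, Optional

# Hypothetical tool description shown to the model (JSON Schema style parameters).
SEARCH_ORDERS_TOOL = {
    "name": "search_orders",
    "description": "Search orders by account name and optional status.",
    "parameters": {
        "type": "object",
        "properties": {
            "account": {"type": "string"},
            "status": {"type": "string", "enum": ["pending", "shipped", "delivered"]},
        },
        "required": ["account"],
    },
}


def search_orders(account: str, status: Optional[str] = None) -> List[Dict[str, Any]]:
    """Stub handler with made-up rows; a real one would delegate to a query builder."""
    orders = [
        {"order_id": 412, "account": "Acme Packaging", "status": "delivered"},
        {"order_id": 413, "account": "Acme Packaging", "status": "pending"},
    ]
    return [
        o for o in orders
        if o["account"] == account and (status is None or o["status"] == status)
    ]


# Registry of the functions the model is allowed to request, keyed by tool name.
TOOL_HANDLERS: Dict[str, Callable[..., List[Dict[str, Any]]]] = {
    "search_orders": search_orders,
}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the tool the model asked for and return a JSON string to send back."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        args = json.loads(arguments_json)
        rows = handler(**args)
    except (TypeError, ValueError) as exc:
        # Malformed JSON or unexpected arguments: report the problem, do not guess.
        return json.dumps({"error": str(exc)})
    return json.dumps({"count": len(rows), "rows": rows})
```

The model only ever chooses among registered names and fills in arguments. A request for an unregistered tool, or for arguments the handler does not accept, comes back as an error payload instead of running anything.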

For SampleHQ, the AI agent has access to several query builders, each designed for a specific domain:

OrderQueryBuilder. Searches orders by account, date range, status, recipient, or sample content. Returns structured results with order IDs, statuses, and delivery dates. The query is parameterized and runs against indexed columns. No SQL generation involved.

SampleQueryBuilder. Searches the sample catalog by name, category, SKU, or availability. Returns structured results that the AI can reference when answering questions about what samples exist or what has been sent most frequently.

AttributionQueryBuilder. Queries the attribution data model: orders linked to deals, credit assignments, close dates, and influenced revenue. This is the most complex query builder because it joins across orders, CRM links, attribution snapshots, and deal cache tables.

CustomerQueryBuilder. Searches customer and contact data, including CRM-synced information. Useful for questions like "what is our history with this account?"

Each query builder validates its inputs, constructs a safe parameterized query, executes it, and returns a structured result object. The AI model never sees raw SQL, never has direct database access, and only ever receives rows that actually exist in the database.
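As a rough illustration of the validate, parameterize, execute, return cycle, the sketch below implements a tiny order query builder against an in-memory SQLite database. The table layout, column names, status values, limits, and the OrderResult type are assumptions made for the example, not SampleHQ's actual schema. The tenant_id comes from the authenticated session rather than from the model, which is one way to keep results inside the right tenant.

```python
import sqlite3
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

ALLOWED_STATUSES = {"pending", "shipped", "delivered"}  # assumed status values


@dataclass(frozen=True)
class OrderResult:
    order_id: int
    account: str
    status: str
    delivered_on: Optional[str]


class OrderQueryBuilder:
    """Hypothetical read-only query builder; the schema here is illustrative."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: int) -> None:
        self.conn = conn
        self.tenant_id = tenant_id  # taken from the session, never from the model

    def search(self, account: str, status: Optional[str] = None,
               created_from: Optional[str] = None, limit: int = 50) -> List[OrderResult]:
        # 1. Validate inputs before any SQL is assembled.
        if not account.strip():
            raise ValueError("account must not be empty")
        if status is not None and status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown status: {status}")
        if created_from is not None:
            date.fromisoformat(created_from)  # raises ValueError on a bad date
        if not 1 <= limit <= 200:
            raise ValueError("limit must be between 1 and 200")

        # 2. Build a parameterized query: values only ever travel as parameters.
        sql = ("SELECT id, account, status, delivered_on FROM orders "
               "WHERE tenant_id = ? AND account = ?")
        params: list = [self.tenant_id, account]
        if status is not None:
            sql += " AND status = ?"
            params.append(status)
        if created_from is not None:
            sql += " AND created_on >= ?"
            params.append(created_from)
        sql += " ORDER BY id LIMIT ?"
        params.append(limit)

        # 3. Execute, then 4. return structured results instead of raw rows.
        rows = self.conn.execute(sql, params).fetchall()
        return [OrderResult(*row) for row in rows]


def make_demo_db() -> sqlite3.Connection:
    """Create a small in-memory database with made-up rows for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, tenant_id INTEGER, "
                 "account TEXT, status TEXT, created_on TEXT, delivered_on TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)", [
        (412, 1, "Acme Packaging", "delivered", "2024-04-02", "2024-04-09"),
        (413, 1, "Acme Packaging", "pending", "2024-05-20", None),
        (900, 2, "Acme Packaging", "delivered", "2024-04-03", "2024-04-10"),
    ])
    return conn
```

With a builder created for tenant 1, searching for Acme Packaging returns orders 412 and 413 and never order 900, which belongs to tenant 2. An injection attempt passed as the account name is compared as a literal string and simply matches nothing.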

Why This Prevents Hallucination

The key insight is that the AI model is constrained to what the tools return. If the OrderQueryBuilder finds no orders matching the criteria, the model gets an empty result set. It cannot invent orders. If the AttributionQueryBuilder returns $42,000 in influenced revenue, the model reports $42,000. It cannot round up to $50,000 because that sounds better.

This is fundamentally different from a model that generates answers from its training data or from unstructured context. The tools act as a ground-truth layer between the model and the data. The model handles natural language understanding (parsing the question) and natural language generation (formatting the answer). The tools handle data retrieval. Neither can do the other's job.

Context Injection

The AI agent is more useful when it has context about what the user is looking at. If a user is on the order details page for Order #412 and asks "when was this delivered?" the agent should know they mean Order #412 without being told.

A context layer injects page context into the AI conversation. The current page, the selected order, the active filters, and the visible data summary are all included in the system prompt. The agent uses this context to interpret ambiguous queries and provide relevant responses without the user having to specify what they are asking about.
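One way such a context layer could assemble the system prompt is sketched below. The PageContext fields, the base prompt wording, and the choice of JSON as the serialization are all assumptions for illustration; the post does not specify the real format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional

BASE_PROMPT = "You are an assistant for operational data. Use tools for every data lookup."


@dataclass
class PageContext:
    """Hypothetical snapshot of what the user is currently looking at."""
    page: str
    selected_order_id: Optional[int] = None
    active_filters: Dict[str, Any] = field(default_factory=dict)
    visible_summary: str = ""


def build_system_prompt(ctx: Optional[PageContext]) -> str:
    """Append page context to the base prompt as a clearly delimited JSON block."""
    if ctx is None:
        return BASE_PROMPT
    # Drop empty fields so the model is not told about context that does not exist.
    payload = {k: v for k, v in asdict(ctx).items() if v not in (None, "", {})}
    return (
        f"{BASE_PROMPT}\n\n"
        "Current page context (resolve words like 'this' or 'these' against it):\n"
        f"{json.dumps(payload, sort_keys=True)}"
    )
```

For a user on the order details page with order 412 selected, the prompt ends with a JSON object carrying the page name and selected_order_id, which is enough for the agent to resolve "when was this delivered?" into a lookup for that order.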

Multi-Provider Support

One design decision that proved valuable: the AI layer supports multiple providers through a factory pattern. The system works with OpenAI and Anthropic's Claude, and reaches other providers through OpenRouter. Each provider uses the same tool definitions and the same query builders. The difference is in the model's reasoning quality and response style.

This matters for two reasons. First, enterprise customers often have approved vendor lists. If their organization has approved Anthropic but not OpenAI, the AI features still work. Second, different models have different strengths. Claude tends to be more careful about caveats and qualifications in analytical responses. GPT-4 tends to be more direct. Letting customers choose their provider gives them the interaction style that fits their preference.

Credentials are stored encrypted per-tenant, and the provider factory instantiates the correct client based on the tenant's configuration. Switching providers does not require any code changes. It is a settings change.
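A minimal sketch of that factory follows. The class names, the TenantAIConfig fields, and the stubbed complete method are hypothetical; a real adapter would call the vendor's SDK where the stub returns a string, and decryption of the stored credential is assumed to happen before the config reaches the factory.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class TenantAIConfig:
    """Hypothetical per-tenant settings; api_key is assumed already decrypted."""
    provider: str
    model: str
    api_key: str


class ChatProvider(ABC):
    """Common interface that every provider adapter implements."""

    def __init__(self, config: TenantAIConfig) -> None:
        self.config = config

    @abstractmethod
    def complete(self, system_prompt: str, user_message: str, tools: List[dict]) -> str:
        """Send one turn to the model, passing along the shared tool definitions."""


class _StubProvider(ChatProvider):
    label = "stub"

    def complete(self, system_prompt: str, user_message: str, tools: List[dict]) -> str:
        # Placeholder: a real adapter would call the vendor SDK here.
        return f"[{self.label}:{self.config.model}] saw {len(tools)} tool(s)"


class OpenAIProvider(_StubProvider):
    label = "openai"


class AnthropicProvider(_StubProvider):
    label = "anthropic"


class OpenRouterProvider(_StubProvider):
    label = "openrouter"


# Adding a provider means adding one adapter class and one registry entry.
PROVIDERS: Dict[str, Callable[[TenantAIConfig], ChatProvider]] = {
    "openai": OpenAIProvider,
    "anthropic": AnthropicProvider,
    "openrouter": OpenRouterProvider,
}


def create_provider(config: TenantAIConfig) -> ChatProvider:
    """Instantiate the adapter named in the tenant's settings."""
    try:
        factory = PROVIDERS[config.provider]
    except KeyError:
        raise ValueError(f"unsupported provider: {config.provider}") from None
    return factory(config)
```

Because callers depend only on ChatProvider, changing a tenant's provider value selects a different adapter while the tool definitions and query builders stay untouched.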

What the Agent Cannot Do

Constraints are as important as capabilities. The AI agent can read data through query builders. It cannot write data. It cannot change order statuses, send emails, modify attribution, or perform any destructive action on its own. Those actions happen only when a user explicitly carries them out through the standard UI.

This is a deliberate design choice. Operational data has consequences. Changing an order status triggers notifications and updates dashboards. Sending a follow-up email reaches an actual customer. These actions should go through the same permission checks and audit logging as any other user action, not through an AI agent that might misinterpret a request.

The agent can draft a follow-up email. It cannot send it. It can suggest that an order should be escalated. It cannot escalate it. The human stays in the loop for every action that changes state.
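The draft-versus-send split can be enforced structurally by registering only read and draft functions as agent tools. The sketch below shows the idea; the function names, the EmailDraft type, and the registry are hypothetical, and the send function is a stand-in for whatever the standard UI actually calls.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class EmailDraft:
    to: str
    subject: str
    body: str
    status: str = "draft"  # the agent can only ever produce drafts


def draft_follow_up_email(to: str, order_id: int) -> EmailDraft:
    """Agent-callable: builds text for a human to review and sends nothing."""
    return EmailDraft(
        to=to,
        subject=f"Following up on order #{order_id}",
        body=f"Hi, checking in on the samples from order #{order_id}.",
    )


def send_email(draft: EmailDraft, confirmed_by_user_id: int) -> str:
    """UI-only action: deliberately NOT registered as an agent tool."""
    # Permission checks and audit logging would run here, as for any user action.
    return f"sent to {draft.to} (confirmed by user {confirmed_by_user_id})"


# Only read and draft functions are exposed to the model.
AGENT_TOOLS: Dict[str, Callable[..., Any]] = {
    "draft_follow_up_email": draft_follow_up_email,
}


def call_agent_tool(name: str, **kwargs: Any) -> Any:
    """Entry point for model-requested calls; anything unregistered is refused."""
    if name not in AGENT_TOOLS:
        raise PermissionError(f"'{name}' is not available to the agent")
    return AGENT_TOOLS[name](**kwargs)
```

Under this arrangement a misread request can at worst produce an unwanted draft. Sending still requires a user to trigger send_email from the UI, where the usual permission checks apply.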

The Pattern

If you are building AI features into operational software, the pattern is: give the model tools, not database access. Build query builders for each data domain. Validate inputs in the query builders. Return structured results. Let the model handle language, not data. Inject page context for relevance. Support multiple providers for flexibility. And never let the agent write data without human confirmation.

This produces an AI assistant that is genuinely useful for operational questions, reliably accurate because it can only return real data, and safe because it cannot modify anything autonomously.

This is the architecture behind SampleHQ's AI assistant: it queries orders, samples, attribution, and customer data through purpose-built tools, and every number it reports comes from the database.