Why an All-in-One Data Foundation Matters: Harness, Tape, and a Database-Native Path

#ai #dataengineering #machinelearning #database

From model + Harness to Tape and one data foundation — why agent runtime data should live in the database from the first line of code.

Photo by Brett Jordan on Unsplash

The agent race is quietly shifting from the model layer to the data layer. When agents run, they produce vast volumes of semi-structured trace data — high-frequency writes, long lifecycles — and traditional database architectures struggle to keep up. Shuttling that data back and forth across observability platforms, vector stores, and caches further erodes the efficiency of the record → distill → feed back loop.

This reveals a critical divide:

Building a database downward from an agent framework is not the same as extending a mature database upward to connect with agent frameworks. The starting points differ, and so do the cost structures.

On the latter path, data is a first-class citizen from the first line of code. Run, record, distill, evaluate, and feed back — all within one foundation, without the overhead of cross-system data movement.

That is where an all-in-one data foundation matters: it turns the agent data loop into an internal cycle, not a fragmented patchwork of engineering pieces.

Starting from the definition of Harness, this article draws on the open-source Bub project, discusses layered agent architecture, and lands on a database-native Harness approach — including OceanBase’s exploration and value in this space.

I. Understanding Agents, Harnesses, and How They Relate

A complete agent can be expressed as model + Harness.

Harness covers every engineering component outside the model. Like tack on a horse, Harness is the full toolkit for steering a model toward its destination — reins, saddle, route — which at the engineering level maps to feedback mechanisms, logging systems, and training methods.

A harness has a clear layered structure. Layer 1 is provided by the coding-agent builder or SDK vendor, including base tools and external interfaces. Layer 2 is extended on the user side with the components they need — business logic such as RAG systems, memory systems, and BI pipelines.

In agent scenarios, the model itself is not a continuously stateful system — it returns responses to requests without awareness of concrete business state. What lets agents work reliably in products and teams are the context management, tool invocation, state recording, run-trace tracking, effectiveness evaluation, and data flow responsibilities that Harness takes on.

In this process, we gradually identify and abstract key elements, which we call primitives. System prompts, Skills, task-completion methodologies, and multi-agent communication mechanisms are all important primitives that emerge from practice. Standardizing these primitives and incorporating them into Harness does two things: it improves business performance and expands capability, and it gradually productizes Harness itself.

Data collected from Harness is equally vital. It is used to evaluate workflow effectiveness and, after de-identification, can form standard datasets for training the next generation of models. As models improve, they feed back into primitive discovery and refinement within Harness, even correcting past behavior, and the result is a continuously improving flywheel. The diagram below (from the LangChain blog) illustrates this loop clearly.

II. Building Extensible Agents: The Bub Project

Bub is an open-source Python agent project on GitHub. Its design reflects a key approach to controlling agent complexity: balancing stability and flexibility through a lean kernel and plugin-based extension.

Mainstream agent products such as ChatGPT, Qwen (Alibaba’s conversational AI), ModelScope services, and low-code platforms like Dify and Flowise already ship a built-in agent loop. A core problem remains: agent capability must match the business scenario precisely. Skills and tools can extend capability, but to complete tasks efficiently you still need to assemble a toolset for each specific scenario.

Products such as OpenClaw, Nanobot, and Hermes Agent bundle too many features together. That creates problems on both sides: users face feature interference and cognitive load, while developers face high system complexity and difficult maintenance (for example, OpenClaw upgrades often break many features across the product). Such tightly coupled designs are hard to use as-is in production. Many vendors therefore repackage a specific version or build entirely in-house.

Bub takes a different architectural strategy: build a lightweight kernel and extend capabilities through plugins. The kernel is kept deliberately lean and implements only a stable agent loop; everything else is separated into feature plugins, and business capabilities are introduced step by step. Users need only verify that plugins are working correctly. If a plugin fails, they remove it and restore service, which greatly improves maintainability.
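The kernel-plus-plugins idea can be sketched in a few lines. This is a minimal illustration, not Bub's actual API: the `Kernel` class, its `register`/`remove`/`run` methods, and both sample plugins are hypothetical names chosen for this example.

```python
from typing import Callable, Dict, List

# Hypothetical names throughout; this is not Bub's actual API.
Plugin = Callable[[str], str]


class Kernel:
    """A lean kernel: it only runs the loop and dispatches to plugins."""

    def __init__(self) -> None:
        self.plugins: Dict[str, Plugin] = {}
        self.failed: List[str] = []

    def register(self, name: str, plugin: Plugin) -> None:
        self.plugins[name] = plugin

    def remove(self, name: str) -> None:
        self.plugins.pop(name, None)

    def run(self, message: str) -> str:
        """Pass the message through every plugin and isolate failures."""
        self.failed = []
        for name, plugin in list(self.plugins.items()):
            try:
                message = plugin(message)
            except Exception:
                # A broken plugin must not take down the kernel loop.
                self.failed.append(name)
        return message


def upper_plugin(text: str) -> str:
    return text.upper()


def broken_plugin(text: str) -> str:
    raise RuntimeError("simulated plugin failure")


kernel = Kernel()
kernel.register("upper", upper_plugin)
kernel.register("broken", broken_plugin)

result = kernel.run("hello")
# Remove whatever failed; the remaining plugins keep serving requests.
for name in kernel.failed:
    kernel.remove(name)
```

The point of the design is that the failure stays local: the kernel records which plugin broke, and dropping it restores a clean loop without touching anything else.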

Bub’s core design philosophy is not about how powerful a single agent is, but about how stages are divided within a single interaction. Whether it is Bub’s built-in agent or externally integrated Codex or LangChain, either can get the work done. Bub breaks each interaction into explicit stages — conversation state construction, prompt assembly, channel input/output definitions, and more. This staged breakdown makes flow control possible: hooks expose integration points at each stage, rather than piling all logic into a single agent.
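To make the staged breakdown concrete, here is a small sketch of a hook registry. The stage names, the `on` decorator, and the `handle` function are assumptions made for illustration, loosely following the stages named above; they are not taken from Bub's code.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

# Stage names and the hook registry are illustrative assumptions.
STAGES = ["build_state", "assemble_prompt", "run_agent", "emit_output"]

Context = Dict[str, Any]
Hook = Callable[[Context], None]

hooks: DefaultDict[str, List[Hook]] = defaultdict(list)


def on(stage: str) -> Callable[[Hook], Hook]:
    """Decorator that attaches a hook to one named stage."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")

    def decorator(fn: Hook) -> Hook:
        hooks[stage].append(fn)
        return fn

    return decorator


@on("build_state")
def load_history(ctx: Context) -> None:
    ctx["history"] = ["user: hi"]


@on("assemble_prompt")
def make_prompt(ctx: Context) -> None:
    ctx["prompt"] = "\n".join(ctx["history"] + [f"user: {ctx['input']}"])


@on("run_agent")
def call_agent(ctx: Context) -> None:
    # Any agent could be plugged in here; this stub only inspects the prompt.
    ctx["reply"] = f"prompt has {len(ctx['prompt'].splitlines())} lines"


def handle(user_input: str) -> Context:
    """Run one interaction by walking the stages in order."""
    ctx: Context = {"input": user_input}
    for stage in STAGES:
        for hook in hooks[stage]:
            hook(ctx)
    return ctx


ctx = handle("what is a harness?")
```

Because each stage is an explicit integration point, swapping the agent means replacing one `run_agent` hook rather than rewriting the whole flow.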

A key design decision is removing the mandatory binding between input and output. Traditional systems bind message replies strictly to the input channel. Bub allows an agent to stay silent in certain scenarios and return no message at all. That looks like a flaw in a personal-assistant setting, but in multi-user or multi-agent collaboration, staying silent to avoid noise is a desirable trait.
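A minimal sketch of optional output, assuming a group chat where the agent replies only when mentioned. The function names and the `@bub` mention convention are hypothetical and serve only to show that "no reply" is a valid result.

```python
from typing import List, Optional


def group_chat_agent(message: str, agent_name: str = "bub") -> Optional[str]:
    """Reply only when addressed; otherwise return None (stay silent)."""
    if f"@{agent_name}" not in message:
        return None
    return f"{agent_name}: on it"


def deliver(reply: Optional[str], outbox: List[str]) -> None:
    # The channel is not obliged to send something for every input.
    if reply is not None:
        outbox.append(reply)


outbox: List[str] = []
for msg in ["lunch anyone?", "@bub summarize the thread", "sounds good"]:
    deliver(group_chat_agent(msg), outbox)
```

Three messages arrive, but only the one that mentions the agent produces output, so the shared channel stays free of noise.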

The community is now seeing a wave of approaches that standardize and modularize agent design, for example:

Agents.md: injects system- and task-related prompts.
Skills: package general SOPs (documentation, code review) as distributable assets without hard-coding them into the agent loop.
MCP (Model Context Protocol) and plugins: connect external tools and services, providing IM channel adapters, scheduled tasks, AG-UI visualization, and more.

This is the direction mainstream agent frameworks are moving in 2026. Bub puts this idea into practice: with only a few hundred lines of core interface code, it provides flexible infrastructure.

III. From Context to Data Loop: Tape and Database-Native Harness

1. Building the data loop around Tape

Tape (a core concept in Bub and in AgentSeek, which we are building) is not simple chat history. In some ways it resembles a trace, recording key facts from a single agent run.

Unlike traces in OpenTelemetry and similar observability systems, Tape takes a simpler view: it covers related ground but does not go as deep into detail. Its distinctive value lies in:

Both observability data and a context model — Tape carries observability for critical tasks and serves as the agent’s runtime context model. That means humans and AI can collaborate on the same data view. The agent can read its own Tape to review past behavior.
Enabling agent self-reflection and diagnosis — Traditionally, when an agent fails, engineers troubleshoot through an observability platform. With Tape, users can talk directly to the agent and ask, “Why did that fail just now?” Engineering investigation also becomes a natural conversation with the agent, because root-cause information is already built into its context.
Supporting automated evaluation and analysis — From Tape records, an agent can compare different models, or the same model across different tasks, enabling automated comparative evaluation without relying on human-facing dashboards.
Serving model training — Through de-identified, formatted export, Tape can readily become task-specific datasets for model training and fine-tuning — truly connecting context and observability to model training in a closed data loop.
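The four roles above can be illustrated with a toy data structure. This is a sketch under stated assumptions, not the Tape implementation in Bub or AgentSeek: the `Tape` and `Entry` classes, their fields, and the helper functions are all hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Sequence


@dataclass
class Entry:
    kind: str                 # e.g. "message", "tool_call", "error"
    payload: Dict[str, Any]


@dataclass
class Tape:
    """Append-only record of key facts from one agent run (hypothetical)."""
    run_id: str
    model: str
    entries: List[Entry] = field(default_factory=list)

    def append(self, kind: str, **payload: Any) -> None:
        self.entries.append(Entry(kind, payload))

    def errors(self) -> List[Entry]:
        return [e for e in self.entries if e.kind == "error"]

    def as_context(self) -> str:
        """Render the tape as text the agent can read back."""
        return "\n".join(
            f"[{e.kind}] {json.dumps(e.payload, sort_keys=True)}"
            for e in self.entries
        )


def success_rate_by_model(tapes: List[Tape]) -> Dict[str, float]:
    """Comparative evaluation: share of error-free runs per model."""
    stats: Dict[str, List[int]] = {}
    for tape in tapes:
        ok_and_total = stats.setdefault(tape.model, [0, 0])
        ok_and_total[0] += 0 if tape.errors() else 1
        ok_and_total[1] += 1
    return {model: ok / total for model, (ok, total) in stats.items()}


def export_jsonl(tape: Tape, redact_keys: Sequence[str] = ("user_id",)) -> str:
    """De-identified export: drop redacted keys, one JSON object per line."""
    lines = []
    for e in tape.entries:
        clean = {k: v for k, v in e.payload.items() if k not in redact_keys}
        lines.append(json.dumps(
            {"run_id": tape.run_id, "kind": e.kind, "payload": clean},
            sort_keys=True,
        ))
    return "\n".join(lines)


a = Tape("run-1", "model-a")
a.append("message", role="user", text="ship the report", user_id="u42")
a.append("tool_call", name="send_email", ok=True)

b = Tape("run-2", "model-b")
b.append("message", role="user", text="ship the report", user_id="u42")
b.append("error", reason="send_email timed out")

rates = success_rate_by_model([a, b])
dataset = export_jsonl(a)
```

The same records serve all four purposes: `as_context` feeds the agent's self-diagnosis, `errors` supports observability, `success_rate_by_model` supports evaluation, and `export_jsonl` prepares training data. Real de-identification would need far more than dropping a key; the redaction here only marks where that step belongs.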

2. Why a database-native Harness is needed

Agent systems such as OpenClaw rely heavily on the filesystem (for example, various .md files). That is friendly for humans and agents to read, but poor for data processing and analysis. Modern context engineering needs a Memory layer above raw task trajectories that serves as both a summary of the trajectory and an index into it. Plugins such as lossless-claw in the OpenClaw community later began using databases like SQLite to connect call chains and memory, which suggests that databases are necessary in this layer.

Using a database as the foundation of Harness means all agent runtime data is natively a first-class citizen in the database. Observability, data extraction, and archival analysis can use native database capabilities without maintaining a complex heterogeneous data stack (such as MySQL + Elasticsearch + Redis). That provides a unified data foundation, simplifies architecture, and lowers operational cost.
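As a rough illustration of runtime data living in the database, the sketch below writes run records into a table and answers observability and analysis questions with plain SQL. SQLite from the Python standard library stands in for the database here purely so the example runs anywhere; the table name, columns, and sample rows are assumptions, and nothing in it is specific to OceanBase or AgentSeek.

```python
import json
import sqlite3

# SQLite stands in for the database; the schema is an illustrative assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tape_entries (
        id      INTEGER PRIMARY KEY,
        run_id  TEXT NOT NULL,
        model   TEXT NOT NULL,
        kind    TEXT NOT NULL,
        payload TEXT NOT NULL
    )
    """
)

rows = [
    ("run-1", "model-a", "tool_call", {"name": "search", "ms": 120}),
    ("run-1", "model-a", "message", {"text": "done"}),
    ("run-2", "model-b", "tool_call", {"name": "search", "ms": 340}),
    ("run-2", "model-b", "error", {"reason": "timeout"}),
]
conn.executemany(
    "INSERT INTO tape_entries (run_id, model, kind, payload) VALUES (?, ?, ?, ?)",
    [(run_id, model, kind, json.dumps(p)) for run_id, model, kind, p in rows],
)

# Observability: which runs failed, and why?
failed = conn.execute(
    "SELECT run_id, payload FROM tape_entries WHERE kind = 'error'"
).fetchall()

# Analysis: error count per model, computed where the data already lives.
per_model = dict(
    conn.execute(
        """
        SELECT model, SUM(kind = 'error')
        FROM tape_entries
        GROUP BY model
        ORDER BY model
        """
    ).fetchall()
)
```

Both questions are answered by the same store that received the writes, with no export to a separate search or analytics system in between.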

OceanBase is a strong fit for this path. Why? Its core strengths include:

AI workload readiness — OceanBase and its derived tools provide vector search and hybrid retrieval optimized for AI agent workloads. SQL together with vector and full-text search are built-in capabilities, without maintaining multiple technology stacks.
HTAP capability — As a hybrid transactional/analytical processing database, it directly supports real-time queries and complex analysis on agent runtime data, supporting the data loop.
Unified storage with seamless scale-out — All kinds of data can be stored uniformly, supporting trace analysis, retrieval, and related workloads. From single-node deployment on the edge (such as OceanBase seekdb) it scales seamlessly to a distributed OceanBase cluster, offering a smooth upgrade path as the business grows.

IV. AgentSeek: Exploring Database-Native Harness

Through ongoing exploration of agent architecture, the OceanBase team is building AgentSeek — a Harness built entirely on database-native capabilities.

AgentSeek’s core idea: make agent runtime data a first-class database citizen from day one, helping users build data-loop scenarios. The project combines OceanBase product capabilities with AgentSeek-specific wrappers and is under active development.

Repository: github.com/ob-labs/agentseek

Closing Thoughts

From the layered definition of Harness, to Bub’s plugin-based extensible architecture, to Tape’s integration of observability and context, to the database-native Harness technical path — agent infrastructure is evolving from feature stacking toward data-driven design. OceanBase’s work in this space is both a natural extension of its technical architecture and a response to data-foundation needs in the AI era.

If you are building data-intensive agents today: where does runtime data land in your stack, and what still breaks when you try to close the loop? Share your setup in the comments — or dig into Bub and AgentSeek, and join Data4AI on LinkedIn to compare notes with other Data + AI practitioners.