Carlosmgs111
How I used DDD and hexagonal architecture to build klay+ — a flexible, provider-agnostic RAG infrastructure you can plug into any project.

The Problem Everyone's Having

"My chunking strategy is hardcoded everywhere. I want to experiment with different approaches but changing it means touching 15 files."

RAG is everywhere right now. But most implementations share the same problem — they're built as scripts, not as infrastructure. They work for the demo, they work for the first provider, and then they break the moment something needs to change.
I kept seeing this pattern and thought: what if you could build RAG infrastructure the way you'd build any serious backend system? With clear boundaries, swappable providers, and an architecture that actually survives evolving requirements?
That's what klay+ is — a RAG infrastructure toolkit built with DDD and hexagonal architecture in TypeScript. You integrate it into your project, pick your providers, define your processing strategies, and the architecture handles the rest.
This article is about how it's structured and why.


Why DDD for RAG Infrastructure?

At first glance, DDD seems like overkill for a RAG pipeline. Ingest documents, chunk them, embed them, search. Four steps, right?
But look closer. A RAG system that's actually useful in production has to deal with:

  • Multiple input formats (PDF, markdown, plain text) with different extraction logic
  • Configurable chunking (recursive, sentence-based, fixed-size) that you need to experiment with
  • Pluggable embedding providers that you might swap mid-project
  • Knowledge versioning so you can track why search results changed
  • Multiple runtimes if you want offline/browser support

These aren't steps in a pipeline. They're independent domains with their own rules, their own lifecycles, and their own reasons to change. That's literally the use case for bounded contexts.


The 4 Bounded Contexts

After a few iterations (and a few wrong turns), I ended up with four contexts. Each one owns a piece of the RAG pipeline and communicates through service facades.

1. Source Ingestion
Handles everything related to content acquisition and source-level knowledge management. A consumer of klay+ feeds it a PDF, text, or markdown — this context validates input, creates a Source aggregate, kicks off an ExtractionJob, and manages the SourceKnowledge hub that bridges raw content with its semantic projections.

const source = Source.create({
  name: "research-paper.pdf",
  type: SourceType.PDF,
  metadata: { pages: 42, author: "..." }
});

const job = ExtractionJob.create({
  sourceId: source.id,
  strategy: ExtractionStrategy.PDF
});

ExtractionJob is a separate aggregate because extraction can fail, retry, and run async — it has its own lifecycle. And SourceKnowledge lives here too because one source can produce multiple projections over time (different chunking, different embedding model), so tracking that history is part of managing the source itself.

2. Context Management
Groups knowledge sources into queryable collections and tracks lineage. The KnowledgeLineage aggregate records every transformation applied to content.

const lineage = KnowledgeLineage.create({
  contextId: context.id,
  sourceId: source.id,
  transformations: [
    { type: "chunking", strategy: "recursive", chunkSize: 512 },
    { type: "embedding", provider: "openai", model: "text-embedding-3-small" }
  ]
});

This is the answer to "I changed my chunking strategy and now results are worse — what happened?" Without lineage, you're guessing. With it, you can diff configurations and roll back.
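A diff between two lineage records can be as simple as comparing their transformation lists. This helper is a hypothetical sketch (the shapes are simplified from the example above, and klay+ may expose something richer):

```typescript
// Hypothetical helper: diff two lineage transformation lists to see
// what changed between two projections of the same source.
type Transformation = { type: string } & Record<string, unknown>;

function diffLineage(before: Transformation[], after: Transformation[]): string[] {
  const changes: string[] = [];
  // Index each list by transformation type, serializing the config
  // so any parameter change shows up as a string mismatch.
  const byType = (list: Transformation[]) =>
    new Map(list.map((t) => [t.type, JSON.stringify(t)]));
  const a = byType(before);
  const b = byType(after);
  for (const [type, cfg] of b) {
    if (!a.has(type)) changes.push(`added ${type}`);
    else if (a.get(type) !== cfg) changes.push(`changed ${type}`);
  }
  for (const type of a.keys()) {
    if (!b.has(type)) changes.push(`removed ${type}`);
  }
  return changes;
}
```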

3. Semantic Processing
Orchestrates the chunking → embedding pipeline. Owns SemanticProjection and ProcessingProfile.

const profile = ProcessingProfile.create({
  chunkingStrategy: "recursive",
  chunkSize: 512,
  overlap: 50,
  embeddingProvider: "openai",
  embeddingModel: "text-embedding-3-small"
});

The ProviderRegistry pattern is what makes the multi-provider promise real. A factory resolves the correct implementation based on config. Your project uses OpenAI today? Cool. Need to switch to a local model tomorrow? Implement one interface. The domain doesn't care.
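The registry idea can be sketched in a few lines. The interface and class names here are illustrative assumptions, not klay+'s actual code; the point is that the domain depends only on a port, and concrete providers register themselves behind it:

```typescript
// Minimal sketch of the registry/factory pattern: the domain sees only
// the EmbeddingProvider port, never a concrete vendor SDK.
interface EmbeddingProvider {
  embed(text: string): Promise<number[]>;
}

class ProviderRegistry {
  private providers = new Map<string, () => EmbeddingProvider>();

  register(name: string, factory: () => EmbeddingProvider): void {
    this.providers.set(name, factory);
  }

  resolve(name: string): EmbeddingProvider {
    const factory = this.providers.get(name);
    if (!factory) throw new Error(`Unknown embedding provider: ${name}`);
    return factory();
  }
}

// Swapping providers is one register() call; pipeline code never changes.
const registry = new ProviderRegistry();
registry.register("fake-local", () => ({
  embed: async (text) => [text.length], // stand-in for a real model
}));
```

Config picks the name, the registry picks the implementation, and nothing upstream is touched.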

4. Knowledge Retrieval
The simplest context, intentionally. Read-only. Takes a query, computes its embedding, ranks passages by cosine similarity.

const queryEmbedding = await embeddingProvider.embed(query);
const results = await vectorStore.findSimilar(queryEmbedding, {
  threshold: 0.7,
  limit: 10,
  contextId: activeContext.id
});

Separate context because retrieval is read-heavy and latency-sensitive — totally different scaling profile from processing. You don't want optimizing one to break the other.
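The retrieval math itself is small. Here is a self-contained sketch of cosine ranking with a threshold and limit, mirroring the options above; klay+'s actual vector store interface may differ:

```typescript
// Cosine similarity between two dense vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank stored passages against a query embedding: filter by threshold,
// sort by score descending, cap at the limit.
function findSimilarSketch(
  query: number[],
  passages: { id: string; vector: number[] }[],
  opts: { threshold: number; limit: number }
): { id: string; score: number }[] {
  return passages
    .map((p) => ({ id: p.id, score: cosine(query, p.vector) }))
    .filter((r) => r.score >= opts.threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, opts.limit);
}
```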


The Shared Kernel

All contexts build on the same foundational abstractions. This is the only code that crosses boundaries.
Entity & AggregateRoot

abstract class Entity<Id> {
  protected readonly _id: Id;

  protected constructor(id: Id) {
    this._id = id;
  }

  equals(other: Entity<Id>): boolean {
    return this._id === other._id;
  }
}

abstract class AggregateRoot<Id> extends Entity<Id> {
  private _events: DomainEvent[] = [];

  protected record(event: DomainEvent): void {
    this._events.push(event);
  }

  clearEvents(): DomainEvent[] {
    const events = [...this._events];
    this._events = [];
    return events;
  }
}
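To see how the event recording is used, here is a hypothetical aggregate built on the same shape as the base class above. The names (SourceSketch, SourceCreated) are illustrative, not klay+'s real aggregates:

```typescript
// Illustrative use of the AggregateRoot pattern: an aggregate records
// domain events as it changes, and the application layer drains them
// (e.g. after persistence) to publish. Names are hypothetical.
interface DomainEvent {
  name: string;
}

abstract class AggregateRootSketch {
  private _events: DomainEvent[] = [];

  protected record(event: DomainEvent): void {
    this._events.push(event);
  }

  clearEvents(): DomainEvent[] {
    const events = [...this._events];
    this._events = [];
    return events;
  }
}

class SourceSketch extends AggregateRootSketch {
  constructor(readonly id: string) {
    super();
    this.record({ name: "SourceCreated" });
  }
}
```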

ValueObject — Immutable by Default

abstract class ValueObject<T> {
  protected readonly props: Readonly<T>;

  constructor(props: T) {
    this.props = Object.freeze(props);
  }

  equals(other: ValueObject<T>): boolean {
    return JSON.stringify(this.props) === JSON.stringify(other.props);
  }
}

Result — Because Try-Catch Isn't a Strategy

Every domain operation returns a Result. No exceptions for expected failures, no null propagating silently through three layers.

class Result<E, T> {
  static ok<T>(value: T): Result<never, T>;
  static fail<E>(error: E): Result<E, never>;

  isOk(): boolean;
  isFail(): boolean;
  map<U>(fn: (value: T) => U): Result<E, U>;
  flatMap<U>(fn: (value: T) => Result<E, U>): Result<E, U>;
  match<U>(handlers: { ok: (v: T) => U; fail: (e: E) => U }): U;
}
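Here is roughly how that chaining plays out, using a minimal working Result plus a toy domain operation. This is a sketch: klay+'s real implementation has more combinators, and parseChunkSize is a made-up example:

```typescript
// Minimal working Result to illustrate the chaining style.
class ResultSketch<E, T> {
  private constructor(
    private readonly error: E | null,
    private readonly value: T | null
  ) {}

  static ok<E, T>(value: T): ResultSketch<E, T> {
    return new ResultSketch<E, T>(null, value);
  }
  static fail<E, T>(error: E): ResultSketch<E, T> {
    return new ResultSketch<E, T>(error, null);
  }

  isOk(): boolean {
    return this.error === null;
  }

  map<U>(fn: (value: T) => U): ResultSketch<E, U> {
    return this.isOk()
      ? ResultSketch.ok<E, U>(fn(this.value as T))
      : ResultSketch.fail<E, U>(this.error as E);
  }

  match<U>(handlers: { ok: (v: T) => U; fail: (e: E) => U }): U {
    return this.isOk()
      ? handlers.ok(this.value as T)
      : handlers.fail(this.error as E);
  }
}

// A domain operation returns a value for expected failures; it never throws.
function parseChunkSize(raw: number): ResultSketch<string, number> {
  return raw > 0
    ? ResultSketch.ok(raw)
    : ResultSketch.fail("chunk size must be positive");
}

const message = parseChunkSize(512)
  .map((n) => `chunking at ${n} tokens`)
  .match({ ok: (m) => m, fail: (e) => `error: ${e}` });
```

The failure path flows through the same pipeline as the success path; nothing upstream needs a try-catch.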

Repository — Three Methods

interface Repository<T, Id> {
  save(entity: T): Promise<Result<PersistenceError, void>>;
  findById(id: Id): Promise<Result<PersistenceError, T | null>>;
  delete(id: Id): Promise<Result<PersistenceError, void>>;
}

Each context defines its own repositories. Infrastructure implements them. The domain doesn't know if it's talking to NeDB, IndexedDB, or a potato.
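An infrastructure adapter can be as small as a Map. This in-memory sketch drops the Result wrapper for brevity and uses a simplified interface, so treat it as an illustration of the port/adapter split rather than klay+'s actual repository contract:

```typescript
// Simplified port: the domain only ever sees this interface.
// (klay+'s version wraps returns in Result; omitted here for brevity.)
interface RepoSketch<T extends { id: string }> {
  save(entity: T): Promise<void>;
  findById(id: string): Promise<T | null>;
  delete(id: string): Promise<void>;
}

// Hypothetical adapter: same interface could be backed by NeDB,
// IndexedDB, or anything else without the domain noticing.
class InMemoryRepo<T extends { id: string }> implements RepoSketch<T> {
  private store = new Map<string, T>();

  async save(entity: T): Promise<void> {
    this.store.set(entity.id, entity);
  }
  async findById(id: string): Promise<T | null> {
    return this.store.get(id) ?? null;
  }
  async delete(id: string): Promise<void> {
    this.store.delete(id);
  }
}
```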


The 6 Principles (Enforced by Tests)

Not guidelines. Invariants. Break one and CI breaks.

1. Dependency Rule — Inward only. Adapters → Application → Contexts → Shared Kernel.
2. Tell, Don't Ask — You don't inspect aggregate state. You tell it what to do and get a Result.
3. Port Isolation — Each context defines its own interfaces. Semantic Processing doesn't know NeDB exists.
4. Composition over Inheritance — Only DDD building blocks inherit. Everything else composes.
5. Result-Based Errors — Domain failures are values, not exceptions.
6. Illegal States Are Unrepresentable — A ChunkSize of -1? Can't construct it.
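Principle 6 in miniature: a value object whose constructor is private, so the only way in is a factory that validates. The names and the error-return shape here are illustrative assumptions about how such a type might look, not klay+'s actual code:

```typescript
// A ChunkSize that cannot be constructed in an invalid state.
// The private constructor means every live instance passed validation.
class ChunkSizeSketch {
  private constructor(readonly value: number) {}

  static create(value: number): ChunkSizeSketch | Error {
    if (!Number.isInteger(value) || value <= 0) {
      // Returned as a value (Result-style), not thrown.
      return new Error(`invalid chunk size: ${value}`);
    }
    return new ChunkSizeSketch(value);
  }
}
```

Once this compiles, a ChunkSize of -1 simply cannot exist anywhere downstream, so no defensive checks are needed at the point of use.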


What This Enables for Your Project

The whole point of klay+ being infrastructure (not a product) is that it adapts to your context:

  • Swap providers without rewriting pipeline code — implement one interface
  • Experiment with chunking strategies by changing a config, not refactoring files
  • Track why search results changed through immutable lineage
  • Run server-side or browser-side — same logic, pick your runtime
  • 13 tests ensuring the architecture holds as you extend it

No vendor lock-in. No "works for the demo but breaks in production." Just a solid foundation for RAG that you plug into your stack.


Check It Out

klay+ is open source: github.com/Carlosmgs111/klay-plus
If you've ever wished your RAG pipeline had real architecture instead of glue code, take a look. Star it, fork it, open an issue, tell me what you'd change.
And if you've built RAG infrastructure yourself — what worked? What would you do differently? The comments are open.


Part 1 of the klay+ Architecture Series. Next: Hexagonal Architecture with Astro + TypeScript — how your framework and your domain can coexist without pain.
