<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Carlosmgs111</title>
    <description>The latest articles on DEV Community by Carlosmgs111 (@carlosmgs111).</description>
    <link>https://dev.to/carlosmgs111</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F845469%2Fb51c6de4-23ac-43c1-910c-26f173b1cd11.jpg</url>
      <title>DEV Community: Carlosmgs111</title>
      <link>https://dev.to/carlosmgs111</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/carlosmgs111"/>
    <language>en</language>
    <item>
      <title>How I used DDD and hexagonal architecture to build klay+ — a flexible, provider-agnostic RAG infrastructure you can plug into any project.</title>
      <dc:creator>Carlosmgs111</dc:creator>
      <pubDate>Fri, 27 Mar 2026 18:21:26 +0000</pubDate>
      <link>https://dev.to/carlosmgs111/how-i-used-ddd-and-hexagonal-architecture-to-build-klay-a-flexible-provider-agnostic-rag-5e20</link>
      <guid>https://dev.to/carlosmgs111/how-i-used-ddd-and-hexagonal-architecture-to-build-klay-a-flexible-provider-agnostic-rag-5e20</guid>
      <description>&lt;p&gt;&lt;strong&gt;The Problem Everyone's Having&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;"My chunking strategy is hardcoded everywhere. I want to experiment with different approaches but changing it means touching 15 files."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;RAG&lt;/strong&gt;&lt;/em&gt; is everywhere right now. But most implementations share the same problem — they're built as scripts, not as infrastructure. They work for the demo, they work for the first provider, and then they break the moment something needs to change.&lt;br&gt;
I kept seeing this pattern and thought: what if you could build &lt;em&gt;&lt;strong&gt;RAG&lt;/strong&gt;&lt;/em&gt; infrastructure the way you'd build any serious backend system? With clear boundaries, swappable providers, and an architecture that actually survives evolving requirements?&lt;br&gt;
That's what &lt;em&gt;&lt;strong&gt;klay+&lt;/strong&gt;&lt;/em&gt; is — a &lt;em&gt;&lt;strong&gt;RAG&lt;/strong&gt;&lt;/em&gt; infrastructure toolkit built with DDD and hexagonal architecture in TypeScript. You integrate it into your project, pick your providers, define your processing strategies, and the architecture handles the rest.&lt;br&gt;
This article is about how it's structured and why.&lt;/p&gt;


&lt;h2&gt;
  
  
  Why DDD for RAG Infrastructure?
&lt;/h2&gt;

&lt;p&gt;At first glance, DDD seems like overkill for a RAG pipeline. Ingest documents, chunk them, embed them, search. Four steps, right?&lt;br&gt;
But look closer. A RAG system that's actually useful in production has to deal with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multiple input formats&lt;/strong&gt; (PDF, markdown, plain text) with different extraction logic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configurable chunking&lt;/strong&gt; (recursive, sentence-based, fixed-size) that you need to experiment with&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pluggable embedding&lt;/strong&gt; providers that you might swap mid-project&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Knowledge versioning&lt;/strong&gt; so you can track why search results changed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multiple runtimes&lt;/strong&gt; if you want offline/browser support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These aren't steps in a pipeline. They're independent domains with their own rules, their own lifecycles, and their own reasons to change. That's literally the use case for bounded contexts.&lt;/p&gt;


&lt;h2&gt;
  
  
  The 4 Bounded Contexts
&lt;/h2&gt;

&lt;p&gt;After a few iterations (and a few wrong turns), I ended up with four contexts. Each one owns a piece of the &lt;strong&gt;&lt;em&gt;RAG&lt;/em&gt;&lt;/strong&gt; pipeline and communicates through service facades.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Source Ingestion&lt;/strong&gt;&lt;br&gt;
Handles everything related to content acquisition and source-level knowledge management. A consumer of &lt;strong&gt;klay+&lt;/strong&gt; feeds it a PDF, text, or markdown — this context validates input, creates a &lt;code&gt;Source&lt;/code&gt; aggregate, kicks off an &lt;code&gt;ExtractionJob&lt;/code&gt;, and manages the &lt;code&gt;SourceKnowledge&lt;/code&gt; hub that bridges raw content with its semantic projections.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Source&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;research-paper.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;SourceType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PDF&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;job&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ExtractionJob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;sourceId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;source&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ExtractionStrategy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PDF&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;ExtractionJob&lt;/code&gt; is a separate aggregate because extraction can fail, retry, and run async — it has its own lifecycle. And &lt;code&gt;SourceKnowledge&lt;/code&gt; lives here too because one source can produce multiple projections over time (different chunking, different embedding model), so tracking that history is part of managing the source itself.&lt;/p&gt;
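
&lt;p&gt;To make that lifecycle concrete, here is a minimal sketch of how such a job could guard its own transitions. The status names, the &lt;code&gt;maxAttempts&lt;/code&gt; limit, and the &lt;code&gt;ExtractionJobSketch&lt;/code&gt; class are assumptions for illustration, not the actual klay+ aggregate.&lt;/p&gt;

```typescript
// Hypothetical lifecycle sketch; the real ExtractionJob in klay+ may differ.
type JobStatus = "pending" | "running" | "completed" | "failed";

class ExtractionJobSketch {
  status: JobStatus = "pending";
  attempts = 0;
  readonly maxAttempts: number;

  constructor(maxAttempts: number) {
    this.maxAttempts = maxAttempts;
  }

  // Starts a fresh job, or retries a failed one while attempts remain.
  start(): boolean {
    if (this.status === "failed") {
      if (this.attempts >= this.maxAttempts) return false; // retries exhausted
    } else if (this.status !== "pending") {
      return false; // already running or completed
    }
    this.status = "running";
    this.attempts += 1;
    return true;
  }

  // Only a running job can finish successfully.
  complete(): boolean {
    if (this.status !== "running") return false;
    this.status = "completed";
    return true;
  }

  // Only a running job can fail; a failed job may be started again.
  fail(): boolean {
    if (this.status !== "running") return false;
    this.status = "failed";
    return true;
  }
}
```

&lt;p&gt;Because every transition lives on the job itself, a &lt;code&gt;Source&lt;/code&gt; never has to know how many times its extraction was retried.&lt;/p&gt;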

&lt;p&gt;&lt;strong&gt;2. Context Management&lt;/strong&gt;&lt;br&gt;
Groups knowledge sources into queryable collections and tracks lineage. The &lt;code&gt;KnowledgeLineage&lt;/code&gt; aggregate records every transformation applied to content.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lineage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;KnowledgeLineage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;contextId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;sourceId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;source&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;transformations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;chunking&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;recursive&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;chunkSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;embedding&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text-embedding-3-small&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the answer to "I changed my chunking strategy and now results are worse — what happened?" Without lineage, you're guessing. With it, you can diff configurations and roll back.&lt;/p&gt;
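
&lt;p&gt;As a rough illustration of what diffing configurations could look like, the sketch below compares two flat transformation records shaped like the ones in the lineage example. The &lt;code&gt;diffTransformation&lt;/code&gt; helper is hypothetical and is not taken from the klay+ codebase.&lt;/p&gt;

```typescript
// Hypothetical helper; not part of the published klay+ API.
type Transformation = { [key: string]: string | number };

// Lists every key whose value differs between two flat transformation records.
function diffTransformation(before: Transformation, after: Transformation): string[] {
  const keys = Array.from(new Set([...Object.keys(before), ...Object.keys(after)]));
  return keys
    .filter((key) => before[key] !== after[key])
    .map((key) => `${key}: ${before[key]} -> ${after[key]}`);
}

// The same chunking step before and after an experiment.
const previous = { type: "chunking", strategy: "recursive", chunkSize: 512 };
const current = { type: "chunking", strategy: "sentence", chunkSize: 512 };
const changes = diffTransformation(previous, current);
console.log(changes);
```

&lt;p&gt;Here only the &lt;code&gt;strategy&lt;/code&gt; key is reported, which is the kind of answer you want when retrieval quality shifts between two runs.&lt;/p&gt;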

&lt;p&gt;&lt;strong&gt;3. Semantic Processing&lt;/strong&gt;&lt;br&gt;
Orchestrates the chunking → embedding pipeline. Owns &lt;code&gt;SemanticProjection&lt;/code&gt; and &lt;code&gt;ProcessingProfile&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;profile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ProcessingProfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;chunkingStrategy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;recursive&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;chunkSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;overlap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;embeddingProvider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;embeddingModel&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text-embedding-3-small&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ProviderRegistry&lt;/code&gt; pattern is what makes the multi-provider promise real. A factory resolves the correct implementation based on config. Your project uses OpenAI today? Cool. Need to switch to a local model tomorrow? Implement one interface. The domain doesn't care.&lt;/p&gt;
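
&lt;p&gt;A minimal sketch of the registry idea follows. It uses chunking strategies instead of embedding providers only because they are synchronous and easy to run; the same shape would apply to any port. The &lt;code&gt;Chunker&lt;/code&gt; interface, the &lt;code&gt;ChunkerRegistry&lt;/code&gt; class, and the strategy names are assumptions, not the actual klay+ types.&lt;/p&gt;

```typescript
// Hypothetical port: the one interface a new strategy has to implement.
interface Chunker {
  chunk(text: string): string[];
}

class FixedSizeChunker implements Chunker {
  private readonly size: number;

  constructor(size: number) {
    if (!(size > 0)) throw new Error("Chunk size must be positive");
    this.size = size;
  }

  // Cuts the text into consecutive slices of at most `size` characters.
  chunk(text: string): string[] {
    const chunks: string[] = [];
    let start = 0;
    while (text.length > start) {
      chunks.push(text.slice(start, start + this.size));
      start += this.size;
    }
    return chunks;
  }
}

class SentenceChunker implements Chunker {
  // Splits on sentence-ending punctuation and drops empty fragments.
  chunk(text: string): string[] {
    const sentences = text.match(/[^.!?]+[.!?]*/g) ?? [];
    return sentences.map((s) => s.trim()).filter((s) => s.length > 0);
  }
}

// Hypothetical registry: maps a config name to a factory.
class ChunkerRegistry {
  private readonly factories: { [name: string]: (size: number) => Chunker } = {};

  register(name: string, factory: (size: number) => Chunker): void {
    this.factories[name] = factory;
  }

  resolve(name: string, size: number): Chunker {
    const factory = this.factories[name];
    if (factory === undefined) throw new Error(`Unknown chunking strategy: ${name}`);
    return factory(size);
  }
}

const registry = new ChunkerRegistry();
registry.register("fixed-size", (size) => new FixedSizeChunker(size));
registry.register("sentence", () => new SentenceChunker());

// The caller only names a strategy; it never imports a concrete class.
const chunker = registry.resolve("fixed-size", 4);
console.log(chunker.chunk("abcdefghij"));
```

&lt;p&gt;Adding a new strategy then means one new class and one &lt;code&gt;register&lt;/code&gt; call, with no change to the code that calls &lt;code&gt;resolve&lt;/code&gt;.&lt;/p&gt;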

&lt;p&gt;&lt;strong&gt;4. Knowledge Retrieval&lt;/strong&gt;&lt;br&gt;
The simplest context, intentionally. Read-only. Takes a query, computes its embedding, ranks passages by cosine similarity.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;queryEmbedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;embeddingProvider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;vectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findSimilar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;queryEmbedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;contextId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;activeContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Separate context because retrieval is read-heavy and latency-sensitive — totally different scaling profile from processing. You don't want optimizing one to break the other.&lt;/p&gt;
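
&lt;p&gt;For readers who want to see the ranking step spelled out, here is a brute-force, in-memory sketch of cosine-similarity search with a threshold and a limit. The &lt;code&gt;findSimilar&lt;/code&gt; name is borrowed from the snippet above, but this signature, the &lt;code&gt;Passage&lt;/code&gt; shape, and the linear scan are assumptions; a real vector store would normally use an index instead.&lt;/p&gt;

```typescript
// Hypothetical shapes for this sketch only.
interface Passage {
  id: string;
  embedding: number[];
}

interface ScoredPassage {
  id: string;
  score: number;
}

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  const norms = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return norms === 0 ? 0 : dot(a, b) / norms;
}

// Scores every passage, drops those under the threshold, keeps the best `limit`.
function findSimilar(
  queryEmbedding: number[],
  passages: Passage[],
  options: { threshold: number; limit: number }
): ScoredPassage[] {
  return passages
    .map((p) => ({ id: p.id, score: cosineSimilarity(queryEmbedding, p.embedding) }))
    .filter((p) => p.score >= options.threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, options.limit);
}

const passages: Passage[] = [
  { id: "a", embedding: [1, 0] },
  { id: "b", embedding: [0, 1] },
  { id: "c", embedding: [1, 1] },
];
const ranked = findSimilar([1, 0], passages, { threshold: 0.7, limit: 10 });
console.log(ranked.map((p) => p.id));
```

&lt;p&gt;Passage &lt;code&gt;b&lt;/code&gt; is orthogonal to the query, so it falls under the threshold and is left out of the ranking.&lt;/p&gt;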




&lt;h2&gt;
  
  
  The Shared Kernel
&lt;/h2&gt;

&lt;p&gt;All contexts build on the same foundational abstractions. This is the only code that crosses boundaries.&lt;br&gt;
&lt;strong&gt;Entity &amp;amp; AggregateRoot&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;abstract&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Entity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Id&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="na"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;equals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;other&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Entity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Id&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_id&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;other&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;abstract&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AggregateRoot&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Id&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nx"&gt;Entity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Id&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="na"&gt;_events&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;DomainEvent&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;

  &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="nf"&gt;record&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;event&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;DomainEvent&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_events&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;clearEvents&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nx"&gt;DomainEvent&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;events&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_events&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_events&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;events&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;ValueObject — Immutable by Default&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;abstract&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ValueObject&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="na"&gt;props&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Readonly&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;props&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;freeze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;equals&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;other&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ValueObject&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;other&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
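
&lt;p&gt;One caveat worth keeping in mind with this style of equality: &lt;code&gt;JSON.stringify&lt;/code&gt; serializes keys in insertion order, so two value objects holding the same data can compare as different, and &lt;code&gt;Object.freeze&lt;/code&gt; only freezes the top level. The sketch below shows the first issue and one possible workaround for flat props; the &lt;code&gt;stableStringify&lt;/code&gt; helper is hypothetical and is not part of klay+.&lt;/p&gt;

```typescript
// Two sets of props with the same data but different key order.
const first = Object.freeze({ chunkSize: 512, overlap: 50 });
const second = Object.freeze({ overlap: 50, chunkSize: 512 });

// JSON.stringify follows insertion order, so this comparison is false.
const naiveEqual = JSON.stringify(first) === JSON.stringify(second);

// Hypothetical helper: serialize flat props as key/value pairs sorted by key.
function stableStringify(props: object): string {
  const entries = Object.entries(props).sort((x, y) => x[0].localeCompare(y[0]));
  return JSON.stringify(entries);
}

// With sorted keys, the two sets of props compare as equal.
const stableEqual = stableStringify(first) === stableStringify(second);
console.log(naiveEqual, stableEqual);
```

&lt;p&gt;This only handles flat props; nested objects would need a recursive version of the same idea.&lt;/p&gt;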



&lt;p&gt;&lt;strong&gt;Result — Because Try-Catch Isn't a Strategy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every domain operation returns a &lt;code&gt;Result&lt;/code&gt;. No exceptions for expected failures, no &lt;code&gt;null&lt;/code&gt; propagating silently through three layers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;E&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;never&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="nx"&gt;fail&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;E&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;E&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;E&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;never&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="nf"&gt;isOk&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;isFail&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;U&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;U&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;E&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;U&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;flatMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;U&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;E&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;U&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;E&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;U&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;match&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;U&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;handlers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;v&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;U&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;fail&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;E&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;U&lt;/span&gt; &lt;span class="p"&gt;}):&lt;/span&gt; &lt;span class="nx"&gt;U&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Repository — Three Methods&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;Repository&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Id&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;entity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;PersistenceError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;findById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;PersistenceError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;T&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;PersistenceError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each context defines its own repositories. Infrastructure implements them. The domain doesn't know if it's talking to NeDB, IndexedDB, or a potato.&lt;/p&gt;




&lt;h2&gt;
  
  
  The 6 Principles (Enforced by Tests)
&lt;/h2&gt;

&lt;p&gt;Not guidelines. Invariants. Break one and CI breaks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Dependency Rule&lt;/strong&gt; — Inward only. Adapters → Application → Contexts → Shared Kernel.&lt;br&gt;
&lt;strong&gt;2. Tell, Don't Ask&lt;/strong&gt; — You don't inspect aggregate state. You tell it what to do and get a &lt;code&gt;Result&lt;/code&gt;.&lt;br&gt;
&lt;strong&gt;3. Port Isolation&lt;/strong&gt; — Each context defines its own interfaces. Semantic Processing doesn't know NeDB exists.&lt;br&gt;
&lt;strong&gt;4. Composition over Inheritance&lt;/strong&gt; — Only DDD building blocks inherit. Everything else composes.&lt;br&gt;
&lt;strong&gt;5. Result-Based Errors&lt;/strong&gt; — Domain failures are values, not exceptions.&lt;br&gt;
&lt;strong&gt;6. Illegal States Are Unrepresentable&lt;/strong&gt; — A &lt;code&gt;ChunkSize&lt;/code&gt; of -1? Can't construct it.&lt;/p&gt;
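
&lt;p&gt;Principles 5 and 6 can be sketched together. The example below is an assumption about how a &lt;code&gt;ChunkSize&lt;/code&gt; value object might be guarded, and it swaps the generic &lt;code&gt;Result&lt;/code&gt; class shown earlier for a plain discriminated union to keep the snippet short. The validation rule (a positive integer) is also assumed.&lt;/p&gt;

```typescript
// Hypothetical result shape: a discriminated union instead of the Result class.
type ChunkSizeResult =
  | { ok: true; value: ChunkSize }
  | { ok: false; error: string };

class ChunkSize {
  readonly value: number;

  // Private constructor: the only way in is through create().
  private constructor(value: number) {
    this.value = value;
  }

  // Returns a failure value instead of throwing on invalid input.
  static create(value: number): ChunkSizeResult {
    if (!Number.isInteger(value) || !(value > 0)) {
      return { ok: false, error: `ChunkSize must be a positive integer, got ${value}` };
    }
    return { ok: true, value: new ChunkSize(value) };
  }
}

const valid = ChunkSize.create(512);
const invalid = ChunkSize.create(-1);

// Callers must check `ok` before they can reach the value.
if (valid.ok) console.log(valid.value.value);
if (!invalid.ok) console.log(invalid.error);
```

&lt;p&gt;Since &lt;code&gt;new ChunkSize(-1)&lt;/code&gt; does not compile outside the class, every instance that exists has already passed validation.&lt;/p&gt;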




&lt;h2&gt;
  
  
  What This Enables for Your Project
&lt;/h2&gt;

&lt;p&gt;The whole point of klay+ being infrastructure (not a product) is that it adapts to your context:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Swap providers&lt;/strong&gt; without rewriting pipeline code — implement one interface&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Experiment with chunking strategies&lt;/strong&gt; by changing a config, not refactoring files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Track why&lt;/strong&gt; search results changed through immutable lineage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run server-side or browser-side&lt;/strong&gt;: same logic, pick your runtime&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;13 tests&lt;/strong&gt; ensuring the architecture holds as you extend it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No vendor lock-in. No "works for the demo but breaks in production." Just a solid foundation for RAG that you plug into your stack.&lt;/p&gt;




&lt;h2&gt;
  
  
  Check It Out
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;klay+&lt;/strong&gt;&lt;/em&gt; is open source: &lt;a href="//github.com/Carlosmgs111/klay-plus"&gt;github.com/Carlosmgs111/klay-plus&lt;/a&gt;&lt;br&gt;
If you've ever wished your RAG pipeline had real architecture instead of glue code, take a look. Star it, fork it, open an issue, tell me what you'd change.&lt;br&gt;
And if you've built RAG infrastructure yourself — what worked? What would you do differently? The comments are open.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part 1 of the klay+ Architecture Series. Next: Hexagonal Architecture with Astro + TypeScript — how your framework and your domain can coexist without pain.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>rag</category>
      <category>typescript</category>
      <category>architecture</category>
      <category>ddd</category>
    </item>
    <item>
      <title>Why every RAG project I've built ends up fighting the pipeline — and what I'm doing about it</title>
      <dc:creator>Carlosmgs111</dc:creator>
      <pubDate>Mon, 16 Mar 2026 06:54:35 +0000</pubDate>
      <link>https://dev.to/carlosmgs111/why-every-rag-project-ive-built-ends-up-fighting-the-pipeline-and-what-im-doing-about-it-5hl</link>
      <guid>https://dev.to/carlosmgs111/why-every-rag-project-ive-built-ends-up-fighting-the-pipeline-and-what-im-doing-about-it-5hl</guid>
      <description>&lt;h2&gt;The pattern that keeps repeating&lt;/h2&gt;

&lt;p&gt;If you've built a RAG application, this probably sounds familiar:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You pick an embedding model&lt;/li&gt;
&lt;li&gt;You set up a vector store&lt;/li&gt;
&lt;li&gt;You write chunking logic&lt;/li&gt;
&lt;li&gt;You wire everything together&lt;/li&gt;
&lt;li&gt;You realize the chunking doesn't work for your use case&lt;/li&gt;
&lt;li&gt;You rewrite half the pipeline&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The models are the easy part. The pipeline glue is where projects slow down — and where most teams burn weeks they didn't plan for.&lt;/p&gt;

&lt;p&gt;A support chatbot needs sentence-level chunks. A legal search tool needs paragraph-level chunks with overlap. An internal knowledge base needs something in between. But every time you change one component, you end up rewiring the whole pipeline.&lt;/p&gt;

&lt;h2&gt;The actual problem&lt;/h2&gt;

&lt;p&gt;It's not that building a RAG pipeline is hard. It's that iterating on one is painful.&lt;/p&gt;

&lt;p&gt;You pick a chunking strategy, embed a few thousand documents, and your retrieval quality is... okay. Not great. So you want to try a different approach. But that means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Re-processing all your documents&lt;/li&gt;
&lt;li&gt;Re-generating all your embeddings&lt;/li&gt;
&lt;li&gt;Hoping the new strategy is actually better&lt;/li&gt;
&lt;li&gt;Doing all of this without breaking what's already working&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most teams don't experiment. They ship the first thing that "kind of works" and move on. Retrieval quality suffers, but the cost of iteration is too high.&lt;/p&gt;

&lt;h2&gt;What I'm building&lt;/h2&gt;

&lt;p&gt;I started working on klay+ — a composable RAG infrastructure layer where every component is independently swappable.&lt;/p&gt;

&lt;p&gt;The core idea: your application code shouldn't change when you change your RAG strategy.&lt;/p&gt;

&lt;p&gt;Here's what that looks like in practice:&lt;/p&gt;

&lt;h3&gt;Ingestion&lt;/h3&gt;

&lt;p&gt;Feed in PDFs, Markdown, HTML, or plain text. The content gets normalized automatically — no format-specific parsing logic in your app.&lt;/p&gt;

&lt;h3&gt;Chunking&lt;/h3&gt;

&lt;p&gt;Choose your strategy per use case:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Recursive — split by structure (headings, paragraphs, sentences)&lt;/li&gt;
&lt;li&gt;Sentence-aware — keep semantic units intact&lt;/li&gt;
&lt;li&gt;Fixed-size — predictable token counts for context windows&lt;/li&gt;
&lt;li&gt;Custom — bring your own logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key: switching from recursive to sentence-aware chunking doesn't require touching your application code or your retrieval logic.&lt;/p&gt;
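&lt;p&gt;One way to get that property is to put every strategy behind a single interface and pick the implementation from configuration. The sketch below assumes this shape. &lt;code&gt;ChunkingStrategy&lt;/code&gt;, &lt;code&gt;createChunker&lt;/code&gt; and the config fields are hypothetical names, not the klay+ API, and the chunkers are deliberately naive (character counts stand in for tokens).&lt;/p&gt;

```typescript
// Hypothetical names throughout; this is not the klay+ API.
interface ChunkingStrategy {
  chunk(text: string): string[];
}

// Fixed-size: slices by character count as a stand-in for token counts.
class FixedSizeChunker implements ChunkingStrategy {
  private readonly size: number;

  constructor(size: number) {
    // Guard against a zero or negative step, which would never advance.
    this.size = Math.max(1, Math.floor(size));
  }

  chunk(text: string): string[] {
    const chunks: string[] = [];
    for (let i = 0; text.length > i; i += this.size) {
      chunks.push(text.slice(i, i + this.size));
    }
    return chunks;
  }
}

// Sentence-aware: naive split on '.', '!' or '?' terminators.
class SentenceChunker implements ChunkingStrategy {
  chunk(text: string): string[] {
    const parts = text.match(/[^.!?]+[.!?]*/g) ?? [];
    return parts.map((s) => s.trim()).filter((s) => s.length > 0);
  }
}

// Declarative config: the strategy is data, not a code path in the app.
type ChunkingConfig =
  | { strategy: "fixed-size"; size: number }
  | { strategy: "sentence" };

function createChunker(config: ChunkingConfig): ChunkingStrategy {
  switch (config.strategy) {
    case "fixed-size":
      return new FixedSizeChunker(config.size);
    case "sentence":
      return new SentenceChunker();
  }
}

// Application code depends only on the interface.
function countChunks(text: string, chunker: ChunkingStrategy): number {
  return chunker.chunk(text).length;
}

const sample = "Klay is modular. Swap parts freely. Ship it!";
console.log(countChunks(sample, createChunker({ strategy: "sentence" })));
console.log(countChunks(sample, createChunker({ strategy: "fixed-size", size: 16 })));
```

&lt;p&gt;Switching strategies here means changing the config object passed to &lt;code&gt;createChunker&lt;/code&gt;; &lt;code&gt;countChunks&lt;/code&gt; never changes.&lt;/p&gt;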

&lt;h3&gt;Embedding&lt;/h3&gt;

&lt;p&gt;Plug in the provider that fits your stage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hash-based — zero API cost, great for local development&lt;/li&gt;
&lt;li&gt;OpenAI / Cohere — production-grade quality&lt;/li&gt;
&lt;li&gt;Local models via WebLLM — self-hosted, no data leaves your infra&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Swap providers without re-architecting. Your retrieval layer doesn't know or care which embedder generated the vectors.&lt;/p&gt;
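&lt;p&gt;To make the hash-based option concrete, here is a rough sketch of an embedder that hashes tokens into buckets, placed behind a provider interface. Everything in it is an assumption for illustration: the &lt;code&gt;EmbeddingProvider&lt;/code&gt; interface, the FNV-1a-style hash and the synchronous signature (a real API-backed provider would almost certainly be async). It is not how klay+ implements its hash embedder.&lt;/p&gt;

```typescript
// Hypothetical port; a real provider interface would likely be async.
interface EmbeddingProvider {
  readonly dimensions: number;
  embed(text: string): number[];
}

// FNV-1a-style hash over UTF-16 code units, kept in unsigned 32-bit range.
function hashToken(token: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; token.length > i; i++) {
    hash ^= token.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Bag-of-words hashing: each token increments one bucket, then L2-normalize.
class HashEmbedder implements EmbeddingProvider {
  readonly dimensions: number;

  constructor(dimensions: number) {
    this.dimensions = dimensions;
  }

  embed(text: string): number[] {
    const vector: number[] = new Array(this.dimensions).fill(0);
    const tokens = text.toLowerCase().split(/\W+/).filter((t) => t.length > 0);
    for (const token of tokens) {
      const bucket = hashToken(token) % this.dimensions;
      vector[bucket] = (vector[bucket] ?? 0) + 1;
    }
    const norm = Math.sqrt(vector.reduce((sum, x) => sum + x * x, 0));
    return norm === 0 ? vector : vector.map((x) => x / norm);
  }
}

// Dot product; equals cosine similarity when both vectors are unit length.
function similarity(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Retrieval sees only the interface, never the concrete embedder.
function rank(query: string, docs: string[], embedder: EmbeddingProvider): string[] {
  const q = embedder.embed(query);
  return docs
    .map((doc) => ({ doc, score: similarity(q, embedder.embed(doc)) }))
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.doc);
}

const embedder = new HashEmbedder(64);
console.log(rank("vector embeddings", ["cats purr softly", "vector embeddings"], embedder));
```

&lt;p&gt;Replacing &lt;code&gt;HashEmbedder&lt;/code&gt; with an API-backed class that satisfies the same interface would leave &lt;code&gt;rank&lt;/code&gt; untouched, as long as query and documents are embedded by the same provider.&lt;/p&gt;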

&lt;h3&gt;Retrieval&lt;/h3&gt;

&lt;p&gt;Query by meaning, not keywords. Results come back ranked by relevance scores. Your application gets a clean interface regardless of what's happening underneath.&lt;/p&gt;

&lt;h2&gt;The part I'm most excited about: parallel projections&lt;/h2&gt;

&lt;p&gt;This is the feature that solves the iteration problem. You can generate a new projection — different chunking, different embedding, different strategy — side by side with your production index.&lt;/p&gt;

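&lt;p&gt;Compare retrieval quality before committing to a migration. The production index keeps serving queries while you evaluate the candidate, so there is no downtime and nothing to roll back if the new strategy turns out worse.&lt;/p&gt;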
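&lt;p&gt;A comparison like that needs a metric and a small labeled query set. The sketch below assumes top-k hit rate as the metric and models each projection as a plain function from query to ranked document ids. The &lt;code&gt;Retriever&lt;/code&gt; type, &lt;code&gt;hitRate&lt;/code&gt; and &lt;code&gt;compareProjections&lt;/code&gt; are hypothetical helpers, and the two retrievers are stubs; klay+ may expose projections differently.&lt;/p&gt;

```typescript
// Hypothetical types; klay+ may model projections differently.
type Retriever = (query: string) => string[]; // ranked document ids

interface LabeledQuery {
  query: string;
  relevantId: string;
}

// Fraction of queries whose relevant document appears in the top k results.
function hitRate(retrieve: Retriever, queries: LabeledQuery[], k: number): number {
  if (queries.length === 0) return 0;
  const hits = queries.filter(
    (q) => retrieve(q.query).slice(0, k).indexOf(q.relevantId) !== -1
  ).length;
  return hits / queries.length;
}

// Scores both projections on the same queries; neither index is modified.
function compareProjections(
  production: Retriever,
  candidate: Retriever,
  queries: LabeledQuery[],
  k: number
) {
  const productionScore = hitRate(production, queries, k);
  const candidateScore = hitRate(candidate, queries, k);
  return { productionScore, candidateScore, promote: candidateScore > productionScore };
}

// Stub retrievers standing in for two real projections.
const productionIndex: Retriever = () => ["doc-2", "doc-1", "doc-3"];
const candidateIndex: Retriever = (query) =>
  query.indexOf("refund") !== -1 ? ["doc-1", "doc-2"] : ["doc-3", "doc-2"];

const labeledQueries: LabeledQuery[] = [
  { query: "refund policy", relevantId: "doc-1" },
  { query: "shipping times", relevantId: "doc-3" },
];

console.log(compareProjections(productionIndex, candidateIndex, labeledQueries, 1));
```

&lt;p&gt;Hit rate is only one possible metric; the same structure works with any scoring function that takes a retriever and a labeled set.&lt;/p&gt;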

&lt;h2&gt;Technical decisions&lt;/h2&gt;

&lt;p&gt;A few choices worth mentioning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Self-hostable — your documents don't leave your infrastructure if you don't want them to&lt;/li&gt;
&lt;li&gt;No vendor lock-in — every component has multiple provider options&lt;/li&gt;
&lt;li&gt;Static configuration — strategies are defined declaratively, not buried in application code&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Where it stands&lt;/h2&gt;

&lt;p&gt;klay+ is in early development. I'm collecting feedback from developers who are building with RAG to understand which pain points matter most.&lt;/p&gt;

&lt;p&gt;If you've fought with RAG pipelines before, I'd genuinely love to hear:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What part of the pipeline costs you the most time?&lt;/li&gt;
&lt;li&gt;How do you handle iteration on retrieval quality?&lt;/li&gt;
&lt;li&gt;What's your current stack and what would you swap if you could?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The landing page is here if you want to follow along: klay-plus-landing.vercel.app&lt;/p&gt;

</description>
      <category>rag</category>
      <category>ai</category>
      <category>webdev</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
