<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: RoTSL</title>
    <description>The latest articles on DEV Community by RoTSL (@rotsl).</description>
    <link>https://dev.to/rotsl</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3818774%2F3a43cf64-9ded-407d-829e-4555f203a82e.png</url>
      <title>DEV Community: RoTSL</title>
      <link>https://dev.to/rotsl</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rotsl"/>
    <language>en</language>
    <item>
      <title>Hypercontext: a framework for agents that actually know what they're doing</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Mon, 20 Apr 2026 16:41:25 +0000</pubDate>
      <link>https://dev.to/rotsl/hypercontext-a-framework-for-agents-that-actually-know-what-theyre-doing-3e7p</link>
      <guid>https://dev.to/rotsl/hypercontext-a-framework-for-agents-that-actually-know-what-theyre-doing-3e7p</guid>
      <description>&lt;p&gt;I built Hypercontext because I got tired of agent frameworks that treat context like a static blob you shove into a prompt and hope for the best. Most tools out there assume context is something you &lt;em&gt;pass&lt;/em&gt;. I wanted something that treats context as something you can &lt;em&gt;inspect&lt;/em&gt;, &lt;em&gt;compress&lt;/em&gt;, &lt;em&gt;score&lt;/em&gt;, and &lt;em&gt;rewrite&lt;/em&gt; while the agent is running. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Hypercontext is still in alpha. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This isn't about adding another layer of abstraction over OpenAI's API. It's about making agents aware of their own reasoning so they can fix it when it breaks.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it actually does
&lt;/h2&gt;

&lt;p&gt;Hypercontext is a self-referential agent framework for Python and TypeScript. The core idea is simple: agents should be able to read and modify their own system prompts, tool descriptions, and memory at runtime based on whether they're actually succeeding at the task.&lt;/p&gt;

&lt;p&gt;The framework ships with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A Python SDK with orchestration, agents, scoring, memory, compression, deduplication, convergence detection, and archive helpers&lt;/li&gt;
&lt;li&gt;A TypeScript SDK for &lt;code&gt;Node.js&lt;/code&gt; with the same primitives&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;CLI&lt;/code&gt; for running compression, archive queries, provider discovery, and orchestration&lt;/li&gt;
&lt;li&gt;A curses-based terminal UI for browsing and pinning commands without leaving the shell&lt;/li&gt;
&lt;li&gt;A browser dashboard for visual inspection&lt;/li&gt;
&lt;li&gt;An MCP stdio daemon for Claude Desktop, Claude Code, and Codex integration&lt;/li&gt;
&lt;li&gt;An HTTP MCP server for web integrations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both SDKs are zero-dependency where possible. The Python core is pure Python. The TypeScript SDK has minimal deps. You can run the whole thing against Ollama locally without touching a cloud provider.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem with most agent frameworks
&lt;/h2&gt;

&lt;p&gt;I've used a lot of agent frameworks. They all share the same blind spot: context is treated as immutable input. You construct a prompt, feed it to the model, and get output back. If the output is wrong, you tweak the prompt and try again. The agent itself has no idea what worked and what didn't across runs.&lt;/p&gt;

&lt;p&gt;Hypercontext changes this by making context a first-class citizen that agents can manipulate. Each generation gets tracked as a node in a lineage tree. You can see which parent led to which result, which branch is going stale, and which context configuration produced the best score. Successful strategies get archived and reused. Failed ones get pruned.&lt;/p&gt;

&lt;p&gt;This isn't theoretical. The archive stores scored generations so later runs can compare branches and identify the strongest evolution path. Memory is split between persistent storage (lessons across runs) and episodic storage (context within a single session).&lt;/p&gt;

&lt;h2&gt;
  
  
  How the context loop works
&lt;/h2&gt;

&lt;p&gt;Here's the basic flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The agent receives a task and its current context window&lt;/li&gt;
&lt;li&gt;It generates a response and scores the result against a fitness function&lt;/li&gt;
&lt;li&gt;If the score is below threshold, the agent reflects on what went wrong&lt;/li&gt;
&lt;li&gt;It rewrites its own system prompt, tool descriptions, or memory based on that reflection&lt;/li&gt;
&lt;li&gt;The new context configuration gets tested in the next generation&lt;/li&gt;
&lt;li&gt;Successful configurations get archived; failed ones get discarded&lt;/li&gt;
&lt;/ol&gt;
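&lt;p&gt;The loop above can be sketched in a few lines of plain Python. Everything here is illustrative, not Hypercontext's actual API; &lt;code&gt;fitness&lt;/code&gt;, &lt;code&gt;generate&lt;/code&gt;, and &lt;code&gt;reflect&lt;/code&gt; are stand-ins:&lt;/p&gt;

```python
# Minimal sketch of the context-evolution loop described above.
# Every name here is illustrative, not Hypercontext's real API.

def fitness(output, task):
    # Score how well the output satisfies the task (0.0 to 1.0).
    return 1.0 if task.lower() in output.lower() else 0.5

def run_loop(task, context, generate, reflect, threshold=0.8, max_generations=5):
    archive = []  # successful (context, score) pairs survive the run
    for _generation in range(max_generations):
        output = generate(context, task)
        score = fitness(output, task)
        if score >= threshold:
            archive.append((context, score))    # keep what worked
        else:
            context = reflect(context, output)  # rewrite context and retry
    return archive

# Toy stand-ins so the sketch runs end to end.
result = run_loop(
    task="echo",
    context="system: be terse",
    generate=lambda ctx, task: task + " done",
    reflect=lambda ctx, out: ctx + " (revised)",
)
print(len(result))
```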

&lt;p&gt;This happens automatically in the &lt;code&gt;TaskAgent&lt;/code&gt; and &lt;code&gt;MetaAgent&lt;/code&gt; classes. You don't need to hand-code the reflection logic unless you want to.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;MetaAgent&lt;/code&gt; goes further. It can perform repository-aware tool use and self-modification workflows. If you point it at a codebase with &lt;code&gt;--workdir&lt;/code&gt;, it can inspect files, suggest modifications, and track whether those modifications improved the code quality.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation and setup
&lt;/h2&gt;

&lt;p&gt;For Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;hypercontext
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Node.js:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;hypercontext-node-sdk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. No separate MCP package to install. No complex dependency tree. The Python package includes the CLI, the TUI, the stdio daemon, the HTTP server, and the browser launcher. The npm package is the SDK only, which is the right granularity for Node projects.&lt;/p&gt;

&lt;h2&gt;
  
  
  Provider setup
&lt;/h2&gt;

&lt;p&gt;Hypercontext doesn't lock you into a provider. It supports Claude, OpenAI, Ollama, OpenAI-compatible servers, and local transformers models. You set credentials via environment variables or a YAML config file with named presets.&lt;/p&gt;

&lt;p&gt;For Claude:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;HYPERCONTEXT_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;anthropic
&lt;span class="nv"&gt;HYPERCONTEXT_MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;claude-sonnet-4-20250514
&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-key-here
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Ollama (fully local):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama serve
ollama pull llama3

&lt;span class="nv"&gt;HYPERCONTEXT_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ollama
&lt;span class="nv"&gt;HYPERCONTEXT_MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;llama3
&lt;span class="nv"&gt;OLLAMA_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://localhost:11434
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The named preset feature is useful when you want multiple backends in one project. You define them in a &lt;code&gt;YAML&lt;/code&gt; file and resolve by name at runtime. The framework expands &lt;code&gt;${VAR}&lt;/code&gt; values from the environment, so secrets stay out of config files.&lt;/p&gt;
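&lt;p&gt;A presets file might look something like this. The filename, key names, and layout here are illustrative; check the docs for the exact schema:&lt;/p&gt;

```yaml
# providers.yaml -- hypothetical preset layout
presets:
  cloud:
    provider: anthropic
    model: claude-sonnet-4-20250514
    api_key: ${ANTHROPIC_API_KEY}   # expanded from the environment
  local:
    provider: ollama
    model: llama3
    base_url: http://localhost:11434
```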

&lt;h2&gt;
  
  
  Using it in Python
&lt;/h2&gt;

&lt;p&gt;Direct orchestration is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;hypercontext&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HyperContext&lt;/span&gt;
&lt;span class="n"&gt;hc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HyperContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./hypercontext_output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_generations&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want provider-backed calls without the full orchestration loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;hypercontext&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LLMClient&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;hypercontext.providers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ProviderRegistry&lt;/span&gt;

&lt;span class="n"&gt;registry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ProviderRegistry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;instance&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-20250514&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-key-here&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.anthropic.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LLMClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;history&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summarize this in one sentence.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For agent workflows, you choose between &lt;code&gt;TaskAgent&lt;/code&gt; (repeatable tasks) and &lt;code&gt;MetaAgent&lt;/code&gt; (repository-aware reasoning and self-modification). Both support the context evolution loop out of the box.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using it in TypeScript
&lt;/h2&gt;

&lt;p&gt;The Node SDK follows the same patterns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;ContextWindow&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;TaskAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;StructuredOutputParser&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;EnhancedToolRegistry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;LoggingMiddleware&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;hypercontext-node-sdk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nb"&gt;window&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ContextWindow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Important context&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TaskAgent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;demo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;maxTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;hello&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;StructuredOutputParser&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parseFirst&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Answer: {"status":"ok"}&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;registry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;EnhancedToolRegistry&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LoggingMiddleware&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="nx"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;registerTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;echo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Echo a payload back&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The TypeScript SDK includes context compression, retrieval, lineage tracking, persistent memory, fitness evaluation, and structured output parsing. It's not a port of the Python code; it's a parallel implementation with the same design goals.&lt;/p&gt;

&lt;h2&gt;
  
  
  CLI and terminal UI
&lt;/h2&gt;

&lt;p&gt;The Python package includes a full CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; hypercontext version
python &lt;span class="nt"&gt;-m&lt;/span&gt; hypercontext providers
python &lt;span class="nt"&gt;-m&lt;/span&gt; hypercontext run &lt;span class="nt"&gt;--generations&lt;/span&gt; 5 &lt;span class="nt"&gt;--output-dir&lt;/span&gt; ./runs/demo &lt;span class="nt"&gt;--workdir&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt;
python &lt;span class="nt"&gt;-m&lt;/span&gt; hypercontext compress &lt;span class="nt"&gt;--input&lt;/span&gt; long_text.txt &lt;span class="nt"&gt;--ratio&lt;/span&gt; 0.4
python &lt;span class="nt"&gt;-m&lt;/span&gt; hypercontext archive &lt;span class="nt"&gt;--list&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;TUI&lt;/code&gt; is a curses dashboard for browsing commands, pinning favorites, and executing them without leaving the terminal. It supports &lt;code&gt;--workdir&lt;/code&gt; so you can point it at any project root.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; hypercontext tui &lt;span class="nt"&gt;--workdir&lt;/span&gt; /path/to/project
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For desktop assistants, the stdio MCP daemon handles Claude Desktop, Claude Code, and Codex:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; hypercontext mcp &lt;span class="nt"&gt;--workdir&lt;/span&gt; /path/to/project
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For browser integrations, the HTTP server exposes the same tools over a REST interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; hypercontext serve &lt;span class="nt"&gt;--port&lt;/span&gt; 8080 &lt;span class="nt"&gt;--workdir&lt;/span&gt; /path/to/project
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  MCP integration without the hassle
&lt;/h2&gt;

&lt;p&gt;Most MCP implementations require you to install a separate &lt;code&gt;mcp&lt;/code&gt; package and configure &lt;code&gt;JSON&lt;/code&gt; files. Hypercontext bundles the stdio daemon and HTTP server directly, so there's nothing extra to install.&lt;/p&gt;

&lt;p&gt;The stdio daemon speaks the Model Context Protocol natively. Claude Desktop can discover and invoke Hypercontext tools without manual configuration. The HTTP server does the same for browser-based integrations.&lt;/p&gt;
&lt;h2&gt;
  
  
  Context compression and deduplication
&lt;/h2&gt;

&lt;p&gt;One of the practical problems with long-running agents is context bloat. Hypercontext includes a &lt;code&gt;ContextCompressor&lt;/code&gt; that reduces text size while preserving semantic meaning. There's also a validator that checks compression fidelity so you don't accidentally drop important information.&lt;/p&gt;

&lt;p&gt;The deduplication layer identifies repeated patterns across generations and collapses them. This matters when you're running evolutionary loops where similar context configurations get tested repeatedly.&lt;/p&gt;
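&lt;p&gt;The general idea behind the deduplication step, independent of how Hypercontext actually implements it, is to hash normalized segments and keep only the first occurrence of each:&lt;/p&gt;

```python
# Sketch of segment-level deduplication: hash a normalized form of
# each context segment and drop repeats. Illustrative only.
import hashlib

def dedupe_segments(segments):
    seen = set()
    unique = []
    for seg in segments:
        key = hashlib.sha256(seg.strip().lower().encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(seg)
    return unique

history = ["Use bullet points.", "use bullet points.", "Cite sources."]
print(dedupe_segments(history))  # ['Use bullet points.', 'Cite sources.']
```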
&lt;h2&gt;
  
  
  Lineage tracking
&lt;/h2&gt;

&lt;p&gt;Every generation gets a unique ID and tracks its parent. You can query the lineage tree to answer questions like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which generation produced the best score?&lt;/li&gt;
&lt;li&gt;Which parent led to this result?&lt;/li&gt;
&lt;li&gt;Which branch hasn't improved in the last 10 generations?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This isn't just logging. The lineage data feeds back into the parent selection strategy for the next generation. Stagnant branches get deprioritized; high-fitness branches get explored further.&lt;/p&gt;
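&lt;p&gt;A toy version of those queries over a flat generation list (the framework's real data model may differ):&lt;/p&gt;

```python
# Toy lineage data: each generation records its parent and score.
# Illustrative structure, not Hypercontext's actual schema.
generations = [
    {"id": "g1", "parent": None, "score": 0.4},
    {"id": "g2", "parent": "g1", "score": 0.7},
    {"id": "g3", "parent": "g1", "score": 0.5},
    {"id": "g4", "parent": "g2", "score": 0.9},
]

# Which generation produced the best score?
best = max(generations, key=lambda g: g["score"])
print(best["id"])        # g4

# Which parent led to this result?
print(best["parent"])    # g2
```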
&lt;h2&gt;
  
  
  Archive and transfer learning
&lt;/h2&gt;

&lt;p&gt;The archive stores proven context configurations ranked by fitness score. When you start a new task, the framework can query the archive for context patterns that worked well on similar tasks. This is transfer learning without neural network retraining. You're transferring context strategies instead of model weights.&lt;/p&gt;

&lt;p&gt;The archive is queryable via CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; hypercontext archive &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"task:code-review fitness:&amp;gt;0.8"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What I learned building this
&lt;/h2&gt;

&lt;p&gt;I started this project after reading the Hyperagents paper and getting frustrated that none of the existing frameworks implemented its meta-cognitive ideas in a practical way. Most research code is a mess of Jupyter notebooks and hardcoded paths. I wanted something you could actually install and use.&lt;/p&gt;

&lt;p&gt;The hardest part wasn't the context compression or the lineage tracking. It was designing the agent loop so that self-modification doesn't spiral into chaos. If an agent can rewrite its own system prompt, it can also break its own system prompt. The convergence detection layer stops the loop when scores plateau or when context configurations start cycling.&lt;/p&gt;

&lt;p&gt;I also learned that dual-SDK maintenance is a pain. Keeping the Python and TypeScript implementations in sync requires discipline. The APIs aren't identical because the languages have different conventions, but the core concepts map directly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Current state and what's next
&lt;/h2&gt;

&lt;p&gt;The framework is functional and I'm using it in my own projects. The Python package is on PyPI, the TypeScript SDK is on npm, and the docs are on GitHub Pages.&lt;br&gt;
I'm currently working on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Better convergence heuristics for multi-objective optimization&lt;/li&gt;
&lt;li&gt;A web-based lineage visualizer&lt;/li&gt;
&lt;li&gt;More provider recipes for local model setups&lt;/li&gt;
&lt;li&gt;Benchmark suites to compare context strategies across tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The repo includes runnable examples for evolution, lineage tracking, self-modifying agents, and provider workflows. If you want to see what the framework can do without writing code, start with &lt;code&gt;examples/python/feature_gallery.py&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Python&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;hypercontext
python &lt;span class="nt"&gt;-m&lt;/span&gt; hypercontext version

&lt;span class="c"&gt;# TypeScript&lt;/span&gt;
npm &lt;span class="nb"&gt;install &lt;/span&gt;hypercontext-node-sdk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://rotsl.github.io/hypercontext/" rel="noopener noreferrer"&gt;Docs&lt;/a&gt;&lt;br&gt;
&lt;a href="https://pypi.org/project/hypercontext/" rel="noopener noreferrer"&gt;PyPI&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.npmjs.com/package/hypercontext-node-sdk" rel="noopener noreferrer"&gt;npm&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>agents</category>
      <category>typescript</category>
    </item>
    <item>
      <title>NoB (Noticeably Better): a compiled language that tries to stay out of your way</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Mon, 13 Apr 2026 12:39:52 +0000</pubDate>
      <link>https://dev.to/rotsl/nob-noticeably-better-a-compiled-language-that-tries-to-stay-out-of-your-way-163</link>
      <guid>https://dev.to/rotsl/nob-noticeably-better-a-compiled-language-that-tries-to-stay-out-of-your-way-163</guid>
      <description>&lt;p&gt;Most new languages promise the same things: performance, simplicity, better tooling. NoB is trying to hit those too—but the interesting part is how it actually does it.&lt;/p&gt;

&lt;p&gt;At its core, NoB is a &lt;strong&gt;compiled language that targets C++20&lt;/strong&gt;, with a second execution path through a &lt;strong&gt;bytecode VM&lt;/strong&gt;. That split ends up being more practical than it sounds.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Website: &lt;a href="https://nob-lang.omni-flows.uk/" rel="noopener noreferrer"&gt;https://nob-lang.omni-flows.uk/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Source: private / proprietary
&lt;/li&gt;
&lt;li&gt;Platforms: macOS, Linux, Windows (via WSL2)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What NoB actually is
&lt;/h2&gt;

&lt;p&gt;NoB compiles &lt;code&gt;.nob&lt;/code&gt; code into C++20, then uses &lt;code&gt;clang++&lt;/code&gt; to produce a native binary.&lt;/p&gt;

&lt;p&gt;There’s also a VM mode that skips compilation entirely and runs bytecode instead.&lt;/p&gt;

&lt;p&gt;That gives you two very different workflows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Native pipeline&lt;/strong&gt; → slower startup, fast runtime
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VM pipeline&lt;/strong&gt; → instant startup, slower runtime
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice, it feels like using two tools under one language.&lt;/p&gt;




&lt;h2&gt;
  
  
  The two pipelines (and when they matter)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Native (default)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nob file.nob
nob file.nob &lt;span class="nt"&gt;-o&lt;/span&gt; app
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is what you’d use for anything serious.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compiles via C++20&lt;/li&gt;
&lt;li&gt;Uses &lt;code&gt;clang++&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Runs as a native binary&lt;/li&gt;
&lt;li&gt;Supports everything (networking, threads, async, etc.)&lt;/li&gt;
&lt;/ul&gt;



&lt;h3&gt;
  
  
  VM mode
&lt;/h3&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nob &lt;span class="nt"&gt;--vm&lt;/span&gt; file.nob

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This skips the compiler completely.&lt;/p&gt;

&lt;p&gt;It’s useful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;quick scripts&lt;/li&gt;
&lt;li&gt;REPL work&lt;/li&gt;
&lt;li&gt;testing ideas&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But it’s not feature-complete. Networking, threading, and some advanced features don’t work here.&lt;/p&gt;



&lt;h2&gt;
  
  
  The syntax (closer to Python than C++)
&lt;/h2&gt;

&lt;p&gt;The syntax leans readable without being too loose.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;
&lt;span class="n"&gt;set&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="s2"&gt;"Alice"&lt;/span&gt;

&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;greet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;"Hello, {name}"&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="nb"&gt;print&lt;/span&gt; &lt;span class="n"&gt;greet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Bob"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few things stand out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;set&lt;/code&gt; vs &lt;code&gt;let&lt;/code&gt; (mutable vs immutable)&lt;/li&gt;
&lt;li&gt;1-based indexing&lt;/li&gt;
&lt;li&gt;string interpolation built in&lt;/li&gt;
&lt;li&gt;structured blocks without braces&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s easy to pick up, especially if you’ve used Python or Lua.&lt;/p&gt;



&lt;h2&gt;
  
  
  Features that are actually interesting
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Tail-call optimization
&lt;/h3&gt;

&lt;p&gt;Recursive functions don’t blow the stack if written in tail form:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;
&lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;sum_tail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;acc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;acc&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sum_tail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;acc&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gets compiled into a loop automatically.&lt;/p&gt;
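&lt;p&gt;To make the transformation concrete, here is roughly what that compiles down to, sketched in Python rather than the compiler's actual output: the tail call becomes a parameter rebind and a jump back to the loop head.&lt;/p&gt;

```python
def sum_tail(n, acc=0):
    # What tail-call optimization effectively emits: each "recursive call"
    # just rebinds the parameters and jumps back to the top of the loop.
    while n != 0:
        n, acc = n - 1, acc + n  # mirrors sum_tail(n - 1, acc + n)
    return acc

print(sum_tail(100000))  # no stack overflow: it is just a loop
```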




&lt;ol start="2"&gt;
&lt;li&gt;Pipe operator
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;
&lt;span class="n"&gt;words&lt;/span&gt;
  &lt;span class="err"&gt;|&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="err"&gt;|&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;upper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="err"&gt;|&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It makes chained transformations easier to read.&lt;/p&gt;
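&lt;p&gt;For contrast, the same chain without a pipe operator has to be read inside-out. A Python sketch of both shapes, with comprehensions standing in for NoB's &lt;code&gt;filter&lt;/code&gt;/&lt;code&gt;map&lt;/code&gt;/&lt;code&gt;sort&lt;/code&gt;:&lt;/p&gt;

```python
words = ["hive", "a", "rust", "go", "lambda"]

# Nested form: read from the innermost call outward
nested = sorted(w.upper() for w in words if len(w) > 3)

# Piped form: each step feeds the next, top to bottom
step = [w for w in words if len(w) > 3]
step = [w.upper() for w in step]
piped = sorted(step)

print(piped)  # ['HIVE', 'LAMBDA', 'RUST']
```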




&lt;ol start="3"&gt;
&lt;li&gt;Macros (compile-time)
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;
&lt;span class="n"&gt;macro&lt;/span&gt; &lt;span class="n"&gt;swap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;set&lt;/span&gt; &lt;span class="n"&gt;tmp&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
  &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
  &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tmp&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These run at compile time, not runtime.&lt;/p&gt;




&lt;ol start="4"&gt;
&lt;li&gt;Python backend
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
nob py file.nob &lt;span class="nt"&gt;-o&lt;/span&gt; file.py &lt;span class="nt"&gt;--run&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This converts NoB into Python so you can use Python libraries like NumPy or OpenCV.&lt;/p&gt;

&lt;p&gt;There are limits (no macros, no pipe operator), but it’s useful when you need the ecosystem.&lt;/p&gt;




&lt;ol start="5"&gt;
&lt;li&gt;Built-in concurrency (native only)
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;
&lt;span class="n"&gt;set&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;thread_spawn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;function&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="nb"&gt;print&lt;/span&gt; &lt;span class="s2"&gt;"running"&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;thread_join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Includes threads, mutexes, channels, and async support.&lt;/p&gt;
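&lt;p&gt;The spawn/join pattern maps directly onto ordinary OS threads. An equivalent in Python, with the &lt;code&gt;threading&lt;/code&gt; module standing in for &lt;code&gt;thread_spawn&lt;/code&gt; and &lt;code&gt;thread_join&lt;/code&gt;:&lt;/p&gt;

```python
import threading

messages = []

def worker():
    messages.append("running")  # body of the spawned function

t = threading.Thread(target=worker)  # rough analogue of thread_spawn(...)
t.start()
t.join()                             # rough analogue of thread_join(t)
print(messages)
```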




&lt;h2&gt;
  
  
  Tooling (built-in)
&lt;/h2&gt;

&lt;p&gt;NoB ships with a lot already included:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
nob repl
nob check file.nob
nob profile file.nob
nob &lt;span class="nb"&gt;fmt &lt;/span&gt;file.nob
nob gui

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notable pieces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GUI REPL (&lt;code&gt;nob gui&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;formatter and profiler included&lt;/li&gt;
&lt;li&gt;package manager (&lt;code&gt;nob pkg&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You don’t need to assemble a separate toolchain.&lt;/p&gt;



&lt;h2&gt;
  
  
  Performance
&lt;/h2&gt;

&lt;p&gt;From the official benchmarks:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;Python&lt;/th&gt;
&lt;th&gt;NoB&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Simple loop&lt;/td&gt;
&lt;td&gt;0.498s&lt;/td&gt;
&lt;td&gt;0.046s&lt;/td&gt;
&lt;td&gt;~10.9×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prime count&lt;/td&gt;
&lt;td&gt;0.090s&lt;/td&gt;
&lt;td&gt;0.018s&lt;/td&gt;
&lt;td&gt;~4.9×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fibonacci&lt;/td&gt;
&lt;td&gt;0.734s&lt;/td&gt;
&lt;td&gt;0.484s&lt;/td&gt;
&lt;td&gt;~1.5×&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;What this means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;big gains in loops and numeric work&lt;/li&gt;
&lt;li&gt;smaller gains in recursive workloads&lt;/li&gt;
&lt;li&gt;much faster than Python for CPU-heavy code&lt;/li&gt;
&lt;/ul&gt;



&lt;h2&gt;
  
  
  Compilation targets
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
nob file.nob &lt;span class="nt"&gt;--profile&lt;/span&gt; simd
nob file.nob &lt;span class="nt"&gt;--profile&lt;/span&gt; cuda
nob file.nob &lt;span class="nt"&gt;--profile&lt;/span&gt; wasm

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Supports:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SIMD-optimized builds&lt;/li&gt;
&lt;li&gt;CUDA / OpenCL&lt;/li&gt;
&lt;li&gt;WebAssembly&lt;/li&gt;
&lt;li&gt;LLVM IR&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Platform support
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Support&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;macOS&lt;/td&gt;
&lt;td&gt;Native&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux&lt;/td&gt;
&lt;td&gt;Native&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows&lt;/td&gt;
&lt;td&gt;WSL2&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Pricing (indicative)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;Price&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;td&gt;£0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Indie&lt;/td&gt;
&lt;td&gt;£5–10/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pro&lt;/td&gt;
&lt;td&gt;£10–20/month&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Where it fits
&lt;/h2&gt;

&lt;p&gt;NoB makes sense if you want:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;something faster than Python&lt;/li&gt;
&lt;li&gt;something simpler than C++&lt;/li&gt;
&lt;li&gt;built-in tooling without extra setup&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s less ideal if you need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a large ecosystem&lt;/li&gt;
&lt;li&gt;long-established tooling&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Final thoughts
&lt;/h2&gt;

&lt;p&gt;NoB isn’t trying to reinvent programming. It’s trying to remove friction.&lt;/p&gt;

&lt;p&gt;The dual pipeline is the most practical part—you can prototype quickly in VM mode, then switch to native when performance matters.&lt;/p&gt;

&lt;p&gt;It’s early, but the core design is solid. Worth keeping an eye on.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>oop</category>
      <category>nob</category>
      <category>architecture</category>
    </item>
    <item>
      <title>I built a safety net for python environments because I was tired of debugging “It works on my machine”</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Thu, 09 Apr 2026 09:59:33 +0000</pubDate>
      <link>https://dev.to/rotsl/i-built-a-safety-net-for-python-environments-because-i-was-tired-of-debugging-it-works-on-my-1hig</link>
      <guid>https://dev.to/rotsl/i-built-a-safety-net-for-python-environments-because-i-was-tired-of-debugging-it-works-on-my-1hig</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Why every python developer needs a preflight check for their code&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjhkni89bihdn99chmfpq.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjhkni89bihdn99chmfpq.jpeg" alt="image" width="455" height="167"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I used to have a recurring nightmare. I’d be halfway through a machine learning experiment, three hours into training, when suddenly everything would explode. Not because my code was wrong, but because my environment was quietly broken in a way I couldn’t see coming. Wrong Python version. A package that got installed with pip instead of conda. A wheel that claimed to support my architecture but didn’t. &lt;strong&gt;Rosetta 2&lt;/strong&gt; running my &lt;strong&gt;arm64&lt;/strong&gt; Python as &lt;strong&gt;x86_64&lt;/strong&gt; and tanking my GPU acceleration.&lt;/p&gt;

&lt;p&gt;The error messages were always cryptic. The fixes were always tedious. And the worst part? I never knew if my environment was actually healthy until something went wrong.&lt;/p&gt;

&lt;p&gt;So I built &lt;a href="https://github.com/rotsl/envguard" rel="noopener noreferrer"&gt;&lt;strong&gt;EnvGuard&lt;/strong&gt;&lt;/a&gt;—a CLI tool that validates your Python environment &lt;strong&gt;before&lt;/strong&gt; you run your code, not after it breaks. Think of it as a preflight checklist for your Python projects. If something’s wrong, it blocks execution and tells you exactly what’s broken and how to fix it. If everything passes, your command runs in a validated environment.&lt;/p&gt;

&lt;p&gt;It’s macOS-first (because that’s where I do my work), runs on Linux with partial support, and deliberately doesn’t support Windows (because life’s too short for three-platform maintenance). You can install it from &lt;a href="https://pypi.org/project/envguard-tool/" rel="noopener noreferrer"&gt;&lt;strong&gt;PyPI&lt;/strong&gt;&lt;/a&gt; as &lt;code&gt;envguard-tool&lt;/code&gt; — the CLI command is just &lt;code&gt;envguard&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  The Problem: Python Environments Are Fragile and Invisible
&lt;/h4&gt;

&lt;p&gt;Python environment management is a solved problem in the same way that herding cats is a solved problem. Technically there are tools. Practically, things go wrong constantly.&lt;/p&gt;

&lt;p&gt;Here’s what actually happens in the wild:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The architecture confusion.&lt;/strong&gt; You install Python on an M1 Mac, but somehow you’re running the &lt;code&gt;x86_64&lt;/code&gt; version under Rosetta 2. Everything works, but your Metal Performance Shaders (MPS) acceleration is silently disabled. PyTorch falls back to CPU. Your training takes 10x longer. You don’t notice until you check Activity Monitor and see no GPU usage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The mixed ownership trap.&lt;/strong&gt; You create a conda environment, but then you &lt;strong&gt;pip install&lt;/strong&gt; something because the conda version is outdated. Then you conda install something else. Now you have packages owned by two different managers, and &lt;code&gt;pip check&lt;/code&gt; is screaming about conflicts, but your code still runs so you ignore it until it doesn’t.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The CUDA delusion.&lt;/strong&gt; You’re on macOS, but your &lt;code&gt;requirements.txt&lt;/code&gt; includes &lt;code&gt;torch==2.0.0+cu118&lt;/code&gt; because you copied it from a Linux server. It installs fine. It even imports. But CUDA doesn’t exist on Apple Silicon, and your code fails three layers deep in a stack trace that mentions nothing about GPU compatibility.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The “it worked yesterday” mystery.&lt;/strong&gt; Your environment was fine. Then you updated one package. Now something else is broken. You have no idea what changed or when.&lt;/p&gt;

&lt;p&gt;These aren’t exotic edge cases. They’re daily experiences for Python developers working on ML, data science, or scientific computing projects. The existing tools — &lt;code&gt;conda&lt;/code&gt;, &lt;code&gt;pip&lt;/code&gt;, &lt;code&gt;poetry&lt;/code&gt;, &lt;code&gt;uv&lt;/code&gt;, &lt;code&gt;pyenv&lt;/code&gt; — are great at &lt;strong&gt;creating&lt;/strong&gt; environments. They’re terrible at &lt;strong&gt;validating&lt;/strong&gt; them continuously.&lt;/p&gt;

&lt;h4&gt;
  
  
  The Solution: Preflight Validation Every Single Time
&lt;/h4&gt;

&lt;p&gt;EnvGuard’s core idea is simple: instead of running &lt;code&gt;python train.py&lt;/code&gt; and hoping, you run &lt;code&gt;envguard run -- python train.py&lt;/code&gt;. Before your code executes, EnvGuard runs a nine-step preflight pipeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Detect the host &lt;/strong&gt; — OS version, architecture (native &lt;code&gt;arm64&lt;/code&gt; vs &lt;code&gt;Intel&lt;/code&gt; vs &lt;code&gt;Rosetta 2&lt;/code&gt;), available package managers, network connectivity
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discover the project&lt;/strong&gt;  — scan for &lt;code&gt;pyproject.toml&lt;/code&gt;, &lt;code&gt;requirements.txt&lt;/code&gt;, &lt;code&gt;environment.yml&lt;/code&gt;, &lt;code&gt;Pipfile&lt;/code&gt;, &lt;code&gt;poetry.lock&lt;/code&gt;, etc.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analyze intent&lt;/strong&gt;  — figure out what environment type (venv/conda/pipenv/poetry), Python version, and accelerator targets (CPU/MPS/CUDA) the project needs
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Evaluate rules &lt;/strong&gt; — run 15+ validation rules to catch problems
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fail-fast on critical issues &lt;/strong&gt; — block execution if anything is unrecoverable
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create a resolution plan&lt;/strong&gt;  — determine exactly how to satisfy the environment requirements
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create or repair the environment&lt;/strong&gt; — make sure the actual environment matches the requirements
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validate the environment&lt;/strong&gt;  — run &lt;code&gt;pip check&lt;/code&gt; or equivalent to verify consistency
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Smoke test&lt;/strong&gt; — try importing key packages in an isolated subprocess to catch runtime failures&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If any step fails with a &lt;code&gt;CRITICAL&lt;/code&gt; finding, your command never runs. You get a clear error message explaining what went wrong and how to fix it. No cryptic tracebacks. No debugging environment issues at 2am.&lt;/p&gt;
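&lt;p&gt;The fail-fast behaviour in steps 4 and 5 is the key design point: once a critical finding appears, later steps never run. A minimal sketch of that control flow in Python; the step and severity names here are illustrative, not EnvGuard's actual API:&lt;/p&gt;

```python
CRITICAL = "CRITICAL"

def detect_host(ctx):
    ctx["os"] = "macos"   # pretend detection result
    return []             # no findings

def evaluate_rules(ctx):
    if ctx["os"] == "macos" and ctx.get("wants_cuda"):
        return [(CRITICAL, "CUDA dependency on macOS")]
    return []

def preflight(ctx, steps):
    findings = []
    for step in steps:
        found = step(ctx)
        findings.extend(found)
        if any(sev == CRITICAL for sev, _ in found):
            return False, findings  # fail fast: the command never runs
    return True, findings

ok, findings = preflight({"wants_cuda": True}, [detect_host, evaluate_rules])
print(ok, findings)
```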

&lt;h4&gt;
  
  
  What EnvGuard Actually Catches
&lt;/h4&gt;

&lt;p&gt;The rules engine evaluates 15+ specific checks. Here are the ones that have saved me the most pain:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rule&lt;/th&gt;
&lt;th&gt;Severity&lt;/th&gt;
&lt;th&gt;What it catches&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CUDA_ON_MACOS&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;CRITICAL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Any CUDA dependency on macOS (hardware impossibility)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ROSETTA_TRANSLATION_DETECTED&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;WARNING&lt;/td&gt;
&lt;td&gt;x86_64 Python running under Rosetta 2 on Apple Silicon (kills MPS performance)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ARCHITECTURE_MISMATCH&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;ERROR&lt;/td&gt;
&lt;td&gt;Python architecture doesn't match project requirements&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;MIXED_PIP_CONDA_OWNERSHIP&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;WARNING&lt;/td&gt;
&lt;td&gt;Packages installed by both pip and conda (dependency hell indicator)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;WHEEL_INCOMPATIBLE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;WARNING&lt;/td&gt;
&lt;td&gt;Wheel file doesn't match current platform/architecture&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;BROKEN_ENVIRONMENT&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;ERROR&lt;/td&gt;
&lt;td&gt;Active venv/conda is missing Python binary or critical files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PYTHON_VERSION_BELOW_MINIMUM&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;ERROR&lt;/td&gt;
&lt;td&gt;Python version below &lt;code&gt;requires-python&lt;/code&gt; in pyproject.toml&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;MPS_NOT_AVAILABLE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;INFO&lt;/td&gt;
&lt;td&gt;Apple Silicon present but MPS not available (usually means PyTorch wasn't built with MPS support)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;CUDA_ON_MACOS&lt;/code&gt; rule alone has probably saved me hours of debugging. Here’s what happens: you copy a &lt;code&gt;requirements.txt&lt;/code&gt; from a Linux machine that specifies &lt;code&gt;torch==2.1.0+cu118&lt;/code&gt;. You install it on your M1 Mac. It seems to work — pip doesn’t complain, the import succeeds. But when you actually try to move tensors to the GPU, you get a cryptic error about CUDA devices not being available. EnvGuard catches this at the dependency resolution stage and blocks execution with a message telling you to use &lt;code&gt;mps&lt;/code&gt; or &lt;code&gt;cpu&lt;/code&gt; targets instead.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;ROSETTA_TRANSLATION_DETECTED&lt;/code&gt; rule is subtler but equally important. If you’re running an x86_64 Python binary on an Apple Silicon Mac (usually because you installed it before Rosetta was properly configured, or you’re using an old pyenv), everything works — but MPS acceleration is silently disabled. Your ML training runs on CPU. Your inference is 10x slower than it should be. EnvGuard detects this via &lt;code&gt;sysctl proc_translated&lt;/code&gt; and warns you that you’re leaving performance on the table.&lt;/p&gt;
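&lt;p&gt;That check is cheap to reproduce yourself. On macOS, the &lt;code&gt;sysctl.proc_translated&lt;/code&gt; key reports 1 for a Rosetta-translated process and 0 for a native one, and is absent on Intel Macs and other platforms. A standalone sketch (not EnvGuard's code):&lt;/p&gt;

```python
import subprocess

def running_under_rosetta():
    # "sysctl -n sysctl.proc_translated" prints 1 when the process is
    # translated by Rosetta 2 and 0 when native on Apple Silicon; the key
    # does not exist on Intel Macs or Linux, so we report False there.
    try:
        out = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True, text=True, timeout=5,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return out.stdout.strip() == "1"

print(running_under_rosetta())
```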

&lt;h4&gt;
  
  
  The Technical Architecture: How It Actually Works
&lt;/h4&gt;

&lt;p&gt;EnvGuard is built as a layered Python package with clear separation of concerns. The source is in &lt;a href="https://github.com/rotsl/envguard" rel="noopener noreferrer"&gt;the GitHub repo&lt;/a&gt; under &lt;code&gt;src/envguard/&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CLI Layer&lt;/strong&gt; (&lt;code&gt;cli.py&lt;/code&gt;): Typer-based interface with 25 commands across environment management, dependency resolution, lock files, publishing, and self-updating. Every command supports &lt;code&gt;--json&lt;/code&gt; output for CI/CD integration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Orchestration Layer&lt;/strong&gt; (&lt;code&gt;preflight.py&lt;/code&gt;, &lt;code&gt;doctor.py&lt;/code&gt;): The preflight engine runs the nine-step pipeline. The doctor runs standalone diagnostics without execution. Both use the same underlying detection and rules systems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Domain Layer&lt;/strong&gt;: The heavy lifting happens here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;detect.py&lt;/code&gt; — &lt;code&gt;HostDetector&lt;/code&gt; class gathers OS, architecture, Python, shell, network, and permission facts
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;rules.py&lt;/code&gt; — &lt;code&gt;RulesEngine&lt;/code&gt; evaluates all 15+ validation rules
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;repair.py&lt;/code&gt; — &lt;code&gt;RepairEngine&lt;/code&gt; can automatically fix broken environments (recreate venvs, fix mixed ownership, switch Python versions)
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;models.py&lt;/code&gt; — Pydantic models for &lt;code&gt;HostFacts&lt;/code&gt;, &lt;code&gt;ProjectIntent&lt;/code&gt;, &lt;code&gt;RuleFinding&lt;/code&gt;, &lt;code&gt;ResolutionRecord&lt;/code&gt;, etc.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;project/&lt;/code&gt; — Discovery (scanning for project files), intent analysis (inferring requirements), resolution (dependency solving), and lifecycle management
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;resolver/&lt;/code&gt; — Pluggable backends for PyPI (BFS resolution via JSON API), uv, pip, and conda
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;lock/&lt;/code&gt; — Lock file generation and management (&lt;code&gt;envguard.lock&lt;/code&gt; in TOML format with SHA-256 content hashes)
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;update/&lt;/code&gt; — Self-updating mechanism with SHA-256 verification and rollback support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Platform Layer&lt;/strong&gt;: macOS-specific code for permissions, Rosetta detection, Xcode CLI tools, and LaunchAgent management. Linux gets partial support here: the core pipeline works, but there is no LaunchAgent, no MPS detection, and no Rosetta checks.&lt;/p&gt;

&lt;p&gt;All state files (in &lt;code&gt;.envguard/&lt;/code&gt;) are written atomically using write-to-temp-then-rename to prevent corruption from interrupted writes. Every subprocess call has explicit timeouts. The security model is documented in detail — checksum verification for updates, no shell=True with string interpolation, path traversal protection for archive extraction.&lt;/p&gt;
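&lt;p&gt;The atomic-write pattern is worth spelling out, because it buys a strong guarantee for very little code: a reader sees either the old file or the new one, never a torn write. A generic Python sketch of write-to-temp-then-rename (not EnvGuard's exact implementation):&lt;/p&gt;

```python
import os
import tempfile

def atomic_write(path, data):
    # Write to a temp file in the same directory (rename is only atomic
    # within a single filesystem), flush to disk, then swap into place.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # readers see either the old or the new file
    except BaseException:
        os.unlink(tmp)
        raise
```

&lt;p&gt;&lt;code&gt;os.replace&lt;/code&gt; is the piece doing the work: on POSIX it atomically replaces the destination even if it already exists.&lt;/p&gt;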

&lt;h4&gt;
  
  
  Real Usage: What My Workflow Looks Like
&lt;/h4&gt;

&lt;p&gt;I work on a lot of ML projects with different requirements. Some need PyTorch with MPS. Some need TensorFlow (which has its own special hell on macOS). Some are pure Python data pipelines. Here’s how I actually use EnvGuard day-to-day.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Starting a new project:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ~/projects/new-ml-experiment
envguard init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a &lt;code&gt;.envguard/&lt;/code&gt; directory with &lt;code&gt;state.json&lt;/code&gt;, &lt;code&gt;envguard.toml&lt;/code&gt; (config), and subdirectories for snapshots, cache, logs, and backups. It scans my project files to figure out what I’m building.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Checking if everything is healthy:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;envguard doctor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This runs a series of diagnostic checks: host detection, project discovery, Python environment, package manager health, dependency consistency, accelerator support, permissions, network connectivity, and environment ownership. It outputs a report showing what’s working and what’s not.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Running my actual code:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;envguard run - python train.py - epochs 100 - batch-size 32
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before &lt;code&gt;train.py&lt;/code&gt; executes, the preflight pipeline runs. If my environment has drifted — say, I updated PyTorch and now there’s a version conflict with torchvision — EnvGuard catches it and blocks execution. I can then run &lt;code&gt;envguard repair&lt;/code&gt; to fix it automatically, or &lt;code&gt;envguard lock sync&lt;/code&gt; to reinstall from my lock file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Locking dependencies for reproducibility:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;envguard resolve
envguard lock generate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;resolve&lt;/code&gt; uses the PyPI JSON API to resolve my project dependencies to exact pinned versions. &lt;code&gt;lock generate&lt;/code&gt; writes an &lt;code&gt;envguard.lock&lt;/code&gt; file with SHA-256 content hashes. I commit this to git. When someone else clones the repo, they run &lt;code&gt;envguard install --from-lock&lt;/code&gt; and get exactly the same environment I have.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Self-updating:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;envguard update - dry-run &lt;span class="c"&gt;# check if there's a new version&lt;/span&gt;
envguard update &lt;span class="c"&gt;# actually update with SHA-256 verification and automatic rollback snapshot&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;EnvGuard can update itself. Before applying an update, it creates a rollback snapshot. If something goes wrong, &lt;code&gt;envguard rollback&lt;/code&gt; restores the previous version.&lt;/p&gt;

&lt;h4&gt;
  
  
  The Lock File: Reproducibility Without the Pain
&lt;/h4&gt;

&lt;p&gt;Python dependency management has a reproducibility problem. &lt;code&gt;requirements.txt&lt;/code&gt; with loose version constraints means “install something that hopefully works.” &lt;code&gt;requirements.txt&lt;/code&gt; with pinned versions means “this worked on my machine at one specific moment, but good luck if you’re on a different architecture or Python version.”&lt;/p&gt;

&lt;p&gt;EnvGuard’s lock file (&lt;code&gt;envguard.lock&lt;/code&gt;) tries to be smarter. It’s a TOML file that includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The exact resolved dependency graph with specific versions
&lt;/li&gt;
&lt;li&gt;SHA-256 content hashes for verification
&lt;/li&gt;
&lt;li&gt;Platform and Python version markers (so you can have different resolutions for macOS-arm64 vs Linux-x86_64 if needed)
&lt;/li&gt;
&lt;li&gt;The source files that contributed to the resolution (&lt;code&gt;pyproject.toml&lt;/code&gt;, &lt;code&gt;requirements.txt&lt;/code&gt;, etc.)&lt;/li&gt;
&lt;/ul&gt;
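&lt;p&gt;As a rough picture, a lock file with those pieces might look like the following. The field names here are hypothetical; the real schema is whatever &lt;code&gt;envguard lock generate&lt;/code&gt; emits:&lt;/p&gt;

```toml
# Hypothetical sketch only; the real schema is defined by EnvGuard itself.
[meta]
python = "3.11"
platform = "macosx-arm64"
sources = ["pyproject.toml", "requirements.txt"]

[[package]]
name = "torch"
version = "2.1.0"
sha256 = "..."
```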

&lt;p&gt;The lock file is human-readable but machine-generated. You don’t edit it manually. You regenerate it with &lt;code&gt;envguard lock generate&lt;/code&gt; or update specific packages with &lt;code&gt;envguard lock update --package &amp;lt;name&amp;gt;&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In CI, you can run &lt;code&gt;envguard lock check&lt;/code&gt; to verify that the lock file is up-to-date with your source requirements. It exits with code 13 if stale, which you can use to fail builds that might have inconsistent dependencies.&lt;/p&gt;

&lt;h4&gt;
  
  
  What It Doesn’t Do (And Why)
&lt;/h4&gt;

&lt;p&gt;EnvGuard has deliberate limitations that are worth understanding:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It doesn’t intercept unmanaged launches.&lt;/strong&gt; If you run &lt;code&gt;python train.py&lt;/code&gt; directly, EnvGuard doesn’t see it. Only commands routed through &lt;code&gt;envguard run&lt;/code&gt; get validated. This is by design — EnvGuard is opt-in, not a system-wide interceptor that could break other workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It doesn’t support Windows&lt;/strong&gt;. The codebase uses POSIX-specific APIs throughout (&lt;code&gt;os.access()&lt;/code&gt; for permissions, list-form subprocess arguments, &lt;code&gt;/tmp&lt;/code&gt; paths). Adding Windows support would require a parallel implementation of the platform layer, and I don’t use Windows enough to maintain that. WSL2 works if you need it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It doesn’t make CUDA work on macOS&lt;/strong&gt;. Apple Silicon physically cannot run NVIDIA CUDA. EnvGuard detects CUDA dependencies and blocks them with a clear error, but it can’t magically add CUDA support where none exists.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It doesn’t auto-activate environments on directory change.&lt;/strong&gt; If you want &lt;code&gt;cd my-project&lt;/code&gt; to automatically activate the right &lt;code&gt;venv&lt;/code&gt;, use &lt;code&gt;direnv&lt;/code&gt;. EnvGuard’s shell hooks are minimal and opt-in — they just load the integration, not the environments themselves.&lt;/p&gt;

&lt;p&gt;These limitations are documented in the repo’s &lt;code&gt;docs/limitations.md&lt;/code&gt; and tracked as architectural decisions in &lt;code&gt;docs/adrs/&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Installation and Getting Started
&lt;/h4&gt;

&lt;p&gt;If you’re on macOS 12+ (Monterey) or Linux, installation is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;envguard-tool
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The PyPI package is named &lt;code&gt;envguard-tool&lt;/code&gt; because &lt;code&gt;envguard&lt;/code&gt; was taken. The CLI command and Python import are both just &lt;code&gt;envguard&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;For macOS, there’s also a bootstrap script that installs shell hooks and the LaunchAgent for automatic update checking:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/rotsl/envguard.git
&lt;span class="nb"&gt;cd &lt;/span&gt;envguard
bash scripts/bootstrap.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once installed, verify it works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;envguard - version
envguard doctor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then initialize any Python project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; /path/to/your/project
envguard init
envguard run &lt;span class="nt"&gt;--&lt;/span&gt; python your_script.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Why I Built This (And Who It’s For)
&lt;/h4&gt;

&lt;p&gt;I built EnvGuard for myself, primarily. I’m a researcher working on machine learning for biology — specifically computer vision for fungal pathogen analysis. I work on Apple Silicon Macs. I collaborate with people on Linux servers. I deal with PyTorch, TensorFlow, JAX, and a lot of scientific Python packages with complex native dependencies.&lt;/p&gt;

&lt;p&gt;I was tired of debugging environment issues that had nothing to do with my actual research. I wanted a tool that would catch problems before they cost me hours of training time or corrupted experimental results.&lt;/p&gt;

&lt;p&gt;EnvGuard is for Python developers who:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Work on macOS (especially Apple Silicon) and are tired of Rosetta/architecture surprises
&lt;/li&gt;
&lt;li&gt;Need MPS acceleration for PyTorch and want to know when it’s not actually available
&lt;/li&gt;
&lt;li&gt;Collaborate across different machines and need reproducible environments
&lt;/li&gt;
&lt;li&gt;Are tired of “works on my machine” and want validation that happens &lt;strong&gt;before&lt;/strong&gt; execution
&lt;/li&gt;
&lt;li&gt;Prefer CLI tools that integrate into existing workflows rather than replacing them entirely&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s not for everyone. If you’re a web developer working with simple Python environments, you probably don’t need this. If you’re on Windows, this won’t help you (yet). If you want a full IDE-like environment manager with GUI buttons, look elsewhere.&lt;/p&gt;

&lt;p&gt;But if you’re doing scientific Python or ML and you’ve ever lost a day to a broken environment that you didn’t know was broken until it was too late — EnvGuard might save you some pain.&lt;/p&gt;

&lt;p&gt;Python environment management has been broken for a long time. We’ve accepted “it works on my machine” as an inevitable part of the development experience. We’ve normalized spending hours debugging issues that have nothing to do with our actual code.&lt;/p&gt;

&lt;p&gt;I don’t think it has to be this way. EnvGuard is my attempt to bring some of the safety and validation we expect from production systems (preflight checks, reproducible builds, clear error messages) to the messy world of Python development.&lt;/p&gt;

&lt;p&gt;It’s not perfect. It’s alpha software with known limitations. But it’s already saved me hours of debugging, and I hope it can do the same for you.&lt;/p&gt;

&lt;p&gt;If you’re tired of environment surprises, give it a try. Run &lt;code&gt;pip install envguard-tool&lt;/code&gt;, run &lt;code&gt;envguard doctor&lt;/code&gt; on your project, and see what it finds. You might discover that your “working” environment has been quietly broken in ways you never noticed.&lt;/p&gt;

&lt;h4&gt;
  
  
  Links:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/rotsl/envguard" rel="noopener noreferrer"&gt;github.com/rotsl/envguard&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;PyPI: &lt;a href="https://pypi.org/project/envguard-tool/" rel="noopener noreferrer"&gt;pypi.org/project/envguard-tool/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>cli</category>
      <category>developertools</category>
      <category>macos</category>
      <category>python</category>
    </item>
    <item>
      <title>Health AI on Notion with Tribe V2</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Thu, 02 Apr 2026 15:16:29 +0000</pubDate>
      <link>https://dev.to/rotsl/health-ai-on-notion-with-tribe-v2-2g1j</link>
      <guid>https://dev.to/rotsl/health-ai-on-notion-with-tribe-v2-2g1j</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Local-first Notion health tracker with TRIBEv2 brain analysis, AI health insights, symptom logging, goals, medications, appointments, and a browser UI&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://dev.to/challenges/notion-2026-03-04"&gt;Notion MCP Challenge&lt;/a&gt;*&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqnrnhqd08yxmvtt7zhef.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqnrnhqd08yxmvtt7zhef.jpeg" alt="image" width="800" height="504"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This was supposed to be a Notion challenge submission.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/NJIflkjwPsM"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;I built most of it close to the deadline, got something working, and then missed the window. No big failure story. Just underestimated how long the messy parts would take.&lt;/p&gt;

&lt;p&gt;After that, keeping it private felt pointless. So I pushed it to GitHub.&lt;/p&gt;

&lt;p&gt;Around the same time, I came across &lt;strong&gt;Tribe v2&lt;/strong&gt;. That changed how I looked at this project. Instead of treating it like a failed submission, I started treating it like something that could keep evolving in public.&lt;/p&gt;

&lt;p&gt;That is what this is now. Not finished. Still useful.&lt;/p&gt;

&lt;h4&gt;
  
  
  The actual problem I was trying to solve
&lt;/h4&gt;

&lt;p&gt;I sometimes already track things in Notion:&lt;/p&gt;

&lt;p&gt;• Sleep&lt;/p&gt;

&lt;p&gt;• Workouts&lt;/p&gt;

&lt;p&gt;• Random notes about how I feel&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The problem is not tracking. It is what happens after.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Nothing.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No aggregation. No patterns. No feedback loop. Just logs sitting there.&lt;/p&gt;

&lt;p&gt;Every week I would think I should look at it properly. I never did.&lt;/p&gt;

&lt;p&gt;So this project is basically me outsourcing that thinking step.&lt;/p&gt;

&lt;h4&gt;
  
  
  System design
&lt;/h4&gt;

&lt;p&gt;The architecture is simple on paper and annoying in practice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pipeline&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;• Fetch data from Notion databases&lt;/p&gt;

&lt;p&gt;• Normalize it into a consistent structure&lt;/p&gt;

&lt;p&gt;• Send it to an LLM&lt;/p&gt;

&lt;p&gt;• Write the output back into Notion&lt;/p&gt;

&lt;p&gt;That is it. No fancy orchestration.&lt;/p&gt;

&lt;p&gt;The difficulty is everything in between.&lt;/p&gt;
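&lt;p&gt;As a sketch, the four steps are just functions chained together. Every name below is hypothetical and stubbed out; the real fetch, normalize, and write-back code is considerably messier:&lt;/p&gt;

```python
# Hypothetical sketch of the four-step pipeline; none of these names
# come from the actual repo.
def fetch_from_notion():
    # Stand-in for a Notion API query returning raw page properties.
    return [{"Sleep": "6 hrs", "Mood": "low"}]

def normalize(raw_rows):
    # Map messy property names and values into one internal schema.
    return [{"sleep_hours": 6.0, "mood": "low"} for _ in raw_rows]

def run_llm(records):
    # Stand-in for the LLM call; returns an insight string.
    return f"Analyzed {len(records)} day(s) of data."

def write_back(insight):
    # Stand-in for writing a summary page back to Notion.
    return {"insight": insight}

result = write_back(run_llm(normalize(fetch_from_notion())))
```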

&lt;h4&gt;
  
  
  Notion is not a real database
&lt;/h4&gt;

&lt;p&gt;At first glance, Notion feels structured. It is not.&lt;/p&gt;

&lt;p&gt;Things that break over time:&lt;/p&gt;

&lt;p&gt;• Property names change&lt;/p&gt;

&lt;p&gt;• Data types shift&lt;/p&gt;

&lt;p&gt;• Fields get added or removed&lt;/p&gt;

&lt;p&gt;If you build with fixed schemas, your system breaks quietly.&lt;/p&gt;

&lt;h4&gt;
  
  
  What I did instead
&lt;/h4&gt;

&lt;p&gt;I treated Notion as semi-structured data:&lt;/p&gt;

&lt;p&gt;• Map fields dynamically instead of hardcoding&lt;/p&gt;

&lt;p&gt;• Use fallback parsing when fields do not match&lt;/p&gt;

&lt;p&gt;• Normalize everything into an internal schema&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example internal format:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026–03–20"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"sleep_hours"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;6.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"workout"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"strength"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"mood"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"low"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No matter how messy the source is, the model only sees this cleaned version.&lt;/p&gt;

&lt;h4&gt;
  
  
  Data normalization is the real system
&lt;/h4&gt;

&lt;p&gt;Most of the work went here.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Steps&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Extract raw values from Notion API&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Convert them into usable types&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Handle missing or inconsistent fields&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Align everything by time&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;Examples:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;• "6 hrs" becomes 6.0
 • Empty fields get dropped from inference
 • Mixed labels get standardized
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If this layer is weak, everything downstream gets worse.&lt;/p&gt;
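&lt;p&gt;A minimal sketch of that normalization, boiled down to two helpers (the function names and exact rules are illustrative, not the project's actual code):&lt;/p&gt;

```python
import re

def normalize_sleep(value):
    # "6 hrs" becomes 6.0; empty or unparseable values become None so
    # they can be dropped from inference rather than guessed at.
    if value is None:
        return None
    match = re.search(r"\d+(?:\.\d+)?", str(value))
    return float(match.group()) if match else None

def standardize_label(value, allowed=("low", "medium", "high")):
    # Mixed labels like "Low" or "LOW " collapse to one canonical form.
    cleaned = str(value).strip().lower()
    return cleaned if cleaned in allowed else None
```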

&lt;h4&gt;
  
  
  LLM layer
&lt;/h4&gt;

&lt;p&gt;The model is not used as a general assistant.&lt;/p&gt;

&lt;p&gt;It has a narrow job:&lt;/p&gt;

&lt;p&gt;• Summarize recent data&lt;/p&gt;

&lt;p&gt;• Spot simple patterns&lt;/p&gt;

&lt;p&gt;• Suggest small adjustments&lt;/p&gt;

&lt;h4&gt;
  
  
  Input structure
&lt;/h4&gt;

&lt;p&gt;Each run includes:&lt;/p&gt;

&lt;p&gt;• Recent data window&lt;/p&gt;

&lt;p&gt;• Aggregated values&lt;/p&gt;

&lt;p&gt;• Instructions that limit scope&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Sleep: [6, 5.5, 7, 6]
Workout: [yes, no, yes, yes]
Mood: [low, medium, medium, high]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Task:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;• Identify patterns&lt;/p&gt;

&lt;p&gt;• Avoid assumptions without enough data&lt;/p&gt;

&lt;p&gt;• State uncertainty clearly&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The main issue: the model guesses.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Even with weak data, it tries to sound confident.&lt;/p&gt;

&lt;p&gt;That is a problem, especially for anything health related.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I added&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;• Minimum data thresholds before running inference&lt;/p&gt;

&lt;p&gt;• Prompts that force uncertainty&lt;/p&gt;

&lt;p&gt;• Restrictions on long term claims&lt;/p&gt;

&lt;p&gt;• Filtering outputs that sound too certain&lt;/p&gt;

&lt;p&gt;It still makes mistakes. It just makes fewer confident ones.&lt;/p&gt;
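&lt;p&gt;The threshold part is the simplest to show. A sketch of the gate that decides whether inference runs at all, with a made-up constant and function name:&lt;/p&gt;

```python
MIN_DATA_POINTS = 5  # hypothetical floor; below this, skip the LLM call

def should_run_inference(records):
    # Count only records that actually carry a usable value; sparse or
    # empty logs never reach the model at all.
    usable = [r for r in records if r.get("sleep_hours") is not None]
    return len(usable) >= MIN_DATA_POINTS
```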

&lt;h4&gt;
  
  
  &lt;strong&gt;Writing results back to Notion&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Outputs are stored as:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;• Daily summaries&lt;/p&gt;

&lt;p&gt;• Weekly insights&lt;/p&gt;

&lt;p&gt;• Separate logs for traceability&lt;/p&gt;

&lt;p&gt;Each output includes:&lt;/p&gt;

&lt;p&gt;• Timestamp&lt;/p&gt;

&lt;p&gt;• Data window used&lt;/p&gt;

&lt;p&gt;• Generated insight&lt;/p&gt;

&lt;p&gt;This makes it easier to debug and iterate.&lt;/p&gt;
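&lt;p&gt;Each stored output is essentially one small record. Something in this shape, with field names that are mine for illustration, not the repo's actual schema:&lt;/p&gt;

```python
from datetime import datetime, timezone

# Illustrative shape of one stored insight record; the three fields
# mirror what the text above lists: timestamp, data window, insight.
insight_record = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "data_window": {"start": "2026-03-14", "end": "2026-03-20"},
    "insight": "Short sleep lines up with low-mood days this week.",
}
```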

&lt;h4&gt;
  
  
  Why I stayed inside Notion
&lt;/h4&gt;

&lt;p&gt;I considered building a separate app.&lt;/p&gt;

&lt;p&gt;That would solve a lot of problems:&lt;/p&gt;

&lt;p&gt;• Cleaner schema&lt;/p&gt;

&lt;p&gt;• Better validation&lt;/p&gt;

&lt;p&gt;• Fewer edge cases&lt;/p&gt;

&lt;p&gt;But nobody wants another health app.&lt;/p&gt;

&lt;p&gt;Notion already has the data. So I built on top of it instead.&lt;/p&gt;

&lt;p&gt;The tradeoff is dealing with inconsistency.&lt;/p&gt;

&lt;h4&gt;
  
  
  Influence from Tribe v2
&lt;/h4&gt;

&lt;p&gt;This project shifted direction after I came across Tribe v2.&lt;/p&gt;

&lt;p&gt;The main idea that stuck:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You do not wait until something feels ready.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You ship it. Then improve it in the open.&lt;/p&gt;

&lt;p&gt;That is exactly what this repo reflects. Some parts are solid. Some are clearly not. That is fine.&lt;/p&gt;

&lt;h4&gt;
  
  
  What is still broken
&lt;/h4&gt;

&lt;p&gt;A few things are still rough:&lt;/p&gt;

&lt;p&gt;• Sparse data leads to weak outputs&lt;/p&gt;

&lt;p&gt;• The model confuses correlation with causation&lt;/p&gt;

&lt;p&gt;• Some insights sound better than they are&lt;/p&gt;

&lt;p&gt;• No feedback loop yet to measure usefulness&lt;/p&gt;

&lt;p&gt;The system works. It just does not always matter.&lt;/p&gt;

&lt;h4&gt;
  
  
  What I would change
&lt;/h4&gt;

&lt;p&gt;If I rebuilt or reworked this:&lt;/p&gt;

&lt;p&gt;• Define a stricter schema earlier&lt;/p&gt;

&lt;p&gt;• Separate ingestion and AI layers properly&lt;/p&gt;

&lt;p&gt;• Add better logging from day one&lt;/p&gt;

&lt;p&gt;• Focus more on actionable insights, not just observations&lt;/p&gt;

&lt;h4&gt;
  
  
  Where this could go
&lt;/h4&gt;

&lt;p&gt;A few directions that feel real:&lt;/p&gt;

&lt;p&gt;• Long term memory instead of short windows&lt;/p&gt;

&lt;p&gt;• Feedback loops to track if suggestions help&lt;/p&gt;

&lt;p&gt;• Wearable integrations&lt;/p&gt;

&lt;p&gt;• Confidence scoring for outputs&lt;/p&gt;

&lt;p&gt;Or it might just stay like this. A small layer that makes Notion slightly smarter.&lt;/p&gt;

&lt;h4&gt;
  
  
  Closing
&lt;/h4&gt;

&lt;p&gt;Missing the deadline changed the trajectory of this project.&lt;/p&gt;

&lt;p&gt;If I had submitted it, I probably would have moved on.&lt;/p&gt;

&lt;p&gt;Instead, it is now something I can keep improving without pretending it is finished.&lt;/p&gt;

&lt;p&gt;Right now, it is useful enough to keep using.&lt;/p&gt;

&lt;p&gt;That is enough.&lt;/p&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/rotsl/notion-Health-AI" rel="noopener noreferrer"&gt;https://github.com/rotsl/notion-Health-AI&lt;/a&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>tribev2</category>
      <category>notion</category>
      <category>metaai</category>
    </item>
    <item>
      <title>☕ Pot.OF — AI-Powered HTCPCP Coffee Pot</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Thu, 02 Apr 2026 09:03:18 +0000</pubDate>
      <link>https://dev.to/rotsl/potof-ai-powered-htcpcp-coffee-pot-2cf8</link>
      <guid>https://dev.to/rotsl/potof-ai-powered-htcpcp-coffee-pot-2cf8</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/aprilfools-2026"&gt;DEV April Fools Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;Pot.OF is a playful HTCPCP/1.0 coffee pot simulator inspired by RFC 2324. It includes an interactive terminal, a full &lt;code&gt;418 I'm a Teapot&lt;/code&gt; tea-rejection flow, decaf kernel panic mode, and three optional AI features powered by Google Gemini: an AI Coffee Therapist, an AI Brew Critic, and an AI RFC Generator.&lt;/p&gt;

&lt;p&gt;It solves no real problems, but it does let users argue with a coffee pot that has strong opinions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;Deployed app: &lt;a href="https://pot-of-pj1j.vercel.app/" rel="noopener noreferrer"&gt;pot-of&lt;/a&gt;&lt;br&gt;
Video demo: &lt;a href="https://youtu.be/c_fYiGmoDxk?si=EML1dyDcmtQn1lIq" rel="noopener noreferrer"&gt;YouTube&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;Built with Next.js 16, TypeScript, Tailwind CSS 4, shadcn/ui, Framer Motion, Zustand, Prisma, and Google Gemini.&lt;/p&gt;

&lt;p&gt;Repo link: &lt;a href="https://github.com/rotsl/pot.of" rel="noopener noreferrer"&gt;Github&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How I Built It
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Built the app as a Next.js 16 App Router project with a single interactive coffee-pot interface and dedicated API routes for both protocol behavior and AI features.&lt;/li&gt;
&lt;li&gt;Implemented 3 Gemini-powered AI endpoints:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/api/htcpcp/ai-therapist&lt;/code&gt; — a sentient coffee pot therapist with a consistent personality, multi-turn chat, and coffee-themed advice&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/api/htcpcp/ai-critic&lt;/code&gt; — a dramatic coffee snob that generates absurd tasting notes and scores&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/api/htcpcp/ai-rfc&lt;/code&gt; — an RFC-style generator that creates fake HTCPCP protocol extensions with realistic formatting&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Added a bring-your-own-key flow in the GUI so users can paste their own Gemini API key locally to unlock AI features without requiring a deployment-wide secret&lt;/li&gt;

&lt;li&gt;Built 8 total API routes:

&lt;ul&gt;
&lt;li&gt;5 HTCPCP-inspired core routes for brewing, status, RFC display, teapot mode, and timing&lt;/li&gt;
&lt;li&gt;3 AI routes for therapist, critic, and RFC generation&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Added personality-driven UI behavior including pot moods like &lt;code&gt;idle&lt;/code&gt;, &lt;code&gt;brewing&lt;/code&gt;, &lt;code&gt;happy&lt;/code&gt;, &lt;code&gt;offended&lt;/code&gt;, &lt;code&gt;existential&lt;/code&gt;, and &lt;code&gt;decaf-panic&lt;/code&gt;
&lt;/li&gt;

&lt;li&gt;Implemented joke protocol interactions including:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;BREW tea&lt;/code&gt; -&amp;gt; full-screen &lt;code&gt;418 I'm a Teapot&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;BREW decaf&lt;/code&gt; -&amp;gt; fake kernel panic&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;RFC&lt;/code&gt;, &lt;code&gt;STATUS&lt;/code&gt;, &lt;code&gt;WHEN&lt;/code&gt;, &lt;code&gt;PROPFIND&lt;/code&gt;, and other terminal commands&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Used three generated visual assets for the coffee pot mascot, teapot artwork, and coffee cup imagery&lt;/li&gt;

&lt;li&gt;Deployed it as a Vercel-friendly app with the AI key supplied by each user in the interface instead of hardcoding a shared secret&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Prize Category
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Best Google AI Usage&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The app uses Google Gemini across three distinct feature types: conversational AI through the therapist, creative generation through the brew critic, and structured document generation through the RFC generator. AI is not a side widget here; it is part of the product’s personality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best Ode to Larry Masinter&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The project is built around RFC 2324, including the legendary &lt;code&gt;418 I'm a Teapot&lt;/code&gt;, HTCPCP-style commands, and a coffee pot that takes the protocol far too seriously.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>418challenge</category>
      <category>showdev</category>
    </item>
    <item>
      <title>From Kidney Stones to Convergence</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Sat, 28 Mar 2026 08:16:21 +0000</pubDate>
      <link>https://dev.to/rotsl/from-kidney-stones-to-convergence-gno</link>
      <guid>https://dev.to/rotsl/from-kidney-stones-to-convergence-gno</guid>
      <description>&lt;h4&gt;
  
  
  The strange path from ultrasound physics to rethinking how solvers move through space
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1y4owbcuuxn8awya5g7n.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1y4owbcuuxn8awya5g7n.jpeg" alt="image1" width="392" height="584"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I didn’t expect this to start with kidney stones, but that’s honestly where it began.&lt;/p&gt;

&lt;p&gt;I was reading about ultrasound lithotripsy, how they break stones using focused waves, and I got stuck on the geometry of it. Ellipses, focal points, energy landing exactly where it needs to.&lt;/p&gt;

&lt;p&gt;It is one of those cases where physics feels less like equations and more like choreography.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;That idea just sat there for a while.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Then, separately, I was dealing with solver code. Big systems, messy residuals, the usual “why is this not converging” loop. At some point I stopped thinking in terms of matrices. The system started to feel like a place.&lt;/p&gt;

&lt;p&gt;Some parts resisted everything, like trying to push something heavy across rough ground. Other parts moved too easily and felt unstable. Residuals stopped feeling abstract and started feeling like forces pushing things out of balance.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;That is roughly where PICD came from.&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;PICD does not try to replace anything. It wraps what already works.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;GMRES, CG, Newton – Krylov, BDF. They still do the actual solving. PICD just watches what is happening and keeps some memory: residual history, how the system is partitioned, how different parts relate to each other.&lt;/p&gt;

&lt;p&gt;Then it adjusts the setup for the next solve. Preconditioners, damping, small corrections. Carefully.&lt;/p&gt;

&lt;p&gt;There is a hard boundary it does not cross. If a step does not reduce the residual, it does not count. The usual acceptance rules still apply.&lt;/p&gt;

&lt;p&gt;The “conic” part is just how the system gets split up.&lt;/p&gt;

&lt;p&gt;Instead of one big vector, you break it into regions. Each one tracks its own behavior. Its residual pattern, its neighbors, what worked last time.&lt;/p&gt;

&lt;p&gt;It sounds heavier than it feels. In practice it just gives the solver a bit of context it did not have before.&lt;/p&gt;

&lt;p&gt;The unusual part is treating those regions like they have physical properties.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg9wqs1bp5sz188wl1dym.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg9wqs1bp5sz188wl1dym.jpeg" alt="formulae" width="384" height="108"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Underneath all that is a graph.&lt;/p&gt;

&lt;p&gt;Connections between regions depend on how similar their residuals are, how often they activate together, and the actual structure of the problem. From that you get a Laplacian:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;L = D -W
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It does not replace the solver. It just helps decide what should be grouped together and what should be prioritized.&lt;/p&gt;
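&lt;p&gt;For a toy picture of what that Laplacian looks like in code (the weights here are made up for illustration, not PICD's actual similarity measures):&lt;/p&gt;

```python
import numpy as np

# Symmetric weight matrix W: entry (i, j) is the affinity between
# regions i and j (residual similarity, co-activation, structure).
W = np.array([
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.3],
    [0.1, 0.3, 0.0],
])

# Degree matrix D holds each region's total connection weight,
# then the graph Laplacian is L = D - W.
D = np.diag(W.sum(axis=1))
L = D - W
```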

&lt;p&gt;The solve loop itself is pretty normal:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Pick a solver, partition, build state, adjust preconditioner, run, accept or reject, update.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The results are interesting.&lt;/p&gt;

&lt;p&gt;Everything in the current validation set runs. 98 tests, 22 examples.&lt;/p&gt;

&lt;p&gt;On direct comparisons, same solver with and without PICD, the PICD version is faster in the published benchmark set and uses less memory there as well.&lt;/p&gt;

&lt;p&gt;Linear problems stand out the most. Most cases improve, sometimes by a lot. There is a Helmholtz example that runs hundreds of times faster.&lt;/p&gt;

&lt;p&gt;Nonlinear and time-dependent cases are less clean. Some improve. Some do not. There is a turbulence example that clearly gets worse, with more rejected steps and slower runtime.&lt;/p&gt;

&lt;p&gt;That part I trust more than the wins.&lt;/p&gt;

&lt;p&gt;If there is one thing I would keep in mind, it is that PICD is deliberately limited in what it claims.&lt;/p&gt;

&lt;p&gt;It works well in same-method comparisons. Beyond that, it depends. It does not assume every physics-inspired term helps, and the controller can reduce or disable them when they start hurting convergence.&lt;/p&gt;

&lt;p&gt;I still come back to that original picture of energy being guided instead of forced.&lt;/p&gt;

&lt;p&gt;That is really what this is. Instead of brute-forcing convergence, you reshape the space a little so the solver has an easier path.&lt;/p&gt;

&lt;p&gt;But &lt;strong&gt;it changes how you think about the problem&lt;/strong&gt;. And for me, that shift was the interesting part.&lt;/p&gt;

&lt;p&gt;Read more about my research here, and cite it if you find it useful: &lt;a href="https://doi.org/10.13140/RG.2.2.10721.06243" rel="noopener noreferrer"&gt;https://doi.org/10.13140/RG.2.2.10721.06243&lt;/a&gt;&lt;/p&gt;

</description>
      <category>mathematics</category>
      <category>researchmethods</category>
      <category>algorithms</category>
      <category>physics</category>
    </item>
    <item>
      <title>Your LLM prompts are probably wasting 90% of tokens. Here’s how I fixed mine.</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Sun, 22 Mar 2026 13:04:10 +0000</pubDate>
      <link>https://dev.to/rotsl/your-llm-prompts-are-probably-wasting-90-of-tokens-heres-how-i-fixed-mine-1hg0</link>
      <guid>https://dev.to/rotsl/your-llm-prompts-are-probably-wasting-90-of-tokens-heres-how-i-fixed-mine-1hg0</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frlt9s712wk7xfidv3oz1.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frlt9s712wk7xfidv3oz1.jpeg" alt="Tokens in LLM" width="582" height="327"&gt;&lt;/a&gt;&lt;br&gt;
I keep running into the same problem with LLM apps.&lt;/p&gt;

&lt;p&gt;This work is based on my previous article on dev.to &lt;a href="https://dev.to/rotsl/contextfusion-the-context-brain-your-llm-apps-are-missing-2gkm"&gt;https://dev.to/rotsl/contextfusion-the-context-brain-your-llm-apps-are-missing-2gkm&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You build a retrieval pipeline, hook it up to an API, and then quietly ship prompts that are full of stuff the model doesn’t need. Extra chunks. Duplicates. Half-relevant context that just bloats everything.&lt;/p&gt;

&lt;p&gt;And you pay for all of it.&lt;/p&gt;

&lt;p&gt;CFAdv is basically an attempt to stop doing that.&lt;/p&gt;

&lt;p&gt;It builds on context-fusion, but adds something that turns out to matter more than I expected: even if you pick the right context, you can still mess it up by putting it in the wrong place.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Most pipelines are still doing this&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let’s be honest about the default pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;top_k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s it.&lt;/p&gt;

&lt;p&gt;No budget. No filtering beyond retrieval. No thought about ordering.&lt;/p&gt;

&lt;p&gt;More context is assumed to be better. It often isn’t.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;CFAdv splits the problem in two&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of one “context step”, it does two separate things:&lt;br&gt;
    1.  Decide what gets in&lt;br&gt;
    2.  Decide where it goes&lt;/p&gt;

&lt;p&gt;That separation is the whole point.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Step 1: selecting context under a budget&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of top-k, CFAdv treats selection like an optimization problem.&lt;/p&gt;

&lt;p&gt;Each chunk gets a score based on things like:&lt;br&gt;
    • relevance&lt;br&gt;
    • trust&lt;br&gt;
    • freshness&lt;br&gt;
    • diversity&lt;br&gt;
    • token cost&lt;/p&gt;

&lt;p&gt;Then it tries to pick the best combination under a fixed token budget.&lt;/p&gt;

&lt;p&gt;At a high level:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;utility&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="mf"&gt;0.25&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relevance&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
        &lt;span class="mf"&gt;0.20&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;trust&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
        &lt;span class="mf"&gt;0.15&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;freshness&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
        &lt;span class="mf"&gt;0.15&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;structure&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
        &lt;span class="mf"&gt;0.15&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;diversity&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;risk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="mf"&gt;0.40&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hallucination&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
        &lt;span class="mf"&gt;0.35&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;staleness&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
        &lt;span class="mf"&gt;0.25&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;privacy&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;utility&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;risk&lt;/span&gt;

&lt;span class="n"&gt;Then&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="n"&gt;by&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="n"&gt;density&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="n"&gt;density&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And greedily pack until you hit the budget.&lt;/p&gt;
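&lt;p&gt;A minimal sketch of that greedy packing step, assuming each chunk reduces to a (score, tokens) pair. This is the idea, not CFAdv's actual implementation:&lt;/p&gt;

```python
def greedy_pack(candidates, budget):
    # Rank by value density (score per token), then pack greedily.
    # `candidates` is a list of (score, tokens) pairs.
    ranked = sorted(candidates, key=lambda c: c[0] / max(c[1], 1), reverse=True)
    picked, used = [], 0
    for score, tokens in ranked:
        if used + tokens > budget:
            continue  # skip any chunk that would blow the token budget
        picked.append((score, tokens))
        used += tokens
    return picked, used
```

Dense small chunks win over large mediocre ones, which is the whole point of dividing by token cost.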




&lt;p&gt;&lt;strong&gt;The small trick that makes a big difference&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There’s a simple filter before any of that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;floor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;max_score&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.15&lt;/span&gt;
&lt;span class="n"&gt;selected&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;candidates&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Anything scoring below 15% of the best chunk just gets dropped.&lt;/p&gt;

&lt;p&gt;That sounds minor, but it changes behavior a lot.&lt;br&gt;
    • If your data is clean, everything stays&lt;br&gt;
    • If it’s noisy, most of it disappears&lt;/p&gt;

&lt;p&gt;So you don’t fill your prompt with mediocre content just because you have space.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Step 2: ordering for attention&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the part I underestimated.&lt;/p&gt;

&lt;p&gt;Even if you pick the right chunks, models don’t treat all positions equally. Content near the start (and, to a lesser degree, the end) tends to get more attention than content buried in the middle.&lt;/p&gt;

&lt;p&gt;So CFAdv reorders the selected chunks based on similarity to the query.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Basic version:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;cosine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;cosine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;weights&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;ordered&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nf"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)]&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Higher weight goes earlier in the prompt.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;No embeddings API required&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of calling an external model, it uses a simple hashed bag-of-words vector.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;vec&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\b\w+\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;vec&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
        &lt;span class="n"&gt;vec&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;vec&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linalg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;1e-8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It’s not fancy. No positional info, no learned weights. But for short chunks it works surprisingly well.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Two levels of ordering&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There’s also a second layer.&lt;/p&gt;

&lt;p&gt;Instead of treating everything as one list, CFAdv groups context into blocks:&lt;br&gt;
    • system&lt;br&gt;
    • history&lt;br&gt;
    • retrieval&lt;br&gt;
    • tools&lt;/p&gt;

&lt;p&gt;Then it does:&lt;br&gt;
    1.  sort chunks inside each block&lt;br&gt;
    2.  sort the blocks themselves&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sketch:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# intra-block
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;blocks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# cross-block
&lt;/span&gt;&lt;span class="n"&gt;block_scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;mean_embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;blocks&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;ordered_blocks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blocks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;block_scores&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So you end up shaping the whole prompt, not just shuffling pieces.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The full pipeline&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;CFAdv is an 8-stage pipeline, but it’s easier to think of it like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;ingest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;files&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;blocks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;variants&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;represent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blocks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;candidates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;retrieve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variants&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;selected&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;budget&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;ordered&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;attention_fuse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;selected&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;packet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;assemble&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ordered&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;packet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;qa&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each step is stateless. That makes it easier to test and reason about.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;What happens in practice&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can cut most of the prompt without losing the answer, as long as:&lt;br&gt;
    • retrieval pulls in some noise&lt;br&gt;
    • there is redundancy&lt;br&gt;
    • the query only needs a subset of the data&lt;/p&gt;

&lt;p&gt;If everything is relevant, the system mostly leaves it alone.&lt;/p&gt;

&lt;p&gt;If only one chunk survives selection, ordering doesn’t matter.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Where this actually helps&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This kind of pipeline shines when:&lt;br&gt;
    • your retrieval step is messy&lt;br&gt;
    • you’re concatenating multiple documents&lt;br&gt;
    • prompts are long enough for attention effects to matter&lt;/p&gt;

&lt;p&gt;If you already have clean, minimal context, you won’t see much change.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The part that stuck with me&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This isn’t really about attention or embeddings.&lt;/p&gt;

&lt;p&gt;It’s about treating prompt assembly as something worth optimizing.&lt;/p&gt;

&lt;p&gt;Right now most systems act like prompts are just containers. You throw things in and hope the model figures it out.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;CFAdv flips that.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It asks a simple question: what is the smallest amount of context that still works?&lt;/p&gt;

&lt;p&gt;Then it enforces it.&lt;/p&gt;

&lt;p&gt;And once you start thinking that way, it’s hard to go back to dumping chunks into a string and calling it a day.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Try it yourself&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you want to see how this works in practice or plug it into your own workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/rotsl/CFAdv" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt; &lt;br&gt;
Contains the full Python library, CLI, benchmarks, and tests. You can run it locally, inspect the pipeline stages, or integrate it into your own RAG setup.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://rotsl.github.io/CFAdv/" rel="noopener noreferrer"&gt;Live demo&lt;/a&gt; &lt;br&gt;
Lets you compare raw prompts vs CFAdv-compiled prompts side by side. Useful for quickly seeing how much context gets removed and how ordering changes.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’re already using retrieval + concatenation, the repo is the easiest place to start. Swap your prompt assembly step with CFAdv’s planner + fusion stages and see what drops out.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>promptengineering</category>
      <category>rag</category>
      <category>llm</category>
    </item>
    <item>
      <title>Resume Tailor</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Fri, 20 Mar 2026 14:31:02 +0000</pubDate>
      <link>https://dev.to/rotsl/resume-tailor-3gb3</link>
      <guid>https://dev.to/rotsl/resume-tailor-3gb3</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/notion-2026-03-04"&gt;Notion MCP Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;Resume Tailor takes a job posting and your resume, then outputs a tailored resume and cover letter as PDFs. The whole thing runs in your browser. No sign-up, no server, no data stored anywhere except your Notion workspace if you want it there.&lt;/p&gt;

&lt;p&gt;You pick Claude or Gemini (Gemini has a free tier, no credit card), paste or upload the job description, upload your resume, and click go. Two PDFs come out the other side.&lt;/p&gt;

&lt;p&gt;It also runs as a local Flask app with more features (DOCX support, job URL fetching, richer PDFs) and a CLI if that's your thing.&lt;/p&gt;

&lt;h3&gt;
  
  
  The one rule I actually cared about
&lt;/h3&gt;

&lt;p&gt;The AI is not allowed to make things up. That sounds obvious but it's easy to get wrong. The system prompt on every single call says: you may reorder and reword existing content, you may use keywords from the job description if they honestly describe something the candidate already did, but you cannot add skills, invent metrics, or fabricate roles. If the job asks for five years of Kubernetes experience and the resume doesn't mention Kubernetes, that gap stays in the output.&lt;/p&gt;

&lt;p&gt;I've seen other resume tools confidently add skills the user never had. I didn't want to build that.&lt;/p&gt;




&lt;h2&gt;
  
  
  How Notion MCP works
&lt;/h2&gt;

&lt;p&gt;The Notion integration reads job descriptions from Notion pages and logs every run's output back. If you track jobs in Notion, pass the page ID directly instead of copy-pasting. The system reads the page via MCP.&lt;/p&gt;

&lt;p&gt;After each run, two databases get entries. A Job Applications table tracks company, role, date, and a snippet. A linked Outputs database stores the actual resume and cover letter text as readable blocks. A few weeks in, you have every application: what you sent and what they asked for.&lt;/p&gt;
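&lt;p&gt;Each run's logging boils down to one page-create call per database. A hedged sketch: the &lt;code&gt;API-post-page&lt;/code&gt; tool name and payload shape here are assumptions modeled on the Notion API, not lifted from the repo.&lt;/p&gt;

```python
def log_application(call_notion_mcp, db_id, title_prop, company, role):
    """Append one row to the Job Applications database via MCP.

    Illustrative: the real payload also writes status/date/snippet into the page body.
    """
    return call_notion_mcp("API-post-page", {
        "parent": {"database_id": db_id},
        "properties": {
            # title_prop is whatever the introspection step found, not hard-coded "Name"
            title_prop: {"title": [{"text": {"content": f"{company} - {role}"}}]},
        },
    })
```

&lt;p&gt;Passing the introspected title property name in, rather than hard-coding it, is exactly what makes this survive arbitrary database configurations.&lt;/p&gt;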

&lt;p&gt;I also included &lt;code&gt;.mcp.json&lt;/code&gt; for the official &lt;code&gt;@notionhq/notion-mcp-server&lt;/code&gt;. Claude Desktop and Cursor pick it up, letting you ask Claude things like "which applications are pending?" or "draft a follow-up for the engineering role."&lt;/p&gt;

&lt;p&gt;The Notion API breaks if you write to a property that doesn't exist. Early versions failed when someone's title column wasn't "Name". The fix: introspect the database first, find the actual title property, and put everything else (status, date, company) in the page body as blocks instead of database properties. Works now regardless of configuration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_get_title_property_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;call_notion_mcp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;API-retrieve-a-database&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;database_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;db_id&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;properties&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The refactor (late 2024): I moved from the Notion SDK to a Python MCP client. All calls now route through &lt;code&gt;src/mcp_notion_client.py&lt;/code&gt;, which spawns the Node.js MCP server and communicates via stdio. Same behavior, but the operations now flow through MCP the way the &lt;code&gt;.mcp.json&lt;/code&gt; config intended. The MCP server is launched on demand, with no persistent process, so it's transparent to the user.&lt;/p&gt;
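&lt;p&gt;Stripped to its essentials, that client spawns the server and speaks JSON-RPC over stdio. This is a sketch, not the actual &lt;code&gt;mcp_notion_client.py&lt;/code&gt;: the real client also performs the MCP initialize handshake, which is omitted here for brevity.&lt;/p&gt;

```python
import json
import subprocess

def build_request(tool, arguments, req_id=1):
    """Build a JSON-RPC 'tools/call' request in the shape the MCP spec defines."""
    return {
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

def call_notion_mcp(tool, arguments):
    """Spawn the Node MCP server on demand and exchange one request over stdio.

    Sketch only: skips the initialize handshake the real client performs first.
    """
    proc = subprocess.Popen(
        ["npx", "-y", "@notionhq/notion-mcp-server"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )
    try:
        proc.stdin.write(json.dumps(build_request(tool, arguments)) + "\n")
        proc.stdin.flush()
        return json.loads(proc.stdout.readline()).get("result")
    finally:
        proc.terminate()
```

&lt;p&gt;Because the process is spawned per call and terminated in the &lt;code&gt;finally&lt;/code&gt; block, nothing lingers between runs.&lt;/p&gt;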




&lt;h2&gt;
  
  
  Video demo
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://youtu.be/H5RqClzqvVo?si=7jYTx6aJIPEKH5-F" rel="noopener noreferrer"&gt;Resume Tailor Demo&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Show us the code
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/rotsl/resume-tailor" rel="noopener noreferrer"&gt;https://github.com/rotsl/resume-tailor&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Live demo:&lt;/strong&gt; &lt;a href="https://rotsl.github.io/resume-tailor" rel="noopener noreferrer"&gt;https://rotsl.github.io/resume-tailor&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  How it's structured
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resume-tailor/
├── docs/index.html                   ← the GitHub Pages app, fully self-contained
├── app.py                            ← local Flask server
├── main.py                           ← CLI
├── instruct.md                       ← formatting rules injected into every prompt
├── .mcp.json                         ← Notion MCP server config
├── .github/workflows/deploy.yml      ← deploys docs/ to GitHub Pages on push
├── scripts/
│   └── setup_notion_databases.py     ← creates the Notion DBs via MCP, writes IDs to .env
└── src/
    ├── tailor.py                     ← AI engine, supports Claude and Gemini
    ├── parser.py                     ← PDF / DOCX / text extraction
    ├── pdf_generator.py              ← PDF output via ReportLab
    ├── web_context.py                ← fetches company context from the web
    ├── mcp_notion_client.py          ← Python MCP client for Notion operations
    └── notion_integration.py         ← high-level Notion read/write (uses MCP)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Supporting two AI providers
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;src/tailor.py&lt;/code&gt; has a single &lt;code&gt;tailor_resume()&lt;/code&gt; function that accepts &lt;code&gt;provider&lt;/code&gt;, &lt;code&gt;model&lt;/code&gt;, and &lt;code&gt;api_key&lt;/code&gt; arguments. The same prompts go to both providers. The browser version calls the APIs directly via &lt;code&gt;fetch()&lt;/code&gt;; the local version uses the Python SDKs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Claude
&lt;/span&gt;&lt;span class="n"&gt;tailored&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tailor_resume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;resume&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;job_description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sk-ant-...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Gemini free tier
&lt;/span&gt;&lt;span class="n"&gt;tailored&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tailor_resume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;resume&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;job_description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AIza...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When no key is passed, it falls back to environment variables, so the CLI reads from &lt;code&gt;.env&lt;/code&gt; without asking every time.&lt;/p&gt;
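&lt;p&gt;That fallback is a one-liner per provider. A sketch using the environment variable names from the quick start (the function name is illustrative, not the real &lt;code&gt;tailor.py&lt;/code&gt; API):&lt;/p&gt;

```python
import os

def resolve_api_key(provider, api_key=None):
    """Fall back to the provider's environment variable when no key is passed."""
    # Illustrative helper; env var names match the quick-start .env comments.
    env_vars = {"claude": "ANTHROPIC_API_KEY", "gemini": "GEMINI_API_KEY"}
    return api_key or os.environ.get(env_vars[provider])
```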

&lt;h3&gt;
  
  
  The prompt structure
&lt;/h3&gt;

&lt;p&gt;Two layers. The system prompt sets the hard rules (no fabrication, no adding skills). The user prompt gives the model the original resume, the job description, and any web context about the company as clearly labelled separate sections.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ABSOLUTE RULES — NEVER VIOLATE:
1. You may ONLY use information that exists in the candidate's original resume.
2. Do NOT invent, embellish, or assume any experience, skills, metrics, or facts.
3. You MAY reorder, reword, and emphasize existing content.
4. Mirror keywords from the job description only where they truthfully apply.
5. If the candidate lacks a required skill, do NOT add it. Leave it absent.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The cover letter call gets both the original resume and the already-tailored resume, so it can see exactly what was kept and what was cut.&lt;/p&gt;

&lt;h3&gt;
  
  
  Runtime config using &lt;code&gt;instruct.md&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Formatting rules live in &lt;code&gt;instruct.md&lt;/code&gt; and get injected into every prompt at call time. Swap the file out and the output changes — no code edits. Someone who wants a one-page resume with a specific section order can describe that there. Someone applying to academic roles can put a different set of rules in.&lt;/p&gt;
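&lt;p&gt;The injection can be as simple as reading the file at call time. A sketch under the assumption that the rules are appended to the system prompt (the real merge may differ):&lt;/p&gt;

```python
from pathlib import Path

def build_system_prompt(base_rules, instruct_path="instruct.md"):
    """Append the user-editable formatting rules to the hard no-fabrication rules."""
    # Illustrative helper: a missing file just means no extra rules.
    path = Path(instruct_path)
    formatting = path.read_text() if path.exists() else ""
    return base_rules + "\n\n" + formatting
```

&lt;p&gt;Since the file is re-read on every call, edits take effect on the next run with no restart.&lt;/p&gt;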

&lt;h3&gt;
  
  
  The GitHub Pages version
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;docs/index.html&lt;/code&gt; is the entire app. PDF.js reads uploaded PDFs in the browser, the AI APIs are called directly via fetch, jsPDF builds the output PDFs in memory. The GitHub Actions workflow just copies that one file to Pages on every push to main.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Upload Pages artifact&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/upload-pages-artifact@v3&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docs/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No build step, no npm, no bundler. The tradeoff is no Notion logging on the static version, since there's nowhere safe to store the Notion API key client-side.&lt;/p&gt;

&lt;h3&gt;
  
  
  Notion setup script
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python scripts/setup_notion_databases.py YOUR_NOTION_PAGE_ID
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Creates both databases via MCP, then writes their IDs into &lt;code&gt;.env&lt;/code&gt; automatically. No manual copy-paste needed. The script calls &lt;code&gt;call_notion_mcp("API-create-a-database", {...})&lt;/code&gt; for each database—same flow as the app itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  Quick start
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/YOUR_USERNAME/resume-tailor.git
&lt;span class="nb"&gt;cd &lt;/span&gt;resume-tailor
python &lt;span class="nt"&gt;-m&lt;/span&gt; venv venv &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;source &lt;/span&gt;venv/bin/activate
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env
&lt;span class="c"&gt;# Add GEMINI_API_KEY (free) or ANTHROPIC_API_KEY, plus NOTION_API_KEY&lt;/span&gt;

python scripts/setup_notion_databases.py YOUR_NOTION_PAGE_ID

python app.py  &lt;span class="c"&gt;# → http://localhost:5000&lt;/span&gt;
&lt;span class="c"&gt;# or&lt;/span&gt;
python main.py tailor &lt;span class="nt"&gt;--resume&lt;/span&gt; resume.pdf &lt;span class="nt"&gt;--job-url&lt;/span&gt; https://...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;em&gt;Stack: Claude / Gemini, Notion MCP (Python mcp client + Node.js server), ReportLab, pdfplumber, jsPDF, PDF.js, Flask.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>notionchallenge</category>
      <category>mcp</category>
      <category>ai</category>
    </item>
    <item>
      <title>🧠 Codex OS: I tried turning AI into a local dev “operating system”</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Wed, 18 Mar 2026 19:31:15 +0000</pubDate>
      <link>https://dev.to/rotsl/codex-os-i-tried-turning-ai-into-a-local-dev-operating-system-45f0</link>
      <guid>https://dev.to/rotsl/codex-os-i-tried-turning-ai-into-a-local-dev-operating-system-45f0</guid>
      <description>&lt;p&gt;I’ve been experimenting with a simple idea:&lt;/p&gt;

&lt;p&gt;What if AI wasn’t just a tool you call… but something that behaves more like an operating system for development?&lt;/p&gt;

&lt;p&gt;That’s how Codex OS started.&lt;br&gt;
    • GitHub: &lt;a href="https://github.com/rotsl/codex-os" rel="noopener noreferrer"&gt;https://github.com/rotsl/codex-os&lt;/a&gt;&lt;br&gt;
    • Webpage: &lt;a href="https://rotsl.github.io/codex-os/" rel="noopener noreferrer"&gt;https://rotsl.github.io/codex-os/&lt;/a&gt;&lt;br&gt;
    • npm: &lt;a href="https://www.npmjs.com/package/codexospackage" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/codexospackage&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This isn’t another wrapper around an API. I was trying to build something that feels persistent — like it’s sitting there, managing tasks, running workflows, and helping you think through code instead of just spitting snippets.&lt;/p&gt;

&lt;p&gt;I’m still figuring it out. But it’s already useful in ways I didn’t expect.&lt;/p&gt;


&lt;h3&gt;
  
  
  What Codex OS actually is
&lt;/h3&gt;

&lt;p&gt;At its core, Codex OS is a local-first system that lets you:&lt;br&gt;
    • run AI-driven tasks&lt;br&gt;
    • structure workflows&lt;br&gt;
    • interact with code in a more stateful way&lt;/p&gt;

&lt;p&gt;The key idea: treat AI like a runtime environment, not a function call.&lt;/p&gt;

&lt;p&gt;That changes how you design everything.&lt;/p&gt;

&lt;p&gt;Instead of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;const result &lt;span class="o"&gt;=&lt;/span&gt; await ai.generate&lt;span class="o"&gt;(&lt;/span&gt;prompt&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You’re closer to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;await codex.run&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"analyze-project"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It’s subtle, but it shifts the mindset from “ask → answer” to “delegate → process”.&lt;/p&gt;




&lt;h3&gt;
  
  
  Why I built it
&lt;/h3&gt;

&lt;p&gt;I kept running into the same friction with AI tools:&lt;br&gt;
    • Context gets lost constantly&lt;br&gt;
    • You repeat yourself more than you should&lt;br&gt;
    • There’s no real “memory” unless you bolt it on&lt;br&gt;
    • Everything feels stateless&lt;/p&gt;

&lt;p&gt;It works fine for small tasks. But once you try to build something non-trivial, it starts to feel like you’re babysitting the tool.&lt;/p&gt;

&lt;p&gt;I wanted something that:&lt;br&gt;
    • keeps context around&lt;br&gt;
    • can chain tasks together&lt;br&gt;
    • behaves more like a system than a chatbot&lt;/p&gt;

&lt;p&gt;So I started building it.&lt;/p&gt;



&lt;h3&gt;
  
  
  How it works (without the marketing layer)
&lt;/h3&gt;

&lt;p&gt;There are three main pieces:&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Task execution model
&lt;/h3&gt;

&lt;p&gt;You define actions. Codex runs them.&lt;/p&gt;

&lt;p&gt;These can be things like:&lt;br&gt;
    • analyze files&lt;br&gt;
    • generate code&lt;br&gt;
    • refactor parts of a project&lt;br&gt;
    • run multi-step workflows&lt;/p&gt;

&lt;p&gt;The important part is that tasks can call other tasks. That’s where it starts feeling like a system instead of a script.&lt;/p&gt;
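&lt;p&gt;To make that concrete, here is a minimal sketch of the idea: a task registry where one task can delegate to another. The names (&lt;code&gt;defineTask&lt;/code&gt;, &lt;code&gt;runTask&lt;/code&gt;) are illustrative, not the actual Codex OS API:&lt;/p&gt;

```javascript
// Minimal sketch of a task registry where tasks can call other tasks.
// defineTask/runTask are hypothetical names, not the real Codex OS API.
const tasks = new Map();

function defineTask(name, fn) {
  tasks.set(name, fn);
}

// runTask is passed into each task so a task can delegate to sub-tasks
async function runTask(name, input) {
  const fn = tasks.get(name);
  if (!fn) throw new Error("Unknown task: " + name);
  return fn(input, runTask);
}

// a leaf task: pretend to list the files in a directory
defineTask("list-files", async (dir) => [dir + "/a.js", dir + "/b.js"]);

// a composite task that calls another task
defineTask("analyze-project", async (dir, run) => {
  const files = await run("list-files", dir); // task calling a task
  return { dir, fileCount: files.length };
});
```

&lt;p&gt;Here &lt;code&gt;analyze-project&lt;/code&gt; delegates to &lt;code&gt;list-files&lt;/code&gt;; that composition is what makes it feel like a system rather than a script.&lt;/p&gt;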


&lt;h3&gt;
  
  
  2. Local-first approach
&lt;/h3&gt;

&lt;p&gt;Everything is designed to run locally.&lt;/p&gt;

&lt;p&gt;That decision came early, mostly because:&lt;br&gt;
    • I don’t want to depend entirely on remote APIs&lt;br&gt;
    • local context is easier to manage&lt;br&gt;
    • it’s faster for iteration&lt;/p&gt;

&lt;p&gt;It also makes the whole thing feel more like tooling and less like a service.&lt;/p&gt;


&lt;h3&gt;
  
  
  3. npm package integration
&lt;/h3&gt;

&lt;p&gt;You can install it directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;codexospackage
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once installed, you can start wiring it into your own workflows instead of using it as a standalone tool.&lt;/p&gt;

&lt;p&gt;That’s where it gets interesting.&lt;/p&gt;




&lt;h4&gt;
  
  
  A small example
&lt;/h4&gt;

&lt;p&gt;Here’s a rough idea of how you might use it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;import &lt;span class="o"&gt;{&lt;/span&gt; codex &lt;span class="o"&gt;}&lt;/span&gt; from &lt;span class="s2"&gt;"codexospackage"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

await codex.run&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"review-codebase"&lt;/span&gt;, &lt;span class="o"&gt;{&lt;/span&gt;
  path: &lt;span class="s2"&gt;"./src"&lt;/span&gt;
&lt;span class="o"&gt;})&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instead of asking “what’s wrong with this file?”, you define a reusable task and run it whenever you need.&lt;/p&gt;

&lt;p&gt;It’s closer to scripting your thinking than querying an assistant.&lt;/p&gt;




&lt;h3&gt;
  
  
  What surprised me
&lt;/h3&gt;

&lt;p&gt;I expected this to be a thin abstraction.&lt;/p&gt;

&lt;p&gt;It isn’t.&lt;/p&gt;

&lt;p&gt;Once tasks start calling other tasks, you get something that feels… layered. Almost like a tiny OS scheduler for AI workflows.&lt;/p&gt;

&lt;p&gt;But there’s also a downside:&lt;br&gt;
    • It’s easy to over-engineer things&lt;br&gt;
    • You can end up building systems instead of solving problems&lt;br&gt;
    • Debugging AI-driven flows is still messy&lt;/p&gt;

&lt;p&gt;I’m still working through that.&lt;/p&gt;




&lt;h3&gt;
  
  
  Where this could go
&lt;/h3&gt;

&lt;p&gt;I don’t want to oversell this. It’s early.&lt;/p&gt;

&lt;p&gt;But a few directions feel promising:&lt;br&gt;
    • persistent agents that track project state&lt;br&gt;
    • better tooling for chaining tasks&lt;br&gt;
    • tighter integration with local dev environments&lt;/p&gt;

&lt;p&gt;Right now, it’s somewhere between a tool and an experiment.&lt;/p&gt;




&lt;p&gt;If you try it and it breaks (it probably will in some cases), I’d actually love to hear about it. That’s the only way this gets better.&lt;/p&gt;




&lt;h3&gt;
  
  
  Final thought
&lt;/h3&gt;

&lt;p&gt;I don’t think the future of AI in dev is just better autocomplete.&lt;/p&gt;

&lt;p&gt;It’s systems.&lt;/p&gt;

&lt;p&gt;Small ones at first. Slightly weird. A bit unreliable. But more useful once they stick around and understand what you’re doing.&lt;/p&gt;

&lt;p&gt;Codex OS is my attempt at that.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>code</category>
      <category>productivity</category>
    </item>
    <item>
      <title>ContextFusion: The Context Engineering Layer Your LLM Apps Are Missing</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Wed, 11 Mar 2026 17:31:28 +0000</pubDate>
      <link>https://dev.to/rotsl/contextfusion-the-context-engineering-layer-your-llm-apps-are-missing-99h</link>
      <guid>https://dev.to/rotsl/contextfusion-the-context-engineering-layer-your-llm-apps-are-missing-99h</guid>
      <description>&lt;p&gt;Modern AI applications rely heavily on &lt;strong&gt;Large Language Models (LLMs)&lt;/strong&gt;, but many production systems still struggle with a critical problem:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Context management.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Developers often construct prompts by simply concatenating everything available:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;system instructions
&lt;/li&gt;
&lt;li&gt;user queries
&lt;/li&gt;
&lt;li&gt;conversation history
&lt;/li&gt;
&lt;li&gt;retrieved documents
&lt;/li&gt;
&lt;li&gt;tool outputs
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This works for small prototypes, but in real systems it leads to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;bloated prompts
&lt;/li&gt;
&lt;li&gt;higher API costs
&lt;/li&gt;
&lt;li&gt;increased latency
&lt;/li&gt;
&lt;li&gt;inconsistent responses
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A new discipline is emerging to address this challenge: &lt;strong&gt;context engineering&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of treating prompts as raw text, context engineering treats &lt;strong&gt;information as structured input that must be optimized before being sent to an LLM&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This is exactly what &lt;strong&gt;ContextFusion&lt;/strong&gt; introduces.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub Repository: &lt;a href="https://github.com/rotsl/context-fusion" rel="noopener noreferrer"&gt;https://github.com/rotsl/context-fusion&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;npm Package: &lt;a href="https://www.npmjs.com/package/@rotsl/contextfusion" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@rotsl/contextfusion&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Hidden Problem in LLM Applications
&lt;/h2&gt;

&lt;p&gt;When developers optimize AI systems, they often focus on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;prompt engineering
&lt;/li&gt;
&lt;li&gt;retrieval pipelines
&lt;/li&gt;
&lt;li&gt;model selection
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, the &lt;strong&gt;real bottleneck is frequently the context itself&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Every LLM request must include all relevant information inside the prompt. Since LLM APIs charge and operate based on &lt;strong&gt;tokens&lt;/strong&gt;, inefficient context handling directly affects performance.&lt;/p&gt;

&lt;p&gt;More tokens mean:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;higher inference latency
&lt;/li&gt;
&lt;li&gt;increased API costs
&lt;/li&gt;
&lt;li&gt;greater noise in the prompt
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A typical LLM request pipeline looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;
&lt;span class="nx"&gt;User&lt;/span&gt; &lt;span class="nx"&gt;Input&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;System&lt;/span&gt; &lt;span class="nx"&gt;Prompt&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Conversation&lt;/span&gt; &lt;span class="nx"&gt;History&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Retrieved&lt;/span&gt; &lt;span class="nx"&gt;Documents&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Tool&lt;/span&gt; &lt;span class="nx"&gt;Results&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Final&lt;/span&gt; &lt;span class="nx"&gt;Prompt&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without careful orchestration, this pipeline leads to &lt;strong&gt;prompt bloat&lt;/strong&gt;, where irrelevant or duplicated context inflates token usage.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is ContextFusion?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;ContextFusion is a provider-neutral context compiler designed for token-efficient and low-latency LLM workflows.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of manually assembling prompts, developers supply structured context components.&lt;/p&gt;

&lt;p&gt;ContextFusion then:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;collects context sources
&lt;/li&gt;
&lt;li&gt;normalizes their structure
&lt;/li&gt;
&lt;li&gt;fuses relevant information
&lt;/li&gt;
&lt;li&gt;compiles an optimized prompt
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Conceptually, the system works like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;
&lt;span class="nx"&gt;Raw&lt;/span&gt; &lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Sources&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Normalization&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Fusion&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Optimization&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Compiled&lt;/span&gt; &lt;span class="nx"&gt;Prompt&lt;/span&gt;
&lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;LLM&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can think of ContextFusion as &lt;strong&gt;a build system for LLM context&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Just as compilers optimize source code before execution, ContextFusion optimizes context before it reaches the model.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Context Engineering Matters
&lt;/h2&gt;

&lt;p&gt;Prompt engineering helped developers get started with LLMs. But modern AI systems involve much more complexity:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;multi-step reasoning agents
&lt;/li&gt;
&lt;li&gt;retrieval pipelines (RAG)
&lt;/li&gt;
&lt;li&gt;tool integrations
&lt;/li&gt;
&lt;li&gt;long-running conversations
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of these components produce context that must be merged carefully.&lt;/p&gt;

&lt;p&gt;Consider this example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;
&lt;span class="nx"&gt;System&lt;/span&gt; &lt;span class="nx"&gt;Prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;           &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="nx"&gt;tokens&lt;/span&gt;
&lt;span class="nx"&gt;Conversation&lt;/span&gt; &lt;span class="nx"&gt;History&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="mi"&gt;1200&lt;/span&gt; &lt;span class="nx"&gt;tokens&lt;/span&gt;
&lt;span class="nx"&gt;Retrieved&lt;/span&gt; &lt;span class="nx"&gt;Documents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;     &lt;span class="mi"&gt;1800&lt;/span&gt; &lt;span class="nx"&gt;tokens&lt;/span&gt;
&lt;span class="nx"&gt;Tool&lt;/span&gt; &lt;span class="nx"&gt;Output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;             &lt;span class="mi"&gt;400&lt;/span&gt; &lt;span class="nx"&gt;tokens&lt;/span&gt;
&lt;span class="nx"&gt;User&lt;/span&gt; &lt;span class="nx"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;              &lt;span class="mi"&gt;50&lt;/span&gt; &lt;span class="nx"&gt;tokens&lt;/span&gt;

&lt;span class="nx"&gt;Total&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3650&lt;/span&gt; &lt;span class="nx"&gt;tokens&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Much of this information may not be necessary for the current request.&lt;/p&gt;

&lt;p&gt;ContextFusion helps reduce this overhead by &lt;strong&gt;structuring and prioritizing context before generating the prompt&lt;/strong&gt;.&lt;/p&gt;
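&lt;p&gt;The prioritization idea fits in a few lines. The sketch below is the concept only, not ContextFusion's internal code: score each context item, then spend the token budget on the highest-scoring items first. The scores and the character-based &lt;code&gt;approxTokens&lt;/code&gt; heuristic are assumptions for illustration:&lt;/p&gt;

```javascript
// Illustrative only: budget-aware context prioritization.
// Real systems might derive scores from embedding similarity.
const approxTokens = (text) => Math.ceil(text.length / 4); // rough heuristic

function prioritize(items, budget) {
  // highest relevance first
  const ranked = [...items].sort((a, b) => b.score - a.score);
  const chosen = [];
  let used = 0;
  for (const item of ranked) {
    const cost = approxTokens(item.content);
    if (used + cost > budget) continue; // skip items that blow the budget
    chosen.push(item);
    used += cost;
  }
  return chosen;
}
```

&lt;p&gt;With a budget of a few hundred tokens, low-scoring history and marginal documents simply never make it into the prompt.&lt;/p&gt;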




&lt;h2&gt;
  
  
  ContextFusion Architecture
&lt;/h2&gt;

&lt;p&gt;ContextFusion introduces a &lt;strong&gt;context compilation pipeline&lt;/strong&gt; that separates context management from prompt construction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;             &lt;span class="o"&gt;+---------------------+&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="nx"&gt;Application&lt;/span&gt; &lt;span class="nx"&gt;Logic&lt;/span&gt;  &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;+----------+----------+&lt;/span&gt;
                        &lt;span class="o"&gt;|&lt;/span&gt;
                        &lt;span class="nx"&gt;v&lt;/span&gt;
             &lt;span class="o"&gt;+---------------------+&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt;   &lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Sources&lt;/span&gt;   &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;|---------------------|&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nx"&gt;System&lt;/span&gt; &lt;span class="nx"&gt;Instructions&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nx"&gt;Conversation&lt;/span&gt; &lt;span class="nx"&gt;Memory&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nx"&gt;Retrieved&lt;/span&gt; &lt;span class="nx"&gt;Knowledge&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nx"&gt;Tool&lt;/span&gt; &lt;span class="nx"&gt;Outputs&lt;/span&gt;        &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;+----------+----------+&lt;/span&gt;
                        &lt;span class="o"&gt;|&lt;/span&gt;
                        &lt;span class="nx"&gt;v&lt;/span&gt;
             &lt;span class="o"&gt;+---------------------+&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Normalizer&lt;/span&gt;  &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;+----------+----------+&lt;/span&gt;
                        &lt;span class="o"&gt;|&lt;/span&gt;
                        &lt;span class="nx"&gt;v&lt;/span&gt;
             &lt;span class="o"&gt;+---------------------+&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt;   &lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Fusion&lt;/span&gt;    &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;+----------+----------+&lt;/span&gt;
                        &lt;span class="o"&gt;|&lt;/span&gt;
                        &lt;span class="nx"&gt;v&lt;/span&gt;
             &lt;span class="o"&gt;+---------------------+&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nx"&gt;Context&lt;/span&gt; &lt;span class="nx"&gt;Optimizer&lt;/span&gt;   &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;+----------+----------+&lt;/span&gt;
                        &lt;span class="o"&gt;|&lt;/span&gt;
                        &lt;span class="nx"&gt;v&lt;/span&gt;
             &lt;span class="o"&gt;+---------------------+&lt;/span&gt;
             &lt;span class="o"&gt;|&lt;/span&gt;  &lt;span class="nx"&gt;Compiled&lt;/span&gt; &lt;span class="nx"&gt;Prompt&lt;/span&gt;    &lt;span class="o"&gt;|&lt;/span&gt;
             &lt;span class="o"&gt;+----------+----------+&lt;/span&gt;
                        &lt;span class="o"&gt;|&lt;/span&gt;
                        &lt;span class="nx"&gt;v&lt;/span&gt;
                   &lt;span class="nx"&gt;LLM&lt;/span&gt; &lt;span class="nx"&gt;Provider&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This architecture creates a clean separation between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;application logic&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;context orchestration&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;model inference&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Installing ContextFusion
&lt;/h2&gt;

&lt;p&gt;You can install ContextFusion using npm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm i @rotsl/contextfusion
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;npm package:&lt;br&gt;
&lt;a href="https://www.npmjs.com/package/@rotsl/contextfusion" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@rotsl/contextfusion&lt;/a&gt;&lt;/p&gt;


&lt;h3&gt;
  
  
  Example Usage
&lt;/h3&gt;

&lt;p&gt;Instead of manually constructing prompts, developers provide structured context modules.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ContextFusion&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;context-fusion&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fusion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ContextFusion&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="nx"&gt;fusion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addContext&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You are a helpful coding assistant.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;fusion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addContext&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;memory&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;conversationHistory&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;fusion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addContext&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;retrieval&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;retrievedDocuments&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;fusion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addContext&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;tool&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;toolOutput&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;compiledPrompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;fusion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;compiledPrompt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ContextFusion automatically handles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;merging context sources&lt;/li&gt;
&lt;li&gt;removing duplicate information&lt;/li&gt;
&lt;li&gt;structuring prompt sections&lt;/li&gt;
&lt;li&gt;optimizing token usage&lt;/li&gt;
&lt;/ul&gt;
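
&lt;p&gt;Conceptually, the compile step behaves like the sketch below: drop exact duplicates, enforce a token budget, and emit labeled sections. This is an illustration of the idea, not ContextFusion's source:&lt;/p&gt;

```javascript
// Illustrative context-compile step: dedupe, budget, structure.
// Not ContextFusion's actual implementation.
const approxTokens = (text) => Math.ceil(text.length / 4); // rough heuristic

function compileContext(items, budget) {
  const seen = new Set();
  const sections = {};
  let used = 0;
  for (const { type, content } of items) {
    if (seen.has(content)) continue;          // drop exact duplicates
    const cost = approxTokens(content);
    if (used + cost > budget) continue;       // enforce token budget
    seen.add(content);
    used += cost;
    sections[type] = sections[type] || [];
    sections[type].push(content);
  }
  // structure the prompt into labeled sections, in insertion order
  return Object.entries(sections)
    .map(([type, parts]) => "## " + type + "\n" + parts.join("\n"))
    .join("\n\n");
}
```

&lt;p&gt;Even this naive version removes the most common source of prompt bloat: the same snippet arriving from several sources and being sent to the model more than once.&lt;/p&gt;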




&lt;h2&gt;
  
  
  Modular Context Pipelines
&lt;/h2&gt;

&lt;p&gt;ContextFusion allows developers to structure context into logical modules:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;systemContext&lt;/span&gt;
&lt;span class="nx"&gt;memoryContext&lt;/span&gt;
&lt;span class="nx"&gt;retrievalContext&lt;/span&gt;
&lt;span class="nx"&gt;toolContext&lt;/span&gt;
&lt;span class="nx"&gt;metadataContext&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each module contributes structured information to the final compiled prompt.&lt;/p&gt;

&lt;p&gt;This modular architecture makes LLM applications easier to maintain and scale.&lt;/p&gt;
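
&lt;p&gt;One way to picture these modules is as small functions that each return a typed contribution, composed into a single compiled prompt. The module names mirror the list above, but the function signatures are hypothetical:&lt;/p&gt;

```javascript
// Hypothetical module shape: each module returns { type, content }.
const systemContext = () => ({
  type: "system",
  content: "You are a helpful coding assistant."
});

const memoryContext = (history) => ({
  type: "memory",
  content: history.join("\n")
});

// compose modules into one prompt with labeled sections
function composePipeline(modules) {
  return modules
    .map(({ type, content }) => "[" + type + "]\n" + content)
    .join("\n\n");
}
```

&lt;p&gt;Adding a retrieval or tool module then means adding one function, not rewriting the prompt assembly code.&lt;/p&gt;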




&lt;h2&gt;
  
  
  Designed for AI Agents
&lt;/h2&gt;

&lt;p&gt;Modern AI systems increasingly rely on &lt;strong&gt;agent-based workflows&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A typical agent pipeline might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;User&lt;/span&gt; &lt;span class="nx"&gt;Query&lt;/span&gt;
   &lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Retrieve&lt;/span&gt; &lt;span class="nx"&gt;Knowledge&lt;/span&gt;
   &lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Call&lt;/span&gt; &lt;span class="nx"&gt;External&lt;/span&gt; &lt;span class="nx"&gt;Tools&lt;/span&gt;
   &lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Reasoning&lt;/span&gt; &lt;span class="nx"&gt;Step&lt;/span&gt;
   &lt;span class="err"&gt;↓&lt;/span&gt;
&lt;span class="nx"&gt;Generate&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each step generates additional context that must be merged efficiently.&lt;/p&gt;

&lt;p&gt;ContextFusion manages these layers automatically, ensuring that prompts remain &lt;strong&gt;clean and token-efficient&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  When Should You Use ContextFusion?
&lt;/h2&gt;

&lt;p&gt;ContextFusion is particularly useful for:&lt;/p&gt;

&lt;h3&gt;
  
  
  Retrieval-Augmented Generation (RAG)
&lt;/h3&gt;

&lt;p&gt;RAG pipelines often produce large sets of documents that must be structured carefully before prompting.&lt;/p&gt;

&lt;h3&gt;
  
  
  AI Agents
&lt;/h3&gt;

&lt;p&gt;Agent workflows generate intermediate reasoning steps that become context.&lt;/p&gt;

&lt;h3&gt;
  
  
  Coding Assistants
&lt;/h3&gt;

&lt;p&gt;Large codebases produce significant contextual data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Long Chat Conversations
&lt;/h3&gt;

&lt;p&gt;Conversation history grows rapidly over time and must be managed efficiently.&lt;/p&gt;
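
&lt;p&gt;A common baseline here is a sliding window: keep the most recent turns that still fit a token budget. A context layer automates this kind of bookkeeping; the sketch below is a simplified illustration using a rough character-based token estimate:&lt;/p&gt;

```javascript
// Sliding-window history trimming under a token budget (illustrative).
const approxTokens = (text) => Math.ceil(text.length / 4); // rough heuristic

function trimHistory(turns, budget) {
  const kept = [];
  let used = 0;
  // walk from newest to oldest, keeping turns while the budget allows
  for (let i = turns.length - 1; i >= 0; i--) {
    const cost = approxTokens(turns[i]);
    if (used + cost > budget) break;
    kept.unshift(turns[i]); // preserve chronological order
    used += cost;
  }
  return kept;
}
```

&lt;p&gt;More sophisticated strategies summarize the dropped turns instead of discarding them, but the budget-first discipline is the same.&lt;/p&gt;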




&lt;h2&gt;
  
  
  Context Engineering vs Prompt Engineering
&lt;/h2&gt;

&lt;p&gt;Prompt engineering focuses on &lt;strong&gt;how prompts are written&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Context engineering focuses on &lt;strong&gt;what information the model receives&lt;/strong&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Prompt Engineering&lt;/th&gt;
&lt;th&gt;Context Engineering&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;wording prompts&lt;/td&gt;
&lt;td&gt;selecting context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;formatting instructions&lt;/td&gt;
&lt;td&gt;structuring context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;small prompt optimization&lt;/td&gt;
&lt;td&gt;large workflow optimization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;prompt phrasing&lt;/td&gt;
&lt;td&gt;token efficiency&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As AI systems grow more complex, &lt;strong&gt;context engineering becomes essential infrastructure&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Large Language Models continue to evolve rapidly, but &lt;strong&gt;context remains the primary bottleneck in real-world AI systems&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Simply increasing context window size is not enough.&lt;/p&gt;

&lt;p&gt;Efficient AI systems must:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;select relevant context&lt;/li&gt;
&lt;li&gt;remove redundant information&lt;/li&gt;
&lt;li&gt;structure prompts clearly&lt;/li&gt;
&lt;li&gt;minimize token usage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;ContextFusion introduces an important idea:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Treat context like code. Compile it before execution.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For developers building modern AI applications, especially RAG systems, AI agents, and coding assistants, ContextFusion represents a powerful new architectural layer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;p&gt;GitHub Repository&lt;br&gt;
&lt;a href="https://github.com/rotsl/context-fusion" rel="noopener noreferrer"&gt;https://github.com/rotsl/context-fusion&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;npm Package&lt;br&gt;
&lt;a href="https://www.npmjs.com/package/@rotsl/contextfusion" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@rotsl/contextfusion&lt;/a&gt;&lt;/p&gt;

</description>
      <category>contextengineering</category>
      <category>llmcontextmanagement</category>
      <category>aicontentarchitecture</category>
      <category>tokenefficientprompts</category>
    </item>
    <item>
      <title>ContextFusion: The Context Brain Your LLM Apps Are Missing</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Tue, 10 Mar 2026 21:58:28 +0000</pubDate>
      <link>https://dev.to/rotsl/contextfusion-the-context-brain-your-llm-apps-are-missing-2gkm</link>
      <guid>https://dev.to/rotsl/contextfusion-the-context-brain-your-llm-apps-are-missing-2gkm</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;A deep dive for users who want results and developers who want control&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5idpz1x780al6wby420z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5idpz1x780al6wby420z.png" width="800" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  TL;DR (For the Impatient)
&lt;/h4&gt;

&lt;p&gt;Normal users: Install context-portfolio-optimizer, run cpo compile ./your-docs --budget 4000, and stop overpaying for tokens.&lt;/p&gt;

&lt;p&gt;Developers: A middleware pipeline that ingests heterogeneous sources → normalizes → precomputes → optimizes via a multi-objective knapsack → compiles provider-specific payloads, with delta fusion for agents.&lt;/p&gt;

&lt;p&gt;Both groups get 60–99% token reduction with identical answer quality.&lt;/p&gt;

&lt;h4&gt;
  
  
  Part 1: For Normal Users — “Just Make My LLM Cheaper and Faster”
&lt;/h4&gt;

&lt;h4&gt;
  
  
  The Problem You Actually Face
&lt;/h4&gt;

&lt;p&gt;You’re building with LLMs. Maybe it’s a chatbot over your company docs. Maybe it’s a coding assistant. Maybe it’s an agent that needs to remember context across 20 turns.&lt;/p&gt;

&lt;p&gt;You keep hitting the same frustrations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“Why is this API call so expensive?” — You’re sending 8,000 tokens when 800 would suffice&lt;/li&gt;
&lt;li&gt;“Why does it take 10 seconds to respond?” — Latency scales with prompt size&lt;/li&gt;
&lt;li&gt;“Why does my agent forget everything?” — You’re not managing context deltas across turns&lt;/li&gt;
&lt;li&gt;“Why do I have to rewrite everything when I switch from GPT-4 to Claude?” — Hardcoded prompt formats&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;You’ve tried RAG. You’ve tried chunking. But you’re still blindly stuffing retrieved chunks into prompts without knowing which ones actually matter.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  What ContextFusion Does (No Jargon)
&lt;/h4&gt;

&lt;p&gt;Think of it like a smart travel packer for your LLM trips.&lt;/p&gt;

&lt;p&gt;You have a weight limit (token budget). You have dozens of items (documents, code, images). Some items are essential. Some are nice-to-have. Some are duplicates. Some are risky (outdated, untrusted).&lt;/p&gt;

&lt;p&gt;ContextFusion:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Unpacks everything — PDFs, Word docs, spreadsheets, images, code files&lt;/li&gt;
&lt;li&gt;Weighs and labels each item — How useful? How risky? How heavy?&lt;/li&gt;
&lt;li&gt;Packs the optimal suitcase — Maximum value within your weight limit&lt;/li&gt;
&lt;li&gt;Formats it for your destination — OpenAI’s preferred style, Anthropic’s format, or local Ollama&lt;/li&gt;
&lt;li&gt;Remembers return trips (agent conversations) — it tracks what you already packed and only adds what’s new&lt;/li&gt;
&lt;/ol&gt;
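
&lt;p&gt;&lt;em&gt;The packing step above can be sketched as a toy value-per-token selection. This is a simplified illustration of the idea, not ContextFusion's actual planner; the item names and scores are hypothetical:&lt;/em&gt;&lt;/p&gt;

```python
# Toy "smart packer": pick items by value density until the budget is spent.
# Hypothetical items and scores, not the real ContextFusion API.
items = [
    {"name": "api_reference.md", "value": 9.0, "tokens": 1200},
    {"name": "changelog.md",     "value": 3.0, "tokens": 2500},
    {"name": "auth_flow.py",     "value": 8.0, "tokens": 900},
    {"name": "old_design.pdf",   "value": 2.0, "tokens": 3000},
]

def pack(items, budget):
    # Rank by value per token (greedy density heuristic)
    ranked = sorted(items, key=lambda i: i["value"] / i["tokens"], reverse=True)
    packed, used = [], 0
    for item in ranked:
        if budget - used >= item["tokens"]:
            packed.append(item["name"])
            used += item["tokens"]
    return packed, used

selected, used = pack(items, budget=2500)
print(selected, used)
```

&lt;p&gt;With a 2,500-token budget, the heavy low-value items never make it into the suitcase; only the dense, useful ones do.&lt;/p&gt;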

&lt;h4&gt;
  
  
  Real Results
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fodbsicisqafqtqdulnf3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fodbsicisqafqtqdulnf3.png" width="800" height="491"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Benchmarks run with Claude Sonnet 4.6 on production-like workloads. Full methodology at&lt;/em&gt; &lt;a href="https://github.com/rotsl/context-fusion/tree/main/benchmarks" rel="noopener noreferrer"&gt;&lt;em&gt;github.com/rotsl/context-fusion/benchmarks&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Getting Started (Three Options)
&lt;/h4&gt;

&lt;h4&gt;
  
  
  Option A: NPM Wrapper (Easiest — No Python Required)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# One-time setup&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @rotsl/contextfusion
npx @rotsl/contextfusion setup

&lt;span class="c"&gt;# Create API keys file&lt;/span&gt;
npx @rotsl/contextfusion &lt;span class="nb"&gt;env&lt;/span&gt;
&lt;span class="c"&gt;# Edit .env with your OPENAI_API_KEY or ANTHROPIC_API_KEY&lt;/span&gt;

&lt;span class="c"&gt;# Run optimization&lt;/span&gt;
npx @rotsl/contextfusion run ./my-documents &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Summarize key findings"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--provider&lt;/span&gt; anthropic &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model&lt;/span&gt; claude-sonnet-4-6 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--budget&lt;/span&gt; 4000

&lt;span class="c"&gt;# Launch Web UI&lt;/span&gt;
npx @rotsl/contextfusion ui &lt;span class="nt"&gt;--port&lt;/span&gt; 8080
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Option B: Python Package (More Control)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;context-portfolio-optimizer

&lt;span class="c"&gt;# Set up environment&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; .env &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
ANTHROPIC_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
&lt;/span&gt;&lt;span class="no"&gt;EOF

&lt;/span&gt;&lt;span class="c"&gt;# Run CLI&lt;/span&gt;
cpo run ./my-documents &lt;span class="nt"&gt;--budget&lt;/span&gt; 4000 &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"What are the main points?"&lt;/span&gt;

&lt;span class="c"&gt;# Or compile for specific task type&lt;/span&gt;
cpo compile ./my-codebase &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--task&lt;/span&gt; &lt;span class="s2"&gt;"Explain this function"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--provider&lt;/span&gt; openai &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model&lt;/span&gt; gpt-5-mini &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--mode&lt;/span&gt; code &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--budget&lt;/span&gt; 3000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Option C: Docker (Isolated, Reproducible)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker build &lt;span class="nt"&gt;-t&lt;/span&gt; context-fusion:latest &lt;span class="nb"&gt;.&lt;/span&gt;
docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;:/app context-fusion:latest run ./data &lt;span class="nt"&gt;--budget&lt;/span&gt; 3000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  The Web UI: See What Your LLM Actually Receives
&lt;/h4&gt;

&lt;p&gt;Run cpo ui --port 8080 and open your browser. You'll see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run stats: Files ingested, blocks selected, total tokens&lt;/li&gt;
&lt;li&gt;Representation usage: Which compact variants were chosen&lt;/li&gt;
&lt;li&gt;Selected blocks: Source, representation type, utility score, token estimate&lt;/li&gt;
&lt;li&gt;Context preview: Exactly what gets sent to the LLM&lt;/li&gt;
&lt;li&gt;Model answer: Optional direct comparison&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;This transparency is rare. Most RAG tools are black boxes. ContextFusion shows its work.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Common Use Cases
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc4ofr4m2e7867ajd330.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc4ofr4m2e7867ajd330.png" width="800" height="520"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  When ContextFusion Helps Most
&lt;/h4&gt;

&lt;p&gt;✅ Multi-provider setups — Same pipeline, different output formats&lt;br&gt;&lt;br&gt;
✅ Cost-sensitive production — 60–99% token reduction&lt;br&gt;&lt;br&gt;
✅ Agent conversations — Delta fusion prevents token churn&lt;br&gt;&lt;br&gt;
✅ Complex ingestion — PDFs, images, code, spreadsheets unified&lt;br&gt;&lt;br&gt;
✅ Latency requirements — Precomputation + caching&lt;/p&gt;
&lt;h4&gt;
  
  
  When You Might Not Need It
&lt;/h4&gt;

&lt;p&gt;❌ Simple single-turn Q&amp;amp;A with tiny documents&lt;br&gt;&lt;br&gt;
❌ You’re already heavily invested in a specific RAG framework and happy with costs&lt;br&gt;&lt;br&gt;
❌ You need real-time streaming with sub-100ms latency (ContextFusion adds 50–200ms optimization overhead)&lt;/p&gt;
&lt;h4&gt;
  
  
  Part 2: For Developers — “How This Actually Works”
&lt;/h4&gt;
&lt;h4&gt;
  
  
  Architecture Overview
&lt;/h4&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────────┐
│ INGESTION LAYER │
│ PDF │ DOCX │ CSV │ JSON │ Images (OCR) │ Code │ Markdown │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ NORMALIZATION LAYER │
│ Convert all sources to uniform ContextBlock objects │
│ - source_type, content_hash, created_at, metadata │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ REPRESENTATION LAYER │
│ Precompute compact variants per block: │
│ - universal_summary (general purpose) │
│ - qa_extractive (question-answering focused) │
│ - code_signature (functions, classes, dependencies) │
│ - agent_condensed (working memory format) │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ PRECOMPUTE PIPELINE │
│ Store: fingerprints, summaries, token stats, │
│ retrieval features, compact variants in .cpo_cache/ │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ RETRIEVAL LAYER │
│ Query classification → Lexical retrieval (top-100) │
│ → Fast rerank (top-20/25) → Candidate set │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ MULTI-OBJECTIVE PLANNER (Core) │
│ │
│ maximize Σ( w_u·utility - w_r·risk - w_t·token_cost │
│ - w_l·latency + w_c·cacheability + w_d·diversity ) │
│ │
│ subject to: Σ(token_i) ≤ budget │
│ │
│ Selects optimal representation variant per block │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ COMPRESSION LAYER │
│ - JSON minification │
│ - Citation compaction (Source URI → [id]) │
│ - Schema field pruning │
│ Levels: none │ light │ medium │ aggressive │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ DELTA FUSION (Agent Mode) │
│ Compute ContextDelta: │
│ - added_blocks: new since last turn │
│ - updated_blocks: changed content │
│ - removed_blocks: no longer relevant │
│ - unchanged_block_ids: reuse from cache │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ PROVIDER ADAPTER LAYER │
│ Compile provider-specific payloads: │
│ - openai: chat.completions format │
│ - anthropic: messages with XML citations │
│ - ollama: local API structure │
│ - openai_compatible: generic wrapper │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│ CACHE-AWARE ASSEMBLY │
│ Segment into: │
│ - stable: system instructions, citation maps, cacheable blocks │
│ - dynamic: volatile content, real-time data │
└─────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h4&gt;
  
  
  The Knapsack Formulation: Why This Isn’t Just “Smart Chunking”
&lt;/h4&gt;

&lt;p&gt;Most RAG tools use semantic similarity: embed query, embed chunks, return top-k. This fails when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your budget is 4,000 tokens and you have 50 relevant chunks of 500 tokens each&lt;/li&gt;
&lt;li&gt;Some chunks are high-utility but high-risk (outdated documentation)&lt;/li&gt;
&lt;li&gt;Some chunks are cacheable, others must be fresh&lt;/li&gt;
&lt;li&gt;You need diversity (don’t send 5 versions of the same information)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;ContextFusion’s planner treats this as a constrained optimization problem:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Pseudocode of the core algorithm
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;select_context_blocks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;budget&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    candidates: List[ContextBlock with multiple representation variants]
    budget: int (token limit)
    weights: dict[str, float] (utility, risk, latency, cacheability, diversity)
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="c1"&gt;# Generate all (block, variant) pairs with scores
&lt;/span&gt;    &lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;variant&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;representations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utility&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utility_score&lt;/span&gt;
                &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;risk&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;risk_score&lt;/span&gt;
                &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;token_cost&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;token_count&lt;/span&gt;
                &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;latency&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;latency_estimate&lt;/span&gt;
                &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cacheability&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cache_score&lt;/span&gt;
                &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;diversity&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;diversity_bonus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;selected&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variant&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;token_count&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# Solve 0/1 knapsack for maximum score within budget
&lt;/span&gt;    &lt;span class="n"&gt;selected&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;knapsack_01&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;budget&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;selected&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;This is NP-hard, but with proper indexing and heuristics, it runs in &amp;lt;100ms for typical workloads.&lt;/em&gt;&lt;/p&gt;
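
&lt;p&gt;&lt;em&gt;A common heuristic for this kind of budgeted selection is greedy by score density, taking at most one representation variant per block. This is a sketch of the general technique under those assumptions, not the library's actual solver:&lt;/em&gt;&lt;/p&gt;

```python
from typing import NamedTuple

class Item(NamedTuple):
    block_id: str
    score: float
    tokens: int

def greedy_knapsack(items, budget):
    """Greedy 0/1 knapsack approximation: take items in order of
    score-per-token, one variant per block, within the token budget."""
    ranked = sorted(items, key=lambda i: i.score / i.tokens, reverse=True)
    chosen, used, seen_blocks = [], 0, set()
    for item in ranked:
        if item.block_id in seen_blocks:
            continue  # only one representation variant per block
        if budget - used >= item.tokens:
            chosen.append(item)
            used += item.tokens
            seen_blocks.add(item.block_id)
    return chosen, used

# Hypothetical candidates: doc1 has a full-text and a summary variant
items = [
    Item("doc1", 4.0, 1000),   # full text
    Item("doc1", 3.5, 400),    # summary variant of the same block
    Item("doc2", 5.0, 1500),
    Item("doc3", 1.0, 2000),
]
chosen, used = greedy_knapsack(items, budget=2000)
print([(i.block_id, i.tokens) for i in chosen], used)
```

&lt;p&gt;Note how the cheaper summary variant of doc1 wins on density, which frees budget for doc2 — exactly the variant-selection behavior the planner formalizes.&lt;/p&gt;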

&lt;h4&gt;
  
  
  Code Example: Pipeline Integration
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;context_portfolio_optimizer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PipelineRunner&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Config&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;context_portfolio_optimizer.providers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnthropicAdapter&lt;/span&gt;

&lt;span class="c1"&gt;# Custom configuration
&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_yaml&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
budget:
  instructions: 1000
  retrieval: 3000
  memory: 2000
  examples: 1500
  tool_trace: 1000
  output_reserve: 1000

scoring:
  utility_weights:
    retrieval: 0.25
    trust: 0.20
    freshness: 0.15
    structure: 0.15
    diversity: 0.15
    token_cost: -0.10

provider:
  name: anthropic
  model: claude-sonnet-4-6
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize pipeline
&lt;/span&gt;&lt;span class="n"&gt;runner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PipelineRunner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Run full pipeline
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;runner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;sources&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./docs/architecture.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./src/api.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./data/metrics.csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;How does the authentication flow work?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;task_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;qa&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# chat | qa | code | agent
&lt;/span&gt;    &lt;span class="n"&gt;budget&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;use_precomputed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;compute_delta&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt; &lt;span class="c1"&gt;# Set True for agent loops
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Inspect results
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Selected &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;stats&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;blocks_selected&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; blocks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Total tokens: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;stats&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;total_tokens&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Context preview:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Direct provider compilation
&lt;/span&gt;&lt;span class="n"&gt;adapter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnthropicAdapter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;adapter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile_packet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;context_blocks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;selected_blocks&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Answer with citations&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# payload is ready for anthropic.messages.create(**payload)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Delta Fusion: The Secret to Efficient Agents
&lt;/h4&gt;

&lt;p&gt;Standard agent implementations re-send the entire conversation history plus retrieved context on every turn. Re-sending the same 4,000-token context across 10 turns burns 40,000 tokens.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;ContextFusion’s delta tracking:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Turn 1: Full context
&lt;/span&gt;&lt;span class="n"&gt;turn1_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;runner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sources&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Step 1...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;turn1_packet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;turn1_result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;context_packet&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Turn 2: Only send what changed
&lt;/span&gt;&lt;span class="n"&gt;turn2_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;runner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;sources&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Step 2...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;task_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;previous_packet&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;turn1_packet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# Enable delta computation
&lt;/span&gt;    &lt;span class="n"&gt;compute_delta&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# turn2_result['context_delta'] contains:
# {
# 'added_blocks': [new_retrieved_content],
# 'updated_blocks': [changed_blocks],
# 'removed_blocks': [no_longer_relevant],
# 'unchanged_block_ids': [ids_to_reuse_from_cache],
# 'full_context_hash': 'abc123...' # For cache validation
# }
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The provider adapter assembles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;System instructions (stable, cached)&lt;/li&gt;
&lt;li&gt;Citation map (stable, cached)&lt;/li&gt;
&lt;li&gt;New/updated blocks (dynamic, sent)&lt;/li&gt;
&lt;li&gt;Unchanged block references (cached, not sent)&lt;/li&gt;
&lt;/ul&gt;
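
&lt;p&gt;&lt;em&gt;One way to picture that assembly is a two-segment message list: stable content first so provider-side prompt caching can hit, then only the blocks that changed this turn. The shape below is a hypothetical sketch, not the adapter's real output format:&lt;/em&gt;&lt;/p&gt;

```python
def assemble_messages(system_instructions, citation_map, delta):
    """Sketch: stable segment first (cache-friendly prefix),
    then only the added/updated blocks from this turn's delta."""
    stable = system_instructions + "\n" + citation_map
    dynamic = "\n".join(delta["added_blocks"] + delta["updated_blocks"])
    return [
        {"role": "system", "content": stable},              # stable across turns
        {"role": "user", "content": dynamic or "(no new context)"},
    ]

# Hypothetical delta for turn 2: one new block, nothing updated
delta = {"added_blocks": ["[3] New log excerpt"], "updated_blocks": []}
msgs = assemble_messages(
    "You are a helpful assistant.",
    "[1] api.py [2] docs.pdf",
    delta,
)
print(msgs[1]["content"])
```

&lt;p&gt;Keeping the stable segment byte-identical turn over turn is what lets provider prompt caches skip re-processing it.&lt;/p&gt;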

&lt;h4&gt;
  
  
  Precompute Pipeline: Latency Optimization
&lt;/h4&gt;

&lt;p&gt;For production workloads, precompute expensive operations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# One-time setup (can run offline, on CI, or scheduled)&lt;/span&gt;
cpo precompute ./corpus &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--store-dir&lt;/span&gt; .cpo_cache/precompute &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--semantic-dedup&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--generate-all-representations&lt;/span&gt;

&lt;span class="c"&gt;# Runtime query uses precomputed artifacts&lt;/span&gt;
cpo compile ./corpus &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--precomputed-only&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Quick question"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--budget&lt;/span&gt; 2000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Precomputed artifacts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;fingerprints.jsonl: Content hashes for deduplication&lt;/li&gt;
&lt;li&gt;representations/: All compact variants per block&lt;/li&gt;
&lt;li&gt;token_stats.json: Pre-counted tokens per variant&lt;/li&gt;
&lt;li&gt;retrieval_index.faiss: FAISS index for fast similarity search&lt;/li&gt;
&lt;li&gt;features.jsonl: Utility/risk/cacheability scores&lt;/li&gt;
&lt;/ul&gt;
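&lt;p&gt;The point of precomputed token counts is that budget planning becomes a pure lookup at query time. A sketch of that selection, with an illustrative stats shape (not the actual &lt;code&gt;token_stats.json&lt;/code&gt; schema):&lt;/p&gt;

```python
import operator

# Illustrative token counts per representation variant.
token_stats = {
    "block_a": {"full": 900, "summary": 220, "bullets": 80},
    "block_b": {"full": 1400, "summary": 310, "bullets": 95},
}

def select_variants(stats, budget):
    """Greedy pick: richest variant per block that still fits the budget."""
    chosen, remaining = {}, budget
    for block_id, variants in stats.items():
        for name in ("full", "summary", "bullets"):  # richest first
            cost = variants[name]
            if operator.le(cost, remaining):  # cost fits in what is left
                chosen[block_id] = name
                remaining = remaining - cost
                break
    return chosen, remaining

chosen, left = select_variants(token_stats, budget=1200)
```

&lt;p&gt;With a 1200-token budget, &lt;code&gt;block_a&lt;/code&gt; keeps its full form while &lt;code&gt;block_b&lt;/code&gt; drops to bullets — no tokenizer call needed at runtime.&lt;/p&gt;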

&lt;h4&gt;
  
  
  MCP Server Integration
&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;Expose ContextFusion as an MCP (Model Context Protocol) server:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cpo serve-mcp &lt;span class="nt"&gt;--host&lt;/span&gt; localhost &lt;span class="nt"&gt;--port&lt;/span&gt; 8765
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MCP clients can now call:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tools/ingest: Add documents to context&lt;/li&gt;
&lt;li&gt;tools/compile: Optimize and compile context&lt;/li&gt;
&lt;li&gt;resources/context/{session_id}: Retrieve compiled packets&lt;/li&gt;
&lt;li&gt;tools/delta: Compute context deltas&lt;/li&gt;
&lt;/ul&gt;
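&lt;p&gt;MCP tool invocations ride on JSON-RPC 2.0 framing. A sketch of building such a request — the tool arguments here are hypothetical, since each server defines its own schemas:&lt;/p&gt;

```python
import json

def mcp_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for an MCP tools/call invocation."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Hypothetical arguments for the compile tool described above.
payload = mcp_tool_call(1, "compile", {"query": "Quick question", "budget": 2000})
wire = json.dumps(payload)  # what actually goes over the transport
```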

&lt;h4&gt;
  
  
  Framework Integrations
&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;LangChain:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;context_portfolio_optimizer.integrations&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ContextFusionRetriever&lt;/span&gt;

&lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ContextFusionRetriever&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;sources&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;budget&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;task_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;qa&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Use in any LangChain chain
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.chains&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RetrievalQA&lt;/span&gt;
&lt;span class="n"&gt;qa&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;RetrievalQA&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_chain_type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chat_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;chain_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stuff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;retriever&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;LlamaIndex:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;context_portfolio_optimizer.integrations&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ContextFusionNodeParser&lt;/span&gt;

&lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ContextFusionNodeParser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;budget_per_query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;precompute_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.cpo_cache&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Use with LlamaIndex index construction
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;llama_index.core&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;VectorStoreIndex&lt;/span&gt;
&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;VectorStoreIndex&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;node_parser&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;parser&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Development Setup
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight make"&gt;&lt;code&gt;&lt;span class="nl"&gt;git clone https&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="nf"&gt;//github.com/rotsl/context-fusion.git&lt;/span&gt;
&lt;span class="err"&gt;cd&lt;/span&gt; &lt;span class="err"&gt;context-fusion&lt;/span&gt;
&lt;span class="err"&gt;make&lt;/span&gt; &lt;span class="err"&gt;bootstrap&lt;/span&gt; &lt;span class="c"&gt;# Install dev dependencies
&lt;/span&gt;
&lt;span class="c"&gt;# Development workflow
&lt;/span&gt;&lt;span class="err"&gt;make&lt;/span&gt; &lt;span class="err"&gt;test&lt;/span&gt; &lt;span class="c"&gt;# Run test suite (49 tests)
&lt;/span&gt;&lt;span class="err"&gt;make&lt;/span&gt; &lt;span class="err"&gt;lint&lt;/span&gt; &lt;span class="c"&gt;# Ruff + mypy
&lt;/span&gt;&lt;span class="err"&gt;make&lt;/span&gt; &lt;span class="err"&gt;type-check&lt;/span&gt; &lt;span class="c"&gt;# Strict type checking
&lt;/span&gt;&lt;span class="err"&gt;make&lt;/span&gt; &lt;span class="err"&gt;format&lt;/span&gt; &lt;span class="c"&gt;# Auto-format code
&lt;/span&gt;
&lt;span class="c"&gt;# Local servers
&lt;/span&gt;&lt;span class="nl"&gt;make ui # Web UI on &lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="nf"&gt;8080&lt;/span&gt;
&lt;span class="nl"&gt;make serve-mcp # MCP server on &lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="nf"&gt;8765&lt;/span&gt;

&lt;span class="c"&gt;# Benchmarking
&lt;/span&gt;&lt;span class="err"&gt;make&lt;/span&gt; &lt;span class="err"&gt;benchmark&lt;/span&gt; &lt;span class="c"&gt;# Run full benchmark suite
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Project Structure
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvr0xzunjn8tkl0b52tm3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvr0xzunjn8tkl0b52tm3.png" width="800" height="614"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Performance Characteristics
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsvdgr8x0zwi55pzhey6w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsvdgr8x0zwi55pzhey6w.png" width="800" height="486"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Extending ContextFusion
&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;Custom representation:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;context_portfolio_optimizer.representations&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Representation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;register_representation&lt;/span&gt;

&lt;span class="nd"&gt;@register_representation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my_custom&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyCustomRepresentation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Representation&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ContextBlock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Your custom summarization logic
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;custom_summarize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;estimate_tokens&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;1.3&lt;/span&gt; &lt;span class="c1"&gt;# Rough heuristic
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Custom provider adapter:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;context_portfolio_optimizer.providers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseProviderAdapter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;register_adapter&lt;/span&gt;

&lt;span class="nd"&gt;@register_adapter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my_provider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyProviderAdapter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseProviderAdapter&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;compile_packet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context_blocks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Format for your custom LLM API
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format_system&lt;/span&gt;&lt;span class="p"&gt;()},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;format_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context_blocks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
            &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Part 3: Common Questions
&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;Q: How is this different from LangChain’s &lt;code&gt;ContextualCompressionRetriever&lt;/code&gt;?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;LangChain’s version compresses after retrieval using an LLM call. ContextFusion optimizes which content to retrieve and which representation to use, without requiring an LLM for compression. It’s also provider-agnostic and handles delta fusion for agents.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Q: Does this replace my vector database?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;No. ContextFusion sits after retrieval. Use Pinecone, Weaviate, pgvector, or FAISS for initial retrieval — then pass candidates through ContextFusion for optimization.&lt;/em&gt;&lt;/p&gt;
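&lt;p&gt;The division of labor looks like this in code — both functions below are stand-ins to show the shape of the pipeline, not the actual library API:&lt;/p&gt;

```python
import operator

def retrieve_candidates(query, top_k=20):
    """Stand-in for a vector store query (Pinecone, pgvector, FAISS, ...)."""
    corpus = ["doc about pricing", "doc about auth", "doc about deploys"]
    return corpus[:top_k]

def optimize_context(candidates, budget):
    """Stand-in for ContextFusion: pack candidates into a token budget."""
    packed, used = [], 0
    for doc in candidates:
        cost = len(doc.split())  # crude token proxy for the sketch
        if operator.gt(used + cost, budget):  # would blow the budget: stop
            break
        packed.append(doc)
        used = used + cost
    return packed

candidates = retrieve_candidates("How is pricing billed?")
context = optimize_context(candidates, budget=6)
```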

&lt;p&gt;&lt;em&gt;Q: What about streaming responses?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;ContextFusion optimizes the input context. Streaming the LLM’s output is unaffected. The optimization adds 50–200ms overhead, which is usually offset by reduced LLM latency from shorter prompts.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Q: Can I use this with local models?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Yes. The Ollama adapter works with any OpenAI-compatible local server. Budget planning and compression are even more valuable with slower local hardware.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Q: How do I debug suboptimal context selection?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Run &lt;code&gt;cpo ui&lt;/code&gt; and inspect the "Selected Blocks" panel. Each block shows its utility score, risk score, token count, and why it was included or excluded. Run &lt;code&gt;cpo ablate ./data&lt;/code&gt; to see which blocks contribute most to answer quality.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0945vhqkg1kj1d0mxtfe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0945vhqkg1kj1d0mxtfe.png" width="800" height="526"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/rotsl/context-fusion" rel="noopener noreferrer"&gt;GitHub - rotsl/context-fusion: ContextFusion is the context brain for LLM apps - compress, rank, and route the right evidence to chat + agent models across OpenAI, Claude, Ollama, and MCP&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.npmjs.com/package/@rotsl/contextfusion" rel="noopener noreferrer"&gt;&lt;strong&gt;NPM Package&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Final Thoughts
&lt;/h4&gt;

&lt;p&gt;ContextFusion isn’t just another RAG tool. It’s a bet that context optimization — treating token budgets as scarce resources to be allocated intelligently — will become as essential as retrieval itself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For normal users:&lt;/strong&gt; Install it, run it, pay less.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For developers:&lt;/strong&gt; Extend it, integrate it, build smarter systems.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;Fuse less context. Keep more signal. Ship faster answers.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;⭐️ Star the repo, file issues, submit PRs. ContextFusion is Apache-2.0 and built for production.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>claude</category>
      <category>llm</category>
      <category>rags</category>
      <category>contextengineering</category>
    </item>
    <item>
      <title>I Built a Programming Language That Lets You Write Websites in Plain English</title>
      <dc:creator>RoTSL</dc:creator>
      <pubDate>Thu, 26 Feb 2026 14:29:07 +0000</pubDate>
      <link>https://dev.to/rotsl/i-built-a-programming-language-that-lets-you-write-websites-in-plain-english-3e5e</link>
      <guid>https://dev.to/rotsl/i-built-a-programming-language-that-lets-you-write-websites-in-plain-english-3e5e</guid>
      <description>&lt;p&gt;No HTML. No CSS classes. No build step complexity. Just describe what you want, get production-ready code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fon1lv38o5jz181a9sqjx.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fon1lv38o5jz181a9sqjx.jpeg" width="500" height="332"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Weave pages like you’d knit your sweater. Stay warm! ❤️&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Some time ago, I released Wisp – a zero-dependency UI engine that automatically styles semantic HTML. The response was incredible. Developers loved that they could write clean, accessible markup and have it look professionally designed without touching CSS.&lt;/p&gt;

&lt;p&gt;But something kept bothering me.&lt;/p&gt;

&lt;p&gt;Wisp solved the styling problem, but it didn’t solve the authoring problem.&lt;/p&gt;

&lt;p&gt;You still needed to know HTML. You still had to remember which tags to use and how to structure a hero section properly for accessibility. For developers, that’s second nature. But for content creators, marketers, and domain experts who just want to build a landing page?&lt;/p&gt;

&lt;p&gt;That’s a massive barrier.&lt;/p&gt;

&lt;p&gt;So I built Weave – a natural language interface that turns plain English descriptions into semantic HTML that Wisp (or any other styling engine) can work with.&lt;/p&gt;

&lt;p&gt;Think of it as the missing piece: Weave handles the authoring, Wisp handles the styling, and standard HTML sits in the middle as the universal interface.&lt;/p&gt;
&lt;h4&gt;
  
  
  &lt;strong&gt;The “Aha” Moment&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Here’s what writing a landing page looks like with Weave:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A page called "Helio Labs"
  With a hero
    Showing "Launch faster with Weave"
    With subtitle "Write structure in plain English, ship full HTML"
    With a primary button "Start Building"
  With features
    Having 3 features:
      "Fast authoring" with description "No class-level styling setup needed"
      "Deterministic output" with description "Compiler emits stable HTML"
      "Portable delivery" with description "Export static files instantly"
  Using modern theme

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s it. No tag soup. No &lt;code&gt;class="flex container mx-auto"&lt;/code&gt; noise. Just intent.&lt;/p&gt;

&lt;p&gt;Run this through the Weave compiler, and you get:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;header&lt;/span&gt; &lt;span class="na"&gt;role=&lt;/span&gt;&lt;span class="s"&gt;"banner"&lt;/span&gt; &lt;span class="na"&gt;data-wisp-context=&lt;/span&gt;&lt;span class="s"&gt;"hero"&lt;/span&gt; &lt;span class="na"&gt;data-wisp-priority=&lt;/span&gt;&lt;span class="s"&gt;"critical"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;h1&amp;gt;&lt;/span&gt;Launch faster with Weave&lt;span class="nt"&gt;&amp;lt;/h1&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;Write structure in plain English, ship full HTML&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;a&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"#"&lt;/span&gt; &lt;span class="na"&gt;role=&lt;/span&gt;&lt;span class="s"&gt;"button"&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"primary"&lt;/span&gt; &lt;span class="na"&gt;data-wisp-expand=&lt;/span&gt;&lt;span class="s"&gt;"auto"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    Start Building
  &lt;span class="nt"&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/header&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;main&lt;/span&gt; &lt;span class="na"&gt;role=&lt;/span&gt;&lt;span class="s"&gt;"main"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;section&lt;/span&gt; &lt;span class="na"&gt;aria-labelledby=&lt;/span&gt;&lt;span class="s"&gt;"features-heading"&lt;/span&gt; 
           &lt;span class="na"&gt;data-wisp-context=&lt;/span&gt;&lt;span class="s"&gt;"features"&lt;/span&gt; &lt;span class="na"&gt;data-wisp-density=&lt;/span&gt;&lt;span class="s"&gt;"0.3"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;h2&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"features-heading"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Features&lt;span class="nt"&gt;&amp;lt;/h2&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;ul&lt;/span&gt; &lt;span class="na"&gt;role=&lt;/span&gt;&lt;span class="s"&gt;"list"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;li&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;h3&amp;gt;&lt;/span&gt;Fast authoring&lt;span class="nt"&gt;&amp;lt;/h3&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;No class-level styling setup needed&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;/li&amp;gt;&lt;/span&gt;
      &lt;span class="c"&gt;&amp;lt;!-- ... --&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/ul&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/section&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/main&amp;gt;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqoagwclsy4ag72dhk5va.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqoagwclsy4ag72dhk5va.png" width="800" height="557"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Notice the data-wisp-* attributes? Those are optimization hints for the Wisp runtime. If Wisp is present, you get automatic context-aware styling. If not, you still have perfectly valid, accessible HTML5 that works everywhere.&lt;/p&gt;

&lt;h4&gt;
  
  
  Why This Matters
&lt;/h4&gt;

&lt;p&gt;The web development landscape has become increasingly complex. We’ve gone from simple HTML pages to build-step-heavy frameworks that require:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Learning JSX/template syntax&lt;/li&gt;
&lt;li&gt;Understanding component hierarchies&lt;/li&gt;
&lt;li&gt;Managing state and props&lt;/li&gt;
&lt;li&gt;Configuring bundlers and transpilers&lt;/li&gt;
&lt;li&gt;Debugging CSS specificity wars&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Weave inverts this complexity.&lt;/p&gt;

&lt;p&gt;It asks: What if the barrier to creating web content was as low as writing a document outline?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Weave Philosophy&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Semantic fidelity first – Output must be valid, accessible HTML5&lt;/li&gt;
&lt;li&gt;Deterministic compilation – Same script, same output, every time&lt;/li&gt;
&lt;li&gt;Progressive disclosure – Simple cases are simple; complex cases are possible&lt;/li&gt;
&lt;li&gt;Wisp compatibility – Generated HTML maximizes context-detection for styling&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  How It Works: The Compiler Pipeline
&lt;/h4&gt;

&lt;p&gt;Weave isn’t just a templating engine. It’s a proper compiler with a two-phase architecture:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Phase 1: Parsing (parseWeave)&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lexer: Tokenizes input using indentation as block delimiters (Python-style off-side rule)&lt;/li&gt;
&lt;li&gt;Parser: Recursive descent parser with LL(1) lookahead, building a typed Abstract Syntax Tree (AST)&lt;/li&gt;
&lt;li&gt;Type checker: Validates semantic constraints (e.g., buttons must be inside sections, images need alt text)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Phase 2: Code Generation (compileWeave)&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AST traversal: Visitor pattern walk of the typed tree&lt;/li&gt;
&lt;li&gt;HTML emission: Generates semantic HTML5 with proper ARIA roles&lt;/li&gt;
&lt;li&gt;Wisp optimization: Injects data-wisp-* hints for enhanced styling&lt;/li&gt;
&lt;li&gt;Post-processing: Optional minification, pretty-printing, or CSS inlining&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result? Linear time complexity O(n) – compile times under 10ms for typical scripts.&lt;/p&gt;
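&lt;p&gt;To make the off-side rule concrete, here is an illustrative tokenization pass in Python — Weave itself ships as a JavaScript compiler, so this is a teaching sketch, not its real lexer:&lt;/p&gt;

```python
def indent_blocks(source):
    """Group lines into (depth, text) pairs using leading spaces as nesting,
    per the off-side rule: two spaces equal one level (illustrative only)."""
    tokens = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines carry no structure
        depth = (len(line) - len(line.lstrip(" "))) // 2
        tokens.append((depth, stripped))
    return tokens

script = 'A page called "Demo"\n  With a hero\n    Showing "Hi"'
tokens = indent_blocks(script)
```

&lt;p&gt;A single linear scan yields the nesting a recursive descent parser can then walk — which is why compile time stays O(n).&lt;/p&gt;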

&lt;h4&gt;
  
  
  Using Weave in Your Projects
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;As an npm Package&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Install it:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @rotsl/weave

&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;Use it programmatically:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;parseWeave&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;compileWeave&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@rotsl/weave&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;script&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`
A page called "My Product"
  With a hero
    Showing "Ship faster"
    With a primary button "Get Started"
  Using modern theme
`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ast&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parseWeave&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;script&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;compileWeave&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ast&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; 
  &lt;span class="na"&gt;wispHints&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;minify&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; 
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;Or use the CLI:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Compile a script&lt;/span&gt;
weave build page.weave &lt;span class="nt"&gt;-o&lt;/span&gt; page.html

&lt;span class="c"&gt;# Watch mode for development&lt;/span&gt;
weave watch ./pages/ &lt;span class="nt"&gt;--output&lt;/span&gt; ./dist/

&lt;span class="c"&gt;# Validate without compiling&lt;/span&gt;
weave validate page.weave

&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;&lt;strong&gt;The Visual Editor&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For non-technical users, there’s a browser-based editor with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Split-pane interface: Script on the left, live preview on the right&lt;/li&gt;
&lt;li&gt;Real-time error highlighting: Catch mistakes as you type&lt;/li&gt;
&lt;li&gt;Wisp toggle: See raw vs. styled output instantly&lt;/li&gt;
&lt;li&gt;One-click export: Download HTML or full project bundles&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  The Weave-Wisp Ecosystem
&lt;/h4&gt;

&lt;p&gt;Here’s where it gets interesting. Weave and Wisp form a complete content-to-presentation pipeline:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxset5o4e66wigqdn2r9n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxset5o4e66wigqdn2r9n.png" width="800" height="247"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Content teams iterate on copy and structure without developer bottlenecks&lt;/li&gt;
&lt;li&gt;Developers maintain styling logic independently of content&lt;/li&gt;
&lt;li&gt;Accessibility is built-in, not bolted-on (ARIA roles, heading hierarchies, alt text enforcement)&lt;/li&gt;
&lt;li&gt;Performance is optimal (zero runtime dependencies, ~5KB optional Wisp runtime)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Real-World Use Cases
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;1. Marketing Landing Pages&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Your marketing team wants to A/B test three hero variants. Instead of filing Jira tickets and waiting for dev resources, they write:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A page called "Summer Campaign"
  With a hero
    Showing "Save 50% this summer"
    With a secondary button "View Plans"
  Using playful theme

&lt;/code&gt;&lt;/pre&gt;



&lt;p&gt;&lt;em&gt;Compile, deploy, done.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Documentation Sites&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Technical writers focus on content hierarchy, not CSS frameworks. Weave enforces proper heading structure (h1 → h2 → h3) and generates table-of-contents-ready markup.&lt;/p&gt;
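&lt;p&gt;As a sketch of what “enforces proper heading structure” can mean in practice, here is a minimal hierarchy check over a sequence of heading levels — an illustration of the idea, not Weave’s actual implementation:&lt;/p&gt;

```javascript
// Sketch of a heading-hierarchy check like the one Weave is described as
// enforcing (h1 → h2 → h3, no skipped levels). Illustrative only.
function checkHeadingHierarchy(levels) {
  const errors = [];
  let prev = 0; // level of the previous heading; 0 = document start
  for (const [i, level] of levels.entries()) {
    // Going deeper by more than one level skips a heading rank.
    if (level > prev + 1) {
      errors.push(`heading ${i + 1}: h${level} follows h${prev}, skipping h${prev + 1}`);
    }
    prev = level;
  }
  return errors;
}

// A well-formed outline passes...
console.log(checkHeadingHierarchy([1, 2, 3, 2, 3])); // []
// ...while a skipped level is flagged.
console.log(checkHeadingHierarchy([1, 3])); // [ 'heading 2: h3 follows h1, skipping h2' ]
```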

&lt;p&gt;&lt;strong&gt;3. Rapid Prototyping&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Validate page structure before investing in visual design. Weave’s six built-in themes (modern, minimal, corporate, playful, elegant, dark) give you instant visual feedback via Wisp integration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Accessibility-First Development&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Weave implements accessibility by construction:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automatic landmark roles&lt;/li&gt;
&lt;li&gt;Enforced heading hierarchies&lt;/li&gt;
&lt;li&gt;Alt text validation for images&lt;/li&gt;
&lt;li&gt;Semantic button vs. link detection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;You get WCAG 2.1 AA compliant markup without thinking about it.&lt;/em&gt;&lt;/p&gt;
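&lt;p&gt;“By construction” means invalid states are unrepresentable: for example, an image node that simply cannot be built without alt text. A minimal sketch of that pattern — hypothetical names, not Weave’s internals:&lt;/p&gt;

```javascript
// Accessibility by construction: refuse to build an image node at all
// unless alt text is present, so no downstream pass can forget the check.
// Illustrative only; the function and node shape are hypothetical.
function imageNode({ src, alt }) {
  if (typeof alt !== "string" || alt.trim() === "") {
    throw new Error(`Image "${src}" is missing alt text`);
  }
  return { tag: "img", src, alt };
}

console.log(imageNode({ src: "hero.png", alt: "Summer sale banner" }));
// { tag: 'img', src: 'hero.png', alt: 'Summer sale banner' }
```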

&lt;h4&gt;
  
  
  Under the Hood: Technical Highlights
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Grammar Design&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Weave uses an indentation-sensitive grammar, formally specified in EBNF, that still reads like English:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;section_declaration ::= "With" section_type { element_declaration }
section_type ::= "a" ( "hero" | "features" | "content" | ... )
element_declaration ::= "Showing" string_literal 
                      | "With" "a" button_type "button" string_literal
                      | "Having" number "features" ":" feature_list

&lt;/code&gt;&lt;/pre&gt;
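&lt;p&gt;To make the grammar concrete, here is how the two most common element declarations could be recognized — an illustrative fragment, far simpler than the real parser:&lt;/p&gt;

```javascript
// Illustrative recognition of single element declarations from the grammar
// above. The real Weave parser handles indentation and many more forms.
function parseElement(line) {
  let m;
  if ((m = line.match(/^Showing "([^"]*)"$/))) {
    return { type: "text", value: m[1] };
  }
  if ((m = line.match(/^With a (\w+) button "([^"]*)"$/))) {
    return { type: "button", style: m[1], label: m[2] };
  }
  throw new SyntaxError(`Unrecognized element declaration: ${line}`);
}

console.log(parseElement('Showing "Save 50% this summer"'));
// { type: 'text', value: 'Save 50% this summer' }
console.log(parseElement('With a secondary button "View Plans"'));
// { type: 'button', style: 'secondary', label: 'View Plans' }
```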



&lt;p&gt;&lt;strong&gt;Error Handling That Doesn’t Suck&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flxt0te8fn375avoanh7j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flxt0te8fn375avoanh7j.png" width="800" height="238"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Instead of cryptic parser errors, Weave gives you line numbers, surrounding context, and “did you mean” suggestions.&lt;/p&gt;
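&lt;p&gt;A “did you mean” hint is typically a nearest-neighbour search over the keyword vocabulary by edit distance. A self-contained sketch of that idea (not Weave’s code):&lt;/p&gt;

```javascript
// Levenshtein edit distance via the standard dynamic-programming table.
function editDistance(a, b) {
  const d = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      d[i][j] = Math.min(
        d[i - 1][j] + 1,                                    // deletion
        d[i][j - 1] + 1,                                    // insertion
        d[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)   // substitution
      );
    }
  }
  return d[a.length][b.length];
}

// Suggest the closest known keyword, but only if it is plausibly a typo.
function didYouMean(word, vocabulary, maxDistance = 2) {
  let best = null, bestDist = maxDistance + 1;
  for (const candidate of vocabulary) {
    const dist = editDistance(word, candidate);
    if (dist < bestDist) { best = candidate; bestDist = dist; }
  }
  return best;
}

const keywords = ["Showing", "With", "Having", "Using"];
console.log(didYouMean("Shwoing", keywords)); // → "Showing"
```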

&lt;h4&gt;
  
  
  Testing Strategy
&lt;/h4&gt;

&lt;p&gt;Golden file testing ensures output stability:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Parser tests: Input → expected AST JSON&lt;/li&gt;
&lt;li&gt;Compiler tests: AST → expected HTML snapshots&lt;/li&gt;
&lt;li&gt;Integration tests: End-to-end script → HTML → Wisp rendering&lt;/li&gt;
&lt;li&gt;Regression tests: Real-world complex scripts&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  The Road Ahead
&lt;/h4&gt;

&lt;p&gt;Weave is just getting started. Here’s what’s coming:&lt;/p&gt;

&lt;h4&gt;
  
  
  Language Extensions:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Component definitions (Define a component called “Testimonial”)&lt;/li&gt;
&lt;li&gt;Data binding (Showing data from “testimonials.json”)&lt;/li&gt;
&lt;li&gt;Conditional rendering (If user.isAuthenticated show…)&lt;/li&gt;
&lt;li&gt;Internationalization (In English: … In French: …)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Ecosystem Growth:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;VS Code extension with Language Server Protocol support&lt;/li&gt;
&lt;li&gt;GitHub Actions for CI/CD integration&lt;/li&gt;
&lt;li&gt;Markdown/Word import-export&lt;/li&gt;
&lt;li&gt;Figma design-to-Weave conversion&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Research Directions:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Learnability studies with non-technical users&lt;/li&gt;
&lt;li&gt;Automated WCAG 2.2 compliance validation&lt;/li&gt;
&lt;li&gt;Semantic preservation across the Weave → HTML → Wisp pipeline&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Get Started
&lt;/h4&gt;

&lt;p&gt;🚀 Repository: &lt;a href="https://github.com/rotsl/Weave" rel="noopener noreferrer"&gt;github.com/rotsl/Weave&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;📦 npm Package: &lt;a href="https://www.npmjs.com/package/@rotsl/weave" rel="noopener noreferrer"&gt;@rotsl/weave&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🌐 Live Editor: &lt;a href="https://rotsl.github.io/Weave/" rel="noopener noreferrer"&gt;rotsl.github.io/Weave&lt;/a&gt; – try it in your browser&lt;/p&gt;

&lt;p&gt;📄 Documentation: Full syntax reference and examples in the repo&lt;/p&gt;

&lt;p&gt;🔗 Related: &lt;a href="https://github.com/rotsl/wisp" rel="noopener noreferrer"&gt;Wisp UI Engine&lt;/a&gt; (MIT Licensed) – the styling layer that completes the toolchain&lt;/p&gt;

&lt;p&gt;💾 Archived Version: &lt;a href="https://doi.org/10.5281/zenodo.18773305" rel="noopener noreferrer"&gt;DOI 10.5281/zenodo.18773305&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Why I Built This
&lt;/h4&gt;

&lt;p&gt;I believe the web should be writable by everyone, not just developers. We’ve made consuming content effortless, but creating it remains unnecessarily complex.&lt;/p&gt;

&lt;p&gt;Weave is my attempt to lower that barrier. It’s not about replacing developers. It’s about empowering domain experts to create structured, accessible, performant web content without learning markup syntax.&lt;/p&gt;

&lt;p&gt;If you can write a document outline, you can build a webpage.&lt;/p&gt;

&lt;p&gt;That’s the future I want to build toward.&lt;/p&gt;

&lt;p&gt;Questions? Thoughts? Open an issue on the repo. I’d love to hear how you’d use Weave in your workflow.&lt;/p&gt;

&lt;h4&gt;
  
  
  TL;DR
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Weave turns plain English into semantic HTML&lt;/li&gt;
&lt;li&gt;Works standalone or with Wisp for automatic styling&lt;/li&gt;
&lt;li&gt;Zero-config, deterministic, accessibility-first&lt;/li&gt;
&lt;li&gt;Use via npm (&lt;a href="https://www.npmjs.com/package/@rotsl/weave" rel="noopener noreferrer"&gt;@rotsl/weave&lt;/a&gt;) or the visual editor&lt;/li&gt;
&lt;li&gt;Perfect for content teams, rapid prototyping, and accessible web development&lt;/li&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/rotsl/Weave" rel="noopener noreferrer"&gt;github.com/rotsl/Weave&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>css</category>
      <category>webcompiler</category>
      <category>javascript</category>
      <category>html</category>
    </item>
  </channel>
</rss>
