I built Hypercontext because I got tired of agent frameworks that treat context like a static blob you shove into a prompt and hope for the best. Most tools out there assume context is something you pass. I wanted something that treats context as something you can inspect, compress, score, and rewrite while the agent is running.
Hypercontext is still in alpha.
This isn't about adding another layer of abstraction over OpenAI's API. It's about making agents aware of their own reasoning so they can fix it when it breaks.
What it actually does
Hypercontext is a self-referential agent framework for Python and TypeScript. The core idea is simple: agents should be able to read and modify their own system prompts, tool descriptions, memory, and runtime capabilities based on whether they're actually succeeding at the task.
The framework ships with:
- A Python SDK with orchestration, agents, scoring, memory, compression, deduplication, convergence detection, archive helpers, and extensions
- A TypeScript SDK for Node.js with the same primitives
- A modular extension system for adding capabilities without modifying the core runtime
- Built-in research tooling extensions for web search, retrieval, and evidence gathering
- A CLI for running compression, archive queries, provider discovery, orchestration, and extension workflows
- A curses-based terminal UI for browsing and pinning commands without leaving the shell
- A browser dashboard for visual inspection
- An MCP stdio daemon for Claude Desktop, Claude Code, and Codex integration
- An HTTP MCP server for web integrations
Both SDKs are zero-dependency where possible. The Python core is pure Python. The TypeScript SDK has minimal deps. You can run the whole thing against Ollama locally without touching a cloud provider.
The problem with most agent frameworks
I've used plenty of agent frameworks, and they all share the same blind spot: context is treated as immutable input. You construct a prompt, feed it to the model, get output back. If the output is wrong, you tweak the prompt and try again. The agent itself has no idea what worked and what didn't across runs.
Hypercontext changes this by making context a first-class citizen that agents can manipulate. Each generation gets tracked as a node in a lineage tree. You can see which parent led to which result, which branch is going stale, and which context configuration produced the best score. Successful strategies get archived and reused. Failed ones get pruned.
This isn't theoretical. The archive stores scored generations so later runs can compare branches and identify the strongest evolution path. Memory is split between persistent storage (lessons across runs) and episodic storage (context within a single session).
Hypercontext also treats tooling as adaptive context. Extensions can expose capabilities dynamically during execution, allowing an agent to decide when external retrieval, research, or analysis should become part of its reasoning loop.
How the context loop works
Here's the basic flow:
- The agent receives a task and its current context window
- It generates a response and scores the result against a fitness function
- If the score is below threshold, the agent reflects on what went wrong
- It rewrites its own system prompt, tool descriptions, memory, or extension configuration based on that reflection
- The new context configuration gets tested in the next generation
- Successful configurations get archived; failed ones get discarded
This happens automatically in the TaskAgent and MetaAgent classes. You don't need to hand-code the reflection logic unless you want to.
The MetaAgent goes further. It can perform repository-aware tool use, extension orchestration, and self-modification workflows. If you point it at a codebase with --workdir, it can inspect files, suggest modifications, and track whether those modifications improved the code quality.
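The loop above can be sketched as a toy simulation. Everything below is a stand-in I wrote for illustration, not the real Hypercontext API: generation, fitness, and reflection are reduced to simple word matching so the generate/score/reflect/rewrite cycle is visible end to end.

```python
def generate(task, context):
    # Stand-in for a model call: return the context entries relevant to the task
    return [w for w in context if w in task]

def fitness(task, output):
    # Score = fraction of task words covered by the output
    words = task.split()
    return sum(1 for w in words if w in output) / len(words)

def reflect(task, output):
    # "Reflection": identify which task words the output missed
    return [w for w in task.split() if w not in output]

def rewrite(context, missing):
    # Self-modification: fold the missing terms back into the context, deduplicated
    return list(dict.fromkeys(context + missing))

def evolve(task, context, max_generations=5, threshold=1.0):
    archive = []
    for _ in range(max_generations):
        output = generate(task, context)
        score = fitness(task, output)
        if score >= threshold:
            archive.append((list(context), score))  # successful config is kept
            break
        context = rewrite(context, reflect(task, output))
    return context, archive

ctx, archive = evolve("summarize the report", ["summarize"])
```

In the real framework the generate and reflect steps are model calls and the rewrite step can touch system prompts and tool descriptions, but the control flow is the same shape.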
Extensions system
One of the biggest additions to Hypercontext is the extension architecture.
Extensions allow you to add runtime capabilities without bloating the core framework. Instead of hardcoding every tool into the agent runtime, Hypercontext lets you compose functionality based on the task.
Extensions can provide:
- Tool registries
- Retrieval pipelines
- Research workflows
- Context enrichers
- External APIs
- Scoring hooks
- Middleware layers
- Memory augmentation
- Custom orchestration behaviors
This keeps the framework modular while still allowing deeply integrated workflows.
Extensions are designed to be lightweight and composable. You can enable only what your workflow requires.
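A minimal sketch of what such a hook-based extension system can look like. The class and hook names here (`on_context`, `use`) are my own invention for illustration; Hypercontext's actual extension interface may differ.

```python
class Extension:
    """Hypothetical base class: each hook is a no-op unless overridden."""
    name = "base"

    def on_context(self, context):
        # Called before generation; may enrich or rewrite the context
        return context

    def on_score(self, score, output):
        # Called after scoring; may adjust the fitness value
        return score

class Runtime:
    def __init__(self):
        self.extensions = []

    def use(self, ext):
        self.extensions.append(ext)
        return self

    def run(self, context):
        # Each extension sees the output of the previous one
        for ext in self.extensions:
            context = ext.on_context(context)
        return context

class Enricher(Extension):
    name = "enricher"

    def on_context(self, context):
        return context + ["[enriched]"]

rt = Runtime()
rt.use(Enricher())
result = rt.run(["task context"])
```

The key design property is that extensions compose in registration order, so capabilities stack without the core runtime knowing about any of them.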
Research tools extension
The research_tools extension adds structured information gathering to Hypercontext.
Instead of forcing agents to rely purely on static context or hallucinated recall, the extension provides a research layer that can gather, refine, and reuse evidence during execution.
The extension includes capabilities for:
- Query expansion
- Iterative research loops
- Evidence collection
- Citation-aware retrieval
- Multi-step search refinement
- Context injection from retrieved sources
- Search result scoring and filtering
- Research memory persistence across generations
This matters because long-running reasoning tasks often fail not from poor prompting but from incomplete information. Research tooling allows the agent to actively reduce uncertainty.
The extension integrates directly into the context evolution loop, meaning research results become part of the lineage and scoring process rather than existing as disconnected tool outputs.
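To make a couple of those capabilities concrete, here is a toy sketch of query expansion and result scoring, two of the simpler pieces. The synonym table and functions are illustrative stand-ins, not the extension's real implementation.

```python
# Toy synonym table; a real research layer would derive expansions from the model
SYNONYMS = {"error": ["exception", "failure"], "fix": ["patch", "repair"]}

def expand_query(query):
    # Query expansion: original terms first, then their known synonyms
    terms = query.split()
    expanded = list(terms)
    for t in terms:
        expanded.extend(SYNONYMS.get(t, []))
    return expanded

def score_result(result_text, terms):
    # Search result scoring: fraction of expanded terms that appear in the result
    hits = sum(1 for t in terms if t in result_text)
    return hits / len(terms)

expanded = expand_query("fix error")
score = score_result("a patch that handles the exception", expanded)
```

Results scored this way can then be filtered by threshold before being injected into the context, which is the "scoring and filtering" step in the list above.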
Installation and setup
For Python:
pip install hypercontext
For Node.js:
npm install hypercontext-node-sdk
That's it. No separate MCP package to install. No complex dependency tree.
The Python package includes:
- Core SDK
- CLI
- TUI
- Browser dashboard launcher
- MCP stdio daemon
- HTTP server
- Built-in extensions
The npm package is the SDK layer for Node.js projects.
Provider setup
Hypercontext doesn't lock you into a provider. It supports Claude, OpenAI, Ollama, OpenAI-compatible servers, and local transformers models. You set credentials via environment variables or a YAML config file with named presets.
For Claude:
HYPERCONTEXT_PROVIDER=anthropic
HYPERCONTEXT_MODEL=claude-sonnet-4-20250514
ANTHROPIC_API_KEY=your-key-here
For Ollama (fully local):
ollama serve
ollama pull llama3
HYPERCONTEXT_PROVIDER=ollama
HYPERCONTEXT_MODEL=llama3
OLLAMA_BASE_URL=http://localhost:11434
The named preset feature is useful when you want multiple backends in one project. You define them in a YAML file and resolve by name at runtime. The framework expands ${VAR} values from the environment, so secrets stay out of config files.
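The `${VAR}` expansion described above can be sketched in a few lines. This is my own minimal reimplementation of the idea, not Hypercontext's loader: unknown variables are left intact rather than erased, so a missing secret fails loudly downstream.

```python
import os
import re

def expand_env(value):
    # Replace ${VAR} with the environment value; leave unknown ${VAR} untouched
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: os.environ.get(m.group(1), m.group(0)),
        value,
    )

# A preset as it might appear in the YAML file, with the secret kept out of it
preset = {
    "provider": "anthropic",
    "model": "claude-sonnet-4-20250514",
    "api_key": "${ANTHROPIC_API_KEY}",
}

os.environ["ANTHROPIC_API_KEY"] = "sk-test"
resolved = {k: expand_env(v) for k, v in preset.items()}
```

This is why the config file can be committed safely: only the reference to the variable lives on disk, and resolution happens at runtime.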
Using extensions in Python
Extensions are loaded directly into the runtime.
from hypercontext import HyperContext
from hypercontext.extensions import ResearchToolsExtension
hc = HyperContext(output_dir="./hypercontext_output")
hc.use(ResearchToolsExtension())
summary = hc.run(max_generations=3)
print(summary)
You can compose multiple extensions together depending on the workflow.
hc.use(ResearchToolsExtension())
hc.use(CustomMemoryExtension())
hc.use(CustomScoringExtension())
Extensions participate in orchestration rather than existing as isolated plugins.
Using it in Python
Direct orchestration is straightforward:
from hypercontext import HyperContext
hc = HyperContext(output_dir="./hypercontext_output")
summary = hc.run(max_generations=3)
print(summary)
If you want provider-backed calls without the full orchestration loop:
from hypercontext import LLMClient
from hypercontext.providers import ProviderRegistry
registry = ProviderRegistry.instance()
provider = registry.create(
"anthropic",
model="claude-sonnet-4-20250514",
api_key="your-key-here",
base_url="https://api.anthropic.com",
)
client = LLMClient(provider=provider)
text, history, metadata = client.complete("Summarize this in one sentence.")
For agent workflows, you choose between TaskAgent (repeatable tasks) and MetaAgent (repository-aware reasoning and self-modification). Both support context evolution and extension-aware execution.
Using it in TypeScript
The Node SDK follows the same patterns:
import {
ContextWindow,
TaskAgent,
StructuredOutputParser,
EnhancedToolRegistry,
LoggingMiddleware,
} from "hypercontext-node-sdk";
const window = new ContextWindow(4096);
window.add("Important context", 1.0, "system");
const agent = new TaskAgent({
name: "demo",
maxTokens: 1024,
});
const result = agent.forward({
query: "hello",
});
const parser = new StructuredOutputParser();
console.log(
parser.parseFirst('Answer: {"status":"ok"}')
);
const registry = new EnhancedToolRegistry();
registry.use(new LoggingMiddleware());
registry.registerTool(
{
name: "echo",
description: "Echo a payload back",
parameters: { type: "object" },
},
async (args) => args,
);
The TypeScript SDK includes context compression, retrieval, lineage tracking, persistent memory, fitness evaluation, structured output parsing, and extension support.
CLI and terminal UI
The Python package includes a full CLI:
python -m hypercontext version
python -m hypercontext providers
python -m hypercontext run --generations 5 --output-dir ./runs/demo --workdir .
python -m hypercontext compress --input long_text.txt --ratio 0.4
python -m hypercontext archive --list
python -m hypercontext extensions --list
The TUI is a curses dashboard for browsing commands, pinning favorites, and executing them without leaving the terminal.
python -m hypercontext tui --workdir /path/to/project
For desktop assistants, the stdio MCP daemon handles Claude Desktop, Claude Code, and Codex:
python -m hypercontext mcp --workdir /path/to/project
For browser integrations, the HTTP server exposes the same tools over a REST interface:
python -m hypercontext serve --port 8080 --workdir /path/to/project
MCP integration without the hassle
Most MCP implementations require you to install a separate package and configure JSON files.
Hypercontext bundles the stdio daemon and HTTP server directly.
You don't need to install additional MCP dependencies.
The stdio daemon speaks the Model Context Protocol natively. Claude Desktop can discover and invoke Hypercontext tools without manual configuration. The HTTP server provides the same capability for browser and web integrations.
Extensions can also expose MCP-accessible tools, making them available to external clients automatically.
Context compression and deduplication
One of the practical problems with long-running agents is context bloat.
Hypercontext includes a ContextCompressor that reduces text size while preserving semantic meaning.
There's also a validator that checks compression fidelity so you don't accidentally drop important information.
The deduplication layer identifies repeated patterns across generations and collapses them.
This matters when you're running evolutionary loops where similar context configurations get tested repeatedly.
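The deduplication idea can be sketched with normalized fingerprinting. This is a simplified stand-in for whatever Hypercontext actually does internally: snippets that differ only in whitespace or case collapse to the same hash, and only the first occurrence survives.

```python
import hashlib

def fingerprint(snippet):
    # Normalize whitespace and case so near-identical snippets hash identically
    norm = " ".join(snippet.lower().split())
    return hashlib.sha256(norm.encode()).hexdigest()

def deduplicate(snippets):
    seen, unique = set(), []
    for s in snippets:
        fp = fingerprint(s)
        if fp not in seen:
            seen.add(fp)
            unique.append(s)  # keep the first occurrence of each pattern
    return unique

history = ["Use the API key", "use  the API  key", "Retry on failure"]
deduped = deduplicate(history)
```

A semantic deduplicator would go further (embedding similarity rather than exact hashes), but even this textual version removes most of the bloat that evolutionary loops generate.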
Lineage tracking
Every generation gets a unique ID and tracks its parent.
You can query the lineage tree to answer questions like:
- Which generation produced the best score?
- Which parent led to this result?
- Which branch hasn't improved in the last 10 generations?
This isn't just logging.
The lineage data feeds back into the parent selection strategy for the next generation.
Stagnant branches get deprioritized. High-fitness branches get explored further.
Research extension outputs can also become lineage artifacts, meaning evidence chains are tracked alongside prompt evolution.
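The lineage queries listed above reduce to walks over a parent-linked tree. Here is a toy version with my own data shape (the real node records carry far more metadata): each node stores its parent ID and score, and stagnation means the recent generations on a branch never beat an ancestor.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    id: int
    parent: Optional[int]
    score: float

def best(nodes):
    # Which generation produced the best score?
    return max(nodes, key=lambda n: n.score)

def branch(nodes, leaf_id):
    # Walk parent links from a leaf back to the root (leaf first)
    by_id = {n.id: n for n in nodes}
    path, cur = [], by_id[leaf_id]
    while cur is not None:
        path.append(cur)
        cur = by_id.get(cur.parent) if cur.parent is not None else None
    return path

def is_stagnant(nodes, leaf_id, window=3):
    # True if the last `window` generations never beat an earlier ancestor
    path = branch(nodes, leaf_id)
    recent, older = path[:window], path[window:]
    best_old = max((n.score for n in older), default=0.0)
    return all(n.score <= best_old for n in recent)

tree = [
    Node(0, None, 0.4),
    Node(1, 0, 0.7),
    Node(2, 1, 0.6),
    Node(3, 2, 0.65),
    Node(4, 3, 0.62),
]
```

Feeding `is_stagnant` back into parent selection is what lets the framework deprioritize dead branches instead of just logging them.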
Archive and transfer learning
The archive stores proven context configurations ranked by fitness score.
When you start a new task, the framework can query the archive for context patterns that worked well on similar tasks.
This is transfer learning without neural network retraining.
You're transferring context strategies instead of model weights.
The archive is queryable via CLI:
python -m hypercontext archive --query "task:code-review fitness:>0.8"
Archived runs can include extension-derived context, allowing successful research workflows to be reused across future tasks.
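One plausible reading of that query syntax is key/value filters with comparison prefixes. The parser below is my own guess at the semantics, written against a mock in-memory archive, and should not be taken as Hypercontext's actual query grammar.

```python
def parse_query(q):
    # "task:code-review fitness:>0.8" -> {"task": "code-review", "fitness": ">0.8"}
    filters = {}
    for part in q.split():
        key, _, val = part.partition(":")
        filters[key] = val
    return filters

def matches(entry, filters):
    for key, val in filters.items():
        if val.startswith(">"):
            # Numeric comparison filter
            if not entry.get(key, 0) > float(val[1:]):
                return False
        elif entry.get(key) != val:
            # Exact-match filter
            return False
    return True

# Mock archive entries; real entries would carry full context configurations
archive = [
    {"task": "code-review", "fitness": 0.85},
    {"task": "code-review", "fitness": 0.60},
    {"task": "summarize", "fitness": 0.90},
]
hits = [e for e in archive if matches(e, parse_query("task:code-review fitness:>0.8"))]
```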
What I learned building this
I started this project after reading the Hyperagents paper and getting frustrated that none of the existing frameworks implemented the meta-cognitive ideas in a practical way.
Most research code is a mess of Jupyter notebooks and hardcoded paths.
I wanted something you could actually install and use.
The hardest part wasn't compression or lineage tracking.
It was designing the agent loop so self-modification doesn't spiral into chaos.
If an agent can rewrite its own system prompt, it can also break its own system prompt.
The convergence detection layer stops the loop when scores plateau or when context configurations start cycling.
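Both stopping conditions, plateau and cycling, are cheap to check. This is a minimal sketch of the idea under my own thresholds, not the framework's actual heuristics:

```python
def has_converged(scores, configs, window=4, eps=0.01):
    """Stop when recent scores plateau or a context configuration repeats."""
    if len(scores) >= window:
        recent = scores[-window:]
        if max(recent) - min(recent) < eps:  # plateau: no meaningful improvement
            return True
    if len(configs) != len(set(configs)):    # cycle: same configuration seen twice
        return True
    return False
```

In practice the config fingerprints would be hashes of the full context configuration, so a self-modifying agent that rewrites its prompt back to a previous state gets caught immediately.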
I also learned that modularity matters more than feature count.
Extensions let Hypercontext grow without turning the framework into an unmaintainable monolith.
Current state and what's next
The framework is functional and I'm using it in my own projects.
The Python package is on PyPI, the TypeScript SDK is on npm, and the docs are on GitHub Pages.
I'm currently working on:
- Better convergence heuristics for multi-objective optimization
- A web-based lineage visualizer
- Additional extension categories
- Improved research pipelines
- Benchmark suites to compare context strategies across tasks
- Better local model workflows
- More retrieval-aware orchestration patterns
The repo includes runnable examples for evolution, lineage tracking, self-modifying agents, extensions, provider workflows, and research tooling.
If you want to see what the framework can do without writing code, start with:
examples/python/feature_gallery.py
Try it
# Python
pip install hypercontext
python -m hypercontext version
# TypeScript
npm install hypercontext-node-sdk