The Problem Nobody Talks About
When you ask Cursor to "fix the login bug in my app," here's what actually happens:
- Your query gets embedded into a vector
- The embedding is compared to every file in your codebase (cosine similarity)
- The top 5-10 most similar files are stuffed into the context window
- Everything else is invisible
Your AI has no idea about your database schema, your configuration, your test patterns, your middleware. It's working blind on 95% of your codebase.
The Information-Theoretic Solution
We built Entroly — a context engineering engine that approaches this as an optimization problem, not a search problem.
Instead of "find the most similar files," we ask: "What's the mathematically optimal set of fragments to include in the context, given a token budget?"
Step 1: Score Every Fragment
Every piece of code gets scored by Shannon entropy — measuring information density:
H(X) = -Σ p(xᵢ) · log₂(p(xᵢ))
High-entropy code (complex logic, unique algorithms) scores high. Low-entropy code (boilerplate, imports, comments) scores low.
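As a minimal sketch, here is character-level Shannon entropy over a code fragment (Entroly's actual scoring granularity, e.g. token-level vs. character-level, is an implementation detail not specified here):

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """H(X) = -sum p(x) * log2 p(x), over character frequencies."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Repetitive boilerplate carries less information than dense, varied logic:
shannon_entropy("x = 1\nx = 1\nx = 1\n")   # low
shannon_entropy("def f(a,b): return a^b")  # higher
```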
We also measure:
- Recency: was this file recently modified?
- Frequency: is this file frequently accessed?
- Semantic relevance: how related to the current query?
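These four signals can be combined into a single fragment score. The weights and decay constants below are hypothetical placeholders, not Entroly's actual values (the learned "4D weight vector" is covered in Step 5):

```python
import math
import time

def fragment_score(entropy: float, last_modified: float, access_count: int,
                   similarity: float, w=(0.4, 0.2, 0.1, 0.3)) -> float:
    """Weighted sum of the four signals. `w` is a hypothetical 4D weight vector."""
    recency = math.exp(-(time.time() - last_modified) / 86400)  # 1-day decay
    frequency = math.log1p(access_count)                        # diminishing returns
    return w[0]*entropy + w[1]*recency + w[2]*frequency + w[3]*similarity
```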
Step 2: Build the Dependency Graph
Code is not independent. auth.py depends on auth_config.py. Your API routes call functions defined in models.py.
Entroly automatically extracts:
- Import relationships
- Function call chains
- Type references
- Module dependencies
When a fragment is selected, its dependencies get a relevance boost. This is the graph-constrained knapsack — NP-hard in general, but tractable for typical code graphs.
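The import-relationship part of this extraction can be sketched with Python's `ast` module; the `boost` value and the flat-module-layout assumption are illustrative, not Entroly's internals:

```python
import ast
from collections import defaultdict

def import_graph(sources: dict) -> dict:
    """Map each module to the in-repo modules it imports (flat layout assumed)."""
    graph = defaultdict(set)
    modules = set(sources)
    for name, code in sources.items():
        for node in ast.walk(ast.parse(code)):
            if isinstance(node, ast.Import):
                graph[name] |= {a.name for a in node.names if a.name in modules}
            elif isinstance(node, ast.ImportFrom) and node.module in modules:
                graph[name].add(node.module)
    return dict(graph)

def boost_dependencies(scores: dict, graph: dict, selected, boost=0.2) -> dict:
    """When a fragment is selected, raise the score of its direct dependencies."""
    out = dict(scores)
    for frag in selected:
        for dep in graph.get(frag, ()):
            out[dep] = out.get(dep, 0.0) + boost
    return out
```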
Step 3: Solve the Optimization Problem
This is where it gets mathematically interesting.
We use KKT bisection to find the exact Lagrange multiplier for the token-budget constraint:
f(θ) = Σᵢ σ((sᵢ − θ) / τ) · tokensᵢ − B = 0
30 steps of bisection give us θ* — the exact dual variable. Then we greedily fill the hard budget.
The beautiful part: the same σ(·/τ) appears in the REINFORCE backward pass. Zero train/test mismatch.
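The bisection above can be sketched directly from the equation: f(θ) is monotonically decreasing in θ, so we bracket the root and halve the interval. The bracket width (±10τ past the score range) is an assumption for the sketch:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def solve_threshold(scores, tokens, budget, tau=0.1, steps=30):
    """Bisect for theta where the soft token spend equals the budget B."""
    def spend(theta):
        return sum(sigmoid((s - theta) / tau) * t for s, t in zip(scores, tokens))
    lo, hi = min(scores) - 10 * tau, max(scores) + 10 * tau
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if spend(mid) > budget:
            lo = mid  # spending too much: raise the threshold
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

After 30 halvings the bracket has shrunk by a factor of 2³⁰, so θ* is exact to within floating-point noise.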
Step 4: Compress at Three Levels
Not every file needs full source code:
- L1 (5% budget): Skeleton map — auth.py → AuthService, login(), verify_token()
- L2 (25% budget): Expanded signatures for dependency-connected files
- L3 (70% budget): Full source code for the most relevant fragments
Your AI sees ALL 500 files. The important ones in detail. The rest in summary.
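The L1 skeleton map can be sketched with `ast` — a file collapsed to just its class and function names (how Entroly renders skeletons internally is not specified here; this is one plausible shape):

```python
import ast

def skeleton(source: str) -> set:
    """L1 view: a file reduced to its class and function names."""
    tree = ast.parse(source)
    return {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef))
    }
```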
Step 5: Learn From Outcomes
After the AI generates a response, Entroly scores how well the context worked:
- Counterfactual Shapley credit: How much did each fragment contribute?
- Spectral natural gradient update: Adjust the 4D weight vector using Jacobi eigendecomposition of the gradient covariance
- TD(λ) eligibility traces: Credit cascades across a 3-request window
Over time, your context selection gets better without any manual tuning.
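The eligibility-trace part of this loop can be sketched as a decayed-trace weight update; the Shapley credit and spectral natural-gradient machinery are far more involved, and λ and the learning rate here are placeholder values, not Entroly's:

```python
def update_weights(w, grad, traces, reward, lam=0.8, lr=0.01):
    """TD(lambda)-style step: decay the traces, fold in the new gradient,
    then move the 4D weight vector in proportion to reward * trace."""
    new_traces = [lam * e + g for e, g in zip(traces, grad)]
    new_w = [wi + lr * reward * e for wi, e in zip(w, new_traces)]
    return new_w, new_traces
```

Because traces decay by λ per request, credit from a good outcome cascades back across the last few requests — the 3-request window mentioned above.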
The Numbers
- 78% fewer tokens per request
- <10ms overhead (Rust engine)
- 304 unit tests in Rust, 100+ in Python
- 24 Rust modules, ~850KB of optimized code
- Works with any OpenAI-compatible API
Try It
```shell
pip install entroly
entroly go
```
GitHub: github.com/juyterman1000/entroly
MIT licensed. PRs welcome.