Hi everyone,
I've been hacking on a personal, local project called Hillock. It's very much a work in progress and nothing fancy, but I wanted to see if we could build a lightweight, offline memory layer for local LLMs without the overhead of running a heavy neural vector database or spending precious VRAM.
It is named after the biological Axon Hillock, the region of a human neuron that sums up incoming electrical charges and decides whether to fire (open the gate) or remain silent (block).
How the stack works:
- Hard Facts (SQLite): Stores raw facts as simple database triples (Subject-Predicate-Object) so the system has a solid symbolic foundation.
- Synapses (Hebbian Plasticity): Tracks which concepts co-occur during a conversation to dynamically build gradient-free associative weights.
- Context (Hyperdimensional Computing): Maintains a 10,000-dimensional leaky context vector that rolls, binds, and accumulates history. This helps the system resolve pronouns (like "he/she") and decide when to block a query to prevent hallucinations.
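To make the three layers concrete, here is a minimal, stdlib-only sketch of how they could fit together. It is not Hillock's actual code: the table layout, the function names (`add_fact`, `hebbian_update`, `update_context`, `resolve`), the decay factor, and the gate threshold are all assumptions made for illustration. Only the 10,000 dimensions and the roll/bind/accumulate idea come from the description above.

```python
import random
import sqlite3
from collections import defaultdict
from itertools import combinations

DIM = 10_000     # dimensionality quoted in the post
DECAY = 0.8      # assumed leak factor (hypothetical, not from Hillock)
THRESHOLD = 0.2  # assumed gate threshold (hypothetical)

# Hard facts: subject-predicate-object triples in SQLite.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE facts (subject TEXT, predicate TEXT, object TEXT, "
    "UNIQUE (subject, predicate, object))"
)

def add_fact(subject, predicate, obj):
    db.execute("INSERT OR IGNORE INTO facts VALUES (?, ?, ?)",
               (subject, predicate, obj))

def lookup(subject, predicate):
    rows = db.execute(
        "SELECT object FROM facts WHERE subject = ? AND predicate = ?",
        (subject, predicate),
    )
    return [row[0] for row in rows]

# Synapses: Hebbian co-occurrence weights, no gradients involved.
synapses = defaultdict(float)

def hebbian_update(concepts, rate=0.1):
    """Strengthen the link between every pair of concepts seen together."""
    for a, b in combinations(sorted(set(concepts)), 2):
        synapses[(a, b)] += rate

# Context: a leaky hypervector built from roll, bind, and accumulate.
_item_memory = {}

def hv(symbol):
    """Fixed random bipolar (+1/-1) hypervector for a symbol."""
    if symbol not in _item_memory:
        rng = random.Random(symbol)  # seeded by name for reproducibility
        _item_memory[symbol] = [rng.choice((-1, 1)) for _ in range(DIM)]
    return _item_memory[symbol]

def bind(a, b):
    """Element-wise multiply; binding with the same vector again unbinds."""
    return [x * y for x, y in zip(a, b)]

def roll(v, k=1):
    """Cyclically shift a vector by k positions (negative k shifts back)."""
    k %= len(v)
    return v[-k:] + v[:-k]

def update_context(context, role, filler):
    """Shift older history one step, decay it, then add the new binding."""
    event = bind(hv(role), hv(filler))
    return [DECAY * c + e for c, e in zip(roll(context), event)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
    return dot / norm if norm else 0.0

def resolve(context, role, candidates, steps_back=0):
    """Unbind a role from the context; return None (block) if nothing matches."""
    probe = bind(roll(context, -steps_back), hv(role))
    best = max(candidates, key=lambda name: cosine(probe, hv(name)))
    return best if cosine(probe, hv(best)) >= THRESHOLD else None

add_fact("Marie_Curie", "born_in", "Poland")
hebbian_update(["Marie_Curie", "Poland", "radium"])

context = [0.0] * DIM
context = update_context(context, "subject", "Marie_Curie")
context = update_context(context, "topic", "radium")

people = ["Marie_Curie", "Albert_Einstein"]
referent = resolve(context, "subject", people, steps_back=1)  # who is "she"?
print(referent, lookup(referent, "born_in") if referent else "blocked")
```

In this sketch a pronoun is resolved by rolling the context back one step and unbinding the `subject` role. Asking for a role that was never stored produces a probe that looks like noise, so `resolve` returns `None`, which is the "block" side of the gate.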
The "Smarter Model, Lower Score" Paradox
I wrote a tough, 30-sentence scientific benchmark with complex sentence structures and hard negatives to see where this breaks on local hardware.
When I ran Qwen 2 (1.5B), it got around 50.0% Retrieval Accuracy. But when I upgraded to the much smarter Qwen 3 (5.2GB), its score actually dropped to 15.0%!
Why? Because Qwen 3 is too expressive for my rigid evaluation script:
- The test expected Marie_Curie born_in Poland. Qwen 3 extracted [Marie_Curie] -[spent_childhood_in]-> [Poland].
- The test expected Albert_Einstein. Qwen 3 extracted [albert_einstein] (lowercase), which broke the exact-string checks.
- The test expected compiler. Qwen 3 extracted [first_compiler].
So, while Qwen 3 populated the database with beautiful, highly accurate, and conversational triples, it got penalized by the rigid evaluation harness.
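One way to separate extraction quality from string formatting is to grade each triple at more than one strictness level and report both scores. The following is a rough sketch and not the current harness: `normalize`, `entity_match`, `predicate_match`, `grade`, and the `PREDICATE_ALIASES` table are hypothetical names, and treating spent_childhood_in as equivalent to born_in is a judgment call that a stricter benchmark might reject.

```python
import re

# Hypothetical alias table: extracted predicate -> expected predicate.
# Which predicates count as equivalent is an assumption, not a fact.
PREDICATE_ALIASES = {"spent_childhood_in": "born_in"}

def normalize(term):
    """Lowercase a term and reduce it to underscore-separated tokens."""
    return "_".join(re.findall(r"[a-z0-9]+", term.lower()))

def tokens(term):
    return set(normalize(term).split("_"))

def entity_match(expected, extracted):
    """True if every expected token appears in the extracted entity."""
    return tokens(expected) <= tokens(extracted)

def predicate_match(expected, extracted):
    e, x = normalize(expected), normalize(extracted)
    return e == x or PREDICATE_ALIASES.get(x) == e

def grade(expected, extracted):
    """Return 'exact', 'lenient', or 'miss' for two (s, p, o) triples."""
    if expected == extracted:
        return "exact"
    if (entity_match(expected[0], extracted[0])
            and predicate_match(expected[1], extracted[1])
            and entity_match(expected[2], extracted[2])):
        return "lenient"
    return "miss"

print(grade(("Marie_Curie", "born_in", "Poland"),
            ("Marie_Curie", "spent_childhood_in", "Poland")))  # lenient
print(entity_match("Albert_Einstein", "[albert_einstein]"))    # True
print(entity_match("compiler", "[first_compiler]"))            # True
```

Reporting the exact and lenient rates side by side keeps the strict number honest while showing how much of the gap is casing, brackets, and paraphrased predicates.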
The codebase is written in pure Python, is fully open-source (under the AGPL-3.0 copyleft license), and is designed to run entirely offline on consumer hardware.
If you're interested in VSAs (vector symbolic architectures) or alternative cognitive architectures, or have feedback on the HDC context-binding math, I'd love for you to check it out!
GitHub Repository: https://github.com/roandejager/Hillock