DEV Community

Zoheb Malik


We built a new AI Topology to bypass the Transformer bottleneck. Here are our first benchmark results.

If you’ve been following the AI space, you know we are hitting a physical compute ceiling. Standard autoregressive LLMs (like GPT or Claude) are incredible, but under the hood they are essentially performing highly educated linear guessing, predicting one token at a time from everything that came before. They require massive, power-hungry data centers just to calculate the next token.

At Trijna Labs, our engineering team decided to stop optimizing transformers and instead build a completely new neural architecture from the ground up.

We wanted to see if we could build continuous-learning neural topologies that use topological entropy routing. In essence, the model dynamically calculates the complexity of each query and spins up only the weights it needs, drastically reducing GPU overhead while preserving reasoning quality.
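To make the routing idea concrete, here is a minimal sketch of complexity-gated activation. It is an illustration under assumptions, not our engine code: it uses plain Shannon entropy of a probability distribution as a stand-in complexity signal (topological entropy is a different quantity), and the names `shannon_entropy`, `blocks_to_activate`, and `total_blocks` are hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def blocks_to_activate(probs, total_blocks):
    """Map normalized entropy (0 = certain, 1 = uniform) to a block count.

    Assumption: higher entropy means a "harder" query that needs more weights.
    Always activates at least one block and never more than total_blocks.
    """
    if len(probs) < 2:
        return 1
    max_entropy = math.log2(len(probs))
    normalized = shannon_entropy(probs) / max_entropy
    wanted = math.ceil(normalized * total_blocks)
    return max(1, min(total_blocks, wanted))


if __name__ == "__main__":
    easy_query = [1.0, 0.0, 0.0, 0.0]      # model is certain
    hard_query = [0.25, 0.25, 0.25, 0.25]  # model is maximally unsure
    print(blocks_to_activate(easy_query, total_blocks=8))
    print(blocks_to_activate(hard_query, total_blocks=8))
```

In this toy version a fully confident distribution activates a single block and a uniform one activates all eight; a real router would need a learned or calibrated complexity signal rather than this fixed mapping.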

We call our primary topologies the ARS Engine (Algorithmic Resonance Sequence) and the OSM Engine (Operational Structural Matrix).

After months of mathematical dead-ends and late-night debugging, we finally got the engines stable enough to run through the official EleutherAI lm_eval harness. We decided to test them on GSM8K (for raw math) and the LiveBench framework (for abstract reasoning).

Honestly, we were nervous to see how a custom architecture would hold up against the massive parameter counts of standard LLMs. But the numbers came back, and they kind of blew our minds.

📊 The Benchmark Results

  1. LiveBench (Overall Intelligence & Reasoning): We pushed our ARS Engine through LiveBench, and it achieved an 87.5 overall average. The most shocking part was its abstract reasoning score, which hit 93.9. For context, on pure logic and spatial tasks this puts it above the GPT-4o and Claude 3.5 Sonnet baselines. Because the ARS topology uses geometric spatial routing rather than linear guessing, it practically eliminates standard spatial hallucinations.

  2. GSM8K (Math Word Problems): We ran our OSM Engine (which is tuned specifically for matrix stabilization) through the GSM8K math benchmark using 5-shot exact match. It peaked at 85.06%, which we take as evidence that non-transformer continuous-learning models can handle complex, multi-step math without memory degradation.
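For readers unfamiliar with the metric, exact match on GSM8K means the model's final numeric answer must equal the reference answer, which in the dataset follows a `####` marker at the end of each solution. The sketch below shows the idea in simplified form; the helper names are hypothetical, and the harness's own answer-extraction filters may differ from this regex.

```python
import re


def extract_final_number(text):
    """Return the last number in a model response as a string, or None."""
    matches = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    if not matches:
        return None
    # Drop thousands separators so "1,000" compares equal to "1000".
    return matches[-1].replace(",", "")


def gold_answer(reference):
    """GSM8K reference solutions end with '#### <answer>'."""
    return reference.split("####")[-1].strip().replace(",", "")


def exact_match_accuracy(predictions, references):
    """Percentage of responses whose final number equals the gold answer."""
    correct = sum(
        extract_final_number(p) == gold_answer(r)
        for p, r in zip(predictions, references)
    )
    return 100.0 * correct / len(references)


if __name__ == "__main__":
    preds = ["She sells 9 eggs a day, so she makes 18 dollars. The answer is 18."]
    refs = ["9 * 2 = 18 dollars per day. #### 18"]
    print(exact_match_accuracy(preds, refs))
```

As a sanity check on the headline number: assuming the run covered the full 1,319-problem GSM8K test split (the sample count is not stated above), 85.06% corresponds to 1,122 correct answers, since 1122 / 1319 ≈ 0.8506.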

🛠️ How We Did It
Building this wasn't easy. A huge challenge was preventing catastrophic forgetting during continuous training (since we aren't just doing massive pre-training runs). We solved this using a Riemannian Metric Constraint to "freeze" vital parameters based on their importance, geometrically preserving established memory pathways.
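The exact form of the constraint is in the dev log rather than this post, so the sketch below swaps in a well-known relative: an elastic weight consolidation (EWC) style penalty, where a per-parameter importance score (in EWC, the diagonal of the Fisher information, which acts as a metric on parameter space) resists movement away from previously learned values. Treat it as an illustration of importance-based "freezing" in general; the function names, learning rate, and importance values are all assumptions.

```python
def importance_penalty(params, anchor, importance, strength=1.0):
    """Quadratic penalty: (strength / 2) * sum_i F_i * (theta_i - anchor_i)^2."""
    return 0.5 * strength * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor, importance)
    )


def constrained_step(params, grads, anchor, importance, lr=0.01, strength=1.0):
    """One gradient step on (new-task loss + importance penalty).

    grads is the gradient of the new-task loss; the penalty contributes
    strength * F_i * (theta_i - anchor_i), pulling important weights back.
    """
    return [
        p - lr * (g + strength * f * (p - a))
        for p, g, a, f in zip(params, grads, anchor, importance)
    ]


if __name__ == "__main__":
    anchor = [1.0, 1.0]        # weights after the earlier task
    importance = [100.0, 0.0]  # first weight is vital, second is free
    params = list(anchor)
    for _ in range(50):
        # Pretend the new task pushes both weights down with gradient 1.0.
        params = constrained_step(params, [1.0, 1.0], anchor, importance)
    print(params)
```

With these toy values the high-importance weight settles about 0.01 away from its anchor, while the zero-importance weight drifts freely to roughly 0.5. That is the "soft freeze" behavior: important parameters are not hard-locked, but moving them becomes expensive.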

🤝 We'd Love Your Feedback
We know we still have a massive mountain to climb to scale these topologies globally, but seeing a non-transformer architecture hit these numbers on local, highly constrained hardware feels like a huge validation of our physics-based approach.

If you are an AI researcher, mathematician, or just a massive nerd for neural architecture, we've uploaded our full methodology, exact dataset hashes, and reproducibility commands to our dev log.

You can read the full breakdown here: Trijna Labs Dev Log

We are a small team trying to do something insanely difficult. If you have any architectural advice, critiques on our math, or brutal feedback, we’d honestly love to hear it in the comments.

Let's discuss! What do you guys think the post-transformer era of AI will look like?
