DEV Community

Ravikash Gupta
QORA - Native Rust LLM Inference Engine

Pure Rust inference engine for the SmolLM3-3B language model. No Python runtime, no CUDA, no external dependencies. Single executable + quantized weights = portable AI on any machine.

Download 🤗: https://huggingface.co/qoranet/QORA-LLM

| Spec | Value |
| --- | --- |
| Base Model | SmolLM3-3B (HuggingFaceTB/SmolLM3-3B) |
| Parameters | 3.07 billion |
| Quantization | Q4 (4-bit symmetric, group_size=32) |
| Model Size | 1.68 GB (Q4) / ~6 GB (F16) |
| Executable | 6.7 MB |
| Context Length | 65,536 tokens (up to 128K with YaRN) |
| Platform | Windows x86_64 (CPU-only) |
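Q4 with group_size=32 means each run of 32 weights shares a single f32 scale factor. A minimal sketch of symmetric 4-bit group quantization, under assumed conventions (one i8 per value for clarity; a real implementation like QORA's would pack two 4-bit values per byte, and its exact rounding rules may differ):

```rust
/// Quantize weights to symmetric 4-bit values with per-group scales.
/// Hypothetical helper, not QORA's actual code.
fn quantize_q4(weights: &[f32]) -> (Vec<i8>, Vec<f32>) {
    let mut quants = Vec::with_capacity(weights.len());
    let mut scales = Vec::new();
    for group in weights.chunks(32) {
        // Largest magnitude in the group determines the scale.
        let max_abs = group.iter().fold(0f32, |m, w| m.max(w.abs()));
        // Symmetric mapping of [-max_abs, max_abs] onto the 4-bit range [-7, 7].
        let scale = if max_abs > 0.0 { max_abs / 7.0 } else { 1.0 };
        scales.push(scale);
        for w in group {
            quants.push((w / scale).round().clamp(-7.0, 7.0) as i8);
        }
    }
    (quants, scales)
}

/// Reconstruct approximate f32 weights from quantized values + scales.
fn dequantize_q4(quants: &[i8], scales: &[f32]) -> Vec<f32> {
    quants
        .chunks(32)
        .zip(scales)
        .flat_map(|(group, &s)| group.iter().map(move |&q| q as f32 * s))
        .collect()
}
```

This layout stores one f32 scale per 32 weights, which is where the roughly 4x size reduction over F16 (1.68 GB vs ~6 GB above) comes from.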

Key Architectural Innovation: NoPE (No Position Encoding)
SmolLM3 uses a 3:1 NoPE interleaving: every fourth layer (layers 3, 7, 11, 15, 19, 23, 27, 31, 35) skips positional encoding entirely, while the remaining 75% of layers apply standard RoPE. This reduces computational overhead and improves long-context generalization.
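The layer indices listed above fall out of a simple modulo rule: the 3:1 interleaving singles out every fourth layer. A minimal sketch of the selection rule (hypothetical helper names, not QORA's actual code):

```rust
/// True for the layers picked out by SmolLM3's 3:1 interleaving
/// pattern: every fourth layer, 0-indexed (3, 7, 11, ..., 35).
/// Hypothetical helper, not QORA's actual implementation.
fn is_fourth_layer(layer_idx: usize) -> bool {
    layer_idx % 4 == 3
}

/// All interleaved layer indices for a model with `num_layers` layers.
fn interleaved_layers(num_layers: usize) -> Vec<usize> {
    (0..num_layers).filter(|&i| is_fourth_layer(i)).collect()
}
```

For SmolLM3-3B's 36 layers this yields exactly 9 layers, i.e. one quarter of the stack.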

Performance Benchmarks
Test Hardware: Windows 11, CPU-only (no GPU acceleration)
