Ammar

Helios-Engine: Why I Built Another LLM Agent Framework (And Why You Might Actually Care)

Yeah, I know. Another LLM agent framework. The last thing the world needs, right?

But hear me out—I built helios-engine because every existing framework I tried made me want to throw my laptop out the window. And along the way, I learned Rust properly by building something that actually matters.

The Problem: Every Framework Was Broken in Its Own Special Way

I tried them all. LangChain, LlamaIndex, AutoGPT, CrewAI—you name it, I fought with it. Here's what I kept running into:

Vendor Lock-In Everywhere

Most frameworks are married to OpenAI. Want to use Claude? Gemini? Some open-source model you're hosting yourself? Good luck navigating their half-baked adapter patterns. And if you want to switch providers mid-project? Better start refactoring.

"Local Support" That Doesn't Actually Work

Frameworks claim they support local models. What they mean is "we have a broken integration with Ollama that throws cryptic errors half the time." Running a GGUF model directly? Offline inference? Forget about it.

Syntax That Fights You

Either you're drowning in boilerplate and configuration files, or you're using some "magical" DSL that breaks the moment you need to do something the framework designers didn't anticipate. Both suck.

No Real Multi-Agent Support

Everyone talks about "agents," but most frameworks give you one agent that can kinda-sorta call tools. Building an actual system where multiple agents collaborate? You're on your own.

The Python Tax

Most of these frameworks are Python. Which is fine... until you want performance, type safety, or to deploy something that doesn't need a 500MB Docker image just to run.

So I Built helios-engine

I had two goals:

  1. Fix all the shit that annoyed me about existing frameworks
  2. Actually learn Rust instead of just reading the book and building toy projects

What Makes It Different

No Vendor Lock-In

// OpenAI
let client = LLMClient::new(LLMProviderType::OpenAI(config)).await?;

// Any OpenAI-compatible API (Groq, Together, local inference servers)
let client = LLMClient::new(LLMProviderType::Custom(config)).await?;

// True local GGUF models (no server required)
let client = LLMClient::new(LLMProviderType::Local(config)).await?;

Same interface. Zero refactoring to switch providers. The client doesn't care where the tokens come from.
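For illustration, here's roughly what that looks like in practice: pick the provider once at startup, and everything downstream just sees the same client. The local_config() and openai_config() helpers below are stand-ins for however you build your configs, not part of the crate.

async fn build_client(use_local: bool) -> Result<LLMClient> {
    // Choose the provider in one place; the rest of the code never knows the difference.
    let provider = if use_local {
        LLMProviderType::Local(local_config())   // stand-in helper
    } else {
        LLMProviderType::OpenAI(openai_config()) // stand-in helper
    };
    LLMClient::new(provider).await
}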

Local Models That Actually Work
The local mode uses llama.cpp under the hood. First run downloads the model from HuggingFace. After that, it's fully offline. No servers, no containers, no Python interpreter. Just Rust and the model file.

let local_config = LocalConfig {
    huggingface_repo: "unsloth/Qwen3-0.6B-GGUF".to_string(),
    model_file: "Qwen3-0.6B-Q4_K_M.gguf".to_string(),
    temperature: 0.7,
    max_tokens: 2048,
};

Forest of Agents
Multiple agents that can actually collaborate. Each agent has its own personality, tools, and context. They can delegate tasks to each other, share information, and work together on complex problems.

let researcher = AgentBuilder::new()
    .name("Researcher")
    .goal("Find accurate information")
    .tools(vec![search_tool, web_scraper_tool])
    .build()?;

let writer = AgentBuilder::new()
    .name("Writer")
    .goal("Create engaging content")
    .tools(vec![markdown_tool])
    .build()?;
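Wiring those two together could look something like the sketch below. The Forest name and its run signature are my shorthand here, not a promise about the exact API; the point is that both agents are registered with a coordinator and handed a shared task.

// Sketch only: `Forest` and `run` are illustrative names, not the exact API.
let forest = Forest::new(vec![researcher, writer]);

// The researcher gathers sources, then hands off to the writer.
let article = forest
    .run("Research Rust async runtimes and write a short comparison")
    .await?;

println!("{article}");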

Clean Syntax
No magic. No configuration hell. Just Rust code that does what it looks like it does.

let agent = AgentBuilder::new()
    .name("Assistant")
    .goal("Help the user")
    .instruction("You are helpful and concise")
    .tool(calculator)
    .tool(web_search)
    .build()?;

agent.run("What's 147 * 923?").await?;

Tool Belt
Tools are just async functions. Register them, and agents can use them. That's it.

async fn get_weather(location: String) -> Result<String> {
    // Your implementation (placeholder response for this sketch)
    Ok(format!("Sunny and 24°C in {location}"))
}

let tool = Tool::new(
    "get_weather",
    "Gets current weather for a location",
    get_weather,
);

agent.register_tool(tool);

Built-in RAG
Vector stores (in-memory or Qdrant), embeddings, retrieval—all integrated. No separate libraries to wrangle.
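A rough sketch of the integrated flow, with placeholder names for the store and retrieval calls (check the docs for the real ones): index some documents, pull back the closest matches, and feed them to the agent as context.

// Placeholder names throughout: this shows the shape of the flow, not the exact API.
let mut store = InMemoryVectorStore::new();
store.add_document("helios-engine runs GGUF models via llama.cpp").await?;
store.add_document("Feature flags keep the compiled binary small").await?;

// Retrieve the top 3 chunks most relevant to the question.
let context = store.retrieve("How do local models work?", 3).await?;

// Hand the retrieved context to the agent alongside the question.
let answer = agent
    .run(&format!("Context: {:?}\n\nQuestion: How do local models work?", context))
    .await?;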

Streaming
Real token-by-token streaming for both remote and local models. Watch the thinking process unfold in real-time.
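Consuming a stream looks roughly like this. The chat_stream method name is an assumption on my part; the crate does streaming, but check the docs for the exact call.

use futures::StreamExt;

// Print tokens as they arrive instead of waiting for the full completion.
// `chat_stream` is an illustrative name, not necessarily the exact method.
let mut stream = client.chat_stream("Explain lifetimes in one paragraph").await?;
while let Some(token) = stream.next().await {
    print!("{}", token?);
}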

What I Learned Building This

Building helios-engine forced me to actually understand Rust's ownership model, async runtime, trait system, and error handling—not just read about them.

Async Rust Is Beautiful (Once You Get It)

Fighting with lifetimes in async contexts was painful at first. But once I understood how Send and Sync work with futures, everything clicked. The Tokio runtime is fast.
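A framework-agnostic example of the thing that finally clicked for me: tokio::spawn only accepts futures that are Send, so any state an agent task captures has to be Send too, which is why shared state ends up behind Arc plus Tokio's async Mutex.

use std::sync::Arc;
use tokio::sync::Mutex;

async fn log_step(shared: Arc<Mutex<Vec<String>>>) {
    // The spawned future captures `shared`; Arc<Mutex<_>> is Send, so this compiles.
    let handle = tokio::spawn(async move {
        // Tokio's MutexGuard can be held across an .await point.
        let mut log = shared.lock().await;
        log.push("agent step finished".to_string());
    });
    handle.await.expect("task panicked");
}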

The Type System Saves You

Python's flexibility sounds nice until you're debugging why your agent system breaks at runtime. Rust's compiler is harsh, but it catches problems before they become production incidents.

// This won't compile if LLMProviderType doesn't implement Clone
let client = LLMClient::new(provider.clone()).await?;

That "won't compile" is your best friend.

Error Handling Done Right

No more Python exceptions that bubble up three layers and get swallowed. Everything is Result<T, E>. Errors are values. Handle them or propagate them—the compiler enforces it.

pub type Result<T> = std::result::Result<T, HeliosError>;

#[derive(Debug, thiserror::Error)]
pub enum HeliosError {
    #[error("LLM API error: {0}")]
    LLMError(String),

    #[error("Configuration error: {0}")]
    ConfigError(String),

    #[error("Tool execution failed: {0}")]
    ToolError(String),
}
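In practice that means propagation is one character. The helper below is illustrative (it assumes the built agent type is called Agent and that run returns the reply as a String), but it shows the pattern: any HeliosError from the call bubbles straight up through the question mark.

// Illustrative helper: `?` either yields the reply or propagates HeliosError to the caller.
async fn ask(agent: &Agent, question: &str) -> Result<String> {
    let reply = agent.run(question).await?;
    Ok(reply)
}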

Performance Matters

Rust's zero-cost abstractions aren't marketing. The difference in response time between a Python framework and helios-engine is noticeable, especially for local models.

Feature Flags Are Powerful

[dependencies]
helios-engine = { version = "0.3", features = ["local", "rag"] }

Compile only what you need. Keep your binary small.
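Under the hood this is just Cargo's conditional compilation. The sketch below uses illustrative module names, but it's the mechanism that keeps llama.cpp out of your build entirely when the local feature is off.

// Illustrative module names: code behind a flag simply isn't compiled
// into the binary unless that feature is enabled.
#[cfg(feature = "local")]
pub mod local_llama;

#[cfg(feature = "rag")]
pub mod vector_store;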

Should You Use It?

If you're building LLM-powered applications and you're tired of:

  • Being locked into one provider
  • Python's performance and deployment overhead
  • Complex frameworks that do too much or too little
  • "Local support" that doesn't actually work offline
  • Agent systems that can't handle multiple agents

Then yeah, give it a shot.

If you're also learning Rust and want a real project that exercises the language's features in a practical context, helios-engine is a good codebase to study or contribute to.

Built with Rust. No bullshit. Just agents that work.
