Pavel
I Made a Tiny Node.js Engine to Stop My LLM from Lying to Me

Hey everyone!

Like many of you, I've been riding the AI wave, trying to get Large Language Models (LLMs) like GPT, Llama, and Mistral to build cool stuff for me. They're amazing at writing isolated functions or boilerplate code. But the moment I ask them to build a full, simple web app, things start to get... weird.

The LLM starts to hallucinate.

It invents file paths that don't exist. It writes brilliant but flawed JavaScript to manipulate the DOM, forgetting crucial element IDs. It messes up state management, reading from one file and writing to another, creating a tangled mess. It's like having a brilliant but dangerously overconfident intern.

I realized the problem wasn't the LLM's coding skill. The problem was that I was giving it too much freedom. I was asking it to be a full-stack developer, a file system expert, and a DOM artist all at once.

So, I decided to try a weird experiment: What if I built an engine that a lying LLM couldn't break?

Introducing Serverokey: The "Guardrail" Engine

I created Serverokey, a tiny, zero-dependency Node.js engine with a simple philosophy: force the LLM to be an architect, not a coder.

Instead of writing HOW to do things, the LLM just declares WHAT it wants to happen in a single manifest.js file. The engine handles the rest.

Here's how it works:

1. No More File Paths, Only Names

The LLM doesn't know about ./app/components/ or ./app/actions/. If it needs a component, it just asks for 'receipt'. The engine's strict conventions find the right file.

LLM Hallucination: "Let's require('../../utils/helpers.js')..."
Serverokey's Fix: Not possible. You can only ask for 'receipt', and I'll find receipt.html and receipt.css for you.
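
Under the hood this is just convention-over-configuration. Here's a simplified sketch of the idea (the function and folder layout below are illustrative, not the engine's actual code):

```js
// Simplified sketch of name-based resolution (illustrative, not the engine's actual code).
const path = require('path');
const fs = require('fs');

function resolveComponent(name) {
  // Reject anything that looks like a path, so '../../utils/helpers.js' can't happen.
  if (!/^[a-z][a-z0-9-]*$/i.test(name)) {
    throw new Error(`Invalid component name: ${name}`);
  }
  const base = path.join(__dirname, 'app', 'components');
  const html = path.join(base, `${name}.html`);
  const css = path.join(base, `${name}.css`);
  if (!fs.existsSync(html)) {
    throw new Error(`Unknown component: ${name}`);
  }
  return { html, css: fs.existsSync(css) ? css : null };
}
```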

2. Declarative Actions, Not Imperative Code

To add an item to a shopping cart, the LLM doesn't write a JavaScript function. It describes the operation in the manifest:

```js
// manifest.js
'POST /action/addItem': {
  type: 'action',
  // No JS file needed!
  manipulate: {
    target: 'receipt.items',       // The array to change
    operation: 'push',             // What to do
    source: 'positions.all',       // Where to get the data from
    findBy: { "id": "body.id" }    // How to find the source item
  },
  writes: ['receipt'],
  update: 'receipt' // Re-render the 'receipt' component
}
```

The engine's DataManipulator handles the find, push, and state update. The LLM can't mess up the algorithm because it doesn't write it.
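
To make that concrete, here is roughly what the engine has to do with a manipulate block like the one above. This is a simplified sketch, not the real DataManipulator, which handles more operations and error cases:

```js
// Simplified sketch of what the engine does with a 'manipulate' block (not the real DataManipulator).
// get() resolves dotted paths like 'receipt.items' against the app state.
function get(state, dottedPath) {
  return dottedPath.split('.').reduce((obj, key) => obj?.[key], state);
}

function runManipulate(state, body, { target, operation, source, findBy }) {
  const sourceList = get(state, source);   // e.g. positions.all
  const targetList = get(state, target);   // e.g. receipt.items

  // findBy: { "id": "body.id" } means: match candidate.id against body.id
  const item = sourceList.find(candidate =>
    Object.entries(findBy).every(([field, ref]) =>
      String(candidate[field]) === String(get({ body }, ref))
    )
  );
  if (!item) throw new Error('Source item not found');

  if (operation === 'push') targetList.push({ ...item });
}
```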

3. Computed Data with Formulas, Not Manual Recalculations

Instead of asking the LLM to write code to recalculate the total price every time the cart changes, I just ask it to define the relationship once.

```js
// manifest.js
data: {
  receipt: {
    initialState: { items: [], total: '0.00' },
    // This field is now fully automatic!
    computed: [
      {
        "target": "total",
        "formula": "sum(items, 'price')", // The engine knows how to do this
        "format": "toFixed(2)"
      }
    ]
  }
}
```

The DataManager and its FormulaParser automatically run this calculation whenever receipt.items changes. No more bugs from forgetting to update the total.
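
For the curious: evaluating a formula like sum(items, 'price') doesn't require a full expression language. A minimal sketch of the idea (the real FormulaParser is more general):

```js
// Minimal sketch of evaluating "sum(items, 'price')" (the real FormulaParser is more general).
function evaluateFormula(formula, state) {
  const match = formula.match(/^sum\((\w+),\s*'(\w+)'\)$/);
  if (!match) throw new Error(`Unsupported formula: ${formula}`);
  const [, listName, field] = match;
  return state[listName].reduce((acc, item) => acc + Number(item[field] ?? 0), 0);
}

// Recompute after every write: total = sum of item prices, then format.
const receipt = { items: [{ price: 2.5 }, { price: 1.25 }], total: '0.00' };
receipt.total = evaluateFormula("sum(items, 'price')", receipt).toFixed(2);
console.log(receipt.total); // "3.75"
```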

The Result: A Predictable, "Boring" Workflow

With Serverokey, building a feature looks like this:

  1. The LLM modifies manifest.js to describe the new data, action, or computed value.
  2. The LLM modifies the corresponding HTML template in app/components/.
  3. That's it.

The entire process becomes a predictable, "fill-in-the-blanks" task. The LLM's creativity is channeled into architecture and UI, not into writing potentially buggy, low-level code.
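
To show what "fill-in-the-blanks" means in practice, here is a condensed, illustrative manifest that ties the pieces together. The exact schema may differ from the example app; treat this as a sketch:

```js
// manifest.js: a condensed, illustrative example (the real schema may differ).
module.exports = {
  data: {
    positions: { initialState: { all: [{ id: 1, name: 'Coffee', price: 3.5 }] } },
    receipt: {
      initialState: { items: [], total: '0.00' },
      computed: [
        { target: 'total', formula: "sum(items, 'price')", format: 'toFixed(2)' }
      ]
    }
  },
  'POST /action/addItem': {
    type: 'action',
    manipulate: {
      target: 'receipt.items',
      operation: 'push',
      source: 'positions.all',
      findBy: { id: 'body.id' }
    },
    writes: ['receipt'],
    update: 'receipt'
  }
};
```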

Help Me Test This! A Call to the Community

I've built this as a personal experiment, and it works surprisingly well for me. But I'm just one person. I'm super curious to see how this approach holds up with different models, especially local ones.

This is where I need your help.

If you're into LLMs and have a few minutes to spare, I'd be incredibly grateful if you could try this out. The project is on GitHub, it's tiny, and has zero dependencies.

Check out Serverokey on GitHub

The Challenge:

  1. Clone the repo.
  2. Run node engine.js to see the example app.
  3. "Feed" the core files (manifest.js, the files in core/, engine.js, etc.) to your favorite local LLM (Llama 3, Mistral, Phi-3, whatever you have).
  4. Give it a simple task, like:
    • "Add a new 'tax' field to the receipt that is 5% of the final total."
    • "Create a new action to apply a coupon code from an input field."
    • "Add a new component that shows a log of all actions."
  5. See what happens! Does it understand the declarative pattern? Does it successfully modify the manifest? Does it avoid writing unnecessary JavaScript?
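
For the first task above, a successful answer would ideally be a pure manifest change, something along these lines (a hypothetical output, reusing the computed-field pattern from earlier):

```js
// A hypothetical "good" answer to the tax task: manifest-only, no new JavaScript.
computed: [
  {
    "target": "total",
    "formula": "sum(items, 'price')",
    "format": "toFixed(2)"
  },
  {
    "target": "tax",
    "formula": "total * 0.05", // assumes the formula language supports simple arithmetic
    "format": "toFixed(2)"
  }
]
```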

I'm not looking for perfection. I'm looking for data. I want to know if this "guardrail" approach actually makes LLMs more reliable in practice.

Let me know your results, thoughts, or critiques in the comments below or in the GitHub issues. Is this a dead end, or is there something to this idea of building "LLM-proof" architectures?

Thanks for reading!


P.S. I'm not a native English speaker, so I apologize for any grammar quirks! The code should speak for itself, hopefully.

Top comments (1)

Pavel
Hi everyone. Thanks for reading.
Just a heads-up for anyone exploring the repository: I've just pushed a major architectural update (v4.0) that takes the concepts from this article even further.
The key changes are:
  • Modularization: The project is no longer a monolith. It has been refactored into a proper NPM package (serverokey) and an example application (kassa-app-example) that consumes it.
  • Connector Architecture: The DataManager has been replaced by a more flexible ConnectorManager. This introduces a provider model for data sources, allowing for easy extension. The engine now ships with both json and in-memory connectors.
This makes the framework far more scalable and ready for integration into more complex projects. The code in the repository now reflects this new, more robust design.
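
To give a feel for the provider model, here is a rough sketch of the shape a connector might take (illustrative only; the actual interface lives in the repo):

```js
// Illustrative sketch of a connector provider (the actual interface lives in the repo).
const fs = require('fs/promises');

class JsonConnector {
  constructor(filePath) { this.filePath = filePath; }
  async read() {
    try {
      return JSON.parse(await fs.readFile(this.filePath, 'utf8'));
    } catch {
      return {}; // first run: no file yet, start with empty state
    }
  }
  async write(state) {
    await fs.writeFile(this.filePath, JSON.stringify(state, null, 2));
  }
}

class InMemoryConnector {
  constructor() { this.state = {}; }
  async read() { return this.state; }
  async write(state) { this.state = state; }
}
```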
All feedback on the new structure is welcome.