Beyond Logic Labs

Loreguard.com - I built an NPC AI engine that runs on the player's machine (no pay-per-token)

Hello everyone.

Like many people, when GenAI got powerful enough to write reasonable amounts of code and accelerate development work, I was amazed, and I started spawning a lot of projects to see what it could do.

As someone who grew up with a computer at home in the 90s, computers and games were part of my story; they definitely shaped the person I am today. When I grew up I did my bachelor's in the field, but I could never really pursue my dream, which was working in the gaming industry.

With agentic coding this became possible (as a hobby, of course). I have no big expectations for it, so I decided to try something ambitious: build a game where living-world interactions don't feel scripted. I'm not aiming for realism either; as we know, AI generation, so far, cannot be 100% realistic. GTA V is not realistic either, but it is still immersive and fun (and they don't even use AI).

I researched players on the market doing something similar. While they exist, they are big, focused on more than just gaming, require cloud integration to run their AI NPC generation pipelines (Convai and Inworld), and are expensive.

While I have no intention of "competing", I saw an opportunity: this kind of technology is not yet being developed with indie games in mind. A similar product that would not require inference over an API, so both online and offline games could use it, that would be affordable for indie games, and that could run on consumer hardware.

The whole idea started when I was giving agentic coding a try by creating a "hacking simulator game" similar to Hacknet, but themed around the 90s era. I got addicted to the idea and started having a lot of fun building it, a game I would like to play myself.

[Screenshot: Boot Screen]

[Screenshot: Desktop]
A thought crossed my mind: what if the NPCs (allies or enemies) in this world could also be powered by the same agentic tools I had used to start building it?

It was not clear to me exactly how I would turn this into a game mechanic, but I had to give it a try; the idea was too juicy not to attempt (and some of you may disagree here).

NPCs need personality plus the content of the world or the story they live in. When you roleplay with AI characters online, they have loose boundaries: they can fabricate whatever they want because you cannot verify whether it is true (and that is not the intention either). For games, this is different: you want them grounded in the lore, staying in character.

I also didn't want players to have to pay an additional subscription (to ChatGPT, for example) to power the NPC dynamics; that doesn't make sense. So the way to solve this was going local. When you go local for LLM inference, things start to get hard. Right now it feels like going back to the 80s or 90s, where you had a small memory pool to deal with and had to bit-bash; that is what making local LLMs behave on consumer hardware looks like.

So, the first attempt started as simply as a markdown prompt with an NPC personality and some constrained knowledge, and the second used open-source models as-is. Of course, the NPC started to drift away from its knowledge and character, no matter how many times I put "STAY IN CHARACTER" in capital letters in the prompt. And it was not the local model's fault; ChatGPT and the others would also fail eventually. I needed a more reliable way to avoid "hallucination".

I spent almost a month working on an inference pipeline within the game. Since my game doesn't require immediate responses, I could afford some delay, so it started exactly like a pipeline:

Input: the user message

1. What knowledge is needed for this conversation?
2. What do I think about what the player is sending?
3. What will I do with the player's message?
4. How will I write this back to the player?
5. What actions will I execute in the game after this conversation?
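The staged approach above can be sketched as a chain of small, focused LLM calls, each feeding the next. This is a minimal illustration of the idea, not the actual Loreguard code; the stage names and `run_pipeline` helper are mine:

```python
# Each stage is a separate LLM call with a narrow task, instead of one
# big prompt asking the model to do everything at once.
STAGES = [
    "knowledge",   # what knowledge is needed for this conversation
    "thoughts",    # what do I think about the player's message
    "intent",      # what will I do with the player's message
    "speech",      # how will I write this back to the player
    "actions",     # what game actions to execute afterwards
]

def run_pipeline(llm, player_message: str) -> dict:
    """Run each stage in order, passing all previous outputs as context.

    `llm` is any callable taking a prompt string and returning a string.
    """
    context = {"input": player_message}
    for stage in STAGES:
        # The prompt carries only this stage's tag plus the accumulated
        # context, keeping the small model's attention focused.
        prompt = f"[{stage}]\n" + "\n".join(f"{k}: {v}" for k, v in context.items())
        context[stage] = llm(prompt)
    return context
```

The payoff is that a small model only ever has to solve one narrow sub-problem per call.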

This helped the small local model (8B parameters, 5-6 GB in memory) produce convincing content, but it was still prone to hallucination.

Then I started introducing review steps into the pipeline (2.5, 3.5, 4.5), each reviewing the previous step. I quickly noticed that the improvement was now marginal while the pipeline time was much higher, and it became the famous children's game of telephone, where information gets cut and changed from person to person until, by the time it arrives at its destination, it has lost its meaning. That is what happened to my pipeline.

With more research I learned that systems that need to rely on information retrieved from large data stores work by citing the document the data came from. This way, LLMs cannot fabricate facts, because the available sources are not manufacturable: they are part of the input, not part of the generation. If some of you remember, Gemini and ChatGPT used to do this (or still do) when referencing external content. It makes LLMs more reliable when generating content from sources.
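Concretely, the verification half of this can be as simple as rejecting any cited source ID that was never actually retrieved. An illustrative sketch, not the engine's real code:

```python
import re

def validate_citations(text: str, source_ids: set[str]) -> list[str]:
    """Return cited IDs that do not exist in the retrieved sources.

    Citations look like [E1]; [self] refers to the NPC's own personality
    and is always allowed. A non-empty result means the generation
    referenced a source it was never given, i.e. a likely fabrication.
    """
    cited = re.findall(r"\[(\w+)\]", text)
    return [c for c in cited if c != "self" and c not in source_ids]
```

Because the IDs come from the input, a hallucinated claim either carries no citation or cites an ID that fails this check.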

I spent 2 or 3 months adding this citation and verification to the pipeline; let's say 20% implementing it and 80% testing and tuning it.

I discovered the hard way that even this might be too hard a task for a small model that can run on a consumer machine. The pipeline was still useful, breaking a complex problem that ChatGPT could answer in one shot into smaller problems the local model could focus its attention on. My pipeline has a very specific input/output format it needs in order to work properly, so I also learned how to do LoRA training and fine-tune models. I started training open-source models for my own pipeline's input/output format, and then fine-tuning for specific game lores (my hacking simulation game): the characters from the game, the way they speak, the terms that exist in the game. This improved results significantly.
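For anyone curious what "training for the pipeline's input/output format" could look like in practice, one common approach is to serialize per-stage instruction/input/output triples as JSONL and train a LoRA adapter on them. The field names and stage tag below are hypothetical, just showing the shape of such a dataset:

```python
import json

def make_example(stage: str, context: str, expected: str) -> str:
    """Serialize one fine-tuning record (JSONL line) for a pipeline stage.

    Each record teaches the model one stage's exact contract: given this
    tagged instruction and context, produce exactly this kind of output.
    """
    record = {
        "instruction": f"[{stage}] Respond using only the cited context.",
        "input": context,
        "output": expected,
    }
    return json.dumps(record, ensure_ascii=False)
```

A few thousand such lines per stage, plus lore-specific examples (character voices, in-game terms), is the usual scale for a LoRA run on an 8B model.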

The latest pipeline is similar to the previous one. The whole pipeline now runs in 14 seconds on average (which also depends on the hardware running it), and there are still optimizations to be implemented.

The engine client is available on GitHub: https://github.com/beyond-logic-labs/loreguard-cli

So now it works like this:

User input: "Hi, can you tell me where the secret passage is?"

1. Context Retrieval - Not an LLM step, but deterministic context retrieval based on similarity with the player message, context, and history. Depending on the player message, more or less content is retrieved (for greetings, for example, no content is retrieved).
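As a toy stand-in for this retrieval step, here is a similarity scorer over word counts; the real engine presumably uses proper embeddings, and the lore snippets and threshold below are made up, but it shows the shape of "greetings retrieve nothing, lore questions retrieve the matching snippets":

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(message: str, lore: dict[str, str], threshold: float = 0.3) -> dict[str, str]:
    """Return only the lore snippets similar enough to the message.

    A plain greeting shares no content words with the lore, so nothing
    is retrieved and the later stages get an empty context.
    """
    q = Counter(message.lower().split())
    return {sid: text for sid, text in lore.items()
            if cosine(q, Counter(text.lower().split())) >= threshold}
```

Being deterministic, this step can never hallucinate; it only selects from what the game author actually wrote.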

2. Internal Monologue - Some LLMs have a "thinking" mode; this step is an attempt to reproduce it. I give the LLM freedom to generate a train of thought merging the player message and the retrieved content with the NPC's own personality. The output of this step looks something like this:

The player wants to know where the secret passage is. I know the secret passage is at the Elder Ruins [E1], but I was told it is a dangerous place and I should not share it with anyone [E2]. I will tell him it is a dangerous place and ask what he wants to do there [self]

Here [E1] and [self] are citations. The NPC could have a "care for others" trait, which would make him think that way, so he can cite his own personality, while [E1..En] are IDs of the content retrieved in step 1, semantically matched to the "secret passage" snippet.

2.5 Review Internal Monologue - Here is where the "magic" happens. A local LLM with its attention focused on thinking and placing correct citations makes fewer mistakes, but mistakes still exist. This step focuses entirely on the thinking output, acting as a reviewer. As you might know, LLMs behave more consistently when you give them roles; with the reviewer role and specific definitions of each "problem" that might exist (such as verbosity, not_grounded, evasion), it accepts or refuses the internal monologue. If it refuses, it sends it back with the issues specified, so step 2 gets a second chance to do it right. If ungrounded facts remain after that, they are stripped out.
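The accept/refuse contract of this review step can be sketched as a small retry loop. The issue names come from the post; the wiring (`generate`, `review`, retry count) is illustrative:

```python
# Named problems the reviewer is allowed to report back.
ISSUES = {"verbosity", "not_grounded", "evasion"}

def review_loop(generate, review, max_retries: int = 2) -> str:
    """Generate a draft, then let a reviewer accept it or refuse it
    with named issues; refused drafts are regenerated with the issues
    appended as feedback.

    generate(feedback: set[str]) -> draft string
    review(draft: str) -> set of issue names (empty means accepted)
    """
    feedback: set = set()
    draft = generate(feedback)
    for _ in range(max_retries):
        issues = review(draft)
        if not issues:
            return draft            # accepted
        feedback = issues & ISSUES  # refused: retry with named problems
        draft = generate(feedback)
    return draft                    # last attempt; ungrounded parts get stripped downstream
```

Giving the reviewer a closed vocabulary of issues is what keeps its feedback actionable for the retry, rather than free-form criticism.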

3. Deactivated

4. Speech - This is the materialization of the internal monologue into communication with the player. Here the focus is on character personality, voice style (in this case, text style), and proportionality, so the response matches the "intent" of the conversation (for example, you don't want a wall of text as a reply when you say "The weather is cool today").

4.5 Speech Review - Since we give the LLM another opportunity to generate dialogue in step 4, it can also hallucinate new entities or facts that were not in the internal monologue. This step receives the source citations from step 2 and matches them against the speech; if there are new entities, new facts, verbosity, or other issues (like narration), the speech is retried.
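As an illustration of the kind of check the speech review performs, here is a crude sketch that flags proper-noun entities in the dialogue that never appeared in the reviewed monologue. The capitalised-word heuristic and the name "Dragonspire" are mine, standing in for whatever real entity matching the engine does:

```python
import re

def new_entities(monologue: str, speech: str) -> set:
    """Return capitalised words in the speech that do not occur anywhere
    in the internal monologue (case-insensitive).

    A non-empty result means the speech step invented something new,
    e.g. a place name, and should be retried.
    """
    known = set(monologue.lower().split())
    caps = re.findall(r"\b[A-Z][a-z]+\b", speech)
    return {w for w in caps if w.lower() not in known}
```

The point is that step 4 is judged against step 2's already-verified output, so hallucinations introduced while "writing nicely" get caught too.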


If you read till here, I appreciate your time :).

Now comes the most important question: how can this become an interesting gameplay mechanic? IMO players won't have fun just by talking to NPCs; there must be something more, and I still don't know exactly what. You, reading this, might have different ideas.

For the game I want to build (the hacking simulator themed around the 90s), I will start by creating a gameplay mechanic around social engineering, where some secrets/puzzles can be solved by bribing NPCs into giving you information they shouldn't. This will be my first real usage of the pipeline.

This whole venture became its own product, separate from my game. I romanticized it a bit, thinking other indie devs might be interested in adding such a capability to their games, and that I could turn this into a product. But first of all, the product would have to work for me. Yesterday I finally connected it to my game, and I'm enjoying the results so far, though I'm still fine-tuning it.

Some of you might be skeptical, with good reason: the probability of any gameplay like this becoming AI slop is high, and I cannot say it is "battle tested".

I don't have big expectations for this. I'm doing it because I like doing it, I'm having a LOT of fun, and I have learned a lot along the way. This learning has already boosted my career for good (where preparation meets opportunity), so even if it ends here, it was very valuable and I'm happy with the results, with the hope that others find it as interesting as I did.

If you are interested in the game (Netshell) or in the NPC generation engine, you can join the game's Discord to talk, especially if you would be interested in adding such a capability to your own game.

You can already try it out on the loreguard.com page; there is a demo on the landing page, no payment required, and the beta is free.
