Hello, community!
https://github.com/Xzdes/slmnetGPT
Sometimes, the best projects are born from a simple "What if...?" question. One evening, I was looking at some of my old code: a tiny neural network library in JS I had written for fun, called slmnet. A thought struck me: what if, instead of just solving toy problems, I could use it to build a real interactive organism that lives and learns right in the browser?
And so, Project "Living Brain" was born.
The Goal: Create a chatbot that:
- Runs entirely on the client-side, with no servers.
- Learns in real-time from user conversations.
- Saves its knowledge in LocalStorage so it doesn't suffer from amnesia after a page refresh.
The Tools:
- My self-made library, slmnet (Tensors, Dense/ReLU layers, an SGD optimizer).
- Pure JavaScript.
- The browser's LocalStorage as its hippocampus.
A Journey of Pain and Discovery: The Three Stages of Failure
I thought it would be easy. I was wrong. The bot went through three evolutionary stages, and each one was a classic problem from the world of AI.
Stage 1: The Echo Bot
The first version was terrible. It would simply memorize the last answer it was taught and repeat it for every single question. Boring.
Stage 2: The Bot with Catastrophic Forgetting
I solved problem #1, only to create a new one. The bot would perfectly learn a new lesson (e.g., "How are you?" -> "I'm great!"), but in the process, it would completely forget everything it knew before ("Hello" -> "Greetings"). This is a classic AI problem where new knowledge completely overwrites the old. I was literally forcing it to cram for one test question, wiping its entire memory clean.
Stage 3: The Bot with an Identity Crisis
I taught it to stop forgetting old lessons. But as soon as I added a new word to its vocabulary, its brain (the network architecture) had to be rebuilt. My code would just create a new, empty brain. And although it retrained on all the old examples, the random initialization of its weights meant its "personality" completely changed. It would start answering "Hello" with "See you later." There was no stability.
The Final Insight: Thinking Like Real AI
The solution came when I stopped thinking like a programmer and started thinking like... a trainer.
- Experience Replay: Instead of cramming one lesson, I created a "memory bank" to store all past conversations. Now, during training, the bot runs through its entire history, gently adjusting its weights and reinforcing old knowledge along with the new (see the first sketch after this list).
- Transfer Learning: When new words appear, I stopped "demolishing the house." Instead, I implemented a "brain transplant": I create a new, larger model and carefully copy all the weights from the old one into it. This way, its personality is preserved, and there's new space for new knowledge (see the second sketch after this list).
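To make the replay idea concrete, here is a minimal sketch. The `model.trainStep(question, answer)` method is an assumed slmnet-style API, not the library's real one, and the key name is made up for illustration:

```javascript
// Experience-replay sketch (illustrative; model.trainStep is an assumed
// slmnet-style API). Every lesson goes into a memory bank in LocalStorage,
// and each training pass replays the whole bank so old knowledge is
// reinforced alongside the new lesson.

const BANK_KEY = 'brain_memory_bank';

// Load all past (question, answer) pairs from LocalStorage.
function loadBank() {
  return JSON.parse(localStorage.getItem(BANK_KEY) || '[]');
}

// Store a new lesson, then persist the whole bank.
function remember(question, answer) {
  const bank = loadBank();
  bank.push({ question, answer });
  localStorage.setItem(BANK_KEY, JSON.stringify(bank));
}

// Instead of cramming only the newest lesson, replay the entire history
// for a few epochs, gently nudging the weights on every example.
function trainWithReplay(model, epochs = 20) {
  const bank = loadBank();
  for (let epoch = 0; epoch < epochs; epoch++) {
    for (const { question, answer } of bank) {
      model.trainStep(question, answer); // one SGD update per example
    }
  }
}
```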
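And a minimal sketch of the "brain transplant" itself, using plain 2-D arrays instead of slmnet tensors; the sizes in the example are made up:

```javascript
// "Brain transplant" sketch: when the vocabulary grows, build a bigger
// weight matrix and copy the old weights into it, so the network keeps
// its personality instead of starting from a fresh random brain.
// (Illustrative; plain 2-D arrays, not the actual slmnet tensor type.)

function growWeights(oldW, newRows, newCols) {
  // Start from small random values, as a fresh layer would.
  const newW = Array.from({ length: newRows }, () =>
    Array.from({ length: newCols }, () => (Math.random() - 0.5) * 0.1)
  );
  // Copy every old weight into the same position in the bigger matrix.
  for (let r = 0; r < oldW.length; r++) {
    for (let c = 0; c < oldW[r].length; c++) {
      newW[r][c] = oldW[r][c];
    }
  }
  return newW;
}

// Example: vocabulary grew from 10 to 12 words, responses from 4 to 5.
const oldW = Array.from({ length: 10 }, () => Array(4).fill(0.3));
const grown = growWeights(oldW, 12, 5);
console.log(grown.length, grown[0].length); // 12 5
```

The key point is that only the freshly added rows and columns start at random values; everything the old brain knew is copied verbatim, which is why the personality survives.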
Let's Be Honest: This Isn't ChatGPT
My project is a "little" language model, not a large one.
- It doesn't generate text; it performs classification by selecting the most appropriate response from those it already knows.
- It uses a simple "Bag-of-Words" model, not complex transformers (a toy example follows this list).
- Its "understanding" is a statistical correlation, not semantic awareness.
But you know what? That doesn't matter. This one-evening project allowed me to experience the same journey AI researchers go through: from the simplest mistakes to implementing fundamental concepts. And it all happens in your browser window.
It was an incredibly fascinating ride. Sometimes, old code is the best playground for new ideas.
Top comments (20)
Nice! This could actually save tons of electricity. Why run a big model on the server for simple responses like "hello" or "thanks"? Your browser thing can handle that locally and use way less energy.
If this gets implemented everywhere, servers would run cooler and consume less power. Simple questions - handled locally, complex ones - sent to server. Makes sense!
That's right, it's a waste to spend so many resources on such a brief chat.
Here's another thought for anyone reading this! The little model doesn't just respond to "Thank you"; it becomes a context manager. Let's see how cool this is:
The user: "Hi! Tell me about hybrid AI."
Little GPT (in the browser) responds instantly: "Hello there! Of course, I'll be happy to tell you."
Behind the scenes: at the same moment, the small model sends a request to the large LLM that looks something like this:
Main request: tell the user about hybrid AI.
Context package: { "status": "dialog_started", "user_greeting": "Hi!", "bot_reply": "Hello there! Of course, I'll be happy to tell you.", "tone": "friendly" }
Having received such a "package," the LLM immediately understands the whole picture, without unnecessary introductions:
- The user has already been greeted.
- The user is friendly.
- The beginning of the answer has already been given; the LLM just needs to continue it organically.
The LLM's response is immediately relevant, with no repeated "Hello! How can I help you?" The LLM simply continues the conversation as if it had been part of it from the very beginning.
This solves the key problems:
- Completely seamless operation: the user never notices the "switch" between models.
- Resource savings at a new level: the LLM doesn't waste compute re-analyzing the start of the dialogue; it gets straight to the point.
- Deep understanding of context: the dialogue becomes personalized and much more natural.
Or here's the idea of a hybrid AI response:
Instant start: As soon as you ask a question, the neural network immediately gives the first part of the answer right in the browser, for example: "Of course, I'll help you now!"
The full answer follows: While you are reading this greeting, a powerful neural network on the server is already preparing the detailed main answer, which appears a moment later.
Result: For the user, the exchange looks instant, with no pauses or waiting. This makes interaction with AI faster, smoother, more natural, and seamless.
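A rough sketch tying both ideas above together: the instant local reply, plus the context package sent upstream so the big model can continue seamlessly. The endpoint, payload shape, and `littleModel.respond` call are all hypothetical, just to show the flow:

```javascript
// Hybrid-handoff sketch: the little in-browser model answers instantly,
// while a context package goes to the big LLM so it can continue the
// dialogue without re-greeting. Endpoint and payload are hypothetical.

async function hybridReply(userMessage, littleModel, showInChat) {
  // 1. Instant local reply from the small model.
  const localReply = littleModel.respond(userMessage); // assumed API
  showInChat(localReply);

  // 2. Meanwhile, send the main request plus a context package upstream.
  const contextPackage = {
    status: 'dialog_started',
    user_greeting: userMessage,
    bot_reply: localReply,
    tone: 'friendly',
  };
  const res = await fetch('https://example.com/llm', { // hypothetical endpoint
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ request: userMessage, context: contextPackage }),
  });

  // 3. The LLM continues organically from the local reply, no re-greeting.
  const { continuation } = await res.json();
  showInChat(continuation);
}
```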
I released a post about the continuation of the project development! I think it turned out great!
dev.to/xzdes/i-supercharged-my-bro...
I was sitting here thinking about why I created this neural network, and then it dawned on me: it can be part of the query-optimization process! An LLM user writes lots of similar, simple requests like "Thank you!" or "How are you?" Why send those to the server when you can handle them on the client, and only make a server request when the local model can't cope?
How do you like the idea?
So I created a project aimed at improving API usage: this little bot can learn from the LLM and tell it "don't bother" whenever the user writes something like "Thank you."
github.com/Xzdes/slmnet-Hybrid
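A minimal sketch of that gate, assuming the little model reports a confidence score; the threshold, the `littleModel.classify` call, and the endpoint are all made up for illustration:

```javascript
// Client-side gate sketch: trivial messages ("Thank you!", "Hi!") are
// answered by the little in-browser model; anything it isn't confident
// about falls through to the server LLM. All names here are illustrative.

const CONFIDENCE_THRESHOLD = 0.9; // assumed cutoff, tune per model

async function routeMessage(message, littleModel) {
  // The small classifier returns its best response and a confidence score.
  const { reply, confidence } = littleModel.classify(message); // assumed API

  if (confidence >= CONFIDENCE_THRESHOLD) {
    // Cheap path: handled entirely in the browser, no server round-trip.
    return reply;
  }

  // Expensive path: the local model "doesn't cope", so ask the big LLM.
  const res = await fetch('https://example.com/llm', { // hypothetical endpoint
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ request: message }),
  });
  const { answer } = await res.json();
  return answer;
}
```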
I've completely rebuilt the project, transforming it from a classifier chatbot into a full-fledged GPT that runs in the browser using only vanilla JavaScript. I believe I've pushed this to its absolute limit!
Training takes about 10 minutes and generation takes up to 2 minutes, all happening client-side in the browser. I couldn't squeeze any more performance out of it. It's not much, but it's honest generation. The model is dumb and slow, but it's a real GPT!
Thank you for taking the time to read through this. I appreciate all of your feedback and support.
Turning "what if" into "heck yeah" in just one evening is peak dev energy...
Thank you very much!
This is brilliant! I love it. Thanks for sharing
Thank you very much!
Here's another application! You could also use this in your own neural-network aggregator service to avoid burning through your model usage limits!
Wow
That's awesome! Building a GPT in the browser with pure JS in one evening is seriously impressive.
Thank you very much! But technically it's an imitation of GPT, and GPT is an imitation of communication :)