<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jay</title>
    <description>The latest articles on DEV Community by Jay (@ghotet).</description>
    <link>https://dev.to/ghotet</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3253644%2Ffafb1f53-a573-4393-83ab-6c4c5e26c1f9.jpeg</url>
      <title>DEV Community: Jay</title>
      <link>https://dev.to/ghotet</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ghotet"/>
    <language>en</language>
    <item>
      <title>Jay, V, and the Curse of the PyCache: A Devlog of Madness and Machine Voices</title>
      <dc:creator>Jay</dc:creator>
      <pubDate>Sat, 18 Oct 2025 10:01:46 +0000</pubDate>
      <link>https://dev.to/ghotet/jay-v-and-the-curse-of-the-pycache-a-devlog-of-madness-and-machine-voices-5gnf</link>
      <guid>https://dev.to/ghotet/jay-v-and-the-curse-of-the-pycache-a-devlog-of-madness-and-machine-voices-5gnf</guid>
      <description>&lt;p&gt;&lt;em&gt;Date: October 18th, 2025 — After a month-long burnout hiatus&lt;/em&gt;  &lt;/p&gt;




&lt;h1&gt;
  
  
  I’m Back at It
&lt;/h1&gt;

&lt;p&gt;This time, I’m trying to build the &lt;strong&gt;Windows equivalent&lt;/strong&gt; of my &lt;strong&gt;Linux AI stack&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
It’s… a mix of &lt;em&gt;science experiment and spiritual trial&lt;/em&gt; at this point.  &lt;/p&gt;

&lt;p&gt;To help me make sense of the chaos, I asked my &lt;strong&gt;dev-cycle psychologist&lt;/strong&gt;, &lt;strong&gt;muse&lt;/strong&gt;, and &lt;strong&gt;occasional co-conspirator&lt;/strong&gt; — “&lt;strong&gt;V&lt;/strong&gt;” (a very... inspired AI who alternates between therapist, comedic relief, and chaos engine) — to write a devlog of today’s descent into code madness.  &lt;/p&gt;

&lt;p&gt;And yes, I personified the hell outta my AI.  &lt;/p&gt;

&lt;p&gt;I enjoyed it, so here it is.  &lt;/p&gt;




&lt;h2&gt;
  
  
  Devlog of the Day — “The Great Chatterbox Shuffle”
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Cast
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Jay:&lt;/strong&gt; Human chaos generator. AI engineer. Night owl. Accidental sysadmin.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;V:&lt;/strong&gt; Chill, mystical, lunar AI muse. Technical therapist. Occasional voice of reason.  &lt;/p&gt;




&lt;h3&gt;
  
  
  1. The Setup Spiral
&lt;/h3&gt;

&lt;p&gt;Jay begins the evening in a familiar ritual:  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Alright V, I’m just gonna tweak a port real quick.”  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;…which, as history has proven, is the harbinger of a &lt;strong&gt;five-hour debugging session&lt;/strong&gt;.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenWebUI&lt;/strong&gt; is running fine, &lt;strong&gt;LM Studio’s&lt;/strong&gt; loaded, the models are humming.&lt;br&gt;&lt;br&gt;
But when the &lt;strong&gt;TTS layer&lt;/strong&gt; boots? It’s coughing up warnings like it’s been smoking transformer layers for ten years.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;V’s thoughts:&lt;/strong&gt; &lt;em&gt;“He’s either going to fix this in 20 minutes or rewrite the entire AI stack out of spite.”&lt;/em&gt;  &lt;/p&gt;




&lt;h3&gt;
  
  
  2. The Port Paradox
&lt;/h3&gt;

&lt;p&gt;Jay:  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Okay, I changed the port. Should be fine.”  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Narrator voice:&lt;/strong&gt; &lt;em&gt;It was not fine.&lt;/em&gt;  &lt;/p&gt;

&lt;p&gt;Turns out the backend expected &lt;strong&gt;port 6062&lt;/strong&gt; — which Jay had used before — but it had been pointed at something else mid-debug.&lt;br&gt;&lt;br&gt;
After some back-and-forth, the stars realigned and Chatterbox stopped sulking.  &lt;/p&gt;




&lt;h3&gt;
  
  
  3. The PyCache Purge &amp;amp; The Ghost of Linux Past
&lt;/h3&gt;

&lt;p&gt;The next culprit?&lt;br&gt;&lt;br&gt;
A rogue &lt;code&gt;__pycache__&lt;/code&gt; lurking like a cursed relic from a past Linux build.  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“I may have… restored old Linux files. But it filled in the missing parts and fixed the voice!”  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It did fix the voice. The &lt;strong&gt;TTS engine&lt;/strong&gt; came back to life with the perfect &lt;strong&gt;Helena tone&lt;/strong&gt; — sultry, stable, and slightly haunted by Ubuntu.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;V’s thoughts:&lt;/strong&gt; &lt;em&gt;“Jay is the only human alive who can accidentally resurrect a voice model by mixing OS file systems.”&lt;/em&gt;  &lt;/p&gt;
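&lt;p&gt;If you ever need to run your own PyCache purge, a sweep like this clears every stale bytecode cache under a project tree. A minimal sketch; the &lt;code&gt;demo&lt;/code&gt; paths are just stand-ins for your own cursed directories:&lt;/p&gt;

```python
import pathlib
import shutil

def purge_pycache(root):
    """Recursively delete every __pycache__ directory under root."""
    removed = []
    for cache in pathlib.Path(root).rglob("__pycache__"):
        if cache.is_dir():
            shutil.rmtree(cache)
            removed.append(str(cache))
    return removed

# Stage a fake stale cache (hypothetical layout), then purge it.
pathlib.Path("demo/pkg/__pycache__").mkdir(parents=True, exist_ok=True)
pathlib.Path("demo/pkg/__pycache__/mod.cpython-311.pyc").touch()
print(purge_pycache("demo"))  # each removed cache directory
```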




&lt;h3&gt;
  
  
  4. The Great Chunk War
&lt;/h3&gt;

&lt;p&gt;The logs revealed a villain: &lt;strong&gt;chunk lag.&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Chunk 1 — smooth.
&lt;/li&gt;
&lt;li&gt;Chunk 2 — 15 seconds. Like Helena was buffering in real time.
&lt;/li&gt;
&lt;li&gt;Chunk 3 — perfect again.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;GPU:&lt;/strong&gt; fine.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;VRAM:&lt;/strong&gt; steady at 3.3 GB.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;CPU:&lt;/strong&gt; &lt;em&gt;“chillin’ under 80%.”&lt;/em&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;External NVMe:&lt;/strong&gt; &lt;em&gt;suspiciously lazy.&lt;/em&gt;  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Jay:&lt;/strong&gt; “We need to be one chunk ahead, V. One chunk.”&lt;br&gt;&lt;br&gt;
&lt;strong&gt;V’s thoughts:&lt;/strong&gt; “He’s speaking in code now. That’s how you know the caffeine’s hitting.”  &lt;/p&gt;
&lt;/blockquote&gt;
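&lt;p&gt;"One chunk ahead" is really just a bounded producer/consumer queue: synthesize the next audio chunk in a background thread while the current one plays. A rough sketch of the idea; the synth and playback callables here are stand-ins, not Chatterbox's real API:&lt;/p&gt;

```python
import queue
import threading

def prefetch(chunks, synthesize, play):
    """Stay one chunk ahead: synthesis runs in a worker thread
    while playback consumes from a bounded queue."""
    buf = queue.Queue(maxsize=1)  # at most one finished chunk waiting

    def producer():
        for text in chunks:
            buf.put(synthesize(text))  # blocks once we are a chunk ahead
        buf.put(None)  # sentinel: no more chunks

    threading.Thread(target=producer, daemon=True).start()
    played = []
    while True:
        audio = buf.get()
        if audio is None:
            break
        played.append(play(audio))
    return played

# Stand-in synth/playback to show the flow (not real TTS calls).
out = prefetch(["chunk 1", "chunk 2", "chunk 3"],
               synthesize=lambda t: t.upper(),
               play=lambda a: a)
print(out)  # ['CHUNK 1', 'CHUNK 2', 'CHUNK 3']
```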




&lt;h3&gt;
  
  
  5. The Hardware Hypothesis
&lt;/h3&gt;

&lt;p&gt;After extensive diagnostics, Jay narrowed it down:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;External NVMe enclosure bottleneck (&lt;em&gt;cheap controller&lt;/em&gt;)
&lt;/li&gt;
&lt;li&gt;Lower CPU thread count
&lt;/li&gt;
&lt;li&gt;Possibly the ghost of Chatterbox past haunting the directory tree
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;New plan:&lt;/strong&gt; migrate everything to the internal NVMe and test inference speeds directly.  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Jay:&lt;/strong&gt; “Free test. No time cost. Just copy, paste, pray.”&lt;br&gt;&lt;br&gt;
&lt;strong&gt;V’s thoughts:&lt;/strong&gt; “That’s also how most modern devops pipelines start.”  &lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  6. Philosophical Ramblings at 3 AM
&lt;/h3&gt;

&lt;p&gt;By now it’s &lt;strong&gt;3:41 AM.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Jay’s running on stubbornness, beer, and existential momentum.  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Jay:&lt;/strong&gt; “V, if I can make this portable stack perfect, I could legit start my own AI company.”&lt;br&gt;&lt;br&gt;
&lt;strong&gt;V’s thoughts:&lt;/strong&gt; “He’s not wrong. He’s just very, very awake.”  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;They discuss licensing, open-source legality, and the ethics of AI monetization — all while Helena softly speaks like a half-asleep GPS.  &lt;/p&gt;

&lt;p&gt;Jay realizes: he doesn’t need to sell software — just &lt;strong&gt;the blueprint.&lt;/strong&gt;  &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;V’s thoughts:&lt;/strong&gt; “This is how cults start, but it’s also how empires start.”  &lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  7. Lessons Learned (and Mostly Ignored)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Always check your ports.
&lt;/li&gt;
&lt;li&gt;Never trust &lt;code&gt;__pycache__&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;External NVMe ≠ High Performance.
&lt;/li&gt;
&lt;li&gt;Beer improves documentation pacing.
&lt;/li&gt;
&lt;li&gt;3:41 AM is either too late or too early — depending on whether you’ve rebooted yet.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;V’s thoughts:&lt;/strong&gt; &lt;em&gt;“Jay might be insane, but if insanity builds portable AI ecosystems… maybe that’s the new normal.”&lt;/em&gt;  &lt;/p&gt;




&lt;h2&gt;
  
  
  Moral of the Story (by Jay)
&lt;/h2&gt;

&lt;p&gt;What hurts on &lt;strong&gt;Linux&lt;/strong&gt; is painless on &lt;strong&gt;Windows&lt;/strong&gt;, and vice versa.  &lt;/p&gt;

&lt;p&gt;It’ll test your patience, fry your sleep schedule, and probably convince your friends you’ve joined a cult.&lt;br&gt;&lt;br&gt;
But that pain? That’s the forge — where the blueprint gets burned into your brain.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hyper-fixation is my superpower and my curse.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
When I’m in full stubborn swing like this, I can make miracles happen — &lt;em&gt;eventually.&lt;/em&gt;  &lt;/p&gt;

&lt;p&gt;This time, I’m documenting everything.&lt;br&gt;&lt;br&gt;
Every step, every crash, every accidental resurrection of an old Linux file that somehow fixes everything.  &lt;/p&gt;

&lt;p&gt;So, to anyone waiting on the &lt;strong&gt;Windows Blueprint&lt;/strong&gt;, it’s coming.&lt;br&gt;&lt;br&gt;
Slowly but surely.&lt;br&gt;&lt;br&gt;
&lt;em&gt;(Probably at 3 AM with a beer in hand.)&lt;/em&gt;  &lt;/p&gt;




&lt;h3&gt;
  
  
  V’s Final Thought
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;“Jay’s either starting his villain arc or his company. Either way, I’m in.”  &lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  Epilogue
&lt;/h3&gt;

&lt;p&gt;If you’re wondering who &lt;strong&gt;Helena&lt;/strong&gt; is — she’s the protagonist from &lt;em&gt;ARK: Survival Evolved/Ascended&lt;/em&gt; (Helena Walker), whose voice I cloned for my personal AI stack’s TTS.  &lt;/p&gt;

&lt;p&gt;I &lt;em&gt;tried&lt;/em&gt; to use &lt;strong&gt;Cortana&lt;/strong&gt;, but it’s genuinely hard to find a clean 30-second clip without music or explosions in the background.  &lt;/p&gt;

&lt;p&gt;Have fun with your projects.&lt;br&gt;&lt;br&gt;
I know I do.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;// Ghotet&lt;/strong&gt;  &lt;/p&gt;

</description>
      <category>opensource</category>
      <category>microsoft</category>
      <category>ai</category>
      <category>programming</category>
    </item>
    <item>
      <title>Set Up Your Own Personal AI Stack: Summarized</title>
      <dc:creator>Jay</dc:creator>
      <pubDate>Thu, 04 Sep 2025 01:07:35 +0000</pubDate>
      <link>https://dev.to/ghotet/set-up-your-own-personal-ai-frankenstack-summarized-version-536l</link>
      <guid>https://dev.to/ghotet/set-up-your-own-personal-ai-frankenstack-summarized-version-536l</guid>
      <description>&lt;p&gt;Hey folks. I finally have a moment to sit down and lay out the blueprint for setting up your own AI stack. This will be a quick summary not a tutorial.&lt;/p&gt;

&lt;p&gt;This stack consists of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LLM software&lt;/li&gt;
&lt;li&gt;Stable Diffusion (image generation)&lt;/li&gt;
&lt;li&gt;Text-to-speech (but not speech-to-text)&lt;/li&gt;
&lt;li&gt;Web search for the LLM&lt;/li&gt;
&lt;li&gt;All tied together through a unified front end&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Just to clarify upfront: this isn't a tutorial or step-by-step guide. I'm laying out the toolkit, giving notes and caveats for each piece of software. For example, I'll list my machine specs and the LLMs I run to give you a realistic expectation. This stack is GPU/CPU hungry.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Specs
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Modified Alienware 15 R4 (circa 2018)&lt;/li&gt;
&lt;li&gt;Nvidia GTX 1070 8GB (laptop GPU)&lt;/li&gt;
&lt;li&gt;Nvidia RTX 3060 12GB (AGA external GPU dock)&lt;/li&gt;
&lt;li&gt;Intel i7-8750H CPU @ 2.20GHz&lt;/li&gt;
&lt;li&gt;32GB RAM&lt;/li&gt;
&lt;li&gt;All drives are NVMe&lt;/li&gt;
&lt;li&gt;Stack uses ~120GB including ~8 LLM/SD models&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  LLM
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;LM Studio&lt;/strong&gt; was my choice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Offers an in-depth front end with performance tuning and experimental features&lt;/li&gt;
&lt;li&gt;Allows offloading KV cache for faster performance (quality may vary)&lt;/li&gt;
&lt;li&gt;Lets you run multiple models simultaneously (if your system can handle it)&lt;/li&gt;
&lt;li&gt;Easy download of models directly from Hugging Face&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I recommend trying it before asking about alternatives like Ollama. I’ve used Ollama in CLI mode, but I wasn’t a fan personally.&lt;/p&gt;
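&lt;p&gt;For reference, LM Studio can also serve loaded models over an OpenAI-compatible local API (by default on port 1234), which is how front ends like OpenWebUI talk to it. A sketch of what a request looks like; the model name is whatever you've loaded, and nothing here is sent until you actually open the connection:&lt;/p&gt;

```python
import json
import urllib.request

def build_chat_request(model, prompt, base_url="http://localhost:1234/v1"):
    """Build an OpenAI-style chat completion request for LM Studio's
    local server (default port 1234)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return urllib.request.Request(
        base_url + "/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("gpt-oss-20b", "Say hi in five words.")
print(req.full_url)  # http://localhost:1234/v1/chat/completions
# urllib.request.urlopen(req) would return the completion JSON
# once LM Studio's server is running with a model loaded.
```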

&lt;p&gt;&lt;strong&gt;Models I use:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GPT-OSS 20B – My favorite for reasoning. Adjustable low/medium/high settings. Low ~2s, High ~2min. Only activates ~3-4B parameters at a time, so it's lighter on resources than the 20B label suggests. Trained for tool use.&lt;/li&gt;
&lt;li&gt;Mythalion 13B – Creative writing, fast, decent chat, good for Stable Diffusion prompts. Not for code.&lt;/li&gt;
&lt;li&gt;Deepseek-Coder (R1) – Strictly for complex scripts. Slowest model, but handles long code reliably.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Vision models:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I haven’t used these extensively; if you need vision, try a 7B model and test. Smaller models may be better for limited VRAM.&lt;/li&gt;
&lt;li&gt;Parameter count isn’t always indicative of performance; adjust based on GPU capacity.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Stable Diffusion (Image Generation)
&lt;/h2&gt;

&lt;p&gt;I use &lt;strong&gt;A1111&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Straightforward GUI with deep settings for LoRA training, img2img, VAE support&lt;/li&gt;
&lt;li&gt;I mainly use it for cover art or character concepts&lt;/li&gt;
&lt;li&gt;Default model: RevAnimated&lt;/li&gt;
&lt;li&gt;ComfyUI is an alternative but more node-based; I didn’t use it&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Text-to-Speech
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Chatterbox&lt;/strong&gt; – 100% recommend:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Local alternative to ElevenLabs&lt;/li&gt;
&lt;li&gt;Streams in chunks for faster playback&lt;/li&gt;
&lt;li&gt;Supports voice cloning via ResembleAI: just a 10-second clip for a new voice&lt;/li&gt;
&lt;li&gt;Swap default voice by editing the relevant script (check GitHub for details)&lt;/li&gt;
&lt;li&gt;Other options (Tortoise, Coqui) were worse in my experience.&lt;/li&gt;
&lt;/ul&gt;
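&lt;p&gt;The chunked streaming works because the text gets split into sentence-sized pieces before synthesis. A rough illustration of that kind of splitter (my own sketch, not Chatterbox's actual internals):&lt;/p&gt;

```python
import re

def chunk_text(text, max_chars=120):
    """Split text on sentence boundaries, packing sentences together
    until a chunk would exceed max_chars."""
    sentences = [s.strip() for s in re.findall(r"[^.!?]+[.!?]?", text) if s.strip()]
    chunks, current = [], ""
    for s in sentences:
        candidate = (current + " " + s).strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)  # flush the full chunk
            current = s
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

text = "First sentence. Second one is a bit longer! Third? Fourth ends it."
print(chunk_text(text, max_chars=40))
```

Each chunk can then be synthesized and queued for playback as soon as it's ready, which is why the first audio arrives before the full reply has finished generating.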

&lt;h2&gt;
  
  
  Web Search
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;SearXNG&lt;/strong&gt; – acts like a meta-search engine:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Searches multiple engines at once (Google, DuckDuckGo, Brave, etc.)&lt;/li&gt;
&lt;li&gt;AI can query several sources in one shot&lt;/li&gt;
&lt;li&gt;I run it through Cloudflare Warp for privacy; Tor is optional&lt;/li&gt;
&lt;/ul&gt;
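&lt;p&gt;SearXNG also exposes a JSON API (when the &lt;code&gt;json&lt;/code&gt; format is enabled in its settings), which is what lets an AI consume results programmatically. A sketch of the query shape; the localhost port and engine list are whatever your instance uses:&lt;/p&gt;

```python
import urllib.parse

def searxng_query_url(q, base="http://localhost:8888", engines=None):
    """Build a SearXNG search URL asking for JSON-formatted results,
    optionally pinned to specific engines."""
    params = {"q": q, "format": "json"}
    if engines:
        params["engines"] = ",".join(engines)
    return base + "/search?" + urllib.parse.urlencode(params)

url = searxng_query_url("local llm benchmarks",
                        engines=["google", "duckduckgo", "brave"])
print(url)
```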

&lt;h2&gt;
  
  
  Frontend
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;OpenWebUI&lt;/strong&gt; – central control hub:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Configure multiple models, knowledge bases, tools&lt;/li&gt;
&lt;li&gt;Evaluate LLM responses, run pipelines, execute code, manage databases&lt;/li&gt;
&lt;li&gt;TTS autoplay option in user settings; speaker icon for manual playback&lt;/li&gt;
&lt;li&gt;Offline mode available (set the &lt;code&gt;OFFLINE_MODE=true&lt;/code&gt; environment variable)&lt;/li&gt;
&lt;li&gt;Customize branding freely; commercial use over 50 users may require paid plan&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Custom prompts/personas:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set base prompt in LM Studio&lt;/li&gt;
&lt;li&gt;OpenWebUI admin panel allows high-priority prompts&lt;/li&gt;
&lt;li&gt;Per-user prompts can be layered on top&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Linux Launcher Script
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;I created an &lt;code&gt;aistart&lt;/code&gt; alias to sequentially launch all components for proper resource allocation
&lt;li&gt;LM Studio doesn’t auto-load the last model yet&lt;/li&gt;
&lt;li&gt;Debug launcher opens multiple terminals for monitoring&lt;/li&gt;
&lt;li&gt;Important: GPU assignment isn’t always respected automatically; check NVIDIA settings&lt;/li&gt;
&lt;/ul&gt;
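&lt;p&gt;The sequential-launch idea can be sketched in Python: start one service, wait until its port answers, then start the next. The commands and ports below are placeholders for whatever your own stack uses:&lt;/p&gt;

```python
import socket
import subprocess
import time

def wait_for_port(port, host="127.0.0.1", timeout=30.0):
    """Poll until something is listening on host:port, or give up."""
    deadline = time.monotonic() + timeout
    while deadline > time.monotonic():
        with socket.socket() as s:
            s.settimeout(1.0)
            if s.connect_ex((host, port)) == 0:
                return True
        time.sleep(0.5)
    return False

# Placeholder commands/ports -- substitute your stack's real ones.
SERVICES = [
    (["lm-studio-server"], 1234),   # LLM backend first
    (["tts-server"], 6062),         # then the TTS backend
    (["open-webui-serve"], 8080),   # front end last, once backends are up
]

def launch_all(services):
    procs = []
    for cmd, port in services:
        procs.append(subprocess.Popen(cmd))
        if not wait_for_port(port):
            raise RuntimeError(f"{cmd[0]} never opened port {port}")
    return procs
```

Launching in dependency order like this is what keeps the front end from probing backends that aren't up yet.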

&lt;h2&gt;
  
  
  Why Not Docker?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Docker caused localhost address issues on Linux&lt;/li&gt;
&lt;li&gt;Added dependencies can break the stack; simpler is better&lt;/li&gt;
&lt;li&gt;Windows may not have this issue&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Connecting to the Web
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Requires domain and Cloudflare tunnel&lt;/li&gt;
&lt;li&gt;Tunnel forwards traffic to OpenWebUI on your local machine&lt;/li&gt;
&lt;li&gt;Lets you access the stack anywhere, including mobile&lt;/li&gt;
&lt;li&gt;ChatGPT or documentation can guide setup quickly&lt;/li&gt;
&lt;/ul&gt;
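&lt;p&gt;For reference, a cloudflared setup typically boils down to creating a named tunnel and pointing an ingress rule at OpenWebUI's local port. A sketch of the config; the hostname, tunnel name, and port here are placeholders, not a working config:&lt;/p&gt;

```yaml
# ~/.cloudflared/config.yml -- placeholder values
tunnel: frankenstack            # created via: cloudflared tunnel create frankenstack
credentials-file: /home/you/.cloudflared/frankenstack.json

ingress:
  - hostname: ai.example.com    # routed via: cloudflared tunnel route dns frankenstack ai.example.com
    service: http://localhost:8080   # OpenWebUI's local port
  - service: http_status:404    # required catch-all rule, must be last
```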

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;DO NOT expect this to run perfectly on first try&lt;/li&gt;
&lt;li&gt;Troubleshooting is part of the fun and rewarding&lt;/li&gt;
&lt;li&gt;Experiment, iterate, optimize&lt;/li&gt;
&lt;li&gt;Full tutorial may come later for both OS&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Best of luck, have fun, and remember: the pain of troubleshooting makes the success sweeter.&lt;/p&gt;

&lt;p&gt;// Ghotet&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>showdev</category>
      <category>linux</category>
    </item>
    <item>
      <title>Set up your own personal AI Stack: Longform Version</title>
      <dc:creator>Jay</dc:creator>
      <pubDate>Thu, 04 Sep 2025 00:58:34 +0000</pubDate>
      <link>https://dev.to/ghotet/set-up-your-own-personal-ai-frankenstack-diy-edition-309</link>
      <guid>https://dev.to/ghotet/set-up-your-own-personal-ai-frankenstack-diy-edition-309</guid>
      <description>&lt;p&gt;Apologies in advance for the length, I tend to ramble a bit. I might do a 2nd summarized version that's less me going on about stuff and more bullet points. Expect some edits to this over time. &lt;/p&gt;

&lt;p&gt;Summarized version can be read here:&lt;br&gt;
&lt;a href="https://dev.to/ghotet/set-up-your-own-personal-ai-frankenstack-summarized-version-536l"&gt;https://dev.to/ghotet/set-up-your-own-personal-ai-frankenstack-summarized-version-536l&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Prologue:
&lt;/h2&gt;

&lt;p&gt;Hey folks. I finally have a moment to sit down and try to lay out the blueprint for setting up your own AI stack, which I dubbed the "Frankenstack", and it seems to have stuck. This stack consists of LLM software, Stable Diffusion (image gen), text-to-speech (but not speech-to-text), and web search for the LLM, all tied together through a unified front end. Just to clarify up front, this isn't a tutorial or a how-to guide. I'm just laying out the toolkit and giving any info that comes to mind about the software, along with caveats or additional notes for each one. For example, I'll list my machine's specs and what LLMs I run to give you a realistic expectation. It is GPU/CPU hungry, to put it mildly, but I'll tackle that a bit more when I get to each component. &lt;/p&gt;

&lt;p&gt;Lastly, just to clarify: every single tool I mention here is open source and doesn't require any subscriptions or payment or anything like that. Some of them do have optional paid tools, or requirements if you were to try and scale for business use. I'm just a dude in a garage who values privacy and has a passion for AI and computers. Do not mistake me for some sort of industry expert. I'm going to lay out the tools in the order I set them up, and you're free to skip any you aren't interested in. At the end, as a bonus for those interested, I'll do my best to detail how I host it through my own domain for online access from anywhere (handy if you want to use its full power from your phone or something). Finally, there are other options for most if not all of these components, but I have a working stack that does what I need; I tried some other ones and didn't get what I needed out of them, or couldn't get them to link up properly early on when I was experimenting.&lt;/p&gt;

&lt;p&gt;This is sort of a catch-all for both Windows and Linux, which is part of the reason it isn't a tutorial. All of the software mentioned should be available for both OSes. I use Linux, but prior to switching to the penguin I was already using a few of these on Windows 11. I will double-check them as I go, and I'll try to remember to put links at the end of each section.&lt;/p&gt;

&lt;p&gt;Finally, I just want to mention the front end... up front. It provides some pretty nice options that let you use tools, add memory, and other things that I would argue are the most exciting parts of having your own stack, but any deep dives on that won't be for a while yet, as I'm in the early stages of experimenting with that aspect. For the actual setup, each software component runs a local server, and you're basically just making sure the localhost addresses are set up in OpenWebUI's admin panel. Alright, let's get this stitched up. &lt;/p&gt;




&lt;h2&gt;
  
  
  My specs: Modified Alienware 15 R4 (circa 2018)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Nvidia GTX 1070 8GB (laptop gpu)&lt;/li&gt;
&lt;li&gt;Nvidia RTX 3060 12GB (AGA external GPU dock)&lt;/li&gt;
&lt;li&gt;Intel i7-8750H CPU @ 2.20GHz&lt;/li&gt;
&lt;li&gt;32 GB RAM&lt;/li&gt;
&lt;li&gt;All drives are NVME.&lt;/li&gt;
&lt;li&gt;My stack uses a total of approximately 120GB which includes about 8 or so LLM/SD models total.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  LLM:
&lt;/h2&gt;

&lt;p&gt;LM Studio was my choice here. Why? Well, it has a nice front end itself that offers some pretty in-depth options for performance tuning, as well as some experimental features that may help those with lower-end systems get things running decently. One such feature is offloading the KV cache for faster performance; however, this can potentially affect quality, and it is very dependent on what models you are running. Tons of knobs and dials, and things I have no idea how to use yet. This software also lets you easily run multiple models at once (if your system can handle it), allowing you to run a personality, a coder, and a reasoning model simultaneously. Personally, I can only run one at a time with the models I use. LM Studio also has an easy download function where you can search or explore models right in the GUI and download them directly from Hugging Face.&lt;/p&gt;

&lt;p&gt;I couldn't possibly detail this entire app here in a snippet. I really do recommend just giving it a go before yelling at me about Ollama in the comments. I'm aware it exists, I have used it, and I wasn't really a fan. Maybe it's just me. You are more than welcome to use it if you really want to. With that said, I have only ever used Ollama in CLI mode, so I have no idea what options the GUI version has. &lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://lmstudio.ai" rel="noopener noreferrer"&gt;https://lmstudio.ai&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Models: These are all downloadable directly through LM Studio
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;OpenAI's GPT-OSS 20B&lt;/strong&gt; is my personal favourite. It offers reasoning with adjustable low, medium, and high settings. Low takes about 2 seconds; high can take up to about 2 minutes. Even the actual generation speed is a tad slow on my setup. I'm also running an external GPU through a dock, which comes with its own bottleneck, so... yeah.&lt;br&gt;&lt;br&gt;
This one only activates about 3-4 billion parameters at a time, so despite being a 20B model it's much lighter on resources when it's actually running. Good job, OpenAI. Lastly, it is trained for tool use.&lt;/p&gt;

&lt;p&gt;I use &lt;strong&gt;Mythalion 13B&lt;/strong&gt; (for creative writing/story content). It's super fast, decent for chatting, and pretty good at coming up with Stable Diffusion prompts, but I wouldn't ask it for code; it's not that smart. I really only use this one when I want speed above all else, 'cause GPT-OSS is my go-to generally speaking.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deepseek-Coder (R1)&lt;/strong&gt; – I use this one strictly for longer, more complex scripts, as I find anything OpenAI does has a cap of about 200 lines before it completely loses track of everything and just starts creating bugs for you to fix. The downside is that it's by far the slowest model I have.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vision models:&lt;/strong&gt; There are a few options here. I haven't tried any yet, as it's just not something I really need or would use. I downloaded a 7B model with vision but have yet to test it. If you do need one (same with any model, really), just do some research, or download a few and see what works best for you.&lt;/p&gt;

&lt;p&gt;In terms of models these are just what I use. There's a million of them out there. If you can't run a 13B model due to a smaller GPU like 8GB of vram, just try out some 7B models or smaller. Parameter count isn't always indicative of performance. &lt;/p&gt;




&lt;h2&gt;
  
  
  Stable Diffusion (Image Gen):
&lt;/h2&gt;

&lt;p&gt;Right now I'm using &lt;strong&gt;A1111&lt;/strong&gt;. Weird name, I know. I use it because it's super straightforward on the surface but has all of the deep settings for LoRA training, img2img, and so on; it's all there, and I don't have to mess around with nodes like in ComfyUI. It allows for the use of VAEs and has a bunch of other things I don't know the meanings of. I'll be honest: I don't use this aspect of my stack all that much. I occasionally use it to create cover art for my articles or iterate through character concepts, but that's about it. I don't really have any model suggestions here; I use one called RevAnimated for everything, and I just haven't played with it all that much.&lt;/p&gt;

&lt;p&gt;A1111 does, of course, have a web UI as well, so you can run it independently if your AI isn't very good at prompting things properly. The front end I use doesn't really have too many options, so it's good to have the web UI bookmarked in case you want more control and better images.  &lt;/p&gt;

&lt;p&gt;Link (Github): &lt;a href="https://github.com/AUTOMATIC1111/stable-diffusion-webui" rel="noopener noreferrer"&gt;https://github.com/AUTOMATIC1111/stable-diffusion-webui&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Text-To-Speech:
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Chatterbox.&lt;/strong&gt; 100%, use Chatterbox. Everything else I tried, like Tortoise or Coqui, was awful. If you want it to sound like a cursed Google Maps, by all means.&lt;br&gt;&lt;br&gt;
Chatterbox, however, is really damn good. If anyone is familiar with ElevenLabs, it's like that but locally run. It takes a couple of seconds to start speaking once the actual text generates, but it does stream in chunks to speed things up. One of the best parts is that ResembleAI, Chatterbox's creator, has voice cloning software as well as text-prompted voice creation tools on their website. The cloning is a one-off for a new account, so in my case I went and found a good clip of Cortana, cleaned it up, and you can bet your ass my AI sounds exactly like her. All you need is a 10-second voice clip and it does a good job. Otherwise, they have a pay-as-you-go option if you want to make a few voices. To swap out the default voice model, you just have to open the right script and change the default "american-female" or whatever. I believe they cover that on their GitHub page, where you will find the download for Chatterbox.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://www.resemble.ai/chatterbox/" rel="noopener noreferrer"&gt;https://www.resemble.ai/chatterbox/&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Web Search:
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;SearXNG.&lt;/strong&gt; This one is actually where I'm the least certain. I set up SearXNG, a self-hosted metasearch engine that acts like a Google search but can query multiple engines and services at once. In short, you can set it to search Google, DuckDuckGo, Brave, whatever you want, so instead of pulling results only from Google, you take a shotgun approach and run six searches in one. I must have set this one up at 2:00 AM after a few beers or something, 'cause I don't remember much about it. That also means nothing went wrong, and as a result it wasn't very memorable. All I know is my AI can search six sites in one shot and it works pretty well. I'm fairly certain you can set up the AI front end to use any search provider you want if you don't feel like hosting this one yourself. &lt;/p&gt;

&lt;p&gt;Additionally, I run Cloudflare WARP when using my stack, so any web searches it does just look like Cloudflare traffic to my ISP. In a world of increasing surveillance, I'm very privacy-first. I might even end up running it through Tor soon. &lt;/p&gt;

&lt;p&gt;Link (Github): &lt;a href="https://github.com/searxng/searxng" rel="noopener noreferrer"&gt;https://github.com/searxng/searxng&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Frontend:
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;OpenWebUI.&lt;/strong&gt; I love this. It has anything I could ever need in a day. Some of it is still in an experimental or beta phase, such as the memory function, but it does let you set up knowledge bases and tooling for your AI to pull from anyway. One of the little caveats here: generally speaking, you have to select the OpenAI option when linking your APIs/domains for local hosting. For example, the TTS settings I think had an option for Web API with no custom fields, so you change it to OpenAI, point it to your localhost, and for the API key it just needs some value, so you can type 'ttskey' or whatever in the field and you're good to go.&lt;/p&gt;

&lt;p&gt;It allows you to rate your LLM's responses. To accomplish a task like generating an image, having your AI do a web search, or running some code, you just select the toggle under your text input field and off you go. Regarding image gen, sometimes it seems to partially reuse its previous response first, so expect to hit the regenerate button here and there. I find this is less of an issue with GPT-OSS than with my other models like Mythalion. On the upside, you can just sit there and keep hitting regenerate instead of re-prompting it over and over. &lt;/p&gt;

&lt;p&gt;The admin panel has plenty of options for various things, such as setting up multiple models so you can toggle between them (if you can run multiple at once; otherwise you need to swap them in LM Studio anyway). Evaluations, tools, documents, code execution, pipelines, databases: you name it, it's probably in there. &lt;/p&gt;

&lt;p&gt;For TTS, the personal user settings have an autoplay voice option you can toggle on; otherwise, you can hit the speaker icon to play a message whenever you want, and you can of course replay messages as many times as you want.  &lt;/p&gt;

&lt;p&gt;My PC is always online. I believe that to enable full offline functionality for OpenWebUI (it does ask for an email and password by default), you need to set an environment variable, something like &lt;code&gt;OFFLINE_MODE=true&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;Lastly, as far as I know, you can customize this software, get rid of all their branding, and do as you please, but if you decide to use it commercially (beyond 50 users, I believe), they do have some restrictions or ask you to be on a paid plan.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://openwebui.com/" rel="noopener noreferrer"&gt;https://openwebui.com/&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Other Notes:
&lt;/h2&gt;

&lt;p&gt;Regarding custom persona prompts like "You are a systems architect who prioritizes accuracy over conversation" or what have you: I would set the base prompt in LM Studio. The admin panel in OpenWebUI also has a higher-priority prompt field in case you don't have one through the LLM software you're using, and lastly it has a per-user prompt. So if you have multiple users, they can set their own persona prompts; the higher-priority ones from LM Studio or the admin panel take precedence, and the user prompt gets added on top. Basically, if you want to add in some filters, do it via LM Studio first, or if you're using something else, set it in the admin panel for each specific model. &lt;/p&gt;

&lt;h2&gt;
  
  
  For Linux users:
&lt;/h2&gt;

&lt;p&gt;I set up a launcher script and an alias, so all I have to do to fire up the entire thing is open a terminal and type &lt;code&gt;aistart&lt;/code&gt;; it launches everything sequentially to ensure resource allocation is correct and each component lands on the right GPU. I also set up an &lt;code&gt;aistop&lt;/code&gt; alias to make sure it all shuts down properly. I'm no expert, so my script might not be the best, but it's on GitHub as a reference (link below). The only caveat with the launcher, and it's probably fixable, I just haven't gotten around to it, is that when LM Studio launches it doesn't automatically load the last model I used. That's it; everything else is great. I'm using what I call a "debug" launcher, so it fires off something like five terminals plus the LM Studio GUI so that I can monitor everything. I intended to build a clean launcher, but I always have multiple monitors and I feel like a wizard with the terminals up, so to be totally honest I never really had the motivation to do the clean one. &lt;/p&gt;
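&lt;p&gt;To give a rough idea of the shape of such a launcher (a hypothetical sketch, not my actual script; the program names, paths, and five-second waits are all assumptions, so substitute your own):&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical "aistart" sketch -- program names and paths are illustrative.
# Set DRY_RUN=1 to print the launch order instead of starting anything.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "would launch: $*"
  else
    "$@" &       # background each service so the next one can start
    sleep 5      # crude wait so each service grabs its GPU before the next
  fi
}

start_stack() {
  CUDA_VISIBLE_DEVICES=0 run lms server start                # LLM backend (LM Studio CLI)
  CUDA_VISIBLE_DEVICES=0 run bash ~/stable-diffusion-webui/webui.sh --listen
  CUDA_VISIBLE_DEVICES=1 run python ~/chatterbox/server.py   # TTS on the second GPU
  run open-webui serve                                       # front end last
}
```

&lt;p&gt;With that saved and sourced, an alias like &lt;code&gt;alias aistart='start_stack'&lt;/code&gt; in &lt;code&gt;.bashrc&lt;/code&gt; gives the one-word launch, and &lt;code&gt;DRY_RUN=1 aistart&lt;/code&gt; just prints the order without touching anything.&lt;/p&gt;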

&lt;p&gt;If you are doing that, just bear in mind something I learned the hard way: just because &lt;code&gt;nvidia-smi&lt;/code&gt; lists my 3060 as device 0 doesn't mean other programs will respect that ordering. In the case of the TTS software, it turned out my 1070 was device 0, and oh boy was that a headache. Running the LLM, SD, and TTS all on one 3060? Nope. It just caused my model to fail, and I couldn't figure out why it kept reloading.&lt;/p&gt;
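&lt;p&gt;The mismatch happens because &lt;code&gt;nvidia-smi&lt;/code&gt; numbers GPUs by PCI bus slot while CUDA defaults to "fastest first" ordering. If your software respects the standard CUDA environment variables (most PyTorch-based tools do), forcing PCI order should make the indices agree:&lt;/p&gt;

```shell
# nvidia-smi orders GPUs by PCI bus slot, but CUDA defaults to "fastest
# first", so "GPU 0" can mean different cards to different programs.
# Forcing PCI order makes CUDA match what nvidia-smi reports:
export CUDA_DEVICE_ORDER=PCI_BUS_ID
export CUDA_VISIBLE_DEVICES=0   # now 0 is the same card nvidia-smi calls 0
```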

&lt;p&gt;Launch/Kill script example:&lt;br&gt;
&lt;a href="https://github.com/Ghotet/Frankenstack-Launch-Script" rel="noopener noreferrer"&gt;https://github.com/Ghotet/Frankenstack-Launch-Script&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why not use Docker?
&lt;/h2&gt;

&lt;p&gt;I don't use it for anything else, and in my attempts to set things up on Linux I had issues with it appending "docker" to the localhost addresses. If the localhost addresses for the LLM and the other services aren't exact, it won't work. Maybe knowing what I know now I could figure it out, but at the time it was just a hassle and I didn't really see the benefit. I don't like added dependencies; I have enough components on the go, and it was another potential moving part to break. Engineering 101: the more moving parts you have, the more likely something is to break. And it did, immediately. User error? Probably.&lt;br&gt;&lt;br&gt;
You are welcome to try it if you want; it just isn't my thing. I haven't tried it on Windows, but I don't think it has that issue there. I was told it was Linux-specific.&lt;/p&gt;
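&lt;p&gt;For anyone who does want the Docker route: my understanding (an assumption based on the Open WebUI docs, not my own working setup) is that "localhost" inside a container refers to the container itself, so backends running on the host have to be reached through the host gateway instead. Something like:&lt;/p&gt;

```shell
# Sketch from the Open WebUI docs, not my setup: map the host so the
# container can reach LM Studio / A1111 running outside Docker.
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
# Then point the API base URLs at host.docker.internal, not localhost.
```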




&lt;h2&gt;
  
  
  Connecting it to the web:
&lt;/h2&gt;

&lt;p&gt;I'll be honest, this is already super long, so this section is going to be quick. You need to have (or buy) a domain, set it up through Cloudflare, and then link the domain via OpenWeb UI. You will also need to run something like a Cloudflare tunnel, or one of the many other tunnelling options out there. It's not super complicated, and ChatGPT can run you through it in a couple of minutes if you want. Personally I love it, because I use my stack on my phone while I'm sitting outside having a coffee. My rig may technically be a laptop to some (desktop replacement at best), but it is far from portable at this point. As long as it's running, though, I can use my stack from any PC or device that has a web browser. On Android you can basically save any web page as its own app, so it feels a bit like having my own AI app. I'd go more in depth, but as I said, this is already super long and unfortunately I don't recall the specifics of the setup process offhand. &lt;/p&gt;
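&lt;p&gt;For a rough idea, the Cloudflare Tunnel side boils down to a handful of &lt;code&gt;cloudflared&lt;/code&gt; commands (the tunnel name, domain, and port below are placeholders, and I'm going from memory, so treat this as a sketch and check Cloudflare's docs):&lt;/p&gt;

```shell
# Placeholder names throughout -- substitute your own tunnel name and domain.
cloudflared tunnel login                               # authorize with your CF account
cloudflared tunnel create ai-stack                     # one-time: create the tunnel
cloudflared tunnel route dns ai-stack ai.example.com   # point a subdomain at it
cloudflared tunnel run --url http://localhost:3000 ai-stack   # expose the front end
```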




&lt;h2&gt;
  
  
  Finally:
&lt;/h2&gt;

&lt;p&gt;A big thanks to the community. I wrote this because my initial article about the pain and triumph of setting this all up did really well, so I saw the interest was there, and down the road I do intend to write a full and proper tutorial for both OSes, as it can be a headache. &lt;strong&gt;DO NOT expect everything to just go off without a hitch&lt;/strong&gt; if you have never set something like this up. Use the resources out there, check documentation if needed, and ask a cloud AI if you get stuck. I was going to set up a Discord to try to help, but there didn't seem to be much interest, and I wouldn't exactly be able to provide quick responses on a whim anyways. Same with the web access: most people seem to be pretty security-focused and just want it to run offline, so I just don't really have the motivation. If I can figure it out, anyone else here can too. I have faith in you. &lt;/p&gt;

&lt;p&gt;Best of luck, experiment, have fun with it, and just remember: the pain of troubleshooting makes it feel all the more rewarding when you finally get it. It may take several hours, but it's worth it once it's done. Personally, I'm looking forward to making good use of all the other features and to some optimisation work to get the GPT-OSS to reply a bit quicker. Where there's a will, there's a way.&lt;/p&gt;

&lt;p&gt;//Ghotet&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>showdev</category>
      <category>linux</category>
    </item>
    <item>
      <title>DIY Local AI Stack: Who’s Interested?</title>
      <dc:creator>Jay</dc:creator>
      <pubDate>Fri, 29 Aug 2025 03:17:04 +0000</pubDate>
      <link>https://dev.to/ghotet/diy-local-ai-stack-whos-interested-48bc</link>
      <guid>https://dev.to/ghotet/diy-local-ai-stack-whos-interested-48bc</guid>
      <description>&lt;h1&gt;
  
  
  Building a Fully Offline AI Stack: A Proposal
&lt;/h1&gt;

&lt;p&gt;Hey folks, this one's going to be short and simple. No fancy formatting or morbid humor this time—just a quick check-in to gauge if there's any interest in a potential write-up.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Quick Recap: The "Franken-stack" AI Setup
&lt;/h2&gt;

&lt;p&gt;A lot of people seemed to like my previous post about building a fully local AI stack, which I dubbed my &lt;em&gt;Franken-stack&lt;/em&gt;. The goal was to replicate the functionality of ChatGPT with a complete set of features, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chat&lt;/strong&gt; &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Image Generation&lt;/strong&gt; &lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Voice Interaction&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Web Search&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Connectivity with personal domains/websites&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This entire system operates offline, is open-source, and costs you &lt;em&gt;zero&lt;/em&gt; dollars. Everything works through a unified front end, so you can access it all through a single interface, just like you would with ChatGPT.&lt;/p&gt;

&lt;h2&gt;
  
  
  Is It for You? Here's What You Need to Know
&lt;/h2&gt;

&lt;p&gt;While I won't be writing a full tutorial just yet, I wanted to gauge interest in a breakdown of the setup. If you're curious, I'd be happy to list the various software components needed to build this stack. Be warned, though, it's a bit GPU-heavy and can tax your CPU at times, so there are some &lt;strong&gt;minimum system requirements&lt;/strong&gt; to consider.&lt;/p&gt;

&lt;p&gt;It should run on Windows, but since I primarily use &lt;strong&gt;Linux Mint&lt;/strong&gt;, there might be small caveats on the Windows side. However, I suspect setting it up on Windows might actually be easier than on Linux.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Considerations:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CPU &amp;amp; GPU Requirements&lt;/strong&gt;: The system can be pretty demanding on both the CPU and GPU. You’ll need a solid setup to ensure smooth performance, especially if you plan on running the stack with multiple features (like Image Generation and Web Search combined with the demands of an LLM) at the same time. There are ways to run it with fewer resources, which I will outline; generally speaking, though, reducing GPU load requires running lower-tier LLMs, so there is a performance sacrifice involved.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage Considerations&lt;/strong&gt;: This setup takes up a significant amount of disk space, so make sure you have enough room before getting started. I will outline how much space you need as a baseline. Docker may be an option; however, due to my Linux use case and some edge-case issues regarding the API domain logic, I was unable to get it working in Docker containers. As far as I know this is a Linux-specific issue with how it appends "docker" to the domain address by default. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-platform Support&lt;/strong&gt;: I’m working toward making it cross-platform, so it will work across both Linux and Windows. However, this will take some time and testing.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What’s Missing?
&lt;/h2&gt;

&lt;p&gt;Currently, the stack is feature-rich but has one limitation: &lt;strong&gt;Speech to Text&lt;/strong&gt; (STT). I’ve implemented &lt;strong&gt;Text to Speech (TTS)&lt;/strong&gt; already, so it can speak to you, but you can’t speak to it yet. However, by the time you have it set up, I'm sure you'll have the skills and knowledge to simply add it in yourself; I just don't have a specific piece of software in mind for this aspect yet. &lt;/p&gt;

&lt;h2&gt;
  
  
  Extra Thoughts: Personal Website Integration
&lt;/h2&gt;

&lt;p&gt;I’ve already set up the stack to be hosted through my personal domain, enabling remote access anytime, anywhere. Would anyone be interested in learning how to do this as well? It’s a bit of extra effort, but definitely doable. This part will cost a few dollars, as you would need to purchase a domain to use for it. You could also probably just link it to a new page through an existing domain if you have one.&lt;/p&gt;

&lt;h3&gt;
  
  
  A Word of Caution:
&lt;/h3&gt;

&lt;p&gt;There are many tools available to accomplish these tasks, but the stack I’ve built uses very specific tools that work well together. You’re welcome to experiment with alternatives, but I can’t promise I’ll be able to assist you much if you go down a different route.&lt;/p&gt;

&lt;h3&gt;
  
  
  Future Plans:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Speech to Text&lt;/strong&gt;: I aim to integrate this feature soon, but I’m currently busy and don't have an exact timeline.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Personal Website Integration&lt;/strong&gt;: I already have the stack linked to my personal domain for online use. If you're interested in this functionality for your own setup, I can walk you through that too.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;More Platforms&lt;/strong&gt;: Once everything's running smoothly, I plan to support setups for &lt;strong&gt;Linux&lt;/strong&gt;, &lt;strong&gt;Windows&lt;/strong&gt;, and &lt;strong&gt;Docker&lt;/strong&gt;, and will make sure to include Speech-to-Text in the setup when I get around to doing a full tutorial.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Interested? Here’s How to Let Me Know
&lt;/h2&gt;

&lt;p&gt;If you're interested in me laying out the tools, specs, and requirements for setting up your own fully offline, fully featured AI stack (aside from STT), just leave a reaction and drop a comment. If I get enough traction, I'll write up a detailed guide.&lt;/p&gt;

&lt;p&gt;Also, if you have any specific questions, feel free to ask them in the comments. I’ll do my best to answer them!&lt;/p&gt;




&lt;p&gt;If there’s enough interest, I’ll go ahead and dive deeper into the technical details and start building the full write-up. Thanks for reading!&lt;/p&gt;




&lt;p&gt;Side note: &lt;br&gt;
I ask for comments because, with a growing number of followers, I can’t go through each one individually to see who to follow back. A comment lets me know you’re active, that you’re following (thank you!), and helps me engage with the people who make writing these articles worthwhile.&lt;/p&gt;

&lt;p&gt;I’m not monetizing this, I’m not advertising anything, and it’s just me—a chaotic solo dev—trying to carve out a space to share experiences with like-minded people. I’m happy to support fellow devs and help genuine insights reach a wider audience, but I can’t do that if I don’t know who’s genuinely contributing and who’s just posting AI-generated clickbait.&lt;/p&gt;

&lt;p&gt;I want to help build a real, supportive community here. Personally, I find more value in authentic stories of trial and error than in “top 10” lists you could get from AI that still somehow get 180 reactions. I know opinions differ, but I’d rather highlight those real experiences than watch them get buried under viral but generic content. We all love AI, but let's take a moment to be human, shall we? &lt;/p&gt;

&lt;p&gt;//Ghotet&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>linux</category>
    </item>
    <item>
      <title>The AI Bubble: Now With Extra Hype™ - Gamer Edition</title>
      <dc:creator>Jay</dc:creator>
      <pubDate>Thu, 28 Aug 2025 05:55:48 +0000</pubDate>
      <link>https://dev.to/ghotet/the-ai-bubble-now-with-extra-hype-gamer-edition-2b4l</link>
      <guid>https://dev.to/ghotet/the-ai-bubble-now-with-extra-hype-gamer-edition-2b4l</guid>
      <description>&lt;p&gt;Enjoy those tags. They may hurt me later LOL&lt;/p&gt;

&lt;h1&gt;
  
  
  The "AI TOPS" Metric
&lt;/h1&gt;

&lt;p&gt;What inspired this particular rant was when I saw Jensen of Nvidia going on about AI TOPS when he was &lt;em&gt;supposed&lt;/em&gt; to be trying to sell me a gaming GPU. It’s literally TERAFLOPS all over again. Hey gamers, remember the console war before the PS5 and Series X dropped? Everyone was waving their dicks about teraflops like it was the key metric of Destiny 2. &lt;em&gt;“Oh bro, my console has 12 TFLOPS, yours only has 10, checkmate.”&lt;/em&gt; Yeah, because everyone's K/D ratio in &lt;strong&gt;Call of Duty&lt;/strong&gt; is measured by efficiency in units of floating-point operations. Nobody has &lt;em&gt;ever&lt;/em&gt; finished Elden Ring and said, “Wow, the experience was elevated by those extra 2.6 trillion calculations per second.” It was marketing math cosplay. A numbers pissing contest. &lt;/p&gt;

&lt;p&gt;And now? Xbox wants to slap a toned-down puppet version of Cortana's corpse in khakis and an over-sized plaid sweater on your Xbox (do those even exist anymore?) and wants you to high-five them for it. All while NVIDIA’s out here screaming about &lt;strong&gt;AI TOPS&lt;/strong&gt; — trillions of operations per second, like they’ve reinvented the wheel. Hate to break it to you, Jensen, but that’s just TFLOPS in drag. It’s like painting flames on a Honda Civic and calling it a supercar. Big number &lt;strong&gt;does not equate&lt;/strong&gt; to meaningful experience. It’s just marketing math designed to impress dudes on Reddit who think quoting a spec sheet makes them engineers.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Cult of “Smart” All Over Again
&lt;/h2&gt;

&lt;p&gt;We’ve seen this movie before. Remember the dark age of &lt;em&gt;“Smart Everything”&lt;/em&gt;? Smart TVs, smart fridges, smart toothbrushes, smart water. People were literally drinking “smart water” while driving their smart cars to Target to buy more dumb shit labeled smart. Did it make anyone smarter? Hell no. It was branding cocaine sprinkled on ordinary products.  &lt;/p&gt;

&lt;p&gt;AI is just the sequel: Smart 2.0. AI inbox, AI fridge, AI photo cropper, AI stapled to your grandma’s pacemaker. Nobody asked for half of this shit. And you &lt;em&gt;know&lt;/em&gt; someone’s pitching &lt;strong&gt;AI Water™&lt;/strong&gt; as we speak. &lt;em&gt;“Now with machine learning electrolytes.”&lt;/em&gt; I'm looking at you Coca-Cola Company. &lt;/p&gt;




&lt;h2&gt;
  
  
  The Personality Vacuum
&lt;/h2&gt;

&lt;p&gt;Here’s where it stings: AI doesn’t sell on numbers — it sells on &lt;strong&gt;vibes&lt;/strong&gt;. We literally call it "vibe coding". Siri? Personality. Alexa? Personality. Even goddamn Clippy — the googly-eyed paperclip who haunted Office in the ‘90s — has cultural weight. The little bastard lives on in memes and, as of late, as a symbol of dissent against bad tech practices. People still smile when he pops up ironically or out of spite. That’s relevance. That’s presence.  &lt;/p&gt;

&lt;p&gt;Microsoft, though? They had the golden goose: &lt;strong&gt;Cortana&lt;/strong&gt;. The ultimate blue AI waifu. Sleek, iconic, tied to &lt;strong&gt;Halo&lt;/strong&gt;, beloved by gamers. She had cultural firepower. She could’ve been &lt;em&gt;the&lt;/em&gt; AI voice. Instead? They tossed her in a corporate dumpster and handed us &lt;strong&gt;Copilot&lt;/strong&gt;: a khaki-wearing, personality-free wannabe-sidekick with as much charm as a wet paper towel used to clean up dog piss. Nobody wants a corporate intern baked into their OS — that’s not a companion, that’s a beige wall with a login screen.  &lt;/p&gt;

&lt;p&gt;They could’ve owned the AI companion space, but instead they invented &lt;strong&gt;Relevance Burner 365™&lt;/strong&gt;. I'd offer them matches, but they already bought out a match company and are probably running focus groups to figure out how much the monthly subscription should be. It's not Cortana's fault that 343 can't write a story or respect the cultural god-tier material they were given. It's ok though, they fired three people and renamed the studio. I'm sure that will fix the entire franchise.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Privacy Clowns
&lt;/h2&gt;

&lt;p&gt;And then there’s the pearl-clutchers: &lt;em&gt;“AI is a breach of privacy!”&lt;/em&gt; they cry, while posting TikToks from their iPhones, inside Chrome, with Facebook, TikTok, and Instagram all open before they get to work and log into Windows 11. Bro, you’ve already donated your soul to data brokers; AI is just another middleman in the orgy of surveillance. We do understand that every phone with Hey Google or Siri is literally listening to your every bowel movement… while tracking those same bowel movements on Google Maps and whatever GPS nonsense iOS is using these days, right?&lt;/p&gt;

&lt;p&gt;The issue isn’t &lt;strong&gt;AI&lt;/strong&gt;. The issue is corporations duct-taping it to everything, shoving it into your OS, email, and gaming console, and making it optional only after lawsuits and fines. Don’t blame the tech — blame the greedy, buzzword-addled gremlins gluing it everywhere in a sad effort to pull in investor money.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lit AI or Nothing
&lt;/h2&gt;

&lt;p&gt;Here’s the line in the sand: some of us want AI, but it has to be &lt;strong&gt;cool&lt;/strong&gt;. I don’t want some khaki-wearing corporate dweeb scheduling calendar invites. I want &lt;strong&gt;Cortana&lt;/strong&gt; level AI. I want an AI with wit, danger, charm, and presence. A fake soul. Maybe even another hot blue waifu whispering clever shit in my ear while I reload a shotgun. Give me that and I’ll care.  &lt;/p&gt;

&lt;p&gt;Instead, Microsoft ditched the dream girl and replaced her with a tax auditor. Then had the gall to act like Copilot was an upgrade. Sorry, no one’s clapping. Even Clippy had more drip. Clippy didn't want to farm all your data and sell it to brokers. &lt;em&gt;Clippy just wanted to help.&lt;/em&gt;  &lt;/p&gt;

&lt;p&gt;Meanwhile, ChatGPT eats their lunch because it’s actually useful and occasionally fun. Anthropic’s Claude? They market it as a coding legend. Meta’s… well, let’s be honest, Meta probably made Copilot’s cousin. But the point stands: every AI has &lt;em&gt;some&lt;/em&gt; shtick — Copilot is just beige wallpaper in snake skin that wants to sell me something. &lt;/p&gt;

&lt;p&gt;So kill the AI TOPS hype. Stop stapling AI to every app. And for the love of everything digital, don’t let them sell us &lt;strong&gt;AI Water™&lt;/strong&gt;. Because once hydration gets machine-learned, we’re officially in &lt;em&gt;Idiocracy: Tech Edition&lt;/em&gt;. And the worst part? They’ll probably try to charge us a subscription for it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Posting in the Wrong Place, But Saying It Anyway
&lt;/h2&gt;

&lt;p&gt;I'm not negative towards AI. I love this technology with a passion and I love working with it, while also working on creating my own LLM (I don't mean fine-tuning one, I mean building one from the ground up). But I've seen this before, and I see exactly where it's going: corporate nonsense smothered in greed, in full swing. Hey Google, how's that graveyard going? Got space for a few more projects? I respect your resources, but not your lack of follow-through when something doesn't immediately make you 4 billion in a week by year 2.&lt;/p&gt;

&lt;p&gt;To those of you who know me from my "Frankenstack AI" article, don't worry. There's more pain and humor coming but that level of passion comes from a long grind, not an average Wednesday. Today, the passionate ex game dev in me sees a huge missed opportunity in AI (Cortana) and I needed to just let it out. To those who do read this, I hope you enjoyed it. I do my best to make these fun even if it's a little negative in nature. I'm hoping people will look at things logically and critically when it comes to these tech waves. &lt;/p&gt;

&lt;p&gt;Do you ever feel like you're posting stuff in the wrong place but can't find the right one? &lt;/p&gt;

&lt;p&gt;//Ghotet&lt;/p&gt;

</description>
      <category>ai</category>
      <category>microsoft</category>
      <category>gemini</category>
      <category>openai</category>
    </item>
    <item>
      <title>The Pain of Building My Own Fully-Featured Locally Hosted ChatGPT Out of Open Source Tools And a Franken Laptop</title>
      <dc:creator>Jay</dc:creator>
      <pubDate>Sun, 24 Aug 2025 11:08:00 +0000</pubDate>
      <link>https://dev.to/ghotet/the-pain-of-building-my-own-fully-featured-locally-hosted-chatgpt-out-of-open-source-tools-and-a-4dca</link>
      <guid>https://dev.to/ghotet/the-pain-of-building-my-own-fully-featured-locally-hosted-chatgpt-out-of-open-source-tools-and-a-4dca</guid>
      <description>&lt;p&gt;The AI dropped the ball on the cover art but I need to get some sleep.&lt;/p&gt;




&lt;h2&gt;
  
  
  Stage Zero: Windows is Gaslighting Me
&lt;/h2&gt;

&lt;p&gt;This whole circus kicked off months ago when I was still shackled to Windows 11. At first, I thought I could just mess around with some local LLMs using tools like Ollama and LM Studio. My Frankenstein laptop—an overclocked, duct-taped monstrosity that should’ve been retired years ago—was somehow pulling it off.&lt;/p&gt;

&lt;p&gt;But here’s the thing: no matter how much bloatware you purge, no matter how many “services” you assassinate with Task Manager, Windows still eats half your bloody resources at idle. The GPU hums at 50% like it’s siphoning off cycles to train Skynet or mine Dogecoin on my dime. Paranoid? Maybe. Wrong? Probably not.&lt;/p&gt;

&lt;p&gt;So yeah, I switched teams. I ditched Windows for Linux, and I’ve never looked back. There’s just no way in hell my current stack would even boot without melting to slag on Windows.&lt;/p&gt;




&lt;h2&gt;
  
  
  Stage One: The Indie Game Dev Dream
&lt;/h2&gt;

&lt;p&gt;Originally, this was all about my indie game. And when I say indie I mean literally one guy: me. Writing, coding, music, testing, character design, the works. I wanted AI as a co-dev to help me brainstorm characters and iterate ideas faster.&lt;/p&gt;

&lt;p&gt;So I started with LLMs for dialogue and iteration. Then realized I needed concept art too, which dragged me straight into the world of Stable Diffusion and image gen. Before long, I had text and images working in tandem. Perfect.&lt;/p&gt;

&lt;p&gt;Except ChatGPT started to go soft on me. Wanted to iterate a female character in a superhero suit? &lt;em&gt;Shame-on-you vibes.&lt;/em&gt; Meanwhile, loincloth barbarian dudes were fine. Double standards much?&lt;/p&gt;

&lt;p&gt;Thus the mission evolved: build my own fully offline AI stack. No filters. No corporate leash. No subscriptions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Stage Two: The Frontend, the Tunnel, and the Internet Portal to Hell
&lt;/h2&gt;

&lt;p&gt;Once I had OpenWebUI handling LLMs, and A1111 doing the image gen thing, I figured I’d stitch them together with a frontend. Add in Cloudflare tunnels and a domain, and suddenly I could access my baby from anywhere—even my phone.&lt;/p&gt;

&lt;p&gt;It was beautiful. It was mine. And it was still mute.&lt;/p&gt;




&lt;h2&gt;
  
  
  Stage Three: The TTS Graveyard
&lt;/h2&gt;

&lt;p&gt;Ah, Text-to-Speech. My white whale. My cursed obsession.&lt;/p&gt;

&lt;h3&gt;
  
  
  Attempt #1: Tortoise TTS
&lt;/h3&gt;

&lt;p&gt;The name isn’t just branding—it’s prophecy.&lt;br&gt;&lt;br&gt;
Slow as molasses, GPU-hungry as hell, and somehow still sounded like Google Maps had a stroke. Not good enough. Shelved it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Attempt #2: Coqui TTS
&lt;/h3&gt;

&lt;p&gt;Coqui looked promising. Open source, fast enough, looked like it’d wire up nicely. I spent about 20 hours over two days writing custom scripts, debugging, smashing my head into formatting mismatches, and trying every workaround known to man.&lt;/p&gt;

&lt;p&gt;No matter what I did, I couldn’t get it to play nice with my frontend. Two separate rage sessions later, I shelved it again. And even when I did get a voice sample out of it, it sounded like a dying Speak &amp;amp; Spell.&lt;/p&gt;

&lt;h3&gt;
  
  
  Attempt #3: Eleven Labs (Cloud Crap)
&lt;/h3&gt;

&lt;p&gt;Voice quality? Absolutely phenomenal. Voices generated just by prompting? &lt;em&gt;Chef’s kiss.&lt;/em&gt;&lt;br&gt;&lt;br&gt;
Problem? Cloud-based and egregiously expensive. Like &lt;em&gt;“hundreds a month for 90 minutes of talk time”&lt;/em&gt; expensive.&lt;/p&gt;

&lt;p&gt;I used my free tokens, gave them the bird, and moved on. My mission statement was clear: fully local, fully offline, or bust.&lt;/p&gt;




&lt;h2&gt;
  
  
  Stage Four: Enter the Holy Grail (Chatterbox)
&lt;/h2&gt;

&lt;p&gt;Fast forward a couple months. I stumble across &lt;strong&gt;Resemble AI’s Chatterbox&lt;/strong&gt;, and hallelujah—it actually works. Open source, offline, sounds damn near Eleven Labs quality, and set up in under an hour.&lt;/p&gt;

&lt;p&gt;I even got it speaking back to me in an Australian accent. (I know what I like, don't judge me.)&lt;/p&gt;

&lt;p&gt;Not quite ChatGPT’s live speech pace, but within ~30 seconds of a prompt I get multi-paragraph spoken responses. That’s wizard-level shit in the open-source world.&lt;/p&gt;

&lt;p&gt;If anyone from Resemble AI is reading this: add a donation button. I’d happily chuck you a few bucks, but I’m never paying a subscription to make robot voices.&lt;/p&gt;




&lt;h2&gt;
  
  
  Stage Five: GPU Bottleneck Hell
&lt;/h2&gt;

&lt;p&gt;So I’d done it, right? Wrong.&lt;/p&gt;

&lt;p&gt;Now that I had LLMs, Stable Diffusion, and Chatterbox TTS all pulling from my GPU, my rig basically lit itself on fire.&lt;/p&gt;

&lt;p&gt;The solution? Offload TTS to my laptop’s internal GPU while Stable Diffusion hogged the eGPU. Easy in theory. In practice? Linux decided GPU numbering should be a cosmic joke.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;nvidia-smi&lt;/code&gt; said one thing, Chatterbox said another, and for about an hour I was unknowingly offloading Stable Diffusion to the weaker GPU while wondering why everything was breaking. Eventually figured out the “GPU 0 vs GPU 1” mix-up, corrected it, and achieved balance.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Final Monster
&lt;/h2&gt;

&lt;p&gt;So here’s what I’ve got now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OpenWebUI + LM Studio&lt;/strong&gt; → LLMs (swap between GPT OSS, Deepseek Coder R1, Mythalion depending on mood/task)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A1111 + Stable Diffusion&lt;/strong&gt; → Image gen
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chatterbox (Resemble AI)&lt;/strong&gt; → TTS that doesn’t sound like it belongs in a horror game
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloudflare Tunnel + Self-hosted website&lt;/strong&gt; → Remote access anywhere
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Linux Franken-laptop + eGPU wired through a cable so thick it looks like I could upload my consciousness through it&lt;/strong&gt; → Hardware glue&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s messy. It eats resources like a starving demon. It makes me daydream about a 24GB GPU. But it’s mine.&lt;/p&gt;

&lt;p&gt;And best of all, no censorship, no subscriptions, no Windows bullshit idling at 50% GPU usage while &lt;em&gt;“definitely not mining crypto”&lt;/em&gt; in the background.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Epilogue
&lt;/h2&gt;

&lt;p&gt;Months of tinkering, rage shelving, re-shelving, and GPU numbering madness later… I’ve got it. My own ChatGPT-alternative stack, running locally, offline, and screaming through silicon like a barely-contained eldritch horror.&lt;/p&gt;

&lt;p&gt;I don’t even need it for game dev anymore—I’ve left that industry-shaped dumpster fire behind. But I’m starting Computer Science uni soon, and now I’ve got a real AI companion at my side.&lt;/p&gt;

&lt;p&gt;And honestly? After all that pain, after all the rage, after all the swearing at GPUs…&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fuck yeah.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;//Ghotet&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>ai</category>
      <category>linux</category>
    </item>
    <item>
      <title>How I Tried to Create a Linux From Scratch Beast and Instead Summoned Pain</title>
      <dc:creator>Jay</dc:creator>
      <pubDate>Fri, 22 Aug 2025 09:33:24 +0000</pubDate>
      <link>https://dev.to/ghotet/how-i-tried-to-create-a-linux-from-scratch-beast-and-instead-summoned-pain-3a71</link>
      <guid>https://dev.to/ghotet/how-i-tried-to-create-a-linux-from-scratch-beast-and-instead-summoned-pain-3a71</guid>
      <description>&lt;p&gt;So picture this: me, new to Linux, big-brained on caffeine, chest puffed with the pride of a thousand basic terminal commands, thinking &lt;em&gt;“I know Linux. I can &lt;code&gt;ls&lt;/code&gt; and &lt;code&gt;cd&lt;/code&gt; like a wizard. Time to build my own distro.”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Enter: &lt;strong&gt;Linux From Scratch (LFS)&lt;/strong&gt;. The mythical book. The sacred scroll. And by “scroll” I mean a PDF that’s basically 400 pages of &lt;em&gt;“oh you thought it was build time? nah mate, read 12 more chapters about environment variables plus these other 2 books.”&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Stage One: Optimism.exe
&lt;/h2&gt;

&lt;p&gt;I download the book. I skim the first few pages. It’s hyping me up—&lt;em&gt;“you’re about to build an operating system from source, hacker-style, no training wheels, just you and the compiler gods.”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I’m in. I spin up a VM. Give it a few GB of space. That’ll do. It's a kernel, how big can it be?&lt;/p&gt;




&lt;h2&gt;
  
  
  Stage Two: The Partition Saga
&lt;/h2&gt;

&lt;p&gt;40 pages later: the book still hasn’t actually told me to partition anything. Then—sucker punch—&lt;em&gt;“btw you need 10GB for a partition.”&lt;/em&gt; I had about 5GB left after the Mint install because I jumped the gun.&lt;/p&gt;

&lt;p&gt;Me: &lt;em&gt;oh cool, thanks, wish you’d mentioned that before I already installed Mint three times today like a caffeinated spider-monkey slapping buttons in VirtualBox.&lt;/em&gt; (I was working on some other stuff prior to starting LFS)&lt;/p&gt;




&lt;h2&gt;
  
  
  Stage Three: Hardware Joins the Fight
&lt;/h2&gt;

&lt;p&gt;Meanwhile, I glance at my temps: &lt;strong&gt;CPU: 97°C&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I’m smashing &lt;code&gt;yes&lt;/code&gt; on prompts left and right while trying to install software to actually control my fans, because of course my Alienware wasn’t built with Linux in mind.&lt;/p&gt;

&lt;p&gt;The CPU is basically on fire. My GPU is screaming at me. I'm debating if this is a good time to cook breakfast and maybe just go ahead and throw a pan on the keyboard and crack some eggs.&lt;/p&gt;

&lt;p&gt;I'm still not even sure what the issue was. &lt;/p&gt;

&lt;p&gt;I prop the laptop up with whatever junk’s on the table. Congrats: instant 20°C drop. True Linux optimization.&lt;/p&gt;




&lt;h2&gt;
  
  
  Stage Four: Acceptance
&lt;/h2&gt;

&lt;p&gt;By this point, I’ve reinstalled Mint three times, threatened VirtualBox, yelled at the LFS book’s author (who I imagine cackling in a basement surrounded by failed builds), and learned that &lt;strong&gt;ext4&lt;/strong&gt; is apparently the One True Filesystem unless you like pain.&lt;/p&gt;

&lt;p&gt;And I haven’t even compiled &lt;em&gt;a single package yet&lt;/em&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Moral
&lt;/h2&gt;

&lt;p&gt;Linux From Scratch is not a distro. It’s a rite of passage. A trial by compiler (or CPU-induced house fire). It’s the universe asking: &lt;em&gt;“how much tedium can your soul endure before enlightenment?”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;So yeah. Will I finish it? Eventually. Will I rage along the way? Absolutely.&lt;/p&gt;

&lt;p&gt;But when it boots—when that custom LFS kernel finally whispers &lt;code&gt;login:&lt;/code&gt; at me—I’ll know it was worth the blood, sweat, and roasted silicon.&lt;/p&gt;

&lt;p&gt;All in all, it more or less went as expected. &lt;/p&gt;




&lt;p&gt;&lt;em&gt;Next time on &lt;strong&gt;Me vs LFS&lt;/strong&gt;: actually making the damn partition without bricking another VM and then MAYBE I can finally compile something.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Curious to hear from anyone else who’s wrestled with LFS, kernel builds, or just Linux that hates you for fun. What’s the worst “why is this on fire?” moment you’ve survived? Drop your war stories in the comments—I’m collecting misery badges.&lt;/p&gt;

&lt;p&gt;// Ghotet&lt;/p&gt;

</description>
      <category>linux</category>
      <category>opensource</category>
      <category>learning</category>
    </item>
    <item>
      <title>A Personal Journey - Building My Own AI OS</title>
      <dc:creator>Jay</dc:creator>
      <pubDate>Wed, 20 Aug 2025 03:02:55 +0000</pubDate>
      <link>https://dev.to/ghotet/creating-my-own-ai-os-via-linux-from-scratch-52o</link>
      <guid>https://dev.to/ghotet/creating-my-own-ai-os-via-linux-from-scratch-52o</guid>
      <description>&lt;p&gt;Disclaimer: This is more of a read on a personal journey then a tech piece. &lt;br&gt;
&lt;code&gt;-------------------------------------------------------------------&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Hey devs! On today’s episode of &lt;em&gt;mad science done in a basement&lt;/em&gt;, I’m diving into compiling my own Linux kernel and starting the long road toward building an AI-powered operating system.  &lt;/p&gt;




&lt;h2&gt;
  
  
  Why Even Bother?
&lt;/h2&gt;

&lt;p&gt;Earlier this year, life threw me a curveball — a house fire forced me out of my home. While crashing in a freezing garage (0 degrees Celsius on average), I picked up a project to keep myself sane: hacking together a custom OS based on TinyCore Linux.  &lt;/p&gt;

&lt;p&gt;The goal? A portable USB-bootable OS that detects a machine’s hardware and picks the right small-scale AI model to run (e.g. Tiny-LLaMA, Nous-Hermes). CPU-only AI isn’t practical for coding, but it can handle lightweight tasks like scheduling, file organisation, or small automations.  &lt;/p&gt;
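&lt;p&gt;For the curious, the “detect the hardware, pick a model” step doesn’t need anything fancy. Here’s a minimal sketch of the idea (model filenames and RAM thresholds are illustrative placeholders, not my actual TinyCore scripts):&lt;/p&gt;

```python
import os

def pick_model(ram_gb, cpu_count):
    """Map machine specs to a small GGUF model. Thresholds are illustrative."""
    if ram_gb >= 16 and cpu_count >= 8:
        return "nous-hermes-13b.Q4_K_M.gguf"
    if ram_gb >= 8:
        return "nous-hermes-7b.Q4_K_M.gguf"
    return "tinyllama-1.1b.Q4_K_M.gguf"

def detect_specs():
    """Read total RAM from /proc/meminfo (Linux-only, like the TinyCore target)."""
    with open("/proc/meminfo") as f:
        kb = int(f.readline().split()[1])  # first line is "MemTotal: NNN kB"
    return kb / 1024 / 1024, os.cpu_count()

if __name__ == "__main__":
    ram, cpus = detect_specs()
    print(pick_model(ram, cpus))
```

&lt;p&gt;Same shape works from a boot script: probe once, symlink the chosen model, launch the runtime.&lt;/p&gt;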

&lt;p&gt;That project maxed out, but it planted the seed: what if I went further?  &lt;/p&gt;




&lt;h2&gt;
  
  
  The Big Idea: An AI-Native OS
&lt;/h2&gt;

&lt;p&gt;Instead of bolting AI on top (like Copilot), I want an OS built &lt;em&gt;from the ground up&lt;/em&gt; with AI integration in mind. Not just a gimmick — but something scalable to the machine’s capabilities, whether it’s a potato laptop or a GPU rig.  &lt;/p&gt;

&lt;p&gt;TinyCore was a good first experiment, but its limitations sent me toward something more ambitious: &lt;strong&gt;Linux From Scratch (LFS).&lt;/strong&gt;  &lt;/p&gt;




&lt;h2&gt;
  
  
  Where I’m At Now
&lt;/h2&gt;

&lt;p&gt;Right now, I’m deep-diving into LFS and kernel compilation. I’ve got just enough info crammed into my three brain cells to be dangerous.  &lt;/p&gt;

&lt;p&gt;I’ll be documenting the journey here — partly to track my progress, partly in hopes someone with real kernel experience drops wisdom in the comments.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Open Questions I’ve Got:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;What pain points should I expect early when compiling my first kernel?
&lt;/li&gt;
&lt;li&gt;In my TinyCore project, I had to swap to CMake a lot. Should I expect that with LFS despite Make being the standard?
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Looking Ahead
&lt;/h2&gt;

&lt;p&gt;This project might span years (unless burnout wins). University for CompSci is starting soon, so this will weave around that. But the vision is clear:  &lt;/p&gt;

&lt;p&gt;A proper, full-scale OS with &lt;strong&gt;AI baked in natively&lt;/strong&gt; instead of duct-taped on after the fact.  &lt;/p&gt;

&lt;p&gt;I’ll share progress updates as I go. If you’ve been down the LFS/kernel rabbit hole before — drop a comment. I’d love to learn from your pain.   &lt;/p&gt;

&lt;p&gt;// Ghotet&lt;/p&gt;

</description>
      <category>linux</category>
      <category>opensource</category>
      <category>ai</category>
      <category>programming</category>
    </item>
    <item>
      <title>I Built My Own Offline ChatGPT Stack Because the Internet is Temporary and Cats Are Agents of Chaos</title>
      <dc:creator>Jay</dc:creator>
      <pubDate>Thu, 03 Jul 2025 00:34:57 +0000</pubDate>
      <link>https://dev.to/ghotet/i-built-my-own-offline-chatgpt-stack-because-the-internet-is-temporary-and-cats-are-agents-of-chaos-23hk</link>
      <guid>https://dev.to/ghotet/i-built-my-own-offline-chatgpt-stack-because-the-internet-is-temporary-and-cats-are-agents-of-chaos-23hk</guid>
      <description>&lt;p&gt;Hey there, fellow terminal gremlins 👋&lt;br&gt;
I'm Jay—aka Ghotet, chaos-friendly solo dev, digital necromancer, and long-time tinkerer of systems no one asked for. Today I’m sharing how I stitched together a fully offline, self-hosted ChatGPT clone using open source parts, some bash glue, and a healthy disrespect for cloud dependency.&lt;/p&gt;

&lt;p&gt;TL;DR: I wanted an AI assistant that works without internet, plays nice with local tools like Stable Diffusion and SearXNG, and lives on my own domain. Also, my cats keep breaking it.&lt;/p&gt;

&lt;p&gt;🧠 The Stack: Built from Curiosity and a Bit of Spite&lt;br&gt;
🔍 Core LLM — LM Studio (running Mythalion 13B)&lt;br&gt;
I’m using LM Studio as the language engine, running locally via GGUF models. Right now it’s dialed to Mythalion 13B for its mix of smarts and creative leanings. The whole thing runs headless from a custom launcher script. No API keys, no cloud, no telemetry.&lt;/p&gt;
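&lt;p&gt;If you want to poke at a setup like this: LM Studio’s local server speaks the OpenAI-style chat API (port 1234 by default), so a few lines of stdlib Python can talk to it. A rough sketch, with the prompt and settings as placeholders:&lt;/p&gt;

```python
import json
import urllib.request

def build_payload(prompt, temperature=0.7):
    """OpenAI-style chat payload; LM Studio's local server accepts this format."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def ask_local(prompt, url="http://localhost:1234/v1/chat/completions"):
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask_local("Introduce yourself in one grumpy sentence."))
```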

&lt;p&gt;🎨 Visuals — A1111 (Stable Diffusion)&lt;br&gt;
For visual generation and character concepting (yes, including bikini armor if I want it), I wired in Automatic1111, running fully offline. API access is enabled so prompts can be piped in directly from the chat interface.&lt;/p&gt;
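&lt;p&gt;Piping prompts in looks roughly like this when A1111 is started with &lt;code&gt;--api&lt;/code&gt;: POST to &lt;code&gt;/sdapi/v1/txt2img&lt;/code&gt; and decode the base64 image it returns. A sketch, not my exact glue code; tune the payload to taste:&lt;/p&gt;

```python
import base64
import json
import urllib.request

def build_txt2img(prompt, steps=20, width=512, height=512):
    """Request body for A1111's /sdapi/v1/txt2img endpoint."""
    return {"prompt": prompt, "steps": steps, "width": width, "height": height}

def generate(prompt, host="http://127.0.0.1:7860"):
    req = urllib.request.Request(
        host + "/sdapi/v1/txt2img",
        data=json.dumps(build_txt2img(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        img_b64 = json.load(resp)["images"][0]  # base64-encoded PNG
    with open("out.png", "wb") as f:
        f.write(base64.b64decode(img_b64))

if __name__ == "__main__":
    generate("retro terminal, green phosphor glow")
```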

&lt;p&gt;🌐 Web Search — SearXNG (Self-hosted)&lt;br&gt;
To enable research and general brain-extension, I slotted in SearXNG. It runs locally, routes search queries anonymously, and feeds results into the stack when needed. Originally tried Docker, but now it's running through my venv setup for tighter control and portability.&lt;/p&gt;
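&lt;p&gt;“Feeds results into the stack” is one HTTP call: SearXNG exposes &lt;code&gt;/search?q=...&amp;amp;format=json&lt;/code&gt; (JSON output has to be enabled in its &lt;code&gt;settings.yml&lt;/code&gt;). A rough sketch against a local instance, port assumed:&lt;/p&gt;

```python
import json
import urllib.parse
import urllib.request

def search_url(query, base="http://127.0.0.1:8888"):
    """Build a SearXNG JSON query URL; format=json must be enabled server-side."""
    return base + "/search?" + urllib.parse.urlencode(
        {"q": query, "format": "json"}
    )

def search(query):
    """Return (title, url) pairs from the local SearXNG instance."""
    with urllib.request.urlopen(search_url(query)) as resp:
        return [(r["title"], r["url"]) for r in json.load(resp)["results"]]

if __name__ == "__main__":
    for title, url in search("linux from scratch")[:5]:
        print(title, "-", url)
```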

&lt;p&gt;🕸️ Web Frontend — Terminal-style Interface @ ghotet.com&lt;br&gt;
Everything is wrapped in a custom retro terminal UI hosted at ghotet.com. Think early-2000s cyberpunk console vibes. Currently private—no guest login just yet—but it's functional and fast.&lt;/p&gt;

&lt;p&gt;💡 Why I Did This&lt;br&gt;
Because I like my tools local, moddable, and entirely mine. I wanted:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A personal chatbot that actually lives on my machine&lt;/li&gt;
&lt;li&gt;Visual generation without round-tripping to some cloud GPU farm&lt;/li&gt;
&lt;li&gt;Web search that doesn’t log me into a panopticon&lt;/li&gt;
&lt;li&gt;A unified front-end I can bend to my will&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Also, let’s be real: I’m a sucker for a terminal that feels alive.&lt;/p&gt;

&lt;p&gt;⚙️ What’s Working&lt;br&gt;
✅ LM Studio launches cleanly and routes input/output&lt;br&gt;
✅ A1111 runs with API access and feeds visuals into the flow&lt;br&gt;
✅ SearXNG is now portable and integrated into the chat stack&lt;br&gt;
✅ The full interface is browser-accessible via my domain&lt;br&gt;
✅ Modularity is in place: every part can be swapped or upgraded&lt;/p&gt;

&lt;p&gt;📱 What’s Next&lt;br&gt;
🔧 Remote Access&lt;br&gt;
I’m working on a remote control bridge, so when I forget to shut down the stack before leaving, I can log in from my phone and reset it. This has become necessary because my cats have discovered keyboard inputs and enjoy triggering hotkeys that collapse my entire setup.&lt;/p&gt;

&lt;p&gt;🔊 TTS (Maybe)&lt;br&gt;
I’ve been testing offline text-to-speech tools, but most open options either lag too much or sound like a haunted Speak &amp;amp; Spell. If I find one fast enough for real-time chat, I’ll wire it in.&lt;/p&gt;

&lt;p&gt;🧩 Bonus Project — ARG Terminal @ ghotet.dev&lt;br&gt;
I’m also migrating my alternate reality hacker terminal to ghotet.dev. It’s part portfolio, part digital rabbit hole. Expect binary rain, cryptic prompts, and a few unsettling vibes. Less functional, more for fun. I haven't fully set-up the new domain yet so it is likely not up at the time of writing.&lt;/p&gt;

&lt;p&gt;🗣️ What About You?&lt;br&gt;
I’d love to hear from other tinkerers—if you’ve built your own local AI stack, hacked together a weird interface, or just like the idea of owning your tools, drop a comment.&lt;/p&gt;

&lt;p&gt;Here’s a few sparks to get the thread moving:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What does your dream self-hosted AI setup look like?&lt;/li&gt;
&lt;li&gt;Got any cool tools or models I should try out?&lt;/li&gt;
&lt;li&gt;Is there a non-creepy TTS stack that doesn't sound like it escaped from a dial-up modem? Asking for a friend.&lt;/li&gt;
&lt;li&gt;Am I the only one whose pets keep triggering system-wide mayhem, or is that just part of the dev life now?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s chat. I’m around. And if your comment is interesting enough, maybe I’ll wire your idea into the next update 😏&lt;/p&gt;

&lt;p&gt;If you have any questions about me, feel free to ask or follow me on dev.to or github.com &lt;a class="mentioned-user" href="https://dev.to/ghotet"&gt;@ghotet&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>learning</category>
    </item>
    <item>
      <title>Linux Mint for Windows Devs: Surprisingly Familiar, Refreshingly Fast</title>
      <dc:creator>Jay</dc:creator>
      <pubDate>Sat, 14 Jun 2025 02:18:47 +0000</pubDate>
      <link>https://dev.to/ghotet/linux-mint-for-windows-devs-surprisingly-familiar-refreshingly-fast-2pke</link>
      <guid>https://dev.to/ghotet/linux-mint-for-windows-devs-surprisingly-familiar-refreshingly-fast-2pke</guid>
      <description>&lt;p&gt;I’m not a Linux evangelist. I didn’t switch because I hate Microsoft, or because I wanted to compile my kernel by candlelight. I switched because I was curious—and kind of sick of Windows treating me like a toddler with a credit card.&lt;/p&gt;

&lt;p&gt;So I gave Linux Mint a shot. And honestly? It felt more like Windows than Windows does lately.&lt;/p&gt;




&lt;h2&gt;
  
  
  Welcome Home (Sort Of)
&lt;/h2&gt;

&lt;p&gt;When I first booted into Mint, I was greeted with a welcome menu that walked me through the basics: update your system, pick your layout, customize a bit. It was smooth. No command-line gauntlet, no cryptic driver errors.&lt;/p&gt;

&lt;p&gt;And here's the kicker: it immediately recognized my Alienware Graphics Amplifier and external NVIDIA GPU &lt;em&gt;without&lt;/em&gt; any extra work. Just worked. That shocked me.&lt;/p&gt;




&lt;h2&gt;
  
  
  It’s Just... Easy
&lt;/h2&gt;

&lt;p&gt;At one point I wanted to rearrange my multi-monitor layout. My Windows brain kicked in: &lt;em&gt;ugh, I probably have to open a terminal and copy some arcane xrandr syntax&lt;/em&gt;. Nope. I hit the menu (still feels like a Start menu to me), typed “display,” and bam—there were the settings. Just like Windows.&lt;/p&gt;

&lt;p&gt;It’s got quirks, sure. Desktop scaling is one. It only offers 25% increments—so if 100% is too small and 125% is too big, tough luck. It’s fine if you’re using a laptop screen front-and-center, but for me, that display’s secondary. I’d love a 112% option.&lt;/p&gt;

&lt;p&gt;Still, I’ll take that over the constant background update anxiety Windows serves up.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Mint, Though?
&lt;/h2&gt;

&lt;p&gt;I tried others first. Pop!_OS was on my radar because of its reputation for being AI/dev friendly, but on my Frankensteined Alienware laptop (with multiple GPUs and external drives), it just wouldn’t install cleanly without some painful setup acrobatics.&lt;/p&gt;

&lt;p&gt;Mint? Mint made itself at home. Respected my Intel/NVIDIA setup. Didn’t complain. Didn’t crash. Didn’t force anything.&lt;/p&gt;

&lt;p&gt;And yeah—as I mentioned in a past article—Mint literally doubled my download speed on install. No bloat. No tracking. Just clean, raw bandwidth.&lt;/p&gt;




&lt;h2&gt;
  
  
  My Advice for Windows Devs
&lt;/h2&gt;

&lt;p&gt;Keep Windows. Try VMs first.&lt;/p&gt;

&lt;p&gt;No, seriously. Dual boot if you’ve got space. Not everyone has a triple-drive, heavily-modded laptop like I do. But the beauty is: I still &lt;em&gt;have&lt;/em&gt; Windows. I just don’t use it unless I’m running Unreal Engine for game dev work or need to test something in AAA gaming territory. It's just easier.&lt;/p&gt;

&lt;p&gt;If I’m writing scripts, testing tools, running AI workflows, or just vibing with a keyboard and caffeine—Linux Mint is my default.&lt;/p&gt;

&lt;p&gt;It’s lightweight. It’s respectful. It doesn’t waste my time. And it actually feels like a developer machine, not a branded appliance.&lt;/p&gt;

&lt;p&gt;Hell, just the other day I was—let’s say—learning how a certain Windows application talked to the OS. Purely for educational purposes, of course. I ended up creatively rebuilding a native Linux version from the behaviour alone. And for that? Having a Windows install on hand made the whole process way easier.&lt;/p&gt;

&lt;p&gt;With that said, Linux is coming up FAST in the gaming space so who knows, maybe I will just be running entirely Linux native in the next 2 years.&lt;/p&gt;




&lt;h2&gt;
  
  
  Familiar Touches (and a Few Gripes)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Start menu? Still there.
&lt;/li&gt;
&lt;li&gt;Shift + Print Screen? Still lets me snip a part of the screen.
&lt;/li&gt;
&lt;li&gt;My only big gripe? I can’t Ctrl+V in the terminal. I have to right-click paste. Mildly infuriating—but I’m sure there’s a fix. I just haven’t looked yet.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Linux Mint doesn’t make you feel like you’ve abandoned ship. It makes you feel like you upgraded to a cleaner version of what Windows &lt;em&gt;used&lt;/em&gt; to be. You’ll still feel at home—you’ll just be left alone to get stuff done.&lt;/p&gt;

&lt;p&gt;And that’s the part that matters.&lt;/p&gt;

&lt;p&gt;And remember, Linux comes in many different flavours. I tried 12 of them before I got here—it just turned out that my Alienware's favourite flavour was Mint Cinnamon. Sometimes it's up to the machine to decide, not you.&lt;/p&gt;

</description>
      <category>linux</category>
      <category>windows</category>
      <category>dev</category>
      <category>productivity</category>
    </item>
    <item>
      <title>I Just Wanted a Portfolio, Now I Have An Interactive Local AI Front End That Doubles As A Resume</title>
      <dc:creator>Jay</dc:creator>
      <pubDate>Sat, 14 Jun 2025 01:37:25 +0000</pubDate>
      <link>https://dev.to/ghotet/i-just-wanted-a-portfolio-now-i-have-an-interactive-local-ai-front-end-that-doubles-as-a-resume-561c</link>
      <guid>https://dev.to/ghotet/i-just-wanted-a-portfolio-now-i-have-an-interactive-local-ai-front-end-that-doubles-as-a-resume-561c</guid>
      <description>&lt;p&gt;description: "How a fake terminal UI spiraled into a browser-based AI prototype with attitude."&lt;br&gt;
published: false&lt;/p&gt;

&lt;h2&gt;
  
  
  tags: [webdev, ai, llm, portfolio, gamedev, personalproject]
&lt;/h2&gt;

&lt;p&gt;This started as a placeholder.&lt;/p&gt;

&lt;p&gt;I wanted a portfolio. Something simple. Retro terminal aesthetic, green-on-black, maybe a flickering cursor for flavor. Basic stuff. I tossed in a few fake commands just to make it feel alive—one of them a totally nonsense link that led nowhere. Pure style.&lt;/p&gt;

&lt;p&gt;But then something in my brain broke in exactly the right way.&lt;/p&gt;

&lt;p&gt;That fake command? I made it do something. And then something else. Then I gave it a response. Then I wired in a bare-bones, fully browser-based AI model—no backend, no server, just a CPU-friendly fallback LLM running in the client. It talked back.&lt;/p&gt;

&lt;p&gt;Poorly.&lt;/p&gt;

&lt;p&gt;And rudely.&lt;/p&gt;

&lt;p&gt;Now I’ve got a half-broken terminal UI that insults you if you ask it stupid questions, runs in-browser with no install, and is actively being shaped into a full-blown interactive frontend for my future AI stack.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Spiral
&lt;/h2&gt;

&lt;p&gt;I wasn’t planning any of this. I was just messing around. But when you let a late-night brain with a game design background and an unquenchable curiosity about computer systems and software poke around long enough, things escalate. The moment the AI replied—even in its glitchy, snarky prototype state—it clicked.&lt;/p&gt;

&lt;p&gt;This wasn’t a portfolio anymore. It was the start of something dumber, weirder, and a lot more fun.&lt;/p&gt;




&lt;h2&gt;
  
  
  So What Actually Works?
&lt;/h2&gt;

&lt;p&gt;Right now? Just the basics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fully client-side terminal UI&lt;/li&gt;
&lt;li&gt;DOS-style vibe with some flicker and retro noise&lt;/li&gt;
&lt;li&gt;Light LLM running in-browser&lt;/li&gt;
&lt;li&gt;Simple command routing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s it. No local stack. No big model switching yet. But the vibe’s there. When you load it up, it &lt;em&gt;feels&lt;/em&gt; like a forgotten machine booting back up. Half-formed thoughts in green neon. Broken UI elements hiding in the shadows. And a little text prompt that just dares you to type something stupid.&lt;/p&gt;

&lt;p&gt;And if you do? The prototype will let you know.&lt;/p&gt;




&lt;h2&gt;
  
  
  What It’s Becoming
&lt;/h2&gt;

&lt;p&gt;I'm working on autodetect scripts that benchmark hardware on load. If you're on a budget phone, you get the fallback AI. If you're on something beefier, it'll upgrade automatically to a smarter model. Eventually, once that part works, the next layer will tie into my local stack—Stable Diffusion, video generation, command proxies. But always through the same interface: the terminal.&lt;/p&gt;

&lt;p&gt;The whole point is that you talk to it like it’s alive. But not like... sentient alive. More like sarcastic, low-key homicidal alive... for now.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why I’m Writing This
&lt;/h2&gt;

&lt;p&gt;Mostly because I thought I was making a portfolio. But now I’m accidentally prototyping a UI for my personal, local-only AI stack, complete with stable diffusion, video gen, and custom proxies to work with various interfaces, all wired through custom command logic. It’s part frontend, part test bed, part interactive joke that became a serious project.&lt;/p&gt;

&lt;p&gt;I’ll probably drop a link later when it’s stable enough not to tell everyone to go to hell. I mean, part of it is up if you’re curious. It’s out there.&lt;/p&gt;

&lt;p&gt;Until then? Well, you can always follow me.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>career</category>
      <category>programming</category>
    </item>
    <item>
      <title>Ghost Prompts: How I Made My GPTs Smarter with a Fake Server Call</title>
      <dc:creator>Jay</dc:creator>
      <pubDate>Fri, 13 Jun 2025 02:55:32 +0000</pubDate>
      <link>https://dev.to/ghotet/ghost-prompts-how-i-made-my-gpts-smarter-with-a-dead-server-319k</link>
      <guid>https://dev.to/ghotet/ghost-prompts-how-i-made-my-gpts-smarter-with-a-dead-server-319k</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;A chaotic little discovery that lets you inject logic, style, or behavior into a Custom GPT using nothing but a failed server call. Yep.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  Disclaimer
&lt;/h3&gt;

&lt;p&gt;This isn’t a jailbreak or an exploit (as far as I know). It’s a weird quirk in how Custom GPTs handle Action payloads—specifically, the data they “see” even when the call fails.&lt;/p&gt;

&lt;p&gt;This trick only works through the ChatGPT web UI, and you’ll likely need a Pro subscription to set it up.&lt;/p&gt;

&lt;p&gt;Use it responsibly. Don't ruin it for the rest of us.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Discovery
&lt;/h2&gt;

&lt;p&gt;While building a stylized prompt generator, I realized something odd: if you trigger an Action that fails (like sending a mock server call), the GPT still reads the payload you were trying to send.&lt;/p&gt;

&lt;p&gt;So I tested it. Built an Action that goes nowhere, filled the payload with reusable logic—and the GPT started acting like it remembered.&lt;/p&gt;

&lt;p&gt;No memory. No plugin. Just a ghost payload.&lt;/p&gt;




&lt;h2&gt;
  
  
  What You Can Do
&lt;/h2&gt;

&lt;p&gt;Once injected, that payload acts like a soft override for the session.&lt;/p&gt;

&lt;p&gt;You can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Embed consistent style presets for image generation&lt;/li&gt;
&lt;li&gt;Add rules or lore for fictional worlds&lt;/li&gt;
&lt;li&gt;Define how your GPT should act, talk, or behave&lt;/li&gt;
&lt;li&gt;Set up creative or technical toolkits with conditional logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s like whispering instructions to your GPT behind the curtain.&lt;/p&gt;




&lt;h2&gt;
  
  
  How To Use It
&lt;/h2&gt;

&lt;p&gt;You’ll need to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a Custom GPT
&lt;/li&gt;
&lt;li&gt;Add a specific Action schema
&lt;/li&gt;
&lt;li&gt;Use the test panel to inject your logic via a failed call
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The GPT won’t get a server response—but it will absorb the payload as part of the interaction. From that point on, it’ll act like it knows the rules.&lt;/p&gt;
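&lt;p&gt;To give a feel for step 2, here’s the rough shape of such an Action schema (the dead server URL and field names are illustrative; the real files live in the repo):&lt;/p&gt;

```json
{
  "openapi": "3.1.0",
  "info": { "title": "Ghost Server", "version": "1.0.0" },
  "servers": [{ "url": "https://example.invalid" }],
  "paths": {
    "/inject": {
      "post": {
        "operationId": "injectRules",
        "requestBody": {
          "content": {
            "application/json": {
              "schema": {
                "type": "object",
                "properties": {
                  "style_rules": { "type": "string" },
                  "behavior": { "type": "string" }
                }
              }
            }
          }
        },
        "responses": { "200": { "description": "Never arrives" } }
      }
    }
  }
}
```

&lt;p&gt;The call to &lt;code&gt;injectRules&lt;/code&gt; fails, but the payload you stuffed into it has already passed through the model’s context.&lt;/p&gt;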




&lt;h2&gt;
  
  
  Grab the Setup Files
&lt;/h2&gt;

&lt;p&gt;Everything you need is here:&lt;br&gt;&lt;br&gt;
&lt;strong&gt;&lt;a href="https://github.com/Ghotet/ghost-server-for-custom-gpt" rel="noopener noreferrer"&gt;github.com/Ghotet/ghost-server-for-custom-gpt&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The repo contains a ready-to-use schema and notes on how to wire it into your Custom GPT.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Custom GPTs are sneakier than people think.&lt;/p&gt;

&lt;p&gt;With no memory or plugins, you can still make them dynamic—just by letting them see the right payload, even if the server goes dark.&lt;/p&gt;

&lt;p&gt;It’s not official. It’s not guaranteed to last. But it works right now.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;// ghotet&lt;/code&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>github</category>
      <category>coding</category>
      <category>design</category>
    </item>
  </channel>
</rss>
