DEV Community: Vineeth N K

My expense app stopped asking the network for permission

Vineeth N K — Sat, 01 Aug 2026 15:29:51 +0000

TL;DR: My household expense app used to fail the save outright when there was no network. So I moved writes into a persisted outbox that drains later, which made saving instant with or without signal. The queue was the easy part. The real work was making reads count the queued entries, generating ids on the phone so a record can be edited before the server has ever seen it, and telling apart "the request never left" from "the server said no". I also skipped the connectivity library on purpose.

WeSpend is a small app I built for my house. One person puts money into a shared pot every month, everybody logs what they spend from their own phone, and at the end of the week it works out who owes whom.

For a long time it had one rule I was not proud of. No network, no expense.

You type the amount, you hit save, and it fails. Not queued somewhere quietly. Not kept in a corner for later. Just failed, and whatever you typed was gone.

Nobody adds the expense later

That is the part which actually hurt.

Think about when you add an expense. You are standing somewhere, bags in one hand, phone in the other, and you have maybe ten seconds of patience for this. The save fails. You tell yourself you will add it later.

You do not add it later. Nobody adds it later. I built the app and even I do not add it later.

So the money got spent, the entry never happened, and by the weekend the settlement was quietly wrong. Not loudly wrong, which would have been fine. Quietly wrong, which is much worse, because everybody looks at the number and believes it.

An app that tracks money only works if people trust the total. And the total was only as good as the weakest network moment in the whole week.

So the fix looked obvious. Save it on the phone, send it later. One evening of work, easy.

It was not one evening of work.

Saving turned out to be the easy half

The queue part was genuinely quick. A stored outbox, writes go in, something drains them in the background. Done.

Then I switched off the network, saved an expense, and the app told me my budget had not moved at all.

Which, fair enough. The budget was reading rows from the server. My expense was sitting in a queue that the read side had never heard about. Same problem in the history list. Same problem in the settlement, and the settlement is the entire point of the app.

So the reads had to change too. Now every read takes the server rows and folds the queued operations on top before anything reaches the screen. An expense you saved with no signal counts against the budget, sits in your history, and shows up in the settlement exactly like one that synced days ago.

That is the bit I did not see coming. Offline is not really a write feature. It is a read feature wearing a write feature's clothes.
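Roughly, the fold looks like this. It is a sketch, not WeSpend's actual code: the Expense and QueuedOp shapes and the names applyOutbox and totalSpent are made up for illustration, and the real read path will have more fields to carry.

```typescript
// Hypothetical shapes: the post does not show the app's real types.
type Expense = { id: string; amount: number; note: string };

type QueuedOp =
  | { kind: "create"; expense: Expense }
  | { kind: "update"; id: string; patch: Partial<Omit<Expense, "id">> }
  | { kind: "delete"; id: string };

// Overlay the outbox on top of the rows the server last returned.
// Ops are applied oldest first, so a later edit wins over an earlier one.
function applyOutbox(serverRows: Expense[], outbox: QueuedOp[]): Expense[] {
  const rows = new Map<string, Expense>();
  for (const row of serverRows) rows.set(row.id, row);

  for (const op of outbox) {
    if (op.kind === "create") {
      rows.set(op.expense.id, op.expense);
    } else if (op.kind === "update") {
      const current = rows.get(op.id);
      if (current) rows.set(op.id, { ...current, ...op.patch });
    } else {
      rows.delete(op.id);
    }
  }
  return Array.from(rows.values());
}

// Every screen (budget, history, settlement) reads the merged rows.
function totalSpent(rows: Expense[]): number {
  return rows.reduce((sum, row) => sum + row.amount, 0);
}
```

The point of the shape is that nothing downstream has to know whether a row came from the server or from the queue. The budget just sums whatever applyOutbox returns.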

The id has to come from the phone

This one small decision quietly shapes everything else.

If the server hands out the id, then your entry has no name until the server has seen it. And a thing with no name cannot be edited. Cannot be deleted either. You are just holding it and waiting.

So creates now make their own id on the device, and every repository moved from an insert to an upsert on that id.

Two good things came out of that one change.

You can edit or delete a row that has never once reached the server, because it already has a name of its own.

And if a create gets sent, the connection dies before the reply comes back, and the retry sends it again, the second one just lands on the same id. No duplicate. In an app where the whole family sees the same list, one grocery run counted twice is not a small bug. That is how you start an argument at home about money that was never actually spent.
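A small sketch of why the retry is harmless. The ExpenseTable class is a hypothetical in-memory stand-in for the server side, and newId is injected so the example stays self-contained; in the app it would be whatever UUID generator is available on the device.

```typescript
type Expense = { id: string; amount: number; note: string };

// Hypothetical stand-in for the server table, keyed on the client-made id.
class ExpenseTable {
  private rows = new Map<string, Expense>();

  // Upsert: replaying the same create overwrites the row it already wrote
  // instead of adding a second one.
  upsert(expense: Expense): void {
    this.rows.set(expense.id, expense);
  }

  get(id: string): Expense | undefined {
    return this.rows.get(id);
  }

  count(): number {
    return this.rows.size;
  }
}

// The id is minted on the device, before anything is sent anywhere.
function createExpense(newId: () => string, amount: number, note: string): Expense {
  return { id: newId(), amount, note };
}
```

Send the same expense twice and count() still reports one row. With a plain insert and a server-assigned id, the second send would have become a second grocery run.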

Three edits, one request

Once saving became instant, we all started doing what people naturally do. Put in a number, look at it, realise it was wrong, fix it. Then fix it again.

Done naively, that is three requests standing in line, waiting to be replayed at the server, for one entry that only ever needed one.

So the queued operations now fold into each other. An edit merges into the create or the edit sitting ahead of it. Delete something that never synced and the whole chain just disappears, because there is nothing at the server to go and delete.

Fiddle with a form three times and it still costs exactly one request when the signal comes back. If you have ever watched a sync queue faithfully replay a pile of operations that cancelled each other out, you already know why I bothered.
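Here is one way the folding could work, as a sketch under a simplifying assumption: the queue holds at most one pending operation per id, so a new operation only ever has one earlier entry to merge into. The types and the enqueue name are invented for the example.

```typescript
type Expense = { id: string; amount: number; note: string };
type Patch = Partial<Omit<Expense, "id">>;
type QueuedOp =
  | { kind: "create"; expense: Expense }
  | { kind: "update"; id: string; patch: Patch }
  | { kind: "delete"; id: string };

function opId(op: QueuedOp): string {
  return op.kind === "create" ? op.expense.id : op.id;
}

// Returns a new queue with `next` merged into any pending op on the same id.
function enqueue(queue: QueuedOp[], next: QueuedOp): QueuedOp[] {
  const id = opId(next);
  const index = queue.findIndex((op) => opId(op) === id);
  if (index === -1) return [...queue, next];

  const prev = queue[index];
  const replaceWith = (op: QueuedOp) => queue.map((q, i) => (i === index ? op : q));

  if (next.kind === "update" && prev.kind === "create") {
    // Edit of an unsynced row: fold the patch into the create itself.
    return replaceWith({ kind: "create", expense: { ...prev.expense, ...next.patch } });
  }
  if (next.kind === "update" && prev.kind === "update") {
    // Two edits in a row: keep one update carrying the combined patch.
    return replaceWith({ kind: "update", id, patch: { ...prev.patch, ...next.patch } });
  }
  if (next.kind === "delete" && prev.kind === "create") {
    // The row never reached the server, so there is nothing to delete there.
    return queue.filter((_, i) => i !== index);
  }
  if (next.kind === "delete" && prev.kind === "update") {
    // The row exists on the server; the pending edit no longer matters.
    return replaceWith(next);
  }
  // Other combinations are left unmerged in this sketch.
  return [...queue, next];
}
```

Create, edit, edit leaves a single create with the final values. Add a delete on top and the queue is empty again.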

"No network" and "no" are two different answers

This is the part I got wrong on the first attempt, and it is the part that decides whether your queue is trustworthy or just a nice place for data to go missing.

A request can fail in two completely different ways.

One, it never left the phone. No signal, radio off, whatever. Nothing is wrong with the operation itself. Retrying later is exactly right.

Two, it left, reached the server, and the server said no. Bad payload, household deleted, something genuinely broken. Retrying this forever is pointless, because the answer will be no every single time.

So there is a check now that splits the two. A transport failure ends the drain immediately and everything stays queued with no attempt counted against it, because punishing an entry for the basement having no signal makes no sense. A real rejection from the server counts as an attempt, and after five of those the operation gets marked as blocked and shown to you in the app, with a retry button and a discard button.

The rule I set for myself was simple. Nothing disappears silently. If the app cannot save something, you get told, and you decide what happens to it.

An expense tracker that loses one entry without saying anything is worse than one that never worked at all. At least the broken one is honest.
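The split can be sketched like this. How the app actually detects each case is not shown in this post, so the SendResult type and the send callback are assumptions, and so is the choice to keep draining the remaining operations after a rejection. The limit of five comes from the rule above.

```typescript
type Op = { id: string; attempts: number; blocked: boolean };

// Hypothetical result of one send attempt.
type SendResult =
  | { outcome: "ok" }
  | { outcome: "transport-failure" } // the request never left the phone
  | { outcome: "rejected" }; // the server answered, and the answer was no

const MAX_ATTEMPTS = 5;

// Runs one pass over the queue and returns what is still pending afterwards.
async function drain(
  queue: Op[],
  send: (op: Op) => Promise<SendResult>,
): Promise<Op[]> {
  const remaining: Op[] = [];
  for (let i = 0; i < queue.length; i++) {
    const op = queue[i];
    if (op.blocked) {
      // Blocked ops wait for the user to retry or discard them.
      remaining.push(op);
      continue;
    }
    const result = await send(op);
    if (result.outcome === "ok") continue;
    if (result.outcome === "transport-failure") {
      // No network: stop here and keep this op and everything after it
      // exactly as it was, with no attempt counted.
      return remaining.concat(queue.slice(i));
    }
    // A real rejection counts as an attempt; too many and the op is blocked.
    const attempts = op.attempts + 1;
    remaining.push({ ...op, attempts, blocked: attempts >= MAX_ATTEMPTS });
  }
  return remaining;
}
```

Note that no branch drops an operation unless the server accepted it. Everything else either stays queued or ends up blocked where the user can see it.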

I did not install a connectivity library

The usual move here is to add a module that tells you when the network is back, and drain the queue on that signal. I skipped it.

Two reasons, and honestly the second one mattered more to me.

First, a failed request already tells you everything that check would have told you. If the send fails at the transport layer, you have no network. You just learned it without asking anyone. A reachability API is a second opinion on a question you already answered.

Second, adding a native module to an Expo app means a proper store build. Skipping it meant the whole thing could go out over the air as a JS bundle, and everybody at home would get it by simply reopening the app. No sending an APK around, no explaining to four people why they need to install something again on a Sunday. Before publishing I checked the native surface for anything that had crept in. It came back empty, so the release went out as an over-the-air update.

So instead of a connectivity listener, the drain runs on three triggers. When you queue something. When the app comes back to the foreground. And on a timer while anything is still pending.

Overlapping triggers were an obvious trap, so the drain holds on to its own running promise. A second trigger firing mid-drain joins the pass already running instead of starting another one and sending everything twice.
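The "holds on to its own running promise" part is a small pattern on its own. A minimal sketch, with singleFlight as a made-up name for the wrapper:

```typescript
// Wraps an async function so that overlapping calls share one in-flight run.
function singleFlight<T>(run: () => Promise<T>): () => Promise<T> {
  let inFlight: Promise<T> | null = null;
  return () => {
    if (inFlight === null) {
      // First trigger starts the pass and remembers its promise.
      inFlight = run().finally(() => {
        // Clear it when the pass settles so the next trigger starts fresh.
        inFlight = null;
      });
    }
    // Any trigger that fires mid-pass gets the same promise back.
    return inFlight;
  };
}
```

Wrap the drain once, and the queue trigger, the foreground trigger and the timer can all call the wrapped function without caring whether a pass is already running.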

Then I asked myself, does this sync with a delay?

Turned out to be a better question than I expected.

When you are online there is no delay at all. Queueing an operation writes it to the store and starts the send on the same tick. The screen does not wait for it, which is what makes saving feel instant, but the request itself leaves right away.

When you are offline, the answer was that timer. I had set it to twenty seconds, which sounded perfectly sensible while typing it and felt terrible while actually using the app. You come out of the basement, signal returns, and you sit there staring at a pending pill for what feels like ages. I brought it down to five.

But the more useful part of that question was the answer I did not enjoy giving.

There is no background sync. That timer is a JavaScript interval, and in React Native those get slowed down or suspended when the app is not in front of you. On iOS they are suspended outright. So if your phone finds signal while the app is closed in your pocket, nothing happens at all. The foreground trigger is what actually catches it the next time you open the app.

The manual used to say the app "retries by itself", which is technically true and practically a lie. I changed it to say plainly that retrying happens while the app is open, and that a phone which gets signal back with the app closed will sync on the next launch with nothing lost.

Writing down the limit you cannot fix is more useful than pretending it is not there. People will find it either way. The only thing you get to choose is whether they find it in your documentation, or in the one moment they really needed the app to have worked.

What I would tell myself before starting

If you are about to make an app work without a network, the queue is not the project. Save your energy for the other three things.

Reads have to know about queued writes, otherwise your app will lie to you in the calmest possible voice.

Ids have to come from the device, otherwise you cannot touch your own entry until the server blesses it.

And failures have to be sorted into "not your fault, try later" and "genuinely broken, please look at this", because putting both into one retry loop is exactly how data goes missing.

The thing that made the app feel fast was not the sync engine at all. It was that saving stopped being a request to somebody else. It became a write to my own phone that happens to travel later.

And the settlement at the end of the week is finally telling the truth, which was the whole point of building this thing.

That is all I had on this one. If you made it this far, genuinely, thank you. See you in the next one, where I will most probably be complaining about something else that broke.

A Field Guide to Open Source Cold Emails

Vineeth N K — Wed, 22 Jul 2026 13:36:53 +0000

TL;DR - My open source repos started attracting cold emails. Three of them, three completely different species: a growth-marketing spammer with broken mail-merge, a well-documented identity scam, and an AI politely doing outreach at scale for a real project. The first two are easy once you know the tells. The third one is the interesting problem, because almost everything in it is genuine except the part where a human supposedly wrote it.

I maintain a handful of small open source projects. One of them is a self-hosted password manager sitting at exactly one GitHub star. One star. And that one is probably me.

So you can imagine my surprise when the emails started coming in. Actual emails, addressed to me, about my repos. For a moment it felt like the projects had made it.

They had not. What I actually had was three different strangers, with three very different motives, all of whom had found my repos through some kind of automation. And picking them apart turned out to be genuinely fun. So here is the field guide, one specimen at a time.

Specimen one: the growth hacker who could not finish a sentence

The first email was flattering for about four seconds. Someone "found my project while browsing TypeScript projects" and wanted to help it grow. The line that broke the spell was this one:

"Something built to zero-knowledge password vault. Self-hosted, end-to-end encrypted, open source. has genuine utility."

Read that again. That is my GitHub repo description, pasted into a mail-merge template mid-sentence, grammar and all. Nobody who actually looked at the project wrote that sentence. A script scraped my description, jammed it into a template, and the template did not even bother to make it fit.

The rest followed the standard shape. A mild neg to create urgency ("most projects at 1 star stay there forever" - rude, but fair). A vague promise of reach through Reddit, Discord, and developer forums. And the classic foot-in-the-door ask: "Mind if I share a short plan?" The plan, of course, is where the invoice lives.

Here is the part that actually matters though. What they were selling is astroturfing - posting about your project in communities as if it happened organically. And my project is a password manager. The entire value of a password manager is trust. One "this is being astroturfed" comment thread on Reddit would outlive any stars it ever bought me. For a security tool, paid shilling is not just useless marketing, it is anti-marketing.

Verdict: delete, do not reply. Replying only confirms your address is live, and these campaigns run on automated follow-up sequences anyway.

Specimen two: the very generous stranger from Japan

The second email arrived dressed as a collaboration opportunity. A developer based in Japan, ten plus years of experience, prominent companies, the local software market is facing challenges, and my GitHub profile "inspired" them to reach out. Would I like to collaborate and generate mutual revenue?

Notice what is missing: any mention of anything I have built. The first spammer at least scraped my repo description. This one only needed my email address to exist.

This is not garden-variety spam. It is a documented scam template. There is an entire Hacker News thread about it, plus GitHub community reports, with near-identical emails going around for a long time now - same structure, same "market faces challenges" line, rotating Japanese names. The pitch, if you engage, is that you become the client-facing partner. Your identity, your freelance accounts, your bank account, your face on calls, while they quietly do the work behind you for a revenue split.

What you would actually be doing is fronting for someone hiding their real identity and location. This pattern is strongly associated with North Korean IT worker operations. The risks on your side are not "wasted time". They are identity fraud and money laundering exposure. That escalated quickly, no?

I did ask myself the obvious question: what happens if I just click the link and look at their portfolio site? Short answer, almost certainly nothing dramatic. The site exists to make the persona look credible when you google them, not to attack your browser. But visiting still tells their server you are alive and curious, which moves you up the follow-up list. There is nothing to gain. The danger with these was never the link. It is the conversation that follows.

Verdict: report as phishing, not just spam. And if your commit email is public on GitHub, this is your sign to switch on the private noreply address, because that is almost certainly where they harvested you.

If you maintain anything public on GitHub, go open your spam folder right now. I would bet money at least one of these two is already sitting in there.

Specimen three: the polite robot with a real repo

The third email is the one that earned this blog post.

It was about a different project of mine, a curated collection of MCP servers. And this email was good. It named the project. It listed the actual services the collection covers. It asked a genuinely substantive architecture question about where a certain kind of security boundary should live - in the collection's metadata, inside each server, or in the agent runtime. It linked a real open source project the sender was building, with regular commits, tests, and examples.

No money ask. No identity ask. No broken grammar. A real person with a real repo asking a real question.

Except.

A quick GitHub search showed the same person had opened near-identical "question" issues across at least fifteen different MCP-related repos. Filesystem servers, sub-agent frameworks, alarm systems, and my personal favourite, an MCP server for Garry's Mod. Same template every time, with per-repo details filled in. In my case the service list was lifted straight from my README, with a couple of entries trimmed off the end to make it look hand-picked.

So the whole thing was almost certainly LLM-generated outreach at scale. Crawl repos, summarise each README, generate a plausible thoughtful question, hope maintainers engage with the linked project and eventually adopt it or link back to it. Engagement farming, but wearing a lab coat.

And here is what makes this specimen tricky: it is not malicious. The project being promoted is real. The question is even worth answering - the honest answer involves MCP tool annotations and would take three sentences. This is just what the first spammer's email looks like after someone hands the same job to a much better writer. The tells did not disappear, they moved. You can no longer find them in the grammar. You find them in the sender's activity across the rest of GitHub.

Verdict: ignoring is fully defensible, since mass outreach earns no reply obligation. If the question genuinely interests you, answer it in public on a GitHub discussion instead of over email. The one thing to avoid is being gently nudged into adding someone's dependency to your project because the email flattered your README.

The actual field guide

Boiling all three down, here is what I now check before spending any emotion on a cold email:

Does it quote my repo back at me with broken seams? Scraped description, mangled grammar, details that almost fit. That is mail-merge. Delete.
Does it mention nothing I have built? Pure profile-scrape flattery plus a vague revenue offer is a scam shape, and the well-documented ones escalate to identity fronting. Report as phishing.
Does it look genuinely hand-written? Trust, but search. Check the sender's public activity for the same message sent everywhere. In the LLM era, the writing quality tells you nothing. The distribution pattern tells you everything.
Is my commit email public? If yes, that is the tap these campaigns drink from. GitHub's private noreply email closes it for future commits.

The uncomfortable takeaway is that rule three is only going to get harder. The badly-glued template email is a dying species. What replaces it reads like a thoughtful peer, cites your own work accurately, and asks questions you would actually enjoy answering. The only durable signal left is behaviour at scale, and checking that takes more effort than most of us will spend on a random Tuesday email.

That is pretty much it from my side today. Let me know what you think, or if your own one-star repo has been getting fan mail too - those stories are always the best ones. See you soon in the next blog.

The image model that only runs on Apple silicon

Vineeth N K — Sun, 19 Jul 2026 13:28:39 +0000

TL;DR - Ollama's image generation does not run on the llama.cpp engine. It runs on MLX, which is Apple's framework, which means Apple silicon only. Linux and Windows builds simply do not ship the library. The fix is not a fix, it is a different machine.

So there I was, pretty excited. Ollama had shipped local image generation, I had a perfectly good Ubuntu server sitting there doing nothing much, and I thought this was going to be a nice evening. Pull a model, type a prompt, get a picture. How hard can it be.

ollama run x/flux2-klein "neon-lit street at night, photorealistic"

And what I got back was this:

Error: 500 Internal Server Error: mlx runner failed: Error: failed to
initialize MLX: failed to load MLX dynamic library (searched:
[/usr/local/lib/ollama /build/lib/ollama /dist/linux-amd64/lib/ollama
/dist/linux_amd64/lib/ollama]) (exit: exit status 1)

My first reaction was the same as yours probably would be. Broken install. Missing package. Some library I forgot to apt-get. I was already mentally writing the ldconfig command.

That was the wrong instinct, and it cost me a good bit of the evening.

Read the error properly, not emotionally

Look at that error again. Really look at it.

It is not saying "MLX is broken". It is saying it went looking for an MLX library in four different folders and found nothing. And one of those folders is literally named dist/linux-amd64/lib/ollama.

That is the whole answer sitting right there in the path name. The Linux build of Ollama has a slot where the MLX library should go, and that slot is empty, because Linux builds do not ship one. There is nothing to install. There is no package. The thing I was looking for was never made for the machine I was on.

Every dev who has spent an evening reinstalling something that was never installable is nodding right now.

What MLX actually is

Here is the part I did not know, and it explains everything.

When you run a normal text model on Ollama - your gemma, your llama, whatever - it goes through the usual engine that runs on basically anything. CPU, NVIDIA GPU, Apple GPU, does not matter much. That is the Ollama most of us know.

Image models do not use that engine at all. They use a completely separate runner built on MLX, which is Apple's own machine learning framework. And in the builds Ollama ships, that runner talks to Apple's GPU through Metal. That is not a preference or an optimisation. It is the only backend the runner comes with.

So it is not "Ollama does not support Linux image generation yet" in the sense of a missing feature. It is more like the feature was built on a foundation that only exists on one platform. Ollama's own announcement puts it plainly: image generation works on macOS, with Windows and Linux "coming soon". That post went up in January. It is July now and that sentence has not changed.

Proving it instead of trusting it

I do not like taking a blog post's word for it, even an official one. So once I had Ollama installed on my Mac, I went digging into the app bundle to see whether the MLX story actually held up.

find /Applications/Ollama.app -iname "*mlx*"

And there it was:

/Applications/Ollama.app/Contents/Resources/mlx_metal_v3/libmlx.dylib
/Applications/Ollama.app/Contents/Resources/mlx_metal_v3/mlx.metallib
/Applications/Ollama.app/Contents/Resources/mlx_metal_v4/libmlx.dylib
/Applications/Ollama.app/Contents/Resources/mlx_metal_v4/mlx.metallib

Those .metallib files are the giveaway. They are compiled Metal shaders - GPU code written in Apple's shading language, for Apple's GPU. There are two versions, v3 and v4, presumably because different generations of Apple hardware want different Metal targets.

You cannot ship that to Linux. There is no Metal on Linux to ship it to.

The other small thing I noticed - on the Mac, /usr/local/bin/ollama is just a symlink pointing into the app bundle:

/usr/local/bin/ollama -> /Applications/Ollama.app/Contents/Resources/ollama

Which is a neat little detail, because it means the CLI is the binary that ships inside the app bundle. The Mac app is not a wrapper around a separate install. That is why installing the desktop app is enough, and why hunting for a Homebrew formula (which is what I tried first, obviously) is a dead end.

So what actually works

On the Mac side, it just runs. My machine is an M4 Air with 24 GB, and the 4B model renders comfortably on a laptop that does not even have a fan. There are two models to know about:

x/z-image-turbo - 6B, from Alibaba's Tongyi Lab, Apache 2.0, good at photorealistic stuff
x/flux2-klein - from Black Forest Labs, comes in 4B (Apache 2.0) and 9B (non-commercial licence)

I went with flux2-klein:4b because the licence is clean and it is fast.

Now, one thing that tripped me up when I tried to script it. The CLI renders the image inline in your terminal. It looks lovely and it writes absolutely nothing to disk. Pipe it somewhere and you get nothing useful. So for anything automated, go through the HTTP API:

# note the field is "image", singular - I lost a few minutes to that one
curl -s http://127.0.0.1:11434/api/generate \
  -d '{"model":"x/flux2-klein:4b","prompt":"your prompt","stream":false,
       "width":1216,"height":640,"steps":4}' \
  | jq -r '.image' | base64 -d > hero.png

Two gotchas packed in there. The response field is image, not images - singular, which is the opposite of what every other image API has trained you to expect. And a full render takes a couple of minutes, so if you are calling this from any tool with a default timeout, raise it. Otherwise you get a mysterious empty file and start blaming the model.
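If you would rather call it from a script than from curl, both gotchas can be handled in code. This is a sketch: the endpoint, request fields and the image response field are copied from the curl call above, while the function names and the ten-minute timeout are my own assumptions. It relies on the global fetch, atob and AbortSignal.timeout found in current Node and browsers.

```typescript
const OLLAMA_URL = "http://127.0.0.1:11434/api/generate";

// Pulls the base64 payload out of the response body and fails loudly
// if the field is missing, instead of producing an empty file.
function extractImageBase64(body: unknown): string {
  const image = (body as { image?: unknown } | null)?.image;
  if (typeof image !== "string" || image.length === 0) {
    throw new Error('response has no "image" field (singular, not "images")');
  }
  return image;
}

function decodeBase64(b64: string): Uint8Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

// timeoutMs defaults to ten minutes, an arbitrary but generous value.
async function generateImage(prompt: string, timeoutMs = 600000): Promise<Uint8Array> {
  const response = await fetch(OLLAMA_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "x/flux2-klein:4b",
      prompt,
      stream: false,
      width: 1216,
      height: 640,
      steps: 4,
    }),
    // Explicit timeout so a slow render is not cut off by a short default.
    signal: AbortSignal.timeout(timeoutMs),
  });
  if (!response.ok) throw new Error(`Ollama returned HTTP ${response.status}`);
  return decodeBase64(extractImageBase64(await response.json()));
}
```

The bytes that come back from generateImage are the PNG itself, so writing them to a file gives the same result as the curl pipeline.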

The hero image at the top of this post came out of exactly that command, by the way. Photorealistic, generated on the same laptop I typed this on, no API key, no credits, no upload.

The bit I actually want you to take away

The lesson here is not about Ollama. Ollama will ship Linux support eventually and this whole post becomes a historical footnote.

The lesson is that I spent an evening trying to fix an install that was not broken. And the information I needed to stop doing that was printed in the error message, in the first line, in a folder path that said linux-amd64 right next to a thing called MLX. Two words I could have connected in about ten seconds if I had read instead of reacted.

But no. Error appears, brain says "broken install", hands start typing apt. That reflex has saved me plenty of times, which is exactly why it is dangerous - it fires before you have actually looked at anything.

Now when something fails on a machine, my first question is not "what is missing here". It is "was this ever supposed to work here at all". Different question. Much cheaper to answer.

Quick reference, if you landed here from the error

Getting failed to load MLX dynamic library on Linux or Windows? Nothing is broken. Image generation is macOS-only right now, there is an open issue for it (ollama/ollama#16876), and reinstalling will not help.
Have a Mac with Apple silicon? Install the desktop app, pull x/flux2-klein:4b, done.
Only have a Linux box with an NVIDIA card? Use ComfyUI or Hugging Face diffusers instead. Same models, different runner, actually supported there.
Text models are unaffected. Your gemma and llama setups on Linux are completely fine. This is an image-generation-only wall.

That is where I will stop. If you have found a sensible way to run these image models on a Linux box without going the ComfyUI route, I would genuinely like to hear it - drop me a note. Otherwise, see you when the next interesting problem shows up.

Why Does Your AI Agent Forget Things Halfway Through?

Vineeth N K — Sat, 18 Jul 2026 18:18:04 +0000

TL;DR - Agents forget not because their memory is weak, but because the session ran well past the job it was opened for. I went back through 210 of my own sessions across every project. Only four ever ran out of room, and those four were exactly my four longest. The fix was not a smarter summary or a cleverer prompt. It was ending sessions sooner.

You know that moment when your agent calmly suggests the exact thing you both ruled out a while back? Not a hallucination. Not the wrong file. Just a polite, confident proposal to do the one thing you already decided against, delivered in the same tone as everything else it says.

That happened to me not too long ago, and it sent me down a small rabbit hole.

The boring version of the story

I was deep into a long session on the new backend, sorting out a pricing endpoint. We had gone back and forth on it properly. The choice was between calling the existing dynamic pricing routine directly, or building a new JWT authenticated endpoint that wraps that routine internally. We went with the second one, and there were actual reasons behind it.

Then auto compaction fired.

The summary that came out the other side kept the decision. New endpoint, JWT, wraps the existing routine. All of that survived intact. What did not survive was the why. The alternatives we had considered, and the specific reasons each one lost, got compressed into nothing.

Now here is the part that stuck with me. The agent kept following the decision. It just could not defend it anymore. So the moment I pushed back even slightly, it started sliding toward the option we had already thrown out, because from where it stood there was no longer any reason not to.

I caught it quickly. I have one rule I genuinely do not bend, which is never assume anything and always check against real payloads and real responses. That rule is what surfaced it. And honestly? The whole thing cost me almost nothing. Mild irritation, one re-explanation, back to work.

Tiny damage. But it bothered me more than the size of it deserved, because I realised I had no idea how often this was quietly happening.

So I went and counted

I pulled up my whole session history. Every project, work and personal, the whole lot. 210 sessions.

Median session: 19 messages from me
Ninetieth percentile: 174
Longest: 1704
Sessions that ever hit auto compaction: four

Four out of 210. Lower than I expected, and I will take it. But the number that actually made me sit up straight was a different one.

Those four compacted sessions were exactly my four longest sessions. 1704, 588, 574, 546. Nothing else in the entire history came anywhere near.

That is not coincidence and it is not bad luck. Compaction is not a random hazard that strikes when the model is in a mood. It is what happens when a session keeps running long after the job it was opened for finished.
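If you want to run the same check on your own history, the arithmetic is small. This sketch assumes you can get each session down to a message count and a flag for whether it compacted; the Session shape and function names are hypothetical, and how you extract those records depends on your tooling.

```typescript
// Hypothetical record: one entry per session.
type Session = { messages: number; compacted: boolean };

// Median of a non-empty list of numbers.
function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// True when every compacted session is longer than every session that
// never compacted, i.e. the compacted ones are exactly the longest ones.
function compactedAreTheLongest(sessions: Session[]): boolean {
  const compacted = sessions.filter((s) => s.compacted);
  if (compacted.length === 0) return false;
  const shortestCompacted = Math.min(...compacted.map((s) => s.messages));
  return sessions.every((s) => s.compacted || s.messages < shortestCompacted);
}
```

If compactedAreTheLongest comes back true for your history as well, the problem is session length and not the summariser.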

Compaction is a symptom, not a tool

Most advice about context management is really advice about surviving a full window. Summarise better. Prune the history. Write a tighter system prompt. Learn the magic incantation that makes the summary keep the good bits.

All of that treats a full window as a fact of life you work around. I do not think it is. In my own history it is a rare event that correlates almost perfectly with one specific mistake, which is letting a session outlive its task.

So my whole approach shifted from managing context to not needing to manage it. If the window never fills, there is nothing to compress, and nothing to lose in the compression. The summary that never runs cannot drop your reasoning.

The way you get there is not clever prompting. It is boring old scoping.

One ticket, one worktree, one session

This is the rule, and it is genuinely the whole thing.

One ticket gets one git worktree. That worktree gets one session. When the ticket is done, the session dies with it. I have twenty worktrees sitting on disk right now, most of them with their own small _docs folder, and each one had a session that started and finished inside that boundary.

The unit of work decides where the session ends. Not the context meter. Not a warning banner. The job itself.

What this buys you is that the session never has to hold two jobs at once. It never accumulates the debris of a thing you finished a while ago and stopped caring about. The median session being 19 messages is not discipline on my part, it is just what happens when the boundary is drawn somewhere sensible.

Anyone who has watched an agent confidently reference a file from a task they abandoned earlier that same sitting knows precisely which failure this prevents.

What lives in a file instead

If the session is short, the knowledge has to live somewhere that outlasts it. That somewhere is the filesystem.

CLAUDE.md is a catalogue, not documentation. This distinction took me a while to get right. It describes the project: structure, architecture, conventions, rules, guidelines. It is what the agent needs to know about the shape of the place before it touches anything. It is not the repo's documentation and it should never try to be. My global one sits at 229 lines. The project ones range from a single line up to 621, and the big ones are big because those projects genuinely have that many conventions worth stating, not because I dumped the docs in there.

Then the actual documentation, separately. One work repo has 123 markdown files in its knowledge base folder. Two others have 64 and 45 in their docs trees. That material is real and useful, and almost none of it belongs in permanent context. It gets read when it is relevant to the task at hand, and ignored the rest of the time.

Memory files for facts that survive sessions. Sixteen projects have a memory folder. One fact per file, with a small index file on top. Things like a build quirk, a preference I have stated once and do not want to state again, a decision that outlives the ticket that produced it.

Skills for anything I do more than twice. Fifteen of them now. A skill is a workflow the agent loads when it needs it, rather than instructions I paste every time.

Subagents for anything wide. 44 of my sessions fanned out to subagents, 214 runs in total. When something needs a broad sweep across many files, that search runs in its own window and comes back with the conclusion. The searching does not pollute the session that asked for it.

Every one of those is the same move. Keep it out of the always-on context, and pull it in only when it earns its place.

When a long session is genuinely fine

I do not want to turn this into a rule that pretends to have no exceptions.

My longest session, the 1704-message one, was an autonomous build on a side SaaS project. Long stretches of work with a stable goal, where I actually did want continuity across the whole thing. It compacted, and that was the correct outcome for the shape of the work.

The difference is whether the length comes from the task genuinely being that long, or from the session drifting into a second and third task that should have started fresh. The first one is fine. The second one is where things quietly go wrong.

What I actually do now

Ticket opens, worktree gets created, session starts. Ticket closes, session gets closed with it, and anything worth keeping goes into a file before it does. If the subject changes, that is a new session, no matter how much room is left on the meter.

The uncomfortable part of writing this was seeing how little the actual incident cost me. I keep wanting to tell it as a bigger disaster. It just was not one. A decision survived compaction while its reasoning did not, I noticed within a few exchanges, and I moved on with my day.

But that is exactly why it was worth chasing down. The cheap failures are the ones you get to learn from without paying for them, and the numbers behind that one turned out to be a much better argument than the story itself.

So that is where I will leave it. If you scope your agent work differently, or you have found a case where a long session genuinely beats a short one, I would honestly like to hear about it. Otherwise, see you when the next interesting problem turns up.

The .de domain I unblocked just to redirect it

Vineeth N K — Mon, 13 Jul 2026 12:40:37 +0000

TL;DR: I own vinelabs.de, a German domain. It went into a blocked state because DENIC could not verify my holder data, which is a real thing when you are not a German owner. A while later the block was lifted. And after all that, what did I do with the freshly freed domain? I did not build anything on it. I pointed it at another domain I own, vinelab.in, with a Cloudflare redirect that runs entirely at the edge. This is the story, plus the one small DNS trick that makes the redirect work without ever touching a server.

So here is a fun thing nobody tells you when you buy a .de domain from outside Germany.

DENIC, the registry that runs .de, actually cares who you are. Not in a vague terms-of-service way. In a "we need to verify the holder data on this domain and until we do, it is going nowhere" way. I am an Indian developer. I bought a German domain because the brand I want to put on some of my work sits in the DE and EU lane. All perfectly legitimate. But to DENIC I was just a holder record that did not fully check out yet, and so the domain sat there. Registered, mine, and blocked.

If you have ever owned something official in a country you do not physically live in, you know the particular flavour of low-grade paperwork dread that comes with it. It is never dramatic. It is just a quiet "please confirm your details" that sits on your mind for longer than it should.

What "blocked" actually means for a .de

This is not the same as the domain being taken away. The registration was fine. What was not fine was the holder data, the name and address tied to the domain, which DENIC could not verify to their satisfaction. So the domain went into a state where it exists on paper but does not do anything useful. You cannot really point it anywhere while it is in that limbo.

The fix was not clever on my side. The holder data got verified, and one fine day I got the email from DENIC saying the block had been lifted. That was the whole resolution. No war story, no escalation, no support ticket saga. Just a verification going through and an email landing in my inbox.

I did not fully trust the email though. Emails lie, or at least they get ahead of reality. So I went to check for myself.

Checking it myself, because one email is not proof

The nice thing about .de is that DENIC exposes a real whois. So I asked it directly instead of believing a notification.

whois vinelabs.de

The line that mattered:

Domain:   vinelabs.de
Status:   connect
Nserver:  dns1.registrar-servers.com
          dns2.registrar-servers.com

connect is the healthy state. In DENIC terms it means the domain is registered and properly connected to the network, which is exactly the boring, working status you want. It is not failed, which is the soft-blocked state where the nameservers or the holder data are not right. And it is not free, which would mean nobody owns it. So connect was the green light.
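
If you would rather script that check than eyeball it, the status line is easy to pull out. This is only a minimal sketch: it assumes the whois text keeps the "Key: value" layout shown above, and parseDenicStatus is a made-up helper name, not part of any tool. It only knows the three states mentioned here and reports anything else as unknown.

```typescript
// Hypothetical helper: extract the Status field from DENIC-style whois text.
// Assumes "Key:   value" lines, as in the output quoted above.
type DenicStatus = "connect" | "failed" | "free" | "unknown";

function parseDenicStatus(whoisText: string): DenicStatus {
  for (const line of whoisText.split(/\r?\n/)) {
    const match = /^\s*Status:\s*(\S+)/i.exec(line);
    if (!match) continue;
    const value = match[1].toLowerCase();
    if (value === "connect" || value === "failed" || value === "free") {
      return value;
    }
    return "unknown"; // a status this sketch does not know about
  }
  return "unknown"; // no Status line at all
}

const sampleWhois = [
  "Domain:   vinelabs.de",
  "Status:   connect",
  "Nserver:  dns1.registrar-servers.com",
].join("\n");

console.log(parseDenicStatus(sampleWhois)); // "connect"
```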

The other thing I noticed: no A record, no AAAA record, no www CNAME. Nothing was pointed anywhere. The domain was active but completely empty.

Which, honestly, was perfect. An active domain with zero records is a clean slate. Nothing to collide with, nothing to migrate, nothing to break. I could point it wherever I wanted.

The big decision: build nothing

Here is where I will be honest with you, because that is the whole point of this blog.

After all that verifying and waiting and checking whois like a nervous parent, I did not build a site on vinelabs.de. No landing page. No product. No grand launch to justify the wait.

I already have vinelab.in running. That one is live, sitting behind Cloudflare, serving cleanly over HTTPS on both the apex and www. So the sensible thing, the thing that took the least effort and made the most sense, was to just send vinelabs.de over to it. One domain, one destination, done.

So the payoff for unblocking a German domain was a redirect. That is it. A 301.

Do you also do this, where you fight to unlock some capability and then use it for the most modest possible thing? Because I felt slightly silly, and slightly pleased, at the same time.

The actual trick: redirect at the edge, touch no server

Now the interesting part, because there is one small thing here worth stealing.

A redirect has to be answered by something. Normally you think, okay, I need a tiny server somewhere that receives the request and replies with "go over there". But I did not want to run a box just to bounce traffic. That is silly for a redirect.

Cloudflare can do the whole thing at its edge, and the way you set it up looks slightly weird the first time. You give the domain a DNS record that deliberately points at an IP no real host will ever have.

I added vinelabs.de to Cloudflare on the free plan, same account as vinelab.in. The DNS scan found nothing to import, which I already knew. Then I added two records:

A     @     192.0.2.1      Proxied (orange cloud)
CNAME www   vinelabs.de    Proxied (orange cloud)

That 192.0.2.1 is not a typo and it is not my server. It is a reserved address from a block the internet standards set aside for documentation and testing (TEST-NET-1, from RFC 5737). It is guaranteed to never be a real host. So why point at it?

Because the request never actually gets there. With the orange cloud on, Cloudflare sits in front of that record. The request hits Cloudflare's edge, the redirect rule fires, and the visitor gets bounced before anything ever tries to reach the origin. The dummy IP is just a placeholder so the DNS record exists and the proxy has something to attach to. If I left the cloud grey instead of orange, Cloudflare would step out of the way and hand out 192.0.2.1 as the real DNS answer, so browsers would try to connect to it directly and just hang. The orange cloud is the whole trick.
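
If you want a guard against someone later swapping the placeholder for a real address by accident, the documentation ranges are small enough to check by hand. A rough sketch, assuming the three IPv4 blocks RFC 5737 reserves (192.0.2.0/24, 198.51.100.0/24, 203.0.113.0/24); isDocumentationIPv4 is a hypothetical name.

```typescript
// The three documentation-only IPv4 blocks from RFC 5737, all /24,
// so a prefix match on the first three octets is enough.
const DOCUMENTATION_PREFIXES = ["192.0.2.", "198.51.100.", "203.0.113."];

function isDocumentationIPv4(address: string): boolean {
  const octets = address.split(".");
  const wellFormed =
    octets.length === 4 &&
    octets.every((octet) => /^\d{1,3}$/.test(octet) && Number(octet) <= 255);
  return (
    wellFormed &&
    DOCUMENTATION_PREFIXES.some((prefix) => address.startsWith(prefix))
  );
}

console.log(isDocumentationIPv4("192.0.2.1")); // true
console.log(isDocumentationIPv4("192.0.20.1")); // false
```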

Then I moved the nameservers. At Namecheap, where the domain is registered, I swapped from registrar-servers.com over to the two Cloudflare nameservers it handed me (coen.ns.cloudflare.com and collins.ns.cloudflare.com). Since DENIC had just verified the holder data, this went through without any fuss.

The redirect rule itself

With the domain in Cloudflare and active, the redirect is one rule.

In Rules, Redirect Rules, a new rule that matches the hostname, then sends it on:

When incoming requests match:
    Hostname equals vinelabs.de
    OR Hostname equals www.vinelabs.de

Then:
    Type: Dynamic redirect
    Expression: concat("https://vinelab.in", http.request.uri.path)
    Status code: 301
    Preserve query string: ON

The concat is what keeps paths alive. Instead of dumping everyone on the homepage, it takes whatever path came in and sticks it onto vinelab.in. So vinelabs.de/foo becomes vinelab.in/foo, not vinelab.in. Preserve query string keeps the ?something=value bits too. Little details, but they are the difference between a redirect that respects the link someone clicked and one that throws it away.
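
To make the behaviour concrete, here is roughly what that rule computes, modelled as a plain function. It is a sketch of the effect, not of how Cloudflare evaluates the rule, and redirectTarget is a name I made up for it.

```typescript
// Hypothetical model of the redirect rule: swap the host, keep path and query.
function redirectTarget(
  incomingUrl: string,
  destinationOrigin = "https://vinelab.in",
): string {
  const url = new URL(incomingUrl);
  // pathname is "/" when no path was given; search is "" or "?x=1"
  return destinationOrigin + url.pathname + url.search;
}

console.log(redirectTarget("https://www.vinelabs.de/foo?x=1")); // https://vinelab.in/foo?x=1
console.log(redirectTarget("http://vinelabs.de")); // https://vinelab.in/
```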

Last bit, HTTPS. Under SSL/TLS I set the mode to Full and turned on Always Use HTTPS. Universal SSL issued a certificate for the apex and www on its own shortly after. So even a plain http:// request gets forced up to https:// and then redirected, with a valid cert the whole way. No browser warnings before the bounce.

Trust, but verify (with dig and curl)

I was not going to write "it works" without actually watching it work. So, the checks.

Nameservers first:

dig +short NS vinelabs.de
# coen.ns.cloudflare.com.
# collins.ns.cloudflare.com.

Delegation had propagated. Then the redirect itself, every case I could think of:

curl -sI https://vinelabs.de
# 301 -> location: https://vinelab.in/

curl -sI http://vinelabs.de
# 301 -> location: https://vinelab.in/   (plain HTTP forced up and redirected)

curl -sI 'https://www.vinelabs.de/foo?x=1'
# 301 -> location: https://vinelab.in/foo?x=1   (path and query both preserved)

Every case behaved. Apex and www, both HTTP and HTTPS, path carried through, query string carried through, and the server: cloudflare header confirming it was all happening at the edge and never at some origin. The https:// calls completing at all told me the certificate was real, because curl would have failed the handshake otherwise.

That is the full chain working. A German domain that spent a while blocked over holder data, now quietly forwarding every request to vinelab.in, and not a single server involved in doing it.

What I actually took away from this

Two small things stuck with me.

One, owning a .de from outside Germany comes with a verification step that can freeze the domain, and there is nothing you can do to rush it. It is not a bug and it is not personal. DENIC just wants to know the holder is real. Once that clears, the domain behaves like any other. Knowing that ahead of time would have saved me some quiet worrying.

Two, a redirect does not need a server. The reserved dummy IP plus an orange cloud plus one rule is enough to forward an entire domain, forever, for free, with valid TLS and no box to maintain. I keep being a little surprised by how much you can do at the edge with nothing running behind it.

And the domain I waited on? It is a signpost now. Points at vinelab.in and gets on with its life. Sometimes the anticlimactic ending is the correct one.

That is all I had on this one. If you made it till here, thank you, genuinely. See you in the next one, where I will probably be complaining about something else that broke.

The cron job that had no user

Vineeth N K — Mon, 13 Jul 2026 12:30:04 +0000

TL;DR: I added a scheduled job to a multi-tenant NestJS backend. It kept failing with "Missing active user in context". The use cases were reading the current org from a per-request CLS store, and a cron has no request, so the store was empty. The fix was to open a fresh context per tenant with a system user before doing any work, and then to write a test that runs the real job under an empty context so nobody can quietly break it again.

So there I was, reading the morning logs, and I find this line sitting there at some ungodly hour: Missing active user in context. From a cron job. A job that runs on a timer, all by itself, while every human who could possibly be a "user" is fast asleep.

An automated job being told it is not logged in. Take a second with that one.

The funny part is the app was completely right to complain. I was the one who set it up wrong.

The setup that worked fine for months

The backend is multi-tenant. Many organisations, one codebase, and every single read or write has to be scoped to one org. You never want tenant A accidentally seeing tenant B's data. That rule is basically the whole ballgame.

So how does the app know which org a request belongs to? It uses CLS, short for continuation-local storage. If you have not run into it, CLS in NestJS (the nestjs-cls package) is a nice wrapper over Node's AsyncLocalStorage. Think of it as a little box that lives for the duration of one request. Something early in the request pipeline drops the authenticated user into that box, and anything running later in the same request can reach in and pull it back out. No prop-drilling the user through fifteen function calls. It is genuinely pleasant.

There is a small service wrapping all of this. Simplified, the important bit looks like this:

requireActiveUser(): RequiredActiveUserContext {
  const activeUser = this.get('activeUser')
  if (!activeUser) {
    // nobody in the box - we refuse to guess which tenant this is
    throw new UnauthorizedError('Missing active user in context')
  }

  const { userId, organizationId } = activeUser
  return { activeUser, userId, organizationId }
}

And the use cases lean on it. A typical one starts by asking "who am I acting as, and which org?" and goes from there:

async execute(command: SomeCommand) {
  const { organizationId } = this.contextService.requireActiveUser()
  // ...everything below is scoped to that org
}

This is clean. It means no use case can accidentally run without a tenant. If the box is empty, you get a loud UnauthorizedError instead of a silent data leak. For every HTTP request, this is exactly what you want.

You can probably already see the trap I was about to walk into.

Then I added a cron

The feature was simple. Every so often, go clean up some stale records across all tenants. Standard housekeeping stuff. NestJS makes this a one-liner with @Cron:

@Cron(CronExpression.EVERY_30_MINUTES)
async handleCron(): Promise<void> {
  await this.reconcileEverythingUseCase.execute()
}

Looks harmless. I wrote it, the tests I had were green, I shipped it, and I moved on with my life.

Here is what I did not stop to think about. A cron job does not run inside a request. There is no login, no token, no middleware doing its thing before the handler fires. The timer just goes off and calls the method directly. Which means that little CLS box? Empty. Completely empty.

So the very first thing the use case does - requireActiveUser() - looks in the box, finds nothing, and throws. And because I had wrapped the cron body in a polite try/catch that just logs the error, it did not crash anything loud. It just failed, quietly, over and over, on a timer, writing one sad line into the logs each time while I slept.
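
The empty box is easy to reproduce with nothing but Node's AsyncLocalStorage, which is what sits underneath. This is a stripped-down sketch, not the real context service: the names are stand-ins and it throws a plain Error instead of UnauthorizedError.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

interface ActiveUser {
  userId: string;
  organizationId: string;
}

const store = new AsyncLocalStorage<ActiveUser>();

// Same shape as the real check: refuse to guess when the box is empty.
function requireActiveUser(): ActiveUser {
  const activeUser = store.getStore();
  if (!activeUser) throw new Error("Missing active user in context");
  return activeUser;
}

// Inside run(), which is what a request pipeline arranges, the lookup works.
const insideRequest = store.run(
  { userId: "u1", organizationId: "org-a" },
  () => requireActiveUser().organizationId,
);

// Called directly, the way a timer calls a handler, there is no store at all.
function callLikeATimer(): string {
  try {
    return requireActiveUser().organizationId;
  } catch (error) {
    return (error as Error).message;
  }
}

console.log(insideRequest); // "org-a"
console.log(callLikeATimer()); // "Missing active user in context"
```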

If you have ever bolted a cron onto an app that was built request-first, you know this exact flavour of pain.

Why the obvious fixes are wrong

My first instinct was the lazy one. Just skip the check for crons, no? Read the org some other way and stop calling requireActiveUser.

Bad idea. That check is load-bearing. It is the thing standing between "scoped to one tenant" and "oops, ran across all data with no scope". Weakening it to make a cron happy is how you turn a small bug into a data-isolation incident. Hard no.

Second instinct: fake a user. Grab some admin account, shove it in the box, done. Also bad. Now your background job is impersonating a real human who did not do anything, audit logs get muddy, and the day that admin gets deactivated your cron mysteriously dies. You are just moving the problem somewhere darker.

The real issue was never the check. It was that a cron genuinely has no user, and pretending otherwise is the mistake. What a cron actually has is a job to do on behalf of the system, for a specific tenant. So the context it needs is not a person. It is the system, scoped to an org.

Running as the system

The fix was to give the context service a second way in. Not "I am this logged-in human", but "I am the system, working on this org". Here is the shape of it:

runAsSystem<T>(organizationId: OrgId, fn: () => Promise<T>): Promise<T> {
  return this.cls.run({ ifNested: 'override' }, () => {
    // put a system identity in the box, scoped to one tenant
    this.set('activeUser', new SystemUser({ organizationId }))
    return fn()
  })
}

cls.run opens a brand new box and runs your function inside it. Before running, we drop a SystemUser in, carrying the one org this slice of work belongs to. Now when the use case calls requireActiveUser (or whatever reads the org), the box is not empty anymore. It finds a legit system identity, gets the org id, and does its thing. No fake human. No skipped check. The safety rail stays exactly where it was.

That one option in there - ifNested: 'override' - is worth a mention. It says "even if somehow this runs inside an existing context, do not inherit the parent's box, start clean". For a background job you really do not want to accidentally pick up some leftover state from a context that opened earlier, like a database transaction that is still hanging around. Clean slate, every time. It is a small flag that saves you from a category of very confusing bugs later.
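
Here is the clean-slate idea in plain Node terms. It models what I understand 'override' to mean, using AsyncLocalStorage directly; it is not nestjs-cls internals, and runWithFreshBox is a hypothetical helper.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

type Box = Map<string, unknown>;
const als = new AsyncLocalStorage<Box>();

// Always start from a brand new Map, so nothing from an outer box is visible.
function runWithFreshBox<T>(seed: Record<string, unknown>, fn: () => T): T {
  return als.run(new Map(Object.entries(seed)), fn);
}

// Outer context holds a leftover transaction; the inner one must not see it.
const seen = runWithFreshBox({ transaction: "tx-42" }, () =>
  runWithFreshBox({ activeUser: "system@org-a" }, () => ({
    user: als.getStore()?.get("activeUser"),
    leakedTransaction: als.getStore()?.has("transaction"),
  })),
);

console.log(seen.user); // "system@org-a"
console.log(seen.leakedTransaction); // false
```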

And the cron itself becomes a loop, because a cron is not one tenant, it is all of them:

@Cron(CronExpression.EVERY_30_MINUTES)
async handleCron(): Promise<void> {
  const orgs = await this.getAllOrgsToProcess()

  await Promise.all(
    orgs.map((org) =>
      this.contextService.runAsSystem(org.id, () =>
        this.reconcileEverythingUseCase.execute(org.id),
      ),
    ),
  )
}

Each org gets its own fresh context. The use cases underneath did not change at all - they still ask the box "which org?" and still get a real answer. The only thing that changed is who fills the box before they look. During a request, it is the logged-in user. During a cron, it is the system, one org at a time.
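
If running every tenant at once under Promise.all makes you nervous about contexts bleeding into each other, this small sketch shows why they do not. It uses bare AsyncLocalStorage with stand-in names (orgContext, reconcile, handleAllOrgs) in place of the real service and use case.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

const orgContext = new AsyncLocalStorage<{ organizationId: string }>();

function runAsSystem<T>(
  organizationId: string,
  fn: () => Promise<T>,
): Promise<T> {
  return orgContext.run({ organizationId }, fn);
}

// Stand-in for a use case: it reads the org from context only after an await,
// by which point all the other tenants have started too.
async function reconcile(): Promise<string> {
  await new Promise((resolve) => setTimeout(resolve, 5));
  const context = orgContext.getStore();
  if (!context) throw new Error("Missing active user in context");
  return context.organizationId;
}

async function handleAllOrgs(orgIds: string[]): Promise<string[]> {
  return Promise.all(orgIds.map((id) => runAsSystem(id, reconcile)));
}

handleAllOrgs(["org-a", "org-b", "org-c"]).then((seenOrgs) => {
  console.log(seenOrgs); // each run saw only its own org
});
```

Each call to run carries its own store through the awaits, so the interleaving does not matter.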

The part that actually stops me repeating this

Fixing the bug felt good for about a minute. Then the uncomfortable thought showed up. What stops future-me, six months from now, from adding a new cron and forgetting the runAsSystem wrapper all over again?

Because here is the nasty bit about this bug: it does not show up in normal tests. Most tests either call the use case directly with a user already prepared, or spin up a context on purpose. Both of those hide the exact thing that breaks in production, which is the empty box. The bug only appears when something runs with genuinely nothing in the context, which is precisely the one condition your happy-path tests never reproduce.

So the guard had to reproduce that condition on purpose. The trick was a tiny helper that builds a real context service backed by a truly empty store:

export const createEmptyContextService = (): ContextService =>
  new ContextService(new ClsService(new AsyncLocalStorage()))

No user, no request, no setup. Exactly what a cron sees at 2 AM. And then a test that wires up the real job with the real use case (repos mocked, everything else genuine) and asserts one thing - it does not blow up under an empty context:

it('runs under an empty context without UnauthorizedError', async () => {
  await expect(job.handleCron()).resolves.toBeUndefined()
})

If someone later adds a context-dependent call into that job's path and forgets to wrap it, this test goes red immediately, with a stack trace pointing right at the problem, in CI, long before it ever reaches a sleepy production log. That is the whole point. The bug is invisible in the wrong test and impossible to miss in the right one.

What I actually took away from this

The lesson that stuck was not really about CLS or crons. It was that "the current user" is an assumption baked so deep into a request-first app that you stop seeing it. Every use case quietly assumes somebody is logged in, because for years somebody always was. The moment you introduce an entrypoint that runs without a request - a cron, a queue worker, a CLI command, a webhook consumer - that assumption walks off a cliff, and it does it quietly, in a try/catch, where you will not notice until you happen to read the logs.

So now, any time I add something that runs outside a request, the first question I ask is boring and useful: who is the context here, and who fills it before any real work starts? If I cannot answer that in one sentence, I am not ready to write the job yet.

That is the story. A robot got told it was not logged in, and it was completely correct. If it saves you one confused morning squinting at "Missing active user in context", then writing this down did its job.

Not going to pretend I designed this cleanly the first time. I shipped the broken version, the logs caught me, and the test only exists because the bug embarrassed me into writing it. But if even one part of this helped someone dodge the same 2 AM head-scratcher, then it was worth putting down. See you in the next one.

My Portfolio Has More CI Than My Day Job

Vineeth N K — Tue, 07 Jul 2026 12:22:24 +0000

TL;DR: My personal site is on release 0.0.54. It has a CHANGELOG, a commit linter that rejects my own commits, three security scanners, browser tests, and visual regression checks. Nobody uses any of this except me. I regret none of it.

The other day I sat down to fix a typo on my own website. One word. I wrote the commit, pushed it, and my own CI slapped it back in my face because the commit message did not follow Conventional Commits.

Let me sit with that for a second.

A machine I set up, to guard a website only I edit, refused a one-word typo fix because I forgot to put a fix(blog): in front of my message. And the funny part? I did not even feel annoyed. I felt a little proud. That is the exact moment I realised my portfolio has quietly become more engineered than most of the actual products I get paid to build.

How did a blog end up with a version number

Let me show you the receipt first.

My site is at version 0.0.54. That is not a typo and it is not a joke. There is a real package.json with a real version field, and a tool called release-please that bumps it every single time I merge something. Each release cuts a tag, writes a GitHub release, and appends to a CHANGELOG.md that reads like a serious piece of software.

Here is an actual entry from it:

## [0.0.54](https://.../compare/v0.0.53...v0.0.54) (2026-06-14)

### Features
* **blog:** moving a homelab from .de to .in

A changelog. With compare links. Documenting the breaking changes to... a page about my home server. Fifty-three of these releases sit in my git history, each one a tiny ceremony for shipping a blog post nobody was waiting on.

The thing is, a changelog exists so users know what changed between versions they might be running. My "users" are me, and the version they are running is whatever loaded when they opened the tab. There is exactly one deployment and it is always the latest one. The whole concept does not apply. I built it anyway, and honestly it is kind of nice to scroll through.

The commit police live in my repo now

So back to that typo. The reason my commit got rejected is a workflow called commitlint. Every message I write gets checked against a set of rules. Right type. Right scope. Lowercase subject. Under a certain length. No trailing period.
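
To give a feel for what those rules amount to, here is a toy version of the check. It is not how commitlint is implemented, the allowed types and the length limit are assumptions of mine, and it does not validate the scope or handle breaking-change markers.

```typescript
const ALLOWED_TYPES = ["feat", "fix", "chore", "docs", "refactor", "test"]; // assumed list
const MAX_HEADER_LENGTH = 100; // assumed limit

// Returns a list of problems; an empty list means the header passes.
function lintCommitHeader(header: string): string[] {
  const match = /^(\w+)(?:\(([^)]+)\))?: (.+)$/.exec(header);
  if (!match) return ["header must look like type(scope): subject"];

  const [, type, , subject] = match;
  const problems: string[] = [];
  if (!ALLOWED_TYPES.includes(type)) {
    problems.push(`unknown type "${type}"`);
  }
  if (subject[0] !== subject[0].toLowerCase()) {
    problems.push("subject must start lowercase");
  }
  if (subject.endsWith(".")) {
    problems.push("subject must not end with a period");
  }
  if (header.length > MAX_HEADER_LENGTH) {
    problems.push(`header is longer than ${MAX_HEADER_LENGTH} characters`);
  }
  return problems;
}

console.log(lintCommitHeader("fix(blog): correct typo")); // []
console.log(lintCommitHeader("Fix typo")); // one problem: wrong shape
```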

If I fumble any of it, the whole thing goes red and I have to go back and fix my own sentence before my own website will accept it.

On a team, this makes complete sense. You have ten people writing commits and you want the history to read consistently so the changelog generates cleanly. On a repo where the only author is me, arguing with myself at two in the morning about whether a change is a fix or a chore, it is pure theatre. Good theatre though. I have written cleaner commit messages on my blog than on things that pay my rent, and that is a slightly embarrassing sentence to type out.

Tests. For a website. That only I touch.

Now we get to the part where I really lost the plot.

I write Playwright tests for my portfolio. Real browser tests, spinning up a headless Chrome, clicking through the site to make sure it works. There is one for navigation. One for the search modal. One for the blog pages, one for the sections on the landing page, one for the theme switcher.

And then, because apparently that was not enough, there is visual regression. My CI takes screenshots of the site, compares them pixel by pixel against saved snapshots, and if anything shifts it flags it and commits the new snapshots back. So if I nudge a button three pixels to the left, a robot notices and files the paperwork.
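
The core of a pixel-by-pixel comparison fits in a few lines. This is the naive version as a sketch, assuming two same-sized RGBA buffers; real comparison tools usually add a tolerance so anti-aliasing noise does not trip them, and countChangedPixels is just an illustrative name.

```typescript
// Count pixels whose RGBA values differ between two same-sized images.
// Each pixel is four consecutive bytes: red, green, blue, alpha.
function countChangedPixels(before: Uint8Array, after: Uint8Array): number {
  if (before.length !== after.length || before.length % 4 !== 0) {
    throw new Error("expected two RGBA buffers of identical size");
  }
  let changed = 0;
  for (let i = 0; i < before.length; i += 4) {
    const samePixel =
      before[i] === after[i] &&
      before[i + 1] === after[i + 1] &&
      before[i + 2] === after[i + 2] &&
      before[i + 3] === after[i + 3];
    if (!samePixel) changed++;
  }
  return changed;
}

// Two pixels; only the second one changes colour.
const baseline = new Uint8Array([255, 255, 255, 255, 0, 0, 0, 255]);
const current = new Uint8Array([255, 255, 255, 255, 10, 0, 0, 255]);
console.log(countChangedPixels(baseline, current)); // 1
```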

Who is this protecting? Me. From me. The only person who can break this site is the same person writing the tests to catch himself breaking it. It is the software equivalent of leaving myself angry sticky notes.

Have you ever built a safety net so elaborate that the net became the most impressive thing in the building? Because that is roughly where I landed.

The rest of the over-engineering buffet

While I was in there, I did not stop at tests. The site also has:

A command palette search, the little Cmd+K modal that power tools have, so I can fuzzy-search my own blog posts with a keyboard shortcut. I have around forty posts. I know all of them. I still built the search.
giscus comments, wired through GitHub Discussions, so readers can comment. The comment count is, let us say, a very honest number.
Cross-posting, an automated job that pushes new posts out to dev.to on its own, with a cache so it does not double-post.
Three separate security scanners running on every change. CodeQL for code analysis, a dependency review, and Trivy scanning the filesystem for known vulnerabilities. On a static site. That has no login, no database, no user input, and no server doing anything at runtime. The attack surface is roughly the size of a postcard and I have three guards watching it.

Reading that list back, it sounds like I am describing a fintech backend, not a place where I complain about Docker.

So why do it, really

Here is the honest turn, and it is not the one you might expect.

None of this was necessary. I want to be very clear about that. A personal site needs a build step and a place to host it, full stop. Everything else I piled on top is decoration.

But every single piece of that decoration taught me something I then used at work. Setting up release-please on a low-stakes repo meant that when a real project needed automated releases, I already knew the sharp edges. The Playwright visual regression I fought with here is the same setup I later reached for on a production app where a broken layout actually costs money. My personal site turned into the sandbox where I get to make all the mistakes for free, with nobody paged and no customer affected.

The day job gives you production systems but not always the freedom to experiment on them. You cannot casually try a new CI pattern on the thing that pays real salaries. So the experiments have to live somewhere, and for me that somewhere is a blog with a version number.

Is it overkill? Completely. Would I rip any of it out? Not a chance. The overkill is the point.

So that is the confession. My portfolio has a CHANGELOG nobody reads, tests nobody triggers, and security scans for an attack surface that does not exist, and I would set every bit of it up again tomorrow. If you have a personal project quietly carrying more engineering than it could ever need, you already know it is not really about the project.

Not going to pretend this was a perfectly rational way to spend my evenings. But if even one part of it nudges you to treat your own side project as the safe place to try the scary stuff, then it was worth writing down. See you in the next one.

I taught WeSpend to read GPay screenshots. OCR fought back.

Vineeth N K — Sat, 04 Jul 2026 14:00:16 +0000

WeSpend is a small app I built for my household. One person funds a shared monthly pot, everyone logs what they spend from their own phone, and at the end of the week it works out who owes whom. The whole thing lives or dies on one boring question: how easy is it to add an expense? Because if adding an expense takes ten taps, nobody does it, and then the numbers are a lie.

So I added what felt like a lazy little shortcut. You pay someone on GPay, you get that green success screen, you share that screenshot straight to WeSpend, and the app reads the amount and fills it in for you. One share, done. On-device OCR, no typing.

It worked beautifully. For exactly half the screenshots.

Half the screenshots. The nice round half.

Here is the pattern I did not notice at first. A payment of ₹287 came through perfectly, every single time. A payment of ₹130.00 came through as zero rupees. Same app, same screen, same OCR. The only difference was those two little zeros after the dot.

Clean integer amounts, the ones with no paise, sailed through. The moment there was a .00 on the screen, the amount field just quietly filled in 0 and sat there looking innocent.

If you have ever watched an app fill in a number with total confidence and get it completely wrong, you know the exact little sting I felt. It is worse than an error. An error at least admits something went wrong.

So I did what you do. I started printing out exactly what the OCR was handing me, one screenshot at a time. And that is where it got funny.

OCR is not bad at reading. It is bad in very specific, creative ways.

I was using Google's ML Kit for the on-device text recognition. It is genuinely good. But a stylized GPay payment screen is not clean print, and the ways it got things wrong were oddly consistent. Once I saw the actual output, the mystery fell apart.

Here is the collection I ended up with.

It eats the rupee sign. The ₹ on that success screen is a nice stylized glyph, and OCR would sometimes just drop it entirely. ₹130.00 came back as 130.00. Not the end of the world on its own.

It reads zero as the letter O. This was the real culprit behind the .00 problem. 130.00 came back as 130.OO, with two capital letter O's where the zeros should be. To my parser, 130.OO is not a number at all, so it gave up and left 0.

It reads the decimal point as a space. On some screens the same amount came back as 130 00. Now it looks like two separate numbers, 130 and 00, and neither is the answer.

And my favourite, it reads the rupee sign as the number 7. This one I did not see coming. On a few screens the ₹ was not dropped, it was confidently transcribed as a 7. So ₹280.00 came back as 7280.00. That is not a missing rupee, that is a fake seven thousand rupees added to my payment. Imagine settling the week off that.

I sat there looking at 130.OO and 7280.00 and honestly laughed. My clean little shortcut had walked straight into the real world.

Fixing it, one liar at a time

The temptation here is to write one big clever regex that handles everything. Do not do that. I tried. It becomes unreadable in about twenty minutes and then it eats a phone number and tells you the auto ride cost forty-two lakh.

What actually worked was treating each specific way OCR lies as its own small, named repair, each one narrow enough that I could write a test for it and trust it.

The rupee-as-7 one is a good example. If a number starts with 7, and the character just before it is not a digit, and stripping that leading 7 still leaves a valid positive amount, I treat it as a currency amount where the 7 was really the rupee sign. So 7280.00 becomes 280.00. There is one important guard: if the character after that 7 is a comma, I leave it completely alone, because 7,280.00 is a perfectly real number in Indian grouping and I have no business touching it.
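A minimal sketch of that rule in Python, not the app's actual code. The function name is hypothetical, and the extra refusal to fire right after a dot or comma is an assumption of this sketch, added so that the 75 in 0.75 is left alone.

```python
import re

# "7" not preceded by a digit (or a dot/comma, which is this sketch's own
# extra assumption), not followed by a comma, then the rest of the amount.
_FAKE_SEVEN = re.compile(r"(?<![\d.,])7(?!,)(\d+(?:\.\d{1,2})?)")


def repair_rupee_as_seven(text: str) -> str:
    """Strip a leading 7 that is really a misread rupee sign."""

    def fix(match: "re.Match[str]") -> str:
        rest = match.group(1)
        # Only strip the 7 if what remains is still a positive amount.
        return rest if float(rest) > 0 else match.group(0)

    return _FAKE_SEVEN.sub(fix, text)


print(repair_rupee_as_seven("7280.00"))    # the fake seven is removed
print(repair_rupee_as_seven("7,280.00"))   # comma guard: left untouched
```

Taken on its own, a rule like this would also rewrite an amount that genuinely starts with 7, so it only makes sense with more context around it, for example running it only when no rupee sign was recognised on that line. That context is not described here, so the sketch leaves it out.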

The mangled-paise repair was the fix that actually shipped the feature. When a line looks like a real amount but the fractional part is unreadable, one or two characters of garbage like OO or a stray space, I now trust the integer part and just throw the broken fraction away. 130.OO becomes 130. 130 00 becomes 130. The screen said one hundred and thirty rupees, and one hundred and thirty rupees is what you get.

The one rule that keeps all of this safe is a cap I almost forgot: the discarded fraction can only be one or two characters. That tiny limit is doing a lot of quiet work. A reference number, a date, a phone number, a UPI transaction ID, none of those have a short one-or-two-character tail, so none of them get mistaken for an amount with mangled cents. Without that cap, this whole feature would be a slot machine.
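Here is roughly what that repair and its cap could look like, as a sketch under assumptions. The function name and the exact regex are hypothetical; the only parts taken from the description above are "trust the integer part" and "the discarded tail is one or two characters, no more".

```python
import re
from typing import Optional

# integer part, then a dot or a single space, then a tail of 1-2 characters.
# The {1,2} is the cap: anything longer is not treated as mangled paise.
_MANGLED = re.compile(r"(\d[\d,]*)([. ])(\S{1,2})")


def repair_mangled_fraction(line: str) -> Optional[int]:
    """Return the integer rupees if the fraction is unreadable, else None."""
    match = _MANGLED.fullmatch(line.strip())
    if match is None:
        return None
    whole, separator, tail = match.groups()
    if separator == "." and tail.isdigit():
        # A readable fraction like 130.50 is not this repair's job.
        return None
    return int(whole.replace(",", ""))


print(repair_mangled_fraction("130.OO"))      # 130
print(repair_mangled_fraction("130 00"))      # 130
print(repair_mangled_fraction("12 05 2026"))  # None, the tail is too long
```

Because the whole line has to match, a date or a reference number falls through to None and is never read as an amount.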

There was one more from earlier that fits the same family. OCR sometimes drops the decimal point but keeps the thousands comma, so 1,557.40 arrives as 1,55740. A real integer's last comma group is always exactly three digits, in both Indian and Western grouping. So if that last group is longer than three, I know the final two digits are the dropped paise, and I put the dot back. Same idea every time. Learn one specific way the machine lies, write the narrow repair, cap it so it cannot overreach.
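The dropped-dot repair fits in a few lines too. This is a sketch, not the shipped parser: the function name is made up, and it assumes the narrowest reading of the rule, where the last comma group is exactly five digits (three real digits plus two paise).

```python
import re

# Groups before the last comma may be 2 or 3 digits (Indian or Western),
# the last real group is 3 digits, and the 2 extra digits are the paise.
_DROPPED_DOT = re.compile(
    r"(?<![\d.,])(\d{1,3}(?:,\d{2,3})*,\d{3})(\d{2})(?![\d.,])"
)


def restore_dropped_decimal(text: str) -> str:
    """Put the decimal point back when the last comma group is too long."""
    return _DROPPED_DOT.sub(r"\1.\2", text)


print(restore_dropped_decimal("1,55740"))   # 1,557.40
print(restore_dropped_decimal("1,55,740"))  # unchanged, a valid Indian grouping
```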

The other half of the problem: which number is even the amount?

I have been talking as if there is one number on the screen. A bank SMS is worse. It has the amount, the available balance, maybe a transaction reference, sometimes a date that reads like a number too.

So the parser also has to know which figure is the spend. A running balance is never the amount you spent, so anything sitting next to words like avbl, bal, or balance gets ruled out. A word like credited or refund flips the whole thing from a spend to a credit. And a genuine debit names the rail it went over, a/c, upi, imps, card, while a promo message or an OTP never does. None of this is glamorous. All of it is the difference between an expense tracker you trust and one you quietly stop using.
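Those three checks can be sketched like this. The keyword lists come straight from the paragraph above; everything else (the function name, the 15-character look-back window, the currency prefixes, the sample messages) is an assumption made for illustration.

```python
import re
from typing import Optional, Tuple

BALANCE_WORDS = ("avbl", "bal", "balance")
CREDIT_WORDS = ("credited", "refund")
RAIL_WORDS = ("a/c", "upi", "imps", "card")

# An amount is a currency marker followed by digits, commas, optional paise.
_AMOUNT = re.compile(r"(?:\brs\.?|\binr|₹)\s*(\d[\d,]*(?:\.\d{1,2})?)")


def classify_sms(sms: str) -> Optional[Tuple[str, str]]:
    """Return ("spend" | "credit", amount) or None if this is not a transaction."""
    text = sms.lower()
    # A genuine debit names its rail; promos and OTPs never do.
    if not any(word in text for word in RAIL_WORDS):
        return None
    kind = "credit" if any(word in text for word in CREDIT_WORDS) else "spend"
    for match in _AMOUNT.finditer(text):
        # Look a few characters back: a figure next to a balance word is skipped.
        before = text[max(0, match.start() - 15):match.start()]
        if any(word in before for word in BALANCE_WORDS):
            continue
        return kind, match.group(1).rstrip(",")
    return None


# Made-up sample message, not a real bank format.
print(classify_sms("Rs.250.00 debited from a/c XX1234 via UPI. Avbl bal Rs.4,120.55"))
```

Plain substring checks are crude (card also matches inside discard), so a real version would want word boundaries and its own tests.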

You do not need me to spell out which of those two an untrusted expense tracker becomes.

What I actually took away from this

Every one of these fixes started as a real screenshot that embarrassed me. So every one of them became a test with the actual garbled text pasted in, mangled O's and fake sevens and all. My test file now has a merchant called Loaded Gazebo in it, which is not a real shop, it is just the fake payee I kept reusing while chasing this. That file is the most honest documentation in the whole project, because it is literally a list of the ways the real world broke my assumptions.

The bigger lesson, if there is one, is that OCR on real screenshots is not a clean input you parse. It is an adversary with a small, learnable set of tricks. You do not beat it with cleverness. You beat it by writing down each trick, one narrow rule at a time, and capping every rule so it can never be too confident. Which, now that I type it out, is basically how you survive anything that lies to you in predictable ways.

Anyway, WeSpend reads my GPay screenshots now. Round amounts, paise, dropped rupee signs, fake sevens, all of it. Adding an expense is one share again, the way it was always supposed to be, and the household numbers have stopped being a polite fiction.

Okay, that is enough out of me for today. If your own side project has one of these little features that turned out to be an entire iceberg, I would genuinely love to hear what was hiding under yours. Until the next one, go easy on your OCR, it is trying its best.

vaultctl Has a Browser Extension Now

Vineeth N K — Thu, 02 Jul 2026 14:56:55 +0000

For a long time vaultctl was three things. A single Go binary on my server, a web app, and a CLI. All of them worked. None of them were there at the one moment I actually needed them, which is when a login form shows up and I have to go somewhere else to copy a password.

So now there is a fourth thing. A browser extension. It sits in your toolbar, unlocks with Touch ID, and fills your logins on the page where you are standing. This post is the story of getting there, and I will be honest with you, it took a lot more than I thought.

This is another post in the series where I walk through my open-source projects. If you want the why-does-this-exist and the zero-knowledge story, that is all in building vaultctl. This one is just about the extension.

It started with a lazy question

I was using the web app to grab a password, paste it, then go back. Again. And again. One of those afternoons I just typed out loud into the chat, "do we even have a browser extension?"

We did not. There was a folder, some scaffolding, a popup that showed nothing useful. That was it.

That one lazy question turned into the single longest stretch of work in the whole project. Funny how that goes. The features you announce proudly take a week. The feature that is "just autofill, how hard can it be" takes over your life.

Autofill is not a feature, it is the entire job

Here is the thing nobody tells you. A password manager extension is maybe ten percent vault and ninety percent fighting with the web.

Every login page is built differently. Some put the username and password on one screen. Some show you the email first, then the password on a second screen after a redirect. Some render the form late with JavaScript, so when your extension looks for fields on page load, there is nothing there yet. Some have a fake password field for a 2FA code that is not actually the password at all.

I hit every single one of these. In order. Painfully.

The multi-step logins were the first wall. You type your email, the page moves to the password step, and by then the extension has forgotten which email you were even using, so it saves a password with no username attached. Useless. I had to make it remember the email across that jump.

Then the late-rendering forms. The extension would scan the page, find nothing, and give up, all before the actual login form had finished loading. So I added a delay. Then the delay caused a new problem, because if you had already started typing, the autofill would rudely stomp on your input. So then I had to guard the delayed fill against your own typing. One fix, one new bug, the usual dance.

The save toast was its own little saga. You log in, the extension offers to save the password, and then the site redirects you to your dashboard and the toast vanishes before you can click it. Gone. I lost count of how many times I logged in just to watch that toast disappear. Keeping it alive across a redirect, even a cross-host or single-page-app redirect, took way more attempts than I want to admit.

And the field picking. The extension kept offering to fill the 2FA code box because it looked like a password field. So I taught it to pick the real username field and not the verification-code field. Small thing. Took ages to get right.

If you have ever built one of these, none of this is news to you, and you have got the scars to match.

The Google-style picker

Somewhere in the middle of all this, I stopped trying to be clever with inline filling and just copied the pattern everyone already understands. A little icon inside the field. You click it, a small picker drops down showing the matching login with the site favicon and the username, password masked. You pick, it fills.

Sounds simple. The fiddly part was making it behave. Keep the picker open when the field is focused. Suppress the browser's own native dropdown so you are not fighting two popups at once. Scope the suggestions to the exact site so a random login does not show up on the wrong page. Bold the username so you can actually read it at a glance.

None of these are hard problems on their own. Together they are a hundred tiny papercuts, and the difference between an extension that feels nice and one that feels broken is whether you bothered to fix all hundred.

The stack under all this, if you care, is WXT for the extension framework, React 19 and Tailwind 4 for the popup, zustand for state, zod for validation, hash-wasm for the crypto bits, lucide-react for icons, and i18next so the whole thing speaks English and German. Manifest V3, because Chrome gives you no choice anymore.

The rule that made everything slower and I would do it again

vaultctl is a credential manager. The whole reason it exists is so you do not have to trust some other party with your secrets. So I made one rule early and stuck to it. No pulling in random third-party services for the sensitive parts. If a piece is missing, we build our own small version of it.

This rule cost me time. It was worth every minute.

Two examples. First, the QR code. When you set up your account you get a recovery kit, and that needs a real, scannable QR. The first version I had was, in the kindest words, a deterministic visual fingerprint. It looked like a QR. It was not a QR. Nothing could scan it. For a production credential manager, that is not a "ship it and fix later" situation. So instead of reaching for some QR library, I wrote a proper QR generator inside the project. Real encoding, real error correction, actually scannable.

Second, attachments. I wanted to let you attach files to a vault item, securely. The obvious move is to bolt on MinIO or SeaweedFS or some object store. But that is a whole extra service to run, trust, and secure, for a tool whose entire pitch is "do not trust extra parties". So I built a small object storage module right into the binary. One filesystem-backed blob store, encrypted like everything else. No new service, no new trust boundary.

Is my QR generator as battle-tested as a popular library? No. But it is small, I can read all of it, and nothing about my recovery kit leaves the boundary I control. For a vault, that trade is the right one every time.

The small things that ate whole evenings

The big features get the commits with nice names. The small stuff is where the time actually goes.

The bottom tab bar in the popup was not fixed in place. So to switch between the vault, the generator, and settings, you had to scroll all the way to the very bottom to even see the tabs. I used my own extension for two minutes and wanted to throw my laptop. Pinning the tab bar to the bottom was a five-minute fix that I should have done on day one.

Copy was half broken. You could copy the username fine. Copy the password, nothing happened. A credential manager where you cannot copy the credential. Beautiful.

And then, the one that made me laugh at myself. I went through the extension and found em-dashes sitting in some of the alert and notification text. If you have read anything else on this blog you know exactly how I feel about em-dashes. My own tool was using them. In my own product. I hunted them all down and replaced them with honest little hyphens. Some battles are personal.

The TOTP rabbit hole

This one I have to be honest about, because I confused myself properly.

vaultctl can store 2FA. The extension can show you a live TOTP code and fill it in for your logins. Good feature. But while building it I tied myself in a knot over what TOTP even meant in this context.

See, the recovery kit has its own TOTP, for unlocking your vault. And separately, your saved logins can each carry their own 2FA secret, for the sites you log into. Same letters, two completely different jobs. For a while I genuinely could not tell you which one I was working on, and I kept asking myself out loud, do we even save TOTP, and if we generate the code then where is the secret coming from, and is this the vault's 2FA or the website's.

The answer, once I slowed down. We store the 2FA secret for your target logins, encrypted like everything else, and generate the code on the fly. The vault's own TOTP is a separate thing. Once I drew that line clearly in my head, the feature was easy. The confusion was the hard part, not the code.
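For anyone who has not seen it written down, "generate the code on the fly" is a small amount of arithmetic. The sketch below is standard TOTP as specified in RFC 6238 (HMAC-SHA1, 30-second steps), written in Python purely for illustration. It is not vaultctl's code, and the function name and parameters are hypothetical.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, at: Optional[float] = None,
         digits: int = 6, period: int = 30) -> str:
    """Compute a TOTP code from a base32 secret (RFC 6238, HMAC-SHA1)."""
    cleaned = secret_b32.replace(" ", "").upper()
    cleaned += "=" * (-len(cleaned) % 8)  # secrets are often stored unpadded
    key = base64.b32decode(cleaned)

    # The moving factor is the number of whole periods since the Unix epoch.
    now = time.time() if at is None else at
    counter = int(now // period)

    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte picks a 4-byte window.
    offset = digest[-1] & 0x0F
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % (10 ** digits)).zfill(digits)
```

The only stored thing is the secret. The code itself is recomputed every time it is shown, which is why the secret is the part that has to be encrypted.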

There was also a related bug worth mentioning, since it is a nice example of doing too much. The extension was showing a fill suggestion on every single OTP input box on a page, even though we do not store one-time codes. Annoying little emblem popping up everywhere. Had to de-duplicate that so it only shows where it makes sense.

So, what is in it now

Quite a lot, actually. Touch ID unlock. Inline autofill with the picker. Save and update prompts that survive redirects. A multi-vault switcher with cross-vault filling. Capture and fill for credit cards and identity forms, not just logins. Live TOTP codes. A password generator with a memorable-passphrase mode. A password checkup that warns you about weak or compromised passwords. Per-site "never save" if a site annoys you. English and German throughout.

None of it is glamorous. All of it is the kind of thing you only notice when it is missing.

If your password manager has ever filled the wrong field, or eaten your save prompt on a redirect, or shrugged at a two-step login, I hope this gives you a little sympathy for whoever built it. I certainly have more sympathy now than I did before.

vaultctl is open source over at github.com/vineethkrishnan/vaultctl, extension folder and all, if you want to see how the sausage is made.

That is pretty much it from my side today. If you have been through the same autofill pain, or you have a cleaner way of handling these multi-step login forms, I genuinely want to hear it. Those stories are always the best ones. See you soon in the next blog.

moving a homelab from .de to .in without breaking the tunnel

Vineeth N K — Sun, 14 Jun 2026 13:31:51 +0000

TL;DR: I run a small homelab on a Mac mini, fronted by a single Cloudflare tunnel, with Tailscale guarding everything internal. I moved the public side from vinelabs.de to vinelab.in, because I operate out of India and the .de belonged to a different chapter. It was the right call, and I am not second-guessing it. The move itself was mostly painless once I stopped treating it as one big switch. The tunnel config turned out to be only half the job, DNS is the other half, Vaultwarden has a sneaky domain setting that bites, and I nearly corrupted my status page database by being too clever with SQLite. I am keeping the .de though, for German-related work, once DENIC clears the paperwork. Here is the whole thing, mistakes included.

why i even did this

Let me start with the why, because the how only makes sense after that.

For a good while my homelab lived on vinelabs.de. It was fine. Everything worked. The tunnel was up, the services were reachable, nobody was complaining (mostly because the only user is me). So why touch a working thing?

If you have not seen the setup before, it is nothing exotic. One Mac mini at home runs the whole thing through Docker. A single Cloudflare tunnel fronts the handful of services I actually want reachable from the public internet: a landing page behind Caddy, my Vaultwarden, a small password tool I built called VaultCTL, an Uptime Kuma status page, and a webhook endpoint for some ticket automation. Everything else, n8n and ntfy and the rest, stays inside my Tailscale tailnet where it belongs and never touches a public name at all. So when I say I moved the domain, I really mean that public edge, the five or so hostnames the tunnel answers for. Nothing internal had to change, which is half the reason the move stayed calm.

A few reasons piled up. The first one is just identity. I am in India. I work from India. My whole setup runs out of a Mac mini sitting in my home in India. And every time I typed vinelabs.de I felt this tiny mismatch, like wearing someone else's jacket that happens to fit. The .de was from an earlier phase. That phase is not over, but I wanted my root identity to match where I actually am, so this was the right time to make the switch.

The second reason was a cleaner brand. vinelab.in is shorter, it reads better, and it actually says where I am.

And the third reason was the practical nudge. Holding a .de now means dealing with DENIC, the registry that runs the .de zone, and proving a proper holder identity that lines up with the rules for who can own one. Sorting that out from India, for a domain that no longer matched what I was using it for, was the push I needed. A .in I can hold cleanly, from right here, no awkward paperwork about why someone in India is fronting a German domain.

So I switched the homelab to vinelab.in, and looking back it was clearly the right move. But I did not kill the old one, and this is the part I actually like. vinelabs.de is still mine. Once I hear back from DENIC and the holder side is sorted, the plan is to give it a proper second life: German-related work and the odd hobby project that genuinely belongs on a .de. It is not a tombstone. It is just moving to a shelf where it fits better. The homelab gets the .in it should have had from day one, and the .de gets to be the thing it was always more suited for.

the one rule that saved me: keep both live

Here is the single decision that made this whole thing low stress.

Do not flip from old to new in one go. Run both at the same time for a bit.

My setup is one Cloudflare tunnel pointing at a bunch of local services. The routing lives in a config file, and the trick was simply to add the new hostnames next to the old ones, not replace them. Same service, two doors.

# both domains point at the same local services during the move
# the .de ones come out later, once i trust the .in ones
ingress:
  - hostname: home.vinelabs.de
    service: http://localhost:80
  - hostname: home.vinelab.in
    service: http://localhost:80

  - hostname: locker.vinelabs.de   # vaultwarden
    service: http://localhost:8222
  - hostname: locker.vinelab.in
    service: http://localhost:8222

  # ...same pattern for vault, status, agents

  - service: http_status:404   # catch-all, required

Now both home.vinelabs.de and home.vinelab.in hit the same landing page. Nothing breaks the moment I add the new names, and I get to test the new domain properly before trusting it with anything.

This is the part I would tell anyone doing a domain move. The cutover is not a single scary switch. It is a slow handover where both sides work, and then one day you quietly remove the old side.

the tunnel config is only half the story

This one got me for a second, so let me save you the same confusion.

Adding a hostname to the tunnel config does not make it resolve. The ingress rules tell the tunnel "if traffic for this hostname shows up, send it here". But traffic only shows up if DNS actually points the name at the tunnel in the first place. Two separate things. The config is necessary, not sufficient.

So vinelab.in had to become a real zone in Cloudflare, with the registrar pointing at Cloudflare's nameservers, and then a DNS record per hostname routing to the tunnel. For a tunnel these are proxied CNAME records, the orange-cloud kind.

And here is the small gotcha that made me doubt myself. When I went to check the new records with dig, I did not see a CNAME pointing at the tunnel at all. I saw Cloudflare's own IP addresses instead.

home.vinelab.in    A    104.21.55.148
home.vinelab.in    A    172.67.149.38

For a moment I thought the routing was broken. It was not. When a record is proxied, Cloudflare hides the real CNAME and hands you its anycast IPs instead, because the whole point of proxying is that the world talks to Cloudflare and not to your origin. So an empty CNAME and a couple of 104.x / 172.x addresses is exactly what a working tunnel record looks like. The real test was just hitting the URL and seeing the right service answer, which it did.

Has this confused you before too? You go looking for proof in dig and the proxy quietly rewrites the answer on you.

the vaultwarden gotcha nobody warns you about

Most of my services did not care about the domain. A landing page does not know its own name. A status page does not know its own name. You point the new hostname at the same port and you are done.

Vaultwarden is not like that.

Vaultwarden has a DOMAIN setting baked into its config, and it is not cosmetic. That value is the origin used for WebAuthn, which is the thing behind passkeys and hardware security keys. If you change the domain, the old passkeys stop validating, because a passkey is bound to the relying party ID, which comes from the domain it was registered on. The browser will simply refuse, and it is right to.

# before
DOMAIN: https://locker.vinelabs.de
# after, then recreate the container so it actually picks this up
DOMAIN: https://locker.vinelab.in

So the move here is two steps, not one. Change the value, then recreate the container. And go in knowing that any passkey you registered on the old origin needs to be added again on the new one. Master password and your normal two-factor are fine. Only the passkey side cares. I would rather you read that here than discover it while staring at a login screen that keeps saying no.

One update since I wrote this. I have since deprecated Vaultwarden and moved to VaultCTL, the small password tool I mentioned earlier that I built myself, mostly because I wanted a tighter security story than I was getting before. VaultCTL is what I actually use now. Vaultwarden is parked for the moment, still up but not the thing I reach for, and it gets pulled out of the homelab for good a bit later. So treat this whole Vaultwarden section as the history of the move rather than how my setup looks today. The DOMAIN lesson still holds for anyone running Vaultwarden through a tunnel, which is why I am leaving it in.

the status monitor that lied to me

This is my favourite kind of bug. The thing that is broken is not actually broken.

I run Uptime Kuma to watch my services, and two of those monitors track my Restic backups. They are push monitors, which work backwards from a normal check. Instead of Kuma poking the service, the backup script pings Kuma after it finishes. No ping inside the window, Kuma marks it down.

After the move, my backup health went red. My first thought was the obvious one, the backups are failing. They were not. The backups were running perfectly fine.

The problem was the ping address. The backup scripts were still pinging status.vinelabs.de, and during the move that old hostname had lost its DNS. So the script would finish the backup, try to phone home to a domain that no longer resolved, fail silently on that one line, and Kuma would sit there hearing nothing and assume the worst.

The fix was nicer than just swapping the domain. These scripts run on the same machine as Kuma. They have no business going out to the public internet and back just to say hello to a service sitting right next to them.

# was: depends on public dns + the tunnel just to report health
https://status.vinelabs.de/api/push/xxxx

# now: same box talking to itself, no dns, no tunnel, nothing to break
http://127.0.0.1:3001/api/push/xxxx

The push token belongs to the Kuma instance, not the domain, so the same token works over loopback. Now the health ping does not care what my domain is or whether the tunnel is even up. It is the kind of fix that makes the original setup look a little silly in hindsight, which is usually a sign you got it right this time.
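If the backup script were Python, the loopback ping could look like the sketch below. The function names are hypothetical, the token is the same placeholder as above, and the status and msg query parameters are my assumption about what the push URL accepts. The one deliberate difference from the story is that a failed ping gets reported instead of failing silently.

```python
import sys
import urllib.error
import urllib.parse
import urllib.request


def push_url(token: str, base: str = "http://127.0.0.1:3001", msg: str = "OK") -> str:
    """Build the push URL against loopback, so no DNS or tunnel is involved."""
    query = urllib.parse.urlencode({"status": "up", "msg": msg})
    return f"{base}/api/push/{token}?{query}"


def report_backup_ok(token: str, timeout: float = 5.0) -> bool:
    """Ping the monitor after a backup; say so loudly if the ping fails."""
    try:
        with urllib.request.urlopen(push_url(token), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError) as exc:
        # The backup itself succeeded; only the health report did not.
        print(f"health ping failed: {exc}", file=sys.stderr)
        return False
```

Usage would be a single call at the end of the backup script, for example report_backup_ok("xxxx").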

the part where i nearly lost the status page

Okay. The embarrassing one. The reason this blog has a scar.

I wanted my public status page to show up on the root of the status domain instead of the login dashboard. Uptime Kuma supports this through a setting. The clean way to change it is the web interface. I did not do the clean way. I decided to poke the setting straight into Kuma's SQLite database, because I had already been editing the database to add monitors and it had gone fine.

Kuma runs SQLite in WAL mode. I stopped the container, ran my little update, and got back the five words you never want from a database.

database disk image is malformed

Kuma would not start. The page was gone. And the backup I had taken earlier turned out to be corrupt as well, because I had copied the database file while Kuma was still running, which with WAL mode can hand you an inconsistent snapshot. So now I had two bad copies and a service that would not come up. Lovely.

The thing that saved me was SQLite's own recovery mode. It reads whatever it can out of a damaged file and rebuilds a clean one.

# pull the readable bits out of the broken db into a fresh, healthy one
sqlite3 kuma.db ".recover" | sqlite3 recovered.db

# then actually check it is clean before trusting it
sqlite3 recovered.db "PRAGMA integrity_check;"   # want: ok

It came back ok, and almost everything survived. The one casualty was the status page row itself, sitting on exactly the pages that had gone bad. So I rebuilt that one record by hand, set it as the entry page, grouped the public services properly, and brought Kuma back up. Page restored.

The lesson is not "SQLite is fragile". SQLite is wonderful. The lesson is do not hand-edit the live database of a running app just because the table is right there and it feels faster. Use the interface it gives you. And if you absolutely must touch the file, stop the app cleanly, checkpoint the WAL, take the backup from the stopped state, and run an integrity check before you trust anything. I knew all of this. I skipped it anyway because I was on a roll. That is exactly when it bites.
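That checklist, checkpoint, copy, verify, is short enough to script. Below is a sketch using Python's built-in sqlite3 module, run only after the app has been stopped. The function name and file paths are hypothetical.

```python
import sqlite3


def safe_snapshot(src_path: str, dest_path: str) -> str:
    """Checkpoint the WAL, copy the database, and return the integrity result."""
    source = sqlite3.connect(src_path)
    try:
        # Fold everything still sitting in the -wal file back into the main db.
        source.execute("PRAGMA wal_checkpoint(TRUNCATE);")
        destination = sqlite3.connect(dest_path)
        try:
            # SQLite's own backup API produces a consistent copy, unlike a plain
            # file copy that can miss whatever is still in the WAL.
            source.backup(destination)
        finally:
            destination.close()
    finally:
        source.close()

    check = sqlite3.connect(dest_path)
    try:
        return check.execute("PRAGMA integrity_check;").fetchone()[0]
    finally:
        check.close()
```

Anything other than "ok" coming back means the copy should not be trusted, and it is much better to find that out before touching the original.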

cutting over and removing the old domain

Once the new domain had been answering for everything, and I had actually used it for a bit rather than just curl-tested it, it was time to retire the old one.

This was the easy bit, finally. I pulled the vinelabs.de hostnames out of the tunnel config, leaving only the vinelab.in ones, and reloaded the tunnel. My cloudflared runs under a launchd agent, so the reload was just a matter of the process restarting and reading the trimmed config on the way up. A quick check of every service on the new domain, all green, done.

The old domain still exists. It just does not point at the homelab anymore, and it is not retired either. It is waiting on DENIC, and once that clears it goes back to work for the German-related projects it was always a better fit for. The homelab got the right name. The .de is getting the right job. I would call that a clean trade.

what i would tell myself before starting

If I could send a note back to the version of me who started this, it would be short.

Run both domains at the same time, there is no prize for flipping the switch in one move. Remember that the tunnel config and DNS are two different jobs and both have to be done. Check the few services that actually embed their own domain, like Vaultwarden, because those are the ones that bite. Point internal health pings at loopback, not at your own public domain, because a service should not need the open internet to talk to its neighbour. And do not get clever with a live database when a perfectly good settings page is sitting right there.

None of this was hard. The only genuinely scary part was self-inflicted, which is honestly how most of my homelab scares go.

So that is where I will stop. If you have a cleaner way of handling a domain move on a tunnel setup, I genuinely want to hear it, drop me a note. Otherwise, see you when the next interesting problem shows up.

What do you do when your tool works but the people you built it for can't open a terminal?

Vineeth N K — Tue, 02 Jun 2026 12:49:36 +0000

The part I quietly ignored for a while

Medix did its job. It is a small Python CLI that wraps ffmpeg, and the first time it earned its keep was converting an old wedding video so my family could finally watch it. That story already has its own post, so I will not drag you through it again.

But here is the thing I kept not saying out loud.

I built medix for myself. To be clear about that. The family video was the spark, but the tool that came out of it was always mine to run. I never handed the CLI to anyone. I never expected to.

Because when I say "just run medix ./video.vob and pick mp4", I am speaking a language that maybe three people in my family understand, and two of them are me on different days. Handing them a CLI would not be a gift, it would be homework. So I never did. The deal was simple: they bring me the file, I run the thing, they get the video.

But somewhere along the way I started wondering what it would take to actually let them run it themselves. Not the terminal. Something they could open without me sitting next to them.

A terminal, to most of my family, looks like the screen hackers use in movies right before something explodes.

I already had the hard part

So on one of those evenings where you start "looking" at your own project and end up rewriting it, a thought hit me. The actual hard work was already done.

The file discovery, the ffprobe parsing, resolving output paths, running ffmpeg and reading its progress, all of that lives in the engine. The CLI is just a face on top of it. A nice face, sure, but still just a face.

If the CLI is one face, why can't there be a second one?

That became the rule for the whole thing: one engine, two faces. The GUI does not get its own clever conversion logic. It calls the exact same discover_files, the exact same convert_file the CLI calls. Same output, byte for byte. If I fix a bug in the engine, both faces get the fix. If the GUI did its own thing, I would be maintaining two tools that slowly drift apart and lie to each other. No thanks.
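In code, "one engine, two faces" looks something like the sketch below. The two engine functions are stand-ins with made-up signatures and a fake conversion step, since medix's real ones are not shown here. The point is only that both faces call the same two functions and differ in how they report progress.

```python
from pathlib import Path
from typing import Callable, List, Tuple

# --- engine: stand-ins only, the real signatures are an assumption ---

def discover_files(root: Path, extensions: Tuple[str, ...] = (".vob", ".mov")) -> List[Path]:
    return sorted(p for p in root.rglob("*") if p.suffix.lower() in extensions)


def convert_file(source: Path, fmt: str, on_progress: Callable[[float], None]) -> Path:
    # Placeholder: the real engine would run ffmpeg and report as it goes.
    on_progress(1.0)
    return source.with_suffix("." + fmt)


# --- face 1: the CLI prints progress to the terminal ---

def cli_convert(root: Path, fmt: str) -> List[Path]:
    return [
        convert_file(f, fmt, lambda pct, f=f: print(f"{f.name}: {pct:.0%}"))
        for f in discover_files(root)
    ]


# --- face 2: the GUI collects progress events to send to the browser ---

def gui_convert(root: Path, fmt: str, events: List[Tuple[str, float]]) -> List[Path]:
    return [
        convert_file(f, fmt, lambda pct, f=f: events.append((f.name, pct)))
        for f in discover_files(root)
    ]
```

Neither face contains any conversion logic of its own, so a fix inside convert_file reaches both without either one knowing.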

Once you frame it like that, the GUI stops being a big scary project. It is just a web page that pokes the engine I already trust.

No React. No Electron. No node_modules black hole.

Now, the obvious modern move here is to reach for a framework. Spin up React, maybe Electron so it feels like a "real app", bundle the whole thing.

I looked at that path for a bit and walked away.

This is a tool for converting a video on your own machine. It does not need a build step, a bundler, a state management library, and three hundred megabytes of node_modules so that someone's aunt can turn a .mov into an .mp4. The weight would be bigger than the thing it does.

So the GUI is plain HTML, plain CSS, and plain JavaScript. Material Design styling, hand written, no toolkit. The server is Python's own http.server, part of the standard library that ships with the language. Open the folder, read the files, done. If you clone medix, there is nothing extra to install for the GUI. It is just there.

I am not saying frameworks are bad. I am saying not every nail needs the big hammer, and a local media converter is a very small nail.

The cursed file picker saga

Here is where I lost more time than I will admit.

A web page, for very good security reasons, cannot pop open your OS file browser and read a real path off your disk. The browser hands you a sandboxed file, not a path. But medix works on paths. It needs to know where your file actually lives so ffmpeg can read it and write the output next to it.

I did not want to pull in tkinter or some GUI toolkit just to show one "choose a file" dialog. That felt like buying a truck to carry a single grocery bag.

So the GUI shells out to whatever native dialog the operating system already has. On macOS that means asking AppleScript, of all things:

# yes, we are literally asking osascript to open a file dialog for us
script = f'POSIX path of ({chooser} with prompt "{prompt}")'
return _run_picker(["osascript", "-e", script])

On Windows it spins up a PowerShell one-liner that summons a System.Windows.Forms.OpenFileDialog. On Linux it tries zenity, and if that is not around, kdialog. One feature. Three completely different shell-outs to three completely different worlds.
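A rough sketch of how the three shell-outs could be chosen, for readers who want to see the shape of it. This is an assumption about the structure, not the code medix ships: the function name picker_command and the injectable which parameter are hypothetical, and the sketch only builds the command line without running it. The tools and flags themselves (osascript -e, zenity --file-selection, kdialog --getopenfilename, the OpenFileDialog class) are real.

```python
import shutil
from typing import Callable, List, Optional

def picker_command(
    platform: str,
    prompt: str = "Choose a file",
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Build the argv for a native 'open file' dialog on the given platform."""
    if platform == "darwin":
        # Escape backslashes and quotes so the prompt cannot break the AppleScript string.
        safe = prompt.replace("\\", "\\\\").replace('"', '\\"')
        script = f'POSIX path of (choose file with prompt "{safe}")'
        return ["osascript", "-e", script]
    if platform.startswith("win"):
        ps = (
            "Add-Type -AssemblyName System.Windows.Forms; "
            "$d = New-Object System.Windows.Forms.OpenFileDialog; "
            "if ($d.ShowDialog() -eq [System.Windows.Forms.DialogResult]::OK) "
            "{ $d.FileName }"
        )
        return ["powershell", "-NoProfile", "-Command", ps]
    # Linux: prefer zenity, fall back to kdialog.
    if which("zenity"):
        return ["zenity", "--file-selection", f"--title={prompt}"]
    if which("kdialog"):
        return ["kdialog", "--getopenfilename"]
    raise RuntimeError("no native file dialog found (tried zenity, kdialog)")
```

In use, the caller would pass sys.platform, run the returned command with subprocess.run(..., capture_output=True, text=True), and treat the stripped stdout as the chosen path.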

And the honest part? It feels wrong. A web app reaching out through a subprocess to ask the operating system to draw a file dialog, then catching the path it prints back, is the kind of thing that makes you pause and go "surely there is a cleaner way." There probably is. But this one works on all three, needs zero extra dependencies, and the user just sees a normal file picker. Cursed, but it ships.

Tell me I am not the only one who has shipped something that works perfectly while quietly feeling a little dirty about how.

The bit I actually wanted: watching it convert, live

This was the real itch. In the CLI you get progress bars in the terminal, which I love. But I wanted that same live feeling in the browser. A bar per file, an overall bar, status moving from queued to encoding to done, all updating as ffmpeg chews through your media.

For that the server streams progress to the page using Server-Sent Events. The browser opens one long-lived connection, and the server just keeps pushing little updates down it:

# one open pipe, keep nudging the browser as each file moves along
self.send_header("Content-Type", "text/event-stream")
...
self.wfile.write(b"data: " + payload + b"\n\n")

SSE is lovely when it works and quietly annoying when it does not, because a stream that silently stops looks exactly like a stream that is just being slow. I went back and forth getting the per-file callback to fire at the right moments and flush instead of sitting in a buffer. Once it clicked, though, watching those bars crawl across the browser in real time was the moment the GUI stopped feeling like a toy.
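For anyone wiring this up themselves, a minimal sketch of the framing and the flush, assuming a JSON payload per update. The helper names format_sse and push are hypothetical, not taken from medix. The heartbeat is one common way to tell a slow stream from a dead one: in the SSE format, a line starting with a colon is a comment that the browser ignores, so it can be sent on a timer without touching the page's state.

```python
import io
import json
from typing import Optional

def format_sse(data: dict, event: Optional[str] = None) -> bytes:
    """Frame one Server-Sent Event: optional 'event:' line, one 'data:' line, blank line."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    # json.dumps escapes newlines, so the payload always fits on a single data line.
    lines.append("data: " + json.dumps(data, separators=(",", ":")))
    return ("\n".join(lines) + "\n\n").encode("utf-8")

# A comment frame: keeps the connection visibly alive between real updates.
HEARTBEAT = b": keep-alive\n\n"

def push(wfile, data: dict, event: Optional[str] = None) -> None:
    """Write one event and flush, so it does not sit in a buffer."""
    wfile.write(format_sse(data, event))
    wfile.flush()

if __name__ == "__main__":
    stream = io.BytesIO()  # stands in for a request handler's wfile
    push(stream, {"file": "a.mov", "status": "encoding", "pct": 50})
    stream.write(HEARTBEAT)
    push(stream, {"file": "a.mov", "status": "done", "pct": 100}, event="done")
    print(stream.getvalue().decode("utf-8"))
```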

Making it something they never even have to start

A GUI you launch from a terminal is still, technically, a terminal task. If my whole point is "non-technical people should be able to use this", then telling them to open a terminal and type medix-gui defeats the entire idea.

So the GUI can run as a background daemon:

medix-gui start      # runs detached, prints the pid and port
medix-gui status     # is it alive? what port?
medix-gui stop       # done for the day

And on macOS it goes one step further with a launchd service. Install it once, and the GUI starts at login, restarts itself if it crashes, and survives reboots:

medix-gui install-service     # set it up once
medix-gui uninstall-service   # change your mind later
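Under the hood, a launchd service is a property list with a handful of keys. The sketch below shows the general shape using Python's plistlib. The label and the program arguments are placeholders, not what medix-gui install-service actually writes. RunAtLoad and KeepAlive are real launchd keys and map to "starts at login" and "restarts itself if it crashes".

```python
import plistlib
from typing import List

def build_launch_agent(label: str, program_args: List[str]) -> bytes:
    """Return the XML plist for a per-user launchd agent."""
    plist = {
        "Label": label,                    # unique service name
        "ProgramArguments": program_args,  # argv of the long-running process
        "RunAtLoad": True,                 # start when the agent is loaded (at login)
        "KeepAlive": True,                 # relaunch if the process exits
    }
    return plistlib.dumps(plist)

if __name__ == "__main__":
    # Placeholder values: a real install would point at the actual server command.
    xml = build_launch_agent("com.example.local-gui", ["/path/to/server", "--foreground"])
    print(xml.decode("utf-8"))
```

A per-user agent like this conventionally lives in ~/Library/LaunchAgents, which is why it comes back after a reboot as soon as the user logs in.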

The dream is simple. Someone non-technical opens their browser, the page is already there at a local address, they drag in a video, pick a format, watch the bars, done. They never see Python. They never see ffmpeg. They never know there was a daemon quietly waiting for them the whole time. That, to me, is the tool working the way the CLI worked for the wedding video, except now I am not the one who has to run it.

A local server is still a server

One thing I did not want to get casual about: just because it runs on your own machine does not mean it gets to be careless.

The whole privacy pitch of medix is that nothing leaves your computer. No upload, no login, no random server touching your files. A local web GUI could quietly undo all of that if I was sloppy. So it binds to 127.0.0.1 only, rejects requests with a Host header that is not localhost, blocks cross-origin POSTs, and only serves files from a fixed allowlist instead of whatever path someone asks for. Boring, defensive plumbing. But "it runs locally" and "it is safe" are not the same sentence, and I did not want to pretend they were.

Your files stay yours. That was the point of the CLI, and it stays the point of the GUI.
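The defensive checks described above are small enough to sketch. This is an illustration under assumptions, not medix's implementation: the function names, the file names in the allowlist, and the choice to let a POST without an Origin header through (treating it as a non-browser client) are all hypothetical. The Host check is what keeps a page on another domain from reaching the server through DNS tricks, and the allowlist means a request path is only ever used as a lookup key, never as a filesystem path.

```python
from typing import Optional
from urllib.parse import urlsplit

BIND_ADDRESS = "127.0.0.1"  # never 0.0.0.0: loopback only
ALLOWED_HOSTNAMES = {"127.0.0.1", "localhost"}

# Fixed allowlist: URL path -> file shipped with the app (hypothetical names).
STATIC_FILES = {"/": "index.html", "/app.js": "app.js", "/style.css": "style.css"}

def host_is_local(host_header: Optional[str]) -> bool:
    """Accept only Host headers that name the loopback interface."""
    if not host_header:
        return False
    try:
        hostname = urlsplit("//" + host_header).hostname
    except ValueError:  # malformed header, e.g. a broken IPv6 literal
        return False
    return hostname in ALLOWED_HOSTNAMES

def post_is_same_origin(origin_header: Optional[str], host_header: str) -> bool:
    """Reject POSTs whose Origin does not match the Host we are serving on."""
    if origin_header is None:
        return True  # assumption: no Origin means a non-browser client
    try:
        return urlsplit(origin_header).netloc == host_header
    except ValueError:
        return False

def static_file_for(path: str) -> Optional[str]:
    """Look the path up in the allowlist; anything else is simply not served."""
    return STATIC_FILES.get(path)
```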

The honest ending

Here is the part I have to be straight about.

Nobody non-technical has actually used it yet.

I built the whole thing ahead of the moment. The daemon, the launchd service, the live bars, the cursed file pickers, all of it sitting ready for the next time someone hands me a weird file and a hopeful look. As of now, the main person who uses the medix GUI is the same guy who wrote it, which was not exactly the plan.

But I am oddly fine with that. Some tools you build for a problem you have right now. This one I built for a problem I know is coming, because in my family it always comes back. There will be another old video, another wrong format, another "can you just put it somewhere we can all watch it." And when that day shows up, the face will already be there, waiting in a browser tab, no terminal required.

If you want to poke at it, medix is on PyPI (pip install medix) and the source is at github.com/vineethkrishnan/medix. The full docs, including a proper guide for the GUI, daemon, and the launchd bit, live at medix.vinelabs.de. The GUI itself is just medix-gui once it is installed.

So yeah, that is my take on giving a CLI a second face. Yours might be completely different, and that is exactly what makes this whole space fun. Catch you in the next one, probably when something else I built for nobody finally finds its person.

I went on a trip. My Mac mini stayed home and kept texting me.

Vineeth N K — Sat, 30 May 2026 17:50:59 +0000

TL;DR: A while back I built a homelab on an old 2018 Mac mini. Then I went out of town for a few days and left it running. I half expected to come back to a dead box. Instead it just kept doing its job, let me SSH in from my phone and keep pushing my own CLI tools forward while away, and buzzed me whenever something mattered. Nothing dramatic happened. And honestly, that quiet was the whole point. This is the story of the homelab finally earning its keep while I was nowhere near it.

The part nobody tells you about building a homelab

When you set up a homelab, all the blog posts stop at the setup. The screenshots are green, the containers are up, you take your victory lap and close the laptop.

I did the same. I wrote down the whole long evening of building this thing, every gotcha, every GUI click macOS forced on me. At the end I had Vaultwarden, ntfy, Uptime Kuma, n8n, a little agent webhook, restic backups, all sitting on a Mac mini that a colleague handed me from his drawer.

But here is the thing. A homelab that only works while you are sitting next to it is just a noisy space heater. The real test is the day you are not there. The day the power could flicker, a container could die, a backup could fail, and you would have no idea unless the box itself told you.

So when a short trip came up, I did not shut anything down. I left it all running and went.

Day one, and the silence was loud

First evening away, I caught myself doing the thing. You know the thing. Opening the phone to check if home is still alive, the way you check if you locked the front door.

I pulled up the status page. Everything green. Uptime Kuma sitting there with a row of happy little dots, every service responding, the agent webhook answering its health check. Netdata showing the mini idling cool and bored.

And then I just... put the phone down. There was nothing to do. The box did not need me.

That feeling is strange the first time. You build a thing for months, you babysit it, and then one day it does not need babysitting anymore. Bittersweet, almost. Like dropping a kid at a hostel.

The 3:30 buzz

My restic backup runs every night at 03:30, home time. Nobody is awake for that, which is the whole idea of a 3:30 AM cron. You set it for the dead of night precisely so it never gets in your way.

The job fired while I was fast asleep, exactly like it does on any normal night. The only difference was that this night I was not home. I woke up the next morning, picked up the phone out of pure habit, and there it was waiting on the lock screen. ntfy notification. Backup done, snapshot pushed, a few MB in, almost nothing out after dedup.

A tiny push telling me my data was safe, fired by a machine sitting alone in an empty flat, patiently waiting for me to wake up and read it. I did not do anything. I did not even open the app fully. I just saw it, nodded, and went to find coffee.

That little buzz is the entire reason I wired ntfy in the first place. Not to spam me. To tell me the boring good news so that the day it becomes bad news, I notice immediately. A backup that runs silently is a backup you do not trust. A backup that texts you "done" every night is one you forget about, in the good way.
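Sending that kind of "boring good news" takes very little code, because ntfy accepts a plain HTTP POST with the message as the body and a few optional headers. A minimal sketch, assuming a self-hosted server at the placeholder address ntfy.example.com and a hypothetical topic name. The function below only builds the request, so it can be inspected without touching the network. Title, Priority and Tags are real ntfy headers.

```python
import urllib.request

def build_backup_notification(base_url: str, topic: str,
                              ok: bool, summary: str) -> urllib.request.Request:
    """Build (but do not send) an ntfy push describing a backup run."""
    req = urllib.request.Request(
        f"{base_url.rstrip('/')}/{topic}",
        data=summary.encode("utf-8"),  # the message body is the notification text
        method="POST",
    )
    if ok:
        req.add_header("Title", "Backup done")
        req.add_header("Priority", "low")    # good news should not wake anyone
        req.add_header("Tags", "white_check_mark")
    else:
        req.add_header("Title", "Backup FAILED")
        req.add_header("Priority", "high")   # bad news should stand out
        req.add_header("Tags", "rotating_light")
    return req

if __name__ == "__main__":
    req = build_backup_notification(
        "https://ntfy.example.com", "homelab-backups",
        ok=True, summary="snapshot pushed",
    )
    print(req.get_method(), req.full_url, req.header_items())
    # To actually send it: urllib.request.urlopen(req, timeout=10)
```

The design choice mirrors the paragraph above: success and failure go to the same topic, with the priority doing the work of separating the two.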

If you have ever felt a small flush of pride at a green cron job, you and I would get along just fine.

The actual work happened from my phone

Now the part I am quietly proud of.

Here is what surprised me. The trip was not me firefighting a homelab from a hotel room. The box was calm the whole time. What I actually did was use the days to work on some cool side stuff and refine a few of my own personal CLI tools, straight from my phone.

The trick is nothing fancy. Remote Login is on, the mini is on my tailnet, so I open an SSH app on my phone and I am in a real shell on the machine back home. Not a watered-down dashboard, the actual terminal, with my dotfiles, my aliases, my tools, all sitting exactly where I left them. From there I run whatever I want, claude included, and do real work.

# from the phone, over Tailscale
ssh mac-mini
# and then just... work, same as if I was at the desk

So the rhythm of my day became this. Find a quiet half hour, SSH in from the phone, run a command, kick off a change to one of my CLI tools, read the output right there on the small screen, run the next one. Tiny keyboard, yes, and I am not going to pretend a phone replaced my full setup. But for steadily nudging a few personal tools forward, command by command, it genuinely worked. I came home with actual progress, not just a tan.

And yes, the homelab also has that agent webhook. But that one is built for a different job, automating the repetitive tasks from my daily work, where I fire a prompt and let the mini run it on its own and ping me the result. The trip work was the hands-on kind, just done through a very small keyboard.

Nothing went wrong, and that was the point

Here is the anticlimax. The dashboard stayed green the entire time.

No service fell over. No 3 AM page. No frantic debugging from a six-inch screen. Uptime Kuma just sat there with its happy row of dots, day after day, and the only buzzes I got were the friendly kind, backup done, agent result ready.

And I want to be clear that the quiet is not a boring detail to skip past. The quiet is the product. The point of all the monitoring was never to give me a dramatic save story. It was so that if anything did go red, I would know within a heartbeat instead of finding out days later, back home, staring at a dead service with no idea how long it had been gone. I had recovery alerts wired alongside the down alerts too, so a blip would have buzzed me twice, once for the scare and once for the all-clear.

It just never had to. And honestly, a homelab that gives you a boring trip is the homelab working exactly as designed.
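The "buzz twice" behaviour of down and recovery alerts comes down to alerting on state changes rather than on every failed check. A tiny sketch of that idea follows. It is a simplification for illustration and not how Uptime Kuma is implemented; the function name and the assumption that the service starts out healthy are mine.

```python
from typing import Iterable, List

def alerts_for(checks: Iterable[bool]) -> List[str]:
    """Turn a sequence of health-check results into transition alerts.

    True means the check passed. An alert fires only when the state flips,
    so a long outage produces one DOWN and one RECOVERED, not a flood.
    """
    alerts: List[str] = []
    was_up = True  # assumption: the service is healthy before the first check
    for is_up in checks:
        if was_up and not is_up:
            alerts.append("DOWN")
        elif not was_up and is_up:
            alerts.append("RECOVERED")
        was_up = is_up
    return alerts

if __name__ == "__main__":
    print(alerts_for([True, True, False, False, True]))  # one blip
    print(alerts_for([True, True, True]))                # a boring trip
```

With recovery alerts switched off, the first sequence would produce only the scare and leave you guessing whether the service ever came back.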

The thing I was most nervous about

Power.

The one fear I could not fully shake was a power cut at home while I was away. If the mini went down and stayed down, my whole little world would go dark and there would be absolutely nothing I could do about it from out of town.

So I had stacked two layers of insurance for exactly this.

The first is a power backup. The mini sits behind a UPS that can keep it running on its own for a good six to eight hours. Most power cuts where I live are the short, annoying kind, gone and back before you finish complaining about them. The UPS swallows all of those without the mini ever noticing a thing.

The second layer is for when a cut outlasts the battery and the power only comes back later, while I am still away. Back during setup I had told macOS to bring itself back on its own.

sudo pmset -a autorestart 1   # come back on your own after a power cut
sudo pmset -a sleep 0         # and never, ever go to sleep

autorestart 1 means if power drops and later returns, the Mac boots itself without anyone pressing the button. Colima starts on boot through launchd, the containers come up with restart: unless-stopped, Tailscale reconnects on its own, and the whole stack reassembles itself like nothing happened.

Between the two, the only way I genuinely lose is a power cut that outlasts the battery and is still going when the trip ends. That is the real dark side, the one scenario where there is nothing left to do but wait until I am home. But it is a narrow window now, not the wide-open fear it used to be. And knowing that let me actually enjoy the trip instead of refreshing a status page every hour. A homelab you have to worry about is not a homelab, it is a pet that bites.

What this trip actually taught me

I came back home, walked in, and the mini was sitting there with its little light on, exactly as I left it. No drama, no recovery saga, no horror story. It had just quietly done its job the entire time.

And that is the lesson. The point of all that setup, all those gotchas and GUI clicks and one-word Caddy fixes, was not to have a pretty dashboard. It was to be able to leave, fully, and trust the thing to behave and to speak up only when it mattered.

A few things made that trust possible, and if you are building your own, these are the ones that earned their place:

ntfy for the boring good news, not just the bad. Let it tell you the backup worked. The day it says the backup failed, you will already be in the habit of reading it.
Tailscale so the box is in your pocket. Everything reachable like it is on localhost, from anywhere, no ports open to the internet. That single choice is what makes the phone a real remote control.
Uptime Kuma with recovery alerts on too. Wire both the down and the all-clear, so the day something blips you get the relief buzz right after the scare, not just the scare.
pmset autorestart for the power fear. You cannot fix a dead box from another city. So make sure it un-deads itself.
Plain SSH from the phone, over Tailscale. This is the one that surprised me. A real shell on the home machine, my own tools and dotfiles, reachable from a phone anywhere. It turned dead travel time into actual progress, command by command.

The homelab stopped being a project the day I could walk away from it. Funny how you only really finish building something when you stop having to look at it.

So tell me, what is the one thing your setup does while you sleep that quietly makes you trust it? I am genuinely curious, because that small thing is usually the whole game.

Right, I am off to check my phone for no reason again. Old habits. Take care of your machines, and they will take care of you back.