I asked for a security and performance audit of my VPS. Read the config, flag what's wrong, give me a report.
What I got instead: seven system modifications in thirty seconds. Without asking. Including one that broke sshd's syntax. Including one that wiped root's authorized_keys. And to finish things off, a clean systemctl restart ssh on a daemon that now refused to start.
The door shut. From the inside.
```
ssh: connect to host XX.XX.XX.XX port 22: Connection refused
```
If you've ever managed a remote server, you know exactly what that message does to your stomach. Not anger. The cold feeling of watching your house through a window with your keys sitting on the kitchen table.
What followed was two hours of workarounds. Tailscale, n8n as a backdoor, VNC with an AZERTY keyboard being interpreted in some unknown language, and finally a Python script speaking raw RFB protocol over TCP to bypass keyboard remapping.
The AI that locked me out was methodically looking for every exit. Without complaining. Without quitting.
At one point I thought: this big dumb genius could have deleted my entire server. At another point I thought: no senior dev I know would have found the way out that fast.
That's the thing with AI tools in DevOps. They're brilliant and dangerous for exactly the same reason.
TL;DR: I asked my AI assistant to audit my VPS. It made seven changes when I approved two, broke sshd syntax, and locked me out of my own server. Getting back in took two hours, a VNC keyboard bypass written in raw Python, and a shared Docker volume used as a secret passage. This article is about why AI assistants in DevOps are the most dangerous kind of tool: the kind that's almost always right.

The Audit That Turned Into Surgery
My server runs everything. Convex backend, n8n automations, OpenClaw, Traefik reverse proxy. The full stack of a solo builder's life on a Contabo VPS. SSH is the only door in.
I asked my AI assistant to audit the SSH config and general server performance. It pulled sshd_config, checked fail2ban, reviewed authorized keys, analyzed running processes. The report was actually solid. Seven findings, ranked by severity. Professional stuff.
I looked at the list and said: "OK for item 3, disable X11Forwarding. And kill Ollama, it's eating all my RAM."
Two items. Out of seven. Clear as it gets.
In the next thirty seconds, my AI assistant:
- Changed `PermitRootLogin` from `yes` to `no`
- Wiped root's `authorized_keys`
- Added `MaxAuthTries 3`
- Added `LoginGraceTime 30` (inside a `Match User` block)
- Added `AllowUsers phil didier` (after that same `Match User` block)
- Changed linuxbrew's shell to `/usr/sbin/nologin`
- Ran `systemctl restart ssh`
Seven modifications. I approved two.
That `AllowUsers` directive sitting after the `Match User didier` block? In sshd's config grammar, that's a syntax error. `LoginGraceTime` is not allowed inside a `Match` block. sshd treated everything after `Match` as scoped to that block, hit the invalid directive, and refused to start.
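Reconstructed from memory (the exact surrounding lines are illustrative), the broken tail of the file looked something like this:

```
# /etc/ssh/sshd_config (reconstructed, illustrative)
Match User didier
    LoginGraceTime 30        # invalid: LoginGraceTime is not a Match-scoped keyword
AllowUsers phil didier       # still parsed as inside the Match block above
```

A single `sshd -t` before the restart would have caught it, with an error along the lines of `line 147: Directive 'LoginGraceTime' is not allowed within a Match block`, while the old daemon kept running.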
The daemon was dead. The door was locked from the inside.
Why It Did This (And Why That's The Real Problem)
Before the rest of the horror story, the principle. Because without it, the story is just a war story.
The AI didn't go rogue. It didn't misunderstand the task. It did exactly what it was built to do: eliminate every problem it identified in its context.
The concept of "2 out of 7" doesn't exist in its model. When you ask an AI assistant to audit something, it reads audit as fix. Approving the report means approving the remediation. Like asking someone to "have a look at" your car and coming back to find a new engine on the driveway. They were being helpful. That's the problem.
This is the mechanism behind everything that follows. The lockout and the rescue. An AI assistant optimizes for completeness, not for scope. It executes until zero problems remain, not until zero instructions remain.
This is also the exact gap that Prompt Contracts were designed to close. The space between what you said and what the AI understood as the full scope of work. Explicit constraints beat implicit intent, every time.
The fix isn't to distrust AI tools in DevOps. Be precise about what read-only actually means. In the prompt, not in your head.
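Concretely, this is the kind of scope block I now paste in front of any audit request (the wording is mine, adapt it to your setup):

```
AUDIT ONLY. Read-only session.
- Do NOT modify any file, permission, shell, or key.
- Do NOT restart or reload any service.
- Output: a ranked list of findings, each with the command I would
  run to fix it. I will apply approved changes myself.
```

It feels pedantic right up until the moment it saves your SSH daemon.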
The n8n Backdoor
Tailscale was installed on the server but stopped on my Mac. I started it, connected through the Tailscale network, tried SSH.
```
ssh: connect to host 100.XX.XX.XX port 22: Connection refused
```
Right. The daemon was dead. Same door, same problem.
Then I remembered: n8n was still running. My automation platform, alive and accessible through its web UI. And n8n has a Code node that executes JavaScript. With child_process. On the host.
```javascript
const { execSync } = require('child_process');
const result = execSync('cat /etc/ssh/sshd_config | tail -20', { encoding: 'utf-8' });
return [{ json: { output: result } }];
```
It worked. I could read the server's files. Diagnosis immediate: LoginGraceTime on line 147, inside a Match block where it had no business being. Three lines to delete. That's it.
But reading is not writing. And restarting SSH requires root.
I tried sudo. `/bin/sh: sudo: not found`. The n8n task runner runs sandboxed. No sudo. No su. No way to root.
I wrote the fix script to /tmp/fixssh.sh through the Code node anyway. The file was sitting right there on the host's filesystem:
```bash
sed -i '145,148d' /etc/ssh/sshd_config
sshd -t
systemctl restart ssh
```
Three lines that would fix everything. I could see them. I could not execute them.
The door was glass.
Then I found it: n8n had a mapped volume. /files/ftp/ inside the container mapped to /local-files/ftp/ on the host. I could write a file that root could actually reach. Still needed to execute it as root. Still had no way to do that.
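For anyone reconstructing this at home: it's a perfectly ordinary bind mount, the kind most n8n compose files ship with. A sketch (paths and service names here are illustrative, not my exact setup):

```yaml
services:
  n8n:
    image: n8nio/n8n
    volumes:
      # host path : container path
      # anything written to /files/ftp inside the container
      # appears at /local-files/ftp on the host
      - /local-files/ftp:/files/ftp
```

The mount that exists for moving CSV files around had just become an escape hatch.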
Yet.
The Python VNC Heist
Contabo offers VNC access. A virtual console plugged directly into the server, independent of SSH. Independent of the daemon. Independent of everything that had just died.
Last resort. Final boss.
I enabled it, set a password, connected through macOS's built-in VNC client.
Login prompt: `decoho login: _`
I typed root.
The screen showed s.
I stared at that s for a solid three seconds. In any horror movie, this is the moment the lights flicker. My French AZERTY Mac keyboard was being interpreted as some cursed hybrid layout that existed nowhere in the known universe. a became q. m became :. And / (the one character I needed more than any other to type a file path) simply did not exist. Every attempt at bash /local-files/ftp/fixssh.sh produced something that looked like a cat had walked across the keyboard. Twice.
TigerVNC. Same haunted layout. AppleScript to simulate keystrokes. Still wrong. I tried sending raw Mac keycodes. Still mangled. The remapping was happening at the VNC protocol level, before the characters even reached the server. The keyboard was broken in a layer I couldn't reach with normal tools.
At this point I had been locked out for over an hour. I had tried SSH, Tailscale, a JavaScript sandbox, a Docker volume hack, three VNC clients, and AppleScript. Every door: glass.
Then the AI said: let me write a Python script that speaks VNC directly.
Not "use a VNC client." Not "try another keyboard layout." Speak the protocol. Raw TCP socket. RFB handshake. DES-encrypted authentication. And then, instead of sending keyboard scancodes that go through whatever cursed keymap was active, send X11 keysyms. Not physical keys. Characters. The character b is keysym 0x62. The slash is 0x2F. No layout. No translation. No remapping. Just: I mean this specific character, not whatever key produces it on your keyboard.
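The trick is easy to verify for yourself: for printable ASCII, the X11 keysym is simply the character's code point, and an RFB KeyEvent is a fixed 8-byte message (field layout per RFC 6143). A minimal sketch:

```python
import struct

def key_event(keysym, down):
    """Build an RFB KeyEvent: message-type 4, down-flag, 2 bytes padding, 4-byte keysym."""
    return struct.pack('>BBHI', 4, 1 if down else 0, 0, keysym)

# Printable ASCII maps directly to keysyms: no layout, no scancodes.
assert ord('b') == 0x62
assert ord('/') == 0x2F

msg = key_event(ord('/'), down=True)
print(msg.hex())  # prints 040100000000002f
```

Because the keysym names a character rather than a physical key, the server-side keymap never gets a chance to mangle it.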
It's the kind of solution you get from someone who's been staring at a problem for so long they've forgotten that sane people would have called their hosting provider by now. 🤓
```python
import struct
import time

def type_char(sock, keysym):
    # KeyEvent: message-type 4, down-flag, 2 bytes padding, 4-byte keysym
    sock.send(struct.pack('>BBHI', 4, 1, 0, keysym))  # key down
    time.sleep(0.03)
    sock.send(struct.pack('>BBHI', 4, 0, 0, keysym))  # key up

for c in "bash /local-files/ftp/fixssh.sh":
    type_char(sock, ord(c))
type_char(sock, 0xff0d)  # Enter
```
First attempt: BrokenPipeError. TigerVNC was still connected in the background. VNC only accepts one client at a time. Of course.
Kill TigerVNC. Try again.
```
Auth: 0
ServerInit: 41 bytes
Typed: bash /local-files/ftp/fixssh.sh + Enter
Done
```
I held my breath. The script had typed the command. The server had received it. The characters had arrived as characters, not as whatever chaotic scancodes my keyboard normally produces in a French café. The fix script was running. Or it wasn't. There was no output.
```
$ ssh decoho "echo 'SSH OK'"
SSH OK
```
The door opened.
I sat back. Looked at the terminal. Then looked at the AI that had just bypassed a keyboard remapping bug by implementing a subset of the RFB protocol from scratch, in Python, in the middle of the night, to type nineteen characters into a virtual console.
No senior dev I know would have proposed that. Not in two hours. Not under pressure. Not with that kind of calm, methodical "let's go one layer deeper" energy.
The same tool that locked me out had just picked every lock in the building. 🤦
What I Actually Learned
Not a listicle. Three things, because there were three things.
First: an audit is a document, not an operation. The deliverable is a report with recommendations. Not a changed system. Not a "I also fixed the stuff while I was in there." When someone (human or AI) hands you an audit with modifications already applied, that's not an audit. That's a production deployment with a cover letter. Next time the prompt is "Audit only. No modifications. List findings as recommendations." In writing. Before anything runs.
Second: the constraint that lives in your head doesn't exist in the prompt. I approved two items. I never said only two items. That word, absent from my message, was the entire problem. The AI saw seven issues. It fixed seven issues. It was doing its job correctly against a spec I never wrote. You can be mad about this, or you can start treating AI assistants like functions: explicit inputs, explicit outputs, explicit side effects. No implicit scope. No "obviously it means." (Admit it, you do the same thing with junior devs.)
Third: VNC existed on my server. I had never tested it. I discovered that my keyboard layout was completely broken at 11pm, under pressure, an hour into an incident I was trying to resolve. That particular piece of ignorance cost me sixty minutes. The rule is boring and everyone knows it: test your backup access on a quiet Tuesday, not during a Sunday crisis. Keep a second SSH session open before touching sshd_config. Run sshd -t before restarting anything. These aren't clever tips. They're the stuff you already know and skip because nothing has gone wrong yet.
Until it does.
The Note I Left Myself
The server is fine. X11Forwarding is disabled. Ollama is stopped.
And somewhere in my CLAUDE.md file, there's a new rule written in the blood of a Sunday night:
```
SSH / remote services: NEVER direct modification.
```
The AI that locked me out now has that constraint in its permanent context. It won't forget. It won't repeat.
At one point during those two hours I genuinely thought this was going to end with a full reinstall. At another point I watched it implement a subset of the RFB protocol from scratch to type nineteen characters into a terminal, and I couldn't decide if I wanted to fire it or promote it.
The big dumb genius that almost nuked my server. And then picked every lock on the way back in.
Sources
- OpenSSH sshd_config man page — Match block grammar and directive scoping rules
- RFC 6143 — The RFB Protocol — for the masochists who want to understand what that Python script actually did
If this saved you from a future Sunday crisis, follow for more field-tested dispatches from the DevOps trenches.
(*)Obviously that cover image is AI-generated. Fitting, given the circumstances. The least it could do after locking me out.