YUVRAJ

Posted on May 24

I gave an AI a Kill Switch. Here's what I learned about trust in local-first tooling.

#opensource #ai #git #webdev

There's a moment every developer hits when they're using an AI
coding tool.

You paste in something sensitive. A database schema. An internal
API structure. A piece of logic that took three months to figure out.

And then you think: where did that just go?

That moment is why I built the Air-Gap mode in Rogue Studio.
And building it taught me more about the architecture of trust
in developer tooling than anything I've built before.

The problem with "local AI support"

Most major AI coding tools now claim to support local models.

And technically, they do. You can point them at Ollama.
You can run Llama locally. The checkbox exists.

But "supports local models" is not the same as
"guarantees your code stays local."

Here's what can still happen in a tool that advertises local model support:

The local model setting is a preference, not a policy
Telemetry calls still go out regardless of model choice
Error reporting sends context to external servers
Fallback logic silently switches to a cloud provider when the local model is slow or unavailable
There's no enforcement layer, just a UI toggle that you have to trust
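
To make the preference-versus-policy difference concrete, here is a
minimal sketch of the fallback case. Every name in it (ProviderPolicy,
resolveProvider, LocalOnlyError) is hypothetical and not taken from any
real tool. It only shows how "prefer local" and "local only" diverge
when the local model is unavailable.

```typescript
// Hypothetical names throughout; this is an illustration, not any tool's real code.
type ProviderPolicy = "prefer-local" | "local-only";

class LocalOnlyError extends Error {}

function resolveProvider(
  policy: ProviderPolicy,
  localAvailable: boolean,
  cloudFallback: string
): string {
  if (localAvailable) return "ollama";
  if (policy === "local-only") {
    // Fail closed: refuse the request rather than quietly reroute the prompt.
    throw new LocalOnlyError("Local model unavailable; refusing cloud fallback.");
  }
  // A preference: the prompt silently goes to the cloud provider instead.
  return cloudFallback;
}
```

With "prefer-local", an unavailable local model means the prompt goes to
the fallback provider. With "local-only", the same situation is an error
the user gets to see.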

You're not getting a guarantee. You're getting a setting.

For most developers that's fine. For security researchers,
for people working on proprietary systems, for anyone
building in regulated industries — it's not fine at all.

What a real guarantee looks like

I wanted something you could point to in the code and say:
here is where the guarantee is enforced.

Not in the UI. Not in a settings file. In the request pipeline.

Here's what I built in /api/chat/route.ts:

// Providers that would send the prompt off the machine.
const EXTERNAL_PROVIDERS = [
  "openai", "anthropic", "gemini",
  "groq", "deepseek", "together", "openrouter"
];

// The Kill Switch state arrives as a request header.
const isAirGapped =
  req.headers.get("x-air-gap-mode") === "true";

// Reject before any provider is initialized or any API key is read.
if (isAirGapped && EXTERNAL_PROVIDERS.includes(provider)) {
  return NextResponse.json(
    {
      error: "AIR-GAP VIOLATION: External provider blocked.",
      provider,
      timestamp: new Date().toISOString()
    },
    { status: 403 }
  );
}

When the Kill Switch is on, this fires before any
provider initialization, before any API key lookup,
before any streaming starts.

The 403 is not a suggestion. It's a wall.

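And because it's server-side, the UI can't skip it. A request
from the UI, from a direct fetch(), or from a script hitting
the API all run through the same check. One caveat: the flag
itself arrives in the x-air-gap-mode header, so a caller that
omits the header is not air-gapped. Reading the mode from
server-side configuration closes that gap. With the flag set,
there is no path through the route handler that reaches a
listed external provider.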
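
A stricter variant of the check could look like the sketch below. It is
not the code in the repo. It assumes a hypothetical server-side
environment variable named AIR_GAP_MODE, and it inverts the provider list
into an allowlist (LOCAL_PROVIDERS), so a provider the list has never
heard of is blocked by default. The helper names are illustrative.

```typescript
// Assumptions: AIR_GAP_MODE is a hypothetical server-side setting, and
// LOCAL_PROVIDERS, isAirGapEnabled, and checkAirGap are illustrative names.
const LOCAL_PROVIDERS: readonly string[] = ["ollama"];

interface AirGapDecision {
  allowed: boolean;
  status: 200 | 403;
  error?: string;
}

// Read the mode from server-side configuration, not from the request.
function isAirGapEnabled(env: Record<string, string | undefined>): boolean {
  return env["AIR_GAP_MODE"] === "true";
}

// Allowlist check: anything that is not a known local provider is refused.
function checkAirGap(provider: string, airGapped: boolean): AirGapDecision {
  const name = provider.trim().toLowerCase();
  if (airGapped && !LOCAL_PROVIDERS.includes(name)) {
    return {
      allowed: false,
      status: 403,
      error: `AIR-GAP VIOLATION: External provider blocked (${name}).`
    };
  }
  return { allowed: true, status: 200 };
}
```

In a route handler this would be called as
checkAirGap(provider, isAirGapEnabled(process.env)) before any provider
setup, returning the 403 when allowed is false.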

The only thing that passes through is ollama, which
talks exclusively to localhost:11434.
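
"Talks exclusively to localhost" is itself something a check can enforce.
The sketch below is a hypothetical helper (isLoopbackUrl is not part of
Rogue Studio or Ollama) that accepts a base URL only when its host is a
loopback name. It deliberately recognizes just localhost, 127.0.0.1, and
::1, which is a simplification.

```typescript
// Hypothetical helper: returns true only for URLs whose host is loopback.
function isLoopbackUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // Unparseable URLs are rejected.
  }
  // Compare the parsed hostname, never a substring of the raw string.
  const host = url.hostname;
  return host === "localhost" || host === "127.0.0.1" || host === "[::1]";
}
```

Parsing first matters: a string like http://localhost:11434@evil.com/
contains "localhost" but its host is evil.com, so a substring test would
let it through.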

With Air-Gap on, no request on this route reaches an external
provider. Not as a marketing claim. As a diff you can read.

The UI had to match the architecture

I spent a surprising amount of time on the Kill Switch UI.

The technical guarantee meant nothing if users didn't
feel the weight of it. A small toggle in a settings
menu would undermine the architectural statement the
feature was making.

So I made it physical-looking. Big. Impossible to miss.

When it's OFF: the interface shows all available providers.
Everything is normal.

When it's ON: every external provider grays out immediately.
A banner appears — AIR-GAPPED: LOCAL ONLY. The provider
selector locks to Ollama. The Kill Switch itself turns red.

The visual design is doing real work here. It's communicating
that this is not a preference. It's a mode change with
real consequences.
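
One way to keep the UI honest is to derive everything it shows from the
single Kill Switch value. The sketch below assumes hypothetical names
(KillSwitchView, deriveView); the banner text and the lock-to-Ollama
behavior come from the description above.

```typescript
// Hypothetical view model: every visual state follows from one boolean.
interface ProviderOption {
  id: string;
  disabled: boolean;
}

interface KillSwitchView {
  banner: string | null;
  selected: string;
  locked: boolean;
  options: ProviderOption[];
}

function deriveView(
  airGapped: boolean,
  providers: string[],
  current: string
): KillSwitchView {
  if (!airGapped) {
    // OFF: all providers are selectable and nothing is locked.
    return {
      banner: null,
      selected: current,
      locked: false,
      options: providers.map((id) => ({ id, disabled: false }))
    };
  }
  // ON: gray out everything except Ollama and lock the selector to it.
  return {
    banner: "AIR-GAPPED: LOCAL ONLY",
    selected: "ollama",
    locked: true,
    options: providers.map((id) => ({ id, disabled: id !== "ollama" }))
  };
}
```

Because the view is a pure function of the switch, the banner, the
grayed-out options, and the locked selector cannot drift out of sync with
each other.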

What this taught me about trust as architecture

Building this forced me to think about trust as a
first-class architectural concern — not a policy
you write, but a constraint you build.

Most software trust models are based on:

Configuration (you set a flag, the software honors it)
Audit (you review logs after the fact)
Policy (legal agreements about data handling)

None of these are structural. They can all be violated
— through bugs, through misconfigurations, through
deliberate decisions made by people you've never met.

Structural trust is different. It means the code path
that would violate the constraint doesn't exist, so
nothing depends on a setting being honored.

The Air-Gap check is meant to be structural. One caveat:
as written it's a denylist, so a new provider is only
blocked once it's added to EXTERNAL_PROVIDERS. Invert it
into an allowlist of local providers and the stronger
claim holds: you could add ten new AI providers tomorrow
and none of them would be reachable when Air-Gap is active.

This is why open source matters for this class of tooling
specifically. The guarantee is only as strong as your
ability to verify it. With Rogue Studio, you can read
the check in three minutes and know exactly what
it does and doesn't block.

The other side of the coin: the Red Team

The Air-Gap mode is about protecting your code from
leaving your machine.

The Red Team swarm is about protecting your code from
being wrong.

I built an adversarial agent loop where two AI agents
run against each other:

Blue Team writes the code.
Red Team immediately tries to break it — hunting for
XSS, SQL injection, buffer overflows, reentrancy
vulnerabilities, SSRF, path traversal.

If Red Team finds something, the exploit details go
back to Blue Team for patching. The loop ends when
Red Team finds nothing, or after 3 iterations.
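
The control flow of that loop is small enough to sketch. This is not the
Rogue Studio implementation: the types and the adversarialLoop name are
hypothetical, and the agents are plain synchronous functions so the
example stays runnable. Real model calls would be async.

```typescript
// Hypothetical types; the agents are stand-ins for model calls.
interface Finding {
  kind: string;
  detail: string;
}

type BlueAgent = (task: string, findings: Finding[]) => string;
type RedAgent = (code: string) => Finding[];

interface LoopResult {
  code: string;
  clean: boolean;
  iterations: number;
}

function adversarialLoop(
  task: string,
  blue: BlueAgent,
  red: RedAgent,
  maxIterations = 3
): LoopResult {
  let findings: Finding[] = [];
  let code = "";
  for (let i = 1; i <= maxIterations; i++) {
    // Blue writes (or patches) code, seeing the previous round's findings.
    code = blue(task, findings);
    // Red attacks the result and reports anything it can exploit.
    findings = red(code);
    if (findings.length === 0) {
      return { code, clean: true, iterations: i };
    }
  }
  // Iteration budget spent with findings still open: report it, don't hide it.
  return { code, clean: false, iterations: maxIterations };
}
```

The clean flag matters: if the budget runs out with findings still open,
the caller should know the code was not cleared rather than assume it
passed.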

The insight that drove this: the same model that
writes a vulnerability tends to miss it in review,
because it brings the same blind spots to both jobs.
The fix is a different agent with an opposing
objective: not a helpful one, but a destructive one.

Combined with Air-Gap mode, you get something
interesting: an AI that aggressively audits your
code for vulnerabilities, running entirely on your
machine, with a guarantee enforced in code that
nothing is sent to an external provider.

What I'm building toward

Rogue Studio is my attempt to answer a question
I couldn't find a good answer to anywhere else:

What does AI tooling look like when you build
trust into the architecture instead of the policy?

The Air-Gap mode is one answer. The adversarial
swarm is another. The Reverse Engineer mode — which
swaps in a decompiler prompt for malware analysis
without safety refusals — is a third.

All of it is open source. All of it is auditable.
All of it runs locally.

If you've been waiting for AI developer tooling
that actually trusts you with your own tools —
this is what I've been building.

→ github.com/malgatyuvraj/Rogue-Studio

The repo is MIT licensed. Good first issues are
labeled. PRs are reviewed fast.

If any of this resonates — especially the
architecture-as-trust framing — I'd love to
hear your thoughts in the comments.

What other constraints should be structural in
developer tooling? I've been thinking about this
a lot and I'm curious what others in the security
and privacy space think.

rougestudio.vercel.app