Anthropic just dropped Project Glasswing — a big collaborative cybersecurity initiative with a shiny new model called Claude Mythos Preview that can find zero-day vulnerabilities at scale. Twelve major tech companies involved. $100M in credits. Found a 27-year-old flaw in OpenBSD. Impressive stuff.
But let's be real about what's happening here. Anthropic trained a model so capable at breaking into systems that they decided it was too dangerous to release publicly. So they wrapped the release in a collaborative security initiative. The security work is genuinely valuable. But it's also a smart way to keep control of something they know is too powerful to let loose.
The part that actually matters, though, is who benefits. Glasswing is for the big players. The companies with security teams, budgets, and the kind of infrastructure that gets invited to sit at the table with AWS, Microsoft, and Palo Alto Networks. What about the rest of us? The startups, the small SaaS shops, the indie developers running production systems on a shoestring?
The internet is a dark forest. That's not a metaphor anymore — it's becoming the literal reality. Bots, scrapers, automated exploit chains, credential stuffing, AI-generated phishing. A server goes up and within hours it's being scanned, fingerprinted, and probed by systems that don't sleep. Visibility equals vulnerability. And AI is making the attackers faster, cheaper, and more autonomous every month.
The ISC2 put it plainly — both offence and defence now operate at speeds beyond human intervention. The threats aren't people sitting at keyboards anymore. They're autonomous systems running campaigns end-to-end.
So what do we do about it?
Offensive security — but not the kind you're thinking
When I say offensive security, I don't mean red-teaming or penetration testing. I mean giving your systems the ability to fight back.
Picture an LLM that sits across your centralised logs — network traffic, database queries, user interactions, access patterns — and builds an understanding of what normal looks like for your system over weeks and months. Not just pattern matching against known signatures. Actually understanding the shape of healthy behaviour.
When something breaks the pattern, it doesn't just alert. It acts.
Disable a compromised account. Kill a service that's behaving strangely. Block a database connection that shouldn't exist. Create an incident with full context for a human to review. The response is proportional and immediate — not waiting for someone to check their phone at 3am.
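The "understand what normal looks like, then flag what breaks the pattern" step can be sketched in a few lines. This is a deliberately minimal illustration, not the LLM-driven version described above: the window size, warm-up count, and z-score threshold are all arbitrary assumptions.

```python
from collections import deque
from statistics import mean, stdev

class BaselineMonitor:
    """Rolling baseline over an event rate (e.g. logins per bucket);
    flags samples that fall far outside learned normal behaviour."""

    def __init__(self, window: int = 288, threshold: float = 4.0):
        # e.g. 288 five-minute buckets = a 24-hour rolling window
        self.history = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value: float) -> bool:
        """Record a new sample; return True if it breaks the pattern."""
        anomalous = False
        if len(self.history) >= 30:  # need enough samples before judging
            mu, sigma = mean(self.history), stdev(self.history)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                anomalous = True
        self.history.append(value)
        return anomalous
```

A real deployment would feed richer features than a single rate, but the shape is the same: learn the baseline continuously, and only act on deviation from *your* normal.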
The architecture is pretty straightforward:
```mermaid
graph TD
    A[Application Logs] --> D[Secure Isolated Log Store]
    B[Network Traffic] --> D
    C[Database Queries] --> D
    E[User Activity] --> D
    D --> F[Baseline Health Model]
    F -->|Anomaly Detected| G[LLM Analysis]
    G -->|Analyse & Plan| H{Threat Assessment}
    H -->|Low| I[Alert & Log]
    H -->|Medium| J[Restrict & Escalate]
    H -->|High| K[Disable & Isolate]
    I --> L[Human Review]
    J --> L
    K --> L
```
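The branch at the bottom of that diagram is a severity-to-action mapping where every path ends in human review. A sketch of that routing, with hypothetical action names:

```python
from enum import Enum

class Severity(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

def respond(severity: Severity, context: dict) -> list[str]:
    """Map an assessed severity to proportional actions.
    Every path appends a human-review incident, so a person
    always sees what the system did and why."""
    source = context["source"]
    actions = []
    if severity is Severity.LOW:
        actions.append(f"alert:{source}")            # notify, nothing more
    elif severity is Severity.MEDIUM:
        actions.append(f"restrict:{source}")         # tighten, don't kill
        actions.append("escalate:on-call")
    else:  # Severity.HIGH
        actions.append(f"disable:{source}")          # pull the emergency lever
        actions.append(f"isolate:{source}")
    actions.append("incident:human-review")
    return actions
```

The point of keeping this layer dumb and explicit is that the LLM only chooses the severity; the set of levers it can pull is fixed in advance.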
The key is that the logging and analysis layer has to be isolated and secured separately from the systems it's watching. If an attacker can compromise the thing that's watching them, the whole model falls apart.
In practice that means separate infrastructure with its own auth boundary. Ingestion is write-only — your application services push logs in but can never read or modify what's already there. Append-only, immutable. The analysis layer gets scoped service accounts that can read logs, fire alerts, and pull specific emergency levers through a narrow API. Nothing else. If a compromised service tries to reach the log store directly, it hits a wall.
None of this is exotic. Centralised logging, immutable storage, scoped IAM — the building blocks exist. The hard part is wiring an LLM into that loop with the right constraints. Enough access to act, not enough to make things worse.
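One way to picture those boundaries is as two narrow handles over the same store: services get append-only access, the analysis layer gets read access plus only the levers it was explicitly granted. A toy in-process model (real deployments would enforce this with separate infrastructure and IAM, not Python objects):

```python
class LogStore:
    """Append-only store; access only via narrow role-scoped handles."""

    def __init__(self):
        self._records: list[str] = []

    def writer(self):
        """Handle for application services: can append, nothing else."""
        def append(record: str) -> None:
            self._records.append(record)
        return append

    def analyzer(self, allowed_actions: set[str]):
        """Handle for the analysis layer: read logs, and pull only
        the emergency levers it was explicitly granted."""
        def read() -> tuple[str, ...]:
            return tuple(self._records)  # immutable snapshot
        def act(action: str, target: str) -> str:
            if action not in allowed_actions:
                raise PermissionError(f"action {action!r} not granted")
            return f"{action}:{target}"
        return read, act
```

A compromised service holding only the writer handle can pollute future logs, but it can't read, rewrite, or erase what's already there, and it can't touch the response levers at all.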
Where biology gets interesting
I've been doing research with my project C302 — using a simulation of the *C. elegans* roundworm's neural network as a behavioural controller for LLM agents. The worm has 302 neurons. That's it. And with those 302 neurons it navigates its environment, finds food, avoids threats, and adapts its behaviour based on what's working.
In that research, we mapped simple feedback signals to biological synapses and let the neural simulation drive agent behaviour. The live connectome — receiving real-time feedback from the agent's environment — showed a clear improvement over one following a fixed trajectory (0.960 vs 0.867 test pass rate), even when the topology, signals, and rules were identical. The only variable was whether the system adapted to what was actually happening. Early days with a small sample size, but the direction is promising.
Now apply that thinking to security monitoring.
Imagine mapping a sudden spike in unusual user activity to the equivalent of a "salt" sensory neuron in the worm's circuit. That fires, and the downstream effect is the security system becomes more aggressive in its investigation — widening its search, correlating more signals, lowering its threshold for action. A pattern of repeated failed authentications from new IPs could map to a "touch" response — the system recoils, tightening access controls automatically.
This isn't rule-based. It's adaptive. The system develops a behavioural pattern that's learned from running in your specific environment, responding to your specific traffic patterns. That's a fundamentally different thing from a static set of if-then rules.
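To make the neuron analogy concrete, here's a toy leaky-integrator version of that "recoil" response. The decay rate and firing level are illustrative, and this is far simpler than the C302 connectome simulation, but it shows the key property: repeated stimuli accumulate, quiet periods decay back to rest, and activation directly tunes how aggressive the monitor is.

```python
class SensoryNeuron:
    """Leaky integrator: repeated signals raise activation, which
    decays back toward rest when things go quiet."""

    def __init__(self, decay: float = 0.9, fire_level: float = 1.0):
        self.activation = 0.0
        self.decay = decay
        self.fire_level = fire_level

    def step(self, stimulus: float) -> bool:
        """Integrate one tick of input; return True when the neuron fires."""
        self.activation = self.activation * self.decay + stimulus
        return self.activation >= self.fire_level

def anomaly_threshold(neuron: SensoryNeuron, base: float = 4.0) -> float:
    """Higher activation lowers the anomaly threshold: the system
    becomes more suspicious while it's being stimulated."""
    return base / (1.0 + neuron.activation)
```

A single failed login barely moves the needle; a burst of them pushes the neuron past its firing level, and everything downstream gets checked against a tighter threshold until the pressure subsides.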
This has to be open
Glasswing is cool. Open-source frameworks like CAI are making progress — but mostly on the offensive side, using LLMs for penetration testing and vulnerability research. On the defensive side, the tooling barely exists. There's no open-source equivalent for the kind of adaptive monitoring and response I'm describing here.
The building blocks are already here. Centralised logging is a solved problem. Open standards for security event formats are maturing. Smaller open models are more than capable of pattern analysis on local infrastructure. What's missing is the glue — a framework that takes logs in, builds a baseline, detects anomalies, and can actually respond. Something a small team can deploy without a six-figure security budget.
The threats don't discriminate by company size. The defences shouldn't either. This can't be proprietary or locked behind enterprise contracts.
The dark forest doesn't care how big your company is. The bots scanning your infrastructure don't check your headcount before they attack. If the threats are going to be this accessible, the defences need to be too.
I'm taking the C302 work in this direction next. An open-source security agent — biologically inspired, adapts to your environment, acts when something breaks the pattern. Small enough for a startup to run. The pieces are all there. I just need to wire them together.
This post was originally published on jonno.nz. I write hands-on reviews of open-source AI tools and deep dives on engineering topics — check out more at jonno.nz.