DEV Community: Liam Romanis

Microsoft's Costly Mistakes, and What Europe Is Doing About Them

Liam Romanis — Wed, 03 Jun 2026 11:43:31 +0000

There's a particular kind of corporate confusion that happens when a company gets so fixated on where it wants to go that it stops paying attention to what it's destroying on the way there. Microsoft, a company that has navigated more technological transitions than almost any other in the industry, is displaying that confusion at scale right now, and the consequences are starting to land in places that will be very difficult to walk back.

The Recall Disaster That Refuses to End

Start with Windows Recall, because it tells you almost everything you need to know about Microsoft's current decision-making culture.

First announced in 2024, Recall was positioned as a flagship AI feature for Windows 11, a tool that would take a screenshot of your desktop every few seconds and build a searchable, AI-indexed timeline of everything you'd ever done on your computer. The privacy implications were immediately obvious to virtually everyone except, apparently, the people who approved it for launch. The original version stored all of that screenshot data unencrypted, in plaintext: passwords, financial information, private communications, all of it sitting in a folder, waiting. Security researchers had a field day. Microsoft pulled the feature before it ever shipped.

What followed was a year-long rework. Encryption was added. Biometric authentication was required to access the data. Microsoft declared the problem solved and shipped it again in April 2025. Independent testing in August 2025 found the sensitive information filter still failing to catch credit card numbers, bank balances, Social Security numbers, and passwords. As of early 2026, Microsoft is reportedly pulling back its entire Windows 11 AI push and rethinking Recall from the ground up.

The technology problem here is real but fixable. The cultural problem is harder: how does a company with Microsoft's resources and security expertise ship a feature that stores your passwords in plaintext as a flagship product? The answer seems to be that the pressure to demonstrate AI momentum overwhelmed the engineering discipline that should have caught it. That's a worrying pattern at a company with Windows installed on over a billion devices.

A Gaming Empire, Methodically Dismantled

The Xbox story is a different kind of confusion, but the same underlying dynamic.

Microsoft spent $69 billion acquiring Activision Blizzard, the largest acquisition in gaming history, positioning itself as a dominant force in interactive entertainment. The argument was scale: own the content, own the platform, own the subscription. Then, having made that argument convincingly enough to clear regulators on both sides of the Atlantic, Microsoft began quietly dismantling the creative studios that were supposed to make it meaningful. The Initiative, the studio behind the rebooted Perfect Dark, was shuttered. Everwild, years in development at Rare, was cancelled. The pattern was consistent enough to suggest a strategy, not a series of isolated decisions.

The gaming division is being, in the words of one analyst, "harvested", its resources redirected toward AI infrastructure. That may make sense on a spreadsheet. It makes considerably less sense as stewardship of a $69 billion creative asset, or as a proposition to the developers and studios who were recruited under a very different set of promises.

Fifteen Thousand Jobs, Record Revenue

In May and July of 2025, Microsoft cut roughly 15,000 positions across engineering, marketing, and HR. The explicit rationale from CEO Satya Nadella was that AI tools, including Copilot, were now writing around 30% of the company's code, reducing the need for human engineers. This is a reasonable thing to say if you're trying to justify to Wall Street why your headcount is falling.

It is a harder thing to say when, in the same quarter, you report $70 billion in revenue and continued strong cloud growth. These were not austerity cuts. They were, by Microsoft's own framing, a reallocation: human capital out, AI infrastructure in. The message to the remaining workforce, and to anyone considering joining, is clear enough. The longer-term effect on institutional knowledge, on culture, and on the decade of goodwill that Nadella built when he arrived and humanised a company that had grown rigid and territorial, is harder to quantify but unlikely to be positive.

Skype: An $8.5 Billion Lesson in Benign Neglect

Skype shut down in May 2025, finally and definitively, with its users redirected to the free version of Microsoft Teams. This was not a surprise. Skype had been functionally abandoned for years, its daily active users collapsing from 300 million at peak to around 36 million by the time Microsoft put it out of its misery. The surprise, looked at from any distance, is how thoroughly Microsoft managed to destroy something it paid $8.5 billion for in 2011.

Skype didn't lose to Zoom on technology. It lost because Microsoft redesigned it in ways users actively hated, starved it of investment, and failed to respond to competitors for long enough that the market simply moved on. It's an instructive case study in how incumbent advantage erodes: not through a single catastrophic failure, but through years of quiet disinterest.

GitHub's Silent Bans and the Problem of Invisible Power

Microsoft's ownership of GitHub rarely surfaces in everyday conversation, but it matters. GitHub is not simply a code-hosting service. For most of the world's software developers, it is the infrastructure of their professional lives: their portfolio, their collaborative workspace, their contribution history. Over 100 million developers depend on it. When Microsoft makes decisions about how GitHub operates, those decisions land with the force of a utility provider, not a website.

Which makes the events of late 2025 worth examining carefully. In October of that year, GitHub quietly updated its acceptable use policies to prohibit content that is sexually themed or suggestive with no clear creative or educational purpose. The policy change itself was not unreasonable on its face. What followed, however, was a wave of silent, unexplained account suspensions that affected an estimated 80 to 90 repositories and 40 to 50 developers, the majority of them members of modding and plugin communities for adult games. In most cases, GitHub did not inform users which specific terms they had violated. Accounts simply returned a 404 error. Years of collaborative work, across dozens of contributors in some cases, vanished without notice or meaningful right of appeal.

The pattern is not new. Developers in regions subject to US sanctions have reported similar experiences for several years, finding their accounts restricted or suspended with no warning and facing an appeals process that, in documented cases, has gone unanswered for months. One developer cited being stuck since March 2025 with no response after multiple appeals, despite providing evidence they were not located in a sanctioned area.

The consistent thread across these cases is not that GitHub enforces its terms of service, which it is entitled to do, but that it does so with a near-total absence of transparency, explanation, or recourse. For a platform that positions itself as the home for all developers, that gap between aspiration and practice is significant.

There is also a broader structural concern. When a single company controls the dominant repository platform for the world's open source development, content moderation decisions taken in Redmond or San Francisco have global consequences. Developers whose livelihoods or projects are disrupted have nowhere equivalent to go, which is precisely why the manner in which that power is exercised deserves far more scrutiny than it currently receives. Driving communities underground, onto fragmented or less visible platforms, does not make the content disappear. It simply removes the visibility and accountability that a mainstream platform, however imperfectly, provides.

Europe's Response May Be Exactly What the Market Needs

Here is where Microsoft's internal difficulties connect to something larger, and arguably something healthier.

The European Union is, right now, in the middle of a significant and accelerating effort to reduce its dependence on American technology infrastructure. In October 2025, the European Commission launched a sovereign cloud procurement framework, and in April 2026 it awarded a €180 million contract for cloud services to four European providers, explicitly bypassing Amazon, Google, and Microsoft, which together control 63% of the global cloud market. A decisive vote on formalising those procurement rules is scheduled for June 2026. Denmark has begun transitioning government ministries from Microsoft Office 365 to LibreOffice. Several other member states have followed or are actively considering similar moves to Linux-based alternatives. The European Central Bank selected a European provider for its digital euro infrastructure. The Dutch parliament passed eight motions urging reduced dependence on US technology. The mood across the continent is one of deliberate, structural disengagement.

The political catalyst is partly Donald Trump, or more specifically the concern among European governments that American technology companies could, under sufficient geopolitical pressure, restrict or withdraw services to European users. An ICC judge losing access to his Visa card after US sanctions is a small thing; the same logic applied to cloud infrastructure is not. European officials point to that episode regularly when explaining why this shift feels urgent rather than merely desirable.

But whatever the political trigger, the competitive consequences are worth taking seriously in their own right. Markets work best when dominance is contested. For the better part of two decades, Microsoft, Google, and Amazon have divided the enterprise technology landscape between them with relatively little pressure from credible alternatives. The result, as the decisions catalogued above suggest, is a culture in which a company can ship a feature storing your passwords in plaintext, cut fifteen thousand jobs during a record revenue quarter, and abandon an $8.5 billion acquisition through sheer neglect, with no meaningful market consequence.

European governments moving to Linux desktops and sovereign cloud infrastructure changes that calculation. It signals to the market that the incumbents are not, in fact, irreplaceable. It creates space for alternative ecosystems to mature and for genuine competition to develop. It may also, in time, force Microsoft and its peers to remember that trust is a product feature, not a marketing position.

None of this will be painless. European cloud alternatives are improving but are not yet equivalent at enterprise scale, and governments switching to Linux face real transition costs and capability gaps. The road from dependency to sovereignty is long and expensive. But the destination, a technology market where no single vendor can afford to take its users for granted, is a genuinely better one. If Microsoft's recent run of poor decisions has helped accelerate that journey, it may turn out to be the most consequential contribution the company makes in the 2020s, though not in the way Satya Nadella intended.

CIFSwitch - CVE-2026-46243

Liam Romanis — Wed, 03 Jun 2026 00:33:11 +0000

Just released an open-source bash checker for CIFSwitch (CVE-2026-46243) — the 19-year-old Linux kernel LPE disclosed last week that lets any unprivileged local user get root by abusing the CIFS/SPNEGO upcall path.

The script runs on bare-metal, VMs, and inside containers, and is CI/CD-friendly with JSON output and clean exit codes.

It checks:
✅ Kernel version against patched thresholds (6.18.22 / 6.19.12 / 7.0+)
✅ cifs-utils presence and exploitable version
✅ CIFS kernel module load state and blacklist status
✅ Unprivileged user namespace sysctl (the pivot point for the exploit)
✅ Active request-key cifs.spnego rules
✅ SELinux / AppArmor enforcement
✅ Container capabilities (CAP_SYS_ADMIN)
✅ Kernel symbol verification for the fix commit

Outputs human-readable or JSON for SIEM ingestion. Exit 0 = safe, exit 1 = action needed — drop it straight into a pipeline.
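As an illustration of the module load state and blacklist check in the list above, here is a minimal sketch. It is not taken from cifswitch-check: the function name `module_state` and the overridable paths are assumptions made for the example, and it only recognises the two most common modprobe.d patterns.

```bash
# Sketch: classify a kernel module as loaded, blacklisted, or auto-loadable.
# module_state and its optional path arguments are hypothetical names,
# not part of cifswitch-check.
module_state() {
  local mod="$1"
  local proc_modules="${2:-/proc/modules}"
  local modprobe_dir="${3:-/etc/modprobe.d}"

  # /proc/modules lists one loaded module per line, name in the first column.
  if [ -r "$proc_modules" ] && grep -q "^${mod} " "$proc_modules"; then
    echo "loaded"
    return 0
  fi

  # "install <mod> /bin/false" makes modprobe refuse to load the module.
  # A plain "blacklist <mod>" only ignores the module's aliases; an explicit
  # modprobe by name still works, so it is the weaker of the two.
  if grep -Eqs "^[[:space:]]*(blacklist[[:space:]]+${mod}|install[[:space:]]+${mod}[[:space:]]+/bin/(false|true))[[:space:]]*$" \
       "$modprobe_dir"/*.conf; then
    echo "blacklisted"
    return 0
  fi

  # Neither loaded nor blocked: the kernel can still pull it in on demand.
  echo "autoloadable"
}
```

Calling `module_state cifs` on a host prints one of the three states. The two path arguments exist only so the function can be pointed at fixture files when testing.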

CIFSwitch is the fourth Linux LPE in under six weeks (after Copy Fail, Dirty Frag, and Fragnesia). If you're running multi-tenant Linux, CI runners, or container build farms, now is a good time to audit.

liamromanis101 / cifswitch-check

Detection script for CIFSwitch - CVE-2026-46243

cifswitch-check

A shell script to check whether a Linux system is exposed to CIFSwitch (CVE-2026-46243) — a local privilege escalation vulnerability in the Linux kernel's CIFS/SMB client that has been present since 2007.

Runs on bare-metal hosts, VMs, and inside containers. Designed to drop straight into CI/CD pipelines.

Background

CIFSwitch was disclosed on 28 May 2026 by security researcher Asim Manizada. The flaw chains a missing input validation in the kernel's cifs.spnego key type with the rootful cifs.upcall helper from cifs-utils.

An unprivileged local user can call request_key() with a forged key description, causing the kernel to invoke cifs.upcall as root with attacker-controlled fields. With upcall_target=app, the helper enters the attacker's mount namespace and performs a getpwuid() lookup before dropping privileges — loading an attacker-controlled NSS module and executing arbitrary code as root.

Prerequisites for exploitation:

A vulnerable kernel (present since 2007, fixed in 6.18.22 / 6.19.12…


I have also updated the cve_checks.conf in my K8s-container_escape_audit toolkit to detect this issue.

liamromanis101 / K8s-container_escape_audit

Look for possible escape vectors from a container

K8s_container-escape-audit

A bash script that runs inside a Docker or Kubernetes container and checks for escape vectors. Built for penetration testers and security teams doing container security assessments.

For authorised security assessments only. Do not run this on systems you don't have explicit written permission to test.

What it does

container_escape_audit.sh v4.0 performs 47 checks plus a config-driven CVE engine, covering: privileged configuration, dangerous capabilities, namespace isolation, filesystem mounts, kernel exposure, Kubernetes misconfigurations, cloud metadata access, kernel hardening posture, and an updateable database of recent kernel CVEs. All checks are strictly read-only — the script makes no changes to the system.

Each finding comes with a structured report entry:

What it is: the misconfiguration or exposure
Impact: worst-case if exploited
Exploitability: difficulty, tooling, real-world precedent
Recommendation: specific remediation steps

The tool ships as two files that must sit in the same directory:

container_escape_audit.sh   # main script

…


I built a container escape audit tool — here's what v4.0 adds

Liam Romanis — Mon, 01 Jun 2026 13:24:47 +0000


Container security tooling tends to fall into two camps: heavyweight scanners that run outside the container before deployment, and ad-hoc one-liners you paste into a shell when something looks wrong. container_escape_audit.sh sits in neither — it runs inside a live container, checks the actual runtime environment, and tells you exactly what an attacker who just landed in that container would be looking at.

Version 4.0 adds 12 kernel hardening checks, a config-driven CVE engine, and a database of 10 current kernel CVEs — including three that are actively exploited in the wild right now. This post walks through what's new and why the config-driven approach matters.

Repo: github.com/liamromanis101/K8s-container_escape_audit

The quick version

# grab both files
curl -sO https://raw.githubusercontent.com/liamromanis101/K8s-container_escape_audit/main/container_escape_audit.sh
curl -sO https://raw.githubusercontent.com/liamromanis101/K8s-container_escape_audit/main/cve_checks.conf
chmod +x container_escape_audit.sh
./container_escape_audit.sh

It runs in 15–45 seconds, writes a structured report, and exits. No installation. No root required (though running as root inside the container gives you more complete results). Nothing is written to the system — every check is read-only.

Output looks like this:

[CRIT]  Container appears PRIVILEGED (CapEff=0000003fffffffff)
[CRIT]  VULNERABLE to Copy Fail (CVE-2026-31431) — AEAD socket bindable, kernel 6.1.112
[WARN]  kptr_restrict=1 — kernel pointers visible to root processes
[WARN]  Unprivileged user namespaces enabled (kernel.unprivileged_userns_clone=1)
[ OK ]  cgroup v2 subtree_control is not writable
[ OK ]  No readable SSH private keys found

==================== SUMMARY ====================
  [CRITICAL] Container is running in privileged mode
  [CRITICAL] Copy Fail (CVE-2026-31431) AF_ALG exposure — CRITICAL [ITW] [CISA-KEV]
  [HIGH    ] kptr_restrict=1: kernel pointers visible to root
  [HIGH    ] Unprivileged user namespace creation is enabled

  CRITICAL: 2  |  HIGH: 5  |  MEDIUM: 3  |  INFO: 4

Every finding in the full report has four fields: what it is, impact, exploitability, and recommendation.

What it checks

The script runs 47 checks across four sections plus the CVE engine.

Checks 1–23 are the classic container escape vectors most people are familiar with — privileged mode, dangerous capabilities, host namespace sharing, dangerous mounts, /proc exposure, Kubernetes service account tokens, writable cron and auth files, runtime sockets, SUID binaries, and so on.

Checks 24–35 cover newer runtime attack surface: NVIDIAScape (CVE-2025-23266), the runc masked-path race trio (CVE-2025-31133/-52565/-52881), eBPF exposure, debugfs, Kubernetes RBAC active probing via SelfSubjectAccessReview, kernel keyring access, OCI hook injection paths, page cache write primitives (splice + pipe2), and procfs namespace FD leakage.

Checks 36–47 are new in v4.0 — kernel hardening posture. More on these below.

The CVE engine reads cve_checks.conf and runs compound checks against each entry. Also new in v4.0.

New: kernel hardening checks

When you're auditing a container, you're looking at the host kernel's sysctl values too, because containers share the host kernel. Those values directly reflect which mitigations are and aren't active. All of them are read from /proc/sys with no writes and no side effects.

The ones that tend to be most impactful in practice:

kernel.kptr_restrict — if this is 0, every user on the box can read kernel symbol addresses from /proc/kallsyms. That's an instant KASLR bypass. Most exploits against kernel vulnerabilities need an address leak as step one; when kptr_restrict=0 you skip that step entirely. It should be 2.

kernel.unprivileged_userns_clone — unprivileged user namespaces are the prerequisite for the majority of container escape CVEs published since 2019. Flipping Pages (CVE-2024-1086), the Packet Socket Race (CVE-2025-38617), Copy Fail, Dirty Frag — all of them either require user namespaces or become significantly easier with them. Setting this to 0 on hosts that don't need rootless containers removes a huge amount of attack surface in one sysctl.

kernel.perf_event_paranoid — at value 0 or -1, unprivileged processes can access kernel-level performance counters. This is the foundation of Spectre-class side-channel attacks and enables cross-container information leakage on shared CPU nodes. It should be at least 2.

fs.protected_symlinks and fs.protected_hardlinks — classic /tmp race conditions. Still come up regularly in privilege escalation chains. Both should be 1.

The full list in v4.0:

| # | Parameter | Recommended |
|---|---|---|
| 36 | `kernel.kptr_restrict` | 2 |
| 37 | `kernel.dmesg_restrict` | 1 |
| 38 | `kernel.randomize_va_space` | 2 |
| 39 | `fs.protected_symlinks` / `fs.protected_hardlinks` | 1 |
| 40 | `fs.protected_fifos` / `fs.protected_regular` | 2 |
| 41 | `net.ipv4.tcp_syncookies` | 1 |
| 42 | ICMP redirects / source routing / rp_filter | 0 / 0 / 1 |
| 43 | IP forwarding | informational |
| 44 | `kernel.unprivileged_userns_clone` | 0 |
| 45 | `kernel.perf_event_paranoid` | ≥ 2 |
| 46 | Dirty Frag modules (esp4, esp6, rxrpc) | not loaded |
| 47 | Dangerous loaded modules audit | 14 modules |
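A read-only check of this kind might look roughly like the sketch below. It is an illustration, not the script's actual code: `check_sysctl` and the `SYSCTL_ROOT` override are assumed names, and the output format only imitates the report lines shown earlier.

```bash
# Sketch: compare sysctls against the recommended values in the table above.
# check_sysctl and the SYSCTL_ROOT override are illustrative assumptions.
SYSCTL_ROOT="${SYSCTL_ROOT:-/proc/sys}"

# check_sysctl <dotted.name> <eq|min> <recommended>
check_sysctl() {
  local name="$1" mode="$2" want="$3" path have
  # kernel.kptr_restrict -> /proc/sys/kernel/kptr_restrict
  path="$SYSCTL_ROOT/$(printf '%s' "$name" | tr . /)"

  if [ ! -r "$path" ]; then
    # Some sysctls are distro-specific and may simply not exist.
    echo "[INFO]  $name not present"
    return 0
  fi
  have=$(cat "$path")

  case "$mode" in
    eq)  [ "$have" -eq "$want" ] 2>/dev/null && { echo "[ OK ]  $name=$have"; return 0; } ;;
    min) [ "$have" -ge "$want" ] 2>/dev/null && { echo "[ OK ]  $name=$have"; return 0; } ;;
  esac
  echo "[WARN]  $name=$have (recommended: $want)"
  return 1
}
```

For example, `check_sysctl kernel.kptr_restrict eq 2` and `check_sysctl kernel.perf_event_paranoid min 2` cover checks 36 and 45. Nothing is written anywhere; the function only reads one file per call.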

New: config-driven CVE checks

Previously, CVE-specific checks were hardcoded functions in the script. Adding a new one meant modifying the script. That's fine for a handful of checks, but it doesn't scale well — and it means the script and the CVE data are tightly coupled when they really shouldn't be.

In v4.0, CVE checks are defined in cve_checks.conf. The script reads the file at runtime and dispatches the right test for each entry. To add a new CVE, you append a block to the config. The script doesn't change.

A config entry looks like this:

cve_id=CVE-2024-1086
name=Flipping Pages
cvss=7.8
severity=CRITICAL
check_type=compound
introduced=3.15
fixed_versions=5.15:5.15.149 6.1:6.1.76 6.6:6.6.15
itw=yes
poc_public=yes
cisa_kev=yes
subsystem=net/netfilter/nf_tables
module_names=nf_tables
mitigation=rmmod nf_tables 2>/dev/null; echo 'install nf_tables /bin/false' > /etc/modprobe.d/nftables.conf
socket_af=none
socket_type=none
socket_proto=none
what=CVE-2024-1086 is a use-after-free in nf_tables...
impact=Full local privilege escalation...
exploit=Public PoC, 99.4% success rate on Debian/Ubuntu/KernelCTF...
rec=Patch to v5.15.149+, v6.1.76+, or v6.6.15+...
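To make the "read the file, dispatch per entry" idea concrete, here is a small sketch of how such a file could be read. It is not the script's real parser: `conf_list_ids`, `conf_get`, and `dispatch_checks` are hypothetical helpers, and the sketch assumes every entry begins with a `cve_id=` line.

```bash
# Sketch: read a cve_checks.conf-style file and dispatch on check_type.
# All three function names are hypothetical; the real engine may differ.

# Print every cve_id in the file, one per line.
conf_list_ids() {
  awk -F= '$1 == "cve_id" { print $2 }' "$1"
}

# conf_get <file> <cve_id> <key>: print the value of <key> inside that entry.
conf_get() {
  awk -F= -v id="$2" -v key="$3" '
    $1 == "cve_id" { current = $2 }          # remember which entry we are in
    current == id && $1 == key {
      sub(/^[^=]*=/, "")                     # keep everything after the first "="
      print
      exit
    }
  ' "$1"
}

# Walk all entries and report which check would run for each one.
dispatch_checks() {
  conf_list_ids "$1" | while IFS= read -r id; do
    type=$(conf_get "$1" "$id" check_type)
    case "$type" in
      kernel_version|module_loaded|socket_family|compound)
        echo "$id -> run $type check" ;;
      *)
        echo "$id -> unknown check_type '$type', skipped" ;;
    esac
  done
}
```

Running `dispatch_checks cve_checks.conf` prints one line per entry. Splitting on the first `=` only matters for fields like `mitigation`, whose values can themselves contain `=` or shell redirections.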

The check_type field drives what the engine actually does:

kernel_version — parses uname -r and compares against the introduced/fixed_versions ranges. Handles all current LTS series.
module_loaded — checks /proc/modules for the listed modules.
socket_family — tries to open a socket with the given AF/type/proto from Python, which tells you whether the attack surface is reachable from within this specific container regardless of kernel patch status.
compound — runs all three and synthesises a combined severity.
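The kernel_version comparison can be sketched as follows. This is a simplified illustration rather than the script's own logic: `version_lt` and `kernel_in_affected_range` are hypothetical names, distro backports are ignored, and the handling of a series that is missing from fixed_versions (affected if older than the newest listed series, assumed fixed if newer) is an assumption made for the example.

```bash
# Sketch: decide whether a kernel release falls inside an affected range.
# Function names and the rule for unlisted series are assumptions.

# True when $1 sorts strictly before $2 in version order (needs sort -V).
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

# kernel_in_affected_range <uname -r output> <introduced> <fixed_versions>
# fixed_versions uses the "series:first-fixed-release" pairs from the config.
kernel_in_affected_range() {
  local release="$1" introduced="$2" fixed_versions="$3"
  local version series pair max_series=""

  version="${release%%-*}"                        # 6.1.112-generic -> 6.1.112
  version="${version%%+*}"
  series=$(printf '%s' "$version" | cut -d. -f1,2)

  version_lt "$version" "$introduced" && return 1 # older than the bug itself

  for pair in $fixed_versions; do
    if [ "${pair%%:*}" = "$series" ]; then
      version_lt "$version" "${pair#*:}"          # affected iff below the fix
      return
    fi
    if [ -z "$max_series" ] || version_lt "$max_series" "${pair%%:*}"; then
      max_series="${pair%%:*}"                    # track the newest listed series
    fi
  done

  [ -z "$max_series" ] && return 0                # no fix recorded at all
  version_lt "$series" "$max_series"              # unlisted series: see assumption
}
```

Using the Flipping Pages entry above, `kernel_in_affected_range "$(uname -r)" 3.15 "5.15:5.15.149 6.1:6.1.76 6.6:6.6.15"` returns success when the running kernel looks affected. A vendor kernel with backported fixes can still report an older version string, so a result from this kind of check is a prompt to verify, not proof.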

The compound severity logic is worth spelling out, because it avoids both failure modes, noisy over-reporting and silent under-reporting:

kernel in affected range + module loaded or socket reachable  →  CRITICAL
kernel in affected range + module not blacklisted             →  HIGH (auto-load risk)
kernel in affected range + module blacklisted                 →  MEDIUM (interim mitigation, patch needed)
kernel not in affected range                                  →  INFO

The HIGH case catches the thing that trips people up: a module that's not currently loaded but also not blacklisted can be auto-loaded by the kernel just by opening the right socket. "Not loaded" is not the same as "not exploitable."
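Written as code, the four rules reduce to a short decision function. The sketch below is only an illustration of that mapping: `compound_severity` and its yes/no arguments are hypothetical, and how the real engine gathers those inputs is not shown.

```bash
# Sketch: map the compound check inputs to a severity label.
# compound_severity <in_range> <module_loaded> <socket_reachable> <blacklisted>
# Each argument is "yes" or "no". The name and interface are assumptions.
compound_severity() {
  local in_range="$1" loaded="$2" reachable="$3" blacklisted="$4"

  if [ "$in_range" != "yes" ]; then
    echo "INFO"        # kernel not in affected range
  elif [ "$loaded" = "yes" ] || [ "$reachable" = "yes" ]; then
    echo "CRITICAL"    # attack surface is live right now
  elif [ "$blacklisted" = "yes" ]; then
    echo "MEDIUM"      # interim mitigation in place, patch still needed
  else
    echo "HIGH"        # not loaded, not blacklisted: auto-load risk
  fi
}
```

The order of the branches matters: the kernel range is checked first so that a patched kernel never escalates, and the blacklist is only consulted once the module is known to be neither loaded nor reachable.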

What's in the CVE database

The shipped cve_checks.conf has ten entries. Three are actively exploited right now.

Copy Fail (CVE-2026-31431, CVSS 7.8, CISA KEV) — a flaw in the algif_aead AF_ALG interface that gives any unprivileged user a 4-byte write into the page cache of any readable executable. A 732-byte Python PoC with no dependencies achieves reliable root. Affects kernels from 4.14. Interim mitigation: rmmod algif_aead && echo 'install algif_aead /bin/false' > /etc/modprobe.d/copyfail.conf.

Dirty Frag (CVE-2026-43284 + CVE-2026-43500, CVSS 8.8/7.8) — same bug class as Copy Fail but through the IPsec ESP and RxRPC subsystems. Provides full attacker-controlled page cache writes at any offset, not just 4 bytes. Two CVEs, typically chained. As of writing, no distro patch exists for the RxRPC path — blacklisting rxrpc is the only mitigation. Note: blacklisting esp4/esp6 kills IPsec — check before you fleet-deploy.

Flipping Pages (CVE-2024-1086, CVSS 7.8, CISA KEV) — use-after-free in nf_tables. 99.4% success rate PoC. Used in RansomHub and Akira ransomware campaigns. Needs unprivileged user namespaces + nf_tables loaded. Affects 3.15–6.6.14.

The remaining entries: Attack of the Vsock (CVE-2025-21756), Chronomaly (CVE-2025-38352), Packet Socket Race (CVE-2025-38617), OverlayFS SetUID Copy (CVE-2023-0386, CISA KEV), DirtyPipe (CVE-2022-0847), and DirtyCOW (CVE-2016-5195).

Updating the database

The whole point of the config-driven design is that you shouldn't need to touch the script when new vulnerabilities drop. When a CVE gets a distro patch, update fixed_versions. When something starts being exploited in the wild (ITW), set itw=yes. To cover a new CVE, append a new entry:

cat >> cve_checks.conf << 'ENTRY'

cve_id=CVE-2025-XXXXX
name=Whatever it's called
cvss=8.1
severity=HIGH
check_type=compound
introduced=6.1
fixed_versions=6.6:6.6.50 6.12:6.12.5
itw=no
poc_public=yes
cisa_kev=no
subsystem=fs/btrfs
module_names=btrfs
mitigation=none
socket_af=none
socket_type=none
socket_proto=none
what=...
impact=...
exploit=...
rec=...
ENTRY

Running in Kubernetes

Drop it into a pod directly:

kubectl cp container_escape_audit.sh mynamespace/mypod:/tmp/audit.sh
kubectl cp cve_checks.conf mynamespace/mypod:/tmp/cve_checks.conf
kubectl exec -n mynamespace mypod -- bash /tmp/audit.sh \
  --cve-conf /tmp/cve_checks.conf \
  --report /tmp/audit_report.txt
kubectl cp mynamespace/mypod:/tmp/audit_report.txt ./

Or as a Job that audits the cluster's default security context:

apiVersion: batch/v1
kind: Job
metadata:
  name: container-escape-audit
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: audit
          image: alpine:latest
          command:
            - sh
            - -c
            - |
              apk add --no-cache bash curl python3 && \
              curl -sO https://raw.githubusercontent.com/liamromanis101/K8s-container_escape_audit/main/container_escape_audit.sh && \
              curl -sO https://raw.githubusercontent.com/liamromanis101/K8s-container_escape_audit/main/cve_checks.conf && \
              chmod +x container_escape_audit.sh && \
              ./container_escape_audit.sh --json

kubectl apply -f audit-job.yaml
kubectl wait --for=condition=complete job/container-escape-audit --timeout=120s
kubectl logs job/container-escape-audit
kubectl delete job container-escape-audit

The Job runs with whatever security context the cluster assigns by default — which is the point. You want to see what a workload running in your cluster can actually reach, not what it could reach if you gave it extra permissions.

CI integration

The JSON output makes it straightforward to gate on CRITICAL findings:

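CRITICAL_COUNT=$(./container_escape_audit.sh --json --no-report \
  | jq '[.findings[] | select(.severity=="CRITICAL")] | length')

# An empty or non-numeric count means the audit or jq failed; fail closed
# instead of letting the gate pass silently.
case "$CRITICAL_COUNT" in
  ''|*[!0-9]*) echo "FAILED: could not parse audit output"; exit 2 ;;
esac

if [ "$CRITICAL_COUNT" -gt 0 ]; then
  echo "FAILED: $CRITICAL_COUNT critical escape vectors detected"
  exit 1
fi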

Requirements and limitations

Requirements: Bash 4.2+, Python 3. Everything else uses /proc, /sys, and standard POSIX tools. curl is optional (used for IMDS and Kubelet API checks).

What it isn't: this is a point-in-time audit, not continuous monitoring. It identifies attack surface — it doesn't exploit anything. A CRITICAL finding means the prerequisites for a known attack are present, not that you've been compromised. For continuous detection, pair it with Falco rules watching for writes to release_agent and core_pattern, AF_ALG socket creation from non-root processes, and LD_PRELOAD pointing to /tmp or /dev/shm.

Licence

CC BY-NC 4.0 — free for non-commercial use with attribution. If you're using this as part of a paid engagement or commercial product, we're happy to discuss sponsorship: github.com/sponsors/liamromanis101.

Feedback, issues, and PRs welcome. If you hit a false positive, a missed check, or a CVE that should be in the database, open an issue on GitHub.

github.com/liamromanis101/K8s-container_escape_audit

GitHub Already Has More Engagement Than LinkedIn. It Should Do Something About It.

Liam Romanis — Mon, 18 May 2026 08:27:57 +0000

I published a CVE detection tool recently. Nothing unusual there.

https://github.com/liamromanis101/CVE-2026-31431-Copy-Fail---Vulnerability-Detection-Script

On LinkedIn, the post got 385 impressions. A handful of likes, two comments, one of which was from a recruiter, and one person clicked the link, which might have been me testing it.

The GitHub repo, in the last 14 days alone: 6,230 views, 3,254 unique visitors, 923 clones from 585 unique cloners. 22 stars, 22 watchers, 2 forks. People actually ran the code.

Nobody clones a repo to be polite.

That is not an edge case. That is the pattern, every time.

LinkedIn Shows You What People Claim. GitHub Shows You What People Do.

LinkedIn is a performance platform. You write a headline, list your credentials, and hope the algorithm rewards you. The engagement is performative by design. Reactions, reposts, congratulations on your work anniversary from people you met once in 2019.

GitHub is a proof-of-work platform. You push code. People clone it, break it, improve it, or ignore it. Nobody pretends to engage. The stars and forks are real signal because there is no social pressure to give them.

This matters enormously if you are trying to build a company or find people worth building one with.

The Problem GitHub Has Not Solved Yet

Here is the thing: GitHub already has everything it needs to become the place where startups are formed.

It has developers at the earliest stage of their best ideas. A commit is often the first tangible artifact of a company that will eventually be worth something. GitHub sees that before any investor does, before any accelerator does, before the founders themselves have put a name to what they are building.

It has genuine engagement data. Not vanity metrics. Actual signals about who is working on what, who is collaborating with whom, and whose code other people trust enough to build on.

It has a culture that LinkedIn cannot replicate. GitHub is not a strictly professional environment. It is not NSFW either. It sits in the space between the two, which means people are actually themselves on it. That is rare, and it is valuable.

What it does not have is a product layer that uses any of this deliberately.

What I Think GitHub Should Build

I wrote a full strategic proposal on this. The short version is: a founder matching and angel investor platform built on top of GitHub's existing engagement data.

Call it GitHub Launchpad.

The idea is not complicated. If you look at what somebody has built, what others have contributed to, who has consistently shown up across meaningful repositories, you have a far more reliable picture of a potential co-founder than any LinkedIn profile gives you. You are not reading a CV. You are reading a commit history.

Add a structured layer for founders to signal that they are looking for collaborators or early investment, connect that to angels who want to back people before the pitch deck exists, and you have something no other platform can replicate. Because the proof-of-work layer is already there. It is just not being used for this.

Why This Is a Disruption Opportunity, Not an Incremental Feature

LinkedIn is not the natural home for early-stage startup formation. It is where companies post jobs once they already have fifty people. The seed-stage problem, finding the right co-founder, getting in front of the right angel before you have traction, has never been solved well.

GitHub is already embedded in the moment before that. It is at the first commit. It sees the proof of capability that no other platform has access to.

The opportunity to own startup formation from a position of genuine structural advantage is available now. That window will not stay open indefinitely as purpose-built competitors continue to develop.

The Full Proposal

I have written this up properly, with the product detail, the revenue model, and the competitive picture.

If this is an idea worth discussing, I would rather discuss it somewhere people are actually building things.

The full proposal is in the repo below. Thoughts welcome, especially from anyone who has tried to solve the co-founder matching problem and found the existing options wanting.

github.com/liamromanis101/github-launchpad

Tags: #startup #github #productivity #discuss

I Built an Agentic Linux Security Tool. It Took Way More Iterations Than I Expected.

Liam Romanis — Sun, 17 May 2026 23:53:51 +0000

This started as a simple experiment: can you point an AI at a Linux system, have it collect forensic data, and get something more useful than a wall of text back?

The answer, it turns out, is yes — but not in the way I originally thought, and not without a lot of iteration to get there.

How It Started

The initial idea was straightforward. Run a bunch of forensic commands — process lists, open sockets, SUID binaries, kernel modules, log anomalies, the usual — pipe the output to Claude, and get a triage report back. Simple agentic loop. Collect, analyse, report.
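As a rough illustration of that collect step, here is a minimal Python sketch. The check names, the exact commands, and the helper names (run_command, collect, build_prompt) are assumptions made for this example, not the tool's actual code, and the call to the model is left out entirely.

```python
import subprocess

# Hypothetical check list: the names and commands are illustrative only.
CHECKS = {
    "processes": ["ps", "auxf"],
    "sockets": ["ss", "-tulpn"],
    "kernel_modules": ["lsmod"],
    "suid_binaries": ["find", "/", "-xdev", "-perm", "-4000", "-type", "f"],
}


def run_command(argv, timeout=60):
    """Run one command without a shell and return its output as text."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        return {"exit_code": proc.returncode, "stdout": proc.stdout, "stderr": proc.stderr}
    except (OSError, subprocess.TimeoutExpired) as exc:
        # A missing binary or a hung command should not abort the whole collection.
        return {"exit_code": None, "stdout": "", "stderr": str(exc)}


def collect(checks, run=run_command):
    """Run every check and key the results by check name."""
    return {name: run(argv) for name, argv in checks.items()}


def build_prompt(results):
    """Join the raw outputs into one labelled text block for analysis."""
    sections = []
    for name, result in results.items():
        sections.append(f"## {name} (exit {result['exit_code']})\n{result['stdout']}")
    return "\n\n".join(sections)


if __name__ == "__main__":
    print(build_prompt(collect(CHECKS)))
```

Passing argument lists rather than shell strings keeps quoting out of the picture, and catching per-command failures means one unavailable binary still leaves a usable report.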

And that bit worked fine. Claude is actually pretty good at reading ps auxf output and spotting things that look wrong. Better than I expected, honestly.

The problem was what happened next. You'd get a list of findings and then... nothing. The same problem every security tool has. Here are some things that look suspicious. Good luck. The AI had done the easy bit and left you to figure out the hard bit on your own.

That's not really agentic. That's just automation with a language model bolted on.

The Interesting Problem

What I actually wanted was an AI that could investigate alongside you. Not just flag things, but help you work through whether a finding is real, what to do about it, and whether the remediation you're considering is going to cause more problems than it solves.

The challenge is that investigation requires running commands on the live system. And if you're going to run commands on a live system based on AI suggestions, you absolutely cannot have those commands run automatically. The AI will get things wrong. The AI will suggest things that sound reasonable but aren't appropriate for your specific setup. The AI will, if you let it, suggest hardening measures that stop your system from booting.

That last one happened. Not in a catastrophic way, but enough to make the point very clearly: you need a human in the loop, and that human needs to actually understand what they're approving.

The Human-in-the-Loop Pattern

What emerged after a lot of iteration is a batch investigation loop that goes like this:

Claude analyses a finding and proposes a set of verification or remediation commands — typically three to six per round. These are displayed to you with a type badge (VERIFY, REMEDIATE, or INSTALL), a plain-English description of what the command does, the rationale for why it's useful, and — critically for anything that could affect system stability — a rollback command so you know how to undo it.
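One way to picture a proposed command is as a small record carrying the badge, the explanation, and the rollback together. The sketch below is an assumption about shape only: the class and field names (ProposedCommand, affects_stability, problems, render) are hypothetical, and the example sysctl values are illustrative.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Kind(Enum):
    VERIFY = "VERIFY"
    REMEDIATE = "REMEDIATE"
    INSTALL = "INSTALL"


@dataclass(frozen=True)
class ProposedCommand:
    kind: Kind
    command: str
    description: str                 # plain-English summary of what the command does
    rationale: str                   # why running it is useful for this finding
    rollback: Optional[str] = None   # how to undo it, if applicable
    affects_stability: bool = False  # hypothetical flag for boot or kernel impact


def problems(cmd: ProposedCommand) -> List[str]:
    """Return reasons a proposal should not be shown as approvable."""
    issues = []
    if not cmd.description.strip():
        issues.append("missing description")
    if not cmd.rationale.strip():
        issues.append("missing rationale")
    if cmd.affects_stability and not cmd.rollback:
        issues.append("stability-affecting command has no rollback")
    return issues


def render(cmd: ProposedCommand) -> str:
    """Format a proposal for review, badge first."""
    lines = [
        f"[{cmd.kind.value}] {cmd.command}",
        f"  What it does: {cmd.description}",
        f"  Why: {cmd.rationale}",
    ]
    if cmd.rollback:
        lines.append(f"  Rollback: {cmd.rollback}")
    return "\n".join(lines)


if __name__ == "__main__":
    proposal = ProposedCommand(
        kind=Kind.REMEDIATE,
        command="sysctl -w kernel.kptr_restrict=2",
        description="Hides kernel pointer addresses from all users.",
        rationale="Makes kernel address leaks harder to exploit.",
        rollback="sysctl -w kernel.kptr_restrict=1",  # assumes the previous value was 1
        affects_stability=True,
    )
    print(render(proposal))
    print(problems(proposal))
```

Treating a missing rollback as a blocking problem, rather than a warning, puts the rule in the data instead of relying on the reviewer to notice the gap.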

You review all of them. You can deselect any you don't want. You can ask Claude questions about any command before approving it — "what does this actually do", "is there a safer alternative", "why is this necessary" — and get a direct answer in context.

Then you click run. The approved commands execute sequentially, the output comes back, and Claude analyses everything together in one consolidated response rather than reacting to each command individually. If it needs more information, it proposes another batch. If it has enough to make a determination, it gives you a verdict and action buttons: confirm as false positive, keep as active finding, mark resolved.
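The control flow of that loop can be sketched in a few lines of Python. Everything here is a simplified assumption: the names (Analysis, investigate, max_rounds) are hypothetical, and the approval, execution, and analysis steps are passed in as plain functions so the sketch stays independent of any particular UI or model API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Round = List[Tuple[str, str]]  # (command, output) pairs from one batch


@dataclass
class Analysis:
    suggested_verdict: Optional[str] = None  # None means more evidence is needed
    next_batch: List[str] = field(default_factory=list)


def investigate(
    first_batch: List[str],
    approve: Callable[[str], bool],
    execute: Callable[[str], str],
    analyse: Callable[[List[Round]], Analysis],
    max_rounds: int = 5,
) -> Tuple[Optional[str], List[Round]]:
    """Run approved commands in batches until the analysis suggests a verdict."""
    batch = first_batch
    transcript: List[Round] = []
    for _ in range(max_rounds):
        # Only commands the human approved are ever executed.
        approved = [command for command in batch if approve(command)]
        # Commands run one after another, in the order they were proposed.
        outputs = [(command, execute(command)) for command in approved]
        transcript.append(outputs)
        # One consolidated analysis per batch, not one per command.
        analysis = analyse(transcript)
        if analysis.suggested_verdict is not None:
            return analysis.suggested_verdict, transcript
        batch = analysis.next_batch
    return None, transcript


if __name__ == "__main__":
    def demo_analyse(rounds: List[Round]) -> Analysis:
        if len(rounds) == 1:
            return Analysis(next_batch=["cmd4"])
        return Analysis(suggested_verdict="false_positive")

    verdict, rounds = investigate(
        ["cmd1", "cmd2", "cmd3"],
        approve=lambda command: command != "cmd2",  # reviewer deselects cmd2
        execute=lambda command: f"output of {command}",
        analyse=demo_analyse,
    )
    print(verdict, len(rounds))
```

The returned verdict is only a suggestion in this sketch. The final choice between false positive, active finding, and resolved still belongs to the person reviewing it, and the round cap is there so a confused analysis cannot loop forever.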

It took a lot of iterations to get this to feel natural. The early versions had Claude proposing one command at a time, which created an exhausting back-and-forth. Batch proposals with a single analysis pass work much better. Investigation threads also tend to grow unwieldy, so completed rounds collapse into summaries.

The False Positive Problem

Something that became obvious quickly: AI-generated findings are going to overlap with things you already know about and have decided to accept. Your custom SSH port. Your pentest tooling. The forensic agent's own token file sitting in /tmp.

The tool has a false positive management system that goes beyond a simple whitelist. It uses fuzzy matching — a combination of token-based Jaccard similarity and longest-common-subsequence ratio — so that when Claude words a finding slightly differently on the next scan, it still gets suppressed. There's also a session-dismiss for things you want to acknowledge without permanently suppressing, and an FP audit workflow where Claude reviews your saved false positives and flags any that probably shouldn't be permanently suppressed because they could indicate real malicious activity in a different context.
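To make the matching idea concrete, here is a minimal Python sketch of the two measures. How the tool actually combines them is not stated, so the equal weighting, the 0.7 threshold, the character-level LCS, and all function names below are assumptions.

```python
import re


def tokens(text):
    """Lowercased alphanumeric tokens, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Token overlap: size of the intersection over size of the union."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def lcs_length(a, b):
    """Length of the longest common subsequence, using a rolling DP row."""
    previous = [0] * (len(b) + 1)
    for ch_a in a:
        current = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def lcs_ratio(a, b):
    """Character-level LCS length normalised to the range 0..1."""
    a, b = a.lower(), b.lower()
    if not a and not b:
        return 1.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


def similarity(a, b, jaccard_weight=0.5):
    """Weighted blend of the two scores (the weighting is an assumption)."""
    return jaccard_weight * jaccard(a, b) + (1 - jaccard_weight) * lcs_ratio(a, b)


def is_suppressed(finding, known_false_positives, threshold=0.7):
    """True if the finding is close enough to any saved false positive."""
    return any(similarity(finding, fp) >= threshold for fp in known_false_positives)


if __name__ == "__main__":
    saved = ["Orphaned PTY sessions"]
    print(is_suppressed("Orphaned PTY sessions detected", saved))
    print(is_suppressed("Unknown kernel module loaded", saved))
```

The two measures cover for each other: Jaccard ignores word order, so a reworded finding with the same vocabulary still scores well, while the LCS ratio rewards shared phrasing even when a word or two is added or dropped.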

That last one is more useful than it sounds. "Orphaned PTY sessions" is a reasonable false positive if you left some terminals open. It is not something you should suppress permanently if it could also indicate someone else's session on your system.

Is It Any Good?

Honestly, it's useful. More useful than I expected. For investigating real findings on a system you're responsible for, the investigation loop pattern genuinely helps — it keeps you from either ignoring things you should look at or taking action you don't understand.

But let's be clear about what it isn't.

It is not production ready. It is not comprehensively tested. The AI analysis is non-deterministic and occasionally wrong. The agent runs as root with minimal authentication over localhost — fine for personal use, not something you'd put in front of customers.

Is it secure? No. It's a forensic tool that runs as root and executes commands you approve. Security is mostly your problem. Use it on systems you own, in environments you control, for purposes you understand.

Use it at your own risk. If you blindly approve every command Claude suggests without reading them, you will eventually do something you regret. The tool tries to help — boot-risk warnings, required rollback instructions for kernel changes, explicit confirmation checkboxes before anything that could affect system stability — but it cannot protect you from yourself.

What I Learned About Agentic Development

The most interesting thing about building this wasn't the security tooling. It was what the development process revealed about building agentic systems in general.

The gap between "AI that does a task" and "AI that works alongside a human on a task" is much larger than it looks. The first is just automation. The second requires thinking carefully about where the human needs to be in the loop, what information they need to make good decisions, and how to present AI suggestions in a way that encourages understanding rather than blind acceptance.

Getting that right took a lot more iteration than I expected. The first version had the AI running ahead too fast. Later versions were too cautious and required too many clicks for simple cases. The batch proposal pattern that ended up working is something I arrived at through trial and error, not design.

That feels like the honest state of agentic development right now. The patterns are still being worked out.

Where It Goes From Here

It's open source under MIT: github.com/liamromanis101/SysForensics

If you're interested in the agentic investigation pattern, want to add checks for other distributions, want to explore what commercial-grade security tooling built on this approach would look like, or just want to kick the tyres — get in touch. I'd be genuinely happy to make this a community project if there's interest.

There's a lot of room to go further with this. Better context-awareness between findings, CVE cross-referencing, fleet management, proper reporting for auditors. The foundation is there. Whether it goes anywhere depends on whether other people find the approach interesting.

If you do try it: read the commands before you approve them. That's the whole point.