Running a server security audit across a fleet by hand is miserable: you open
twenty SSH sessions, paste the same grep into each, copy results into a
spreadsheet, and pray you didn't fat-finger a rm on a production box at 2 a.m.
This guide shows you how to sweep an entire mixed fleet — web, API and database
hosts across several OS families — for security problems in a single pass, in
strict read-only mode, using the open-source MCP server
remote-agents. The whole point
is that an audit literally cannot change anything: every host starts in plan
mode, so a compliance or incident sweep is safe by construction.
In short: put the whole fleet into read-only plan mode with set_mode, then
fan out one audit at a time (fleet_exec, fleet_read, fleet_search and
mapreduce) to check SSH hardening, file permissions, leaked secrets,
brute-force IPs and pending patches. Payloads are end-to-end encrypted
(AES-GCM-256); the relay only ever sees ciphertext, and a hard denylist for
paths like /etc/shadow holds even in bypass mode.
Table of Contents
- Architecture: ~20 hosts, one room
- Installing agents and tagging the fleet
- Why and when to run a fleet-wide audit
- Quick summary table
- Recipe 1 — Put the fleet in read-only plan mode
- Recipe 2 — Find world-writable files and bad permissions
- Recipe 3 — Audit SSH hardening across the fleet
- Recipe 4 — Scan for leaked secrets and keys by content
- Recipe 5 — Aggregate brute-force IPs with MapReduce
- Recipe 6 — Check for pending security updates
- FAQ
- Wrapping up
Architecture: ~20 hosts, one room
Let's take a realistic mid-size production fleet — about twenty hosts split into
web, API and database tiers, with a couple of different OS families thrown in
because real fleets are never homogeneous. Every node runs a lightweight agent
that connects outbound to one encrypted relay room. The AI assistant (Claude
or opencode) sees the whole fleet as a single computer and addresses groups of
hosts by tag or OS family.
| Host | Role | Tag | Stack |
|---|---|---|---|
| web-1, web-2 … | Public web / reverse proxy | web | nginx, Ubuntu 22.04 |
| api-1, api-2 … | Application backends | api | Node.js, Debian 12 |
| db-1 | Primary database | db | PostgreSQL, Rocky Linux 9 |
| (mixed) | Edge / build boxes | os:macos, os:windows | macOS / Windows |
The magic is targeting. A single operation can hit the whole fleet
(target="all"), one tier (target="web" or target="api,db"), or one OS
family (target="os:linux"). That means a 20-host audit is one tool call, not
twenty SSH sessions — and results come back aggregated per host, so one
unreachable box never sinks the whole batch.
┌──────────────── AI (Claude / opencode) ─────────────────┐
│               remote-agents (MCP, stdio)                │
└────────────────────────┬────────────────────────────────┘
                         │ wss:// (E2E AES-GCM-256)
                 ┌───────┴────────┐  relay (CF Worker or self-host)
                 │   room=fleet   │  (forwards ciphertext only)
    ┌──────────┬─────────┼──────────┬──────────────┬───────────┐
  web-1      web-2     api-1      api-2 …        db-1    mac/win edge
       (web)                (api)                (db)  (os:macos/windows)
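To make the targeting rules concrete, here is a minimal Python sketch of how a selector such as "api,db" or "os:linux" could be resolved against a list of peers. It is not the remote-agents implementation: the `Agent` class, the `resolve_target` function and the `edge-mac` host name are hypothetical, and the matching semantics (commas mean "any of", `os:` matches the OS family, anything else is a tag) are inferred from the examples above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Agent:
    """Hypothetical view of one peer; field names are illustrative."""
    name: str
    os_family: str
    tags: frozenset = field(default_factory=frozenset)


def resolve_target(target: str, agents: list) -> list:
    """Return the names of agents matched by a target selector.

    Assumed semantics: "all" matches everything, "os:<family>" matches by
    OS family, any other token matches a tag, and commas mean "any of".
    """
    selectors = [s.strip() for s in target.split(",") if s.strip()]
    matched = []
    for agent in agents:
        for sel in selectors:
            if sel == "all":
                hit = True
            elif sel.startswith("os:"):
                hit = agent.os_family == sel[3:]
            else:
                hit = sel in agent.tags
            if hit:
                matched.append(agent.name)
                break  # one match is enough; avoid listing a host twice
    return matched


# Illustrative fleet; "edge-mac" is a made-up host name.
FLEET = [
    Agent("web-1", "linux", frozenset({"web"})),
    Agent("api-1", "linux", frozenset({"api"})),
    Agent("db-1", "linux", frozenset({"db"})),
    Agent("edge-mac", "macos", frozenset()),
]

print(resolve_target("api,db", FLEET))
print(resolve_target("os:linux", FLEET))
```

Writing the selector logic out like this is a quick way to sanity-check which hosts a target string should reach before you dispatch a real command.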
Installing agents and tagging the fleet
On each host you install one Rust binary and start an agent with the right tag.
Tags are what make tier-wide audits possible, so be deliberate about them.
# once on every machine
npm i -g remote-agents
# web tier
remote-agents run --relay wss://<relay> --room fleet --token <secret> \
--name web-1 --tags web
remote-agents run ... --name web-2 --tags web
# api tier
remote-agents run ... --name api-1 --tags api
remote-agents run ... --name api-2 --tags api
# database
remote-agents run ... --name db-1 --tags db
For 24/7 hosts, use remote-agents install instead of run — it registers a
background service (systemd on Linux, launchd on macOS) so the agent survives
reboots. To confirm the whole fleet is online, ask the AI "show me the agents in
the room"; under the hood that calls list_agents, which returns each peer's
OS family, distro, kernel, shell, tags and an update_available flag.
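Before an audit it can help to compare the roll call against your inventory. The sketch below assumes the list_agents result can be treated as a list of dictionaries with a `name` key and the `update_available` flag mentioned above; the exact response shape is an assumption, and `roll_call` is a hypothetical helper.

```python
def roll_call(expected: set, agents: list) -> dict:
    """Compare an expected inventory with the peers that are online.

    `agents` is assumed to be a list of dicts with at least a "name" key
    and an optional "update_available" flag (shape is illustrative).
    """
    online = {a["name"] for a in agents}
    return {
        "missing": sorted(expected - online),
        "unexpected": sorted(online - expected),
        "update_available": sorted(
            a["name"] for a in agents if a.get("update_available")
        ),
    }


# Illustrative data, not real tool output.
agents = [
    {"name": "web-1", "tags": ["web"], "update_available": False},
    {"name": "api-1", "tags": ["api"], "update_available": True},
]
report = roll_call({"web-1", "api-1", "db-1"}, agents)
print(report)
```

A non-empty `missing` list is worth resolving first, since an offline host silently drops out of every fleet-wide result.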
Note: The network is a flat peer mesh — there is no controller/agent
split. Your local machine joins as an equal peer and dispatches work. If you
want a send-only controller that never executes commands itself, start it with
--no-agent.
Why and when to run a fleet-wide audit
A fleet-wide security sweep is the kind of task that should be routine but
rarely is, because doing it manually scales linearly with host count. Here is
when this approach pays off the most:
- Compliance & audit windows. SOC 2, ISO 27001 and PCI all want evidence
  that SSH is hardened, permissions are sane and patches are current across
  every host, not a sample. Read-only plan mode produces that evidence
  without any chance of mutating the systems under review.
- Incident response. After a suspected breach you need to know right now
  which hosts have world-writable files, leaked keys on disk, or a spike of
  failed logins. Twenty parallel greps beat twenty serial SSH sessions when
  the clock is ticking.
- Drift detection. Configs rot. A host that was hardened six months ago may
  have had PasswordAuthentication flipped back during a debugging session.
- Onboarding new servers. When you absorb a fleet you didn't build, a single
  audit pass tells you what you actually inherited.
- Cross-platform reality. Real fleets mix Linux, macOS and Windows.
  os:<family> targeting lets one audit branch correctly per platform instead
  of failing half the hosts.
The recurring theme: you want to look, not touch. Everything below is built
around that guarantee.
Quick summary table
| # | Recipe | What it does |
|---|---|---|
| 1 | Read-only plan mode | Locks the whole fleet so the audit cannot write anything |
| 2 | World-writable & bad perms | Finds 0002-perm files and risky setuid binaries on every host |
| 3 | SSH hardening sweep | Checks PermitRootLogin / PasswordAuthentication fleet-wide |
| 4 | Secret scanning | Greps file content for AWS keys and private keys across hosts |
| 5 | Brute-force IP ranking | MapReduce over auth.log to rank top attacking IPs |
| 6 | Pending security updates | Lists upgradable packages per OS family (apt / dnf) |
Recipe 1 — Put the fleet in read-only plan mode
Before you run a single audit command, lock the fleet down. Each agent has a
safety mode, and plan allows only safe reads — read_file, git_status
and non-mutating exec. Writes, overwrites and risky operations are simply
rejected. This is the foundation that makes a security audit safe for compliance
and incident work: a host in plan mode physically cannot be changed by the
sweep.
Step 1. Switch every host to plan. You set the mode per host with
set_mode. For a fleet, loop it over each agent (or ask the AI to apply it
to all of them):
set_mode agent_id=web-1 mode=plan
set_mode agent_id=web-2 mode=plan
set_mode agent_id=api-1 mode=plan
set_mode agent_id=api-2 mode=plan
set_mode agent_id=db-1 mode=plan
Step 2. Confirm the lock with get_info. Don't take it on faith — read the
mode back from each host:
get_info agent_id=web-1
get_info agent_id=db-1
The get_info response reports the active mode, so you have proof every host is
read-only before the audit starts. That single field is exactly the kind of
artifact an auditor wants to see captured at the top of an evidence log.
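As a rough illustration of turning that check into an evidence record, the Python sketch below takes a host-to-mode mapping (which you would assemble from the get_info responses) and reports any host that is not in plan mode. The `mode_evidence` helper and the record layout are hypothetical, not part of remote-agents.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def mode_evidence(modes: dict, now: Optional[datetime] = None) -> dict:
    """Build an evidence record from a {host: mode} mapping.

    The mapping is assumed to be collected by hand or by the AI from the
    get_info responses; the record layout is purely illustrative.
    """
    now = now or datetime.now(timezone.utc)
    offenders = sorted(h for h, m in modes.items() if m != "plan")
    return {
        "checked_at": now.isoformat(),
        "modes": dict(sorted(modes.items())),
        "all_read_only": not offenders,
        "not_in_plan": offenders,
    }


# Illustrative input: db-1 was left in edit mode by mistake.
record = mode_evidence(
    {"web-1": "plan", "api-1": "plan", "db-1": "edit"},
    now=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(json.dumps(record, indent=2))
```

Refusing to start the sweep unless `all_read_only` is true keeps the "look, don't touch" guarantee explicit in the log.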
Important: Even if a host were in bypass mode, a hard denylist for
sensitive paths like /etc/shadow and /boot still applies: those reads and
writes are refused unconditionally. plan mode is your first guardrail; the
denylist is the one that never comes off.
Recipe 2 — Find world-writable files and bad permissions
World-writable files are a classic privilege-escalation vector: any local user
can overwrite them, and if one happens to be a script run by root, you have a
problem. Auditing this by hand across twenty hosts is exactly the kind of busywork
that gets skipped — so let's do all of it in one command.
Step 1. Hunt for world-writable files fleet-wide. Use fleet_exec with
target="all". The find expression -perm -0002 matches anything with the
world-writable bit set; -xdev keeps the walk on the root filesystem, and the
-not -path tests filter out anything under /proc and /sys:
fleet_exec target="all"
command="find / -xdev -type f -perm -0002 -not -path '/proc/*' -not -path '/sys/*' 2>/dev/null | head -n 50"
Results come back per host, so you immediately see that, say, web-2 has a
stray world-writable file in /var/www while every other host is clean. One
slow or offline host doesn't block the rest of the batch.
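If the find predicates feel opaque, the following Python sketch spells out the bit tests they perform: `-perm -0002` requires the other-write bit, while `-perm /0077` (used in Step 3) matches when any group or other bit is set. The helper names are made up for illustration, and the demo only touches a temporary directory.

```python
import os
import stat
import tempfile


def is_world_writable(mode: int) -> bool:
    """Same test as find's `-perm -0002`: the other-write bit is set."""
    return bool(mode & stat.S_IWOTH)


def is_exposed_to_group_or_other(mode: int) -> bool:
    """Same test as find's `-perm /0077`: any group or other bit is set."""
    return bool(mode & 0o077)


def world_writable_files(root: str) -> list:
    """Walk `root` and return regular files with the other-write bit set."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            if stat.S_ISREG(st.st_mode) and is_world_writable(st.st_mode):
                hits.append(path)
    return sorted(hits)


# Demo in a throwaway directory (POSIX permissions assumed).
with tempfile.TemporaryDirectory() as tmp:
    safe = os.path.join(tmp, "safe.txt")
    risky = os.path.join(tmp, "risky.sh")
    for p in (safe, risky):
        open(p, "w").close()
    os.chmod(safe, 0o644)
    os.chmod(risky, 0o666)
    found = [os.path.basename(p) for p in world_writable_files(tmp)]
    print(found)
```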
Step 2. Audit setuid/setgid binaries. Unexpected setuid binaries are another
red flag. Same fan-out, different predicate: -perm /6000 matches regular
files with either the setuid or the setgid bit set:
fleet_exec target="all"
command="find / -xdev -type f -perm /6000 2>/dev/null | sort"
Step 3. Spot-check sensitive directories. For example, confirm that no SSH
private keys are group/world-readable:
# what fleet_exec runs on each host
find /home /root -name 'id_*' -not -name '*.pub' -perm /0077 2>/dev/null
Tip: Add -xdev (shown above) to keep find from descending into other
mounted filesystems such as network mounts; otherwise an NFS share can make
the audit crawl on every host that mounts it. Cap output with head so a
misbehaving box can't flood the per-host result back through the relay.
On macOS hosts the same find works but paths differ (/Users instead of
/home); on Windows you'd target os:windows separately with a PowerShell ACL
check rather than find. This is why OS-family targeting matters — more on that
in Recipe 6.
Recipe 3 — Audit SSH hardening across the fleet
SSH is the front door, so its config is the highest-value thing to audit. The two
settings that bite people most often are PermitRootLogin (should be no or
prohibit-password) and PasswordAuthentication (should be no if you're using
keys). Drift here is silent and dangerous.
Step 1. Read the raw config from every host. Use fleet_read to pull
/etc/ssh/sshd_config across the fleet in one shot:
fleet_read target="os:linux" path=/etc/ssh/sshd_config
You get the full file back per host, which is great for archival evidence. But
for a quick pass/fail you usually want just the relevant lines.
Step 2. Extract just the hardening-relevant directives. Switch to
fleet_exec and grep the effective values, ignoring comments:
fleet_exec target="os:linux"
command="grep -Ei '^\s*(PermitRootLogin|PasswordAuthentication|PubkeyAuthentication|X11Forwarding|PermitEmptyPasswords)' /etc/ssh/sshd_config || echo 'NONE SET (defaults apply)'"
The per-host output makes outliers jump out: if api-2 reports
PasswordAuthentication yes while the rest report no, you've found your drift.
Step 3. Check the running config, not just the file. Includes and drop-in
sshd_config.d/ files can override the main file, so verify what sshd actually
loaded:
fleet_exec target="os:linux"
command="sshd -T 2>/dev/null | grep -Ei 'permitrootlogin|passwordauthentication|pubkeyauthentication'"
Note: sshd -T prints the fully resolved effective configuration,
including drop-ins under /etc/ssh/sshd_config.d/. Auditing the main file
alone can miss an override that re-enables password auth. Because everything is
read-only, this is a perfectly safe command to run in plan mode.
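Once the per-host `sshd -T` output is back, comparing it against a policy is mechanical. The Python sketch below is one way to do that; the `POLICY` table and function names are illustrative choices, and `without-password` is accepted because it is the older alias for `prohibit-password` that some sshd builds print.

```python
# Allowed values per directive; an illustrative policy, adjust to taste.
POLICY = {
    "permitrootlogin": {"no", "prohibit-password", "without-password"},
    "passwordauthentication": {"no"},
    "permitemptypasswords": {"no"},
}


def parse_sshd_t(output: str) -> dict:
    """Parse `sshd -T` style output ("key value" per line) into a dict."""
    settings = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0].lower()] = parts[1].strip().lower()
    return settings


def violations(output: str) -> list:
    """Return "key=value" strings for directives that break POLICY."""
    settings = parse_sshd_t(output)
    found = []
    for key, allowed in POLICY.items():
        value = settings.get(key)
        if value is not None and value not in allowed:
            found.append(f"{key}={value}")
    return found


# Illustrative per-host output, not captured from a real machine.
sample = "permitrootlogin yes\npasswordauthentication no\npubkeyauthentication yes\n"
print(violations(sample))
```

Running `violations` over each host's output gives a compact pass/fail list that is easier to scan than raw config lines.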
On macOS, sshd_config lives at the same path, but Remote Login itself is
switched on and off through systemsetup (query it with
systemsetup -getremotelogin); scope macOS hosts separately rather than
assuming the Linux commands apply everywhere.
Recipe 4 — Scan for leaked secrets and keys by content
Secrets end up on disk far more often than anyone admits: an AWS key pasted into
a .env, a private key copied to the wrong host during a migration, a token in a
forgotten shell history. The right tool here searches by content, not just
filename, because attackers don't name files aws-key.txt.
Step 1. Search the fleet's content for AWS access keys. Use
fleet_search — it searches each host's files by content. AWS access key IDs
match a recognizable pattern:
fleet_search target="all"
content="AKIA[0-9A-Z]{16}"
Each match comes back with the host, file path and surrounding context, so you
know exactly which box and which file to rotate.
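Step 2. Hunt for private key material. PEM-encoded private keys start with a
tell-tale header; the optional group in the pattern also catches the
unlabelled PKCS#8 form (BEGIN PRIVATE KEY) and encrypted keys:
fleet_search target="all"
content="BEGIN ((RSA|OPENSSH|EC|DSA|ENCRYPTED) )?PRIVATE KEY"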
Step 3. Catch generic credential assignments. For things like
password=, secret_key= or token: a looser keyword search across the
fleet surfaces candidates worth a human review:
fleet_search target="all"
content="(api[_-]?key|secret([_-]?key)?|token|password)\s*[:=]"
fleet_search defaults to scanning sensible roots (home plus
Documents/Downloads/Desktop/Pictures), which is where stray credentials
usually land. For deployment configs under /etc or /srv, fall back to
fleet_exec with grep -rIE scoped to those directories.
Important: Treat the audit output itself as sensitive — it now contains the
very secrets you were hunting. The relay never sees plaintext (payloads are
AES-GCM-256 encrypted end-to-end and the relay forwards only ciphertext), but
the results land in your AI chat transcript. Rotate any key the scan finds; a
leaked key on disk should be considered compromised the moment you discover it.
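One way to limit that exposure is to mask matches before findings are copied into a report or ticket. The Python sketch below applies the same AWS access key ID pattern used in Step 1 and keeps only the first four characters; the `redact` helper is hypothetical, and the key shown is the placeholder value from AWS documentation, not a real credential.

```python
import re

# Same pattern as the fleet_search query for AWS access key IDs.
AWS_KEY_ID = re.compile(r"AKIA[0-9A-Z]{16}")


def redact(text: str, keep: int = 4) -> str:
    """Mask AWS access key IDs, keeping the first `keep` characters."""
    def mask(match):
        value = match.group(0)
        return value[:keep] + "*" * (len(value) - keep)

    return AWS_KEY_ID.sub(mask, text)


# AKIAIOSFODNN7EXAMPLE is the documentation placeholder key ID.
line = "aws_access_key_id=AKIAIOSFODNN7EXAMPLE"
print(redact(line))
```

The masked prefix is usually enough to tell findings apart while the full value stays out of the evidence log.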
Recipe 5 — Aggregate brute-force IPs with MapReduce
Individual hosts each see their own slice of a brute-force campaign. The
interesting question (which IPs are hammering the whole fleet?) needs
cross-host aggregation. That's exactly what mapreduce is for: it partitions
the data across the fleet, runs a map_fn per partition, then folds everything
together with a reduce_fn.
Step 1. Map failed-password IPs per host. Each partition is a host's auth
log; the map step extracts the offending IPs and counts them locally:
mapreduce
data=["web-1:/var/log/auth.log", "web-2:/var/log/auth.log", "api-1:/var/log/auth.log", "api-2:/var/log/auth.log", "db-1:/var/log/secure"]
map_fn="grep 'Failed password' \"$(cut -d: -f2)\" | grep -oE 'from [0-9.]+' | awk '{print $2}' | sort | uniq -c"
reduce_fn="awk '{c[$2]+=$1} END {for (ip in c) print c[ip], ip}' | sort -rn | head -n 20"
The map step does the heavy per-host counting in parallel; the reduce step merges
those partial counts and ranks the top 20 attacking IPs across the entire
fleet — a single ranked list instead of five logs you'd otherwise correlate by
hand.
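For readers who think more easily in Python than in awk, the same map and reduce logic can be sketched as below. This is only an illustration of the counting technique, not how the mapreduce tool runs it; the function names and the sample log lines (using documentation-range IP addresses) are made up.

```python
import re
from collections import Counter

# IPv4 only, like the grep in the map_fn above.
FAILED = re.compile(r"Failed password .* from (\d{1,3}(?:\.\d{1,3}){3})")


def map_failed_ips(log_text: str) -> Counter:
    """Map step: count failed-password source IPs in one host's log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def reduce_counts(partials: list, top: int = 20) -> list:
    """Reduce step: merge per-host counts and rank the top offenders."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    # Sort by count descending, then by IP so ties are deterministic.
    return sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


# Illustrative log excerpts, not real data.
web_log = (
    "Jan 10 03:12:01 web-1 sshd[811]: Failed password for root from 203.0.113.7 port 52211 ssh2\n"
    "Jan 10 03:12:04 web-1 sshd[811]: Failed password for root from 203.0.113.7 port 52212 ssh2\n"
    "Jan 10 03:13:40 web-1 sshd[902]: Failed password for invalid user admin from 198.51.100.23 port 40022 ssh2\n"
)
db_log = (
    "Jan 10 03:12:09 db-1 sshd[455]: Failed password for postgres from 203.0.113.7 port 52300 ssh2\n"
    "Jan 10 03:15:00 db-1 sshd[455]: Accepted publickey for deploy from 192.0.2.10 port 50000 ssh2\n"
)

ranking = reduce_counts([map_failed_ips(web_log), map_failed_ips(db_log)])
print(ranking)
```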
Step 2. Make the run resilient. Add max_retries so a momentarily
unreachable host doesn't blow up the whole report:
mapreduce
data=[ ... same as above ... ]
map_fn="..."
reduce_fn="..."
max_retries=3
Failed partitions are automatically re-dispatched up to the retry limit, so a
box that hiccups gets a second (and third) chance rather than leaving a hole in
your data.
Tip: Different distros log SSH failures to different files:
Debian/Ubuntu use /var/log/auth.log, while RHEL/Rocky/Fedora use
/var/log/secure. Note how db-1 points at /var/log/secure above. If your
hosts use systemd-journald with no flat file, swap the map command's grep
for journalctl _COMM=sshd | grep 'Failed password'.
Once you have the ranked list, feeding the worst offenders into fail2ban or a
firewall denylist is a natural follow-up — but that's a write, so you'd
deliberately move the relevant host to edit mode first.
Recipe 6 — Check for pending security updates
Unpatched packages are among the most common breach vectors, and "are we
patched?" should be answerable in one command. The catch is that the answer
differs by OS, so this recipe leans hard on os:<family> targeting.
Step 1. List upgradable packages on Debian/Ubuntu hosts. Scope to the
relevant family and ask apt:
fleet_exec target="os:linux"
command="command -v apt >/dev/null && apt list --upgradable 2>/dev/null | grep -i security || true"
Step 2. Check RHEL-family hosts for security errata. dnf has a dedicated
security view:
fleet_exec target="os:linux"
command="command -v dnf >/dev/null && dnf updateinfo list security 2>/dev/null || true"
Guarding each command with command -v means a single target="os:linux" call
handles both Debian and RHEL hosts gracefully — the wrong tool simply no-ops
instead of erroring.
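To turn the per-host apt output into a short list of package names, a small parser is enough. The Python sketch below assumes the usual `name/suite version arch [upgradable from: old]` line layout of `apt list --upgradable`; the `security_packages` helper is hypothetical and the sample lines are illustrative rather than captured from a real host.

```python
def security_packages(apt_output: str) -> list:
    """Return package names whose suite field mentions "security".

    Assumes lines shaped like:
    name/suite version arch [upgradable from: old-version]
    """
    names = []
    for line in apt_output.splitlines():
        if "/" not in line or "[upgradable from:" not in line:
            continue  # skips the "Listing..." banner and blank lines
        name, rest = line.split("/", 1)
        suite = rest.split()[0]
        if "security" in suite.lower():
            names.append(name)
    return sorted(names)


# Illustrative output; package versions are made up.
sample_output = """Listing... Done
openssl/jammy-security 3.0.2-0ubuntu1.15 amd64 [upgradable from: 3.0.2-0ubuntu1.14]
vim/jammy-updates 2:8.2.3995-1ubuntu2.16 amd64 [upgradable from: 2:8.2.3995-1ubuntu2.15]
libssl3/jammy-updates,jammy-security 3.0.2-0ubuntu1.15 amd64 [upgradable from: 3.0.2-0ubuntu1.14]
"""
pending = security_packages(sample_output)
print(pending)
```

Counting the returned names per host gives a quick "patch debt" figure you can compare across the fleet.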
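Step 3. Don't forget non-Linux hosts. macOS and Windows need entirely
different commands, so target them separately. Note that Get-WindowsUpdate is
not built in; it comes from the PSWindowsUpdate module, which has to be
installed on the Windows hosts first: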
fleet_exec target="os:macos" command="softwareupdate -l"
fleet_exec target="os:windows" command="powershell -Command \"Get-WindowsUpdate\""
Note: This is a read-only check: none of the commands above install
anything, so they're safe in plan mode. When you're ready to actually
patch, that's a mutating operation: move the target hosts to edit (or
bypass) mode first, then run the upgrade. To find hosts running an outdated
agent binary, there's a dedicated tool, fleet_update_check, which flags
idle peers with a newer version available, after which you run
npm i -g remote-agents@latest on them.
That cross-platform split is the whole reason os: targeting exists: one logical
audit ("are we patched?") branches into the right command per platform instead of
failing on every host where the assumption doesn't hold.
FAQ
Is it really safe to run a security audit on production?
Yes — that's the entire design goal. You start every host in plan (read-only)
mode and confirm it with get_info, so the audit physically cannot write,
overwrite or delete anything. On top of that, a hard denylist for paths like
/etc/shadow and /boot applies even in bypass mode, and all command payloads
and results are end-to-end encrypted, with the relay forwarding only ciphertext.
How many servers can I audit in one command?
There's no fixed cap — fleet_exec, fleet_read, fleet_search and mapreduce
all fan out to as many hosts as match your target. A 20-host fleet is one
command instead of twenty SSH sessions, and results are aggregated per host
so you can read them like a report. One unreachable or slow host doesn't sink the
batch; with mapreduce you also get max_retries for transient failures.
How is this different from a dedicated scanner like OpenSCAP or Lynis?
It's complementary, not a replacement. Tools like Lynis and OpenSCAP run a fixed,
standardized benchmark on a single host. remote-agents is an interactive,
AI-driven layer that lets you run any check — including those scanners
themselves — across a whole fleet at once, correlate results, and pivot
instantly ("now show me which of those hosts also has password auth enabled").
You can even fleet_exec lynis audit system everywhere and aggregate the
scores.
Does the relay or any cloud service see my data?
No. Payloads are encrypted with AES-GCM-256 using a key derived from your room
token, and the relay — whether the Cloudflare Worker or a self-hosted Rust relay
(remote-agents-relay) — only ever forwards opaque ciphertext. You can run the
relay entirely on your own infrastructure and point agents at ws://your-host:8080,
so nothing leaves your environment.
Can I schedule the audit to run automatically?
Yes. Use schedule_add to register a 6-field cron job (sec min hour day month
dow) directly on a host; it runs on the agent itself, so it keeps firing
even if the relay link drops. A nightly world-writable-file check or weekly SSH
config diff is a natural fit.
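Because the leading seconds field is easy to forget when you are used to classic 5-field cron, a tiny pre-flight check can save a misfire. The Python sketch below only splits an expression into the six named fields listed above; it does not validate ranges, and whether schedule_add accepts exactly this whitespace-separated syntax is an assumption.

```python
# Field order as described above: sec min hour day month dow.
FIELDS = ("sec", "min", "hour", "day", "month", "dow")


def split_schedule(expr: str) -> dict:
    """Split a 6-field cron expression into named fields.

    Raises ValueError when the field count is wrong, which catches the
    common mistake of pasting a classic 5-field expression.
    """
    parts = expr.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


# Every day at 02:30:00, written with the leading seconds field.
nightly = split_schedule("0 30 2 * * *")
print(nightly)
```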
Wrapping up
A thorough server security audit across a fleet doesn't have to mean a day of
SSH tedium. With every host locked into read-only plan mode, one operator can
sweep twenty machines for world-writable files, SSH-hardening drift, leaked
secrets, brute-force IPs and missing patches — each in a single command, with
results aggregated per host and the worst offenders ranked automatically by
MapReduce. Because the work happens in plan mode behind a hard denylist and
end-to-end encryption, the same sweep that powers your incident response also
doubles as clean, repeatable compliance evidence — without ever risking the
systems under review.
Install: npm i -g remote-agents → package on npm · source & documentation