TL;DR
I analysed the externally observable posture of 122 Sui network endpoints. What I found isn't about whether the Sui team builds great software; it's about how even good engineers can miss external operational risk: exposed services, misconfigured infrastructure, and public metrics that leak sensitive operational data. This piece summarises my main findings, why they matter, and practical steps operators can take today.
Why I did this
I wanted to show, with data, how external attack surface and operational misconfiguration can undermine even excellent engineering. The Sui protocol itself is strongly engineered; my goal is educational: to help teams measure and close external exposure before an attacker finds it.
The data was shared with the Sui security team in August 2025.
What I scanned and how
Briefly (full methodology in the linked report):
- I measured 122 Sui-related endpoints for externally reachable services (HTTP, RPC, Docker API, metrics endpoints, etc.).
- My approach focused on externally observable posture — what an internet attacker can see and reach — not on private code or internal access.
- I applied conservative confidence thresholds for version/CVE mapping and logged only reproducible findings.
See the full methodology and raw data in my published findings. (link: https://github.com/pgdn-oss/sui-network-report-250819)
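To make that concrete, here is a minimal sketch of the kind of non-invasive check this involves: a TCP connect plus a passive banner read. This is not my actual scanner (that lives in the repo above); the host name is a placeholder, and the port list is my assumption about commonly interesting services.

```python
import socket

def grab_banner(host: str, port: int, timeout: float = 3.0) -> str | None:
    """TCP connect and read whatever banner the service volunteers, if any."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            try:
                return sock.recv(1024).decode("utf-8", errors="replace").strip()
            except socket.timeout:
                return ""  # port open, but the service stayed silent
    except OSError:
        return None  # closed, filtered, or unreachable

# Only probe hosts you own or are authorised to test.
# Ports: SSH, HTTP, Docker remote API, node_exporter, metrics (assumed defaults).
for port in (22, 80, 2375, 9100, 9184):
    banner = grab_banner("example-node.invalid", port)
    if banner is not None:
        print(f"{port}/tcp open: {banner!r}")
```

Passive reads like this are enough to fingerprint many services; nothing here sends exploit traffic.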
Topline findings
- A non-trivial share of observed endpoints exposed services that should never be public: for example, metrics endpoints reachable from the internet, and port 2375 (the Docker remote API) open on a surprising number of hosts. SSH was exposed all over the shop.
- Many public websites were default vendor landing pages or misconfigured web servers (these can leak service versions and admin consoles).
- Only a small fraction of hosts serving HTTP had a WAF in front of them.
- Several hosts returned service banners or version strings that mapped to known CVEs (I used a conservative confidence policy; the “CVE-affected” label is an upper bound pending operator verification).
- The distribution of problems is not uniform — some operators were well locked down, others left obvious signals that an external attacker could use.
(Full counts, tables and heatmaps are available in the full report.)
Why this matters
- External visibility is an attacker’s map. Public metrics, misconfigured HTTP endpoints and exposed management APIs are high-value reconnaissance.
- Automated attacks scale. An exposed metrics endpoint or Docker API is trivial for automated tooling to find and target at scale.
- Engineers think inside-out. Teams often focus on consensus and cryptography (rightly), and under-invest in hardening the network/ops layer that faces the internet.
Concrete examples (anonymised)
- Metrics endpoints reachable on the public internet that expose internal state and operational metrics.
- Docker remote API (2375/tcp) responding with service banners. In the wrong hands this is a trivial path to remote code execution: an unauthenticated Docker API lets anyone launch privileged containers on the host (a sketch of this check follows below).
- Default web server landing pages that leak version information or provide admin paths.
(Again — see the report for technical reproduction notes and timeline.)
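To illustrate how trivial these exposures are to find, here is a hedged sketch of the two checks above. The host is a placeholder, the metrics port is an assumption; Docker's `GET /version` endpoint is part of its documented Engine API.

```python
import json
import urllib.request

HOST = "example-node.invalid"  # placeholder; only probe hosts you are authorised to test

def fetch(url: str, timeout: float = 3.0) -> str | None:
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read(4096).decode("utf-8", errors="replace")
    except OSError:
        return None

# An unauthenticated Docker remote API answers /version with engine details.
docker = fetch(f"http://{HOST}:2375/version")
if docker:
    print("EXPOSED Docker API, engine version:", json.loads(docker).get("Version"))

# A publicly scrapeable Prometheus-style endpoint returns plaintext metrics.
metrics = fetch(f"http://{HOST}:9184/metrics")  # 9184: assumed metrics port
if metrics:
    print("EXPOSED metrics endpoint, first line:", metrics.splitlines()[0])
```

If either request succeeds, automated tooling on the open internet will find it too.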
Remediation checklist (for operators & you)
- Inventory your externally reachable endpoints. If you can’t list them, you can’t secure them. Use internal scans + trusted external scans.
- Close management interfaces to the public. Docker APIs, admin consoles, metrics scrape endpoints — bind them to localhost / private networks only.
- Require auth and network controls. Where management APIs must be reachable externally, place them behind a mutual-TLS gateway, VPN, or tightly-scoped firewall rules.
- Harden metrics endpoints. Don’t expose Prometheus or similar scrapers to the public internet. Use an internal scraper or secure gateway.
- Remove verbose banners & version strings. Configure servers to not reveal build/versioning in HTTP headers or service banners.
- Monitor for drift. Re-run external posture scans regularly and detect when previously closed ports reappear (a minimal sketch follows this list).
- Patch management. Track service versions and patch known CVEs promptly, but treat banner-derived versions as unverified until operators confirm them.
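For the drift point above, a minimal sketch: keep a baseline of expected-open ports per host and alert when anything new appears. The baseline file name, its contents, and the port range are all illustrative assumptions.

```python
import json
import socket
from pathlib import Path

BASELINE = Path("posture_baseline.json")  # e.g. {"example-node.invalid": [22, 443]}

def open_ports(host: str, ports: range, timeout: float = 0.5) -> set[int]:
    """Return the subset of ports that accept a TCP connection."""
    found = set()
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                found.add(port)
        except OSError:
            pass
    return found

baseline = json.loads(BASELINE.read_text())
for host, expected in baseline.items():
    current = open_ports(host, range(1, 1025))
    drift = current - set(expected)
    if drift:
        print(f"DRIFT on {host}: unexpected open ports {sorted(drift)}")
```

Run this on a schedule from outside your own network so you see what an attacker sees, not what your internal view suggests.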
Limitations & ethics
- My scans are non-invasive and focused on public-facing services. I do not exploit vulnerabilities, nor do I publish private data.
- Some “port-only” observations require operator verification (e.g., distinguishing a ghost port from a genuine service).
- The CVE mappings are conservative upper-bound estimates that need operator confirmation for actionable triage.
See the full methodology, opsec and reproducibility appendix for the exact scanner commands and the policy I used for CVE confidence. (link: https://github.com/pgdn-oss/pgdn-cve)
What I recommend to protocol teams and operators
- Fund or mandate periodic external posture reviews as part of release processes.
- Automate external smoke tests that confirm management APIs and metrics are not exposed (see the sketch after this list).
- Make "no management interfaces exposed" part of a documented deployment runbook.
- Share anonymised exposure telemetry so the community can learn and raise the bar.
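For the smoke-test recommendation, a sketch of what a CI gate might look like, assuming pytest and a host inventory you maintain; the host names and port choices are illustrative, not prescriptive.

```python
import socket

import pytest

NODES = ["example-node.invalid"]       # hypothetical inventory of your own hosts
MANAGEMENT_PORTS = [2375, 2376, 9184]  # Docker API, Docker TLS API, assumed metrics port

def is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

@pytest.mark.parametrize("host", NODES)
@pytest.mark.parametrize("port", MANAGEMENT_PORTS)
def test_management_port_not_public(host: str, port: int):
    assert not is_open(host, port), f"{host}:{port} is reachable from the internet"
```

Failing the pipeline when a management port answers turns "we'll check exposure eventually" into a hard gate on every release.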
Closing — why I published this
This is about shared risk and learning. Great protocol engineering doesn't immunise an operator against mistakes in deployment and ops. My hope: this write-up becomes a practical resource for teams and operators to make the Sui network (and similar networks) safer for everyone.
Full report (data, scripts, and appendices): https://github.com/pgdn-oss/sui-network-report-250819
I’ve been building something new that takes this kind of analysis much further — automating external risk discovery at scale. More on that soon.
Thanks for reading,
Simon