By RUGERO Tesla (@404Saint)
The thing nobody wants to admit
Most organizations don't actually know what they're exposing to the internet.
I don't mean that as a criticism. I mean it literally. Assets drift. Services get spun up and forgotten. Teams build things outside the controlled network boundary because it's faster. A subdomain that pointed somewhere important three years ago still resolves, except now it points at nothing, and nothing is claimable by anyone with the right timing.
This is what Shadow IT looks like from the outside. Not malicious. Just invisible.
I spent a lot of time doing recon simulations and building lab environments around infrastructure security, and the same problem kept showing up. Discovery is a solved problem. You can find assets. What's hard is understanding how they relate to each other, which ones actually belong to the organization you're looking at, and which ones represent real exposure versus expected noise.
SurfaceLens V2 is my attempt to build something that treats those questions seriously.
What it is
SurfaceLens V2 is a modular attack surface management tool, but calling it a scanner misses the point. It's built as an intelligence pipeline. The difference matters.
A scanner gives you a list. A pipeline takes that list and asks what it means. Who does this asset belong to? Has it appeared before? Does its TLS configuration match what you'd expect? Is this subdomain pointing at infrastructure that's been decommissioned?
The goal is moving from raw discovery to something you can actually act on.
What I kept running into
Doing recon across different lab environments and simulated enterprise networks, four things came up constantly.
Subdomains pointing at decommissioned infrastructure nobody had cleaned up. In some cases the underlying cloud resource was unclaimed, meaning anyone could register it and inherit whatever trust the subdomain carried. Subdomain takeover is well documented, but it's still everywhere.
Services exposed outside their intended boundaries. RDP and SSH sitting on public IPs. Databases reachable without a VPN. Not because anyone decided that was fine, just because nobody noticed.
Assets that clearly belonged to an organization but didn't match its DNS patterns at all. Shadow IT, basically. Someone built something, it works, it lives outside the perimeter anyone is actually monitoring.
TLS configurations that ranged from outdated to outright broken, on infrastructure that looked authoritative enough that a user would trust it without thinking.
None of these are surprising individually. Together they paint a picture of an attack surface nobody has a complete map of.
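The first finding on that list, the dangling subdomain, is worth sketching concretely. A minimal version of the classification step might look like this; the suffix list and the function name are illustrative, not SurfaceLens's actual detection logic, and a real check would do the DNS resolution itself rather than take `target_resolves` as an argument:

```python
# Illustrative CNAME suffixes for services where an unclaimed target can
# often be re-registered by anyone. Not exhaustive.
TAKEOVER_PRONE_SUFFIXES = (
    ".s3.amazonaws.com",
    ".azurewebsites.net",
    ".github.io",
)

def takeover_candidate(subdomain, cname, target_resolves):
    """Flag a subdomain whose CNAME points at a takeover-prone service
    and whose target no longer resolves."""
    if target_resolves or cname is None:
        return None
    if any(cname.endswith(s) for s in TAKEOVER_PRONE_SUFFIXES):
        return {"subdomain": subdomain, "cname": cname, "risk": "possible takeover"}
    return None

# A subdomain still pointing at a deleted S3 bucket is exactly the case
# described above: the record resolves, the resource behind it is gone.
finding = takeover_candidate(
    "assets.example.com", "old-bucket.s3.amazonaws.com", target_resolves=False
)
```

The interesting part is the combination: a CNAME alone is normal, a non-resolving target alone is often just breakage. Together they're a takeover candidate.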
How SurfaceLens approaches it
Pull from multiple sources
The first stage aggregates asset data from Shodan, Censys, LeakIX, CriminalIP, and local datasets. Using multiple providers matters because each one sees different things. An asset invisible to Shodan might be indexed by Censys. Combining sources gives you a more complete picture than any single feed.
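The merge step behind that aggregation can be sketched simply. This assumes each provider's feed has already been normalized into records with `ip` and `hostname` keys; the field names and the `merge_feeds` helper are hypothetical, not SurfaceLens's internal API:

```python
def merge_feeds(*feeds):
    """Merge asset records from multiple providers, deduplicating by
    (ip, hostname) while remembering which sources saw each asset."""
    merged = {}
    for source, assets in feeds:
        for asset in assets:
            key = (asset["ip"], asset.get("hostname"))
            record = merged.setdefault(key, {**asset, "sources": set()})
            record["sources"].add(source)
    return list(merged.values())

# One provider sees a host; a second provider confirms it and adds an
# asset the first one missed. Addresses are RFC 5737 documentation IPs.
shodan_feed = ("shodan", [{"ip": "203.0.113.10", "hostname": "app.example.com"}])
censys_feed = ("censys", [
    {"ip": "203.0.113.10", "hostname": "app.example.com"},
    {"ip": "203.0.113.11", "hostname": "old.example.com"},
])
assets = merge_feeds(shodan_feed, censys_feed)
```

Tracking which sources saw each asset is itself a signal: an asset only one provider indexes is often the kind of thing nobody inside the organization is watching either.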
Track state over time
One of the decisions I spent the most time on was persistence. Most recon tools treat each scan as a standalone event. You run it, you get results, you move on.
That model throws away something valuable. The question isn't just what's exposed right now. It's what's new since the last time you looked, what disappeared, what changed.
SurfaceLens stores assets in a local SQLite database with first-seen and last-seen timestamps. New exposures surface immediately. An asset that vanished and came back shows up as a change worth investigating. Recon becomes monitoring instead of a one-time snapshot.
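The core of that persistence model is a small upsert. A minimal sketch, assuming a simple schema keyed by asset identifier (the table layout here is illustrative, not the tool's actual schema):

```python
import sqlite3

def upsert_asset(conn, asset_id, seen_at):
    """Record a sighting: insert with first_seen on first appearance,
    bump last_seen on every later one. Returns 'new' or 'seen'."""
    row = conn.execute(
        "SELECT first_seen FROM assets WHERE id = ?", (asset_id,)
    ).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO assets (id, first_seen, last_seen) VALUES (?, ?, ?)",
            (asset_id, seen_at, seen_at),
        )
        return "new"
    conn.execute("UPDATE assets SET last_seen = ? WHERE id = ?", (seen_at, asset_id))
    return "seen"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assets (id TEXT PRIMARY KEY, first_seen TEXT, last_seen TEXT)")

status_first = upsert_asset(conn, "api.example.com", "2025-01-01")
status_again = upsert_asset(conn, "api.example.com", "2025-02-01")
```

Everything described above falls out of those two timestamps: anything returning "new" is a fresh exposure, and anything whose last_seen stops advancing has disappeared and is worth a look if it comes back.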
Run each asset through the pipeline
Every asset that comes in goes through a series of modular checks.
The SSL Auditor pulls certificate data and evaluates TLS configuration. Weak ciphers, expired certs, misconfigured chains. Anything that would make a security-conscious person wince.
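The evaluation half of that check is straightforward once the certificate data has been collected. A sketch of the scoring logic, with an illustrative weak-cipher list (the real auditor's checks and thresholds may differ, and the collection step over the network is omitted):

```python
from datetime import datetime, timezone

# Illustrative markers of legacy ciphers; not an exhaustive policy.
WEAK_CIPHER_MARKERS = ("RC4", "3DES", "DES-CBC", "NULL", "EXPORT")

def audit_tls(not_after, cipher, now=None):
    """Evaluate already-collected TLS facts: certificate expiry and the
    negotiated cipher. Returns a list of findings; empty means clean."""
    now = now or datetime.now(timezone.utc)
    findings = []
    if not_after < now:
        findings.append("expired certificate")
    if any(marker in cipher.upper() for marker in WEAK_CIPHER_MARKERS):
        findings.append(f"weak cipher: {cipher}")
    return findings

# Fixed "now" so the example is deterministic.
checkpoint = datetime(2025, 6, 1, tzinfo=timezone.utc)
findings = audit_tls(datetime(2024, 1, 1, tzinfo=timezone.utc), "RC4-SHA", now=checkpoint)
```

Separating collection from evaluation also makes the module testable without touching the network, which matters for a tool that wants to stay passive-first.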
The DNS Correlator does attribution analysis. This is the part I find most interesting. It tries to determine whether an asset actually belongs to the organization you're analyzing, or whether it's drifted outside controlled boundaries. This is where Shadow IT becomes visible in the data rather than just suspected.
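One crude way to make that attribution concrete: an asset under a known org domain that also matches the org's naming conventions is probably managed; under the domain but off-pattern, it's a Shadow IT candidate. The patterns and labels below are hypothetical, a much simpler heuristic than whatever the real correlator does:

```python
import re

def attribute(hostname, org_domains, naming_patterns):
    """Classify an asset as 'managed', 'possible shadow IT', or
    'unrelated' based on domain ownership and naming conventions."""
    if not any(hostname == d or hostname.endswith("." + d) for d in org_domains):
        return "unrelated"
    label = hostname.split(".")[0]
    if any(re.fullmatch(p, label) for p in naming_patterns):
        return "managed"
    return "possible shadow IT"

org_domains = ["example.com"]
# Illustrative convention: role-environment-number, e.g. api-prod-01.
patterns = [r"(web|api|db)-[a-z]+-\d{2}"]

managed = attribute("api-prod-01.example.com", org_domains, patterns)
shadow = attribute("teslas-test-box.example.com", org_domains, patterns)
```

The "possible shadow IT" bucket is the payoff: it's the set of assets the organization owns on paper but that nobody's provisioning process produced.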
The Fingerprinter identifies technologies and service layers. What's running behind the asset? A reverse proxy? A specific web server version? This context changes how you interpret everything else.
The Sensitive File Hunter checks for common exposure patterns. .env files, robots.txt entries that reveal more than intended, backup files sitting in predictable locations. Simple checks that still catch real things regularly.
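The subtlety in checks like these is not the probing, it's avoiding false positives from servers that return 200 for everything. A sketch of the response-classification side, with an illustrative path list and content markers (not the tool's actual wordlist, and the HTTP requests themselves are omitted):

```python
# Illustrative subset of probed paths.
SENSITIVE_PATHS = ["/.env", "/.git/config", "/backup.sql", "/config.php.bak"]

def classify_probe(path, status, body_snippet):
    """Decide whether a probe response looks like a real exposure rather
    than a soft-404 that answers 200 to any path."""
    if status != 200:
        return None
    # Content markers that confirm the file is what it claims to be.
    markers = {"/.env": "DB_PASSWORD", "/.git/config": "[core]"}
    marker = markers.get(path)
    if marker and marker in body_snippet:
        return {"path": path, "severity": "high"}
    return {"path": path, "severity": "review"}

hit = classify_probe("/.env", 200, "DB_PASSWORD=hunter2")
miss = classify_probe("/backup.sql", 404, "")
```

Confirming content rather than trusting status codes is what keeps a simple check from flooding the report with noise.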
The Risk Prioritizer pulls all of this together into a weighted score between 0 and 10. Not a magic number that tells you what to do, but a signal that tells you where to look first when you have fifty assets and time for five.
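In its simplest form, a weighted score like that is a clamped sum over the signals the earlier modules emitted. The weights below are made up for illustration; the real prioritizer's weighting scheme is its own:

```python
# Illustrative weights per signal; the actual scorer may weight differently.
WEIGHTS = {
    "exposed_admin_service": 3.0,
    "expired_certificate": 2.0,
    "dangling_dns": 4.0,
    "sensitive_file": 3.5,
}

def risk_score(signals):
    """Weighted sum of boolean signals, clamped to the 0-10 scale."""
    raw = sum(WEIGHTS[name] for name, present in signals.items() if present)
    return round(min(raw, 10.0), 1)

score = risk_score({
    "dangling_dns": True,
    "expired_certificate": True,
    "exposed_admin_service": False,
})
```

The clamp matters more than it looks: past a certain point, "how bad" stops being informative and the score's only job is ordering the queue.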
The shift that changed how I think about this
When I started building SurfaceLens I was thinking about discovery. Find the things, list the things, report the things.
Somewhere in the middle of building the DNS Correlator I started thinking differently.
Individual findings don't tell you much. An open port is an open port. A TLS misconfiguration is a TLS misconfiguration. But when you start correlating DNS attribution with service exposure with certificate data with historical visibility, you start seeing something that looks less like a list of issues and more like a map of how an attacker would move.
That's where exposure stops being a checkbox and starts being an attack path.
I don't think I fully understood that distinction until I had to implement it. Which is probably the best argument for building tools rather than just using them.
Output
The same underlying data comes out three ways depending on what you need.
CLI output for quick assessments when you want high-signal results without overhead. Markdown reports for documentation and audit trails. A Flask web dashboard for anything that benefits from a persistent, navigable view of assets, risk scores, and historical changes.
Same data model, different interfaces. Nothing gets lost between them.
What it isn't
SurfaceLens is passive-first. It relies on aggregated intelligence sources and non-intrusive active checks. It's not an aggressive scanner. It's not trying to enumerate everything as fast as possible.
That's a deliberate choice. In real environments, volume creates noise. Noise buries signal. The tool is more useful if it's telling you fewer, more meaningful things than if it's generating a report that takes three days to triage.
Where it goes next
SurfaceLens V2 is a foundation. The areas I'm actively thinking about are better attribution models for asset ownership, risk scoring that's more context-aware than weighted signals alone, and tighter integration with automated security workflows.
The detection coverage for infrastructure misconfigurations has room to grow too. There's a long list of checks that would add value without adding noise, and working through that list is ongoing.
Use this responsibly. SurfaceLens is built for defensive research and authorized assessments. Don't point it at infrastructure you don't have permission to analyze.
The project
github.com/404saint/surfacelens_v2
If you're working in infrastructure security or attack surface management, take a look. Issues and PRs are open. I'm especially interested in feedback from people who've tried to solve the attribution problem differently.
Built by RUGERO Tesla · GitHub: @404Saint
Offensive security researcher focused on infrastructure, network security, attack surface analysis, and Shadow IT discovery.