What We Learned Scanning Netflix Atlas


Clear Code Intelligence scanned a public Netflix repository: Netflix/atlas.

This is not a dunk on Netflix.

It is a public-code methodology test.

After scanning Google zx and Microsoft agent-framework, we wanted a different kind of repository. Netflix Atlas is an observability and telemetry project with a mature platform-engineering shape. It is mostly Scala, and it includes query/evaluator logic, API modules, language-server tooling, resource files, tests, and platform integration code.

That makes it a useful scan target because it tests whether a technical debt report can understand domain context.

What We Scanned

The Clear Code scan reviewed the public Netflix/atlas repository and produced a technical diligence PDF report.

The scan reported:

1,247 repository files
706 analyzed files
89,113 lines of code
186 report findings
high AI token debt risk

The scorecard was mixed:

Area	Score
Overall diligence	35/100
Projected after remediation	53/100
Delivery	96/100
Open source readiness	83/100
Architecture	45/100
Maintainability	0/100
AI governance	0/100

The delivery (96/100) and open-source readiness (83/100) signals were strong. That matters because a serious report should not only criticize; it should also show where the repository is already strong.

The Important Lesson Is Classification

Atlas is an observability/query system.

That means some findings require domain-aware interpretation.

For example, a generic scanner can flag evaluator-style code as dynamic execution. But in a query language, expression evaluation may be expected product behavior. The real question for the report is not simply "is there eval-like behavior?"

The better questions are:

Is this expected DSL/query behavior?
Is user input constrained?
Is execution sandboxed or bounded?
Are failure modes tested?
Are ownership boundaries clear?
Is this active debt or accepted design?

That distinction matters.

A scanner dump can find a pattern.

A useful technical debt report has to explain what the pattern means.
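
To make the questions above concrete, here is a minimal sketch of how a reviewer's answers could be turned into a triage label. It is an illustration only, not how the Clear Code scanner works: the EvalReview type, the triage function, and the "needs review" label are hypothetical names introduced for this example.

```python
from dataclasses import astuple, dataclass
from typing import Optional


@dataclass
class EvalReview:
    """Reviewer answers for one eval-like finding (hypothetical shape).

    None means the question has not been answered yet.
    """
    expected_dsl_behavior: Optional[bool] = None
    input_constrained: Optional[bool] = None
    execution_bounded: Optional[bool] = None
    failure_modes_tested: Optional[bool] = None
    ownership_clear: Optional[bool] = None


def triage(review: EvalReview) -> str:
    """Map the answers to a label; the rule itself is an assumption."""
    answers = astuple(review)
    if any(answer is None for answer in answers):
        # A pattern match alone is not enough to decide.
        return "needs review"
    if all(answers):
        # Expected behavior with every guard in place.
        return "accepted design"
    # At least one guard is missing, so there is work to schedule.
    return "active debt"
```

The design choice worth noting is the default: an unanswered question produces "needs review" rather than a verdict, so a pattern match on its own never counts as debt.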

Where AI Token Debt Appears

AI token debt is the extra AI-agent context, search, inference, retry, and review work created when a codebase is hard to reason about.

The Atlas scan modeled high AI token debt because of:

complexity drag
context sprawl
large-context files
deferred decisions
dependency uncertainty

Some context hotspots included:

atlas-lsp/src/main/scala/com/netflix/atlas/lsp/AslDocumentAnalyzer.scala
atlas-core/src/main/scala/com/netflix/atlas/core/stacklang/Interpreter.scala
atlas-webapi/src/main/scala/com/netflix/atlas/webapi/ExprApi.scala
atlas-postgres/src/main/scala/com/netflix/atlas/postgres/SqlUtils.scala
atlas-pekko/src/main/scala/com/netflix/atlas/pekko/StreamOps.scala

The key point is not that large files are automatically bad.

The key point is that AI agents pay for ambiguity.

When a future agent needs to modify query behavior, language-server behavior, expression parsing, or web API behavior, it has to reconstruct domain context before it can safely change the code. The more concentrated that context is, the more the agent spends on search, inference, retries, and human review.
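
As a rough illustration of why concentrated context costs more, the sketch below estimates how many tokens an agent would need to read for each file and lists the files that exceed a budget. It is not the model used in the scan. The four-characters-per-token ratio is a common rule of thumb that we assume here, and the function names and budget are hypothetical.

```python
def approx_tokens(text: str, chars_per_token: int = 4) -> int:
    """Estimate token count from character count (assumed ratio, rounded up)."""
    return -(-len(text) // chars_per_token)  # ceiling division


def over_budget(files: dict[str, str], budget_tokens: int) -> list[tuple[str, int]]:
    """Return (path, estimated tokens) for files above the budget, largest first."""
    sized = [(path, approx_tokens(text)) for path, text in files.items()]
    hot = [(path, tokens) for path, tokens in sized if tokens > budget_tokens]
    return sorted(hot, key=lambda item: item[1], reverse=True)
```

Called with a mapping of file paths to file contents, over_budget returns only the files an agent cannot read cheaply. Size is a proxy at best: it says nothing about how much domain knowledge a file assumes.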

False Positives Are Product Feedback

The scan also exposed places where tooling should improve.

For example:

palette resource files are not the same as large runtime modules
a default postgres/postgres username and password in a local test suite is not the same as a leaked production credential
syntax-highlighting token names are not credentials
query/evaluator logic needs domain context
benchmark modules should not be scored the same way as production paths

That does not make the scan useless.

It turns the scan's false positives into useful product feedback.

Technical debt tooling needs scope classification:

production runtime code
test fixture
local-only config
static resource
generated asset
benchmark code
expected domain behavior
active debt
accepted risk
false positive

Without that layer, reports become noisy.

With that layer, reports become decision support.
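
Part of that layer can be approximated from file paths alone. The sketch below assumes a conventional src/main and src/test layout and treats any directory whose name contains "jmh" or "bench" as benchmark code. These rules, the Scope enum, and classify_scope are our assumptions for illustration, not a description of the Atlas build or of the scanner. Labels such as accepted risk or false positive still need a human decision and are left out.

```python
from enum import Enum
from pathlib import PurePosixPath


class Scope(Enum):
    PRODUCTION = "production runtime code"
    TEST_FIXTURE = "test fixture"
    STATIC_RESOURCE = "static resource"
    BENCHMARK = "benchmark code"


def classify_scope(path: str) -> Scope:
    """Guess a scope from a repository-relative path (heuristic, assumed rules)."""
    parts = PurePosixPath(path).parts
    directories = [part.lower() for part in parts[:-1]]
    # Benchmark modules first, so their sources are never scored as production.
    if any("jmh" in d or "bench" in d for d in directories):
        return Scope.BENCHMARK
    # Anything under a test directory, including test resources.
    if "test" in directories:
        return Scope.TEST_FIXTURE
    # Non-test resource directories hold data files such as palettes.
    if "resources" in directories:
        return Scope.STATIC_RESOURCE
    return Scope.PRODUCTION
```

For example, atlas-core/src/main/scala/com/netflix/atlas/core/stacklang/Interpreter.scala classifies as production runtime code, while a hypothetical example-module/src/test/resources/application.conf classifies as a test fixture. The order of the checks matters: test resources are matched as test fixtures before the static resource rule can apply.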

Why Public Scans Matter

Public repositories are useful because the evidence can be inspected and the methodology can be challenged.

The goal is not to shame maintainers.

The goal is to make technical debt analysis concrete:

exact source evidence
confidence level
scope classification
domain interpretation
remediation path
verification expectation
AI-agent cost driver
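
One way to hold a report to that list is to make each item a required field on a finding. The sketch below is a hypothetical record shape, not the schema of the Clear Code report. The field names mirror the list above, and missing_fields is an illustrative helper.

```python
from dataclasses import dataclass, fields


@dataclass
class Finding:
    """One report finding; every field is free text in this sketch."""
    source_evidence: str = ""
    confidence: str = ""
    scope: str = ""
    domain_interpretation: str = ""
    remediation_path: str = ""
    verification_expectation: str = ""
    ai_agent_cost_driver: str = ""

    def missing_fields(self) -> list[str]:
        """Names of fields that are still empty or whitespace only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]
```

A finding with only source evidence and a confidence level would list the other five fields as missing, which is a simple signal that it is still scanner output rather than analysis.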

If anyone from Netflix Open Source or the Atlas maintainer community wants the full PDF report, we would be glad to share it and hear where the scan should be corrected, tuned, or scoped differently.

Public code deserves public, fair, evidence-backed analysis.