252 green tests, zero traces in Jaeger, and the one-line OpenTelemetry mistake that made my observability tool blind.
TL;DR
I shipped an observability tool with 252 green tests — but zero traces ever reached Jaeger. The root cause was an OpenTelemetry config detail that looked harmless (spanProcessors: []) but silently disabled trace export. Manual testing found it in minutes.
I shipped v2.2.0 of my observability tool with 143 tests and a green CI run.
Then I did what I thought was the responsible thing: a deep code + DX audit. I found 44 issues, fixed them in a sprint, bumped the version a bunch of times, and ended at v2.4.4 with 252 tests.
I felt great — until I ran the tool like a real user would.
Zero traces were reaching the backend. Not “sometimes.” Not “misconfigured.” Just: never.
252 unit tests. All green. Traces were broken since day one.
This is how I found out, what the root cause was, and why tests (and code audits) didn’t see it.
Act 1: The audit (I found what I expected)
The audit was useful. It caught real problems — especially the kind that looks “reasonable” in code review and passes unit tests.
Three examples that matter for the story:
1) A privacy feature that leaked PII
I had an auditMasking mode meant to help debug redaction. Great intention, terrible output: it logged the original unmasked text to stdout.
If your logs go to CloudWatch/Datadog (they do), stdout isn’t “local debug.” It’s a data pipeline.
Caption: “Fix: audit mode no longer prints raw input (PII).”
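The safer pattern is to make audit mode report *what* was redacted and *where*, never the raw text. A minimal sketch of that idea — `mask` is a hypothetical helper, not the tool's actual implementation:

```typescript
// Hypothetical masking helper: audit mode gets a redaction count,
// never the original unmasked text, so stdout can't leak PII.
function mask(
  text: string,
  pattern: RegExp
): { masked: string; redactions: number } {
  let redactions = 0;
  const masked = text.replace(pattern, () => {
    redactions++;
    return "[REDACTED]";
  });
  return { masked, redactions };
}

const result = mask("email me at jane@example.com", /\S+@\S+/g);
console.log(result.masked);     // "email me at [REDACTED]"
console.log(result.redactions); // 1 — enough to debug redaction without the raw value
```

Logging `{ redactions: 1 }` tells you the redaction fired; logging the original string tells CloudWatch your users' email addresses.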
2) diag.warn() was invisible by default
I used OpenTelemetry’s diag.warn() for user-facing warnings.
Problem: diag.* emits nothing unless diagnostics are explicitly configured. So warnings existed… but users never saw them. Typo? Missing SDK? Collector down? Silent.
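(OpenTelemetry's diag channel only emits once you register a logger, e.g. diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.WARN) from @opentelemetry/api.) The pattern I moved to for user-facing warnings: route them through a sink that defaults to console.warn, so nothing can silently swallow them. A sketch, with warnUser as a hypothetical helper:

```typescript
// Hypothetical helper: user-facing warnings default to console.warn,
// so an unconfigured diag logger can no longer make them invisible.
function warnUser(
  message: string,
  sink: (msg: string) => void = console.warn
): string {
  const line = `[toad-eye] warning: ${message}`;
  sink(line);
  return line;
}

// Collect into an array here just to show the sink is pluggable for tests:
const captured: string[] = [];
warnUser("OTLP exporter cannot reach the Collector", (m) => captured.push(m));
console.log(captured[0]); // "[toad-eye] warning: OTLP exporter cannot reach the Collector"
```

Keep diag.* for SDK internals; anything a user must see goes through the loud path.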
Keep this in mind: “silent failure” becomes the recurring theme of this story.
3) npx CLI was completely dead
The CLI entry guard compared a symlink path to a real path, so npx toad-eye ... produced zero output. Entire CLI dead via npx.
Caption: “Fix: npx runs via a symlink — compare real paths or the CLI never executes.”
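The guard failed because npx invokes the CLI through a symlinked bin shim, so process.argv[1] is a symlink while the module path is the real file. Resolving both sides with fs.realpathSync before comparing fixes it. A self-contained sketch (isDirectEntry is a hypothetical name), demonstrated with a throwaway symlink:

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Resolve symlinks on both sides before comparing entry paths,
// so an npx bin shim still matches the real CLI module.
function isDirectEntry(argvPath: string, modulePath: string): boolean {
  return fs.realpathSync(argvPath) === fs.realpathSync(modulePath);
}

// Demonstration: a symlink to a real file, like npx's bin shim.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "cli-guard-"));
const real = path.join(dir, "cli.js");
fs.writeFileSync(real, "");
const link = path.join(dir, "cli-link.js");
fs.symlinkSync(real, link);

console.log(link === real);             // false — the naive comparison that killed the CLI
console.log(isDirectEntry(link, real)); // true — real-path comparison runs main()
```

In an ES module the real-file side would be fileURLToPath(import.meta.url); the comparison logic is the same.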
At this point, the audit felt like a win: 44 issues found, 44 fixed, tests grew from 143 → 252. Ship it.
Act 2: Manual testing (I found what I didn’t expect)
After the audit, I wrote a quick testing guide and ran the tool end-to-end:
1) npx init
2) import into a tiny app
3) run against a real Collector
4) confirm traces show up in Jaeger
This is where everything fell apart fast:
Step 1 — npx toad-eye init: silence
That was the broken npx guard (fixed as above). The tool looked “dead” for the most common installation path.
Step 2 — importing with tsx: ERR_PACKAGE_PATH_NOT_EXPORTED
Package exports were missing the "default" condition. Another “works locally” trap.
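Node's resolution algorithm matches export conditions in order and falls through to "default" when nothing else applies; tools like tsx can end up on that fallback. A sketch of the shape that fixed it (the dist paths are illustrative, the condition names come from Node's docs):

```json
{
  "exports": {
    ".": {
      "types": "./dist/index.d.ts",
      "import": "./dist/index.js",
      "default": "./dist/index.js"
    }
  }
}
```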
Step 3 — Jaeger: nothing
No service. No spans. No errors. No warnings (because diag.warn was invisible).
So I did what everyone does: I blamed Docker and infrastructure. I spent an hour tweaking Collector configs, flipping between gRPC and HTTP ports, restarting containers — all while assuming the problem was upstream in Jaeger or the Collector.
But the pipeline wasn’t broken in the middle.
It was broken at the source.
The root cause: I accidentally disabled trace export completely
Here’s the bug:
I passed spanProcessors: [] to OpenTelemetry’s NodeSDK.
That looks harmless. It’s not.
An empty spanProcessors array doesn’t mean “use defaults.”
It means “override defaults with nothing.”
No span processor → nothing exports.
Metrics still worked (separate pipeline), which made the bug even harder to spot. The tool looked “alive” while traces were dead.
Even worse: when instrument: ['ai'] was enabled, spanProcessors became non-empty… but the processor I provided only recorded metrics. I still didn’t include the default BatchSpanProcessor for exporting spans.
Different code path, same result: zero traces.
Important: this wasn’t a flaky config issue. Traces never worked for any user. Ever.
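The root cause boils down to "defined beats default." A deliberately simplified model of the SDK's decision — not the actual @opentelemetry/sdk-node code, just the shape of its logic:

```typescript
// Simplified model (assumption: mirrors the NodeSDK's behavior, not its code).
// Strings stand in for real SpanProcessor instances.
function resolveSpanProcessors(userProcessors?: string[]): string[] {
  // Any *defined* value — including [] — replaces the defaults entirely.
  if (userProcessors !== undefined) return userProcessors;
  // Only when the option is absent does the SDK create its default
  // BatchSpanProcessor wired to the configured trace exporter.
  return ["BatchSpanProcessor(traceExporter)"];
}

console.log(resolveSpanProcessors([]));        // [] — no processor, nothing exports
console.log(resolveSpanProcessors(undefined)); // [ "BatchSpanProcessor(traceExporter)" ]
```

spanProcessors: [] and omitting the option look interchangeable in a code review. They are opposites.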
The one-line fix
The fix is almost insulting:
Don’t pass an empty array.
Let the SDK create its default BatchSpanProcessor unless you actually have span processors to set.
Caption: “Fix: don’t override the default BatchSpanProcessor with spanProcessors: [].”
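In code, the fix is to build the SDK options conditionally: omit the key entirely when you have nothing to add, and when you do add your own processors, remember they replace the exporting default, so include one yourself. A sketch under those assumptions (buildSdkConfig is a hypothetical helper; strings stand in for processor instances):

```typescript
interface SdkConfig {
  spanProcessors?: string[];
}

// Hypothetical helper: only set spanProcessors when there is something to set.
function buildSdkConfig(custom: string[]): SdkConfig {
  if (custom.length === 0) {
    // Leave the option undefined → the SDK creates its default
    // BatchSpanProcessor around the configured exporter.
    return {};
  }
  // Custom processors REPLACE the default, so an exporting processor
  // must be included explicitly or spans still never leave the process.
  return { spanProcessors: [...custom, "BatchSpanProcessor(traceExporter)"] };
}

console.log(buildSdkConfig([]));                     // {} — SDK keeps its defaults
console.log(buildSdkConfig(["aiMetricsProcessor"])); // custom + exporting processor
```

The second branch is what the instrument: ['ai'] path was missing: a metrics-only processor with no exporter alongside it.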
Act 3: The takeaway (what changed in how I test)
After this, “252 tests” stopped feeling comforting.
Because the real problem wasn’t “insufficient assertions.”
It was: my tests weren’t testing reality.
Why unit tests didn’t catch it
My unit tests mocked the OpenTelemetry SDK.
So they verified:
- “I call NodeSDK with these options”
- “I register this instrumentation”
- “I construct this processor”
But they didn’t verify the one thing an observability tool must do: do traces actually show up in the backend?
The checklist I’m keeping now
Practical, not preachy. If you ship devtools (especially observability), keep one test path that’s real.
Testing (reality checks)
- Integration smoke test: run a real Collector + Jaeger and assert at least one span shows up
- Don’t mock away the pipeline: at least one test should export for real
- When debugging, start at the source (your app), not the destination (Jaeger)
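The smoke test's final assertion can stay tiny: query Jaeger's HTTP API and check that at least one span came back. A sketch of that check — the response interfaces are a simplified slice of Jaeger's /api/traces?service=<name> payload, and the canned object stands in for a live Jaeger here:

```typescript
// Simplified slice of Jaeger's /api/traces?service=<name> response shape.
interface JaegerTrace {
  spans: unknown[];
}
interface JaegerResponse {
  data: JaegerTrace[];
}

// The one assertion an observability tool must pass:
function hasAtLeastOneSpan(resp: JaegerResponse): boolean {
  return resp.data.some((trace) => trace.spans.length > 0);
}

// In the real smoke test this JSON comes from something like
//   fetch("http://localhost:16686/api/traces?service=my-demo-app")
// after running the instrumented app; a canned response stands in here.
const canned: JaegerResponse = {
  data: [{ spans: [{ operationName: "GET /" }] }],
};
console.log(hasAtLeastOneSpan(canned));       // true — pipeline is alive
console.log(hasAtLeastOneSpan({ data: [] })); // false — this is the bug this post is about
```

One real export per CI run would have caught spanProcessors: [] on day one.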
Design (don’t disable defaults accidentally)
- Avoid “empty override” configs ([]) unless you truly mean “disable defaults”
- Treat streaming / special modes as separate first-class paths (parity tests)
UX (make failure loud)
- Make failures visible by default (don’t rely on invisible diagnostics)
- Run the “11pm developer” test: typos, missing Docker, empty dashboards — does the tool explain itself?
Results (numbers, no hype)
Before:
- v2.2.0
- 143 tests
- traces: broken since day 1
- npx CLI: silent/dead
After:
- v2.4.4
- 252 tests
- code audit: 44 issues found, 44 fixed
- manual testing: 5 critical bugs found, all fixed
- traces: working
- npx CLI: working
- npm publishes: 6 (in one day)


