A t-digest as a Go CLI. It estimates quantiles (p50, p99, p99.9…) of a stream in a few KB instead of sorting the whole thing. The hinges: (1) a scale function
k(q)controls centroid size so the median is coarse and the tails stay sharp — p99.9 ends up more precise than the median, by design; (2)q=0/q=1return the exact min/max while the middle interpolates between centroid means. Plus digests merge. Part four of a probabilistic-data-structure series after #274–276 (membership / cardinality / frequency) — this one is quantiles.
📦 GitHub: https://github.com/sen-ltd/t-digest-cli
What a t-digest is
Exact quantiles need every value in memory, sorted — O(N) space. A t-digest
keeps instead a few hundred centroids, each a (mean, weight) pair
summarizing a cluster of nearby values, and answers any quantile by walking
them. For 500k points that's ~100 centroids and a few KB, regardless of N.
The trap with naive bucketing is the tails: lump values into equal-sized buckets
and your p99.9 is as coarse as your median. A t-digest fixes exactly that — it's
the whole reason the structure exists.
Here's the CLI, real output:
$ td summary --from latencies-ms.txt --exact -c 200
count 500000
min / max 0.3 / 1655.8
centroids 122 (δ=200, 1952 bytes)
p50 20.1183 exact 20.1 (+0.09%)
p90 63.736 exact 63.7 (+0.06%)
p99 162.803 exact 161.9 (+0.56%)
p99.9 322.033 exact 312.6 (+3.02%)
Half a million latency samples summarized in under 2 KB, p99 off by half a
percent.
Hinge 1: a scale function that keeps the tails sharp
Centroids aren't allowed to grow uniformly. A scale function k(q) maps the
quantile q into a "k-space" where each centroid may span at most 1 unit.
With the standard k1 scale,
k(q) = δ/(2π) · asin(2q − 1)
k⁻¹(k) = (sin(2π k/δ) + 1) / 2
k is steep near q=0 and q=1, flat near q=0.5. So a centroid covering one
k-unit spans a wide band of quantiles around the median (where precision is
cheap) and a razor-thin band at the tails (where p99.9 lives). The compression
δ is the single knob: more δ → more centroids → tighter quantiles.
q := q0 + (cur.Weight+x.Weight)/total
if q <= qLimit { // still within 1 k-unit → absorb
cur = mergeInto(cur, x)
} else { // would overflow → seal cur, start a new centroid
out = append(out, cur)
q0 += cur.Weight / total
qLimit = kInv(k(q0) + 1)
cur = x
}
A test on a uniform stream pins the payoff: p50 is accurate to ~1% while p999 is
held to ~0.2% — tighter at the tail (TestUniformQuantiles).
Hinge 2: exact min/max, interpolated middle
Quantiles are read by walking centroids, accumulating half-weights, and linearly
interpolating between adjacent centroid means. The two ends are special: q=0
and q=1 return the exact min/max (tracked separately), and the tails
interpolate between those extremes and the outermost centroids:
func (t *TDigest) Quantile(q float64) float64 {
if q <= 0 { return t.min }
if q >= 1 { return t.max }
index := q * t.total
// ... walk centroids, interpolate between means ...
}
TestExactExtremes checks the ends are exact; TestMonotonic checks the whole
curve is non-decreasing.
The signature property: merge
t-digests combine: dump one digest's centroids into another and recompress. The
merged digest answers quantiles as if it had seen both streams — so per-shard or
per-minute digests roll up into a global one with no re-scan:
func (t *TDigest) Merge(other *TDigest) { /* absorb centroids, recompress */ }
$ td merge -o all.tdigest a.tdigest b.tdigest
merged 2 digests → all.tdigest count=500000 centroids=112
TestMergeApproximatesCombined pins that a merge of two disjoint-range digests
matches the digest of the combined stream across p10…p99 — the same composability
as HyperLogLog's merge-by-max (#275) and Count-Min's merge-by-sum (#276).
Persistence
A digest serializes to a compact blob — "TDG1" | δ | min | max | total | n | — a handful of bytes per centroid. Hand-rolled, no codec.
(mean, weight)*n
tdigest/tdigest.go — digest: k1 scale, compress, quantile, merge, (de)serialize (11 tests)
main.go — CLI: summary / add / quantile / merge / info, file or stdin
Takeaway
The t-digest is the go-to for "measure p99 latency without keeping every sample."
The keys are a scale function that keeps the tails sharp, exact min/max,
and mergeability. With it the probabilistic series now spans membership
(#274), cardinality (#275), frequency (#276), and quantiles (#277) — all the same
instinct of approximating in fixed memory, here expressed as pushing precision
out to the tail where it matters.
📦 GitHub: https://github.com/sen-ltd/t-digest-cli

Top comments (0)