You take a heap profile of a metrics-heavy Go service and the largest object on the heap is a map[string]*Series with millions of entries. Each key is something like http_requests_total{method="GET",route="/users/:id",status="200"}. The map is hundreds of megabytes, most of it duplicate keys. The runtime has been holding the same "GET" string in memory a few hundred thousand times because every label-set assembly path went through fmt.Sprintf or strings.Join and produced a fresh allocation.
This is the workload unique was added for in Go 1.23. The package solves one problem: give equal values one shared backing copy so the heap stops carrying duplicates. On the right workload it cuts memory by ~10x. On the wrong workload you pay for a hash and a map lookup that buys nothing.
What unique actually does
The package landed in Go 1.23 as unique. The full surface today:
package unique
type Handle[T comparable] struct { /* opaque */ }
func Make[T comparable](value T) Handle[T]
func (h Handle[T]) Value() T
Make takes any comparable value and returns a Handle[T]. Two calls to Make with equal values return handles that point to the same backing memory. The handle is the size of one word. The backing value lives in an internal concurrent structure with weak references, keyed by value equality. When no live handle references a value, the runtime can drop it during GC.
For a Handle[string], equal strings collapse into a single canonical copy and the GC handles cleanup. No global map, mutex, or sync.Pool dance to maintain.
package main
import (
"fmt"
"unique"
)
func main() {
a := unique.Make("application/json")
b := unique.Make("application/json")
// Both handles point to the same backing string.
fmt.Println(a == b) // true
fmt.Println(a.Value() == b.Value()) // true
}
a == b works because Handle[T] is comparable for any comparable T. That single fact is the reason this package matters: you can use a Handle[string] as a map key, store it in a struct field, or pass it from a worker goroutine to a coordinator. Equality is a pointer compare, not a byte-by-byte string compare.
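One consequence worth spelling out: a hot path can compare an incoming handle against a pre-interned canonical handle instead of against a string. A minimal sketch, where methodGet and isGet are illustrative names rather than anything the package provides:
var methodGet = unique.Make("GET")

func isGet(method unique.Handle[string]) bool {
	// Word-sized compare against the canonical handle; no byte comparison.
	return method == methodGet
}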
The memory math on a real-shaped workload
Take a metrics ingestion path. One million spans arrive per minute. Each span carries service.name, http.method, http.route, and http.status_code as four string tags. In production the cardinality of those tags is small:
- service.name: ~50 unique values across the org
- http.method: 7 values (GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS)
- http.route: a few thousand templated routes
- http.status_code: ~10 values that ever appear in practice
Without interning, every span allocates four fresh strings during decode. One million spans × four strings × 16 bytes of string header alone = 64MB just for the headers, on top of the underlying byte arrays the decoder allocated. If half the spans are GETs, that's another 500,000 copies of "GET" (3 bytes each), roughly 1.5MB of byte data the GC has to chase, then drop, then re-allocate next minute.
With unique.Make applied at the decode boundary:
type Span struct {
ServiceName unique.Handle[string]
Method unique.Handle[string]
Route unique.Handle[string]
Status unique.Handle[string]
// ... non-tag fields
}
func decode(raw rawSpan) Span {
return Span{
ServiceName: unique.Make(raw.ServiceName),
Method: unique.Make(raw.Method),
Route: unique.Make(raw.Route),
Status: unique.Make(raw.Status),
}
}
Each Span now holds four word-sized handles. The backing strings are stored once per unique value, not once per span. With ~50 services, 7 methods, ~3,000 routes, and ~10 statuses, the total backing storage for tags across all in-flight spans is about 3,067 strings rather than 4 million. The heap difference on tag-heavy paths is order-of-magnitude. Your numbers will depend on cardinality and string length.
What matters is the ratio between Make calls and distinct values produced. If that ratio is 1, you are paying overhead for no benefit. If it is 10,000, you are getting the order-of-magnitude win.
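If you are not sure where a given path lands, it is cheap to measure before committing. A rough sketch, with sampleRatio as a hypothetical helper: feed it a sample of the strings a decode path produces and look at calls per distinct value.
func sampleRatio(values []string) float64 {
	distinct := make(map[string]struct{}, len(values))
	for _, v := range values {
		distinct[v] = struct{}{}
	}
	if len(distinct) == 0 {
		return 0
	}
	// Make calls per distinct value: near 1 means interning buys nothing,
	// in the thousands means the order-of-magnitude win described above.
	return float64(len(values)) / float64(len(distinct))
}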
What this replaces
Before 1.23, the same problem was solved three ways. Each one is now a candidate for deletion.
Hand-rolled intern map
The classic shape:
type interner struct {
mu sync.Mutex
m map[string]string
}
func (i *interner) Intern(s string) string {
i.mu.Lock()
defer i.mu.Unlock()
if v, ok := i.m[s]; ok {
return v
}
i.m[s] = s
return s
}
This works for low cardinality with stable keys. Three problems show up in production:
- No eviction. The map grows until process restart. A misbehaving client that emits request_id as a tag fills the map with garbage that lives forever.
- The mutex. Every intern call serializes. On a hot decode path with multiple goroutines, the lock becomes the bottleneck before the saving from interning shows up in pprof.
- It does not actually share underlying bytes if the input came from a []byte conversion. m[s] = s stores the same string header you were handed. Most decoders hand you substrings of a larger buffer, so the larger buffer stays alive until eviction.
unique.Make fixes all three. It uses a weak map so values can be dropped after GC. It uses a sharded internal structure so contention is bounded. And it copies the value into a stable backing store rather than retaining whatever buffer you happened to hand it.
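If callers still expect a plain string rather than a handle, a thin wrapper over unique.Make is enough to retire the hand-rolled interner. A sketch, with Intern as an illustrative name:
// Intern returns the canonical copy of s. Equal inputs share one backing
// string, and the input buffer is not retained, so interning a substring
// does not pin its parent buffer.
func Intern(s string) string {
	return unique.Make(s).Value()
}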
sync.Pool for short-lived buffers
A sync.Pool is sometimes used as a "string canonicalizer" — keep a pool of *strings.Builder, build the canonical form, return the builder. This was always a misuse. sync.Pool is for amortizing allocation cost on objects you reuse, not for deduplicating values. The pool gives you back any object, not the same object, so you cannot use it for canonicalization. Wherever you see this pattern in a codebase, the migration is to unique.Make, not a smarter pool.
string([]byte) deduplication tricks
Go's compiler does some constant-string deduplication at link time. People sometimes try to extend that at runtime by relying on string([]byte) conversions and praying the runtime gives them the same address. It does not. Every string([]byte) call allocates fresh unless the compiler can prove the conversion is transient, such as a map-key lookup (one of a handful of narrow optimizations it does perform). unique.Make is what you want.
When unique.Make is wrong
The package looks free in microbenchmarks. It is not.
Each Make call hashes the value, looks it up in the internal concurrent map, and on a miss copies the value in. A miss is more expensive than a regular allocation. Hits are roughly the cost of a map lookup plus a hash.
The break-even is governed by three factors.
Cardinality ratio. If the number of Make calls divided by the number of distinct values is below ~10, the overhead exceeds the savings. A request_id, a unix-nanosecond timestamp, or a UUID is unique by construction; wrapping them in unique.Handle is pure cost.
String length. The savings scale with the size of the duplicate value. Interning "GET" saves a few bytes per call. Interning a 4KB serialized config blob saves 4KB per call. If your duplicates are tiny and few, the bookkeeping (one handle per value site + the map entry) can outweigh the data savings.
Lifetime. If the values being interned are ephemeral — built, used inside one function, discarded before GC runs — the heap never sees the duplication anyway. Interning hands the runtime a value to track that it would otherwise have collected on the stack or in the next sweep. Reach for unique.Make when the values are about to live in long-lived structs (a cache, a queue, a coordinator's state), not when they live for a few microseconds inside one handler.
A simple rule that catches most cases: ask whether the value is going to sit on the heap for more than a few seconds, and whether you expect to encounter the same value again before it is freed. If both are true, reach for unique. Otherwise skip it.
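When in doubt, benchmark the two extremes on your own value distribution rather than guessing. A rough sketch, with illustrative names: one benchmark where every call hits the same value, one where every value is unique by construction.
package intern_test

import (
	"strconv"
	"testing"
	"unique"
)

// Every call after the first is a hit on the canonical "GET".
func BenchmarkMakeRepeated(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = unique.Make("GET")
	}
}

// Every call is a miss: hash, lookup, and copy for a value never seen again.
func BenchmarkMakeUnique(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = unique.Make("req-" + strconv.Itoa(i))
	}
}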
Putting handles in maps
The most useful application is Handle[string] as a map key.
// Before: hashing and equality on the full string per lookup.
counts := map[string]int{}
for _, span := range spans {
key := span.Method + "|" + span.Route // allocates per span
counts[key]++
}
Equality on a string map key compares bytes. For long keys that's a measurable cost. For pointer-equal handles, equality is a word compare:
type tagKey struct {
method unique.Handle[string]
route unique.Handle[string]
}
counts := map[tagKey]int{}
for _, span := range spans {
key := tagKey{
method: unique.Make(span.Method),
route: unique.Make(span.Route),
}
counts[key]++
}
The composite key is two words. Hashing it is fast. Equality is an integer compare per field. The string concat in the first version is gone. On hot aggregation paths this is the version you want. You save RAM and CPU on every map operation.
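When the aggregate leaves the hot path, for logging or export, unwrap at that boundary with Value():
for key, n := range counts {
	// Value() hands back the canonical string for display or export.
	fmt.Printf("%s %s: %d\n", key.method.Value(), key.route.Value(), n)
}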
Two traps to watch for
Two patterns show up in code review.
Wrapping at the wrong layer. unique.Make should run at the boundary where duplicates first appear — typically the decoder, or the consumer that pulls from a queue. Wrapping at every internal call site multiplies the lookup cost without adding savings, because each call after the first is already operating on the canonical form.
Forgetting Value() in serialization. A Handle[string] is not a string. It does not satisfy json.Marshaler, encoding.TextMarshaler, or fmt.Stringer by default. If you put a handle in a struct and try to JSON-encode it, you get {} because every field on Handle is unexported. Either define a MarshalJSON on the wrapping struct that calls Value(), or unwrap at the serialization boundary. This bites people the first time they put Handle[string] in a public DTO.
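A sketch of the unwrap, defined on the Span struct from earlier (field set trimmed for brevity, encoding/json assumed imported):
func (s Span) MarshalJSON() ([]byte, error) {
	// Unwrap handles into plain strings at the serialization boundary.
	return json.Marshal(struct {
		Method string `json:"method"`
		Route  string `json:"route"`
	}{
		Method: s.Method.Value(),
		Route:  s.Route.Value(),
	})
}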
When to reach for it
- High-cardinality decoder output where the same values repeat thousands of times: tags, headers, enum strings, service names.
- Long-lived in-memory structures whose keys come from a small alphabet: routing tables, label sets, configuration sets.
- Map keys built from concatenated strings, where you want pointer-equal hashing.
The mirror image, when to leave it alone:
- Per-request short-lived strings.
- Unique-by-construction values (UUIDs, timestamps, request IDs).
- Tiny duplicates with low repetition.
- Code that has to round-trip through JSON or another wire format without an unwrap layer.
Two functions. Find one decoder in your service where the same string lands a thousand times. That is where it pays.
If this was useful
If you write Go services where the heap shape and the GC budget actually matter — metrics pipelines, caches, queue consumers — the part of Thinking in Go that walks through allocation patterns, escape analysis, and stdlib choices like unique is what I'd point you at. The companion book on hexagonal architecture is for when those services start to ossify around their decoders and you want to keep the boundary clean.


