lev kitten sitting mittenprints each candidate's Levenshtein edit distance and similarity, ranked — a tiny fuzzy finder, written in Go. Three implementation hinges: (1) fold the DP table to two rows for O(min(m,n)) memory, (2) operate on runes, not bytes, so "café"↔"cafe" is 1 and CJK/emoji are counted correctly, and (3) make the goroutine fan-out race-free without a mutex by partitioning the result slice by index. This is SEN's first Go entry (the previous one was Rust — two non-JS in a row).
📦 GitHub: https://github.com/sen-ltd/levenshtein-cli
(It's a CLI — no live demo. Run with go run or Docker.)
Edit distance
Levenshtein distance is "the minimum number of single-character insertions, deletions, or substitutions to turn one string into the other." kitten → sitting is 3 (substitute k→s, e→i, append g). It's the basis of spell correction and fuzzy search.
Hinge 1: two-row DP → O(min(m,n)) memory
Textbook Levenshtein fills an (m+1)×(n+1) matrix. But each cell depends only on the one above, the one to the left, and the diagonal — so you keep two rows and roll them. Make the inner row the shorter string and memory is O(min(m,n)), not O(m·n):
func Distance(a, b string) int {
ra, rb := []rune(a), []rune(b)
if len(ra) < len(rb) { ra, rb = rb, ra } // shorter string drives row width
if len(rb) == 0 { return len(ra) }
prev := make([]int, len(rb)+1)
curr := make([]int, len(rb)+1)
for j := range prev { prev[j] = j }
for i := 1; i <= len(ra); i++ {
curr[0] = i
for j := 1; j <= len(rb); j++ {
cost := 1
if ra[i-1] == rb[j-1] { cost = 0 }
curr[j] = min3(
prev[j]+1, // deletion
curr[j-1]+1, // insertion
prev[j-1]+cost, // substitution / match
)
}
prev, curr = curr, prev // O(1) swap, buffer reused
}
return prev[len(rb)]
}
The prev, curr = curr, prev swap reuses the allocated buffers as it advances. Even for long strings, memory scales only with the shorter one.
Hinge 2: runes, not bytes
This one quietly matters. é is two bytes in UTF-8. A byte-based distance says café → cafe is 2 (counting the two bytes of é separately); the correct answer is 1. Converting to []rune fixes it and makes CJK and emoji work:
ra, rb := []rune(a), []rune(b) // Unicode code points, not bytes
func TestUnicodeAware(t *testing.T) {
if d := Distance("café", "cafe"); d != 1 { ... } // 1, not 2
if d := Distance("こんにちは", "こんばんは"); d != 2 { ... } // 2 substitutions
if d := Distance("a🎉b", "ab"); d != 1 { ... } // emoji = 1 rune
}
Range a Go string as []byte and you see UTF-8 bytes; convert to []rune and you see characters. Edit distance is about character operations a human sees, so runes are correct.
Hinge 3: mutex-free concurrency by index partitioning
With many candidates, fan the distance computations out across goroutines — the idiomatic Go bit. Multiple goroutines writing one slice sounds like a data race, but if each worker writes only to results[i] for the index it pulled off the channel, no mutex is needed: distinct indices are distinct memory.
func Rank(target string, candidates []string, workers int) []Scored {
results := make([]Scored, len(candidates))
jobs := make(chan int)
var wg sync.WaitGroup
for w := 0; w < workers; w++ {
wg.Add(1)
go func() {
defer wg.Done()
for i := range jobs {
results[i] = Scored{ // writes only results[i]
Candidate: candidates[i],
Distance: Distance(target, candidates[i]),
Similarity: Similarity(target, candidates[i]),
}
}
}()
}
for i := range candidates { jobs <- i }
close(jobs)
wg.Wait()
sort.SliceStable(results, func(i, j int) bool { /* distance asc, ties A-Z */ })
return results
}
Pin correctness with "worker-count invariant" + -race
The most effective test for concurrent code asserts the result is identical for any worker count:
func TestRankIsWorkerCountInvariant(t *testing.T) {
base := Rank("apple", cands, 1)
for _, w := range []int{2, 3, 8, 100, runtime.NumCPU()} {
got := Rank("apple", cands, w)
for i := range base {
if got[i] != base[i] { t.Errorf("workers=%d differs at %d", w, i) }
}
}
}
One worker vs. a hundred — if a single field differs, it fails. Run it under go test -race and the race detector catches memory-level races on top of logical mismatches. Both green means the concurrency is correct.
Note:
-racerequires cgo, so when building with the Alpinegolangimage,apk add build-basefor gcc and setCGO_ENABLED=1.
CLI: args or stdin
target := args[0]
candidates := args[1:]
if len(candidates) == 0 {
sc := bufio.NewScanner(os.Stdin) // no inline candidates → read stdin
for sc.Scan() { /* append non-empty lines */ }
}
ranked := levenshtein.Rank(target, candidates, *workers)
Two usage modes:
lev kitten sitting mitten # candidates as args
cat words.txt | lev --top 5 needle # rank every line, best 5
--top N caps results, --json for machine output, --workers N for parallelism.
Architecture
levenshtein/levenshtein.go — Distance / Similarity / Rank (stdlib only)
main.go — CLI (args or stdin), table / JSON output
Zero external dependencies. Built with golang:1.23-alpine, runs on alpine, non-root.
Run it
go test ./... -race
go run . kitten sitting # 1
docker build -t lev . && docker run --rm lev kitten sitting mitten
Takeaways
- Levenshtein DP folds to two rows. Make the inner row the shorter string for O(min(m,n)) memory;
prev, curr = curr, prevreuses buffers. - Compute distance over runes, not bytes, or
café↔cafeis 2. CJK and emoji need runes too. - Goroutine fan-out is race-free without a mutex if you partition the output slice by index — distinct index, distinct memory.
- Guard concurrent code with a worker-count-invariant test and
go test -race. -
-raceneeds cgo —apk add build-baseon Alpine.
This is OSS portfolio #267 from SEN LLC (Tokyo) — and our first Go entry. https://sen.ltd/portfolio/

Top comments (0)