DEV Community

Discussion on: Levenshtein Distance (Part 2: Gotta Go Fast)

Collapse
 
turnerj profile image
James Turner • Edited

Thanks! I knew about the series listing functionality but totally spaced on it when publishing the articles. I've set that up now.

Yeah, the 1D array form of a matrix (what I've seen called a "dense matrix") is pretty interesting. When I started out trying to get performance improvements, I thought there might actually be one there.

I did some benchmarks in C# on it previously with mixed results - small strings it performed better, medium-sized strings it performed worse and long strings it performed better again (by nearly 25%). There is probably only two problems with a "dense matrix": it would need to allocate all the data in a row uninterrupted (a problem if memory is fragmented - there might be enough total memory available but not enough contiguous memory available) and on VERY long strings (40-ish thousand characters) you might overflow the index value (in my case, a 32-bit signed Int) though that would only be hit in a full matrix - a 1D array of 2N length would be fine 🙂.