DEV Community

Discussion on: [JS] Calculate phonetic similarity of two strings, any ideas?

Dian Fay

There's a whole family of phonetic encoding algorithms beginning with Soundex. If you look for that (or successors like Metaphone) on npmjs.org you'll find implementations. To calculate the similarity you'll want to figure out the Levenshtein distance between two encoded words.
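
For a sense of what the encoding step does, here is a minimal sketch of classic American Soundex in plain JavaScript. The names `soundex` and `CODES` are made up for this illustration and do not come from any particular npm package; a published implementation will likely cover more edge cases (non-ASCII letters, name prefixes, and so on).

```javascript
// Digit assigned to each consonant; vowels, y, h and w have no digit.
const CODES = {
  b: 1, f: 1, p: 1, v: 1,
  c: 2, g: 2, j: 2, k: 2, q: 2, s: 2, x: 2, z: 2,
  d: 3, t: 3,
  l: 4,
  m: 5, n: 5,
  r: 6,
};

// Hypothetical helper: encodes a word as one letter plus three digits.
function soundex(word) {
  const letters = word.toLowerCase().replace(/[^a-z]/g, "");
  if (letters.length === 0) return "";

  let result = letters[0].toUpperCase();
  let prev = CODES[letters[0]] || 0; // digit of the previous coded letter

  for (let i = 1; i < letters.length && result.length < 4; i++) {
    const ch = letters[i];
    // h and w are skipped and do not separate letters with the same digit.
    if (ch === "h" || ch === "w") continue;
    const code = CODES[ch] || 0; // 0 for vowels and y
    if (code !== 0 && code !== prev) result += code;
    prev = code; // a vowel resets prev, so a repeated digit after it counts
  }
  return result.padEnd(4, "0");
}

console.log(soundex("Robert"), soundex("Rupert")); // R163 R163
console.log(soundex("Ashcraft"));                  // A261
```

Words that sound alike in English tend to end up with the same or nearly the same code, which is what makes the codes worth comparing.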

Jochem Stoel • Edited

Hmm. I don't think I need the Levenshtein distance. Yes, I am comparing two strings but based on their phonetics and not the string similarity.

In information theory, linguistics and computer science, the Levenshtein (Cyrillic: Левенштейн) distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. It is named after Vladimir Levenshtein, who considered this distance in 1965.[1]
Levenshtein distance may also be referred to as edit distance, although that term may also denote a larger family of distance metrics.[2]:32 It is closely related to pairwise string alignments.
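
To make that definition concrete, here is a rough sketch of the usual dynamic-programming way to compute it in JavaScript. The function name `levenshtein` is only an illustrative choice, not the API of any specific library, and the sketch compares UTF-16 code units, so it assumes plain ASCII input.

```javascript
// Hypothetical helper: minimum number of single-character insertions,
// deletions or substitutions needed to turn string a into string b.
function levenshtein(a, b) {
  // prev[j] = distance between the first i-1 chars of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);

  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // turning i chars of a into an empty string takes i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // delete a[i-1]
        curr[j - 1] + 1,    // insert b[j-1]
        prev[j - 1] + cost  // substitute, or keep when the chars match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(levenshtein("kitten", "sitting")); // 3
console.log(levenshtein("R163", "R150"));      // 2
```

The second call shows the same function applied to two four-character phonetic codes instead of raw spellings; a smaller result there suggests the original words sound more alike.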

Isaac Lyman

That's what Soundex is for. It encodes words phonetically, so if two words have similar Soundex codes, they also sound similar in English. Then the Levenshtein distance between those codes becomes useful for exactly what you're trying to do.