Discussion on: [JS] Calculate phonetic similarity of two strings, any ideas?


There's a whole family of phonetic encoding algorithms beginning with Soundex. If you look for that (or successors like Metaphone) on npmjs.org you'll find implementations. To calculate the similarity you'll want to figure out the Levenshtein distance between two encoded words.

Jochem Stoel • Feb 20 '18 • Edited

Hmm. I don't think I need the Levenshtein distance. Yes, I am comparing two strings, but by how they sound, not by how similar their spelling is.

In information theory, linguistics and computer science, the Levenshtein (Cyrillic: Левенштейн) distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. It is named after Vladimir Levenshtein, who considered this distance in 1965.[1]
Levenshtein distance may also be referred to as edit distance, although that term may also denote a larger family of distance metrics.[2]:32 It is closely related to pairwise string alignments.

Isaac Lyman • Feb 20 '18

That's what Soundex is for. It phoneticizes the words so that if they are spelled similarly in Soundex, then they also sound similar in English. Then the Levenshtein distance algorithm becomes useful for exactly what you're trying to do.
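
To make that concrete, here is a rough sketch of the idea in plain JavaScript: encode both words with Soundex, then take the Levenshtein distance between the two codes. The function names (`soundex`, `levenshtein`, `phoneticDistance`) are made up for this example and are not taken from any npm package; the Soundex part follows the common American Soundex rules, so treat it as an illustration rather than a reference implementation.

```javascript
// Soundex digit for each consonant; vowels, H, W and Y have no code.
const SOUNDEX_CODES = {
  B: "1", F: "1", P: "1", V: "1",
  C: "2", G: "2", J: "2", K: "2", Q: "2", S: "2", X: "2", Z: "2",
  D: "3", T: "3",
  L: "4",
  M: "5", N: "5",
  R: "6",
};

// Hypothetical helper: encode a word as a 4-character Soundex code.
function soundex(word) {
  const letters = word.toUpperCase().replace(/[^A-Z]/g, "");
  if (letters.length === 0) return "";
  let result = letters[0];
  let prev = SOUNDEX_CODES[letters[0]] || "";
  for (let i = 1; i < letters.length && result.length < 4; i++) {
    const ch = letters[i];
    // H and W are skipped without separating two letters that share a code.
    if (ch === "H" || ch === "W") continue;
    const code = SOUNDEX_CODES[ch] || "";
    // Adjacent letters with the same code are only counted once.
    if (code && code !== prev) result += code;
    // A vowel resets prev, so the same code may repeat after it.
    prev = code;
  }
  return result.padEnd(4, "0");
}

// Hypothetical helper: classic row-by-row Levenshtein distance.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Distance between the phonetic codes: 0 means "sounds alike" under Soundex.
function phoneticDistance(a, b) {
  return levenshtein(soundex(a), soundex(b));
}

console.log(soundex("Robert"), soundex("Rupert"));
console.log(phoneticDistance("Robert", "Rupert"));
console.log(phoneticDistance("Robert", "Rubin"));
```

Since a Soundex code is always four characters, the distance here can only range from 0 to 4, which is fairly coarse. If that turns out to be a problem, swapping in a successor like Metaphone for the encoding step should give longer codes and finer-grained distances, with the Levenshtein part left unchanged.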