Soundex
From Wikipedia, the free encyclopedia
Soundex is a phonetic algorithm for indexing names by their sound when pronounced in English. The basic aim is for names with the same pronunciation to be encoded to the same string so that matching can occur despite minor differences in spelling. Soundex is the most widely known of all phonetic algorithms and is often used (incorrectly) as a synonym for "phonetic algorithm".
The Soundex code for a name consists of a letter followed by three numbers: the letter is the first letter of the name, and the numbers encode the remaining consonants. Similar sounding consonants share the same number so, for example, the labial consonants B, F, P, and V (see http://en.wikipedia.org/wiki/Labial) are each encoded as 1.
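As a rough illustration of the encoding described above, here is a minimal Python sketch. It is not the class this submission ships. The consonant table, the treatment of H and W, and the zero padding are assumptions taken from the commonly documented American Soundex variant, and the names `_CODES` and `soundex` are made up for this example.

```python
# Assumed consonant groups (common American Soundex table).
_CODES = {}
for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                     ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in group:
        _CODES[letter] = digit


def soundex(name: str) -> str:
    """Return a four-character code, or "" if the name has no A-Z letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]                      # the first letter is kept as-is
    prev = _CODES.get(letters[0], "")      # digit of the last letter seen
    for ch in letters[1:]:
        if ch in "HW":
            continue                       # assumed: H and W do not separate equal digits
        digit = _CODES.get(ch, "")         # vowels and Y have no digit
        if digit and digit != prev:
            code += digit                  # adjacent equal digits are collapsed
        prev = digit                       # a vowel resets prev
        if len(code) == 4:
            break
    return code.ljust(4, "0")              # pad short codes with zeros


print(soundex("Robert"), soundex("Rupert"))  # R163 R163
```

Names that sound alike, such as "Robert" and "Rupert", end up with the same code, which is what makes the code usable as an index key.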
Levenshtein distance
From Wikipedia, the free encyclopedia
In information theory and computer science, the Levenshtein distance or edit distance between two strings is given by the minimum number of operations needed to transform one string into the other, where an operation is an insertion, deletion, or substitution of a single character. It is named after Vladimir Levenshtein, who considered this distance in 1965. It is useful in applications that need to determine how similar two strings are, such as spell checkers.
For example, the Levenshtein distance between "kitten" and "sitting" is 3, since these three edits change one into the other, and there is no way to do it with fewer than three edits:
1. kitten -> sitten (substitution of 's' for 'k')
2. sitten -> sittin (substitution of 'i' for 'e')
3. sittin -> sitting (insert 'g' at the end)
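The distance can be computed with the usual dynamic-programming approach. The sketch below is one possible Python version, not the class this submission ships. The function name `levenshtein` and the choice to keep only two rows of the table are assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a                        # keep the shorter string as the row
    previous = list(range(len(b) + 1))     # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]                      # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,           # delete ca
                current[j - 1] + 1,        # insert cb
                previous[j - 1] + cost,    # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Only the previous row is needed to fill in the current one, so memory use grows with the length of the shorter string rather than with the product of both lengths.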
Both algorithms are packaged in a class, so using them is easy.
Where are these algorithms used? Typically in spell checkers and similar software, where words have to be reduced to values that can be compared in proximity calculations.