Re: How to compare the similarity of two words in VB?
For example "weather" and "wther" are 70% similar
"Unable to select an address from the list" and "Unable to select an address " have 23 out of 34 characters in common, so they are 67.6% similar
I don't know if this is what you want, but it finds multiple similarities if possible. For example, if the second string was changed to "Unable select an address list", it would match all of the characters to the first string, in 3 parts ("Unable" + " select an address " + "list").
..if you don't want that, just something like the For loops in _sub should be enough!
Last edited by si_the_geek; Nov 24th, 2006 at 06:24 PM.
Re: How to compare the similarity of two words in VB?
Hi there, long shot replying 12 years later but is this code based on a mathematical/statistical method or just your own intuition? Loving it regardless!
Re: How to compare the similarity of two words in VB?
It's VBA code with two versions. Put either one in a code module and see what happens. You can also search VBForums for 'Soundex' and probably find an answer. BTW, what exactly happened when you tried the solutions from Digirev and twisted thoughts?
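For anyone who has not met Soundex: it is a phonetic code, so it answers "do these two words sound alike?" with yes or no rather than with a percentage. Below is a minimal sketch of the classic American Soundex rules as I understand them. The function name is my own and this is not the code referred to above.

```vb
Option Explicit

'Returns the 4-character American Soundex code of sWord ("" if it contains no letters A-Z).
Public Function Soundex(ByVal sWord As String) As String
    'Codes for the letters A..Z; 0 means "not coded" (vowels, H, W, Y)
    Const CODES As String = "01230120022455012623010202"
    Dim i As Long, n As Long
    Dim sCode As String, sLast As String, sResult As String

    sWord = UCase$(sWord)
    For i = 1 To Len(sWord)
        n = Asc(Mid$(sWord, i, 1))
        If n >= 65 And n <= 90 Then                  'only A..Z take part
            sCode = Mid$(CODES, n - 64, 1)
            If Len(sResult) = 0 Then
                sResult = Chr$(n)                    'the first letter is kept as-is
            ElseIf sCode <> "0" And sCode <> sLast Then
                sResult = sResult & sCode            'new consonant group
            End If
            'H and W do not separate two equal codes, vowels do
            If n <> 72 And n <> 87 Then sLast = sCode
        End If
    Next i

    'Pad with zeros and cut to one letter plus three digits
    If Len(sResult) > 0 Then sResult = Left$(sResult & "000", 4)
    Soundex = sResult
End Function
```

With this sketch, Soundex("weather") and Soundex("wther") both come out as W360, so the two words would count as sounding the same.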
Re: How to compare the similarity of two words in VB?
How to compare the similarity of two words in VB?
For example "weather" and "wther" are 70% similar
"Unable to select an address from the list" and "Unable to select an address " have 23 out of 34 characters in common, so they are 67.6% similar
This got me interested from an algorithmic point of view. But it also makes me wonder: what use is there in knowing the similarity percentage between two strings?
Re: How to compare the similarity of two words in VB?
Originally Posted by Dry Bone
This got me interested from an algorithmic point of view. But it also makes me wonder: what use is there in knowing the similarity percentage between two strings?
Think google-search and/or Regex
Last edited by Zvoni; Tomorrow at 31:69 PM.
Re: How to compare the similarity of two words in VB?
Originally Posted by Dry Bone
This got me interested from an algorithmic point of view. But it also makes me wonder: what use is there in knowing the similarity percentage between two strings?
I've used this (similar library) to implement auto-correct on free-text search on a products catalog site.
When the user searched for a term that was not found in the terms table (i.e. the unique list of stemmed roots of all words in the product descriptions), a suggestion was made and the query was executed for the most similar search term instead (with an option to roll back to the original).
Re: How to compare the similarity of two words in VB?
Originally Posted by Dry Bone
what use there is to know the similarity percentage between two strings?
Our task is not to ask why, unless asking gives an idea for implementation (which it often does, and the question should be worded that way). If someone asks for something and we have an idea of a way to do it, we should make suggestions.
The original request is also very vague: phonetic similarity, characters, similes between words? We can assume character-based, in which case a custom algorithm that takes into account the length of the words and the number of characters that appear in both, in the same place or nearby, should do the job. It's something even amateur coders can do, and it is often given as a programming task in classes, though I highly doubt people are still being taught VB6 these days :-)
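To make the character-based idea a bit more concrete, here is a rough sketch of one way it could look. The function name and the lWindow parameter (how far away "nearby" is allowed to be) are made up for illustration, and the scoring is only one of many possible choices.

```vb
Option Explicit

'Percentage of the longer string's length that is covered by characters
'found in both strings at the same position or at most lWindow positions away.
'lWindow is a hypothetical tuning parameter.
Public Function NearbyMatchPercent(ByVal s1 As String, ByVal s2 As String, _
                                   Optional ByVal lWindow As Long = 2) As Double
    Dim i As Long, j As Long, lFrom As Long, lTo As Long
    Dim lMatches As Long, lMaxLen As Long
    Dim bUsed() As Boolean

    lMaxLen = IIf(Len(s1) > Len(s2), Len(s1), Len(s2))
    If lMaxLen = 0 Then NearbyMatchPercent = 100: Exit Function  'two empty strings
    If Len(s1) = 0 Or Len(s2) = 0 Then Exit Function             'returns 0

    ReDim bUsed(1 To Len(s2))
    For i = 1 To Len(s1)
        'Search window in s2 around position i, clipped to the string bounds
        lFrom = IIf(i - lWindow < 1, 1, i - lWindow)
        lTo = IIf(i + lWindow > Len(s2), Len(s2), i + lWindow)
        For j = lFrom To lTo
            If Not bUsed(j) Then
                If Mid$(s1, i, 1) = Mid$(s2, j, 1) Then
                    bUsed(j) = True          'each character of s2 is matched only once
                    lMatches = lMatches + 1
                    Exit For
                End If
            End If
        Next j
    Next i

    NearbyMatchPercent = lMatches / lMaxLen * 100
End Function
```

With the default window of 2, NearbyMatchPercent("weather", "wther") matches 5 of 7 characters, i.e. about 71.4, which is close to the "70%" from the original question.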
These algos do not work like "Google" (which is a FullText-search, based on AND/OR combined whole "stemmed words").
It also has nothing to do with "RegExp"...
Ratcliff is built into the RC5/6 SQLite-wrapper (as a UserDefined-Function):
Code:
Option Explicit

Private Sub Form_Load()
    Dim MemDB As cMemDB, Rs As cRecordset
    Set MemDB = New_c.MemDB
    MemDB.Exec "Create Table T(ID Integer Primary Key, Name Text)"

    'Demo-inserts
    Const insName = "Insert Into T(Name) Values(?)"
    MemDB.ExecCmd insName, "Beth Mercks"
    MemDB.ExecCmd insName, "Brett Perks"
    MemDB.ExecCmd insName, "Fred Merx"
    MemDB.ExecCmd insName, "Mad Max"
    MemDB.ExecCmd insName, "Matt Murks"
    MemDB.ExecCmd insName, "Mett Wurst"

    'Ratcliff-query (unsharp search)
    Const qryUnsharp = "Select Top 5 Ratcliff(Name, ?) RcPerc, * From T " & _
                       " Order By RcPerc Desc"
    Set Rs = MemDB.GetRs(qryUnsharp, "Brad Merks") '<- SearchParameter-Passing

    Do Until Rs.EOF
        If Rs.AbsolutePosition = 1 Then Debug.Print "RcPerc Name"
        Debug.Print Rs!RcPerc, Rs!Name
        Rs.MoveNext
    Loop
End Sub
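For readers without the wrapper, the Ratcliff/Obershelp idea can be sketched in plain VB6: find the longest common substring, do the same again for the pieces to the left and right of it, and score 2 * matched characters / combined length. The function names below are my own, the brute-force search is slow on long strings, and the wrapper's built-in Ratcliff function may well differ in detail.

```vb
Option Explicit

'Ratcliff/Obershelp-style similarity in percent:
'2 * number of matched characters / combined length of both strings.
Public Function RatcliffPercent(ByVal s1 As String, ByVal s2 As String) As Double
    If Len(s1) + Len(s2) = 0 Then
        RatcliffPercent = 100                'two empty strings are identical
    Else
        RatcliffPercent = 2 * MatchedChars(s1, s2) / (Len(s1) + Len(s2)) * 100
    End If
End Function

'Finds the longest common substring, then recurses into the parts to the
'left and to the right of it. Returns the total number of matched characters.
Private Function MatchedChars(ByVal s1 As String, ByVal s2 As String) As Long
    Dim i As Long, j As Long, k As Long
    Dim lBest As Long, lPos1 As Long, lPos2 As Long

    For i = 1 To Len(s1)
        For j = 1 To Len(s2)
            'Length of the common run starting at s1(i) and s2(j)
            k = 0
            Do While i + k <= Len(s1) And j + k <= Len(s2)
                If Mid$(s1, i + k, 1) <> Mid$(s2, j + k, 1) Then Exit Do
                k = k + 1
            Loop
            If k > lBest Then
                lBest = k
                lPos1 = i
                lPos2 = j
            End If
        Next j
    Next i
    If lBest = 0 Then Exit Function          'nothing in common, returns 0

    MatchedChars = lBest _
        + MatchedChars(Left$(s1, lPos1 - 1), Left$(s2, lPos2 - 1)) _
        + MatchedChars(Mid$(s1, lPos1 + lBest), Mid$(s2, lPos2 + lBest))
End Function
```

For "weather" and "wther" this finds "ther" first and then "w" to the left of it, so 5 characters match and RcPerc would be 2 * 5 / 12, about 83.3.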
Re: How to compare the similarity of two words in VB?
Wow! So much to learn!
I didn't know so many algorithms exist for this.
I only found out about Myers' algorithm and implemented a kind of variant of it, since I didn't understand the original one.
But I think mine is doing fine though :-)
Maybe I will take the time to know more algorithms...
Re: How to compare the similarity of two words in VB?
I think the Levenshtein Distance method is the "Industry Standard"
Code:
Public Function LevenDis(ByVal sTxt1 As String, ByVal sTxt2 As String, Optional ByVal blnAsPercent As Boolean = True) As Double
    Dim i As Long, j As Long
    Dim Len1 As Long, Len2 As Long 'each variable needs its own "As Long", otherwise Len1 is a Variant
    Dim min1 As Long, min2 As Long
    Dim MaxLen As Long
    Dim lDis() As Long

    Len1 = Len(sTxt1)
    Len2 = Len(sTxt2)
    ReDim lDis(Len1, Len2)

    'Distance to the empty string is simply the prefix length
    For i = 0 To Len1: lDis(i, 0) = i: Next i
    For j = 0 To Len2: lDis(0, j) = j: Next j

    For i = 1 To Len1
        For j = 1 To Len2
            If Mid$(sTxt1, i, 1) = Mid$(sTxt2, j, 1) Then
                lDis(i, j) = lDis(i - 1, j - 1) 'same character, no edit needed
            Else
                min1 = lDis(i - 1, j) + 1 'deletion
                min2 = lDis(i, j - 1) + 1 'insertion
                If min2 < min1 Then min1 = min2
                min2 = lDis(i - 1, j - 1) + 1 'substitution
                If min2 < min1 Then min1 = min2
                lDis(i, j) = min1
            End If
        Next j
    Next i

    LevenDis = lDis(Len1, Len2)
    If blnAsPercent Then
        MaxLen = IIf(Len1 > Len2, Len1, Len2)
        If MaxLen = 0 Then
            LevenDis = 100 'two empty strings are identical; also avoids a division by zero
        Else
            LevenDis = (1 - LevenDis / MaxLen) * 100
        End If
    End If
End Function