-
Jan 11th, 2006, 09:42 AM
#1
[RESOLVED] string comparison function
Does anyone happen to have a function available with the following functionality:
Input: 2 strings
Output: long
The function should compare the two strings, and return the number of characters that are the same.
example 1:
string 1: "123456789012345"
string 2: "023456789012345"
Return value: 14
This was an easy one, and I have no problem writing a function that does that. The problem is, that I want it to be able to recognize patterns that are the same like in the next example.
example 2:
string 1: "123456789012345"
string 2: "234567890123450"
return value: 14
I marked the characters that are identical in red:
123456789012345
234567890123450
The purpose of the function is comparing IMEI numbers (an id for mobile phones).
These imei numbers are always numerical, and have 15 characters.
I want to find out if the number entered by the customer (when he entered the order on the website), is identical to the number that was on the phone.
If the numbers are not identical, it could be that the customer made a typo (switched two numbers, or forgot a number, and typed a fake number at the end), or that the number belongs to a different phone.
I want to handle these situations differently, so if the numbers match partialy (say 12 characters match), I guess the customer made a typo.
P.S. I know IMEI numbers must pass a luhn check, but there are reasons I don't want to check this at order entry (at least not right now).
The
Last edited by Frans C; Jan 12th, 2006 at 04:26 AM.
Frans
-
Jan 11th, 2006, 09:52 AM
#2
Re: string comparison function
Not quite sure about it but you may try using Regular Expression.
-
Jan 11th, 2006, 10:06 AM
#3
Re: string comparison function
Something quick I did in five minutes:
VB Code:
Option Explicit
Public Function CompareLongestMatch(String1 As String, String2 As String) As Long
Dim lngA As Long, lngB As Long, lngC As Long, lngD As Long
Dim CountTotal As Long
If LenB(String1) = 0 Then Exit Function
If LenB(String2) = 0 Then Exit Function
For lngA = 1 To Len(String1)
lngB = InStr(String2, Mid$(String1, lngA, 1))
If lngB > 0 Then
If CountTotal < 1 Then CountTotal = 1
lngD = 1
For lngC = 1 To Len(String1) - lngA
If Mid$(String1, lngA + lngC, 1) = Mid$(String2, lngB + lngC, 1) Then
lngD = lngD + 1
Else
Exit For
End If
Next lngC
If lngD > CountTotal Then CountTotal = lngD
End If
Next lngA
CompareLongestMatch = CountTotal
End Function
Private Sub Form_Load()
MsgBox CompareLongestMatch("123456789012345", "234567890123450")
End Sub
It has plenty of room for optimization, but I'll leave that to you.
-
Jan 11th, 2006, 10:29 AM
#4
Re: string comparison function
Nicely done.
I am affraid it is not enough though.
I forgot to mention that the typo does not need to be at the start or the end.
eg
123456789012345
123456790123450
should return 14
-
Jan 11th, 2006, 11:15 AM
#5
Re: string comparison function
In that case it gets complicated enough not worth of my time to code it completely. But how to do it:
- make an index of the matches
- unlike in the code above, also search for the next matches of a character instead of calling InStr only once (this is a bug in my code above which should be fixed)
- find the longest matches which do not have a collision with eachother; ignore one character long matches
It shouldn't be too hard after you can get the image of how it works in your mind.
-
Jan 11th, 2006, 11:49 AM
#6
Re: string comparison function
If it was easy, I hadn't asked
I don't expect anyone to do my work for me, but if someone had a function like that laying around, it would be convenient.
I guess I have to put some effort in it.
Thanks for you help anyway.
-
Jan 11th, 2006, 11:51 AM
#7
Re: string comparison function
EDIT: Major bugs in this, but its something to work on, its meant to run though string1 until a match isnt found then recursive through again 1 char up the strings.
i think this works, might be bugs though, only set up for if string1 is the same length as string2.
VB Code:
Private j As Long
Private i As Long, z As Long
Dim string1 As String, string2 As String
Public Sub CompareLongestMatch(ByVal string1 As String, _
ByVal string2 As String)
If i > Len(string1) Then
MsgBox z
Exit Sub
End If
Do Until Mid$(string1, i, 1) <> Mid$(string2, j, 1)
If Mid$(string1, i, 1) = Mid$(string2, j, 1) Then
If j > Len(string1) Then
MsgBox z
Exit Sub
End If
' List1.AddItem z
z = z + 1
j = j + 1
End If
i = i + 1
DoEvents
'string1 = "123456789012345"
' string2 = "023456789012345"
Loop
i = i + 1
j = j + 1
DoEvents
CompareLongestMatch string1, string2
End Sub
Private Sub Command1_Click()
i = 1
j = 1
string1 = "123456789012345301235777899"
string2 = "023456789012345401236777799"
CompareLongestMatch string1, string2
End Sub
Last edited by Jmacp; Jan 11th, 2006 at 11:56 AM.
-
Jan 11th, 2006, 12:29 PM
#8
Re: string comparison function
Check out this link Frans.
-
Jan 11th, 2006, 01:49 PM
#9
Re: string comparison function
This should work for any set doesnt matter if strings are different lengths,
what it does is looks for a match of 2 chars or greater and builds sets.
VB Code:
Option Explicit
Private j As Long
Private i As Long
Private string1 As String
Private string2 As String
Dim z As Long
Private Sub Command1_Click()
findagain
End Sub
Public Sub CompareLongestMatch(ByVal string1 As String, _
ByVal string2 As String)
Dim strtmp As String
Do Until Mid$(string1, i, 1) <> Mid$(string2, j, 1)
strtmp = strtmp & Mid$(string1, i, 1)
j = j + 1
i = i + 1
List1.AddItem strtmp
DoEvents
Loop
If Len(strtmp) > 1 Then
List1.AddItem strtmp
z = z + Len(strtmp)
End If
strtmp = ""
End Sub
Private Sub findagain()
i = 1
j = 1
If Len(string2) = 0 Then
MsgBox z
Exit Sub
End If
For i = 1 To Len(string1)
If Mid$(string1, i, 1) = Mid$(string2, j, 1) Then
CompareLongestMatch string1, string2
string2 = Mid$(string2, j, Len(string2))
Exit For
End If
Next i
findagain
End Sub
Private Sub Form_Load()
string1 = "123456789012345" & vbNullChar
string2 = "123456790123450" & vbNullString
End Sub
-
Jan 11th, 2006, 04:44 PM
#10
Re: string comparison function
Give it a try
VB Code:
Private Function getMatchCount(pStr1 As String, pStr2 As String) As Long
Dim i As Long, Total As Long
Dim Tmp As Long
Dim TmpChar As String
Dim tmpStrTotal As String
Dim LongestString As Long
Total = 0
tmpStrTotal = vbNullString
LongestString = 0
For i = 1 To Len(pStr2)
TmpChar = Mid(pStr2, i, 1)
tmpStrTotal = tmpStrTotal & TmpChar
Tmp = InStr(1, pStr1, tmpStrTotal, 1)
If CBool(Tmp) Then
If Len(tmpStrTotal) > LongestString Then
Total = Total + 1
LongestString = LongestString + 1
End If
Else
If CBool(InStr(1, pStr1, TmpChar, 1)) Then
tmpStrTotal = TmpChar
Else
tmpStrTotal = vbNullString
End If
End If
Next
getMatchCount = Total
End Function
Last edited by jcis; Jan 11th, 2006 at 04:51 PM.
-
Jan 11th, 2006, 06:53 PM
#11
Re: string comparison function
Frans C: I got some idle time, so I coded it. I put some more thought on it and came up with something relatively simple: no need to keep an index.
Here is the code:
VB Code:
Option Explicit
Public Function GetMatchString(String1 As String, String2 As String) As String
Dim lngA As Long, lngB As Long, lngC As Long, lngD As Long
Dim LongestString As String, CurString As String, CompareString As String
If LenB(String1) = 0 Then Exit Function
If LenB(String2) = 0 Then Exit Function
For lngA = 1 To Len(String1)
CompareString = String2
lngB = 0
Do Until lngB > 0 Or lngA > Len(String1)
lngB = InStr(CompareString, Mid$(String1, lngA, 1))
If lngB < 1 Then lngA = lngA + 1
Loop
If lngA > Len(String1) Then Exit For
Do
lngC = 1
Do
If lngA + lngC > Len(String1) Then Exit Do
If lngB + lngC > Len(CompareString) Then Exit Do
If Mid$(String1, lngA + lngC, 1) = Mid$(String2, lngB + lngC, 1) Then
If lngC = 1 Then CurString = CurString & Mid$(String1, lngA, 1): Mid$(CompareString, lngB, 1) = vbNullChar
CurString = CurString & Mid$(String1, lngA + lngC, 1)
Mid$(CompareString, lngB + lngC, 1) = vbNullChar
lngC = lngC + 1
Else
If lngC = 1 Then
lngD = InStr(lngB + 1, CompareString, Mid$(String1, lngA, 1))
If lngD > 1 Then
lngB = lngD
Else
Exit Do
End If
Else
Exit Do
End If
End If
Loop
lngD = 0
Do Until lngD > 0 Or lngA + lngC > Len(String1)
lngD = InStr(CompareString, Mid$(String1, lngA + lngC, 1))
If lngD < 1 Then lngC = lngC + 1
Loop
If lngD > 0 Then
lngB = lngD
lngA = lngA + lngC
Else
Exit Do
End If
Loop
If LenB(CurString) > LenB(LongestString) Then LongestString = CurString
CurString = vbNullString
Next lngA
GetMatchString = LongestString
End Function
Private Sub Form_Load()
MsgBox GetMatchString("123456789012345", "123456790123450")
MsgBox GetMatchString("it works nicely!", " nicely, it works!")
End Sub
As you can see, it returns the matched characters. If you only want to get the number of characters, just use Len()
Edit: Polished it! It should now work well enough for any combination you need
Last edited by Merri; Jan 11th, 2006 at 07:04 PM.
-
Jan 11th, 2006, 07:56 PM
#12
Re: string comparison function
Let's GetTickCount all methods
-
Jan 11th, 2006, 08:06 PM
#13
Re: string comparison function
Your function doesn't work as well as mine does, try the samples My priority was making something that works, not something that works fast (and I'm all too tired to even think about that it being 3 AM already).
-
Jan 11th, 2006, 08:09 PM
#14
Re: string comparison function
Originally Posted by jcis
Let's GetTickCount all methods
sounds good to me, add mine as well..
-
Jan 11th, 2006, 08:28 PM
#15
Re: string comparison function
just for laughs..
my pc specs, pretty darn low for this comp, 900mhz cpu, 128 mb ram, 16 mb video card, have a killer rig but is out of order, needs a new hardrive, so anyway for this machine for this string i get a tickcount of 5128,
1234567890123451234567890123451234567890123451234567890123451234567890123451234567890123451234567890 1234512345678901234512345678901234512345678901234512345678901234512345678901234512345678901234512345 6789012345123456789012345123456789012345123456789012345123456789012345123456789012345123456789012345 1234567890123451234567890123451234567890123451234567890123451234567890123451234567890123451234567890 1234512345678901234512345678901234512345678901234512345678901234512345678901234512345678901234512345 6789012345123456789012345123456789012345123456789012345123456789012345123456789012345123456789012345
returns 600 btw.
-
Jan 11th, 2006, 09:09 PM
#16
Re: string comparison function
whoops removed the doevents now get 2 for the gettickcount, forget it getting to OT.
-
Jan 11th, 2006, 09:30 PM
#17
Re: string comparison function
Is that one string or two strings? If one, what is the other string you compare to?
Anyways, my thoughts came up with yet-another way to do this. Since I'm too tired to code it, I'll just describe it:- First, check if the whole string matches (of the length of the second comparable string)
- If not, then take a string that is shorter by one than the comparable string and see if that is found. Move in the search string until a match is found or there is no space
- If no match is found, shrink again and keep doing it
- If a match is found, drop these characters and work with what is still left, doing the same as above, until there is really nothing to look at
I think this would be a pretty much of a flawless solution. My current code isn't able to find all search strings "perfectly", for example:
VB Code:
MsgBox GetMatchString(_
" this inn is a wonderful inn as I were sleeping I saw dreams ", _
" I saw dreams while sleeping as I were in this wonderful inn ")
What I might expect it to return would be:
Code:
wonderful inn I saw dreams as I were sleeping this in in
Meaning, it would return the longest matches first and then go for shorter valid ones.
Last edited by Merri; Jan 11th, 2006 at 09:33 PM.
-
Jan 11th, 2006, 09:39 PM
#18
Re: string comparison function
Originally Posted by Merri
Is that one string or two strings? If one, what is the other string you compare to?
Anyways, my thoughts came up with yet-another way to do this. Since I'm too tired to code it, I'll just describe it: - First, check if the whole string matches (of the length of the second comparable string)
- If not, then take a string that is shorter by one than the comparable string and see if that is found. Move in the search string until a match is found or there is no space
- If no match is found, shrink again and keep doing it
- If a match is found, drop these characters and work with what is still left, doing the same as above, until there is really nothing to look at
I think this would be a pretty much of a flawless solution. My current code isn't able to find all search strings "perfectly", for example:
VB Code:
MsgBox GetMatchString(_
" this inn is a wonderful inn as I were sleeping I saw dreams ", _
" I saw dreams while sleeping as I were in this wonderful inn ")
What I might expect it to return would be:
Code:
wonderful inn I saw dreams as I were sleeping this in in
Meaning, it would return the longest matches first and then go for shorter valid ones.
just skimmed through what you wrote but i think thats what i did, i started with string1 and ran though it 1 char at a time comparing each of these to the first char in string2 until i got a match if a match was found then see how long the match goes on for then remove that from string2 and repeat this until string2 = 0 if no match is found then run though string1 again this time starting at the next char in string2.
-
Jan 11th, 2006, 09:44 PM
#19
Re: string comparison function
No, that's not it, its the opposite: first take as much of string1 as possible (= the length of string2). Compare it all to string2. If no match and if there are more chars in string1, move to right by one, take a string, compare to string2 and so on. Then when this is done and no matches, shrink the lookup string by one, check through string1 and this time as the lookup string is shorter than string2, you need to check more from string2 as well. Then keep doing this and mark out matches until there is nothing more to look for (= have only one character long matches which we ignore).
Marking out always the longest first will get rid of the problem that is with the example strings I posted in my last message.
-
Jan 12th, 2006, 03:07 AM
#20
Re: string comparison function
Thank you guys.
I will use Merri's method, because it is the only one that works correctly.
jcis's method doen't always return the correct results.
Jmacp's method produces an out of stack space error with the following strings:
VB Code:
Private Sub Form_Load()
string1 = "123456790123450" & vbNullChar
string2 = "123456789012345" & vbNullString
End Sub
Although it returns the correct result when the strings are switched.
Case closed.
-
Jan 12th, 2006, 03:23 AM
#21
Re: string comparison function
Originally Posted by Frans C
Thank you guys.
I will use Merri's method, because it is the only one that works correctly.
jcis's method doen't always return the correct results.
Jmacp's method produces an out of stack space error with the following
Case closed.
Yes, I didn't read your post #4, sorry.
And yes, Merri's method is working fine, except for this cases:
123456789012345
023456789012345
and
123456789012345
234567890123450
(both examples in your first post)
Shouldn't be 14? here is returning 15 for both of them.
(Maybe I should substract 1 from the result, i don't know)
Last edited by jcis; Jan 12th, 2006 at 03:26 AM.
-
Jan 12th, 2006, 04:25 AM
#22
Re: [RESOLVED] string comparison function
Yes, you are right.
I guess it isn't resolved after all.
-
Jan 12th, 2006, 05:16 AM
#23
Re: string comparison function
I picked some ideas from all of you, and wrote the following function.
I tested it with all examples, and it looks like it works.
I took another thing into consideration.
AAABB
BBAAA
should not return 5 but 3. I don't want to accept swapping characters as if the strings would be identical.
Again, thank all of you for the effort you have put into it.
VB Code:
Private Sub Command1_Click()
MsgBox GetLongestMatch("123056789012345", "123456789012345")
End Sub
Private Function GetLongestMatch(ByVal String1 As String, ByVal String2 As String) As Long
Dim strHelp As String
Dim lngA As Long
Dim lngB As Long
Dim blnStart As Boolean
Dim strSearch As String
Dim lngMatch As Long
Dim strRemaining As String
If String1 = String2 Then
lngMatch = Len(String1)
Else
If Len(String1) > Len(String2) Then
' we want string1 to be the shortest string
strHelp = String2
String2 = String1
String1 = strHelp
End If
For lngA = Len(String1) To 1 Step -1
' try to find the left characters of string1 in string2
strSearch = Left$(String1, lngA)
If InStr(String2, strSearch) > 0 Then
lngMatch = Len(strSearch)
Exit For
Else
' try to find the right characters of string1 in string2
strSearch = Right$(String1, lngA)
If InStr(String2, strSearch) > 0 Then
lngMatch = Len(strSearch)
Exit For
End If
End If
Next
If lngMatch > 0 Then
' check for longest match in remaining strings (before and after matching strings)
lngA = InStr(String1, strSearch)
lngB = InStr(String2, strSearch)
If lngA > 1 And lngB > 1 Then
' there are strings before the match, check these
lngMatch = lngMatch + GetLongestMatch(Left$(String1, lngA - 1), Left$(String2, lngB - 1))
End If
If (lngA + Len(strSearch) < Len(String1)) And (lngB + Len(strSearch) < Len(String2)) Then
' there are strings after the match, check these
lngMatch = lngMatch + GetLongestMatch(Right$(String1, Len(String1) - lngA - Len(strSearch) + 1), Right$(String2, Len(String2) - lngB - Len(strSearch) + 1))
End If
End If
End If
GetLongestMatch = lngMatch
End Function
-
Jan 12th, 2006, 05:33 AM
#24
Re: [RESOLVED] string comparison function
Sorry, it still had an error it the logic.
This should work though.
VB Code:
Private Function GetLongestMatch(ByVal String1 As String, ByVal String2 As String) As Long
Dim strHelp As String
Dim lngA As Long
Dim lngB As Long
Dim blnStart As Boolean
Dim strSearch As String
Dim lngMatch As Long
Dim strRemaining As String
Dim blnFound As Boolean
If String1 = String2 Then
lngMatch = Len(String1)
Else
If Len(String1) > Len(String2) Then
' we want string1 to be the shortest string
strHelp = String2
String2 = String1
String1 = strHelp
End If
For lngA = Len(String1) To 1 Step -1
' try to find characters of string1 in string2
' we start with the longest possible string, and if not found, make the string smaller
For lngB = 1 To Len(String2) - lngA + 1
' roam through string1 from left to right
strSearch = Mid$(String1, lngB, lngA)
If InStr(String2, strSearch) > 0 Then
lngMatch = Len(strSearch)
blnFound = True
Exit For
End If
Next
If blnFound Then Exit For
Next
If lngMatch > 0 Then
' check for longest match in remaining strings (before and after matching strings)
lngA = InStr(String1, strSearch)
lngB = InStr(String2, strSearch)
If lngA > 1 And lngB > 1 Then
' there are strings before the match, check these
lngMatch = lngMatch + GetLongestMatch(Left$(String1, lngA - 1), Left$(String2, lngB - 1))
End If
If (lngA + Len(strSearch) < Len(String1)) And (lngB + Len(strSearch) < Len(String2)) Then
' there are strings after the match, check these
lngMatch = lngMatch + GetLongestMatch(Right$(String1, Len(String1) - lngA - Len(strSearch) + 1), Right$(String2, Len(String2) - lngB - Len(strSearch) + 1))
End If
End If
End If
GetLongestMatch = lngMatch
End Function
-
Jan 12th, 2006, 09:33 AM
#25
Re: [RESOLVED] string comparison function
i think i fixed my version,
VB Code:
Option Explicit
Private j As Long
Private i As Long
Private string1 As String
Private string2 As String
Private z As Long
Private blIsAMatchFoundBetweenStrings As Boolean
Private Sub Command1_Click()
string1 = "123456790123450"
string2 = "123456789012345"
blIsAMatchFoundBetweenStrings = False
IterateString
End Sub
'***
Public Sub CompareLongestMatch(ByVal string1 As String, _
ByVal string2 As String)
Dim strTmpString As String
' when a match is found this sub starts, the value of i is carried on
' from the previous IterateString sub
' string1 = "123456790123450"
' string2 = "123456789012345"
' so if i = 1 run through this till i <> j
' first run up till 8, finds, 1234567
Do Until Mid$(string1, i, 1) <> Mid$(string2, j, 1)
strTmpString = strTmpString & Mid$(string1, i, 1)
j = j + 1
i = i + 1
DoEvents
Loop
' the length of all Consecutive letters found
If Len(strTmpString) > 1 Then
z = z + Len(strTmpString)
End If
strTmpString = vbNullString
End Sub
Private Sub IterateString()
i = 1
j = 1
' when the length of string2 = 0, end, will always reach 0
If Len(string2) = 0 Then
MsgBox z
z = 0
Exit Sub
End If
'string1 = "123456790123450"
'string2 = "123456789012345"
For i = 1 To Len(string1)
' run through string1 1 char at a time and compare each to first char in string2
' j stays at 1 for first loop
If Mid$(string1, i, 1) = Mid$(string2, j, 1) Then
'if a match is found then find out how long it goes on for ***
CompareLongestMatch string1, string2
' redim string2 so its string2 -
' so for the above strings 1234567 are found so string2 becomes
' 89012345, string1 doesn't change
' 8 isnt found so string2 =9012345 then
' 9012345 so strin2 = 0
string2 = Mid$(string2, j, Len(string2))
DoEvents
blIsAMatchFoundBetweenStrings = False
Exit For
End If
Next i
'if no match is found then that character must be a single char then remove it
' so this next condition only gets entered if no match was found
If blIsAMatchFoundBetweenStrings Then
' so redim string , string2 - 1
string2 = Right$(string2, Len(string2) - 1)
End If
' set the flag back to true
' when a match is found flag sets to false
blIsAMatchFoundBetweenStrings = True
' recursive
IterateString
End Sub
-
Jan 12th, 2006, 10:34 AM
#26
Re: [RESOLVED] string comparison function
Nice, but try this:
string1 = "1234500000"
string2 = "1234512345"
It should return 5
-
Jan 12th, 2006, 10:47 AM
#27
Re: [RESOLVED] string comparison function
Originally Posted by Frans C
Nice, but try this:
string1 = "1234500000"
string2 = "1234512345"
It should return 5
pretty major typo , but i see whats happening,
string2 starts at 1234512345 then reduces to 12345 so a second match is found, will leave at the moment...
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|