|
-
Mar 3rd, 2006, 06:07 PM
#1
Thread Starter
Fanatic Member
Need ideas for a string function optimization
I have a function for finding the longest matching substring in two string that is in need of some help. If you pass it two different strings, it will return the longest string fragment they have in common. I.e., if you had these two strings,
"This is a test string."
"What is a test object."
it would return " is a test ". Usually calls to it will be much more complex . A good baseline call would be a string with 2000 characters, and the same string with 10 arbitrary inserts and 10 arbitrary deletes. This takes an average of about 5 seconds on my machine. Currently, my code is spending about 99% percent of processing time in this function, so it's in obvious need of any optimizations I can come up with. I tried API functions to replace Mid$, but they were a bit slower--I also might try using byte arrays.
Here's the main function:
VB Code:
Private Function SeekMatch(ByVal sOne As String, ByVal sTwo As String) As String
'This function finds the longest substring of sTwo that exists in sOne, and returns it. No matches
'returns a vbNullString.
Dim sTest As String, lSize As Long, lStart As Long, lResult As Long, lLength As Long
On Error GoTo ErrHand
lLength = Len(sTwo)
For lSize = lLength - 1 To 1 Step -1 'Loop through all posible substring lengths, descending.
For lStart = lLength - lSize To 1 Step -1 'Loop through all possible substring positions.
sTest = Mid$(sTwo, lStart, lSize) 'Get the substring from the Buffer string.
If InStrB(1, sOne, sTest) <> 0 Then 'If there is a match,
SeekMatch = sTest 'Return and exit.
Exit Function
End If
Next lStart
Next lSize
ErrHand:
If Err.Number <> 0 Then
Call MsgBox("Error number " & Err.Number & vbCrLf & Err.Description & vbCrLf & _
"in SeekMatch.", vbCritical)
End If
End Function
Here are a couple test functions I use to run benchmarks:
VB Code:
Private Declare Function GetTickCount& Lib "kernel32" ()
Private Sub BenchMark()
Dim sString1 As String, sString2 As String, lStart As Long, lEnd As Long, sOut As String
sString1 = RandomString(2000)
sString2 = RandomChanges(sString1, 10, 10, 5)
Debug.Print sString1
Debug.Print sString2
lStart = GetTickCount
sOut = SeekMatch(sString1, sString2)
lEnd = GetTickCount
Debug.Print sOut
Debug.Print lEnd - lStart
End Sub
Private Function RandomString(lChars As Long) As String
Dim lCount As Long, bTemp() As Byte
ReDim bTemp(lChars)
For lCount = LBound(bTemp) To UBound(bTemp)
bTemp(lCount) = Rand(33, 126)
Next lCount
RandomString = StrConv(bTemp, vbUnicode)
End Function
Private Function RandomChanges(ByVal sString As String, lInserts As Long, lDeletes As Long, _
Optional lBoundry As Long = 10) As String
Dim lCount As Long, sOut As String, lStart As Long, lEnd As Long, sInsert As String, lLen As Long
Dim sLeft As String, sRight As String, lTemp As Long
For lCount = 1 To lInserts
lLen = Len(sString)
lStart = Rand(1, lLen)
sInsert = RandomString(Rand(1, lBoundry))
If lStart = 0 Then
sString = sInsert & sString
ElseIf lStart = lLen Then
sString = sString & sInsert
Else
sLeft = Left$(sString, lStart)
sRight = Right$(sString, lLen - lStart)
sString = sLeft & sInsert & sRight
End If
Next lCount
For lCount = 1 To lDeletes
lLen = Len(sString)
lStart = Rand(1, lLen)
If lStart + lBoundry > lLen Then
lEnd = lLen
Else
lEnd = Rand(lStart, lStart + lBoundry)
End If
If lStart = 1 And lEnd = lLen Then
sString = vbNullString
Exit For
End If
If lEnd = lLen Then
sString = Left$(sString, lStart - 1)
ElseIf lStart = 1 Then
sString = Right$(sString, lLen - lEnd)
Else
sLeft = Left$(sString, lStart - 1)
sRight = Right$(sString, lLen - lEnd)
sString = sLeft & sRight
End If
Next lCount
RandomChanges = sString
End Function
Private Function Rand(lLow As Long, lHigh As Long) As Long
Rand = Int((lHigh - lLow + 1) * Rnd + lLow)
End Function
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|