Thanks. Was a little busy over the weekend but I finally had time to dive into this today.
Originally Posted by jpbro
I've just tested looping a byte array to find a series of bytes, and it is slower than InStr but certainly "close". Some things to note:
1) It's dramatically slower in the IDE (~25x slower).
2) It's noticeably slower when compiled with no optimizations (~5x)
3) It's still slower when compiled with the "Remove Array Bounds Checks" optimization is selected (~2x - so yes, still slower but maybe OK?)
Are you running compiled with the "Remove Array Bounds Checks" optimization selected when you test the performance?
This didn't surprise me that much. I'm willing to bet that most people aren't aware that modern x86 processors have string instructions, however you cannot access them directly from a high level language like VB6. This was what actually prompted me to ask for your implementation, so I could test it against not only the native Instr function but a handwritten assembly version.
I took some time today to write a byte array version of Instr in pure assembly and these were the results:-
The runtime Instr is the normal Instr we all know and love, the VB6 Instr is a VB6 version based on the code you posted and the ASM Instr is my own handwritten assembly version. As you have already discovered, a pure VB6 version is horribly slow, at best it's twice as slow as the normal Instr function. The assembly version performs only slightly better than normal Instr. It performed roughly 30% better.
This is the benchmark code:-
Code:
Private Sub Form_Load()
Const MAX As Long = (10 ^ 7)
Dim sText As String
Dim sSrch As String
Dim arText() As Byte
Dim arSrch() As Byte
Dim l As Long
Dim i As Long
Dim T As Double, RuntimeTime As Double, VBTime As Double, ASMTime As Double
sText = "Hello there. Do you like kittens?"
sSrch = "like"
arText = sText 'StrConv(sText, vbFromUnicode)
arSrch = sSrch 'StrConv(sSrch, vbFromUnicode)
T = Timer
For i = 1 To MAX
l = ASMInstrByteArray(1, arText, arSrch)
Next
ASMTime = Timer - T
Prn "ASM Instr : " & Fo(ASMTime)
T = Timer
For i = 1 To MAX
l = InStr(1, sText, sSrch, vbBinaryCompare)
Next
RuntimeTime = Timer - T
Prn "Runtime Instr : " & Fo(RuntimeTime)
T = Timer
For i = 1 To MAX
l = VBInstrByteArray(1, arText, arSrch)
Next
VBTime = Timer - T
Prn "VB6 Instr : " & Fo(VBTime)
'****************************************
Prn ""
Prn "Runtime/ASM : " & Fo(RuntimeTime / ASMTime)
Prn "VB6/ASM : " & Fo(VBTime / ASMTime)
End Sub
This is the pure VB6 version that's based on yours:-
Code:
Private Function VBInstrByteArray(ByVal start As Long, ByRef data() As Byte, ByRef srch() As Byte) As Long
Dim a As Long, b As Long
Dim i As Long
Dim x As Long
a = (LBound(data) + start) - 1
b = LBound(srch)
For i = a To UBound(data)
If data(i) = srch(b) Then
b = b + 1
If b > UBound(srch) Then
VBInstrByteArray = i - (UBound(srch) - LBound(srch)) + 1
Exit For
End If
Else
b = LBound(srch)
End If
Next
End Function
And this is the assembly version:-
Code:
use32
push ebp
mov ebp, esp
push esi
push edi
push ebx
;----------------------------------
mov edi, [ebp + 12] ;Copy SAFEARRAY** from 1st arg into EDI
mov edi, [edi] ;Get SAFEARRAY* from SAFEARRAY**
mov ecx, [edi + 16] ;Number of elements in array in 2nd arg goes in
;ECX
mov edi, [edi + 12] ;Get pvData pointer from SAFEARRAY and put that
;into EDI. EDI now points to the first element
;in the byte array passed in the 2nd argument
mov ebx, edi ;Keep a copy of the pointer to the start
;of the data in the 1st array in EBX
add edi, [ebp + 8] ;Move pointer of the array being searched
;to the offset specified in the 1st argument
dec edi ;Our function is one-based like VB6's Instr
;so we decrement the pointer by 1
mov esi, [ebp + 16] ;Copy SAFEARRAY** from 3rd arg into ESI
mov esi, [esi] ;Get SAFEARRAY* from SAFEARRAY**
mov edx, [esi + 16] ;Number of elements in array in 3rd arg goes in
;EDX
mov esi, [esi + 12] ;Get pvData pointer from SAFEARRAY and put that
;into ESI. ESI now points to the first element
;in the byte array passed in the 3rd argument
mov al, [esi] ;Move the value of the first byte of the second
;array into AL register.
dec edx ;Subtract one from the length of the second
;array. We always start
;the CMBSB from the
;second byte in the second array since
;SCASB will match the 1st byte
inc esi ;Move to the second byte of the
;second array
scanAgain:
repne scasb ;Scan each byte of the 1st array
;until a match is found to the
;byte value in AL.
jne returnZero ;If the end of the string was reached without
;a match then return 0
cmp edx, 0
je matchFound ;If the length of the second array is 1
;(Remember EDX is the length of second array -1)
;then we don't need any further matching
cmp edx, ecx ;Compare the length of the second array(-1) to
;the number of bytes remaining to be scanned in
;the first array
jg returnZero ;If the remainer to be scanned in the 1st array
;is less than the size of the second array, then
;there no match is possible so return 0
push ecx ;Preseve counter of the remaining bytes
;to be scanned in the 1st array.
push esi ;Preserve pointer to second byte
;in the second array
mov ecx, edx ;Put the length of the second byte array
;into ECX for the REP prefix counter
repe cmpsb ;Compare the two byte arrays until
;either ECX is 0 or a match was not found
pop esi ;Restore pointer to second array
pop ecx ;Restore SCASB counter
jne scanAgain ;If the result of the last comparison
;was not a match then continue searching the
;the array for a match
matchFound:
mov eax, edi ;EDI should be pointed to
;end the of match found
sub eax, ebx ;Subtract it from the address at the start
;of the array to get an offset
sub eax, edx ;Subtract the length of the second array
;to get an offset to the start of the match
;Note: EDX actually stores the length of the
;second array -1.
;Also note that the value returned in EAX here
;is 1 based like the VB6 Instr function
jmp exitFunc
returnZero:
xor eax, eax
exitFunc:
;----------------------------------
pop ebx
pop edi
pop esi
mov esp, ebp
pop ebp
ret 12
What makes the assembly version special is that uses processor's native string instructions via these two lines:-
Code:
repne scasb
Code:
repe cmpsb
Those two instructions are actually looping instructions where the loops are implemented at a very low level in the processor and as such they are extremely fast. They are about as fast as you can possibly get.
I've attached the entire project to this post.
IMPORANT!
In order to run the project you must install the trick's assembly add-in. You can find a link to it in this post. It's quick to install and "dog" easy to use.
C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter
There's just no reason to use garbage like InputBox. - jmcilhinney
The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber
.
I just whipped it up - there may be bugs, and I doubt it's the most efficient approach.
I'd also like to point out that it may be very possible to optimize the pure VB6 version to perform at least as well as the normal Instr function. Like you, I also get the feeling that it may be possible to optimize it further. Despite performing twice as bad as both the ASM and normal Instr, I was still very surprised at how well it did with all optimizations enabled. Someone like Olaf could probably make it faster.
I'm also pretty sure that my ASM version could also be optimized further. Though I love playing around with assembly, I'm far from an assembly language expert.
C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter
There's just no reason to use garbage like InputBox. - jmcilhinney
The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber
I'd also like to point out that it may be very possible to optimize the pure VB6 version to perform at least as well as the normal Instr function. Like you, I also get the feeling that it may be possible to optimize it further. Despite performing twice as bad as both the ASM and normal Instr, I was still very surprised at how well it did with all optimizations enabled. Someone like Olaf could probably make it faster.
I'm also pretty sure that my ASM version could also be optimized further. Though I love playing around with assembly, I'm far from an assembly language expert.
You could put your VBInstrByteArray into a standard module and precalculate the Ubound/LBound values to increase the performance.
You could put your VBInstrByteArray into a standard module and precalculate the Ubound/LBound values to increase the performance.
You were absolutely right! I refactored it a bit here and put it in a module:-
Code:
Public Function VBInstrByteArray2(ByVal start As Long, ByRef data() As Byte, ByRef srch() As Byte) As Long
Dim b As Long
Dim i As Long
Dim x As Long
Dim lbSrch As Long, ubSrch As Long, lenSrch As Long
lbSrch = LBound(srch)
ubSrch = UBound(srch)
lenSrch = (ubSrch - lbSrch) + 1
b = lbSrch
For i = (LBound(data) + start) - 1 To UBound(data)
If data(i) = srch(b) Then
b = b + 1
If b > ubSrch Then
VBInstrByteArray2 = i - lenSrch
Exit For
End If
Else
b = lbSrch
End If
Next
End Function
And these were the results from an EXE with all optimizations enabled:-
C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter
There's just no reason to use garbage like InputBox. - jmcilhinney
The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber
Of course we should obligatory mention Boyer–Moore string-search algorithm which has the potential to obliterate any naive loop approach be it in ASM or not.
Good job, Niya! Note that repne scasb is not the fastest option, especially if the first byte to search is a common character like "e". With SIMD instructions, it can be a factor 2-3 faster. The bigger problem, in any case, are the conversions done under the hood.
Originally Posted by wqweto
Of course we should obligatory mention Boyer–Moore string-search algorithm which has the potential to obliterate any naive loop approach be it in ASM or not.