Page 2 of 2 FirstFirst 12
Results 41 to 46 of 46

Thread: Faster ByteArray search/loop

  1. #41
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    9,017

    Re: Faster ByteArray search/loop

    Quote Originally Posted by jpbro View Post
    Sure, here it is:....
    Thanks. Was a little busy over the weekend but I finally had time to dive into this today.

    Quote Originally Posted by jpbro View Post
    I've just tested looping a byte array to find a series of bytes, and it is slower than InStr but certainly "close". Some things to note:

    1) It's dramatically slower in the IDE (~25x slower).
    2) It's noticeably slower when compiled with no optimizations (~5x)
    3) It's still slower when compiled with the "Remove Array Bounds Checks" optimization is selected (~2x - so yes, still slower but maybe OK?)

    Are you running compiled with the "Remove Array Bounds Checks" optimization selected when you test the performance?
    This didn't surprise me that much. I'm willing to bet that most people aren't aware that modern x86 processors have string instructions, however you cannot access them directly from a high level language like VB6. This was what actually prompted me to ask for your implementation, so I could test it against not only the native Instr function but a handwritten assembly version.

    I took some time today to write a byte array version of Instr in pure assembly and these were the results:-

    IDE
    Code:
    ASM Instr : 1.042
    Runtime Instr : 1.417
    VB6 Instr : 14.549
    
    Runtime/ASM : 1.3599
    VB6/ASM : 13.9626
    EXE with all optimizations enabled
    Code:
    ASM Instr : 0.88
    Runtime Instr : 1.17
    VB6 Instr : 1.893
    
    Runtime/ASM : 1.3295
    VB6/ASM : 2.1511
    The runtime Instr is the normal Instr we all know and love, the VB6 Instr is a VB6 version based on the code you posted and the ASM Instr is my own handwritten assembly version. As you have already discovered, a pure VB6 version is horribly slow, at best it's twice as slow as the normal Instr function. The assembly version performs only slightly better than normal Instr. It performed roughly 30% better.

    This is the benchmark code:-
    Code:
    Private Sub Form_Load()
        
        Const MAX As Long = (10 ^ 7)
        Dim sText As String
        Dim sSrch As String
        
        Dim arText() As Byte
        Dim arSrch() As Byte
        Dim l As Long
        Dim i As Long
        Dim T As Double, RuntimeTime As Double, VBTime As Double, ASMTime As Double
        
        sText = "Hello there. Do you like kittens?"
        sSrch = "like"
        
        arText = sText 'StrConv(sText, vbFromUnicode)
        arSrch = sSrch 'StrConv(sSrch, vbFromUnicode)
        
        T = Timer
        For i = 1 To MAX
            l = ASMInstrByteArray(1, arText, arSrch)
        Next
        ASMTime = Timer - T
        Prn "ASM Instr : " & Fo(ASMTime)
       
        
        T = Timer
        For i = 1 To MAX
            l = InStr(1, sText, sSrch, vbBinaryCompare)
        Next
        RuntimeTime = Timer - T
        Prn "Runtime Instr : " & Fo(RuntimeTime)
          
       
        T = Timer
        For i = 1 To MAX
            l = VBInstrByteArray(1, arText, arSrch)
        Next
        VBTime = Timer - T
        Prn "VB6 Instr : " & Fo(VBTime)
           
        '****************************************
        
        Prn ""
        Prn "Runtime/ASM : " & Fo(RuntimeTime / ASMTime)
        Prn "VB6/ASM : " & Fo(VBTime / ASMTime)
    
    End Sub
    This is the pure VB6 version that's based on yours:-
    Code:
    Private Function VBInstrByteArray(ByVal start As Long, ByRef data() As Byte, ByRef srch() As Byte) As Long
        
        Dim a As Long, b As Long
        Dim i As Long
        Dim x As Long
            
        a = (LBound(data) + start) - 1
        b = LBound(srch)
        
        For i = a To UBound(data)
            If data(i) = srch(b) Then
                b = b + 1
                If b > UBound(srch) Then
                    VBInstrByteArray = i - (UBound(srch) - LBound(srch)) + 1
                    Exit For
                End If
            Else
                b = LBound(srch)
            End If
        Next
    
    End Function
    And this is the assembly version:-
    Code:
    use32
    
    push ebp							
    mov ebp, esp	
    
    push esi
    push edi
    push ebx
    ;----------------------------------
    
    mov edi, [ebp + 12]	;Copy SAFEARRAY** from 1st arg into EDI
    mov edi, [edi]		;Get SAFEARRAY* from SAFEARRAY**
    
    mov ecx, [edi + 16]	;Number of elements in array in 2nd arg goes in
    			;ECX
    
    mov edi, [edi + 12]	;Get pvData pointer from SAFEARRAY and put that
    			;into EDI. EDI now points to the first element
    			;in the byte array passed in the 2nd argument
    
    mov ebx, edi		;Keep a copy of the pointer to the start
    			;of the data in the 1st array in EBX
    
    add edi, [ebp + 8]	;Move pointer of the array being searched
    			;to the offset specified in the 1st argument
    
    dec edi			;Our function is one-based like VB6's Instr
    			;so we decrement the pointer by 1
    
    mov esi, [ebp + 16]	;Copy SAFEARRAY** from 3rd arg into ESI
    mov esi, [esi]		;Get SAFEARRAY* from SAFEARRAY**
    
    mov edx, [esi + 16]	;Number of elements in array in 3rd arg goes in
    			;EDX
    
    mov esi, [esi + 12]	;Get pvData pointer from SAFEARRAY and put that
    			;into ESI. ESI now points to the first element
    			;in the byte array passed in the 3rd argument
    
    
    
    mov al, [esi]		;Move the value of the first byte of the second
    			;array into AL register.
    
    dec edx			;Subtract one from the length of the second
    			;array. We always start
    			;the CMBSB from the 
    			;second byte in the second array since
    			;SCASB will match the 1st byte
    
    inc esi			;Move to the second byte of the
    			;second array
    
    scanAgain:
    
    repne scasb 		;Scan each byte of the 1st array
    			;until a match is found to the
    			;byte value in AL.
    
    
    jne returnZero		;If the end of the string was reached without
    			;a match then return 0
    
    cmp edx, 0		
    
    je matchFound		;If the length of the second array is 1
    			;(Remember EDX is the length of second array -1)
    			;then we don't need any further matching
    
    
    cmp edx, ecx		;Compare the length of the second array(-1) to
    			;the number of bytes remaining to be scanned in 
    			;the first array
    
    jg returnZero		;If the remainer to be scanned in the 1st array
    			;is less than the size of the second array, then
    			;there no match is possible so return 0
    
    push ecx		;Preseve counter of the remaining bytes
    			;to be scanned in the 1st array.
    
    push esi		;Preserve pointer to second byte
    			;in the second array
    
    mov ecx, edx		;Put the length of the second byte array
    			;into ECX for the REP prefix counter
    
    
    repe cmpsb		;Compare the two byte arrays until
    			;either ECX is 0 or a match was not found
    
    pop esi			;Restore pointer to second array
    pop ecx			;Restore SCASB counter
    
    jne scanAgain		;If the result of the last comparison
    			;was not a match then continue searching the
    			;the array for a match
    
    matchFound:
    mov eax, edi		;EDI should be pointed to
    			;end the of match found
    
    sub eax, ebx		;Subtract it from the address at the start
    			;of the array to get an offset
    
    sub eax, edx		;Subtract the length of the second array
    			;to get an offset to the start of the match
    			;Note: EDX actually stores the length of the
    			;second array -1. 
    			;Also note that the value returned in EAX here
    			;is 1 based like the VB6 Instr function
    
    jmp exitFunc
    
    returnZero:
    
    xor eax, eax
    
    
    exitFunc:
    ;----------------------------------
    pop ebx
    pop edi
    pop esi
    
    mov esp, ebp
    pop ebp
    
    ret 12
    What makes the assembly version special is that uses processor's native string instructions via these two lines:-
    Code:
    repne scasb
    Code:
    repe cmpsb
    Those two instructions are actually looping instructions where the loops are implemented at a very low level in the processor and as such they are extremely fast. They are about as fast as you can possibly get.

    I've attached the entire project to this post.

    IMPORANT!

    In order to run the project you must install the trick's assembly add-in. You can find a link to it in this post. It's quick to install and "dog" easy to use.
    Attached Files Attached Files
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell | I'm not wanted

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

  2. #42
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    9,017

    Re: Faster ByteArray search/loop

    Quote Originally Posted by jpbro View Post
    .
    I just whipped it up - there may be bugs, and I doubt it's the most efficient approach.
    I'd also like to point out that it may be very possible to optimize the pure VB6 version to perform at least as well as the normal Instr function. Like you, I also get the feeling that it may be possible to optimize it further. Despite performing twice as bad as both the ASM and normal Instr, I was still very surprised at how well it did with all optimizations enabled. Someone like Olaf could probably make it faster.

    I'm also pretty sure that my ASM version could also be optimized further. Though I love playing around with assembly, I'm far from an assembly language expert.
    Last edited by Niya; Apr 17th, 2022 at 03:57 AM.
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell | I'm not wanted

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

  3. #43
    PowerPoster
    Join Date
    Feb 2015
    Posts
    2,797

    Re: Faster ByteArray search/loop

    Quote Originally Posted by Niya View Post
    I'd also like to point out that it may be very possible to optimize the pure VB6 version to perform at least as well as the normal Instr function. Like you, I also get the feeling that it may be possible to optimize it further. Despite performing twice as bad as both the ASM and normal Instr, I was still very surprised at how well it did with all optimizations enabled. Someone like Olaf could probably make it faster.

    I'm also pretty sure that my ASM version could also be optimized further. Though I love playing around with assembly, I'm far from an assembly language expert.
    You could put your VBInstrByteArray into a standard module and precalculate the Ubound/LBound values to increase the performance.

  4. #44
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    9,017

    Re: Faster ByteArray search/loop

    Quote Originally Posted by The trick View Post
    You could put your VBInstrByteArray into a standard module and precalculate the Ubound/LBound values to increase the performance.
    You were absolutely right! I refactored it a bit here and put it in a module:-
    Code:
    Public Function VBInstrByteArray2(ByVal start As Long, ByRef data() As Byte, ByRef srch() As Byte) As Long
        
        Dim b As Long
        Dim i As Long
        Dim x As Long
        Dim lbSrch As Long, ubSrch As Long, lenSrch As Long
        
        lbSrch = LBound(srch)
        ubSrch = UBound(srch)
        lenSrch = (ubSrch - lbSrch) + 1
        b = lbSrch
        
        For i = (LBound(data) + start) - 1 To UBound(data)
            If data(i) = srch(b) Then
                b = b + 1
                If b > ubSrch Then
                    VBInstrByteArray2 = i - lenSrch
                    Exit For
                End If
            Else
                b = lbSrch
            End If
        Next
    
    End Function
    And these were the results from an EXE with all optimizations enabled:-
    Code:
    ASM Instr : 0.931
    Runtime Instr : 1.249
    VB6 Instr : 1.883
    VB6 Instr v2 : 0.704
    
    Runtime/ASM : 1.3416
    VB6/ASM : 2.0226
    VB6 v2/ASM : 0.7562
    It performed the best, outperforming my ASM version by 30%.
    Last edited by Niya; Apr 17th, 2022 at 03:09 PM.
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell | I'm not wanted

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

  5. #45

  6. #46
    Addicted Member jj2007's Avatar
    Join Date
    Dec 2015
    Posts
    206

    Re: Faster ByteArray search/loop

    Good job, Niya! Note that repne scasb is not the fastest option, especially if the first byte to search is a common character like "e". With SIMD instructions, it can be a factor 2-3 faster. The bigger problem, in any case, are the conversions done under the hood.

    Quote Originally Posted by wqweto View Post
    Of course we should obligatory mention Boyer–Moore string-search algorithm which has the potential to obliterate any naive loop approach be it in ASM or not.
    We tested BM ad nauseam, and never found an advantage.

Page 2 of 2 FirstFirst 12

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width