Results 1 to 31 of 31

Thread: Binary search for ANSI string

  1. #1

    Thread Starter
    PowerPoster Elroy's Avatar
    Join Date
    Jun 2014
    Location
    Near Nashville TN
    Posts
    10,911

    Binary search for ANSI string

    I've already got this working, but very inefficiently.

    I need to search for an ANSI string in a Binary file. The file will already be opened as Binary.

    Currently, I'm reading in chunks, and then converting those chunks to Unicode, and then using Instr against my search string.

    I'd much prefer to convert my "sNeedle" string to a byte array, and do the search without messing with Unicode, but I don't have an algorithm to search for a byte array within a much larger byte array, which is what I need.

    I took a look at InstrB, but I've never used that and I'm not sure how it works. I did do the following test, but it didn't do anything useful (always returning 0).

    Code:
    
    Option Explicit
    
    Private Sub Form_Load()
    
        Dim bb() As Byte
        ReDim bb(3)
        bb(0) = 5
        Dim bb2() As Byte
        ReDim bb2(100)
        bb2(44) = 5
    
        Debug.Print InStrB(bb, bb2)
    
    End Sub
    
    
    Any ideas/suggestions?

    ---------------

    p.s. I suppose I could write loops to do my search, but I'm not sure that'd be any faster than what I'm doing.

    ---------------

    Here, I'll post what I'm doing (but don't like):

    Code:
    Private Function BinaryFileSearch(hFle As Integer, sSearchString As String, _
                                      Optional iStartPosition As Long = 1&, _
                                      Optional iFoundPosition As Long) As Boolean
        ' Returns true if sSearchString is found, else false.
        ' sSearchString can be no longer than 128.
        ' The lFoundPosition is a return argument.
        '    It returns the latest position before lStartPosition (if there isn't one after lStartPosition) or
        '    it returns the earliest position after lStartPosition.
        '
        Dim sFileData   As String
        Dim pFile       As Long
        Dim iLen        As Long
        Dim iPos        As Long
        '
        sFileData = Space$(1024&)
        iLen = LOF(hFle)
        pFile = iStartPosition
        Do
            If pFile > iLen Then Exit Do
            Get hFle, pFile, sFileData
            iPos = InStr(sFileData, sSearchString)
            If iPos <> 0& Then
                iFoundPosition = pFile + iPos - 1&
                If iFoundPosition >= iStartPosition Then
                    BinaryFileSearch = True
                    Exit Do
                End If
            End If
            pFile = ((pFile + 1024&) - Len(sSearchString)) + 1&
        Loop
    End Function
    Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.

  2. #2
    PowerPoster VanGoghGaming's Avatar
    Join Date
    Jan 2020
    Location
    Eve Online - Mining, Missions & Market Trading!
    Posts
    2,629

    Wink Re: Binary search for ANSI string

    Quote Originally Posted by Elroy View Post
    Code:
    
    Option Explicit
    
    Private Sub Form_Load()
    
        Dim bb() As Byte
        ReDim bb(3)
        bb(0) = 5
        Dim bb2() As Byte
        ReDim bb2(100)
        bb2(44) = 5
    
        Debug.Print InStrB(bb, bb2)
    
    End Sub
    
    
    You need to search the needle in the haystack, not the other way around!

    Code:
    Dim baNeedle() As Byte, baHaystack() As Byte
        baNeedle = StrConv("needle", vbFromUnicode)
        baHaystack = StrConv("There is a needle somewhere in this haystack!", vbFromUnicode)
        Debug.Print InStrB(baHaystack, baNeedle)

  3. #3
    PowerPoster dilettante's Avatar
    Join Date
    Feb 2006
    Posts
    24,487

    Re: Binary search for ANSI string

    Code:
    Private Sub Form_Load()
        Dim bb() As Byte
        ReDim bb(0)
        bb(0) = 5
        Dim bb2() As Byte
        ReDim bb2(100)
        bb2(44) = 5
    
        Debug.Print InStrB(bb2, bb)
    End Sub
    Calling InStrB() causes temporary String values to be created from bb and bb2.

  4. #4
    The Idiot
    Join Date
    Dec 2014
    Posts
    3,003

    Re: Binary search for ANSI string

    I remember asking about search method and I tried all kind of stuff
    like using my own loop, but it was slower than instr and the trick explained why.

    this is interesting: https://www.vbforums.com/showthread....xtCompare-mode
    and here is what the Trick pointed us if using instrB:
    https://www.cyberforum.ru/post16798157.html
    https://www.cyberforum.ru/post16785777.html

  5. #5

  6. #6
    PowerPoster dilettante's Avatar
    Join Date
    Feb 2006
    Posts
    24,487

    Re: Binary search for ANSI string

    Quote Originally Posted by The trick View Post
    No, it doesn't.
    How does it know the length of the arguments then? Does the SAFEARRAY/Vector descriptor get passed? I haven't tried disassembling to see.

  7. #7
    The Idiot
    Join Date
    Dec 2014
    Posts
    3,003

    Re: Binary search for ANSI string

    another thing that is interesting with instr is from this site
    http://www.xbeat.net/vbspeed/c_InStr.htm

    as u see when using vbTextCompare instr is slow and the author instr01 is faster.
    but using instr using binarycompare we see the power of it.

  8. #8

  9. #9

    Thread Starter
    PowerPoster Elroy's Avatar
    Join Date
    Jun 2014
    Location
    Near Nashville TN
    Posts
    10,911

    Re: Binary search for ANSI string

    Quote Originally Posted by VanGoghGaming View Post
    You need to search the needle in the haystack, not the other way around!
    Ahh, well that explains why it didn't work at all. But...

    Quote Originally Posted by dilettante View Post
    Calling InStrB() causes temporary String values to be created from bb and bb2.
    Hmm, if that's the case (and I've read the following posts and it appears it is), then using Byte arrays provides no advantage, and possibly a disadvantage.

    I guess I'll explore an API approach, possibly using RtlCompareMemory or something similar. I didn't realize that this issue was as complex as it is. Theoretical, it seems fairly straightforward, but I guess VB6 complicates it.

    Whatever I come up with doesn't need to be ultra-fast, just reasonably fast.
    Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.

  10. #10

    Thread Starter
    PowerPoster Elroy's Avatar
    Join Date
    Jun 2014
    Location
    Near Nashville TN
    Posts
    10,911

    Re: Binary search for ANSI string

    Ok, I've come up with this:

    Code:
    
    Option Explicit
    '
    Public Declare Function RtlCompareMemory Lib "ntdll.dll" (ByRef lpvSrc1 As Any, ByRef lpvSrc2 As Any, ByVal cbLen As Long) As Long
    '
    
    Public Function ByteComp(baHay() As Byte, baNeedle() As Byte) As Long
        ' Returns -1 if not found, else array index into baHay() of first byte of match.
        ' Both arrays are expected to be dimensioned with data and 0-based.
        ' baHay's Ubound MUST be >= to baNeedle's Ubound, or bad things.
        '
        Dim iLen As Long:   iLen = UBound(baNeedle) + 1&
        Dim iStop As Long:  iStop = UBound(baHay) - UBound(baNeedle)
        For ByteComp = 0& To iStop
            If RtlCompareMemory(baHay(ByteComp), baNeedle(0&), iLen) = iLen Then Exit Function
        Next
        ByteComp = -1&
    End Function
    
    
    
    And to test:

    Code:
    
    Option Explicit
    '
    
    
    Private Sub Form_Load()
        Dim hay() As Byte
        Dim needle() As Byte
    
        hay = StrConv("asdfasdfasdf_1234_qwerqwerqwer", vbFromUnicode)
        needle = StrConv("1234", vbFromUnicode)
    
        Debug.Print ByteComp(hay, needle)
    
        needle = StrConv("zzz", vbFromUnicode)
        Debug.Print ByteComp(hay, needle)
    
    End Sub
    
    
    If I keep everything as Byte arrays, I'm thinking that'll be faster than either Instr() or InstrB().

    Anyone think of additional improvements, or some vastly superior way?
    Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.

  11. #11
    PowerPoster VanGoghGaming's Avatar
    Join Date
    Jan 2020
    Location
    Eve Online - Mining, Missions & Market Trading!
    Posts
    2,629

    Re: Binary search for ANSI string

    You'd be hard-pressed to find a faster searching algorithm than InStr. Maybe you could speed it up by reading bigger chunks or the whole file into memory. If the file is too big then maybe memory-mapping it will help.

  12. #12
    PowerPoster dilettante's Avatar
    Join Date
    Feb 2006
    Posts
    24,487

    Re: Binary search for ANSI string

    You can make an effort to avoid coercion on each InStrB() call by assigning explicitly to String variables early and retaining/using them as long as they are useful.

    For example multiple blocks of data might need to be searched several times to look for each/any of several values. Convert your search targets once during initialization, convert each block once after reading it from disk.

  13. #13
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    5,263

    Re: Binary search for ANSI string

    Depending on complexity of needle and haystack: What about RegEx?
    Would avoid the nested loop
    https://regex101.com/r/PnLSsC/2
    Code:
    Sub main()
    Dim p As String
    Dim r As New RegExp
    Dim fc As MatchCollection
    Dim m As Match
    Dim s As String
        s = "asdfasdfasdf_1234_qwerqwerqwer"
        Debug.Print Len(s)
        r.Pattern = "(1234)"
        r.MultiLine = False
        r.Global = True
        Set fc = r.Execute(s)
        For Each m In fc
            If fc.count>0 Then
                 Debug.Print m.FirstIndex
                 Debug.Print m.Length
            Else
                 Debug.Print "Found nothing"
            End If
        Next
    End Sub
    FirstIndex is zero-based

    EDIT: The fc.Count>0-Check should be outside the For Each-Loop....
    Last edited by Zvoni; Aug 28th, 2023 at 01:13 AM.
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  14. #14
    The Idiot
    Join Date
    Dec 2014
    Posts
    3,003

    Re: Binary search for ANSI string

    u speed comparing would be nice here
    RtlCompareMemory seems to do byte comparison and count matching bytes.
    that is kind of a loop. surely its low-level and should be fast, but I think a comparison between the API and instr would be nice to have.

  15. #15

  16. #16
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    5,263

    Re: Binary search for ANSI string

    What about StrStr-Api?
    https://learn.microsoft.com/en-us/wi...hlwapi-strstra
    It's basically the C-Version of InStr
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  17. #17

  18. #18
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,454

    Re: Binary search for ANSI string

    FWIW, InstrB is significantly faster than the RTLCompare-function:
    (especially when used in "String-Mode", to avoid the conversions)

    Code:
    Private Sub Form_Click()
        Cls
        Dim i As Long, T As Single
        
        T = Timer
          Dim bHay() As Byte, bNeedle() As Byte
          bHay = StrConv("asdfasdfasdf_1234_qwerqwerqwer", vbFromUnicode)
          bNeedle = StrConv("1234", vbFromUnicode)
     
          For i = 1 To 10 ^ 6
             If ByteComp(bHay, bNeedle) <> 13 Then Stop
          Next
        Print "ByteComp: ", CLng((Timer - T) * 1000) & "msec"
        
        T = Timer
          Dim sHay As String, sNeedle As String
          sHay = StrConv("asdfasdfasdf_1234_qwerqwerqwer", vbFromUnicode)
          sNeedle = StrConv("1234", vbFromUnicode)
     
          For i = 1 To 10 ^ 6
             If InStrB(sHay, sNeedle) <> 14 Then Stop
          Next
        Print "InStrB: ", CLng((Timer - T) * 1000) & "msec"
    End Sub
    Olaf

  19. #19
    PowerPoster dilettante's Avatar
    Join Date
    Feb 2006
    Posts
    24,487

    Re: Binary search for ANSI string

    InStrB() has the advantage of being a "dumb" search that can accept out of band lengths and thus cope with NUL bytes.

    Other functions you might find are "smarter" as in:

    The lstrcmpi function uses a word sort, rather than a string sort. A word sort treats hyphens and apostrophes differently than it treats other symbols that are not alphanumeric, in order to ensure that words such as "coop" and "co-op" stay together within a sorted list. For a detailed discussion of word sorts and string sorts, see the Remarks section for the CompareString function.
    My guess is that for a dumb search in C you'd just write a loop of your own.

  20. #20
    PowerPoster
    Join Date
    Feb 2015
    Posts
    2,797

    Re: Binary search for ANSI string

    Quote Originally Posted by Schmidt View Post
    FWIW, InstrB is significantly faster than the RTLCompare-function:
    InstrB uses memchr which uses some bitwise improvements.
    This is the source code of InstrB:
    Code:
    char *__stdcall __vbaInStrB(int a1, BSTR bstrDst, BSTR bstrSrc, int lStart)
    {
      unsigned int dwSrcLen; // esi
      unsigned int dwOffset; // edi
      int dwDiff; // esi
      char *pbCur; // eax
      char *pEnd; // ebx
      char *pStart; // eax
      int dwDstLen; // [esp+Ch] [ebp-4h]
      unsigned int dwSizeDec; // [esp+Ch] [ebp-4h]
      int iChar; // [esp+24h] [ebp+14h]
    
      if ( bstrSrc )
        dwSrcLen = *((_DWORD *)bstrSrc - 1);
      else
        dwSrcLen = 0;
      if ( bstrDst )
        dwDstLen = *((_DWORD *)bstrDst - 1);
      else
        dwDstLen = 0;
      dwOffset = lStart - 1;
      if ( lStart - 1 < 0 )
        EbRaiseExceptionCode(5);
      if ( dwOffset < dwSrcLen )
      {
        if ( !dwDstLen )
          return (char *)lStart;
        dwDiff = dwSrcLen - dwDstLen;
        dwSizeDec = dwDstLen - 1;
        pbCur = (char *)bstrSrc + dwOffset;
        pEnd = (char *)bstrSrc + dwDiff + 1;
        iChar = *(char *)bstrDst;
        while ( pbCur < pEnd )
        {
          pStart = (char *)memchr(pbCur, iChar, pEnd - pbCur);
          if ( !pStart )
            break;
          pbCur = pStart + 1;
          if ( !memcmp(pbCur, (char *)bstrDst + 1, dwSizeDec) )
            return (char *)(pbCur - (char *)bstrSrc);
        }
      }
      return 0;
    }
    You can use your own implementation with memchr and RtlCompareMemory.

  21. #21

    Thread Starter
    PowerPoster Elroy's Avatar
    Join Date
    Jun 2014
    Location
    Near Nashville TN
    Posts
    10,911

    Re: Binary search for ANSI string

    Hmmm, I'm a bit shocked at how slow RtlCompareMemory is.

    Name:  Timings.png
Views: 561
Size:  4.1 KB

    Code for testing:
    Code:
    
    Option Explicit
    Private Declare Function RtlCompareMemory Lib "ntdll.dll" (ByRef lpvSrc1 As Any, ByRef lpvSrc2 As Any, ByVal cbLen As Long) As Long
    '
    
    Private Sub Form_Load()
        Cls
        Dim i As Long, T As Single
    
          Dim bHay() As Byte, bNeedle() As Byte
          bHay = StrConv("asdfasdfasdf_1234_qwerqwerqwer", vbFromUnicode)
          bNeedle = StrConv("1234", vbFromUnicode)
    
        T = Timer
          For i = 1 To 10 ^ 6
             If ByteComp(bHay, bNeedle) <> 13 Then Stop
          Next
        Print "ByteComp: ", CLng((Timer - T) * 1000) & " msec"
    
          Dim sHay As String, sNeedle As String
          sHay = StrConv("asdfasdfasdf_1234_qwerqwerqwer", vbFromUnicode)
          sNeedle = StrConv("1234", vbFromUnicode)
    
        T = Timer
          For i = 1 To 10 ^ 6
             If InStrB(sHay, sNeedle) <> 14 Then Stop
          Next
        Print "InStrB: ", CLng((Timer - T) * 1000) & " msec"
    
          sHay = "asdfasdfasdf_1234_qwerqwerqwer"
          sNeedle = "1234"
    
        T = Timer
          For i = 1 To 10 ^ 6
             If InStr(sHay, sNeedle) <> 14 Then Stop
          Next
        Print "InStr: ", CLng((Timer - T) * 1000) & " msec"
    End Sub
    
    Private Function ByteComp(baHay() As Byte, baNeedle() As Byte) As Long
        ' Returns -1 if not found, else array index into baHay() of first byte of match.
        ' Both arrays are expected to be dimensioned with data and 0-based.
        ' baHay's Ubound MUST be >= to baNeedle's Ubound, or bad things.
        '
        Dim iLen As Long:   iLen = UBound(baNeedle) + 1&
        Dim iStop As Long:  iStop = UBound(baHay) - UBound(baNeedle)
        For ByteComp = 0& To iStop
            If RtlCompareMemory(baHay(ByteComp), baNeedle(0&), iLen) = iLen Then Exit Function
        Next
        ByteComp = -1&
    End Function
    
    
    I guess I'll rework it to use InstrB and call it done.
    Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.

  22. #22

  23. #23

    Thread Starter
    PowerPoster Elroy's Avatar
    Join Date
    Jun 2014
    Location
    Near Nashville TN
    Posts
    10,911

    Re: Binary search for ANSI string

    Quote Originally Posted by The trick View Post
    Elroy, attach the project with the benchmark. Thanks.
    It's really just the above code, but I saved and zipped the project (attached).
    Attached Files Attached Files
    Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.

  24. #24
    PowerPoster VanGoghGaming's Avatar
    Join Date
    Jan 2020
    Location
    Eve Online - Mining, Missions & Market Trading!
    Posts
    2,629

    Talking Re: Binary search for ANSI string

    All these Rtl* functions from "ntdll" are not exactly known for their speed (for example the Rtl bit-shift functions). In most cases you are better off using something else, the most obvious exception being RtlMoveMemory (the ubiquitous "CopyMemory" that you can't live without, haha) but that's from kernel32 not ntdll...

  25. #25
    PowerPoster
    Join Date
    Aug 2010
    Location
    Canada
    Posts
    2,894

    Re: Binary search for ANSI string

    MapViewOfFile and a vanilla VB For loop with Remove Array Bounds Checks & Remove Integer Overflow Checks optimizations selected performs nicely when compiled:

    Name:  2023-08-28_19-02-52.jpg
Views: 552
Size:  17.9 KB

  26. #26

  27. #27
    PowerPoster
    Join Date
    Aug 2010
    Location
    Canada
    Posts
    2,894

    Re: Binary search for ANSI string

    Quote Originally Posted by The trick View Post
    Why memory-mapped files? Can you attach your code? I've made the several variants and change input data.
    I thought 4ms looked a bit too good to be true Looks like I messed up. I used a Windows API TLB to save time having vs. finding all the appropriate Declares. Everything worked OK the IDE, but when compiled CreateFileW was failing with error code 998. I wasn't testing for errors since it was just a quick test. Anyway, I put in all the appropriate Declares and dropped the TLB and now I'm getting ~30ms for the loop, so InStrB is still the champ (I'm getting ~20ms per run using InStrB).

    Here's the code I used in case it's of any interest, but its a lot of extra work to lose out to an intrinsic function, so I don't think it is very useful after all:

    Code:
    Option Explicit
    
    Private Type SAFEARRAY1D
       cDims As Integer
       fFeatures As Integer
       cbElements As Long
       cLocks As Long
       pvData As Long
       cElements1D As Long
       lLbound1D As Long
    End Type
    
    Private Declare Function CreateFile Lib "kernel32" Alias "CreateFileW" (ByVal lpFileName As Long, ByVal dwDesiredAccess As Long, ByVal dwShareMode As Long, ByVal lpSecurityAttributes As Long, ByVal dwCreationDisposition As Long, ByVal dwFlagsAndAttributes As Long, ByVal hTemplateFile As Long) As Long
    Private Declare Function CreateFileMapping& Lib "kernel32" Alias "CreateFileMappingW" (ByVal hFile&, ByVal secAttr&, ByVal Protect&, ByVal SizeHigh&, ByVal SizeLow As Long, ByVal MapName As Long)
    Private Declare Function MapViewOfFile& Lib "kernel32" (ByVal hFile&, ByVal DesAccess&, ByVal OffsHigh&, ByVal OffsLow&, ByVal NumberOfBytesToMap As Long)
    Private Declare Function GetFileSize Lib "kernel32" (ByVal hFile&, ByVal lpFileSizeHigh As Long) As Long
    Private Declare Sub BindArray Lib "kernel32" Alias "RtlMoveMemory" (PArr() As Any, pSrc&, Optional ByVal CB& = 4)
    Private Declare Sub ReleaseArray Lib "kernel32" Alias "RtlMoveMemory" (PArr() As Any, Optional pSrc& = 0, Optional ByVal CB& = 4)
    Private Declare Function UnmapViewOfFile& Lib "kernel32" (ByVal lpBaseAddress As Long)
    Private Declare Function CloseHandle& Lib "kernel32" (ByVal hObj As Long)
    
    Private Const FILE_MAP_ALL_ACCESS& = &HF001F
    Private Const FILE_MAP_READ& = &H4
    Private Const FILE_SHARE_READ& = &H1
    Private Const GENERIC_READ = &H80000000
    Private Const OPEN_EXISTING = &H3
    Private Const PAGE_READONLY = &H2
    
    Private Sub Form_Click()
       Dim T As Single
       Dim i As Long
       Dim j As Long
       
       Dim la_Hay() As Byte, la_Needle() As Byte
       Dim la_SharedArray() As Byte
       Dim lt_SharedArray As SAFEARRAY1D
       Dim l_HFile As Long
       Dim l_HMap As Long
       Dim l_SearchOffset As Long
       
       Cls
       
       la_Needle = StrConv("1234", vbFromUnicode)
       
       If Dir$(App.Path & "\searchme.txt") = vbNullString Then
          Open App.Path & "\searchme.txt" For Binary Access Write As #1
          Put #1, , "asdfasdfasdf_1234_qwerqwerqwer"
          Close #1
       End If
       
       l_HFile = CreateFile(StrPtr(App.Path & "\searchme.txt"), GENERIC_READ, FILE_SHARE_READ, 0, OPEN_EXISTING, 0, 0)
       l_HMap = CreateFileMapping(l_HFile, 0, PAGE_READONLY, 0, 0, 0)
       
       'Initialize safearray-descriptor
       With lt_SharedArray
          .cDims = 1   'one-dimensional array
          .cbElements = 1   'byte array, 1 byte per element
          .lLbound1D = 1   'set the LBound of the Array to 1-based
          .cElements1D = GetFileSize(l_HFile, 0)
          .pvData = MapViewOfFile(l_HMap, FILE_MAP_READ, 0, 0, .cElements1D)
       End With
    
       BindArray la_SharedArray, VarPtr(lt_SharedArray)
       
       T = Timer
       
       For i = 1 To 10 ^ 6
          l_SearchOffset = 0
          For j = 1 To UBound(la_SharedArray)
             If la_SharedArray(j) = la_Needle(l_SearchOffset) Then
                l_SearchOffset = l_SearchOffset + 1
                
                If l_SearchOffset > UBound(la_Needle) Then
                   j = j - UBound(la_Needle) - 1
                   Exit For
                End If
             
             Else
                l_SearchOffset = 0
             End If
          Next j
       Next i
       
       Print "MapViewOfFileLoop: ", CLng((Timer - T) * 1000) & " msec"
    
       ReleaseArray la_SharedArray
       UnmapViewOfFile lt_SharedArray.pvData
       CloseHandle l_HMap
       CloseHandle l_HFile
    End Sub

  28. #28
    PowerPoster
    Join Date
    Feb 2015
    Posts
    2,797

    Re: Binary search for ANSI string

    I've added bitwise trick:

    Code:
    Public Function CompBitwise( _
                    baHay() As Byte, _
                    baNeedle() As Byte) As Long
        Dim bIsIDE  As Boolean
        Dim lFChar  As Long
        Dim lFCharD As Long
        Dim tLDesc  As SAFEARRAY1D
        Dim lData() As Long
        Dim lNSize  As Long
        Dim lTotal  As Long
        Dim lIndex  As Long
        Dim lDIndex As Long
        Dim lBits   As Long
        Dim lTest   As Long
        Dim lPos    As Long
        Dim lIndex2 As Long
        
        Debug.Assert MakeTrue(bIsIDE)
        
        lFChar = baNeedle(0)
        lNSize = UBound(baNeedle)
        lTotal = UBound(baHay) - lNSize + 1
        CompBitwise = -1
        
        If bIsIDE Then
        
            lFCharD = lFChar * &H10101
            
            If lFChar And &H80 Then
                lFCharD = lFCharD Or ((lFChar And &H7F) * &H1000000) Or &H80000000
            Else
                lFCharD = lFCharD Or (lFChar * &H1000000)
            End If
            
        Else
            lFCharD = lFChar * &H1010101
        End If
        
        With tLDesc
            .cbElements = 4
            .cDims = 1
            .fFeatures = 1
            .pvData = VarPtr(baHay(0))
            .tBounds.cElements = lTotal \ 4
            PutMem4 ByVal ArrPtr(lData), VarPtr(tLDesc)
        End With
        
        Do While lIndex < lTotal
         
            lDIndex = lIndex
            
            Do While lDIndex And &H3
            
                If baHay(lDIndex) = lFChar Then
                    lPos = lDIndex
                    GoTo first_found
                End If
                
                lDIndex = lDIndex + 1
                
            Loop
            
            lDIndex = lDIndex \ 4
            
            For lDIndex = lDIndex To tLDesc.tBounds.cElements - 1
    
                If bIsIDE Then
                
                    lBits = lData(lDIndex) Xor lFCharD
                    
                    If (lBits And &HFF) = 0 Then
                        lPos = lDIndex * 4
                        GoTo first_found
                    ElseIf (lBits And &HFF00&) = 0 Then
                        lPos = lDIndex * 4 + 1
                        GoTo first_found
                    ElseIf (lBits And &HFF0000) = 0 Then
                        lPos = lDIndex * 4 + 2
                        GoTo first_found
                    ElseIf (lBits And &HFF000000) = 0 Then
                        lPos = lDIndex * 4 + 3
                        GoTo first_found
                    End If
    
                Else
                    
                    lBits = lData(lDIndex) Xor lFCharD
                    lTest = ((lBits + &H7EFEFEFF) Xor (Not lBits)) And &H81010100
                    
                    If lTest Then
                        
                        If lTest And &H100 Then
                            lPos = lDIndex * 4
                        ElseIf lTest And &H10000 Then
                            lPos = lDIndex * 4 + 1
                        ElseIf lTest And &H1000000 Then
                            lPos = lDIndex * 4 + 2
                        ElseIf (lBits And &HFF000000) = 0 Then
                            lPos = lDIndex * 4 + 3
                        End If
     
                        GoTo first_found
                        
                    End If
                    
                End If
                
            Next
                
            Exit Do
            
    first_found:
    
            If lPos > lTotal Then
                Exit Do
            End If
    
            For lIndex2 = 1 To lNSize
                
                If baHay(lPos + lIndex2) <> baNeedle(lIndex2) Then
                    lIndex = lPos + 1
                    GoTo continue
                End If
                
            Next
            
            CompBitwise = lPos
            
            Exit Do
            
    continue:
            
        Loop
    
    End Function
    
    Private Function MakeTrue( _
                     ByRef bValue As Boolean) As Boolean
        bValue = True
        MakeTrue = True
    End Function
    When compiled with full optimizations:

    Attached Files Attached Files

  29. #29
    Addicted Member
    Join Date
    Feb 2022
    Posts
    217

    Re: Binary search for ANSI string

    Quote Originally Posted by The trick View Post
    InstrB uses memchr which uses some bitwise improvements.
    ...
    Off topic, sorry gang... @the trick : what (code indenter / formatter tool) are you using to make the code look so pleasant on that forum?
    I am harboring a strong suspicion that you wrote the tool as an add-in for yourself, but I hereby plead for you to release it in the code bank, (or at least me personally, haha) Cheers
    Last edited by taishan; Sep 1st, 2023 at 09:47 PM.

  30. #30
    The Idiot
    Join Date
    Dec 2014
    Posts
    3,003

    Re: Binary search for ANSI string

    we could say that instr/instrb are really well optimized for VB6
    and hard to match. especially instrb.

    but with the help of the experts, sometimes its possible to give it a fight, as showned with CompBitwise.
    this is something I have been trying to find myself, bytearray-search instead of string-search, since I work mostly in byte nowadays.

    as always, good work The trick. once again u showed that its possible to make something really good in VB6 that can match low-level routines.

  31. #31
    PowerPoster
    Join Date
    Feb 2015
    Posts
    2,797

    Re: Binary search for ANSI string

    Quote Originally Posted by taishan View Post
    Off topic, sorry gang... @the trick : what (code indenter / formatter tool) are you using to make the code look so pleasant on that forum?
    I am harboring a strong suspicion that you wrote the tool as an add-in for yourself, but I hereby plead for you to release it in the code bank, (or at least me personally, haha) Cheers
    Hello. Yes i have a private Add-in but i don't publish it because it has bugs and not tested well. I've attached here you can fix it if you need.
    Attached Files Attached Files

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width