Thread: Fastest way to search for text in a file.

  1. #1

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2003
    Location
    The Future - Skynet
    Posts
    1,157

    Fastest way to search for text in a file.

    I have a binary file that is about 20 MB, and a list of strings. I need to check whether each string exists in the file. I tried InStr with vbTextCompare (the search needs to be case-insensitive), but it is extremely slow.


    BTW, I did search for this already, but most of the results use InStr with vbTextCompare.
    I'll Be Back!

    T-1000

    Microsoft .Net 2005
    Microsoft Visual Basic 6
    Prefer using API

  2. #2
    Fanatic Member Mxjerrett's Avatar
    Join Date
    Apr 2006
    Location
    Oklahoma
    Posts
    939

    Re: Fastest way to search for text in a file.

    InStr will probably be your best choice in this case. I know it is pretty slow, but there probably aren't many other choices.

    If a post has been helpful please rate it.
    If your question has been answered, pull down the Thread Tools and mark it as resolved.

  3. #3
    PowerPoster
    Join Date
    Dec 2004
    Posts
    25,618

    Re: Fastest way to search for text in a file.

    check out binary compare functions posted in codebank
    i do my best to test code works before i post it, but sometimes am unable to do so for some reason, and usually say so if this is the case.
    Note code snippets posted are just that and do not include error handling that is required in real world applications, but avoid On Error Resume Next

    dim all variables as required as often i have done so elsewhere in my code but only posted the relevant part

    come back and mark your original post as resolved if your problem is fixed
    pete

  4. #4

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2003
    Location
    The Future - Skynet
    Posts
    1,157

    Re: Fastest way to search for text in a file.

    Thanks Mxjerrett and WestConn

    Quote Originally Posted by westconn1
    check out binary compare functions posted in codebank
    West, several came up on the search. Only problem with binary is that it is case sensitive.

  5. #5
    Fanatic Member
    Join Date
    Jun 2001
    Location
    Oregon
    Posts
    643

    Re: Fastest way to search for text in a file.

    Quote Originally Posted by Liquid Metal
    Thanks Mxjerrett and WestConn

    West, several came up on the search. Only problem with binary is that it is case sensitive.
    Apply UCase() or LCase() to both strings before comparing, then use a binary compare.
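
    A minimal sketch of that suggestion, assuming the file data is already in a String. The function name CountPresent and its parameters are made up for illustration. It lower-cases the large string once and then runs a binary InStr for each search term:

```vb
' Hypothetical helper: counts how many of the search terms occur in strData,
' ignoring case, by lower-casing both sides and using a binary compare.
Public Function CountPresent(ByVal strData As String, strTerms() As String) As Long
    Dim strLower As String
    Dim i As Long

    strLower = LCase$(strData)          ' convert the large string only once

    For i = LBound(strTerms) To UBound(strTerms)
        ' vbBinaryCompare compares raw character codes, which is the fast path
        If InStr(1, strLower, LCase$(strTerms(i)), vbBinaryCompare) > 0 Then
            CountPresent = CountPresent + 1
        End If
    Next i
End Function
```

    Note that an empty search term counts as present here, because InStr returns the start position when the string being searched for is zero-length.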

  6. #6
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Fastest way to search for text in a file.

    Take a look at this function: InBArrBM

    Its advantage over InStr is largest with TextCompare. It isn't optimal (I could make it better these days), but my guess is that it is much better than InStr for what you're doing.

    It uses the Boyer-Moore algorithm, which can skip ahead in the data instead of testing every position the way the brute-force search in InStr does. The better algorithm is why it can be faster than InStr.
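
    To show the idea of skipping ahead, here is a rough sketch of the Horspool simplification of Boyer-Moore over a byte array. It is not the InBArrBM function mentioned above. The name FindBytesNoCase is made up, and the sketch assumes an initialized byte array, a pattern of plain ASCII characters, and case folding for A-Z only:

```vb
' Hypothetical sketch: Boyer-Moore-Horspool search in a byte array,
' ignoring case for the ASCII letters A-Z only.
' Returns the index of the first match in bytData, or -1 if there is none.
Public Function FindBytesNoCase(bytData() As Byte, ByVal strPattern As String) As Long
    Dim bytPat() As Byte
    Dim bytFold(0 To 255) As Byte   ' maps each byte value to its lower-case form
    Dim lngSkip(0 To 255) As Long   ' shift distance for each folded byte value
    Dim lngLen As Long
    Dim lngPos As Long
    Dim i As Long

    FindBytesNoCase = -1
    If Len(strPattern) = 0 Then Exit Function

    ' Pattern as lower-case single-byte characters
    bytPat = StrConv(LCase$(strPattern), vbFromUnicode)
    lngLen = UBound(bytPat) + 1

    ' Fold table: A-Z (65-90) map to a-z (97-122), all other bytes map to themselves
    For i = 0 To 255
        If i >= 65 And i <= 90 Then bytFold(i) = i + 32 Else bytFold(i) = i
    Next i

    ' Default shift is the full pattern length; bytes that occur in the pattern
    ' (other than at its last position) get a shorter shift
    For i = 0 To 255
        lngSkip(i) = lngLen
    Next i
    For i = 0 To lngLen - 2
        lngSkip(bytPat(i)) = lngLen - 1 - i
    Next i

    lngPos = LBound(bytData)
    Do While lngPos + lngLen - 1 <= UBound(bytData)
        ' Compare the pattern against the data from right to left
        i = lngLen - 1
        Do While i >= 0
            If bytFold(bytData(lngPos + i)) <> bytPat(i) Then Exit Do
            i = i - 1
        Loop
        If i < 0 Then
            FindBytesNoCase = lngPos
            Exit Function
        End If
        ' Shift by the table entry for the data byte under the pattern's last position
        lngPos = lngPos + lngSkip(bytFold(bytData(lngPos + lngLen - 1)))
    Loop
End Function
```

    Because the case folding happens through a lookup table during the comparison, the data itself never has to be converted with LCase$ or copied into a String.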

  7. #7

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2003
    Location
    The Future - Skynet
    Posts
    1,157

    Re: Fastest way to search for text in a file.

    hi Merri,

    I got it to work, but I'm not sure exactly what it is benchmarking. I believe it compares the two command buttons, and does the listbox contain the data type? Can you explain it a little?

  8. #8

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2003
    Location
    The Future - Skynet
    Posts
    1,157

    Re: Fastest way to search for text in a file.

    Quote Originally Posted by PMad
    UCase() or LCase() both strings before comparing
    Good idea, and it worked. The only problem now is that I have to break the file up into small chunks in case it has to handle a really big file.

    Thanks

  9. #9
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Fastest way to search for text in a file.

    Well, ignore the benchmarker and just rip the function out.

    Note that doing UCase$ or LCase$ to an entire massive file, be it in small or big chunks, is very slow.

  10. #10

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2003
    Location
    The Future - Skynet
    Posts
    1,157

    Re: Fastest way to search for text in a file.

    Quote Originally Posted by Merri
    Note that doing UCase$ or LCase$ to an entire massive file, be it in small or big chunks, is very slow.
    Completely agree. Actually, just loading an entire file into memory is already bad enough. Do you know of an example that uses Get to read by chunk? I am tinkering with one right now but haven't been able to get it to work yet.

  11. #11
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Fastest way to search for text in a file.

    With a byte array you simply dimension the array to the chunk size, then keep reading the file until the number of bytes left to read is smaller than the size of a chunk. If there are more than zero bytes left, resize the byte array to fit and read the last bytes. You can use FileLen to get the length of a file into a variable.
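
    Those steps could look roughly like this. It is only a sketch: the procedure name ReadInChunks and the 64 KB chunk size are assumptions, and because FileLen returns a Long, it only suits files under 2 GB:

```vb
' Hypothetical sketch: read a file in fixed-size chunks into a byte array.
Public Sub ReadInChunks(ByVal strPath As String)
    Const CHUNK_SIZE As Long = 65536    ' assumed chunk size in bytes
    Dim bytChunk() As Byte
    Dim lngRemaining As Long
    Dim intFile As Integer

    lngRemaining = FileLen(strPath)
    intFile = FreeFile
    Open strPath For Binary Access Read As #intFile

    ' Full chunks: the array keeps the same size, so it is dimensioned once
    ReDim bytChunk(0 To CHUNK_SIZE - 1)
    Do While lngRemaining >= CHUNK_SIZE
        Get #intFile, , bytChunk        ' no position given: continues where the last Get ended
        ' ... process bytChunk here ...
        lngRemaining = lngRemaining - CHUNK_SIZE
    Loop

    ' Last partial chunk, if any
    If lngRemaining > 0 Then
        ReDim bytChunk(0 To lngRemaining - 1)
        Get #intFile, , bytChunk
        ' ... process bytChunk here ...
    End If

    Close #intFile
End Sub
```

    When the chunks are being searched, keep in mind that this plain version does not overlap them, so a match that straddles a chunk boundary would be missed unless the last few bytes of each chunk are carried over into the next one.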

  12. #12

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2003
    Location
    The Future - Skynet
    Posts
    1,157

    Re: Fastest way to search for text in a file.

    I can't remember where I ripped this code from, but it was from one of the members here who was helping to shred a file. I tweaked the code to my needs. Can you check the logic and let me know?

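    Code:
        Const cstrSearch As String = "winxml"   ' must be lower case: the data is lower-cased before searching
        Const clngMaxSize As Long = 1024        ' bytes read per pass

        ' Open file
        Dim intFreeFile As Integer
        intFreeFile = FreeFile
        Open strShredFile For Binary Access Read As #intFreeFile

        ' Get total length
        Dim lngLOF As Long
        lngLOF = LOF(intFreeFile)

        ' Consecutive chunks overlap by Len(cstrSearch) - 1 bytes, so a match that
        ' straddles a chunk boundary is still found and no match is counted twice
        Dim lngOverlap As Long
        lngOverlap = Len(cstrSearch) - 1

        Dim byteArr() As Byte
        Dim strData As String
        Dim lngPos As Long          ' 1-based file position of the current chunk
        Dim lngBytes As Long        ' number of bytes in the current chunk
        Dim lngFound As Long        ' position of the latest match within strData
        Dim lngCounter As Long      ' total number of matches

        lngPos = 1
        Do While lngPos <= lngLOF
            ' The final chunk may be shorter than clngMaxSize
            lngBytes = lngLOF - lngPos + 1
            If lngBytes > clngMaxSize Then lngBytes = clngMaxSize
            ReDim byteArr(0 To lngBytes - 1)

            Get #intFreeFile, lngPos, byteArr
            ' Assumes single-byte (ANSI) text: StrConv maps each byte to one character.
            ' Assigning the byte array straight to a String would pair the bytes up as UTF-16.
            strData = LCase$(StrConv(byteArr, vbUnicode))

            lngFound = 0
            Do
                lngFound = InStr(lngFound + 1, strData, cstrSearch, vbBinaryCompare)
                If lngFound = 0 Then Exit Do

                lngCounter = lngCounter + 1
            Loop

            ' Stop after the chunk that reaches the end of the file
            If lngPos + lngBytes > lngLOF Then Exit Do

            ' Step back by the overlap before reading the next chunk
            lngPos = lngPos + lngBytes - lngOverlap
        Loop
        Close #intFreeFile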