Results 1 to 27 of 27

Thread: [RESOLVED] Find string in range of Byte Array

  1. #1

    Thread Starter
    Addicted Member sergeos's Avatar
    Join Date
    Apr 2009
    Location
    Belarus
    Posts
    162

    Resolved [RESOLVED] Find string in range of Byte Array

    I read a very large csv file into a byte array.
    During reading, i find a carriage return charcode and now I need to find a string in a specific range of the byte array.
    I know the starting address of the range and the end address. How fast can I find a string pattern in an array of bytes?
    Ten Years After - 01 You Give Me Loving

  2. #2
    Fanatic Member
    Join Date
    Jan 2013
    Posts
    894

    Re: Find string in range of Byte Array

    open the file in binary mode #1
    set a string variable = space$(filelen(path+filename)) or faster use space$(lof(1)) if file is already opened.

    get #1, , string_variable

    use x= INSTR(string_variable,search_reason)

    x= position if found
    x= 0 if not found

    don't forget to close the file with CLOSE #1



    the only one thing, as VB uses UNICODE (16bits), it will get the double in size in RAM than if holding in a byte array.

    But, you can do fast searches, using INSTR than looping in a byte array.

    Well, there is another way, which is very low level calls, get the array pointer, and do a low level CPU find routine, equivalent to instr(), but needs to handle pointers varptr iirc, (in C is &).

    The best way is to compile in mASM a little routine for that, but IIRC Windows has a kernel function to do the same, it just needs pointers and buffer size, But can't recall its name.
    Last edited by flyguille; Mar 19th, 2017 at 08:42 PM.

  3. #3

    Thread Starter
    Addicted Member sergeos's Avatar
    Join Date
    Apr 2009
    Location
    Belarus
    Posts
    162

    Re: Find string in range of Byte Array

    The comma-separated file contains records:
    Category Available Quantity
    House Yes 10
    Apartment Yes 9
    Machine No 0
    Apartment No 8
    And to me from this file it is necessary to select records where there is a field an 'Apartment'. Then paste in new file.
    VarPtr is better solution maybe.
    Ten Years After - 01 You Give Me Loving

  4. #4
    Fanatic Member
    Join Date
    Jan 2013
    Posts
    894

    Re: Find string in range of Byte Array

    Quote Originally Posted by sergeos View Post
    The comma-separated file contains records:
    Category Available Quantity
    House Yes 10
    Apartment Yes 9
    Machine No 0
    Apartment No 8
    And to me from this file it is necessary to select records where there is a field an 'Apartment'. Then paste in new file.
    VarPtr is better solution maybe.
    it is text format with CRCL line termination?

    it is easy, seems like a programming test for the first year students???

    Any computer since 70's can do that in basic.

    open source$ for input as #1
    open dest$ for output as #2
    while eof(1) = false
    line input #1, a$
    if instr(a$,"Apartment")=1 then PRINT#2, a$
    wend
    close #2
    close #1


    DONE!.

  5. #5

    Thread Starter
    Addicted Member sergeos's Avatar
    Join Date
    Apr 2009
    Location
    Belarus
    Posts
    162

    Re: Find string in range of Byte Array

    Quote Originally Posted by flyguille View Post
    it is text format with CRCL line termination?

    it is easy, seems like a programming test for the first year students???

    Any computer since 70's can do that in basic.
    Smile. Did you try your code on file over 100-150MB? Termination symbol is vbLf
    New Computers lost few hours for that work.
    70-s goes to eternal peace...

    If it was so easy I would not create a topic.
    Ten Years After - 01 You Give Me Loving

  6. #6
    PowerPoster Arnoutdv's Avatar
    Join Date
    Oct 2013
    Posts
    5,871

    Re: Find string in range of Byte Array

    You already have the data in a byte array.
    Then search for the byte with value '10' starting on offset 0.
    Then check if the first 9 bytes of the starting on offset matches 'Apartment'
    If so copy the bytes from offset till "found location -1" to a new file and append a byte with value '10' (vbLf)

  7. #7
    Fanatic Member
    Join Date
    Jan 2013
    Posts
    894

    Re: Find string in range of Byte Array

    Quote Originally Posted by sergeos View Post
    Smile. Did you try your code on file over 100-150MB? Termination symbol is vbLf
    New Computers lost few hours for that work.
    70-s goes to eternal peace...

    If it was so easy I would not create a topic.
    I didn't said it is the faster way. Because it reads one text line at a time.

    and is one disk I/O operation per line, two i/o operation if found the target, this is the slower method.
    Last edited by flyguille; Mar 20th, 2017 at 09:15 AM.

  8. #8
    Fanatic Member
    Join Date
    Jan 2013
    Posts
    894

    Re: Find string in range of Byte Array

    ok there goes, air coding.....

    dim b() as byte

    open source$ for binary access read #1
    redim b(0 to lof(1)-1)
    get #1,, b()
    close #1


    open dest$ for binary access write lock read as #1
    x=0

    on error goto ENDing ' IF compiled without arrays boundaries checks for faster execution it won't works.

    while x<= ubound(b)
    if b(x)=&h41 then if b(x+1)=&h70 then if b(x+2)=&h61 then if b(x+3)=&h72 then if b(x+4)=&h74 then if b(x+5) =&h6d then if b(x+6) = &h65 then if b(x+7)=&h6e then if b(x+8) = &h74 thenGosub SendLine

    x=x+1
    wend

    close #1

    exit sub



    SendLine:
    if x=0 then else if b(x-1) = 10 then else RETURN

    while x<=ubound(b)
    put#1, ,b(x)
    if b(x)=10 then RETURN
    x=x+1
    wend

    x=x-1:RETURN



    ENDing:

    close #1

    end sub

  9. #9
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,541

    Re: Find string in range of Byte Array

    Just a thought. Have you tried to use ADO to read the csv file? It would have the advantage to filter on the Category header along with handling vbLf (chr 10) and/or vbCr (chr 13) terminated lines. Thinking no need to "reinvent the wheel" if not absolutely necessary.

    Edited: Another advantage suits your needs. The filtered recordset can be written to disk as a separate file, i.e., snippet:
    Code:
     ' establish ADO connection to the folder containing csv
    Set RS = New ADODB.RecordSet
    strSQL = "Select * FROM [" & filename+extension & "] WHERE (Category = 'Apartment')"
    RS.Open strSQL, [connection], adOpenForwardOnly, adLockReadOnly, adCmdText
    fNr = FreeFile()
    Open [new file] For Output As #fNr
    Print #fNr, RS.GetString(adClipString, , ","); ' can error if no records returned
    Close #fNr
    RS.Close
    ' close connection & set ADODB objects to Nothing
    Headers won't be included, but you can iterate through the RS.Fields collection to print those to file beforehand if needed. Plenty of examples in this forum on connecting to and reading csv files with ADO
    Last edited by LaVolpe; Mar 20th, 2017 at 04:29 PM.
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  10. #10
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: Find string in range of Byte Array

    You could use the Jet 4.0 Text IISAM to do this, and by selecting into a new table (file) you can get the header row as well. It should also be faster that way than sucking it into RAM all at once and using GetString() on a Recordset.

    But for files of any substantial size it won't be speedy. Not bad but not as fast as you seem to hope for.


    I tweaked an existing Text I/O class to support raw ANSI Text I/O as well as ANSI and Unicode. Then I applied a few more optimizations. That along with a few other small tricks to avoid creating tons of small String values got the time down a bit. It may not be bug-free but so far it seems to pass the tests I have run.

    There are two Projects here: MakeFile and FilterFile.

    MakeFile creates a sample file of 8,000,000 rows, about 105MB. The compiled EXE takes around 25 seconds to run on this PC, faster machines should run faster since it is mostly CPU-bound.

    FilterFile reads that created sample file. It copies the column header row, then copies data rows where the first column (Category) = "Apartment" to a new file. The compiled EXE takes around 10 seconds to run on this PC.


    Code:
    Option Explicit
    
    Private Declare Function GetTickCount Lib "kernel32" () As Long
    
    'Case-sensitive:
    Private Declare Function StrCmpN Lib "shlwapi" Alias "StrCmpNA" ( _
        ByVal lpStr1 As Long, _
        ByVal lpStr2 As Long, _
        ByVal nChar As Long) As Long
    
    'Case insensitive:
    'Private Declare Function StrCmpNI Lib "shlwapi" Alias "StrCmpNIA" ( _
    '    ByVal lpStr1 As Long, _
    '    ByVal lpStr2 As Long, _
    '    ByVal nChar As Long) As Long
    
    Private Sub Main()
        Dim T0 As Long
        Dim Row As String
        Dim htfOut As HugeTextFile
        Dim AnsiComma As String
        Dim AnsiApartment As String
        Dim CommaPos As Long
        Dim Apartments As Long
        
        T0 = GetTickCount()
        ChDir App.Path
        ChDrive App.Path
        With New HugeTextFile
            .OpenFile "DataOrig.csv", _
                      Access:=HTF_ACCESS_READ, _
                      LfNewLine:=True, _
                      Encoding:=HTF_ENC_ANSIRAW
            Row = .ReadLine() 'Colheader row.
            Set htfOut = New HugeTextFile
            htfOut.OpenFile "DataFiltered.csv", _
                            Access:=HTF_ACCESS_WRITE, _
                            LfNewLine:=True, _
                            Encoding:=HTF_ENC_ANSIRAW
            htfOut.WriteLine Row
            AnsiComma = StrConv(",", vbFromUnicode)
            AnsiApartment = StrConv("Apartment", vbFromUnicode)
            Do Until .EOF
                Row = .ReadLine()
                'This logic assumes:
                '
                '   o Category field values begin in column 1 and have no spaces
                '     between the value and its delimiting comma, i.e. no trimming
                '     is needed.
                '
                '   o Category field values have expected case (case-sensitive
                '     compare is made).
                CommaPos = InStrB(1, Row, AnsiComma)
                If StrCmpN(StrPtr(Row), StrPtr(AnsiApartment), CommaPos - 1) = 0 Then
                    Apartments = Apartments + 1
                    htfOut.WriteLine Row
                End If
            Loop
            htfOut.CloseFile
            .CloseFile
        End With
        MsgBox "Done:" & vbNewLine _
             & vbNewLine _
             & Format$(Apartments, "#,##0") & " Apartments copied" & vbNewLine _
             & vbNewLine _
             & Format$(CSng(GetTickCount() - T0) / 1000, "0.0##") & " seconds"
    End Sub
    Even if you don't want to use this code it may have a few ideas in it that you could make use of.
    Attached Files Attached Files
    Last edited by dilettante; Mar 21st, 2017 at 12:29 PM.

  11. #11
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,541

    Re: Find string in range of Byte Array

    You could use the Jet 4.0 Text IISAM to do this, and by selecting into a new table (file) you can get the header row as well. It should also be faster that way than sucking it into RAM all at once and using GetString() on a Recordset.
    Just FYI to others. GetString isn't truly necessary. The data can be extracted from source to destination, in same format, same headers, just using an ADO text connection and executing a single statement. Something as simple as:
    Code:
    SELECT * INTO [%DestFileName%] IN '' [text;Database=%DestPath%] FROM 
        [%SourceTableName%] WHERE (Category = 'Apartment')
    The placeholders above would need to be provided. A schema.ini should be provided for the source file to ensure no oddities like guessing wrong on source data types
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  12. #12
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: Find string in range of Byte Array

    I don't have anything in my schema notes to suggest that you can specify the row delimiter though. You'd probably always get CRLFs in the output file.

  13. #13
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,541

    Re: Find string in range of Byte Array

    I don't have anything in my schema notes to suggest that you can specify the row delimiter though.
    Question for me? If so, the answer is no and good point if the default delimiter isn't desired. If that is the case, then my last post isn't a player. The GetString method does allow that, though as you suggest, performance may be an issue. I don't know if OLEDB sucks up the entire 150mb csv at once or not when a where clause is used. If the schema.ini has MaxScanRows=0, then yes. Otherwise, if you say so, I'll defer to your statement. Just a bit surprised no one considered ADO as a possible solution.
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  14. #14
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: Find string in range of Byte Array

    I had looked at it and performance was pretty bad just using it to create a large sample file. Bad enough that I gave it up, though more like 120 seconds than "a few hours."

    I misread earlier posts and thought that 70 seconds had been achieved and rejected as too slow. Considering the earlier posts again the Jet Text IISAM (ADO or even DAO) might be acceptable.

    That is a decent general tool for this sort of thing.

  15. #15
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,541

    Re: Find string in range of Byte Array

    That is a decent general tool for this sort of thing.
    Agreed, key word being general. Custom solutions often out perform general libraries. Besides, depending on the actual format of the source csv, ADO likely won't exactly duplicate the output to match the source. Example: if the source has some fields quote delimited, others not, for example. That may or may not be a requirement.
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  16. #16

    Thread Starter
    Addicted Member sergeos's Avatar
    Join Date
    Apr 2009
    Location
    Belarus
    Posts
    162

    Re: Find string in range of Byte Array

    LaVolpe, Can't open CSV with your method. Only one column selecting into RS. Maybe separator must be set, don't know.

    dilettante, no bad! Your case faster by me on 7 sec. My code is:
    Code:
    Public Function GetRecord(ByRef Fields() As Byte) As Boolean
        
        Dim arrTemp()   As Byte
        Dim lPOint      As Long
        
        Const SEARCH_CRITERIA As String = "Apartment"
        ReDim arrTemp(Len(SEARCH_CRITERIA) - 1)
        
        Do
            If BufNextChar >= BufSize Then
                'Need to read the next buffer of data.
                If FileCharsLeft <= 0 Then
                    'No more data to read: make sure we didn't have a last
                    'record with no CRLF, or exit with EOF result.
                    Exit Function
                'End If
                Else
                'If FileCharsLeft < BufSize Then
                    'Final buffer is smaller than the others.
                    BufSize = FileCharsLeft
                End If
                
                ReDim Buf(BufSize - 1) 'Undo previous Redim Preserve.
                Get #F, , Buf
                ReDim Preserve Buf(BufSize) 'We need to append a terminating 0 byte.
                FileCharsLeft = FileCharsLeft - BufSize
                BufNextChar = 0
                lPOint = 0
            End If
            
            Do Until BufNextChar >= BufSize
                If Buf(BufNextChar) = Delim Or Buf(BufNextChar) = CR Then
                    CopyMemory arrTemp(0), ByVal VarPtr(Buf(lPOint)), LenB(Buf(lPOint)) * (UBound(arrTemp) + 1)
                    
                    If StrConv(arrTemp, vbUnicode) = SEARCH_CRITERIA Then
                        If (Not Fields) = -1 Then
                            ReDim Preserve Fields(BufNextChar - lPOint + 0) ' UBound(Fields))
                        Else
                            ReDim Preserve Fields(BufNextChar - lPOint + UBound(Fields) + 1)
                        End If
                        CopyMemory Fields(UBound(Fields) - (BufNextChar - lPOint)), ByVal VarPtr(Buf(lPOint)), LenB(Buf(lPOint)) * (BufNextChar - lPOint + 1)
                    End If
                                       
                    lPOint = BufNextChar + 1
                                            
                End If
                
                BufNextChar = BufNextChar + 1 'Dirty chars and commas skipped here.
            Loop
        Loop
    End Function
    Ten Years After - 01 You Give Me Loving

  17. #17
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: Find string in range of Byte Array

    The problem with custom low-level code is always getting it right though. I'm still not sure the Text I/O class I posted above is fully debugged.

    Data corruption is bad enough, but silent loss of data can be even more insidious. Using a power tool known to be reliable makes sense even if it costs you some performance. Of course you still need to test, test, test because one can easily make an error working with high-level tools.

  18. #18
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,541

    Re: Find string in range of Byte Array

    LaVolpe, Can't open CSV with your method. Only one column selecting into RS. Maybe separator must be set, don't know
    If you are interested in the ADO approach, would need to know more details about your specific csv file... For example:
    - What is the column/field delimiter or are the columns fixed widths?
    - Can any fields be interpreted incorrectly, i.e., a text field interpreted as a Date, Integer, Boolean field incorrectly?
    Those are generally the most common issues and can be resolved by supplying a schema.ini file. That file tells the csv parser (ADO, OLEDB, etc) how to handle specifics when the csv is not technically to specs.

    Edited. Using the simple example you provided in post #3, I created a file called tester.csv with that content. Then executed a statement similar to the one in post #11 which produced a file containing exactly this:
    Code:
    "Category","Available","Quantity"
    "House","Yes",10
    "Apartment","Yes",9
    "Machine","No",0
    "Apartment","No",8
    Adding a where clause (Category='Apartment') produced exactly this:
    Code:
    "Category","Available","Quantity"
    "Apartment","Yes",9
    "Apartment","No",8
    Connection string used... replace %TARGET% with the csv folder
    Code:
    Provider=Microsoft.Jet.OLEDB.4.0;Data Source=%TARGET%;Extended Properties='text;HDR=YES;FMT=CSVDelimited'
    Last edited by LaVolpe; Mar 21st, 2017 at 01:47 PM.
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  19. #19

    Thread Starter
    Addicted Member sergeos's Avatar
    Join Date
    Apr 2009
    Location
    Belarus
    Posts
    162

    Re: Find string in range of Byte Array

    LaVolpe
    This is my code for ADO. read only one Column
    Code:
    Dim dbConnectionString  As String
            Dim CN                  As Object
            Dim vRS                 As Object
            Dim sSQL                As String
            Dim PathFile            As String
            
            Set CN = CreateObject("ADODB.Connection")
            dbConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & "C:\mm" & "" & ";Extended Properties='text;HDR=NO'"
            
            CN.Open dbConnectionString
            
            PathFile = "test.csv"
            
            Set vRS = CreateObject("ADODB.Recordset")
            sSQL = "Select * FROM [" & PathFile & "]" 'WHERE (Category = 'Tyres')"
            
            vRS.Open sSQL, CN, 0, 1, 1
            'If Not vRS.EOF Then .Range("A2").CopyFromRecordset vRS
    test.zip
    Ten Years After - 01 You Give Me Loving

  20. #20
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,541

    Re: Find string in range of Byte Array

    Thank you for the test.zip. That will help out others tremendously.

    Your file appears to be semi-colon delimited. By adding a schema.ini file, it appears to be parsed out correctly; however, to save the matching records in the same format would require using ADO's GetString or something like what Dilettane suggests.

    If one does not already exist, create a file in same folder as your csv and name the file: schema.ini
    Paste this into the file and save it, then try your sample code again
    Code:
    [test.csv]
    Format=Delimited(;)
    ColNameHeader=False
    There appears to be money values in European format which converted incorrectly on my US system; though they may parse correctly on yours. If this file were actually something that needed to be parsed correctly on anyone's pc, the schema.ini file would need to be adjusted for even greater detail. Just FYI. If more interested in schema.ini files, simply google: csv schema.ini
    Last edited by LaVolpe; Mar 21st, 2017 at 03:24 PM.
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  21. #21
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,541

    Re: Find string in range of Byte Array

    One more note. The test file you provided recently isn't anything like the sample data your provided in post #3.

    If the test file isn't actual data you are trying to process, please provide a small sample file of actual data. It doesn't help you if we are providing solutions for data you won't even be using. Only a couple records are needed, along with any headers. If your actual data contains information that should not be public, you can blank out those field values or provide bogus values, i.e., instead of something like a real phone number, use 555-0123 for example.
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  22. #22
    PowerPoster Elroy's Avatar
    Join Date
    Jun 2014
    Location
    Near Nashville TN
    Posts
    9,852

    Re: Find string in range of Byte Array

    Hi sergeos,

    You didn't really specify things very well in your OP (post #1). For instance, is your input file ANSI or Unicode? In theory, it could quite easily be either. A CSV can certainly be either, but I suspect ANSI is probably more typical, so that's what I'll assume.

    Also, apparently, the file is of a size that it'll fit into memory or you wouldn't be talking about a byte array. If it were me, I probably wouldn't read it into a byte array in the first place, but that's what you've asked for, so I'll go with it.

    Therefore, we're reading an ANSI file into a byte array, doing some kind of parsing such that we want to search within portions of it for some string. Here's how I'd tend to do that....

    Code:
    
    Option Explicit
    '
    Private Declare Sub CopyMemory Lib "kernel32.dll" Alias "RtlMoveMemory" (ByRef Dest As Any, ByRef Source As Any, ByVal Bytes As Long)
    '
    
    Private Sub Form_Load()
    
    
        ' Create a byte array with ANSI "ABCDEFGHIJK" in it (11 letters).
    
        Dim bb() As Byte
        ReDim bb(1 To 11)
        bb(1) = 65  ' A
        bb(2) = 66  ' B
        bb(3) = 67  ' C
        bb(4) = 68  ' D
        bb(5) = 69  ' E
        bb(6) = 70  ' F
        bb(7) = 71  ' G
        bb(8) = 72  ' H
        bb(9) = 73  ' I
        bb(10) = 74 ' J
        bb(11) = 75 ' K
    
        ' Create a needle to search for.
        Dim sNeedle As String
        sNeedle = "CD"
    
    
    
        ' Do search and report results.
        MsgBox StringIsInByteArrayRange(bb(), 1, 11, sNeedle)
        MsgBox StringIsInByteArrayRange(bb(), 3, 11, sNeedle)
        MsgBox StringIsInByteArrayRange(bb(), 4, 11, sNeedle)
        MsgBox StringIsInByteArrayRange(bb(), 3, 4, sNeedle)
    
    
        Unload Me
    End Sub
    
    Private Function StringIsInByteArrayRange(bbHay() As Byte, iRangeBegin As Long, iRangeEnd As Long, sNeedle As String, Optional bHayIsUnicode As Boolean = False) As Long
        ' Returns the position where found, with iRangeBegin being first position.
        ' Returns ZERO if not found.
        ' Returns -1 if there's something wrong with things.
        ' CAVEATS: If you're actually searching for vbNullChar specified in sNeedle, this function won't work correctly.
        Dim iErr As Long
        Dim iLength As Long
        Dim iHayStrLen As Long
        Dim sHay As String
        Dim i As Long
        '
        ' Set number of bytes to search in hay.
        iLength = iRangeEnd - iRangeBegin + 1
        '
        ' Set "something wrong" return value.
        StringIsInByteArrayRange = -1
        '
        ' First a bit of validity checking.
        On Error Resume Next
            If iRangeBegin < LBound(bbHay) Then Exit Function
            If iRangeEnd > UBound(bbHay) Then Exit Function
            If iRangeBegin > iRangeEnd Then Exit Function
            If LBound(bbHay) > UBound(bbHay) Then Exit Function
            If bHayIsUnicode And (iLength Mod 2) Then Exit Function ' If Hay is Unicode, we must search an even number of bytes.
            iErr = Err
        On Error GoTo 0
        If iErr Then Exit Function ' This is a "DIM'ed" check.
        '
        ' We should be good, so do it.
        iHayStrLen = (iLength + 1) \ 2     ' The +1 handles odd lengths.  The \2 is because strings are Unicode, not bytes.
        ' There may be an extra 0 at the end of the string (for odd byte ranges), but nothing we can do about that.
        sHay = String$(iHayStrLen, 0)
        '
        ' Move our hay range into the string.
        ' Note that "bytes" are moved with no conversion to Unicode.
        CopyMemory ByVal StrPtr(sHay), bbHay(iRangeBegin), iLength
        '
        ' Is our hay ANSI (or just bytes) or is it Unicode?
        ' If it's ANSI (or bytes) we must convert to Unicode for search.
        If Not bHayIsUnicode Then sHay = StrConv(sHay, vbUnicode)
        '
        ' We're now ready to use Instr for our search.
        StringIsInByteArrayRange = InStr(sHay, sNeedle)
    End Function
    
    
    Enjoy,
    Elroy

    p.s. I did put an option in the search function for dealing with a Unicode byte array (i.e., Unicode CSV file), but I didn't test it. Also, hopefully you can see that, for testing, I just created a small byte array that represents your ANSI file.
    Last edited by Elroy; Mar 21st, 2017 at 03:12 PM.
    Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.

  23. #23
    PowerPoster Elroy's Avatar
    Join Date
    Jun 2014
    Location
    Near Nashville TN
    Posts
    9,852

    Re: Find string in range of Byte Array

    And here's another little procedure I often use. It'll search for your strings without reading the file into memory. However, it does work only for ANSI strings in the file. I've just never (at least not yet) had a need to search a file for Unicode strings.

    Code:
    
    Option Explicit
    
    Private Sub Form_Load()
        Dim sFileSpec As String
        Dim iFoundPos As Long
        sFileSpec = "C:\Users\Elroy\Desktop\test.txt"
    
    
        MsgBox BinaryFileSearch(sFileSpec, "asdf", False, 6, iFoundPos)
        MsgBox iFoundPos
    
    
    
    End Sub
    
    
    Public Function BinaryFileSearch(sFileSpec As String, sSearchString As String, Optional bCaseSensitive As Boolean = True, _
                                     Optional lStartPosition As Long = 1, Optional lFoundPosition As Long, _
                                     Optional lFileHandleToUse As Long = 0) As Boolean
        ' Returns true if sSearchString is found, else false.
        ' sSearchString can be no longer than 128.
        ' This will work even if Word or Excel has the file open.
        ' The lFoundPosition is a return argument.
        '    It returns the latest position before lStartPosition (if there isn't one after lStartPosition) or
        '    it returns the earliest position after lStartPosition.
        Dim iFle As Long
        Dim FileData As String
        Dim FilePointer As Long
        Dim FileLength As Long
        Dim sFind As String
        Dim iPos As Long
        '
        If Len(sSearchString) > 128 Then
            Err.Raise 1234
            Exit Function
        End If
        '
        If lFileHandleToUse = 0 Then
            If Not bFileExists(sFileSpec) Then Exit Function
            iFle = FreeFile
            On Error Resume Next
                Open sFileSpec For Binary As iFle
                If Err <> 0 Then
                    Close iFle
                    On Error GoTo 0
                    Exit Function
                End If
            On Error GoTo 0
            '
            If Len(iFle) = 0 Then Close iFle: Exit Function
        Else
            iFle = lFileHandleToUse ' The file MUST be opened BINARY for this to work.
        End If
        '
        If bCaseSensitive Then
            sFind = sSearchString
        Else
            sFind = LCase$(sSearchString)
        End If
        FileData = Space(1024)
        FileLength = LOF(iFle)
        FilePointer = lStartPosition
        Do
            If FilePointer > FileLength Then Exit Do
            Get iFle, FilePointer, FileData
            If Not bCaseSensitive Then FileData = LCase$(FileData)
            iPos = InStr(FileData, sFind)
            If iPos <> 0 Then
                lFoundPosition = FilePointer + iPos - 1
                If lFoundPosition >= lStartPosition Then
                    BinaryFileSearch = True
                    Exit Do
                End If
            End If
            FilePointer = ((FilePointer + 1024) - Len(sFind)) + 1
        Loop
        If lFileHandleToUse = 0 Then Close iFle
    End Function
    
    Public Function bFileExists(fle As String) As Boolean
        On Error GoTo FileExistsError
        ' If no error then something existed.
        bFileExists = (GetAttr(fle) And vbDirectory) = 0
        Exit Function
    FileExistsError:
        bFileExists = False
        Exit Function
    End Function
    
    
    
    
    Enjoy,
    Elroy
    Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.

  24. #24
    Fanatic Member
    Join Date
    Aug 2016
    Posts
    673

    Re: Find string in range of Byte Array

    why not used instrB?

  25. #25

    Thread Starter
    Addicted Member sergeos's Avatar
    Join Date
    Apr 2009
    Location
    Belarus
    Posts
    162

    Re: Find string in range of Byte Array

    To be honest, I forgot about this function.

    As it turned out there were plenty of decisions. The problem is solved.
    Ten Years After - 01 You Give Me Loving

  26. #26
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,541

    Re: [RESOLVED] Find string in range of Byte Array

    @sergeos. When resolved, please mark your thread as such. Use the Thread Tools menu near top of your 1st post. Additionally, it would be helpful to indicate which method you used to solve the problem. This could help others that happen upon your thread for a similar problem.

    Edited: Bad timing on my part, you resolved this while I was typing the post.
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  27. #27
    PowerPoster Elroy's Avatar
    Join Date
    Jun 2014
    Location
    Near Nashville TN
    Posts
    9,852

    Re: [RESOLVED] Find string in range of Byte Array

    Hi xxdoc123,

    Well, you certainly could use InstrB. In fact, you could pass the entire bb() Byte Array into InstrB. However, be sure to recognize that InstrB is going to make a copy of the Byte Array (and convert it to a string) when it's passed into InstrB. I don't know how large Sergeos's files will be, but InstrB will certainly double the memory requirements. If we're talking about relatively small files, I'll agree that InstrB is a good choice. In fact, it could be used in the function of my post #22 to make it work a bit better.

    All The Best,
    Elroy
    Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width