Results 1 to 11 of 11

Thread: [RESOLVED] How to read some specific columns from a text file ?

  1. #1
    Junior Member
    Join Date
    Feb 11
    Location
    Bucharest, Romania
    Posts
    26

    Resolved [RESOLVED] How to read some specific columns from a text file ?

    Hello VB friends,

    I have the following usual lines of code written to open a database in the text format. The database has 5 columns but for me are useful only two (col1 and col 3, which contain cities). My goal is to join these two columns in a single string (bigCities). I used the Input method because I have considered it is fast enough for my needs and a two dimensional array (myArr).

    The issue is that I don't now how to avoid to load other 3 unuseful columns as long as I need only 2...I wonder if there are other more suitable aproach.

    Code:
    Dim fnum As Integer, i As Long
    Dim myArr() As String, bigArray As String
    Dim bigCities As String
    ReDim myArr(1 To 5, 10000)
    
    fnum = FreeFile
    Open "myFile" For Input As #fnum
    i = 0
    Do While Not EOF(fnum)
    Input #fnum, myArr(1, i), myArr(2, i), myArr(3, i), myArr(4, i), myArr(5, i)
    i = i + 1
    Loop
    Close #fnum
    After that, I had to extract the useful columns (1 and 3) in this way:

    Code:
    ReDim bigArray(1 To UBound(myArr, 2))
    For i = 1 To UBound(myArr, 2)
    bigArray(i) = myArr(1, i) & " " & myArr(3, i) & " "
    Next i
    And, finally, we will have:

    Code:
    bigCities = Join(bigArray)
    Even the result is ok I feel the right approach should be other, a simpler one...
    For example, I think the last next-for looping could be, in a way, avoided. But how? Thank you in advance.

    Daniel
    Last edited by koolsid; Jul 21st, 2012 at 05:29 AM. Reason: Added Code Tags

  2. #2
    Hyperactive Member jokerfool's Avatar
    Join Date
    Dec 06
    Location
    Gold Coast, Australia
    Posts
    413

    Re: How to read some specific columns from a text file ?

    Would you be able to wrap the code links around your code?

  3. #3
    PowerPoster
    Join Date
    Jul 06
    Location
    Maldon, Essex. UK
    Posts
    5,136

    Re: How to read some specific columns from a text file ?

    You don't need to use any Arrays and you only need one loop
    Code:
    Dim strA As String
    Dim strB As String
    Dim strC As String
    Dim strD As String
    Dim strE As String
    Dim bigCities As String
    fnum = FreeFile
    Open "myFile" For Input As #fnum
    Do While Not EOF(fnum)
        Input #fnum, strA, strB, strC, strD, strE
        bigCities = bigCities & strA & " " & strC & " "
    Loop
    Close #fnum
    '
    ' Remove trailing space
    '
    bigCities = Left$(bigCities, Len(bigCities) - 1)

  4. #4
    Junior Member
    Join Date
    Feb 11
    Location
    Bucharest, Romania
    Posts
    26

    Re: How to read some specific columns from a text file ?

    Hello Doogle,
    Thank you for your quick reply.

    Honestly, I had hoped to find a better solution not a worse one...
    My post has been just a reduced sample, the real text file has almost 4.5 milion rows... Your example, based on concatenation, is just a theoretical solution. If my aproach takes almost 18 seconds, yours should take hours...I have just tested it and after 15 minutes I gave up...
    If you have doubts I can proove these sending you both files (vbp and txt).
    Using arrays - approach which is not random at all - I tried to avoid a cummulative usage of memory that makes entire looping process very very slow.
    Anyway, thank you for your help.

  5. #5
    PowerPoster
    Join Date
    Jul 06
    Location
    Maldon, Essex. UK
    Posts
    5,136

    Re: How to read some specific columns from a text file ?

    I wonder whether using the Text Driver for Jet and ADO would be any quicker. It may be worth posting a text file so we can 'play'

  6. #6
    Fanatic Member
    Join Date
    Jun 06
    Posts
    999

    Re: How to read some specific columns from a text file ?

    You could output to a temporary file all of the information you are interested in, then read that file afterward to access the data.
    Code:
    Dim fInput As Integer, fOutput As Integer
    Dim sData(4) As String
    Dim BigCities As String
    
    fInput = FreeFile
    Open "myfile" For Input As #fInput
    fOutput = FreeFile
    Open "tempfile" For Output As #fOutput
    Do Until EOF(fInput)
      Input #fInput, sData(0), sData(1), sData(2), sData(3), sData(4)
      Print #fOutput, sData(0); " "; sData(2); " ";
    Loop
    Close #fInput
    Close #fOutput
    
    fInput = FreeFile
    Open "tempfile" For Input As #fInput
    BigCities = Input(LOF(fInput), fInput)
    Close #fInput
    You could also use the Mid statement to create the result string in memory. This would require more code, but would be faster than output to hard drive.

    If you give a sample of your text file (the first several lines should do), we may be able to provide a better solution.

  7. #7
    Junior Member
    Join Date
    Feb 11
    Location
    Bucharest, Romania
    Posts
    26

    Re: How to read some specific columns from a text file ?

    Hello guys,

    I have just posted both the entire subroutine involved in the extraction and the first 100 rows of the text file. Please keep in mind that this file has 5 columns and almost 4230000 rows (112Mb) ... The size is a very important detail as long as we should avoid some aproaches which take time.
    As you can see, I have kept an usual timer to make different trials worked around comparable. In my PC the below process took almost 12 seconds and I would consider this time as a reference.

    Code:
      'Load using the Input # statement (Fastest Method)
    Private Sub btnInput_Click()
        Dim fnum As Integer, i As Long
        Dim myArray() As String, bigArray() As String
        Dim start_time As Single, stop_time As Single
        Dim pathFile As String, bigJoin As String
    
        pathFile = App.Path & "\hugeFile.txt"
        Screen.MousePointer = vbHourglass
        DoEvents
        'ReDim's should be done in run time but this takes time....
        ReDim myArray(1 To 5, 6000000)
        ReDim bigArray(6000000)
    
        fnum = FreeFile
        start_time = Timer
        Open pathFile For Input As #fnum
            i = 0
        Do While Not EOF(fnum) 'Check for end of file
            Input #fnum, myArray(1, i), myArray(2, i), myArray(3, i), myArray(4, i), myArray(5, i)
            i = i + 1
        Loop
        Close #fnum
       
        ReDim Preserve myArray(1 To 5, i)
        ReDim bigArray(1 To UBound(myArray, 2))
        'We keep only the first and the fourth column
        For i = 1 To UBound(myArray, 2)
        bigArray(i) = myArray(1, i) & " " & myArray(4, i) & " "
        Next i
       
        bigJoin = Join(bigArray)
        
        stop_time = Timer
        Label1.Caption = Format$(stop_time - start_time, "0.00") & " sec."
        Screen.MousePointer = vbDefault
    End Sub
    Again, is something redundant in my aproach (in terms of time or coding) ? Maybe I could avoid, in a way, to load those three unuseful columns in myArray...Is there any other feasible aproach to work with a huge database like this? I have some doubts that using an ADO I could improve my process but I will try tommorow. At least if we take into account that my goal is to find some substrings in that huge string...
    Thank you all for your support.
    Attached Files Attached Files

  8. #8
    Fanatic Member
    Join Date
    Jun 06
    Posts
    999

    Re: How to read some specific columns from a text file ?

    This is just about the fastest way to get the end result (space-delimited string containing all data from columns 1 and 4)
    Code:
      Dim Col1 As String, Col2 As String, Col3 As String, Col4 As String, Col5 As String
      Dim fnum As Integer
      Dim BigString As String
      Dim lPos As Long
      
      BigString = Space(1048576)
      lPos = 1
      fnum = FreeFile
      Open "hugeFile.txt" For Input As #1
      ' Input column headers (this data is ignored)
      Input #fnum, Col1, Col2, Col3, Col4, Col5
      
      Do Until EOF(fnum)
        Input #fnum, Col1, Col2, Col3, Col4, Col5
        
        ' Ensure enough space in buffer
        If lPos + Len(Col1) + Len(Col4) + 1 > Len(BigString) Then
          BigString = BigString & Space(Len(BigString))
        End If    
        
        Mid(BigString, lPos) = Col1
        lPos = lPos + Len(Col1) + 1        
        
        Mid(BigString, lPos) = Col4
        lPos = lPos + Len(Col4) + 1
      Loop
      Close #fnum
      BigString = Left$(BigString, lPos - 1)

  9. #9
    Frenzied Member
    Join Date
    Aug 11
    Location
    B.C., Canada
    Posts
    1,838

    Re: How to read some specific columns from a text file ?

    it seems like its as fast as it can get already. unless like said before you make a temp file with the first and 4th column's info and go from that file if you need access.... of course opening a 100mb text file wont take 1 sec to load! There is no way around. its like if i play a game and it takes 45sec to load but i want it to load in 5sec wont be possible maybe 40sec if you tweak pc a bit, the pc has to process all the information(files) no matter how you want it to do it.

  10. #10
    PowerPoster
    Join Date
    Jul 06
    Location
    Maldon, Essex. UK
    Posts
    5,136

    Re: How to read some specific columns from a text file ?

    Quote Originally Posted by Daniel Duta View Post
    Is there any other feasible aproach to work with a huge database like this? I have some doubts that using an ADO I could improve my process but I will try tommorow. At least if we take into account that my goal is to find some substrings in that huge string...
    The obvious feasible approach would be to perform a one-off upload of the columns required to a proper Database engine such as SQL Server, especially if you're going to want to search for sub-strings.

  11. #11
    Junior Member
    Join Date
    Feb 11
    Location
    Bucharest, Romania
    Posts
    26

    Re: How to read some specific columns from a text file ?

    Hello everybody,

    Hats off to you Logophobic! Your method is 5 seconds faster...Is amazing indeed to be able to handle over 4 milions rows in 7 seconds without any other special class, API functions or library. I think the trick is you have increased the size of the BigString (1Mb) from beginning.
    Nice approach and very didactic. You have 5 globes for this solution.

    Thank you too Doogle for your interest. Of course, I will use specialized database engines but for more serious and complex tasks. In this case I was just looking for a simply and economic way to do some checkings. Besides, I have learnt a new solution to a problem and I discovered a bit of ingenuity.
    Thank you all.

    Regards,
    Daniel

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •