Thread: Read Number of Lines in txt file [RESOLVED]

  1. #1

    Thread Starter
    Fanatic Member LITHIA's Avatar
    Join Date
    Dec 2002
    Location
    UK, England
    Posts
    575

    Is there a simple way to find out how many lines there are in a txt file?

    I need the number of lines in the text file I have open so that my progress bar can show how far along the processing is, but I'm not sure how to count the lines...

    What my program does is run through the file line by line, decrypting it. This takes a while on large files, so a progress bar would be very handy. I'll just set the Maximum of the progress bar to the number of lines, then add 1 to its value each time a line gets processed.

    This alright? All I need now is to know how many lines there are.

    Thanks a bunch!
    Last edited by LITHIA; Jul 29th, 2004 at 08:52 AM.

  2. #2

    Thread Starter
    Fanatic Member LITHIA's Avatar
    Join Date
    Dec 2002
    Location
    UK, England
    Posts
    575
    I just tried:
    VB Code:
    Dim lines As New IO.StreamReader(New IO.FileStream(OpenFileDialog1.FileName, IO.FileMode.Open))
    Dim x As Double = lines.ReadToEnd().Split(Environment.NewLine.ToCharArray).GetUpperBound(0)

    And it seems to work... although I get double the number of lines I should, so I just divided it by 2 and got the correct amount.

    There is a delay before my progress bar does anything, but I think that's because it's counting the lines before it starts decrypting.

    It's alright though. I think it's resolved... anyone got a better method?
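A likely explanation for the doubled count, sketched under a few assumptions: on Windows a line break is the two-character sequence CR+LF, and `Split` with a Char array splits on each character separately, so every line break produces an extra empty entry. Splitting on the whole `vbCrLf` string avoids that. The helper name `CountLinesInText` is hypothetical, and the `Split(String(), StringSplitOptions)` overload assumes .NET 2.0 or later.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module SplitLineCount
    ' Hypothetical helper: counts lines in text that uses CRLF line breaks.
    Function CountLinesInText(ByVal text As String) As Integer
        If text.Length = 0 Then Return 0
        ' Split on the whole CR+LF string, not on CR and LF separately.
        Dim parts() As String = text.Split(New String() {vbCrLf}, StringSplitOptions.None)
        ' A trailing line break leaves one empty final entry; do not count it.
        If parts(parts.Length - 1).Length = 0 Then Return parts.Length - 1
        Return parts.Length
    End Function

    Sub Main()
        Dim sample As String = "one" & vbCrLf & "two" & vbCrLf & "three"
        ' Splitting on the individual characters yields an empty string
        ' between each CR and LF: "one", "", "two", "", "three".
        Dim perChar() As String = sample.Split(New Char() {ControlChars.Cr, ControlChars.Lf})
        Console.WriteLine(perChar.Length)             ' prints 5
        Console.WriteLine(CountLinesInText(sample))   ' prints 3
    End Sub
End Module
```

Dividing by 2 only happens to work when every line, including the last, ends in CR+LF; a file with a missing final line break or with LF-only endings gives a different result.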

  3. #3

    Thread Starter
    Fanatic Member LITHIA's Avatar
    Join Date
    Dec 2002
    Location
    UK, England
    Posts
    575
    Okay, I just realised how aggravating this is now...

    This method is total rubbish!

    It doesn't even get the line count correct, and dividing it by 2 only helped in that one case anyway!

    So I'm completely stuck now... I would really appreciate some help getting the exact number of lines!

    Thanks!

  4. #4
    PowerPoster SuperSparks's Avatar
    Join Date
    May 2003
    Location
    London, England
    Posts
    265
    Something like this should do it:

    VB Code:
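    ' Requires Imports System.IO at the top of the file.
    ' pathToFile is a String holding the path to your file.
    Dim Lines As Long = 0
    Dim sr As StreamReader = New StreamReader(pathToFile)
    Do While sr.Peek() >= 0
        sr.ReadLine()   ' read one line and discard it
        Lines += 1
    Loop
    sr.Close()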
    Nick.
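The same loop can be wrapped in a reusable function. This is only a sketch: the name `CountLines` is hypothetical, and `Using`, `IsNot` and `File.WriteAllText` assume VB 2005 / .NET 2.0 or later.

```vbnet
Imports System
Imports System.IO
Imports Microsoft.VisualBasic

Module LineCounter
    ' Hypothetical helper: returns the number of lines in the file at filePath.
    Function CountLines(ByVal filePath As String) As Long
        Dim count As Long = 0
        ' Using closes the reader even if ReadLine throws.
        Using sr As New StreamReader(filePath)
            ' ReadLine returns Nothing once the end of the file is reached.
            While sr.ReadLine() IsNot Nothing
                count += 1
            End While
        End Using
        Return count
    End Function

    Sub Main()
        ' Write a small three-line sample file to count.
        Dim tempFile As String = Path.GetTempFileName()
        File.WriteAllText(tempFile, "one" & vbCrLf & "two" & vbCrLf & "three")
        Console.WriteLine(CountLines(tempFile))   ' prints 3
        File.Delete(tempFile)
    End Sub
End Module
```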

  5. #5
    Sleep mode
    Join Date
    Aug 2002
    Location
    RUH
    Posts
    8,083
    Hint: use Regex, it's damn fast.

  6. #6

    Thread Starter
    Fanatic Member LITHIA's Avatar
    Join Date
    Dec 2002
    Location
    UK, England
    Posts
    575
    Originally posted by Pirate
    Hint: use Regex, it's damn fast.
    That sounds like a good idea, Pirate, from what I've heard of Regex anyway.
    I'm just not sure how to use it. I have no experience with Regex; I can barely declare one.

    Any chance of an example please? I searched for Regex on the forum but didn't find exactly what I wanted.

    Thanks a bunch

  7. #7
    Sleep mode
    Join Date
    Aug 2002
    Location
    RUH
    Posts
    8,083
    I'm sorry. I've read a lot about regex patterns but still can't write them myself. It's pretty simple once you find a good resource on regex patterns. You can write it on your own, I'm sure.

  8. #8
    Member
    Join Date
    Aug 2003
    Posts
    51
    I think SuperSparks already gave you the answer.

    I don't think Regex is of any use in this instance, as you're not searching for anything.

    GL

  9. #9

    Thread Starter
    Fanatic Member LITHIA's Avatar
    Join Date
    Dec 2002
    Location
    UK, England
    Posts
    575
    Okay thanks, this method is working, I think. I may have a few differences, but thanks for it!

    And are you sure, vutle? Couldn't you search for vbCrLf? That would signify a new line, wouldn't it?

  10. #10
    Fanatic Member brown monkey's Avatar
    Join Date
    Jun 2004
    Location
    Cebu
    Posts
    552
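    Like this, mate?
    VB Code:
    ' Needs Imports System.IO and System.Text.RegularExpressions
    Dim sr As New StreamReader("D:\My Documents\Docs\pautot.txt")
    MessageBox.Show((Regex.Matches(sr.ReadToEnd(), "\n").Count + 1).ToString())  ' plus 1 because there is no \n after the last line
    sr.Close()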
    Last edited by brown monkey; Jul 25th, 2004 at 08:54 PM.

  11. #11
    Member
    Join Date
    Aug 2003
    Posts
    51
    Originally posted by LITHIA
    Okay thanks, this method is working, I think. I may have a few differences, but thanks for it!

    And are you sure, vutle? Couldn't you search for vbCrLf? That would signify a new line, wouldn't it?
    Yes! But in your case you only want to count the number of lines. You're not comparing anything, so it is probably faster to use the method provided by SuperSparks.

    I've compared the Regex method provided by brown monkey with SuperSparks' method on a 1-million-line file. The result: SuperSparks' method is faster.

  12. #12
    Sleep mode
    Join Date
    Aug 2002
    Location
    RUH
    Posts
    8,083
    Originally posted by vutle
    Yes! But in your case you only want to count the number of lines. You're not comparing anything, so it is probably faster to use the method provided by SuperSparks.

    I've compared the Regex method provided by brown monkey with SuperSparks' method on a 1-million-line file. The result: SuperSparks' method is faster.
    Did you use any of the other regex options (the RegexOptions parameter of the match call)? That makes a lot of difference.

  13. #13
    Fanatic Member brown monkey's Avatar
    Join Date
    Jun 2004
    Location
    Cebu
    Posts
    552
    I don't understand this, hehe. Regex is designed for this, mates, but it still can't cope with this \n thing. Hehe.
    VB Code:
    Dim sr As New StreamReader("d:\sometext.txt")
    Dim a As Integer = Environment.TickCount
    ' Count the "\n" matches, then show the count and the elapsed milliseconds
    Me.Text = (Regex.Matches(sr.ReadToEnd(), "(\n)", RegexOptions.Compiled).Count + 1) & "   " & (Environment.TickCount - a)
    This code hangs, hehe. I don't know why. Tested on 1088781 lines. But this
    VB Code:
    Dim sr As New StreamReader("d:\sometext.txt")
    Dim a As Integer = Environment.TickCount

    Dim l As Integer = 0
    ' ReadLine returns Nothing at the end of the file
    While Not IsNothing(sr.ReadLine())
        l += 1
    End While
    sr.Close()
    Me.Text = l & "   " & (Environment.TickCount - a)
    takes 2078 ticks (milliseconds). Hehehe. I don't know the details, mates. Cheers.

  14. #14
    Member
    Join Date
    Aug 2003
    Posts
    51
    No hard feelings, brown monkey.

    I think the reason Regex takes longer is that it uses some sort of search or compare method.

    Test results: 1076920 lines on a 2.6 GHz laptop
    SuperSparks: 1512 ticks
    Regex by brown monkey ("\n"): 50743 ticks
    Regex by brown monkey, modified (vbCrLf): 6389 ticks
    brown monkey's new code: 1512 ticks
    Last edited by vutle; Jul 26th, 2004 at 06:00 AM.
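For anyone who wants to repeat this kind of comparison, here is a rough timing sketch. It is not the code used for the figures above: it works on a generated 100,000-line string in memory rather than a file, the function names are hypothetical, and `Stopwatch` assumes .NET 2.0 or later. Absolute numbers will vary by machine.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.IO
Imports System.Text
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module LineCountTiming
    ' Counts lines by reading them one at a time, like the ReadLine loops above.
    Function CountWithReadLine(ByVal text As String) As Integer
        Dim count As Integer = 0
        Using reader As New StringReader(text)
            While reader.ReadLine() IsNot Nothing
                count += 1
            End While
        End Using
        Return count
    End Function

    ' Counts "\n" matches; the +1 assumes the text does not end with a line break.
    Function CountWithRegex(ByVal text As String) As Integer
        Return Regex.Matches(text, "\n").Count + 1
    End Function

    Sub Main()
        ' Build a hypothetical 100,000-line sample with CRLF between lines.
        Dim sb As New StringBuilder()
        For i As Integer = 1 To 100000
            If i > 1 Then sb.Append(vbCrLf)
            sb.Append("line ").Append(i)
        Next
        Dim text As String = sb.ToString()

        Dim sw As Stopwatch = Stopwatch.StartNew()
        Dim a As Integer = CountWithReadLine(text)
        sw.Stop()
        Console.WriteLine("ReadLine: " & a & " lines, " & sw.ElapsedMilliseconds & " ms")

        sw = Stopwatch.StartNew()
        Dim b As Integer = CountWithRegex(text)
        sw.Stop()
        Console.WriteLine("Regex:    " & b & " lines, " & sw.ElapsedMilliseconds & " ms")
    End Sub
End Module
```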

  15. #15
    Lively Member
    Join Date
    Jun 2004
    Posts
    74
    If you are doing what I usually do, which is trying to tell the user how much longer the file will take to read, I would use the size of the file, available through a FileSystemObject, and keep track of the number of bytes read (each line = Len + 2, for the CRLF). Then you have the progress as a percentage by dividing bytes read by bytes total.

    One other tip is to only update your progress bar every X%, so that CPU time is spent on processing the file rather than on updating the screen.

    HTH

    Hume

  16. #16

    Thread Starter
    Fanatic Member LITHIA's Avatar
    Join Date
    Dec 2002
    Location
    UK, England
    Posts
    575
    That's a good idea! Thanks, I'll have a look into that when I've got the time.

  17. #17

    Thread Starter
    Fanatic Member LITHIA's Avatar
    Join Date
    Dec 2002
    Location
    UK, England
    Posts
    575
    Oh, actually, any chance of example code for this please? I'm not completely sure about it...

    thanks!

  18. #18
    Lively Member
    Join Date
    Jun 2004
    Posts
    74
    Here is a generic read function template that I use to start things:

    VB Code:
    Public Sub GenericRead(ByVal FileStr As String)
        ' Requires a COM reference to the Microsoft Scripting Runtime.
        Dim FSO As New Scripting.FileSystemObject
        Dim FileStream As Scripting.TextStream
        Dim InputStr As String, FileLen As Double, FileRead As Double
        Dim FileProgress As Integer, ProgressIncrement As Integer

        If Not FSO.FileExists(FileStr) Then Exit Sub

        FileLen = CDbl(FSO.GetFile(FileStr).Size)
        FileStream = FSO.OpenTextFile(FileStr, Scripting.IOMode.ForReading)
        ' First update at 5%, then every 5% after that
        FileRead = 0.0 : FileProgress = 5 : ProgressIncrement = 5

        Do While Not FileStream.AtEndOfStream
            InputStr = FileStream.ReadLine()
            ' +2 for the CRLF that ReadLine strips (assumes one byte per character)
            FileRead = FileRead + InputStr.Length + 2

            ' Process Text String here <may be many lines of code>

            ' Progress through file: act only when the next threshold is passed
            If FileRead / FileLen * 100 >= FileProgress Then
                ' Update Progress bar here if needed
                FileProgress = FileProgress + ProgressIncrement
            End If
        Loop

        FileStream.Close()
        FileStream = Nothing
        FSO = Nothing
    End Sub

    HTH

    Hume
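A variant of the same byte-based idea that avoids the COM FileSystemObject, offered as a sketch rather than a drop-in replacement. It reads the file once and uses `StreamReader.BaseStream.Position` against `BaseStream.Length`, so it does not depend on line endings or character encoding. The name `ProcessWithProgress` and the returned list of update points are hypothetical stand-ins for real progress bar updates, and `Using` and `List(Of Integer)` assume .NET 2.0 or later.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic

Module ByteProgress
    ' Hypothetical helper: processes a file line by line in a single pass and
    ' returns the percentages at which a progress bar would have been updated.
    Function ProcessWithProgress(ByVal filePath As String, ByVal stepPercent As Integer) As List(Of Integer)
        Dim updates As New List(Of Integer)
        Dim nextUpdate As Integer = stepPercent
        Using sr As New StreamReader(filePath)
            Dim total As Long = sr.BaseStream.Length   ' file size in bytes
            Dim line As String = sr.ReadLine()
            While line IsNot Nothing
                ' Process the line here.

                ' Position is the number of bytes pulled from the file so far.
                ' The reader buffers, so it advances a block at a time.
                Dim percent As Integer = CInt(sr.BaseStream.Position * 100 \ total)
                If percent >= nextUpdate Then
                    updates.Add(percent)   ' in a form: set the progress bar value here
                    nextUpdate = (percent \ stepPercent + 1) * stepPercent
                End If
                line = sr.ReadLine()
            End While
        End Using
        Return updates
    End Function

    Sub Main()
        ' Write a small sample file and report the update points.
        Dim tempFile As String = Path.GetTempFileName()
        File.WriteAllText(tempFile, "one" & vbCrLf & "two" & vbCrLf & "three")
        For Each p As Integer In ProcessWithProgress(tempFile, 10)
            Console.WriteLine(p & "%")
        Next
        File.Delete(tempFile)
    End Sub
End Module
```

Because of the reader's buffering, a small file jumps straight to 100%; the percentage only moves in visible steps on files much larger than the buffer, which is the case where a progress bar matters.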

  19. #19
    I wonder how many charact
    Join Date
    Feb 2001
    Location
    Savage, MN, USA
    Posts
    3,704
    I still like Sparks' method; you can easily add a remaining-bytes count:
    VB Code:
    Dim Lines As Long
    Dim sr As IO.StreamReader = New IO.StreamReader("C:\test.txt")
    ' BaseStream.Length is the file size in bytes (a Long)
    Dim remaining As Long = sr.BaseStream.Length
    Do While sr.Peek() >= 0
        ' +2 for the CRLF that ReadLine strips; assumes one byte per character
        remaining -= sr.ReadLine().Length + 2
        Debug.WriteLine(remaining.ToString())
        Lines += 1
    Loop
    sr.Close()

  20. #20
    Lively Member
    Join Date
    Jun 2004
    Posts
    74
    The big advantage of using the file size over the number of lines is that you can access the file size directly and only have to read through the file once at process time. If you need to count lines, then you have to read through twice (once to get the number of lines, once to actually process the file). For most folks, this may not be an issue, but if you have a 500+ MB file, this is wasted CPU time to just be able to report progress...

    All depends on your particular needs...

    Hume

  21. #21
    Member
    Join Date
    Aug 2003
    Posts
    51
    No, you read it only once.

    With a file that size you'd be better off using a proper database.

  22. #22
    Lively Member
    Join Date
    Jun 2004
    Posts
    74
    Based on the original post... how can you get the number of lines in a file without first reading all the way through it? When you are processing the file line by line, you don't know in advance how many lines are in the file. Even if you scan all the characters in the file looking for new lines, some function is still reading through the whole file to get this count.

    If you use the file size with bytes remaining or bytes read (anything based on how much you have processed in bytes), then the first pass through can tell you how far along you are as a percentage (i.e. bytes read / bytes total), which should then be reflected in the progress bar.

    If all you want is the line count, then that is fine, but the original goal was to be able to display progress...

    my 2 cents

  23. #23
    Member
    Join Date
    Aug 2003
    Posts
    51
    Yes, use a byte count, like nemaroller's code or your routine above, for refreshing the progress bar.

  24. #24
    Lively Member
    Join Date
    Jun 2004
    Posts
    74
    OK, just making sure someone didn't have a clever way to get the line count (without wasted CPU time), because that is a bit more intuitive to follow over the bytes read way...

    Hume
