Thread: What is a line in a textfile?

  1. #1

    Thread Starter
    Fanatic Member ThomasJohnsen's Avatar
    Join Date
    Jul 2010
    Location
    Denmark
    Posts
    528

    What is a line in a textfile?

    While reading this post in another thread, I remembered that I once tried something similar and got some empty lines. This prompted me to try an experiment:
    Are the following equivalent:
    * File.ReadAllLines(filename)
    * File.ReadAllText(filename).Split(Environment.NewLine) '(Or ControlChars.CrLf - gives same result)
    and does either of them actually split on a Carriage Return followed by a Linefeed, and only there?

    Surprisingly, I found that the answer to both questions was no, which made me ask: what is a line in a textfile? I've always assumed it had to end with CrLf.

    File.ReadAllLines(filename) will split on Lf, Cr and CrLf (maybe more - haven't tested all controlchars)
    File.ReadAllText(filename).Split(Environment.NewLine) seems to get confused and not act rationally at all!
    File.ReadAllText(filename).Split() splits on any whitespace, so each CrLf leaves an undesired empty string in the array.
    File.ReadAllText(filename).Split(New Char() {ControlChars.Cr, ControlChars.Lf}) acts differently than all above.
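The first observation matches the documented rule for the framework's line readers: a line ends at a Lf, a Cr, or a Cr immediately followed by a Lf. A minimal sketch of that rule, using StringReader.ReadLine on an in-memory string instead of a file; the module name, the helper SplitLikeReadLine and the sample text are made up for illustration:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic

Module ReadLineSketch
    ' Hypothetical helper: collects the lines that StringReader.ReadLine returns.
    ' ReadLine treats Cr, Lf and CrLf each as a single line terminator.
    Function SplitLikeReadLine(text As String) As List(Of String)
        Dim result As New List(Of String)
        Using reader As New StringReader(text)
            Dim line As String = reader.ReadLine()
            While line IsNot Nothing
                result.Add(line)
                line = reader.ReadLine()
            End While
        End Using
        Return result
    End Function

    Sub Main()
        ' One lone Cr, one lone Lf and one CrLf separate four pieces of text.
        Dim sample As String = "a" & ControlChars.Cr & "b" & ControlChars.Lf & "c" & ControlChars.CrLf & "d"
        For Each line As String In SplitLikeReadLine(sample)
            Console.WriteLine("'{0}'", line)   ' prints 'a', 'b', 'c', 'd' on separate lines
        Next
    End Sub
End Module
```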

    I tested the examples and wrote my own version of what I would expect as 'correct behaviour' (Version 5), using the following:
    vb.net Code:
    ' Requires: Imports System.IO
    Dim filename As String = Path.Combine(Application.StartupPath, "Testing.txt")

    ' Write four lines; two of them contain a lone Cr or a lone Lf.
    File.WriteAllLines(filename, New String() {"Carriage" & ControlChars.Cr & "return", "Nothing", _
                                               "Line" & ControlChars.Lf & "feed", "Spa ce"})

    Dim lines() As String

    ' Version 1: let the framework decide what a line is.
    Console.WriteLine("Version 1: ----------------------")
    lines = File.ReadAllLines(filename)
    For Each line As String In lines
        Console.WriteLine("'{0}'", line)
    Next

    ' Version 2: relies on an implicit narrowing conversion of the
    ' String argument, so it compiles only with Option Strict Off.
    Console.WriteLine("Version 2: ----------------------")
    lines = File.ReadAllText(filename).Split(Environment.NewLine)
    For Each line As String In lines
        Console.WriteLine("'{0}'", line)
    Next

    ' Version 3: Split() without arguments splits on any whitespace.
    Console.WriteLine("Version 3: ----------------------")
    lines = File.ReadAllText(filename).Split()
    For Each line As String In lines
        Console.WriteLine("'{0}'", line)
    Next

    ' Version 4: splits on every single Cr and every single Lf.
    Console.WriteLine("Version 4: ----------------------")
    lines = File.ReadAllText(filename).Split(New Char() {ControlChars.Cr, ControlChars.Lf})
    For Each line As String In lines
        Console.WriteLine("'{0}'", line)
    Next

    ' Version 5: hand-written scan that ends a line only at Cr followed by Lf.
    ' Text after the last CrLf (if any) is not printed.
    Console.WriteLine("Version 5: ----------------------")
    Dim total As String = File.ReadAllText(filename)
    Dim i As Integer = 0
    Dim length As Integer = 0   ' characters seen since the current line started

    While i < total.Length
        If total(i) = ControlChars.Lf AndAlso i > 0 AndAlso total(i - 1) = ControlChars.Cr Then
            ' length includes the Cr, so the line itself is length - 1 characters long.
            Console.WriteLine("'{0}'", If(i > 1, total.Substring(i - length, length - 1), String.Empty))
            length = 0
        Else
            ' Ordinary character, or a lone Lf: it belongs to the current line.
            length += 1
        End If
        i += 1
    End While
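For comparison, splitting only on the two-character CrLf sequence can also be sketched with the String.Split overload that takes an array of string separators. This is an illustrative alternative to the hand-written loop, not part of the original test; the module name, the helper SplitOnCrLfOnly and the sample text are made up:

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module CrLfSplitSketch
    ' Hypothetical helper: splits only where Cr is immediately followed by Lf.
    ' A lone Cr or a lone Lf stays inside its line.
    Function SplitOnCrLfOnly(text As String) As String()
        Return text.Split(New String() {ControlChars.CrLf}, StringSplitOptions.None)
    End Function

    Sub Main()
        Dim sample As String = "Carriage" & ControlChars.Cr & "return" & ControlChars.CrLf &
                               "Nothing" & ControlChars.CrLf
        Dim parts() As String = SplitOnCrLfOnly(sample)

        ' The trailing CrLf produces a final empty element, so there are 3 parts.
        Console.WriteLine(parts.Length)                              ' prints 3
        Console.WriteLine(parts(0).Contains(ControlChars.Cr))        ' prints True
        Console.WriteLine(parts(2).Length)                           ' prints 0
    End Sub
End Module
```

Passing StringSplitOptions.RemoveEmptyEntries instead would drop the trailing empty element, but it would also drop genuinely empty lines.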

    The question was (for those who forgot it while reading all of that): are lines in a textfile not supposed to be delimited by CrLf?

    Tom
    In truth, a mature man who uses hair-oil, unless medicinally, that man has probably got a quoggy spot in him somewhere. As a general rule, he can't amount to much in his totality. (Melville: Moby Dick)

  2. #2
    eXtreme Programmer .paul.'s Avatar
    Join Date
    May 2007
    Location
    Chelmsford UK
    Posts
    25,464

    Re: What is a line in a textfile?

    It depends on how the file was saved.
    TextBoxes use CrLf, whereas RichTextBoxes use only Lf.
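Since the answer depends on how the file was saved, one way to find out is to count each kind of line ending in the text. A rough sketch under that assumption; the module name, the helper CountLineEndings and the sample text are invented for illustration:

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module LineEndingCountSketch
    ' Hypothetical helper: counts CrLf pairs, lone Cr and lone Lf in a string.
    Sub CountLineEndings(text As String, ByRef crLfCount As Integer,
                         ByRef loneCrCount As Integer, ByRef loneLfCount As Integer)
        crLfCount = 0
        loneCrCount = 0
        loneLfCount = 0

        Dim i As Integer = 0
        While i < text.Length
            Dim c As Char = text(i)
            If c = ControlChars.Cr Then
                If i + 1 < text.Length AndAlso text(i + 1) = ControlChars.Lf Then
                    crLfCount += 1
                    i += 1   ' skip the Lf that belongs to this pair
                Else
                    loneCrCount += 1
                End If
            ElseIf c = ControlChars.Lf Then
                loneLfCount += 1
            End If
            i += 1
        End While
    End Sub

    Sub Main()
        Dim sample As String = "a" & ControlChars.CrLf & "b" & ControlChars.Lf & "c" & ControlChars.Cr & "d"
        Dim crLfCount, loneCrCount, loneLfCount As Integer
        CountLineEndings(sample, crLfCount, loneCrCount, loneLfCount)
        ' prints: CrLf: 1, Cr: 1, Lf: 1
        Console.WriteLine("CrLf: {0}, Cr: {1}, Lf: {2}", crLfCount, loneCrCount, loneLfCount)
    End Sub
End Module
```

For a real file, the string could come from File.ReadAllText(filename), as in the first post.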

  3. #3

    Thread Starter
    Fanatic Member ThomasJohnsen's Avatar
    Join Date
    Jul 2010
    Location
    Denmark
    Posts
    528

    Re: What is a line in a textfile?

    Quote Originally Posted by .paul. View Post
    It depends on how the file was saved.
    TextBoxes use CrLf, whereas RichTextBoxes use only Lf.
    So there is no reason to maintain CrLf? And why was a new line ever implemented using two chars? Because of old printers, perhaps, that interpreted Cr and Lf differently?

    Anyway, having two chars for a newline seems incredibly silly to me.

  4. #4
    PowerPoster dunfiddlin's Avatar
    Join Date
    Jun 2012
    Posts
    8,245

    Re: What is a line in a textfile?

    Cr and Lf are separate functions for printers. It was only on typewriters where both functions were integrated mechanically. But there's one very good reason to retain CrLf - the millions of text files written since the beginning of computing!
    As the 6-dimensional mathematics professor said to the brain surgeon, "It ain't Rocket Science!"

    Reviews: "dunfiddlin likes his DataTables" - jmcilhinney

    Please be aware that whilst I will read private messages (one day!) I am unlikely to reply to anything that does not contain offers of cash, fame or marriage!

  5. #5

    Thread Starter
    Fanatic Member ThomasJohnsen's Avatar
    Join Date
    Jul 2010
    Location
    Denmark
    Posts
    528

    Re: What is a line in a textfile?

    Quote Originally Posted by dunfiddlin View Post
    Cr and Lf are separate functions for printers. It was only on typewriters where both functions were integrated mechanically. But there's one very good reason to retain CrLf - the millions of text files written since the beginning of computing!
    True! Tbh I didn't think about that. I guess changing formats and standards for files, hardware etc. is more difficult now than it has ever been.
