Thread: All your carriage returns

  1. #1

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    39,047

    All your carriage returns

    There is a program used by a variety of people that writes data to a SQL Server Express database on their local computers. That data then gets uploaded somewhere. I have a program that takes that data and sends it to a service that adds it to a different database. All of that is working nicely, but somebody noted a day or two back that some types of comments wouldn't work right when the data from our database ended up in a CSV. Tracking that down was pretty easy, as it was clear that there were CRLF in some comments. That's a bit of a story in itself, as the users have a very small area to type terse comments, yet some are writing War And Peace into that tiny area.

    Anyways, I removed all the CRLF, and took another look at the data. It was still messed up, so I did a query. What is actually in that field is a smattering of CRLF, CR without LF, and LF without CR. The CR without LF was a small number (about 500), while the CRLF and LF without CR were 150K records, or so. Not an even distribution.
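
    For illustration, here is a minimal Python sketch of that kind of check. It is not the query or code actually used here, and the sample comment and function names are made up. It counts each style of line break separately, and shows why replacing only CRLF leaves the lone CR and lone LF behind:

```python
import re
from collections import Counter

# Alternation order matters: CRLF is tried first, so the CR and LF of a
# pair are never also counted as a lone CR or a lone LF.
LINE_BREAK = re.compile(r"\r\n|\r|\n")
NAMES = {"\r\n": "CRLF", "\r": "CR", "\n": "LF"}


def count_line_breaks(text):
    """Return a Counter with how many CRLF, lone CR, and lone LF occur."""
    return Counter(NAMES[m.group()] for m in LINE_BREAK.finditer(text))


def strip_line_breaks(text, replacement=" "):
    """Replace every style of line break with a single replacement string."""
    return LINE_BREAK.sub(replacement, text)


# Hypothetical comment containing all three styles.
comment = "first\r\nsecond\nthird\rfourth"

print(count_line_breaks(comment))

# Removing only CRLF still leaves the lone LF and the lone CR in place.
naive = comment.replace("\r\n", " ")
print(repr(naive))

# Matching all three styles cleans the field completely.
print(strip_line_breaks(comment))
```

    Trying CRLF before the single characters is the important design choice: with the alternatives in any other order, every CRLF would be reported as one CR plus one LF.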

    What I'm wondering is: How did this happen? The users aren't doing anything other than typing into a textbox. I would expect just one form of line termination, and I would expect it to be CRLF on a Windows computer. They are clearly getting a mix, and that almost certainly has to be the doing of that program they are using to enter the data, but what is it doing? The program was written in C#, and that's about all I know about it. Are there circumstances where you would get LF instead of CRLF? Are there much rarer circumstances where you would get CR instead of CRLF?
    My usual boring signature: Nothing

  2. #2
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,352

    Re: All your carriage returns

    When typing into a WinForms TextBox control, hitting the Enter key adds a CrLf pair. Doing the same with a RichTextBox, you get just an Lf, i.e. the RichTextBox uses Unix line breaks. Could that account for that variation?

    If I'm not mistaken, classic Mac OS (before OS X) used Cr alone for line breaks, while modern macOS uses Lf. Could anyone have edited your data on a Mac?

  3. #3
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,352

    Re: All your carriage returns

    By the way, you can incorporate fields with line breaks, i.e. record delimiters, in a CSV the same way you incorporate fields with commas, i.e. field delimiters. You do both by quoting the field value. CSV readers like a .NET TextFieldParser will successfully parse quoted data containing line breaks.
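
    As a rough illustration of the same idea outside .NET, here is a small Python sketch using the standard csv module rather than TextFieldParser. The row contents and helper names are made up. The writer quotes the field with embedded line breaks and the comma, and the reader returns it unchanged:

```python
import csv
import io


def write_csv(rows):
    """Serialize rows to CSV text; fields with commas or line breaks get quoted."""
    # newline="" stops the text layer from translating line endings.
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def read_csv(text):
    """Parse CSV text back into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline="")))


# Hypothetical data: one comment containing a CRLF, a comma, and a lone LF.
rows = [
    ["id", "comment"],
    ["1", "line one\r\nline two, with a comma\nline three"],
]

text = write_csv(rows)
print(text)

# The multi-line comment comes back as a single field, line breaks intact.
print(read_csv(text) == rows)
```

    Whether this helps depends on the consumer of the CSV: a reader that splits on line breaks before looking at quotes will still see broken records.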

  4. #4

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    39,047

    Re: All your carriage returns

    Interesting points, but after a bit of thought, I don't believe they can be relevant. I'm pretty sure the Mac option is out of the question, as I don't think we have anybody who uses a Mac for this. The very low number of CR-only records means that it is somewhat possible, though they'd have to be using an emulator to run the Windows-only software, and I'm not sure that Macs have the hardware port options needed for this use.

    One thing I can say is that I'm pretty sure the program is WPF. I suppose it is possible that there are two screens, one which uses a textbox while the other uses an RTB. I haven't used it enough to know for sure. It seems unlikely, though. It's not a terribly complicated application, and this is just one field. It seems unlikely that they'd use one control for entry and a different one for edits.
