-
Feb 7th, 2019, 06:07 PM
#1
All your carriage returns
There is a program used by a variety of people that writes data to a SQL Server Express database on their local computers. That data then gets uploaded somewhere. I have a program that takes that data and sends it to a service that adds it to a different database. All of that is working nicely, but somebody noted a day or two back that some types of comments wouldn't work right when the data from our database ended up in a CSV. Tracking that down was pretty easy, as it was clear that there were CRLF in some comments. That's a bit of a story in itself, as the users have a very small area to type terse comments, yet some are writing War And Peace into that tiny area.
Anyway, I removed all the CRLF and took another look at the data. It was still messed up, so I ran a query. What is actually in that field is a smattering of CRLF, CR without LF, and LF without CR. The CR without LF cases were few (about 500), while the CRLF and LF without CR cases came to 150K records or so. Not an even distribution.
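For illustration only, the three cases can be told apart with a few regular expressions. This is a minimal Python sketch, not the query that was actually run against the database; the function names `count_line_breaks` and `strip_line_breaks` and the sample string are made up for the example.

```python
import re


def count_line_breaks(text):
    """Count CRLF pairs, CR without LF, and LF without CR in a string."""
    crlf = len(re.findall(r"\r\n", text))
    lone_cr = len(re.findall(r"\r(?!\n)", text))   # CR not followed by LF
    lone_lf = len(re.findall(r"(?<!\r)\n", text))  # LF not preceded by CR
    return {"CRLF": crlf, "CR": lone_cr, "LF": lone_lf}


def strip_line_breaks(text, replacement=" "):
    """Replace every kind of line break, trying CRLF first so a pair
    becomes one replacement rather than two."""
    return re.sub(r"\r\n|\r|\n", replacement, text)


# Hypothetical comment containing one of each kind of break.
sample = "first\r\nsecond\nthird\rfourth"
print(count_line_breaks(sample))
print(strip_line_breaks(sample))
```

Removing only the CRLF pairs leaves the lone CR and lone LF characters in place, which would explain why the data still looked wrong after the first cleanup.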
What I'm wondering is: How did this happen? The users aren't doing anything other than typing into a textbox. I would expect just one form of line termination, and I would expect it to be CRLF on a Windows computer. They are clearly getting a mix, and that almost certainly has to be the doing of the program they are using to enter the data, but what is it doing? The program was written in C#, and that's about all I know about it. Are there circumstances where you would get LF instead of CRLF? Are there much rarer circumstances where you would get CR instead of CRLF?
My usual boring signature: Nothing
-
Feb 7th, 2019, 09:13 PM
#2
Re: All your carriage returns
When typing into a WinForms TextBox control, hitting the Enter key adds a CrLf pair. Doing the same with a RichTextBox, you get just an Lf, i.e. the RichTextBox uses Unix line breaks. Could that account for that variation?
If I'm not mistaken, classic Mac OS (before OS X) used Cr alone for line breaks, while modern macOS uses Lf. Could anyone have edited your data on a Mac, or pasted in text that originated on one?
-
Feb 7th, 2019, 09:16 PM
#3
Re: All your carriage returns
By the way, you can incorporate fields with line breaks, i.e. record delimiters, in a CSV the same way you incorporate fields with commas, i.e. field delimiters. You do both by quoting the field value. CSV readers like a .NET TextFieldParser will successfully parse quoted data containing line breaks.
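As a rough illustration of the quoting approach, here is a minimal sketch using Python's standard `csv` module in place of the .NET TextFieldParser mentioned above. The sample rows and the helper names `to_csv` and `from_csv` are invented for the example.

```python
import csv
import io

# Hypothetical records; the comments contain CRLF, a comma, a lone LF and a lone CR.
rows = [
    ["id", "comment"],
    ["1", "line one\r\nline two"],
    ["2", "has, a comma"],
    ["3", "lone LF\nand lone CR\rhere"],
]


def to_csv(records):
    """Write records as CSV text; fields containing delimiters or
    line breaks are wrapped in double quotes by the writer."""
    buffer = io.StringIO(newline="")  # newline="" disables newline translation
    csv.writer(buffer, quoting=csv.QUOTE_MINIMAL).writerows(records)
    return buffer.getvalue()


def from_csv(text):
    """Parse CSV text back into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline="")))


text = to_csv(rows)
print(repr(text))
print(from_csv(text) == rows)  # the embedded line breaks survive the round trip
```

The point of the sketch is that the line breaks do not have to be stripped at all, provided both the writer and whatever reads the CSV honour quoted fields.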
-
Feb 8th, 2019, 11:40 AM
#4
Re: All your carriage returns
Interesting points, but after a bit of thought, I don't believe they can be relevant. I'm pretty sure the Mac option is out of the question, as I don't think we have anybody who uses a Mac for this. The very low number of CR-only records means that it is somewhat possible, though they'd have to be using an emulator to run the Windows-only software, and I'm not sure that Macs have the hardware port options needed for this use.
One thing I can say is that I'm pretty sure the program is WPF. I suppose it is possible that there are two screens, one which uses a textbox while the other uses an RTB. I haven't used it enough to know for sure. It seems unlikely, though. It's not a terribly complicated application, and this is just one field, so I doubt they'd use one control for entry and a different one for edits.