I have the following function, which worked well in the past, but the company that sends us the CSV has now added a new field that sometimes contains newlines (\r\n). The function no longer works for us because (line = readFile.ReadLine()) splits at every newline, including those inside a quoted field.

Code:
    /// <summary>
    /// Pulls info from CSV file and stores each entry as list of string arrays
    /// </summary>
    /// <param name="path"></param>
    /// <returns></returns>
    public static List<string[]> parseCSV(string path)
    {
        List<string[]> parsedData = new List<string[]>();

        try
        {
            using (StreamReader readFile = new StreamReader(path))
            {
                string line;
                string[] row;
                // Should be commas that are not encapsulated in quotation marks
                string pattern = ",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))";
                Regex r = new Regex(pattern);

                while ((line = readFile.ReadLine()) != null)
                {
                    row = r.Split(line);
                    parsedData.Add(row);
                }
            }
        }
        catch (Exception e)
        {
            MessageBox.Show(e.Message);
            CommitSuicide();
        }

        return parsedData;
    }

What is the best way to modify the existing function so that it only splits at newlines that aren't enclosed in double quotes? I suppose I could write a StreamReader extension method called ReadEntry that essentially recreates what ReadLine already does, but that sounds rather tedious and, to be honest, beyond my skill level.
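
For reference, here is a rough, untested sketch of what such a ReadEntry helper might look like. It rests on several assumptions: ReadEntry and CsvReaderExtensions are hypothetical names, quotes embedded inside a field are escaped by doubling them (""), and the original line break was \r\n. It keeps calling ReadLine and joining the pieces for as long as a double-quoted field is still open, which it detects by counting quote characters.

```csharp
using System.IO;
using System.Text;

public static class CsvReaderExtensions
{
    // Hypothetical helper: reads one logical CSV record, joining physical
    // lines while a double-quoted field is still open.
    public static string ReadEntry(this TextReader reader)
    {
        string line = reader.ReadLine();
        if (line == null)
        {
            return null; // end of input
        }

        StringBuilder entry = new StringBuilder(line);
        // An odd number of quotes means a quoted field was opened but not closed.
        bool inQuotes = HasOddQuoteCount(line);

        while (inQuotes)
        {
            string next = reader.ReadLine();
            if (next == null)
            {
                break; // unterminated quote: return what was read so far
            }

            // ReadLine strips the line break, so put it back (assumed to be \r\n).
            entry.Append("\r\n").Append(next);

            // An odd number of quotes on this line closes the open field.
            if (HasOddQuoteCount(next))
            {
                inQuotes = false;
            }
        }

        return entry.ToString();
    }

    // Doubled quotes ("") add two to the count, so they do not change parity.
    private static bool HasOddQuoteCount(string s)
    {
        int count = 0;
        foreach (char c in s)
        {
            if (c == '"')
            {
                count++;
            }
        }
        return (count % 2) == 1;
    }
}
```

With something like this in place, the loop in parseCSV would only need readFile.ReadLine() swapped for readFile.ReadEntry(). The existing regex should still split the joined record on unquoted commas, since [^\"] also matches newline characters.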



