Results 1 to 2 of 2

Thread: [vb6]Yet Another CSV Parser

Threaded View

  1. #1

    Thread Starter
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,541

    [vb6]Yet Another CSV Parser

    A fairly basic CSV parser, with a bit more user-control and a slight twist: Event driven by Record
    The parser does handle quoted field data and delimiters, carriage returns, and other non-printable characters wtihin field data, assuming the field data is properly formatted.

    The parser will raise an event for each record it has finished parsing. You would respond to this event to use/format the parsed field data. The event allows aborting further processing if the CSV appears corrupted. The event will inform you if the CSV file appears corrupt.

    The class offers two ways of feeding it CSV information but only one method is called.
    1) Entire file read into a string
    2) Line by line from the CSV file, read from a loop or from a split string

    As mentioned, an event is called for each record. This event has 3 states:
    1) RecordParsed. A vbNullChar-delimited string is passed which contains the entire record
    2) FieldNamesStatic. A vbNullChar-delimeted string containing field names from the CSV's first row of data
    3) FieldNamesGeneric. A vbNullChar-delimited string containing default field names for CSVs without a header row
    Each event has has three other parameters: HeaderCount, FieldDifferential, and RecordNumber
    RecordNumber will be zero if processing field names else incrementing by 1 each time event is called
    HeaderCount is the number of fields based from the 1st row of parsed data
    FieldDifferential is basically an error if non-zero.
    0 indicates that number of fields in the record equal number of field names
    Negative indicates number of missing fields in the record. HeaderCount + FieldDifferential = number processed fields
    Positive indicates number of extra fields in the record.
    If you feel comfortable trying to handle any discrepancies between field count and header count, no need to reply to the event. However, if you want to abort processing any further records, you simply return the Record parameter as a null string.

    Let's talk about proper formatting
    Delimiters come in various flavors with this class:
    :: Record Delimiter defines when a record ends & a new record begins. Hard coded in the class a vbCr and/or vbLf
    :: Field Delimiter defines when a field ends & a new field begins. This is user-defined & defaults to a comma
    :: Quote/Text Delimiter defines start and end of text where any character (delimiter or not) is not specially handled
    :: Escape Delimiter defines characters not specially handled. Escape delimited files are rare
    Quote and Escape Delimiters are also used to delimit themselves as non-special characters

    1) Every record in a CSV, including any header row, is delimited by a carriage return and/or line feed
    -- Only exception is the final record. It does not require a record delimiter

    2) Every field within a record must be delimited by a character you specify. This class does not process fixed-length field CSVs
    -- Field delimiters are never used before the 1st field and never used after the final field
    -- Any field that contains a null character (ASCII byte 0) will abort the parser

    3) Quotes, i.e., ", are defaulted to be handled as text identifiers. This option can be turned off or changed to a different character
    -- Quote delimiters allow non-printable characters and other delimiters to be treated as just any other character

    4) If any field contains a record and/or field delimiter within the field's data, the delimiters must be identified as non-delimiters
    -- Two options are provided in the class: quoted field data and escape characters

    Delimiters. Let's say the field delimiter is a comma
    :: If any character within a field contains any delimiter, then that delimiter must be escaped
    :: Sample field that contains a field delimiter: Hello, my name is LaVolpe
    If Quote delimit character is " then should be saved to file as: "Hello, my name is LaVolpe"
    If Escape delimit character is \ then should be saved to file as: Hello\, my name is LaVolpe

    Delimiting the delimiters. Simple rule, replace each delimiter with a double delimiter
    :: Sample field with Quote delimiter: Hello, my name is "LaVolpe"
    Saved to file as: "Hello, my name is ""LaVolpe"""
    -- if no field delimiter existed, then this works too: My name is ""LaVolpe""
    :: Sample field with Escape delimiter: C:\My Documents
    Saved to file as: C:\\My Documents
    Note that the Quote & Escape delimiters are a tad different.
    -- Quote delimiters are doubled only within the field data.
    -- If field data has delimiters, then the field data is written to file with a single quote both as a prefix & suffix
    -- Quote delimiters are always paired if used as delimiters
    -- Any record or field delimiter within a field needs no special handling, when that field on disk begins & ends with a quote delimiter
    -- Escape delimiters, if used, are required for every field, record and escape delimiter that exists within a field
    -- Escape delimiters have no 'pairing' requirement unless escaping itself
    -- Mixing Quote & Escape delimiters is not recommended, though can be used if you want to customize your CSV data
    -- Quote and/or Escape delimiter characters must be defined by the user, both are optional
    -- General rule of thumb
    :: Quote/Escape delimiters only needed if record or field delimiters exist in field data
    :: The fact that a field has quotes does not force a requirement of using quote delimiters
    If no quote or no escape delimiters were defined during parser initialization, the following records would work just fine because no record nor field delimiter exists within the individual fields
    Year,Make,Model,Engine,Body,NickName
    1969,Ford,Mustang,351 Windsor,Sport Roof,"Fast Back"


    Quick examples of using the class, both for reading line by line & entire file
    Code:
    Private WithEvents CSVParser As ICSVParser
    
    Private CSVParser_ProcessRecord(ByVal State As ProcessStateEnum, _
    				Record As String, _
    				ByVal FieldDifferential As Long, _
    				ByVal HeaderCount As Long, _
    				ByVal RecordNumber As Long)
    	' process parsed CSV record
    	Dim sData() As String
    	Select Case State
    	Case csvRecordParsed
    		If FieldDifferential Then
    			' handle potentially corrupt CSV
    			' to abort further processing: Record = vbNullString
    		Else
    			sData = Split(Record, vbNullChar)
    			' process data
    		End If
    	Case csvFieldNameStatic
    		sData = Split(Record, vbNullChar)
    		' process field names
    	Case csvFieldNameGeneric
    		sData = Split(Record, vbNullChar)
    		' process field names, optionally, using your own names
    	End Select
    End Sub
    
    Private Sub Command1_Click() ' Full file example
    	If CSVParser Is Nothing Then Set CSVParser = New ICSVParser
    	Dim fnr As Integer, sFile As string
    	fnr = FreeFile()
    	Open "C:\Temp\TestCSV.csv" For Binary As #fnr
    	sFile = Space$(LOF(fnr))
    	Get #fnr, 1, sFile
    	Close #fnr
    	CSVParser.InitializeParser True
    	If CSVParser.ParseRecord(sFile) = False Then
    		' handle informing user of corrupt file
    	End If
    	If CSVParser.TerminateParser() = False Then
    		' handle informing user of corrupt file, final record
    	End If
    End Sub
    
    Private Sub Command2_Click() ' Line by line example
    	If CSVParser Is Nothing Then Set CSVParser = New ICSVParser
    	Dim fnr As Integer, sLine As string
    	fnr = FreeFile()
    	Open "C:\Temp\TestCSV.csv" For Input As #fnr
    	CSVParser.InitializeParser True
    	Do Until EOF(fnr) = True
    		Line Input #fnr, sLine
    		If CSVParser.ParseRecord(sLine) = False Then 
    			' handle informing user of corrupt file
    			Exit Do
    		End If
    	Loop
    	Close #fnr
    	If CSVParser.TerminateParser() = False Then
    		' handle informing user of corrupt file, final record
    	End If
    End Sub
    And, just for the heck of it, a real simple example of loading a CSV to a ListView. In the parser's event:
    Code:
        Dim sData() As String, lItem As Long
        Select Case State
        Case csvRecordParsed
            If FieldDifferential > 0 Then
                Record = ""
            Else
                sData = Split(Record, vbNullChar)
                With ListView1.ListItems.Add(, , sData(0))
                    For lItem = 1 To UBound(sData)
                        .SubItems(lItem) = sData(lItem)
                    Next
                End With
            End If
                    
        Case csvFieldNamesGeneric, csvFieldNamesStatic
            sData = Split(Record, vbNullChar)
            With ListView1
                .ListItems.Clear
                .ColumnHeaders.Clear
                For lItem = 0 To HeaderCount - 1
                    .ColumnHeaders.Add , , sData(lItem)
                Next
            End With
        End Select
    Another parser can be found at this link, this site, by a well-respected coder.
    The attachment below is a class file, remove the .txt extension after downloading
    Attached Files Attached Files
    Last edited by LaVolpe; Nov 27th, 2015 at 11:27 AM.
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width