Results 1 to 15 of 15

Thread: Parsing CSV File

  1. #1

    Thread Starter
    Junior Member
    Join Date
    Feb 2002
    Posts
    24

    Question Parsing CSV File

    I am trying to parse a CSV File.

    My data looks like this:
    1,Meg,Valentine,"ABC, Inc"

    the split function looks at this data and splits:
    1
    Meg
    Valentine
    "ABC
    Inc"

    is there an easy way to get the company name in one field of the array?

    thanks so much in advance!
    meg

  2. #2
    DerFarm
    Guest
    Do you want a generalized solution, or will the excess comman
    ONLY happen in field X?

  3. #3
    C# Aficionado Lord_Rat's Avatar
    Join Date
    Sep 2001
    Location
    Cave
    Posts
    2,497
    Two options:

    1) Create a funciton that evaluates the file character by character and keeps track of whether you are inside of quotes or not.

    If you are inside quotes, re-join the splitted lines.


    2) Use ADO and create a connection to the CSV file. Then read it as a recordset.
    Need to re-register ASP.NET?
    C:\WINNT\Microsoft.NET\Framework\v#VERSIONNUMBER#\aspnet_regiis -i

    (Edit #VERSIONNUMBER# as needed - do a DIR if you don't know)

  4. #4

    Thread Starter
    Junior Member
    Join Date
    Feb 2002
    Posts
    24

    Parsing a CSV File

    a general solution, please.

    thanks so much!

  5. #5
    DerFarm
    Guest
    Presumably, you only get quotes when a comma is embedded? If so:

    Code:
       dim quo1 as integer,quo2 as integer,xyz as string
    
       read in string xyz
       xyz = xyz & " "
       quo1 = instr(xyz,chr$(34))
    
       while quo1 
          quo2 = instr(quo1+1,xyz,chr$(34))
        
          if quo2 then
             com1 = instr(quo1,xyz,",")
             if com1 < quo2 then mid$(xyz,com1,1)=chr$(254)
             quo1 = instr(quo2+1,xyz,chr$(34))
          else
             quo1 = 0
          endif
       wend
    
       split the string
        walk the array looking for chr$(254) and replace with a comma
    a bit gruesome, but it should work.

  6. #6

    Thread Starter
    Junior Member
    Join Date
    Feb 2002
    Posts
    24
    thanks so much! i appreciate the help

  7. #7
    Addicted Member JasonGS's Avatar
    Join Date
    May 2000
    Location
    California
    Posts
    155
    If you’re reading from a file, you can use regular Basic Input. It automatically determines the text qualifier, so if you have a file like so:

    1002, "Smith, John", "CA", 90210
    1003, "Smith, Nancy", "NV", 13647

    ...it knows how to parse it.

    Code:
    Open "myfile.csv" For Input As #1
    	While Not EOF(1)
    		Input #1, Field1, Field2, Field3, Field4
    	Wend
    Close #1

  8. #8
    Fanatic Member HaxSoft's Avatar
    Join Date
    May 2000
    Location
    Ohio
    Posts
    593
    YES -- listen to JasonGS ... he's got it right.

    AND I bet 100 US$ he's been doing QuickBASIC, PDS, or QBasic at some point.... just a feeling.

  9. #9

    Thread Starter
    Junior Member
    Join Date
    Feb 2002
    Posts
    24
    thanks jason....but the problem is that my csv file is not very consistent....

    the first 10 (or so) lines aren't comma delimited....then the next unknown amount of lines are comma delimited, then there are more that are not, etc. etc.

    i dont think your solution deals with that because if field2 isnt on the same line as field1, it will go to the next line to look for it.

  10. #10
    Fanatic Member HaxSoft's Avatar
    Join Date
    May 2000
    Location
    Ohio
    Posts
    593

    REALITY CHECK

    What the *@£#¤%!§ type of file do you have?

    Is there any logic in the data file? If you tell us approximately how the file is organized, then we can help. But if it's just any old file, then where do we start?

  11. #11

    Thread Starter
    Junior Member
    Join Date
    Feb 2002
    Posts
    24
    hi. i am sorry that i didnt give more info in the beginning. i was hoping that there would be a simple solution.....i am new to vb, but it seems like vb has simple solutions for everything....

    anyway, my file is arranged like so:
    3-4 short lines of text (varies)
    n number of comma delimited lines (n depending on the number of open orders)
    3-4 short lines of text (varies)
    n number of comma delimited lines (n depending on the number of closed orders)

    i was hoping to have an easy solution to do the following:

    read a line.
    parse the line by commas
    fi the number of fields = 12 then i care about the info....but if not, then i would just go on.

    i dont like the idea of counting quotes and commas because one of the fields is a comments field....and you never know what a user will put in that.....maybe he closes the quotes and maybe he doesnt.

    further, unfortunately, i cannot change the output of the file....what i have is what i have to work with

    earlier, somebody had mentioned that i do something with ado. given my circumstances is this my best option?

    thanks so much for your help. i really do appreciate it.

  12. #12
    PowerPoster jdc2000's Avatar
    Join Date
    Oct 2001
    Location
    Idaho Falls, Idaho USA
    Posts
    2,526
    The primary problem that will need to be solved first is to determine a method that can be used to distinguish the comment lines from the data lines. It is possible, depending on what the users enter for them, that the comment lines might actually look like a data line. Is there any text that ALWAYS shows up in either the first or last comment line, or is always in the data lines?

  13. #13

    Thread Starter
    Junior Member
    Join Date
    Feb 2002
    Posts
    24
    i am 95% certain that i can count on the first cell of the first line of the open orders to be "Open", and the the first line of the closed orders to be "Closed"

    so the first two lines of the open orders are:

    Open,"Meg Valentine","ABC, Inc","20,000",.....
    ,"John Doe","DEF, Inc","2,000",....

    I am also 99% certain that a comment line will never look like a data line, since the data lines have 12 fields, and from everything i have seen, the comment lines are pretty short.

  14. #14
    PowerPoster jdc2000's Avatar
    Join Date
    Oct 2001
    Location
    Idaho Falls, Idaho USA
    Posts
    2,526
    I have written a function that will separate the fields from a file like yours, but it won't work properly if there are beginning quotes without ending quotes in the line of data. That is why you need to be able to determine when you have a comment line and when you have a data line. You will need to add variable definitions to the following code as well as modify it to suit your data file and needs - it is by no means a complete program.

    Code:
    Public strFields(100) As String, intFields As Integer
    
    Private Sub ReadData()
    
        Open strInputFileName For Input As #1
        
        Do While Not EOF(1)
            Line Input #1, strDataRecord
            If Trim(strDataRecord) <> vbNullString Then
                ' Split record into fields
                GetFields
            End If
        Loop
        Close #1
    
    End Sub
    
    
    Private Sub GetFields()
    
        Dim intTemp1 As Integer, intTemp2 As Integer
        Dim strTemp1 As String
    
        strTemp1 = strDataRecord
        intFields = 1
    
    CheckForComma:
        If Left$(strTemp1, 1) = Chr$(34) Then
            intTemp1 = InStr(2, strTemp1, Chr$(34)) + 1
        Else
            intTemp1 = 1
        End If
        intTemp2 = InStr(intTemp1, strTemp1, ",")
        If intTemp2 = 0 Then GoTo ReadFile
        strFields(intFields) = Trim(Left$(strTemp1, intTemp2 - 1))
        strFields(intFields) = RemoveQuotes(strFields(intFields))
        intFields = intFields + 1
        If Len(strTemp1) > 1 Then
            strTemp1 = Right$(strTemp1, Len(strTemp1) - intTemp2)
            GoTo CheckForComma
        End If
    
    ReadFile:
        strFields(intFields) = RemoveQuotes(Trim(strTemp1))
        ' If strFields(1) = "1779" Or strFields(1) = "2502" Or strFields(1) = "2137" Or strFields(1) = "4331" Then
        ' MsgBox "Field 1: " + strFields(1) + vbCrLf + _
        '        "Field 2: " + strFields(2) + vbCrLf + _
        '        "Field 3: " + strFields(3) + vbCrLf + _
        '        "Field 4: " + strFields(4) + vbCrLf + _
        '        "Field 5: " + strFields(5) + vbCrLf + _
        '        "Field 6: " + strFields(6), , "Data"
        ' End If
        ' If intFields <> 6 Then Stop
    
    End Sub
    
    
    Private Function RemoveCommas(strText As String) As String
    
        Dim intTemp1 As Integer
        Dim strTemp1 As String
        
        strTemp1 = vbNullString
        If Len(strText) > 0 Then
            For intTemp1 = 1 To Len(strText)
                If Mid$(strText, intTemp1, 1) <> "," And _
                   Mid$(strText, intTemp1, 1) <> "." Then
                    strTemp1 = strTemp1 + Mid$(strText, intTemp1, 1)
                End If
            Next intTemp1
        End If
        RemoveCommas = strTemp1
    
    End Function
    
    
    Private Function RemoveQuotes(strText As String) As String
    
        Dim strTemp1 As String
        
        strTemp1 = strText
        
        ' Remove quotes
        If strTemp1 = Chr$(34) Then strTemp1 = vbNullString
        If strTemp1 = Chr$(34) + Chr$(34) Then strTemp1 = vbNullString
        If strTemp1 <> vbNullString Then
            If Left$(strTemp1, 1) = Chr$(34) Then
                If Len(strTemp1) = 1 Then
                    strTemp1 = vbNullString
                Else
                    strTemp1 = Right$(strTemp1, Len(strTemp1) - 1)
                End If
            End If
        End If
        If strTemp1 <> vbNullString Then
            If Right$(strTemp1, 1) = Chr$(34) Then
                If Len(strTemp1) = 1 Then
                    strTemp1 = vbNullString
                Else
                    strTemp1 = Left$(strTemp1, Len(strTemp1) - 1)
                End If
            End If
        End If
        
        If InStr(1, strTemp1, Chr$(34)) <> 0 Then
            strTemp1 = ReplaceQuotes(strTemp1)
        End If
        RemoveQuotes = strTemp1
    
    End Function
    
    
    Private Function ReplaceQuotes(strData As String) As String
    
        Dim intTemp1 As Integer
        Dim strTemp1 As String, strTemp2 As String
        
        strTemp1 = strData
        strTemp2 = strQuoteReplacement
        If strTemp2 <> vbNullString Then
            If Left$(strTemp2, 1) >= "0" And Left$(strTemp2, 1) <= "9" Then
                If Val(strTemp2) >= 0 And Val(strTemp2) <= 255 Then strTemp2 = Chr$(Val(strTemp2))
            End If
        End If
        intTemp1 = InStr(1, strTemp1, Chr$(34))
        Do While intTemp1 <> 0
            If intTemp1 = 1 Then
                If Len(strTemp1) = 1 Then
                    strTemp1 = strTemp2
                Else
                    strTemp1 = strTemp2 + Right$(strTemp1, Len(strTemp1) - 1)
                End If
                intTemp1 = 0
            End If
            If intTemp1 = Len(strTemp1) Then
                strTemp1 = Left$(strTemp1, Len(strTemp1) - 1) + strTemp2
                intTemp1 = 0
            End If
            If intTemp1 <> 0 Then
                strTemp1 = Left$(strTemp1, intTemp1 - 1) + strTemp2 + Right$(strTemp1, Len(strTemp1) - intTemp1)
            End If
            intTemp1 = InStr(1, strTemp1, Chr$(34))
        Loop
        ReplaceQuotes = strTemp1
    
    End Function

  15. #15

    Thread Starter
    Junior Member
    Join Date
    Feb 2002
    Posts
    24
    wow! thanks so much for your help. i will copy this is my project and go from there....

    so much for an easy solution

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width