Results 1 to 3 of 3

Thread: [VB6] Convert CSV to Excel Using ADO

  1. #1

    Thread Starter
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,072

    [VB6] Convert CSV to Excel Using ADO

    A piece of code I thought might be useful to some. Caveat first... Won't be the ideal solution for everyone.

    Having Microsoft Excel on the computer is not a requirement. Including an ADO reference is. This code was tested using both ADO Library versions 2.8 and 6.1. The ADO dependency could be removed if code tweaked to use late-binding CreateObject() like VB scripts do.

    This is unicode-friendly regarding file names. There are comments in the code should anyone want to tweak it to handle unicode file content. The routine will default to the more modern versions of Excel and can be forced to use the lower versions as desired.

    A few common options are provided as function parameters and a schema.ini file would likely be needed for more complex options. Comments in the code talk a bit about that.

    The code is really simple and allows ADO to do 100% of the work. Most of the routine below consists of sanity checks along with dealing with various options. The guts is an ADO connection to the csv file and an SQL execution on that connection to create the Excel file, create the tab/sheet, and copy the csv content to that sheet -- all done in that one execution.

    Code:
     ' API used to check if file exists
    Private Declare Function GetFileAttributes Lib "kernel32.dll" Alias "GetFileAttributesW" (ByVal lpFileName As Long) As Long
    
    Private Function ConvertCSVtoExcel(CsvFile As String, _
                                Optional CsvHasHeaders As Boolean = True, _
                                Optional ExcelSheetName As String = "Sheet1", _
                                Optional ByVal ExcelVersion8 As Boolean = False, _
                                Optional DestinationPath As String = vbNullString, _
                                Optional ReturnedErrorString As String) As String
    
        ' Function will return full Excel file path & name if no errors & ReturnedErrorString will be null
        '   else function returns vbNullString and ReturnedErrorString contains error description
        '   Converted file name will be the same as the CSV except having an Excel extension
        
        ' Unicode. Handles unicode file names & sheet names.
        ' For those that want to also handle unicode CSV data, you will want to pass a new parameter and
        '   modify this routine. Key google term: CharacterSet=Unicode
        
        ' ensure CsvFile exists before calling this function
        ' ensure DestinationPath has write-access. By default this path is same as CsvFile path
        ' ExcelVersion8 can be accessed by most versions of Excel except very, very old versions
        '   if absolutely needed, you may want to rework this to pass an exact version, i.e., 5, 8, 12, etc
        '   If parameter is False, v12 (xlsx extension) will be attempted & falls back to v8 if needed
        '   Version 12 driver can be found here & requires at least Win7
        '   https://www.microsoft.com/en-us/download/details.aspx?id=13255
        
        ' Last but not least, some additional info
        '   many delimited file options can be used, but require a schema.ini file & no changes in this routine
        '       i.e., other delimiter than comma, specifying column data types, different column header names, etc
        '       https://docs.microsoft.com/en-us/sql/odbc/microsoft/schema-ini-file-text-file-driver
        '   if you need to play with xlsb (binary files) vs xlsx files, remove the XML from the v12 connection string
    
        Static v12ProviderAbsent As Boolean
        Const E_NOPROVIDER As Long = 3706&
    
        Dim cn As ADODB.Connection, p As Long
        Dim sSrcFile As String, sSrcPath As String
        Dim sSQL As String, sDest As String
        Dim sHDRprop As String, sVersion As String
        
        ' sanity checks and prep
        p = InStrRev(CsvFile, "\")
        sSrcFile = Mid$(CsvFile, p + 1)
        sSrcPath = Left$(CsvFile, p)
        If DestinationPath = vbNullString Then
            sDest = sSrcPath
        ElseIf Right$(DestinationPath, 1) <> "\" Then
            sDest = DestinationPath & "\"
        Else
            sDest = DestinationPath
        End If
        If v12ProviderAbsent = True Then ExcelVersion8 = True
        p = InStrRev(sSrcFile, ".")
        If p = 0 Then sDest = sDest & "." Else sDest = sDest & Left$(sSrcFile, p)
        If ExcelVersion8 Then sDest = sDest & "xls" Else sDest = sDest & "xlsx"
        If ExcelSheetName = vbNullString Then ExcelSheetName = "Data"
        If CsvHasHeaders Then sHDRprop = "Yes" Else sHDRprop = "No"
        
        ' prevent overwriting existing file; Excel file creation fails if file/sheet already exists
        Do
            If GetFileAttributes(StrPtr(sDest)) = -1& Then Exit Do
            If ExcelVersion8 Then sDest = sDest & ".xls" Else sDest = sDest & ".xlsx"
        Loop
        
        ' verify we can open the csv
        On Error Resume Next
        Set cn = New ADODB.Connection
        cn.CursorLocation = adUseClient
        If Not ExcelVersion8 Then
            cn.ConnectionString = "Provider=Microsoft.Ace.OLEDB.12.0;Data Source=" & _
                sSrcPath & ";Extended Properties='text;HDR=" & sHDRprop & ";FMT=CSVDelimited'"
            cn.Open
            If Err Then ' failure. Either version 12 not installed or O/S less than Win7
                If Err.Number = E_NOPROVIDER Then v12ProviderAbsent = True
                ExcelVersion8 = True                ' try again using lower Excel version
                sDest = Left$(sDest, Len(sDest) - 1)
                Err.Clear
            Else
                sVersion = "12.0 XML"
            End If
        End If
        If ExcelVersion8 Then
            cn.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & _
                sSrcPath & ";Extended Properties='text;HDR=" & sHDRprop & ";FMT=CSVDelimited'"
            cn.Open
            If Err Then ' can't be done via this routine
                ReturnedErrorString = Err.Description
                Err.Clear
                GoTo ExitRoutine
            End If
            sVersion = "8.0"
        End If
        
        ' create the excel file, sheet, & import data in one call
        sSQL = "SELECT * INTO [" & ExcelSheetName & "] IN '' [Excel " & sVersion & _
                ";Database=" & sDest & "] FROM [" & sSrcFile & "]"
        cn.Execute sSQL
        If Err Then
            ReturnedErrorString = Err.Description
            Err.Clear
        Else
            ReturnedErrorString = vbNullString
            ConvertCSVtoExcel = sDest
        End If
        
    ExitRoutine:
        If cn.State Then cn.Close
        Set cn = Nothing
    End Function
    Edited: A specific scenario has yet to be resolved. If the CSV file name (excluding the path) is unicode, the routine fails when the actual data is not unicode (i.e., ASCII); just the file name is. For full unicode support, including actual CSV data in unicode, the CharacterSet=Unicode extended property on the connection string should work well in most cases.
    Last edited by LaVolpe; Mar 14th, 2017 at 05:07 PM.
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  2. #2

    Thread Starter
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,072

    Re: [VB6] Convert CSV to Excel Using ADO

    Just an add-on to this thread.

    Extracting/converting existing source to a csv is easily done with ADO. The advantage of using ADO is that the produced csv file is per specifications. This means comma-delimited, proper quote escaping, and dealing with fields containing embedded commas and carriage returns.

    From trial and error, when extracting from csv to csv, the source csv should be listed in the schema.ini file within that source file's folder. That ini entry for the file should include: MaxScanRows=0. This will help ensure that the extraction doesn't incorrectly best-guess the data type of each extracted field. Jet/ACE creates a schema.ini entry for the new csv file and in that file you can see what data types the extracted fields were set to.

    For csv to csv actions, though the ACE.12.0 driver can be used, not really necessary. Here's a typical csv connection string using JET.4.0. Replace %TARGET% with the source csv file's path
    Code:
    "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=%TARGET%;Extended Properties='text;HDR=YES;FMT=CSVDelimited'"
    If Excel or Access is the source, then modify the connection string as needed to connect to that source.

    If the file holding the extracted/converted data doesn't yet exist, use a SELECT INTO statement like the one below. This will error if the destination file already exists, so delete it as needed. The destination path must exist. The csv headers will be same as the source
    Code:
    "SELECT * INTO [%DestFileName%] IN '' [text;Database=%DestPath%] FROM [%SourceTableName%]"
    
    ' replace %DestFileName% with the destination file name and extension
    ' replace %DestPath% with the destination folder
    ' csv: replace %SourceTableName% with the source file name contained in the folder of your active ADO 
    connection
    ' others: %SourceTableName% is the table name. Note that Excel tables are suffixed with $
    If wanting to append rows to an existing csv file, we would use a INSERT INTO statement. The rule here is simple, append same number of fields and, if needed, in the SQL statement, rename/alias the source fields to the destination fields within the Select portion of the statement: i.e., Select [SourceFldName] As [DestFieldName]
    Code:
    "INSERT INTO [%DestFileName%] IN '' [text;Database=%DestPath%] " & _
         "Select * FROM [%SourceTableName%]"
    
    ' replace %DestFileName% with the destination file name and extension
    ' replace %DestPath% with the destination folder
    ' csv: replace %SourceTableName% with the source file name contained in the folder of your active ADO 
    connection
    ' others: %SourceTableName% is the table name. Note that Excel tables are suffixed with $
    Once you have the SQL statement, you simply provide it as the 1st parameter of the connection object's Execute command, i.e., cn.Execute sSQL

    Edited: Addressing the 255 character field limit for csv files

    When extracting from csv and a source field can have more than 255 characters, you will need to ensure a schema.ini file entry exists for the source file. In that ini entry, DO NOT include MaxScanRows, but do include a Col# entry for each column. The data type for the column(s) that can contain lengths > 255 should be: Memo. If not done, the field values will be truncated to 255 as applicable. When OELDB (Jet/ACE) creates the ini entry for the new file, this is what it enters for such fields, which can also be used instead of Memo: LongChar Attribute 32

    Excel file sources have similar issues. If a field value is > 255 characters, it may be truncated if that field is not the 1st 8 scanned by OLEDB/Excel. The best fix is to set the Excel registry key TypeGuessRows value to 0. This forces every row to be scanned before data type is guessed. It may also result in overall poorer performance accessing Excel files. An alternative way, which likely isn't nearly as useful is to provide a dummy record in the 1st row that contains 256+ characters in the fields/columns you are concerned about. Your SQL statement can prevent the dummy row from being exported by excluding that row using a WHERE clause.
    Last edited by LaVolpe; Mar 21st, 2017 at 10:35 AM.
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  3. #3

    Thread Starter
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,072

    Re: [VB6] ADO/OLEDB and Case-Sensitivity

    Case-Sensitive SQL Where and Join clauses... This can be very frustrating when its happening and you don't know about it.

    If a simple source with 2 records contains these values in the TITLE field: 1) The Therapist and 2) Road to the Top, then what would this Where clause return: WHERE (TITLE Like '%the%')

    Answer: Depends on case-sensitivity. Most will proclaim that LIKE is case-insensitive (CI), but as far as the Jet/ACE OLEDB drivers are concerned, that isn't always the case. In my tests, that query would return both source records if the source were Excel or Access. However, it is case-sensitive (CS) in text sources like csv files; only 1 record returned. This can be further proven, and could possibly return different results, depending on what's installed on the pc.

    1. Create a connection to a source, be it Access, Excel, csv, or other. We'll call that connection: conn
    2. Using conn, execute the following, supplying the TableName value, to get your answer
    Code:
        With conn.Execute("Select Top 1 0 FROM [TableName] Where UCASE('a') Like '%a%'")
            If .EOF = True Then Debug.Print "Case-sensitive" Else Debug.Print "Case-Insensitive"
            .Close
        End With
    With OLEDB, if COLLATE statement remain unavailable to force CS/CI, then some possible workarounds to deal with this problem if OLEDB is your only option for connecting to a source.

    There are a few basic methods to overcome this problem and each has its own headaches. Feel free to mention more... options not headaches

    Option 1: If known in advance if CS/CI applies, create source to allow desired comparisons without needing workarounds.

    Option 2: Create new field(s) in the source to contain the comparison values needed. May not be possible in many cases.

    Option 3: Tweak the SQL statements to be used. Here are some examples. Know that this may seriously impact query performance. Fortunately, some common VB commands can be used in OLEDB queries...

    Force CI when comparisons will be CS (like csv files in above case)
    - Equality: UCase(fieldname) = UCase(value)
    - Like: UCase(fieldname) = UCase('%value%')
    - JOIN: INNER JOIN t2 ON UCase(t2.FieldName) = UCase(t1.FieldName)
    - IN: can't use UCase, i.e., WHERE ... IN UCase('a', 'b', 'c')
    suggest maybe passing the IN values to a routine to UCase them, then:
    WHERE UCase(fieldname) IN (the modified upper case string)

    Force CS when comparisons will be CI. StrComp/InStr with final param as 0 = binary compare
    - Equality: StrComp(fieldname, value, 0) = 0
    - Like: InStr(1, fieldname, value, 0) > 0
    - JOIN: INNER JOIN t2 ON (StrComp(t2.FieldName, t1.FieldName, 0) = 0)
    - IN: this would require splitting the IN values into separate OR statements
    Instead of: WHERE fieldname IN ('aBc', 'DeF')
    Use: WHERE (StrComp(fieldname, 'aBc', 0)=0 OR StrComp(fieldname, 'DeF', 0)=0)

    Final notes: I haven't found any method to override case-sensitivity options in a connection string.

    For CI queries, you may be able to use the Find and/or Filter method of a recordset. Even though the query string sent to OLEDB may be interpreted CS, the recordset's Filter/Find functions are CI. Haven't found any exceptions yet.
    Last edited by LaVolpe; Mar 21st, 2017 at 01:15 PM.
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width