-
Mar 12th, 2017, 09:09 PM
#1
[VB6] Convert CSV to Excel Using ADO
A piece of code I thought might be useful to some. Caveat first... Won't be the ideal solution for everyone.
Having Microsoft Excel on the computer is not a requirement. Including an ADO reference is. This code was tested using both ADO Library versions 2.8 and 6.1. The ADO dependency could be removed if code tweaked to use late-binding CreateObject() like VB scripts do.
This is unicode-friendly regarding file names. There are comments in the code should anyone want to tweak it to handle unicode file content. The routine will default to the more modern versions of Excel and can be forced to use the lower versions as desired.
A few common options are provided as function parameters and a schema.ini file would likely be needed for more complex options. Comments in the code talk a bit about that.
The code is really simple and allows ADO to do 100% of the work. Most of the routine below consists of sanity checks along with dealing with various options. The guts is an ADO connection to the csv file and an SQL execution on that connection to create the Excel file, create the tab/sheet, and copy the csv content to that sheet -- all done in that one execution.
Code:
' API used to check if file exists
Private Declare Function GetFileAttributes Lib "kernel32.dll" Alias "GetFileAttributesW" (ByVal lpFileName As Long) As Long
Private Function ConvertCSVtoExcel(CsvFile As String, _
Optional CsvHasHeaders As Boolean = True, _
Optional ExcelSheetName As String = "Sheet1", _
Optional ByVal ExcelVersion8 As Boolean = False, _
Optional DestinationPath As String = vbNullString, _
Optional ReturnedErrorString As String) As String
' Function will return full Excel file path & name if no errors & ReturnedErrorString will be null
' else function returns vbNullString and ReturnedErrorString contains error description
' Converted file name will be the same as the CSV except having an Excel extension
' Unicode. Handles unicode file names & sheet names.
' For those that want to also handle unicode CSV data, you will want to pass a new parameter and
' modify this routine. Key google term: CharacterSet=Unicode
' ensure CsvFile exists before calling this function
' ensure DestinationPath has write-access. By default this path is same as CsvFile path
' ExcelVersion8 can be accessed by most versions of Excel except very, very old versions
' if absolutely needed, you may want to rework this to pass an exact version, i.e., 5, 8, 12, etc
' If parameter is False, v12 (xlsx extension) will be attempted & falls back to v8 if needed
' Version 12 driver can be found here & requires at least Win7
' https://www.microsoft.com/en-us/download/details.aspx?id=13255
' Last but not least, some additional info
' many delimited file options can be used, but require a schema.ini file & no changes in this routine
' i.e., other delimiter than comma, specifying column data types, different column header names, etc
' https://docs.microsoft.com/en-us/sql/odbc/microsoft/schema-ini-file-text-file-driver
' if you need to play with xlsb (binary files) vs xlsx files, remove the XML from the v12 connection string
Static v12ProviderAbsent As Boolean
Const E_NOPROVIDER As Long = 3706&
Dim cn As ADODB.Connection, p As Long
Dim sSrcFile As String, sSrcPath As String
Dim sSQL As String, sDest As String
Dim sHDRprop As String, sVersion As String
' sanity checks and prep
p = InStrRev(CsvFile, "\")
sSrcFile = Mid$(CsvFile, p + 1)
sSrcPath = Left$(CsvFile, p)
If DestinationPath = vbNullString Then
sDest = sSrcPath
ElseIf Right$(DestinationPath, 1) <> "\" Then
sDest = DestinationPath & "\"
Else
sDest = DestinationPath
End If
If v12ProviderAbsent = True Then ExcelVersion8 = True
p = InStrRev(sSrcFile, ".")
If p = 0 Then sDest = sDest & "." Else sDest = sDest & Left$(sSrcFile, p)
If ExcelVersion8 Then sDest = sDest & "xls" Else sDest = sDest & "xlsx"
If ExcelSheetName = vbNullString Then ExcelSheetName = "Data"
If CsvHasHeaders Then sHDRprop = "Yes" Else sHDRprop = "No"
' prevent overwriting existing file; Excel file creation fails if file/sheet already exists
Do
If GetFileAttributes(StrPtr(sDest)) = -1& Then Exit Do
If ExcelVersion8 Then sDest = sDest & ".xls" Else sDest = sDest & ".xlsx"
Loop
' verify we can open the csv
On Error Resume Next
Set cn = New ADODB.Connection
cn.CursorLocation = adUseClient
If Not ExcelVersion8 Then
cn.ConnectionString = "Provider=Microsoft.Ace.OLEDB.12.0;Data Source=" & _
sSrcPath & ";Extended Properties='text;HDR=" & sHDRprop & ";FMT=CSVDelimited'"
cn.Open
If Err Then ' failure. Either version 12 not installed or O/S less than Win7
If Err.Number = E_NOPROVIDER Then v12ProviderAbsent = True
ExcelVersion8 = True ' try again using lower Excel version
sDest = Left$(sDest, Len(sDest) - 1)
Err.Clear
Else
sVersion = "12.0 XML"
End If
End If
If ExcelVersion8 Then
cn.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & _
sSrcPath & ";Extended Properties='text;HDR=" & sHDRprop & ";FMT=CSVDelimited'"
cn.Open
If Err Then ' can't be done via this routine
ReturnedErrorString = Err.Description
Err.Clear
GoTo ExitRoutine
End If
sVersion = "8.0"
End If
' create the excel file, sheet, & import data in one call
sSQL = "SELECT * INTO [" & ExcelSheetName & "] IN '' [Excel " & sVersion & _
";Database=" & sDest & "] FROM [" & sSrcFile & "]"
cn.Execute sSQL
If Err Then
ReturnedErrorString = Err.Description
Err.Clear
Else
ReturnedErrorString = vbNullString
ConvertCSVtoExcel = sDest
End If
ExitRoutine:
If cn.State Then cn.Close
Set cn = Nothing
End Function
Edited: A specific scenario has yet to be resolved. If the CSV file name (excluding the path) is unicode, the routine fails when the actual data is not unicode (i.e., ASCII); just the file name is. For full unicode support, including actual CSV data in unicode, the CharacterSet=Unicode extended property on the connection string should work well in most cases.
Last edited by LaVolpe; Mar 14th, 2017 at 05:07 PM.
-
Mar 21st, 2017, 09:05 AM
#2
Re: [VB6] Convert CSV to Excel Using ADO
Just an add-on to this thread.
Extracting/converting existing source to a csv is easily done with ADO. The advantage of using ADO is that the produced csv file is per specifications. This means comma-delimited, proper quote escaping, and dealing with fields containing embedded commas and carriage returns.
From trial and error, when extracting from csv to csv, the source csv should be listed in the schema.ini file within that source file's folder. That ini entry for the file should include: MaxScanRows=0. This will help ensure that the extraction doesn't incorrectly best-guess the data type of each extracted field. Jet/ACE creates a schema.ini entry for the new csv file and in that file you can see what data types the extracted fields were set to.
For csv to csv actions, though the ACE.12.0 driver can be used, not really necessary. Here's a typical csv connection string using JET.4.0. Replace %TARGET% with the source csv file's path
Code:
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=%TARGET%;Extended Properties='text;HDR=YES;FMT=CSVDelimited'"
If Excel or Access is the source, then modify the connection string as needed to connect to that source.
If the file holding the extracted/converted data doesn't yet exist, use a SELECT INTO statement like the one below. This will error if the destination file already exists, so delete it as needed. The destination path must exist. The csv headers will be same as the source
Code:
"SELECT * INTO [%DestFileName%] IN '' [text;Database=%DestPath%] FROM [%SourceTableName%]"
' replace %DestFileName% with the destination file name and extension
' replace %DestPath% with the destination folder
' csv: replace %SourceTableName% with the source file name contained in the folder of your active ADO
connection
' others: %SourceTableName% is the table name. Note that Excel tables are suffixed with $
If wanting to append rows to an existing csv file, we would use a INSERT INTO statement. The rule here is simple, append same number of fields and, if needed, in the SQL statement, rename/alias the source fields to the destination fields within the Select portion of the statement: i.e., Select [SourceFldName] As [DestFieldName]
Code:
"INSERT INTO [%DestFileName%] IN '' [text;Database=%DestPath%] " & _
"Select * FROM [%SourceTableName%]"
' replace %DestFileName% with the destination file name and extension
' replace %DestPath% with the destination folder
' csv: replace %SourceTableName% with the source file name contained in the folder of your active ADO
connection
' others: %SourceTableName% is the table name. Note that Excel tables are suffixed with $
Once you have the SQL statement, you simply provide it as the 1st parameter of the connection object's Execute command, i.e., cn.Execute sSQL
Edited: Addressing the 255 character field limit for csv files
When extracting from csv and a source field can have more than 255 characters, you will need to ensure a schema.ini file entry exists for the source file. In that ini entry, DO NOT include MaxScanRows, but do include a Col# entry for each column. The data type for the column(s) that can contain lengths > 255 should be: Memo. If not done, the field values will be truncated to 255 as applicable. When OELDB (Jet/ACE) creates the ini entry for the new file, this is what it enters for such fields, which can also be used instead of Memo: LongChar Attribute 32
Excel file sources have similar issues. If a field value is > 255 characters, it may be truncated if that field is not the 1st 8 scanned by OLEDB/Excel. The best fix is to set the Excel registry key TypeGuessRows value to 0. This forces every row to be scanned before data type is guessed. It may also result in overall poorer performance accessing Excel files. An alternative way, which likely isn't nearly as useful is to provide a dummy record in the 1st row that contains 256+ characters in the fields/columns you are concerned about. Your SQL statement can prevent the dummy row from being exported by excluding that row using a WHERE clause.
Last edited by LaVolpe; Mar 21st, 2017 at 10:35 AM.
-
Mar 21st, 2017, 12:31 PM
#3
Re: [VB6] ADO/OLEDB and Case-Sensitivity
Case-Sensitive SQL Where and Join clauses... This can be very frustrating when its happening and you don't know about it.
If a simple source with 2 records contains these values in the TITLE field: 1) The Therapist and 2) Road to the Top, then what would this Where clause return: WHERE (TITLE Like '%the%')
Answer: Depends on case-sensitivity. Most will proclaim that LIKE is case-insensitive (CI), but as far as the Jet/ACE OLEDB drivers are concerned, that isn't always the case. In my tests, that query would return both source records if the source were Excel or Access. However, it is case-sensitive (CS) in text sources like csv files; only 1 record returned. This can be further proven, and could possibly return different results, depending on what's installed on the pc.
1. Create a connection to a source, be it Access, Excel, csv, or other. We'll call that connection: conn
2. Using conn, execute the following, supplying the TableName value, to get your answer
Code:
With conn.Execute("Select Top 1 0 FROM [TableName] Where UCASE('a') Like '%a%'")
If .EOF = True Then Debug.Print "Case-sensitive" Else Debug.Print "Case-Insensitive"
.Close
End With
With OLEDB, if COLLATE statement remain unavailable to force CS/CI, then some possible workarounds to deal with this problem if OLEDB is your only option for connecting to a source.
There are a few basic methods to overcome this problem and each has its own headaches. Feel free to mention more... options not headaches
Option 1: If known in advance if CS/CI applies, create source to allow desired comparisons without needing workarounds.
Option 2: Create new field(s) in the source to contain the comparison values needed. May not be possible in many cases.
Option 3: Tweak the SQL statements to be used. Here are some examples. Know that this may seriously impact query performance. Fortunately, some common VB commands can be used in OLEDB queries...
Force CI when comparisons will be CS (like csv files in above case)
- Equality: UCase(fieldname) = UCase(value)
- Like: UCase(fieldname) = UCase('%value%')
- JOIN: INNER JOIN t2 ON UCase(t2.FieldName) = UCase(t1.FieldName)
- IN: can't use UCase, i.e., WHERE ... IN UCase('a', 'b', 'c')
suggest maybe passing the IN values to a routine to UCase them, then:
WHERE UCase(fieldname) IN (the modified upper case string)
Force CS when comparisons will be CI. StrComp/InStr with final param as 0 = binary compare
- Equality: StrComp(fieldname, value, 0) = 0
- Like: InStr(1, fieldname, value, 0) > 0
- JOIN: INNER JOIN t2 ON (StrComp(t2.FieldName, t1.FieldName, 0) = 0)
- IN: this would require splitting the IN values into separate OR statements
Instead of: WHERE fieldname IN ('aBc', 'DeF')
Use: WHERE (StrComp(fieldname, 'aBc', 0)=0 OR StrComp(fieldname, 'DeF', 0)=0)
Final notes: I haven't found any method to override case-sensitivity options in a connection string.
For CI queries, you may be able to use the Find and/or Filter method of a recordset. Even though the query string sent to OLEDB may be interpreted CS, the recordset's Filter/Find functions are CI. Haven't found any exceptions yet.
Last edited by LaVolpe; Mar 21st, 2017 at 01:15 PM.
-
Nov 19th, 2020, 02:42 PM
#4
New Member
Re: [VB6] Convert CSV to Excel Using ADO
I created an account just to say a "THANK YOU" to you LaVolpe! I was in a pickle having to convert a CSV to EXCEL and needed a quick solution. Found your code and it worked flawlessly. Thank you so much!
-
Nov 20th, 2020, 09:34 AM
#5
Hyperactive Member
Re: [VB6] Convert CSV to Excel Using ADO
LaVolpe, Is it possible to do this work in the form of a project with a description and examples of use to add to the code base?
Tags for this Thread
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|