-
Feb 18th, 2010, 10:46 AM
#1
Fast & correctly reading CSV parser (Comma-separated values)
I'm rather surprised that I didn't really find any CSV parsers from our CodeBank. You can find CSV saving code, but reading a CSV file seems to be ignored.
My function here attempts to read a CSV file properly. This means:- Rows are separated by line change.
- Columns are separated by comma.
- Commas and line changes are allowed within quotes.
- Dual quotes within quotes = quote.
- Space characters are preserved as is required by RFC4180
- You can customize row, column and quote separators (this allows reading tab separated CSV files generated by Excel)
The procedure returns one dimensional string array, but it holds 2D data. This is why it also outputs the number of columns and rows.
Usage is as follows:
Code:
Dim strCSV() As String, lngRows As Long, lngCols As Long
Dim strFile As String, lngA As Long
' read sample data to strFile
Open "test.csv" For Input As #1
strFile = Input(LOF(1), #1)
Close #1
' parse CSV
SplitCSV strFile, strCSV, lngCols, lngRows
' sample output
For lngA = 0 To UBound(strCSV)
Debug.Print "Row " & (lngA \ lngCols), "Column " & (lngA Mod lngCols), "Data: " & strCSV(lngA)
Next lngA
You can also input the number of columns you want and the rest of the columns are ignored. This does not work for rows, all rows are always returned. The column feature is useful if you want to add new columns, you can simply directly work with the string array and have the new columns available.
Finally, the procedure below is optimized for speed. It could be better, but it does a very good job as it is. Using other solutions such as a VBScript regular expression will be a lot slower.
CODE IN NEXT POST
Last edited by Merri; Feb 22nd, 2010 at 10:59 AM.
-
Feb 18th, 2010, 10:46 AM
#2
Re: Fast CSV (Comma-separated values) parser
Code:
' modCSV.bas
Option Explicit
Private Declare Function ArrayPtr Lib "msvbvm60" Alias "VarPtr" (Arr() As Any) As Long
Private Declare Sub PutMem4 Lib "msvbvm60" (ByVal Ptr As Long, ByVal Value As Long)
Private Declare Function SafeArrayRedim Lib "oleaut32" (ByVal saPtr As Long, saBound As Long) As Long
' returns one dimensional zero based string array in ResultSplit containing parsed CSV cells
' - ResultCols (in/out) number of columns; if positive on input the CSV data is fixed to given number of columns
' - ResultRows (out) number of rows
Public Sub SplitCSV(Expression As String, ResultSplit() As String, ResultCols As Long, ResultRows As Long, Optional ColumnDelimiter As String = ",", Optional RowDelimiter As String = vbNewLine, Optional Quote As String = """")
Dim CSV() As Integer, HeaderCSV(5) As Long, lngCSV As Long
' general variables that we need
Dim intColumn As Integer, intQuote As Integer, lngRow As Long, strRow As String
Dim lngExpLen As Long, lngRowLen As Long
Dim blnQuote As Boolean, lngA As Long, lngB As Long, lngC As Long, lngCount As Long, lngResults() As Long
' some dummy variables that we happen to need
Dim Compare As VbCompareMethod, SafeArrayBound(1) As Long
' length information
lngExpLen = LenB(Expression)
lngRowLen = LenB(RowDelimiter)
' validate lengths
If lngExpLen > 0 And lngRowLen > 0 Then
' column delimiter
If LenB(ColumnDelimiter) Then intColumn = AscW(ColumnDelimiter): ColumnDelimiter = Left$(ColumnDelimiter, 1) Else intColumn = 44: ColumnDelimiter = ","
' quote character
If LenB(Quote) Then intQuote = AscW(Quote): Quote = Left$(Quote, 1) Else intQuote = 34: Quote = """"
' maximum number of results
ReDim lngResults(0 To (lngExpLen \ lngRowLen))
' prepare CSV array
HeaderCSV(0) = 1
HeaderCSV(1) = 2
HeaderCSV(3) = StrPtr(Expression)
HeaderCSV(4) = Len(Expression)
' assign Expression data to the Integer array
lngCSV = ArrayPtr(CSV)
PutMem4 lngCSV, VarPtr(HeaderCSV(0))
' find first row delimiter, see if within quote or not
lngA = InStrB(1, Expression, RowDelimiter, Compare)
Do Until (lngA And 1) Or (lngA = 0)
lngA = InStrB(lngA + 1, Expression, RowDelimiter, Compare)
Loop
lngB = InStrB(1, Expression, Quote, Compare)
Do Until (lngB And 1) Or (lngB = 0)
lngB = InStrB(lngB + 1, Expression, Quote, Compare)
Loop
Do While lngA > 0
If lngA + lngRowLen <= lngB Or lngB = 0 Then
lngResults(lngCount) = lngA
lngA = InStrB(lngA + lngRowLen, Expression, RowDelimiter, Compare)
Do Until (lngA And 1) Or (lngA = 0)
lngA = InStrB(lngA + 1, Expression, RowDelimiter, Compare)
Loop
If lngCount Then
lngCount = lngCount + 1
Else
' calculate number of resulting columns if invalid number of columns
If ResultCols < 1 Then
ResultCols = 1
intColumn = AscW(ColumnDelimiter)
For lngC = 0 To (lngResults(0) - 1) \ 2
If blnQuote Then
If CSV(lngC) <> intQuote Then Else blnQuote = False
Else
Select Case CSV(lngC)
Case intQuote
blnQuote = True
Case intColumn
ResultCols = ResultCols + 1
End Select
End If
Next lngC
End If
lngCount = 1
End If
Else
lngB = InStrB(lngB + 2, Expression, Quote, Compare)
Do Until (lngB And 1) Or (lngB = 0)
lngB = InStrB(lngB + 1, Expression, Quote, Compare)
Loop
If lngB Then
lngA = InStrB(lngB + 2, Expression, RowDelimiter, Compare)
Do Until (lngA And 1) Or (lngA = 0)
lngA = InStrB(lngA + 1, Expression, RowDelimiter, Compare)
Loop
If lngA Then
lngB = InStrB(lngB + 2, Expression, Quote, Compare)
Do Until (lngB And 1) Or (lngB = 0)
lngB = InStrB(lngB + 1, Expression, Quote, Compare)
Loop
End If
End If
End If
Loop
lngResults(lngCount) = lngExpLen + 1
' number of rows
ResultRows = lngCount + 1
' string array items to return
ReDim Preserve ResultSplit(0 To ResultRows * ResultCols - 1)
' first row
lngCount = 0
strRow = LeftB$(Expression, lngResults(0) - 1)
HeaderCSV(3) = StrPtr(strRow)
lngC = 0
blnQuote = False
For lngB = 0 To (lngResults(0) - 1) \ 2
If blnQuote Then
Select Case CSV(lngB)
Case intQuote
If CSV(lngB + 1) = intQuote Then
' skip next char (quote)
lngB = lngB + 1
' add quote char
CSV(lngC) = intQuote
lngC = lngC + 1
Else
blnQuote = False
End If
Case Else
' add this char
If lngB > lngC Then CSV(lngC) = CSV(lngB)
lngC = lngC + 1
End Select
Else
Select Case CSV(lngB)
Case intQuote
blnQuote = True
Case intColumn
' add this column
ResultSplit(lngCount) = Left$(strRow, lngC)
' max column reached?
lngCount = lngCount + 1
If lngCount >= ResultCols Then Exit For
' start filling column string buffer from start (strRow)
lngC = 0
Case Else
' add this char
If lngB > lngC Then CSV(lngC) = CSV(lngB)
lngC = lngC + 1
End Select
End If
Next lngB
' add last column item?
If lngCount < ResultCols Then ResultSplit(lngCount) = Left$(strRow, lngC - 1)
' rows after first
For lngA = 1 To ResultRows - 1
' start index for columns
lngRow = lngA * ResultCols
lngCount = 0
strRow = MidB$(Expression, lngResults(lngA - 1) + lngRowLen, lngResults(lngA) - lngResults(lngA - 1) - lngRowLen)
HeaderCSV(3) = StrPtr(strRow)
lngC = 0
blnQuote = False
For lngB = 0 To (lngResults(lngA) - lngResults(lngA - 1) - lngRowLen) \ 2
If blnQuote Then
Select Case CSV(lngB)
Case intQuote
If CSV(lngB + 1) = intQuote Then
' skip next char (quote)
lngB = lngB + 1
' add quote char
CSV(lngC) = intQuote
lngC = lngC + 1
Else
blnQuote = False
End If
Case Else
' add this char
CSV(lngC) = CSV(lngB)
lngC = lngC + 1
End Select
Else
Select Case CSV(lngB)
Case intQuote
blnQuote = True
Case intColumn
' add this column
ResultSplit(lngRow + lngCount) = Left$(strRow, lngC)
' max column reached?
lngCount = lngCount + 1
If lngCount >= ResultCols Then Exit For
' start filling column string buffer from start (strRow)
lngC = 0
Case Else
' add this char
If lngB > lngC Then CSV(lngC) = CSV(lngB)
lngC = lngC + 1
End Select
End If
Next lngB
' add last column item?
If lngCount < ResultCols Then ResultSplit(lngRow + lngCount) = Left$(strRow, lngC - 1)
Next lngA
' clean up CSV array
PutMem4 lngCSV, 0
Else
ResultCols = 0
ResultRows = 0
' clean any possible data that exists in the passed string array (like if it is multidimensional)
If Not Not ResultSplit Then Erase ResultSplit
' mysterious IDE error fix
Debug.Assert App.hInstance
' reset to one element, one dimension
ReDim ResultSplit(0 To 0)
' custom redimension: remove the items (this duplicates the VB6 Split behavior)
SafeArrayRedim Not Not ResultSplit, SafeArrayBound(0)
End If
End Sub
Let me know if you have a troublesome file that does not open as expected and I'll know whether the problem is in the file or in the parser.
Last edited by Merri; May 2nd, 2010 at 02:34 AM.
Reason: Minor bug fix
-
Apr 27th, 2021, 08:05 PM
#3
Member
Re: Fast & correctly reading CSV parser (Comma-separated values)
Here's a slightly modified version that puts the results into a custom type so you can say xResult.Rows(i).Columns(j)
Usage:
Code:
Dim xResult as tCSVResult
xResult = SplitCSV(strCSVText)
'Loop through rows and column
for i = lbound(xresult.Rows) to ubound(xresult.Rows)
for j = lbound(xresult.Rows(i).Columns) to ubound(xresult.Rows(i).Columns)
next i
There's probably better ways of doing this, I just hacked it together really quickly and have barely done any testing.
Code:
Private Type tCSVRow
Columns() As String
Initialized As Boolean
End Type
Public Type tCSVResult
Rows() As tCSVRow
Initialized As Boolean
RowCount As Long
ColumnCount As Long
End Type
' returns one dimensional zero based string array in ResultSplit containing parsed CSV cells
' - ResultCols (in/out) number of columns; if positive on input the CSV data is fixed to given number of columns
' - ResultRows (out) number of rows
Public Function SplitCSV(Expression As String, Optional ColumnDelimiter As String = ",", Optional RowDelimiter As String = vbNewLine, Optional Quote As String = """") As tCSVResult
Dim CSV() As Integer, HeaderCSV(5) As Long, lngCSV As Long
' general variables that we need
Dim intColumn As Integer, intQuote As Integer, lngRow As Long, strRow As String
Dim lngExpLen As Long, lngRowLen As Long
Dim blnQuote As Boolean, lngA As Long, lngB As Long, lngC As Long, lngCount As Long, lngResults() As Long
' some dummy variables that we happen to need
Dim Compare As VbCompareMethod, SafeArrayBound(1) As Long
' length information
lngExpLen = LenB(Expression)
lngRowLen = LenB(RowDelimiter)
' validate lengths
If lngExpLen > 0 And lngRowLen > 0 Then
' column delimiter
If LenB(ColumnDelimiter) Then intColumn = AscW(ColumnDelimiter): ColumnDelimiter = Left$(ColumnDelimiter, 1) Else intColumn = 44: ColumnDelimiter = ","
' quote character
If LenB(Quote) Then intQuote = AscW(Quote): Quote = Left$(Quote, 1) Else intQuote = 34: Quote = """"
' maximum number of results
ReDim lngResults(0 To (lngExpLen \ lngRowLen))
' prepare CSV array
HeaderCSV(0) = 1
HeaderCSV(1) = 2
HeaderCSV(3) = StrPtr(Expression)
HeaderCSV(4) = Len(Expression)
' assign Expression data to the Integer array
lngCSV = ArrayPtr(CSV)
PutMem4 lngCSV, VarPtr(HeaderCSV(0))
' find first row delimiter, see if within quote or not
lngA = InStrB(1, Expression, RowDelimiter, Compare)
Do Until (lngA And 1) Or (lngA = 0)
lngA = InStrB(lngA + 1, Expression, RowDelimiter, Compare)
Loop
lngB = InStrB(1, Expression, Quote, Compare)
Do Until (lngB And 1) Or (lngB = 0)
lngB = InStrB(lngB + 1, Expression, Quote, Compare)
Loop
Do While lngA > 0
If lngA + lngRowLen <= lngB Or lngB = 0 Then
lngResults(lngCount) = lngA
lngA = InStrB(lngA + lngRowLen, Expression, RowDelimiter, Compare)
Do Until (lngA And 1) Or (lngA = 0)
lngA = InStrB(lngA + 1, Expression, RowDelimiter, Compare)
Loop
If lngCount Then
lngCount = lngCount + 1
Else
' calculate number of resulting columns if invalid number of columns
If SplitCSV.ColumnCount < 1 Then
SplitCSV.ColumnCount = 1
intColumn = AscW(ColumnDelimiter)
For lngC = 0 To (lngResults(0) - 1) \ 2
If blnQuote Then
If CSV(lngC) <> intQuote Then Else blnQuote = False
Else
Select Case CSV(lngC)
Case intQuote
blnQuote = True
Case intColumn
SplitCSV.ColumnCount = SplitCSV.ColumnCount + 1
End Select
End If
Next lngC
End If
lngCount = 1
End If
Else
lngB = InStrB(lngB + 2, Expression, Quote, Compare)
Do Until (lngB And 1) Or (lngB = 0)
lngB = InStrB(lngB + 1, Expression, Quote, Compare)
Loop
If lngB Then
lngA = InStrB(lngB + 2, Expression, RowDelimiter, Compare)
Do Until (lngA And 1) Or (lngA = 0)
lngA = InStrB(lngA + 1, Expression, RowDelimiter, Compare)
Loop
If lngA Then
lngB = InStrB(lngB + 2, Expression, Quote, Compare)
Do Until (lngB And 1) Or (lngB = 0)
lngB = InStrB(lngB + 1, Expression, Quote, Compare)
Loop
End If
End If
End If
Loop
lngResults(lngCount) = lngExpLen + 1
' number of rows
SplitCSV.RowCount = lngCount '+ 1
' string array items to return
ReDim Preserve SplitCSV.Rows(0 To SplitCSV.RowCount - 1)
' first row
lngCount = 0
strRow = LeftB$(Expression, lngResults(0) - 1)
HeaderCSV(3) = StrPtr(strRow)
lngC = 0
blnQuote = False
For lngB = 0 To (lngResults(0) - 1) \ 2
If blnQuote Then
Select Case CSV(lngB)
Case intQuote
If CSV(lngB + 1) = intQuote Then
' skip next char (quote)
lngB = lngB + 1
' add quote char
CSV(lngC) = intQuote
lngC = lngC + 1
Else
blnQuote = False
End If
Case Else
' add this char
If lngB > lngC Then CSV(lngC) = CSV(lngB)
lngC = lngC + 1
End Select
Else
Select Case CSV(lngB)
Case intQuote
blnQuote = True
Case intColumn
' add this column
If SplitCSV.Rows(0).Initialized = True Then
Else
ReDim SplitCSV.Rows(0).Columns(0 To SplitCSV.ColumnCount - 1)
SplitCSV.Rows(0).Initialized = True
End If
SplitCSV.Rows(0).Columns(lngCount) = Left$(strRow, lngC)
' max column reached?
lngCount = lngCount + 1
If lngCount >= SplitCSV.ColumnCount Then Exit For
' start filling column string buffer from start (strRow)
lngC = 0
Case Else
' add this char
If lngB > lngC Then CSV(lngC) = CSV(lngB)
lngC = lngC + 1
End Select
End If
Next lngB
' add last column item?
If lngCount < SplitCSV.ColumnCount Then
If SplitCSV.Rows(0).Initialized = True Then
Else
ReDim SplitCSV.Rows(0).Columns(0 To SplitCSV.ColumnCount - 1)
SplitCSV.Rows(0).Initialized = True
End If
SplitCSV.Rows(0).Columns(lngCount) = Left$(strRow, lngC - 1)
End If
' rows after first
For lngA = 1 To SplitCSV.RowCount - 1
' start index for columns
lngRow = lngA * SplitCSV.ColumnCount
lngCount = 0
strRow = MidB$(Expression, lngResults(lngA - 1) + lngRowLen, lngResults(lngA) - lngResults(lngA - 1) - lngRowLen)
HeaderCSV(3) = StrPtr(strRow)
lngC = 0
blnQuote = False
For lngB = 0 To (lngResults(lngA) - lngResults(lngA - 1) - lngRowLen) \ 2
If blnQuote Then
Select Case CSV(lngB)
Case intQuote
If CSV(lngB + 1) = intQuote Then
' skip next char (quote)
lngB = lngB + 1
' add quote char
CSV(lngC) = intQuote
lngC = lngC + 1
Else
blnQuote = False
End If
Case Else
' add this char
CSV(lngC) = CSV(lngB)
lngC = lngC + 1
End Select
Else
Select Case CSV(lngB)
Case intQuote
blnQuote = True
Case intColumn
' add this column
If SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Initialized = True Then
Else
ReDim SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Columns(0 To SplitCSV.ColumnCount - 1)
SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Initialized = True
End If
SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Columns(lngCount) = Left$(strRow, lngC)
' max column reached?
lngCount = lngCount + 1
If lngCount >= SplitCSV.ColumnCount Then Exit For
' start filling column string buffer from start (strRow)
lngC = 0
Case Else
' add this char
If lngB > lngC Then CSV(lngC) = CSV(lngB)
lngC = lngC + 1
End Select
End If
Next lngB
' add last column item?
If lngCount < SplitCSV.ColumnCount Then
If SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Initialized = True Then
Else
ReDim SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Columns(0 To SplitCSV.ColumnCount - 1)
SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Initialized = True
End If
SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Columns(lngCount) = Left$(strRow, lngC - 1)
End If
Next lngA
' clean up CSV array
PutMem4 lngCSV, 0
Else
SplitCSV.ColumnCount = 0
SplitCSV.RowCount = 0
' clean any possible data that exists in the passed string array (like if it is multidimensional)
If Not Not SplitCSV.Rows Then Erase SplitCSV.Rows
' mysterious IDE error fix
Debug.Assert App.hInstance
' reset to one element, one dimension
ReDim SplitCSV.Rows(0 To 0)
' custom redimension: remove the items (this duplicates the VB6 Split behavior)
SafeArrayRedim Not Not SplitCSV.Rows, SafeArrayBound(0)
End If
End Function
-
Apr 27th, 2021, 08:56 PM
#4
Re: Fast & correctly reading CSV parser (Comma-separated values)
Originally Posted by bahbahbah
Here's a slightly modified version that puts the results into a custom type so you can say xResult.Rows(i).Columns(j)
Usage:
Code:
Dim xResult as tCSVResult
xResult = SplitCSV(strCSVText)
'Loop through rows and column
for i = lbound(xresult.Rows) to ubound(xresult.Rows)
for j = lbound(xresult.Rows(i).Columns) to ubound(xresult.Rows(i).Columns)
next i
There's probably better ways of doing this, I just hacked it together really quickly and have barely done any testing.
Code:
Private Type tCSVRow
Columns() As String
Initialized As Boolean
End Type
Public Type tCSVResult
Rows() As tCSVRow
Initialized As Boolean
RowCount As Long
ColumnCount As Long
End Type
' returns one dimensional zero based string array in ResultSplit containing parsed CSV cells
' - ResultCols (in/out) number of columns; if positive on input the CSV data is fixed to given number of columns
' - ResultRows (out) number of rows
Public Function SplitCSV(Expression As String, Optional ColumnDelimiter As String = ",", Optional RowDelimiter As String = vbNewLine, Optional Quote As String = """") As tCSVResult
Dim CSV() As Integer, HeaderCSV(5) As Long, lngCSV As Long
' general variables that we need
Dim intColumn As Integer, intQuote As Integer, lngRow As Long, strRow As String
Dim lngExpLen As Long, lngRowLen As Long
Dim blnQuote As Boolean, lngA As Long, lngB As Long, lngC As Long, lngCount As Long, lngResults() As Long
' some dummy variables that we happen to need
Dim Compare As VbCompareMethod, SafeArrayBound(1) As Long
' length information
lngExpLen = LenB(Expression)
lngRowLen = LenB(RowDelimiter)
' validate lengths
If lngExpLen > 0 And lngRowLen > 0 Then
' column delimiter
If LenB(ColumnDelimiter) Then intColumn = AscW(ColumnDelimiter): ColumnDelimiter = Left$(ColumnDelimiter, 1) Else intColumn = 44: ColumnDelimiter = ","
' quote character
If LenB(Quote) Then intQuote = AscW(Quote): Quote = Left$(Quote, 1) Else intQuote = 34: Quote = """"
' maximum number of results
ReDim lngResults(0 To (lngExpLen \ lngRowLen))
' prepare CSV array
HeaderCSV(0) = 1
HeaderCSV(1) = 2
HeaderCSV(3) = StrPtr(Expression)
HeaderCSV(4) = Len(Expression)
' assign Expression data to the Integer array
lngCSV = ArrayPtr(CSV)
PutMem4 lngCSV, VarPtr(HeaderCSV(0))
' find first row delimiter, see if within quote or not
lngA = InStrB(1, Expression, RowDelimiter, Compare)
Do Until (lngA And 1) Or (lngA = 0)
lngA = InStrB(lngA + 1, Expression, RowDelimiter, Compare)
Loop
lngB = InStrB(1, Expression, Quote, Compare)
Do Until (lngB And 1) Or (lngB = 0)
lngB = InStrB(lngB + 1, Expression, Quote, Compare)
Loop
Do While lngA > 0
If lngA + lngRowLen <= lngB Or lngB = 0 Then
lngResults(lngCount) = lngA
lngA = InStrB(lngA + lngRowLen, Expression, RowDelimiter, Compare)
Do Until (lngA And 1) Or (lngA = 0)
lngA = InStrB(lngA + 1, Expression, RowDelimiter, Compare)
Loop
If lngCount Then
lngCount = lngCount + 1
Else
' calculate number of resulting columns if invalid number of columns
If SplitCSV.ColumnCount < 1 Then
SplitCSV.ColumnCount = 1
intColumn = AscW(ColumnDelimiter)
For lngC = 0 To (lngResults(0) - 1) \ 2
If blnQuote Then
If CSV(lngC) <> intQuote Then Else blnQuote = False
Else
Select Case CSV(lngC)
Case intQuote
blnQuote = True
Case intColumn
SplitCSV.ColumnCount = SplitCSV.ColumnCount + 1
End Select
End If
Next lngC
End If
lngCount = 1
End If
Else
lngB = InStrB(lngB + 2, Expression, Quote, Compare)
Do Until (lngB And 1) Or (lngB = 0)
lngB = InStrB(lngB + 1, Expression, Quote, Compare)
Loop
If lngB Then
lngA = InStrB(lngB + 2, Expression, RowDelimiter, Compare)
Do Until (lngA And 1) Or (lngA = 0)
lngA = InStrB(lngA + 1, Expression, RowDelimiter, Compare)
Loop
If lngA Then
lngB = InStrB(lngB + 2, Expression, Quote, Compare)
Do Until (lngB And 1) Or (lngB = 0)
lngB = InStrB(lngB + 1, Expression, Quote, Compare)
Loop
End If
End If
End If
Loop
lngResults(lngCount) = lngExpLen + 1
' number of rows
SplitCSV.RowCount = lngCount '+ 1
' string array items to return
ReDim Preserve SplitCSV.Rows(0 To SplitCSV.RowCount - 1)
' first row
lngCount = 0
strRow = LeftB$(Expression, lngResults(0) - 1)
HeaderCSV(3) = StrPtr(strRow)
lngC = 0
blnQuote = False
For lngB = 0 To (lngResults(0) - 1) \ 2
If blnQuote Then
Select Case CSV(lngB)
Case intQuote
If CSV(lngB + 1) = intQuote Then
' skip next char (quote)
lngB = lngB + 1
' add quote char
CSV(lngC) = intQuote
lngC = lngC + 1
Else
blnQuote = False
End If
Case Else
' add this char
If lngB > lngC Then CSV(lngC) = CSV(lngB)
lngC = lngC + 1
End Select
Else
Select Case CSV(lngB)
Case intQuote
blnQuote = True
Case intColumn
' add this column
If SplitCSV.Rows(0).Initialized = True Then
Else
ReDim SplitCSV.Rows(0).Columns(0 To SplitCSV.ColumnCount - 1)
SplitCSV.Rows(0).Initialized = True
End If
SplitCSV.Rows(0).Columns(lngCount) = Left$(strRow, lngC)
' max column reached?
lngCount = lngCount + 1
If lngCount >= SplitCSV.ColumnCount Then Exit For
' start filling column string buffer from start (strRow)
lngC = 0
Case Else
' add this char
If lngB > lngC Then CSV(lngC) = CSV(lngB)
lngC = lngC + 1
End Select
End If
Next lngB
' add last column item?
If lngCount < SplitCSV.ColumnCount Then
If SplitCSV.Rows(0).Initialized = True Then
Else
ReDim SplitCSV.Rows(0).Columns(0 To SplitCSV.ColumnCount - 1)
SplitCSV.Rows(0).Initialized = True
End If
SplitCSV.Rows(0).Columns(lngCount) = Left$(strRow, lngC - 1)
End If
' rows after first
For lngA = 1 To SplitCSV.RowCount - 1
' start index for columns
lngRow = lngA * SplitCSV.ColumnCount
lngCount = 0
strRow = MidB$(Expression, lngResults(lngA - 1) + lngRowLen, lngResults(lngA) - lngResults(lngA - 1) - lngRowLen)
HeaderCSV(3) = StrPtr(strRow)
lngC = 0
blnQuote = False
For lngB = 0 To (lngResults(lngA) - lngResults(lngA - 1) - lngRowLen) \ 2
If blnQuote Then
Select Case CSV(lngB)
Case intQuote
If CSV(lngB + 1) = intQuote Then
' skip next char (quote)
lngB = lngB + 1
' add quote char
CSV(lngC) = intQuote
lngC = lngC + 1
Else
blnQuote = False
End If
Case Else
' add this char
CSV(lngC) = CSV(lngB)
lngC = lngC + 1
End Select
Else
Select Case CSV(lngB)
Case intQuote
blnQuote = True
Case intColumn
' add this column
If SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Initialized = True Then
Else
ReDim SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Columns(0 To SplitCSV.ColumnCount - 1)
SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Initialized = True
End If
SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Columns(lngCount) = Left$(strRow, lngC)
' max column reached?
lngCount = lngCount + 1
If lngCount >= SplitCSV.ColumnCount Then Exit For
' start filling column string buffer from start (strRow)
lngC = 0
Case Else
' add this char
If lngB > lngC Then CSV(lngC) = CSV(lngB)
lngC = lngC + 1
End Select
End If
Next lngB
' add last column item?
If lngCount < SplitCSV.ColumnCount Then
If SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Initialized = True Then
Else
ReDim SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Columns(0 To SplitCSV.ColumnCount - 1)
SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Initialized = True
End If
SplitCSV.Rows(lngRow / SplitCSV.ColumnCount).Columns(lngCount) = Left$(strRow, lngC - 1)
End If
Next lngA
' clean up CSV array
PutMem4 lngCSV, 0
Else
SplitCSV.ColumnCount = 0
SplitCSV.RowCount = 0
' clean any possible data that exists in the passed string array (like if it is multidimensional)
If Not Not SplitCSV.Rows Then Erase SplitCSV.Rows
' mysterious IDE error fix
Debug.Assert App.hInstance
' reset to one element, one dimension
ReDim SplitCSV.Rows(0 To 0)
' custom redimension: remove the items (this duplicates the VB6 Split behavior)
SafeArrayRedim Not Not SplitCSV.Rows, SafeArrayBound(0)
End If
End Function
Hi bahbahbah, if you have time and interest, maybe you can compare the performance of your code and RC6.CSV when reading huge text files. These results will be very meaningful.
-
Apr 27th, 2021, 10:51 PM
#5
Member
Re: Fast & correctly reading CSV parser (Comma-separated values)
Not sure how to use the RC6 CSV module - no documentation. If you can provide a sample I can check it.
Here's some basic timings using GetTickCount
Original: 703ms
Original using types: 860ms
Original using class modules: 1100ms
RC6: 125ms
Not sure that the RC6 one is correct as I've no idea how to use it. I've implemented it like so:
In declarations:
Code:
Implements RC6.ICSVCallback
Private Function ICSVCallback_NewValue(ByVal RowNr As Long, ByVal ColNr As Long, B() As Byte, ByVal BValStartPos As Long, ByVal BValLen As Long) As Long
End Function
And the code to run it:
Code:
Dim xTemp As RC6.cCSV
Set xTemp = New RC6.cCSV
xTemp.ParseFile XXXFILENAMEXXX, Me
Besides the callback, I'm not sure there's a way to get the data? So presumably you'd have to store that yourself? I'd guess that would up the time a wee bit.
-
Apr 28th, 2021, 01:08 PM
#6
Re: Fast & correctly reading CSV parser (Comma-separated values)
Last edited by SearchingDataOnly; Apr 28th, 2021 at 01:20 PM.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|