I trying to import a text file (CSV ) , but inside text there are one or more in text that there is ;
Code:
1;XX;17.062.00;;"Outros bolos industrializados e produtos de panificação não especificados anteriormente; exceto casquinhas para sorvete e pães";;30,93
The text "Outros bolos industrializados e produtos de panificação não especificados anteriormente; exceto casquinhas para sorvete e pães" must to be only a column, but there a ; inside , how can I to do a workaround ?
Code:
Dim file_name As String
Dim fnum As Integer
Dim whole_file As String
Dim lines As Variant
Dim one_line As Variant
Dim num_rows As Long
Dim num_cols As Long
Dim the_array() As String
Dim R As Long
Dim C As Long
Dim STR_IE As String
Dim str_sql As String
'----------------------------------
'-DELETe records
'----------------------------------
gDatabase.DbExecuteSQL "BEGIN DELETE WRK_CEST_IE; COMMIT; END;"
file_name = App.Path
If Right$(file_name, 1) <> "\" Then file_name = _
file_name & "\"
file_name = file_name & "test.csv"
file_name = txtArquivo
' Load the file.
fnum = FreeFile
Open file_name For Input As fnum
whole_file = Input$(LOF(fnum), #fnum)
Close fnum
' Quebra o arquivo em Linhas
lines = Split(whole_file, vbCrLf)
' Dimensiona o array
num_rows = UBound(lines)
one_line = Split(lines(0), ";")
num_cols = UBound(one_line)
ReDim the_array(num_rows, num_cols)
STR_IE = "E"
If Len(lines(0)) > 0 Then
one_line = Split(lines(0), ";")
For C = 0 To num_cols
'the_array(0, C) = one_line(C)
Next C
End If
then you know there are extra ; somewhere at least. If the only possibility is that text field then you can manipulate the one_line array to join the segments back together. Don't forget to add back the ; that was removed by the Split function.
CSV normally uses "," but in locales that use "," as their decimal point symbol the ";" is normally substituted as the column delimiter for convenience. So I understand that you have a non-comma-CSV format file.
If you have column data with the column delimiter (here ";") in it then by common CSV practice the column value should be quoted using the text-qualifier symbol (typically the quotation mark). If the program that wrote this file failed to do that it is pretty tough for a program reading the file to guess which ";" means what.
But your sample line does use text-qualifiers, so you're ok.
You can't rely on the newbie "split-de-split" technique for parsing this data and wil have to get a little more elaborate or use something like the Jet Text IISAM.
Split-de-splits eats RAM and CU time, but with a little hacking text-delimiters can be accomodated. Example:
Code:
Private Function ParseSemicolonCSV(ByVal FileName As String) As String()
'This code assumes that the first row of the file data has the
'maximum number of columns of all rows.
Dim F As Integer
Dim Rows() As String
Dim RowMax As Long
Dim Parts() As String
Dim Part As Long
Dim ColMax As Long
Dim Data() As String
Dim Row As Long
Dim Col As Long
F = FreeFile(0)
Open FileName For Input As #F
Rows = Split(Input$(LOF(F), #F), vbNewLine)
Close #F
RowMax = UBound(Rows)
'Deal with the possible "empty row" at the end of the file:
If Len(Rows(RowMax)) < 1 Then
RowMax = RowMax - 1
End If
'Examine 1st Row of text to determine ColMax:
Parts = Split(Rows(0), """")
For Part = 0 To UBound(Parts) Step 2
Parts(Part) = Replace$(Parts(Part), ";", vbNullChar)
Next
Parts = Split(Join$(Parts, ""), vbNullChar)
ColMax = UBound(Parts)
ReDim Data(RowMax, ColMax)
'Now process each Row and copy into Data:
For Row = 0 To RowMax
Parts = Split(Rows(Row), """")
For Part = 0 To UBound(Parts) Step 2
Parts(Part) = Replace$(Parts(Part), ";", vbNullChar)
Next
Parts = Split(Join$(Parts, ""), vbNullChar)
For Col = 0 To ColMax
Data(Row, Col) = Parts(Col)
Next
Next
ParseSemicolonCSV = Data
End Function
Another approach would be to make a single pass through the line copying each segment when a semi-colon is encountered.
If a quote character is encountered set a boolean flag variable that overrides ending your segment until another quote character is encountered at which point you turn your flag off.
Code:
Option Explicit
Const Quote As String * 1 = """"
Dim TestInput As String
Private Sub UserForm_Initialize()
TestInput = "12.125;3;54.87;'Wigglesworth; by jove';26.30;'Fortunately not.';78"
TestInput = Replace(TestInput, "'", Quote)
End Sub
Private Sub CommandButton1_Click()
Dim Cells() As String
Cells = ParseColumns(TestInput)
'Display test results
Dim Item As String
For Each Item In Cells
List1.AddItem Item
Next Item
End Sub
Private Function ParseColumns(Line As String) As String()
Dim Cells() As String
Dim Seg As String, C As String * 1
Dim x As Integer, i As Integer
Dim isText As Boolean
' Make sure text ends with a semicolon
If Right$(Line, 1) <> ";" Then Line = Line & ";"
i = -1
isText = False
For x = 1 To Len(Line)
C = Mid$(Line, x, 1)
If C = Quote Then
isText = Not (isText) 'Toggle Flag
ElseIf C = ";" Then
If isText Then
Seg = Seg & C
Else
i = i + 1
ReDim Preserve Cells(i) As String
Cells(i) = Seg
Seg = ""
End If
Else
Seg = Seg & C
End If
Next x
ParseColumns = Cells
End Function
Last edited by Gruff; Jul 3rd, 2016 at 02:33 PM.
Burn the land and boil the sea
You can't take the sky from me
It can be quicker to parse in linear fashion but more work than "split-de-splits." I'd think InStr() would be quicker than most other alternatives such as extracting character-by-character.
As long as the file(s) are not large the costs of keeping two copies in RAM during the process and the extra CPU you burn don't matter a lot though. Reading and parsing a line at a time would at least avoid the extra RAM required.
In the end unless you do have large files you just want the problem solved with accuracy and reliability though. Performance doesn't matter... until it does.