Page 1 of 2 12 LastLast
Results 1 to 40 of 53

Thread: [RESOLVED] VB6 & Reading CSV file with data including spaces

  1. #1

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Resolved [RESOLVED] VB6 & Reading CSV file with data including spaces

    Greetings

    I am reading csv file with below code;
    SAMPLE INPUT FILE:

    AAAA,43438894322,USA,BASIC
    BBBB,73738 9393 939,INDIA,BASIC
    CCCC,00293993020,UK,PREMIUM

    all is well until i get any file with data having space like
    01 0002 020929
    65650002 543

    Code:
    connCSV.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" _
         & Path & ";Extended Properties='text;HDR=" & hasHeader & ";FMT=Delimited'"
        
    rsCSV.Open "Select * From " & fileName, _
        connCSV, adOpenStatic, adLockReadOnly, adCmdText
              Do While Not rsCSV.EOF()
                     cntRows = cntRows + 1
                         With MSFlexGrid1
                              currRow = .Rows - 1
                              
                             .TextMatrix(currRow, 1) = rsCSV.Fields(0).Value
                             .TextMatrix(currRow, 2) = rsCSV.Fields(1).Value
                             .TextMatrix(currRow, 3) = Format(rsCSV.Fields(2).Value, "####.000")
                             .TextMatrix(currRow, 4) = PadRight(Left(rsCSV.Fields(3).Value, 8), "XXXXXXXX")
                             If Len(Trim(rsCSV.Fields(4).Value)) = 0 Then .TextMatrix(currRow, 5) = "" Else .TextMatrix(currRow, 5) = rsCSV.Fields(4).Value
                             .TextMatrix(currRow, 6) = rsCSV.Fields(5).Value
                             .TextMatrix(currRow, 7) = rsCSV.Fields(6).Value
    
                             .Rows = .Rows + 1
                         End With
                 rsCSV.MoveNext
                 Loop
    Your help will be highly appreciated,

    Thanks

    Sam

  2. #2
    Frenzied Member
    Join Date
    Jan 2009
    Location
    Watch Window(Shift+f9)
    Posts
    1,879

    Smile Re: VB6 & Reading CSV file with data including spaces

    all is well until i get any file with data having space like
    01 0002 020929
    65650002 543
    so you don't want space .use trim() function .it will remove trailing space from both side of rows value .
    Code:
    connCSV.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" _
         & Path & ";Extended Properties='text;HDR=" & hasHeader & ";FMT=Delimited'"
        
    rsCSV.Open "Select * From " & fileName, _
        connCSV, adOpenStatic, adLockReadOnly, adCmdText
              Do While Not rsCSV.EOF()
                     cntRows = cntRows + 1
                         With MSFlexGrid1
                              currRow = .Rows - 1
                              
                             .TextMatrix(currRow, 1) = trim(rsCSV.Fields(0).value)                         .TextMatrix(currRow, 2) = trim(rsCSV.Fields(1).value)
                             .TextMatrix(currRow, 3) = trim(Format(rsCSV.Fields(2).Value, "####.000"))
                             .TextMatrix(currRow, 4) = trim(PadRight(Left(rsCSV.Fields(3).Value, 8), "XXXXXXXX"))
                             If Len(Trim(rsCSV.Fields(4).Value)) = 0 Then .TextMatrix(currRow, 5) = "" Else .TextMatrix(currRow, 5) = rsCSV.Fields(4).Value
                             .TextMatrix(currRow, 6) = trim(rsCSV.Fields(5).Value)
                             .TextMatrix(currRow, 7) = trim(rsCSV.Fields(6).Value)
    
                             .Rows = .Rows + 1
                         End With
                 rsCSV.MoveNext
                 Loop

  3. #3

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: VB6 & Reading CSV file with data including spaces

    Thanks Firoz for reply, but trim won't work as record line will be like this
    AAAA,10 438 89432 2,USA,BASIC
    and rsCSV.Fields(0).value is returning NULL for that reason
    where as I need this as result
    AAAA
    10 438 89432 2
    USA
    BASIC


    if the line is
    AAAA,10438894322,USA,BASIC
    it works well.

    Best Regards,

    Sam

  4. #4
    Frenzied Member
    Join Date
    Jan 2009
    Location
    Watch Window(Shift+f9)
    Posts
    1,879

    Smile Re: VB6 & Reading CSV file with data including spaces

    and rsCSV.Fields(0).value is returning NULL for that reason
    where as I need this as result
    you Need to Handle Null .try the following way .
    Code:
    connCSV.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" _
         & Path & ";Extended Properties='text;HDR=" & hasHeader & ";FMT=Delimited'"
        
    rsCSV.Open "Select * From " & fileName, _
        connCSV, adOpenStatic, adLockReadOnly, adCmdText
              Do While Not rsCSV.EOF()
                     cntRows = cntRows + 1
                         With MSFlexGrid1
                              currRow = .Rows - 1
                              
                             .TextMatrix(currRow, 1) = iif(IsNull(rsCSV.Fields(0).Value,"",rsCSV.Fields(0).Value)
                             .TextMatrix(currRow, 2) = iif(IsNull(rsCSV.Fields(1).Value,"",rsCSV.Fields(1).Value)
                             .TextMatrix(currRow, 3) = Format(rsCSV.Fields(2).Value, "####.000")
                             .TextMatrix(currRow, 4) = PadRight(Left(rsCSV.Fields(3).Value, 8), "XXXXXXXX")
                             If Len(Trim(rsCSV.Fields(4).Value)) = 0 Then .TextMatrix(currRow, 5) = "" Else .TextMatrix(currRow, 5) = rsCSV.Fields(4).Value
                             .TextMatrix(currRow, 6) = iif(isnull(rsCSV.Fields(5).Value,"",rsCSV.Fields(5).Value)
                             .TextMatrix(currRow, 7) = iif(IsNull(rsCSV.Fields(6).Value,"",rsCSV.Fields(6).Value)
    
                             .Rows = .Rows + 1
                         End With
                 rsCSV.MoveNext
                 Loop

  5. #5

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: VB6 & Reading CSV file with data including spaces

    Thanks Firoz this will take care of the NULL no doubt, but the problem is it should not fetch it as NULL at first place because if you see there is already data present.
    It should return me 10 438 89432 2 not NULL.

    I guess you are not getting the problem, its problem with .csv file reading

    Best Regards,

    Sam

  6. #6
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: VB6 & Reading CSV file with data including spaces

    There is no need to trim anything, the Jet Text IISAM takes care of that just fine.

    Here is my theory on your problem:
    You are letting too many things default, in this case specifically you are letting the IISAM scan rows to try to determine the data type of each column. Depending on your registry settings MaxScanRows is either 0 (scan all rows) or 20 which is often set if you have MS Access or another Office program installed.

    When it scans your second column it thinks "Hmm, the majority of the scanned rows appear to have a large numeric value so we'll call it that."

    However when it hits those rows where there are spaces embedded in column 2 the value is invalid, so Null is stored.
    I decided to throw together a test case from an old program I already had on hand. Sure enough, providing schema information to tell the IISAM that column 2 contains text seems to do the trick.

    Ready-to-run demo project with sample data attached.
    Attached Files Attached Files

  7. #7
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: VB6 & Reading CSV file with data including spaces

    BTW:

    Do While Not EOF is sort of silly and clunky to read, don't you think? Why not Do Until EOF instead? Double negatives are both bad grammar and bad code.

    Also, a lot of your column formatting could be done via the Jet Expression Service instead of tinkering with every cell in the grid via TextMatrix, which gets darned slow for long data sets. The demo code above contains a simple example, see the code comments.

  8. #8

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: VB6 & Reading CSV file with data including spaces

    Thanks a lot dilettante, the code looks really something nice and cool.
    Will go through it, as you are creating .ini file at run time so i guess it will help in my case where the input file will have a header sometime and sometime not.

    how it deals where there is NO header to the input file?

    Best Regards,

    Sam

  9. #9
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: VB6 & Reading CSV file with data including spaces

    In the no-header case it uses ColNameHeader=False in the schema.ini file.

    As noted in the comments, you could also use a fixed, premade schema.ini file with sections for two formats, one with and one without the header row. Then in your code you could rename input data files to and later back from the fixed names in the schema.ini sections.

    The stuff in FormatGrid was something I already had. You may not want that and it has nothing to do with the "spaces gets me Null" problem.

  10. #10

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: VB6 & Reading CSV file with data including spaces

    Great! I will work on it. Thanks again dilettante.

    Best Regards,

    Sam

  11. #11

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: VB6 & Reading CSV file with data including spaces

    Getting confused here dilettante;
    See this is my file format but the problem is either I don't know what the user will put header name as or sometimes there will be no header.

    but the number of columns will be fixed and the data also (field names are not sure to be same every time)

    To Account No - [text]
    To Name - [text]
    Amount - [decimal value]
    From Account No - [text]
    From Name - [text]
    Description - [text]

    can you please help how will be my schema.ini look like for above?

    Also in your connection open statement you are writing column names but in our case we don't know the column name if they exist and sometime there will be no header, so how we call this statement; if with SELECT * then how we format particular column like you did?

    sorry to trouble you with so many questions

    your help is highly appreciated,

    Best Regards,

    Sam

  12. #12

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: VB6 & Reading CSV file with data including spaces

    Ok, I think I am getting few things here, we can configure the column names as we required and it will treat them respectively according to their placement .. if I am not wrong?

    And NameHeader is nothing but to tell if their is Header or Not.

    Let me get into it and see how far I go

  13. #13
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: VB6 & Reading CSV file with data including spaces

    Yes, you give the columns names by position. This overrides the header row if you have one.

    Sounds good. Too bad there isn't either always or never a header row!

  14. #14

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: VB6 & Reading CSV file with data including spaces

    ok, thanks dilettante now I have successfully got over the with header thing.
    Now moving on to no header will need you help.
    1. how do we write here the schema.ini ? with "ColNameHeader=False" will we skip the col1=text, col1=text part all together?
    2. if we skip these (col1=text,col2=text) lines from schema.ini then how we tell JET that we have here text values so please don't try to read it as numbers?
    3. Also how we do the format part for header less data; like you did "Format$([Sold Count], '#') AS [Sold Count]

    Best Regards,

    Sam

  15. #15
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: VB6 & Reading CSV file with data including spaces

    If you look you'll see the header/no header issue was already handled, using a Boolean argument passed to that subroutine. In other words no, you do not omit the column definitions from the schema file.

    As far as formatting goes you can use almost any VBA expression and many VBA built-in functions like Format$(), just be sure to quote strings with apostrophes or else double-up quotes to escape them:
    Code:
    "Format$([Sold Count], '#') AS [Sold Count] FROM ["
    can be replaced by:
    Code:
    "Format$([Sold Count], ""#"") AS [Sold Count] FROM ["
    if you really want to.

  16. #16
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: VB6 & Reading CSV file with data including spaces

    Your PadRight() might be the tricky part depending on what it must do.

  17. #17

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: VB6 & Reading CSV file with data including spaces

    Sorry I guess I was not able ask my doubt properly.
    Actually I meant; i was able to implement it with my need for files having header.
    But now what should I be doing when the user input non header file; how will my schema.ini file look ex:
    if the first one was like this
    Code:
        FileNum = FreeFile(0)
        Open DataPath & "\schema.ini" For Output As #FileNum
        Print #FileNum, "["; fileName; "]"
        Print #FileNum, "Format=CSVDelimited"
        Print #FileNum, "TextDelimiter=none"
        Print #FileNum, "ColNameHeader="; NameHeader
        Print #FileNum, "CharacterSet=ANSI"
        Print #FileNum, "MaxScanRows=1" '0 means scan all rows, a waste of time for us.
        Print #FileNum, "Col1=""BNFAC"" Text"
        Print #FileNum, "Col2=BNFName"" Text"
        Print #FileNum, "Col3=Amount"" Double"
        Print #FileNum, "Col4=""Code"" Text"
        Print #FileNum, "Col5=""OrdCustAC"" Text"
        Print #FileNum, "Col6=""OrdCustName"" Text"
        Print #FileNum, "Col7=""Description"" Text"
        
        Close #FileNum
    then will the No Header be like this ?
    Code:
        FileNum = FreeFile(0)
        Open DataPath & "\schema.ini" For Output As #FileNum
        Print #FileNum, "["; fileName; "]"
        Print #FileNum, "Format=CSVDelimited"
        Print #FileNum, "TextDelimiter=none"
        Print #FileNum, "ColNameHeader="; NameHeader
        Print #FileNum, "CharacterSet=ANSI"
        Print #FileNum, "MaxScanRows=1" '0 means scan all rows, a waste of time for us.
        
        Close #FileNum
    if so, then how we write this statement without knowing the columns?

    Code:
    .Open "SELECT [4 Letters], [Some Code], Country, Edition, " _
                & "Format$([Sold Count], '#') AS [Sold Count] FROM [" & FileName & "]", _
                  "Provider=Microsoft.Jet.OLEDB.4.0;Data Source='" _
                & DataPath _
                & "';Extended Properties='Text'", _
                  adOpenStatic, adLockReadOnly, adCmdText
    and also how does JET will know which columns have text data so not to return NULL in case of data mismatch?

  18. #18

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: VB6 & Reading CSV file with data including spaces

    when I try same schema.ini with NameHeader=False
    same open statement; I get this error No value given for one or more required parameters.

  19. #19
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: VB6 & Reading CSV file with data including spaces

    See this?
    Code:
    Print #FileNum, "ColNameHeader="; NameHeader
    NameHeader there is a Boolean variable. True prints "True" and False prints "False" there.

    I have no idea why you are removing the column defnitions... that's what this entre excercise is about!

    You need them whether the CSV file has a header row or not.


    Go back to the ZIP archive I posted and unZIP it. Edit the CSV file in Notepad, delete the header row, save/exit. Open the Project, change:
    Code:
    LoadGrid App.Path, "somedata.csv", True
    to:
    Code:
    LoadGrid App.Path, "somedata.csv", False
    Then run this "no header" version.

  20. #20
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: VB6 & Reading CSV file with data including spaces

    Hmm, just in case your locale is not an English language one, use:
    Code:
    Print #FileNum, "ColNameHeader="; IIf(NameHeader, "True", "False")

  21. #21

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: VB6 & Reading CSV file with data including spaces

    Ok .. Understood. I checked with your code it works fine, thanks dilettante.
    I tried same thing my code its not working; somewhere something is missing l feel I am spending more time on this so little bit blank sometimes

    Anyways really appreciate your support and time spent;

    Best Regards,

    Sam

  22. #22

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: VB6 & Reading CSV file with data including spaces

    Hello dilettante, I tried it with my code; but a weird thing happens
    when I run the code through your application project file it gives me the correct results, where as when i copy paste exact same code from my Project file it doesn't work.

    Any idea why would this happen? 2 things are happening wrong when I move the code to my Project
    1. The Number with space is returned as NULL
    2. The open statement doesn't work for Header=False

    The Same code works well from your Sent Project file

    had my head scratching whole night to figure this out may be I am working too hard

    Best Regards,

    Sam

  23. #23

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: VB6 & Reading CSV file with data including spaces

    Oh My God! Finally I guess I got hold of the issue
    The to be Imported file needs to be in the App.path ???? that's real tricky part ooops!!!

    hmmmmmmmmmm

  24. #24
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: VB6 & Reading CSV file with data including spaces

    Quote Originally Posted by sam4help View Post
    The to be Imported file needs to be in the App.path ???? that's real tricky part ooops!!!
    I'm not sure how you came to this conclusion.

    True, my example has the data file in App.Path. But that isn't any kind of requirement. The schema.ini file has to be in the same directory as the data files it describes though.


    Perhaps you ought to slow down, read through the example, and figure out what it is doing.

    Notice how there are subroutine arguments for the data directory, the file, and whether or not the file has a header row? It works for any file, in any directory, with or without a header row. The file just needs to have the record layout that the program expects and be ANSI encoded text.

  25. #25

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: VB6 & Reading CSV file with data including spaces

    ok, anywayz my application was allowing user to select the input file through dialog box and it was not working as I had kept the schema.ini file in app.path and the input file path varied due to selection; so now I am copying the selected input file into the app.path and then process it.

    All works well, let me take this opportunity to thank you million times, it has helped a lot.

    Best Regards,

    Sam

  26. #26
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: [RESOLVED] VB6 & Reading CSV file with data including spaces

    Good to hear.

    Note that in many cases it can be "cheaper" to put schema.ini next to the data, since it is small and your data may be large.

    However more importantly, App.Path should never be used for data except data you only read from. If your program is ever installed App.Path will be within Program Files which users should not have write access to, and this is enforced in modern versions of Windows.

    At best your program might get a snicker and sneer from Windows and the files virtualized, even worse your program may just blow up on a security violation. At the worst one file will be in App.Path and the other placed in the VirtualStore, and then your program will be back to the mess you were in originally.

    Using App.Path is really just for creating test programs, not real ones.

    If there is any chance that your users will select files that are in a read-only location you may be far better off copying the file and writing the schema.ini file to some temporary location within the user's profile.


    An example of a way to do this might be:
    Code:
    Dim TempPath As String
    Const ssfLOCALAPPDATA = &H1C
    With CreateObject("Shell.Application").NameSpace(ssfLOCALAPPDATA).Self
        TempPath = .Path & "\" & App.Title & "Temp"
    End With
    MkDir TempPath
    Then after you are done Kill the files and finally RmDir TempPath.

  27. #27

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: [RESOLVED] VB6 & Reading CSV file with data including spaces

    ok, thanks. point noted.

    Best Regard,

    Sam

  28. #28

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: [RESOLVED] VB6 & Reading CSV file with data including spaces

    Hello dilettante, again here.

    Now, I came across a big file that's more than may be 40k records and it may increase more than that also. Now the problem is msflexgrid started not responding so I had to remove that step itself.

    My basic requirement is open a csv or a excel file and clean the data of each field with allowing only these below characters and get that resultant data into recordset. Which later has to be later exported into XML File.

    Validity Checks 1.
    Code:
     Alpha Numeric = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ()./?'-+-: "
    Validity Check 2.
    Code field should be 8 Digit if not then pad with X

    Validity Check 3.
    Amount has to be ####.000 format

    Currently even if I remove the msflexgrid I the process of cleaning is taking around 2 hours to go through each record and clean and validate the data. I am using this function and loop through the recordset.

    Code:
    Public Function CleanTheString(TheString)
    Dim newString As String
    newString = Replace(TheString, "&", "AND")
    newString = Replace(newString, "/", "-")
    TheString = newString
      'msgbox thestring
          strAlphaNumeric = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ()./?'-+-: " 'Used to check for special characters.
          For i = 1 To Len(TheString)
              strChar = Mid(TheString, i, 1)
              If InStr(strAlphaNumeric, strChar) Then
                  CleanedString = CleanedString & strChar
              End If
          Next
          'msgbox cleanedstring
          CleanTheString = CleanedString
    
      End Function

    So, while getting the recordset I need to return the recordset with this validity and format implemented. What would your experience and knowledge suggest and how to do it better way,

    SAMPLE DATA

    AC Name Amount Code Employer Employer
    0342-013 ABCDE 281.900 CDUSOMRSSX 65069516 Country International
    05 00 11 XXHJH 228.660 DFUSOM 65000165 International L.L.C
    03032001 YHSUJ 342.950 BMUSOMRX 65000616 ACG International

    Thanking you in advance,

    Best Regards,

    Sam
    Last edited by sam4help; Mar 19th, 2012 at 09:28 AM.

  29. #29

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: [RESOLVED] VB6 & Reading CSV file with data including spaces

    ok, Padding; Amount Format and If Case I have achieved

    Format$([Amount], '####.000') AS [Amount], Left$([tCode] & String(8, 'X'), 8) AS [tCode]

    "IIf(Left$([tCode] & String(8, 'X'), 8) = 'ABCDEXXX', 'Internal', 'External') AS [TransType]

    Now the challenge is

    2. Clean field data using same sql statement
    Last edited by sam4help; Mar 19th, 2012 at 02:56 PM.

  30. #30
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: [RESOLVED] VB6 & Reading CSV file with data including spaces

    I think you have gone beyond the simple field data formatting that can be done in Jet SQL. If these requirements had been mentioned earlier I probably would never have set you down this path, so I'm sorry if this has eaten too much time.

    Once we begin to consider large sets of data and the need to display the data directly in a grid control goes away better alternatives present themselves.

    While Jet's VBA Expression Service can be convenient, there are limits to its flexibility as well as a performance price over compiled logic. This character filtering requirement is an example of something you won't find an easy way to do just using Jet SQL.


    This takes us right back to using VB I/O statements and handling all of the character filtering, field parsing, and data formatting in code. While that may not be an issue for CSV Text input, it becomes problematic for Excel as a data source.


    A possible alternative might be to continue using the Jet Text and Excel IISAMs for reading/parsing the input using no SQL at all, i.e. just the "table name" (text file name or Excel Worksheet range name) with adCmdTable requests.

    To optimize that I'd try using a Recordset with CursorLocation = adUseServer, CursorType = adOpenForwardOnly, and CacheSize = 10 (or even larger, like 50). And of course open read-only.

    After opening this Recordset iterate through record by record (MoveNext) until EOF, taking the data from the Fields collection to operate on each field as required using VB6 logic, and outputting the filtered/formatted results to "wherever."

    This "wherever" might be something like a new text file, a Jet MDB, or your finished XML. The only reason to consider using an MDB for the reformatted data would be cases where you need to make multiple passes over the reformatted data where indexing or other operations could be of benefit.

  31. #31

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: [RESOLVED] VB6 & Reading CSV file with data including spaces

    Thanks dilettante for your kind reply; ok, how about only text csv files, i have found an easy and fast alternative where I am able to convert these excel files as csv so, now I have only csv files to be taken care? how can this be done in this case .. any help?

    Looping through its taking very much time, so just looking at other optimized alternatives

    or as an alternative may get involve mdb file for getting better optimized performance ? where we first export csv file directly to mdb and through query or something we do this processing on sql level ?

    Best Regards,

    Sam
    Last edited by sam4help; Mar 19th, 2012 at 04:18 PM.

  32. #32
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: [RESOLVED] VB6 & Reading CSV file with data including spaces

    I can't think of any way SQL will help you filter out "bad" characters in your fields. So this means using an intermediate MDB won't help either.

    If we only have CSV text files to worry about the fastest way is to bypass ADO, Jet, etc. and go back to file I/O.


    The best way to speed this up is to read the file in one gulp and process it. This falls apart pretty fast though for the general case where your files can grow into megabyte sizes.

    Your data records look pretty simple. They seem to be fairly clean CSV format records that Input # can handle.

    Can you test the speed of a small program reading one of your big files to see if this might be fast enough? Example of what I have in mind:
    Code:
    Sub Test(ByVal HeaderPresent As Boolean)
        Dim ToAccountNo As String
        Dim ToName As String
        Dim Amount As Currency 'Or Single, or Double
        Dim FromAccountNo As String
        Dim FromName As String
        Dim Description As String
        Dim F as Integer
    
        F = FreeFile(0)
        Open "somedata.csv" For Input As #F Len=32767 'Tell VB to use the largest buffer it can.
        If HeaderPresent Then Line Input #F, Description 'Just skip header if present.
        Do Until EOF(F)
            Input #F, ToAccountNo, ToName, Amount, FromAccountNo, FromName, Description
            'In a real program we'd process data here.
        Loop
        Close #F
    End Sub
    If this is still very slow then it gets tougher.

    Part of what makes it slow is the time spent waiting for I/O. The code above tries to reduce that a little by specifying the largest read buffer that VB6 supports (32767) for its text I/O operations.

    To get any faster you'd have to fall back on techniques that read in larger chunks and do efficient field and record parsing. Note that Split() is not particularly efficient at this for very large chunks of data.

    But before getting complicated test something like the sample code above.

    It may easily be that much more time is being eaten up by your field data processing than the I/O. if so you'll need to optimize that.

    Filtering out your "bad" characters may get quite time consuming, and that will probably need optimization no matter how you read the data.

  33. #33
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: [RESOLVED] VB6 & Reading CSV file with data including spaces

    Here's a prebuilt demo that might possibly help.

    It consists of a class for reading and "cleaning" CSV data as you suggest, and a demo Project that just outputs cleaned CSV data. There is another project in the attached archive that makes a "dirty" CSV file to clean.

    For a real program you'd use the Fields() array it returns for each GetRecord() call and reformat these values, convert them to Single, Integer, etc. as needed and then go ahead and use them.

    Hope this helps.

    Others might have further optimizations of the CSV cleaning and parsing process to suggest as well. This seems to run reasonably fast for me though.
    Last edited by dilettante; Mar 22nd, 2012 at 09:32 PM. Reason: removed flawed attachment

  34. #34

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: [RESOLVED] VB6 & Reading CSV file with data including spaces

    Thanks a ton dilettante I really appreciate your kind help,
    It should surely help, I will go through it and see how to implement.

    Best Regards,

    Sam

  35. #35

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: [RESOLVED] VB6 & Reading CSV file with data including spaces

    Hello dilettante,

    I need to save a recordset into a csv file. What would be the fastest way of doing so, as I am exporting a Excel file into csv and then it will be cleaned through the csv cleaner. This step will help me skip the looping through each record .. Any help?

    I came across this post http://www.vbforums.com/showthread.php?t=481705 but, saw some comments from you down under, as by now you know my situation very well, will this be a good way for me or you feel something better can be done ?
    Last edited by sam4help; Mar 21st, 2012 at 02:48 PM.

  36. #36
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: [RESOLVED] VB6 & Reading CSV file with data including spaces

    The comments there only apply in cases where you need more control over the formatting of the output CSV file. For example you might need to quote String values that might contain commas, or you might need to format Date values in a specific way, avoid scientific notation for "real" numbers (i.e. never 1.1E-2 for 0.011), and other things like that.


    But in your case you don't have a Recordset anyway, do you? What form of "CSV cleaning" did you end up using?

  37. #37
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: [RESOLVED] VB6 & Reading CSV file with data including spaces

    I'm not sure if this helps, but take a look at this attachment.

    It is similar to the prior one, but attempts to do some field formatting and write records as it goes using a faster writing technique. I'm not convinced it is hugely faster than the previous example though.

    Like the previous one, it loops reading/cleaning an input CSV file and writes a cleaned output CSV file. Here I have added some trivial reformatting of two of the fields.


    However I still think I'm missing the point you were trying to make when you said:
    This step will help me skip the looping through each record
    Last edited by dilettante; Mar 22nd, 2012 at 09:33 PM. Reason: removed flawed attachment

  38. #38

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: [RESOLVED] VB6 & Reading CSV file with data including spaces

    Thanks for the reply dilettante,

    ok, here it goes. What I am doing is I am doing few formatting which are possible through first ado sql statement like

    Code:
    rsCSVData.Open "SELECT [BNFAC],[BNFName],Format$([Amount], '####.000') AS [Amount], Left$([tCode] & String(8, 'X'),  8) AS [tCode]," _
            & "[OrdCustAC],[OrdCustName],[Description]," _
                & "IIf(Left$([tCode] & String(8, 'X'),  8) = '" & lblIntExtCriteria.caption & "', 'Internal', IIf(Format$([Amount], '####.000')>Format$(20000.000, '####.000'),'Others','External'))  AS [Type] " _
                    & "from [" & FileName & "] WHERE [Amount] > 0", connCSV, adOpenStatic, adLockReadOnly, adCmdText
    Now, with this I have recordset with formatting and only required records which have to be considered; so this is the recordset I am talking about. Now I need to save this and then this saved new csv file I will clean up with your csv cleaner and then export into a xml.

    Hope I am clear enough this time, hence the request for the help on saving of recordset into csv file.

    Code:
    But in your case you don't have a Recordset anyway, do you? What form of "CSV cleaning" did you end up using?
    I have not used the csv cleaning yet, once i get the above recordset saved i will clean the new saved csv.

    Best Regards,

    Sam
    Last edited by sam4help; Mar 21st, 2012 at 11:10 PM.

  39. #39
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: [RESOLVED] VB6 & Reading CSV file with data including spaces

    I think we've managed to get a bit tangled up here.

    If you are going to need to do the "CSV cleaning" step then there is no benefit in processing the CSV data using SQL, it will only make the process slower. Plus you've added another whole expensive step.

    You'd be reading the data using SQL (and doing some reformatting), then writing it back out. Then reading the new CSV data to clean it, and writing it back out again. Worse yet it sounds like you'll be reading the reformatted and cleaned CSV data a third time to reformat and write it as XML.


    It would probably be much faster to do all of this in one pass: Use the CleanCSVReader.cls to read the original CSV and clean it, and as it returns back records (in the Fields array) do any required reformatting right there, and write this record as XML.


    If you really wanted to use Jet SQL to do the reformatting you'd just use a SELECT INTO statement that creates the new CSV file. There is no need to get a Recordset or fiddle with it at all. This will be faster than trying to get a Recordset and extract and write its contents, but that will still be slower than just doing this in one pass.


    So there are at least two reasons here not to use SQL: your need to "clean" out invalid characters and your need to produce XML as the result.

  40. #40

    Thread Starter
    Lively Member
    Join Date
    Mar 2012
    Posts
    113

    Re: [RESOLVED] VB6 & Reading CSV file with data including spaces

    hmm understandable, but the thing is i will need to use this for these reason
    1. I need to get additional columns based on the few adjoining column data like
    Code:
    IIf(Left$([tCode] & String(8, 'X'),  8) = '" & lblIntExtCriteria.caption & "', 'Internal', IIf(Format$([Amount], '####.000')>Format$(20000.000, '####.000'),'Others','External'))  AS [Type]
    which are not there in original csv file.

    2. Only get the criteria based records to be cleaned instead of entire first csv file which will help reduce burden on the csv cleaner

    Hope I am able to explain..

    Code:
    If you really wanted to use Jet SQL to do the reformatting you'd just use a SELECT INTO statement that creates the new CSV file. There is no need to get a Recordset or fiddle with it at all. This will be faster than trying to get a Recordset and extract and write its contents, but that will still be slower than just doing this in one pass
    How can above be done any example will really help.. like how to select into csv?


    Best regards,

    Sam
    Last edited by sam4help; Mar 22nd, 2012 at 04:03 AM.

Page 1 of 2 12 LastLast

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width