Results 1 to 3 of 3

Thread: [VB6] An ADO "PutString" function

  1. #1

    Thread Starter
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Post [VB6] An ADO "PutString" function

    Normally we can use tools like Jet's Text IISAM to import delimited text. But sometimes our delimited text might not be in a file. Perhaps we received it from a web service or a TCP connection or something, and we don't want to take the step of writing the data to disk just to turn around and import it into our database.

    The ADO Recordset has a GetString method that can be used to convert its contents to a delimited text String value fairly easily. If only we had an inverse function, a sort of PutString we could used?


    PutString

    Here is a function that does just that. It takes care of parsing the delimited columns and rows and posts these to a database table using an append-only cursor Recordset.

    All of this seems pretty well optimized, though with effort you might squeeze out another millisecond or two. The commonly advocated "split the splits" approach is far slower than this logic:

    Code:
    Private Function PutString( _
        ByRef StringData As String, _
        ByVal Connection As ADODB.Connection, _
        ByVal TableName As String, _
        ByVal ColumnIds As Variant, _
        Optional ByVal ColumnDelimiter As String = vbTab, _
        Optional ByVal RowDelimiter As String = vbCr, _
        Optional ByVal NullExpr As Variant = vbNullString) As Long
        'A sort of "inverse analog" of the ADO Recordset's GetString() method.
        '
        'Returns count of rows added.
    
        Dim SaveCursorLocation As CursorLocationEnum
        Dim RS As ADODB.Recordset
        Dim ColumnStart As Long
        Dim ColumnLength As Long
        Dim ColumnValues() As Variant
        Dim Pos As Long
        Dim NewPos As Long
        Dim RowLimit As Long
        Dim I As Long
        Dim AtRowEnd As Boolean
    
        If (VarType(ColumnIds) And vbArray) = 0 Then Err.Raise 5 'Invalid procedure call or argument.
    
        SaveCursorLocation = Connection.CursorLocation
        Connection.CursorLocation = adUseServer 'Required to create this fast-append Recordset:
        With New ADODB.Command
            Set .ActiveConnection = Connection
            .CommandType = adCmdTable
            .CommandText = TableName
            .Properties![Append-Only Rowset] = True
            .Properties![Own Changes Visible] = False       'Doesn't matter when using exclusive access.
            .Properties![Others' Changes Visible] = False   'Doesn't matter when using exclusive access.
            Set RS = .Execute()
        End With
        Connection.CursorLocation = SaveCursorLocation
    
        ReDim ColumnValues(UBound(ColumnIds))
        Pos = 1
        Do
            RowLimit = InStr(Pos, StringData, RowDelimiter)
            If RowLimit = 0 Then RowLimit = Len(StringData) + 1
            I = 0
            AtRowEnd = False
            Do
                ColumnStart = Pos
                NewPos = InStr(Pos, StringData, ColumnDelimiter)
                If NewPos = 0 Or NewPos > RowLimit Then
                    Pos = InStr(Pos, StringData, RowDelimiter)
                    ColumnLength = RowLimit - ColumnStart
                    If Pos <> 0 Then
                        Pos = Pos + Len(RowDelimiter)
                        'Auto-handle CrLf when RowDelimiter is vbCr.  GetString()
                        'itself defaults to vbCr as the RowDelimiter.  Some software
                        'strangely enough will use a mix of vbCr and vbCrLf:
                        If RowDelimiter = vbCr Then
                            If Mid$(StringData, Pos, 1) = vbLf Then Pos = Pos + 1
                        End If
                    End If
                    AtRowEnd = True
                Else
                    Pos = NewPos
                    ColumnLength = Pos - ColumnStart
                    Pos = Pos + Len(ColumnDelimiter)
                End If
                ColumnValues(I) = Trim$(Mid$(StringData, ColumnStart, ColumnLength))
                If Not IsMissing(NullExpr) Then
                    If ColumnValues(I) = NullExpr Then ColumnValues(I) = Null
                End If
                I = I + 1
            Loop Until AtRowEnd
            RS.AddNew ColumnIds, ColumnValues
            PutString = PutString + 1
        Loop Until Pos = 0 Or Pos > Len(StringData)
    End Function

    Demo

    PutString is contained in the attached demo within Module1.bas.

    This demo creates a new empty database with a single table SOMETABLE on its first run. Once it has an open database connection it first deletes all rows (if any) from SOMETABLE.

    Then it creates a big String containing 5000 rows with 8 random data fields (of several types). This String has TAB column delimiters and CR/LF row delimiters.

    Then it calls PutString to append the data to SOMETABLE, displays a MsgBox with the elapsed time for the PutString, and ends.

    The compiled program takes from 0.12 to 0.16 seconds to do the PutString call here, but the Timer() function isn't very accurate for small intervals.


    Issues

    I think I have the bugs out of the parsing logic.

    This has only been tested with the Jet 4.0 provider, and I'm not sure how well it will do with other DBMSs. With Jet I found no advantage at all to wrapping the appends in a transaction or using batch updating, both whizzy performance gaining techniques according to common wisdom (which often isn't wise at all). Using any form of client Recordset only hurt performance, pretty much as expected.

    Of course many variables have been left out, for example other connections could be updating, holding locks, etc. and that could make a huge difference.

    Opening the database with exclusive access gains you a little more performance too. When you aren't sharing a database this is always a good bet, since eliminating locking naturally improves performance. The demo just lets this default to shared access.


    Nasty Issues

    The ADO Recordset's GetString method has a nasty secret. Not quite that big of a secret to classic ASP scripters since it was tripped over quite early. That secret is:

    GetString does not use the invariant locale and you cannot set its locale

    How does this matter?

    What about Boolean values? What about fractional numeric values?

    It turns out the PutString has the very same limitation (or is that a feature??).

    As far as I can determine through testing, the demo should work just fine even in one of the Central European locales (e.g. Germany) with funky number punctuation it different wors for "true" and for "false." That's because it is using the locale-aware CStr() function when building the big test String value.

    However the main reasons to work with a delimited text tabular data format are (a.) persisting, and (b.) interchanging data.

    So a program running on a German language machine can't use this for talking to a French language machine. A French machine can't talk to an English machine because the number formats may match but the Booleans are goofy.


    SetThreadLocale

    The clever may think they know the answer, but calling SetThreadLocale passing LOCALE_INVARIANT won't cut it. For that matter the story is more complicated for supported versions of Windows anyway, involving SetThreadUILanguage calls.

    But as far as I can tell the Variant parsing/formatting routines within OLE Automation that ADO makes use of lock in the locale pretty early and are not swayed by flippity-flopping locale settings around GetString or (my) PutString calls.
    Attached Files Attached Files

  2. #2

    Thread Starter
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: [VB6] An ADO "PutString" function

    I pounded on the code a little and created something similar for non-ADO related parsing of String values containing tabular data.

    In the demo program below this new TabularParse() function returns a Variant containing a 2D array of parsed String cell data in columns and rows. Still quite quick. Here I used a flexgrid control to display the results.

    It can autonumber the rows in column 0 if desired.

    This could be useful for controls that accept data in this form, or when a flexgrid's Clip property is inadequate because you have column delimiters other than TAB and/or row delimiters other than CR (or CRLF). It can also replace "empty" values (two delimiters in a row) by a NullExpr value you provide.

    A good piece of the attachment consists of sample data. The program is rather small.
    Attached Files Attached Files

  3. #3

    Thread Starter
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: [VB6] An ADO "PutString" function

    PutString version 2:

    Adds another optional parameter ColNameHeader. When True is passed, PutString will use the 1st row of the delimited text as column ID values.

    No other changes, the previous version is still good. This one is just a little more flexible. But you must still watch out for UI locale issues such as the decimal point character, words that mean True/False, etc.
    Attached Files Attached Files

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width