Page 1 of 2 12 LastLast
Results 1 to 40 of 54

Thread: [VB6] Faster Split & Join (development)

  1. #1

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    [VB6] Faster Split & Join (development)

    Old short QuickSplit versions in post #11.

    Strings.Split
    This is an optimized code using advanced methods for providing a "no external files" solution using a single BAS module file. No TLB, no DLL.

    There is many advantages:
    • 100% syntax compatible with native Split: just add the module to your project and you have a faster Split!1)
    • Faster with any possible call! Be the output empty array, empty strings, one array item only, short strings, mid length strings, long strings, this Split is always faster.
    • BinaryCompare for ultimate speed (InStrB), CharacterCompare for possible problems with some special case characters (InStr). InStrB is always faster than native Split; InStr version is faster or rougly equal depending on case.

    1) TextCompare not supported, you can call VBA.Split for TextCompare. Adding support for TextCompare is too complex as it might require analyzing the entire Unicode character map.
    Attached Images Attached Images     
    Attached Files Attached Files
    Last edited by Merri; Jun 14th, 2010 at 12:30 PM.

  2. #2

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: VB6 QuickSplit and QuickSplitB

    Here is a byte version that splits bytewise instead of characterwise, a feature not supported natively by VB:
    Code:
    Public Sub QuickSplitB(Expression As String, ResultSplit() As String, Optional Delimiter As String = " ", Optional ByVal Limit As Long = -1, Optional ByVal Compare As VbCompareMethod = vbBinaryCompare, Optional ByRef IgnoreDelimiterWithin As String = vbNullString)
        Dim lngA As Long, lngCount As Long, lngDelLen As Long, lngExpLen As Long, lngIgnLen As Long, lngResults() As Long
        lngExpLen = LenB(Expression)
        lngDelLen = LenB(Delimiter)
        If lngExpLen > 0 And lngDelLen > 0 And (Limit > 0 Or Limit = -1&) Then
            lngIgnLen = LenB(IgnoreDelimiterWithin)
            If lngIgnLen Then
                lngA = InStrB(1, Expression, Delimiter, Compare)
                lngB = InStrB(1, Expression, IgnoreDelimiterWithin, Compare)
                If Limit = -1& Then
                    ReDim lngResults(0 To (lngExpLen \ lngDelLen))
                    Do While lngA > 0
                        If lngA + lngDelLen <= lngB Or lngB = 0 Then
                            lngResults(lngCount) = lngA
                            lngA = InStrB(lngA + lngDelLen, Expression, Delimiter, Compare)
                            lngCount = lngCount + 1
                        Else
                            lngB = InStrB(lngB + lngIgnLen, Expression, IgnoreDelimiterWithin, Compare)
                            If lngB Then
                                lngA = InStrB(lngB + lngIgnLen, Expression, Delimiter, Compare)
                                If lngA Then
                                    lngB = InStrB(lngB + lngIgnLen, Expression, IgnoreDelimiterWithin, Compare)
                                End If
                            End If
                        End If
                    Loop
                Else
                    ReDim lngResults(0 To Limit - 1)
                    Do While lngA > 0
                        If lngA + lngDelLen <= lngB Or lngB = 0 Then
                            lngResults(lngCount) = lngA
                            lngA = InStrB(lngA + lngDelLen, Expression, Delimiter, Compare)
                            lngCount = lngCount + 1
                            If lngCount = Limit Then Exit Do
                        Else
                            lngB = InStrB(lngB + lngIgnLen, Expression, IgnoreDelimiterWithin, Compare)
                            If lngB Then
                                lngA = InStrB(lngB + lngIgnLen, Expression, Delimiter, Compare)
                                If lngA Then
                                    lngB = InStrB(lngB + lngIgnLen, Expression, IgnoreDelimiterWithin, Compare)
                                End If
                            End If
                        End If
                    Loop
                End If
            Else
                lngA = InStrB(1, Expression, Delimiter, Compare)
                If Limit = -1& Then
                    ReDim lngResults(0 To (lngExpLen \ lngDelLen))
                    Do While lngA > 0
                        lngResults(lngCount) = lngA
                        lngA = InStrB(lngA + lngDelLen, Expression, Delimiter, Compare)
                        lngCount = lngCount + 1
                    Loop
                Else
                    ReDim lngResults(0 To Limit - 1)
                    Do While lngA > 0 And lngCount < Limit
                        lngResults(lngCount) = lngA
                        lngA = InStrB(lngA + lngDelLen, Expression, Delimiter, Compare)
                        lngCount = lngCount + 1
                    Loop
                End If
            End If
            ReDim Preserve ResultSplit(0 To lngCount)
            If lngCount = 0 Then
                ResultSplit(0) = Expression
            Else
                ResultSplit(0) = LeftB$(Expression, lngResults(0) - 1)
                For lngCount = 0 To lngCount - 2
                    ResultSplit(lngCount + 1) = MidB$(Expression, lngResults(lngCount) + lngDelLen, lngResults(lngCount + 1) - lngResults(lngCount) - lngDelLen)
                Next lngCount
                ResultSplit(lngCount + 1) = RightB$(Expression, lngExpLen - lngResults(lngCount) - lngDelLen + 1)
            End If
        Else
            ResultSplit = VBA.Split(vbNullString)
        End If
    End Sub
    Edit!
    1. Removed unused variables.
    Last edited by Merri; Sep 20th, 2008 at 05:00 PM.

  3. #3

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: VB6 QuickSplit and QuickSplitB

    A version for those who want it commented, simplified and just to be a super fast Split replacement that is usable in both VB5 & VB6:
    Code:
    Option Explicit
    
    Private Declare Function SafeArrayRedim Lib "oleaut32" (ByVal saPtr As Long, saBound As Long) As Long
    
    ' Short, commented & fully VB5 compatible version of QuickSplit at http://www.vbforums.com/showthread.php?t=540323
    Public Sub QuickSplit(Expression As String, ResultSplit() As String, Optional Delimiter As String = " ", Optional ByVal Limit As Long = -1)
        ' general variables that we need
        Dim lngA As Long, lngCount As Long, lngDelLen As Long, lngExpLen As Long, lngResults() As Long
        ' some dummy variables that we happen to need
        Dim Compare As VbCompareMethod, SafeArrayBound(1) As Long
        ' length information
        lngExpLen = LenB(Expression)
        lngDelLen = LenB(Delimiter)
        ' validate lengths and limit (limit must be larger than 0 or it must be unlimited)
        If lngExpLen > 0 And lngDelLen > 0 And (Limit > 0 Or Limit = -1&) Then
            ' now look up for the first position
            lngA = InStrB(1, Expression, Delimiter, Compare)
            ' InStrB is very fast, but it may give "between characters" results
            Do Until (lngA And 1) Or (lngA = 0)
                ' this is why we look for odd positions (1, 3, 5, 7 etc. are a valid position)
                lngA = InStrB(lngA + 1, Expression, Delimiter, Compare)
            Loop
            ' unlimited or limited?
            If Limit = -1& Then
                ' unlimited, reserve space for maximum possible amount of returned items
                ReDim lngResults(0 To (lngExpLen \ lngDelLen))
                ' index positions until none is found
                Do While lngA > 0
                    ' remember this position
                    lngResults(lngCount) = lngA
                    ' look for the next one
                    lngA = InStrB(lngA + lngDelLen, Expression, Delimiter, Compare)
                    Do Until (lngA And 1) Or (lngA = 0)
                        lngA = InStrB(lngA + 1, Expression, Delimiter, Compare)
                    Loop
                    ' increase found counter
                    lngCount = lngCount + 1
                Loop
            Else
                ' limited, reserve space for the wanted amount of items
                ReDim lngResults(0 To Limit - 1)
                ' index positions until none is found or until counter hits limitation
                Do While lngA > 0 And lngCount < Limit
                    ' remember this position
                    lngResults(lngCount) = lngA
                    ' look for the next one
                    lngA = InStrB(lngA + lngDelLen, Expression, Delimiter, Compare)
                    Do Until (lngA And 1) Or (lngA = 0)
                        lngA = InStrB(lngA + 1, Expression, Delimiter, Compare)
                    Loop
                    ' increase found counter
                    lngCount = lngCount + 1
                Loop
            End If
            ' set results to actual findings
            ReDim Preserve ResultSplit(0 To lngCount)
            ' see if we found any results
            If lngCount = 0 Then
                ' nope, just set the only item to be the whole string
                ResultSplit(0) = Expression
            Else
                ' get the first item
                ResultSplit(0) = LeftB$(Expression, lngResults(0) - 1)
                ' get the other items except the last one
                For lngCount = 0 To lngCount - 2
                    ResultSplit(lngCount + 1) = MidB$(Expression, lngResults(lngCount) + lngDelLen, lngResults(lngCount + 1) - lngResults(lngCount) - lngDelLen)
                Next lngCount
                ' get the last item
                ResultSplit(lngCount + 1) = RightB$(Expression, lngExpLen - lngResults(lngCount) - lngDelLen + 1)
            End If
        Else
            ' clean any possible data that exists in the passed string array (like if it is multidimensional)
            If Not Not ResultSplit Then Erase ResultSplit
            ' mysterious IDE error fix
            Debug.Assert App.hInstance
            ' reset to one element, one dimension
            ReDim ResultSplit(0 To 0)
            ' custom redimension: remove the items (this duplicates the VB6 Split behavior)
            SafeArrayRedim Not Not ResultSplit, SafeArrayBound(0)
        End If
    End Sub
    
    ' Short, commented & fully VB5 compatible version of QuickSplitB at http://www.vbforums.com/showthread.php?t=540323
    Public Sub QuickSplitB(Expression As String, ResultSplit() As String, Optional Delimiter As String = " ", Optional ByVal Limit As Long = -1)
        ' general variables that we need
        Dim lngA As Long, lngCount As Long, lngDelLen As Long, lngExpLen As Long, lngResults() As Long
        ' some dummy variables that we happen to need
        Dim Compare As VbCompareMethod, SafeArrayBound(1) As Long
        ' length information
        lngExpLen = LenB(Expression)
        lngDelLen = LenB(Delimiter)
        ' validate lengths and limit (limit must be larger than 0 or it must be unlimited)
        If lngExpLen > 0 And lngDelLen > 0 And (Limit > 0 Or Limit = -1&) Then
            ' now look up for the first position
            lngA = InStrB(1, Expression, Delimiter, Compare)
            ' unlimited or limited?
            If Limit = -1& Then
                ' unlimited, reserve space for maximum possible amount of returned items
                ReDim lngResults(0 To (lngExpLen \ lngDelLen))
                ' index positions until none is found
                Do While lngA > 0
                    ' remember this position
                    lngResults(lngCount) = lngA
                    ' look for the next one
                    lngA = InStrB(lngA + lngDelLen, Expression, Delimiter, Compare)
                    ' increase found counter
                    lngCount = lngCount + 1
                Loop
            Else
                ' limited, reserve space for the wanted amount of items
                ReDim lngResults(0 To Limit - 1)
                ' index positions until none is found or until counter hits limitation
                Do While lngA > 0 And lngCount < Limit
                    ' remember this position
                    lngResults(lngCount) = lngA
                    ' look for the next one
                    lngA = InStrB(lngA + lngDelLen, Expression, Delimiter, Compare)
                    ' increase found counter
                    lngCount = lngCount + 1
                Loop
            End If
            ' set results to actual findings
            ReDim Preserve ResultSplit(0 To lngCount)
            ' see if we found any results
            If lngCount = 0 Then
                ' nope, just set the only item to be the whole string
                ResultSplit(0) = Expression
            Else
                ' get the first item
                ResultSplit(0) = LeftB$(Expression, lngResults(0) - 1)
                ' get the other items except the last one
                For lngCount = 0 To lngCount - 2
                    ResultSplit(lngCount + 1) = MidB$(Expression, lngResults(lngCount) + lngDelLen, lngResults(lngCount + 1) - lngResults(lngCount) - lngDelLen)
                Next lngCount
                ' get the last item
                ResultSplit(lngCount + 1) = RightB$(Expression, lngExpLen - lngResults(lngCount) - lngDelLen + 1)
            End If
        Else
            ' clean any possible data that exists in the passed string array (like if it is multidimensional)
            If Not Not ResultSplit Then Erase ResultSplit
            ' mysterious IDE error fix
            Debug.Assert App.hInstance
            ' reset to one element, one dimension
            ReDim ResultSplit(0 To 0)
            ' custom redimension: remove the items (this duplicates the VB6 Split behavior)
            SafeArrayRedim Not Not ResultSplit, SafeArrayBound(0)
        End If
    End Sub
    This test shows it is a perfect VB6 Split replacement by behavior:
    Code:
    Private Sub Form_Load()
        Dim strArray() As String
        QuickSplit vbNullString, strArray
        Debug.Print LBound(strArray), UBound(strArray)
    End Sub
    VB6 Split would also return a similar array.

    Also to show how the byte version works:
    Code:
    Private Sub Form_Load()
        Dim strArray() As String
        QuickSplitB String$(2, ChrW$(&H2020)), strArray, ChrB$(&H20)
        Debug.Print LBound(strArray), UBound(strArray)
    End Sub
    You will get 5 items (because there are 4 space 8-bit ANSI characters). Very useful if you happen to read files directly into strings via API without any character set conversions from ANSI to Unicode (see an example of reading a text file fully using API).
    Last edited by Merri; Sep 20th, 2008 at 02:55 PM.

  4. #4

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: VB6 QuickSplit and QuickSplitB

    Finally, for those interested, I have made a version of QuickSplit that splits directly into string variables instead of a string array, SplitToVar. Please note that unlike Split and QuickSplit it modifies the expression string passed into the sub.

  5. #5
    Junior Member
    Join Date
    Aug 2008
    Posts
    19

    Re: VB6 QuickSplit and QuickSplitB

    Merri,
    I am using VB5 and am finding it hard to follow your QuickSplitB code.
    My situation is that I have a textbox with string, including commas or underscores, etc and I want to make an array to output to individual textboxes or a listbox (essentially the same as the Split function in VB6).
    How does your code relate to my situation, as well as the situation where the data is in a text file? Where does the code refer to the textbox, label, etc?
    I would appreciate any assistance.
    Cheers,
    Snookered.

  6. #6

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: VB6 QuickSplit and QuickSplitB

    With VB6's Split you would do something like this to add the separated items to listbox:
    Code:
    Dim strArray() As String, lngA As Long
    ' split by comma
    strArray = Split(Text1.Text, ",")
    ' clean up the listbox
    List1.Clear
    ' fill the listbox with all the resulting items
    For lngA = 0 To UBound(strArray)
        List1.AddItem strArray(lngA)
    Next lngA
    When QuickSplit has been added to the project (into a module), the usage is slightly different, instead of this line:

    strArray = Split(Text1.Text, ",")

    You use:

    QuickSplit Text1.Text, strArray, ","

    This allows getting a true string array in VB5 and it is also much faster.

    Code:
    Dim strArray() As String, lngA As Long
    ' split by comma
    QuickSplit Text1.Text, strArray, ","
    ' clean up the listbox
    List1.Clear
    ' fill the listbox with all the resulting items
    For lngA = 0 To UBound(strArray)
        List1.AddItem strArray(lngA)
    Next lngA

  7. #7
    Junior Member
    Join Date
    Aug 2008
    Posts
    19

    Re: VB6 QuickSplit and QuickSplitB

    I have got this working fine. Thanks for your help.
    Snookered.

  8. #8
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: VB6 QuickSplit and QuickSplitB

    You could tighten that code up a little. Instead of having two mostly duplicate code blocks for limited/unlimited, if you switch the logic around to a For...Next loop I think it'll be very close to just as fast.
    Code:
            ' unlimited or limited?
            If Limit = -1& Then
                ' unlimited, reserve space for maximum possible amount of returned items
                ReDim lngResults(0 To (lngExpLen \ lngDelLen))
                ' index positions until none is found
                Do While lngA > 0
                    ' remember this position
                    lngResults(lngCount) = lngA
                    ' look for the next one
                    lngA = InStrB(lngA + lngDelLen, Expression, Delimiter, Compare)
                    ' increase found counter
                    lngCount = lngCount + 1
                Loop
            Else
                ' limited, reserve space for the wanted amount of items
                ReDim lngResults(0 To Limit - 1)
                ' index positions until none is found or until counter hits limitation
                Do While lngA > 0 And lngCount < Limit
                    ' remember this position
                    lngResults(lngCount) = lngA
                    ' look for the next one
                    lngA = InStrB(lngA + lngDelLen, Expression, Delimiter, Compare)
                    ' increase found counter
                    lngCount = lngCount + 1
                Loop
            End If
    Adding a variable would reduce the above code to:
    Code:
            ' unlimited or limited?
            If Limit = -1& Then
                lngMax = lngExpLen \ lngDelLen
            Else
                lngMax = Limit - 1
            End If
            ReDim lngResults(lngMax)
            For lngCount = 0 To lngMax
                If lngA <= 0 Then Exit For
                ' remember this position
                lngResults(lngCount) = lngA
                ' look for the next one
                lngA = InStrB(lngA + lngDelLen, Expression, Delimiter, Compare)
            Next

  9. #9
    Junior Member
    Join Date
    Sep 2008
    Posts
    26

    Re: VB6 QuickSplit and QuickSplitB

    Merri, I'm now using SplitB05 that downloaded from VBSpeed. And luckily I found another Split function at here! I want to know, QuickSplit and QuickSplitB which is faster? Are they support Chinese character as well? I'm using SplitB05(VBSpeed) has no problem with the chinese character.

  10. #10

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: VB6 QuickSplit and QuickSplitB

    tinjiunnyann: They do different slightly things. QuickSplit works with character data (two bytes = one character). QuickSplitB works with byte data (one byte = half a character). With most strings you want to use QuickSplit. Both work with any character you give them.


    Ellis Dee: thanks, I'll take a look at some point (hopefully later today... I was requested do some LAN work at home so I'm just in a quick visit at the moment)

  11. #11

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: VB6 QuickSplit and QuickSplitB

    Ellis Dee: These are the short versions updated to the even shorter code. I did a minor logical change: I made the loop code run via a True condition, instead of only checking whether to exit the loop, switching that to be the slightly slower False condition. I think this kind of improves readibility too, but that is a matter of opinion.

    I also did some further optimization that gives 0 to 4 % of performance boost, depends on the usage (how often called, how big input data, how many possible output results).
    Code:
    Option Explicit
    
    Private Declare Function SafeArrayRedim Lib "oleaut32" (ByVal saPtr As Long, saBound As Long) As Long
    
    ' Short, commented & fully VB5 compatible version of QuickSplit at http://www.vbforums.com/showthread.php?t=540323
    Public Sub QuickSplit(Expression As String, ResultSplit() As String, Optional Delimiter As String = " ", Optional ByVal Limit As Long = -1)
        Static lngResults() As Long
        ' general variables that we need
        Dim lngA As Long, lngCount As Long, lngDelLen As Long, lngExpLen As Long
        ' some dummy variables that we happen to need
        Dim Compare As VbCompareMethod, SafeArrayBound(1) As Long
        ' length information
        lngExpLen = LenB(Expression)
        lngDelLen = LenB(Delimiter)
        ' validate lengths and limit (limit must be larger than 0 or it must be unlimited)
        If lngExpLen > 0 And lngDelLen > 0 And (Limit > 0 Or Limit = -1&) Then
            ' unlimited or limited?
            If Limit = -1& Then
                ' unlimited, reserve space for maximum possible amount of returned items
                lngCount = lngExpLen \ lngDelLen
            Else
                ' limited, reserve space for maximum wanted amount of returned items
                lngCount = Limit - 1
            End If
            ' data of found positions
            ReDim Preserve lngResults(0 To lngCount)
            ' now look up for the first position
            lngA = InStrB(1, Expression, Delimiter, Compare)
            ' InStrB is very fast, but it may give "between characters" results
            Do Until (lngA And 1) Or (lngA = 0)
                ' this is why we look for odd positions (1, 3, 5, 7 etc. are a valid position)
                lngA = InStrB(lngA + 1, Expression, Delimiter, Compare)
            Loop
            ' now work until maximum limit
            For lngCount = 0 To lngCount
                ' did we find anything?
                If lngA > 0 Then
                    ' remember this position
                    lngResults(lngCount) = lngA
                    ' look for the next one
                    lngA = InStrB(lngA + lngDelLen, Expression, Delimiter, Compare)
                    Do Until (lngA And 1) Or (lngA = 0)
                        lngA = InStrB(lngA + 1, Expression, Delimiter, Compare)
                    Loop
                Else
                    ' we did not find the next string, we are done
                    Exit For
                End If
            Next lngCount
            ' set results to actual findings
            ReDim Preserve ResultSplit(0 To lngCount)
            ' see if we found any results
            If lngCount = 0 Then
                ' nope, just set the only item to be the whole string
                ResultSplit(0) = Expression
            Else
                ' get the first item
                ResultSplit(0) = LeftB$(Expression, lngResults(0) - 1)
                ' get the other items except the last one
                For lngCount = 0 To lngCount - 2
                    ResultSplit(lngCount + 1) = MidB$(Expression, lngResults(lngCount) + lngDelLen, lngResults(lngCount + 1) - lngResults(lngCount) - lngDelLen)
                Next lngCount
                ' get the last item
                ResultSplit(lngCount + 1) = RightB$(Expression, lngExpLen - lngResults(lngCount) - lngDelLen + 1)
            End If
        Else
            ' clean any possible data that exists in the passed string array (like if it is multidimensional)
            If Not Not ResultSplit Then Erase ResultSplit
            ' mysterious IDE error fix
            Debug.Assert App.hInstance
            ' reset to one element, one dimension
            ReDim ResultSplit(0 To 0)
            ' remove the item (this duplicates the VB6 Split behavior)
            SafeArrayRedim Not Not ResultSplit, SafeArrayBound(0)
        End If
    End Sub
    
    ' Short, commented & fully VB5 compatible version of QuickSplitB at http://www.vbforums.com/showthread.php?t=540323
    Public Sub QuickSplitB(Expression As String, ResultSplit() As String, Optional Delimiter As String = " ", Optional ByVal Limit As Long = -1)
        Static lngResults() As Long
        ' general variables that we need
        Dim lngA As Long, lngCount As Long, lngDelLen As Long, lngExpLen As Long
        ' some dummy variables that we happen to need
        Dim Compare As VbCompareMethod, SafeArrayBound(1) As Long
        ' length information
        lngExpLen = LenB(Expression)
        lngDelLen = LenB(Delimiter)
        ' validate lengths and limit (limit must be larger than 0 or it must be unlimited)
        If lngExpLen > 0 And lngDelLen > 0 And (Limit > 0 Or Limit = -1&) Then
            ' unlimited or limited?
            If Limit = -1& Then
                ' unlimited, reserve space for maximum possible amount of returned items
                lngCount = lngExpLen \ lngDelLen
            Else
                ' limited, reserve space for maximum wanted amount of returned items
                lngCount = Limit - 1
            End If
            ' data of found positions
            ReDim Preserve lngResults(0 To lngCount)
            ' now look up for the first position
            lngA = InStrB(1, Expression, Delimiter, Compare)
            ' now work until maximum limit
            For lngCount = 0 To lngCount
                ' did we find anything?
                If lngA > 0 Then
                    ' remember this position
                    lngResults(lngCount) = lngA
                    ' look for the next one
                    lngA = InStrB(lngA + lngDelLen, Expression, Delimiter, Compare)
                Else
                    ' we did not find the next string, we are done
                    Exit For
                End If
            Next lngCount
            ' set results to actual findings
            ReDim Preserve ResultSplit(0 To lngCount)
            ' see if we found any results
            If lngCount = 0 Then
                ' nope, just set the only item to be the whole string
                ResultSplit(0) = Expression
            Else
                ' get the first item
                ResultSplit(0) = LeftB$(Expression, lngResults(0) - 1)
                ' get the other items except the last one
                For lngCount = 0 To lngCount - 2
                    ResultSplit(lngCount + 1) = MidB$(Expression, lngResults(lngCount) + lngDelLen, lngResults(lngCount + 1) - lngResults(lngCount) - lngDelLen)
                Next lngCount
                ' get the last item
                ResultSplit(lngCount + 1) = RightB$(Expression, lngExpLen - lngResults(lngCount) - lngDelLen + 1)
            End If
        Else
            ' clean any possible data that exists in the passed string array (like if it is multidimensional)
            If Not Not ResultSplit Then Erase ResultSplit
            ' mysterious IDE error fix
            Debug.Assert App.hInstance
            ' reset to one element, one dimension
            ReDim ResultSplit(0 To 0)
            ' remove the item (this duplicates the VB6 Split behavior)
            SafeArrayRedim Not Not ResultSplit, SafeArrayBound(0)
        End If
    End Sub
    I haven't perfectly tested these codes, but they appear to work the same.


    Edit!
    A screenshot showing some of the performance difference against Splits available over at VBspeed:




    Edit #2!
    One final addition... the longer the output strings in the array, the better QuickSplit does against what is available over at VBspeed:
    Last edited by Merri; Sep 28th, 2008 at 04:48 PM.

  12. #12
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: VB6 QuickSplit and QuickSplitB

    Quote Originally Posted by Merri
    Ellis Dee: These are the short versions updated to the even shorter code. I did a minor logical change: I made the loop code run via a True condition, instead of only checking whether to exit the loop, switching that to be the slightly slower False condition. I think this kind of improves readibility too, but that is a matter of opinion.
    Yeah, looks much nicer. I agree on the readability of the logical flow.

  13. #13
    Junior Member
    Join Date
    Sep 2008
    Posts
    26

    Re: VB6 QuickSplit and QuickSplitB

    Merri, your QuickSplit is very nice, and I've using SplitB05(VBSpeed), QuickSplit and QuickSplitB for doing a comparison in my environment. I found that QuickSplit and QuickSplitB are 50% faster than SplitB05(VBSpeed). And as my coding result shows me, QuickSplitB is slightly faster than QuickSplit!

    As the above testing benchmark are based on my own coding and my own usage.

  14. #14
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: VB6 QuickSplit and QuickSplitB

    Oh, and I like how you reuse lngCount instead of introducing a new variable. But I'm a fan of reusing variables as a concept; no doubt others will find it less readable/more confusing/whatever.

    For lngCount = 0 To lngCount

  15. #15
    Junior Member
    Join Date
    Sep 2008
    Posts
    26

    Re: VB6 QuickSplit and QuickSplitB

    Quote Originally Posted by Ellis Dee
    Oh, and I like how you reuse lngCount instead of introducing a new variable. But I'm a fan of reusing variables as a concept; no doubt others will find it less readable/more confusing/whatever.

    For lngCount = 0 To lngCount
    Is it will improve a bit of performance if we reuse the same variable for looping?

  16. #16
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: VB6 QuickSplit and QuickSplitB

    Quote Originally Posted by tanjiunnyann
    Is it will improve a bit of performance if we reuse the same variable for looping?
    No. I just like that there is one less variable. Makes it seem cleaner to me.

  17. #17

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: VB6 QuickSplit and QuickSplitB

    tanjiunnyann: you should not use QuickSplitB for regular character data! Especially with Chinese you may get incorrect results at times, because a character may be understood "from the middle" of a character.

    This is a theoretical example: say you're looking for character which hex representation is |2020|. You have two characters within a string: |2040| and |4020| - when put together as a byte stream, they appear as |4020|2040|. Now, when QuickSplitB is processing the data, you may get a in between result: |4020|2040| - this is something that cannot happen with QuickSplit, because it only works per character, not by byte. Thus they should not be compared to each other as they're doing a slightly different thing (except of course for curiosity).


    Ellis Dee: I reused lngCount, because I didn't see a need for a new variable as lngCount was still working for the same purpose: being a counter. I could have, however, used the Limit variable too, it is passed ByVal so I could do whatever I'd like with it, thus the code could be For lngCount = 0 To Limit instead.
    Last edited by Merri; Sep 29th, 2008 at 09:01 AM.

  18. #18
    Junior Member
    Join Date
    Sep 2008
    Posts
    26

    Re: VB6 QuickSplit and QuickSplitB

    Merri, thanks for your advice! I'm using QuickSplit instaed of QuickSplitB.

    Your Split function is now the fastest split function what I've ever use before! I hope got some more improvement of performance from you next time.

  19. #19
    Addicted Member
    Join Date
    Jul 2007
    Posts
    228

    Re: VB6 QuickSplit and QuickSplitB

    Nice Job!

    I converted a VB6 Split function to QuickSplit using a .csv file that needed to be parsed. I didn't time it but the visual display by itself tells me it is much faster. The data is already displayed when the form pops-up. Can't see it being drawn!

    Thanks for the code!

    Tom

  20. #20

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: VB6 QuickSplit and QuickSplitB

    I spent some time doing further research on speeding up this procedure. I found a case where QuickSplit performs badly: a string where there are a lot of longer substrings causes performance issues. I also tweaked a lot of other areas. The resulting code is a lengthy one:
    Code:
    ' VBspeedMerri4.bas
    Option Explicit
    
    Private Declare Function ArrPtr Lib "msvbvm60" Alias "VarPtr" (Arr() As Any) As Long
    Private Declare Sub GetMem4 Lib "msvbvm60" (ByVal Ptr As Long, Value As Long)
    Private Declare Sub PutMem4 Lib "msvbvm60" (ByVal Ptr As Long, ByVal Value As Long)
    Private Declare Sub PutMem8 Lib "msvbvm60" (ByVal Ptr As Long, ByVal Value As Currency)
    Private Declare Function SysAllocStringByteLen Lib "oleaut32" (ByVal Ptr As Long, ByVal Length As Long) As Long
    
    Private Declare Function CloseHandle Lib "kernel32" (ByVal hObject As Long) As Long
    Private Declare Function GetCurrentProcessId Lib "kernel32" () As Long
    Private Declare Function OpenProcess Lib "kernel32" (ByVal dwDesiredAccess As Long, ByVal bInheritHandle As Long, ByVal dwProcessId As Long) As Long
    Private Declare Function WriteProcessMemory Lib "kernel32" (ByVal hProcess As Long, lpBaseAddress As Any, lpBuffer As Any, ByVal nSize As Long, Optional lpNumberOfBytesWritten As Long) As Long
    
    Private Function InIDE(Optional IDE) As Boolean
        If IsMissing(IDE) Then Debug.Assert Not InIDE(InIDE) Else IDE = True
    End Function
    
    Private Property Get Procedure(ByVal AddressOfDest As Long) As Long
        Procedure = AddressOfDest
    End Property
    
    Private Property Let Procedure(ByVal AddressOfDest As Long, ByVal AddressOfSrc As Long)
        Dim JMP As Currency, PID As Long
        ' get process handle
        PID = OpenProcess(&H1F0FFF, 0&, GetCurrentProcessId)
        If PID Then
            If InIDE Then
                ' get correct pointers to procedures in IDE
                GetMem4 AddressOfDest + &H16&, AddressOfDest
                GetMem4 AddressOfSrc + &H16&, AddressOfSrc
            End If
            Debug.Assert App.hInstance
            ' ASM JMP (0xE9) followed by bytes to jump in memory
            JMP = (&HE9& * 0.0001@) + (AddressOfSrc - AddressOfDest - 5@) * 0.0256@
            ' write the JMP over the destination procedure
            WriteProcessMemory PID, ByVal AddressOfDest, JMP, 5
            ' close process handle
            CloseHandle PID
        End If
    End Property
    
    Public Sub QuickSplit4(ResultSplit() As String, Expression As String, Optional Delimiter As String = " ", Optional ByVal Limit As Long = -1)
        ' general variables that we need
        Dim D() As Integer, E() As Integer, H() As Long, HE() As Long, HS() As Long, P() As Long, S() As Long
        Dim C As Long, I As Long, J As Long, K As Long, N As Long, LD As Long, LE As Long, PL As Long, V As Long
        Dim PD As Long, PE As Long, PS As Long
        ' length information
        LE = LenB(Expression)
        LD = LenB(Delimiter)
        ' unlimited or limited?
        If Limit = -1 Then Limit = LE \ LD + 1
        ' validate lengths and limit
        If LE > 0 And LD > 0 And Limit >= 0 Then
            ' find the first item
            If Limit > 1 Then
                Do: I = InStrB(I + 1, Expression, Delimiter)
                Loop Until (I And 1) = 1 Or (I = 0)
            End If
            ' did we find an item?
            If I Then
                ' space for knowing the positions
                PL = Limit \ 64
                ReDim P(0 To PL)
                Do
                    ' remember position
                    P(C) = I
                    ' find next
                    I = I + LD - 1
                    Do: I = InStrB(I + 1, Expression, Delimiter)
                    Loop Until (I And 1) = 1 Or (I = 0)
                    ' increase counter
                    C = C + 1
                    If C > PL Then PL = PL * 2: ReDim Preserve P(PL)
                Loop While I > 0 And C <= Limit
                P(C) = LE + 1
                ' get pointer
                I = Not Not ResultSplit
                Debug.Assert App.hInstance
                ' check if it is multidimensional array
                If I Then GetMem4 I, I: If I And &HFFFF& <> 1 Then Erase ResultSplit
                ' make space for the new items
                ReDim Preserve ResultSplit(0 To C)
                If C < 32 Then
                    ' there is not a lot to do so keep it simple!
                    I = 1
                    For C = 0 To C
                        J = P(C) - I
                        ResultSplit(C) = MidB$(Expression, I, J)
                        I = P(C) + LD
                    Next C
                Else
                    ' okay, now there should be enough to work with to have use of safe arrays...
                    ReDim H(0 To 5)
                    HS = H
                    HS(0) = 1: HS(1) = 4: HS(4) = 1
                    ' S = current ResultSplit array item (Long)
                    PS = ArrPtr(S)
                    PutMem4 PS, VarPtr(HS(0))
                    H(0) = 1: H(1) = 2: H(4) = LenB(Expression)
                    HE = H
                    HE(3) = StrPtr(Expression)
                    ' D = current ResultSplit item's data (Integer array)
                    PD = ArrPtr(D)
                    PutMem4 PD, VarPtr(H(0))
                    ' E = Expression's data (Integer array)
                    PE = ArrPtr(E)
                    PutMem4 PE, VarPtr(HE(0))
                    V = VarPtr(ResultSplit(0))
                    ' then start working...
                    I = 1
                    For C = 0 To C
                        J = P(C) - I
                        If J Then
                            If J > 191 Then
                                ' MidB$ is faster than Integer array with long copy
                                ResultSplit(C) = MidB$(Expression, I, J)
                            Else
                                ' Integer array is faster with a short copy
                                HS(3) = V + C * 4
                                If S(0) Then
                                    If LenB(ResultSplit(C)) <> J Then
                                        ResultSplit(C) = vbNullString
                                        S(0) = SysAllocStringByteLen(0, J)
                                    End If
                                Else
                                    S(0) = SysAllocStringByteLen(0, J)
                                End If
                                H(3) = S(0)
                                N = (I - 1) \ 2
                                For K = 0 To (J - 1) \ 2
                                    D(K) = E(N + K)
                                Next K
                            End If
                        End If
                        I = P(C) + LD
                    Next C
                    PutMem4 PS, 0
                    PutMem4 PD, 0
                    PutMem4 PE, 0
                End If
            Else
                ' one item
                ReDim ResultSplit(0)
                ResultSplit(0) = Expression
            End If
        Else
            I = Not Not ResultSplit
            ' VB6 IDE mysterious bug fix with Not Array
            Debug.Assert App.hInstance
            If I Then
                If LBound(ResultSplit) = 0 And UBound(ResultSplit) = -1 Then Exit Sub
                Erase ResultSplit
            End If
            ' safe array header for an empty string array
            ReDim H(0 To 6): H(0) = vbString: H(1) = &H1800001: H(2) = 4
            ' H(1) becomes ArrPtr; H(0) is a negative item telling array datatype
            PutMem4 StrArrPtr(ResultSplit), VarPtr(H(1))
            ' remove items from H array
            PutMem8 (Not Not H) + 12, 0
            ' VB6 IDE mysterious bug fix with Not Array
            Debug.Assert App.hInstance
        End If
    End Sub
    
    Public Function StrArrPtr(Arr() As String) As Long
        Procedure(AddressOf VBspeedMerri4.StrArrPtr) = Procedure(AddressOf VBspeedMerri4.z_ArrPtr)
        StrArrPtr = VBspeedMerri4.StrArrPtr(Arr)
    End Function
    
    Public Function z_ArrPtr(ByVal Value As Long) As Long
        z_ArrPtr = Value
    End Function
    The name QuickSplit4 only comes from the test version: I did two other versions to find out where things could be improved. Whatever the call this procedure should be faster than the native Split.

    Btw, VBspeed's SplitB03 is actually quite a good one! The tests in VBspeed's site are poor for finding out all the cases. Here is a modified version:
    Code:
    Public Sub SplitB03(Expression$, ResultSplit$(), Optional Delimiter$ = " ")
    ' by G.Beckmann, G.Beckmann@NikoCity.de
    ' modified from InStr to InStrB by Merri
     
        Dim C&, iLen&, iLast&, iCur&
        
        iLen = LenB(Delimiter)
        
        If iLen Then
            
            '/ count delimiters
            Do: iCur = InStrB(iCur + 1, Expression, Delimiter)
            Loop Until (iCur And 1) = 1 Or (iCur = 0)
            Do While iCur
                iCur = iCur + iLen - 1
                Do: iCur = InStrB(iCur + 1, Expression, Delimiter)
                Loop Until (iCur And 1) = 1 Or (iCur = 0)
                C = C + 1
            Loop
            
            '/ initalization
            ReDim Preserve ResultSplit(0 To C)
            C = 0: iCur = 0: iLast = 1
            
            '/ search again...
            Do: iCur = InStrB(iCur + 1, Expression, Delimiter)
            Loop Until (iCur And 1) = 1 Or (iCur = 0)
            Do While iCur
                ResultSplit(C) = MidB$(Expression, iLast, iCur - iLast)
                iLast = iCur + iLen
                iCur = iLast - 1
                Do: iCur = InStrB(iCur + 1, Expression, Delimiter)
                Loop Until (iCur And 1) = 1 Or (iCur = 0)
                C = C + 1
            Loop
            ResultSplit(C) = MidB$(Expression, iLast)
            
        Else
            ReDim Preserve ResultSplit(0 To 0)
            ResultSplit(0) = Expression
        End If
     
    End Sub
    This is a good short code to work with, but it doesn't beat the native Split by a great margin very often (depends a lot on what you're doing).


    Edit!
    Test benchmark project attached. Note that I've added anti-cheating code: ReDim Preserve messes up true comparison between procedures in a multicall benchmark with exact same Expression and Delimiter. Thus I added Erase to clean up the array, just like what happens with native Split. This gives a truer comparison between Split and other solutions.
    Attached Files Attached Files
    Last edited by Merri; Jun 14th, 2010 at 09:05 AM. Reason: Killing off potential memory issue

  21. #21

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: [VB6] Faster Split (QuickSplit development)

    First post has been updated with a hopefully "final" version. I don't know if I can come up with anything to improve it more using VB6 code: ASM thunks would be "cheating".

  22. #22

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: [VB6] Faster Split (QuickSplit development)

    More improvements here! Both faster & shorter!

    Code:
    ' Strings2.bas
    Option Explicit
    
    Public Enum SplitCompareMethod
        [Split BinaryCompare] = VbCompareMethod.vbBinaryCompare         ' InStrB
    '    [Split TextCompare] = VbCompareMethod.vbTextCompare             ' InStr(TextCompare)
        [Split CharacterCompare] = VbCompareMethod.vbDatabaseCompare    ' InStr(BinaryCompare)
    End Enum
    
    Private Declare Function ArrPtr Lib "msvbvm60" Alias "VarPtr" (Arr() As Any) As Long
    Private Declare Sub GetMem4 Lib "msvbvm60" (ByVal ptr As Long, Value As Long)
    Private Declare Function InitStringArray Lib "oleaut32" Alias "SafeArrayCreate" (Optional ByVal VarType As VbVarType = vbString, Optional ByVal Dims As Integer = 1, Optional saBound As Currency) As Long
    Private Declare Sub PutMem4 Lib "msvbvm60" (ByVal ptr As Long, ByVal Value As Long)
    Private Declare Function SysAllocStringByteLen Lib "oleaut32" (ByVal ptr As Long, ByVal Length As Long) As Long
    Private Declare Function SysAllocStringLen Lib "oleaut32" (ByVal ptr As Long, ByVal Length As Long) As Long
    
    Private Declare Function CloseHandle Lib "kernel32" (ByVal hObject As Long) As Long
    Private Declare Function GetCurrentProcessId Lib "kernel32" () As Long
    Private Declare Function OpenProcess Lib "kernel32" (ByVal dwDesiredAccess As Long, ByVal bInheritHandle As Long, ByVal dwProcessId As Long) As Long
    Private Declare Function WriteProcessMemory Lib "kernel32" (ByVal hProcess As Long, lpBaseAddress As Any, lpBuffer As Any, ByVal nSize As Long, Optional lpNumberOfBytesWritten As Long) As Long
    
    Private Function InIDE(Optional IDE) As Boolean
        If IsMissing(IDE) Then Debug.Assert Not InIDE(InIDE) Else IDE = True
    End Function
    
    Private Property Get Procedure(ByVal AddressOfDest As Long) As Long
        Procedure = AddressOfDest
    End Property
    
    Private Property Let Procedure(ByVal AddressOfDest As Long, ByVal AddressOfSrc As Long)
        Dim JMP As Currency, PID As Long
        ' get process handle
        PID = OpenProcess(&H1F0FFF, 0&, GetCurrentProcessId)
        If PID Then
            If InIDE Then
                ' get correct pointers to procedures in IDE
                GetMem4 AddressOfDest + &H16&, AddressOfDest
                GetMem4 AddressOfSrc + &H16&, AddressOfSrc
            End If
            Debug.Assert App.hInstance
            ' ASM JMP (0xE9) followed by bytes to jump in memory
            JMP = (&HE9& * 0.0001@) + (AddressOfSrc - AddressOfDest - 5@) * 0.0256@
            ' write the JMP over the destination procedure
            WriteProcessMemory PID, ByVal AddressOfDest, JMP, 5
            ' close process handle
            CloseHandle PID
        End If
    End Property
    
    Public Function Split(Expression As String, Optional Delimiter As String = " ", Optional ByVal Limit As Long = -1, Optional ByVal Compare As SplitCompareMethod) As String()
        Procedure(AddressOf Strings2.Split) = Procedure(AddressOf Strings2.z_Split)
        Split = Strings2.Split(Expression, Delimiter, Limit, Compare)
    End Function
    
    Public Function z_Split(Expression As String, Optional Delimiter As String = " ", Optional ByVal Limit As Long = -1, Optional ByVal Compare As SplitCompareMethod) As Long
        ' general variables that we need
        Dim P() As Long, R() As Long
        Dim C As Long, I As Long, J As Long, K As Long, LD As Long, LE As Long, PL As Long, PS As Long
        ' get pointer
        PS = StrPtr(Expression)
        ' length information
        LE = LenB(Expression)
        LD = LenB(Delimiter)
        ' unlimited or limited?
        If Limit = -1 Then If LD Then Limit = LE \ LD + 1
        ' validate lengths and limit
        If LE > 0 And LD > 0 And Limit >= 0 Then
            ' find the first item
            If Limit > 1 Then
                If Compare = [Split BinaryCompare] Then
                    Do: I = InStrB(I + 1, Expression, Delimiter)
                    Loop Until (I And 1) = 1 Or (I = 0)
                Else
                    I = InStr(Expression, Delimiter)
                End If
            End If
            ' did we find an item?
            If I Then
                ' space for knowing the positions
                PL = Limit \ 80
                ReDim P(0 To PL)
                ' InStrB?
                If Compare = [Split BinaryCompare] Then
                    Do
                        ' remember position
                        P(C) = I - 1
                        ' find next
                        I = I + LD - 1
                        Do: I = InStrB(I + 1, Expression, Delimiter)
                        Loop Until (I And 1) = 1 Or (I = 0)
                        ' increase counter
                        C = C + 1
                        If C > PL Then PL = PL + C: ReDim Preserve P(PL)
                    Loop While I > 0 And C <= Limit
                Else ' InStr
                    Do
                        ' remember position
                        P(C) = (I - 1) * 2
                        ' find next
                        I = InStr(I + LD \ 2, Expression, Delimiter)
                        ' increase counter
                        C = C + 1
                        If C > PL Then PL = PL + C: ReDim Preserve P(PL)
                    Loop While I > 0 And C <= Limit
                End If
                P(C) = LE
                ' make space for the new items
                z_Split = InitStringArray(, , (C + 1) * 0.0001@)
                PutMem4 ArrPtr(R), z_Split
                ' keep it simple, stupid!
                I = 0
                For C = 0 To C
                    K = P(C)
                    J = K - I
                    If J Then R(C) = SysAllocStringByteLen(PS + I, J)
                    I = K + LD
                Next C
            Else
                ' one item
                z_Split = InitStringArray(, , 0.0001@)
                PutMem4 ArrPtr(R), z_Split
                R(0) = SysAllocStringByteLen(PS, LE)
            End If
            ' clean up z_Split reference
            PutMem4 ArrPtr(R), 0
        Else
            z_Split = InitStringArray
        End If
    End Function
    This is one of those improvements after which I have this "boy do I feel like an idiot" feeling. This is massively faster with large amount of data.
    Last edited by Merri; Jun 20th, 2010 at 06:35 PM. Reason: Updated with shorter code

  23. #23

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: [VB6] Faster Split (QuickSplit development)

    I got interested on improving Join function. Initially the native VB6 function seems nearly unbeatable (without using ASM thunks). Even the functions available at VBspeed have a lot of trouble being faster. Also all VBspeed's benchmarks are for joining just one character delimiters: use a two character delimiter and you're back being either equal to or slower than the native Join.

    So, with this in mind I started looking for solutions. A major thing that I found (that was missing in VBspeed solutions) was the use of non-array variables. Using this trick I ended up with a very long specialized code that is still experimental, but it is fast with short delimiters (up to 8 ~ 16 characters) & very short array items (0 to 4 characters). Being experimental I'm still working my way with it. One thing that I simply had to do was adding Goto. Without it the function would be simply unreadable long.

    Anyway, here is the current version of Strings.bas that now has fast Join & Split. For joining large data (long string array items and/or delimiters longer than 16 characters) I still have to recommend VBA.Join as this function does not provide speed there, just a barely equal speed. With short stuff the speed can be almost twice at fast (and some null length cases are some four times faster, but they're rare & special).


    I think I may next move to using only Integer & Long datatypes to simplify the code & make things simpler. At the moment there are arrays for Byte, Integer, Long & Currency. The Currency speed doesn't seem to be as good as I'd like it to be.
    Attached Files Attached Files
    Last edited by Merri; Jun 23rd, 2010 at 03:33 PM.

  24. #24

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: [VB6] Faster Split (QuickSplit development)

    Another update. Some tweaks that improve both Join & Split, especially with lightweight calls. Also I present the fastest StrReverse implementation in VB code. It is nowhere near the native VB6 function (it is ASM optimized), but it is clearly faster than any of the VBspeed solutions.
    Attached Files Attached Files
    Last edited by Merri; Jun 29th, 2010 at 02:39 PM.

  25. #25

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: [VB6] Faster Split & Join (development)

    Quick update. First, a bug fix: I noticed I had a small bug in the Join as it had a comma as a delimiter. Otherwise, I've improved Join so that with short delimiters and/or short strings to join it is always faster than the native Join, up to four times in some cases! With both long delimiters & long array strings native Join is still faster (by a small margin).

    I think at this point the Join function has reached "maturity", it no longer has clearly visible weak spots.


    I still have some ideas for improvement... for example, RtlMoveMemory copies from right to left. The current copy order goes from left to right for the most part. I throw a guess that if all copying always progressed from right to left then it may improve the overall speed of filling the string, but I can't be sure before I've tested it.

    Edit!
    I think I had a brainfart while thinking that, but anyway... I did find some interesting results.
    Attached Files Attached Files
    Last edited by Merri; Jul 1st, 2010 at 10:08 AM.

  26. #26

    Thread Starter
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: [VB6] Faster Split & Join (development)

    The code attached is for those curious to check out how reversed code performs. What I have to say here is based on what I found out in my benchmarks:
    1. RtlMoveMemory copies stuff from left to right. It performs optimally when it can run continually from left to right. It performs worse if it needs to jump right to left between calls.
    2. VB6 arrays perform better when stuff is copied right to left – the better if it can run continually from right to left. Arrays perform worse if the copy order is left to right.


    My only guess is that VB6's compiler somehow optimizes right to left copying better, or that this is somehow true in general for safe arrays.


    Based on this information I've also improved Join function. Especially the 8 character copy performs much better now. However I won't post a new version of Strings2.bas at this point... I want to do more this time before a new release. Just wanted to post this information for those that may still do some performance tuning in VB6. (And yes, I'm doing this only because I'm interested where this old warhorse stands performance wise.)
    Attached Files Attached Files

  27. #27
    New Member
    Join Date
    Jul 2012
    Posts
    3

    Re: [VB6] Faster Split & Join (development)

    I just wanna say thank you.

    Deal with DXF files was a terrible pain till now. The vba.split function behaves very bad, and given the file size I use to open (over 10 MB almost always) it was a sad moment when I get to the split part.

    Your BAS module worked flawlessly (SPLIT function). Incredible speed, it's well under 1/20, or less, the time it used to do the split. And the best part is that I only had to take apart the VBA. before the function call.

    The Replace function is halfway, and as I don't clearly understand what was your original idea, I could not fix it, at least to return a string. It's a pitty, because it seems I've got the same speed problem with VBA.replace than with VBA.split.

    Again, very nice work.
    Last edited by adulador; Jul 12th, 2012 at 06:05 AM. Reason: Replace function

  28. #28
    New Member
    Join Date
    Jul 2012
    Posts
    3

    Re: [VB6] Faster Split & Join (development)

    Just one thing I didn't mentioned.

    I'm using VBA (Excel). So I got two warnings with "App." object, that had to be changed to "Application." object.

    I don't know how to change this for being compatible under VB and VBA.

    Of course minimal problem, got replaced and everything worked as a charm under VBA.

  29. #29
    Lively Member
    Join Date
    Feb 2012
    Posts
    106

    Re: [VB6] Faster Split & Join (development)

    Greetings,
    Just have one simple question.
    Can it split file of 5 GB in to 1 GB parts and vice-versa?

    Thanks
    Regards,

  30. #30
    Default Member Bonnie West's Avatar
    Join Date
    Jun 2012
    Location
    InIDE
    Posts
    4,060

    Re: [VB6] Faster Split & Join (development)

    Quote Originally Posted by green.pitch View Post
    Can it split file of 5 GB in to 1 GB parts and vice-versa?
    According to the String Data Type documentation:

    Quote Originally Posted by MSDN
    A variable-length string can contain up to approximately 2 billion (2^31) characters.
    Strings can contain binary data. A single character is 2 Bytes long (remember, VB Strings are Unicode). Therefore, theoretically, you can have at most about 4 GB of binary data held in virtual memory. Obviously, your process and other processes uses some of that memory too, so the maximum String size that you can actually allocate is going to be less than 4 GB.

    Splitting and joining huge files are best processed one manageable chunk at a time. The CodeBank probably has an example or two on how to do that.
    On Local Error Resume Next: If Not Empty Is Nothing Then Do While Null: ReDim i(True To False) As Currency: Loop: Else Debug.Assert CCur(CLng(CInt(CBool(False Imp True Xor False Eqv True)))): Stop: On Local Error GoTo 0
    Declare Sub CrashVB Lib "msvbvm60" (Optional DontPassMe As Any)

  31. #31
    New Member
    Join Date
    Sep 2015
    Posts
    1

    Re: [VB6] Faster Split & Join (development)

    I do wonder if that VB6 code is compatible with VB.NET (Visual Studio 2008). I could not use it simply by changing its extension to .VB

    I am desperately searching for a function which is faster than string.split() native .NET function. Since, I am not good at VB.NET (started to use it recently due to a project development requirement) I would appreciate if I can get strings2.vb version which can be directly used in VB.NET environment.

    Thanks.

  32. #32
    Hyperactive Member
    Join Date
    Feb 2015
    Location
    Colorado USA
    Posts
    261

    Re: [VB6] Faster Split & Join (development)

    Merri,

    You have done some things in your code that I do not understand. I write a lot of code with Windows API calls but some of your stuff is totally new to me.

    For example, starting in line 815 you have the following Sub:

    Code:
    Public Sub PutLong(ByVal Ptr As Long, ByVal Value As Long)
        Procedure(AddressOf Strings.PutLong) = API("msvbvm60", "PutMem4")
        PutLong Ptr, Value
    End Sub
    I would normally write this as simply

    Code:
    Public Sub PutLong(ByVal Ptr As Long, ByVal Value As Long)
        PutMem4 Ptr, Value
    End Sub
    and I probably would have just used it inline instead of using a separate function. I have never seen the "Procedure" statement and I didn't find anything in Help for VB or VBA that told me what it was doing. It appears you are looking for the PutMem4 procedure in msvbm60.dll but what is the advantage of doing what you did versus just a simple call to PutMem4? You have done it several times in this module so you must see and advantage to doing it this way but it's not apparent to me what that advantage is. Sorry for being dense...

  33. #33
    Default Member Bonnie West's Avatar
    Join Date
    Jun 2012
    Location
    InIDE
    Posts
    4,060

    Re: [VB6] Faster Split & Join (development)

    Quote Originally Posted by MountainMan View Post
    Merri,
    Quote Originally Posted by Merri View Post
    I don't visit here much at all anymore, ...
    Quote Originally Posted by MountainMan View Post
    I have never seen the "Procedure" statement and I didn't find anything in Help for VB or VBA that told me what it was doing.
    It is a custom read/write property that is defined in the same module as the PutLong subroutine. Its purpose appears to be retrieval of the specified procedure's address and overwriting of the specified procedure so that it jumps to another procedure (in other words, redirect the procedure so that it executes another procedure).

    Quote Originally Posted by MountainMan View Post
    ... but what is the advantage of doing what you did versus just a simple call to PutMem4? You have done it several times in this module so you must see and advantage to doing it this way but it's not apparent to me what that advantage is.
    Merri seems to be using it to bypass the overhead of the Declare statement and call API functions as directly as possible via proxy procedures in standard modules. Although it isn't quite as direct as API functions declared in a type library, it is close enough and has the advantage of not requiring a TLB dependency.
    On Local Error Resume Next: If Not Empty Is Nothing Then Do While Null: ReDim i(True To False) As Currency: Loop: Else Debug.Assert CCur(CLng(CInt(CBool(False Imp True Xor False Eqv True)))): Stop: On Local Error GoTo 0
    Declare Sub CrashVB Lib "msvbvm60" (Optional DontPassMe As Any)

  34. #34
    PowerPoster
    Join Date
    Jun 2015
    Posts
    2,224

    Re: [VB6] Faster Split & Join (development)

    Make sure you watch out for how you have parameters defined. It lets the runtime know whether or not to Deref passed parameters, or pass ByVal.

    Depending on how you declare PutMem4 you want PutMem4 ByVal Ptr, Value. Merri overwrites a Procedure's location with a JMP to an API, and preserves the call stack. This way VB assumes the original procedure declaration is essentially how the API should be called.
    Last edited by DEXWERX; Feb 22nd, 2016 at 02:08 PM.

  35. #35
    New Member
    Join Date
    Feb 2016
    Posts
    1

    Re: [VB6] Faster Split & Join (development)

    Hello,
    First of all thanks for the Function. Except for one case it works flawlessly.
    In Excel VBA when I call the following function (Tester). Excel just crashes.
    It crashes when the function CountCchrinString sholud finish.

    It only Happens with empty Expressions.

    Code:
    Sub tester()
    
    Debug.Print CountChrInString("", ".")
    
    End Sub
    
    
    Public Function CountChrInString(ByVal Expression As String, ByVal Character As String) As Long
    ' Returns the count of the specified character in the specified string.
    '
        Dim lngResult As Long
        Dim strParts() As String
        'If Expression = vbNullString Then Stop
        strParts = Strings.Split(Expression, Character)
    
        lngResult = UBound(strParts, 1)
    
        If (lngResult = -1) Then
            lngResult = 0
        End If
    
        CountChrInString = lngResult
    
    End Function

  36. #36
    PowerPoster Arnoutdv's Avatar
    Join Date
    Oct 2013
    Posts
    5,872

    Re: [VB6] Faster Split & Join (development)

    It can be corrected as follows:
    Code:
    Public Function CountChrInString(ByVal Expression As String, ByVal Character As String) As Long
    ' Returns the count of the specified character in the specified string.
    '
        Dim lngResult As Long
        Dim strParts() As String
        '
        If Len(Expression) > 0 Then
          strParts = Strings.Split(Expression, Character)
    
          lngResult = UBound(strParts, 1)
    
          If (lngResult = -1) Then
              lngResult = 0
          End If
    
          CountChrInString = lngResult
       End If
    End Function
    But I would not use a Split to count the number of characters in a string.
    Code:
    Public Function CharCount(sString As String, sChar As String) As Long
      Dim lPos As Long
      
      If Len(sString) = 0 Then Exit Function
      
      lPos = 0
      '  CharCount= 1
      Do
        lPos = InStr(lPos + 1, sString, sChar , vbBinaryCompare)
        If lPos > 0 Then CharCount= CharCount+ 1
      Loop Until lPos = 0
      
    End Function
    Last edited by Arnoutdv; Feb 24th, 2016 at 10:27 AM. Reason: The CharCount = 1 should not be there

  37. #37
    PowerPoster
    Join Date
    Jun 2015
    Posts
    2,224

    Re: [VB6] Faster Split & Join (development)

    Quote Originally Posted by Arnoutdv View Post
    But I would not use a Split to count the number of characters in a string.
    Code:
    Public Function CharCount(sString As String, sChar As String) As Long
      Dim lPos As Long
      
      If Len(sString) = 0 Then Exit Function
      
      lPos = 0
      CharCount= 1
      Do
        lPos = InStr(lPos + 1, sString, sChar , vbBinaryCompare)
        If lPos > 0 Then CharCount= CharCount+ 1
      Loop Until lPos = 0
      
    End Function
    your air code has an off by 1 bug.

  38. #38
    PowerPoster Arnoutdv's Avatar
    Join Date
    Oct 2013
    Posts
    5,872

    Re: [VB6] Faster Split & Join (development)

    A sorry, that's what you get when you Copy'&'Paste some code, and rename the function.
    It's was meant to count the number of elements separated by a certain character.

    Indeed the correct version should not start with CharCount = 1

  39. #39
    New Member
    Join Date
    Sep 2019
    Posts
    5

    Re: [VB6] Faster Split & Join (development)

    Sorry for resurrecting an old link. However, I am trying to load a 10 MB text file with 190,000 lines and each line has fields separated by a comma. So, text file looks like below:
    Code:
    124, 456, 789, 147, 875, 547, 667
    421, 576, 884, 189, 885, 475, 843
    ...190,000 lines
    I load using ObjectFSO like below and read the complete file in one go
    Code:
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objTF = objFSO.OpenTextFile("D:\Data\Trac\Dny-Number.txt", 1)
    
    strIn = objTF.readall
    X() = Split(strIn, vbNewLine)
    'X() = Strings.Split(strIn, vbNewLine)
    Using VBA inbuilt Split function:
    Code takes 0.69 seconds to read the complete file in variable strIn.
    Code takes 0.94 seconds to load an array X() with 190,000 rows in 0.94 seconds.
    So, time taken to run Split function 190,000 times is 0.94-0.69 = 0.25 seconds.

    After loading the .bas file Strings, I redo the same setup with Strings.Split function given in this thread. I get following numbers:
    Code takes 0.70 seconds to read the complete file in variable strIn.
    Code takes 1.15 seconds to load an array X() with 190,000 rows in 0.94 seconds.
    So, time taken to run Split function 190,000 times is 0.94-0.69 = 0.45 seconds.

    Why is the Split function developed in this thread slower compared to VBA inbuilt Split function? Am I missing something?

  40. #40
    New Member
    Join Date
    Sep 2019
    Posts
    5

    Re: [VB6] Faster Split & Join (development)

    Quote Originally Posted by raj_vba View Post
    After loading the .bas file Strings, I redo the same setup with Strings.Split function given in this thread. I get following numbers:
    Code takes 0.70 seconds to read the complete file in variable strIn.
    Code takes 1.15 seconds to load an array X() with 190,000 rows.
    So, time taken to run Split function 190,000 times is 1.15-0.70 = 0.45 seconds.

    Why is the Split function developed in this thread slower compared to VBA inbuilt Split function? Am I missing something?
    Sorry, please read above post with above correction. I am new here and couldn't find a way to edit my original post.

Page 1 of 2 12 LastLast

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width