Results 1 to 12 of 12

Thread: finding a replacement to the Collection class

  1. #1

    Thread Starter
    Junior Member
    Join Date
    Jul 2008
    Posts
    21

    finding a replacement to the Collection class

    Hello,

    My application uses the standard VB6 Collection class heavily.
    There are about 10 collections containing typically 50 elements for some and 500 elements for others.
    I think that at least 50% of the CPU time is spent accessing these collections.
    I would be interrested to speed up the application, and I would like to consider the use of some replacement for the standard Collection class.

    Where could I find some replacement?
    What would be the advantages of the replacements?
    Is there some hope to speed up the use of Collections? (or should I better try optimize the other 50% CPU activity)

  2. #2
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: finding a replacement to the Collection class

    You could use arrays of the specific datatype, which is the fastest way to go. Of course you lack the convenience of keys, so if you must use them, you'd have to keep indexes of the keys yourself.

    You could also make an array of your own type:
    Code:
    Public Type PersonInformation
        Name As String
        DOB As Date
        DOD As Date
    End Type
    
    Dim People() As PersonInformation
    Accessing array of a type is only a little bit slower than array of a specific datatype.


    There is also a Dictionary object which is better than Collections, but I don't know about it's speed and have never used it as I think it had an extra dependancy.

  3. #3
    Cumbrian Milk's Avatar
    Join Date
    Jan 2007
    Location
    0xDEADBEEF
    Posts
    2,448

    Re: finding a replacement to the Collection class

    You could write your own collection class.

    Given a collection is just a managed array of variants with the option to search by key you could presumably get a performance boost by creating a managed array of the datatype you wish to hold.

    Consider two arrays, one holding the collection data and one holding the string keys.

    I imagine the really slow part of a collection is where you access an item by it's key rather than it's index. When comparing strings of different lengths you can get a marked performance boost by first comparing the LenB (faster than Len) of the string and then the string itself (if the length matches.)

    You could get a massive performance boost if you stored the keys alphabetically, then you could search by divide and conquer. 10000 keys would only require 13 comparisons. This would not be so easy to set up, you would need two more arrays to link the indexes of the items to the keys and back, and adding removing items would be slow. Accessing items would be like lightning.

    Edit: rather a slow post be me there, started posting before you merri

  4. #4

    Thread Starter
    Junior Member
    Join Date
    Jul 2008
    Posts
    21

    Re: finding a replacement to the Collection class

    I am using Collection objects mainly to convert keys to indexes.
    All the collections I use (exept one) are a mean to get an index in an array based on the name of something.
    (or a double index sometimes)

    Could that help finding an optimization?

  5. #5
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: finding a replacement to the Collection class

    I would just keep them as sorted arrays.

    To sort an initially unsorted list, use quicksort or shellsort. Once sorted, if you add a single item and want to maintain the order, use insertion sort or gnome sort.

    Then a simple binary search as Milk describes would make lookups very fast.

  6. #6
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: finding a replacement to the Collection class

    Here we go, a pretty fast key collection class for cross referencing keys, item values and collection indexes:
    Code:
    ' KeyCollection.cls
    Option Explicit
    
    Private Type IndexInformation
        Pos As Long
        Value As Long
    End Type
    
    Private m_Index() As IndexInformation
    Private m_Keys As String
    
    Public Function Add(ByVal Item As Long, ByRef Key As String) As Long
        ' must have key
        If LenB(Key) Then
            ' must not exist
            If InStr(m_Keys, vbNullChar & Key & vbNullChar) = 0 Then
                ' yay, add it!
                Add = UBound(m_Index) + 1
                ' space for it
                ReDim Preserve m_Index(Add)
                ' remember key position
                With m_Index(Add)
                    .Pos = Len(m_Keys) + 1
                    .Value = Item
                End With
                ' add to our key string
                m_Keys = m_Keys & (Key & vbNullChar)
                Exit Function
            Else
                ' uncomment if you want an error if the item exists
                'Err.Raise 5, "KeyCollection.Add", "Key already exists."
            End If
        Else
            ' uncomment if you want an error if the key is invalid
            'Err.Raise 5, "KeyCollection.Add", "Invalid key."
        End If
    End Function
    Public Function Count() As Long
        ' this is straightforward
        Count = UBound(m_Index)
    End Function
    Public Function Item(ByRef Key As String) As Long
        Dim lngA As Long, lngPos As Long
        ' must have key
        If LenB(Key) Then
            ' find position
            lngPos = InStr(m_Keys, vbNullChar & Key & vbNullChar) + 1
            ' do we have it?
            If lngPos > 1 Then
                ' now we just have to find which item it is
                For lngA = 1 To UBound(m_Index)
                    With m_Index(lngA)
                        ' it is more likely to not have a match: True condition is faster
                        If .Pos <> lngPos Then
                            ' not found
                        Else
                            ' found!
                            Item = .Value
                            Exit Function
                        End If
                    End With
                Next lngA
                ' uncomment if you want to know about this critical error
                'Err.Raise 5, "KeyCollection.Item", "Key not found yet key exists!"
            Else
                ' uncomment if you want an error if the key doesn't exist
                'Err.Raise 5, "KeyCollection.Item", "Key does not exist."
            End If
        Else
            ' uncomment if you want an error if the key is invalid
            'Err.Raise 5, "KeyCollection.Item", "Invalid key."
        End If
    End Function
    Public Function Key(ByVal Item As Long) As String
        Dim lngA As Long, lngPos As Long
        For lngA = 1 To UBound(m_Index)
            With m_Index(lngA)
                If .Value <> Item Then
                    ' not found
                Else
                    ' get the key for this index
                    lngPos = InStr(.Pos, m_Keys, vbNullChar)
                    If lngPos > 1 Then
                        ' return it
                        Key = Mid$(m_Keys, .Pos, lngPos - .Pos)
                        Exit Function
                    Else
                        ' uncomment if you want this critical error
                        'Err.Raise 5, "KeyCollection.Key", "Key string not found!"
                    End If
                End If
            End With
        Next lngA
        ' uncomment if you want an error if the key doesn't exist
        'Err.Raise 5, "KeyCollection.Key", "Item value not found."
    End Function
    Public Function KeyByIndex(ByVal Index As Long) As String
        Dim lngPos As Long
        ' validate range
        If Index >= 1 And Index <= UBound(m_Index) Then
            With m_Index(Index)
                ' get the key for this index
                lngPos = InStr(.Pos, m_Keys, vbNullChar)
                If lngPos > 1 Then
                    ' return it
                    KeyByIndex = Mid$(m_Keys, .Pos, lngPos - .Pos)
                    Exit Function
                Else
                    ' uncomment if you want this critical error
                    'Err.Raise 5, "KeyCollection.KeyByIndex", "Key string not found!"
                End If
            End With
        Else
            ' uncomment if you want an error if the index is invalid
            'Err.Raise 5, "KeyCollection.KeyByIndex", "Invalid."
        End If
    End Function
    Public Function Remove(ByRef Key As String) As Boolean
        Dim lngA As Long, lngB As Long, lngPos As Long, lngLen As Long
        ' must have key
        If LenB(Key) Then
            ' find position
            lngPos = InStr(m_Keys, vbNullChar & Key & vbNullChar) + 1
            ' do we have it?
            If lngPos > 1 Then
                ' now we just have to find which item it is
                For lngA = 1 To UBound(m_Index)
                    ' it is more likely to not have a match: True condition is faster
                    If m_Index(lngA).Pos <> lngPos Then
                        ' not found
                    Else
                        ' found, start removal
                        lngLen = Len(Key) + 1
                        ' copy information
                        For lngB = lngA To UBound(m_Index) - 1
                            m_Index(lngB).Pos = m_Index(lngB + 1).Pos - lngLen
                            m_Index(lngB).Value = m_Index(lngB + 1).Value
                        Next lngB
                        ' free memory
                        ReDim Preserve m_Index(lngB - 1)
                        ' now, remove the key
                        m_Keys = Left$(m_Keys, lngPos - 1) & Mid$(m_Keys, lngPos + lngLen)
                        ' success!
                        Remove = True
                        Exit Function
                    End If
                Next lngA
                ' uncomment if you want to know about this critical error
                'Err.Raise 5, "KeyCollection.Item", "Key not found yet key exists!"
            Else
                ' uncomment if you want an error if the key doesn't exist
                'Err.Raise 5, "KeyCollection.Item", "Key does not exist."
            End If
        Else
            ' uncomment if you want an error if the key is invalid
            'Err.Raise 5, "KeyCollection.Item", "Invalid key."
        End If
    End Function
    Private Sub Class_Initialize()
        m_Keys = vbNullChar
        ReDim m_Index(0)
    End Sub
    Private Sub Class_Terminate()
        m_Keys = vbNullString
        Erase m_Index
    End Sub
    Sample:
    Code:
    Option Explicit
    
    Private Sub Form_Load()
        Dim Keys As New KeyCollection
        Keys.Add 4, "A"
        Keys.Add 3, "B"
        Keys.Add 2, "C"
        Keys.Add 1, "D"
        ' make key "D" invalid, index 4 invalid and item 1 invalid
        Keys.Remove "D"
        ' by index "ABC" (note: I have set KeyByIndex as (Default) in Tools > Procedure Attributes)
        MsgBox Keys(1) & Keys(2) & Keys(3) & Keys(4)
        ' by key string "4320"
        MsgBox Keys.Item("A") & Keys.Item("B") & Keys.Item("C") & Keys.Item("D")
        ' by item "CBA"
        MsgBox Keys.Key(1) & Keys.Key(2) & Keys.Key(3) & Keys.Key(4)
    End Sub
    Class also attached. Tell me if you find any problems in it Uncomment error lines if you want Collection style errors.
    Attached Files Attached Files
    Last edited by Merri; Jul 12th, 2008 at 10:06 AM.

  7. #7
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: finding a replacement to the Collection class

    Because the post got too long, here we have some more information:
    • Add returns index if it succeeds or 0 if it fails.
    • Count returns the number of items. It never fails.
    • Item returns item value by given key or 0 if it fails.
    • Key returns key by given item value or null string if it fails.
    • KeyByIndex returns key by given index or nullstring if it fails.
    • Remove returns True if it succeeded to remove item by given key or False if it fails.


    If you need more features, it shouldn't be too hard to add them in

    The performance will drop for adding new items when the m_Keys string gets past about 60000 characters. This is because the string gets longer than VB's internal buffer for strings.
    You may not use vbNullChar in any key strings. For speed reasons there is no error check for this.

  8. #8
    PowerPoster dilettante's Avatar
    Join Date
    Feb 2006
    Posts
    24,487

    Re: finding a replacement to the Collection class

    The best way to proceed is to profile your code which will help you discover where your performance problems actualy are. There are many good code profilers around, and some of them come in trial versions that work on small projects.


    Collections might well be your performance bottleneck, especially if you are using them as you suggest above:
    I am using Collection objects mainly to convert keys to indexes.
    All the collections I use (exept one) are a mean to get an index in an array based on the name of something.
    (or a double index sometimes)
    A Collection isn't intended as a hash index to an external data structure, it is a data structure that contains its own hash index. In other words the object of a search or keyed access is what you are supposed to store in each Collection item, not something like a number to be used to index something else.

    I'd think this extra indirection may be what is eating your performance.


    Without more specific knowledge of what you are trying to accomplish it is very hard to suggest a specific remedy. It feels as if you are trying to model something like a Collection with multi-values items by using a 2D array and indexing it through a set of single-valued Collections.

    There are two standard approaches for doing this. One is to use a single Collection where each item value is a Variant array. This has the limitation of offering just one automatic hash index however.

    A more flexible alternative is to use a fabricated ADO 2.5+ Recordset. Here you can add an automatic hash index for any Field simply by setting that Field's "Optimize" ADO dynamic property to True.


    In either case the result will typically provide plenty of performance. This performance will often outstrip hand crafted solutions using array sorting and binary searches, which have the disadvantages of being bug-prone and clumsy to maintain as well. This doesn't mean you can't do better with hand coding, but the cost/benefit almost always weighs against it - especially when changes are required.

    An ADO solution also offers a host of additional features such as serialization and persistence: you can save and load the data trivially.


    The hard part of this is retrofitting it into an existing program that has been written against the "world is flat (arrays)" programming model. However by learning how to use the ADO object model as an internal datastore you can apply the approach in future programs and see an increase in your development productivity.


    I'd start with code profiling though. There is no point in guessing where you need to optimize. Have you looked at anything like VB Watch v2 before?

  9. #9
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: finding a replacement to the Collection class

    Quote Originally Posted by dilettante
    An ADO solution also offers a host of additional features such as serialization and persistence: you can save and load the data trivially.
    Arrays are also trivially easy to save and load.

  10. #10
    PowerPoster dilettante's Avatar
    Join Date
    Feb 2006
    Posts
    24,487

    Re: finding a replacement to the Collection class

    Not as easy as:
    Code:
    rs.Save strFilename
    Plus you lose all of your metadata.

  11. #11
    Member
    Join Date
    Jul 2008
    Posts
    36

    Re: finding a replacement to the Collection class

    Easy enough...you can save/load a UDT array directly from disk, even if it has arrays of its own, and it's very fast...and doesn't add a dependency...

    Not sure what you mean about "metadata"

  12. #12
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: finding a replacement to the Collection class

    Quote Originally Posted by guitar
    Easy enough...you can save/load a UDT array directly from disk, even if it has arrays of its own, and it's very fast...and doesn't add a dependency...

    Not sure what you mean about "metadata"
    He may not know that you can save an entire udt with sub-arrays with a single line of code. I agree that it would be worth a pointless dependency to avoid having to read/write each individual record -- or even worse, each indivdual variable for each individual record -- in some huge project-specific code.

    Luckily, like most most programming languages VB6 assumes you'll want to save/load udts to/from the harddrive, so it lets you do it as a one-liner.

    rs.Save strFilename

    vs

    Put #1, 1, MyUDTWithSubArrays

    Doesn't seem worth a dependency to me. Especially not MDAC.

    I would also wager that just about anything you do with a udt array will be faster than treating it as a recordset for the simple fact of having way less overhead. Plus it gives you control over how you handle optimization.
    Last edited by Ellis Dee; Jul 13th, 2008 at 07:59 AM.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width