Results 1 to 23 of 23

Thread: [RESOLVED] Why is my Multi-Column Collection extremely slow?

  1. #1

    Thread Starter
    Frenzied Member
    Join Date
    Aug 2020
    Posts
    1,670

    Resolved [RESOLVED] Why is my Multi-Column Collection extremely slow?

    I need an ultra-high-performance Multi-Column Collection that takes up as little memory as possible. I wrote a couple of array classes and combined them into a Multi-Column Collection, but it was extremely slow. I don't know what the reason is.
    Attached Files Attached Files

  2. #2
    Lively Member
    Join Date
    Jul 2017
    Posts
    107

    Re: Why is my Multi-Column Collection extremely slow?

    Your remove function scales poorly, at worst O(n).

    Have a read through these, particularly the first one.
    https://www.vbforums.com/showthread....llection-class
    https://www.vbforums.com/showthread....VB6-Hash-table

  3. #3
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    5,010

    Re: Why is my Multi-Column Collection extremely slow?

    Quote Originally Posted by bahbahbah View Post
    Your remove function scales poorly, at worst O(n).
    Which "Remove"-Function? The one of the "SuperCollection" or the one for the item?

    The Remove of the Collection just runs "horizontally" through the "columns".
    So, if he has 10 Columns (which means 10 Arrays), he calls 10 Times the CopyMemory-API within the Remove-Function of the Array.

    Instead of an Array for each column, i might have looked at an UDT and/or separate class, representing the "Entry/Item" in the Collection, but having itself the "number" of "columns".
    Nutshell: Such an item would represent a whole "row".
    Removing such an item from the Collection would indeed be one single call to "CopyMemory".

    OP wants to have different Datatypes for the Columns (currently at String and Long). In such an Item-Class as described above, i'd go with Variant and be done with it.
    In the "Collection" i might keep a private Array holding the the DataType Description for each column (which would be Variants), since you need the "real" Datatype only for Calculations and/or displaying, but not for storing
    Last edited by Zvoni; Mar 27th, 2025 at 02:42 AM.
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  4. #4

    Thread Starter
    Frenzied Member
    Join Date
    Aug 2020
    Posts
    1,670

    Re: Why is my Multi-Column Collection extremely slow?

    Thank you, bahbahbah.

    Quote Originally Posted by bahbahbah View Post
    Your remove function scales poorly, at worst O(n).
    Yes, the remove method is the biggest cause of slow performance. Because I'm using arrays instead of hash-table and trees. There doesn't seem to be an effective way for arrays to solve the problem of efficiency of inserts and deletions in the case of large amounts of data. This is why RC6.ArrayList is equally inefficient when performing Remove operations.

    vamvam provides a great Multi-Column-Collection, but unfortunately it's implemented in C++ and I can't apply it to my project.

    Quote Originally Posted by bahbahbah View Post
    Yes, I took a closer look at these two threads, which took me back to the good old days of vbForums, when people were talking about technical issues in high spirits, even if there were occasional arguments, but they didn't affect the technical discussions.

    The information provided by the above two threads doesn't solve my problem though. If I can translate vamvam's C code into VB6 code might solve my problem, but I don't have the time and energy to do this right now.
    Last edited by SearchingDataOnly; Mar 27th, 2025 at 10:25 AM.

  5. #5

    Thread Starter
    Frenzied Member
    Join Date
    Aug 2020
    Posts
    1,670

    Re: Why is my Multi-Column Collection extremely slow?

    Quote Originally Posted by Zvoni View Post
    Which "Remove"-Function? The one of the "SuperCollection" or the one for the item?

    The Remove of the Collection just runs "horizontally" through the "columns".
    So, if he has 10 Columns (which means 10 Arrays), he calls 10 Times the CopyMemory-API within the Remove-Function of the Array.

    Instead of an Array for each column, i might have looked at an UDT and/or separate class, representing the "Entry/Item" in the Collection, but having itself the "number" of "columns".
    Nutshell: Such an item would represent a whole "row".
    Removing such an item from the Collection would indeed be one single call to "CopyMemory".

    OP wants to have different Datatypes for the Columns (currently at String and Long). In such an Item-Class as described above, i'd go with Variant and be done with it.
    In the "Collection" i might keep a private Array holding the the DataType Description for each column (which would be Variants), since you need the "real" Datatype only for Calculations and/or displaying, but not for storing
    Thank you, Zvoni.

    If we can dynamically set the members of the UDT, then using the UDT is a great solution.
    For example:

    Code:
    With MyCollection
        .Cols = 4
        .ColType(0) = vbLong
        .ColType(1) = vbString
        .ColType(2) = vbString
        .ColType(3) = vbString
    End With
    So, how do we dynamically generate the following UDT based on the above example?

    Code:
    Type MyUDT
        Field1 As Long
        Field2 As String
        Field3 As String
        Field4 As String
    End Type
    Last edited by SearchingDataOnly; Mar 27th, 2025 at 09:45 PM.

  6. #6
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    5,010

    Re: Why is my Multi-Column Collection extremely slow?

    Quote Originally Posted by SearchingDataOnly View Post
    Thank you, Zvoni.

    If we can dynamically set the members of the UDT, then using the UDT is a great solution.
    For example:

    Code:
    With MyCollection
        .Cols = 4
        .ColType(0) = vbLong
        .ColType(1) = vbString
        .ColType(2) = vbString
        .ColType(3) = vbString
    End With
    So, how do we dynamically generate the following UDT based on the above example?

    Code:
    Type MyUDT
        Field1 As Long
        Field2 As String
        Field3 As String
        Field4 As String
    End Type
    None they i know of (maybe with Class-Factory?).

    My idea was mor along the lines of
    Code:
    Type MyUDT
       Fields() As Variant
    End Type
    The Variant itself will know if it's a Long or a String.

    Aircode

    In Collection-Class
    Code:
    Private DataTypesCols() As VBVarType
    Private myArray() AS MyUDT
    ...
    Public Sub Create(ByVal ColNo As Long, Optional SizeOfArray As Long=-1)
    Dim sa As Long
    Dim i as long
      ReDim DataTypeCols(0 To ColNo-1)
      If SizeOfArray>-1 Then sa=SizeOfArray Else sa=InitialCapacity
      ReDim myArray(0 To sa-1)
      For i=LBound(myArray) To UBound(myArray)
        ReDim myArray(i).Fields(0 To ColNo-1)
      Next
      'Set DataTypeCols
      'Fill MyArray
    End Sub
    'Blablabla
    The UDT represents a complete row, so adding or deleting involves only one call to CopyMemory
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  7. #7
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,438

    Re: Why is my Multi-Column Collection extremely slow?

    Quote Originally Posted by SearchingDataOnly View Post
    I need an ultra-high-performance Multi-Column Collection that takes up as little memory as possible.
    If you use an SQLite InMemory-(DB)-Table - in conjunction with the RC6.cCursor-Class -
    (wrapped behind some convenience methods)
    then you'll get high performance and low memory without larger efforts.

    Olaf

  8. #8
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    5,010

    Re: Why is my Multi-Column Collection extremely slow?

    Quote Originally Posted by Schmidt View Post
    If you use an SQLite InMemory-(DB)-Table - in conjunction with the RC6.cCursor-Class -
    (wrapped behind some convenience methods)
    then you'll get high performance and low memory without larger efforts.

    Olaf
    Yep. That would also be my approach (at least in FreePascal).
    No idea if OP uses RC6.
    Would also simplify stuff like (multi-column) sorting, selecting "sub-sections" of the Dataset (a.k.a filtering, searching for specific Datasets etc.), "Duplicates yes/no"
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  9. #9

    Thread Starter
    Frenzied Member
    Join Date
    Aug 2020
    Posts
    1,670

    Re: Why is my Multi-Column Collection extremely slow?

    Quote Originally Posted by Zvoni View Post
    None they i know of (maybe with Class-Factory?).

    My idea was mor along the lines of
    Code:
    Type MyUDT
       Fields() As Variant
    End Type
    The Variant itself will know if it's a Long or a String.

    Aircode

    In Collection-Class
    Code:
    Private DataTypesCols() As VBVarType
    Private myArray() AS MyUDT
    ...
    Public Sub Create(ByVal ColNo As Long, Optional SizeOfArray As Long=-1)
    Dim sa As Long
    Dim i as long
      ReDim DataTypeCols(0 To ColNo-1)
      If SizeOfArray>-1 Then sa=SizeOfArray Else sa=InitialCapacity
      ReDim myArray(0 To sa-1)
      For i=LBound(myArray) To UBound(myArray)
        ReDim myArray(i).Fields(0 To ColNo-1)
      Next
      'Set DataTypeCols
      'Fill MyArray
    End Sub
    'Blablabla
    The UDT represents a complete row, so adding or deleting involves only one call to CopyMemory
    I'll take a closer look at your options. Thank you, Zvoni.

  10. #10

    Thread Starter
    Frenzied Member
    Join Date
    Aug 2020
    Posts
    1,670

    Re: Why is my Multi-Column Collection extremely slow?

    Quote Originally Posted by Schmidt View Post
    If you use an SQLite InMemory-(DB)-Table - in conjunction with the RC6.cCursor-Class -
    (wrapped behind some convenience methods)
    then you'll get high performance and low memory without larger efforts.

    Olaf
    Thank you for your guidance, Olaf.

    I don't know how to add and delete records with RC6.cCursor yet. I used RC6.cCommand and RC6.cRecordSet to simulate a Muti-Column-Collection. Here are the test results.

    Code:
    ====================================
    Test RC6-cMcCollection
    Add million items at the end:  5,613.53msec
    Delete 500000 items from the middle:  551.07msec
    Update values:  2,482.29msec
    
    ====================================
    Test VMMC
    Add million items at the end:  3,382.08msec
    Delete 500000 items from the middle:  315.41msec
    Update values:  744.66msec
    Attached Files Attached Files

  11. #11
    PowerPoster Elroy's Avatar
    Join Date
    Jun 2014
    Location
    Near Nashville TN
    Posts
    10,795

    Re: Why is my Multi-Column Collection extremely slow?

    https://www.vbforums.com/showthread.php?903908-Homegrown-Self-Balancing-Binary-Search-Tree-(BST)\

    That is basically a "Collection" that can be specifically customized to your needs, rather than "generalized" (with Variants) like the VB6 Collection.
    Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.

  12. #12

    Thread Starter
    Frenzied Member
    Join Date
    Aug 2020
    Posts
    1,670

    Re: Why is my Multi-Column Collection extremely slow?

    Quote Originally Posted by Elroy View Post
    https://www.vbforums.com/showthread.php?903908-Homegrown-Self-Balancing-Binary-Search-Tree-(BST)\

    That is basically a "Collection" that can be specifically customized to your needs, rather than "generalized" (with Variants) like the VB6 Collection.
    Hi Elroy,

    Thank you for the information. I tested your BST but I don't seem to know how to write a Multi-Column-Collection on top of your BST yet. I've combined your BST_LngKey_NoVal and BST_StrKey_NoVal into a Multi-Column-Collection, but there seems to be something wrong with the performance.

    Here are the test results:

    Code:
    ===================================
    Test cSuperCollection-BST
    ===================================
    Add million items at the end: 72.344 seconds
    Attached Files Attached Files

  13. #13
    PowerPoster Elroy's Avatar
    Join Date
    Jun 2014
    Location
    Near Nashville TN
    Posts
    10,795

    Re: Why is my Multi-Column Collection extremely slow?

    Hello SDO,

    I guess I don't know what you mean by "multi-column collection". If it were me, I'd probabily just create multiple (perfectly matched) collections if I needed "multiple columns". My BST routines could be modified for a specific UDT (which could represent columns), but I'm not sure I'm willing to do that work for some custom-UDT, when just using "paired-collections" will get it done.
    Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.

  14. #14

    Thread Starter
    Frenzied Member
    Join Date
    Aug 2020
    Posts
    1,670

    Re: Why is my Multi-Column Collection extremely slow?

    Quote Originally Posted by Elroy View Post
    Hello SDO,

    I guess I don't know what you mean by "multi-column collection". If it were me, I'd probabily just create multiple (perfectly matched) collections if I needed "multiple columns". My BST routines could be modified for a specific UDT (which could represent columns), but I'm not sure I'm willing to do that work for some custom-UDT, when just using "paired-collections" will get it done.
    It seems difficult to simulate the high performance of vamvam's Multi-Column-Collection with VB.Collection, but not only that, it's also extremely easy to use and has a low memory footprint.

  15. #15
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Location
    Sofia, Bulgaria
    Posts
    5,848

    Re: Why is my Multi-Column Collection extremely slow?

    Btw, I usually simulate "Multi-Column-Collections" with a simple array of UDTs. Works fine for me *and* is strongly typed i.e. no need c.ColType(0) = vmeInteger and c.ColType(1) = vmeString.

    Now, for deleting an element in the *middle* this probably sucks but then there are ways to optimize it by marking deleted records, keeping an index and what not, which should be attempted if deleting in the *middle* is a core requirement for the app in question (usually not).

    cheers,
    </wqw>

  16. #16
    PowerPoster Elroy's Avatar
    Join Date
    Jun 2014
    Location
    Near Nashville TN
    Posts
    10,795

    Re: Why is my Multi-Column Collection extremely slow?

    Quote Originally Posted by wqweto View Post
    Btw, I usually simulate "Multi-Column-Collections" with a simple array of UDTs. Works fine for me *and* is strongly typed i.e. no need c.ColType(0) = vmeInteger and c.ColType(1) = vmeString.

    Now, for deleting an element in the *middle* this probably sucks but then there are ways to optimize it by marking deleted records, keeping an index and what not, which should be attempted if deleting in the *middle* is a core requirement for the app in question (usually not).

    cheers,
    </wqw>
    Searching an array for your key, or keeping it sorted, can also be tricky. That's what makes collection/dictionary-type things so useful.
    Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.

  17. #17
    PowerPoster Elroy's Avatar
    Join Date
    Jun 2014
    Location
    Near Nashville TN
    Posts
    10,795

    Re: Why is my Multi-Column Collection extremely slow?

    Also, SDO, it was mentioned above but a disconnected ADO recordset is another excellent alternative. And it's inherently multi-column, and also easily searchable.

    Here's a codebank link where I put together code to make these things easy to create-and-use.

    I haven't done comparative speed tests on them, but they should be ok.



    Also, you mentioned "millions of items" above. If you're getting into that volume of possible records, that's starting to sound more like a disk-based recordset in some database.
    Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.

  18. #18

    Thread Starter
    Frenzied Member
    Join Date
    Aug 2020
    Posts
    1,670

    Re: Why is my Multi-Column Collection extremely slow?

    Quote Originally Posted by wqweto View Post
    Btw, I usually simulate "Multi-Column-Collections" with a simple array of UDTs. Works fine for me *and* is strongly typed i.e. no need c.ColType(0) = vmeInteger and c.ColType(1) = vmeString.
    Yes, using UDT arrays is the fastest when performing Add and Update operations, but the code is a bit verbose.. Here's the code for my MyMultiColumnList (it looks a little cleaner and more concise):

    Code:
    Dim oList As New MyMultiColumnList
    oList.Init vbInteger, vbString, vbString, vbString
    
    With oExternLib
        For i = 0 To oExternLib.Count -1
            oList.Add .Id(i), .Caption(i), .Tag(i), .Description(i)
        Next i
    End With
    Also, as a class, MyMultiColumnList is more convenient to pass between different modules and different functions. The only drawback is that the deletion operation performance is extremely low in large data volumes.

    Quote Originally Posted by wqweto View Post
    Now, for deleting an element in the *middle* this probably sucks but then there are ways to optimize it by marking deleted records, keeping an index and what not, which should be attempted if deleting in the *middle* is a core requirement for the app in question (usually not).
    It's a good idea to set a deleted flag. Thank you, wqweto.
    Last edited by SearchingDataOnly; Mar 30th, 2025 at 10:26 PM.

  19. #19

    Thread Starter
    Frenzied Member
    Join Date
    Aug 2020
    Posts
    1,670

    Re: Why is my Multi-Column Collection extremely slow?

    Quote Originally Posted by Elroy View Post
    Also, SDO, it was mentioned above but a disconnected ADO recordset is another excellent alternative. And it's inherently multi-column, and also easily searchable.

    Here's a codebank link where I put together code to make these things easy to create-and-use.
    Thank you, Elroy. I've simulated a MultiColumnCollection with RC6.MemDB.

    but its performance is still a bit short of vamvam's Multi-Column-Collection.

    Here are the test results.

    Code:
    ====================================
    Test RC6-cMcCollection
    Add million items at the end:  5,613.53msec
    Delete 500000 items from the middle:  551.07msec
    Update values:  2,482.29msec
    
    ====================================
    Test VMMC
    Add million items at the end:  3,382.08msec
    Delete 500000 items from the middle:  315.41msec
    Update values:  744.66msec
    Quote Originally Posted by Elroy View Post
    I haven't done comparative speed tests on them, but they should be ok.
    IMO, the performance of database-based recordsets cannot be compared to arrays, lists, and trees.

  20. #20

    Thread Starter
    Frenzied Member
    Join Date
    Aug 2020
    Posts
    1,670

    Re: Why is my Multi-Column Collection extremely slow?

    Quote Originally Posted by Zvoni View Post
    None they i know of (maybe with Class-Factory?).

    My idea was mor along the lines of
    Code:
    Type MyUDT
       Fields() As Variant
    End Type
    The Variant itself will know if it's a Long or a String.

    Aircode

    In Collection-Class
    Code:
    Private DataTypesCols() As VBVarType
    Private myArray() AS MyUDT
    ...
    Public Sub Create(ByVal ColNo As Long, Optional SizeOfArray As Long=-1)
    Dim sa As Long
    Dim i as long
      ReDim DataTypeCols(0 To ColNo-1)
      If SizeOfArray>-1 Then sa=SizeOfArray Else sa=InitialCapacity
      ReDim myArray(0 To sa-1)
      For i=LBound(myArray) To UBound(myArray)
        ReDim myArray(i).Fields(0 To ColNo-1)
      Next
      'Set DataTypeCols
      'Fill MyArray
    End Sub
    'Blablabla
    The UDT represents a complete row, so adding or deleting involves only one call to CopyMemory
    I tested your method and it has the following problems:

    (1) It takes up too much memory
    (2) The performance is not excellent
    (3) An error occurred when using CopyMemory to manipulate Variant-Array(Fields array).
    Attached Files Attached Files

  21. #21

    Thread Starter
    Frenzied Member
    Join Date
    Aug 2020
    Posts
    1,670

    Re: Why is my Multi-Column Collection extremely slow?

    Someone sent me a private message with an AI-generated piece of code. I tested it carefully and it was extremely fast and it won the championship in terms of performance. Thank you very very much, good Samaritan.

    Here are the test results.

    Code:
    ====================================
    Test AI-MultiColumnList
    Add million items at the end: 3.031 seconds
    Delete 500000 items from the middle: 0.063 seconds
    Update values: 0.828 seconds
    Attached Files Attached Files
    Last edited by SearchingDataOnly; Apr 1st, 2025 at 04:18 AM.

  22. #22
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Location
    Sofia, Bulgaria
    Posts
    5,848

    Re: [RESOLVED] Why is my Multi-Column Collection extremely slow?

    This code snippet is unnecessary

    Code:
                Case vbObject
                    ' For objects, we need to carefully preserve existing references
                    Dim TempObjects() As Object
                    ReDim TempObjects(NewSize)
     
                    ' Copy existing object references
                    For j = 0 To UBound(Columns(i).ObjectValues)
                        Set TempObjects(j) = Columns(i).ObjectValues(j)
                    Next j
     
                    ' Initialize new object references to Nothing
                    For j = UBound(Columns(i).ObjectValues) + 1 To NewSize
                        Set TempObjects(j) = Nothing
                    Next j
     
                    ' Replace the array
                    Columns(i).ObjectValues = TempObjects
                    Erase TempObjects
    cheers,
    </wqw>

  23. #23

    Thread Starter
    Frenzied Member
    Join Date
    Aug 2020
    Posts
    1,670

    Re: [RESOLVED] Why is my Multi-Column Collection extremely slow?

    Good advice.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width