-
May 27th, 2018, 12:57 PM
#1
Thread Starter
PowerPoster
[RESOLVED] How to implement "bits-map" for massive grid cell data?
The trick told me an excellent Grid optimization method: (Thank you very much, The trick)
http://www.vbforums.com/showthread.p...ts-in-an-array
Originally Posted by The trick
dreammanor, for example when you create the grid the most of cells have no value (empty items). You can create bit-map with the bits that represent the item state (similar approach is used in TLS, (RTL_BITMAP)), when you redraw the items you can fast check if item exists. It consumes less memory - 1 byte per 8 items. Furthermore you can divide the cells to the areas (for example if a region has no data you can allocate no memory). To access data by position you can use a map or a hash table.
But I don't know how to implement "bits-map". Does anyone know how to implement "bits-map"? Much greatful.
Edit:
My understanding of "bits-map" is that a bit has 2 states(0 or 1) and a Byte can represent the state of 8 grid cells (0 means the cell is empty and 1 means the cell has data).
How to use a Byte to represent 8 grid cells?
Last edited by dreammanor; May 27th, 2018 at 07:38 PM.
-
May 27th, 2018, 02:41 PM
#2
Re: How to implement "bits-map" for massive grid cell data?
I believe what he's talking about is something like the following:
Code:
Option Explicit
Const State1 As Integer = 1
Const State2 As Integer = 2
Const State3 As Integer = 4
Const State4 As Integer = 8
Const State5 As Integer = &H10
Const State6 As Integer = &H20
Const State7 As Integer = &H40
Const State8 As Integer = &H80
Const State9 As Integer = &H100
Const State10 As Integer = &H200
Const State11 As Integer = &H400
Const State12 As Integer = &H800
Const State13 As Integer = &H1000
Const State14 As Integer = &H2000
Const State15 As Integer = &H4000
Const State16 As Integer = &H8000
Private Sub Form_Load()
Dim MyStates As Integer
MyStates = State2 Or State7 Or State13
End Sub
You just create a series of constants with one bit turned on for each constant. Then, you can use a single variable (as the same Type as the constant), and turn on or off the bits to indicate the states. Using an Integer as I did, you can track up to 16 states. Using a Byte, you could track 8. Using a Long, you could track up to 32 states.
Good Luck,
Elroy
EDIT1: And also, with only one bit turned on for each Constant, you can use "Or" or "+" to combine your states into a state variable. To test for specific states, you'd use the "And" operator.
Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.
-
May 27th, 2018, 03:34 PM
#3
Re: How to implement "bits-map" for massive grid cell data?
-
May 27th, 2018, 07:33 PM
#4
Thread Starter
PowerPoster
Re: How to implement "bits-map" for massive grid cell data?
Originally Posted by Elroy
I believe what he's talking about is something like the following:
Code:
Option Explicit
Const State1 As Integer = 1
Const State2 As Integer = 2
Const State3 As Integer = 4
Const State4 As Integer = 8
Const State5 As Integer = &H10
Const State6 As Integer = &H20
Const State7 As Integer = &H40
Const State8 As Integer = &H80
Const State9 As Integer = &H100
Const State10 As Integer = &H200
Const State11 As Integer = &H400
Const State12 As Integer = &H800
Const State13 As Integer = &H1000
Const State14 As Integer = &H2000
Const State15 As Integer = &H4000
Const State16 As Integer = &H8000
Private Sub Form_Load()
Dim MyStates As Integer
MyStates = State2 Or State7 Or State13
End Sub
You just create a series of constants with one bit turned on for each constant. Then, you can use a single variable (as the same Type as the constant), and turn on or off the bits to indicate the states. Using an Integer as I did, you can track up to 16 states. Using a Byte, you could track 8. Using a Long, you could track up to 32 states.
Good Luck,
Elroy
EDIT1: And also, with only one bit turned on for each Constant, you can use "Or" or "+" to combine your states into a state variable. To test for specific states, you'd use the "And" operator.
I only need two states, 0 and 1.
My understanding of "bits-map" is that a Byte can represent the state of 8 grid cells (0 means the cell is empty and 1 means the cell has data).
How to use a Byte to represent 8 grid cells?
Last edited by dreammanor; May 27th, 2018 at 07:40 PM.
-
May 27th, 2018, 07:35 PM
#5
Thread Starter
PowerPoster
Re: How to implement "bits-map" for massive grid cell data?
Originally Posted by dilettante
I tested your code, but I got an error "File not found: ntoskrnl.exe"
-
May 27th, 2018, 09:29 PM
#6
Re: How to implement "bits-map" for massive grid cell data?
Are you trying to run it on some downlevel OS like Windows XP?
If so, you may have to look for these entrypoints in ntdll.dll instead.
-
May 27th, 2018, 11:58 PM
#7
Re: How to implement "bits-map" for massive grid cell data?
Originally Posted by dreammanor
How to use a Byte to represent 8 grid cells?
If I was determined to be extremely efficient with keeping track of some single state for an arbitrary number of cells, I suppose I'd develop an algorithm that told me which bit I'm dealing with in a byte array. However, that seems a bit obsessive to me.
It sounds like you're already creating a UDT for each cell of this grid you're designing. If it were me, I'd just put a byte in that UDT to track the various cell states. Ok, if you only have one state you're tracking right now, then you'll have seven bits reserved for other stuff down the road. And, I'm betting that a waste of seven bits per cell is relatively small potatoes compared to what you're already planning for in this UDT.
You may also want to think about intra-UDT padding as well as inter-UDT padding when used in an array. You're probably already throwing away several bytes there that you don't even know about. You could declare some Byte variables where that padding takes place and it wouldn't cost you any additional memory at all.
Here's a pretty good web page that talks about UDTs and their padding.
Good Luck,
Elroy
Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.
-
May 28th, 2018, 01:12 AM
#8
Re: How to implement "bits-map" for massive grid cell data?
It seems you are now focusing on grids with thousands of cells which are not in a normal matrix.
Is this just a hypothetical case or are you really dealing with a lot of empty cells?
I use grids a lot (vsFlexGrid) and have often millions of cells, but they are always is a "normal" filled matrix, no really empty cells.
What you can do is having special UDTs (or classes) in which you define the formatting used for ranges of cells.
Code:
Private type tpCellRange
lRow1 As Long
lCol1 As Long
lRow2 As Long
lCol2 As Long
End Type
Private type tpCellBackColor
lBackColor As Long
tRange() As tpRange
lNofRanges As Long
End Type
Private type tpCellForeColor
lForeColor As Long
tRange() As tpRange
lNofRanges As Long
End Type
Private type tpCellFontBold
lForeColor As Long
tRange() As tpRange
lNofRanges As Long
End Type
-
May 28th, 2018, 01:40 AM
#9
Re: How to implement "bits-map" for massive grid cell data?
Usually a grid control starts w/ lots of empy cells that eventually get populated from the data source. Using bits to track full/empty status sounds like a premature optimization at this stage.
On a side-note: Noticing the stream of follow-up questions this grid control effort seems is going to be quite a journey. Wish I had the time to do it too :-))
cheers,
</wqw>
-
May 28th, 2018, 02:51 AM
#10
Re: How to implement "bits-map" for massive grid cell data?
I only need two states, 0 and 1.
My understanding of "bits-map" is that a Byte can represent the state of 8 grid cells (0 means the cell is empty and 1 means the cell has data).
How to use a Byte to represent 8 grid cells?
Try this very basic BitArray class:
Code:
Option Explicit
Private m_BitArray() As Byte
Private m_BitMask(7) As Byte
Property Get Value(x As Long, y As Long) As Boolean
Value = m_BitArray(x \ 8, y) And m_BitMask(x Mod 8)
End Property
Property Let Value(x As Long, y As Long, bValue As Boolean)
If bValue Then
m_BitArray(x \ 8, y) = (m_BitArray(x \ 8, y) Or m_BitMask(x Mod 8))
Else
m_BitArray(x \ 8, y) = (m_BitArray(x \ 8, y) And Not m_BitMask(x Mod 8))
End If
End Property
Public Function Initialize(x As Long, y As Long, Optional bPreserve As Boolean = False)
If bPreserve Then
ReDim Preserve m_BitArray(x \ 8, y)
Else
ReDim m_BitArray(x \ 8, y)
End If
End Function
Private Sub Class_Initialize()
Dim i As Integer
For i = 0 To 7
m_BitMask(i) = 2 ^ i
Next i
End Sub
-
May 28th, 2018, 08:59 AM
#11
Thread Starter
PowerPoster
Re: How to implement "bits-map" for massive grid cell data?
Originally Posted by dilettante
Are you trying to run it on some downlevel OS like Windows XP?
If so, you may have to look for these entrypoints in ntdll.dll instead.
Hi dilettante, I haven't tested it on XP yet, because my grid control will run on Win10. Is there a solution that does not require the ntoskrnl.exe or ntdll.dll?
Originally Posted by Elroy
If I was determined to be extremely efficient with keeping track of some single state for an arbitrary number of cells, I suppose I'd develop an algorithm that told me which bit I'm dealing with in a byte array. However, that seems a bit obsessive to me.
It sounds like you're already creating a UDT for each cell of this grid you're designing. If it were me, I'd just put a byte in that UDT to track the various cell states. Ok, if you only have one state you're tracking right now, then you'll have seven bits reserved for other stuff down the road. And, I'm betting that a waste of seven bits per cell is relatively small potatoes compared to what you're already planning for in this UDT.
You may also want to think about intra-UDT padding as well as inter-UDT padding when used in an array. You're probably already throwing away several bytes there that you don't even know about. You could declare some Byte variables where that padding takes place and it wouldn't cost you any additional memory at all.
Here's a pretty good web page that talks about UDTs and their padding.
Hi Elroy, thank you for the information provided. In fact, Bits-Map is not related to the data structure of the grid cell. If the bit is 0, the system does not need to initialize the cell object (that is, do not need to allocate memory for the cell). If the bit is 1, the system needs to instantiate the cell object( class or UDT).
-
May 28th, 2018, 09:01 AM
#12
Thread Starter
PowerPoster
Re: How to implement "bits-map" for massive grid cell data?
Originally Posted by Arnoutdv
It seems you are now focusing on grids with thousands of cells which are not in a normal matrix.
Is this just a hypothetical case or are you really dealing with a lot of empty cells?
I use grids a lot (vsFlexGrid) and have often millions of cells, but they are always is a "normal" filled matrix, no really empty cells.
What you can do is having special UDTs (or classes) in which you define the formatting used for ranges of cells.
Code:
Private type tpCellRange
lRow1 As Long
lCol1 As Long
lRow2 As Long
lCol2 As Long
End Type
Private type tpCellBackColor
lBackColor As Long
tRange() As tpRange
lNofRanges As Long
End Type
Private type tpCellForeColor
lForeColor As Long
tRange() As tpRange
lNofRanges As Long
End Type
Private type tpCellFontBold
lForeColor As Long
tRange() As tpRange
lNofRanges As Long
End Type
Hi Arnoutdv, thank you for your code and suggestion. In fact, the number of grid cells I have to process is a few hundred million, so I need to have a very good optimization.
Originally Posted by wqweto
Usually a grid control starts w/ lots of empy cells that eventually get populated from the data source. Using bits to track full/empty status sounds like a premature optimization at this stage.
On a side-note: Noticing the stream of follow-up questions this grid control effort seems is going to be quite a journey. Wish I had the time to do it too :-))
Hi wqweto, the reason I want to use Bits-Map is to avoid prematurely allocating memory for empty grid cells. If you are willing to develop a grid control that would be great.
-
May 28th, 2018, 09:04 AM
#13
Thread Starter
PowerPoster
Re: How to implement "bits-map" for massive grid cell data?
Originally Posted by Arnoutdv
Try this very basic BitArray class:
Code:
Option Explicit
Private m_BitArray() As Byte
Private m_BitMask(7) As Byte
Property Get Value(x As Long, y As Long) As Boolean
Value = m_BitArray(x \ 8, y) And m_BitMask(x Mod 8)
End Property
Property Let Value(x As Long, y As Long, bValue As Boolean)
If bValue Then
m_BitArray(x \ 8, y) = (m_BitArray(x \ 8, y) Or m_BitMask(x Mod 8))
Else
m_BitArray(x \ 8, y) = (m_BitArray(x \ 8, y) And Not m_BitMask(x Mod 8))
End If
End Property
Public Function Initialize(x As Long, y As Long, Optional bPreserve As Boolean = False)
If bPreserve Then
ReDim Preserve m_BitArray(x \ 8, y)
Else
ReDim m_BitArray(x \ 8, y)
End If
End Function
Private Sub Class_Initialize()
Dim i As Integer
For i = 0 To 7
m_BitMask(i) = 2 ^ i
Next i
End Sub
Hi Arnoutdv, Your BitArray class has implemented Bits-Map, much grateful.
When I execute "objBitArray.Initialize 100, 88000000", the test program runs well and the memory used is 1092M.
When I execute "objBitArray.Initialize 100, 90000000", the system prompts "Memory overflow"
The above results is completely acceptable.
In my plan, the MaxRows in my grid control will be allowed to reach 200 million and the MaxCols will be allowed to reach 1000. So I'd like know if there is any way to optimize your BitArray class, such as using shift operations or CopyMemory for optimization?
In addition, The trick mentions the hash-table, and I don't know what the hash-table has to do with Bits-Map.
-
May 28th, 2018, 11:06 AM
#14
Re: How to implement "bits-map" for massive grid cell data?
Originally Posted by dreammanor
Hi dilettante, I haven't tested it on XP yet, because my grid control will run on Win10. Is there a solution that does not require the ntoskrnl.exe or ntdll.dll?
As far as I know this isn't implemented anywhere else in Windows. There is NtosKrnl.lib for static linking but that doesn't help us in VB6.
However your needs are fairly simple anyway right? You don't need things like searching for a run of unset bits of a given length within the bitmap for example.
I think all you really want is a packed boolean array, and the code others have shown above is about as good as things will get in VB6. To do better you'd have to create a DLL in some language that offers fast shift and mask operations on unsigned 32-bit values.
Is this entire enterprise fruitless?
How often does it make sense for a user interface to offer a scrollable view of a vast number of rows? The scrollbar becomes so thin and touchy that users can't pinpoint specific rows. It is barely usable. This may be the wrong user interface model.
For those very rare cases where we want to damn the torpedoes and do the wrong thing anyway we already have ListView, which we can use in report view and virtualized mode.
Failing to do that means keeping a vast tub of data in RAM, most of it never used. This not only wastes machine resources but you also have the cost of loading the data from wherever it really lives, and if you allowing updating you have the cost of updating it where it really lives.
Since the user can only deal with a window on the data of 30, 100, maybe 200 rows at a time... why spend memory on 10 million rows?
This is like chopping down an entire apple tree every time you want to eat an apple.
-
May 28th, 2018, 11:34 AM
#15
Addicted Member
Re: How to implement "bits-map" for massive grid cell data?
Originally Posted by dilettante
we already have ListView, which we can use in report view and virtualized mode.
Exactly. Search for LVS_OWNERDATA.
-
May 28th, 2018, 01:54 PM
#16
Re: How to implement "bits-map" for massive grid cell data?
dreammanor, if you want to store very big amount of data you can use a FileMapping object, each time you need only the small portion of data the other data is back-swapped into that file. Using bits-map you can check if a page is allocated (big chunks) and inside page you can use bits-map to check data within that page (small chunks).
-
May 29th, 2018, 09:08 AM
#17
Thread Starter
PowerPoster
Re: How to implement "bits-map" for massive grid cell data?
Originally Posted by dilettante
As far as I know this isn't implemented anywhere else in Windows. There is NtosKrnl.lib for static linking but that doesn't help us in VB6.
However your needs are fairly simple anyway right? You don't need things like searching for a run of unset bits of a given length within the bitmap for example.
I think all you really want is a packed boolean array, and the code others have shown above is about as good as things will get in VB6. To do better you'd have to create a DLL in some language that offers fast shift and mask operations on unsigned 32-bit values.
Is this entire enterprise fruitless?
How often does it make sense for a user interface to offer a scrollable view of a vast number of rows? The scrollbar becomes so thin and touchy that users can't pinpoint specific rows. It is barely usable. This may be the wrong user interface model.
For those very rare cases where we want to damn the torpedoes and do the wrong thing anyway we already have ListView, which we can use in report view and virtualized mode.
Failing to do that means keeping a vast tub of data in RAM, most of it never used. This not only wastes machine resources but you also have the cost of loading the data from wherever it really lives, and if you allowing updating you have the cost of updating it where it really lives.
Since the user can only deal with a window on the data of 30, 100, maybe 200 rows at a time... why spend memory on 10 million rows?
This is like chopping down an entire apple tree every time you want to eat an apple.
I need to do massive data analysis(BI). The amount of data processed by my grid control is generally in the hundreds of thousands. If all the data is in memory, it will not only greatly accelerate the speed of the data analysis, but also greatly reduce the amount of software source code.
Originally Posted by jj2007
Exactly. Search for LVS_OWNERDATA.
Hi jj2007, thank you for your reply.
-
May 29th, 2018, 09:12 AM
#18
Thread Starter
PowerPoster
Re: How to implement "bits-map" for massive grid cell data?
Originally Posted by The trick
dreammanor, if you want to store very big amount of data you can use a FileMapping object, each time you need only the small portion of data the other data is back-swapped into that file. Using bits-map you can check if a page is allocated (big chunks) and inside page you can use bits-map to check data within that page (small chunks).
OK, I'll search the usage of FileMapping. Bits-map is very useful to me. Extremely grateful.
-
May 29th, 2018, 12:39 PM
#19
Addicted Member
Re: How to implement "bits-map" for massive grid cell data?
Originally Posted by dreammanor
I need to do massive data analysis(BI). The amount of data processed by my grid control is generally in the hundreds of thousands. If all the data is in memory, it will not only greatly accelerate the speed of the data analysis, but also greatly reduce the amount of software source code.
The trick is to have the data in memory, in one big linear array; thus you can do efficient analysis of your data. BUT the control "sees" only as much data as is necessary for the display. This is what LVS_OWNERDATA is all about. If you insist to have all the data held in memory by the control itself, it will slow down to a crawl.
-
May 30th, 2018, 08:35 AM
#20
Thread Starter
PowerPoster
Re: How to implement "bits-map" for massive grid cell data?
Originally Posted by jj2007
The trick is to have the data in memory, in one big linear array; thus you can do efficient analysis of your data. BUT the control "sees" only as much data as is necessary for the display. This is what LVS_OWNERDATA is all about. If you insist to have all the data held in memory by the control itself, it will slow down to a crawl.
Hi jj2007, thank you for your reply. My understanding of "one big linear array" is to put data into memory. I don't know whether my understanding is correct.
-
May 30th, 2018, 03:54 PM
#21
Addicted Member
Re: How to implement "bits-map" for massive grid cell data?
Originally Posted by dreammanor
My understanding of "one big linear array" is to put data into memory.
Yes, that's correct. The point is that you can put, say, 10 Million records with 80 bytes each into
1. a linear 800 Million byte array or
2. some unknown data format used internally by a listview or grid control, with an unknown overhead
Solution 2 may lead to HeapAlloc failure, if the overhead is more than a few bytes...
-
May 30th, 2018, 09:31 PM
#22
Thread Starter
PowerPoster
Re: How to implement "bits-map" for massive grid cell data?
Yes, I'll keep a complete Bits-Map in memory. The actual grid data is partially stored in memory, and part of it is placed in a disk file.
I've collected enough information and now I've started to develop my grid control. Thanks to all the people who helped me.
Last edited by dreammanor; May 30th, 2018 at 09:36 PM.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|