There have been a couple of roll-your-own collection classes posted in these forums, and I'm not knocking any of these. Some of them claim to be faster than the core VB6 Collection object, and they may be. That's not what this post is about.
This post is, to some degree, in response to a recent lively thread. However, in that thread, there was much discussion about the internal workings of VB6 Collections. Some was sorted, and some wasn't. What wasn't sorted was how VB6 Collections are as fast as they are. And that's also not the point of this post. We know they are relatively fast, and the code in this post continues to take advantage of that.
One thing that was sorted out is how to get the keys back from a collection.
Also, it was rather thoroughly illustrated that VB6 Collection keys follow some strange rules. For instance, they compare strings (to determine existence and duplication) much like StrComp() with the vbTextCompare constant. This has all kinds of strange consequences. First, it can be locale specific. Second, it's case insensitive, which can cause very strange problems. Third, strange problems can arise when a character in the key string is outside of the &h0020 to &hD7FF UCS-2 Unicode range.
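For anyone who wants to see the case-insensitive behavior for themselves, here's a minimal sketch that uses only the built-in Collection. The procedure name is just something I made up for illustration. Paste it into a standard module and run it from the Immediate window.

```vb
Option Explicit

Public Sub DemoCaseInsensitiveKeys()
    Dim c As Collection
    Set c = New Collection

    c.Add "first", "abc"

    On Error Resume Next
    c.Add "second", "ABC"       ' same key, as far as the Collection is concerned
    If Err.Number <> 0 Then
        ' Run-time error 457: "This key is already associated
        ' with an element of this collection".
        Debug.Print "Add failed: "; Err.Number; " "; Err.Description
    End If
    On Error GoTo 0
End Sub
```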
Therefore, this is a wrapper that allows for strings that are case sensitive. In fact, they can have anything at all in them, and it won't matter. Basically, a HEX version of the string is what's actually placed in the collection, although this wrapper hides that from you.
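To make the HEX idea concrete, here's a rough sketch of how a key could be converted to a hex string and back again. The names KeyToHex and HexToKey are hypothetical, and the attached class may well do the conversion differently (and faster). This is only meant to show the principle: every 16-bit character becomes four hex digits, so two keys that differ in any bit produce different hex strings.

```vb
Option Explicit

' Hypothetical helper: turn each 16-bit character into 4 uppercase hex digits.
Public Function KeyToHex(ByVal sKey As String) As String
    Dim i As Long
    Dim sHex As String
    For i = 1 To Len(sKey)
        ' AscW returns a signed Integer, so mask it into the 0..65535 range.
        sHex = sHex & Right$("000" & Hex$(AscW(Mid$(sKey, i, 1)) And &HFFFF&), 4)
    Next
    KeyToHex = sHex
End Function

' Hypothetical helper: rebuild the original string from the hex form.
Public Function HexToKey(ByVal sHex As String) As String
    Dim i As Long
    Dim sKey As String
    For i = 1 To Len(sHex) Step 4
        ' The trailing "&" makes Val treat the value as a Long (0..65535).
        sKey = sKey & ChrW$(Val("&H" & Mid$(sHex, i, 4) & "&"))
    Next
    HexToKey = sKey
End Function
```

Because Hex$ always returns uppercase digits, the case-insensitive comparison inside the Collection can no longer make two different keys collide.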
This wrapper has the four members (Add, Item, Count, Remove) of a typical VB6 Collection. It also has an added set of "helper" members:
KeyExists - Just a boolean check if a key exists in the collection.
Keys() - Returns a string array with all the Collection's keys.
ChangeKey - Change old key to new key.
ChangeIndex - Change old index to new index.
ItemKey - This is a read/write String property. Upon supplying the Index, you can retrieve an item's Key, or you can change it.
ItemIndex - This is a read/write Long property. Upon supplying the Key, you can retrieve an item's Index, or you can change it.
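As an illustration of the simplest of these helpers, here's one common way to do a KeyExists check against a plain VB6 Collection using error trapping. This is a sketch of the general technique, not necessarily how the attached class does it.

```vb
Option Explicit

' Returns True if sKey is a key in c. Works for both object and
' non-object items because IsObject never has to assign the item.
Public Function KeyExists(ByVal c As Collection, ByVal sKey As String) As Boolean
    On Error Resume Next
    IsObject c.Item(sKey)           ' raises an error if the key isn't there
    KeyExists = (Err.Number = 0)
    On Error GoTo 0
End Function
```

Keep in mind that against a raw Collection this check is still subject to the vbTextCompare-like key rules described above.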
In the spirit of my better angels, I'll give a shout out to DEXWERX and to dilettante for their assistance in fleshing out these ideas. Also, they make use of some known information about a header structure as well as an item structure of VB6 Collections. The precise origin of the teasing out of these structures is unknown, but possibly attributable to LaVolpe.
And now for the wrapper. Just place this in a Class named to your choosing (for my use, I've named it CollectionEx), and use it as a wrapper to the internal VB6 Collection object. Again, just to enumerate the advantages:
Keys are completely case sensitive. Basically, they're compared on a binary level rather than in a vbTextCompare way.
The "For Each" syntax is enabled.
An extra set of "helper" members is included (see list above).
Code:
' Because of certain Procedure Attributes that aren't actually shown in the code window,
' I decided to just have this as an attachment.
'
' Specifically, the "Attribute Item.VB_UserMemId = 0" which makes the Item method the default
' would be lost. And the two attributes necessary for the "For Each" syntax in the NewEnum
' procedure would be lost. These are documented in the attached code, but you must open the
' code in Notepad (or observe in the Tools/Procedure Attributes... area) to actually see them.
EDIT1: It was graciously reported by dz32 that I neglected to consider a couple of things. For one, I hadn't made the "Item" procedure the default. This caused an incompatibility between VB6's built-in Collections and this wrapper. This is now corrected. And secondly, he pointed out that I neglected to consider objects stored as the collection data. When objects are stored, when returning them, the "Set" statement must be used (rather than the implicit "Let" statement). This is also now corrected. Just re-download the attachment for these changes, all tested and ready to go.
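For those who'd rather see the two fixes than take my word for it, here's a cut-down sketch of what a default Item and a For Each enabled NewEnum typically look like in a class file. The Attribute lines only survive if the file is edited outside the IDE (or set through Tools/Procedure Attributes). I've used Public here and a plain internal Collection named "c" to keep the sketch short, so treat the details as assumptions rather than a copy of the attachment.

```vb
Option Explicit

Private c As Collection

Private Sub Class_Initialize()
    Set c = New Collection
End Sub

Public Property Get Item(ByVal IndexOrKey As Variant) As Variant
Attribute Item.VB_UserMemId = 0
    ' Objects must be returned with Set, everything else with an implicit Let.
    If IsObject(c.Item(IndexOrKey)) Then
        Set Item = c.Item(IndexOrKey)
    Else
        Item = c.Item(IndexOrKey)
    End If
End Property

Public Function NewEnum() As IUnknown
Attribute NewEnum.VB_UserMemId = -4
Attribute NewEnum.VB_MemberFlags = "40"
    ' Delegate enumeration to the internal Collection's hidden enumerator.
    Set NewEnum = c.[_NewEnum]
End Function
```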
EDIT2: It was also pointed out that my Add method would take a numeric value for the key, and implicitly typecast it to a string, whereas the built-in VB6 collection would throw a type-mismatch error when doing this. It was suggested that this was a nice "feature". However, this feature wasn't fully implemented. Specifically, when using the default Item method, anything numeric would be interpreted as an index (and not a key). Same was true for Remove method. To clear this up, a new Optional bForceInterpretAsKey argument has been added to the Item and Remove methods. This allows someone to use numeric arguments for all the keys if they like. These changes are now included in the attached class.
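The decision the new argument makes can be sketched in a few lines. ResolveLookup is a hypothetical helper name, and the demo uses a plain Collection so it can be run on its own. The attached class does the equivalent inside Item and Remove.

```vb
Option Explicit

' Hypothetical helper: decide whether the caller's argument is a key or an index.
Public Function ResolveLookup(ByVal IndexOrKey As Variant, _
        Optional ByVal bForceInterpretAsKey As Boolean = False) As Variant
    If bForceInterpretAsKey Or VarType(IndexOrKey) = vbString Then
        ResolveLookup = CStr(IndexOrKey)    ' String: treated as a key
    Else
        ResolveLookup = CLng(IndexOrKey)    ' numeric: treated as a 1-based index
    End If
End Function

Public Sub DemoNumericKeys()
    Dim c As Collection
    Set c = New Collection
    c.Add "by key", CStr(5)

    Debug.Print c.Item(ResolveLookup(5, True))  ' numeric 5 forced to the key "5"
    Debug.Print c.Item(ResolveLookup(1))        ' numeric 1 used as an index
End Sub
```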
Enjoy,
Elroy
Last edited by Elroy; Aug 30th, 2016 at 05:32 PM.
Any software I post in these forums written by me is provided "AS IS" without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. To all, peace and happiness.
In particular, note that control chars in the range U+0000 - U+0020 and chars in the range U+E000 to U+FFFF are completely valid. Characters above U+FFFF are also valid, but they must be manually encoded as surrogate pairs, using the special range 0xD800 - 0xDBFF and 0xDC00 - 0xDFFF. Characters in this range *are valid* if they adhere to the surrogate encoding rules.
Tanner, here's a quote from the link you've provided:
U+D800 to U+DFFF: The Unicode standard permanently reserves these code point values for UTF-16 encoding of the high and low surrogates, and they will never be assigned a character, so there should be no reason to encode them. The official Unicode standard says that no UTF forms, including UTF-16, can encode these code points.
Yes, I see the part about "high and low surrogates", but that just means the upper/lower case issue gets even further confused, especially when generating keys with something like ChrW$(&h####). And obviously, from the confusion generated in this thread, I think it's safe to say that anything above &hD7FF can cause problems when dealing with Collection keys.
Also, as I'm sure you know, &h0000 is (in theory) reserved for an End-Of-String character, but we also know that VB6 tends to lean much more on the string length in the BSTR block than actually looking for a &h0000. But there are exceptions, such as making an API call with a string.
And the &h0001 to &h001F is the control characters block. So it's probably inadvisable to use those as Collection keys as well.
Also, just FYI for folks, a true UTF-16 implementation (which VB6 is not), has the ability to have characters that are longer than two bytes. In VB6 (in memory), every character is always and only two bytes (with a possible caveat that we're on any relatively contemporary computer).
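For the curious, here's a small sketch of how a code point above U+FFFF ends up as two 16-bit units (a surrogate pair) in a VB6 string. The function name is made up for this post. The arithmetic is the standard UTF-16 encoding: subtract &H10000, then split the remaining 20 bits into a high and a low half.

```vb
Option Explicit

' Hypothetical helper: build a VB6 string holding one Unicode code point.
Public Function CodePointToString(ByVal lCodePoint As Long) As String
    Dim v As Long
    If lCodePoint < &H10000& Then
        ' Fits in a single 16-bit unit.
        CodePointToString = ChrW$(lCodePoint)
    Else
        v = lCodePoint - &H10000&
        ' High surrogate (&HD800..&HDBFF) followed by low surrogate (&HDC00..&HDFFF).
        CodePointToString = ChrW$(&HD800& + (v \ &H400&)) & _
                            ChrW$(&HDC00& + (v And &H3FF&))
    End If
End Function
```

Calling Len(CodePointToString(&H1F600&)) gives 2, since VB6 counts 16-bit units rather than code points.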
So, bottom line, using ChrW() to generate keys has the potential to cause problems, and my wrapper completely solves these problems.
Regards,
Elroy
EDIT1: It was clearly illustrated in that other thread that certain Unicode characters give VB6 Collections a fit. This wrapper (hopefully) completely solves those problems. Using this wrapper you can think of your keys as bits (rather than characters), and any particular bit pattern will never collide with any different bit pattern when adding new items to the collection. What that means is that any character anywhere in the &h0000 to &hFFFF range is completely acceptable, and that upper/lower case differences are observed as different keys (inclusive of the high and low surrogate encoding).
Last edited by Elroy; Aug 29th, 2016 at 11:17 AM.
I never posted my own collection wrapper because there are already a couple of good ones on here, but options never hurt!
I think it was dilettante who posted his own CollectionEX that adds Case Sensitivity, LaVolpe I think had one,
and Schmidt's got one here that greatly improves retrieval by Index.
Yeah, I debated whether or not to post this also. My original thought was that it'd be rather obviously simple, and provide a way to circumvent the Unicode idiosyncrasies. However, it still wound up being a bit long.
I did run it through a good battery of tests, checking all the edge conditions I could possibly think of, so I decided to throw it out here. At its core, it's really quite simple. It's just a wrapper that converts the keys to HEX before they're stored in the collection, and converts them back into their original Unicode string form before giving them back to the user.
And yes, I've seen Schmidt's version. Knowing how particular he is, I suspect they're a good set of procedures. My version was truly intended to still utilize the VB6 built-in Collection object, and just circumvent Unicode and vbTextCompare-like idiosyncrasies, and to throw in a few more members that might be handy for some folks.
Regards,
Elroy
@dz32, I seldom create ActiveX stuff, tending to just wrap it up into a single executable. My thinking was that it'd just be a nice class-Collections-wrapper to throw into a project that might need it. If you'd like to compile as an ActiveX component, I'm sure it'd work quite well. Just change all the Friend declarations to Public.
Also, many moons ago, I got into the habit of declaring class methods as Friend at every opportunity I got. And there are actually times when that has benefits over a Public method:
They compile into faster code.
You can use them to pass UDTs into and out-of class modules, when the UDT has been declared Public in a standard (BAS) module.
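Here's a small sketch of the UDT point. The type and member names are made up for illustration. The same procedures declared Public in a class of a standard exe project would be rejected by the compiler, because the UDT isn't defined in a public object module.

```vb
' --- In a standard module (e.g. Module1.bas) ---
Option Explicit

Public Type PointType
    X As Long
    Y As Long
End Type

' --- In a class module (e.g. Class1.cls) ---
Option Explicit

Private m_uPoint As PointType

' UDT parameters must be passed ByRef.
Friend Sub SetPoint(ByRef uPoint As PointType)
    m_uPoint = uPoint
End Sub

Friend Function GetPoint() As PointType
    GetPoint = m_uPoint
End Function
```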
Just as an FYI, class modules thrown into your project may make the size of your executable grow, but it doesn't necessarily increase the size of your program's memory-footprint. I'm not sure how far back you go, but I think of class modules much like the old-style overlays. In fact, you still see them compiled and processed by the linker as OBJ files. These classes compiled into the executable are only called into memory when they're instantiated as an object. The same is true of forms (which are effectively class modules with an interface). Therefore, the decision of whether or not to go ActiveX with this stuff, when you've actually got the source code, is really just a matter of personal choice.
Again, unless I've got a specific reason otherwise, anytime I want to expose a form or class method to the rest of my project, I always tend to use Friend.
Regards,
Elroy
Last edited by Elroy; Aug 29th, 2016 at 06:32 PM.
Are you asking if there's any particular license needed to use the above code? If so, the answer is absolutely not. Anything I post in these forums is free for anyone to do anything they like with. Open source / Shareware in the extreme.
If you're asking about other down-sides to Friend in a standard executable module, the only one I can think of would be the fact that they can't be late-bound. For instance, if you start a new project, and then create Class1 with the following ...
Code:
Option Explicit
Friend Function TestMethod() As Long
    TestMethod = 99
End Function
And then do something like the following in Form1...
Code:
Option Explicit
Private Sub Form_Load()
    Dim o As Object
    Set o = New Class1
    MsgBox o.TestMethod
End Sub
When you try to execute the project, you'll get run-time error 438 ("Object doesn't support this property or method").
The fix is to either declare the Class1 TestMethod as Public, or to do early-binding with the object (in Form1: "Dim o As Class1" rather than "Dim o As Object").
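Just to spell out the early-binding version of that fix, Form1 would look like this (Class1 stays exactly as it was, with TestMethod still declared as Friend):

```vb
Option Explicit

Private Sub Form_Load()
    Dim o As Class1         ' early-bound, so the Friend member is visible
    Set o = New Class1
    MsgBox o.TestMethod
End Sub
```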
When speaking of a standard exe project, I'm probably forgetting several other things that a Public method will do that a Friend method won't, but that's all that immediately comes to mind.
Bottom line, IMHO, Friend is much better than Public when everything will be early-binding to any instantiation of the class.
Regards,
Elroy
A couple of changes were made to correct some original incompatibility to the built-in VB6 collections.
Please see the EDIT1 note in the original #1 post.
EDIT1: It was clearly illustrated in that other thread that certain Unicode characters give VB6 Collections a fit. This wrapper (hopefully) completely solves those problems.
Thank you for the reply, Elroy. I'm hammering on this point simply because I don't want other readers to be confused, and because there are use-cases where Collection key strings may come from API functions (which can *absolutely* contain things like multiple null-chars or surrogate characters, and are in no way "invalid" Unicode strings). As long as you feed the Collection object well-formed UTF-16 strings, it shouldn't produce unexpected errors during key retrieval.
What we ultimately demonstrated in the other thread is that *malformed* UTF-16 strings can cause problems for VB's built-in collection. The primary way to "malform" a UTF-16 string is to take characters in the reserved surrogate pair range (used to encode code points above U+FFFF), and mix and match them in ways that do not resolve to valid code points. In hindsight, I suppose this isn't a huge revelation, as malformed UTF-16 strings cause problems for any non-binary string comparison, including all modern WAPI methods, because the compare algorithm can't resolve those malformed surrogate markers into actual code points.
Anyway, what we *did not* demonstrate in the other thread is that the VB6 collection's key comparison code behaves unexpectedly on *correctly formed* UTF-16 strings, with or without surrogate characters. Maybe this is just semantics, but IMO it's a relevant distinction. Feed the Collection object valid UTF-16 strings (including surrogate pairs, or code points >= U+E000) and you'll still be okay. (Assuming you want the default vbTextCompare behavior, obviously.)
None of this is meant to invalidate your project here - it's a fine example, and I think binary-type Key comparisons are what most developers probably want! But while VB's existing Collection object has any number of issues, I don't think we should call it "Unicode incompatible" or state that it has "problems with certain Unicode chars". Again, the Collection object appears to work correctly as long as it's fed valid UTF-16 strings. (...Where "correctly" means, "there are no unexpected errors, and key retrieval operates on the same locale-specific string comparison behavior as vbTextCompare".)
Anyway, apologies for derailing your thread. I'll try to stay quiet from here on out.
I'm not at my main computer, but an obvious way to handle this would be to...
Code:
c.Add "test", CStr(5)
MsgBox c(CStr(5))
However, an AddByKey sub and an ItemByKey property may not be a bad idea. I suppose it'd also take a RemoveByKey to be complete.
@tanner, no problem. I'll respond more later. I'm poking at a tablet at this moment.
Regards,
Elroy
Last edited by Elroy; Aug 30th, 2016 at 04:16 PM.
@dz32, I looked through all your suggested helper methods. Let me just say that my original intent was to try and provide a clean and short alternative/wrapper for the VB6 Collection object. I absolutely appreciate your suggestions! And please keep them coming.
But please don't be upset if I don't take them all. I thought the idea of a completely numeric key (at least from the appearance to the user) was an absolutely excellent idea, and I've implemented that (see EDIT2 of the first post).
Regarding the others, every one is a good idea. But some just seem like a bit of bloat. For instance, IsEmpty and Clear are just too easily done without the need of a method. I suppose a "clear" internal to the CollectionEx class might be a few milliseconds faster, but it's pretty easy to just re-instantiate the whole class. The minute the original CollectionEx object is destroyed, it'll release/destroy its internal "c" (VBA.Collection) object.
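To show what I mean by "too easily done", here's the idea sketched with a plain VBA.Collection so it runs on its own. The same two lines work with the wrapper class in place of Collection.

```vb
Option Explicit

Public Sub DemoClearAndIsEmpty()
    Dim c As Collection
    Set c = New Collection
    c.Add "a", "k1"
    c.Add "b", "k2"

    ' "Clear": drop the old object and start over with a fresh one.
    Set c = New Collection

    ' "IsEmpty": just test the count.
    Debug.Print (c.Count = 0)
End Sub
```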
I'll admit that I did consider your ToArray idea. However, that just seems like the fundamental idea of what a collection is in the first place. It's almost like making a second copy. I just didn't see the need. (Sort of the same thoughts about ToString.)
The FromArray is definitely a novel idea, and I'm still considering that one. I'd probably name it something like AddFromArray. However, to work well, it'd probably have to be tied to a Publicly declared structure (UDT) in a standard (BAS) module that looked something like the following:
Code:
Public Type CollectionExItemType
    vData As Variant
    Key As String
    Before As Variant
    After As Variant
End Type
And then, you'd create an array of these, fill them with data, and pass it to some AddFromArray function. The only thing that keeps me from doing it now is that it'd take a separate standard module (or a separate TypeLib) for this structure to get it done.
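Just to flesh out the thought, here's a rough sketch of what such an AddFromArray might look like in a standard module. It works against a plain Collection, repeats the type so it stands on its own, and deliberately ignores the Before and After members to stay short. None of this is in the attached class.

```vb
Option Explicit

Public Type CollectionExItemType
    vData As Variant
    Key As String
    Before As Variant
    After As Variant
End Type

' Hypothetical: add every element of uItems to c, using the key when one is given.
' Assumes uItems has been dimensioned (LBound fails on an uninitialized array).
Public Sub AddFromArray(ByVal c As Collection, ByRef uItems() As CollectionExItemType)
    Dim i As Long
    For i = LBound(uItems) To UBound(uItems)
        If Len(uItems(i).Key) > 0 Then
            c.Add uItems(i).vData, uItems(i).Key
        Else
            c.Add uItems(i).vData
        End If
    Next
End Sub
```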
Regards,
Elroy
@Tanner_H, thanks for your reply as well (both the one about Unicode and your latest one).
I must admit that I seldom (really, if ever) mess with the UTF-16 characters above 0xD7FF, and I'm not totally on top of all of their idiosyncrasies. I've just never had the need. I just know that they're peculiar, so I stay away from them.
I do sometimes dabble in another forum which has the complete set of UTF-16 implemented. In this other forum, it's not at all unusual to see people jump out to the UTF-8 & Unicode site and find funny characters to include in their posts, or to even use for various purposes in their code. (Just FYI for the uninitiated, UTF-8 is something entirely different from what VB6 deals with).
The code there has a complete implementation of UTF-16, and actually occasionally has characters longer than two bytes.
Anyway, if I ever have a strong need, maybe I'll get all of the different Unicode conventions completely under my belt.
Regards,
Elroy
EDIT1: Yes, I'll agree with something else you said. I suppose, in the strictest of terms, the VBA.Collection object should be called "Unicode compatible". It's probably better said that it's not "any-bit-pattern-in-string-bits compatible".
Last edited by Elroy; Aug 30th, 2016 at 06:05 PM.
I was politely pointing out that the class would be unusable as a public class in a DLL, which, yes, is the only use for Friend... unless I am missing something, which is why I left room for explanation.
Sorry, I hadn't looked at the code; I had no idea anyone would write VB6 in such a manner.
I actually like it. That's a great idea. I've managed to get very busy with my consulting. However, when I get a breather, I'll add that to the primary post.
The difference between a Get Property and a Function has always been a bit interesting to me (other than a Get Property can have a corresponding Let-or-Set Property). From a functional perspective, I can't see any difference. However, I do know that VB6 keeps track of them differently, particularly when they appear in a class or form object.
Regards,
Elroy
Sure, do what you like with it. And yeah, I use this code in a couple of places in my primary project, and it seems to work fine.
Good Luck,
Elroy