PHP User Warning: fetch_template() calls should be replaced by the vB_Template class. Template name: bbcode_highlight in ..../includes/functions.php on line 4197
Generating a unique fixed-length hash in VB6-VBForums
Page 1 of 2 12 LastLast
Results 1 to 40 of 44

Thread: Generating a unique fixed-length hash in VB6

  1. #1

    Thread Starter
    Member G_Hosa_Phat's Avatar
    Join Date
    May 2008
    Location
    Oklahoma City, OK
    Posts
    38

    Question Generating a unique fixed-length hash in VB6

    Hello, all!

    I've been researching this for a while now, and I'm still not coming up with exactly what I'm looking for. Perhaps I'm just being dense, but I can't seem to translate any of the answers I've found into an actual, workable solution for my current situation.

    I need to generate a unique ID that can be stored in a database to retrieve a specific record with the following requirements:
    1. Unique
    2. Non-Sequential
    3. Fixed-length of 10 characters (The specific length could potentially change before final implementation of the solution, depending on what the "higher-ups" decide they want. Of course, if it were to change, it would probably be to decrease the length unfortunately)
    4. Alphanumeric (preferably upper-case only)
    5. VB6 (I know, I know...)

    The "input" available for generating this code is comprised of at least three separate elements:
    1. Customer number (fixed-length)
    2. Account number (variable-length)
    3. A specific date

    These three elements should ensure the uniqueness of the record to which the generated ID is tied.

    The overall objective is to provide a user with this code on a printed and mailed notification letter. The user will be able to enter this code on our company's Web site for submitting information back to us. The unique aspect is important to ensure that no two users receive the same code, thereby submitting information which is tied to the wrong data record.

    I could potentially exclude the date from the above "input" values, but I'm including it at this point to ensure that, even if we send multiple notification letters to the same individual (customer number & account number), we will always pull the correct database record (our company will never send two notification letters to the same individual on the same day). If necessary, I could potentially include additional data elements for input, but I fear that this could cause even more problems than it solves.

    I'm sure I could bang my head on my desk for a while longer and eventually work something out, but my brain is starting to hurt with all of the search results I've reviewed so far. I was wondering if someone here could help point me in the right direction for finding a viable solution. Thank you in advance for all of your assistance.
    Last edited by G_Hosa_Phat; Jul 31st, 2015 at 04:09 PM.

  2. #2
    PowerPoster
    Join Date
    Feb 2006
    Posts
    20,793

    Re: Generating a unique fixed-length hash in VB6

    Normally a hash has far fewer bits of information than its inputs. For this reason hashes are not normally unique but merely "approximately unique" and when used for retrieval you have to expect multiple hits and then walk through until you find the one you really want.

    Normally you want "approximately unique" to be as unique as possible, to limit the size of the resulting hit list to be walked.


    The only way to make a unique hash is to strip out noise bits from each of your input data items and then concatenate the remaining bits.

    For example an input consisting of digits 0 to 9 only needs 4 bits per digit instead a whole byte for each digit. But using a numeric Byte, Integer, or Long can be even briefer (a Byte can handle 0 to 255, Integer -32,768 to 32,767, Long -2,147,483,648 to 2,147,483,647). Same goes for currency and real numeric types.

    Strings that only use the 7-bit ASCII subset could be packed together saving 1 bit per character compared to ANSI. You could even pack things tighter if you exclude control characters, but not trivially.


    However trying to shoehorn your data (you didn't give sizes) into 10 uppercase alphanumeric characters... well uniqueness doesn't seem practical: 36^10 = &HCFD41B9100000 or somewhat under 52 bits of uniqueness.
    Last edited by dilettante; Jul 31st, 2015 at 04:38 PM.

  3. #3

    Thread Starter
    Member G_Hosa_Phat's Avatar
    Join Date
    May 2008
    Location
    Oklahoma City, OK
    Posts
    38

    Re: Generating a unique fixed-length hash in VB6

    @delettante - Thank you for your response. Perhaps a full-fledged hash of the data is going a bit "too far" in this situation, as the only real requirement is to have a unique, non-sequential, alphanumeric, 10-character code generated for each of the notification letters our company sends out (I might be able to talk them into expanding the size a bit, but not much more than 10). Based on my estimations, we probably produce less than 100,000 of these letters per year, so the "limitations" of 36^10 unique codes should be acceptable for the foreseeable future.

    Maybe I could/should just try to use a random generator of some sort to produce the code, then put it on a loop that checks the database until I get one that hasn't already been used. I just thought that finding a way to hash data that should be unique to begin with would help to guarantee that the generated code would end up unique. Any suggestions would be appreciated.

  4. #4
    PowerPoster
    Join Date
    Feb 2006
    Posts
    20,793

    Re: Generating a unique fixed-length hash in VB6

    No, your problem will be trying to invent a hash function that can map your inputs to those 36^10 values with no duplication. It just isn't possible, you have under 52 bits of uniqueness. Hashing is not a magic trick.


    The only solution I can think of is a database table or equivalent that contains all of your input fields and indexes them together as a key along with a conventional IDENTITY type field to serve as your "hash." Every time you have a record you look it up in the table, if you get a hit you have your "hash value." If not you store a new record in the table, auto-generating the next IDENTITY feld value.

    To keep these from being altered by database compaction, etc. you might have to use a regular data field and increment the highest value to get a new value. This means even more chruning away on the database, and probably indexing that field too.

    At that point you are far worse off than before you began this unless storage is cheap and new records are rare. Then it is merely sluggish.


    If somebody has a better idea I'd love to hear it.

  5. #5
    PowerPoster
    Join Date
    Feb 2006
    Posts
    20,793

    Re: Generating a unique fixed-length hash in VB6

    Another way to look at hashing is as lossy compression.

    Since you want lossless compression there are extreme limits to what can be accomplished. At best you'll probably only be able to squeeze a 50% reduction, so with 51 and change bits available you are limited to 102 bits or so of possible input data.

    Even that assumes a lossless compression algorithm that can make guarantees that most do not.

  6. #6
    Fanatic Member
    Join Date
    Aug 2013
    Posts
    806

    Re: Generating a unique fixed-length hash in VB6

    Hi G_Hosa_Phat. Dilettante has done a great job describing the difficulty of what you're trying to do. The technical name of this task is perfect hashing. That link takes you to a Wikipedia article that describes the difficulty of the problem, and various solutions. One of the linked books in that article goes into a lot more detail.

    Generally speaking, an excessive amount of work is required to construct a perfect hash function for arbitrary-length inputs. It's much easier to assume that collisions will happen, and to just use a (much simpler) imperfect hash function.

    If you have no choice but to go down the perfect hashing road, those links will get you started, but in all seriousness, I'd avoid that requirement if at all possible. Your additional constraints of specialized input and output data formats make this even more difficult than it would otherwise be.

    (This is my $0.02... unless of course you're very confident in your math skills! )
    Check out PhotoDemon, a pro-grade photo editor written completely in VB6. (Full source available at GitHub.)

  7. #7
    PowerPoster
    Join Date
    Jun 2013
    Posts
    4,516

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by Tanner_H View Post
    ... unless of course you're very confident in your math skills! )
    Well, the underlying math is not really all that difficult (in case the Hash-Function
    in question is "near ideal" - and many CRC-algos are considered to coming close,
    and are well-researched regarding their distribution (for different polynomials).

    Here's a link which shows a nice overview for collision-probabilities of different Hash-Lengths:
    http://stackoverflow.com/questions/1...ollision-crc32

    According to the link above, the formula(s) for the OP would be:

    Code:
    Public Function GetAmountOfCollisions(Samples, HashBits)
      GetAmountOfCollisions = (Samples ^ 2 - Samples) / (2 * (2 ^ HashBits))
    End Function
    
    Public Function GetProbForAtLeastOneCollision(Samples, HashBits)
      GetProbForAtLeastOneCollision = 1 - Exp(-(Samples ^ 2 - Samples) / (2 * (2 ^ HashBits)))
    End Function
    E.g. when using a CRC-32 - along with a yearly amount of 100000 drawn samples,
    the results would come out as (e.g. ran in the Immediate-Window):

    ? GetAmountOfCollisions(10^5, 32)
    1.16414157673717
    ? GetProbForAtLeastOneCollision(10^5, 32)
    0.687809461338744

    So, CRC-32 would have a roughly 70% chance to collide already within the first year - over
    10 years (amount then 10^6) the collisions would increase to about 116 (among these 1Mio draws).

    With a near ideal 52Bit-Hash, the probability to generate at least one collision would be:
    - in one year (10^5 samples) roughly "one in a million"
    ? GetProbForAtLeastOneCollision(10^5, 52)
    1.11021130611011E-06
    - and after 10 years (10^6 drawn samples) roughly "one in ten-thousand"
    ? GetProbForAtLeastOneCollision(10^6, 52)
    1.1101602870478E-04

    With a 64Bit Hash, the Probabilities would be:
    2.71047850830541E-10 ("2.7 in 10^10" for 10^5 samples)
    2.71050268896289E-08 ("2.7 in 10^8" for 10^6 samples)

    The OP will have to present these Probs to the deciders - and let them ponder,
    if they could live with a "one in 10000 chance over 10 years" for a duplicate Hash -
    or if adding 3 Chars, to get about 64Bit would worth it, to lower the collision-
    probability to about 2.7 in 100Mio (over 10 years).

    Olaf

  8. #8
    PowerPoster
    Join Date
    Jun 2013
    Posts
    4,516

    Re: Generating a unique fixed-length hash in VB6

    FWIW, maybe the following could be considered a compromise...

    It is using 30Bits from two different 32Bit-Hashes, then combined and
    resulting in a 12Char-AlphaNumeric Output.

    Each of these 12Chars representing 5Bit = 60Bit-hash-resolution in total.
    (since we use only 32Chars of the AlphaNumeric ASCII-range, I removed
    the Chars which could be "ambigous reads" - as e.g. the 0, the O and the Q -

    This is the 32Char-String against which the 5Bits per Char are mapped:
    Const CharMap$ = "ABCDEFGHJKLMNPRSTUVWXYZ123456789"

    The following module I wrote some years ago, and it contains different CRC-
    implementations which (when native compiled) will perform quite well:

    *Into a Module - e.g. modCRC.bas
    Code:
    Option Explicit
     
    Public Function Crc6(B() As Byte) As Long
    Dim i As Long, crc As Long: Static crcTab(0 To 255) As Long
      If crcTab(1) = 0 Then CreateLookupTable crcTab, 6, True, &H6B&
     
      crc = 63
        For i = LBound(B) To UBound(B)
          crc = crcTab((crc Xor B(i)) And &HFF&) Xor (crc \ 256)
        Next i
      Crc6 = crc Xor 63
    End Function
    
    Public Function Crc16(B() As Byte) As Long
    Dim i As Long, crc As Long: Static crcTab(0 To 255) As Long
      If crcTab(1) = 0 Then CreateLookupTable crcTab, 16, True, &H8005&
      
      crc = 0
        For i = LBound(B) To UBound(B)
          crc = crcTab((crc Xor B(i)) And &HFF&) Xor (crc \ 256)
        Next i
      Crc16 = crc
    End Function
    
    Public Function Crc16_ModBus(B() As Byte) As Long
    Dim i As Long, crc As Long: Static crcTab(0 To 255) As Long
      If crcTab(1) = 0 Then CreateLookupTable crcTab, 16, True, &H8005&
      
      crc = &HFFFF&
        For i = LBound(B) To UBound(B)
          crc = crcTab((crc Xor B(i)) And &HFF&) Xor (crc \ 256)
        Next i
      Crc16_ModBus = crc
    End Function
    
    Public Function Crc16_USB(B() As Byte) As Long
    Dim i As Long, crc As Long: Static crcTab(0 To 255) As Long
      If crcTab(1) = 0 Then CreateLookupTable crcTab, 16, True, &H8005&
      
      crc = &HFFFF&
        For i = LBound(B) To UBound(B)
          crc = crcTab((crc Xor B(i)) And &HFF&) Xor (crc \ 256)
        Next i
      Crc16_USB = crc Xor &HFFFF&
    End Function
    
    Public Function Crc16_CCITT(B() As Byte) As Long
    Dim i As Long, crc As Long: Static crcTab(0 To 255) As Long
      If crcTab(1) = 0 Then CreateLookupTable crcTab, 16, False, &H1021&
      
      crc = &HFFFF&
        For i = LBound(B) To UBound(B)
          crc = (crcTab(((crc \ 256) Xor B(i)) And &HFF&) Xor (crc * 256)) And &HFFFF&
        Next i
      Crc16_CCITT = crc
    End Function
    
    Public Function Crc16_CCITT_XModem(B() As Byte) As Long
    Dim i As Long, crc As Long: Static crcTab(0 To 255) As Long
      If crcTab(1) = 0 Then CreateLookupTable crcTab, 16, False, &H1021&
      
      crc = 0
        For i = LBound(B) To UBound(B)
          crc = (crcTab(((crc \ 256) Xor B(i)) And &HFF&) Xor (crc * 256)) And &HFFFF&
        Next i
      Crc16_CCITT_XModem = crc
    End Function
    
    Public Function Crc16_CCITT_Kermit(B() As Byte) As Long
    Dim i As Long, crc As Long: Static crcTab(0 To 255) As Long
      If crcTab(1) = 0 Then CreateLookupTable crcTab, 16, True, &H1021&
      
      crc = 0
        For i = LBound(B) To UBound(B)
          crc = crcTab((crc Xor B(i)) And &HFF&) Xor (crc \ 256)
        Next i
      Crc16_CCITT_Kermit = crc \ 256 + 256 * (crc And &HFF&) 'Byte-Swap
    End Function
    
    Public Function Crc32(B() As Byte) As Long
    Dim i As Long, crc As Long: Static crcTab(0 To 255) As Long
      If crcTab(1) = 0 Then CreateLookupTable crcTab, 32, True, &H4C11DB7
      
      crc = &HFFFFFFFF
        For i = LBound(B) To UBound(B)
          crc = crcTab((crc Xor B(i)) And &HFF&) Xor (((crc And &HFFFFFF00) \ &H100) And &HFFFFFF)
        Next i
      Crc32 = crc Xor &HFFFFFFFF
    End Function
    
    Public Function Crc32_C(B() As Byte) As Long
    Dim i As Long, crc As Long: Static crcTab(0 To 255) As Long
      If crcTab(1) = 0 Then CreateLookupTable crcTab, 32, True, &H1EDC6F41
      
      crc = &HFFFFFFFF
        For i = LBound(B) To UBound(B)
          crc = crcTab((crc Xor B(i)) And &HFF&) Xor (((crc And &HFFFFFF00) \ &H100) And &HFFFFFF)
        Next i
      Crc32_C = crc Xor &HFFFFFFFF
    End Function
    '-------------- Helper-Functions for Lookup-Table-Generation ----------------------
    
    Private Sub CreateLookupTable(crcTab() As Long, ByVal BitLen As Long, ByVal Reflected As Boolean, ByVal Poly As Long)
    Dim r As Long, i As Long, V As Long, BM As Long
      If BitLen = 32 Then BM = &H80000000 Else BM = 2 ^ (BitLen - 1)
      
      For V = 0 To UBound(crcTab)
        r = V
        If Reflected Then r = Reflect(V, IIf(BitLen < 8, BitLen, 8))
        If BitLen > 8 Then r = SHL(r, BitLen - 8)
     
        For i = 0 To IIf(BitLen < 8, BitLen, 8) - 1
          r = SHL(r, 1) Xor IIf(r And BM, Poly, 0)
        Next
        
        If Reflected Then r = Reflect(r, BitLen)
        
        If BitLen = 32 Then crcTab(V) = r Else crcTab(V) = r And CLng(2 ^ BitLen - 1)
      Next V
    End Sub
    
    Private Function Reflect(ByVal v As Long, ByVal Bits As Long) As Long
    Dim i As Long, M As Long
      Reflect = v
      For i = 0 To Bits - 1
        If (Bits - i - 1) = 31 Then M = &H80000000 Else M = 2 ^ (Bits - i - 1)
        Reflect = IIf(v And 1, Reflect Or M, Reflect And Not M)
        v = SHR(v, 1)
      Next i
    End Function
     
    Private Function SHL(ByVal V As Long, ByVal Bits As Long) As Long
      If Bits = 0 Then SHL = V: Exit Function
      SHL = (V And (2 ^ (31 - Bits) - 1)) * 2 ^ Bits Or IIf(V And 2 ^ (31 - Bits), &H80000000, 0)
    End Function
    
    Private Function SHR(ByVal V As Long, ByVal Bits As Long) As Long
      If V < 0 Then SHR = ((V And &H7FFFFFFF) \ 2 ^ Bits) Or 2 ^ (31 - Bits) Else SHR = V \ 2 ^ Bits
    End Function
    
    Public Function GetAmountOfCollisions(Samples, HashBits)
      GetAmountOfCollisions = (Samples ^ 2 - Samples) / (2 * (2 ^ HashBits))
    End Function
    
    Public Function GetProbForAtLeastOneCollision(Samples, HashBits)
      GetProbForAtLeastOneCollision = 1 - Exp(-(Samples ^ 2 - Samples) / (2 * (2 ^ HashBits)))
    End Function
    And here the applied-usage of the above CRC-Module in a small routine,
    combining the results of two different CRC32-implementations
    (the "classic CRC32", which is e.g. used in Ethernet-scenarios - and
    the CRC32-C version which is using a different polynomial (&H1EDC6F41)
    and is then also fed with the reversed Input-String...

    Form-Code for testing:
    Code:
    Option Explicit
    
    Private Sub Form_Load()
      Debug.Print GetAlphaNumeric60BitHash("abc")
    End Sub
     
    Public Function GetAlphaNumeric60BitHash(S As String) As String
    Const CharMap$ = "ABCDEFGHJKLMNPRSTUVWXYZ123456789" '<- 32Chars (5Bit-Mapping)
    Dim B() As Byte, CharLUT() As Byte, H(0 To 1) As Long, i&, j&
        B = S
        H(0) = Crc32(B) And &H7FFFFFFF
        B = StrReverse(S)
        H(1) = Crc32_C(B) And &H7FFFFFFF
      
      GetAlphaNumeric60BitHash = Space$(12)
      For i = 0 To 1: For j = 0 To 5
        Mid$(GetAlphaNumeric60BitHash, i * 6 + j + 1, 1) = Mid$(CharMap, (H(i) And 31&) + 1, 1)
        H(i) = H(i) \ 32 'shift 5 Bits to the right
      Next j, i
    End Function
    HTH

    Olaf
    Last edited by Schmidt; Jul 31st, 2015 at 10:29 PM.

  9. #9
    Fanatic Member
    Join Date
    Apr 2015
    Location
    Finland
    Posts
    658

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by G_Hosa_Phat View Post
    I need to generate a unique ID that can be stored in a database to retrieve a specific record with the following requirements:
    1. Unique
    2. Non-Sequential
    3. Fixed-length of 10 characters (The specific length could potentially change before final implementation of the solution, depending on what the "higher-ups" decide they want. Of course, if it were to change, it would probably be to decrease the length unfortunately)
    4. Alphanumeric (preferably upper-case only)
    5. VB6 (I know, I know...)

    The "input" available for generating this code is comprised of at least three separate elements:
    1. Customer number (fixed-length)
    2. Account number (variable-length)
    3. A specific date

    These three elements should ensure the uniqueness of the record to which the generated ID is tied.
    Most database systems has globally unique ID datatype 'as standard', which you can use as a primary key and also as a replication key in failsafe/failover databases. However if you want to create unique value, then there is CoCreateGUID api at your disposal.

    Code:
    Option Explicit
    
    Private Type GUID
     Data1 As Long
     Data2 As Integer
     Data3 As Integer
     Data4(7) As Byte
    End Type
    
    Private Declare Function CoCreateGuid Lib "OLE32.DLL" (pGuid As GUID) As Long
    
    Public Function CreateGUID() As String
    Dim udtGUID As GUID
    If (CoCreateGuid(udtGUID) = 0) Then
    CreateGUID = _
       String(8 - Len(Hex$(udtGUID.Data1)), "0") & Hex$(udtGUID.Data1) & _
       String(4 - Len(Hex$(udtGUID.Data2)), "0") & Hex$(udtGUID.Data2) & _
       String(4 - Len(Hex$(udtGUID.Data3)), "0") & Hex$(udtGUID.Data3) & _
       IIf((udtGUID.Data4(0) < &H10), "0", "") & Hex$(udtGUID.Data4(0)) & _
       IIf((udtGUID.Data4(1) < &H10), "0", "") & Hex$(udtGUID.Data4(1)) & _
       IIf((udtGUID.Data4(2) < &H10), "0", "") & Hex$(udtGUID.Data4(2)) & _
       IIf((udtGUID.Data4(3) < &H10), "0", "") & Hex$(udtGUID.Data4(3)) & _
       IIf((udtGUID.Data4(4) < &H10), "0", "") & Hex$(udtGUID.Data4(4)) & _
       IIf((udtGUID.Data4(5) < &H10), "0", "") & Hex$(udtGUID.Data4(5)) & _
       IIf((udtGUID.Data4(6) < &H10), "0", "") & Hex$(udtGUID.Data4(6)) & _
       IIf((udtGUID.Data4(7) < &H10), "0", "") & Hex$(udtGUID.Data4(7))
    End If
    End Function
    Simply call by;
    Code:
    Dim sGUID as string
    sGUID = CreateGUID

  10. #10
    Fanatic Member
    Join Date
    Dec 2012
    Posts
    798

    Re: Generating a unique fixed-length hash in VB6

    Forgive if I am wrong, but both the Customer Number and the Account Number should already be unique. I suppose that one customer could have multiple accounts, so that would make the Account Number the obvious choice to use. How some companies address this issue is to use a temporary "Secret" that is associated with the account. A "Secret" would basically be a key to the account. For example, my email provider sends me daily email regarding spam that has been quarantined or bounced. It contains a link to my account combined with an 8 character secret. That secret is replaced by a new secret when the next notification is issued. The account is public information, but the secret is not. You could make the secret as short as you want, as the uniqueness is already provided by the Account Number.

    J.A. Coutts

  11. #11
    PowerPoster
    Join Date
    Feb 2006
    Posts
    20,793

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by couttsj View Post
    Forgive if I am wrong, but both the Customer Number and the Account Number should already be unique. I suppose that one customer could have multiple accounts, so that would make the Account Number the obvious choice to use. How some companies address this issue is to use a temporary "Secret" that is associated with the account. A "Secret" would basically be a key to the account. For example, my email provider sends me daily email regarding spam that has been quarantined or bounced. It contains a link to my account combined with an 8 character secret. That secret is replaced by a new secret when the next notification is issued. The account is public information, but the secret is not. You could make the secret as short as you want, as the uniqueness is already provided by the Account Number.
    I think he wants to identify a "record" (which probably corresponds to some business event, for example a sale) and not customer or account. Even then cust + acct + timestamp could produce collisions but even at one second resolution that should be good enough.


    GUID/UUID values are never guaranteed to be unique. They are only as unique as a 128 bit value chosen pseudo-randomly can be, and many of the bits are "weighting" based on the generating machine's MAC address. This restricts their use to scenarios where they get used sparsely or temporarily (replication for example) and even then collisions may occur. Plus they are not hashes and you can't plug inputs in and get the same 128-bit value out each time.


    The Perfect Hash link was interesting and puts a name to the creature, so that helps even if the article there doesn't seem of much practical use. Dynamic Perfect Hash sounds like the subset of that which is needed here.

    I think what we really have here is a case of trying to put 10 pounds of stuff into a 5 pound bag.

  12. #12
    Fanatic Member
    Join Date
    Aug 2013
    Posts
    806

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by Schmidt View Post
    Well, the underlying math is not really all that difficult (in case the Hash-Function
    in question is "near ideal" - and many CRC-algos are considered to coming close,
    and are well-researched regarding their distribution (for different polynomials).
    In probability, the difference between "guaranteed 0%" and "very near 0%" is astronomical.

    If "very low risk of collision" ends up being acceptable to the OP, then pretty much any hashing function would work. He'd just need to write code that covers the case of rare-but-not-impossible collisions.

    But assuming no collisions and then "playing the odds" is a dangerous proposition, even when the odds are in your favor.

    Quote Originally Posted by dilettante
    The Perfect Hash link was interesting and puts a name to the creature, so that helps even if the article there doesn't seem of much practical use. Dynamic Perfect Hash sounds like the subset of that which is needed here.
    Dynamic perfect hashing is an ugly solution, but it would technically solve the OP's problem. Basically, dynamic perfect hashing just embraces the fact that you can't easily prove 0% risk of collisions, so you construct a hash function with high variability and some kind of random-seed-like secondary input (like salting).

    If you experience a collision, you change the "seed" value to some new random value and rebuild the entire hash table, and keep doing this until you find a seed with zero collisions for the current input set. (A more technical implementation uses layers of hash tables, and you only rebuild the last layer as necessary; the wiki page explains this better.)

    As long as the variability of the hash results are high and the relative size of the dataset is low, dynamic perfect hashing gives you O(1) performance... But if collisions are more frequent than you predicted, it becomes a slow, ugly solution. (Which may be what the doctor ordered here? idk)
    Last edited by Tanner_H; Aug 1st, 2015 at 08:40 AM. Reason: clarified "dynamic perfect hashing" vs just "dynamic hashing"
    Check out PhotoDemon, a pro-grade photo editor written completely in VB6. (Full source available at GitHub.)

  13. #13
    Fanatic Member
    Join Date
    Apr 2015
    Location
    Finland
    Posts
    658

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by dilettante View Post
    GUID/UUID values are never guaranteed to be unique. They are only as unique as a 128 bit value chosen pseudo-randomly can be, and many of the bits are "weighting" based on the generating machine's MAC address. This restricts their use to scenarios where they get used sparsely or temporarily (replication for example) and even then collisions may occur. Plus they are not hashes and you can't plug inputs in and get the same 128-bit value out each time.
    True - but one can always run generated GUID value, through MD5 128-bit hashing algorithm.

  14. #14
    PowerPoster
    Join Date
    Feb 2006
    Posts
    20,793

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by Tech99 View Post
    True - but one can always run generated GUID value, through MD5 128-bit hashing algorithm.
    Gosh, I don't know what to say. You sure got me there.

  15. #15
    Fanatic Member
    Join Date
    Apr 2015
    Location
    Finland
    Posts
    658

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by dilettante View Post
    Gosh, I don't know what to say. You sure got me there.
    Yeah, altought meant to say bit more salted, but been trivial method, left this out.

  16. #16
    PowerPoster
    Join Date
    Jun 2013
    Posts
    4,516

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by Tanner_H View Post
    In probability, the difference between "guaranteed 0%" and "very near 0%" is astronomical.
    Well, whilst the above being (mathematically) true -
    I assume that in reality you're not such a "glass-half-empty"-person...

    Real-life is full of risks and absolute certainties are nearly nowhere to be found (aside
    from a few exceptions, as e.g. that men will *certainly* never understand women <g>).

    Quote Originally Posted by Tanner_H View Post
    ... assuming no collisions and then "playing the odds" is a dangerous proposition,
    even when the odds are in your favor.
    I only offered VB-versions of the formulas - as well as a possible (60Bit in 12Chars) solution
    for the OP (along with some probabilities, which he could present to the decision-makers).

    As for the (unlikely) case of a collision (along with your wording of "unchecked Hashing"
    being a "dangerous proposition")...

    In the case of the OP, the danger can certainly be quantified
    (assuming the worst-case of having generated the same hash for two different
    records in the "send-out"-scenario the OP has described, the worst case will be,
    that a certain notification will *not* be send to a certain customer)...

    That's the price which will have to be paid, when the (unchecked) Hash-Approach
    will be used in that scenario (in case of the unlikely event of a collision, when hash-
    length >= 60Bits will be used).

    That price will have to be set into relation with the price for the implementation
    of a "water-tight-solution" - in *conjunction* with the probabilities:
    1) the "water-tight-solution" will certainly cost x Extra "-ManHours", "-Storage-Space", "-CPU-Cycles" and "-Inconvenience"
    2) the "unchecked Hash-solution" might cost a not send customer-notification (+ the side-effects of that)

    I can only underline (since the OPs scenario is certainly not a "mission-critical-NASA-project")
    how extremely low the likelihood is, that going with the above point #2 "might cost something".

    The 60Bit-solution (in 12 well-readable ASCII-Chars) has a likelihood of not sending
    a certain notification in 100,000 messages/year (over 10 full years) of: 0.000043%

    Well, let's say that this quite low percentage is "still scary enough" - and the number of
    ASCII-Chars which represent the Hash shall therefore be increased to, say, 20 (=100Hash-Bits):
    Then the likelihood to encounter a collision within 10 years will be: ~ 4*10^-19

    For comparison, the likelihood, that you (as a single person) will be hit by a Meteorite
    tomorrow - (or any other day, for the rest of your life) - will be roughly: 3*10^-15
    That this event happens, is about 7500 times more likely than the not send notification
    (now seriously wondering, why those of us not having decided to live in a bunker permanently,
    aren't running from cover to cover - or discard wearing a helmet all day long... )

    I'm not saying that a watertight solution shall be discarded (in the OPs case dilettante
    already outlined an AutoID-approach which is relatively "cheap" to implement - only
    costing a few man-hours, some Disk-Space and a bit of Query-Time) ... I guess I just
    try to defend software-solutions or approaches, which rely on such probabilities
    (e.g. Git relying on a 160Bit-SHA1-Hash, to describe and identify certain content-snippets).

    The math is given - and the Hashing-Algos "sound enough" (operating near the ideal hash).

    For those in doubt of the above sentence - one can check that out oneself, to see
    how the hash-algos we use today "meet the reality":

    E.g. using the Code-Block for the CRC-Module I've posted in #8, pasting it into a *.bas,
    one can then use the following in a Form:

    Code:
    Option Explicit
     
    Private Sub Form_Load()
       Const SampleCountCrc16 = 5000
       Debug.Print "CRC-16 ( SampleCount:"; SampleCountCrc16; ")"
       Debug.Print "Prediction:"; CLng(GetAmountOfCollisions(SampleCountCrc16, 16));
       Debug.Print " --> Reality:"; GetRealWorldCollisions(SampleCountCrc16, False); vbLf
     
       Const SampleCountCrc32 = 500000 'only half a Million, to not wait too long
       Debug.Print "CRC-32 ( SampleCount:"; SampleCountCrc32; ")"
       Debug.Print "Prediction:"; CLng(GetAmountOfCollisions(SampleCountCrc32, 32));
       Debug.Print " --> Reality:"; GetRealWorldCollisions(SampleCountCrc32, True); vbLf
    End Sub
    
    Private Function GetRealWorldCollisions(AmountOfUniqueRandomInputs, ByVal UseCrc32 As Boolean) As Long
    Dim i&, S$, B() As Byte, Hash As String, Col As New Collection
       Rnd -1234 'ensure we create always the same random sequence
       
       On Error Resume Next 'to detect failed Adds to the Collection
       
       For i = 1 To AmountOfUniqueRandomInputs
         S = Chr$(Rnd * 127) & Chr$(Rnd * 127) & CStr(i) '<- last part, to make it unique and ensure some length-variation
         B = StrConv(S, vbFromUnicode)
         If UseCrc32 Then
           Hash = Hex(Crc32(B))
         Else
           Hash = Hex(Crc16(B))
         End If
         
         Err.Clear
           Col.Add i, Hash
         If Err Then GetRealWorldCollisions = GetRealWorldCollisions + 1
       Next
    End Function
    The above code then printing out (just wait a few secs for the 0.5Mio Crc32 calculations):
    Code:
    CRC-16 ( SampleCount: 5000 )
    Prediction: 191  --> Reality: 187 
    
    CRC-32 ( SampleCount: 500000 )
    Prediction: 29  --> Reality: 26
    Olaf
    Last edited by Schmidt; Aug 1st, 2015 at 05:08 PM.

  17. #17
    Fanatic Member
    Join Date
    Aug 2013
    Posts
    806

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by Schmidt View Post
    As for the (unlikely) case of a collision (along with your wording of "unchecked Hashing"
    being a "dangerous proposition")...

    In the case of the OP, the danger can certainly be quantified
    (assuming the worst-case of having generated the same hash for two different
    records in the "send-out"-scenario the OP has described, the worst case will be,
    that a certain notification will *not* be send to a certain customer)...
    It's difficult to discuss the risks without the OP's clarification. The OP says: "The overall objective is to provide a user with this code on a printed and mailed notification letter. The user will be able to enter this code on our company's Web site for submitting information back to us. The unique aspect is important to ensure that no two users receive the same code, thereby submitting information which is tied to the wrong data record."

    What data is the user submitting? SSNs? Credit cards? If they are, would a collision allow some other user to pull the submitted data?

    Hashing personal data is serious business, full-stop. IMO, if hash collisions are a possibility (and given the OP's constraints re: 10-upper-case-character hash, large potential input space), the OP should code for them.

    Collision attacks are a very real thing (they brought down the venerable MD5 hash function, after all). If it were my personal information being stored and processed by this company, I would want the coder to make sure the hash function was well-constructed and that all possibilities were accounted for.

    If the OP can increase the variability of the hash (e.g. lowercase letters, symbols, more chars, etc) then this is a different discussion. Given the exponential nature of bit-representation, 60 bits would certainly be preferable to ~50. A more reasonable 128 would be even better.

    But given the original constraints, and dilettante's correctly calculated 52-bits of variation, collision risk is high enough to warrant implementation considerations. My $0.02.

    Quote Originally Posted by Schmidt View Post
    I'm not saying that a watertight solution shall be discarded (in the OPs case dilettante
    already outlined an AutoID-approach which is relatively "cheap" to implement - only
    costing a few man-hours, some Disk-Space and a bit of Query-Time) ... I guess I just
    try to defend software-solutions or approaches, which rely on such probabilities
    (e.g. Git relying on a 160Bit-SHA1-Hash, to describe and identify certain content-snippets).
    This is false equivocation, and is the kind of statement I worry about with new coders who don't understand the risks involved.

    160 bits != 52 bits.
    Check out PhotoDemon, a pro-grade photo editor written completely in VB6. (Full source available at GitHub.)

  18. #18
    PowerPoster
    Join Date
    Feb 2006
    Posts
    20,793

    Re: Generating a unique fixed-length hash in VB6

    Sideshow:

    Quote Originally Posted by Schmidt View Post
    ... e.g. Git relying on a 160Bit-SHA1-Hash, to describe and identify certain content-snippets...
    Surely they don't actually do that?

    Are you sure that isn't just a hash they are using as a check on file transfer corruption and tampering?

  19. #19
    PowerPoster
    Join Date
    Jun 2013
    Posts
    4,516

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by Tanner_H View Post
    It's difficult to discuss the risks without the OP's clarification. The OP says: "The overall objective is to provide a user with this code on a printed and mailed notification letter.
    ... let's stop right here ...

    As I understood it, a notification will be prepared, printed (and mailed) only when there's
    currently no hash-entry already to be found ... e.g. in a DB-Table, called [NotificationsSent] ...
    (having HashStr as the PK - and CustomerNr, AccountNr, SomeDate and perhaps SendState as additional Columns)

    Such a Table is normal in such "E-Mail-Pumping"-scenarios, for logging-purposes and to avoid sending a
    Notification twice to the same customer (for the same "Notification-Job") - because E-Mail-Pumps are a
    fickle thing, which can fail easily for a variety of reasons (e.g. Mail-Server down), and so the record-entries in
    that [NotificationsSent]-table are therefore treated like "Jobs in a Queue" - often requiring more than just one
    send-attempt (usually handled aside from the HashValue to "peek-in", over the above mentioned SendState-Field).

    So, in case of a collision (same Hash for a different set of "CustomerNr, AccountNr, SomeDate") -
    the lookup into that table [NotificationsSent] over the produced HashValue will just tell:
    "it's already in the works" or "it was already sent" (for another Customer) - hence no E-Mailing
    to that new "colliding, different customer" will happen at all.

    Quote Originally Posted by Tanner_H View Post
    What data is the user submitting? SSNs? Credit cards?
    If they are, would a collision allow some other user to pull the submitted data?
    I guess not (when the notification-system at the OPs side is working roughly as outlined above,
    not sending an E-Mail with a WebSite-Link to that "colliding customer" at all).

    Quote Originally Posted by Tanner_H View Post
    Hashing personal data is serious business, full-stop. IMO, if hash collisions are a possibility
    (and given the OP's constraints re: 10-upper-case-character hash, large potential input space),
    the OP should code for them.
    *If* it's a "serious business-scenario" (money involved, expensive law-suits possible), then perhaps yes.

    As for the "10 upper case chars" - here I thought the OP made it clear, that it doesn't need to be 10 Chars exactly
    (50Bits is a bit sparse for a Hash-Value with regards to collision-probabililties in a 100000 Mails/Year scenario).

    That's why I suggested to go with at least 12 Chars (60Bit) or better 16 or 20 Chars (80 and 100Bit respectively).

    Quote Originally Posted by Tanner_H View Post
    Collision attacks are a very real thing (they brought down the venerable MD5 hash function, after all).
    That's a completely different realm, since we don't talk about "Secure Hashes" in the given scenario.
    The requirement we have here, is only about Hash-Algos, which ensure:
    "collision-probabilities as close as possible to the theoretically achievable"
    The fact that Hash-Algos "designed for security" usually fulfill that requirement as well,
    is IMO beside the point.

    Aside from that, even when the OP would go with a "collision-free AutoID-approach"
    (as the "Key-number to be entered" which then refers to a certain record in a DB-Table
    at the vendors site), the very same "impersonation-attempts" could be performed by a hacker
    (or any "man in the middle"), trying to deduce which ID maps to a certain customer, to make
    use of that ID in "other potentially possible attacks").

    Quote Originally Posted by Tanner_H View Post
    If it were my personal information being stored and processed by this company, I would want the coder
    to make sure the hash function was well-constructed and that all possibilities were accounted for.
    Thought I tried just that (with my suggestions to the OP, to increase the HashLength
    to at least 60Bit, because only then the collision-probabilities come into a reasonable range,
    getting increasingly better with every added "5-Bit-Char").

    Quote Originally Posted by Tanner_H View Post
    If the OP can increase the variability of the hash (e.g. lowercase letters, symbols, more chars, etc) then this is a different discussion.
    No, the only thing that counts in the end is the OverAll-Bit-Length of the String-Representation -
    and if you want to stay in the realm of "readable String-representations" which work
    in the ASCII-Range, you can either use:
    - a 4Bit(16Char) alphabet (a wellknown one of this sort is what we know as: "HexString-representations")
    - a 5Bit(32Char) alphabet (e.g. the one I was using in my Demo-Snippets)
    - a 6Bit(64Char) alphabet (a wellknown one of this sort is what we know as: "Base64")

    So, increasing the "amount of allowed, easy to enter chars" would (IMO) max out at 6Bit per Char,
    which then would allow a 60Bit-Hash to be represented in only 10 Chars, but that's not all that
    much different from the 12Chars I needed for the 60Bit-Hash-Length in my Demo-routines.

    Quote Originally Posted by Tanner_H View Post
    Given the exponential nature of bit-representation, 60 bits would certainly be preferable to ~50. A more reasonable 128 would be even better.
    Ack, absolutely!
    (the more "allowed Chars" the OP can squeeze out of the deciders, the better)...

    Quote Originally Posted by Tanner_H View Post
    [my mentioning of the 160Bit SHA-1 Hash, as being the workhorse for identifiying
    small Content-Snippets within GIT (a worldwide used, distributed SCM-System)]...

    This is false equivocation, and is the kind of statement I worry about
    with new coders who don't understand the risks involved.

    160 bits != 52 bits.
    That's a bit unfair, after all the efforts, explanations and code-snippets I've put into this thread,
    to make it easy for new coders to calculate and understand Hash-collision-probabilities better -
    and how those can be derived from the Bit-Length (and the amount of "hashed Input-Samples").

    Here again, what I wrote in post #7:
    With a near ideal 52Bit-Hash, the probability to generate at least one collision would be:
    - in one year (10^5 samples) roughly "one in a million"
    ? GetProbForAtLeastOneCollision(10^5, 52)
    1.11021130611011E-06
    - and after 10 years (10^6 drawn samples) roughly "one in ten-thousand"
    ? GetProbForAtLeastOneCollision(10^6, 52)
    1.1101602870478E-04
    So, "it's all there" for the "new coders" in question who read this thread in the future.

    BTW, the argument the both of us have here about Hashing (or in more common terms,
    about the topic: "what to use as a Unique-ID"), is a quite wellknown one...

    Discussion then often heating up with proponents of the two realms:
    1) "let's play it safe" (no matter how large the extra-efforts or the inconveniences)
    2) "let's use applied math in form of long enough Hashes which are 'unique enough'"

    1) has the main-disadvantage that, to produce a new Unique-ID safely, you will have to "contact some storage first"
    ("storage" usually being a DB, producing a new ID in a new transaction - often over the built-in "@identity-mechanisms").
    That "needed contacting" makes this approach especially difficult, when we talk about truly distributed
    scenarios (as e.g. with GIT) - since online-connections are not always available.

    2) whilst easy to produce "without contacting anybody", has the main-disadvantage that:
    "Uniqueness cannot be truly guaranteed".

    The last sentence as it stands above is usually enough for most proponents of 1),
    to discard *any* Hash-based solutions outright.

    What most of them fail to recognize is the extremely low probability to produce
    a non-unique-ID, which is (when we talk about HashBit-Length above 100) lower
    than the already mentioned "hit by a meteorite"-scenario.

    Here's another probability-scenario, the proponents of 1) should consider
    (whilst "feeling safe" with their solutions).

    The already mentioned storage which is needed when following 1), is usually
    hosted on Hard-Disks these days - but usage of HardDisks involves a risk -
    and with regards to what we talk about here - what if you stored a nice Auto-
    generated-ID in a DB-Identity-Column residing on that disk, with the value of:
    11111
    but when you try to find this value, you get the result, that no record was
    found - because the HardDisk had a Read-Error - and passed the read-out of:
    11110
    instead to the search-routine of the DB-Engine.

    What now?
    Can't happen you might say - HardDisks have error-correction-routines built in,
    which can restore the original value which was written to the disk, even when
    a few Bits were not read correctly.

    Well, that's only one kind of read-error (known as "recoverable-read-errors", 'RRE').

    But there's another kind of read-errors in the realm of HardDisks, known as 'URE' -
    the "Unrecoverable ones").

    And there's probabilities for that kind of error in the vendor-specifications of
    most hard-disks - which are usually in the realm of 10^-15 or 10^-16.

    And then there's field-studies of "high-scale-harddisk-users" (from MS, Amazon,
    Google, CERN) which basically all state, that these URE's are likely to happen far
    more often than specified by the HD-vendors - here a link to a review at the CERN:
    http://indico.cern.ch/event/13797/se.../1?contribId=3

    Executive Summary

    We have established that low level data corruptions exist and that they have several origins.
    The error rates are at the 10^-7 level, ...
    For comparison, here again the probability of a "duplicate hash"-collision for
    a 100Bit-Hash-Algo, producing 1Mio Hashes (from different Input-Values):
    4*10^-19

    That's even lower than what the vendors (optimistically) provide for their
    unrecoverable Read-Errors.

    So, as a conclusion (given a scenario of 1Mio AutoID'ed records, accumulated in a DB over 10 years):
    It is far more likely that your DB will link you to the wrong AutoID (due to an URE on the harddisk), than
    it is to produce a Hash-Collision for the same amount of Entries with an algo that's at least using 100HashBits).

    (and the above not even considering risks due to Read-Errors, cause by faulty memory,
    or a hickup in an overheated CPU etc.).

    Olaf

  20. #20
    PowerPoster
    Join Date
    Jan 2008
    Posts
    11,073

    Re: Generating a unique fixed-length hash in VB6

    Good Grief Things are getting out of hand here


    Anything I post is an example only and is not intended to be the only solution, the total solution nor the final solution to your request nor do I claim that it is. If you find it useful then it is entirely up to you to make whatever changes necessary you feel are adequate for your purposes.

  21. #21
    PowerPoster
    Join Date
    Jun 2013
    Posts
    4,516

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by dilettante View Post
    [GIT using SHA-1 Hashing extensively, to map to Content-Values]

    Surely they don't actually do that?

    Are you sure that isn't just a hash they are using as a check on file transfer corruption and tampering?
    They do:
    https://git-scm.com/book/en/v2/Git-I...ls-Git-Objects

    As does the (SQLite-based - and quite nice and tidy) distributed SCM, called Fossil:
    http://fossil-scm.org/xfer/doc/trunk..._overview.wiki

    Here's the math again:

    Let's be extreme and say that there's a certain GIT-managed Source-Project, which has:
    - 1Mio contributors world-wide
    - each of them producing 1Mio check-in-snippets per year
    - over a time-period of 100years

    This would amount to a a total of check-in-snippets in that repo (after 100 years) of:
    1Mio * 1Mio * 100 = 10^14 content-snippets

    With the already provided formula, the probability of a collision for this amount is:
    ? GetAmountOfCollisions(10 ^ 14, 160)
    3.4 * 10^-21

    As already said in my last reply to Tanner, this is (by magnitudes) lower than the probability
    to encounter a "GIT-problem" due to reasons of e.g. an unrecoverable HD-ReadError.

    Olaf

  22. #22
    PowerPoster
    Join Date
    Jun 2013
    Posts
    4,516

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by jmsrickland View Post
    Good Grief Things are getting out of hand here
    Nah - the discussion is as OnTopic as can be...

    What's your question?

    Olaf

  23. #23
    PowerPoster
    Join Date
    Feb 2006
    Posts
    20,793

    Re: Generating a unique fixed-length hash in VB6

    I think we're down to two camps.

    One says "the data won't fit into the available bits."

    The other says "who cares, most of the time no data will get lost or corrupted, after all it is rare that you can flip a coin 10 times and get heads 10 times in a row." Never mind the cost of unraveling the mess if it ever happens I guess.

    See Gambler's fallacy.

  24. #24
    PowerPoster
    Join Date
    Jun 2013
    Posts
    4,516

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by dilettante View Post
    I think we're down to two camps.
    One says "the data won't fit into the available bits."
    And here I thought, that this was already obvious to the OP himself -
    hence his idea with hashing the data, which didn't fit into 10 (5Bit or 6Bit) chars.


    Quote Originally Posted by dilettante View Post
    The other says "who cares, most of the time no data will get lost or corrupted,
    after all it is rare that you can flip a coin 10 times and get heads 10 times in a row."
    You obviously didn't read (or understand) properly then...
    <shrug>

    Quote Originally Posted by dilettante View Post
    Never mind the cost of unraveling the mess if it ever happens I guess.
    Sigh - and also regarding the "costs" I already wrote quite some text.
    Sadly the OP doesn't reply anymore - but (IMO) the "costs" due to
    a Hash-collision will involve only a "not send mail-notification".

    Note, that I didn't *recommend* using Hashes at all to the OP.

    All my replies were just meant, to show different results for truly calculatable risks -
    depending on the Hash-Length in Bits and the amount of hashed-inputsamples
    (should the OP "go forward" with his Hashing-idea - hopefully with a bit more Chars than 10).

    As for "fitting a unique identifier into as few well-typable Chars as possible" -
    the OP could restrict the needed Chars easily to only 5 instead of 10, when an AutoID-
    based approach would be used (e.g. when using them as the PK in the [NotificationsSent]-table
    I mentioned earlier) - since 100000 Mail-Notifications per year (over 100 years) amounts
    only to 10Mio - and 10Mio can be expressed in 5Chars (a 5Bit each).

    Quote Originally Posted by dilettante View Post
    Bringing that into the discussion, clearly shows that you didn't understand
    much of what was said, sorry, since the probabilities which were mentioned,
    are clearly by leaps and bounds below those of "getting tails after tossing a coin".

    There's currently not a single known input-pair which would cause the
    same SHA-1-HashValue (despite a whole lot of specialized SHA1-generating
    attempts on specialized hardware, and despite Millions of Git-Users).
    http://stackoverflow.com/questions/3...n-demo-example

    As for Meteorite-Hits (which are far more likely than an SHA1-collision), there's
    so far apparently two known (verifyable and witnessed) examples already:
    http://www.telegraph.co.uk/news/scie...meteorite.html

    So, don't be a victim to the "Gamblers fallacy" dile, follow through on your recommendation
    and do the safe thing - (making sure you put your helmet on, the next day you go out...)

    Olaf
    Last edited by Schmidt; Aug 2nd, 2015 at 05:50 PM.

  25. #25
    Fanatic Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    948

    Re: Generating a unique fixed-length hash in VB6

    This is my Hash.
    ? hash("91110","6224-32412-12361","2016/10/17")
    CTRRQKTRVI
    ? hash("91111","6224-32412-12361","2016/10/17")
    CTRRRKTRVI
    ? hash("91110","6224-32412-12362","2016/10/18")
    BVTORLVTWH
    ? hash("91111","6224-32412-12362","2016/10/18")
    BVTOTLVTWH

    Code:
    Function hash(custID$, AcountVrId$, dt As Date) As String
    Dim a() As Byte, b() As Byte, i As Integer, bb$, p As Integer
    a() = custID$ + Right$(AcountVrId$, 6) + "1234567890"
    b() = Left$(AcountVrId$, 2) + Format$(dt, "ddmmyyyy")
    For i = 0 To 19 Step 2
    p = a(i) + b(i) + Day(dt)
    If p > 90 Then p = p - 90 + p / 2
    If p > 90 Then p = p - 90 + p / 2
    If p > 90 Then p = p - 90 + p / 2
    If p < 65 Then p = 2 * 65 - p
    If p > 90 Then p = p - 90 + p / 2
    If p > 90 Then p = p - 90 + p / 2
    If p > 90 Then p = p - 90 + p / 2
    bb$ = bb$ + Chr(p)
    Next i
    hash = bb$
    End Function

  26. #26
    PowerPoster
    Join Date
    Jun 2013
    Posts
    4,516

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by georgekar View Post
    This is my Hash.
    Sorry, George - but this is definitely not a useful Hashing.

    E.g. you don't even cover the full input-range (leaving out
    large parts of the Account-ID entirely)...

    ? hash("91110","6224-32412-12361","2016/10/17")

    I've marked the input-parts you're leaving un-handled
    (not influencing the result at all) in magenta.

    Stuff like that is usually done by concatenating all the different
    input-Parameters into a single string first (and in case of variable-
    length input-params - usually with proper delimiters between
    the different arguments, to avoid concatenating-surprises).

    After that, the concatenated result-string is fed (usually as a Byte-Array)
    into a *well-tested* (well-researched) Hash-function with known and
    proper distribution-behaviour.

    Self-developed Hash-algos are perhaps useful when you develop
    your own Hash-Collection-Class or something (where Hash-creation-
    speed is of greater interest than proper behaviour with regards to
    collision-probabilities).

    If the OP decides to use Hashing at all, I'd recommend going with the
    established algos, which have proven themselves in the wild for some years
    and were properly reviewed by more than one expert in that field.

    Olaf

  27. #27
    Fanatic Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    948

    Re: Generating a unique fixed-length hash in VB6

    Is better from nothing...
    My hash has to proved if works. In case that I was OP I do a test, for all customId. Maybe the acountId have other significant bits. But we have something to start...

  28. #28

    Thread Starter
    Member G_Hosa_Phat's Avatar
    Join Date
    May 2008
    Location
    Oklahoma City, OK
    Posts
    38

    Re: Generating a unique fixed-length hash in VB6

    I apologize for not posting over the weekend. I honestly didn't expect or intend to stir up such a hornet's nest of ideas and opinions. Honestly, with my admittedly very limited knowledge of the ins and outs of how hashing actually works behind the scenes, I had no idea that I'd be getting myself into such a mess, but I truly appreciate the discussion and all of the examples provided. This is turning out to be much more of a headache than I had hoped.

    To try to respond to some of the speculation from above, perhaps a more detailed explanation of the business rules are in order. Our company monitors current insurance status on collateralized loans for lending institutions. While we do not deal with SSN's or credit card information, the information we deal with is considered "sensitive" to the point that we have to be very careful with the specific data that we transmit electronically.

    One purpose behind the proposed "hash" code is to help the end user (the individual receiving the printed notification letter) to submit valid insurance information. The intent is that the user will be required to enter this code, which the site will use to pull a record from the database that contains the details for the specific loan & collateral for which the notification letter was generated. Only a subset of the data will actually be displayed on the Web site, but much more detail will be stored in the database record.

    The Account Numbers are strings of alphanumeric values (generally only numeric, but sometimes alpha) of varying length (anywhere from 3-4 characters up to 10-20 characters), but the Customer Numbers will be a fixed, four-character string of numeric characters. The Account Number will be unique to a specific Customer Number, but may be duplicated across different Customer Numbers (i.e., Account Number 123456 may be used by both Customer Number 9876 and Customer Number 7890). Additionally, over the term of a given loan, it is possible that multiple notification letters may be sent to the same Customer/Account Number combination. This is why I considered including a date (the date the notification letter was sent) as a part of any "hash" developed.

    As I stated above, based on the apparent difficulty of trying to develop an actual hash of the data to meet my required constraints, it may be more effective to instead attempt to simply generate a random series of characters for each notification letter and include some logic that will check the database to see if that specific combination has already been used before writing the record to the database.
    Last edited by G_Hosa_Phat; Aug 3rd, 2015 at 09:55 AM.

  29. #29
    PowerPoster
    Join Date
    Feb 2006
    Posts
    20,793

    Re: Generating a unique fixed-length hash in VB6

    The sizes and values of the inputs matters in terms of simply jamming the data into a UniqueID value. If that could be done by removing redundant bits you would have a perfect mapping and your troubles would be over. But the definitions you gave above rules that out: you have too many important bits.


    Quote Originally Posted by G_Hosa_Phat View Post
    As I stated above, based on the apparent difficulty of trying to develop an actual hash of the data to meet my required constraints, it may be more effective to instead attempt to simply generate a random series of characters for each notification letter and include some logic that will check the database to see if that specific combination has already been used before writing the record to the database.
    Randomness is a tricky thing though since you'd want a uniform distribution. But either that or a conventional hash would work as long as you can produce the desired bit-width with a good distribution to minimize collisions. There aren't any common hashes that create a value that fits well into 10 base-36 digits and you can't just trucate a 64-bit hash and expect uniform distribution. But there are techniques for doing this.

    For example see D.W.'s answer here: Are there hash algorithms with variable length output?

    That gets pretty ugly. Partially because he is striving for cryptographic purity, and party because that's just how hashes work.

    So say you come up with a useful enough random value generator. Now you need database probes to detect non-unique values and then tie-breaker logic (e.g. adding 1 or 3 or 7 and re-probing). You need to lock the entire table in order to resolve concurrency collisions!

    That gets ugly fast and will have severe performance impacts. So much so that you are probably going to be far better off using a second table of "in-use UniqueID values" in order to limit full-table locking to just that data. That will allow other readers, and writers of other fields, to operate without waiting on this big lock.

  30. #30

    Thread Starter
    Member G_Hosa_Phat's Avatar
    Join Date
    May 2008
    Location
    Oklahoma City, OK
    Posts
    38

    Re: Generating a unique fixed-length hash in VB6

    Again, perhaps I'm being dense, but I'm not sure I understand the purpose or necessity of building a separate "in-use" table for this operation. The idea is that, when a notification letter is produced, it writes a record to a completely separate table in the database with the details of that specific letter. That record needs to be uniquely identified, but once it's written, there should be no further write action taken on that record. From that point forward, the only reason that record should be accessed would be to retrieve its details when a user supplies the identification code on the Web site.

    Since my primary concern is simply ensuring a unique and non-sequential value, Perhaps I need to "ease up" on the restrictions and expand the code to 12 characters (although I would like to exclude "ambiguous" characters, as suggested by @Schmidt above, to limit the potential for data entry issues) and just let the "higher-ups" know that this is a programming requirement so that we are able to reliably retain a historical record of the data.

    Regardless, I imagine that I could simply do a
    Code:
    SELECT COUNT(code) FROM notifications WHERE code = <generated code>
    and loop the random code generation function until the return value is not greater than 0 immediately before the INSERT statement is executed. Is there a reason that this would not be a workable solution?

    EDIT: Here's a simple VB6 random code generator that I could use in the proposed loop:
    VB Code:
    1. Public Function GenerateCode() As String
    2.     Dim intLoop As Integer
    3.     Dim intChar As Integer
    4.     Dim blnContinue As Boolean
    5.     Dim strResult As String
    6.    
    7.     Randomize
    8.    
    9.     For intLoop = 1 To 12  ' --- Expanding the code to 12 characters
    10.         blnContinue = False
    11.        
    12.         Do While blnContinue = False
    13.             intChar = Int((90 - 48 + 1) * Rnd + 48)
    14.            
    15.             Select Case intChar
    16.                 Case 50 To 57  ' --- 2-9 (Exclude "0" and "1")
    17.                     blnContinue = True
    18.                 Case 65 To 72  ' --- Upper-Case A - H (Exclude "I")
    19.                     blnContinue = True
    20.                 Case 74 To 78  ' --- Upper-Case J - N (Exclude "O")
    21.                     blnContinue = True
    22.                 Case 79 To 90  ' --- Upper-Case P - Z
    23.                     blnContinue = True
    24.                 Case Else
    25.                     blnContinue = False
    26.             End Select
    27.         Loop
    28.        
    29.         strResult = strResult & Chr(intChar)
    30.     Next intLoop
    31.    
    32.     GenerateCode = strResult
    33. End Function
    Again, if I loop this until I find a code that is not currently "used" in the database, I believe it could satisfy the specific requirements without causing any substantial performance degradation. Of course, I could be very wrong in that regard, and I'll have to wait until I've done some serious testing to verify that last statement.

    Even so, I don't know that this would be any more intensive than attempting to develop and deliver a true and working hash function.
    Last edited by G_Hosa_Phat; Aug 3rd, 2015 at 12:08 PM.

  31. #31
    PowerPoster
    Join Date
    Jan 2008
    Posts
    11,073

    Re: Generating a unique fixed-length hash in VB6

    What's wrong with sequential numbers


    Anything I post is an example only and is not intended to be the only solution, the total solution nor the final solution to your request nor do I claim that it is. If you find it useful then it is entirely up to you to make whatever changes necessary you feel are adequate for your purposes.

  32. #32

    Thread Starter
    Member G_Hosa_Phat's Avatar
    Join Date
    May 2008
    Location
    Oklahoma City, OK
    Posts
    38

    Re: Generating a unique fixed-length hash in VB6

    Sequential numbers greatly increases the potential for someone to enter another user's "code" and retrieve information that is not associated with them. While, at some point, the same can be true for a randomly generated code (if all of the possible combinations within the scope have been used and none of the data has ever been purged from the system), the likelihood of someone "guessing" at such a code is much less of a concern.

    This brings up one other thing that I've been considering as well. I believe the collision potential could be mitigated to some extent if I implement a method for purging these notification messages from the database at some point in the future (say 5-7 years, or so). Honestly, I had hoped I wouldn't need to include this element, but I think it might be best for the overall solution that it seems I'm coming to here.

  33. #33
    PowerPoster
    Join Date
    Feb 2006
    Posts
    20,793

    Re: Generating a unique fixed-length hash in VB6

    If the table in question never requires updating (aside from perhaps deletes, i.e. your purging old records) then things get a bit simpler.

    However you still need to lock the entire table for any "write" operation. The issue is that when trying to do collision detection one writer could find a slot/value "free" and then before he uses it another writer could find the very same slot free as well and use it before the "first" writer actually does. The he writes his record and now you have a duplicate "unique ID."

    That goes away if there is some singleton thread that does this in a middle logic tier I suppose, but otherwise you need to consider this concurrency issue. The problem is very similar to the one Microsoft describes in How To Implement Multiuser Custom Counters in Jet 4.0 and ADO 2.1 where there is just a one-row table but the same idea applies here.

  34. #34

    Thread Starter
    Member G_Hosa_Phat's Avatar
    Join Date
    May 2008
    Location
    Oklahoma City, OK
    Posts
    38

    Re: Generating a unique fixed-length hash in VB6

    Okay, so it looks like my best solution is to forego attempting to hash the specific data and instead try to work with randomly generated values. You've all given me quite a bit to think about with regards to this implementation. Perhaps I need to seriously consider trying to find a way of off-loading the logic for this code generation to the RDBMS if possible.

    Thank you all for the tremendous amount of information and all of the suggestions and ideas. I truly appreciate all of the feedback.

  35. #35
    PowerPoster
    Join Date
    Jun 2013
    Posts
    4,516

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by G_Hosa_Phat View Post
    Okay, so it looks like my best solution is to forego attempting to hash the specific data and instead try to work with randomly generated values.
    Nah - since you apparently already have (or plan) something like I guessed
    (a DB-Table which already stores all the send Notifications, to allow a "re-mapping" later),
    you can really use a simple AutoID as the PK for this thing.

    Your concerns about the sequential behaviour of this AutoID can easily be
    remedied by a simple Encryption (which e.g. scrambles the 3 lower Bytes of
    an Int32-AutoID-DB-Field - and then maps this at maximum 24Bit-Chunk
    to the 5 Chars (5 * 5Bit) I've already mentioned in post #24.

    You might try out what I mean, by taking a look at the following code, which makes
    use of a relative strong encryption-algo, derived from the RC4 (aka arcfour), also
    applying a bit of salt - then followed by a mapping of the encrypted 24Bits to 5Chars.

    The whole thing is reversible of course and implemented in two main-functions -
    please put the following into a *.bas Module:

    Code:
    Option Explicit
    
    Public Function IdToFiveChars(ByVal ID As Long, Key As String, Salt As String) As String
    Const CharMap$ = "ABCDEFGHJKLMNPRSTUVWXYZ123456789" '<- 32Chars (5Bit-Mapping)
    Dim i As Long, BDat() As Byte, BKey() As Byte, EncrID As Long
      
      If ID <= 0 Or ID >= 2 ^ 24 Then Err.Raise vbObjectError, , "We need an ID in the range of [1 to 2^24-1]"
      
      BKey = Key   'make a ByteArray out of our Key
      BDat = Salt  'apply the salt and init the Data-ByteArray
      
      'make space for the 24Bits (3Bytes) of our passed ID in BDat ...
      ReDim Preserve BDat(UBound(BDat) + 3)
      For i = 0 To 2 '...and copy the last 3 Bytes of ID over at the end
        BDat(UBound(BDat) - 2 + i) = (ID And 255&): ID = ID \ 256
      Next
     
      SimpleStreamCipher BDat, BKey, True 'and apply the encryption on BDat
    
      For i = 0 To 2 'make a Long-value out of the 3 now encrypted Bytes at the end
        EncrID = 256 * EncrID + BDat(UBound(BDat) - i)
      Next
      
      IdToFiveChars = Space$(5) 'make space for the FiveChar-Mapping of EncrID
      For i = 0 To 4 'and convert EncrID appropriately in a Loop
        Mid$(IdToFiveChars, i + 1, 1) = Mid$(CharMap, (EncrID And 31&) + 1, 1)
        EncrID = EncrID \ 32 'shift 5 Bits to the right
      Next
    End Function
    
    Public Function FiveCharsToId(Chars As String, Key As String, Salt As String) As Long
    Const CharMap$ = "ABCDEFGHJKLMNPRSTUVWXYZ123456789" '<- 32Chars (5Bit-Mapping)
    Dim i As Long, BDat() As Byte, BKey() As Byte, EncrID As Long
        
      If Len(Chars) <> 5 Then Err.Raise vbObjectError, , "We need 5 Chars here"
      
      For i = 5 To 1 Step -1 're-map from the 5-Chars into the encrypted Long-Value
        EncrID = EncrID * 32 + InStr(1, CharMap, Mid$(Chars, i, 1)) - 1
      Next
      
      BKey = Key   'make a ByteArray out of our Key
      BDat = Salt  'apply the salt and init the Data-ByteArray
      SimpleStreamCipher BDat, BKey, True 'encrypt only the Salt first in BDat
    
      'make space for the 24Bits (3Bytes) of our EncrID at the end of salt-encr. BDat ...
      ReDim Preserve BDat(UBound(BDat) + 3)
      For i = 0 To 2 '...and copy the last 3 Bytes of ID over
        BDat(UBound(BDat) - 2 + i) = (EncrID And 255&): EncrID = EncrID \ 256
      Next
    
      SimpleStreamCipher BDat, BKey, False 'now apply the decryption on the whole of BDat
      
      For i = 0 To 2 'finally a copy-over of the 3 bytes at the end into our decrypted result-ID
        FiveCharsToId = 256 * FiveCharsToId + BDat(UBound(BDat) - i)
      Next
    End Function
    
    'this cipher is based on the RC4-algorithm (quite strong, no "play-thing")
    Public Sub SimpleStreamCipher(B() As Byte, BK() As Byte, ByVal Enc As Boolean)
    Dim i&, j&, k&, UB&, CK&, Tmp As Byte, C(255) As Byte, T(255) As Byte
     
      UB = UBound(B): CK = UBound(BK) + 1
     
      'Init Key-Arrays
      For i = 0 To 255: C(i) = i: T(i) = BK(i Mod CK): Next
      For i = 0 To 255
        j = (j + C(i) + T(i)) Mod 256
        Tmp = C(i): C(i) = C(j): C(j) = Tmp
      Next i
      'Crypt
      i = 0: j = 0: CK = 0
      For k = 0 To UB
        i = (i + 1) Mod 256
        j = (j + C(i)) Mod 256
        Tmp = C(i): C(i) = C(j): C(j) = Tmp 'swap C(i) and C(j)
        If Enc And k > 0 Then CK = B(k - 1) Else Tmp = B(k)
        B(k) = B(k) Xor C((CLng(C(i)) + C(j) + CK) Mod 256)
        If Not Enc Then CK = Tmp
      Next k
    End Sub
    Usage of the two Functions (IdToFiveChars and FiveCharsToId) of the above module
    then e.g. like this (into a Form without any Controls):

    Code:
    Option Explicit
     
    Const Key As String = "Key", Salt As String = "Salt"
    
    Private Sub Form_Load()
      Dim FiveChars As String, i As Long
      For i = 1 To 3 'test 3 IDs at the start of the allowed range
        FiveChars = IdToFiveChars(i, Key, Salt)
        Debug.Print i, FiveChars, FiveCharsToId(FiveChars, Key, Salt)
      Next
      For i = 2 ^ 24 - 3 To 2 ^ 24 - 1 'and 3 IDs at the end of the allowed range
        FiveChars = IdToFiveChars(i, Key, Salt)
        Debug.Print i, FiveChars, FiveCharsToId(FiveChars, Key, Salt)
      Next
     End Sub
    The above shows the possible ID-Range and prints the following into the Immediate-Window:
    Code:
     1            R5YBL          1 
     2            PWLKF          2 
     3            NMNTK          3 
     16777213     VN59J          16777213 
     16777214     UNA8P          16777214 
     16777215     T6P2A          16777215
    BTW, I've already tested the whole range (1 to 2^24-1) of possible ID-Values
    successfully in a Loop - getting back exactly what I've put in (no duplicates, no errors) -
    when you want to try that yourself, I suggest compiling native (because it takes a bit, roughly 1 Minute).

    And just for completeness - the creation of a new record, then retrieving its AutoID
    can easily (and safely) be accomplished by wrapping the task in a transaction.

    Below is an extended Demo, which requires the same *.bas-Module with the two ID-Mapper-Functions -
    and in addition a ListBox-Control (List1) on the Form, as well as a checked in Project-Reference to ADO.

    Code:
    Option Explicit
    
    Const JetPrefix = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
    
    Private DBFileName As String, Cnn As ADODB.Connection
    
    Const Key As String = "Key", Salt As String = "Salt"
    
    Private Sub Form_Load()
      'simple, DB-independent Test beforehand
      Dim FiveChars As String, i As Long
      For i = 1 To 3 'test 3 IDs at the start of the allowed range
        FiveChars = IdToFiveChars(i, Key, Salt)
        Debug.Print i, FiveChars, FiveCharsToId(FiveChars, Key, Salt)
      Next
      For i = 2 ^ 24 - 3 To 2 ^ 24 - 1 'and 3 IDs at the end of the allowed range
        FiveChars = IdToFiveChars(i, Key, Salt)
        Debug.Print i, FiveChars, FiveCharsToId(FiveChars, Key, Salt)
      Next
      
      Caption = "Click the ListBox-Entries"
      DBFileName = Environ("temp") & "\Test.mdb"   'define a FileName
      
      On Error Resume Next
        Kill DBFileName 'ensure we kill an already existing File in case it exists
      On Error GoTo 0
      
      CreateObject("ADOX.Catalog").Create JetPrefix & DBFileName '...create a new *.mdb
      
      Set Cnn = New ADODB.Connection 'create and open the Connection
          Cnn.CursorLocation = adUseClient
          Cnn.Open JetPrefix & DBFileName
      
      'create an appropriate Table in the Test-DB (the SQL-DDL could be used in both MS-SQLServer and JET)
      Cnn.Execute "Create Table Notifications(ID Int Identity, CustNr nVarChar(5), AccNr nVarChar(25), NotifyDate DateTime)"
      
      Dim NotifyDate As Date
          NotifyDate = Now 'same Date for all of the 3 new records we add below
      List1.AddItem AddNotification("1234", "123-456", NotifyDate)
      List1.AddItem AddNotification("2345", "234-567", NotifyDate)
      List1.AddItem AddNotification("3456", "345-678", NotifyDate)
    End Sub
    
    Private Sub List1_Click()
      With GetRs("Select * From Notifications Where ID=" & FiveCharsToId(List1.Text, Key, Salt))
        MsgBox !ID & vbCrLf & !CustNr & vbCrLf & !AccNr & vbCrLf & !NotifyDate
      End With
    End Sub
    
    Function AddNotification(CustNr$, AccNr$, NotifyDate As Date, Optional ID As Long) As String
      On Error GoTo RollBack
      Cnn.BeginTrans 'wrap the whole Operation in a Transaction
      
      With GetRs("Select * From Notifications Where 1=0")
        .AddNew 'apply the 3 Values into a new Record
            !CustNr = CustNr
            !AccNr = AccNr
            !NotifyDate = NotifyDate
        .Update 'and update it
      End With
      
      'now ask the DB about the last Auto-ID it has given in this Transaction
      ID = Cnn.Execute("Select @@identity")(0)
     
      Cnn.CommitTrans 'no failures so far, so we commit it
      
      AddNotification = IdToFiveChars(ID, Key, Salt) 'return the 5-Chars as a result
    
      Exit Function
      
    RollBack:
      Cnn.RollbackTrans
    End Function
     
    Function GetRs(SQL As String) As ADODB.Recordset
      Set GetRs = New ADODB.Recordset
          GetRs.Open SQL, Cnn, adOpenStatic, adLockOptimistic
    End Function
    HTH

    Olaf

  36. #36

    Thread Starter
    Member G_Hosa_Phat's Avatar
    Join Date
    May 2008
    Location
    Oklahoma City, OK
    Posts
    38

    Re: Generating a unique fixed-length hash in VB6

    @Schmidt Thank you for your suggestion. I believe I've found a workable solution by creating a custom ID field that generates a random alphanumeric code in the RDBMS so that, when a new record is inserted, the database will automatically take care of this part of it for me (rather than having the application logic generate something and hope that I don't run into a concurrency issue). The function that's written in the database to generate the unique ID will actually loop until it finds one that isn't in use before committing the INSERT.

    Thank you all again for the tremendous discussion. It has really shed some light on the whole issue and helped me to work through a number of things rather than banging my head against a wall. I truly appreciate all of the assistance.

  37. #37
    Fanatic Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    948

    Re: Generating a unique fixed-length hash in VB6

    Sure you do...and this is a more clever system..
    You have a simple autonumber and you have to change positions by a standard way to make number to a fake random id.
    so if we have 4 chars say 1435 we have to change positions like this 5314. You may have a random increment by one or two or three so the next to 1435 maybe can be 1437 so you get 7314. No customer can see the others "Serial number", so if he try to paste a random one to your automatic system maybe he put a 5315. You see now you go to other side, one small code to be a bigger one (as serial number or something like that), and now you can be creative. How many mails you send a day? 1000, 2000, 3000.. With a random "increment" of 3 you need just 4 chars, and you can put 3 or 4 random chars between original. So you have a function MailCode$(X) to give you each time different code( with random inside chars) and you can read and find in one step without search the X to test with name and address of the customer.
    Last edited by georgekar; Aug 4th, 2015 at 10:20 AM.

  38. #38
    PowerPoster
    Join Date
    Jan 2008
    Posts
    11,073

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by G_Hosa_Phat View Post
    The function that's written in the database to generate the unique ID will actually loop until it finds one that isn't in use before committing the INSERT.
    What happens if it doesn't find one that isn't in use (if that is possible at all)


    Anything I post is an example only and is not intended to be the only solution, the total solution nor the final solution to your request nor do I claim that it is. If you find it useful then it is entirely up to you to make whatever changes necessary you feel are adequate for your purposes.

  39. #39

    Thread Starter
    Member G_Hosa_Phat's Avatar
    Join Date
    May 2008
    Location
    Oklahoma City, OK
    Posts
    38

    Re: Generating a unique fixed-length hash in VB6

    Quote Originally Posted by jmsrickland View Post
    What happens if it doesn't find one that isn't in use (if that is possible at all)
    Actually, the way that it's currently written, it will continue to loop until it finds an available ID. There, of course, is the potential for it to get completely hung up in that loop if it keeps generating the same (random) values, and I may have to do some additional "tweaking" to prevent that from happening, but this is a good starting point. This is still a "work in progress", and I have a bit of time to finalize everything before any of this goes into production.

  40. #40
    PowerPoster
    Join Date
    Feb 2006
    Posts
    20,793

    Re: Generating a unique fixed-length hash in VB6

    Normally when you find a collision you'd increment the value by 1 or a random small value > 0 and probe again.

Page 1 of 2 12 LastLast

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Featured


Click Here to Expand Forum to Full Width