Create random value from string.

**couttsj** · Oct 2nd, 2021, 04:57 PM

I am trying to create a random byte from a string to use as a seed. What I have so far is this:

Code:

Private Declare Function WideCharToMultiByte Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, ByVal lpDefaultChar As Long, ByVal lpUsedDefaultChar As Long) As Long

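Private Function Seed(sInput As String) As Byte
    Dim N%
    Dim bInput() As Byte
    Dim bResult As Byte
    If Len(sInput) < 1 Then Exit Function   'StrToUtf8 returns no array for an empty string
    bInput = StrToUtf8(sInput)
    'Fold every UTF-8 byte into a single byte
    For N% = 0 To UBound(bInput)
        bResult = bResult Xor bInput(N%)
    Next N%
    'Extra Xor with the 9th byte; skipped when the input is shorter than 9 bytes
    If UBound(bInput) >= 8 Then bResult = bResult Xor bInput(8)
    Debug.Print bResult
    Seed = bResult
End Function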

Private Function StrToUtf8(strInput As String) As Byte()
    Const CP_UTF8 = 65001
    Dim nBytes As Long
    Dim bBuffer() As Byte
    If Len(strInput) < 1 Then Exit Function
    'Get length in bytes *including* terminating null
    nBytes = WideCharToMultiByte(CP_UTF8, 0&, ByVal StrPtr(strInput), -1, 0&, 0&, 0&, 0&)
    ReDim bBuffer(nBytes - 2)  'Remove terminating byte
    nBytes = WideCharToMultiByte(CP_UTF8, 0&, ByVal StrPtr(strInput), -1, ByVal VarPtr(bBuffer(0)), nBytes - 1, 0&, 0&)
    StrToUtf8 = bBuffer
End Function

The problem is that, because the string I used was all lowercase ASCII, the Seed routine does not make full use of the byte: the results range from 8 to 30 instead of 1 to 255.

Is there a better way?

J.A. Coutts

**wqweto** · Oct 3rd, 2021, 04:22 AM

Originally Posted by couttsj

Is there a better way?

Of course there is, and you've already used one in your TLS implementation. It's called a "key derivation function", in which a short key (like a password or a "master key") is *expanded* into a variable-sized random sequence of bytes, e.g. two sets of 32 bytes each for traffic keys, two sets of 12 bytes each for IVs and 16 bytes for a MAC key (there is no MAC with AEAD ciphers, but you get the point). TLS 1.3 uses the HKDF algorithm for this purpose, which is based on HMAC hashes.

You can use the BCryptDeriveKeyPBKDF2 API function (it uses the PBKDF2 algorithm, which again uses HMAC, as described in RFC 2898) to expand passwords to a variable-length random byte array (which does not increase the password entropy, btw), like in this sample code: Simple AES 256-bit password protected encryption

Code:

    '--- generate RFC 2898 based derived key
    On Error GoTo EH_Unsupported '--- CNG API missing on XP
    hResult = BCryptOpenAlgorithmProvider(uCrypto.hPbkdf2Alg, StrPtr("SHA1"), StrPtr(MS_PRIMITIVE_PROVIDER), BCRYPT_ALG_HANDLE_HMAC_FLAG)
    On Error GoTo 0
    ReDim baDerivedKey(0 To 2 * lKeyLen + 1) As Byte
    On Error GoTo EH_Unsupported '--- PBKDF2 API missing on Vista
    hResult = BCryptDeriveKeyPBKDF2(uCrypto.hPbkdf2Alg, baPass(0), UBound(baPass) + 1, baSalt(0), UBound(baSalt) + 1, 1000, 0, baDerivedKey(0), UBound(baDerivedKey) + 1, 0)
    On Error GoTo 0

. . . where baPass is the password and the output baDerivedKey receives expanded key to whatever size you want (still keeping the original password entropy though).

cheers,
</wqw>

**The trick** · Oct 3rd, 2021, 05:12 AM

HashData
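For reference, HashData is exported by shlwapi.dll and hashes an arbitrary byte buffer into an output buffer of whatever size you ask for. A minimal sketch of using it for a one-byte seed, assuming the raw UTF-16 bytes of the string are acceptable as input (the name SeedFromHashData is made up for illustration):

```vb
Option Explicit

Private Declare Function HashData Lib "shlwapi" (ByRef pbData As Any, ByVal cbData As Long, ByRef pbHash As Any, ByVal cbHash As Long) As Long

' Hypothetical helper: hashes the string's UTF-16 bytes down to a single byte.
Public Function SeedFromHashData(sInput As String) As Byte
    Dim bIn() As Byte
    Dim bOut(0) As Byte                 ' one-byte output buffer
    If LenB(sInput) = 0 Then Exit Function
    bIn = sInput                        ' VB6 copies the raw UTF-16LE bytes
    ' HashData returns S_OK (0) on success
    If HashData(bIn(0), UBound(bIn) + 1, bOut(0), 1) = 0 Then
        SeedFromHashData = bOut(0)
    End If
End Function
```

The same string always gives the same byte. HashData is a general-purpose hash rather than a cryptographic one, so it is only suitable when the seed does not need to hide the input.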

**Elroy** · Oct 3rd, 2021, 09:02 AM

To generate a truly random string (ANSI), I'd probably do something like the following:

Code:


Option Explicit
'
Private Declare Function CryptAcquireContextW Lib "advapi32.dll" (hProv As Long, ByVal pszContainer As Long, ByVal pszProvider As Long, ByVal dwProvType As Long, ByVal dwFlags As Long) As Long ' Win32 BOOL is 4 bytes, so As Long rather than As Boolean.
Private Declare Function CryptGenRandom Lib "advapi32.dll" (ByVal hProv As Long, ByVal dwlen As Long, pbBuffer As Any) As Long
Private Declare Function CryptReleaseContext Lib "advapi32.dll" (ByVal hProv As Long, ByVal dwFlags As Long) As Long
'

Public Function RandomAnsiString(iLen As Long) As String
    ' Generates random ANSI strings with characters in the full range of &h00 to &hff.
    Dim hCrypt                      As Long
    Const PROV_RSA_FULL             As Long = 1&
    Const CRYPT_VERIFYCONTEXT       As Long = &HF0000000
    '
    Dim bb()                        As Byte
    ReDim bb(iLen - 1&)
    '
    Call CryptAcquireContextW(hCrypt, 0&, 0&, PROV_RSA_FULL, CRYPT_VERIFYCONTEXT)   ' Initialize advapi32.
    Call CryptGenRandom(hCrypt, iLen, bb(0))                                        ' Get our random bytes.
    Call CryptReleaseContext(hCrypt, 0&)                                            ' Turn off advapi32.
    '
    RandomAnsiString = StrConv(bb, vbUnicode)                                       ' Put ANSI bytes into Unicode VB6 string.
End Function

I was curious, so I generated a "few" random characters and then saw what the frequency distribution looked like. Here's the code I used to generate the characters:

Code:


Private Sub Form_Load()
    ' Test "flatness" of single ANSI characters from above function.
    Dim i As Long
    Const TestCount As Long = 50000

    ' Generate some random ANSI values.
    Dim sa(TestCount) As String
    For i = 1& To TestCount
        sa(i) = RandomAnsiString(1&)
    Next

    ' Count how many we got of each.
    Dim bb(255&)
    Dim j As Long
    For i = 1& To TestCount
        j = Asc(sa(i))
        bb(j) = bb(j) + 1&
    Next

    ' Dump frequencies.  (In two passes, so we don't overflow the Immediate window.)
    For i = 0& To 127&
        Debug.Print i, bb(i)
    Next
    Stop
    For i = 128& To 255&
        Debug.Print i, bb(i)
    Next
    Stop
End Sub

And here are the frequencies it generated after one pass:

Code:

 0            185 
 1            184 
 2            198 
 3            204 
 4            216 
 5            183 
 6            186 
 7            212 
 8            177 
 9            202 
 10           193 
 11           185 
 12           187 
 13           201 
 14           176 
 15           180 
 16           200 
 17           188 
 18           172 
 19           189 
 20           202 
 21           218 
 22           193 
 23           182 
 24           172 
 25           187 
 26           222 
 27           202 
 28           185 
 29           189 
 30           195 
 31           184 
 32           199 
 33           204 
 34           194 
 35           190 
 36           220 
 37           215 
 38           189 
 39           184 
 40           188 
 41           186 
 42           185 
 43           217 
 44           194 
 45           200 
 46           199 
 47           202 
 48           197 
 49           223 
 50           190 
 51           244 
 52           195 
 53           198 
 54           186 
 55           180 
 56           186 
 57           202 
 58           182 
 59           179 
 60           202 
 61           178 
 62           194 
 63           215 
 64           195 
 65           186 
 66           183 
 67           198 
 68           192 
 69           168 
 70           189 
 71           212 
 72           194 
 73           206 
 74           225 
 75           205 
 76           168 
 77           190 
 78           193 
 79           216 
 80           232 
 81           178 
 82           185 
 83           195 
 84           183 
 85           168 
 86           211 
 87           193 
 88           166 
 89           208 
 90           173 
 91           193 
 92           208 
 93           189 
 94           201 
 95           192 
 96           192 
 97           204 
 98           215 
 99           209 
 100          198 
 101          209 
 102          207 
 103          181 
 104          199 
 105          190 
 106          184 
 107          204 
 108          187 
 109          193 
 110          221 
 111          196 
 112          203 
 113          197 
 114          205 
 115          176 
 116          217 
 117          205 
 118          220 
 119          197 
 120          173 
 121          200 
 122          199 
 123          215 
 124          197 
 125          206 
 126          202 
 127          202 
 128          187 
 129          170 
 130          187 
 131          217 
 132          210 
 133          205 
 134          176 
 135          198 
 136          180 
 137          192 
 138          186 
 139          192 
 140          204 
 141          203 
 142          204 
 143          212 
 144          217 
 145          196 
 146          181 
 147          212 
 148          204 
 149          180 
 150          188 
 151          234 
 152          200 
 153          218 
 154          196 
 155          208 
 156          172 
 157          210 
 158          189 
 159          187 
 160          183 
 161          208 
 162          200 
 163          211 
 164          195 
 165          176 
 166          175 
 167          185 
 168          206 
 169          208 
 170          189 
 171          176 
 172          202 
 173          197 
 174          224 
 175          197 
 176          188 
 177          201 
 178          182 
 179          202 
 180          195 
 181          206 
 182          192 
 183          183 
 184          201 
 185          177 
 186          189 
 187          203 
 188          181 
 189          194 
 190          210 
 191          194 
 192          202 
 193          206 
 194          191 
 195          215 
 196          202 
 197          197 
 198          190 
 199          191 
 200          195 
 201          176 
 202          202 
 203          184 
 204          200 
 205          194 
 206          195 
 207          179 
 208          227 
 209          172 
 210          184 
 211          195 
 212          196 
 213          194 
 214          204 
 215          170 
 216          187 
 217          188 
 218          171 
 219          181 
 220          185 
 221          185 
 222          186 
 223          187 
 224          200 
 225          183 
 226          224 
 227          200 
 228          215 
 229          200 
 230          203 
 231          168 
 232          209 
 233          187 
 234          191 
 235          196 
 236          192 
 237          194 
 238          187 
 239          191 
 240          200 
 241          187 
 242          216 
 243          192 
 244          183 
 245          188 
 246          202 
 247          174 
 248          146 
 249          206 
 250          208 
 251          213 
 252          204 
 253          208 
 254          193 
 255          200

I didn't perform any statistical test, but that looks like a fairly "flat" (uniform) distribution to me.
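If a rough number is wanted, Pearson's chi-square statistic against a uniform expectation is the usual first check. A minimal sketch, assuming the counts sit in an array like bb() above (ChiSquareUniform is a made-up name, not part of the test code):

```vb
' Hypothetical helper: Pearson chi-square statistic against a uniform expectation.
Public Function ChiSquareUniform(vCounts As Variant, ByVal lTotal As Long) As Double
    Dim i As Long
    Dim dExpected As Double
    Dim dDiff As Double
    ' Expected count per bin, e.g. 50000 / 256 = 195.3125
    dExpected = lTotal / (UBound(vCounts) - LBound(vCounts) + 1)
    For i = LBound(vCounts) To UBound(vCounts)
        dDiff = vCounts(i) - dExpected
        ChiSquareUniform = ChiSquareUniform + dDiff * dDiff / dExpected
    Next i
End Function
```

It could be called as Debug.Print ChiSquareUniform(bb, TestCount) just before the first Stop. With 256 bins there are 255 degrees of freedom, so for uniform data the statistic should land around 255, give or take a few multiples of Sqr(2 * 255) (about 22.6); a value far above that would point to a non-uniform source.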

p.s. If you wanted Unicode characters, it wouldn't take much adjusting of that function to get those. Basically, double the size of the bb() array, double the size of the random data you request, and then just directly assign the bb() array to the returned string value.

**Eduardo-** · Oct 3rd, 2021, 10:36 AM

Originally Posted by Elroy

To generate a truly random string...

I didn't follow all the details of the thread, but I wanted to comment on a couple of things:

1) By definition, if the OP wants a "random value" from a String, the values returned must be tied to the given String, so they won't be "random".
I mean, the same String must produce the same "random value".

2) Does "truly random" really exist, or only values produced by complex processes that we don't know how to predict? (I mean, in the Universe).
Well, don't worry about this second one.

**Elroy** · Oct 3rd, 2021, 11:17 AM

Yeah, I wasn't clear on what the OP wanted either.

I initially read it as wanting a random string that could be used as a seed. I thought about commenting that that's typically not how a seed is used, but maybe I misunderstood.

Yes, if we're just wanting a seed "from a string", and we want the same string to generate the same seed each time, then yeah, a hash sounds like the correct answer (as Trick noted).

**wqweto** · Oct 3rd, 2021, 11:24 AM

Hashing is when you have a long input (e.g. a file) and produce short output (e.g. a 32 byte hash).

"Expanding" is the opposite of hashing, i.e. deriving a long key (100-1000 bytes) from a short key (a password or a master secret), which of course cannot produce more randomness than the initial short key already possesses (e.g. a password of only lowercase Latin letters reduces this a lot), but the point is not to lose any randomness in the process.

This is what PBKDF2 can be used for (there are many other algorithms), and it also has a parameter for the number of iterations to perform, which deliberately slows down the whole process so the same algorithm can be used to store password hashes as well.

cheers,
</wqw>

**Elroy** · Oct 3rd, 2021, 12:07 PM

Ok, just for grins, I made a "hashing" algorithm. Couttsj, if you're not concerned with super privacy between your string and the hash (such as something like SHA-3), then this will do for you:

Code:


Option Explicit
'
Private Declare Function GetMem4 Lib "msvbvm60" (ByRef Source As Any, ByRef Dest As Any) As Long ' Always ignore the returned value, it's useless.

Public Function SeedFromAnsiString(sAnsi As String) As Single
    '
    ' Transfer string to bytes, forcing ANSI, ignoring second byte of each character.
    Dim bb() As Byte
    bb = StrConv(sAnsi, vbFromUnicode)
    '
    ' Make sure length is a multiple of 4.
    Dim iLen As Long
    If Len(sAnsi) = 0& Then iLen = 3& Else iLen = Len(sAnsi) + 2&
    While (iLen + 1&) Mod 4&: iLen = iLen - 1&: Wend    ' Zero based, multiple of 4.
    If UBound(bb) <> iLen Then ReDim Preserve bb(iLen)  ' Make adjustment, if necessary.
    '
    ' Make Longs and XOR them.
    Dim iHash As Long
    Dim iHold As Long
    Dim i As Long
    For i = 0& To iLen Step 4&
        GetMem4 bb(i), iHold
        iHash = iHash Xor iHold
    Next
    '
    ' Move iHash into a single, and make sure it's a valid IEEE Single.
    ' We check the Inf and NaN possibility first, as it's easier as a Long.
    ' Basically, if the &H7F800000 are all on, it's either Inf or NaN.
    If (iHash And &H7F800000) = &H7F800000 Then iHash = iHash And Not &H7F800000
    GetMem4 iHash, SeedFromAnsiString    ' And now we can move it, creating a hash (non Inf, non NaN) Single.
End Function

I did make a few assumptions:

1) That we're dealing with ANSI strings.

2) That the VB6 Randomize seed is a Single. I researched and couldn't find a definitive answer for this, but I did find some highly suggestive comments. So, that's the hash that I generated (a Single).

Just as some further notes, I tested this "seed as Single" assumption a bit. If you do the following in the Immediate window, you get the same starting random number:

Code:

rnd(-1): randomize 5.0000000001: ? rnd
 0.8944274 
rnd(-1): randomize 5.0000000002: ? rnd
 0.8944274 
rnd(-1): randomize 5.0000000003: ? rnd
 0.8944274

That certainly suggests that it's not a Double. And furthermore, if you reduce the accuracy within the range of a Single, it does change:

Code:

rnd(-1): randomize 5.0001: ? rnd
 0.300952 
rnd(-1): randomize 5.0002: ? rnd
 8.137804E-02 
rnd(-1): randomize 5.0003: ? rnd
 0.484973

So, I'm thinking that a "Single" for the seed is correct.

**couttsj** · Oct 3rd, 2021, 03:25 PM

It appears that I did not describe the purpose adequately. The seed is used to shuffle the string bytes, and the shuffle must be reversible. That is to say, the old string must produce a seed that is used to create a new shuffled string, and can be used to unshuffle the new string. That is the reason for the extra Xor function, as the same bytes in a different order produce the same seed.
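One common way to get a reversible shuffle from a seed is a Fisher-Yates shuffle whose swap positions come from a seeded generator, undone by replaying the same swaps in the opposite order. The sketch below only illustrates that idea and is not the code from the attached sample; it assumes a zero-based byte array and VB6's own Rnd, and all procedure names are made up.

```vb
Option Explicit

' Fill lSwap(i) with a partner index in 0..i, repeatable for a given seed.
Private Sub BuildSwaps(ByVal bSeed As Byte, ByVal lUpper As Long, lSwap() As Long)
    Dim i As Long
    ReDim lSwap(0 To lUpper)
    Rnd -1                  ' Rnd with a negative argument, then Randomize, repeats the sequence
    Randomize bSeed
    For i = lUpper To 1 Step -1
        lSwap(i) = Int(Rnd * (i + 1))
    Next i
End Sub

Private Sub SwapBytes(bData() As Byte, ByVal i As Long, ByVal j As Long)
    Dim bTmp As Byte
    bTmp = bData(i): bData(i) = bData(j): bData(j) = bTmp
End Sub

Public Sub ShuffleBytes(bData() As Byte, ByVal bSeed As Byte)
    Dim lSwap() As Long
    Dim i As Long
    If UBound(bData) < 1 Then Exit Sub      ' nothing to shuffle
    BuildSwaps bSeed, UBound(bData), lSwap
    For i = UBound(bData) To 1 Step -1      ' apply swaps from the top down
        SwapBytes bData, i, lSwap(i)
    Next i
End Sub

Public Sub UnshuffleBytes(bData() As Byte, ByVal bSeed As Byte)
    Dim lSwap() As Long
    Dim i As Long
    If UBound(bData) < 1 Then Exit Sub
    BuildSwaps bSeed, UBound(bData), lSwap
    For i = 1 To UBound(bData)              ' same swaps, opposite order
        SwapBytes bData, i, lSwap(i)
    Next i
End Sub
```

Calling UnshuffleBytes with the same seed restores the original order, because each swap is its own inverse. With a one-byte seed there are at most 256 distinct shuffles, whatever the length of the string.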

In addition, I have to duplicate this process in JavaScript, and some browsers restrict the use of crypto functions to HTTPS only (eg. Google Chrome & others). I actually had it working, but I wanted to test the function to see how extendable it was. Testing was much easier to do in the VB6 version, and what I found was that sometimes the seed would get stuck on the same result, and sometimes it would actually go to zero.

This is hard for me to explain, so I have created a sample (attached). I did notice that I got a wider seed calculation when at least one capital character is used.

J.A. Coutts

**baka** · Oct 4th, 2021, 03:26 AM

so, you want to create a seed that can both "shuffle" and "un-shuffle".
instead of "seed" we could call it "key": we generate a key that shuffles a string, and the same key can un-shuffle it.

the easiest way would be to make the string into bytes, the same with the "key"
using xor method, you shuffle the bytes together with the key, like

byte(x) = byte(x) Xor byte(y) Xor key(z)

and of course x/y/z can be chosen in any way as long as it can be mirrored, like an encoding/decoding function. that should shuffle it quite well.
the encoding/decoding are mirrors. the function I use to encode the string is not the same as the decode, even if they both use the same key.
but it all depends on the complexity of the encoding of course.

**wqweto** · Oct 4th, 2021, 07:01 AM

This now looks more like a one-time pad than a key.

The same pad is used to XOR the plaintext both on encode to produce ciphertext and on decode to produce back the original message.
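A minimal sketch of that symmetry, assuming zero-based byte arrays and a pad of at least one byte. The pad is reused cyclically here when it is shorter than the data, which a real one-time pad would not allow, and XorWithPad is a made-up name:

```vb
' Hypothetical helper: XORs the data in place with the pad.
' Running it a second time with the same pad restores the original bytes.
Public Sub XorWithPad(bData() As Byte, bPad() As Byte)
    Dim i As Long
    Dim lPadLen As Long
    lPadLen = UBound(bPad) + 1
    For i = 0 To UBound(bData)
        bData(i) = bData(i) Xor bPad(i Mod lPadLen)
    Next i
End Sub
```

Note that XOR changes the byte values rather than their positions, so strictly speaking this encodes the string instead of shuffling it.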

cheers,
</wqw>

**couttsj** · Oct 5th, 2021, 09:49 AM

I ran several text strings with varying lengths using seed values 1 through 254, and there were no duplications in the shuffled strings. Then I ran the string "UserIDpassword" through 1270 loops with no duplications. The occasional repeat is not a problem, but with 2.631308369 E+35 possible permutations, that should be a rarity. It all depends on getting a diverse seed calculation, and the current routine I am using does not provide that.

J.A. Coutts

**Elroy** · Oct 5th, 2021, 11:08 AM

Just FYI, there are nowhere near 2.631308369 E+35 possible permutations for a seed. At most, there are 2^32 - 2^24 permutations (4,278,190,080).

Where did I get that number? From everything I can tell, the seed is a Single, which is four bytes. The possible permutations for four bytes is 2^32. However, when all of the 8 exponent bits are on, it's either NaN or Inf (which I assumed to be invalid seeds), and that's the minus 2^24. Admittedly, I didn't check to see what happens with different NaN (or Inf) seeds. Also, I didn't check any sub-normal values to make sure they're acceptable as seeds either (with different sub-normals resulting in a different seed, different starting random number).

**Elroy** · Oct 5th, 2021, 12:05 PM

It appears that Inf (infinity) does provide a valid seed. However, NaN values cause an overflow error when we attempt to use them as a seed.

So, with this, I guess the answer to the number of valid seeds (resulting in different starting points) is 2^32 - 2^24 + 1

Testing Code:

Code:


Option Explicit
Private Declare Function GetMem4 Lib "msvbvm60" (ByRef Source As Any, ByRef Dest As Any) As Long ' Always ignore the returned value, it's useless.

Private Sub Form_Load()
    Const NanOrInf As Long = &H7F800000
    Dim f As Single
    Dim i As Long


    i = NanOrInf:           GetMem4 i, f    ' Creates an Inf Single.
    Rnd -1: Randomize f
    Debug.Print Rnd                         ' Reports 0.5835753


    i = NanOrInf Or &H1&:   GetMem4 i, f    ' Creates a NaN Single.
    Rnd -1: Randomize f
    Debug.Print Rnd                         ' Overflow error.


    Unload Me
End Sub

And sub-normals seem to work fine.
Test Code:

Code:


Option Explicit
Private Declare Function GetMem4 Lib "msvbvm60" (ByRef Source As Any, ByRef Dest As Any) As Long ' Always ignore the returned value, it's useless.

Private Sub Form_Load()
    Dim f As Single
    Dim i As Long


    i = &H1&:       GetMem4 i, f    ' Creates a Sub-Normal Single.
    Rnd -1: Randomize f
    Debug.Print Rnd                 ' Reports 0.1927062

    i = &H2&:       GetMem4 i, f    ' Creates a Sub-Normal Single.
    Rnd -1: Randomize f
    Debug.Print Rnd                 ' Reports 0.4419737

    i = &H1001&:    GetMem4 i, f    ' Creates a Sub-Normal Single.
    Rnd -1: Randomize f
    Debug.Print Rnd                 ' Reports 0.1956359


    Unload Me
End Sub

Personally, I'm guessing that the Rnd algorithm just does everything as an IEEE Single. The Rnd function certainly returns a Single, so this seems reasonable.

**Elroy** · Oct 5th, 2021, 12:24 PM

Also, it'd be easy to modify my code in post #8 to be more random. The first thought that comes to my mind is a bit-shift-and-wrap based on the character position of each character in the input string.
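As a rough sketch of that idea, on a single byte rather than the Long used in post #8: rotate each byte left by its position (mod 8) before XORing it in. VB6 has no shift operators, so the rotation is done with multiplication and integer division. The names are made up for illustration.

```vb
Option Explicit

' Rotate a byte left by n bits (n is taken modulo 8).
Private Function RotL8(ByVal b As Byte, ByVal n As Long) As Byte
    n = n And 7
    If n = 0 Then
        RotL8 = b
    Else
        ' High bits that fall off the top are OR-ed back in at the bottom.
        RotL8 = CByte(((b * 2 ^ n) And &HFF&) Or (b \ 2 ^ (8 - n)))
    End If
End Function

' Hypothetical helper: position-dependent XOR fold of a byte array.
Public Function RotatedXorSeed(bInput() As Byte) As Byte
    Dim i As Long
    For i = LBound(bInput) To UBound(bInput)
        RotatedXorSeed = RotatedXorSeed Xor RotL8(bInput(i), i)
    Next i
End Function
```

Unlike a plain XOR, this is order-sensitive: "ab" and "ba" give different results. Whether that is desirable depends on whether the seed has to be recomputed from the shuffled string.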

Also, regarding JavaScript, I'm sure you can find a way to copy four bytes from one variable type to another. But a line-for-line translation might be difficult (if not impossible). For one, JavaScript variables are very loosely typed. Basically, they're all like Variants (with specific typing options). Also, it appears that Math.random (in JavaScript) returns an IEEE Double (in one of those loosely typed variables). Therefore, it's essentially going to be a completely different algorithm, with seeds behaving differently and the returned sequence being different.

**couttsj** · Oct 5th, 2021, 12:45 PM

posted in error!

**couttsj** · Oct 5th, 2021, 12:49 PM

Originally Posted by Elroy

Just FYI, there are nowhere near 2.631308369 E+35 possible permutations for a seed. At most, there are 2^32 - 2^24 permutations (4,278,190,080).

Where did I get that number? From everything I can tell, the seed is a Single, which is four bytes. The possible permutations for four bytes is 2^32. However, when all of the 8 exponent bits are on, it's either NaN or Inf (which I assumed to be invalid seeds), and that's the minus 2^24. Admittedly, I didn't check to see what happens with different NaN (or Inf) seeds. Also, I didn't check any sub-normal values to make sure they're acceptable as seeds either (with different sub-normals resulting in a different seed, different starting random number).

You are right. That permutation count is wrong. I used an online calculator for 32 characters when I was using an SHA-256 hash. For the 14 character string I am now using, the number is 87,178,291,200. The seed is a single byte, and I was able to get a range of 4 to 254 by adding weight to each character in the string.

Code:

   bResult = bResult Xor bInput(N%) + CByte(N%)

It will take more testing, but hopefully this will provide the results I am looking for.
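One thing to watch in that line: + binds tighter than Xor in VB6, so it computes bResult Xor (bInput(N%) + CByte(N%)), and adding two Bytes raises an overflow error (run-time error 6) once the sum exceeds 255. That cannot happen with 14 ASCII characters, but it can with UTF-8 bytes of &H80 and above at higher positions. A minimal sketch of a wrap-around variant, assuming wrapping modulo 256 is acceptable (WeightedSeed is a made-up name):

```vb
' Hypothetical helper: position-weighted XOR fold that cannot overflow.
Public Function WeightedSeed(bInput() As Byte) As Byte
    Dim N As Long
    Dim bResult As Byte
    For N = 0 To UBound(bInput)
        ' Add in Long arithmetic, then keep only the low 8 bits
        bResult = bResult Xor CByte((bInput(N) + N) And &HFF&)
    Next N
    WeightedSeed = bResult
End Function
```

For inputs where the original line does not overflow, this gives the same result, since the And only matters when the sum passes 255.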

J.A. Coutts
