
Thread: Merging two data streams (string or byte array)

  1. #1

    Thread Starter
    PowerPoster
    Join Date
    May 2006
    Location
    Location, location!
    Posts
    2,673

    Merging two data streams (string or byte array)

    This is probably going to be a bit advanced for some people...it definitely is for me...what I basically want is to be able to take two strings and merge their *values* together to make a third string.

    For instance, if I had "aaaaaa" and the values of the second string was "123456" (note the string *ISN'T* 123456, their ascii values would be) then the outputted string would be "bcdefg"

    This is a piece of code that I have written that can do it the simple way:

    VB Code:
    str1 = "aaaaaa"
    str2 = "123456"

    For b = 1 To Len(str1)
        tmpnum = Asc(Mid(str1, b, 1)) + Val(Mid(str2, b, 1))
        If tmpnum > 255 Then tmpnum = tmpnum - 256
        str3 = str3 & Chr(tmpnum)
    Next b

    For simplicity, I have made this use val rather than asc for the second string, and for the purposes of my requirements either way works fine for me...and if val was used, the string COULD be "123456" rather than an ascii equivalent. The code also *HAS* to deduct 256 from the value if it's over 255 to keep it a valid character

    Basically, this piece of code will be run something like 90%-95% of the time in my program and I need it to be as fast as possible. In my case, the code works with 50k strings of text and will need to process many gigabytes of data, so any speed increase would help.

    For the record, this sort of code used with a prng (search wiki for the word :-)) would make an excellent encryption system...although that's not what I am doing with it. :-)

    Another aspect of this code I will need is the ability to do the opposite...deduct the value in string 2 from string 1 and, if the result is below 0, add 256 to it...which again would be useful for decrypting the encrypted data. The simple way to do this with the same adding code is to add 256 minus the value in string 2 and then deduct 256 if the result is over 255...or to have two lines of code, one that adds and one that subtracts, each with its own if/then range check.
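
    A minimal sketch of the add/deduct pair described above, working on single byte values rather than on string characters. The function name ShiftByte and its parameters are made up for illustration, and the wrap-around is done with And &HFF& instead of the If/Then checks:

    ```vb
    ' Hypothetical helper: adds Amount to Value, or deducts it, wrapping within 0-255.
    Public Function ShiftByte(ByVal Value As Byte, ByVal Amount As Byte, _
                              ByVal Subtract As Boolean) As Byte
        If Subtract Then
            ' adding 256 first keeps the intermediate result from going below 0
            ShiftByte = (CLng(Value) - Amount + 256) And &HFF&
        Else
            ' CLng avoids a Byte overflow when the sum is over 255
            ShiftByte = (CLng(Value) + Amount) And &HFF&
        End If
    End Function
    ```

    For example, ShiftByte(250, 10, False) wraps around to 4, and ShiftByte(4, 10, True) gives back 250.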

    And if anyone is wondering why I would want this, it's just a filter for something I'm working on...you could say it's encryption as that *is* the effect it has but the resulting effect for me has nothing to do with encryption :-P

  2. #2
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Merging two data streams (string or byte array)

    To see a "near fastest possible on VB6" method of processing strings, check out this post by me. It probably goes way over your head, but what you see there is really fast. It avoids making an extra copy of the string, it avoids string processing (which is slow), and it avoids any unnecessary calls, sticking to the fast minimum.

    You'd need to have two arrays instead of just one I've used in that code.


    Note: hitting Stop while that code is running probably crashes your IDE. Don't do it. Manipulating memory follows the rule "if you change something that is done by something else, restore the original state once you're done".

  3. #3

    Thread Starter
    PowerPoster
    Join Date
    May 2006
    Location
    Location, location!
    Posts
    2,673

    Re: Merging two data streams (string or byte array)

    Yeah, most of it is beyond me on first glance, I don't see how it would be able to do what was required by me :-)

    BTW, I noticed you use gettickcount but don't have it declared anywhere in that code :-P

  4. #4
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Merging two data streams (string or byte array)

    I'm just about to go to sleep so I'm just going to go with a short "introduction".

    A string variable involves two things in memory: the data part, which holds all the characters, and the length of the data, which is stored right before the data part. The data part is referenced by a pointer, so a string variable is basically a pointer, a four-byte value.

    Then we have an integer array that is empty. When an integer array is defined, it gets a SAFEARRAYHEADER. The array variable in VB6 is a pointer to this header, and the header in turn holds another pointer to where the data part resides.

    What the code does is change the integer array's pointer so that it points to a custom SAFEARRAYHEADER, whose settings are in turn set so that it points to the data part of the wanted string. In practice this allows handling the string's contents directly through the integer array, so iarText(0) would be the same as AscW(Left$(strText, 1)), except that accessing the integer array is far faster.


    Here is a simple training function that you can use instead of BeforeNumbers; you can paste it into clsSA.cls:
    VB Code:
    Public Function Test(ByRef Text As String) As Integer
        ' make iarText data pointer to point to Text string's data
        SA(3) = StrPtr(Text)
        ' this makes UBound work correctly
        SA(4) = Len(Text)
        ' since this is a simple function, it simply returns the character code
        ' (the same value as AscW) of the first character
        Test = iarText(0)
    End Function


    Usage would be something like
    VB Code:
    MsgBox "Character code for A is " & SA.Test("A")

    You can use this line of code to replace everything that is between Set SA = New clsSA and Set SA = Nothing
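
    For anyone who wants to see the whole trick in one place, here is a rough, self-contained sketch of the same idea. It is not the clsSA code from the linked post: the type name SAFEARRAY1D and the function name FirstCharCode are made up for this example, and it assumes 32-bit VB6 (msvbvm60.dll). The same warning applies: don't hit Stop or let an error escape between the two CopyMemory calls.

    ```vb
    Option Explicit

    ' One-dimensional SAFEARRAY header layout (32-bit)
    Private Type SAFEARRAY1D
        cDims As Integer        ' number of dimensions
        fFeatures As Integer
        cbElements As Long      ' bytes per element
        cLocks As Long
        pvData As Long          ' pointer to the data part
        cElements As Long       ' number of elements
        lLbound As Long         ' lower bound
    End Type

    Private Declare Function VarPtrArray Lib "msvbvm60.dll" Alias "VarPtr" _
        (Ptr() As Any) As Long
    Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
        (Destination As Any, Source As Any, ByVal Length As Long)

    Public Function FirstCharCode(ByRef Text As String) As Integer
        Dim Header As SAFEARRAY1D
        Dim Chars() As Integer              ' deliberately never dimensioned

        If LenB(Text) = 0 Then Exit Function

        With Header
            .cDims = 1
            .cbElements = 2                 ' one Integer per UTF-16 character
            .pvData = StrPtr(Text)          ' point at the string's data part
            .cElements = Len(Text)          ' makes UBound work correctly
        End With

        ' make the array variable point to our custom header
        CopyMemory ByVal VarPtrArray(Chars), VarPtr(Header), 4

        FirstCharCode = Chars(0)            ' reads the string's memory directly

        ' restore the original (empty) state before VB cleans up the array
        CopyMemory ByVal VarPtrArray(Chars), 0&, 4
    End Function
    ```

    Nothing is copied here: Chars(0) and the first character of Text are the same two bytes in memory, which is where the speed comes from.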

  5. #5

    Thread Starter
    PowerPoster
    Join Date
    May 2006
    Location
    Location, location!
    Posts
    2,673

    Re: Merging two data streams (string or byte array)

    I think I understand most of that...how it all actually works is probably beyond me though so maybe I'll stick with either the code I pasted originally or an optimised version of that using some sort of array so I don't have to keep doing asc/chr/instr on parts of the string :-)

    I will have a play about with your code to see if it helps though, of course...maybe it'll speed up certain aspects :-)

  6. #6
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Merging two data streams (string or byte array)

    "Certain aspects" might not be descriptive enough: the code might end up being 100 times faster or more than the current code you have.


    An alternative "easy" method is to use a byte array. You can convert strings directly to byte arrays and vice versa. Why this is slower than the method I've shown above is simple: you copy memory. Whenever you copy memory around, you cause a slowdown. If you're able to work without moving memory a lot, you get fast results, and the longer the data to be processed, the bigger the gain.
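
    A small sketch of that byte array round trip, assuming plain ANSI text and the built-in StrConv function. The procedure name ByteArrayRoundTrip is made up for the example:

    ```vb
    Public Sub ByteArrayRoundTrip()
        Dim Data() As Byte
        Dim Text As String

        ' string -> byte array, one byte per character (this copies the data)
        Data = StrConv("ABC", vbFromUnicode)
        Debug.Print UBound(Data)        ' 2, i.e. three bytes: 65, 66, 67

        ' work on the bytes directly
        Data(0) = Data(0) + 1           ' "A" (65) becomes "B" (66)

        ' byte array -> normal VB string (this copies again)
        Text = StrConv(Data, vbUnicode)
        Debug.Print Text                ' BBC
    End Sub
    ```

    The two StrConv calls are the memory copies that make this slower than mapping an array over the string, but when the data is converted once and then processed many times, that cost is paid only at the two ends.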


    But I guess I'm now off to bed.

  7. #7

    Thread Starter
    PowerPoster
    Join Date
    May 2006
    Location
    Location, location!
    Posts
    2,673

    Re: Merging two data streams (string or byte array)

    I am actually currently using a byte array, and the 50k block of data is converted once to a byte array and the "filter" I mentioned above is to be run on the byte array up to 100 times then converted back to a string.

    I estimate that my program is going to take about 55 hours to do what it needs to do, and a few minor tweaks I have already made to the code have maybe got it down to 20 hours...I am going to need to get it to 1 hour or less to be worth it :-)

    For now my plan is just to get it doing what it needs to do (which it currently doesn't do at all :-)) then I will work on improving the speed of it...possibly to the extent of giving the code to a friend to see if they can rewrite it in a faster language like C++ :-)

  8. #8
    PowerPoster
    Join Date
    Feb 2006
    Location
    East of NYC, USA
    Posts
    5,691

    Re: Merging two data streams (string or byte array)

    For pure raw speed write it in FORTH. That can be even faster than assembly. Of course you have to learn to think backwards first ...
    The most difficult part of developing a program is understanding the problem.
    The second most difficult part is deciding how you're going to solve the problem.
    Actually writing the program (translating your solution into some computer language) is the easiest part.

    Please indent your code and use [HIGHLIGHT="VB"] [/HIGHLIGHT] tags around it to make it easier to read.

    Please Help Us To Save Ana

  9. #9

    Thread Starter
    PowerPoster
    Join Date
    May 2006
    Location
    Location, location!
    Posts
    2,673

    Re: Merging two data streams (string or byte array)

    Quote Originally Posted by Al42
    For pure raw speed write it in FORTH. That can be even faster than assembly. Of course you have to learn to think backwards first...
    I can only barely do it forwards :-)

  10. #10
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Merging two data streams (string or byte array)

    Now that I've had my sleep, let's go with "what you're actually doing" type of thinking. If you're just reading normal ANSI text files, then byte arrays are the way to go. If you read Unicode text files (UTF-16), then integer arrays are your choice.

    Integers give you a headache by being signed: character codes above 32767 show up as negative numbers. You'd have to convert them to Long and then back to Integer. You also need to take care that you're not trying to store numbers that are too big, i.e. 65536 is too big for an (unsigned) 16-bit integer and 256 is too big for a byte. With bytes:
    VB Code:
    ByteArray(X) = CByte((CLng(ByteArray(X)) + 10) And &HFF&)
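
    For the Integer case, the Long round trip could look roughly like this. The function names IntToUnsigned and UnsignedToInt are made up for the example:

    ```vb
    ' Signed 16-bit Integer -> its unsigned value (0 to 65535) held in a Long
    Public Function IntToUnsigned(ByVal Value As Integer) As Long
        ' &HFFFF& (with the trailing &) is the Long 65535, not the Integer -1
        IntToUnsigned = CLng(Value) And &HFFFF&
    End Function

    ' Unsigned value (0 to 65535) -> signed 16-bit Integer with the same bits
    Public Function UnsignedToInt(ByVal Value As Long) As Integer
        Value = Value And &HFFFF&           ' wrap anything too big, like And &HFF& does for bytes
        If Value > 32767 Then
            UnsignedToInt = CInt(Value - 65536)
        Else
            UnsignedToInt = CInt(Value)
        End If
    End Function
    ```

    For example, IntToUnsigned(-1) is 65535 and UnsignedToInt(65535) is -1 again.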

    A faster method for VB would be to actually make a table (array) with precalculated values. In this example, Table(0) would be 10 and Table(255) would be 9. The major advantage with this would be the lack of data type conversions when the actual process is going on. I actually bothered to code this:

    VB Code:
    ' in a module
    Option Explicit

    Public Type MYBYTETABLE
        ToThis(255) As Byte
    End Type

    Public ByteAdd(255) As MYBYTETABLE

    Public Sub InitByteTable()
        Dim lngA As Long, lngB As Long
        For lngA = 0 To 255
            With ByteAdd(lngA)
                For lngB = 0 To 255
                    .ToThis(lngB) = CByte((lngB + lngA) And &HFF&)
                Next lngB
            End With
        Next lngA
    End Sub

    VB Code:
    ' a sample program
    Option Explicit

    Private Sub Form_Load()
        InitByteTable
        MsgBox "Oh wow! 255 + 10 = " & ByteAdd(10).ToThis(255)
    End Sub


    Oh, one last thing: VB works much, much faster when you compile the program. Also, if you turn all the advanced optimizations on, array handling becomes far faster. But then you can't rely on errors to keep the program working: you must make sure you never use indexes that are too big or too small, because an index that doesn't exist in the array won't raise an error and might crash your program. The same applies to data.

  11. #11

    Thread Starter
    PowerPoster
    Join Date
    May 2006
    Location
    Location, location!
    Posts
    2,673

    Re: Merging two data streams (string or byte array)

    Thanks for that, i'll try it out...the actual data will be any file at all so any byte value from 0 to 255...whether what I have planned works or not I will tell people here about it when I know either way :-)

    Just a few things...

    I don't understand what "Table(0) would be 10 and Table(255) would be 9." means exactly.

    "MsgBox "Oh wow! 255 + 10 = " & ByteAdd(10).ToThis(255)"...i am assuming here without running it that it'd return 265...hmmm...(reads previous line)...(loud clanking noise)...(steam coming out of ears)...

    ...ah, i see now what you're saying...would this code actually be efficient no matter how many times it was used and how many different values were passed (for instance 0/0 to 255/255) or would running it 10000s of times actually be slower...I am sure I am totally missing something here looking at this but obviously can't place it :-)

    I am thinking now that the best way to use this (if it works as I now believe it does) is to write a function that makes use of two byte arrays with one of them being the 50k block of data I am processing and the other being the data I am filtering it with...does that sound right to you?

  12. #12
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Merging two data streams (string or byte array)

    Yes, that's how you should do it. First array for the file, second array for the filter data. Then you can use what I've shown above to "convert" the file data into another format. For speed, you should work directly back to the file array so you don't need to reserve memory for a third array, which would hold the end result.

    So code would be like:
    VB Code:
    1. FileArray(X) = ByteAdd(FilterArray(Y)).ToThis(FileArray(X))


    We'll see if you can figure out how to process these X and Y variables efficiently
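
    One possible way to handle X and Y, sketched under the assumption that the filter data is shorter than the file data and simply repeats. The procedure name ApplyFilter is made up, and to keep the sketch self-contained it does the addition inline; the table lookup from the line above can be dropped in at the marked spot:

    ```vb
    Public Sub ApplyFilter(FileArray() As Byte, FilterArray() As Byte)
        Dim X As Long, Y As Long
        Dim FirstY As Long, LastY As Long

        FirstY = LBound(FilterArray)
        LastY = UBound(FilterArray)
        Y = FirstY

        For X = LBound(FileArray) To UBound(FileArray)
            ' with the lookup table this line would instead be:
            ' FileArray(X) = ByteAdd(FilterArray(Y)).ToThis(FileArray(X))
            FileArray(X) = (CLng(FileArray(X)) + FilterArray(Y)) And &HFF&

            ' step through the filter and start over when it runs out;
            ' a compare and reset is cheaper than using Mod on every byte
            Y = Y + 1
            If Y > LastY Then Y = FirstY
        Next X
    End Sub
    ```

    The result is written straight back into FileArray, so no third array is needed.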


    Btw, character "0", which you used in your examples above, is actually 48 if you look into it memorywise. Character "A" is 65. If you add "0" to "A" in pure math, you get "q" (113) and not "B" (66).

  13. #13

    Thread Starter
    PowerPoster
    Join Date
    May 2006
    Location
    Location, location!
    Posts
    2,673

    Re: Merging two data streams (string or byte array)

    So what i think you are saying at the end is that when splitting the filter data into the array I shouldn't use 0-9 but should instead use chr(0) to chr(9)...don't worry, I plan to either use a standard prng (pseudo-random number generator) or write my own based on established methods...that way the end project I am planning has as small a footprint as possible on the hard drive :-)

    Because I am using a prng, I would store the numbers directly as numbers into the byte array so there should be no issues :-)

    Edit: Forgot to ask...you have "ByteAdd(10).ToThis(255)"...I am pretty sure I could write my own ByteDed(x). to deduct the value, but how would I do the ".ToThis(lngB) = CByte((lngB + lngA) And &HFF&)" in the InitByteTable? (Yes, I know I don't edit the one in ByteAdd, I'd make a ByteDed one below :-))...I don't really understand how the &HFF& affects it...I know it's 255, but don't know exactly what it is doing to the algorithm :-)
    Last edited by smUX; Jun 3rd, 2006 at 11:14 PM.

  14. #14

    Thread Starter
    PowerPoster
    Join Date
    May 2006
    Location
    Location, location!
    Posts
    2,673

    Re: Merging two data streams (string or byte array)

    Just made use of the code, and it *seems* to be working almost perfectly...I've written a piece of code to generate a string of "a"s and a random string of 0-9 numbers then split them into two byte arrays...the above code takes about 6s to complete but the actual filter takes 30ms...both uncompiled...and this is without any tweaking by me yet :-)

    Only bug I can see is that it only seems to filter half the byte array and leave the rest alone...here's the code I used:

    VB Code:
    Dim str1() As Byte, str2() As Byte
    For b = 1 To 50000: tx1 = tx1 & "a": tx2 = tx2 & Chr(Int(Rnd(1) * 10)): Next b
    str1 = tx1
    str2 = tx2

    For b = 1 To 50000
        str1(b) = ByteAdd(str2(b)).ToThis(str1(b))
    Next b
    Text1 = str1

    It's early/late here in the UK (6am) so I am off to bed now...hopefully tomorrow I will be able to sort this if no-one else has pointed out my glaring error (if it's that visible :-))

  15. #15
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Merging two data streams (string or byte array)

    Your error is very clear: strings are Unicode, two bytes per character. 50000 characters = 100000 bytes.
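
    A quick way to see this for yourself (the procedure name ShowStringBytes is made up for the example; run it and watch the Immediate window):

    ```vb
    Public Sub ShowStringBytes()
        Dim Raw() As Byte
        Dim Text As String

        Text = String$(50000, "a")
        Raw = Text                      ' direct assignment copies the raw UTF-16 data

        Debug.Print Len(Text)           ' 50000 characters
        Debug.Print LenB(Text)          ' 100000 bytes
        Debug.Print UBound(Raw)         ' 99999, so the array holds 100000 bytes
        Debug.Print Raw(0); Raw(1)      ' 97 and 0: the character code, then a zero high byte
    End Sub
    ```

    So a loop running from 1 to 50000 only reaches the first half of the array, and every second byte it touches is one of those zero high bytes.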

    Also, your method of adding up strings like that is slow. You could just use other features VB provides:

    VB Code:
    ' uses the ThisByte table from the module in post #17
    Dim str1() As Byte, b As Long
    InitByteTable
    str1 = StrConv(String$(5000, "0"), vbFromUnicode)
    For b = 0 To UBound(str1)
        str1(b) = ThisByte(str1(b)).Add(Int(Rnd * 10))
    Next b
    ' note: if you use a very long string, you get a major slowdown
    ' when you put it into textbox
    Text1.Text = StrConv(CStr(str1), vbUnicode)
    Erase str1
    But note that you can't restore the original state this way. You probably want to use a certain key to scramble things up, although a good hacker can figure that out rather easily (especially if the key is short). Basically you'd use a shorter key to loop through a bigger file.

  16. #16

    Thread Starter
    PowerPoster
    Join Date
    May 2006
    Location
    Location, location!
    Posts
    2,673

    Re: Merging two data streams (string or byte array)

    Quote Originally Posted by Merri
    Your error is very clear: strings are Unicode, two bytes per character. 50000 characters = 100000 bytes.
    I've had problems like that before, and just used strconv to fix it...I'm still a bit unsure of what to do and when to do it with these things though :-)

    Quote Originally Posted by Merri
    Also, your method of adding up strings like that is slow. You could just use other features VB provides
    Yeah, I'm aware that there's other ways to do it...just for the purposes of testing I was writing quick and dirty code :-)

    Quote Originally Posted by Merri
    But note that you can't restore the original state this way. You probably want to use a certain key to scramble things up, although a good hacker can figure that out rather easily (especially if the key is short). Basically you'd use a shorter key to loop through a bigger file.
    I say again, it's not an encryption code (although it could be used as such) :-P

    I actually have a much more secure custom encryption code I wrote a while back...basically generates filter strings from a password and uses a special replacement function to filter it...similar to this code but it's simpler :-)

    Although restoring the original state will be important...read a couple of posts up, as I ask about modifying ByteAdd(x) so I can also have a ByteDed(x) which also loops. Be aware that at NO point will I need both ByteAdd and ByteDed at the same time so they could both use ToThis() so maybe initbytetable(1,255) for initialising an add array (the 255 is explained below) and initbytetable(2,255) for initialising a deduct array for rebuilding the data using the filters

    There *IS* another issue I forgot about...is there any way to set an upper limit for the values? For instance, if I type initbytetable(255) it would allow a limit of 0-255 so once the number passes 255 it would loop back, but if I typed initbytetable(100) it would allow only 0-100 then loop...my guess would be modifying the "and &HFF&" but I wouldn't know how esp. as the number would be variable :-)

  17. #17
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Merging two data streams (string or byte array)

    VB6 does some ANSI <-> Unicode conversions. The basic rules:
    • When you load a file in text mode, VB6 converts ANSI to Unicode. When you save in text mode, Unicode is converted to ANSI.
    • When you call an API function, strings passed are converted to ANSI. When you get a string result from the API, the string is converted to Unicode. (This happens regardless of whether you declare the ANSI or the Unicode version of the API function.)
    • VB6's native controls don't support Unicode. Thus Unicode and ANSI conversions happen all the time under the hood.
    • RichTextBox supports Unicode, but VB6 port of it does not. Same applies to some other controls.



    I made a version in which I swapped some code around:
    VB Code:
    ' in a module
    Option Explicit

    Public Type BYTETABLE
        Add(255) As Byte
        Remove(255) As Byte
    End Type

    Public ThisByte(255) As BYTETABLE

    Public Sub InitByteTable()
        Dim lngA As Long, lngB As Long
        For lngA = 0 To 255
            For lngB = 0 To 255
                ThisByte(lngB).Add(lngA) = CByte((lngB + lngA) And &HFF&)
                ThisByte(lngB).Remove(lngA) = CByte((256 + (lngB - lngA)) And &HFF&)
            Next lngB
        Next lngA
    End Sub


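    You can use Mod to limit a value: 100 Mod 100 = 0 and 101 Mod 100 = 1. And, on the other hand, does a simple bitwise comparison, so it only works as a wrap-around when the limit is a power of two (for non-negative numbers, And &HFF& gives the same result as Mod 256).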
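
    A rough sketch of how the table could be built with a variable limit using Mod. The names MODTABLE, ModByte and InitModTable are made up here so they don't clash with the module above, and the sketch assumes that all data values are already below the chosen limit:

    ```vb
    ' in a module
    Option Explicit

    Public Type MODTABLE
        Add(255) As Byte
        Remove(255) As Byte
    End Type

    Public ModByte(255) As MODTABLE

    ' Modulus is the number of allowed values: 100 means 0 to 99, 256 means 0 to 255
    Public Sub InitModTable(ByVal Modulus As Long)
        Dim lngA As Long, lngB As Long

        If Modulus < 1 Or Modulus > 256 Then Err.Raise 5   ' invalid procedure call or argument

        For lngA = 0 To Modulus - 1
            For lngB = 0 To Modulus - 1
                ModByte(lngB).Add(lngA) = CByte((lngB + lngA) Mod Modulus)
                ' adding Modulus first keeps the left side of Mod from going negative
                ModByte(lngB).Remove(lngA) = CByte((lngB - lngA + Modulus) Mod Modulus)
            Next lngB
        Next lngA
    End Sub
    ```

    After InitModTable 100, ModByte(99).Add(5) is 4 and ModByte(4).Remove(5) is 99 again. Entries for values at or above the limit are left as they were, so they should not be used.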

  18. #18

    Thread Starter
    PowerPoster
    Join Date
    May 2006
    Location
    Location, location!
    Posts
    2,673

    Re: Merging two data streams (string or byte array)

    That's a useful piece of info, thanks...I used to basically use a 3 or 4 part line where first thing it did was divide the number by 256, int it then multiply and deduct that from the original number...mod should be a lot simpler :-)

    Regarding the mod thing, what I am actually talking about is modifying how many numbers InitByteTable uses...default there is 256 (0 to 255) but I may actually want to use it with only (for instance) 100 (0 to 99) or 200 (0 to 199) or any number basically between 1 and 255...I assume InitByteTable doesn't take too long to build the tables so I should be able to re-initialise it at any time. Now I understand that I can change the for/next values in your code to achieve this PARTIALLY but how do I edit the bit with the &HFF& at the end...can I use a variable there or would I have to do it a different way...or am I right in thinking that I just take out the "and" and replace with mod X (x being the upper limit)...and I assume X+1 replacing the 256 in the second line.

    I will actually make use of this code and test out these theories I am having about how to do this tomorrow...I am currently on a TV as my monitor is FUBAR, but should have a new monitor (or the old one fixed) tomorrow :-)

    Thanks for your help...will mark it resolved as soon as I've got the code all working in my program :-)
