Your code is broken, it gives one item too few. It might have even other bugs but I didn't dare to check them out: I don't know if you did a check for an empty array.
Anyways... here we are again: first of all, ASM code in this case isn't as fast as you're saying. plenderj's code is only ten times slower than yours when working with 1 000 000 items and six times slower with 10 000 000 items). Next, the optimized VB6 code I made for comparison isn't even two times slower than yours: only about 1.6 times. I bet some guru could make it even faster.
As a note, I worked with byte arrays here. It isn't hard to modify the stuff to work with Longs.
OFFTOPIC
Apparently you didn't get the thing when I used word "fair": what was behind it was I originally had a Function, and so it copied memory a few extra times, which hitted performance in such a fast thing to benchmark. And as you can see, even calling a Sub a lot of times is slow in VB (it made about 20% performance hit), so positioning the code in the benchmark loop gives more accurate results on the performance of the actual code. Just to make it clear since you seemed to understand it wrong, based on your joke about it
Btw: if you were a board, how could you code? Boards have no hands (yeah, it should be "bored") /OFFTOPIC
Your code is broken, it gives one item too few. It might have even other bugs but I didn't dare to check them out: I don't know if you did a check for an empty array.
Anyways... here we are again: first of all, ASM code in this case isn't as fast as you're saying. plenderj's code is only ten times slower than yours when working with 1 000 000 items and six times slower with 10 000 000 items). Next, the optimized VB6 code I made for comparison isn't even two times slower than yours: only about 1.6 times. I bet some guru could make it even faster.
As a note, I worked with byte arrays here. It isn't hard to modify the stuff to work with Longs.
OFFTOPIC
Apparently you didn't get the thing when I used word "fair": what was behind it was I originally had a Function, and so it copied memory a few extra times, which hitted performance in such a fast thing to benchmark. And as you can see, even calling a Sub a lot of times is slow in VB (it made about 20% performance hit), so positioning the code in the benchmark loop gives more accurate results on the performance of the actual code. Just to make it clear since you seemed to understand it wrong, based on your joke about it
Btw: if you were a board, how could you code? Boards have no hands (yeah, it should be "bored") /OFFTOPIC
Go ahead and test
No your coding as if my function returns one too many.
Change this:
For I = 0 To UBound(X) - 1
to this:
For I = 0 To UBound(X)
Using your timing method:
18 on my pc
38 Yours
290 his
at 10,000,000
181
302
3021
Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde
Heh, should check other people's code and not trust it: I copy-pasted plenderj's code without too much checking on it I did wonder at one point why I had to have + 1 in my code. I guess I also miscounted how many items plenderj's code returned - or then plenderj's code has that weird flaw. Haven't checked. Luckily an easy fix. Edit plenderj, be your name cursed
Here are some results on my machine.
1 000 000 items:
- 25 ms yours (1.00)
- 34 ms mine (1.36)
- 165 ms plenderj (6.60)
10 000 000 items:
- 266 ms (1.00)
- 347 ms (1.30)
- 1688 ms (6.34)
100 000 000 items:
- 2656 ms (1.00)
- 3413 ms (1.28)
- 16906 ms (6.36)
In case interested: AMD Sempron 2800+, 1024 MB @ 333MHz
Heh, should check other people's code and not trust it: I copy-pasted plenderj's code without too much checking on it I did wonder at one point why I had to have + 1 in my code. I guess I also miscounted how many items plenderj's code returned - or then plenderj's code has that weird flaw. Haven't checked. Luckily an easy fix. Edit plenderj, be your name cursed
Here are some results on my machine.
1 000 000 items:
- 25 ms yours (1.0)
- 45 ms mine (1.8)
- 165 ms plenderj (6.6)
10 000 000 items:
- 266 ms (1.00)
- 422 ms (1.58)
- 1688 ms (6.34)
100 000 000 items:
- 2656 ms (1.00)
- 4468 ms (1.68)
- 16906 ms (6.36)
In case interested: AMD Sempron 2800+, 1024 MB @ 333MHz
hrm... try looping it
Could be just a dif between intel an amd though
Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde
Sorry, looping has no point in this case because there are so many items there, it is easy to benchmark just like it is. This command is seldomly used many many times in a row, so it only matters to benchmark one run. Also, benchmarking it would also mean copying the array must be included in the loop, which in part gives false information on the speed of the array...
I think I might make a faster string duplicate removal array as well, since I know how to make it work faster than what plenderj's code is... also, there is a super fast sorting routine at PSC for VB6, which returns sorted array one item at a time: you could call it and check for similarities within the incoming string, as the duplicates are easy to catch (they all come in a row).
Originally Posted by Maven
Could be just a dif between intel an amd though
Which means you optimized your ASM code for Intel only - if you optimize for AMD then your code probably runs slower on Intel
dglienna: nimet tulisi kirjoittaa isolla alkukirjaimella ja kuten todettua, enkku on tosi outo kieli
Sorry, looping has no point in this case because there are so many items there, it is easy to benchmark just like it is. This command is seldomly used many many times in a row, so it only matters to benchmark one run. Also, benchmarking it would also mean copying the array must be included in the loop, which in part gives false information on the speed of the array...
I think I might make a faster string duplicate removal array as well, since I know how to make it work faster than what plenderj's code is... also, there is a super fast sorting routine at PSC for VB6, which returns sorted array one item at a time: you could call it and check for similarities within the incoming string, as the duplicates are easy to catch (they all come in a row).
Which means you optimized your ASM code for Intel only - if you optimize for AMD then your code probably runs slower on Intel
dglienna: nimet tulisi kirjoittaa isolla alkukirjaimella ja kuten todettua, enkku on tosi outo kieli
Well the whole purpose would be to avg out operating sytem interrupts. I usually loop my code and run it in real time.
Well you're code isn't always faster, it really depends on the status of the array. If the array has more duplicates then your code does better, if it has less then you're code ends up worse. Plenderj's code is different, it gets better the less duplicates that are found.
I'm looking into the idea of working with these numbers at 4 bytes per go. The speed gain would probably be 100-1000% but would have to figure out how to deal with odd bytes.
Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde