-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Quote:
Originally Posted by Ellis Dee
What do you mean that Quicksort3 errors on characters? As a guess, if you are running into case issues, you have to remember to put Option Compare Text at the top of the module where any of the listed sorting algorithms reside.
Just to make it more clear, here is a test I did,
Module1.bas with a Public Sub MedianThreeQuickSort1 in it. (a.k.a. Quicksort3 POST #14).
Code:
Option Compare Text
Public mList(0 To 2) As String
Form Code, sort error,
Code:
mList(0) = "C"
mList(1) = "B"
mList(2) = "A"
MedianThreeQuickSort1 mList ' Error Type Mismatch see * below
Debug.Print mList(0) ' Returns C
' * If I change lngMid to a Variant it doesn't error.
Form Code, no errors,
Code:
mList(0) = "120"
mList(1) = "12"
mList(2) = "119"
MedianThreeQuickSort1 mList ' no errors returned
Debug.Print mList(0) ' Returns 12
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Quote:
Originally Posted by Edgemeal
' * If I change lngMid to a Variant it doesn't error.
That is indeed the fix. I have edited the code posted to reflect this fix, as well as changed the function prototype to Public.
Thanks much for the bug report. It's always nice when the bug report includes the fix, so thanks for that as well.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Quote:
Originally Posted by Ellis Dee
That is indeed the fix. I have edited the code posted to reflect this fix, as well as changed the function prototype to Public.
I may just keep a copy of the original Quicksort3 for sorting string arrays that contain only numbers since it sorts logically and seems pretty fast, unless there is a better/faster way to accomplish that task? Thanks.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Quote:
Originally Posted by Edgemeal
I may just keep a copy of the original Quicksort3 for sorting string arrays that contain only numbers since it sorts logically and seems pretty fast, unless there is a better/faster way to accomplish that task? Thanks.
I'm not aware of any finished solution for sorting that way. Basically, every comparison would require checking the Val() value before comparing the strings directly.
Maybe a mirror array holding the Val()s would be the most efficient approach, where you compare the mirror array values first, and if they're equal then compare the actual array values.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Quote:
Originally Posted by Ellis Dee
I'm not aware of any finished solution for sorting that way. Basically, every comparison would require checking the Val() value before comparing the strings directly.
Ya thats what I figured, I made a few changes to Quick Sort(#13) to do numbers and it seems to be a hair faster then the original Quicksort3. Good enough,Thanks.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Ellis, I have a very minor suggestion:
Due to VB6 allows negative index for array such as: Dim myArray(-10, 0) As Long
This may come up with a call later such as: MedianThreeQuickSort1 myArray, -7, 0
or QuickSort myArray, -7, 0
In this case plngLeft and plngRight will be reset to LBound()=-10 and UBound()=0.
To avoid that, as in post#47, I suggest to give plngLeft a default value of 0 and plngRight a default value of -1.
Then you can test: If plngLeft > plngRight Then ... instead of: If plngRight = 0 Then ...
-
1 Attachment(s)
Re: VB6: Sorting algorithms (sort array, sorting arrays)
QuickSort often gets a bad rap from people because in its worst case, it could take N squared operations to sort N elements. This fear is entirely undeserved because in practice, a good implementation of QuickSort just isn’t going to do it. For example, if 1024 elements are being sorted, the probability of achieving the worst case scenario is:
P(worst case) = 2/1024 * 2/512 * 2/512 * 2/256 * 2/256 * 2/256 * 2/256 * 2/128…
= 2/1024 * (2/512)^2 * (2/256)^4 * (2/128)^8 * (2/64)^16 * (2/32)^32 * …
= (2^-9)^1 * (2^-8)^2 * (2^-7)^4 * (2^-6)^8 * (2^-5)^16 * (2^-4)^32 * (2^-3)^64 * (2^-2)^128 * (2^-1)^256
= 2^-9 * 2^-16 * 2^-28 * 2^-48 * 2^-80 * 2^-128 * 2^-192 * 2^-256 * 2^-256
= 2^(-9-16-28-48-80-128-192-256-256)
= 2^(-1013)
The probability of this is so low that if every computer on Earth was sorting using quicksort until the universe ends, it is still unlikely to happen. This is staggeringly unlikely.
But this is only the worst case, there are still a lot of very-bad-cases. The mathematics of this is beyond me but it is not too difficult to simulate the performance of QuickSort. I wrote a simulator and simulated sorting 128, 1024 and 65536 elements, 150,000 times each. This is a graph of the results:
http://www.vbforums.com/attachment.p...p;d=1211854019
When sorting 65536 elements, Quicksort performs an average of 1,420,143 comparisons. After doing 150,000 sorts, the best performance was 92% of the average while the worst performance was 120% of the average.
From the simulated numbers, I estimate that the chances of Quicksort taking more than 10% more than the average to be <0.005, more than 15% than the average to be <0.0003 and more than 20% more than the average to be <0.000007. If you wanted to guess the probability of Quicksort taking twice as long as its average case then you need to think about lots and lots more zeros in the probability. At this point, you are still looking at performance of order NlogN.
If you refer back to the graph again, it is apparent that the more data that is being sorted, the narrower the range of performance.
Another related problem for QuickSort is the exploding stack. Once again, unexpected large usage of the stack is unlikely, but if your stack space is limited, you may have a problem. There is a solution.
The exploding stack problem occurs when the partitions created by QuickSort are uneven and the stack fills up with details about lots of small partitions. The solution is to put the larger partition on the stack first and process the smaller partition first. This way, the stack contains details of large partition and there can not be many of these. In this case, the worst case number of partitions on the stack is logN.
If the likelihood of a bad case of QuickSort is so extreme, why have many people experienced it? The answer is that there are lots of bad ways to implement QuickSort. The golden rule is that the choice of the pivot value must always, always, always be independent and uncorrelated to the data. The best way to achieve this is to choose a random pivot. Do this always, always, always.
In addition, if your random number generator is predictable then your selection of the pivot is predictable and someone can engineer the worst case set of data – do not seed your random number generator in a predictable way if you are concerned about security.
With these thoughts in mind, the “QuickSort” routine in the application uses a poor method of choosing the pivot value. As Dee acknowledges, this will perform poorly when the data has a camel-hump in the middle. This will probably explode the stack with any reasonable amount of camel-hump data.
The “MedianThreeQuickSort” routine in the application is much much better. It is a good implementation of QuickSort. The random method of choosing the pivot value means that it is extremely unlikely that any given set of data will break it. The likelihood of it being broken is staggeringly small. The fact that this routine uses a median of three means that the pivot value is likely to be closer to the middle of the data, this actually speeds up the sort and makes it less likely to have a bad case.
The “MedianThreeQuickSort” routine could be improved by replacing the last two lines with:
Code:
If lngLast – plngLeft < plngRight – lngFirst Then
If plngLeft < lngLast Then QuickSort plngArray, plngLeft, lngLast
If lngFirst < plngRight Then QuickSort plngArray, lngFirst, plngRight
Else
If lngFirst < plngRight Then QuickSort plngArray, lngFirst, plngRight
If plngLeft < lngLast Then QuickSort plngArray, plngLeft, lngLast
End If
This will guarantee that the stack depth is never more than logN.
In summary, go ahead and use QuickSort. It is fast. Just make sure that you choose your pivot randomly.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Quote:
Originally Posted by anhn
VB6 allows negative index for array
My head hurts just contemplating handling for this. I need to think about whether it's worth the added complexity, or if it's better in the long run to just ignore this ill-advised "feature".
Anything I do at the array level has to be implemented 54 times: 18 graphical algorithms, 18 one-dimensional algorithms, and 18 two-dimensional algorithms. So even minor changes tend to seem daunting.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Quote:
Originally Posted by cbrow
The “MedianThreeQuickSort” routine could be improved by replacing the last two lines with:
Code:
If lngLast – plngLeft < plngRight – lngFirst Then
If plngLeft < lngLast Then QuickSort plngArray, plngLeft, lngLast
If lngFirst < plngRight Then QuickSort plngArray, lngFirst, plngRight
Else
If lngFirst < plngRight Then QuickSort plngArray, lngFirst, plngRight
If plngLeft < lngLast Then QuickSort plngArray, plngLeft, lngLast
End If
Good suggestion. I implemented it in the posted code on the first page.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Quote:
Originally Posted by Ellis Dee
My head hurts just contemplating handling for this. I need to think about whether it's worth the added complexity, or if it's better in the long run to just ignore this ill-advised "feature".
Oh!Oh! Forget about that if you think that is an "ill-advised", but that is a real thing I had to deal with in the past, so I just suggested. Never mind!
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Hello Ellis Dee,
I have looked at Snake Sort and like it. I agree with what you say that it is about as fast as QuickSort on random data and much faster on sorted or partially sorted data. Other people call this sort a "Natural Merge Sort" but your implementation is about as fast as they get on randomly ordered data. It is possible to make it faster for partially sorted data but the added complexity will probably slow it when sorting random data.
I wouldn’t change it but there are a couple of things that are interesting to think about:
1. The buffer of ordered section details (lngIndex) only needs to be size logN. There is an elegant way to merge the sections in a binary pattern to achieve the exact same result as you have. It is marginally slower than your approach.
2. Consider the scenario when you have a lot of data that has been previously processed and is therefore in order. Then you get some new data added to the end that is not sorted. Say you look for your ordered sections and get:
Section 0: 1,000,000 elements in order
Section 1: 2 elements in reverse order
Section 2: 5 elements in order
Section 3: 7 elements in reverse order.
The binary approach that SnakeSort uses (and my suggestion above) would merge sections 0 & 1 (cost 1,000,002), then sections 2 and 3 (cost 12) and then the two results for a total cost of (1,000,002 + 12 + 1,000,014) = 2,000,028 operations.
A smarter approach would be to merge sections 1 and 2 (cost 7), then merge this with secion 3 (cost 14) and then merge this with section 0 (cost 1,000,014). The total cost of this is (7 + 14 + 1,000,014) = 1,000,035 operations.
It is most efficient to merge the smaller sections together first.
The problem is that these two ideas make the algorithm more complicated and therefore slower on random data. Unless you can think of anything.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Quote:
Originally Posted by cbrow
I have looked at Snake Sort and like it. I agree with what you say that it is about as fast as QuickSort on random data and much faster on sorted or partially sorted data. Other people call this sort a "Natural Merge Sort"
Boooo. I thought I'd come up with an original idea, but googling does turn up references to and descriptions of natural mergesort, and it is indeed snake sort. Oh well.
Oddly, the only references I see to it are in messageboard discussions. There is no mention of this variation in the wikipedia entry for merge sort. (Wikipedia was my primary reference in this whole endeavor.) I also posted the snake sort algorithm along with a description on a heavily trafficked general interest board, claiming it as my own and asking if anyone had heard of anything similar. Nobody had; in fact, several people thought it was smooth sort. (ha!)
But after a day of mourning I'll go through and remove the self-credits in the Natural Mergesort algorithm, along with giving it its proper name. Again: Boooo!
As far as your points of consideration, particularly for an already-ordered list with new random elements. That situation calls for either an online algorithm like insertion sort or the unbelievably efficient smooth sort. That is what impresses me the most about smooth sort: it is an offline algorithm that handles "online data" as efficiently as any online algorithm. That's beyond impressive; it's pure genius.
I've also always been on the fence about what the term "nearly sorted" really means. If you use the utility attached to the OP and select "5% Shuffled", that techincally qualifies as "nearly sorted" because every element starts out near its final sorted position. And yet, this yields one of snake sort's worst performances, while smooth sort flies through it like greased lightning.
To really put it in perspective, select snake sort, smooth sort and insertion sort by clicking them, hit the Filter button and then the 5% Shuffled button. Watch smooth sort kick all kinds of ass. The higher your screen resolution the more obvious the difference will be. Even though this is Insertion sort's best case, smooth sort still wins.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Quote:
Originally Posted by cbrow
2. Consider the scenario when you have a lot of data that has been previously processed and is therefore in order. Then you get some new data added to the end that is not sorted. Say you look for your ordered sections and get:
Section 0: 1,000,000 elements in order
Section 1: 2 elements in reverse order
Section 2: 5 elements in order
Section 3: 7 elements in reverse order.
In the previous post I claimed that insertion sort or smooth sort would be the better choice for such a scenario, but I appear to be mistaken. To see it in action, change the following code from the OrderArray function in basGraphical.bas:
Code:
Case oeWeave
ShellSort1 plngArray
For i = iMax To iMax - 5 Step -1
plngArray(i) = Int((iMax - iMin + 1) * Rnd) + iMin
Next
' WeaveArray plngArray, iMin, iMax
To get the full effect you really need to filter so that each algorithm fills the entire vertical space of the screen. Note that Weave order is called Thatched in the tooltip balloons on the toolbar.
Crazy as this seems, after a quick glance it would appear that even changing it to be one single out-of-place element at the end of the array -- the very definition of what an online algorithm is supposed to excel at -- the order of fastest to slowest seems to go snake sort, smooth sort, insertion sort.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Hello Ellis Dee,
I have looked at Shear Sort and have read up on it. Don't expect this sort to work very fast because it is meant to run on N parallel processors. In any case I have found the problem:
There are three for loops that look like:
For j = 0 To Cols \ 2 - 1
or For j = 0 To Rows \ 2 - 1
When rows or columns are odd, these numbers need to round up like:
For j = 0 To CLng(Cols / 2) - 1
and For j = 0 To CLng(Rows / 2) - 1
or For j = 0 To (Cols + 1) \ 2 - 1
or For j = 0 To (Rows + 1) \ 2 - 1
The other thing is that this sort is especially bad if the number N is prime or if N has a large prime root.
Craig
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Quote:
Originally Posted by cbrow
I have looked at Shear Sort [...] I have found the problem
Sweet, that looks like a winner.
I can't thank you enough for all your help.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Sorry, Something bizarre is happening:
The CLng in: For j = 0 To CLng(Cols / 2) - 1
is rounding down.
This form works though: For j = 0 To (Cols + 1) \ 2 - 1
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Quote:
Originally Posted by cbrow
Sorry, Something bizarre is happening:
The CLng in: For j = 0 To CLng(Cols / 2) - 1
is rounding down.
This form works though: For j = 0 To (Cols + 1) \ 2 - 1
That's the one I used anyway, since \ is significantly faster than /.
I just learned in another thread that Mod is slow, which is unfortunate for smooth sort. I'll be interested to see how it performs when I get the benchmark logic implemented.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Hello Ellis,
Just looking at SmoothSort, the mods seem to be Mod 8, Mod 4 and Mod 2.
N Mod 8 is the same as N And 7
N Mod 4 is the same as N And 3
N Mod 2 is the same as N And 1
Also if VB had bit-wise shift operators these are much faster than \ 2, \ 4 and \ 8.
The And operator is much faster.
Craig
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
At risk of drifting a little off topic, here's what I have found about Mod vs And in VB6.
Just to be clear And can be used in place of Mod when the divider is a power of 2.
In the IDE And is much quicker than Mod but when compiled with optimizations on it can appear to run at exactly the same speed. Further tinkering reveals that And is still much quicker but the compiler is clever enough to convert N Mod Power2 into N And (Power2-1) if the divider is hard coded. This can be further shown by comparing N Mod Power2 with N Mod NotPower2 where the power2 version runs at 10x the speed. This only happens when the optimize for speed compile option is selected and where the divider is hard coded.
It's a real shame VB does not have bit shift operators, they are really useful. However... Again if optimize for speed is selected try comparing N \ Power2 with N \ Not Power2 with a variety of different numbers. N \ Power2 works out 3x faster, so not as fast as a bit shift but something good is going on.
______________________________________________
Edit: Partly in reference to Merri's post below, as safearrays are at the core of this.
I while ago I started working on a UDT array sorter. It's still not ready (It's a back burner project) but I'm getting re-interested in it again. There's a tricky issue of sorting 8 byte datatypes in UDT's where the element size does not align to multiples of 8 but multiples of 4. All other types are quite easy as they always align thanks to the padding used. I've resolved the 8 byte issue by treating it as two arrays which are sorted separately and then recombined, it still needs a lot of tidying and optimising. At it's core are to be several different versions (one for every data type) of one of the algorithms here, at this stage it's just bubble sort (to keep it simple) but that will change. Not sure which one yet though, it seems like a toss up between Stability and Speed.
These are the parameters as it stands
Code:
Public Function SortArrayX(ByVal ArrayPtr As Long, SortFieldElement As Variant, Optional Descending As Boolean = False) as boolean
I'm not super happy with this as the return of VarPtrArray has to be passed rather than the array itself, I can't think of any other way to do it. Thankfully it's fairly easy to check that it is a valid array structure and that the passed element does reside in arrays range so it should be safe to use. One of the bonuses of it is that it could in theory sort any one dimensional array (not sub type variant or object) UDT or not, and it should be quicker than a general algorithm that uses a variant to carry the array. Progress is slow for me as this is on the edge of my ability, but I take it this would be something of interest, yes?
Quote:
Originally Posted by Merri
<snip>[*]Create a new identical header that changes to zero base (if I recall correctly you simply change one value!)
Yes, the array structure stores the lowerbound and the number of elements, the upperbound is calculated.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Ellis Dee: if you do some learning on SafeArrays, you could make your code only work on zero base simply by replacing the array's header with a zero based one during processing of the array. This would also speed up the processing of all sorting no matter what base the array is in. The problem is of course that you'd need to learn how to handle all the cases properly with the technique and how to do it in the first place, but once you get it you only need to account for one base in sorting code (the zero base), the safe array tricking should perfectly take care of all the negative and positive bases.
In short the code would work something like this:- Read existing array header
- Create a new identical header that changes to zero base (if I recall correctly you simply change one value!)
- Replace the array's header
- Sorting code runs here, no need to check LBound because it is always zero and UBound returns the correct value!
- Restore the old header & clean up
Of course a bit of the stuff depends on how the sorting code is done in the first place, I didn't take a look into that.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Quote:
Originally Posted by Milk
Not sure which one yet though, it seems like a toss up between Stability and Speed.
These are the parameters as it stands
Code:
Public Function SortArrayX(ByVal ArrayPtr As Long, SortFieldElement As Variant, Optional Descending As Boolean = False) as boolean
Merge sort is both stable and fast, plus it has the bonus of having no worst case; it's always fast. And as I discussed upthread, if you're going to include a Descending option, you pretty much have to use a stable algorithm. (There is just no relevance to the direction of an unstable sort; ascending and descending don't matter. Just change the direction you iterate the array after sorting.) I really can't see using any other algorithm if your going to create a one-stop sorting function. The code sprawl is annoying, but whaddya gonna do?
A generic UDT array sorter would be one of the most impressive things I could imagine in VB6, but I'm not sure of the utility of it. I have in the past used sorted UDT arrays, but as comfortable as I am with sorting code, even I didn't physically sort the UDT array. Instead, I built two-dimensional variant index arrays and sorted those. That's pretty much always going to be faster than sorting a UDT directly, even if you write a specific sorting function devoted to that particular UDT.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Because sorting involves much swapping internally it sorts a long array of indexes which refer back to the original UDT. Comparisons are a little slower but it only needs to swap a 4 byte element instead of say a 54 byte element. Once the index array is sorted it then moves each of the UDT elements to their new positions, if descending is selected then this is just done back to front. Moving the elements at the end is relatively slow per move but each element only needs to be moved once. It has occurred to me that simply returning the index array could be more efficient and certainly quicker. Another function could be written which takes an index array and a pointer to a UDT array that could reorder the UDT if that is required.
I have other issues to look at first :) and bearing in mind I started playing with it 12 months ago it might not be finished so soon.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Yeah, that's a solid approach.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Ellis, just here to say, great work on this thread, real good to see you compare them and the new algorithms
also i can find myself in this statement:
I think you're almost as into watching those little lines draw as I am. I swear, I can just sit and stare at them like a mental patient. I don't need food, or drink, or sleep; must watch lines.
i love demo`s like these: and im gonna make one with all the algorithms in here ;p
http://www.cs.rit.edu/~atk/Java/Sorting/sorting.html
Greetz and keep it up!
-
Re: Sorting algorithms (sort array, sorting arrays)
Code:
Public Sub MedianThreeQuickSort1(ByRef pvarArray As Variant, Optional
...
If plngLeft < lngLast Then MedianThreeQuickSort1 plngArray, plngLeft, lngLast
If lngFirst < plngRight Then MedianThreeQuickSort1 plngArray, lngFirst, plngRight
Else
If lngFirst < plngRight Then MedianThreeQuickSort1 plngArray, lngFirst, plngRight
If plngLeft < lngLast Then MedianThreeQuickSort1 plngArray, plngLeft, lngLast
A minor bug in this and related quicksort routines:
the recursive calls use "plngArray", which is undefined. It should be pvarArray.
Except for this, thanks for the clean and quick code example.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
both QuickSort1 and QuickSort3 have the same speed: 10000 elements are sorted in 31 ms.
SnakeSort is a little bit slower: 10000 elements are sorted in about 80 ms
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
That's a small sample size; most algorithms will handle 10k pretty well.
-
1 Attachment(s)
Re: VB6: Sorting algorithms (sort array, sorting arrays)
I guess people may have a need for speed? Here is an optimized median three quicksort. Usage is only a little harder than when using Ellis Dee's functions:
QSort Not Not YourArrayVariable
The odd Not Not call is used to get the safe array header's pointer. This allows to avoid the use of Variant datatype so it is a requirement. VB6 does not allow declaring it's own procedures As Any so this is a workaround.
Supports following datatypes: Boolean, Byte, Currency, Date, Double, Integer, Long, Single & String.
Note that it uses countsort for Boolean, Byte & Integer. Also, Dates are sorted as Double (this should make no difference however, Dates are the same as Double, only a bit slower to access).
The module includes Ellis Dee's MedianThreeQuickSort1 for comparison.
Edit!
QSort is safe with negative arrays (Dim MyArray(-6 To 0) is not a problem). It modifies the array to zero base temporarily, which means the rest of the code does not need to account for negative indexes (QSort handles the array as Dim MyArray(0 To 6)).
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Just to let you know I've also taken a look at Ellis Dee's SnakeSort. From what I can see I can't just go on and duplicate the behavior, the extra memory usage is too much imo. I think I can use a few tricks to reduce the memory requirements. I already know how to do it with lngIndex() with a very acceptable level of speed decrease.
The greater challenge is to limit the memory use of the mirror array. Whether my ideas work require a deeper look into how it all works. In general if you can reduce overall memory usage you also gain some speed, because all memory handling does slow things down. This is also why I guess SnakeSort seems to be so slow compared to QuickSort in timed benchmarks: the mirror array as it is currently does not do any good for how much stuff happens in memory. Especially strings will multiply the memory use much more than needs to happen thanks to new string allocations.
Note that I haven't yet taken any timed benchmarks of my own! I've mostly focused in getting things to work in the first place. I guess QSort could be optimized further. As QuickSort and SnakeSort seem to be the strongest competitors for the moment, I think they both need attention to get the speed & memory optimized solutions out so that better benchmarking can be done. The name I'll use for my own version of SnakeSort will be SSort. Just to keep it all nice & short :)
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Snake sort is actually a slightly modified Natural Merge Sort, so I'd just go with that name. It of course has the same memory-hogging pitfall of the standard Merge sort. I believe there are ways to reduce the memory overhead for merge sort, so one assumes that same optimization could be applied to the natural merge sort. I don't actually know what it is, but have seen it mentioned in passing.
I posted an explanation of exactly how it works in this thread over on the Straight Dope. That was before cbrow identified the algorithm as a Natural Merge Sort in this post, which also includes some ideas for making it more efficient.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
In that case I'll name SnakeSort1 as NaturalMergeSort1 and use NSort as the short function name for optimized version. I'll also take a look at the efficiency post. I don't need to read the explaining post to understand how it works, I think I understand stuff via code much better. I'm not so good when it comes to talking about which algorithm is better and why, but I can tell when I see inefficiency and I sure can benchmark :) So here I'm trying to pick the ones that make sense to optimize for speed. For my own interest, for general good and for finding out how Natural Merge Sort can truly perform versus Quick Sort when both have been tweaked closer to the optimal performance.
Can you fix your posts here to use Natural Merge Sort instead of Snake Sort? (Of course it is a good idea to leave a note in the main function post that the name has been changed and why it has been done so.)
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Hi guys i just need a quick help with bubble sort i got from this thread. can anyone help me simply this piece of code further? Is function will sort either asc or desc. Is there any other way to simply the code below? especially on the part marked in red.
vb Code:
Function Sort(ByRef str() As String, ByVal flag As Boolean)
Dim iLower As Integer, iUpper As Integer, iCount As Integer, Temp As String
Dim str2 As String
iUpper = UBound(str)
iLower = 1
Dim bSorted As Boolean
bSorted = False
Do While Not bSorted
bSorted = True
For iCount = iLower To iUpper - 1
[COLOR="Red"] If flag Then
str2 = StrComp(str(iCount + 1), str(iCount), vbTextCompare)
Else
str2 = StrComp(str(iCount), str(iCount + 1), vbTextCompare)
End If
If str2 = 1 Then
Temp = str(iCount + 1)
str(iCount + 1) = str(iCount)
str(iCount) = Temp
bSorted = False
End If[/COLOR]
Next iCount
iUpper = iUpper - 1
Loop
End Function
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
A Few More Sorts:
This is a String Sort I invented once. I call it PH91 (Pigeon Hole 91)
Fairly quick - comparable to Quick Sort in speed, but uses up a lot of stack space because it is recurrsive.
Could be unstable the same as Quicksort, if it runs out of stack space.
It uses a method I haven't seen before. At least I don't know that I reinvented the wheel.
Imagine several desks, each with 91 pigeon holes. Each pigeon hole is labeled [ Space ] & punctuation, then 0-9, then A-Z caps, then a-z Lower case.
You recieve an unsorted list in an Array.
Take the items 1 at a time and place them in the pigeon holes based on their first character. Some holes will hold 0 items and some will hold multiple Items.
When that is finished return to the first hole. proceed through the holes in order till you find the first hole that is filled. If it contains only 1 Item, then that item is finished sorting. place it in the final "sorted" stack to be returned by the rutine. If it contains more than 1 item, take the entire hole's contents to the desk #2 (Recurrsion).
Going through the same steps as just outlined, pigeonhole these Items also, but placed in their holes according to the second character of the item. Again check the pigeonholes. All holes that contain single items are finished sorting. All holes that contain more that 1 Item, take to desk #3 (Recurrsion).
New desks are created with each recurrsion till all items in the present desk are pigeonholed with only 1 item each hole. At this point there is no need to make a new desk and the present items now sorted may be returned to the former desk. Proceed through the remaining pigeonholes in this desk creating new desks as needed and returning the sorted results, until all pigeon holes in this desk have been processed. Return these sorted results to the formeer desk, and process the remaining pigeon holes. Return these sorted result to the former desk also. ect...
Finally you will arrive at the inital desk with all items sorted.
Since the Array list is passed to the rutine ByRef, the original list is now sorted. The rutine now ends & the calling rutine can now use the sorted array.
The rutine as it stands can crash if a word list contains a character asc value below 31 or above 122, but these chararcters are not normal text input, and if care is used not to accept these characters as input, then it's integrity is safe.
Without a doubt some wizard will find a way to speed this up even more, but it's aleady very fast.
Here it is.
Code:
Public Sub PH91(ByRef L$(), ByVal S&)
'Recursive
'Initial call with S = 1
Dim A$(), T$(), K&, P&, I&, J&, C&, B&(91), D&
Const Q& = 31
K& = UBound(B&):P& = UBound(L$):ReDim A$(K&, 0)
For I& = 0 To P&
If S& <= Len(L$(I&)) Then
J& = AscW(Mid$(L$(I&), S&, 1)) - Q&:D& = B&(J&)
If D& > UBound(A$, 2) Then ReDim Preserve A$(K&, D&)
A$(J&, D&) = L$(I&):B&(J&) = D& + 1
Else
L$(C&) = L$(I&): C& = C& + 1
End If
Next I&
K& = UBound(A$, 1)
For I& = 1 To K&
If B&(I&) Then
If B&(I&) > 1 Then
P& = B&(I&) - 1: ReDim T$(P&)
For J& = 0 To P&
T$(J&) = A$(I&, J&)
Next J&
S& = S& + 1:Call PH91(T$, S&):P& = UBound(T$)
For J& = 0 To P&
L$(C&) = T$(J&): C& = C& + 1
Next J&
S& = S& - 1
Else
L$(C&) = A$(I&, 0): C& = C& + 1
End If
End If
Next I&
End Sub
A Binary Search is very fast. I wondered why nobody thought to write a Binary Sort. Maybe they exist but I don't see them listed in any list of fast sorts.
I thought to experiment and write one .
I made it so you recieve the unsorted list as an array. I thought then that the quickest way to sort this list was to create a new one (sorted), then return it.
I used the Binary search method to find each word's new place in the sorted list, but I couldn't figure out another way to create the new list except by insertion.
Aaah! Thats' why it has never turned up in a list of fast sorts. It does great on small lists. The search to find it's new place in the list is very fast even with lists of 1 million entrys. BUT! To insert that item in the new list is a big drag if you must use insertion. It's much faster though than a bubble sort. ( Not saying much there )
I later saw some selection sort rutines, and they are similar in method though not quite the same. I suppose though this should be listed as another selection sort variation.
Well here it is. It is non-recursive.
Code:
Private Sub BinSort(ByRef LST$())
Dim L$(), I&, J&, LT&, P&, A$, Q&, LP&, HP&
LT& = UBound(LST$)
ReDim L$(LT& + 1)
For I& = 0 To LT&
A$ = LST(I&):LP& = 0:HP& = I&:P& = 0
Do
Q& = P&
If A$ > L$(P&) Then
LP& = P&
ElseIf A$ < L$(P&) Then
HP& = P&
End If
P& = Int((LP& + HP&) / 2)
Loop While P& <> Q&
If A$ > L$(P&) And L$(P&) <> "" Then P& = P& + 1
If I& > P& Then
' ********* The Loop below here is the time killer ***********
For J& = I& To P& Step -1
L$(J& + 1) = L$(J&)
Next J&
DoEvents
End If
L$(P&) = A$
Next I&
Redim Preserve L$(LT&)
LST$() = L$()
End Sub
This Insertion drag on the rutine is something that irritates me to no end.
I thought why can't we chunk it up, then combine the chunks after each chunk is sorted?
So I set about to write that too.
This rutine is designed to write a long list with sorted chunks within it. It creates pointers to the beginning of each chunk, so that it can properly assemble them later into a final sorted list.
If the initial list is under the established chunk size then the rutine sorts without chunking, but if over that threshold, it will chunk it up and feed it back into the rutine in smaller list sizes.
I did a lot of experimenting with the chunk size along with various list sizes, then established the best chunk size as a constant.
The Chunking rutine increased the speed of the sort 9 times faster on large lists ( 5000 - 100000 )
I used the Binary Sort rutine listed above, but Chunking should increase the speed of any sort that uses the insertion method. It will not increase the speed of Quicksort, or Shellsort since they do not use the insertion method. The insertion method is slow because of chunk size and this should allieviate that bottleneck somewhat.
It became longer than I wanted, and I suppose some Genius may be able to improve it as well.
It is recursive, but it is not capable of compounded recersion, so it can be used on systems where stack space is limited.
Here it is with the above Binary Sort included.
Code:
Private Sub ChunkSort(ByRef LST$())
Dim I&, J&, LS&, P&, A$, Q&, LP&, HP&
Dim L$(), SL$(), B&()
Const R& = 512 '******** The Chunk size is stored in a constant **********
LS& = UBound(LST$)::ReDim L$(LS& + 1)
'******** Below here is Chunking Rutine **********
If LS& > R& Then
HP& = LS& \ R&:ReDim B&(HP&):ReDim SL$(R& - 1)
Q& = R&
For I& = 0 To LS& Step Q&
If I& + Q& > LS& Then
Q& = LS& - I& + 1: ReDim SL$(Q& - 1)
End If
For J& = 0 To Q& - 1
SL$(J&) = LST$(I& + J&)
Next J&
ChunkSort SL$()
For J& = 0 To Q& - 1
L$(I& + J& + 1) = SL$(J&)
Next J&
B&(I& \ R&) = 0
Next I&
For I& = 1 To LS& + 1
A$ = ""
For J& = 0 To HP&
If B&(J&) < R& Then
P& = J& * R& + B&(J&) + 1
If P& <= LS& + 1 Then
If Trim(L(P&)) <> "" Then
If L$(P&) < A$ Or A$ = "" Then
A$ = L$(P&): LP& = J&
End If
End If
End If
End If
Next J&
B&(LP&) = B&(LP&) + 1:LST$(I& - 1) = A$
DoEvents
Next I&
Exit Sub
End If
'********Below is the Binary Sort Rutine **********
For I& = 0 To LS&
A$ = LST(I&): LP& = 0:HP& = I&:: P& = 0
Do
Q& = P&
If A$ > L$(P&) Then
LP& = P&
ElseIf A$ < L$(P&) Then
HP& = P&
End If
P& = Int((LP& + HP&) / 2)
Loop While P& <> Q&
If A$ > L$(P&) And L$(P&) <> "" Then P& = P& + 1
If I& > P& Then
For J& = I& To P& Step -1
L$(J& + 1) = L$(J&)
Next J&
DoEvents
End If
L$(P&) = A$
Next I&
ReDim Preserve L$(LS&)
LST$() = L$()
End Sub
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Quote:
Originally Posted by
Perry Miller
The rutine as it stands can crash if a word list contains a character asc value below 31 or above 122, but these chararcters are not normal text input, and if care is used not to accept these characters as input, then it's integrity is safe.
I don't visit here much at all anymore, but I have to drop in to say that statement above is far too ignorant. You'll run out of normal text input immediately when you start dealing with stuff like surnames. Take Müller or Selänne. In the internationalized world of today it is not safe to assume 91 characters to be enough for anything.
And in any case whatever code you write you better write it so that unexpected input doesn't crash it. There will always be unexpected input. Some people do unexpected input by purpose. They are called NSA.
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
Merri;
Thanks for your input.
Of course it is not suitable for every application in it's present version, but see how easily it can be changed to allow for other input that would not normally be expected.
In the revision below to accept characters 0 - 255, I simply changed 1 array variable, and the constant.
B&(255), & Const Q& = 0
This is all it takes to allow more varied input, and there should be no crash on non-standard input.
There are some situations where the input is more controlable however, and where that is possible, and
due care is taken, one could use a version which can be Optimized for your particular application.
Code:
Public Sub PH91(ByRef L$(), ByVal S&)
'Recursive
'Initial call with S = 1
Dim A$(), T$(), K&, P&, I&, J&, C&, B&(255), D&
Const Q& = 0
K& = UBound(B&):P& = UBound(L$):ReDim A$(K&, 0)
For I& = 0 To P&
If S& <= Len(L$(I&)) Then
J& = AscW(Mid$(L$(I&), S&, 1)) - Q&:D& = B&(J&)
If D& > UBound(A$, 2) Then ReDim Preserve A$(K&, D&)
A$(J&, D&) = L$(I&):B&(J&) = D& + 1
Else
L$(C&) = L$(I&): C& = C& + 1
End If
Next I&
K& = UBound(A$, 1)
For I& = 1 To K&
If B&(I&) Then
If B&(I&) > 1 Then
P& = B&(I&) - 1: ReDim T$(P&)
For J& = 0 To P&
T$(J&) = A$(I&, J&)
Next J&
S& = S& + 1:Call PH91(T$, S&):P& = UBound(T$)
For J& = 0 To P&
L$(C&) = T$(J&): C& = C& + 1
Next J&
S& = S& - 1
Else
L$(C&) = A$(I&, 0): C& = C& + 1
End If
End If
Next I&
End Sub
Quote:
I don't visit here much at all anymore, but I have to drop in to say that statement above is far too ignorant.
I can't plead Ignorance, I did think of it, but chose not to make it initially in the above variation, so I will admit that I am unaccomodating at times. Please accept my apology.
-
Re: Sorting algorithms (sort array, sorting arrays)
Fantastic thread Ellis Thank you!
Quick question, in Comb sort, why the line:
If lngGap = 10 Or lngGap = 9 Then lngGap = 11
Thanks
-
Re: Sorting algorithms (sort array, sorting arrays)
I use VB as a macro of Excel 2016.
I wrote one application to make a registry of people for my work purposes. After loading all the registered names into one array, I want them to get organized alphabetically. This sorting algorithm proved itself the most useful, simple and quick combined, comparing to the others Ive tested so far. However, there seems to be some logical bug that I just cant see: The names get sorted starting with "W..." continuing to "Z...." and only then it continues to "A..."
Note: The list of names is about 1000 large, but increasing, and I count that in the end it may operate with approximately 30 000 registers.
-
Re: Sorting algorithms (sort array, sorting arrays)
what would be in the form window
can you post this selection sorting along with the form window
-
Re: VB6: Sorting algorithms (sort array, sorting arrays)
This thread was started almost 10 years ago. It's an interesting thread in that it has come back to life every year, or so, since inception. Normally, once a few months have passed, a form doesn't get revived, but this one has more lives than a cat.
Still, most of the people who participated in the thread have since moved on to new things. I'm not sure if any of them are still active. Therefore, i would be best to start a new thread with a new question, as that is more likely to get a good answer than hoping that somebody still follows this one....though in the case of this particular thread, you never know.