-
May 30th, 2007, 06:55 AM
#1
Sorting algorithms (sort array)
While it's no doubt been covered many times before, I was hoping to set up a nice repository of the various sorting algorithms in a single thread. So if you have VB6 code for a technique that isn't listed, please post it.
To keep the code as simple as possible, assume we are sorting single-dimension typed arrays. Your code can work for string or numeric arrays; it will be up to the reader to make any necessary conversions.
I'll start us off with the old standby, the QuickSort:
Code:
' Omit plngLeft & plngRight; they are used internally during recursion
Public Sub QuickSortLong(ByRef plngArray() As Long, Optional ByVal plngLeft As Long, Optional ByVal plngRight As Long)
Dim lngFirst As Long
Dim lngMid As Long
Dim lngLast As Long
Dim lngSwap As Long
If plngRight = 0 Then
plngLeft = LBound(plngArray)
plngRight = UBound(plngArray)
End If
lngFirst = plngLeft
lngLast = plngRight
lngMid = plngArray((plngLeft + plngRight) \ 2)
Do
Do While plngArray(lngFirst) < lngMid And lngFirst < plngRight
lngFirst = lngFirst + 1
Loop
Do While lngMid < plngArray(lngLast) And lngLast > plngLeft
lngLast = lngLast - 1
Loop
If lngFirst <= lngLast Then
lngSwap = plngArray(lngFirst)
plngArray(lngFirst) = plngArray(lngLast)
plngArray(lngLast) = lngSwap
lngFirst = lngFirst + 1
lngLast = lngLast - 1
End If
Loop Until lngFirst > lngLast
If plngLeft < lngLast Then QuickSortLong plngArray, plngLeft, lngLast
If lngFirst < plngRight Then QuickSortLong plngArray, lngFirst, plngRight
End Sub
Last edited by Ellis Dee; Jun 12th, 2007 at 02:54 PM.
-
May 30th, 2007, 06:56 AM
#2
Re: Sorting algorithms
I've added a benchmark utility to compare a dozen different sorting methods to the CodeBank here. Anyone who is reading this because they searched the boards for a solution should read the CodeBank thread instead of this one.
Last edited by Ellis Dee; Jun 12th, 2007 at 08:05 PM.
-
May 30th, 2007, 10:13 AM
#3
Re: Sorting algorithms
I like this sorting routine as an option to the QuickSort for a couple of reasons. First, it's easy to code (about a dozen or so lines). Second, it's not such a memory hog. Third, it runs faster than the QuickSort when the array values are almost already sorted, so it takes advantage of the existing array's values. My super sort routine resembles the old Shell sort, but I modified it a few years ago to make it run even faster.
To test it, design a form with a list box, two command buttons, and a label. Set the list box's Sorted property to true and use it as a benchmark. Enjoy.
Code:
Dim Jump As Long, Temp As String
Dim SortArray() As String, Swapped As Boolean, StartTime As Double
' Change these as you see fit
Const StringLength = 9
Const ListSize = 50000
Private Sub Command1_Click()
List1.Clear
'Generate a random string array to sort
ReDim SortArray(ListSize)
For I = 1 To ListSize
For J = 1 To StringLength
SortArray(I) = SortArray(I) & Chr(Int(Rnd * 26) + 65)
Next
Next
StartTime = Now
Jump = ListSize
While Jump
Jump = Jump \ 2
Swapped = True
While Swapped
Swapped = False
For I = 1 To ListSize - Jump
If SortArray(I) > SortArray(I + Jump) Then
Temp = SortArray(I)
SortArray(I) = SortArray(I + Jump)
SortArray(I + Jump) = Temp
Swapped = True
End If
Next
Wend
Wend
Label1.Caption = "Code Doc's sort required " & Format$(Now - StartTime, "hh:mm:ss") & " to complete."
' This sorting routine sorts 50000, 9-character random strings in less than 11 seconds.
For I = 1 To ListSize
List1.AddItem SortArray(I)
Next
End Sub
Private Sub Command2_Click()
List1.Clear
'Generate a random string array to sort
ReDim SortArray(ListSize)
For I = 1 To ListSize
For J = 1 To StringLength
SortArray(I) = SortArray(I) & Chr(Int(Rnd * 26) + 65)
Next
Next
StartTime = Now
For I = 1 To ListSize
List1.AddItem SortArray(I)
Next
Label1.Caption = "Sorted list box required " & Format$(Now - StartTime, "hh:mm:ss") & " to complete."
' The sorted list box will sort 50000, 9-character random strings in less than 10 seconds.
End Sub
Private Sub Form_Load()
Command1.Caption = "Code Doc's Sort"
Command2.Caption = "Sorted List Box"
' Code Doc is running a Pentium 4 at 1.8 GHz.
End Sub
The routine above uses a string array, but you can modify it for a numerical array. I used strings as a worst-case scenario. With numerical arrays, Code Doc's super sort enjoys an aven bigger advantage over the list box because the numerical values would have to be converted to strings to sort and display the list.
Note also that Jump is halved on each iteration of the sort (Jump \ 2). I have played around with this value and on occasion cut the sort time in half. But, if you get greedy, the sort will sometimes miss now and then. Halving the interval guarantees success.
-
May 31st, 2007, 05:14 AM
#4
Re: Sorting algorithms
Sweet, that's awesome. Would you mind elaborating on the Jump value concept? Also, what's the name? JumpSort?
And to the group: Surely someone has code for the BubbleSort, ShellSort, AmericanSort, etc... I'll be adding what I've dubbed the StudentSort shortly.
-
May 31st, 2007, 05:43 AM
#5
Re: Sorting algorithms
Here's the aforementioned StudentSort:
Code:
Public Sub StudentSort(plngArray() As Long)
Dim lngSwap As Long
Dim i As Long
Dim iMin As Long
Dim iMax As Long
Dim blnSorted As Boolean
iMin = LBound(plngArray)
iMax = UBound(plngArray) - 1
Do While Not blnSorted
blnSorted = True
For i = iMin To iMax
If plngArray(i) > plngArray(i + 1) Then
lngSwap = plngArray(i)
plngArray(i) = plngArray(i + 1)
plngArray(i + 1) = lngSwap
blnSorted = False
End If
Next
Loop
End Sub
-
Jun 1st, 2007, 01:37 PM
#6
Re: Sorting algorithms
I incorporated Ellis Dee's StudentSort into my code. Now there are three command buttons, one for each sort algorithm.
Code:
Dim Jump As Long, Temp As String, I As Long
Dim SortArray() As String, Swapped As Boolean, StartTime As Double
' Change these as you see fit
Const StringLength = 9
Const ListSize = 50000
Private Sub Command1_Click()
BuildStrings
Jump = ListSize
While Jump
Jump = Jump \ 2
Swapped = True
While Swapped
Swapped = False
For I = 1 To ListSize - Jump
If SortArray(I) > SortArray(I + Jump) Then
Temp = SortArray(I)
SortArray(I) = SortArray(I + Jump)
SortArray(I + Jump) = Temp
Swapped = True
End If
Next
Wend
Wend
Label1.Caption = Command1.Caption & " required " & Format$(Now - StartTime, "hh:mm:ss") & " to complete."
' Code Doc's Jump Sort routine sorts 50000, 9-character random strings in less than 10 seconds.
For I = 1 To ListSize
List1.AddItem SortArray(I)
Next
End Sub
Private Sub Command2_Click()
BuildStrings
StartTime = Now
For I = 1 To ListSize
List1.AddItem SortArray(I)
Next
Label1.Caption = Command2.Caption & " required " & Format$(Now - StartTime, "hh:mm:ss") & " to complete."
' The sorted list box will sort 50000, 9-character random strings in less than 10 seconds.
End Sub
Private Sub Command3_Click()
BuildStrings
Swapped = True
While Swapped = True
Swapped = False
For I = 1 To ListSize - 1
If SortArray(I) > SortArray(I + 1) Then
Temp = SortArray(I)
SortArray(I) = SortArray(I + 1)
SortArray(I + 1) = Temp
Swapped = True
End If
Next
Wend
Label1.Caption = Command3.Caption & " required " & Format$(Now - StartTime, "hh:mm:ss") & " to complete."
' Dee's Student Sort routine sorts 5000, 9-character random strings in 15 seconds.
' If you try to sort 50000 strings, take a coffee break or a nap. Yawn... ZZZ ZZZ ZZZ...
For I = 1 To ListSize
List1.AddItem SortArray(I)
Next
End Sub
Private Sub Form_Load()
Command1.Caption = "Code Doc's Jump Sort"
Command2.Caption = "Sorted List Box"
Command3.Caption = "Dee's Student Sort"
Label1.Caption = "Preparing to sort" & Str(ListSize) & " random strings..."
' Code Doc is running a Pentium 4 at 1.8 GHz.
End Sub
Public Sub BuildStrings()
List1.Clear
'Generate a random string array to sort
ReDim SortArray(ListSize)
For I = 1 To ListSize
For J = 1 To StringLength
SortArray(I) = SortArray(I) & Chr(Int(Rnd * 26) + 65)
Next
Next
Label1.Caption = "Now sorting" & Str(ListSize) & " random strings..."
StartTime = Now
End Sub
Caution: The StudentSort is practically useless at any array of over 10,000 items. It does do rather well for small lists, but it's required time increases with the list size in a nonlinear fashion.
On the other hand, Code Doc's JumpSort handles lists of all sizes similar to the sorted list box, and both tend to increase in time with the list size in a rather linear fashion.
I also noticed that the sort routines tend to run at different speeds depending on what Windows is doing at the time.
-
Jun 3rd, 2007, 12:51 PM
#7
Re: Sorting algorithms
Ha! "Student sort" indeed. It's called the bubblesort, and if I'd ever gone to school I'd know that. How embarassing. Oh well, no harm no foul; I'll edit the function name accordingly in any subsequent use of the code.
Googling turned up VB6 code for several of the other common algorithms. I've set up a basic benchmark framework, but I'm not in the mood to mess with it further just yet.
-
Jun 3rd, 2007, 07:30 PM
#8
Re: Sorting algorithms
Ooh, I like jump sort (Student/Bubble was the best that i had come up with)
Here's a minor tweak for dealing with l a r g e strings...
Code:
Private Declare Sub PutMem4 Lib "msvbvm60" (ByVal Addr As Long, ByVal NewVal As Long)
'...
Jump = ListSize
While Jump
Jump = Jump \ 2
Swapped = True
While Swapped
Swapped = False
For I = 1 To ListSize - Jump
If SortArray(I) > SortArray(I + Jump) Then
lPtr = StrPtr(SortArray(I))
PutMem4 VarPtr(SortArray(I)), StrPtr(SortArray(I + Jump))
PutMem4 VarPtr(SortArray(I + Jump)), lPtr
Swapped = True
End If
Next
Wend
Wend
-
Jun 3rd, 2007, 08:53 PM
#9
Hyperactive Member
Re: Sorting algorithms
Originally Posted by Ellis Dee
Ha! "Student sort" indeed. It's called the bubblesort, and if I'd ever gone to school I'd know that. How embarassing. Oh well, no harm no foul; I'll edit the function name accordingly in any subsequent use of the code.
NO! NO!
"Student Sort" is a MUCH better name.
I will henceforth call it that.
Mac
-
Jun 6th, 2007, 03:13 PM
#10
Re: Sorting algorithms
Milk, thanks a bunch for your input. I have expanded my code to include the following:
1) A frame with two option buttons for data types, either strings or 6-digit long integers.
2) A list box to set the size of the list being sorted (anywhere from 1000 to 100,000 elements.
3) A warning message for Dee's Student Sort (sometimes called the Bubble Sort) when the list reaches about 20,000 items. It tends to hang the system by the throat.
Here's the new code:
Code:
Dim Jump As Long, Temp As String, I As Long, J As Integer, ListSize As Long
Dim StringArray() As String, Swapped As Boolean, StartTime As Double
Dim IntegerArray() As Long, MsgResponse As Integer
Const StringLength = 9 ' Change as you see fit
Private Sub Command1_Click()
BuildData
Jump = ListSize
If Option1(0).Value Then
While Jump
Jump = Jump \ 2
Swapped = True
While Swapped
Swapped = False
For I = 1 To ListSize - Jump
If StringArray(I) > StringArray(I + Jump) Then
Temp = StringArray(I)
StringArray(I) = StringArray(I + Jump)
StringArray(I + Jump) = Temp
Swapped = True
End If
Next
Wend
Wend
Else
While Jump
Jump = Jump \ 2
Swapped = True
While Swapped
Swapped = False
For I = 1 To ListSize - Jump
If IntegerArray(I) > IntegerArray(I + Jump) Then
Temp = IntegerArray(I)
IntegerArray(I) = IntegerArray(I + Jump)
IntegerArray(I + Jump) = Temp
Swapped = True
End If
Next
Wend
Wend
For I = 1 To ListSize
List1.AddItem IntegerArray(I)
Next
End If
Label1.Caption = Command1.Caption & " required " & Format$(Now - StartTime, "hh:mm:ss") & " to complete."
If Option1(0).Value Then
For I = 1 To ListSize
List1.AddItem StringArray(I)
Next
Else
For I = 1 To ListSize
List1.AddItem Str(IntegerArray(I))
Next
End If
End Sub
Private Sub Command2_Click()
BuildData
StartTime = Now
If Option1(0).Value Then
For I = 1 To ListSize
List1.AddItem StringArray(I)
Next
Else
For I = 1 To ListSize
List1.AddItem Str(IntegerArray(I))
Next
End If
Label1.Caption = Command2.Caption & " required " & Format$(Now - StartTime, "hh:mm:ss") & " to complete."
End Sub
Private Sub Command3_Click()
If List2.ListIndex > List2.ListCount \ 5 Then
MsgResponse = MsgBox("Caution, this will take some time!", vbOKCancel)
If MsgResponse = vbCancel Then Exit Sub ' User Bails Out
End If
BuildData
Swapped = True
If Option1(0).Value Then
While Swapped = True
Swapped = False
For I = 1 To ListSize - 1
If StringArray(I) > StringArray(I + 1) Then
Temp = StringArray(I)
StringArray(I) = StringArray(I + 1)
StringArray(I + 1) = Temp
Swapped = True
End If
Next
Wend
Else
While Swapped = True
Swapped = False
For I = 1 To ListSize - 1
If IntegerArray(I) > IntegerArray(I + 1) Then
Temp = IntegerArray(I)
IntegerArray(I) = IntegerArray(I + 1)
IntegerArray(I + 1) = Temp
Swapped = True
End If
Next
Wend
End If
Label1.Caption = Command3.Caption & " required " & Format$(Now - StartTime, "hh:mm:ss") & " to complete."
If Option1(0).Value Then
For I = 1 To ListSize
List1.AddItem StringArray(I)
Next
Else
For I = 1 To ListSize
List1.AddItem Str(IntegerArray(I))
Next
End If
End Sub
Private Sub Form_Load()
Command1.Caption = "Code Doc's Jump Sort"
Command2.Caption = "Sorted List Box"
Command3.Caption = "Dee's Student Sort"
Option1(0).Caption = "9-Character Strings"
Option1(1).Caption = "6-Digit Long Integers"
Frame1.Caption = "Data Type Options"
Label1.Caption = "Preparing to sort" & Str(ListSize) & " random data types..."
For I = 1 To 100
List2.AddItem Str(I * 1000)
Next
List2.ListIndex = 0 ' Set index at top of list
End Sub
Public Sub BuildData()
List1.Clear
'Generate a random string array to sort
ListSize = Val(List2.Text)
If Option1(0).Value Then ' Build String Data
ReDim StringArray(ListSize)
For I = 1 To ListSize
For J = 1 To StringLength
StringArray(I) = StringArray(I) & Chr(Int(Rnd * 26) + 65)
Next
Next
Label1.Caption = "Now sorting" & Str(ListSize) & " random 9-character strings..."
Else ' Build Integer Data
ReDim IntegerArray(ListSize)
For I = 1 To ListSize
IntegerArray(I) = Int(Rnd * 900000) + 100000
Next
Label1.Caption = "Now sorting" & Str(ListSize) & " random 6-digit integers..."
End If
StartTime = Now
End Sub
When the list size exceeds 70,000 items, I think Code Doc's Jump Sort regularly beats the VB list box control.
Does anyone have any code floating around for the ancient Insertion Sort? I recall it being slow compared to my Jump Sort, but I recall that it did blow apart the Bubble Sort when confronted with long lists.
-
Jun 7th, 2007, 01:19 AM
#11
Re: Sorting algorithms
There is a sort (Merge Sort) that divides the list into separate parts then merges the results. This is faster than a straight sort because it will operate on much smaller parts .
-
Jun 7th, 2007, 01:24 AM
#12
Re: Sorting algorithms
I found this version of the Merge Sort on DevX.com
Code:
'What is a merge sort? That's a difficult question to
'demonstrate, but not one to answer
'
' basically, the MergeSort function takes a
' group of elements, and you
' split the group in
' half or as near to half as possible.
'
' Then it runs MergeSort on each half.
' If there is two or fewer elements in the array when called,
' however, mergesort compares the elements, sorts them and
' returns them to the previous calling MergeSort. We are said,
' at that point to have descended the tree as far as we can.
' then, the elements are merged together in order by the Merge
' function.and on back up the chain until the whole thing is in
' order
'
' It's very quick. It's much quicker than the simple BubbleSort
' routine at the bottom that sorts the two element arrays for
' the Merge function
'
' Bubble sort's maximum iterations are x*x
' Merge sort's maximum iterations are x log x
'EXAMPLE USAGE
'Dim s(4) As String
'Dim s2() As String
'Dim s3() As String
's(0) = "XX"
's(1) = "A"
's(2) = "RR"
's(3) = "LL"
's(4) = "AABB"
's2 = MergeSort(s)
's3 = BubbleSort(s)
Public Function MergeSort(Strings() As String) As String()
Dim s As Long, i As Long, n() As String, m() As String
Dim u As Long, l As Long, j As Long, x() As String
' we don't sort 1 element!
If (UBound(Strings) - LBound(Strings)) = 0 Then
MergeSort = Strings
Exit Function
ElseIf (UBound(Strings) - LBound(Strings)) = 1 Then
MergeSort = BubbleSort(Strings)
Exit Function
Else
l = SplitArray(Strings, m, n)
m = MergeSort(m)
n = MergeSort(n)
MergeSort = MergeArray(m, n)
End If
End Function
Private Function SplitArray(Strings() As String, _
StringOut1() As String, StringOut2() As String) As Long
' Splits it 50/50 or as close as possible
' if it's just one string, we return StringOut1(0)
' The return number is the total elements of the
' largest of the arrays
Dim i As Long, j As Long, s1() As String, s2() As String
Dim z As Long
If (UBound(Strings) - LBound(Strings)) = 0 Then
StringOut1 = Strings
SplitArray = 1
Exit Function
End If
i = Int((UBound(Strings) + 1) / 2)
z = 0
For j = 0 To (i - 1)
ReDim Preserve StringOut1(z)
StringOut1(z) = Strings(j)
z = z + 1
Next j
z = 0
For j = i To UBound(Strings)
ReDim Preserve StringOut2(z)
StringOut2(z) = Strings(j)
z = z + 1
Next j
If UBound(StringOut1) > UBound(StringOut2) Then
SplitArray = UBound(StringOut1)
Else
SplitArray = UBound(StringOut2)
End If
End Function
Private Function MergeArray(String1() As String, _
String2() As String) As String()
Dim i As Long, j As Long
Dim n() As String, x As Long, y As Long, c As Integer
On Error Resume Next
i = -2
j = -2
i = UBound(String1) + 1
j = UBound(String2) + 1
If (i < 0) And (j < 0) Then Exit Function
If (i > -1) Then
i = UBound(String1) + 1
ElseIf (i = -1) Or ((i = 0) And (String1(0) = "")) Then
MergeArray = String2
Exit Function
End If
If j > -1 Then
i = i + UBound(String2) + 1
ElseIf (j = -1) Or ((j = 0) And (String2(0) = "")) Then
MergeArray = String1
Exit Function
End If
ReDim n(i - 1)
For j = 0 To (i - 1) Step 0
If (x > UBound(String1)) And (y > UBound(String2)) Then
MergeArray = n
Exit Function
End If
c = StrComp(String1(x), String2(y))
If (c = 0) And (x <= UBound(String1) And _
(y <= UBound(String2))) Then
n(j) = String1(x)
n(j + 1) = String2(y)
j = j + 2
y = y + 1
x = x + 1
ElseIf ((c < 0) Or (y > UBound(String2))) _
And (x <= UBound(String1)) Then
n(j) = String1(x)
x = x + 1
j = j + 1
ElseIf ((c > 0) Or (x > UBound(String1))) _
And (y <= UBound(String2)) Then
n(j) = String2(y)
y = y + 1
j = j + 1
End If
Next
MergeArray = n
End Function
Public Function BubbleSort(Strings() As String) As String()
Dim a As Long, b As Long, c As Long, d As Long
Dim e As Integer, f As Integer, g As Integer
Dim i As String, j As String
Dim m() As String, n() As String
e = 1
n = Strings
Do While e <> -1
For a = 0 To UBound(Strings) - 1
i = n(a)
j = n(a + 1)
f = StrComp(i, j)
If f <= 0 Then
n(a) = i
n(a + 1) = j
Else
n(a) = j
n(a + 1) = i
g = 1
End If
Next a
If g = 1 Then
e = 1
Else
e = -1
End If
g = 0
Loop
BubbleSort = n
End Function
-
Jun 7th, 2007, 10:31 AM
#13
Re: Sorting algorithms
Sweet, thanks much Randem. I found some time and motivation to get back to this; I'm setting up a project to benchmark the following sorts:- Bubble sort
- Cocktail sort
- Heap sort
- Insertion sort
- Jump sort
- Merge sort
- Quick sort
- Selection sort
- Shell sort
Unlike other benchmark programs I have posted to the boards, this one will actually have a user interface so that you can compare the different algorithms for a particular environment.
All sorting code will work with variant arrays in order to be able to compare both numeric and string sorting. Also, each algorithm will have two versions: one for single-dimension arrays, and another for two-dimensional arrays.
I figure it'll be done by the end of the week if all goes well.
In the meantime, does anybody know of any algorithms not included in my list?
Last edited by Ellis Dee; Jun 7th, 2007 at 10:58 AM.
Reason: Added cocktail sort to the list
-
Jun 7th, 2007, 11:12 AM
#14
Re: Sorting algorithms
Testing on one dimension leaves stability untested. For equal values, relative order should be maintained (if there are two 3s then after sort the initially first 3 should still come first of the two). Is the jump sort stable?
Will try to recreate modified insert when I have time... seems I didn't post it and I lost my copy.
-
Jun 7th, 2007, 11:30 AM
#15
Re: Sorting algorithms
Stability is not a requirement in a sorting algorithm. Many of the common algorithms are unstable: quicksort, selection sort, shell sort, etc...
-
Jun 7th, 2007, 11:56 AM
#16
Re: Sorting algorithms
Hey Code Doc, check out the wikipedia description of the Comb sort. Is that basically the same idea as your jump sort? (It sounds similar, but I can't tell.)
-
Jun 7th, 2007, 12:18 PM
#17
Re: Sorting algorithms
Originally Posted by Ellis Dee
Hey Code Doc, check out the wikipedia description of the Comb sort. Is that basically the same idea as your jump sort? (It sounds similar, but I can't tell.)
Comb Sort looks like the sort routine I was playing around with at the same time they claim they invented it (1991). It's not the same as the Jump Sort but tries to improve on it. My Jump Sort looks closer to the old Shell Sort but my code seems tighter. What I worry about in the Comb Sort is the "magic 1.3" and a few other numbers in that code that look heuristic.
Iteration count is perhaps the most important key to speed, but taking advantage of the array structure that may already be partially sorted is the other major factor. In that case, I believe my Jump Sort reigns supreme and works like lightning--even occasionally outrunning the QuickSort.
What amazes me is the Jump Sort can outrun the VB sorted list box with large numbers of initially random elements (> 70000).
-
Jun 7th, 2007, 06:52 PM
#18
Re: Sorting algorithms
Originally Posted by Ellis Dee
In the meantime, does anybody know of any algorithms not included in my list?
Library sort (insertion sort with gaps)
-
Jun 7th, 2007, 07:01 PM
#19
Re: Sorting algorithms
Originally Posted by Ellis Dee
Stability is not a requirement in a sorting algorithm. Many of the common algorithms are unstable: quicksort, selection sort, shell sort, etc...
I disagree... and it is one of the criteria along with no. of comparisons, no. of swaps (both considered, speed in general), memory use, and ease of implementation (can the prog language implement it).
Quicksort is stable if you shift to the left of pivot < and not <= for increasing sort as the relative positions of the moved items remain the same.
Selection is stable cause your picking leftmost from select range for increasing result set (and from right repectively for decreasing in result).
Merge is stable.
Heap is stable.
Insert is stable.
Bubble is stable.
Not totally familiar with shell sort but yeah I don't think its stable cause samples always test against single dimension data.
Last edited by leinad31; Jun 7th, 2007 at 07:49 PM.
-
Jun 8th, 2007, 07:58 AM
#20
Re: Sorting algorithms
The benchmark program is shaping up nicely. The list of algorithms is now:- BubbleSort
- Cocktail Sort
- Comb Sort
- Gnome Sort
- Heap Sort
- Insertion Sort
- Jump Sort
- Library Sort
- Merge Sort
- Quick Sort
- Selection Sort
- Shell Sort
- Smooth Sort
The simpler ones I just coded freehand based on the pseudocode wikipedia lists. The more advanced ones I've been able to find code for on devx.
The program includes somewhat verbose descriptions of each algorithm, compiled (and edited) from various online sources. I've included the text for three of the algorithms below. Code Doc, based on this, would you mind writing up a description for the jump sort that I can include? Also, it appears to me to be stable. Is it?
Bubble Sort
Bubble sort is a simple sorting algorithm. It works by repeatedly stepping through the list to be sorted, comparing two items at a time and swapping them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which means the list is sorted.
The algorithm gets its name from the way smaller elements "bubble" to the top (i.e. the beginning) of the list via the swaps. One way to optimize bubblesort (implemented here) is to note that, after each pass, the largest element will always move down to the bottom. Thus it suffices to sort the remaining n - 1 elements each subsequent pass.
Although simple, this algorithm is highly inefficient and is rarely used except in education
Cocktail Sort
Cocktail sort is a variation of bubble sort that sorts in both directions each pass through the list.
One optimization (implemented here) is to add an if-statement that checks whether there has been a swap after the first pass each iteration. If there hasn't been a swap the list is sorted and the algorithm can stop.
Gnome Sort
Gnome sort is a sorting algorithm which is similar to insertion sort except that moving an element to its proper place is accomplished by a series of swaps, as in bubble sort. It is conceptually simple, requiring no nested loops. In practice the algorithm has been reported to generally run as fast as Insertion sort, although this depends on the details of the architecture and the implementation
The name comes from the behavior of the Dutch garden gnome in sorting a line of flowerpots. He looks at the flower pot next to him and the previous one; if they are in the right order he steps one pot forward, otherwise he swaps them and steps one pot backwards. If there is no previous pot, he steps forwards; if there is no pot next to him, he is done
Effectively, the algorithm always finds the first place where two adjacent elements are in the wrong order, and swaps them. It takes advantage of the fact that performing a swap can only introduce a new out-of-order adjacent pair right before the two swapped elements, and so checks this position immediately after performing such a swap.
Last edited by Ellis Dee; Jun 8th, 2007 at 08:46 AM.
-
Jun 8th, 2007, 09:07 AM
#21
Re: Sorting algorithms
Originally Posted by Milk
I'm having trouble with this one. Because of the name, it's virtually un-googlable. And the description reads like Greek to me:
Let A be an n-element array to be sorted. These elements are inserted one at a time in random order into a sorting array S of size (1+")n. The insertions proceed in log n rounds as follows. Each round doubles the number of elements inserted into S and doubles the prefix of S where elements reside. Specifically, round i ends when element 2i is inserted and the elements are rebalanced. Before the rebalance, the 2i elements are in the first (1+")2i positions. A rebalance moves them into the first (2+2")2i positions, spreading the elements as evenly as possible. We call 2 + 2" the spreading factor.
During the ith round, the 2i−1 elements in S at the beginning of the round are called support elements, and their initial positions are called support positions. The 2i−1 elements inserted before the end-of-round rebalance are called intercalated elements.
The insertion of 2i−1 intercalated elements within round i is performed the brute force way: search for the target position of the element to be inserted by binary search (amongst the 2i−1 support positions in S), and move elements of higher rank to make room for the new element. Not all elements of higher rank need to be moved, only those in adjacent array positions until the nearest gap is found.
(This is a straight text dump from the pdf in the link. See the "Algorithm" section in the pdf for a clearer description.)
Is anybody up to the challenge of writing a library sort in vb6?
-
Jun 8th, 2007, 09:18 PM
#22
Re: Sorting algorithms
Its a modified insertion sort, it's google-able http://en.wikipedia.org/wiki/Library_sort
-
Jun 8th, 2007, 09:36 PM
#23
Re: Sorting algorithms
Here's a small code snippet i use in my sorting algorythms for directory sorting.
Code:
Function AreTheyAlphabetized(Entry1 As String, Entry2 As String) As Boolean
' the function will return a true if no change needed, or a false if
'entry2 should be first
Dim NumSteps As Long, Cl As Long
Dim Work1 As String, Work2 As String, temp As String
If Len(Entry1) > Len(Entry2) Then
NumSteps = Len(Entry1)
Else
NumSteps = Len(Entry2)
End If
Work1 = LCase(Entry1) + Space(NumSteps)
Work1 = Left(Work1, NumSteps)
Work2 = LCase(Entry2) + Space(NumSteps)
Work2 = Left(Work2, NumSteps)
For Cl = 1 To NumSteps
If Asc(Mid(Work1, Cl, 1)) < Asc(Mid(Work2, Cl, 1)) Then
AreTheyAlphabetized = True
Exit Function
ElseIf Asc(Mid(Work1, Cl, 1)) > Asc(Mid(Work2, Cl, 1)) Then
AreTheyAlphabetized = False
Exit Function
End If
Next Cl
AreTheyAlphabetized = True
End Function
You may find it useful. I use it for sorting listviews since the built-in sort sucks big-time.
I also noticed no one is using Recursion in their sorting routines?
-
Jun 9th, 2007, 12:25 AM
#24
Re: Sorting algorithms
Originally Posted by Lord Orwell
Here's a small code snippet i use in my sorting algorythms for directory sorting.
Code:
Function AreTheyAlphabetized(Entry1 As String, Entry2 As String) As Boolean
' the function will return a true if no change needed, or a false if
'entry2 should be first
Dim NumSteps As Long, Cl As Long
Dim Work1 As String, Work2 As String, temp As String
If Len(Entry1) > Len(Entry2) Then
NumSteps = Len(Entry1)
Else
NumSteps = Len(Entry2)
End If
Work1 = LCase(Entry1) + Space(NumSteps)
Work1 = Left(Work1, NumSteps)
Work2 = LCase(Entry2) + Space(NumSteps)
Work2 = Left(Work2, NumSteps)
For Cl = 1 To NumSteps
If Asc(Mid(Work1, Cl, 1)) < Asc(Mid(Work2, Cl, 1)) Then
AreTheyAlphabetized = True
Exit Function
ElseIf Asc(Mid(Work1, Cl, 1)) > Asc(Mid(Work2, Cl, 1)) Then
AreTheyAlphabetized = False
Exit Function
End If
Next Cl
AreTheyAlphabetized = True
End Function
You may find it useful. I use it for sorting listviews since the built-in sort sucks big-time.
I also noticed no one is using Recursion in their sorting routines?
Why not use StrComp()?
AreTheyAlphabetized("a", "b") is equivalent to StrComp("a", "b") <= 0
The only problem with the listview sort is it treats everything as text so it fails on numerics, dates, etc. If it sucks big time then maybe its your use of the listview sort that has problems as your function name implies use on text data.
As to recursion, one of the considerations for a good algorithm is memory usage, call stack included.
Last edited by leinad31; Jun 9th, 2007 at 12:33 AM.
-
Jun 9th, 2007, 02:21 AM
#25
Re: Sorting algorithms
strcomp didn't work too well for strings of different lengths. This subroutine sorts everything, even numbers properly, because numbers are lower in the ascii set. And it ignores upper and lower case, but as you can see from the code this can be changed.
Listview also failed on something else: I show a directory with it and the . and .. entries get moved from the top. Or i want directories listed first. All in all it is a really weak routine and if you want to fix any of its shortcomings you have to disable it altogether or it resorts everything you fix.
As for the function name suggesting anything, feel free to rename it to isitalphnumericalized. I wrote it years ago and i knew i would need a descriptive sub name so i used one
-
Jun 9th, 2007, 03:49 AM
#26
Re: Sorting algorithms
Originally Posted by Lord Orwell
strcomp didn't work too well for strings of different lengths. This subroutine sorts everything, even numbers properly, because numbers are lower in the ascii set. And it ignores upper and lower case, but as you can see from the code this can be changed.
Listview also failed on something else: I show a directory with it and the . and .. entries get moved from the top. Or i want directories listed first. All in all it is a really weak routine and if you want to fix any of its shortcomings you have to disable it altogether or it resorts everything you fix.
As for the function name suggesting anything, feel free to rename it to isitalphnumericalized. I wrote it years ago and i knew i would need a descriptive sub name so i used one
Then the problem wasnt with the listview or its sorting algorithm, technically there was no problem... but rather you wanted a sort order different to ascii or a sort order where some items remain where they are (unsorted). Same can be said of number sorting, only text sort is provided cause the data is stored in the listview as string anyway (listviuew.Text property, .Substring() etc).
It seems you misunderstood what I meant with it fails sorting numbers, dates, etc. I meant this output 1, 11, 12, 2, 20, 3 instead of 1, 2, 3, 11, 12, 20 cause they are treated as text. I don't see how your algorithm performs otherwise as your also using strings and comparing from the first character.
Also, I'm surprised that you still havent encountered an error with Asc("") using your code when its supposed to handle two strings of different lengths. Not really being particular this moment on naming convention or shortcomings, but an error is an error.
With regard to the thread topic, for speed consideration of sorting algo you want to simplify the comparison, and not only minimize the number of comparisons made, as much as possible...
Last edited by leinad31; Jun 9th, 2007 at 04:05 AM.
-
Jun 9th, 2007, 04:11 AM
#27
Re: Sorting algorithms
Smooth sort is out. Not ony did I find no code for it, but I found this wiki quote: (bolding mine)
The smoothsort sorting algorithm [1] is a variation of heapsort developed by Edsger Dijkstra in 1981. Like heapsort, smoothsort's upper bound is O(n log n). The advantage of smoothsort is that it comes closer to O(n) time if the input is already sorted to some degree, whereas heapsort averages O(n log n) regardless of the initial sorted state. Due to its complexity, smoothsort is rarely used.
Originally Posted by leinad31
Code for it isn't google-able. I found the wiki entry no problem, but there's no pseudocode to mess with.
I could throw something together that meets the criteria of the narrative description, but I have no way to know of any subtle issues / pitfalls / optimizations inherent to the algorithm, so freewheeling a solution wouldn't give it a fair shake.
As one of the posters I most respect around here, if you put together a library sort routine I would consider it an optimal implementation. Short of an implementation by you, Merri or Logophobic, I may have to remove library sort from the benchmark.
Lord Orwell, I'm not convinced your code is faster than a listview's native sort. Due to the variant string functions and lack of a byte array implementation, I would guess that it's actually significantly slower.
Last edited by Ellis Dee; Jun 9th, 2007 at 04:50 AM.
Reason: "it's", not "its"
-
Jun 9th, 2007, 04:35 AM
#28
Re: Sorting algorithms
i never had an error from asc("") simply because i checked the lengths of the strings first (see above but relisted here)
Code:
If Len(Entry1) > Len(Entry2) Then
NumSteps = Len(Entry1)
Else
NumSteps = Len(Entry2)
End If
I only stepped to the lengh of the shortest line.
And sorry but i didn't paste the whole function. I see what you mean about the numbers. The function i posted is only part of a wrapper function that sorts using VAL first and if both VAL are the same then it calls this function.
(little known fact: val(" 1 horse town") would return the value of 1)
Last edited by Lord Orwell; Jun 9th, 2007 at 04:39 AM.
-
Jun 9th, 2007, 09:21 AM
#29
Re: Sorting algorithms
Kk, I see what you mean. I missed the concatenation cause you used + to concatenate instead of &. Still I can't help but feel this was translated from C++ code inefficiently.
-
Jun 9th, 2007, 10:42 AM
#30
Re: Sorting algorithms
Here's an optimized version of that function, Lord Orwell:
Code:
' return a true if no change needed or false if pstr2 should be first
Function AreTheyAlphabetized(pstr1 As String, pstr2 As String) As Boolean
Dim bytArray1() As Byte
Dim bytArray2() As Byte
Dim lngLen As Long
Dim i As Long
If Len(pstr1) > Len(pstr2) Then
lngLen = Len(pstr2) - 1
Else
lngLen = Len(pstr1) - 1
End If
bytArray1 = StrConv(LCase$(pstr1), vbFromUnicode)
bytArray2 = StrConv(LCase$(pstr2), vbFromUnicode)
For i = 0 To lngLen
Select Case bytArray1(i)
Case Is < bytArray2(i)
AreTheyAlphabetized = True
Erase bytArray1, bytArray2
Exit Function
Case Is > bytArray2(i)
Erase bytArray1, bytArray2
Exit Function
End Select
Next
Erase bytArray1, bytArray2
AreTheyAlphabetized = (Len(pstr1) - 2 < lngLen)
End Function
But I must be missing something, because all that logic can be boiled down to this:
Code:
' return a true if no change needed or false if pstr2 should be first
Function AreTheyAlphabetized(pstr1 As String, pstr2 As String) As Boolean
AreTheyAlphabetized = (LCase$(pstr1) <= LCase$(pstr2))
End Function
And I'm fairly sure this would run even faster.
Last edited by Ellis Dee; Jun 9th, 2007 at 10:52 AM.
-
Jun 9th, 2007, 12:25 PM
#31
Re: Sorting algorithms
Code below is a modified (optimized, improved, etc) insertion sort algorithm that's based on divide and conquer concept for locating the insertion point.
Please let me know if there are logical errors that I missed as I havent tested this with very large data (especially the API part). If I managed to recreate it properly then it should work.
Code:
Option Explicit
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (pDst As Any, pSrc As Any, ByVal ByteLen As Long)
Private SortStrArr() As String
Private Sub GenRnd_StrArr(ByRef ListSize As Long, StrLength As Long)
Dim i As Long
Dim j As Long
Dim TmpArr() As String
ReDim SortStrArr(ListSize - 1)
ReDim TmpArr(StrLength - 1)
For i = 0 To ListSize - 1
For j = 0 To StrLength - 1
TmpArr(j) = Chr$(Int(Rnd * 26) + 65)
Next
SortStrArr(i) = Join(TmpArr, "")
Next
End Sub
Private Sub CopyArrToList(ByRef ArrRef As Variant)
Dim i As Long
If Not IsArray(ArrRef) Then Exit Sub
List1.Clear
For i = LBound(ArrRef) To UBound(ArrRef)
List1.AddItem ArrRef(i)
Next
End Sub
Private Function InversionChk_StrArr(ByRef StrArr() As String, Upper As Long) As Boolean
Dim i As Long
For i = LBound(SortStrArr) To Upper - 1
If StrComp(StrArr(i), StrArr(i + 1)) > 0 Then Exit Function
Next
InversionChk_StrArr = True 'success, no inversions
End Function
Private Sub InsertionEx_StrArr(ByRef StrArr() As String)
Dim lCurr As Long, lPos As Long
Dim lLBound As Long, lLB As Long, lUB As Long
'Since just for the concept, ascending sort only
lLBound = LBound(StrArr)
For lCurr = lLBound + 1 To UBound(StrArr)
If StrComp(StrArr(lCurr - 1), StrArr(lCurr)) > 0 Then
'Optimization: find insert point only on inversion
'An inversion occured, locate insert point
lPos = lLBound
If StrComp(StrArr(lLBound), StrArr(lCurr)) <= 0 Then
'Optimization: range for insertion diminishes roughly by half each Do Loop iteration.
' Either range to left or right of mid value selected as next lLB-lUB range.
'Locate insert point in the middle of lLBound-lCurr.
lLB = lLBound + 1
lUB = lCurr
lPos = lLB + ((lUB - lLB) \ 2) 'calculate mid index
Do
Select Case StrComp(StrArr(lPos), StrArr(lCurr))
Case -1, 0: lLB = lPos + 1
Case 1: lUB = lPos
End Select
lPos = lLB + ((lUB - lLB) \ 2)
Loop While lLB < lUB
lPos = lLB
End If
'lPos now either equal to lLBound or lLB
'Shift elements accordingly
'Dim sTmp As String, i As Long
' sTmp = StrArr(lCurr)
' For i = lCurr To lPos + 1 Step -1
' StrArr(i) = StrArr(i - 1)
' Next
' StrArr(lPos) = sTmp
'Optimization: bulk shift values or references (in this case references) using API
'Shift elements accordingly using API
Dim lAddrCurr As Long, lAddrPos As Long
Const LSIZE As Long = 4 '4 bytes since we're copying addresses
CopyMemory lAddrCurr, ByVal VarPtr(StrArr(lCurr)), LSIZE
CopyMemory lAddrPos, VarPtr(StrArr(lPos)), LSIZE
CopyMemory ByVal lAddrPos + LSIZE, ByVal lAddrPos, (lCurr - lPos) * LSIZE
CopyMemory ByVal lAddrPos, lAddrCurr, LSIZE
End If
Next
End Sub
Private Sub Command1_Click()
Call GenRnd_StrArr(4000, 20)
Call InsertionEx_StrArr(SortStrArr) 'sort
Call CopyArrToList(SortStrArr) 'display in List1
MsgBox InversionChk_StrArr(SortStrArr, UBound(SortStrArr))
End Sub
Private Sub Form_Load()
Randomize
End Sub
Essentially the modifications result in less comparisons and takes advantage of API which results in less movement (swaps between variables). Memory use is roughly the same as array size (minimal memory for working variables), it should be stable (relative positions of equal values retained after sort), and if you have sufficient VB knowledge/familiarity (especially with the API) it shouldn't be that hard to implement.
BTW, since above is for demonstration only its hardcoded for ascending sort of strings.
Last edited by leinad31; Jan 7th, 2008 at 01:15 AM.
-
Jun 9th, 2007, 01:29 PM
#32
Re: Sorting algorithms
Looks good to me. Fair warning: I'm going to remove the API calls in order to genericize it for the benchmark routine. Anybody who wants that optimization can customize the routine themselves. (If they don't know how, they won't even know what they're missing.)
I'll poke around and see if the technique is documented elsewhere. In the meantime, what do you want to call it? Single-word names only, please. heh.
-
Jun 9th, 2007, 01:34 PM
#33
Re: Sorting algorithms
Well... i've always called it modified insertion and never thought of giving it another name since 2001...
-
Jun 9th, 2007, 04:52 PM
#34
Re: Sorting algorithms
i don't know how to program in C++ so i can assure you i didn't convert it.
how does this line work
(LCase$(pstr1) <= LCase$(pstr2))
i didn't think you could use that operator on strings.
I didn't use a byte array because this sub actually predates my using vb6. In 5 the strconv function didn't do that. We had to use the api CopyMemory and if you didn't use it just right... well you know.
I will try to learn a little from what you have posted
-
Jun 9th, 2007, 06:13 PM
#35
Re: Sorting algorithms
Preliminary results are in. First off, kudos to you, Code Doc; Jump sort kicks all kinds of ass.
The user interface and benchmark logic is finshed, so now it's just a matter of implementing the algorithms. So far I've successfully implemented quick, heap, jump, insertion, cocktail, and bubble. Comb functions, but is suspiciously slow; I think I may have implemented it poorly. And while I did write gnome, it doesn't sort properly. So far I have yet to implement selection, merge, shell, and leinad, but I already have vb code for all of them.
Library sort remains the odd man out, and is perilously close to being abandoned.
I have to say, this benchmark program is way overcoded, but I like it. My excuse for the unnecessary bells and whistles is that this is the first project where I've ever used ADO.
Lord Orwell, the <, >, <= and >= operators all work on strings just as well as the = operator.
-
Jun 9th, 2007, 06:54 PM
#36
Re: Sorting algorithms
well that will make my future coding a little easier. I've been programming in basic since Version 2 was hardcoded in trs80s. I don't always keep up on operator overloading in the new versions. Example: I upgraded to vb6 when i found out about instrrev.
I just did a little testing with <>= on strings. They work great. This is going to make my life easier thanks!
-
Jun 9th, 2007, 07:02 PM
#37
PowerPoster
Re: Sorting algorithms
Originally Posted by Lord Orwell
I just did a little testing with <>= on strings. They work great. This is going to make my life easier thanks!
You've only just found out about it? I shudder to think what else you've yet to learn about VB6 :-P
Although, I understand...when I moved from QBasic to VB6 I always used + to join strings together...it took a lot of getting used to using & for strings and + for numbers...all this time later, it seems I learn something new at least once a week about what's possible with VB6, but that's probably because I am always actively coding something for fun and seeing what I am capable of and I have tons of unfinished projects and planned projects that'll never get even *started* :-)
Well, everyone else has been doing it :-)
Loading a file into memory QUICKLY - Using SendKeys - HyperLabel - A highly customisable label replacement - Using resource files/DLLs with VB - Adding GZip to your projects
Expect more to come in future
If I have helped you, RATE ME! :-)
I love helping noobs with their VB problems (probably because, as an amateur programmer, I am only slightly better at VB than them :-)) but if you SERIOUSLY want to get help for free from a community such as VBForums, you have to first have a grounding (basic knowledge) in VB6, otherwise you're way too much work to help...You've got to give a little if you want to get help from us, in other words!
And we DON'T do your homework. If your tutor doesn't teach you enough to help you make the project without his or her help, FIND A BETTER TUTOR or try reading books on programming! We are happy to help with minor things regarding the project, but you have to understand the rest of it if you want our help to be useful.
-
Jun 9th, 2007, 09:20 PM
#38
Re: Sorting algorithms
i had control creation edition. No Help Files.
well it's a non-issue now. I am migrating (again) this time to .net.
with msdn collection (finally)
Another example of whole reams of code replaced by one statement. .net lets me load a file directly into a byte array. I do that all the time.
Last edited by Lord Orwell; Jun 9th, 2007 at 09:23 PM.
-
Jun 9th, 2007, 09:33 PM
#39
Re: Sorting algorithms
Originally Posted by Lord Orwell
.net lets me load a file directly into a byte array.
VB6 can do that too...
Library Sort Maybe I could code it, but I don't think it's worth the effort.
Ellis Dee: Your description of Gnome Sort is identical to Wikipedia's description of Insertion Sort on an array.
-
Jun 9th, 2007, 10:33 PM
#40
Re: Sorting algorithms
Re: Library Sort, whatever you gain fom having the "buffers", you loose to additional array maintenance tasks.
Ellis Dee, the memory block handling (or handling contiguous sorted elements) is the meat of the optimization and the reason I settled with optimizing insert rather than other sorting algo.. Without it, insert performance is hardly any better compared to selection. Try timing it with the API included.
Last edited by leinad31; Jun 9th, 2007 at 10:40 PM.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|