[Figure: two sequences of the same table - "Sorted by name", "User sorts by number", "User sorts by name again". With an unstable sort this might happen (pink shows the shuffling); with a stable sort the user will always see the same order after the final re-sort by name.]
Switching to Insertion

[Chart]
The chart at left shows how different values of the M parameter (the cutoff below which we switch to insertion sort; it appears as minSubFile in the code below) affect sort performance, for an array of 5000 distinct long integers. These are NOT running averages; I ran the test for each M only once, but the trend lines are clear. The different colours indicate how much order (and what sort of order) there is in the input data.
Middle Element

[Chart]
The chart at left shows timings for a conventional Quicksort algorithm (VB implementation, sorting long integers via an index array, running under the VB IDE on a PII/350) which always chooses the middle element of a range as the partition element, when supplied with the same varieties of input ordering described above.
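For what it's worth, the middle-element choice is a one-liner. A small sketch (MiddlePivot is a made-up name for illustration; note that VB's \ operator is integer division, which is what you want here - writing Int(l + r) / 2 applies the Int to the sum before the division happens):

Private Function MiddlePivot(ByVal l As Long, ByVal r As Long) As Long
    MiddlePivot = (l + r) \ 2   'index of the middle element of the range l..r
End Function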
Random Element

[Chart]
The chart at left shows timings for the same Quicksort algorithm, modified to choose a random partition element (again, these aren't averages; I only ran each test once, which is why there are spikes here and there). There's not much difference. Arrays that are already in ascending order tend to sort faster than ones in reverse order because fewer swaps are necessary. Arrays mostly in reverse order (usually) sort faster than ones mostly in random order, because early partitions will swap them once, and then they'll be in order and won't need swapping again. For N below 1000 the extra cost of calling Rnd() seems to outweigh the advantages.
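The change is a one-liner. A minimal sketch, assuming the same l and r bounds used in the code below (RandomPivot is just a name I've picked for illustration; call Randomize once before the first sort so Rnd() isn't seeded the same way every run):

Private Function RandomPivot(ByVal l As Long, ByVal r As Long) As Long
    RandomPivot = l + Int(Rnd() * (r - l + 1))   'Rnd() returns a value in [0, 1), so the result is in l..r
End Function

Inside Quicksort you'd then write p = RandomPivot(l, r) instead of the middle-element choice.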
Avoiding Recursion

[Chart]
I'll give you the numbers, because the differences aren't that clear from the chart.
When we compare for 20480 records, we get:

[Table: timings at 20480 records]
Const minSubFile = 9

Sub ISortAlgorithm_Sort(d As Variant, ByVal l As Long, ByVal r As Long, a As Variant)
    If l + minSubFile <= r Then
        Quicksort d, l, r, a        'Don't bother quicksorting small arrays
    End If
    InsertionSort d, l, r, a        'Straight insertion sort finishes the job
End Sub
Sub Quicksort(d As Variant, ByVal l As Long, ByVal r As Long, a As Variant)
    'Description: Two-sided recursion version (may blow the stack limit!)
    Dim p As Long           'index of index of partition element
    Dim v As Variant        'partition value
    Dim rr As Long          'original right boundary
    Dim temp As Long        'for swapping indices
    rr = r                                      'remember the index of the end of the partition
    p = (l + r) \ 2                             'pick partition element (\ is integer division)
    v = d(a(p))                                 'remember partitioning value
    temp = a(p): a(p) = a(l): a(l) = temp       'swap index of partition element to the bottom
    p = l                                       'remember where we put it
    Do
        Do
            For l = l + 1 To r - 1              'scan up for an element >= the partition value
                If v <= d(a(l)) Then Exit For
            Next
            For r = r To l + 1 Step -1          'scan down for an element <= the partition value
                If d(a(r)) <= v Then Exit Do
            Next
            'the scans have met: fix up l, drop the partition element into place, and recurse
            If rr < l Then                      'if we tacked an infinity on the right we could do without this
                l = l - 1
            ElseIf v < d(a(l)) Then
                l = l - 1
            End If
            temp = a(p): a(p) = a(l): a(l) = temp   'swap the partition element into its final position
            If p <= l - minSubFile Then Quicksort d, p, l - 1, a
            If rr - l >= minSubFile Then Quicksort d, l + 1, rr, a
            Exit Sub
        Loop
        temp = a(l): a(l) = a(r): a(r) = temp   'swap the out-of-place pair found by the two scans
        r = r - 1
    Loop
End Sub
Private Sub InsertionSort(d As Variant, ByVal l As Long, ByVal r As Long, a As Variant)
    Dim j As Long       'index when searching for inversions
    Dim i As Long       'index when correcting inversions
    Dim v As Variant    'value of element on the move
    Dim oldI As Long    'index of element being moved
    For j = l + 1 To r
        If d(a(j)) < d(a(j - 1)) Then
            i = j - 1
            oldI = a(j)
            v = d(oldI)
            Do
                a(i + 1) = a(i)
                i = i - 1
                If i < l Then Exit Do   'because we don't have a magic low value at the bottom
            Loop Until d(a(i)) <= v
            a(i + 1) = oldI
        End If
    Next
End Sub
Sub randomizeArrayOrder(ByRef a As Variant)
    Dim i As Long       'index of element to swap
    Dim j As Long       'index of element to swap it with
    Dim m As Long       'number of elements remaining that might be swapped
    Dim temp As Long    'for swapping elements (declare as Variant if a isn't an array of Longs)
    m = UBound(a) - LBound(a) + 1
    For i = LBound(a) To UBound(a) - 1
        j = i + Int(m * Rnd())
        temp = a(i)
        a(i) = a(j)
        a(j) = temp
        m = m - 1
    Next
End Sub
When you are randomizing order by exchanging, it takes three copies to exchange two
elements. If you've got more room to work in you can go a bit quicker.
Sub randomizeArrayOrder(ByRef aIn As Variant, ByRef aOut As Variant)   'aIn gets trashed
    Dim i As Long   'index of element to replace
    Dim j As Long   'index of element to replace it with
    Dim m As Long   'number of elements remaining that might be swapped
    m = UBound(aIn) - LBound(aIn) + 1
    For i = LBound(aIn) To UBound(aIn)
        j = i + Int(m * Rnd())
        aOut(i) = aIn(j)
        aIn(j) = aIn(i)     '<--two copies, not three, but aIn is trashed
        m = m - 1
    Next
End Sub
'Note that for this to work aIn must not be the same array as aOut!
'You can trick VB into getting that right by calling it like this: randomizeArrayOrder (a), a
'The brackets force VB to make a temporary copy of a. VB can copy an entire array
'in one hit, saving a lot of time.
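To make that call concrete, here's a small sketch of a caller (ShuffleDemo and the fill values are made up for illustration; the bracketed first argument is the copy-forcing trick described above):

Sub ShuffleDemo()
    Dim a() As Long
    Dim i As Long
    ReDim a(1 To 10)
    For i = 1 To 10
        a(i) = i                        'fill with something recognisable
    Next
    Randomize                           'seed Rnd() so each run differs
    randomizeArrayOrder (a), a          'the brackets pass a copy of a as aIn; a itself is aOut
End Sub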
Duplicate Values...

[Chart]
It's the red and pink lines that are really scary, though the purple line doesn't look too good either. What this graph shows is that using Quicksort to sort on a field which has only a few possible values (or where most of the records will have the same value, say a comment field that isn't used much, or the third line of an address field, or the "reasons you should cut my pay" field...) is not a good idea. Or at least, the usual version of Quicksort isn't. The trouble is in the partitioning scans, which in a typical implementation look something like this:
while l < r and element(l) <= pivot
    l = l + 1
wend
while l < r and pivot <= element(r)     'technically we don't need to check l<r here
    r = r - 1                           'in most implementations, as the pivot element itself
wend                                    'will have been swapped with the leftmost element in the
                                        'current partition
'if we get here we may have two records to swap
Both scans skip straight past elements that are equal to the pivot, so when most of the values are the same, one scan runs nearly the whole length of the partition and the split is hopelessly lopsided. It shouldn't work that way. What we should be doing, if we want better performance when there are lots of duplicated values, is swapping more aggressively: whenever we find an element equal to the pivot, or belonging on the other side, we swap it. Yes, we're deliberately swapping when we don't have to. This idea is from R. M. Sedgewick, and I'm glad he wrote it in to Donald Knuth! It's counterintuitive, but it turns out that the extra swaps matter less than the fact that swapping aggressively keeps the left and right subpartitions about the same size whenever most of the values we find are equal to the pivot. The modified scans look something like the sketch below. With that change in place, running the same tests gets us these results:
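Here's a minimal sketch of the aggressive-swapping scans, in the same style as the fragment above (this is my paraphrase of the idea, not code lifted from the original listing):

while l < r and element(l) < pivot      'strict <, so the scan now STOPS on elements
    l = l + 1                           'equal to the pivot instead of skipping past them
wend
while l < r and pivot < element(r)
    r = r - 1
wend
'if l < r, swap element(l) and element(r) even when both equal the pivot;
'equal values end up spread across both sides, so the two subpartitions
'stay about the same size.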
Duplicate Values Part 2

[Chart]
As you can see, this version works much better for arrays with many duplicates. I also put in two tests against ascending and randomly ordered arrays (to show it works okay for data without all those duplicates, too).
Sub rekeySortOrder(ByRef d As Variant, ByRef a As Variant)
    ' a = array of indices resulting from an unstable sort pass,
    '     containing one copy of each of the values between lbound(a)
    '     and ubound(a).
    ' d = the column input array supplied to the sorting pass.
    '
    Dim class       'indicates the equivalence class for each element
    Dim classCount  'count of elements in each equivalence class
    Dim thisClass   'the current equivalence class number
    Dim i           'index into array of indices
    Dim j           'index into array of equivalence counts
    ReDim class(LBound(a) To UBound(a))
    ReDim classCount(0 To UBound(a) - LBound(a))
    class(a(LBound(a))) = 0     'the first element in sorted order starts class 0
    classCount(0) = 1
    thisClass = 0
    For i = LBound(a) + 1 To UBound(a)
        If d(a(i - 1)) < d(a(i)) Then
            thisClass = thisClass + 1
        End If
        class(a(i)) = thisClass
        classCount(thisClass) = classCount(thisClass) + 1
    Next
    '
    'We now know which equivalence class each element belongs in,
    'and we know how many elements there were in each class.
    'So now we just need to know where the rows in each equivalence
    'class will start in the output array, and we can re-order.
    '
    Dim offset
    Dim classOffset
    ReDim classOffset(0 To thisClass)
    offset = LBound(a)
    For j = 0 To thisClass
        classOffset(j) = offset
        offset = offset + classCount(j)
    Next
    '
    'At this point we know (a) where each equivalence class should
    'start in the output array (classOffset gives us this),
    'and (b) which class each element is in (class gives us this).
    'So, we can iterate through the class array in original-index order,
    'writing each index into the next free slot of its equivalence
    'class - which is what makes the result stable.
    '
    For i = LBound(a) To UBound(a)
        j = class(i)
        a(classOffset(j)) = i
        classOffset(j) = classOffset(j) + 1
    Next
End Sub
So far as I know, nobody has published an algorithm for making an unstable sort stable that executes in time proportional to the number of records, as this one does, although list insertion and (particularly) distribution sorts hint at the technique.
With this little beast in your arsenal, you can use an unstable sort where a stable sort is required, and then run the above code against the resulting sort order.
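A sketch of how that fits together (StableSortDemo and UnstableSortByColumn are names I've made up here; the latter stands in for any unstable index sort with the same argument convention as the sorts above):

Sub StableSortDemo(d As Variant, a As Variant)
    'd = the column to sort on; a = an index array covering LBound(a)..UBound(a)
    UnstableSortByColumn d, LBound(a), UBound(a), a   'any unstable sort of the index array
    rekeySortOrder d, a                               'ties go back into ascending row order,
                                                      'which is what a stable sort would give
End Sub

The listing below takes a different route: it's the stable three-way Quicksort discussed in the next section, which splits each range into less-than, equal and greater-than groups in a single pass.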
Const minSubFile As Long = 9
Private initialized As Boolean  'True once the scratch arrays below have been allocated
Private partition2 As Variant   'Indices of elements equal to the partition value (in the order found)
Private partition3 As Variant   'Indices of elements greater than the partition value (in the order found)

Sub ISortAlgorithm_Sort(d As Variant, ByVal l As Long, ByVal r As Long, a As Variant)
    If Not initialized Then
        ReDim partition2(l To r) As Long
        ReDim partition3(l To r) As Long
        initialized = True
    End If
    CheckBoundsEnough 0, r - l, partition2  'helper (not shown here) that makes sure the scratch array covers 0 to r - l
    CheckBoundsEnough 0, r - l, partition3
    If l + minSubFile <= r Then Quicksort d, l, r, a    'Don't bother quicksorting small arrays
    InsertionSort d, l, r, a                            'Straight insertion sort finishes the job
End Sub
Private Sub InsertionSort(d As Variant, ByVal l As Long, ByVal r As Long, a As Variant)
    Dim j As Long       'index when searching for inversions
    Dim i As Long       'index when correcting inversions
    Dim v As Variant    'value of element on the move
    Dim oldI As Long    'index of element being moved
    For j = l + 1 To r
        If d(a(j)) < d(a(j - 1)) Then
            i = j - 1
            oldI = a(j)
            v = d(oldI)
            Do
                a(i + 1) = a(i)
                i = i - 1
                If i < l Then Exit Do   'because we don't have a magic low value at the bottom
            Loop Until d(a(i)) <= v
            a(i + 1) = oldI
        End If
    Next
End Sub
Private Sub Quicksort(d As Variant, ByVal l As Long, ByVal r As Long, a As Variant)
    Dim n As Long           'Number of elements in area to sort
    Dim m As Long           'Used to calculate stack size required
    Dim p As Long           'index of index of partition element
    Dim v As Variant        'partition value
    Dim i As Long           'Read point in initial index array
    Dim x As Long           'Original index at that point
    Dim vv As Variant       'Value of array element being compared to partition
    Dim p2Count As Long     'Number of elements found so far that belong in the centre
    Dim p3Count As Long     'Number of elements found so far that belong at the right
    Dim w As Long           'Write point (when writing back new order for the partition)
    Dim lStack() As Long    'Stack of left boundaries
    Dim rStack() As Long    'Stack of right boundaries
    Dim stackSize As Long   'Initially, needed stack size. Later, current stack size.
    Dim cl As Long          'Destination of first element matching current partition value
    Dim cr As Long          'Destination of last element matching current partition value
    'cl and cr are used to determine the next partitions to sort.
    n = r - l + 1
    m = 1
    While m < n * 2                     'work out the largest stack we could possibly need (about log2(2n) entries)
        m = m * 2
        stackSize = stackSize + 1
    Wend
    ReDim lStack(stackSize)             'We use a stack so we can avoid the overhead of
    ReDim rStack(stackSize)             'recursive procedure calls. We only need to stack l and r!
    stackSize = -1                      'Nothing in the stack
    Do
        p = (l + r) \ 2                 'pick partition element (could be random, doesn't really matter)
        v = d(a(p))                     'remember partitioning value
        p2Count = 0
        p3Count = 0
        w = l
        For i = l To r                  'one stable pass: smaller elements slide left in place,
            x = a(i)                    'equal and greater elements are copied out in the order found
            vv = d(x)
            If vv < v Then
                a(w) = x
                w = w + 1
            ElseIf v < vv Then
                partition3(p3Count) = x
                p3Count = p3Count + 1
            Else
                partition2(p2Count) = x
                p2Count = p2Count + 1
            End If
        Next
        cl = w                          'first point in centre
        For i = 0 To p2Count - 1
            a(w) = partition2(i)
            w = w + 1
        Next
        cr = w - 1                      'last point in centre
        For i = 0 To p3Count - 1
            a(w) = partition3(i)
            w = w + 1
        Next
        If cl - l < r - cr Then         'stack the larger side, keep working on the smaller
            If minSubFile < r - cr Then
                stackSize = stackSize + 1
                lStack(stackSize) = cr + 1
                rStack(stackSize) = r
            End If
            r = cl - 1
        Else
            If minSubFile < cl - l Then
                stackSize = stackSize + 1
                lStack(stackSize) = l
                rStack(stackSize) = cl - 1
            End If
            l = cr + 1
        End If
        If r - l < minSubFile Then      'current piece is small enough: fetch the next one, if any
            If stackSize = -1 Then Exit Do
            l = lStack(stackSize)
            r = rStack(stackSize)
            stackSize = stackSize - 1
        End If
    Loop
End Sub
Stable 3-Way Quicksort

[Chart]
This algorithm performs very well when there aren't many different values in the column being sorted; the fewer there are, the better it does. However, the modified version from the last graph handles ordered unique values a lot better, and random unique values a little better, so this one is probably only worth using when you can be certain there will be a lot of duplicates in the data. If you're sure there won't be, use the standard one.