New portion of improvements for Dual-Pivot Quicksort

Wed May 5 17:23:37 UTC 2010

Hi Dmytro,

Thank you very much for suggestions! I checked it and here
is my results:

If we consider case when the whole array is sorted,
we can omit setting sentinel and restoring original value:

//  int before = a[left - 1]; a[left - 1] = Integer.MIN_VALUE;

     for (int j, i = left + 1; i <= right; i++) {
         int ai = a[i];
         for (j = i - 1; /* j >= left && */ ai < a[j]; j--) {
             a[j + 1] = a[j];
         }
         a[j + 1] = ai;
     }
//  a[left - 1] = before;

I checked this version and found that it works little bit faster
(only 0.4-0.3%) with client VM, but loses more than 2.5% under
server mode in compare with one-pivot version from JDK 6:

                      client   server
            original: 60.75%   48.20%
no setting sentinel: 60.40%   50.88%

If we add additional check which case is sorted, I think, we
don't win at all.

And if pivot candidates are not put back to array, it shows
very poor results:

                     client   server
           original: 60.75%   48.20%
no back candidates: 66.80%   53.95%

I run one pivot version taken from JDK 6: I replace in Dual-Pivot
implementation with all optimizations dual-pivot partitioning by one-pivot:
but it remain still slower than DPQ. So, the main point of DPQ is not only
in optimizations, but in dividing array into 3 parts instead of two.

Thank you,
Vladimir

Dmytro Sheyko wrote:
> Hi Vladimir,
> 
> The trick with sentinel is quite cute. Going farther, I think it is not 
> always necessary to replace left outer element with the least possible value
> (and restore it later after sorting) because after partitioning every 
> element in adjoining array part can play the role of sentinel.
> The only exception is when the user requested to sort array partially 
> (not from the beginning). Thereby we should care about setting sentinel 
> explicitly in this exceptional case only and only once before sorting whole array.
> 
> Also it seems to me that it is not necessary to put pivot candidates (5 
> elements that are used to choose pivots) back to array
> because anyway they are to be sorted later and likely change their 
> positions.
> 
> I am also interesting in theoretical rationale why dual pivot quicksort 
> is better than single pivot one. The last document that I have seen 
> (Last updated: September 22, 2009)
> compared classic quicksort (where pivot is chosen arbitrarily) with 
> "classic" dual pivot quicksort (where pivots are also chosen arbitrarily).
> As conclusion they both perform about 2*n*ln(n) key comparisons. However 
> jdk had improved quicksort: median-of-three and pseudomedian-of-nine
> approaches were used. And median-of-three approach lead to 12/7*n*ln(n) 
> key comparisons. On the other hand, dual pivot quicksort is also improved:
> pivots are chosen from 5 candidates, and hence it must perform less than 
> 2*n*ln(n) key comparisons.
> 
> Regards,
> Dmytro Sheyko
> 
>  > Date: Tue, 27 Apr 2010 01:50:08 +0400
>  > Subject: New portion of improvements for Dual-Pivot Quicksort
>  > From: vladimir.iaroslavski at googlemail.com
>  > To: core-libs-dev at openjdk.java.net
>  >
>  > Hello, everyone!
>  >
>  > I've investigated the implementation of the Dual-Pivot Quicksort
>  > which is used for sorting primitives and here is my result:
>  >
>  > http://cr.openjdk.java.net/~alanb/6947216/webrev.00
>  >
>  > New implementation of Dual-Pivot Quicksort is faster
>  > than previous one of 12% for client VM and few percents for
>  > server VM. Tests on Bentley's test suite (Win XP, JDK 7,
>  > build 1.7.0-ea-b84, n = 1000000) show geometry mean 0.88
>  > for client VM and 0.98-1.00 for server VM.
>  >
>  > In compare with sorting from JDK 6 by Jon L. Bentley and
>  > M. Douglas McIlroy's (with one pivot) the ratio is 0.61 / 0.50
>  > (client / server).
>  >
>  > See the execution time for sorting array of 2`000`000 int elements
>  > 50 times, client / server VM, in milliseconds:
>  >
>  > random
>  > new: 16723 18776
>  > jdk7: 17571 18975
>  > jdk6: 22241 26524
>  >
>  > ascendant
>  > new: 3541 4702
>  > jdk7: 4486 4669
>  > jdk6: 8667 7978
>  >
>  > descendant
>  > new: 3854 4907
>  > jdk7: 4942 5034
>  > jdk6: 8787 8243
>  >
>  > equal
>  > new: 234 281
>  > jdk7: 291 230
>  > jdk6: 602 1018
>  >
>  > organ pipes
>  > new: 7673 8613
>  > jdk7: 8167 8993
>  > jdk6: 11902 14044
>  >
>  > stagger 1
>  > new: 7648 8591
>  > jdk7: 8161 8979
>  > jdk6: 11908 13810
>  >
>  > stagger 2
>  > new: 8349 9299
>  > jdk7: 10968 11916
>  > jdk6: 12194 14734
>  >
>  > stagger 4
>  > new: 8475 9622
>  > jdk7: 9221 9682
>  > jdk6: 10523 12006
>  >
>  > stagger 8
>  > new: 9321 10689
>  > jdk7: 11125 12387
>  > jdk6: 13829 16214
>  >
>  > period 1..2
>  > new: 758 751
>  > jdk7: 870 754
>  > jdk6: 1038 1227
>  >
>  > period 1..4
>  > new: 1004 963
>  > jdk7: 1365 1209
>  > jdk6: 1511 1847
>  >
>  > period 1..8
>  > new: 1588 1573
>  > jdk7: 1599 1790
>  > jdk6: 2602 3045
>  >
>  > random 1..2
>  > new: 1327 1125
>  > jdk7: 1362 1496
>  > jdk6: 1531 2182
>  >
>  > random 1..4
>  > new: 1830 2118
>  > jdk7: 1851 2236
>  > jdk6: 2292 3025
>  >
>  > where stagger(m) is array like a[i] = i * (m + 1) % length.
>  >
>  > The summary of changes is:
>  >
>  > 1. For sorting small arrays is used insertion sort with sentinel
>  > instead of traditional, which has the structure:
>  >
>  > for (int i = left + 1; i <= right; i++) {
>  > for (j = i; j > left && a[j-1] > a[j]; j--) {
>  > swap(a[i], a[j-1]);
>  > }
>  > }
>  >
>  > Note that range check j > left is performed on each iteration,
>  > but really helps very rare. To avoid this expensive range check,
>  > it was suggested to set minimum value (negative infinity) on the
>  > first position. This type of suggestion is used in new version:
>  >
>  > if left bound > 0, we can put sentinel on a[left - 1], do insertion
>  > sort without expensive check range, and then restore a[left - 1]
>  > value. If left == 0, traditional insertion sort is used. Please,
>  > look at the webrev for details.
>  >
>  > 2. In previous implementation 5 evenly spaced elements
>  >
>  > sixth = length / 6;
>  > a[sixth], a[2 * sixth], a[3 * sixth], a[4 * sixth], a[5 * sixth]
>  >
>  > were used as candidates of pivots elements. This case is very
>  > sensitive for period inputs, especially by 6. The new suggestion
>  > is to take 5 center evenly spaced elements like this:
>  >
>  > int seventh = length / 7;
>  > int midpoint = (left + right) >>> 1;
>  >
>  > a[midpoint - 2 * seventh], a[midpoint - seventh], a[midpoint],
>  > a[midpoint + seventh], a[midpoint + 2 * seventh]
>  >
>  > and moreover, the seventh is calculated inexpensively:
>  > seventh = (length >>> 3) + (length >>> 6) + 1;
>  >
>  > This schema works the same on random, ascendant, descendant, equal
>  > inputs, but much better for period / stagger.
>  >
>  > 3. The whole structure
>  >
>  > <choose pivots>
>  >
>  > if (pivotsDiffer) {
>  > <do partitioning for two pivots>
>  > }
>  > else {
>  > <do partitioning for one pivot>
>  > }
>  >
>  > <sort left and right parts>
>  >
>  > if (!pivotsDiffer) {
>  > return;
>  > }
>  > <swap internal pivot values to ends>
>  >
>  > <sort center part>
>  >
>  > was modified to:
>  > ----------------
>  >
>  > <choose pivots>
>  >
>  > if (pivot1 < pivot2) {
>  > <do partitioning for two pivots>
>  > <swap pivots into their final positions>
>  > <sort left and right parts>
>  > <swap internal pivot values to ends>
>  > <sort center part>
>  > }
>  > else {
>  > <do partitioning for one pivot>
>  > <sort left and right parts>
>  > }
>  >
>  > 4. Partitioning for both cases have not been changed at all.
>  >
>  > 5. Minor javadoc and format changes.
>  >
>  > Please, review new implementation,
>  > any comments / suggestions are welcome!
>  >
>  > Thank you,
>  > Vladimir