New portion of improvements for Dual-Pivot Quicksort

Dmytro Sheyko dmytro_sheyko at hotmail.com
Wed May 5 12:01:55 UTC 2010


Hi Vladimir,

The trick with sentinel is quite cute. Going farther, I think it is not always necessary to replace left outer element with the least possible value
(and restore it later after sorting) because after partitioning every element in adjoining array part can play the role of sentinel.
The only exception is when the user requested to sort array partially (not from the beginning). Thereby we should care about setting sentinel explicitly
in this exceptional case only and only once before sorting whole array.

Also it seems to me that it is not necessary to put pivot candidates (5 elements that are used to choose pivots) back to array
because anyway they are to be sorted later and likely change their positions.

I am also interesting in theoretical rationale why dual pivot quicksort is better than single pivot one. The last document that I have seen (Last updated: September 22, 2009)
compared classic quicksort (where pivot is chosen arbitrarily) with "classic" dual pivot quicksort (where pivots are also chosen arbitrarily).
As conclusion they both perform about 2*n*ln(n) key comparisons. However jdk had improved quicksort: median-of-three and pseudomedian-of-nine
approaches were used. And median-of-three approach lead to 12/7*n*ln(n) key comparisons. On the other hand, dual pivot quicksort is also improved:
pivots are chosen from 5 candidates, and hence it must perform less than 2*n*ln(n) key comparisons.

Regards,
Dmytro Sheyko

> Date: Tue, 27 Apr 2010 01:50:08 +0400
> Subject: New portion of improvements for Dual-Pivot Quicksort
> From: vladimir.iaroslavski at googlemail.com
> To: core-libs-dev at openjdk.java.net
> 
> Hello, everyone!
> 
> I've investigated the implementation of the Dual-Pivot Quicksort
> which is used for sorting primitives and here is my result:
> 
> http://cr.openjdk.java.net/~alanb/6947216/webrev.00
> 
> New implementation of Dual-Pivot Quicksort is faster
> than previous one of 12% for client VM and few percents for
> server VM. Tests on Bentley's test suite (Win XP, JDK 7,
> build 1.7.0-ea-b84, n = 1000000) show geometry mean 0.88
> for client VM and 0.98-1.00 for server VM.
> 
> In compare with sorting from JDK 6 by Jon L. Bentley and
> M. Douglas McIlroy's (with one pivot) the ratio is 0.61 / 0.50
> (client / server).
> 
> See the execution time for sorting array of 2`000`000 int elements
> 50 times, client / server VM, in milliseconds:
> 
> random
>  new: 16723 18776
> jdk7: 17571 18975
> jdk6: 22241 26524
> 
> ascendant
>  new:  3541  4702
> jdk7:  4486  4669
> jdk6:  8667  7978
> 
> descendant
>  new:  3854  4907
> jdk7:  4942  5034
> jdk6:  8787  8243
> 
> equal
>  new:   234   281
> jdk7:   291   230
> jdk6:   602  1018
> 
> organ pipes
>  new:  7673  8613
> jdk7:  8167  8993
> jdk6: 11902 14044
> 
> stagger 1
>  new:  7648  8591
> jdk7:  8161  8979
> jdk6: 11908 13810
> 
> stagger 2
>  new:  8349  9299
> jdk7: 10968 11916
> jdk6: 12194 14734
> 
> stagger 4
>  new:  8475  9622
> jdk7:  9221  9682
> jdk6: 10523 12006
> 
> stagger 8
>  new:  9321 10689
> jdk7: 11125 12387
> jdk6: 13829 16214
> 
> period 1..2
>  new:   758   751
> jdk7:   870   754
> jdk6:  1038  1227
> 
> period 1..4
>  new:  1004   963
> jdk7:  1365  1209
> jdk6:  1511  1847
> 
> period 1..8
>  new:  1588  1573
> jdk7:  1599  1790
> jdk6:  2602  3045
> 
> random 1..2
>  new:  1327  1125
> jdk7:  1362  1496
> jdk6:  1531  2182
> 
> random 1..4
>  new:  1830  2118
> jdk7:  1851  2236
> jdk6:  2292  3025
> 
> where stagger(m) is array like a[i] = i * (m + 1) % length.
> 
> The summary of changes is:
> 
> 1. For sorting small arrays is used insertion sort with sentinel
> instead of traditional, which has the structure:
> 
> for (int i = left + 1; i <= right; i++) {
>     for (j = i; j > left && a[j-1] > a[j]; j--) {
>         swap(a[i], a[j-1]);
>     }
> }
> 
> Note that range check j > left is performed on each iteration,
> but really helps very rare. To avoid this expensive range check,
> it was suggested to set minimum value (negative infinity) on the
> first position. This type of suggestion is used in new version:
> 
> if left bound > 0, we can put sentinel on a[left - 1], do insertion
> sort without expensive check range, and then restore a[left - 1]
> value. If left == 0, traditional insertion sort is used. Please,
> look at the webrev for details.
> 
> 2. In previous implementation 5 evenly spaced elements
> 
> sixth = length / 6;
> a[sixth], a[2 * sixth], a[3 * sixth], a[4 * sixth], a[5 * sixth]
> 
> were used as candidates of pivots elements. This case is very
> sensitive for period inputs, especially by 6. The new suggestion
> is to take 5 center evenly spaced elements like this:
> 
> int seventh = length / 7;
> int midpoint = (left + right) >>> 1;
> 
> a[midpoint - 2 * seventh], a[midpoint -     seventh], a[midpoint],
> a[midpoint +     seventh], a[midpoint + 2 * seventh]
> 
> and moreover, the seventh is calculated inexpensively:
> seventh = (length >>> 3) + (length >>> 6) + 1;
> 
> This schema works the same on random, ascendant, descendant, equal
> inputs, but much better for period / stagger.
> 
> 3. The whole structure
> 
> <choose pivots>
> 
> if (pivotsDiffer) {
>     <do partitioning for two pivots>
> }
> else {
>     <do partitioning for one pivot>
> }
> 
> <sort left and right parts>
> 
> if (!pivotsDiffer) {
>     return;
> }
> <swap internal pivot values to ends>
> 
> <sort center part>
> 
> was modified to:
> ----------------
> 
> <choose pivots>
> 
> if (pivot1 < pivot2) {
>     <do partitioning for two pivots>
>     <swap pivots into their final positions>
>     <sort left and right parts>
>     <swap internal pivot values to ends>
>     <sort center part>
> }
> else {
>     <do partitioning for one pivot>
>     <sort left and right parts>
> }
> 
> 4. Partitioning for both cases have not been changed at all.
> 
> 5. Minor javadoc and format changes.
> 
> Please, review new implementation,
> any comments / suggestions are welcome!
> 
> Thank you,
> Vladimir
 		 	   		  
_________________________________________________________________
Hotmail: Trusted email with Microsoft’s powerful SPAM protection.
https://signup.live.com/signup.aspx?id=60969
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100505/04674865/attachment.html>


More information about the core-libs-dev mailing list