New portion of improvements for Dual-Pivot Quicksort

Wed Apr 28 07:38:53 UTC 2010

Vladimir,

Cool!  I will look at the changes tomorrow (Wednesday).  Have you told Jon
Bentely?

    Josh

On Mon, Apr 26, 2010 at 2:50 PM, Vladimir Iaroslavski <
vladimir.iaroslavski at googlemail.com> wrote:

> Hello, everyone!
>
> I've investigated the implementation of the Dual-Pivot Quicksort
> which is used for sorting primitives and here is my result:
>
> http://cr.openjdk.java.net/~alanb/6947216/webrev.00
>
> New implementation of Dual-Pivot Quicksort is faster
> than previous one of 12% for client VM and few percents for
> server VM. Tests on Bentley's test suite (Win XP, JDK 7,
> build 1.7.0-ea-b84, n = 1000000) show geometry mean 0.88
> for client VM and 0.98-1.00 for server VM.
>
> In compare with sorting from JDK 6 by Jon L. Bentley and
> M. Douglas McIlroy's (with one pivot) the ratio is 0.61 / 0.50
> (client / server).
>
> See the execution time for sorting array of 2`000`000 int elements
> 50 times, client / server VM, in milliseconds:
>
> random
>  new: 16723 18776
> jdk7: 17571 18975
> jdk6: 22241 26524
>
> ascendant
>  new:  3541  4702
> jdk7:  4486  4669
> jdk6:  8667  7978
>
> descendant
>  new:  3854  4907
> jdk7:  4942  5034
> jdk6:  8787  8243
>
> equal
>  new:   234   281
> jdk7:   291   230
> jdk6:   602  1018
>
> organ pipes
>  new:  7673  8613
> jdk7:  8167  8993
> jdk6: 11902 14044
>
> stagger 1
>  new:  7648  8591
> jdk7:  8161  8979
> jdk6: 11908 13810
>
> stagger 2
>  new:  8349  9299
> jdk7: 10968 11916
> jdk6: 12194 14734
>
> stagger 4
>  new:  8475  9622
> jdk7:  9221  9682
> jdk6: 10523 12006
>
> stagger 8
>  new:  9321 10689
> jdk7: 11125 12387
> jdk6: 13829 16214
>
> period 1..2
>  new:   758   751
> jdk7:   870   754
> jdk6:  1038  1227
>
> period 1..4
>  new:  1004   963
> jdk7:  1365  1209
> jdk6:  1511  1847
>
> period 1..8
>  new:  1588  1573
> jdk7:  1599  1790
> jdk6:  2602  3045
>
> random 1..2
>  new:  1327  1125
> jdk7:  1362  1496
> jdk6:  1531  2182
>
> random 1..4
>  new:  1830  2118
> jdk7:  1851  2236
> jdk6:  2292  3025
>
> where stagger(m) is array like a[i] = i * (m + 1) % length.
>
> The summary of changes is:
>
> 1. For sorting small arrays is used insertion sort with sentinel
> instead of traditional, which has the structure:
>
> for (int i = left + 1; i <= right; i++) {
>    for (j = i; j > left && a[j-1] > a[j]; j--) {
>        swap(a[i], a[j-1]);
>    }
> }
>
> Note that range check j > left is performed on each iteration,
> but really helps very rare. To avoid this expensive range check,
> it was suggested to set minimum value (negative infinity) on the
> first position. This type of suggestion is used in new version:
>
> if left bound > 0, we can put sentinel on a[left - 1], do insertion
> sort without expensive check range, and then restore a[left - 1]
> value. If left == 0, traditional insertion sort is used. Please,
> look at the webrev for details.
>
> 2. In previous implementation 5 evenly spaced elements
>
> sixth = length / 6;
> a[sixth], a[2 * sixth], a[3 * sixth], a[4 * sixth], a[5 * sixth]
>
> were used as candidates of pivots elements. This case is very
> sensitive for period inputs, especially by 6. The new suggestion
> is to take 5 center evenly spaced elements like this:
>
> int seventh = length / 7;
> int midpoint = (left + right) >>> 1;
>
> a[midpoint - 2 * seventh], a[midpoint -     seventh], a[midpoint],
> a[midpoint +     seventh], a[midpoint + 2 * seventh]
>
> and moreover, the seventh is calculated inexpensively:
> seventh = (length >>> 3) + (length >>> 6) + 1;
>
> This schema works the same on random, ascendant, descendant, equal
> inputs, but much better for period / stagger.
>
> 3. The whole structure
>
> <choose pivots>
>
> if (pivotsDiffer) {
>    <do partitioning for two pivots>
> }
> else {
>    <do partitioning for one pivot>
> }
>
> <sort left and right parts>
>
> if (!pivotsDiffer) {
>    return;
> }
> <swap internal pivot values to ends>
>
> <sort center part>
>
> was modified to:
> ----------------
>
> <choose pivots>
>
> if (pivot1 < pivot2) {
>    <do partitioning for two pivots>
>    <swap pivots into their final positions>
>    <sort left and right parts>
>    <swap internal pivot values to ends>
>    <sort center part>
> }
> else {
>    <do partitioning for one pivot>
>    <sort left and right parts>
> }
>
> 4. Partitioning for both cases have not been changed at all.
>
> 5. Minor javadoc and format changes.
>
> Please, review new implementation,
> any comments / suggestions are welcome!
>
> Thank you,
> Vladimir
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100428/23c0b3cc/attachment.html>