Replacement of Quicksort in java.util.Arrays with Dual-Pivot Quicksort: improvements
Vladimir Yaroslavskiy
Vladimir.Yaroslavskiy at Sun.COM
Mon Sep 14 15:07:30 UTC 2009
Hi Team,
Thank you very much for your feedback and comments!
I collect here all hints and tips for improvement of
the new algorithm. New version of Quicksort is attached.
And now my investigations:
1. [Jon Bentley]
> Use the median of 2k+1 elements for pivots.
> We can choose a sample of five elements, sort them, and then
> partition around the 2nd and 4th sorted elements.
Yes, this helps, I can't say that we win a lot of time in general,
but it works really faster for input data like this 012012012...,
012301230..., 01234012340.... And more important thing: this
approach allows me to eliminate "div" parameter in the implementation.
As it was suggested, I take middle 5 elements with indexes: s, 2s, 3s,
4s, 5s (where s is length/6), sort them and take 2s and 4s elements
as pivots. Also I don't need to compare pivots because they are sorted.
2. [Jon Bentley]
> How about doing a test for pivot1 == pivot2, and then
> separately handling that special case? equal elements. Right now, the
> inner loop in the "// sorting" section compares every element both to
> pivot1 and pivot2.
Good catch! It reduces time of sorting equal elements (or almost equal)
from 347 to 228 and doesn't change time for other cases.
3. [Jon Bentley]
> Vladimir is currently using the first scheme that we tried.
> I wonder if it might be faster (or clearer) to try an invariant like
> | < | M | ? | M | > |
> where M denotes middle elements with the property
> Pivot1 <= M <= Pivot2
> Vladimir, do you think that you might be able to get a better invariant?
Sure, I tried other invariant like this:
| < | M | > | ? |
but it is slower than current.
4.1 [Oleg Anashkin]
> First thing that came to mind - have you thought about extrapolating this
> approach to more pivots? If 2-pivot algorithm is faster than 1-pivot, then
> 3-pivot might be even faster, right? Can the number of pivots be chosen as
> a function of array size (to mitigate overhead)?
and
4.2 [Leonid Geller]
> As an observation, why not expand the new algorithm to N-Pivot
> where N = round(ln(array length)).
> This should lower the average sort cost even lower.
Here is the experimental results for several pivots.
It's easy to define recurrence relation and write code
to count the comparisons and swap:
Classic Quicksort: comps - 52254956
swaps - 26277985
Dual-Pivot Quicksort: comps - 52246553
swaps - 20977937
Quicksort with 3 pivots: comps - 54399943
swaps - 24224254
Quicksort with 4 pivots: comps - 57241615
swaps - 28659728
You can see that "3 pivots" is better than Classic Quicksort for
swaps only, "4 pivots" is the looser, and "Dual-Pivot Quicksort"
is real winner. If we have many pivots we spent a lot of time on
sorting them and dividing into more parts doesn't save. In case
with 2 pivots, we use only 1 comparison and 1/2 swaps to sort it out.
5. [R?mi Forax]
> The algorithm use two constants: 37 and 13,
> I think they should be declared as private static final.
The constants are: 27 (not 37) and 13, and I made it as
private static final int. (but 27 is changed to 17, see below)
6. [R?mi Forax]
> In the loop tagged "sorting" and "equals element", a[k] can be
> stored in a local variable to avoid to recompute it several time.
Adding a local variable doesn't save time, but it was done in the
line of other improvement (see below). Thank you for pointing to it!
7. [Vladimir Yaroslavskiy]
I very often look at swap function and don't like it. I tried to
eliminate it and add these 3 lines into the code as inline, but
no effect. Then I looked from other side, we have a lot of code
such as:
for (int k=0; ...) {
if (a[k] < ...) {
swap(a, k, less++)
}
...
}
You can see that we retrieve a[k] many times, so I suggest doing it
like this without swap function:
for (int k=0; ...) {
int ak = a[k]
if (ak < ...) {
a[k] = a[less];
a[less++] = ak;
}
...
}
What we have achieve:
1. No swap function call
2. Two assignments instead of three
3. No local variable temp
This approach shows very nice results especially for server mode:
new: 100 / old: 126
8. [Jonathan Graehl]
> suggest this:
> int third = len / div;
> third=third>0?third:1;
Very elegant, but unfortunately I could not apply this,
because the schema of choosing pivot elements has been changed.
9. [Goktug Gokdogan]
> 1. Falling back-to insertion sort at < 17 instead of < 27.
> 2. (a[great] > pivot2) test is more likely to fail compared to
> (k < great) in the while loop, so exchange them (ie. use
> while (a[great] > pivot2 && k < great)).
I tried 17 as boundary for insertion sort: in some cases it is better,
in other no. But in average I don't see that time is increased.
I set up 17 in new version. Note that in my first version the value
was 17, but later it was changed to 25 -> 27.
while (a[great] > pivot2 && k < great)) - we win 3-4% ! Thank you!
------------------------------------------------------------------
I hope that I don't miss something what you suggested.
If you find missed items or have other comments/suggestions,
please, let me know. And the end I would like provide the
execution times for standard input data (all times should
be divided by 1000 to get seconds):
Client VM Server VM
random random
dpq: 16221 dpq: 19893
jdk: 20162 jdk: 24199
ascendant ascendant
dpq: 5686 dpq: 7952
jdk: 9414 jdk: 11748
descendant descendant
dpq: 5921 dpq: 7924
jdk: 9736 jdk: 11777
all equal all equal
dpq: 261 dpq: 500
jdk: 734 jdk: 1140
repeated 50 repeated 50
dpq: 3450 dpq: 4239
jdk: 4207 jdk: 5874
organ pipes organ pipes
dpq: 9364 dpq: 11451
jdk: 12858 jdk: 16534
random 01 random 01
dpq: 1474 dpq: 1423
jdk: 1747 jdk: 2591
010101 010101
dpq: 1092 dpq: 1052
jdk: 1140 jdk: 2076
012012 012012
dpq: 771 dpq: 1035
jdk: 1220 jdk: 1980
012301 012301
dpq: 1140 dpq: 1393
jdk: 1606 jdk: 2876
012340 012340
dpq: 1464 dpq: 1826
jdk: 1947 jdk: 3366
random 4 random 4
dpq: 1855 dpq: 2423
jdk: 2258 jdk: 3148
random 1000 random 1000
dpq: 8008 dpq: 9213
jdk: 8521 jdk: 10363
You can see that now Dual-Pivots Quicksort wins in all nominations
(in previous letter it looses in server mode in few cases).
Alan,
I'll send you updated Arrays.java little bit later.
Thank you,
Vladimir
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: DualPivotQuicksort753.java
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20090914/a1332afe/DualPivotQuicksort753.java>
More information about the core-libs-dev
mailing list