New portion of improvements for Dual-Pivot Quicksort
Joshua Bloch
jjb at google.com
Wed May 5 04:57:42 UTC 2010
Vladimir,
Hi. I'm thrilled that you were able to eke so much more perforance out of
this already optimized code. I read your changes on my flight to Pittsburgh.
Here's the code review:
I think the comment on lines 102-105 was too useful to delete:
102 ... This
103 * method differs from the public {@code sort} method in that the
104 * {@code right} index is inclusive, and it does no range checking
105 * on {@code left} or {@code right}.
The sentinel technique that you use in lines 118 - 136 is questionable: you
are modifying a portion of the array outside the specified range of the
call, which arguably violates the contract of the call, and could be
observed in a multithreaded program. It's not beyond the realm of reason
that it could break existing clients. I will discuss it with Doug Lea and
let you know what he says.
If we leave this optimization in, we should change the comments slightly.
Appearing to comment-out code (other than entire assertions in inner loops)
is never a good thing, hence the change to line 130 and the addition of the
line that follows. Also I reworded the commennt in lines 122-125 for
clarity. Changed lines are starred, added line has a "+":
118 // Use insertion sort on tiny arrays
119 if (length < INSERTION_SORT_THRESHOLD) {
120 if (left > 0) {
121 /*
*122 * Temporarily set the value immediately preceding the
range
*123 * to the minimum int value to serve as a sentinel.
This
*124 * allows us to avoid the j >= left check on each
iteration.
125 */
126 int before = a[left - 1]; a[left - 1] =
Integer.MIN_VALUE;
127
128 for (int j, i = left + 1; i <= right; i++) {
129 int ai = a[i];
*130 for (j = i - 1; ai < a[j]; j--) {
+NEW // assert j >= left;
131 a[j + 1] = a[j];
132 }
133 a[j + 1] = ai;
134 }
135 a[left - 1] = before;
136 } else {
The comment in line 155 is misleading, and should be replace by this:
I'd reword the comment in lines 137-140 for clarity:
137 /*
138 * For case, when left == 0, traditional (without a
sentinel)
139 * insertion sort, optimized for server VM, is used.
140 */
155 // Inexpensive approximation of length / 7
The comment strategy for choosing sample points is subtle enough that it
deserves bit more commentary. I propose replacing line 158 with this:
/*
* Sort five evenly spaced elements around (and including) the center
* element in the range. These elements will be used for pivot selection
* as described below. The choice for spacing these elements was
empirically
* determined to work well on a wide variety of inputs.
*/
Lines 183 and 184 are unnecessary array accesses. Probably better is:
183 int pivot1 = ae2;
184 int pivot2 = ae4;
This is essentially just a renaming of these two local variables, and
presumably the compiler will act accordingly.
I prefer the original wording:
197 * Pointer k is the first index of ?-part
to the revised wording:
217 * Pointer k is the first index of not inspected ?-part.
I'd revert this change.
I'd change 190 from:
190 if (pivot1 < pivot2) {
to
190 if (pivot1 != pivot2) {
It's clearer (this is the "pivots differ" case), and at least as fast.
The spacing lines 249 and 250 is a bit quirky:
249 dualPivotQuicksort(a, left, less - 2);
250 dualPivotQuicksort(a, great + 2, right);
I'd replace it with the more standard:
249 dualPivotQuicksort(a, left, less - 2);
250 dualPivotQuicksort(a, great + 2, right);
Alternatively, you could make corresponding arguments line up:
249 dualPivotQuicksort(a, left, less - 2);
250 dualPivotQuicksort(a, great + 2, right );
The policy change in line 256 is non-trivial:
Old:
298 if (less < e1 && great > e5) {
New:
256 if (great - less > 5 * seventh) {
I previously experimented with the "new-style" policy (length-based, rather
than endpoint-based), but didn't get good results. Perhaps the symmetric
style combines well with the change in interval size (5/7 vs. 2/3). At any
rate, it's simpler than the old-style and conforms to the comment, so if
you're satisfied with the performance, I'm happy with it.
In lines 362 and 363, you have the same "quirky spacing" as in linkes 249
and 250:
361 // Sort left and right parts recursively
362 dualPivotQuicksort(a, left, less - 1);
363 dualPivotQuicksort(a, great + 1, right);
Regards,
Josh
On Thu, Apr 29, 2010 at 4:05 AM, Vladimir Iaroslavski <
vladimir.iaroslavski at googlemail.com> wrote:
> Hello,
>
> Thank you for pointing to this check, I have run test again and it
> works little it faster
> (this check was necessary for old version of insertion sort):
>
> 60.90% / 48.35% instead of 61.00% / 51.00% (client / server).
>
> I attached new version of the class (+ minor javdoc changes).
>
> Best regards,
> Vladimir
>
> On Thu, Apr 29, 2010 at 12:18 PM, Goktug Gokdogan <gokdogan at gmail.com>
> wrote:
> > One quick comment: you can consider moving trivial array check so it will
> be
> > executed only if insertion sort threshold check passes:
> >
> > if (length < INSERTION_SORT_THRESHOLD) {
> > if (length < 2) {
> > return;
> > }
> > ....
> >
> > On Mon, Apr 26, 2010 at 2:50 PM, Vladimir Iaroslavski
> > <vladimir.iaroslavski at googlemail.com> wrote:
> >>
> >> Hello, everyone!
> >>
> >> I've investigated the implementation of the Dual-Pivot Quicksort
> >> which is used for sorting primitives and here is my result:
> >>
> >> http://cr.openjdk.java.net/~alanb/6947216/webrev.00
> >>
> >> New implementation of Dual-Pivot Quicksort is faster
> >> than previous one of 12% for client VM and few percents for
> >> server VM. Tests on Bentley's test suite (Win XP, JDK 7,
> >> build 1.7.0-ea-b84, n = 1000000) show geometry mean 0.88
> >> for client VM and 0.98-1.00 for server VM.
> >>
> >> In compare with sorting from JDK 6 by Jon L. Bentley and
> >> M. Douglas McIlroy's (with one pivot) the ratio is 0.61 / 0.50
> >> (client / server).
> >>
> >> See the execution time for sorting array of 2`000`000 int elements
> >> 50 times, client / server VM, in milliseconds:
> >>
> >> random
> >> new: 16723 18776
> >> jdk7: 17571 18975
> >> jdk6: 22241 26524
> >>
> >> ascendant
> >> new: 3541 4702
> >> jdk7: 4486 4669
> >> jdk6: 8667 7978
> >>
> >> descendant
> >> new: 3854 4907
> >> jdk7: 4942 5034
> >> jdk6: 8787 8243
> >>
> >> equal
> >> new: 234 281
> >> jdk7: 291 230
> >> jdk6: 602 1018
> >>
> >> organ pipes
> >> new: 7673 8613
> >> jdk7: 8167 8993
> >> jdk6: 11902 14044
> >>
> >> stagger 1
> >> new: 7648 8591
> >> jdk7: 8161 8979
> >> jdk6: 11908 13810
> >>
> >> stagger 2
> >> new: 8349 9299
> >> jdk7: 10968 11916
> >> jdk6: 12194 14734
> >>
> >> stagger 4
> >> new: 8475 9622
> >> jdk7: 9221 9682
> >> jdk6: 10523 12006
> >>
> >> stagger 8
> >> new: 9321 10689
> >> jdk7: 11125 12387
> >> jdk6: 13829 16214
> >>
> >> period 1..2
> >> new: 758 751
> >> jdk7: 870 754
> >> jdk6: 1038 1227
> >>
> >> period 1..4
> >> new: 1004 963
> >> jdk7: 1365 1209
> >> jdk6: 1511 1847
> >>
> >> period 1..8
> >> new: 1588 1573
> >> jdk7: 1599 1790
> >> jdk6: 2602 3045
> >>
> >> random 1..2
> >> new: 1327 1125
> >> jdk7: 1362 1496
> >> jdk6: 1531 2182
> >>
> >> random 1..4
> >> new: 1830 2118
> >> jdk7: 1851 2236
> >> jdk6: 2292 3025
> >>
> >> where stagger(m) is array like a[i] = i * (m + 1) % length.
> >>
> >> The summary of changes is:
> >>
> >> 1. For sorting small arrays is used insertion sort with sentinel
> >> instead of traditional, which has the structure:
> >>
> >> for (int i = left + 1; i <= right; i++) {
> >> for (j = i; j > left && a[j-1] > a[j]; j--) {
> >> swap(a[i], a[j-1]);
> >> }
> >> }
> >>
> >> Note that range check j > left is performed on each iteration,
> >> but really helps very rare. To avoid this expensive range check,
> >> it was suggested to set minimum value (negative infinity) on the
> >> first position. This type of suggestion is used in new version:
> >>
> >> if left bound > 0, we can put sentinel on a[left - 1], do insertion
> >> sort without expensive check range, and then restore a[left - 1]
> >> value. If left == 0, traditional insertion sort is used. Please,
> >> look at the webrev for details.
> >>
> >> 2. In previous implementation 5 evenly spaced elements
> >>
> >> sixth = length / 6;
> >> a[sixth], a[2 * sixth], a[3 * sixth], a[4 * sixth], a[5 * sixth]
> >>
> >> were used as candidates of pivots elements. This case is very
> >> sensitive for period inputs, especially by 6. The new suggestion
> >> is to take 5 center evenly spaced elements like this:
> >>
> >> int seventh = length / 7;
> >> int midpoint = (left + right) >>> 1;
> >>
> >> a[midpoint - 2 * seventh], a[midpoint - seventh], a[midpoint],
> >> a[midpoint + seventh], a[midpoint + 2 * seventh]
> >>
> >> and moreover, the seventh is calculated inexpensively:
> >> seventh = (length >>> 3) + (length >>> 6) + 1;
> >>
> >> This schema works the same on random, ascendant, descendant, equal
> >> inputs, but much better for period / stagger.
> >>
> >> 3. The whole structure
> >>
> >> <choose pivots>
> >>
> >> if (pivotsDiffer) {
> >> <do partitioning for two pivots>
> >> }
> >> else {
> >> <do partitioning for one pivot>
> >> }
> >>
> >> <sort left and right parts>
> >>
> >> if (!pivotsDiffer) {
> >> return;
> >> }
> >> <swap internal pivot values to ends>
> >>
> >> <sort center part>
> >>
> >> was modified to:
> >> ----------------
> >>
> >> <choose pivots>
> >>
> >> if (pivot1 < pivot2) {
> >> <do partitioning for two pivots>
> >> <swap pivots into their final positions>
> >> <sort left and right parts>
> >> <swap internal pivot values to ends>
> >> <sort center part>
> >> }
> >> else {
> >> <do partitioning for one pivot>
> >> <sort left and right parts>
> >> }
> >>
> >> 4. Partitioning for both cases have not been changed at all.
> >>
> >> 5. Minor javadoc and format changes.
> >>
> >> Please, review new implementation,
> >> any comments / suggestions are welcome!
> >>
> >> Thank you,
> >> Vladimir
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100504/5858010a/attachment.html>
More information about the core-libs-dev
mailing list