Re[2]: New portion of improvements for Dual-Pivot Quicksort
Vladimir Iaroslavski
iaroslavski at mail.ru
Wed May 5 19:03:22 UTC 2010
Josh,
Quick note on changing
int pivot1 = a[e2[;
int pivot2 = a[e4];
by
int pivot1 = ae2;
int pivot2 = ae4;
It is extremely surprised, but version with local variables eats
5% (!) of time for client and 2% for server mode (in compare
with one-pivot implementation from JDK 6), see:
client server
with a[e2], a[e4]: 60.75% 48.20%
with ae2, ae4: 65.80% 50.42%
I don't have idea why this simple change is so expensive.
Does anybody can explain it?
Vladimir
Tue, 4 May 2010 21:57:42 -0700 письмо от Joshua Bloch <jjb at google.com>:
>
> Vladimir,
>
> Hi. I'm thrilled that you were able to eke so much more perforance out of this already optimized code. I read your changes on my flight to Pittsburgh. Here's the code review:
>
> I think the comment on lines 102-105 was too useful to delete:
>
> 102 ... This
> 103 * method differs from the public {@code sort} method in that the
> 104 * {@code right} index is inclusive, and it does no range checking
> 105 * on {@code left} or {@code right}.
>
> The sentinel technique that you use in lines 118 - 136 is questionable: you are modifying a portion of the array outside the specified range of the call, which arguably violates the contract of the call, and could be observed in a multithreaded program. It's not beyond the realm of reason that it could break existing clients. I will discuss it with Doug Lea and let you know what he says.
>
> If we leave this optimization in, we should change the comments slightly. Appearing to comment-out code (other than entire assertions in inner loops) is never a good thing, hence the change to line 130 and the addition of the line that follows. Also I reworded the commennt in lines 122-125 for clarity. Changed lines are starred, added line has a "+":
>
> 118 // Use insertion sort on tiny arrays
> 119 if (length < INSERTION_SORT_THRESHOLD) {
> 120 if (left > 0) {
> 121 /*
> *122 * Temporarily set the value immediately preceding the range
> *123 * to the minimum int value to serve as a sentinel. This
> *124 * allows us to avoid the j >= left check on each iteration.
> 125 */
> 126 int before = a[left - 1]; a[left - 1] = Integer.MIN_VALUE;
> 127
> 128 for (int j, i = left + 1; i <= right; i++) {
> 129 int ai = a[i];
> *130 for (j = i - 1; ai < a[j]; j--) {
> +NEW // assert j >= left;
> 131 a[j + 1] = a[j];
> 132 }
> 133 a[j + 1] = ai;
> 134 }
> 135 a[left - 1] = before;
> 136 } else {
>
> The comment in line 155 is misleading, and should be replace by this:
>
> I'd reword the comment in lines 137-140 for clarity:
>
> 137 /*
> 138 * For case, when left == 0, traditional (without a sentinel)
> 139 * insertion sort, optimized for server VM, is used.
> 140 */
>
> 155 // Inexpensive approximation of length / 7
>
> The comment strategy for choosing sample points is subtle enough that it deserves bit more commentary. I propose replacing line 158 with this:
>
> /*
> * Sort five evenly spaced elements around (and including) the center
> * element in the range. These elements will be used for pivot selection
> * as described below. The choice for spacing these elements was empirically
> * determined to work well on a wide variety of inputs.
> */
>
> Lines 183 and 184 are unnecessary array accesses. Probably better is:
>
> 183 int pivot1 = ae2;
> 184 int pivot2 = ae4;
>
> This is essentially just a renaming of these two local variables, and presumably the compiler will act accordingly.
>
> I prefer the original wording:
>
> 197 * Pointer k is the first index of -part
>
> to the revised wording:
>
> 217 * Pointer k is the first index of not inspected -part.
>
> I'd revert this change.
>
> I'd change 190 from:
>
> 190 if (pivot1 < pivot2) {
> to
> 190 if (pivot1 != pivot2) {
>
> It's clearer (this is the "pivots differ" case), and at least as fast.
>
> The spacing lines 249 and 250 is a bit quirky:
>
> 249 dualPivotQuicksort(a, left, less - 2);
> 250 dualPivotQuicksort(a, great + 2, right);
>
> I'd replace it with the more standard:
>
> 249 dualPivotQuicksort(a, left, less - 2);
> 250 dualPivotQuicksort(a, great + 2, right);
>
> Alternatively, you could make corresponding arguments line up:
>
> 249 dualPivotQuicksort(a, left, less - 2);
> 250 dualPivotQuicksort(a, great + 2, right );
>
> The policy change in line 256 is non-trivial:
>
> Old:
>
> 298 if (less < e1 && great > e5) {
>
> New:
>
> 256 if (great - less > 5 * seventh) {
>
> I previously experimented with the "new-style" policy (length-based, rather than endpoint-based), but didn't get good results. Perhaps the symmetric style combines well with the change in interval size (5/7 vs. 2/3). At any rate, it's simpler than the old-style and conforms to the comment, so if you're satisfied with the performance, I'm happy with it.
>
> In lines 362 and 363, you have the same "quirky spacing" as in linkes 249 and 250:
>
> 361 // Sort left and right parts recursively
> 362 dualPivotQuicksort(a, left, less - 1);
> 363 dualPivotQuicksort(a, great + 1, right);
>
> Regards,
> Josh
More information about the core-libs-dev
mailing list