New portion of improvements for Dual-Pivot Quicksort
Vladimir Iaroslavski
iaroslavski at mail.ru
Tue May 18 14:57:50 UTC 2010
Hello,
I've run your modification for counting sort, it real faster.
I attached new version with your changes (I did little bit
format it) and included my case with float/double.
Note that you modification doesn't pass test from Sorting class,
which I sent earlier. It fails on float/double test:
Test #3: random = 666, len = 34, a = 0, g = 6, z = 9, n = 10, p = 9
I suggest shorter method (which is based on your idea to skip counting
negative zeros on Phase 1.): I found find first zero index (or it will
be index of first positive element if no zeros at all, or last negative,
if no positive and zero elements) and then swap negative zero to the
beginning of the sub-range.
int hi = right;
while (left < hi) {
int middle = (left + hi) >>> 1;
float middleValue = a[middle];
if (middleValue < 0.0f) {
left = middle + 1;
} else {
hi = middle;
}
}
for (int k = left, p = left; k <= right; k++) {
float ak = a[k];
if (ak != 0.0f) {
return;
}
if (Float.floatToRawIntBits(ak) < 0) { // ak is -0.0f
a[k] = +0.0f;
a[p++] = -0.0f;
}
}
Important note: in partitioning loop there are several places
(marked by // !) where potential bug with -0.0 could be
(when pivot and a[great] are zeros with different signs):
if (a[great] == pivot1) {
a[k] = a[less];
- a[less++] = pivot1; // !
+ a[less++] = a[great];
} else { // pivot1 < a[great] < pivot2
a[k] = a[great];
}
- a[great--] = pivot2; // !
+ a[great--] = ak;
} else if (ak == pivot1) { // Move a[k] to left part
a[k] = a[less];
- a[less++] = pivot1; // !
+ a[less++] = ak;
}
and the same in "Pivots are equal" branch.
I did changes "pivot1/2 -> ak" in methods for all types
and "pivot1 -> a[great]" in float/double sections only.
Please, review format changes for counting sort and new version
of Phase 3 for float/double.
Thank you,
Vladimir
Dmytro Sheyko wrote:
> Hi,
>
> About counting sort again.
>
> 1. This condition "i < count.length && k <= right" is excessive. Any one
> conjunct is enough. "k <= right" seems better.
> 2. No need to calculate "short value = (short) (i + Short.MIN_VALUE)"
> when "count[i]" is zero.
> 3. For signed primitives (byte and short) we would better loop backward.
> Thanks to "k >= fromIndex" condition we will quit looping earlier
> assuming that typically we work with positive numbers.
> For unsigned primitives (char) we would better loop forward because
> typically we work with characters about zero (ASCII).
>
> - for (int i = 0, k = left; i < count.length && k <= right; i++) {
> - short value = (short) (i + Short.MIN_VALUE);
> - for (int s = count[i]; s > 0; s--) {
> - a[k++] = value;
> - }
> - }
>
> + for (int i = NUM_SHORT_VALUES - 1, k = toIndex - 1; k >=
> fromIndex; --i) {
> + while (count[i] == 0) --i;
> + short value = (short) (i + Short.MIN_VALUE);
> + int s = count[i];
> + do { a[k--] = value; } while (--s > 0);
> + }
>
> Thanks,
> Dmytro Sheyko
>
> > From: iaroslavski at mail.ru
> > To: dmytro_sheyko at hotmail.com
> > CC: core-libs-dev at openjdk.java.net; iaroslavski at mail.ru
> > Subject: Re[2]: New portion of improvements for Dual-Pivot Quicksort
> > Date: Tue, 18 May 2010 01:11:19 +0400
> >
> > Sounds good!
> > Will consider too...
> >
> > Mon, 17 May 2010 22:24:11 +0700 письмо от Dmytro Sheyko
> <dmytro_sheyko at hotmail.com>:
> >
> > > Hi,
> > >
> > > Regarding counting sort. We can check whether we should switch to
> counting sort only once in the beginning.
> > >
> > > > Date: Mon, 17 May 2010 17:30:37 +0400
> > > > From: iaroslavski at mail.ru
> > > > Subject: Re: New portion of improvements for Dual-Pivot Quicksort
> > > > To: dmytro_sheyko at hotmail.com
> > > > CC: core-libs-dev at openjdk.java.net
> > > >
> > > > Hello,
> > > >
> > > > Thank you for review, I'll check and run tests again with you
> changes.
> > > >
> > > > Thank you,
> > > > Vladimir
> > > >
> > > > Dmytro Sheyko wrote:
> > > > > Hello,
> > > > >
> > > > > More ideas.
> > > > >
> > > > > 1. We can use
> > > > > Double.doubleToRawLongBits instead of Double.doubleToLongBits and
> > > > > Float.floatToRawIntBits instead of Float.floatToIntBits.
> > > > > No need to handle NaN's because they all are placed to the end
> of array.
> > > > >
> > > > > 2. Note that
> > > > > Double.doubleToRawLongBits(+0.0) == 0L and
> > > > > Double.doubleToRawLongBits(-0.0) == Long.MIN_VALUE and
> > > > > Float.floatToRawIntBits(+0.0) == 0 and
> > > > > Float.floatToRawIntBits(-0.0) == Integer.MIN_VALUE.
> > > > >
> > > > > Comparing with is zero usually more efficient (or at least not
> worse)
> > > > > than with other values. Thus such pattern
> > > > >
> > > > > if (ak == 0.0f && NEGATIVE_ZERO == Float.floatToIntBits(ak))
> > > > >
> > > > > can be replaced with
> > > > >
> > > > > if (ak == 0.0f && Float.floatToIntBits(ak) < 0)
> > > > >
> > > > > 3. It would be more efficient to count negative zeros after
> sorting.
> > > > > General sorting algorithm puts both negative and positive zeros
> together
> > > > > (but maybe not in right order).
> > > > > Therefore we have to process less elements because usually we
> have less
> > > > > zeros than other numbers.
> > > > >
> > > > > Thanks,
> > > > > Dmytro Sheyko
> > > > >
> > > > > > From: iaroslavski at mail.ru
> > > > > > To: dmytro_sheyko at hotmail.com; jjb at google.com
> > > > > > CC: core-libs-dev at openjdk.java.net; iaroslavski at mail.ru
> > > > > > Subject: Re[6]: New portion of improvements for Dual-Pivot
> Quicksort
> > > > > > Date: Fri, 14 May 2010 23:54:06 +0400
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > I've updated the class, please, review the changes.
> > > > > >
> > > > > > Vladimir
> > > > > >
> > > > > > Fri, 14 May 2010 01:48:11 +0700 письмо от Dmytro Sheyko
> > > > > <dmytro_sheyko at hotmail..com>:
> > > > > >
> > > > > > > Yes. I prefer F (Find First zero using binary search) over
> C (Count
> > > > > negatives) and S (Smart Scan for zero).
> > > > > > >
> > > > > > > > From: iaroslavski at mail.ru
> > > > > > > > To: dmytro_sheyko at hotmail.com
> > > > > > > > CC: jjb at google.com; core-libs-dev at openjdk.java.net;
> > > > > iaroslavski at mail.ru
> > > > > > > > Subject: Re[4]: New portion of improvements for
> Dual-Pivot Quicksort
> > > > > > > > Date: Thu, 13 May 2010 21:34:54 +0400
> > > > > > > >
> > > > > > > > Dmytro,
> > > > > > > >
> > > > > > > > I've tested your suggested variants, and found that case "C"
> > > > > > > > (very interesting approach to find first position of zero
> > > > > > > > by counting negative elements) works slower than original
> > > > > > > > or two other cases.
> > > > > > > >
> > > > > > > > Implementations "F" and "S" are very close to each other
> > > > > > > > and little bit faster than original. I prefer case "F":
> > > > > > > > it is shorter and more clear. Do you agree?
> > > > > > > >
> > > > > > > > I'll prepare updated DualPivotQuicksort file and send it
> > > > > > > > tomorrow.
> > > > > > > >
> > > > > > > > Thank you,
> > > > > > > > Vladimir
> > > > > > > >
> > > > > > > > Wed, 12 May 2010 17:04:52 +0700 письмо от Dmytro Sheyko
> > > > > <dmytro_sheyko at hotmail.com>:
> > > > > > > >
> > > > > > > > > Vladimir,
> > > > > > > > >
> > > > > > > > > Your changes are good for me.
> > > > > > > > >
> > > > > > > > > Additionally I have some comments/proposals regarding
> dealing
> > > > > with negative zeros.
> > > > > > > > >
> > > > > > > > > 1. Scanning for the first zero we can avoid range check
> (i >=
> > > > > left) if we have at least one negative value.
> > > > > > > > > --- DualPivotQuicksort.java Tue May 11 09:04:19 2010
> > > > > > > > > +++ DualPivotQuicksortS.java Wed May 12 12:10:46 2010
> > > > > > > > > @@ -1705,10 +1705,15 @@
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > // Find first zero element
> > > > > > > > > - int zeroIndex = findAnyZero(a, left, n);
> > > > > > > > > + int zeroIndex = 0;
> > > > > > > > >
> > > > > > > > > - for (int i = zeroIndex - 1; i >= left && a[i] ==
> 0.0f; i--) {
> > > > > > > > > - zeroIndex = i;
> > > > > > > > > + if (a[left] < 0.0f) {
> > > > > > > > > + zeroIndex = findAnyZero(a, left, n);
> > > > > > > > > +
> > > > > > > > > + // there is at least one negative value, so range
> check is
> > > > > not needed
> > > > > > > > > + for (int i = zeroIndex - 1; /*i >= left &&*/ a[i] ==
> 0.0f; i--) {
> > > > > > > > > + zeroIndex = i;
> > > > > > > > > + }
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > // Turn the right number of positive zeros back into
> negative zeros
> > > > > > > > >
> > > > > > > > > 2. We can find the position of the first zero by counting
> > > > > negative values during preprocessing phase.
> > > > > > > > > --- DualPivotQuicksort.java Tue May 11 09:04:19 2010
> > > > > > > > > +++ DualPivotQuicksortC.java Wed May 12 12:01:24 2010
> > > > > > > > > @@ -1678,7 +1678,7 @@
> > > > > > > > > * Phase 1: Count negative zeros and move NaNs to end of
> array.
> > > > > > > > > */
> > > > > > > > > final int NEGATIVE_ZERO = Float.floatToIntBits(-0.0f);
> > > > > > > > > - int numNegativeZeros = 0;
> > > > > > > > > + int numNegativeZeros = 0, numNegativeValues = 0;
> > > > > > > > > int n = right;
> > > > > > > > >
> > > > > > > > > for (int k = left; k <= n; k++) {
> > > > > > > > > @@ -1689,6 +1689,8 @@
> > > > > > > > > } else if (ak != ak) { // i.e., ak is NaN
> > > > > > > > > a[k--] = a[n];
> > > > > > > > > a[n--] = Float.NaN;
> > > > > > > > > + } else if (ak < 0.0f) {
> > > > > > > > > + numNegativeValues++;
> > > > > > > > > }
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > @@ -1705,7 +1707,7 @@
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > // Find first zero element
> > > > > > > > > - int zeroIndex = findAnyZero(a, left, n);
> > > > > > > > > + int zeroIndex = numNegativeValues;
> > > > > > > > >
> > > > > > > > > for (int i = zeroIndex - 1; i >= left && a[i] == 0.0f;
> i--) {
> > > > > > > > > zeroIndex = i;
> > > > > > > > >
> > > > > > > > > 3. We can use binary search to find the first zero and
> thus
> > > > > avoid linear scan.
> > > > > > > > > --- DualPivotQuicksort.java Tue May 11 09:04:19 2010
> > > > > > > > > +++ DualPivotQuicksortF.java Wed May 12 12:03:58 2010
> > > > > > > > > @@ -1705,11 +1705,7 @@
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > // Find first zero element
> > > > > > > > > - int zeroIndex = findAnyZero(a, left, n);
> > > > > > > > > -
> > > > > > > > > - for (int i = zeroIndex - 1; i >= left && a[i] ==
> 0.0f; i--) {
> > > > > > > > > - zeroIndex = i;
> > > > > > > > > - }
> > > > > > > > > + int zeroIndex = findFirstZero(a, left, n);
> > > > > > > > >
> > > > > > > > > // Turn the right number of positive zeros back into
> negative zeros
> > > > > > > > > for (int i = zeroIndex, m = zeroIndex +
> numNegativeZeros; i <
> > > > > m; i++) {
> > > > > > > > > @@ -1718,7 +1714,7 @@
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > /**
> > > > > > > > > - * Returns the index of some zero element in the
> specified
> > > > > range via
> > > > > > > > > + * Returns the index of the first zero element in the
> > > > > specified range via
> > > > > > > > > * binary search. The range is assumed to be sorted, and
> must
> > > > > contain
> > > > > > > > > * at least one zero.
> > > > > > > > > *
> > > > > > > > > @@ -1726,18 +1722,17 @@
> > > > > > > > > * @param low the index of the first element, inclusive,
> to be
> > > > > searched
> > > > > > > > > * @param high the index of the last element, inclusive,
> to be
> > > > > searched
> > > > > > > > > */
> > > > > > > > > - private static int findAnyZero(float[] a, int low,
> int high) {
> > > > > > > > > - while (true) {
> > > > > > > > > + private static int findFirstZero(float[] a, int low,
> int high) {
> > > > > > > > > + while (low < high) {
> > > > > > > > > int middle = (low + high) >>> 1;
> > > > > > > > > float middleValue = a[middle];
> > > > > > > > >
> > > > > > > > > if (middleValue < 0.0f) {
> > > > > > > > > low = middle + 1;
> > > > > > > > > - } else if (middleValue > 0.0f) {
> > > > > > > > > - high = middle - 1;
> > > > > > > > > - } else { // middleValue == 0.0f
> > > > > > > > > - return middle;
> > > > > > > > > + } else { // middleValue >= 0.0f
> > > > > > > > > + high = middle;
> > > > > > > > > }
> > > > > > > > > + return low;
> > > > > > > > > }
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > Counting negative values appeared more expensive than
> any other
> > > > > variants.
> > > > > > > > > The last proposal seems to me as efficient as the current
> > > > > solution is in its worst case - when we have only one negative
> zero (in
> > > > > the half of array).
> > > > > > > > > And it shows the best result if we have many zeros.
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Dmytro Sheyko
> > > > > > > > >
> > > > > > > > > > From: iaroslavski at mail.ru
> > > > > > > > > > To: jjb at google.com; dmytro_sheyko at hotmail.com
> > > > > > > > > > CC: core-libs-dev at openjdk.java.net; iaroslavski at mail.ru
> > > > > > > > > > Subject: Re[2]: New portion of improvements for
> Dual-Pivot
> > > > > Quicksort
> > > > > > > > > > Date: Sun, 9 May 2010 23:51:27 +0400
> > > > > > > > > >
> > > > > > > > > > Josh,
> > > > > > > > > > Dmytro,
> > > > > > > > > >
> > > > > > > > > > I have done more thoroughly testing "great - less > 5 *
> > > > > seventh" vs. "less < e1 && great > e5",
> > > > > > > > > > and found that more symmetric code "less < e1 &&
> great > e5"
> > > > > is little bit faster, ~0.5..0.7%
> > > > > > > > > > on both VMs. Other code has not been changed.
> > > > > > > > > >
> > > > > > > > > > Please, take the latest version in attachment.
> > > > > > > > > >
> > > > > > > > > > Vladimir
> > > > > > > > > >
> > > > > > > > > > Tue, 4 May 2010 21:57:42 -0700 письмо от Joshua Bloch
> > > > > <jjb at google.com>:
> > > > > > > > > >
> > > > > > > > > > > Vladimir,
> > > > > > > > > > >
> > > > > > > > > > > Old:
> > > > > > > > > > >
> > > > > > > > > > >298 if (less < e1 && great > e5) {
> > > > > > > > > > >
> > > > > > > > > > > New:
> > > > > > > > > > >
> > > > > > > > > > >256 if (great - less > 5 * seventh) {
> > > > > > > > > >
> > > > > > > > > > >Regards,
> > > > > > > > > > >Josh
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: DualPivotQuicksort.java
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100518/9ba99906/DualPivotQuicksort.java>
More information about the core-libs-dev
mailing list