New portion of improvements for Dual-Pivot Quicksort

Vladimir Iaroslavski iaroslavski at mail.ru
Mon May 17 13:30:37 UTC 2010


Hello,

Thank you for review, I'll check and run tests again with you changes.

Thank you,
Vladimir

Dmytro Sheyko wrote:
> Hello,
> 
> More ideas.
> 
> 1. We can use
> Double.doubleToRawLongBits instead of Double.doubleToLongBits and
> Float.floatToRawIntBits instead of Float.floatToIntBits.
> No need to handle NaN's because they all are placed to the end of array.
> 
> 2. Note that
> Double.doubleToRawLongBits(+0.0) == 0L and
> Double.doubleToRawLongBits(-0.0) == Long.MIN_VALUE and
> Float.floatToRawIntBits(+0.0) == 0 and
> Float.floatToRawIntBits(-0.0) == Integer.MIN_VALUE.
> 
> Comparing with is zero usually more efficient (or at least not worse) 
> than with other values. Thus such pattern
> 
>     if (ak == 0.0f && NEGATIVE_ZERO == Float.floatToIntBits(ak))
> 
> can be replaced with
> 
>     if (ak == 0.0f && Float.floatToIntBits(ak) < 0)
> 
> 3. It would be more efficient to count negative zeros after sorting.
> General sorting algorithm puts both negative and positive zeros together 
> (but maybe not in right order).
> Therefore we have to process less elements because usually we have less 
> zeros than other numbers.
> 
> Thanks,
> Dmytro Sheyko
> 
>  > From: iaroslavski at mail.ru
>  > To: dmytro_sheyko at hotmail.com; jjb at google.com
>  > CC: core-libs-dev at openjdk.java.net; iaroslavski at mail.ru
>  > Subject: Re[6]: New portion of improvements for Dual-Pivot Quicksort
>  > Date: Fri, 14 May 2010 23:54:06 +0400
>  >
>  > Hello,
>  >
>  > I've updated the class, please, review the changes.
>  >
>  > Vladimir
>  >
>  > Fri, 14 May 2010 01:48:11 +0700 письмо от Dmytro Sheyko 
> <dmytro_sheyko at hotmail..com>:
>  >
>  > > Yes. I prefer F (Find First zero using binary search) over C (Count 
> negatives) and S (Smart Scan for zero).
>  > >
>  > > > From: iaroslavski at mail.ru
>  > > > To: dmytro_sheyko at hotmail.com
>  > > > CC: jjb at google.com; core-libs-dev at openjdk.java.net; 
> iaroslavski at mail.ru
>  > > > Subject: Re[4]: New portion of improvements for Dual-Pivot Quicksort
>  > > > Date: Thu, 13 May 2010 21:34:54 +0400
>  > > >
>  > > > Dmytro,
>  > > >
>  > > > I've tested your suggested variants, and found that case "C"
>  > > > (very interesting approach to find first position of zero
>  > > > by counting negative elements) works slower than original
>  > > > or two other cases.
>  > > >
>  > > > Implementations "F" and "S" are very close to each other
>  > > > and little bit faster than original. I prefer case "F":
>  > > > it is shorter and more clear. Do you agree?
>  > > >
>  > > > I'll prepare updated DualPivotQuicksort file and send it
>  > > > tomorrow.
>  > > >
>  > > > Thank you,
>  > > > Vladimir
>  > > >
>  > > > Wed, 12 May 2010 17:04:52 +0700 письмо от Dmytro Sheyko 
> <dmytro_sheyko at hotmail.com>:
>  > > >
>  > > > > Vladimir,
>  > > > >
>  > > > > Your changes are good for me.
>  > > > >
>  > > > > Additionally I have some comments/proposals regarding dealing 
> with negative zeros.
>  > > > >
>  > > > > 1. Scanning for the first zero we can avoid range check (i >= 
> left) if we have at least one negative value.
>  > > > > --- DualPivotQuicksort.java Tue May 11 09:04:19 2010
>  > > > > +++ DualPivotQuicksortS.java Wed May 12 12:10:46 2010
>  > > > > @@ -1705,10 +1705,15 @@
>  > > > > }
>  > > > >
>  > > > > // Find first zero element
>  > > > > - int zeroIndex = findAnyZero(a, left, n);
>  > > > > + int zeroIndex = 0;
>  > > > >
>  > > > > - for (int i = zeroIndex - 1; i >= left && a[i] == 0.0f; i--) {
>  > > > > - zeroIndex = i;
>  > > > > + if (a[left] < 0.0f) {
>  > > > > + zeroIndex = findAnyZero(a, left, n);
>  > > > > +
>  > > > > + // there is at least one negative value, so range check is 
> not needed
>  > > > > + for (int i = zeroIndex - 1; /*i >= left &&*/ a[i] == 0.0f; i--) {
>  > > > > + zeroIndex = i;
>  > > > > + }
>  > > > > }
>  > > > >
>  > > > > // Turn the right number of positive zeros back into negative zeros
>  > > > >
>  > > > > 2. We can find the position of the first zero by counting 
> negative values during preprocessing phase.
>  > > > > --- DualPivotQuicksort.java Tue May 11 09:04:19 2010
>  > > > > +++ DualPivotQuicksortC.java Wed May 12 12:01:24 2010
>  > > > > @@ -1678,7 +1678,7 @@
>  > > > > * Phase 1: Count negative zeros and move NaNs to end of array.
>  > > > > */
>  > > > > final int NEGATIVE_ZERO = Float.floatToIntBits(-0.0f);
>  > > > > - int numNegativeZeros = 0;
>  > > > > + int numNegativeZeros = 0, numNegativeValues = 0;
>  > > > > int n = right;
>  > > > >
>  > > > > for (int k = left; k <= n; k++) {
>  > > > > @@ -1689,6 +1689,8 @@
>  > > > > } else if (ak != ak) { // i.e., ak is NaN
>  > > > > a[k--] = a[n];
>  > > > > a[n--] = Float.NaN;
>  > > > > + } else if (ak < 0.0f) {
>  > > > > + numNegativeValues++;
>  > > > > }
>  > > > > }
>  > > > >
>  > > > > @@ -1705,7 +1707,7 @@
>  > > > > }
>  > > > >
>  > > > > // Find first zero element
>  > > > > - int zeroIndex = findAnyZero(a, left, n);
>  > > > > + int zeroIndex = numNegativeValues;
>  > > > >
>  > > > > for (int i = zeroIndex - 1; i >= left && a[i] == 0.0f; i--) {
>  > > > > zeroIndex = i;
>  > > > >
>  > > > > 3. We can use binary search to find the first zero and thus 
> avoid linear scan.
>  > > > > --- DualPivotQuicksort.java Tue May 11 09:04:19 2010
>  > > > > +++ DualPivotQuicksortF.java Wed May 12 12:03:58 2010
>  > > > > @@ -1705,11 +1705,7 @@
>  > > > > }
>  > > > >
>  > > > > // Find first zero element
>  > > > > - int zeroIndex = findAnyZero(a, left, n);
>  > > > > -
>  > > > > - for (int i = zeroIndex - 1; i >= left && a[i] == 0.0f; i--) {
>  > > > > - zeroIndex = i;
>  > > > > - }
>  > > > > + int zeroIndex = findFirstZero(a, left, n);
>  > > > >
>  > > > > // Turn the right number of positive zeros back into negative zeros
>  > > > > for (int i = zeroIndex, m = zeroIndex + numNegativeZeros; i < 
> m; i++) {
>  > > > > @@ -1718,7 +1714,7 @@
>  > > > > }
>  > > > >
>  > > > > /**
>  > > > > - * Returns the index of some zero element in the specified 
> range via
>  > > > > + * Returns the index of the first zero element in the 
> specified range via
>  > > > > * binary search. The range is assumed to be sorted, and must 
> contain
>  > > > > * at least one zero.
>  > > > > *
>  > > > > @@ -1726,18 +1722,17 @@
>  > > > > * @param low the index of the first element, inclusive, to be 
> searched
>  > > > > * @param high the index of the last element, inclusive, to be 
> searched
>  > > > > */
>  > > > > - private static int findAnyZero(float[] a, int low, int high) {
>  > > > > - while (true) {
>  > > > > + private static int findFirstZero(float[] a, int low, int high) {
>  > > > > + while (low < high) {
>  > > > > int middle = (low + high) >>> 1;
>  > > > > float middleValue = a[middle];
>  > > > >
>  > > > > if (middleValue < 0.0f) {
>  > > > > low = middle + 1;
>  > > > > - } else if (middleValue > 0.0f) {
>  > > > > - high = middle - 1;
>  > > > > - } else { // middleValue == 0.0f
>  > > > > - return middle;
>  > > > > + } else { // middleValue >= 0.0f
>  > > > > + high = middle;
>  > > > > }
>  > > > > + return low;
>  > > > > }
>  > > > > }
>  > > > >
>  > > > > Counting negative values appeared more expensive than any other 
> variants.
>  > > > > The last proposal seems to me as efficient as the current 
> solution is in its worst case - when we have only one negative zero (in 
> the half of array).
>  > > > > And it shows the best result if we have many zeros.
>  > > > >
>  > > > > Regards,
>  > > > > Dmytro Sheyko
>  > > > >
>  > > > > > From: iaroslavski at mail.ru
>  > > > > > To: jjb at google.com; dmytro_sheyko at hotmail.com
>  > > > > > CC: core-libs-dev at openjdk.java.net; iaroslavski at mail.ru
>  > > > > > Subject: Re[2]: New portion of improvements for Dual-Pivot 
> Quicksort
>  > > > > > Date: Sun, 9 May 2010 23:51:27 +0400
>  > > > > >
>  > > > > > Josh,
>  > > > > > Dmytro,
>  > > > > >
>  > > > > > I have done more thoroughly testing "great - less > 5 * 
> seventh" vs. "less < e1 && great > e5",
>  > > > > > and found that more symmetric code "less < e1 && great > e5" 
> is little bit faster, ~0.5..0.7%
>  > > > > > on both VMs. Other code has not been changed.
>  > > > > >
>  > > > > > Please, take the latest version in attachment.
>  > > > > >
>  > > > > > Vladimir
>  > > > > >
>  > > > > > Tue, 4 May 2010 21:57:42 -0700 письмо от Joshua Bloch 
> <jjb at google.com>:
>  > > > > >
>  > > > > > > Vladimir,
>  > > > > > >
>  > > > > > > Old:
>  > > > > > >
>  > > > > > >298 if (less < e1 && great > e5) {
>  > > > > > >
>  > > > > > > New:
>  > > > > > >
>  > > > > > >256 if (great - less > 5 * seventh) {
>  > > > > >
>  > > > > > >Regards,
>  > > > > > >Josh



More information about the core-libs-dev mailing list