New portion of improvements for Dual-Pivot Quicksort

Dmytro Sheyko dmytro_sheyko at hotmail.com
Tue May 18 08:15:07 UTC 2010


Hi,

About counting sort again.

1. This condition "i < count.length && k <= right" is excessive. Any one conjunct is enough. "k <= right" seems better.
2. No need to calculate "short value = (short) (i + Short.MIN_VALUE)" when "count[i]" is zero.
3. For signed primitives (byte and short) we would better loop backward. Thanks to "k >= fromIndex" condition we will quit looping earlier assuming that typically we work with positive numbers.
For unsigned primitives (char) we would better loop forward because typically we work with characters about zero (ASCII).

-        for (int i = 0, k = left; i < count.length && k <= right; i++) {
-            short value = (short) (i + Short.MIN_VALUE);
-            for (int s = count[i]; s > 0; s--) {
-                a[k++] = value;
-            }
-        }

+        for (int i = NUM_SHORT_VALUES - 1, k = toIndex - 1; k >= fromIndex; --i) {
+            while (count[i] == 0) --i;
+            short value = (short) (i + Short.MIN_VALUE);
+            int s = count[i];
+            do { a[k--] = value; } while (--s > 0);
+        }

Thanks,
Dmytro Sheyko

> From: iaroslavski at mail.ru
> To: dmytro_sheyko at hotmail.com
> CC: core-libs-dev at openjdk.java.net; iaroslavski at mail.ru
> Subject: Re[2]: New portion of improvements for Dual-Pivot Quicksort
> Date: Tue, 18 May 2010 01:11:19 +0400
> 
> Sounds good!
> Will consider too...
> 
> Mon, 17 May 2010 22:24:11 +0700 письмо от Dmytro Sheyko <dmytro_sheyko at hotmail.com>:
> 
> > Hi,
> > 
> > Regarding counting sort. We can check whether we should switch to counting sort only once in the beginning.
> > 
> > > Date: Mon, 17 May 2010 17:30:37 +0400
> > > From: iaroslavski at mail.ru
> > > Subject: Re: New portion of improvements for Dual-Pivot Quicksort
> > > To: dmytro_sheyko at hotmail.com
> > > CC: core-libs-dev at openjdk.java.net
> > > 
> > > Hello,
> > > 
> > > Thank you for review, I'll check and run tests again with you changes.
> > > 
> > > Thank you,
> > > Vladimir
> > > 
> > > Dmytro Sheyko wrote:
> > > > Hello,
> > > > 
> > > > More ideas.
> > > > 
> > > > 1. We can use
> > > > Double.doubleToRawLongBits instead of Double.doubleToLongBits and
> > > > Float.floatToRawIntBits instead of Float.floatToIntBits.
> > > > No need to handle NaN's because they all are placed to the end of array.
> > > > 
> > > > 2. Note that
> > > > Double.doubleToRawLongBits(+0.0) == 0L and
> > > > Double.doubleToRawLongBits(-0.0) == Long.MIN_VALUE and
> > > > Float.floatToRawIntBits(+0.0) == 0 and
> > > > Float.floatToRawIntBits(-0.0) == Integer.MIN_VALUE.
> > > > 
> > > > Comparing with is zero usually more efficient (or at least not worse) 
> > > > than with other values. Thus such pattern
> > > > 
> > > > if (ak == 0.0f && NEGATIVE_ZERO == Float.floatToIntBits(ak))
> > > > 
> > > > can be replaced with
> > > > 
> > > > if (ak == 0.0f && Float.floatToIntBits(ak) < 0)
> > > > 
> > > > 3. It would be more efficient to count negative zeros after sorting.
> > > > General sorting algorithm puts both negative and positive zeros together 
> > > > (but maybe not in right order).
> > > > Therefore we have to process less elements because usually we have less 
> > > > zeros than other numbers.
> > > > 
> > > > Thanks,
> > > > Dmytro Sheyko
> > > > 
> > > > > From: iaroslavski at mail.ru
> > > > > To: dmytro_sheyko at hotmail.com; jjb at google.com
> > > > > CC: core-libs-dev at openjdk.java.net; iaroslavski at mail.ru
> > > > > Subject: Re[6]: New portion of improvements for Dual-Pivot Quicksort
> > > > > Date: Fri, 14 May 2010 23:54:06 +0400
> > > > >
> > > > > Hello,
> > > > >
> > > > > I've updated the class, please, review the changes.
> > > > >
> > > > > Vladimir
> > > > >
> > > > > Fri, 14 May 2010 01:48:11 +0700 письмо от Dmytro Sheyko 
> > > > <dmytro_sheyko at hotmail..com>:
> > > > >
> > > > > > Yes. I prefer F (Find First zero using binary search) over C (Count 
> > > > negatives) and S (Smart Scan for zero).
> > > > > >
> > > > > > > From: iaroslavski at mail.ru
> > > > > > > To: dmytro_sheyko at hotmail.com
> > > > > > > CC: jjb at google.com; core-libs-dev at openjdk.java.net; 
> > > > iaroslavski at mail.ru
> > > > > > > Subject: Re[4]: New portion of improvements for Dual-Pivot Quicksort
> > > > > > > Date: Thu, 13 May 2010 21:34:54 +0400
> > > > > > >
> > > > > > > Dmytro,
> > > > > > >
> > > > > > > I've tested your suggested variants, and found that case "C"
> > > > > > > (very interesting approach to find first position of zero
> > > > > > > by counting negative elements) works slower than original
> > > > > > > or two other cases.
> > > > > > >
> > > > > > > Implementations "F" and "S" are very close to each other
> > > > > > > and little bit faster than original. I prefer case "F":
> > > > > > > it is shorter and more clear. Do you agree?
> > > > > > >
> > > > > > > I'll prepare updated DualPivotQuicksort file and send it
> > > > > > > tomorrow.
> > > > > > >
> > > > > > > Thank you,
> > > > > > > Vladimir
> > > > > > >
> > > > > > > Wed, 12 May 2010 17:04:52 +0700 письмо от Dmytro Sheyko 
> > > > <dmytro_sheyko at hotmail.com>:
> > > > > > >
> > > > > > > > Vladimir,
> > > > > > > >
> > > > > > > > Your changes are good for me.
> > > > > > > >
> > > > > > > > Additionally I have some comments/proposals regarding dealing 
> > > > with negative zeros.
> > > > > > > >
> > > > > > > > 1. Scanning for the first zero we can avoid range check (i >= 
> > > > left) if we have at least one negative value.
> > > > > > > > --- DualPivotQuicksort.java Tue May 11 09:04:19 2010
> > > > > > > > +++ DualPivotQuicksortS.java Wed May 12 12:10:46 2010
> > > > > > > > @@ -1705,10 +1705,15 @@
> > > > > > > > }
> > > > > > > >
> > > > > > > > // Find first zero element
> > > > > > > > - int zeroIndex = findAnyZero(a, left, n);
> > > > > > > > + int zeroIndex = 0;
> > > > > > > >
> > > > > > > > - for (int i = zeroIndex - 1; i >= left && a[i] == 0.0f; i--) {
> > > > > > > > - zeroIndex = i;
> > > > > > > > + if (a[left] < 0.0f) {
> > > > > > > > + zeroIndex = findAnyZero(a, left, n);
> > > > > > > > +
> > > > > > > > + // there is at least one negative value, so range check is 
> > > > not needed
> > > > > > > > + for (int i = zeroIndex - 1; /*i >= left &&*/ a[i] == 0.0f; i--) {
> > > > > > > > + zeroIndex = i;
> > > > > > > > + }
> > > > > > > > }
> > > > > > > >
> > > > > > > > // Turn the right number of positive zeros back into negative zeros
> > > > > > > >
> > > > > > > > 2. We can find the position of the first zero by counting 
> > > > negative values during preprocessing phase.
> > > > > > > > --- DualPivotQuicksort.java Tue May 11 09:04:19 2010
> > > > > > > > +++ DualPivotQuicksortC.java Wed May 12 12:01:24 2010
> > > > > > > > @@ -1678,7 +1678,7 @@
> > > > > > > > * Phase 1: Count negative zeros and move NaNs to end of array.
> > > > > > > > */ 
> > > > > > > > final int NEGATIVE_ZERO = Float.floatToIntBits(-0.0f);
> > > > > > > > - int numNegativeZeros = 0;
> > > > > > > > + int numNegativeZeros = 0, numNegativeValues = 0;
> > > > > > > > int n = right;
> > > > > > > >
> > > > > > > > for (int k = left; k <= n; k++) {
> > > > > > > > @@ -1689,6 +1689,8 @@
> > > > > > > > } else if (ak != ak) { // i.e., ak is NaN
> > > > > > > > a[k--] = a[n];
> > > > > > > > a[n--] = Float.NaN;
> > > > > > > > + } else if (ak < 0.0f) {
> > > > > > > > + numNegativeValues++;
> > > > > > > > }
> > > > > > > > }
> > > > > > > >
> > > > > > > > @@ -1705,7 +1707,7 @@
> > > > > > > > }
> > > > > > > >
> > > > > > > > // Find first zero element
> > > > > > > > - int zeroIndex = findAnyZero(a, left, n);
> > > > > > > > + int zeroIndex = numNegativeValues;
> > > > > > > >
> > > > > > > > for (int i = zeroIndex - 1; i >= left && a[i] == 0.0f; i--) {
> > > > > > > > zeroIndex = i;
> > > > > > > >
> > > > > > > > 3. We can use binary search to find the first zero and thus 
> > > > avoid linear scan.
> > > > > > > > --- DualPivotQuicksort.java Tue May 11 09:04:19 2010
> > > > > > > > +++ DualPivotQuicksortF.java Wed May 12 12:03:58 2010
> > > > > > > > @@ -1705,11 +1705,7 @@
> > > > > > > > }
> > > > > > > >
> > > > > > > > // Find first zero element
> > > > > > > > - int zeroIndex = findAnyZero(a, left, n);
> > > > > > > > -
> > > > > > > > - for (int i = zeroIndex - 1; i >= left && a[i] == 0.0f; i--) {
> > > > > > > > - zeroIndex = i;
> > > > > > > > - }
> > > > > > > > + int zeroIndex = findFirstZero(a, left, n);
> > > > > > > >
> > > > > > > > // Turn the right number of positive zeros back into negative zeros
> > > > > > > > for (int i = zeroIndex, m = zeroIndex + numNegativeZeros; i < 
> > > > m; i++) {
> > > > > > > > @@ -1718,7 +1714,7 @@
> > > > > > > > }
> > > > > > > >
> > > > > > > > /**
> > > > > > > > - * Returns the index of some zero element in the specified 
> > > > range via
> > > > > > > > + * Returns the index of the first zero element in the 
> > > > specified range via
> > > > > > > > * binary search. The range is assumed to be sorted, and must 
> > > > contain
> > > > > > > > * at least one zero.
> > > > > > > > *
> > > > > > > > @@ -1726,18 +1722,17 @@
> > > > > > > > * @param low the index of the first element, inclusive, to be 
> > > > searched
> > > > > > > > * @param high the index of the last element, inclusive, to be 
> > > > searched
> > > > > > > > */ 
> > > > > > > > - private static int findAnyZero(float[] a, int low, int high) {
> > > > > > > > - while (true) {
> > > > > > > > + private static int findFirstZero(float[] a, int low, int high) {
> > > > > > > > + while (low < high) {
> > > > > > > > int middle = (low + high) >>> 1;
> > > > > > > > float middleValue = a[middle];
> > > > > > > >
> > > > > > > > if (middleValue < 0.0f) {
> > > > > > > > low = middle + 1;
> > > > > > > > - } else if (middleValue > 0.0f) {
> > > > > > > > - high = middle - 1;
> > > > > > > > - } else { // middleValue == 0.0f
> > > > > > > > - return middle;
> > > > > > > > + } else { // middleValue >= 0.0f
> > > > > > > > + high = middle;
> > > > > > > > }
> > > > > > > > + return low;
> > > > > > > > }
> > > > > > > > }
> > > > > > > >
> > > > > > > > Counting negative values appeared more expensive than any other 
> > > > variants.
> > > > > > > > The last proposal seems to me as efficient as the current 
> > > > solution is in its worst case - when we have only one negative zero (in 
> > > > the half of array).
> > > > > > > > And it shows the best result if we have many zeros.
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Dmytro Sheyko
> > > > > > > >
> > > > > > > > > From: iaroslavski at mail.ru
> > > > > > > > > To: jjb at google.com; dmytro_sheyko at hotmail.com
> > > > > > > > > CC: core-libs-dev at openjdk.java.net; iaroslavski at mail.ru
> > > > > > > > > Subject: Re[2]: New portion of improvements for Dual-Pivot 
> > > > Quicksort
> > > > > > > > > Date: Sun, 9 May 2010 23:51:27 +0400
> > > > > > > > >
> > > > > > > > > Josh,
> > > > > > > > > Dmytro,
> > > > > > > > >
> > > > > > > > > I have done more thoroughly testing "great - less > 5 * 
> > > > seventh" vs. "less < e1 && great > e5",
> > > > > > > > > and found that more symmetric code "less < e1 && great > e5" 
> > > > is little bit faster, ~0.5..0.7%
> > > > > > > > > on both VMs. Other code has not been changed.
> > > > > > > > >
> > > > > > > > > Please, take the latest version in attachment.
> > > > > > > > >
> > > > > > > > > Vladimir
> > > > > > > > >
> > > > > > > > > Tue, 4 May 2010 21:57:42 -0700 письмо от Joshua Bloch 
> > > > <jjb at google.com>:
> > > > > > > > >
> > > > > > > > > > Vladimir,
> > > > > > > > > >
> > > > > > > > > > Old:
> > > > > > > > > >
> > > > > > > > > >298 if (less < e1 && great > e5) {
> > > > > > > > > >
> > > > > > > > > > New:
> > > > > > > > > >
> > > > > > > > > >256 if (great - less > 5 * seventh) {
> > > > > > > > >
> > > > > > > > > >Regards,
> > > > > > > > > >Josh
 		 	   		  
_________________________________________________________________
Hotmail: Powerful Free email with security by Microsoft.
https://signup.live.com/signup.aspx?id=60969
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100518/d9b9f2fd/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DualPivotQuicksort.diff
Type: application/octet-stream
Size: 7127 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100518/d9b9f2fd/DualPivotQuicksort.diff>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DualPivotQuicksort.java
Type: text/x-java
Size: 101553 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100518/d9b9f2fd/DualPivotQuicksort.java>


More information about the core-libs-dev mailing list