New portion of improvements for Dual-Pivot Quicksort

Tue Jun 1 11:40:19 UTC 2010

Hi Dmytro,

Very interesting investigation, thank you very much!

I tried you suggested sorting network, but all versions work the same.
Then I used bubble sort (10 comparisons) and it shows better
results: on client it is faster on 0.2%, server - 0.8%, not too
much, but faster.

The schema is (bubble sort with changes of direction):
[4 5] [3 4] [2 3] [1 2] [2 3] [3 4] [4 5] [3 4] [2 3] [3 4]

And moreover: if we use insertion sort for sorting of 5 candidates:

int ae1 = a[e1], ae3 = a[e3], ae5 = a[e5], ae2 = a[e2], ae4 = a[e4];

if (ae2 < a[e1]) {
     a[e2] = a[e1]; a[e1] = ae2;
}
if (ae3 < a[e2]) {
     if (ae3 < a[e1]) {
         a[e3] = a[e2]; a[e2] = a[e1]; a[e1] = ae3;
     } else {
         a[e3] = a[e2]; a[e2] = ae3;
     }
}
if (ae4 < a[e2]) {
     if (ae4 < a[e1]) {
         a[e4] = a[e3]; a[e3] = a[e2]; a[e2] = a[e1]; a[e1] = ae4;
     } else {
         a[e4] = a[e3]; a[e3] = a[e2]; a[e2] = ae4;
     }
} else {
     if (ae4 < a[e3]) {
         a[e4] = a[e3]; a[e3] = ae4;
     }
}
if (ae5 < a[e2]) {
     if (ae5 < a[e1]) {
         a[e5]=a[e4];a[e4]=a[e3];a[e3]=a[e2];a[e2]=a[e1];a[e1]=ae5;
     } else {
         a[e5] = a[e4]; a[e4] = a[e3]; a[e3] = a[e2]; a[e2] = ae5;
     }
} else {
     if (ae5 < a[e3]) {
         a[e5] = a[e4]; a[e4] = a[e3]; a[e3] = ae5;
     }
     else {
         if (ae5 < a[e4]) {
             a[e5] = a[e4]; a[e4] = ae5;
         }
     }
}

it shows better results than bubble sort:
client - 0.37%, server - 1.03%

Which sorting algorithm we should use?

1. Network - compact, 9 lines of code
2. Bubble - also compact, 10 lines of code
3. Insertion - faster, but 39 lines of code

Regards,
Vladimir

Dmytro Sheyko wrote:
> Corrections.
> 1. The sixth comparator in current network (that swaps with 90% 
> probability) is [2 3] not [0 3].
> 2. The average number of swaps are
> 576/120 = 4.8 for current network and
> 376/120 = 3.1(3...) for proposed network.
> 
> ------------------------------------------------------------------------
> From: dmytro_sheyko at hotmail.com
> To: iaroslavski at mail.ru; joshua.bloch at google.com
> Subject: RE: New portion of improvements for Dual-Pivot Quicksort
> Date: Tue, 1 Jun 2010 14:59:07 +0700
> CC: core-libs-dev at openjdk.java.net
> 
> Hi Vladimir,
> 
> I cannot write 7-comparison sort briefly as well. However I think we can 
> optimize sorting network a bit.
> We can consider such properties as
> 1. size, i.e. number of comparators
> 2. delay, i.e. number of parallel steps (some comparisons and swaps can 
> be done concurrently)
> 3. average number of swaps.
> 
> The current sorting network
> [0 1][3 4] | [0 2] | [1 2][0 3] | [2 3][1 4] | [1 2][3 4]
> has size 9 and delay 5.
> These values are good enough. Generally we cannot write it shorter (in 
> order to improve size) and combine more than 2 comparison in a parallel 
> step (in order to improve delay).
> 
> However the average number of swaps seems too high. It performs 576 
> swaps per 1080 comparisons (1080 = 9 {size} * 120 {5!})
> I tried every permutation of 5 distinct values.
> There is a profile per comparator:
> (60/120=50%)(60/120=50%)(40/120=33%)(80/120=66%)(48/120=40%)(108/120=90%)(36/120=30%)(72/120=60%)(72/120=60%)[576/1080=53%]
> The first two comparison offer probability of swap as 50%, which is 
> natural. However the sixth comparator (i.e. [0 3]) swaps with 
> probability 90%, i.e. almost always!
> 
> The one of the best sorting network (I have found) with the same size 
> and delay is following
> [0 4] | [0 2][1 4] | [1 3][2 4] | [0 1][2 3] | [1 2][3 4]
> Its average number of swaps is 376.
> And here is a profile:
> (60/120=50%)(40/120=33%)(40/120=33%)(50/120=41%)(30/120=25%)(48/120=40%)(48/120=40%)(36/120=30%)(24/120=20%)[376/1080=34%]
> 
> Regards,
> Dmytro Sheyko
> 
>  > Date: Wed, 26 May 2010 16:12:17 +0400
>  > From: iaroslavski at mail.ru
>  > Subject: Re: New portion of improvements for Dual-Pivot Quicksort
>  > To: dmytro_sheyko at hotmail.com; Joshua.Bloch at google.com
>  > CC: core-libs-dev at openjdk.java.net
>  >
>  > Hello Dmytro,
>  >
>  > Theoretical investigations are based on simple model,
>  > which doesn't take into account many optimizations
>  > like "equal pivots", "scan equal to pivots", etc.
>  > Model based on the implementation will be too complex
>  > to be analyzed.
>  >
>  > But it is easy to count comparisons and assignments,
>  > if I change them by functions with counter.
>  >
>  > Also I tried to write 7-comparison sort, but the code
>  > was too long (a lot of inner if-then-else) instead of
>  > 9 compact lines. Do you have suggestion how to implement
>  > a nice 7-comparison sort? I tried also selection and bubble
>  > sorts, but insertion sort shows better time.
>  >
>  > Josh,
>  > Could you please review the last changes (especially javadoc
>  > and comments)?
>  >
>  > Thank you,
>  > Vladimir
>  >
>  > Dmytro Sheyko wrote:
>  > > Hi Vladimir,
>  > >
>  > > As for me, everything seems good.
>  > >
>  > > Returning to the theoretical background, could you estimate number of
>  > > comparison and assignments? These should be less than in your initial
>  > > version.
>  > >
>  > > Also have you considered 7-comparison sort for sorting 5 pivot
>  > > candidates instead of 9-comparison sorting network?
>  > >
>  > > Thank you,
>  > > Dmytro Sheyko
>  > >
>  > > > Date: Tue, 25 May 2010 10:42:51 +0400
>  > > > From: iaroslavski at mail.ru
>  > > > Subject: Re: New portion of improvements for Dual-Pivot Quicksort
>  > > > To: dmytro_sheyko at hotmail.com
>  > > > CC: core-libs-dev at openjdk.java.net
>  > > >
>  > > > I added more comments, please, review attached version.
>  > > >
>  > > > >> So, we win 2-3% !
>  > > >
>  > > > On [partial] sorted inputs new version runs more faster than few
>  > > percents:
>  > > >
>  > > > organ pipes
>  > > > this: 6896
>  > > > prev: 7424
>  > > > jdk7: 8018
>  > > > jdk6: 12502
>  > > >
>  > > > ascendant
>  > > > this: 2877
>  > > > prev: 3845
>  > > > jdk7: 4583
>  > > > jdk6: 9019
>  > > >
>  > > > descendant
>  > > > this: 3287
>  > > > prev: 4110
>  > > > jdk7: 4897
>  > > > jdk6: 9132
>  > > >
>  > > > Dmytro Sheyko wrote:
>  > > > > That's great! Thank you.
>  > > > >
>  > > > > > Date: Fri, 21 May 2010 18:38:51 +0400
>  > > > > > From: iaroslavski at mail.ru
>  > > > > > Subject: Re: New portion of improvements for Dual-Pivot Quicksort
>  > > > > > To: dmytro_sheyko at hotmail.com
>  > > > > > CC: core-libs-dev at openjdk.java.net
>  > > > > >
>  > > > > > Hello,
>  > > > > >
>  > > > > > I prepared version with your changes "Skip the last negative
>  > > > > > value (if any) or all leading negative" and with my optimization
>  > > > > > for all types. I added two while loops before partitioning to
>  > > > > > skip elements, less than pivot1 and greater than pivot2:
>  > > > > >
>  > > > > > if (pivot1 != pivot2) {
>  > > > > > /* ... */
>  > > > > > a[e2] = a[less];
>  > > > > > a[e4] = a[great];
>  > > > > >
>  > > > > > ++ while (a[++less] < pivot1);
>  > > > > > ++ while (a[--great] > pivot2);
>  > > > > >
>  > > > > > /* ... */
>  > > > > > outer:
>  > > > > > for (int k = less; k <= great; k++) {
>  > > > > > ...
>  > > > > > }
>  > > > > >
>  > > > > > Here is benchmark result (in compare with quicksort from JDK 6):
>  > > > > >
>  > > > > > client server
>  > > > > > ------ ------
>  > > > > > previous version: 60.70% 48.20%
>  > > > > > current version: 57.22% 46.18%
>  > > > > >
>  > > > > > So, we win 2-3% !
>  > > > > >
>  > > > > > Thank you,
>  > > > > > Vladimir
>  > > > > >
>  > > > > > Dmytro Sheyko wrote:
>  > > > > > > Hi Vladimir,
>  > > > > > >
>  > > > > > > I tried to figure out why the testcase failed on my
>  > > modification. It
>  > > > > > > appeared that number of negative zeros were changed during 
> general
>  > > > > sort.
>  > > > > > > As I can see you already fixed this issue. Well, my
>  > > modification was
>  > > > > > > based on assumption that we can speed up eliminating 
> explicit array
>  > > > > > > range checks.
>  > > > > > > However, such assumption is wrong because Hotspot anyway emits
>  > > range
>  > > > > > > checks at its discretion and therefore processZeros generally
>  > > does not
>  > > > > > > work as fast as I expected.
>  > > > > > > So complications I made are not worth doing.
>  > > > > > >
>  > > > > > > As for the latest code you posted. Doesn't it make sense to 
> skip
>  > > > > leading
>  > > > > > > negative zeros before farther processing? In this case we avoid
>  > > > > > > unnecessary assigning +0.0 and then -0.0 to the same 
> location a[k]
>  > > > > (i.e.
>  > > > > > > where k == p).
>  > > > > > >
>  > > > > > > /*
>  > > > > > > * Skip the last negative value (if any) or all leading negative
>  > > > > > > zeros
>  > > > > > > */
>  > > > > > > while (left <= right && Double.doubleToRawLongBits(a[left]) 
> < 0) {
>  > > > > > > left++;
>  > > > > > > }
>  > > > > > >
>  > > > > > > for (int k = left + 1, p = left; k <= right; k++) {
>  > > > > > > double ak = a[k];
>  > > > > > > if (ak != 0.0d) {
>  > > > > > > return;
>  > > > > > > }
>  > > > > > > if (Double.doubleToRawLongBits(ak) < 0) { // ak is -0.0d
>  > > > > > > a[k] = 0.0d;
>  > > > > > > a[p++] = -0.0d;
>  > > > > > > }
>  > > > > > > }
>  > > > > > >
>  > > > > > > Thank you,
>  > > > > > > Dmytro Sheyko
>  > > > > > >
>  > > > > > >
>  > > > > > > > Date: Wed, 19 May 2010 14:41:32 +0400
>  > > > > > > > From: iaroslavski at mail.ru
>  > > > > > > > Subject: Re: New portion of improvements for Dual-Pivot 
> Quicksort
>  > > > > > > > To: dmytro_sheyko at hotmail.com
>  > > > > > > > CC: core-libs-dev at openjdk.java.net
>  > > > > > > >
>  > > > > > > > resend the class with correct constructor
>  > > > > > > >
>  > > > > > > > Vladimir Iaroslavski wrote:
>  > > > > > > > > Dmytro,
>  > > > > > > > >
>  > > > > > > > > Thank you for comments, I updated double method, did 
> little bit
>  > > > > > > > > javadoc changes and replaced in char/short/byte methods
>  > > > > > > > > "fromIndex -> left", "toIndex-1 -> right", the code became
>  > > > > > > > > consistent with main sort method and more compact. Also 
> I use
>  > > > > > > > > more usual "i--" and "i++" in for loops (instead of "--i",
>  > > "++i.
>  > > > > > > > >
>  > > > > > > > > To accent the difference between float/double and other 
> types,
>  > > > > > > > > I put comment where it is important:
>  > > > > > > > >
>  > > > > > > > > /*
>  > > > > > > > > * In spite of a[great] == pivot1, the assignment
>  > > > > > > > > * a[less++] = pivot1 may be incorrect, if a[great]
>  > > > > > > > > * and pivot1 are floating-point zeros of different
>  > > > > > > > > * signs, therefore in float/double methods we have
>  > > > > > > > > * to use more accurate assignment a[k] = a[great].
>  > > > > > > > > */
>  > > > > > > > > a[less++] = pivot1;
>  > > > > > > > >
>  > > > > > > > > and for double/float:
>  > > > > > > > >
>  > > > > > > > > /*
>  > > > > > > > > .....
>  > > > > > > > > */
>  > > > > > > > > a[k] = a[great];
>  > > > > > > > >
>  > > > > > > > > See updated version in attachment.
>  > > > > > > > >
>  > > > > > > > > Thank you,
>  > > > > > > > > Vladimir
>  > > > > > > > >
>  > > > > > > > > Dmytro Sheyko wrote:
>  > > > > > > > >> Vladimir,
>  > > > > > > > >>
>  > > > > > > > >> I can see that you changed 
> sortNegZeroAndNaN(float[]...) but
>  > > > > probably
>  > > > > > > > >> forgot to change sortNegZeroAndNaN(double[]...).
>  > > > > > > > >>
>  > > > > > > > >> You really puzzled me with failed testcase and note that
>  > > sorting
>  > > > > > > > >> algorithm (without special attention to zeros) 
> generally may
>  > > > > change
>  > > > > > > > >> number of negative zeros.
>  > > > > > > > >> I will provide my comments later.
>  > > > > > > > >>
>  > > > > > > > >> As for counting sort, I think we should use single format
>  > > > > style over
>  > > > > > > > >> the file (unless we have valuable reason not to do 
> this). I
>  > > > > mean to
>  > > > > > > > >> choose
>  > > > > > > > >> 1)
>  > > > > > > > >> if (toIndex - fromIndex >
>  > > > > > > > >> COUNTING_SORT_THRESHOLD_FOR_SHORT_OR_CHAR) {
>  > > > > > > > >> countingSort(a, fromIndex, toIndex);
>  > > > > > > > >> return;
>  > > > > > > > >> }
>  > > > > > > > >> sort(a, fromIndex, toIndex - 1, true);
>  > > > > > > > >> 2)
>  > > > > > > > >> if (toIndex - fromIndex >
>  > > > > > > > >> COUNTING_SORT_THRESHOLD_FOR_SHORT_OR_CHAR) {
>  > > > > > > > >> countingSort(a, fromIndex, toIndex);
>  > > > > > > > >> } else {
>  > > > > > > > >> sort(a, fromIndex, toIndex - 1, true);
>  > > > > > > > >> }
>  > > > > > > > >> I prefer the second one.
>  > > > > > > > >>
>  > > > > > > > >> Thanks a lot,
>  > > > > > > > >> Dmytro Sheyko
>  > > > > > > > >>
>  > > > > > > > >> > Date: Tue, 18 May 2010 18:57:50 +0400
>  > > > > > > > >> > From: iaroslavski at mail.ru
>  > > > > > > > >> > Subject: Re: New portion of improvements for Dual-Pivot
>  > > > > Quicksort
>  > > > > > > > >> > To: dmytro_sheyko at hotmail.com
>  > > > > > > > >> > CC: core-libs-dev at openjdk.java.net
>  > > > > > > > >> >
>  > > > > > > > >> > Hello,
>  > > > > > > > >> >
>  > > > > > > > >> > I've run your modification for counting sort, it real
>  > > faster.
>  > > > > > > > >> > I attached new version with your changes (I did 
> little bit
>  > > > > > > > >> > format it) and included my case with float/double.
>  > > > > > > > >> >
>  > > > > > > > >> > Note that you modification doesn't pass test from
>  > > Sorting class,
>  > > > > > > > >> > which I sent earlier. It fails on float/double test:
>  > > > > > > > >> >
>  > > > > > > > >> > Test #3: random = 666, len = 34, a = 0, g = 6, z = 
> 9, n =
>  > > > > 10, p = 9
>  > > > > > > > >> >
>  > > > > > > > >> > I suggest shorter method (which is based on your 
> idea to
>  > > skip
>  > > > > > > counting
>  > > > > > > > >> > negative zeros on Phase 1.): I found find first zero
>  > > index (or
>  > > > > > > it will
>  > > > > > > > >> > be index of first positive element if no zeros at all,
>  > > or last
>  > > > > > > > >> negative,
>  > > > > > > > >> > if no positive and zero elements) and then swap negative
>  > > > > zero to the
>  > > > > > > > >> > beginning of the sub-range.
>  > > > > > > > >> >
>  > > > > > > > >> > int hi = right;
>  > > > > > > > >> >
>  > > > > > > > >> > while (left < hi) {
>  > > > > > > > >> > int middle = (left + hi) >>> 1;
>  > > > > > > > >> > float middleValue = a[middle];
>  > > > > > > > >> >
>  > > > > > > > >> > if (middleValue < 0.0f) {
>  > > > > > > > >> > left = middle + 1;
>  > > > > > > > >> > } else {
>  > > > > > > > >> > hi = middle;
>  > > > > > > > >> > }
>  > > > > > > > >> > }
>  > > > > > > > >> >
>  > > > > > > > >> > for (int k = left, p = left; k <= right; k++) {
>  > > > > > > > >> > float ak = a[k];
>  > > > > > > > >> > if (ak != 0.0f) {
>  > > > > > > > >> > return;
>  > > > > > > > >> > }
>  > > > > > > > >> > if (Float.floatToRawIntBits(ak) < 0) { // ak is -0.0f
>  > > > > > > > >> > a[k] = +0.0f;
>  > > > > > > > >> > a[p++] = -0.0f;
>  > > > > > > > >> > }
>  > > > > > > > >> > }
>  > > > > > > > >> >
>  > > > > > > > >> > Important note: in partitioning loop there are several
>  > > places
>  > > > > > > > >> > (marked by // !) where potential bug with -0.0 could be
>  > > > > > > > >> > (when pivot and a[great] are zeros with different 
> signs):
>  > > > > > > > >> >
>  > > > > > > > >> > if (a[great] == pivot1) {
>  > > > > > > > >> > a[k] = a[less];
>  > > > > > > > >> > - a[less++] = pivot1; // !
>  > > > > > > > >> > + a[less++] = a[great];
>  > > > > > > > >> > } else { // pivot1 < a[great] < pivot2
>  > > > > > > > >> > a[k] = a[great];
>  > > > > > > > >> > }
>  > > > > > > > >> > - a[great--] = pivot2; // !
>  > > > > > > > >> > + a[great--] = ak;
>  > > > > > > > >> > } else if (ak == pivot1) { // Move a[k] to left part
>  > > > > > > > >> > a[k] = a[less];
>  > > > > > > > >> > - a[less++] = pivot1; // !
>  > > > > > > > >> > + a[less++] = ak;
>  > > > > > > > >> > }
>  > > > > > > > >> >
>  > > > > > > > >> > and the same in "Pivots are equal" branch.
>  > > > > > > > >> >
>  > > > > > > > >> > I did changes "pivot1/2 -> ak" in methods for all types
>  > > > > > > > >> > and "pivot1 -> a[great]" in float/double sections only.
>  > > > > > > > >> >
>  > > > > > > > >> > Please, review format changes for counting sort and new
>  > > version
>  > > > > > > > >> > of Phase 3 for float/double.
>  > > > > > > > >> >
>  > > > > > > > >> > Thank you,
>  > > > > > > > >> > Vladimir
>  > > > > > > > >> >
>  > > > > > > > >> > Dmytro Sheyko wrote:
>  > > > > > > > >> > > Hi,
>  > > > > > > > >> > >
>  > > > > > > > >> > > About counting sort again.
>  > > > > > > > >> > >
>  > > > > > > > >> > > 1. This condition "i < count.length && k <= right" is
>  > > > > excessive.
>  > > > > > > > >> Any one
>  > > > > > > > >> > > conjunct is enough. "k <= right" seems better.
>  > > > > > > > >> > > 2. No need to calculate "short value = (short) (i +
>  > > > > > > > >> Short.MIN_VALUE)"
>  > > > > > > > >> > > when "count[i]" is zero.
>  > > > > > > > >> > > 3. For signed primitives (byte and short) we would
>  > > better loop
>  > > > > > > > >> backward.
>  > > > > > > > >> > > Thanks to "k >= fromIndex" condition we will quit 
> looping
>  > > > > earlier
>  > > > > > > > >> > > assuming that typically we work with positive numbers.
>  > > > > > > > >> > > For unsigned primitives (char) we would better loop
>  > > forward
>  > > > > > > because
>  > > > > > > > >> > > typically we work with characters about zero (ASCII).
>  > > > > > > > >> > >
>  > > > > > > > >> > > - for (int i = 0, k = left; i < count.length && k <=
>  > > > > right; i++) {
>  > > > > > > > >> > > - short value = (short) (i + Short.MIN_VALUE);
>  > > > > > > > >> > > - for (int s = count[i]; s > 0; s--) {
>  > > > > > > > >> > > - a[k++] = value;
>  > > > > > > > >> > > - }
>  > > > > > > > >> > > - }
>  > > > > > > > >> > >
>  > > > > > > > >> > > + for (int i = NUM_SHORT_VALUES - 1, k = toIndex - 
> 1; k >=
>  > > > > > > > >> > > fromIndex; --i) {
>  > > > > > > > >> > > + while (count[i] == 0) --i;
>  > > > > > > > >> > > + short value = (short) (i + Short.MIN_VALUE);
>  > > > > > > > >> > > + int s = count[i];
>  > > > > > > > >> > > + do { a[k--] = value; } while (--s > 0);
>  > > > > > > > >> > > + }
>  > > > > > > > >> > >
>  > > > > > > > >> > > Thanks,
>  > > > > > > > >> > > Dmytro Sheyko
>  > > > > > > > >> > >
>  > > > > > > > >> > > > From: iaroslavski at mail.ru
>  > > > > > > > >> > > > To: dmytro_sheyko at hotmail.com
>  > > > > > > > >> > > > CC: core-libs-dev at openjdk.java.net; 
> iaroslavski at mail.ru
>  > > > > > > > >> > > > Subject: Re[2]: New portion of improvements for
>  > > Dual-Pivot
>  > > > > > > > >> Quicksort
>  > > > > > > > >> > > > Date: Tue, 18 May 2010 01:11:19 +0400
>  > > > > > > > >> > > >
>  > > > > > > > >> > > > Sounds good!
>  > > > > > > > >> > > > Will consider too...
>  > > > > > > > >> > > >
>  > > > > > > > >> > > > Mon, 17 May 2010 22:24:11 +0700 письмо от Dmytro 
> Sheyko
>  > > > > > > > >> > > <dmytro_sheyko at hotmail.com>:
>  > > > > > > > >> > > >
>  > > > > > > > >> > > > > Hi,
>  > > > > > > > >> > > > >
>  > > > > > > > >> > > > > Regarding counting sort. We can check whether we
>  > > should
>  > > > > > > > >> switch to
>  > > > > > > > >> > > counting sort only once in the beginning.
>  > > > > > > > >> > > > >
>  > > > > > > > >> > > > > > Date: Mon, 17 May 2010 17:30:37 +0400
>  > > > > > > > >> > > > > > From: iaroslavski at mail.ru
>  > > > > > > > >> > > > > > Subject: Re: New portion of improvements for
>  > > Dual-Pivot
>  > > > > > > > >> Quicksort
>  > > > > > > > >> > > > > > To: dmytro_sheyko at hotmail.com
>  > > > > > > > >> > > > > > CC: core-libs-dev at openjdk.java.net
>  > > > > > > > >> > > > > >
>  > > > > > > > >> > > > > > Hello,
>  > > > > > > > >> > > > > >
>  > > > > > > > >> > > > > > Thank you for review, I'll check and run 
> tests again
>  > > > > > > with you
>  > > > > > > > >> > > changes.
>  > > > > > > > >> > > > > >
>  > > > > > > > >> > > > > > Thank you,
>  > > > > > > > >> > > > > > Vladimir
>  > > > > > > > >> > > > > >
>  > > > > > > > >> > > > > > Dmytro Sheyko wrote:
>  > > > > > > > >> > > > > > > Hello,
>  > > > > > > > >> > > > > > >
>  > > > > > > > >> > > > > > > More ideas.
>  > > > > > > > >> > > > > > >
>  > > > > > > > >> > > > > > > 1. We can use
>  > > > > > > > >> > > > > > > Double.doubleToRawLongBits instead of
>  > > > > > > > >> Double.doubleToLongBits and
>  > > > > > > > >> > > > > > > Float.floatToRawIntBits instead of
>  > > > > Float.floatToIntBits.
>  > > > > > > > >> > > > > > > No need to handle NaN's because they all are
>  > > placed to
>  > > > > > > > >> the end
>  > > > > > > > >> > > of array.
>  > > > > > > > >> > > > > > >
>  > > > > > > > >> > > > > > > 2. Note that
>  > > > > > > > >> > > > > > > Double.doubleToRawLongBits(+0.0) == 0L and
>  > > > > > > > >> > > > > > > Double.doubleToRawLongBits(-0.0) ==
>  > > Long.MIN_VALUE and
>  > > > > > > > >> > > > > > > Float.floatToRawIntBits(+0.0) == 0 and
>  > > > > > > > >> > > > > > > Float.floatToRawIntBits(-0.0) ==
>  > > Integer.MIN_VALUE.
>  > > > > > > > >> > > > > > >
>  > > > > > > > >> > > > > > > Comparing with is zero usually more efficient
>  > > (or at
>  > > > > > > > >> least not
>  > > > > > > > >> > > worse)
>  > > > > > > > >> > > > > > > than with other values. Thus such pattern
>  > > > > > > > >> > > > > > >
>  > > > > > > > >> > > > > > > if (ak == 0.0f && NEGATIVE_ZERO ==
>  > > > > > > Float.floatToIntBits(ak))
>  > > > > > > > >> > > > > > >
>  > > > > > > > >> > > > > > > can be replaced with
>  > > > > > > > >> > > > > > >
>  > > > > > > > >> > > > > > > if (ak == 0.0f && Float.floatToIntBits(ak) 
> < 0)
>  > > > > > > > >> > > > > > >
>  > > > > > > > >> > > > > > > 3. It would be more efficient to count
>  > > negative zeros
>  > > > > > > after
>  > > > > > > > >> > > sorting.
>  > > > > > > > >> > > > > > > General sorting algorithm puts both 
> negative and
>  > > > > positive
>  > > > > > > > >> zeros
>  > > > > > > > >> > > together
>  > > > > > > > >> > > > > > > (but maybe not in right order).
>  > > > > > > > >> > > > > > > Therefore we have to process less elements 
> because
>  > > > > > > > >> usually we
>  > > > > > > > >> > > have less
>  > > > > > > > >> > > > > > > zeros than other numbers.
>  > > > > > > > >> > > > > > >
>  > > > > > > > >> > > > > > > Thanks,
>  > > > > > > > >> > > > > > > Dmytro Sheyko
>  > > > > > > > >> > > > > > >
>  > > > > > > > >> > > > > > > > From: iaroslavski at mail.ru
>  > > > > > > > >> > > > > > > > To: dmytro_sheyko at hotmail.com; 
> jjb at google.com
>  > > > > > > > >> > > > > > > > CC: core-libs-dev at openjdk.java.net;
>  > > > > iaroslavski at mail.ru
>  > > > > > > > >> > > > > > > > Subject: Re[6]: New portion of 
> improvements for
>  > > > > > > Dual-Pivot
>  > > > > > > > >> > > Quicksort
>  > > > > > > > >> > > > > > > > Date: Fri, 14 May 2010 23:54:06 +0400
>  > > > > > > > >> > > > > > > >
>  > > > > > > > >> > > > > > > > Hello,
>  > > > > > > > >> > > > > > > >
>  > > > > > > > >> > > > > > > > I've updated the class, please, review the
>  > > changes.
>  > > > > > > > >> > > > > > > >
>  > > > > > > > >> > > > > > > > Vladimir
>  > > > > > > > >> > > > > > > >
>  > > > > > > > >> > > > > > > > Fri, 14 May 2010 01:48:11 +0700 письмо от
>  > > Dmytro
>  > > > > Sheyko
>  > > > > > > > >> > > > > > > <dmytro_sheyko at hotmail..com>:
>  > > > > > > > >> > > > > > > >
>  > > > > > > > >> > > > > > > > > Yes. I prefer F (Find First zero using 
> binary
>  > > > > search)
>  > > > > > > > >> over
>  > > > > > > > >> > > C (Count
>  > > > > > > > >> > > > > > > negatives) and S (Smart Scan for zero).
>  > > > > > > > >> > > > > > > > >
>  > > > > > > > >> > > > > > > > > > From: iaroslavski at mail.ru
>  > > > > > > > >> > > > > > > > > > To: dmytro_sheyko at hotmail.com
>  > > > > > > > >> > > > > > > > > > CC: jjb at google.com;
>  > > > > core-libs-dev at openjdk.java.net;
>  > > > > > > > >> > > > > > > iaroslavski at mail.ru
>  > > > > > > > >> > > > > > > > > > Subject: Re[4]: New portion of
>  > > improvements for
>  > > > > > > > >> > > Dual-Pivot Quicksort
>  > > > > > > > >> > > > > > > > > > Date: Thu, 13 May 2010 21:34:54 +0400
>  > > > > > > > >> > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > Dmytro,
>  > > > > > > > >> > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > I've tested your suggested variants, and
>  > > > > found that
>  > > > > > > > >> case "C"
>  > > > > > > > >> > > > > > > > > > (very interesting approach to find first
>  > > > > position
>  > > > > > > > >> of zero
>  > > > > > > > >> > > > > > > > > > by counting negative elements) works
>  > > slower than
>  > > > > > > > >> original
>  > > > > > > > >> > > > > > > > > > or two other cases.
>  > > > > > > > >> > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > Implementations "F" and "S" are very 
> close
>  > > > > to each
>  > > > > > > > >> other
>  > > > > > > > >> > > > > > > > > > and little bit faster than original. I
>  > > > > prefer case
>  > > > > > > > >> "F":
>  > > > > > > > >> > > > > > > > > > it is shorter and more clear. Do you 
> agree?
>  > > > > > > > >> > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > I'll prepare updated DualPivotQuicksort
>  > > file and
>  > > > > > > > >> send it
>  > > > > > > > >> > > > > > > > > > tomorrow.
>  > > > > > > > >> > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > Thank you,
>  > > > > > > > >> > > > > > > > > > Vladimir
>  > > > > > > > >> > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > Wed, 12 May 2010 17:04:52 +0700 письмо
>  > > от Dmytro
>  > > > > > > > >> Sheyko
>  > > > > > > > >> > > > > > > <dmytro_sheyko at hotmail.com>:
>  > > > > > > > >> > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > Vladimir,
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > Your changes are good for me.
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > Additionally I have some
>  > > comments/proposals
>  > > > > > > > >> regarding
>  > > > > > > > >> > > dealing
>  > > > > > > > >> > > > > > > with negative zeros.
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > 1. Scanning for the first zero we can
>  > > > > avoid range
>  > > > > > > > >> check
>  > > > > > > > >> > > (i >=
>  > > > > > > > >> > > > > > > left) if we have at least one negative value.
>  > > > > > > > >> > > > > > > > > > > --- DualPivotQuicksort.java Tue May 11
>  > > > > > > 09:04:19 2010
>  > > > > > > > >> > > > > > > > > > > +++ DualPivotQuicksortS.java Wed 
> May 12
>  > > > > 12:10:46
>  > > > > > > > >> 2010
>  > > > > > > > >> > > > > > > > > > > @@ -1705,10 +1705,15 @@
>  > > > > > > > >> > > > > > > > > > > }
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > // Find first zero element
>  > > > > > > > >> > > > > > > > > > > - int zeroIndex = findAnyZero(a, 
> left, n);
>  > > > > > > > >> > > > > > > > > > > + int zeroIndex = 0;
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > - for (int i = zeroIndex - 1; i >=
>  > > left &&
>  > > > > a[i] ==
>  > > > > > > > >> > > 0.0f; i--) {
>  > > > > > > > >> > > > > > > > > > > - zeroIndex = i;
>  > > > > > > > >> > > > > > > > > > > + if (a[left] < 0.0f) {
>  > > > > > > > >> > > > > > > > > > > + zeroIndex = findAnyZero(a, left, n);
>  > > > > > > > >> > > > > > > > > > > +
>  > > > > > > > >> > > > > > > > > > > + // there is at least one negative
>  > > value, so
>  > > > > > > range
>  > > > > > > > >> > > check is
>  > > > > > > > >> > > > > > > not needed
>  > > > > > > > >> > > > > > > > > > > + for (int i = zeroIndex - 1; /*i >=
>  > > left &&*/
>  > > > > > > > >> a[i] ==
>  > > > > > > > >> > > 0.0f; i--) {
>  > > > > > > > >> > > > > > > > > > > + zeroIndex = i;
>  > > > > > > > >> > > > > > > > > > > + }
>  > > > > > > > >> > > > > > > > > > > }
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > // Turn the right number of 
> positive zeros
>  > > > > > > back into
>  > > > > > > > >> > > negative zeros
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > 2. We can find the position of the 
> first
>  > > > > zero by
>  > > > > > > > >> counting
>  > > > > > > > >> > > > > > > negative values during preprocessing phase.
>  > > > > > > > >> > > > > > > > > > > --- DualPivotQuicksort.java Tue May 11
>  > > > > > > 09:04:19 2010
>  > > > > > > > >> > > > > > > > > > > +++ DualPivotQuicksortC.java Wed 
> May 12
>  > > > > 12:01:24
>  > > > > > > > >> 2010
>  > > > > > > > >> > > > > > > > > > > @@ -1678,7 +1678,7 @@
>  > > > > > > > >> > > > > > > > > > > * Phase 1: Count negative zeros 
> and move
>  > > > > NaNs to
>  > > > > > > > >> end of
>  > > > > > > > >> > > array.
>  > > > > > > > >> > > > > > > > > > > */
>  > > > > > > > >> > > > > > > > > > > final int NEGATIVE_ZERO =
>  > > > > > > > >> Float.floatToIntBits(-0.0f);
>  > > > > > > > >> > > > > > > > > > > - int numNegativeZeros = 0;
>  > > > > > > > >> > > > > > > > > > > + int numNegativeZeros = 0,
>  > > > > numNegativeValues = 0;
>  > > > > > > > >> > > > > > > > > > > int n = right;
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > for (int k = left; k <= n; k++) {
>  > > > > > > > >> > > > > > > > > > > @@ -1689,6 +1689,8 @@
>  > > > > > > > >> > > > > > > > > > > } else if (ak != ak) { // i.e., ak 
> is NaN
>  > > > > > > > >> > > > > > > > > > > a[k--] = a[n];
>  > > > > > > > >> > > > > > > > > > > a[n--] = Float.NaN;
>  > > > > > > > >> > > > > > > > > > > + } else if (ak < 0.0f) {
>  > > > > > > > >> > > > > > > > > > > + numNegativeValues++;
>  > > > > > > > >> > > > > > > > > > > }
>  > > > > > > > >> > > > > > > > > > > }
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > @@ -1705,7 +1707,7 @@
>  > > > > > > > >> > > > > > > > > > > }
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > // Find first zero element
>  > > > > > > > >> > > > > > > > > > > - int zeroIndex = findAnyZero(a, 
> left, n);
>  > > > > > > > >> > > > > > > > > > > + int zeroIndex = numNegativeValues;
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > for (int i = zeroIndex - 1; i >= 
> left &&
>  > > > > a[i] ==
>  > > > > > > > >> 0.0f;
>  > > > > > > > >> > > i--) {
>  > > > > > > > >> > > > > > > > > > > zeroIndex = i;
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > 3. We can use binary search to find
>  > > the first
>  > > > > > > > >> zero and
>  > > > > > > > >> > > thus
>  > > > > > > > >> > > > > > > avoid linear scan.
>  > > > > > > > >> > > > > > > > > > > --- DualPivotQuicksort.java Tue May 11
>  > > > > > > 09:04:19 2010
>  > > > > > > > >> > > > > > > > > > > +++ DualPivotQuicksortF.java Wed 
> May 12
>  > > > > 12:03:58
>  > > > > > > > >> 2010
>  > > > > > > > >> > > > > > > > > > > @@ -1705,11 +1705,7 @@
>  > > > > > > > >> > > > > > > > > > > }
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > // Find first zero element
>  > > > > > > > >> > > > > > > > > > > - int zeroIndex = findAnyZero(a, 
> left, n);
>  > > > > > > > >> > > > > > > > > > > -
>  > > > > > > > >> > > > > > > > > > > - for (int i = zeroIndex - 1; i >=
>  > > left &&
>  > > > > a[i] ==
>  > > > > > > > >> > > 0.0f; i--) {
>  > > > > > > > >> > > > > > > > > > > - zeroIndex = i;
>  > > > > > > > >> > > > > > > > > > > - }
>  > > > > > > > >> > > > > > > > > > > + int zeroIndex = findFirstZero(a,
>  > > left, n);
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > // Turn the right number of 
> positive zeros
>  > > > > > > back into
>  > > > > > > > >> > > negative zeros
>  > > > > > > > >> > > > > > > > > > > for (int i = zeroIndex, m = 
> zeroIndex +
>  > > > > > > > >> > > numNegativeZeros; i <
>  > > > > > > > >> > > > > > > m; i++) {
>  > > > > > > > >> > > > > > > > > > > @@ -1718,7 +1714,7 @@
>  > > > > > > > >> > > > > > > > > > > }
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > /**
>  > > > > > > > >> > > > > > > > > > > - * Returns the index of some zero
>  > > element
>  > > > > in the
>  > > > > > > > >> > > specified
>  > > > > > > > >> > > > > > > range via
>  > > > > > > > >> > > > > > > > > > > + * Returns the index of the first 
> zero
>  > > > > element
>  > > > > > > > >> in the
>  > > > > > > > >> > > > > > > specified range via
>  > > > > > > > >> > > > > > > > > > > * binary search. The range is assumed
>  > > to be
>  > > > > > > > >> sorted, and
>  > > > > > > > >> > > must
>  > > > > > > > >> > > > > > > contain
>  > > > > > > > >> > > > > > > > > > > * at least one zero.
>  > > > > > > > >> > > > > > > > > > > *
>  > > > > > > > >> > > > > > > > > > > @@ -1726,18 +1722,17 @@
>  > > > > > > > >> > > > > > > > > > > * @param low the index of the first
>  > > element,
>  > > > > > > > >> inclusive,
>  > > > > > > > >> > > to be
>  > > > > > > > >> > > > > > > searched
>  > > > > > > > >> > > > > > > > > > > * @param high the index of the last
>  > > element,
>  > > > > > > > >> inclusive,
>  > > > > > > > >> > > to be
>  > > > > > > > >> > > > > > > searched
>  > > > > > > > >> > > > > > > > > > > */
>  > > > > > > > >> > > > > > > > > > > - private static int
>  > > findAnyZero(float[] a,
>  > > > > > > int low,
>  > > > > > > > >> > > int high) {
>  > > > > > > > >> > > > > > > > > > > - while (true) {
>  > > > > > > > >> > > > > > > > > > > + private static int
>  > > findFirstZero(float[]
>  > > > > a, int
>  > > > > > > > >> low,
>  > > > > > > > >> > > int high) {
>  > > > > > > > >> > > > > > > > > > > + while (low < high) {
>  > > > > > > > >> > > > > > > > > > > int middle = (low + high) >>> 1;
>  > > > > > > > >> > > > > > > > > > > float middleValue = a[middle];
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > if (middleValue < 0.0f) {
>  > > > > > > > >> > > > > > > > > > > low = middle + 1;
>  > > > > > > > >> > > > > > > > > > > - } else if (middleValue > 0.0f) {
>  > > > > > > > >> > > > > > > > > > > - high = middle - 1;
>  > > > > > > > >> > > > > > > > > > > - } else { // middleValue == 0.0f
>  > > > > > > > >> > > > > > > > > > > - return middle;
>  > > > > > > > >> > > > > > > > > > > + } else { // middleValue >= 0.0f
>  > > > > > > > >> > > > > > > > > > > + high = middle;
>  > > > > > > > >> > > > > > > > > > > }
>  > > > > > > > >> > > > > > > > > > > + return low;
>  > > > > > > > >> > > > > > > > > > > }
>  > > > > > > > >> > > > > > > > > > > }
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > Counting negative values appeared more
>  > > > > expensive
>  > > > > > > > >> than
>  > > > > > > > >> > > any other
>  > > > > > > > >> > > > > > > variants.
>  > > > > > > > >> > > > > > > > > > > The last proposal seems to me as
>  > > efficient
>  > > > > as the
>  > > > > > > > >> current
>  > > > > > > > >> > > > > > > solution is in its worst case - when we have
>  > > only one
>  > > > > > > > >> negative
>  > > > > > > > >> > > zero (in
>  > > > > > > > >> > > > > > > the half of array).
>  > > > > > > > >> > > > > > > > > > > And it shows the best result if we
>  > > have many
>  > > > > > > zeros.
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > Regards,
>  > > > > > > > >> > > > > > > > > > > Dmytro Sheyko
>  > > > > > > > >> > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > > From: iaroslavski at mail.ru
>  > > > > > > > >> > > > > > > > > > > > To: jjb at google.com;
>  > > > > dmytro_sheyko at hotmail.com
>  > > > > > > > >> > > > > > > > > > > > CC: core-libs-dev at openjdk.java.net;
>  > > > > > > > >> iaroslavski at mail.ru
>  > > > > > > > >> > > > > > > > > > > > Subject: Re[2]: New portion of
>  > > > > improvements for
>  > > > > > > > >> > > Dual-Pivot
>  > > > > > > > >> > > > > > > Quicksort
>  > > > > > > > >> > > > > > > > > > > > Date: Sun, 9 May 2010 23:51:27 +0400
>  > > > > > > > >> > > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > > Josh,
>  > > > > > > > >> > > > > > > > > > > > Dmytro,
>  > > > > > > > >> > > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > > I have done more thoroughly testing
>  > > "great -
>  > > > > > > > >> less > 5 *
>  > > > > > > > >> > > > > > > seventh" vs. "less < e1 && great > e5",
>  > > > > > > > >> > > > > > > > > > > > and found that more symmetric code
>  > > "less
>  > > > > < e1 &&
>  > > > > > > > >> > > great > e5"
>  > > > > > > > >> > > > > > > is little bit faster, ~0.5..0.7%
>  > > > > > > > >> > > > > > > > > > > > on both VMs. Other code has not been
>  > > > > changed.
>  > > > > > > > >> > > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > > Please, take the latest version in
>  > > > > attachment.
>  > > > > > > > >> > > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > > Vladimir
>  > > > > > > > >> > > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > > Tue, 4 May 2010 21:57:42 -0700
>  > > письмо от
>  > > > > Joshua
>  > > > > > > > >> Bloch
>  > > > > > > > >> > > > > > > <jjb at google.com>:
>  > > > > > > > >> > > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > > > Vladimir,
>  > > > > > > > >> > > > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > > > Old:
>  > > > > > > > >> > > > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > > >298 if (less < e1 && great > e5) {
>  > > > > > > > >> > > > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > > > New:
>  > > > > > > > >> > > > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > > >256 if (great - less > 5 * 
> seventh) {
>  > > > > > > > >> > > > > > > > > > > >
>  > > > > > > > >> > > > > > > > > > > > >Regards,
>  > > > > > > > >> > > > > > > > > > > > >Josh