New portion of improvements for Dual-Pivot Quicksort
Vladimir Iaroslavski
iaroslavski at mail.ru
Wed May 19 10:41:32 UTC 2010
resend the class with correct constructor
Vladimir Iaroslavski wrote:
> Dmytro,
>
> Thank you for comments, I updated double method, did little bit
> javadoc changes and replaced in char/short/byte methods
> "fromIndex -> left", "toIndex-1 -> right", the code became
> consistent with main sort method and more compact. Also I use
> more usual "i--" and "i++" in for loops (instead of "--i", "++i.
>
> To accent the difference between float/double and other types,
> I put comment where it is important:
>
> /*
> * In spite of a[great] == pivot1, the assignment
> * a[less++] = pivot1 may be incorrect, if a[great]
> * and pivot1 are floating-point zeros of different
> * signs, therefore in float/double methods we have
> * to use more accurate assignment a[k] = a[great].
> */
> a[less++] = pivot1;
>
> and for double/float:
>
> /*
> .....
> */
> a[k] = a[great];
>
> See updated version in attachment.
>
> Thank you,
> Vladimir
>
> Dmytro Sheyko wrote:
>> Vladimir,
>>
>> I can see that you changed sortNegZeroAndNaN(float[]...) but probably
>> forgot to change sortNegZeroAndNaN(double[]...).
>>
>> You really puzzled me with failed testcase and note that sorting
>> algorithm (without special attention to zeros) generally may change
>> number of negative zeros.
>> I will provide my comments later.
>>
>> As for counting sort, I think we should use single format style over
>> the file (unless we have valuable reason not to do this). I mean to
>> choose
>> 1)
>> if (toIndex - fromIndex >
>> COUNTING_SORT_THRESHOLD_FOR_SHORT_OR_CHAR) {
>> countingSort(a, fromIndex, toIndex);
>> return;
>> }
>> sort(a, fromIndex, toIndex - 1, true);
>> 2)
>> if (toIndex - fromIndex >
>> COUNTING_SORT_THRESHOLD_FOR_SHORT_OR_CHAR) {
>> countingSort(a, fromIndex, toIndex);
>> } else {
>> sort(a, fromIndex, toIndex - 1, true);
>> }
>> I prefer the second one.
>>
>> Thanks a lot,
>> Dmytro Sheyko
>>
>> > Date: Tue, 18 May 2010 18:57:50 +0400
>> > From: iaroslavski at mail.ru
>> > Subject: Re: New portion of improvements for Dual-Pivot Quicksort
>> > To: dmytro_sheyko at hotmail.com
>> > CC: core-libs-dev at openjdk.java.net
>> >
>> > Hello,
>> >
>> > I've run your modification for counting sort, it real faster.
>> > I attached new version with your changes (I did little bit
>> > format it) and included my case with float/double.
>> >
>> > Note that you modification doesn't pass test from Sorting class,
>> > which I sent earlier. It fails on float/double test:
>> >
>> > Test #3: random = 666, len = 34, a = 0, g = 6, z = 9, n = 10, p = 9
>> >
>> > I suggest shorter method (which is based on your idea to skip counting
>> > negative zeros on Phase 1.): I found find first zero index (or it will
>> > be index of first positive element if no zeros at all, or last
>> negative,
>> > if no positive and zero elements) and then swap negative zero to the
>> > beginning of the sub-range.
>> >
>> > int hi = right;
>> >
>> > while (left < hi) {
>> > int middle = (left + hi) >>> 1;
>> > float middleValue = a[middle];
>> >
>> > if (middleValue < 0.0f) {
>> > left = middle + 1;
>> > } else {
>> > hi = middle;
>> > }
>> > }
>> >
>> > for (int k = left, p = left; k <= right; k++) {
>> > float ak = a[k];
>> > if (ak != 0.0f) {
>> > return;
>> > }
>> > if (Float.floatToRawIntBits(ak) < 0) { // ak is -0.0f
>> > a[k] = +0.0f;
>> > a[p++] = -0.0f;
>> > }
>> > }
>> >
>> > Important note: in partitioning loop there are several places
>> > (marked by // !) where potential bug with -0.0 could be
>> > (when pivot and a[great] are zeros with different signs):
>> >
>> > if (a[great] == pivot1) {
>> > a[k] = a[less];
>> > - a[less++] = pivot1; // !
>> > + a[less++] = a[great];
>> > } else { // pivot1 < a[great] < pivot2
>> > a[k] = a[great];
>> > }
>> > - a[great--] = pivot2; // !
>> > + a[great--] = ak;
>> > } else if (ak == pivot1) { // Move a[k] to left part
>> > a[k] = a[less];
>> > - a[less++] = pivot1; // !
>> > + a[less++] = ak;
>> > }
>> >
>> > and the same in "Pivots are equal" branch.
>> >
>> > I did changes "pivot1/2 -> ak" in methods for all types
>> > and "pivot1 -> a[great]" in float/double sections only.
>> >
>> > Please, review format changes for counting sort and new version
>> > of Phase 3 for float/double.
>> >
>> > Thank you,
>> > Vladimir
>> >
>> > Dmytro Sheyko wrote:
>> > > Hi,
>> > >
>> > > About counting sort again.
>> > >
>> > > 1. This condition "i < count.length && k <= right" is excessive.
>> Any one
>> > > conjunct is enough. "k <= right" seems better.
>> > > 2. No need to calculate "short value = (short) (i +
>> Short.MIN_VALUE)"
>> > > when "count[i]" is zero.
>> > > 3. For signed primitives (byte and short) we would better loop
>> backward.
>> > > Thanks to "k >= fromIndex" condition we will quit looping earlier
>> > > assuming that typically we work with positive numbers.
>> > > For unsigned primitives (char) we would better loop forward because
>> > > typically we work with characters about zero (ASCII).
>> > >
>> > > - for (int i = 0, k = left; i < count.length && k <= right; i++) {
>> > > - short value = (short) (i + Short.MIN_VALUE);
>> > > - for (int s = count[i]; s > 0; s--) {
>> > > - a[k++] = value;
>> > > - }
>> > > - }
>> > >
>> > > + for (int i = NUM_SHORT_VALUES - 1, k = toIndex - 1; k >=
>> > > fromIndex; --i) {
>> > > + while (count[i] == 0) --i;
>> > > + short value = (short) (i + Short.MIN_VALUE);
>> > > + int s = count[i];
>> > > + do { a[k--] = value; } while (--s > 0);
>> > > + }
>> > >
>> > > Thanks,
>> > > Dmytro Sheyko
>> > >
>> > > > From: iaroslavski at mail.ru
>> > > > To: dmytro_sheyko at hotmail.com
>> > > > CC: core-libs-dev at openjdk.java.net; iaroslavski at mail.ru
>> > > > Subject: Re[2]: New portion of improvements for Dual-Pivot
>> Quicksort
>> > > > Date: Tue, 18 May 2010 01:11:19 +0400
>> > > >
>> > > > Sounds good!
>> > > > Will consider too...
>> > > >
>> > > > Mon, 17 May 2010 22:24:11 +0700 письмо от Dmytro Sheyko
>> > > <dmytro_sheyko at hotmail.com>:
>> > > >
>> > > > > Hi,
>> > > > >
>> > > > > Regarding counting sort. We can check whether we should
>> switch to
>> > > counting sort only once in the beginning.
>> > > > >
>> > > > > > Date: Mon, 17 May 2010 17:30:37 +0400
>> > > > > > From: iaroslavski at mail.ru
>> > > > > > Subject: Re: New portion of improvements for Dual-Pivot
>> Quicksort
>> > > > > > To: dmytro_sheyko at hotmail.com
>> > > > > > CC: core-libs-dev at openjdk.java.net
>> > > > > >
>> > > > > > Hello,
>> > > > > >
>> > > > > > Thank you for review, I'll check and run tests again with you
>> > > changes.
>> > > > > >
>> > > > > > Thank you,
>> > > > > > Vladimir
>> > > > > >
>> > > > > > Dmytro Sheyko wrote:
>> > > > > > > Hello,
>> > > > > > >
>> > > > > > > More ideas.
>> > > > > > >
>> > > > > > > 1. We can use
>> > > > > > > Double.doubleToRawLongBits instead of
>> Double.doubleToLongBits and
>> > > > > > > Float.floatToRawIntBits instead of Float.floatToIntBits.
>> > > > > > > No need to handle NaN's because they all are placed to
>> the end
>> > > of array.
>> > > > > > >
>> > > > > > > 2. Note that
>> > > > > > > Double.doubleToRawLongBits(+0.0) == 0L and
>> > > > > > > Double.doubleToRawLongBits(-0.0) == Long.MIN_VALUE and
>> > > > > > > Float.floatToRawIntBits(+0.0) == 0 and
>> > > > > > > Float.floatToRawIntBits(-0.0) == Integer.MIN_VALUE.
>> > > > > > >
>> > > > > > > Comparing with is zero usually more efficient (or at
>> least not
>> > > worse)
>> > > > > > > than with other values. Thus such pattern
>> > > > > > >
>> > > > > > > if (ak == 0.0f && NEGATIVE_ZERO == Float.floatToIntBits(ak))
>> > > > > > >
>> > > > > > > can be replaced with
>> > > > > > >
>> > > > > > > if (ak == 0.0f && Float.floatToIntBits(ak) < 0)
>> > > > > > >
>> > > > > > > 3. It would be more efficient to count negative zeros after
>> > > sorting.
>> > > > > > > General sorting algorithm puts both negative and positive
>> zeros
>> > > together
>> > > > > > > (but maybe not in right order).
>> > > > > > > Therefore we have to process less elements because
>> usually we
>> > > have less
>> > > > > > > zeros than other numbers.
>> > > > > > >
>> > > > > > > Thanks,
>> > > > > > > Dmytro Sheyko
>> > > > > > >
>> > > > > > > > From: iaroslavski at mail.ru
>> > > > > > > > To: dmytro_sheyko at hotmail.com; jjb at google.com
>> > > > > > > > CC: core-libs-dev at openjdk.java.net; iaroslavski at mail.ru
>> > > > > > > > Subject: Re[6]: New portion of improvements for Dual-Pivot
>> > > Quicksort
>> > > > > > > > Date: Fri, 14 May 2010 23:54:06 +0400
>> > > > > > > >
>> > > > > > > > Hello,
>> > > > > > > >
>> > > > > > > > I've updated the class, please, review the changes.
>> > > > > > > >
>> > > > > > > > Vladimir
>> > > > > > > >
>> > > > > > > > Fri, 14 May 2010 01:48:11 +0700 письмо от Dmytro Sheyko
>> > > > > > > <dmytro_sheyko at hotmail..com>:
>> > > > > > > >
>> > > > > > > > > Yes. I prefer F (Find First zero using binary search)
>> over
>> > > C (Count
>> > > > > > > negatives) and S (Smart Scan for zero).
>> > > > > > > > >
>> > > > > > > > > > From: iaroslavski at mail.ru
>> > > > > > > > > > To: dmytro_sheyko at hotmail.com
>> > > > > > > > > > CC: jjb at google.com; core-libs-dev at openjdk.java.net;
>> > > > > > > iaroslavski at mail.ru
>> > > > > > > > > > Subject: Re[4]: New portion of improvements for
>> > > Dual-Pivot Quicksort
>> > > > > > > > > > Date: Thu, 13 May 2010 21:34:54 +0400
>> > > > > > > > > >
>> > > > > > > > > > Dmytro,
>> > > > > > > > > >
>> > > > > > > > > > I've tested your suggested variants, and found that
>> case "C"
>> > > > > > > > > > (very interesting approach to find first position
>> of zero
>> > > > > > > > > > by counting negative elements) works slower than
>> original
>> > > > > > > > > > or two other cases.
>> > > > > > > > > >
>> > > > > > > > > > Implementations "F" and "S" are very close to each
>> other
>> > > > > > > > > > and little bit faster than original. I prefer case
>> "F":
>> > > > > > > > > > it is shorter and more clear. Do you agree?
>> > > > > > > > > >
>> > > > > > > > > > I'll prepare updated DualPivotQuicksort file and
>> send it
>> > > > > > > > > > tomorrow.
>> > > > > > > > > >
>> > > > > > > > > > Thank you,
>> > > > > > > > > > Vladimir
>> > > > > > > > > >
>> > > > > > > > > > Wed, 12 May 2010 17:04:52 +0700 письмо от Dmytro
>> Sheyko
>> > > > > > > <dmytro_sheyko at hotmail.com>:
>> > > > > > > > > >
>> > > > > > > > > > > Vladimir,
>> > > > > > > > > > >
>> > > > > > > > > > > Your changes are good for me.
>> > > > > > > > > > >
>> > > > > > > > > > > Additionally I have some comments/proposals
>> regarding
>> > > dealing
>> > > > > > > with negative zeros.
>> > > > > > > > > > >
>> > > > > > > > > > > 1. Scanning for the first zero we can avoid range
>> check
>> > > (i >=
>> > > > > > > left) if we have at least one negative value.
>> > > > > > > > > > > --- DualPivotQuicksort.java Tue May 11 09:04:19 2010
>> > > > > > > > > > > +++ DualPivotQuicksortS.java Wed May 12 12:10:46
>> 2010
>> > > > > > > > > > > @@ -1705,10 +1705,15 @@
>> > > > > > > > > > > }
>> > > > > > > > > > >
>> > > > > > > > > > > // Find first zero element
>> > > > > > > > > > > - int zeroIndex = findAnyZero(a, left, n);
>> > > > > > > > > > > + int zeroIndex = 0;
>> > > > > > > > > > >
>> > > > > > > > > > > - for (int i = zeroIndex - 1; i >= left && a[i] ==
>> > > 0.0f; i--) {
>> > > > > > > > > > > - zeroIndex = i;
>> > > > > > > > > > > + if (a[left] < 0.0f) {
>> > > > > > > > > > > + zeroIndex = findAnyZero(a, left, n);
>> > > > > > > > > > > +
>> > > > > > > > > > > + // there is at least one negative value, so range
>> > > check is
>> > > > > > > not needed
>> > > > > > > > > > > + for (int i = zeroIndex - 1; /*i >= left &&*/
>> a[i] ==
>> > > 0.0f; i--) {
>> > > > > > > > > > > + zeroIndex = i;
>> > > > > > > > > > > + }
>> > > > > > > > > > > }
>> > > > > > > > > > >
>> > > > > > > > > > > // Turn the right number of positive zeros back into
>> > > negative zeros
>> > > > > > > > > > >
>> > > > > > > > > > > 2. We can find the position of the first zero by
>> counting
>> > > > > > > negative values during preprocessing phase.
>> > > > > > > > > > > --- DualPivotQuicksort.java Tue May 11 09:04:19 2010
>> > > > > > > > > > > +++ DualPivotQuicksortC.java Wed May 12 12:01:24
>> 2010
>> > > > > > > > > > > @@ -1678,7 +1678,7 @@
>> > > > > > > > > > > * Phase 1: Count negative zeros and move NaNs to
>> end of
>> > > array.
>> > > > > > > > > > > */
>> > > > > > > > > > > final int NEGATIVE_ZERO =
>> Float.floatToIntBits(-0.0f);
>> > > > > > > > > > > - int numNegativeZeros = 0;
>> > > > > > > > > > > + int numNegativeZeros = 0, numNegativeValues = 0;
>> > > > > > > > > > > int n = right;
>> > > > > > > > > > >
>> > > > > > > > > > > for (int k = left; k <= n; k++) {
>> > > > > > > > > > > @@ -1689,6 +1689,8 @@
>> > > > > > > > > > > } else if (ak != ak) { // i.e., ak is NaN
>> > > > > > > > > > > a[k--] = a[n];
>> > > > > > > > > > > a[n--] = Float.NaN;
>> > > > > > > > > > > + } else if (ak < 0.0f) {
>> > > > > > > > > > > + numNegativeValues++;
>> > > > > > > > > > > }
>> > > > > > > > > > > }
>> > > > > > > > > > >
>> > > > > > > > > > > @@ -1705,7 +1707,7 @@
>> > > > > > > > > > > }
>> > > > > > > > > > >
>> > > > > > > > > > > // Find first zero element
>> > > > > > > > > > > - int zeroIndex = findAnyZero(a, left, n);
>> > > > > > > > > > > + int zeroIndex = numNegativeValues;
>> > > > > > > > > > >
>> > > > > > > > > > > for (int i = zeroIndex - 1; i >= left && a[i] ==
>> 0.0f;
>> > > i--) {
>> > > > > > > > > > > zeroIndex = i;
>> > > > > > > > > > >
>> > > > > > > > > > > 3. We can use binary search to find the first
>> zero and
>> > > thus
>> > > > > > > avoid linear scan.
>> > > > > > > > > > > --- DualPivotQuicksort.java Tue May 11 09:04:19 2010
>> > > > > > > > > > > +++ DualPivotQuicksortF.java Wed May 12 12:03:58
>> 2010
>> > > > > > > > > > > @@ -1705,11 +1705,7 @@
>> > > > > > > > > > > }
>> > > > > > > > > > >
>> > > > > > > > > > > // Find first zero element
>> > > > > > > > > > > - int zeroIndex = findAnyZero(a, left, n);
>> > > > > > > > > > > -
>> > > > > > > > > > > - for (int i = zeroIndex - 1; i >= left && a[i] ==
>> > > 0.0f; i--) {
>> > > > > > > > > > > - zeroIndex = i;
>> > > > > > > > > > > - }
>> > > > > > > > > > > + int zeroIndex = findFirstZero(a, left, n);
>> > > > > > > > > > >
>> > > > > > > > > > > // Turn the right number of positive zeros back into
>> > > negative zeros
>> > > > > > > > > > > for (int i = zeroIndex, m = zeroIndex +
>> > > numNegativeZeros; i <
>> > > > > > > m; i++) {
>> > > > > > > > > > > @@ -1718,7 +1714,7 @@
>> > > > > > > > > > > }
>> > > > > > > > > > >
>> > > > > > > > > > > /**
>> > > > > > > > > > > - * Returns the index of some zero element in the
>> > > specified
>> > > > > > > range via
>> > > > > > > > > > > + * Returns the index of the first zero element
>> in the
>> > > > > > > specified range via
>> > > > > > > > > > > * binary search. The range is assumed to be
>> sorted, and
>> > > must
>> > > > > > > contain
>> > > > > > > > > > > * at least one zero.
>> > > > > > > > > > > *
>> > > > > > > > > > > @@ -1726,18 +1722,17 @@
>> > > > > > > > > > > * @param low the index of the first element,
>> inclusive,
>> > > to be
>> > > > > > > searched
>> > > > > > > > > > > * @param high the index of the last element,
>> inclusive,
>> > > to be
>> > > > > > > searched
>> > > > > > > > > > > */
>> > > > > > > > > > > - private static int findAnyZero(float[] a, int low,
>> > > int high) {
>> > > > > > > > > > > - while (true) {
>> > > > > > > > > > > + private static int findFirstZero(float[] a, int
>> low,
>> > > int high) {
>> > > > > > > > > > > + while (low < high) {
>> > > > > > > > > > > int middle = (low + high) >>> 1;
>> > > > > > > > > > > float middleValue = a[middle];
>> > > > > > > > > > >
>> > > > > > > > > > > if (middleValue < 0.0f) {
>> > > > > > > > > > > low = middle + 1;
>> > > > > > > > > > > - } else if (middleValue > 0.0f) {
>> > > > > > > > > > > - high = middle - 1;
>> > > > > > > > > > > - } else { // middleValue == 0.0f
>> > > > > > > > > > > - return middle;
>> > > > > > > > > > > + } else { // middleValue >= 0.0f
>> > > > > > > > > > > + high = middle;
>> > > > > > > > > > > }
>> > > > > > > > > > > + return low;
>> > > > > > > > > > > }
>> > > > > > > > > > > }
>> > > > > > > > > > >
>> > > > > > > > > > > Counting negative values appeared more expensive
>> than
>> > > any other
>> > > > > > > variants.
>> > > > > > > > > > > The last proposal seems to me as efficient as the
>> current
>> > > > > > > solution is in its worst case - when we have only one
>> negative
>> > > zero (in
>> > > > > > > the half of array).
>> > > > > > > > > > > And it shows the best result if we have many zeros.
>> > > > > > > > > > >
>> > > > > > > > > > > Regards,
>> > > > > > > > > > > Dmytro Sheyko
>> > > > > > > > > > >
>> > > > > > > > > > > > From: iaroslavski at mail.ru
>> > > > > > > > > > > > To: jjb at google.com; dmytro_sheyko at hotmail.com
>> > > > > > > > > > > > CC: core-libs-dev at openjdk.java.net;
>> iaroslavski at mail.ru
>> > > > > > > > > > > > Subject: Re[2]: New portion of improvements for
>> > > Dual-Pivot
>> > > > > > > Quicksort
>> > > > > > > > > > > > Date: Sun, 9 May 2010 23:51:27 +0400
>> > > > > > > > > > > >
>> > > > > > > > > > > > Josh,
>> > > > > > > > > > > > Dmytro,
>> > > > > > > > > > > >
>> > > > > > > > > > > > I have done more thoroughly testing "great -
>> less > 5 *
>> > > > > > > seventh" vs. "less < e1 && great > e5",
>> > > > > > > > > > > > and found that more symmetric code "less < e1 &&
>> > > great > e5"
>> > > > > > > is little bit faster, ~0.5..0.7%
>> > > > > > > > > > > > on both VMs. Other code has not been changed.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Please, take the latest version in attachment.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Vladimir
>> > > > > > > > > > > >
>> > > > > > > > > > > > Tue, 4 May 2010 21:57:42 -0700 письмо от Joshua
>> Bloch
>> > > > > > > <jjb at google.com>:
>> > > > > > > > > > > >
>> > > > > > > > > > > > > Vladimir,
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Old:
>> > > > > > > > > > > > >
>> > > > > > > > > > > > >298 if (less < e1 && great > e5) {
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > New:
>> > > > > > > > > > > > >
>> > > > > > > > > > > > >256 if (great - less > 5 * seventh) {
>> > > > > > > > > > > >
>> > > > > > > > > > > > >Regards,
>> > > > > > > > > > > > >Josh
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: DualPivotQuicksort.java
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20100519/e5c9c696/DualPivotQuicksort.java>
More information about the core-libs-dev
mailing list