New portion of improvements for Dual-Pivot Quicksort

Thu May 6 08:59:18 UTC 2010

Hello Dmytro,

I got your idea, and now I'm trying to combine insertion sort
with your suggestion (don't set a sentinel), to be under
restriction that we cannot change anything outside of the
range and to sort correctly if initially part of array only
is to be sorted (not whole array), see:

[ ? ] [ this range to be sorted ] [ ? ]

If center part must be sorted only, left [ ? ] part
cannot be sentinel. Therefore, we have to sort it
by traditional insertion sort (or something else).

Thank you,
Vladimir

Dmytro Sheyko wrote:
> Vladimir,
> 
>  > If we consider case when the whole array is sorted,
>  > we can omit setting sentinel and restoring original value:
> 
> Let me define my point more precisely. We can omit setting sentinel not 
> only in a particular case when the whole array is sorted.
> 
> During partitioning the whole array is split into small chunks. Elements 
> within a chunks remains unsorted,
> but every element in a certain chunk is greater than every element in 
> preceding (lefter) chunks
> and less than every element in following (righter) chunks.
> 
> E.g.
> 
>    [...A...B...]C[...D...E...]F[...G...H...]
> 
> Elements C and F are pivots.
> Elements D and E are greater that pivot C and hence greater than A and 
> B, and less than pivot F and hence G and H.
> 
> Therefore during sorting any non-leftmost chunk we can rely on preceding 
> chunk.
> 
> Did I miss something? Is this what you mean?
> 
> Thank you,
> Dmytro Sheyko
> 
>  > Date: Wed, 5 May 2010 21:23:37 +0400
>  > From: iaroslavski at mail.ru
>  > Subject: Re: New portion of improvements for Dual-Pivot Quicksort
>  > To: dmytro_sheyko at hotmail.com
>  > CC: core-libs-dev at openjdk.java.net
>  >
>  > Hi Dmytro,
>  >
>  > Thank you very much for suggestions! I checked it and here
>  > is my results:
>  >
>  > If we consider case when the whole array is sorted,
>  > we can omit setting sentinel and restoring original value:
>  >
>  > // int before = a[left - 1]; a[left - 1] = Integer.MIN_VALUE;
>  >
>  > for (int j, i = left + 1; i <= right; i++) {
>  > int ai = a[i];
>  > for (j = i - 1; /* j >= left && */ ai < a[j]; j--) {
>  > a[j + 1] = a[j];
>  > }
>  > a[j + 1] = ai;
>  > }
>  > // a[left - 1] = before;
>  >
>  > I checked this version and found that it works little bit faster
>  > (only 0.4-0.3%) with client VM, but loses more than 2.5% under
>  > server mode in compare with one-pivot version from JDK 6:
>  >
>  > client server
>  > original: 60.75% 48.20%
>  > no setting sentinel: 60.40% 50.88%
>  >
>  > If we add additional check which case is sorted, I think, we
>  > don't win at all.
>  >
>  > And if pivot candidates are not put back to array, it shows
>  > very poor results:
>  >
>  > client server
>  > original: 60.75% 48.20%
>  > no back candidates: 66.80% 53.95%
>  >
>  > I run one pivot version taken from JDK 6: I replace in Dual-Pivot
>  > implementation with all optimizations dual-pivot partitioning by 
> one-pivot:
>  > but it remain still slower than DPQ. So, the main point of DPQ is not 
> only
>  > in optimizations, but in dividing array into 3 parts instead of two.
>  >
>  > Thank you,
>  > Vladimir
>  >
>  > Dmytro Sheyko wrote:
>  > > Hi Vladimir,
>  > >
>  > > The trick with sentinel is quite cute. Going farther, I think it is 
> not
>  > > always necessary to replace left outer element with the least 
> possible value
>  > > (and restore it later after sorting) because after partitioning every
>  > > element in adjoining array part can play the role of sentinel.
>  > > The only exception is when the user requested to sort array partially
>  > > (not from the beginning). Thereby we should care about setting 
> sentinel
>  > > explicitly in this exceptional case only and only once before 
> sorting whole array.
>  > >
>  > > Also it seems to me that it is not necessary to put pivot 
> candidates (5
>  > > elements that are used to choose pivots) back to array
>  > > because anyway they are to be sorted later and likely change their
>  > > positions.
>  > >
>  > > I am also interesting in theoretical rationale why dual pivot 
> quicksort
>  > > is better than single pivot one. The last document that I have seen
>  > > (Last updated: September 22, 2009)
>  > > compared classic quicksort (where pivot is chosen arbitrarily) with
>  > > "classic" dual pivot quicksort (where pivots are also chosen 
> arbitrarily).
>  > > As conclusion they both perform about 2*n*ln(n) key comparisons. 
> However
>  > > jdk had improved quicksort: median-of-three and pseudomedian-of-nine
>  > > approaches were used. And median-of-three approach lead to 
> 12/7*n*ln(n)
>  > > key comparisons. On the other hand, dual pivot quicksort is also 
> improved:
>  > > pivots are chosen from 5 candidates, and hence it must perform less 
> than
>  > > 2*n*ln(n) key comparisons.
>  > >
>  > > Regards,
>  > > Dmytro Sheyko