RFR: JDK-8266431: Dual-Pivot Quicksort improvements (Radix sort) [v12]

iaroslavski duke at openjdk.java.net
Fri May 20 21:23:02 UTC 2022


On Sun, 15 May 2022 14:17:41 GMT, Piotr Tarsa <duke at openjdk.java.net> wrote:

>> iaroslavski has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   JDK-8266431: Dual-Pivot Quicksort improvements (Radix sort)
>>   
>>   * Improved mixed insertion sort
>>   * Optimized insertion sort
>>   * Improved merging sort
>>   * Optimized soring tests
>
> i think it would make much more sense to have the extra buffer size limit in bytes, not in elements. for single-threaded sorting the limit should be low, e.g. 1/64 of the heap, not 1/8 (in case of sorting e.g. longs as each long is 8 byte long). multi-threaded sorting could be more aggressive when it comes to resource usage as it's less likely to be used in resource constrained environment.

@tarsa  Hi Piotr,
 
Thank you for your interest to sorting and your point to allocation of buffers in sorting.
 
Let’s I share my thoughts.
 
The number of elements of any array is not more than ~2 billion
(max positive int value) and therefore the size (in bytes) of
int array is not more than ~2 GB * 4 = ~8 GB.
 
We need additional buffer for int, long, float and double types only.
 
We will have 2 constants which limit the number of int/float and
long/double elements in array, like this:
 
    private static final int MAX_BUFFER_LENGTH_1 =              // for int/float
            (int) Math.min(Runtime.getRuntime().maxMemory() >> 10, Integer.MAX_VALUE);
 
    private static final int MAX_BUFFER_LENGTH_2 =             // for long/double
            (int) Math.min(Runtime.getRuntime().maxMemory() >> 11, Integer.MAX_VALUE);
 
See, that the max number of elements for long/double is in 2 times less than for int/float,
because long/double occupies 2 times more bytes.
 
I take factors 10 and 11 as example, it may be other values, but the first is less than the second by 1.
 
Why do we need to use Math.min(…, Integer.MAX_VALUE)? We must do it in this case:
if  Runtime.getRuntime().maxMemory() >> 11 is larger than max int (for example, we have power server),
casting to int will produce negative value. So, limit by 2 Gb max is required.
 
And the method tryAllocate will look like this:
 
private static Object tryAllocate(Object a, int length) {
    try {
        if (a instanceof int[]) {
            return length > MAX_BUFFER_LENGTH_1 ? null : new int[length];
        }
        if (a instanceof long[]) {
            return length > MAX_BUFFER_LENGTH_2 ? null : new long[length];
        }
        if (a instanceof float[]) {
            return length > MAX_BUFFER_LENGTH_1 ? null : new float[length];
        }
        if (a instanceof double[]) {
            return length > MAX_BUFFER_LENGTH_2 ? null : new double[length];
        }
        throw new IllegalArgumentException("Unknown array: " + a.getClass().getName());
    } catch (OutOfMemoryError e) {
        return null;
    }
}
 
What do you think, what value of shift is the best, 6, 7, 8, 9, 10, 11, 12 etc.?
Runtime.getRuntime().maxMemory() >> 10??
 
Do you have actual scenarios? Actual requirements? Your feedback are welcome!

-------------

PR: https://git.openjdk.java.net/jdk/pull/3938


More information about the core-libs-dev mailing list