RFR: JDK-8266431: Dual-Pivot Quicksort improvements (Radix sort) [v12]
iaroslavski
duke at openjdk.java.net
Fri May 20 21:23:02 UTC 2022
On Sun, 15 May 2022 14:17:41 GMT, Piotr Tarsa <duke at openjdk.java.net> wrote:
>> iaroslavski has updated the pull request incrementally with one additional commit since the last revision:
>>
>> JDK-8266431: Dual-Pivot Quicksort improvements (Radix sort)
>>
>> * Improved mixed insertion sort
>> * Optimized insertion sort
>> * Improved merging sort
>> * Optimized soring tests
>
> i think it would make much more sense to have the extra buffer size limit in bytes, not in elements. for single-threaded sorting the limit should be low, e.g. 1/64 of the heap, not 1/8 (in case of sorting e.g. longs as each long is 8 byte long). multi-threaded sorting could be more aggressive when it comes to resource usage as it's less likely to be used in resource constrained environment.
@tarsa Hi Piotr,
Thank you for your interest to sorting and your point to allocation of buffers in sorting.
Let’s I share my thoughts.
The number of elements of any array is not more than ~2 billion
(max positive int value) and therefore the size (in bytes) of
int array is not more than ~2 GB * 4 = ~8 GB.
We need additional buffer for int, long, float and double types only.
We will have 2 constants which limit the number of int/float and
long/double elements in array, like this:
private static final int MAX_BUFFER_LENGTH_1 = // for int/float
(int) Math.min(Runtime.getRuntime().maxMemory() >> 10, Integer.MAX_VALUE);
private static final int MAX_BUFFER_LENGTH_2 = // for long/double
(int) Math.min(Runtime.getRuntime().maxMemory() >> 11, Integer.MAX_VALUE);
See, that the max number of elements for long/double is in 2 times less than for int/float,
because long/double occupies 2 times more bytes.
I take factors 10 and 11 as example, it may be other values, but the first is less than the second by 1.
Why do we need to use Math.min(…, Integer.MAX_VALUE)? We must do it in this case:
if Runtime.getRuntime().maxMemory() >> 11 is larger than max int (for example, we have power server),
casting to int will produce negative value. So, limit by 2 Gb max is required.
And the method tryAllocate will look like this:
private static Object tryAllocate(Object a, int length) {
try {
if (a instanceof int[]) {
return length > MAX_BUFFER_LENGTH_1 ? null : new int[length];
}
if (a instanceof long[]) {
return length > MAX_BUFFER_LENGTH_2 ? null : new long[length];
}
if (a instanceof float[]) {
return length > MAX_BUFFER_LENGTH_1 ? null : new float[length];
}
if (a instanceof double[]) {
return length > MAX_BUFFER_LENGTH_2 ? null : new double[length];
}
throw new IllegalArgumentException("Unknown array: " + a.getClass().getName());
} catch (OutOfMemoryError e) {
return null;
}
}
What do you think, what value of shift is the best, 6, 7, 8, 9, 10, 11, 12 etc.?
Runtime.getRuntime().maxMemory() >> 10??
Do you have actual scenarios? Actual requirements? Your feedback are welcome!
-------------
PR: https://git.openjdk.java.net/jdk/pull/3938
More information about the core-libs-dev
mailing list