RFR: JDK-8266431: Dual-Pivot Quicksort improvements (Radix sort) [v12]
Piotr Tarsa
duke at openjdk.java.net
Sat May 21 18:01:44 UTC 2022
On Mon, 14 Mar 2022 19:12:29 GMT, iaroslavski <duke at openjdk.java.net> wrote:
>> Sorting:
>>
>> - adopt radix sort for sequential and parallel sorts on int/long/float/double arrays (almost random and length > 6K)
>> - fix tryMergeRuns() to better handle case when the last run is a single element
>> - minor javadoc and comment changes
>>
>> Testing:
>> - add new data inputs in tests for sorting
>> - add min/max/infinity values to float/double testing
>> - add tests for radix sort
>
> iaroslavski has updated the pull request incrementally with one additional commit since the last revision:
>
> JDK-8266431: Dual-Pivot Quicksort improvements (Radix sort)
>
> * Improved mixed insertion sort
> * Optimized insertion sort
> * Improved merging sort
> * Optimized soring tests
hi Vladimir,
the main reason i raise the issue with catching out-of-memory-error is that it's done very rarely in production code and also it's rather unpredictable when multiple things are going on (e.g. one thread can allocate a lot of memory, but other threads receive `oome`).
i've searched through /src/ in openjdk/jdk repository for `oome` (i.e. this one) and found that most times the `oome` is caught is in `java.desktop` module (probably this means the `oome` was related to native memory or other resources external to jvm), so i've narrowed search to `java.base`: https://github.com/search?q=outofmemoryerror+repo%3Aopenjdk%2Fjdk+path%3A%2Fsrc%2Fjava.base%2F+language%3AJava&type=Code&ref=advsearch&l=Java and found just 2 cases of catching `oome` without always rethrowing it:
- https://github.com/openjdk/jdk/blob/d8b0b32f9f4049aa678809aa068978e3a4e29457/src/java.base/share/classes/sun/nio/ch/FileChannelImpl.java#L1304 (but this rethrows if it internally fails with `oome` for a second time)
- https://github.com/openjdk/jdk/blob/739769c8fc4b496f08a92225a12d07414537b6c0/src/java.base/share/classes/java/util/concurrent/SubmissionPublisher.java#L1171
overall, it seems that swallowing `oome` (i.e. in this case: catching and not rethrowing) is definitely not a readily used approach.
> We need additional buffer for int, long, float and double types only.
>
> We will have 2 constants which limit the number of int/float and
long/double elements in array, like this:
> (...)
> See, that the max number of elements for long/double is in 2 times less than for int/float,
because long/double occupies 2 times more bytes.
sounds good
> Why do we need to use Math.min(…, Integer.MAX_VALUE)? (...) So, limit by 2 Gb max is required.
yep, understood
> What do you think, what value of shift is the best, 6, 7, 8, 9, 10, 11, 12 etc.?
> Runtime.getRuntime().maxMemory() >> 10??
>
> Do you have actual scenarios? Actual requirements? Your feedback are welcome!
keeping the extra buffer size below 1% of max memory should be safe enough. currently the code is:
private static final int MAX_BUFFER_LENGTH =
(int) Math.min(Runtime.getRuntime().maxMemory() >> 6, 256L << 20);
// ...
private static Object tryAllocate(Object a, int length) {
if (length > MAX_BUFFER_LENGTH) {
return null;
}
try {
if (a instanceof int[]) {
return new int[length];
}
if (a instanceof long[]) {
return new long[length];
}
...
which means that in the worst case `sizeof(long) * (max memory >> 6) == 12.5% of heap size limit` would be used which is a lot when someone runs java code in memory constrained environment and that is rather common nowadays (microservices, containers, etc).
-------------
PR: https://git.openjdk.java.net/jdk/pull/3938
More information about the core-libs-dev
mailing list