Extending Java Arrays/Collection Sort API

Laurent Bourgès bourges.laurent at gmail.com
Wed Nov 14 08:01:50 UTC 2018


Hi,

I am a scientist and the author of the Marlin renderer, integrated in
OpenJDK.

In this renderer, I needed a fast sort algorithm on small almost sorted
arrays, but more important, to deal with 2 arrays: x values and their
corresponding edge indices (pointer like). I derived my MergeSort class to
perform 2 swaps (x & edge ptr) as Arrays.sort() or TimSort() did not
provide such API.

1. I started discussing about extending the Java Sort API with Vladimir
Yaroslavskiy, the author of the famous Java DualPivotQuickSort, as he is
currently improving it.

His coming DPQS 2018 class provides lot of sort algorithms, I propose to
make them public API through a facade method in Arrays class:
insertionSort(), heapSort(), mergeSort(), quickSort()... Of course, his
current code is dedicated to DPQS, but his algorithm implementation could
be generalized to propose the Best implementations as a public Java API
(stable & unstable sorts)

For example in Marlin, I need trivial insertionSort() in the Helpers class,
my MergeSort could be replaced by another best implementation.
Let's not reinvent the wheel...

2. New Sort API to deal with indices...

For now, there is no mean to obtain the indices sorted on another value,
see my introduction: edge indices sorted by their x positions.
In data science, this is very useful and general.

What about an Arrays API extension:
- Arrays.sort(int[] a, int[] b, low, high) or
Indices int[] Arrays.sortIndices(int[] a, low high), a[] left untouched
- same with preallocated buffers & runs arrays
- alternatively, it could be expressed as Comparator<Primitive>
Arrays.sort(int[] a, low, high, Comparator<int, int> cmp) that sorts array
a using the given comparator. For example, a[] could be edge indices sorted
according to their edge x position...
This comparator is specific as it compares anything (external storage) at
given indices i,j.

Finally Marlin renderer would benefit from Value types (array of structs),
as edge data are currently packed in int[] arrays, but this new Sort API
seems to me general & needed enough to be discussed here.

Best regards,
Laurent


More information about the core-libs-dev mailing list