Re: Slightly faster java.util.Arrays.byteSort(byte[])
Vladimir Yaroslavskiy
vlv.spb.ru at mail.ru
Tue Jun 18 11:51:09 UTC 2019
Hi Rodion,
I looked at your byte sorting and compared it with the version which is under
review now. Here is the summary of my results:
1. net.coderodde.Arrays.sort(byte[]) works slower on small data (len < ~70).
The reason is that insertion sort is invoked on small arrays where counting
sort is not fast.
2. On other data net.coderodde.Arrays.sort shows the same time.
In general your code is not faster to be integrated, but many thanks
for pointing to non-perfect implementation of existing byte sorting.
I provide here the new implementation of byte sorting. Please look at it
and let me know if you have other ideas how to improve it.
/**
* Min size of a byte array to use counting sort.
*/
private static final int MIN_BYTE_COUNTING_SORT_SIZE = 64;
/**
* Sorts the specified range of the array using
* counting sort or insertion sort.
*
* @param a the array to be sorted
* @param low the index of the first element, inclusive, to be sorted
* @param high the index of the last element, exclusive, to be sorted
*/
static void sort(byte[] a, int low, int high) {
if (high - low > MIN_BYTE_COUNTING_SORT_SIZE) {
countingSort(a, low, high);
} else {
insertionSort(a, low, high);
}
}
/**
* Sorts the specified range of the array using insertion sort.
*
* @param a the array to be sorted
* @param low the index of the first element, inclusive, to be sorted
* @param high the index of the last element, exclusive, to be sorted
*/
private static void insertionSort(byte[] a, int low, int high) {
for (int i, k = low; ++k < high; ) {
byte ai = a[i = k];
if (ai < a[i - 1]) {
while (--i >= low && ai < a[i]) {
a[i + 1] = a[i];
}
a[i + 1] = ai;
}
}
}
/**
* The number of distinct byte values.
*/
private static final int NUM_BYTE_VALUES = 1 << 8;
/**
* Max index of byte counter.
*/
private static final int MAX_BYTE_INDEX = Byte.MAX_VALUE + NUM_BYTE_VALUES + 1;
/**
* Sorts the specified range of the array using counting sort.
*
* @param a the array to be sorted
* @param low the index of the first element, inclusive, to be sorted
* @param high the index of the last element, exclusive, to be sorted
*/
private static void countingSort(byte[] a, int low, int high) {
int[] count = new int[NUM_BYTE_VALUES];
/*
* Compute a histogram with the number of each values.
*/
for (int i = high; i > low; ++count[a[--i] & 0xFF]);
/*
* Place values on their final positions.
*/
for (int k = high, i = MAX_BYTE_INDEX; --i > Byte.MAX_VALUE; ) {
int value = i & 0xFF;
for (low = k - count[value]; k > low; a[--k] = (byte) value);
}
}
Thank you,
Vladimir
>>I'm not an expert, however, while your OCA (see below) is being processed I
>>would recommend to provide more comprehensive stats. Different lengths of an
>>array, different flavours of data: sorted, sorted in the reverse order, random,
>>typical expected case(s), etc.
>>
>>It seems that this particular functionality ( sort(byte[] ) hasn't changed since
>>the JDK 8. However, you should probably add the current JDK to your comparison.
>>
>>One necessary step towards making this eligible for inclusion in the JDK would
>>be to sign the OCA
>>
>>???? https://www.oracle.com/technetwork/community/oca-486395.html
>>
>>Keep in mind that it is not by any means a guarantee that your change will be
>>included. Once the OCA has been signed and processed, the code then can be
>>discussed and evaluated by experts.
>>
>>-Pavel
>>
>>> On 14 Jun 2019, at 16:34, Rodion Efremov < coderodd3 at gmail.com > wrote:
>>>
>>> Good evening!
>>>
>>> I managed to improve the JDK 8 java.util.Arrays.sort(byte[])
>>> performance-wise [1]. The (warmed up) demonstration program produces more
>>> or less optimistic results on arrays of 1e8 bytes:
>>>
>>> seed = 1560526264738
>>> java.util.Arrays.sort(byte[]) in 87.643701 milliseconds.
>>> java.util.Arrays.parallelSort(byte[]) in 301.329701 milliseconds.
>>> net.coderodde.Arrays.sort(byte[]) in 62.0763 milliseconds.
>>> Algorithms agree: true
>>>
>>> I would like to hear any comments on how to make it eligible for inclusion
>>> in JDK.
>>>
>>> Best regards,
>>> Rodion E.
>>>
>>> References:
>>> [1] https://gist.github.com/coderodde/493407bc1c57352b53c2aa18b5c9a7a8
>Hi Rodion,
>I'm working on the new version of Arrays.sort() / Arrays.parallelSort() which is under review.
>I will look on your version and provide my comments.
>Thank you,
>Vladimir
More information about the core-libs-dev
mailing list