RFR: 8331311: C2: Big Endian Port of 8318446: optimize stores into primitive arrays by combining values into larger store [v6]
Amit Kumar
amitkumar at openjdk.org
Thu Jun 6 07:04:58 UTC 2024
On Wed, 5 Jun 2024 20:07:17 GMT, Richard Reingruber <rrich at openjdk.org> wrote:
>> This PR adds a few tweaks to [JDK-8318446](https://bugs.openjdk.org/browse/JDK-8318446) that allow it to be enabled on big endian platforms as well (e.g. AIX, S390). JDK-8318446 introduced a C2 optimization to replace consecutive stores to a primitive array with just one store.
>>
>> By example (from `TestMergeStores.java`):
>>
>>
>>     static Object[] test2a(byte[] a, int offset, long v) {
>>         if (IS_BIG_ENDIAN) {
>>             a[offset + 0] = (byte)(v >> 56);
>>             a[offset + 1] = (byte)(v >> 48);
>>             a[offset + 2] = (byte)(v >> 40);
>>             a[offset + 3] = (byte)(v >> 32);
>>             a[offset + 4] = (byte)(v >> 24);
>>             a[offset + 5] = (byte)(v >> 16);
>>             a[offset + 6] = (byte)(v >> 8);
>>             a[offset + 7] = (byte)(v >> 0);
>>         } else {
>>             a[offset + 0] = (byte)(v >> 0);
>>             a[offset + 1] = (byte)(v >> 8);
>>             a[offset + 2] = (byte)(v >> 16);
>>             a[offset + 3] = (byte)(v >> 24);
>>             a[offset + 4] = (byte)(v >> 32);
>>             a[offset + 5] = (byte)(v >> 40);
>>             a[offset + 6] = (byte)(v >> 48);
>>             a[offset + 7] = (byte)(v >> 56);
>>         }
>>         return new Object[]{ a };
>>     }
>>
>>
>> Depending on the endianness, the 8 bytes are stored into the array in one of two orders. In either case the order of the stores matches the byte order of a single 8-byte store on that platform, so the eight 1-byte stores can be replaced with just one 8-byte store (if there aren't too many range checks).
>>
>> Additionally I've fixed a few comments and a test bug.
>>
>> The optimization seems to be a little bit more effective on big endian platforms.
>>
>> Again by example:
>>
>>
>>     static Object[] test800a(byte[] a, int offset, long v) {
>>         if (IS_BIG_ENDIAN) {
>>             a[offset + 0] = (byte)(v >> 40); // Removed from candidate list
>>             a[offset + 1] = (byte)(v >> 32); // Removed from candidate list
>>             a[offset + 2] = (byte)(v >> 24); // Merged
>>             a[offset + 3] = (byte)(v >> 16); // Merged
>>             a[offset + 4] = (byte)(v >> 8); // Merged
>>             a[offset + 5] = (byte)(v >> 0); // Merged
>>         } else {
>>             a[offset + 0] = (byte)(v >> 0); // Removed from candidate list
>>             a[offset + 1] = (byte)(v >> 8); // Removed from candidate list
>>             a[offset + 2] = (byte)(v >> 16); // Not merged
>>             a[offset + 3] = (byte)(v >> 24); // Not merged
>>             a[offset + 4] = (byte)(v >> 32); // Not merge...
>
> Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision:
>
> - Merge branch 'master' into 8331311_merge_stores_on_big_endian
> - Feedback Emanuel
> - Eliminate IS_BIG_ENDIAN and always execute both variants
> - test2BE: big endian version of test2
> - Improve make_merged_input_value based on Emanuel's feedback
> - Improve comment
> - Improve comment
> - Add bug id
> - Typo
> - 8331311: C2: Big Endian Port of 8318446: optimize stores into primitive arrays by combining values into larger store
I did another round of testing on s390x. Looks good.
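
For readers following the thread, the equivalence that the quoted description relies on can be checked at the Java source level. The sketch below is only an illustration, not how C2 implements the merge: the class `MergeStoresSketch` and its method names are hypothetical, and it uses `MethodHandles.byteArrayViewVarHandle` to stand in for the single 8-byte store. It compares the eight 1-byte stores of each `test2a` variant against one 8-byte store in the matching byte order.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical illustration class; not part of the JDK or of TestMergeStores.java.
public class MergeStoresSketch {
    // Views of a byte[] as longs; a single set() writes 8 bytes in the given order.
    private static final VarHandle LONG_BE =
        MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.BIG_ENDIAN);
    private static final VarHandle LONG_LE =
        MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.LITTLE_ENDIAN);

    // Eight 1-byte stores, most significant byte first (big endian variant).
    static void storeBytewiseBE(byte[] a, int offset, long v) {
        for (int i = 0; i < 8; i++) {
            a[offset + i] = (byte)(v >> (56 - 8 * i));
        }
    }

    // Eight 1-byte stores, least significant byte first (little endian variant).
    static void storeBytewiseLE(byte[] a, int offset, long v) {
        for (int i = 0; i < 8; i++) {
            a[offset + i] = (byte)(v >> (8 * i));
        }
    }

    // Returns true if the bytewise stores and the single 8-byte store
    // leave identical array contents for the chosen byte order.
    static boolean sameResult(boolean bigEndian, int offset, long v) {
        byte[] bytewise = new byte[16];
        byte[] merged = new byte[16];
        if (bigEndian) {
            storeBytewiseBE(bytewise, offset, v);
            LONG_BE.set(merged, offset, v);
        } else {
            storeBytewiseLE(bytewise, offset, v);
            LONG_LE.set(merged, offset, v);
        }
        return Arrays.equals(bytewise, merged);
    }

    public static void main(String[] args) {
        long v = 0x0102030405060708L;
        System.out.println("BE matches: " + sameResult(true, 3, v));
        System.out.println("LE matches: " + sameResult(false, 3, v));
    }
}
```

On real hardware only the variant whose store order matches the platform's native byte order corresponds to a plain 8-byte store, which is why the test carries both orderings.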
-------------
PR Comment: https://git.openjdk.org/jdk/pull/19218#issuecomment-2151550321