RFR: 8331311: C2: Big Endian Port of 8318446: optimize stores into primitive arrays by combining values into larger store

Emanuel Peter epeter at openjdk.org
Tue May 14 16:15:02 UTC 2024


On Mon, 13 May 2024 15:58:31 GMT, Richard Reingruber <rrich at openjdk.org> wrote:

>> This pr adds a few tweaks to [JDK-8318446](https://bugs.openjdk.org/browse/JDK-8318446) which allows enabling it also on big endian platforms (e.g. AIX, S390). JDK-8318446 introduced a C2 optimization to replace consecutive stores to a primitive array with just one store.
>> 
>> By example (from `TestMergeStores.java`):
>> 
>> 
>>     static Object[] test2a(byte[] a, int offset, long v) {
>>         if (IS_BIG_ENDIAN) {
>>             a[offset + 0] = (byte)(v >> 56);
>>             a[offset + 1] = (byte)(v >> 48);
>>             a[offset + 2] = (byte)(v >> 40);
>>             a[offset + 3] = (byte)(v >> 32);
>>             a[offset + 4] = (byte)(v >> 24);
>>             a[offset + 5] = (byte)(v >> 16);
>>             a[offset + 6] = (byte)(v >> 8);
>>             a[offset + 7] = (byte)(v >> 0);
>>         } else {
>>             a[offset + 0] = (byte)(v >> 0);
>>             a[offset + 1] = (byte)(v >> 8);
>>             a[offset + 2] = (byte)(v >> 16);
>>             a[offset + 3] = (byte)(v >> 24);
>>             a[offset + 4] = (byte)(v >> 32);
>>             a[offset + 5] = (byte)(v >> 40);
>>             a[offset + 6] = (byte)(v >> 48);
>>             a[offset + 7] = (byte)(v >> 56);
>>         }
>>         return new Object[]{ a };
>>     }
>> 
>> 
>> Depending on the endianess 8 bytes are stored into an array. The order of the stores is the same as the order of an 8-byte-store therefore 8 1-byte-stores can be replaced with just one 8-byte-store (if there aren't too many range checks).
>> 
>> Additionally I've fixed a few comments and a test bug.
>> 
>> The optimization seems to be a little bit more effective on big endian platforms.
>> 
>> Again by example:
>> 
>> 
>>     static Object[] test800a(byte[] a, int offset, long v) {
>>         if (IS_BIG_ENDIAN) {
>>             a[offset + 0] = (byte)(v >> 40); // Removed from candidate list
>>             a[offset + 1] = (byte)(v >> 32); // Removed from candidate list
>>             a[offset + 2] = (byte)(v >> 24); // Merged
>>             a[offset + 3] = (byte)(v >> 16); // Merged
>>             a[offset + 4] = (byte)(v >> 8);  // Merged
>>             a[offset + 5] = (byte)(v >> 0);  // Merged
>>         } else {
>>             a[offset + 0] = (byte)(v >> 0);  // Removed from candidate list
>>             a[offset + 1] = (byte)(v >> 8);  // Removed from candidate list
>>             a[offset + 2] = (byte)(v >> 16); // Not merged
>>             a[offset + 3] = (byte)(v >> 24); // Not merged
>>             a[offset + 4] = (byte)(v >> 32); // Not merge...
>
> @offamitkumar you can put this through your testing if you like. It should solve the issues with test/hotspot/jtreg/compiler/c2/TestMergeStores.java also for s390.

@reinrich thanks for taking this up!
Just did a quick scan of the tests. I think it could be good to have both big/small endian tests run on both big/small endian machines, but only expect IR rules to pass if the test and platform are expected to optimize. This just makes sure that the logic is correct, and does not optimize the wrong cases, producing wrong results.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19218#issuecomment-2110622758


More information about the hotspot-compiler-dev mailing list