RFR: 8331311: C2: Big Endian Port of 8318446: optimize stores into primitive arrays by combining values into larger store

Emanuel Peter epeter at openjdk.org
Tue May 14 16:19:06 UTC 2024


On Mon, 13 May 2024 15:53:52 GMT, Richard Reingruber <rrich at openjdk.org> wrote:

> This pr adds a few tweaks to [JDK-8318446](https://bugs.openjdk.org/browse/JDK-8318446) which allows enabling it also on big endian platforms (e.g. AIX, S390). JDK-8318446 introduced a C2 optimization to replace consecutive stores to a primitive array with just one store.
> 
> By example (from `TestMergeStores.java`):
> 
> 
>     static Object[] test2a(byte[] a, int offset, long v) {
>         if (IS_BIG_ENDIAN) {
>             a[offset + 0] = (byte)(v >> 56);
>             a[offset + 1] = (byte)(v >> 48);
>             a[offset + 2] = (byte)(v >> 40);
>             a[offset + 3] = (byte)(v >> 32);
>             a[offset + 4] = (byte)(v >> 24);
>             a[offset + 5] = (byte)(v >> 16);
>             a[offset + 6] = (byte)(v >> 8);
>             a[offset + 7] = (byte)(v >> 0);
>         } else {
>             a[offset + 0] = (byte)(v >> 0);
>             a[offset + 1] = (byte)(v >> 8);
>             a[offset + 2] = (byte)(v >> 16);
>             a[offset + 3] = (byte)(v >> 24);
>             a[offset + 4] = (byte)(v >> 32);
>             a[offset + 5] = (byte)(v >> 40);
>             a[offset + 6] = (byte)(v >> 48);
>             a[offset + 7] = (byte)(v >> 56);
>         }
>         return new Object[]{ a };
>     }
> 
> 
> Depending on the endianess 8 bytes are stored into an array. The order of the stores is the same as the order of an 8-byte-store therefore 8 1-byte-stores can be replaced with just one 8-byte-store (if there aren't too many range checks).
> 
> Additionally I've fixed a few comments and a test bug.
> 
> The optimization seems to be a little bit more effective on big endian platforms.
> 
> Again by example:
> 
> 
>     static Object[] test800a(byte[] a, int offset, long v) {
>         if (IS_BIG_ENDIAN) {
>             a[offset + 0] = (byte)(v >> 40); // Removed from candidate list
>             a[offset + 1] = (byte)(v >> 32); // Removed from candidate list
>             a[offset + 2] = (byte)(v >> 24); // Merged
>             a[offset + 3] = (byte)(v >> 16); // Merged
>             a[offset + 4] = (byte)(v >> 8);  // Merged
>             a[offset + 5] = (byte)(v >> 0);  // Merged
>         } else {
>             a[offset + 0] = (byte)(v >> 0);  // Removed from candidate list
>             a[offset + 1] = (byte)(v >> 8);  // Removed from candidate list
>             a[offset + 2] = (byte)(v >> 16); // Not merged
>             a[offset + 3] = (byte)(v >> 24); // Not merged
>             a[offset + 4] = (byte)(v >> 32); // Not merged
>             a[offset + 5] = (byte)(v >> 40); // Not merged
>         }
>         return new Object[]{ a };...

src/hotspot/share/opto/memnode.cpp line 3313:

> 3311:     merged_input_value = _store->in(MemNode::ValueIn);
> 3312:     bool is_true = is_con_RShift(first->in(MemNode::ValueIn), base_last, shift_last);
> 3313: #endif // VM_LITTLE_ENDIAN

You could just have local variables for "lo" / "hi", set them depending on big/little endian, and then the logic would be the same for both.

test/hotspot/jtreg/compiler/c2/TestMergeStores.java line 57:

> 55:     private static final Random RANDOM = Utils.getRandomInstance();
> 56: 
> 57:     private static final boolean IS_BIG_ENDIAN = UNSAFE.isBigEndian();

`static` is very important here, so that the `if` constant fold in the test. Otherwise we don't know if we have the IR rule pass because of the correct branch. Maybe add a comment for that.

test/hotspot/jtreg/compiler/c2/TestMergeStores.java line 117:

> 115:         testGroups.get("test2").put("test2c", (_,_) -> { return test2c(aB.clone(), offset1, vL1); });
> 116:         testGroups.get("test2").put("test2d", (_,_) -> { return test2d(aB.clone(), offset1, vL1); });
> 117:         testGroups.get("test2").put("test2e", (_,_) -> { return test2e(aB.clone(), offset1, vL1); });

Nice catch

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19218#discussion_r1600314029
PR Review Comment: https://git.openjdk.org/jdk/pull/19218#discussion_r1600315534
PR Review Comment: https://git.openjdk.org/jdk/pull/19218#discussion_r1600315727


More information about the hotspot-compiler-dev mailing list