RFR: 8300148: Consider using a StoreStore barrier instead of Release barrier on ctor exit
Aleksey Shipilev
shade at openjdk.org
Wed Mar 27 12:28:23 UTC 2024
On Wed, 27 Mar 2024 05:58:34 GMT, Joshua Cao <duke at openjdk.org> wrote:
> The [JSR 133 cookbook](https://gee.cs.oswego.edu/dl/jmm/cookbook.html) has long recommended using a `StoreStore` barrier at the end of constructors that write to final fields. `StoreStore` barriers are much cheaper on arm machines as shown in benchmarks in this issue as well as https://bugs.openjdk.org/browse/JDK-8324186.
>
> This change does not improve the case for constructors for objects with volatile fields because [MemBarRelease is emitted for volatile stores](https://github.com/openjdk/jdk/blob/8fc9097b3720314ef7efaf1f3ac31898c8d6ca19/src/hotspot/share/gc/shared/c2/barrierSetC2.cpp#L211). This is demonstrated in test case `classWithVolatile`, where this patch does not impact the IR.
>
> I had to modify some code around escape analysis to make sure there are no regressions in eliminating allocations and `StoreStore`'s. The [current handling of StoreStore's in escape analysis](https://github.com/openjdk/jdk/blob/8fc9097b3720314ef7efaf1f3ac31898c8d6ca19/src/hotspot/share/opto/escape.cpp#L2590) makes the assumption that the barriers input is a `Proj` to an `Allocate` ([example](https://github.com/openjdk/jdk/blob/8fc9097b3720314ef7efaf1f3ac31898c8d6ca19/src/hotspot/share/opto/library_call.cpp#L1553)). This is contrary to the barriers in the end of the constructor where there the barrier directly takes in an `Allocate` without an in between `Proj`. I opted to instead eliminate `StoreStore`s in GVN, exactly how `MemBarRelease` is handled.
>
> I had to add [checks for StoreStore in macro.cpp](https://github.com/openjdk/jdk/blob/8fc9097b3720314ef7efaf1f3ac31898c8d6ca19/src/hotspot/share/opto/macro.cpp#L636), or else we fail some [cases for reducing allocation merges](https://github.com/openjdk/jdk/blob/8fc9097b3720314ef7efaf1f3ac31898c8d6ca19/test/hotspot/jtreg/compiler/c2/irTests/scalarReplacement/AllocationMergesTests.java#L1233-L1256).
>
> Passes hotspot tier1 locally on a Linux machine.
>
> ### Benchmarks
>
> Running Renaissance ParNnemonics on an Amazon Graviton (arm) instance.
>
> Baseline:
>
> Result "org.renaissance.jdk.streams.JmhParMnemonics.run":
> N = 25
> mean = 3309.611 ±(99.9%) 86.699 ms/op
>
> Histogram, ms/op:
> [3000.000, 3050.000) = 0
> [3050.000, 3100.000) = 4
> [3100.000, 3150.000) = 1
> [3150.000, 3200.000) = 0
> [3200.000, 3250.000) = 0
> [3250.000, 3300.000) = 0
> [3300.000, 3350.000) = 9
> [3350.000, 3400.000) = 6
> [3400.000, 3450.000) = 5
>
> Percentiles, ms/op:
> p(0.0000) = 3069.910 ms/op
> p(50.0000) = 3348.140 ms/op
> ...
I think you want to merge from master to get the clean GHA runs.
I am running some more tests here too.
src/hotspot/share/opto/parse1.cpp line 1002:
> 1000: // 3. On processors which are not CPU_MULTI_COPY_ATOMIC (e.g. PPC64),
> 1001: // support_IRIW_for_not_multiple_copy_atomic_cpu selects that
> 1002: // MemBarStoreStore is used before volatile load instead of after volatile
This change is superfluous: `MemBarVolatile` is still emitted before volatile loads in those cases.
test/hotspot/jtreg/compiler/c2/irTests/ConstructorBarriers.java line 35:
> 33: * @run main compiler.c2.irTests.ConstructorBarriers
> 34: */
> 35: public class ConstructorBarriers {
I think this test would benefit from additional scalar replacement cases. This would test EA behavior with this barrier. Basically, return the field value, not the instance?
test/hotspot/jtreg/compiler/c2/irTests/ConstructorBarriers.java line 36:
> 34: */
> 35: public class ConstructorBarriers {
> 36: private class ClassBasic {
I think these should be `static` to avoid capturing the enclosing class reference accidentally.
test/hotspot/jtreg/compiler/c2/irTests/ConstructorBarriers.java line 73:
> 71: @IR(counts = {IRNode.MEMBAR_STORESTORE, "1"})
> 72: @IR(counts = {IRNode.MEMBAR_RELEASE, "1"})
> 73: @IR(counts = {IRNode.MEMBAR_VOLATILE, "1"})
As per comment for `IRIW*` architectures, volatile barrier would be expected only on specific platforms. I think we want to `@require` this test as x86_64- and AArch64-specific.
-------------
PR Review: https://git.openjdk.org/jdk/pull/18505#pullrequestreview-1962878529
PR Review Comment: https://git.openjdk.org/jdk/pull/18505#discussion_r1540818154
PR Review Comment: https://git.openjdk.org/jdk/pull/18505#discussion_r1540842433
PR Review Comment: https://git.openjdk.org/jdk/pull/18505#discussion_r1541000643
PR Review Comment: https://git.openjdk.org/jdk/pull/18505#discussion_r1540832688
More information about the hotspot-compiler-dev
mailing list