RFR: 8300148: Consider using a StoreStore barrier instead of Release barrier on ctor exit
Joshua Cao
duke at openjdk.org
Wed Mar 27 06:02:36 UTC 2024
The [JSR 133 cookbook](https://gee.cs.oswego.edu/dl/jmm/cookbook.html) has long recommended using a `StoreStore` barrier at the end of constructors that write to final fields. `StoreStore` barriers are much cheaper on arm machines as shown in benchmarks in this issue as well as https://bugs.openjdk.org/browse/JDK-8324186.
This change does not improve the case for constructors for objects with volatile fields because [MemBarRelease is emitted for volatile stores](https://github.com/openjdk/jdk/blob/8fc9097b3720314ef7efaf1f3ac31898c8d6ca19/src/hotspot/share/gc/shared/c2/barrierSetC2.cpp#L211). This is demonstrated in test case `classWithVolatile`, where this patch does not impact the IR.
I had to modify some code around escape analysis to make sure there are no regressions in eliminating allocations and `StoreStore`'s. The [current handling of StoreStore's in escape analysis](https://github.com/openjdk/jdk/blob/8fc9097b3720314ef7efaf1f3ac31898c8d6ca19/src/hotspot/share/opto/escape.cpp#L2590) makes the assumption that the barriers input is a `Proj` to an `Allocate` ([example](https://github.com/openjdk/jdk/blob/8fc9097b3720314ef7efaf1f3ac31898c8d6ca19/src/hotspot/share/opto/library_call.cpp#L1553)). This is contrary to the barriers in the end of the constructor where there the barrier directly takes in an `Allocate` without an in between `Proj`. I opted to instead eliminate `StoreStore`s in GVN, exactly how `MemBarRelease` is handled.
I had to add [checks for StoreStore in macro.cpp](https://github.com/openjdk/jdk/blob/8fc9097b3720314ef7efaf1f3ac31898c8d6ca19/src/hotspot/share/opto/macro.cpp#L636), or else we fail some [cases for reducing allocation merges](https://github.com/openjdk/jdk/blob/8fc9097b3720314ef7efaf1f3ac31898c8d6ca19/test/hotspot/jtreg/compiler/c2/irTests/scalarReplacement/AllocationMergesTests.java#L1233-L1256).
Passes hotspot tier1 locally on a Linux machine.
### Benchmarks
Running Renaissance ParNnemonics on an Amazon Graviton (arm) instance.
Baseline:
Result "org.renaissance.jdk.streams.JmhParMnemonics.run":
N = 25
mean = 3309.611 ±(99.9%) 86.699 ms/op
Histogram, ms/op:
[3000.000, 3050.000) = 0
[3050.000, 3100.000) = 4
[3100.000, 3150.000) = 1
[3150.000, 3200.000) = 0
[3200.000, 3250.000) = 0
[3250.000, 3300.000) = 0
[3300.000, 3350.000) = 9
[3350.000, 3400.000) = 6
[3400.000, 3450.000) = 5
Percentiles, ms/op:
p(0.0000) = 3069.910 ms/op
p(50.0000) = 3348.140 ms/op
p(90.0000) = 3415.178 ms/op
p(95.0000) = 3417.057 ms/op
p(99.0000) = 3417.780 ms/op
p(99.9000) = 3417.780 ms/op
p(99.9900) = 3417.780 ms/op
p(99.9990) = 3417.780 ms/op
p(99.9999) = 3417.780 ms/op
p(100.0000) = 3417.780 ms/op
After patch:
Result "org.renaissance.jdk.streams.JmhParMnemonics.run": [20/383]
N = 25
mean = 2765.754 ±(99.9%) 62.062 ms/op
Histogram, ms/op:
[2600.000, 2625.000) = 0
[2625.000, 2650.000) = 4
[2650.000, 2675.000) = 2
[2675.000, 2700.000) = 3
[2700.000, 2725.000) = 0
[2725.000, 2750.000) = 0
[2750.000, 2775.000) = 0
[2775.000, 2800.000) = 5
[2800.000, 2825.000) = 5
[2825.000, 2850.000) = 2
[2850.000, 2875.000) = 3
Percentiles, ms/op:
p(0.0000) = 2632.734 ms/op
p(50.0000) = 2793.454 ms/op
p(90.0000) = 2871.524 ms/op
p(95.0000) = 2877.469 ms/op
p(99.0000) = 2878.872 ms/op
p(99.9000) = 2878.872 ms/op
p(99.9900) = 2878.872 ms/op
p(99.9990) = 2878.872 ms/op
p(99.9999) = 2878.872 ms/op
p(100.0000) = 2878.872 ms/op
We see a 16% improvement in throughput
-------------
Commit messages:
- Remove unused imports
- Handle remaining cases of eliminating StoreStore for escaped objs
- Add tests for barriers in constructors
- Replace all end of ctor MemBarRelease with MemBarStoreStore
- Compute redundancy for StoreStore
- 8300148: Consider using a StoreStore barrier instead of Release barrier on ctor exit
Changes: https://git.openjdk.org/jdk/pull/18505/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18505&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8300148
Stats: 97 lines in 6 files changed: 92 ins; 0 del; 5 mod
Patch: https://git.openjdk.org/jdk/pull/18505.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/18505/head:pull/18505
PR: https://git.openjdk.org/jdk/pull/18505
More information about the hotspot-compiler-dev
mailing list