RFR: JDK-8305711: Arm: C2 always enters slowpath for monitorexit

Aleksey Shipilev shade at openjdk.org
Thu Apr 6 16:59:12 UTC 2023


On Thu, 6 Apr 2023 16:29:57 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> A small bug in the C2 implementation of monitorexit for thin locks causes us to always enter the slow path.
> 
> This seems to be a day-zero bug of the Arm port, present since JEP 297: "Unified arm32/arm64 Port". It has a significant effect on locking performance, but biased locking hid that effect until JDK 15. Removing biased locking made the bug apparent.
> 
> With this patch, @rkennke's artificial microbenchmark that does nothing but uncontended locking improves greatly (see https://github.com/rkennke/fastlockbench):
> 
> Old:
> 
> Benchmark                      (backoff)  Mode  Cnt      Score   Error  Units
> FastLockingBenchmark.testSync          0  avgt    2    110.600          ns/op
> FastLockingBenchmark.testSync          1  avgt    2    105.725          ns/op
> FastLockingBenchmark.testSync          2  avgt    2    122.780          ns/op
> FastLockingBenchmark.testSync          4  avgt    2    125.133          ns/op
> FastLockingBenchmark.testSync          8  avgt    2    151.915          ns/op
> FastLockingBenchmark.testSync         16  avgt    2    206.458          ns/op
> FastLockingBenchmark.testSync         32  avgt    2    313.980          ns/op
> FastLockingBenchmark.testSync         64  avgt    2    522.206          ns/op
> 
> 
> New:
> 
> Benchmark                      (backoff)  Mode  Cnt      Score   Error  Units
> FastLockingBenchmark.testSync          0  avgt    2     60.102          ns/op
> FastLockingBenchmark.testSync          1  avgt    2     61.667          ns/op
> FastLockingBenchmark.testSync          2  avgt    2     74.950          ns/op
> FastLockingBenchmark.testSync          4  avgt    2     85.480          ns/op
> FastLockingBenchmark.testSync          8  avgt    2    115.019          ns/op
> FastLockingBenchmark.testSync         16  avgt    2    178.046          ns/op
> FastLockingBenchmark.testSync         32  avgt    2    273.376          ns/op
> FastLockingBenchmark.testSync         64  avgt    2    500.287          ns/op
> 
> 
> Please note that Arm remains broken since JDK-8301995; I based and tested this patch on the parent of that change.

Ouch! Good thing this does not blow up correctness-wise? The object header would almost never (famous last words) look like a displaced header when locked. `InterpreterMacroAssembler::unlock_object` does it correctly, and this code now matches the interpreter.
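
To illustrate why a mismatched compare sends every unlock to the slow path without corrupting anything, here is a toy, single-threaded model of thin-lock unlocking. It is only a sketch: `ObjectHeader`, `thin_lock`, `thin_unlock`, the constants, and the `swapped_operands` flag are hypothetical names, and the "swapped operands" shape of the bug is an assumption inferred from the remark above, not taken from the patch itself.

```python
from dataclasses import dataclass

UNLOCKED_MARK = 0x01  # hypothetical neutral (unlocked) header value
BOX_ADDRESS = 0x7F00  # hypothetical address of the on-stack lock record


@dataclass
class ObjectHeader:
    """Toy stand-in for an object's header word."""
    mark: int

    def cas(self, expected: int, new: int) -> bool:
        """Single-threaded model of compare-and-swap on the header."""
        if self.mark == expected:
            self.mark = new
            return True
        return False


def thin_lock(header: ObjectHeader, box_address: int) -> int:
    """Install the lock record address; return the displaced header."""
    displaced = header.mark
    if not header.cas(displaced, box_address):
        raise RuntimeError("contended: a real VM would take the slow path")
    return displaced


def thin_unlock(header: ObjectHeader, box_address: int, displaced: int,
                swapped_operands: bool = False) -> bool:
    """Return True if the fast path released the lock, False otherwise."""
    if swapped_operands:
        # Assumed bug shape: expect the header to already look like the
        # displaced header. While locked it holds the box address, so the
        # compare fails and the header is left untouched.
        return header.cas(expected=displaced, new=box_address)
    # Expected behaviour: header points at our lock record, so restore
    # the displaced header.
    return header.cas(expected=box_address, new=displaced)


if __name__ == "__main__":
    header = ObjectHeader(UNLOCKED_MARK)
    displaced = thin_lock(header, BOX_ADDRESS)
    print(thin_unlock(header, BOX_ADDRESS, displaced, swapped_operands=True))
    print(thin_unlock(header, BOX_ADDRESS, displaced))
```

In this model the failed compare leaves the header unchanged, so falling back to a slow path that unlocks properly costs time but not correctness, which is consistent with the benchmark numbers quoted above.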

-------------

Marked as reviewed by shade (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/13376#pullrequestreview-1375254094

