RFR: JDK-8305711: Arm: C2 always enters slowpath for monitorexit
Aleksey Shipilev
shade at openjdk.org
Thu Apr 6 16:59:12 UTC 2023
On Thu, 6 Apr 2023 16:29:57 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
> A small bug in the C2 implementation of monitorexit for thin locks causes us to always enter the slow path.
>
> This seems to be a day-zero bug of the Arm port, present since JEP 297: "Unified arm32/arm64 Port". It has a significant effect on locking performance, but that effect was hidden by biased locking until JDK 15. The removal of biased locking made the bug apparent.
>
> With this patch, @rkennke's artificial microbenchmark that does nothing but uncontended locking improves greatly (see https://github.com/rkennke/fastlockbench):
>
>
> Benchmark                      (backoff)  Mode  Cnt    Score  Error  Units
> FastLockingBenchmark.testSync          0  avgt    2  110.600         ns/op
> FastLockingBenchmark.testSync          1  avgt    2  105.725         ns/op
> FastLockingBenchmark.testSync          2  avgt    2  122.780         ns/op
> FastLockingBenchmark.testSync          4  avgt    2  125.133         ns/op
> FastLockingBenchmark.testSync          8  avgt    2  151.915         ns/op
> FastLockingBenchmark.testSync         16  avgt    2  206.458         ns/op
> FastLockingBenchmark.testSync         32  avgt    2  313.980         ns/op
> FastLockingBenchmark.testSync         64  avgt    2  522.206         ns/op
>
>
> New:
>
> Benchmark                      (backoff)  Mode  Cnt    Score  Error  Units
> FastLockingBenchmark.testSync          0  avgt    2   60.102         ns/op
> FastLockingBenchmark.testSync          1  avgt    2   61.667         ns/op
> FastLockingBenchmark.testSync          2  avgt    2   74.950         ns/op
> FastLockingBenchmark.testSync          4  avgt    2   85.480         ns/op
> FastLockingBenchmark.testSync          8  avgt    2  115.019         ns/op
> FastLockingBenchmark.testSync         16  avgt    2  178.046         ns/op
> FastLockingBenchmark.testSync         32  avgt    2  273.376         ns/op
> FastLockingBenchmark.testSync         64  avgt    2  500.287         ns/op
>
>
> Please note that Arm remains broken since JDK-8301995; I based and tested this patch on the parent of that change.
Ouch! Good thing this does not blow up correctness-wise? The object header would almost never (famous last words) look like a displaced header when locked. `InterpreterMacroAssembler::unlock_object` does it correctly, and this code now matches the interpreter.
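
To illustrate why the mismatch only costs performance, here is a toy Java model of a thin-lock release. It is a sketch under stated assumptions, not the HotSpot code: the class `ThinLockSketch`, the constants, and both unlock methods are hypothetical, and the "buggy" variant is only my reading of the failure mode described above (the compare-and-swap expects the displaced header where the header actually holds the lock-record address), not a copy of the pre-patch assembly.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model only: one object header word and one on-stack lock record.
class ThinLockSketch {
    // Hypothetical header value of an unlocked object.
    static final long UNLOCKED_MARK = 0x1L;
    // Hypothetical address of the on-stack lock record ("box").
    static final long BOX_ADDRESS = 0x7f00L;

    final AtomicLong markWord = new AtomicLong(UNLOCKED_MARK);
    long displacedHeader; // copy of the original header, kept in the box

    // Thin lock: save the header in the box, then swing the header to the box.
    boolean lock() {
        displacedHeader = markWord.get();
        return markWord.compareAndSet(displacedHeader, BOX_ADDRESS);
    }

    // Expected release: header must still point at our box; restore the
    // displaced header. Returns true when the fast path succeeds.
    boolean unlockCorrect() {
        return markWord.compareAndSet(BOX_ADDRESS, displacedHeader);
    }

    // Assumed buggy release: compares the header against the displaced
    // header. While locked, the header holds BOX_ADDRESS, so the compare
    // fails, nothing is written, and the caller falls back to the slow path.
    boolean unlockBuggy() {
        return markWord.compareAndSet(displacedHeader, BOX_ADDRESS);
    }

    public static void main(String[] args) {
        ThinLockSketch o = new ThinLockSketch();
        o.lock();
        System.out.println("buggy fast path:   " + o.unlockBuggy());
        System.out.println("correct fast path: " + o.unlockCorrect());
    }
}
```

In this model the failed compare-and-swap leaves the header untouched, so a slow path can still release the lock properly; the cost is time, not correctness.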
-------------
Marked as reviewed by shade (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/13376#pullrequestreview-1375254094