RFR: 8324969: C2: prevent elimination of unbalanced coarsened locking regions [v8]
Vladimir Kozlov
kvn at openjdk.org
Wed Feb 28 02:11:15 UTC 2024
> The issue is that when we do nested locks elimination we don't check that synchronized region has both `Lock` and `Unlock` nodes.
> Which can happen if locks coarsening optimization eliminated pair of Unlock/Lock nodes from adjacent locking regions before we check for nested locks.
>
> Consider this code (all locks/unlocks use the same object):
>
> lock1(obj) { // outer synchronized region
> lock2(obj) {
> // nested synchronized region 1
> } unlock2(obj);
> lock3(obj) {
> // nested synchronized region 2
> } unlock3(obj);
> } unlock1(obj);
>
> If `lock3` directly follows `unlock2` (no branches or safepoints) locks coarsening optimization will remove them:
>
>
> lock1(obj) { // outer synchronized region
> lock2(obj) {
> // nested synchronized region 1
> };
> {
> // nested synchronized region 2
> } unlock3(obj);
> } unlock1(obj);
>
>
> Nested locks elimination code checks only `Lock` node in one region to find if it is nested (inside other lock region which use the same object) and then eliminate it. So we end up with not eliminated `Unlock` node in second nested region.
>
> Why we don't hit this issue before? Normally nested locks elimination executed first and only then we do locks coarsening elimination. In the example above we eliminate all nested `Lock` and `Unlock` nodes, leaving only outer `Lock` and `Unlock`.
>
> The additional factors which leads to the failure is fully unrolled loop around nested sync regions and some allocation to trigger Escape Analysis:
>
>
> lock1(obj) { // outer synchronized region
> Test var = new Test(); // Triggers EA
> for (I = 0; I < 3; i++) { // small iteration number to fully unroll
> lock2(obj) {
> // nested synchronized region 1
> } unlock2(obj);
> lock3(obj) {
> // nested synchronized region 2
> } unlock3(obj);
> }
> } unlock1(obj);
>
> Before executing Escape Analysis we do loops optimization to simplify graph: [compile.cpp#L2332](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/compile.cpp#L2332)
> We also allow to fully unroll short loops (LoopOptsMaxUnroll) to remove merges from graph. It helps EA eliminate allocations.
> Such unrolling creates several `Lock` and `Unlock` nodes per synchronized region. But nested locks elimination look for region with only one unique `Lock` node: [callnode.cpp#L2117](https://github.com/open...
Vladimir Kozlov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision:
- Merge branch 'master' into 8324969
- added const to mark_unbalanced_boxes()
- Added asserts for BoxLock states change
- Add verivication for unbalanced BoxLock marking
- Additional changes
- More comments. Adress review.
- Renaming
- New implementation
- Merge branch 'master' into 8324969
- 8324969: assert(false) failed: Non-balanced monitor enter/exit!
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/17697/files
- new: https://git.openjdk.org/jdk/pull/17697/files/e9f59e36..1607411d
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=17697&range=07
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=17697&range=06-07
Stats: 38554 lines in 1656 files changed: 17349 ins; 13460 del; 7745 mod
Patch: https://git.openjdk.org/jdk/pull/17697.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/17697/head:pull/17697
PR: https://git.openjdk.org/jdk/pull/17697
More information about the hotspot-compiler-dev
mailing list