RFR: 8324969: C2: prevent elimination of unbalanced coarsened locking regions [v7]

Emanuel Peter epeter at openjdk.org
Mon Feb 26 11:18:56 UTC 2024


On Sun, 25 Feb 2024 04:46:10 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>> The issue is that when we do nested locks elimination we don't check that synchronized region has both `Lock` and `Unlock` nodes.
>> Which can happen if locks coarsening optimization eliminated pair of Unlock/Lock nodes from adjacent locking regions before we check for nested locks.
>> 
>> Consider this code (all locks/unlocks use the same object):
>> 
>>     lock1(obj) { // outer synchronized region
>>          lock2(obj) {
>>              // nested synchronized region 1
>>          } unlock2(obj);
>>          lock3(obj) {
>>              // nested synchronized region 2
>>          } unlock3(obj);
>>     } unlock1(obj);
>> 
>> If `lock3` directly follows `unlock2` (no branches or safepoints) locks coarsening optimization will remove them:
>> 
>> 
>>     lock1(obj) { // outer synchronized region
>>          lock2(obj) {
>>              // nested synchronized region 1
>>          };
>>          {
>>              // nested synchronized region 2
>>          } unlock3(obj);
>>     } unlock1(obj);
>> 
>> 
>> Nested locks elimination code checks only `Lock` node in one region to find if it is nested (inside other lock region which use the same object) and then eliminate it. So we end up with not eliminated `Unlock` node in second nested region.
>> 
>> Why we don't hit this issue before? Normally nested locks elimination executed first and only then we do locks coarsening elimination. In the example above we eliminate all nested `Lock` and `Unlock` nodes, leaving only outer `Lock` and `Unlock`.
>> 
>> The additional factors which leads to the failure is fully unrolled loop around nested sync regions and some allocation to trigger Escape Analysis:
>> 
>> 
>>     lock1(obj) { // outer synchronized region
>>          Test var = new Test(); // Triggers EA
>>          for (I = 0; I < 3; i++) { // small iteration number to fully unroll
>>              lock2(obj) {
>>                  // nested synchronized region 1
>>              } unlock2(obj);
>>              lock3(obj) {
>>                  // nested synchronized region 2
>>              } unlock3(obj);
>>          }
>>     } unlock1(obj);
>> 
>> Before executing Escape Analysis we do loops optimization to simplify graph: [compile.cpp#L2332](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/compile.cpp#L2332)
>> We also allow to fully unroll short loops (LoopOptsMaxUnroll) to remove merges from graph. It helps EA eliminate allocations.
>> Such unrolling creates several `Lock` and `Unlock` nodes per synchronized region. But nested locks eliminat...
>
> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Added asserts for BoxLock states change

I like the `_kind` states, with the state transitions. Makes it a bit clearer :)

Would it make sense to add the `_kind` string to the node dump?
That would possibly allow for IR matching with regex, on a few examples.
And that would make me feel a bit safer about test coverage ;)

src/hotspot/share/opto/compile.cpp line 4975:

> 4973: void Compile::mark_unbalanced_boxes() {
> 4974:   int count = coarsened_count();
> 4975:   for (int i = 0; i < count; i++) {

Just out of curiosity: Do you often move out the limit from the loop?
I guess this could help with performance. But `coarsened_count` is const, so I guess it would not make a change?

Another question: could you make `mark_unbalanced_boxes` const?

src/hotspot/share/opto/escape.cpp line 2893:

> 2891:       // We can mark whole locking region as Local only when only
> 2892:       // one object is used for locking.
> 2893:       box->set_local();

Hmm. The name of the method suggests that there are no side-effects. Not sure what would be a better alternative though.

src/hotspot/share/opto/locknode.cpp line 95:

> 93:     return old_box;
> 94:   }
> 95:   return this;

For `EliminateNestedLocks` we now never have a hash, are only equal with `this`, and don't common nodes in `Ideal` either. Is that intended?

Are there any IR tests that verify that we common nodes in the cases where we expect it to common?

-------------

PR Review: https://git.openjdk.org/jdk/pull/17697#pullrequestreview-1900547952
PR Review Comment: https://git.openjdk.org/jdk/pull/17697#discussion_r1502406866
PR Review Comment: https://git.openjdk.org/jdk/pull/17697#discussion_r1502418078
PR Review Comment: https://git.openjdk.org/jdk/pull/17697#discussion_r1502423652


More information about the hotspot-compiler-dev mailing list