RFR: 8322743: assert(held_monitor_count() == jni_monitor_count()) failed [v2]

Sat Jan 20 00:00:27 UTC 2024

On Wed, 17 Jan 2024 20:20:05 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>> Corner case with a local (not escaped) object used for synchronization. C2 Escape Analysis thinks that it can eliminate locks for it. In most cases it is true but not in this case.
>> 
>> 
>>         for (int i = 0; i < 2; ++i) {
>>             Object o = new Object();
>>             synchronized (o) { // monitorenter
>>                 // Trigger OSR compilation
>>                 for (int j = 0; j < 100_000; ++j) {
>> 
>> The test has nested loop which trigger OSR compilation. The locked object comes from Interpreter into compiled OSR code. During parsing C2 creates an other non escaped object and correctly merge both together (with Phi node) so that non escaped object is not scalar replaceable. Because it does not globally escapes EA still removes locks for it and, as result, also for merged locked object from Interpreter which is the bug.
>> 
>> The fix is to check that synchronized block does not have any associated escaped objects when EA decides if locks can be eliminated.
>> 
>> Added regression test prepared by @TobiHartmann. Tested tier1-5, xcomp and stress.
>> Performance testing show no difference.
>
> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Address review comments

As I commented in bug reported, technically we can split Unlock node through Phi to separate them and try to eliminate ones related not-escaped new object. But it will not help in this case.

There are 2 modes in C2 how we handle locks/unlocks.

Before [JDK-7125896](https://bugs.openjdk.org/browse/JDK-7125896) we used `BoxLockNode` only to indicate stack slot where we store object's header (MarkWord) for heavy monitors [HotSpot/Synchronization](https://wiki.openjdk.org/display/HotSpot/Synchronization).

In that mode the same `BoxLockNode` can be used by not interfering synchronization regions even for different objects:

  synchronize(obj1) {}
  synchronize(obj2) {}

The only matter stack slot it points [locknode.cpp#L51](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/locknode.cpp#L51)

EA supports this mode and C2 looks on each locks/unlocks which reference only one object and creates new separate `BoxLockNode` (synchronization region) for them when it eliminates locks [macro.cpp#L1974](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/macro.cpp#L1974)

[JDK-7125896](https://bugs.openjdk.org/browse/JDK-7125896) and sequential fixes introduced new mode to simplify handling locks and to allow "easy" implement elimination of some nested locks which lock the same object. This is default mode (`EliminateNestedLocks` == true) since JDK 8 (and 7u4). In this mode we don't merge `BoxLockNode` nodes - each synchronization region will have separate `BoxLockNode` - one per locked object.  This assumes that we will see only on object if we trace all `Lock/Unlock` nodes which reference one `BoxLockNode`.

An other assumption is that if we have merge point during parsing (for example, diamond shape code inside synchronized region) we  can use `BoxLockNode` for the same stack slot from already processed path: [parse1.cpp#L1800](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/parse1.cpp#L1800). It was additional fix [JDK-7128355](https://bugs.openjdk.org/browse/JDK-7128355) after nested locks elimination implementation.

Based on that (all `Lock/Unlock` nodes which reference one `BoxLockNode` locks only one and the same object) in this mode it was assumed that we can eliminate all locks and unlocks if we find at least one which we can eliminate in one synchronized region (one  `BoxLockNode`) [macro.cpp#L1946](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/macro.cpp#L1946)

OSR compilation in this bug case breaks these assumptions. During parsing we merged synchronized region (one  `BoxLockNode`) with different locked object (from Interpreter). As result the assumption that we can eliminate all locks/unlocks for one region based only on one lock is incorrect.

It may be possible do something when we parse merge point but I think it is hard. What if this merge point is not at the start but somewhere later?

For me it was much easier to catch such case early during escape analysis where information about all objects is available.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17331#issuecomment-1901348194