RFR: 8306922: IR verification fails because IR dump is chopped up [v3]

Tobias Hartmann thartmann at openjdk.org
Fri Jun 23 08:11:06 UTC 2023


On Fri, 23 Jun 2023 07:57:21 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> Some IR tests were failing because the logs were messed up (badly interleaved), due to multiple threads writing at the same time.
>> 
>> The issue was in `Compile::print_ideal_ir`. While we did lock the `tty`, this lock was broken because the node dumping sometimes calls into the VM, which can trigger Safepoints, which in turn can explicitly break the locks, with `ttyLocker::break_tty_lock_for_safepoint`.
>> 
>> What good is a lock if it can be broken? Not much. Though it is only broken if there is a Safepoint, so if we can avoid safepointing while holding the lock we should be safe. To verify that, we can write:
>> 
>> 
>>   NoSafepointVerifier nsv;
>>   ttyLocker ttyl;
>> 
>> 
>> The verifier triggered immediately, as expected.
>> And I now gather the whole dump into a separate `stringStream` first, so that all the Safepoints can happen before I even acquire the lock. Then, I acquire the log and just print the `stringStream` to `tty / xtty`, without triggering any Safepoint.
>> 
>> We still need to acquire a lock, else other threads may be printing also, and we get a bad interleaving in the locks.
>> 
>> I had to enable `dump_bfs` to print to an arbitrary stream, instead of just `tty`.
>> 
>> I un-problemlisted `compiler/c2/irTests/TestVectorConditionalMove.java`, and fixed a CompileCommand issue for it.
>> 
>> **Follow-up work**
>> 
>> [JDK-8310712](https://bugs.openjdk.org/browse/JDK-8310712) C2: check for broken tty locks due to SafePoint
>> We should in general go through all uses of `ttyLocker`, and add a `NoSafepointVerifier` to ensure no such lock is broken. Or maybe we just build the verifier into the ttyLocker?
>> 
>> [JDK-8310711](https://bugs.openjdk.org/browse/JDK-8310711) IR Framework: remove safepoint while printing handling
>> 
>> **Testing**
>> 
>> I ran `TestVectorConditionalMove.java` 1000x on master, just with the CompileCommand issue fixed. It triggered an IR issue 5 times (hit rate `0.005`). With the fix, it never triggers (`0.995^1000 = 0.007`). I also ran up to tier6 and stress-testing.
>
> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision:
> 
>  - Merge branch 'master' into JDK-8306922
>  - removed unnecessary flags from TestVectorConditionalMove.java
>  - Apply suggestions from code review
>    
>    Co-authored-by: Christian Hagedorn <christian.hagedorn at oracle.com>
>  - renamed _st to _output
>  - make sure CloneMap info is also printed to stream, and not to tty
>  - add locker again, and refactor
>  - write to xtty or tty
>  - add tty locker again
>  - buffer xtty and tty
>  - 8306922: IR verification fails because IR dump is chopped up

Looks good to me otherwise.

> Maybe we should in general go through all uses of ttyLocker, and add a NoSafepointVerifier to ensure no such lock is broken? Or maybe we just build the verifier into the ttyLocker? Should I file an RFE?

Yes, I think it would be good to file an RFE to investigate the impact of adding a NoSafepointVerifier to the ttyLocker.

src/hotspot/share/opto/compile.cpp line 561:

> 559:   // Node dumping can cause a safepoint, which can break the tty lock.
> 560:   // Buffer all node dumps, so that all safepoints happen before we lock.
> 561:   stringStream ss;

Should we add a `ResourceMark rm;`?

src/hotspot/share/opto/output.cpp line 2072:

> 2070:   if (C->trace_opto_output()) {
> 2071:     // Buffer and print all at once
> 2072:     stringStream ss;

Should we add a `ResourceMark rm;`?

-------------

Marked as reviewed by thartmann (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/14591#pullrequestreview-1494665517
PR Review Comment: https://git.openjdk.org/jdk/pull/14591#discussion_r1239506027
PR Review Comment: https://git.openjdk.org/jdk/pull/14591#discussion_r1239506853


More information about the hotspot-compiler-dev mailing list