RFR: 8306922: IR verification fails because IR dump is chopped up
Christian Hagedorn
chagedorn at openjdk.org
Fri Jun 23 07:41:06 UTC 2023
On Wed, 21 Jun 2023 14:43:18 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
> Some IR tests were failing because the logs were messed up (badly interleaved), due to multiple threads writing at the same time.
>
> The issue was in `Compile::print_ideal_ir`. While we did lock the `tty`, this lock was broken because the node dumping sometimes calls into the VM, which can trigger Safepoints, which in turn can explicitly break the locks, with `ttyLocker::break_tty_lock_for_safepoint`.
>
> What good is a lock if it can be broken? Not much. Though it is only broken if there is a Safepoint, so if we can avoid safepointing while holding the lock we should be safe. To verify that, we can write:
>
>
> NoSafepointVerifier nsv;
> ttyLocker ttyl;
>
>
> The verifier triggered immediately, as expected.
> And I now gather the whole dump into a separate `stringStream` first, so that all the Safepoints can happen before I even acquire the lock. Then, I acquire the log and just print the `stringStream` to `tty / xtty`, without triggering any Safepoint.
>
> We still need to acquire a lock, else other threads may be printing also, and we get a bad interleaving in the locks.
>
> I had to enable `dump_bfs` to print to an arbitrary stream, instead of just `tty`.
>
> I un-problemlisted `compiler/c2/irTests/TestVectorConditionalMove.java`, and fixed a CompileCommand issue for it.
>
> **Question**
>
> Maybe we should in general go through all uses of `ttyLocker`, and add a `NoSafepointVerifier` to ensure no such lock is broken? Or maybe we just build the verifier into the ttyLocker? Should I file an RFE?
>
> **Testing**
>
> I ran `TestVectorConditionalMove.java` 1000x on master, just with the CompileCommand issue fixed. It triggered an IR issue 5 times (hit rate `0.005`). With the fix, it never triggers (`0.995^1000 = 0.007`). I also ran up to tier6 and stress-testing.
Looks good, thanks for cleaning this up!
I think it's a good idea to also check the other usages of `ttyLocker` separately.
Can you also file an RFE to remove the "safepoint while printing handling" in the IR framework? This is no longer needed with this patch but would exceed the scope of this fix. You can assign that to me.
src/hotspot/share/opto/compile.cpp line 559:
> 557: // be sure to tag this tty output with the compile ID.
> 558:
> 559: // Node dumping can cause a safepoint, which can break the ttyLocker.
Suggestion:
// Node dumping can cause a safepoint, which can break the tty lock.
src/hotspot/share/opto/output.cpp line 2084:
> 2082: void PhaseOutput::print_scheduling() {
> 2083: print_scheduling(tty);
> 2084: }
New line:
Suggestion:
}
test/hotspot/jtreg/compiler/c2/irTests/TestVectorConditionalMove.java line 49:
> 47: "-XX:+UseCMoveUnconditionally",
> 48: "-XX:+UseVectorCmov",
> 49: "-XX:CompileCommand=compileonly,compiler.c2.irTests.TestVectorConditionalMove::test*");
`compileonly` is probably not necessary but you can leave it as such.
-------------
Marked as reviewed by chagedorn (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14591#pullrequestreview-1494608279
PR Review Comment: https://git.openjdk.org/jdk/pull/14591#discussion_r1239467929
PR Review Comment: https://git.openjdk.org/jdk/pull/14591#discussion_r1239471742
PR Review Comment: https://git.openjdk.org/jdk/pull/14591#discussion_r1239473229
More information about the hotspot-compiler-dev
mailing list