RFR: JDK-8319104: GtestWrapper crashes with SIGILL in AsyncLogTest::test_asynclog_raw on AIX opt
Joachim Kern
jkern at openjdk.org
Wed Nov 8 09:49:57 UTC 2023
On Tue, 7 Nov 2023 08:38:10 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>> Currently gtest/GTestWrapper.java crashes on AIX in the opt build.
>>
>> SIGILL (0x4) at pc=0x0000000000000000, pid=15598006, tid=258
>>
>> JRE version: OpenJDK Runtime Environment (22.0) (build 22-internal-adhoc.openjdk.jdk-dev)
>> Java VM: OpenJDK 64-Bit Server VM (22-internal-adhoc.openjdk.jdk-dev, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, aix-ppc64)
>> Problematic frame:
>> V [libjvm.so+0x42c9c] LogTagSet::vwrite(LogLevel::type, char const*, char*)+0x104
>>
>> ### The reason for the SIGILL is the following:
>> The _AsyncLogTest.asynclog_vm_ test iterates over the previously registered entries of _LogOutputList_. There it gets a Nullpointer, which it tries to use for a function call `(nullpointer)->write(...);` This results in an SIGILL.
>> The first reason for this is, that the LogTagSet tests (which seem to be called beforehand only on AIX) creates such an entry with
>> `ts.set_output_level(LogConfiguration::StdoutLog, LogLevel::Info);`
>> The first flaw is that this entry is not removed again at the end of the test. So the VM is modified and would behave differently depending on the sequence of the test execution.
>> The second flaw is that the registered _LogConfiguration::StdoutLog_ in the call above is still a nullpointer due to the fact that on AIX no VM was created before the LogTagSet tests and only a creation of a VM initializes the logConfiguration. So again the whole Gtestwrapper testsuite depends on the executable sequence of the tests. If no other test creates a vm beforehand the LogTagSet tests will fail. Again this has to be fixed by let the LogTagSet tests themself creating a VM.
>>
>> ### Solution to fix all errors:
>>
>> 1. Make the 1. LogTagSet test a TEST_VM test.
>> 2. Clean up the _LogTagSet.is_level_ and _LogTagSet.level_for_ tests with a final `ts.set_output_level(LogConfiguration::StdoutLog, LogLevel::Off);` which will remove the entry from the LogOutputList again.
>
> Hi Joachim,
>
> sorry for some more questions, but I'd like to understand the problem better. I looked at your explanation and the crash stack in the JBS issue. I want to rule out that we don't hide a bigger problem somewhere. AsyncLogger is tricky, and there may be hidden bugs.
>
>> The AsyncLogTest.asynclog_vm test iterates over the previously registered entries of LogOutputList. There it gets a Nullpointer, which it tries to use for a function call (nullpointer)->write(...); This results in an SIGILL.
>
> At what point do we actually crash in AsyncLogTest? What do we access?
>
>> The first reason for this is, that the LogTagSet tests (which seem to be called beforehand only on AIX) creates such an entry with
> ts.set_output_level(LogConfiguration::StdoutLog, LogLevel::Info);
>
> How would the existence of such an entry cause a crash?
>
> Or is the problem that these tests are not VM tests and happen to run before VM initialization? But in that case, `LogConfiguration::StdoutLog` is NULL and the underlying code handles that and does nothing. This would make the test fail, but it should also not crash.
>
> Your patch does two things: make "LogTagSet.defaults" a VM test, and switch logging off at the end of two other tests. None of these changes should prevent a crash. The changes are correct in that they make the tests more reliable. But nothing should crash.
>
> ---
>
> Zooming out, these tests certainly have a problem with modifying global log state.
>
> At least the following tests seem all to modify global state too:
>
> TEST(LogOutputList, is_level_single_output) {
> TEST(LogOutputList, is_level_multiple_outputs) {
> TEST(LogOutputList, level_for) {
> TEST(LogTagSet, has_output) {
> TEST_VM_F(LogConfigurationTest, configure_stdout) {
> TEST_VM_F(LogConfigurationTest, subscribe) {
>
> a very simple solution would be to just run them all as TEST_OTHER_VM. Unless that costs too much performance.
>
> Cheers, Thomas
>
> P.S.
>
> BTW, you can shake out tests that depend on order of runtime by using `gtest_shuffle`, e.g.:
>
> `./hotspot/variant-server/libjvm/gtest/gtestLauncher -jdk:./images/jdk --gtest_filter=Log* --gtest_shuffle`
@tstuefe : I would like to integrate the PR today. Please contact me by phone if you still have objections.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/16479#issuecomment-1801448258
More information about the hotspot-runtime-dev
mailing list