RFR: 8229517: Support for optional asynchronous/buffered logging [v20]
Xin Liu
xliu at openjdk.java.net
Mon May 24 04:29:15 UTC 2021
On Mon, 24 May 2021 04:12:54 GMT, Xin Liu <xliu at openjdk.org> wrote:
>> This patch provides a buffer to store asynchrounous messages and flush them to
>> underlying files periodically.
>
> Xin Liu has updated the pull request incrementally with one additional commit since the last revision:
>
> flush() waits until all pending logging IO operations are done.
>
> This patch support all gtests in async mode.
> make test TEST="gtest:all" TEST_OPTS='VM_OPTIONS=-Xlog:async'
> > _Mailing list message from [Liu, Xin](mailto:xxinliu at amazon.com) on [hotspot-dev](mailto:hotspot-dev at mail.openjdk.java.net):_
> > hi, David,
> > I am aware of this issue. There are a few gtest testcases can't run in
> > async logging mode. The rootcause is testing code assumes logging is
> > synchronous.
> > eg. this testcase opens and reads 'TestLogFileName' right after
> > log_trace(). This is perfectly fine in sequential execution. It's unsafe
> > and TestLogFileName may haven't been written yet. fopen could fail(no
> > error check in current testing) and then fread yields NULL pointer for
> > strcmp.
> > TEST_VM_F(LogTest, prefix) {
> > set_log_config(TestLogFileName, "logging+test=trace");
> > log_trace(logging, test)(LOG_LINE_STR);
> > EXPECT_TRUE(file_contains_substring(TestLogFileName, LOG_PREFIX_STR
> > LOG_LINE_STR));
> > }
> > It's also not MT-safe to do concurrent fread() (gtest EXPECT/ASSERT)
> > while fwrite is onging(AsyncLog Thread). Random execution may happen if
> > FILE has been corrupted.
> > I manage to fix those hard crashes in gtests. It would up a non-trivial
> > patch, and it just fixed hard crashes. I reckon I give up running gtest
> > in async mode and provide an isolated and a clean room to test it.
>
> Hi Xin,
>
> I agree that these UL tests are annoying because they modify global VM state - resetting global UL options which ideally should just be set at VM startup and then never touched. gtest should not do this.
>
> I struggled with the same thing with NMT in this upcoming PR: #2921.
>
> As I wrote before, in my view, it should be possible to execute the gtest suite with any form of UL options - or VM options in general - and the gtests should at most _react_ on those options, never change them. In the case of UL, maybe conditionally execute or skip some tests if UL is on or off.
>
> In other words, I should be able to run `gtestLauncher -jdk:... -Xlog:os` and the gtests should run through successfully with these flags. And then, as a complimentary step, we execute the gtests (or selected parts of them with --gtest_filter) with selected, non-default, VM options. E.g. with UL turned on. I know you do this already for some of the Async tests, and maybe thats the way to go for all UL gtests. We do this also for Metaspace- and large-page-tests. And I will modify the NMT gtests to not modify NMT state anymore:https://github.com/openjdk/jdk/pull/2921/files#diff-c1004f2502ccc5f5340e7752264c7e0274f6a5fe4da0554d89d3bc8209801199 )
>
> Moreover, things like "log to a file and scan for expected output" are really better suited for jtreg tests anyway, which spawn a child VM with options and scan the log output. Doing this in gtests seems wrong. Jtreg gives more fine-grained control, I can `-retain` the output files, I can execute them in parallel, Problemlist them, etc. I don't have to execute test logic like reading the log output file _inside_ the logging VM.
>
> So maybe (future RFE?) we need to move part or all of the UL tests out of the gtest suite and make them jtreg tests.
>
> Cheers, Thomas
> > _Mailing list message from [Liu, Xin](mailto:xxinliu at amazon.com) on [hotspot-dev](mailto:hotspot-dev at mail.openjdk.java.net):_
> > hi, David,
> > I am aware of this issue. There are a few gtest testcases can't run in
> > async logging mode. The rootcause is testing code assumes logging is
> > synchronous.
> > eg. this testcase opens and reads 'TestLogFileName' right after
> > log_trace(). This is perfectly fine in sequential execution. It's unsafe
> > and TestLogFileName may haven't been written yet. fopen could fail(no
> > error check in current testing) and then fread yields NULL pointer for
> > strcmp.
> > TEST_VM_F(LogTest, prefix) {
> > set_log_config(TestLogFileName, "logging+test=trace");
> > log_trace(logging, test)(LOG_LINE_STR);
> > EXPECT_TRUE(file_contains_substring(TestLogFileName, LOG_PREFIX_STR
> > LOG_LINE_STR));
> > }
> > It's also not MT-safe to do concurrent fread() (gtest EXPECT/ASSERT)
> > while fwrite is onging(AsyncLog Thread). Random execution may happen if
> > FILE has been corrupted.
> > I manage to fix those hard crashes in gtests. It would up a non-trivial
> > patch, and it just fixed hard crashes. I reckon I give up running gtest
> > in async mode and provide an isolated and a clean room to test it.
>
> Hi Xin,
>
> I agree that these UL tests are annoying because they modify global VM state - resetting global UL options which ideally should just be set at VM startup and then never touched. gtest should not do this.
>
> I struggled with the same thing with NMT in this upcoming PR: #2921.
>
> As I wrote before, in my view, it should be possible to execute the gtest suite with any form of UL options - or VM options in general - and the gtests should at most _react_ on those options, never change them. In the case of UL, maybe conditionally execute or skip some tests if UL is on or off.
>
> In other words, I should be able to run `gtestLauncher -jdk:... -Xlog:os` and the gtests should run through successfully with these flags. And then, as a complimentary step, we execute the gtests (or selected parts of them with --gtest_filter) with selected, non-default, VM options. E.g. with UL turned on. I know you do this already for some of the Async tests, and maybe thats the way to go for all UL gtests. We do this also for Metaspace- and large-page-tests. And I will modify the NMT gtests to not modify NMT state anymore:https://github.com/openjdk/jdk/pull/2921/files#diff-c1004f2502ccc5f5340e7752264c7e0274f6a5fe4da0554d89d3bc8209801199 )
>
> Moreover, things like "log to a file and scan for expected output" are really better suited for jtreg tests anyway, which spawn a child VM with options and scan the log output. Doing this in gtests seems wrong. Jtreg gives more fine-grained control, I can `-retain` the output files, I can execute them in parallel, Problemlist them, etc. I don't have to execute test logic like reading the log output file _inside_ the logging VM.
>
> So maybe (future RFE?) we need to move part or all of the UL tests out of the gtest suite and make them jtreg tests.
>
> Cheers, Thomas
Hi, Thomas
Thanks for the advice. I agree. OpenJDK have invested so much on those unified logging tests. It's wasteful if I skip them in async mode. I manage to resolve the race conditions.
To support it, I strengthen "AsyncLogWriter::flush()". Now it not only flushes pending messages out of buffer, it guarantees all logs messages are seen by external. one lock _io_sem is introduced to consolidate that.
By inserting AsyncLogWriter::flush() in some places of unified logging tests, I can pass all gtests in async mode. I verified it in different builds. flush() is no-op if async log is not established.
Amazingly, I discovered a very subtle bug also. The following call can also turn off all log outputs but it will bypass `LogConfiguration::disable_outputs()`. Fixed.
LogConfiguration::parse_log_arguments(filename, "all=off", "", "", &stream);
@dholmes-ora ,
Could you verify that with TEST_OPTS='VM_OPTIONS=-Xlog:async'?
I also verified on Windows 10(WSL1). It works as expected.
-------------
PR: https://git.openjdk.java.net/jdk/pull/3135
More information about the hotspot-dev
mailing list