RFR: 8229517: Support for optional asynchronous/buffered logging
David Holmes
david.holmes at oracle.com
Thu May 27 05:34:06 UTC 2021
Hi Xin,
On 27/05/2021 4:15 am, Xin Liu wrote:
> On Wed, 26 May 2021 05:41:52 GMT, David Holmes <david.holmes at oracle.com> wrote:
>
>> So again prior to this the real problem was a race between final logging
> output and VM termination, that could result in either crashes or lost
> logs messages, which in turn could cause test failures.
>
> For `LogConfiguration::disable_outputs`, it's not just for JVM termination.
> When you send a dcmd `VM.log disable` to a running JVM, it also invokes it.
Thanks for highlighting that - yes that does make things additionally
complicated. So this one mechanism is addressing both cases and so
avoids the need for the additional termination protocol.
Thanks for bearing with me on this.
David
-----
> That bug caused JVM may crashes AND lost messages.
> If `flush()` can't guarantee all dequeued messages have been written before returning, we may end up using broken pointers.
>
>
> void LogConfiguration::disable_outputs() {
> size_t idx = _n_outputs;
>
> // Remove all outputs from all tagsets.
> for (LogTagSet* ts = LogTagSet::first(); ts != NULL; ts = ts->next()) {
> ts->disable_outputs();
> }
>
> // Handle jcmd VM.log disable
> // ts->disable_outputs() above has deleted output_list with RCU synchronization.
> // Therefore, no new logging entry can enter AsyncLog buffer for the time being.
> // flush pending entries before LogOutput instances die.
> AsyncLogWriter::flush();
>
> while (idx > 0) {
> LogOutput* out = _outputs[--idx];
> // Delete the output unless stdout or stderr (idx 0 or 1)
> if (idx > 1) {
> delete_output(idx);
> } else {
> out->set_config_string("all=off");
> }
> }
> }
>
> Those LogOutput instances are destroyed right after flush().
> Each AsyncLogMessage instance contains a reference of LogFileOutput. LogWriter thread may use those deleted pointers!
> that's the root cause you saw that intermittent crashes.
>
> ```c++
> e->output()->write_blocking(e->decorations(), msg);
>
>
>> At one stage I believe you had intended to terminate the async writer
> thread, and that would have fixed the problem as well - assuming you
> handled the termination properly.
>
> Yes, I used to have a termination protocol. It was also very complex because it must be MT-safe, no message lost and no concurrency with synchronous logging. Thomas disliked that and suggested me to remove the termination protocol(100+ line).
>
> AsyncLogWriter::terminate() was placed before `LogConfiguration::finalize()`, therefore, it hid the problem in JVM termination.
> It can't help for the dcmd situation, which we still need strong `flush()`.The second use of strong `flush()` is for gtest. Without it, it's impossible to have a reliable way to check the results of async logging.
>
> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/3135
>
More information about the hotspot-dev
mailing list