RFR: 8229517: Support for optional asynchronous/buffered logging
Xin Liu
xliu at openjdk.java.net
Tue Mar 23 18:24:39 UTC 2021
On Mon, 22 Mar 2021 22:12:14 GMT, Xin Liu <xliu at openjdk.org> wrote:
> This patch provides a buffer to store asynchrounous messages and flush them to
> underlying files periodically.
Hi, Reviewers,
I would like to restart the RFR process for the feature async logging. We (AWS) have deployed this feature over a year in a few critical services. It helps us to reduce long-tail GC pauses. On Linux, we used to experience intermittent second-level delays due to gclog writings. If those undesirable events happen to appear at safepoints, hotspot has to prolong the pause intervals, which then increase the response time of Java application/service.
Originally, we observed and solved this issue on a Linux system with software RAID. In absence of hardware assistance, multiple writes have to be synchronized and it is that operation that yields long pauses. This issue may become more prevalent if Linux servers adopt ZFS in the future. We don’t think redirecting log files to tmpfs is a final solution. Hotspot should provide a self-contained and cross-platform solution. **Our solution is to provide a buffer and flush it in a standalone thread periodically.**
Since then, we found more unexpected but interesting scenarios. e.g. some cloud-based applications run entirely on a AWS EBS partition. syscall `write` could be a blocking operation if the underlying infrastructure is experiencing an intermittent issue. Even stdout/stderr are not absolutely non-blocking. Someone may send `XOFF` of software flow control and pause the tty to read. As a result, the JVM process which is emitting logs to the tty is blocked then. Yes, that action may freeze the Java service accidentally!
Those pain points are not AWS-exclusive. We found relevant questions on stackoverflow[1] and it seems that J9 provides an option `-Xgc:bufferedLogging` to mitigate it[2]. We hope hotspot would consider our feature.
Back to implementation, this is the 2nd revision based on Unified logging. Previous RFR[3] was a top-down design. We provide a parallel header file `asynclog.hpp` and hope log-sites opt in. That design is very stiff because asynclog.hpp is full of template parameters and was turned down[4]. The new patch has deprecated the old design and achieved asynchronous logging in bottom-up way. We provide an output-option which conforms to JEP-158[5]. Developers can choose asynchronous mode for a file-based output by providing an extra option **async=true**. e.g. `-Xlog:gc=debug:file=gc.log::async=true`
May we know more about LogMessageBuffer.hpp/cpp? We haven’t found a real use of it. That’s why we are hesitating to support LogFileOutput::write(LogMessageBuffer::Iterator msg_iterator). Further, we haven’t supported async_mode for LogStdoutOutput and LogStderrOutput either. It’s not difficult but needs to big code change.
[1] https://stackoverflow.com/questions/27072340/is-gc-log-writing-asynchronous-safe-to-put-gc-log-on-nfs-mount
[2] https://stackoverflow.com/questions/54994943/is-openj9-gc-log-asynchronous
[3] https://cr.openjdk.java.net/~xliu/8229517/01/webrev/
[4] https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-March/041034.html
[5] https://openjdk.java.net/jeps/158
-------------
PR: https://git.openjdk.java.net/jdk/pull/3135
More information about the hotspot-dev
mailing list