RFR: 8229517: Support for optional asynchronous/buffered logging [v2]

Tue Mar 30 13:20:10 UTC 2021

On Tue, 30 Mar 2021 07:19:08 GMT, Yasumasa Suenaga <ysuenaga at openjdk.org> wrote:

>>> > But I agree that talking about the design first would be helpful. Maybe have a little mailing list thread to stop polluting this PR?
>>> 
>>> I posted similar diacussion to hotspot-runtime-dev last November. It aims to implement to send UL via network socket. I believe this PR helps it.
>>> 
>>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-November/043427.html
>> 
>> Interesting. This design diagram is similar to this PR, but I don't think it is a good idea to have a blocking message buffer.
>> As mentioned in prior thread, it makes hotspot be more subject to external factors. TCP/UPD is an even more representative example of blocking IO than harddrive, isn't it?
>> 
>> ### Design and its Rationale
>> For async-logging feature,  we proposed a lossy non-blocking design here.  A [bounded deque](https://github.com/openjdk/jdk/pull/3135/files#diff-5a3c326d548886f56ef0c46f4a63f7c58f76e1c51fada9a874d40d12a43f15b0R40) or ringbuffer gives a strong guarantee that log sites won't block java threads and the critical internal threads. This is the very problem we are meant to solve. 
>> 
>> 
>> It can be proven that we cannot have all three guarantees at the same time: **non-blocking**, **bounded memory** and **log fidelity**.  To overcome blocking I/O, which sometimes is not under our control, we think it's fair to trade log fidelity for non-blocking.  If we kept fidelity and chose unbound buffer, we could end up with some spooky out-of-memory errors on some resource-constrained hardwares.  We understand that the platforms hotspot running range from powerful servers to embedded devices. By leaving the buffer size adjustable, we can fit more scenarios.  Nevertheless,  with a bounded buffer, we believe developers can still capture important logging traits as long as the window is big enough and log messages are consecutive.  The current implementation does provide those two merits. 
>> 
>> ### A new proposal based on current implementation 
>> I agree with reviewers' comments above.  It's questionable to use the singleton `WatcherThread` to do IO-intensive job here. It may hinder other tasks.  David's guess is right. I was not familiar with hotspot thread and quite frustrated to deal with a special-task thread's lifecycle. That why I used PeriodicTask. I feel more confident to take that challenge again. 
>> 
>> Just like Yasumasa [depicted](https://gist.github.com/YaSuenag/dacb6d94d8684915422232c7a08d5b5d), I can create a dedicated NonJavaThread to flush logs instead. Yesterday, I found `WatcherThread::unpark()` uses its monitor to wake up other pending tasks.  I think we can implement in this way. Once log sites observe the buffer is half-full, it uses `monitor::notify()` to resume flusher thread. I think logging event is high-frequent but less contentious. Waking it up for each log message is not so economical. I have a lossy buffer anyway, so I suggest to have two checkpoints only: 1) half-full. 2) full. 
>> 
>> ### Wrap it up
>> We would like to propose a lossy design of async-logging in this PR.  It is a trade off, so I don't think it's a good idea to handle all logs in async mode.  In practice, we hope people only choose `async-logging` for those logs which may happen at safepoints. 
>> 
>> I understand Yasumasa's problem.  If you would like to consider netcat or nfs/sshfs, I think your problem can still be solved in the existing file-based output. In this way, you can also utilize this feature by setting your "file" output async mode, then it makes your hotspot non-blocking over TCP as well.
>
> This proposal mostly looks good to me, but it is better if async support is implement in higher level class.
> LogFileOutput class inherited as following, and you modified LogFileOutput now (you might change LogFileStreamOutput later)
> 
> * LogOutput
>     * LogFileStreamOutput
>         * LogFileOutput
> 
> I want to add async support to LogFileStreamOutput or LogFileStreamOutput because it helps us if we add other UL output (e.g. network socket) in future.

Hi Xin, regrading the VM thread blocking on logs.

If you instead use two arrays, one active and one for flushing, you can swap them with atomic stores from the flushing thread.
And use GlobalCounter::write_synchronize(); to make sure no writer is still using the swapped out array for logging.

The logging thread would use GlobalCounter::critical_section_begin(), atomic inc position to get the spot in the array for the log, store the log and then GlobalCounter::critical_section_end().

That way you will never block a logging thread with the flushing and run enqueues in parallel.

If you want really want smooth logging you could also remove the strdup, since it may cause a syscall to "break in" more memory.
To solve that you could use the arrays as memory instead, and do bump the pointer allocation with an atomic increment to size needed instead of position.

I tested a bit locally generally I don't think there is an issue with blocking the VM thread on flushing.
So I'm not really that concern about this, but it's always nice to have an algorithm which is constant time instead. (Neither CS begin()/end() or atomic inc can fail or loop on x86)

-------------

PR: https://git.openjdk.java.net/jdk/pull/3135