RFR: 8229517: Support for optional asynchronous/buffered logging [v2]

Tue Mar 30 06:18:59 UTC 2021

On Mon, 29 Mar 2021 12:25:13 GMT, Yasumasa Suenaga <ysuenaga at openjdk.org> wrote:

> > But I agree that talking about the design first would be helpful. Maybe have a little mailing list thread to stop polluting this PR?
> 
> I posted a similar discussion to hotspot-runtime-dev last November. It aims to send UL output via a network socket. I believe this PR helps with that.
> 
> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-November/043427.html

Interesting. The design diagram there is similar to this PR, but I don't think a blocking message buffer is a good idea.
As mentioned in the prior thread, it makes HotSpot more subject to external factors. TCP/UDP is an even more representative example of blocking I/O than a hard drive, isn't it?

### Design and its Rationale
For the async-logging feature, we propose a lossy, non-blocking design here. A [bounded deque](https://github.com/openjdk/jdk/pull/3135/files#diff-5a3c326d548886f56ef0c46f4a63f7c58f76e1c51fada9a874d40d12a43f15b0R40) or ring buffer gives a strong guarantee that log sites won't block Java threads or critical internal threads. This is the very problem we set out to solve.

It can be proven that we cannot have all three guarantees at the same time: **non-blocking**, **bounded memory** and **log fidelity**. If the output stalls while log sites keep producing messages, something has to give: the log site waits, the buffer grows, or messages are dropped. To overcome blocking I/O, which is sometimes not under our control, we think it's fair to trade log fidelity for non-blocking. If we kept fidelity and chose an unbounded buffer, we could end up with spooky out-of-memory errors on resource-constrained hardware. We understand that the platforms HotSpot runs on range from powerful servers to embedded devices. By leaving the buffer size adjustable, we can fit more scenarios. Nevertheless, with a bounded buffer, we believe developers can still capture important logging traits as long as the window is big enough and the log messages are consecutive. The current implementation does provide those two attributes.
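To make the trade-off concrete, here is a minimal sketch of a bounded, lossy buffer. It is not the code in this PR: the class name `LossyLogBuffer`, the drop-oldest policy and the `dropped()` counter are assumptions for illustration, and synchronization is left out on purpose to keep the focus on the bounded-memory and lossy behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical sketch, not the PR's class: a bounded, lossy message buffer.
// Single-threaded on purpose; a real log site would need synchronization.
class LossyLogBuffer {
 public:
  explicit LossyLogBuffer(std::size_t capacity) : _capacity(capacity) {
    assert(capacity > 0 && "capacity must be positive");
  }

  // Never waits for space. When the buffer is full, the oldest message is
  // discarded (assumed policy), so the retained window stays consecutive.
  void enqueue(std::string msg) {
    if (_buf.size() == _capacity) {
      _buf.pop_front();
      ++_dropped;
    }
    _buf.push_back(std::move(msg));
  }

  // Removes the oldest message into 'out'; returns false if the buffer is empty.
  bool dequeue(std::string& out) {
    if (_buf.empty()) return false;
    out = std::move(_buf.front());
    _buf.pop_front();
    return true;
  }

  std::size_t size() const { return _buf.size(); }
  std::size_t dropped() const { return _dropped; }

 private:
  const std::size_t _capacity;       // upper bound on buffered messages
  std::deque<std::string> _buf;      // oldest message at the front
  std::size_t _dropped = 0;          // messages lost to overflow
};
```

Memory stays bounded by `capacity`, and `enqueue` never waits on the consumer; the price is that `dropped()` can become non-zero when the consumer falls behind.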

### A new proposal based on the current implementation
I agree with the reviewers' comments above. It's questionable to use the singleton `WatcherThread` for an IO-intensive job here; it may hinder other tasks. David's guess is right: I was not familiar with HotSpot threads and was quite frustrated dealing with the lifecycle of a special-task thread. That's why I used PeriodicTask. I now feel more confident about taking on that challenge again.

Just like Yasumasa [depicted](https://gist.github.com/YaSuenag/dacb6d94d8684915422232c7a08d5b5d), I can create a dedicated NonJavaThread to flush logs instead. Yesterday, I found that `WatcherThread::unpark()` uses its monitor to wake up other pending tasks. I think we can implement it the same way: once a log site observes that the buffer is half-full, it uses `monitor::notify()` to wake up the flusher thread. I think logging events are frequent but not very contentious, so waking the flusher up for every log message is not economical. I have a lossy buffer anyway, so I suggest having only two checkpoints: 1) half-full and 2) full.
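The wake-up scheme could look roughly like the sketch below. It is only an illustration in standard C++, not HotSpot code: `std::mutex` and `std::condition_variable` stand in for a HotSpot monitor, `std::thread` stands in for a NonJavaThread, the class name `AsyncFlusher` is hypothetical, and a `std::vector` sink stands in for the real file output. A real flusher would probably also wake up on a timeout so that a buffer that never reaches half-full still gets written.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical sketch: a dedicated flusher thread that is notified only at
// the half-full and full checkpoints. The caller reads 'sink' only after
// this object has been destroyed (the destructor joins the thread).
class AsyncFlusher {
 public:
  AsyncFlusher(std::size_t capacity, std::vector<std::string>& sink)
      : _capacity(capacity), _sink(sink) {
    assert(capacity > 0 && "capacity must be positive");
    _thread = std::thread([this] { run(); });
  }

  // Asks the flusher to stop, then waits until the remaining messages are drained.
  ~AsyncFlusher() {
    {
      std::lock_guard<std::mutex> lock(_mutex);
      _stop = true;
    }
    _cv.notify_one();
    _thread.join();
  }

  // Log site: short critical section, never waits for the flusher or for I/O.
  void enqueue(std::string msg) {
    bool wake = false;
    {
      std::lock_guard<std::mutex> lock(_mutex);
      if (_buf.size() == _capacity) {  // full: drop the oldest message
        _buf.pop_front();
        ++_dropped;
      }
      _buf.push_back(std::move(msg));
      const std::size_t n = _buf.size();
      wake = (n == half()) || (n == _capacity);  // the two checkpoints
    }
    if (wake) _cv.notify_one();
  }

 private:
  std::size_t half() const { return (_capacity + 1) / 2; }

  void run() {
    std::unique_lock<std::mutex> lock(_mutex);
    while (true) {
      _cv.wait(lock, [this] { return _stop || _buf.size() >= half(); });
      std::deque<std::string> batch;
      batch.swap(_buf);                // take everything in one step
      const bool stopping = _stop;
      lock.unlock();                   // do the slow part without the lock
      for (auto& m : batch) _sink.push_back(std::move(m));
      lock.lock();
      if (stopping) return;
    }
  }

  const std::size_t _capacity;
  std::vector<std::string>& _sink;     // stand-in for the real output
  std::mutex _mutex;
  std::condition_variable _cv;
  std::deque<std::string> _buf;
  std::size_t _dropped = 0;            // messages lost to overflow
  bool _stop = false;
  std::thread _thread;
};
```

Because the flusher swaps the whole buffer out while holding the lock and writes after releasing it, log sites contend only for the brief enqueue step, never for the output itself.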

### Wrap it up
We would like to propose a lossy design for async logging in this PR. It is a trade-off, so I don't think it's a good idea to handle all logs in async mode. In practice, we hope people only choose `async-logging` for those logs that may really happen at safepoints.

I understand Yasumasa's problem. If you would like to consider netcat or NFS/sshfs, I think your problem can still be solved with the existing file-based output. You can then also use this feature by setting your "file" output to async mode, which makes HotSpot non-blocking over TCP as well.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3135