RFR: 8229517: Support for optional asynchronous/buffered logging [v20]

Xin Liu xliu at openjdk.java.net
Mon May 24 08:27:53 UTC 2021


On Mon, 24 May 2021 04:12:54 GMT, Xin Liu <xliu at openjdk.org> wrote:

>> This patch provides a buffer to store asynchrounous messages and flush them to
>> underlying files periodically.
>
> Xin Liu has updated the pull request incrementally with one additional commit since the last revision:
> 
>   flush() waits until all pending logging IO operations are done.
>   
>   This patch support all gtests in async mode.
>   make test TEST="gtest:all" TEST_OPTS='VM_OPTIONS=-Xlog:async'

Your summary about _sem and _lock are correct.  Now let's discuss on _io_sem. introduced in https://github.com/openjdk/jdk/pull/3135/commits/9800a8ccebb4490c9aeca781bc8155149245d308

> IIUC what you are trying to deal with is a case where the log writing
thread has emptied the buffer and as is busy writing that out, and other
log messages then get enqueued, but then we abort/terminate and we call
flush() in the aborting/terminating thread. 

Yes. Another scenario is to hard flush logs so gtest can see them. 

>  So now we would have two threads trying to write to the log outputs and these different logging
messages could get interleaved in a confusing way;
 
interleave? No. _io_sem is a Dijkstra semaphore whose value is 1.  you can imagine that _io_sem denotes the single IO channel in your system. Everybody can do operation P (aka. _io_sem.wait()) but only one can successfully return. Another has to wait because the value of _io_sem is zero. Yes, there's a race condition, but it doesn't matter. winner does the I/O jobs first. It will do operation V (_io_sem.notify()) in its completion and it will wake up the other thread.  flush() first empties the buffer. When it returns, it can guarantee what it dequeued are written. 

Yes, it's possible that first batch of popped messages are written later.  This only happens when a man explicitly "flush" it.  Normally, only log writer thread does the IO. 

> First, as noted, making the aborting thread wait() could be very
problematic for the process of aborting. 

In no content scenario, _io_sem.wait() of aborting thread should return immediately because its value is 1. 
If log write thread is writing, I think it's okay to wait for its 'turn'. That the semantics of "flushing":  make sure all logs are written to the filesystem.  

> Second, there is still a race as to which thread will get the io_sem
first and so the output could still be presented in the wrong order.

Yes, it may flip the order. I think it's still acceptable.  When JVM is aborting,  I manage to dump everything I am holding. 
The only compromise the batch order may be flip and it only happens in extreme condition.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3135


More information about the hotspot-dev mailing list