RFR: 8229517: Support for optional asynchronous/buffered logging

Fri Mar 26 23:25:31 UTC 2021

On Fri, 26 Mar 2021 10:02:37 GMT, Volker Simonis <simonis at openjdk.org> wrote:

>> This patch provides a buffer to store asynchrounous messages and flush them to
>> underlying files periodically.
>
> Hi Xin,
> 
> thanks for finally addressing this issue. In general your change looks good. Please find my detailed comments inline.
> 
> Best regards,
> Volker

hi, Thomas, 

Thank you for reviewing this PR. 

> Hi Xin,
> 
> I skimmed over the patch, but have a number of high level questions - things which have not been clear from your description.
> 
> * Who does the writing, and who is affected when the writing stalls?

The WatchThread eventually flushes those buffered messages.  if the writing stalls, it blocks periodic tasks. 
It blocks long enough,  other periodic tasks are skipped. 

> * Do you then block or throw output away?
>   
>   * If the former, how do you mitigate the ripple effect?
>   * If the latter, how does the reader of the log file know that something is missing?

The capacity of buffer is limited, which is `AsyncLogBufferSize` (2k by default). 
Actually, logTagSet.cpp limits the maximal length of a vwrite is 512. That means that maximal memory used by this buffer is 1M (=2k * 0.5k). 

If the buffer overflows, it starts dropping the heads. this behavior simulates a ringbuffer.
If you enable `-XX:+Verbose`, the dropping message will be printed to the tty console. 

I prefer to drop messages than keeping them growing because later may trigger out-of-memory error. 

> * How often do you flush? How do you prevent missing output in the log file in case of crashes?

The interval is defined by `LogAsyncInterval` (300ms by default). I insert a statement `async->flusher()` in `ostream_abort()`.

> * Can this really the full brunt of logging (-Xlog:*=trace) over many threads?
to be honest, it can't. I see a lot of dropping messages on console with -XX:+Verbose.

I have tuned parameters that it won't drop messages easily for normal GC activity with info verbosity. 
`-Xlog:*=trace` will drop messages indeed, but this is tunable. I have a [stress test](https://github.com/navyxliu/JavaGCworkload/blob/master/runJavaUL.sh) to tweak parameters. 

> * Does this work with multiple target and multiple IO files?

Yes, it works if you have multiple outputs. `LogAsyncFlusher` is singleton. one single buffer and one thread serve them all.

> * Does it cost anything if logging is off or not async?
>
so far, LogAsyncFlusher as a periodic task remains active even no output is in async_mode. 
it wakes up every `LogAsyncInterval` ms. it's a dummy task because the deque is always empty.  the cost is almost nothing. 

> Update: Okay, I see you use PeriodicTask and the WatcherThread. Is this really enough? I would be concerned that it either runs too rarely to be able to swallow all output or that it runs that often that it monopolizes the WatcherThread.
> 
> I actually expected a separate Thread - or multiple, one per output - for this, waking up when there is something to write. That would also be more efficient than constant periodic polling.
> 

You concern is reasonable. I don't understand why there is only one watchThread and up to 10 periodic tasks are crowded in it. 
If it's a bottleneck, I plan to improve this infrastructure.  I can make hotspot supports multiple watcher threads and spread periodic tasks among them.  All watcher threads are connected using linked list to manage.  

Can we treat it as a separated task?  for normal usage, I think the delay is quite managed.  Writing thousands of lines to a file usually can be done in sub-ms. 

> * How is the performance impact when we have lots of concurrent writes from many threads? I see that you use a Mutex to synchronize the logging threads with the flush service. Before, these threads would have done concurrent IO and that would be handled by the libc, potentially without locking.

IMHO, logging shouldn't hurt performance a lot. At least, those do impact on performance are not supposed to enable by default. On the other side,  I hope logging messages from other threads avoid from interweaving when I enable them to read. 
That leads me to use mutex. That actually improves readability.  

My design target is non-blocking.  pop_all() is an ad-hoc operation which pop up all elements and release the mutex immediately. writeback() does IO without it.

In our real applications, we haven't seen this feature downgrade GC performance yet. 
> 
> I think this feature could be useful. I am a bit concerned with the increased complexity this brings. UL is already a very (I think unnecessarily) complex codebase. Maybe we should try to reduce its complexity first before adding new features to it. This is just my opinion, lets see what others think.
> 
> Cheers, Thomas

I believe UL has its own reasons.  In my defense, I don't make UL more complex. I only changed a couple of lines in one of its implementation file(logFileOutput.cpp) and didn't change its interfaces. 

I try my best to reuse existing codebase. We can always refactor existing code([JDK-8239066](https://bugs.openjdk.java.net/browse/JDK-8239066), [JDK-8263840](https://bugs.openjdk.java.net/browse/JDK-8263840)), but it's not this PR's purpose. 

thanks,
--lx

-------------

PR: https://git.openjdk.java.net/jdk/pull/3135