RFR(L): JDK-8229517: Support for optional asynchronous/buffered logging

Tue Mar 10 07:06:14 UTC 2020

Hi, David, 

Thanks for commenting the RFR.

Indeed, LogAsyncFlusher needs to wait async loggers to release the RW mutex, but please note that LogAsyncFlusher releases the RW mutex before it traps into filesystem writing. It's not a blocker( refer to LogAsyncFlusher::task). 
Ie. The current implementation is blocking-free but not lock-free.  I mentioned it in early email.  I don't want to make thing complicated in early stage.  Lock-free FIFO requires extensive testing.  

For a FIFO queue, there's a broad design space for it.  If we only choose selective logging sites and offload them to the async buffer, it won't be contended situation.  If it turns out to be a contented lock, we could change it to spinlock, priority lock or even upgrade the FIFO to lock-free implementation.

LogAsyncFlusher is singleton. Its constructor is private.  Currently, I manually initialize and delete the only instance of it. It's very subtle because there's timing issue alongside of the initialization JVM. 
If we enables C++11 in the future, we can have Meyer's singleton idiom.  It will be more elegant because the instance can be lazy-initialized. 

Speaking of tagset, I think it's advantage of Unified logging framework. The framework separates frond-end and backend, interface and implementation. I love it and make use of it. 
A concrete class called 'LogTagSet' represents a set of tags.  They are all singleton and created by LogTagSetMapping. How to emit log messages is defined in LogTagSet.  They are even configurable in the runtime. 
<log.hpp> and <asynclog.hpp> are just two set of compatible interfaces. Under the hook, they shared the same LogTagset infrastructure. 

Yes, developers have to choose go conventional log or asynclog in build-time. I think developers have insights of their code which may cause long latency from logging. 
In order to have drop-in replacement api, I didn't go the direction that provide a new tag for async buffer.  So far, tags are all component names of hotspot. 

Thanks,
--lx

On 3/8/20, 10:34 PM, "David Holmes" <david.holmes at oracle.com> wrote:

    CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.

    Hi,

    On 5/03/2020 8:04 pm, Liu, Xin wrote:
    > Hi, hotspot reviewers,
    >
    > Could you take a look at my webrev?
    > Bug: https://bugs.openjdk.java.net/browse/JDK-8229517
    > Webrev: https://cr.openjdk.java.net/~xliu/8229517/01/webrev/
    >
    > Previously, we had discussion at runtime-dev. https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-February/038211.html
    > I would like to continue it because I redo my implementation on the tip of jdk and have made it a general solution.

    Not a review as such but a couple of comments/queries.

    The async logging is done as a periodic task that will be executed the
    Watcher thread. This task requires a mutex that is also used by the
    other threads doing the log operations. So potentially the watcher
    thread could experience a significant delay waiting for the mutex.

    I'm not clear what the actual lifecycle of the LogAsyncFlusher instance
    is. I would have expected a single instance initialized at VM startup
    and cleaned up at VM exit. The code gives the appearance that this can
    actually be much more dynamic with multiple instances coming and going
    through the lifetime of the VM. Deletion of the instance is particularly
    problematic due to the Mutex, which has to be guaranteed not to be
    accessed when it gets deleted.

    > This webrev  provides <logging/asynclog.hpp> on apar with <log.hpp>. Even though it doesn’t touch any existing logging calls, it’s very possible to offload any callsite of logging to the asynchronous buffer.
    > My rationale is that we only need to redirect some time-critical logs to asynchronous buffer to dodge unexpected filesystem delay.  Eg. g1gc has some loggings at safepoint. We mitigate those long-tailed GC latencies rooted by filesystem writing to gclog.

    So this isn't an opt-in facility at runtime, but rather something that
    has to be decided a-priori by the developer.

    How does this interact with the synchronous logging that may be used for
    a given set of log tags?

    Thanks,
    David
    -----

    > I got feedback from my our internal users. They have a concern about my approach of dynamic memory. Indeed, I gave up ringbuffer design in previous version and turn to use a dynamic allocated linkedlist.  If writing to filesystem stalls infinitely, JVM may implode because of memory usage.  I plan to solve it by adding a memory quota to the FIFO of logAsyncFlusher.hpp. It will start dropping early messages of  the FIFO if it has consumed too many free memories. Another potential improvement is changing the FIFO from mutex protected to lock-free.  It make sense if we redirect high volume of logging messages to it and mutex becomes hot.
    >
    > Thanks,
    > --lx
    >
    >
    >