Request For Comment: Asynchronous Logging

Thomas Stüfe thomas.stuefe at gmail.com
Tue Apr 13 05:30:41 UTC 2021


Hi Xin & others,

Sorry for the late response, I had vacation.

As I wrote before, I think this is a valuable improvement. I would be
unhappy if this were stalled further.

In this thread there seems to be a majority opinion now for
- having an own thread instead of using the WatcherThread
- having a global option for async (good point, Ioi)

AFAICS the only POC left is the question whether to make the sizing option
a memory size or a "number of entries". I argued for the memory size
before, but won't insist on it, in order to move this issue forward (we
could always just keep that option diagnostic).

Is there anything else left to discuss?

I mentioned before that I think UL is too complex and could benefit from
simplification; but this is clearly not Xins fault, and I am sorry that I
brought this even up here, since it has nothing to do with this RFE.

Cheers, Thomas



On Tue, Mar 30, 2021 at 11:21 PM Liu, Xin <xxinliu at amazon.com> wrote:

> Thanks Volker for this. I would like to append some additional materials.
> I forgot to mention them when I wrote the Rationale[5] yesterday.
>
> We identified and root-caused the tail-latency on a Linux system with
> software RAID in 2018.
> We have different implementations on jdk8u and jdk11u. We are seeking to
> merge this feature to tip.
>
> Nonetheless, it doesn't mean "async-logging facility" only solves Amazon's
> peculiar problem.  When we studied this, we found many interesting
> references.
> Eg. LinkedIn reported and analyzed it well[1]. In particular, they
> mentioned that one reason was Linux cache writeback [2]. IMHO, that could
> impact
> almost all mass-storge Linux filesystems. Twitter also expressed that "I
> would love to hear if this can happen with OpenJDK!"[3].  This is also
> reported
> by other companies[4].
>
> Thanks,
> --lx
>
>
> [1]
> https://engineering.linkedin.com/blog/2016/02/eliminating-large-jvm-gc-pauses-caused-by-background-io-traffic
> [2]
> https://yoshinorimatsunobu.blogspot.com/2014/03/why-buffered-writes-are-sometimes.html
> [3] https://www.evanjones.ca/jvm-mmap-pause-finding.html
> [4]
> https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-June/042301.html
> [5] https://github.com/openjdk/jdk/pull/3135#issuecomment-809942487
>
> On 3/30/21, 11:20 AM, "Volker Simonis" <volker.simonis at gmail.com> wrote:
>
>     CAUTION: This email originated from outside of the organization. Do
> not click links or open attachments unless you can confirm the sender and
> know the content is safe.
>
>
>
>     Hi,
>
>     I'd like to (re)start a discussion on asynchronous logging [1,2,3,4].
>     We are successfully using this feature productively at Amazon both in
>     jdk8 and jdk11 to reduce the tail latency of services which use
>     logging. We think that async logging is a useful addition to the
>     current logging framework which might be beneficial to a wider range
>     of users. The following write-up tries to capture the comments and
>     suggestions from the previous discussions we are aware of.
>
>     Current state:
>
>     - HotSpot uses the so called "Unified Logging" (UL) framework which
>     was introduced by JEP 158 [5] in JDK 9. Most logs have been
>     retrofitted to use UL since then (e.g. "JEP 271: Unified GC Logging"
>     [6]).
>     - The current UL implementation is based on the standard C buffered
>     stream I/O interface [7]. The LogFileStreamOutput class which writes
>     logs to abstract FILE streams is the only child of the abstract base
>     class LogOutput. LogFileStreamOutput has three child classes
>     LogStdoutOutput,  LogStderrOutput and LogFileOutput which write to
>     stdout, stderr or an arbitrary file respectively. The initial UL JEP
>     158 [5] envisioned logging to a socket but has not implemented it. At
>     least one such extension has been proposed since then [8].
>     - UL synchronizes logs from different threads with the help of the
>     standard C flockfile()/funlockfile() [9] functions. But even without
>     this explicit locking, all the "stdio functions are thread-safe. This
>     is achieved by assigning to each FILE object a lockcount and (if the
>     lockcount is nonzero) an owning thread.  For each library call, these
>     functions wait until the FILE object is no longer locked by a
>     different thread, then lock it, do the requested I/O, and unlock the
>     object again" [9]. A quick look at the glibc sources reveals that FILE
>     locking is implemented with the help of futex() [10] which breaks down
>     to s simple atomic compare and swap (CAS) on the fast path.
>     - Notice that UL "synchronizes" logs from different threads to avoid
>     log interleaving. But it does not "serialize" logs according to the
>     time at which  they occurred. This is because the underlying stdio
>     functions do not guarantee a specific order for different threads
>     waiting on a locked FILE stream. E.g. if three log events A, B, C
>     occur in that order, the first will lock the output stream. If the log
>     events B and C both arrive while the stream is locked, it is
>     unspecified which of B and C will be logged first after A releases the
>     lock.
>
>     Problem statement:
>
>     - The amount of time a log event will block its FILE stream depends on
>     the underlying file system. This can range from a few nanoseconds for
>     in-memory file systems or milliseconds for physical discs under heavy
>     load up to several seconds in the worst case scenario for e.g. network
>     file systems. A blocked log output stream will block all concurrent
>     threads which try to write log messages at the same time. If logging
>     is done during a safepoint, this can significantly increase the
>     safepoint time (e.g. several parallel GC threads trying to log at the
>     same time). We can treat stdout/stderr as special files here without
>     loss of generality.
>
>     Proposed solution:
>
>     Extend UL with an asynchronous logging facility. Asynchronous logging
>     will be optional and disabled by default. It should have the following
>     properties:
>     - If disabled (the default) asynchronous logging should have no
>     observable impact on logging.
>     - If enabled, log messages will be stored in an intermediate data
>     structure (e.g. a double ended queue).
>     - A service thread will concurrently read and remove log messages from
>     that data structure in a FIFO style and write them to the output
>     stream
>     - Storing log messages in the intermediate data structure should take
>     constant time and not longer than logging a message takes in the
>     traditional UL system (in general the time should be much shorter
>     because the actual I/O is deferred).
>     - Asynchronous logging trades memory overhead (i.e. the size of the
>     intermediate data structure) for log accuracy. This means that in the
>     unlikely case where the service thread which does the asynchronous
>     logging falls behind the log producing threads, some logs might be
>     lost. However, the probability for this to happen can be minimized by
>     increasing the configurable size of the intermediate data structure.
>     - The final output produced by asynchronous logging should be equal to
>     the output from normal logging if no messages had to be dropped.
>     Notice that in contrast to the traditional unified logging,
>     asynchronous logging will give us the possibility to not only
>     synchronize log events, but to optionally also serialize them based on
>     their generation time if that's desired. This is because we are in
>     full control of the synchronization primitives for the intermediate
>     data structure which stores the logs.
>     - If log messages had to be dropped, this should be logged in the log
>     output (e.g. "[..] 42 messages dropped due to async logging")
>     - Asynchronous logging should ideally be implemented in such a way
>     that it can be easily adapted by alternative log targets like for
>     example sockets in the future.
>
>     Alternative solutions:
>     - It has repeatedly been suggested to place the log files into a
>     memory file system but we don't think this is an optimal solution.
>     Main memory is often a constrained resource and we don't want log
>     files to compete with the JVM for it in such cases.
>     - It has also been argued to place the log files on a fast file system
>     which is only used for logging but in containerized environments file
>     system are often virtualized and the properties of the underlying
>     physical devices are not obvious.
>     - The load on the file system might be unpredictable due to other
>     applications on the same host.
>     - All these workarounds won't work if we eventually implement direct
>     logging to a network socket as suggested in [8].
>
>     Implementation details / POC:
>
>     - A recent pull request [2] for JDK-8229517 [3] proposed to use a
>     simple deque implementation derived from HotSpot's LinkedListImpl
>     class for the intermediate data structure. It synchronizes access to
>     the queue with a MutexLocker which is internally implemented with
>     pthread_lock() and results in an atomic CAS on the fast path. So
>     performance-wise the locking itself is not different from the
>     flockfile()/funlockfile() functionality currently used by UL but
>     adding a log message to the deque should be constant as it basically
>     only requires a strdup(). And we could even eliminate the strdup() if
>     we'd pre-allocate a big enough array for holding the log messages as
>     proposed in the pull request [2].
>     - The pull pull request [2] for JDK-8229517 [3] proposed to set the
>     async flag as an attribute of the Xlog option which feels more natural
>     because UL configuration is traditionally done within the Xlog option.
>     But we could just as well use a global -XX flag to enable async
>     logging? What are your preferences here?
>     - The pull pull request [2] for JDK-8229517 [3] (mis)uses the
>     WatcherThread as service thread to concurrently process the
>     intermediate data structure and write the log messages out to the log
>     stream. That should definitely be changed to an independent service
>     thread.
>     - The pull pull request [2] for JDK-8229517 [3] initially proposed
>     that the "service thread" runs at a fixed interval to dump log
>     messages to the log streams. But reviewers commented that this should
>     better happen either continuously or based on the filling level of the
>     intermediate data structure. What are your preferences here?
>     - What are your preferences on the configuration of the intermediate
>     data structure? Should it be configured based on the maximum number of
>     log messages it can store or rather on the total size of the stored
>     log messages? I think that for normal runs this distinction probably
>     won't make a big difference because the size of log messages will
>     probably be the same on average so "number of log messages" should
>     always be proportional to "total size of log mesages".
>
>     1. Before diving into more implementation details, I'd first like to
>     reach a general consensus that asynchronous logging is a useful
>     feature that's worth while adding to HotSpot.
>
>     2. Once we agree on that, we should agree on the desired properties of
>     asynchronous logging. I've tried to collect a basic set in the
>     "Proposed solution" section.
>
>     3. If that's done as well, we can discuss various implementation
>     details and finally prepare new pull requests.
>
>     Thank you and best regards,
>     Volker
>
>     [1] https://bugs.openjdk.java.net/browse/JDK-8229517
>     [2] https://github.com/openjdk/jdk/pull/3135
>     [3]
> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-November/043427.html
>     [4]
> https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-August/039130.html
>     [5] https://openjdk.java.net/jeps/158
>     [6] https://openjdk.java.net/jeps/271
>     [7] https://man7.org/linux/man-pages/man3/stdio.3.html
>     [8] https://gist.github.com/YaSuenag/dacb6d94d8684915422232c7a08d5b5d
>     [9] https://man7.org/linux/man-pages/man3/flockfile.3.html
>     [10] https://man7.org/linux/man-pages/man2/futex.2.html
>
>


More information about the hotspot-dev mailing list