Request for Comment: Add a stalling mode to asynchronous UL

Johan Sjölén johan.sjolen at oracle.com
Tue Mar 19 12:05:00 UTC 2024


Hi Xin,

Thank you for taking the time to perform your own tests on my code; I'm 
happy that the results were so encouraging. David explained the point of 
stalling mode very well: we only stall
when the buffer is full, otherwise we are effectively 
non-blocking.

I will be preparing a PR under ticket number 8323807; however, I first 
need to take my branch from prototype to "production" quality.

All the best,
Johan
------------------------------------------------------------------------
*From:* Liu, Xin [mailto:xxinliu at amazon.com]
*Sent:* Tuesday, March 19, 2024 at 05:30
*To:* Thomas Stüfe; Johan Sjolen
*Cc:* hotspot-dev at openjdk.org
*Subject:* Request for Comment: Add a stalling mode to asynchronous UL

> Hi, Johan,
>
> I don’t quite understand the new ‘stalling’ feature. On the
> other hand, I think your new design is very reasonable.
>
> Gil Tene dubbed the time that Java applications do nothing
> “hiccups”. For services that care about high-percentile GC pauses,
> such as P99.999, we found that one source of hiccups
> is unified logging at safepoints. There are many reasons unified
> logging can block; for example, log writes may be throttled by the
> write bandwidth of the disk. I think the main motivation of -Xlog:async is
> being non-blocking, not leveraging multiple cores. When a log write blocks, it
> should not block Java threads or the VMThread. My understanding of your
> stall mode is that the log site still blocks: it has to wait for the
> AsyncLogWriter while it is trapped in ‘msg.output->write_blocking’,
> right?  If we stall rather than drop messages, why choose
> async logging in the first place?
>
> That said, the circular buffer looks promising. I compared your
> change (pull/17757) against its base (71b46c38a82) using your cmd
> "slow-debug/jdk/bin/java -Xlog:async".
>
> For stdout, I used "slowdebug/jdk/bin/java -Xlog:async -Xlog:all=trace
> |& tee stdout-$i.log".
>
> I ran 20 iterations and compared the differences using a
> t-test. I found that there are significant differences for both stdout/stderr
> and a regular file.  For stdout, the mean number of dropped messages decreases
> by 2.9%.  For a file, it decreases by 86.8%.
>
> I see that your approach is much more effective than the ping-pong buffers
> when I/O writes are fast; writing to a file is faster than writing to
> stdout/stderr. Could you post the PR for broader review?
>   
> Thanks,
> --lx
>   
>
> From: Thomas Stüfe <thomas.stuefe at gmail.com>
> Date: Friday, February 16, 2024 at 6:56 AM
> To: Johan Sjolen <johan.sjolen at oracle.com>
> Cc: "hotspot-dev at openjdk.org" <hotspot-dev at openjdk.org>, "Liu, Xin" <xxinliu at amazon.com>
> Subject: RE: [EXTERNAL] Request for Comment: Add a stalling mode to asynchronous UL
>
> CC'ing Xin as original author.
>
> At first glance, I like this. I also like the ringbuffer approach better. Not sure if it had been discussed originally, but the original PR went on for a while IIRC.
>
> Do we even need the "drop" mode? I never liked the idea of losing output, and maybe stalls are so rare that they won't matter. Arguably, if you stall you should increase the log buffer size.
>
> ..Thomas
>
>
> On Fri, Feb 16, 2024 at 3:45 PM Johan Sjolen <johan.sjolen at oracle.com> wrote:
> Hello Hotspot,
>
> Today, asynchronous UL off-loads the work of writing to output devices by having log sites copy messages into a buffer, which an output thread swaps with an empty buffer and then empties.
> If the current buffer runs out of room, any attempt at logging leads to the message being dropped; that is, the message is not printed at all.
> I'd like to suggest that we add a stalling mode to async UL, wherein no messages are dropped and all log sites stall until there is sufficient room in the buffer for their respective messages.
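>
> To make the semantics concrete: a log site only ever blocks when its message does not fit, so the common case stays non-blocking in both modes. A minimal sketch of that decision (names are illustrative, not the actual HotSpot code):
>
> ```
> #include <cstddef>
>
> // Illustrative only. In drop mode a full buffer drops the message; in
> // stall mode the log site waits. A message that fits is simply copied.
> enum class FullPolicy { Drop, Stall };
> enum class Outcome { Written, Dropped, Stalled };
>
> Outcome log_site_enqueue(size_t unused_bytes, size_t message_bytes,
>                          FullPolicy policy) {
>   if (message_bytes > unused_bytes) {
>     return policy == FullPolicy::Drop ? Outcome::Dropped : Outcome::Stalled;
>   }
>   return Outcome::Written; // common case: plain copy into the buffer
> }
> ```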
>
> My purpose with this e-mail is twofold. First, I would like to know whether there is any interest in such a feature being added to Hotspot. Second, I'd like to discuss a change in the design of async UL which will permit this feature and other improvements.
> I imagine that the user-facing change for async UL would be to equip -Xlog:async with an option, like so:
>
> ```
> $ java -Xlog:async:drop  ... # Drop messages
> $ java -Xlog:async:stall ... # Stall log sites
> $ java -Xlog:async       ... # Drop messages by default
> ```
>
> The rest of this e-mail will describe the design changes and motivation behind these changes, so if you're mostly interested in answering whether or not you want there to be a stalling mode in async UL then you don't need to bother with the rest 🙂.
>
> There's a draft PR for these changes; it runs on my Linux computer, but I have implemented the necessary changes for macOS and Windows as well. The PR is available here: https://urldefense.com/v3/__https://github.com/openjdk/jdk/pull/17757__;!!ACWV5N9M2RV99hQ!MdM5-wtW55eXb62nY7oNPVi5xR_eyStKhV9pUhhFk0yDRJG7ORpQCxzhntPXxwVz77AVK_5X1TJmtmSci0M$
>
> Appendix 0 describes the current async protocol that is in mainline. There are two issues with this protocol:
>
> 1. During times when there is a high amount of logging in the system, there is also high contention on the mutex. This can cause the output thread to suffer starvation, as it cannot take the mutex to swap the buffers.
>     This in turn leads to excessive dropping of messages, and increasing the buffer size scales poorly.
> 2. Implementing stalling (as opposed to dropping messages) under this protocol leads to unnecessary waiting at the log sites.
>     Stalling would mean that a log site has to wait for the entire log buffer to be emptied before logging.
>     This may mean 1MiB of data having to be output before the swap can occur, and of course the swap requires the writing thread to acquire the mutex, which has its own issues.
>
> Issue number 1 can most easily be observed by inducing a (perhaps artificially) high amount of logging, by setting logging to all=trace in a slow-debug build:
>
> ```
> $ slow-debug/jdk/bin/java -Xlog:async -Xlog:all=trace:file=asynctest::filecount=0
> $ grep 'dropped due to' asynctest
> [0.147s][warning][                       ]   1338 messages dropped due to async logging
> [0.171s][warning][                       ]    496 messages dropped due to async logging
> [0.511s][warning][                       ]     98 messages dropped due to async logging
> [0.542s][warning][                       ]     13 messages dropped due to async logging
> [0.762s][warning][                       ]   5620 messages dropped due to async logging
> [1.286s][warning][                       ]  12440 messages dropped due to async logging
> [1.316s][warning][                       ]   2610 messages dropped due to async logging
> [1.352s][warning][                       ]   2497 messages dropped due to async logging
> ```
>
> I'd like to suggest that we re-implement async logging using a producer/consumer protocol backed by a circular buffer instead. This means that the output thread no longer suffers from starvation and can continuously make progress.
> It also means that we don't have to split the buffer memory in two; the log threads and the output thread share the whole buffer at the same time.
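>
> The sharing works because the buffer keeps two indices, head (advanced by the consumer) and tail (advanced by the producers). A sketch of the free-space arithmetic (one byte is kept unused so that head == tail unambiguously means "empty"; names are illustrative, not necessarily the PR's):
>
> ```
> #include <cstddef>
>
> // head: next byte the consumer will read; tail: next byte the
> // producer will write. One byte stays unused as a sentinel so that
> // head == tail means "empty" rather than "full".
> size_t unused_memory(size_t head, size_t tail, size_t capacity) {
>   if (tail >= head) {
>     return capacity - (tail - head) - 1; // used span is [head, tail)
>   }
>   return head - tail - 1;                // tail has wrapped around
> }
> ```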
>
> My solution does seem to improve the dropping situation, as per the same test:
>
> ```
> $ ul/jdk/bin/java -Xlog:async:drop -Xlog:all=trace:file=asyncdrop::filecount=0
> $ grep 'dropped due to' asyncdrop
> [0.295s][warning][                       ]    195 messages dropped due to async logging
> [0.949s][warning][                       ]   3012 messages dropped due to async logging
> [1.315s][warning][                       ]   7992 messages dropped due to async logging
> ```
> That is a bit over 11 000 dropped messages, as opposed to roughly 25 000 in the old solution. I also did some throughput benchmarking, which shows that stalling has minimal impact on throughput; see appendix 2 for the throughput timings.
>
>
> The new protocol is a basic producer/consumer pattern with two mutexes and a circular buffer with two index variables: head and tail. The consumer reads tail and sets head; the producer reads head and sets tail. In this case the producers are the log sites and the consumer is the output thread. See appendix 1 for a pseudo-code description of the protocol. Some details are omitted, such as how stalling is implemented (hint: wait/notify on the producer lock); please read the code for details.
>
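> As a sketch of the stalling half, here is the wait/notify shape using standard C++ primitives in place of HotSpot's own Mutex (a hypothetical rendering, not the code in the PR): a stalled log site waits on the producer lock until the output thread, having freed space by advancing head, notifies it.
>
> ```
> #include <condition_variable>
> #include <cstddef>
> #include <mutex>
>
> // Hypothetical sketch: tracks free space directly and blocks producers
> // until a message fits. The real code works in terms of head/tail.
> struct StallGate {
>   std::mutex producer;            // the producer-side lock
>   std::condition_variable cv;
>   size_t unused_bytes = 0;
>
>   // Log site, stall mode: block until there is room, then reserve it.
>   void await_space(size_t needed) {
>     std::unique_lock<std::mutex> lock(producer);
>     cv.wait(lock, [&] { return unused_bytes >= needed; });
>     unused_bytes -= needed;
>   }
>
>   // Output thread: after consuming messages, publish the freed space
>   // and wake any stalled log sites.
>   void release_space(size_t freed) {
>     {
>       std::lock_guard<std::mutex> lock(producer);
>       unused_bytes += freed;
>     }
>     cv.notify_all();
>   }
> };
> ```
>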
> TL;DR:
>
> 1. Add a 'stall' strategy to async logging so that we can guarantee zero dropped messages at the expense of delays at log sites. This is exposed through the command line option. Before: -Xlog:async; after: -Xlog:async[:[drop|stall]]. The default is still drop.
> 2. Ping-pong buffers are not a good idea when also supporting stalling, because a log site can be stalled for far longer than in the synchronous case.
> 3. Therefore, replace the ping-pong buffers and single mutex with a circular buffer and separate producer and consumer mutexes. This also has the benefit of reducing dropped messages in my testing.
>
> Thank you for reading and I appreciate any input on this.
> Johan Sjölén
>
>
> Appendix 0, the old async protocol.
>
> Nota bene: Any absence or presence of bugs may be the opposite in the actual source code.
>
> ```
> Variables:
> Mutex m;
> Buffer* cb;
> Buffer a;
> Buffer b;
> const char* message;
>
> Methods:
> void Buffer::write(const char* message) {
>    if (has_space(strlen(message))) {
>      internal_write(message);
>    }
> }
>
> LogSite:
> 1. m.lock();
> 2. cb->write(message);
> 3. m.unlock();
> 4. m.notify();
>
> OutputThread:
> 1. m.lock();
> 2. m.wait(); // Releases m while waiting; reacquires it on wake-up.
> 3. buffer := cb;
> 4. if cb == &a then cb = &b; else cb = &a;
> 5. m.unlock();
> 6. for each message in buffer: print to the log output
> 7. go to 1.
> ```
>
> Appendix 1, the new protocol.
>
> Nota bene: Any absence or presence of bugs may be the opposite in the actual source code.
> ```
> Mutex consumer;
> Mutex producer;
> CircularBuffer cb;
> size_t head;
> size_t tail;
> const char* message;
> char* out;
>
> LogSite:
> 1. producer.lock();
> 2. t = tail; h = head; // Read the indices
> 3. if unused_memory(h, t) < strlen(message) then drop_or_stall();
> 4. size_t bytes_written = cb.write(message); // Writes string and envelope
> 5. tail = (t + bytes_written) % cb.size(); // Message ready to consume, move tail.
> 6. producer.unlock();
>
> OutputThread:
> 1. consumer.lock();
> 2. t = tail; h = head;
> 3. if h == t then consumer.unlock(), return; // No message available
> 4. size_t bytes_read = cb.read_bytes(out, h); // Read out the next message
> 5. head = (h + bytes_read) % cb.size(); // More memory available for the producers
> 6. consumer.unlock();
> ```
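>
> The index arithmetic above can be exercised in isolation. A single-threaded toy rendering (no locks, no envelopes; purely illustrative) shows tail and head chasing each other modulo the buffer size:
>
> ```
> #include <cstddef>
> #include <string>
>
> // Toy version of the appendix-1 arithmetic; names are illustrative.
> struct ToyRing {
>   static const size_t N = 8;
>   char buf[N];
>   size_t head = 0;   // advanced by the consumer
>   size_t tail = 0;   // advanced by the producer
>
>   // Producer: copy the message byte by byte, wrapping at N; move tail.
>   void write(const char* msg, size_t len) {
>     for (size_t i = 0; i < len; i++) {
>       buf[(tail + i) % N] = msg[i];
>     }
>     tail = (tail + len) % N;
>   }
>
>   // Consumer: read len bytes starting at head, wrapping at N; move head.
>   std::string read(size_t len) {
>     std::string out;
>     for (size_t i = 0; i < len; i++) {
>       out += buf[(head + i) % N];
>     }
>     head = (head + len) % N;
>     return out;
>   }
> };
> ```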
>
> Appendix 2, the throughput tests.
>
> ```
> Testing requires the tool pv; here is an example of how to try it yourself:
> $ ./build/slow/jdk/bin/java -Xlog:async -Xlog:all=trace | (pv --rate --timer --bytes > /dev/null)
>
> # Slow debug build
> ## Default sync logging
> ### /dev/null
> 48.5MiB 0:00:02 [17.5MiB/s]
> ### To file
> 48.5MiB 0:00:03 [14.4MiB/s]
>
> ## Old async logging
> ### /dev/null
> 51.1MiB 0:00:02 [20.9MiB/s]
> ### To file
> 57.1MiB 0:00:02 [19.7MiB/s]
> Dropped messages:
> [0.202s][warning][                       ]    350 messages dropped due to async logging
> [1.302s][warning][                       ]   1981 messages dropped due to async logging
> [1.328s][warning][                       ]   2193 messages dropped due to async logging
> [1.354s][warning][                       ]   2464 messages dropped due to async logging
> [1.379s][warning][                       ]   4158 messages dropped due to async logging
> [1.405s][warning][                       ]   3017 messages dropped due to async logging
> [1.435s][warning][                       ]   3818 messages dropped due to async logging
> [1.479s][warning][                       ]   2286 messages dropped due to async logging
>
> ## New async logging (dropping mode)
> ### /dev/null
> 50.6MiB 0:00:02 [20.9MiB/s]
> ### To file
> 50.8MiB 0:00:02 [18.1MiB/s]
> Dropped messages: 0 on this run.
> Did another run as there is high variability in dropped messages:
> 50.2MiB 0:00:02 [18.3MiB/s]
> [1.276s][warning][                       ]   4942 messages dropped due to async logging
>
> ## New async logging (stalling mode)
> ### /dev/null
> 51.3MiB 0:00:02 [19.9MiB/s]
> ### To file
> 50.9MiB 0:00:02 [18.8MiB/s]
>
> # Release build Linux-x64 (to a tmpfs mounted file)
> ## New async drop
> 20.1MiB 0:00:00 [43.3MiB/s]
> 0 dropped messages
> ## New async stall
> 33.7MiB 0:00:00 [41.5MiB/s]
> 0 dropped messages
> ## Old async
> 17.7MiB 0:00:00 [42.2MiB/s]
> [0.059s][warning][                       ]   6860 messages dropped due to async logging
> [0.082s][warning][                       ]  20601 messages dropped due to async logging
> [0.107s][warning][                       ]   9147 messages dropped due to async logging
> [0.131s][warning][                       ]  11113 messages dropped due to async logging
> [0.154s][warning][                       ]  11845 messages dropped due to async logging
> [0.178s][warning][                       ]  10383 messages dropped due to async logging
> [0.197s][warning][                       ]   9376 messages dropped due to async logging
> [0.216s][warning][                       ]  10997 messages dropped due to async logging
> [0.234s][warning][                       ]   5978 messages dropped due to async logging
> [0.252s][warning][                       ]  10439 messages dropped due to async logging
> [0.270s][warning][                       ]   6562 messages dropped due to async logging
> [0.325s][warning][                       ]    391 messages dropped due to async logging
> [0.343s][warning][                       ]   3532 messages dropped due to async logging
> [0.362s][warning][                       ]   3268 messages dropped due to async logging
> [0.395s][warning][                       ]   4182 messages dropped due to async logging
> [0.411s][warning][                       ]   2490 messages dropped due to async logging
> ```
>

