RFR: 8323807: Async UL: Add a stalling mode to async UL [v10]

Thu Jan 23 08:21:29 UTC 2025

On Wed, 22 Jan 2025 14:57:55 GMT, Johan Sjölen <jsjolen at openjdk.org> wrote:

>> Hi,
>> 
>> In January of this year I took a stab at implementing a stalling mode for UL, see link: https://github.com/openjdk/jdk/pull/17757 . I also talked about this feature in the mailing lists and seemed to receive positive feedback. With that PR, I also implemented a circular buffer. That PR didn't go through because (1) the stalling mode was broken and (2) the complexity was a bit too large imho.
>> 
>> This PR does a much smaller change by only focusing on implementing the actual stalling.
>> 
>> The command-line changes are the same as before: you can now specify the mode of your async logging:
>> 
>> 
>> $ java -Xlog:async:drop # Dropping mode, same as today
>> $ java -Xlog:async:stall # Stalling mode!
>> $ java -Xlog:async # Dropping mode by default still
>> 
>> 
>> The change in protocol is quite simple. If a producer thread `P` cannot fit a message into the buffer, it `malloc`s a message and exposes it via a shared pointer. It blocks all other producer threads from writing into the buffer. At the same time, the consumer thread (`AsyncLogWriter`) will perform all writing. When the consumer thread has emptied the write buffer, it writes the stalled message, notifies `P` and releases all locks. `P` then lets all other producer threads continue.
>> 
>> We do this by having two locks: `Outer` and `Inner`. In our example above, `P` prevents any other producers from progressing by holding the outer lock, but allows the consumer thread to progress by releasing the inner lock.
>> 
>> In pseudo-code we have something like this in the stalling case.
>> 
>> 
>> void produce() {
>>   OuterLock olock;
>>   InnerLock ilock;
>>   bool out_of_memory = attempt_produce(shared_buffer);
>>   if (out_of_memory) {
>>     pmsg = new Message();
>>     shared_message = pmsg;
>>     while (shared_message != nullptr) ilock.wait();
>>     delete pmsg; // allocated with new above
>>   }
>> }
>> 
>> void consume() {
>>   InnerLock ilock;
>>   consume(shared_buffer);
>>   if (shared_message != nullptr) {
>>     consume(shared_message);
>>     ilock.notify();
>>   }
>> }
>> 
>> 
>> *Note!* It is very important that the consumer prints all output found in the buffer before printing the stalled message. This is because logging is output in Program Order. In other words: `print(m0); print(m1);` means that `m0` must appear before `m1` in the log file.
>> 
>> *Note!* Yes, we do force *all* threads to stall before the original stalled message has been printed. This isn't optimal, but I still have hope that we can switch to a faster circu...
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
> 
>   NOT_DEBUG is not the same as PRODUCT_ONLY I assume

src/hotspot/share/logging/logAsyncWriter.cpp line 105:

> 103:   if (LogConfiguration::async_mode() == LogConfiguration::AsyncMode::Stall) {
> 104:     size_t size = Message::calc_size(msg_len);
> 105:     void* ptr = os::malloc(size, mtLogging);

Your mention of recursion made me check whether we log from our `os::malloc`/`os::free` calls. I can find at least one path that calls `log_warning` in debug builds. This would cause both recursive locking on our PlatformMonitor and infinite recursion if the second message does not fit in the buffer.

Maybe this logging is unreachable from this specific context. But I find it a little scary nonetheless. Maybe this should be documented somewhere.
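
To make the concern concrete, here is a minimal sketch of one possible mitigation: a thread-local re-entrancy flag that drops any message logged from inside the allocation path. Every name in it (`enqueue`, `logging_malloc`, `t_in_enqueue`, `g_sink`, `g_dropped`) is a hypothetical stand-in rather than HotSpot code, and I am not claiming this is how the PR should handle it; it only shows the shape of the problem.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these names exist in HotSpot.
static std::vector<std::string> g_sink;         // messages that were accepted
static int g_dropped = 0;                       // messages rejected by the guard
static thread_local bool t_in_enqueue = false;  // true while this thread is enqueueing

static void enqueue(const std::string& msg);

// Stand-in for an allocator that logs a warning, as a debug build might.
static void* logging_malloc(std::size_t size) {
  enqueue("warning: allocation inside logger");  // re-enters the logger
  return std::malloc(size);
}

// RAII helper that marks the current thread as being inside enqueue().
struct ReentryGuard {
  ReentryGuard() { t_in_enqueue = true; }
  ~ReentryGuard() { t_in_enqueue = false; }
};

static void enqueue(const std::string& msg) {
  if (t_in_enqueue) {
    // Called recursively from the allocation path: drop the message
    // instead of taking the lock again or allocating again.
    ++g_dropped;
    return;
  }
  ReentryGuard guard;
  void* p = logging_malloc(msg.size() + 1);  // the allocation that may log
  g_sink.push_back(msg);
  std::free(p);
}
```

Calling `enqueue("hello")` once accepts that message and drops the nested warning. Without the flag, the same call would re-enter `enqueue` from `logging_malloc` with no bound.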

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22770#discussion_r1926545352