RFR: 8323807: Async UL: Add a stalling mode to async UL [v8]
David Holmes
dholmes at openjdk.org
Thu Jan 23 03:46:55 UTC 2025
On Wed, 22 Jan 2025 15:03:27 GMT, Axel Boldt-Christmas <aboldtch at openjdk.org> wrote:
>> Alright. I think to me the two signals is more confusing. Mainly because we have mutual exclusion on producing data (this includes taking the consumer lock for producing), and if it stalls the producer lock is held until the consumer consumes. So if we ignore the case where the log message is larger than the buffer, if we stall we also know that the consumer must have been notified and `_data_availiable` been set to true.
>>
>>> Axel correctly identified that a notify call is missing. This turned out not to matter, perhaps because of spurious wake ups? Regardless, I fixed it.
>>
>> The reason it used to work is that we probably never logged 50k+ bytes. If there is a single byte in the buffer when a message stalls we must have had another producer that have a notification in flight for the consumer.
>>
>> We can then have, pseudo-C++.
>> Producer side:
>> ```C++
>> ProducerLocker pl();
>> ConsumerLocker cl();
>> produce_data(); // May produce a stall
>> if (!_data_availiable) {
>> _data_availiable = true;
>> cl.notify();
>> }
>> while (_stalled_msg) {
>> cl.wait();
>> }
>> // Free the stalled message here after the lock or on the consumer side
>>
>>
>> Consumer side:
>> ```C++
>> {
>> ConsumerLocker cl();
>> while (!_data_availiable) {
>> cl.wait();
>> }
>> claim_data();
>> _data_availiable = false;
>> }
>> process_data();
>> if (_stalled_msg) {
>> ConsumerLocker cl();
>> _stalled_msg = nullptr;
>> cl.notify();
>> }
>>
>>
>> The nice thing with one signal, is that it can be kept one to one with the property "we have notified the consumer".
>>
>> Doing the current way may have some redundant notify calls, and have a bit more state to keep in your head while parsing the code. But I am fine with it as well.
>
>> I think to me the two signals is more confusing.
>
> I should have said that having two properties being signalled by one variable is confusing. What I wrote still have two signals. It is just that `_data_availiable` is used to signal "there is something to consume". While `_stalled_msg` signals "a stalled message has been consumed". While currently `_data_availiable` and `_stalled_message` signals "there is something to consume" together and `_stalled_msg` also signals "a stalled message has been consumed". So the meaning of the `_stalled_message` signal is contextual.
I found the two properties clearer for understanding the stalling protocol. At least I thought I did but Axel's comments are confusing me now.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/22770#discussion_r1926310789
More information about the hotspot-runtime-dev
mailing list