RFR: 8263551: Provide shared lock-free FIFO queue implementation [v2]

Tue Mar 30 08:27:47 UTC 2021

On Mon, 29 Mar 2021 07:56:19 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

>> Man Cao has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Address comment and add a gtest.
>
> src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 100:
> 
>> 98: template<typename T, T* volatile* (*next_ptr)(T&)>
>> 99: template<bool use_rcu>
>> 100: T* LockFreeQueue<T, next_ptr>::pop() {
> 
> On further consideration I don't think this `use_rcu` conditionalized `pop` is the right path. The current behavior (with the embedded critical section) was for the specific use case in `G1DirtyCardQueueSet`. But for a general tool, I think a different approach is needed. I think better would be to not provide `pop()` at all, and instead provide `try_pop()`, which has a tri-status result: success, lost a race, or lost to an in-progress operation. So something like:
> enum class LockFreeQueuePopStatus {
>   success,
>   lost_race,
>   operation_in_progress
> };
> 
> // Member of LockFreeQueue<T, ...>
> // Executes the body of the old pop loop once, with appropriate
> // adjustments to return value and returning rather than retrying.
> Pair<LockFreeQueuePopStatus, T*> try_pop();
> Then let the specific use-case determine the context in which try_pop should be called and how to handle the various possible results.
> 
> This eliminates `ConditionalCriticalSection` (which seems strange). This also eliminates the `G1DirtyCardQueueSet`-specific subclass of `LockFreeQueue`. Instead we have (private) `G1DirtyCardQueueSet::pop_queue()`:
> BufferNode* G1DirtyCardQueueSet::pop_queue() {
>   using Status = LockFreeQueuePopStatus;
>   Thread* current_thread = Thread::current();
>   while (true) {
>     GlobalCounter::CriticalSection cs(current_thread);
>     Pair<Status, BufferNode*> pop_result = _completed.try_pop();
>     switch (pop_result.first) {
>     case Status::success: return pop_result.second;
>     case Status::operation_in_progress: return nullptr;
>     case Status::lost_race: break;  // Try again.
>     }
>   }
> }
> I'm also not sure whether the G1 case actually needs the critical section inside the loop anymore. That might be a holdover from an earlier version where the operation-in-progress case did just loop to try again. It definitely does need a critical section though; the life cycle management for the BufferNodes depends on it.

Thanks for the suggestion, and yes this makes sense. Done. Could you double-check if the updated comments are appropriate?

My only concern is that the difference between operation_in_progress and lost_race may be too subtle for most client code. I suppose most client code can just do "retry until succeeded" like in the test.
I don't expect these two cases to differ much in run-time. Is there any performance data to show that the operation_in_progress case indeed takes much longer for retrying? I checked review threads for JDK-8237143 and JDK-8238867 but didn't find any.

Anyway, this CR should use the tri-status approach (and the intra-loop critical section approach), in order to avoid behavior change.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2986