RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection

Thu Jul 5 23:22:59 UTC 2018

Hi Thomas,

> 
> I have been looking into this a bit and finally (with some patch from
> me that fixes the changes too) and some additional probes (using the
> TASKQUEUE_STATS "infrastructure") I am starting to get meaningful
> results.
> 
> More about that later.
> 
> In any case the technique looks like a nice improvement at least in
> steal attempts and steal/steal attempts ratio on some bigger tests, but
> I need to update my code again it seems :)
> 
> I can add the changes to the TASKQUEUE_STATS logging later btw.

Great! Looking forward to seeing the results.
> 
>>>
>>> […]
>>> Updated webrev:
>>>
>>> http://cr.openjdk.java.net/~zgu/8205921/webrev.01/index.html
>>
>> src/hotspot/share/gc/shared/taskqueue.inline.hpp
>>   255     if (sz2 > sz1) {
>>   256       sel_k = k2;
>>   257       suc = _queues[k2]->pop_global(t);
>>   258     } else {
>>   259       sel_k = k1;
>>   260       suc = _queues[k1]->pop_global(t);
>>   261     }
>>
>> The paper avoids the steal attempt when both potential victims have a
>> size of zero, e.g. insert another clause:
>>
>>    } else if (sz1 == 0) {
>>      sel_k = k1;  // Might be needed to avoid uninitialized variable
>> warnings?
>>      suc = false;
>>    } else {
>>      ...
>>
>> There is a race condition between obtaining the size and checking it
>> here, but I don't think that's important.  The point is to avoid an
>> expensive steal attempt when it is very likely to fail.
>>
> 
> There is another bug in the existing code: current Hotspot collectors
> all reuse a single task queue set. So since the queue id's are only
> initialized once at startup, there will be some initial use of a
> suboptimal queue.

Technically, it is a bug. I doubt it will have material impact, cause 
the old value probably just as good as next random one.
> 
> I assume Shenandoah does not need a reset because it creates new
> taskqueuesets whenever it needs them (and frees them afterwards).
> 
Shenandoah does reuse queues, we added clear() method inside our queue 
set implementation to clean up queue, overflow queue and buffer, etc.

> The current design of passing stealing-local information (the seed)
> makes it clear that the owner of that variable needs to initialize it.
> 
> At this time I have no preference on Kim's suggestion to put these
> variables into the queue if you asked me. I would tend to encapsulate
> the mechanism as much as possible though.
> 
> I do think if Shenandoah wants to put this information into the
> GCThreadLocalBlock (for what reason?) it is probably most flexible to
> pass these things as kind of context to the steal_best_of_2() method. I
> do not think it is desirable to have a second copy of the taskqueue*
> code around; I can't see how else one implementation can use the
> TaskQueueSet local queue ids and the other use the same information
> from somewhere else right now.
Similar to what ZGC does, so we can avoid passing worker id and queue, 
etc. all over the places.

We don't want to use gnu style thread-local, so GCThreadLocalBlock is 
the temporary place until compiler upgrade (?)

As I mentioned in early email, I would prefer to pass 
TaskQueueStealLocals/Context, but I am afraid of venturing into other 
GCs that I am not familiar with.

Thomas, seems you have made other changes/improvements, do you want to 
take over this RFE? I am fine with either ways.

Thanks,

-Zhengyu

> 
> Thanks,
>    Thomas
>