RFR: 8301116: Parallelize TLAB resizing in G1 [v2]

Kim Barrett kbarrett at openjdk.org
Fri Feb 3 01:49:56 UTC 2023


On Fri, 3 Feb 2023 01:34:26 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

>> It looks like this has been a leftover of some older version (the `G1JavaThreadClaimer` is fairly "new") - I will remove this change.
>> Thanks for making me look at this again in detail!
>
>> To be more exact, the removal of the const happens in `g1CollectedHeap.inline.hpp:124`:
>> 
>> ```
>> inline JavaThread* const* G1JavaThreadsListClaimer::claim(uint& count) {
>>   count = 0;
>>   if (Atomic::load(&_cur_claim) >= _list.length()) {
>>     return nullptr;
>>   }
>>   uint claim = Atomic::fetch_and_add(&_cur_claim, _claim_step);
>>   if (claim >= _list.length()) {
>>     return nullptr;
>>   }
>>   count = MIN2(_list.length() - claim, _claim_step);
>>   return _list.list()->threads() + claim;                         // <--- here
>> }
>> ```
>> 
>> because of the mentioned access in `g1YoungGCPostEvacuateTasks.cpp:720`.
> 
> I think a better solution might be to declare the `Thread::_tlab` member `mutable` and `Thread::tlab() const`.
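
To illustrate the `mutable` suggestion, here is a minimal standalone sketch. It is not HotSpot code: `SketchTlab`, `SketchThread` and `resize_through_const` are hypothetical names used only for illustration. It shows how a `mutable` member plus a `const` accessor lets code that only holds a `const` pointer still update the buffer, without a cast.

```cpp
#include <cstddef>

// Hypothetical stand-in for a thread-local allocation buffer.
struct SketchTlab {
  std::size_t desired_size = 0;
  void resize(std::size_t new_size) { desired_size = new_size; }
};

// Hypothetical stand-in for a thread class owning such a buffer.
class SketchThread {
  // 'mutable' allows modification even through a const SketchThread.
  mutable SketchTlab _tlab;

public:
  // const accessor that still hands out a non-const reference.
  SketchTlab& tlab() const { return _tlab; }
};

// A caller that only has a pointer-to-const can resize the buffer
// without any const_cast.
std::size_t resize_through_const(const SketchThread* thread, std::size_t new_size) {
  thread->tlab().resize(new_size);
  return thread->tlab().desired_size;
}
```

The trade-off is that `const` on the thread object then no longer promises that the buffer is unchanged, so this only makes sense if the buffer is not considered part of the thread's logical state.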

> Using `Threads::possibly_parallel_oops_do`, thread iteration/claiming seems to be the bottleneck, i.e. claiming the token.
> 
> One issue is that all threads need to traverse the array from the beginning to get to the current claim position - so the minimum processing time for a single thread is iterating the complete thread array and checking all tokens in it. That does not matter so much if the work per thread is big (like when walking the stacks for oops - but even then I think it would be noticeable, need to check).
> 
> The other is that with little work per thread, as in this situation, the threads seem to contend heavily on the JavaThread claim tokens, so this "parallelization" is quite a pessimization - I've measured it as ~4x slower with 18 threads than the single-threaded version (on ~21k JavaThreads).

Looking at `Threads::possibly_parallel_threads_do`, I think it could be redesigned to use the same
chunk-claiming technique as `G1JavaThreadsListClaimer`.  The existing design and implementation date
from when the threads-list was an actual linked list, rather than the newer ThreadsList.  But maybe
that's future work.
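
For reference, the chunk-claiming technique can be sketched outside HotSpot roughly as below. This is an illustration under assumptions, not the actual implementation: `ChunkClaimer` and `drain_and_sum` are hypothetical names, and `std::atomic` stands in for HotSpot's `Atomic` functions. Each worker bumps a shared index by a fixed step and then owns that slice of the array, so no worker walks the array from the start and there are no per-element claim tokens to contend on.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical chunk claimer over a fixed array (not HotSpot code).
template <typename T>
class ChunkClaimer {
  const T* _items;
  std::size_t _length;
  std::size_t _claim_step;
  std::atomic<std::size_t> _cur_claim{0};

public:
  ChunkClaimer(const T* items, std::size_t length, std::size_t claim_step)
    : _items(items), _length(length), _claim_step(claim_step) {
    assert(claim_step > 0);
  }

  // Returns the start of the claimed chunk and sets 'count' to its size,
  // or returns nullptr (with count == 0) once everything has been claimed.
  const T* claim(std::size_t& count) {
    count = 0;
    // Cheap early exit so late callers do not keep bumping the counter.
    if (_cur_claim.load() >= _length) {
      return nullptr;
    }
    // One atomic add claims a whole chunk starting at the old value.
    std::size_t claim = _cur_claim.fetch_add(_claim_step);
    if (claim >= _length) {
      return nullptr;
    }
    // The last chunk may be shorter than the step.
    count = std::min(_length - claim, _claim_step);
    return _items + claim;
  }
};

// What each worker would run: keep claiming chunks until none are left.
long drain_and_sum(ChunkClaimer<int>& claimer) {
  long sum = 0;
  std::size_t count = 0;
  while (const int* chunk = claimer.claim(count)) {
    for (std::size_t i = 0; i < count; ++i) {
      sum += chunk[i];
    }
  }
  return sum;
}
```

The step size is the tuning knob: a larger step means fewer atomic operations on the shared counter but coarser load balancing between workers.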

-------------

PR: https://git.openjdk.org/jdk/pull/12360


More information about the hotspot-gc-dev mailing list