RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v31]

Thomas Schatzl tschatzl at openjdk.org
Fri Oct 20 16:07:40 UTC 2023


On Fri, 20 Oct 2023 13:34:59 GMT, Richard Reingruber <rrich at openjdk.org> wrote:

>> This PR introduces parallel scanning of large object arrays in the old generation that contain roots for young collections of Parallel GC. This distributes the actual work (following the array references) better than "stealing" from other task queues, which can lead to inverse scaling, as demonstrated by small tests (attached to JDK-8310031) and also observed in Gerrit production systems.
>> 
>> #### Implementation (Updated 2023-10-20)
>> 
>> Comment copied from `PSCardTable::scavenge_contents_parallel`:
>> 
>> ```c++
>> // Scavenging and accesses to the card table are strictly limited to the stripe.
>> // In particular scavenging of an object crossing stripe boundaries is shared
>> // among the threads assigned to the stripes it resides on. This reduces
>> // complexity and enables shared scanning of large objects.
>> // It requires preprocessing of the card table though where imprecise card marks of
>> // objects crossing stripe boundaries are propagated to the first card of
>> // each stripe covered by the individual object.
>> ```
>> 
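The preprocessing step described in the quoted comment can be sketched roughly as follows. This is an illustration only, not the code from the PR: the function name, the card encoding (`kDirty`), and the assumption that an imprecise mark sits on the card holding the object start are all hypothetical choices made for this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical card value for this sketch; HotSpot's real encoding differs.
constexpr std::uint8_t kDirty = 1;

// An object occupies the cards [first_card, last_card] (inclusive).
// Assumption: an imprecise mark dirties only the card of the object start.
// If that card is dirty, dirty the first card of every later stripe the
// object reaches into, so each stripe owner sees the mark locally.
// Returns the number of cards that were newly dirtied.
std::size_t propagate_imprecise_mark(std::vector<std::uint8_t>& cards,
                                     std::size_t first_card,
                                     std::size_t last_card,
                                     std::size_t cards_per_stripe) {
  if (cards[first_card] != kDirty) {
    return 0;  // Object start is clean: nothing to propagate.
  }
  std::size_t dirtied = 0;
  // First stripe boundary strictly after the object's first card.
  std::size_t boundary = (first_card / cards_per_stripe + 1) * cards_per_stripe;
  for (; boundary <= last_card; boundary += cards_per_stripe) {
    if (cards[boundary] != kDirty) {
      cards[boundary] = kDirty;
      ++dirtied;
    }
  }
  return dirtied;
}
```

After this pass, a worker only needs to look at cards inside its own stripe to learn that an object crossing into the stripe has to be scanned.
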
>> The baseline was refactored to make use of a read-only copy of the card table. That "shadow" table (`PSStripeShadowCardTable`) separates reading, clearing and redirtying of table entries which allows for a much simpler implementation.
>> 
>> Scanning of object arrays is limited to dirty card chunks.
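A minimal sketch of the shadow-table idea, under stated assumptions: `StripeShadowSketch`, `dirty_chunks`, and the card values are hypothetical names invented here and are not the `PSStripeShadowCardTable` interface. It only shows why a snapshot separates reading from clearing and redirtying, and how scanning can be restricted to runs of dirty cards.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical card encoding for this sketch only.
constexpr std::uint8_t kCleanCard = 0;
constexpr std::uint8_t kDirtyCard = 1;

// Snapshot of the cards [begin, end) of a card table.
class StripeShadowSketch {
 public:
  StripeShadowSketch(std::vector<std::uint8_t>& table,
                     std::size_t begin, std::size_t end)
      : begin_(begin),
        cards_(table.begin() + static_cast<std::ptrdiff_t>(begin),
               table.begin() + static_cast<std::ptrdiff_t>(end)) {
    // With the copy taken, the live stripe can be cleared right away.
    // Later redirtying writes to `table` and never touches the snapshot.
    for (std::size_t i = begin; i < end; ++i) {
      table[i] = kCleanCard;
    }
  }

  // Maximal runs of dirty cards as [first, last) absolute card indices.
  std::vector<std::pair<std::size_t, std::size_t>> dirty_chunks() const {
    std::vector<std::pair<std::size_t, std::size_t>> chunks;
    std::size_t i = 0;
    while (i < cards_.size()) {
      if (cards_[i] != kDirtyCard) {
        ++i;
        continue;
      }
      const std::size_t start = i;
      while (i < cards_.size() && cards_[i] == kDirtyCard) {
        ++i;
      }
      chunks.emplace_back(begin_ + start, begin_ + i);
    }
    return chunks;
  }

 private:
  std::size_t begin_;                // Absolute index of the first card.
  std::vector<std::uint8_t> cards_;  // Read-only copy of the stripe.
};
```

A scanner would then visit only the array elements covered by each returned chunk, while any redirtying during the scan goes to the live table.
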
>> 
>> ## Everything below refers to the Outdated Initial Implementation
>> 
>> The algorithm to share scanning of large arrays is supposed to be a straightforward extension of the scheme implemented in `PSCardTable::scavenge_contents_parallel`.
>> 
>> - A worker scans the part of a large array located in its stripe
>> 
>> - The exception is the end of a large array that reaches into a stripe: it is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe.
>> 
>> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`)
>>   
>> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned.
>> 
>> #### Performance testing
>> 
>> ##### BigArrayInOldGenRR.java
>> 
>> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its refere...
>
> Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Forgot to move comment to PSStripeShadowCardTable.

Ship it! :)

-------------

Marked as reviewed by tschatzl (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/14846#pullrequestreview-1690399343

