RFR: 8324649: Shenandoah: refactor implementation of free set [v4]

Y. Srinivas Ramakrishna ysr at openjdk.org
Mon Jan 29 00:24:35 UTC 2024


On Sat, 27 Jan 2024 00:42:46 GMT, Kelvin Nilsen <kdnilsen at openjdk.org> wrote:

>> Several objectives:
>> 1. Reduce humongous allocation failures by segregating regular regions from humongous regions
>> 2. Do not retire regions just because an allocation failed within the region if the memory remaining within the region is large enough to represent a LAB
>> 3. Track range of empty regions in addition to range of available regions in order to expedite humongous allocations
>> 4. Treat collector reserves as available for Mutator allocations after evacuation completes
>> 5. Improve encapsulation so as to enable an OldCollector reserve for future integration of generational Shenandoah
>> 
>> On internal performance pipelines, this change shows:
>> 
>>  1. some Increase in page faults and rss_max with certain workloads, presumably because of "segregation" of humongous from regular regions.
>>  2. An increase in System CPU time on certain benchmarks: sunflow (+165%), scimark.sparse.large (+50%), lusearch (+43%).  This system CPU time increase appears to correlate with increased page faults and/or rss.
>>  3. An increase in trigger_failure for the hyperalloc_a2048_o4096 experiment (not yet understood)
>>  4. 2-30x improvements on multiple metrics of the Extremem phased workload latencies (most likely resulting from fewer degenerated or full GCs)
>> 
>> Shenandoah
>> -------------------------------------------------------------------------------------------------------
>> +166.55% scimark.sparse.large/minor_page_fault_count p=0.00000
>>   Control: 819938.875   (+/-5724.56  )         40
>>   Test:    2185552.625   (+/-26378.64  )         20
>> 
>> +166.16% scimark.sparse.large/rss_max p=0.00000
>>   Control: 3285226.375   (+/-22812.93  )         40
>>   Test:    8743881.500   (+/-104906.69  )         20
>> 
>> +164.78% sunflow/cpu_system p=0.00000
>>   Control:      1.280s  (+/-  0.10s )         40
>>   Test:         3.390s  (+/-  0.13s )         20
>> 
>> +149.29% hyperalloc_a2048_o4096/trigger_failure p=0.00000
>>   Control:      3.259   (+/-  1.46  )         33
>>   Test:         8.125   (+/-  2.05  )         20
>> 
>> +143.75% pmd/major_page_fault_count p=0.03622
>>   Control:      1.000   (+/-  0.00  )         40
>>   Test:         2.438   (+/-  2.59  )         20
>> 
>> +80.22% lusearch/minor_page_fault_count p=0.00000
>>   Control: 2043930.938   (+/-4777.14  )         40
>>   Test:    3683477.625   (+/-5650.29  )         20
>> 
>> +50.50% scimark.sparse.small/minor_page_fault_count p=0.00000
>>   Control: 697899.156   (+/-3457.82  )         40
>>   Test:    1050363.812   (+/-175...
>
> Kelvin Nilsen has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Fix typo in comment
>  - Remove unnecessary include

Left a few scattered comments as I skimmed through this.
Will probably go back and look again, more carefully but thought I'd leave these here for now until I get back to this later, take a fresh look and do a more careful and complete review.

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 41:

> 39:     case Mutator: return "Mutator";
> 40:     case Collector: return "Collector";
> 41:     case NumFreeSets: return "NumFreeSets";

How is "NumFreeSets" a type whose name would be used? I'd remove it from here.

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 51:

> 49:     _region_size_bytes(ShenandoahHeapRegion::region_size_bytes())
> 50: {
> 51:   _membership = NEW_C_HEAP_ARRAY(ShenandoahFreeMemoryType, max_regions, mtGC);

So it looks like max_regions will cover the entire largest possible size of the heap (and all regions that can ever be in the heap).

In that case, I'd make the `_max` field `const`, and similarly the `_membership` array. By the way, I see a bunch of terms being interchangeably used here: sets, types, membership. I'd settle upon a single terminology that would cleanly work and keep the set of terms small.

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 401:

> 399: 
> 400: HeapWord* ShenandoahFreeSet::allocate_single(ShenandoahAllocRequest& req, bool& in_new_region) {
> 401:   shenandoah_assert_heaplocked();

In addition, another precondition for this method appears to be that req.size() <= humongous size threshold. Perhaps that check should also be disposed of here. (Based on the documentation at the previous review comment above.)

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 413:

> 411:   //
> 412:   // Free set maintains mutator and collector views, and normally they allocate in their views only,
> 413:   // unless we special cases for stealing and mixed allocations.

I'd drop this comment, unless such special cases actually exist in the code today. Do they?

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 423:

> 421:       // Allocate within mutator free from high memory to low so as to preserve low memory for humongous allocations
> 422:       if (!_free_sets.is_empty(Mutator)) {
> 423:         // Use signed idx.  Otherwise, loop will never terminate.

Can you use `ssize_t` for the index and the return values from the methods? (I assume the problem you have in mind arises when `leftmost` below is 0?

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 443:

> 441:       // GCLABs are for evacuation so we must be in evacuation phase.  If this allocation is successful, increment
> 442:       // the relevant evac_expended rather than used value.
> 443: 

This seems like a leakage of abstraction to me. Besides, I do not see `evac_expended` anywhere in this file?

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp line 725:

> 723: // need to reserve memory within for evacuation.  (We will create a new reserve after update refs finishes,
> 724: // setting aside some of the memory that was reclaimed by the most recent GC.  This new reserve will satisfy
> 725: // the evacuation needs of the next GC pass.)

This part of the comment seems to belong in the caller(s), I feel. The documentation here should only talk about implementation-specifics. (The public contract will be only in the header file.)

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp line 40:

> 38: };
> 39: 
> 40: class ShenandoahSetsOfFree {

I wonder if "ShenandoahSetsOfFreeRegions" or "ShenandoahFreeRegionSets" is a better name.

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp line 43:

> 41: 
> 42: private:
> 43:   size_t _max;                  // The maximum number of heap regions

Is this a constant or can it change with the number of regions in the whole heap (and therefore as the heap expands and contracts), or is it a fixed constant depending on the maximum size of the heap (and the fixed region size)?

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp line 163:

> 161: 
> 162:   // While holding the heap lock, allocate memory for a single object or LAB  which is to be entirely contained
> 163:   // within a single HeapRegion as characterized by req.  The req.size() value is known to be less than or

"known to be", or "required to be"? In other words, is it a precondition of the method to behave correctly?

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp line 168:

> 166:   HeapWord* allocate_contiguous(ShenandoahAllocRequest& req);
> 167: 
> 168:   void flip_to_gc(ShenandoahHeapRegion* r);

Document the contract(s) of this method.

src/hotspot/share/gc/shenandoah/shenandoahFreeSet.hpp line 200:

> 198:   // Note that we plan to replenish the Collector reserve at the end of update refs, at which time all
> 199:   // of the regions recycled from the collection set will be available.
> 200:   void move_collector_sets_to_mutator(size_t cset_regions);

Can we start with a concise documentation of the public contract of this method?
That can optionally (and slightly separately) be supplemented with a usage scenario or documenting its motivation. In fact, it might be a good idea to move the latter to the implementation of the method and keep the header file terse and describe only each method's contract.

-------------

Changes requested by ysr (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/17561#pullrequestreview-1846218060
PR Review Comment: https://git.openjdk.org/jdk/pull/17561#discussion_r1468956106
PR Review Comment: https://git.openjdk.org/jdk/pull/17561#discussion_r1468956899
PR Review Comment: https://git.openjdk.org/jdk/pull/17561#discussion_r1468968069
PR Review Comment: https://git.openjdk.org/jdk/pull/17561#discussion_r1468968532
PR Review Comment: https://git.openjdk.org/jdk/pull/17561#discussion_r1468969853
PR Review Comment: https://git.openjdk.org/jdk/pull/17561#discussion_r1468971487
PR Review Comment: https://git.openjdk.org/jdk/pull/17561#discussion_r1468964922
PR Review Comment: https://git.openjdk.org/jdk/pull/17561#discussion_r1468953723
PR Review Comment: https://git.openjdk.org/jdk/pull/17561#discussion_r1468954727
PR Review Comment: https://git.openjdk.org/jdk/pull/17561#discussion_r1468967072
PR Review Comment: https://git.openjdk.org/jdk/pull/17561#discussion_r1468963795
PR Review Comment: https://git.openjdk.org/jdk/pull/17561#discussion_r1468964235


More information about the hotspot-gc-dev mailing list