RFR: 8310428: Add CollectedHeap::reserved_range() API to return reserved heap memory range

Mon Jun 26 22:36:10 UTC 2023

On Thu, 22 Jun 2023 04:44:07 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:

>> For CDS, we actually don't need to know the exact range of the heap.
>> 
>> Currently, archive heap dumping is implemented only for G1. In the future, we want to be able to dump with any collector.
>> 
>> The eventual goal is - if you dump with collector X, then the layout of the archive heap should be optimal for this collector (e.g., avoid runtime relocation). The archive heap created by X should still be loadable by other collectors (but relocation is more likely).
>> 
>> So, for people who are really sensitive to start-up, they can create a custom archive with the specific GC that their app uses.
>> 
>> Currently, when dumping the archive heap with G1, we put it at the very top of the heap, which is the best location for G1 as this can usually avoid relocation. We determine the range by getting the entire G1 heap range and taking a portion of it from the top end.
>> 
>> To support all collectors, I think we can move the range calculation into each collector, and have a new API like
>> 
>> 
>> CollectedHeap::get_archive_heap_preferred_base(size_t required_bytes, size_t alignment);
>> 
>> The collector should return the range according to its policy. E.g., for SerialGC, it should probably return the bottom of the heap. The "reserve at heap top" logic can be moved into G1.
>
> Right - each GC might have a specialized strategy that is optimal for that GC. A good ZGC strategy would be to materialize objects in the heap lazily and concurrently using load barriers. That way materializing objects wouldn't stall the program.
> 
> However, I believe we can have a common baseline strategy that is completely GC agnostic, which just materializes objects in the archive using normal allocations, and translates the in-archive pointers to real pointers using normal stores, based on the oops we got from the materializing allocations, using a GC agnostic translation mechanism.
> 
> I think there are interesting ways of optimizing such a shared translation strategy by exploiting some "mostly contiguous" property that all collectors have, combined with locality properties that tend to be quite universal (most pointers don't point very far away from their object). In fact, I suspect an optimized GC agnostic mechanism could be fast enough, such that making specialized solutions for collector X might not be worth it.
> 
> I think starting with a GC agnostic solution and then checking if specific collectors want to optimize further by handing their gu...
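
To make the baseline strategy quoted above concrete, here is a minimal, self-contained sketch of a two-pass materialization: first allocate every object through the normal allocation path, then translate in-archive references into real pointers using the objects obtained from those allocations. Everything in it is an assumption made for illustration (`ArchivedObject`, `LiveObject`, `materialize`, and the use of plain indices as in-archive pointers); it is not HotSpot code and not the design discussed in #14520.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical in-archive reference encoding: an index into the archive,
// or kNullRef for a null reference.
constexpr std::ptrdiff_t kNullRef = -1;

// Hypothetical archived object: a payload plus references stored as indices.
struct ArchivedObject {
  int payload;
  std::vector<std::ptrdiff_t> refs;
};

// Hypothetical runtime object: the same payload, with real pointers.
struct LiveObject {
  int payload = 0;
  std::vector<LiveObject*> refs;
};

// Materializes the archive without knowing anything about heap layout.
std::vector<std::unique_ptr<LiveObject>> materialize(
    const std::vector<ArchivedObject>& archive) {
  std::vector<std::unique_ptr<LiveObject>> live;
  live.reserve(archive.size());

  // Pass 1: one ordinary allocation per archived object. The index in
  // `live` doubles as the translation table from archive index to pointer.
  for (const ArchivedObject& a : archive) {
    auto obj = std::make_unique<LiveObject>();
    obj->payload = a.payload;
    live.push_back(std::move(obj));
  }

  // Pass 2: translate each in-archive reference with an ordinary store.
  for (std::size_t i = 0; i < archive.size(); ++i) {
    for (std::ptrdiff_t ref : archive[i].refs) {
      assert(ref == kNullRef ||
             (ref >= 0 && static_cast<std::size_t>(ref) < archive.size()));
      live[i]->refs.push_back(
          ref == kNullRef ? nullptr : live[static_cast<std::size_t>(ref)].get());
    }
  }
  return live;
}
```

Because both passes use only ordinary allocation and ordinary stores, nothing here depends on where the collector places the objects, which is the property the quoted message is after.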

> > I've had an off-line discussion with @fisk and created an investigative RFE based on his ideas. Please see #14520 and use that PR for further discussion on this topic.
> 
> Please let's have the discussion in the open.

Hi Thomas,

I just want to clarify that for the purpose of CDS, there's no need for the proposed `CollectedHeap::reserved_range()` API. If CDS needs to find out the optimal location for the archived objects, it should be done using a more specific API. Any CDS-related discussions should be moved to #14520.

If we need the `reserved_range()` API for other purposes, then of course we can continue the discussion here.
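
As a rough illustration of what a "more specific API" could look like, here is a standalone sketch of the per-collector placement policy from the quoted message: a G1-like policy that takes the range from the top of the heap, and a Serial-like policy that takes it from the bottom. The names `Range`, `HeapModel`, `TopOfHeapModel` and `BottomOfHeapModel` are hypothetical, and the "return 0 when the request does not fit" convention is an assumption; none of this is existing HotSpot code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a reserved heap range, half-open: [start, end).
struct Range {
  std::uintptr_t start;
  std::uintptr_t end;
};

// Alignment helpers; alignment must be a non-zero power of two.
inline std::uintptr_t align_down(std::uintptr_t v, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return v & ~(static_cast<std::uintptr_t>(alignment) - 1);
}

inline std::uintptr_t align_up(std::uintptr_t v, std::size_t alignment) {
  return align_down(v + (alignment - 1), alignment);
}

// Hypothetical model of a collector's heap; each subclass applies its policy.
class HeapModel {
 public:
  explicit HeapModel(Range reserved) : _reserved(reserved) {}
  virtual ~HeapModel() = default;

  // Returns the preferred base address for the archive heap, or 0 if the
  // request does not fit (an assumed convention for this sketch).
  virtual std::uintptr_t archive_heap_preferred_base(
      std::size_t required_bytes, std::size_t alignment) const = 0;

 protected:
  // True if [base, base + required_bytes) lies inside the reserved range.
  bool fits(std::uintptr_t base, std::size_t required_bytes) const {
    return base >= _reserved.start && base <= _reserved.end &&
           required_bytes <= _reserved.end - base;
  }

  Range _reserved;
};

// G1-like policy: carve the range out of the top end of the heap.
class TopOfHeapModel : public HeapModel {
 public:
  using HeapModel::HeapModel;
  std::uintptr_t archive_heap_preferred_base(
      std::size_t required_bytes, std::size_t alignment) const override {
    if (required_bytes > _reserved.end - _reserved.start) return 0;
    std::uintptr_t base = align_down(_reserved.end - required_bytes, alignment);
    return fits(base, required_bytes) ? base : 0;
  }
};

// Serial-like policy: place the range at the bottom of the heap.
class BottomOfHeapModel : public HeapModel {
 public:
  using HeapModel::HeapModel;
  std::uintptr_t archive_heap_preferred_base(
      std::size_t required_bytes, std::size_t alignment) const override {
    std::uintptr_t base = align_up(_reserved.start, alignment);
    return fits(base, required_bytes) ? base : 0;
  }
};
```

With an API of this shape, CDS would only ask "where should `required_bytes` go?" and would never need to see the full reserved range.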

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14595#issuecomment-1608410709