RFR: 8296344: Remove dependency on G1 for writing the CDS archive heap [v2]

Ashutosh Mehra duke at openjdk.org
Thu Feb 2 21:53:27 UTC 2023


On Tue, 31 Jan 2023 16:11:13 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> Goals
>> 
>> - Simplify the writing of the CDS archive heap
>>     - We no longer need to allocate special "archive regions" during CDS dump time. See all the removed G1 code. 
>> - Make it possible to (in a future RFE) write the CDS archive heap using any garbage collector
>> 
>> Implementation - the following runs inside a safepoint so heap objects aren't moving
>> 
>> - Find all the objects that should be archived
>> - Allocate buffer using a GrowableArray
>>     - Copy all "open" objects into the buffer
>>     - Copy all "closed" objects into the buffer
>> - Make sure the heap image can be mapped with all possible G1 region sizes: 
>>     - While copying, add fillers to make sure no object spans across 1MB boundary (minimal G1 region size)
>>     - Align the bottom of the "open" and "closed" objects at 1MB boundary
>> - Write the buffer to disk as the image for the archive heap
>> 
>> Testing: tiers 1 ~ 5
>
> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:
> 
>   fixed windows and 32-bit builds which do not support archive heap

src/hotspot/share/cds/archiveHeapWriter.cpp line 266:

> 264:     HeapShared::CachedOopInfo* info = HeapShared::archived_object_cache()->get(src_obj);
> 265:     assert(info != nullptr, "must be");
> 266:     if (info->in_open_region() == copy_open_region) {

This trick, where `(false == false)` evaluating to `true` is used to copy the closed regions, fooled me for a while. It was unusual to me, so I'm pointing it out here for anyone else reading the code. It would go away anyway in the future work that combines the open and closed regions into a single one.

src/hotspot/share/cds/archiveHeapWriter.cpp line 528:

> 526:   template <class T> void do_oop_work(T *p) {
> 527:     size_t field_offset = pointer_delta(p, _src_obj, sizeof(char));
> 528:     ArchiveHeapWriter::relocate_field_in_requested_obj<T>(_requested_obj, field_offset);

I understand what we are doing here, but it seems to take a bit of a convoluted approach. It performs quite a few operations to calculate `source_referent`, which is already available in `*p`.
So an alternate implementation for `do_oop(narrowOop *p)` could be:

    if (!CompressedOops::is_null(*p)) {
      size_t field_offset = pointer_delta(p, _src_obj, sizeof(char));
      oop requested_referent = source_obj_to_requested_obj(CompressedOops::decode(*p));  // source referent is already in *p
      narrowOop* field_address_in_buffer = (narrowOop*)(cast_from_oop<address>(_buffered_obj) + field_offset);
      *field_address_in_buffer = CompressedOops::encode(requested_referent);
      mark_oop_pointer<narrowOop>(_requested_obj, field_offset, _oopmap);
    }

The `_buffered_obj` can be calculated in the `iterator` below and passed to the `EmbeddedOopRelocator` c'tor.
`do_oop(oop *p)` would follow the same pattern. Does this read better?

src/hotspot/share/cds/archiveHeapWriter.cpp line 559:

> 557:   for (int i = 0; i < length; i++) {
> 558:     if (UseCompressedOops) {
> 559:       relocate_root_at<narrowOop>(requested_roots, i);

This can also follow a similar pattern to the code in my comment on `do_oop_work` above, as in:


    if (UseCompressedOops) {
      size_t offset = (size_t)((objArrayOop)requested_roots)->obj_at_offset<narrowOop>(i);
      narrowOop* buffered_addr = (narrowOop*)(cast_from_oop<address>(buffered_roots) + offset);
      oop requested_referent = source_obj_to_requested_obj(CompressedOops::decode(*buffered_addr));
      *buffered_addr = CompressedOops::encode(requested_referent);
      mark_oop_pointer<narrowOop>(requested_roots, offset, heap_info->oopmap());
    }

-------------

PR: https://git.openjdk.org/jdk/pull/12304


More information about the hotspot-dev mailing list