RFR: 8296263: Uniform APIs for using archived heap regions

Fri Nov 4 05:18:27 UTC 2022

On Thu, 3 Nov 2022 16:06:47 GMT, Ashutosh Mehra <duke at openjdk.org> wrote:

> This is an attempt to unify the two different approaches for using archived heap regions. Main goal is to restructure and modify the code to have a single set of GC APIs that can be called for using archived heap regions.
> 
> In current state, the VM either tries to "map" (for G1) or "load" (for non-G1 GC policies) the archived heap regions into the java heap.
> When mapping, the VM determines the address range in the java heap where the archived regions should be mapped. It tries to map the regions towards the end of the heap. The APIs used for this purpose are G1 specific.
> When loading, the VM asks the GC to provide a chunk of memory from the heap, into which it reads the contents of the archived heap regions. The APIs used are GC policy agnostic but challenging to use for region based collectors.
> 
> This PR attempts to add new set of GC APIs that can be used by the VM to reserve space in the heap for mapping the archived heap regions. It combines the good parts of the two existing approaches. Similar to the "loading" API, in this new approach VM is not responsible for determining the mapping address. That responsibility always resides with the GC policy. This also allows the flexibility for the GC implementation to decide where and how to reserve the space for the archived regions. For instance, G1 implementation can continue to attempt to allocate the space towards the end of the heap.
> This PR also provides the implementation of the new APIs for all the existing GC policies that currently support archived heap regions viz G1, serial, parallel and epsilon.

I am not sure if the existing implementation is 100% correct, but for these test cases, I think we are probably saved by this code: 

  if (!is_aligned(relocated_closed_heap_region_bottom, HeapRegion::GrainBytes)) {
    // Align the bottom of the closed archive heap regions at G1 region boundary.
    // This will avoid the situation where the highest open region and the lowest
    // closed region sharing the same G1 region. Otherwise we will fail to map the
    // open regions.
    size_t align = size_t(relocated_closed_heap_region_bottom) % HeapRegion::GrainBytes;
    delta -= align;
    log_info(cds)("CDS heap data needs to be relocated lower by a further " SIZE_FORMAT
                  " bytes to " INTX_FORMAT " to be aligned with HeapRegion::GrainBytes",
                  align, delta);
    set_shared_heap_runtime_delta(delta);
    relocated_closed_heap_region_bottom = heap_region_runtime_start_address(si);
    _heap_pointers_need_patching = true;
  }

G1 regions are at least 1MB, and are always a power of 2.
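
To make the arithmetic in the snippet above concrete, here is a minimal standalone sketch of the same adjustment. It is not the HotSpot code: `kGrainBytes`, `extra_shift_down` and `adjust_delta` are hypothetical names, and the 1MB value only stands in for HeapRegion::GrainBytes, which G1 chooses at startup.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for HeapRegion::GrainBytes (assumed 1MB here).
constexpr std::size_t kGrainBytes = std::size_t(1) << 20;

// Number of bytes by which `bottom` sits above the nearest lower grain
// boundary. Because the grain size is a power of 2, the modulo in the
// original snippet reduces to a bit mask.
inline std::size_t extra_shift_down(std::uintptr_t bottom, std::size_t grain_bytes) {
  assert(grain_bytes != 0 && (grain_bytes & (grain_bytes - 1)) == 0);
  return static_cast<std::size_t>(bottom & (grain_bytes - 1));
}

// Mirrors "delta -= align": move the relocation delta lower by the
// misalignment so that the relocated bottom lands on a grain boundary.
inline std::intptr_t adjust_delta(std::intptr_t delta,
                                  std::uintptr_t relocated_bottom,
                                  std::size_t grain_bytes) {
  return delta - static_cast<std::intptr_t>(extra_shift_down(relocated_bottom, grain_bytes));
}
```

With a relocated bottom of 0x7ff01234, the delta is reduced by 0x1234 bytes, which puts the bottom at 0x7ff00000.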

By patching SharedStringsStress.java as follows, I can get the CA1 and OA0 regions to be unaligned with respect to GrainBytes, but that doesn't seem to cause the test to fail.

-                TestCommon.concat(vmOptionsPrefix, "HelloString"));
+                TestCommon.concat(vmOptionsPrefix, "-Xlog:cds=debug", "-Xmx6g", "HelloString"));

In any case, I think we can consider first changing the way the regions are written ([JDK-8296344](https://bugs.openjdk.org/browse/JDK-8296344)) so that they can be more easily mapped by various collectors.

(Also, tactically, we should probably first change G1 to use the new "Uniform API" you are thinking about, but leave the other collectors unchanged. This way, we can gradually test things out and fix the other collectors in subsequent RFEs).

Currently, when writing the archived heap, we allocate a G1 region and write objects into it, from bottom to top. When it fills up, we allocate another G1 region immediately below it, and start filling that one from bottom to top. At the end, we merge all the fully-filled regions into the CA0 region, and make the last, half-filled region CA1.

(The same applies to the OA0 and OA1 regions, but the OA0 region usually never has more than 1MB of objects, so we'd never have the OA1 region).

This is kind of kludgy. We should be able to first determine all objects to be archived, and then write them out as a single contiguous "closed" region and a single contiguous "open" region. When filling out these regions, we can pack the objects so that they will never cross a 1MB boundary.
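
As a rough illustration of the packing idea (not a proposal for the actual dump-time code), the sketch below assigns offsets to a list of object sizes so that no object straddles a boundary. `pack_offsets` and `kBoundary` are hypothetical names. It assumes the skipped gap can be covered by a filler object, and it ignores the minimum filler size and any object larger than the boundary itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed packing boundary: the minimum G1 region size, 1MB.
constexpr std::size_t kBoundary = std::size_t(1) << 20;

// Returns the start offset chosen for each object, in input order.
// An object that would cross the next boundary is moved up to start at
// that boundary; the skipped bytes are assumed to be covered by a filler.
// An object larger than `boundary` starts on a boundary and necessarily
// spans more than one chunk; this sketch does not treat it specially.
std::vector<std::size_t> pack_offsets(const std::vector<std::size_t>& sizes,
                                      std::size_t boundary) {
  assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
  std::vector<std::size_t> offsets;
  offsets.reserve(sizes.size());
  std::size_t top = 0;
  for (std::size_t size : sizes) {
    std::size_t used_in_chunk = top & (boundary - 1);
    if (used_in_chunk != 0 && used_in_chunk + size > boundary) {
      top += boundary - used_in_chunk;  // skip ahead to the next boundary
    }
    offsets.push_back(top);
    top += size;
  }
  return offsets;
}
```

For sizes of 512KB, 768KB and 256KB, the second object is pushed up to the 1MB mark and the third follows it directly, ending exactly at 2MB.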

Also, I think it may not even be worthwhile to have the "closed" region and treat it specially at runtime. We can have just a single contiguous block of archived objects like this, where S are the String objects and their char arrays, and O are the other types of objects:

OOOOOOOOOOOSSSSSSSSSSS

At runtime, we allocate enough G1 regions from the top of the heap to accommodate the archived objects, and put a dummy object at the bottom to fix the bottom-most region.
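
The region count and the size of the dummy object follow from simple arithmetic. The sketch below assumes that the archived block ends exactly at the top of the heap and that the heap end is region-aligned; `TopOfHeapPlacement` and `place_at_top` are hypothetical names, not G1 APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct TopOfHeapPlacement {
  std::size_t    num_regions;           // G1 regions taken from the top of the heap
  std::uintptr_t lowest_region_bottom;  // bottom of the lowest region taken
  std::uintptr_t archive_start;         // address of the first archived object
  std::size_t    filler_bytes;          // size of the dummy object below the archive
};

// Assumption: the archive is placed so that its last byte is the last
// byte of the heap; the unused space at the bottom of the lowest region
// is then covered by a single dummy (filler) object.
inline TopOfHeapPlacement place_at_top(std::uintptr_t heap_end,
                                       std::size_t archive_bytes,
                                       std::size_t region_bytes) {
  assert(region_bytes != 0 && (region_bytes & (region_bytes - 1)) == 0);
  assert((heap_end & (region_bytes - 1)) == 0);
  assert(archive_bytes > 0);
  TopOfHeapPlacement p;
  p.num_regions = (archive_bytes + region_bytes - 1) / region_bytes;  // round up
  p.lowest_region_bottom = heap_end - p.num_regions * region_bytes;
  p.archive_start = heap_end - archive_bytes;
  p.filler_bytes = static_cast<std::size_t>(p.archive_start - p.lowest_region_bottom);
  return p;
}
```

For a 2.5MB archive, 1MB regions and a heap ending at 0x80000000, this takes three regions and leaves a 512KB dummy object at the bottom of the lowest one.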

(The reason we align the archived regions to the top of the G1 heap is that the top of the heap usually has the same narrowOop for various heap sizes, so we can usually avoid patching the embedded oop pointers.

This is a trade off with other collectors, which may not allow you to start allocating memory from the top. We may want to reconsider this.)

All the Strings are always in the interned table, so they will never be collected. Also, we have already computed their hashcodes, so they are never written into (unless you `synchronize` on them at runtime). So for the region(s) that contain only the S objects, we can effectively share the memory across multiple processes, and the GC will never collect them.

Anyway, we usually just have a few MBs of archived objects, so it may not matter whether we keep them immutable or not.
*******

I want to thank you for starting work in this area. Going forward, I think we need more discussion and design before we can decide exactly what to do.

-------------

PR: https://git.openjdk.org/jdk/pull/10970