RFR: 8296263: Uniform APIs for using archived heap regions

Fri Nov 4 02:41:27 UTC 2022

On Thu, 3 Nov 2022 16:06:47 GMT, Ashutosh Mehra <duke at openjdk.org> wrote:

> This is an attempt to unify the two different approaches for using archived heap regions. The main goal is to restructure and modify the code so that there is a single set of GC APIs that can be called for using archived heap regions.
> 
> In the current state, the VM either tries to "map" (for G1) or "load" (for non-G1 GC policies) the archived heap regions into the Java heap.
> When mapping, the VM determines the address range in the Java heap where the archived regions should be mapped. It tries to map the regions towards the end of the heap. The APIs used for this purpose are G1 specific.
> When loading, the VM asks the GC to provide a chunk of memory from the heap, into which it reads the contents of the archived heap regions. The APIs used are GC policy agnostic but challenging to use for region-based collectors.
> 
> This PR attempts to add a new set of GC APIs that can be used by the VM to reserve space in the heap for mapping the archived heap regions. It combines the good parts of the two existing approaches. Similar to the "loading" API, in this new approach the VM is not responsible for determining the mapping address. That responsibility always resides with the GC policy. This also gives the GC implementation the flexibility to decide where and how to reserve the space for the archived regions. For instance, the G1 implementation can continue to attempt to allocate the space towards the end of the heap.
> This PR also provides the implementation of the new APIs for all the existing GC policies that currently support archived heap regions, viz. G1, serial, parallel and epsilon.

I think I figured out the reason for the crashes in the `SharedStringsStress.java` and `GCSharedStringsDuringDump.java` tests.
In the G1 implementation the archived regions are allocated one after the other, without taking into account the gap that may have existed between them at dump time. This can result in an object spanning two G1 regions.

For instance, suppose there are two archived regions at dump time (denoted by `X` and `Y`), where the first occupies one G1 region (R1) and the second occupies two G1 regions (R2 and R3). At dump time they would be laid out as:

    R1         R2         R3         R4
     |XXXXXX    |YYYYYYYYYY|YYYYYYYYYY|

The blank space towards the end of R1 denotes that the G1 region is not fully occupied.

At run time these would be mapped as follows:

    R1         R2         R3         R4
     |XXXXXXYYYY|YYYYYYYYYY|YYYYYY    |

At dump time every object lies entirely within a single G1 region. At run time, however, because the archived regions are packed back to back, an object near a G1 region boundary can overflow into the next G1 region. If that happens, we would hit the assertion `assert(next_addr == top())` in `HeapRegion::update_bot()`.
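
To make the failure mode concrete, here is a minimal sketch that models the two layouts above. It is not HotSpot code: `ArchivedRegion`, `spans_boundary` and `count_spanning` are hypothetical names made up for illustration, and offsets and sizes are abstract units rather than bytes or heap words. It only counts how many objects would straddle a G1 region boundary under each placement.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model for illustration only; none of these names exist in HotSpot.
// All offsets and sizes are in abstract units, not bytes or HeapWords.
struct ArchivedRegion {
    std::size_t dump_start;              // offset of the archived region at dump time
    std::vector<std::size_t> obj_sizes;  // objects laid out back to back
};

// True if the object [start, start + size) crosses a G1 region boundary.
bool spans_boundary(std::size_t start, std::size_t size, std::size_t g1_region_size) {
    assert(size > 0 && g1_region_size > 0);
    return start / g1_region_size != (start + size - 1) / g1_region_size;
}

// Counts objects that straddle a G1 region boundary.
// preserve_gaps == true : each archived region keeps its dump-time offset.
// preserve_gaps == false: archived regions are packed one after the other.
std::size_t count_spanning(const std::vector<ArchivedRegion>& regions,
                           std::size_t g1_region_size, bool preserve_gaps) {
    std::size_t spanning = 0;
    std::size_t cursor = 0;  // end of the previously placed archived region
    for (const ArchivedRegion& r : regions) {
        std::size_t addr = preserve_gaps ? r.dump_start : cursor;
        for (std::size_t size : r.obj_sizes) {
            if (spans_boundary(addr, size, g1_region_size)) {
                ++spanning;
            }
            addr += size;
        }
        cursor = addr;
    }
    return spanning;
}
```

Assuming a G1 region size of 10 units, `X = {0, {3, 3}}` and `Y = {10, {5, 5, 5, 5}}`, `count_spanning` returns 0 when the dump-time gap is preserved and 2 when the regions are packed, since the objects of `Y` that now start at offsets 6 and 16 cross the boundaries at 10 and 20.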

-------------

PR: https://git.openjdk.org/jdk/pull/10970