RFR: JDK-8312018: Improve reservation of class space and CDS

Fri Aug 11 07:34:58 UTC 2023

On Fri, 11 Aug 2023 03:59:55 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> This PR rewrites memory reservation of class space and CDS to have ASLR and a much better chance of low-address placement.
>> 
>> ------
>> 
>> Motivation:
>> 
>> (I was advised to keep PR text short lest I spam the mailing lists with Skara generated mails. So, for motivation, see JBS issue)
>> 
>> -------
>> 
>> The patch introduces a new API to reserve memory within an address range at a randomized location, while trying to be smart about it. The API is generic, and future planned uses of this API could include replacing the zero-based heap allocation and the zero-based reservation of Shenandoah Collection Sets, thereby allowing us to consolidate coding.
>> 
>> This PR complements @iklam 's current work that rewrites archive heap initialization at runtime. Once his work is in, we will be able to recalculate narrow Klass IDs for objects loaded from the archive, and that will allow us to reap the benefits of this patch for the CDS runtime case too.
>> 
>> -------
>> 
>> Noteworthy functional changes:
>> 
>> - class space is now likely to be reserved at a random location in low address ranges; this now includes ranges below 2 GB, which had been excluded before due to the use of HeapBaseMinAddress as minimal attach point.
>>      -  Note that this causes no problem with sbrk() - the only platform still having this problem is AIX, and we solved it differently there, see comment in AIX implementation of `os::vm_min_address()`
>> - I removed the PPC/AARCH64 specific coding that attempted to map at 4G/32G aligned addresses. That section has a complex history - it was originally introduced to deal with AARCH64 immediate-loading shortcomings, but PPC piggybacked on it for its perceived ability to allocate for zero-based, which got subsequently lost, so it's broken now (see https://bugs.openjdk.org/browse/JDK-8313669). The new code is a better replacement for this coding.
>> 
>> -------
>> 
>> Example (linux amd64):
>> 
>> We start the JVM with a 30GB heap. 
>> 
>> In the stock JVM, the JVM will place the heap in the lower address ranges starting at 2G (0x8000_0000). But then it is unable to place the class space in lower regions too, so it places it at 32 GB (0x8_0000_0000), and we don't have zero-based encoding (Narrow klass base: 0x0000000800000000). This scenario repeats for every iteration, so we will always use these two addresses (no ASLR):
>> 
>> 
>> thomas at starfish $ ./images/jdk/bin/java -Xshare:off -Xmx30g -Xlog:gc+heap+exit -Xlog:gc+metaspace -version
>> [0.019s][info][gc,metaspace] ...
>
> src/hotspot/share/runtime/os.cpp line 1790:
> 
>> 1788: // If randomize is true, the location will be randomized.
>> 1789: char* os::attempt_reserve_memory_between(char* min, char* max, size_t bytes, size_t alignment, bool randomize) {
>> 1790: 
> 
> I would suggest breaking this function down into smaller functions for readability.

I racked my brain about how to split this function up, but every solution I came up with would, I'm almost sure, be criticised for code duplication or obscurity.

The function sets up several common things before splitting into either randomized or non-randomized logic. If I split along randomized/non-randomized, whichever way I split, I would either have to duplicate most of the setup coding or give these daughter functions a very broad interface, which would make them hard to comprehend. In my view, splitting makes sense only if you can describe the split-off parts well in isolation.

Also, any internal daughter functions resulting from the split would end up in the externally visible os namespace since they need to access the private `os::pd_attempt_reserve_..`, which I don't like.

I could attempt to split out the reservation alone (the loop calling os::pd_attempt_reserve...), e.g. to reserve from a given precomputed set of addresses. But that would require me to rewrite the non-randomized part that does not precompute those addresses, which I dislike. Again, this split-out variant would also end up in the externally visible os:: namespace.

I could split out part of the shuffling logic, and maybe I will try that. But I feel not much brevity would be gained from it, since these sections are rather short.
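
For illustration, the candidate-based split mentioned above (precompute attach points, optionally shuffle them, then loop over them) could look roughly like the following. This is a standalone sketch, not the code in the patch: `candidate_addresses`, `shuffle_candidates`, `reserve_from_candidates` and `ReserveAtFn` are hypothetical names, and the callback merely stands in for the private platform hook. It also assumes that enumerating every aligned address in the range is acceptable, which is a simplification.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <vector>

// Hypothetical stand-in for the private platform hook. It is expected to
// return the requested address on success and 0 on failure.
using ReserveAtFn = std::function<uintptr_t(uintptr_t addr, size_t bytes)>;

// Step 1: precompute every aligned attach point in [min, max) at which a
// block of `bytes` still fits. Simplification: all candidates are listed.
std::vector<uintptr_t> candidate_addresses(uintptr_t min, uintptr_t max,
                                           size_t bytes, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0 &&
         "alignment must be a power of two");
  std::vector<uintptr_t> result;
  if (bytes == 0 || max < min || max - min < bytes) {
    return result;  // range too small, nothing to try
  }
  const uintptr_t mask = static_cast<uintptr_t>(alignment) - 1;
  const uintptr_t first = (min + mask) & ~mask;  // align min upwards
  if (first < min) {
    return result;  // aligning up wrapped around the address space
  }
  const uintptr_t last = max - bytes;  // highest valid start address
  for (uintptr_t a = first; a <= last; a += alignment) {
    result.push_back(a);
    if (last - a < alignment) {
      break;  // next step would pass `last` (and might overflow)
    }
  }
  return result;
}

// Step 2 (randomized case only): shuffle the attach points.
void shuffle_candidates(std::vector<uintptr_t>& candidates, uint32_t seed) {
  std::mt19937 rng(seed);
  std::shuffle(candidates.begin(), candidates.end(), rng);
}

// Step 3: the reservation loop on its own. Tries each attach point in
// order and returns the first one that could be reserved, or 0.
uintptr_t reserve_from_candidates(const std::vector<uintptr_t>& candidates,
                                  size_t bytes, const ReserveAtFn& reserve_at) {
  for (uintptr_t addr : candidates) {
    if (reserve_at(addr, bytes) == addr) {
      return addr;
    }
  }
  return 0;
}
```

In this shape the non-randomized path would simply skip step 2, but it would still have to precompute its addresses first, which is exactly the rewrite of the non-randomized part described above.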

Do you have any particular split in mind?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15041#discussion_r1290988328