RFR: JDK-8320368: Per-CPU optimization of Klass range reservation [v2]

Andrew Haley aph at openjdk.org
Wed Nov 22 09:53:11 UTC 2023


On Tue, 21 Nov 2023 16:40:33 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> In `Metaspace::reserve_address_space_for_compressed_classes`, we reserve space for the future Klass range. We place the Klass range somewhere that allows us to use "good" narrow Klass decoding later when initializing the encoding scheme.
>> 
>> Narrow Klass decoding is inherently CPU-specific, so doing this in shared code is awkward. It leads to many ifdefs, vague code comments that are difficult to explain, and missed optimizations.
>> 
>> There are common patterns: 
>> - all platforms benefit from unscaled encoding so trying to reserve <4GB for CDS=off is worthwhile. 
>> 
>> But there are more differences than one would think:
>> - some platforms (s390, riscv) benefit from reservation < 4GB even with CDS=on since a 32-bit immediate requires fewer instructions
>> - some platforms (aarch64) don't benefit from zero-based encoding, so no need to try that
>> - some platforms benefit from optimizing the base for 16-bit moves (PPC, s390, aarch64) or for other immediate formats (riscv)
>> 
>> It would be much better to have this section per CPU so that every CPU can implement its own optimal, well-documented version. A bit of code duplication is a good price for code clarity.
>> 
>> -------------
>> 
>> This patch splits out `Metaspace::reserve_address_space_for_compressed_classes` into five variants, one per CPU (moving the code to CompressedKlassPointers); it also splits out `CompressedKlassPointers::initialize` into two variants, one for aarch64, one for all other platforms. 
>> 
>> Changes per-CPU:
>> 
>> #### aarch64:
>> 
>> Don't attempt to reserve for zero-based encoding, since lsl is not faster than movk. We reserve for movk mode right away if reserving for unscaled encoding fails or if CDS=on.
>> 
>> We also add a last-ditch attempt to reserve a range optimized for movk via over-alignment. We only do this on aarch64, to prevent errors like JDK-8318119: "Invalid narrow Klass base on aarch64 post 8312018".
>> 
>> Since we don't want zero-based encoding, we need an aarch64-specific version for `CompressedKlassPointers::initialize()`
>> 
>> #### riscv:
>> 
>> We attempt to reserve at a "good" base that has bits set only in [12, 32), [32, 44), or [44, 64).
>> 
>> #### s390:
>> 
>> We attempt to allocate < 4GB unconditionally.
>
> Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Merge branch 'JDK-8320368-Per-CPU-optimization-of-Klass-range-reservation' of github.com:tstuefe/jdk into JDK-8320368-Per-CPU-optimization-of-Klass-range-reservation
>  - Regression Test

src/hotspot/cpu/aarch64/compressedKlass_aarch64.cpp line 49:

> 47:   }
> 48: 
> 49:   // If that failed, attempt to allocate at any 4G-aligned address. Let the system decide where. For ASLR,

One small nit here: encoding in MOVK mode may require more instructions than XOR mode, because XOR is a single `eor dst, src, 0x800000000`, whereas MOVK is `mov dst, src; movk dst, 0x8, lsl 32`. XOR is always the best choice, so we should perhaps try it first.
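
To make the comparison concrete, here is a rough C++ model of the two sequences. It is only a sketch, not HotSpot code: the helper names are made up, it assumes a shift of 0 and the example base 0x800000000, and it assumes the base shares no set bits with the 32-bit narrow Klass value.

```cpp
#include <cstdint>

// Hypothetical model, not HotSpot code. Assumes shift == 0.

// Bits [32, 48): the 16-bit field that "movk dst, #imm16, lsl #32" overwrites.
constexpr uint64_t kMovkField = 0xFFFFull << 32;

// XOR mode: a single "eor dst, src, #base".
// Only correct when base and the narrow Klass value share no set bits.
constexpr uint64_t decode_xor(uint32_t nk, uint64_t base) {
  return static_cast<uint64_t>(nk) ^ base;
}

// MOVK mode: "mov dst, src" (needed when dst != src), then
// "movk dst, #imm16, lsl #32", which replaces bits [32, 48) and keeps the rest.
constexpr uint64_t decode_movk(uint32_t nk, uint64_t base) {
  return (static_cast<uint64_t>(nk) & ~kMovkField) | (base & kMovkField);
}

// Example base from the comment above: 0x8 << 32.
constexpr uint64_t kBase = 0x800000000ull;

// Both modes yield the same address for this base.
static_assert(decode_xor(0x1234u, kBase) == kBase + 0x1234u, "xor decodes");
static_assert(decode_movk(0x1234u, kBase) == decode_xor(0x1234u, kBase),
              "movk and xor agree");
```

Under these assumptions both modes produce the same result, so the only difference is the possible extra `mov`.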

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16743#discussion_r1401779226

