RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible

Aleksey Shipilev shade at openjdk.org
Tue Jan 30 11:51:50 UTC 2024


On Wed, 10 Jan 2024 09:09:50 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> On x64, we always use the long form of mov immediate to load the klass base into a register. If the klass base fits into 32 bits, we could use the short form and save four instruction bytes. 
> 
> Before: mov uses 10 instruction bytes:
> 
> 
>    ;; decode_klass_not_null
>    0x00007f8b089e51c4:   movabs $0x82000000,%r11
>    0x00007f8b089e51ce:   add    %r11,%r10
> 
> 
> Now: mov uses 6 instruction bytes:
> 
> 
>    ;; decode_klass_not_null
>    0x00007fbe609e51c4:   mov    $0x82000000,%r11d
>    0x00007fbe609e51ca:   add    %r11,%r10
> 
> 
> Note that this also works with CDS enabled, and it gives us a reason to allocate low-address ranges unconditionally, even when zero-based encoding is not possible.
> 
> ----------
> 
> Tests: tier1 (GHA), tier 2 on x64 linux
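
The byte counts in the description can be checked with a small sketch. This is not HotSpot code: `mov_imm_length` and its `needs_rex` parameter are hypothetical names introduced here, and the sketch only models the two encodings shown in the disassembly above (the 32-bit `mov` and `movabs`).

```cpp
#include <cstdint>

// Hypothetical helper, not part of HotSpot: encoded length in bytes of the
// two "load immediate into register" forms shown in the PR description.
// needs_rex is true for r8..r15, which need a REX prefix even for a 32-bit mov.
constexpr int mov_imm_length(uint64_t imm, bool needs_rex) {
  if (imm <= UINT32_MAX) {
    // mov $imm32, %r32: [REX] + opcode (B8+rd) + 4 immediate bytes.
    // Writing a 32-bit register zeroes the upper half of the 64-bit register.
    return (needs_rex ? 1 : 0) + 1 + 4;
  }
  // movabs $imm64, %r64: REX.W + opcode (B8+rd) + 8 immediate bytes.
  return 1 + 1 + 8;
}

// 0x82000000 into r11d: 6 bytes, matching the "Now" listing.
static_assert(mov_imm_length(0x82000000, true) == 6, "short form into r11d");
// The same value into a legacy register such as eax needs no REX prefix.
static_assert(mov_imm_length(0x82000000, false) == 5, "short form into eax");
// A base above 4 GB still needs the 10-byte movabs from the "Before" listing.
static_assert(mov_imm_length(0x800000000ULL, true) == 10, "long form");
```

The 6 and 10 byte results agree with the address deltas in the two listings (`51c4` to `51ca`, and `51c4` to `51ce`).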

src/hotspot/cpu/x86/assembler_x86.cpp line 13369:

> 13367: #ifdef _LP64
> 13368: void Assembler::mov32_or_64(Register dst, int64_t imm) {
> 13369:   if ((uint64_t)imm < nth_bit(32)) {

Drive-by comments:
 a) macro-assembler stuff like this should be in macroAssembler;
 b) there is `is_simm32(imm)` for checks like these;
 c) I did [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) recently, maybe you could just use that?
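
For reference when choosing the check in (b): a signed and an unsigned 32-bit range test accept different values, and the example base `0x82000000` passes only the unsigned one. A minimal sketch, using hypothetical stand-ins (`fits_uimm32`, `fits_simm32`) rather than HotSpot's own helpers:

```cpp
#include <cstdint>

// Hypothetical stand-ins, not HotSpot's helpers.

// True if imm is representable as an unsigned 32-bit value, which is the
// condition for a zero-extending 32-bit mov to reproduce it.
constexpr bool fits_uimm32(int64_t imm) {
  return static_cast<uint64_t>(imm) <= UINT32_MAX;
}

// True if imm is representable as a signed 32-bit value, which is the
// condition for a sign-extended 32-bit immediate to reproduce it.
constexpr bool fits_simm32(int64_t imm) {
  return imm >= INT32_MIN && imm <= INT32_MAX;
}

// The klass base from the PR description has bit 31 set.
static_assert(fits_uimm32(0x82000000) && !fits_simm32(0x82000000),
              "0x82000000 is unsigned-32 only");
// Small negative values are the opposite case.
static_assert(!fits_uimm32(-1) && fits_simm32(-1), "-1 is signed-32 only");
// Values below 2 GB pass both checks.
static_assert(fits_uimm32(0x7fffffff) && fits_simm32(0x7fffffff),
              "0x7fffffff passes both");
```

The two checks agree only for values in `[0, 0x7fffffff]`, so which one is appropriate depends on whether the emitted instruction zero-extends or sign-extends its immediate.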

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447114489

