RFR: JDK-8323497: On x64, use 32-bit immediate moves for narrow klass base if possible
Aleksey Shipilev
shade at openjdk.org
Tue Jan 30 11:51:50 UTC 2024
On Wed, 10 Jan 2024 09:09:50 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
> On x64, we always use the long form of mov immediate to load the klass base into a register. If the klass base fits into 32 bits, we could use the short form and save four instruction bytes.
>
> Before: mov uses 10 instruction bytes:
>
> ;; decode_klass_not_null
> 0x00007f8b089e51c4: movabs $0x82000000,%r11
> 0x00007f8b089e51ce: add %r11,%r10
>
> Now: mov uses 6 instruction bytes:
>
> ;; decode_klass_not_null
> 0x00007fbe609e51c4: mov $0x82000000,%r11d
> 0x00007fbe609e51ca: add %r11,%r10
>
> Note that this also works with CDS enabled and gives us the motivation to allocate low-address ranges unconditionally, even if the zero-based encoding is not possible.
>
> ----------
>
> Tests: tier1 (GHA), tier2 on x64 Linux
src/hotspot/cpu/x86/assembler_x86.cpp line 13369:
> 13367: #ifdef _LP64
> 13368: void Assembler::mov32_or_64(Register dst, int64_t imm) {
> 13369: if ((uint64_t)imm < nth_bit(32)) {
Drive-by comments:
a) macro-assembler stuff like this should be in macroAssembler;
b) there is `is_simm32(imm)` for checks like these;
c) I did [JDK-8319406](https://bugs.openjdk.org/browse/JDK-8319406) recently, maybe you could just use that?
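
For readers following along, the size difference and the range checks under discussion can be sketched outside HotSpot. This is a minimal standalone C++ illustration, not the `Assembler` or `MacroAssembler` API: the names `fits_uint32`, `fits_simm32`, `emit_le` and `encode_mov_r11` are hypothetical, and the encoder handles only `%r11` as the destination. It emits the 6-byte `mov $imm32,%r11d` form when the immediate fits in 32 bits zero-extended (on x64 a write to a 32-bit register clears the upper half), and the 10-byte `movabs` form otherwise.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helpers for illustration only; not HotSpot code.

// True if imm is representable as a zero-extended 32-bit immediate.
static bool fits_uint32(int64_t imm) {
  return static_cast<uint64_t>(imm) <= std::numeric_limits<uint32_t>::max();
}

// True if imm is representable as a sign-extended 32-bit immediate.
static bool fits_simm32(int64_t imm) {
  return imm >= std::numeric_limits<int32_t>::min() &&
         imm <= std::numeric_limits<int32_t>::max();
}

// Appends the low nbytes of v in little-endian order.
static void emit_le(std::vector<uint8_t>& out, uint64_t v, int nbytes) {
  for (int i = 0; i < nbytes; i++) {
    out.push_back(static_cast<uint8_t>(v >> (8 * i)));
  }
}

// Encodes a move of imm into r11, picking the shorter form when possible.
static std::vector<uint8_t> encode_mov_r11(int64_t imm) {
  std::vector<uint8_t> out;
  if (fits_uint32(imm)) {
    // mov $imm32,%r11d: REX.B (0x41), opcode B8+3, 4-byte immediate.
    out.push_back(0x41);
    out.push_back(0xBB);
    emit_le(out, static_cast<uint64_t>(imm), 4);
  } else {
    // movabs $imm64,%r11: REX.W+B (0x49), opcode B8+3, 8-byte immediate.
    out.push_back(0x49);
    out.push_back(0xBB);
    emit_le(out, static_cast<uint64_t>(imm), 8);
  }
  return out;
}
```

With the base from the example, `encode_mov_r11(0x82000000)` yields the 6-byte form, while a base such as `0x800000000` yields the 10-byte form. In this sketch `0x82000000` passes the unsigned check but not the signed one, so the two predicates are not interchangeable for that value.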
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/17340#discussion_r1447114489