RFR: 8282204: Use lea instructions for arithmetic operations on x86_64 [v5]
Quan Anh Mai
duke at openjdk.java.net
Mon Feb 28 12:01:36 UTC 2022
On Mon, 28 Feb 2022 11:11:12 GMT, Jie Fu <jiefu at openjdk.org> wrote:
>> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:
>>
>> address reviews
>
> src/hotspot/cpu/x86/vm_version_x86.hpp line 1058:
>
>> 1056: static bool supports_fast_3op_lea() {
>> 1057: return supports_fast_2op_lea() &&
>> 1058: ((is_intel() && supports_clwb()) || // Icelake and above
>
> Why do we use `supports_clwb()` here?
Ice Lake introduced the `clwb` instruction, so I use it to filter Intel CPUs.
> src/hotspot/cpu/x86/vm_version_x86.hpp line 1059:
>
>> 1057: return supports_fast_2op_lea() &&
>> 1058: ((is_intel() && supports_clwb()) || // Icelake and above
>> 1059: is_amd());
>
> Does it mean all AMD cpus would be fast-3op if they are fast-2op?
Yes, on AMD a complex `lea` has a 2-cycle latency and can execute on 2 ports, so it is better than two `add`s.
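
The check discussed above can be sketched in isolation. This is only an illustration and not the HotSpot code: the `CpuFeatures` struct and its field names are hypothetical stand-ins for the `VM_Version` queries, and `has_clwb` plays the role of the "Icelake and above" proxy from the quoted lines.

```cpp
// Hypothetical stand-in for the CPU information that VM_Version exposes.
struct CpuFeatures {
    bool is_intel;      // vendor is Intel
    bool is_amd;        // vendor is AMD
    bool has_clwb;      // CLWB support, used here as a proxy for newer Intel parts
    bool fast_2op_lea;  // result of the existing two-operand check
};

// Same shape as the quoted predicate: a fast three-operand lea requires
// a fast two-operand lea, plus either an Intel CPU with CLWB or any AMD CPU.
bool supports_fast_3op_lea(const CpuFeatures& cpu) {
    return cpu.fast_2op_lea &&
           ((cpu.is_intel && cpu.has_clwb) || cpu.is_amd);
}
```

With these assumptions, an Intel CPU without `clwb` fails the check even if its two-operand `lea` is fast, while an AMD CPU passes whenever the two-operand check does.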
> src/hotspot/cpu/x86/x86_64.ad line 450:
>
>> 448: _INT_NO_RCX_REG_mask.Remove(OptoReg::as_OptoReg(rcx->as_VMReg()));
>> 449:
>> 450: _INT_NO_RBP_R13_REG_mask = _LONG_REG_mask;
>
> Should it be `= _INT_REG_mask` ?
Fixed, thanks
> src/hotspot/cpu/x86/x86_64.ad line 7503:
>
>> 7501: instruct leaI_rReg_immI2_immI(rRegI dst, rRegI index, immI2 scale, immI disp)
>> 7502: %{
>> 7503: predicate(VM_Version::supports_fast_2op_lea());
>
> We actually have three operands (index, scale and disp) for this rule.
> Shouldn't this predicate be `VM_Version::supports_fast_3op_lea()`?
>
> Please make it clear what you mean by `supports_fast_3op_lea`.
> I guess you mean 3 operands including `base, index and disp`, right?
Yes, I have clarified this in the comment on `VM_Version::supports_fast_3op_lea()`.
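
To make the distinction concrete, here is a small sketch of the arithmetic the two `lea` forms compute. It is not taken from the patch: the function names are hypothetical, and it assumes, as this thread suggests, that the operand count refers to base, index and disp, with the scale being a shift amount on the index rather than a separate operand.

```cpp
#include <cstdint>

// Two-operand form: index and displacement, no base register.
// `scale` is a shift amount in the range 0..3 (multipliers 1, 2, 4, 8).
// Unsigned arithmetic is used so that overflow wraps instead of being undefined.
std::int32_t lea_index_disp(std::int32_t index, unsigned scale, std::int32_t disp) {
    std::uint32_t r = (static_cast<std::uint32_t>(index) << scale)
                    + static_cast<std::uint32_t>(disp);
    return static_cast<std::int32_t>(r);
}

// Three-operand form: base, scaled index and displacement in one instruction.
std::int32_t lea_base_index_disp(std::int32_t base, std::int32_t index,
                                 unsigned scale, std::int32_t disp) {
    std::uint32_t r = static_cast<std::uint32_t>(base)
                    + (static_cast<std::uint32_t>(index) << scale)
                    + static_cast<std::uint32_t>(disp);
    return static_cast<std::int32_t>(r);
}
```

Under that reading, a rule such as `leaI_rReg_immI2_immI` matches the first shape (index, scale, disp), which is why its predicate is the two-operand one.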
-------------
PR: https://git.openjdk.java.net/jdk/pull/7560