RFR: 8282204: Use lea instructions for arithmetic operations on x86_64 [v5]

Quan Anh Mai duke at openjdk.java.net
Mon Feb 28 12:01:36 UTC 2022


On Mon, 28 Feb 2022 11:11:12 GMT, Jie Fu <jiefu at openjdk.org> wrote:

>> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   address reviews
>
> src/hotspot/cpu/x86/vm_version_x86.hpp line 1058:
> 
>> 1056:   static bool supports_fast_3op_lea() {
>> 1057:     return supports_fast_2op_lea() &&
>> 1058:            ((is_intel() && supports_clwb()) || // Icelake and above
> 
> Why do we use `supports_clwb()` here?

Ice Lake introduces the `clwb` instruction, so I use it to filter Intel CPUs.

> src/hotspot/cpu/x86/vm_version_x86.hpp line 1059:
> 
>> 1057:     return supports_fast_2op_lea() &&
>> 1058:            ((is_intel() && supports_clwb()) || // Icelake and above
>> 1059:             is_amd());
> 
> Does this mean that all AMD CPUs are fast-3op if they are fast-2op?

Yes. On AMD, a complex `lea` has a 2-cycle latency and can execute on 2 ports, so it is better than two `add`s.
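
For illustration only, here is a minimal C++ sketch of the arithmetic being discussed. The function names `lea_3op` and `two_adds` are hypothetical and not part of the patch. A complex `lea` of the form `lea dst, [base + index * scale + disp]` produces its result in one instruction, while the alternative without `lea` is a chain of two dependent additions (shown here for a scale of 1).

```cpp
#include <cstdint>

// Hypothetical helper: the value that a complex lea such as
//   lea dst, [base + index * scale + disp]
// computes in a single instruction. On x86 the scale is 1, 2, 4 or 8.
inline int64_t lea_3op(int64_t base, int64_t index, int scale, int32_t disp) {
  return base + index * scale + disp;
}

// Hypothetical helper: the same value for scale == 1, written as two
// additions where the second depends on the result of the first.
inline int64_t two_adds(int64_t base, int64_t index, int32_t disp) {
  int64_t tmp = base + index;  // first add
  return tmp + disp;           // second add, waits for the first
}
```

Both helpers return the same value when the scale is 1. The sketch says nothing about timing; the latency and port numbers are the ones quoted above.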

> src/hotspot/cpu/x86/x86_64.ad line 450:
> 
>> 448:   _INT_NO_RCX_REG_mask.Remove(OptoReg::as_OptoReg(rcx->as_VMReg()));
>> 449: 
>> 450:   _INT_NO_RBP_R13_REG_mask = _LONG_REG_mask;
> 
> Should it be `= _INT_REG_mask` ?

Fixed, thanks.

> src/hotspot/cpu/x86/x86_64.ad line 7503:
> 
>> 7501: instruct leaI_rReg_immI2_immI(rRegI dst, rRegI index, immI2 scale, immI disp)
>> 7502: %{
>> 7503:   predicate(VM_Version::supports_fast_2op_lea());
> 
> We actually have three operands (index, scale and disp) for this rule.
> Shouldn't this predicate be `VM_Version::supports_fast_3op_lea()`?
> 
> Please make it clear what you mean by `supports_fast_3op_lea`.
> I guess you mean 3 operands including `base, index and disp`, right?

Yes, I have clarified this in the comment on `VM_Version::supports_fast_3op_lea()`.
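
As a rough sketch of that counting convention (not code from the patch), the operands are the address components `base`, `index` and `disp`, with the scale treated as part of the index. The `LeaAddress` struct and `lea_operand_count` function below are hypothetical names used only for this illustration, and treating a zero displacement as "no displacement" is an assumption.

```cpp
#include <cstdint>

// Hypothetical model of an x86 address expression:
//   base + index * scale + disp
struct LeaAddress {
  bool has_base;
  bool has_index;  // the scale belongs to the index and is not counted on its own
  int32_t disp;    // assumption: 0 means "no displacement"
};

// Number of address components present, which is the operand count
// meant by the 2op/3op naming under the assumptions above.
inline int lea_operand_count(const LeaAddress& a) {
  return (a.has_base ? 1 : 0) + (a.has_index ? 1 : 0) + (a.disp != 0 ? 1 : 0);
}
```

Under this reading, a rule like `leaI_rReg_immI2_immI`, which has an index (with its scale) and a displacement but no base, counts as 2 operands, so a 2-operand predicate would apply to it.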

-------------

PR: https://git.openjdk.java.net/jdk/pull/7560