RFR: 8282204: Use lea instructions for arithmetic operations on x86_64 [v5]

Jie Fu jiefu at openjdk.java.net
Mon Feb 28 11:29:52 UTC 2022


On Fri, 25 Feb 2022 12:11:43 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:

>> Hi,
>> 
>> This patch adds several matching rules to the x86_64 backend so that it uses `lea` instructions for several fused arithmetic operations. Also, `lea` doesn't kill flags and allows different `dst` and `src`, so it is preferred over `sll` where possible.
>> 
>> Thank you very much.
>
> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:
> 
>   address reviews

Changes requested by jiefu (Reviewer).

src/hotspot/cpu/x86/vm_version_x86.hpp line 1056:

> 1054:   // Pre Icelake Intels have inefficient 3-op lea with 3 latency, this can be
> 1055:   // replaced by add-add or lea-add
> 1056:   static bool supports_fast_3op_lea() {

Please make it clear what you mean by '3op'.

src/hotspot/cpu/x86/vm_version_x86.hpp line 1058:

> 1056:   static bool supports_fast_3op_lea() {
> 1057:     return supports_fast_2op_lea() &&
> 1058:            ((is_intel() && supports_clwb()) || // Icelake and above

Why do we use `supports_clwb()` here?

src/hotspot/cpu/x86/vm_version_x86.hpp line 1059:

> 1057:     return supports_fast_2op_lea() &&
> 1058:            ((is_intel() && supports_clwb()) || // Icelake and above
> 1059:             is_amd());

Does this mean all AMD CPUs would be fast-3op if they are fast-2op?
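
To make the question concrete, here is a minimal standalone sketch of the boolean structure of the quoted predicate. The `CpuFeatures` struct and its field names are hypothetical stand-ins for the `VM_Version` queries and are not part of HotSpot; only the shape of the expression is taken from the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for the VM_Version feature queries.
// Field names are illustrative only and do not exist in HotSpot.
struct CpuFeatures {
  bool is_intel;
  bool is_amd;
  bool has_clwb;
  bool fast_2op_lea;
};

// Mirrors the boolean structure of the quoted supports_fast_3op_lea().
constexpr bool supports_fast_3op_lea(CpuFeatures cpu) {
  return cpu.fast_2op_lea &&
         ((cpu.is_intel && cpu.has_clwb) || cpu.is_amd);
}

// Any AMD CPU with fast 2-op lea is classified as fast 3-op, with or without clwb.
static_assert(supports_fast_3op_lea(CpuFeatures{false, true, false, true}), "");
// An Intel CPU without clwb is not, even if 2-op lea is fast.
static_assert(!supports_fast_3op_lea(CpuFeatures{true, false, false, true}), "");
```

As written, the AMD branch has no further condition, which is what prompts the question above.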

src/hotspot/cpu/x86/x86_64.ad line 450:

> 448:   _INT_NO_RCX_REG_mask.Remove(OptoReg::as_OptoReg(rcx->as_VMReg()));
> 449: 
> 450:   _INT_NO_RBP_R13_REG_mask = _LONG_REG_mask;

Should it be `= _INT_REG_mask`?

src/hotspot/cpu/x86/x86_64.ad line 7503:

> 7501: instruct leaI_rReg_immI2_immI(rRegI dst, rRegI index, immI2 scale, immI disp)
> 7502: %{
> 7503:   predicate(VM_Version::supports_fast_2op_lea());

We actually have three operands (index, scale and disp) for this rule.
Shouldn't this predicate be `VM_Version::supports_fast_3op_lea()`?

Please make it clear what you mean by `supports_fast_3op_lea`.
I guess you mean 3 operands, namely `base`, `index` and `disp`, right?
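
The two readings can be sketched as follows. This assumes that `lea` computes `base + (index << scale) + disp` and that "op" counts only the `base`, `index` and `disp` components, with `scale` treated as a modifier of `index`. That counting rule is my guess at the intent, not something the patch states, and the `LeaOperands` type and helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of which address components a lea form uses.
struct LeaOperands {
  bool has_base;
  bool has_index;  // scale, if any, is counted as part of index here
  bool has_disp;
};

// Number of address components under the assumed counting rule.
constexpr int component_count(LeaOperands a) {
  return (a.has_base ? 1 : 0) + (a.has_index ? 1 : 0) + (a.has_disp ? 1 : 0);
}

// Value computed by lea: base + index * 2^scale + disp, with scale in [0, 3].
constexpr std::int64_t lea(std::int64_t base, std::int64_t index,
                           int scale, std::int64_t disp) {
  return base + index * (std::int64_t{1} << scale) + disp;
}

// Shape of leaI_rReg_immI2_immI: no base, index with scale, plus disp.
static_assert(component_count(LeaOperands{false, true, true}) == 2, "");
// A full base + index + disp form has three components.
static_assert(component_count(LeaOperands{true, true, true}) == 3, "");
// Example: index = 5, scale = 2 (multiply by 4), disp = 7.
static_assert(lea(0, 5, 2, 7) == 27, "");
```

Under that assumption the rule above has two components; if `scale` is meant to count as its own operand, it has three. Either way, a comment next to the predicate stating the intended counting would settle it.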

-------------

PR: https://git.openjdk.java.net/jdk/pull/7560

