RFR: 8354348: Enable Extended EVEX to REX2/REX demotion for commutative operations with same dst and src2
Jatin Bhateja
jbhateja at openjdk.org
Tue Sep 2 02:55:45 UTC 2025
On Thu, 28 Aug 2025 21:09:03 GMT, Srinivas Vamsi Parasa <sparasa at openjdk.org> wrote:
> This change extends Extended EVEX (EEVEX) to REX2/REX demotion for Intel APX NDD instructions to handle commutative operations when the destination register and the second source register (src2) are the same.
>
> Currently, EEVEX to REX2/REX demotion is only enabled when the first source (src1) and the destination are the same. This enhancement allows additional cases of valid demotion for commutative instructions (add, imul, and, or, xor).
>
> For example:
> `eaddl r18, r25, r18` can be encoded as `addl r18, r25` using APX REX2 encoding
> `eaddl r2, r7, r2` can be encoded as `addl r2, r7` using non-APX legacy encoding
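For readers following along, the two examples above can be sketched as a small decision function. This is only an illustration: `Encoding`, `needs_rex2` and `choose_encoding` are hypothetical names, not HotSpot's, and the sketch assumes that demotion is only valid when the NDD instruction does not use NF (the legacy forms always write flags).

```cpp
// Hypothetical model of the demotion decision; not the HotSpot assembler API.
enum class Encoding { ExtendedEvex, Rex2, Legacy };

// APX extended GPRs r16..r31 cannot be named by legacy/REX prefixes.
constexpr bool needs_rex2(int enc) { return enc >= 16; }

// Operands follow the NDD order: op dst, src1, src2.
constexpr Encoding choose_encoding(bool no_flags, bool is_commutative,
                                   int dst, int src1, int src2) {
  const bool first_matches  = (dst == src1);
  const bool second_matches = is_commutative && (dst == src2);
  // Assumption: with NF set, or with dst distinct from both usable sources,
  // the three-operand EEVEX form has to be kept.
  if (no_flags || !(first_matches || second_matches)) {
    return Encoding::ExtendedEvex;
  }
  // The demoted instruction names only dst and the remaining source.
  const int other = first_matches ? src2 : src1;
  return (needs_rex2(dst) || needs_rex2(other)) ? Encoding::Rex2
                                                : Encoding::Legacy;
}
```

Under these assumptions, `choose_encoding(false, true, 18, 25, 18)` yields `Encoding::Rex2` and `choose_encoding(false, true, 2, 7, 2)` yields `Encoding::Legacy`, matching the two `eaddl` cases quoted above.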
src/hotspot/cpu/x86/assembler_x86.cpp line 12932:
> 12930: if (is_commutative && is_demotable(no_flags, dst->encoding(), src2->encoding())) {
> 12931: if (size == EVEX_64bit) {
> 12932: emit_prefix_and_int8(get_prefixq(src1, dst, is_map1), opcode_byte + 2);
It would be good to add a comment above the opcode_byte adjustment explaining that the opcode of the NDD form differs from that of the equivalent demotable variant.
EVEX.LLZ.NP.MAP4.SCALABLE 21 /r AND {NF} {ND=1} rv, rv/mv, rv
REX.W + 23 /r AND r64, r/m64 | RM | Valid | N.E. | r64 AND r/m64
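A rough sketch of why the `+ 2` is needed, as I read the two encodings above: in the NDD form with opcode 21, dst is carried in EVEX.vvvv, src1 in ModRM.r/m and src2 in ModRM.reg. When dst equals src2, the destination therefore sits in ModRM.reg, so the legacy instruction has to be the RM form (23 /r, reg = reg AND r/m) rather than the MR form (21 /r, r/m = r/m AND reg). The `Demoted` struct and `demote` helper below are hypothetical and only model that choice.

```cpp
#include <cstdint>

// Hypothetical result type for illustration; not part of assembler_x86.cpp.
struct Demoted {
  std::uint8_t opcode;  // legacy opcode byte to emit
  int reg;              // register number placed in ModRM.reg
  int rm;               // register number placed in ModRM.r/m
};

// Caller is assumed to have checked that dst == src1, or that the operation
// is commutative and dst == src2. ndd_opcode is the MR opcode, e.g. 0x21.
constexpr Demoted demote(std::uint8_t ndd_opcode, int dst, int src1, int src2) {
  if (dst == src1) {
    // MR form: destination is ModRM.r/m, so the opcode is unchanged.
    return Demoted{ndd_opcode, src2, src1};
  }
  // dst == src2: destination is ModRM.reg, so use the RM form.
  // Adding 2 sets the direction bit of the legacy opcode (0x21 -> 0x23).
  return Demoted{static_cast<std::uint8_t>(ndd_opcode + 2), dst, src1};
}
```

A comment along these lines next to `opcode_byte + 2` would make the adjustment self-explanatory.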
src/hotspot/cpu/x86/assembler_x86.cpp line 13055:
> 13053: bool is_prefixq = (size == EVEX_64bit) ? true : false;
> 13054: bool normal_demotion = is_demotable(no_flags, dst_enc, nds_enc);
> 13055: bool commutative_demotion = is_commutative && is_demotable(no_flags, dst_enc, src_enc);
Nomenclature change: instead of normal_demotion and commutative_demotion, it would be more appropriate to use first_operand_demotable and second_operand_demotable.
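For illustration, the renamed flags could look roughly like this. The body of `is_demotable` is my assumption of what the helper does (no NF, and dst equal to the given source); `DemotionFlags` and `classify` are hypothetical names used only for this sketch.

```cpp
// Assumed behaviour of the helper: demotable only without NF and when the
// destination encoding equals the given source encoding.
constexpr bool is_demotable(bool no_flags, int dst_enc, int nds_enc) {
  return !no_flags && dst_enc == nds_enc;
}

// Hypothetical holder for the two renamed flags.
struct DemotionFlags {
  bool first_operand_demotable;   // dst matches the first source (nds)
  bool second_operand_demotable;  // dst matches the second source, commutative only
};

constexpr DemotionFlags classify(bool no_flags, bool is_commutative,
                                 int dst_enc, int nds_enc, int src_enc) {
  return DemotionFlags{
      is_demotable(no_flags, dst_enc, nds_enc),
      is_commutative && is_demotable(no_flags, dst_enc, src_enc)};
}
```

The names then say which operand permits the demotion, rather than labelling one of the two cases as "normal".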
src/hotspot/cpu/x86/x86_64.ad line 7121:
> 7119: %{
> 7120: predicate(UseAPX);
> 7121: match(Set dst (AddI (LoadI src1) src2));
Will this not be covered by the pattern at line 7103, since ADLC automatically generates a DFA to handle both cases?
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/26997#discussion_r2314775483
PR Review Comment: https://git.openjdk.org/jdk/pull/26997#discussion_r2313941101
PR Review Comment: https://git.openjdk.org/jdk/pull/26997#discussion_r2314792264