RFR: 8351016: RA support for EVEX to REX/REX2 demotion to optimize NDD instructions [v18]

Vladimir Ivanov vlivanov at openjdk.org
Tue Nov 25 07:14:59 UTC 2025


On Tue, 25 Nov 2025 06:19:13 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> src/hotspot/cpu/x86/x86.ad line 2641:
>> 
>>> 2639:   }
>>> 2640: 
>>> 2641:   if (mdef->num_opnds() <= oper_index || mdef->operand_index(oper_index) < 0 ||
>> 
>> IMO all 4 checks (plus, `mdef->in(mdef->operand_index(oper_index)) != null`) can be considered as structural invariants for mach nodes marked as NDD demotable. So, I'd like to see an assert ensuring they don't fail for NDD-demotable nodes.
>> 
>> Speaking of the code shape, I suggest to restore previous shape with explicit checks before the switch and add asserts before returning negative result:
>> 
>> static boolean is_ndd_demotable() {
>>   return ((mdef->flags() & Node::PD::Flag_ndd_demotable) != 0) ||
>>          ((mdef->flags() & Node::PD::Flag_ndd_demotable_commutative) != 0));
>> }
>> 
>> bool Matcher::is_register_biasing_candidate(const MachNode* mdef, int oper_index) {
>>   if (mdef == nullptr) {
>>     return false;
>>   }
>> 
>>   if (mdef->num_opnds() <= oper_index || 
>>       mdef->operand_index(oper_index) < 0 ||
>>       mdef->in(mdef->operand_index(oper_index)) != nullptr) {
>>     assert(!is_ndd_demotable(mdef), "%s", mdef->Name());
>>     return false;
>>   }
>> 
>>   // Complex memory operand covers multiple incoming edges needed for
>>   // address computation, biasing def towards any address component will not
>>   // result into NDD demotion by assembler.
>>   if (mdef->operand_num_edges(oper_index) != 1) {
>>     assert(!is_ndd_demotable(mdef), "%s", mdef->Name());
>>     return false;
>>   }
>> 
>>   // Demotion candidate must be register mask compatible with definition.
>>   const RegMask& oper_mask = mdef->in_RegMask(mdef->operand_index(oper_index));
>>   if (!oper_mask.overlap(mdef->out_RegMask())) {
>>     assert(!is_ndd_demotable(mdef), "%s", mdef->Name());
>>     return false;
>>   }
>> 
>>   switch (oper_index) {
>>     // First operand of MachNode corresponding to Intel APX NDD selection
>>     // pattern can share its assigned register with definition operand if
>>     // their live ranges do not overlap. In such a scenario we can demote
>>     // it to legacy map0/map1 instruction by replacing its 4-byte extended
>>     // EVEX prefix with shorter REX/REX2 encoding. Demotion candidates
>>     // are decorated with a special flag by instruction selector.
>>     case 1: return is_ndd_demotable(mdef);
>> 
>>     // Definition operand of commutative operation can be biased towards second operand.
>>     case 2: return (mdef->flags() & Node::PD::Flag_ndd_demotable_commutative) != 0;
>> 
>>     // Current scheme only...
>
>>  if (mdef->operand_num_edges(oper_index) != 1) {
>>    assert(!is_ndd_demotable(mdef), "%s", mdef->Name());
>>    return false;
>>  }
> 
> We don't need this assertion, NDD commutative operation MachNode may have its first input as memory, but we may pick second input for biasing.

Do you have `addI_rReg_rReg_mem_ndd` case in mind here? (It matches `Set dst (AddI src1 (LoadI src2))` and is marked as `Flag_ndd_demotable_commutative`).

>> src/hotspot/share/opto/chaitin.cpp line 1475:
>> 
>>> 1473: 
>>> 1474: OptoReg::Name PhaseChaitin::select_bias_lrg_color(LRG &lrg) {
>>> 1475:   uint bias_lrg1_idx = lrg._copy_bias;
>> 
>> Do I get it right that `_copy_bias2 != 0` implies `_copy_bias != 0`? Can you enforce it? (Here and in `PhaseChaitin::bias_color` where `_copy_bias2` is initialized.)
>
> Bias live ranges are indipendent while marking and during enforcement.

Do I get it right that memory operands in the first position for commutative operations are the reason why `copy_bias` is invalid while `copy_bias2` is not?

>> src/hotspot/share/opto/chaitin.cpp line 1498:
>> 
>>> 1496:     // the chances of register sharing once the bias live range
>>> 1497:     // becomes the part of IFG.
>>> 1498:     uint bias_lrg = lrgs(bias_lrg1_idx).degrees_of_freedom() >
>> 
>> How does it work when `bias_lrg1_idx` or `bias_lrg2_idx` indices are 0? Original code had a guard against such condition.
>
> Original code performed upfront non-zero check on copy_bias and then either selects the copy bias or constrain the register mask of defintion, now also we are checking for both the copybias upfront and select first or second bias which are also guarded with non-zero bias check else select the bias with minimum degreee of freedom if none of the live ranges are part of IFG.

So, the implicit assumption here (enforced by the caller) is that at least one of `bias_lrg1_idx` and `bias_lrg2_idx` values should be non-zero. At least, it deserves an assert here.

>> src/hotspot/share/opto/chaitin.cpp line 1538:
>> 
>>> 1536: 
>>> 1537:   // Try biasing the color with non-interfering bias live range[s].
>>> 1538:   if (lrg._copy_bias != 0 || lrg._copy_bias2 != 0) {
>> 
>> IMO you can drop `(lrg._copy_bias != 0 || lrg._copy_bias2 != 0)` guard. Original code didn't check it and there are enough guard in `select_bias_lrg_color` to catch it.
>
> Hi @iwanowww , guard checks were also part of [stock code](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/chaitin.cpp#L1496)

The check you pointed at is performed on the result on LRG lookup (which is placed in `select_bias_lrg_color()` now) and not the original index stored in `_copy_bias`/`_copy_bias2`.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26283#discussion_r2558778478
PR Review Comment: https://git.openjdk.org/jdk/pull/26283#discussion_r2558787562
PR Review Comment: https://git.openjdk.org/jdk/pull/26283#discussion_r2558768375
PR Review Comment: https://git.openjdk.org/jdk/pull/26283#discussion_r2558758109


More information about the hotspot-compiler-dev mailing list