RFR: 8351016: RA support for EVEX to REX/REX2 demotion to optimize NDD instructions [v12]

Vladimir Ivanov vlivanov at openjdk.org
Tue Nov 18 23:55:20 UTC 2025


On Sat, 15 Nov 2025 02:24:47 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Currently, while choosing the colour (register) for a definition live range during the select phase of register allocation, we pick the first available colour that does not conflict with the colours of already-allocated neighboring live ranges.
>> 
>> With the Intel APX NDD ISA extension, several existing two-address arithmetic instructions can now have an explicit non-destructive destination operand. In general, this saves additional spills for two-address instructions where the destination is also the first source operand and the source live range extends beyond the current instruction.
>> 
>> All NDD instructions mandate extended EVEX encoding with a bulky 4-byte prefix. [JDK-8351994](https://github.com/openjdk/jdk/pull/24431) added logic for NDD to REX/REX2 demotion in the assembler layer, but because the register allocator currently selects the first available colour, demotions are rare. This patch biases the allocation of an NDD definition towards the first source operand and, for commutative operations, also towards the second source operand.
>> 
>> Biasing is a compile-time hint to the allocator and is different from live range coalescing (aggressive/conservative), which merges the two live ranges using the union-find algorithm. Given that REX encoding needs a 1-byte prefix and REX2 encoding needs a 2-byte prefix, demotion saves considerable JIT code size.
>> 
>> The patch shows around 5-20% improvement in code size by facilitating NDD demotion.
>> 
>> For the following micro, the method JIT code size was reduced from 136 to 120 bytes, which is roughly a 12% reduction in code size footprint.
>>  
>> **Micro:-**
>> <img width="900" height="300" alt="image" src="https://github.com/user-attachments/assets/9cbe9da8-d6af-4b1c-bb55-3e5d86eb2cf9" />
>> 
>> 
>> **Baseline :-**
>> <img width="900" height="300" alt="image" src="https://github.com/user-attachments/assets/ff5d50c6-fdfa-40e8-b93d-5f117d5a1ac6" />
>> 
>> **With opt:-**
>> <img width="900" height="300" alt="image" src="https://github.com/user-attachments/assets/bff425b0-f7bf-4ffd-a43d-18bdeb36b000" />
>> 
>> Thorough validations are underway using the latest [Intel Software Development Emulator version 9.58](https://www.intel.com/content/www/us/en/download/684897/intel-software-development-emulator.html).
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
> 
>   Review comments resolutions

Looks much better now, Jatin.

It looks like `Matcher::should_attempt_register_biasing()` has some implicit expectations about the shape of `mdef`. Is it possible to materialize them (as asserts on mach nodes with the `Flag_ndd_demotable` or `Flag_ndd_commutative` flags set)? That way, a misplaced declaration would be caught during testing.
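
To illustrate the kind of assert meant here, a minimal standalone sketch follows. It is not HotSpot code: `MockMachNode`, `has_expected_ndd_shape()` and `verify_ndd_shape()` are hypothetical names, the flag values are placeholders, and the assumed expectations (a demotable node has a definition plus one single-edge source operand, a commutative one has two, and the commutative flag implies the demotable flag) are my reading of the current checks, not something the patch states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Placeholder flag values; the real ones are defined by the x86 port.
enum : uint32_t {
  Flag_ndd_demotable   = 1u << 0,
  Flag_ndd_commutative = 1u << 1,
};

// Hypothetical, simplified model of a mach node: its flags plus the number
// of input edges per operand (index 0 stands for the definition).
struct MockMachNode {
  uint32_t flags;
  std::vector<int> operand_edges;
};

// Returns true when the node has the shape that register biasing is
// assumed to rely on. Nodes without NDD flags are always accepted.
static bool has_expected_ndd_shape(const MockMachNode& n) {
  const bool demotable   = (n.flags & Flag_ndd_demotable) != 0;
  const bool commutative = (n.flags & Flag_ndd_commutative) != 0;
  if (commutative && !demotable) {
    return false;  // assumption: commutative only makes sense with demotable
  }
  if (!demotable) {
    return true;   // nothing to verify
  }
  // Definition plus one source, or plus two sources for commutative nodes.
  const std::size_t required = commutative ? 3 : 2;
  if (n.operand_edges.size() < required) {
    return false;
  }
  for (std::size_t i = 1; i < required; i++) {
    if (n.operand_edges[i] != 1) {
      return false;  // assumption: a register operand has exactly one edge
    }
  }
  return true;
}

// Debug-only verification that could run once per flagged node.
static void verify_ndd_shape(const MockMachNode& n) {
  assert(has_expected_ndd_shape(n) && "NDD flag set on a node with an unexpected operand shape");
}
```

With a check of this kind, an instruct that sets one of the flags but has, say, a memory source operand would trip the assert in a debug build instead of silently never being biased.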

src/hotspot/cpu/x86/x86.ad line 2641:

> 2639:   }
> 2640: 
> 2641:   if (mdef->num_opnds() <= oper_index || mdef->operand_index(oper_index) < 0) {

Move `mdef->operand_num_edges(oper_index) == 1` check here?

src/hotspot/cpu/x86/x86.ad line 2648:

> 2646:   // can be demoted to REX/REX2 encodings. For commutative operations with register
> 2647:   // operands, allocation of definition operand is biased towards both the operands.
> 2648:   return (((mdef->flags() & Node::PD::Flag_ndd_demotable) != 0) &&

It is called with either `oper_index == 1` or `oper_index == 2`. Can you make it explicit that no other operand participates in register biasing? Also, I'd expand the check so that it becomes clear that the 1st operand requires `Flag_ndd_demotable` and the 2nd requires both `Flag_ndd_demotable` and `Flag_ndd_commutative` to be set.
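
Something along these lines is what I have in mind. This is a standalone sketch rather than a patch: the helper name `is_biasing_candidate()` is hypothetical, it takes the raw flag word instead of a `MachNode`, and the flag values are placeholders.

```cpp
#include <cstdint>

// Placeholder flag values; the real ones are defined by the x86 port.
enum : uint32_t {
  Flag_ndd_demotable   = 1u << 0,
  Flag_ndd_commutative = 1u << 1,
};

// Hypothetical helper: decides whether the definition may be biased
// towards the source operand at oper_index.
static bool is_biasing_candidate(uint32_t flags, int oper_index) {
  const bool demotable   = (flags & Flag_ndd_demotable) != 0;
  const bool commutative = (flags & Flag_ndd_commutative) != 0;
  switch (oper_index) {
    case 1:  return demotable;                 // 1st source: demotable is enough
    case 2:  return demotable && commutative;  // 2nd source: needs both flags
    default: return false;                     // no other operand participates
  }
}
```

The `switch` spells out the per-operand requirement and makes the "no other operand" case explicit, which a single combined boolean expression tends to hide.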

src/hotspot/share/opto/matcher.hpp line 509:

> 507: 
> 508:  public:
> 509:   static bool should_attempt_register_biasing(const MachNode* mdef, int oper_index);

I suggest calling it `is_register_biasing_candidate(const MachNode* mdef, int oper_index)`.

-------------

PR Review: https://git.openjdk.org/jdk/pull/26283#pullrequestreview-3480085019
PR Review Comment: https://git.openjdk.org/jdk/pull/26283#discussion_r2539963456
PR Review Comment: https://git.openjdk.org/jdk/pull/26283#discussion_r2539976175
PR Review Comment: https://git.openjdk.org/jdk/pull/26283#discussion_r2539955192

