RFR: 8351016: RA support for EVEX to REX/REX2 demotion to optimize NDD instructions
Quan Anh Mai
qamai at openjdk.org
Mon Aug 4 15:52:28 UTC 2025
On Mon, 14 Jul 2025 02:36:24 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
> Currently, when choosing the colour (register) for a definition live range during the select phase of register allocation, we pick the first available colour that does not conflict with the already allocated neighbouring live ranges.
>
> With the Intel APX NDD ISA extension, several existing two-address arithmetic instructions can now have an explicit non-destructive destination operand. In general this saves additional spills for two-address instructions where the destination is also the first source operand and that source's live range extends past the instruction.
>
> All NDD instructions mandate the extended EVEX encoding with a bulky 4-byte prefix. [JDK-8351994](https://github.com/openjdk/jdk/pull/24431) added logic for NDD to REX/REX2 demotion in the assembler layer, but because of the existing first-colour selection policy in the register allocator, demotions are rare. This patch biases the allocation of an NDD definition towards the first source operand, or towards the second source operand for the commutative class of operations.
>
> Biasing is a compile-time hint to the allocator and is different from live range coalescing (aggressive / conservative), which merges the two live ranges using a union-find algorithm. Given that the REX encoding needs a 1-byte prefix and the REX2 encoding needs a 2-byte prefix, demotion saves considerable JIT code size.
>
> The patch shows around a 5-20% improvement in code size by facilitating NDD demotion.
>
> For the following micro, the method's JIT code size is reduced from 136 to 120 bytes, a reduction of around 12%.
>
> **Micro:-**
> <img width="1344" height="315" alt="image" src="https://github.com/user-attachments/assets/9cbe9da8-d6af-4b1c-bb55-3e5d86eb2cf9" />
>
>
> **Baseline :-**
> <img width="1013" height="163" alt="image" src="https://github.com/user-attachments/assets/ff5d50c6-fdfa-40e8-b93d-5f117d5a1ac6" />
>
> **With opt:-**
> <img width="940" height="160" alt="image" src="https://github.com/user-attachments/assets/bff425b0-f7bf-4ffd-a43d-18bdeb36b000" />
>
> Thorough validation is underway using the latest [Intel Software Development Emulator version 9.58](https://www.intel.com/content/www/us/en/download/684897/intel-software-development-emulator.html).
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
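The difference between first-available colour selection and the biased selection described in the quoted text can be sketched in a few lines. This is only a toy model, not HotSpot code: `select_colour` and `prefix_bytes` are hypothetical names, registers are plain integers, and the size model assumes that a demotable instruction needs a 1-byte REX prefix when all registers are below 16 and a 2-byte REX2 prefix otherwise, against the 4-byte EVEX prefix quoted above.

```python
def select_colour(free, taken, bias=()):
    """Pick a colour for a definition live range (toy model).

    free  -- colours the allocator may use, in first-fit order
    taken -- colours already held by interfering neighbours
    bias  -- hinted colours, tried first in priority order
    """
    # Biased choice: a hint is honoured only if the colour is usable.
    for colour in bias:
        if colour in free and colour not in taken:
            return colour
    # Fallback: plain first-available colour.
    for colour in free:
        if colour not in taken:
            return colour
    return None  # no colour left; a real allocator would spill


def prefix_bytes(dst, src1, src2, commutative):
    """Prefix size under the simplified model stated in the lead-in."""
    demotable = dst == src1 or (commutative and dst == src2)
    if not demotable:
        return 4  # extended EVEX prefix of the NDD form
    # Assumption: registers numbered 16 and above need REX2.
    return 2 if max(dst, src1, src2) >= 16 else 1


free = list(range(8))
first_fit = select_colour(free, taken=set())
biased = select_colour(free, taken=set(), bias=[3, 5])
```

With the sources in registers 3 and 5 and no interfering neighbours, first-fit returns 0 and the model charges a 4-byte prefix, while the biased call returns 3 and the model charges 1 byte. If register 3 is still held by an interfering live range, the hint falls through to 5, which only demotes for a commutative operation.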
This may also be applicable to non-APX instructions. For example, when casting long to int, if the destination and the source are the same register, then we do not need to emit any code. Given that, do you think it would be better to mark operands in the ad file as preferring the same register as the result?
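A rough sketch of what such a marking could amount to, purely for illustration: the table `SAME_REG_HINTS`, the node kind names, and both helper functions are hypothetical and do not reflect actual ad file syntax or HotSpot APIs. The sketch also assumes that the long-to-int cast would otherwise be emitted as a single 32-bit register move.

```python
# Hypothetical hint table: node kind -> indices of the inputs whose
# register the result would prefer to share.
SAME_REG_HINTS = {
    "ConvL2I": (0,),      # result prefers the register of its only input
    "AddI_ndd": (0, 1),   # commutative: first input, then second
}


def bias_for(kind, input_colours):
    """Return the preferred colours for a definition, in priority order."""
    return [input_colours[i] for i in SAME_REG_HINTS.get(kind, ())]


def emit_l2i(dst, src):
    """Toy emitter: no code at all when source and destination coincide."""
    if dst == src:
        return []
    return [f"movl r{dst}, r{src}"]
```

Under these assumptions, `bias_for("ConvL2I", [6])` yields `[6]`; if the allocator honours that hint, `emit_l2i(6, 6)` produces no instructions, whereas an unbiased choice such as register 0 would still cost one move.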
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26283#issuecomment-3069328067
More information about the hotspot-compiler-dev
mailing list