RFR: 8367292: VectorAPI: Optimize VectorMask.fromLong/toLong() for SVE [v3]
    Xiaohong Gong 
    xgong at openjdk.org
       
    Tue Oct 28 10:23:07 UTC 2025
    
    
  
On Tue, 28 Oct 2025 09:43:03 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> Yes, the IR changes you pointed out above are right.
>> 
>> The major performance uplift comes from the existing optimization `VectorStoreMask (VectorLoadMask v) => v`. As you know, `VectorLoadMask` is generated by some APIs such as `VectorMask.fromArray()`. With this change, `VectorMask.fromLong()` also generates this IR, so the mask conversions (V->P and P->V) between these APIs can be eliminated.
>> 
>> Another performance uplift comes from more flexible vector register allocation. Before, the same vector register was specified for different instructions. Now the choice is left to the register allocator (RA), which potentially breaks the unexpected data dependence across loop iterations.
>
> @XiaohongGong If this is only about `VectorStoreMask (VectorLoadMask v) => v`, why not solve the issue with an `Ideal` optimization? Would that be an alternative?
`VectorStoreMask (VectorLoadMask v) => v` already exists in C2. Splitting `VectorLongToMask` and `VectorMaskToLong` lets them reuse this transformation. That is why performance improves: redundant mask conversions are optimized out in some cases.
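
To illustrate why the split matters, here is a minimal toy model in plain Java. It is not C2 code: the `Node` class, the `identity` helper, and the exact node shapes are hypothetical and only mimic the rule named above. The sketch assumes that, after the split, the `fromLong` side exposes a `VectorLoadMask` node on top of a vector-producing `VectorLongToMask`, so the existing rule can match.

```java
// Toy model only: hypothetical classes, not C2's actual node hierarchy.
public class MaskIdentitySketch {
    // A minimal IR node: an opcode name plus at most one input.
    static final class Node {
        final String op;
        final Node in;
        Node(String op, Node in) { this.op = op; this.in = in; }
    }

    // Mimics the rule "VectorStoreMask (VectorLoadMask v) => v":
    // a store-mask fed directly by a load-mask folds to the load's input.
    static Node identity(Node n) {
        if (n.op.equals("VectorStoreMask") && n.in != null
                && n.in.op.equals("VectorLoadMask")) {
            return n.in.in;
        }
        return n; // no match: the node is left unchanged
    }

    public static void main(String[] args) {
        Node bits = new Node("LongInput", null);

        // Assumed shape before the split: one opaque node, so the rule cannot match.
        Node opaque = new Node("VectorStoreMask", new Node("VectorLongToMask", bits));
        System.out.println(identity(opaque) == opaque); // prints true

        // Assumed shape after the split: a VectorLoadMask is exposed, so the rule fires.
        Node vec = new Node("VectorLongToMask", bits);
        Node split = new Node("VectorStoreMask", new Node("VectorLoadMask", vec));
        System.out.println(identity(split) == vec); // prints true
    }
}
```

In the second case the V->P and P->V conversions cancel, which corresponds to the redundant mask conversions mentioned above.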
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/27481#discussion_r2468900531
    
    