RFR: 8358521: Optimize vector operations by reassociating broadcasted inputs [v4]
Emanuel Peter
epeter at openjdk.org
Tue Feb 24 08:46:08 UTC 2026
On Tue, 24 Feb 2026 06:17:30 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> Hi all,
>>
>> This patch optimizes SIMD kernels that make heavy use of broadcasted inputs through the following reassociating ideal transformations.
>>
>>
>> VectorOperation (VectorBroadcast INP1, VectorBroadcast INP2) =>
>> VectorBroadcast (ScalarOperation INP1, INP2)
>>
>> VectorOperation (VectorBroadcast INP1) (VectorOperation (VectorBroadcast INP2) INP3) =>
>> VectorOperation INP3 (VectorOperation (VectorBroadcast INP1) (VectorOperation INP2))
>>
>>
>> The idea is to push broadcasts across the vector operation and replace the vector operation with an equivalent, cheaper scalar variant. Currently, the patch handles the most common vector operations.
>>
>> The following are the performance numbers for the benchmark included with this patch on latest-generation x86 targets:
>>
>> **AMD Turin (2.1GHz)**
>> <img width="1122" height="355" alt="image" src="https://github.com/user-attachments/assets/3f5087bf-0e14-4c56-b0c2-3d23253bad54" />
>>
>> **Intel Granite Rapids (2.1GHz)**
>> <img width="1105" height="325" alt="image" src="https://github.com/user-attachments/assets/c8481f86-4db2-4c4e-bd65-51542c59fe63" />
>>
>>
>>
>> Kindly review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>
> Review comments resolutions
Just scanned the PR description:
VectorOperation (VectorBroadcast INP1) (VectorOperation (VectorBroadcast INP2) INP3) =>
VectorOperation INP3 (VectorOperation (VectorBroadcast INP1) (VectorOperation INP2))
I think it should be `VectorBroadcast INP2` in both cases, and not `VectorOperation INP2`, because `INP2` is scalar, right?
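
To make the point concrete, here is a minimal sketch of the second rewrite with `VectorBroadcast INP2` on the right-hand side. It models vectors as plain `int[]` arrays rather than using the Vector API or C2 nodes; the class and the helpers `broadcast`, `add`, `original`, `reassociated` and `folded` are hypothetical names for illustration only and are not taken from the patch. Integer addition is used because it stays associative and commutative under overflow; other operations and types may need different treatment.

```java
import java.util.Arrays;

public class BroadcastReassociation {
    // Hypothetical stand-in for VectorBroadcast: every lane holds the same scalar.
    static int[] broadcast(int scalar, int lanes) {
        int[] v = new int[lanes];
        Arrays.fill(v, scalar);
        return v;
    }

    // Hypothetical stand-in for a lane-wise VectorOperation (integer add here).
    static int[] add(int[] a, int[] b) {
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = a[i] + b[i];
        }
        return r;
    }

    // VectorOperation (VectorBroadcast INP1) (VectorOperation (VectorBroadcast INP2) INP3)
    static int[] original(int inp1, int inp2, int[] inp3) {
        int n = inp3.length;
        return add(broadcast(inp1, n), add(broadcast(inp2, n), inp3));
    }

    // VectorOperation INP3 (VectorOperation (VectorBroadcast INP1) (VectorBroadcast INP2))
    static int[] reassociated(int inp1, int inp2, int[] inp3) {
        int n = inp3.length;
        return add(inp3, add(broadcast(inp1, n), broadcast(inp2, n)));
    }

    // After also applying the first rule: VectorOperation INP3 (VectorBroadcast (ScalarOperation INP1, INP2))
    static int[] folded(int inp1, int inp2, int[] inp3) {
        return add(inp3, broadcast(inp1 + inp2, inp3.length));
    }

    public static void main(String[] args) {
        int[] inp3 = {10, 20, 30, 40};
        System.out.println(Arrays.toString(original(3, 4, inp3)));     // [13+..] all three print the same lanes
        System.out.println(Arrays.toString(reassociated(3, 4, inp3)));
        System.out.println(Arrays.toString(folded(3, 4, inp3)));
    }
}
```

With both broadcasts grouped together, the first rule can fold them into a single scalar operation followed by one broadcast, which leaves only one vector operation in the expression.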
-------------
PR Comment: https://git.openjdk.org/jdk/pull/25617#issuecomment-3950091432