RFR: 8341137: Optimize long vector multiplication using x86 VPMULUDQ instruction [v2]
Jatin Bhateja
jbhateja at openjdk.org
Thu Oct 17 19:43:24 UTC 2024
On Mon, 14 Oct 2024 21:26:41 GMT, Jasmine Karthikeyan <jkarthikeyan at openjdk.org> wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits:
>>
>> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8341137
>> - 8341137: Optimize long vector multiplication using x86 VPMULUDQ instruction
>
> src/hotspot/share/opto/vectornode.cpp line 2124:
>
>> 2122: // MulL ( And SRC1, 0xFFFFFFFF) (URShift SRC2 , 32)
>> 2123: if ((is_lower_double_word_and_mask_op(in(1)) ||
>> 2124: is_lower_double_word_and_mask_op(in(1)) ||
>
> `is_lower_double_word_and_mask_op(in(1)) || is_lower_double_word_and_mask_op(in(1))` is redundant, right? Shouldn't you only need it once? Same for the other 3 calls, which are similarly repeated.
Ah, these harmless cunning typos :-), but we should not rely on c-compiler's short circuiting.
> test/hotspot/jtreg/compiler/vectorapi/VectorMultiplyOpt.java line 41:
>
>> 39: */
>> 40:
>> 41: public class VectorMultiplyOpt {
>
> Could it be possible to also do IR verification in this test? It would be good to check that we don't generate `AndVL` or `URShiftVL` with this transform.
We do need those nodes to chop off the upper double words of quadword lanes.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/21244#discussion_r1805324915
PR Review Comment: https://git.openjdk.org/jdk/pull/21244#discussion_r1805324796
More information about the core-libs-dev
mailing list