RFR: 8356760: VectorAPI: Optimize VectorMask.fromLong for all-true/all-false cases [v5]
erifan
duke at openjdk.org
Fri Jul 25 07:28:43 UTC 2025
On Fri, 25 Jul 2025 03:35:11 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> erifan has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision:
>>
>> - Add JMH benchmarks for cast chain transformation
>> - Merge branch 'master' into JDK-8356760
>> - Refactor the implementation
>>
>>     Do the conversion in C2's IGVN phase to cover more cases.
>> - Merge branch 'master' into JDK-8356760
>> - Simplify the test code
>> - Address some review comments
>>
>> Add support for the following patterns:
>>     toLong(maskAll(true))  => (-1ULL >> (64 - vlen))
>> toLong(maskAll(false)) => 0
>>
>> And add more test cases.
>> - Merge branch 'master' into JDK-8356760
>> - 8356760: VectorAPI: Optimize VectorMask.fromLong for all-true/all-false cases
>>
>> If the input long value `l` of `VectorMask.fromLong(SPECIES, l)` would
>> set or unset all lanes, `VectorMask.fromLong(SPECIES, l)` is equivalent
>> to `maskAll(true)` or `maskAll(false)`. But the cost of `maskAll` is
>>     relatively smaller than that of `fromLong`. This patch does the conversion
>> for these cases if `l` is a compile time constant.
>>
>> And this conversion also enables further optimizations that recognize
>> maskAll patterns, see [1].
>>
>> Some JTReg test cases are added to ensure the optimization is effective.
>>
>> I tried many different ways to write a JMH benchmark, but failed. Since
>> the input of `VectorMask.fromLong(SPECIES, l)` needs to be a specific
>> compile-time constant, the statement will be hoisted out of the loop.
>>     If we don't use a loop, the hot spot shifts to other instructions, and
>> no obvious performance change was observed. However, combined with the
>> optimization of [1], we can observe a performance improvement of about
>> 7% on both aarch64 and x64.
>>
>>     The patch was tested on both aarch64 and x64, and all tier1, tier2, and
>>     tier3 tests passed.
>>
>> [1] https://github.com/openjdk/jdk/pull/24674
>
> src/hotspot/share/opto/vectornode.cpp line 1986:
>
>> 1984: Node* VectorMaskToLongNode::Ideal_MaskAll(PhaseGVN* phase) {
>> 1985: Node* in1 = in(1);
>> 1986: // VectorMaskToLong follows a VectorStoreMask if predicate is not supported.
>
> It's always good to add an assertion check for coding assumptions.
Done, thanks!
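
For readers following the quoted description, the constants involved can be sketched in plain Java. This is only an illustration of the arithmetic, not the C2 code in the patch: the class and method names (`MaskConstants`, `allTrueBits`, `setsAllLanes`, `clearsAllLanes`) are hypothetical, and the sketch assumes that bits of `l` above the lane count are ignored, as the `fromLong` Javadoc describes. Whether the patch compares the masked or the raw constant is not stated in this message.

```java
// Hypothetical helper, for illustration only; not part of the JDK or the patch.
final class MaskConstants {

    // Java spelling of (-1ULL >> (64 - vlen)): the low vlen bits set.
    // vlen is the lane count of the species, assumed to be in [1, 64].
    static long allTrueBits(int vlen) {
        if (vlen < 1 || vlen > 64) {
            throw new IllegalArgumentException("vlen must be in [1, 64]: " + vlen);
        }
        return -1L >>> (64 - vlen);
    }

    // True if the low vlen bits of l are all set, i.e. every lane would be set.
    // Assumption: bits above vlen do not affect the resulting mask.
    static boolean setsAllLanes(long l, int vlen) {
        long m = allTrueBits(vlen);
        return (l & m) == m;
    }

    // True if the low vlen bits of l are all clear, i.e. every lane would be unset.
    static boolean clearsAllLanes(long l, int vlen) {
        return (l & allTrueBits(vlen)) == 0L;
    }

    public static void main(String[] args) {
        int vlen = 4; // e.g. a species with four lanes
        System.out.println(Long.toHexString(allTrueBits(vlen)));
        System.out.println(setsAllLanes(0xFL, vlen));
        System.out.println(clearsAllLanes(0xF0L, vlen));
    }
}
```

Under these assumptions, a compile-time constant `l` for which `setsAllLanes` or `clearsAllLanes` holds is a candidate for the `maskAll(true)` or `maskAll(false)` form, and `allTrueBits(vlen)` is the value the `toLong(maskAll(true))` pattern folds to.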
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/25793#discussion_r2230351518