RFR: 8294588: Auto vectorize half precision floating point conversion APIs [v9]
Sandhya Viswanathan
sviswanathan at openjdk.org
Thu Dec 8 03:00:56 UTC 2022
On Thu, 8 Dec 2022 00:37:48 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:
>> Smita Kamath has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Updated test case and updated code as per review comment
>
> I started new testing after verifying locally that the test passed with `-XX:UseAVX=1`.
> > @vnkozlov The test was failing earlier with -XX:UseAVX=1 because the right implemented() check was not happening, as Fei Gao explained. In vectornode.cpp, VectorCastNode::implemented() was not getting the right vopc after the call to VectorCastNode::opcode() (it got VectorCastF2X and VectorCastS2X instead of VectorCastF2HF and VectorCastHF2F), so Matcher::match_rule_supported_superword() was called with the wrong vopc. This is now fixed: Smita has corrected both VectorCastNode::opcode() and VectorCastNode::implemented().
>
> Also, the IR test was only enabled for avx512f earlier, which somehow overshadowed the problem. Since VM features are queried using CPUID, the matcher will give up if neither F16C nor AVX512F is present. Hi @smita-kamath, we should not explicitly disable F16C in vm_version.
@jatin-bhateja When the user sets -XX:UseAVX=0 on the command line, F16C needs to be disabled explicitly (in vm_version), because it requires AVX support.
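
To make the dependency concrete, here is a minimal sketch in Java. It is only a model of the reasoning above, not HotSpot code: the class and method names are hypothetical, and the mapping of AVX512F to UseAVX=3 is an assumption made for the illustration.

```java
public class F16cFeatureSketch {
    // Hypothetical model, not HotSpot source. The F16C conversion
    // instructions are VEX-encoded, so even when CPUID reports F16C the
    // feature is only usable while AVX is enabled (UseAVX >= 1).
    public static boolean f16cUsable(boolean cpuidReportsF16c, int useAvx) {
        return cpuidReportsF16c && useAvx >= 1;
    }

    // Hypothetical model of the matcher-side check discussed in this thread:
    // the half-float vector casts are rejected when neither F16C nor AVX512F
    // is available.
    public static boolean halfFloatCastSupported(boolean f16cUsable, boolean avx512fUsable) {
        return f16cUsable || avx512fUsable;
    }

    public static void main(String[] args) {
        boolean cpuidReportsF16c = true; // pretend the CPU reports F16C
        for (int useAvx = 0; useAvx <= 3; useAvx++) {
            boolean f16c = f16cUsable(cpuidReportsF16c, useAvx);
            boolean avx512f = useAvx >= 3; // assumption: AVX512F needs UseAVX=3
            System.out.println("UseAVX=" + useAvx
                    + " f16c=" + f16c
                    + " castSupported=" + halfFloatCastSupported(f16c, avx512f));
        }
    }
}
```

In this model, if F16C were left enabled at UseAVX=0, the cast would still be reported as supported even though AVX is off, which is why the feature has to be cleared explicitly.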
-------------
PR: https://git.openjdk.org/jdk/pull/11471
More information about the hotspot-compiler-dev mailing list