RFR: 8289552: Make intrinsic conversions between bit representations of half precision values and floats [v11]

Jatin Bhateja jbhateja at openjdk.org
Tue Oct 4 06:52:56 UTC 2022


On Mon, 3 Oct 2022 17:47:00 GMT, Smita Kamath <svkamath at openjdk.org> wrote:

>> On second thought, you can get 0x01 by shifting k0 right by 15: `kshiftrw(ktmp, k0, 15)`
>
> @merykitty Thanks for the suggestion. I will update the instruct to use kmovwl. I will also experiment with kshiftrw and let you know.

> You can use `kmovwl` instead, which will relax the avx512bw constraint; however, you will need avx512vl for `evcvtps2ph`. Thanks.

Yes, in general all AVX512VL targets also support AVX512BW, but cloud instances are free to enable a custom set of features. Regarding K0: as per section 15.6.1.1 of the SDM, the expectation is that K0 can appear as a source and destination in a regular, non-predication context. Since k0 should always contain an all-true mask, it should stay unmodified for subsequent uses, i.e. it should not be the destination of a mask-manipulating instruction. Your suggestion uses it only as a source, but that may not work either. Changing the existing sequence to use kmovw and replacing AVX512BW with AVX512VL would again mean introducing an additional predication check for this pattern.
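
For reference, the arithmetic behind the quoted `kshiftrw(ktmp, k0, 15)` idea can be modelled in plain Java. This is only a sketch: `MaskShiftSketch` and `shiftRight16` are hypothetical names rather than JDK or HotSpot code, and it assumes the source mask reads as all ones, which is exactly the point in doubt when k0 is used as a source operand.

```java
public class MaskShiftSketch {
    // Hypothetical helper: models a logical right shift of a 16-bit mask,
    // as a word-sized mask shift would do. Not a JDK or HotSpot API.
    static int shiftRight16(int mask, int count) {
        int word = mask & 0xFFFF;                // keep only the low 16 mask bits
        return count > 15 ? 0 : word >>> count;  // counts above 15 clear the mask
    }

    public static void main(String[] args) {
        int allTrue = 0xFFFF; // assumption: the source mask holds all ones
        // Shifting an all-true 16-bit mask right by 15 leaves only bit 0 set.
        System.out.println(Integer.toHexString(shiftRight16(allTrue, 15)));
    }
}
```

If the source mask is not all ones, the result depends on its top bit, so the sequence yields 0x01 only under the all-true assumption.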

-------------

PR: https://git.openjdk.org/jdk/pull/9781


More information about the core-libs-dev mailing list