RFR: 8282711: Accelerate Math.signum function for AVX and AVX512 target. [v2]
Quan Anh Mai
duke at openjdk.java.net
Wed Mar 9 02:20:01 UTC 2022
On Tue, 8 Mar 2022 14:05:36 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:
>>> I believe you could achieve better register management using the following sequence.
>>>
>>> ```
>>> evfpclasspd(ktmp1, src, 0x7, vec_enc);
>>> evblendmpd(dst, ktmp1, dst, zero, true, vec_enc);
>>
>> 0x7 encodes QNaN and +/-0.0 values, so blending dst with zero will not work for NaN. I guess you meant src, as in the original sequence.
>>
>>> evfpclasspd(ktmp1, src, 0x40, vec_enc);
>>
>> 0x40 does not check for NEGATIVE_INFINITE, and Math.signum should return -1 for it. 0x40 should be 0x50.
>>
>>> evsubpd(dst, ktmp1, zero, one, true, vec_enc);
>>> ```
>>
>> But I agree we can do away with some temporaries.
>
> Ah my bad, the second instruction should be `evblendmpd(dst, ktmp1, one, src, true, vec_enc);`
> And the third one should be 0x50 as you mentioned.
I fixed the suggestion because I was mistaken. The first blend writes the results for all `NaN` and nonnegative inputs, and the masked `vsubpd` then writes the remaining negative ones, completing the operation. You also need to detect `SNaN`, because it can be produced using `Double.longBitsToDouble` and, in the future, using vector reinterpretation.
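
To make the lane-wise reasoning concrete, here is a rough scalar model of the corrected sequence in Java. It is only a sketch and not the code in the PR: `SignumLaneSketch`, `fpclass` and `signumLane` are hypothetical names, the category bits follow my reading of the VFPCLASSPD table in the Intel SDM (ignoring MXCSR.DAZ), and I assume the blend takes its last source operand in lanes where the mask is set. The input is passed as raw bits so that a signaling NaN reaches the classifier untouched.

```java
final class SignumLaneSketch {
    // VFPCLASSPD imm8 category bits, as listed in the Intel SDM.
    static final int QNAN = 0x01, POS_ZERO = 0x02, NEG_ZERO = 0x04, POS_INF = 0x08,
                     NEG_INF = 0x10, DENORMAL = 0x20, NEG_FINITE = 0x40, SNAN = 0x80;

    // Scalar model of VFPCLASSPD for one lane (DAZ is ignored).
    static boolean fpclass(long bits, int imm8) {
        boolean neg = bits < 0;
        long exp = (bits >>> 52) & 0x7FFL;
        long frac = bits & 0x000FFFFFFFFFFFFFL;
        boolean expOnes = exp == 0x7FFL;
        boolean expZero = exp == 0;
        boolean fracZero = frac == 0;
        boolean quietBit = (frac >>> 51) != 0;
        boolean zero = expZero && fracZero;
        int cls = 0;
        if (expOnes && !fracZero && quietBit) cls |= QNAN;
        if (!neg && zero) cls |= POS_ZERO;
        if (neg && zero) cls |= NEG_ZERO;
        if (!neg && expOnes && fracZero) cls |= POS_INF;
        if (neg && expOnes && fracZero) cls |= NEG_INF;
        if (expZero && !fracZero) cls |= DENORMAL;
        if (neg && !expOnes && !zero) cls |= NEG_FINITE;
        if (expOnes && !fracZero && !quietBit) cls |= SNAN;
        return (cls & imm8) != 0;
    }

    // One lane of the sequence; nanZeroImm8 is the first classification mask
    // (0x07 as originally written, 0x87 once SNaN is included).
    static double signumLane(long srcBits, int nanZeroImm8) {
        final double zero = 0.0, one = 1.0;
        double src = Double.longBitsToDouble(srcBits);
        // evfpclasspd(ktmp1, src, nanZeroImm8, vec_enc)
        boolean k = fpclass(srcBits, nanZeroImm8);
        // evblendmpd(dst, ktmp1, one, src, true, vec_enc): masked lanes keep src, others get one
        double dst = k ? src : one;
        // evfpclasspd(ktmp1, src, 0x50, vec_enc): negative finite or negative infinity
        k = fpclass(srcBits, NEG_FINITE | NEG_INF);
        // evsubpd(dst, ktmp1, zero, one, true, vec_enc): masked lanes become 0 - 1
        if (k) dst = zero - one;
        return dst;
    }

    public static void main(String[] args) {
        long snan = 0x7FF0000000000001L; // a signaling NaN bit pattern
        System.out.println(signumLane(snan, 0x07));
        System.out.println(signumLane(snan, 0x87));
        System.out.println(signumLane(Double.doubleToRawLongBits(-3.5), 0x87));
    }
}
```

In this model a signaling NaN is not caught by the 0x7 mask, so the lane falls through to `one`, while adding the SNaN bit (0x80) lets the blend pass the NaN through as `Math.signum` requires.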
Thanks.
-------------
PR: https://git.openjdk.java.net/jdk/pull/7717