RFR: 8282711: Accelerate Math.signum function for AVX and AVX512 target. [v2]

Quan Anh Mai duke at openjdk.java.net
Wed Mar 9 02:20:01 UTC 2022


On Tue, 8 Mar 2022 14:05:36 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:

>>> I believe you could achieve better register management using the following sequence.
>>> 
>>> ```
>>> evfpclasspd(ktmp1, src, 0x7, vec_enc);
>>> evblendmpd(dst, ktmp1, dst, zero, true, vec_enc);
>> 
>> 0x7 encodes QNaN and +/-0.0 values, so blending dst with zero will not work for NaN. I guess you meant src, as in the original sequence.
>> 
>>> evfpclasspd(ktmp1, src, 0x40, vec_enc);
>> 
>> 0x40 does not check for NEGATIVE_INFINITE, for which Math.signum should return -1. 0x40 should be 0x50.
>> 
>>> evsubpd(dst, ktmp1, zero, one, true, vec_enc);
>>> ```
>> 
>> But I agree we can do away with some temporaries.
>
> Ah my bad, the second instruction should be `evblendmpd(dst, ktmp1, one, src, true, vec_enc);`
> And the third one should be 0x50 as you mentioned.

I have fixed the suggestion because I was mistaken: the first blend writes the results for all `NaN` and nonnegative inputs, and the masked `vsubpd` writes the results for the remaining negative inputs, completing the operation. You also need to detect `SNaN`, because it can be produced using `Double.longBitsToDouble` and, in the future, using vector reinterpretation.
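
To illustrate the reasoning, here is a minimal scalar sketch in Java of what one lane of the corrected sequence would compute. It is not the HotSpot code: `SignumMaskSketch`, `fpclass` and `signumLane` are hypothetical names, the lane is modelled as raw 64-bit patterns, and the class test is a simplified model of the `vfpclasspd` immediate bits that ignores the DAZ setting. The immediate `0x87` (0x7 plus the SNaN bit 0x80) is an assumption derived from that bit layout, not a value taken from the patch.

```java
final class SignumMaskSketch {
    // vfpclasspd imm8 category bits (simplified model).
    static final int QNAN = 0x01, POS_ZERO = 0x02, NEG_ZERO = 0x04,
                     POS_INF = 0x08, NEG_INF = 0x10, DENORMAL = 0x20,
                     NEG_FINITE = 0x40, SNAN = 0x80;

    static final long SIGN  = 0x8000000000000000L;
    static final long EXP   = 0x7FF0000000000000L;
    static final long MANT  = 0x000FFFFFFFFFFFFFL;
    static final long QUIET = 0x0008000000000000L;

    // Returns true if the value with these raw bits falls in any category selected by imm8.
    static boolean fpclass(long bits, int imm8) {
        boolean neg = (bits & SIGN) != 0;
        boolean expOnes = (bits & EXP) == EXP;
        boolean expZero = (bits & EXP) == 0;
        boolean mantZero = (bits & MANT) == 0;
        int classes = 0;
        if (expOnes && !mantZero) {
            classes |= (bits & QUIET) != 0 ? QNAN : SNAN;
        } else if (expOnes) {
            classes |= neg ? NEG_INF : POS_INF;
        } else if (expZero && mantZero) {
            classes |= neg ? NEG_ZERO : POS_ZERO;
        } else {
            if (expZero) classes |= DENORMAL;
            if (neg) classes |= NEG_FINITE;
        }
        return (classes & imm8) != 0;
    }

    // One lane of the sequence; nanZeroImm is the immediate of the first class test.
    static double signumLane(long srcBits, int nanZeroImm) {
        long one = Double.doubleToRawLongBits(1.0);
        // evfpclasspd(ktmp1, src, nanZeroImm); evblendmpd(dst, ktmp1, one, src, ...):
        // selected lanes keep src (NaN or signed zero), all other lanes get 1.0.
        long dst = fpclass(srcBits, nanZeroImm) ? srcBits : one;
        // evfpclasspd(ktmp1, src, 0x50); evsubpd(dst, ktmp1, zero, one, ...):
        // negative finite and negative infinite lanes are overwritten with 0.0 - 1.0.
        if (fpclass(srcBits, NEG_FINITE | NEG_INF)) {
            dst = Double.doubleToRawLongBits(0.0 - 1.0);
        }
        return Double.longBitsToDouble(dst);
    }

    public static void main(String[] args) {
        double[] inputs = { 2.0, -3.5, 0.0, -0.0, Double.NaN, -Double.MIN_VALUE,
                            Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY };
        for (double x : inputs) {
            double got = signumLane(Double.doubleToRawLongBits(x), 0x87);
            System.out.println(x + " -> " + got + " (Math.signum: " + Math.signum(x) + ")");
        }
        long snan = 0x7FF0000000000001L; // a signalling NaN bit pattern
        System.out.println("SNaN with 0x07 -> " + signumLane(snan, 0x07));
        System.out.println("SNaN with 0x87 -> " + signumLane(snan, 0x87));
    }
}
```

In this model, a first class test of `0x07` lets a signalling NaN fall through to the `one` branch, whereas including the SNaN bit passes it through unchanged like any other `NaN`.
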
Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7717


More information about the hotspot-compiler-dev mailing list