RFR(S) 8192846: Extend cmov vectorization to work for float
Vladimir Kozlov
vladimir.kozlov at oracle.com
Tue Dec 5 18:02:53 UTC 2017
Pushed into jdk/hs with approval of our gatekeeper.
Regards,
Vladimir
On 12/4/17 2:26 PM, Vladimir Kozlov wrote:
> On 12/4/17 11:19 AM, Lupusoru, Razvan A wrote:
>> Hi Vladimir,
>>
>> I removed the "do_vector_loop" check in order to enable the
>> optimization by default - but it seems that was undesirable from your
>> point of view.
>
> It is not desirable for JDK 10 since it is too late.
>
>>
>> Thus, we have added a new flag "UseVectorCmov" which is disabled by
>> default, but anyone interested in the feature can enable it manually.
>> Ideally we can get it in this form into the upcoming release and
>> remove the guard in the following release once we have checked its
>> stability further.
>
> Sounds good. Agree.
>
>>
>> The path for the updated webrev:
>> http://cr.openjdk.java.net/~vdeshpande/8192846/webrev.01/
>> Code contributed by Razvan Lupusoru (rlupusoru)
>
> I will test it.
>
> Thanks,
> Vladimir
>
>>
>> Thanks,
>> Razvan
>>
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Thursday, November 30, 2017 3:46 PM
>> To: Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>;
>> hotspot-compiler-dev at openjdk.java.net
>> Subject: Re: RFR(S) 8192846: Extend cmov vectorization to work for float
>>
>> Thank you, Razvan
>>
>> I have one question: why did you remove the _do_vector_loop checks in
>> superword.cpp?
>>
>> Also, I think it is too late for JDK 10. I will test and push once
>> JDK 10 is forked.
>>
>> Thanks,
>> Vladimir
>>
>> On 11/30/17 1:48 PM, Lupusoru, Razvan A wrote:
>>> Hi all,
>>>
>>> This submission addresses the issue noted in JDK-8192846, namely that
>>> vectorized cmov for floats is not supported.
>>>
>>> This patch rectifies the situation so that when a scalar cmov for float
>>> or double is generated (or forced by passing
>>> -XX:+UseCMoveUnconditionally), it stands a chance of being vectorized
>>> using the existing functionality previously enabled for doubles. The
>>> following code pattern gets vectorized with this patch:
>>>
>>> private void cmove_kernel_float(float[] in1, float[] in2, int length,
>>>                                 float[] out) {
>>>     for (int i = 0; i < length; i++) {
>>>         out[i] = (in1[i] > in2[i]) ? in1[i] : in2[i];
>>>     }
>>> }
>>>
>>> Instructions generated for the loop are:
>>>
>>> vmovdqu 0x10(%rcx,%r10,4),%ymm0
>>> vmovdqu 0x10(%rdx,%r10,4),%ymm1
>>> vcmpleps %ymm0,%ymm1,%ymm2
>>> vblendvps %ymm2,%ymm0,%ymm1,%ymm2
>>> vmovdqu %ymm2,0x10(%r9,%r10,4)
>>>
>>> The patch is available for review here:
>>>
>>> http://cr.openjdk.java.net/~rlupusoru/jdk_hs/webrev_cmovevf_00/
>>>
>>> Jtreg testing with the compiler tests passed successfully. Thanks for
>>> the review!
>>>
>>> --Razvan
>>>
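The kernel quoted in the original submission can be exercised with a small standalone harness. The following is only a sketch, not part of the patch or its tests: the class name CmoveFloatDemo, the reference method, the array size, and the iteration count are all assumptions chosen for illustration. It runs the loop often enough that the JIT is likely to compile it, then checks the result against a branch-based reference. Whether the loop is actually vectorized depends on the JVM build and the CPU; on a build that includes this patch, the flags named in the thread (-XX:+UseCMoveUnconditionally and -XX:+UseVectorCmov) would presumably need to be passed.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical harness; names and sizes are illustrative assumptions.
public class CmoveFloatDemo {

    // Same loop shape as the kernel quoted in the thread.
    static void cmoveKernelFloat(float[] in1, float[] in2, int length, float[] out) {
        for (int i = 0; i < length; i++) {
            out[i] = (in1[i] > in2[i]) ? in1[i] : in2[i];
        }
    }

    // Branch-based reference used only to check the kernel's output.
    static float[] reference(float[] in1, float[] in2) {
        float[] out = new float[in1.length];
        for (int i = 0; i < in1.length; i++) {
            if (in1[i] > in2[i]) {
                out[i] = in1[i];
            } else {
                out[i] = in2[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int n = 1024;                       // assumed array length
        float[] a = new float[n];
        float[] b = new float[n];
        float[] out = new float[n];
        Random random = new Random(42);     // fixed seed for repeatable input
        for (int i = 0; i < n; i++) {
            a[i] = random.nextFloat();
            b[i] = random.nextFloat();
        }

        // Repeat the call so the JIT has a chance to compile the loop.
        for (int iter = 0; iter < 20_000; iter++) {
            cmoveKernelFloat(a, b, n, out);
        }

        if (!Arrays.equals(out, reference(a, b))) {
            throw new AssertionError("kernel output differs from reference");
        }
        System.out.println("results match");
    }
}
```

The harness checks only correctness. To see whether the compare-and-blend sequence is emitted, the generated code would have to be inspected separately.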