RFR(S) 8192846: Extend cmov vectorization to work for float

Mon Dec 4 22:26:37 UTC 2017

On 12/4/17 11:19 AM, Lupusoru, Razvan A wrote:
> Hi Vladimir,
> 
> I removed the "do_vector_loop" check in order to enable the optimization by default - but it seems that was undesirable from your point of view.

It is not desirable for JDK 10 since it is too late.

> 
> Thus, we have added a new flag, "UseVectorCmov", which is disabled by default; anyone interested in the feature can enable it manually. Ideally, we can get it into the upcoming release in this form and remove the guard in the following release, once we have checked its stability further.

Sounds good. Agree.

> 
> The path for the updated webrev:
> http://cr.openjdk.java.net/~vdeshpande/8192846/webrev.01/
> Code contributed by Razvan Lupusoru (rlupusoru)

I will test it.

Thanks,
Vladimir

> 
> Thanks,
> Razvan
> 
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Thursday, November 30, 2017 3:46 PM
> To: Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(S) 8192846: Extend cmov vectorization to work for float
> 
> Thank you, Razvan
> 
> I have one question: why did you remove the _do_vector_loop checks in superword.cpp?
> 
> Also, I think it is too late for JDK 10. I will test and push when JDK 10 is forked.
> 
> Thanks,
> Vladimir
> 
> On 11/30/17 1:48 PM, Lupusoru, Razvan A wrote:
>> Hi all,
>>
>> This submission addresses the issue noted in JDK-8192846, namely that
>> vectorized cmov for floats is not supported.
>>
>> This patch rectifies the situation so that when a scalar Cmov for float
>> and double is generated (or its generation is forced by passing
>> -XX:+UseCMoveUnconditionally), it stands a chance of being vectorized
>> using the existing functionality previously enabled for doubles. The
>> following code pattern will get vectorized with this patch:
>>
>> private void cmove_kernel_float(float[] in1, float[] in2, int length,
>>         float[] out) {
>>     for (int i = 0; i < length; i++) {
>>         out[i] = (in1[i] > in2[i]) ? in1[i] : in2[i];
>>     }
>> }
>>
>> Instructions generated for the loop are:
>>
>>     vmovdqu 0x10(%rcx,%r10,4),%ymm0
>>     vmovdqu 0x10(%rdx,%r10,4),%ymm1
>>     vcmpleps %ymm0,%ymm1,%ymm2
>>     vblendvps %ymm2,%ymm0,%ymm1,%ymm2
>>     vmovdqu %ymm2,0x10(%r9,%r10,4)
>>
>> The patch is available for review here:
>>
>> http://cr.openjdk.java.net/~rlupusoru/jdk_hs/webrev_cmovevf_00/
>>
>> Jtreg testing with the compiler tests passed successfully. Thanks for the review!
>>
>> --Razvan
>>
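
For readers following the thread, the compare-and-blend idea behind the quoted loop can be sketched in plain Java. This is only a conceptual model, not code from the patch: the class name CmoveFloatSketch and the method maskBlendModel are hypothetical, and the model does not claim to reproduce the exact operand order of the generated instructions. It shows the kernel from the original message next to a two-step version that first computes a per-element mask (the role of the vector compare) and then selects per element (the role of the vector blend), and checks that both agree.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the webrev discussed in this thread.
class CmoveFloatSketch {

    // The kernel quoted in the original message, made static for easy calling.
    static void cmoveKernelFloat(float[] in1, float[] in2, int length, float[] out) {
        for (int i = 0; i < length; i++) {
            out[i] = (in1[i] > in2[i]) ? in1[i] : in2[i];
        }
    }

    // Conceptual model: build a mask per element, then blend using the mask.
    // A vectorized loop does these two steps on several elements at once.
    static void maskBlendModel(float[] in1, float[] in2, int length, float[] out) {
        boolean[] mask = new boolean[length];
        for (int i = 0; i < length; i++) {
            mask[i] = in1[i] > in2[i];            // compare step
        }
        for (int i = 0; i < length; i++) {
            out[i] = mask[i] ? in1[i] : in2[i];   // blend (select) step
        }
    }

    public static void main(String[] args) {
        float[] a = {1f, 5f, -2f, 3f};
        float[] b = {2f, 4f, -2f, Float.NaN};
        float[] viaKernel = new float[a.length];
        float[] viaModel = new float[a.length];
        cmoveKernelFloat(a, b, a.length, viaKernel);
        maskBlendModel(a, b, a.length, viaModel);
        System.out.println(Arrays.toString(viaKernel));
        System.out.println(Arrays.equals(viaKernel, viaModel));
    }
}
```

To experiment with the actual optimization, one would presumably run a loop like this on a build containing the patch with the flags named in this thread, -XX:+UseVectorCmov and, if needed to force scalar cmov generation, -XX:+UseCMoveUnconditionally. Whether the loop is in fact vectorized depends on the build and the hardware.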