[15] RFR(S): 8239009: C2: Don't use PSHUF to load scalars from memory on x86

Vladimir Kozlov vladimir.kozlov at oracle.com
Thu Mar 5 00:48:17 UTC 2020


On 2/14/20 7:28 AM, Vladimir Ivanov wrote:
> 
>> How many bytes are we loading from memory after your change?
> 
> 4 bytes for int & float and 8 bytes for double.
> 
> Wide memory accesses have both performance and correctness implications:
>    * the access is not properly aligned (2-/4-/8-byte vs 16-byte)
>    * the wider access can touch an adjacent page and trigger page faults
> 
> I noticed that the change in ReplS_mem is not enough: movdl loads 4 bytes, and there is no
> m16-to-xmmreg instruction available. So I aligned ReplS_mem with ReplB_mem: use VPBROADCASTW when
> available (AVX2); otherwise, it is matched as loadS + ReplS_reg.

Makes sense.

> 
> Updated webrev:
>    http://cr.openjdk.java.net/~vlivanov/8239009/webrev.01

Agree.

Thanks,
Vladimir K

> 
> Best regards,
> Vladimir Ivanov
> 
>> On 2/13/20 8:04 AM, Vladimir Ivanov wrote:
>>> http://cr.openjdk.java.net/~vlivanov/8239009/webrev.00
>>> https://bugs.openjdk.java.net/browse/JDK-8239009
>>>
>>> Mem-to-reg variants of PSHUF instructions load 16 bytes from memory.
>>>
>>> Switch to reg-to-reg variants and perform scalar loads of proper size instead.
>>>
>>> Testing: hs-precheckin-comp,hs-tier1,hs-tier2
>>>
>>> Thanks!
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>
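
For readers following the thread, the idea of "scalar load of proper size, then
reg-to-reg shuffle" can be sketched in C with SSE2 intrinsics. This is only an
illustration under stated assumptions, not the HotSpot .ad code from the webrev:
the function names (broadcast_i16, broadcast_i32, broadcast_f64, demo) are
hypothetical, the instruction choices may differ from what C2 emits, and the
instructions named in the comments are what compilers typically generate rather
than a guarantee. It builds on x86 only.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics; x86 only */
#include <stdint.h>
#include <string.h>

/* 16-bit case: there is no 2-byte memory-to-XMM load, so read the short
   as a scalar first, move it into lane 0, then shuffle register-to-register. */
static __m128i broadcast_i16(const int16_t *p) {
    int16_t scalar;
    memcpy(&scalar, p, sizeof scalar);                    /* reads 2 bytes */
    __m128i lane0 = _mm_cvtsi32_si128((uint16_t)scalar);  /* MOVD from a GPR */
    __m128i low4 = _mm_shufflelo_epi16(lane0, 0x00);      /* PSHUFLW: word 0 -> words 0..3 */
    return _mm_shuffle_epi32(low4, 0x00);                 /* PSHUFD: dword 0 -> all dwords */
}

/* 32-bit case: read exactly 4 bytes, then PSHUFD on a register operand. */
static __m128i broadcast_i32(const int32_t *p) {
    int32_t scalar;
    memcpy(&scalar, p, sizeof scalar);                    /* reads 4 bytes */
    __m128i lane0 = _mm_cvtsi32_si128(scalar);            /* value in lane 0, rest zero */
    return _mm_shuffle_epi32(lane0, 0x00);                /* dword 0 -> all dwords */
}

/* 64-bit case: read exactly 8 bytes, then duplicate the low lane. */
static __m128d broadcast_f64(const double *p) {
    __m128d lane0 = _mm_load_sd(p);                       /* MOVSD: reads 8 bytes */
    return _mm_unpacklo_pd(lane0, lane0);                 /* both lanes = lane 0 */
}

/* Small self-check: every lane of the result equals the scalar in memory. */
static void demo(void) {
    int32_t value = 42;
    int32_t lanes[4];
    _mm_storeu_si128((__m128i *)lanes, broadcast_i32(&value));
    for (int k = 0; k < 4; k++) {
        assert(lanes[k] == 42);
    }
}
```

In each helper the only memory read is as wide as the scalar itself, so a value
sitting in the last bytes of a mapped page is never accompanied by a 16-byte
read past it; the shuffle then works purely on register contents.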

