[15] RFR(S): 8239009: C2: Don't use PSHUF to load scalars from memory on x86

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Fri Feb 14 15:28:36 UTC 2020


> How many bytes are we loading from memory after your change?

4 bytes for int & float and 8 bytes for double.

Wide memory accesses have both performance and correctness implications:
   * the access is not properly aligned (the scalar is only 2-/4-/8-byte
     aligned, while the load is 16 bytes wide)
   * the access can touch bytes past the scalar and trigger page faults
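
To make the two points above concrete, here is a minimal address-arithmetic sketch. It is not HotSpot code: the 4 KiB page size and the helper names (ASSUMED_PAGE_SIZE, is_aligned, crosses_page) are assumptions made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: 4 KiB pages. */
#define ASSUMED_PAGE_SIZE ((uintptr_t)4096)

/* True if addr is a multiple of align (align must be a power of two). */
static bool is_aligned(uintptr_t addr, uintptr_t align) {
    return (addr & (align - 1)) == 0;
}

/* True if the byte range [addr, addr + width) spans more than one page. */
static bool crosses_page(uintptr_t addr, uintptr_t width) {
    uintptr_t first_page = addr / ASSUMED_PAGE_SIZE;
    uintptr_t last_page  = (addr + width - 1) / ASSUMED_PAGE_SIZE;
    return first_page != last_page;
}
```

For an int stored in the last 4 bytes of a page (say, address 0x1ffc), a 4-byte load is aligned and stays within the page, while a 16-byte load from the same address is misaligned and reaches into the next page, which may not be mapped.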

I noticed that the change in ReplS_mem is not enough: movdl loads 4
bytes, and there is no m16-to-xmmreg instruction available. So I aligned
ReplS_mem with ReplB_mem: it uses VPBROADCASTW when available (AVX2);
otherwise, the pattern is matched as loadS + ReplS_reg.
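
For readers less familiar with the pattern, the "scalar load, then reg-to-reg shuffle" shape for the int case can be sketched with SSE2 intrinsics. This is only an illustration under assumptions, not the C2 implementation: replicate_int is a hypothetical helper, the instructions named in the comments are what compilers typically emit for these intrinsics, and a plain loop stands in on targets without SSE2.

```c
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: replicate the 32-bit value at *p into out[0..3].
 * Only 4 bytes are read from p, never a full 16-byte vector. */
void replicate_int(const int32_t *p, int32_t out[4]) {
    int32_t v;
    memcpy(&v, p, sizeof v);              /* 4-byte scalar load */
#if defined(__SSE2__)
    __m128i x = _mm_cvtsi32_si128(v);     /* typically MOVD: scalar into lane 0 */
    x = _mm_shuffle_epi32(x, 0x00);       /* typically PSHUFD reg-to-reg: copy lane 0 to all lanes */
    _mm_storeu_si128((__m128i *)out, x);  /* unaligned 16-byte store of the result */
#else
    for (int i = 0; i < 4; i++) {         /* fallback for targets without SSE2 */
        out[i] = v;
    }
#endif
}
```

The memory operand here is always scalar-sized; the shuffle only sees a register, which is the property the change is after.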

Updated webrev:
   http://cr.openjdk.java.net/~vlivanov/8239009/webrev.01

Best regards,
Vladimir Ivanov

> On 2/13/20 8:04 AM, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~vlivanov/8239009/webrev.00
>> https://bugs.openjdk.java.net/browse/JDK-8239009
>>
>> Mem-to-reg variants of PSHUF instructions load 16 bytes from memory.
>>
>> Switch to reg-to-reg variants and perform scalar loads of proper size 
>> instead.
>>
>> Testing: hs-precheckin-comp,hs-tier1,hs-tier2
>>
>> Thanks!
>>
>> Best regards,
>> Vladimir Ivanov
> 