[15] RFR(S): 8239009: C2: Don't use PSHUF to load scalars from memory on x86
Vladimir Kozlov
vladimir.kozlov at oracle.com
Thu Mar 5 00:48:17 UTC 2020
On 2/14/20 7:28 AM, Vladimir Ivanov wrote:
>
>> How many bytes are we loading from memory after your change?
>
> 4 bytes for int & float and 8 bytes for double.
>
> Wide memory accesses have both performance and correctness implications:
> * the access is not properly aligned (2-/4-/8-byte vs 16-byte)
> * can trigger page faults
>
> I noticed that the change in ReplS_mem is not enough: movdl loads 4 bytes and there's no
> m16-to-xmmreg instruction available. So, I aligned ReplS_mem with ReplB_mem: use VPBROADCASTW when
> available (AVX2). Otherwise, it is matched as loadS + ReplS_reg.
Makes sense.
>
> Updated webrev:
> http://cr.openjdk.java.net/~vlivanov/8239009/webrev.01
Agree.
Thanks,
Vladimir K
>
> Best regards,
> Vladimir Ivanov
>
>> On 2/13/20 8:04 AM, Vladimir Ivanov wrote:
>>> http://cr.openjdk.java.net/~vlivanov/8239009/webrev.00
>>> https://bugs.openjdk.java.net/browse/JDK-8239009
>>>
>>> Mem-to-reg variants of PSHUF instructions load 16 bytes from memory.
>>>
>>> Switch to reg-to-reg variants and perform scalar loads of proper size instead.
>>>
>>> Testing: hs-precheckin-comp,hs-tier1,hs-tier2
>>>
>>> Thanks!
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>
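
For readers following the thread, here is a minimal Java sketch of why the operand width matters. It is only a model, not HotSpot code: the class name ScalarLoadSketch, the helpers fits and broadcastInt, and the 4096-byte "page" are all assumptions made for illustration. It treats a buffer as one mapped page, places an int in its last 4 bytes, and compares a 4-byte scalar read with the 16-byte read that a mem-to-reg PSHUF operand would perform.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class ScalarLoadSketch {
    // Width of an XMM register, i.e. the bytes read by a mem-to-reg PSHUF operand.
    static final int XMM_BYTES = 16;

    // True if reading `width` bytes at `offset` stays inside the buffer
    // (the buffer stands in for a single mapped page; an assumption of this model).
    static boolean fits(ByteBuffer buf, int offset, int width) {
        return offset >= 0 && offset + width <= buf.limit();
    }

    // Model of "scalar load + reg-to-reg broadcast":
    // read exactly 4 bytes, then replicate the value into four 32-bit lanes.
    static int[] broadcastInt(ByteBuffer buf, int offset) {
        int scalar = buf.getInt(offset);
        int[] lanes = new int[XMM_BYTES / Integer.BYTES];
        Arrays.fill(lanes, scalar);
        return lanes;
    }

    public static void main(String[] args) {
        ByteBuffer page = ByteBuffer.allocate(4096).order(ByteOrder.LITTLE_ENDIAN);
        int offset = page.limit() - Integer.BYTES; // last int in the "page"
        page.putInt(offset, 42);

        // The 4-byte scalar read stays inside the page.
        System.out.println("4-byte read fits:  " + fits(page, offset, Integer.BYTES));
        // A 16-byte read at the same address would run 12 bytes past the end.
        System.out.println("16-byte read fits: " + fits(page, offset, XMM_BYTES));
        System.out.println("lanes: " + Arrays.toString(broadcastInt(page, offset)));
    }
}
```

In the model, the 16-byte read runs past the end of the buffer while the 4-byte read does not, which corresponds to the page-fault concern raised above. The alignment concern is not modeled here.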