[15] RFR(S): 8239009: C2: Don't use PSHUF to load scalars from memory on x86
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Tue Mar 10 17:38:50 UTC 2020
Thanks for the reviews, Vladimir & Dean.
Best regards,
Vladimir Ivanov
On 05.03.2020 03:48, Vladimir Kozlov wrote:
> On 2/14/20 7:28 AM, Vladimir Ivanov wrote:
>>
>>> How many bytes are we loading from memory after your change?
>>
>> 4 bytes for int & float and 8 bytes for double.
>>
>> Wide memory accesses have both performance and correctness implications:
>> * the access is not properly aligned (2-/4-/8-byte vs 16-byte)
>> * the access can trigger page faults
>>
>> I noticed that the change in ReplS_mem is not enough: movdl loads 4
>> bytes and there's no m16-to-xmmreg instruction available. So, I
>> aligned ReplS_mem with ReplB_mem: use VPBROADCASTW when available
>> (AVX2). Otherwise, it is matched as loadS + ReplS_reg.
>
> Makes sense.
>
>>
>> Updated webrev:
>> http://cr.openjdk.java.net/~vlivanov/8239009/webrev.01
>
> Agree.
>
> Thanks,
> Vladimir K
>
>>
>> Best regards,
>> Vladimir Ivanov
>>
>>> On 2/13/20 8:04 AM, Vladimir Ivanov wrote:
>>>> http://cr.openjdk.java.net/~vlivanov/8239009/webrev.00
>>>> https://bugs.openjdk.java.net/browse/JDK-8239009
>>>>
>>>> Mem-to-reg variants of PSHUF instructions load 16 bytes from memory.
>>>>
>>>> Switch to reg-to-reg variants and perform scalar loads of proper
>>>> size instead.
>>>>
>>>> Testing: hs-precheckin-comp,hs-tier1,hs-tier2
>>>>
>>>> Thanks!
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>
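
For readers following the thread, here is a minimal C sketch of the access
pattern discussed above: load a scalar of the proper size first, then
replicate it register-to-register. It is an illustration, not the code C2
emits. The function names replicate_int and replicate_short are
hypothetical. The SSE2 intrinsics come from <emmintrin.h>, and a plain loop
is used where SSE2 is unavailable. For the 16-bit case the compiler chooses
the actual instruction sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Replicate the 32-bit value at *src into four lanes of dst.
 * Only 4 bytes are read from src, so a field that is merely 4-byte
 * aligned, or that sits at the very end of a mapped page, is safe. */
static void replicate_int(const void *src, int32_t dst[4]) {
    assert(src != NULL && dst != NULL);
    int32_t scalar;
    memcpy(&scalar, src, sizeof scalar);      /* 4-byte scalar load */
#if defined(__SSE2__)
    __m128i v = _mm_cvtsi32_si128(scalar);    /* scalar into lane 0 */
    v = _mm_shuffle_epi32(v, 0x00);           /* reg-to-reg shuffle: lane 0 to all lanes */
    _mm_storeu_si128((__m128i *)dst, v);
#else
    for (int i = 0; i < 4; i++) {
        dst[i] = scalar;                      /* portable fallback */
    }
#endif
}

/* Same idea for a 16-bit value: read exactly 2 bytes, then replicate
 * the register value into eight lanes. */
static void replicate_short(const void *src, int16_t dst[8]) {
    assert(src != NULL && dst != NULL);
    int16_t scalar;
    memcpy(&scalar, src, sizeof scalar);      /* 2-byte scalar load */
#if defined(__SSE2__)
    __m128i v = _mm_set1_epi16(scalar);       /* compiler picks the broadcast sequence */
    _mm_storeu_si128((__m128i *)dst, v);
#else
    for (int i = 0; i < 8; i++) {
        dst[i] = scalar;                      /* portable fallback */
    }
#endif
}
```

By contrast, a shuffle with a memory operand reads a full 16 bytes starting
at the scalar's address, which is the behavior the patch removes.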