RFR: 8227505: SuperWordLoopUnrollAnalysis may lead to over loop unrolling

Deshpande, Vivek R vivek.r.deshpande at intel.com
Sun Sep 1 01:46:04 UTC 2019


Hi Jie

I will try with NUM = 2048 and let you know.

Regards,
Vivek

-----Original Message-----
From: Jie Fu [mailto:fujie at loongson.cn] 
Sent: Saturday, August 31, 2019 8:04 AM
To: Deshpande, Vivek R <vivek.r.deshpande at intel.com>; Vladimir Kozlov <vladimir.kozlov at oracle.com>; hotspot-compiler-dev at openjdk.java.net; Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
Subject: Re: RFR: 8227505: SuperWordLoopUnrollAnalysis may lead to over loop unrolling

Hi Vivek,

Would you mind if I assign this issue[1] to you?

I can't find an AVX-512 machine in our company to do more investigation.
I'm sorry for that.

Thanks a lot.
Best regards,
Jie

[1] https://bugs.openjdk.java.net/browse/JDK-8227505

On 2019/8/23 8:53 AM, Jie Fu wrote:
> Hi Vivek,
>
> Thanks for your clarification.
> Please see comments inline.
>
> On 2019/8/23 3:26 AM, Deshpande, Vivek R wrote:
>> Hi Jie
>>
>> On an AVX2 (256-bit vector) machine I did not observe any difference in
>> the generated code, same as your observation.
>>
>> But on an AVX3 (512-bit / 64-byte vector) machine, the code generated with
>> the patch used AVX2 (256-bit) instructions instead of AVX3 (512-bit)
>> instructions.
>> So with the patch it is not able to use the complete vector width.
>> As far as performance is concerned, with this particular benchmark that I
>> have shared and the given number of iterations in the benchmark, I did
>> not observe any difference between the patch and the original.
> As for your particular case, I don't think it's a problem to compile 
> with vector-256 since there is no performance drop compared with 
> vector-512.
> Instead, I'd prefer using vector-256 to lower the risk of over loop 
> unrolling.
>
> Also I'm not sure whether the power consumption will increase if
> vector-512 is used on your machine.
>
>
>> So the difference is in the generated code, which is not using the full
>> vector width.
> According to your performance analysis, vector-256 is good enough for 
> your test case.
> What's the benefit of generating vector-512 for your case?
>
> Well, the patch doesn't disable the generation of vector-512 at all.
> You can increase the NUM in your program from 1024 to 2048 or more and 
> try again.
> Thanks.
>
> What do you think?
> Any comments?
>
> Thanks a lot.
> Best regards,
> Jie
>
>>
>> Regards,
>> Vivek
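
For readers without the original benchmark (it is not included in this thread), the experiment discussed above, rerunning the same loop with NUM raised from 1024 to 2048, might look roughly like the sketch below. The class name, the element-wise add kernel, and the repetition count are all assumptions made for illustration; only the idea of varying NUM comes from the thread.

```java
public class VectorWidthSketch {
    // Hypothetical kernel: a simple element-wise add, the kind of loop
    // that C2's SuperWord pass can typically auto-vectorize.
    static void add(int[] a, int[] b, int[] c) {
        for (int i = 0; i < c.length; i++) {
            c[i] = a[i] + b[i];
        }
    }

    // Runs the kernel `reps` times over arrays of length `num` and returns
    // a checksum so the work cannot be optimized away entirely.
    static long run(int num, int reps) {
        int[] a = new int[num];
        int[] b = new int[num];
        int[] c = new int[num];
        for (int i = 0; i < num; i++) {
            a[i] = i;
            b[i] = 2 * i;
        }
        for (int r = 0; r < reps; r++) {
            add(a, b, c);
        }
        long sum = 0;
        for (int v : c) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        // NUM defaults to 1024; pass 2048 (or more) as the first argument.
        int num = args.length > 0 ? Integer.parseInt(args[0]) : 1024;
        // The repetition count is an arbitrary choice meant to trigger JIT compilation.
        System.out.println(run(num, 20_000));
    }
}
```

One way to compare the generated code would be to run it once with 1024 and once with 2048 under -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly (which needs the hsdis disassembler library) and check which vector registers the loop body uses. Whether this particular kernel reproduces the behavior reported above is not something the thread confirms.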


