RFR(S):8216050:X86: Fix for Superword optimization fails with assert(0 <= i && i < _len) failed: illegal index

Deshpande, Vivek R vivek.r.deshpande at intel.com
Tue Jan 15 19:27:48 UTC 2019


Thanks Vladimir and Tobias for the review.
I have pushed the change.

Regards,
Vivek

-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Monday, January 14, 2019 4:26 PM
To: Deshpande, Vivek R <vivek.r.deshpande at intel.com>; Tobias Hartmann <tobias.hartmann at oracle.com>; hotspot-compiler-dev at openjdk.java.net compiler <hotspot-compiler-dev at openjdk.java.net>
Cc: Raj, Guru <guru.raj at intel.com>; Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
Subject: Re: RFR(S):8216050:X86: Fix for Superword optimization fails with assert(0 <= i && i < _len) failed: illegal index

On 1/14/19 4:17 PM, Deshpande, Vivek R wrote:
> Hi Vladimir
> 
> Thanks for looking at the patch.
> The MulAddS2I node gets packed in follow_use_defs() with this approach in which we just perform swaps in follow_def_uses and return false.

Got it. I confused follow_use_defs() with follow_def_uses().

Changes are good.

Vladimir

> This way MulAddS2I nodes gets the right alignment of multiple of 4 from its outs.
> If we return true after the swaps in follow_def_uses(), it gets alignment  as multiple of 2(from LoadS) for packing, instead of multiple of 4.
> 
> Regards,
> Vivek
> 
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Monday, January 14, 2019 2:37 PM
> To: Deshpande, Vivek R <vivek.r.deshpande at intel.com>; Tobias Hartmann 
> <tobias.hartmann at oracle.com>; hotspot-compiler-dev at openjdk.java.net 
> compiler <hotspot-compiler-dev at openjdk.java.net>
> Cc: Raj, Guru <guru.raj at intel.com>
> Subject: Re: RFR(S):8216050:X86: Fix for Superword optimization fails 
> with assert(0 <= i && i < _len) failed: illegal index
> 
> Hi Vivek,
> 
> I do not understand changes in superword.cpp.
> 
> muladds2i will never be packed in follow_def_uses() since you return 'false' for muladds2i in all cases when u1 != u2 (even when i1 == i2). Is it intentional?
> 
> Thanks,
> Vladimir
> 
> On 1/11/19 11:38 AM, Deshpande, Vivek R wrote:
>> Hi Tobias
>>
>> Thanks for reviewing the patch.
>> I have made the changes according to your suggestion.
>> In this webrev:
>> http://cr.openjdk.java.net/~vdeshpande/8216050/webrev.01/
>> I have fix for the crash reported in the 8216050.
>>
>> The lower cost is needed for generation of vpdpwssd instruction, by combining AddVI and MulAddVS2VI.
>> For other instructions pmaddwd and vpmaddwd, they get generated on platforms upto skylake with default cost.
>>
>> I have updated the bug also with the link to webrev.
>>
>> I have created a different bug JDK-8216580 for
>>    3) Fix generation of vector code by allowing adjacent LoadS nodes to be isomorphic when they have different control RangeCheck nodes
>>        for a[i] and a[i+1] accesses in same MulAddS2I node
>>
>> Thank you.
>> Regards,
>> Vivek
>>
>> -----Original Message-----
>> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com]
>> Sent: Friday, January 11, 2019 4:49 AM
>> To: Deshpande, Vivek R <vivek.r.deshpande at intel.com>; 
>> hotspot-compiler-dev at openjdk.java.net compiler 
>> <hotspot-compiler-dev at openjdk.java.net>
>> Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com>; Viswanathan, 
>> Sandhya <sandhya.viswanathan at intel.com>; Raj, Guru 
>> <guru.raj at intel.com>
>> Subject: Re: RFR(S):8216050:X86: Fix for Superword optimization fails 
>> with assert(0 <= i && i < _len) failed: illegal index
>>
>> Hi Vivek,
>>
>> On 11.01.19 07:58, Deshpande, Vivek R wrote:
>>> 1) Fix for the crash by matching the operand by swapping to right positions.
>>
>> Looks good but the change to loopopts.cpp:530 screwed up the indentation around the ifs, please fix.
>>
>>> 2) Cost based generation of vpdpwssd instruction.
>>
>> Other instructions added by JDK-8214751 still miss a cost definition, for example:
>> http://hg.openjdk.java.net/jdk/jdk/rev/4bb6e0871bf7#l5.20
>>
>>> 3) Fix generation of vector code by allowing adjacent LoadS nodes to 
>>> be isomorphic when they have different control RangeCheck nodes
>>>       for a[i] and a[i+1] accesses in same MulAddS2I node
>>
>> This is unrelated to the original bug, right? If so, this should be integrated with a separate RFE.
>>
>> Thanks,
>> Tobias
>>


More information about the hotspot-compiler-dev mailing list