RFR (M) 8207746:C2: Lucene crashes on AVX512 instruction

Vladimir Kozlov vladimir.kozlov at oracle.com
Wed Sep 12 00:11:25 UTC 2018


Thank you.

I want to discuss next issue:

 > You did not added instructions to load these registers from memory (and stack). What happens in 
such cases when you need to load or store?
 >>>> Let us take an example, e.g. for loading into rregF. First it gets loaded from memory into 
regF and then register to register move to rregF and vice versa.

This is what I thought. You increase registers pressure this way which may cause spills on stack. 
Also we don't check that register could be the same as result you may get unneeded moves.

I would advice add memory moves at least.

An other question. You use movflt() and movdbl() which use either movap[s|d] and movs[s|d] instructions:
http://hg.openjdk.java.net/jdk/jdk/file/4ffb0a33f265/src/hotspot/cpu/x86/macroAssembler_x86.hpp#l164
Are these instructions work when avx512vl is not available? I see for vectors you use 
vpxor+vinserti* combination.

Last question. I notice next UseAVX check in vectors spills code in x86.ad:
if ((UseAVX < 2) || VM_Version::supports_avx512vl())

Should it be (UseAVX < 3)?

Thanks,
Vladimir

On 9/11/18 2:58 PM, Viswanathan, Sandhya wrote:
> Hi Vladimir,
> 
> Thanks a lot for your review and feedback. Please see my response in your email below. I will send an updated webrev incorporating your feedback.
> 
> Best Regards,
> Sandhya
> 
> 
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Monday, September 10, 2018 6:09 PM
> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512 instruction
> 
> Very nice. Thank you, Sandhya.
> 
> I would like to see more meaningful naming in .ad files - instead of rreg* and ovec* to have something like vlReg* and legVec*.
> 
>>>> Yes, accepted.
> 
> New load_from_* and load_to_* instructions in .ad files should be renamed to next and collocate with other Move*_reg_reg* instructions:
> 
> instruct MoveF2VL(vlRegF dst, regF src)
> instruct MoveVL2F(regF dst, vlRegF src)
>>>> Yes, accepted.
> 
> You did not added instructions to load these registers from memory (and stack). What happens in such cases when you need to load or store?
>>>> Let us take an example, e.g. for loading into rregF. First it gets loaded from memory into regF and then register to register move to rregF and vice versa.
> 
> Also please explain why these registers are used when UseAVX == 0?:
> 
> +instruct absD_reg(rregD dst) %{
>      predicate((UseSSE>=2) && (UseAVX == 0));
> 
> we switch off evex so regular regD (only legacy register in this case) should work too:
>    661   if (UseAVX < 3) {
>    662     _features &= ~CPU_AVX512F;
> 
>>>> Yes, accepted. It could be regD here.
> 
> Next checks could be combined by using new function in vm_version_x86.hpp (you already have some):
> 
> +reg_class_dynamic vectors_reg_bwdqvl(vectors_reg_evex,
> +vectors_reg_legacy, %{
> VM_Version::supports_evex() && VM_Version::supports_avx512bw() && VM_Version::supports_avx512dq() &&
> VM_Version::supports_avx512vl() %} );
> 
>>>> Yes, accepted.
> 
> I would suggest to test these changes on different machines (non-avx512 and avx512) and with different UseAVX values.
> 
>>>> Will do.
> 
> Thanks,
> Vladimir
> 
> On 9/5/18 4:09 PM, Viswanathan, Sandhya wrote:
>> Recently there have been couple of high priority issues with regards
>> to high bank of XMM register
>> (XMM16-XMM31) usage by C2:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8207746
>>
>> https://bugs.openjdk.java.net/browse/JDK-8209735
>>
>> Please find below a patch which attempts to clean up the XMM register handling by using register groups.
>>
>> http://cr.openjdk.java.net/~vdeshpande/xmm_reg/webrev.00/
>> <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.00/>
>>
>> The patch provides a restricted set of registers to the match rules in
>> the ad file based on the underlying architecture.
>>
>> The aim is to remove special handling/workaround from macro assembler and assembler.
>>
>> By removing the special handling, the patch reduces the overall code size by about 1800 lines of code.
>>
>> Your review and feedback is very welcome.
>>
>> Best Regards,
>>
>> Sandhya
>>


More information about the hotspot-compiler-dev mailing list