RFR (M) 8207746:C2: Lucene crashes on AVX512 instruction

Vladimir Kozlov vladimir.kozlov at oracle.com
Mon Sep 24 21:51:01 UTC 2018


Looks good. I start testing again.

I don't see your 'submit' job in a system. I asked people to look what happened.

Thanks,
Vladimir

On 9/21/18 2:30 PM, Viswanathan, Sandhya wrote:
> Hi Vladimir,
> 
> Please find the updated webrev with fix for build failure on SPARC and other architectures at:
> Patch: http://cr.openjdk.java.net/~vdeshpande/xmm_reg/webrev.04/
> RFE: https://bugs.openjdk.java.net/browse/JDK-8210764
> 
> Vivek submitted this webrev for testing to submit repo yesterday at around noon. We haven’t received any email back so far. This is our first time with submit repo.
> http://mail.openjdk.java.net/pipermail/jdk-submit-changes/2018-September/003164.html
> 
> Best Regards,
> Sandhya
> 
> -----Original Message-----
> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Viswanathan, Sandhya
> Sent: Thursday, September 20, 2018 10:53 AM
> To: Vladimir Kozlov <vladimir.kozlov at oracle.com>; hotspot-compiler-dev <hotspot-compiler-dev at openjdk.java.net>
> Subject: RE: RFR (M) 8207746:C2: Lucene crashes on AVX512 instruction
> 
> Hi Vladimir,
> 
> In C1_LIRAssembler.hpp, when I added an additional parameter to negate, I did make sure to add it as a default parameter:
> 
> src/hotspot/share/c1/c1_LIRAssembler.hpp, line 282:
>    void negate(LIR_Opr left, LIR_Opr dest, LIR_Opr tmp = LIR_OprFact::illegalOpr);
> 
> But I guess since the function is not just called but declared/defined in all the other architectures, I need to add an unused LIR_Opr to the negate function for them.
> This would be on similar lines as done in some other C1_LIRAssembler methods.
> 
> I will make this change and work with Vivek to use the submit repo for testing it on Sparc.
> 
> Best Regards,
> Sandhya
> 
> 
> 
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Thursday, September 20, 2018 10:09 AM
> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; hotspot-compiler-dev <hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512 instruction
> 
> I hit build failure on SPARC due to shared changes in C1:
> 
> workspace/open/src/hotspot/cpu/sparc/c1_LIRAssembler_sparc.cpp", line 3027: Error: "LIR_Assembler::negate(LIR_OprDesc*,
> LIR_OprDesc*)" was previously declared "LIR_Assembler::negate(LIR_OprDesc*, LIR_OprDesc*, LIR_OprDesc*)".
> jib > 1 Error(s) detected.
> 
> I assume other platforms are also affected.
> 
> Vladimir
> 
> On 9/19/18 9:53 AM, Vladimir Kozlov wrote:
>> Thank you, Sandhya
>>
>> I submitted new testing.
>>
>> Vladimir
>>
>> On 9/18/18 4:52 PM, Viswanathan, Sandhya wrote:
>>> Hi Vladimir,
>>>
>>> Please find below the updated webrev with fixes for the two issues:
>>>
>>> Patch: http://cr.openjdk.java.net/~vdeshpande/xmm_reg/webrev.03/
>>> <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.03/>
>>>
>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8210764
>>>
>>> Fix for compiler/vectorization/TestNaNVector.java was to pass legVecS
>>> as the temporary register type for intrinsics instead of legVecD.
>>>
>>> This test was only failing with -XX:MaxVectorSize=4.
>>>
>>> The file modified is x86_64.ad.
>>>
>>> Fix for compiler/vectorization/TestNaNVector.java was to allow all
>>> xmm registers (xmm0-xmm31) for C1 and handle floating point abs and negate appropriately by providing a temp register.
>>>
>>> The C1 files are modified for this fix.
>>>
>>> I reran compiler jtreg tests with setting nativepath appropriately on Haswell, SKX and KNL.
>>>
>>> Best Regards,
>>>
>>> Sandhya
>>>
>>> *From:*Viswanathan, Sandhya
>>> *Sent:* Tuesday, September 18, 2018 1:47 PM
>>> *To:* 'JC Beyler' <jcbeyler at google.com>
>>> *Cc:* vladimir.kozlov at oracle.com; hotspot-compiler-dev
>>> <hotspot-compiler-dev at openjdk.java.net>
>>> *Subject:* RE: RFR (M) 8207746:C2: Lucene crashes on AVX512
>>> instruction
>>>
>>> Hi Jc,
>>>
>>> Thanks a lot for the steps. I am now able to verify my fix for the NativeCallTest.java.
>>>
>>> Best Regards,
>>>
>>> Sandhya
>>>
>>> *From:*JC Beyler [mailto:jcbeyler at google.com]
>>> *Sent:* Monday, September 17, 2018 9:29 PM
>>> *To:* Viswanathan, Sandhya <sandhya.viswanathan at intel.com
>>> <mailto:sandhya.viswanathan at intel.com>>
>>> *Cc:* vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>;
>>> hotspot-compiler-dev <hotspot-compiler-dev at openjdk.java.net
>>> <mailto:hotspot-compiler-dev at openjdk.java.net>>
>>> *Subject:* Re: RFR (M) 8207746:C2: Lucene crashes on AVX512
>>> instruction
>>>
>>> Hi Sandhya,
>>>
>>> How are you invoking the test for NativeCallTest?
>>>
>>> The way I would do it using jtreg would be something like this:
>>>
>>> $ export BUILD_TYPE=release
>>>
>>> $ export JDK_PATH=wherever you have your JDK
>>>
>>>   From the test subfolder:
>>>
>>> $ wherever-your-jtreg-is/bin/jtreg
>>> -nativepath:$JDK_PATH/build/linux-x86_64-normal-server-$BUILD_TYPE/su
>>> pport/test/hotspot/jtreg/native/lib -jdk
>>> $JDK_PATH/build/linux-x86_64-normal-server-$BUILD_TYPE/images/jdk
>>> hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/t
>>> est/NativeCallTest.java
>>>
>>> Seems to pass for me.
>>>
>>> But much easier is:
>>>
>>> $ make run-test TEST="test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/NativeCallTest.java"
>>>
>>> That seems to pass for me as well and is easier to use :)
>>>
>>> For information, the make run-test documentation is here:
>>>
>>> http://hg.openjdk.java.net/jdk9/jdk9/raw-file/tip/common/doc/testing.
>>> html
>>>
>>> Let me know if that helps,
>>>
>>> Jc
>>>
>>>      Note: For NativeCallTest.java and many others I get the following message in the corresponding .jtr file:
>>>                           "test result: Error. Use -nativepath to specify the location of native code"
>>>                   Do I need to give any additional info to jtreg to get over this problem?
>>>
>>>      Thanks a lot!
>>>      Best Regards,
>>>      Sandhya
>>>
>>>      -----Original Message-----
>>>      From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com
>>> <mailto:vladimir.kozlov at oracle.com>]
>>>      Sent: Monday, September 17, 2018 10:14 AM
>>>      To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com
>>> <mailto:sandhya.viswanathan at intel.com>>;
>>>      hotspot-compiler-dev at openjdk.java.net
>>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>>>      Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512
>>> instruction
>>>
>>>      I finished testing on avx512 machine.
>>>      All passed except known (TestNaNVector.java) failures.
>>>
>>>      Thanks,
>>>      Vladimir
>>>
>>>      On 9/14/18 5:22 PM, Vladimir Kozlov wrote:
>>>       > I gave incorrect link to RFE. Here is correct:
>>>       >
>>>       > https://bugs.openjdk.java.net/browse/JDK-8210764
>>>       >
>>>       > Vladimir
>>>       >
>>>       > On 9/14/18 3:49 PM, Vladimir Kozlov wrote:
>>>       >> Build failure I got on MacOS and Windows. Linux passed for some reason so I am not surprise you did not noticed.
>>>       >>
>>>       >> Anyway. I tested with these changes and got next failures on avx1 machines. I am planning to run on avx512 too.
>>>       >>
>>>       >> 1. compiler/vectorization/TestNaNVector.java with '-Xcomp' or '-XX:CompileThreshold=100 -XX:-TieredCompilation'
>>>      on CPU
>>>       >> with AVX1 only
>>>       >>
>>>       >> #  SIGSEGV (0xb) at pc=0x00007f3b146410f0, pid=13871,
>>> tid=13884
>>>       >> # Problematic frame:
>>>       >> # V  [libjvm.so+0x46f0f0]  MachNode::ideal_reg() const+0x20
>>>       >>
>>>       >> Current CompileTask:
>>>       >> C2:    154    5             java.lang.String::equals (65
>>> bytes)
>>>       >>
>>>       >> Stack: [0x00007f3b10044000,0x00007f3b10145000],
>>> sp=0x00007f3b1013fe70,  free space=1007k
>>>       >> Native frames: (J=compiled Java code, A=aot compiled Java
>>> code, j=interpreted, Vv=VM code, C=native code)
>>>       >> V  [libjvm.so+0x46f0f0]  MachNode::ideal_reg() const+0x20
>>>       >> V  [libjvm.so+0x882a72]
>>> PhaseChaitin::gather_lrg_masks(bool)+0x872
>>>       >> V  [libjvm.so+0xd82235]  PhaseCFG::global_code_motion()+0xfc5
>>>       >> V  [libjvm.so+0xd824b1]
>>> PhaseCFG::do_global_code_motion()+0x51
>>>       >> V  [libjvm.so+0xa2c26c]  Compile::Code_Gen()+0x24c
>>>       >> V  [libjvm.so+0xa2ff82]  Compile::Compile(ciEnv*,
>>> C2Compiler*, ciMethod*, int, bool, bool, bool,
>>>      DirectiveSet*)+0xe42
>>>       >>
>>>       >>
>>> ---------------------------------------------------------------------
>>> ---------------------------
>>>       >> 2.
>>>
>>>      with '-Xcomp'
>>>       >> #  Internal Error
>>> (workspace/open/src/hotspot/share/c1/c1_LinearScan.cpp:5646),
>>> pid=22016, tid=22073
>>>       >> #  assert(false) failed: cannot spill interval that is used
>>> in first instruction (possible reason: no register
>>>      found)
>>>       >>
>>>       >> Current CompileTask:
>>>       >> C1: 854767 13391       3 org.sunflow.math.Matrix4::multiply
>>> (692 bytes)
>>>       >>
>>>       >> Stack: [0x00007f23b9d82000,0x00007f23b9e83000],
>>> sp=0x00007f23b9e7f9d0,  free space=1014k
>>>       >> Native frames: (J=compiled Java code, A=aot compiled Java
>>> code, j=interpreted, Vv=VM code, C=native code)
>>>       >> V  [libjvm.so+0x1882202]  VMError::report_and_die(int, char
>>> const*, char const*, __va_list_tag*, Thread*, unsigned
>>>       >> char*, void*, void*, char const*, int, unsigned long)+0x562
>>>       >> V  [libjvm.so+0x1882d2f]  VMError::report_and_die(Thread*,
>>> void*, char const*, int, char const*, char const*,
>>>       >> __va_list_tag*)+0x2f
>>>       >> V  [libjvm.so+0xb0bea0]  report_vm_error(char const*, int,
>>> char const*, char const*, ...)+0x100
>>>       >> V  [libjvm.so+0x7e0410]
>>> LinearScanWalker::alloc_locked_reg(Interval*)+0x3a0
>>>       >> V  [libjvm.so+0x7e0a20]
>>> LinearScanWalker::activate_current()+0x280
>>>       >> V  [libjvm.so+0x7e0c7d]  IntervalWalker::walk_to(int) [clone
>>> .constprop.299]+0x9d
>>>       >> V  [libjvm.so+0x7e1078]
>>> LinearScan::allocate_registers()+0x338
>>>       >> V  [libjvm.so+0x7e2135]  LinearScan::do_linear_scan()+0x155
>>>       >> V  [libjvm.so+0x70a6bb]  Compilation::emit_lir()+0x99b
>>>       >> V  [libjvm.so+0x70caff]
>>> Compilation::compile_java_method()+0x42f
>>>       >> V  [libjvm.so+0x70d974]  Compilation::compile_method()+0x1d4
>>>       >> V  [libjvm.so+0x70e547]
>>> Compilation::Compilation(AbstractCompiler*, ciEnv*, ciMethod*, int,
>>> BufferBlob*,
>>>       >> DirectiveSet*)+0x357
>>>       >> V  [libjvm.so+0x71073c]  Compiler::compile_method(ciEnv*,
>>> ciMethod*, int, DirectiveSet*)+0x14c
>>>       >> V  [libjvm.so+0xa3cf89]
>>> CompileBroker::invoke_compiler_on_method(CompileTask*)+0x409
>>>       >>
>>>       >> Vladimir
>>>       >>
>>>       >> On 9/14/18 1:27 PM, Viswanathan, Sandhya wrote:
>>>       >>>
>>>       >>> Thanks Vladimir, the below should fix this issue:
>>>       >>>
>>>       >>> ------------------------------
>>>       >>> --- old/src/hotspot/cpu/x86/c1_FrameMap_x86.cpp 2018-09-14
>>> 13:10:23.488379912 -0700
>>>       >>> +++ new/src/hotspot/cpu/x86/c1_FrameMap_x86.cpp 2018-09-14
>>> 13:10:23.308379915 -0700
>>>       >>> @@ -233,22 +233,6 @@
>>>       >>>     _xmm_regs[13]  = xmm13;
>>>       >>>     _xmm_regs[14]  = xmm14;
>>>       >>>     _xmm_regs[15]  = xmm15;
>>>       >>> -  _xmm_regs[16]  = xmm16;
>>>       >>> -  _xmm_regs[17]  = xmm17;
>>>       >>> -  _xmm_regs[18]  = xmm18;
>>>       >>> -  _xmm_regs[19]  = xmm19;
>>>       >>> -  _xmm_regs[20]  = xmm20;
>>>       >>> -  _xmm_regs[21]  = xmm21;
>>>       >>> -  _xmm_regs[22]  = xmm22;
>>>       >>> -  _xmm_regs[23]  = xmm23;
>>>       >>> -  _xmm_regs[24]  = xmm24;
>>>       >>> -  _xmm_regs[25]  = xmm25;
>>>       >>> -  _xmm_regs[26]  = xmm26;
>>>       >>> -  _xmm_regs[27]  = xmm27;
>>>       >>> -  _xmm_regs[28]  = xmm28;
>>>       >>> -  _xmm_regs[29]  = xmm29;
>>>       >>> -  _xmm_regs[30]  = xmm30;
>>>       >>> -  _xmm_regs[31]  = xmm31;
>>>       >>>   #endif // _LP64
>>>       >>>
>>>       >>>     for (int i = 0; i < 8; i++) {
>>>       >>> ---------------------------------
>>>       >>>
>>>       >>> I think the gcc version on my desktop is older so didn’t catch this.
>>>       >>>
>>>       >>> The updated patch along with the above change and setting UseAVX as 3 is uploaded to:
>>>       >>> Patch:
>>> http://cr.openjdk.java.net/~vdeshpande/xmm_reg/webrev.02/
>>>      <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.02/>
>>>       >>> RFE: https://bugs.openjdk.java.net/browse/JDK-8209735
>>>       >>>
>>>       >>> FYI, I did notice that the default for UseAVX had been
>>> rolled back and wanted to get confirmation from you before
>>>       >>> changing it back to 3.
>>>       >>>
>>>       >>> Best Regards,
>>>       >>> Sandhya
>>>       >>>
>>>       >>>
>>>       >>> -----Original Message-----
>>>       >>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com
>>> <mailto:vladimir.kozlov at oracle.com>]
>>>       >>> Sent: Friday, September 14, 2018 12:13 PM
>>>       >>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com
>>> <mailto:sandhya.viswanathan at intel.com>>;
>>>      hotspot-compiler-dev at openjdk.java.net
>>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>>>       >>> Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512
>>> instruction
>>>       >>>
>>>       >>> I got build failure:
>>>       >>>
>>>       >>>
>>> workspace/open/src/hotspot/cpu/x86/c1_FrameMap_x86.cpp:236:3: error:
>>> array index 16 is past the end of the array
>>>       >>> (which contains 16 elements) [-Werror,-Warray-bounds]
>>>       >>> jib >   _xmm_regs[16]  = xmm16;
>>>       >>>
>>>       >>> I also noticed that we don't have RFE for this work. I filed:
>>>       >>>
>>>       >>> https://bugs.openjdk.java.net/browse/JDK-8209735
>>>       >>>
>>>       >>> You did not enabled avx512 by default (8209735 change was
>>> synced from jdk 11 into 12 2 weeks ago). I added next
>>>       >>> change to yours in src/hotspot/cpu/x86/globals_x86.hpp:
>>>       >>>
>>>       >>> - product(intx, UseAVX, 2, \
>>>       >>> + product(intx, UseAVX, 3, \
>>>       >>>
>>>       >>> Thanks,
>>>       >>> Vladimir
>>>       >>>
>>>       >>> On 9/14/18 11:39 AM, Vladimir Kozlov wrote:
>>>       >>>> Looks good to me. I will start testing and let you know results.
>>>       >>>>
>>>       >>>> Thanks,
>>>       >>>> Vladimir
>>>       >>>>
>>>       >>>> On 9/13/18 6:05 AM, Viswanathan, Sandhya wrote:
>>>       >>>>> Hi Vladimir,
>>>       >>>>>
>>>       >>>>> Please find below the updated webrev with all your comments incorporated:
>>>       >>>>>
>>>       >>>>> http://cr.openjdk.java.net/~vdeshpande/xmm_reg/webrev.01/
>>>      <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.01/>
>>>       >>>>>
>>>       >>>>> I have run the jtreg compiler tests on SKX and KNL which
>>> have two
>>>       >>>>> different flavors of AVX512 and Haswell a non-avx512 system. Also tested SPECjvm2008 on the three platforms.
>>>       >>>>>
>>>       >>>>> Best Regards,
>>>       >>>>> Sandhya
>>>       >>>>>
>>>       >>>>>
>>>       >>>>> -----Original Message-----
>>>       >>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com
>>> <mailto:vladimir.kozlov at oracle.com>]
>>>       >>>>> Sent: Tuesday, September 11, 2018 8:54 PM
>>>       >>>>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com
>>> <mailto:sandhya.viswanathan at intel.com>>;
>>>       >>>>> hotspot-compiler-dev at openjdk.java.net
>>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>>>       >>>>> Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512
>>> instruction
>>>       >>>>>
>>>       >>>>> Thank you, Sandhya
>>>       >>>>>
>>>       >>>>> I am satisfied with your detailed answer for memory loads issues. Okay lets not add them.
>>>       >>>>>
>>>       >>>>> Vladimir
>>>       >>>>>
>>>       >>>>> On 9/11/18 6:13 PM, Viswanathan, Sandhya wrote:
>>>       >>>>>> Hi Vladimir,
>>>       >>>>>>
>>>       >>>>>> Thanks a lot for the detailed review. I really appreciate your feedback.
>>>       >>>>>> Please see my response in your email below marked with
>>> (Sandhya
>>>       >>>>>>>>> ). Looking forward to your advice.
>>>       >>>>>>
>>>       >>>>>> Best Regards,
>>>       >>>>>> Sandhya
>>>       >>>>>>
>>>       >>>>>>
>>>       >>>>>> -----Original Message-----
>>>       >>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com
>>> <mailto:vladimir.kozlov at oracle.com>]
>>>       >>>>>> Sent: Tuesday, September 11, 2018 5:11 PM
>>>       >>>>>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com
>>> <mailto:sandhya.viswanathan at intel.com>>;
>>>       >>>>>> hotspot-compiler-dev at openjdk.java.net
>>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>>>       >>>>>> Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512
>>>       >>>>>> instruction
>>>       >>>>>>
>>>       >>>>>> Thank you.
>>>       >>>>>>
>>>       >>>>>> I want to discuss next issue:
>>>       >>>>>>
>>>       >>>>>>     > You did not added instructions to load these
>>> registers from
>>>       >>>>>> memory (and stack). What happens in such cases when you need to load or store?
>>>       >>>>>>     >>>> Let us take an example, e.g. for loading into
>>> rregF. First
>>>       >>>>>> it gets loaded from memory into regF and then register to register move to rregF and vice versa.
>>>       >>>>>>
>>>       >>>>>> This is what I thought. You increase registers pressure this way which may cause spills on stack.
>>>       >>>>>> Also we don't check that register could be the same as result you may get unneeded moves.
>>>       >>>>>>
>>>       >>>>>> I would advice add memory moves at least.
>>>       >>>>>>
>>>       >>>>>> Sandhya >>>  I had added those rules initially and
>>> removed them in
>>>       >>>>>> the final patch. I noticed that the register allocator
>>> uses the
>>>       >>>>>> memory rules (e.g. LoadF) to initialize the idealreg2reg
>>> mask
>>>       >>>>>> (matcher.cpp). I would like the register allocator to get
>>> all the
>>>       >>>>>> possible register on an architecture for idealreg2reg
>>> mask. I
>>>       >>>>>> wondered that multiple instruct rules in .ad file for
>>> LoadF from
>>>       >>>>>> memory might cause problems.  I would have to have higher
>>> cost for
>>>       >>>>>> loading into restricted register set like vlReg. Then I
>>> decided that
>>>       >>>>>> the register allocator can handle this in much better way
>>> than me
>>>       >>>>>> adding rules to load from memory. This is with the
>>> background that the regF is always all the available
>>>      registers
>>>       >>>>>> and vlRegF is the restricted register set. Likewise for
>>> VecS and legVecS. Let me know you thoughts on this
>>>      and if
>>>       >>>>>> I should still add the rules to load from memory into
>>> vlReg and legVec. The specific code from matcher.cpp
>>>      that I
>>>       >>>>>> am referring to is:
>>>       >>>>>>      MachNode *spillCP = match_tree(new
>>>       >>>>>>
>>> LoadNNode(NULL,mem,fp,atp,TypeInstPtr::BOTTOM,MemNode::unordered));
>>>       >>>>>> #endif
>>>       >>>>>>      MachNode *spillI  = match_tree(new
>>>       >>>>>>
>>> LoadINode(NULL,mem,fp,atp,TypeInt::INT,MemNode::unordered));
>>>       >>>>>>      MachNode *spillL  = match_tree(new
>>>       >>>>>>
>>> LoadLNode(NULL,mem,fp,atp,TypeLong::LONG,MemNode::unordered,
>>>       >>>>>> LoadNode::DependsO nlyOnTest, false));
>>>       >>>>>>      MachNode *spillF  = match_tree(new
>>>       >>>>>>
>>> LoadFNode(NULL,mem,fp,atp,Type::FLOAT,MemNode::unordered));
>>>       >>>>>>      MachNode *spillD  = match_tree(new
>>>       >>>>>>
>>> LoadDNode(NULL,mem,fp,atp,Type::DOUBLE,MemNode::unordered));
>>>       >>>>>>      MachNode *spillP  = match_tree(new
>>>       >>>>>>
>>> LoadPNode(NULL,mem,fp,atp,TypeInstPtr::BOTTOM,MemNode::unordered));
>>>       >>>>>>      ....
>>>       >>>>>>      idealreg2regmask[Op_RegF] = &spillF->out_RegMask();
>>>       >>>>>>
>>>       >>>>>> An other question. You use movflt() and movdbl() which
>>> use either
>>>       >>>>>> movap[s|d] and movs[s|d]
>>>       >>>>>> instructions:
>>>       >>>>>>
>>> http://hg.openjdk.java.net/jdk/jdk/file/4ffb0a33f265/src/hotspot/cpu
>>>       >>>>>> /x86/macroAssembler_x86.hpp#l164 Are these instructions
>>> work when
>>>       >>>>>> avx512vl is not available? I see for vectors you use
>>>       >>>>>> vpxor+vinserti* combination.
>>>       >>>>>>
>>>       >>>>>> Sandhya >>> Yes the scalar floating point instructions
>>> are available
>>>       >>>>>> with AVX512 encoding when avx512vl is not available. That
>>> is why you
>>>       >>>>>> would see not just movflt, movdbl but all the other
>>> scalar
>>>       >>>>>> operations like adds, addsd etc using the entire xmm
>>> range (xmm0-31). In other words they are AVX512F
>>>      instructions.
>>>       >>>>>>
>>>       >>>>>> Last question. I notice next UseAVX check in vectors spills code in x86.ad <http://x86.ad>:
>>>       >>>>>> if ((UseAVX < 2) || VM_Version::supports_avx512vl())
>>>       >>>>>>
>>>       >>>>>> Should it be (UseAVX < 3)?
>>>       >>>>>>
>>>       >>>>>> Sandhya >>> Yes, that is a very good catch. I will fix this in the updated webrev.
>>>       >>>>>>
>>>       >>>>>> Thanks,
>>>       >>>>>> Vladimir
>>>       >>>>>>
>>>       >>>>>> On 9/11/18 2:58 PM, Viswanathan, Sandhya wrote:
>>>       >>>>>>> Hi Vladimir,
>>>       >>>>>>>
>>>       >>>>>>> Thanks a lot for your review and feedback. Please see my
>>> response
>>>       >>>>>>> in your email below. I will send an updated webrev incorporating your feedback.
>>>       >>>>>>>
>>>       >>>>>>> Best Regards,
>>>       >>>>>>> Sandhya
>>>       >>>>>>>
>>>       >>>>>>>
>>>       >>>>>>> -----Original Message-----
>>>       >>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com
>>> <mailto:vladimir.kozlov at oracle.com>]
>>>       >>>>>>> Sent: Monday, September 10, 2018 6:09 PM
>>>       >>>>>>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com
>>> <mailto:sandhya.viswanathan at intel.com>>;
>>>       >>>>>>> hotspot-compiler-dev at openjdk.java.net
>>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>>>       >>>>>>> Subject: Re: RFR (M) 8207746:C2: Lucene crashes on
>>> AVX512
>>>       >>>>>>> instruction
>>>       >>>>>>>
>>>       >>>>>>> Very nice. Thank you, Sandhya.
>>>       >>>>>>>
>>>       >>>>>>> I would like to see more meaningful naming in .ad files
>>> - instead
>>>       >>>>>>> of rreg* and ovec* to have something like vlReg* and legVec*.
>>>       >>>>>>>
>>>       >>>>>>>>>> Yes, accepted.
>>>       >>>>>>>
>>>       >>>>>>> New load_from_* and load_to_* instructions in .ad files
>>> should be
>>>       >>>>>>> renamed to next and collocate with other Move*_reg_reg* instructions:
>>>       >>>>>>>
>>>       >>>>>>> instruct MoveF2VL(vlRegF dst, regF src) instruct
>>> MoveVL2F(regF dst,
>>>       >>>>>>> vlRegF src)
>>>       >>>>>>>>>> Yes, accepted.
>>>       >>>>>>>
>>>       >>>>>>> You did not added instructions to load these registers
>>> from memory
>>>       >>>>>>> (and stack). What happens in such cases when you need to load or store?
>>>       >>>>>>>>>> Let us take an example, e.g. for loading into rregF.
>>> First it
>>>       >>>>>>>>>> gets loaded from memory into regF and then register to register move to rregF and vice versa.
>>>       >>>>>>>
>>>       >>>>>>> Also please explain why these registers are used when UseAVX == 0?:
>>>       >>>>>>>
>>>       >>>>>>> +instruct absD_reg(rregD dst) %{
>>>       >>>>>>>         predicate((UseSSE>=2) && (UseAVX == 0));
>>>       >>>>>>>
>>>       >>>>>>> we switch off evex so regular regD (only legacy register in this case) should work too:
>>>       >>>>>>>       661   if (UseAVX < 3) {
>>>       >>>>>>>       662     _features &= ~CPU_AVX512F;
>>>       >>>>>>>
>>>       >>>>>>>>>> Yes, accepted. It could be regD here.
>>>       >>>>>>>
>>>       >>>>>>> Next checks could be combined by using new function in vm_version_x86.hpp (you already have some):
>>>       >>>>>>>
>>>       >>>>>>> +reg_class_dynamic vectors_reg_bwdqvl(vectors_reg_evex,
>>>       >>>>>>> +vectors_reg_legacy, %{
>>>       >>>>>>> VM_Version::supports_evex() &&
>>> VM_Version::supports_avx512bw() &&
>>>       >>>>>>> VM_Version::supports_avx512dq() &&
>>>       >>>>>>> VM_Version::supports_avx512vl() %} );
>>>       >>>>>>>
>>>       >>>>>>>>>> Yes, accepted.
>>>       >>>>>>>
>>>       >>>>>>> I would suggest to test these changes on different
>>> machines
>>>       >>>>>>> (non-avx512 and avx512) and with different UseAVX values.
>>>       >>>>>>>
>>>       >>>>>>>>>> Will do.
>>>       >>>>>>>
>>>       >>>>>>> Thanks,
>>>       >>>>>>> Vladimir
>>>       >>>>>>>
>>>       >>>>>>> On 9/5/18 4:09 PM, Viswanathan, Sandhya wrote:
>>>       >>>>>>>> Recently there have been couple of high priority issues
>>> with
>>>       >>>>>>>> regards to high bank of XMM register
>>>       >>>>>>>> (XMM16-XMM31) usage by C2:
>>>       >>>>>>>>
>>>       >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8207746
>>>       >>>>>>>>
>>>       >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8209735
>>>       >>>>>>>>
>>>       >>>>>>>> Please find below a patch which attempts to clean up
>>> the XMM
>>>       >>>>>>>> register handling by using register groups.
>>>       >>>>>>>>
>>>       >>>>>>>>
>>> http://cr.openjdk.java.net/~vdeshpande/xmm_reg/webrev.00/
>>>      <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.00/>
>>>       >>>>>>>>
>>> <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.00/>
>>>       >>>>>>>>
>>>       >>>>>>>> The patch provides a restricted set of registers to the
>>> match
>>>       >>>>>>>> rules in the ad file based on the underlying architecture.
>>>       >>>>>>>>
>>>       >>>>>>>> The aim is to remove special handling/workaround from macro assembler and assembler.
>>>       >>>>>>>>
>>>       >>>>>>>> By removing the special handling, the patch reduces the
>>> overall
>>>       >>>>>>>> code size by about 1800 lines of code.
>>>       >>>>>>>>
>>>       >>>>>>>> Your review and feedback is very welcome.
>>>       >>>>>>>>
>>>       >>>>>>>> Best Regards,
>>>       >>>>>>>>
>>>       >>>>>>>> Sandhya
>>>       >>>>>>>>
>>>
>>>
>>> --
>>>
>>> Thanks,
>>>
>>> Jc
>>>


More information about the hotspot-compiler-dev mailing list