RFR (M) 8207746:C2: Lucene crashes on AVX512 instruction

Fri Sep 21 21:30:01 UTC 2018

Hi Vladimir,

Please find the updated webrev with fix for build failure on SPARC and other architectures at:
Patch: http://cr.openjdk.java.net/~vdeshpande/xmm_reg/webrev.04/ 
RFE: https://bugs.openjdk.java.net/browse/JDK-8210764 

Vivek submitted this webrev for testing to submit repo yesterday at around noon. We haven’t received any email back so far. This is our first time with submit repo.
http://mail.openjdk.java.net/pipermail/jdk-submit-changes/2018-September/003164.html

Best Regards,
Sandhya

-----Original Message-----
From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Viswanathan, Sandhya
Sent: Thursday, September 20, 2018 10:53 AM
To: Vladimir Kozlov <vladimir.kozlov at oracle.com>; hotspot-compiler-dev <hotspot-compiler-dev at openjdk.java.net>
Subject: RE: RFR (M) 8207746:C2: Lucene crashes on AVX512 instruction

Hi Vladimir,

In C1_LIRAssembler.hpp, when I added an additional parameter to negate, I did make sure to add it as a default parameter:

src/hotspot/share/c1/c1_LIRAssembler.hpp, line 282:
  void negate(LIR_Opr left, LIR_Opr dest, LIR_Opr tmp = LIR_OprFact::illegalOpr);

But I guess since the function is not just called but declared/defined in all the other architectures, I need to add an unused LIR_Opr to the negate function for them.
This would be on similar lines as done in some other C1_LIRAssembler methods. 

I will make this change and work with Vivek to use the submit repo for testing it on Sparc.

Best Regards,
Sandhya

-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
Sent: Thursday, September 20, 2018 10:09 AM
To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; hotspot-compiler-dev <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512 instruction

I hit build failure on SPARC due to shared changes in C1:

workspace/open/src/hotspot/cpu/sparc/c1_LIRAssembler_sparc.cpp", line 3027: Error: "LIR_Assembler::negate(LIR_OprDesc*,
LIR_OprDesc*)" was previously declared "LIR_Assembler::negate(LIR_OprDesc*, LIR_OprDesc*, LIR_OprDesc*)".
jib > 1 Error(s) detected.

I assume other platforms are also affected.

Vladimir

On 9/19/18 9:53 AM, Vladimir Kozlov wrote:
> Thank you, Sandhya
> 
> I submitted new testing.
> 
> Vladimir
> 
> On 9/18/18 4:52 PM, Viswanathan, Sandhya wrote:
>> Hi Vladimir,
>>
>> Please find below the updated webrev with fixes for the two issues:
>>
>> Patch: http://cr.openjdk.java.net/~vdeshpande/xmm_reg/webrev.03/
>> <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.03/>
>>
>> RFE: https://bugs.openjdk.java.net/browse/JDK-8210764
>>
>> Fix for compiler/vectorization/TestNaNVector.java was to pass legVecS 
>> as the temporary register type for intrinsics instead of legVecD.
>>
>> This test was only failing with -XX:MaxVectorSize=4.
>>
>> The file modified is x86_64.ad.
>>
>> Fix for compiler/vectorization/TestNaNVector.java was to allow all 
>> xmm registers (xmm0-xmm31) for C1 and handle floating point abs and negate appropriately by providing a temp register.
>>
>> The C1 files are modified for this fix.
>>
>> I reran compiler jtreg tests with setting nativepath appropriately on Haswell, SKX and KNL.
>>
>> Best Regards,
>>
>> Sandhya
>>
>> *From:*Viswanathan, Sandhya
>> *Sent:* Tuesday, September 18, 2018 1:47 PM
>> *To:* 'JC Beyler' <jcbeyler at google.com>
>> *Cc:* vladimir.kozlov at oracle.com; hotspot-compiler-dev 
>> <hotspot-compiler-dev at openjdk.java.net>
>> *Subject:* RE: RFR (M) 8207746:C2: Lucene crashes on AVX512 
>> instruction
>>
>> Hi Jc,
>>
>> Thanks a lot for the steps. I am now able to verify my fix for the NativeCallTest.java.
>>
>> Best Regards,
>>
>> Sandhya
>>
>> *From:*JC Beyler [mailto:jcbeyler at google.com]
>> *Sent:* Monday, September 17, 2018 9:29 PM
>> *To:* Viswanathan, Sandhya <sandhya.viswanathan at intel.com 
>> <mailto:sandhya.viswanathan at intel.com>>
>> *Cc:* vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>;
>> hotspot-compiler-dev <hotspot-compiler-dev at openjdk.java.net
>> <mailto:hotspot-compiler-dev at openjdk.java.net>>
>> *Subject:* Re: RFR (M) 8207746:C2: Lucene crashes on AVX512 
>> instruction
>>
>> Hi Sandhya,
>>
>> How are you invoking the test for NativeCallTest?
>>
>> The way I would do it using jtreg would be something like this:
>>
>> $ export BUILD_TYPE=release
>>
>> $ export JDK_PATH=wherever you have your JDK
>>
>>  From the test subfolder:
>>
>> $ wherever-your-jtreg-is/bin/jtreg
>> -nativepath:$JDK_PATH/build/linux-x86_64-normal-server-$BUILD_TYPE/su
>> pport/test/hotspot/jtreg/native/lib -jdk 
>> $JDK_PATH/build/linux-x86_64-normal-server-$BUILD_TYPE/images/jdk
>> hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/t
>> est/NativeCallTest.java
>>
>> Seems to pass for me.
>>
>> But much easier is:
>>
>> $ make run-test TEST="test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/NativeCallTest.java"
>>
>> That seems to pass for me as well and is easier to use :)
>>
>> For information, the make run-test documentation is here:
>>
>> http://hg.openjdk.java.net/jdk9/jdk9/raw-file/tip/common/doc/testing.
>> html
>>
>> Let me know if that helps,
>>
>> Jc
>>
>>     Note: For NativeCallTest.java and many others I get the following message in the corresponding .jtr file:
>>                          "test result: Error. Use -nativepath to specify the location of native code"
>>                  Do I need to give any additional info to jtreg to get over this problem?
>>
>>     Thanks a lot!
>>     Best Regards,
>>     Sandhya
>>
>>     -----Original Message-----
>>     From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com
>> <mailto:vladimir.kozlov at oracle.com>]
>>     Sent: Monday, September 17, 2018 10:14 AM
>>     To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com 
>> <mailto:sandhya.viswanathan at intel.com>>;
>>     hotspot-compiler-dev at openjdk.java.net
>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>>     Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512 
>> instruction
>>
>>     I finished testing on avx512 machine.
>>     All passed except known (TestNaNVector.java) failures.
>>
>>     Thanks,
>>     Vladimir
>>
>>     On 9/14/18 5:22 PM, Vladimir Kozlov wrote:
>>      > I gave incorrect link to RFE. Here is correct:
>>      >
>>      > https://bugs.openjdk.java.net/browse/JDK-8210764
>>      >
>>      > Vladimir
>>      >
>>      > On 9/14/18 3:49 PM, Vladimir Kozlov wrote:
>>      >> Build failure I got on MacOS and Windows. Linux passed for some reason so I am not surprise you did not noticed.
>>      >>
>>      >> Anyway. I tested with these changes and got next failures on avx1 machines. I am planning to run on avx512 too.
>>      >>
>>      >> 1. compiler/vectorization/TestNaNVector.java with '-Xcomp' or '-XX:CompileThreshold=100 -XX:-TieredCompilation'
>>     on CPU
>>      >> with AVX1 only
>>      >>
>>      >> #  SIGSEGV (0xb) at pc=0x00007f3b146410f0, pid=13871,
>> tid=13884
>>      >> # Problematic frame:
>>      >> # V  [libjvm.so+0x46f0f0]  MachNode::ideal_reg() const+0x20
>>      >>
>>      >> Current CompileTask:
>>      >> C2:    154    5             java.lang.String::equals (65
>> bytes)
>>      >>
>>      >> Stack: [0x00007f3b10044000,0x00007f3b10145000],
>> sp=0x00007f3b1013fe70,  free space=1007k
>>      >> Native frames: (J=compiled Java code, A=aot compiled Java 
>> code, j=interpreted, Vv=VM code, C=native code)
>>      >> V  [libjvm.so+0x46f0f0]  MachNode::ideal_reg() const+0x20
>>      >> V  [libjvm.so+0x882a72]
>> PhaseChaitin::gather_lrg_masks(bool)+0x872
>>      >> V  [libjvm.so+0xd82235]  PhaseCFG::global_code_motion()+0xfc5
>>      >> V  [libjvm.so+0xd824b1]
>> PhaseCFG::do_global_code_motion()+0x51
>>      >> V  [libjvm.so+0xa2c26c]  Compile::Code_Gen()+0x24c
>>      >> V  [libjvm.so+0xa2ff82]  Compile::Compile(ciEnv*, 
>> C2Compiler*, ciMethod*, int, bool, bool, bool,
>>     DirectiveSet*)+0xe42
>>      >>
>>      >>
>> ---------------------------------------------------------------------
>> ---------------------------
>>      >> 2.
>>
>>     with '-Xcomp'
>>      >> #  Internal Error
>> (workspace/open/src/hotspot/share/c1/c1_LinearScan.cpp:5646),
>> pid=22016, tid=22073
>>      >> #  assert(false) failed: cannot spill interval that is used 
>> in first instruction (possible reason: no register
>>     found)
>>      >>
>>      >> Current CompileTask:
>>      >> C1: 854767 13391       3 org.sunflow.math.Matrix4::multiply 
>> (692 bytes)
>>      >>
>>      >> Stack: [0x00007f23b9d82000,0x00007f23b9e83000],
>> sp=0x00007f23b9e7f9d0,  free space=1014k
>>      >> Native frames: (J=compiled Java code, A=aot compiled Java 
>> code, j=interpreted, Vv=VM code, C=native code)
>>      >> V  [libjvm.so+0x1882202]  VMError::report_and_die(int, char 
>> const*, char const*, __va_list_tag*, Thread*, unsigned
>>      >> char*, void*, void*, char const*, int, unsigned long)+0x562
>>      >> V  [libjvm.so+0x1882d2f]  VMError::report_and_die(Thread*, 
>> void*, char const*, int, char const*, char const*,
>>      >> __va_list_tag*)+0x2f
>>      >> V  [libjvm.so+0xb0bea0]  report_vm_error(char const*, int, 
>> char const*, char const*, ...)+0x100
>>      >> V  [libjvm.so+0x7e0410]
>> LinearScanWalker::alloc_locked_reg(Interval*)+0x3a0
>>      >> V  [libjvm.so+0x7e0a20]
>> LinearScanWalker::activate_current()+0x280
>>      >> V  [libjvm.so+0x7e0c7d]  IntervalWalker::walk_to(int) [clone 
>> .constprop.299]+0x9d
>>      >> V  [libjvm.so+0x7e1078]
>> LinearScan::allocate_registers()+0x338
>>      >> V  [libjvm.so+0x7e2135]  LinearScan::do_linear_scan()+0x155
>>      >> V  [libjvm.so+0x70a6bb]  Compilation::emit_lir()+0x99b
>>      >> V  [libjvm.so+0x70caff]
>> Compilation::compile_java_method()+0x42f
>>      >> V  [libjvm.so+0x70d974]  Compilation::compile_method()+0x1d4
>>      >> V  [libjvm.so+0x70e547]
>> Compilation::Compilation(AbstractCompiler*, ciEnv*, ciMethod*, int, 
>> BufferBlob*,
>>      >> DirectiveSet*)+0x357
>>      >> V  [libjvm.so+0x71073c]  Compiler::compile_method(ciEnv*, 
>> ciMethod*, int, DirectiveSet*)+0x14c
>>      >> V  [libjvm.so+0xa3cf89]
>> CompileBroker::invoke_compiler_on_method(CompileTask*)+0x409
>>      >>
>>      >> Vladimir
>>      >>
>>      >> On 9/14/18 1:27 PM, Viswanathan, Sandhya wrote:
>>      >>>
>>      >>> Thanks Vladimir, the below should fix this issue:
>>      >>>
>>      >>> ------------------------------
>>      >>> --- old/src/hotspot/cpu/x86/c1_FrameMap_x86.cpp 2018-09-14
>> 13:10:23.488379912 -0700
>>      >>> +++ new/src/hotspot/cpu/x86/c1_FrameMap_x86.cpp 2018-09-14
>> 13:10:23.308379915 -0700
>>      >>> @@ -233,22 +233,6 @@
>>      >>>     _xmm_regs[13]  = xmm13;
>>      >>>     _xmm_regs[14]  = xmm14;
>>      >>>     _xmm_regs[15]  = xmm15;
>>      >>> -  _xmm_regs[16]  = xmm16;
>>      >>> -  _xmm_regs[17]  = xmm17;
>>      >>> -  _xmm_regs[18]  = xmm18;
>>      >>> -  _xmm_regs[19]  = xmm19;
>>      >>> -  _xmm_regs[20]  = xmm20;
>>      >>> -  _xmm_regs[21]  = xmm21;
>>      >>> -  _xmm_regs[22]  = xmm22;
>>      >>> -  _xmm_regs[23]  = xmm23;
>>      >>> -  _xmm_regs[24]  = xmm24;
>>      >>> -  _xmm_regs[25]  = xmm25;
>>      >>> -  _xmm_regs[26]  = xmm26;
>>      >>> -  _xmm_regs[27]  = xmm27;
>>      >>> -  _xmm_regs[28]  = xmm28;
>>      >>> -  _xmm_regs[29]  = xmm29;
>>      >>> -  _xmm_regs[30]  = xmm30;
>>      >>> -  _xmm_regs[31]  = xmm31;
>>      >>>   #endif // _LP64
>>      >>>
>>      >>>     for (int i = 0; i < 8; i++) {
>>      >>> ---------------------------------
>>      >>>
>>      >>> I think the gcc version on my desktop is older so didn’t catch this.
>>      >>>
>>      >>> The updated patch along with the above change and setting UseAVX as 3 is uploaded to:
>>      >>> Patch: 
>> http://cr.openjdk.java.net/~vdeshpande/xmm_reg/webrev.02/
>>     <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.02/>
>>      >>> RFE: https://bugs.openjdk.java.net/browse/JDK-8209735
>>      >>>
>>      >>> FYI, I did notice that the default for UseAVX had been 
>> rolled back and wanted to get confirmation from you before
>>      >>> changing it back to 3.
>>      >>>
>>      >>> Best Regards,
>>      >>> Sandhya
>>      >>>
>>      >>>
>>      >>> -----Original Message-----
>>      >>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com
>> <mailto:vladimir.kozlov at oracle.com>]
>>      >>> Sent: Friday, September 14, 2018 12:13 PM
>>      >>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com 
>> <mailto:sandhya.viswanathan at intel.com>>;
>>     hotspot-compiler-dev at openjdk.java.net
>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>>      >>> Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512 
>> instruction
>>      >>>
>>      >>> I got build failure:
>>      >>>
>>      >>>
>> workspace/open/src/hotspot/cpu/x86/c1_FrameMap_x86.cpp:236:3: error: 
>> array index 16 is past the end of the array
>>      >>> (which contains 16 elements) [-Werror,-Warray-bounds]
>>      >>> jib >   _xmm_regs[16]  = xmm16;
>>      >>>
>>      >>> I also noticed that we don't have RFE for this work. I filed:
>>      >>>
>>      >>> https://bugs.openjdk.java.net/browse/JDK-8209735
>>      >>>
>>      >>> You did not enabled avx512 by default (8209735 change was 
>> synced from jdk 11 into 12 2 weeks ago). I added next
>>      >>> change to yours in src/hotspot/cpu/x86/globals_x86.hpp:
>>      >>>
>>      >>> - product(intx, UseAVX, 2, \
>>      >>> + product(intx, UseAVX, 3, \
>>      >>>
>>      >>> Thanks,
>>      >>> Vladimir
>>      >>>
>>      >>> On 9/14/18 11:39 AM, Vladimir Kozlov wrote:
>>      >>>> Looks good to me. I will start testing and let you know results.
>>      >>>>
>>      >>>> Thanks,
>>      >>>> Vladimir
>>      >>>>
>>      >>>> On 9/13/18 6:05 AM, Viswanathan, Sandhya wrote:
>>      >>>>> Hi Vladimir,
>>      >>>>>
>>      >>>>> Please find below the updated webrev with all your comments incorporated:
>>      >>>>>
>>      >>>>> http://cr.openjdk.java.net/~vdeshpande/xmm_reg/webrev.01/
>>     <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.01/>
>>      >>>>>
>>      >>>>> I have run the jtreg compiler tests on SKX and KNL which 
>> have two
>>      >>>>> different flavors of AVX512 and Haswell a non-avx512 system. Also tested SPECjvm2008 on the three platforms.
>>      >>>>>
>>      >>>>> Best Regards,
>>      >>>>> Sandhya
>>      >>>>>
>>      >>>>>
>>      >>>>> -----Original Message-----
>>      >>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com
>> <mailto:vladimir.kozlov at oracle.com>]
>>      >>>>> Sent: Tuesday, September 11, 2018 8:54 PM
>>      >>>>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com 
>> <mailto:sandhya.viswanathan at intel.com>>;
>>      >>>>> hotspot-compiler-dev at openjdk.java.net
>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>>      >>>>> Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512 
>> instruction
>>      >>>>>
>>      >>>>> Thank you, Sandhya
>>      >>>>>
>>      >>>>> I am satisfied with your detailed answer for memory loads issues. Okay lets not add them.
>>      >>>>>
>>      >>>>> Vladimir
>>      >>>>>
>>      >>>>> On 9/11/18 6:13 PM, Viswanathan, Sandhya wrote:
>>      >>>>>> Hi Vladimir,
>>      >>>>>>
>>      >>>>>> Thanks a lot for the detailed review. I really appreciate your feedback.
>>      >>>>>> Please see my response in your email below marked with 
>> (Sandhya
>>      >>>>>>>>> ). Looking forward to your advice.
>>      >>>>>>
>>      >>>>>> Best Regards,
>>      >>>>>> Sandhya
>>      >>>>>>
>>      >>>>>>
>>      >>>>>> -----Original Message-----
>>      >>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com
>> <mailto:vladimir.kozlov at oracle.com>]
>>      >>>>>> Sent: Tuesday, September 11, 2018 5:11 PM
>>      >>>>>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com 
>> <mailto:sandhya.viswanathan at intel.com>>;
>>      >>>>>> hotspot-compiler-dev at openjdk.java.net
>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>>      >>>>>> Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512
>>      >>>>>> instruction
>>      >>>>>>
>>      >>>>>> Thank you.
>>      >>>>>>
>>      >>>>>> I want to discuss next issue:
>>      >>>>>>
>>      >>>>>>     > You did not added instructions to load these 
>> registers from
>>      >>>>>> memory (and stack). What happens in such cases when you need to load or store?
>>      >>>>>>     >>>> Let us take an example, e.g. for loading into 
>> rregF. First
>>      >>>>>> it gets loaded from memory into regF and then register to register move to rregF and vice versa.
>>      >>>>>>
>>      >>>>>> This is what I thought. You increase registers pressure this way which may cause spills on stack.
>>      >>>>>> Also we don't check that register could be the same as result you may get unneeded moves.
>>      >>>>>>
>>      >>>>>> I would advice add memory moves at least.
>>      >>>>>>
>>      >>>>>> Sandhya >>>  I had added those rules initially and 
>> removed them in
>>      >>>>>> the final patch. I noticed that the register allocator 
>> uses the
>>      >>>>>> memory rules (e.g. LoadF) to initialize the idealreg2reg 
>> mask
>>      >>>>>> (matcher.cpp). I would like the register allocator to get 
>> all the
>>      >>>>>> possible register on an architecture for idealreg2reg 
>> mask. I
>>      >>>>>> wondered that multiple instruct rules in .ad file for 
>> LoadF from
>>      >>>>>> memory might cause problems.  I would have to have higher 
>> cost for
>>      >>>>>> loading into restricted register set like vlReg. Then I 
>> decided that
>>      >>>>>> the register allocator can handle this in much better way 
>> than me
>>      >>>>>> adding rules to load from memory. This is with the 
>> background that the regF is always all the available
>>     registers
>>      >>>>>> and vlRegF is the restricted register set. Likewise for 
>> VecS and legVecS. Let me know you thoughts on this
>>     and if
>>      >>>>>> I should still add the rules to load from memory into 
>> vlReg and legVec. The specific code from matcher.cpp
>>     that I
>>      >>>>>> am referring to is:
>>      >>>>>>      MachNode *spillCP = match_tree(new
>>      >>>>>>
>> LoadNNode(NULL,mem,fp,atp,TypeInstPtr::BOTTOM,MemNode::unordered));
>>      >>>>>> #endif
>>      >>>>>>      MachNode *spillI  = match_tree(new
>>      >>>>>>
>> LoadINode(NULL,mem,fp,atp,TypeInt::INT,MemNode::unordered));
>>      >>>>>>      MachNode *spillL  = match_tree(new
>>      >>>>>>
>> LoadLNode(NULL,mem,fp,atp,TypeLong::LONG,MemNode::unordered,
>>      >>>>>> LoadNode::DependsO nlyOnTest, false));
>>      >>>>>>      MachNode *spillF  = match_tree(new
>>      >>>>>>
>> LoadFNode(NULL,mem,fp,atp,Type::FLOAT,MemNode::unordered));
>>      >>>>>>      MachNode *spillD  = match_tree(new
>>      >>>>>>
>> LoadDNode(NULL,mem,fp,atp,Type::DOUBLE,MemNode::unordered));
>>      >>>>>>      MachNode *spillP  = match_tree(new
>>      >>>>>>
>> LoadPNode(NULL,mem,fp,atp,TypeInstPtr::BOTTOM,MemNode::unordered));
>>      >>>>>>      ....
>>      >>>>>>      idealreg2regmask[Op_RegF] = &spillF->out_RegMask();
>>      >>>>>>
>>      >>>>>> An other question. You use movflt() and movdbl() which 
>> use either
>>      >>>>>> movap[s|d] and movs[s|d]
>>      >>>>>> instructions:
>>      >>>>>>
>> http://hg.openjdk.java.net/jdk/jdk/file/4ffb0a33f265/src/hotspot/cpu
>>      >>>>>> /x86/macroAssembler_x86.hpp#l164 Are these instructions 
>> work when
>>      >>>>>> avx512vl is not available? I see for vectors you use
>>      >>>>>> vpxor+vinserti* combination.
>>      >>>>>>
>>      >>>>>> Sandhya >>> Yes the scalar floating point instructions 
>> are available
>>      >>>>>> with AVX512 encoding when avx512vl is not available. That 
>> is why you
>>      >>>>>> would see not just movflt, movdbl but all the other 
>> scalar
>>      >>>>>> operations like adds, addsd etc using the entire xmm 
>> range (xmm0-31). In other words they are AVX512F
>>     instructions.
>>      >>>>>>
>>      >>>>>> Last question. I notice next UseAVX check in vectors spills code in x86.ad <http://x86.ad>:
>>      >>>>>> if ((UseAVX < 2) || VM_Version::supports_avx512vl())
>>      >>>>>>
>>      >>>>>> Should it be (UseAVX < 3)?
>>      >>>>>>
>>      >>>>>> Sandhya >>> Yes, that is a very good catch. I will fix this in the updated webrev.
>>      >>>>>>
>>      >>>>>> Thanks,
>>      >>>>>> Vladimir
>>      >>>>>>
>>      >>>>>> On 9/11/18 2:58 PM, Viswanathan, Sandhya wrote:
>>      >>>>>>> Hi Vladimir,
>>      >>>>>>>
>>      >>>>>>> Thanks a lot for your review and feedback. Please see my 
>> response
>>      >>>>>>> in your email below. I will send an updated webrev incorporating your feedback.
>>      >>>>>>>
>>      >>>>>>> Best Regards,
>>      >>>>>>> Sandhya
>>      >>>>>>>
>>      >>>>>>>
>>      >>>>>>> -----Original Message-----
>>      >>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com
>> <mailto:vladimir.kozlov at oracle.com>]
>>      >>>>>>> Sent: Monday, September 10, 2018 6:09 PM
>>      >>>>>>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com 
>> <mailto:sandhya.viswanathan at intel.com>>;
>>      >>>>>>> hotspot-compiler-dev at openjdk.java.net
>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>>      >>>>>>> Subject: Re: RFR (M) 8207746:C2: Lucene crashes on
>> AVX512
>>      >>>>>>> instruction
>>      >>>>>>>
>>      >>>>>>> Very nice. Thank you, Sandhya.
>>      >>>>>>>
>>      >>>>>>> I would like to see more meaningful naming in .ad files
>> - instead
>>      >>>>>>> of rreg* and ovec* to have something like vlReg* and legVec*.
>>      >>>>>>>
>>      >>>>>>>>>> Yes, accepted.
>>      >>>>>>>
>>      >>>>>>> New load_from_* and load_to_* instructions in .ad files 
>> should be
>>      >>>>>>> renamed to next and collocate with other Move*_reg_reg* instructions:
>>      >>>>>>>
>>      >>>>>>> instruct MoveF2VL(vlRegF dst, regF src) instruct 
>> MoveVL2F(regF dst,
>>      >>>>>>> vlRegF src)
>>      >>>>>>>>>> Yes, accepted.
>>      >>>>>>>
>>      >>>>>>> You did not added instructions to load these registers 
>> from memory
>>      >>>>>>> (and stack). What happens in such cases when you need to load or store?
>>      >>>>>>>>>> Let us take an example, e.g. for loading into rregF. 
>> First it
>>      >>>>>>>>>> gets loaded from memory into regF and then register to register move to rregF and vice versa.
>>      >>>>>>>
>>      >>>>>>> Also please explain why these registers are used when UseAVX == 0?:
>>      >>>>>>>
>>      >>>>>>> +instruct absD_reg(rregD dst) %{
>>      >>>>>>>         predicate((UseSSE>=2) && (UseAVX == 0));
>>      >>>>>>>
>>      >>>>>>> we switch off evex so regular regD (only legacy register in this case) should work too:
>>      >>>>>>>       661   if (UseAVX < 3) {
>>      >>>>>>>       662     _features &= ~CPU_AVX512F;
>>      >>>>>>>
>>      >>>>>>>>>> Yes, accepted. It could be regD here.
>>      >>>>>>>
>>      >>>>>>> Next checks could be combined by using new function in vm_version_x86.hpp (you already have some):
>>      >>>>>>>
>>      >>>>>>> +reg_class_dynamic vectors_reg_bwdqvl(vectors_reg_evex,
>>      >>>>>>> +vectors_reg_legacy, %{
>>      >>>>>>> VM_Version::supports_evex() &&
>> VM_Version::supports_avx512bw() &&
>>      >>>>>>> VM_Version::supports_avx512dq() &&
>>      >>>>>>> VM_Version::supports_avx512vl() %} );
>>      >>>>>>>
>>      >>>>>>>>>> Yes, accepted.
>>      >>>>>>>
>>      >>>>>>> I would suggest to test these changes on different 
>> machines
>>      >>>>>>> (non-avx512 and avx512) and with different UseAVX values.
>>      >>>>>>>
>>      >>>>>>>>>> Will do.
>>      >>>>>>>
>>      >>>>>>> Thanks,
>>      >>>>>>> Vladimir
>>      >>>>>>>
>>      >>>>>>> On 9/5/18 4:09 PM, Viswanathan, Sandhya wrote:
>>      >>>>>>>> Recently there have been couple of high priority issues 
>> with
>>      >>>>>>>> regards to high bank of XMM register
>>      >>>>>>>> (XMM16-XMM31) usage by C2:
>>      >>>>>>>>
>>      >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8207746
>>      >>>>>>>>
>>      >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8209735
>>      >>>>>>>>
>>      >>>>>>>> Please find below a patch which attempts to clean up 
>> the XMM
>>      >>>>>>>> register handling by using register groups.
>>      >>>>>>>>
>>      >>>>>>>>
>> http://cr.openjdk.java.net/~vdeshpande/xmm_reg/webrev.00/
>>     <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.00/>
>>      >>>>>>>>
>> <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.00/>
>>      >>>>>>>>
>>      >>>>>>>> The patch provides a restricted set of registers to the 
>> match
>>      >>>>>>>> rules in the ad file based on the underlying architecture.
>>      >>>>>>>>
>>      >>>>>>>> The aim is to remove special handling/workaround from macro assembler and assembler.
>>      >>>>>>>>
>>      >>>>>>>> By removing the special handling, the patch reduces the 
>> overall
>>      >>>>>>>> code size by about 1800 lines of code.
>>      >>>>>>>>
>>      >>>>>>>> Your review and feedback is very welcome.
>>      >>>>>>>>
>>      >>>>>>>> Best Regards,
>>      >>>>>>>>
>>      >>>>>>>> Sandhya
>>      >>>>>>>>
>>
>>
>> --
>>
>> Thanks,
>>
>> Jc
>>