RFR (M) 8207746:C2: Lucene crashes on AVX512 instruction
Viswanathan, Sandhya
sandhya.viswanathan at intel.com
Thu Sep 20 17:53:16 UTC 2018
Hi Vladimir,
In C1_LIRAssembler.hpp, when I added an additional parameter to negate, I did make sure to add it as a default parameter:
src/hotspot/share/c1/c1_LIRAssembler.hpp, line 282:
void negate(LIR_Opr left, LIR_Opr dest, LIR_Opr tmp = LIR_OprFact::illegalOpr);
But I guess since the function is not just called but declared/defined in all the other architectures, I need to add an unused LIR_Opr to the negate function for them.
This would be on similar lines as done in some other C1_LIRAssembler methods.
I will make this change and work with Vivek to use the submit repo for testing it on Sparc.
Best Regards,
Sandhya
-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
Sent: Thursday, September 20, 2018 10:09 AM
To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; hotspot-compiler-dev <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512 instruction
I hit build failure on SPARC due to shared changes in C1:
workspace/open/src/hotspot/cpu/sparc/c1_LIRAssembler_sparc.cpp", line 3027: Error: "LIR_Assembler::negate(LIR_OprDesc*,
LIR_OprDesc*)" was previously declared "LIR_Assembler::negate(LIR_OprDesc*, LIR_OprDesc*, LIR_OprDesc*)".
jib > 1 Error(s) detected.
I assume other platforms are also affected.
Vladimir
On 9/19/18 9:53 AM, Vladimir Kozlov wrote:
> Thank you, Sandhya
>
> I submitted new testing.
>
> Vladimir
>
> On 9/18/18 4:52 PM, Viswanathan, Sandhya wrote:
>> Hi Vladimir,
>>
>> Please find below the updated webrev with fixes for the two issues:
>>
>> Patch: http://cr.openjdk.java.net/~vdeshpande/xmm_reg/webrev.03/
>> <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.03/>
>>
>> RFE: https://bugs.openjdk.java.net/browse/JDK-8210764
>>
>> Fix for compiler/vectorization/TestNaNVector.java was to pass legVecS
>> as the temporary register type for intrinsics instead of legVecD.
>>
>> This test was only failing with -XX:MaxVectorSize=4.
>>
>> The file modified is x86_64.ad.
>>
>> Fix for compiler/vectorization/TestNaNVector.java was to allow all
>> xmm registers (xmm0-xmm31) for C1 and handle floating point abs and negate appropriately by providing a temp register.
>>
>> The C1 files are modified for this fix.
>>
>> I reran compiler jtreg tests with setting nativepath appropriately on Haswell, SKX and KNL.
>>
>> Best Regards,
>>
>> Sandhya
>>
>> *From:*Viswanathan, Sandhya
>> *Sent:* Tuesday, September 18, 2018 1:47 PM
>> *To:* 'JC Beyler' <jcbeyler at google.com>
>> *Cc:* vladimir.kozlov at oracle.com; hotspot-compiler-dev
>> <hotspot-compiler-dev at openjdk.java.net>
>> *Subject:* RE: RFR (M) 8207746:C2: Lucene crashes on AVX512
>> instruction
>>
>> Hi Jc,
>>
>> Thanks a lot for the steps. I am now able to verify my fix for the NativeCallTest.java.
>>
>> Best Regards,
>>
>> Sandhya
>>
>> *From:*JC Beyler [mailto:jcbeyler at google.com]
>> *Sent:* Monday, September 17, 2018 9:29 PM
>> *To:* Viswanathan, Sandhya <sandhya.viswanathan at intel.com
>> <mailto:sandhya.viswanathan at intel.com>>
>> *Cc:* vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>;
>> hotspot-compiler-dev <hotspot-compiler-dev at openjdk.java.net
>> <mailto:hotspot-compiler-dev at openjdk.java.net>>
>> *Subject:* Re: RFR (M) 8207746:C2: Lucene crashes on AVX512
>> instruction
>>
>> Hi Sandhya,
>>
>> How are you invoking the test for NativeCallTest?
>>
>> The way I would do it using jtreg would be something like this:
>>
>> $ export BUILD_TYPE=release
>>
>> $ export JDK_PATH=wherever you have your JDK
>>
>> From the test subfolder:
>>
>> $ wherever-your-jtreg-is/bin/jtreg
>> -nativepath:$JDK_PATH/build/linux-x86_64-normal-server-$BUILD_TYPE/su
>> pport/test/hotspot/jtreg/native/lib -jdk
>> $JDK_PATH/build/linux-x86_64-normal-server-$BUILD_TYPE/images/jdk
>> hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/t
>> est/NativeCallTest.java
>>
>> Seems to pass for me.
>>
>> But much easier is:
>>
>> $ make run-test TEST="test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/NativeCallTest.java"
>>
>> That seems to pass for me as well and is easier to use :)
>>
>> For information, the make run-test documentation is here:
>>
>> http://hg.openjdk.java.net/jdk9/jdk9/raw-file/tip/common/doc/testing.
>> html
>>
>> Let me know if that helps,
>>
>> Jc
>>
>> Note: For NativeCallTest.java and many others I get the following message in the corresponding .jtr file:
>> "test result: Error. Use -nativepath to specify the location of native code"
>> Do I need to give any additional info to jtreg to get over this problem?
>>
>> Thanks a lot!
>> Best Regards,
>> Sandhya
>>
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com
>> <mailto:vladimir.kozlov at oracle.com>]
>> Sent: Monday, September 17, 2018 10:14 AM
>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com
>> <mailto:sandhya.viswanathan at intel.com>>;
>> hotspot-compiler-dev at openjdk.java.net
>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>> Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512
>> instruction
>>
>> I finished testing on avx512 machine.
>> All passed except known (TestNaNVector.java) failures.
>>
>> Thanks,
>> Vladimir
>>
>> On 9/14/18 5:22 PM, Vladimir Kozlov wrote:
>> > I gave incorrect link to RFE. Here is correct:
>> >
>> > https://bugs.openjdk.java.net/browse/JDK-8210764
>> >
>> > Vladimir
>> >
>> > On 9/14/18 3:49 PM, Vladimir Kozlov wrote:
>> >> Build failure I got on MacOS and Windows. Linux passed for some reason so I am not surprise you did not noticed.
>> >>
>> >> Anyway. I tested with these changes and got next failures on avx1 machines. I am planning to run on avx512 too.
>> >>
>> >> 1. compiler/vectorization/TestNaNVector.java with '-Xcomp' or '-XX:CompileThreshold=100 -XX:-TieredCompilation'
>> on CPU
>> >> with AVX1 only
>> >>
>> >> # SIGSEGV (0xb) at pc=0x00007f3b146410f0, pid=13871,
>> tid=13884
>> >> # Problematic frame:
>> >> # V [libjvm.so+0x46f0f0] MachNode::ideal_reg() const+0x20
>> >>
>> >> Current CompileTask:
>> >> C2: 154 5 java.lang.String::equals (65
>> bytes)
>> >>
>> >> Stack: [0x00007f3b10044000,0x00007f3b10145000],
>> sp=0x00007f3b1013fe70, free space=1007k
>> >> Native frames: (J=compiled Java code, A=aot compiled Java
>> code, j=interpreted, Vv=VM code, C=native code)
>> >> V [libjvm.so+0x46f0f0] MachNode::ideal_reg() const+0x20
>> >> V [libjvm.so+0x882a72]
>> PhaseChaitin::gather_lrg_masks(bool)+0x872
>> >> V [libjvm.so+0xd82235] PhaseCFG::global_code_motion()+0xfc5
>> >> V [libjvm.so+0xd824b1]
>> PhaseCFG::do_global_code_motion()+0x51
>> >> V [libjvm.so+0xa2c26c] Compile::Code_Gen()+0x24c
>> >> V [libjvm.so+0xa2ff82] Compile::Compile(ciEnv*,
>> C2Compiler*, ciMethod*, int, bool, bool, bool,
>> DirectiveSet*)+0xe42
>> >>
>> >>
>> ---------------------------------------------------------------------
>> ---------------------------
>> >> 2.
>>
>> with '-Xcomp'
>> >> # Internal Error
>> (workspace/open/src/hotspot/share/c1/c1_LinearScan.cpp:5646),
>> pid=22016, tid=22073
>> >> # assert(false) failed: cannot spill interval that is used
>> in first instruction (possible reason: no register
>> found)
>> >>
>> >> Current CompileTask:
>> >> C1: 854767 13391 3
>> org.sunflow.math.Matrix4::multiply (692 bytes)
>> >>
>> >> Stack: [0x00007f23b9d82000,0x00007f23b9e83000],
>> sp=0x00007f23b9e7f9d0, free space=1014k
>> >> Native frames: (J=compiled Java code, A=aot compiled Java
>> code, j=interpreted, Vv=VM code, C=native code)
>> >> V [libjvm.so+0x1882202] VMError::report_and_die(int, char
>> const*, char const*, __va_list_tag*, Thread*, unsigned
>> >> char*, void*, void*, char const*, int, unsigned long)+0x562
>> >> V [libjvm.so+0x1882d2f] VMError::report_and_die(Thread*,
>> void*, char const*, int, char const*, char const*,
>> >> __va_list_tag*)+0x2f
>> >> V [libjvm.so+0xb0bea0] report_vm_error(char const*, int,
>> char const*, char const*, ...)+0x100
>> >> V [libjvm.so+0x7e0410]
>> LinearScanWalker::alloc_locked_reg(Interval*)+0x3a0
>> >> V [libjvm.so+0x7e0a20]
>> LinearScanWalker::activate_current()+0x280
>> >> V [libjvm.so+0x7e0c7d] IntervalWalker::walk_to(int) [clone
>> .constprop.299]+0x9d
>> >> V [libjvm.so+0x7e1078]
>> LinearScan::allocate_registers()+0x338
>> >> V [libjvm.so+0x7e2135] LinearScan::do_linear_scan()+0x155
>> >> V [libjvm.so+0x70a6bb] Compilation::emit_lir()+0x99b
>> >> V [libjvm.so+0x70caff]
>> Compilation::compile_java_method()+0x42f
>> >> V [libjvm.so+0x70d974] Compilation::compile_method()+0x1d4
>> >> V [libjvm.so+0x70e547]
>> Compilation::Compilation(AbstractCompiler*, ciEnv*, ciMethod*, int,
>> BufferBlob*,
>> >> DirectiveSet*)+0x357
>> >> V [libjvm.so+0x71073c] Compiler::compile_method(ciEnv*,
>> ciMethod*, int, DirectiveSet*)+0x14c
>> >> V [libjvm.so+0xa3cf89]
>> CompileBroker::invoke_compiler_on_method(CompileTask*)+0x409
>> >>
>> >> Vladimir
>> >>
>> >> On 9/14/18 1:27 PM, Viswanathan, Sandhya wrote:
>> >>>
>> >>> Thanks Vladimir, the below should fix this issue:
>> >>>
>> >>> ------------------------------
>> >>> --- old/src/hotspot/cpu/x86/c1_FrameMap_x86.cpp 2018-09-14
>> 13:10:23.488379912 -0700
>> >>> +++ new/src/hotspot/cpu/x86/c1_FrameMap_x86.cpp 2018-09-14
>> 13:10:23.308379915 -0700
>> >>> @@ -233,22 +233,6 @@
>> >>> _xmm_regs[13] = xmm13;
>> >>> _xmm_regs[14] = xmm14;
>> >>> _xmm_regs[15] = xmm15;
>> >>> - _xmm_regs[16] = xmm16;
>> >>> - _xmm_regs[17] = xmm17;
>> >>> - _xmm_regs[18] = xmm18;
>> >>> - _xmm_regs[19] = xmm19;
>> >>> - _xmm_regs[20] = xmm20;
>> >>> - _xmm_regs[21] = xmm21;
>> >>> - _xmm_regs[22] = xmm22;
>> >>> - _xmm_regs[23] = xmm23;
>> >>> - _xmm_regs[24] = xmm24;
>> >>> - _xmm_regs[25] = xmm25;
>> >>> - _xmm_regs[26] = xmm26;
>> >>> - _xmm_regs[27] = xmm27;
>> >>> - _xmm_regs[28] = xmm28;
>> >>> - _xmm_regs[29] = xmm29;
>> >>> - _xmm_regs[30] = xmm30;
>> >>> - _xmm_regs[31] = xmm31;
>> >>> #endif // _LP64
>> >>>
>> >>> for (int i = 0; i < 8; i++) {
>> >>> ---------------------------------
>> >>>
>> >>> I think the gcc version on my desktop is older so didn’t catch this.
>> >>>
>> >>> The updated patch along with the above change and setting UseAVX as 3 is uploaded to:
>> >>> Patch:
>> http://cr.openjdk.java.net/~vdeshpande/xmm_reg/webrev.02/
>> <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.02/>
>> >>> RFE: https://bugs.openjdk.java.net/browse/JDK-8209735
>> >>>
>> >>> FYI, I did notice that the default for UseAVX had been
>> rolled back and wanted to get confirmation from you before
>> >>> changing it back to 3.
>> >>>
>> >>> Best Regards,
>> >>> Sandhya
>> >>>
>> >>>
>> >>> -----Original Message-----
>> >>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com
>> <mailto:vladimir.kozlov at oracle.com>]
>> >>> Sent: Friday, September 14, 2018 12:13 PM
>> >>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com
>> <mailto:sandhya.viswanathan at intel.com>>;
>> hotspot-compiler-dev at openjdk.java.net
>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>> >>> Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512
>> instruction
>> >>>
>> >>> I got build failure:
>> >>>
>> >>>
>> workspace/open/src/hotspot/cpu/x86/c1_FrameMap_x86.cpp:236:3: error:
>> array index 16 is past the end of the array
>> >>> (which contains 16 elements) [-Werror,-Warray-bounds]
>> >>> jib > _xmm_regs[16] = xmm16;
>> >>>
>> >>> I also noticed that we don't have RFE for this work. I filed:
>> >>>
>> >>> https://bugs.openjdk.java.net/browse/JDK-8209735
>> >>>
>> >>> You did not enabled avx512 by default (8209735 change was
>> synced from jdk 11 into 12 2 weeks ago). I added next
>> >>> change to yours in src/hotspot/cpu/x86/globals_x86.hpp:
>> >>>
>> >>> - product(intx, UseAVX, 2, \
>> >>> + product(intx, UseAVX, 3, \
>> >>>
>> >>> Thanks,
>> >>> Vladimir
>> >>>
>> >>> On 9/14/18 11:39 AM, Vladimir Kozlov wrote:
>> >>>> Looks good to me. I will start testing and let you know results.
>> >>>>
>> >>>> Thanks,
>> >>>> Vladimir
>> >>>>
>> >>>> On 9/13/18 6:05 AM, Viswanathan, Sandhya wrote:
>> >>>>> Hi Vladimir,
>> >>>>>
>> >>>>> Please find below the updated webrev with all your comments incorporated:
>> >>>>>
>> >>>>> http://cr.openjdk.java.net/~vdeshpande/xmm_reg/webrev.01/
>> <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.01/>
>> >>>>>
>> >>>>> I have run the jtreg compiler tests on SKX and KNL which
>> have two
>> >>>>> different flavors of AVX512 and Haswell a non-avx512 system. Also tested SPECjvm2008 on the three platforms.
>> >>>>>
>> >>>>> Best Regards,
>> >>>>> Sandhya
>> >>>>>
>> >>>>>
>> >>>>> -----Original Message-----
>> >>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com
>> <mailto:vladimir.kozlov at oracle.com>]
>> >>>>> Sent: Tuesday, September 11, 2018 8:54 PM
>> >>>>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com
>> <mailto:sandhya.viswanathan at intel.com>>;
>> >>>>> hotspot-compiler-dev at openjdk.java.net
>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>> >>>>> Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512
>> instruction
>> >>>>>
>> >>>>> Thank you, Sandhya
>> >>>>>
>> >>>>> I am satisfied with your detailed answer for memory loads issues. Okay lets not add them.
>> >>>>>
>> >>>>> Vladimir
>> >>>>>
>> >>>>> On 9/11/18 6:13 PM, Viswanathan, Sandhya wrote:
>> >>>>>> Hi Vladimir,
>> >>>>>>
>> >>>>>> Thanks a lot for the detailed review. I really appreciate your feedback.
>> >>>>>> Please see my response in your email below marked with
>> (Sandhya
>> >>>>>>>>> ). Looking forward to your advice.
>> >>>>>>
>> >>>>>> Best Regards,
>> >>>>>> Sandhya
>> >>>>>>
>> >>>>>>
>> >>>>>> -----Original Message-----
>> >>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com
>> <mailto:vladimir.kozlov at oracle.com>]
>> >>>>>> Sent: Tuesday, September 11, 2018 5:11 PM
>> >>>>>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com
>> <mailto:sandhya.viswanathan at intel.com>>;
>> >>>>>> hotspot-compiler-dev at openjdk.java.net
>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>> >>>>>> Subject: Re: RFR (M) 8207746:C2: Lucene crashes on AVX512
>> >>>>>> instruction
>> >>>>>>
>> >>>>>> Thank you.
>> >>>>>>
>> >>>>>> I want to discuss next issue:
>> >>>>>>
>> >>>>>> > You did not added instructions to load these
>> registers from
>> >>>>>> memory (and stack). What happens in such cases when you need to load or store?
>> >>>>>> >>>> Let us take an example, e.g. for loading into
>> rregF. First
>> >>>>>> it gets loaded from memory into regF and then register to register move to rregF and vice versa.
>> >>>>>>
>> >>>>>> This is what I thought. You increase registers pressure this way which may cause spills on stack.
>> >>>>>> Also we don't check that register could be the same as result you may get unneeded moves.
>> >>>>>>
>> >>>>>> I would advice add memory moves at least.
>> >>>>>>
>> >>>>>> Sandhya >>> I had added those rules initially and
>> removed them in
>> >>>>>> the final patch. I noticed that the register allocator
>> uses the
>> >>>>>> memory rules (e.g. LoadF) to initialize the idealreg2reg
>> mask
>> >>>>>> (matcher.cpp). I would like the register allocator to get
>> all the
>> >>>>>> possible register on an architecture for idealreg2reg
>> mask. I
>> >>>>>> wondered that multiple instruct rules in .ad file for
>> LoadF from
>> >>>>>> memory might cause problems. I would have to have higher
>> cost for
>> >>>>>> loading into restricted register set like vlReg. Then I
>> decided that
>> >>>>>> the register allocator can handle this in much better way
>> than me
>> >>>>>> adding rules to load from memory. This is with the
>> background that the regF is always all the available
>> registers
>> >>>>>> and vlRegF is the restricted register set. Likewise for
>> VecS and legVecS. Let me know you thoughts on this
>> and if
>> >>>>>> I should still add the rules to load from memory into
>> vlReg and legVec. The specific code from matcher.cpp
>> that I
>> >>>>>> am referring to is:
>> >>>>>> MachNode *spillCP = match_tree(new
>> >>>>>>
>> LoadNNode(NULL,mem,fp,atp,TypeInstPtr::BOTTOM,MemNode::unordered));
>> >>>>>> #endif
>> >>>>>> MachNode *spillI = match_tree(new
>> >>>>>>
>> LoadINode(NULL,mem,fp,atp,TypeInt::INT,MemNode::unordered));
>> >>>>>> MachNode *spillL = match_tree(new
>> >>>>>>
>> LoadLNode(NULL,mem,fp,atp,TypeLong::LONG,MemNode::unordered,
>> >>>>>> LoadNode::DependsO nlyOnTest, false));
>> >>>>>> MachNode *spillF = match_tree(new
>> >>>>>>
>> LoadFNode(NULL,mem,fp,atp,Type::FLOAT,MemNode::unordered));
>> >>>>>> MachNode *spillD = match_tree(new
>> >>>>>>
>> LoadDNode(NULL,mem,fp,atp,Type::DOUBLE,MemNode::unordered));
>> >>>>>> MachNode *spillP = match_tree(new
>> >>>>>>
>> LoadPNode(NULL,mem,fp,atp,TypeInstPtr::BOTTOM,MemNode::unordered));
>> >>>>>> ....
>> >>>>>> idealreg2regmask[Op_RegF] = &spillF->out_RegMask();
>> >>>>>>
>> >>>>>> An other question. You use movflt() and movdbl() which
>> use either
>> >>>>>> movap[s|d] and movs[s|d]
>> >>>>>> instructions:
>> >>>>>>
>> http://hg.openjdk.java.net/jdk/jdk/file/4ffb0a33f265/src/hotspot/cpu
>> >>>>>> /x86/macroAssembler_x86.hpp#l164 Are these instructions
>> work when
>> >>>>>> avx512vl is not available? I see for vectors you use
>> >>>>>> vpxor+vinserti* combination.
>> >>>>>>
>> >>>>>> Sandhya >>> Yes the scalar floating point instructions
>> are available
>> >>>>>> with AVX512 encoding when avx512vl is not available. That
>> is why you
>> >>>>>> would see not just movflt, movdbl but all the other
>> scalar
>> >>>>>> operations like adds, addsd etc using the entire xmm
>> range (xmm0-31). In other words they are AVX512F
>> instructions.
>> >>>>>>
>> >>>>>> Last question. I notice next UseAVX check in vectors spills code in x86.ad <http://x86.ad>:
>> >>>>>> if ((UseAVX < 2) || VM_Version::supports_avx512vl())
>> >>>>>>
>> >>>>>> Should it be (UseAVX < 3)?
>> >>>>>>
>> >>>>>> Sandhya >>> Yes, that is a very good catch. I will fix this in the updated webrev.
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>> Vladimir
>> >>>>>>
>> >>>>>> On 9/11/18 2:58 PM, Viswanathan, Sandhya wrote:
>> >>>>>>> Hi Vladimir,
>> >>>>>>>
>> >>>>>>> Thanks a lot for your review and feedback. Please see my
>> response
>> >>>>>>> in your email below. I will send an updated webrev incorporating your feedback.
>> >>>>>>>
>> >>>>>>> Best Regards,
>> >>>>>>> Sandhya
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> -----Original Message-----
>> >>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com
>> <mailto:vladimir.kozlov at oracle.com>]
>> >>>>>>> Sent: Monday, September 10, 2018 6:09 PM
>> >>>>>>> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com
>> <mailto:sandhya.viswanathan at intel.com>>;
>> >>>>>>> hotspot-compiler-dev at openjdk.java.net
>> <mailto:hotspot-compiler-dev at openjdk.java.net>
>> >>>>>>> Subject: Re: RFR (M) 8207746:C2: Lucene crashes on
>> AVX512
>> >>>>>>> instruction
>> >>>>>>>
>> >>>>>>> Very nice. Thank you, Sandhya.
>> >>>>>>>
>> >>>>>>> I would like to see more meaningful naming in .ad files
>> - instead
>> >>>>>>> of rreg* and ovec* to have something like vlReg* and legVec*.
>> >>>>>>>
>> >>>>>>>>>> Yes, accepted.
>> >>>>>>>
>> >>>>>>> New load_from_* and load_to_* instructions in .ad files
>> should be
>> >>>>>>> renamed to next and collocate with other Move*_reg_reg* instructions:
>> >>>>>>>
>> >>>>>>> instruct MoveF2VL(vlRegF dst, regF src) instruct
>> MoveVL2F(regF dst,
>> >>>>>>> vlRegF src)
>> >>>>>>>>>> Yes, accepted.
>> >>>>>>>
>> >>>>>>> You did not added instructions to load these registers
>> from memory
>> >>>>>>> (and stack). What happens in such cases when you need to load or store?
>> >>>>>>>>>> Let us take an example, e.g. for loading into rregF.
>> First it
>> >>>>>>>>>> gets loaded from memory into regF and then register to register move to rregF and vice versa.
>> >>>>>>>
>> >>>>>>> Also please explain why these registers are used when UseAVX == 0?:
>> >>>>>>>
>> >>>>>>> +instruct absD_reg(rregD dst) %{
>> >>>>>>> predicate((UseSSE>=2) && (UseAVX == 0));
>> >>>>>>>
>> >>>>>>> we switch off evex so regular regD (only legacy register in this case) should work too:
>> >>>>>>> 661 if (UseAVX < 3) {
>> >>>>>>> 662 _features &= ~CPU_AVX512F;
>> >>>>>>>
>> >>>>>>>>>> Yes, accepted. It could be regD here.
>> >>>>>>>
>> >>>>>>> Next checks could be combined by using new function in vm_version_x86.hpp (you already have some):
>> >>>>>>>
>> >>>>>>> +reg_class_dynamic vectors_reg_bwdqvl(vectors_reg_evex,
>> >>>>>>> +vectors_reg_legacy, %{
>> >>>>>>> VM_Version::supports_evex() &&
>> VM_Version::supports_avx512bw() &&
>> >>>>>>> VM_Version::supports_avx512dq() &&
>> >>>>>>> VM_Version::supports_avx512vl() %} );
>> >>>>>>>
>> >>>>>>>>>> Yes, accepted.
>> >>>>>>>
>> >>>>>>> I would suggest to test these changes on different
>> machines
>> >>>>>>> (non-avx512 and avx512) and with different UseAVX values.
>> >>>>>>>
>> >>>>>>>>>> Will do.
>> >>>>>>>
>> >>>>>>> Thanks,
>> >>>>>>> Vladimir
>> >>>>>>>
>> >>>>>>> On 9/5/18 4:09 PM, Viswanathan, Sandhya wrote:
>> >>>>>>>> Recently there have been couple of high priority issues
>> with
>> >>>>>>>> regards to high bank of XMM register
>> >>>>>>>> (XMM16-XMM31) usage by C2:
>> >>>>>>>>
>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8207746
>> >>>>>>>>
>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8209735
>> >>>>>>>>
>> >>>>>>>> Please find below a patch which attempts to clean up
>> the XMM
>> >>>>>>>> register handling by using register groups.
>> >>>>>>>>
>> >>>>>>>>
>> http://cr.openjdk.java.net/~vdeshpande/xmm_reg/webrev.00/
>> <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.00/>
>> >>>>>>>>
>> <http://cr.openjdk.java.net/%7Evdeshpande/xmm_reg/webrev.00/>
>> >>>>>>>>
>> >>>>>>>> The patch provides a restricted set of registers to the
>> match
>> >>>>>>>> rules in the ad file based on the underlying architecture.
>> >>>>>>>>
>> >>>>>>>> The aim is to remove special handling/workaround from macro assembler and assembler.
>> >>>>>>>>
>> >>>>>>>> By removing the special handling, the patch reduces the
>> overall
>> >>>>>>>> code size by about 1800 lines of code.
>> >>>>>>>>
>> >>>>>>>> Your review and feedback is very welcome.
>> >>>>>>>>
>> >>>>>>>> Best Regards,
>> >>>>>>>>
>> >>>>>>>> Sandhya
>> >>>>>>>>
>>
>>
>> --
>>
>> Thanks,
>>
>> Jc
>>
More information about the hotspot-compiler-dev
mailing list