From vivek.r.deshpande at intel.com Tue Mar 1 00:00:20 2016 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Tue, 1 Mar 2016 00:00:20 +0000 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> Hi Christian We used the SHA Extension implementations (https://software.intel.com/en-us/articles/intel-sha-extensions-implementations) for the JVM implementation of SHA1 and SHA256. It needed to have Intel copyright, so we created a separate file. The white paper for the implementation is https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. Regards, Vivek -----Original Message----- From: Christian Thalinger [mailto:christian.thalinger at oracle.com] Sent: Monday, February 29, 2016 1:58 PM To: Deshpande, Vivek R Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling Why is the new file called macroAssembler_intel_x86.cpp? > On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R wrote: > > HI Vladimir > > Thank you for your review. > I have updated the patch with the changes you have suggested. > The new webrev is at this location: > http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ > > Regards > Vivek > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Friday, February 26, 2016 6:50 PM > To: Deshpande, Vivek R; hotspot compiler > Cc: Viswanathan, Sandhya; Rukmannagari, Shravya > Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling > > Very nice, Vivek!!! > > Did you run tests with both 32- and 64-bit VMs? 
> > Small notes: > > In vm_version_x86.hpp spacing are not aligned in next line: > > static bool supports_avxonly() { return ((supports_avx2() || > supports_avx()) && !supports_evex()); } > + static bool supports_sha() { return (_features & CPU_SHA) != 0; } > > Flags setting code in vm_version_x86.cpp should be like this (you can check supports_sha() only once, don't split '} else {' line, set UseSHA false if all intrinsics flags are false (I included UseSHA512Intrinsics for future) ): > > if (supports_sha()) { > if (FLAG_IS_DEFAULT(UseSHA)) { > UseSHA = true; > } > } else if (UseSHA) { > warning("SHA instructions are not available on this CPU"); > FLAG_SET_DEFAULT(UseSHA, false); > } > > if (UseSHA) { > if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { > FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); > } > } else if (UseSHA1Intrinsics) { > warning("Intrinsics for SHA-1 crypto hash functions not available on this CPU."); > FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); > } > > if (UseSHA) { > if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { > FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); > } > } else if (UseSHA256Intrinsics) { > warning("Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU."); > FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); > } > > if (UseSHA512Intrinsics) { > warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU."); > FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); > } > > if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA512Intrinsics)) { > FLAG_SET_DEFAULT(UseSHA, false); > } > > > Thanks, > Vladimir > > On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >> Hi all >> >> I would like to contribute a patch which optimizes SHA-1 and SHA-256 for >> 64 and 32 bit X86 architecture using Intel SHA extensions. >> >> Could you please review and sponsor this patch. 
>> >> Bug-id: >> >> https://bugs.openjdk.java.net/browse/JDK-8150767 >> webrev: >> >> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >> >> Thanks and regards, >> >> Vivek >> From christian.thalinger at oracle.com Tue Mar 1 00:37:26 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Mon, 29 Feb 2016 14:37:26 -1000 Subject: RFR (S) 8150669: C1 intrinsic for Class.isPrimitive In-Reply-To: <56D4CD42.2050101@oracle.com> References: <56CF6A7E.7080204@oracle.com> <56D4BF23.2070308@oracle.com> <4AAA9FF6-99A5-47F4-B707-0C6E0CB2D3BC@oracle.com> <56D4CD42.2050101@oracle.com> Message-ID: <64553B37-EDB9-45CE-8613-0D9BCC1B2F49@oracle.com> > On Feb 29, 2016, at 12:59 PM, Aleksey Shipilev wrote: > > On 03/01/2016 01:35 AM, Christian Thalinger wrote: >>> On Feb 29, 2016, at 11:58 AM, Aleksey Shipilev >>> wrote: See the notes at the bottom: >>> http://cr.openjdk.java.net/~shade/8150669/notes.txt >> >> That's good. I wonder why this wasn't intrinsified before. > > Yup, puzzled me too. > >> One nit: can we rename IsPrimitive.java to >> TestClassIsPrimitive.java? Most of our tests are called Test* while >> support files have other names. Not sure if we have a >> hard-convention on this but it certainly is nice and helpful >> sometimes. > > Yes, we can: > http://cr.openjdk.java.net/~shade/8150669/webrev.02/ Thanks. Looks good. 
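The Java-level contract the C1 intrinsic has to preserve is small and easy to state; a minimal sketch (the demo class name is illustrative, not from the webrev — the intrinsic only changes how C1 compiles the call, not what it returns):

```java
// Class.isPrimitive() returns true only for the mirrors of the eight
// primitive types plus void.class; wrapper classes, arrays, and ordinary
// reference types all report false.
public class IsPrimitiveDemo {
    public static void main(String[] args) {
        System.out.println(int.class.isPrimitive());      // true
        System.out.println(void.class.isPrimitive());     // true
        System.out.println(Integer.class.isPrimitive());  // false
        System.out.println(int[].class.isPrimitive());    // false
        System.out.println(String.class.isPrimitive());   // false
    }
}
```

With the intrinsic, C1 can presumably compile such calls inline instead of going out to the VM runtime, which is what makes the rename-only test nit the last remaining comment.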
> > -Aleksey > From christian.thalinger at oracle.com Tue Mar 1 00:42:34 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Mon, 29 Feb 2016 14:42:34 -1000 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> Message-ID: <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> > On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R wrote: > > Hi Christian > > We used the SHA Extension implementations(https://software.intel.com/en-us/articles/intel-sha-extensions-implementations) for the JVM implementation > of SHA1 and SHA256. Will that extension only be available on Intel chips? > It needed to have Intel copyright, so created a separate file. That is reasonable. > The white paper for the implementation is https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. > > Regards, > Vivek > > -----Original Message----- > From: Christian Thalinger [mailto:christian.thalinger at oracle.com] > Sent: Monday, February 29, 2016 1:58 PM > To: Deshpande, Vivek R > Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya > Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling > > Why is the new file called macroAssembler_intel_x86.cpp? > >> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R wrote: >> >> HI Vladimir >> >> Thank you for your review. >> I have updated the patch with the changes you have suggested. 
>> The new webrev is at this location: >> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >> >> Regards >> Vivek >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Friday, February 26, 2016 6:50 PM >> To: Deshpande, Vivek R; hotspot compiler >> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >> >> Very nice, Vivek!!! >> >> Did you run tests with both 32- and 64-bit VMs? >> >> Small notes: >> >> In vm_version_x86.hpp spacing are not aligned in next line: >> >> static bool supports_avxonly() { return ((supports_avx2() || >> supports_avx()) && !supports_evex()); } >> + static bool supports_sha() { return (_features & CPU_SHA) != 0; } >> >> Flags setting code in vm_version_x86.cpp should be like this (you can check supports_sha() only once, don't split '} else {' line, set UseSHA false if all intrinsics flags are false (I included UseSHA512Intrinsics for future) ): >> >> if (supports_sha()) { >> if (FLAG_IS_DEFAULT(UseSHA)) { >> UseSHA = true; >> } >> } else if (UseSHA) { >> warning("SHA instructions are not available on this CPU"); >> FLAG_SET_DEFAULT(UseSHA, false); >> } >> >> if (UseSHA) { >> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); >> } >> } else if (UseSHA1Intrinsics) { >> warning("Intrinsics for SHA-1 crypto hash functions not available on this CPU."); >> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); >> } >> >> if (UseSHA) { >> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); >> } >> } else if (UseSHA256Intrinsics) { >> warning("Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU."); >> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); >> } >> >> if (UseSHA512Intrinsics) { >> warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU."); >> 
FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >> } >> >> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA512Intrinsics)) { >> FLAG_SET_DEFAULT(UseSHA, false); >> } >> >> >> Thanks, >> Vladimir >> >> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>> Hi all >>> >>> I would like to contribute a patch which optimizes SHA-1 and SHA-256 for >>> 64 and 32 bit X86 architecture using Intel SHA extensions. >>> >>> Could you please review and sponsor this patch. >>> >>> Bug-id: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>> webrev: >>> >>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>> >>> Thanks and regards, >>> >>> Vivek >>> > From vladimir.kozlov at oracle.com Tue Mar 1 01:01:44 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 29 Feb 2016 17:01:44 -0800 Subject: RFR (S) 8146801: Allocating short arrays of non-constant size is slow In-Reply-To: <56D4B1F7.4040201@oracle.com> References: <56D4B1F7.4040201@oracle.com> Message-ID: <56D4E9F8.4070303@oracle.com> Does the CPU you tested on support ERMS (fast stos)? clear_mem() is used only in .ad which is only C2. You can put it under #ifdef COMPILER2 and you can access Matcher::init_array_short_size then. Why does x86_32.ad not have similar changes? Should we really care about old CPUs (UseFastStosb == false)? Use short branch instructions jccb and jmpb!!!! movptr(Address(base, cnt, Address::times_ptr), 0); is too big. You have RAX for that. Label declarations (except DONE) and bind(LONG); should be inside if (!is_large) { since they are only used there. You have too many jumps in this code. 
I would suggest the following: Label DONE; xorptr(tmp, tmp); if (!is_large) { Label LOOP, LONG; cmpptr(cnt, InitArrayShortSize/BytesPerLong); jccb(Assembler::greater, LONG); decrement(cnt); jccb(Assembler::negative, DONE); // Zero length NOT_LP64(shlptr(cnt, 1);) // convert to number of 32-bit words for 32-bit VM BIND(LOOP); movptr(Address(base, cnt, Address::times_ptr), tmp); decrement(cnt); jccb(Assembler::greaterEqual, LOOP); BIND(LONG); } I was thinking maybe we should do it in the Ideal graph instead of assembler. But it could trigger Fill array or split iterations optimizations which may not be good for such small arrays. Thanks, Vladimir On 2/29/16 1:02 PM, Aleksey Shipilev wrote: > Hi, > > Object storage zeroing uses "rep stos" instructions on x86, which are > fast on long lengths, but have the setup penalty. We successfully avoid > that penalty when zeroing the objects of known lengths (all objects and > arrays of constant sizes). However, we don't do anything for arrays of > non-constant sizes, which are very frequent. > > See more details here: > https://bugs.openjdk.java.net/browse/JDK-8146801 > > Patch: > http://cr.openjdk.java.net/~shade/8146801/webrev.02/ > > The core of the changes is at MacroAssembler::clear_mem. > > The rest is collateral: > a) pulling InitArrayShortSize from Matchers to global VM options to > get the access to it in MacroAssembler; > b) dragging ClearArrayNode::_is_large when ClearArrayNode::Ideal bails > on large constant length -- otherwise we produce effectively dead code > for short loop in MacroAssembler, that is never taken. > > With this patch, the allocation performance for small arrays is improved > 3-4x. 
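The invariant this zeroing path must maintain is visible from the Java side: a freshly allocated array is always fully zeroed, whether or not its length is a compile-time constant. A small sketch of the allocation shape being optimized (class and method names are illustrative, not from the patch):

```java
// JDK-8146801 targets allocations like this one: the length arrives as an
// argument, so it is non-constant to the JIT, and zeroing previously always
// went through "rep stos" with its setup penalty even for tiny arrays.
public class SmallArrayAlloc {
    // Length flows in at runtime, so the compiler cannot specialize it.
    static int[] allocate(int len) {
        return new int[len];
    }

    public static void main(String[] args) {
        for (int len = 0; len < 16; len++) {
            int[] a = allocate(len);
            // The patch changes only how zeroing is emitted, never whether
            // it happens: every element must start at its default value.
            for (int v : a) {
                if (v != 0) throw new AssertionError("array not zeroed");
            }
        }
        System.out.println("all small arrays zero-initialized");
    }
}
```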
Performance data and disassemblies: > http://cr.openjdk.java.net/~shade/8146801/notes.txt > > Testing: JPRT -testset hotspot; targeted microbenchmarks; RBT > hotspot/test/:hotspot_all,vm.runtime.testlist,vm.compiler.testlist > > Cheers, > -Aleksey > > From vladimir.kozlov at oracle.com Tue Mar 1 01:15:21 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 29 Feb 2016 17:15:21 -0800 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> Message-ID: <56D4ED29.1050108@oracle.com> I am against having "intel" in a file name. We have macroAssembler_libm_x86_*.cpp files for math intrinsics which do not have intel in the name. So I prefer not to have it. I would suggest macroAssembler_sha_x86.cpp. You can manipulate when to use it in vm_version_x86.cpp. Intel Copyright in the file's header is fine. Code changes are fine now (webrev.01). Thanks, Vladimir On 2/29/16 4:42 PM, Christian Thalinger wrote: > >> On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R wrote: >> >> Hi Christian >> >> We used the SHA Extension >> implementations(https://software.intel.com/en-us/articles/intel-sha-extensions-implementations) for the JVM implementation of SHA1 and SHA256. > > Will that extension only be available on Intel chips? > >> It needed to have Intel copyright, so created a separate file. > > That is reasonable. > >> The white paper for the implementation is https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. 
>> >> Regards, >> Vivek >> >> -----Original Message----- >> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >> Sent: Monday, February 29, 2016 1:58 PM >> To: Deshpande, Vivek R >> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >> >> Why is the new file called macroAssembler_intel_x86.cpp? >> >>> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R wrote: >>> >>> HI Vladimir >>> >>> Thank you for your review. >>> I have updated the patch with the changes you have suggested. >>> The new webrev is at this location: >>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >>> >>> Regards >>> Vivek >>> >>> -----Original Message----- >>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>> Sent: Friday, February 26, 2016 6:50 PM >>> To: Deshpande, Vivek R; hotspot compiler >>> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >>> >>> Very nice, Vivek!!! >>> >>> Did you run tests with both 32- and 64-bit VMs? 
>>> >>> Small notes: >>> >>> In vm_version_x86.hpp spacing are not aligned in next line: >>> >>> static bool supports_avxonly() { return ((supports_avx2() || >>> supports_avx()) && !supports_evex()); } >>> + static bool supports_sha() { return (_features & CPU_SHA) != 0; } >>> >>> Flags setting code in vm_version_x86.cpp should be like this (you can check supports_sha() only once, don't split '} else {' line, set UseSHA false if all intrinsics flags are false (I included UseSHA512Intrinsics for future) ): >>> >>> if (supports_sha()) { >>> if (FLAG_IS_DEFAULT(UseSHA)) { >>> UseSHA = true; >>> } >>> } else if (UseSHA) { >>> warning("SHA instructions are not available on this CPU"); >>> FLAG_SET_DEFAULT(UseSHA, false); >>> } >>> >>> if (UseSHA) { >>> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); >>> } >>> } else if (UseSHA1Intrinsics) { >>> warning("Intrinsics for SHA-1 crypto hash functions not available on this CPU."); >>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); >>> } >>> >>> if (UseSHA) { >>> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); >>> } >>> } else if (UseSHA256Intrinsics) { >>> warning("Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU."); >>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); >>> } >>> >>> if (UseSHA512Intrinsics) { >>> warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU."); >>> FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>> } >>> >>> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA512Intrinsics)) { >>> FLAG_SET_DEFAULT(UseSHA, false); >>> } >>> >>> >>> Thanks, >>> Vladimir >>> >>> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>>> Hi all >>>> >>>> I would like to contribute a patch which optimizesSHA-1 andSHA-256 for >>>> 64 and 32 bitX86architecture using Intel SHA extensions. >>>> >>>> Could you please review and sponsor this patch. 
>>>> >>>> Bug-id: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>>> webrev: >>>> >>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>>> >>>> Thanks and regards, >>>> >>>> Vivek >>>> >> > From vivek.r.deshpande at intel.com Tue Mar 1 01:20:52 2016 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Tue, 1 Mar 2016 01:20:52 +0000 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A56A391EA@ORSMSX106.amr.corp.intel.com> Hi Christian The SHA extensions are for x86 ISA and optimization is activated on corresponding CPUID bit check. Those would be available on Intel Goldmont microarchitecture. Regards, Vivek -----Original Message----- From: Christian Thalinger [mailto:christian.thalinger at oracle.com] Sent: Monday, February 29, 2016 4:43 PM To: Deshpande, Vivek R Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling > On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R wrote: > > Hi Christian > > We used the SHA Extension > implementations(https://software.intel.com/en-us/articles/intel-sha-extensions-implementations) for the JVM implementation of SHA1 and SHA256. Will that extension only be available on Intel chips? > It needed to have Intel copyright, so created a separate file. That is reasonable. > The white paper for the implementation is https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. 
> > Regards, > Vivek > > -----Original Message----- > From: Christian Thalinger [mailto:christian.thalinger at oracle.com] > Sent: Monday, February 29, 2016 1:58 PM > To: Deshpande, Vivek R > Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya > Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling > > Why is the new file called macroAssembler_intel_x86.cpp? > >> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R wrote: >> >> HI Vladimir >> >> Thank you for your review. >> I have updated the patch with the changes you have suggested. >> The new webrev is at this location: >> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >> >> Regards >> Vivek >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Friday, February 26, 2016 6:50 PM >> To: Deshpande, Vivek R; hotspot compiler >> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >> >> Very nice, Vivek!!! >> >> Did you run tests with both 32- and 64-bit VMs? 
>> >> Small notes: >> >> In vm_version_x86.hpp spacing are not aligned in next line: >> >> static bool supports_avxonly() { return ((supports_avx2() || >> supports_avx()) && !supports_evex()); } >> + static bool supports_sha() { return (_features & CPU_SHA) != 0; } >> >> Flags setting code in vm_version_x86.cpp should be like this (you can check supports_sha() only once, don't split '} else {' line, set UseSHA false if all intrinsics flags are false (I included UseSHA512Intrinsics for future) ): >> >> if (supports_sha()) { >> if (FLAG_IS_DEFAULT(UseSHA)) { >> UseSHA = true; >> } >> } else if (UseSHA) { >> warning("SHA instructions are not available on this CPU"); >> FLAG_SET_DEFAULT(UseSHA, false); >> } >> >> if (UseSHA) { >> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); >> } >> } else if (UseSHA1Intrinsics) { >> warning("Intrinsics for SHA-1 crypto hash functions not available on this CPU."); >> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); } >> >> if (UseSHA) { >> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); >> } >> } else if (UseSHA256Intrinsics) { >> warning("Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU."); >> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); } >> >> if (UseSHA512Intrinsics) { >> warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU."); >> FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); } >> >> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA512Intrinsics)) { >> FLAG_SET_DEFAULT(UseSHA, false); >> } >> >> >> Thanks, >> Vladimir >> >> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>> Hi all >>> >>> I would like to contribute a patch which optimizesSHA-1 andSHA-256 >>> for >>> 64 and 32 bitX86architecture using Intel SHA extensions. >>> >>> Could you please review and sponsor this patch. 
>>> >>> Bug-id: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>> webrev: >>> >>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>> >>> Thanks and regards, >>> >>> Vivek >>> > From vivek.r.deshpande at intel.com Tue Mar 1 01:23:09 2016 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Tue, 1 Mar 2016 01:23:09 +0000 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <56D4ED29.1050108@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> <56D4ED29.1050108@oracle.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A56A3920C@ORSMSX106.amr.corp.intel.com> HI Vladimir The thought behind using "intel" is to create generic placeholder for more functions with Intel copyright and not to restrict the file to SHA extensions. Let us know what you think. Regards, Vivek -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Monday, February 29, 2016 5:15 PM To: Christian Thalinger; Deshpande, Vivek R Cc: hotspot compiler; Rukmannagari, Shravya Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling I am against to have "intel" in a file name. We have macroAssembler_libm_x86_*.cpp files for math intrinsics which does not have intel in name. So I prefer to not have it. I would suggest macroAssembler_sha_x86.cpp. You can manipulate when to use it in vm_version_x86.cpp. Intel Copyright in the file's header is fine. Code changes are fine now (webrev.01). 
Thanks, Vladimir On 2/29/16 4:42 PM, Christian Thalinger wrote: > >> On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R wrote: >> >> Hi Christian >> >> We used the SHA Extension >> implementations(https://software.intel.com/en-us/articles/intel-sha-extensions-implementations) for the JVM implementation of SHA1 and SHA256. > > Will that extension only be available on Intel chips? > >> It needed to have Intel copyright, so created a separate file. > > That is reasonable. > >> The white paper for the implementation is https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. >> >> Regards, >> Vivek >> >> -----Original Message----- >> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >> Sent: Monday, February 29, 2016 1:58 PM >> To: Deshpande, Vivek R >> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >> >> Why is the new file called macroAssembler_intel_x86.cpp? >> >>> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R wrote: >>> >>> HI Vladimir >>> >>> Thank you for your review. >>> I have updated the patch with the changes you have suggested. >>> The new webrev is at this location: >>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >>> >>> Regards >>> Vivek >>> >>> -----Original Message----- >>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>> Sent: Friday, February 26, 2016 6:50 PM >>> To: Deshpande, Vivek R; hotspot compiler >>> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>> enabling >>> >>> Very nice, Vivek!!! >>> >>> Did you run tests with both 32- and 64-bit VMs? 
>>> >>> Small notes: >>> >>> In vm_version_x86.hpp spacing are not aligned in next line: >>> >>> static bool supports_avxonly() { return ((supports_avx2() || >>> supports_avx()) && !supports_evex()); } >>> + static bool supports_sha() { return (_features & CPU_SHA) != 0; } >>> >>> Flags setting code in vm_version_x86.cpp should be like this (you can check supports_sha() only once, don't split '} else {' line, set UseSHA false if all intrinsics flags are false (I included UseSHA512Intrinsics for future) ): >>> >>> if (supports_sha()) { >>> if (FLAG_IS_DEFAULT(UseSHA)) { >>> UseSHA = true; >>> } >>> } else if (UseSHA) { >>> warning("SHA instructions are not available on this CPU"); >>> FLAG_SET_DEFAULT(UseSHA, false); >>> } >>> >>> if (UseSHA) { >>> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); >>> } >>> } else if (UseSHA1Intrinsics) { >>> warning("Intrinsics for SHA-1 crypto hash functions not available on this CPU."); >>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); >>> } >>> >>> if (UseSHA) { >>> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); >>> } >>> } else if (UseSHA256Intrinsics) { >>> warning("Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU."); >>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); >>> } >>> >>> if (UseSHA512Intrinsics) { >>> warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU."); >>> FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>> } >>> >>> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA512Intrinsics)) { >>> FLAG_SET_DEFAULT(UseSHA, false); >>> } >>> >>> >>> Thanks, >>> Vladimir >>> >>> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>>> Hi all >>>> >>>> I would like to contribute a patch which optimizesSHA-1 andSHA-256 >>>> for >>>> 64 and 32 bitX86architecture using Intel SHA extensions. >>>> >>>> Could you please review and sponsor this patch. 
>>>> >>>> Bug-id: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>>> webrev: >>>> >>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>>> >>>> Thanks and regards, >>>> >>>> Vivek >>>> >> > From christian.thalinger at oracle.com Tue Mar 1 01:38:59 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Mon, 29 Feb 2016 15:38:59 -1000 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A56A3920C@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> <56D4ED29.1050108@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A3920C@ORSMSX106.amr.corp.intel.com> Message-ID: > On Feb 29, 2016, at 3:23 PM, Deshpande, Vivek R wrote: > > HI Vladimir > > The thought behind using "intel" is to create generic placeholder for more functions with Intel copyright and not to restrict the file to SHA extensions. But you can have multiple copyrights in one file. Why doesn't this work here? > Let us know what you think. > > Regards, > Vivek > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Monday, February 29, 2016 5:15 PM > To: Christian Thalinger; Deshpande, Vivek R > Cc: hotspot compiler; Rukmannagari, Shravya > Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling > > I am against to have "intel" in a file name. We have macroAssembler_libm_x86_*.cpp files for math intrinsics which does not have intel in name. So I prefer to not have it. I would suggest macroAssembler_sha_x86.cpp. You can manipulate when to use it in vm_version_x86.cpp. > > Intel Copyright in the file's header is fine. > > Code changes are fine now (webrev.01). 
> > Thanks, > Vladimir > > On 2/29/16 4:42 PM, Christian Thalinger wrote: >> >>> On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R wrote: >>> >>> Hi Christian >>> >>> We used the SHA Extension >>> implementations(https://software.intel.com/en-us/articles/intel-sha-extensions-implementations) for the JVM implementation of SHA1 and SHA256. >> >> Will that extension only be available on Intel chips? >> >>> It needed to have Intel copyright, so created a separate file. >> >> That is reasonable. >> >>> The white paper for the implementation is https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. >>> >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>> Sent: Monday, February 29, 2016 1:58 PM >>> To: Deshpande, Vivek R >>> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >>> >>> Why is the new file called macroAssembler_intel_x86.cpp? >>> >>>> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R wrote: >>>> >>>> HI Vladimir >>>> >>>> Thank you for your review. >>>> I have updated the patch with the changes you have suggested. >>>> The new webrev is at this location: >>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >>>> >>>> Regards >>>> Vivek >>>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>> Sent: Friday, February 26, 2016 6:50 PM >>>> To: Deshpande, Vivek R; hotspot compiler >>>> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>> enabling >>>> >>>> Very nice, Vivek!!! >>>> >>>> Did you run tests with both 32- and 64-bit VMs? 
>>>> >>>> Small notes: >>>> >>>> In vm_version_x86.hpp spacing are not aligned in next line: >>>> >>>> static bool supports_avxonly() { return ((supports_avx2() || >>>> supports_avx()) && !supports_evex()); } >>>> + static bool supports_sha() { return (_features & CPU_SHA) != 0; } >>>> >>>> Flags setting code in vm_version_x86.cpp should be like this (you can check supports_sha() only once, don't split '} else {' line, set UseSHA false if all intrinsics flags are false (I included UseSHA512Intrinsics for future) ): >>>> >>>> if (supports_sha()) { >>>> if (FLAG_IS_DEFAULT(UseSHA)) { >>>> UseSHA = true; >>>> } >>>> } else if (UseSHA) { >>>> warning("SHA instructions are not available on this CPU"); >>>> FLAG_SET_DEFAULT(UseSHA, false); >>>> } >>>> >>>> if (UseSHA) { >>>> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); >>>> } >>>> } else if (UseSHA1Intrinsics) { >>>> warning("Intrinsics for SHA-1 crypto hash functions not available on this CPU."); >>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); >>>> } >>>> >>>> if (UseSHA) { >>>> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); >>>> } >>>> } else if (UseSHA256Intrinsics) { >>>> warning("Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU."); >>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); >>>> } >>>> >>>> if (UseSHA512Intrinsics) { >>>> warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU."); >>>> FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>> } >>>> >>>> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA512Intrinsics)) { >>>> FLAG_SET_DEFAULT(UseSHA, false); >>>> } >>>> >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>>>> Hi all >>>>> >>>>> I would like to contribute a patch which optimizesSHA-1 andSHA-256 >>>>> for >>>>> 64 and 32 bitX86architecture using Intel SHA extensions. 
>>>>> >>>>> Could you please review and sponsor this patch. >>>>> >>>>> Bug-id: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>>>> webrev: >>>>> >>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>>>> >>>>> Thanks and regards, >>>>> >>>>> Vivek >>>>> >>> >> From vladimir.kozlov at oracle.com Tue Mar 1 01:41:18 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 29 Feb 2016 17:41:18 -0800 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A56A3920C@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> <56D4ED29.1050108@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A3920C@ORSMSX106.amr.corp.intel.com> Message-ID: <56D4F33E.2020704@oracle.com> These files can become very big - macroAssembler_libm_x86_*.cpp has ~4000 lines. It would be better to have separate files for separate features. It will be easier to maintain them. I am thinking about splitting macroAssembler_libm_x86_*.cpp but not right now. Thanks, Vladimir On 2/29/16 5:23 PM, Deshpande, Vivek R wrote: > Hi Vladimir > > The thought behind using "intel" is to create a generic placeholder for more functions with Intel copyright and not to restrict the file to SHA extensions. > Let us know what you think. > > Regards, > Vivek > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Monday, February 29, 2016 5:15 PM > To: Christian Thalinger; Deshpande, Vivek R > Cc: hotspot compiler; Rukmannagari, Shravya > Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling > > I am against having "intel" in a file name. 
We have macroAssembler_libm_x86_*.cpp files for math intrinsics which does not have intel in name. So I prefer to not have it. I would suggest macroAssembler_sha_x86.cpp. You can manipulate when to use it in vm_version_x86.cpp. > > Intel Copyright in the file's header is fine. > > Code changes are fine now (webrev.01). > > Thanks, > Vladimir > > On 2/29/16 4:42 PM, Christian Thalinger wrote: >> >>> On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R wrote: >>> >>> Hi Christian >>> >>> We used the SHA Extension >>> implementations(https://software.intel.com/en-us/articles/intel-sha-extensions-implementations) for the JVM implementation of SHA1 and SHA256. >> >> Will that extension only be available on Intel chips? >> >>> It needed to have Intel copyright, so created a separate file. >> >> That is reasonable. >> >>> The white paper for the implementation is https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. >>> >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>> Sent: Monday, February 29, 2016 1:58 PM >>> To: Deshpande, Vivek R >>> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >>> >>> Why is the new file called macroAssembler_intel_x86.cpp? >>> >>>> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R wrote: >>>> >>>> HI Vladimir >>>> >>>> Thank you for your review. >>>> I have updated the patch with the changes you have suggested. 
>>>> The new webrev is at this location: >>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >>>> >>>> Regards >>>> Vivek >>>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>> Sent: Friday, February 26, 2016 6:50 PM >>>> To: Deshpande, Vivek R; hotspot compiler >>>> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>> enabling >>>> >>>> Very nice, Vivek!!! >>>> >>>> Did you run tests with both 32- and 64-bit VMs? >>>> >>>> Small notes: >>>> >>>> In vm_version_x86.hpp spacing are not aligned in next line: >>>> >>>> static bool supports_avxonly() { return ((supports_avx2() || >>>> supports_avx()) && !supports_evex()); } >>>> + static bool supports_sha() { return (_features & CPU_SHA) != 0; } >>>> >>>> Flags setting code in vm_version_x86.cpp should be like this (you can check supports_sha() only once, don't split '} else {' line, set UseSHA false if all intrinsics flags are false (I included UseSHA512Intrinsics for future) ): >>>> >>>> if (supports_sha()) { >>>> if (FLAG_IS_DEFAULT(UseSHA)) { >>>> UseSHA = true; >>>> } >>>> } else if (UseSHA) { >>>> warning("SHA instructions are not available on this CPU"); >>>> FLAG_SET_DEFAULT(UseSHA, false); >>>> } >>>> >>>> if (UseSHA) { >>>> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); >>>> } >>>> } else if (UseSHA1Intrinsics) { >>>> warning("Intrinsics for SHA-1 crypto hash functions not available on this CPU."); >>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); >>>> } >>>> >>>> if (UseSHA) { >>>> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); >>>> } >>>> } else if (UseSHA256Intrinsics) { >>>> warning("Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU."); >>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); >>>> } >>>> >>>> if (UseSHA512Intrinsics) { >>>> 
warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU."); >>>> FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>> } >>>> >>>> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA512Intrinsics)) { >>>> FLAG_SET_DEFAULT(UseSHA, false); >>>> } >>>> >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>>>> Hi all >>>>> >>>>> I would like to contribute a patch which optimizesSHA-1 andSHA-256 >>>>> for >>>>> 64 and 32 bitX86architecture using Intel SHA extensions. >>>>> >>>>> Could you please review and sponsor this patch. >>>>> >>>>> Bug-id: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>>>> webrev: >>>>> >>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>>>> >>>>> Thanks and regards, >>>>> >>>>> Vivek >>>>> >>> >> From christian.thalinger at oracle.com Tue Mar 1 01:59:11 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Mon, 29 Feb 2016 15:59:11 -1000 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <56D4ED29.1050108@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> <56D4ED29.1050108@oracle.com> Message-ID: <3A0386B6-8072-4D84-8AF7-01D904DAEADF@oracle.com> > On Feb 29, 2016, at 3:15 PM, Vladimir Kozlov wrote: > > I am against to have "intel" in a file name. We have macroAssembler_libm_x86_*.cpp files for math intrinsics which does not have intel in name. So I prefer to not have it. I would suggest macroAssembler_sha_x86.cpp. I know we already have macroAssembler_libm_x86_*.cpp but macroAssembler_x86_.cpp would be better. > You can manipulate when to use it in vm_version_x86.cpp. > > Intel Copyright in the file's header is fine. 
> > Code changes are fine now (webrev.01). > > Thanks, > Vladimir > > On 2/29/16 4:42 PM, Christian Thalinger wrote: >> >>> On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R wrote: >>> >>> Hi Christian >>> >>> We used the SHA Extension implementations(https://software.intel.com/en-us/articles/intel-sha-extensions-implementations) for the JVM implementation >>> of SHA1 and SHA256. >> >> Will that extension only be available on Intel chips? >> >>> It needed to have Intel copyright, so created a separate file. >> >> That is reasonable. >> >>> The white paper for the implementation is https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. >>> >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>> Sent: Monday, February 29, 2016 1:58 PM >>> To: Deshpande, Vivek R >>> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >>> >>> Why is the new file called macroAssembler_intel_x86.cpp? >>> >>>> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R wrote: >>>> >>>> HI Vladimir >>>> >>>> Thank you for your review. >>>> I have updated the patch with the changes you have suggested. >>>> The new webrev is at this location: >>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >>>> >>>> Regards >>>> Vivek >>>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>> Sent: Friday, February 26, 2016 6:50 PM >>>> To: Deshpande, Vivek R; hotspot compiler >>>> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >>>> >>>> Very nice, Vivek!!! >>>> >>>> Did you run tests with both 32- and 64-bit VMs? 
>>>> >>>> Small notes: >>>> >>>> In vm_version_x86.hpp spacing are not aligned in next line: >>>> >>>> static bool supports_avxonly() { return ((supports_avx2() || >>>> supports_avx()) && !supports_evex()); } >>>> + static bool supports_sha() { return (_features & CPU_SHA) != 0; } >>>> >>>> Flags setting code in vm_version_x86.cpp should be like this (you can check supports_sha() only once, don't split '} else {' line, set UseSHA false if all intrinsics flags are false (I included UseSHA512Intrinsics for future) ): >>>> >>>> if (supports_sha()) { >>>> if (FLAG_IS_DEFAULT(UseSHA)) { >>>> UseSHA = true; >>>> } >>>> } else if (UseSHA) { >>>> warning("SHA instructions are not available on this CPU"); >>>> FLAG_SET_DEFAULT(UseSHA, false); >>>> } >>>> >>>> if (UseSHA) { >>>> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); >>>> } >>>> } else if (UseSHA1Intrinsics) { >>>> warning("Intrinsics for SHA-1 crypto hash functions not available on this CPU."); >>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); >>>> } >>>> >>>> if (UseSHA) { >>>> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); >>>> } >>>> } else if (UseSHA256Intrinsics) { >>>> warning("Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU."); >>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); >>>> } >>>> >>>> if (UseSHA512Intrinsics) { >>>> warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU."); >>>> FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>> } >>>> >>>> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA512Intrinsics)) { >>>> FLAG_SET_DEFAULT(UseSHA, false); >>>> } >>>> >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>>>> Hi all >>>>> >>>>> I would like to contribute a patch which optimizesSHA-1 andSHA-256 for >>>>> 64 and 32 bitX86architecture using Intel SHA extensions. 
>>>>> >>>>> Could you please review and sponsor this patch. >>>>> >>>>> Bug-id: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>>>> webrev: >>>>> >>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>>>> >>>>> Thanks and regards, >>>>> >>>>> Vivek >>>>> >>> >> From vladimir.kozlov at oracle.com Tue Mar 1 02:02:05 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 29 Feb 2016 18:02:05 -0800 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <3A0386B6-8072-4D84-8AF7-01D904DAEADF@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> <56D4ED29.1050108@oracle.com> <3A0386B6-8072-4D84-8AF7-01D904DAEADF@oracle.com> Message-ID: <56D4F81D.3060000@oracle.com> On 2/29/16 5:59 PM, Christian Thalinger wrote: > >> On Feb 29, 2016, at 3:15 PM, Vladimir Kozlov wrote: >> >> I am against to have "intel" in a file name. We have macroAssembler_libm_x86_*.cpp files for math intrinsics which does not have intel in name. So I prefer to not have it. I would suggest macroAssembler_sha_x86.cpp. > > I know we already have macroAssembler_libm_x86_*.cpp but macroAssembler_x86_.cpp would be better. This is good too. Vladimir > >> You can manipulate when to use it in vm_version_x86.cpp. >> >> Intel Copyright in the file's header is fine. >> >> Code changes are fine now (webrev.01). >> >> Thanks, >> Vladimir >> >> On 2/29/16 4:42 PM, Christian Thalinger wrote: >>> >>>> On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R wrote: >>>> >>>> Hi Christian >>>> >>>> We used the SHA Extension implementations(https://software.intel.com/en-us/articles/intel-sha-extensions-implementations) for the JVM implementation >>>> of SHA1 and SHA256. 
>>> >>> Will that extension only be available on Intel chips? >>> >>>> It needed to have Intel copyright, so created a separate file. >>> >>> That is reasonable. >>> >>>> The white paper for the implementation is https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. >>>> >>>> Regards, >>>> Vivek >>>> >>>> -----Original Message----- >>>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>>> Sent: Monday, February 29, 2016 1:58 PM >>>> To: Deshpande, Vivek R >>>> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >>>> >>>> Why is the new file called macroAssembler_intel_x86.cpp? >>>> >>>>> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R wrote: >>>>> >>>>> HI Vladimir >>>>> >>>>> Thank you for your review. >>>>> I have updated the patch with the changes you have suggested. >>>>> The new webrev is at this location: >>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >>>>> >>>>> Regards >>>>> Vivek >>>>> >>>>> -----Original Message----- >>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>>> Sent: Friday, February 26, 2016 6:50 PM >>>>> To: Deshpande, Vivek R; hotspot compiler >>>>> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >>>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >>>>> >>>>> Very nice, Vivek!!! >>>>> >>>>> Did you run tests with both 32- and 64-bit VMs? 
>>>>> >>>>> Small notes: >>>>> >>>>> In vm_version_x86.hpp spacing are not aligned in next line: >>>>> >>>>> static bool supports_avxonly() { return ((supports_avx2() || >>>>> supports_avx()) && !supports_evex()); } >>>>> + static bool supports_sha() { return (_features & CPU_SHA) != 0; } >>>>> >>>>> Flags setting code in vm_version_x86.cpp should be like this (you can check supports_sha() only once, don't split '} else {' line, set UseSHA false if all intrinsics flags are false (I included UseSHA512Intrinsics for future) ): >>>>> >>>>> if (supports_sha()) { >>>>> if (FLAG_IS_DEFAULT(UseSHA)) { >>>>> UseSHA = true; >>>>> } >>>>> } else if (UseSHA) { >>>>> warning("SHA instructions are not available on this CPU"); >>>>> FLAG_SET_DEFAULT(UseSHA, false); >>>>> } >>>>> >>>>> if (UseSHA) { >>>>> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); >>>>> } >>>>> } else if (UseSHA1Intrinsics) { >>>>> warning("Intrinsics for SHA-1 crypto hash functions not available on this CPU."); >>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); >>>>> } >>>>> >>>>> if (UseSHA) { >>>>> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); >>>>> } >>>>> } else if (UseSHA256Intrinsics) { >>>>> warning("Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU."); >>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); >>>>> } >>>>> >>>>> if (UseSHA512Intrinsics) { >>>>> warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU."); >>>>> FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); >>>>> } >>>>> >>>>> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA512Intrinsics)) { >>>>> FLAG_SET_DEFAULT(UseSHA, false); >>>>> } >>>>> >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>>>>> Hi all >>>>>> >>>>>> I would like to contribute a patch which optimizesSHA-1 andSHA-256 for >>>>>> 64 and 32 bitX86architecture using 
Intel SHA extensions. >>>>>> >>>>>> Could you please review and sponsor this patch. >>>>>> >>>>>> Bug-id: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>>>>> webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>>>>> >>>>>> Thanks and regards, >>>>>> >>>>>> Vivek >>>>>> >>>> >>> > From igor.veresov at oracle.com Tue Mar 1 04:56:12 2016 From: igor.veresov at oracle.com (Igor Veresov) Date: Mon, 29 Feb 2016 20:56:12 -0800 Subject: RFR(S) 8134119: Use new API to get cache line sizes Message-ID: <803009C6-E8E1-4624-B618-31AE3E4DD88E@oracle.com> This adds support of the new Solaris 12 API that lets us avoid using libkstat and libpicl to determine the CPU type and cache line sizes. Webrev: http://cr.openjdk.java.net/~iveresov/8134119/webrev.00/ Thanks, igor From vladimir.kozlov at oracle.com Tue Mar 1 05:56:09 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 29 Feb 2016 21:56:09 -0800 Subject: RFR(S) 8134119: Use new API to get cache line sizes In-Reply-To: <803009C6-E8E1-4624-B618-31AE3E4DD88E@oracle.com> References: <803009C6-E8E1-4624-B618-31AE3E4DD88E@oracle.com> Message-ID: <56D52EF9.5060201@oracle.com> Good! Thanks, Vladimir On 2/29/16 8:56 PM, Igor Veresov wrote: > This adds support of the new Solaris 12 API that lets us avoid using > libkstat and libpicl to determine the CPU type and cache line sizes. 
> > Webrev: http://cr.openjdk.java.net/~iveresov/8134119/webrev.00/ > > Thanks, > igor From rahul.v.raghavan at oracle.com Tue Mar 1 06:30:44 2016 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Mon, 29 Feb 2016 22:30:44 -0800 (PST) Subject: RFR(XS): 8145348: Make intrinsics flags diagnostic In-Reply-To: <56D4AB6E.7090005@oracle.com> References: <34a9defd-e509-4a0e-b410-74d12add6a77@default> <56D4AB6E.7090005@oracle.com> Message-ID: <8c97dcd9-2cea-4404-bb91-3b7e946e6f55@default> Hi, > -----Original Message----- > From: Vladimir Kozlov > Sent: Tuesday, March 01, 2016 2:05 AM > To: hotspot-compiler-dev at openjdk.java.net > > Looks good but we need to file CCC request since we changing product > flags. You can push after it is approved. Okay. Thank you Vladimir. > > Thanks, > Vladimir > > On 2/29/16 10:12 AM, Rahul Raghavan wrote: > > Hi, > > > > Please review the following patch for JDK-8145348. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8145348 > > Webrev: http://cr.openjdk.java.net/~rraghavan/8145348/webrev.00/ > > > > - Identified flags which control intrinsics generation, from vmIntrinsics::is_disabled_by_flags(). 
> > > > - Changed following flags from 'product' or 'develop' to 'diagnostic' type - > > [globals.hpp] > > UseGHASHIntrinsics > > InlineArrayCopy > > InlineObjectHash > > InlineNatives > > InlineMathNatives > > InlineClassNatives > > InlineThreadNatives > > InlineUnsafeOps > > UseAESIntrinsics > > UseAESCTRIntrinsics > > UseSHA1Intrinsics > > UseSHA256Intrinsics > > UseSHA512Intrinsics > > UseCRC32Intrinsics > > UseCRC32CIntrinsics > > UseAdler32Intrinsics > > UseVectorizedMismatchIntrinsic > > > > [c1_globals.hpp] > > InlineNIOCheckIndex > > > > [c2_globals.hpp] > > InlineReflectionGetCallerClass > > InlineObjectCopy > > SpecialStringCompareTo > > SpecialStringIndexOf > > SpecialStringEquals > > SpecialArraysEquals > > SpecialEncodeISOArray > > UseMathExactIntrinsics > > UseMultiplyToLenIntrinsic > > UseSquareToLenIntrinsic > > UseMulAddIntrinsic > > UseMontgomeryMultiplyIntrinsic > > UseMontgomerySquareIntrinsic > > > > - Added required -XX:+UnlockDiagnosticVMOptions to @run for following two tests using the intrinsic flags. > > hotspot/test/compiler/intrinsics/muladd/TestMulAdd.java > > hotspot/test/compiler/runtime/6859338/Test6859338.java > > (confirmed all other tests using any intrinsic flags got required -XX:+UnlockDiagnosticVMOptions also) > > > > - No issues with JPRT (-testset hotspot). > > - Confirmed no issues, expected behavior with the change for small unit tests. > > (for debug/release VM with flags on/off, UnlockDiagnosticVMOptions usage combinations) > > > > > > Thanks, > > Rahul > > From igor.veresov at oracle.com Tue Mar 1 09:02:54 2016 From: igor.veresov at oracle.com (Igor Veresov) Date: Tue, 1 Mar 2016 01:02:54 -0800 Subject: RFR(S) 8134119: Use new API to get cache line sizes In-Reply-To: <56D52EF9.5060201@oracle.com> References: <803009C6-E8E1-4624-B618-31AE3E4DD88E@oracle.com> <56D52EF9.5060201@oracle.com> Message-ID: Thanks, Vladimir! igor > On Feb 29, 2016, at 9:56 PM, Vladimir Kozlov wrote: > > Good! 
> > Thanks, > Vladimir > > On 2/29/16 8:56 PM, Igor Veresov wrote: >> This adds support of the new Solaris 12 API that lets us avoid using >> libkstat and libpicl to determine the CPU type and cache line sizes. >> >> Webrev: http://cr.openjdk.java.net/~iveresov/8134119/webrev.00/ >> >> Thanks, >> igor From nils.eliasson at oracle.com Tue Mar 1 09:46:07 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 1 Mar 2016 10:46:07 +0100 Subject: RFR (S): 8148563: compiler/compilercontrol/jcmd/StressAddMultiThreadedTest.java timesout In-Reply-To: References: <47423DF0-C7C7-4053-BEC0-1D12A558E982@oracle.com> <56C77EFA.1060702@oracle.com> <56C85B32.4040405@oracle.com> <56CB22A6.2090502@oracle.com> Message-ID: <56D564DF.8000907@oracle.com> Hi Pavel, Prepend -Xmixed to the inner test VM (the one running BaseAction launched by the process builder). (Or do that as a separate change) Otherwise it looks good! Best regards, Nils Eliasson On 2016-02-25 18:54, Pavel Punegov wrote: > Hi, > > I updated the webrev according to the discussion: > http://cr.openjdk.java.net/~ppunegov/8148563/webrev.02/ > > > 1. Remove sequential test. > 2. As in the previous webrev, use different commands, not only > Compiler.directives_add. > 3. Make test execute only 20 commands, and not more than 30 seconds > (controlled by TimeLimitedRunner). > 4. Make test execute diagnostic commands in a fixed set of 5 threads. > > -- Pavel. > >> On 22 Feb 2016, at 18:00, Nils Eliasson > > wrote: >> >> Hi, >> >> I posted my own webrev for the same issue before reading this thread. >> I didn't see when the bug changed owner. >> >> My reflections are: >> >> 1) The sequential test is redundant - the multi version tests everything. >> 2) Very good that you added more than just the add command! >> 3) Making this test run for 120 seconds is way too much in my opinion. >> 20 seconds should be more than enough each night. We are testing a >> stack guarded by a lock. 
>> 4) Are you sure "Runtime.getRuntime().availableProcessors()" doesn't >> return all processors on a system (regardless of how many our image >> is allowed to use). I would limit to four threads or so - just to >> make sure there are possibilities for concurrent operations. >> >> Best regards, >> Nils Eliasson >> >> >> On 2016-02-20 13:25, Pavel Punegov wrote: >>> Vladimir, >>> >>> The test previously generated 5 directive files of 1000 directives each. >>> Previously the test added them to the directives stack one after another, >>> making the VM fail with a native OOM (JDK-8144246 >>> ). >>> CompilerDirectivesLimit flag was added with a default value of 50. >>> Since then the test has failed every time it adds directives to the stack. >>> >>> I changed the test to create only one file (999 directives) to reach >>> the limit of 1000 (set by option). So it still can try to add >>> directives over the limit, but it also executes other commands like >>> "clear", "remove" and "print". >>> >>> Added Nils to CC. Nils, could you please take a look also? >>> >>> On 19.02.2016 23:45, Vladimir Kozlov wrote: >>>> Seems fine. What size of directive files was before and after this >>>> fix? JBS does not have this information. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 2/19/16 9:04 AM, Pavel Punegov wrote: >>>>> Hi, >>>>> >>>>> please review the fix for a test bug. >>>>> >>>>> Issue: >>>>> 1. The test times out because it executes a lot of jcmd processes. >>>>> Number of >>>>> threads is calculated as number of processors (cores) * 10, that >>>>> led to >>>>> an enormous number of jcmds executed on hosts with lots of >>>>> CPUs/cores. >>>>> 2. 
Generate only one file that is less than specified >>>>> CompilerDirectivesLimit. >>>>> 3. Add different commands to execute (add, clear, remove and >>>>> print) and >>>>> generate them on demand. >>>>> >>>>> webrev: http://cr.openjdk.java.net/~ppunegov/8148563/webrev.00/ >>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8148563 >>>>> >>>>> ? Pavel. >>>>> >>> >>> -- >>> Thanks, >>> Pavel Punegov >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.x.ivanov at oracle.com Tue Mar 1 10:07:08 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Tue, 1 Mar 2016 13:07:08 +0300 Subject: RFR (S) 8150669: C1 intrinsic for Class.isPrimitive In-Reply-To: <56D4CD42.2050101@oracle.com> References: <56CF6A7E.7080204@oracle.com> <56D4BF23.2070308@oracle.com> <4AAA9FF6-99A5-47F4-B707-0C6E0CB2D3BC@oracle.com> <56D4CD42.2050101@oracle.com> Message-ID: <56D569CC.70406@oracle.com> > http://cr.openjdk.java.net/~shade/8150669/webrev.02/ Overall, looks good. + ciType* t = c->value()->as_instance()->java_mirror_type(); as_instance() is redundant: InstanceConstant::value() already produces ciInstance: class InstanceConstant: public InstanceType { ciInstance* value() const { return _value; } + set_constant(t->is_klass() ? 0 : 1); I'd prefer to see ciType::is_primitive_type() instead which is more readable. class ciType : public ciMetadata { ... // Returns true if this is not a klass or array (i.e., not a reference type). bool is_primitive_type() const { return basic_type() != T_OBJECT && basic_type() != T_ARRAY; } I have a general question: why did you decide to intrinsify the method into a native call? Class::is_primitive looks pretty trivial to translate it right into machine code: bool java_lang_Class::is_primitive(oop java_class) { bool is_primitive = (java_class->metadata_field(_klass_offset) == NULL); ... 
return is_primitive; } Best regards, Vladimir Ivanov From tobias.hartmann at oracle.com Tue Mar 1 10:27:26 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 1 Mar 2016 11:27:26 +0100 Subject: JIT stops compiling after a while (java 8u45) In-Reply-To: <1456786793750-259603.post@n7.nabble.com> References: <1456786793750-259603.post@n7.nabble.com> Message-ID: <56D56E8E.5060402@oracle.com> Hi Nileema, thanks for reporting this issue! CC'ing the GC team because this seems to be a GC issue (see evaluation below). On 29.02.2016 23:59, nileema wrote: > We are seeing an issue with the CodeCache becoming full which causes the > compiler to be disabled in jdk-8u45 to jdk-8u72. > > We had seen a similar issue in Java7 (old issue: > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-August/011333.html). > This issue went away with earlier versions of Java 8. Reading the old conversation, I'm wondering if this could again be a problem with OSR nmethods that are not flushed? The bug (JDK-8023191) is still open - now assigned to me. Doing a quick experiment, it looks like we mostly compile OSR methods: 22129 2137 % 3 Runnable_412::run @ 4 (31 bytes) 22130 2189 % 4 Runnable_371::run @ 4 (31 bytes) 22134 2129 % 3 Runnable_376::run @ 4 (31 bytes) 22136 2109 % 3 Runnable_410::run @ 4 (31 bytes) Currently, OSR nmethods are not flushed just because the code cache is full but only if the nmethod becomes invalid (class loading/unloading, uncommon trap, ..) With your test, class unloading should happen and therefore the OSR nmethods *should* be flushed. > We used the test http://github.com/martint/jittest to compare the behavior > of jdk-8u25 and jdk-8u45. 
For this test, we did not see any CodeCache full > messages with jdk-8u25 but did see them with 8u45+ (8u60 and 8u74) > Test results comparing 8u25, 8u45 and 8u74: > https://gist.github.com/nileema/6fb667a215e95919242f > > In the results you can see that 8u25 starts collecting the code cache much > sooner than 8u45. 8u45 very quickly hits the limit for code cache. If we > force a full gc when it is about to hit the code cache limit, we see the > code cache size go down. You can use the following flags to get additional information: -XX:CICompilerCount=1 -XX:+PrintCompilation -XX:+PrintMethodFlushing -XX:+TraceClassUnloading I did some more experiments with 8u45: java -mx20g -ms20g -XX:ReservedCodeCacheSize=20m -XX:+TraceClassUnloading -XX:+UseG1GC -jar jittest-1.0-SNAPSHOT-standalone.jar | grep "Unloading" -> We do *not* unload any classes. The code cache fills up with OSR nmethods that are not flushed. Removing the -XX:+UseG1GC flag solves the issue: java -mx20g -ms20g -XX:ReservedCodeCacheSize=20m -XX:+TraceClassUnloading -jar jittest-1.0-SNAPSHOT-standalone.jar | grep Unloading -> Prints plenty of [Unloading class Runnable_40 0x00000007c0086028] messages and the code cache does not fill up. -> OSR nmethods are flushed because the classes are unloaded: 21670 970 % 4 Runnable_87::run @ -2 (31 bytes) made zombie The log files look good: 1456825330672 112939 10950016 10195496 111.28 1456825331675 118563 11432256 10467176 112.41 1456825332678 125935 11972928 10778432 115.72 [Unloading class Runnable_2498 0x00000007c0566028] ... [Unloading class Runnable_34 0x00000007c0082028] 1456825333684 131493 10220608 5382976 117.46 1456825334688 137408 10359296 5636120 116.81 1456825335692 143593 7635136 5914624 114.21 After the code cache fills up, we unload classes and therefore flush methods and start over again. I checked for several releases if classes are unloaded: - 8u27: success - 8u33: success - 8u40: fail - 8u45: fail - 8u76: fail The regression was introduced in 8u40. 
I also tried with the latest JDK 9 build and it fails as well (had to change the bean name from "Code Cache" to "CodeCache" and run with -XX:-SegmentedCodeCache). Again, -XX:-UseG1GC -XX:+UseParallelGC solves the problem. Can someone from the GC team have a look? > Is this a known issue? I'm not aware of any related issue. Best regards, Tobias > Thanks! > > Nileema > > > > -- > View this message in context: http://openjdk.5641.n7.nabble.com/JIT-stops-compiling-after-a-while-java-8u45-tp259603.html > Sent from the OpenJDK Hotspot Compiler Development List mailing list archive at Nabble.com. > From tobias.hartmann at oracle.com Tue Mar 1 12:35:00 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 1 Mar 2016 13:35:00 +0100 Subject: JIT stops compiling after a while (java 8u45) In-Reply-To: <56D56E8E.5060402@oracle.com> References: <1456786793750-259603.post@n7.nabble.com> <56D56E8E.5060402@oracle.com> Message-ID: <56D58C74.1020406@oracle.com> Hi, I just had another look and it turned out that even with 8u40+ class unloading is triggered. I missed that because it happens *much* later (compared to 8u33) when the code cache already filled up and compilation is disabled. At this point we don't recover because new classes are loaded and new OSR nmethods are compiled rapidly. Summary: The code cache fills up due to OSR nmethods that are not being flushed. With 8u33 and earlier, G1 did more aggressive class unloading (probably due to more allocations or different heuristics) and this allowed the sweeper to flush enough OSR nmethods to continue compilation. With 8u40 and later, class unloading happens long after the code cache is full. I think we should fix this by flushing "cold" OSR nmethods as well (JDK-8023191). Thomas Schatzl mentioned that we could also trigger a concurrent mark if the code cache is full and hope that some classes are unloaded but I'm afraid this is too invasive (and does not help much in the general case). Opinions? 
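For reference, the code-cache occupancy check that a monitor like jittest relies on can be sketched with the standard MX beans. This is a minimal sketch, not the test's actual code; the pool-name matching is an assumption meant to cover the JDK 8 name "Code Cache", the "CodeCache" name seen on JDK 9 with -XX:-SegmentedCodeCache, and the segmented "CodeHeap ..." pools:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

public class CodeCacheMonitor {
    // Collects "name: used/max" entries for every code-cache related pool.
    // Pool names vary by release: "Code Cache" (JDK 8), "CodeCache"
    // (JDK 9, -XX:-SegmentedCodeCache), "CodeHeap '...'" (segmented).
    static List<String> codeCachePools() {
        List<String> result = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            if (name.equals("Code Cache") || name.equals("CodeCache")
                    || name.startsWith("CodeHeap")) {
                result.add(name + ": " + pool.getUsage().getUsed()
                        + "/" + pool.getUsage().getMax());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        codeCachePools().forEach(System.out::println);
    }
}
```

Sampling this periodically while the workload runs shows "used" approaching "max" for these pools shortly before the VM reports that the code cache is full and disables compilation.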
Best regards, Tobias On 01.03.2016 11:27, Tobias Hartmann wrote: > Hi Nileema, > > thanks for reporting this issue! > > CC'ing the GC team because this seems to be a GC issue (see evaluation below). > > On 29.02.2016 23:59, nileema wrote: >> We are seeing an issue with the CodeCache becoming full which causes the >> compiler to be disabled in jdk-8u45 to jdk-8u72. >> >> We had seen a similar issue in Java7 (old issue: >> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-August/011333.html). >> This issue went away with earlier versions of Java 8. > > Reading the old conversation, I'm wondering if this could again be a problem with OSR nmethods that are not flushed? The bug (JDK-8023191) is still open - now assigned to me. > > Doing a quick experiment, it looks like we mostly compile OSR methods: > 22129 2137 % 3 Runnable_412::run @ 4 (31 bytes) > 22130 2189 % 4 Runnable_371::run @ 4 (31 bytes) > 22134 2129 % 3 Runnable_376::run @ 4 (31 bytes) > 22136 2109 % 3 Runnable_410::run @ 4 (31 bytes) > > Currently, OSR nmethods are not flushed just because the code cache is full but only if the nmethod becomes invalid (class loading/unloading, uncommon trap, ..) > > With your test, class unloading should happen and therefore the OSR nmethods *should* be flushed. > >> We used the test http://github.com/martint/jittest to compare the behavior >> of jdk-8u25 and jdk-8u45. For this test, we did not see any CodeCache full >> messages with jdk-8u25 but did see them with 8u45+ (8u60 and 8u74) >> Test results comparing 8u25, 8u45 and 8u74: >> https://gist.github.com/nileema/6fb667a215e95919242f >> >> In the results you can see that 8u25 starts collecting the code cache much >> sooner than 8u45. 8u45 very quickly hits the limit for code cache. If we >> force a full gc when it is about to hit the code cache limit, we see the >> code cache size go down. 
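(For readers following along: in +PrintCompilation output the '%' flag marks an OSR, i.e. on-stack-replacement, compilation, and '@ 4' gives the bytecode index of the OSR entry; the '@ -2 ... made zombie' line quoted earlier shows such an nmethod being invalidated. A rough sketch of picking OSR lines out of a log; the tokenization is an assumption based on the quoted output, since the format is not a stable interface.)

```java
public class PrintCompilationLine {
    /** Returns true if a +PrintCompilation line denotes an OSR compilation. */
    static boolean isOsr(String line) {
        // Example: "22129 2137 % 3 Runnable_412::run @ 4 (31 bytes)"
        // columns (assumed): timestamp, compile id, flags ('%' = OSR),
        // tier, method, OSR bytecode index, bytecode size.
        for (String token : line.trim().split("\\s+")) {
            if (token.equals("%")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isOsr("22129 2137 % 3 Runnable_412::run @ 4 (31 bytes)"));
    }
}
```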
> > You can use the following flags to get additional information: > -XX:CICompilerCount=1 -XX:+PrintCompilation -XX:+PrintMethodFlushing -XX:+TraceClassUnloading > > I did some more experiments with 8u45: > > java -mx20g -ms20g -XX:ReservedCodeCacheSize=20m -XX:+TraceClassUnloading -XX:+UseG1GC -jar jittest-1.0-SNAPSHOT-standalone.jar | grep "Unloading" > -> We do *not* unload any classes. The code cache fills up with OSR nmethods that are not flushed. > > Removing the -XX:+UseG1GC flag solves the issue: > > java -mx20g -ms20g -XX:ReservedCodeCacheSize=20m -XX:+TraceClassUnloading -jar jittest-1.0-SNAPSHOT-standalone.jar | grep Unloading > -> Prints plenty of [Unloading class Runnable_40 0x00000007c0086028] messages and the code cache does not fill up. > -> OSR nmethods are flushed because the classes are unloaded: > 21670 970 % 4 Runnable_87::run @ -2 (31 bytes) made zombie > > The log files look good: > > 1456825330672 112939 10950016 10195496 111.28 > 1456825331675 118563 11432256 10467176 112.41 > 1456825332678 125935 11972928 10778432 115.72 > [Unloading class Runnable_2498 0x00000007c0566028] > ... > [Unloading class Runnable_34 0x00000007c0082028] > 1456825333684 131493 10220608 5382976 117.46 > 1456825334688 137408 10359296 5636120 116.81 > 1456825335692 143593 7635136 5914624 114.21 > > After the code cache fills up, we unload classes and therefore flush methods and start over again. > > I checked for several releases if classes are unloaded: > - 8u27: success > - 8u33: success > - 8u40: fail > - 8u45: fail > - 8u76: fail > > The regression was introduced in 8u40. > > I also tried with the latest JDK 9 build and it fails as well (had to change the bean name from "Code Cache" to "CodeCache" and run with -XX:-SegmentedCodeCache). Again, -XX:-UseG1GC -XX:+UseParallelGC solves the problem. > > Can someone from the GC team have a look? > >> Is this a known issue? > > I'm not aware of any related issue. > > Best regards, > Tobias > >> Thanks! 
>> >> Nileema >> >> >> >> -- >> View this message in context: http://openjdk.5641.n7.nabble.com/JIT-stops-compiling-after-a-while-java-8u45-tp259603.html >> Sent from the OpenJDK Hotspot Compiler Development List mailing list archive at Nabble.com. >> From aleksey.shipilev at oracle.com Tue Mar 1 12:47:30 2016 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Tue, 1 Mar 2016 15:47:30 +0300 Subject: RFR (S) 8146801: Allocating short arrays of non-constant size is slow In-Reply-To: <56D4E9F8.4070303@oracle.com> References: <56D4B1F7.4040201@oracle.com> <56D4E9F8.4070303@oracle.com> Message-ID: <56D58F62.80102@oracle.com> Hi Vladimir! New webrev: http://cr.openjdk.java.net/~shade/8146801/webrev.05/ On 03/01/2016 04:01 AM, Vladimir Kozlov wrote: > I was thinking may be we should do it in Ideal graph instead of > assembler. But it could trigger Fill array or split iterations > optimizations which may not good for such small arrays. Yes, I was thinking about that too, but backed off thinking this is an x86-specific optimization. But then again, looking at ClearArray::Ideal that happens for all platforms, we should try to emit the runtime check there too. Let me see if we can pull that off without messing up the generated code. Meanwhile, the webrev above is the assembler-side variant. > Did CPU, you tested on, supports ERMS (fast stos)? Yes, tested on i7-4790K (Haswell), /proc/cpuinfo reports "erms", and we are going through UseFastStosb branch. > clear_mem() is used only in .ad which is only C2. You can put it under > #ifdef COMPILER2 and you can access Matcher::init_array_short_size then. Ah, good trick. Still, I think exposing this as the true platform-dependent flag is better. It also closely follows what ClearArray::Ideal does. > Why x86_32.ad does not have similar changes? Hm, I thought I uploaded the webrev with those changes too, but now I see it is missing. A new webrev has x86_32 parts. > Do we really should care for old CPUs (UseFastStosb == false)?
I think we should care, in the same way we care in ClearArray::Ideal. > Use short branch instructions jccb and jmpb!!!! > movptr(Address(base, cnt, Address::times_ptr), 0); is too big. You have > RAX for that. > Labels declaration (except DONE) and bind(LONG); should be inside if > (!is_large) { since it is only used there. *embarrassed* All three fixed in new webrev. > You have too many jumps per code. I would suggest next: The variant of your code is in new webrev (there were two minor bugs: shlptr was too late for 32-bit VMs, and no jmpb(DONE)). Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From mikael.gerdin at oracle.com Tue Mar 1 13:17:29 2016 From: mikael.gerdin at oracle.com (Mikael Gerdin) Date: Tue, 1 Mar 2016 14:17:29 +0100 Subject: JIT stops compiling after a while (java 8u45) In-Reply-To: <56D58C74.1020406@oracle.com> References: <1456786793750-259603.post@n7.nabble.com> <56D56E8E.5060402@oracle.com> <56D58C74.1020406@oracle.com> Message-ID: <56D59669.7030200@oracle.com> Hi, On 2016-03-01 13:35, Tobias Hartmann wrote: > Hi, > > is just had a another look and it turned out that even with 8u40+ class unloading is triggered. I missed that because it happens *much* later (compared to 8u33) when the code cache already filled up and compilation is disabled. At this point we don't recover because new classes are loaded and new OSR nmethods are compiled rapidly. > > Summary: > The code cache fills up due to OSR nmethods that are not being flushed. With 8u33 and earlier, G1 did more aggressive class unloading (probably due to more allocations or different heuristics) and this allowed the sweeper to flush enough OSR nmethods to continue compilation. With 8u40 and later, class unloading happens long after the code cache is full. Before 8u40 G1 could only unload classes at Full GCs.
After 8u40 G1 can unload classes at the end of a concurrent GC cycle, avoiding Full GC. If you run the test with CMS with +CMSClassUnloadingEnabled you will probably see similar problematic results since the class unloading in G1 is very similar to the one in CMS. I haven't investigated in depth why the classes do not get unloaded in the G1 and CMS cases but there are several known quirks with how concurrent class unloading behaves which cause them to unload classes later than the serial Full GC. Running G1 with -XX:-ClassUnloadingWithConcurrentMark or CMS with -XX:-CMSClassUnloadingEnabled disables concurrent class unloading completely and works around the issue you are seeing. For real-world applications I hope that this is a much smaller issue but if you must load and execute loads and loads of short-lived classes then it might be reasonable to disable concurrent class unloading (at the cost of getting serial Full GCs instead). > > I think we should fix this by flushing "cold" OSR nmethods as well (JDK-8023191). Thomas Schatzl mentioned that we could also trigger a concurrent mark if the code cache is full and hope that some classes are unloaded but I'm afraid this is too invasive (and does not help much in the general case). If it is possible to flush OSR nmethods without doing a full class unloading cycle then I think that path is preferable. /Mikael > > Opinions? > > Best regards, > Tobias > > On 01.03.2016 11:27, Tobias Hartmann wrote: >> Hi Nileema, >> >> thanks for reporting this issue! >> >> CC'ing the GC team because this seems to be a GC issue (see evaluation below). >> >> On 29.02.2016 23:59, nileema wrote: >>> We are seeing an issue with the CodeCache becoming full which causes the >>> compiler to be disabled in jdk-8u45 to jdk-8u72. >>> >>> We had seen a similar issue in Java7 (old issue: >>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-August/011333.html). >>> This issue went away with earlier versions of Java 8.
>> >> Reading the old conversation, I'm wondering if this could again be a problem with OSR nmethods that are not flushed? The bug (JDK-8023191) is still open - now assigned to me. >> >> Doing a quick experiment, it looks like we mostly compile OSR methods: >> 22129 2137 % 3 Runnable_412::run @ 4 (31 bytes) >> 22130 2189 % 4 Runnable_371::run @ 4 (31 bytes) >> 22134 2129 % 3 Runnable_376::run @ 4 (31 bytes) >> 22136 2109 % 3 Runnable_410::run @ 4 (31 bytes) >> >> Currently, OSR nmethods are not flushed just because the code cache is full but only if the nmethod becomes invalid (class loading/unloading, uncommon trap, ..) >> >> With your test, class unloading should happen and therefore the OSR nmethods *should* be flushed. >> >>> We used the test http://github.com/martint/jittest to compare the behavior >>> of jdk-8u25 and jdk-8u45. For this test, we did not see any CodeCache full >>> messages with jdk-8u25 but did see them with 8u45+ (8u60 and 8u74) >>> Test results comparing 8u25, 8u45 and 8u74: >>> https://gist.github.com/nileema/6fb667a215e95919242f >>> >>> In the results you can see that 8u25 starts collecting the code cache much >>> sooner than 8u45. 8u45 very quickly hits the limit for code cache. If we >>> force a full gc when it is about to hit the code cache limit, we see the >>> code cache size go down. >> >> You can use the following flags to get additional information: >> -XX:CICompilerCount=1 -XX:+PrintCompilation -XX:+PrintMethodFlushing -XX:+TraceClassUnloading >> >> I did some more experiments with 8u45: >> >> java -mx20g -ms20g -XX:ReservedCodeCacheSize=20m -XX:+TraceClassUnloading -XX:+UseG1GC -jar jittest-1.0-SNAPSHOT-standalone.jar | grep "Unloading" >> -> We do *not* unload any classes. The code cache fills up with OSR nmethods that are not flushed. 
>> >> Removing the -XX:+UseG1GC flag solves the issue: >> >> java -mx20g -ms20g -XX:ReservedCodeCacheSize=20m -XX:+TraceClassUnloading -jar jittest-1.0-SNAPSHOT-standalone.jar | grep Unloading >> -> Prints plenty of [Unloading class Runnable_40 0x00000007c0086028] messages and the code cache does not fill up. >> -> OSR nmethods are flushed because the classes are unloaded: >> 21670 970 % 4 Runnable_87::run @ -2 (31 bytes) made zombie >> >> The log files look good: >> >> 1456825330672 112939 10950016 10195496 111.28 >> 1456825331675 118563 11432256 10467176 112.41 >> 1456825332678 125935 11972928 10778432 115.72 >> [Unloading class Runnable_2498 0x00000007c0566028] >> ... >> [Unloading class Runnable_34 0x00000007c0082028] >> 1456825333684 131493 10220608 5382976 117.46 >> 1456825334688 137408 10359296 5636120 116.81 >> 1456825335692 143593 7635136 5914624 114.21 >> >> After the code cache fills up, we unload classes and therefore flush methods and start over again. >> >> I checked for several releases if classes are unloaded: >> - 8u27: success >> - 8u33: success >> - 8u40: fail >> - 8u45: fail >> - 8u76: fail >> >> The regression was introduced in 8u40. >> >> I also tried with the latest JDK 9 build and it fails as well (had to change the bean name from "Code Cache" to "CodeCache" and run with -XX:-SegmentedCodeCache). Again, -XX:-UseG1GC -XX:+UseParallelGC solves the problem. >> >> Can someone from the GC team have a look? >> >>> Is this a known issue? >> >> I'm not aware of any related issue. >> >> Best regards, >> Tobias >> >>> Thanks! >>> >>> Nileema >>> >>> >>> >>> -- >>> View this message in context: http://openjdk.5641.n7.nabble.com/JIT-stops-compiling-after-a-while-java-8u45-tp259603.html >>> Sent from the OpenJDK Hotspot Compiler Development List mailing list archive at Nabble.com. 
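(Illustration: the class unloading discussed in this thread can only happen once a class's loader becomes unreachable. The self-contained sketch below observes that prerequisite; how promptly the collection happens is GC- and version-specific, which is exactly the behavioral difference described above. System.gc() is only a hint, so the outcome string is not guaranteed.)

```java
import java.lang.ref.WeakReference;
import java.net.URL;
import java.net.URLClassLoader;

public class LoaderUnloadCheck {
    /** Reports whether an otherwise-unreferenced class loader got collected. */
    static String checkUnloadable() throws Exception {
        URLClassLoader loader = new URLClassLoader(new URL[0]);
        WeakReference<ClassLoader> ref = new WeakReference<>(loader);
        loader = null; // drop the only strong reference

        // Classes (and their nmethods) can only be unloaded after their
        // loader is unreachable; System.gc() is a hint, so poll a few times.
        for (int i = 0; i < 10 && ref.get() != null; i++) {
            System.gc();
            Thread.sleep(10);
        }
        return ref.get() == null ? "loader collected" : "loader still alive";
    }

    public static void main(String[] args) throws Exception {
        System.out.println(checkUnloadable());
    }
}
```

The jittest workload quoted in this thread follows the same pattern on a larger scale: many Runnable_* classes in discardable loaders, whose unload timing then decides when OSR nmethods can be flushed.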
>>> From nils.eliasson at oracle.com Tue Mar 1 14:24:10 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 1 Mar 2016 15:24:10 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> Message-ID: <56D5A60A.50700@oracle.com> Hi Volker, An excellent proposition. This is how it should be used. I polished a few rough edges: * CompilerBroker.cpp - The directives were already accessed in compile_method - but hidden in compilation_is_prohibited. I moved it out so we only have a single directive access. Wrapped compile_method to make sure the release of the directive doesn't get lost. * Let WB_AddCompilerDirective return a bool for success. Also fixed the state - need to be in native to get string, but then need to be in VM when parsing directive. And some comments: * I am against adding new compile option commands (At least until the stringly typeness is fixed). Let's add good ways to use compiler directives instead. I need to look at the stale task removal code tomorrow - hopefully we could save the blocking info in the task so we don't need to access the directive in the policy. All in here: Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ The code runs fine with the test I fixed for JDK-8073793: http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ Best regards, Nils Eliasson On 2016-02-26 19:47, Volker Simonis wrote: > Hi, > > so I want to propose the following solution for this problem: > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel > http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ > > I've started from the opposite side and made the BackgroundCompilation > manageable through the compiler directives framework.
Once this works > (and it's actually trivial due to the nice design of the > CompilerDirectives framework :), we get the possibility to set the > BackgroundCompilation option on a per method base on the command line > via the CompileCommand option for free: > > -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" > > And of course we can also use it directly as a compiler directive: > > [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] > > It also becomes possible to use this directly from the Whitebox API > through the DiagnosticCommand.compilerDirectivesAdd command. > Unfortunately, this command takes a file with compiler directives as > argument. I think this would be overkill in this context. So because > it was so easy and convenient, I added the following two new Whitebox > methods: > > public native void addCompilerDirective(String compDirect); > public native void removeCompilerDirective(); > > which can now be used to set arbitrary CompilerDirective command > directly from within the WhiteBox API. (The implementation of these > two methods is trivial as you can see in whitebox.cpp). > v > The blocking versions of enqueueMethodForCompilation() now become > simple wrappers around the existing methods without the need of any > code changes in their native implementation. This is good, because it > keeps the WhiteBox API stable! > > Finally some words about the implementation of the per-method > BackgroundCompilation functionality. It actually only requires two > small changes: > > 1. extending CompileBroker::is_compile_blocking() to take the method > and compilation level as arguments and use them to query the > DirectivesStack for the corresponding BackgroundCompilation value. > > 2. changing AdvancedThresholdPolicy::select_task() such that it > prefers blocking compilations. 
This is not only necessary, because it > decreases the time we have to wait for a blocking compilation, but > also because it prevents blocking compiles from getting stale. This > could otherwise easily happen in AdvancedThresholdPolicy::is_stale() > for methods which only get artificially compiled during a test because > their invocations counters are usually too small. > > There's still a small probability that a blocking compilation will be > not blocking. This can happen if a method for which we request the > blocking compilation is already in the compilation queue (see the > check 'compilation_is_in_queue(method)' in > CompileBroker::compile_method_base()). In testing scenarios this will > rarely happen because methods which are manually compiled shouldn't > get called that many times to implicitly place them into the compile > queue. But we can even completely avoid this problem by using > WB.isMethodQueuedForCompilation() to make sure that a method is not in > the queue before we request a blocking compilation. > > I've also added a small regression test to demonstrate and verify the > new functionality. > > Regards, > Volker On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson wrote: >> Hi Vladimir, >> >> WhiteBox::compilation_locked is a global state that temporarily stops all >> compilations. I this case I just want to achieve blocking compilation for a >> single compile without affecting the rest of the system. The tests using it >> will continue executing as soon as that compile is finished, saving time >> where wait-loops is used today. It adds nice determinism to tests. >> >> Best regards, >> Nils Eliasson >> >> >> On 2016-02-25 22:14, Vladimir Kozlov wrote: >>> You are adding parameter which is used only for testing. >>> Can we have callback(or check field) into WB instead? Similar to >>> WhiteBox::compilation_locked. 
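(The check-before-enqueue guard Volker describes can be sketched with plain collections standing in for the compile queue; WB.isMethodQueuedForCompilation() and the blocking-compile request are test-only WhiteBox APIs, so the names below are illustrative analogs, not the real implementation.)

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class BlockingEnqueueSketch {
    private final Queue<String> compileQueue = new ArrayDeque<>();

    /** Analog of WB.isMethodQueuedForCompilation() from the proposal. */
    boolean isQueued(String method) {
        return compileQueue.contains(method);
    }

    /**
     * Returns true if the request will actually block. A method already in
     * the queue keeps its original (non-blocking) request, mirroring the
     * compilation_is_in_queue() check described above.
     */
    boolean enqueueBlocking(String method) {
        if (isQueued(method)) {
            return false; // existing background compile wins; request won't block
        }
        compileQueue.add(method);
        return true;
    }
}
```

A test that needs a guaranteed-blocking compile would therefore first consult the queued-check, as suggested in the mail.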
>>> >>> Thanks, >>> Vladimir >>> >>> On 2/25/16 7:01 AM, Nils Eliasson wrote: >>>> Hi, >>>> >>>> Please review this change that adds support for blocking compiles in the >>>> whitebox API. This enables simpler less time consuming tests. >>>> >>>> Motivation: >>>> * -XX:-BackgroundCompilation is a global flag and can be time consuming >>>> * Blocking compiles removes the need for waiting on the compile queue to >>>> complete >>>> * Compiles put in the queue may be evicted if the queue grows to big - >>>> causing indeterminism in the test >>>> * Less VM-flags allows for more tests in the same VM >>>> >>>> Testing: >>>> Posting a separate RFR for test fix that uses this change. They will be >>>> pushed at the same time. >>>> >>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 >>>> JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >>>> Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >>>> >>>> Best regards, >>>> Nils Eliasson >> From nils.eliasson at oracle.com Tue Mar 1 14:25:49 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 1 Mar 2016 15:25:49 +0100 Subject: RFR(S/M): 8073793: serviceability/dcmd/compiler/CodelistTest.java fails with ClassNotFoundException trying to load VM anonymous class In-Reply-To: <56CF176A.7090705@oracle.com> References: <56CF176A.7090705@oracle.com> Message-ID: <56D5A66D.1030900@oracle.com> New webrev using changed API in JDK-8150646. Webrev: http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ Best regards, Nils Eliasson On 2016-02-25 16:02, Nils Eliasson wrote: > Hi, > > Please review this fix of the CodelistTest. > > Summary: > The test iterated over the output and tried to reflect some Classes > for verification. This is fragile since some classes are not > reflectable and it changes over time. > > Solution: > Instead ensure compilation of some select methods, on different > compile levels, and verify that those methods show up in the output. 
> > Testing: > Test run on all platforms. > > This change requires: https://bugs.openjdk.java.net/browse/JDK-8150646 > > Bug: https://bugs.openjdk.java.net/browse/JDK-8073793 > Webrev: http://cr.openjdk.java.net/~neliasso/8073793/webrev.01/ > > Best regards, > Nils Eliasson From pavel.punegov at oracle.com Tue Mar 1 14:41:31 2016 From: pavel.punegov at oracle.com (Pavel Punegov) Date: Tue, 1 Mar 2016 17:41:31 +0300 Subject: RFR (S): 8148563: compiler/compilercontrol/jcmd/StressAddMultiThreadedTest.java timesout In-Reply-To: <56D564DF.8000907@oracle.com> References: <47423DF0-C7C7-4053-BEC0-1D12A558E982@oracle.com> <56C77EFA.1060702@oracle.com> <56C85B32.4040405@oracle.com> <56CB22A6.2090502@oracle.com> <56D564DF.8000907@oracle.com> Message-ID: <288BE01B-3212-4ECF-B872-8C999A4011D0@oracle.com> Thanks for review, Nils new webrev with prepended Xmixed: http://cr.openjdk.java.net/~ppunegov/8148563/webrev.04/ > On 01 Mar 2016, at 12:46, Nils Eliasson wrote: > > Hi Pavel, > > Prepend -Xmixed to inner test VM (the one runnning BaseAction launched by process builder). (Or do that as a separate change) > > Otherwise it looks good! > > Best regards, > Nils Eliasson > > > On 2016-02-25 18:54, Pavel Punegov wrote: >> HI, >> >> I updated a webrev according to discussion: http://cr.openjdk.java.net/~ppunegov/8148563/webrev.02/ >> >> 1. Remove sequential test. >> 2. As in previous webrev use different commands, not only Compiler.directives_add. >> 3. Make test execute only 20 commands, and not more than 30 seconds (controlled by TimeLimitedRunner). >> 4. Make test execute diagnostic commands in the fixed set of 5 threads. >> >> ? Pavel. >> >>> On 22 Feb 2016, at 18:00, Nils Eliasson < nils.eliasson at oracle.com > wrote: >>> >>> Hi, >>> >>> I posted my own webrev for the same issue before reading this thread. I didn't see when the bug changed owner. >>> >>> My reflections are: >>> >>> 1) The sequantial test is redudant - the multi version tests everything. 
>>> 2) Very good that you added more that just the add command! >>> 3) Making this test run for 120 second is way too much in my opinion. 20 seconds should be more than enough each night. We are testing a stack guarded by a lock. >>> 4) Are you sure "Runtime.getRuntime().availableProcessors()" don't return all processors on a system (regardless of how many our image are allowed to use). I would limit to four threads or so - just to make sure there are possibilities for concurrent operations. >>> >>> Best regards, >>> Nils Eliasson >>> >>> >>> On 2016-02-20 13:25, Pavel Punegov wrote: >>>> Vladimir, >>>> >>>> Test generated 5 directives of size 1000 directives before. >>>> Previously test added them to directives stack one after another, making VM fail with native OOM (JDK-8144246 ). CompilerDirectivesLimit flag was added with default value of 50. Since that test began to add directives on the stack failing every time. >>>> >>>> I changed the test to create only one file (999 directives) to reach the limit of 1000 (set by option). So it still can try to add directives over the limit, but it also executes other commands like "clear", "remove" and "print". >>>> >>>> Added Nils to CC. Nils, could you please take a look also? >>>> >>>> On 19.02.2016 23:45, Vladimir Kozlov wrote: >>>>> Seems fine. What size of directive files was before and after this fix? JBS does not have this information. >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 2/19/16 9:04 AM, Pavel Punegov wrote: >>>>>> Hi, >>>>>> >>>>>> please review the fix for a test bug. >>>>>> >>>>>> Issue: >>>>>> 1. Test timeouts because it executes a lot of jcmd processes. Number of >>>>>> threads is calculated as number of processors (cores) * 10, that led to >>>>>> an enormous number of jcmds executed on hosts with lots of CPUs/cores. >>>>>> 2. Test also spends a lot of time to generate 5 huge directive files, >>>>>> that were tried to be added on to the directives stack. 
Directive stack >>>>>> has a limit (default 50, controlled by CompilerDirectivesLimit). >>>>>> >>>>>> Fix: >>>>>> 1. Calculate number of threads as a log of the number of CPUs/cores * 10. >>>>>> 2. Generate only one file that is less than specified >>>>>> CompilerDirectivesLimit. >>>>>> 3. Add different commands to execute (add, clear, remove and print) and >>>>>> generate them on demand. >>>>>> >>>>>> webrev: http://cr.openjdk.java.net/~ppunegov/8148563/webrev.00/ >>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8148563 >>>>>> >>>>>> ? Pavel. >>>>>> >>>> >>>> -- >>>> Thanks, >>>> Pavel Punegov >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pavel.punegov at oracle.com Tue Mar 1 15:03:26 2016 From: pavel.punegov at oracle.com (Pavel Punegov) Date: Tue, 1 Mar 2016 18:03:26 +0300 Subject: RFR (S): 8148563: compiler/compilercontrol/jcmd/StressAddMultiThreadedTest.java timesout In-Reply-To: <288BE01B-3212-4ECF-B872-8C999A4011D0@oracle.com> References: <47423DF0-C7C7-4053-BEC0-1D12A558E982@oracle.com> <56C77EFA.1060702@oracle.com> <56C85B32.4040405@oracle.com> <56CB22A6.2090502@oracle.com> <56D564DF.8000907@oracle.com> <288BE01B-3212-4ECF-B872-8C999A4011D0@oracle.com> Message-ID: <30AAA0EB-CC2C-4036-8073-A79187EEF76D@oracle.com> Changed the limit of directives form 1000 to 1001, because test has 5 threads submitting 200 directives, and there is also a default directive *.* always on stack, that is also counted by the limit http://cr.openjdk.java.net/~ppunegov/8148563/webrev.05/ > On 01 Mar 2016, at 17:41, Pavel Punegov wrote: > > Thanks for review, Nils > > new webrev with prepended Xmixed: http://cr.openjdk.java.net/~ppunegov/8148563/webrev.04/ > >> On 01 Mar 2016, at 12:46, Nils Eliasson > wrote: >> >> Hi Pavel, >> >> Prepend -Xmixed to inner test VM (the one runnning BaseAction launched by process builder). (Or do that as a separate change) >> >> Otherwise it looks good! 
>> >> Best regards, >> Nils Eliasson >> >> >> On 2016-02-25 18:54, Pavel Punegov wrote: >>> HI, >>> >>> I updated a webrev according to discussion: http://cr.openjdk.java.net/~ppunegov/8148563/webrev.02/ >>> >>> 1. Remove sequential test. >>> 2. As in previous webrev use different commands, not only Compiler.directives_add. >>> 3. Make test execute only 20 commands, and not more than 30 seconds (controlled by TimeLimitedRunner). >>> 4. Make test execute diagnostic commands in the fixed set of 5 threads. >>> >>> ? Pavel. >>> >>>> On 22 Feb 2016, at 18:00, Nils Eliasson < nils.eliasson at oracle.com > wrote: >>>> >>>> Hi, >>>> >>>> I posted my own webrev for the same issue before reading this thread. I didn't see when the bug changed owner. >>>> >>>> My reflections are: >>>> >>>> 1) The sequantial test is redudant - the multi version tests everything. >>>> 2) Very good that you added more that just the add command! >>>> 3) Making this test run for 120 second is way too much in my opinion. 20 seconds should be more than enough each night. We are testing a stack guarded by a lock. >>>> 4) Are you sure "Runtime.getRuntime().availableProcessors()" don't return all processors on a system (regardless of how many our image are allowed to use). I would limit to four threads or so - just to make sure there are possibilities for concurrent operations. >>>> >>>> Best regards, >>>> Nils Eliasson >>>> >>>> >>>> On 2016-02-20 13:25, Pavel Punegov wrote: >>>>> Vladimir, >>>>> >>>>> Test generated 5 directives of size 1000 directives before. >>>>> Previously test added them to directives stack one after another, making VM fail with native OOM (JDK-8144246 ). CompilerDirectivesLimit flag was added with default value of 50. Since that test began to add directives on the stack failing every time. >>>>> >>>>> I changed the test to create only one file (999 directives) to reach the limit of 1000 (set by option). 
So it still can try to add directives over the limit, but it also executes other commands like "clear", "remove" and "print". >>>>> >>>>> Added Nils to CC. Nils, could you please take a look also? >>>>> >>>>> On 19.02.2016 23:45, Vladimir Kozlov wrote: >>>>>> Seems fine. What size of directive files was before and after this fix? JBS does not have this information. >>>>>> >>>>>> Thanks, >>>>>> Vladimir >>>>>> >>>>>> On 2/19/16 9:04 AM, Pavel Punegov wrote: >>>>>>> Hi, >>>>>>> >>>>>>> please review the fix for a test bug. >>>>>>> >>>>>>> Issue: >>>>>>> 1. Test timeouts because it executes a lot of jcmd processes. Number of >>>>>>> threads is calculated as number of processors (cores) * 10, that led to >>>>>>> an enormous number of jcmds executed on hosts with lots of CPUs/cores. >>>>>>> 2. Test also spends a lot of time to generate 5 huge directive files, >>>>>>> that were tried to be added on to the directives stack. Directive stack >>>>>>> has a limit (default 50, controlled by CompilerDirectivesLimit). >>>>>>> >>>>>>> Fix: >>>>>>> 1. Calculate number of threads as a log of the number of CPUs/cores * 10. >>>>>>> 2. Generate only one file that is less than specified >>>>>>> CompilerDirectivesLimit. >>>>>>> 3. Add different commands to execute (add, clear, remove and print) and >>>>>>> generate them on demand. >>>>>>> >>>>>>> webrev: http://cr.openjdk.java.net/~ppunegov/8148563/webrev.00/ >>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8148563 >>>>>>> >>>>>>> ? Pavel. >>>>>>> >>>>> >>>>> -- >>>>> Thanks, >>>>> Pavel Punegov >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nils.eliasson at oracle.com Tue Mar 1 15:08:06 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 1 Mar 2016 16:08:06 +0100 Subject: RFR (S): 8148563: compiler/compilercontrol/jcmd/StressAddMultiThreadedTest.java timesout In-Reply-To: <30AAA0EB-CC2C-4036-8073-A79187EEF76D@oracle.com> References: <47423DF0-C7C7-4053-BEC0-1D12A558E982@oracle.com> <56C77EFA.1060702@oracle.com> <56C85B32.4040405@oracle.com> <56CB22A6.2090502@oracle.com> <56D564DF.8000907@oracle.com> <288BE01B-3212-4ECF-B872-8C999A4011D0@oracle.com> <30AAA0EB-CC2C-4036-8073-A79187EEF76D@oracle.com> Message-ID: <56D5B056.9020703@oracle.com> Looks good! Best regards, Nils Eliasson On 2016-03-01 16:03, Pavel Punegov wrote: > Changed the limit of directives form 1000 to 1001, because test has 5 > threads submitting 200 directives, and there is also a default > directive *.* always on stack, that is also counted by the limit > > http://cr.openjdk.java.net/~ppunegov/8148563/webrev.05/ > > >> On 01 Mar 2016, at 17:41, Pavel Punegov > > wrote: >> >> Thanks for review, Nils >> >> new webrev with prepended Xmixed: >> http://cr.openjdk.java.net/~ppunegov/8148563/webrev.04/ >> >> >>> On 01 Mar 2016, at 12:46, Nils Eliasson >> > wrote: >>> >>> Hi Pavel, >>> >>> Prepend -Xmixed to inner test VM (the one runnning BaseAction >>> launched by process builder). (Or do that as a separate change) >>> >>> Otherwise it looks good! >>> >>> Best regards, >>> Nils Eliasson >>> >>> >>> On 2016-02-25 18:54, Pavel Punegov wrote: >>>> HI, >>>> >>>> I updated a webrev according to discussion: >>>> http://cr.openjdk.java.net/~ppunegov/8148563/webrev.02/ >>>> >>>> 1. Remove sequential test. >>>> 2. As in previous webrev use different commands, not only >>>> Compiler.directives_add. >>>> 3. Make test execute only 20 commands, and not more than 30 seconds >>>> (controlled by TimeLimitedRunner). >>>> 4. Make test execute diagnostic commands in the fixed set of 5 threads. >>>> >>>> ? Pavel. 
>>>> >>>>> On 22 Feb 2016, at 18:00, Nils Eliasson >>>>> wrote: >>>>> >>>>> Hi, >>>>> >>>>> I posted my own webrev for the same issue before reading this >>>>> thread. I didn't see when the bug changed owner. >>>>> >>>>> My reflections are: >>>>> >>>>> 1) The sequential test is redundant - the multi version tests >>>>> everything. >>>>> 2) Very good that you added more than just the add command! >>>>> 3) Making this test run for 120 seconds is way too much in my >>>>> opinion. 20 seconds should be more than enough each night. We are >>>>> testing a stack guarded by a lock. >>>>> 4) Are you sure "Runtime.getRuntime().availableProcessors()" doesn't >>>>> return all processors on a system (regardless of how many our >>>>> image is allowed to use)? I would limit it to four threads or so - >>>>> just to make sure there are possibilities for concurrent operations. >>>>> >>>>> Best regards, >>>>> Nils Eliasson >>>>> >>>>> >>>>> On 2016-02-20 13:25, Pavel Punegov wrote: >>>>>> Vladimir, >>>>>> >>>>>> The test generated 5 directive files of 1000 directives each before. >>>>>> Previously the test added them to the directives stack one after another, >>>>>> making the VM fail with a native OOM (JDK-8144246 >>>>>> ). >>>>>> The CompilerDirectivesLimit flag was added with a default value of 50. >>>>>> Since then the test began to add directives on the stack, failing >>>>>> every time. >>>>>> >>>>>> I changed the test to create only one file (999 directives) to >>>>>> reach the limit of 1000 (set by option). So it still can try to >>>>>> add directives over the limit, but it also executes other >>>>>> commands like "clear", "remove" and "print". >>>>>> >>>>>> Added Nils to CC. Nils, could you please take a look also? >>>>>> >>>>>> On 19.02.2016 23:45, Vladimir Kozlov wrote: >>>>>>> Seems fine. What size of directive files was before and after >>>>>>> this fix? JBS does not have this information.
>>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>> On 2/19/16 9:04 AM, Pavel Punegov wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> please review the fix for a test bug. >>>>>>>> >>>>>>>> Issue: >>>>>>>> 1. Test timeouts because it executes a lot of jcmd processes. >>>>>>>> Number of >>>>>>>> threads is calculated as number of processors (cores) * 10, >>>>>>>> that led to >>>>>>>> an enormous number of jcmds executed on hosts with lots of >>>>>>>> CPUs/cores. >>>>>>>> 2. Test also spends a lot of time to generate 5 huge directive >>>>>>>> files, >>>>>>>> that were tried to be added on to the directives stack. >>>>>>>> Directive stack >>>>>>>> has a limit (default 50, controlled by CompilerDirectivesLimit). >>>>>>>> >>>>>>>> Fix: >>>>>>>> 1. Calculate number of threads as a log of the number of >>>>>>>> CPUs/cores * 10. >>>>>>>> 2. Generate only one file that is less than specified >>>>>>>> CompilerDirectivesLimit. >>>>>>>> 3. Add different commands to execute (add, clear, remove and >>>>>>>> print) and >>>>>>>> generate them on demand. >>>>>>>> >>>>>>>> webrev: http://cr.openjdk.java.net/~ppunegov/8148563/webrev.00/ >>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8148563 >>>>>>>> >>>>>>>> ? Pavel. >>>>>>>> >>>>>> >>>>>> -- >>>>>> Thanks, >>>>>> Pavel Punegov >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From edward.nevill at gmail.com Tue Mar 1 15:09:18 2016 From: edward.nevill at gmail.com (Edward Nevill) Date: Tue, 01 Mar 2016 15:09:18 +0000 Subject: [aarch64-port-dev ] RFR: 8150229: aarch64: c2 fix pipeline class for several instructions. 
In-Reply-To: References: Message-ID: <1456844958.30827.12.camel@mylittlepony.linaroharston> On Fri, 2016-02-19 at 20:23 +0800, Felix Yang wrote: > Hi, > > Please review the following webrev: > > http://cr.openjdk.java.net/~fyang/8150229/webrev.00/ > > Jira issue: *https://bugs.openjdk.java.net/browse/JDK-8150229 > * > > The pipeline class for some instructions is not set correctly. An example: Looks correct to me. These were some pipeline classes I got wrong when I did the pipeline scheduling patch. Thanks for finding these, Ed. From aph at redhat.com Tue Mar 1 15:20:35 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 1 Mar 2016 15:20:35 +0000 Subject: [aarch64-port-dev ] RFR: 8150229: aarch64: c2 fix pipeline class for several instructions. In-Reply-To: References: Message-ID: <56D5B343.3060301@redhat.com> On 02/19/2016 12:23 PM, Felix Yang wrote: > Please review the following webrev: > > http://cr.openjdk.java.net/~fyang/8150229/webrev.00/ OK. Andrew. From vladimir.x.ivanov at oracle.com Tue Mar 1 15:52:33 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Tue, 1 Mar 2016 18:52:33 +0300 Subject: [9] RFR (XS): 8150933: System::arraycopy intrinsic doesn't mark mismatched loads Message-ID: <56D5BAC1.1070107@oracle.com> https://bugs.openjdk.java.net/browse/JDK-8150933 http://cr.openjdk.java.net/~vlivanov/8150933/webrev.00 System.arraycopy intrinsic produces a mismatched access for unaligned case, but doesn't mark it as such. Also, added a check to avoid constant folding when mismatched access happens. It allows to produce correct code in product binaries, but signals about the problem when fastdebug binaries are used. Testing: failing tests, JPRT. 
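For readers unfamiliar with the term, a "mismatched" access is one that reads or writes memory through a type other than the array's declared element type, e.g. loading a long out of a byte[]. The sketch below is an illustration of the concept only, using the public VarHandle view API; the patch itself is about unaligned System.arraycopy inside C2, not this API.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

public class Mismatched {
    // A long-typed view over a byte[]: every get/set through this handle is
    // a mismatched access from the JIT's point of view, so C2 must not
    // constant-fold or freely reorder it like a plain byte load.
    static final VarHandle LONGS =
            MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.LITTLE_ENDIAN);

    static long readLong(byte[] b, int off) {
        return (long) LONGS.get(b, off);
    }

    public static void main(String[] args) {
        byte[] b = new byte[8];
        b[0] = 42; // least significant byte in little-endian order
        System.out.println(readLong(b, 0)); // prints 42
    }
}
```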
Best regards, Vladimir Ivanov From vladimir.kozlov at oracle.com Tue Mar 1 15:59:08 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 1 Mar 2016 07:59:08 -0800 Subject: [9] RFR (XS): 8150933: System::arraycopy intrinsic doesn't mark mismatched loads In-Reply-To: <56D5BAC1.1070107@oracle.com> References: <56D5BAC1.1070107@oracle.com> Message-ID: <56D5BC4C.7010303@oracle.com> Seems fine. Thanks, Vladimir On 3/1/16 7:52 AM, Vladimir Ivanov wrote: > https://bugs.openjdk.java.net/browse/JDK-8150933 > http://cr.openjdk.java.net/~vlivanov/8150933/webrev.00 > > System.arraycopy intrinsic produces a mismatched access for unaligned > case, but doesn't mark it as such. > > Also, added a check to avoid constant folding when mismatched access > happens. It allows to produce correct code in product binaries, but > signals about the problem when fastdebug binaries are used. > > Testing: failing tests, JPRT. > > Best regards, > Vladimir Ivanov From aleksey.shipilev at oracle.com Tue Mar 1 15:59:38 2016 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Tue, 1 Mar 2016 18:59:38 +0300 Subject: [9] RFR (XS): 8150933: System::arraycopy intrinsic doesn't mark mismatched loads In-Reply-To: <56D5BAC1.1070107@oracle.com> References: <56D5BAC1.1070107@oracle.com> Message-ID: <56D5BC6A.40808@oracle.com> On 03/01/2016 06:52 PM, Vladimir Ivanov wrote: > https://bugs.openjdk.java.net/browse/JDK-8150933 > http://cr.openjdk.java.net/~vlivanov/8150933/webrev.00 > > System.arraycopy intrinsic produces a mismatched access for unaligned > case, but doesn't mark it as such. Oh man. Looks good. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From vladimir.kozlov at oracle.com Tue Mar 1 17:03:24 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 1 Mar 2016 09:03:24 -0800 Subject: RFR (S) 8146801: Allocating short arrays of non-constant size is slow In-Reply-To: <56D58F62.80102@oracle.com> References: <56D4B1F7.4040201@oracle.com> <56D4E9F8.4070303@oracle.com> <56D58F62.80102@oracle.com> Message-ID: <56D5CB5C.4050203@oracle.com> Do you have new performance numbers? I hope it did not regress with the new code. Two things are left that I feel should be addressed. I think the is_large parameter for ClearArrayNode() should be explicit. We have only 2 places where it is called. We should avoid having copies of rep_fast_stos() in .ad files. The difference in .ad depending on UseFastStosb is format only. We now have the ability to generate format with conditions - see membar_volatile() and others (search %%template). There is also a way to add and set a new field in rep_fast_stos to check is_large in ins_encode but it is more complicated (adlc changes). So I would keep the predicate for that. Thanks, Vladimir On 3/1/16 4:47 AM, Aleksey Shipilev wrote: > Hi Vladimir! > > New webrev: > http://cr.openjdk.java.net/~shade/8146801/webrev.05/ > > On 03/01/2016 04:01 AM, Vladimir Kozlov wrote: >> I was thinking maybe we should do it in the Ideal graph instead of >> assembler. But it could trigger Fill array or split iterations >> optimizations which may not be good for such small arrays. > > Yes, I was thinking about that too, but backed off thinking this is an > x86-specific optimization. But then again, looking at ClearArray::Ideal > that happens for all platforms, we should try to emit the runtime check > there too. Let me see if we can pull that off without messing up the > generated code. Meanwhile, the webrev above is the assembler-side variant. > >> Does the CPU you tested on support ERMS (fast stos)?
> > Yes, tested on i7-4790K (Haswell), /proc/cpuinfo reports "erms", and we > are going through the UseFastStosb branch. > >> clear_mem() is used only in .ad which is only C2. You can put it under >> #ifdef COMPILER2 and you can access Matcher::init_array_short_size then. > > Ah, good trick. Still, I think exposing this as the true > platform-dependent flag is better. It also closely follows what > ClearArray::Ideal does. > >> Why doesn't x86_32.ad have similar changes? > > Hm, I thought I uploaded the webrev with those changes too, but now I > see it is missing. The new webrev has the x86_32 parts. > >> Should we really care for old CPUs (UseFastStosb == false)? > > I think we should care, in the same way we care in ClearArray::Ideal. > >> Use short branch instructions jccb and jmpb!!!! > >> movptr(Address(base, cnt, Address::times_ptr), 0); is too big. You have >> RAX for that. > >> The label declarations (except DONE) and bind(LONG); should be inside if >> (!is_large) { since they are only used there. > > *embarrassed* All three fixed in the new webrev. > >> You have too many jumps in this code. I would suggest the following: > > The variant of your code is in the new webrev (there were two minor bugs: > shlptr was too late for 32-bit VMs, and no jmpb(DONE)). > > > > Thanks, > -Aleksey > From vladimir.x.ivanov at oracle.com Tue Mar 1 17:04:26 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Tue, 1 Mar 2016 20:04:26 +0300 Subject: [9] RFR (XS): 8150933: System::arraycopy intrinsic doesn't mark mismatched loads In-Reply-To: <56D5BC4C.7010303@oracle.com> References: <56D5BAC1.1070107@oracle.com> <56D5BC4C.7010303@oracle.com> Message-ID: <56D5CB9A.1010209@oracle.com> Vladimir, Aleksey, thanks. Best regards, Vladimir Ivanov On 3/1/16 6:59 PM, Vladimir Kozlov wrote: > Seems fine.
> > Thanks, > Vladimir > > On 3/1/16 7:52 AM, Vladimir Ivanov wrote: >> https://bugs.openjdk.java.net/browse/JDK-8150933 >> http://cr.openjdk.java.net/~vlivanov/8150933/webrev.00 >> >> System.arraycopy intrinsic produces a mismatched access for unaligned >> case, but doesn't mark it as such. >> >> Also, added a check to avoid constant folding when mismatched access >> happens. It allows to produce correct code in product binaries, but >> signals about the problem when fastdebug binaries are used. >> >> Testing: failing tests, JPRT. >> >> Best regards, >> Vladimir Ivanov From pavel.punegov at oracle.com Tue Mar 1 17:12:08 2016 From: pavel.punegov at oracle.com (Pavel Punegov) Date: Tue, 1 Mar 2016 20:12:08 +0300 Subject: RFR (S): 8148563: compiler/compilercontrol/jcmd/StressAddMultiThreadedTest.java timesout In-Reply-To: <56D5B056.9020703@oracle.com> References: <47423DF0-C7C7-4053-BEC0-1D12A558E982@oracle.com> <56C77EFA.1060702@oracle.com> <56C85B32.4040405@oracle.com> <56CB22A6.2090502@oracle.com> <56D564DF.8000907@oracle.com> <288BE01B-3212-4ECF-B872-8C999A4011D0@oracle.com> <30AAA0EB-CC2C-4036-8073-A79187EEF76D@oracle.com> <56D5B056.9020703@oracle.com> Message-ID: Thanks for review, Nils ? Pavel. > On 01 Mar 2016, at 18:08, Nils Eliasson wrote: > > Looks good! 
> > Best regards, > Nils Eliasson > > > On 2016-03-01 16:03, Pavel Punegov wrote: >> Changed the limit of directives from 1000 to 1001, because the test has 5 threads submitting 200 directives, and there is also a default directive *.* always on the stack, which is also counted against the limit. >> >> http://cr.openjdk.java.net/~ppunegov/8148563/webrev.05/ >> >>> On 01 Mar 2016, at 17:41, Pavel Punegov < pavel.punegov at oracle.com > wrote: >>> >>> Thanks for the review, Nils >>> >>> new webrev with prepended -Xmixed: http://cr.openjdk.java.net/~ppunegov/8148563/webrev.04/ >>> >>>> On 01 Mar 2016, at 12:46, Nils Eliasson > wrote: >>>> >>>> Hi Pavel, >>>> >>>> Prepend -Xmixed to the inner test VM (the one running BaseAction launched by the process builder). (Or do that as a separate change) >>>> >>>> Otherwise it looks good! >>>> >>>> Best regards, >>>> Nils Eliasson >>>> >>>> >>>> On 2016-02-25 18:54, Pavel Punegov wrote: >>>>> Hi, >>>>> >>>>> I updated the webrev according to the discussion: http://cr.openjdk.java.net/~ppunegov/8148563/webrev.02/ >>>>> >>>>> 1. Remove the sequential test. >>>>> 2. As in the previous webrev, use different commands, not only Compiler.directives_add. >>>>> 3. Make the test execute only 20 commands, and run not more than 30 seconds (controlled by TimeLimitedRunner). >>>>> 4. Make the test execute diagnostic commands in a fixed set of 5 threads. >>>>> >>>>> - Pavel. >>>>> >>>>>> On 22 Feb 2016, at 18:00, Nils Eliasson < nils.eliasson at oracle.com > wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I posted my own webrev for the same issue before reading this thread. I didn't see when the bug changed owner. >>>>>> >>>>>> My reflections are: >>>>>> >>>>>> 1) The sequential test is redundant - the multi version tests everything. >>>>>> 2) Very good that you added more than just the add command! >>>>>> 3) Making this test run for 120 seconds is way too much in my opinion. 20 seconds should be more than enough each night. We are testing a stack guarded by a lock.
>>>>>> 4) Are you sure "Runtime.getRuntime().availableProcessors()" don't return all processors on a system (regardless of how many our image are allowed to use). I would limit to four threads or so - just to make sure there are possibilities for concurrent operations. >>>>>> >>>>>> Best regards, >>>>>> Nils Eliasson >>>>>> >>>>>> >>>>>> On 2016-02-20 13:25, Pavel Punegov wrote: >>>>>>> Vladimir, >>>>>>> >>>>>>> Test generated 5 directives of size 1000 directives before. >>>>>>> Previously test added them to directives stack one after another, making VM fail with native OOM (JDK-8144246 ). CompilerDirectivesLimit flag was added with default value of 50. Since that test began to add directives on the stack failing every time. >>>>>>> >>>>>>> I changed the test to create only one file (999 directives) to reach the limit of 1000 (set by option). So it still can try to add directives over the limit, but it also executes other commands like "clear", "remove" and "print". >>>>>>> >>>>>>> Added Nils to CC. Nils, could you please take a look also? >>>>>>> >>>>>>> On 19.02.2016 23:45, Vladimir Kozlov wrote: >>>>>>>> Seems fine. What size of directive files was before and after this fix? JBS does not have this information. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Vladimir >>>>>>>> >>>>>>>> On 2/19/16 9:04 AM, Pavel Punegov wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> please review the fix for a test bug. >>>>>>>>> >>>>>>>>> Issue: >>>>>>>>> 1. Test timeouts because it executes a lot of jcmd processes. Number of >>>>>>>>> threads is calculated as number of processors (cores) * 10, that led to >>>>>>>>> an enormous number of jcmds executed on hosts with lots of CPUs/cores. >>>>>>>>> 2. Test also spends a lot of time to generate 5 huge directive files, >>>>>>>>> that were tried to be added on to the directives stack. Directive stack >>>>>>>>> has a limit (default 50, controlled by CompilerDirectivesLimit). >>>>>>>>> >>>>>>>>> Fix: >>>>>>>>> 1. 
Calculate number of threads as a log of the number of CPUs/cores * 10. >>>>>>>>> 2. Generate only one file that is less than specified >>>>>>>>> CompilerDirectivesLimit. >>>>>>>>> 3. Add different commands to execute (add, clear, remove and print) and >>>>>>>>> generate them on demand. >>>>>>>>> >>>>>>>>> webrev: http://cr.openjdk.java.net/~ppunegov/8148563/webrev.00/ >>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8148563 >>>>>>>>> >>>>>>>>> ? Pavel. >>>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Thanks, >>>>>>> Pavel Punegov >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Tue Mar 1 17:15:11 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 1 Mar 2016 09:15:11 -0800 Subject: RFR(S/M): 8073793: serviceability/dcmd/compiler/CodelistTest.java fails with ClassNotFoundException trying to load VM anonymous class In-Reply-To: <56D5A66D.1030900@oracle.com> References: <56CF176A.7090705@oracle.com> <56D5A66D.1030900@oracle.com> Message-ID: <56D5CE1F.3010906@oracle.com> Good. Thanks, Vladimir On 3/1/16 6:25 AM, Nils Eliasson wrote: > New webrev using changed API in JDK-8150646. > > Webrev: http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ > > Best regards, > Nils Eliasson > > On 2016-02-25 16:02, Nils Eliasson wrote: >> Hi, >> >> Please review this fix of the CodelistTest. >> >> Summary: >> The test iterated over the output and tried to reflect some Classes >> for verification. This is fragile since some classes are not >> reflectable and it changes over time. >> >> Solution: >> Instead ensure compilation of some select methods, on different >> compile levels, and verify that those methods show up in the output. >> >> Testing: >> Test run on all platforms. 
>> >> This change requires: https://bugs.openjdk.java.net/browse/JDK-8150646 >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8073793 >> Webrev: http://cr.openjdk.java.net/~neliasso/8073793/webrev.01/ >> >> Best regards, >> Nils Eliasson > From vladimir.kozlov at oracle.com Tue Mar 1 17:24:54 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 1 Mar 2016 09:24:54 -0800 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: <56D5A60A.50700@oracle.com> References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5A60A.50700@oracle.com> Message-ID: <56D5D066.7040805@oracle.com> Nils, please answer Pavel's questions. Thanks, Vladimir On 3/1/16 6:24 AM, Nils Eliasson wrote: > Hi Volker, > > An excellent proposition. This is how it should be used. > > I polished a few rough edges: > * CompilerBroker.cpp - The directives was already access in > compile_method - but hidden incompilation_is_prohibited. I moved it out > so we only have a single directive access. Wrapped compile_method to > make sure the release of the directive doesn't get lost. > * Let WB_AddCompilerDirective return a bool for success. Also fixed the > state - need to be in native to get string, but then need to be in VM > when parsing directive. > > And some comments: > * I am against adding new compile option commands (At least until the > stringly typeness is fixed). Lets add good ways too use compiler > directives instead. > > I need to look at the stale task removal code tomorrow - hopefully we > could save the blocking info in the task so we don't need to access the > directive in the policy. 
> > All in here: > Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ > > The code runs fine with the test I fixed for JDK-8073793: > http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ > > Best regards, > Nils Eliasson > > On 2016-02-26 19:47, Volker Simonis wrote: >> Hi, >> >> so I want to propose the following solution for this problem: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel >> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ >> >> I've started from the opposite site and made the BackgroundCompilation >> manageable through the compiler directives framework. Once this works >> (and it's actually trivial due to the nice design of the >> CompilerDirectives framework :), we get the possibility to set the >> BackgroundCompilation option on a per method base on the command line >> via the CompileCommand option for free: >> >> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" >> >> >> And of course we can also use it directly as a compiler directive: >> >> [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] >> >> It also becomes possible to use this directly from the Whitebox API >> through the DiagnosticCommand.compilerDirectivesAdd command. >> Unfortunately, this command takes a file with compiler directives as >> argument. I think this would be overkill in this context. So because >> it was so easy and convenient, I added the following two new Whitebox >> methods: >> >> public native void addCompilerDirective(String compDirect); >> public native void removeCompilerDirective(); >> >> which can now be used to set arbitrary CompilerDirective command >> directly from within the WhiteBox API. (The implementation of these >> two methods is trivial as you can see in whitebox.cpp). 
>> v >> The blocking versions of enqueueMethodForCompilation() now become >> simple wrappers around the existing methods without the need of any >> code changes in their native implementation. This is good, because it >> keeps the WhiteBox API stable! >> >> Finally some words about the implementation of the per-method >> BackgroundCompilation functionality. It actually only requires two >> small changes: >> >> 1. extending CompileBroker::is_compile_blocking() to take the method >> and compilation level as arguments and use them to query the >> DirectivesStack for the corresponding BackgroundCompilation value. >> >> 2. changing AdvancedThresholdPolicy::select_task() such that it >> prefers blocking compilations. This is not only necessary, because it >> decreases the time we have to wait for a blocking compilation, but >> also because it prevents blocking compiles from getting stale. This >> could otherwise easily happen in AdvancedThresholdPolicy::is_stale() >> for methods which only get artificially compiled during a test because >> their invocations counters are usually too small. >> >> There's still a small probability that a blocking compilation will be >> not blocking. This can happen if a method for which we request the >> blocking compilation is already in the compilation queue (see the >> check 'compilation_is_in_queue(method)' in >> CompileBroker::compile_method_base()). In testing scenarios this will >> rarely happen because methods which are manually compiled shouldn't >> get called that many times to implicitly place them into the compile >> queue. But we can even completely avoid this problem by using >> WB.isMethodQueuedForCompilation() to make sure that a method is not in >> the queue before we request a blocking compilation. >> >> I've also added a small regression test to demonstrate and verify the >> new functionality. 
>> >> Regards, >> Volker > On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson > wrote: >>> Hi Vladimir, >>> >>> WhiteBox::compilation_locked is a global state that temporarily stops >>> all >>> compilations. In this case I just want to achieve blocking compilation >>> for a >>> single compile without affecting the rest of the system. The tests >>> using it >>> will continue executing as soon as that compile is finished, saving time >>> where wait-loops are used today. It adds nice determinism to tests. >>> >>> Best regards, >>> Nils Eliasson >>> >>> >>> On 2016-02-25 22:14, Vladimir Kozlov wrote: >>>> You are adding a parameter which is used only for testing. >>>> Can we have a callback (or check a field) into WB instead? Similar to >>>> WhiteBox::compilation_locked. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 2/25/16 7:01 AM, Nils Eliasson wrote: >>>>> Hi, >>>>> >>>>> Please review this change that adds support for blocking compiles >>>>> in the >>>>> whitebox API. This enables simpler, less time-consuming tests. >>>>> >>>>> Motivation: >>>>> * -XX:-BackgroundCompilation is a global flag and can be time >>>>> consuming >>>>> * Blocking compiles remove the need for waiting on the compile >>>>> queue to >>>>> complete >>>>> * Compiles put in the queue may be evicted if the queue grows too big - >>>>> causing indeterminism in the test >>>>> * Fewer VM flags allow for more tests in the same VM >>>>> >>>>> Testing: >>>>> Posting a separate RFR for a test fix that uses this change. They >>>>> will be >>>>> pushed at the same time.
>>>>> >>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 >>>>> JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >>>>> Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >>>>> >>>>> Best regards, >>>>> Nils Eliasson >>> > From volker.simonis at gmail.com Tue Mar 1 18:31:20 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 1 Mar 2016 19:31:20 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: <56D5D066.7040805@oracle.com> References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5A60A.50700@oracle.com> <56D5D066.7040805@oracle.com> Message-ID: Hi Pavel, Nils, Vladimir, sorry, but I was busy the last days so I couldn't answer your mails. Thanks a lot for your input and your suggestions. I'll look into this tomorrow and hopefully I'll be able to address all your concerns. Regards, Volker On Tue, Mar 1, 2016 at 6:24 PM, Vladimir Kozlov wrote: > Nils, please answer Pavel's questions. > > Thanks, > Vladimir > > > On 3/1/16 6:24 AM, Nils Eliasson wrote: >> >> Hi Volker, >> >> An excellent proposition. This is how it should be used. >> >> I polished a few rough edges: >> * CompilerBroker.cpp - The directives was already access in >> compile_method - but hidden incompilation_is_prohibited. I moved it out >> so we only have a single directive access. Wrapped compile_method to >> make sure the release of the directive doesn't get lost. >> * Let WB_AddCompilerDirective return a bool for success. Also fixed the >> state - need to be in native to get string, but then need to be in VM >> when parsing directive. >> >> And some comments: >> * I am against adding new compile option commands (At least until the >> stringly typeness is fixed). Lets add good ways too use compiler >> directives instead. 
>> >> I need to look at the stale task removal code tomorrow - hopefully we >> could save the blocking info in the task so we don't need to access the >> directive in the policy. >> >> All in here: >> Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ >> >> The code runs fine with the test I fixed for JDK-8073793: >> http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ >> >> Best regards, >> Nils Eliasson >> >> On 2016-02-26 19:47, Volker Simonis wrote: >>> >>> Hi, >>> >>> so I want to propose the following solution for this problem: >>> >>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel >>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ >>> >>> I've started from the opposite site and made the BackgroundCompilation >>> manageable through the compiler directives framework. Once this works >>> (and it's actually trivial due to the nice design of the >>> CompilerDirectives framework :), we get the possibility to set the >>> BackgroundCompilation option on a per method base on the command line >>> via the CompileCommand option for free: >>> >>> >>> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" >>> >>> >>> And of course we can also use it directly as a compiler directive: >>> >>> [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] >>> >>> It also becomes possible to use this directly from the Whitebox API >>> through the DiagnosticCommand.compilerDirectivesAdd command. >>> Unfortunately, this command takes a file with compiler directives as >>> argument. I think this would be overkill in this context. So because >>> it was so easy and convenient, I added the following two new Whitebox >>> methods: >>> >>> public native void addCompilerDirective(String compDirect); >>> public native void removeCompilerDirective(); >>> >>> which can now be used to set arbitrary CompilerDirective command >>> directly from within the WhiteBox API. 
(The implementation of these >>> two methods is trivial as you can see in whitebox.cpp). >>> v >>> The blocking versions of enqueueMethodForCompilation() now become >>> simple wrappers around the existing methods without the need of any >>> code changes in their native implementation. This is good, because it >>> keeps the WhiteBox API stable! >>> >>> Finally some words about the implementation of the per-method >>> BackgroundCompilation functionality. It actually only requires two >>> small changes: >>> >>> 1. extending CompileBroker::is_compile_blocking() to take the method >>> and compilation level as arguments and use them to query the >>> DirectivesStack for the corresponding BackgroundCompilation value. >>> >>> 2. changing AdvancedThresholdPolicy::select_task() such that it >>> prefers blocking compilations. This is not only necessary, because it >>> decreases the time we have to wait for a blocking compilation, but >>> also because it prevents blocking compiles from getting stale. This >>> could otherwise easily happen in AdvancedThresholdPolicy::is_stale() >>> for methods which only get artificially compiled during a test because >>> their invocations counters are usually too small. >>> >>> There's still a small probability that a blocking compilation will be >>> not blocking. This can happen if a method for which we request the >>> blocking compilation is already in the compilation queue (see the >>> check 'compilation_is_in_queue(method)' in >>> CompileBroker::compile_method_base()). In testing scenarios this will >>> rarely happen because methods which are manually compiled shouldn't >>> get called that many times to implicitly place them into the compile >>> queue. But we can even completely avoid this problem by using >>> WB.isMethodQueuedForCompilation() to make sure that a method is not in >>> the queue before we request a blocking compilation. >>> >>> I've also added a small regression test to demonstrate and verify the >>> new functionality. 
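The selection rule described in point 2 can be sketched roughly as follows. Task and its fields are hypothetical stand-ins for illustration; the real AdvancedThresholdPolicy::select_task() is C++ and weighs candidates by event rate and hotness rather than a single score.

```java
import java.util.Arrays;
import java.util.List;

public class SelectTaskSketch {
    // Hypothetical stand-in for a compile queue entry.
    static final class Task {
        final String method;
        final boolean blocking; // a thread is waiting on this compilation
        final int weight;       // simplified hotness score
        Task(String method, boolean blocking, int weight) {
            this.method = method; this.blocking = blocking; this.weight = weight;
        }
    }

    // Prefer any blocking task outright; otherwise pick the "hottest" one.
    // Preferring blocking tasks both shortens the waiter's stall and keeps
    // the task from being dropped as stale before it is ever selected.
    static Task select(List<Task> queue) {
        Task best = null;
        for (Task t : queue) {
            if (t.blocking) return t;
            if (best == null || t.weight > best.weight) best = t;
        }
        return best;
    }

    public static void main(String[] args) {
        Task hot = new Task("hot", false, 100);
        Task waited = new Task("waited", true, 1);
        System.out.println(select(Arrays.asList(hot, waited)).method); // prints waited
    }
}
```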
>>> >>> Regards, >>> Volker >> >> On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson >> wrote: >>>> >>>> Hi Vladimir, >>>> >>>> WhiteBox::compilation_locked is a global state that temporarily stops >>>> all >>>> compilations. I this case I just want to achieve blocking compilation >>>> for a >>>> single compile without affecting the rest of the system. The tests >>>> using it >>>> will continue executing as soon as that compile is finished, saving time >>>> where wait-loops is used today. It adds nice determinism to tests. >>>> >>>> Best regards, >>>> Nils Eliasson >>>> >>>> >>>> On 2016-02-25 22:14, Vladimir Kozlov wrote: >>>>> >>>>> You are adding parameter which is used only for testing. >>>>> Can we have callback(or check field) into WB instead? Similar to >>>>> WhiteBox::compilation_locked. >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 2/25/16 7:01 AM, Nils Eliasson wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> Please review this change that adds support for blocking compiles >>>>>> in the >>>>>> whitebox API. This enables simpler less time consuming tests. >>>>>> >>>>>> Motivation: >>>>>> * -XX:-BackgroundCompilation is a global flag and can be time >>>>>> consuming >>>>>> * Blocking compiles removes the need for waiting on the compile >>>>>> queue to >>>>>> complete >>>>>> * Compiles put in the queue may be evicted if the queue grows to big - >>>>>> causing indeterminism in the test >>>>>> * Less VM-flags allows for more tests in the same VM >>>>>> >>>>>> Testing: >>>>>> Posting a separate RFR for test fix that uses this change. They >>>>>> will be >>>>>> pushed at the same time. 
>>>>>> >>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 >>>>>> JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >>>>>> Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >>>>>> >>>>>> Best regards, >>>>>> Nils Eliasson >>>> >>>> >> > From mtraverso at gmail.com Tue Mar 1 19:18:28 2016 From: mtraverso at gmail.com (Martin Traverso) Date: Tue, 1 Mar 2016 11:18:28 -0800 Subject: JIT stops compiling after a while (java 8u45) In-Reply-To: <56D59669.7030200@oracle.com> References: <1456786793750-259603.post@n7.nabble.com> <56D56E8E.5060402@oracle.com> <56D58C74.1020406@oracle.com> <56D59669.7030200@oracle.com> Message-ID: > For real world applications I hope that this is a much smaller issue but if you must load and execute loads and loads of short lived classes then it might be reasonable to disable concurrent class unloading (at the cost of getting serial Full gcs instead). Unfortunately, this is not a theoretical issue for us. We see this problem running Presto (http://prestodb.io), which generates bytecode for every query it processes. For now, we're working around it with a background thread that watches the size of the code cache and calls System.gc() when it gets close to the max ( https://github.com/facebook/presto/commit/91e1b3bb6bbfffc62401025a24231cd388992d7c ). Martin On Tue, Mar 1, 2016 at 5:17 AM, Mikael Gerdin wrote: > Hi, > > On 2016-03-01 13:35, Tobias Hartmann wrote: > >> Hi, >> >> I just had another look and it turned out that even with 8u40+ class >> unloading is triggered. I missed that because it happens *much* later >> (compared to 8u33) when the code cache has already filled up and compilation is >> disabled. At this point we don't recover because new classes are loaded and >> new OSR nmethods are compiled rapidly. >> >> Summary: >> The code cache fills up due to OSR nmethods that are not being flushed. 
>> With 8u33 and earlier, G1 did more aggressive class unloading (probably due >> to more allocations or different heuristics) and this allowed the sweeper >> to flush enough OSR nmethods to continue compilation. With 8u40 and later, >> class unloading happens long after the code cache is full. >> > > Before 8u40 G1 could only unload classes at Full GCs. > After 8u40 G1 can unload classes at the end of a concurrent GC cycle, > avoiding Full GC. > > If you run the test with CMS with +CMSClassUnloadingEnabled you will > probably see similar problematic results since the class unloading in G1 is > very similar to the one in CMS. > I haven't investigated in depth why the classes do not get unloaded in the > G1 and CMS cases but there are several known quirks with how concurrent > class unloading behaves which cause them to unload classes later than the > serial Full GC. > > Running G1 with -XX:-ClassUnloadingWithConcurrentMark > or CMS with -XX:-CMSClassUnloadingEnabled > disables concurrent class unloading completely and works around the issue > you are seeing. > > For real world applications I hope that this is a much smaller issue but > if you must load and execute loads and loads of short lived classes then it > might be reasonable to disable concurrent class unloading (at the cost of > getting serial Full gcs instead). > > > >> I think we should fix this by flushing "cold" OSR nmethods as well >> (JDK-8023191). Thomas Schatzl mentioned that we could also trigger a >> concurrent mark if the code cache is full and hope that some classes are >> unloaded but I'm afraid this is too invasive (and does not help much in the >> general case). >> > > If it is possible to flush OSR nmethods without doing a full class > unloading cycle then I think that path is preferable. > > /Mikael > > > >> Opinions? >> >> Best regards, >> Tobias >> >> On 01.03.2016 11:27, Tobias Hartmann wrote: >> >>> Hi Nileema, >>> >>> thanks for reporting this issue! 
>>> >>> CC'ing the GC team because this seems to be a GC issue (see evaluation >>> below). >>> >>> On 29.02.2016 23:59, nileema wrote: >>> >>>> We are seeing an issue with the CodeCache becoming full which causes the >>>> compiler to be disabled in jdk-8u45 to jdk-8u72. >>>> >>>> We had seen a similar issue in Java7 (old issue: >>>> >>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-August/011333.html >>>> ). >>>> This issue went away with earlier versions of Java 8. >>>> >>> >>> Reading the old conversation, I'm wondering if this could again be a >>> problem with OSR nmethods that are not flushed? The bug (JDK-8023191) is >>> still open - now assigned to me. >>> >>> Doing a quick experiment, it looks like we mostly compile OSR methods: >>> 22129 2137 % 3 Runnable_412::run @ 4 (31 bytes) >>> 22130 2189 % 4 Runnable_371::run @ 4 (31 bytes) >>> 22134 2129 % 3 Runnable_376::run @ 4 (31 bytes) >>> 22136 2109 % 3 Runnable_410::run @ 4 (31 bytes) >>> >>> Currently, OSR nmethods are not flushed just because the code cache is >>> full but only if the nmethod becomes invalid (class loading/unloading, >>> uncommon trap, ..) >>> >>> With your test, class unloading should happen and therefore the OSR >>> nmethods *should* be flushed. >>> >>> We used the test http://github.com/martint/jittest to compare the >>>> behavior >>>> of jdk-8u25 and jdk-8u45. For this test, we did not see any CodeCache >>>> full >>>> messages with jdk-8u25 but did see them with 8u45+ (8u60 and 8u74) >>>> Test results comparing 8u25, 8u45 and 8u74: >>>> https://gist.github.com/nileema/6fb667a215e95919242f >>>> >>>> In the results you can see that 8u25 starts collecting the code cache >>>> much >>>> sooner than 8u45. 8u45 very quickly hits the limit for code cache. If we >>>> force a full gc when it is about to hit the code cache limit, we see the >>>> code cache size go down. 
>>>> >>> >>> You can use the following flags to get additional information: >>> -XX:CICompilerCount=1 -XX:+PrintCompilation -XX:+PrintMethodFlushing >>> -XX:+TraceClassUnloading >>> >>> I did some more experiments with 8u45: >>> >>> java -mx20g -ms20g -XX:ReservedCodeCacheSize=20m >>> -XX:+TraceClassUnloading -XX:+UseG1GC -jar >>> jittest-1.0-SNAPSHOT-standalone.jar | grep "Unloading" >>> -> We do *not* unload any classes. The code cache fills up with OSR >>> nmethods that are not flushed. >>> >>> Removing the -XX:+UseG1GC flag solves the issue: >>> >>> java -mx20g -ms20g -XX:ReservedCodeCacheSize=20m >>> -XX:+TraceClassUnloading -jar jittest-1.0-SNAPSHOT-standalone.jar | grep >>> Unloading >>> -> Prints plenty of [Unloading class Runnable_40 0x00000007c0086028] >>> messages and the code cache does not fill up. >>> -> OSR nmethods are flushed because the classes are unloaded: >>> 21670 970 % 4 Runnable_87::run @ -2 (31 bytes) made >>> zombie >>> >>> The log files look good: >>> >>> 1456825330672 112939 10950016 10195496 111.28 >>> 1456825331675 118563 11432256 10467176 112.41 >>> 1456825332678 125935 11972928 10778432 115.72 >>> [Unloading class Runnable_2498 0x00000007c0566028] >>> ... >>> [Unloading class Runnable_34 0x00000007c0082028] >>> 1456825333684 131493 10220608 5382976 117.46 >>> 1456825334688 137408 10359296 5636120 116.81 >>> 1456825335692 143593 7635136 5914624 114.21 >>> >>> After the code cache fills up, we unload classes and therefore flush >>> methods and start over again. >>> >>> I checked for several releases if classes are unloaded: >>> - 8u27: success >>> - 8u33: success >>> - 8u40: fail >>> - 8u45: fail >>> - 8u76: fail >>> >>> The regression was introduced in 8u40. >>> >>> I also tried with the latest JDK 9 build and it fails as well (had to >>> change the bean name from "Code Cache" to "CodeCache" and run with >>> -XX:-SegmentedCodeCache). Again, -XX:-UseG1GC -XX:+UseParallelGC solves the >>> problem. 
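As an aside for readers following the thread: the user-level workaround Martin describes earlier (a background thread that polls the code cache pool via JMX and forces a full GC when it nears the limit, so class unloading lets the sweeper flush dead nmethods) can be sketched roughly like this. This is an illustrative sketch only; the class name, threshold, and single-shot structure are hypothetical, and this is not Presto's actual code:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class CodeCacheWatcher {
    // Hypothetical threshold: react once the pool is 95% full.
    static final double THRESHOLD = 0.95;

    // Returns true if the pool has a defined maximum and is nearly full.
    static boolean nearlyFull(MemoryPoolMXBean pool) {
        long max = pool.getUsage().getMax();
        return max > 0 && (double) pool.getUsage().getUsed() / max >= THRESHOLD;
    }

    public static void main(String[] args) {
        // The code cache is a non-heap pool; its name is "Code Cache" on JDK 8
        // and "CodeHeap '...'" segments on JDK 9+ with -XX:+SegmentedCodeCache.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getName().contains("Code")) {
                if (nearlyFull(pool)) {
                    // Nudge the collector so class unloading can let the
                    // sweeper flush dead nmethods (the workaround in question).
                    System.gc();
                }
                System.out.println(pool.getName() + " used=" + pool.getUsage().getUsed());
            }
        }
    }
}
```

A real watcher would run this check periodically on a daemon thread; it runs once here so the sketch stays self-contained.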
>>> Can someone from the GC team have a look? >>> >>> Is this a known issue? >>>> >>> >>> I'm not aware of any related issue. >>> >>> Best regards, >>> Tobias >>> >>> Thanks! >>>> >>>> Nileema >>>> >>>> >>>> >>>> -- >>>> View this message in context: >>>> http://openjdk.5641.n7.nabble.com/JIT-stops-compiling-after-a-while-java-8u45-tp259603.html >>>> Sent from the OpenJDK Hotspot Compiler Development List mailing list >>>> archive at Nabble.com. >>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From nils.eliasson at oracle.com Tue Mar 1 20:04:15 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 1 Mar 2016 21:04:15 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> Message-ID: <56D5F5BF.8090303@oracle.com> Hi, On 2016-02-29 20:01, Pavel Punegov wrote: > Hi Volker, > > I have some comments and questions about your patch: > > *- src/share/vm/runtime/advancedThresholdPolicy.cpp* > > You check for background compilation (blocking) by searching for > an appropriate directive. > But there is a CompileTask::is_blocking() method, that returns a value > set in CompileBroker::compile_method_base when a compile task was > created. It seems that CompileBroker::is_compile_blocking() finds the > right directive and checks whether BackgroundCompilation is set. Yes, CompileTask::is_blocking() should be used instead of looking up the directive again. > > I think that checking it twice could lead to an issue with different > directives being set on the stack. With diagnostic commands I can > clear the directives stack, or remove directives. If I do this in > between the time the task was submitted and the time it is checked > in AdvancedThresholdPolicy::select_task, this task could become > non-blocking. 
> > *- src/share/vm/compiler/compileBroker.cpp* > * > *1317 backgroundCompilation = directive->BackgroundCompilationOption; > > Does it check whether BackgroundCompilation is set for both c1 and > c2 at the same time? What will happen if I set BackgroundCompilation > for c1 only? > AFAIK, there are different queues for c1 and c2, and hence we could > have BackgroundCompilation set separately for both compilers. The correct directive set is retrieved in the beginning of the compilation when getMatchingDirective(target_method, target_compiler) is called. So this will work perfectly even with different flags for different compilers. > > *- **test/lib/sun/hotspot/WhiteBox.java > * > 318 addCompilerDirective("[{ match: ? > > I'm not quite sure that this is the right way to set a method to be > blocking. Adding a directive on top of the stack makes previously set > directives for that method unused. > For example, if I would like to set a method to be logged > (LogCompilation) and disable some inlining, but then enqueue it with > WB, I will get it to be only compiled without LogCompilation. > But, AFAIK, setting a CompileCommand option will work for an already set > directive through a compatibility layer in CompilerDirectives. > > So, I would prefer to have a directive (file or WB) or an option set > by myself, and then invoke the standard WB.enqueueMethodForCompilation(). I agree that the enqueueMethod with the block-argument might cause undesirable surprises and that it is better to just have the plain methods addCompilerDirective, enqueueMethod and removeCompilerDirective in Whitebox.java. If we find some often used pattern we can add that to CompilerWhitebox or similar. There is one bug here though - addCompilerDirective adds any number of directives, but removeCompilerDirective just removes one. I can do a quick fix that limits the WB API to just add one at a time. Thanks for the feedback, Nils Eliasson > > 
Thanks, > Pavel Punegov > >> On 26 Feb 2016, at 21:47, Volker Simonis > > wrote: >> >> Hi, >> >> so I want to propose the following solution for this problem: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel >> >> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ >> >> I've started from the opposite side and made the BackgroundCompilation >> manageable through the compiler directives framework. Once this works >> (and it's actually trivial due to the nice design of the >> CompilerDirectives framework :), we get the possibility to set the >> BackgroundCompilation option on a per-method basis on the command line >> via the CompileCommand option for free: >> >> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" >> >> And of course we can also use it directly as a compiler directive: >> >> [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] >> >> It also becomes possible to use this directly from the Whitebox API >> through the DiagnosticCommand.compilerDirectivesAdd command. >> Unfortunately, this command takes a file with compiler directives as >> argument. I think this would be overkill in this context. So because >> it was so easy and convenient, I added the following two new Whitebox >> methods: >> >> public native void addCompilerDirective(String compDirect); >> public native void removeCompilerDirective(); >> >> which can now be used to set arbitrary CompilerDirective commands >> directly from within the WhiteBox API. (The implementation of these >> two methods is trivial as you can see in whitebox.cpp). >> >> The blocking versions of enqueueMethodForCompilation() now become >> simple wrappers around the existing methods without the need for any >> code changes in their native implementation. This is good, because it >> keeps the WhiteBox API stable! >> >> Finally some words about the implementation of the per-method >> BackgroundCompilation functionality. 
It actually only requires two >> small changes: >> >> 1. extending CompileBroker::is_compile_blocking() to take the method >> and compilation level as arguments and use them to query the >> DirectivesStack for the corresponding BackgroundCompilation value. >> >> 2. changing AdvancedThresholdPolicy::select_task() such that it >> prefers blocking compilations. This is not only necessary because it >> decreases the time we have to wait for a blocking compilation, but >> also because it prevents blocking compiles from getting stale. This >> could otherwise easily happen in AdvancedThresholdPolicy::is_stale() >> for methods which only get artificially compiled during a test because >> their invocation counters are usually too small. >> >> There's still a small probability that a blocking compilation will >> not be blocking. This can happen if a method for which we request the >> blocking compilation is already in the compilation queue (see the >> check 'compilation_is_in_queue(method)' in >> CompileBroker::compile_method_base()). In testing scenarios this will >> rarely happen because methods which are manually compiled shouldn't >> get called that many times to implicitly place them into the compile >> queue. But we can even completely avoid this problem by using >> WB.isMethodQueuedForCompilation() to make sure that a method is not in >> the queue before we request a blocking compilation. >> >> I've also added a small regression test to demonstrate and verify the >> new functionality. >> >> Regards, >> Volker >> >> >> >> >> On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson >> wrote: >>> Hi Vladimir, >>> >>> WhiteBox::compilation_locked is a global state that temporarily >>> stops all >>> compilations. In this case I just want to achieve blocking >>> compilation for a >>> single compile without affecting the rest of the system. The tests >>> using it >>> will continue executing as soon as that compile is finished, saving time >>> where wait-loops are used today. 
It adds nice determinism to tests. >>> >>> Best regards, >>> Nils Eliasson >>> >>> >>> On 2016-02-25 22:14, Vladimir Kozlov wrote: >>>> >>>> You are adding a parameter which is used only for testing. >>>> Can we have a callback (or check a field) into WB instead? Similar to >>>> WhiteBox::compilation_locked. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 2/25/16 7:01 AM, Nils Eliasson wrote: >>>>> >>>>> Hi, >>>>> >>>>> Please review this change that adds support for blocking compiles >>>>> in the >>>>> whitebox API. This enables simpler, less time-consuming tests. >>>>> >>>>> Motivation: >>>>> * -XX:-BackgroundCompilation is a global flag and can be time >>>>> consuming >>>>> * Blocking compiles remove the need for waiting on the compile >>>>> queue to >>>>> complete >>>>> * Compiles put in the queue may be evicted if the queue grows too big - >>>>> causing indeterminism in the test >>>>> * Fewer VM flags allow for more tests in the same VM >>>>> >>>>> Testing: >>>>> Posting a separate RFR for the test fix that uses this change. They >>>>> will be >>>>> pushed at the same time. >>>>> >>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 >>>>> JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >>>>> Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >>>>> >>>>> Best regards, >>>>> Nils Eliasson >>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aleksey.shipilev at oracle.com Tue Mar 1 20:06:12 2016 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Tue, 1 Mar 2016 23:06:12 +0300 Subject: RFR (S) 8150669: C1 intrinsic for Class.isPrimitive In-Reply-To: <56D569CC.70406@oracle.com> References: <56CF6A7E.7080204@oracle.com> <56D4BF23.2070308@oracle.com> <4AAA9FF6-99A5-47F4-B707-0C6E0CB2D3BC@oracle.com> <56D4CD42.2050101@oracle.com> <56D569CC.70406@oracle.com> Message-ID: <56D5F634.9020105@oracle.com> Hi Vladimir, Fixed all your findings in a new webrev: http://cr.openjdk.java.net/~shade/8150669/webrev.03/ Passes JPRT -testset hotspot; microbenchmarks; new test. On 03/01/2016 01:07 PM, Vladimir Ivanov wrote: > I have a general question: why did you decide to intrinsify the method > into a native call? Class::is_primitive looks pretty trivial to > translate it right into machine code: > bool java_lang_Class::is_primitive(oop java_class) { > bool is_primitive = (java_class->metadata_field(_klass_offset) == > NULL); > ... > return is_primitive; > } Yeah, previous version was just a stripped-down isInstance intrinsic code. But you are right, we can just read class data without messing with runtime calls, which improves performance even without canonicalizing: http://cr.openjdk.java.net/~shade/8150669/notes.txt Cheers, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From christian.thalinger at oracle.com Tue Mar 1 21:01:38 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Tue, 1 Mar 2016 11:01:38 -1000 Subject: RFR (S) 8150669: C1 intrinsic for Class.isPrimitive In-Reply-To: <56D5F634.9020105@oracle.com> References: <56CF6A7E.7080204@oracle.com> <56D4BF23.2070308@oracle.com> <4AAA9FF6-99A5-47F4-B707-0C6E0CB2D3BC@oracle.com> <56D4CD42.2050101@oracle.com> <56D569CC.70406@oracle.com> <56D5F634.9020105@oracle.com> Message-ID: <94539711-4823-41FC-816A-E17FE13F490D@oracle.com> > On Mar 1, 2016, at 10:06 AM, Aleksey Shipilev wrote: > > Hi Vladimir, > > Fixed all your findings in a new webrev: > http://cr.openjdk.java.net/~shade/8150669/webrev.03/ Even better. > > Passes JPRT -testset hotspot; microbenchmarks; new test. > > On 03/01/2016 01:07 PM, Vladimir Ivanov wrote: >> I have a general question: why did you decide to intrinsify the method >> into a native call? Class::is_primitive looks pretty trivial to >> translate it right into machine code: >> bool java_lang_Class::is_primitive(oop java_class) { >> bool is_primitive = (java_class->metadata_field(_klass_offset) == >> NULL); >> ... >> return is_primitive; >> } > > Yeah, previous version was just a stripped-down isInstance intrinsic > code. 
But you are right, we can just read class data without messing > with runtime calls, which improves performance even without canonicalizing: > http://cr.openjdk.java.net/~shade/8150669/notes.txt > > Cheers, > -Aleksey > > From aleksey.shipilev at oracle.com Tue Mar 1 21:46:18 2016 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Wed, 2 Mar 2016 00:46:18 +0300 Subject: RFR (S) 8146801: Allocating short arrays of non-constant size is slow In-Reply-To: <56D5CB5C.4050203@oracle.com> References: <56D4B1F7.4040201@oracle.com> <56D4E9F8.4070303@oracle.com> <56D58F62.80102@oracle.com> <56D5CB5C.4050203@oracle.com> Message-ID: <56D60DAA.2010401@oracle.com> On 03/01/2016 08:03 PM, Vladimir Kozlov wrote: > Do you have new performance numbers? I hope it did not regress with new > code. It does not regress, the code is tight: http://cr.openjdk.java.net/~shade/8146801/notes.txt > 2 things left I feel should be addressed. Both are fixed here: http://cr.openjdk.java.net/~shade/8146801/webrev.06/ Still passes JPRT -testset hotspot; RBT run is in progress. Cheers, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From vladimir.kozlov at oracle.com Tue Mar 1 22:40:55 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 1 Mar 2016 14:40:55 -0800 Subject: RFR (S) 8146801: Allocating short arrays of non-constant size is slow In-Reply-To: <56D60DAA.2010401@oracle.com> References: <56D4B1F7.4040201@oracle.com> <56D4E9F8.4070303@oracle.com> <56D58F62.80102@oracle.com> <56D5CB5C.4050203@oracle.com> <56D60DAA.2010401@oracle.com> Message-ID: <56D61A77.6080509@oracle.com> Perfect! Thanks, Vladimir On 3/1/16 1:46 PM, Aleksey Shipilev wrote: > On 03/01/2016 08:03 PM, Vladimir Kozlov wrote: >> Do you have new performance numbers? I hope it did not regress with new >> code. 
> > It does not regress, the code is tight: > http://cr.openjdk.java.net/~shade/8146801/notes.txt > >> 2 things left I feel should be addressed. > > Both are fixed here: > http://cr.openjdk.java.net/~shade/8146801/webrev.06/ > > Still passes JPRT -testset hotspot; RBT run is in progress. > > Cheers, > -Aleksey > From vitalyd at gmail.com Tue Mar 1 22:53:52 2016 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Tue, 1 Mar 2016 17:53:52 -0500 Subject: RFR (S) 8146801: Allocating short arrays of non-constant size is slow In-Reply-To: <56D61A77.6080509@oracle.com> References: <56D4B1F7.4040201@oracle.com> <56D4E9F8.4070303@oracle.com> <56D58F62.80102@oracle.com> <56D5CB5C.4050203@oracle.com> <56D60DAA.2010401@oracle.com> <56D61A77.6080509@oracle.com> Message-ID: Related question - can the prefetch hints go away for small array allocations considering size is already being branched on? I've noticed allocations always come with a prefetch sequence, so perhaps this is just the standard allocation pattern. On Tuesday, March 1, 2016, Vladimir Kozlov wrote: > Perfect! > > Thanks, > Vladimir > > On 3/1/16 1:46 PM, Aleksey Shipilev wrote: > >> On 03/01/2016 08:03 PM, Vladimir Kozlov wrote: >> >>> Do you have new performance numbers? I hope it did not regress with new >>> code. >>> >> >> It does not regress, the code is tight: >> http://cr.openjdk.java.net/~shade/8146801/notes.txt >> >> 2 things left I feel should be addressed. >>> >> >> Both are fixed here: >> http://cr.openjdk.java.net/~shade/8146801/webrev.06/ >> >> Still passes JPRT -testset hotspot; RBT run is in progress. >> >> Cheers, >> -Aleksey >> >> -- Sent from my phone -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mikael.vidstedt at oracle.com Wed Mar 2 00:25:37 2016 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Tue, 1 Mar 2016 16:25:37 -0800 Subject: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions Message-ID: <56D63301.9050909@oracle.com> Please review the following change which updates the various vextract* and vinsert* methods in assembler_x86 & macroAssembler_x86 to better match the real HW instructions, which also has the benefit of providing the full functionality/flexibility of the instructions where earlier only some specific modes were supported. Put differently, with this change it's much easier to correlate the methods to the Intel manual and understand what they actually do. Specifically, the vinsert* family of instructions take three registers and an immediate which decide how the bits should be shuffled around, but without this change the method only allowed two of the registers to be specified, and the immediate was hard-coded to 0x01. Bug: https://bugs.openjdk.java.net/browse/JDK-8151002 Webrev: http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.00/webrev/ Special thanks to Mike Berg for helping discuss, co-develop, and test the change! 
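For readers less familiar with this instruction family: per the Intel SDM, the imm8 operand of, e.g., VINSERTF128 selects which 128-bit lane of the destination receives the 128-bit source, with the other lane copied from the first source. A small software model of that behavior follows (purely illustrative; this is not the HotSpot assembler code under review):

```java
import java.util.Arrays;

public class VInsert128Model {
    // Models VINSERTF128 dst, src1, src2, imm8: dst = src1 (four 64-bit
    // quadwords = 256 bits) with 128-bit lane imm8&1 replaced by src2.
    static long[] vinsertf128(long[] src1, long[] src2, int imm8) {
        long[] dst = src1.clone();
        int lane = imm8 & 1;          // only bit 0 of imm8 is used
        dst[2 * lane] = src2[0];
        dst[2 * lane + 1] = src2[1];
        return dst;
    }

    public static void main(String[] args) {
        long[] a = {1, 2, 3, 4};      // 256-bit first source
        long[] b = {9, 9};            // 128-bit second source
        System.out.println(Arrays.toString(vinsertf128(a, b, 0))); // [9, 9, 3, 4]
        System.out.println(Arrays.toString(vinsertf128(a, b, 1))); // [1, 2, 9, 9]
    }
}
```

With the old hard-coded imm8 of 0x01, only the high-lane form was expressible; exposing imm8 makes both forms available, which is what the change above provides.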
Cheers, Mikael From vivek.r.deshpande at intel.com Wed Mar 2 01:24:37 2016 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Wed, 2 Mar 2016 01:24:37 +0000 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <3A0386B6-8072-4D84-8AF7-01D904DAEADF@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> <56D4ED29.1050108@oracle.com> <3A0386B6-8072-4D84-8AF7-01D904DAEADF@oracle.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A56A3AF1F@ORSMSX106.amr.corp.intel.com> Hi Vladimir, Christian I have updated the code according to your suggestion of the file name change. The updated webrev is at this location: http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.02/ Please let me know if I have to do anything more. Also, is there any change required to the Makefile so that configurations.xml has the name of the added file? 
> > Thanks, > Vladimir > > On 2/29/16 4:42 PM, Christian Thalinger wrote: >> >>> On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R wrote: >>> >>> Hi Christian >>> >>> We used the SHA Extension >>> implementations(https://software.intel.com/en-us/articles/intel-sha-extensions-implementations) for the JVM implementation of SHA1 and SHA256. >> >> Will that extension only be available on Intel chips? >> >>> It needed to have Intel copyright, so created a separate file. >> >> That is reasonable. >> >>> The white paper for the implementation is https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. >>> >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>> Sent: Monday, February 29, 2016 1:58 PM >>> To: Deshpande, Vivek R >>> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>> enabling >>> >>> Why is the new file called macroAssembler_intel_x86.cpp? >>> >>>> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R wrote: >>>> >>>> HI Vladimir >>>> >>>> Thank you for your review. >>>> I have updated the patch with the changes you have suggested. >>>> The new webrev is at this location: >>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >>>> >>>> Regards >>>> Vivek >>>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>> Sent: Friday, February 26, 2016 6:50 PM >>>> To: Deshpande, Vivek R; hotspot compiler >>>> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>> enabling >>>> >>>> Very nice, Vivek!!! >>>> >>>> Did you run tests with both 32- and 64-bit VMs? 
>>>> >>>> Small notes: >>>> >>>> In vm_version_x86.hpp spacing are not aligned in next line: >>>> >>>> static bool supports_avxonly() { return ((supports_avx2() || >>>> supports_avx()) && !supports_evex()); } >>>> + static bool supports_sha() { return (_features & CPU_SHA) != 0; } >>>> >>>> Flags setting code in vm_version_x86.cpp should be like this (you can check supports_sha() only once, don't split '} else {' line, set UseSHA false if all intrinsics flags are false (I included UseSHA512Intrinsics for future) ): >>>> >>>> if (supports_sha()) { >>>> if (FLAG_IS_DEFAULT(UseSHA)) { >>>> UseSHA = true; >>>> } >>>> } else if (UseSHA) { >>>> warning("SHA instructions are not available on this CPU"); >>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>> >>>> if (UseSHA) { >>>> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); >>>> } >>>> } else if (UseSHA1Intrinsics) { >>>> warning("Intrinsics for SHA-1 crypto hash functions not available on this CPU."); >>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); } >>>> >>>> if (UseSHA) { >>>> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); >>>> } >>>> } else if (UseSHA256Intrinsics) { >>>> warning("Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU."); >>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); } >>>> >>>> if (UseSHA512Intrinsics) { >>>> warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU."); >>>> FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); } >>>> >>>> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA512Intrinsics)) { >>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>> >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>>>> Hi all >>>>> >>>>> I would like to contribute a patch which optimizesSHA-1 andSHA-256 >>>>> for >>>>> 64 and 32 bitX86architecture using Intel SHA extensions. 
>>>>> >>>>> Could you please review and sponsor this patch. >>>>> >>>>> Bug-id: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>>>> webrev: >>>>> >>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>>>> >>>>> Thanks and regards, >>>>> >>>>> Vivek >>>>> >>> >> From vladimir.kozlov at oracle.com Wed Mar 2 01:29:18 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 1 Mar 2016 17:29:18 -0800 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A56A3AF1F@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> <56D4ED29.1050108@oracle.com> <3A0386B6-8072-4D84-8AF7-01D904DAEADF@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A3AF1F@ORSMSX106.amr.corp.intel.com> Message-ID: <56D641EE.5050203@oracle.com> Looks good to me. I will push if Chris is okay with it. Thanks, Vladimir On 3/1/16 5:24 PM, Deshpande, Vivek R wrote: > Hi Vladimir, Christian > > I have updated the code according your suggestion of file name change. > The updated webrev is at this location: > http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.02/ > Please let me know if I have to do anything more. > Also is there any change required to Makefile so that configurations.xml has name of the added file ? 
> > Regards, > Vivek > > -----Original Message----- > From: Christian Thalinger [mailto:christian.thalinger at oracle.com] > Sent: Monday, February 29, 2016 5:59 PM > To: Vladimir Kozlov > Cc: Deshpande, Vivek R; hotspot compiler; Rukmannagari, Shravya > Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling > > >> On Feb 29, 2016, at 3:15 PM, Vladimir Kozlov wrote: >> >> I am against to have "intel" in a file name. We have macroAssembler_libm_x86_*.cpp files for math intrinsics which does not have intel in name. So I prefer to not have it. I would suggest macroAssembler_sha_x86.cpp. > > I know we already have macroAssembler_libm_x86_*.cpp but macroAssembler_x86_.cpp would be better. > >> You can manipulate when to use it in vm_version_x86.cpp. >> >> Intel Copyright in the file's header is fine. >> >> Code changes are fine now (webrev.01). >> >> Thanks, >> Vladimir >> >> On 2/29/16 4:42 PM, Christian Thalinger wrote: >>> >>>> On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R wrote: >>>> >>>> Hi Christian >>>> >>>> We used the SHA Extension >>>> implementations(https://software.intel.com/en-us/articles/intel-sha-extensions-implementations) for the JVM implementation of SHA1 and SHA256. >>> >>> Will that extension only be available on Intel chips? >>> >>>> It needed to have Intel copyright, so created a separate file. >>> >>> That is reasonable. >>> >>>> The white paper for the implementation is https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. >>>> >>>> Regards, >>>> Vivek >>>> >>>> -----Original Message----- >>>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>>> Sent: Monday, February 29, 2016 1:58 PM >>>> To: Deshpande, Vivek R >>>> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>> enabling >>>> >>>> Why is the new file called macroAssembler_intel_x86.cpp? 
>>>> >>>>> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R wrote: >>>>> >>>>> HI Vladimir >>>>> >>>>> Thank you for your review. >>>>> I have updated the patch with the changes you have suggested. >>>>> The new webrev is at this location: >>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >>>>> >>>>> Regards >>>>> Vivek >>>>> >>>>> -----Original Message----- >>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>>> Sent: Friday, February 26, 2016 6:50 PM >>>>> To: Deshpande, Vivek R; hotspot compiler >>>>> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >>>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>>> enabling >>>>> >>>>> Very nice, Vivek!!! >>>>> >>>>> Did you run tests with both 32- and 64-bit VMs? >>>>> >>>>> Small notes: >>>>> >>>>> In vm_version_x86.hpp spacing are not aligned in next line: >>>>> >>>>> static bool supports_avxonly() { return ((supports_avx2() || >>>>> supports_avx()) && !supports_evex()); } >>>>> + static bool supports_sha() { return (_features & CPU_SHA) != 0; } >>>>> >>>>> Flags setting code in vm_version_x86.cpp should be like this (you can check supports_sha() only once, don't split '} else {' line, set UseSHA false if all intrinsics flags are false (I included UseSHA512Intrinsics for future) ): >>>>> >>>>> if (supports_sha()) { >>>>> if (FLAG_IS_DEFAULT(UseSHA)) { >>>>> UseSHA = true; >>>>> } >>>>> } else if (UseSHA) { >>>>> warning("SHA instructions are not available on this CPU"); >>>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>>> >>>>> if (UseSHA) { >>>>> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); >>>>> } >>>>> } else if (UseSHA1Intrinsics) { >>>>> warning("Intrinsics for SHA-1 crypto hash functions not available on this CPU."); >>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); } >>>>> >>>>> if (UseSHA) { >>>>> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); >>>>> } >>>>> } else if 
(UseSHA256Intrinsics) { >>>>> warning("Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU."); >>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); } >>>>> >>>>> if (UseSHA512Intrinsics) { >>>>> warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU."); >>>>> FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); } >>>>> >>>>> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA512Intrinsics)) { >>>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>>> >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>>>>> Hi all >>>>>> >>>>>> I would like to contribute a patch which optimizes SHA-1 and SHA-256 >>>>>> for >>>>>> 64 and 32 bit X86 architecture using Intel SHA extensions. >>>>>> >>>>>> Could you please review and sponsor this patch. >>>>>> >>>>>> Bug-id: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>>>>> webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>>>>> >>>>>> Thanks and regards, >>>>>> >>>>>> Vivek >>>>>> >>>> >>> > From vladimir.x.ivanov at oracle.com Wed Mar 2 07:36:04 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 2 Mar 2016 10:36:04 +0300 Subject: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions In-Reply-To: <56D63301.9050909@oracle.com> References: <56D63301.9050909@oracle.com> Message-ID: <56D697E4.8060104@oracle.com> Nice cleanup, Mikael! src/cpu/x86/vm/assembler_x86.hpp: Outdated comments: // Copy low 128bit into high 128bit of YMM registers. // Load/store high 128bit of YMM registers which does not destroy other half. // Copy low 256bit into high 256bit of ZMM registers. src/cpu/x86/vm/assembler_x86.cpp: ! emit_int8(imm8 & 0x01); Maybe additionally assert valid imm8 range? Maybe keep vinsert*h variants and move them to MacroAssembler? 
They look clearer in some contexts: - __ vextractf128h(Address(rsp, base_addr+n*16), as_XMMRegister(n)); + __ vextractf128(Address(rsp, base_addr+n*16), as_XMMRegister(n), 1); Otherwise, looks good. Best regards, Vladimir Ivanov On 3/2/16 3:25 AM, Mikael Vidstedt wrote: > > Please review the following change which updates the various vextract* > and vinsert* methods in assembler_x86 & macroAssembler_x86 to better > match the real HW instructions, which also has the benefit of providing > the full functionality/flexibility of the instructions where earlier > only some specific modes were supported. Put differently, with this > change it's much easier to correlate the methods to the Intel manual and > understand what they actually do. > > Specifically, the vinsert* family of instructions take three registers > and an immediate which decide how the bits should be shuffled around, > but without this change the method only allowed two of the registers to > be specified, and the immediate was hard-coded to 0x01. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8151002 > Webrev: > http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.00/webrev/ > > Special thanks to Mike Berg for helping discuss, co-develop, and test > the change! > > Cheers, > Mikael > From vladimir.x.ivanov at oracle.com Wed Mar 2 07:39:39 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 2 Mar 2016 10:39:39 +0300 Subject: RFR (S) 8150669: C1 intrinsic for Class.isPrimitive In-Reply-To: <56D5F634.9020105@oracle.com> References: <56CF6A7E.7080204@oracle.com> <56D4BF23.2070308@oracle.com> <4AAA9FF6-99A5-47F4-B707-0C6E0CB2D3BC@oracle.com> <56D4CD42.2050101@oracle.com> <56D569CC.70406@oracle.com> <56D5F634.9020105@oracle.com> Message-ID: <56D698BB.4050505@oracle.com> Looks good! 
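For context, the behavior the JDK-8150669 intrinsic must preserve is observable from plain Java: `Class.isPrimitive()` is true exactly for the nine primitive type mirrors, whose `java.lang.Class` instances carry no Klass pointer (the NULL `_klass_offset` check quoted above). A minimal standalone sanity check, illustrative only and not part of the webrev:

```java
public class IsPrimitiveCheck {
    public static void main(String[] args) {
        // Primitive type mirrors (int.class, void.class, ...) have no
        // associated Klass*, so isPrimitive() returns true for them.
        System.out.println(int.class.isPrimitive());     // true
        System.out.println(void.class.isPrimitive());    // true
        // Wrapper classes and array classes are ordinary classes.
        System.out.println(Integer.class.isPrimitive()); // false
        System.out.println(int[].class.isPrimitive());   // false
    }
}
```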
Best regards, Vladimir Ivanov On 3/1/16 11:06 PM, Aleksey Shipilev wrote: > Hi Vladimir, > > Fixed all your findings in a new webrev: > http://cr.openjdk.java.net/~shade/8150669/webrev.03/ > > Passes JPRT -testset hotspot; microbenchmarks; new test. > > On 03/01/2016 01:07 PM, Vladimir Ivanov wrote: >> I have a general question: why did you decide to intrinsify the method >> into a native call? Class::is_primitive looks pretty trivial to >> translate it right into machine code: >> bool java_lang_Class::is_primitive(oop java_class) { >> bool is_primitive = (java_class->metadata_field(_klass_offset) == >> NULL); >> ... >> return is_primitive; >> } > > Yeah, previous version was just a stripped-down isInstance intrinsic > code. But you are right, we can just read class data without messing > with runtime calls, which improves performance even without canonicalizing: > http://cr.openjdk.java.net/~shade/8150669/notes.txt > > Cheers, > -Aleksey > > From vladimir.x.ivanov at oracle.com Wed Mar 2 07:42:48 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 2 Mar 2016 10:42:48 +0300 Subject: RFR (S) 8146801: Allocating short arrays of non-constant size is slow In-Reply-To: <56D60DAA.2010401@oracle.com> References: <56D4B1F7.4040201@oracle.com> <56D4E9F8.4070303@oracle.com> <56D58F62.80102@oracle.com> <56D5CB5C.4050203@oracle.com> <56D60DAA.2010401@oracle.com> Message-ID: <56D69978.2070304@oracle.com> Looks good. Best regards, Vladimir Ivanov On 3/2/16 12:46 AM, Aleksey Shipilev wrote: > On 03/01/2016 08:03 PM, Vladimir Kozlov wrote: >> Do you have new performance numbers? I hope it did not regress with new >> code. > > It does not regress, the code is tight: > http://cr.openjdk.java.net/~shade/8146801/notes.txt > >> 2 things left I fill should be addressed. > > Both are fixed here: > http://cr.openjdk.java.net/~shade/8146801/webrev.06/ > > Still passes JPRT -testset hotspot; RBT run is in progress. 
> > Cheers, > -Aleksey > From tobias.hartmann at oracle.com Wed Mar 2 08:51:17 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 2 Mar 2016 09:51:17 +0100 Subject: JIT stops compiling after a while (java 8u45) In-Reply-To: References: <1456786793750-259603.post@n7.nabble.com> <56D56E8E.5060402@oracle.com> <56D58C74.1020406@oracle.com> <56D59669.7030200@oracle.com> Message-ID: <56D6A985.4020800@oracle.com> Hi Martin, On 01.03.2016 20:18, Martin Traverso wrote: >> For real world applications I hope that this is a much smaller issue but if you must load and execute loads and loads of short lived classes then it might be reasonable to disable concurrent class unloading (at the cost of getting serial Full gcs instead). > > Unfortunately, this is not a theoretical issue for us. We see this problem running Presto (http://prestodb.io), which generates bytecode for every query it processes. For now, we're working around it with a background thread that watches the size of the code cache and calls System.gc() when it gets close to the max (https://github.com/facebook/presto/commit/91e1b3bb6bbfffc62401025a24231cd388992d7c). Okay, I changed JDK-8023191 from enhancement to bug and set fix version to 9. We can then backport this to 8u. Best regards, Tobias > > Martin > > > > On Tue, Mar 1, 2016 at 5:17 AM, Mikael Gerdin > wrote: > > Hi, > > On 2016-03-01 13:35, Tobias Hartmann wrote: > > Hi, > > is just had a another look and it turned out that even with 8u40+ class unloading is triggered. I missed that because it happens *much* later (compared to 8u33) when the code cache already filled up and compilation is disabled. At this point we don't recover because new classes are loaded and new OSR nmethods are compiled rapidly. > > Summary: > The code cache fills up due to OSR nmethods that are not being flushed. 
With 8u33 and earlier, G1 did more aggressive class unloading (probably due to more allocations or different heuristics) and this allowed the sweeper to flush enough OSR nmethods to continue compilation. With 8u40 and later, class unloading happens long after the code cache is full. > > > Before 8u40 G1 could only unload classes at Full GCs. > After 8u40 G1 can unload classes at the end of a concurrent GC cycle, avoiding Full GC. > > If you run the test with CMS with +CMSClassUnloadingEnabled you will probably see similar problematic results since the class unloading in G1 is very similar to the one in CMS. > I haven't investigated in depth why the classes do not get unloaded in the G1 and CMS cases but there are several known quirks with how concurrent class unloading behaves which causes them to unload classes later than the serial Full GC. > > Running G1 with -XX:-ClassUnloadingWithConcurrentMark > or CMS with -XX:-CMSClassUnloadingEnabled > disables concurrent class unloading completely and works around the issue you are seeing. > > For real world applications I hope that this is a much smaller issue but if you must load and execute loads and loads of short lived classes then it might be reasonable to disable concurrent class unloading (at the cost of getting serial Full gcs instead). > > > > I think we should fix this by flushing "cold" OSR nmethods as well (JDK-8023191). Thomas Schatzl mentioned that we could also trigger a concurrent mark if the code cache is full and hope that some classes are unloaded but I'm afraid this is too invasive (and does not help much in the general case). > > > If it is possible to flush OSR nmethods without doing a full class unloading cycle then I think that path is prefereable. > > /Mikael > > > > Opinions? > > Best regards, > Tobias > > On 01.03.2016 11:27, Tobias Hartmann wrote: > > Hi Nileema, > > thanks for reporting this issue! > > CC'ing the GC team because this seems to be a GC issue (see evaluation below). 
> > On 29.02.2016 23:59, nileema wrote: > > We are seeing an issue with the CodeCache becoming full which causes the > compiler to be disabled in jdk-8u45 to jdk-8u72. > > We had seen a similar issue in Java7 (old issue: > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-August/011333.html). > This issue went away with earlier versions of Java 8. > > > Reading the old conversation, I'm wondering if this could again be a problem with OSR nmethods that are not flushed? The bug (JDK-8023191) is still open - now assigned to me. > > Doing a quick experiment, it looks like we mostly compile OSR methods: > 22129 2137 % 3 Runnable_412::run @ 4 (31 bytes) > 22130 2189 % 4 Runnable_371::run @ 4 (31 bytes) > 22134 2129 % 3 Runnable_376::run @ 4 (31 bytes) > 22136 2109 % 3 Runnable_410::run @ 4 (31 bytes) > > Currently, OSR nmethods are not flushed just because the code cache is full but only if the nmethod becomes invalid (class loading/unloading, uncommon trap, ..) > > With your test, class unloading should happen and therefore the OSR nmethods *should* be flushed. > > We used the test http://github.com/martint/jittest to compare the behavior > of jdk-8u25 and jdk-8u45. For this test, we did not see any CodeCache full > messages with jdk-8u25 but did see them with 8u45+ (8u60 and 8u74) > Test results comparing 8u25, 8u45 and 8u74: > https://gist.github.com/nileema/6fb667a215e95919242f > > In the results you can see that 8u25 starts collecting the code cache much > sooner than 8u45. 8u45 very quickly hits the limit for code cache. If we > force a full gc when it is about to hit the code cache limit, we see the > code cache size go down. 
> > > You can use the following flags to get additional information: > -XX:CICompilerCount=1 -XX:+PrintCompilation -XX:+PrintMethodFlushing -XX:+TraceClassUnloading > > I did some more experiments with 8u45: > > java -mx20g -ms20g -XX:ReservedCodeCacheSize=20m -XX:+TraceClassUnloading -XX:+UseG1GC -jar jittest-1.0-SNAPSHOT-standalone.jar | grep "Unloading" > -> We do *not* unload any classes. The code cache fills up with OSR nmethods that are not flushed. > > Removing the -XX:+UseG1GC flag solves the issue: > > java -mx20g -ms20g -XX:ReservedCodeCacheSize=20m -XX:+TraceClassUnloading -jar jittest-1.0-SNAPSHOT-standalone.jar | grep Unloading > -> Prints plenty of [Unloading class Runnable_40 0x00000007c0086028] messages and the code cache does not fill up. > -> OSR nmethods are flushed because the classes are unloaded: > 21670 970 % 4 Runnable_87::run @ -2 (31 bytes) made zombie > > The log files look good: > > 1456825330672 112939 10950016 10195496 111.28 > 1456825331675 118563 11432256 10467176 112.41 > 1456825332678 125935 11972928 10778432 115.72 > [Unloading class Runnable_2498 0x00000007c0566028] > ... > [Unloading class Runnable_34 0x00000007c0082028] > 1456825333684 131493 10220608 5382976 117.46 > 1456825334688 137408 10359296 5636120 116.81 > 1456825335692 143593 7635136 5914624 114.21 > > After the code cache fills up, we unload classes and therefore flush methods and start over again. > > I checked for several releases if classes are unloaded: > - 8u27: success > - 8u33: success > - 8u40: fail > - 8u45: fail > - 8u76: fail > > The regression was introduced in 8u40. > > I also tried with the latest JDK 9 build and it fails as well (had to change the bean name from "Code Cache" to "CodeCache" and run with -XX:-SegmentedCodeCache). Again, -XX:-UseG1GC -XX:+UseParallelGC solves the problem. > > Can someone from the GC team have a look? > > Is this a known issue? > > > I'm not aware of any related issue. > > Best regards, > Tobias > > Thanks! 
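As an aside, the code-cache watchdog workaround mentioned earlier in this thread (watch code-cache occupancy from a background thread and call System.gc() near the limit) can be sketched with the standard management beans. This is an illustrative sketch, not Presto's actual code: the pool-name matching and the 90% threshold are assumptions, and pool names differ across JDK versions ("Code Cache" on 8, "CodeCache" on 9 with -XX:-SegmentedCodeCache, "CodeHeap '...'" pools otherwise, as noted above).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class CodeCacheWatcher {
    /** Fraction of the code cache currently used, or -1.0 if no pool matched. */
    static double codeCacheUsage() {
        long used = 0, max = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Match "Code Cache" / "CodeCache" / segmented "CodeHeap '...'" pools.
            if (name.replace(" ", "").equals("CodeCache") || name.startsWith("CodeHeap")) {
                MemoryUsage u = pool.getUsage();
                if (u.getMax() > 0) {        // only count pools with a defined max
                    used += u.getUsed();
                    max  += u.getMax();
                }
            }
        }
        return max > 0 ? (double) used / max : -1.0;
    }

    public static void main(String[] args) {
        double ratio = codeCacheUsage();
        System.out.println("code cache usage: " + ratio);
        if (ratio > 0.9) {
            // Last-resort nudge: a full GC can unload dead classes, which in
            // turn lets the sweeper flush their nmethods and free cache space.
            System.gc();
        }
    }
}
```

In a real deployment this check would run periodically on a daemon thread rather than once in main().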
> > Nileema > > > > -- > View this message in context: http://openjdk.5641.n7.nabble.com/JIT-stops-compiling-after-a-while-java-8u45-tp259603.html > Sent from the OpenJDK Hotspot Compiler Development List mailing list archive at Nabble.com. > > From aleksey.shipilev at oracle.com Wed Mar 2 09:02:22 2016 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Wed, 2 Mar 2016 12:02:22 +0300 Subject: RFR (XS) 8151017: [TESTBUG] test/compiler/c1/CanonicalizeArrayLength does not work on product builds Message-ID: <56D6AC1E.1000701@oracle.com> Hi, Please review the fix for a testbug: https://bugs.openjdk.java.net/browse/JDK-8151017 Webrev: http://cr.openjdk.java.net/~shade/8151017/webrev.01/ Test uses develop/diagnostic VM options: ScavengeRootsInCode and PatchALot, and we need to preceed the test configs with -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions to make it work on product bits. The test passes now on Linux x86_64/release. Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From tobias.hartmann at oracle.com Wed Mar 2 09:04:15 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 2 Mar 2016 10:04:15 +0100 Subject: RFR (XS) 8151017: [TESTBUG] test/compiler/c1/CanonicalizeArrayLength does not work on product builds In-Reply-To: <56D6AC1E.1000701@oracle.com> References: <56D6AC1E.1000701@oracle.com> Message-ID: <56D6AC8F.1070905@oracle.com> Hi Aleksey, On 02.03.2016 10:02, Aleksey Shipilev wrote: > Hi, > > Please review the fix for a testbug: > https://bugs.openjdk.java.net/browse/JDK-8151017 > > Webrev: > http://cr.openjdk.java.net/~shade/8151017/webrev.01/ > > Test uses develop/diagnostic VM options: ScavengeRootsInCode and > PatchALot, and we need to preceed the test configs with > -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions to make > it work on product bits. Looks good. Best regards, Tobias > > The test passes now on Linux x86_64/release. > > Thanks, > -Aleksey > From zoltan.majo at oracle.com Wed Mar 2 09:07:49 2016 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Wed, 2 Mar 2016 10:07:49 +0100 Subject: RFR (XS) 8151017: [TESTBUG] test/compiler/c1/CanonicalizeArrayLength does not work on product builds In-Reply-To: <56D6AC1E.1000701@oracle.com> References: <56D6AC1E.1000701@oracle.com> Message-ID: <56D6AD65.2030409@oracle.com> Looks good to me as well. Thank you and best regards, Zoltan On 03/02/2016 10:02 AM, Aleksey Shipilev wrote: > Hi, > > Please review the fix for a testbug: > https://bugs.openjdk.java.net/browse/JDK-8151017 > > Webrev: > http://cr.openjdk.java.net/~shade/8151017/webrev.01/ > > Test uses develop/diagnostic VM options: ScavengeRootsInCode and > PatchALot, and we need to preceed the test configs with > -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions to make > it work on product bits. > > The test passes now on Linux x86_64/release. 
> > Thanks, > -Aleksey > From uschindler at apache.org Wed Mar 2 09:08:48 2016 From: uschindler at apache.org (Uwe Schindler) Date: Wed, 2 Mar 2016 10:08:48 +0100 Subject: JIT stops compiling after a while (java 8u45) In-Reply-To: <56D6A985.4020800@oracle.com> References: <1456786793750-259603.post@n7.nabble.com> <56D56E8E.5060402@oracle.com> <56D58C74.1020406@oracle.com> <56D59669.7030200@oracle.com> <56D6A985.4020800@oracle.com> Message-ID: <010d01d17463$25053e60$6f0fbb20$@apache.org> Hi, > >> For real world applications I hope that this is a much smaller issue but if > you must load and execute loads and loads of short lived classes then it might > be reasonable to disable concurrent class unloading (at the cost of getting > serial Full gcs instead). > > > > Unfortunately, this is not a theoretical issue for us. We see this problem > running Presto (http://prestodb.io), which generates bytecode for every > query it processes. For now, we're working around it with a background > thread that watches the size of the code cache and calls System.gc() when it > gets close to the max > (https://github.com/facebook/presto/commit/91e1b3bb6bbfffc62401025a24 > 231cd388992d7c). > > Okay, I changed JDK-8023191 from enhancement to bug and set fix version to > 9. We can then backport this to 8u. Hi many thanks for taking care! Apache Lucene's Expressions module (heavily used by Elasticsearch) also compiles small Java classes to execute custom document scoring operations that the user can supply, similar to Presto's SQL, as a Javascript-like formula that can work on arbitrary static double-parameter/double returning functions. These classes are loaded in a separate ClassLoader used solely for a single class and thrown away afterwards. In older Java versions GC was perfectly throwing away those classes... 
Compiler & Classloader: https://github.com/apache/lucene-solr/blob/master/lucene/expressions/src/java/org/apache/lucene/expressions/js/JavascriptCompiler.java See tests like: https://github.com/apache/lucene-solr/tree/master/lucene/expressions/src/test/org/apache/lucene/expressions/js Uwe From tobias.hartmann at oracle.com Wed Mar 2 09:33:28 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 2 Mar 2016 10:33:28 +0100 Subject: JIT stops compiling after a while (java 8u45) In-Reply-To: <010d01d17463$25053e60$6f0fbb20$@apache.org> References: <1456786793750-259603.post@n7.nabble.com> <56D56E8E.5060402@oracle.com> <56D58C74.1020406@oracle.com> <56D59669.7030200@oracle.com> <56D6A985.4020800@oracle.com> <010d01d17463$25053e60$6f0fbb20$@apache.org> Message-ID: <56D6B368.2010600@oracle.com> Hi Uwe, On 02.03.2016 10:08, Uwe Schindler wrote: > Hi, > >>>> For real world applications I hope that this is a much smaller issue but if >> you must load and execute loads and loads of short lived classes then it might >> be reasonable to disable concurrent class unloading (at the cost of getting >> serial Full gcs instead). >>> >>> Unfortunately, this is not a theoretical issue for us. We see this problem >> running Presto (http://prestodb.io), which generates bytecode for every >> query it processes. For now, we're working around it with a background >> thread that watches the size of the code cache and calls System.gc() when it >> gets close to the max >> (https://github.com/facebook/presto/commit/91e1b3bb6bbfffc62401025a24 >> 231cd388992d7c). >> >> Okay, I changed JDK-8023191 from enhancement to bug and set fix version to >> 9. We can then backport this to 8u. > > Hi many thanks for taking care! 
> > Apache Lucene's Expressions module (heavily used by Elasticsearch) also compiles small Java classes to execute custom document scoring operations that the user can supply, similar to Presto's SQL, as a Javascript-like formula that can work on arbitrary static double-parameter/double returning functions. These classes are loaded in a separate ClassLoader used solely for a single class and thrown away afterwards. In older Java versions GC was perfectly throwing away those classes... > > Compiler & Classloader: > https://github.com/apache/lucene-solr/blob/master/lucene/expressions/src/java/org/apache/lucene/expressions/js/JavascriptCompiler.java > > See tests like: > https://github.com/apache/lucene-solr/tree/master/lucene/expressions/src/test/org/apache/lucene/expressions/js Please note that JDK-8023191 only takes care of flushing of unused OSR nmethods. This should help if the code cache fills up due to OSR compilations but it's not about unloading of classes. Class unloading just helps because it forces flushing right away. As Mikael wrote, unloading happens later with concurrent class unloading. This shouldn't be a problem for your application as long as the code cache does not fill up, right? Best regards, Tobias From aleksey.shipilev at oracle.com Wed Mar 2 09:43:12 2016 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Wed, 2 Mar 2016 12:43:12 +0300 Subject: RFR (XS) 8151017: [TESTBUG] test/compiler/c1/CanonicalizeArrayLength does not work on product builds In-Reply-To: <56D6AD65.2030409@oracle.com> References: <56D6AC1E.1000701@oracle.com> <56D6AD65.2030409@oracle.com> Message-ID: <56D6B5B0.6050008@oracle.com> Thanks Zoltan and Tobias, the fix is on its way to hs-comp. -Aleksey On 03/02/2016 12:07 PM, Zoltán Majó wrote: > Looks good to me as well. 
> > Thank you and best regards, > > > Zoltan > > On 03/02/2016 10:02 AM, Aleksey Shipilev wrote: >> Hi, >> >> Please review the fix for a testbug: >> https://bugs.openjdk.java.net/browse/JDK-8151017 >> >> Webrev: >> http://cr.openjdk.java.net/~shade/8151017/webrev.01/ >> >> Test uses develop/diagnostic VM options: ScavengeRootsInCode and >> PatchALot, and we need to preceed the test configs with >> -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions to make >> it work on product bits. >> >> The test passes now on Linux x86_64/release. >> >> Thanks, >> -Aleksey >> > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From aleksey.shipilev at oracle.com Wed Mar 2 09:44:44 2016 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Wed, 2 Mar 2016 12:44:44 +0300 Subject: RFR (S) 8146801: Allocating short arrays of non-constant size is slow In-Reply-To: <56D61A77.6080509@oracle.com> References: <56D4B1F7.4040201@oracle.com> <56D4E9F8.4070303@oracle.com> <56D58F62.80102@oracle.com> <56D5CB5C.4050203@oracle.com> <56D60DAA.2010401@oracle.com> <56D61A77.6080509@oracle.com> Message-ID: <56D6B60C.9050102@oracle.com> Thanks Vladimir and Vladimir for reviews! RBT hotspot/test/:hotspot_all,vm.runtime.testlist,vm.compiler.testlist came clean, so I would push as soon as gatekeeper opens the flood gates. -Aleksey On 03/02/2016 01:40 AM, Vladimir Kozlov wrote: > Perfect! > > Thanks, > Vladimir > > On 3/1/16 1:46 PM, Aleksey Shipilev wrote: >> On 03/01/2016 08:03 PM, Vladimir Kozlov wrote: >>> Do you have new performance numbers? I hope it did not regress with new >>> code. >> >> It does not regress, the code is tight: >> http://cr.openjdk.java.net/~shade/8146801/notes.txt >> >>> 2 things left I fill should be addressed. 
>> >> Both are fixed here: >> http://cr.openjdk.java.net/~shade/8146801/webrev.06/ >> >> Still passes JPRT -testset hotspot; RBT run is in progress. >> >> Cheers, >> -Aleksey >> -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From aleksey.shipilev at oracle.com Wed Mar 2 09:47:09 2016 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Wed, 2 Mar 2016 12:47:09 +0300 Subject: RFR (S) 8150669: C1 intrinsic for Class.isPrimitive In-Reply-To: <56D698BB.4050505@oracle.com> References: <56CF6A7E.7080204@oracle.com> <56D4BF23.2070308@oracle.com> <4AAA9FF6-99A5-47F4-B707-0C6E0CB2D3BC@oracle.com> <56D4CD42.2050101@oracle.com> <56D569CC.70406@oracle.com> <56D5F634.9020105@oracle.com> <56D698BB.4050505@oracle.com> Message-ID: <56D6B69D.1060105@oracle.com> Thanks Christian and Vladimir for reviews! For this patch as well, RBT hotspot/test/:hotspot_all,vm.runtime.testlist,vm.compiler.testlist is clean, will push once gatekeeper greenlights the integration. -Aleksey On 03/02/2016 10:39 AM, Vladimir Ivanov wrote: > Looks good! > > Best regards, > Vladimir Ivanov > > On 3/1/16 11:06 PM, Aleksey Shipilev wrote: >> Hi Vladimir, >> >> Fixed all your findings in a new webrev: >> http://cr.openjdk.java.net/~shade/8150669/webrev.03/ >> >> Passes JPRT -testset hotspot; microbenchmarks; new test. >> >> On 03/01/2016 01:07 PM, Vladimir Ivanov wrote: >>> I have a general question: why did you decide to intrinsify the method >>> into a native call? Class::is_primitive looks pretty trivial to >>> translate it right into machine code: >>> bool java_lang_Class::is_primitive(oop java_class) { >>> bool is_primitive = (java_class->metadata_field(_klass_offset) == >>> NULL); >>> ... >>> return is_primitive; >>> } >> >> Yeah, previous version was just a stripped-down isInstance intrinsic >> code. 
But you are right, we can just read class data without messing >> with runtime calls, which improves performance even without >> canonicalizing: >> http://cr.openjdk.java.net/~shade/8150669/notes.txt >> >> Cheers, >> -Aleksey >> >> -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From uschindler at apache.org Wed Mar 2 10:03:23 2016 From: uschindler at apache.org (Uwe Schindler) Date: Wed, 2 Mar 2016 11:03:23 +0100 Subject: JIT stops compiling after a while (java 8u45) In-Reply-To: <56D6B368.2010600@oracle.com> References: <1456786793750-259603.post@n7.nabble.com> <56D56E8E.5060402@oracle.com> <56D58C74.1020406@oracle.com> <56D59669.7030200@oracle.com> <56D6A985.4020800@oracle.com> <010d01d17463$25053e60$6f0fbb20$@apache.org> <56D6B368.2010600@oracle.com> Message-ID: <012c01d1746a$c3fd2e40$4bf78ac0$@apache.org> Hi Tobias, > >>>> For real world applications I hope that this is a much smaller issue but if > >> you must load and execute loads and loads of short lived classes then it > might > >> be reasonable to disable concurrent class unloading (at the cost of getting > >> serial Full gcs instead). > >>> > >>> Unfortunately, this is not a theoretical issue for us. We see this problem > >> running Presto (http://prestodb.io), which generates bytecode for every > >> query it processes. For now, we're working around it with a background > >> thread that watches the size of the code cache and calls System.gc() when > it > >> gets close to the max > >> > (https://github.com/facebook/presto/commit/91e1b3bb6bbfffc62401025a24 > >> 231cd388992d7c). > >> > >> Okay, I changed JDK-8023191 from enhancement to bug and set fix > version to > >> 9. We can then backport this to 8u. > > > > Hi many thanks for taking care! 
> > > > Apache Lucene's Expressions module (heavily used by Elasticsearch) also > compiles small Java classes to execute custom document scoring operations > that the user can supply, similar to Presto's SQL, as a Javascript-like formula > that can work on arbitrary static double-parameter/double returning > functions. These classes are loaded in a separate ClassLoader used solely for > a single class and thrown away afterwards. In older Java versions GC was > perfectly throwing away those classes... > > > > Compiler & Classloader: > > https://github.com/apache/lucene- > solr/blob/master/lucene/expressions/src/java/org/apache/lucene/expressi > ons/js/JavascriptCompiler.java > > > > See tests like: > > https://github.com/apache/lucene- > solr/tree/master/lucene/expressions/src/test/org/apache/lucene/expressio > ns/js > > Please note that JDK-8023191 only takes care of flushing of unused OSR > nmethods. This should help if the code cache fills up due to OSR compilations > but it's not about unloading of classes. Class unloading just helps because it > forces flushing right away. Thanks! This is how it was designed: The classloader only lives for very short time and the compiled expression gets thrown away after fulltext query execution. This works quite good. I just have seen the issue on this mailing list and I wanted to start some test with recent Java 8 VMs if everything is still alright. Martin Traverso's post alarmed me, because he said that they generate bytecode for every query (which is similar in our case). The main difference is: Our compiled methods are just plain simple mathematical formulas which are represented to the caller by a functional interface. We have no loops in the generated bytecode. The Lucene scoring algorithm just calls the compiled expression bytecode while processing results to calculate score (and depending on the number of search engine results collected, this can be several million times per query execution). 
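The one-class-per-loader pattern Uwe describes can be illustrated with a short standalone sketch. This is not Lucene's JavascriptCompiler: to stay self-contained it re-reads this demo class's own bytecode from the classpath instead of generating fresh bytecode, and all names here are illustrative.

```java
import java.io.InputStream;

public class ThrowawayLoaderDemo {
    // One loader per generated class; dropping the loader makes the class
    // eligible for unloading (and its nmethods for flushing) at a later GC.
    static final class ThrowawayLoader extends ClassLoader {
        ThrowawayLoader() { super(null); } // no parent: only this loader keeps the class alive
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for freshly generated bytecode: this class's own .class file.
        byte[] bytes;
        try (InputStream in = ThrowawayLoaderDemo.class
                .getResourceAsStream("ThrowawayLoaderDemo.class")) {
            bytes = in.readAllBytes();
        }
        Class<?> copy = new ThrowawayLoader().define("ThrowawayLoaderDemo", bytes);
        System.out.println(copy.getName());                    // ThrowawayLoaderDemo
        System.out.println(copy == ThrowawayLoaderDemo.class); // false: distinct runtime class
        // Once 'copy' and its loader become unreachable, the class may be
        // unloaded; how promptly that happens (full GC vs. concurrent mark)
        // is exactly the behavior difference discussed in this thread.
    }
}
```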
> As Mikael wrote, unloading happens later with concurrent class unloading. > This shouldn't be a problem for your application as long as the code cache > does not fill up, right? As said before, it should not fill up - as we have no loops in those expressions. I was just alarmed and wanted to start testing. We checked the whole stuff with Java 6 and Java 7 (when it was written), but have not done extensive testing of garbage collection with Java 8. But there were also no bug reports... > Best regards, > Tobias Uwe From vladimir.x.ivanov at oracle.com Wed Mar 2 11:51:54 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 2 Mar 2016 14:51:54 +0300 Subject: [9] RFR (XS): JDK-8151020: [TESTBUG] UnsafeGetStableArrayElement::testL_* fail intermittently Message-ID: <56D6D3DA.7050801@oracle.com> http://cr.openjdk.java.net/~vlivanov/8151020/webrev.00 https://bugs.openjdk.java.net/browse/JDK-8151020 Some cases in UnsafeGetStableArrayElement are not reliable: there's no way to ensure that only some part of an OOP is 0 or not. The fix is just to run the tests (hoping to trigger an assertion if anything goes wrong) w/o checking the results. Testing: ensured that the test still works. Thanks! Best regards, Vladimir Ivanov From aleksey.shipilev at oracle.com Wed Mar 2 12:07:14 2016 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Wed, 2 Mar 2016 15:07:14 +0300 Subject: [9] RFR (XS): JDK-8151020: [TESTBUG] UnsafeGetStableArrayElement::testL_* fail intermittently In-Reply-To: <56D6D3DA.7050801@oracle.com> References: <56D6D3DA.7050801@oracle.com> Message-ID: <56D6D772.9070200@oracle.com> On 03/02/2016 02:51 PM, Vladimir Ivanov wrote: > http://cr.openjdk.java.net/~vlivanov/8151020/webrev.00 > https://bugs.openjdk.java.net/browse/JDK-8151020 > > Some cases in UnsafeGetStableArrayElement are not reliable: there's no > way to ensure that only some part of an OOP is 0 or not. 
So the test fails when "new Object()" OOP has zeros in some bits, and reading a shorter bit stride from that OOP makes the test believe that the field was "unchanged". Bummer. > The fix is just to run the tests (hoping to trigger an assertion if > anything goes wrong) w/o checking the results. Looks good. -Aleksey From vladimir.x.ivanov at oracle.com Wed Mar 2 12:11:50 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 2 Mar 2016 15:11:50 +0300 Subject: [9] RFR (XS): JDK-8151020: [TESTBUG] UnsafeGetStableArrayElement::testL_* fail intermittently In-Reply-To: <56D6D772.9070200@oracle.com> References: <56D6D3DA.7050801@oracle.com> <56D6D772.9070200@oracle.com> Message-ID: <56D6D886.3030207@oracle.com> Thanks, Aleksey. >> Some cases in UnsafeGetStableArrayElement are not reliable: there's no >> way to ensure that only some part of an OOP is 0 or not. > > So the test fails when "new Object()" OOP has zeros in some bits, and > reading a shorter bit stride from that OOP makes the test believe that the > field was "unchanged". Bummer. Moreover, "important bits" can intermittently become 0 during execution (e.g. depending on the particular object location after some GC). So, it is really hard to reliably predict what value the compiler reads. Best regards, Vladimir Ivanov From zoltan.majo at oracle.com Wed Mar 2 12:23:30 2016 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Wed, 2 Mar 2016 13:23:30 +0100 Subject: [9] RFR (XS): JDK-8151020: [TESTBUG] UnsafeGetStableArrayElement::testL_* fail intermittently In-Reply-To: <56D6D3DA.7050801@oracle.com> References: <56D6D3DA.7050801@oracle.com> Message-ID: <56D6DB42.7060702@oracle.com> Looks good to me. Thank you for fixing this!
Best regards, Zoltan On 03/02/2016 12:51 PM, Vladimir Ivanov wrote: > http://cr.openjdk.java.net/~vlivanov/8151020/webrev.00 > https://bugs.openjdk.java.net/browse/JDK-8151020 > > Some cases in UnsafeGetStableArrayElement are not reliable: there's no > way to ensure that only some part of an OOP is 0 or not. > > The fix is just to run the tests (hoping to trigger an assertion if > anything goes wrong) w/o checking the results. > > Testing: ensured that the test still works. > > Thanks! > > Best regards, > Vladimir Ivanov From vladimir.x.ivanov at oracle.com Wed Mar 2 12:25:41 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 2 Mar 2016 15:25:41 +0300 Subject: [9] RFR (XS): JDK-8151020: [TESTBUG] UnsafeGetStableArrayElement::testL_* fail intermittently In-Reply-To: <56D6DB42.7060702@oracle.com> References: <56D6D3DA.7050801@oracle.com> <56D6DB42.7060702@oracle.com> Message-ID: <56D6DBC5.6090909@oracle.com> Zoltan, Aleksey, thanks for the prompt reviews. Best regards, Vladimir Ivanov On 3/2/16 3:23 PM, Zoltán Majó wrote: > Looks good to me. Thank you for fixing this! > > Best regards, > > > Zoltan > > On 03/02/2016 12:51 PM, Vladimir Ivanov wrote: >> http://cr.openjdk.java.net/~vlivanov/8151020/webrev.00 >> https://bugs.openjdk.java.net/browse/JDK-8151020 >> >> Some cases in UnsafeGetStableArrayElement are not reliable: there's no >> way to ensure that only some part of an OOP is 0 or not. >> >> The fix is just to run the tests (hoping to trigger an assertion if >> anything goes wrong) w/o checking the results. >> >> Testing: ensured that the test still works. >> >> Thanks!
>> >> Best regards, >> Vladimir Ivanov > From vladimir.x.ivanov at oracle.com Wed Mar 2 12:32:23 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 2 Mar 2016 15:32:23 +0300 Subject: RFR (S) 8146801: Allocating short arrays of non-constant size is slow In-Reply-To: References: <56D4B1F7.4040201@oracle.com> <56D4E9F8.4070303@oracle.com> <56D58F62.80102@oracle.com> <56D5CB5C.4050203@oracle.com> <56D60DAA.2010401@oracle.com> <56D61A77.6080509@oracle.com> Message-ID: <56D6DD57.4070701@oracle.com> I think it heavily depends on prefetch strategy: product(intx, AllocatePrefetchStyle, 1, "0 = no prefetch, " "1 = prefetch instructions for each allocation, " "2 = use TLAB watermark to gate allocation prefetch, " "3 = use BIS instruction on Sparc for allocation prefetch") Maybe prefetch distance and the number of prefetched lines is too much for small array allocations and it should be treated like an ordinary object allocation: // Generate several prefetch instructions. uint lines = (length != NULL) ? AllocatePrefetchLines : AllocateInstancePrefetchLines; uint step_size = AllocatePrefetchStepSize; uint distance = AllocatePrefetchDistance; But I'm not sure how much such optimization can buy us for the additional complexity in the code. Best regards, Vladimir Ivanov On 3/2/16 1:53 AM, Vitaly Davidovich wrote: > Related question - can the prefetch hints go away for small array > allocations considering size is already being branched on? I've noticed > allocations always come with a prefetch sequence, so perhaps this is > just standard allocation pattern. > > On Tuesday, March 1, 2016, Vladimir Kozlov > wrote: > > Perfect! > > Thanks, > Vladimir > > On 3/1/16 1:46 PM, Aleksey Shipilev wrote: > > On 03/01/2016 08:03 PM, Vladimir Kozlov wrote: > > Do you have new performance numbers? I hope it did not > regress with new > code. 
> > > It does not regress, the code is tight: > http://cr.openjdk.java.net/~shade/8146801/notes.txt > > 2 things left I feel should be addressed. > > > Both are fixed here: > http://cr.openjdk.java.net/~shade/8146801/webrev.06/ > > Still passes JPRT -testset hotspot; RBT run is in progress. > > Cheers, > -Aleksey > > > > -- > Sent from my phone From nils.eliasson at oracle.com Wed Mar 2 12:36:23 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 2 Mar 2016 13:36:23 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5A60A.50700@oracle.com> <56D5D066.7040805@oracle.com> Message-ID: <56D6DE47.7060405@oracle.com> Hi Volker, I created these webrevs including all the feedback from everyone: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.02/ * Only add- and removeCompilerDirective http://cr.openjdk.java.net/~neliasso/8150646/webrev.04/ * whitebox.cpp -- addCompilerDirective to have correct VM states * advancedThresholdPolicy.cpp -- prevent blocking tasks from becoming stale -- The logic for picking the first blocking task broke the JVMCI code. Instead made the JVMCI code the default (select the blocking task with the highest score). * compilerDirectives.hpp -- Remove option CompileCommand. Not needed. * compileBroker.cpp -- Wrapped compile_method so that directive get and release are always matched. Is anything missing? Best regards, Nils Eliasson On 2016-03-01 19:31, Volker Simonis wrote: > Hi Pavel, Nils, Vladimir, > > sorry, but I was busy the last days so I couldn't answer your mails. > > Thanks a lot for your input and your suggestions. I'll look into this > tomorrow and hopefully I'll be able to address all your concerns. > > Regards, > Volker > > > On Tue, Mar 1, 2016 at 6:24 PM, Vladimir Kozlov > wrote: >> Nils, please answer Pavel's questions.
>> >> Thanks, >> Vladimir >> >> >> On 3/1/16 6:24 AM, Nils Eliasson wrote: >>> Hi Volker, >>> >>> An excellent proposition. This is how it should be used. >>> >>> I polished a few rough edges: >>> * CompileBroker.cpp - The directives were already accessed in >>> compile_method - but hidden in compilation_is_prohibited. I moved it out >>> so we only have a single directive access. Wrapped compile_method to >>> make sure the release of the directive doesn't get lost. >>> * Let WB_AddCompilerDirective return a bool for success. Also fixed the >>> state - need to be in native to get string, but then need to be in VM >>> when parsing directive. >>> >>> And some comments: >>> * I am against adding new compile option commands (At least until the >>> stringly typeness is fixed). Let's add good ways to use compiler >>> directives instead. >>> >>> I need to look at the stale task removal code tomorrow - hopefully we >>> could save the blocking info in the task so we don't need to access the >>> directive in the policy. >>> >>> All in here: >>> Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ >>> >>> The code runs fine with the test I fixed for JDK-8073793: >>> http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ >>> >>> Best regards, >>> Nils Eliasson >>> >>> On 2016-02-26 19:47, Volker Simonis wrote: >>>> Hi, >>>> >>>> so I want to propose the following solution for this problem: >>>> >>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel >>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ >>>> >>>> I've started from the opposite side and made the BackgroundCompilation >>>> manageable through the compiler directives framework.
Once this works >>>> (and it's actually trivial due to the nice design of the >>>> CompilerDirectives framework :), we get the possibility to set the >>>> BackgroundCompilation option on a per method base on the command line >>>> via the CompileCommand option for free: >>>> >>>> >>>> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" >>>> >>>> >>>> And of course we can also use it directly as a compiler directive: >>>> >>>> [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] >>>> >>>> It also becomes possible to use this directly from the Whitebox API >>>> through the DiagnosticCommand.compilerDirectivesAdd command. >>>> Unfortunately, this command takes a file with compiler directives as >>>> argument. I think this would be overkill in this context. So because >>>> it was so easy and convenient, I added the following two new Whitebox >>>> methods: >>>> >>>> public native void addCompilerDirective(String compDirect); >>>> public native void removeCompilerDirective(); >>>> >>>> which can now be used to set arbitrary CompilerDirective command >>>> directly from within the WhiteBox API. (The implementation of these >>>> two methods is trivial as you can see in whitebox.cpp). >>>> v >>>> The blocking versions of enqueueMethodForCompilation() now become >>>> simple wrappers around the existing methods without the need of any >>>> code changes in their native implementation. This is good, because it >>>> keeps the WhiteBox API stable! >>>> >>>> Finally some words about the implementation of the per-method >>>> BackgroundCompilation functionality. It actually only requires two >>>> small changes: >>>> >>>> 1. extending CompileBroker::is_compile_blocking() to take the method >>>> and compilation level as arguments and use them to query the >>>> DirectivesStack for the corresponding BackgroundCompilation value. >>>> >>>> 2. changing AdvancedThresholdPolicy::select_task() such that it >>>> prefers blocking compilations. 
This is not only necessary, because it >>>> decreases the time we have to wait for a blocking compilation, but >>>> also because it prevents blocking compiles from getting stale. This >>>> could otherwise easily happen in AdvancedThresholdPolicy::is_stale() >>>> for methods which only get artificially compiled during a test because >>>> their invocations counters are usually too small. >>>> >>>> There's still a small probability that a blocking compilation will be >>>> not blocking. This can happen if a method for which we request the >>>> blocking compilation is already in the compilation queue (see the >>>> check 'compilation_is_in_queue(method)' in >>>> CompileBroker::compile_method_base()). In testing scenarios this will >>>> rarely happen because methods which are manually compiled shouldn't >>>> get called that many times to implicitly place them into the compile >>>> queue. But we can even completely avoid this problem by using >>>> WB.isMethodQueuedForCompilation() to make sure that a method is not in >>>> the queue before we request a blocking compilation. >>>> >>>> I've also added a small regression test to demonstrate and verify the >>>> new functionality. >>>> >>>> Regards, >>>> Volker >>> On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson >>> wrote: >>>>> Hi Vladimir, >>>>> >>>>> WhiteBox::compilation_locked is a global state that temporarily stops >>>>> all >>>>> compilations. I this case I just want to achieve blocking compilation >>>>> for a >>>>> single compile without affecting the rest of the system. The tests >>>>> using it >>>>> will continue executing as soon as that compile is finished, saving time >>>>> where wait-loops is used today. It adds nice determinism to tests. >>>>> >>>>> Best regards, >>>>> Nils Eliasson >>>>> >>>>> >>>>> On 2016-02-25 22:14, Vladimir Kozlov wrote: >>>>>> You are adding parameter which is used only for testing. >>>>>> Can we have callback(or check field) into WB instead? Similar to >>>>>> WhiteBox::compilation_locked. 
>>>>>> >>>>>> Thanks, >>>>>> Vladimir >>>>>> >>>>>> On 2/25/16 7:01 AM, Nils Eliasson wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Please review this change that adds support for blocking compiles >>>>>>> in the >>>>>>> whitebox API. This enables simpler, less time-consuming tests. >>>>>>> >>>>>>> Motivation: >>>>>>> * -XX:-BackgroundCompilation is a global flag and can be time >>>>>>> consuming >>>>>>> * Blocking compiles removes the need for waiting on the compile >>>>>>> queue to >>>>>>> complete >>>>>>> * Compiles put in the queue may be evicted if the queue grows too big - >>>>>>> causing indeterminism in the test >>>>>>> * Fewer VM flags allow for more tests in the same VM >>>>>>> >>>>>>> Testing: >>>>>>> Posting a separate RFR for a test fix that uses this change. They >>>>>>> will be >>>>>>> pushed at the same time. >>>>>>> >>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 >>>>>>> JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >>>>>>> Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >>>>>>> >>>>>>> Best regards, >>>>>>> Nils Eliasson >>>>> From vitalyd at gmail.com Wed Mar 2 12:54:30 2016 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 2 Mar 2016 07:54:30 -0500 Subject: RFR (S) 8146801: Allocating short arrays of non-constant size is slow In-Reply-To: <56D6DD57.4070701@oracle.com> References: <56D4B1F7.4040201@oracle.com> <56D4E9F8.4070303@oracle.com> <56D58F62.80102@oracle.com> <56D5CB5C.4050203@oracle.com> <56D60DAA.2010401@oracle.com> <56D61A77.6080509@oracle.com> <56D6DD57.4070701@oracle.com> Message-ID: Thanks Vladimir.
On Wednesday, March 2, 2016, Vladimir Ivanov wrote: > I think it heavily depends on prefetch strategy: > > product(intx, AllocatePrefetchStyle, 1, > "0 = no prefetch, " > "1 = prefetch instructions for each allocation, " > "2 = use TLAB watermark to gate allocation prefetch, " > "3 = use BIS instruction on Sparc for allocation prefetch") > > Maybe prefetch distance and the number of prefetched lines is too much for > small array allocations and it should be treated like an ordinary object > allocation: > > // Generate several prefetch instructions. > uint lines = (length != NULL) ? AllocatePrefetchLines : > AllocateInstancePrefetchLines; > uint step_size = AllocatePrefetchStepSize; > uint distance = AllocatePrefetchDistance; > > But I'm not sure how much such optimization can buy us for the additional > complexity in the code. Yes, I'm not sure either - was just a thought as intuitively it doesn't make much sense to do the current amount of prefetch for such small arrays. Given the new code introduces a branch for small arrays, piggybacking on that to elide prefetch seems kind of desirable. I've no idea whether it would matter in real code. AFAIK, prefetchnta will bring the line into L1 on modern Intel. Prefetching beyond the small array allocation would seem undesirable as it increases instruction stream size for no benefit and may bring in lines that aren't needed at all. Thanks > > Best regards, > Vladimir Ivanov > > On 3/2/16 1:53 AM, Vitaly Davidovich wrote: > >> Related question - can the prefetch hints go away for small array >> allocations considering size is already being branched on? I've noticed >> allocations always come with a prefetch sequence, so perhaps this is >> just standard allocation pattern. >> >> On Tuesday, March 1, 2016, Vladimir Kozlov > > wrote: >> >> Perfect! >> >> Thanks, >> Vladimir >> >> On 3/1/16 1:46 PM, Aleksey Shipilev wrote: >> >> On 03/01/2016 08:03 PM, Vladimir Kozlov wrote: >> >> Do you have new performance numbers? 
I hope it did not >> regress with new >> code. >> >> >> It does not regress, the code is tight: >> http://cr.openjdk.java.net/~shade/8146801/notes.txt >> >> 2 things left I feel should be addressed. >> >> >> Both are fixed here: >> http://cr.openjdk.java.net/~shade/8146801/webrev.06/ >> >> Still passes JPRT -testset hotspot; RBT run is in progress. >> >> Cheers, >> -Aleksey >> >> >> >> -- >> Sent from my phone >> > -- Sent from my phone From vladimir.x.ivanov at oracle.com Wed Mar 2 13:05:41 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 2 Mar 2016 16:05:41 +0300 Subject: RFR (S) 8146801: Allocating short arrays of non-constant size is slow In-Reply-To: References: <56D4B1F7.4040201@oracle.com> <56D4E9F8.4070303@oracle.com> <56D58F62.80102@oracle.com> <56D5CB5C.4050203@oracle.com> <56D60DAA.2010401@oracle.com> <56D61A77.6080509@oracle.com> <56D6DD57.4070701@oracle.com> Message-ID: <56D6E525.3020101@oracle.com> > I've no idea whether it would matter in real code. AFAIK, prefetchnta > will bring the line into L1 on modern Intel. Prefetching beyond the > small array allocation would seem undesirable as it increases > instruction stream size for no benefit and may bring in lines that > aren't needed at all. Still guessing, but considering it still prefetches lines from the current TLAB, subsequent allocations may benefit. Actual performance should heavily depend on allocation rate though.
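The allocation-prefetch shape being discussed — a distance, a step size, and a per-allocation line count, as in the AllocatePrefetchStyle=1 snippet quoted earlier in this thread — can be modelled numerically. The sketch below is not HotSpot code; the constants are assumed example values (the real defaults are CPU-specific and set in vm_version_x86.cpp):

```java
public class PrefetchModel {
    // Assumed example values for illustration; real defaults vary per CPU.
    static final int LINES_ARRAY = 3;      // ~AllocatePrefetchLines
    static final int LINES_INSTANCE = 1;   // ~AllocateInstancePrefetchLines
    static final int STEP = 64;            // ~AllocatePrefetchStepSize (cache line)
    static final int DISTANCE = 192;       // ~AllocatePrefetchDistance

    /** Cache-line addresses a style-1 prefetch sequence would touch after an allocation. */
    static long[] prefetchAddresses(long newTop, boolean isArray) {
        int lines = isArray ? LINES_ARRAY : LINES_INSTANCE;
        long[] addrs = new long[lines];
        for (int i = 0; i < lines; i++) {
            addrs[i] = newTop + DISTANCE + (long) i * STEP;
        }
        return addrs;
    }

    public static void main(String[] args) {
        long top = 0x10000;
        // An instance allocation prefetches 1 line; an array (even a small one)
        // gets 3 under these settings - the potential waste Vitaly points at.
        for (long a : prefetchAddresses(top, true))  System.out.printf("array:    0x%x%n", a);
        for (long a : prefetchAddresses(top, false)) System.out.printf("instance: 0x%x%n", a);
    }
}
```

This makes the trade-off concrete: treating a small array like an instance allocation would drop two of the three prefetches, at the cost of another branch in an already branchy fast path.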
Best regards, Vladimir Ivanov From vitalyd at gmail.com Wed Mar 2 13:06:20 2016 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 2 Mar 2016 08:06:20 -0500 Subject: JIT stops compiling after a while (java 8u45) In-Reply-To: <012c01d1746a$c3fd2e40$4bf78ac0$@apache.org> References: <1456786793750-259603.post@n7.nabble.com> <56D56E8E.5060402@oracle.com> <56D58C74.1020406@oracle.com> <56D59669.7030200@oracle.com> <56D6A985.4020800@oracle.com> <010d01d17463$25053e60$6f0fbb20$@apache.org> <56D6B368.2010600@oracle.com> <012c01d1746a$c3fd2e40$4bf78ac0$@apache.org> Message-ID: On Wednesday, March 2, 2016, Uwe Schindler wrote: > Hi Tobias, > > > >>>> For real world applications I hope that this is a much smaller > issue but if > > >> you must load and execute loads and loads of short lived classes then > it > > might > > >> be reasonable to disable concurrent class unloading (at the cost of > getting > > >> serial Full gcs instead). > > >>> > > >>> Unfortunately, this is not a theoretical issue for us. We see this > problem > > >> running Presto (http://prestodb.io), which generates bytecode for > every > > >> query it processes. For now, we're working around it with a background > > >> thread that watches the size of the code cache and calls System.gc() > when > > it > > >> gets close to the max > > >> > > (https://github.com/facebook/presto/commit/91e1b3bb6bbfffc62401025a24 > > >> 231cd388992d7c). > > >> > > >> Okay, I changed JDK-8023191 from enhancement to bug and set fix > > version to > > >> 9. We can then backport this to 8u. > > > > > > Hi many thanks for taking care! > > > > > > Apache Lucene's Expressions module (heavily used by Elasticsearch) also > > compiles small Java classes to execute custom document scoring operations > > that the user can supply, similar to Presto's SQL, as a Javascript-like > formula > > that can work on arbitrary static double-parameter/double returning > > functions. 
These classes are loaded in a separate ClassLoader used > solely for > > a single class and thrown away afterwards. In older Java versions GC was > > perfectly throwing away those classes... > > > > > > Compiler & Classloader: > > > https://github.com/apache/lucene- > > solr/blob/master/lucene/expressions/src/java/org/apache/lucene/expressi > > ons/js/JavascriptCompiler.java > > > > > > See tests like: > > > https://github.com/apache/lucene- > > solr/tree/master/lucene/expressions/src/test/org/apache/lucene/expressio > > ns/js > > > > Please note that JDK-8023191 only takes care of flushing of unused OSR > > nmethods. This should help if the code cache fills up due to OSR > compilations > > but it's not about unloading of classes. Class unloading just helps > because it > > forces flushing right away. > > Thanks! This is how it was designed: The classloader only lives for very > short time and the compiled expression gets thrown away after fulltext > query execution. This works quite good. I just have seen the issue on this > mailing list and I wanted to start some test with recent Java 8 VMs if > everything is still alright. > > Martin Traverso's post alarmed me, because he said that they generate > bytecode for every query (which is similar in our case). The main > difference is: Our compiled methods are just plain simple mathematical > formulas which are represented to the caller by a functional interface. We > have no loops in the generated bytecode. The Lucene scoring algorithm just > calls the compiled expression bytecode while processing results to > calculate score (and depending on the number of search engine results > collected, this can be several million times per query execution). > > > As Mikael wrote, unloading happens later with concurrent class unloading. > > This shouldn't be a problem for your application as long as the code > cache > > does not fill up, right? 
> > As said before, it should not fill up - as we have no loops in those > expressions. No loops means no OSR compilation of that bytecode, but if it's called millions of times (based on what you said above) then it'll get compiled. It may be that your bytecode generates really small nmethods (you mentioned it's simple algebraic expressions) and their generation naturally doesn't outpace code cache cleaning. I was just alarmed and wanted to start testing. We checked the whole stuff > with Java 6 and Java 7 (when it was written), but have not done extensive > testing of garbage collection with Java 8. But there were also no bug > reports... > > > Best regards, > > Tobias > > Uwe > > -- Sent from my phone From volker.simonis at gmail.com Wed Mar 2 13:37:37 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 2 Mar 2016 14:37:37 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: <56D5F5BF.8090303@oracle.com> References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5F5BF.8090303@oracle.com> Message-ID: Hi Pavel, Nils, thanks for your input. Please find my comments inline. I've also prepared a new webrev which includes your suggestions (and the new regression test which was missing from the first webrev): http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v2/ http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v2/ On Tue, Mar 1, 2016 at 9:04 PM, Nils Eliasson wrote: > Hi, > > On 2016-02-29 20:01, Pavel Punegov wrote: > > Hi Volker, > > I have some comments and questions about your patch: > > - src/share/vm/runtime/advancedThresholdPolicy.cpp > > You check for background compilation (blocking) by searching for an > appropriate directive.
> But there is a CompileTask::is_blocking() method that returns a value set > in CompileBroker::compile_method_base when a compile task was created. It > seems that CompileBroker::is_compile_blocking() finds the right directive > and checks whether BackgroundCompilation is set. > > Yes, CompileTask::is_blocking() should be used instead of looking up the > directive again. > Done. That nicely simplifies the code! > I think that checking it twice could lead to an issue with different > directives being set on the stack. With diagnostic commands I can clear the > directives stack, or remove directives. If I do this in between the task being > submitted and being checked in AdvancedThresholdPolicy::select_task, this > task could become non-blocking. > > - src/share/vm/compiler/compileBroker.cpp > > 1317 backgroundCompilation = directive->BackgroundCompilationOption; > > Does it check whether BackgroundCompilation is set for both c1 and c2 at > the same time? What will happen if I set BackgroundCompilation to c1 only? > AFAIK, there are different queues for c1 and c2, and hence we could have > BackgroundCompilation set separately for both compilers. > > The correct directive set is retrieved in the beginning of the compilation > when getMatchingDirective(target_method, target_compiler) is called. So this > will work perfectly even with different flags for different compilers. > As Nils already wrote, this already works. I was a little unspecific in WhiteBox.java where I set the blocking option for both compilers. In my new version I set it only for the compiler which matches the actual compilation level. But I'm not sure if this is really relevant in practice.
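The race Pavel describes — a queued task turning non-blocking because the directives stack changed between enqueue and select_task — is exactly what capturing the decision on the task avoids. A toy model of that idea (invented names, not the actual HotSpot CompileTask/DirectivesStack types):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class BlockingTaskModel {
    /** Toy stand-in for the directives stack; the top-most directive wins. */
    static final Deque<Boolean> backgroundCompilationDirectives = new ArrayDeque<>();

    /** BackgroundCompilation=false means the compile should block. */
    static boolean currentlyBlocking() {
        Boolean bg = backgroundCompilationDirectives.peek();
        return bg != null && !bg;
    }

    /** Toy CompileTask: the blocking decision is captured once, at enqueue time. */
    static final class Task {
        final boolean blocking = currentlyBlocking();
    }

    public static void main(String[] args) {
        backgroundCompilationDirectives.push(false); // directive: BackgroundCompilation: false
        Task t = new Task();                         // decision captured: blocking
        backgroundCompilationDirectives.pop();       // directive removed before task selection...
        System.out.println(t.blocking);              // ...but the queued task stays blocking: true
    }
}
```

Because the flag is immutable after construction, the scheduling policy never has to consult the (mutable) directives stack again, which removes both the race and the double lookup.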
> For example, if I would like to set method to be logged (LogCompilation) and > disable some inlining, but then enqueue it with WB, I will get it to be only > compiled without LogCompilation. > But, AFAIK, setting CompileCommand option will work for already set > directive through a compatibility in CompilerDirectives. > Yes, you're right. And as you correctly noticed, you could still use CompileCommand to set an option which will not be shadowed by a new compiler directive. > > So, I would prefer to have a directive (file or WB) or an option set by > myself, and then invoke standard WB.enqueueMethodForCompilation(). > But you can still do that. I've just added the 'blocking' versions of the enqueueMethod() for convenience because I thought that doing a blocking compile without additional compiler directives is quite common for tests. > I agree that the enqueueMethod with the block-argument might cause > undesirable surprises and that it is better to just have the plain methods > addCompilerDirective, enqueueMethod and removeCompilerDirective in > Whitebox.java. If we find some often used pattern we can add that to > CompilerWhitebox or similar. > > There is one bug here though - addCompilerDirective adds any number of > directives, but removeCompilerDirective just removes one. I can do a quick > fix that limits the WB-api to just add one at a time. > > Thanks for the feedback, > Nils Eliasson > > > > ? Thanks, > Pavel Punegov > > > On 26 Feb 2016, at 21:47, Volker Simonis wrote: > > Hi, > > so I want to propose the following solution for this problem: > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel > http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ > > I've started from the opposite site and made the BackgroundCompilation > manageable through the compiler directives framework. 
Once this works > (and it's actually trivial due to the nice design of the > CompilerDirectives framework :), we get the possibility to set the > BackgroundCompilation option on a per method base on the command line > via the CompileCommand option for free: > > -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" > > And of course we can also use it directly as a compiler directive: > > [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] > > It also becomes possible to use this directly from the Whitebox API > through the DiagnosticCommand.compilerDirectivesAdd command. > Unfortunately, this command takes a file with compiler directives as > argument. I think this would be overkill in this context. So because > it was so easy and convenient, I added the following two new Whitebox > methods: > > public native void addCompilerDirective(String compDirect); > public native void removeCompilerDirective(); > > which can now be used to set arbitrary CompilerDirective command > directly from within the WhiteBox API. (The implementation of these > two methods is trivial as you can see in whitebox.cpp). > > The blocking versions of enqueueMethodForCompilation() now become > simple wrappers around the existing methods without the need of any > code changes in their native implementation. This is good, because it > keeps the WhiteBox API stable! > > Finally some words about the implementation of the per-method > BackgroundCompilation functionality. It actually only requires two > small changes: > > 1. extending CompileBroker::is_compile_blocking() to take the method > and compilation level as arguments and use them to query the > DirectivesStack for the corresponding BackgroundCompilation value. > > 2. changing AdvancedThresholdPolicy::select_task() such that it > prefers blocking compilations. 
This is not only necessary, because it > decreases the time we have to wait for a blocking compilation, but > also because it prevents blocking compiles from getting stale. This > could otherwise easily happen in AdvancedThresholdPolicy::is_stale() > for methods which only get artificially compiled during a test because > their invocations counters are usually too small. > > There's still a small probability that a blocking compilation will be > not blocking. This can happen if a method for which we request the > blocking compilation is already in the compilation queue (see the > check 'compilation_is_in_queue(method)' in > CompileBroker::compile_method_base()). In testing scenarios this will > rarely happen because methods which are manually compiled shouldn't > get called that many times to implicitly place them into the compile > queue. But we can even completely avoid this problem by using > WB.isMethodQueuedForCompilation() to make sure that a method is not in > the queue before we request a blocking compilation. > > I've also added a small regression test to demonstrate and verify the > new functionality. > > Regards, > Volker > > > > > On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson > wrote: > > Hi Vladimir, > > WhiteBox::compilation_locked is a global state that temporarily stops all > compilations. I this case I just want to achieve blocking compilation for a > single compile without affecting the rest of the system. The tests using it > will continue executing as soon as that compile is finished, saving time > where wait-loops is used today. It adds nice determinism to tests. > > Best regards, > Nils Eliasson > > > On 2016-02-25 22:14, Vladimir Kozlov wrote: > > > You are adding parameter which is used only for testing. > Can we have callback(or check field) into WB instead? Similar to > WhiteBox::compilation_locked. 
> > Thanks, > Vladimir > > On 2/25/16 7:01 AM, Nils Eliasson wrote: > > > Hi, > > Please review this change that adds support for blocking compiles in the > whitebox API. This enables simpler less time consuming tests. > > Motivation: > * -XX:-BackgroundCompilation is a global flag and can be time consuming > * Blocking compiles removes the need for waiting on the compile queue to > complete > * Compiles put in the queue may be evicted if the queue grows to big - > causing indeterminism in the test > * Less VM-flags allows for more tests in the same VM > > Testing: > Posting a separate RFR for test fix that uses this change. They will be > pushed at the same time. > > RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 > JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ > Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ > > Best regards, > Nils Eliasson > > > > > From volker.simonis at gmail.com Wed Mar 2 13:42:18 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 2 Mar 2016 14:42:18 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: <56D6DE47.7060405@oracle.com> References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5A60A.50700@oracle.com> <56D5D066.7040805@oracle.com> <56D6DE47.7060405@oracle.com> Message-ID: Hi Nils, sorry, it seems like I'm always too late with this issue :( Just sent my new version out without seeing your mail. I'll take a look at it now and resend a new version if that still makes sense. 
Regards, Volker On Wed, Mar 2, 2016 at 1:36 PM, Nils Eliasson wrote: > Hi Volker, > > I created these webrevs including all the feedback from everyone: > > http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.02/ > * Only add- and removeCompilerDirective > > http://cr.openjdk.java.net/~neliasso/8150646/webrev.04/ > * whitebox.cpp > -- addCompilerDirective to have correct VM states > * advancedThresholdPolicy.cpp > -- prevent blocking tasks from becoming stale > -- The logic for picking first blocking task broke JVMCI code. Instead made > the JVMCI code default (select the blocking task with highest score.) > * compilerDirectives.hpp > -- Remove option CompileCommand. Not needed. > * compileBroker.cpp > -- Wrapped compile_method so that directive get and release always are > matched. > > Is anything missing? > > Best regards, > Nils Eliasson > > > > On 2016-03-01 19:31, Volker Simonis wrote: > > Hi Pavel, Nils, Vladimir, > > sorry, but I was busy the last days so I couldn't answer your mails. > > Thanks a lot for your input and your suggestions. I'll look into this > tomorrow and hopefully I'll be able to address all your concerns. > > Regards, > Volker > > > On Tue, Mar 1, 2016 at 6:24 PM, Vladimir Kozlov > wrote: > > Nils, please answer Pavel's questions. > > Thanks, > Vladimir > > > On 3/1/16 6:24 AM, Nils Eliasson wrote: > > Hi Volker, > > An excellent proposition. This is how it should be used. > > I polished a few rough edges: > * CompilerBroker.cpp - The directives was already access in > compile_method - but hidden incompilation_is_prohibited. I moved it out > so we only have a single directive access. Wrapped compile_method to > make sure the release of the directive doesn't get lost. > * Let WB_AddCompilerDirective return a bool for success. Also fixed the > state - need to be in native to get string, but then need to be in VM > when parsing directive. 
> > And some comments: > * I am against adding new compile option commands (At least until the > stringly typeness is fixed). Lets add good ways too use compiler > directives instead. > > I need to look at the stale task removal code tomorrow - hopefully we > could save the blocking info in the task so we don't need to access the > directive in the policy. > > All in here: > Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ > > The code runs fine with the test I fixed for JDK-8073793: > http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ > > Best regards, > Nils Eliasson > > On 2016-02-26 19:47, Volker Simonis wrote: > > Hi, > > so I want to propose the following solution for this problem: > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel > http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ > > I've started from the opposite site and made the BackgroundCompilation > manageable through the compiler directives framework. Once this works > (and it's actually trivial due to the nice design of the > CompilerDirectives framework :), we get the possibility to set the > BackgroundCompilation option on a per method base on the command line > via the CompileCommand option for free: > > > -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" > > > And of course we can also use it directly as a compiler directive: > > [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] > > It also becomes possible to use this directly from the Whitebox API > through the DiagnosticCommand.compilerDirectivesAdd command. > Unfortunately, this command takes a file with compiler directives as > argument. I think this would be overkill in this context. 
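For reference, the file taken by DiagnosticCommand.compilerDirectivesAdd is just an array of such match blocks. A minimal hypothetical example, reusing the directive quoted above plus a second illustrative rule (the second rule is made up for illustration):

```
[
  { match: "java.lang.String::charAt", BackgroundCompilation: false },
  { match: "java.util.*::*",           PrintInlining: true }
]
```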
So because > it was so easy and convenient, I added the following two new Whitebox > methods: > > public native void addCompilerDirective(String compDirect); > public native void removeCompilerDirective(); > > which can now be used to set arbitrary CompilerDirective command > directly from within the WhiteBox API. (The implementation of these > two methods is trivial as you can see in whitebox.cpp). > v > The blocking versions of enqueueMethodForCompilation() now become > simple wrappers around the existing methods without the need of any > code changes in their native implementation. This is good, because it > keeps the WhiteBox API stable! > > Finally some words about the implementation of the per-method > BackgroundCompilation functionality. It actually only requires two > small changes: > > 1. extending CompileBroker::is_compile_blocking() to take the method > and compilation level as arguments and use them to query the > DirectivesStack for the corresponding BackgroundCompilation value. > > 2. changing AdvancedThresholdPolicy::select_task() such that it > prefers blocking compilations. This is not only necessary, because it > decreases the time we have to wait for a blocking compilation, but > also because it prevents blocking compiles from getting stale. This > could otherwise easily happen in AdvancedThresholdPolicy::is_stale() > for methods which only get artificially compiled during a test because > their invocations counters are usually too small. > > There's still a small probability that a blocking compilation will be > not blocking. This can happen if a method for which we request the > blocking compilation is already in the compilation queue (see the > check 'compilation_is_in_queue(method)' in > CompileBroker::compile_method_base()). In testing scenarios this will > rarely happen because methods which are manually compiled shouldn't > get called that many times to implicitly place them into the compile > queue. 
But we can even completely avoid this problem by using > WB.isMethodQueuedForCompilation() to make sure that a method is not in > the queue before we request a blocking compilation. > > I've also added a small regression test to demonstrate and verify the > new functionality. > > Regards, > Volker > > On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson > wrote: > > Hi Vladimir, > > WhiteBox::compilation_locked is a global state that temporarily stops > all > compilations. I this case I just want to achieve blocking compilation > for a > single compile without affecting the rest of the system. The tests > using it > will continue executing as soon as that compile is finished, saving time > where wait-loops is used today. It adds nice determinism to tests. > > Best regards, > Nils Eliasson > > > On 2016-02-25 22:14, Vladimir Kozlov wrote: > > You are adding parameter which is used only for testing. > Can we have callback(or check field) into WB instead? Similar to > WhiteBox::compilation_locked. > > Thanks, > Vladimir > > On 2/25/16 7:01 AM, Nils Eliasson wrote: > > Hi, > > Please review this change that adds support for blocking compiles > in the > whitebox API. This enables simpler less time consuming tests. > > Motivation: > * -XX:-BackgroundCompilation is a global flag and can be time > consuming > * Blocking compiles removes the need for waiting on the compile > queue to > complete > * Compiles put in the queue may be evicted if the queue grows to big - > causing indeterminism in the test > * Less VM-flags allows for more tests in the same VM > > Testing: > Posting a separate RFR for test fix that uses this change. They > will be > pushed at the same time. 
> > RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 > JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ > Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ > > Best regards, > Nils Eliasson > > From nils.eliasson at oracle.com Wed Mar 2 13:37:22 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 2 Mar 2016 14:37:22 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: <56D6DE47.7060405@oracle.com> References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5A60A.50700@oracle.com> <56D5D066.7040805@oracle.com> <56D6DE47.7060405@oracle.com> Message-ID: <56D6EC92.7010309@oracle.com> Yes, I forgot to add the fix for working with multiple directives from whitebox. WB.addCompilerDirectives now returns the number of directives that were added, and removeCompilerDirectives takes a parameter for the number of directives that should be popped (atomically). http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.03/ http://cr.openjdk.java.net/~neliasso/8150646/webrev.05/ Fixed test in JDK-8073793 to work with this: http://cr.openjdk.java.net/~neliasso/8073793/webrev.03/ Best regards, Nils Eliasson On 2016-03-02 13:36, Nils Eliasson wrote: > Hi Volker, > > I created these webrevs including all the feedback from everyone: > > http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.02/ > * Only add- and removeCompilerDirective > > http://cr.openjdk.java.net/~neliasso/8150646/webrev.04/ > * whitebox.cpp > -- addCompilerDirective to have correct VM states > * advancedThresholdPolicy.cpp > -- prevent blocking tasks from becoming stale > -- The logic for picking the first blocking task broke JVMCI code. Instead > made the JVMCI code default (select the blocking task with highest > score.) > * compilerDirectives.hpp > -- Remove option CompileCommand. Not needed. 
> * compileBroker.cpp > -- Wrapped compile_method so that directive get and release always are > matched. > > Is anything missing? > > Best regards, > Nils Eliasson > > > On 2016-03-01 19:31, Volker Simonis wrote: >> Hi Pavel, Nils, Vladimir, >> >> sorry, but I was busy the last days so I couldn't answer your mails. >> >> Thanks a lot for your input and your suggestions. I'll look into this >> tomorrow and hopefully I'll be able to address all your concerns. >> >> Regards, >> Volker >> >> >> On Tue, Mar 1, 2016 at 6:24 PM, Vladimir Kozlov >> wrote: >>> Nils, please answer Pavel's questions. >>> >>> Thanks, >>> Vladimir >>> >>> >>> On 3/1/16 6:24 AM, Nils Eliasson wrote: >>>> Hi Volker, >>>> >>>> An excellent proposition. This is how it should be used. >>>> >>>> I polished a few rough edges: >>>> * CompilerBroker.cpp - The directives was already access in >>>> compile_method - but hidden incompilation_is_prohibited. I moved it out >>>> so we only have a single directive access. Wrapped compile_method to >>>> make sure the release of the directive doesn't get lost. >>>> * Let WB_AddCompilerDirective return a bool for success. Also fixed the >>>> state - need to be in native to get string, but then need to be in VM >>>> when parsing directive. >>>> >>>> And some comments: >>>> * I am against adding new compile option commands (At least until the >>>> stringly typeness is fixed). Lets add good ways too use compiler >>>> directives instead. >>>> >>>> I need to look at the stale task removal code tomorrow - hopefully we >>>> could save the blocking info in the task so we don't need to access the >>>> directive in the policy. 
>>>> >>>> All in here: >>>> Webrev:http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ >>>> >>>> The code runs fine with the test I fixed for JDK-8073793: >>>> http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ >>>> >>>> Best regards, >>>> Nils Eliasson >>>> >>>> On 2016-02-26 19:47, Volker Simonis wrote: >>>>> Hi, >>>>> >>>>> so I want to propose the following solution for this problem: >>>>> >>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel >>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ >>>>> >>>>> I've started from the opposite site and made the BackgroundCompilation >>>>> manageable through the compiler directives framework. Once this works >>>>> (and it's actually trivial due to the nice design of the >>>>> CompilerDirectives framework :), we get the possibility to set the >>>>> BackgroundCompilation option on a per method base on the command line >>>>> via the CompileCommand option for free: >>>>> >>>>> >>>>> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" >>>>> >>>>> >>>>> And of course we can also use it directly as a compiler directive: >>>>> >>>>> [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] >>>>> >>>>> It also becomes possible to use this directly from the Whitebox API >>>>> through the DiagnosticCommand.compilerDirectivesAdd command. >>>>> Unfortunately, this command takes a file with compiler directives as >>>>> argument. I think this would be overkill in this context. So because >>>>> it was so easy and convenient, I added the following two new Whitebox >>>>> methods: >>>>> >>>>> public native void addCompilerDirective(String compDirect); >>>>> public native void removeCompilerDirective(); >>>>> >>>>> which can now be used to set arbitrary CompilerDirective command >>>>> directly from within the WhiteBox API. (The implementation of these >>>>> two methods is trivial as you can see in whitebox.cpp). 
>>>>> v >>>>> The blocking versions of enqueueMethodForCompilation() now become >>>>> simple wrappers around the existing methods without the need of any >>>>> code changes in their native implementation. This is good, because it >>>>> keeps the WhiteBox API stable! >>>>> >>>>> Finally some words about the implementation of the per-method >>>>> BackgroundCompilation functionality. It actually only requires two >>>>> small changes: >>>>> >>>>> 1. extending CompileBroker::is_compile_blocking() to take the method >>>>> and compilation level as arguments and use them to query the >>>>> DirectivesStack for the corresponding BackgroundCompilation value. >>>>> >>>>> 2. changing AdvancedThresholdPolicy::select_task() such that it >>>>> prefers blocking compilations. This is not only necessary, because it >>>>> decreases the time we have to wait for a blocking compilation, but >>>>> also because it prevents blocking compiles from getting stale. This >>>>> could otherwise easily happen in AdvancedThresholdPolicy::is_stale() >>>>> for methods which only get artificially compiled during a test because >>>>> their invocations counters are usually too small. >>>>> >>>>> There's still a small probability that a blocking compilation will be >>>>> not blocking. This can happen if a method for which we request the >>>>> blocking compilation is already in the compilation queue (see the >>>>> check 'compilation_is_in_queue(method)' in >>>>> CompileBroker::compile_method_base()). In testing scenarios this will >>>>> rarely happen because methods which are manually compiled shouldn't >>>>> get called that many times to implicitly place them into the compile >>>>> queue. But we can even completely avoid this problem by using >>>>> WB.isMethodQueuedForCompilation() to make sure that a method is not in >>>>> the queue before we request a blocking compilation. >>>>> >>>>> I've also added a small regression test to demonstrate and verify the >>>>> new functionality. 
>>>>> >>>>> Regards, >>>>> Volker >>>> On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson >>>> wrote: >>>>>> Hi Vladimir, >>>>>> >>>>>> WhiteBox::compilation_locked is a global state that temporarily stops >>>>>> all >>>>>> compilations. I this case I just want to achieve blocking compilation >>>>>> for a >>>>>> single compile without affecting the rest of the system. The tests >>>>>> using it >>>>>> will continue executing as soon as that compile is finished, saving time >>>>>> where wait-loops is used today. It adds nice determinism to tests. >>>>>> >>>>>> Best regards, >>>>>> Nils Eliasson >>>>>> >>>>>> >>>>>> On 2016-02-25 22:14, Vladimir Kozlov wrote: >>>>>>> You are adding parameter which is used only for testing. >>>>>>> Can we have callback(or check field) into WB instead? Similar to >>>>>>> WhiteBox::compilation_locked. >>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>> On 2/25/16 7:01 AM, Nils Eliasson wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> Please review this change that adds support for blocking compiles >>>>>>>> in the >>>>>>>> whitebox API. This enables simpler less time consuming tests. >>>>>>>> >>>>>>>> Motivation: >>>>>>>> * -XX:-BackgroundCompilation is a global flag and can be time >>>>>>>> consuming >>>>>>>> * Blocking compiles removes the need for waiting on the compile >>>>>>>> queue to >>>>>>>> complete >>>>>>>> * Compiles put in the queue may be evicted if the queue grows to big - >>>>>>>> causing indeterminism in the test >>>>>>>> * Less VM-flags allows for more tests in the same VM >>>>>>>> >>>>>>>> Testing: >>>>>>>> Posting a separate RFR for test fix that uses this change. They >>>>>>>> will be >>>>>>>> pushed at the same time. 
>>>>>>>> >>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 >>>>>>>> JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >>>>>>>> Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >>>>>>>> >>>>>>>> Best regards, >>>>>>>> Nils Eliasson From dmitrij.pochepko at oracle.com Wed Mar 2 15:44:43 2016 From: dmitrij.pochepko at oracle.com (Dmitrij Pochepko) Date: Wed, 2 Mar 2016 18:44:43 +0300 Subject: RFR(S): 8139703 - compiler/jvmci/compilerToVM/MaterializeVirtualObjectTest fails using -Xcomp Message-ID: <56D70A6B.6000906@oracle.com> Hi, please review the fix for https://bugs.openjdk.java.net/browse/JDK-8139703 - [TESTBUG] compiler/jvmci/compilerToVM/MaterializeVirtualObjectTest fails using -Xcomp The test compiler/jvmci/compilerToVM/MaterializeVirtualObjectTest failed with -Xcomp. After investigation, a bug related to escape analysis was filed (https://bugs.openjdk.java.net/browse/JDK-8140018). So, I've switched this test to use -Xmixed mode. Also, the test was changed to trigger method compilation by calling the method multiple times instead of using WhiteBox. CR: https://bugs.openjdk.java.net/browse/JDK-8139703 webrev: http://cr.openjdk.java.net/~dpochepk/8139703/webrev.01/ Thanks, Dmitrij From dmitrij.pochepko at oracle.com Wed Mar 2 15:53:34 2016 From: dmitrij.pochepko at oracle.com (Dmitrij Pochepko) Date: Wed, 2 Mar 2016 18:53:34 +0300 Subject: RFR(S): 8138798 - improve tests for HotSpotVMEventListener::notifyInstall Message-ID: <56D70C7E.5090502@oracle.com> Hi, please review the fix for JDK-8138798 - improve tests for HotSpotVMEventListener::notifyInstall The test was improved to include negative cases (verifying that no install events are sent on a failed install attempt). Also, a minor refactoring was applied. 
CR: https://bugs.openjdk.java.net/browse/JDK-8138798 webrev: http://cr.openjdk.java.net/~dpochepk/8138798/webrev.01/ Thanks, Dmitrij From volker.simonis at gmail.com Wed Mar 2 16:36:40 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 2 Mar 2016 17:36:40 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: <56D6EC92.7010309@oracle.com> References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5A60A.50700@oracle.com> <56D5D066.7040805@oracle.com> <56D6DE47.7060405@oracle.com> <56D6EC92.7010309@oracle.com> Message-ID: Hi Nils, your last webrev (jdk.03 and hotspot.05) looks pretty good! I've used it as the base for my new webrevs at: http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v3 http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v3 I've updated the copyrights, added the current reviewers and also added us both in the Contributed-by line (hope that's fine for you). Except for that, I've only done the following minor fixes/changes: *compileBroker.{cpp,hpp}* - we don't need CompileBroker::is_compile_blocking() anymore. *compilerDirectives.hpp* - I think we should use cflags(BackgroundCompilation, bool, BackgroundCompilation, BackgroundCompilation) instead of: cflags(BackgroundCompilation, bool, BackgroundCompilation, X) so we can also trigger blocking compiles from the command line with a CompileCommand (e.g. -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false"). That's very handy during development and also for simple tests where we don't want to mess with compiler directives. (And the overhead to keep this feature is quite small, just "BackgroundCompilation" instead of "X" ;-) *whitebox.cpp* I think it is good that you fixed the state, but I think it is too complicated now. 
We don't need to strdup the string and can easily forget to free 'tmpstr' :) So maybe it is simpler to just do another transition for parsing the directive: { ThreadInVMfromNative ttvfn(thread); // back to VM DirectivesParser::parse_string(dir, tty); } *advancedThresholdPolicy.cpp* - the JVMCI code looks reasonable (although I haven't tested JVMCI) and is actually even an improvement over my code which just picked the first blocking compilation. *diagnosticCommand.cpp*- Shouldn't you also fix CompilerDirectivesAddDCmd to return the number of added directives and CompilerDirectivesRemoveDCmd to take the number of directives you want to pop? Or do you want to do this in a later, follow-up change? *WhiteBox.java* - I still think it would make sense to keep the two 'blocking' versions of enqueueMethodForCompilation() for convenience. For example your test fix for JDK-8073793 would be much simpler if you used them. I've added two comments to the 'blocking' convenience methods to mention the fact that calling them may shadow previously added compiler directives. *BlockingCompilation.java* - I've extended my regression test to test both methods of doing blocking compilation - with the new, 'blocking' enqueueMethodForCompilation() methods as well as by manually setting the corresponding compiler directives. If we should finally get consensus on removing the blocking convenience methods, please just remove the corresponding tests. I think we're close to a final version now, what do you think :) Regards, Volker On Wed, Mar 2, 2016 at 2:37 PM, Nils Eliasson wrote: > Yes, I forgot to add the fix for working with multiple directives from > whitebox. > > WB.addCompilerDirectives now returns the number of directives that where > added, and removeCompilerDirectives takes a parameter for the number of > directives that should be popped (atomically). 
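The add/remove contract described above, where add reports how many directives were pushed and remove pops a given count atomically, can be modeled with a small stand-alone sketch. DirectiveStack and its members are illustrative names, not HotSpot code:

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Toy model of a directives stack. In HotSpot the stack is global and
// lock-protected; a mutex stands in here so that pushing or popping a
// batch of directives is atomic with respect to other threads.
class DirectiveStack {
  std::vector<std::string> stack_;
  std::mutex lock_;

 public:
  // Push all directives from one add call and return how many were added,
  // so the caller knows what count to pass to remove() later.
  int add(const std::vector<std::string>& dirs) {
    std::lock_guard<std::mutex> g(lock_);
    stack_.insert(stack_.end(), dirs.begin(), dirs.end());
    return static_cast<int>(dirs.size());
  }

  // Pop 'n' directives inside one critical section (atomically).
  void remove(int n) {
    std::lock_guard<std::mutex> g(lock_);
    for (int i = 0; i < n && !stack_.empty(); i++) {
      stack_.pop_back();
    }
  }

  size_t depth() {
    std::lock_guard<std::mutex> g(lock_);
    return stack_.size();
  }
};
```

A test would add a batch, remember the returned count, and pass exactly that count back to remove() in its tear-down.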
> > http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.03/ > http://cr.openjdk.java.net/~neliasso/8150646/webrev.05/ > > Fixed test in JDK-8073793 to work with this: > http://cr.openjdk.java.net/~neliasso/8073793/webrev.03/ > > Best regards, > Nils Eliasson > > > > On 2016-03-02 13:36, Nils Eliasson wrote: > > Hi Volker, > > I created these webrevs including all the feedback from everyone: > > http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.02/ > * Only add- and removeCompilerDirective > > http://cr.openjdk.java.net/~neliasso/8150646/webrev.04/ > * whitebox.cpp > -- addCompilerDirective to have correct VM states > * advancedThresholdPolicy.cpp > -- prevent blocking tasks from becoming stale > -- The logic for picking first blocking task broke JVMCI code. Instead > made the JVMCI code default (select the blocking task with highest score.) > * compilerDirectives.hpp > -- Remove option CompileCommand. Not needed. > * compileBroker.cpp > -- Wrapped compile_method so that directive get and release always are > matched. > > Is anything missing? > > Best regards, > Nils Eliasson > > > On 2016-03-01 19:31, Volker Simonis wrote: > > Hi Pavel, Nils, Vladimir, > > sorry, but I was busy the last days so I couldn't answer your mails. > > Thanks a lot for your input and your suggestions. I'll look into this > tomorrow and hopefully I'll be able to address all your concerns. > > Regards, > Volker > > > On Tue, Mar 1, 2016 at 6:24 PM, Vladimir Kozlov wrote: > > Nils, please answer Pavel's questions. > > Thanks, > Vladimir > > > On 3/1/16 6:24 AM, Nils Eliasson wrote: > > Hi Volker, > > An excellent proposition. This is how it should be used. > > I polished a few rough edges: > * CompilerBroker.cpp - The directives was already access in > compile_method - but hidden incompilation_is_prohibited. I moved it out > so we only have a single directive access. Wrapped compile_method to > make sure the release of the directive doesn't get lost. 
> * Let WB_AddCompilerDirective return a bool for success. Also fixed the > state - need to be in native to get string, but then need to be in VM > when parsing directive. > > And some comments: > * I am against adding new compile option commands (At least until the > stringly typeness is fixed). Lets add good ways too use compiler > directives instead. > > I need to look at the stale task removal code tomorrow - hopefully we > could save the blocking info in the task so we don't need to access the > directive in the policy. > > All in here: > Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ > > The code runs fine with the test I fixed for JDK-8073793:http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ > > Best regards, > Nils Eliasson > > On 2016-02-26 19:47, Volker Simonis wrote: > > Hi, > > so I want to propose the following solution for this problem: > http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevelhttp://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ > > I've started from the opposite site and made the BackgroundCompilation > manageable through the compiler directives framework. Once this works > (and it's actually trivial due to the nice design of the > CompilerDirectives framework :), we get the possibility to set the > BackgroundCompilation option on a per method base on the command line > via the CompileCommand option for free: > > > -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" > > > And of course we can also use it directly as a compiler directive: > > [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] > > It also becomes possible to use this directly from the Whitebox API > through the DiagnosticCommand.compilerDirectivesAdd command. > Unfortunately, this command takes a file with compiler directives as > argument. I think this would be overkill in this context. 
So because > it was so easy and convenient, I added the following two new Whitebox > methods: > > public native void addCompilerDirective(String compDirect); > public native void removeCompilerDirective(); > > which can now be used to set arbitrary CompilerDirective command > directly from within the WhiteBox API. (The implementation of these > two methods is trivial as you can see in whitebox.cpp). > v > The blocking versions of enqueueMethodForCompilation() now become > simple wrappers around the existing methods without the need of any > code changes in their native implementation. This is good, because it > keeps the WhiteBox API stable! > > Finally some words about the implementation of the per-method > BackgroundCompilation functionality. It actually only requires two > small changes: > > 1. extending CompileBroker::is_compile_blocking() to take the method > and compilation level as arguments and use them to query the > DirectivesStack for the corresponding BackgroundCompilation value. > > 2. changing AdvancedThresholdPolicy::select_task() such that it > prefers blocking compilations. This is not only necessary, because it > decreases the time we have to wait for a blocking compilation, but > also because it prevents blocking compiles from getting stale. This > could otherwise easily happen in AdvancedThresholdPolicy::is_stale() > for methods which only get artificially compiled during a test because > their invocations counters are usually too small. > > There's still a small probability that a blocking compilation will be > not blocking. This can happen if a method for which we request the > blocking compilation is already in the compilation queue (see the > check 'compilation_is_in_queue(method)' in > CompileBroker::compile_method_base()). In testing scenarios this will > rarely happen because methods which are manually compiled shouldn't > get called that many times to implicitly place them into the compile > queue. 
But we can even completely avoid this problem by using > WB.isMethodQueuedForCompilation() to make sure that a method is not in > the queue before we request a blocking compilation. > > I've also added a small regression test to demonstrate and verify the > new functionality. > > Regards, > Volker > > On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson wrote: > > Hi Vladimir, > > WhiteBox::compilation_locked is a global state that temporarily stops > all > compilations. I this case I just want to achieve blocking compilation > for a > single compile without affecting the rest of the system. The tests > using it > will continue executing as soon as that compile is finished, saving time > where wait-loops is used today. It adds nice determinism to tests. > > Best regards, > Nils Eliasson > > > On 2016-02-25 22:14, Vladimir Kozlov wrote: > > You are adding parameter which is used only for testing. > Can we have callback(or check field) into WB instead? Similar to > WhiteBox::compilation_locked. > > Thanks, > Vladimir > > On 2/25/16 7:01 AM, Nils Eliasson wrote: > > Hi, > > Please review this change that adds support for blocking compiles > in the > whitebox API. This enables simpler less time consuming tests. > > Motivation: > * -XX:-BackgroundCompilation is a global flag and can be time > consuming > * Blocking compiles removes the need for waiting on the compile > queue to > complete > * Compiles put in the queue may be evicted if the queue grows to big - > causing indeterminism in the test > * Less VM-flags allows for more tests in the same VM > > Testing: > Posting a separate RFR for test fix that uses this change. They > will be > pushed at the same time. > > RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 > JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ > Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ > > Best regards, > Nils Eliasson > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From christian.thalinger at oracle.com Wed Mar 2 17:22:50 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Wed, 2 Mar 2016 07:22:50 -1000 Subject: RFR(S): 8139703 - compiler/jvmci/compilerToVM/MaterializeVirtualObjectTest fails using -Xcomp In-Reply-To: <56D70A6B.6000906@oracle.com> References: <56D70A6B.6000906@oracle.com> Message-ID: > On Mar 2, 2016, at 5:44 AM, Dmitrij Pochepko wrote: > > Hi, > > please review fix for https://bugs.openjdk.java.net/browse/JDK-8139703 - [TESTBUG] compiler/jvmci/compilerToVM/MaterializeVirtualObjectTest fails using -Xcomp > > A test compiler/jvmci/compilerToVM/MaterializeVirtualObjectTest failed in Xcomp. After investigation, a bug related to escape analysis was filed(https://bugs.openjdk.java.net/browse/JDK-8140018 ). > > So, i've switched this test to use Xmixed mode. Also, this test was changed to trigger method compilation by calling method multiple times instead of using WhiteBox. The reason for this change is to get proper inlining, I suppose? > > CR: https://bugs.openjdk.java.net/browse/JDK-8139703 > webrev: http://cr.openjdk.java.net/~dpochepk/8139703/webrev.01/ > > Thanks, > Dmitrij -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.thalinger at oracle.com Wed Mar 2 17:26:52 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Wed, 2 Mar 2016 07:26:52 -1000 Subject: RFR(S): 8138798 - improve tests for HotSpotVMEventListener::notifyInstall In-Reply-To: <56D70C7E.5090502@oracle.com> References: <56D70C7E.5090502@oracle.com> Message-ID: <67F5551B-A5D2-40D2-BB27-4C725D2CE820@oracle.com> Looks good. > On Mar 2, 2016, at 5:53 AM, Dmitrij Pochepko wrote: > > Hi, > > please review fix for JDK-8138798 - improve tests for HotSpotVMEventListener::notifyInstall > > A test was improved to include negative cases(verifying that no install events sent on failed install attempt). Also, a minor refactoring was applied. 
> > CR: https://bugs.openjdk.java.net/browse/JDK-8138798 > webrev: http://cr.openjdk.java.net/~dpochepk/8138798/webrev.01/ > > Thanks, > Dmitrij From vladimir.kozlov at oracle.com Wed Mar 2 17:29:43 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 2 Mar 2016 09:29:43 -0800 Subject: RFR (S) 8146801: Allocating short arrays of non-constant size is slow In-Reply-To: <56D6E525.3020101@oracle.com> References: <56D4B1F7.4040201@oracle.com> <56D4E9F8.4070303@oracle.com> <56D58F62.80102@oracle.com> <56D5CB5C.4050203@oracle.com> <56D60DAA.2010401@oracle.com> <56D61A77.6080509@oracle.com> <56D6DD57.4070701@oracle.com> <56D6E525.3020101@oracle.com> Message-ID: <56D72307.2040004@oracle.com> The prefetching assumes that the next allocation will be of the same type (instance or array). The prefetching is done for a future allocation, not the current one. So we can't change it based on the size of the current allocation. Yes, it is a very simple approach and we could do better by searching for other allocations in the current code. But I doubt it will give us a lot of benefit. My experiments back then showed that prefetching helps offset the zeroing cost (to some degree) because the cache lines are fetched already. Skipping some prefetching may have a negative effect. Memory accesses are more costly than instruction count. Thanks, Vladimir On 3/2/16 5:05 AM, Vladimir Ivanov wrote: >> I've no idea whether it would matter in real code. AFAIK, prefetchnta >> will bring the line into L1 on modern Intel. Prefetching beyond the >> small array allocation would seem undesirable as it increases >> instruction stream size for no benefit and may bring in lines that >> aren't needed at all. > Still guessing, but considering it still prefetches lines from the current TLAB, subsequent allocations may benefit. Actual > performance should heavily depend on allocation rate though. 
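The behavior Vladimir describes, prefetching cache lines beyond the current bump pointer so that the next allocation (and its zeroing) finds them already in cache, can be sketched with a toy bump-pointer allocator. Tlab here is a stand-in for HotSpot's ThreadLocalAllocBuffer, not the real thing, and __builtin_prefetch is the GCC/Clang intrinsic:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy bump-pointer allocator in the spirit of a TLAB. After bumping the
// top pointer it prefetches the line just past the new top: that line
// belongs to the *next* allocation, so by the time that allocation is
// zeroed and initialized the line is (hopefully) already in cache.
struct Tlab {
  std::vector<uint8_t> buf;
  size_t top = 0;

  explicit Tlab(size_t n) : buf(n) {}

  void* allocate(size_t bytes) {
    if (top + bytes > buf.size()) return nullptr;  // TLAB exhausted
    void* p = buf.data() + top;
    top += bytes;
    // Prefetch ahead of the bump pointer for the allocation that follows.
    // A real VM tunes the distance (AllocatePrefetchDistance-style); one
    // 64-byte line is enough for the illustration.
    if (top + 64 <= buf.size()) {
      __builtin_prefetch(buf.data() + top, 1 /* for write */);
    }
    std::memset(p, 0, bytes);  // zeroing benefits from the earlier prefetch
    return p;
  }
};
```

Note that the prefetch issued during one allocate() call only pays off in the following call, which is exactly why its benefit depends on the allocation rate, as discussed above.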
> > Best regards, > Vladimir Ivanov From christian.thalinger at oracle.com Wed Mar 2 17:30:04 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Wed, 2 Mar 2016 07:30:04 -1000 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A56A3AF1F@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> <56D4ED29.1050108@oracle.com> <3A0386B6-8072-4D84-8AF7-01D904DAEADF@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A3AF1F@ORSMSX106.amr.corp.intel.com> Message-ID: <32DA77B0-7683-4CE9-9ED3-8461B70E1E19@oracle.com> #define COMMA , Why do we have a define like this? It came in with 8139575. > On Mar 1, 2016, at 3:24 PM, Deshpande, Vivek R wrote: > > Hi Vladimir, Christian > > I have updated the code according your suggestion of file name change. > The updated webrev is at this location: > http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.02/ > Please let me know if I have to do anything more. > Also is there any change required to Makefile so that configurations.xml has name of the added file ? > > Regards, > Vivek > > -----Original Message----- > From: Christian Thalinger [mailto:christian.thalinger at oracle.com] > Sent: Monday, February 29, 2016 5:59 PM > To: Vladimir Kozlov > Cc: Deshpande, Vivek R; hotspot compiler; Rukmannagari, Shravya > Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling > > >> On Feb 29, 2016, at 3:15 PM, Vladimir Kozlov wrote: >> >> I am against to have "intel" in a file name. We have macroAssembler_libm_x86_*.cpp files for math intrinsics which does not have intel in name. So I prefer to not have it. I would suggest macroAssembler_sha_x86.cpp. 
> > I know we already have macroAssembler_libm_x86_*.cpp but macroAssembler_x86_.cpp would be better. > >> You can manipulate when to use it in vm_version_x86.cpp. >> >> Intel Copyright in the file's header is fine. >> >> Code changes are fine now (webrev.01). >> >> Thanks, >> Vladimir >> >> On 2/29/16 4:42 PM, Christian Thalinger wrote: >>> >>>> On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R wrote: >>>> >>>> Hi Christian >>>> >>>> We used the SHA Extension >>>> implementations(https://software.intel.com/en-us/articles/intel-sha-extensions-implementations) for the JVM implementation of SHA1 and SHA256. >>> >>> Will that extension only be available on Intel chips? >>> >>>> It needed to have Intel copyright, so created a separate file. >>> >>> That is reasonable. >>> >>>> The white paper for the implementation is https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. >>>> >>>> Regards, >>>> Vivek >>>> >>>> -----Original Message----- >>>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>>> Sent: Monday, February 29, 2016 1:58 PM >>>> To: Deshpande, Vivek R >>>> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>> enabling >>>> >>>> Why is the new file called macroAssembler_intel_x86.cpp? >>>> >>>>> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R wrote: >>>>> >>>>> HI Vladimir >>>>> >>>>> Thank you for your review. >>>>> I have updated the patch with the changes you have suggested. 
>>>>> The new webrev is at this location: >>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >>>>> >>>>> Regards >>>>> Vivek >>>>> >>>>> -----Original Message----- >>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>>> Sent: Friday, February 26, 2016 6:50 PM >>>>> To: Deshpande, Vivek R; hotspot compiler >>>>> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >>>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>>> enabling >>>>> >>>>> Very nice, Vivek!!! >>>>> >>>>> Did you run tests with both 32- and 64-bit VMs? >>>>> >>>>> Small notes: >>>>> >>>>> In vm_version_x86.hpp spacing are not aligned in next line: >>>>> >>>>> static bool supports_avxonly() { return ((supports_avx2() || >>>>> supports_avx()) && !supports_evex()); } >>>>> + static bool supports_sha() { return (_features & CPU_SHA) != 0; } >>>>> >>>>> Flags setting code in vm_version_x86.cpp should be like this (you can check supports_sha() only once, don't split '} else {' line, set UseSHA false if all intrinsics flags are false (I included UseSHA512Intrinsics for future) ): >>>>> >>>>> if (supports_sha()) { >>>>> if (FLAG_IS_DEFAULT(UseSHA)) { >>>>> UseSHA = true; >>>>> } >>>>> } else if (UseSHA) { >>>>> warning("SHA instructions are not available on this CPU"); >>>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>>> >>>>> if (UseSHA) { >>>>> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); >>>>> } >>>>> } else if (UseSHA1Intrinsics) { >>>>> warning("Intrinsics for SHA-1 crypto hash functions not available on this CPU."); >>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); } >>>>> >>>>> if (UseSHA) { >>>>> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); >>>>> } >>>>> } else if (UseSHA256Intrinsics) { >>>>> warning("Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU."); >>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); } >>>>> >>>>> if 
(UseSHA512Intrinsics) { >>>>> warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU."); >>>>> FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); } >>>>> >>>>> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA512Intrinsics)) { >>>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>>> >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>>>>> Hi all >>>>>> >>>>>> I would like to contribute a patch which optimizes SHA-1 and SHA-256 >>>>>> for >>>>>> 64 and 32 bit X86 architecture using Intel SHA extensions. >>>>>> >>>>>> Could you please review and sponsor this patch. >>>>>> >>>>>> Bug-id: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>>>>> webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>>>>> >>>>>> Thanks and regards, >>>>>> >>>>>> Vivek >>>>>> >>>> >>> > From vitalyd at gmail.com Wed Mar 2 17:50:10 2016 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 2 Mar 2016 12:50:10 -0500 Subject: RFR (S) 8146801: Allocating short arrays of non-constant size is slow In-Reply-To: <56D72307.2040004@oracle.com> References: <56D4B1F7.4040201@oracle.com> <56D4E9F8.4070303@oracle.com> <56D58F62.80102@oracle.com> <56D5CB5C.4050203@oracle.com> <56D60DAA.2010401@oracle.com> <56D61A77.6080509@oracle.com> <56D6DD57.4070701@oracle.com> <56D6E525.3020101@oracle.com> <56D72307.2040004@oracle.com> Message-ID: > > The prefetching assumes that next allocation will be of the same type > (instance or array). The prefetching is done for future allocation and not > a current one. So we can't change it based on size of current allocation. > Yes, it is very simple approach and we can do better by searching other > allocations in current code. But I doubt it will give us a lot of benefits. Indeed, it seems kind of an "unprincipled" approach (I do appreciate the simplicity though).
Unless the workload is doing almost nothing but allocations (which would have other performance implications), a future allocation may not come for a while. If that's the case, the prefetched line for future allocation will probably not survive in the L1 cache. I do hope that prefetchnta, being non-temporal, will not fetch a line into L1 if that would cause a replacement of an existing line in there; https://software.intel.com/en-us/forums/intel-vtune-amplifier-xe/topic/356760 appears to imply otherwise, although unfortunately nobody from Intel responded. My experiments back then showed that prefetching helps offset zeroing cost > (in some degree) because cache lines are fetched already. Skipping some > prefetching may have negative effect. For zeroing (or user fill) a large array, it makes sense although given the fill/zeroing is done linearly, h/w prefetch on modern CPUs may already do a good enough job, if not better. It'd be interesting to verify this on modern h/w. Memory accesses are more costly then instruction count Agreed, but software prefetch has nasty habit of either not adding anything at all or making things worse :). This case, in particular, seems a bit odd since the prefetch here is a shot-in-the-dark guess by the compiler. Thanks for the discussion guys. On Wed, Mar 2, 2016 at 12:29 PM, Vladimir Kozlov wrote: > The prefetching assumes that next allocation will be of the same type > (instance or array). The prefetching is done for future allocation and not > a current one. So we can't change it based on size of current allocation. > Yes, it is very simple approach and we can do better by searching other > allocations in current code. But I doubt it will give us a lot of benefits. > > My experiments back then showed that prefetching helps offset zeroing cost > (in some degree) because cache lines are fetched already. Skipping some > prefetching may have negative effect. Memory accesses are more costly then > instruction count. 
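[Editor's note] For readers following this exchange, the pattern under debate (issuing software prefetches a fixed distance ahead while linearly zeroing freshly allocated memory, so cache lines are resident by the time the stores reach them) can be sketched in plain C++. This is an illustrative sketch only, not HotSpot code; the 64-byte line size and 4-line lookahead are assumed values, not HotSpot's tuned AllocatePrefetch defaults.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed values for illustration; real distances are tuned per CPU.
static const size_t kCacheLine = 64;
static const size_t kPrefetchAhead = 4 * kCacheLine;

// Zero a buffer one cache line at a time, prefetching kPrefetchAhead
// bytes ahead of the store position while staying inside the buffer.
void zero_with_prefetch(uint8_t* buf, size_t len) {
    for (size_t i = 0; i < len; i += kCacheLine) {
        if (i + kPrefetchAhead < len) {
            // GCC/Clang builtin: 1 = prefetch for write, locality hint 0 =
            // non-temporal, loosely analogous to prefetchnta.
            __builtin_prefetch(buf + i + kPrefetchAhead, 1, 0);
        }
        size_t chunk = (len - i < kCacheLine) ? (len - i) : kCacheLine;
        for (size_t j = 0; j < chunk; ++j) {
            buf[i + j] = 0;
        }
    }
}
```

Whether such explicit hints pay off is exactly the open question in the thread: on hardware whose streaming prefetcher already tracks the linear store pattern, they may be redundant or even harmful.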
> > Thanks, > Vladimir > > > On 3/2/16 5:05 AM, Vladimir Ivanov wrote: > >> I've no idea whether it would matter in real code. AFAIK, prefetchnta >>> will bring the line into L1 on modern Intel. Prefetching beyond the >>> small array allocation would seem undesirable as it increases >>> instruction stream size for no benefit and may bring in lines that >>> aren't needed at all. >>> >> Still guessing, but considering it still prefetches lines from current >> TLAB, consequent allocations may benefit. Actual >> performance should heavily depend on allocation rate though. >> >> Best regards, >> Vladimir Ivanov >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Wed Mar 2 17:50:14 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 2 Mar 2016 09:50:14 -0800 Subject: RFR(S): 8139703 - compiler/jvmci/compilerToVM/MaterializeVirtualObjectTest fails using -Xcomp In-Reply-To: <56D70A6B.6000906@oracle.com> References: <56D70A6B.6000906@oracle.com> Message-ID: <56D727D6.50700@oracle.com> As we discussed in previous reviews CompilerWhiteBoxTest.THRESHOLD may be not enough to trigger compilation. Thanks, Vladimir On 3/2/16 7:44 AM, Dmitrij Pochepko wrote: > Hi, > > please review fix for https://bugs.openjdk.java.net/browse/JDK-8139703 - [TESTBUG] > compiler/jvmci/compilerToVM/MaterializeVirtualObjectTest fails using -Xcomp > > A test compiler/jvmci/compilerToVM/MaterializeVirtualObjectTest failed in Xcomp. After investigation, a bug related to > escape analysis was filed(https://bugs.openjdk.java.net/browse/JDK-8140018). > > So, i've switched this test to use Xmixed mode. Also, this test was changed to trigger method compilation by calling > method multiple times instead of using WhiteBox. 
> > CR: https://bugs.openjdk.java.net/browse/JDK-8139703 > webrev: http://cr.openjdk.java.net/~dpochepk/8139703/webrev.01/ > > Thanks, > Dmitrij From dmitrij.pochepko at oracle.com Wed Mar 2 18:04:49 2016 From: dmitrij.pochepko at oracle.com (Dmitrij Pochepko) Date: Wed, 2 Mar 2016 21:04:49 +0300 Subject: RFR(S): 8139703 - compiler/jvmci/compilerToVM/MaterializeVirtualObjectTest fails using -Xcomp In-Reply-To: References: <56D70A6B.6000906@oracle.com> Message-ID: <56D72B41.7050804@oracle.com> > >> On Mar 2, 2016, at 5:44 AM, Dmitrij Pochepko >> > wrote: >> >> Hi, >> >> please review fix for >> https://bugs.openjdk.java.net/browse/JDK-8139703 - [TESTBUG] >> compiler/jvmci/compilerToVM/MaterializeVirtualObjectTest fails using >> -Xcomp >> >> A test compiler/jvmci/compilerToVM/MaterializeVirtualObjectTest >> failed in Xcomp. After investigation, a bug related to escape >> analysis was filed(https://bugs.openjdk.java.net/browse/JDK-8140018). >> >> So, i've switched this test to use Xmixed mode. Also, this test was >> changed to trigger method compilation by calling method multiple >> times instead of using WhiteBox. > > The reason for this change is to get proper inlining, I suppose? It was basically done to have compilation more like in "real" situation, since WhiteBox compilation is kind of workaround. Also, i remember few rare issues with method got thrown out from compilation queue in some cases when using WhiteBox because of small invocation count. Thanks, Dmitrij > >> >> CR: https://bugs.openjdk.java.net/browse/JDK-8139703 >> webrev: http://cr.openjdk.java.net/~dpochepk/8139703/webrev.01/ >> >> Thanks, >> Dmitrij > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dmitrij.pochepko at oracle.com Wed Mar 2 18:05:05 2016 From: dmitrij.pochepko at oracle.com (Dmitrij Pochepko) Date: Wed, 2 Mar 2016 21:05:05 +0300 Subject: RFR(S): 8138798 - improve tests for HotSpotVMEventListener::notifyInstall In-Reply-To: <67F5551B-A5D2-40D2-BB27-4C725D2CE820@oracle.com> References: <56D70C7E.5090502@oracle.com> <67F5551B-A5D2-40D2-BB27-4C725D2CE820@oracle.com> Message-ID: <56D72B51.2070406@oracle.com> Thank you! > Looks good. > >> On Mar 2, 2016, at 5:53 AM, Dmitrij Pochepko wrote: >> >> Hi, >> >> please review fix for JDK-8138798 - improve tests for HotSpotVMEventListener::notifyInstall >> >> A test was improved to include negative cases(verifying that no install events sent on failed install attempt). Also, a minor refactoring was applied. >> >> CR: https://bugs.openjdk.java.net/browse/JDK-8138798 >> webrev: http://cr.openjdk.java.net/~dpochepk/8138798/webrev.01/ >> >> Thanks, >> Dmitrij From mikael.vidstedt at oracle.com Wed Mar 2 21:12:48 2016 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Wed, 2 Mar 2016 13:12:48 -0800 Subject: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions In-Reply-To: <56D697E4.8060104@oracle.com> References: <56D63301.9050909@oracle.com> <56D697E4.8060104@oracle.com> Message-ID: <56D75750.6020400@oracle.com> Updated webrev: http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01/webrev/ Incremental from webrev.00: http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01.incr/webrev/ Comments below... On 2016-03-01 23:36, Vladimir Ivanov wrote: > Nice cleanup, Mikael! > > src/cpu/x86/vm/assembler_x86.hpp: > > Outdated comments: > // Copy low 128bit into high 128bit of YMM registers. > > // Load/store high 128bit of YMM registers which does not destroy > other half. > > // Copy low 256bit into high 256bit of ZMM registers. Updated, thanks for catching! > src/cpu/x86/vm/assembler_x86.cpp: > > ! 
emit_int8(imm8 & 0x01); > > Maybe additionally assert valid imm8 range? Good idea, I had added asserts earlier but removed them. I added them back again! > Maybe keep vinsert*h variants and move them to MacroAssembler? They > look clearer in some contextes: > > - __ vextractf128h(Address(rsp, base_addr+n*16), as_XMMRegister(n)); > + __ vextractf128(Address(rsp, base_addr+n*16), > as_XMMRegister(n), 1); Can I suggest that we try to live without them for a while and see how much we miss them? I think having it there may actually be more confusing in many cases :) Cheers, Mikael > > Otherwise, looks good. > > Best regards, > Vladimir Ivanov > > On 3/2/16 3:25 AM, Mikael Vidstedt wrote: >> >> Please review the following change which updates the various vextract* >> and vinsert* methods in assembler_x86 & macroAssembler_x86 to better >> match the real HW instructions, which also has the benefit of providing >> the full functionality/flexibility of the instructions where earlier >> only some specific modes were supported. Put differently, with this >> change it's much easier to correlate the methods to the Intel manual and >> understand what they actually do. >> >> Specifically, the vinsert* family of instructions take three registers >> and an immediate which decide how the bits should be shuffled around, >> but without this change the method only allowed two of the registers to >> be specified, and the immediate was hard-coded to 0x01. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8151002 >> Webrev: >> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.00/webrev/ >> >> Special thanks to Mike Berg for helping discuss, co-develop, and test >> the change! 
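[Editor's note] For readers less familiar with AVX, the role of the imm8 operand in vinsert*, and the value-range assert suggested above, can be modeled in a few lines. This is an illustrative model only: the struct names are invented, and the real instruction operates on YMM registers, not memory structs.

```cpp
#include <cassert>
#include <cstdint>

// Invented types for illustration: a 256-bit value viewed as two 128-bit lanes.
struct V128 { uint64_t q[2]; };
struct V256 { V128 lane[2]; };

// Models VINSERTF128 dst, src, imm8: bit 0 of imm8 selects which 128-bit
// lane of the destination receives src; the other lane is preserved.
V256 vinsertf128_model(V256 dst, V128 src, int imm8) {
    assert(imm8 >= 0 && imm8 <= 1);  // the range check the review asks for
    dst.lane[imm8 & 0x01] = src;     // mirrors the emit_int8(imm8 & 0x01) masking
    return dst;
}
```

Exposing imm8 directly, instead of hard-coding 0x01, is what lets the assembler methods express every lane selection the hardware instruction supports.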
>> >> Cheers, >> Mikael >> From vivek.r.deshpande at intel.com Wed Mar 2 21:49:42 2016 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Wed, 2 Mar 2016 21:49:42 +0000 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <32DA77B0-7683-4CE9-9ED3-8461B70E1E19@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> <56D4ED29.1050108@oracle.com> <3A0386B6-8072-4D84-8AF7-01D904DAEADF@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A3AF1F@ORSMSX106.amr.corp.intel.com> <32DA77B0-7683-4CE9-9ED3-8461B70E1E19@oracle.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A56A3BE9C@ORSMSX106.amr.corp.intel.com> Hi Christian We could combine the declaration of the functions such as fast_sha256() for 32 bit and 64 bit in macroAssembler_x86.hpp using COMMA. We used COMMA to separate more arguments used in 64 bit using LP64_ONLY macro. Regards, Vivek -----Original Message----- From: Christian Thalinger [mailto:christian.thalinger at oracle.com] Sent: Wednesday, March 02, 2016 9:30 AM To: Deshpande, Vivek R Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling #define COMMA , Why do we have a define like this? It came in with 8139575. > On Mar 1, 2016, at 3:24 PM, Deshpande, Vivek R wrote: > > Hi Vladimir, Christian > > I have updated the code according your suggestion of file name change. > The updated webrev is at this location: > http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.02/ > Please let me know if I have to do anything more. > Also is there any change required to Makefile so that configurations.xml has name of the added file ? 
> > Regards, > Vivek > > -----Original Message----- > From: Christian Thalinger [mailto:christian.thalinger at oracle.com] > Sent: Monday, February 29, 2016 5:59 PM > To: Vladimir Kozlov > Cc: Deshpande, Vivek R; hotspot compiler; Rukmannagari, Shravya > Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling > > >> On Feb 29, 2016, at 3:15 PM, Vladimir Kozlov wrote: >> >> I am against to have "intel" in a file name. We have macroAssembler_libm_x86_*.cpp files for math intrinsics which does not have intel in name. So I prefer to not have it. I would suggest macroAssembler_sha_x86.cpp. > > I know we already have macroAssembler_libm_x86_*.cpp but macroAssembler_x86_.cpp would be better. > >> You can manipulate when to use it in vm_version_x86.cpp. >> >> Intel Copyright in the file's header is fine. >> >> Code changes are fine now (webrev.01). >> >> Thanks, >> Vladimir >> >> On 2/29/16 4:42 PM, Christian Thalinger wrote: >>> >>>> On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R wrote: >>>> >>>> Hi Christian >>>> >>>> We used the SHA Extension >>>> implementations(https://software.intel.com/en-us/articles/intel-sha-extensions-implementations) for the JVM implementation of SHA1 and SHA256. >>> >>> Will that extension only be available on Intel chips? >>> >>>> It needed to have Intel copyright, so created a separate file. >>> >>> That is reasonable. >>> >>>> The white paper for the implementation is https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. >>>> >>>> Regards, >>>> Vivek >>>> >>>> -----Original Message----- >>>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>>> Sent: Monday, February 29, 2016 1:58 PM >>>> To: Deshpande, Vivek R >>>> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>> enabling >>>> >>>> Why is the new file called macroAssembler_intel_x86.cpp? 
>>>> >>>>> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R wrote: >>>>> >>>>> HI Vladimir >>>>> >>>>> Thank you for your review. >>>>> I have updated the patch with the changes you have suggested. >>>>> The new webrev is at this location: >>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >>>>> >>>>> Regards >>>>> Vivek >>>>> >>>>> -----Original Message----- >>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>>> Sent: Friday, February 26, 2016 6:50 PM >>>>> To: Deshpande, Vivek R; hotspot compiler >>>>> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >>>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>>> enabling >>>>> >>>>> Very nice, Vivek!!! >>>>> >>>>> Did you run tests with both 32- and 64-bit VMs? >>>>> >>>>> Small notes: >>>>> >>>>> In vm_version_x86.hpp spacing are not aligned in next line: >>>>> >>>>> static bool supports_avxonly() { return ((supports_avx2() || >>>>> supports_avx()) && !supports_evex()); } >>>>> + static bool supports_sha() { return (_features & CPU_SHA) != 0; } >>>>> >>>>> Flags setting code in vm_version_x86.cpp should be like this (you can check supports_sha() only once, don't split '} else {' line, set UseSHA false if all intrinsics flags are false (I included UseSHA512Intrinsics for future) ): >>>>> >>>>> if (supports_sha()) { >>>>> if (FLAG_IS_DEFAULT(UseSHA)) { >>>>> UseSHA = true; >>>>> } >>>>> } else if (UseSHA) { >>>>> warning("SHA instructions are not available on this CPU"); >>>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>>> >>>>> if (UseSHA) { >>>>> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); >>>>> } >>>>> } else if (UseSHA1Intrinsics) { >>>>> warning("Intrinsics for SHA-1 crypto hash functions not available on this CPU."); >>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); } >>>>> >>>>> if (UseSHA) { >>>>> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); >>>>> } >>>>> } else if 
(UseSHA256Intrinsics) { >>>>> warning("Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU."); >>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); } >>>>> >>>>> if (UseSHA512Intrinsics) { >>>>> warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU."); >>>>> FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); } >>>>> >>>>> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA512Intrinsics)) { >>>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>>> >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>>>>> Hi all >>>>>> >>>>>> I would like to contribute a patch which optimizesSHA-1 >>>>>> andSHA-256 for >>>>>> 64 and 32 bitX86architecture using Intel SHA extensions. >>>>>> >>>>>> Could you please review and sponsor this patch. >>>>>> >>>>>> Bug-id: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>>>>> webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>>>>> >>>>>> Thanks and regards, >>>>>> >>>>>> Vivek >>>>>> >>>> >>> > From christian.thalinger at oracle.com Wed Mar 2 22:04:49 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Wed, 2 Mar 2016 12:04:49 -1000 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A56A3BE9C@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> <56D4ED29.1050108@oracle.com> <3A0386B6-8072-4D84-8AF7-01D904DAEADF@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A3AF1F@ORSMSX106.amr.corp.intel.com> <32DA77B0-7683-4CE9-9ED3-8461B70E1E19@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A3BE9C@ORSMSX106.amr.corp.intel.com> Message-ID: 
<126788E1-352A-413A-AA28-5E38064E9DBA@oracle.com> > On Mar 2, 2016, at 11:49 AM, Deshpande, Vivek R wrote: > > Hi Christian > > We could combine the declaration of the functions such as fast_sha256() for 32 bit and 64 bit in macroAssembler_x86.hpp using COMMA. > We used COMMA to separate more arguments used in 64 bit using LP64_ONLY macro. Ugh. This is ugly: void fast_pow(XMMRegister xmm0, XMMRegister xmm1, XMMRegister xmm2, XMMRegister xmm3, XMMRegister xmm4, XMMRegister xmm5, XMMRegister xmm6, XMMRegister xmm7, Register rax, Register rcx, Register rdx NOT_LP64(COMMA Register tmp) LP64_ONLY(COMMA Register tmp1) LP64_ONLY(COMMA Register tmp2) LP64_ONLY(COMMA Register tmp3) LP64_ONLY(COMMA Register tmp4)); Can't we use #ifdef instead? > > Regards, > Vivek > > -----Original Message----- > From: Christian Thalinger [mailto:christian.thalinger at oracle.com] > Sent: Wednesday, March 02, 2016 9:30 AM > To: Deshpande, Vivek R > Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya > Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling > > #define COMMA , > > Why do we have a define like this? It came in with 8139575. > >> On Mar 1, 2016, at 3:24 PM, Deshpande, Vivek R wrote: >> >> Hi Vladimir, Christian >> >> I have updated the code according your suggestion of file name change. >> The updated webrev is at this location: >> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.02/ >> Please let me know if I have to do anything more. >> Also is there any change required to Makefile so that configurations.xml has name of the added file ?
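[Editor's note] For context on the COMMA macro being questioned: a bare comma inside a macro argument would be parsed as an argument separator, so a COMMA token is needed to smuggle one through LP64_ONLY/NOT_LP64. A minimal sketch of the two styles discussed follows; the macro definitions here are simplified re-creations for illustration (not the real HotSpot definitions), and keying off `_LP64` is an assumption about the build.

```cpp
// Simplified re-creations for illustration only.
#define COMMA ,
#ifdef _LP64
#define LP64_ONLY(code) code
#define NOT_LP64(code)
#else
#define LP64_ONLY(code)
#define NOT_LP64(code) code
#endif

// Style 1: one declaration, conditional parameters spliced in via COMMA.
int fast_op(int a, int b NOT_LP64(COMMA int tmp)
                         LP64_ONLY(COMMA int tmp1 COMMA int tmp2)) {
    return a + b NOT_LP64(+ tmp) LP64_ONLY(+ tmp1 + tmp2);
}

// Style 2: plain #ifdef, as the review suggests; more verbose, but each
// configuration's full signature is readable at a glance.
#ifdef _LP64
int fast_op2(int a, int b, int tmp1, int tmp2) { return a + b + tmp1 + tmp2; }
#else
int fast_op2(int a, int b, int tmp) { return a + b + tmp; }
#endif
```

Both styles declare the same pair of signatures; the disagreement in the thread is purely about which one is easier to read.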
>> >> Regards, >> Vivek >> >> -----Original Message----- >> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >> Sent: Monday, February 29, 2016 5:59 PM >> To: Vladimir Kozlov >> Cc: Deshpande, Vivek R; hotspot compiler; Rukmannagari, Shravya >> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >> >> >>> On Feb 29, 2016, at 3:15 PM, Vladimir Kozlov wrote: >>> >>> I am against to have "intel" in a file name. We have macroAssembler_libm_x86_*.cpp files for math intrinsics which does not have intel in name. So I prefer to not have it. I would suggest macroAssembler_sha_x86.cpp. >> >> I know we already have macroAssembler_libm_x86_*.cpp but macroAssembler_x86_.cpp would be better. >> >>> You can manipulate when to use it in vm_version_x86.cpp. >>> >>> Intel Copyright in the file's header is fine. >>> >>> Code changes are fine now (webrev.01). >>> >>> Thanks, >>> Vladimir >>> >>> On 2/29/16 4:42 PM, Christian Thalinger wrote: >>>> >>>>> On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R wrote: >>>>> >>>>> Hi Christian >>>>> >>>>> We used the SHA Extension >>>>> implementations(https://software.intel.com/en-us/articles/intel-sha-extensions-implementations) for the JVM implementation of SHA1 and SHA256. >>>> >>>> Will that extension only be available on Intel chips? >>>> >>>>> It needed to have Intel copyright, so created a separate file. >>>> >>>> That is reasonable. >>>> >>>>> The white paper for the implementation is https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. 
>>>>> >>>>> Regards, >>>>> Vivek >>>>> >>>>> -----Original Message----- >>>>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>>>> Sent: Monday, February 29, 2016 1:58 PM >>>>> To: Deshpande, Vivek R >>>>> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >>>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>>> enabling >>>>> >>>>> Why is the new file called macroAssembler_intel_x86.cpp? >>>>> >>>>>> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R wrote: >>>>>> >>>>>> HI Vladimir >>>>>> >>>>>> Thank you for your review. >>>>>> I have updated the patch with the changes you have suggested. >>>>>> The new webrev is at this location: >>>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >>>>>> >>>>>> Regards >>>>>> Vivek >>>>>> >>>>>> -----Original Message----- >>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>>>> Sent: Friday, February 26, 2016 6:50 PM >>>>>> To: Deshpande, Vivek R; hotspot compiler >>>>>> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >>>>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>>>> enabling >>>>>> >>>>>> Very nice, Vivek!!! >>>>>> >>>>>> Did you run tests with both 32- and 64-bit VMs? 
>>>>>> >>>>>> Small notes: >>>>>> >>>>>> In vm_version_x86.hpp spacing are not aligned in next line: >>>>>> >>>>>> static bool supports_avxonly() { return ((supports_avx2() || >>>>>> supports_avx()) && !supports_evex()); } >>>>>> + static bool supports_sha() { return (_features & CPU_SHA) != 0; } >>>>>> >>>>>> Flags setting code in vm_version_x86.cpp should be like this (you can check supports_sha() only once, don't split '} else {' line, set UseSHA false if all intrinsics flags are false (I included UseSHA512Intrinsics for future) ): >>>>>> >>>>>> if (supports_sha()) { >>>>>> if (FLAG_IS_DEFAULT(UseSHA)) { >>>>>> UseSHA = true; >>>>>> } >>>>>> } else if (UseSHA) { >>>>>> warning("SHA instructions are not available on this CPU"); >>>>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>>>> >>>>>> if (UseSHA) { >>>>>> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >>>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); >>>>>> } >>>>>> } else if (UseSHA1Intrinsics) { >>>>>> warning("Intrinsics for SHA-1 crypto hash functions not available on this CPU."); >>>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); } >>>>>> >>>>>> if (UseSHA) { >>>>>> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >>>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); >>>>>> } >>>>>> } else if (UseSHA256Intrinsics) { >>>>>> warning("Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU."); >>>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); } >>>>>> >>>>>> if (UseSHA512Intrinsics) { >>>>>> warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU."); >>>>>> FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); } >>>>>> >>>>>> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA512Intrinsics)) { >>>>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Vladimir >>>>>> >>>>>> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>>>>>> Hi all >>>>>>> >>>>>>> I would like to contribute a patch which optimizesSHA-1 >>>>>>> andSHA-256 for >>>>>>> 64 and 32 
bitX86architecture using Intel SHA extensions. >>>>>>> >>>>>>> Could you please review and sponsor this patch. >>>>>>> >>>>>>> Bug-id: >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>>>>>> webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>>>>>> >>>>>>> Thanks and regards, >>>>>>> >>>>>>> Vivek >>>>>>> >>>>> >>>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Wed Mar 2 22:13:43 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 2 Mar 2016 14:13:43 -0800 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <126788E1-352A-413A-AA28-5E38064E9DBA@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> <56D4ED29.1050108@oracle.com> <3A0386B6-8072-4D84-8AF7-01D904DAEADF@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A3AF1F@ORSMSX106.amr.corp.intel.com> <32DA77B0-7683-4CE9-9ED3-8461B70E1E19@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A3BE9C@ORSMSX106.amr.corp.intel.com> <126788E1-352A-413A-AA28-5E38064E9DBA@oracle.com> Message-ID: <56D76597.40205@oracle.com> On 3/2/16 2:04 PM, Christian Thalinger wrote: > >> On Mar 2, 2016, at 11:49 AM, Deshpande, Vivek R >> > wrote: >> >> Hi Christian >> >> We could combine the declaration of the functions such as >> fast_sha256() for 32 bit and 64 bit in macroAssembler_x86.hpp using COMMA. >> We used COMMA to separate more arguments used in 64 bit using >> LP64_ONLY macro. > > Ugh. 
This is ugly: > > void fast_pow(XMMRegister xmm0, XMMRegister xmm1, XMMRegister xmm2, > XMMRegister xmm3, XMMRegister xmm4, > XMMRegister xmm5, XMMRegister xmm6, XMMRegister xmm7, Register rax, Register rcx, > Register rdx NOT_LP64(COMMA Register tmp) LP64_ONLY(COMMA Register tmp1) > LP64_ONLY(COMMA Register tmp2) LP64_ONLY(COMMA Register > tmp3) LP64_ONLY(COMMA Register tmp4)); > > Can't we use #ifdef instead? +1 for #ifdef It is not first time we scope additional parameters with #ifdef. Vladimir > >> >> Regards, >> Vivek >> >> -----Original Message----- >> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >> Sent: Wednesday, March 02, 2016 9:30 AM >> To: Deshpande, Vivek R >> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >> >> #define COMMA , >> >> Why do we have a define like this? It came in with 8139575. >> >>> On Mar 1, 2016, at 3:24 PM, Deshpande, Vivek R >>> > wrote: >>> >>> Hi Vladimir, Christian >>> >>> I have updated the code according your suggestion of file name change. >>> The updated webrev is at this location: >>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.02/ >>> Please let me know if I have to do anything more. >>> Also is there any change required to Makefile so that >>> configurations.xml has name of the added file ? >>> >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>> Sent: Monday, February 29, 2016 5:59 PM >>> To: Vladimir Kozlov >>> Cc: Deshpande, Vivek R; hotspot compiler; Rukmannagari, Shravya >>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >>> >>> >>>> On Feb 29, 2016, at 3:15 PM, Vladimir Kozlov >>>> wrote: >>>> >>>> I am against to have "intel" in a file name. We have >>>> macroAssembler_libm_x86_*.cpp files for math intrinsics which does >>>> not have intel in name. So I prefer to not have it. 
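As a self-contained sketch of the COMMA-versus-#ifdef trade-off discussed above (illustration only, not HotSpot code: SKETCH_LP64, sum_macro and sum_ifdef are invented names, and the NOT_LP64/LP64_ONLY/COMMA macros are re-defined locally to mimic the real HotSpot ones):

```cpp
#include <cassert>

// Assume a 64-bit build for this sketch.
#define SKETCH_LP64 1

#if SKETCH_LP64
  #define LP64_ONLY(code) code
  #define NOT_LP64(code)
#else
  #define LP64_ONLY(code)
  #define NOT_LP64(code) code
#endif
#define COMMA ,

// Style 1: the COMMA-macro form objected to above. On a 64-bit build
// this expands to (int a, int b, int tmp1, int tmp2).
int sum_macro(int a, int b NOT_LP64(COMMA int tmp)
              LP64_ONLY(COMMA int tmp1) LP64_ONLY(COMMA int tmp2)) {
  return a + b NOT_LP64(+ tmp) LP64_ONLY(+ tmp1 + tmp2);
}

// Style 2: the plain #ifdef form the reviewers prefer; the conditional
// parameters read as ordinary code instead of macro soup.
int sum_ifdef(int a, int b
#if SKETCH_LP64
              , int tmp1, int tmp2
#else
              , int tmp
#endif
             ) {
#if SKETCH_LP64
  return a + b + tmp1 + tmp2;
#else
  return a + b + tmp;
#endif
}
```

The COMMA token exists only because a bare comma inside a macro argument would be taken as an argument separator; with SKETCH_LP64 set to 0, both signatures collapse to the three-parameter 32-bit form.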
I would suggest >>>> macroAssembler_sha_x86.cpp. >>> >>> I know we already have macroAssembler_libm_x86_*.cpp but >>> macroAssembler_x86_.cpp would be better. >>> >>>> You can manipulate when to use it in vm_version_x86.cpp. >>>> >>>> Intel Copyright in the file's header is fine. >>>> >>>> Code changes are fine now (webrev.01). >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 2/29/16 4:42 PM, Christian Thalinger wrote: >>>>> >>>>>> On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R >>>>>> wrote: >>>>>> >>>>>> Hi Christian >>>>>> >>>>>> We used the SHA Extension >>>>>> implementations(https://software.intel.com/en-us/articles/intel-sha-extensions-implementations) >>>>>> for the JVM implementation of SHA1 and SHA256. >>>>> >>>>> Will that extension only be available on Intel chips? >>>>> >>>>>> It needed to have Intel copyright, so created a separate file. >>>>> >>>>> That is reasonable. >>>>> >>>>>> The white paper for the implementation is >>>>>> https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. >>>>>> >>>>>> Regards, >>>>>> Vivek >>>>>> >>>>>> -----Original Message----- >>>>>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>>>>> Sent: Monday, February 29, 2016 1:58 PM >>>>>> To: Deshpande, Vivek R >>>>>> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >>>>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>>>> enabling >>>>>> >>>>>> Why is the new file called macroAssembler_intel_x86.cpp? >>>>>> >>>>>>> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R >>>>>>> wrote: >>>>>>> >>>>>>> HI Vladimir >>>>>>> >>>>>>> Thank you for your review. >>>>>>> I have updated the patch with the changes you have suggested. 
>>>>>>> The new webrev is at this location: >>>>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >>>>>>> >>>>>>> Regards >>>>>>> Vivek >>>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>>>>> Sent: Friday, February 26, 2016 6:50 PM >>>>>>> To: Deshpande, Vivek R; hotspot compiler >>>>>>> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >>>>>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>>>>> enabling >>>>>>> >>>>>>> Very nice, Vivek!!! >>>>>>> >>>>>>> Did you run tests with both 32- and 64-bit VMs? >>>>>>> >>>>>>> Small notes: >>>>>>> >>>>>>> In vm_version_x86.hpp spacing are not aligned in next line: >>>>>>> >>>>>>> static bool supports_avxonly() { return ((supports_avx2() || >>>>>>> supports_avx()) && !supports_evex()); } >>>>>>> + static bool supports_sha() { return (_features & CPU_SHA) >>>>>>> != 0; } >>>>>>> >>>>>>> Flags setting code in vm_version_x86.cpp should be like this (you >>>>>>> can check supports_sha() only once, don't split '} else {' line, >>>>>>> set UseSHA false if all intrinsics flags are false (I included >>>>>>> UseSHA512Intrinsics for future) ): >>>>>>> >>>>>>> if (supports_sha()) { >>>>>>> if (FLAG_IS_DEFAULT(UseSHA)) { >>>>>>> UseSHA = true; >>>>>>> } >>>>>>> } else if (UseSHA) { >>>>>>> warning("SHA instructions are not available on this CPU"); >>>>>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>>>>> >>>>>>> if (UseSHA) { >>>>>>> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >>>>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); >>>>>>> } >>>>>>> } else if (UseSHA1Intrinsics) { >>>>>>> warning("Intrinsics for SHA-1 crypto hash functions not >>>>>>> available on this CPU."); >>>>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); } >>>>>>> >>>>>>> if (UseSHA) { >>>>>>> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >>>>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); >>>>>>> } >>>>>>> } else if (UseSHA256Intrinsics) { >>>>>>> warning("Intrinsics for SHA-224 
and SHA-256 crypto hash >>>>>>> functions not available on this CPU."); >>>>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); } >>>>>>> >>>>>>> if (UseSHA512Intrinsics) { >>>>>>> warning("Intrinsics for SHA-384 and SHA-512 crypto hash >>>>>>> functions not available on this CPU."); >>>>>>> FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); } >>>>>>> >>>>>>> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || >>>>>>> UseSHA512Intrinsics)) { >>>>>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>>>>>>> Hi all >>>>>>>> >>>>>>>> I would like to contribute a patch which optimizes SHA-1 >>>>>>>> and SHA-256 for >>>>>>>> 64 and 32 bit X86 architecture using Intel SHA extensions. >>>>>>>> >>>>>>>> Could you please review and sponsor this patch. >>>>>>>> >>>>>>>> Bug-id: >>>>>>>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>>>>>>> webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>>>>>>> >>>>>>>> Thanks and regards, >>>>>>>> >>>>>>>> Vivek >>>>>>>> >>>>>> >>>>> >>> >> > From mikael.vidstedt at oracle.com Wed Mar 2 23:02:43 2016 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Wed, 2 Mar 2016 15:02:43 -0800 Subject: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions In-Reply-To: <56D75750.6020400@oracle.com> References: <56D63301.9050909@oracle.com> <56D697E4.8060104@oracle.com> <56D75750.6020400@oracle.com> Message-ID: <56D77113.30906@oracle.com> After discussing with Vladimir off-list we agreed that changing the type of the immediate (imm8) argument to uint8_t is both clearer, has the potential to catch incorrect uses of the functions, and also makes the asserts more straightforward. In addition to that Vladimir noted that I had accidentally included newline in the assert messages. 
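The shape of that change — taking the immediate as uint8_t and asserting its range before emitting — can be sketched as follows (an illustration, not the actual Assembler code: SketchAssembler and its byte vector are invented stand-ins for the real Assembler and its code buffer):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// For vinsertf128/vextractf128 the immediate selects the low (0) or
// high (1) 128-bit lane, so only bit 0 is meaningful; declaring the
// parameter as uint8_t plus asserting its range catches bad call
// sites early instead of silently masking them.
struct SketchAssembler {
  std::vector<uint8_t> code;  // stand-in for the real code buffer

  void emit_int8(uint8_t b) { code.push_back(b); }

  void vinsertf128(uint8_t imm8) {
    assert(imm8 <= 0x01 && "imm8: 0 selects low, 1 selects high 128 bits");
    emit_int8(imm8 & 0x01);  // mirrors the 'emit_int8(imm8 & 0x01)' quoted below
  }
};
```

With a plain int parameter and no assert, an out-of-range selector would just be masked away; with uint8_t and the assert, a debug build fails at the offending call site.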
New webrev: Full: http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02/webrev/ Incremental from webrev.01: http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02.incr/webrev/ Cheers, Mikael On 2016-03-02 13:12, Mikael Vidstedt wrote: > > Updated webrev: > > http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01/webrev/ > > Incremental from webrev.00: > > http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01.incr/webrev/ > > Comments below... > > On 2016-03-01 23:36, Vladimir Ivanov wrote: >> Nice cleanup, Mikael! >> >> src/cpu/x86/vm/assembler_x86.hpp: >> >> Outdated comments: >> // Copy low 128bit into high 128bit of YMM registers. >> >> // Load/store high 128bit of YMM registers which does not destroy >> other half. >> >> // Copy low 256bit into high 256bit of ZMM registers. > > Updated, thanks for catching! > >> src/cpu/x86/vm/assembler_x86.cpp: >> >> ! emit_int8(imm8 & 0x01); >> >> Maybe additionally assert valid imm8 range? > > Good idea, I had added asserts earlier but removed them. I added them > back again! > >> Maybe keep vinsert*h variants and move them to MacroAssembler? They >> look clearer in some contextes: >> >> - __ vextractf128h(Address(rsp, base_addr+n*16), >> as_XMMRegister(n)); >> + __ vextractf128(Address(rsp, base_addr+n*16), >> as_XMMRegister(n), 1); > > Can I suggest that we try to live without them for a while and see how > much we miss them? I think having it there may actually be more > confusing in many cases :) > > Cheers, > Mikael > >> >> Otherwise, looks good. >> >> Best regards, >> Vladimir Ivanov >> >> On 3/2/16 3:25 AM, Mikael Vidstedt wrote: >>> >>> Please review the following change which updates the various vextract* >>> and vinsert* methods in assembler_x86 & macroAssembler_x86 to better >>> match the real HW instructions, which also has the benefit of providing >>> the full functionality/flexibility of the instructions where earlier >>> only some specific modes were supported. 
Put differently, with this >>> change it's much easier to correlate the methods to the Intel manual >>> and >>> understand what they actually do. >>> >>> Specifically, the vinsert* family of instructions take three registers >>> and an immediate which decide how the bits should be shuffled around, >>> but without this change the method only allowed two of the registers to >>> be specified, and the immediate was hard-coded to 0x01. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8151002 >>> Webrev: >>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.00/webrev/ >>> >>> Special thanks to Mike Berg for helping discuss, co-develop, and test >>> the change! >>> >>> Cheers, >>> Mikael >>> > From vladimir.kozlov at oracle.com Wed Mar 2 23:31:40 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 2 Mar 2016 15:31:40 -0800 Subject: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions In-Reply-To: <56D77113.30906@oracle.com> References: <56D63301.9050909@oracle.com> <56D697E4.8060104@oracle.com> <56D75750.6020400@oracle.com> <56D77113.30906@oracle.com> Message-ID: <56D777DC.70708@oracle.com> On 3/2/16 3:02 PM, Mikael Vidstedt wrote: > > After discussing with Vladimir off-list we agreed that changing the type It was Vladimir Ivanov. > of the immediate (imm8) argument to uint8_t is both clearer, has the > potential to catch incorrect uses of the functions, and also makes the > asserts more straightforward. In addition to that Vladimir noted that I > had accidentally included newline in the assert messages. > > New webrev: > > Full: > > http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02/webrev/ I agree with Vladimir I. that we should have macroassembler instructions vinserti128high, vinserti128low, etc. instead of passing imm8. It is more informative. Also why we add new nds->is_valid() checks into assembler instructions? 
We are going to remove them: https://bugs.openjdk.java.net/browse/JDK-8151003 I know that Mikael had a discussion about this with Michael. So I would like to see arguments here. Michael? Current code always pass correct registers and x86 Manual requires to have valid registers. Thanks, Vladimir > > Incremental from webrev.01: > > http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02.incr/webrev/ > > Cheers, > Mikael > > On 2016-03-02 13:12, Mikael Vidstedt wrote: >> >> Updated webrev: >> >> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01/webrev/ >> >> Incremental from webrev.00: >> >> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01.incr/webrev/ >> >> Comments below... >> >> On 2016-03-01 23:36, Vladimir Ivanov wrote: >>> Nice cleanup, Mikael! >>> >>> src/cpu/x86/vm/assembler_x86.hpp: >>> >>> Outdated comments: >>> // Copy low 128bit into high 128bit of YMM registers. >>> >>> // Load/store high 128bit of YMM registers which does not destroy >>> other half. >>> >>> // Copy low 256bit into high 256bit of ZMM registers. >> >> Updated, thanks for catching! >> >>> src/cpu/x86/vm/assembler_x86.cpp: >>> >>> ! emit_int8(imm8 & 0x01); >>> >>> Maybe additionally assert valid imm8 range? >> >> Good idea, I had added asserts earlier but removed them. I added them >> back again! >> >>> Maybe keep vinsert*h variants and move them to MacroAssembler? They >>> look clearer in some contextes: >>> >>> - __ vextractf128h(Address(rsp, base_addr+n*16), >>> as_XMMRegister(n)); >>> + __ vextractf128(Address(rsp, base_addr+n*16), >>> as_XMMRegister(n), 1); >> >> Can I suggest that we try to live without them for a while and see how >> much we miss them? I think having it there may actually be more >> confusing in many cases :) >> >> Cheers, >> Mikael >> >>> >>> Otherwise, looks good. 
>>> >>> Best regards, >>> Vladimir Ivanov >>> >>> On 3/2/16 3:25 AM, Mikael Vidstedt wrote: >>>> >>>> Please review the following change which updates the various vextract* >>>> and vinsert* methods in assembler_x86 & macroAssembler_x86 to better >>>> match the real HW instructions, which also has the benefit of providing >>>> the full functionality/flexibility of the instructions where earlier >>>> only some specific modes were supported. Put differently, with this >>>> change it's much easier to correlate the methods to the Intel manual >>>> and >>>> understand what they actually do. >>>> >>>> Specifically, the vinsert* family of instructions take three registers >>>> and an immediate which decide how the bits should be shuffled around, >>>> but without this change the method only allowed two of the registers to >>>> be specified, and the immediate was hard-coded to 0x01. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8151002 >>>> Webrev: >>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.00/webrev/ >>>> >>>> Special thanks to Mike Berg for helping discuss, co-develop, and test >>>> the change! >>>> >>>> Cheers, >>>> Mikael >>>> >> > From michael.c.berg at intel.com Thu Mar 3 04:08:28 2016 From: michael.c.berg at intel.com (Berg, Michael C) Date: Thu, 3 Mar 2016 04:08:28 +0000 Subject: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions In-Reply-To: <56D777DC.70708@oracle.com> References: <56D63301.9050909@oracle.com> <56D697E4.8060104@oracle.com> <56D75750.6020400@oracle.com> <56D77113.30906@oracle.com> <56D777DC.70708@oracle.com> Message-ID: Vladimir (K), just for the time being as the problem isn't just confined to these instructions (the nds issue). I have assigned the bug below to myself and will take a holistic view over the issue in its full context. 
The instructions modified in the webrev, like in the documentation that exists regarding their definitions, are all programmable via what is loosely labeled as the imm8 field in the formal documentation. I think we should leave them that way. The onus of these changes was to make instructions look more like their ISA manual definitions. I think Vladimir Ivanov was saying, and please chime in Vladimir if I do not interpret correctly, wasn't high/low, it was leaving a signature like what we had in place in the macro assembler, and invoking the precise names there. I don't think that is needed though, as the macro assembler's job is to interpret a meaning and do a mapping. Regards, Michael -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Wednesday, March 02, 2016 3:32 PM To: hotspot-compiler-dev at openjdk.java.net Cc: Berg, Michael C Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions On 3/2/16 3:02 PM, Mikael Vidstedt wrote: > > After discussing with Vladimir off-list we agreed that changing the > type It was Vladimir Ivanov. > of the immediate (imm8) argument to uint8_t is both clearer, has the > potential to catch incorrect uses of the functions, and also makes the > asserts more straightforward. In addition to that Vladimir noted that > I had accidentally included newline in the assert messages. > > New webrev: > > Full: > > http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02/webrev/ I agree with Vladimir I. that we should have macroassembler instructions vinserti128high, vinserti128low, etc. instead of passing imm8. It is more informative. Also why we add new nds->is_valid() checks into assembler instructions? We are going to remove them: https://bugs.openjdk.java.net/browse/JDK-8151003 I know that Mikael had a discussion about this with Michael. So I would like to see arguments here. Michael? 
Current code always pass correct registers and x86 Manual requires to have valid registers. Thanks, Vladimir > > Incremental from webrev.01: > > http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02.incr/webr > ev/ > > Cheers, > Mikael > > On 2016-03-02 13:12, Mikael Vidstedt wrote: >> >> Updated webrev: >> >> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01/webrev/ >> >> Incremental from webrev.00: >> >> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01.incr/web >> rev/ >> >> Comments below... >> >> On 2016-03-01 23:36, Vladimir Ivanov wrote: >>> Nice cleanup, Mikael! >>> >>> src/cpu/x86/vm/assembler_x86.hpp: >>> >>> Outdated comments: >>> // Copy low 128bit into high 128bit of YMM registers. >>> >>> // Load/store high 128bit of YMM registers which does not >>> destroy other half. >>> >>> // Copy low 256bit into high 256bit of ZMM registers. >> >> Updated, thanks for catching! >> >>> src/cpu/x86/vm/assembler_x86.cpp: >>> >>> ! emit_int8(imm8 & 0x01); >>> >>> Maybe additionally assert valid imm8 range? >> >> Good idea, I had added asserts earlier but removed them. I added them >> back again! >> >>> Maybe keep vinsert*h variants and move them to MacroAssembler? They >>> look clearer in some contextes: >>> >>> - __ vextractf128h(Address(rsp, base_addr+n*16), >>> as_XMMRegister(n)); >>> + __ vextractf128(Address(rsp, base_addr+n*16), >>> as_XMMRegister(n), 1); >> >> Can I suggest that we try to live without them for a while and see >> how much we miss them? I think having it there may actually be more >> confusing in many cases :) >> >> Cheers, >> Mikael >> >>> >>> Otherwise, looks good. 
>>> >>> Best regards, >>> Vladimir Ivanov >>> >>> On 3/2/16 3:25 AM, Mikael Vidstedt wrote: >>>> >>>> Please review the following change which updates the various >>>> vextract* and vinsert* methods in assembler_x86 & >>>> macroAssembler_x86 to better match the real HW instructions, which >>>> also has the benefit of providing the full >>>> functionality/flexibility of the instructions where earlier only >>>> some specific modes were supported. Put differently, with this >>>> change it's much easier to correlate the methods to the Intel >>>> manual and understand what they actually do. >>>> >>>> Specifically, the vinsert* family of instructions take three >>>> registers and an immediate which decide how the bits should be >>>> shuffled around, but without this change the method only allowed >>>> two of the registers to be specified, and the immediate was hard-coded to 0x01. >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8151002 >>>> Webrev: >>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.00/webrev >>>> / >>>> >>>> Special thanks to Mike Berg for helping discuss, co-develop, and >>>> test the change! >>>> >>>> Cheers, >>>> Mikael >>>> >> > From tobias.hartmann at oracle.com Thu Mar 3 12:12:24 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 3 Mar 2016 13:12:24 +0100 Subject: [9] RFR(S): 8151130: [BACKOUT] Remove Method::_method_data for C1 Message-ID: <56D82A28.1090801@oracle.com> Hi, please review the following patch that backs out JDK-8147978 [1] because the fix is causing failures in the hs-comp nightlies that block integration. 
https://bugs.openjdk.java.net/browse/JDK-8151130 http://cr.openjdk.java.net/~thartmann/8151130/webrev.00/ According to the GK rules, I set JDK-8147978 to "verification: fix failed" and created a [REDO] issue: https://bugs.openjdk.java.net/browse/JDK-8151155 Best regards, Tobias [1] https://bugs.openjdk.java.net/browse/JDK-8147978 http://cr.openjdk.java.net/~cjplummer/8147978/webrev.04/webrev.hotspot/ From vladimir.x.ivanov at oracle.com Thu Mar 3 12:15:26 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 3 Mar 2016 15:15:26 +0300 Subject: [9] RFR(S): 8151130: [BACKOUT] Remove Method::_method_data for C1 In-Reply-To: <56D82A28.1090801@oracle.com> References: <56D82A28.1090801@oracle.com> Message-ID: <56D82ADE.6060109@oracle.com> Reviewed. Best regards, Vladimir Ivanov On 3/3/16 3:12 PM, Tobias Hartmann wrote: > Hi, > > please review the following patch that backs out JDK-8147978 [1] because the fix is causing failures in the hs-comp nightlies that block integration. > > https://bugs.openjdk.java.net/browse/JDK-8151130 > http://cr.openjdk.java.net/~thartmann/8151130/webrev.00/ > > According to the GK rules, I set JDK-8147978 to "verification: fix failed" and created a [REDO] issue: > https://bugs.openjdk.java.net/browse/JDK-8151155 > > Best regards, > Tobias > > [1] > https://bugs.openjdk.java.net/browse/JDK-8147978 > http://cr.openjdk.java.net/~cjplummer/8147978/webrev.04/webrev.hotspot/ > From tobias.hartmann at oracle.com Thu Mar 3 12:16:15 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 3 Mar 2016 13:16:15 +0100 Subject: [9] RFR(S): 8151130: [BACKOUT] Remove Method::_method_data for C1 In-Reply-To: <56D82ADE.6060109@oracle.com> References: <56D82A28.1090801@oracle.com> <56D82ADE.6060109@oracle.com> Message-ID: <56D82B0F.5040107@oracle.com> Thanks, Vladimir. Best, Tobias On 03.03.2016 13:15, Vladimir Ivanov wrote: > Reviewed. 
> > Best regards, > Vladimir Ivanov > > On 3/3/16 3:12 PM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch that backs out JDK-8147978 [1] because the fix is causing failures in the hs-comp nightlies that block integration. >> >> https://bugs.openjdk.java.net/browse/JDK-8151130 >> http://cr.openjdk.java.net/~thartmann/8151130/webrev.00/ >> >> According to the GK rules, I set JDK-8147978 to "verification: fix failed" and created a [REDO] issue: >> https://bugs.openjdk.java.net/browse/JDK-8151155 >> >> Best regards, >> Tobias >> >> [1] >> https://bugs.openjdk.java.net/browse/JDK-8147978 >> http://cr.openjdk.java.net/~cjplummer/8147978/webrev.04/webrev.hotspot/ >> From zoltan.majo at oracle.com Thu Mar 3 12:18:10 2016 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Thu, 3 Mar 2016 13:18:10 +0100 Subject: [9] RFR(S): 8151130: [BACKOUT] Remove Method::_method_data for C1 In-Reply-To: <56D82A28.1090801@oracle.com> References: <56D82A28.1090801@oracle.com> Message-ID: <56D82B82.3030003@oracle.com> Hi Tobias, looks good to me as well! Thank you for taking care of this. Best regards, Zoltan On 03/03/2016 01:12 PM, Tobias Hartmann wrote: > Hi, > > please review the following patch that backs out JDK-8147978 [1] because the fix is causing failures in the hs-comp nightlies that block integration. 
> > https://bugs.openjdk.java.net/browse/JDK-8151130 > http://cr.openjdk.java.net/~thartmann/8151130/webrev.00/ > > According to the GK rules, I set JDK-8147978 to "verification: fix failed" and created a [REDO] issue: > https://bugs.openjdk.java.net/browse/JDK-8151155 > > Best regards, > Tobias > > [1] > https://bugs.openjdk.java.net/browse/JDK-8147978 > http://cr.openjdk.java.net/~cjplummer/8147978/webrev.04/webrev.hotspot/ From tobias.hartmann at oracle.com Thu Mar 3 12:19:17 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 3 Mar 2016 13:19:17 +0100 Subject: [9] RFR(S): 8151130: [BACKOUT] Remove Method::_method_data for C1 In-Reply-To: <56D82B82.3030003@oracle.com> References: <56D82A28.1090801@oracle.com> <56D82B82.3030003@oracle.com> Message-ID: <56D82BC5.8090602@oracle.com> Thanks, Zoltan. Best regards, Tobias On 03.03.2016 13:18, Zoltán Majó wrote: > Hi Tobias, > > > looks good to me as well! Thank you for taking care of this. > > Best regards, > > > Zoltan > > > On 03/03/2016 01:12 PM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch that backs out JDK-8147978 [1] because the fix is causing failures in the hs-comp nightlies that block integration. 
>> >> https://bugs.openjdk.java.net/browse/JDK-8151130 >> http://cr.openjdk.java.net/~thartmann/8151130/webrev.00/ >> >> According to the GK rules, I set JDK-8147978 to "verification: fix failed" and created a [REDO] issue: >> https://bugs.openjdk.java.net/browse/JDK-8151155 >> >> Best regards, >> Tobias >> >> [1] >> https://bugs.openjdk.java.net/browse/JDK-8147978 >> http://cr.openjdk.java.net/~cjplummer/8147978/webrev.04/webrev.hotspot/ > From vladimir.x.ivanov at oracle.com Thu Mar 3 12:26:50 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 3 Mar 2016 15:26:50 +0300 Subject: [9] RFR (XS): 8151157: Quarantine test/compiler/unsafe/UnsafeGetStableArrayElement.java Message-ID: <56D82D8A.2030205@oracle.com> http://cr.openjdk.java.net/~vlivanov/8151157/webrev.00/ The test is executed in unsupported configuration: with Client VM. Quarantine the test until the problem is fixed (JDK-8151137). Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8151137 From zoltan.majo at oracle.com Thu Mar 3 12:28:19 2016 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Thu, 3 Mar 2016 13:28:19 +0100 Subject: [9] RFR (XS): 8151157: Quarantine test/compiler/unsafe/UnsafeGetStableArrayElement.java In-Reply-To: <56D82D8A.2030205@oracle.com> References: <56D82D8A.2030205@oracle.com> Message-ID: <56D82DE3.4040100@oracle.com> Hi Vladimir, this looks good to me! Thank you and best regards, Zoltan On 03/03/2016 01:26 PM, Vladimir Ivanov wrote: > http://cr.openjdk.java.net/~vlivanov/8151157/webrev.00/ > > The test is executed in unsupported configuration: with Client VM. > > Quarantine the test until the problem is fixed (JDK-8151137). 
> > Best regards, > Vladimir Ivanov > > [1] https://bugs.openjdk.java.net/browse/JDK-8151137 From tobias.hartmann at oracle.com Thu Mar 3 12:51:38 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 3 Mar 2016 13:51:38 +0100 Subject: [9] RFR (XS): 8151157: Quarantine test/compiler/unsafe/UnsafeGetStableArrayElement.java In-Reply-To: <56D82D8A.2030205@oracle.com> References: <56D82D8A.2030205@oracle.com> Message-ID: <56D8335A.6070605@oracle.com> Hi Vladimir, looks good to me. Best regards, Tobias On 03.03.2016 13:26, Vladimir Ivanov wrote: > http://cr.openjdk.java.net/~vlivanov/8151157/webrev.00/ > > The test is executed in unsupported configuration: with Client VM. > > Quarantine the test until the problem is fixed (JDK-8151137). > > Best regards, > Vladimir Ivanov > > [1] https://bugs.openjdk.java.net/browse/JDK-8151137 From nils.eliasson at oracle.com Thu Mar 3 12:47:46 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Thu, 3 Mar 2016 13:47:46 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5A60A.50700@oracle.com> <56D5D066.7040805@oracle.com> <56D6DE47.7060405@oracle.com> <56D6EC92.7010309@oracle.com> Message-ID: <56D83272.9010202@oracle.com> Hi Volker, On 2016-03-02 17:36, Volker Simonis wrote: > Hi Nils, > > your last webrev (jdk.03 and hotspot.05)) looks pretty good! Ive used > is as base for my new webrevs at: > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v3 > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v3 > > > I've updated the copyrights, added the current reviewers and also > added us both in the Contributed-by line (hope that's fine for you). Absolutely > > Except that, I've only done the following minor fixes/changes: > * > compileBroker.{cpp,hpp}* > > - we don't need CompileBroker::is_compile_blocking() anymore. 
Good > > *compilerDirectives.hpp* > > - I think we should use > cflags(BackgroundCompilation, bool, BackgroundCompilation, > BackgroundCompilation) > instead of: > cflags(BackgroundCompilation, bool, BackgroundCompilation, X) > > so we can also trigger blocking compiles from the command line with a > CompileCommand (e.g. > -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false") > That's very handy during development or and also for simple tests > where we don't want to mess with compiler directives. (And the > overhead to keep this feature is quite small, just > "BackgroundCompilation" instead of "X" ;-) Without a very strong use case for this I don't want it as a CompileCommand. CompileCommand options do have a cost - they force a temporary unique copy of the directive if any option command matches negating some of the positive effects of directives. Also the CompileCommands are stringly typed, no compile time name or type check is done. This can be fixed in various ways, but until then I prefer not to add it. > > *whitebox.cpp* > > I think it is good that you fixed the state but I think it is too > complicated now. We don't need to strdup the string and can easily > forget to free 'tmpstr' :) So maybe it is simpler to just do another > transition for parsing the directive: > > { > ThreadInVMfromNative ttvfn(thread); // back to VM > DirectivesParser::parse_string(dir, tty); > } Transitions are not free, but on the other hand the string may be long. This is not a hot path in anyway so lets go with simple. > > *advancedThresholdPolicy.cpp > * > - the JVMCI code looks reasonable (although I haven't tested JVMCI) > and is actually even an improvement over my code which just picked the > first blocking compilation. Feels good to remove the special cases. 
> > *diagnosticCommand.cpp > > *- Shouldn't you also fix CompilerDirectivesAddDCmd to return the > number of added directives and CompilerDirectivesRemoveDCmd to take > the number of directives you want to pop? Or do you want to do this in > a later, follow-up change? Yes, lets do that in a follow up change. They affect a number of tests. > > *WhiteBox.java* > > - I still think it would make sense to keep the two 'blocking' > versions of enqueueMethodForCompilation() for convenience. For > example your test fix for JDK-8073793 would be much simpler if you > used them. I've added two comments to the 'blocking' convenience > methods to mention the fact that calling them may shadow previously > added compiler directives. I am ok with having then, but think Whitebox.java will get too bloated. I would rather have the convenience-methods in some test utility class, like CompilerWhiteBoxTest.java. > > *BlockingCompilation.java* > > - I've extended my regression test to test both methods of doing > blocking compilation - with the new, 'blocking' > enqueueMethodForCompilation() methods as well as by manually setting > the corresponding compiler directives. If we should finally get > consensus on removing the blocking convenience methods, please just > remove the corresponding tests. Line 85: for (level = 1; level <= 4; level++) { You can not be sure all compilation levels are available. Use * @library /testlibrary /test/lib / * @build sun.hotspot.WhiteBox * compiler.testlibrary.CompilerUtils import compiler.testlibrary.CompilerUtils; int[] levels = CompilerUtils.getAvailableCompilationLevels(); for (int level : levels) { ... > > I think we're close to a final version now, what do you think :) Yes! I'll take a look as soon as you post an updated webrev. Regards, Nils > > Regards, > Volker > > > On Wed, Mar 2, 2016 at 2:37 PM, Nils Eliasson > > wrote: > > Yes, I forgot to add the fix for working with multiple directives > from whitebox. 
> > WB.addCompilerDirectives now returns the number of directives that > were added, and removeCompilerDirectives takes a parameter for > the number of directives that should be popped (atomically). > > http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.03/ > > http://cr.openjdk.java.net/~neliasso/8150646/webrev.05/ > > > Fixed test in JDK-8073793 to work with this: > http://cr.openjdk.java.net/~neliasso/8073793/webrev.03/ > > > Best regards, > Nils Eliasson > > > > On 2016-03-02 13:36, Nils Eliasson wrote: >> Hi Volker, >> >> I created these webrevs including all the feedback from everyone: >> >> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.02/ >> >> * Only add- and removeCompilerDirective >> >> http://cr.openjdk.java.net/~neliasso/8150646/webrev.04/ >> >> * whitebox.cpp >> -- addCompilerDirective to have correct VM states >> * advancedThresholdPolicy.cpp >> -- prevent blocking tasks from becoming stale >> -- The logic for picking first blocking task broke JVMCI code. >> Instead made the JVMCI code default (select the blocking task >> with highest score.) >> * compilerDirectives.hpp >> -- Remove option CompileCommand. Not needed. >> * compileBroker.cpp >> -- Wrapped compile_method so that directive get and release >> always are matched. >> >> Is anything missing? >> >> Best regards, >> Nils Eliasson >> >> >> On 2016-03-01 19:31, Volker Simonis wrote: >>> Hi Pavel, Nils, Vladimir, >>> >>> sorry, but I was busy the last days so I couldn't answer your mails. >>> >>> Thanks a lot for your input and your suggestions. I'll look into this >>> tomorrow and hopefully I'll be able to address all your concerns. >>> >>> Regards, >>> Volker >>> >>> >>> On Tue, Mar 1, 2016 at 6:24 PM, Vladimir Kozlov >>> wrote: >>>> Nils, please answer Pavel's questions. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> >>>> On 3/1/16 6:24 AM, Nils Eliasson wrote: >>>>> Hi Volker, >>>>> >>>>> An excellent proposition. This is how it should be used.
>>>>> >>>>> I polished a few rough edges: >>>>> * CompilerBroker.cpp - The directives were already accessed in >>>>> compile_method - but hidden in compilation_is_prohibited. I moved it out >>>>> so we only have a single directive access. Wrapped compile_method to >>>>> make sure the release of the directive doesn't get lost. >>>>> * Let WB_AddCompilerDirective return a bool for success. Also fixed the >>>>> state - need to be in native to get string, but then need to be in VM >>>>> when parsing directive. >>>>> >>>>> And some comments: >>>>> * I am against adding new compile option commands (At least until the >>>>> stringly typeness is fixed). Let's add good ways to use compiler >>>>> directives instead. >>>>> >>>>> I need to look at the stale task removal code tomorrow - hopefully we >>>>> could save the blocking info in the task so we don't need to access the >>>>> directive in the policy. >>>>> >>>>> All in here: >>>>> Webrev:http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ >>>>> >>>>> >>>>> The code runs fine with the test I fixed for JDK-8073793: >>>>> http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ >>>>> >>>>> >>>>> Best regards, >>>>> Nils Eliasson >>>>> >>>>> On 2016-02-26 19:47, Volker Simonis wrote: >>>>>> Hi, >>>>>> >>>>>> so I want to propose the following solution for this problem: >>>>>> >>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel >>>>>> >>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ >>>>>> >>>>>> >>>>>> I've started from the opposite side and made the BackgroundCompilation >>>>>> manageable through the compiler directives framework.
Once this works >>>>>> (and it's actually trivial due to the nice design of the >>>>>> CompilerDirectives framework :), we get the possibility to set the >>>>>> BackgroundCompilation option on a per method base on the command line >>>>>> via the CompileCommand option for free: >>>>>> >>>>>> >>>>>> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" >>>>>> >>>>>> >>>>>> And of course we can also use it directly as a compiler directive: >>>>>> >>>>>> [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] >>>>>> >>>>>> It also becomes possible to use this directly from the Whitebox API >>>>>> through the DiagnosticCommand.compilerDirectivesAdd command. >>>>>> Unfortunately, this command takes a file with compiler directives as >>>>>> argument. I think this would be overkill in this context. So because >>>>>> it was so easy and convenient, I added the following two new Whitebox >>>>>> methods: >>>>>> >>>>>> public native void addCompilerDirective(String compDirect); >>>>>> public native void removeCompilerDirective(); >>>>>> >>>>>> which can now be used to set arbitrary CompilerDirective command >>>>>> directly from within the WhiteBox API. (The implementation of these >>>>>> two methods is trivial as you can see in whitebox.cpp). >>>>>> v >>>>>> The blocking versions of enqueueMethodForCompilation() now become >>>>>> simple wrappers around the existing methods without the need of any >>>>>> code changes in their native implementation. This is good, because it >>>>>> keeps the WhiteBox API stable! >>>>>> >>>>>> Finally some words about the implementation of the per-method >>>>>> BackgroundCompilation functionality. It actually only requires two >>>>>> small changes: >>>>>> >>>>>> 1. extending CompileBroker::is_compile_blocking() to take the method >>>>>> and compilation level as arguments and use them to query the >>>>>> DirectivesStack for the corresponding BackgroundCompilation value. >>>>>> >>>>>> 2. 
changing AdvancedThresholdPolicy::select_task() such that it >>>>>> prefers blocking compilations. This is not only necessary because it >>>>>> decreases the time we have to wait for a blocking compilation, but >>>>>> also because it prevents blocking compiles from getting stale. This >>>>>> could otherwise easily happen in AdvancedThresholdPolicy::is_stale() >>>>>> for methods which only get artificially compiled during a test because >>>>>> their invocation counters are usually too small. >>>>>> >>>>>> There's still a small probability that a blocking compilation will >>>>>> not be blocking. This can happen if a method for which we request the >>>>>> blocking compilation is already in the compilation queue (see the >>>>>> check 'compilation_is_in_queue(method)' in >>>>>> CompileBroker::compile_method_base()). In testing scenarios this will >>>>>> rarely happen because methods which are manually compiled shouldn't >>>>>> get called that many times to implicitly place them into the compile >>>>>> queue. But we can even completely avoid this problem by using >>>>>> WB.isMethodQueuedForCompilation() to make sure that a method is not in >>>>>> the queue before we request a blocking compilation. >>>>>> >>>>>> I've also added a small regression test to demonstrate and verify the >>>>>> new functionality. >>>>>> >>>>>> Regards, >>>>>> Volker >>>>> On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson >>>>> wrote: >>>>>>> Hi Vladimir, >>>>>>> >>>>>>> WhiteBox::compilation_locked is a global state that temporarily stops >>>>>>> all >>>>>>> compilations. In this case I just want to achieve blocking compilation >>>>>>> for a >>>>>>> single compile without affecting the rest of the system. The tests >>>>>>> using it >>>>>>> will continue executing as soon as that compile is finished, saving time >>>>>>> where wait-loops are used today. It adds nice determinism to tests.
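[Editor's note: the select_task() behavior agreed on in this thread, pick the blocking task with the highest score and otherwise fall back to the overall highest-scoring task, can be sketched in isolation. The plain-Java model below is illustrative only; the real logic is HotSpot C++ code in AdvancedThresholdPolicy, and the Task class, scores, and names here are made-up stand-ins.]

```java
import java.util.List;

// Simplified model of the selection policy discussed above: a blocking
// task with the highest score wins; if no task is blocking, the
// highest-scoring task is chosen. Preferring blocking tasks shortens the
// wait for a blocking compile and keeps it from going stale in the queue.
public class SelectTaskSketch {
    static final class Task {
        final String name;
        final int score;
        final boolean blocking;
        Task(String name, int score, boolean blocking) {
            this.name = name;
            this.score = score;
            this.blocking = blocking;
        }
    }

    static Task selectTask(List<Task> queue) {
        Task best = null;
        Task bestBlocking = null;
        for (Task t : queue) {
            if (best == null || t.score > best.score) {
                best = t;
            }
            if (t.blocking && (bestBlocking == null || t.score > bestBlocking.score)) {
                bestBlocking = t;
            }
        }
        return bestBlocking != null ? bestBlocking : best;
    }

    public static void main(String[] args) {
        List<Task> queue = List.of(
            new Task("hotMethod", 100, false),   // high score, non-blocking
            new Task("testMethod", 5, true));    // low score, but blocking
        System.out.println(selectTask(queue).name); // prints "testMethod"
    }
}
```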
>>>>>>> >>>>>>> Best regards, >>>>>>> Nils Eliasson >>>>>>> >>>>>>> >>>>>>> On 2016-02-25 22:14, Vladimir Kozlov wrote: >>>>>>>> You are adding parameter which is used only for testing. >>>>>>>> Can we have callback(or check field) into WB instead? Similar to >>>>>>>> WhiteBox::compilation_locked. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Vladimir >>>>>>>> >>>>>>>> On 2/25/16 7:01 AM, Nils Eliasson wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> Please review this change that adds support for blocking compiles >>>>>>>>> in the >>>>>>>>> whitebox API. This enables simpler less time consuming tests. >>>>>>>>> >>>>>>>>> Motivation: >>>>>>>>> * -XX:-BackgroundCompilation is a global flag and can be time >>>>>>>>> consuming >>>>>>>>> * Blocking compiles removes the need for waiting on the compile >>>>>>>>> queue to >>>>>>>>> complete >>>>>>>>> * Compiles put in the queue may be evicted if the queue grows to big - >>>>>>>>> causing indeterminism in the test >>>>>>>>> * Less VM-flags allows for more tests in the same VM >>>>>>>>> >>>>>>>>> Testing: >>>>>>>>> Posting a separate RFR for test fix that uses this change. They >>>>>>>>> will be >>>>>>>>> pushed at the same time. >>>>>>>>> >>>>>>>>> RFE:https://bugs.openjdk.java.net/browse/JDK-8150646 >>>>>>>>> JDK rev:http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >>>>>>>>> Hotspot rev:http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >>>>>>>>> >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Nils Eliasson >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nils.eliasson at oracle.com Thu Mar 3 14:30:29 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Thu, 3 Mar 2016 15:30:29 +0100 Subject: RFR(S): 8066770: EnqueueMethodForCompilationTest.java fails to compile method Message-ID: <56D84A85.9060602@oracle.com> Hi, Please review, Bug: https://bugs.openjdk.java.net/browse/JDK-8066770 Webrev: http://cr.openjdk.java.net/~neliasso/8066770/webrev.02 Summary: These tests fail intermittently when a method that is supposed to be compiled is not. This does not reproduce easily but the cause is understood to be that sometimes pending compilations become stale and are evicted from the compilation queue. Solution: Use the support for blocking compiles in https://bugs.openjdk.java.net/browse/JDK-8150646 that also prevents blocking tasks from going stale. CompilerWhiteBoxTest.java - waitBackgroundCompilation changed to wait 100ms between checking isMethodQueuedForCompilation. Saves a lot of time when most compilations are ready much faster than a second. When the compiles are blocking this should not be an issue. I also reduced the LockCompilationTest to a single test case. Tests that extend CompilerWhiteboxTest run 6 different test cases. This test locks the compilations and waits 10 seconds before feeling reasonably sure that the test case never will be compiled. Doing this serially for 6 test cases consumes a lot of time. Since the compile queues treat all compiles equally this is an acceptable limitation. 
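[Editor's note: the waitBackgroundCompilation change described above, poll every 100 ms instead of once per second, is a generic pattern worth showing. The sketch below is a self-contained stand-in; the real helper lives in CompilerWhiteBoxTest.java and polls WB.isMethodQueuedForCompilation, which is not available outside a WhiteBox-enabled JVM.]

```java
import java.util.function.BooleanSupplier;

// Poll a condition at a short (100 ms) interval up to a deadline, so a
// test resumes almost as soon as the queued compilation finishes rather
// than waiting out a full one-second tick.
public class PollSketch {
    static boolean waitFor(BooleanSupplier done, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!done.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // timed out, condition never became true
            }
            try {
                Thread.sleep(100); // short interval saves time for fast compiles
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Condition becomes true after ~250 ms; the wait returns shortly after.
        System.out.println(waitFor(() -> System.currentTimeMillis() - start > 250, 5000));
    }
}
```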
Results: All related whitebox tests passes 3 min test time saved on a fast machine Regards, Nils Eliasson Reference: Test times before: elapsed time (seconds): 0.378 elapsed time (seconds): 0.389 elapsed time (seconds): 0.393 elapsed time (seconds): 0.394 elapsed time (seconds): 0.394 elapsed time (seconds): 0.395 elapsed time (seconds): 0.395 elapsed time (seconds): 0.396 elapsed time (seconds): 0.397 elapsed time (seconds): 0.398 elapsed time (seconds): 0.404 elapsed time (seconds): 0.405 elapsed time (seconds): 0.41 elapsed time (seconds): 0.411 elapsed time (seconds): 0.413 elapsed time (seconds): 0.415 elapsed time (seconds): 0.421 elapsed time (seconds): 0.594 elapsed time (seconds): 0.601 elapsed time (seconds): 0.628 elapsed time (seconds): 0.646 elapsed time (seconds): 0.651 elapsed time (seconds): 1.218 elapsed time (seconds): 1.219 elapsed time (seconds): 1.28 elapsed time (seconds): 2.294 elapsed time (seconds): 2.595 elapsed time (seconds): 2.871 elapsed time (seconds): 3.674 elapsed time (seconds): 3.692 elapsed time (seconds): 4.711 elapsed time (seconds): 4.809 elapsed time (seconds): 10.562 elapsed time (seconds): 10.964 elapsed time (seconds): 42.838 elapsed time (seconds): 99.492 Test times after: elapsed time (seconds): 0.344 elapsed time (seconds): 0.378 elapsed time (seconds): 0.388 elapsed time (seconds): 0.396 elapsed time (seconds): 0.396 elapsed time (seconds): 0.396 elapsed time (seconds): 0.401 elapsed time (seconds): 0.401 elapsed time (seconds): 0.402 elapsed time (seconds): 0.405 elapsed time (seconds): 0.406 elapsed time (seconds): 0.406 elapsed time (seconds): 0.406 elapsed time (seconds): 0.41 elapsed time (seconds): 0.413 elapsed time (seconds): 0.413 elapsed time (seconds): 0.429 elapsed time (seconds): 0.586 elapsed time (seconds): 0.609 elapsed time (seconds): 0.61 elapsed time (seconds): 0.643 elapsed time (seconds): 0.669 elapsed time (seconds): 0.698 elapsed time (seconds): 1.181 elapsed time (seconds): 1.232 elapsed time 
(seconds): 1.242 elapsed time (seconds): 1.341 elapsed time (seconds): 1.898 elapsed time (seconds): 2.515 elapsed time (seconds): 2.64 elapsed time (seconds): 2.707 elapsed time (seconds): 2.725 elapsed time (seconds): 2.843 elapsed time (seconds): 2.888 elapsed time (seconds): 2.936 elapsed time (seconds): 10.905 -------------- next part -------------- An HTML attachment was scrubbed... URL: From volker.simonis at gmail.com Thu Mar 3 15:25:17 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Thu, 3 Mar 2016 16:25:17 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: <56D83272.9010202@oracle.com> References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5A60A.50700@oracle.com> <56D5D066.7040805@oracle.com> <56D6DE47.7060405@oracle.com> <56D6EC92.7010309@oracle.com> <56D83272.9010202@oracle.com> Message-ID: Hi Nils, thanks for your comments. Please find my new webrev here: http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v4 http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v4 Comments as always inline: On Thu, Mar 3, 2016 at 1:47 PM, Nils Eliasson wrote: > Hi Volker, > > On 2016-03-02 17:36, Volker Simonis wrote: > > Hi Nils, > > your last webrev (jdk.03 and hotspot.05) looks pretty good! I've used it > as a base for my new webrevs at: > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v3 > http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v3 > > I've updated the copyrights, added the current reviewers and also added us > both in the Contributed-by line (hope that's fine for you). > > > Absolutely > > > Except that, I've only done the following minor fixes/changes: > > * compileBroker.{cpp,hpp}* > > - we don't need CompileBroker::is_compile_blocking() anymore.
> > Good > > > *compilerDirectives.hpp* > > - I think we should use > cflags(BackgroundCompilation, bool, BackgroundCompilation, > BackgroundCompilation) > instead of: > cflags(BackgroundCompilation, bool, BackgroundCompilation, X) > > so we can also trigger blocking compiles from the command line with a > CompileCommand (e.g. > -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false") > That's very handy during development or and also for simple tests where we > don't want to mess with compiler directives. (And the overhead to keep this > feature is quite small, just "BackgroundCompilation" instead of "X" ;-) > > > Without a very strong use case for this I don't want it as a > CompileCommand. > > CompileCommand options do have a cost - they force a temporary unique copy > of the directive if any option command matches negating some of the > positive effects of directives. Also the CompileCommands are stringly > typed, no compile time name or type check is done. This can be fixed in > various ways, but until then I prefer not to add it. > > Well, daily working is a strong use case for me:) Until there's no possibility to provide compiler directives directly on the command line instead of using an extra file, I think there's a justification for the CompileCommand version. Also I think the cost argument is not so relevent, because the feature will be mainly used during developemnt or in small tests which don't need compiler directives. It will actually make it possible to write simple tests which require blocking compilations without the need to use compiler directives or WB (just by specifying a -XX:CompileCommand option). > > *whitebox.cpp* > > I think it is good that you fixed the state but I think it is too > complicated now. 
We don't need to strdup the string and can easily forget > to free 'tmpstr' :) So maybe it is simpler to just do another transition > for parsing the directive: > > { > ThreadInVMfromNative ttvfn(thread); // back to VM > DirectivesParser::parse_string(dir, tty); > } > > > Transitions are not free, but on the other hand the string may be long. > This is not a hot path in any way, so let's go with simple. > > > > *advancedThresholdPolicy.cpp * > - the JVMCI code looks reasonable (although I haven't tested JVMCI) and is > actually even an improvement over my code which just picked the first > blocking compilation. > > > Feels good to remove the special cases. > > > > > *diagnosticCommand.cpp *- Shouldn't you also fix > CompilerDirectivesAddDCmd to return the number of added directives and > CompilerDirectivesRemoveDCmd to take the number of directives you want to > pop? Or do you want to do this in a later, follow-up change? > > > Yes, let's do that in a follow-up change. They affect a number of tests. > > > *WhiteBox.java* > > - I still think it would make sense to keep the two 'blocking' versions > of enqueueMethodForCompilation() for convenience. For example your test > fix for JDK-8073793 would be much simpler if you used them. I've added two > comments to the 'blocking' convenience methods to mention the fact that > calling them may shadow previously added compiler directives. > > > I am ok with having them, but think Whitebox.java will get too bloated. I > would rather have the convenience-methods in some test utility class, like > CompilerWhiteBoxTest.java. > > OK, I can live with that. I removed the blocking enqueue methods and the corresponding tests. > > *BlockingCompilation.java* > > - I've extended my regression test to test both methods of doing blocking > compilation - with the new, 'blocking' enqueueMethodForCompilation() > methods as well as by manually setting > the corresponding compiler > directives.
If we should finally get consensus on removing the blocking > convenience methods, please just remove the corresponding tests. > > > Line 85: for (level = 1; level <= 4; level++) { > > You can not be sure all compilation levels are available. Use > > * @library /testlibrary /test/lib / > * @build sun.hotspot.WhiteBox > * compiler.testlibrary.CompilerUtils > > import compiler.testlibrary.CompilerUtils; > > int[] levels = CompilerUtils.getAvailableCompilationLevels(); > for (int level : levels) { > ... > Good catch. I've slightly reworked the test. I do bail out early if there are no compilers at all and I've also fixed the break condition of the loop which is calling foo() to compare against the highest available compilation level instead of just using '4'. > > I think we're close to a final version now, what do you think :) > > > Yes! I'll take a look as soon as you post an updated webrev. > Would be good if you could run it through JPRT once so we can be sure we didn't break anything. Regards, Volker > > Regards, > Nils > > > > Regards, > Volker > > > On Wed, Mar 2, 2016 at 2:37 PM, Nils Eliasson > wrote: > >> Yes, I forgot to add the fix for working with multiple directives from >> whitebox. >> >> WB.addCompilerDirectives now returns the number of directives that were >> added, and removeCompilerDirectives takes a parameter for the number of >> directives that should be popped (atomically).
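[Editor's note: the add/remove pairing described above, where adding directives reports how many were pushed and removing pops that many in one step, can be modeled with a plain stack. The class and method names below are illustrative stand-ins, not the actual WB.addCompilerDirective/removeCompilerDirective implementations in the webrevs.]

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A directives "file" may contain several directives, so add() returns
// how many it pushed and remove(n) pops exactly that many. Both methods
// are synchronized so each pop-of-n happens as one atomic step.
public class DirectivesStackSketch {
    private final Deque<String> stack = new ArrayDeque<>();

    synchronized int add(String... directives) {
        for (String d : directives) {
            stack.push(d);
        }
        return directives.length; // caller remembers this for the matching remove
    }

    synchronized void remove(int n) {
        for (int i = 0; i < n && !stack.isEmpty(); i++) {
            stack.pop();
        }
    }

    synchronized int size() {
        return stack.size();
    }

    public static void main(String[] args) {
        DirectivesStackSketch wb = new DirectivesStackSketch();
        int added = wb.add("{ match: \"*::*\" }",
                           "{ match: \"java.lang.String::charAt\" }");
        wb.remove(added); // pops exactly the directives this add pushed
        System.out.println(added + " " + wb.size()); // prints "2 0"
    }
}
```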
>> >> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.03/ >> http://cr.openjdk.java.net/~neliasso/8150646/webrev.05/ >> >> Fixed test in JDK-8073793 to work with this: >> >> http://cr.openjdk.java.net/~neliasso/8073793/webrev.03/ >> >> Best regards, >> Nils Eliasson >> >> >> >> On 2016-03-02 13:36, Nils Eliasson wrote: >> >> Hi Volker, >> >> I created these webrevs including all the feedback from everyone: >> >> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.02/ >> * Only add- and removeCompilerDirective >> >> http://cr.openjdk.java.net/~neliasso/8150646/webrev.04/ >> * whitebox.cpp >> -- addCompilerDirective to have correct VM states >> * advancedThresholdPolicy.cpp >> -- prevent blocking tasks from becoming stale >> -- The logic for picking first blocking task broke JVMCI code. Instead >> made the JVMCI code default (select the blocking task with highest score.) >> * compilerDirectives.hpp >> -- Remove option CompileCommand. Not needed. >> * compileBroker.cpp >> -- Wrapped compile_method so that directive get and release always are >> matched. >> >> Is anything missing? >> >> Best regards, >> Nils Eliasson >> >> >> On 2016-03-01 19:31, Volker Simonis wrote: >> >> Hi Pavel, Nils, Vladimir, >> >> sorry, but I was busy the last days so I couldn't answer your mails. >> >> Thanks a lot for your input and your suggestions. I'll look into this >> tomorrow and hopefully I'll be able to address all your concerns. >> >> Regards, >> Volker >> >> >> On Tue, Mar 1, 2016 at 6:24 PM, Vladimir Kozlov wrote: >> >> Nils, please answer Pavel's questions. >> >> Thanks, >> Vladimir >> >> >> On 3/1/16 6:24 AM, Nils Eliasson wrote: >> >> Hi Volker, >> >> An excellent proposition. This is how it should be used. >> >> I polished a few rough edges: >> * CompilerBroker.cpp - The directives was already access in >> compile_method - but hidden incompilation_is_prohibited. I moved it out >> so we only have a single directive access. 
Wrapped compile_method to >> make sure the release of the directive doesn't get lost. >> * Let WB_AddCompilerDirective return a bool for success. Also fixed the >> state - need to be in native to get string, but then need to be in VM >> when parsing directive. >> >> And some comments: >> * I am against adding new compile option commands (At least until the >> stringly typeness is fixed). Lets add good ways too use compiler >> directives instead. >> >> I need to look at the stale task removal code tomorrow - hopefully we >> could save the blocking info in the task so we don't need to access the >> directive in the policy. >> >> All in here: >> Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ >> >> The code runs fine with the test I fixed for JDK-8073793:http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ >> >> Best regards, >> Nils Eliasson >> >> On 2016-02-26 19:47, Volker Simonis wrote: >> >> Hi, >> >> so I want to propose the following solution for this problem: >> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevelhttp://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ >> >> I've started from the opposite site and made the BackgroundCompilation >> manageable through the compiler directives framework. Once this works >> (and it's actually trivial due to the nice design of the >> CompilerDirectives framework :), we get the possibility to set the >> BackgroundCompilation option on a per method base on the command line >> via the CompileCommand option for free: >> >> >> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" >> >> >> And of course we can also use it directly as a compiler directive: >> >> [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] >> >> It also becomes possible to use this directly from the Whitebox API >> through the DiagnosticCommand.compilerDirectivesAdd command. >> Unfortunately, this command takes a file with compiler directives as >> argument. 
I think this would be overkill in this context. So because >> it was so easy and convenient, I added the following two new Whitebox >> methods: >> >> public native void addCompilerDirective(String compDirect); >> public native void removeCompilerDirective(); >> >> which can now be used to set arbitrary CompilerDirective command >> directly from within the WhiteBox API. (The implementation of these >> two methods is trivial as you can see in whitebox.cpp). >> v >> The blocking versions of enqueueMethodForCompilation() now become >> simple wrappers around the existing methods without the need of any >> code changes in their native implementation. This is good, because it >> keeps the WhiteBox API stable! >> >> Finally some words about the implementation of the per-method >> BackgroundCompilation functionality. It actually only requires two >> small changes: >> >> 1. extending CompileBroker::is_compile_blocking() to take the method >> and compilation level as arguments and use them to query the >> DirectivesStack for the corresponding BackgroundCompilation value. >> >> 2. changing AdvancedThresholdPolicy::select_task() such that it >> prefers blocking compilations. This is not only necessary, because it >> decreases the time we have to wait for a blocking compilation, but >> also because it prevents blocking compiles from getting stale. This >> could otherwise easily happen in AdvancedThresholdPolicy::is_stale() >> for methods which only get artificially compiled during a test because >> their invocations counters are usually too small. >> >> There's still a small probability that a blocking compilation will be >> not blocking. This can happen if a method for which we request the >> blocking compilation is already in the compilation queue (see the >> check 'compilation_is_in_queue(method)' in >> CompileBroker::compile_method_base()). 
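[Editor's note: the race described above, where a "blocking" request degrades to non-blocking because the method is already queued, and the suggested WB.isMethodQueuedForCompilation guard, can be sketched with a plain set standing in for the compile queue. Names here are hypothetical; the real checks are compilation_is_in_queue() in the VM and WB.isMethodQueuedForCompilation in tests.]

```java
import java.util.HashSet;
import java.util.Set;

// Model of the enqueue race: if the method is already in the queue, the
// new "blocking" request is dropped and the caller would not actually
// block. Checking the queue first avoids relying on a non-blocking path.
public class BlockingEnqueueSketch {
    private final Set<String> queue = new HashSet<>();

    // true only if this call enqueued the method, i.e. only then is the
    // caller guaranteed that its request is the one being compiled
    boolean enqueueBlocking(String method) {
        return queue.add(method); // Set.add is false if already present
    }

    boolean isMethodQueuedForCompilation(String method) {
        return queue.contains(method);
    }

    public static void main(String[] args) {
        BlockingEnqueueSketch wb = new BlockingEnqueueSketch();
        // Guard as suggested above: only request a blocking compile when
        // the method is not already sitting in the queue.
        if (!wb.isMethodQueuedForCompilation("foo")) {
            System.out.println(wb.enqueueBlocking("foo")); // prints "true"
        }
        System.out.println(wb.enqueueBlocking("foo"));     // prints "false"
    }
}
```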
In testing scenarios this will >> rarely happen because methods which are manually compiled shouldn't >> get called that many times to implicitly place them into the compile >> queue. But we can even completely avoid this problem by using >> WB.isMethodQueuedForCompilation() to make sure that a method is not in >> the queue before we request a blocking compilation. >> >> I've also added a small regression test to demonstrate and verify the >> new functionality. >> >> Regards, >> Volker >> >> On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson wrote: >> >> Hi Vladimir, >> >> WhiteBox::compilation_locked is a global state that temporarily stops >> all >> compilations. I this case I just want to achieve blocking compilation >> for a >> single compile without affecting the rest of the system. The tests >> using it >> will continue executing as soon as that compile is finished, saving time >> where wait-loops is used today. It adds nice determinism to tests. >> >> Best regards, >> Nils Eliasson >> >> >> On 2016-02-25 22:14, Vladimir Kozlov wrote: >> >> You are adding parameter which is used only for testing. >> Can we have callback(or check field) into WB instead? Similar to >> WhiteBox::compilation_locked. >> >> Thanks, >> Vladimir >> >> On 2/25/16 7:01 AM, Nils Eliasson wrote: >> >> Hi, >> >> Please review this change that adds support for blocking compiles >> in the >> whitebox API. This enables simpler less time consuming tests. >> >> Motivation: >> * -XX:-BackgroundCompilation is a global flag and can be time >> consuming >> * Blocking compiles removes the need for waiting on the compile >> queue to >> complete >> * Compiles put in the queue may be evicted if the queue grows to big - >> causing indeterminism in the test >> * Less VM-flags allows for more tests in the same VM >> >> Testing: >> Posting a separate RFR for test fix that uses this change. They >> will be >> pushed at the same time. 
>> >> RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 >> JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >> Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >> >> Best regards, >> Nils Eliasson >> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zoltan.majo at oracle.com Thu Mar 3 15:40:22 2016 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Thu, 3 Mar 2016 16:40:22 +0100 Subject: [9] RFR (S): 8150839: Adjust the number of compiler threads for 32-bit platforms Message-ID: <56D85AE6.6070100@oracle.com> Hi, please review the patch for 8150839. https://bugs.openjdk.java.net/browse/JDK-8150839 Problem: If the VM is executed on a machine with a large number of cores, it will create a large number of compiler threads. For example, on a 24-core machine, the VM will create 12 compiler threads (8 C2 compiler threads + 4 C1 compiler threads). On 32-bit platforms the virtual memory available to processes is typically limited to 2-4GB. As a result, the VM is likely to exhaust the virtual memory address space on these platforms and crash. Solution: This patch proposes to set the number of compiler threads to 3 on 32-bit platforms (2 C2 threads and 1 C1 thread), unless the user decides differently. On 64-bit platforms, the number of compiler threads is still set according to the number of available cores. Webrev: http://cr.openjdk.java.net/~zmajo/8150839/webrev.00/ Testing: JPRT. Thank you! Best regards, Zoltan From igor.veresov at oracle.com Thu Mar 3 16:28:52 2016 From: igor.veresov at oracle.com (Igor Veresov) Date: Thu, 3 Mar 2016 08:28:52 -0800 Subject: [9] RFR (S): 8150839: Adjust the number of compiler threads for 32-bit platforms In-Reply-To: <56D85AE6.6070100@oracle.com> References: <56D85AE6.6070100@oracle.com> Message-ID: <17F00F67-A971-4F3E-AA75-17BC6D713D4E@oracle.com> Seems reasonable. Reviewed. igor > On Mar 3, 2016, at 7:40 AM, Zoltán Majó
wrote: > > Hi, > > > please review the patch for 8150839. > > https://bugs.openjdk.java.net/browse/JDK-8150839 > > Problem: If the VM is executed on a machine with a large number of cores, it will create a large number of compiler threads. For example, on a 24-core machine, the VM will create 12 compiler threads (8 C2 compiler threads + 4 C1 compiler threads). > > On 32-bit platforms the virtual memory available to processes is typically limited to 2-4GB. As a result, the VM is likely to exhaust the virtual memory address space on these platforms and crash. > > > Solution: This patch proposes to set the number of compiler threads to 3 on 32-bit platforms (2 C2 threads and 1 C1 thread), unless the user decides differently. On 64-bit platforms, the number of compiler threads is still set according to the number of available cores. > > Webrev: > http://cr.openjdk.java.net/~zmajo/8150839/webrev.00/ > > Testing: JPRT. > > Thank you! > > Best regards, > > > Zoltan > From vladimir.kozlov at oracle.com Thu Mar 3 17:56:10 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 3 Mar 2016 09:56:10 -0800 Subject: RFR(S): 8066770: EnqueueMethodForCompilationTest.java fails to compile method In-Reply-To: <56D84A85.9060602@oracle.com> References: <56D84A85.9060602@oracle.com> Message-ID: <56D87ABA.4000302@oracle.com> Looks good. Nice improvement! Thanks, Vladimir On 3/3/16 6:30 AM, Nils Eliasson wrote: > Hi, > > Please review, > > Bug: https://bugs.openjdk.java.net/browse/JDK-8066770 > Webrev: http://cr.openjdk.java.net/~neliasso/8066770/webrev.02 > > Summary: > These tests fail intermittently when a method that is supposed to be compiled is not. This does not reproduce easily but > the cause is understood to be that sometimes pending compilations become stale and are evicted from the compilation queue. > > Solution: > Use the support for blocking compiles in https://bugs.openjdk.java.net/browse/JDK-8150646 that also prevents blocking > tasks from going stale. 
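[Editor's note: the compiler-thread sizing rule from the 8150839 review above, pin the default to 3 threads (2 C2 + 1 C1) on 32-bit to conserve the 2-4GB address space and scale with cores on 64-bit, can be sketched as follows. The cores/2 formula is a stand-in chosen to match the 24-core example in the review, not HotSpot's exact heuristic.]

```java
// Default compiler thread count as described in the 8150839 review:
// fixed at 3 on 32-bit platforms, scaled with the core count on 64-bit.
public class CompilerThreadsSketch {
    static int defaultCompilerThreads(boolean is64bit, int cores) {
        if (!is64bit) {
            return 3;                  // 2 C2 threads + 1 C1 thread
        }
        return Math.max(3, cores / 2); // stand-in scaling with available cores
    }

    public static void main(String[] args) {
        System.out.println(defaultCompilerThreads(false, 24)); // prints "3"
        System.out.println(defaultCompilerThreads(true, 24));  // prints "12"
    }
}
```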
> > CompilerWhiteBoxTest.java - waitBackgroundCompilation changed to wait 100ms between checking > isMethodQueuedForCompilation. Saves a lot of time when most compilations are ready much faster than a second. When the > compiles are blocking this should not be an issue. > > I also reduced the LockCompilationTest to a single test case. Tests that extend CompilerWhiteboxTest run 6 different > test cases. This test locks the compilations and waits 10 seconds before feeling reasonably sure that the test case > never will be compiled. Doing this serially for 6 test cases consumes a lot of time. Since the compile queues treat all > compiles equally this is an acceptable limitation. > > Results: > All related whitebox tests passes > 3 min test time saved on a fast machine > > Regards, > Nils Eliasson > > > > Reference: > Test times before: > > elapsed time (seconds): 0.378 > elapsed time (seconds): 0.389 > elapsed time (seconds): 0.393 > elapsed time (seconds): 0.394 > elapsed time (seconds): 0.394 > elapsed time (seconds): 0.395 > elapsed time (seconds): 0.395 > elapsed time (seconds): 0.396 > elapsed time (seconds): 0.397 > elapsed time (seconds): 0.398 > elapsed time (seconds): 0.404 > elapsed time (seconds): 0.405 > elapsed time (seconds): 0.41 > elapsed time (seconds): 0.411 > elapsed time (seconds): 0.413 > elapsed time (seconds): 0.415 > elapsed time (seconds): 0.421 > elapsed time (seconds): 0.594 > elapsed time (seconds): 0.601 > elapsed time (seconds): 0.628 > elapsed time (seconds): 0.646 > elapsed time (seconds): 0.651 > elapsed time (seconds): 1.218 > elapsed time (seconds): 1.219 > elapsed time (seconds): 1.28 > elapsed time (seconds): 2.294 > elapsed time (seconds): 2.595 > elapsed time (seconds): 2.871 > elapsed time (seconds): 3.674 > elapsed time (seconds): 3.692 > elapsed time (seconds): 4.711 > elapsed time (seconds): 4.809 > elapsed time (seconds): 10.562 > elapsed time (seconds): 10.964 > elapsed time (seconds): 42.838 > elapsed time (seconds): 
99.492 > > Test times after: > elapsed time (seconds): 0.344 > elapsed time (seconds): 0.378 > elapsed time (seconds): 0.388 > elapsed time (seconds): 0.396 > elapsed time (seconds): 0.396 > elapsed time (seconds): 0.396 > elapsed time (seconds): 0.401 > elapsed time (seconds): 0.401 > elapsed time (seconds): 0.402 > elapsed time (seconds): 0.405 > elapsed time (seconds): 0.406 > elapsed time (seconds): 0.406 > elapsed time (seconds): 0.406 > elapsed time (seconds): 0.41 > elapsed time (seconds): 0.413 > elapsed time (seconds): 0.413 > elapsed time (seconds): 0.429 > elapsed time (seconds): 0.586 > elapsed time (seconds): 0.609 > elapsed time (seconds): 0.61 > elapsed time (seconds): 0.643 > elapsed time (seconds): 0.669 > elapsed time (seconds): 0.698 > elapsed time (seconds): 1.181 > elapsed time (seconds): 1.232 > elapsed time (seconds): 1.242 > elapsed time (seconds): 1.341 > elapsed time (seconds): 1.898 > elapsed time (seconds): 2.515 > elapsed time (seconds): 2.64 > elapsed time (seconds): 2.707 > elapsed time (seconds): 2.725 > elapsed time (seconds): 2.843 > elapsed time (seconds): 2.888 > elapsed time (seconds): 2.936 > elapsed time (seconds): 10.905 > From vladimir.x.ivanov at oracle.com Thu Mar 3 21:16:23 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 4 Mar 2016 00:16:23 +0300 Subject: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions In-Reply-To: References: <56D63301.9050909@oracle.com> <56D697E4.8060104@oracle.com> <56D75750.6020400@oracle.com> <56D77113.30906@oracle.com> <56D777DC.70708@oracle.com> Message-ID: <56D8A9A7.30606@oracle.com> On 3/3/16 7:08 AM, Berg, Michael C wrote: > Vladimir (K), just for the time being as the problem isn't just confined to these instructions (the nds issue). I have assigned the bug below to myself and will take a holistic view over the issue in its full context. 
> > The instructions modified in the webrev, like in the documentation that exists regarding their definitions, are all programmable via what is loosely labeled as the imm8 field in the formal documentation. I think we should leave them that way. The onus of these changes was to make instructions look more like their ISA manual definitions. I think Vladimir Ivanov was saying, and please chime in Vladimir if I do not interpret correctly, wasn't high/low, it was leaving a signature like what we had in place in the macro assembler, and invoking the precise names there. I don't think that is needed though, as the macro assembler's job is to interpret a meaning and do a mapping. I'm all for the proposed change in Assembler. My point is that vmovdqu/vinserti128h/vextracti128h(...) are more informative than vinserti128(...,0/1) & vextracti128(..., 0/1) in the code. So, keeping original functions in the MacroAssembler, but migrating them to new Assembler versions looks reasonable. But I can live with both variants. Best regards, Vladimir Ivanov > > Regards, > Michael > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Wednesday, March 02, 2016 3:32 PM > To: hotspot-compiler-dev at openjdk.java.net > Cc: Berg, Michael C > Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions > > On 3/2/16 3:02 PM, Mikael Vidstedt wrote: >> >> After discussing with Vladimir off-list we agreed that changing the >> type > > It was Vladimir Ivanov. > >> of the immediate (imm8) argument to uint8_t is both clearer, has the >> potential to catch incorrect uses of the functions, and also makes the >> asserts more straightforward. In addition to that Vladimir noted that >> I had accidentally included a newline in the assert messages. >> >> New webrev: >> >> Full: >> >> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02/webrev/ > > I agree with Vladimir I. 
that we should have macroassembler instructions vinserti128high, vinserti128low, etc. instead of passing imm8. It is more informative. > > Also why we add new nds->is_valid() checks into assembler instructions? > We are going to remove them: > > https://bugs.openjdk.java.net/browse/JDK-8151003 > > I know that Mikael had a discussion about this with Michael. So I would like to see arguments here. Michael? > > Current code always passes correct registers and x86 Manual requires to have valid registers. > > Thanks, > Vladimir > >> >> Incremental from webrev.01: >> >> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02.incr/webrev/ >> >> Cheers, >> Mikael >> >> On 2016-03-02 13:12, Mikael Vidstedt wrote: >>> >>> Updated webrev: >>> >>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01/webrev/ >>> >>> Incremental from webrev.00: >>> >>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01.incr/webrev/ >>> >>> Comments below... >>> >>> On 2016-03-01 23:36, Vladimir Ivanov wrote: >>>> Nice cleanup, Mikael! >>>> >>>> src/cpu/x86/vm/assembler_x86.hpp: >>>> >>>> Outdated comments: >>>> // Copy low 128bit into high 128bit of YMM registers. >>>> >>>> // Load/store high 128bit of YMM registers which does not >>>> destroy other half. >>>> >>>> // Copy low 256bit into high 256bit of ZMM registers. >>> >>> Updated, thanks for catching! >>> >>>> src/cpu/x86/vm/assembler_x86.cpp: >>>> >>>> ! emit_int8(imm8 & 0x01); >>>> >>>> Maybe additionally assert valid imm8 range? >>> >>> Good idea, I had added asserts earlier but removed them. I added them >>> back again! >>> >>>> Maybe keep vinsert*h variants and move them to MacroAssembler? They >>>> look clearer in some contexts: >>>> >>>> - __ vextractf128h(Address(rsp, base_addr+n*16), >>>> as_XMMRegister(n)); >>>> + __ vextractf128(Address(rsp, base_addr+n*16), >>>> as_XMMRegister(n), 1); >>> >>> Can I suggest that we try to live without them for a while and see >>> how much we miss them? 
I think having it there may actually be more >>> confusing in many cases :) >>> >>> Cheers, >>> Mikael >>> >>>> >>>> Otherwise, looks good. >>>> >>>> Best regards, >>>> Vladimir Ivanov >>>> >>>> On 3/2/16 3:25 AM, Mikael Vidstedt wrote: >>>>> >>>>> Please review the following change which updates the various >>>>> vextract* and vinsert* methods in assembler_x86 & >>>>> macroAssembler_x86 to better match the real HW instructions, which >>>>> also has the benefit of providing the full >>>>> functionality/flexibility of the instructions where earlier only >>>>> some specific modes were supported. Put differently, with this >>>>> change it's much easier to correlate the methods to the Intel >>>>> manual and understand what they actually do. >>>>> >>>>> Specifically, the vinsert* family of instructions take three >>>>> registers and an immediate which decide how the bits should be >>>>> shuffled around, but without this change the method only allowed >>>>> two of the registers to be specified, and the immediate was hard-coded to 0x01. >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8151002 >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.00/webrev >>>>> / >>>>> >>>>> Special thanks to Mike Berg for helping discuss, co-develop, and >>>>> test the change! 
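[Editorial sketch] For readers following the 8151002 review: the imm8 operand of VINSERTI128/VEXTRACTI128 selects which 128-bit lane of a 256-bit register is written or read, which is why the old wrappers with imm8 hard-coded to 1 could only touch the high lane. A plain C++ model of that semantics (an illustration, not HotSpot or real SIMD code):

```cpp
#include <array>
#include <cstdint>

using Lane128 = std::array<uint64_t, 2>;  // one 128-bit lane as 2x64 bits
using Ymm     = std::array<Lane128, 2>;   // [0] = low lane, [1] = high lane

// vinserti128 dst, src1, src2, imm8: the result is src1 with the lane
// selected by bit 0 of imm8 replaced by src2; the other lane is kept.
Ymm vinserti128(Ymm src1, Lane128 src2, uint8_t imm8) {
  src1[imm8 & 0x01] = src2;
  return src1;
}

// vextracti128 dst, src, imm8: reads the lane selected by bit 0 of imm8.
Lane128 vextracti128(const Ymm& src, uint8_t imm8) {
  return src[imm8 & 0x01];
}
```

Exposing the immediate makes both lanes reachable and lets the methods be correlated line-by-line with the Intel manual, which is exactly the motivation stated in the review request.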
>>>>> >>>>> Cheers, >>>>> Mikael >>>>> >>> >> From vivek.r.deshpande at intel.com Thu Mar 3 22:43:50 2016 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Thu, 3 Mar 2016 22:43:50 +0000 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <56D76597.40205@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> <56D4ED29.1050108@oracle.com> <3A0386B6-8072-4D84-8AF7-01D904DAEADF@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A3AF1F@ORSMSX106.amr.corp.intel.com> <32DA77B0-7683-4CE9-9ED3-8461B70E1E19@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A3BE9C@ORSMSX106.amr.corp.intel.com> <126788E1-352A-413A-AA28-5E38064E9DBA@oracle.com> <56D76597.40205@oracle.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A56A3CF99@ORSMSX106.amr.corp.intel.com> Hi Vladimir, Christian I have updated the code with #ifdef. Please find the updated webrev here: http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.03/ Regards, Vivek -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Wednesday, March 02, 2016 2:14 PM To: Christian Thalinger; Deshpande, Vivek R Cc: hotspot compiler; Rukmannagari, Shravya Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling On 3/2/16 2:04 PM, Christian Thalinger wrote: > >> On Mar 2, 2016, at 11:49 AM, Deshpande, Vivek R >> > wrote: >> >> Hi Christian >> >> We could combine the declaration of the functions such as >> fast_sha256() for 32 bit and 64 bit in macroAssembler_x86.hpp using COMMA. >> We used COMMA to separate more arguments used in 64 bit using >> LP64_ONLY macro. > > Ugh. 
This is ugly: > > voidfast_pow(XMMRegisterxmm0, XMMRegisterxmm1, XMMRegisterxmm2, > XMMRegisterxmm3, XMMRegisterxmm4, XMMRegisterxmm5, XMMRegisterxmm6, > XMMRegisterxmm7, Registerrax, Registerrcx, Register rdx NOT_LP64(COMMA > Register tmp) LP64_ONLY(COMMA Register tmp1) > LP64_ONLY(COMMA Register tmp2) LP64_ONLY(COMMA > Register > tmp3) LP64_ONLY(COMMA Register tmp4)); > > Can?t we use #ifdef instead? +1 for #ifdef It is not first time we scope additional parameters with #ifdef. Vladimir > >> >> Regards, >> Vivek >> >> -----Original Message----- >> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >> Sent: Wednesday, March 02, 2016 9:30 AM >> To: Deshpande, Vivek R >> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >> >> #define COMMA , >> >> Why do we have a define like this? It came in with 8139575. >> >>> On Mar 1, 2016, at 3:24 PM, Deshpande, Vivek R >>> > wrote: >>> >>> Hi Vladimir, Christian >>> >>> I have updated the code according your suggestion of file name change. >>> The updated webrev is at this location: >>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.02/ >>> Please let me know if I have to do anything more. >>> Also is there any change required to Makefile so that >>> configurations.xml has name of the added file ? >>> >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>> Sent: Monday, February 29, 2016 5:59 PM >>> To: Vladimir Kozlov >>> Cc: Deshpande, Vivek R; hotspot compiler; Rukmannagari, Shravya >>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>> enabling >>> >>> >>>> On Feb 29, 2016, at 3:15 PM, Vladimir Kozlov >>>> wrote: >>>> >>>> I am against to have "intel" in a file name. We have >>>> macroAssembler_libm_x86_*.cpp files for math intrinsics which does >>>> not have intel in name. So I prefer to not have it. 
I would suggest >>>> macroAssembler_sha_x86.cpp. >>> >>> I know we already have macroAssembler_libm_x86_*.cpp but >>> macroAssembler_x86_.cpp would be better. >>> >>>> You can manipulate when to use it in vm_version_x86.cpp. >>>> >>>> Intel Copyright in the file's header is fine. >>>> >>>> Code changes are fine now (webrev.01). >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 2/29/16 4:42 PM, Christian Thalinger wrote: >>>>> >>>>>> On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R >>>>>> wrote: >>>>>> >>>>>> Hi Christian >>>>>> >>>>>> We used the SHA Extension >>>>>> implementations(https://software.intel.com/en-us/articles/intel-s >>>>>> ha-extensions-implementations) for the JVM implementation of SHA1 >>>>>> and SHA256. >>>>> >>>>> Will that extension only be available on Intel chips? >>>>> >>>>>> It needed to have Intel copyright, so created a separate file. >>>>> >>>>> That is reasonable. >>>>> >>>>>> The white paper for the implementation is >>>>>> https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. >>>>>> >>>>>> Regards, >>>>>> Vivek >>>>>> >>>>>> -----Original Message----- >>>>>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>>>>> Sent: Monday, February 29, 2016 1:58 PM >>>>>> To: Deshpande, Vivek R >>>>>> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >>>>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>>>> enabling >>>>>> >>>>>> Why is the new file called macroAssembler_intel_x86.cpp? >>>>>> >>>>>>> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R >>>>>>> wrote: >>>>>>> >>>>>>> HI Vladimir >>>>>>> >>>>>>> Thank you for your review. >>>>>>> I have updated the patch with the changes you have suggested. 
>>>>>>> The new webrev is at this location: >>>>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >>>>>>> >>>>>>> Regards >>>>>>> Vivek >>>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>>>>> Sent: Friday, February 26, 2016 6:50 PM >>>>>>> To: Deshpande, Vivek R; hotspot compiler >>>>>>> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >>>>>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>>>>> enabling >>>>>>> >>>>>>> Very nice, Vivek!!! >>>>>>> >>>>>>> Did you run tests with both 32- and 64-bit VMs? >>>>>>> >>>>>>> Small notes: >>>>>>> >>>>>>> In vm_version_x86.hpp spacing are not aligned in next line: >>>>>>> >>>>>>> static bool supports_avxonly() { return ((supports_avx2() || >>>>>>> supports_avx()) && !supports_evex()); } >>>>>>> + static bool supports_sha() { return (_features & CPU_SHA) >>>>>>> != 0; } >>>>>>> >>>>>>> Flags setting code in vm_version_x86.cpp should be like this >>>>>>> (you can check supports_sha() only once, don't split '} else {' >>>>>>> line, set UseSHA false if all intrinsics flags are false (I >>>>>>> included UseSHA512Intrinsics for future) ): >>>>>>> >>>>>>> if (supports_sha()) { >>>>>>> if (FLAG_IS_DEFAULT(UseSHA)) { >>>>>>> UseSHA = true; >>>>>>> } >>>>>>> } else if (UseSHA) { >>>>>>> warning("SHA instructions are not available on this CPU"); >>>>>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>>>>> >>>>>>> if (UseSHA) { >>>>>>> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >>>>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); } } else if >>>>>>> (UseSHA1Intrinsics) { warning("Intrinsics for SHA-1 crypto hash >>>>>>> functions not available on this CPU."); >>>>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); } >>>>>>> >>>>>>> if (UseSHA) { >>>>>>> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >>>>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); } } else if >>>>>>> (UseSHA256Intrinsics) { warning("Intrinsics for SHA-224 and >>>>>>> SHA-256 crypto hash 
functions not available on this CPU."); >>>>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); } >>>>>>> >>>>>>> if (UseSHA512Intrinsics) { >>>>>>> warning("Intrinsics for SHA-384 and SHA-512 crypto hash >>>>>>> functions not available on this CPU."); >>>>>>> FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); } >>>>>>> >>>>>>> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || >>>>>>> UseSHA512Intrinsics)) { >>>>>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>>>>>>> Hi all >>>>>>>> >>>>>>>> I would like to contribute a patch which optimizesSHA-1 >>>>>>>> andSHA-256 for >>>>>>>> 64 and 32 bitX86architecture using Intel SHA extensions. >>>>>>>> >>>>>>>> Could you please review and sponsor this patch. >>>>>>>> >>>>>>>> Bug-id: >>>>>>>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>>>>>>> webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>>>>>>> >>>>>>>> Thanks and regards, >>>>>>>> >>>>>>>> Vivek >>>>>>>> >>>>>> >>>>> >>> >> > From christian.thalinger at oracle.com Thu Mar 3 22:54:47 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 3 Mar 2016 12:54:47 -1000 Subject: RFR (M): 8150767: Update for x86 SHA Extensions enabling In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A56A3CF99@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A56A36FCA@ORSMSX106.amr.corp.intel.com> <56D10EEE.4040604@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A38CB7@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A390AA@ORSMSX106.amr.corp.intel.com> <4182E729-495A-4D2E-BCEA-875E6E538256@oracle.com> <56D4ED29.1050108@oracle.com> <3A0386B6-8072-4D84-8AF7-01D904DAEADF@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A3AF1F@ORSMSX106.amr.corp.intel.com> <32DA77B0-7683-4CE9-9ED3-8461B70E1E19@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A3BE9C@ORSMSX106.amr.corp.intel.com> 
<126788E1-352A-413A-AA28-5E38064E9DBA@oracle.com> <56D76597.40205@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A56A3CF99@ORSMSX106.amr.corp.intel.com> Message-ID: Much better. Thanks. > On Mar 3, 2016, at 12:43 PM, Deshpande, Vivek R wrote: > > Hi Vladimir, Christian > > I have updated the code with #ifdef. > Please find the updated webrev here: > http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.03/ > > Regards, > Vivek > > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Wednesday, March 02, 2016 2:14 PM > To: Christian Thalinger; Deshpande, Vivek R > Cc: hotspot compiler; Rukmannagari, Shravya > Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling > > On 3/2/16 2:04 PM, Christian Thalinger wrote: >> >>> On Mar 2, 2016, at 11:49 AM, Deshpande, Vivek R >>> > wrote: >>> >>> Hi Christian >>> >>> We could combine the declaration of the functions such as >>> fast_sha256() for 32 bit and 64 bit in macroAssembler_x86.hpp using COMMA. >>> We used COMMA to separate more arguments used in 64 bit using >>> LP64_ONLY macro. >> >> Ugh. This is ugly: >> >> voidfast_pow(XMMRegisterxmm0, XMMRegisterxmm1, XMMRegisterxmm2, >> XMMRegisterxmm3, XMMRegisterxmm4, XMMRegisterxmm5, XMMRegisterxmm6, >> XMMRegisterxmm7, Registerrax, Registerrcx, Register rdx NOT_LP64(COMMA >> Register tmp) LP64_ONLY(COMMA Register tmp1) >> LP64_ONLY(COMMA Register tmp2) LP64_ONLY(COMMA >> Register >> tmp3) LP64_ONLY(COMMA Register tmp4)); >> >> Can?t we use #ifdef instead? > > +1 for #ifdef > It is not first time we scope additional parameters with #ifdef. 
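[Editorial sketch] The two declaration styles being debated in this thread can be contrasted in reduced form. SKETCH_LP64 and the fast_sha256 signatures below are illustrative stand-ins, not the actual HotSpot macros or declarations (the real VM keys off _LP64, and the real functions return void); the functions here return their parameter count only so the sketch is checkable:

```cpp
#define SKETCH_LP64 1   // pretend we are building the 64-bit VM

#define COMMA ,
#if SKETCH_LP64
#define LP64_ONLY(code) code
#define NOT_LP64(code)
#else
#define LP64_ONLY(code)
#define NOT_LP64(code) code
#endif

struct Register { int id; };

// Macro style: one declaration, extra parameters spliced in via COMMA.
// COMMA survives argument splitting because it expands only afterwards.
int fast_sha256_macro(Register buf, Register state
                      NOT_LP64(COMMA Register tmp)
                      LP64_ONLY(COMMA Register tmp1)
                      LP64_ONLY(COMMA Register tmp2)) {
  return 2 LP64_ONLY(+ 2) NOT_LP64(+ 1);  // parameter count
}

// #ifdef style: two explicit declarations, one per word size.
#if SKETCH_LP64
int fast_sha256(Register buf, Register state, Register tmp1, Register tmp2) {
  return 4;
}
#else
int fast_sha256(Register buf, Register state, Register tmp) {
  return 3;
}
#endif
```

Both preprocess to the same 64-bit signature; the #ifdef form that the reviewers preferred simply makes each platform's parameter list readable at a glance instead of requiring mental macro expansion.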
> > Vladimir > >> >>> >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>> Sent: Wednesday, March 02, 2016 9:30 AM >>> To: Deshpande, Vivek R >>> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions enabling >>> >>> #define COMMA , >>> >>> Why do we have a define like this? It came in with 8139575. >>> >>>> On Mar 1, 2016, at 3:24 PM, Deshpande, Vivek R >>>> > wrote: >>>> >>>> Hi Vladimir, Christian >>>> >>>> I have updated the code according your suggestion of file name change. >>>> The updated webrev is at this location: >>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.02/ >>>> Please let me know if I have to do anything more. >>>> Also is there any change required to Makefile so that >>>> configurations.xml has name of the added file ? >>>> >>>> Regards, >>>> Vivek >>>> >>>> -----Original Message----- >>>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>>> Sent: Monday, February 29, 2016 5:59 PM >>>> To: Vladimir Kozlov >>>> Cc: Deshpande, Vivek R; hotspot compiler; Rukmannagari, Shravya >>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>> enabling >>>> >>>> >>>>> On Feb 29, 2016, at 3:15 PM, Vladimir Kozlov >>>>> wrote: >>>>> >>>>> I am against to have "intel" in a file name. We have >>>>> macroAssembler_libm_x86_*.cpp files for math intrinsics which does >>>>> not have intel in name. So I prefer to not have it. I would suggest >>>>> macroAssembler_sha_x86.cpp. >>>> >>>> I know we already have macroAssembler_libm_x86_*.cpp but >>>> macroAssembler_x86_.cpp would be better. >>>> >>>>> You can manipulate when to use it in vm_version_x86.cpp. >>>>> >>>>> Intel Copyright in the file's header is fine. >>>>> >>>>> Code changes are fine now (webrev.01). 
>>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 2/29/16 4:42 PM, Christian Thalinger wrote: >>>>>> >>>>>>> On Feb 29, 2016, at 2:00 PM, Deshpande, Vivek R >>>>>>> wrote: >>>>>>> >>>>>>> Hi Christian >>>>>>> >>>>>>> We used the SHA Extension >>>>>>> implementations(https://software.intel.com/en-us/articles/intel-s >>>>>>> ha-extensions-implementations) for the JVM implementation of SHA1 >>>>>>> and SHA256. >>>>>> >>>>>> Will that extension only be available on Intel chips? >>>>>> >>>>>>> It needed to have Intel copyright, so created a separate file. >>>>>> >>>>>> That is reasonable. >>>>>> >>>>>>> The white paper for the implementation is >>>>>>> https://software.intel.com/sites/default/files/article/402097/intel-sha-extensions-white-paper.pdf. >>>>>>> >>>>>>> Regards, >>>>>>> Vivek >>>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Christian Thalinger [mailto:christian.thalinger at oracle.com] >>>>>>> Sent: Monday, February 29, 2016 1:58 PM >>>>>>> To: Deshpande, Vivek R >>>>>>> Cc: Vladimir Kozlov; hotspot compiler; Rukmannagari, Shravya >>>>>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>>>>> enabling >>>>>>> >>>>>>> Why is the new file called macroAssembler_intel_x86.cpp? >>>>>>> >>>>>>>> On Feb 29, 2016, at 11:29 AM, Deshpande, Vivek R >>>>>>>> wrote: >>>>>>>> >>>>>>>> HI Vladimir >>>>>>>> >>>>>>>> Thank you for your review. >>>>>>>> I have updated the patch with the changes you have suggested. 
>>>>>>>> The new webrev is at this location: >>>>>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.01/ >>>>>>>> >>>>>>>> Regards >>>>>>>> Vivek >>>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>>>>>> Sent: Friday, February 26, 2016 6:50 PM >>>>>>>> To: Deshpande, Vivek R; hotspot compiler >>>>>>>> Cc: Viswanathan, Sandhya; Rukmannagari, Shravya >>>>>>>> Subject: Re: RFR (M): 8150767: Update for x86 SHA Extensions >>>>>>>> enabling >>>>>>>> >>>>>>>> Very nice, Vivek!!! >>>>>>>> >>>>>>>> Did you run tests with both 32- and 64-bit VMs? >>>>>>>> >>>>>>>> Small notes: >>>>>>>> >>>>>>>> In vm_version_x86.hpp spacing are not aligned in next line: >>>>>>>> >>>>>>>> static bool supports_avxonly() { return ((supports_avx2() || >>>>>>>> supports_avx()) && !supports_evex()); } >>>>>>>> + static bool supports_sha() { return (_features & CPU_SHA) >>>>>>>> != 0; } >>>>>>>> >>>>>>>> Flags setting code in vm_version_x86.cpp should be like this >>>>>>>> (you can check supports_sha() only once, don't split '} else {' >>>>>>>> line, set UseSHA false if all intrinsics flags are false (I >>>>>>>> included UseSHA512Intrinsics for future) ): >>>>>>>> >>>>>>>> if (supports_sha()) { >>>>>>>> if (FLAG_IS_DEFAULT(UseSHA)) { >>>>>>>> UseSHA = true; >>>>>>>> } >>>>>>>> } else if (UseSHA) { >>>>>>>> warning("SHA instructions are not available on this CPU"); >>>>>>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>>>>>> >>>>>>>> if (UseSHA) { >>>>>>>> if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { >>>>>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); } } else if >>>>>>>> (UseSHA1Intrinsics) { warning("Intrinsics for SHA-1 crypto hash >>>>>>>> functions not available on this CPU."); >>>>>>>> FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); } >>>>>>>> >>>>>>>> if (UseSHA) { >>>>>>>> if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { >>>>>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); } } else if >>>>>>>> (UseSHA256Intrinsics) { 
warning("Intrinsics for SHA-224 and >>>>>>>> SHA-256 crypto hash functions not available on this CPU."); >>>>>>>> FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); } >>>>>>>> >>>>>>>> if (UseSHA512Intrinsics) { >>>>>>>> warning("Intrinsics for SHA-384 and SHA-512 crypto hash >>>>>>>> functions not available on this CPU."); >>>>>>>> FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); } >>>>>>>> >>>>>>>> if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || >>>>>>>> UseSHA512Intrinsics)) { >>>>>>>> FLAG_SET_DEFAULT(UseSHA, false); } >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Vladimir >>>>>>>> >>>>>>>> On 2/26/16 4:37 PM, Deshpande, Vivek R wrote: >>>>>>>>> Hi all >>>>>>>>> >>>>>>>>> I would like to contribute a patch which optimizes SHA-1 >>>>>>>>> and SHA-256 for >>>>>>>>> 64 and 32 bit X86 architecture using Intel SHA extensions. >>>>>>>>> >>>>>>>>> Could you please review and sponsor this patch. >>>>>>>>> >>>>>>>>> Bug-id: >>>>>>>>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8150767 >>>>>>>>> webrev: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~vdeshpande/SHANI/8150767/webrev.00/ >>>>>>>>> >>>>>>>>> Thanks and regards, >>>>>>>>> >>>>>>>>> Vivek >>>>>>>>> >>>>>>> >>>>>> >>>> >>> >> From zoltan.majo at oracle.com Fri Mar 4 07:36:10 2016 From: zoltan.majo at oracle.com (Zoltán Majó) Date: Fri, 4 Mar 2016 08:36:10 +0100 Subject: [9] RFR (S): 8150839: Adjust the number of compiler threads for 32-bit platforms In-Reply-To: <17F00F67-A971-4F3E-AA75-17BC6D713D4E@oracle.com> References: <56D85AE6.6070100@oracle.com> <17F00F67-A971-4F3E-AA75-17BC6D713D4E@oracle.com> Message-ID: <56D93AEA.5020003@oracle.com> Thank you for the review, Igor! Best regards, Zoltan On 03/03/2016 05:28 PM, Igor Veresov wrote: > Seems reasonable. Reviewed. > > igor > >> On Mar 3, 2016, at 7:40 AM, Zoltán Majó wrote: >> >> Hi, >> >> >> please review the patch for 8150839. 
>> >> https://bugs.openjdk.java.net/browse/JDK-8150839 >> >> Problem: If the VM is executed on a machine with a large number of cores, it will create a large number of compiler threads. For example, on a 24-core machine, the VM will create 12 compiler threads (8 C2 compiler threads + 4 C1 compiler threads). >> >> On 32-bit platforms the virtual memory available to processes is typically limited to 2-4GB. As a result, the VM is likely to exhaust the virtual memory address space on these platforms and crash. >> >> >> Solution: This patch proposes to set the number of compiler threads to 3 on 32-bit platforms (2 C2 threads and 1 C1 thread), unless the user decides differently. On 64-bit platforms, the number of compiler threads is still set according to the number of available cores. >> >> Webrev: >> http://cr.openjdk.java.net/~zmajo/8150839/webrev.00/ >> >> Testing: JPRT. >> >> Thank you! >> >> Best regards, >> >> >> Zoltan >> From nils.eliasson at oracle.com Fri Mar 4 10:07:30 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 4 Mar 2016 11:07:30 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5A60A.50700@oracle.com> <56D5D066.7040805@oracle.com> <56D6DE47.7060405@oracle.com> <56D6EC92.7010309@oracle.com> <56D83272.9010202@oracle.com> Message-ID: <56D95E62.20309@oracle.com> Hi Volker, On 2016-03-03 16:25, Volker Simonis wrote: > Hi Nils, > > thanks for your comments. Please find my new webrev here: > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v4 > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v4 > Looks very good now. > > Comments as always inline: > > On Thu, Mar 3, 2016 at 1:47 PM, Nils Eliasson > > wrote: > > Hi Volker, > > On 2016-03-02 17:36, Volker Simonis wrote: >> Hi Nils, >> >> your last webrev (jdk.03 and hotspot.05)) looks pretty good! 
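[Editorial sketch] The "blocking compile" behavior that the 8150646 thread below wires through the directives framework boils down to: enqueue a task, then wait on it until a compiler thread marks it done, instead of returning as soon as it is queued. A toy std::thread illustration (not HotSpot's CompileBroker/CompileTask, whose types and locking are assumptions here):

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

struct CompileTask {
  bool blocking = false;
  bool done = false;
  std::mutex m;
  std::condition_variable cv;
};

// Compiler-thread side: finish the task and wake any blocked requester.
void compile(CompileTask* t) {
  // ... code generation would happen here ...
  std::lock_guard<std::mutex> g(t->m);
  t->done = true;
  t->cv.notify_all();
}

// Requester side: a blocking request waits for completion; a
// non-blocking one would continue running interpreted code meanwhile.
void request_compile(CompileTask* t) {
  std::thread worker(compile, t);
  if (t->blocking) {
    std::unique_lock<std::mutex> lk(t->m);
    t->cv.wait(lk, [t] { return t->done; });
  }
  worker.join();
}
```

This also shows why the thread cares about stale-task eviction: a requester parked on the condition variable must be guaranteed that its task is eventually compiled (or explicitly failed), never silently dropped from the queue.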
Ive >> used is as base for my new webrevs at: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v3 >> >> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v3 >> >> >> I've updated the copyrights, added the current reviewers and also >> added us both in the Contributed-by line (hope that's fine for you). > > Absolutely > >> >> Except that, I've only done the following minor fixes/changes: >> * >> compileBroker.{cpp,hpp}* >> >> - we don't need CompileBroker::is_compile_blocking() anymore. > Good >> >> *compilerDirectives.hpp* >> >> - I think we should use >> cflags(BackgroundCompilation, bool, BackgroundCompilation, >> BackgroundCompilation) >> instead of: >> cflags(BackgroundCompilation, bool, BackgroundCompilation, X) >> >> so we can also trigger blocking compiles from the command line >> with a CompileCommand (e.g. >> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false") >> That's very handy during development or and also for simple tests >> where we don't want to mess with compiler directives. (And the >> overhead to keep this feature is quite small, just >> "BackgroundCompilation" instead of "X" ;-) > > Without a very strong use case for this I don't want it as a > CompileCommand. > > CompileCommand options do have a cost - they force a temporary > unique copy of the directive if any option command matches > negating some of the positive effects of directives. Also the > CompileCommands are stringly typed, no compile time name or type > check is done. This can be fixed in various ways, but until then I > prefer not to add it. > > > Well, daily working is a strong use case for me:) Until there's no > possibility to provide compiler directives directly on the command > line instead of using an extra file, I think there's a justification > for the CompileCommand version. 
Also I think the cost argument is not > so relevent, because the feature will be mainly used during > developemnt or in small tests which don't need compiler directives. It > will actually make it possible to write simple tests which require > blocking compilations without the need to use compiler directives or > WB (just by specifying a -XX:CompileCommand option). > ok, you convinced me. >> >> *whitebox.cpp* >> >> I think it is good that you fixed the state but I think it is too >> complicated now. We don't need to strdup the string and can >> easily forget to free 'tmpstr' :) So maybe it is simpler to just >> do another transition for parsing the directive: >> >> { >> ThreadInVMfromNative ttvfn(thread); // back to VM >> DirectivesParser::parse_string(dir, tty); >> } > > Transitions are not free, but on the other hand the string may be > long. This is not a hot path in anyway so lets go with simple. > >> >> *advancedThresholdPolicy.cpp >> * >> - the JVMCI code looks reasonable (although I haven't tested >> JVMCI) and is actually even an improvement over my code which >> just picked the first blocking compilation. > > Feels good to remove the special cases. > >> >> *diagnosticCommand.cpp >> >> *- Shouldn't you also fix CompilerDirectivesAddDCmd to return the >> number of added directives and CompilerDirectivesRemoveDCmd to >> take the number of directives you want to pop? Or do you want to >> do this in a later, follow-up change? > > Yes, lets do that in a follow up change. They affect a number of > tests. > >> >> *WhiteBox.java* >> >> - I still think it would make sense to keep the two 'blocking' >> versions of enqueueMethodForCompilation() for convenience. For >> example your test fix for JDK-8073793 would be much simpler if >> you used them. I've added two comments to the 'blocking' >> convenience methods to mention the fact that calling them may >> shadow previously added compiler directives. 
> > I am ok with having then, but think Whitebox.java will get too > bloated. I would rather have the convenience-methods in some test > utility class, like CompilerWhiteBoxTest.java. > > > OK, I can live with that. I removed the blocking enqueue methods and > the corresponding tests. > >> >> *BlockingCompilation.java* >> >> - I've extended my regression test to test both methods of doing >> blocking compilation - with the new, 'blocking' >> enqueueMethodForCompilation() methods as well as by manually >> setting the corresponding compiler directives. If we should >> finally get consensus on removing the blocking convenience >> methods, please just remove the corresponding tests. > > Line 85: for (level = 1; level <= 4; level++) { > > You can not be sure all compilation levels are available. Use > > * @library /testlibrary /test/lib / > * @build sun.hotspot.WhiteBox > * compiler.testlibrary.CompilerUtils > > import compiler.testlibrary.CompilerUtils; > > int[] levels = CompilerUtils.getAvailableCompilationLevels(); > for (int level : levels) { > ... > > > Good catch. I've slightly revorked the test. I do bail out early if > there are no compilers at all and I've also fixed the break condition > of the loop which is calling foo() to compare against the highest > available compilation level instead of just using '4'. > >> >> I think we're close to a final version now, what do you think :) > > Yes! I'll take a look as soon as you post an updated webrev. > > > Would be good if you could run it trough JPRT once so we can be sure > we didn't break anything. I am running all jtreg tests on some select platforms right now. Best regards, Nils > > Regards, > Volker > > > Regards, > Nils > > >> >> Regards, >> Volker >> >> >> On Wed, Mar 2, 2016 at 2:37 PM, Nils Eliasson >> > wrote: >> >> Yes, I forgot to add the fix for working with multiple >> directives from whitebox. 
>> >> WB.addCompilerDirectives now returns the number of directives >> that were added, and removeCompilerDirectives takes a >> parameter for the number of directives that should be popped >> (atomically). >> >> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.03/ >> >> http://cr.openjdk.java.net/~neliasso/8150646/webrev.05/ >> >> >> Fixed test in JDK-8073793 to work with this: >> http://cr.openjdk.java.net/~neliasso/8073793/webrev.03/ >> >> >> Best regards, >> Nils Eliasson >> >> >> >> On 2016-03-02 13:36, Nils Eliasson wrote: >>> Hi Volker, >>> >>> I created these webrevs including all the feedback from >>> everyone: >>> >>> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.02/ >>> >>> * Only add- and removeCompilerDirective >>> >>> http://cr.openjdk.java.net/~neliasso/8150646/webrev.04/ >>> >>> * whitebox.cpp >>> -- addCompilerDirective to have correct VM states >>> * advancedThresholdPolicy.cpp >>> -- prevent blocking tasks from becoming stale >>> -- The logic for picking first blocking task broke JVMCI >>> code. Instead made the JVMCI code default (select the >>> blocking task with highest score.) >>> * compilerDirectives.hpp >>> -- Remove option CompileCommand. Not needed. >>> * compileBroker.cpp >>> -- Wrapped compile_method so that directive get and release >>> always are matched. >>> >>> Is anything missing? >>> >>> Best regards, >>> Nils Eliasson >>> >>> >>> On 2016-03-01 19:31, Volker Simonis wrote: >>>> Hi Pavel, Nils, Vladimir, >>>> >>>> sorry, but I was busy the last days so I couldn't answer your mails. >>>> >>>> Thanks a lot for your input and your suggestions. I'll look into this >>>> tomorrow and hopefully I'll be able to address all your concerns. >>>> >>>> Regards, >>>> Volker >>>> >>>> >>>> On Tue, Mar 1, 2016 at 6:24 PM, Vladimir Kozlov >>>> >>>> wrote: >>>>> Nils, please answer Pavel's questions. 
>>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> >>>>> On 3/1/16 6:24 AM, Nils Eliasson wrote: >>>>>> Hi Volker, >>>>>> >>>>>> An excellent proposition. This is how it should be used. >>>>>> >>>>>> I polished a few rough edges: >>>>>> * CompileBroker.cpp - The directives were already accessed in >>>>>> compile_method - but hidden in compilation_is_prohibited. I moved it out >>>>>> so we only have a single directive access. Wrapped compile_method to >>>>>> make sure the release of the directive doesn't get lost. >>>>>> * Let WB_AddCompilerDirective return a bool for success. Also fixed the >>>>>> state - need to be in native to get string, but then need to be in VM >>>>>> when parsing directive. >>>>>> >>>>>> And some comments: >>>>>> * I am against adding new compile option commands (At least until the >>>>>> stringly typeness is fixed). Let's add good ways to use compiler >>>>>> directives instead. >>>>>> >>>>>> I need to look at the stale task removal code tomorrow - hopefully we >>>>>> could save the blocking info in the task so we don't need to access the >>>>>> directive in the policy. >>>>>> >>>>>> All in here: >>>>>> Webrev:http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ >>>>>> >>>>>> >>>>>> The code runs fine with the test I fixed for JDK-8073793: >>>>>> http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ >>>>>> >>>>>> >>>>>> Best regards, >>>>>> Nils Eliasson >>>>>> >>>>>> On 2016-02-26 19:47, Volker Simonis wrote: >>>>>>> Hi, >>>>>>> >>>>>>> so I want to propose the following solution for this problem: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel >>>>>>> >>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ >>>>>>> >>>>>>> >>>>>>> I've started from the opposite side and made the BackgroundCompilation >>>>>>> manageable through the compiler directives framework. 
Once this works >>>>>>> (and it's actually trivial due to the nice design of the >>>>>>> CompilerDirectives framework :), we get the possibility to set the >>>>>>> BackgroundCompilation option on a per-method basis on the command line >>>>>>> via the CompileCommand option for free: >>>>>>> >>>>>>> >>>>>>> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" >>>>>>> >>>>>>> >>>>>>> And of course we can also use it directly as a compiler directive: >>>>>>> >>>>>>> [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] >>>>>>> >>>>>>> It also becomes possible to use this directly from the Whitebox API >>>>>>> through the DiagnosticCommand.compilerDirectivesAdd command. >>>>>>> Unfortunately, this command takes a file with compiler directives as >>>>>>> argument. I think this would be overkill in this context. So because >>>>>>> it was so easy and convenient, I added the following two new Whitebox >>>>>>> methods: >>>>>>> >>>>>>> public native void addCompilerDirective(String compDirect); >>>>>>> public native void removeCompilerDirective(); >>>>>>> >>>>>>> which can now be used to set arbitrary CompilerDirective commands >>>>>>> directly from within the WhiteBox API. (The implementation of these >>>>>>> two methods is trivial as you can see in whitebox.cpp). >>>>>>> >>>>>>> The blocking versions of enqueueMethodForCompilation() now become >>>>>>> simple wrappers around the existing methods without the need of any >>>>>>> code changes in their native implementation. This is good, because it >>>>>>> keeps the WhiteBox API stable! >>>>>>> >>>>>>> Finally some words about the implementation of the per-method >>>>>>> BackgroundCompilation functionality. It actually only requires two >>>>>>> small changes: >>>>>>> >>>>>>> 1. 
extending CompileBroker::is_compile_blocking() to take the method >>>>>>> and compilation level as arguments and use them to query the >>>>>>> DirectivesStack for the corresponding BackgroundCompilation value. >>>>>>> >>>>>>> 2. changing AdvancedThresholdPolicy::select_task() such that it >>>>>>> prefers blocking compilations. This is not only necessary because it >>>>>>> decreases the time we have to wait for a blocking compilation, but >>>>>>> also because it prevents blocking compiles from getting stale. This >>>>>>> could otherwise easily happen in AdvancedThresholdPolicy::is_stale() >>>>>>> for methods which only get artificially compiled during a test because >>>>>>> their invocation counters are usually too small. >>>>>>> >>>>>>> There's still a small probability that a blocking compilation will >>>>>>> not be blocking. This can happen if a method for which we request the >>>>>>> blocking compilation is already in the compilation queue (see the >>>>>>> check 'compilation_is_in_queue(method)' in >>>>>>> CompileBroker::compile_method_base()). In testing scenarios this will >>>>>>> rarely happen because methods which are manually compiled shouldn't >>>>>>> get called that many times to implicitly place them into the compile >>>>>>> queue. But we can even completely avoid this problem by using >>>>>>> WB.isMethodQueuedForCompilation() to make sure that a method is not in >>>>>>> the queue before we request a blocking compilation. >>>>>>> >>>>>>> I've also added a small regression test to demonstrate and verify the >>>>>>> new functionality. >>>>>>> >>>>>>> Regards, >>>>>>> Volker >>>>>> On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson >>>>>> wrote: >>>>>>>> Hi Vladimir, >>>>>>>> >>>>>>>> WhiteBox::compilation_locked is a global state that temporarily stops >>>>>>>> all >>>>>>>> compilations. In this case I just want to achieve blocking compilation >>>>>>>> for a >>>>>>>> single compile without affecting the rest of the system. 
The tests >>>>>>>> using it >>>>>>>> will continue executing as soon as that compile is finished, saving time >>>>>>>> where wait-loops are used today. It adds nice determinism to tests. >>>>>>>> >>>>>>>> Best regards, >>>>>>>> Nils Eliasson >>>>>>>> >>>>>>>> >>>>>>>> On 2016-02-25 22:14, Vladimir Kozlov wrote: >>>>>>>>> You are adding a parameter which is used only for testing. >>>>>>>>> Can we have a callback (or check field) into WB instead? Similar to >>>>>>>>> WhiteBox::compilation_locked. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Vladimir >>>>>>>>> >>>>>>>>> On 2/25/16 7:01 AM, Nils Eliasson wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> Please review this change that adds support for blocking compiles >>>>>>>>>> in the >>>>>>>>>> whitebox API. This enables simpler, less time-consuming tests. >>>>>>>>>> >>>>>>>>>> Motivation: >>>>>>>>>> * -XX:-BackgroundCompilation is a global flag and can be time >>>>>>>>>> consuming >>>>>>>>>> * Blocking compiles removes the need for waiting on the compile >>>>>>>>>> queue to >>>>>>>>>> complete >>>>>>>>>> * Compiles put in the queue may be evicted if the queue grows too big - >>>>>>>>>> causing indeterminism in the test >>>>>>>>>> * Less VM-flags allows for more tests in the same VM >>>>>>>>>> >>>>>>>>>> Testing: >>>>>>>>>> Posting a separate RFR for test fix that uses this change. They >>>>>>>>>> will be >>>>>>>>>> pushed at the same time. >>>>>>>>>> >>>>>>>>>> RFE:https://bugs.openjdk.java.net/browse/JDK-8150646 >>>>>>>>>> JDK rev:http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >>>>>>>>>> >>>>>>>>>> Hotspot rev:http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Best regards, >>>>>>>>>> Nils Eliasson >>> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From volker.simonis at gmail.com Fri Mar 4 11:29:57 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 4 Mar 2016 12:29:57 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: <56D95E62.20309@oracle.com> References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5A60A.50700@oracle.com> <56D5D066.7040805@oracle.com> <56D6DE47.7060405@oracle.com> <56D6EC92.7010309@oracle.com> <56D83272.9010202@oracle.com> <56D95E62.20309@oracle.com> Message-ID: Great! I'm happy we finally came to an agreement :) Best regards, Volker On Fri, Mar 4, 2016 at 11:07 AM, Nils Eliasson wrote: > Hi Volker, > > On 2016-03-03 16:25, Volker Simonis wrote: > > Hi Nils, > > thanks for your comments. Please find my new webrev here: > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v4 > http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v4 > > > Looks very good now. > > > Comments as always inline: > > On Thu, Mar 3, 2016 at 1:47 PM, Nils Eliasson < > nils.eliasson at oracle.com> wrote: > >> Hi Volker, >> >> On 2016-03-02 17:36, Volker Simonis wrote: >> >> Hi Nils, >> >> your last webrev (jdk.03 and hotspot.05) looks pretty good! I've used it >> as the base for my new webrevs at: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v3 >> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v3 >> >> I've updated the copyrights, added the current reviewers and also added >> us both in the Contributed-by line (hope that's fine for you). >> >> >> Absolutely >> >> >> Except that, I've only done the following minor fixes/changes: >> >> * compileBroker.{cpp,hpp}* >> >> - we don't need CompileBroker::is_compile_blocking() anymore. 
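[Editor's note: for readers following along, the two equivalent ways of forcing a blocking (foreground) compile that this thread settled on are, first, the compiler-directives form. The directive fragment quoted earlier in the thread, loaded for example with the -XX:CompilerDirectivesFile=<file> flag from the compiler-directives framework, is:]

```
[{ match: "java.lang.String::charAt", BackgroundCompilation: false }]
```

[And second, the per-method CompileCommand form, also quoted verbatim in the thread: -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false". Both set the same BackgroundCompilation option for the matched method only, leaving background compilation enabled for the rest of the VM.]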
>> >> Good >> >> >> *compilerDirectives.hpp* >> >> - I think we should use >> cflags(BackgroundCompilation, bool, BackgroundCompilation, >> BackgroundCompilation) >> instead of: >> cflags(BackgroundCompilation, bool, BackgroundCompilation, X) >> >> so we can also trigger blocking compiles from the command line with a >> CompileCommand (e.g. >> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false") >> That's very handy during development or and also for simple tests where we >> don't want to mess with compiler directives. (And the overhead to keep this >> feature is quite small, just "BackgroundCompilation" instead of "X" ;-) >> >> >> Without a very strong use case for this I don't want it as a >> CompileCommand. >> >> CompileCommand options do have a cost - they force a temporary unique >> copy of the directive if any option command matches negating some of the >> positive effects of directives. Also the CompileCommands are stringly >> typed, no compile time name or type check is done. This can be fixed in >> various ways, but until then I prefer not to add it. >> >> > Well, daily working is a strong use case for me:) Until there's no > possibility to provide compiler directives directly on the command line > instead of using an extra file, I think there's a justification for the > CompileCommand version. Also I think the cost argument is not so relevent, > because the feature will be mainly used during developemnt or in small > tests which don't need compiler directives. It will actually make it > possible to write simple tests which require blocking compilations without > the need to use compiler directives or WB (just by specifying a > -XX:CompileCommand option). > > > ok, you convinced me. > > >> *whitebox.cpp* >> >> I think it is good that you fixed the state but I think it is too >> complicated now. 
We don't need to strdup the string and can easily forget >> to free 'tmpstr' :) So maybe it is simpler to just do another transition >> for parsing the directive: >> >> { >> ThreadInVMfromNative ttvfn(thread); // back to VM >> DirectivesParser::parse_string(dir, tty); >> } >> >> >> Transitions are not free, but on the other hand the string may be long. >> This is not a hot path in anyway so lets go with simple. >> >> >> >> *advancedThresholdPolicy.cpp * >> - the JVMCI code looks reasonable (although I haven't tested JVMCI) and >> is actually even an improvement over my code which just picked the first >> blocking compilation. >> >> >> Feels good to remove the special cases. >> >> >> >> >> *diagnosticCommand.cpp *- Shouldn't you also fix >> CompilerDirectivesAddDCmd to return the number of added directives and >> CompilerDirectivesRemoveDCmd to take the number of directives you want to >> pop? Or do you want to do this in a later, follow-up change? >> >> >> Yes, lets do that in a follow up change. They affect a number of tests. >> >> >> *WhiteBox.java* >> >> - I still think it would make sense to keep the two 'blocking' versions >> of enqueueMethodForCompilation() for convenience. For example your test >> fix for JDK-8073793 would be much simpler if you used them. I've added two >> comments to the 'blocking' convenience methods to mention the fact that >> calling them may shadow previously added compiler directives. >> >> >> I am ok with having then, but think Whitebox.java will get too bloated. I >> would rather have the convenience-methods in some test utility class, like >> CompilerWhiteBoxTest.java. >> >> > OK, I can live with that. I removed the blocking enqueue methods and the > corresponding tests. 
> >> >> *BlockingCompilation.java* >> >> - I've extended my regression test to test both methods of doing blocking >> compilation - with the new, 'blocking' enqueueMethodForCompilation() >> methods as well as by manually setting the corresponding compiler >> directives. If we should finally get consensus on removing the blocking >> convenience methods, please just remove the corresponding tests. >> >> >> Line 85: for (level = 1; level <= 4; level++) { >> >> You can not be sure all compilation levels are available. Use >> >> * @library /testlibrary /test/lib / >> * @build sun.hotspot.WhiteBox >> * compiler.testlibrary.CompilerUtils >> >> import compiler.testlibrary.CompilerUtils; >> >> int[] levels = CompilerUtils.getAvailableCompilationLevels(); >> for (int level : levels) { >> ... >> > > Good catch. I've slightly revorked the test. I do bail out early if there > are no compilers at all and I've also fixed the break condition of the loop > which is calling foo() to compare against the highest available compilation > level instead of just using '4'. > >> >> I think we're close to a final version now, what do you think :) >> >> >> Yes! I'll take a look as soon as you post an updated webrev. >> > > Would be good if you could run it trough JPRT once so we can be sure we > didn't break anything. > > > I am running all jtreg tests on some select platforms right now. > > Best regards, > Nils > > > Regards, > Volker > > >> >> Regards, >> Nils >> >> >> >> Regards, >> Volker >> >> >> On Wed, Mar 2, 2016 at 2:37 PM, Nils Eliasson < >> nils.eliasson at oracle.com> wrote: >> >>> Yes, I forgot to add the fix for working with multiple directives from >>> whitebox. >>> >>> WB.addCompilerDirectives now returns the number of directives that where >>> added, and removeCompilerDirectives takes a parameter for the number of >>> directives that should be popped (atomically). 
>>> >>> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.03/ >>> http://cr.openjdk.java.net/~neliasso/8150646/webrev.05/ >>> >>> Fixed test in JDK-8073793 to work with this: >>> http://cr.openjdk.java.net/~neliasso/8073793/webrev.03/ >>> >>> Best regards, >>> Nils Eliasson >>> >>> >>> >>> On 2016-03-02 13:36, Nils Eliasson wrote: >>> >>> Hi Volker, >>> >>> I created these webrevs including all the feedback from everyone: >>> >>> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.02/ >>> * Only add- and removeCompilerDirective >>> >>> http://cr.openjdk.java.net/~neliasso/8150646/webrev.04/ >>> * whitebox.cpp >>> -- addCompilerDirective to have correct VM states >>> * advancedThresholdPolicy.cpp >>> -- prevent blocking tasks from becoming stale >>> -- The logic for picking first blocking task broke JVMCI code. Instead >>> made the JVMCI code default (select the blocking task with highest score.) >>> * compilerDirectives.hpp >>> -- Remove option CompileCommand. Not needed. >>> * compileBroker.cpp >>> -- Wrapped compile_method so that directive get and release always are >>> matched. >>> >>> Is anything missing? >>> >>> Best regards, >>> Nils Eliasson >>> >>> >>> On 2016-03-01 19:31, Volker Simonis wrote: >>> >>> Hi Pavel, Nils, Vladimir, >>> >>> sorry, but I was busy the last days so I couldn't answer your mails. >>> >>> Thanks a lot for your input and your suggestions. I'll look into this >>> tomorrow and hopefully I'll be able to address all your concerns. >>> >>> Regards, >>> Volker >>> >>> >>> On Tue, Mar 1, 2016 at 6:24 PM, Vladimir Kozlov wrote: >>> >>> Nils, please answer Pavel's questions. >>> >>> Thanks, >>> Vladimir >>> >>> >>> On 3/1/16 6:24 AM, Nils Eliasson wrote: >>> >>> Hi Volker, >>> >>> An excellent proposition. This is how it should be used. >>> >>> I polished a few rough edges: >>> * CompilerBroker.cpp - The directives was already access in >>> compile_method - but hidden incompilation_is_prohibited. 
I moved it out >>> so we only have a single directive access. Wrapped compile_method to >>> make sure the release of the directive doesn't get lost. >>> * Let WB_AddCompilerDirective return a bool for success. Also fixed the >>> state - need to be in native to get string, but then need to be in VM >>> when parsing directive. >>> >>> And some comments: >>> * I am against adding new compile option commands (At least until the >>> stringly typeness is fixed). Lets add good ways too use compiler >>> directives instead. >>> >>> I need to look at the stale task removal code tomorrow - hopefully we >>> could save the blocking info in the task so we don't need to access the >>> directive in the policy. >>> >>> All in here: >>> Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ >>> >>> The code runs fine with the test I fixed for JDK-8073793:http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ >>> >>> Best regards, >>> Nils Eliasson >>> >>> On 2016-02-26 19:47, Volker Simonis wrote: >>> >>> Hi, >>> >>> so I want to propose the following solution for this problem: >>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevelhttp://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ >>> >>> I've started from the opposite site and made the BackgroundCompilation >>> manageable through the compiler directives framework. 
Once this works >>> (and it's actually trivial due to the nice design of the >>> CompilerDirectives framework :), we get the possibility to set the >>> BackgroundCompilation option on a per method base on the command line >>> via the CompileCommand option for free: >>> >>> >>> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" >>> >>> >>> And of course we can also use it directly as a compiler directive: >>> >>> [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] >>> >>> It also becomes possible to use this directly from the Whitebox API >>> through the DiagnosticCommand.compilerDirectivesAdd command. >>> Unfortunately, this command takes a file with compiler directives as >>> argument. I think this would be overkill in this context. So because >>> it was so easy and convenient, I added the following two new Whitebox >>> methods: >>> >>> public native void addCompilerDirective(String compDirect); >>> public native void removeCompilerDirective(); >>> >>> which can now be used to set arbitrary CompilerDirective command >>> directly from within the WhiteBox API. (The implementation of these >>> two methods is trivial as you can see in whitebox.cpp). >>> v >>> The blocking versions of enqueueMethodForCompilation() now become >>> simple wrappers around the existing methods without the need of any >>> code changes in their native implementation. This is good, because it >>> keeps the WhiteBox API stable! >>> >>> Finally some words about the implementation of the per-method >>> BackgroundCompilation functionality. It actually only requires two >>> small changes: >>> >>> 1. extending CompileBroker::is_compile_blocking() to take the method >>> and compilation level as arguments and use them to query the >>> DirectivesStack for the corresponding BackgroundCompilation value. >>> >>> 2. changing AdvancedThresholdPolicy::select_task() such that it >>> prefers blocking compilations. 
This is not only necessary, because it >>> decreases the time we have to wait for a blocking compilation, but >>> also because it prevents blocking compiles from getting stale. This >>> could otherwise easily happen in AdvancedThresholdPolicy::is_stale() >>> for methods which only get artificially compiled during a test because >>> their invocations counters are usually too small. >>> >>> There's still a small probability that a blocking compilation will be >>> not blocking. This can happen if a method for which we request the >>> blocking compilation is already in the compilation queue (see the >>> check 'compilation_is_in_queue(method)' in >>> CompileBroker::compile_method_base()). In testing scenarios this will >>> rarely happen because methods which are manually compiled shouldn't >>> get called that many times to implicitly place them into the compile >>> queue. But we can even completely avoid this problem by using >>> WB.isMethodQueuedForCompilation() to make sure that a method is not in >>> the queue before we request a blocking compilation. >>> >>> I've also added a small regression test to demonstrate and verify the >>> new functionality. >>> >>> Regards, >>> Volker >>> >>> On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson wrote: >>> >>> Hi Vladimir, >>> >>> WhiteBox::compilation_locked is a global state that temporarily stops >>> all >>> compilations. I this case I just want to achieve blocking compilation >>> for a >>> single compile without affecting the rest of the system. The tests >>> using it >>> will continue executing as soon as that compile is finished, saving time >>> where wait-loops is used today. It adds nice determinism to tests. >>> >>> Best regards, >>> Nils Eliasson >>> >>> >>> On 2016-02-25 22:14, Vladimir Kozlov wrote: >>> >>> You are adding parameter which is used only for testing. >>> Can we have callback(or check field) into WB instead? Similar to >>> WhiteBox::compilation_locked. 
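[Editor's note: the select_task() rule discussed above, "prefer the blocking task with the highest score", can be sketched as a standalone model. This is an illustration only, not the actual AdvancedThresholdPolicy::select_task() code; picking blocking tasks first both shortens the wait of the thread that requested the compile and keeps those tasks from going stale in the queue.]

```java
import java.util.List;

// Standalone sketch (NOT HotSpot's AdvancedThresholdPolicy code) of
// the selection rule discussed in this thread: among queued compile
// tasks, prefer the blocking task with the highest score; otherwise
// fall back to the highest-scoring task overall.
public class SelectTask {
    record Task(String method, double score, boolean blocking) {}

    public static Task select(List<Task> queue) {
        Task best = null;
        Task bestBlocking = null;
        for (Task t : queue) {
            if (best == null || t.score() > best.score()) {
                best = t;
            }
            if (t.blocking() && (bestBlocking == null || t.score() > bestBlocking.score())) {
                bestBlocking = t;
            }
        }
        // Blocking tasks win so they are compiled promptly and never go stale.
        return bestBlocking != null ? bestBlocking : best;
    }

    public static void main(String[] args) {
        List<Task> queue = List.of(
                new Task("hot::method", 900.0, false),
                new Task("test::foo", 1.5, true),   // enqueued by a test
                new Task("warm::bar", 120.0, false));
        // The low-scoring blocking task is still selected first.
        System.out.println(select(queue).method()); // test::foo
    }
}
```

Note how this matches Volker's observation: a test method that is "artificially" compiled has tiny counters and hence a tiny score, so without the blocking preference it would sit at the back of the queue and risk being removed as stale.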
>>> >>> Thanks, >>> Vladimir >>> >>> On 2/25/16 7:01 AM, Nils Eliasson wrote: >>> >>> Hi, >>> >>> Please review this change that adds support for blocking compiles >>> in the >>> whitebox API. This enables simpler, less time-consuming tests. >>> >>> Motivation: >>> * -XX:-BackgroundCompilation is a global flag and can be time >>> consuming >>> * Blocking compiles removes the need for waiting on the compile >>> queue to >>> complete >>> * Compiles put in the queue may be evicted if the queue grows too big - >>> causing indeterminism in the test >>> * Less VM-flags allows for more tests in the same VM >>> >>> Testing: >>> Posting a separate RFR for test fix that uses this change. They >>> will be >>> pushed at the same time. >>> >>> RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 >>> JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >>> Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >>> >>> Best regards, >>> Nils Eliasson >>> >>> >>> >>> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pavel.punegov at oracle.com Fri Mar 4 16:34:31 2016 From: pavel.punegov at oracle.com (Pavel Punegov) Date: Fri, 4 Mar 2016 19:34:31 +0300 Subject: RFR (XXS): 8150955: RandomValidCommandsTest.java fails with UnsatisfiedLinkError: sun.hotspot.WhiteBox.registerNatives Message-ID: <03E7FC93-DA1B-4A03-8E77-E7E256AE0AC8@oracle.com> Hi, please review this small fix to the test bug. Issue: test fails to start because WhiteBox options were not specified. Test randomly generated only one command that was skipped. But the appropriate VM options are always obtained from the Command enum together with options specific for a command. In the failing case the VM started without any options. Fix: make test replace invalid commands with a valid one. webrev: http://cr.openjdk.java.net/~ppunegov/8150955/webrev.00/ bug: https://bugs.openjdk.java.net/browse/JDK-8150955 
Thanks, Pavel Punegov -------------- next part -------------- An HTML attachment was scrubbed... URL: From pavel.punegov at oracle.com Fri Mar 4 17:01:34 2016 From: pavel.punegov at oracle.com (Pavel Punegov) Date: Fri, 4 Mar 2016 20:01:34 +0300 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5A60A.50700@oracle.com> <56D5D066.7040805@oracle.com> <56D6DE47.7060405@oracle.com> <56D6EC92.7010309@oracle.com> <56D83272.9010202@oracle.com> <56D95E62.20309@oracle.com> Message-ID: Hi Volker, overall looks good to me (not a Reviewer). Just some questions about the test: 1. Do you need PrintCompilation and Inlining in the test? 39 * -XX:+PrintCompilation 40 * -XX:CompileCommand=option,BlockingCompilation::foo,PrintInlining 2. 500_000 for a loop seems to be way too much. There is a test/compiler/whitebox/CompilerWhiteBoxTest.java that has constants sufficient for a compilation. 3. Running the Client compiler only could pass even if compilations were blocking here: 128 if (level == 4 && iteration == i) { I think it should check that any level change has happened within some number of iterations, not only the 4th level. 
> >> >> Comments as always inline: >> >> On Thu, Mar 3, 2016 at 1:47 PM, Nils Eliasson < nils.eliasson at oracle.com > wrote: >> Hi Volker, >> >> On 2016-03-02 17:36, Volker Simonis wrote: >>> Hi Nils, >>> >>> your last webrev (jdk.03 and hotspot.05)) looks pretty good! Ive used is as base for my new webrevs at: >>> >>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v3 >>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v3 >>> >>> I've updated the copyrights, added the current reviewers and also added us both in the Contributed-by line (hope that's fine for you). >> >> Absolutely >> >>> >>> Except that, I've only done the following minor fixes/changes: >>> >>> compileBroker.{cpp,hpp} >>> >>> - we don't need CompileBroker::is_compile_blocking() anymore. >> Good >>> >>> compilerDirectives.hpp >>> >>> - I think we should use >>> cflags(BackgroundCompilation, bool, BackgroundCompilation, BackgroundCompilation) >>> instead of: >>> cflags(BackgroundCompilation, bool, BackgroundCompilation, X) >>> >>> so we can also trigger blocking compiles from the command line with a CompileCommand (e.g. -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false") That's very handy during development or and also for simple tests where we don't want to mess with compiler directives. (And the overhead to keep this feature is quite small, just "BackgroundCompilation" instead of "X" ;-) >> >> Without a very strong use case for this I don't want it as a CompileCommand. >> >> CompileCommand options do have a cost - they force a temporary unique copy of the directive if any option command matches negating some of the positive effects of directives. Also the CompileCommands are stringly typed, no compile time name or type check is done. This can be fixed in various ways, but until then I prefer not to add it. 
>> >> >> Well, daily working is a strong use case for me:) Until there's no possibility to provide compiler directives directly on the command line instead of using an extra file, I think there's a justification for the CompileCommand version. Also I think the cost argument is not so relevent, because the feature will be mainly used during developemnt or in small tests which don't need compiler directives. It will actually make it possible to write simple tests which require blocking compilations without the need to use compiler directives or WB (just by specifying a -XX:CompileCommand option). >> > ok, you convinced me. > >>> >>> whitebox.cpp >>> >>> I think it is good that you fixed the state but I think it is too complicated now. We don't need to strdup the string and can easily forget to free 'tmpstr' :) So maybe it is simpler to just do another transition for parsing the directive: >>> >>> { >>> ThreadInVMfromNative ttvfn(thread); // back to VM >>> DirectivesParser::parse_string(dir, tty); >>> } >> >> Transitions are not free, but on the other hand the string may be long. This is not a hot path in anyway so lets go with simple. >> >>> >>> advancedThresholdPolicy.cpp >>> >>> - the JVMCI code looks reasonable (although I haven't tested JVMCI) and is actually even an improvement over my code which just picked the first blocking compilation. >> >> Feels good to remove the special cases. >> >>> >>> diagnosticCommand.cpp >>> >>> - Shouldn't you also fix CompilerDirectivesAddDCmd to return the number of added directives and CompilerDirectivesRemoveDCmd to take the number of directives you want to pop? Or do you want to do this in a later, follow-up change? >> >> Yes, lets do that in a follow up change. They affect a number of tests. >> >>> >>> WhiteBox.java >>> >>> - I still think it would make sense to keep the two 'blocking' versions of enqueueMethodForCompilation() for convenience. For example your test fix for JDK-8073793 would be much simpler if you used them. 
I've added two comments to the 'blocking' convenience methods to mention the fact that calling them may shadow previously added compiler directives. >> >> I am ok with having then, but think Whitebox.java will get too bloated. I would rather have the convenience-methods in some test utility class, like CompilerWhiteBoxTest.java. >> >> >> OK, I can live with that. I removed the blocking enqueue methods and the corresponding tests. >>> >>> BlockingCompilation.java >>> >>> - I've extended my regression test to test both methods of doing blocking compilation - with the new, 'blocking' enqueueMethodForCompilation() methods as well as by manually setting the corresponding compiler directives. If we should finally get consensus on removing the blocking convenience methods, please just remove the corresponding tests. >> >> Line 85: for (level = 1; level <= 4; level++) { >> >> You can not be sure all compilation levels are available. Use >> >> * @library /testlibrary /test/lib / >> * @build sun.hotspot.WhiteBox >> * compiler.testlibrary.CompilerUtils >> >> import compiler.testlibrary.CompilerUtils; >> >> int[] levels = CompilerUtils.getAvailableCompilationLevels(); >> for (int level : levels) { >> ... >> >> Good catch. I've slightly revorked the test. I do bail out early if there are no compilers at all and I've also fixed the break condition of the loop which is calling foo() to compare against the highest available compilation level instead of just using '4'. >>> >>> I think we're close to a final version now, what do you think :) >> >> Yes! I'll take a look as soon as you post an updated webrev. >> >> Would be good if you could run it trough JPRT once so we can be sure we didn't break anything. > > I am running all jtreg tests on some select platforms right now. 
> > Best regards, > Nils > >> >> Regards, >> Volker >> >> >> Regards, >> Nils >> >> >>> >>> Regards, >>> Volker >>> >>> >>> On Wed, Mar 2, 2016 at 2:37 PM, Nils Eliasson < nils.eliasson at oracle.com > wrote: >>> Yes, I forgot to add the fix for working with multiple directives from whitebox. >>> >>> WB.addCompilerDirectives now returns the number of directives that were added, and removeCompilerDirectives takes a parameter for the number of directives that should be popped (atomically). >>> >>> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.03/ >>> http://cr.openjdk.java.net/~neliasso/8150646/webrev.05/ >>> >>> Fixed test in JDK-8073793 to work with this: http://cr.openjdk.java.net/~neliasso/8073793/webrev.03/ >>> >>> Best regards, >>> Nils Eliasson >>> >>> >>> >>> On 2016-03-02 13:36, Nils Eliasson wrote: >>>> Hi Volker, >>>> >>>> I created these webrevs including all the feedback from everyone: >>>> >>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.02/ >>>> * Only add- and removeCompilerDirective >>>> >>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev.04/ >>>> * whitebox.cpp >>>> -- addCompilerDirective to have correct VM states >>>> * advancedThresholdPolicy.cpp >>>> -- prevent blocking tasks from becoming stale >>>> -- The logic for picking first blocking task broke JVMCI code. Instead made the JVMCI code default (select the blocking task with highest score.) >>>> * compilerDirectives.hpp >>>> -- Remove option CompileCommand. Not needed. >>>> * compileBroker.cpp >>>> -- Wrapped compile_method so that directive get and release always are matched. >>>> >>>> Is anything missing? >>>> >>>> Best regards, >>>> Nils Eliasson >>>> >>>> >>>> On 2016-03-01 19:31, Volker Simonis wrote: >>>>> Hi Pavel, Nils, Vladimir, >>>>> >>>>> sorry, but I was busy the last days so I couldn't answer your mails. >>>>> >>>>> Thanks a lot for your input and your suggestions. 
I'll look into this >>>>> tomorrow and hopefully I'll be able to address all your concerns. >>>>> >>>>> Regards, >>>>> Volker >>>>> >>>>> >>>>> On Tue, Mar 1, 2016 at 6:24 PM, Vladimir Kozlov >>>>> wrote: >>>>>> Nils, please answer Pavel's questions. >>>>>> >>>>>> Thanks, >>>>>> Vladimir >>>>>> >>>>>> >>>>>> On 3/1/16 6:24 AM, Nils Eliasson wrote: >>>>>>> Hi Volker, >>>>>>> >>>>>>> An excellent proposition. This is how it should be used. >>>>>>> >>>>>>> I polished a few rough edges: >>>>>>> * CompilerBroker.cpp - The directives were already accessed in >>>>>>> compile_method - but hidden in compilation_is_prohibited. I moved it out >>>>>>> so we only have a single directive access. Wrapped compile_method to >>>>>>> make sure the release of the directive doesn't get lost. >>>>>>> * Let WB_AddCompilerDirective return a bool for success. Also fixed the >>>>>>> state - need to be in native to get string, but then need to be in VM >>>>>>> when parsing directive. >>>>>>> >>>>>>> And some comments: >>>>>>> * I am against adding new compile option commands (At least until the >>>>>>> stringly typeness is fixed). Let's add good ways to use compiler >>>>>>> directives instead. >>>>>>> >>>>>>> I need to look at the stale task removal code tomorrow - hopefully we >>>>>>> could save the blocking info in the task so we don't need to access the >>>>>>> directive in the policy. 
>>>>>>> >>>>>>> All in here: >>>>>>> Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ >>>>>>> >>>>>>> The code runs fine with the test I fixed for JDK-8073793: >>>>>>> http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ >>>>>>> >>>>>>> Best regards, >>>>>>> Nils Eliasson >>>>>>> >>>>>>> On 2016-02-26 19:47, Volker Simonis wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> so I want to propose the following solution for this problem: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel >>>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ >>>>>>>> >>>>>>>> I've started from the opposite side and made the BackgroundCompilation >>>>>>>> manageable through the compiler directives framework. Once this works >>>>>>>> (and it's actually trivial due to the nice design of the >>>>>>>> CompilerDirectives framework :), we get the possibility to set the >>>>>>>> BackgroundCompilation option on a per-method basis on the command line >>>>>>>> via the CompileCommand option for free: >>>>>>>> >>>>>>>> >>>>>>>> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" >>>>>>>> >>>>>>>> >>>>>>>> And of course we can also use it directly as a compiler directive: >>>>>>>> >>>>>>>> [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] >>>>>>>> >>>>>>>> It also becomes possible to use this directly from the Whitebox API >>>>>>>> through the DiagnosticCommand.compilerDirectivesAdd command. >>>>>>>> Unfortunately, this command takes a file with compiler directives as >>>>>>>> argument. I think this would be overkill in this context. 
So because >>>>>>>> it was so easy and convenient, I added the following two new Whitebox >>>>>>>> methods: >>>>>>>> >>>>>>>> public native void addCompilerDirective(String compDirect); >>>>>>>> public native void removeCompilerDirective(); >>>>>>>> >>>>>>>> which can now be used to set arbitrary CompilerDirective commands >>>>>>>> directly from within the WhiteBox API. (The implementation of these >>>>>>>> two methods is trivial as you can see in whitebox.cpp). >>>>>>>> >>>>>>>> The blocking versions of enqueueMethodForCompilation() now become >>>>>>>> simple wrappers around the existing methods without the need of any >>>>>>>> code changes in their native implementation. This is good, because it >>>>>>>> keeps the WhiteBox API stable! >>>>>>>> >>>>>>>> Finally some words about the implementation of the per-method >>>>>>>> BackgroundCompilation functionality. It actually only requires two >>>>>>>> small changes: >>>>>>>> >>>>>>>> 1. extending CompileBroker::is_compile_blocking() to take the method >>>>>>>> and compilation level as arguments and use them to query the >>>>>>>> DirectivesStack for the corresponding BackgroundCompilation value. >>>>>>>> >>>>>>>> 2. changing AdvancedThresholdPolicy::select_task() such that it >>>>>>>> prefers blocking compilations. This is not only necessary, because it >>>>>>>> decreases the time we have to wait for a blocking compilation, but >>>>>>>> also because it prevents blocking compiles from getting stale. This >>>>>>>> could otherwise easily happen in AdvancedThresholdPolicy::is_stale() >>>>>>>> for methods which only get artificially compiled during a test because >>>>>>>> their invocations counters are usually too small. >>>>>>>> >>>>>>>> There's still a small probability that a blocking compilation will >>>>>>>> not be blocking. 
This can happen if a method for which we request the >>>>>>>> blocking compilation is already in the compilation queue (see the >>>>>>>> check 'compilation_is_in_queue(method)' in >>>>>>>> CompileBroker::compile_method_base()). In testing scenarios this will >>>>>>>> rarely happen because methods which are manually compiled shouldn't >>>>>>>> get called that many times to implicitly place them into the compile >>>>>>>> queue. But we can even completely avoid this problem by using >>>>>>>> WB.isMethodQueuedForCompilation() to make sure that a method is not in >>>>>>>> the queue before we request a blocking compilation. >>>>>>>> >>>>>>>> I've also added a small regression test to demonstrate and verify the >>>>>>>> new functionality. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Volker >>>>>>> On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson >>>>>>> wrote: >>>>>>>>> Hi Vladimir, >>>>>>>>> >>>>>>>>> WhiteBox::compilation_locked is a global state that temporarily stops >>>>>>>>> all >>>>>>>>> compilations. In this case I just want to achieve blocking compilation >>>>>>>>> for a >>>>>>>>> single compile without affecting the rest of the system. The tests >>>>>>>>> using it >>>>>>>>> will continue executing as soon as that compile is finished, saving time >>>>>>>>> where wait-loops are used today. It adds nice determinism to tests. >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Nils Eliasson >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2016-02-25 22:14, Vladimir Kozlov wrote: >>>>>>>>>> You are adding a parameter which is used only for testing. >>>>>>>>>> Can we have callback(or check field) into WB instead? Similar to >>>>>>>>>> WhiteBox::compilation_locked. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Vladimir >>>>>>>>>> >>>>>>>>>> On 2/25/16 7:01 AM, Nils Eliasson wrote: >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> Please review this change that adds support for blocking compiles >>>>>>>>>>> in the >>>>>>>>>>> whitebox API. This enables simpler, less time-consuming tests. 
>>>>>>>>>>> >>>>>>>>>>> Motivation: >>>>>>>>>>> * -XX:-BackgroundCompilation is a global flag and can be time >>>>>>>>>>> consuming >>>>>>>>>>> * Blocking compiles removes the need for waiting on the compile >>>>>>>>>>> queue to >>>>>>>>>>> complete >>>>>>>>>>> * Compiles put in the queue may be evicted if the queue grows too big - >>>>>>>>>>> causing indeterminism in the test >>>>>>>>>>> * Less VM-flags allows for more tests in the same VM >>>>>>>>>>> >>>>>>>>>>> Testing: >>>>>>>>>>> Posting a separate RFR for test fix that uses this change. They >>>>>>>>>>> will be >>>>>>>>>>> pushed at the same time. >>>>>>>>>>> >>>>>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 >>>>>>>>>>> JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >>>>>>>>>>> Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >>>>>>>>>>> >>>>>>>>>>> Best regards, >>>>>>>>>>> Nils Eliasson >>>> >>> >>> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Fri Mar 4 17:09:19 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 4 Mar 2016 09:09:19 -0800 Subject: RFR (XXS): 8150955: RandomValidCommandsTest.java fails with UnsatisfiedLinkError: sun.hotspot.WhiteBox.registerNatives In-Reply-To: <03E7FC93-DA1B-4A03-8E77-E7E256AE0AC8@oracle.com> References: <03E7FC93-DA1B-4A03-8E77-E7E256AE0AC8@oracle.com> Message-ID: <56D9C13F.9050009@oracle.com> Looks good. Thanks, Vladimir On 3/4/16 8:34 AM, Pavel Punegov wrote: > Hi, > > please review this small fix to the test bug. > > Issue: test fails to start because WhiteBox options were not specified. Test randomly generated only one command that > was skipped. But appropriate VM options are always taken from the Command enum together with options specific for > a command. In the failing case VM started without any options. > > Fix: make test replace invalid commands with a valid one. 
> > webrev: http://cr.openjdk.java.net/~ppunegov/8150955/webrev.00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8150955 > > Thanks, > Pavel Punegov > From volker.simonis at gmail.com Fri Mar 4 18:30:55 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 4 Mar 2016 19:30:55 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5A60A.50700@oracle.com> <56D5D066.7040805@oracle.com> <56D6DE47.7060405@oracle.com> <56D6EC92.7010309@oracle.com> <56D83272.9010202@oracle.com> <56D95E62.20309@oracle.com> Message-ID: Hi Pavel, thanks for your feedback. Please find my comments inline: On Fri, Mar 4, 2016 at 6:01 PM, Pavel Punegov wrote: > Hi Volker, > > overall looks good to me (not Reviewer). > > Just some questions about the test: > 1. Do you need to PrintCompilation and Inlining in the test? > > 39 * -XX:+PrintCompilation > 40 * -XX:CompileCommand=option,BlockingCompilation::foo,PrintInlining > > It's not necessary for the test, but if the test will fail it will be good to have this information in the .jtr file > 2. 500_000 for a loop seems to be way too much. > There is a test/compiler/whitebox/CompilerWhiteBoxTest.java that has > constants enough for a compilation. > > That's just an upper bound which won't be reached. We break out of the loop once we reached the maximum compilation level. Notice that the loop count is not the count until the method gets enqueued for compilation, but until it actually gets compiled because this loop tests the non-blocking compilations. So if the machine is loaded and/or the compile queue is busy, it can take quite some iterations until the method is finally compiled (in my manual tests it usually took not more than 8000 iterations until the test left the loop). > 3. 
Running Client compiler only could pass even if compilation were > blocking here: > > 128 if (level == 4 && iteration == i) { > > I think it should check that any level change has happened within some > amount of iterations, not only the 4th level. > The problem is that C1 compiles are so bleeding fast that I got too many false positives, i.e. the method got compiled in the same iteration just before I queried the compilation level. I actually begin to think that this test may always fail if you run the JTreg tests with "-Xbatch". Do you do that by default (e.g. in JPRT)? In that case we would probably have to additionally set "-XX:+BackgroundCompilation" in the test options? Regards, Volker > > Thanks, > Pavel Punegov > > On 04 Mar 2016, at 14:29, Volker Simonis wrote: > > Great! I'm happy we finally came to an agreement :) > > Best regards, > Volker > > On Fri, Mar 4, 2016 at 11:07 AM, Nils Eliasson > wrote: > >> Hi Volker, >> >> On 2016-03-03 16:25, Volker Simonis wrote: >> >> Hi Nils, >> >> thanks for your comments. Please find my new webrev here: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v4 >> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v4 >> >> >> Looks very good now. >> >> >> Comments as always inline: >> >> On Thu, Mar 3, 2016 at 1:47 PM, Nils Eliasson < >> nils.eliasson at oracle.com> wrote: >> >>> Hi Volker, >>> >>> On 2016-03-02 17:36, Volker Simonis wrote: >>> >>> Hi Nils, >>> >>> your last webrev (jdk.03 and hotspot.05) looks pretty good! I've used it >>> as the base for my new webrevs at: >>> >>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v3 >>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v3 >>> >>> I've updated the copyrights, added the current reviewers and also added >>> us both in the Contributed-by line (hope that's fine for you). 
>>> >>> >>> Absolutely >>> >>> >>> Except that, I've only done the following minor fixes/changes: >>> >>> * compileBroker.{cpp,hpp}* >>> >>> - we don't need CompileBroker::is_compile_blocking() anymore. >>> >>> Good >>> >>> >>> *compilerDirectives.hpp* >>> >>> - I think we should use >>> cflags(BackgroundCompilation, bool, BackgroundCompilation, >>> BackgroundCompilation) >>> instead of: >>> cflags(BackgroundCompilation, bool, BackgroundCompilation, X) >>> >>> so we can also trigger blocking compiles from the command line with a >>> CompileCommand (e.g. >>> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false") >>> That's very handy during development or and also for simple tests where we >>> don't want to mess with compiler directives. (And the overhead to keep this >>> feature is quite small, just "BackgroundCompilation" instead of "X" ;-) >>> >>> >>> Without a very strong use case for this I don't want it as a >>> CompileCommand. >>> >>> CompileCommand options do have a cost - they force a temporary unique >>> copy of the directive if any option command matches negating some of the >>> positive effects of directives. Also the CompileCommands are stringly >>> typed, no compile time name or type check is done. This can be fixed in >>> various ways, but until then I prefer not to add it. >>> >>> >> Well, daily working is a strong use case for me:) Until there's no >> possibility to provide compiler directives directly on the command line >> instead of using an extra file, I think there's a justification for the >> CompileCommand version. Also I think the cost argument is not so relevent, >> because the feature will be mainly used during developemnt or in small >> tests which don't need compiler directives. It will actually make it >> possible to write simple tests which require blocking compilations without >> the need to use compiler directives or WB (just by specifying a >> -XX:CompileCommand option). >> >> >> ok, you convinced me. 
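For readers skimming the archive, the two mechanisms being weighed in the exchange above look like this (a sketch assembled from the examples quoted earlier in the thread; java.lang.String::charAt is just the illustration method used there, and the flag spellings should be checked against the final webrev):

```
# Per-method option directly on the command line (the CompileCommand way):
java -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" ...

# The same request expressed as a compiler directive, e.g. in a JSON
# file passed via -XX:CompilerDirectivesFile=<file> or added at runtime
# through the compilerDirectivesAdd diagnostic command:
[{ match: "java.lang.String::charAt", BackgroundCompilation: false }]
```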
>> >> >>> *whitebox.cpp* >>> >>> I think it is good that you fixed the state but I think it is too >>> complicated now. We don't need to strdup the string and can easily forget >>> to free 'tmpstr' :) So maybe it is simpler to just do another transition >>> for parsing the directive: >>> >>> { >>> ThreadInVMfromNative ttvfn(thread); // back to VM >>> DirectivesParser::parse_string(dir, tty); >>> } >>> >>> >>> Transitions are not free, but on the other hand the string may be long. >>> This is not a hot path in anyway so lets go with simple. >>> >>> >>> >>> *advancedThresholdPolicy.cpp * >>> - the JVMCI code looks reasonable (although I haven't tested JVMCI) and >>> is actually even an improvement over my code which just picked the first >>> blocking compilation. >>> >>> >>> Feels good to remove the special cases. >>> >>> >>> >>> >>> *diagnosticCommand.cpp *- Shouldn't you also fix >>> CompilerDirectivesAddDCmd to return the number of added directives and >>> CompilerDirectivesRemoveDCmd to take the number of directives you want to >>> pop? Or do you want to do this in a later, follow-up change? >>> >>> >>> Yes, lets do that in a follow up change. They affect a number of tests. >>> >>> >>> *WhiteBox.java* >>> >>> - I still think it would make sense to keep the two 'blocking' versions >>> of enqueueMethodForCompilation() for convenience. For example your test >>> fix for JDK-8073793 would be much simpler if you used them. I've added two >>> comments to the 'blocking' convenience methods to mention the fact that >>> calling them may shadow previously added compiler directives. >>> >>> >>> I am ok with having then, but think Whitebox.java will get too bloated. >>> I would rather have the convenience-methods in some test utility class, >>> like CompilerWhiteBoxTest.java. >>> >>> >> OK, I can live with that. I removed the blocking enqueue methods and the >> corresponding tests. 
>> >>> >>> *BlockingCompilation.java* >>> >>> - I've extended my regression test to test both methods of doing >>> blocking compilation - with the new, 'blocking' >>> enqueueMethodForCompilation() methods as well as by manually setting the >>> corresponding compiler directives. If we should finally get consensus on >>> removing the blocking convenience methods, please just remove the >>> corresponding tests. >>> >>> >>> Line 85: for (level = 1; level <= 4; level++) { >>> >>> You can not be sure all compilation levels are available. Use >>> >>> * @library /testlibrary /test/lib / >>> * @build sun.hotspot.WhiteBox >>> * compiler.testlibrary.CompilerUtils >>> >>> import compiler.testlibrary.CompilerUtils; >>> >>> int[] levels = CompilerUtils.getAvailableCompilationLevels(); >>> for (int level : levels) { >>> ... >>> >> >> Good catch. I've slightly revorked the test. I do bail out early if there >> are no compilers at all and I've also fixed the break condition of the loop >> which is calling foo() to compare against the highest available compilation >> level instead of just using '4'. >> >>> >>> I think we're close to a final version now, what do you think :) >>> >>> >>> Yes! I'll take a look as soon as you post an updated webrev. >>> >> >> Would be good if you could run it trough JPRT once so we can be sure we >> didn't break anything. >> >> >> I am running all jtreg tests on some select platforms right now. >> >> Best regards, >> Nils >> >> >> Regards, >> Volker >> >> >>> >>> Regards, >>> Nils >>> >>> >>> >>> Regards, >>> Volker >>> >>> >>> On Wed, Mar 2, 2016 at 2:37 PM, Nils Eliasson < >>> nils.eliasson at oracle.com> wrote: >>> >>>> Yes, I forgot to add the fix for working with multiple directives from >>>> whitebox. >>>> >>>> WB.addCompilerDirectives now returns the number of directives that >>>> where added, and removeCompilerDirectives takes a parameter for the number >>>> of directives that should be popped (atomically). 
>>>> >>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.03/ >>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev.05/ >>>> >>>> Fixed test in JDK-8073793 to work with this: >>>> http://cr.openjdk.java.net/~neliasso/8073793/webrev.03/ >>>> >>>> Best regards, >>>> Nils Eliasson >>>> >>>> >>>> >>>> On 2016-03-02 13:36, Nils Eliasson wrote: >>>> >>>> Hi Volker, >>>> >>>> I created these webrevs including all the feedback from everyone: >>>> >>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.02/ >>>> * Only add- and removeCompilerDirective >>>> >>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev.04/ >>>> * whitebox.cpp >>>> -- addCompilerDirective to have correct VM states >>>> * advancedThresholdPolicy.cpp >>>> -- prevent blocking tasks from becoming stale >>>> -- The logic for picking first blocking task broke JVMCI code. Instead >>>> made the JVMCI code default (select the blocking task with highest score.) >>>> * compilerDirectives.hpp >>>> -- Remove option CompileCommand. Not needed. >>>> * compileBroker.cpp >>>> -- Wrapped compile_method so that directive get and release always are >>>> matched. >>>> >>>> Is anything missing? >>>> >>>> Best regards, >>>> Nils Eliasson >>>> >>>> >>>> On 2016-03-01 19:31, Volker Simonis wrote: >>>> >>>> Hi Pavel, Nils, Vladimir, >>>> >>>> sorry, but I was busy the last days so I couldn't answer your mails. >>>> >>>> Thanks a lot for your input and your suggestions. I'll look into this >>>> tomorrow and hopefully I'll be able to address all your concerns. >>>> >>>> Regards, >>>> Volker >>>> >>>> >>>> On Tue, Mar 1, 2016 at 6:24 PM, Vladimir Kozlov wrote: >>>> >>>> Nils, please answer Pavel's questions. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> >>>> On 3/1/16 6:24 AM, Nils Eliasson wrote: >>>> >>>> Hi Volker, >>>> >>>> An excellent proposition. This is how it should be used. 
>>>> >>>> I polished a few rough edges: >>>> * CompilerBroker.cpp - The directives was already access in >>>> compile_method - but hidden incompilation_is_prohibited. I moved it out >>>> so we only have a single directive access. Wrapped compile_method to >>>> make sure the release of the directive doesn't get lost. >>>> * Let WB_AddCompilerDirective return a bool for success. Also fixed the >>>> state - need to be in native to get string, but then need to be in VM >>>> when parsing directive. >>>> >>>> And some comments: >>>> * I am against adding new compile option commands (At least until the >>>> stringly typeness is fixed). Lets add good ways too use compiler >>>> directives instead. >>>> >>>> I need to look at the stale task removal code tomorrow - hopefully we >>>> could save the blocking info in the task so we don't need to access the >>>> directive in the policy. >>>> >>>> All in here: >>>> Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ >>>> >>>> The code runs fine with the test I fixed for JDK-8073793:http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ >>>> >>>> Best regards, >>>> Nils Eliasson >>>> >>>> On 2016-02-26 19:47, Volker Simonis wrote: >>>> >>>> Hi, >>>> >>>> so I want to propose the following solution for this problem: >>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevelhttp://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ >>>> >>>> I've started from the opposite site and made the BackgroundCompilation >>>> manageable through the compiler directives framework. 
Once this works >>>> (and it's actually trivial due to the nice design of the >>>> CompilerDirectives framework :), we get the possibility to set the >>>> BackgroundCompilation option on a per method base on the command line >>>> via the CompileCommand option for free: >>>> >>>> >>>> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" >>>> >>>> >>>> And of course we can also use it directly as a compiler directive: >>>> >>>> [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] >>>> >>>> It also becomes possible to use this directly from the Whitebox API >>>> through the DiagnosticCommand.compilerDirectivesAdd command. >>>> Unfortunately, this command takes a file with compiler directives as >>>> argument. I think this would be overkill in this context. So because >>>> it was so easy and convenient, I added the following two new Whitebox >>>> methods: >>>> >>>> public native void addCompilerDirective(String compDirect); >>>> public native void removeCompilerDirective(); >>>> >>>> which can now be used to set arbitrary CompilerDirective command >>>> directly from within the WhiteBox API. (The implementation of these >>>> two methods is trivial as you can see in whitebox.cpp). >>>> v >>>> The blocking versions of enqueueMethodForCompilation() now become >>>> simple wrappers around the existing methods without the need of any >>>> code changes in their native implementation. This is good, because it >>>> keeps the WhiteBox API stable! >>>> >>>> Finally some words about the implementation of the per-method >>>> BackgroundCompilation functionality. It actually only requires two >>>> small changes: >>>> >>>> 1. extending CompileBroker::is_compile_blocking() to take the method >>>> and compilation level as arguments and use them to query the >>>> DirectivesStack for the corresponding BackgroundCompilation value. >>>> >>>> 2. changing AdvancedThresholdPolicy::select_task() such that it >>>> prefers blocking compilations. 
This is not only necessary, because it >>>> decreases the time we have to wait for a blocking compilation, but >>>> also because it prevents blocking compiles from getting stale. This >>>> could otherwise easily happen in AdvancedThresholdPolicy::is_stale() >>>> for methods which only get artificially compiled during a test because >>>> their invocations counters are usually too small. >>>> >>>> There's still a small probability that a blocking compilation will be >>>> not blocking. This can happen if a method for which we request the >>>> blocking compilation is already in the compilation queue (see the >>>> check 'compilation_is_in_queue(method)' in >>>> CompileBroker::compile_method_base()). In testing scenarios this will >>>> rarely happen because methods which are manually compiled shouldn't >>>> get called that many times to implicitly place them into the compile >>>> queue. But we can even completely avoid this problem by using >>>> WB.isMethodQueuedForCompilation() to make sure that a method is not in >>>> the queue before we request a blocking compilation. >>>> >>>> I've also added a small regression test to demonstrate and verify the >>>> new functionality. >>>> >>>> Regards, >>>> Volker >>>> >>>> On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson wrote: >>>> >>>> Hi Vladimir, >>>> >>>> WhiteBox::compilation_locked is a global state that temporarily stops >>>> all >>>> compilations. I this case I just want to achieve blocking compilation >>>> for a >>>> single compile without affecting the rest of the system. The tests >>>> using it >>>> will continue executing as soon as that compile is finished, saving time >>>> where wait-loops is used today. It adds nice determinism to tests. >>>> >>>> Best regards, >>>> Nils Eliasson >>>> >>>> >>>> On 2016-02-25 22:14, Vladimir Kozlov wrote: >>>> >>>> You are adding parameter which is used only for testing. >>>> Can we have callback(or check field) into WB instead? Similar to >>>> WhiteBox::compilation_locked. 
>>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 2/25/16 7:01 AM, Nils Eliasson wrote: >>>> >>>> Hi, >>>> >>>> Please review this change that adds support for blocking compiles >>>> in the >>>> whitebox API. This enables simpler less time consuming tests. >>>> >>>> Motivation: >>>> * -XX:-BackgroundCompilation is a global flag and can be time >>>> consuming >>>> * Blocking compiles removes the need for waiting on the compile >>>> queue to >>>> complete >>>> * Compiles put in the queue may be evicted if the queue grows to big - >>>> causing indeterminism in the test >>>> * Less VM-flags allows for more tests in the same VM >>>> >>>> Testing: >>>> Posting a separate RFR for test fix that uses this change. They >>>> will be >>>> pushed at the same time. >>>> >>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 >>>> JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >>>> Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >>>> >>>> Best regards, >>>> Nils Eliasson >>>> >>>> >>>> >>>> >>> >>> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug.simon at oracle.com Sat Mar 5 11:45:56 2016 From: doug.simon at oracle.com (Doug Simon) Date: Sat, 5 Mar 2016 12:45:56 +0100 Subject: RFR: 8151266: HotSpotResolvedJavaFieldImpl::isStable() does not work as expected Message-ID: <681386EA-5DC4-4664-B80B-D64CEBF22A89@oracle.com> Please review this small change that makes HotSpotResolvedJavaFieldImpl.isStable() return the right value for HotSpotResolvedJavaFieldImpl objects created from java.lang.reflect.Field objects. 
The problem was that HotSpotResolvedJavaField objects created from reflection objects didn't get the VM internal modifier flags: http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/0adf6c8c7223/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotMetaAccessProvider.java#l117 https://bugs.openjdk.java.net/browse/JDK-8151266 http://cr.openjdk.java.net/~dnsimon/8151266/ -Doug From felix.yang at linaro.org Sat Mar 5 15:54:43 2016 From: felix.yang at linaro.org (Felix Yang) Date: Sat, 5 Mar 2016 23:54:43 +0800 Subject: RFR: 8151340: aarch64: prefetch the destination word for write prior to ldxr/stxr loops. Message-ID: Hi, Please review the following webrev: http://cr.openjdk.java.net/~fyang/8151340/webrev.00/ JIRA issue: https://bugs.openjdk.java.net/browse/JDK-8151340 As discussed in LKML: http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/355996.html, the cost of changing a cache line from shared to exclusive state can be significant on aarch64 cores, especially when this is triggered by an exclusive store, since it may result in having to retry the transaction. This patch makes use of the "prfm" instruction to prefetch cache lines for write prior to ldxr/stxr loops. Is it OK? Thanks, Felix -------------- next part -------------- An HTML attachment was scrubbed... URL: From aph at redhat.com Sun Mar 6 18:45:13 2016 From: aph at redhat.com (Andrew Haley) Date: Sun, 6 Mar 2016 18:45:13 +0000 Subject: RFR: 8151340: aarch64: prefetch the destination word for write prior to ldxr/stxr loops. In-Reply-To: References: Message-ID: <56DC7AB9.8070007@redhat.com> On 05/03/16 15:54, Felix Yang wrote: > This patch makes use of the "prfm" instruction to prefetch cache lines > for write prior to ldxr/stxr loops. Is it OK? Yes, but there is another much larger patch to ldxr/stxr on its way, and this patch is already approved. Please wait for it to be pushed. Thanks, Andrew. 
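For context on the pattern discussed above: an ldxr/stxr retry loop gains exclusive access to the line, but can lose it again before the store succeeds. A rough sketch of such a loop with the proposed prefetch added (illustrative aarch64 assembly, not the actual patch; the register choices and the exact prfm hint are placeholders):

```asm
        prfm    pstl1strm, [x2]      // hint: we are about to store to [x2]
retry:
        ldxr    x0, [x2]             // load-exclusive, sets the exclusive monitor
        add     x0, x0, #1           // example update (e.g. an atomic increment)
        stxr    w1, x0, [x2]         // store-exclusive; w1 == 0 on success
        cbnz    w1, retry            // monitor was lost - retry the transaction
```

The prefetch asks for the line in a writable state up front, so the stxr is less likely to fail because the ldxr pulled the line in only as shared.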
From hui.shi at linaro.org Mon Mar 7 12:56:12 2016 From: hui.shi at linaro.org (Hui Shi) Date: Mon, 7 Mar 2016 20:56:12 +0800 Subject: RFR(s): 8149418 AArch64: replace tst+br with tbz instruction when tst's constant operand is 2 power Message-ID: Would someone help review this webrev? We found this while adding byte array equals for aarch64 early on. webrev: http://cr.openjdk.java.net/~hshi/8149418/webrev/ bug: https://bugs.openjdk.java.net/browse/JDK-8149418 Replace consecutive tst + br instructions with tbz/tbnz when the constant is a power of 2 and not zero. This patch performs the following actions: 1. tst(r, constant) + br(EQ, L) -> tbz(r, exact_log2(constant), L) 2. tst(r, constant) + br(NE, L) -> tbnz(r, exact_log2(constant), L) For remaining tst+br patterns: 1. Most are assertions for debug builds, leave them there. 2. tst source register is redefined before branch, code sequence tst(r, constant) + ldr(r ..) + br(EQ, L), optimizing this needs another register. Regards Hui -------------- next part -------------- An HTML attachment was scrubbed... URL: From tom.rodriguez at oracle.com Mon Mar 7 18:08:43 2016 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Mon, 7 Mar 2016 10:08:43 -0800 Subject: RFR: 8151266: HotSpotResolvedJavaFieldImpl::isStable() does not work as expected In-Reply-To: <681386EA-5DC4-4664-B80B-D64CEBF22A89@oracle.com> References: <681386EA-5DC4-4664-B80B-D64CEBF22A89@oracle.com> Message-ID: <4C6FF46F-C87A-4BCA-B2A6-C32F9808561F@oracle.com> > On Mar 5, 2016, at 3:45 AM, Doug Simon wrote: > > Please review this small change that makes HotSpotResolvedJavaFieldImpl.isStable() return the right value for HotSpotResolvedJavaFieldImpl objects created from java.lang.reflect.Field objects. 
The problem was that HotSpotResolvedJavaField objects created from reflection objects didn't get the VM internal modifier flags: > > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/0adf6c8c7223/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotMetaAccessProvider.java#l117 I think it would be better to rely on the existing logic in HotSpotResolvedObjectTypeImpl for finding and constructing the right HotSpotResolvedJavaFieldImpl. Then we wouldn't need a new VM entry point. tom > > > https://bugs.openjdk.java.net/browse/JDK-8151266 > http://cr.openjdk.java.net/~dnsimon/8151266/ > > -Doug
From christian.thalinger at oracle.com Mon Mar 7 18:12:54 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Mon, 7 Mar 2016 08:12:54 -1000 Subject: RFR: 8151266: HotSpotResolvedJavaFieldImpl::isStable() does not work as expected In-Reply-To: <4C6FF46F-C87A-4BCA-B2A6-C32F9808561F@oracle.com> References: <681386EA-5DC4-4664-B80B-D64CEBF22A89@oracle.com> <4C6FF46F-C87A-4BCA-B2A6-C32F9808561F@oracle.com> Message-ID: <59ED949C-4629-4DEB-9243-169327D31D7C@oracle.com> > On Mar 7, 2016, at 8:08 AM, Tom Rodriguez wrote: > > >> On Mar 5, 2016, at 3:45 AM, Doug Simon wrote: >> >> Please review this small change that makes HotSpotResolvedJavaFieldImpl.isStable() return the right value for HotSpotResolvedJavaFieldImpl objects created from java.lang.reflect.Field objects. The problem was that HotSpotResolvedJavaField objects created from reflection objects didn't get the VM internal modifier flags: >> >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/0adf6c8c7223/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotMetaAccessProvider.java#l117 > > I think it would be better to rely on the existing logic in HotSpotResolvedObjectTypeImpl for finding and constructing the right HotSpotResolvedJavaFieldImpl. Then we wouldn't need a new VM entry point. I remembered I added some code for this somewhere but couldn't find it.
Yes, we should use that instead. > > tom > >> >> >> https://bugs.openjdk.java.net/browse/JDK-8151266 >> http://cr.openjdk.java.net/~dnsimon/8151266/ >> >> -Doug >
From christian.thalinger at oracle.com Mon Mar 7 18:13:58 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Mon, 7 Mar 2016 08:13:58 -1000 Subject: RFR: 8151266: HotSpotResolvedJavaFieldImpl::isStable() does not work as expected In-Reply-To: <681386EA-5DC4-4664-B80B-D64CEBF22A89@oracle.com> References: <681386EA-5DC4-4664-B80B-D64CEBF22A89@oracle.com> Message-ID: <00AA7F89-AD30-4CC8-9D14-EFA27821AA63@oracle.com> > On Mar 5, 2016, at 1:45 AM, Doug Simon wrote: > > Please review this small change that makes HotSpotResolvedJavaFieldImpl.isStable() return the right value for HotSpotResolvedJavaFieldImpl objects created from java.lang.reflect.Field objects. The problem was that HotSpotResolvedJavaField objects created from reflection objects didn't get the VM internal modifier flags: > > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/0adf6c8c7223/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotMetaAccessProvider.java#l117 > > > https://bugs.openjdk.java.net/browse/JDK-8151266 There is a test attached to the bug. We should add it.
> http://cr.openjdk.java.net/~dnsimon/8151266/ > > -Doug From mikael.vidstedt at oracle.com Mon Mar 7 19:32:17 2016 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Mon, 7 Mar 2016 11:32:17 -0800 Subject: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions In-Reply-To: <56D8A9A7.30606@oracle.com> References: <56D63301.9050909@oracle.com> <56D697E4.8060104@oracle.com> <56D75750.6020400@oracle.com> <56D77113.30906@oracle.com> <56D777DC.70708@oracle.com> <56D8A9A7.30606@oracle.com> Message-ID: <56DDD741.4070802@oracle.com> New webrev: http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.03/webrev/ Unfortunately I think I messed up the incremental webrev, so in order to not cause confusion I'm not including it here, sorry. Summary: I added pseudo instructions for inserting/extracting _low and _high parts of the vector registers. Note that it only applies for the cases where there is a clear low and high to speak of - that is, in the cases where the instruction operates on the high or low *half* of a register. For instructions like vinsert32x4 (128bit copy to/from a 512 bit register) there are four different sources/targets, so high and low are not applicable, so there are no pseudo instructions for them. The macroAssembler methods now take a uint8_t as well, this was accidentally left out in the last webrev. I kept the nds->is_valid() checks for now, cleaning that up is covered by JDK-8151003[1]. I also updated a couple of comments. Cheers, Mikael [1] https://bugs.openjdk.java.net/browse/JDK-8151003 On 3/3/2016 1:16 PM, Vladimir Ivanov wrote: > On 3/3/16 7:08 AM, Berg, Michael C wrote: >> Vladimir (K), just for the time being as the problem isn't just >> confined to these instructions (the nds issue). I have assigned the >> bug below to myself and will take a holistic view over the issue in >> its full context. 
>> >> The instructions modified in the webrev, like in the documentation >> that exists regarding their definitions, are all programmable via >> what is loosely labeled as the imm8 field in the formal >> documentation. I think we should leave them that way. The onus of >> these changes was to make instructions look more like their ISA >> manual definitions. I think Vladimir Ivanov was saying, and please >> chime in Vladimir if I do not interpret correctly, wasn't high/low, >> it was leaving a signature like what we had in place in the macro >> assembler, and invoking the precise names there. I don't think that >> is needed though, as the macro assembler's job is to interpret a >> meaning and do a mapping. > I'm all for the proposed change in Assembler. > > My point is that vmovdqu/vinserti128h/vextracti128h(...) are more > informative than vinserti128(...,0/1) & vextracti128(..., 0/1) in the > code. So, keeping original functions in the MacroAssembler, but > migrating them to new Assembler versions looks reasonable. > > But I can live with both variants. > > Best regards, > Vladimir Ivanov > >> >> Regards, >> Michael >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Wednesday, March 02, 2016 3:32 PM >> To: hotspot-compiler-dev at openjdk.java.net >> Cc: Berg, Michael C >> Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and >> vinsert match actual instructions >> >> On 3/2/16 3:02 PM, Mikael Vidstedt wrote: >>> >>> After discussing with Vladimir off-list we agreed that changing the >>> type >> >> It was Vladimir Ivanov. >> >>> of the immediate (imm8) argument to uint8_t is both clearer, has the >>> potential to catch incorrect uses of the functions, and also makes the >>> asserts more straightforward. In addition to that Vladimir noted that >>> I had accidentally included newline in the assert messages.
>>> >>> New webrev: >>> >>> Full: >>> >>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02/webrev/ >> >> I agree with Vladimir I. that we should have macroassembler >> instructions vinserti128high, vinserti128low, etc. instead of passing >> imm8. It is more informative. >> >> Also why we add new nds->is_valid() checks into assembler instructions? >> We are going to remove them: >> >> https://bugs.openjdk.java.net/browse/JDK-8151003 >> >> I know that Mikael had a discussion about this with Michael. So I >> would like to see arguments here. Michael? >> >> Current code always pass correct registers and x86 Manual requires to >> have valid registers. >> >> Thanks, >> Vladimir >> >>> >>> Incremental from webrev.01: >>> >>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02.incr/webr >>> ev/ >>> >>> Cheers, >>> Mikael >>> >>> On 2016-03-02 13:12, Mikael Vidstedt wrote: >>>> >>>> Updated webrev: >>>> >>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01/webrev/ >>>> >>>> Incremental from webrev.00: >>>> >>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01.incr/web >>>> rev/ >>>> >>>> Comments below... >>>> >>>> On 2016-03-01 23:36, Vladimir Ivanov wrote: >>>>> Nice cleanup, Mikael! >>>>> >>>>> src/cpu/x86/vm/assembler_x86.hpp: >>>>> >>>>> Outdated comments: >>>>> // Copy low 128bit into high 128bit of YMM registers. >>>>> >>>>> // Load/store high 128bit of YMM registers which does not >>>>> destroy other half. >>>>> >>>>> // Copy low 256bit into high 256bit of ZMM registers. >>>> >>>> Updated, thanks for catching! >>>> >>>>> src/cpu/x86/vm/assembler_x86.cpp: >>>>> >>>>> ! emit_int8(imm8 & 0x01); >>>>> >>>>> Maybe additionally assert valid imm8 range? >>>> >>>> Good idea, I had added asserts earlier but removed them. I added them >>>> back again! >>>> >>>>> Maybe keep vinsert*h variants and move them to MacroAssembler? 
They >>>>> look clearer in some contexts: >>>>> >>>>> - __ vextractf128h(Address(rsp, base_addr+n*16), >>>>> as_XMMRegister(n)); >>>>> + __ vextractf128(Address(rsp, base_addr+n*16), >>>>> as_XMMRegister(n), 1); >>>> >>>> Can I suggest that we try to live without them for a while and see >>>> how much we miss them? I think having it there may actually be more >>>> confusing in many cases :) >>>> >>>> Cheers, >>>> Mikael >>>> >>>>> >>>>> Otherwise, looks good. >>>>> >>>>> Best regards, >>>>> Vladimir Ivanov >>>>> >>>>> On 3/2/16 3:25 AM, Mikael Vidstedt wrote: >>>>>> >>>>>> Please review the following change which updates the various >>>>>> vextract* and vinsert* methods in assembler_x86 & >>>>>> macroAssembler_x86 to better match the real HW instructions, which >>>>>> also has the benefit of providing the full >>>>>> functionality/flexibility of the instructions where earlier only >>>>>> some specific modes were supported. Put differently, with this >>>>>> change it's much easier to correlate the methods to the Intel >>>>>> manual and understand what they actually do. >>>>>> >>>>>> Specifically, the vinsert* family of instructions take three >>>>>> registers and an immediate which decide how the bits should be >>>>>> shuffled around, but without this change the method only allowed >>>>>> two of the registers to be specified, and the immediate was >>>>>> hard-coded to 0x01. >>>>>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8151002 >>>>>> Webrev: >>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.00/webrev >>>>>> / >>>>>> >>>>>> Special thanks to Mike Berg for helping discuss, co-develop, and >>>>>> test the change!
>>>>>> >>>>>> Cheers, >>>>>> Mikael >>>>>> >>>> >>>
From vladimir.kozlov at oracle.com Mon Mar 7 20:15:50 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 7 Mar 2016 12:15:50 -0800 Subject: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions In-Reply-To: <56DDD741.4070802@oracle.com> References: <56D63301.9050909@oracle.com> <56D697E4.8060104@oracle.com> <56D75750.6020400@oracle.com> <56D77113.30906@oracle.com> <56D777DC.70708@oracle.com> <56D8A9A7.30606@oracle.com> <56DDD741.4070802@oracle.com> Message-ID: <56DDE176.8040101@oracle.com> Changes look good. In assembler_x86.cpp can you group v*128*, v*64*, v32* in 3 separate groups as you have them in assembler_x86.hpp? Thanks, Vladimir
From michael.c.berg at intel.com Mon Mar 7 21:01:32 2016 From: michael.c.berg at intel.com (Berg, Michael C) Date: Mon, 7 Mar 2016 21:01:32 +0000 Subject: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions In-Reply-To: <56DDD741.4070802@oracle.com> References: <56D63301.9050909@oracle.com> <56D697E4.8060104@oracle.com> <56D75750.6020400@oracle.com> <56D77113.30906@oracle.com> <56D777DC.70708@oracle.com> <56D8A9A7.30606@oracle.com> <56DDD741.4070802@oracle.com> Message-ID: Looks ok to me.
-Michael
From doug.simon at oracle.com Mon Mar 7 21:03:02 2016 From: doug.simon at oracle.com (Doug Simon) Date: Mon, 7 Mar 2016 22:03:02 +0100 Subject: RFR: 8151266: HotSpotResolvedJavaFieldImpl::isStable() does not work as expected In-Reply-To: <00AA7F89-AD30-4CC8-9D14-EFA27821AA63@oracle.com> References: <681386EA-5DC4-4664-B80B-D64CEBF22A89@oracle.com> <00AA7F89-AD30-4CC8-9D14-EFA27821AA63@oracle.com> Message-ID: I changed the webrev to use HotSpotResolvedObjectTypeImpl instead of making a (new) VM call and also added the test provided in the bug report.
-Doug > On 07 Mar 2016, at 19:13, Christian Thalinger wrote: > > >> On Mar 5, 2016, at 1:45 AM, Doug Simon wrote: >> >> Please review this small change that makes HotSpotResolvedJavaFieldImpl.isStable() return the right value for HotSpotResolvedJavaFieldImpl objects created from java.lang.reflect.Field objects. The problem was that HotSpotResolvedJavaField objects created from reflection objects didn't get the VM internal modifier flags: >> >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/0adf6c8c7223/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotMetaAccessProvider.java#l117 >> >> >> https://bugs.openjdk.java.net/browse/JDK-8151266 > > There is a test attached to the bug. We should add it. > >> http://cr.openjdk.java.net/~dnsimon/8151266/ >> >> -Doug >
From michael.c.berg at intel.com Mon Mar 7 21:17:36 2016 From: michael.c.berg at intel.com (Berg, Michael C) Date: Mon, 7 Mar 2016 21:17:36 +0000 Subject: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions In-Reply-To: <56DDE176.8040101@oracle.com> References: <56D63301.9050909@oracle.com> <56D697E4.8060104@oracle.com> <56D75750.6020400@oracle.com> <56D77113.30906@oracle.com> <56D777DC.70708@oracle.com> <56D8A9A7.30606@oracle.com> <56DDD741.4070802@oracle.com> <56DDE176.8040101@oracle.com> Message-ID: If we are going to do that, we might as well break the inserts and the extracts up, sorting by size of operation, in both files. -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Monday, March 07, 2016 12:16 PM To: Mikael Vidstedt; Berg, Michael C; hotspot-compiler-dev at openjdk.java.net Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions Changes look good. In assembler_x86.cpp can you group v*128*, v*64*, v32* in 3 separate groups as you have them in assembler_x86.hpp?
Thanks, Vladimir On 3/7/16 11:32 AM, Mikael Vidstedt wrote: > > New webrev: > > http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.03/webrev/ > > Unfortunately I think I messed up the incremental webrev, so in order > to not cause confusion I'm not including it here, sorry. > > Summary: > > I added pseudo instructions for inserting/extracting _low and _high > parts of the vector registers. Note that it only applies for the cases > where there is a clear low and high to speak of - that is, in the > cases where the instruction operates on the high or low *half* of a register. > For instructions like vinsert32x4 (128bit copy to/from a 512 bit > register) there are four different sources/targets, so high and low > are not applicable, so there are no pseudo instructions for them. > > The macroAssembler methods now take a uint8_t as well, this was > accidentally left out in the last webrev. > > I kept the nds->is_valid() checks for now, cleaning that up is covered > by JDK-8151003[1]. > > I also updated a couple of comments. > > Cheers, > Mikael > > [1] https://bugs.openjdk.java.net/browse/JDK-8151003 > > > > On 3/3/2016 1:16 PM, Vladimir Ivanov wrote: >> On 3/3/16 7:08 AM, Berg, Michael C wrote: >>> Vladimir (K), just for the time being as the problem isn't just >>> confined to these instructions (the nds issue). I have assigned the >>> bug below to myself and will take a holistic view over the issue in >>> its full context. >>> >>> The instructions modified in the webrev, like in the documentation >>> that exists regarding their definitions, are all programmable via >>> what is loosely labeled as the imm8 field in the formal >>> documentation. I think we should leave them that way. The onus of >>> these changes was to make instructions look more like their ISA >>> manual definitions. 
I think Vladimir Ivanov was saying, and please >>> chime in Vladimir if I do not interpret correctly, wasn't high/low, >>> it was leaving a signature like what we had in place in the macro >>> assembler, and invoking the precise names there. I don't think that >>> is needed though, as the macro assembler's job is to interpret a >>> meaning and do a mapping. >> I'm all for the proposed change in Assembler. >> >> My point is that vmovdqu/vinserti128h/vextracti128h(...) are more >> informative than vinserti128(...,0/1) & vextracti128(..., 0/1) in the >> code. So, keeping original functions in the MacroAssembler, but >> migrating them to new Assembler versions looks reasonable. >> >> But I can live with both variants. >> >> Best regards, >> Vladimir Ivanov >> >>> >>> Regards, >>> Michael >>> >>> -----Original Message----- >>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>> Sent: Wednesday, March 02, 2016 3:32 PM >>> To: hotspot-compiler-dev at openjdk.java.net >>> Cc: Berg, Michael C >>> Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and >>> vinsert match actual instructions >>> >>> On 3/2/16 3:02 PM, Mikael Vidstedt wrote: >>>> >>>> After discussing with Vladimir off-list we agreed that changing the >>>> type >>> >>> It was Vladimir Ivanov. >>> >>>> of the immediate (imm8) argument to uint8_t is both clearer, has >>>> the potential to catch incorrect uses of the functions, and also >>>> makes the asserts more straightforward. In addition to that >>>> Vladimir noted that I had accidentally included newline in the assert messages. >>>> >>>> New webrev: >>>> >>>> Full: >>>> >>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02/webrev >>>> / >>> >>> I agree with Vladimir I. that we should have macroassembler >>> instructions vinserti128high, vinserti128low, etc. instead of >>> passing imm8. It is more informative. >>> >>> Also why we add new nds->is_valid() checks into assembler instructions?
>>> We are going to remove them: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8151003 >>> >>> I know that Mikael had a discussion about this with Michael. So I >>> would like to see arguments here. Michael? >>> >>> Current code always pass correct registers and x86 Manual requires >>> to have valid registers. >>> >>> Thanks, >>> Vladimir >>> >>>> >>>> Incremental from webrev.01: >>>> >>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02.incr/w >>>> ebr >>>> ev/ >>>> >>>> Cheers, >>>> Mikael >>>> >>>> On 2016-03-02 13:12, Mikael Vidstedt wrote: >>>>> >>>>> Updated webrev: >>>>> >>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01/webre >>>>> v/ >>>>> >>>>> Incremental from webrev.00: >>>>> >>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01.incr/ >>>>> web >>>>> rev/ >>>>> >>>>> Comments below... >>>>> >>>>> On 2016-03-01 23:36, Vladimir Ivanov wrote: >>>>>> Nice cleanup, Mikael! >>>>>> >>>>>> src/cpu/x86/vm/assembler_x86.hpp: >>>>>> >>>>>> Outdated comments: >>>>>> // Copy low 128bit into high 128bit of YMM registers. >>>>>> >>>>>> // Load/store high 128bit of YMM registers which does not >>>>>> destroy other half. >>>>>> >>>>>> // Copy low 256bit into high 256bit of ZMM registers. >>>>> >>>>> Updated, thanks for catching! >>>>> >>>>>> src/cpu/x86/vm/assembler_x86.cpp: >>>>>> >>>>>> ! emit_int8(imm8 & 0x01); >>>>>> >>>>>> Maybe additionally assert valid imm8 range? >>>>> >>>>> Good idea, I had added asserts earlier but removed them. I added >>>>> them back again! >>>>> >>>>>> Maybe keep vinsert*h variants and move them to MacroAssembler? >>>>>> They look clearer in some contextes: >>>>>> >>>>>> - __ vextractf128h(Address(rsp, base_addr+n*16), >>>>>> as_XMMRegister(n)); >>>>>> + __ vextractf128(Address(rsp, base_addr+n*16), >>>>>> as_XMMRegister(n), 1); >>>>> >>>>> Can I suggest that we try to live without them for a while and see >>>>> how much we miss them? 
I think having it there may actually be >>>>> more confusing in many cases :) >>>>> >>>>> Cheers, >>>>> Mikael >>>>> >>>>>> >>>>>> Otherwise, looks good. >>>>>> >>>>>> Best regards, >>>>>> Vladimir Ivanov >>>>>> >>>>>> On 3/2/16 3:25 AM, Mikael Vidstedt wrote: >>>>>>> >>>>>>> Please review the following change which updates the various >>>>>>> vextract* and vinsert* methods in assembler_x86 & >>>>>>> macroAssembler_x86 to better match the real HW instructions, >>>>>>> which also has the benefit of providing the full >>>>>>> functionality/flexibility of the instructions where earlier only >>>>>>> some specific modes were supported. Put differently, with this >>>>>>> change it's much easier to correlate the methods to the Intel >>>>>>> manual and understand what they actually do. >>>>>>> >>>>>>> Specifically, the vinsert* family of instructions take three >>>>>>> registers and an immediate which decide how the bits should be >>>>>>> shuffled around, but without this change the method only allowed >>>>>>> two of the registers to be specified, and the immediate was >>>>>>> hard-coded to 0x01. >>>>>>> >>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8151002 >>>>>>> Webrev: >>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.00/web >>>>>>> rev >>>>>>> / >>>>>>> >>>>>>> Special thanks to Mike Berg for helping discuss, co-develop, and >>>>>>> test the change! 
>>>>>>> >>>>>>> Cheers, >>>>>>> Mikael >>>>>>> >>>>> >>>> > From mikael.vidstedt at oracle.com Mon Mar 7 21:21:04 2016 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Mon, 7 Mar 2016 13:21:04 -0800 Subject: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions In-Reply-To: References: <56D63301.9050909@oracle.com> <56D697E4.8060104@oracle.com> <56D75750.6020400@oracle.com> <56D77113.30906@oracle.com> <56D777DC.70708@oracle.com> <56D8A9A7.30606@oracle.com> <56DDD741.4070802@oracle.com> <56DDE176.8040101@oracle.com> Message-ID: <56DDF0C0.4050907@oracle.com> I agree with Mike that it would make sense to split the inserts and the extracts, but can I please suggest that we do the rearranging as a follow-up change? Cheers, Mikael On 3/7/2016 1:17 PM, Berg, Michael C wrote: > If we are going to do that, we might as well break the inserts and the extracts up sorting by size of operation in both files. > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Monday, March 07, 2016 12:16 PM > To: Mikael Vidstedt; Berg, Michael C; hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions > > Changes looks good. > > In assembler_x86.cpp can you group v*128*, v*64*, v32* in separate 3 groups as you have them in assembler_x86.hpp? > > Thanks, > Vladimir > > On 3/7/16 11:32 AM, Mikael Vidstedt wrote: >> New webrev: >> >> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.03/webrev/ >> >> Unfortunately I think I messed up the incremental webrev, so in order >> to not cause confusion I'm not including it here, sorry. >> >> Summary: >> >> I added pseudo instructions for inserting/extracting _low and _high >> parts of the vector registers. 
Note that it only applies for the cases >> where there is a clear low and high to speak of - that is, in the >> cases where the instruction operates on the high or low *half* of a register. >> For instructions like vinsert32x4 (128bit copy to/from a 512 bit >> register) there are four different sources/targets, so high and low >> are not applicable, so there are no pseudo instructions for them. >> >> The macroAssembler methods now take a uint8_t as well, this was >> accidentally left out in the last webrev. >> >> I kept the nds->is_valid() checks for now, cleaning that up is covered >> by JDK-8151003[1]. >> >> I also updated a couple of comments. >> >> Cheers, >> Mikael >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8151003 >> >> >> >> On 3/3/2016 1:16 PM, Vladimir Ivanov wrote: >>> On 3/3/16 7:08 AM, Berg, Michael C wrote: >>>> Vladimir (K), just for the time being as the problem isn't just >>>> confined to these instructions (the nds issue). I have assigned the >>>> bug below to myself and will take a holistic view over the issue in >>>> its full context. >>>> >>>> The instructions modified in the webrev, like in the documentation >>>> that exists regarding their definitions, are all programmable via >>>> what is loosely labeled as the imm8 field in the formal >>>> documentation. I think we should leave them that way. The onus of >>>> these changes was to make instructions look more like their ISA >>>> manual definitions. I think Vladimir Ivanov was saying, and please >>>> chime in Vladimir if I do not interpret correctly, wasn't high/low, >>>> it was leaving a signature like what we had in place in the macro >>>> assembler, and invoking the precise names there. I don't think that >>>> is needed though, as the macro assembler's job is to interpret a >>>> meaning and do a mapping. >>> I'm all for the proposed change in Assembler. >>> >>> My point is that vmovdqu/vinserti128h/vextracti128h(...) 
are more >>> informative than vinserti128(...,0/1) & vextracti128(..., 0/1) in the >>> code. So, keeping original functions in the MacroAssembler, but >>> migrating them to new Assembler versions looks reasonable. >>> >>> But I can live with both variants. >>> >>> Best regards, >>> Vladimir Ivanov >>> >>>> Regards, >>>> Michael >>>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>> Sent: Wednesday, March 02, 2016 3:32 PM >>>> To: hotspot-compiler-dev at openjdk.java.net >>>> Cc: Berg, Michael C >>>> Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and >>>> vinsert match actual instructions >>>> >>>> On 3/2/16 3:02 PM, Mikael Vidstedt wrote: >>>>> After discussing with Vladimir off-list we agreed that changing the >>>>> type >>>> It was Vladimir Ivanov. >>>> >>>>> of the immediate (imm8) argument to uint8_t is both clearer, has >>>>> the potential to catch incorrect uses of the functions, and also >>>>> makes the asserts more straightforward. In addition to that >>>>> Vladimir noted that I had accidentally included a newline in the assert messages. >>>>> >>>>> New webrev: >>>>> >>>>> Full: >>>>> >>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02/webrev >>>>> / >>>> I agree with Vladimir I. that we should have macroassembler >>>> instructions vinserti128high, vinserti128low, etc. instead of >>>> passing imm8. It is more informative. >>>> >>>> Also, why do we add new nds->is_valid() checks into assembler instructions? >>>> We are going to remove them: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8151003 >>>> >>>> I know that Mikael had a discussion about this with Michael. So I >>>> would like to see arguments here. Michael? >>>> >>>> Current code always passes correct registers and the x86 Manual requires >>>> valid registers. 
>>>> >>>> Thanks, >>>> Vladimir >>>> >>>>> Incremental from webrev.01: >>>>> >>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02.incr/w >>>>> ebr >>>>> ev/ >>>>> >>>>> Cheers, >>>>> Mikael >>>>> >>>>> On 2016-03-02 13:12, Mikael Vidstedt wrote: >>>>>> Updated webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01/webre >>>>>> v/ >>>>>> >>>>>> Incremental from webrev.00: >>>>>> >>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01.incr/ >>>>>> web >>>>>> rev/ >>>>>> >>>>>> Comments below... >>>>>> >>>>>> On 2016-03-01 23:36, Vladimir Ivanov wrote: >>>>>>> Nice cleanup, Mikael! >>>>>>> >>>>>>> src/cpu/x86/vm/assembler_x86.hpp: >>>>>>> >>>>>>> Outdated comments: >>>>>>> // Copy low 128bit into high 128bit of YMM registers. >>>>>>> >>>>>>> // Load/store high 128bit of YMM registers which does not >>>>>>> destroy other half. >>>>>>> >>>>>>> // Copy low 256bit into high 256bit of ZMM registers. >>>>>> Updated, thanks for catching! >>>>>> >>>>>>> src/cpu/x86/vm/assembler_x86.cpp: >>>>>>> >>>>>>> ! emit_int8(imm8 & 0x01); >>>>>>> >>>>>>> Maybe additionally assert valid imm8 range? >>>>>> Good idea, I had added asserts earlier but removed them. I added >>>>>> them back again! >>>>>> >>>>>>> Maybe keep vinsert*h variants and move them to MacroAssembler? >>>>>>> They look clearer in some contextes: >>>>>>> >>>>>>> - __ vextractf128h(Address(rsp, base_addr+n*16), >>>>>>> as_XMMRegister(n)); >>>>>>> + __ vextractf128(Address(rsp, base_addr+n*16), >>>>>>> as_XMMRegister(n), 1); >>>>>> Can I suggest that we try to live without them for a while and see >>>>>> how much we miss them? I think having it there may actually be >>>>>> more confusing in many cases :) >>>>>> >>>>>> Cheers, >>>>>> Mikael >>>>>> >>>>>>> Otherwise, looks good. 
>>>>>>> >>>>>>> Best regards, >>>>>>> Vladimir Ivanov >>>>>>> >>>>>>> On 3/2/16 3:25 AM, Mikael Vidstedt wrote: >>>>>>>> Please review the following change which updates the various >>>>>>>> vextract* and vinsert* methods in assembler_x86 & >>>>>>>> macroAssembler_x86 to better match the real HW instructions, >>>>>>>> which also has the benefit of providing the full >>>>>>>> functionality/flexibility of the instructions where earlier only >>>>>>>> some specific modes were supported. Put differently, with this >>>>>>>> change it's much easier to correlate the methods to the Intel >>>>>>>> manual and understand what they actually do. >>>>>>>> >>>>>>>> Specifically, the vinsert* family of instructions take three >>>>>>>> registers and an immediate which decide how the bits should be >>>>>>>> shuffled around, but without this change the method only allowed >>>>>>>> two of the registers to be specified, and the immediate was >>>>>>>> hard-coded to 0x01. >>>>>>>> >>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8151002 >>>>>>>> Webrev: >>>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.00/web >>>>>>>> rev >>>>>>>> / >>>>>>>> >>>>>>>> Special thanks to Mike Berg for helping discuss, co-develop, and >>>>>>>> test the change! >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Mikael >>>>>>>> From michael.c.berg at intel.com Mon Mar 7 21:28:33 2016 From: michael.c.berg at intel.com (Berg, Michael C) Date: Mon, 7 Mar 2016 21:28:33 +0000 Subject: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions In-Reply-To: <56DDF0C0.4050907@oracle.com> References: <56D63301.9050909@oracle.com> <56D697E4.8060104@oracle.com> <56D75750.6020400@oracle.com> <56D77113.30906@oracle.com> <56D777DC.70708@oracle.com> <56D8A9A7.30606@oracle.com> <56DDD741.4070802@oracle.com> <56DDE176.8040101@oracle.com> <56DDF0C0.4050907@oracle.com> Message-ID: Sure, I can put it in the nds fix if you like. 
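[Editor's sketch] The lane-selection semantics discussed throughout this thread can be modeled in plain, self-contained C++. This is not HotSpot code: the Ymm/Xmm value types are illustrative stand-ins, and the vinserti128_high/_low names only approximate the pseudo instructions added in webrev.03 (vinserti128high, vinserti128low).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Illustrative stand-ins for 256-bit YMM and 128-bit XMM register values,
// modeled as 64-bit words with word 0 the least significant.
using Ymm = std::array<uint64_t, 4>;
using Xmm = std::array<uint64_t, 2>;

// VINSERTI128 dst, nds, src, imm8: dst is a copy of nds with the 128-bit
// lane selected by imm8 & 0x01 replaced by src (0 = low half, 1 = high half).
Ymm vinserti128(const Ymm& nds, const Xmm& src, uint8_t imm8) {
  assert(imm8 <= 1 && "imm8 out of range");  // the range check the review asked for
  Ymm dst = nds;
  unsigned lane = (imm8 & 0x01) * 2;         // mirrors emit_int8(imm8 & 0x01)
  dst[lane]     = src[0];
  dst[lane + 1] = src[1];
  return dst;
}

// The pseudo instructions simply pin the immediate, keeping call sites
// self-describing wherever a register has a clear low and high half.
Ymm vinserti128_high(const Ymm& nds, const Xmm& src) { return vinserti128(nds, src, 1); }
Ymm vinserti128_low (const Ymm& nds, const Xmm& src) { return vinserti128(nds, src, 0); }
```

For an instruction like vinsert32x4 on a 512-bit register the immediate selects one of four 128-bit lanes, so no single high/low pair of names applies; that is why those cases stay in the raw imm8 form.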
-----Original Message----- From: Mikael Vidstedt [mailto:mikael.vidstedt at oracle.com] Sent: Monday, March 07, 2016 1:21 PM To: Berg, Michael C; Vladimir Kozlov; hotspot-compiler-dev at openjdk.java.net Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions I agree with Mike that it would make sense to split the inserts and the extracts, but can I please suggest that we do the rearranging as a follow-up change? Cheers, Mikael On 3/7/2016 1:17 PM, Berg, Michael C wrote: > If we are going to do that, we might as well break the inserts and the extracts up sorting by size of operation in both files. > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Monday, March 07, 2016 12:16 PM > To: Mikael Vidstedt; Berg, Michael C; > hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and > vinsert match actual instructions > > Changes looks good. > > In assembler_x86.cpp can you group v*128*, v*64*, v32* in separate 3 groups as you have them in assembler_x86.hpp? > > Thanks, > Vladimir > > On 3/7/16 11:32 AM, Mikael Vidstedt wrote: >> New webrev: >> >> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.03/webrev/ >> >> Unfortunately I think I messed up the incremental webrev, so in order >> to not cause confusion I'm not including it here, sorry. >> >> Summary: >> >> I added pseudo instructions for inserting/extracting _low and _high >> parts of the vector registers. Note that it only applies for the >> cases where there is a clear low and high to speak of - that is, in >> the cases where the instruction operates on the high or low *half* of a register. >> For instructions like vinsert32x4 (128bit copy to/from a 512 bit >> register) there are four different sources/targets, so high and low >> are not applicable, so there are no pseudo instructions for them. 
>> >> The macroAssembler methods now take a uint8_t as well, this was >> accidentally left out in the last webrev. >> >> I kept the nds->is_valid() checks for now, cleaning that up is >> covered by JDK-8151003[1]. >> >> I also updated a couple of comments. >> >> Cheers, >> Mikael >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8151003 >> >> >> >> On 3/3/2016 1:16 PM, Vladimir Ivanov wrote: >>> On 3/3/16 7:08 AM, Berg, Michael C wrote: >>>> Vladimir (K), just for the time being as the problem isn't just >>>> confined to these instructions (the nds issue). I have assigned the >>>> bug below to myself and will take a holistic view over the issue in >>>> its full context. >>>> >>>> The instructions modified in the webrev, like in the documentation >>>> that exists regarding their definitions, are all programmable via >>>> what is loosely labeled as the imm8 field in the formal >>>> documentation. I think we should leave them that way. The onus of >>>> these changes was to make instructions look more like their ISA >>>> manual definitions. I think Vladimir Ivanov was saying, and please >>>> chime in Vladimir if I do not interpret correctly, wasn't high/low, >>>> it was leaving a signature like what we had in place in the macro >>>> assembler, and invoking the precise names there. I don't think >>>> that is needed though, as the macro assembler's job is to interpret >>>> a meaning and do a mapping. >>> I'm all for the proposed change in Assembler. >>> >>> My point is that vmovdqu/vinserti128h/vextracti128h(...) are more >>> informative than vinserti128(...,0/1) & vextracti128(..., 0/1) in >>> the code. So, keeping original functions in the MacroAssembler, but >>> migrating them to new Assembler versions looks reasonable. >>> >>> But I can live with both variaints. 
>>> >>> Best regards, >>> Vladimir Ivanov >>> >>>> Regards, >>>> Michael >>>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>> Sent: Wednesday, March 02, 2016 3:32 PM >>>> To: hotspot-compiler-dev at openjdk.java.net >>>> Cc: Berg, Michael C >>>> Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and >>>> vinsert match actual instructions >>>> >>>> On 3/2/16 3:02 PM, Mikael Vidstedt wrote: >>>>> After discussing with Vladimir off-list we agreed that changing >>>>> the type >>>> It was Vladimir Ivanov. >>>> >>>>> of the immediate (imm8) argument to uint8_t is both clearer, has >>>>> the potential to catch incorrect uses of the functions, and also >>>>> makes the asserts more straightforward. In addition to that >>>>> Vladimir noted that I had accidentally included newline in the assert messages. >>>>> >>>>> New webrev: >>>>> >>>>> Full: >>>>> >>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02/webre >>>>> v >>>>> / >>>> I agree with Vladimir I. that we should have macroassembler >>>> instructions vinserti128high, vinserti128low, etc. instead of >>>> passing imm8. It is more informative. >>>> >>>> Also why we add new nds->is_valid() checks into assembler instructions? >>>> We are going to remove them: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8151003 >>>> >>>> I know that Mikael had a discussion about this with Michael. So I >>>> would like to see arguments here. Michael? >>>> >>>> Current code always pass correct registers and x86 Manual requires >>>> to have valid registers. 
>>>> >>>> Thanks, >>>> Vladimir >>>> >>>>> Incremental from webrev.01: >>>>> >>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02.incr/ >>>>> w >>>>> ebr >>>>> ev/ >>>>> >>>>> Cheers, >>>>> Mikael >>>>> >>>>> On 2016-03-02 13:12, Mikael Vidstedt wrote: >>>>>> Updated webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01/webr >>>>>> e >>>>>> v/ >>>>>> >>>>>> Incremental from webrev.00: >>>>>> >>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01.incr >>>>>> / >>>>>> web >>>>>> rev/ >>>>>> >>>>>> Comments below... >>>>>> >>>>>> On 2016-03-01 23:36, Vladimir Ivanov wrote: >>>>>>> Nice cleanup, Mikael! >>>>>>> >>>>>>> src/cpu/x86/vm/assembler_x86.hpp: >>>>>>> >>>>>>> Outdated comments: >>>>>>> // Copy low 128bit into high 128bit of YMM registers. >>>>>>> >>>>>>> // Load/store high 128bit of YMM registers which does not >>>>>>> destroy other half. >>>>>>> >>>>>>> // Copy low 256bit into high 256bit of ZMM registers. >>>>>> Updated, thanks for catching! >>>>>> >>>>>>> src/cpu/x86/vm/assembler_x86.cpp: >>>>>>> >>>>>>> ! emit_int8(imm8 & 0x01); >>>>>>> >>>>>>> Maybe additionally assert valid imm8 range? >>>>>> Good idea, I had added asserts earlier but removed them. I added >>>>>> them back again! >>>>>> >>>>>>> Maybe keep vinsert*h variants and move them to MacroAssembler? >>>>>>> They look clearer in some contextes: >>>>>>> >>>>>>> - __ vextractf128h(Address(rsp, base_addr+n*16), >>>>>>> as_XMMRegister(n)); >>>>>>> + __ vextractf128(Address(rsp, base_addr+n*16), >>>>>>> as_XMMRegister(n), 1); >>>>>> Can I suggest that we try to live without them for a while and >>>>>> see how much we miss them? I think having it there may actually >>>>>> be more confusing in many cases :) >>>>>> >>>>>> Cheers, >>>>>> Mikael >>>>>> >>>>>>> Otherwise, looks good. 
>>>>>>> >>>>>>> Best regards, >>>>>>> Vladimir Ivanov >>>>>>> >>>>>>> On 3/2/16 3:25 AM, Mikael Vidstedt wrote: >>>>>>>> Please review the following change which updates the various >>>>>>>> vextract* and vinsert* methods in assembler_x86 & >>>>>>>> macroAssembler_x86 to better match the real HW instructions, >>>>>>>> which also has the benefit of providing the full >>>>>>>> functionality/flexibility of the instructions where earlier >>>>>>>> only some specific modes were supported. Put differently, with >>>>>>>> this change it's much easier to correlate the methods to the >>>>>>>> Intel manual and understand what they actually do. >>>>>>>> >>>>>>>> Specifically, the vinsert* family of instructions take three >>>>>>>> registers and an immediate which decide how the bits should be >>>>>>>> shuffled around, but without this change the method only >>>>>>>> allowed two of the registers to be specified, and the immediate >>>>>>>> was hard-coded to 0x01. >>>>>>>> >>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8151002 >>>>>>>> Webrev: >>>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.00/we >>>>>>>> b >>>>>>>> rev >>>>>>>> / >>>>>>>> >>>>>>>> Special thanks to Mike Berg for helping discuss, co-develop, >>>>>>>> and test the change! >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Mikael >>>>>>>> From vladimir.kozlov at oracle.com Mon Mar 7 22:05:10 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 7 Mar 2016 14:05:10 -0800 Subject: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions In-Reply-To: <56DDF0C0.4050907@oracle.com> References: <56D63301.9050909@oracle.com> <56D697E4.8060104@oracle.com> <56D75750.6020400@oracle.com> <56D77113.30906@oracle.com> <56D777DC.70708@oracle.com> <56D8A9A7.30606@oracle.com> <56DDD741.4070802@oracle.com> <56DDE176.8040101@oracle.com> <56DDF0C0.4050907@oracle.com> Message-ID: <56DDFB16.5090804@oracle.com> Okay, please, file RFE then. You can push what you have now. 
Thanks, Vladimir On 3/7/16 1:21 PM, Mikael Vidstedt wrote: > > I agree with Mike that it would make sense to split the inserts and the > extracts, but can I please suggest that we do the rearranging as a > follow-up change? > > Cheers, > Mikael > > On 3/7/2016 1:17 PM, Berg, Michael C wrote: >> If we are going to do that, we might as well break the inserts and the >> extracts up sorting by size of operation in both files. >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Monday, March 07, 2016 12:16 PM >> To: Mikael Vidstedt; Berg, Michael C; >> hotspot-compiler-dev at openjdk.java.net >> Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and >> vinsert match actual instructions >> >> Changes looks good. >> >> In assembler_x86.cpp can you group v*128*, v*64*, v32* in separate 3 >> groups as you have them in assembler_x86.hpp? >> >> Thanks, >> Vladimir >> >> On 3/7/16 11:32 AM, Mikael Vidstedt wrote: >>> New webrev: >>> >>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.03/webrev/ >>> >>> Unfortunately I think I messed up the incremental webrev, so in order >>> to not cause confusion I'm not including it here, sorry. >>> >>> Summary: >>> >>> I added pseudo instructions for inserting/extracting _low and _high >>> parts of the vector registers. Note that it only applies for the cases >>> where there is a clear low and high to speak of - that is, in the >>> cases where the instruction operates on the high or low *half* of a >>> register. >>> For instructions like vinsert32x4 (128bit copy to/from a 512 bit >>> register) there are four different sources/targets, so high and low >>> are not applicable, so there are no pseudo instructions for them. >>> >>> The macroAssembler methods now take a uint8_t as well, this was >>> accidentally left out in the last webrev. >>> >>> I kept the nds->is_valid() checks for now, cleaning that up is covered >>> by JDK-8151003[1]. 
>>> >>> I also updated a couple of comments. >>> >>> Cheers, >>> Mikael >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8151003 >>> >>> >>> >>> On 3/3/2016 1:16 PM, Vladimir Ivanov wrote: >>>> On 3/3/16 7:08 AM, Berg, Michael C wrote: >>>>> Vladimir (K), just for the time being as the problem isn't just >>>>> confined to these instructions (the nds issue). I have assigned the >>>>> bug below to myself and will take a holistic view over the issue in >>>>> its full context. >>>>> >>>>> The instructions modified in the webrev, like in the documentation >>>>> that exists regarding their definitions, are all programmable via >>>>> what is loosely labeled as the imm8 field in the formal >>>>> documentation. I think we should leave them that way. The onus of >>>>> these changes was to make instructions look more like their ISA >>>>> manual definitions. I think Vladimir Ivanov was saying, and please >>>>> chime in Vladimir if I do not interpret correctly, wasn't high/low, >>>>> it was leaving a signature like what we had in place in the macro >>>>> assembler, and invoking the precise names there. I don't think that >>>>> is needed though, as the macro assembler's job is to interpret a >>>>> meaning and do a mapping. >>>> I'm all for the proposed change in Assembler. >>>> >>>> My point is that vmovdqu/vinserti128h/vextracti128h(...) are more >>>> informative than vinserti128(...,0/1) & vextracti128(..., 0/1) in the >>>> code. So, keeping original functions in the MacroAssembler, but >>>> migrating them to new Assembler versions looks reasonable. >>>> >>>> But I can live with both variaints. 
>>>> >>>> Best regards, >>>> Vladimir Ivanov >>>> >>>>> Regards, >>>>> Michael >>>>> >>>>> -----Original Message----- >>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>>> Sent: Wednesday, March 02, 2016 3:32 PM >>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>> Cc: Berg, Michael C >>>>> Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and >>>>> vinsert match actual instructions >>>>> >>>>> On 3/2/16 3:02 PM, Mikael Vidstedt wrote: >>>>>> After discussing with Vladimir off-list we agreed that changing the >>>>>> type >>>>> It was Vladimir Ivanov. >>>>> >>>>>> of the immediate (imm8) argument to uint8_t is both clearer, has >>>>>> the potential to catch incorrect uses of the functions, and also >>>>>> makes the asserts more straightforward. In addition to that >>>>>> Vladimir noted that I had accidentally included newline in the >>>>>> assert messages. >>>>>> >>>>>> New webrev: >>>>>> >>>>>> Full: >>>>>> >>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02/webrev >>>>>> / >>>>> I agree with Vladimir I. that we should have macroassembler >>>>> instructions vinserti128high, vinserti128low, etc. instead of >>>>> passing imm8. It is more informative. >>>>> >>>>> Also why we add new nds->is_valid() checks into assembler >>>>> instructions? >>>>> We are going to remove them: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8151003 >>>>> >>>>> I know that Mikael had a discussion about this with Michael. So I >>>>> would like to see arguments here. Michael? >>>>> >>>>> Current code always pass correct registers and x86 Manual requires >>>>> to have valid registers. 
>>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>>> Incremental from webrev.01: >>>>>> >>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02.incr/w >>>>>> ebr >>>>>> ev/ >>>>>> >>>>>> Cheers, >>>>>> Mikael >>>>>> >>>>>> On 2016-03-02 13:12, Mikael Vidstedt wrote: >>>>>>> Updated webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01/webre >>>>>>> v/ >>>>>>> >>>>>>> Incremental from webrev.00: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01.incr/ >>>>>>> web >>>>>>> rev/ >>>>>>> >>>>>>> Comments below... >>>>>>> >>>>>>> On 2016-03-01 23:36, Vladimir Ivanov wrote: >>>>>>>> Nice cleanup, Mikael! >>>>>>>> >>>>>>>> src/cpu/x86/vm/assembler_x86.hpp: >>>>>>>> >>>>>>>> Outdated comments: >>>>>>>> // Copy low 128bit into high 128bit of YMM registers. >>>>>>>> >>>>>>>> // Load/store high 128bit of YMM registers which does not >>>>>>>> destroy other half. >>>>>>>> >>>>>>>> // Copy low 256bit into high 256bit of ZMM registers. >>>>>>> Updated, thanks for catching! >>>>>>> >>>>>>>> src/cpu/x86/vm/assembler_x86.cpp: >>>>>>>> >>>>>>>> ! emit_int8(imm8 & 0x01); >>>>>>>> >>>>>>>> Maybe additionally assert valid imm8 range? >>>>>>> Good idea, I had added asserts earlier but removed them. I added >>>>>>> them back again! >>>>>>> >>>>>>>> Maybe keep vinsert*h variants and move them to MacroAssembler? >>>>>>>> They look clearer in some contextes: >>>>>>>> >>>>>>>> - __ vextractf128h(Address(rsp, base_addr+n*16), >>>>>>>> as_XMMRegister(n)); >>>>>>>> + __ vextractf128(Address(rsp, base_addr+n*16), >>>>>>>> as_XMMRegister(n), 1); >>>>>>> Can I suggest that we try to live without them for a while and see >>>>>>> how much we miss them? I think having it there may actually be >>>>>>> more confusing in many cases :) >>>>>>> >>>>>>> Cheers, >>>>>>> Mikael >>>>>>> >>>>>>>> Otherwise, looks good. 
>>>>>>>> >>>>>>>> Best regards, >>>>>>>> Vladimir Ivanov >>>>>>>> >>>>>>>> On 3/2/16 3:25 AM, Mikael Vidstedt wrote: >>>>>>>>> Please review the following change which updates the various >>>>>>>>> vextract* and vinsert* methods in assembler_x86 & >>>>>>>>> macroAssembler_x86 to better match the real HW instructions, >>>>>>>>> which also has the benefit of providing the full >>>>>>>>> functionality/flexibility of the instructions where earlier only >>>>>>>>> some specific modes were supported. Put differently, with this >>>>>>>>> change it's much easier to correlate the methods to the Intel >>>>>>>>> manual and understand what they actually do. >>>>>>>>> >>>>>>>>> Specifically, the vinsert* family of instructions take three >>>>>>>>> registers and an immediate which decide how the bits should be >>>>>>>>> shuffled around, but without this change the method only allowed >>>>>>>>> two of the registers to be specified, and the immediate was >>>>>>>>> hard-coded to 0x01. >>>>>>>>> >>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8151002 >>>>>>>>> Webrev: >>>>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.00/web >>>>>>>>> rev >>>>>>>>> / >>>>>>>>> >>>>>>>>> Special thanks to Mike Berg for helping discuss, co-develop, and >>>>>>>>> test the change! 
>>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> Mikael >>>>>>>>> > From mikael.vidstedt at oracle.com Mon Mar 7 22:27:28 2016 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Mon, 7 Mar 2016 14:27:28 -0800 Subject: RFR (S): 8151002: Make Assembler methods vextract and vinsert match actual instructions In-Reply-To: <56DDFB16.5090804@oracle.com> References: <56D63301.9050909@oracle.com> <56D697E4.8060104@oracle.com> <56D75750.6020400@oracle.com> <56D77113.30906@oracle.com> <56D777DC.70708@oracle.com> <56D8A9A7.30606@oracle.com> <56DDD741.4070802@oracle.com> <56DDE176.8040101@oracle.com> <56DDF0C0.4050907@oracle.com> <56DDFB16.5090804@oracle.com> Message-ID: <56DE0050.10103@oracle.com> I filed JDK-8151409[1] to cover the reordering of the methods. For sanity it would probably make sense to do it separately from the removal of the nds checks as described in JDK-8151003[2]. Thanks for the reviews! Cheers, Mikael [1] https://bugs.openjdk.java.net/browse/JDK-8151409 [2] https://bugs.openjdk.java.net/browse/JDK-8151003 On 3/7/2016 2:05 PM, Vladimir Kozlov wrote: > Okay, please, file RFE then. You can push what you have now. > > Thanks, > Vladimir > > On 3/7/16 1:21 PM, Mikael Vidstedt wrote: >> >> I agree with Mike that it would make sense to split the inserts and the >> extracts, but can I please suggest that we do the rearranging as a >> follow-up change? >> >> Cheers, >> Mikael >> >> On 3/7/2016 1:17 PM, Berg, Michael C wrote: >>> If we are going to do that, we might as well break the inserts and the >>> extracts up sorting by size of operation in both files. >>> >>> -----Original Message----- >>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>> Sent: Monday, March 07, 2016 12:16 PM >>> To: Mikael Vidstedt; Berg, Michael C; >>> hotspot-compiler-dev at openjdk.java.net >>> Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and >>> vinsert match actual instructions >>> >>> Changes looks good. 
>>> >>> In assembler_x86.cpp can you group v*128*, v*64*, v32* in separate 3 >>> groups as you have them in assembler_x86.hpp? >>> >>> Thanks, >>> Vladimir >>> >>> On 3/7/16 11:32 AM, Mikael Vidstedt wrote: >>>> New webrev: >>>> >>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.03/webrev/ >>>> >>>> Unfortunately I think I messed up the incremental webrev, so in order >>>> to not cause confusion I'm not including it here, sorry. >>>> >>>> Summary: >>>> >>>> I added pseudo instructions for inserting/extracting _low and _high >>>> parts of the vector registers. Note that it only applies for the cases >>>> where there is a clear low and high to speak of - that is, in the >>>> cases where the instruction operates on the high or low *half* of a >>>> register. >>>> For instructions like vinsert32x4 (128bit copy to/from a 512 bit >>>> register) there are four different sources/targets, so high and low >>>> are not applicable, so there are no pseudo instructions for them. >>>> >>>> The macroAssembler methods now take a uint8_t as well, this was >>>> accidentally left out in the last webrev. >>>> >>>> I kept the nds->is_valid() checks for now, cleaning that up is covered >>>> by JDK-8151003[1]. >>>> >>>> I also updated a couple of comments. >>>> >>>> Cheers, >>>> Mikael >>>> >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8151003 >>>> >>>> >>>> >>>> On 3/3/2016 1:16 PM, Vladimir Ivanov wrote: >>>>> On 3/3/16 7:08 AM, Berg, Michael C wrote: >>>>>> Vladimir (K), just for the time being as the problem isn't just >>>>>> confined to these instructions (the nds issue). I have assigned the >>>>>> bug below to myself and will take a holistic view over the issue in >>>>>> its full context. >>>>>> >>>>>> The instructions modified in the webrev, like in the documentation >>>>>> that exists regarding their definitions, are all programmable via >>>>>> what is loosely labeled as the imm8 field in the formal >>>>>> documentation. I think we should leave them that way. 
The onus of >>>>>> these changes was to make instructions look more like their ISA >>>>>> manual definitions. I think Vladimir Ivanov was saying, and please >>>>>> chime in Vladimir if I do not interpret correctly, wasn't high/low, >>>>>> it was leaving a signature like what we had in place in the macro >>>>>> assembler, and invoking the precise names there. I don't think that >>>>>> is needed though, as the macro assembler's job is to interpret a >>>>>> meaning and do a mapping. >>>>> I'm all for the proposed change in Assembler. >>>>> >>>>> My point is that vmovdqu/vinserti128h/vextracti128h(...) are more >>>>> informative than vinserti128(...,0/1) & vextracti128(..., 0/1) in the >>>>> code. So, keeping original functions in the MacroAssembler, but >>>>> migrating them to new Assembler versions looks reasonable. >>>>> >>>>> But I can live with both variants. >>>>> >>>>> Best regards, >>>>> Vladimir Ivanov >>>>> >>>>>> Regards, >>>>>> Michael >>>>>> >>>>>> -----Original Message----- >>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>>>> Sent: Wednesday, March 02, 2016 3:32 PM >>>>>> To: hotspot-compiler-dev at openjdk.java.net >>>>>> Cc: Berg, Michael C >>>>>> Subject: Re: RFR (S): 8151002: Make Assembler methods vextract and >>>>>> vinsert match actual instructions >>>>>> >>>>>> On 3/2/16 3:02 PM, Mikael Vidstedt wrote: >>>>>>> After discussing with Vladimir off-list we agreed that changing the >>>>>>> type >>>>>> It was Vladimir Ivanov. >>>>>> >>>>>>> of the immediate (imm8) argument to uint8_t is both clearer, has >>>>>>> the potential to catch incorrect uses of the functions, and also >>>>>>> makes the asserts more straightforward. In addition to that >>>>>>> Vladimir noted that I had accidentally included newline in the >>>>>>> assert messages. >>>>>>> >>>>>>> New webrev: >>>>>>> >>>>>>> Full: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02/webrev >>>>>>> / >>>>>> I agree with Vladimir I.
that we should have macroassembler >>>>>> instructions vinserti128high, vinserti128low, etc. instead of >>>>>> passing imm8. It is more informative. >>>>>> >>>>>> Also why we add new nds->is_valid() checks into assembler >>>>>> instructions? >>>>>> We are going to remove them: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8151003 >>>>>> >>>>>> I know that Mikael had a discussion about this with Michael. So I >>>>>> would like to see arguments here. Michael? >>>>>> >>>>>> Current code always pass correct registers and x86 Manual requires >>>>>> to have valid registers. >>>>>> >>>>>> Thanks, >>>>>> Vladimir >>>>>> >>>>>>> Incremental from webrev.01: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.02.incr/w >>>>>>> ebr >>>>>>> ev/ >>>>>>> >>>>>>> Cheers, >>>>>>> Mikael >>>>>>> >>>>>>> On 2016-03-02 13:12, Mikael Vidstedt wrote: >>>>>>>> Updated webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01/webre >>>>>>>> v/ >>>>>>>> >>>>>>>> Incremental from webrev.00: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.01.incr/ >>>>>>>> web >>>>>>>> rev/ >>>>>>>> >>>>>>>> Comments below... >>>>>>>> >>>>>>>> On 2016-03-01 23:36, Vladimir Ivanov wrote: >>>>>>>>> Nice cleanup, Mikael! >>>>>>>>> >>>>>>>>> src/cpu/x86/vm/assembler_x86.hpp: >>>>>>>>> >>>>>>>>> Outdated comments: >>>>>>>>> // Copy low 128bit into high 128bit of YMM registers. >>>>>>>>> >>>>>>>>> // Load/store high 128bit of YMM registers which does not >>>>>>>>> destroy other half. >>>>>>>>> >>>>>>>>> // Copy low 256bit into high 256bit of ZMM registers. >>>>>>>> Updated, thanks for catching! >>>>>>>> >>>>>>>>> src/cpu/x86/vm/assembler_x86.cpp: >>>>>>>>> >>>>>>>>> ! emit_int8(imm8 & 0x01); >>>>>>>>> >>>>>>>>> Maybe additionally assert valid imm8 range? >>>>>>>> Good idea, I had added asserts earlier but removed them. I added >>>>>>>> them back again! 
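To make the imm8 discussion above concrete, here is a minimal stand-alone C++ model (not HotSpot code; the types, names, and assert text are invented for illustration) of how the immediate selects the 128-bit lane of a 256-bit destination, why only the values 0 and 1 are valid, and how the "high/low" pseudo instructions simply fix the immediate:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-alone model, not HotSpot code: a 256-bit YMM register
// as two 128-bit lanes. vinserti128 copies a 128-bit source into lane 0 or
// lane 1 of the destination, selected by bit 0 of the imm8 operand.
struct Xmm { uint64_t q[2]; };
struct Ymm { Xmm lane[2]; };

inline Ymm vinserti128(const Ymm& dst, const Xmm& src, uint8_t imm8) {
  // Analogue of the assert discussed in the review: only values 0 and 1
  // are meaningful for a 256-bit destination.
  assert(imm8 <= 1 && "vinserti128: imm8 out of range");
  Ymm r = dst;
  r.lane[imm8 & 0x01] = src;  // mirrors the emit_int8(imm8 & 0x01) encoding
  return r;
}

// The clearer-reading "high/low" convenience forms discussed for the
// MacroAssembler just pin the immediate:
inline Ymm vinserti128_low (const Ymm& d, const Xmm& s) { return vinserti128(d, s, 0); }
inline Ymm vinserti128_high(const Ymm& d, const Xmm& s) { return vinserti128(d, s, 1); }
```

With this shape, `vinserti128_high(d, s)` reads at the call site like the old pseudo instruction while the general `vinserti128(d, s, imm8)` matches the ISA manual entry.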
>>>>>>>> >>>>>>>>> Maybe keep vinsert*h variants and move them to MacroAssembler? >>>>>>>>> They look clearer in some contextes: >>>>>>>>> >>>>>>>>> - __ vextractf128h(Address(rsp, base_addr+n*16), >>>>>>>>> as_XMMRegister(n)); >>>>>>>>> + __ vextractf128(Address(rsp, base_addr+n*16), >>>>>>>>> as_XMMRegister(n), 1); >>>>>>>> Can I suggest that we try to live without them for a while and see >>>>>>>> how much we miss them? I think having it there may actually be >>>>>>>> more confusing in many cases :) >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Mikael >>>>>>>> >>>>>>>>> Otherwise, looks good. >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Vladimir Ivanov >>>>>>>>> >>>>>>>>> On 3/2/16 3:25 AM, Mikael Vidstedt wrote: >>>>>>>>>> Please review the following change which updates the various >>>>>>>>>> vextract* and vinsert* methods in assembler_x86 & >>>>>>>>>> macroAssembler_x86 to better match the real HW instructions, >>>>>>>>>> which also has the benefit of providing the full >>>>>>>>>> functionality/flexibility of the instructions where earlier only >>>>>>>>>> some specific modes were supported. Put differently, with this >>>>>>>>>> change it's much easier to correlate the methods to the Intel >>>>>>>>>> manual and understand what they actually do. >>>>>>>>>> >>>>>>>>>> Specifically, the vinsert* family of instructions take three >>>>>>>>>> registers and an immediate which decide how the bits should be >>>>>>>>>> shuffled around, but without this change the method only allowed >>>>>>>>>> two of the registers to be specified, and the immediate was >>>>>>>>>> hard-coded to 0x01. >>>>>>>>>> >>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8151002 >>>>>>>>>> Webrev: >>>>>>>>>> http://cr.openjdk.java.net/~mikael/webrevs/8151002/webrev.00/web >>>>>>>>>> rev >>>>>>>>>> / >>>>>>>>>> >>>>>>>>>> Special thanks to Mike Berg for helping discuss, co-develop, and >>>>>>>>>> test the change! 
>>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> Mikael >>>>>>>>>> >> From christian.thalinger at oracle.com Tue Mar 8 01:20:51 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Mon, 7 Mar 2016 15:20:51 -1000 Subject: RFR: 8151266: HotSpotResolvedJavaFieldImpl::isStable() does not work as expected In-Reply-To: References: <681386EA-5DC4-4664-B80B-D64CEBF22A89@oracle.com> <00AA7F89-AD30-4CC8-9D14-EFA27821AA63@oracle.com> Message-ID: <374A0174-22D8-47F3-BA6C-1865AD16AEBE@oracle.com> > On Mar 7, 2016, at 11:03 AM, Doug Simon wrote: > > I changed the webrev to use HotSpotResolvedObjectTypeImpl instead of making a (new) VM call and also added the test provided in the bug report. That's much better than I expected. Looks good. > > -Doug > >> On 07 Mar 2016, at 19:13, Christian Thalinger wrote: >> >> >>> On Mar 5, 2016, at 1:45 AM, Doug Simon wrote: >>> >>> Please review this small change that makes HotSpotResolvedJavaFieldImpl.isStable() return the right value for HotSpotResolvedJavaFieldImpl objects created from java.lang.reflect.Field objects. The problem was that HotSpotResolvedJavaField objects created from reflection objects didn't get the VM internal modifier flags: >>> >>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/0adf6c8c7223/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotMetaAccessProvider.java#l117 >>> >>> >>> https://bugs.openjdk.java.net/browse/JDK-8151266 >> >> There is a test attached to the bug. We should add it.
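The mechanism behind the bug described above can be sketched in a small stand-alone C++ model (not JVMCI code; the flag values below are invented for illustration). The VM keeps internal field flags, such as a "stable" bit, above the language-level modifier bits, so a field descriptor built only from java.lang.reflect.Field data never receives them:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the bug's mechanism, not JVMCI code; the flag
// values below are invented for illustration.
constexpr uint32_t ACC_FINAL            = 0x0010;      // a standard modifier bit
constexpr uint32_t VM_ACC_FIELD_STABLE  = 0x00010000;  // internal, VM-only (assumed value)
constexpr uint32_t RECOGNIZED_MODIFIERS = 0x0000FFFF;  // what reflection preserves

inline bool is_stable(uint32_t flags) {
  return (flags & VM_ACC_FIELD_STABLE) != 0;
}

// What constructing the field descriptor from reflection data effectively
// did: keep only the standard modifiers, silently dropping internal bits.
inline uint32_t flags_from_reflection(uint32_t vm_flags) {
  return vm_flags & RECOGNIZED_MODIFIERS;
}
```

In this sketch `is_stable()` answers correctly for the VM-side flags but always returns false after the reflection-style masking, which is the symptom the fix addresses by sourcing the flags from the resolved type instead.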
>> >>> http://cr.openjdk.java.net/~dnsimon/8151266/ >>> >>> -Doug >> > From aph at redhat.com Tue Mar 8 10:08:47 2016 From: aph at redhat.com (Andrew Haley) Date: Tue, 8 Mar 2016 10:08:47 +0000 Subject: RFR: 8150394: aarch64: add support for 8.1 LSE CAS instructions In-Reply-To: <1456422286.21810.2.camel@mint> References: <1456173130.2735.8.camel@mint> <56CD8CCF.1030404@redhat.com> <1456394786.1383.18.camel@mint> <56CF1492.1000400@redhat.com> <1456422286.21810.2.camel@mint> Message-ID: <56DEA4AF.2020309@redhat.com> On 02/25/2016 05:44 PM, Edward Nevill wrote: >> http://cr.openjdk.java.net/~aph/aarch64-lse-cas/ >> > >> > Please test it. >> > > Hi, > > Clean run through jcstress with -XX:+UseLSE. Also clean on some partners tests with and without -XX:+UseLSE. > > Looks fine, OK to push. Thanks, Andrew. From tom.rodriguez at oracle.com Tue Mar 8 17:50:40 2016 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Tue, 8 Mar 2016 09:50:40 -0800 Subject: RFR: 8151266: HotSpotResolvedJavaFieldImpl::isStable() does not work as expected In-Reply-To: References: <681386EA-5DC4-4664-B80B-D64CEBF22A89@oracle.com> <00AA7F89-AD30-4CC8-9D14-EFA27821AA63@oracle.com> Message-ID: <3463224F-7CB8-43E0-8CAC-2FFF701A3491@oracle.com> Looks good to me. tom > On Mar 7, 2016, at 1:03 PM, Doug Simon wrote: > > I changed the webrev to use HotSpotResolvedObjectTypeImpl instead of making a (new) VM call and also added the test provided in the bug report. > > -Doug > >> On 07 Mar 2016, at 19:13, Christian Thalinger wrote: >> >> >>> On Mar 5, 2016, at 1:45 AM, Doug Simon wrote: >>> >>> Please review this small change that makes HotSpotResolvedJavaFieldImpl.isStable() return the right value for HotSpotResolvedJavaFieldImpl objects created from java.lang.reflect.Field objects. 
The problem was that HotSpotResolvedJavaField objects created from reflection objects didn't get the VM internal modifier flags: >>> >>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/0adf6c8c7223/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotMetaAccessProvider.java#l117 >>> >>> >>> https://bugs.openjdk.java.net/browse/JDK-8151266 >> >> There is a test attached to the bug. We should add it. >> >>> http://cr.openjdk.java.net/~dnsimon/8151266/ >>> >>> -Doug >> > From christian.thalinger at oracle.com Tue Mar 8 23:05:42 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Tue, 8 Mar 2016 13:05:42 -1000 Subject: RFR: 8151470: [JVMCI] remove up-call to HotSpotJVMCICompilerConfig.selectCompiler Message-ID: https://bugs.openjdk.java.net/browse/JDK-8151470 http://cr.openjdk.java.net/~twisti/8151470/webrev.01/ The reason why it was done this way is to use a trusted system property value to select the compiler. We can achieve the same by using VM.getSavedProperty. This patch changes the system property name from "jvmci.compiler" to "jvmci.Compiler" as it's using an Option now. As discussed with Doug I also got rid of some property file parsing code that we don't need right now. From volker.simonis at gmail.com Wed Mar 9 08:02:33 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Wed, 9 Mar 2016 09:02:33 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5A60A.50700@oracle.com> <56D5D066.7040805@oracle.com> <56D6DE47.7060405@oracle.com> <56D6EC92.7010309@oracle.com> <56D83272.9010202@oracle.com> <56D95E62.20309@oracle.com> Message-ID: On Fri, Mar 4, 2016 at 7:30 PM, Volker Simonis wrote: > Hi Pavel, > > thanks for your feedback.
Please find my comments inline: > > On Fri, Mar 4, 2016 at 6:01 PM, Pavel Punegov > wrote: >> >> Hi Volker, >> >> overall looks good to me (not Reviewer). >> >> Just some questions about the test: >> 1. Do you need to PrintCompilation and Inlining in the test? >> >> 39 * -XX:+PrintCompilation >> 40 * >> -XX:CompileCommand=option,BlockingCompilation::foo,PrintInlining > It's not necessary for the test, but if the test will fail it will be good > to have this information in the .jtr file > >> >> 2. 500_000 for a loop seems to be way too much. >> There is a test/compiler/whitebox/CompilerWhiteBoxTest.java that has >> constants enough for a compilation. >> > That's just an upper bound which won't be reached. We break out of the loop > once we reached the maximum compilation level. Notice that the loop count is > not the count until the method gets enqueued for compilation, but until it > actually gets compiled because this loop tests the non-blocking > compilations. So if the machine is loaded and/or the compile queue is busy, > it can take quite some iterations until the method will be finally compiled > (in my manual tests it usually took not more than 8000 iterations until the > test left the loop). > >> >> 3. Running Client compiler only could pass even if compilation were >> blocking here: >> >> 128 if (level == 4 && iteration == i) { >> >> I think it should check that any level change has happened within some >> amount of iterations, not only the 4th level. > > The problem is that C1 compiles are so bleeding fast that I got too many > false positives, i.e. the method got compiled in the same iteration just > before I queried the compilation level. > > I actually begin to think that this test may always fail if you run the > JTreg tests with "-Xbatch". Do you do that by default (e.g. in JPRT)? In > that case we would probably have to additionally set > "-XX:+BackgroundCompilation" in the test options?
> I've forgot to say that I've tried the regression test by running jtreg with -Xbatch and it still works because -vmoption/-javaoption are not passed to a test which is run with '@run main/othervm'. @Nils: what's the result of your testing? Is there anything left we have to do until this can be pushed? Regards, Volker > Regards, > Volker > > >> >> >> ? Thanks, >> Pavel Punegov >> >> On 04 Mar 2016, at 14:29, Volker Simonis wrote: >> >> Great! I'm happy we finally came to an agreement :) >> >> Best regards, >> Volker >> >> On Fri, Mar 4, 2016 at 11:07 AM, Nils Eliasson >> wrote: >>> >>> Hi Volker, >>> >>> On 2016-03-03 16:25, Volker Simonis wrote: >>> >>> Hi Nils, >>> >>> thanks for your comments. Please find my new webrev here: >>> >>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v4 >>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v4 >>> >>> >>> Looks very good now. >>> >>> >>> Comments as always inline: >>> >>> On Thu, Mar 3, 2016 at 1:47 PM, Nils Eliasson >>> wrote: >>>> >>>> Hi Volker, >>>> >>>> On 2016-03-02 17:36, Volker Simonis wrote: >>>> >>>> Hi Nils, >>>> >>>> your last webrev (jdk.03 and hotspot.05)) looks pretty good! Ive used is >>>> as base for my new webrevs at: >>>> >>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v3 >>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v3 >>>> >>>> I've updated the copyrights, added the current reviewers and also added >>>> us both in the Contributed-by line (hope that's fine for you). >>>> >>>> >>>> Absolutely >>>> >>>> >>>> Except that, I've only done the following minor fixes/changes: >>>> >>>> compileBroker.{cpp,hpp} >>>> >>>> - we don't need CompileBroker::is_compile_blocking() anymore. 
>>>> >>>> Good >>>> >>>> >>>> compilerDirectives.hpp >>>> >>>> - I think we should use >>>> cflags(BackgroundCompilation, bool, BackgroundCompilation, >>>> BackgroundCompilation) >>>> instead of: >>>> cflags(BackgroundCompilation, bool, BackgroundCompilation, X) >>>> >>>> so we can also trigger blocking compiles from the command line with a >>>> CompileCommand (e.g. >>>> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false") >>>> That's very handy during development or and also for simple tests where we >>>> don't want to mess with compiler directives. (And the overhead to keep this >>>> feature is quite small, just "BackgroundCompilation" instead of "X" ;-) >>>> >>>> >>>> Without a very strong use case for this I don't want it as a >>>> CompileCommand. >>>> >>>> CompileCommand options do have a cost - they force a temporary unique >>>> copy of the directive if any option command matches negating some of the >>>> positive effects of directives. Also the CompileCommands are stringly typed, >>>> no compile time name or type check is done. This can be fixed in various >>>> ways, but until then I prefer not to add it. >>>> >>> >>> Well, daily working is a strong use case for me:) Until there's no >>> possibility to provide compiler directives directly on the command line >>> instead of using an extra file, I think there's a justification for the >>> CompileCommand version. Also I think the cost argument is not so relevent, >>> because the feature will be mainly used during developemnt or in small tests >>> which don't need compiler directives. It will actually make it possible to >>> write simple tests which require blocking compilations without the need to >>> use compiler directives or WB (just by specifying a -XX:CompileCommand >>> option). >>> >>> >>> ok, you convinced me. >>> >>>> >>>> whitebox.cpp >>>> >>>> I think it is good that you fixed the state but I think it is too >>>> complicated now. 
We don't need to strdup the string and can easily forget to >>>> free 'tmpstr' :) So maybe it is simpler to just do another transition for >>>> parsing the directive: >>>> >>>> { >>>> ThreadInVMfromNative ttvfn(thread); // back to VM >>>> DirectivesParser::parse_string(dir, tty); >>>> } >>>> >>>> >>>> Transitions are not free, but on the other hand the string may be long. >>>> This is not a hot path in any way, so let's go with simple. >>>> >>>> >>>> advancedThresholdPolicy.cpp >>>> >>>> - the JVMCI code looks reasonable (although I haven't tested JVMCI) and >>>> is actually even an improvement over my code which just picked the first >>>> blocking compilation. >>>> >>>> >>>> Feels good to remove the special cases. >>>> >>>> >>>> diagnosticCommand.cpp >>>> >>>> - Shouldn't you also fix CompilerDirectivesAddDCmd to return the number >>>> of added directives and CompilerDirectivesRemoveDCmd to take the number of >>>> directives you want to pop? Or do you want to do this in a later, follow-up >>>> change? >>>> >>>> >>>> Yes, let's do that in a follow-up change. They affect a number of tests. >>>> >>>> >>>> WhiteBox.java >>>> >>>> - I still think it would make sense to keep the two 'blocking' versions >>>> of enqueueMethodForCompilation() for convenience. For example your test fix >>>> for JDK-8073793 would be much simpler if you used them. I've added two >>>> comments to the 'blocking' convenience methods to mention the fact that >>>> calling them may shadow previously added compiler directives. >>>> >>>> >>>> I am ok with having them, but think Whitebox.java will get too bloated. >>>> I would rather have the convenience-methods in some test utility class, like >>>> CompilerWhiteBoxTest.java. >>>> >>> >>> OK, I can live with that. I removed the blocking enqueue methods and the >>> corresponding tests.
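The ThreadInVMfromNative idiom quoted above can be sketched as a small self-contained C++ model (modeled on HotSpot's helper, but the Thread/state types here are invented for illustration): the constructor switches the thread to the VM state and the destructor switches it back, so the return transition cannot be forgotten on any exit path:

```cpp
#include <cassert>

// Minimal sketch of the RAII transition idiom; not HotSpot code.
enum class State { in_native, in_vm };

struct Thread { State state = State::in_native; };

class ThreadInVMfromNative {
  Thread* _thread;
 public:
  explicit ThreadInVMfromNative(Thread* t) : _thread(t) {
    _thread->state = State::in_vm;       // transition on entry
  }
  ~ThreadInVMfromNative() {
    _thread->state = State::in_native;   // automatic transition on scope exit
  }
};

inline bool parse_directive(Thread* thread) {
  {
    ThreadInVMfromNative ttvfn(thread);  // back to VM for the parse
    // DirectivesParser::parse_string(dir, tty) would run here.
    assert(thread->state == State::in_vm);
  }  // leaving the scope restores the native state
  return thread->state == State::in_native;
}
```

The extra scope is the whole point: it bounds how long the thread stays in the VM state, which is why the reviewers preferred it over manually flipping the state around the strdup'd string.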
>>>> >>>> >>>> BlockingCompilation.java >>>> >>>> - I've extended my regression test to test both methods of doing >>>> blocking compilation - with the new, 'blocking' >>>> enqueueMethodForCompilation() methods as well as by manually setting the >>>> corresponding compiler directives. If we should finally get consensus on >>>> removing the blocking convenience methods, please just remove the >>>> corresponding tests. >>>> >>>> >>>> Line 85: for (level = 1; level <= 4; level++) { >>>> >>>> You can not be sure all compilation levels are available. Use >>>> >>>> * @library /testlibrary /test/lib / >>>> * @build sun.hotspot.WhiteBox >>>> * compiler.testlibrary.CompilerUtils >>>> >>>> import compiler.testlibrary.CompilerUtils; >>>> >>>> int[] levels = CompilerUtils.getAvailableCompilationLevels(); >>>> for (int level : levels) { >>>> ... >>> >>> >>> Good catch. I've slightly revorked the test. I do bail out early if there >>> are no compilers at all and I've also fixed the break condition of the loop >>> which is calling foo() to compare against the highest available compilation >>> level instead of just using '4'. >>>> >>>> >>>> I think we're close to a final version now, what do you think :) >>>> >>>> >>>> Yes! I'll take a look as soon as you post an updated webrev. >>> >>> >>> Would be good if you could run it trough JPRT once so we can be sure we >>> didn't break anything. >>> >>> >>> I am running all jtreg tests on some select platforms right now. >>> >>> Best regards, >>> Nils >>> >>> >>> Regards, >>> Volker >>> >>>> >>>> >>>> Regards, >>>> Nils >>>> >>>> >>>> >>>> Regards, >>>> Volker >>>> >>>> >>>> On Wed, Mar 2, 2016 at 2:37 PM, Nils Eliasson >>>> wrote: >>>>> >>>>> Yes, I forgot to add the fix for working with multiple directives from >>>>> whitebox. 
>>>>> >>>>> WB.addCompilerDirectives now returns the number of directives that >>>>> were added, and removeCompilerDirectives takes a parameter for the number >>>>> of directives that should be popped (atomically). >>>>> >>>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.03/ >>>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev.05/ >>>>> >>>>> Fixed test in JDK-8073793 to work with this: >>>>> http://cr.openjdk.java.net/~neliasso/8073793/webrev.03/ >>>>> >>>>> Best regards, >>>>> Nils Eliasson >>>>> >>>>> >>>>> >>>>> On 2016-03-02 13:36, Nils Eliasson wrote: >>>>> >>>>> Hi Volker, >>>>> >>>>> I created these webrevs including all the feedback from everyone: >>>>> >>>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.02/ >>>>> * Only add- and removeCompilerDirective >>>>> >>>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev.04/ >>>>> * whitebox.cpp >>>>> -- addCompilerDirective to have correct VM states >>>>> * advancedThresholdPolicy.cpp >>>>> -- prevent blocking tasks from becoming stale >>>>> -- The logic for picking first blocking task broke JVMCI code. Instead >>>>> made the JVMCI code default (select the blocking task with highest score.) >>>>> * compilerDirectives.hpp >>>>> -- Remove option CompileCommand. Not needed. >>>>> * compileBroker.cpp >>>>> -- Wrapped compile_method so that directive get and release always are >>>>> matched. >>>>> >>>>> Is anything missing? >>>>> >>>>> Best regards, >>>>> Nils Eliasson >>>>> >>>>> >>>>> On 2016-03-01 19:31, Volker Simonis wrote: >>>>> >>>>> Hi Pavel, Nils, Vladimir, >>>>> >>>>> sorry, but I was busy the last days so I couldn't answer your mails. >>>>> >>>>> Thanks a lot for your input and your suggestions. I'll look into this >>>>> tomorrow and hopefully I'll be able to address all your concerns. >>>>> >>>>> Regards, >>>>> Volker >>>>> >>>>> >>>>> On Tue, Mar 1, 2016 at 6:24 PM, Vladimir Kozlov >>>>> wrote: >>>>> >>>>> Nils, please answer Pavel's questions.
>>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> >>>>> On 3/1/16 6:24 AM, Nils Eliasson wrote: >>>>> >>>>> Hi Volker, >>>>> >>>>> An excellent proposition. This is how it should be used. >>>>> >>>>> I polished a few rough edges: >>>>> * compileBroker.cpp - The directive was already accessed in >>>>> compile_method - but hidden in compilation_is_prohibited. I moved it out >>>>> so we only have a single directive access. Wrapped compile_method to >>>>> make sure the release of the directive doesn't get lost. >>>>> * Let WB_AddCompilerDirective return a bool for success. Also fixed the >>>>> state - need to be in native to get string, but then need to be in VM >>>>> when parsing directive. >>>>> >>>>> And some comments: >>>>> * I am against adding new compile option commands (At least until the >>>>> stringly typeness is fixed). Let's add good ways to use compiler >>>>> directives instead. >>>>> >>>>> I need to look at the stale task removal code tomorrow - hopefully we >>>>> could save the blocking info in the task so we don't need to access the >>>>> directive in the policy. >>>>> >>>>> All in here: >>>>> Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ >>>>> >>>>> The code runs fine with the test I fixed for JDK-8073793: >>>>> http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ >>>>> >>>>> Best regards, >>>>> Nils Eliasson >>>>> >>>>> On 2016-02-26 19:47, Volker Simonis wrote: >>>>> >>>>> Hi, >>>>> >>>>> so I want to propose the following solution for this problem: >>>>> >>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel >>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ >>>>> >>>>> I've started from the opposite side and made the BackgroundCompilation >>>>> manageable through the compiler directives framework.
Once this works >>>>> (and it's actually trivial due to the nice design of the >>>>> CompilerDirectives framework :), we get the possibility to set the >>>>> BackgroundCompilation option on a per-method basis on the command line >>>>> via the CompileCommand option for free: >>>>> >>>>> >>>>> >>>>> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" >>>>> >>>>> >>>>> And of course we can also use it directly as a compiler directive: >>>>> >>>>> [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] >>>>> >>>>> It also becomes possible to use this directly from the Whitebox API >>>>> through the DiagnosticCommand.compilerDirectivesAdd command. >>>>> Unfortunately, this command takes a file with compiler directives as >>>>> argument. I think this would be overkill in this context. So because >>>>> it was so easy and convenient, I added the following two new Whitebox >>>>> methods: >>>>> >>>>> public native void addCompilerDirective(String compDirect); >>>>> public native void removeCompilerDirective(); >>>>> >>>>> which can now be used to set arbitrary CompilerDirective commands >>>>> directly from within the WhiteBox API. (The implementation of these >>>>> two methods is trivial as you can see in whitebox.cpp). >>>>> >>>>> The blocking versions of enqueueMethodForCompilation() now become >>>>> simple wrappers around the existing methods without the need of any >>>>> code changes in their native implementation. This is good, because it >>>>> keeps the WhiteBox API stable! >>>>> >>>>> Finally some words about the implementation of the per-method >>>>> BackgroundCompilation functionality. It actually only requires two >>>>> small changes: >>>>> >>>>> 1. extending CompileBroker::is_compile_blocking() to take the method >>>>> and compilation level as arguments and use them to query the >>>>> DirectivesStack for the corresponding BackgroundCompilation value. >>>>> >>>>> 2.
changing AdvancedThresholdPolicy::select_task() such that it >>>>> prefers blocking compilations. This is necessary not only because it >>>>> decreases the time we have to wait for a blocking compilation, but >>>>> also because it prevents blocking compiles from getting stale. This >>>>> could otherwise easily happen in AdvancedThresholdPolicy::is_stale() >>>>> for methods which only get artificially compiled during a test because >>>>> their invocation counters are usually too small. >>>>> >>>>> There's still a small probability that a blocking compilation will >>>>> not be blocking. This can happen if a method for which we request the >>>>> blocking compilation is already in the compilation queue (see the >>>>> check 'compilation_is_in_queue(method)' in >>>>> CompileBroker::compile_method_base()). In testing scenarios this will >>>>> rarely happen because methods which are manually compiled shouldn't >>>>> get called that many times to implicitly place them into the compile >>>>> queue. But we can even completely avoid this problem by using >>>>> WB.isMethodQueuedForCompilation() to make sure that a method is not in >>>>> the queue before we request a blocking compilation. >>>>> >>>>> I've also added a small regression test to demonstrate and verify the >>>>> new functionality. >>>>> >>>>> Regards, >>>>> Volker >>>>> >>>>> On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson >>>>> wrote: >>>>> >>>>> Hi Vladimir, >>>>> >>>>> WhiteBox::compilation_locked is a global state that temporarily stops >>>>> all >>>>> compilations. In this case I just want to achieve blocking compilation >>>>> for a >>>>> single compile without affecting the rest of the system. The tests >>>>> using it >>>>> will continue executing as soon as that compile is finished, saving >>>>> time >>>>> where wait-loops are used today. It adds nice determinism to tests.
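The select_task() change described above can be sketched as follows (a simplified stand-alone model, not the actual HotSpot implementation; the task structure and scoring are invented for illustration): among the queued tasks, a blocking task with the highest score wins over any non-blocking one, so a blocking compile is picked as soon as possible and cannot linger long enough to go stale.

```cpp
#include <cassert>
#include <vector>

// Sketch of the selection policy, not HotSpot code.
struct CompileTask {
  int  score;     // stand-in for the policy's task weight
  bool blocking;
};

inline const CompileTask* select_task(const std::vector<CompileTask>& queue) {
  const CompileTask* best = nullptr;
  const CompileTask* best_blocking = nullptr;
  for (const CompileTask& t : queue) {
    if (best == nullptr || t.score > best->score)
      best = &t;
    if (t.blocking && (best_blocking == nullptr || t.score > best_blocking->score))
      best_blocking = &t;
  }
  // Prefer the highest-scored blocking task when one exists; otherwise
  // fall back to the regular highest-scored task.
  return best_blocking != nullptr ? best_blocking : best;
}
```

Selecting the highest-scored blocking task (rather than the first one found) is what kept the JVMCI path working, as discussed in the webrev notes above.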
>>>>> >>>>> Best regards, >>>>> Nils Eliasson >>>>> >>>>> >>>>> On 2016-02-25 22:14, Vladimir Kozlov wrote: >>>>> >>>>> You are adding a parameter which is used only for testing. >>>>> Can we have a callback (or check field) into WB instead? Similar to >>>>> WhiteBox::compilation_locked. >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 2/25/16 7:01 AM, Nils Eliasson wrote: >>>>> >>>>> Hi, >>>>> >>>>> Please review this change that adds support for blocking compiles >>>>> in the >>>>> whitebox API. This enables simpler, less time-consuming tests. >>>>> >>>>> Motivation: >>>>> * -XX:-BackgroundCompilation is a global flag and can be time >>>>> consuming >>>>> * Blocking compiles removes the need for waiting on the compile >>>>> queue to >>>>> complete >>>>> * Compiles put in the queue may be evicted if the queue grows too big - >>>>> causing indeterminism in the test >>>>> * Fewer VM flags allows for more tests in the same VM >>>>> >>>>> Testing: >>>>> Posting a separate RFR for test fix that uses this change. They >>>>> will be >>>>> pushed at the same time.
>>>>> >>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 >>>>> JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >>>>> Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >>>>> >>>>> Best regards, >>>>> Nils Eliasson >>>>> >>>>> >>>>> >>>> >>>> >>> >>> >> >> > From nils.eliasson at oracle.com Wed Mar 9 08:14:04 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 9 Mar 2016 09:14:04 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5A60A.50700@oracle.com> <56D5D066.7040805@oracle.com> <56D6DE47.7060405@oracle.com> <56D6EC92.7010309@oracle.com> <56D83272.9010202@oracle.com> <56D95E62.20309@oracle.com> Message-ID: <56DFDB4C.8000102@oracle.com> Hi Volker, On 2016-03-09 09:02, Volker Simonis wrote: > On Fri, Mar 4, 2016 at 7:30 PM, Volker Simonis wrote: >> Hi Pavel, >> >> thanks for your feedabck. Please find my comments inline: >> >> On Fri, Mar 4, 2016 at 6:01 PM, Pavel Punegov >> wrote: >>> Hi Volker, >>> >>> overall looks good to me (not Reviewer). >>> >>> Just some questions about the test: >>> 1. Do you need to PrintCompilation and Inlining in the test? >>> >>> 39 * -XX:+PrintCompilation >>> 40 * >>> -XX:CompileCommand=option,BlockingCompilation::foo,PrintInlining >> It's not necessary for the test, but if the test will fail it will be good >> to have this information in the .jtr file >> >>> 2. 500_000 for a loop seems to be a way to much. >>> There is a test/compiler/whitebox/CompilerWhiteBoxTest.java that has a >>> constants enough for a compilation. >>> >> That's just an upper bound which won't be reached. We break out of the loop >> once we reached the maximum compilation level. 
Notice that the loop count is >> not the count until the method gets enqueued for compilation, but until it >> actually gets compiled because this loop tests the non-blocking >> compilations. So if the machine is loaded and/or the compile queue is busy, >> it can take quite some iterations until the method is finally compiled >> (in my manual tests it usually took not more than 8000 iterations until the >> test left the loop). >> >>> 3. Running the client compiler only could pass even if compilation were >>> blocking here: >>> >>> 128 if (level == 4 && iteration == i) { >>> >>> I think it should check that any level change has happened within some >>> number of iterations, not only the 4th level. >> >> The problem is that C1 compiles are so bleeding fast that I got too many >> false positives, i.e. the method got compiled in the same iteration just >> before I queried the compilation level. >> >> I actually begin to think that this test may always fail if you run the >> JTreg tests with "-Xbatch". Do you do that by default (e.g. in JPRT)? In >> that case we would probably have to additionally set >> "-XX:+BackgroundCompilation" in the test options? >> > I forgot to say that I've tried the regression test by running > jtreg with -Xbatch and it still works because -vmoption/-javaoption > are not passed to a test which is run with '@run main/othervm'. > > @Nils: what's the result of your testing? Is there anything left we > have to do until this can be pushed? I have run all hotspot jtreg tests on linux x64 and x86 without problem. I have also run the test with -Xcomp without problem. -Xbatch is not a part of the flag rotation. Now I am just waiting for Pavel's ok. Regards, Nils > > Regards, > Volker > > >> Regards, >> Volker >> >> >>> >>> Thanks, >>> Pavel Punegov >>> >>> On 04 Mar 2016, at 14:29, Volker Simonis wrote: >>> >>> Great! 
I'm happy we finally came to an agreement :) >>> >>> Best regards, >>> Volker >>> >>> On Fri, Mar 4, 2016 at 11:07 AM, Nils Eliasson >>> wrote: >>>> Hi Volker, >>>> >>>> On 2016-03-03 16:25, Volker Simonis wrote: >>>> >>>> Hi Nils, >>>> >>>> thanks for your comments. Please find my new webrev here: >>>> >>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v4 >>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v4 >>>> >>>> >>>> Looks very good now. >>>> >>>> >>>> Comments as always inline: >>>> >>>> On Thu, Mar 3, 2016 at 1:47 PM, Nils Eliasson >>>> wrote: >>>>> Hi Volker, >>>>> >>>>> On 2016-03-02 17:36, Volker Simonis wrote: >>>>> >>>>> Hi Nils, >>>>> >>>>> your last webrev (jdk.03 and hotspot.05) looks pretty good! I've used it >>>>> as the base for my new webrevs at: >>>>> >>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v3 >>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v3 >>>>> >>>>> I've updated the copyrights, added the current reviewers and also added >>>>> us both in the Contributed-by line (hope that's fine for you). >>>>> >>>>> >>>>> Absolutely >>>>> >>>>> >>>>> Except that, I've only done the following minor fixes/changes: >>>>> >>>>> compileBroker.{cpp,hpp} >>>>> >>>>> - we don't need CompileBroker::is_compile_blocking() anymore. >>>>> >>>>> Good >>>>> >>>>> >>>>> compilerDirectives.hpp >>>>> >>>>> - I think we should use >>>>> cflags(BackgroundCompilation, bool, BackgroundCompilation, >>>>> BackgroundCompilation) >>>>> instead of: >>>>> cflags(BackgroundCompilation, bool, BackgroundCompilation, X) >>>>> >>>>> so we can also trigger blocking compiles from the command line with a >>>>> CompileCommand (e.g. >>>>> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false") >>>>> That's very handy during development and also for simple tests where we >>>>> don't want to mess with compiler directives. 
(And the overhead to keep this >>>>> feature is quite small, just "BackgroundCompilation" instead of "X" ;-) >>>>> >>>>> >>>>> Without a very strong use case for this I don't want it as a >>>>> CompileCommand. >>>>> >>>>> CompileCommand options do have a cost - they force a temporary unique >>>>> copy of the directive if any option command matches, negating some of the >>>>> positive effects of directives. Also the CompileCommands are stringly typed, >>>>> no compile-time name or type check is done. This can be fixed in various >>>>> ways, but until then I prefer not to add it. >>>>> >>>> Well, daily work is a strong use case for me :) As long as there's no >>>> possibility to provide compiler directives directly on the command line >>>> instead of using an extra file, I think there's a justification for the >>>> CompileCommand version. Also I think the cost argument is not so relevant, >>>> because the feature will be mainly used during development or in small tests >>>> which don't need compiler directives. It will actually make it possible to >>>> write simple tests which require blocking compilations without the need to >>>> use compiler directives or WB (just by specifying a -XX:CompileCommand >>>> option). >>>> >>>> >>>> ok, you convinced me. >>>> >>>>> whitebox.cpp >>>>> >>>>> I think it is good that you fixed the state but I think it is too >>>>> complicated now. We don't need to strdup the string and can easily forget to >>>>> free 'tmpstr' :) So maybe it is simpler to just do another transition for >>>>> parsing the directive: >>>>> >>>>> { >>>>> ThreadInVMfromNative ttvfn(thread); // back to VM >>>>> DirectivesParser::parse_string(dir, tty); >>>>> } >>>>> >>>>> >>>>> Transitions are not free, but on the other hand the string may be long. >>>>> This is not a hot path anyway, so let's go with simple. 
>>>>> >>>>> >>>>> advancedThresholdPolicy.cpp >>>>> >>>>> - the JVMCI code looks reasonable (although I haven't tested JVMCI) and >>>>> is actually even an improvement over my code which just picked the first >>>>> blocking compilation. >>>>> >>>>> >>>>> Feels good to remove the special cases. >>>>> >>>>> >>>>> diagnosticCommand.cpp >>>>> >>>>> - Shouldn't you also fix CompilerDirectivesAddDCmd to return the number >>>>> of added directives and CompilerDirectivesRemoveDCmd to take the number of >>>>> directives you want to pop? Or do you want to do this in a later, follow-up >>>>> change? >>>>> >>>>> >>>>> Yes, let's do that in a follow-up change. They affect a number of tests. >>>>> >>>>> >>>>> WhiteBox.java >>>>> >>>>> - I still think it would make sense to keep the two 'blocking' versions >>>>> of enqueueMethodForCompilation() for convenience. For example your test fix >>>>> for JDK-8073793 would be much simpler if you used them. I've added two >>>>> comments to the 'blocking' convenience methods to mention the fact that >>>>> calling them may shadow previously added compiler directives. >>>>> >>>>> >>>>> I am ok with having them, but think Whitebox.java will get too bloated. >>>>> I would rather have the convenience methods in some test utility class, like >>>>> CompilerWhiteBoxTest.java. >>>>> >>>> OK, I can live with that. I removed the blocking enqueue methods and the >>>> corresponding tests. >>>>> >>>>> BlockingCompilation.java >>>>> >>>>> - I've extended my regression test to test both methods of doing >>>>> blocking compilation - with the new, 'blocking' >>>>> enqueueMethodForCompilation() methods as well as by manually setting the >>>>> corresponding compiler directives. If we should finally get consensus on >>>>> removing the blocking convenience methods, please just remove the >>>>> corresponding tests. >>>>> >>>>> >>>>> Line 85: for (level = 1; level <= 4; level++) { >>>>> >>>>> You cannot be sure all compilation levels are available. 
Use >>>>> >>>>> * @library /testlibrary /test/lib / >>>>> * @build sun.hotspot.WhiteBox >>>>> * compiler.testlibrary.CompilerUtils >>>>> >>>>> import compiler.testlibrary.CompilerUtils; >>>>> >>>>> int[] levels = CompilerUtils.getAvailableCompilationLevels(); >>>>> for (int level : levels) { >>>>> ... >>>> >>>> Good catch. I've slightly reworked the test. I do bail out early if there >>>> are no compilers at all and I've also fixed the break condition of the loop >>>> which is calling foo() to compare against the highest available compilation >>>> level instead of just using '4'. >>>>> >>>>> I think we're close to a final version now, what do you think :) >>>>> >>>>> >>>>> Yes! I'll take a look as soon as you post an updated webrev. >>>> >>>> Would be good if you could run it through JPRT once so we can be sure we >>>> didn't break anything. >>>> >>>> >>>> I am running all jtreg tests on some select platforms right now. >>>> >>>> Best regards, >>>> Nils >>>> >>>> >>>> Regards, >>>> Volker >>>> >>>>> >>>>> Regards, >>>>> Nils >>>>> >>>>> >>>>> >>>>> Regards, >>>>> Volker >>>>> >>>>> >>>>> On Wed, Mar 2, 2016 at 2:37 PM, Nils Eliasson >>>>> wrote: >>>>>> Yes, I forgot to add the fix for working with multiple directives from >>>>>> whitebox. >>>>>> >>>>>> WB.addCompilerDirectives now returns the number of directives that >>>>>> were added, and removeCompilerDirectives takes a parameter for the number >>>>>> of directives that should be popped (atomically). 
>>>>>> >>>>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.03/ >>>>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev.05/ >>>>>> >>>>>> Fixed test in JDK-8073793 to work with this: >>>>>> http://cr.openjdk.java.net/~neliasso/8073793/webrev.03/ >>>>>> >>>>>> Best regards, >>>>>> Nils Eliasson >>>>>> >>>>>> >>>>>> >>>>>> On 2016-03-02 13:36, Nils Eliasson wrote: >>>>>> >>>>>> Hi Volker, >>>>>> >>>>>> I created these webrevs including all the feedback from everyone: >>>>>> >>>>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.02/ >>>>>> * Only add- and removeCompilerDirective >>>>>> >>>>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev.04/ >>>>>> * whitebox.cpp >>>>>> -- addCompilerDirective to have correct VM states >>>>>> * advancedThresholdPolicy.cpp >>>>>> -- prevent blocking tasks from becoming stale >>>>>> -- The logic for picking the first blocking task broke the JVMCI code. Instead >>>>>> made the JVMCI code default (select the blocking task with highest score.) >>>>>> * compilerDirectives.hpp >>>>>> -- Remove option CompileCommand. Not needed. >>>>>> * compileBroker.cpp >>>>>> -- Wrapped compile_method so that directive get and release are always >>>>>> matched. >>>>>> >>>>>> Is anything missing? >>>>>> >>>>>> Best regards, >>>>>> Nils Eliasson >>>>>> >>>>>> >>>>>> On 2016-03-01 19:31, Volker Simonis wrote: >>>>>> >>>>>> Hi Pavel, Nils, Vladimir, >>>>>> >>>>>> sorry, but I was busy the last few days so I couldn't answer your mails. >>>>>> >>>>>> Thanks a lot for your input and your suggestions. I'll look into this >>>>>> tomorrow and hopefully I'll be able to address all your concerns. >>>>>> >>>>>> Regards, >>>>>> Volker >>>>>> >>>>>> >>>>>> On Tue, Mar 1, 2016 at 6:24 PM, Vladimir Kozlov >>>>>> wrote: >>>>>> >>>>>> Nils, please answer Pavel's questions. >>>>>> >>>>>> Thanks, >>>>>> Vladimir >>>>>> >>>>>> >>>>>> On 3/1/16 6:24 AM, Nils Eliasson wrote: >>>>>> >>>>>> Hi Volker, >>>>>> >>>>>> An excellent proposition. 
This is how it should be used. >>>>>> >>>>>> I polished a few rough edges: >>>>>> * compileBroker.cpp - The directive was already accessed in >>>>>> compile_method - but hidden in compilation_is_prohibited. I moved it out >>>>>> so we only have a single directive access. Wrapped compile_method to >>>>>> make sure the release of the directive doesn't get lost. >>>>>> * Let WB_AddCompilerDirective return a bool for success. Also fixed the >>>>>> state - need to be in native to get the string, but then need to be in VM >>>>>> when parsing the directive. >>>>>> >>>>>> And some comments: >>>>>> * I am against adding new compile option commands (At least until the >>>>>> stringly typeness is fixed). Let's add good ways to use compiler >>>>>> directives instead. >>>>>> >>>>>> I need to look at the stale task removal code tomorrow - hopefully we >>>>>> could save the blocking info in the task so we don't need to access the >>>>>> directive in the policy. >>>>>> >>>>>> All in here: >>>>>> Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ >>>>>> >>>>>> The code runs fine with the test I fixed for JDK-8073793: >>>>>> http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ >>>>>> >>>>>> Best regards, >>>>>> Nils Eliasson >>>>>> >>>>>> On 2016-02-26 19:47, Volker Simonis wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> so I want to propose the following solution for this problem: >>>>>> >>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel >>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ >>>>>> >>>>>> I've started from the opposite side and made the BackgroundCompilation >>>>>> manageable through the compiler directives framework. 
Once this works >>>>>> (and it's actually trivial due to the nice design of the >>>>>> CompilerDirectives framework :), we get the possibility to set the >>>>>> BackgroundCompilation option on a per-method basis on the command line >>>>>> via the CompileCommand option for free: >>>>>> >>>>>> >>>>>> >>>>>> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" >>>>>> >>>>>> >>>>>> And of course we can also use it directly as a compiler directive: >>>>>> >>>>>> [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] >>>>>> >>>>>> It also becomes possible to use this directly from the Whitebox API >>>>>> through the DiagnosticCommand.compilerDirectivesAdd command. >>>>>> Unfortunately, this command takes a file with compiler directives as >>>>>> argument. I think this would be overkill in this context. So because >>>>>> it was so easy and convenient, I added the following two new Whitebox >>>>>> methods: >>>>>> >>>>>> public native void addCompilerDirective(String compDirect); >>>>>> public native void removeCompilerDirective(); >>>>>> >>>>>> which can now be used to set an arbitrary CompilerDirective command >>>>>> directly from within the WhiteBox API. (The implementation of these >>>>>> two methods is trivial as you can see in whitebox.cpp). >>>>>> >>>>>> The blocking versions of enqueueMethodForCompilation() now become >>>>>> simple wrappers around the existing methods without the need for any >>>>>> code changes in their native implementation. This is good, because it >>>>>> keeps the WhiteBox API stable! >>>>>> >>>>>> Finally some words about the implementation of the per-method >>>>>> BackgroundCompilation functionality. It actually only requires two >>>>>> small changes: >>>>>> >>>>>> 1. extending CompileBroker::is_compile_blocking() to take the method >>>>>> and compilation level as arguments and use them to query the >>>>>> DirectivesStack for the corresponding BackgroundCompilation value. >>>>>> >>>>>> 2. 
changing AdvancedThresholdPolicy::select_task() such that it >>>>>> prefers blocking compilations. This is necessary not only because it >>>>>> decreases the time we have to wait for a blocking compilation, but >>>>>> also because it prevents blocking compiles from getting stale. This >>>>>> could otherwise easily happen in AdvancedThresholdPolicy::is_stale() >>>>>> for methods which only get artificially compiled during a test because >>>>>> their invocation counters are usually too small. >>>>>> >>>>>> There's still a small probability that a blocking compilation will >>>>>> not be blocking. This can happen if a method for which we request the >>>>>> blocking compilation is already in the compilation queue (see the >>>>>> check 'compilation_is_in_queue(method)' in >>>>>> CompileBroker::compile_method_base()). In testing scenarios this will >>>>>> rarely happen because methods which are manually compiled shouldn't >>>>>> get called that many times to implicitly place them into the compile >>>>>> queue. But we can even completely avoid this problem by using >>>>>> WB.isMethodQueuedForCompilation() to make sure that a method is not in >>>>>> the queue before we request a blocking compilation. >>>>>> >>>>>> I've also added a small regression test to demonstrate and verify the >>>>>> new functionality. >>>>>> >>>>>> Regards, >>>>>> Volker >>>>>> >>>>>> On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson >>>>>> wrote: >>>>>> >>>>>> Hi Vladimir, >>>>>> >>>>>> WhiteBox::compilation_locked is a global state that temporarily stops >>>>>> all >>>>>> compilations. In this case I just want to achieve blocking compilation >>>>>> for a >>>>>> single compile without affecting the rest of the system. The tests >>>>>> using it >>>>>> will continue executing as soon as that compile is finished, saving >>>>>> time >>>>>> where wait-loops are used today. It adds nice determinism to tests. 
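The select_task() change in point 2 - prefer blocking tasks, and among them the one with the highest score - can be sketched as a toy model. Task and SelectTask are hypothetical names for illustration; the real logic lives in AdvancedThresholdPolicy::select_task() in the webrev:

```java
import java.util.List;

// Toy model of a queued compile task: 'blocking' mirrors the flag saved in
// the task, 'score' mirrors the compile policy's rating of the task.
record Task(String method, boolean blocking, double score) { }

class SelectTask {
    // Return the blocking task with the highest score if any task is
    // blocking; otherwise fall back to the highest-scoring task overall.
    static Task select(List<Task> queue) {
        Task best = null;
        Task bestBlocking = null;
        for (Task t : queue) {
            if (best == null || t.score() > best.score()) {
                best = t;
            }
            if (t.blocking() && (bestBlocking == null || t.score() > bestBlocking.score())) {
                bestBlocking = t;
            }
        }
        return bestBlocking != null ? bestBlocking : best;
    }
}
```

Always serving blocking tasks first both shortens the blocked caller's wait and keeps those tasks from ever sitting in the queue long enough to be evicted as stale.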
>>>>>> >>>>>> Best regards, >>>>>> Nils Eliasson >>>>>> >>>>>> >>>>>> On 2016-02-25 22:14, Vladimir Kozlov wrote: >>>>>> >>>>>> You are adding parameter which is used only for testing. >>>>>> Can we have callback(or check field) into WB instead? Similar to >>>>>> WhiteBox::compilation_locked. >>>>>> >>>>>> Thanks, >>>>>> Vladimir >>>>>> >>>>>> On 2/25/16 7:01 AM, Nils Eliasson wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> Please review this change that adds support for blocking compiles >>>>>> in the >>>>>> whitebox API. This enables simpler less time consuming tests. >>>>>> >>>>>> Motivation: >>>>>> * -XX:-BackgroundCompilation is a global flag and can be time >>>>>> consuming >>>>>> * Blocking compiles removes the need for waiting on the compile >>>>>> queue to >>>>>> complete >>>>>> * Compiles put in the queue may be evicted if the queue grows to big - >>>>>> causing indeterminism in the test >>>>>> * Less VM-flags allows for more tests in the same VM >>>>>> >>>>>> Testing: >>>>>> Posting a separate RFR for test fix that uses this change. They >>>>>> will be >>>>>> pushed at the same time. >>>>>> >>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 >>>>>> JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >>>>>> Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >>>>>> >>>>>> Best regards, >>>>>> Nils Eliasson >>>>>> >>>>>> >>>>>> >>>>> >>>> >>> From edward.nevill at gmail.com Wed Mar 9 12:17:41 2016 From: edward.nevill at gmail.com (Edward Nevill) Date: Wed, 09 Mar 2016 12:17:41 +0000 Subject: RFR: 8151502: aarch64: optimize pd_disjoint_words and pd_conjoint_words Message-ID: <1457525861.10946.31.camel@mint> Hi, Please review the following webrev http://cr.openjdk.java.net/~enevill/8151502/webrev/ This optimizes Copy::pd_disjoint_words and Copy::pd_conjoint_words using inline assembler. These routines are heavily used in GC and the aim is to improve the overall performance of GC. 
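For context on the two routines: a disjoint copy may always run forward, while a conjoint (possibly overlapping) copy must pick a direction so that no word is overwritten before it has been read. The Java sketch below only illustrates those semantics; the actual change is AArch64 inline assembler, and WordCopy is a made-up name:

```java
// Copies 'count' longs from src[srcPos] to dst[dstPos], handling overlap the
// way a conjoint word copy must: copy backwards when the destination range
// starts inside the source range, forwards otherwise.
class WordCopy {
    static void conjointWords(long[] src, int srcPos, long[] dst, int dstPos, int count) {
        if (src == dst && dstPos > srcPos && dstPos < srcPos + count) {
            for (int i = count - 1; i >= 0; i--) {  // backward copy for an overlapping shift-up
                dst[dstPos + i] = src[srcPos + i];
            }
        } else {
            for (int i = 0; i < count; i++) {       // forward copy is safe for disjoint ranges
                dst[dstPos + i] = src[srcPos + i];
            }
        }
    }
}
```

A disjoint-only routine can skip the direction test entirely, which is one reason the two entry points exist separately.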
Tested in JMH using the following GCStress program.
http://cr.openjdk.java.net/~enevill/8151502/JMHSample_97_GCStress.java
JMH jar file: http://cr.openjdk.java.net/~enevill/8151502/benchmarks.jar

The following are the results I get.

Original: /home/ed/images/jdk9-orig/bin/java -jar target/benchmarks.jar -i 5 -wi 5 -f 5

Result "gcstress": 24636979.087 ±(99.9%) 267838.773 us/op [Average]
(min, avg, max) = (24102797.710, 24636979.087, 25372022.370), stdev = 357557.099
CI (99.9%): [24369140.314, 24904817.860] (assumes normal distribution)

# Run complete. Total time: 00:20:55

Benchmark                       Mode  Cnt         Score        Error  Units
JMHSample_97_GCStress.gcstress  avgt   25  24636979.087 ± 267838.773  us/op

---------------------------------------------------------------------------

Optimized: /home/ed/images/jdk9-test/bin/java -jar target/benchmarks.jar -i 5 -wi 5 -f 5

Result "gcstress": 20164420.762 ±(99.9%) 280305.425 us/op [Average]
(min, avg, max) = (19738992.960, 20164420.762, 21137460.090), stdev = 374199.723
CI (99.9%): [19884115.337, 20444726.188] (assumes normal distribution)

# Run complete. Total time: 00:17:06

Benchmark                       Mode  Cnt         Score        Error  Units
JMHSample_97_GCStress.gcstress  avgt   25  20164420.762 ± 280305.425  us/op

This shows an approx 22% performance improvement on this benchmark.

I have also included a small bug fix to the array copy code when using -XX:+UseSIMDForMemoryOps. I had fixed this previously, but somehow it fell out.
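The quoted improvement follows from the two JMH averages: time per operation drops from about 24.64M to about 20.16M us/op, i.e. roughly 22% more operations per unit time relative to the optimized run. GcStressSpeedup is just a name for this arithmetic:

```java
// Derives the improvement percentage from two JMH average scores (us/op):
// how much faster the optimized build runs, relative to its own time.
class GcStressSpeedup {
    static double improvementPercent(double baselineUsPerOp, double optimizedUsPerOp) {
        return (baselineUsPerOp / optimizedUsPerOp - 1.0) * 100.0;
    }
}
```

improvementPercent(24636979.087, 20164420.762) evaluates to roughly 22.2, matching the "approx 22%" in the mail.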
All the best, Ed From pavel.punegov at oracle.com Wed Mar 9 12:19:33 2016 From: pavel.punegov at oracle.com (Pavel Punegov) Date: Wed, 9 Mar 2016 15:19:33 +0300 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: <56DFDB4C.8000102@oracle.com> References: <56CF175E.1030806@oracle.com> <56CF6E9D.8060507@oracle.com> <56D00EAB.1010009@oracle.com> <56D5A60A.50700@oracle.com> <56D5D066.7040805@oracle.com> <56D6DE47.7060405@oracle.com> <56D6EC92.7010309@oracle.com> <56D83272.9010202@oracle.com> <56D95E62.20309@oracle.com> <56DFDB4C.8000102@oracle.com> Message-ID: <43C9547D-3BA1-4DB6-A3D4-4431FAF10395@oracle.com> Thanks for your explanation, Volker. Changes are good. Pavel. > On 09 Mar 2016, at 11:14, Nils Eliasson wrote: > > Hi Volker, > > On 2016-03-09 09:02, Volker Simonis wrote: >> On Fri, Mar 4, 2016 at 7:30 PM, Volker Simonis wrote: >>> Hi Pavel, >>> >>> thanks for your feedback. Please find my comments inline: >>> >>> On Fri, Mar 4, 2016 at 6:01 PM, Pavel Punegov >>> wrote: >>>> Hi Volker, >>>> >>>> overall looks good to me (not Reviewer). >>>> >>>> Just some questions about the test: >>>> 1. Do you need PrintCompilation and Inlining in the test? >>>> >>>> 39 * -XX:+PrintCompilation >>>> 40 * >>>> -XX:CompileCommand=option,BlockingCompilation::foo,PrintInlining >>> It's not necessary for the test, but if the test fails it will be good >>> to have this information in the .jtr file >>> >>>> 2. 500_000 for a loop seems to be way too much. >>>> There is a test/compiler/whitebox/CompilerWhiteBoxTest.java that has >>>> constants sufficient for a compilation. >>>> >>> That's just an upper bound which won't be reached. We break out of the loop >>> once we've reached the maximum compilation level.
So if the machine is loaded and/or the compile queue is busy, >>> it can take quite some iterations until the method will be finally compiled >>> (in my manual tests it usually took not more than 8000 iterations until the >>> test left the loop). >>> >>>> 3. Running Client compiler only could pass even if compilation were >>>> blocking here: >>>> >>>> 128 if (level == 4 && iteration == i) { >>>> >>>> I think it should check that any level changing have happened with some >>>> amount of iterations, not only the 4th level. >>> >>> The problem is that C1 compiles are so bleeding fast that I got to many >>> false positives, i.e. the method got compiled in the same iteration just >>> before I queried the compilation level. >>> >>> I actually begin to think that this test may always fail if you run the >>> JTreg tests with "-Xbatch". Do you do that by default (e.g. in JPRT)? In >>> that case we would probably have to additionally set >>> "-XX:+BackgroundCompilation" in the test options? >>> >> I've forgot to say that I've tried the regression test by running >> jtreg with -Xbatch and it still works because -vmoption/-javaoption >> are not passed to a test which is run with '@run main/othervm'. >> >> @Nils: what's the result of your testing? Is there anything left we >> have to do until this can be pushed? > > I have run all hotspot jtreg tests on linux x64 and x86 without problem. I have also run the test with -Xcomp without problem. -Xbatch is not a part of flag rotation. Now I am just waiting for Pavels ok. > > Regards, > Nils > >> >> Regards, >> Volker >> >> >>> Regards, >>> Volker >>> >>> >>>> >>>> ? Thanks, >>>> Pavel Punegov >>>> >>>> On 04 Mar 2016, at 14:29, Volker Simonis wrote: >>>> >>>> Great! 
I'm happy we finally came to an agreement :) >>>> >>>> Best regards, >>>> Volker >>>> >>>> On Fri, Mar 4, 2016 at 11:07 AM, Nils Eliasson >>>> wrote: >>>>> Hi Volker, >>>>> >>>>> On 2016-03-03 16:25, Volker Simonis wrote: >>>>> >>>>> Hi Nils, >>>>> >>>>> thanks for your comments. Please find my new webrev here: >>>>> >>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v4 >>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v4 >>>>> >>>>> >>>>> Looks very good now. >>>>> >>>>> >>>>> Comments as always inline: >>>>> >>>>> On Thu, Mar 3, 2016 at 1:47 PM, Nils Eliasson >>>>> wrote: >>>>>> Hi Volker, >>>>>> >>>>>> On 2016-03-02 17:36, Volker Simonis wrote: >>>>>> >>>>>> Hi Nils, >>>>>> >>>>>> your last webrev (jdk.03 and hotspot.05)) looks pretty good! Ive used is >>>>>> as base for my new webrevs at: >>>>>> >>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot.v3 >>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel.v3 >>>>>> >>>>>> I've updated the copyrights, added the current reviewers and also added >>>>>> us both in the Contributed-by line (hope that's fine for you). >>>>>> >>>>>> >>>>>> Absolutely >>>>>> >>>>>> >>>>>> Except that, I've only done the following minor fixes/changes: >>>>>> >>>>>> compileBroker.{cpp,hpp} >>>>>> >>>>>> - we don't need CompileBroker::is_compile_blocking() anymore. >>>>>> >>>>>> Good >>>>>> >>>>>> >>>>>> compilerDirectives.hpp >>>>>> >>>>>> - I think we should use >>>>>> cflags(BackgroundCompilation, bool, BackgroundCompilation, >>>>>> BackgroundCompilation) >>>>>> instead of: >>>>>> cflags(BackgroundCompilation, bool, BackgroundCompilation, X) >>>>>> >>>>>> so we can also trigger blocking compiles from the command line with a >>>>>> CompileCommand (e.g. 
>>>>>> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false") >>>>>> That's very handy during development or and also for simple tests where we >>>>>> don't want to mess with compiler directives. (And the overhead to keep this >>>>>> feature is quite small, just "BackgroundCompilation" instead of "X" ;-) >>>>>> >>>>>> >>>>>> Without a very strong use case for this I don't want it as a >>>>>> CompileCommand. >>>>>> >>>>>> CompileCommand options do have a cost - they force a temporary unique >>>>>> copy of the directive if any option command matches negating some of the >>>>>> positive effects of directives. Also the CompileCommands are stringly typed, >>>>>> no compile time name or type check is done. This can be fixed in various >>>>>> ways, but until then I prefer not to add it. >>>>>> >>>>> Well, daily working is a strong use case for me:) Until there's no >>>>> possibility to provide compiler directives directly on the command line >>>>> instead of using an extra file, I think there's a justification for the >>>>> CompileCommand version. Also I think the cost argument is not so relevent, >>>>> because the feature will be mainly used during developemnt or in small tests >>>>> which don't need compiler directives. It will actually make it possible to >>>>> write simple tests which require blocking compilations without the need to >>>>> use compiler directives or WB (just by specifying a -XX:CompileCommand >>>>> option). >>>>> >>>>> >>>>> ok, you convinced me. >>>>> >>>>>> whitebox.cpp >>>>>> >>>>>> I think it is good that you fixed the state but I think it is too >>>>>> complicated now. 
We don't need to strdup the string and can easily forget to >>>>>> free 'tmpstr' :) So maybe it is simpler to just do another transition for >>>>>> parsing the directive: >>>>>> >>>>>> { >>>>>> ThreadInVMfromNative ttvfn(thread); // back to VM >>>>>> DirectivesParser::parse_string(dir, tty); >>>>>> } >>>>>> >>>>>> >>>>>> Transitions are not free, but on the other hand the string may be long. >>>>>> This is not a hot path in anyway so lets go with simple. >>>>>> >>>>>> >>>>>> advancedThresholdPolicy.cpp >>>>>> >>>>>> - the JVMCI code looks reasonable (although I haven't tested JVMCI) and >>>>>> is actually even an improvement over my code which just picked the first >>>>>> blocking compilation. >>>>>> >>>>>> >>>>>> Feels good to remove the special cases. >>>>>> >>>>>> >>>>>> diagnosticCommand.cpp >>>>>> >>>>>> - Shouldn't you also fix CompilerDirectivesAddDCmd to return the number >>>>>> of added directives and CompilerDirectivesRemoveDCmd to take the number of >>>>>> directives you want to pop? Or do you want to do this in a later, follow-up >>>>>> change? >>>>>> >>>>>> >>>>>> Yes, lets do that in a follow up change. They affect a number of tests. >>>>>> >>>>>> >>>>>> WhiteBox.java >>>>>> >>>>>> - I still think it would make sense to keep the two 'blocking' versions >>>>>> of enqueueMethodForCompilation() for convenience. For example your test fix >>>>>> for JDK-8073793 would be much simpler if you used them. I've added two >>>>>> comments to the 'blocking' convenience methods to mention the fact that >>>>>> calling them may shadow previously added compiler directives. >>>>>> >>>>>> >>>>>> I am ok with having then, but think Whitebox.java will get too bloated. >>>>>> I would rather have the convenience-methods in some test utility class, like >>>>>> CompilerWhiteBoxTest.java. >>>>>> >>>>> OK, I can live with that. I removed the blocking enqueue methods and the >>>>> corresponding tests. 
>>>>>> >>>>>> BlockingCompilation.java >>>>>> >>>>>> - I've extended my regression test to test both methods of doing >>>>>> blocking compilation - with the new, 'blocking' >>>>>> enqueueMethodForCompilation() methods as well as by manually setting the >>>>>> corresponding compiler directives. If we should finally get consensus on >>>>>> removing the blocking convenience methods, please just remove the >>>>>> corresponding tests. >>>>>> >>>>>> >>>>>> Line 85: for (level = 1; level <= 4; level++) { >>>>>> >>>>>> You can not be sure all compilation levels are available. Use >>>>>> >>>>>> * @library /testlibrary /test/lib / >>>>>> * @build sun.hotspot.WhiteBox >>>>>> * compiler.testlibrary.CompilerUtils >>>>>> >>>>>> import compiler.testlibrary.CompilerUtils; >>>>>> >>>>>> int[] levels = CompilerUtils.getAvailableCompilationLevels(); >>>>>> for (int level : levels) { >>>>>> ... >>>>> >>>>> Good catch. I've slightly revorked the test. I do bail out early if there >>>>> are no compilers at all and I've also fixed the break condition of the loop >>>>> which is calling foo() to compare against the highest available compilation >>>>> level instead of just using '4'. >>>>>> >>>>>> I think we're close to a final version now, what do you think :) >>>>>> >>>>>> >>>>>> Yes! I'll take a look as soon as you post an updated webrev. >>>>> >>>>> Would be good if you could run it trough JPRT once so we can be sure we >>>>> didn't break anything. >>>>> >>>>> >>>>> I am running all jtreg tests on some select platforms right now. >>>>> >>>>> Best regards, >>>>> Nils >>>>> >>>>> >>>>> Regards, >>>>> Volker >>>>> >>>>>> >>>>>> Regards, >>>>>> Nils >>>>>> >>>>>> >>>>>> >>>>>> Regards, >>>>>> Volker >>>>>> >>>>>> >>>>>> On Wed, Mar 2, 2016 at 2:37 PM, Nils Eliasson >>>>>> wrote: >>>>>>> Yes, I forgot to add the fix for working with multiple directives from >>>>>>> whitebox. 
>>>>>>> >>>>>>> WB.addCompilerDirectives now returns the number of directives that >>>>>>> where added, and removeCompilerDirectives takes a parameter for the number >>>>>>> of directives that should be popped (atomically). >>>>>>> >>>>>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.03/ >>>>>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev.05/ >>>>>>> >>>>>>> Fixed test in JDK-8073793 to work with this: >>>>>>> http://cr.openjdk.java.net/~neliasso/8073793/webrev.03/ >>>>>>> >>>>>>> Best regards, >>>>>>> Nils Eliasson >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 2016-03-02 13:36, Nils Eliasson wrote: >>>>>>> >>>>>>> Hi Volker, >>>>>>> >>>>>>> I created these webrevs including all the feedback from everyone: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.02/ >>>>>>> * Only add- and removeCompilerDirective >>>>>>> >>>>>>> http://cr.openjdk.java.net/~neliasso/8150646/webrev.04/ >>>>>>> * whitebox.cpp >>>>>>> -- addCompilerDirective to have correct VM states >>>>>>> * advancedThresholdPolicy.cpp >>>>>>> -- prevent blocking tasks from becoming stale >>>>>>> -- The logic for picking first blocking task broke JVMCI code. Instead >>>>>>> made the JVMCI code default (select the blocking task with highest score.) >>>>>>> * compilerDirectives.hpp >>>>>>> -- Remove option CompileCommand. Not needed. >>>>>>> * compileBroker.cpp >>>>>>> -- Wrapped compile_method so that directive get and release always are >>>>>>> matched. >>>>>>> >>>>>>> Is anything missing? >>>>>>> >>>>>>> Best regards, >>>>>>> Nils Eliasson >>>>>>> >>>>>>> >>>>>>> On 2016-03-01 19:31, Volker Simonis wrote: >>>>>>> >>>>>>> Hi Pavel, Nils, Vladimir, >>>>>>> >>>>>>> sorry, but I was busy the last days so I couldn't answer your mails. >>>>>>> >>>>>>> Thanks a lot for your input and your suggestions. I'll look into this >>>>>>> tomorrow and hopefully I'll be able to address all your concerns. 
>>>>>>> >>>>>>> Regards, >>>>>>> Volker >>>>>>> >>>>>>> >>>>>>> On Tue, Mar 1, 2016 at 6:24 PM, Vladimir Kozlov >>>>>>> wrote: >>>>>>> >>>>>>> Nils, please answer Pavel's questions. >>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>> >>>>>>> On 3/1/16 6:24 AM, Nils Eliasson wrote: >>>>>>> >>>>>>> Hi Volker, >>>>>>> >>>>>>> An excellent proposition. This is how it should be used. >>>>>>> >>>>>>> I polished a few rough edges: >>>>>>> * compileBroker.cpp - The directives were already accessed in >>>>>>> compile_method - but hidden in compilation_is_prohibited. I moved it out >>>>>>> so we only have a single directive access. Wrapped compile_method to >>>>>>> make sure the release of the directive doesn't get lost. >>>>>>> * Let WB_AddCompilerDirective return a bool for success. Also fixed the >>>>>>> state - need to be in native to get the string, but then need to be in VM >>>>>>> when parsing the directive. >>>>>>> >>>>>>> And some comments: >>>>>>> * I am against adding new compile option commands (at least until the >>>>>>> stringly typeness is fixed). Let's add good ways to use compiler >>>>>>> directives instead. >>>>>>> >>>>>>> I need to look at the stale task removal code tomorrow - hopefully we >>>>>>> could save the blocking info in the task so we don't need to access the >>>>>>> directive in the policy. 
>>>>>>> >>>>>>> All in here: >>>>>>> Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.03/ >>>>>>> >>>>>>> The code runs fine with the test I fixed for JDK-8073793: >>>>>>> http://cr.openjdk.java.net/~neliasso/8073793/webrev.02/ >>>>>>> >>>>>>> Best regards, >>>>>>> Nils Eliasson >>>>>>> >>>>>>> On 2016-02-26 19:47, Volker Simonis wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> so I want to propose the following solution for this problem: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_toplevel >>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8150646_hotspot/ >>>>>>> >>>>>>> I've started from the opposite side and made the BackgroundCompilation >>>>>>> manageable through the compiler directives framework. Once this works >>>>>>> (and it's actually trivial due to the nice design of the >>>>>>> CompilerDirectives framework :), we get the possibility to set the >>>>>>> BackgroundCompilation option on a per-method basis on the command line >>>>>>> via the CompileCommand option for free: >>>>>>> >>>>>>> >>>>>>> >>>>>>> -XX:CompileCommand="option,java.lang.String::charAt,bool,BackgroundCompilation,false" >>>>>>> >>>>>>> >>>>>>> And of course we can also use it directly as a compiler directive: >>>>>>> >>>>>>> [{ match: "java.lang.String::charAt", BackgroundCompilation: false }] >>>>>>> >>>>>>> It also becomes possible to use this directly from the Whitebox API >>>>>>> through the DiagnosticCommand.compilerDirectivesAdd command. >>>>>>> Unfortunately, this command takes a file with compiler directives as >>>>>>> an argument. I think this would be overkill in this context. So because >>>>>>> it was so easy and convenient, I added the following two new Whitebox >>>>>>> methods: >>>>>>> >>>>>>> public native void addCompilerDirective(String compDirect); >>>>>>> public native void removeCompilerDirective(); >>>>>>> >>>>>>> which can now be used to set arbitrary CompilerDirective commands >>>>>>> directly from within the WhiteBox API. 
(The implementation of these >>>>>>> two methods is trivial as you can see in whitebox.cpp). >>>>>>> >>>>>>> The blocking versions of enqueueMethodForCompilation() now become >>>>>>> simple wrappers around the existing methods without the need for any >>>>>>> code changes in their native implementation. This is good, because it >>>>>>> keeps the WhiteBox API stable! >>>>>>> >>>>>>> Finally some words about the implementation of the per-method >>>>>>> BackgroundCompilation functionality. It actually only requires two >>>>>>> small changes: >>>>>>> >>>>>>> 1. extending CompileBroker::is_compile_blocking() to take the method >>>>>>> and compilation level as arguments and use them to query the >>>>>>> DirectivesStack for the corresponding BackgroundCompilation value. >>>>>>> >>>>>>> 2. changing AdvancedThresholdPolicy::select_task() such that it >>>>>>> prefers blocking compilations. This is not only necessary because it >>>>>>> decreases the time we have to wait for a blocking compilation, but >>>>>>> also because it prevents blocking compiles from getting stale. This >>>>>>> could otherwise easily happen in AdvancedThresholdPolicy::is_stale() >>>>>>> for methods which only get artificially compiled during a test because >>>>>>> their invocation counters are usually too small. >>>>>>> >>>>>>> There's still a small probability that a blocking compilation will >>>>>>> not be blocking. This can happen if a method for which we request the >>>>>>> blocking compilation is already in the compilation queue (see the >>>>>>> check 'compilation_is_in_queue(method)' in >>>>>>> CompileBroker::compile_method_base()). In testing scenarios this will >>>>>>> rarely happen because methods which are manually compiled shouldn't >>>>>>> get called that many times to implicitly place them into the compile >>>>>>> queue. 
But we can even completely avoid this problem by using >>>>>>> WB.isMethodQueuedForCompilation() to make sure that a method is not in >>>>>>> the queue before we request a blocking compilation. >>>>>>> >>>>>>> I've also added a small regression test to demonstrate and verify the >>>>>>> new functionality. >>>>>>> >>>>>>> Regards, >>>>>>> Volker >>>>>>> >>>>>>> On Fri, Feb 26, 2016 at 9:36 AM, Nils Eliasson >>>>>>> wrote: >>>>>>> >>>>>>> Hi Vladimir, >>>>>>> >>>>>>> WhiteBox::compilation_locked is a global state that temporarily stops >>>>>>> all >>>>>>> compilations. In this case I just want to achieve blocking compilation >>>>>>> for a >>>>>>> single compile without affecting the rest of the system. The tests >>>>>>> using it >>>>>>> will continue executing as soon as that compile is finished, saving >>>>>>> time >>>>>>> where wait-loops are used today. It adds nice determinism to tests. >>>>>>> >>>>>>> Best regards, >>>>>>> Nils Eliasson >>>>>>> >>>>>>> >>>>>>> On 2016-02-25 22:14, Vladimir Kozlov wrote: >>>>>>> >>>>>>> You are adding a parameter which is used only for testing. >>>>>>> Can we have a callback (or check a field) into WB instead? Similar to >>>>>>> WhiteBox::compilation_locked. >>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>> On 2/25/16 7:01 AM, Nils Eliasson wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> Please review this change that adds support for blocking compiles >>>>>>> in the >>>>>>> whitebox API. This enables simpler, less time-consuming tests. >>>>>>> >>>>>>> Motivation: >>>>>>> * -XX:-BackgroundCompilation is a global flag and can be >>>>>>> time-consuming >>>>>>> * Blocking compiles remove the need for waiting on the compile >>>>>>> queue to >>>>>>> complete >>>>>>> * Compiles put in the queue may be evicted if the queue grows too big - >>>>>>> causing indeterminism in the test >>>>>>> * Fewer VM flags allow for more tests in the same VM >>>>>>> >>>>>>> Testing: >>>>>>> Posting a separate RFR for a test fix that uses this change. 
They >>>>>>> will be >>>>>>> pushed at the same time. >>>>>>> >>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 >>>>>>> JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >>>>>>> Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >>>>>>> >>>>>>> Best regards, >>>>>>> Nils Eliasson -------------- next part -------------- An HTML attachment was scrubbed... URL: From aph at redhat.com Wed Mar 9 12:57:27 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 9 Mar 2016 12:57:27 +0000 Subject: [aarch64-port-dev ] RFR: 8151502: aarch64: optimize pd_disjoint_words and pd_conjoint_words In-Reply-To: <1457525861.10946.31.camel@mint> References: <1457525861.10946.31.camel@mint> Message-ID: <56E01DB7.1070609@redhat.com> On 03/09/2016 12:17 PM, Edward Nevill wrote: > http://cr.openjdk.java.net/~enevill/8151502/JMHSample_97_GCStress.java > > JMH jar file: http://cr.openjdk.java.net/~enevill/8151502/benchmarks.jar > > The following are the results I get Not bad, but not quite perfect. But I guess you knew I'd say that. :-) The switch on count < threshold should be done in C, with multiple inline asm blocks. That way, GCC can do value range propagation for small copies. Also, GCC can do things like if (__builtin_constant_p(cnt)). There are some cases where cnt is a constant. We should be careful that we don't slow down such cases. 
GCC does this: 0x0000007fb774850c <+0>: adrp x2, 0x7fb7db1000 0x0000007fb7748510 <+4>: add x0, x2, #0xe20 0x0000007fb7748514 <+8>: ldr x5, [x0,#56] 0x0000007fb7748518 <+12>: ldr x4, [x0,#48] 0x0000007fb774851c <+16>: ldr x3, [x0,#40] 0x0000007fb7748520 <+20>: ldr x1, [x0,#32] 0x0000007fb7748524 <+24>: str x5, [x0,#24] 0x0000007fb7748528 <+28>: str x4, [x0,#16] 0x0000007fb774852c <+32>: str x3, [x0,#8] 0x0000007fb7748530 <+36>: str x1, [x2,#3616] for this: HeapWord blah[4]; HeapWord blah2[4]; void bletch() { Copy::disjoint_words(blah, blah2, sizeof blah / sizeof blah[0]); } Finally, GCC has __builtin_expect(bool). We should use that to emit the large copy and the backwards copy out of line. Finally, GCC knows that copying from one object to the other copies the contents. It can do copy propagation. I think that this change can be done with no performance regressions, either in code size or speed, for any range of arguments. Andrew. From vladimir.x.ivanov at oracle.com Wed Mar 9 13:09:00 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 9 Mar 2016 16:09:00 +0300 Subject: [9] RFR (XS): 8150320: C1: Illegal bci in debug info for MH::linkTo* methods Message-ID: <56E0206C.9090601@oracle.com> http://cr.openjdk.java.net/~vlivanov/8150320/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8150320 C1 doesn't attach any bci info to TypeCast nodes which narrow receiver and argument types when inlining through MH::linkTo* happens. The fix is to switch from parse-only to complete JVM state. No test case provided since it is hard to trigger the problem in a unit test. Testing: verified that the assert doesn't fire anymore w/ a long-running javac. Thanks! 
Best regards, Vladimir Ivanov From pavel.punegov at oracle.com Wed Mar 9 13:48:29 2016 From: pavel.punegov at oracle.com (Pavel Punegov) Date: Wed, 9 Mar 2016 16:48:29 +0300 Subject: RFR (XXS): 8150955: RandomValidCommandsTest.java fails with UnsatisfiedLinkError: sun.hotspot.WhiteBox.registerNatives In-Reply-To: <56D9C13F.9050009@oracle.com> References: <03E7FC93-DA1B-4A03-8E77-E7E256AE0AC8@oracle.com> <56D9C13F.9050009@oracle.com> Message-ID: <7270DF7B-FD90-4CAB-BFF3-9FDE7A15BF9B@oracle.com> Thanks for review, Vladimir! - Pavel. > On 04 Mar 2016, at 20:09, Vladimir Kozlov wrote: > > Looks good. > > Thanks, > Vladimir > > On 3/4/16 8:34 AM, Pavel Punegov wrote: >> Hi, >> >> please review this small fix to the test bug. >> >> Issue: test fails to start because WhiteBox options were not specified. Test randomly generated only one command that >> was skipped. But appropriate VM options are always taken from the Command enum together with options specific for >> a command. In the failing case the VM started without any options. >> >> Fix: make test replace invalid commands with a valid one. >> >> webrev: http://cr.openjdk.java.net/~ppunegov/8150955/webrev.00/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8150955 >> >> - Thanks, >> Pavel Punegov >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From edward.nevill at gmail.com Wed Mar 9 14:07:12 2016 From: edward.nevill at gmail.com (Edward Nevill) Date: Wed, 09 Mar 2016 14:07:12 +0000 Subject: [aarch64-port-dev ] RFR: 8151502: aarch64: optimize pd_disjoint_words and pd_conjoint_words In-Reply-To: <56E01DB7.1070609@redhat.com> References: <1457525861.10946.31.camel@mint> <56E01DB7.1070609@redhat.com> Message-ID: <1457532432.10946.77.camel@mint> On Wed, 2016-03-09 at 12:57 +0000, Andrew Haley wrote: > On 03/09/2016 12:17 PM, Edward Nevill wrote: > > http://cr.openjdk.java.net/~enevill/8151502/JMHSample_97_GCStress.java > > > > JMH jar file: http://cr.openjdk.java.net/~enevill/8151502/benchmarks.jar > > > > The following are the results I get > > Not bad, but not quite perfect. But I guess you knew I'd say that. > :-) > > The switch on count < threshold should be done in C, with multiple > inline asm blocks. That way, GCC can do value range propagation for > small copies. Hmm. I did try using switch on my first stab at this but gave up when I got the following output for this simple test program (this is with stock gcc 5.2). --- CUT HERE --- unsigned long test_switch(unsigned long /*aka size_t*/ i) { switch (i) { case 0: return 1; case 1: return i+2; case 2: return i; case 3: return i-10; case 4: return 11; case 5: return 16; default: return -1; } } --- CUT HERE --- generates ---------------- test_switch: cmp x0, 5 bhi .L2 cmp w0, 5 <<<<< REALLY bls .L12 .L2: mov x0, -1 ret .p2align 3 .L12: adrp x1, .L4 add x1, x1, :lo12:.L4 ldrb w0, [x1,w0,uxtw] <<<<< REALLY adr x1, .Lrtx4 add x0, x1, w0, sxtb #2 br x0 ---------------- I've filed a bug report. I could maybe do a gcc goto table, but would that fool value range propagation? > > Also, GCC can do things like if (__builtin_constant_p(cnt)). There > are some cases where cnt is a constant. We should be careful that we > don't slow down such cases. Agreed. 
> void bletch() { > Copy::disjoint_words(blah, blah2, sizeof blah / sizeof blah[0]); > } > > Finally, GCC has __builtin_expect(bool). We should use that to emit > the large copy and the backwards copy out of line. I did initially do the large copy out of line. My concern was that the register allocator wouldn't handle the two paths and would treat x0..x18 as corrupted on both paths, whereas the inline version 'only' uses 11 registers. All the best, Ed. From aph at redhat.com Wed Mar 9 14:11:19 2016 From: aph at redhat.com (Andrew Haley) Date: Wed, 9 Mar 2016 14:11:19 +0000 Subject: [aarch64-port-dev ] RFR: 8151502: aarch64: optimize pd_disjoint_words and pd_conjoint_words In-Reply-To: <1457532432.10946.77.camel@mint> References: <1457525861.10946.31.camel@mint> <56E01DB7.1070609@redhat.com> <1457532432.10946.77.camel@mint> Message-ID: <56E02F07.4040005@redhat.com> On 03/09/2016 02:07 PM, Edward Nevill wrote: > On Wed, 2016-03-09 at 12:57 +0000, Andrew Haley wrote: >> On 03/09/2016 12:17 PM, Edward Nevill wrote: >>> http://cr.openjdk.java.net/~enevill/8151502/JMHSample_97_GCStress.java >>> >>> JMH jar file: http://cr.openjdk.java.net/~enevill/8151502/benchmarks.jar >>> >>> The following are the results I get >> >> Not bad, but not quite perfect. But I guess you knew I'd say that. >> :-) >> >> The switch on count < threshold should be done in C, with multiple >> inline asm blocks. That way, GCC can do value range propagation for >> small copies. > > Hmm. I did try using switch on my first stab at this but gave up > when I got the following output for this simple test program (this > is with stock gcc 5.2). I was just thinking of if (cnt < 8) small() else large(); >> void bletch() { >> Copy::disjoint_words(blah, blah2, sizeof blah / sizeof blah[0]); >> } >> >> Finally, GCC has __builtin_expect(bool). We should use that to emit >> the large copy and the backwards copy out of line. > > I did initially do the large copy out of line. 
My concern was that > the register allocator wouldn't handle the two paths and would treat > x0..x18 as corrupted on both paths, whereas the inline version > 'only' uses 11 registers. You can do the pushing and popping yourself. Andrew. From vladimir.kempik at oracle.com Wed Mar 9 16:15:05 2016 From: vladimir.kempik at oracle.com (Vladimir Kempik) Date: Wed, 9 Mar 2016 19:15:05 +0300 Subject: [8u] RFR 8151522: Disable 8130150 and 8081778 intrinsics by default Message-ID: <56E04C09.90106@oracle.com> Hello Please review this simple change for jdk8u. This patch makes intrinsics introduced in 8130150 and 8081778 to be disabled by default. They still can be enabled with proper flags. I also had to modify testcases introduced by 8130150 and 8081778 to enable these flags. Testing: jprt. Bug: https://bugs.openjdk.java.net/browse/JDK-8151522 Webrev: http://cr.openjdk.java.net/~vkempik/8151522/webrev.00/ Thanks -Vladimir From vladimir.kozlov at oracle.com Wed Mar 9 18:06:58 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 9 Mar 2016 10:06:58 -0800 Subject: [8u] RFR 8151522: Disable 8130150 and 8081778 intrinsics by default In-Reply-To: <56E04C09.90106@oracle.com> References: <56E04C09.90106@oracle.com> Message-ID: <56E06642.3070005@oracle.com> Please, add testing for flags combination for MontgomeryMultiplyTest.java the same as you did for 7u. Thanks, Vladimir On 3/9/16 8:15 AM, Vladimir Kempik wrote: > Hello > > Please review this simple change for jdk8u. > > This patch makes intrinsics introduced in 8130150 and 8081778 to be > disabled by default. > > They still can be enabled with proper flags. > > I also had to modify testcases introduced by 8130150 and 8081778 to > enable these flags. > > Testing: jprt. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8151522 > Webrev: http://cr.openjdk.java.net/~vkempik/8151522/webrev.00/ > > Thanks > -Vladimir > From vladimir.kempik at oracle.com Wed Mar 9 18:13:07 2016 From: vladimir.kempik at oracle.com (Vladimir Kempik) Date: Wed, 9 Mar 2016 21:13:07 +0300 Subject: [8u] RFR 8151522: Disable 8130150 and 8081778 intrinsics by default In-Reply-To: <56E06642.3070005@oracle.com> References: <56E04C09.90106@oracle.com> <56E06642.3070005@oracle.com> Message-ID: <56E067B3.9050202@oracle.com> Hello Updated webrev - http://cr.openjdk.java.net/~vkempik/8151522/webrev.01/ Thanks, Vladimir On 09.03.2016 21:06, Vladimir Kozlov wrote: > Please, add testing for flags combination for > MontgomeryMultiplyTest.java the same as you did for 7u. > > Thanks, > Vladimir > > On 3/9/16 8:15 AM, Vladimir Kempik wrote: >> Hello >> >> Please review this simple change for jdk8u. >> >> This patch makes intrinsics introduced in 8130150 and 8081778 to be >> disabled by default. >> >> They still can be enabled with proper flags. >> >> I also had to modify testcases introduced by 8130150 and 8081778 to >> enable these flags. >> >> Testing: jprt. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8151522 >> Webrev: http://cr.openjdk.java.net/~vkempik/8151522/webrev.00/ >> >> Thanks >> -Vladimir >> From vladimir.kozlov at oracle.com Wed Mar 9 18:14:12 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 9 Mar 2016 10:14:12 -0800 Subject: [8u] RFR 8151522: Disable 8130150 and 8081778 intrinsics by default In-Reply-To: <56E067B3.9050202@oracle.com> References: <56E04C09.90106@oracle.com> <56E06642.3070005@oracle.com> <56E067B3.9050202@oracle.com> Message-ID: <56E067F4.20109@oracle.com> This looks good. 
Thanks, Vladimir On 3/9/16 10:13 AM, Vladimir Kempik wrote: > Hello > > Updated webrev - http://cr.openjdk.java.net/~vkempik/8151522/webrev.01/ > > Thanks, Vladimir > > On 09.03.2016 21:06, Vladimir Kozlov wrote: >> Please, add testing for flags combination for >> MontgomeryMultiplyTest.java the same as you did for 7u. >> >> Thanks, >> Vladimir >> >> On 3/9/16 8:15 AM, Vladimir Kempik wrote: >>> Hello >>> >>> Please review this simple change for jdk8u. >>> >>> This patch makes intrinsics introduced in 8130150 and 8081778 to be >>> disabled by default. >>> >>> They still can be enabled with proper flags. >>> >>> I also had to modify testcases introduced by 8130150 and 8081778 to >>> enable these flags. >>> >>> Testing: jprt. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8151522 >>> Webrev: http://cr.openjdk.java.net/~vkempik/8151522/webrev.00/ >>> >>> Thanks >>> -Vladimir >>> > From edward.nevill at gmail.com Wed Mar 9 20:15:08 2016 From: edward.nevill at gmail.com (Edward Nevill) Date: Wed, 09 Mar 2016 20:15:08 +0000 Subject: [aarch64-port-dev ] RFR: 8151502: aarch64: optimize pd_disjoint_words and pd_conjoint_words In-Reply-To: <56E02F07.4040005@redhat.com> References: <1457525861.10946.31.camel@mint> <56E01DB7.1070609@redhat.com> <1457532432.10946.77.camel@mint> <56E02F07.4040005@redhat.com> Message-ID: <1457554508.29918.13.camel@mint> On Wed, 2016-03-09 at 14:11 +0000, Andrew Haley wrote: > On 03/09/2016 02:07 PM, Edward Nevill wrote: > > On Wed, 2016-03-09 at 12:57 +0000, Andrew Haley wrote: > >> On 03/09/2016 12:17 PM, Edward Nevill wrote: > >>> http://cr.openjdk.java.net/~enevill/8151502/JMHSample_97_GCStress.java > >>> > >>> JMH jar file: http://cr.openjdk.java.net/~enevill/8151502/benchmarks.jar > >>> > >>> The following are the results I get > >> > >> Not bad, but not quite perfect. But I guess you knew I'd say that. > >> :-) > >> I'll settle for good enough! What about this one? 
http://cr.openjdk.java.net/~enevill/8151502/webrev.3 So this gets the following from GCStress Benchmark Mode Cnt Score Error Units JMHSample_97_GCStress.gcstress avgt 25 20171328.764 ± 284468.532 us/op Whereas previously the best was Benchmark Mode Cnt Score Error Units JMHSample_97_GCStress.gcstress avgt 25 20164420.762 ± 280305.425 us/op I.e., no significant difference. But it does inline less code and handles the case where count is a constant. All the best, Ed. From ivan at azulsystems.com Wed Mar 9 23:53:20 2016 From: ivan at azulsystems.com (Ivan Krylov) Date: Wed, 9 Mar 2016 15:53:20 -0800 Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the corresponding x86 hotspot instrinsic In-Reply-To: References: <56A751AE.9090203@azulsystems.com> <45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com> <56A8BC9D.8060004@azulsystems.com> <6148E4D7-AF5E-4094-B363-52E0D83452E9@oracle.com> <56AA2AE4.2090803@azulsystems.com> <2538083C-7906-44AA-A074-7DBF5F2D8654@oracle.com> <50C14C66-4068-4DD7-BD94-96E37F7C9B0A@oracle.com> <56AF85F3.3060802@azulsystems.com> <56BBCBF4.2070504@azulsystems.com> <56BD1F7F.3020808@azulsystems.com> Message-ID: <56E0B770.1@azulsystems.com> Paul, Indeed, thanks. I have modified the test. I also made changes to reflect the fact that onSpinWait is now decided to be placed into j.l.Thread. Igor, This is a new webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.04/ This is the diff between the previous patch and this one (03 vs 04): http://cr.openjdk.java.net/~ikrylov/8147844.hs.04/diff.txt Thanks, Ivan On 12/02/2016 06:01, Paul Sandoz wrote: >> On 12 Feb 2016, at 00:55, Ivan Krylov wrote: >> >> Hi Igor, >> >> Thanks both for your help and your reviews. >> Here is a new version, tested on mac for c1 and c2: >> >> http://cr.openjdk.java.net/~ikrylov/8147844.hs.03 >> > Now that C1 is supported should the test be updated with C1 only execution? > > Paul. 
From igor.veresov at oracle.com Thu Mar 10 01:04:25 2016 From: igor.veresov at oracle.com (Igor Veresov) Date: Wed, 9 Mar 2016 17:04:25 -0800 Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the corresponding x86 hotspot instrinsic In-Reply-To: <56E0B770.1@azulsystems.com> References: <56A751AE.9090203@azulsystems.com> <45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com> <56A8BC9D.8060004@azulsystems.com> <6148E4D7-AF5E-4094-B363-52E0D83452E9@oracle.com> <56AA2AE4.2090803@azulsystems.com> <2538083C-7906-44AA-A074-7DBF5F2D8654@oracle.com> <50C14C66-4068-4DD7-BD94-96E37F7C9B0A@oracle.com> <56AF85F3.3060802@azulsystems.com> <56BBCBF4.2070504@azulsystems.com> <56BD1F7F.3020808@azulsystems.com> <56E0B770.1@azulsystems.com> Message-ID: Ok, good. igor > On Mar 9, 2016, at 3:53 PM, Ivan Krylov wrote: > > Paul, Indeed, thanks. I have modified the test. > I also made changes to reflect the fact that onSpinWait is now decided to be placed into j.l.Thread. > > Igor, > This is a new webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.04/ > This is the diff between previous and this patches (03 vs 04): > http://cr.openjdk.java.net/~ikrylov/8147844.hs.04/diff.txt > > Thanks, > > Ivan > > On 12/02/2016 06:01, Paul Sandoz wrote: >>> On 12 Feb 2016, at 00:55, Ivan Krylov wrote: >>> >>> Hi Igor, >>> >>> Thanks both for your help and your reviews. >>> Here is a new version, tested on mac for c1 and c2: >>> >>> http://cr.openjdk.java.net/~ikrylov/8147844.hs.03 >>> >> Now that support C1 is supported should the test be updated with C1 only execution? >> >> Paul. 
> From vitalyd at gmail.com Thu Mar 10 01:39:49 2016 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 9 Mar 2016 20:39:49 -0500 Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the corresponding x86 hotspot instrinsic In-Reply-To: <56E0B770.1@azulsystems.com> References: <56A751AE.9090203@azulsystems.com> <45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com> <56A8BC9D.8060004@azulsystems.com> <6148E4D7-AF5E-4094-B363-52E0D83452E9@oracle.com> <56AA2AE4.2090803@azulsystems.com> <2538083C-7906-44AA-A074-7DBF5F2D8654@oracle.com> <50C14C66-4068-4DD7-BD94-96E37F7C9B0A@oracle.com> <56AF85F3.3060802@azulsystems.com> <56BBCBF4.2070504@azulsystems.com> <56BD1F7F.3020808@azulsystems.com> <56E0B770.1@azulsystems.com> Message-ID: Minor quibble - in x86.ad: if (VM_Version::supports_on_spin_wait() == false) Any particular reason it's using == false form instead of !supports_on_spin_wait? On Wednesday, March 9, 2016, Ivan Krylov wrote: > Paul, Indeed, thanks. I have modified the test. > I also made changes to reflect the fact that onSpinWait is now decided to > be placed into j.l.Thread. > > Igor, > This is a new webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.04/ > This is the diff between previous and this patches (03 vs 04): > http://cr.openjdk.java.net/~ikrylov/8147844.hs.04/diff.txt > > Thanks, > > Ivan > > On 12/02/2016 06:01, Paul Sandoz wrote: > >> On 12 Feb 2016, at 00:55, Ivan Krylov wrote: >>> >>> Hi Igor, >>> >>> Thanks both for your help and your reviews. >>> Here is a new version, tested on mac for c1 and c2: >>> >>> http://cr.openjdk.java.net/~ikrylov/8147844.hs.03 >>> >>> Now that support C1 is supported should the test be updated with C1 only >> execution? >> >> Paul. >> > > -- Sent from my phone -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ivan at azulsystems.com Thu Mar 10 02:05:03 2016 From: ivan at azulsystems.com (Ivan Krylov) Date: Wed, 9 Mar 2016 18:05:03 -0800 Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the corresponding x86 hotspot instrinsic In-Reply-To: References: <56A751AE.9090203@azulsystems.com> <45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com> <56A8BC9D.8060004@azulsystems.com> <6148E4D7-AF5E-4094-B363-52E0D83452E9@oracle.com> <56AA2AE4.2090803@azulsystems.com> <2538083C-7906-44AA-A074-7DBF5F2D8654@oracle.com> <50C14C66-4068-4DD7-BD94-96E37F7C9B0A@oracle.com> <56AF85F3.3060802@azulsystems.com> <56BBCBF4.2070504@azulsystems.com> <56BD1F7F.3020808@azulsystems.com> <56E0B770.1@azulsystems.com> Message-ID: <56E0D64F.1040909@azulsystems.com> Merely for consistency with the surrounding code. I wouldn't write it that way but I have to follow the practice there. See http://hg.openjdk.java.net/jdk9/jdk9/hotspot/file/7e7e50ac4faf/src/cpu/x86/vm/x86.ad Thanks, Igor! I will ask someone to push the change in once the JEP becomes targeted. Ivan On 09/03/2016 17:39, Vitaly Davidovich wrote: > Minor quibble - in x86.ad : > if (VM_Version::supports_on_spin_wait() == false) > Any particular reason it's using == false form instead of > !supports_on_spin_wait? > > On Wednesday, March 9, 2016, Ivan Krylov > wrote: > > Paul, Indeed, thanks. I have modified the test. > I also made changes to reflect the fact that onSpinWait is now > decided to be placed into j.l.Thread. > > Igor, > This is a new webrev: > http://cr.openjdk.java.net/~ikrylov/8147844.hs.04/ > > This is the diff between previous and this patches (03 vs 04): > http://cr.openjdk.java.net/~ikrylov/8147844.hs.04/diff.txt > > > Thanks, > > Ivan > > On 12/02/2016 06:01, Paul Sandoz wrote: > > On 12 Feb 2016, at 00:55, Ivan Krylov > wrote: > > Hi Igor, > > Thanks both for your help and your reviews. 
> Here is a new version, tested on mac for c1 and c2: > > http://cr.openjdk.java.net/~ikrylov/8147844.hs.03 > > > Now that support C1 is supported should the test be updated > with C1 only execution? > > Paul. > > > > > -- > Sent from my phone -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Thu Mar 10 02:08:57 2016 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 9 Mar 2016 21:08:57 -0500 Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the corresponding x86 hotspot instrinsic In-Reply-To: <56E0D64F.1040909@azulsystems.com> References: <56A751AE.9090203@azulsystems.com> <45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com> <56A8BC9D.8060004@azulsystems.com> <6148E4D7-AF5E-4094-B363-52E0D83452E9@oracle.com> <56AA2AE4.2090803@azulsystems.com> <2538083C-7906-44AA-A074-7DBF5F2D8654@oracle.com> <50C14C66-4068-4DD7-BD94-96E37F7C9B0A@oracle.com> <56AF85F3.3060802@azulsystems.com> <56BBCBF4.2070504@azulsystems.com> <56BD1F7F.3020808@azulsystems.com> <56E0B770.1@azulsystems.com> <56E0D64F.1040909@azulsystems.com> Message-ID: On Wednesday, March 9, 2016, Ivan Krylov wrote: > Merely for consistency with the surrounding code. I wouldn't write it that > way but I have to follow the practice there. > See > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/file/7e7e50ac4faf/src/cpu/x86/vm/x86.ad > I looked just above the lines you added and there's a supports_cx8 check using the ! form. But you're right, your form appears elsewhere too in the file. > > > Thanks, Igor! I will ask someone to push the change in once the JEP > becomes targeted. > > Ivan > > On 09/03/2016 17:39, Vitaly Davidovich wrote: > > Minor quibble - in x86.ad: > > if (VM_Version::supports_on_spin_wait() == false) > > Any particular reason it's using == false form instead of > !supports_on_spin_wait? > > On Wednesday, March 9, 2016, Ivan Krylov < > ivan at azulsystems.com > > wrote: > >> Paul, Indeed, thanks. I have modified the test. 
>> I also made changes to reflect the fact that onSpinWait is now decided to >> be placed into j.l.Thread. >> >> Igor, >> This is a new webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.04/ >> This is the diff between previous and this patches (03 vs 04): >> http://cr.openjdk.java.net/~ikrylov/8147844.hs.04/diff.txt >> >> Thanks, >> >> Ivan >> >> On 12/02/2016 06:01, Paul Sandoz wrote: >> >>> On 12 Feb 2016, at 00:55, Ivan Krylov >>> > wrote: >>>> >>>> Hi Igor, >>>> >>>> Thanks both for your help and your reviews. >>>> Here is a new version, tested on mac for c1 and c2: >>>> >>>> http://cr.openjdk.java.net/~ikrylov/8147844.hs.03 >>>> >>>> Now that support C1 is supported should the test be updated with C1 >>> only execution? >>> >>> Paul. >>> >> >> > > -- > Sent from my phone > > > -- Sent from my phone -------------- next part -------------- An HTML attachment was scrubbed... URL: From dean.long at oracle.com Thu Mar 10 04:05:20 2016 From: dean.long at oracle.com (Dean Long) Date: Wed, 9 Mar 2016 20:05:20 -0800 Subject: [9] RFR (XS): 8150320: C1: Illegal bci in debug info for MH::linkTo* methods In-Reply-To: <56E0206C.9090601@oracle.com> References: <56E0206C.9090601@oracle.com> Message-ID: <56E0F280.4020901@oracle.com> Looks good. dl On 3/9/2016 5:09 AM, Vladimir Ivanov wrote: > http://cr.openjdk.java.net/~vlivanov/8150320/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8150320 > > C1 doesn't attach any bci info to TypeCast nodes which narrow receiver > and argument types when inlining through MH::linkTo* happens. > > The fix is to switch from parse-only to complete JVM state. > > No test case provided since it is hard to trigger the problem in a > unit test. > > Testing: verified that the assert doesn't fire anymore w/ a > long-running javac. > > Thanks! 
> > Best regards, > Vladimir Ivanov From vladimir.x.ivanov at oracle.com Thu Mar 10 08:24:21 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 10 Mar 2016 11:24:21 +0300 Subject: [9] RFR (XS): 8150320: C1: Illegal bci in debug info for MH::linkTo* methods In-Reply-To: <56E0F280.4020901@oracle.com> References: <56E0206C.9090601@oracle.com> <56E0F280.4020901@oracle.com> Message-ID: <56E12F35.3010604@oracle.com> Thanks, Dean. Best regards, Vladimir Ivanov On 3/10/16 7:05 AM, Dean Long wrote: > Looks good. > > dl > > On 3/9/2016 5:09 AM, Vladimir Ivanov wrote: >> http://cr.openjdk.java.net/~vlivanov/8150320/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8150320 >> >> C1 doesn't attach any bci info to TypeCast nodes which narrow receiver >> and argument types when inlining through MH::linkTo* happens. >> >> The fix is to switch from parse-only to complete JVM state. >> >> No test case provided since it is hard to trigger the problem in a >> unit test. >> >> Testing: verified that the assert doesn't fire anymore w/ a >> long-running javac. >> >> Thanks! >> >> Best regards, >> Vladimir Ivanov > From vladimir.x.ivanov at oracle.com Thu Mar 10 09:54:25 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 10 Mar 2016 12:54:25 +0300 Subject: RFR: 8151470: [JVMCI] remove up-call to HotSpotJVMCICompilerConfig.selectCompiler In-Reply-To: References: Message-ID: <56E14451.9080304@oracle.com> Looks good. Best regards, Vladimir Ivanov On 3/9/16 2:05 AM, Christian Thalinger wrote: > https://bugs.openjdk.java.net/browse/JDK-8151470 > http://cr.openjdk.java.net/~twisti/8151470/webrev.01/ > > The reason why it was done this way is to use a trusted system property value to select the compiler. We can achieve the same by using VM.getSavedProperty. > > This patch changes the system property name from "jvmci.compiler" to "jvmci.Compiler" as it's using an Option now. 
> > As discussed with Doug I also got rid of some property file parsing code that we don?t need right now. > From doug.simon at oracle.com Thu Mar 10 09:55:44 2016 From: doug.simon at oracle.com (Doug Simon) Date: Thu, 10 Mar 2016 10:55:44 +0100 Subject: RFR: 8151470: [JVMCI] remove up-call to HotSpotJVMCICompilerConfig.selectCompiler In-Reply-To: References: Message-ID: <50F67766-AC17-4AEC-AFCC-834579B96959@oracle.com> For more laziness, in HotSpotJVMCICompilerConfig, can you please move all the factory initialization logic into getCompilerFactory. > On 09 Mar 2016, at 00:05, Christian Thalinger wrote: > > https://bugs.openjdk.java.net/browse/JDK-8151470 > http://cr.openjdk.java.net/~twisti/8151470/webrev.01/ > > The reason why it was done this way is to use a trusted system property value to select the compiler. We can achieve the same by using VM.getSavedProperty. > > This patch changes the system property name from ?jvmci.compiler? to ?jvmci.Compiler? as it?s using an Option now. > > As discussed with Doug I also got rid of some property file parsing code that we don?t need right now. From vladimir.x.ivanov at oracle.com Thu Mar 10 14:02:36 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 10 Mar 2016 17:02:36 +0300 Subject: [9] RFR (S): 8141420: Compiler runtime entries don't hold Klass* from being GCed Message-ID: <56E17E7C.3010304@oracle.com> http://cr.openjdk.java.net/~vlivanov/8141420/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8141420 Though compiler runtime entries use raw Klass*, they don't ensure the classes can't be unloaded. It causes rare crashes when Full GC and class unloading happens when freshly loaded class is being constructed and the only live reference to it is the Klass* passed into the runtime call. There are KlassHandles/instanceKlassHandles, but they don't do anything after PermGen was removed. The fix is to add mirror handles to keep classes alive across safepoints during the runtime calls. 
FTR handles aren't needed for primitive arrays. I chose the conservative fix, since I plan to backport it into 8u. Filed JDK-8141420 [1] to refactor the code to use mirrors instead. It should simplify the logic to track class liveness. No regression test provided, since I wasn't able to write one w/o instrumenting the JVM. Testing: manual (instrumented build which triggers class unloading from runtime entries), JPRT. Thanks! Best regards, Vladimir Ivanov [1] https://bugs.openjdk.java.net/browse/JDK-8141420 From christian.thalinger at oracle.com Thu Mar 10 17:03:19 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 10 Mar 2016 07:03:19 -1000 Subject: RFR: 8151470: [JVMCI] remove up-call to HotSpotJVMCICompilerConfig.selectCompiler In-Reply-To: <50F67766-AC17-4AEC-AFCC-834579B96959@oracle.com> References: <50F67766-AC17-4AEC-AFCC-834579B96959@oracle.com> Message-ID: <414EA983-8D23-4D16-903C-5671B08F5586@oracle.com> > On Mar 9, 2016, at 11:55 PM, Doug Simon wrote: > > For more laziness, in HotSpotJVMCICompilerConfig, can you please move all the factory initialization logic into getCompilerFactory. I can: http://cr.openjdk.java.net/~twisti/8151470/webrev.02/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCICompilerConfig.java.udiff.html > > >> On 09 Mar 2016, at 00:05, Christian Thalinger wrote: >> >> https://bugs.openjdk.java.net/browse/JDK-8151470 >> http://cr.openjdk.java.net/~twisti/8151470/webrev.01/ >> >> The reason why it was done this way is to use a trusted system property value to select the compiler. We can achieve the same by using VM.getSavedProperty. >> >> This patch changes the system property name from ?jvmci.compiler? to ?jvmci.Compiler? as it?s using an Option now. >> >> As discussed with Doug I also got rid of some property file parsing code that we don?t need right now. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From christian.thalinger at oracle.com Thu Mar 10 17:11:58 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 10 Mar 2016 07:11:58 -1000 Subject: [9] RFR (S): 8141420: Compiler runtime entries don't hold Klass* from being GCed In-Reply-To: <56E17E7C.3010304@oracle.com> References: <56E17E7C.3010304@oracle.com> Message-ID: <11092161-DC6B-4CB9-9C11-5D21CE78FD58@oracle.com> Looks good. > On Mar 10, 2016, at 4:02 AM, Vladimir Ivanov wrote: > > http://cr.openjdk.java.net/~vlivanov/8141420/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8141420 > > Though compiler runtime entries use raw Klass*, they don't ensure the classes can't be unloaded. It causes rare crashes when Full GC and class unloading happens when freshly loaded class is being constructed and the only live reference to it is the Klass* passed into the runtime call. > > There are KlassHandles/instanceKlassHandles, but they don't do anything after PermGen was removed. > > The fix is to add mirror handles to keep classes alive across safepoints during the runtime calls. FTR handles aren't needed for primitive arrays. > > I chose the conservative fix, since I plan to backport it into 8u. Filed JDK-8141420 [1] to refactor the code to use mirrors instead. It should simplify the logic to track class liveness. > > No regression test provided, since I wasn't able to write one w/o instrumenting the JVM. > > Testing: manual (instrumented build which triggers class unloading from runtime entries), JPRT. > > Thanks! 
> > Best regards, > Vladimir Ivanov > > [1] https://bugs.openjdk.java.net/browse/JDK-8141420 From paul.sandoz at oracle.com Thu Mar 10 17:45:08 2016 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Thu, 10 Mar 2016 18:45:08 +0100 Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the corresponding x86 hotspot instrinsic In-Reply-To: <56E0D64F.1040909@azulsystems.com> References: <56A751AE.9090203@azulsystems.com> <45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com> <56A8BC9D.8060004@azulsystems.com> <6148E4D7-AF5E-4094-B363-52E0D83452E9@oracle.com> <56AA2AE4.2090803@azulsystems.com> <2538083C-7906-44AA-A074-7DBF5F2D8654@oracle.com> <50C14C66-4068-4DD7-BD94-96E37F7C9B0A@oracle.com> <56AF85F3.3060802@azulsystems.com> <56BBCBF4.2070504@azulsystems.com> <56BD1F7F.3020808@azulsystems.com> <56E0B770.1@azulsystems.com> <56E0D64F.1040909@azulsystems.com> Message-ID: <92D7C721-C784-469C-86F4-91A1A7E5E377@oracle.com> > On 10 Mar 2016, at 03:05, Ivan Krylov wrote: > > Merely for consistency with the surrounding code. I wouldn't write it that way but I have to follow the practice there. > See http://hg.openjdk.java.net/jdk9/jdk9/hotspot/file/7e7e50ac4faf/src/cpu/x86/vm/x86.ad > > Thanks, Igor! I will ask someone to push the change in once the JEP becomes targeted. > Because there is also an API change associated with this we have to go through an internal process (referred to as CCC) to track the API change and review the compatibility/design. It should just be a formality for this API. If you are happy with: http://cr.openjdk.java.net/~ikrylov/8147844.jdk.03/src/java.base/share/classes/java/lang/Thread.java.sdiff.html i can do this tomorrow, but i have one suggestion: use @apiNote for the example (i know i am hi-jacking this thread here and not using the core-libs thread, but it?s just too convenient to write this here!). Once approved we can then push all changes to hs-comp. 
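For reference, the kind of @apiNote example mentioned by Paul would look roughly like the following (a sketch only; the final javadoc wording was still under review at this point):

```java
// A consumer spins on a volatile flag and hints to the runtime that it
// is busy-waiting. On x86 the intrinsic can compile to a PAUSE
// instruction; on CPUs without support it is simply a no-op.
public class SpinWaitExample {
    private volatile boolean dataReady = false;

    static boolean demo() {
        SpinWaitExample ex = new SpinWaitExample();
        Thread consumer = new Thread(() -> {
            while (!ex.dataReady) {
                Thread.onSpinWait();   // JDK 9+ spin-wait hint
            }
        });
        consumer.start();
        ex.dataReady = true;           // producer publishes via a volatile write
        try {
            consumer.join(10_000);
        } catch (InterruptedException e) {
            return false;
        }
        return !consumer.isAlive();    // true once the consumer saw the flag
    }

    public static void main(String[] args) {
        if (!demo()) {
            throw new AssertionError("consumer never observed the flag");
        }
    }
}
```

Because the hint has no semantic effect, callers lose nothing on platforms where VM_Version::supports_on_spin_wait() is false.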
> Ivan > > On 09/03/2016 17:39, Vitaly Davidovich wrote: >> Minor quibble - in x86.ad : >> if (VM_Version::supports_on_spin_wait() == false) >> Any particular reason it's using == false form instead of !supports_on_spin_wait? >> >> On Wednesday, March 9, 2016, Ivan Krylov < ivan at azulsystems.com > wrote: >> Paul, Indeed, thanks. I have modified the test. Ok, looks good. Paul. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 841 bytes Desc: Message signed with OpenPGP using GPGMail URL: From christian.thalinger at oracle.com Thu Mar 10 18:11:34 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 10 Mar 2016 08:11:34 -1000 Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the corresponding x86 hotspot instrinsic In-Reply-To: <56E0B770.1@azulsystems.com> References: <56A751AE.9090203@azulsystems.com> <45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com> <56A8BC9D.8060004@azulsystems.com> <6148E4D7-AF5E-4094-B363-52E0D83452E9@oracle.com> <56AA2AE4.2090803@azulsystems.com> <2538083C-7906-44AA-A074-7DBF5F2D8654@oracle.com> <50C14C66-4068-4DD7-BD94-96E37F7C9B0A@oracle.com> <56AF85F3.3060802@azulsystems.com> <56BBCBF4.2070504@azulsystems.com> <56BD1F7F.3020808@azulsystems.com> <56E0B770.1@azulsystems.com> Message-ID: <641679F3-C308-4F3B-B0EE-CAD4FD9596C9@oracle.com> src/share/vm/runtime/vm_version.hpp: + // Does this compiler support intrinsification + // of java.lang.Thread.onSpinWait() + static bool supports_on_spin_wait() { return false; } This comment seems wrong. > On Mar 9, 2016, at 1:53 PM, Ivan Krylov wrote: > > Paul, Indeed, thanks. I have modified the test. > I also made changes to reflect the fact that onSpinWait is now decided to be placed into j.l.Thread. 
> > Igor, > This is a new webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.04/ > This is the diff between previous and this patches (03 vs 04): > http://cr.openjdk.java.net/~ikrylov/8147844.hs.04/diff.txt > > Thanks, > > Ivan > > On 12/02/2016 06:01, Paul Sandoz wrote: >>> On 12 Feb 2016, at 00:55, Ivan Krylov wrote: >>> >>> Hi Igor, >>> >>> Thanks both for your help and your reviews. >>> Here is a new version, tested on mac for c1 and c2: >>> >>> http://cr.openjdk.java.net/~ikrylov/8147844.hs.03 >>> >> Now that support C1 is supported should the test be updated with C1 only execution? >> >> Paul. > From ivan at azulsystems.com Thu Mar 10 19:09:36 2016 From: ivan at azulsystems.com (Ivan Krylov) Date: Thu, 10 Mar 2016 11:09:36 -0800 Subject: RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the corresponding x86 hotspot instrinsic In-Reply-To: <641679F3-C308-4F3B-B0EE-CAD4FD9596C9@oracle.com> References: <56A751AE.9090203@azulsystems.com> <45B4730C-CCC2-4523-ACD1-D18B20E5EC5F@oracle.com> <56A8BC9D.8060004@azulsystems.com> <6148E4D7-AF5E-4094-B363-52E0D83452E9@oracle.com> <56AA2AE4.2090803@azulsystems.com> <2538083C-7906-44AA-A074-7DBF5F2D8654@oracle.com> <50C14C66-4068-4DD7-BD94-96E37F7C9B0A@oracle.com> <56AF85F3.3060802@azulsystems.com> <56BBCBF4.2070504@azulsystems.com> <56BD1F7F.3020808@azulsystems.com> <56E0B770.1@azulsystems.com> <641679F3-C308-4F3B-B0EE-CAD4FD9596C9@oracle.com> Message-ID: <56E1C670.40403@azulsystems.com> On 10/03/2016 10:11, Christian Thalinger wrote: > src/share/vm/runtime/vm_version.hpp: > > + // Does this compiler support intrinsification > + // of java.lang.Thread.onSpinWait() > + static bool supports_on_spin_wait() { return false; } > > This comment seems wrong. Fixed with // Does this CPU support spin wait instruction? Thx. > >> On Mar 9, 2016, at 1:53 PM, Ivan Krylov wrote: >> >> Paul, Indeed, thanks. I have modified the test. 
>> I also made changes to reflect the fact that onSpinWait is now decided to be placed into j.l.Thread. >> >> Igor, >> This is a new webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.04/ >> This is the diff between previous and this patches (03 vs 04): >> http://cr.openjdk.java.net/~ikrylov/8147844.hs.04/diff.txt >> >> Thanks, >> >> Ivan >> >> On 12/02/2016 06:01, Paul Sandoz wrote: >>>> On 12 Feb 2016, at 00:55, Ivan Krylov wrote: >>>> >>>> Hi Igor, >>>> >>>> Thanks both for your help and your reviews. >>>> Here is a new version, tested on mac for c1 and c2: >>>> >>>> http://cr.openjdk.java.net/~ikrylov/8147844.hs.03 >>>> >>> Now that support C1 is supported should the test be updated with C1 only execution? >>> >>> Paul. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Thu Mar 10 19:12:41 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 10 Mar 2016 11:12:41 -0800 Subject: [9] RFR (XS): 8150320: C1: Illegal bci in debug info for MH::linkTo* methods In-Reply-To: <56E0206C.9090601@oracle.com> References: <56E0206C.9090601@oracle.com> Message-ID: <56E1C729.8040103@oracle.com> Good. Thanks, Vladimir K On 3/9/16 5:09 AM, Vladimir Ivanov wrote: > http://cr.openjdk.java.net/~vlivanov/8150320/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8150320 > > C1 doesn't attach any bci info to TypeCast nodes which narrow receiver > and argument types when inlining through MH::linkTo* happens. > > The fix is to switch from parse-only to complete JVM state. > > No test case provided since it is hard to trigger the problem in a unit > test. > > Testing: verified that the assert doesn't fire anymore w/ a long-running > javac. > > Thanks! 
> > Best regards, > Vladimir Ivanov From doug.simon at oracle.com Thu Mar 10 19:19:37 2016 From: doug.simon at oracle.com (Doug Simon) Date: Thu, 10 Mar 2016 20:19:37 +0100 Subject: RFR: 8151470: [JVMCI] remove up-call to HotSpotJVMCICompilerConfig.selectCompiler In-Reply-To: <414EA983-8D23-4D16-903C-5671B08F5586@oracle.com> References: <50F67766-AC17-4AEC-AFCC-834579B96959@oracle.com> <414EA983-8D23-4D16-903C-5671B08F5586@oracle.com> Message-ID: <630CE811-8A0A-4549-A688-2E1716B1E4B2@oracle.com> Ok, looks good. > On 10 Mar 2016, at 18:03, Christian Thalinger wrote: > > >> On Mar 9, 2016, at 11:55 PM, Doug Simon wrote: >> >> For more laziness, in HotSpotJVMCICompilerConfig, can you please move all the factory initialization logic into getCompilerFactory. > > I can: > > http://cr.openjdk.java.net/~twisti/8151470/webrev.02/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotJVMCICompilerConfig.java.udiff.html > >> >> >>> On 09 Mar 2016, at 00:05, Christian Thalinger wrote: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8151470 >>> http://cr.openjdk.java.net/~twisti/8151470/webrev.01/ >>> >>> The reason why it was done this way is to use a trusted system property value to select the compiler. We can achieve the same by using VM.getSavedProperty. >>> >>> This patch changes the system property name from ?jvmci.compiler? to ?jvmci.Compiler? as it?s using an Option now. >>> >>> As discussed with Doug I also got rid of some property file parsing code that we don?t need right now. 
>> > From christian.thalinger at oracle.com Thu Mar 10 20:58:43 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 10 Mar 2016 10:58:43 -1000 Subject: RFR: 8151266: HotSpotResolvedJavaFieldImpl::isStable() does not work as expected In-Reply-To: <374A0174-22D8-47F3-BA6C-1865AD16AEBE@oracle.com> References: <681386EA-5DC4-4664-B80B-D64CEBF22A89@oracle.com> <00AA7F89-AD30-4CC8-9D14-EFA27821AA63@oracle.com> <374A0174-22D8-47F3-BA6C-1865AD16AEBE@oracle.com> Message-ID: <460637D6-5886-4B09-B788-BC1B731CED89@oracle.com> Damn. I forgot to hg add the test: http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/3a1f495e37b3 That?s why you should send exported changesets :-) > On Mar 7, 2016, at 3:20 PM, Christian Thalinger wrote: > > >> On Mar 7, 2016, at 11:03 AM, Doug Simon wrote: >> >> I changed the webrev to use HotSpotResolvedObjectTypeImpl instead of making a (new) VM call and also added the test provided in the bug report. > > That?s much better than I expected. Looks good. > >> >> -Doug >> >>> On 07 Mar 2016, at 19:13, Christian Thalinger wrote: >>> >>> >>>> On Mar 5, 2016, at 1:45 AM, Doug Simon wrote: >>>> >>>> Please review this small change that makes HotSpotResolvedJavaFieldImpl.isStable() return the right value for HotSpotResolvedJavaFieldImpl objects created from java.lang.reflect.Field objects. The problem was that HotSpotResolvedJavaField objects created from reflection objects didn?t get the VM internal modifier flags: >>>> >>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/0adf6c8c7223/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotMetaAccessProvider.java#l117 >>>> >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8151266 >>> >>> There is a test attached to the bug. We should add it. >>> >>>> http://cr.openjdk.java.net/~dnsimon/8151266/ >>>> >>>> -Doug >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From christian.thalinger at oracle.com Thu Mar 10 21:03:21 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 10 Mar 2016 11:03:21 -1000 Subject: RFR: 8151266: HotSpotResolvedJavaFieldImpl::isStable() does not work as expected In-Reply-To: <460637D6-5886-4B09-B788-BC1B731CED89@oracle.com> References: <681386EA-5DC4-4664-B80B-D64CEBF22A89@oracle.com> <00AA7F89-AD30-4CC8-9D14-EFA27821AA63@oracle.com> <374A0174-22D8-47F3-BA6C-1865AD16AEBE@oracle.com> <460637D6-5886-4B09-B788-BC1B731CED89@oracle.com> Message-ID: <9C9D771C-340E-4A56-9D3D-4BF027ADFAD9@oracle.com> https://bugs.openjdk.java.net/browse/JDK-8151664 I will just push it since it was reviewed already. > On Mar 10, 2016, at 10:58 AM, Christian Thalinger wrote: > > Damn. I forgot to hg add the test: > > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/3a1f495e37b3 > > That?s why you should send exported changesets :-) > >> On Mar 7, 2016, at 3:20 PM, Christian Thalinger > wrote: >> >> >>> On Mar 7, 2016, at 11:03 AM, Doug Simon > wrote: >>> >>> I changed the webrev to use HotSpotResolvedObjectTypeImpl instead of making a (new) VM call and also added the test provided in the bug report. >> >> That?s much better than I expected. Looks good. >> >>> >>> -Doug >>> >>>> On 07 Mar 2016, at 19:13, Christian Thalinger > wrote: >>>> >>>> >>>>> On Mar 5, 2016, at 1:45 AM, Doug Simon > wrote: >>>>> >>>>> Please review this small change that makes HotSpotResolvedJavaFieldImpl.isStable() return the right value for HotSpotResolvedJavaFieldImpl objects created from java.lang.reflect.Field objects. 
The problem was that HotSpotResolvedJavaField objects created from reflection objects didn?t get the VM internal modifier flags: >>>>> >>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/0adf6c8c7223/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotMetaAccessProvider.java#l117 >>>>> >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8151266 >>>> >>>> There is a test attached to the bug. We should add it. >>>> >>>>> http://cr.openjdk.java.net/~dnsimon/8151266/ >>>>> >>>>> -Doug >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.thalinger at oracle.com Thu Mar 10 22:04:55 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 10 Mar 2016 12:04:55 -1000 Subject: RFR: 8151266: HotSpotResolvedJavaFieldImpl::isStable() does not work as expected In-Reply-To: <9C9D771C-340E-4A56-9D3D-4BF027ADFAD9@oracle.com> References: <681386EA-5DC4-4664-B80B-D64CEBF22A89@oracle.com> <00AA7F89-AD30-4CC8-9D14-EFA27821AA63@oracle.com> <374A0174-22D8-47F3-BA6C-1865AD16AEBE@oracle.com> <460637D6-5886-4B09-B788-BC1B731CED89@oracle.com> <9C9D771C-340E-4A56-9D3D-4BF027ADFAD9@oracle.com> Message-ID: > On Mar 10, 2016, at 11:03 AM, Christian Thalinger wrote: > > https://bugs.openjdk.java.net/browse/JDK-8151664 > > I will just push it since it was reviewed already. Turns out there was a missing jtreg tag: * @requires (os.simpleArch == "x64" | os.simpleArch == "sparcv9" | os.simpleArch == "aarch64") > >> On Mar 10, 2016, at 10:58 AM, Christian Thalinger > wrote: >> >> Damn. 
I forgot to hg add the test: >> >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/3a1f495e37b3 >> >> That?s why you should send exported changesets :-) >> >>> On Mar 7, 2016, at 3:20 PM, Christian Thalinger > wrote: >>> >>> >>>> On Mar 7, 2016, at 11:03 AM, Doug Simon > wrote: >>>> >>>> I changed the webrev to use HotSpotResolvedObjectTypeImpl instead of making a (new) VM call and also added the test provided in the bug report. >>> >>> That?s much better than I expected. Looks good. >>> >>>> >>>> -Doug >>>> >>>>> On 07 Mar 2016, at 19:13, Christian Thalinger > wrote: >>>>> >>>>> >>>>>> On Mar 5, 2016, at 1:45 AM, Doug Simon > wrote: >>>>>> >>>>>> Please review this small change that makes HotSpotResolvedJavaFieldImpl.isStable() return the right value for HotSpotResolvedJavaFieldImpl objects created from java.lang.reflect.Field objects. The problem was that HotSpotResolvedJavaField objects created from reflection objects didn?t get the VM internal modifier flags: >>>>>> >>>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/0adf6c8c7223/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotMetaAccessProvider.java#l117 >>>>>> >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8151266 >>>>> >>>>> There is a test attached to the bug. We should add it. >>>>> >>>>>> http://cr.openjdk.java.net/~dnsimon/8151266/ >>>>>> >>>>>> -Doug >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.x.ivanov at oracle.com Fri Mar 11 08:08:14 2016 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 11 Mar 2016 11:08:14 +0300 Subject: [9] RFR (XS): 8150320: C1: Illegal bci in debug info for MH::linkTo* methods In-Reply-To: <56E1C729.8040103@oracle.com> References: <56E0206C.9090601@oracle.com> <56E1C729.8040103@oracle.com> Message-ID: <56E27CEE.9080807@oracle.com> Thanks, Vladimir. Best regards, Vladimir Ivanov On 3/10/16 10:12 PM, Vladimir Kozlov wrote: > Good. 
> > Thanks, > Vladimir K > > On 3/9/16 5:09 AM, Vladimir Ivanov wrote: >> http://cr.openjdk.java.net/~vlivanov/8150320/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8150320 >> >> C1 doesn't attach any bci info to TypeCast nodes which narrow receiver >> and argument types when inlining through MH::linkTo* happens. >> >> The fix is to switch from parse-only to complete JVM state. >> >> No test case provided since it is hard to trigger the problem in a unit >> test. >> >> Testing: verified that the assert doesn't fire anymore w/ a long-running >> javac. >> >> Thanks! >> >> Best regards, >> Vladimir Ivanov From filipp.zhinkin at gmail.com Fri Mar 11 11:42:40 2016 From: filipp.zhinkin at gmail.com (Filipp Zhinkin) Date: Fri, 11 Mar 2016 14:42:40 +0300 Subject: RFR (L): 8149374: Replace C1-specific collection classes with universal collection classes Message-ID: Hi all, please review a fix for JDK-8149374: Webrev: http://cr.openjdk.java.net/~fzhinkin/8149374/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8149374 Testing done: hotspot_all tests + CTW I've replaced all usages of collections defined via define_array and define_stack macros with GrowableArray. There is good and bad news regarding the performance impact of that change. Unfortunately, C1 compilation time for the CTW scenario w/ release bits increased from 51.07±0.28s to 52.99±0.23s (about 3.5%). This difference is caused by eager initialization of GrowableArray's backing array elements [1]. I can imagine cases where we actually need to force initialization and de-initialization during the array's growing/destruction, but for some types, like C++ primitive types or pointers, such initialization does not make much sense, because GrowableArray does not allow access to an element that was not explicitly placed in it. And since GrowableArray is most widely used to store pointers, we're simply wasting time on initialization.
I've measured CTW time with the following workaround, which implements initialization for numeric types and pointers as a no-op, and C1 compilation time returned to the values measured before the original change (51.06±0.24s): http://cr.openjdk.java.net/~fzhinkin/growableArrayInitialization/webrev/ I've also measured C2 compilation time and it dropped by a few seconds too: 1138±9s w/o GrowableArray's change and 1132±5s w/ it. Summing up: I guess we should avoid GrowableArray's backing array initialization for some types, shouldn't we? Best regards, Filipp [1] http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/323b8370b0f6/src/share/vm/utilities/growableArray.hpp#l165 From nils.eliasson at oracle.com Fri Mar 11 13:49:32 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 11 Mar 2016 14:49:32 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: <56CF175E.1030806@oracle.com> References: <56CF175E.1030806@oracle.com> Message-ID: <56E2CCEC.7010401@oracle.com> Hi, I got a failure during testing and had to fix that. The additional change is very small.

Problem: Running embedded server profiles with -XX:+AggressiveOpts can cause compilations to trigger (many Integers in the cache...) before the compile broker is initialized.

What this code at the start of compile_method doesn't tell you is that the NULL check guards against compile requests arriving before compilation_init is called:

  // lock, make sure that the compilation
  // isn't prohibited in a straightforward way.
  AbstractCompiler *comp = CompileBroker::compiler(comp_level);
  if (comp == NULL || !comp->can_compile_method(method) ||
      compilation_is_prohibited(method, osr_bci, comp_level)) {
    return NULL;
  }

The compile_method wrapper we introduced used "CompileBroker::compiler(comp_level)" but didn't guard for NULL.
Solution: There was already a check on CompileBroker::_initialized in the code - a bit late though - so it was never executed. I hoisted it from compile_method_base to the beginning of compile_method and changed the comp==NULL check to an assert.

All new changes are in CompileBroker.cpp. Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.07

Regards, Nils Eliasson

On 2016-02-25 16:01, Nils Eliasson wrote: > Hi, > > Please review this change that adds support for blocking compiles in > the whitebox API. This enables simpler, less time-consuming tests. > > Motivation: > * -XX:-BackgroundCompilation is a global flag and can be time consuming > * Blocking compiles removes the need for waiting on the compile queue > to complete > * Compiles put in the queue may be evicted if the queue grows too big - > causing indeterminism in the test > * Less VM-flags allows for more tests in the same VM > > Testing: > Posting a separate RFR for test fix that uses this change. They will > be pushed at the same time. > > RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 > JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ > Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ > > Best regards, > Nils Eliasson -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.hartmann at oracle.com Fri Mar 11 15:40:04 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 11 Mar 2016 16:40:04 +0100 Subject: [9] RFR(S): 8150804: C2 Compilation fails with assert(_base >= OopPtr && _base <= AryPtr) failed: Not a Java pointer Message-ID: <56E2E6D4.4040308@oracle.com> Hi, please review the following patch. https://bugs.openjdk.java.net/browse/JDK-8150804 http://cr.openjdk.java.net/~thartmann/8150804/webrev.00/ We fail in Compile::Process_OopMap_Node() while processing the monitors of a safepoint node because the monitor object is TOP. The crash is rare but reproduces with my regression test.
The problem is the elimination of Phi nodes with a unique input, which was broken by the fixes for JDK-8139771 [1] and JDK-8146999 [2]. Here are the details (for context, see 'TestPhiElimination.java'):

A::get() is inlined into test(obj), producing the following graph:

       Parm (obj)   TestPhiElimination
           |
         CastPP     TestPhiElimination:NotNull
           |
      CheckCastPP   A:NotNull
         /    \
  CheckCastPP  |
   A:NotNull   |
         \    /
          Phi       A
           |
       Safepoint

https://bugs.openjdk.java.net/secure/attachment/57820/before_ideal.png

PhiNode::ideal() then replaces the Phi by a CheckCastPP because it has a unique input (see PhiNode::unique_input()):

       Parm (obj)   TestPhiElimination
           |
      CheckCastPP   A
           |
       Safepoint

https://bugs.openjdk.java.net/secure/attachment/57821/after_ideal.png

We completely lose the NotNull information provided by the CastPP. Therefore, we cannot prove that obj != null when accessing a field of obj and add an uncommon trap. Obj is also used as a monitor (A::get() is synchronized) and set to TOP in the uncommon trap branch. We are never able to prove that the null branch is not reachable and later fail when emitting code in Process_OopMap_Node because the monitor object is still TOP.

Before the fix for JDK-8139771, we had a check to verify that the type of the unique (uncasted) input is "at least as good" as the type of the PhiNode:

  phase->type(uncasted_input)->higher_equal(type())

http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/9e17d9e4b59f#l4.79

Re-adding this check fixes the problem. However, I'm concerned that this check is not strong enough. For example, in the case where the type of the PhiNode is Object:

       Parm (obj)   TestPhiElimination
           |
         CastPP     TestPhiElimination:NotNull
           |
      CheckCastPP   A:NotNull
         /    \
  CheckCastPP  |
   A:NotNull   |
         \    /
          Phi       Object

We would still replace the Phi because TestPhiElimination->higher_equal(Object), and again lose the NotNull information. I therefore added a slightly stronger check that also checks the types in-between. I had to remove the assert that Roland added.
What do you think?

Thanks, Tobias

[1] https://bugs.openjdk.java.net/browse/JDK-8139771 [2] https://bugs.openjdk.java.net/browse/JDK-8146999 From volker.simonis at gmail.com Fri Mar 11 16:24:40 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 11 Mar 2016 17:24:40 +0100 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: <56E2CCEC.7010401@oracle.com> References: <56CF175E.1030806@oracle.com> <56E2CCEC.7010401@oracle.com> Message-ID: Hi Nils, good catch! Your changes look good. Thumbs up from me. The only minor comment is the indentation mismatch I can see here:

! if (comp == NULL || !comp->can_compile_method(method) ||
!     compilation_is_prohibited(method, osr_bci, comp_level)) {
! if (!comp->can_compile_method(method) ||
!     compilation_is_prohibited(method, osr_bci, comp_level, directive->ExcludeOption)) {
      return NULL;

But that could also be a diff artifact. Regards, Volker

On Fri, Mar 11, 2016 at 2:49 PM, Nils Eliasson wrote: > Hi, > > I got a failure during testing and had to fix that. The additional change > is very small. > > Problem: > Running embedded server profiles with -XX:+AggressiveOpts can cause > compilations to trigger (many Integers in the cache...) before the compile > broker is initialized. > > What this code at the start of compile_method doesn't tell you - is that > the NULL check guards from compile requests before the compilation_init is > called. > > // lock, make sure that the compilation > // isn't prohibited in a straightforward way. > AbstractCompiler *comp = CompileBroker::compiler(comp_level); > if (comp == NULL || !comp->can_compile_method(method) || >     compilation_is_prohibited(method, osr_bci, comp_level)) { >   return NULL; > } > > The compile_method wrapper we introduced used > "CompileBroker::compiler(comp_level)" but didn't guard for NULL.
> > Solution: > There was already a check on CompileBroker::_initalized in the code - a > bit late though - so it was never executed. I hoisted it from > compile_method_base to the beginning of compile_method and check the check > comp==NULL check to an assert. > > All new changes in CompileBroker.cpp: > Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.07 > > Regards, > Nils Eliasson > > > > On 2016-02-25 16:01, Nils Eliasson wrote: > > Hi, > > Please review this change that adds support for blocking compiles in the > whitebox API. This enables simpler less time consuming tests. > > Motivation: > * -XX:-BackgroundCompilation is a global flag and can be time consuming > * Blocking compiles removes the need for waiting on the compile queue to > complete > * Compiles put in the queue may be evicted if the queue grows to big - > causing indeterminism in the test > * Less VM-flags allows for more tests in the same VM > > Testing: > Posting a separate RFR for test fix that uses this change. They will be > pushed at the same time. > > RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 > JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ > Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ > > Best regards, > Nils Eliasson > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Sat Mar 12 00:29:52 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 11 Mar 2016 16:29:52 -0800 Subject: RFR(S/M): 8150646: Add support for blocking compiles through whitebox API In-Reply-To: <56E2CCEC.7010401@oracle.com> References: <56CF175E.1030806@oracle.com> <56E2CCEC.7010401@oracle.com> Message-ID: <56E36300.1080704@oracle.com> Looks good. Thanks, Vladimir On 3/11/16 5:49 AM, Nils Eliasson wrote: > Hi, > > I got a failure during testing and had to fix that. The additional > change is very small. 
> > Problem: > Running embedded server profiles with -XX:+AggressiveOpts can cause > compilations to trigger (many Integers in the cache...) before the > compile broker is initialized. > > What this code at the start of compile_method doesn't tell you - is that > the NULL check guards from compile requests before the compilation_init > is called. > > // lock, make sure that the compilation > 1052 // isn't prohibited in a straightforward way. > 1053 AbstractCompiler *comp = CompileBroker::compiler(comp_level); > 1054 if (comp == NULL || !comp->can_compile_method(method) || > 1055 compilation_is_prohibited(method, osr_bci, comp_level)) { > 1056 return NULL; > 1057 } > > The compile_method wrapper we introduced used > "CompileBroker::compiler(comp_level)" but didn't guard for NULL. > > Solution: > There was already a check on CompileBroker::_initalized in the code - a > bit late though - so it was never executed. I hoisted it from > compile_method_base to the beginning of compile_method and check the > check comp==NULL check to an assert. > > All new changes in CompileBroker.cpp: > Webrev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.07 > > Regards, > Nils Eliasson > > > On 2016-02-25 16:01, Nils Eliasson wrote: >> Hi, >> >> Please review this change that adds support for blocking compiles in >> the whitebox API. This enables simpler less time consuming tests. >> >> Motivation: >> * -XX:-BackgroundCompilation is a global flag and can be time consuming >> * Blocking compiles removes the need for waiting on the compile queue >> to complete >> * Compiles put in the queue may be evicted if the queue grows to big - >> causing indeterminism in the test >> * Less VM-flags allows for more tests in the same VM >> >> Testing: >> Posting a separate RFR for test fix that uses this change. They will >> be pushed at the same time. 
>> >> RFE: https://bugs.openjdk.java.net/browse/JDK-8150646 >> JDK rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev_jdk.01/ >> Hotspot rev: http://cr.openjdk.java.net/~neliasso/8150646/webrev.02/ >> >> Best regards, >> Nils Eliasson > From uschindler at apache.org Sat Mar 12 11:43:20 2016 From: uschindler at apache.org (Uwe Schindler) Date: Sat, 12 Mar 2016 12:43:20 +0100 Subject: JDK 9 build 109 -> Lucene's Ant build works again; still missing Hotspot patches Message-ID: <002d01d17c54$62e8dc20$28ba9460$@apache.org> Hi, I just wanted to inform you that the first Lucene tests with build 109 of JDK were working with Ant and Ivy (both on Linux and Windows including whitespace in build directory), so the Multi-Release JAR file fix did not break the build system anymore. Many thanks to Steve Drach and Alan Bateman for helping! I have seen the follow-up issue about the "#release" fragment, so I am looking forward to better compatibility with the new URL schemes and existing code. I will try to maybe write some tests this weekend to help with that. Nevertheless, build 109 does not contain (according to the changelog) fixes for JDK-8150436 (still fails consistently) and JDK-8148786 (our duplicate issue JDK-8150280, happens sometimes). Those patches were committed long ago. What's the reason for delaying them in nightly builds? I was hoping for build 108 containing them (which was unusable because of the Ant build problems) and I was quite sure that they would be in build 109. Those 2 issues still make the test suite fail quite often (hotspot issues). On the issues the "resolved in" field contains "team". What does this mean?
Uwe ----- Uwe Schindler uschindler at apache.org ASF Member, Apache Lucene PMC / Committer Bremen, Germany http://lucene.apache.org/ From Alan.Bateman at oracle.com Sat Mar 12 12:00:13 2016 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Sat, 12 Mar 2016 12:00:13 +0000 Subject: JDK 9 build 109 -> Lucene's Ant build works again; still missing Hotspot patches In-Reply-To: <002d01d17c54$62e8dc20$28ba9460$@apache.org> References: <002d01d17c54$62e8dc20$28ba9460$@apache.org> Message-ID: <56E404CD.7030706@oracle.com> On 12/03/2016 11:43, Uwe Schindler wrote: > : > > Nevertheless, build 109 does not contain (according to the changelog) fixes for JDK-8150436 (still fails consistently) and JDK-8148786 (our duplicate issue JDK-8150280, happens sometimes). Those patches were committed long ago. What's the reason for delaying them in nightly builds? I was hoping for build 108 containing them (which was unusable because of the Ant build problems) and I was quite sure that they will be in build 109. Those 2 issues still make the test suite fail quite often (hotspot issues). On the issues the "resolved in" field contains "team". What does this mean? > "team" means that it has been pushed to one of the team forests, in this case I assume jdk9/hs-comp originally. The value changes to "b" once it gets into a promoted/weekly build. There was a hs -> jdk9/dev integration last Wednesday (March 9) and I see that it has fixes for JDK-8150436 and JDK-8148786 so they should be next week's build (jdk-9+110). -Alan. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vladimir.kozlov at oracle.com Mon Mar 14 01:28:40 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Sun, 13 Mar 2016 18:28:40 -0700 Subject: [9] RFR(S): 8150804: C2 Compilation fails with assert(_base >= OopPtr && _base <= AryPtr) failed: Not a Java pointer In-Reply-To: <56E2E6D4.4040308@oracle.com> References: <56E2E6D4.4040308@oracle.com> Message-ID: <56E613C8.1040300@oracle.com> It is strange that next subgraph did not collapse - CheckCastPPNode::Identity() should remove second CheckCastPP: > CheckCastPP > A:NotNull > / \ > CheckCastPP | > A:NotNull | > \ / > Phi > A I think next code in PhiNode::Ideal() should check can_reshape too since during parsing graph is incomplete: if (uin == NULL) { uncasted = true; uin = unique_input(phase, true); } Thanks, Vladimir On 3/11/16 7:40 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch. > > https://bugs.openjdk.java.net/browse/JDK-8150804 > http://cr.openjdk.java.net/~thartmann/8150804/webrev.00/ > > We fail in Compile::Process_OopMap_Node() while processing monitors of a safepoint node because the monitor object is TOP. The crash is rare but reproduces with my regression test. The problem is the elimination of Phi nodes with a unique input which was broken by the fixes for JDK-8139771 [1] and JDK-8146999 [2]. 
> > Here are the details (for context, see 'TestPhiElimination.java'): > A::get() is inlined into test(obj) producing the following graph: > > Parm (obj) > TestPhiElimination > | > CastPP > TestPhiElimination:NotNull > | > CheckCastPP > A:NotNull > / \ > CheckCastPP | > A:NotNull | > \ / > Phi > A > | > Safepoint > > https://bugs.openjdk.java.net/secure/attachment/57820/before_ideal.png > > PhiNode::ideal() then replaces the Phi by a CheckCastPP because it has a unique input (see PhiNode::unique_input()): > > Parm (obj) > TestPhiElimination > | > CheckCastPP > A > | > Safepoint > > https://bugs.openjdk.java.net/secure/attachment/57821/after_ideal.png > > We completely lose the NotNull information provided by the CastPP. Therefore, we cannot prove that obj != null when accessing a field of obj and add an uncommon trap. Obj is also used as a monitor (A::get() is synchronized) and set to TOP in the uncommon trap branch. We are never able to prove that the null branch is not reachable and later fail when emitting code in Process_OopMap_Node because the monitor object is still TOP. > > Before the fix for JDK-8139771, we had a check to verify that the type of the unique (uncasted) input is "at least as good" as the type of the PhiNode: > > phase->type(uncasted_input)->higher_equal(type())) > > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/9e17d9e4b59f#l4.79 > > Re-adding this check, fixes the problem. However, I'm concerned that this check is not strong enough. For example, in the case where the type of the PhiNode is Object: > > Parm (obj) > TestPhiElimination > | > CastPP > TestPhiElimination:NotNull > | > CheckCastPP > A:NotNull > / \ > CheckCastPP | > A:NotNull | > \ / > Phi > Object > > We would still replace the Phi because TestPhiElimination->higher_equal(Object) and again lose the NotNull information. I therefore added a slightly stronger check that also checks the types in-between. I had to remove the assert that Roland added. > > What do you think? 
> > Thanks, > Tobias > > [1] https://bugs.openjdk.java.net/browse/JDK-8139771 > [2] https://bugs.openjdk.java.net/browse/JDK-8146999 > From edward.nevill at gmail.com Mon Mar 14 09:06:25 2016 From: edward.nevill at gmail.com (Edward Nevill) Date: Mon, 14 Mar 2016 09:06:25 +0000 Subject: RFR: 8151775: aarch64: add support for 8.1 LSE atomic operations Message-ID: <1457946385.13788.9.camel@mint> Hi, The following webrev adds support for 8.1 LSE atomic operations http://cr.openjdk.java.net/~enevill/8151775/webrev It also adds two CAS cases which I missed in the previous patch for 8.1 CAS instructions. Tested with a clean run through jcstress. Thanks, Ed From tobias.hartmann at oracle.com Mon Mar 14 10:09:44 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 14 Mar 2016 11:09:44 +0100 Subject: [9] RFR(S): 8150804: C2 Compilation fails with assert(_base >= OopPtr && _base <= AryPtr) failed: Not a Java pointer In-Reply-To: <56E613C8.1040300@oracle.com> References: <56E2E6D4.4040308@oracle.com> <56E613C8.1040300@oracle.com> Message-ID: <56E68DE8.10608@oracle.com> Hi Vladimir, thanks for the review! On 14.03.2016 02:28, Vladimir Kozlov wrote: > It is strange that next subgraph did not collapse - CheckCastPPNode::Identity() should remove second CheckCastPP: > >> CheckCastPP >> A:NotNull >> / \ >> CheckCastPP | >> A:NotNull | >> \ / >> Phi >> A When CheckCastPPNode::Identity() is first invoked, the graph looks like this: 92 CheckCastPP === 90 32 [[]] #A:NotNull * Oop:A:NotNull * !jvms: TestPhiElimination::test @ bci:8 135 CheckCastPP === 133 92 [[]] #A:NotNull * (speculative=A:NotNull:exact * (inline_depth=2)) Oop:A:NotNull * (speculative=A:NotNull:exact * (inline_depth=2)) !jvms: A::get @ bci:9 TestPhiElimination::test @ bci:11 Because 135 has a speculative part, the types are not equal and 135 is not replaced by 92.
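The effect Tobias describes can be reduced to a tiny analogue: if type equality also compares a speculative part, two otherwise identical types compare unequal and the redundant cast is not folded. This is a hypothetical sketch, not C2's Type classes:

```java
// Tiny analogue of the comparison above: node 135 carries a speculative
// part, node 92 does not, so an equality that includes the speculative
// part reports the types as different and the second cast is kept.
// (Hypothetical sketch; not C2's Type hierarchy.)
record TypeSketch(String base, boolean notNull, String speculative) { }

class IdentitySketch {
    // The cast is redundant (and can be folded) only if the types match exactly.
    static boolean canFold(TypeSketch input, TypeSketch cast) {
        return input.equals(cast);
    }
}
```

With a speculative part on one side the fold is rejected even though both types are A:NotNull, which is why the subgraph did not collapse during parsing.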
> I think next code in PhiNode::Ideal() should check can_reshape too since during parsing graph is incomplete: > > if (uin == NULL) { > uncasted = true; > uin = unique_input(phase, true); > } Yes, that's a better solution. I verified that it works (Phi is folded after parsing and has correct A:NotNull type). Here is the new webrev: http://cr.openjdk.java.net/~thartmann/8150804/webrev.01/ Thanks, Tobias > > Thanks, > Vladimir > > On 3/11/16 7:40 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch. >> >> https://bugs.openjdk.java.net/browse/JDK-8150804 >> http://cr.openjdk.java.net/~thartmann/8150804/webrev.00/ >> >> We fail in Compile::Process_OopMap_Node() while processing monitors of a safepoint node because the monitor object is TOP. The crash is rare but reproduces with my regression test. The problem is the elimination of Phi nodes with a unique input which was broken by the fixes for JDK-8139771 [1] and JDK-8146999 [2]. >> >> Here are the details (for context, see 'TestPhiElimination.java'): >> A::get() is inlined into test(obj) producing the following graph: >> >> Parm (obj) >> TestPhiElimination >> | >> CastPP >> TestPhiElimination:NotNull >> | >> CheckCastPP >> A:NotNull >> / \ >> CheckCastPP | >> A:NotNull | >> \ / >> Phi >> A >> | >> Safepoint >> >> https://bugs.openjdk.java.net/secure/attachment/57820/before_ideal.png >> >> PhiNode::ideal() then replaces the Phi by a CheckCastPP because it has a unique input (see PhiNode::unique_input()): >> >> Parm (obj) >> TestPhiElimination >> | >> CheckCastPP >> A >> | >> Safepoint >> >> https://bugs.openjdk.java.net/secure/attachment/57821/after_ideal.png >> >> We completely lose the NotNull information provided by the CastPP. Therefore, we cannot prove that obj != null when accessing a field of obj and add an uncommon trap. Obj is also used as a monitor (A::get() is synchronized) and set to TOP in the uncommon trap branch. 
We are never able to prove that the null branch is not reachable and later fail when emitting code in Process_OopMap_Node because the monitor object is still TOP. >> >> Before the fix for JDK-8139771, we had a check to verify that the type of the unique (uncasted) input is "at least as good" as the type of the PhiNode: >> >> phase->type(uncasted_input)->higher_equal(type())) >> >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/9e17d9e4b59f#l4.79 >> >> Re-adding this check fixes the problem. However, I'm concerned that this check is not strong enough. For example, in the case where the type of the PhiNode is Object: >> >> Parm (obj) >> TestPhiElimination >> | >> CastPP >> TestPhiElimination:NotNull >> | >> CheckCastPP >> A:NotNull >> / \ >> CheckCastPP | >> A:NotNull | >> \ / >> Phi >> Object >> >> We would still replace the Phi because TestPhiElimination->higher_equal(Object) and again lose the NotNull information. I therefore added a slightly stronger check that also checks the types in-between. I had to remove the assert that Roland added. >> >> What do you think? >> >> Thanks, >> Tobias >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8139771 >> [2] https://bugs.openjdk.java.net/browse/JDK-8146999 >> From volker.simonis at gmail.com Mon Mar 14 10:27:34 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 14 Mar 2016 11:27:34 +0100 Subject: RFR(XXS): 8151796: compiler/whitebox/BlockingCompilation.java fails due to method not compiled Message-ID: Hi, can I please have a review and sponsor for the following tiny test fix: http://cr.openjdk.java.net/~simonis/webrevs/2016/8151796/ https://bugs.openjdk.java.net/browse/JDK-8151796 Change JDK-8150646 added a new test (compiler/whitebox/BlockingCompilation.java) which checks that it is possible to compile a simple method on every available compilation level through the whitebox API. It also checks that these compilations can be done in blocking as well as in unblocking mode.
Unfortunately it seems that the test is a little shaky. It may happen that the test's simple method is not compilable on an available compilation level (e.g. because the code cache is full). Also testing for non-blocking compilations isn't 100% bulletproof, because non-blocking compiles may look like blocking compilations if the compiler thread is faster than the test thread for some reason. I've therefore flagged this test as "@ignore". It can still be run manually by using the '-ignore:run' jtreg command line option. Automatic test runs can now skip this test by using '-ignore:quiet'. Regards, Volker From nils.eliasson at oracle.com Mon Mar 14 10:45:56 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Mon, 14 Mar 2016 11:45:56 +0100 Subject: RFR(XXS): 8151795: compiler/compilercontrol/parser/DirectiveParserTest.java fails with "assert failed: 0 != 0" Message-ID: <56E69664.8030500@oracle.com> Hi, Summary: Adding directives from diagnostic command should treat zero directives in file as failure. Bug: https://bugs.openjdk.java.net/browse/JDK-8151795 Webrev: http://cr.openjdk.java.net/~neliasso/8151795/webrev.01/ Regards, Nils Eliasson From nils.eliasson at oracle.com Mon Mar 14 11:20:47 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Mon, 14 Mar 2016 12:20:47 +0100 Subject: RFR(S):8151796: compiler/whitebox/BlockingCompilation.java fails due to method not compiled Message-ID: <56E69E8F.7090104@oracle.com> Hi, Summary: The test wasn't as robust as expected. Solution: Change the way we verify that we are having an un-blocking compilation: First lock the compilation queue - no new compiles will be completed. Enqueue method for compilation. If the method is compiled blockingly - the Java thread will hang since the compile can't complete as long as the compile queue is locked. Use this to test the blocking functionality in three steps: 1) Verify that we are not blocking on target method as described.
2) Add compiler directive with instruction to block on target method - verify that it can be compiled on all levels. If it is not blocking it will eventually be stalled for a moment in the compiler queue and the test will fail. 3) Pop directive, and redo step one - verify that target method is not blocking. Bug: https://bugs.openjdk.java.net/browse/JDK-8151796 Webrev: http://cr.openjdk.java.net/~neliasso/8151796/werev.03/ Regards, Nils Eliasson From nils.eliasson at oracle.com Mon Mar 14 11:23:49 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Mon, 14 Mar 2016 12:23:49 +0100 Subject: RFR(XXS): 8151796: compiler/whitebox/BlockingCompilation.java fails due to method not compiled In-Reply-To: References: Message-ID: <56E69F45.4060004@oracle.com> Hi Volker, Please do not quarantine. I have posted a separate RFR for fixing the test. Regards, Nils Eliasson On 2016-03-14 11:27, Volker Simonis wrote: > Hi, > > can I please have a review and sponsor for the following tiny test fix: > > http://cr.openjdk.java.net/~simonis/webrevs/2016/8151796/ > https://bugs.openjdk.java.net/browse/JDK-8151796 > > Change JDK-8150646 added a new test > (compiler/whitebox/BlockingCompilation.java) which checks that is is > possible to compile a simple method on every available compilation > level trough the whitebox API. It also checks that these compilations > can be done in blocking as well as in unblocking mode. > > Unfortunately it seems that the test is a little shaky. It may happen > that the test simple method is not compilable on an available > compilation level (e.g. because the code cache is full). Also testing > for non-blocking compilations isn't 100% bullet proof, because > non-blocking compiles may look like blocking compilations if the > compiler thread is faster than the test thread for some reason. > > I've therefore flagged this test as "@ignore". It can still be run > manually by using the '-ignore:run' jtreg command line option. 
> Automatic test runs can now skip this test by using '-ignore:quiet'. > > Regards, > Volker From zoltan.majo at oracle.com Mon Mar 14 11:43:08 2016 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Mon, 14 Mar 2016 12:43:08 +0100 Subject: RFR(XXS): 8151796: compiler/whitebox/BlockingCompilation.java fails due to method not compiled In-Reply-To: <56E69F45.4060004@oracle.com> References: <56E69F45.4060004@oracle.com> Message-ID: <56E6A3CC.2090206@oracle.com> Hi Volker, Hi Nils, @Volker: Thank you for jumping on this. @Nils: Thank you for coming up with the fix. I'll close the quarantine sub-task related to the failure https://bugs.openjdk.java.net/browse/JDK-8151804 once Nils's fix has been pushed. Best regards, Zoltan On 03/14/2016 12:23 PM, Nils Eliasson wrote: > Hi Volker, > > Please do not quarantine. I have posted a separate RFR for fixing the > test. > > Regards, > Nils Eliasson > > On 2016-03-14 11:27, Volker Simonis wrote: >> Hi, >> >> can I please have a review and sponsor for the following tiny test fix: >> >> http://cr.openjdk.java.net/~simonis/webrevs/2016/8151796/ >> https://bugs.openjdk.java.net/browse/JDK-8151796 >> >> Change JDK-8150646 added a new test >> (compiler/whitebox/BlockingCompilation.java) which checks that is is >> possible to compile a simple method on every available compilation >> level trough the whitebox API. It also checks that these compilations >> can be done in blocking as well as in unblocking mode. >> >> Unfortunately it seems that the test is a little shaky. It may happen >> that the test simple method is not compilable on an available >> compilation level (e.g. because the code cache is full). Also testing >> for non-blocking compilations isn't 100% bullet proof, because >> non-blocking compiles may look like blocking compilations if the >> compiler thread is faster than the test thread for some reason. >> >> I've therefore flagged this test as "@ignore". 
It can still be run >> manually by using the '-ignore:run' jtreg command line option. >> Automatic test runs can now skip this test by using '-ignore:quiet'. >> >> Regards, >> Volker > From konstantin.shefov at oracle.com Mon Mar 14 12:02:16 2016 From: konstantin.shefov at oracle.com (Konstantin Shefov) Date: Mon, 14 Mar 2016 15:02:16 +0300 Subject: [9] RFR 8150850: [JVMCI] NPE when executing HotSpotConstantReflectionProvider.readStableFieldValue Message-ID: <56E6A848.6060200@oracle.com> Hello Please review a bug fix in JVMCI jdk.vm.ci.hotspot.HotSpotConstantReflectionProvider::readStableFieldValue method. When one executes HotSpotConstantReflectionProvider::readStableFieldValue method for an instance field and JavaConstant.NULL_POINTER as a receiver, it throws an NPE. However, the javadoc says it should return null. E.g. when we execute method HotSpotConstantReflectionProvider::readFieldValue for an instance field and JavaConstant.NULL_POINTER as a receiver, we get null as expected. Additional check should be added for null as the first argument. Bug: https://bugs.openjdk.java.net/browse/JDK-8150850 Webrev: http://cr.openjdk.java.net/~kshefov/8150850/webrev.00 Thanks -Konstantin -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug.simon at oracle.com Mon Mar 14 12:50:57 2016 From: doug.simon at oracle.com (Doug Simon) Date: Mon, 14 Mar 2016 13:50:57 +0100 Subject: [9] RFR 8150850: [JVMCI] NPE when executing HotSpotConstantReflectionProvider.readStableFieldValue In-Reply-To: <56E6A848.6060200@oracle.com> References: <56E6A848.6060200@oracle.com> Message-ID: Looks good. > On 14 Mar 2016, at 13:02, Konstantin Shefov wrote: > > Hello > > Please review a bug fix in JVMCI jdk.vm.ci.hotspot.HotSpotConstantReflectionProvider::readStableFieldValue method. > > When one executes HotSpotConstantReflectionProvider::readStableFieldValue method for an instance field and JavaConstant.NULL_POINTER as a receiver, it throws an NPE. 
However, the javadoc says it should return null. > > E.g. when we execute method HotSpotConstantReflectionProvider::readFieldValue for an instance field and JavaConstant.NULL_POINTER as a receiver, we get null as expected. > > Additional check should be added for null as the first argument. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8150850 > Webrev: http://cr.openjdk.java.net/~kshefov/8150850/webrev.00 > > Thanks > -Konstantin From volker.simonis at gmail.com Mon Mar 14 13:58:28 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 14 Mar 2016 14:58:28 +0100 Subject: RFR(S):8151796: compiler/whitebox/BlockingCompilation.java fails due to method not compiled In-Reply-To: <56E69E8F.7090104@oracle.com> References: <56E69E8F.7090104@oracle.com> Message-ID: Hi Nils, thanks for improving the test. I think your fix solves some problems with regard to fast, non-blocking compiles which are wrongly interpreted as blocking by the test. But what about the initial test failure: java.lang.Exception: public static int BlockingCompilation.foo() should be compiled at level 4(but is actually compiled at level 0) at BlockingCompilation.main(BlockingCompilation.java:104) This is from the loop which does blocking compilations. It seems that a method enqueued for level 4 couldn't be compiled at all. I don't know the exact reason, but one could be for example that the code cache was full or that the compiler bailed out because of another reason. I'm not sure we can accurately handle this situation in the test. Maybe we should tolerate if a method couldn't be compiled at all: 121 if (WB.getMethodCompilationLevel(m) != l* && **WB.getMethodCompilationLevel(m) != 0*) { Also, I don't understand the following code: 67 // Make sure no compilations can progress, blocking compiles will hang 68 WB.lockCompilation(); ... 
78 // Normal compile on all levels 79 for (int l : levels) { 80 WB.enqueueMethodForCompilation(m, l); 81 } 82 83 // restore state 84 WB.unlockCompilation(); 85 while (!WB.isMethodCompiled(m)) { 86 Thread.sleep(100); 87 } 88 WB.deoptimizeMethod(m); 89 WB.clearMethodState(m); You enqueue the methods on all levels (let's assume 1,2,3,4). Then you wait until the method gets compiled at a level (let's say at level 1). I think this is already shaky, because these are non-blocking compiles of a method which hasn't been called before, so the requests can easily get stale. But let's say the method will be compiled at level one. You then deoptimize and clear the method state. But the queue can still contain the compilation requests for the three other levels which can lead to errors in the following test: 103 //Verify that it actuall is uncompiled 104 if (WB.isMethodCompiled(m)) { 105 throw new Exception("Should not be compiled after deoptimization"); 106 } Finally some typos: 103 //Verify that it actuall*y* is uncompiled 111 // Add to queue a*n*d make sure it actually went well Regards, Volker On Mon, Mar 14, 2016 at 12:20 PM, Nils Eliasson wrote: > Hi, > > Summary: > The test wasn't as robust as expected. > > Solution: > Change the way we verify that we are having an un-blocking compilation: > First lock the compilation queue - no new compiles will be completed. > Enqueue method for compilation. If the method is compiled blockingly - the > Java thread will hang since the compile can't complete as long as the > compile queue is locked. > > Use this to test the blocking functionality in three steps: > 1) Verify that we are not blocking on target method as described. > 2) Add compiler directive with instruction to block on target method - > verify that it can be compiled on all levels. If it is not blocking it will > eventually be stalled for a moment in the compiler queue and the test will > fail.
> 3) Pop directive, and redo step one - verify that target method is not > blocking. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8151796 > Webrev: http://cr.openjdk.java.net/~neliasso/8151796/werev.03/ > > Regards, > Nils Eliasson > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nils.eliasson at oracle.com Mon Mar 14 14:43:05 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Mon, 14 Mar 2016 15:43:05 +0100 Subject: RFR(S):8151796: compiler/whitebox/BlockingCompilation.java fails due to method not compiled In-Reply-To: References: <56E69E8F.7090104@oracle.com> Message-ID: <56E6CDF9.20005@oracle.com> Hi Volker, On 2016-03-14 14:58, Volker Simonis wrote: > Hi Nils, > > thanks for improving the test. I think your fix solves some problems > with regard to fast, non-blocking compiles which are wrongly > interpreted as blocking by the test. But what about the initial test > failure: > > java.lang.Exception: public static int BlockingCompilation.foo() > should be compiled at level 4(but is actually compiled at level 0) > at BlockingCompilation.main(BlockingCompilation.java:104) > > This is from the loop which does blocking compilations. It seems that > a method enqueued for level 4 couldn't be compiled at all. I don't > know the exact reason, but one could be for example that the code > cache was full or that the compiler bailed out because of another > reason. I'm not sure we can accurately handle this situation in the > test. Maybe we should tolerate if a method couldn't be compiled at all: This failure only happened on (slow) non-tiered platforms and the log looked as if the compile hadn't even been put on the compile queue. In the first version of my rewrite I checked the return value from enqueueMethodForCompilation to make sure the compile was actually added. But then I changed my mind and focused on just testing the blocking functionality.
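The queue-locking technique used to test the blocking functionality can be sketched with plain java.util.concurrent stand-ins, where a CountDownLatch plays the role of WB.lockCompilation()/unlockCompilation() and a single-thread executor plays the compiler thread. This is a hypothetical analogue, not the WhiteBox API itself:

```java
import java.util.concurrent.*;

// Analogue of the test strategy: hold a "queue lock" so no compile can
// complete, then enqueue. A blocking enqueue would hang the caller; a
// non-blocking one returns immediately. (The latch and executor are
// stand-ins for the WhiteBox calls, not the real API.)
class QueueLockSketch {
    static boolean enqueueReturnsWhileLocked() throws Exception {
        CountDownLatch queueLock = new CountDownLatch(1); // "lockCompilation"
        ExecutorService compilerThread = Executors.newSingleThreadExecutor();
        // Non-blocking enqueue: we get control back even though the
        // "compile" cannot finish while the queue is locked.
        Future<?> compile = compilerThread.submit(() -> {
            try {
                queueLock.await(); // parked until the queue is unlocked
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        boolean returnedImmediately = !compile.isDone(); // still pending, yet we are here
        queueLock.countDown();               // "unlockCompilation"
        compile.get(10, TimeUnit.SECONDS);   // now the compile completes
        compilerThread.shutdown();
        return returnedImmediately;
    }
}
```

A blocking enqueue would never return while the lock is held, which is exactly the observable difference the rewritten test exploits.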
> 121 if (WB.getMethodCompilationLevel(m) != l*&& > **WB.getMethodCompilationLevel(m) != 0*) { > Also, I don't understand the following code: > 67 // Make sure no compilations can progress, blocking compiles will hang > 68 WB.lockCompilation(); ... 78 // Normal compile on all levels > 79 for (int l : levels) { > 80 WB.enqueueMethodForCompilation(m, l); > 81 } > 82 > 83 // restore state > 84 WB.unlockCompilation(); > 85 while (!WB.isMethodCompiled(m)) { > 86 Thread.sleep(100); > 87 } > 88 WB.deoptimizeMethod(m); > 89 WB.clearMethodState(m); > You enqueue the methods on all levels (let's assume 1,2,3,4). Then you > wait until the method gets compiled at a level (let's say at level 1). > I think this is already shaky, because these are non-blocking compiles > of a method which hasn't been called before, so the requests can > easily get stale. Blocking compiles do not get stale any more - that is included in the patch. Only one item will actually be added to the compile queue - the rest will be dropped because the method is already enqueued. The loop makes the code work on all VM-flavours (client, server, tiered) without worrying about compilation levels. The compilation-lock prevents any compilations from completing - so all calls on enqueueMethodForCompilation() will be deterministic, and most important - we will get a deterministic result (hung VM) if the method is blocking here. > But let's say the method will be compiled at level one. You then > deoptimize and clear the method state. But the queue can still contain > the compilation requests for the three other levels which can lead to > errors in the following test: It can only get on the queue once.
It looks like this in the log: Start of test - not blocking 524 257 1 BlockingCompilation::foo (7 bytes) 625 257 1 BlockingCompilation::foo (7 bytes) made not entrant Directive added, then blocking part where all levels are tested: 1 compiler directives added 626 258 b 1 BlockingCompilation::foo (7 bytes) 627 258 1 BlockingCompilation::foo (7 bytes) made not entrant 627 259 b 2 BlockingCompilation::foo (7 bytes) 628 259 2 BlockingCompilation::foo (7 bytes) made not entrant 629 260 b 3 BlockingCompilation::foo (7 bytes) 630 260 3 BlockingCompilation::foo (7 bytes) made not entrant 630 261 b 4 BlockingCompilation::foo (7 bytes) 632 261 4 BlockingCompilation::foo (7 bytes) made not entrant And finally the non-blocking part where only one level gets compiled: 633 262 1 BlockingCompilation::foo (7 bytes) > 103 //Verify that it actuall is uncompiled > 104 if (WB.isMethodCompiled(m)) { > 105 throw new Exception("Should not be compiled after deoptimization"); > 106 } > Finally some typos: > 103 //Verify that it actuall*y* is uncompiled 111 // Add to queue > a*n*d make sure it actually went well > > Regards, > Volker Thanks for feedback, Nils Eliasson > > > On Mon, Mar 14, 2016 at 12:20 PM, Nils Eliasson > > wrote: > > Hi, > > Summary: > The test wasn't as robust as expected. > > Solution: > Change the way we verify that we are having a un-blocking compilation: > First lock the compilation queue - no new compiles will be > completed. Enqueue method for compilation. If the method is > compiled blockingly - the java thread will hang since the compile > can't complete as long as the compile queue is locked. > > Use this to test the blocking functionality in three steps: > 1) Verify that we are not blocking on target method as described. > 2) Add compiler directive with instruction to block on target > method - verify that it can be compiled on all levels. If it is > not blocking it will eventually be stalled for a moment in the > compiler queue and the test will fail. 
> 3) Pop directive, and redo step one - verify that target method is > not blocking. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8151796 > Webrev: http://cr.openjdk.java.net/~neliasso/8151796/werev.03/ > > > Regards, > Nils Eliasson > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From volker.simonis at gmail.com Mon Mar 14 15:16:12 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 14 Mar 2016 16:16:12 +0100 Subject: RFR(S):8151796: compiler/whitebox/BlockingCompilation.java fails due to method not compiled In-Reply-To: <56E6CDF9.20005@oracle.com> References: <56E69E8F.7090104@oracle.com> <56E6CDF9.20005@oracle.com> Message-ID: On Mon, Mar 14, 2016 at 3:43 PM, Nils Eliasson wrote: > Hi Volker, > > On 2016-03-14 14:58, Volker Simonis wrote: > > Hi Nils, > > thanks for improving the test. I think your fix solves some problems with > regard to fast, non-blocking compiles which are wrongly interpreted as > blocking by the test. But what about the initial test failure: > > java.lang.Exception: public static int BlockingCompilation.foo() should be > compiled at level 4(but is actually compiled at level 0) > at BlockingCompilation.main(BlockingCompilation.java:104) > > This is from the loop which does blocking compilations. It seems that a > method enqueued for level 4 couldn't be compiled at all. I don't know the > exact reason, but one could be for example that the code cache was full or > that the compiler bailed out because of another reason. I'm not sure we can > accurately handle this situation in the test. Maybe we should tolerate if a > method couldn't be compiled at all: > > > This failure only happened on (slow) non-tiered platforms and the log > looked like that as if the compiler even hadn't been put on the compile > queue. In the first version of my rewrite I checked the return value from > enqueueMethodForCompilation to make sure the compile was actually added. 
> But then I changed my mind and focused on just testing the blocking
> functionality.
>
> 121 if (WB.getMethodCompilationLevel(m) != l* && **WB.getMethodCompilationLevel(m) != 0*) {
>
> Also, I don't understand the following code:
>
> 67 // Make sure no compilations can progress, blocking compiles will hang
> 68 WB.lockCompilation();
> ...
> 78 // Normal compile on all levels
> 79 for (int l : levels) {
> 80     WB.enqueueMethodForCompilation(m, l);
> 81 }
> 82
> 83 // restore state
> 84 WB.unlockCompilation();
> 85 while (!WB.isMethodCompiled(m)) {
> 86     Thread.sleep(100);
> 87 }
> 88 WB.deoptimizeMethod(m);
> 89 WB.clearMethodState(m);
>
> You enqueue the methods on all levels (let's assume 1,2,3,4). Then you
> wait until the method gets compiled at a level (let's say at level 1). I
> think this is already shaky, because these are non-blocking compiles of a
> method which hasn't been called before, so the requests can easily get
> stale.
>
>
> Blocking compiles do not get stale any more - that is included in the
> patch.
>
> Only one item will actually be added to the compile queue - the rest will
> be dropped because the method is already enqueued. The loop makes the code
> work on all VM flavours (client, server, tiered) without worrying about
> compilation levels. The compilation lock prevents any compilations from
> completing - so all the calls to enqueueMethodForCompilation() will be
> deterministic, and most important - we will get a deterministic result
> (a hanging VM) if the method is blocking here.
>
> But let's say the method will be compiled at level one. You then deoptimize
> and clear the method state. But the queue can still contain the compilation
> requests for the three other levels which can lead to errors in the
> following test:
>
>
> It can only get on the queue once.
It looks like this in the log:
>
> Start of test - not blocking
>
> 524 257 1 BlockingCompilation::foo (7 bytes)
> 625 257 1 BlockingCompilation::foo (7 bytes) made not entrant
>

OK, but then the following loop is useless (and the comment misleading) because we actually only enqueue and compile on one level (the first one which is available):

78 // Normal compile on all levels
79 for (int l : levels) {
80     WB.enqueueMethodForCompilation(m, l);
81 }

Besides that, your changes look good!

Regards,
Volker

Directive added, then blocking part where all levels are tested:
>
> 1 compiler directives added
> 626 258 b 1 BlockingCompilation::foo (7 bytes)
> 627 258 1 BlockingCompilation::foo (7 bytes) made not entrant
> 627 259 b 2 BlockingCompilation::foo (7 bytes)
> 628 259 2 BlockingCompilation::foo (7 bytes) made not entrant
> 629 260 b 3 BlockingCompilation::foo (7 bytes)
> 630 260 3 BlockingCompilation::foo (7 bytes) made not entrant
> 630 261 b 4 BlockingCompilation::foo (7 bytes)
> 632 261 4 BlockingCompilation::foo (7 bytes) made not entrant
>
> And finally the non-blocking part where only one level gets compiled:
>
> 633 262 1 BlockingCompilation::foo (7 bytes)
>
>
> 103 //Verify that it actuall is uncompiled
> 104 if (WB.isMethodCompiled(m)) {
> 105     throw new Exception("Should not be compiled after deoptimization");
> 106 }
>
> Finally some typos:
>
> 103 //Verify that it actuall*y* is uncompiled
> 111 // Add to queue a*n*d make sure it actually went well
>
>
> Regards,
> Volker
>
>
> Thanks for feedback,
> Nils Eliasson
>
>
> On Mon, Mar 14, 2016 at 12:20 PM, Nils Eliasson
> wrote:
>
>> Hi,
>>
>> Summary:
>> The test wasn't as robust as expected.
>>
>> Solution:
>> Change the way we verify that we are having a un-blocking compilation:
>> First lock the compilation queue - no new compiles will be completed.
>> Enqueue method for compilation.
If the method is compiled blockingly - the >> java thread will hang since the compile can't complete as long as the >> compile queue is locked. >> >> Use this to test the blocking functionality in three steps: >> 1) Verify that we are not blocking on target method as described. >> 2) Add compiler directive with instruction to block on target method - >> verify that it can be compiled on all levels. If it is not blocking it will >> eventually be stalled for a moment in the compiler queue and the test will >> fail. >> 3) Pop directive, and redo step one - verify that target method is not >> blocking. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8151796 >> Webrev: http://cr.openjdk.java.net/~neliasso/8151796/werev.03/ >> >> Regards, >> Nils Eliasson >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.hartmann at oracle.com Mon Mar 14 15:21:33 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 14 Mar 2016 16:21:33 +0100 Subject: [9] RFR(S): 8150804: C2 Compilation fails with assert(_base >= OopPtr && _base <= AryPtr) failed: Not a Java pointer In-Reply-To: <56E68DE8.10608@oracle.com> References: <56E2E6D4.4040308@oracle.com> <56E613C8.1040300@oracle.com> <56E68DE8.10608@oracle.com> Message-ID: <56E6D6FD.2010502@oracle.com> Hi, I triggered another problem with the latest webrev: # Internal Error (/scratch/opt/jprt/T/P1/101046.tohartma/s/hotspot/src/share/vm/opto/phaseX.cpp:346), pid=11651, tid=11671 # assert(t == t_no_spec) failed: dead node in hash table or missed node during speculative cleanup We should add the input nodes to the IGVN worklist after cutting off the Phi because they may be dead now: http://cr.openjdk.java.net/~thartmann/8150804/webrev.02/ Thanks, Tobias On 14.03.2016 11:09, Tobias Hartmann wrote: > Hi Vladimir, > > thanks for the review! 
> > On 14.03.2016 02:28, Vladimir Kozlov wrote: >> It is strange that next subgraph did not collapse - CheckCastPPNode::Identity() should remove second CheckCastPP: >> >>> CheckCastPP >>> A:NotNull >>> / \ >>> CheckCastPP | >>> A:NotNull | >>> \ / >>> Phi >>> A > > When CheckCastPPNode::Identity() is first invoked, the graph looks like this > > 92 CheckCastPP === 90 32 [[]] #A:NotNull * Oop:A:NotNull * !jvms: TestPhiElimination::test @ bci:8 > 135 CheckCastPP === 133 92 [[]] #A:NotNull * (speculative=A:NotNull:exact * (inline_depth=2)) Oop:A:NotNull * (speculative=A:NotNull:exact * (inline_depth=2)) !jvms: A::get @ bci:9 TestPhiElimination::test @ bci:11 > > Because 135 has a speculative part, the types are not equal and 135 is not replaced by 92. > >> I think next code in PhiNode::Ideal() should check can_reshape too since during parsing graph is incomplete: >> >> if (uin == NULL) { >> uncasted = true; >> uin = unique_input(phase, true); >> } > > Yes, that's a better solution. I verified that it works (Phi is folded after parsing and has correct A:NotNull type). > > Here is the new webrev: > http://cr.openjdk.java.net/~thartmann/8150804/webrev.01/ > > Thanks, > Tobias > >> >> Thanks, >> Vladimir >> >> On 3/11/16 7:40 AM, Tobias Hartmann wrote: >>> Hi, >>> >>> please review the following patch. >>> >>> https://bugs.openjdk.java.net/browse/JDK-8150804 >>> http://cr.openjdk.java.net/~thartmann/8150804/webrev.00/ >>> >>> We fail in Compile::Process_OopMap_Node() while processing monitors of a safepoint node because the monitor object is TOP. The crash is rare but reproduces with my regression test. The problem is the elimination of Phi nodes with a unique input which was broken by the fixes for JDK-8139771 [1] and JDK-8146999 [2]. 
>>> >>> Here are the details (for context, see 'TestPhiElimination.java'): >>> A::get() is inlined into test(obj) producing the following graph: >>> >>> Parm (obj) >>> TestPhiElimination >>> | >>> CastPP >>> TestPhiElimination:NotNull >>> | >>> CheckCastPP >>> A:NotNull >>> / \ >>> CheckCastPP | >>> A:NotNull | >>> \ / >>> Phi >>> A >>> | >>> Safepoint >>> >>> https://bugs.openjdk.java.net/secure/attachment/57820/before_ideal.png >>> >>> PhiNode::ideal() then replaces the Phi by a CheckCastPP because it has a unique input (see PhiNode::unique_input()): >>> >>> Parm (obj) >>> TestPhiElimination >>> | >>> CheckCastPP >>> A >>> | >>> Safepoint >>> >>> https://bugs.openjdk.java.net/secure/attachment/57821/after_ideal.png >>> >>> We completely lose the NotNull information provided by the CastPP. Therefore, we cannot prove that obj != null when accessing a field of obj and add an uncommon trap. Obj is also used as a monitor (A::get() is synchronized) and set to TOP in the uncommon trap branch. We are never able to prove that the null branch is not reachable and later fail when emitting code in Process_OopMap_Node because the monitor object is still TOP. >>> >>> Before the fix for JDK-8139771, we had a check to verify that the type of the unique (uncasted) input is "at least as good" as the type of the PhiNode: >>> >>> phase->type(uncasted_input)->higher_equal(type())) >>> >>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/9e17d9e4b59f#l4.79 >>> >>> Re-adding this check, fixes the problem. However, I'm concerned that this check is not strong enough. For example, in the case where the type of the PhiNode is Object: >>> >>> Parm (obj) >>> TestPhiElimination >>> | >>> CastPP >>> TestPhiElimination:NotNull >>> | >>> CheckCastPP >>> A:NotNull >>> / \ >>> CheckCastPP | >>> A:NotNull | >>> \ / >>> Phi >>> Object >>> >>> We would still replace the Phi because TestPhiElimination->higher_equal(Object) and again lose the NotNull information. 
I therefore added a slightly stronger check that also checks the types in-between. I had to remove the assert that Roland added.
>>>
>>> What do you think?
>>>
>>> Thanks,
>>> Tobias
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8139771
>>> [2] https://bugs.openjdk.java.net/browse/JDK-8146999
>>>

From martin.doerr at sap.com Mon Mar 14 15:49:18 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 14 Mar 2016 15:49:18 +0000
Subject: RFR(S): 8151818: C1: LIRGenerator::move_to_phi can't deal with illegal phi
Message-ID:

Hi,

we found out that C1 can't deal with illegal Phi functions which propagate into other Phi functions. Phi functions get illegal when their inputs have different types.

This was observed when we activated the JVMTI capability can_access_local_variables and restored the old behavior of BlockBegin::try_merge: invalidate the phi functions instead of bailing out. The function LIRGenerator::move_to_phi crashes in this case.

Proposed fix is to bail out as this case happens extremely rarely. Seems like it was never observed with the new behavior of BlockBegin::try_merge.

I also improved some assertions to support locals with illegal types.

The webrev is here: https://bugs.openjdk.java.net/browse/JDK-8151818

Please review. I will also need a sponsor if this change is desired.

Best regards,
Martin

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From martin.doerr at sap.com Mon Mar 14 15:50:31 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 14 Mar 2016 15:50:31 +0000
Subject: RFR(S): 8151818: C1: LIRGenerator::move_to_phi can't deal with illegal phi
Message-ID:

Sorry, I had pasted the wrong link to the webrev:
http://cr.openjdk.java.net/~mdoerr/8151818_c1_illegal_phi/webrev.00/

From: Doerr, Martin
Sent: Montag, 14. März 2016 16:49
To: hotspot-compiler-dev at openjdk.java.net
Subject: RFR(S): 8151818: C1: LIRGenerator::move_to_phi can't deal with illegal phi

Hi,

we found out that C1 can't deal with illegal Phi functions which propagate into other Phi functions. Phi functions get illegal when their inputs have different types. This was observed when we activated the JVMTI capability can_access_local_variables and restored the old behavior of BlockBegin::try_merge: invalidate the phi functions instead of bailing out. The function LIRGenerator::move_to_phi crashes in this case.

Proposed fix is to bail out as this case happens extremely rarely. Seems like it was never observed with the new behavior of BlockBegin::try_merge.

I also improved some assertions to support locals with illegal types.

The webrev is here: https://bugs.openjdk.java.net/browse/JDK-8151818

Please review. I will also need a sponsor if this change is desired.

Best regards,
Martin

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From HORII at jp.ibm.com Mon Mar 14 16:34:16 2016
From: HORII at jp.ibm.com (Hiroshi H Horii)
Date: Mon, 14 Mar 2016 16:34:16 +0000
Subject: Support for AES on ppc64le
Message-ID: <201603141634.u2EGYYDb023702@d19av08.sagamino.japan.ibm.com>

Dear all:

Can I please request reviews for the following change? This change was created for JDK 9.

Description:
This change adds stub routines support for single-block AES encryption and decryption operations on the POWER8 platform. They are available only when the application is configured to use the SunJCE crypto provider on little endian. These stubs make use of efficient hardware AES instructions and thus offer significant performance improvements over JITed code on POWER8 as on x86 and SPARC. AES stub routines are enabled by default on POWER8 platforms that support AES instructions (vcipher). They can be explicitly enabled or disabled on the command-line using UseAES and UseAESIntrinsics JVM flags.
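For illustration only (this sketch is not part of the patch, and the class name is made up for the example), the single-block SunJCE code path that such stubs accelerate can be exercised like this:

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.util.Arrays;

public class AesRoundTrip {
    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16];   // 128-bit key; 192- and 256-bit keys are covered too
        byte[] block = new byte[16]; // exactly one AES block
        Arrays.fill(block, (byte) 0x42);

        // Single-block ECB with no padding goes through the SunJCE AESCrypt
        // block encrypt/decrypt path that the platform stubs intrinsify.
        Cipher enc = Cipher.getInstance("AES/ECB/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
        byte[] ct = enc.doFinal(block);

        Cipher dec = Cipher.getInstance("AES/ECB/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"));
        byte[] pt = dec.doFinal(ct);

        if (!Arrays.equals(pt, block)) {
            throw new AssertionError("AES round trip failed");
        }
        System.out.println("round trip ok");
    }
}
```

Functionally the result is identical with or without the stubs; running with -XX:+UnlockDiagnosticVMOptions -XX:+PrintIntrinsics on AES-capable hardware shows whether the intrinsics were used.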
Unlike x86 and SPARC, vcipher and vncipher of POWER8 need the same round keys of AES. Therefore, inline_aescrypt_Block in library_call.cpp calls the stub with AESCrypt.sessionK[0] as round keys.

Summary of source code changes:

*src/cpu/ppc/vm/assembler_ppc.hpp
*src/cpu/ppc/vm/assembler_ppc.inline.hpp
  - Adds support for the vrld (vector rotate left doubleword) instruction.

*src/cpu/ppc/vm/stubGenerator_ppc.cpp
  - Defines stubs for single-block AES encryption and decryption routines supporting all key sizes (128-bit, 192-bit and 256-bit).
  - Current POWER AES decryption instructions are not compatible with the SunJCE expanded decryption key format. Thus the decryption stubs read the expanded encryption keys (sessionK[0]) in descending order.
  - Encryption stubs use the SunJCE expanded encryption key as there is no incompatibility issue between POWER8 AES encryption instructions and SunJCE expanded encryption keys.

*src/cpu/ppc/vm/vm_version_ppc.cpp
  - Detects AES capabilities of the underlying CPU by using has_vcipher().
  - Enables the UseAES and UseAESIntrinsics flags if the underlying CPU supports AES instructions and neither of them is explicitly disabled on the command-line. Generates a warning message if either of these flags is enabled on the command-line although the underlying CPU does not support AES instructions.

*src/share/vm/opto/library_call.cpp
  - Passes the first input parameter, a reference to sessionK[0], to the AES stubs only on the POWER platform.

Code change:
Please see the attached diff file, which was generated with "hg diff -g" under the latest hotspot directory.

Passed tests:
jtreg compiler/codegen/7184394/
jtreg compiler/cpuflags/ (after removing the @ignored annotation)

* This is my first post of a change. I'm sorry in advance if I don't follow the community manners.
* I wrote this description based on the following:
http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-November/012670.html

Regards,
Hiroshi
-----------------------
Hiroshi Horii,
IBM Research - Tokyo

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ppc64le_aes_support.diff
Type: application/octet-stream
Size: 21684 bytes
Desc: not available
URL:

From igor.veresov at oracle.com Mon Mar 14 19:16:25 2016
From: igor.veresov at oracle.com (Igor Veresov)
Date: Mon, 14 Mar 2016 12:16:25 -0700
Subject: RFR(S): 8151818: C1: LIRGenerator::move_to_phi can't deal with illegal phi
In-Reply-To:
References:
Message-ID: <258236A7-9325-4F19-B3F4-3108DA39D9DF@oracle.com>

Seems fine.

igor

> On Mar 14, 2016, at 8:50 AM, Doerr, Martin wrote:
>
> Sorry, I had pasted the wrong link to the webrev:
> http://cr.openjdk.java.net/~mdoerr/8151818_c1_illegal_phi/webrev.00/
>
> From: Doerr, Martin
> Sent: Montag, 14. März 2016 16:49
> To: hotspot-compiler-dev at openjdk.java.net
> Subject: RFR(S): 8151818: C1: LIRGenerator::move_to_phi can't deal with illegal phi
>
> Hi,
>
> we found out that C1 can't deal with illegal Phi functions which propagate into other Phi functions.
>
> Phi functions get illegal when their inputs have different types.
> This was observed when we activated the JVMTI capability can_access_local_variables and restored the old behavior of BlockBegin::try_merge: invalidate the phi functions instead of bailing out.
> The function LIRGenerator::move_to_phi crashes in this case.
>
> Proposed fix is to bail out as this case happens extremely rarely. Seems like it was never observed with the new behavior of BlockBegin::try_merge.
>
> I also improved some assertions to support locals with illegal types.
>
> The webrev is here:
> https://bugs.openjdk.java.net/browse/JDK-8151818
>
> Please review. I will also need a sponsor if this change is desired.
> > Best regards, > Martin -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.thalinger at oracle.com Mon Mar 14 20:24:30 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Mon, 14 Mar 2016 10:24:30 -1000 Subject: [9] RFR 8150850: [JVMCI] NPE when executing HotSpotConstantReflectionProvider.readStableFieldValue In-Reply-To: <56E6A848.6060200@oracle.com> References: <56E6A848.6060200@oracle.com> Message-ID: Looks good. > On Mar 14, 2016, at 2:02 AM, Konstantin Shefov wrote: > > Hello > > Please review a bug fix in JVMCI jdk.vm.ci.hotspot.HotSpotConstantReflectionProvider::readStableFieldValue method. > > When one executes HotSpotConstantReflectionProvider::readStableFieldValue method for an instance field and JavaConstant.NULL_POINTER as a receiver, it throws an NPE. However, the javadoc says it should return null. > > E.g. when we execute method HotSpotConstantReflectionProvider::readFieldValue for an instance field and JavaConstant.NULL_POINTER as a receiver, we get null as expected. > > Additional check should be added for null as the first argument. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8150850 > Webrev: http://cr.openjdk.java.net/~kshefov/8150850/webrev.00 > > Thanks > -Konstantin -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Mon Mar 14 20:24:49 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 14 Mar 2016 13:24:49 -0700 Subject: Support for AES on ppc64le In-Reply-To: <201603141634.u2EGYYDb023702@d19av08.sagamino.japan.ibm.com> References: <201603141634.u2EGYYDb023702@d19av08.sagamino.japan.ibm.com> Message-ID: <56E71E11.3040808@oracle.com> Hi Hiroshi About library_call.cpp changes. 
You don't need GraphKit:: and you can use load_array_element() instead:

Node* objAESCryptKey = load_array_element(control(), objSessionK, intcon(0), TypeAryPtr::OOPS);

You may need an additional check and cast because the next expression expects objAESCryptKey to point to an int[]:

Node* k_start = array_element_address(objAESCryptKey, intcon(0), T_INT);

Thanks,
Vladimir

On 3/14/16 9:34 AM, Hiroshi H Horii wrote:
> Dear all:
>
> Can I please request reviews for the following change?
> This change was created for JDK 9.
>
> Description:
> This change adds stub routines support for single-block AES encryption and
> decryption operations on the POWER8 platform. They are available only when
> the application is configured to use SunJCE crypto provider on little
> endian.
> These stubs make use of efficient hardware AES instructions and thus
> offer significant performance improvements over JITed code on POWER8
> as on x86 and SPARC. AES stub routines are enabled by default on POWER8
> platforms that support AES instructions (vcipher). They can be
> explicitly enabled or
> disabled on the command-line using UseAES and UseAESIntrinsics JVM flags.
> Unlike x86 and SPARC, vcipher and vncipher of POWER8 need the same round
> keys of AES. Therefore, inline_aescrypt_Block in library_call.cpp calls
> the stub with
> AESCrypt.sessionK[0] as round keys.
>
> Summary of source code changes:
>
> *src/cpu/ppc/vm/assembler_ppc.hpp
> *src/cpu/ppc/vm/assembler_ppc.inline.hpp
> - Adds support for vrld instruction to rotate vector register values
> with
> left doubleword.
>
> *src/cpu/ppc/vm/stubGenerator_ppc.cpp
> - Defines stubs for single-block AES encryption and decryption routines
> supporting all key sizes (128-bit, 192-bit and 256-bit).
> - Current POWER AES decryption instructions are not compatible with
> SunJCE expanded decryption key format. Thus decryption stubs read
> the expanded encryption keys (sessionK[0]) with descendant order.
> - Encryption stubs use SunJCE expanded encryption key as their is > no incompatibility issue between POWER8 AES encryption instructions > and SunJCE expanded encryption keys. > > *src/cpu/ppc/vm/vm_version_ppc.cpp > - Detects AES capabilities of the underlying CPU by using has_vcipher(). > - Enables UseAES and UseAESIntrinsics flags if the underlying CPU > supports AES instructions and neither of them is explicitly > disabled on > the command-line. Generate warning message if either of these > flags are > enabled on the command-line whereas the underlying CPU does not > support > AES instructions. > > *src/share/vm/opto/library_call.cpp > - Passes the first input parameter, reference to sessionK[0] to the > AES stubs > only on the POWER platform. > > Code change: > Please see an attached diff file that was generated with "hg diff > -g" under > the latest hotspot directory. > > Passed tests: > jtreg compiler/codegen/7184394/ > jtreg compiler/cpuflags/ (after removing @ignored annotation) > > * This is my first post of a change. I'm sorry in advance if I don't > follow the > community manners. > > * I wrote this description based on the follows. > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013-November/012670.html > > > > Regards, > Hiroshi > ----------------------- > Hiroshi Horii, > IBM Research - Tokyo > From christian.thalinger at oracle.com Mon Mar 14 20:26:51 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Mon, 14 Mar 2016 10:26:51 -1000 Subject: RFR(XXS): 8151795: compiler/compilercontrol/parser/DirectiveParserTest.java fails with "assert failed: 0 != 0" In-Reply-To: <56E69664.8030500@oracle.com> References: <56E69664.8030500@oracle.com> Message-ID: <017694D0-80BE-460B-8239-61189F5592B1@oracle.com> Looks good. > On Mar 14, 2016, at 12:45 AM, Nils Eliasson wrote: > > Hi, > > Summary: > Adding directives from diagnostic command should treat zero directives in file as failure. 
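For context, the compiler directives referenced throughout this digest live in a small JSON-like file (JEP 165, Compiler Control). A minimal sketch — the match pattern is illustrative — that forces blocking compilation of one method:

```
[
  {
    match: "BlockingCompilation::foo",
    BackgroundCompilation: false
  }
]
```

Such a file can be added at runtime with jcmd <pid> Compiler.directives_add <file>; a file containing zero directives is what this fix now treats as a failure.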
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8151795 > Webrev: http://cr.openjdk.java.net/~neliasso/8151795/webrev.01/ > > Regards, > Nils Eliasson From vladimir.kozlov at oracle.com Mon Mar 14 20:28:47 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 14 Mar 2016 13:28:47 -0700 Subject: RFR(XXS): 8151795: compiler/compilercontrol/parser/DirectiveParserTest.java fails with "assert failed: 0 != 0" In-Reply-To: <017694D0-80BE-460B-8239-61189F5592B1@oracle.com> References: <56E69664.8030500@oracle.com> <017694D0-80BE-460B-8239-61189F5592B1@oracle.com> Message-ID: <56E71EFF.1020601@oracle.com> Good. Thanks, Vladimir On 3/14/16 1:26 PM, Christian Thalinger wrote: > Looks good. > >> On Mar 14, 2016, at 12:45 AM, Nils Eliasson wrote: >> >> Hi, >> >> Summary: >> Adding directives from diagnostic command should treat zero directives in file as failure. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8151795 >> Webrev: http://cr.openjdk.java.net/~neliasso/8151795/webrev.01/ >> >> Regards, >> Nils Eliasson > From vladimir.kozlov at oracle.com Mon Mar 14 20:30:33 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 14 Mar 2016 13:30:33 -0700 Subject: [9] RFR(S): 8150804: C2 Compilation fails with assert(_base >= OopPtr && _base <= AryPtr) failed: Not a Java pointer In-Reply-To: <56E6D6FD.2010502@oracle.com> References: <56E2E6D4.4040308@oracle.com> <56E613C8.1040300@oracle.com> <56E68DE8.10608@oracle.com> <56E6D6FD.2010502@oracle.com> Message-ID: <56E71F69.8090304@oracle.com> Looks good. 
Thanks, Vladimir On 3/14/16 8:21 AM, Tobias Hartmann wrote: > Hi, > > I triggered another problem with the latest webrev: > > # Internal Error (/scratch/opt/jprt/T/P1/101046.tohartma/s/hotspot/src/share/vm/opto/phaseX.cpp:346), pid=11651, tid=11671 > # assert(t == t_no_spec) failed: dead node in hash table or missed node during speculative cleanup > > We should add the input nodes to the IGVN worklist after cutting off the Phi because they may be dead now: > http://cr.openjdk.java.net/~thartmann/8150804/webrev.02/ > > Thanks, > Tobias > > On 14.03.2016 11:09, Tobias Hartmann wrote: >> Hi Vladimir, >> >> thanks for the review! >> >> On 14.03.2016 02:28, Vladimir Kozlov wrote: >>> It is strange that next subgraph did not collapse - CheckCastPPNode::Identity() should remove second CheckCastPP: >>> >>>> CheckCastPP >>>> A:NotNull >>>> / \ >>>> CheckCastPP | >>>> A:NotNull | >>>> \ / >>>> Phi >>>> A >> >> When CheckCastPPNode::Identity() is first invoked, the graph looks like this >> >> 92 CheckCastPP === 90 32 [[]] #A:NotNull * Oop:A:NotNull * !jvms: TestPhiElimination::test @ bci:8 >> 135 CheckCastPP === 133 92 [[]] #A:NotNull * (speculative=A:NotNull:exact * (inline_depth=2)) Oop:A:NotNull * (speculative=A:NotNull:exact * (inline_depth=2)) !jvms: A::get @ bci:9 TestPhiElimination::test @ bci:11 >> >> Because 135 has a speculative part, the types are not equal and 135 is not replaced by 92. >> >>> I think next code in PhiNode::Ideal() should check can_reshape too since during parsing graph is incomplete: >>> >>> if (uin == NULL) { >>> uncasted = true; >>> uin = unique_input(phase, true); >>> } >> >> Yes, that's a better solution. I verified that it works (Phi is folded after parsing and has correct A:NotNull type). >> >> Here is the new webrev: >> http://cr.openjdk.java.net/~thartmann/8150804/webrev.01/ >> >> Thanks, >> Tobias >> >>> >>> Thanks, >>> Vladimir >>> >>> On 3/11/16 7:40 AM, Tobias Hartmann wrote: >>>> Hi, >>>> >>>> please review the following patch. 
>>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8150804 >>>> http://cr.openjdk.java.net/~thartmann/8150804/webrev.00/ >>>> >>>> We fail in Compile::Process_OopMap_Node() while processing monitors of a safepoint node because the monitor object is TOP. The crash is rare but reproduces with my regression test. The problem is the elimination of Phi nodes with a unique input which was broken by the fixes for JDK-8139771 [1] and JDK-8146999 [2]. >>>> >>>> Here are the details (for context, see 'TestPhiElimination.java'): >>>> A::get() is inlined into test(obj) producing the following graph: >>>> >>>> Parm (obj) >>>> TestPhiElimination >>>> | >>>> CastPP >>>> TestPhiElimination:NotNull >>>> | >>>> CheckCastPP >>>> A:NotNull >>>> / \ >>>> CheckCastPP | >>>> A:NotNull | >>>> \ / >>>> Phi >>>> A >>>> | >>>> Safepoint >>>> >>>> https://bugs.openjdk.java.net/secure/attachment/57820/before_ideal.png >>>> >>>> PhiNode::ideal() then replaces the Phi by a CheckCastPP because it has a unique input (see PhiNode::unique_input()): >>>> >>>> Parm (obj) >>>> TestPhiElimination >>>> | >>>> CheckCastPP >>>> A >>>> | >>>> Safepoint >>>> >>>> https://bugs.openjdk.java.net/secure/attachment/57821/after_ideal.png >>>> >>>> We completely lose the NotNull information provided by the CastPP. Therefore, we cannot prove that obj != null when accessing a field of obj and add an uncommon trap. Obj is also used as a monitor (A::get() is synchronized) and set to TOP in the uncommon trap branch. We are never able to prove that the null branch is not reachable and later fail when emitting code in Process_OopMap_Node because the monitor object is still TOP. 
>>>> >>>> Before the fix for JDK-8139771, we had a check to verify that the type of the unique (uncasted) input is "at least as good" as the type of the PhiNode: >>>> >>>> phase->type(uncasted_input)->higher_equal(type())) >>>> >>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/9e17d9e4b59f#l4.79 >>>> >>>> Re-adding this check, fixes the problem. However, I'm concerned that this check is not strong enough. For example, in the case where the type of the PhiNode is Object: >>>> >>>> Parm (obj) >>>> TestPhiElimination >>>> | >>>> CastPP >>>> TestPhiElimination:NotNull >>>> | >>>> CheckCastPP >>>> A:NotNull >>>> / \ >>>> CheckCastPP | >>>> A:NotNull | >>>> \ / >>>> Phi >>>> Object >>>> >>>> We would still replace the Phi because TestPhiElimination->higher_equal(Object) and again lose the NotNull information. I therefore added a slightly stronger check that also checks the types in-between. I had to remove the assert that Roland added. >>>> >>>> What do you think? >>>> >>>> Thanks, >>>> Tobias >>>> >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8139771 >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8146999 >>>> From tom.rodriguez at oracle.com Tue Mar 15 06:53:05 2016 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Mon, 14 Mar 2016 23:53:05 -0700 Subject: RFR(XS) 8151871: [JVMCI] missing HAS_PENDING_EXCEPTION check Message-ID: <2228DD8E-063B-4BFE-8EB9-1FC6D2387A8C@oracle.com> http://cr.openjdk.java.net/~never/8151871/webrev/index.html Somehow during various edits a HAS_PENDING_EXCEPTION/CLEAR_PENDING_EXCEPTION check got lost. It should be restored. tom -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tobias.hartmann at oracle.com Tue Mar 15 07:01:08 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 15 Mar 2016 08:01:08 +0100 Subject: [9] RFR(S): 8150804: C2 Compilation fails with assert(_base >= OopPtr && _base <= AryPtr) failed: Not a Java pointer In-Reply-To: <56E71F69.8090304@oracle.com> References: <56E2E6D4.4040308@oracle.com> <56E613C8.1040300@oracle.com> <56E68DE8.10608@oracle.com> <56E6D6FD.2010502@oracle.com> <56E71F69.8090304@oracle.com> Message-ID: <56E7B334.1010905@oracle.com> Thanks, Vladimir! Best regards, Tobias On 14.03.2016 21:30, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir > > On 3/14/16 8:21 AM, Tobias Hartmann wrote: >> Hi, >> >> I triggered another problem with the latest webrev: >> >> # Internal Error (/scratch/opt/jprt/T/P1/101046.tohartma/s/hotspot/src/share/vm/opto/phaseX.cpp:346), pid=11651, tid=11671 >> # assert(t == t_no_spec) failed: dead node in hash table or missed node during speculative cleanup >> >> We should add the input nodes to the IGVN worklist after cutting off the Phi because they may be dead now: >> http://cr.openjdk.java.net/~thartmann/8150804/webrev.02/ >> >> Thanks, >> Tobias >> >> On 14.03.2016 11:09, Tobias Hartmann wrote: >>> Hi Vladimir, >>> >>> thanks for the review! 
>>> >>> On 14.03.2016 02:28, Vladimir Kozlov wrote: >>>> It is strange that next subgraph did not collapse - CheckCastPPNode::Identity() should remove second CheckCastPP: >>>> >>>>> CheckCastPP >>>>> A:NotNull >>>>> / \ >>>>> CheckCastPP | >>>>> A:NotNull | >>>>> \ / >>>>> Phi >>>>> A >>> >>> When CheckCastPPNode::Identity() is first invoked, the graph looks like this >>> >>> 92 CheckCastPP === 90 32 [[]] #A:NotNull * Oop:A:NotNull * !jvms: TestPhiElimination::test @ bci:8 >>> 135 CheckCastPP === 133 92 [[]] #A:NotNull * (speculative=A:NotNull:exact * (inline_depth=2)) Oop:A:NotNull * (speculative=A:NotNull:exact * (inline_depth=2)) !jvms: A::get @ bci:9 TestPhiElimination::test @ bci:11 >>> >>> Because 135 has a speculative part, the types are not equal and 135 is not replaced by 92. >>> >>>> I think next code in PhiNode::Ideal() should check can_reshape too since during parsing graph is incomplete: >>>> >>>> if (uin == NULL) { >>>> uncasted = true; >>>> uin = unique_input(phase, true); >>>> } >>> >>> Yes, that's a better solution. I verified that it works (Phi is folded after parsing and has correct A:NotNull type). >>> >>> Here is the new webrev: >>> http://cr.openjdk.java.net/~thartmann/8150804/webrev.01/ >>> >>> Thanks, >>> Tobias >>> >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 3/11/16 7:40 AM, Tobias Hartmann wrote: >>>>> Hi, >>>>> >>>>> please review the following patch. >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8150804 >>>>> http://cr.openjdk.java.net/~thartmann/8150804/webrev.00/ >>>>> >>>>> We fail in Compile::Process_OopMap_Node() while processing monitors of a safepoint node because the monitor object is TOP. The crash is rare but reproduces with my regression test. The problem is the elimination of Phi nodes with a unique input which was broken by the fixes for JDK-8139771 [1] and JDK-8146999 [2]. 
>>>>> >>>>> Here are the details (for context, see 'TestPhiElimination.java'): >>>>> A::get() is inlined into test(obj) producing the following graph: >>>>> >>>>> Parm (obj) >>>>> TestPhiElimination >>>>> | >>>>> CastPP >>>>> TestPhiElimination:NotNull >>>>> | >>>>> CheckCastPP >>>>> A:NotNull >>>>> / \ >>>>> CheckCastPP | >>>>> A:NotNull | >>>>> \ / >>>>> Phi >>>>> A >>>>> | >>>>> Safepoint >>>>> >>>>> https://bugs.openjdk.java.net/secure/attachment/57820/before_ideal.png >>>>> >>>>> PhiNode::ideal() then replaces the Phi by a CheckCastPP because it has a unique input (see PhiNode::unique_input()): >>>>> >>>>> Parm (obj) >>>>> TestPhiElimination >>>>> | >>>>> CheckCastPP >>>>> A >>>>> | >>>>> Safepoint >>>>> >>>>> https://bugs.openjdk.java.net/secure/attachment/57821/after_ideal.png >>>>> >>>>> We completely lose the NotNull information provided by the CastPP. Therefore, we cannot prove that obj != null when accessing a field of obj and add an uncommon trap. Obj is also used as a monitor (A::get() is synchronized) and set to TOP in the uncommon trap branch. We are never able to prove that the null branch is not reachable and later fail when emitting code in Process_OopMap_Node because the monitor object is still TOP. >>>>> >>>>> Before the fix for JDK-8139771, we had a check to verify that the type of the unique (uncasted) input is "at least as good" as the type of the PhiNode: >>>>> >>>>> phase->type(uncasted_input)->higher_equal(type())) >>>>> >>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/9e17d9e4b59f#l4.79 >>>>> >>>>> Re-adding this check, fixes the problem. However, I'm concerned that this check is not strong enough. 
For example, in the case where the type of the PhiNode is Object: >>>>> >>>>> Parm (obj) >>>>> TestPhiElimination >>>>> | >>>>> CastPP >>>>> TestPhiElimination:NotNull >>>>> | >>>>> CheckCastPP >>>>> A:NotNull >>>>> / \ >>>>> CheckCastPP | >>>>> A:NotNull | >>>>> \ / >>>>> Phi >>>>> Object >>>>> >>>>> We would still replace the Phi because TestPhiElimination->higher_equal(Object) and again lose the NotNull information. I therefore added a slightly stronger check that also checks the types in-between. I had to remove the assert that Roland added. >>>>> >>>>> What do you think? >>>>> >>>>> Thanks, >>>>> Tobias >>>>> >>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8139771 >>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8146999 >>>>> From tom.rodriguez at oracle.com Tue Mar 15 07:12:12 2016 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Tue, 15 Mar 2016 00:12:12 -0700 Subject: RFR(XS) 8151874: [JVMCI] canInlineMethod should check is_not_compilable for correct CompLevel Message-ID: <059D20A4-51E0-4DB3-B626-E922ED33A1F5@oracle.com> http://cr.openjdk.java.net/~never/8151874/webrev/ Currently canInlineMethod is calling is_not_compilable() with no arguments which checks whether compilation has been disabled at any level which is incorrect. Only CompLevel_full_optimization should be consulted. tom -------------- next part -------------- An HTML attachment was scrubbed... URL: From nils.eliasson at oracle.com Tue Mar 15 09:25:24 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 15 Mar 2016 10:25:24 +0100 Subject: RFR(S):8151796: compiler/whitebox/BlockingCompilation.java fails due to method not compiled In-Reply-To: References: <56E69E8F.7090104@oracle.com> <56E6CDF9.20005@oracle.com> Message-ID: <56E7D504.4000502@oracle.com> Hi, On 2016-03-14 16:16, Volker Simonis wrote: > > > On Mon, Mar 14, 2016 at 3:43 PM, Nils Eliasson > > wrote: > > Hi Volker, > > On 2016-03-14 14:58, Volker Simonis wrote: >> Hi Nils, >> >> thanks for improving the test. 
I think your fix solves some >> problems with regard to fast, non-blocking compiles which are >> wrongly interpreted as blocking by the test. But what about the >> initial test failure: >> >> java.lang.Exception: public static int BlockingCompilation.foo() >> should be compiled at level 4(but is actually compiled at level 0) >> at BlockingCompilation.main(BlockingCompilation.java:104) >> >> This is from the loop which does blocking compilations. It seems >> that a method enqueued for level 4 couldn't be compiled at all. I >> don't know the exact reason, but one could be for example that >> the code cache was full or that the compiler bailed out because >> of another reason. I'm not sure we can accurately handle this >> situation in the test. Maybe we should tolerate if a method >> couldn't be compiled at all: > > This failure only happened on (slow) non-tiered platforms and the > log looked like that as if the compiler even hadn't been put on > the compile queue. In the first version of my rewrite I checked > the return value from enqueueMethodForCompilation to make sure the > compile was actually added. But then I changed my mind and focused > on just testing the blocking functionality. > >> 121 if (WB.getMethodCompilationLevel(m) != l*&& >> **WB.getMethodCompilationLevel(m) != 0*) { >> Also, I don't understand the following code: >> 67 // Make sure no compilations can progress, blocking compiles >> will hang >> 68 WB.lockCompilation(); ... 78 // Normal compile on all levels >> 79 for (int l : levels) { >> 80 WB.enqueueMethodForCompilation(m, l); >> 81 } >> 82 >> 83 // restore state >> 84 WB.unlockCompilation(); >> 85 while (!WB.isMethodCompiled(m)) { >> 86 Thread.sleep(100); >> 87 } >> 88 WB.deoptimizeMethod(m); >> 89 WB.clearMethodState(m); >> You enqueue the methods on all levels (let's assume 1,2,3,4). >> Then you wait until the method gets compiled at a level (lets say >> at level 1). 
I think this is already shaky, because these are >> non-blocking compiles of a method which hasn't been called >> before, so the requests can be easily get stale. > > Blocking compiles do not get stale any more - that is included in > the patch. > > Only one item will actually be added to the compile queue - the > rest will be dropped because the method is already enqueued. The > loop makes the code work on all VM-flavours (client, serverm > tiered) without worrying about compilation levels. The > compilation-lock prevents any compilations from completing - so > the all calls on enqueueMethodForCompilation() will be > deterministic, and most important - we will get a deterministic > result (hanged VM) if the method is blocking here. > >> But lets say the method will be compiled at level one. You then >> deoptimze and clear the method state. But the queue can still >> contain the compilation requests for the three other levels which >> can lead to errors in the following test: > > It can only get on the queue once. It looks like this in the log: > > Start of test - not blocking > > 524 257 1 BlockingCompilation::foo (7 bytes) > 625 257 1 BlockingCompilation::foo (7 bytes) made not entrant > > OK, but then the following loop is useless (and the comment > misleading) because we actually only enqueue and compile on one level > (the first one which is available): I changed to compiling on the highest available comp level. http://cr.openjdk.java.net/~neliasso/8151796/webrev.07/ > 78 // Normal compile on all levels > 79 for (int l : levels) { > 80 WB.enqueueMethodForCompilation(m, l); > 81 } > > Besides that, your changes look good! 
> Regards, Volker Thank you, Nils > Directive added, then blocking part where all levels are tested: > > 1 compiler directives added > 626 258 b 1 BlockingCompilation::foo (7 bytes) > 627 258 1 BlockingCompilation::foo (7 bytes) made not entrant > 627 259 b 2 BlockingCompilation::foo (7 bytes) > 628 259 2 BlockingCompilation::foo (7 bytes) made not entrant > 629 260 b 3 BlockingCompilation::foo (7 bytes) > 630 260 3 BlockingCompilation::foo (7 bytes) made not entrant > 630 261 b 4 BlockingCompilation::foo (7 bytes) > 632 261 4 BlockingCompilation::foo (7 bytes) made not entrant > > > And finally the non-blocking part where only one level gets compiled: > > 633 262 1 BlockingCompilation::foo (7 bytes) > > >> 103 //Verify that it actuall is uncompiled >> 104 if (WB.isMethodCompiled(m)) { >> 105 throw new Exception("Should not be compiled after >> deoptimization"); >> 106 } >> Finally some typos: >> 103 //Verify that it actuall*y* is uncompiled 111 // Add to queue >> a*n*d make sure it actually went well >> >> Regards, >> Volker > > Thanks for feedback, > Nils Eliasson > >> >> >> On Mon, Mar 14, 2016 at 12:20 PM, Nils Eliasson >> > wrote: >> >> Hi, >> >> Summary: >> The test wasn't as robust as expected. >> >> Solution: >> Change the way we verify that we are having a un-blocking >> compilation: >> First lock the compilation queue - no new compiles will be >> completed. Enqueue method for compilation. If the method is >> compiled blockingly - the java thread will hang since the >> compile can't complete as long as the compile queue is locked. >> >> Use this to test the blocking functionality in three steps: >> 1) Verify that we are not blocking on target method as described. >> 2) Add compiler directive with instruction to block on target >> method - verify that it can be compiled on all levels. If it >> is not blocking it will eventually be stalled for a moment in >> the compiler queue and the test will fail. 
>> 3) Pop directive, and redo step one - verify that target >> method is not blocking. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8151796 >> Webrev: >> http://cr.openjdk.java.net/~neliasso/8151796/werev.03/ >> >> >> Regards, >> Nils Eliasson >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nils.eliasson at oracle.com Tue Mar 15 09:26:13 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 15 Mar 2016 10:26:13 +0100 Subject: RFR(XXS): 8151795: compiler/compilercontrol/parser/DirectiveParserTest.java fails with "assert failed: 0 != 0" In-Reply-To: <56E71EFF.1020601@oracle.com> References: <56E69664.8030500@oracle.com> <017694D0-80BE-460B-8239-61189F5592B1@oracle.com> <56E71EFF.1020601@oracle.com> Message-ID: <56E7D535.9070802@oracle.com> Thank you Christian and Vladimir, Nils On 2016-03-14 21:28, Vladimir Kozlov wrote: > Good. > > Thanks, > Vladimir > > On 3/14/16 1:26 PM, Christian Thalinger wrote: >> Looks good. >> >>> On Mar 14, 2016, at 12:45 AM, Nils Eliasson >>> wrote: >>> >>> Hi, >>> >>> Summary: >>> Adding directives from diagnostic command should treat zero >>> directives in file as failure. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8151795 >>> Webrev: http://cr.openjdk.java.net/~neliasso/8151795/webrev.01/ >>> >>> Regards, >>> Nils Eliasson >> From HORII at jp.ibm.com Tue Mar 15 09:31:22 2016 From: HORII at jp.ibm.com (Hiroshi H Horii) Date: Tue, 15 Mar 2016 09:31:22 +0000 Subject: Support for AES on ppc64le In-Reply-To: <56E71E11.3040808@oracle.com> References: <201603141634.u2EGYYDb023702@d19av08.sagamino.japan.ibm.com> <56E71E11.3040808@oracle.com> Message-ID: <201603150932.u2F9W7k9028664@d19av05.sagamino.japan.ibm.com> Hi Vladimir, Thank you a lot for your quick response and review. When using load_array_element, a SEGV happened, so I needed the following change. Could you also review whether this change is reasonable? 
diff --git a/src/share/vm/opto/graphKit.cpp b/src/share/vm/opto/graphKit.cpp --- a/src/share/vm/opto/graphKit.cpp +++ b/src/share/vm/opto/graphKit.cpp @@ -1680,6 +1680,8 @@ Node* GraphKit::load_array_element(Node* ctl, Node* ary, Node* idx, const TypeAryPtr* arytype) { const Type* elemtype = arytype->elem(); BasicType elembt = elemtype->array_element_basic_type(); + if (elembt == T_NARROWOOP) + elembt = T_OBJECT; Node* adr = array_element_address(ary, idx, elembt, arytype->size()); Node* ld = make_load(ctl, adr, elemtype, elembt, arytype, MemNode::unordered); return ld; I attached a full diff that is applied your kind suggestions. Regards, Hiroshi ----------------------- Hiroshi Horii, Ph.D. IBM Research - Tokyo Vladimir Kozlov wrote on 03/15/2016 05:24:49: > From: Vladimir Kozlov > To: Hiroshi H Horii/Japan/IBM at IBMJP, hotspot-compiler-dev at openjdk.java.net > Cc: "Simonis, Volker" , Tim Ellison > > Date: 03/15/2016 05:25 > Subject: Re: Support for AES on ppc64le > > Hi Hiroshi > > About library_call.cpp changes. > > You don't need GraphKit:: > > And you can use load_array_element() instead: > > Node* objAESCryptKey = load_array_element(control(), objSessionK, > intcon(0), TypeAryPtr::OOPS); > > You may need additional check and cast because next expression expects > the objAESCryptKey points to int[]: > > Node* k_start = array_element_address(objAESCryptKey, intcon(0), T_INT); > > Thanks, > Vladimir > > On 3/14/16 9:34 AM, Hiroshi H Horii wrote: > > Dear all: > > > > Can I please request reviews for the following change? > > This change was created for JDK 9. > > > > Description: > > This change adds stub routines support for single-block AES encryption and > > decryption operations on the POWER8 platform. They are available only when > > the application is configured to use SunJCE crypto provider on little > > endian. 
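[Editorial note: as background, the SunJCE single-block AES operation that these stubs intrinsify can be exercised from plain Java with the standard javax.crypto API. The following is only an illustrative sketch, not part of the patch; the all-zero key and block values are made up.]

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.util.Arrays;

// Illustrative only: a single 16-byte AES block encrypted and decrypted
// through SunJCE. On hardware with AES instructions, the hot per-block
// work inside such calls is what the stub routines replace.
public class AesBlockDemo {
    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16];    // 128-bit key (all zero, demo only)
        byte[] block = new byte[16];  // exactly one AES block
        SecretKeySpec k = new SecretKeySpec(key, "AES");

        // ECB/NoPadding so doFinal processes exactly one block.
        Cipher enc = Cipher.getInstance("AES/ECB/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, k);
        byte[] ct = enc.doFinal(block);

        Cipher dec = Cipher.getInstance("AES/ECB/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, k);
        byte[] pt = dec.doFinal(ct);

        System.out.println(Arrays.equals(pt, block)); // round-trip check
    }
}
```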
> > These stubs make use of efficient hardware AES instructions and thus > > offer significant performance improvements over JITed code on POWER8, > > as on x86 and SPARC. AES stub routines are enabled by default on POWER8 > > platforms that support AES instructions (vcipher). They can be > > explicitly enabled or > > disabled on the command-line using UseAES and UseAESIntrinsics JVM flags. > > Unlike x86 and SPARC, vcipher and vncipher of POWER8 need the same round > > keys of AES. Therefore, inline_aescrypt_Block in library_call.cpp calls > > the stub with > > AESCrypt.sessionK[0] as round keys. > > > > Summary of source code changes: > > > > *src/cpu/ppc/vm/assembler_ppc.hpp > > *src/cpu/ppc/vm/assembler_ppc.inline.hpp > > - Adds support for the vrld instruction to rotate vector register values > > with > > left doubleword. > > > > *src/cpu/ppc/vm/stubGenerator_ppc.cpp > > - Defines stubs for single-block AES encryption and decryption routines > > supporting all key sizes (128-bit, 192-bit and 256-bit). > > - Current POWER AES decryption instructions are not compatible with the > > SunJCE expanded decryption key format. Thus decryption stubs read > > the expanded encryption keys (sessionK[0]) in descending order. > > - Encryption stubs use the SunJCE expanded encryption key as there is > > no incompatibility issue between POWER8 AES encryption instructions > > and SunJCE expanded encryption keys. > > > > *src/cpu/ppc/vm/vm_version_ppc.cpp > > - Detects AES capabilities of the underlying CPU by using has_vcipher(). > > - Enables UseAES and UseAESIntrinsics flags if the underlying CPU > > supports AES instructions and neither of them is explicitly > > disabled on > > the command-line. Generates a warning message if either of these > > flags is > > enabled on the command-line whereas the underlying CPU does not > > support > > AES instructions. 
> > > > *src/share/vm/opto/library_call.cpp > > - Passes the first input parameter, reference to sessionK[0] to the > > AES stubs > > only on the POWER platform. > > > > Code change: > > Please see an attached diff file that was generated with "hg diff > > -g" under > > the latest hotspot directory. > > > > Passed tests: > > jtreg compiler/codegen/7184394/ > > jtreg compiler/cpuflags/ (after removing @ignored annotation) > > > > * This is my first post of a change. I'm sorry in advance if I don't > > follow the > > community manners. > > > > * I wrote this description based on the follows. > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013- > November/012670.html > > > > > > > > Regards, > > Hiroshi > > ----------------------- > > Hiroshi Horii, > > IBM Research - Tokyo > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ppc64le_aes_support_20160315.diff Type: application/octet-stream Size: 22280 bytes Desc: not available URL: From volker.simonis at gmail.com Tue Mar 15 09:46:44 2016 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 15 Mar 2016 10:46:44 +0100 Subject: RFR(S):8151796: compiler/whitebox/BlockingCompilation.java fails due to method not compiled In-Reply-To: <56E7D504.4000502@oracle.com> References: <56E69E8F.7090104@oracle.com> <56E6CDF9.20005@oracle.com> <56E7D504.4000502@oracle.com> Message-ID: OK, looks good now. Regards, Volker On Tue, Mar 15, 2016 at 10:25 AM, Nils Eliasson wrote: > Hi, > > > On 2016-03-14 16:16, Volker Simonis wrote: > > > > On Mon, Mar 14, 2016 at 3:43 PM, Nils Eliasson > wrote: > >> Hi Volker, >> >> On 2016-03-14 14:58, Volker Simonis wrote: >> >> Hi Nils, >> >> thanks for improving the test. I think your fix solves some problems with >> regard to fast, non-blocking compiles which are wrongly interpreted as >> blocking by the test. 
But what about the initial test failure: >> >> java.lang.Exception: public static int BlockingCompilation.foo() should >> be compiled at level 4(but is actually compiled at level 0) >> at BlockingCompilation.main(BlockingCompilation.java:104) >> >> This is from the loop which does blocking compilations. It seems that a >> method enqueued for level 4 couldn't be compiled at all. I don't know the >> exact reason, but one could be for example that the code cache was full or >> that the compiler bailed out because of another reason. I'm not sure we can >> accurately handle this situation in the test. Maybe we should tolerate if a >> method couldn't be compiled at all: >> >> >> This failure only happened on (slow) non-tiered platforms and the log >> looked like that as if the compiler even hadn't been put on the compile >> queue. In the first version of my rewrite I checked the return value from >> enqueueMethodForCompilation to make sure the compile was actually added. >> But then I changed my mind and focused on just testing the blocking >> functionality. >> >> 121 if (WB.getMethodCompilationLevel(m) != l* && **WB.getMethodCompilationLevel(m) != 0*) { >> >> Also, I don't understand the following code: >> >> 67 // Make sure no compilations can progress, blocking compiles will hang 68 WB.lockCompilation(); >> ... 78 // Normal compile on all levels 79 for (int l : levels) { 80 WB.enqueueMethodForCompilation(m, l); >> 81 } >> 82 83 // restore state 84 WB.unlockCompilation(); 85 while (!WB.isMethodCompiled(m)) { 86 Thread.sleep(100); 87 } 88 WB.deoptimizeMethod(m); >> 89 WB.clearMethodState(m); >> >> You enqueue the methods on all levels (let's assume 1,2,3,4). Then you >> wait until the method gets compiled at a level (lets say at level 1). I >> think this is already shaky, because these are non-blocking compiles of a >> method which hasn't been called before, so the requests can be easily get >> stale. 
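[Editorial note: the lock-the-compile-queue technique this thread relies on can be modeled with a plain-Java toy example. This is only an illustration of the test's logic, not the real WhiteBox API; every name in it is invented.]

```java
import java.util.concurrent.locks.ReentrantLock;

// Toy model of the test design: while the "queue lock" is held, a
// *blocking* enqueue hangs its caller, whereas a non-blocking enqueue
// returns immediately. This mirrors WB.lockCompilation() /
// WB.unlockCompilation() only in spirit.
public class QueueLockModel {
    static final ReentrantLock queueLock = new ReentrantLock();

    // Non-blocking: drop the request in the queue and return at once.
    static boolean enqueueNonBlocking() {
        return true; // compilation would happen later, asynchronously
    }

    // Blocking: cannot return until the "compiler" gets the queue lock.
    static void enqueueBlocking() {
        queueLock.lock(); // hangs as long as the test holds the lock
        try { /* compile would complete here */ } finally { queueLock.unlock(); }
    }

    public static void main(String[] args) throws Exception {
        queueLock.lock();                  // analogue of WB.lockCompilation()
        boolean ok = enqueueNonBlocking(); // must not hang
        Thread t = new Thread(QueueLockModel::enqueueBlocking);
        t.start();
        t.join(200);                       // give the thread time to block
        boolean blocked = t.isAlive();     // still stuck -> it was blocking
        queueLock.unlock();                // analogue of WB.unlockCompilation()
        t.join();
        System.out.println(ok + " " + blocked);
    }
}
```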
>> >> >> Blocking compiles do not get stale any more - that is included in the >> patch. >> >> Only one item will actually be added to the compile queue - the rest will >> be dropped because the method is already enqueued. The loop makes the code >> work on all VM-flavours (client, serverm tiered) without worrying about >> compilation levels. The compilation-lock prevents any compilations from >> completing - so the all calls on enqueueMethodForCompilation() will be >> deterministic, and most important - we will get a deterministic result >> (hanged VM) if the method is blocking here. >> >> But lets say the method will be compiled at level one. You then deoptimze >> and clear the method state. But the queue can still contain the compilation >> requests for the three other levels which can lead to errors in the >> following test: >> >> >> It can only get on the queue once. It looks like this in the log: >> >> Start of test - not blocking >> >> 524 257 1 BlockingCompilation::foo (7 bytes) >> 625 257 1 BlockingCompilation::foo (7 bytes) made not entrant >> >> OK, but then the following loop is useless (and the comment misleading) > because we actually only enqueue and compile on one level (the first one > which is available): > > I changed to compiling on the highest available comp level. > > http://cr.openjdk.java.net/~neliasso/8151796/webrev.07/ > > > > 78 // Normal compile on all levels 79 for (int l : levels) { 80 WB.enqueueMethodForCompilation(m, l); > 81 } > > > Besides that, your changes look good! 
> > Regards, > Volker > > > Thank you, > Nils > > > Directive added, then blocking part where all levels are tested: >> >> 1 compiler directives added >> 626 258 b 1 BlockingCompilation::foo (7 bytes) >> 627 258 1 BlockingCompilation::foo (7 bytes) made not entrant >> 627 259 b 2 BlockingCompilation::foo (7 bytes) >> 628 259 2 BlockingCompilation::foo (7 bytes) made not entrant >> 629 260 b 3 BlockingCompilation::foo (7 bytes) >> 630 260 3 BlockingCompilation::foo (7 bytes) made not entrant >> 630 261 b 4 BlockingCompilation::foo (7 bytes) >> 632 261 4 BlockingCompilation::foo (7 bytes) made not entrant >> >> >> And finally the non-blocking part where only one level gets compiled: >> >> 633 262 1 BlockingCompilation::foo (7 bytes) >> >> >> 103 //Verify that it actuall is uncompiled 104 if (WB.isMethodCompiled(m)) { 105 throw new Exception("Should not be compiled after deoptimization"); 106 } >> >> Finally some typos: >> >> 103 //Verify that it actuall*y* is uncompiled >> 111 // Add to queue a*n*d make sure it actually went well >> >> >> Regards, >> Volker >> >> >> Thanks for feedback, >> Nils Eliasson >> >> >> >> On Mon, Mar 14, 2016 at 12:20 PM, Nils Eliasson < >> nils.eliasson at oracle.com> wrote: >> >>> Hi, >>> >>> Summary: >>> The test wasn't as robust as expected. >>> >>> Solution: >>> Change the way we verify that we are having a un-blocking compilation: >>> First lock the compilation queue - no new compiles will be completed. >>> Enqueue method for compilation. If the method is compiled blockingly - the >>> java thread will hang since the compile can't complete as long as the >>> compile queue is locked. >>> >>> Use this to test the blocking functionality in three steps: >>> 1) Verify that we are not blocking on target method as described. >>> 2) Add compiler directive with instruction to block on target method - >>> verify that it can be compiled on all levels. 
If it is not blocking it will >>> eventually be stalled for a moment in the compiler queue and the test will >>> fail. >>> 3) Pop directive, and redo step one - verify that target method is not >>> blocking. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8151796 >>> Webrev: http://cr.openjdk.java.net/~neliasso/8151796/werev.03/ >>> >>> Regards, >>> Nils Eliasson >>> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nils.eliasson at oracle.com Tue Mar 15 10:01:56 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 15 Mar 2016 11:01:56 +0100 Subject: RFR(S):8151796: compiler/whitebox/BlockingCompilation.java fails due to method not compiled In-Reply-To: References: <56E69E8F.7090104@oracle.com> <56E6CDF9.20005@oracle.com> <56E7D504.4000502@oracle.com> Message-ID: <56E7DD94.4000404@oracle.com> Thank you Volker, I'll push this together with the 8151795 today. Regards, Nils Eliasson On 2016-03-15 10:46, Volker Simonis wrote: > OK, looks good now. > > Regards, > Volker > > > On Tue, Mar 15, 2016 at 10:25 AM, Nils Eliasson > > wrote: > > Hi, > > > On 2016-03-14 16:16, Volker Simonis wrote: >> >> >> On Mon, Mar 14, 2016 at 3:43 PM, Nils Eliasson >> > wrote: >> >> Hi Volker, >> >> On 2016-03-14 14:58, Volker Simonis wrote: >>> Hi Nils, >>> >>> thanks for improving the test. I think your fix solves some >>> problems with regard to fast, non-blocking compiles which >>> are wrongly interpreted as blocking by the test. But what >>> about the initial test failure: >>> >>> java.lang.Exception: public static int >>> BlockingCompilation.foo() should be compiled at level 4(but >>> is actually compiled at level 0) >>> at BlockingCompilation.main(BlockingCompilation.java:104) >>> >>> This is from the loop which does blocking compilations. It >>> seems that a method enqueued for level 4 couldn't be >>> compiled at all. 
I don't know the exact reason, but one >>> could be for example that the code cache was full or that >>> the compiler bailed out because of another reason. I'm not >>> sure we can accurately handle this situation in the test. >>> Maybe we should tolerate if a method couldn't be compiled at >>> all: >> >> This failure only happened on (slow) non-tiered platforms and >> the log looked like that as if the compiler even hadn't been >> put on the compile queue. In the first version of my rewrite >> I checked the return value from enqueueMethodForCompilation >> to make sure the compile was actually added. But then I >> changed my mind and focused on just testing the blocking >> functionality. >> >>> 121 if (WB.getMethodCompilationLevel(m) != l*&& >>> **WB.getMethodCompilationLevel(m) != 0*) { >>> Also, I don't understand the following code: >>> 67 // Make sure no compilations can progress, blocking >>> compiles will hang >>> 68 WB.lockCompilation(); ... 78 // Normal compile on all levels >>> 79 for (int l : levels) { >>> 80 WB.enqueueMethodForCompilation(m, l); >>> 81 } >>> 82 >>> 83 // restore state >>> 84 WB.unlockCompilation(); >>> 85 while (!WB.isMethodCompiled(m)) { >>> 86 Thread.sleep(100); >>> 87 } >>> 88 WB.deoptimizeMethod(m); >>> 89 WB.clearMethodState(m); >>> You enqueue the methods on all levels (let's assume >>> 1,2,3,4). Then you wait until the method gets compiled at a >>> level (lets say at level 1). I think this is already shaky, >>> because these are non-blocking compiles of a method which >>> hasn't been called before, so the requests can be easily get >>> stale. >> >> Blocking compiles do not get stale any more - that is >> included in the patch. >> >> Only one item will actually be added to the compile queue - >> the rest will be dropped because the method is already >> enqueued. The loop makes the code work on all VM-flavours >> (client, serverm tiered) without worrying about compilation >> levels. 
The compilation-lock prevents any compilations from >> completing - so the all calls on >> enqueueMethodForCompilation() will be deterministic, and most >> important - we will get a deterministic result (hanged VM) if >> the method is blocking here. >> >>> But lets say the method will be compiled at level one. You >>> then deoptimze and clear the method state. But the queue can >>> still contain the compilation requests for the three other >>> levels which can lead to errors in the following test: >> >> It can only get on the queue once. It looks like this in the log: >> >> Start of test - not blocking >> >> 524 257 1 BlockingCompilation::foo (7 bytes) >> 625 257 1 BlockingCompilation::foo (7 bytes) made not entrant >> >> OK, but then the following loop is useless (and the comment >> misleading) because we actually only enqueue and compile on one >> level (the first one which is available): > I changed to compiling on the highest available comp level. > > http://cr.openjdk.java.net/~neliasso/8151796/webrev.07/ > >> 78 // Normal compile on all levels >> 79 for (int l : levels) { >> 80 WB.enqueueMethodForCompilation(m, l); >> 81 } >> >> Besides that, your changes look good! 
>> Regards, Volker > > Thank you, > Nils > > >> Directive added, then blocking part where all levels are tested: >> >> 1 compiler directives added >> 626 258 b 1 BlockingCompilation::foo (7 bytes) >> 627 258 1 BlockingCompilation::foo (7 bytes) made not entrant >> 627 259 b 2 BlockingCompilation::foo (7 bytes) >> 628 259 2 BlockingCompilation::foo (7 bytes) made not entrant >> 629 260 b 3 BlockingCompilation::foo (7 bytes) >> 630 260 3 BlockingCompilation::foo (7 bytes) made not entrant >> 630 261 b 4 BlockingCompilation::foo (7 bytes) >> 632 261 4 BlockingCompilation::foo (7 bytes) made not entrant >> >> >> And finally the non-blocking part where only one level gets compiled: >> >> 633 262 1 BlockingCompilation::foo (7 bytes) >> >> >>> 103 //Verify that it actuall is uncompiled >>> 104 if (WB.isMethodCompiled(m)) { >>> 105 throw new Exception("Should not be compiled after >>> deoptimization"); >>> 106 } >>> Finally some typos: >>> 103 //Verify that it actuall*y* is uncompiled 111 // Add to >>> queue a*n*d make sure it actually went well >>> >>> Regards, >>> Volker >> >> Thanks for feedback, >> Nils Eliasson >> >>> >>> >>> On Mon, Mar 14, 2016 at 12:20 PM, Nils Eliasson >>> > >>> wrote: >>> >>> Hi, >>> >>> Summary: >>> The test wasn't as robust as expected. >>> >>> Solution: >>> Change the way we verify that we are having a >>> un-blocking compilation: >>> First lock the compilation queue - no new compiles will >>> be completed. Enqueue method for compilation. If the >>> method is compiled blockingly - the java thread will >>> hang since the compile can't complete as long as the >>> compile queue is locked. >>> >>> Use this to test the blocking functionality in three steps: >>> 1) Verify that we are not blocking on target method as >>> described. >>> 2) Add compiler directive with instruction to block on >>> target method - verify that it can be compiled on all >>> levels. 
If it is not blocking it will eventually be >>> stalled for a moment in the compiler queue and the test >>> will fail. >>> 3) Pop directive, and redo step one - verify that target >>> method is not blocking. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8151796 >>> Webrev: >>> http://cr.openjdk.java.net/~neliasso/8151796/werev.03/ >>> >>> >>> Regards, >>> Nils Eliasson >>> >>> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.hartmann at oracle.com Tue Mar 15 10:25:49 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 15 Mar 2016 11:25:49 +0100 Subject: [9] RFR(XS): 8151882: -XX:+Verbose prints messages even if no other flag is set Message-ID: <56E7E32D.7090807@oracle.com> Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8151882 http://cr.openjdk.java.net/~thartmann/8151882/webrev.00/ Running the VM with -XX:+Verbose prints dozens of register allocator debug messages even if no additional flag like -XX:+PrintOpto is specified. This makes it hard to filter out relevant information if other debug flags are used in combination with -XX:+Verbose. According to the documentation, -XX:+Verbose should only "Print additional debugging information from other modes" but should not print anything on its own. I changed the code to only print messages if PrintOpto && WizardMode is set. This is consistent with other place in reg_split.cpp where we print debug info. 
Thanks, Tobias From nils.eliasson at oracle.com Tue Mar 15 11:03:05 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 15 Mar 2016 12:03:05 +0100 Subject: [9] RFR(XS): 8151882: -XX:+Verbose prints messages even if no other flag is set In-Reply-To: <56E7E32D.7090807@oracle.com> References: <56E7E32D.7090807@oracle.com> Message-ID: <56E7EBE9.8060003@oracle.com> Hi Tobias, Looks good, Regards, Nils On 2016-03-15 11:25, Tobias Hartmann wrote: > Hi, > > please review the following patch: > > https://bugs.openjdk.java.net/browse/JDK-8151882 > http://cr.openjdk.java.net/~thartmann/8151882/webrev.00/ > > Running the VM with -XX:+Verbose prints dozens of register allocator debug messages even if no additional flag like -XX:+PrintOpto is specified. This makes it hard to filter out relevant information if other debug flags are used in combination with -XX:+Verbose. According to the documentation, -XX:+Verbose should only "Print additional debugging information from other modes" but should not print anything on its own. > > I changed the code to only print messages if PrintOpto && WizardMode is set. This is consistent with other place in reg_split.cpp where we print debug info. > > Thanks, > Tobias From tobias.hartmann at oracle.com Tue Mar 15 11:22:19 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 15 Mar 2016 12:22:19 +0100 Subject: [9] RFR(XS): 8151882: -XX:+Verbose prints messages even if no other flag is set In-Reply-To: <56E7EBE9.8060003@oracle.com> References: <56E7E32D.7090807@oracle.com> <56E7EBE9.8060003@oracle.com> Message-ID: <56E7F06B.2040605@oracle.com> Thanks, Nils! 
Best regards, Tobias On 15.03.2016 12:03, Nils Eliasson wrote: > Hi Tobias, > > Looks good, > > Regards, > Nils > > On 2016-03-15 11:25, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> >> https://bugs.openjdk.java.net/browse/JDK-8151882 >> http://cr.openjdk.java.net/~thartmann/8151882/webrev.00/ >> >> Running the VM with -XX:+Verbose prints dozens of register allocator debug messages even if no additional flag like -XX:+PrintOpto is specified. This makes it hard to filter out relevant information if other debug flags are used in combination with -XX:+Verbose. According to the documentation, -XX:+Verbose should only "Print additional debugging information from other modes" but should not print anything on its own. >> >> I changed the code to only print messages if PrintOpto && WizardMode is set. This is consistent with other place in reg_split.cpp where we print debug info. >> >> Thanks, >> Tobias > From martin.doerr at sap.com Tue Mar 15 11:57:14 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 15 Mar 2016 11:57:14 +0000 Subject: Support for AES on ppc64le In-Reply-To: <201603150932.u2F9W7k9028664@d19av05.sagamino.japan.ibm.com> References: <201603141634.u2EGYYDb023702@d19av08.sagamino.japan.ibm.com> <56E71E11.3040808@oracle.com> <201603150932.u2F9W7k9028664@d19av05.sagamino.japan.ibm.com> Message-ID: Hi Hiroshi, thanks for contributing AES support. We appreciate it. May we ask you to support big endian, too? AIX is still big endian. I think big endian linux requires setting and restoring VRSAVE which is not needed on little endian linux and AIX. I have taken a look at the PPC64 part of the change and I have a minor change request: We don't like to see both strings if has_vcipher(): " vcipher" and " aes". Please remove one of them. I noticed that you use the non-volatile vector registers v20-v31 which may be dangerous as they are not saved & restored. 
A quick workaround to allow their usage could be to change the build to disallow GCC from using altivec. Else I think we should save & restore them in the java ENTRY_FRAME. I think we can assist with this. Thanks and best regards, Martin From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Hiroshi H Horii Sent: Dienstag, 15. März 2016 10:31 To: Vladimir Kozlov Cc: Tim Ellison ; Simonis, Volker ; hotspot-compiler-dev at openjdk.java.net Subject: Re: Support for AES on ppc64le Hi Vladimir, Thank you a lot for your quick response and review. When using load_array_element, a SEGV happened, so I needed the following change. Could you also review whether this change is reasonable? diff --git a/src/share/vm/opto/graphKit.cpp b/src/share/vm/opto/graphKit.cpp --- a/src/share/vm/opto/graphKit.cpp +++ b/src/share/vm/opto/graphKit.cpp @@ -1680,6 +1680,8 @@ Node* GraphKit::load_array_element(Node* ctl, Node* ary, Node* idx, const TypeAryPtr* arytype) { const Type* elemtype = arytype->elem(); BasicType elembt = elemtype->array_element_basic_type(); + if (elembt == T_NARROWOOP) + elembt = T_OBJECT; Node* adr = array_element_address(ary, idx, elembt, arytype->size()); Node* ld = make_load(ctl, adr, elemtype, elembt, arytype, MemNode::unordered); return ld; I attached a full diff that applies your kind suggestions. Regards, Hiroshi ----------------------- Hiroshi Horii, Ph.D. IBM Research - Tokyo Vladimir Kozlov > wrote on 03/15/2016 05:24:49: > From: Vladimir Kozlov > > To: Hiroshi H Horii/Japan/IBM at IBMJP, hotspot-compiler-dev at openjdk.java.net > Cc: "Simonis, Volker" >, Tim Ellison > > > Date: 03/15/2016 05:25 > Subject: Re: Support for AES on ppc64le > > Hi Hiroshi > > About library_call.cpp changes. 
> > You don't need GraphKit:: > > And you can use load_array_element() instead: > > Node* objAESCryptKey = load_array_element(control(), objSessionK, > intcon(0), TypeAryPtr::OOPS); > > You may need additional check and cast because next expression expects > the objAESCryptKey points to int[]: > > Node* k_start = array_element_address(objAESCryptKey, intcon(0), T_INT); > > Thanks, > Vladimir > > On 3/14/16 9:34 AM, Hiroshi H Horii wrote: > > Dear all: > > > > Can I please request reviews for the following change? > > This change was created for JDK 9. > > > > Description: > > This change adds stub routines support for single-block AES encryption and > > decryption operations on the POWER8 platform. They are available only when > > the application is configured to use SunJCE crypto provider on little > > endian. > > These stubs make use of efficient hardware AES instructions and thus > > offer significant performance improvements over JITed code on POWER8 > > as on x86 and SPARC. AES stub routines are enabled by default on POWER8 > > platforms that support AES instructions (vcipher). They can be > > explicitly enabled or > > disabled on the command-line using UseAES and UseAESIntrinsics JVM flags. > > Unlike x86 and SPARC, vcipher and vnchiper of POWER8 need the same round > > keys of AES. Therefore, inline_aescrypt_Block in library_call.cpp calls > > the stub with > > AESCrypt.sessionK[0] as round keys. > > > > Summary of source code changes: > > > > *src/cpu/ppc/vm/assembler_ppc.hpp > > *src/cpu/ppc/vm/assembler_ppc.inline.hpp > > - Adds support for vrld instruction to rotate vector register values > > with > > left doubleword. > > > > *src/cpu/ppc/vm/stubGenerator_ppc.cpp > > - Defines stubs for single-block AES encryption and decryption routines > > supporting all key sizes (128-bit, 192-bit and 256-bit). > > - Current POWER AES decryption instructions are not compatible with > > SunJCE expanded decryption key format. 
Thus decryption stubs read > > the expanded encryption keys (sessionK[0]) with descendant order. > > - Encryption stubs use SunJCE expanded encryption key as their is > > no incompatibility issue between POWER8 AES encryption instructions > > and SunJCE expanded encryption keys. > > > > *src/cpu/ppc/vm/vm_version_ppc.cpp > > - Detects AES capabilities of the underlying CPU by using has_vcipher(). > > - Enables UseAES and UseAESIntrinsics flags if the underlying CPU > > supports AES instructions and neither of them is explicitly > > disabled on > > the command-line. Generate warning message if either of these > > flags are > > enabled on the command-line whereas the underlying CPU does not > > support > > AES instructions. > > > > *src/share/vm/opto/library_call.cpp > > - Passes the first input parameter, reference to sessionK[0] to the > > AES stubs > > only on the POWER platform. > > > > Code change: > > Please see an attached diff file that was generated with "hg diff > > -g" under > > the latest hotspot directory. > > > > Passed tests: > > jtreg compiler/codegen/7184394/ > > jtreg compiler/cpuflags/ (after removing @ignored annotation) > > > > * This is my first post of a change. I'm sorry in advance if I don't > > follow the > > community manners. > > > > * I wrote this description based on the follows. > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013- > November/012670.html > > > > > > > > Regards, > > Hiroshi > > ----------------------- > > Hiroshi Horii, > > IBM Research - Tokyo > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nils.eliasson at oracle.com Tue Mar 15 13:44:03 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 15 Mar 2016 14:44:03 +0100 Subject: RFR(M): 8150054: CompilerControl: add Xmixed for execution Message-ID: <56E811A3.10101@oracle.com> Hi, Please review this improvement of the compiler control tests. 
There are many files in the diff but only three kind of changes: 1) Most of the test spawn a separate VM that it can control with flags and diagnostic commands. The launching VM just need to setup the test environment and is not part of the test. We can therefore change "@run main/othervm" to " @run driver". A driver can not have any flags and are not affected by any flag rotation. 2) Change the "@run main ClassFileInstaller sun.hotspot.WhiteBox " to driver too. This could be done for all the ~400 test but I have limited myself to the compiler control test in this bug. In -Xcomp-batches this saves ~5 seconds per ClassFileInstaller invocation on a fast machine. 3) Add the -Xmixed flag to Scenario.java to prevent -Xcomp from interfering with testing printing and logging. It causes long run times and huge logs. These three changes together save more than 10 minutes of test time on a -Xcomp-batch on a fast x64 workstation. Bug: https://bugs.openjdk.java.net/browse/JDK-8150054 Webrev: http://cr.openjdk.java.net/~neliasso/8150054/webrev.01 Regards, Nils -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Tue Mar 15 16:09:13 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 15 Mar 2016 09:09:13 -0700 Subject: RFR(XS) 8151871: [JVMCI] missing HAS_PENDING_EXCEPTION check In-Reply-To: <2228DD8E-063B-4BFE-8EB9-1FC6D2387A8C@oracle.com> References: <2228DD8E-063B-4BFE-8EB9-1FC6D2387A8C@oracle.com> Message-ID: <56E833A9.7020300@oracle.com> Good. Thanks, Vladimir On 3/14/16 11:53 PM, Tom Rodriguez wrote: > http://cr.openjdk.java.net/~never/8151871/webrev/index.html > > Somehow during various edits > a HAS_PENDING_EXCEPTION/CLEAR_PENDING_EXCEPTION check got lost. It > should be restored. 
> > tom From vladimir.kozlov at oracle.com Tue Mar 15 16:33:35 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 15 Mar 2016 09:33:35 -0700 Subject: [9] RFR(XS): 8151882: -XX:+Verbose prints messages even if no other flag is set In-Reply-To: <56E7E32D.7090807@oracle.com> References: <56E7E32D.7090807@oracle.com> Message-ID: <56E8395F.8010002@oracle.com> Good. Thanks, Vladimir On 3/15/16 3:25 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch: > > https://bugs.openjdk.java.net/browse/JDK-8151882 > http://cr.openjdk.java.net/~thartmann/8151882/webrev.00/ > > Running the VM with -XX:+Verbose prints dozens of register allocator debug messages even if no additional flag like -XX:+PrintOpto is specified. This makes it hard to filter out relevant information if other debug flags are used in combination with -XX:+Verbose. According to the documentation, -XX:+Verbose should only "Print additional debugging information from other modes" but should not print anything on its own. > > I changed the code to only print messages if PrintOpto && WizardMode is set. This is consistent with other place in reg_split.cpp where we print debug info. > > Thanks, > Tobias > From tobias.hartmann at oracle.com Tue Mar 15 16:35:12 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 15 Mar 2016 17:35:12 +0100 Subject: [9] RFR(XS): 8151882: -XX:+Verbose prints messages even if no other flag is set In-Reply-To: <56E8395F.8010002@oracle.com> References: <56E7E32D.7090807@oracle.com> <56E8395F.8010002@oracle.com> Message-ID: <56E839C0.9000802@oracle.com> Thanks, Vladimir. Best regards, Tobias On 15.03.2016 17:33, Vladimir Kozlov wrote: > Good. 
> > Thanks, > Vladimir > > On 3/15/16 3:25 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> >> https://bugs.openjdk.java.net/browse/JDK-8151882 >> http://cr.openjdk.java.net/~thartmann/8151882/webrev.00/ >> >> Running the VM with -XX:+Verbose prints dozens of register allocator debug messages even if no additional flag like -XX:+PrintOpto is specified. This makes it hard to filter out relevant information if other debug flags are used in combination with -XX:+Verbose. According to the documentation, -XX:+Verbose should only "Print additional debugging information from other modes" but should not print anything on its own. >> >> I changed the code to only print messages if PrintOpto && WizardMode is set. This is consistent with other place in reg_split.cpp where we print debug info. >> >> Thanks, >> Tobias >> From christian.thalinger at oracle.com Tue Mar 15 18:43:56 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Tue, 15 Mar 2016 08:43:56 -1000 Subject: RFR(XS) 8151874: [JVMCI] canInlineMethod should check is_not_compilable for correct CompLevel In-Reply-To: <059D20A4-51E0-4DB3-B626-E922ED33A1F5@oracle.com> References: <059D20A4-51E0-4DB3-B626-E922ED33A1F5@oracle.com> Message-ID: Looks good. > On Mar 14, 2016, at 9:12 PM, Tom Rodriguez wrote: > > http://cr.openjdk.java.net/~never/8151874/webrev/ > > Currently canInlineMethod is calling is_not_compilable() with no arguments which checks whether compilation has been disabled at any level which is incorrect. Only CompLevel_full_optimization should be consulted. > > tom -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vladimir.kozlov at oracle.com Tue Mar 15 20:02:10 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 15 Mar 2016 13:02:10 -0700 Subject: RFR(M): 8150054: CompilerControl: add Xmixed for execution In-Reply-To: <56E811A3.10101@oracle.com> References: <56E811A3.10101@oracle.com> Message-ID: <56E86A42.8060709@oracle.com> Looks good Thanks, Vladimir On 3/15/16 6:44 AM, Nils Eliasson wrote: > Hi, > > Please review this improvement of the compiler control tests. There are > many files in the diff but only three kind of changes: > > 1) Most of the test spawn a separate VM that it can control with flags > and diagnostic commands. The launching VM just need to setup the test > environment and is not part of the test. We can therefore change "@run > main/othervm" to " @run driver". A driver can not have any flags and are > not affected by any flag rotation. > > 2) Change the "@run main ClassFileInstaller sun.hotspot.WhiteBox " to > driver too. This could be done for all the ~400 test but I have limited > myself to the compiler control test in this bug. In -Xcomp-batches this > saves ~5 seconds per ClassFileInstaller invocation on a fast machine. > > 3) Add the -Xmixed flag to Scenario.java to prevent -Xcomp from > interfering with testing printing and logging. It causes long run times > and huge logs. > > These three changes together save more than 10 minutes of test time on a > -Xcomp-batch on a fast x64 workstation. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8150054 > Webrev: http://cr.openjdk.java.net/~neliasso/8150054/webrev.01 > > Regards, > Nils > From michael.c.berg at intel.com Tue Mar 15 21:04:21 2016 From: michael.c.berg at intel.com (Berg, Michael C) Date: Tue, 15 Mar 2016 21:04:21 +0000 Subject: CR for RFR 8151573 Message-ID: Hi Folks, I would like to contribute multi-versioning post loops for range check elimination. 
Beforehand cfg optimizations after register allocation were where post loop optimizations were done for range checks. I have added code which produces the desired effect much earlier by introducing a safe transformation which will minimally allow a range check free version of the final post loop to execute up until the point it actually has to take a range check exception by re-ranging the limit of the rce'd loop, then exit the rce'd post loop and take the range check exception in the legacy loops execution if required. If during optimization we discover that we know enough to remove the range check version of the post loop, mostly by exposing the load range values into the limit logic of the rce'd post loop, we will eliminate the range check post loop altogether much like cfg optimizations did, but much earlier. This gives optimizations like programmable SIMD (via SuperWord) the opportunity to vectorize the rce'd post loops to a single iteration based on mask vectors which map to the residual iterations. Programmable SIMD will be a follow on change set utilizing this code to stage its work. This optimization also exposes the rce'd post loop without flow to other optimizations. Currently I have enabled this optimization for x86 only. We base this loop on successfully rce'd main loops and if for whatever reason, multiversioning fails, we eliminate the loop we added. This code was tested as follows: Bug-id: https://bugs.openjdk.java.net/browse/JDK-8151573 webrev: http://cr.openjdk.java.net/~mcberg/8151573/webrev.01/ Thanks, Michael -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Tue Mar 15 21:42:01 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 15 Mar 2016 14:42:01 -0700 Subject: CR for RFR 8151573 In-Reply-To: References: Message-ID: <56E881A9.7070004@oracle.com> Hi Michael, Changes are significant so they have to be justified. 
Especially since we are in later stage of jdk9 development. Do you have performance numbers (not only for microbenchmarks) which show the benefit of these changes? Thanks, Vladimir On 3/15/16 2:04 PM, Berg, Michael C wrote: > Hi Folks, > > I would like to contribute multi-versioning post loops for range check > elimination. Beforehand cfg optimizations after register allocation > were where post loop optimizations were done for range checks. I have > added code which produces the desired effect much earlier by introducing > a safe transformation which will minimally allow a range > check free version of the final post loop to execute up until the > point it actually > has to take a range check exception by re-ranging > the limit of the rce'd > loop, then exit the rce'd post loop and take the range check exception > in the legacy loops execution if required. If during optimization we > discover that we know enough to remove the range check version of the > post loop, mostly by exposing the load range values into the limit logic > of the rce'd post loop, we will eliminate the range check post loop > altogether much like cfg > optimizations did, but much earlier. This > gives optimizations like programmable SIMD (via SuperWord) the > opportunity to vectorize the rce'd post loops to a single iteration > based on mask vectors which map to the residual iterations. Programmable > SIMD will be a follow on > change set utilizing this code to stage its > work. This optimization also exposes the rce'd post loop without flow to > other optimizations. Currently I have enabled this optimization for x86 > only. We base this > loop on successfully rce'd main loops and if for > whatever reason, multiversioning fails, we eliminate the loop we added. 
> > This code was tested as follows: > > > Bug-id: https://bugs.openjdk.java.net/browse/JDK-8151573 > > > webrev: > > http://cr.openjdk.java.net/~mcberg/8151573/webrev.01/ > > Thanks, > > Michael > From michael.c.berg at intel.com Tue Mar 15 23:07:57 2016 From: michael.c.berg at intel.com (Berg, Michael C) Date: Tue, 15 Mar 2016 23:07:57 +0000 Subject: CR for RFR 8151573 In-Reply-To: <56E881A9.7070004@oracle.com> References: <56E881A9.7070004@oracle.com> Message-ID: Vladimir for programmable SIMD which is the optimization which uses this implementation, I get the following on micros and code in general that look like this: for(int i = 0; i < process_len; i++) { d[i]= (a[i] * b[i]) + (a[i] * c[i]) + (b[i] * c[i]); } The above code makes 9 vector ops. For float with vector length VecZ, I get as much as 1.3x and for int as much as 1.4x uplift. For double and long on VecZ it is smaller, but then so is the value of vectorization on those types anyways. The value process_len is some fraction of the array length in my measurements. The idea of the metrics is to pose a post loop with a modest amount of iterations in it. For instance N is the max trip of the post loop, and N is 1..VecZ-1 size, then for float we could do as many as 15 iterations in the fixup loop. An example would be array_length = 512, process_len is a range of 81..96, we create a VecZ loop which was superunrolled 4 times with vector length 16, or unroll of 64, we align process 4 iterations, and the vectorized post loop is executed 4 times, leaving the remaining work in the final post loop, in this case possibly a multiversioned post loop. We start that final loop at iteration 81 so we always do at least 1 iteration fixup, and as many as 15. If we left the fixup loop as a scalar loop that would mean 1 to 15 iterations plus our initial loops which have {4,1,1} iterations as a group or 6 to get us to index 80. 
By vectorizing the fixup loop to one iteration we now always have 7 iterations in our loops for all ranges of 81..96, without this optimization and programmable SIMD, we would have the initial 6 plus 1 to 15 more, or a range of 7 to 21 iterations. Would you prefer I integrate this with programmable SIMD and submit the patches as one? I thought it would be easier to do them separately. Also, exposing the post loops to this path offloads cfg processing to earlier compilation, making the graph less complex through register allocation. Regards, Michael -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Tuesday, March 15, 2016 2:42 PM To: Berg, Michael C; 'hotspot-compiler-dev at openjdk.java.net' Subject: Re: CR for RFR 8151573 Hi Michael, Changes are significant so they have to be justified. Especially since we are in later stage of jdk9 development. Do you have performance numbers (not only for microbenchmarks) which show the benefit of these changes? Thanks, Vladimir On 3/15/16 2:04 PM, Berg, Michael C wrote: > Hi Folks, > > I would like to contribute multi-versioning post loops for range check > elimination. Beforehand cfg optimizations after register allocation > were where post loop optimizations were done for range checks. I have > added code which produces the desired effect much earlier by > introducing a safe transformation which will minimally allow a range > check free version of the final post loop to execute up until the > point it actually has to take a range check exception by re-ranging > the limit of the rce'd loop, then exit the rce'd post loop and take > the range check exception in the legacy loops execution if required. 
> If during optimization we discover that we know enough to remove the > range check version of the post loop, mostly by exposing the load > range values into the limit logic of the rce'd post loop, we will > eliminate the range check post loop altogether much like cfg > optimizations did, but much earlier. This gives optimizations like > programmable SIMD (via SuperWord) the opportunity to vectorize the > rce'd post loops to a single iteration based on mask vectors which map > to the residual iterations. Programmable SIMD will be a follow on > change set utilizing this code to stage its work. This optimization > also exposes the rce'd post loop without flow to other optimizations. > Currently I have enabled this optimization for x86 only. We base this > loop on successfully rce'd main loops and if for whatever reason, multiversioning fails, we eliminate the loop we added. > > This code was tested as follows: > > > Bug-id: https://bugs.openjdk.java.net/browse/JDK-8151573 > > > webrev: > > http://cr.openjdk.java.net/~mcberg/8151573/webrev.01/ > > Thanks, > > Michael > From michael.c.berg at intel.com Tue Mar 15 23:14:39 2016 From: michael.c.berg at intel.com (Berg, Michael C) Date: Tue, 15 Mar 2016 23:14:39 +0000 Subject: CR for RFR 8151573 In-Reply-To: References: <56E881A9.7070004@oracle.com> Message-ID: Correction below... -----Original Message----- From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Berg, Michael C Sent: Tuesday, March 15, 2016 4:08 PM To: Vladimir Kozlov; 'hotspot-compiler-dev at openjdk.java.net' Subject: RE: CR for RFR 8151573 Vladimir for programmable SIMD which is the optimization which uses this implementation, I get the following on micros and code in general that look like this: for(int i = 0; i < process_len; i++) { d[i]= (a[i] * b[i]) + (a[i] * c[i]) + (b[i] * c[i]); } The above code makes 9 vector ops. 
For float with vector length VecZ, I get as much as 1.3x and for int as much as 1.4x uplift. For double and long on VecZ it is smaller, but then so is the value of vectorization on those types anyways. The value process_len is some fraction of the array length in my measurements. The idea of the metrics is to pose a post loop with a modest amount of iterations in it. For instance N is the max trip of the post loop, and N is 1..VecZ-1 size, then for float we could do as many as 15 iterations in the fixup loop. An example would be array_length = 512, process_len is a range of 81..96, we create a VecZ loop which was superunrolled 4 times with vector length 16, or unroll of 64, we align process 4 iterations, and the vectorized post loop is executed 1 time, leaving the remaining work in the final post loop, in this case possibly a multiversioned post loop. We start that final loop at iteration 81 so we always do at least 1 iteration fixup, and as many as 15. If we left the fixup loop as a scalar loop that would mean 1 to 15 iterations plus our initial loops which have {4,1,1} iterations as a group or 6 to get us to index 80. By vectorizing the fixup loop to one iteration we now always have 7 iterations in our loops for all ranges of 81..96, without this optimization and programmable SIMD, we would have the initial 6 plus 1 to 15 more, or a range of 7 to 21 iterations. Would you prefer I integrate this with programmable SIMD and submit the patches as one? I thought it would be easier to do them separately. Also, exposing the post loops to this path offloads cfg processing to earlier compilation, making the graph less complex through register allocation. Regards, Michael -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Tuesday, March 15, 2016 2:42 PM To: Berg, Michael C; 'hotspot-compiler-dev at openjdk.java.net' Subject: Re: CR for RFR 8151573 Hi Michael, Changes are significant so they have to be justified. 
Especially since we are in later stage of jdk9 development. Do you have performance numbers (not only for microbenchmarhks) which show the benefit of these changes? Thanks, Vladimir On 3/15/16 2:04 PM, Berg, Michael C wrote: > Hi Folks, > > I would like to contribute multi-versioning post loops for range check > elimination. Beforehand cfg optimizations after register allocation > were where post loop optimizations were done for range checks. I have > added code which produces the desired effect much earlier by > introducing a safe transformation which will minimally allow a range > check free version of the final post loop to execute up until the > point it actually has to take a range check exception by re-ranging > the limit of the rce'd loop, then exit the rce'd post loop and take > the range check exception in the legacy loops execution if required. > If during optimization we discover that we know enough to remove the > range check version of the post loop, mostly by exposing the load > range values into the limit logic of the rce'd post loop, we will > eliminate the range check post loop altogether much like cfg > optimizations did, but much earlier. This gives optimizations like > programmable SIMD (via SuperWord) the opportunity to vectorize the > rce'd post loops to a single iteration based on mask vectors which map > to the residual iterations. Programmable SIMD will be a follow on > change set utilizing this code to stage its work. This optimization > also exposes the rce'd post loop without flow to other optimizations. > Currently I have enabled this optimization for x86 only. We base this > loop on successfully rce'd main loops and if for whatever reason, multiversioning fails, we eliminate the loop we added. 
> > This code was tested as follows: > > > Bug-id: https://bugs.openjdk.java.net/browse/JDK-8151573 > > > webrev: > > http://cr.openjdk.java.net/~mcberg/8151573/webrev.01/ > > Thanks, > > Michael > From vladimir.kozlov at oracle.com Tue Mar 15 23:37:08 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 15 Mar 2016 16:37:08 -0700 Subject: CR for RFR 8151573 In-Reply-To: References: <56E881A9.7070004@oracle.com> Message-ID: <56E89CA4.8010201@oracle.com> As we all know we can always construct microbenchmarks which shows 30% - 50% difference. When in real application we will never see difference. I still don't see a real reason why we should spend time and optimize *POST* loops. We already have vectorized post loop to improve performance. Note, additional loop opts code will rise its maintenance cost. Why "programmable SIMD" depends on it? What about pre-loop? Thanks, Vladimir On 3/15/16 4:14 PM, Berg, Michael C wrote: > Correction below... > > -----Original Message----- > From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Berg, Michael C > Sent: Tuesday, March 15, 2016 4:08 PM > To: Vladimir Kozlov; 'hotspot-compiler-dev at openjdk.java.net' > Subject: RE: CR for RFR 8151573 > > Vladimir for programmable SIMD which is the optimization which uses this implementation, I get the following on micros and code in general that look like this: > > for(int i = 0; i < process_len; i++) > { > d[i]= (a[i] * b[i]) + (a[i] * c[i]) + (b[i] * c[i]); > } > > The above code makes 9 vector ops. > > For float with vector length VecZ, I get as much as 1.3x and for int as much as 1.4x uplift. > For double and long on VecZ it is smaller, but then so is the value of vectorization on those types anyways. > The value process_len is some fraction of the array length in my measurements. The idea of the metrics Is to pose a post loop with a modest amount of iterations in it. 
For instance N is the max trip of the post loop, and N is 1..VecZ-1 size, then for float we could do as many as 15 iterations in the fixup loop. > > An example would be array_length = 512, process_len is a range of 81..96, we create a VecZ loop which was superunrolled 4 times with vector length 16, or unroll of 64, we align process 4 iterations, and the vectorized post loop is executed 1 time, leaving the remaining work in the final post loop, in this case possibly a mutilversioned post loop. We start that final loop at iteration 81 so we always do at least 1 iteration fixup, and as many as 15. If we left the fixup loop as a scalar loop that would mean 1 to 15 iterations plus our initial loops which have {4,1,1} iterations as a group or 6 to get us to index 80. By vectorizing the fixup loop to one iteration we now always have 7 iterations in our loops for all ranges of 81..96, without this optimization and programmable SIMD, we would have the initial 6 plush 1 to 15 more, or a range of 7 to 21 iterations. > > Would you prefer I integrate this with programmable SIMD and submit the patches as one? > > I thought it would be easier to do them separately. Also, exposing the post loops to this path offloads cfg processing to earlier compilation, making the graph less complex through register allocation. > > Regards, > Michael > > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Tuesday, March 15, 2016 2:42 PM > To: Berg, Michael C; 'hotspot-compiler-dev at openjdk.java.net' > Subject: Re: CR for RFR 8151573 > > Hi Michael, > > Changes are significant so they have to be justified. Especially since we are in later stage of jdk9 development. Do you have performance numbers (not only for microbenchmarhks) which show the benefit of these changes? > > Thanks, > Vladimir > > On 3/15/16 2:04 PM, Berg, Michael C wrote: >> Hi Folks, >> >> I would like to contribute multi-versioning post loops for range check >> elimination. 
Beforehand cfg optimizations after register allocation >> were where post loop optimizations were done for range checks. I have >> added code which produces the desired effect much earlier by >> introducing a safe transformation which will minimally allow a range >> check free version of the final post loop to execute up until the >> point it actually has to take a range check exception by re-ranging >> the limit of the rce'd loop, then exit the rce'd post loop and take >> the range check exception in the legacy loops execution if required. >> If during optimization we discover that we know enough to remove the >> range check version of the post loop, mostly by exposing the load >> range values into the limit logic of the rce'd post loop, we will >> eliminate the range check post loop altogether much like cfg >> optimizations did, but much earlier. This gives optimizations like >> programmable SIMD (via SuperWord) the opportunity to vectorize the >> rce'd post loops to a single iteration based on mask vectors which map >> to the residual iterations. Programmable SIMD will be a follow on >> change set utilizing this code to stage its work. This optimization >> also exposes the rce'd post loop without flow to other optimizations. >> Currently I have enabled this optimization for x86 only. We base this >> loop on successfully rce'd main loops and if for whatever reason, multiversioning fails, we eliminate the loop we added. 
>> >> This code was tested as follows: >> >> >> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8151573 >> >> >> webrev: >> >> http://cr.openjdk.java.net/~mcberg/8151573/webrev.01/ >> >> Thanks, >> >> Michael >> From michael.c.berg at intel.com Wed Mar 16 00:29:08 2016 From: michael.c.berg at intel.com (Berg, Michael C) Date: Wed, 16 Mar 2016 00:29:08 +0000 Subject: CR for RFR 8151573 In-Reply-To: <56E89CA4.8010201@oracle.com> References: <56E881A9.7070004@oracle.com> <56E89CA4.8010201@oracle.com> Message-ID: Vladimir: The reason programmable SIMD depends upon this is that all versions of the final post loop have range checks in them until very late, after register allocation; they might be cleaned up in cfg optimizations, but not always. With multiversioning, we always remove the range checks in our key loop. With regards to the pre loop, pre loops have special checks too, do they not, requiring flow in many cases? Programmable SIMD needs tight loops to accurately facilitate masked iteration mapping. Regards, Michael -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Tuesday, March 15, 2016 4:37 PM To: Berg, Michael C; 'hotspot-compiler-dev at openjdk.java.net' Subject: Re: CR for RFR 8151573 As we all know, we can always construct microbenchmarks which show a 30% - 50% difference, when in a real application we will never see a difference. I still don't see a real reason why we should spend time and optimize *POST* loops. We already have a vectorized post loop to improve performance. Note, additional loop opts code will raise its maintenance cost. Why does "programmable SIMD" depend on it? What about the pre-loop? Thanks, Vladimir On 3/15/16 4:14 PM, Berg, Michael C wrote: > Correction below... 
> > -----Original Message----- > From: hotspot-compiler-dev > [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of > Berg, Michael C > Sent: Tuesday, March 15, 2016 4:08 PM > To: Vladimir Kozlov; 'hotspot-compiler-dev at openjdk.java.net' > Subject: RE: CR for RFR 8151573 > > Vladimir, for programmable SIMD, which is the optimization which uses this implementation, I get the following on micros and code in general that look like this: > > for(int i = 0; i < process_len; i++) > { > d[i]= (a[i] * b[i]) + (a[i] * c[i]) + (b[i] * c[i]); > } > > The above code makes 9 vector ops. > > For float with vector length VecZ, I get as much as 1.3x and for int as much as 1.4x uplift. > For double and long on VecZ it is smaller, but then so is the value of vectorization on those types anyways. > The value process_len is some fraction of the array length in my measurements. The idea of the metrics is to pose a post loop with a modest amount of iterations in it. For instance, if N is the max trip count of the post loop and N is in the range 1..VecZ-1, then for float we could do as many as 15 iterations in the fixup loop. > > An example would be array_length = 512, process_len is a range of 81..96, we create a VecZ loop which was superunrolled 4 times with vector length 16, or unroll of 64, we align process 4 iterations, and the vectorized post loop is executed 1 time, leaving the remaining work in the final post loop, in this case possibly a multiversioned post loop. We start that final loop at iteration 81 so we always do at least 1 iteration fixup, and as many as 15. If we left the fixup loop as a scalar loop that would mean 1 to 15 iterations plus our initial loops which have {4,1,1} iterations as a group or 6 to get us to index 80. By vectorizing the fixup loop to one iteration we now always have 7 iterations in our loops for all ranges of 81..96, without this optimization and programmable SIMD, we would have the initial 6 plus 1 to 15 more, or a range of 7 to 21 iterations. 
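[Editor's note: the iteration arithmetic in the example above can be checked with a tiny helper. This is hypothetical code, not HotSpot source; it assumes the {4,1,1} initial-loop group and the 1..15 residual range from the message.]

```java
public class PostLoopTripCounts {
    // Pre, main, and vectorized post loop run {4, 1, 1} times: 6 iterations total.
    static final int INITIAL_GROUP = 4 + 1 + 1;

    // Scalar fixup loop: one iteration per residual element (1..15 here).
    static int scalarTotal(int residual) {
        return INITIAL_GROUP + residual;
    }

    // Masked vector fixup loop: always exactly one iteration,
    // regardless of the residual count.
    static int maskedTotal(int residual) {
        return INITIAL_GROUP + 1;
    }

    public static void main(String[] args) {
        // Residuals 1..15 give 7..21 total iterations scalar, but always 7 masked.
        System.out.println(scalarTotal(1) + ".." + scalarTotal(15)); // 7..21
        System.out.println(maskedTotal(15));                         // 7
    }
}
```

This reproduces the claim in the message: with a masked fixup loop the total is a constant 7 iterations for all process_len in 81..96, versus 7 to 21 without it.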
> > Would you prefer I integrate this with programmable SIMD and submit the patches as one? > > I thought it would be easier to do them separately. Also, exposing the post loops to this path offloads cfg processing to earlier compilation, making the graph less complex through register allocation. > > Regards, > Michael > > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Tuesday, March 15, 2016 2:42 PM > To: Berg, Michael C; 'hotspot-compiler-dev at openjdk.java.net' > Subject: Re: CR for RFR 8151573 > > Hi Michael, > > Changes are significant so they have to be justified. Especially since we are in a later stage of jdk9 development. Do you have performance numbers (not only for microbenchmarks) which show the benefit of these changes? > > Thanks, > Vladimir > > On 3/15/16 2:04 PM, Berg, Michael C wrote: >> Hi Folks, >> >> I would like to contribute multi-versioning post loops for range >> check elimination. Beforehand cfg optimizations after register >> allocation were where post loop optimizations were done for range >> checks. I have added code which produces the desired effect much >> earlier by introducing a safe transformation which will minimally >> allow a range-check-free version of the final post loop to execute up >> until the point it actually has to take a range check exception by >> re-ranging the limit of the rce'd loop, then exit the rce'd post loop >> and take the range check exception in the legacy loop's execution if required. >> If during optimization we discover that we know enough to remove the >> range check version of the post loop, mostly by exposing the load >> range values into the limit logic of the rce'd post loop, we will >> eliminate the range check post loop altogether much like cfg >> optimizations did, but much earlier. 
This gives optimizations like >> programmable SIMD (via SuperWord) the opportunity to vectorize the >> rce'd post loops to a single iteration based on mask vectors which >> map to the residual iterations. Programmable SIMD will be a follow-on >> change set utilizing this code to stage its work. This optimization >> also exposes the rce'd post loop without flow to other optimizations. >> Currently I have enabled this optimization for x86 only. We base >> this loop on successfully rce'd main loops and if, for whatever reason, multiversioning fails, we eliminate the loop we added. >> >> This code was tested as follows: >> >> >> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8151573 >> >> >> webrev: >> >> http://cr.openjdk.java.net/~mcberg/8151573/webrev.01/ >> >> Thanks, >> >> Michael >> From nils.eliasson at oracle.com Wed Mar 16 10:06:37 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 16 Mar 2016 11:06:37 +0100 Subject: RFR(M): 8150054: CompilerControl: add Xmixed for execution In-Reply-To: <56E86A42.8060709@oracle.com> References: <56E811A3.10101@oracle.com> <56E86A42.8060709@oracle.com> Message-ID: <56E9302D.3050206@oracle.com> Thank you! //Nils On 2016-03-15 21:02, Vladimir Kozlov wrote: > Looks good > > Thanks, > Vladimir > > On 3/15/16 6:44 AM, Nils Eliasson wrote: >> Hi, >> >> Please review this improvement of the compiler control tests. There are >> many files in the diff but only three kinds of changes: >> >> 1) Most of the tests spawn a separate VM that they can control with flags >> and diagnostic commands. The launching VM just needs to set up the test >> environment and is not part of the test. We can therefore change "@run >> main/othervm" to "@run driver". A driver cannot have any flags and is >> not affected by any flag rotation. >> >> 2) Change the "@run main ClassFileInstaller sun.hotspot.WhiteBox" to >> driver too. This could be done for all the ~400 tests but I have limited >> myself to the compiler control tests in this bug. 
In -Xcomp-batches this >> saves ~5 seconds per ClassFileInstaller invocation on a fast machine. >> >> 3) Add the -Xmixed flag to Scenario.java to prevent -Xcomp from >> interfering with testing printing and logging. It causes long run times >> and huge logs. >> >> These three changes together save more than 10 minutes of test time on a >> -Xcomp-batch on a fast x64 workstation. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8150054 >> Webrev: http://cr.openjdk.java.net/~neliasso/8150054/webrev.01 >> >> Regards, >> Nils >> From filipp.zhinkin at gmail.com Wed Mar 16 15:39:34 2016 From: filipp.zhinkin at gmail.com (Filipp Zhinkin) Date: Wed, 16 Mar 2016 18:39:34 +0300 Subject: RFR (XS): 8152004: CTW crashes with failed assertion after 8150646 integration Message-ID: Hi all, please review a small fix that forces the VM to disable the BlockingCompilation flag and ignore BlockingCompilationOption if the CompileTheWorld or ReplayCompiles options are turned on. Webrev: http://cr.openjdk.java.net/~fzhinkin/8152004/webrev.00/ Bug id: https://bugs.openjdk.java.net/browse/JDK-8152004 Testing: CTW Regards, Filipp. From vladimir.kozlov at oracle.com Wed Mar 16 15:41:57 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 16 Mar 2016 08:41:57 -0700 Subject: CR for RFR 8151573 In-Reply-To: References: <56E881A9.7070004@oracle.com> <56E89CA4.8010201@oracle.com> Message-ID: <56E97EC5.6030608@oracle.com> On 3/15/16 5:29 PM, Berg, Michael C wrote: > Vladimir: > > The reason programmable SIMD depends upon this is that all versions of the final post loop have range checks in them until very late, after register allocation, and might be cleaned up in cfg optimizations, but not always. With multiversioning, we always remove the range checks in our key loop. I understand that we can get some benefits. But in the general case they will not be visible. > > With regards to the pre loop, pre loops have special checks too, do they not, requiring flow in many cases? 
> Programmable SIMD needs tight loops to accurately facilitate masked iteration mapping. Yes, after you explained vector masking to me I now understand why it could be used for the post loop. After thinking about this I would suggest you look at the arraycopy and generate_fill stubs instead in stubGenerator_x86*.cpp (maybe only 64-bit). They also have post loops but the changes would be only platform specific, smaller and easy to understand and test. Also arraycopy and 'fill' code are used very frequently by Java applications so we may get more benefits than optimizing general loops. Regards, Vladimir > > Regards, > Michael > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Tuesday, March 15, 2016 4:37 PM > To: Berg, Michael C; 'hotspot-compiler-dev at openjdk.java.net' > Subject: Re: CR for RFR 8151573 > > As we all know, we can always construct microbenchmarks which show a 30% - 50% difference, when in a real application we will never see a difference. I still don't see a real reason why we should spend time and optimize > *POST* loops. We already have a vectorized post loop to improve performance. Note, additional loop opts code will raise its maintenance cost. > > Why does "programmable SIMD" depend on it? What about the pre-loop? > > Thanks, > Vladimir > > On 3/15/16 4:14 PM, Berg, Michael C wrote: >> Correction below... >> >> -----Original Message----- >> From: hotspot-compiler-dev >> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of >> Berg, Michael C >> Sent: Tuesday, March 15, 2016 4:08 PM >> To: Vladimir Kozlov; 'hotspot-compiler-dev at openjdk.java.net' >> Subject: RE: CR for RFR 8151573 >> >> Vladimir, for programmable SIMD, which is the optimization which uses this implementation, I get the following on micros and code in general that look like this: >> >> for(int i = 0; i < process_len; i++) >> { >> d[i]= (a[i] * b[i]) + (a[i] * c[i]) + (b[i] * c[i]); >> } >> >> The above code makes 9 vector ops. 
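[Editor's note: here is a runnable version of that kernel. The count of nine vector ops per iteration is consistent with an accounting of 3 loads (a[i], b[i], c[i]), 3 multiplies, 2 adds, and 1 store, though the message does not spell the breakdown out; treat that decomposition as an assumption.]

```java
public class TripleProductKernel {
    // Runnable form of the measured kernel (array names taken from the message).
    // Per vectorized iteration, an assumed accounting of the 9 vector ops:
    // 3 vector loads + 3 multiplies + 2 adds + 1 vector store.
    static void kernel(float[] d, float[] a, float[] b, float[] c, int processLen) {
        for (int i = 0; i < processLen; i++) {
            d[i] = (a[i] * b[i]) + (a[i] * c[i]) + (b[i] * c[i]);
        }
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f}, b = {2f, 3f}, c = {3f, 4f};
        float[] d = new float[2];
        kernel(d, a, b, c, 2);
        System.out.println(d[0] + " " + d[1]); // 11.0 26.0
    }
}
```

When process_len is smaller than the array lengths, the trailing elements form exactly the residual iterations that the mask-vector fixup loop discussed in this thread would cover in a single pass.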
>> >> For float with vector length VecZ, I get as much as 1.3x and for int as much as 1.4x uplift. >> For double and long on VecZ it is smaller, but then so is the value of vectorization on those types anyways. >> The value process_len is some fraction of the array length in my measurements. The idea of the metrics is to pose a post loop with a modest amount of iterations in it. For instance, if N is the max trip count of the post loop and N is in the range 1..VecZ-1, then for float we could do as many as 15 iterations in the fixup loop. >> >> An example would be array_length = 512, process_len is a range of 81..96, we create a VecZ loop which was superunrolled 4 times with vector length 16, or unroll of 64, we align process 4 iterations, and the vectorized post loop is executed 1 time, leaving the remaining work in the final post loop, in this case possibly a multiversioned post loop. We start that final loop at iteration 81 so we always do at least 1 iteration fixup, and as many as 15. If we left the fixup loop as a scalar loop that would mean 1 to 15 iterations plus our initial loops which have {4,1,1} iterations as a group or 6 to get us to index 80. By vectorizing the fixup loop to one iteration we now always have 7 iterations in our loops for all ranges of 81..96, without this optimization and programmable SIMD, we would have the initial 6 plus 1 to 15 more, or a range of 7 to 21 iterations. >> >> Would you prefer I integrate this with programmable SIMD and submit the patches as one? >> >> I thought it would be easier to do them separately. Also, exposing the post loops to this path offloads cfg processing to earlier compilation, making the graph less complex through register allocation. 
>> >> Regards, >> Michael >> >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Tuesday, March 15, 2016 2:42 PM >> To: Berg, Michael C; 'hotspot-compiler-dev at openjdk.java.net' >> Subject: Re: CR for RFR 8151573 >> >> Hi Michael, >> >> Changes are significant so they have to be justified. Especially since we are in a later stage of jdk9 development. Do you have performance numbers (not only for microbenchmarks) which show the benefit of these changes? >> >> Thanks, >> Vladimir >> >> On 3/15/16 2:04 PM, Berg, Michael C wrote: >>> Hi Folks, >>> >>> I would like to contribute multi-versioning post loops for range >>> check elimination. Beforehand cfg optimizations after register >>> allocation were where post loop optimizations were done for range >>> checks. I have added code which produces the desired effect much >>> earlier by introducing a safe transformation which will minimally >>> allow a range-check-free version of the final post loop to execute up >>> until the point it actually has to take a range check exception by >>> re-ranging the limit of the rce'd loop, then exit the rce'd post loop >>> and take the range check exception in the legacy loop's execution if required. >>> If during optimization we discover that we know enough to remove the >>> range check version of the post loop, mostly by exposing the load >>> range values into the limit logic of the rce'd post loop, we will >>> eliminate the range check post loop altogether much like cfg >>> optimizations did, but much earlier. This gives optimizations like >>> programmable SIMD (via SuperWord) the opportunity to vectorize the >>> rce'd post loops to a single iteration based on mask vectors which >>> map to the residual iterations. Programmable SIMD will be a follow-on >>> change set utilizing this code to stage its work. This optimization >>> also exposes the rce'd post loop without flow to other optimizations. 
>>> Currently I have enabled this optimization for x86 only. We base >>> this loop on successfully rce'd main loops and if for whatever reason, multiversioning fails, we eliminate the loop we added. >>> >>> This code was tested as follows: >>> >>> >>> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8151573 >>> >>> >>> webrev: >>> >>> http://cr.openjdk.java.net/~mcberg/8151573/webrev.01/ >>> >>> Thanks, >>> >>> Michael >>> From tom.rodriguez at oracle.com Wed Mar 16 16:07:02 2016 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Wed, 16 Mar 2016 09:07:02 -0700 Subject: RFR(XS) 8151874: [JVMCI] canInlineMethod should check is_not_compilable for correct CompLevel In-Reply-To: References: <059D20A4-51E0-4DB3-B626-E922ED33A1F5@oracle.com> Message-ID: <4E4FD7EB-4BE0-4C7D-AACF-4878222C7EB7@oracle.com> Thanks! tom > On Mar 15, 2016, at 11:43 AM, Christian Thalinger wrote: > > Looks good. > >> On Mar 14, 2016, at 9:12 PM, Tom Rodriguez > wrote: >> >> http://cr.openjdk.java.net/~never/8151874/webrev/ >> >> Currently canInlineMethod is calling is_not_compilable() with no arguments which checks whether compilation has been disabled at any level which is incorrect. Only CompLevel_full_optimization should be consulted. >> >> tom > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tom.rodriguez at oracle.com Wed Mar 16 16:07:14 2016 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Wed, 16 Mar 2016 09:07:14 -0700 Subject: RFR(XS) 8151871: [JVMCI] missing HAS_PENDING_EXCEPTION check In-Reply-To: <56E833A9.7020300@oracle.com> References: <2228DD8E-063B-4BFE-8EB9-1FC6D2387A8C@oracle.com> <56E833A9.7020300@oracle.com> Message-ID: Thanks! tom > On Mar 15, 2016, at 9:09 AM, Vladimir Kozlov wrote: > > Good. > > Thanks, > Vladimir > > On 3/14/16 11:53 PM, Tom Rodriguez wrote: >> http://cr.openjdk.java.net/~never/8151871/webrev/index.html >> >> Somehow during various edits >> a HAS_PENDING_EXCEPTION/CLEAR_PENDING_EXCEPTION check got lost. 
It >> should be restored. >> >> tom From vladimir.kozlov at oracle.com Wed Mar 16 16:13:45 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 16 Mar 2016 09:13:45 -0700 Subject: RFR (XS): 8152004: CTW crashes with failed assertion after 8150646 integration In-Reply-To: References: Message-ID: <56E98639.4010401@oracle.com> Looks good. Thanks, Vladimir On 3/16/16 8:39 AM, Filipp Zhinkin wrote: > Hi all, > > please review a small fix that forces the VM to disable the BlockingCompilation > flag and ignore BlockingCompilationOption if the CompileTheWorld or > ReplayCompiles options are turned on. > > Webrev: http://cr.openjdk.java.net/~fzhinkin/8152004/webrev.00/ > Bug id: https://bugs.openjdk.java.net/browse/JDK-8152004 > Testing: CTW > > Regards, > Filipp. > From nils.eliasson at oracle.com Wed Mar 16 16:18:20 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 16 Mar 2016 17:18:20 +0100 Subject: RFR (XS): 8152004: CTW crashes with failed assertion after 8150646 integration In-Reply-To: References: Message-ID: <56E9874C.4050208@oracle.com> Looks good. Thanks for fixing. Regards, Nils On 2016-03-16 16:39, Filipp Zhinkin wrote: > Hi all, > > please review a small fix that forces the VM to disable the BlockingCompilation > flag and ignore BlockingCompilationOption if the CompileTheWorld or > ReplayCompiles options are turned on. > > Webrev: http://cr.openjdk.java.net/~fzhinkin/8152004/webrev.00/ > Bug id: https://bugs.openjdk.java.net/browse/JDK-8152004 > Testing: CTW > > Regards, > Filipp. From tobias.hartmann at oracle.com Wed Mar 16 17:10:00 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 16 Mar 2016 18:10:00 +0100 Subject: [9] RFR(M): 8023191: OSR nmethods should be flushed to free space in CodeCache Message-ID: <56E99368.80806@oracle.com> Hi, please review the following patch. 
https://bugs.openjdk.java.net/browse/JDK-8023191 http://cr.openjdk.java.net/~thartmann/8023191/webrev.00/ Currently, we only remove unloaded/not-entrant OSR nmethods from the code cache but never flush alive and unused ones. This is a problem if many OSR compilations are triggered but the methods are only used for a short time. If the corresponding classes are not unloaded, these nmethods will never be flushed and occupy space in the code cache. This can lead to a drop in performance from which we may never recover (see discussions [1], [2]). My fix enables flushing of OSR nmethods, treating them the same way as we treat "normal" compilations. I implemented a fast path that allows the sweeper to flush zombie OSR nmethods directly because they are never referenced by an inline cache. I refactored the debug printing in NMethodSweeper::process_nmethod() to get consistent output if OSR nmethods are flushed. During testing, I noticed that we need to clean the CodeCacheSweeperThread::_scanned_nmethod reference to the flushed nmethod in NMethodSweeper::release_nmethod() because otherwise the GC may visit the zombie nmethod at a safepoint that may occur during sweeping. I also fixed the make_unloaded()/make_zombie() code to only invoke nmethod::invalidate_osr_method() once and did some small typo/comment cleanups in related code. I've run jittest with the latest JDK 9 build and -XX:ReservedCodeCacheSize=20m -XX:-ClassUnloading. Graph [3] shows the results: the code cache fills up quickly and performance degrades significantly. We don't recover because OSR nmethods are not flushed and we therefore don't have enough space in the code cache to compile the methods of newly loaded classes. Graph [4] shows that the problem is solved by my fix. The code cache does not completely fill up and performance remains stable at a high level. Also the number of compilations is higher with the fix (86546 vs. 132495). 
Testing: - JPRT - RBT with hotspot_all and -Xcomp/-Xmixed - Nashorn + Octane with -XX:StartAggressiveSweepingAt=100/50 -XX:NmethodSweepActivity=500/100 and 100 runs each Thanks, Tobias [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-February/021732.html [2] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-March/021750.html [3] https://bugs.openjdk.java.net/secure/attachment/57899/base.png [4] https://bugs.openjdk.java.net/secure/attachment/57898/fix.png From rory.odonnell at oracle.com Wed Mar 16 17:20:41 2016 From: rory.odonnell at oracle.com (Rory O'Donnell) Date: Wed, 16 Mar 2016 17:20:41 +0000 Subject: JDK 9 build 109 -> Lucene's Ant build works again; still missing Hotspot patches In-Reply-To: <002d01d17c54$62e8dc20$28ba9460$@apache.org> References: <002d01d17c54$62e8dc20$28ba9460$@apache.org> Message-ID: <56E995E9.3020708@oracle.com> Hi Uwe, b110 is available on java.net and fixes to your issues are included! Rgds, Rory On 12/03/2016 11:43, Uwe Schindler wrote: > Hi, > > I just wanted to inform you that the first Lucene tests with build 109 of the JDK were working with Ant and Ivy (both on Linux and Windows, including whitespace in the build directory), so the Multi-Release JAR file fix did not break the build system anymore. Many thanks to Steve Drach and Alan Bateman for helping! I have seen the follow-up issue about the "#release" fragment, so I am looking forward to better compatibility with the new URL schemes and existing code. I will try to maybe write some test this weekend to help with that. > > Nevertheless, build 109 does not contain (according to the changelog) fixes for JDK-8150436 (still fails consistently) and JDK-8148786 (our duplicate issue JDK-8150280, happens sometimes). Those patches were committed long ago. What's the reason for delaying them in nightly builds? I was hoping for build 108 containing them (which was unusable because of the Ant build problems) and I was quite sure that they will be in build 109. 
Those 2 issues still make the test suite fail quite often (hotspot issues). On the issues the "resolved in" field contains "team". What does this mean? > > Uwe > > ----- > Uwe Schindler > uschindler at apache.org > ASF Member, Apache Lucene PMC / Committer > Bremen, Germany > http://lucene.apache.org/ > > -- Rgds, Rory O'Donnell Quality Engineering Manager Oracle EMEA, Dublin, Ireland From michael.c.berg at intel.com Wed Mar 16 17:30:12 2016 From: michael.c.berg at intel.com (Berg, Michael C) Date: Wed, 16 Mar 2016 17:30:12 +0000 Subject: CR for RFR 8151573 In-Reply-To: <56E97EC5.6030608@oracle.com> References: <56E881A9.7070004@oracle.com> <56E89CA4.8010201@oracle.com> <56E97EC5.6030608@oracle.com> Message-ID: Putting a hold on the review, retesting everything on my end. -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Wednesday, March 16, 2016 8:42 AM To: Berg, Michael C; 'hotspot-compiler-dev at openjdk.java.net' Subject: Re: CR for RFR 8151573 On 3/15/16 5:29 PM, Berg, Michael C wrote: > Vladimir: > > The reason programmable SIMD depends upon this is that all versions of the final post loop have range checks in them until very late, after register allocation, and might be cleaned up in cfg optimizations, but not always. With multiversioning, we always remove the range checks in our key loop. I understand that we can get some benefits. But in the general case they will not be visible. > > With regards to the pre loop, pre loops have special checks too, do they not, requiring flow in many cases? > Programmable SIMD needs tight loops to accurately facilitate masked iteration mapping. Yes, after you explained vector masking to me I now understand why it could be used for the post loop. After thinking about this I would suggest you look at the arraycopy and generate_fill stubs instead in stubGenerator_x86*.cpp (maybe only 64-bit). 
They also have post loops but the changes would be only platform specific, smaller and easy to understand and test. Also arraycopy and 'fill' code are used very frequently by Java applications so we may get more benefits than optimizing general loops. Regards, Vladimir > > Regards, > Michael > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Tuesday, March 15, 2016 4:37 PM > To: Berg, Michael C; 'hotspot-compiler-dev at openjdk.java.net' > Subject: Re: CR for RFR 8151573 > > As we all know, we can always construct microbenchmarks which show a 30% > - 50% difference, when in a real application we will never see > a difference. I still don't see a real reason why we should spend time > and optimize > *POST* loops. We already have a vectorized post loop to improve performance. Note, additional loop opts code will raise its maintenance cost. > > Why does "programmable SIMD" depend on it? What about the pre-loop? > > Thanks, > Vladimir > > On 3/15/16 4:14 PM, Berg, Michael C wrote: >> Correction below... >> >> -----Original Message----- >> From: hotspot-compiler-dev >> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of >> Berg, Michael C >> Sent: Tuesday, March 15, 2016 4:08 PM >> To: Vladimir Kozlov; 'hotspot-compiler-dev at openjdk.java.net' >> Subject: RE: CR for RFR 8151573 >> >> Vladimir, for programmable SIMD, which is the optimization which uses this implementation, I get the following on micros and code in general that look like this: >> >> for(int i = 0; i < process_len; i++) >> { >> d[i]= (a[i] * b[i]) + (a[i] * c[i]) + (b[i] * c[i]); >> } >> >> The above code makes 9 vector ops. >> >> For float with vector length VecZ, I get as much as 1.3x and for int as much as 1.4x uplift. >> For double and long on VecZ it is smaller, but then so is the value of vectorization on those types anyways. >> The value process_len is some fraction of the array length in my measurements. 
The idea of the metrics is to pose a post loop with a modest amount of iterations in it. For instance, if N is the max trip count of the post loop and N is in the range 1..VecZ-1, then for float we could do as many as 15 iterations in the fixup loop. >> >> An example would be array_length = 512, process_len is a range of 81..96, we create a VecZ loop which was superunrolled 4 times with vector length 16, or unroll of 64, we align process 4 iterations, and the vectorized post loop is executed 1 time, leaving the remaining work in the final post loop, in this case possibly a multiversioned post loop. We start that final loop at iteration 81 so we always do at least 1 iteration fixup, and as many as 15. If we left the fixup loop as a scalar loop that would mean 1 to 15 iterations plus our initial loops which have {4,1,1} iterations as a group or 6 to get us to index 80. By vectorizing the fixup loop to one iteration we now always have 7 iterations in our loops for all ranges of 81..96, without this optimization and programmable SIMD, we would have the initial 6 plus 1 to 15 more, or a range of 7 to 21 iterations. >> >> Would you prefer I integrate this with programmable SIMD and submit the patches as one? >> >> I thought it would be easier to do them separately. Also, exposing the post loops to this path offloads cfg processing to earlier compilation, making the graph less complex through register allocation. 
>> >> Thanks, >> Vladimir >> >> On 3/15/16 2:04 PM, Berg, Michael C wrote: >>> Hi Folks, >>> >>> I would like to contribute multi-versioning post loops for range >>> check elimination. Beforehand cfg optimizations after register >>> allocation were where post loop optimizations were done for range >>> checks. I have added code which produces the desired effect much >>> earlier by introducing a safe transformation which will minimally >>> allow a range check free version of the final post loop to execute >>> up until the point it actually has to take a range check exception >>> by re-ranging the limit of the rce'd loop, then exit the rce'd post >>> loop and take the range check exception in the legacy loops execution if required. >>> If during optimization we discover that we know enough to remove the >>> range check version of the post loop, mostly by exposing the load >>> range values into the limit logic of the rce'd post loop, we will >>> eliminate the range check post loop altogether much like cfg >>> optimizations did, but much earlier. This gives optimizations like >>> programmable SIMD (via SuperWord) the opportunity to vectorize the >>> rce'd post loops to a single iteration based on mask vectors which >>> map to the residual iterations. Programmable SIMD will be a follow >>> on change set utilizing this code to stage its work. This >>> optimization also exposes the rce'd post loop without flow to other optimizations. >>> Currently I have enabled this optimization for x86 only. We base >>> this loop on successfully rce'd main loops and if for whatever reason, multiversioning fails, we eliminate the loop we added. 
>>> >>> This code was tested as follows: >>> >>> >>> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8151573 >>> >>> >>> webrev: >>> >>> http://cr.openjdk.java.net/~mcberg/8151573/webrev.01/ >>> >>> Thanks, >>> >>> Michael >>> From zoltan.majo at oracle.com Wed Mar 16 17:59:13 2016 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Wed, 16 Mar 2016 18:59:13 +0100 Subject: [9] RFR (S): 8148754: C2 loop unrolling fails due to unexpected graph shape In-Reply-To: <56CE6CAF.9090904@oracle.com> References: <56C1ED18.6060903@oracle.com> <56C26929.4050706@oracle.com> <56CB1997.40107@oracle.com> <56CE6CAF.9090904@oracle.com> Message-ID: <56E99EF1.2030900@oracle.com> Hi Vladimir, I've spent more time on this issue. Please find my findings below. On 02/25/2016 03:53 AM, Vladimir Kozlov wrote: > So it is again _major_progress problem. > I have to spend more time on this. It is not simple. > > We may add an other state when Ideal transformation could be executed. > For example, after all loop opts: > > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/0fc557e05fc0/src/share/vm/opto/compile.cpp#l2286 > > > Or more states to specify which Ideal transformations and loop > optimizations could be executed in which state. I think adding more states is necessary, adding a single state is not sufficient as... (see below) > > The main problem from your description is elimination of Opaque1 on > which loop optimizations relies. > > We can simply remove Opaque1Node::Identity(PhaseGVN* phase) because > PhaseMacroExpand::expand_macro_nodes() will remove them after all loop > opts. ...there are even more places where Opaque1 nodes are removed, than we've initially assumed. 
The two I'm concerned about are - Compile::cleanup_loop_predicates() http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/compile.cpp#l1907 - IdealLoopTree::remove_main_post_loops() http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/loopTransform.cpp#l2469 > On other hand we may do want to execute some simple loop optimizations > even after Opaque, CastII and CastI2L are optimized out. For example, > removing empty loops or one iteration loops (pre-loops). But > definitely not ones which use cloning or other aggressive optimizations. Yes, I agree that we want to execute some loop optimizations even afterwards. So I've added three states: LOOP_OPTS_FULL, LOOP_OPTS_LIMITED, and LOOP_OPTS_INHIBITED. These states indicate which loop optimizations are allowed; _major_progress indicates only whether loop optimizations have made progress (but not whether loop optimizations are expected to be performed). > > Inline_Warm() is not used since InlineWarmCalls for very long time. > The code could be very rotten by now. So removing set_major_progress > from it is fine. OK. > > It is also fine to remove it from inline_incrementally since it will > be restored by skip_loop_opts code (and cleared if method is empty or > set if there are expensive nodes). OK. > > LoopNode::Ideal() change seems also fine. LoopNode is created only in > loop opts (RootNode has own Ideal()) so if it has TOP input it will be > removed by RegionNode::Ideal most likely. OK. > > Which leaves remove_useless_bool() code only and I have concern about > it. It could happened after CCP phase and we may want to execute loop > opts after it. I am actually want to set major progress after CCP > unconditionally since some If nodes could be folded by it. Yes, that makes sense and I did it. > > As you can see it is not simple :( No, it's not simple at all. I did a prototype that implements all we discussed above. 
Here is the code: http://cr.openjdk.java.net/~zmajo/code/8148754/webrev/ The code is not yet RFR quality, but I've sent it out because I'd like to have your feedback on how to continue. The code fixes the current problem with the unexpected graph shape. But it is likely to also solve similar problems that are also triggered by an unexpected graph shape, for example any of the asserts in PhaseIdealLoop::do_range_check: http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/loopTransform.cpp#l2124 I evaluated performance of the prototype. Performance improves in a number of cases by 1-4%: Octane-Mandreel Octane-Richards Octane-Splay Unfortunately, there is also a performance regression with SPECjvm2008-MonteCarlo-G1 (3-5%). Finding the cause of that regression is likely to take at least a week, but most likely even more. So my question is: Should I spend more time on this prototype and fix the performance regression? A different solution would be to check the complete graph shape. That is also done at other places, e.g., in SuperWord::get_pre_loop_end() http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/superword.cpp#l3076 Here is the webrev for the second solution: http://cr.openjdk.java.net/~zmajo/8148754/webrev.03/ The second solution does not regress. I've tested it with: - JPRT; - local testing (linux-86_64) with the failing test case; - executing all hotspot tests locally, all tests pass that pass with an unmodified build. Can you please let me know which solution you prefer: - (1) the prototype with the regression solved or - (2) checking the graph shape? We could also fix this issue by pushing (2) for now (as this issue is a "critical" nightly failure). I could then spend more time on (1) later in a different bug. Thank you and best regards, Zoltan > Thanks, > Vladimir > > On 2/22/16 6:22 AM, Zoltán Majó wrote: >> Hi Vladimir, >> >> >> thank you for the feedback! 
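The LOOP_OPTS_FULL/LIMITED/INHIBITED idea from the prototype above can be sketched as a per-compilation state consulted before each transformation runs (a toy mock — it assumes nothing about the real Compile interface, only the gating behavior the mail describes):

```cpp
#include <cassert>

// Mock of the proposed per-compilation loop-optimization state.
// The enumerator names come from the mail; everything else is invented.
enum LoopOptsState { LOOP_OPTS_FULL, LOOP_OPTS_LIMITED, LOOP_OPTS_INHIBITED };

struct MockCompile {
    LoopOptsState _loop_opts_state;
    explicit MockCompile(LoopOptsState s) : _loop_opts_state(s) {}

    // Aggressive transforms (cloning, unrolling, RCE) need the full state.
    bool allow_aggressive_loop_opts() const {
        return _loop_opts_state == LOOP_OPTS_FULL;
    }
    // Simple transforms (e.g. empty-loop removal) also run in the
    // limited state, after Opaque/CastII nodes may already be gone.
    bool allow_simple_loop_opts() const {
        return _loop_opts_state != LOOP_OPTS_INHIBITED;
    }
};
```

The point of separating this from _major_progress is that "which opts are allowed" and "did opts make progress" become independent questions.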
>> >> On 02/16/2016 01:11 AM, Vladimir Kozlov wrote: >>> Zoltan, >>> >>> It should not be "main" loop if peeling happened. See do_peeling(): >>> >>> if (cl->is_main_loop()) { >>> cl->set_normal_loop(); >>> >>> Split-if optimization should not split through loop's phi. And >>> generally not through loop's head since it is not making code better - >>> split through backedge moves code into loop again. Making loop body >>> more complicated as this case shows. >> >> I did more investigation to understand what causes the invalid graph >> shape to appear. It seems that the invalid graph shape appears because >> the compiler uses the Compile:: _major_progress inconsistently. Here are >> some details. >> >> - If _major_progress *is set*, the compiler expects more loop >> optimizations to happen. Therefore, certain transformations on the graph >> are not allowed so that the graph is in a shape that can be processed by >> loop optimizations. See: >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/convertnode.cpp#l253 >> >> >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/castnode.cpp#l251 >> >> >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/loopnode.cpp#l950 >> >> >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/opaquenode.cpp#l37 >> >> >> >> - If _major_progress *is not set*, the compiler is allowed to perform >> all possible transformations (because it does not have to care about >> future loop optimizations). >> >> The crash reported for the current issue appears because _major_progress >> *can be accidentally set again* after the compiler decided to stop >> performing loop optimizations. As a result, invalid graph shapes appear. 
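The failure mode described above — _major_progress accidentally set again after the compiler has decided to stop performing loop optimizations — reduces to a small model (illustrative only; MockCompile and its method names are invented, not the real Compile API):

```cpp
#include <cassert>

// Toy model of the proposed flag discipline.
struct MockCompile {
    bool _major_progress;   // set at init: loop opts expected afterwards
    bool _loop_opts_over;
    MockCompile() : _major_progress(true), _loop_opts_over(false) {}

    void finish_loop_opts() {
        _loop_opts_over = true;
        _major_progress = false;
    }

    // A phase outside the loop-opts scope (IGVN, incremental inlining)
    // asks for major progress. Under the proposed rule the request is
    // ignored once loop opts are over, so graph-shape invariants that
    // only hold during loop opts cannot be re-asserted by accident.
    void request_major_progress_outside_loop_opts() {
        if (!_loop_opts_over) {
            _major_progress = true;
        }
    }
};
```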
>> >> Here are details about how this happens for both failures I've been >> studying: >> https://bugs.openjdk.java.net/browse/JDK-8148754?focusedCommentId=13901941&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13901941 >> >> >> >> I would propose to change the compiler to use _major_progress >> consistently. (This goes into the same direction as Tobias's recent work >> on JDK-8144487.) >> >> I propose that _major_progress: >> - can be SET when the compiler is initialized (because loop >> optimizations are expected to happen afterwards); >> - can be SET/RESET in the scope of loop optimizations (because we want >> to see if loop optimizations made progress); >> - cannot be SET/RESET by either incremental inlining or IGVN (even if >> the IGVN is performed in the scope of loop optimizations). >> >> Here is the updated webrev: >> http://cr.openjdk.java.net/~zmajo/8148754/webrev.02/ >> >> Performance evaluation: >> - The proposed webrev does not cause performance regressions for >> SPECjvm2008, SPECjbb2005, and Octane. >> >> Testing: >> - all hotspot JTREG tests on all supported platforms; >> - JPRT; >> - failing test case. >> >> Thank you and best regards, >> >> >> Zoltan >> >> >>> >>> Bailout unrolling is fine but performance may suffer because in some >>> cases loop unrolling is better than split-if. >> >> >>> >>> Thanks, >>> Vladimir >>> >>> On 2/15/16 7:22 AM, Zoltán Majó wrote: >>>> Hi, >>>> >>>> >>>> please review the patch for 8148754. >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8148754 >>>> >>>> Problem: Compilation fails when the C2 compiler attempts loop >>>> unrolling. >>>> The cause of the failure is that the loop unrolling optimization >>>> expects >>>> a well-defined graph shape at the entry control of a 'CountedLoopNode' >>>> ('IfTrue'/'IfFalse' preceded by 'If' preceded by 'Bool' preceded by >>>> 'CmpI'). >>>> >>>> >>>> Solution: I investigated several different instances of the same >>>> failure. 
It turns out that the shape of the graph at a loop's entry >>>> control is often different from the way loop unrolling expects it >>>> to be >>>> (please find some examples in the bug's JBS issue). The various graph >>>> shapes are a result of previously performed transformations, e.g., >>>> split-if optimization and loop peeling. >>>> >>>> Loop unrolling requires the above mentioned graph shape so that it can >>>> adjust the zero-trip guard of the loop. With the unexpected graph >>>> shapes, it is not possible to perform loop unrolling. However, the >>>> graph >>>> is still in a valid state (except for loop unrolling) and can be >>>> used to >>>> produce correct code. >>>> >>>> I propose that (1) we check if an unexpected graph shape is >>>> encountered >>>> and (2) bail out of loop unrolling if it is (but not fail in the >>>> compiler in such cases). >>>> >>>> The failure was triggered by Aleksey's Indify String Concatenation >>>> changes but the generated bytecodes are valid. So this seems to be a >>>> compiler issue that was previously there but was not yet triggered. >>>> >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~zmajo/8148754/webrev.00/ >>>> >>>> Testing: >>>> - JPRT; >>>> - local testing (linux-86_64) with the failing test case; >>>> - executed all hotspot tests locally, all tests pass that pass with an >>>> unmodified build. >>>> >>>> Thank you! >>>> >>>> Best regards, >>>> >>>> >>>> Zoltan >>>> >> From vladimir.kozlov at oracle.com Wed Mar 16 18:38:01 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 16 Mar 2016 11:38:01 -0700 Subject: [9] RFR(M): 8023191: OSR nmethods should be flushed to free space in CodeCache In-Reply-To: <56E99368.80806@oracle.com> References: <56E99368.80806@oracle.com> Message-ID: <56E9A809.1030400@oracle.com> Nice graphs! nmethod.hpp - we try to avoid calling SNRH() in such case but use fatal() with printing value (_state in this case) to get more info in error output. 
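The review point above about preferring fatal() with the offending value over ShouldNotReachHere() is about error-output quality; the contrast can be mocked like this (hedged sketch — the real macros live in HotSpot's debug.hpp and abort the VM rather than return strings):

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Mock of the two failure-report styles.
std::string snrh_message() {
    return "ShouldNotReachHere";          // the offending state is lost
}

std::string fatal_message(int state) {
    char buf[64];
    std::snprintf(buf, sizeof(buf), "unexpected nmethod state: %d", state);
    return std::string(buf);              // the hs_err log keeps the state
}
```

With the second style, a crash report from the field already tells the reviewer which _state value was seen.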
In invalidate_osr_method() add {} for the != NULL check. I am a little worried about the next change: ! if (is_osr_method() && is_in_use()) { invalidate_osr_method(); Maybe we should move the is_in_use() check inside invalidate_osr_method() and, in the false case, have the debug VM verify that the osr nmethod is not on the list. Otherwise looks good. Thanks, Vladimir On 3/16/16 10:10 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch. > > https://bugs.openjdk.java.net/browse/JDK-8023191 > http://cr.openjdk.java.net/~thartmann/8023191/webrev.00/ > > Currently, we only remove unloaded/not-entrant OSR nmethods from the code cache but never flush alive and unused ones. This is a problem if many OSR compilations are triggered but the methods are only used for a short time. If the corresponding classes are not unloaded, these nmethods will never be flushed and occupy space in the code cache. This can lead to a drop in performance from which we may never recover (see discussions [1], [2]). > > My fix enables flushing of OSR nmethods, treating them the same way as we treat "normal" compilations. I implemented a fast path, that allows the sweeper to flush zombie OSR nmethods directly because they are never referenced by an inline cache. > > I refactored the debug printing in NMethodSweeper::process_nmethod() to get consistent output if OSR nmethods are flushed. During testing, I noticed that we need to clean the CodeCacheSweeperThread::_scanned_nmethod reference to the flushed nmethod in NMethodSweeper::release_nmethod() because otherwise the GC may visit the zombie nmethod at a safepoint that may occur during sweeping. > > I also fixed the make_unloaded()/make_zombie() code to only invoke nmethod::invalidate_osr_method() once and did some small typo/comment cleanups in related code. > > I've run jittest with the latest JDK 9 build and -XX:ReservedCodeCacheSize=20m -XX:-ClassUnloading. 
Graph [3] shows the results: the code cache fills up quickly and performance degrades significantly. We don't recover because OSR nmethods are not flushed and we therefore don't have enough space in the code cache to compile the methods of newly loaded classes. Graph [4] shows that the problem is solved by my fix. The code cache does not completely fill up and performance remains stable at a high level. Also the number of compilations is higher with the fix (86546 vs. 132495). > > Testing: > - JPRT > - RBT with hotspot_all and -Xcomp/-Xmixed > - Nashorn + Octane with -XX:StartAggressiveSweepingAt=100/50 -XX:NmethodSweepActivity=500/100 and 100 runs each > > Thanks, > Tobias > > [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-February/021732.html > [2] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-March/021750.html > [3] https://bugs.openjdk.java.net/secure/attachment/57899/base.png > [4] https://bugs.openjdk.java.net/secure/attachment/57898/fix.png > From vladimir.kozlov at oracle.com Wed Mar 16 20:52:59 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 16 Mar 2016 13:52:59 -0700 Subject: [9] RFR (S): 8148754: C2 loop unrolling fails due to unexpected graph shape In-Reply-To: <56E99EF1.2030900@oracle.com> References: <56C1ED18.6060903@oracle.com> <56C26929.4050706@oracle.com> <56CB1997.40107@oracle.com> <56CE6CAF.9090904@oracle.com> <56E99EF1.2030900@oracle.com> Message-ID: <56E9C7AB.8050504@oracle.com> > Can you please let me know which solution you prefer: > - (1) the prototype with the regression solved or > - (2) checking the graph shape? I agree that we should do (2) now. One suggestion I have is to prepare a separate method to do these checks and use it in other places which you pointed - superword and do_range_check. Yes, you can work later on (1) solution if you have a bug and not RFE - we should stabilize C2 as you know and these changes may have some side effects we don't know about yet. 
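Such a shared helper for the expected entry shape (IfTrue/IfFalse fed by If fed by Bool fed by CmpI) might look like the following toy version; the Node struct here is a stand-in for illustration, not C2's Node class, and the input positions are assumptions:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal stand-in for an ideal-graph node: an opcode name plus inputs.
struct Node {
    std::string op;
    std::vector<const Node*> in;
};

// Shared check in the spirit of the suggestion: verify the
// zero-trip-guard shape above a counted-loop entry and let callers
// (unrolling, superword, range-check elimination) bail out cleanly
// instead of asserting on an unexpected shape.
bool has_expected_entry_shape(const Node* entry) {
    if (entry == nullptr || (entry->op != "IfTrue" && entry->op != "IfFalse"))
        return false;
    const Node* iff = entry->in.empty() ? nullptr : entry->in[0];
    if (iff == nullptr || iff->op != "If") return false;
    const Node* bol = iff->in.size() < 2 ? nullptr : iff->in[1];
    if (bol == nullptr || bol->op != "Bool") return false;
    const Node* cmp = bol->in.empty() ? nullptr : bol->in[0];
    return cmp != nullptr && cmp->op == "CmpI";
}
```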
But I like it because we can explicitly specify which optimizations are allowed. > The two I'm concerned about are > - Compile::cleanup_loop_predicates() Yes, this one should be marked LOOP_OPTS_LIMITED. > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/compile.cpp#l1907 > - IdealLoopTree::remove_main_post_loops() This one is fine because the loop goes away. Thanks, Vladimir On 3/16/16 10:59 AM, Zoltán Majó wrote: > Hi Vladimir, > > > I've spent more time on this issue. Please find my findings below. > > On 02/25/2016 03:53 AM, Vladimir Kozlov wrote: >> So it is again _major_progress problem. >> I have to spend more time on this. It is not simple. >> >> We may add an other state when Ideal transformation could be executed. For example, after all loop opts: >> >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/0fc557e05fc0/src/share/vm/opto/compile.cpp#l2286 >> >> Or more states to specify which Ideal transformations and loop optimizations could be executed in which state. > > I think adding more states is necessary, adding a single state is not sufficient as... (see below) > >> >> The main problem from your description is elimination of Opaque1 on which loop optimizations relies. >> >> We can simply remove Opaque1Node::Identity(PhaseGVN* phase) because PhaseMacroExpand::expand_macro_nodes() will remove them after all loop opts. > > ...there are even more places where Opaque1 nodes are removed, than we've initially assumed. > > The two I'm concerned about are > - Compile::cleanup_loop_predicates() Yes, this one should be marked LOOP_OPTS_LIMITED. > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/compile.cpp#l1907 > - IdealLoopTree::remove_main_post_loops() This one is fine because the loop goes away. 
> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/loopTransform.cpp#l2469 > >> On other hand we may do want to execute some simple loop optimizations even after Opaque, CastII and CastI2L are optimized out. For example, removing empty loops or one iteration loops (pre-loops). >> But definitely not ones which use cloning or other aggressive optimizations. > > Yes, I agree that we want to execute some loop optimizations even afterwards. > > So I've added three states: LOOP_OPTS_FULL, LOOP_OPTS_LIMITED, and LOOP_OPTS_INHIBITED. These states indicate which loop optimizations are allowed, Major_progress indicates only if loop optimizations > have made progress (but not if loop optimizations are expected to be performed). > >> >> Inline_Warm() is not used since InlineWarmCalls for very long time. The code could be very rotten by now. So removing set_major_progress from it is fine. > > OK. > >> >> It is also fine to remove it from inline_incrementally since it will be restored by skip_loop_opts code (and cleared if method is empty or set if there are expensive nodes). > > OK. > >> >> LoopNode::Ideal() change seems also fine. LoopNode is created only in loop opts (RootNode has own Ideal()) so if it has TOP input it will be removed by RegionNode::Ideal most likely. > > OK. > >> >> Which leaves remove_useless_bool() code only and I have concern about it. It could happened after CCP phase and we may want to execute loop opts after it. I am actually want to set major progress >> after CCP unconditionally since some If nodes could be folded by it. > > Yes, that makes sense and I did it. > >> >> As you can see it is not simple :( > > No, it's not simple at all. I did a prototype that implements all we discussed above. Here is the code: > http://cr.openjdk.java.net/~zmajo/code/8148754/webrev/ > > The code is not yet RFR quality, but I've sent it out because I'd like to have your feedback on how to continue. 
> > The code fixes the current problem with the unexpected graph shape. But it is likely to also solve similar problems that are also triggered by an unexpected graph shape, for example any of the asserts > in PhaseIdealLoop::do_range_check: > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/loopTransform.cpp#l2124 > > I evaluated performance of the prototype. Performance improves in a number of cases by 1-4%: > Octane-Mandreel > Octane-Richards > Octane-Splay > > Unfortunately, there is also a performance regression with SPECjvm2008-MonteCarlo-G1 (3-5%). Finding the cause of that regression is likely to take at least a week, but most likely even more. > > So my question is: Should I spend more time on this prototype and fix the performance regression? > > A different solution would be to check the complete graph shape. That is also done at other places, e.g., in SuperWord::get_pre_loop_end() > http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/superword.cpp#l3076 > > Here is the webrev for the second solution: > http://cr.openjdk.java.net/~zmajo/8148754/webrev.03/ > > The second solution does not regress. I've tested it with: > - JPRT; > - local testing (linux-86_64) with the failing test case; > - executing all hotspot tests locally, all tests pass that pass with an unmodified build. > > Can you please let me know which solution you prefer: > - (1) the prototype with the regression solved or > - (2) checking the graph shape? > > We could also fix this issue with pushing (2) for now (as this issue is a "critical" nightly failure). I could then spend more time on (1) later in a different bug. > > Thank you and best regards, > > > Zoltan > >> Thanks, >> Vladimir >> >> On 2/22/16 6:22 AM, Zoltán Majó wrote: >>> Hi Vladimir, >>> >>> >>> thank you for the feedback! >>> >>> On 02/16/2016 01:11 AM, Vladimir Kozlov wrote: >>>> Zoltan, >>>> >>>> It should not be "main" loop if peeling happened. 
See do_peeling(): >>>> >>>> if (cl->is_main_loop()) { >>>> cl->set_normal_loop(); >>>> >>>> Split-if optimization should not split through loop's phi. And >>>> generally not through loop's head since it is not making code better - >>>> split through backedge moves code into loop again. Making loop body >>>> more complicated as this case shows. >>> >>> I did more investigation to understand what causes the invalid graph >>> shape to appear. It seems that the invalid graph shape appears because >>> the compiler uses the Compile:: _major_progress inconsistently. Here are >>> some details. >>> >>> - If _major_progress *is set*, the compiler expects more loop >>> optimizations to happen. Therefore, certain transformations on the graph >>> are not allowed so that the graph is in a shape that can be processed by >>> loop optimizations. See: >>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/convertnode.cpp#l253 >>> >>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/castnode.cpp#l251 >>> >>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/loopnode.cpp#l950 >>> >>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/opaquenode.cpp#l37 >>> >>> >>> - If _major_progress *is not set*, the compiler is allowed to perform >>> all possible transformations (because it does not have to care about >>> future loop optimizations). >>> >>> The crash reported for the current issue appears because _major_progress >>> *can be accidentally set again* after the compiler decided to stop >>> performing loop optimizations. As a result, invalid graph shapes appear. 
>>> >>> Here are details about how this happens for both failures I've been >>> studying: >>> https://bugs.openjdk.java.net/browse/JDK-8148754?focusedCommentId=13901941&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13901941 >>> >>> >>> I would propose to change the compiler to use _major_progress >>> consistently. (This goes into the same direction as Tobias's recent work >>> on JDK-8144487.) >>> >>> I propose that _major_progress: >>> - can be SET when the compiler is initialized (because loop >>> optimizations are expected to happen afterwards); >>> - can be SET/RESET in the scope of loop optimizations (because we want >>> to see if loop optimizations made progress); >>> - cannot be SET/RESET by either incremental inlining or IGVN (even if >>> the IGVN is performed in the scope of loop optimizations). >>> >>> Here is the updated webrev: >>> http://cr.openjdk.java.net/~zmajo/8148754/webrev.02/ >>> >>> Performance evaluation: >>> - The proposed webrev does not cause performance regressions for >>> SPECjvm2008, SPECjbb2005, and Octane. >>> >>> Testing: >>> - all hotspot JTREG tests on all supported platforms; >>> - JPRT; >>> - failing test case. >>> >>> Thank you and best regards, >>> >>> >>> Zoltan >>> >>> >>>> >>>> Bailout unrolling is fine but performance may suffer because in some >>>> cases loop unrolling is better than split-if. >>> >>> >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 2/15/16 7:22 AM, Zoltán Majó wrote: >>>>> Hi, >>>>> >>>>> >>>>> please review the patch for 8148754. >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8148754 >>>>> >>>>> Problem: Compilation fails when the C2 compiler attempts loop unrolling. >>>>> The cause of the failure is that the loop unrolling optimization expects >>>>> a well-defined graph shape at the entry control of a 'CountedLoopNode' >>>>> ('IfTrue'/'IfFalse' preceded by 'If' preceded by 'Bool' preceded by >>>>> 'CmpI'). 
>>>>> >>>>> >>>>> Solution: I investigated several different instances of the same >>>>> failure. It turns out that the shape of the graph at a loop's entry >>>>> control is often different from the way loop unrolling expects it to be >>>>> (please find some examples in the bug's JBS issue). The various graph >>>>> shapes are a result of previously performed transformations, e.g., >>>>> split-if optimization and loop peeling. >>>>> >>>>> Loop unrolling requires the above mentioned graph shape so that it can >>>>> adjust the zero-trip guard of the loop. With the unexpected graph >>>>> shapes, it is not possible to perform loop unrolling. However, the graph >>>>> is still in a valid state (except for loop unrolling) and can be used to >>>>> produce correct code. >>>>> >>>>> I propose that (1) we check if an unexpected graph shape is encountered >>>>> and (2) bail out of loop unrolling if it is (but not fail in the >>>>> compiler in such cases). >>>>> >>>>> The failure was triggered by Aleksey's Indify String Concatenation >>>>> changes but the generated bytecodes are valid. So this seems to be a >>>>> compiler issue that was previously there but was not yet triggered. >>>>> >>>>> >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~zmajo/8148754/webrev.00/ >>>>> >>>>> Testing: >>>>> - JPRT; >>>>> - local testing (linux-86_64) with the failing test case; >>>>> - executed all hotspot tests locally, all tests pass that pass with an >>>>> unmodified build. >>>>> >>>>> Thank you! >>>>> >>>>> Best regards, >>>>> >>>>> >>>>> Zoltan >>>>> >>> > From filipp.zhinkin at gmail.com Thu Mar 17 06:56:52 2016 From: filipp.zhinkin at gmail.com (Filipp Zhinkin) Date: Thu, 17 Mar 2016 09:56:52 +0300 Subject: RFR (XS): 8152004: CTW crashes with failed assertion after 8150646 integration In-Reply-To: <56E98639.4010401@oracle.com> References: <56E98639.4010401@oracle.com> Message-ID: Thank you, Vladimir. Regards, Filipp. On Wed, Mar 16, 2016 at 7:13 PM, Vladimir Kozlov wrote: > Looks good. 
> > Thanks, > Vladimir > > On 3/16/16 8:39 AM, Filipp Zhinkin wrote: >> >> Hi all, >> >> please review a small fix that force VM to disable BlockingCompilation >> flag and ignore BlockingCompilationOption if CompileTheWorld or >> ReplayCompiles options were turned on. >> >> Webrev: http://cr.openjdk.java.net/~fzhinkin/8152004/webrev.00/ >> Bug id: https://bugs.openjdk.java.net/browse/JDK-8152004 >> Testing: CTW >> >> Regards, >> Filipp. >> > From filipp.zhinkin at gmail.com Thu Mar 17 06:59:32 2016 From: filipp.zhinkin at gmail.com (Filipp Zhinkin) Date: Thu, 17 Mar 2016 09:59:32 +0300 Subject: RFR (XS): 8152004: CTW crashes with failed assertion after 8150646 integration In-Reply-To: <56E9874C.4050208@oracle.com> References: <56E9874C.4050208@oracle.com> Message-ID: Nils, thank you for the review. May I ask you to push this change? A patch with correct commit message is here: http://cr.openjdk.java.net/~fzhinkin/8152004/webrev.01/hotspot.changeset Thanks, Filipp. On Wed, Mar 16, 2016 at 7:18 PM, Nils Eliasson wrote: > Looks good. > > Thanks for fixing. > > Regards, > Nils > > > On 2016-03-16 16:39, Filipp Zhinkin wrote: >> >> Hi all, >> >> please review a small fix that force VM to disable BlockingCompilation >> flag and ignore BlockingCompilationOption if CompileTheWorld or >> ReplayCompiles options were turned on. >> >> Webrev: http://cr.openjdk.java.net/~fzhinkin/8152004/webrev.00/ >> Bug id: https://bugs.openjdk.java.net/browse/JDK-8152004 >> Testing: CTW >> >> Regards, >> Filipp. > > From tobias.hartmann at oracle.com Thu Mar 17 08:29:32 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 17 Mar 2016 09:29:32 +0100 Subject: [9] RFR(M): 8023191: OSR nmethods should be flushed to free space in CodeCache In-Reply-To: <56E9A809.1030400@oracle.com> References: <56E99368.80806@oracle.com> <56E9A809.1030400@oracle.com> Message-ID: <56EA6AEC.8000905@oracle.com> Hi Vladimir, thanks for the review! 
On 16.03.2016 19:38, Vladimir Kozlov wrote: > Nice graphs! > > nmethod.hpp - we try to avoid calling SNRH() in such case but use fatal() with printing value (_state in this case) to get more info in error output. Okay, I changed it to _fatal. > In invalidate_osr_method() add {} for != NULL check. Done. > I am a little worry about next change: > > ! if (is_osr_method() && is_in_use()) { > invalidate_osr_method(); > > May be we should move is_in_use() check inside invalidate_osr_method() and in false case in debug VM do verification that osr nmethod is not on the list. Right, I changed remove_osr_nmethod() to return true if the osr nmethod was found/removed and moved the is_in_use() check into invalidate_osr_method(). Like this, we verify that the nmethod was invalidated if it is !is_in_use(). New webrev: http://cr.openjdk.java.net/~thartmann/8023191/webrev.01/ Thanks, Tobias > Otherwise looks good. > > Thanks, > Vladimir > > On 3/16/16 10:10 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch. >> >> https://bugs.openjdk.java.net/browse/JDK-8023191 >> http://cr.openjdk.java.net/~thartmann/8023191/webrev.00/ >> >> Currently, we only remove unloaded/not-entrant OSR nmethods from the code cache but never flush alive and unused ones. This is a problem if many OSR compilations are triggered but the methods are only used for a short time. If the corresponding classes are not unloaded, these nmethods will never be flushed and occupy space in the code cache. This can lead to a drop in performance from which we may never recover (see discussions [1], [2]). >> >> My fix enables flushing of OSR nmethods, treating them the same way as we treat "normal" compilations. I implemented a fast path, that allows the sweeper to flush zombie OSR nmethods directly because they are never referenced by an inline cache. >> >> I refactored the debug printing in NMethodSweeper::process_nmethod() to get consistent output if OSR nmethods are flushed. 
During testing, I noticed that we need to clean the CodeCacheSweeperThread::_scanned_nmethod reference to the flushed nmethod in NMethodSweeper::release_nmethod() because otherwise the GC may visit the zombie nmethod at a safepoint that may occur during sweeping. >> >> I also fixed the make_unloaded()/make_zombie() code to only invoke nmethod::invalidate_osr_method() once and did some small typo/comment cleanups in related code. >> >> I've run jittest with the latest JDK 9 build and -XX:ReservedCodeCacheSize=20m -XX:-ClassUnloading. Graph [3] shows the results: the code cache fills up quickly and performance degrades significantly. We don't recover because OSR nmethods are not flushed and we therefore don't have enough space in the code cache to compile the methods of newly loaded classes. Graph [4] shows that the problem is solved by my fix. The code cache does not completely fill up and performance remains stable at a high level. Also the number of compilations is higher with the fix (86546 vs. 132495). 
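The reviewed control flow — the is_in_use() check moved inside invalidate_osr_method(), with a debug check in the false case — can be sketched with a mock nmethod (illustrative only; the real code unlinks the nmethod from a per-class OSR list and the member names here are stand-ins):

```cpp
#include <cassert>

struct MockNMethod {
    bool _in_use;
    bool _on_osr_list;
    MockNMethod() : _in_use(true), _on_osr_list(true) {}

    // Returns whether the nmethod was found on (and removed from) the list.
    bool remove_osr_nmethod() {
        bool found = _on_osr_list;
        _on_osr_list = false;
        return found;
    }

    void invalidate_osr_method() {
        if (_in_use) {
            bool removed = remove_osr_nmethod();
            assert(removed);        // an in-use OSR nmethod must be listed
            _in_use = false;
        } else {
            assert(!_on_osr_list);  // already invalidated: must be unlinked
        }
    }
};
```

Moving the check inside makes a second call harmless and turns the "not on the list" expectation into a debug-time verification instead of silent behavior.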
>> >> Testing: >> - JPRT >> - RBT with hotspot_all and -Xcomp/-Xmixed >> - Nashorn + Octane with -XX:StartAggressiveSweepingAt=100/50 -XX:NmethodSweepActivity=500/100 and 100 runs each >> >> Thanks, >> Tobias >> >> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-February/021732.html >> [2] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-March/021750.html >> [3] https://bugs.openjdk.java.net/secure/attachment/57899/base.png >> [4] https://bugs.openjdk.java.net/secure/attachment/57898/fix.png >> From uschindler at apache.org Thu Mar 17 14:25:39 2016 From: uschindler at apache.org (Uwe Schindler) Date: Thu, 17 Mar 2016 15:25:39 +0100 Subject: JDK 9 build 109 -> Lucene's Ant build works again; still missing Hotspot patches In-Reply-To: <56E995E9.3020708@oracle.com> References: <002d01d17c54$62e8dc20$28ba9460$@apache.org> <56E995E9.3020708@oracle.com> Message-ID: <03f201d18058$e43e2390$acba6ab0$@apache.org> Hi Rory, thanks for the pointer. I installed it on our Jenkins server. Will report back. My local tests showed that the MethodHandle-bug is solved, the other one is hopefully fixed, too. Robert may have a way to quickly reproduce. Uwe ----- Uwe Schindler uschindler at apache.org ASF Member, Apache Lucene PMC / Committer Bremen, Germany http://lucene.apache.org/ > -----Original Message----- > From: Rory O'Donnell [mailto:rory.odonnell at oracle.com] > Sent: Wednesday, March 16, 2016 6:21 PM > To: Uwe Schindler ; Core-Libs-Dev dev at openjdk.java.net>; 'Steve Drach' ; Alan > Bateman > Cc: rory.odonnell at oracle.com; hotspot-compiler-dev at openjdk.java.net; > 'Robert Muir' > Subject: Re: JDK 9 build 109 -> Lucene's Ant build works again; still missing > Hotspot patches > > Hi Uwe, > > b110 is available on java.net and fixes to your issues are included ! 
> > Rgds, Rory > On 12/03/2016 11:43, Uwe Schindler wrote: > > Hi, > > > > I just wanted to inform you that the first Lucene test with build 109 of JDK > were working with Ant and Ivy (both on Linux and Windows including > whitespace in build directory), so the Multi-Release JAR file fix did not break > the build system anymore. Many thanks to Steve Drach and Alan Bateman > for helping! I have seen the follow-up issue about the "#release" fragment, > so I am looking forward to better compatibility with the new URL schemes > and existing code. I will try to maybe write some test this weekend to help > with that. > > > > Nevertheless, build 109 does not contain (according to the changelog) fixes > for JDK-8150436 (still fails consistently) and JDK-8148786 (our duplicate issue > JDK-8150280, happens sometimes). Those patches were committed long ago. > What's the reason for delaying them in nightly builds? I was hoping for build > 108 containing them (which was unusable because of the Ant build > problems) and I was quite sure that they will be in build 109. Those 2 issues > still make the test suite fail quite often (hotspot issues). On the issues the > "resolved in" field contains "team". What does this mean? 
> > > > Uwe > > > > ----- > > Uwe Schindler > > uschindler at apache.org > > ASF Member, Apache Lucene PMC / Committer > > Bremen, Germany > > http://lucene.apache.org/ > > > > > > -- > Rgds, Rory O'Donnell > Quality Engineering Manager > Oracle EMEA, Dublin, Ireland From rcmuir at gmail.com Thu Mar 17 15:00:39 2016 From: rcmuir at gmail.com (Robert Muir) Date: Thu, 17 Mar 2016 11:00:39 -0400 Subject: JDK 9 build 109 -> Lucene's Ant build works again; still missing Hotspot patches In-Reply-To: <03f201d18058$e43e2390$acba6ab0$@apache.org> References: <002d01d17c54$62e8dc20$28ba9460$@apache.org> <56E995E9.3020708@oracle.com> <03f201d18058$e43e2390$acba6ab0$@apache.org> Message-ID: On Thu, Mar 17, 2016 at 10:25 AM, Uwe Schindler wrote: > > My local tests showed that the MethodHandle bug is solved, the other one is hopefully fixed, too. Robert may have a way to quickly reproduce. > JDK-8150280 is fixed too, I just tested it. Thanks! From vladimir.kozlov at oracle.com Thu Mar 17 15:04:28 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 17 Mar 2016 08:04:28 -0700 Subject: [9] RFR(M): 8023191: OSR nmethods should be flushed to free space in CodeCache In-Reply-To: <56EA6AEC.8000905@oracle.com> References: <56E99368.80806@oracle.com> <56E9A809.1030400@oracle.com> <56EA6AEC.8000905@oracle.com> Message-ID: <56EAC77C.70806@oracle.com> Looks good. Thanks, Vladimir On 3/17/16 1:29 AM, Tobias Hartmann wrote: > Hi Vladimir, > > thanks for the review! > > On 16.03.2016 19:38, Vladimir Kozlov wrote: >> Nice graphs! >> >> nmethod.hpp - we try to avoid calling SNRH() in such a case but use fatal() with printing the value (_state in this case) to get more info in the error output. > > Okay, I changed it to _fatal. > >> In invalidate_osr_method() add {} for the != NULL check. > > Done. > >> I am a little worried about the next change: >> >> !
if (is_osr_method() && is_in_use()) { >> invalidate_osr_method(); >> >> Maybe we should move the is_in_use() check inside invalidate_osr_method() and, in the false case, have the debug VM verify that the osr nmethod is not on the list. > > Right, I changed remove_osr_nmethod() to return true if the osr nmethod was found/removed and moved the is_in_use() check into invalidate_osr_method(). Like this, we verify that the nmethod was invalidated if it is !is_in_use(). > > New webrev: > http://cr.openjdk.java.net/~thartmann/8023191/webrev.01/ > > Thanks, > Tobias > >> Otherwise looks good. >> >> Thanks, >> Vladimir >> >> On 3/16/16 10:10 AM, Tobias Hartmann wrote: >>> Hi, >>> >>> please review the following patch. >>> >>> https://bugs.openjdk.java.net/browse/JDK-8023191 >>> http://cr.openjdk.java.net/~thartmann/8023191/webrev.00/ >>> >>> Currently, we only remove unloaded/not-entrant OSR nmethods from the code cache but never flush alive and unused ones. This is a problem if many OSR compilations are triggered but the methods are only used for a short time. If the corresponding classes are not unloaded, these nmethods will never be flushed and occupy space in the code cache. This can lead to a drop in performance from which we may never recover (see discussions [1], [2]). >>> >>> My fix enables flushing of OSR nmethods, treating them the same way as we treat "normal" compilations. I implemented a fast path that allows the sweeper to flush zombie OSR nmethods directly because they are never referenced by an inline cache. >>> >>> I refactored the debug printing in NMethodSweeper::process_nmethod() to get consistent output if OSR nmethods are flushed. During testing, I noticed that we need to clean the CodeCacheSweeperThread::_scanned_nmethod reference to the flushed nmethod in NMethodSweeper::release_nmethod() because otherwise the GC may visit the zombie nmethod at a safepoint that may occur during sweeping. 
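[Editor's note] The refactoring agreed on above can be pictured with a small stand-alone sketch. The Nmethod and OsrHolder types here are invented for illustration and are not the actual HotSpot classes: remove_osr_nmethod() reports whether the nmethod was found on the OSR list, and invalidate_osr_method() performs the is_in_use() check itself, asserting the list invariant in both cases.

```cpp
#include <cassert>
#include <algorithm>
#include <vector>

// Hypothetical, simplified model of the pattern discussed above;
// not the real HotSpot nmethod/InstanceKlass code.
struct Nmethod {
  bool in_use;
};

struct OsrHolder {
  std::vector<Nmethod*> osr_list;

  // Returns true if the nmethod was found on the list and removed.
  bool remove_osr_nmethod(Nmethod* nm) {
    auto it = std::find(osr_list.begin(), osr_list.end(), nm);
    if (it == osr_list.end()) return false;
    osr_list.erase(it);
    return true;
  }
};

// The is_in_use() guard lives inside the invalidation routine, so
// callers no longer need to check it before calling.
inline void invalidate_osr_method(Nmethod* nm, OsrHolder* holder) {
  bool removed = holder->remove_osr_nmethod(nm);
  if (nm->in_use) {
    // First invalidation: the nmethod must still have been on the list.
    assert(removed && "in-use OSR nmethod must still be on the list");
  } else {
    // Already invalidated earlier: it must be off the list by now.
    assert(!removed && "already-invalidated nmethod must be off the list");
  }
  nm->in_use = false;
}
```

With this shape, calling invalidate_osr_method() twice is harmless: the second call finds the nmethod already off the list, and the debug assertion documents that invariant.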
>>> >>> I also fixed the make_unloaded()/make_zombie() code to only invoke nmethod::invalidate_osr_method() once and did some small typo/comment cleanups in related code. >>> >>> I've run jittest with the latest JDK 9 build and -XX:ReservedCodeCacheSize=20m -XX:-ClassUnloading. Graph [3] shows the results: the code cache fills up quickly and performance degrades significantly. We don't recover because OSR nmethods are not flushed and we therefore don't have enough space in the code cache to compile the methods of newly loaded classes. Graph [4] shows that the problem is solved by my fix. The code cache does not completely fill up and performance remains stable at a high level. Also the number of compilations is higher with the fix (86546 vs. 132495). >>> >>> Testing: >>> - JPRT >>> - RBT with hotspot_all and -Xcomp/-Xmixed >>> - Nashorn + Octane with -XX:StartAggressiveSweepingAt=100/50 -XX:NmethodSweepActivity=500/100 and 100 runs each >>> >>> Thanks, >>> Tobias >>> >>> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-February/021732.html >>> [2] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-March/021750.html >>> [3] https://bugs.openjdk.java.net/secure/attachment/57899/base.png >>> [4] https://bugs.openjdk.java.net/secure/attachment/57898/fix.png >>> From tobias.hartmann at oracle.com Thu Mar 17 15:15:58 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 17 Mar 2016 16:15:58 +0100 Subject: [9] RFR(M): 8023191: OSR nmethods should be flushed to free space in CodeCache In-Reply-To: <56EAC77C.70806@oracle.com> References: <56E99368.80806@oracle.com> <56E9A809.1030400@oracle.com> <56EA6AEC.8000905@oracle.com> <56EAC77C.70806@oracle.com> Message-ID: <56EACA2E.9030402@oracle.com> Thanks, Vladimir. Best regards, Tobias On 17.03.2016 16:04, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir > > On 3/17/16 1:29 AM, Tobias Hartmann wrote: >> Hi Vladimir, >> >> thanks for the review! 
>> >> On 16.03.2016 19:38, Vladimir Kozlov wrote: >>> Nice graphs! >>> >>> nmethod.hpp - we try to avoid calling SNRH() in such a case but use fatal() with printing the value (_state in this case) to get more info in the error output. >> >> Okay, I changed it to _fatal. >> >>> In invalidate_osr_method() add {} for the != NULL check. >> >> Done. >> >>> I am a little worried about the next change: >>> >>> ! if (is_osr_method() && is_in_use()) { >>> invalidate_osr_method(); >>> >>> Maybe we should move the is_in_use() check inside invalidate_osr_method() and, in the false case, have the debug VM verify that the osr nmethod is not on the list. >> >> Right, I changed remove_osr_nmethod() to return true if the osr nmethod was found/removed and moved the is_in_use() check into invalidate_osr_method(). Like this, we verify that the nmethod was invalidated if it is !is_in_use(). >> >> New webrev: >> http://cr.openjdk.java.net/~thartmann/8023191/webrev.01/ >> >> Thanks, >> Tobias >> >>> Otherwise looks good. >>> >>> Thanks, >>> Vladimir >>> >>> On 3/16/16 10:10 AM, Tobias Hartmann wrote: >>>> Hi, >>>> >>>> please review the following patch. >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8023191 >>>> http://cr.openjdk.java.net/~thartmann/8023191/webrev.00/ >>>> >>>> Currently, we only remove unloaded/not-entrant OSR nmethods from the code cache but never flush alive and unused ones. This is a problem if many OSR compilations are triggered but the methods are only used for a short time. If the corresponding classes are not unloaded, these nmethods will never be flushed and occupy space in the code cache. This can lead to a drop in performance from which we may never recover (see discussions [1], [2]). >>>> >>>> My fix enables flushing of OSR nmethods, treating them the same way as we treat "normal" compilations. I implemented a fast path that allows the sweeper to flush zombie OSR nmethods directly because they are never referenced by an inline cache. 
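[Editor's note] The "fast path" mentioned above (zombie OSR nmethods can be reclaimed immediately because no inline cache can reference them, while other zombies must survive additional sweeps for inline-cache cleaning) can be sketched with a toy sweeper. The State enum, the Nm struct and the sweep counter are simplifications invented for illustration; this is not the real NMethodSweeper:

```cpp
#include <cassert>
#include <list>

// Toy model of the sweeper fast path described above.
enum class State { in_use, not_entrant, zombie, flushed };

struct Nm {
  State state;
  bool  is_osr;
  int   zombie_sweeps = 0;  // stand-in for inline-cache-cleaning latency
};

// One sweeper pass over the code cache. Zombie OSR nmethods are
// flushed immediately (no inline caches can reference them); other
// zombies wait one extra pass, standing in for IC cleaning.
inline void sweep(std::list<Nm*>& cache) {
  for (auto it = cache.begin(); it != cache.end();) {
    Nm* nm = *it;
    if (nm->state == State::zombie &&
        (nm->is_osr || nm->zombie_sweeps > 0)) {
      nm->state = State::flushed;  // reclaim the code cache space
      it = cache.erase(it);
      continue;
    }
    if (nm->state == State::zombie) nm->zombie_sweeps++;
    ++it;
  }
}
```

A first sweep() pass flushes the zombie OSR nmethod right away; the non-OSR zombie is only flushed on a later pass, which mirrors why flushing OSR nmethods directly frees code cache space sooner.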
>>>> >>>> I refactored the debug printing in NMethodSweeper::process_nmethod() to get consistent output if OSR nmethods are flushed. During testing, I noticed that we need to clean the CodeCacheSweeperThread::_scanned_nmethod reference to the flushed nmethod in NMethodSweeper::release_nmethod() because otherwise the GC may visit the zombie nmethod at a safepoint that may occur during sweeping. >>>> >>>> I also fixed the make_unloaded()/make_zombie() code to only invoke nmethod::invalidate_osr_method() once and did some small typo/comment cleanups in related code. >>>> >>>> I've run jittest with the latest JDK 9 build and -XX:ReservedCodeCacheSize=20m -XX:-ClassUnloading. Graph [3] shows the results: the code cache fills up quickly and performance degrades significantly. We don't recover because OSR nmethods are not flushed and we therefore don't have enough space in the code cache to compile the methods of newly loaded classes. Graph [4] shows that the problem is solved by my fix. The code cache does not completely fill up and performance remains stable at a high level. Also the number of compilations is higher with the fix (86546 vs. 132495). 
>>>> >>>> Testing: >>>> - JPRT >>>> - RBT with hotspot_all and -Xcomp/-Xmixed >>>> - Nashorn + Octane with -XX:StartAggressiveSweepingAt=100/50 -XX:NmethodSweepActivity=500/100 and 100 runs each >>>> >>>> Thanks, >>>> Tobias >>>> >>>> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-February/021732.html >>>> [2] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-March/021750.html >>>> [3] https://bugs.openjdk.java.net/secure/attachment/57899/base.png >>>> [4] https://bugs.openjdk.java.net/secure/attachment/57898/fix.png >>>> From vladimir.kempik at oracle.com Thu Mar 17 16:04:46 2016 From: vladimir.kempik at oracle.com (Vladimir Kempik) Date: Thu, 17 Mar 2016 19:04:46 +0300 Subject: [8u][TESTBUG] RFR 8152098: Fix 8151522 caused test compiler/intrinsics/squaretolen/TestSquareToLen.java to fail Message-ID: <56EAD59E.5090306@oracle.com> Hello Please review this simple change for jdk8u. Fixing TestSquareToLen testcase after 8151522 and same with MontgomeryMultiplyTest. Bug: https://bugs.openjdk.java.net/browse/JDK-8152098 Webrev: http://cr.openjdk.java.net/~vkempik/8152098/webrev.00/ Thanks -Vladimir From zoltan.majo at oracle.com Thu Mar 17 16:09:19 2016 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Thu, 17 Mar 2016 17:09:19 +0100 Subject: [9] RFR (S): 8148754: C2 loop unrolling fails due to unexpected graph shape In-Reply-To: <56E9C7AB.8050504@oracle.com> References: <56C1ED18.6060903@oracle.com> <56C26929.4050706@oracle.com> <56CB1997.40107@oracle.com> <56CE6CAF.9090904@oracle.com> <56E99EF1.2030900@oracle.com> <56E9C7AB.8050504@oracle.com> Message-ID: <56EAD6AF.7020204@oracle.com> Hi Vladimir, thank you for the feedback. On 03/16/2016 09:52 PM, Vladimir Kozlov wrote: >> Can you please let me know which solution you prefer: >> - (1) the prototype with the regression solved or >> - (2) checking the graph shape? > > I agree that we should do (2) now. 
One suggestion I have is to prepare > a separate method to do these checks and use it in other places which > you pointed - superword and do_range_check. OK, I updated the patch for (2) so that the check of the graph's shape is performed in a separate method. Here is the webrev: http://cr.openjdk.java.net/~zmajo/8148754/webrev.04/ I've tested the updated webrev with JPRT and also by executing the failing test. Both pass. I will soon start RBT testing as well (will let you know if failures have appeared). > Yes, you can work later on (1) solution if you have a bug and not RFE > - we should stabilize C2 as you know and these changes may have some side effects we don't know about yet. But I like it because we can > explicitly specify which optimizations are allowed. OK. I filed 8152110: "Stabilize C2 loop optimizations" and will continue work in the scope of that bug: https://bugs.openjdk.java.net/browse/JDK-8152110 > >> The two I'm concerned about are >> - Compile::cleanup_loop_predicates() > > Yes, this one should be marked LOOP_OPTS_LIMITED. > >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/compile.cpp#l1907 >> >> - IdealLoopTree::remove_main_post_loops() > > This one is fine because the loop goes away. I'll take care of these once I continue work on 8152110. Thank you! Best regards, Zoltan > > Thanks, > Vladimir > > On 3/16/16 10:59 AM, Zolt?n Maj? wrote: >> Hi Vladimir, >> >> >> I've spent more time on this issue. Please find my findings below. >> >> On 02/25/2016 03:53 AM, Vladimir Kozlov wrote: >>> So it is again a _major_progress problem. >>> I have to spend more time on this. It is not simple. >>> >>> We may add another state when Ideal transformation could be >>> executed. 
For example, after all loop opts: >>> >>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/0fc557e05fc0/src/share/vm/opto/compile.cpp#l2286 >>> >>> >>> Or more states to specify which Ideal transformations and loop >>> optimizations could be executed in which state. >> >> I think adding more states is necessary, adding a single state is not >> sufficient as... (see below) >> >>> >>> The main problem from your description is elimination of Opaque1 on >>> which loop optimizations relies. >>> >>> We can simply remove Opaque1Node::Identity(PhaseGVN* phase) because >>> PhaseMacroExpand::expand_macro_nodes() will remove them after all >>> loop opts. >> >> ...there are even more places where Opaque1 nodes are removed, than >> we've initially assumed. >> >> The two I'm concerned about are >> - Compile::cleanup_loop_predicates() > > Yes, this one should be marked LOOP_OPTS_LIMITED. > >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/compile.cpp#l1907 >> >> - IdealLoopTree::remove_main_post_loops() > > This one is fine because the loop goes away. > >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/loopTransform.cpp#l2469 >> >> >>> On other hand we may do want to execute some simple loop >>> optimizations even after Opaque, CastII and CastI2L are optimized >>> out. For example, removing empty loops or one iteration loops >>> (pre-loops). >>> But definitely not ones which use cloning or other aggressive >>> optimizations. >> >> Yes, I agree that we want to execute some loop optimizations even >> afterwards. >> >> So I've added three states: LOOP_OPTS_FULL, LOOP_OPTS_LIMITED, and >> LOOP_OPTS_INHIBITED. These states indicate which loop optimizations >> are allowed, Major_progress indicates only if loop optimizations >> have made progress (but not if loop optimizations are expected to be >> performed). >> >>> >>> Inline_Warm() is not used since InlineWarmCalls for very long time. 
>>> The code could be very rotten by now. So removing set_major_progress >>> from it is fine. >> >> OK. >> >>> >>> It is also fine to remove it from inline_incrementally since it will >>> be restored by skip_loop_opts code (and cleared if method is empty >>> or set if there are expensive nodes). >> >> OK. >> >>> >>> LoopNode::Ideal() change seems also fine. LoopNode is created only >>> in loop opts (RootNode has own Ideal()) so if it has TOP input it >>> will be removed by RegionNode::Ideal most likely. >> >> OK. >> >>> >>> Which leaves remove_useless_bool() code only and I have concern >>> about it. It could happened after CCP phase and we may want to >>> execute loop opts after it. I am actually want to set major progress >>> after CCP unconditionally since some If nodes could be folded by it. >> >> Yes, that makes sense and I did it. >> >>> >>> As you can see it is not simple :( >> >> No, it's not simple at all. I did a prototype that implements all we >> discussed above. Here is the code: >> http://cr.openjdk.java.net/~zmajo/code/8148754/webrev/ >> >> The code is not yet RFR quality, but I've sent it out because I'd >> like to have your feedback on how to continue. >> >> The code fixes the current problem with the unexpected graph shape. >> But it is likely to also solve similar problems that are triggered >> also by an unexpected graph shape, for example any of the asserts >> in PhaseIdealLoop::do_range_check: >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/loopTransform.cpp#l2124 >> >> >> I evaluated performance of the prototype. Peformance improves in a >> number of cases by 1-4%: >> Octane-Mandreel >> Octane-Richards >> Octane-Splay >> >> Unfortunately, there is also a performance regression with >> SPECjvm2008-MonteCarlo-G1 (3-5%). Finding the cause of that >> regression is likely to take a at least a week, but most likely even >> more. 
>> >> So my question is: Should I spend more time on this prototype and fix >> the performance regression? >> >> A different solution would be check the complete graph shape. That is >> also done at other places, e.g., in SuperWord::get_pre_loop_end() >> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/superword.cpp#l3076 >> >> >> Here is the webrev for the second solution: >> http://cr.openjdk.java.net/~zmajo/8148754/webrev.03/ >> >> The second solution does not regress. I've tested it with: >> - JPRT; >> - local testing (linux-86_64) with the failing test case; >> - executing all hotspot tests locally, all tests pass that pass with >> an unmodified build. >> >> Can you please let me know which solution you prefer: >> - (1) the prototype with the regression solved or >> - (2) checking the graph shape? >> >> We could also fix this issue with pushing (2) for now (as this issue >> is a "critical" nightly failure). I could then spend more time on (1) >> later in a different bug. >> >> Thank you and best regards, >> >> >> Zoltan >> >>> Thanks, >>> Vladimir >>> >>> On 2/22/16 6:22 AM, Zolt?n Maj? wrote: >>>> Hi Vladimir, >>>> >>>> >>>> thank you for the feedback! >>>> >>>> On 02/16/2016 01:11 AM, Vladimir Kozlov wrote: >>>>> Zoltan, >>>>> >>>>> It should not be "main" loop if peeling happened. See do_peeling(): >>>>> >>>>> if (cl->is_main_loop()) { >>>>> cl->set_normal_loop(); >>>>> >>>>> Split-if optimization should not split through loop's phi. And >>>>> generally not through loop's head since it is not making code >>>>> better - >>>>> split through backedge moves code into loop again. Making loop body >>>>> more complicated as this case shows. >>>> >>>> I did more investigation to understand what causes the invalid graph >>>> shape to appear. It seems that the invalid graph shape appears because >>>> the compiler uses the Compile:: _major_progress inconsistently. >>>> Here are >>>> some details. 
>>>> >>>> - If _major_progress *is set*, the compiler expects more loop >>>> optimizations to happen. Therefore, certain transformations on the >>>> graph >>>> are not allowed so that the graph is in a shape that can be >>>> processed by >>>> loop optimizations. See: >>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/convertnode.cpp#l253 >>>> >>>> >>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/castnode.cpp#l251 >>>> >>>> >>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/loopnode.cpp#l950 >>>> >>>> >>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/opaquenode.cpp#l37 >>>> >>>> >>>> >>>> - If _major_progress *is not set*, the compiler is allowed to perform >>>> all possible transformations (because it does not have to care about >>>> future loop optimizations). >>>> >>>> The crash reported for the current issue appears because >>>> _major_progress >>>> *can be accidentally set again* after the compiler decided to stop >>>> performing loop optimizations. As a result, invalid graph shapes >>>> appear. >>>> >>>> Here are details about how this happens for both failures I've been >>>> studying: >>>> https://bugs.openjdk.java.net/browse/JDK-8148754?focusedCommentId=13901941&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13901941 >>>> >>>> >>>> >>>> I would propose to change the compiler to use _major_progress >>>> consistently. (This goes into the same direction as Tobias's recent >>>> work >>>> on JDK-8144487.) 
>>>> >>>> I propose that _major_progress: >>>> - can be SET when the compiler is initialized (because loop >>>> optimizations are expected to happen afterwards); >>>> - can be SET/RESET in the scope of loop optimizations (because we want >>>> to see if loop optimizations made progress); >>>> - cannot be SET/RESET by neither incremental inlining nor IGVN >>>> (even if >>>> the IGVN is performed in the scope of loop optimizations). >>>> >>>> Here is the updated webrev: >>>> http://cr.openjdk.java.net/~zmajo/8148754/webrev.02/ >>>> >>>> Performance evaluation: >>>> - The proposed webrev does not cause performance regressions for >>>> SPECjvm2008, SPECjbb2005, and Octane. >>>> >>>> Testing: >>>> - all hotspot JTREG tests on all supported platforms; >>>> - JPRT; >>>> - failing test case. >>>> >>>> Thank you and best regards, >>>> >>>> >>>> Zoltan >>>> >>>> >>>>> >>>>> Bailout unrolling is fine but performance may suffer because in some >>>>> cases loop unrolling is better then split-if. >>>> >>>> >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 2/15/16 7:22 AM, Zolt?n Maj? wrote: >>>>>> Hi, >>>>>> >>>>>> >>>>>> please review the patch for 8148754. >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8148754 >>>>>> >>>>>> Problem: Compilation fails when the C2 compiler attempts loop >>>>>> unrolling. >>>>>> The cause of the failure is that the loop unrolling optimization >>>>>> expects >>>>>> a well-defined graph shape at the entry control of a >>>>>> 'CountedLoopNode' >>>>>> ('IfTrue'/'IfFalse' preceeded by 'If' preceeded by 'Bool' >>>>>> preceeded by >>>>>> 'CmpI'). >>>>>> >>>>>> >>>>>> Solution: I investigated several different instances of the same >>>>>> failure. It turns out that the shape of the graph at a loop's entry >>>>>> control is often different from the way loop unrolling expects it >>>>>> to be >>>>>> (please find some examples in the bug's JBS issue). 
The various >>>>>> graph >>>>>> shapes are a result of previously performed transformations, e.g., >>>>>> split-if optimization and loop peeling. >>>>>> >>>>>> Loop unrolling requires the above mentioned graph shape so that >>>>>> it can >>>>>> adjust the zero-trip guard of the loop. With the unexpected graph >>>>>> shapes, it is not possible to perform loop unrolling. However, >>>>>> the graph >>>>>> is still in a valid state (except for loop unrolling) and can be >>>>>> used to >>>>>> produce correct code. >>>>>> >>>>>> I propose that (1) we check if an unexpected graph shape is >>>>>> encountered >>>>>> and (2) bail out of loop unrolling if it is (but not fail in the >>>>>> compiler in such cases). >>>>>> >>>>>> The failure was triggered by Aleksey's Indify String Concatenation >>>>>> changes but the generated bytecodes are valid. So this seems to be a >>>>>> compiler issue that was previously there but was not yet triggered. >>>>>> >>>>>> >>>>>> Webrev: >>>>>> http://cr.openjdk.java.net/~zmajo/8148754/webrev.00/ >>>>>> >>>>>> Testing: >>>>>> - JPRT; >>>>>> - local testing (linux-86_64) with the failing test case; >>>>>> - executed all hotspot tests locally, all tests pass that pass >>>>>> with an >>>>>> unmodified build. >>>>>> >>>>>> Thank you! >>>>>> >>>>>> Best regards, >>>>>> >>>>>> >>>>>> Zoltan >>>>>> >>>> >> From vladimir.kozlov at oracle.com Thu Mar 17 16:39:34 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 17 Mar 2016 09:39:34 -0700 Subject: [8u][TESTBUG] RFR 8152098: Fix 8151522 caused test compiler/intrinsics/squaretolen/TestSquareToLen.java to fail In-Reply-To: <56EAD59E.5090306@oracle.com> References: <56EAD59E.5090306@oracle.com> Message-ID: <56EADDC6.2060205@oracle.com> Looks good. Thanks, Vladimir On 3/17/16 9:04 AM, Vladimir Kempik wrote: > Hello > > Please review this simple change for jdk8u. > > Fixing TestSquareToLen testcase after 8151522 and same with MontgomeryMultiplyTest. 
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8152098 > Webrev: http://cr.openjdk.java.net/~vkempik/8152098/webrev.00/ > > Thanks > -Vladimir > From vladimir.kozlov at oracle.com Thu Mar 17 17:12:45 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 17 Mar 2016 10:12:45 -0700 Subject: [9] RFR (S): 8148754: C2 loop unrolling fails due to unexpected graph shape In-Reply-To: <56EAD6AF.7020204@oracle.com> References: <56C1ED18.6060903@oracle.com> <56C26929.4050706@oracle.com> <56CB1997.40107@oracle.com> <56CE6CAF.9090904@oracle.com> <56E99EF1.2030900@oracle.com> <56E9C7AB.8050504@oracle.com> <56EAD6AF.7020204@oracle.com> Message-ID: <56EAE58D.10605@oracle.com> On 3/17/16 9:09 AM, Zolt?n Maj? wrote: > Hi Vladimir, > > > thank you for the feedback. > > On 03/16/2016 09:52 PM, Vladimir Kozlov wrote: >>> Can you please let me know which solution you prefer: >>> - (1) the prototype with the regression solved or >>> - (2) checking the graph shape? >> >> I agree that we should do (2) now. One suggestion I have is to prepare a separate method to do these checks and use it in other places which you pointed - superword and do_range_check. > > OK, I updated the patch for (2) so that the check of the graph's shape is performed in a separate method. Here is the webrev: > http://cr.openjdk.java.net/~zmajo/8148754/webrev.04/ Zoltan, I don't see changes in do_range_check(). Opcode() is virtual function. Use is_*() query methods was originally in SuperWord::get_pre_loop_end(). I don't like is_adjustable_loop_entry() name, especially since you negate it in checks. Consider is_canonical_main_loop_entry(). Also move assert(cl->is_main_loop(), "") into it. Thanks, Vladimir > > I've tested the updated webrev with JPRT and also by executing the failing test. Both pass. I will soon start RBT testing as well (will let you know if failures have appeared). 
> >> Yes, you can work later on (1) solution if you have a bug and not RFE - we should stabilize C2 as you know and these changes may have some side effects we don't know about yet. But I like it because >> we can explicitly specify which optimizations are allowed. > > OK. I filed 8152110: "Stabilize C2 loop optimizations" and will continue work in the scope of that bug: > https://bugs.openjdk.java.net/browse/JDK-8152110 >> >>> The two I'm concerned about are >>> - Compile::cleanup_loop_predicates() >> >> Yes, this one should be marked LOOP_OPTS_LIMITED. >> >>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/compile.cpp#l1907 >>> - IdealLoopTree::remove_main_post_loops() >> >> This one is fine because the loop goes away. > > I'll take of these once I continue work on 8152110. > > Thank you! > > Best regards, > > > Zoltan > >> >> Thanks, >> Vladimir >> >> On 3/16/16 10:59 AM, Zolt?n Maj? wrote: >>> Hi Vladimir, >>> >>> >>> I've spent more time on this issue. Please find my findings below. >>> >>> On 02/25/2016 03:53 AM, Vladimir Kozlov wrote: >>>> So it is again _major_progress problem. >>>> I have to spend more time on this. It is not simple. >>>> >>>> We may add an other state when Ideal transformation could be executed. For example, after all loop opts: >>>> >>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/0fc557e05fc0/src/share/vm/opto/compile.cpp#l2286 >>>> >>>> Or more states to specify which Ideal transformations and loop optimizations could be executed in which state. >>> >>> I think adding more states is necessary, adding a single state is not sufficient as... (see below) >>> >>>> >>>> The main problem from your description is elimination of Opaque1 on which loop optimizations relies. >>>> >>>> We can simply remove Opaque1Node::Identity(PhaseGVN* phase) because PhaseMacroExpand::expand_macro_nodes() will remove them after all loop opts. 
>>> >>> ...there are even more places where Opaque1 nodes are removed, than we've initially assumed. >>> >>> The two I'm concerned about are >>> - Compile::cleanup_loop_predicates() >> >> Yes, this one should be marked LOOP_OPTS_LIMITED. >> >>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/compile.cpp#l1907 >>> - IdealLoopTree::remove_main_post_loops() >> >> This one is fine because the loop goes away. >> >>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/loopTransform.cpp#l2469 >>> >>>> On other hand we may do want to execute some simple loop optimizations even after Opaque, CastII and CastI2L are optimized out. For example, removing empty loops or one iteration loops (pre-loops). >>>> But definitely not ones which use cloning or other aggressive optimizations. >>> >>> Yes, I agree that we want to execute some loop optimizations even afterwards. >>> >>> So I've added three states: LOOP_OPTS_FULL, LOOP_OPTS_LIMITED, and LOOP_OPTS_INHIBITED. These states indicate which loop optimizations are allowed, Major_progress indicates only if loop optimizations >>> have made progress (but not if loop optimizations are expected to be performed). >>> >>>> >>>> Inline_Warm() is not used since InlineWarmCalls for very long time. The code could be very rotten by now. So removing set_major_progress from it is fine. >>> >>> OK. >>> >>>> >>>> It is also fine to remove it from inline_incrementally since it will be restored by skip_loop_opts code (and cleared if method is empty or set if there are expensive nodes). >>> >>> OK. >>> >>>> >>>> LoopNode::Ideal() change seems also fine. LoopNode is created only in loop opts (RootNode has own Ideal()) so if it has TOP input it will be removed by RegionNode::Ideal most likely. >>> >>> OK. >>> >>>> >>>> Which leaves remove_useless_bool() code only and I have concern about it. It could happened after CCP phase and we may want to execute loop opts after it. 
I am actually want to set major progress >>>> after CCP unconditionally since some If nodes could be folded by it. >>> >>> Yes, that makes sense and I did it. >>> >>>> >>>> As you can see it is not simple :( >>> >>> No, it's not simple at all. I did a prototype that implements all we discussed above. Here is the code: >>> http://cr.openjdk.java.net/~zmajo/code/8148754/webrev/ >>> >>> The code is not yet RFR quality, but I've sent it out because I'd like to have your feedback on how to continue. >>> >>> The code fixes the current problem with the unexpected graph shape. But it is likely to also solve similar problems that are triggered also by an unexpected graph shape, for example any of the asserts >>> in PhaseIdealLoop::do_range_check: >>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/loopTransform.cpp#l2124 >>> >>> I evaluated performance of the prototype. Peformance improves in a number of cases by 1-4%: >>> Octane-Mandreel >>> Octane-Richards >>> Octane-Splay >>> >>> Unfortunately, there is also a performance regression with SPECjvm2008-MonteCarlo-G1 (3-5%). Finding the cause of that regression is likely to take a at least a week, but most likely even more. >>> >>> So my question is: Should I spend more time on this prototype and fix the performance regression? >>> >>> A different solution would be check the complete graph shape. That is also done at other places, e.g., in SuperWord::get_pre_loop_end() >>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/superword.cpp#l3076 >>> >>> Here is the webrev for the second solution: >>> http://cr.openjdk.java.net/~zmajo/8148754/webrev.03/ >>> >>> The second solution does not regress. I've tested it with: >>> - JPRT; >>> - local testing (linux-86_64) with the failing test case; >>> - executing all hotspot tests locally, all tests pass that pass with an unmodified build. 
>>> >>> Can you please let me know which solution you prefer: >>> - (1) the prototype with the regression solved or >>> - (2) checking the graph shape? >>> >>> We could also fix this issue with pushing (2) for now (as this issue is a "critical" nightly failure). I could then spend more time on (1) later in a different bug. >>> >>> Thank you and best regards, >>> >>> >>> Zoltan >>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 2/22/16 6:22 AM, Zolt?n Maj? wrote: >>>>> Hi Vladimir, >>>>> >>>>> >>>>> thank you for the feedback! >>>>> >>>>> On 02/16/2016 01:11 AM, Vladimir Kozlov wrote: >>>>>> Zoltan, >>>>>> >>>>>> It should not be "main" loop if peeling happened. See do_peeling(): >>>>>> >>>>>> if (cl->is_main_loop()) { >>>>>> cl->set_normal_loop(); >>>>>> >>>>>> Split-if optimization should not split through loop's phi. And >>>>>> generally not through loop's head since it is not making code better - >>>>>> split through backedge moves code into loop again. Making loop body >>>>>> more complicated as this case shows. >>>>> >>>>> I did more investigation to understand what causes the invalid graph >>>>> shape to appear. It seems that the invalid graph shape appears because >>>>> the compiler uses the Compile:: _major_progress inconsistently. Here are >>>>> some details. >>>>> >>>>> - If _major_progress *is set*, the compiler expects more loop >>>>> optimizations to happen. Therefore, certain transformations on the graph >>>>> are not allowed so that the graph is in a shape that can be processed by >>>>> loop optimizations. 
See: >>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/convertnode.cpp#l253 >>>>> >>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/castnode.cpp#l251 >>>>> >>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/loopnode.cpp#l950 >>>>> >>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/opaquenode.cpp#l37 >>>>> >>>>> >>>>> - If _major_progress *is not set*, the compiler is allowed to perform >>>>> all possible transformations (because it does not have to care about >>>>> future loop optimizations). >>>>> >>>>> The crash reported for the current issue appears because _major_progress >>>>> *can be accidentally set again* after the compiler decided to stop >>>>> performing loop optimizations. As a result, invalid graph shapes appear. >>>>> >>>>> Here are details about how this happens for both failures I've been >>>>> studying: >>>>> https://bugs.openjdk.java.net/browse/JDK-8148754?focusedCommentId=13901941&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13901941 >>>>> >>>>> >>>>> I would propose to change the compiler to use _major_progress >>>>> consistently. (This goes in the same direction as Tobias's recent work >>>>> on JDK-8144487.) >>>>> >>>>> I propose that _major_progress: >>>>> - can be SET when the compiler is initialized (because loop >>>>> optimizations are expected to happen afterwards); >>>>> - can be SET/RESET in the scope of loop optimizations (because we want >>>>> to see if loop optimizations made progress); >>>>> - cannot be SET/RESET by either incremental inlining or IGVN (even if >>>>> the IGVN is performed in the scope of loop optimizations). 
>>>>> >>>>> Here is the updated webrev: >>>>> http://cr.openjdk.java.net/~zmajo/8148754/webrev.02/ >>>>> >>>>> Performance evaluation: >>>>> - The proposed webrev does not cause performance regressions for >>>>> SPECjvm2008, SPECjbb2005, and Octane. >>>>> >>>>> Testing: >>>>> - all hotspot JTREG tests on all supported platforms; >>>>> - JPRT; >>>>> - failing test case. >>>>> >>>>> Thank you and best regards, >>>>> >>>>> >>>>> Zoltan >>>>> >>>>> >>>>>> >>>>>> Bailing out of unrolling is fine, but performance may suffer because in some >>>>>> cases loop unrolling is better than split-if. >>>>> >>>>> >>>>>> >>>>>> Thanks, >>>>>> Vladimir >>>>>> >>>>>> On 2/15/16 7:22 AM, Zoltán Majó wrote: >>>>>>> Hi, >>>>>>> >>>>>>> >>>>>>> please review the patch for 8148754. >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8148754 >>>>>>> >>>>>>> Problem: Compilation fails when the C2 compiler attempts loop unrolling. >>>>>>> The cause of the failure is that the loop unrolling optimization expects >>>>>>> a well-defined graph shape at the entry control of a 'CountedLoopNode' >>>>>>> ('IfTrue'/'IfFalse' preceded by 'If' preceded by 'Bool' preceded by >>>>>>> 'CmpI'). >>>>>>> >>>>>>> >>>>>>> Solution: I investigated several different instances of the same >>>>>>> failure. It turns out that the shape of the graph at a loop's entry >>>>>>> control is often different from the way loop unrolling expects it to be >>>>>>> (please find some examples in the bug's JBS issue). The various graph >>>>>>> shapes are a result of previously performed transformations, e.g., >>>>>>> split-if optimization and loop peeling. >>>>>>> >>>>>>> Loop unrolling requires the above-mentioned graph shape so that it can >>>>>>> adjust the zero-trip guard of the loop. With the unexpected graph >>>>>>> shapes, it is not possible to perform loop unrolling. However, the graph >>>>>>> is still in a valid state (except for loop unrolling) and can be used to >>>>>>> produce correct code. 
>>>>>>> >>>>>>> I propose that (1) we check if an unexpected graph shape is encountered >>>>>>> and (2) bail out of loop unrolling if it is (but do not fail in the >>>>>>> compiler in such cases). >>>>>>> >>>>>>> The failure was triggered by Aleksey's Indify String Concatenation >>>>>>> changes, but the generated bytecodes are valid. So this seems to be a >>>>>>> compiler issue that was previously there but was not yet triggered. >>>>>>> >>>>>>> >>>>>>> Webrev: >>>>>>> http://cr.openjdk.java.net/~zmajo/8148754/webrev.00/ >>>>>>> >>>>>>> Testing: >>>>>>> - JPRT; >>>>>>> - local testing (linux-86_64) with the failing test case; >>>>>>> - executed all hotspot tests locally; all tests pass that pass with an >>>>>>> unmodified build. >>>>>>> >>>>>>> Thank you! >>>>>>> >>>>>>> Best regards, >>>>>>> >>>>>>> >>>>>>> Zoltan >>>>>>> >>>>> >>> > From christian.thalinger at oracle.com Thu Mar 17 17:30:20 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 17 Mar 2016 07:30:20 -1000 Subject: RFR: 8151723: [JVMCI] JVMCIRuntime::treat_as_trivial: Don't limit trivial prefixes to boot class path Message-ID: <30D9B3B0-15B6-4238-A9CA-DCCEB7A83663@oracle.com> https://bugs.openjdk.java.net/browse/JDK-8151723 There's no guarantee that a JVMCI compiler will be loaded from the boot class path. The mechanism provided for a JVMCI compiler to communicate to the compiler broker that it doesn't want to compile itself should take this into account. 
diff -r 3256d4204291 src/share/vm/jvmci/jvmciRuntime.cpp --- a/src/share/vm/jvmci/jvmciRuntime.cpp Wed Mar 16 10:45:43 2016 +0100 +++ b/src/share/vm/jvmci/jvmciRuntime.cpp Thu Mar 17 07:29:04 2016 -1000 @@ -800,12 +800,9 @@ void JVMCIRuntime::shutdown(TRAPS) { bool JVMCIRuntime::treat_as_trivial(Method* method) { if (_HotSpotJVMCIRuntime_initialized) { - oop loader = method->method_holder()->class_loader(); - if (loader == NULL) { - for (int i = 0; i < _trivial_prefixes_count; i++) { - if (method->method_holder()->name()->starts_with(_trivial_prefixes[i])) { - return true; - } + for (int i = 0; i < _trivial_prefixes_count; i++) { + if (method->method_holder()->name()->starts_with(_trivial_prefixes[i])) { + return true; } } } -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.thalinger at oracle.com Thu Mar 17 17:36:24 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 17 Mar 2016 07:36:24 -1000 Subject: RFR: 8151829: [JVMCI] incorrect documentation about jvmci.compiler property Message-ID: https://bugs.openjdk.java.net/browse/JDK-8151829 As part of JDK-8151470, the sentence about the jvmci.compiler property in JVMCICompilerFactory should have been removed since it is a) wrong and b) an implementation detail pertaining to the HotSpot implementation of this interface. diff -r 3256d4204291 src/jdk.vm.ci/share/classes/jdk.vm.ci.runtime/src/jdk/vm/ci/runtime/JVMCICompilerFactory.java --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.runtime/src/jdk/vm/ci/runtime/JVMCICompilerFactory.java Wed Mar 16 10:45:43 2016 +0100 +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.runtime/src/jdk/vm/ci/runtime/JVMCICompilerFactory.java Thu Mar 17 07:35:18 2016 -1000 @@ -28,8 +28,7 @@ package jdk.vm.ci.runtime; public interface JVMCICompilerFactory { /** - * Get the name of this compiler. The compiler will be selected when the jvmci.compiler system - * property is equal to this name. + * Get the name of this compiler. 
*/ String getCompilerName(); -------------- next part -------------- An HTML attachment was scrubbed... URL: From nils.eliasson at oracle.com Thu Mar 17 19:38:23 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Thu, 17 Mar 2016 20:38:23 +0100 Subject: RFR(XS): 8150054: Code missing from JDK-8150054 causing many test failures Message-ID: <56EB07AF.7060602@oracle.com> Hi, Please review this fix of the JDK-8150054 checkin that was missing one change causing all affected compilercontrol tests to fail. Testing: rm -rf JT*; java -jar jtreg.jar -noignore hotspot/test/compiler/compilercontrol Bug: https://bugs.openjdk.java.net/browse/JDK-8152090 Webrev: http://cr.openjdk.java.net/~neliasso/8152090/webrev.01/ Regards, Nils Eliasson -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Thu Mar 17 19:43:41 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 17 Mar 2016 12:43:41 -0700 Subject: RFR(XS): 8150054: Code missing from JDK-8150054 causing many test failures In-Reply-To: <56EB07AF.7060602@oracle.com> References: <56EB07AF.7060602@oracle.com> Message-ID: <56EB08ED.1060800@oracle.com> It seems fine but it is impossible for me to understand what is going on in these tests code. Thanks, Vladimir On 3/17/16 12:38 PM, Nils Eliasson wrote: > Hi, > > Please review this fix of the JDK-8150054 checkin that was missing one change causing all affected compilercontrol tests to fail. 
> > Testing: > rm -rf JT*; java -jar jtreg.jar -noignore hotspot/test/compiler/compilercontrol > > Bug: https://bugs.openjdk.java.net/browse/JDK-8152090 > Webrev: http://cr.openjdk.java.net/~neliasso/8152090/webrev.01/ > > Regards, > Nils Eliasson > From christian.thalinger at oracle.com Thu Mar 17 20:00:24 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 17 Mar 2016 10:00:24 -1000 Subject: RFR(XS): 8150054: Code missing from JDK-8150054 causing many test failures In-Reply-To: <56EB07AF.7060602@oracle.com> References: <56EB07AF.7060602@oracle.com> Message-ID: <2B27642A-0528-4643-9A24-4B6670ECDB73@oracle.com> > On Mar 17, 2016, at 9:38 AM, Nils Eliasson wrote: > > Hi, > > Please review this fix of the JDK-8150054 checkin that was missing one change causing all affected compilercontrol tests to fail. compilercontrol tests are not run in JPRT? > > Testing: > rm -rf JT*; java -jar jtreg.jar -noignore hotspot/test/compiler/compilercontrol > > Bug: https://bugs.openjdk.java.net/browse/JDK-8152090 > Webrev: http://cr.openjdk.java.net/~neliasso/8152090/webrev.01/ > > Regards, > Nils Eliasson > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Thu Mar 17 20:12:00 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 17 Mar 2016 13:12:00 -0700 Subject: [9] RFR(S): 8144693: Intrinsify StringCoding.hasNegatives() on SPARC In-Reply-To: References: <56D77856.1020707@oracle.com> <56D7BFA6.1000401@oracle.com> <56DDBACC.2020405@oracle.com> <56E83012.40802@oracle.com> Message-ID: <56EB0F90.4010107@oracle.com> What do you think, Guy? Will you prepare another webrev, or can we proceed with your latest webrev? Thanks, Vladimir On 3/15/16 6:00 PM, John Rose wrote: > On Mar 15, 2016, at 8:53 AM, Vladimir Kozlov > wrote: >> >> Somehow I missed this mail. >> >> Changes look good now. > > Small nit: The first sra instruction doesn't add any value. 
The andcc(inp, 0x7, i) zeroes all high bits above 0x7, and the neg(i, t3) fills them with ones. Thus sra(t3, 0, t4) is a simple copy from > t3 to t4. > > (I suspect there may be other looseness in the fragment processing parts, but I haven't looked at it closely.) > > Overall, nicely done. > > - John From vladimir.kozlov at oracle.com Thu Mar 17 20:14:02 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 17 Mar 2016 13:14:02 -0700 Subject: RFR: 8151829: [JVMCI] incorrect documentation about jvmci.compiler property In-Reply-To: References: Message-ID: <56EB100A.5000000@oracle.com> Good. Thanks, Vladimir On 3/17/16 10:36 AM, Christian Thalinger wrote: > https://bugs.openjdk.java.net/browse/JDK-8151829 > > As part of JDK-8151470, the sentence about the jvmci.compiler property in JVMCICompilerFactory should have been removed since it is a) wrong and b) an implementation detail pertaining to the HotSpot > implementation of this interface. > > diff -r 3256d4204291 src/jdk.vm.ci/share/classes/jdk.vm.ci.runtime/src/jdk/vm/ci/runtime/JVMCICompilerFactory.java > --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.runtime/src/jdk/vm/ci/runtime/JVMCICompilerFactory.java Wed Mar 16 10:45:43 2016 +0100 > +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.runtime/src/jdk/vm/ci/runtime/JVMCICompilerFactory.java Thu Mar 17 07:35:18 2016 -1000 > @@ -28,8 +28,7 @@ package jdk.vm.ci.runtime; > public interface JVMCICompilerFactory { > > > /** > - * Get the name of this compiler. The compiler will be selected when the jvmci.compiler system > - * property is equal to this name. > + * Get the name of this compiler. 
> */ > String getCompilerName(); > > > From nils.eliasson at oracle.com Thu Mar 17 20:47:53 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Thu, 17 Mar 2016 21:47:53 +0100 Subject: RFR(XS): 8150054: Code missing from JDK-8150054 causing many test failures In-Reply-To: <2B27642A-0528-4643-9A24-4B6670ECDB73@oracle.com> References: <56EB07AF.7060602@oracle.com> <2B27642A-0528-4643-9A24-4B6670ECDB73@oracle.com> Message-ID: <56EB17F9.1070506@oracle.com> Only one - servicability/dcmd/compiler/CompilerDirectivesDCMDTest.java. //Nils On 2016-03-17 21:00, Christian Thalinger wrote: > >> On Mar 17, 2016, at 9:38 AM, Nils Eliasson > > wrote: >> >> Hi, >> >> Please review this fix of the JDK-8150054 checkin that was missing >> one change causing all affected compilercontrol tests to fail. > > compilercontrol tests are not run in JPRT? > >> >> Testing: >> rm -rf JT*; java -jar jtreg.jar -noignore >> hotspot/test/compiler/compilercontrol >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8152090 >> Webrev: http://cr.openjdk.java.net/~neliasso/8152090/webrev.01/ >> >> Regards, >> Nils Eliasson >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.thalinger at oracle.com Thu Mar 17 20:54:49 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 17 Mar 2016 10:54:49 -1000 Subject: RFR: 8151829: [JVMCI] incorrect documentation about jvmci.compiler property In-Reply-To: <56EB100A.5000000@oracle.com> References: <56EB100A.5000000@oracle.com> Message-ID: <4E1B32AC-E419-491C-AD15-EEAADF61C40D@oracle.com> Thanks! > On Mar 17, 2016, at 10:14 AM, Vladimir Kozlov wrote: > > Good. 
> > Thanks, > Vladimir > > On 3/17/16 10:36 AM, Christian Thalinger wrote: >> https://bugs.openjdk.java.net/browse/JDK-8151829 >> >> As part of JDK-8151470, the sentence about the jvmci.compiler property in JVMCICompilerFactory should have been removed since it is a) wrong and b) an implementation detail pertaining to the HotSpot >> implementation of this interface. >> >> diff -r 3256d4204291 src/jdk.vm.ci/share/classes/jdk.vm.ci.runtime/src/jdk/vm/ci/runtime/JVMCICompilerFactory.java >> --- a/src/jdk.vm.ci/share/classes/jdk.vm.ci.runtime/src/jdk/vm/ci/runtime/JVMCICompilerFactory.java Wed Mar 16 10:45:43 2016 +0100 >> +++ b/src/jdk.vm.ci/share/classes/jdk.vm.ci.runtime/src/jdk/vm/ci/runtime/JVMCICompilerFactory.java Thu Mar 17 07:35:18 2016 -1000 >> @@ -28,8 +28,7 @@ package jdk.vm.ci.runtime; >> public interface JVMCICompilerFactory { >> >> >> /** >> - * Get the name of this compiler. The compiler will be selected when the jvmci.compiler system >> - * property is equal to this name. >> + * Get the name of this compiler. >> */ >> String getCompilerName(); >> >> >> From christian.thalinger at oracle.com Thu Mar 17 20:58:05 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 17 Mar 2016 10:58:05 -1000 Subject: RFR: 8152134: [JVMCI] printing compile queues always prints C2 regardless of UseJVMCICompiler Message-ID: https://bugs.openjdk.java.net/browse/JDK-8152134 It should print "JVMCI compile queue". diff -r 7c31312c5725 src/share/vm/compiler/compileBroker.cpp --- a/src/share/vm/compiler/compileBroker.cpp Thu Mar 17 17:03:20 2016 +0000 +++ b/src/share/vm/compiler/compileBroker.cpp Thu Mar 17 10:54:08 2016 -1000 @@ -773,7 +773,13 @@ void CompileBroker::init_compiler_sweepe #endif // !ZERO && !SHARK // Initialize the compilation queue if (c2_compiler_count > 0) { - _c2_compile_queue = new CompileQueue("C2 compile queue"); + const char* name; +#if INCLUDE_JVMCI + name = UseJVMCICompiler ? 
"JVMCI compile queue" : "C2 compile queue"; +#else + name = "C2 compile queue"; +#endif + _c2_compile_queue = new CompileQueue(name); _compilers[1]->set_num_compiler_threads(c2_compiler_count); } if (c1_compiler_count > 0) { or: diff -r f2f1b80b0b03 src/share/vm/compiler/compileBroker.cpp --- a/src/share/vm/compiler/compileBroker.cpp Thu Mar 17 10:55:15 2016 -1000 +++ b/src/share/vm/compiler/compileBroker.cpp Thu Mar 17 10:57:12 2016 -1000 @@ -773,7 +773,8 @@ void CompileBroker::init_compiler_sweepe #endif // !ZERO && !SHARK // Initialize the compilation queue if (c2_compiler_count > 0) { - _c2_compile_queue = new CompileQueue("C2 compile queue"); + const char* name = JVMCI_ONLY(UseJVMCICompiler ? "JVMCI compile queue" :) "C2 compile queue"; + _c2_compile_queue = new CompileQueue(name); _compilers[1]->set_num_compiler_threads(c2_compiler_count); } if (c1_compiler_count > 0) { -------------- next part -------------- An HTML attachment was scrubbed... URL: From nils.eliasson at oracle.com Thu Mar 17 21:00:46 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Thu, 17 Mar 2016 22:00:46 +0100 Subject: RFR(XS): 8150054: Code missing from JDK-8150054 causing many test failures In-Reply-To: <56EB08ED.1060800@oracle.com> References: <56EB07AF.7060602@oracle.com> <56EB08ED.1060800@oracle.com> Message-ID: <56EB1AFE.30309@oracle.com> I can give you an overview and Pavel can give you the gory details. What would have caught this was having at least some of these tests in JPRT, or me removing the JTwork directory between the test runs. //Nils On 2016-03-17 20:43, Vladimir Kozlov wrote: > It seems fine but it is impossible for me to understand what is going > on in these tests code. > > Thanks, > Vladimir > > On 3/17/16 12:38 PM, Nils Eliasson wrote: >> Hi, >> >> Please review this fix of the JDK-8150054 checkin that was missing >> one change causing all affected compilercontrol tests to fail. 
>> >> Testing: >> rm -rf JT*; java -jar jtreg.jar -noignore >> hotspot/test/compiler/compilercontrol >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8152090 >> Webrev: http://cr.openjdk.java.net/~neliasso/8152090/webrev.01/ >> >> Regards, >> Nils Eliasson >> From vladimir.kozlov at oracle.com Thu Mar 17 22:13:08 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 17 Mar 2016 15:13:08 -0700 Subject: RFR(XS): 8150054: Code missing from JDK-8150054 causing many test failures In-Reply-To: <56EB17F9.1070506@oracle.com> References: <56EB07AF.7060602@oracle.com> <2B27642A-0528-4643-9A24-4B6670ECDB73@oracle.com> <56EB17F9.1070506@oracle.com> Message-ID: <56EB2BF4.5050803@oracle.com> Can you add more compilercontrol tests to JPRT? The number of bugs filed for compilercontrol is *ridiculous* and most of them are related to tests. You can create a separate group for them if they need > 10 min. Thanks, Vladimir On 3/17/16 1:47 PM, Nils Eliasson wrote: > Only one - servicability/dcmd/compiler/CompilerDirectivesDCMDTest.java. > > //Nils > > On 2016-03-17 21:00, Christian Thalinger wrote: >> >>> On Mar 17, 2016, at 9:38 AM, Nils Eliasson <nils.eliasson at oracle.com> wrote: >>> >>> Hi, >>> >>> Please review this fix of the JDK-8150054 checkin that was missing one change causing all affected compilercontrol tests to fail. >> >> compilercontrol tests are not run in JPRT? 
>> >>> Testing: >>> rm -rf JT*; java -jar jtreg.jar -noignore hotspot/test/compiler/compilercontrol >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8152090 >>> Webrev: http://cr.openjdk.java.net/~neliasso/8152090/webrev.01/ >>> >>> Regards, >>> Nils Eliasson >>> >> > From doug.simon at oracle.com Thu Mar 17 22:36:25 2016 From: doug.simon at oracle.com (Doug Simon) Date: Thu, 17 Mar 2016 23:36:25 +0100 Subject: RFR: 8152134: [JVMCI] printing compile queues always prints C2 regardless of UseJVMCICompiler In-Reply-To: References: Message-ID: <216E4744-020A-465B-8391-5BC7B28598FC@oracle.com> I like the second form - it's not only shorter but also includes a feel-good emoji ;-) > On 17 Mar 2016, at 21:58, Christian Thalinger wrote: > > https://bugs.openjdk.java.net/browse/JDK-8152134 > > It should print "JVMCI compile queue". > > diff -r 7c31312c5725 src/share/vm/compiler/compileBroker.cpp > --- a/src/share/vm/compiler/compileBroker.cpp Thu Mar 17 17:03:20 2016 +0000 > +++ b/src/share/vm/compiler/compileBroker.cpp Thu Mar 17 10:54:08 2016 -1000 > @@ -773,7 +773,13 @@ void CompileBroker::init_compiler_sweepe > #endif // !ZERO && !SHARK > // Initialize the compilation queue > if (c2_compiler_count > 0) { > - _c2_compile_queue = new CompileQueue("C2 compile queue"); > + const char* name; > +#if INCLUDE_JVMCI > + name = UseJVMCICompiler ? 
"JVMCI compile queue" : "C2 compile queue"; > +#else > + name = "C2 compile queue"; > +#endif > + _c2_compile_queue = new CompileQueue(name); > _compilers[1]->set_num_compiler_threads(c2_compiler_count); > } > if (c1_compiler_count > 0) { > > or: > > diff -r f2f1b80b0b03 src/share/vm/compiler/compileBroker.cpp > --- a/src/share/vm/compiler/compileBroker.cpp Thu Mar 17 10:55:15 2016 -1000 > +++ b/src/share/vm/compiler/compileBroker.cpp Thu Mar 17 10:57:12 2016 -1000 > @@ -773,7 +773,8 @@ void CompileBroker::init_compiler_sweepe > #endif // !ZERO && !SHARK > // Initialize the compilation queue > if (c2_compiler_count > 0) { > - _c2_compile_queue = new CompileQueue("C2 compile queue"); > + const char* name = JVMCI_ONLY(UseJVMCICompiler ? "JVMCI compile queue" :) "C2 compile queue"; > + _c2_compile_queue = new CompileQueue(name); > _compilers[1]->set_num_compiler_threads(c2_compiler_count); > } > if (c1_compiler_count > 0) { > From doug.simon at oracle.com Thu Mar 17 23:01:22 2016 From: doug.simon at oracle.com (Doug Simon) Date: Fri, 18 Mar 2016 00:01:22 +0100 Subject: RFR: 8151723: [JVMCI] JVMCIRuntime::treat_as_trivial: Don't limit trivial prefixes to boot class path In-Reply-To: <30D9B3B0-15B6-4238-A9CA-DCCEB7A83663@oracle.com> References: <30D9B3B0-15B6-4238-A9CA-DCCEB7A83663@oracle.com> Message-ID: <9668889A-EC39-466A-9590-2EB665FFE055@oracle.com> Looks good. > On 17 Mar 2016, at 18:30, Christian Thalinger wrote: > > https://bugs.openjdk.java.net/browse/JDK-8151723 > > There's no guarantee that a JVMCI compiler will be loaded from the boot class path. The mechanism provided for a JVMCI compiler to communicate to the compiler broker that it doesn't want to compile itself should take this into account. 
> > diff -r 3256d4204291 src/share/vm/jvmci/jvmciRuntime.cpp > --- a/src/share/vm/jvmci/jvmciRuntime.cpp Wed Mar 16 10:45:43 2016 +0100 > +++ b/src/share/vm/jvmci/jvmciRuntime.cpp Thu Mar 17 07:29:04 2016 -1000 > @@ -800,12 +800,9 @@ void JVMCIRuntime::shutdown(TRAPS) { > > bool JVMCIRuntime::treat_as_trivial(Method* method) { > if (_HotSpotJVMCIRuntime_initialized) { > - oop loader = method->method_holder()->class_loader(); > - if (loader == NULL) { > - for (int i = 0; i < _trivial_prefixes_count; i++) { > - if (method->method_holder()->name()->starts_with(_trivial_prefixes[i])) { > - return true; > - } > + for (int i = 0; i < _trivial_prefixes_count; i++) { > + if (method->method_holder()->name()->starts_with(_trivial_prefixes[i])) { > + return true; > } > } > } > From christian.thalinger at oracle.com Fri Mar 18 02:08:03 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 17 Mar 2016 16:08:03 -1000 Subject: RFR(XS): 8150054: Code missing from JDK-8150054 causing many test failures In-Reply-To: <56EB2BF4.5050803@oracle.com> References: <56EB07AF.7060602@oracle.com> <2B27642A-0528-4643-9A24-4B6670ECDB73@oracle.com> <56EB17F9.1070506@oracle.com> <56EB2BF4.5050803@oracle.com> Message-ID: > On Mar 17, 2016, at 12:13 PM, Vladimir Kozlov wrote: > > Can you add more compilercontrol tests to JPRT? The number of bugs filed for compilercontrol is *ridiculous* and most of them are related to tests. Yes! > > You can create a separate group for them if they need > 10 min. > > Thanks, > Vladimir > > On 3/17/16 1:47 PM, Nils Eliasson wrote: >> Only one - servicability/dcmd/compiler/CompilerDirectivesDCMDTest.java. >> >> //Nils >> >> On 2016-03-17 21:00, Christian Thalinger wrote: >>> >>>> On Mar 17, 2016, at 9:38 AM, Nils Eliasson <nils.eliasson at oracle.com> wrote: >>>> >>>> Hi, >>>> >>>> Please review this fix of the JDK-8150054 checkin that was missing one change causing all affected compilercontrol tests to fail. 
>>> >>> compilercontrol tests are not run in JPRT? >>> >>>> >>>> Testing: >>>> rm -rf JT*; java -jar jtreg.jar -noignore hotspot/test/compiler/compilercontrol >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8152090 >>>> Webrev: http://cr.openjdk.java.net/~neliasso/8152090/webrev.01/ >>>> >>>> Regards, >>>> Nils Eliasson >>>> >>> >> From christian.thalinger at oracle.com Fri Mar 18 02:08:26 2016 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 17 Mar 2016 16:08:26 -1000 Subject: RFR: 8152134: [JVMCI] printing compile queues always prints C2 regardless of UseJVMCICompiler In-Reply-To: <216E4744-020A-465B-8391-5BC7B28598FC@oracle.com> References: <216E4744-020A-465B-8391-5BC7B28598FC@oracle.com> Message-ID: <91911FCA-7B8D-41F3-B938-44B7E3A1E8B7@oracle.com> > On Mar 17, 2016, at 12:36 PM, Doug Simon wrote: > > I like the second form - it?s not only shorter but also include a feel-good emoji ;-) It does :) > >> On 17 Mar 2016, at 21:58, Christian Thalinger wrote: >> >> https://bugs.openjdk.java.net/browse/JDK-8152134 >> >> It should print ?JVMCI compile queue?. >> >> diff -r 7c31312c5725 src/share/vm/compiler/compileBroker.cpp >> --- a/src/share/vm/compiler/compileBroker.cpp Thu Mar 17 17:03:20 2016 +0000 >> +++ b/src/share/vm/compiler/compileBroker.cpp Thu Mar 17 10:54:08 2016 -1000 >> @@ -773,7 +773,13 @@ void CompileBroker::init_compiler_sweepe >> #endif // !ZERO && !SHARK >> // Initialize the compilation queue >> if (c2_compiler_count > 0) { >> - _c2_compile_queue = new CompileQueue("C2 compile queue"); >> + const char* name; >> +#if INCLUDE_JVMCI >> + name = UseJVMCICompiler ? 
"JVMCI compile queue" : "C2 compile queue"; >> +#else >> + name = "C2 compile queue"; >> +#endif >> + _c2_compile_queue = new CompileQueue(name); >> _compilers[1]->set_num_compiler_threads(c2_compiler_count); >> } >> if (c1_compiler_count > 0) { >> >> or: >> >> diff -r f2f1b80b0b03 src/share/vm/compiler/compileBroker.cpp >> --- a/src/share/vm/compiler/compileBroker.cpp Thu Mar 17 10:55:15 2016 -1000 >> +++ b/src/share/vm/compiler/compileBroker.cpp Thu Mar 17 10:57:12 2016 -1000 >> @@ -773,7 +773,8 @@ void CompileBroker::init_compiler_sweepe >> #endif // !ZERO && !SHARK >> // Initialize the compilation queue >> if (c2_compiler_count > 0) { >> - _c2_compile_queue = new CompileQueue("C2 compile queue"); >> + const char* name = JVMCI_ONLY(UseJVMCICompiler ? "JVMCI compile queue" :) "C2 compile queue"; >> + _c2_compile_queue = new CompileQueue(name); >> _compilers[1]->set_num_compiler_threads(c2_compiler_count); >> } >> if (c1_compiler_count > 0) { >> > From HORII at jp.ibm.com Fri Mar 18 07:42:26 2016 From: HORII at jp.ibm.com (Hiroshi H Horii) Date: Fri, 18 Mar 2016 07:42:26 +0000 Subject: Support for AES on ppc64le In-Reply-To: References: <201603141634.u2EGYYDb023702@d19av08.sagamino.japan.ibm.com><56E71E11.3040808@oracle.com> <201603150932.u2F9W7k9028664@d19av05.sagamino.japan.ibm.com> Message-ID: <201603180742.u2I7gfOX003248@d19av05.sagamino.japan.ibm.com> Hi Martin and all, Thank you for your comments. > May we ask you to support big endian, too? AIX is still big endian. > I think big endian linux requires setting and restoring VRSAVE which > is not needed on little endian linux and AIX. If possible, I would like to finish little endian support first. In the near future, I would like to add support for big endian. > I have taken a look at the PPC64 part of the change and I have a > minor change request: > We don't like to see both strings if has_vcipher(): " vcipher" and " > aes". Please remove one of them. 
In my understanding, to pass the test of compiler/cpuflags/, "aes" should be returned. That is, in the attached new change, "aes" is returned instead of "vcipher". > I noticed that you use the non-volatile vector registers v20-v31 > which may be dangerous as they are not saved & restored. > A quick workaround to allow their usage could be to change the build > to disallow GCC to use altivec. > Else I think we should save & restore them in the java ENTRY_FRAME. > I think we can assist with this. I modified the stubs to avoid using v20-v31. Thank you for your correction. > You may need additional check and cast because next expression expects > the objAESCryptKey points to int[]: sessionK is a private field of com.sun.crypto.provider.AESCrypt and it always stores only two int[] objects (in the makeSessionKey method). So, I believe, an additional cast is not necessary to access sessionK[0] as an int[] object. In addition, Vladimir thankfully corrected my change in load_array_element. Please see the attached file for the details. Regards, Hiroshi ----------------------- Hiroshi Horii, IBM Research - Tokyo "Doerr, Martin" wrote on 03/15/2016 20:57:14: > From: "Doerr, Martin" > To: Hiroshi H Horii/Japan/IBM at IBMJP, Vladimir Kozlov > > Cc: Tim Ellison , "Simonis, Volker" > , "hotspot-compiler-dev at openjdk.java.net" > > Date: 03/15/2016 20:58 > Subject: RE: Support for AES on ppc64le > > Hi Hiroshi, > > thanks for contributing AES support. We appreciate it. > > May we ask you to support big endian, too? AIX is still big endian. > I think big endian linux requires setting and restoring VRSAVE which > is not needed on little endian linux and AIX. > > I have taken a look at the PPC64 part of the change and I have a > minor change request: > We don't like to see both strings if has_vcipher(): " vcipher" and " > aes". Please remove one of them. 
> A quick workaround to allow their usage could be to change the build > to disallow GCC to use altivec. > Else I think we should save & restore them in the java ENTRY_FRAME. > I think we can assist with this. > > Thanks and best regards, > Martin > > From: hotspot-compiler-dev [mailto:hotspot-compiler-dev- > bounces at openjdk.java.net] On Behalf Of Hiroshi H Horii > Sent: Dienstag, 15. März 2016 10:31 > To: Vladimir Kozlov > Cc: Tim Ellison ; Simonis, Volker > ; hotspot-compiler-dev at openjdk.java.net > Subject: Re: Support for AES on ppc64le > > Hi Vladimir, > > Thank you a lot for your quick response and review. > > When I used load_array_element, a SEGV happened, so I needed the following change. > Could you also review whether this change is reasonable? > > diff --git a/src/share/vm/opto/graphKit.cpp b/src/share/vm/opto/graphKit.cpp > --- a/src/share/vm/opto/graphKit.cpp > +++ b/src/share/vm/opto/graphKit.cpp > @@ -1680,6 +1680,8 @@ > Node* GraphKit::load_array_element(Node* ctl, Node* ary, Node* idx, > const TypeAryPtr* arytype) { > const Type* elemtype = arytype->elem(); > BasicType elembt = elemtype->array_element_basic_type(); > + if (elembt == T_NARROWOOP) > + elembt = T_OBJECT; > Node* adr = array_element_address(ary, idx, elembt, arytype->size()); > Node* ld = make_load(ctl, adr, elemtype, elembt, arytype, > MemNode::unordered); > return ld; > > I attached a full diff that applies your kind suggestions. > > > > Regards, > Hiroshi > ----------------------- > Hiroshi Horii, Ph.D. > IBM Research - Tokyo > > > Vladimir Kozlov wrote on 03/15/2016 05:24:49: > > > From: Vladimir Kozlov > > To: Hiroshi H Horii/Japan/IBM at IBMJP, hotspot-compiler-dev at openjdk.java.net > > Cc: "Simonis, Volker" , Tim Ellison > > > > Date: 03/15/2016 05:25 > > Subject: Re: Support for AES on ppc64le > > > > Hi Hiroshi > > > > About library_call.cpp changes. 
> > > > You don't need GraphKit:: > > > > And you can use load_array_element() instead: > > > > Node* objAESCryptKey = load_array_element(control(), objSessionK, > > intcon(0), TypeAryPtr::OOPS); > > > > You may need additional check and cast because next expression expects > > the objAESCryptKey points to int[]: > > > > Node* k_start = array_element_address(objAESCryptKey, intcon(0), T_INT); > > > > Thanks, > > Vladimir > > > > On 3/14/16 9:34 AM, Hiroshi H Horii wrote: > > > Dear all: > > > > > > Can I please request reviews for the following change? > > > This change was created for JDK 9. > > > > > > Description: > > > This change adds stub routines support for single-block AES encryption and > > > decryption operations on the POWER8 platform. They are available only when > > > the application is configured to use SunJCE crypto provider on little > > > endian. > > > These stubs make use of efficient hardware AES instructions and thus > > > offer significant performance improvements over JITed code on POWER8 > > > as on x86 and SPARC. AES stub routines are enabled by default on POWER8 > > > platforms that support AES instructions (vcipher). They can be > > > explicitly enabled or > > > disabled on the command-line using UseAES and UseAESIntrinsics JVM flags. > > > Unlike x86 and SPARC, vcipher and vncipher of POWER8 need the same round > > > keys of AES. Therefore, inline_aescrypt_Block in library_call.cpp calls > > > the stub with > > > AESCrypt.sessionK[0] as round keys. > > > > > > Summary of source code changes: > > > > > > *src/cpu/ppc/vm/assembler_ppc.hpp > > > *src/cpu/ppc/vm/assembler_ppc.inline.hpp > > > - Adds support for vrld instruction to rotate vector register values > > > with > > > left doubleword. > > > > > > *src/cpu/ppc/vm/stubGenerator_ppc.cpp > > > - Defines stubs for single-block AES encryption and > decryption routines > > > supporting all key sizes (128-bit, 192-bit and 256-bit). 
> > > - Current POWER AES decryption instructions are not compatible with > > > SunJCE expanded decryption key format. Thus decryption stubs read > > > the expanded encryption keys (sessionK[0]) in descending order. > > > - Encryption stubs use the SunJCE expanded encryption key as there is > > > no incompatibility issue between POWER8 AES encryption instructions > > > and SunJCE expanded encryption keys. > > > > > > *src/cpu/ppc/vm/vm_version_ppc.cpp > > > - Detects AES capabilities of the underlying CPU by using > has_vcipher(). > > > - Enables UseAES and UseAESIntrinsics flags if the underlying CPU > > > supports AES instructions and neither of them is explicitly > > > disabled on > > > the command-line. Generates a warning message if either of these > > > flags is > > > enabled on the command-line whereas the underlying CPU does not > > > support > > > AES instructions. > > > > > > *src/share/vm/opto/library_call.cpp > > > - Passes the first input parameter, reference to sessionK[0], to the > > > AES stubs > > > only on the POWER platform. > > > > > > Code change: > > > Please see the attached diff file that was generated with "hg diff > > > -g" under > > > the latest hotspot directory. > > > > > > Passed tests: > > > jtreg compiler/codegen/7184394/ > > > jtreg compiler/cpuflags/ (after removing @ignored annotation) > > > > > > * This is my first post of a change. I'm sorry in advance if I don't > > > follow the > > > community manners. > > > > > > * I wrote this description based on the following. > > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013- > > November/012670.html > > > > > > > > > > > > Regards, > > > Hiroshi > > > ----------------------- > > > Hiroshi Horii, > > > IBM Research - Tokyo > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: ppc64le_aes_support_20160318.diff Type: application/octet-stream Size: 22363 bytes Desc: not available URL: From martin.doerr at sap.com Fri Mar 18 08:36:46 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 18 Mar 2016 08:36:46 +0000 Subject: RFR(S): 8151818: C1: LIRGenerator::move_to_phi can't deal with illegal phi References: <258236A7-9325-4F19-B3F4-3108DA39D9DF@oracle.com> Message-ID: Thanks for reviewing. Can I get a second review and a sponsor, please? Best regards, Martin From: Igor Veresov [mailto:igor.veresov at oracle.com] Sent: Montag, 14. März 2016 20:16 To: Doerr, Martin > Cc: hotspot-compiler-dev at openjdk.java.net Subject: Re: RFR(S): 8151818: C1: LIRGenerator::move_to_phi can't deal with illegal phi Seems fine. igor On Mar 14, 2016, at 8:50 AM, Doerr, Martin > wrote: Sorry, I had pasted the wrong link to the webrev: http://cr.openjdk.java.net/~mdoerr/8151818_c1_illegal_phi/webrev.00/ From: Doerr, Martin Sent: Montag, 14. März 2016 16:49 To: hotspot-compiler-dev at openjdk.java.net Subject: RFR(S): 8151818: C1: LIRGenerator::move_to_phi can't deal with illegal phi Hi, we found out that C1 can't deal with illegal Phi functions which propagate into other Phi functions. Phi functions become illegal when their inputs have different types. This was observed when we activated the JVMTI capability can_access_local_variables and restored the old behavior of BlockBegin::try_merge: invalidate the phi functions instead of bailing out. The function LIRGenerator::move_to_phi crashes in this case. The proposed fix is to bail out, as this case happens extremely rarely. It seems like it was never observed with the new behavior of BlockBegin::try_merge. I also improved some assertions to support locals with illegal types. The webrev is here: https://bugs.openjdk.java.net/browse/JDK-8151818 Please review. I will also need a sponsor if this change is desired. Best regards, Martin -------------- next part -------------- An HTML attachment was scrubbed...
URL: From tobias.hartmann at oracle.com Fri Mar 18 09:22:55 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 18 Mar 2016 10:22:55 +0100 Subject: [9] RFR(S): 8136458: Remove "marked for reclamation" nmethod state Message-ID: <56EBC8EF.4040000@oracle.com> Hi, please review the following patch. https://bugs.openjdk.java.net/browse/JDK-8136458 http://cr.openjdk.java.net/~thartmann/8136458/webrev.00/ The sweeper removes zombie nmethods only after they were "marked for reclamation" to ensure that there are no inline caches referencing the zombie nmethod. However, this is not required because if a zombie nmethod is encountered again by the sweeper, all ICs pointing to it were already cleaned in the previous sweeper cycle: alive -> not-entrant/unloaded (may be on the stack) cycle 1: not-entrant/unloaded -> zombie (may be referenced by ICs) cycle 2: zombie -> marked for reclamation cycle 3: marked for reclamation -> flush In each cycle, we clean all inline caches that point to not-entrant/unloaded/zombie nmethods. Therefore, we know already after sweeper cycle 2, that the zombie nmethod is not referenced by any ICs and we could flush it immediately. I removed the "marked for reclamation" state. 
The following testing revealed no problems: - JPRT - RBT with hotspot_all and -Xcomp/-Xmixed - 100 iterations of Nashorn + Octane with -XX:StartAggressiveSweepingAt=100/50 -XX:NmethodSweepActivity=500/100 Thanks, Tobias From martin.doerr at sap.com Fri Mar 18 12:08:57 2016 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 18 Mar 2016 12:08:57 +0000 Subject: Support for AES on ppc64le In-Reply-To: <201603180742.u2I7gnQF014426@d19av06.sagamino.japan.ibm.com> References: <201603141634.u2EGYYDb023702@d19av08.sagamino.japan.ibm.com><56E71E11.3040808@oracle.com> <201603150932.u2F9W7k9028664@d19av05.sagamino.japan.ibm.com> <201603180742.u2I7gnQF014426@d19av06.sagamino.japan.ibm.com> Message-ID: <6b0f2af371b54363b904e837fb35db84@DEWDFE13DE14.global.corp.sap> Hi Hiroshi, >> If possible, I would like to finish little endian support first. >> In the near future, I would like to add a support of big endian. That's ok. I've opened a bug and created a webrev as needed for every contribution: https://bugs.openjdk.java.net/browse/JDK-8152172 http://cr.openjdk.java.net/~mdoerr/8152172_ppc64le_aes/webrev.00/ I had to make the following changes in order to get the debug build working as well: 1. I had to remove in_bytes() which is only supported for ByteSize objects. The values are already integers. 2. The instructions like lvx don't support 0 (or r0) as second parameter (there's an assertion). We have to use the version which omits this parameter. 3. Some instructions use 4 or 5 bit signed immediates. Therefore I had to replace 4 by -4 or 16 by -16 to avoid assertions. 4. You had removed code for UseAESCTRIntrinsics, too. I have added it back since you only implemented AES but not AESCTR intrinsics. The new version of the diff file is 8152172_ppc64le_aes.changeset which you can find in the webrev. 
If this is the final version you would like to contribute, please send out a request for review with the following headline (and point to the webrev): RFR(M): 8152172: PPC64: Support AES intrinsics If you need additional changes, just let us know. Best regards, Martin From: Hiroshi H Horii [mailto:HORII at jp.ibm.com] Sent: Freitag, 18. März 2016 08:42 To: Doerr, Martin Cc: hotspot-compiler-dev at openjdk.java.net; Tim Ellison ; Vladimir Kozlov ; Simonis, Volker Subject: RE: Support for AES on ppc64le Hi Martin and all, Thank you for your comments. > May we ask you to support big endian, too? AIX is still big endian. > I think big endian linux requires setting and restoring VRSAVE which > is not needed on little endian linux and AIX. If possible, I would like to finish little endian support first. In the near future, I would like to add support for big endian. > I have taken a look at the PPC64 part of the change and I have a > minor change request: > We don't like to see both strings if has_vcipher(): " vcipher" and " > aes". Please remove one of them. In my understanding, to pass the test of compiler/cpuflags/, "aes" should be returned. That is, in the attached new change, "aes" is returned instead of "vcipher". > I noticed that you use the non-volatile vector registers v20-v31 > which may be dangerous as they are not saved & restored. > A quick workaround to allow their usage could be to change the build > to disallow GCC to use altivec. > Else I think we should save & restore them in the java ENTRY_FRAME. > I think we can assist with this. I modified the stub code to avoid using v20-v31. Thank you for your correction. > You may need additional check and cast because next expression expects > the objAESCryptKey points to int[]: sessionK is a private field of com.sun.crypto.provider.AESCrypt and it always stores only two int[] objects (in makeSessionKey method). So I believe an additional cast is not necessary to access sessionK[0] as an int[] object.
In addition, Vladimir thankfully corrected my change in load_array_element. Please see the attached file for the detail. Regards, Hiroshi ----------------------- Hiroshi Horii, IBM Research - Tokyo "Doerr, Martin" > wrote on 03/15/2016 20:57:14: > From: "Doerr, Martin" > > To: Hiroshi H Horii/Japan/IBM at IBMJP, Vladimir Kozlov > > > Cc: Tim Ellison >, "Simonis, Volker" > >, "hotspot-compiler-dev at openjdk.java.net" > > > Date: 03/15/2016 20:58 > Subject: RE: Support for AES on ppc64le > > Hi Hiroshi, > > thanks for contributing AES support. We appreciate it. > > May we ask you to support big endian, too? AIX is still big endian. > I think big endian linux requires setting and restoring VRSAVE which > is not needed on little endian linux and AIX. > > I have taken a look at the PPC64 part of the change and I have a > minor change request: > We don't like to see both strings if has_vcipher(): " vcipher" and " > aes". Please remove one of them. > > I noticed that you use the non-volatile vector registers v20-v31 > which may be dangerous as they are not saved & restored. > A quick workaround to allow their usage could be to change the build > to disallow GCC to use altivec. > Else I think we should save & restore them in the java ENTRY_FRAME. > I think we can assist with this. > > Thanks and best regards, > Martin > > From: hotspot-compiler-dev [mailto:hotspot-compiler-dev- > bounces at openjdk.java.net] On Behalf Of Hiroshi H Horii > Sent: Dienstag, 15. M?rz 2016 10:31 > To: Vladimir Kozlov > > Cc: Tim Ellison >; Simonis, Volker > >; hotspot-compiler-dev at openjdk.java.net > Subject: Re: Support for AES on ppc64le > > Hi Vladimir, > > Thank you a lots for your quick response and review. > > To use load_array_element, SEGV happened, I needed the following change. > Could you also review this change is reasonable? 
> > diff --git a/src/share/vm/opto/graphKit.cpp b/src/share/vm/opto/graphKit.cpp > --- a/src/share/vm/opto/graphKit.cpp > +++ b/src/share/vm/opto/graphKit.cpp > @@ -1680,6 +1680,8 @@ > Node* GraphKit::load_array_element(Node* ctl, Node* ary, Node* idx, > const TypeAryPtr* arytype) { > const Type* elemtype = arytype->elem(); > BasicType elembt = elemtype->array_element_basic_type(); > + if (elembt == T_NARROWOOP) > + elembt = T_OBJECT; > Node* adr = array_element_address(ary, idx, elembt, arytype->size()); > Node* ld = make_load(ctl, adr, elemtype, elembt, arytype, > MemNode::unordered); > return ld; > > I attached a full diff that is applied your kind suggestions. > > > > Regards, > Hiroshi > ----------------------- > Hiroshi Horii, Ph.D. > IBM Research - Tokyo > > > Vladimir Kozlov > wrote on 03/15/2016 05:24:49: > > > From: Vladimir Kozlov > > > To: Hiroshi H Horii/Japan/IBM at IBMJP, hotspot-compiler-dev at openjdk.java.net > > Cc: "Simonis, Volker" >, Tim Ellison > > > > > Date: 03/15/2016 05:25 > > Subject: Re: Support for AES on ppc64le > > > > Hi Hiroshi > > > > About library_call.cpp changes. > > > > You don't need GraphKit:: > > > > And you can use load_array_element() instead: > > > > Node* objAESCryptKey = load_array_element(control(), objSessionK, > > intcon(0), TypeAryPtr::OOPS); > > > > You may need additional check and cast because next expression expects > > the objAESCryptKey points to int[]: > > > > Node* k_start = array_element_address(objAESCryptKey, intcon(0), T_INT); > > > > Thanks, > > Vladimir > > > > On 3/14/16 9:34 AM, Hiroshi H Horii wrote: > > > Dear all: > > > > > > Can I please request reviews for the following change? > > > This change was created for JDK 9. > > > > > > Description: > > > This change adds stub routines support for single-block AES encryption and > > > decryption operations on the POWER8 platform. They are availableonly when > > > the application is configured to use SunJCE crypto provider on little > > > endian. 
> > > These stubs make use of efficient hardware AES instructions and thus > > > offer significant performance improvements over JITed code on POWER8 > > > as on x86 and SPARC. AES stub routines are enabled by default on POWER8 > > > platforms that support AES instructions (vcipher). They can be > > > explicitly enabled or > > > disabled on the command-line using UseAES and UseAESIntrinsics JVM flags. > > > Unlike x86 and SPARC, vcipher and vnchiper of POWER8 need the same round > > > keys of AES. Therefore, inline_aescrypt_Block in library_call.cpp calls > > > the stub with > > > AESCrypt.sessionK[0] as round keys. > > > > > > Summary of source code changes: > > > > > > *src/cpu/ppc/vm/assembler_ppc.hpp > > > *src/cpu/ppc/vm/assembler_ppc.inline.hpp > > > - Adds support for vrld instruction to rotate vector register values > > > with > > > left doubleword. > > > > > > *src/cpu/ppc/vm/stubGenerator_ppc.cpp > > > - Defines stubs for single-block AES encryption and > decryption routines > > > supporting all key sizes (128-bit, 192-bit and 256-bit). > > > - Current POWER AES decryption instructions are not compatible with > > > SunJCE expanded decryption key format. Thus decryption stubs read > > > the expanded encryption keys (sessionK[0]) with descendant order. > > > - Encryption stubs use SunJCE expanded encryption key as their is > > > no incompatibility issue between POWER8 AES encryption instructions > > > and SunJCE expanded encryption keys. > > > > > > *src/cpu/ppc/vm/vm_version_ppc.cpp > > > - Detects AES capabilities of the underlying CPU by using > has_vcipher(). > > > - Enables UseAES and UseAESIntrinsics flags if the underlying CPU > > > supports AES instructions and neither of them is explicitly > > > disabled on > > > the command-line. Generate warning message if either of these > > > flags are > > > enabled on the command-line whereas the underlying CPU does not > > > support > > > AES instructions. 
> > > > > > *src/share/vm/opto/library_call.cpp > > > - Passes the first input parameter, reference to sessionK[0] to the > > > AES stubs > > > only on the POWER platform. > > > > > > Code change: > > > Please see an attached diff file that was generated with "hg diff > > > -g" under > > > the latest hotspot directory. > > > > > > Passed tests: > > > jtreg compiler/codegen/7184394/ > > > jtreg compiler/cpuflags/ (after removing @ignored annotation) > > > > > > * This is my first post of a change. I'm sorry in advance if I don't > > > follow the > > > community manners. > > > > > > * I wrote this description based on the follows. > > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2013- > > November/012670.html > > > > > > > > > > > > Regards, > > > Hiroshi > > > ----------------------- > > > Hiroshi Horii, > > > IBM Research - Tokyo > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aph at redhat.com Fri Mar 18 12:19:11 2016 From: aph at redhat.com (Andrew Haley) Date: Fri, 18 Mar 2016 12:19:11 +0000 Subject: RFR: 8151775: aarch64: add support for 8.1 LSE atomic operations In-Reply-To: <1457946385.13788.9.camel@mint> References: <1457946385.13788.9.camel@mint> Message-ID: <56EBF23F.90508@redhat.com> On 03/14/2016 09:06 AM, Edward Nevill wrote: > The following webrev adds support for 8.1 LSE atomic operations > > http://cr.openjdk.java.net/~enevill/8151775/webrev > > It also adds two CAS cases which I missed in the previous patch for > 8.1 CAS instructions. > > Tested with a clean run through jcstress. I'm rejecting most of this patch. Firstly, the native code is inappropriate at this level. A runtime check for UseLSE at every atomic operation is intolerable. Instead, please use an appropriately-configured version of GCC and binutils. I really don't like to see UseLSE in places like the TemplateInterpreterGenerator and c1_LIRAssembler. Please define an appropriate macro for ldadd, swp, (etc.)
in MacroAssembler and use it. In fact, there already seem to be some appropriate definitions (atomic_add, etc.) in MacroAssembler; I think you could use them. Like this: void LIR_Assembler::casl(Register addr, Register newval, Register cmpval) { Label succeed; __ cmpxchg(addr, cmpval, newval, Assembler::xword, /* acquire*/ true, /* release*/ true, rscratch1); __ cset(rscratch1, Assembler::NE); } Andrew. From nils.eliasson at oracle.com Fri Mar 18 12:49:43 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 18 Mar 2016 13:49:43 +0100 Subject: [9] RFR(S): 8136458: Remove "marked for reclamation" nmethod state In-Reply-To: <56EBC8EF.4040000@oracle.com> References: <56EBC8EF.4040000@oracle.com> Message-ID: <56EBF967.9030806@oracle.com> Hi, Was it the move to a dedicated sweeper thread that made this possible? When the compiler threads swept the code cache, it was done in parts, and then we had to keep the nmethods as marked-for-reclamation until a full sweep was completed. Am I right? Then this makes perfect sense. Looks good, Nils On 2016-03-18 10:22, Tobias Hartmann wrote: > Hi, > > please review the following patch. > > https://bugs.openjdk.java.net/browse/JDK-8136458 > http://cr.openjdk.java.net/~thartmann/8136458/webrev.00/ > > The sweeper removes zombie nmethods only after they were "marked for reclamation" to ensure that there are no inline caches referencing the zombie nmethod. However, this is not required because if a zombie nmethod is encountered again by the sweeper, all ICs pointing to it were already cleaned in the previous sweeper cycle: > > alive -> not-entrant/unloaded (may be on the stack) > cycle 1: not-entrant/unloaded -> zombie (may be referenced by ICs) > cycle 2: zombie -> marked for reclamation > cycle 3: marked for reclamation -> flush > > In each cycle, we clean all inline caches that point to not-entrant/unloaded/zombie nmethods.
Therefore, we know already after sweeper cycle 2, that the zombie nmethod is not referenced by any ICs and we could flush it immediately. > > I removed the "marked for reclamation" state. The following testing revealed no problems: > - JPRT > - RBT with hotspot_all and -Xcomp/-Xmixed > - 100 iterations of Nashorn + Octane with -XX:StartAggressiveSweepingAt=100/50 -XX:NmethodSweepActivity=500/100 > > Thanks, > Tobias From tobias.hartmann at oracle.com Fri Mar 18 13:34:44 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 18 Mar 2016 14:34:44 +0100 Subject: [9] RFR(S): 8136458: Remove "marked for reclamation" nmethod state In-Reply-To: <56EBF967.9030806@oracle.com> References: <56EBC8EF.4040000@oracle.com> <56EBF967.9030806@oracle.com> Message-ID: <56EC03F4.3010205@oracle.com> Hi Nils, thanks for looking at this! On 18.03.2016 13:49, Nils Eliasson wrote: > Was it the move to a dedicated sweeper thread that make this possible? When the compiler threads swept the code cache, it was done in parts, and then we had to keep the nmethods as marked-for-reclaation until a full sweep was completed. Am I right? Then this makes perfect sense. No, the dedicated sweeper thread did not change this behavior. Before 8046809 [1], sweeping was done by the compiler threads in several steps to reduce pressure and establish a good balance between sweeping and compiling. Now having a dedicated thread (that may be interrupted at any time), does not affect the nmethod state cycle but allows the thread scheduler to find the best balance between compilation and sweeping. We still have to wait for a full sweep to finish before a zombie nmethod is flushed. My point is that we do this anyway and don't need the "marked for reclamation" state for this. Basically, my assumptions are the following: After a nmethod was marked as zombie by the sweeper, it's guaranteed to be not on the stack and no inline caches will be updated to point to this nmethod. 
However, outdated ICs may still point to it. The sweeper will continue to visit nmethods and only encounter the zombie nmethod again, after a full sweep cycle is finished. This is guaranteed because nmethods are not moved in the code cache. Therefore, all the inline caches of other nmethods that point to the zombie nmethod are cleaned now. We can safely flush the nmethod. I'm not sure why the "marked for reclamation" state was ever necessary (if at all). Best regards, Tobias [1] http://cr.openjdk.java.net/~anoll/8046809/webrev.06/ > > Looks good, > Nils > > On 2016-03-18 10:22, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch. >> >> https://bugs.openjdk.java.net/browse/JDK-8136458 >> http://cr.openjdk.java.net/~thartmann/8136458/webrev.00/ >> >> The sweeper removes zombie nmethods only after they were "marked for reclamation" to ensure that there are no inline caches referencing the zombie nmethod. However, this is not required because if a zombie nmethod is encountered again by the sweeper, all ICs pointing to it were already cleaned in the previous sweeper cycle: >> >> alive -> not-entrant/unloaded (may be on the stack) >> cycle 1: not-entrant/unloaded -> zombie (may be referenced by ICs) >> cycle 2: zombie -> marked for reclamation >> cycle 3: marked for reclamation -> flush >> >> In each cycle, we clean all inline caches that point to not-entrant/unloaded/zombie nmethods. Therefore, we know already after sweeper cycle 2, that the zombie nmethod is not referenced by any ICs and we could flush it immediately. >> >> I removed the "marked for reclamation" state.
The following testing revealed no problems: >> - JPRT >> - RBT with hotspot_all and -Xcomp/-Xmixed >> - 100 iterations of Nashorn + Octane with -XX:StartAggressiveSweepingAt=100/50 -XX:NmethodSweepActivity=500/100 >> >> Thanks, >> Tobias > From filipp.zhinkin at gmail.com Fri Mar 18 14:03:08 2016 From: filipp.zhinkin at gmail.com (Filipp Zhinkin) Date: Fri, 18 Mar 2016 17:03:08 +0300 Subject: RFR (XS): 8152004: CTW crashes with failed assertion after 8150646 integration In-Reply-To: References: <56E9874C.4050208@oracle.com> Message-ID: Hi, could someone sponsor this change? http://cr.openjdk.java.net/~fzhinkin/8152004/webrev.01/hotspot.changeset Thanks in advance, Filipp. On Thu, Mar 17, 2016 at 9:59 AM, Filipp Zhinkin wrote: > Nils, thank you for the review. > > May I ask you push this change? > A patch with correct commit message is here: > http://cr.openjdk.java.net/~fzhinkin/8152004/webrev.01/hotspot.changeset > > Thanks, > Filipp. > > On Wed, Mar 16, 2016 at 7:18 PM, Nils Eliasson wrote: >> Looks good. >> >> Thanks for fixing. >> >> Regards, >> Nils >> >> >> On 2016-03-16 16:39, Filipp Zhinkin wrote: >>> >>> Hi all, >>> >>> please review a small fix that force VM to disable BlockingCompilation >>> flag and ignore BlockingCompilationOption if CompileTheWorld or >>> ReplayCompiles options were turned on. >>> >>> Webrev: http://cr.openjdk.java.net/~fzhinkin/8152004/webrev.00/ >>> Bug id: https://bugs.openjdk.java.net/browse/JDK-8152004 >>> Testing: CTW >>> >>> Regards, >>> Filipp. >> >> From nils.eliasson at oracle.com Fri Mar 18 15:02:49 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 18 Mar 2016 16:02:49 +0100 Subject: RFR(S): 8152169: LockCompilationTest.java fails due method present in the compiler queue Message-ID: <56EC1899.7020106@oracle.com> Hi, Please review this test fix. Summary: This test tests the locking of the compilers - make sure no compiles can be completed while the lock is in place. 
When running with -XX:-TieredCompilation and -XX:CompileThreshold=100 we got a fairly long queue of compiles waiting, and when only C2 is available it can take longer to complete than the test wait time. Solution: Only allow compile of the test method - make sure we have no contention on the compile queue. With only a single method in the queue we can also reduce the wait time. Testing: Test run in failing configuration. Bug: https://bugs.openjdk.java.net/browse/JDK-8152169 Webrev: http://cr.openjdk.java.net/~neliasso/8152169/webrev.01/ Regards, Nils From nils.eliasson at oracle.com Fri Mar 18 15:10:27 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 18 Mar 2016 16:10:27 +0100 Subject: RFR (XS): 8152004: CTW crashes with failed assertion after 8150646 integration In-Reply-To: References: <56E9874C.4050208@oracle.com> Message-ID: <56EC1A63.1070808@oracle.com> Hi, Yes, I'll sponsor it. Regards, Nils On 2016-03-18 15:03, Filipp Zhinkin wrote: > Hi, > > could someone sponsor this change? > > http://cr.openjdk.java.net/~fzhinkin/8152004/webrev.01/hotspot.changeset > > Thanks in advance, > Filipp. > > On Thu, Mar 17, 2016 at 9:59 AM, Filipp Zhinkin > wrote: >> Nils, thank you for the review. >> >> May I ask you push this change? >> A patch with correct commit message is here: >> http://cr.openjdk.java.net/~fzhinkin/8152004/webrev.01/hotspot.changeset >> >> Thanks, >> Filipp. >> >> On Wed, Mar 16, 2016 at 7:18 PM, Nils Eliasson wrote: >>> Looks good. >>> >>> Thanks for fixing. >>> >>> Regards, >>> Nils >>> >>> >>> On 2016-03-16 16:39, Filipp Zhinkin wrote: >>>> Hi all, >>>> >>>> please review a small fix that force VM to disable BlockingCompilation >>>> flag and ignore BlockingCompilationOption if CompileTheWorld or >>>> ReplayCompiles options were turned on. >>>> >>>> Webrev: http://cr.openjdk.java.net/~fzhinkin/8152004/webrev.00/ >>>> Bug id: https://bugs.openjdk.java.net/browse/JDK-8152004 >>>> Testing: CTW >>>> >>>> Regards, >>>> Filipp. 
>>> From nils.eliasson at oracle.com Fri Mar 18 15:33:15 2016 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 18 Mar 2016 16:33:15 +0100 Subject: [9] RFR(S): 8136458: Remove "marked for reclamation" nmethod state In-Reply-To: <56EC03F4.3010205@oracle.com> References: <56EBC8EF.4040000@oracle.com> <56EBF967.9030806@oracle.com> <56EC03F4.3010205@oracle.com> Message-ID: <56EC1FBB.6060703@oracle.com> On 2016-03-18 14:34, Tobias Hartmann wrote: > Hi Nils, > > thanks for looking at this! > > On 18.03.2016 13:49, Nils Eliasson wrote: >> Was it the move to a dedicated sweeper thread that make this possible? When the compiler threads swept the code cache, it was done in parts, and then we had to keep the nmethods as marked-for-reclaation until a full sweep was completed. Am I right? Then this makes perfect sense. > No, the dedicated sweeper thread did not change this behavior. Before 8046809 [1], sweeping was done by the compiler threads in several steps to reduce pressure and establish a good balance between sweeping and compiling. Now having a dedicated thread (that may be interrupted at any time), does not affect the nmethod state cycle but allows the thread scheduler to find the best balance between compilation and sweeping. We still have to wait for a full sweep to finish before a zombie nmethod is flushed. My point is that we do this anyway and don't need the "marked for reclamation" state for this. The dedicated sweeper is superb - it makes the code so much easier to reason about, both the sweeper code and the compilebroker code. > > Basically, my assumptions are the following: > After a nmethod was marked as zombie by the sweeper, it's guaranteed to be not on the stack and no inline caches will be updated to point to this nmethod. However, outdated ICs may still point to it. The sweeper will continue to visit nmethods and only encounter the zombie nmethod again, after a full sweep cycle is finished. 
This is guaranteed because nmethods are not moved in the code cache. Therefore, all the inline caches of other nmethods that point to the zombie nmethod are cleaned now. We can safely flush the nmethod. >> >> I'm not sure why the "marked for reclamation" state was ever necessary (if at all). > > I think nmethods could be made zombies in more ways before. (Currently it looks like only the sweeper makes them zombies.) In the case when other threads can make zombies in the middle of a sweep, you need to differentiate between the ones that were zombie when the sweep started, and the ones that were made zombie in a part of the code cache that is already swept. Thanks for fixing this! Nils > > Best regards, > Tobias > > [1] http://cr.openjdk.java.net/~anoll/8046809/webrev.06/ > >> Looks good, >> Nils >> >> On 2016-03-18 10:22, Tobias Hartmann wrote: >>> Hi, >>> >>> please review the following patch. >>> >>> https://bugs.openjdk.java.net/browse/JDK-8136458 >>> http://cr.openjdk.java.net/~thartmann/8136458/webrev.00/ >>> >>> The sweeper removes zombie nmethods only after they were "marked for reclamation" to ensure that there are no inline caches referencing the zombie nmethod. However, this is not required because if a zombie nmethod is encountered again by the sweeper, all ICs pointing to it were already cleaned in the previous sweeper cycle: >>> >>> alive -> not-entrant/unloaded (may be on the stack) >>> cycle 1: not-entrant/unloaded -> zombie (may be referenced by ICs) >>> cycle 2: zombie -> marked for reclamation >>> cycle 3: marked for reclamation -> flush >>> >>> In each cycle, we clean all inline caches that point to not-entrant/unloaded/zombie nmethods. Therefore, we know already after sweeper cycle 2, that the zombie nmethod is not referenced by any ICs and we could flush it immediately. >>> >>> I removed the "marked for reclamation" state.
The following testing revealed no problems: >>> - JPRT >>> - RBT with hotspot_all and -Xcomp/-Xmixed >>> - 100 iterations of Nashorn + Octane with -XX:StartAggressiveSweepingAt=100/50 -XX:NmethodSweepActivity=500/100 >>> >>> Thanks, >>> Tobias From tobias.hartmann at oracle.com Fri Mar 18 15:50:34 2016 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 18 Mar 2016 16:50:34 +0100 Subject: [9] RFR(S): 8136458: Remove "marked for reclamation" nmethod state In-Reply-To: <56EC1FBB.6060703@oracle.com> References: <56EBC8EF.4040000@oracle.com> <56EBF967.9030806@oracle.com> <56EC03F4.3010205@oracle.com> <56EC1FBB.6060703@oracle.com> Message-ID: <56EC23CA.6040808@oracle.com> Hi Nils, On 18.03.2016 16:33, Nils Eliasson wrote: > On 2016-03-18 14:34, Tobias Hartmann wrote: >> Hi Nils, >> >> thanks for looking at this! >> >> On 18.03.2016 13:49, Nils Eliasson wrote: >>> Was it the move to a dedicated sweeper thread that make this possible? When the compiler threads swept the code cache, it was done in parts, and then we had to keep the nmethods as marked-for-reclaation until a full sweep was completed. Am I right? Then this makes perfect sense. >> No, the dedicated sweeper thread did not change this behavior. Before 8046809 [1], sweeping was done by the compiler threads in several steps to reduce pressure and establish a good balance between sweeping and compiling. Now having a dedicated thread (that may be interrupted at any time), does not affect the nmethod state cycle but allows the thread scheduler to find the best balance between compilation and sweeping. We still have to wait for a full sweep to finish before a zombie nmethod is flushed. My point is that we do this anyway and don't need the "marked for reclamation" state for this. > > The dedicated sweeper is superb - it makes the code so much easier to reason about, both the sweeper code and the compilebroker code. Yes, I agree. It's important to keep this code simple and fast. 
>> Basically, my assumptions are the following: >> After a nmethod was marked as zombie by the sweeper, it's guaranteed to be not on the stack and no inline caches will be updated to point to this nmethod. However, outdated ICs may still point to it. The sweeper will continue to visit nmethods and only encounter the zombie nmethod again, after a full sweep cycle is finished. This is guaranteed because nmethods are not moved in the code cache. Therefore, all the inline caches of other nmethods that pointer to the zombie nmethod are cleaned now. We can safely flush the nmethod. >> >> I'm not sure why the "marked for reclamation" state was ever necessary (if at all). > > I think nmethods could be made zombies in more ways before. (Currently it looks like only the sweeper make them zombies.) In the case when another threads can make zombies in the middle of a sweep, you need to differentiate between the ones that where zombie when the sweep started, and the ones that where made zombie in a part of the code cache that is already swept. Right, we had CodeCache::make_marked_nmethods_zombies() which I removed with JDK-8075805 [1] because it caused other problems. The sweeper is now (and should remain) the only place where nmethods can transition to zombie. Thanks again for the review! Best regards, Tobias [1] http://cr.openjdk.java.net/~thartmann/8075805/webrev.01/ > Thanks for fixing this! > Nils >> >> Best regards, >> Tobias >> >> [1] http://cr.openjdk.java.net/~anoll/8046809/webrev.06/ >> >>> Looks good, >>> Nils >>> >>> On 2016-03-18 10:22, Tobias Hartmann wrote: >>>> Hi, >>>> >>>> please review the following patch. >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8136458 >>>> http://cr.openjdk.java.net/~thartmann/8136458/webrev.00/ >>>> >>>> The sweeper removes zombie nmethods only after they were "marked for reclamation" to ensure that there are no inline caches referencing the zombie nmethod. 
However, this is not required because if a zombie nmethod is encountered again by the sweeper, all ICs pointing to it were already cleaned in the previous sweeper cycle: >>>> >>>> alive -> not-entrant/unloaded (may be on the stack) >>>> cycle 1: not-entrant/unloaded -> zombie (may be referenced by ICs) >>>> cycle 2: zombie -> marked for reclamation >>>> cycle 3: marked for reclamation -> flush >>>> >>>> In each cycle, we clean all inline caches that point to not-entrant/unloaded/zombie nmethods. Therefore, we know already after sweeper cycle 2 that the zombie nmethod is not referenced by any ICs and we could flush it immediately. >>>> >>>> I removed the "marked for reclamation" state. The following testing revealed no problems: >>>> - JPRT >>>> - RBT with hotspot_all and -Xcomp/-Xmixed >>>> - 100 iterations of Nashorn + Octane with -XX:StartAggressiveSweepingAt=100/50 -XX:NmethodSweepActivity=500/100 >>>> >>>> Thanks, >>>> Tobias > From vladimir.kozlov at oracle.com Fri Mar 18 16:21:24 2016 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 18 Mar 2016 09:21:24 -0700 Subject: [9] RFR(S): 8136458: Remove "marked for reclamation" nmethod state In-Reply-To: <56EC23CA.6040808@oracle.com> References: <56EBC8EF.4040000@oracle.com> <56EBF967.9030806@oracle.com> <56EC03F4.3010205@oracle.com> <56EC1FBB.6060703@oracle.com> <56EC23CA.6040808@oracle.com> Message-ID: <56EC2B04.7040900@oracle.com> But the sweeper should sweep at least twice before cleaning all ICs pointing to a zombie method, because a method could be marked in the middle of a sweep. Right? Thanks, Vladimir On 3/18/16 8:50 AM, Tobias Hartmann wrote: > Hi Nils, > > On 18.03.2016 16:33, Nils Eliasson wrote: >> On 2016-03-18 14:34, Tobias Hartmann wrote: >>> Hi Nils, >>> >>> thanks for looking at this! >>> >>> On 18.03.2016 13:49, Nils Eliasson wrote: >>>> Was it the move to a dedicated sweeper thread that made this possible?
When the compiler threads swept the code cache, it was done in parts, and then we had to keep the nmethods as marked-for-reclamation until a full sweep was completed. Am I right? Then this makes perfect sense. >>> No, the dedicated sweeper thread did not change this behavior. Before 8046809 [1], sweeping was done by the compiler threads in several steps to reduce pressure and establish a good balance between sweeping and compiling. Now, having a dedicated thread (that may be interrupted at any time) does not affect the nmethod state cycle but allows the thread scheduler to find the best balance between compilation and sweeping. We still have to wait for a full sweep to finish before a zombie nmethod is flushed. My point is that we do this anyway and don't need the "marked for reclamation" state for this. >> >> The dedicated sweeper is superb - it makes the code so much easier to reason about, both the sweeper code and the compilebroker code. > > Yes, I agree. It's important to keep this code simple and fast. > >>> Basically, my assumptions are the following: >>> After a nmethod was marked as zombie by the sweeper, it's guaranteed not to be on the stack and no inline caches will be updated to point to this nmethod. However, outdated ICs may still point to it. The sweeper will continue to visit nmethods and only encounter the zombie nmethod again after a full sweep cycle is finished. This is guaranteed because nmethods are not moved in the code cache. Therefore, all the inline caches of other nmethods that point to the zombie nmethod are cleaned now. We can safely flush the nmethod. >>> >>> I'm not sure why the "marked for reclamation" state was ever necessary (if at all). >> >> I think nmethods could be made zombies in more ways before. (Currently it looks like only the sweeper makes them zombies.)
In the case when other threads can make zombies in the middle of a sweep, you need to differentiate between the ones that were zombie when the sweep started, and the ones that were made zombie in a part of the code cache that is already swept. > > Right, we had CodeCache::make_marked_nmethods_zombies() which I removed with JDK-8075805 [1] because it caused other problems. The sweeper is now (and should remain) the only place where nmethods can transition to zombie. > > Thanks again for the review! > > Best regards, > Tobias > > [1] http://cr.openjdk.java.net/~thartmann/8075805/webrev.01/ > >> Thanks for fixing this! >> Nils >>> >>> Best regards, >>> Tobias >>> >>> [1] http://cr.openjdk.java.net/~anoll/8046809/webrev.06/ >>> >>>> Looks good, >>>> Nils >>>> >>>> On 2016-03-18 10:22, Tobias Hartmann wrote: >>>>> Hi, >>>>> >>>>> please review the following patch. >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8136458 >>>>> http://cr.openjdk.java.net/~thartmann/8136458/webrev.00/ >>>>> >>>>> The sweeper removes zombie nmethods only after they were "marked for reclamation" to ensure that there are no inline caches referencing the zombie nmethod. However, this is not required because if a zombie nmethod is encountered again by the sweeper, all ICs pointing to it were already cleaned in the previous sweeper cycle: >>>>> >>>>> alive -> not-entrant/unloaded (may be on the stack) >>>>> cycle 1: not-entrant/unloaded -> zombie (may be referenced by ICs) >>>>> cycle 2: zombie -> marked for reclamation >>>>> cycle 3: marked for reclamation -> flush >>>>> >>>>> In each cycle, we clean all inline caches that point to not-entrant/unloaded/zombie nmethods. Therefore, we know already after sweeper cycle 2 that the zombie nmethod is not referenced by any ICs and we could flush it immediately. >>>>> >>>>> I removed the "marked for reclamation" state.
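The cycle argument above can be condensed into a small standalone model. This is only a hedged sketch, not HotSpot code: the `State` enum, `NMethod` struct, `full_sweep()` and `zombie_since` are invented names, stack scanning is assumed to have already succeeded, and IC cleaning is collapsed into the invariant that one full sweep visits every nmethod.

```cpp
#include <cassert>
#include <vector>

// Toy model of the sweeper argument: nmethods never move in the code
// cache, and one full sweep visits every nmethod and cleans all inline
// caches (ICs) pointing to dead code. So when the sweeper re-encounters a
// zombie in a *later* full cycle, no IC can still reference it and it can
// be flushed directly, with no "marked for reclamation" cycle in between.
enum State { ALIVE, NOT_ENTRANT, ZOMBIE, FLUSHED };

struct NMethod {
  State state = ALIVE;
  int zombie_since = -1;  // sweep cycle in which it became a zombie
};

void full_sweep(std::vector<NMethod>& cache, int cycle) {
  for (NMethod& nm : cache) {
    if (nm.state == NOT_ENTRANT) {
      // Simplification: assume the nmethod is no longer on any stack.
      nm.state = ZOMBIE;
      nm.zombie_since = cycle;
    } else if (nm.state == ZOMBIE && nm.zombie_since < cycle) {
      // A complete sweep has finished since this nmethod became a zombie,
      // so every IC pointing to it has been cleaned: flush immediately.
      nm.state = FLUSHED;
    }
    // Methods that became zombie during *this* sweep (Vladimir's concern)
    // have zombie_since == cycle and must wait for the next full sweep.
  }
}
```

In this model a not-entrant nmethod becomes a zombie in cycle N and is flushed in cycle N+1, matching the shortened state diagram from the review thread.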
The following testing revealed no problems: >>>>> - JPRT >>>>> - RBT with hotspot_all and -Xcomp/-Xmixed >>>>> - 100 iterations of Nashorn + Octane with -XX:StartAggressiveSweepingAt=100/50 -XX:NmethodSweepActivity=500/100 >>>>> >>>>> Thanks, >>>>> Tobias >> From zoltan.majo at oracle.com Fri Mar 18 17:11:34 2016 From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=) Date: Fri, 18 Mar 2016 18:11:34 +0100 Subject: [9] RFR (S): 8148754: C2 loop unrolling fails due to unexpected graph shape In-Reply-To: <56EAE58D.10605@oracle.com> References: <56C1ED18.6060903@oracle.com> <56C26929.4050706@oracle.com> <56CB1997.40107@oracle.com> <56CE6CAF.9090904@oracle.com> <56E99EF1.2030900@oracle.com> <56E9C7AB.8050504@oracle.com> <56EAD6AF.7020204@oracle.com> <56EAE58D.10605@oracle.com> Message-ID: <56EC36C6.70706@oracle.com> Hi Vladimir, On 03/17/2016 06:12 PM, Vladimir Kozlov wrote: > On 3/17/16 9:09 AM, Zoltán Majó wrote: >> Hi Vladimir, >> >> >> thank you for the feedback. >> >> On 03/16/2016 09:52 PM, Vladimir Kozlov wrote: >>>> Can you please let me know which solution you prefer: >>>> - (1) the prototype with the regression solved or >>>> - (2) checking the graph shape? >>> >>> I agree that we should do (2) now. One suggestion I have is to >>> prepare a separate method to do these checks and use it in other >>> places which you pointed - superword and do_range_check. >> >> OK, I updated the patch for (2) so that the check of the graph's >> shape is performed in a separate method. Here is the webrev: >> http://cr.openjdk.java.net/~zmajo/8148754/webrev.04/ > > Zoltan, I don't see changes in do_range_check(). I'm sorry, I forgot to add the check to do_range_check(). > Opcode() is a virtual function. Use the is_*() query methods, as was originally done in SuperWord::get_pre_loop_end(). OK, I changed the checks to use is_* query methods instead of Opcode(). > I don't like the is_adjustable_loop_entry() name, especially since you negate it in checks.
Consider is_canonical_main_loop_entry(). OK, I've updated the name. > Also move assert(cl->is_main_loop(), "") into it. Done. Here is the updated webrev: http://cr.openjdk.java.net/~zmajo/8148754/webrev.05/ I've tested with JPRT and with locally executing the failing test. Both pass. I've started RBT testing. Thank you and best regards, Zoltan > > Thanks, > Vladimir > >> >> I've tested the updated webrev with JPRT and also by executing the >> failing test. Both pass. I will soon start RBT testing as well (will >> let you know if failures have appeared). >> >>> Yes, you can work later on (1) solution if you have a bug and not >>> RFE - we should stabilize C2 as you know and these changes may have >>> some side effects we don't know about yet. But I like it because >>> we can explicitly specify which optimizations are allowed. >> >> OK. I filed 8152110: "Stabilize C2 loop optimizations" and will >> continue work in the scope of that bug: >> https://bugs.openjdk.java.net/browse/JDK-8152110 >>> >>>> The two I'm concerned about are >>>> - Compile::cleanup_loop_predicates() >>> >>> Yes, this one should be marked LOOP_OPTS_LIMITED. >>> >>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/compile.cpp#l1907 >>>> >>>> - IdealLoopTree::remove_main_post_loops() >>> >>> This one is fine because the loop goes away. >> >> I'll take care of these once I continue work on 8152110. >> >> Thank you! >> >> Best regards, >> >> >> Zoltan >> >>> >>> Thanks, >>> Vladimir >>> >>> On 3/16/16 10:59 AM, Zoltán Majó wrote: >>>> Hi Vladimir, >>>> >>>> >>>> I've spent more time on this issue. Please find my findings below. >>>> >>>> On 02/25/2016 03:53 AM, Vladimir Kozlov wrote: >>>>> So it is again a _major_progress problem. >>>>> I have to spend more time on this. It is not simple. >>>>> >>>>> We may add another state when Ideal transformations could be >>>>> executed.
For example, after all loop opts: >>>>> >>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/0fc557e05fc0/src/share/vm/opto/compile.cpp#l2286 >>>>> >>>>> >>>>> Or more states to specify which Ideal transformations and loop >>>>> optimizations could be executed in which state. >>>> >>>> I think adding more states is necessary, adding a single state is >>>> not sufficient as... (see below) >>>> >>>>> >>>>> The main problem from your description is the elimination of Opaque1, >>>>> on which loop optimizations rely. >>>>> >>>>> We can simply remove Opaque1Node::Identity(PhaseGVN* phase) >>>>> because PhaseMacroExpand::expand_macro_nodes() will remove them >>>>> after all loop opts. >>>> >>>> ...there are even more places where Opaque1 nodes are removed than >>>> we initially assumed. >>>> >>>> The two I'm concerned about are >>>> - Compile::cleanup_loop_predicates() >>> >>> Yes, this one should be marked LOOP_OPTS_LIMITED. >>> >>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/compile.cpp#l1907 >>>> >>>> - IdealLoopTree::remove_main_post_loops() >>> >>> This one is fine because the loop goes away. >>> >>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/loopTransform.cpp#l2469 >>>> >>>> >>>>> On the other hand, we may want to execute some simple loop >>>>> optimizations even after Opaque, CastII and CastI2L are optimized >>>>> out. For example, removing empty loops or one-iteration loops >>>>> (pre-loops). >>>>> But definitely not ones which use cloning or other aggressive >>>>> optimizations. >>>> >>>> Yes, I agree that we want to execute some loop optimizations even >>>> afterwards. >>>> >>>> So I've added three states: LOOP_OPTS_FULL, LOOP_OPTS_LIMITED, and >>>> LOOP_OPTS_INHIBITED. These states indicate which loop optimizations >>>> are allowed; _major_progress indicates only whether loop optimizations >>>> have made progress (but not if loop optimizations are expected to >>>> be performed).
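The three-state idea above can be sketched as follows. Only the LOOP_OPTS_* names come from the mail; the `CompileState` struct and the `can_fold_opaque1()` helper are hypothetical illustrations of the intended discipline, not the actual C2 code.

```cpp
#include <cassert>

// Hedged sketch of the proposal: instead of letting transformations such
// as Opaque1 elimination key off Compile::major_progress(), an explicit
// mode records which loop optimizations are still allowed.
enum LoopOptsMode {
  LOOP_OPTS_FULL,      // all loop optimizations may still run
  LOOP_OPTS_LIMITED,   // only simple ones (e.g. empty-loop removal)
  LOOP_OPTS_INHIBITED  // no further loop optimizations
};

struct CompileState {
  LoopOptsMode loop_opts = LOOP_OPTS_FULL;
  // major_progress now only answers "did the last pass change anything?"
  bool major_progress = false;
};

// Opaque1 nodes guard loop-limit expressions that later loop
// optimizations must still recognize; fold them away only once loop
// optimizations are fully inhibited, regardless of major_progress.
bool can_fold_opaque1(const CompileState& c) {
  return c.loop_opts == LOOP_OPTS_INHIBITED;
}
```

The point of the separation is that an IGVN pass accidentally toggling `major_progress` can no longer re-enable graph shapes that loop optimizations cannot handle.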
>>>> >>>>> Inline_Warm() is not used since InlineWarmCalls has been off for a very long >>>>> time. The code could be very rotten by now. So removing >>>>> set_major_progress from it is fine. >>>> >>>> OK. >>>> >>>>> >>>>> It is also fine to remove it from inline_incrementally since it >>>>> will be restored by skip_loop_opts code (and cleared if method is >>>>> empty or set if there are expensive nodes). >>>> >>>> OK. >>>> >>>>> >>>>> LoopNode::Ideal() change seems also fine. LoopNode is created only >>>>> in loop opts (RootNode has its own Ideal()) so if it has TOP input it >>>>> will be removed by RegionNode::Ideal most likely. >>>> >>>> OK. >>>> >>>>> >>>>> Which leaves remove_useless_bool() code only and I have concern >>>>> about it. It could happen after the CCP phase and we may want to >>>>> execute loop opts after it. I actually want to set major progress >>>>> after CCP unconditionally since some If nodes could be folded by it. >>>> >>>> Yes, that makes sense and I did it. >>>> >>>>> >>>>> As you can see it is not simple :( >>>> >>>> No, it's not simple at all. I did a prototype that implements all >>>> we discussed above. Here is the code: >>>> http://cr.openjdk.java.net/~zmajo/code/8148754/webrev/ >>>> >>>> The code is not yet RFR quality, but I've sent it out because I'd >>>> like to have your feedback on how to continue. >>>> >>>> The code fixes the current problem with the unexpected graph shape. >>>> But it is likely to also solve similar problems that are also triggered >>>> by an unexpected graph shape, for example any of the asserts >>>> in PhaseIdealLoop::do_range_check: >>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/loopTransform.cpp#l2124 >>>> >>>> >>>> I evaluated performance of the prototype. Performance improves in a >>>> number of cases by 1-4%: >>>> Octane-Mandreel >>>> Octane-Richards >>>> Octane-Splay >>>> >>>> Unfortunately, there is also a performance regression with >>>> SPECjvm2008-MonteCarlo-G1 (3-5%).
Finding the cause of that >>>> regression is likely to take at least a week, but most likely >>>> even more. >>>> >>>> So my question is: Should I spend more time on this prototype and >>>> fix the performance regression? >>>> >>>> A different solution would be to check the complete graph shape. That >>>> is also done at other places, e.g., in SuperWord::get_pre_loop_end() >>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/3256d4204291/src/share/vm/opto/superword.cpp#l3076 >>>> >>>> >>>> Here is the webrev for the second solution: >>>> http://cr.openjdk.java.net/~zmajo/8148754/webrev.03/ >>>> >>>> The second solution does not regress. I've tested it with: >>>> - JPRT; >>>> - local testing (linux-x86_64) with the failing test case; >>>> - executing all hotspot tests locally, all tests pass that pass >>>> with an unmodified build. >>>> >>>> Can you please let me know which solution you prefer: >>>> - (1) the prototype with the regression solved or >>>> - (2) checking the graph shape? >>>> >>>> We could also fix this issue by pushing (2) for now (as this >>>> issue is a "critical" nightly failure). I could then spend more >>>> time on (1) later in a different bug. >>>> >>>> Thank you and best regards, >>>> >>>> >>>> Zoltan >>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 2/22/16 6:22 AM, Zoltán Majó wrote: >>>>>> Hi Vladimir, >>>>>> >>>>>> >>>>>> thank you for the feedback! >>>>>> >>>>>> On 02/16/2016 01:11 AM, Vladimir Kozlov wrote: >>>>>>> Zoltan, >>>>>>> >>>>>>> It should not be a "main" loop if peeling happened. See do_peeling(): >>>>>>> >>>>>>> if (cl->is_main_loop()) { >>>>>>> cl->set_normal_loop(); >>>>>>> >>>>>>> Split-if optimization should not split through loop's phi. And >>>>>>> generally not through loop's head since it is not making code >>>>>>> better - >>>>>>> split through backedge moves code into loop again. Making loop body >>>>>>> more complicated as this case shows.
>>>>>> >>>>>> I did more investigation to understand what causes the invalid graph >>>>>> shape to appear. It seems that the invalid graph shape appears >>>>>> because >>>>>> the compiler uses Compile::_major_progress inconsistently. >>>>>> Here are >>>>>> some details. >>>>>> >>>>>> - If _major_progress *is set*, the compiler expects more loop >>>>>> optimizations to happen. Therefore, certain transformations on >>>>>> the graph >>>>>> are not allowed so that the graph is in a shape that can be >>>>>> processed by >>>>>> loop optimizations. See: >>>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/convertnode.cpp#l253 >>>>>> >>>>>> >>>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/castnode.cpp#l251 >>>>>> >>>>>> >>>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/loopnode.cpp#l950 >>>>>> >>>>>> >>>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/2c3c43037e14/src/share/vm/opto/opaquenode.cpp#l37 >>>>>> >>>>>> >>>>>> >>>>>> - If _major_progress *is not set*, the compiler is allowed to >>>>>> perform >>>>>> all possible transformations (because it does not have to care about >>>>>> future loop optimizations). >>>>>> >>>>>> The crash reported for the current issue appears because >>>>>> _major_progress >>>>>> *can be accidentally set again* after the compiler decided to stop >>>>>> performing loop optimizations. As a result, invalid graph shapes >>>>>> appear. >>>>>> >>>>>> Here are details about how this happens for both failures I've been >>>>>> studying: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8148754?focusedCommentId=13901941&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13901941 >>>>>> >>>>>> >>>>>> >>>>>> I would propose to change the compiler to use _major_progress >>>>>> consistently. (This goes in the same direction as Tobias's >>>>>> recent work >>>>>> on JDK-8144487.)
>>>>>> >>>>>> I propose that _major_progress: >>>>>> - can be SET when the compiler is initialized (because loop >>>>>> optimizations are expected to happen afterwards); >>>>>> - can be SET/RESET in the scope of loop optimizations (because we >>>>>> want >>>>>> to see if loop optimizations made progress); >>>>>> - cannot be SET/RESET by either incremental inlining or IGVN >>>>>> (even if >>>>>> the IGVN is performed in the scope of loop optimizations). >>>>>> >>>>>> Here is the updated webrev: >>>>>> http://cr.openjdk.java.net/~zmajo/8148754/webrev.02/ >>>>>> >>>>>> Performance evaluation: >>>>>> - The proposed webrev does not cause performance regressions for >>>>>> SPECjvm2008, SPECjbb2005, and Octane. >>>>>> >>>>>> Testing: >>>>>> - all hotspot JTREG tests on all supported platforms; >>>>>> - JPRT; >>>>>> - failing test case. >>>>>> >>>>>> Thank you and best regards, >>>>>> >>>>>> >>>>>> Zoltan >>>>>> >>>>>> >>>>>>> >>>>>>> Bailout unrolling is fine but performance may suffer because in >>>>>>> some >>>>>>> cases loop unrolling is better than split-if. >>>>>> >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>> On 2/15/16 7:22 AM, Zoltán Majó wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> >>>>>>>> please review the patch for 8148754. >>>>>>>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8148754 >>>>>>>> >>>>>>>> Problem: Compilation fails when the C2 compiler attempts loop >>>>>>>> unrolling. >>>>>>>> The cause of the failure is that the loop unrolling >>>>>>>> optimization expects >>>>>>>> a well-defined graph shape at the entry control of a >>>>>>>> 'CountedLoopNode' >>>>>>>> ('IfTrue'/'IfFalse' preceded by 'If' preceded by 'Bool' >>>>>>>> preceded by >>>>>>>> 'CmpI'). >>>>>>>> >>>>>>>> >>>>>>>> Solution: I investigated several different instances of the same >>>>>>>> failure.
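The expected entry shape discussed in this thread ('IfTrue'/'IfFalse' preceded by 'If' preceded by 'Bool' preceded by 'CmpI') can be illustrated with a toy checker in the spirit of the bail-out approach chosen for the fix. The `Node` type and its single parent pointer are assumptions made for this sketch; HotSpot's real IR nodes have multiple inputs and typed query methods.

```cpp
#include <cassert>
#include <string>

// Standalone toy version (not HotSpot's real IR classes) of a shape check
// like is_canonical_main_loop_entry(): the entry control of a main
// CountedLoop must be an IfTrue/IfFalse projection of an If whose
// condition is a Bool on a CmpI. Returning false lets the caller bail out
// of unrolling instead of crashing on an unexpected shape.
struct Node {
  std::string op;
  const Node* in;  // the one input this sketch cares about
};

bool is_canonical_main_loop_entry(const Node* entry) {
  if (entry == nullptr ||
      (entry->op != "IfTrue" && entry->op != "IfFalse")) return false;
  const Node* iff = entry->in;
  if (iff == nullptr || iff->op != "If") return false;
  const Node* bol = iff->in;
  if (bol == nullptr || bol->op != "Bool") return false;
  const Node* cmp = bol->in;
  return cmp != nullptr && cmp->op == "CmpI";
}
```

Centralizing the check in one helper, as the review suggests, lets loop unrolling, superword, and do_range_check() all bail out consistently on non-canonical shapes produced by split-if or peeling.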
It turns out that the shape of the graph at a loop's >>>>>>>> entry >>>>>>>> control is often different from the way loop unrolling expects >>>>>>>> it to be >>>>>>>> (please find some examples in the bug's JBS issue). The various >>>>>>>> graph >>>>>>>> shapes are a result of previously performed transformations, e.g., >>>>>>>> split-if optimization and loop peeling. >>>>>>>> >>>>>>>> Loop unrolling requires the above-mentioned graph shape so that >>>>>>>> it can >>>>>>>> adjust the zero-trip guard of the loop. With the unexpected graph >>>>>>>> shapes, it is not possible to perform loop unrolling. However, >>>>>>>> the graph >>>>>>>> is still in a valid state (except for loop unrolling) and can >>>>>>>> be used to >>>>>>>> produce correct code. >>>>>>>> >>>>>>>> I propose that (1) we check if an unexpected graph shape is >>>>>>>> encountered >>>>>>>> and (2) bail out of loop unrolling if it is (but not fail in the >>>>>>>> compiler in such cases). >>>>>>>> >>>>>>>> The failure was triggered by Aleksey's Indify String Concatenation >>>>>>>> changes but the generated bytecodes are valid. So this seems to >>>>>>>> be a >>>>>>>> compiler issue that was previously there but was not yet >>>>>>>> triggered. >>>>>>>> >>>>>>>> >>>>>>>> Webrev: >>>>>>>> http://cr.openjdk.java.net/~zmajo/8148754/webrev.00/ >>>>>>>> >>>>>>>> Testing: >>>>>>>> - JPRT; >>>>>>>> - local testing (linux-x86_64) with the failing test case;