From dean.long at oracle.com  Wed May  1 01:06:52 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Tue, 30 Apr 2019 18:06:52 -0700
Subject: [13] RFR (S): 8223171: Redundant nmethod dependencies for effectively final methods
In-Reply-To:
References:
Message-ID:

Does this allow us to assert !uniqm->can_be_statically_bound() in
Dependencies::assert_unique_concrete_method?

dl

On 4/30/19 12:59 PM, Vladimir Ivanov wrote:
> http://cr.openjdk.java.net/~vlivanov/8223171/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8223171
>
> Both C1 & C2 may register redundant nmethod dependencies (which
> always hold). For example, for instance methods on final classes.
>
> Moreover, C2 does add dependencies for private methods.
>
> The patch enhances the checks and unifies them between C1 & C2.
>
> Testing: tier1-4
>
> Best regards,
> Vladimir Ivanov

From sandhya.viswanathan at intel.com  Wed May  1 01:14:53 2019
From: sandhya.viswanathan at intel.com (Viswanathan, Sandhya)
Date: Wed, 1 May 2019 01:14:53 +0000
Subject: RFR (M) 8222074: Enhance auto vectorization for x86
In-Reply-To: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com>
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB5C2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB845@FMSMSX126.amr.corp.intel.com> <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com> <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com>
Message-ID: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com>

Hi VladimirK,

JBS: https://bugs.openjdk.java.net/browse/JDK-8222074

Please find updated webrev at:
http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.01/

With this webrev the ad file has only about 60 lines effectively added.
Also the generated product libjvm.so size only increases by about 0.26% vs the prior 1.50%.
I have used multiple match rules in one instruct for same-size shift-related rules and also for the new Abs/Neg rules.
What I noticed is that the adlc still duplicates a lot of code, and there is potential to further improve code size for the multiple match rule case by improving the adlc itself.
The adlc improvement (like removing duplicate emits, formats, expand, pipeline etc.) can be done as a separate RFE.

In this webrev, I have also fixed the errors reported by Vladimir Ivanov and corrected the issues reported by the jcheck tool.
I have also taken into account reducing the temporaries by using TEMP dst for multiply rules.

The compiler jtreg tests and the java math tests pass on Haswell, SKX, and KNL.

Your review and feedback are welcome.

Best Regards,
Sandhya


-----Original Message-----
From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Viswanathan, Sandhya
Sent: Wednesday, April 10, 2019 10:22 AM
To: Vladimir Kozlov ; B. Blaser
Cc: hotspot-compiler-dev at openjdk.java.net
Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86

Yes good catch, in mul32B_reg_avx(), the last two instructions are the only place where dst is used:

    __ vpackuswb($dst$$XMMRegister, $tmp2$$XMMRegister, $tmp1$$XMMRegister, vector_len);
    __ vpermq($dst$$XMMRegister, $dst$$XMMRegister, 0xD8, vector_len);

Here dst can be the same as tmp2 or tmp1 in packuswb() and so the effect TEMP dst is not required.

Best Regards,
Sandhya


-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
Sent: Wednesday, April 10, 2019 9:59 AM
To: Viswanathan, Sandhya ; B. Blaser
Cc: hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86

On 4/10/19 8:36 AM, Viswanathan, Sandhya wrote:
> Hi Bernard,
>
> One could add TEMP dst in effect() to let the register allocator know that dst needs to be different from src.

Yes, we use this way. Or, in the mul4B_reg() case, we can use $dst instead of $tmp2 to avoid overwriting
$src2 before we get the value from it if $dst = $src2.

On the other hand, mul32B_reg_avx() and others have the 'TEMP dst' effect but $dst is used only for the final result.

It is a bit of a mess which may cause inefficient use of registers in compiled code.

Thanks,
Vladimir

>
> Best Regards,
> Sandhya
>
>
> -----Original Message-----
> From: B. Blaser [mailto:bsrbnd at gmail.com]
> Sent: Wednesday, April 10, 2019 4:10 AM
> To: Viswanathan, Sandhya
> Cc: Vladimir Kozlov ; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86
>
> Hi Sandhya and Vladimir K.,
>
> On Wed, 10 Apr 2019 at 03:06, Viswanathan, Sandhya wrote:
>>
>> Hi Vladimir,
>>
>> Yes, I missed the question below:
>>>> There are cases where we can use fewer `TEMP tmp` registers by using the 'dst' register like in mul4B_reg(). Is it intentional to not use 'dst' there?
>>
>> No it is not intentional, we can use the dst register in those cases and reduce the tmps.
>
> I guess we have to be careful using $dst instead of $tmp registers as the allocator sometimes provides identical $src & $dst. Also, I'm not sure this would be possible in the case of mul4B_reg():
>
>   format %{"pmovsxbw  $tmp,$src1\n\t"
>             "pmovsxbw  $tmp2,$src2\n\t"
>
> I believe this couldn't work if you use $dst instead of $tmp and $dst = $src2, what do you think?
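The two substitutions discussed in this exchange can be illustrated outside HotSpot with a toy register file. The sketch below is not ADL or MacroAssembler code: RegFile, mul_dst_for_tmp and mul_dst_for_tmp2 are hypothetical names, plain integer copies stand in for pmovsxbw, and it assumes the allocator may assign dst and src2 to the same register when no TEMP dst effect is declared.

```cpp
#include <array>
#include <cstdint>

// Toy register file: indices stand in for XMM registers (hypothetical model).
using RegFile = std::array<int32_t, 4>;

// Variant A: $dst takes the place of $tmp.
// dst is written before src2 is read, so dst == src2 clobbers the input.
int32_t mul_dst_for_tmp(RegFile r, int dst, int src1, int src2, int tmp2) {
  r[dst]  = r[src1];   // models "pmovsxbw $dst,$src1"
  r[tmp2] = r[src2];   // models "pmovsxbw $tmp2,$src2" (stale if dst == src2)
  r[dst] *= r[tmp2];
  return r[dst];
}

// Variant B: $dst takes the place of $tmp2.
// src2 is read in the same step that writes dst, so aliasing is harmless.
int32_t mul_dst_for_tmp2(RegFile r, int dst, int src1, int src2, int tmp) {
  r[tmp] = r[src1];    // models "pmovsxbw $tmp,$src1"
  r[dst] = r[src2];    // models "pmovsxbw $dst,$src2" (self-copy when aliased)
  r[dst] *= r[tmp];
  return r[dst];
}
```

With registers {3, 5, 0, 0}, src1 = 0 and src2 = 1, variant A returns 9 instead of 15 when dst is also register 1, while variant B returns 15 whether or not dst aliases src2.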
>
> Thanks,
> Bernard
>

From rahul.v.raghavan at oracle.com  Wed May  1 16:39:46 2019
From: rahul.v.raghavan at oracle.com (Rahul Raghavan)
Date: Wed, 1 May 2019 22:09:46 +0530
Subject: [13] RFR: 8202414: Unsafe write after primitive array creation may result in array length change
In-Reply-To:
References: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com> <392c665f-869c-29af-4fc5-e6f844820846@oracle.com> <3db5d7ab-ad99-310b-e891-fc36d25da338@oracle.com> <7b03a213-7fee-a87f-b48d-250662e730ef@oracle.com> <959abf54-d1da-95ee-9cf6-6c6d8ec5e4a1@oracle.com> <18115aa8-edaa-31b9-02a6-06721d9fbfc9@oracle.com> <939f3f5d-b8e7-939f-8953-d34a0f3ff6c9@oracle.com> <259ef902-778b-7eef-46e2-d1927950d21c@oracle.com> <73f7c647-3194-2a65-6cc6-a15cbf6c82be@oracle.com> <37837126-c9d5-1bb1-fc9a-6fb9b848efbe@oracle.com> <28955bc6-020a-29e1-953c-e9f48932cd56@oracle.com>
Message-ID: <5fb42e98-f923-b3d8-ee04-6d899fb2ac2d@oracle.com>

Thank you Vladimir.

On 30/04/19 10:45 PM, Vladimir Ivanov wrote:
> Looks good!
>
> Best regards,
> Vladimir Ivanov
>
> On 30/04/2019 00:04, Rahul Raghavan wrote:
>> Thank you Vladimir Ivanov for suggestions.
>>
>> Please note following latest changes tried.
>>  - http://cr.openjdk.java.net/~rraghavan/8202414/webrev.04/
>>
>> Hope did not miss any points.
>> Confirmed no failures with the reported test cases.
>> Also hs-tier1 to tier4, hs-precheckin-comp testing in progress.
>>
>> Thanks,
>> Rahul

From vladimir.x.ivanov at oracle.com  Wed May  1 16:40:49 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 1 May 2019 09:40:49 -0700
Subject: RFR (M) 8222074: Enhance auto vectorization for x86
In-Reply-To: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com>
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB5C2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB845@FMSMSX126.amr.corp.intel.com> <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com> <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com>
Message-ID:

So far, testing spotted a couple of minor issues:

windows build broken:

jib > t:/workspace/build/windows-x64/hotspot/variant-server/gensrc/adfiles/ad_x86.cpp(1572): error C2220: warning treated as error - no 'object' file generated
jib > t:/workspace/build/windows-x64/hotspot/variant-server/gensrc/adfiles/ad_x86.cpp(1572): warning C4101: 'inst': unreferenced local variable
jib > t:/workspace/build/windows-x64/hotspot/variant-server/gensrc/adfiles/ad_x86.cpp(1600): warning C4101: 'inst': unreferenced local variable
jib > t:/workspace/build/windows-x64/hotspot/variant-server/gensrc/adfiles/ad_x86.cpp(1616): warning C4101: 'inst': unreferenced local variable
jib > t:/workspace/build/windows-x64/hotspot/variant-server/gensrc/adfiles/ad_x86.cpp(1632): warning C4101: 'inst': unreferenced local variable

compiler/graalunit/HotspotTest.java:

org.graalvm.compiler.hotspot.test.CRC32SubstitutionsTest finished 1685.0 ms
org.graalvm.compiler.hotspot.test.CheckGraalIntrinsics started (7 of 44)
  test: FAILED
test(org.graalvm.compiler.hotspot.test.CheckGraalIntrinsics)
java.lang.AssertionError: missing Graal intrinsics for:
  java/lang/Math.abs(I)I
  java/lang/Math.abs(J)J
    at org.graalvm.compiler.hotspot.test.CheckGraalIntrinsics.test(CheckGraalIntrinsics.java:646)

I'll respond on the patch itself separately.

Best regards,
Vladimir Ivanov

From vladimir.x.ivanov at oracle.com  Wed May  1 16:52:16 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 1 May 2019 09:52:16 -0700
Subject: [13] RFR (S): 8223171: Redundant nmethod dependencies for effectively final methods
In-Reply-To:
References:
Message-ID: <5cdb781f-1922-9a5d-9f52-f6874fd6a259@oracle.com>

> Does this allow us to assert !uniqm->can_be_statically_bound() in
> Dependencies::assert_unique_concrete_method?

In general, no. It doesn't hold for final methods: dependency is still
needed when context is broad enough, since an overriding method can be
loaded in a different part of the hierarchy (under the same context
class).

In case of the adjusted checks it's safe, since context == method
holder when actual_receiver->is_final() == true.

    if (!callee->is_final_method() && !callee->is_private() && !actual_receiver->is_final()) {
      dependencies()->assert_unique_concrete_method(actual_receiver, cha_monomorphic_target);
    }

I refactored the patch a bit:
  http://cr.openjdk.java.net/~vlivanov/8223171/webrev.01/

>> Moreover, C2 does add dependencies for private methods.

I take it back. Earlier checks handle private methods. Only methods on
final classes get redundant dependencies.
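As a rough illustration of the guard quoted above, here is a minimal standalone sketch. ToyMethod, ToyKlass and needs_unique_concrete_method_dependency are hypothetical stand-ins, not HotSpot types or functions; only the three predicates that appear in the snippet are modeled.

```cpp
// Hypothetical stand-ins for the compiler-interface method and class objects;
// only the three predicates used by the quoted check are modeled.
struct ToyMethod {
  bool is_final_method;
  bool is_private;
};

struct ToyKlass {
  bool is_final;
};

// Mirrors the condition in the quoted snippet: the dependency is skipped for
// final methods, private methods, and methods invoked on a final receiver class.
bool needs_unique_concrete_method_dependency(const ToyMethod& callee,
                                             const ToyKlass& actual_receiver) {
  return !callee.is_final_method && !callee.is_private && !actual_receiver.is_final;
}
```

Under these assumptions, an ordinary virtual method called on a non-final receiver class is the only combination for which the sketch asks for a dependency to be recorded.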
Best regards,
Vladimir Ivanov

From vladimir.x.ivanov at oracle.com  Wed May  1 16:52:56 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 1 May 2019 09:52:56 -0700
Subject: [13] RFR (S): 8219902: C2: MemNode::can_see_stored_value() ignores casts which carry control dependency
In-Reply-To: <147e1906-381d-4a3d-0a24-23e9c149581d@oracle.com>
References: <8ab7b14b-d42d-37ea-e6d7-151d068c57f0@oracle.com> <147e1906-381d-4a3d-0a24-23e9c149581d@oracle.com>
Message-ID: <4c67bc3e-deac-d6ad-0a63-2e01193100cb@oracle.com>

Thanks, Vladimir.

Best regards,
Vladimir Ivanov

On 30/04/2019 13:02, Vladimir Kozlov wrote:
> Looks good.
>
> Thanks,
> Vladimir K
>
> On 4/30/19 12:47 PM, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~vlivanov/8219902/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8219902
>>
>> JDK-8161334 [1] enhanced MemNode::can_see_stored_value to ignore casts
>> when access base addresses are compared. It turned out to be too
>> aggressive since casts may carry control dependency.
>>
>> Proposed fix is to keep casts with control dependency.
>>
>> Testing: failing test case, tier1-3
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8161334

From john.r.rose at oracle.com  Wed May  1 18:20:03 2019
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 1 May 2019 11:20:03 -0700
Subject: [13] RFR: 8202414: Unsafe write after primitive array creation may result in array length change
In-Reply-To:
References: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com> <392c665f-869c-29af-4fc5-e6f844820846@oracle.com> <3db5d7ab-ad99-310b-e891-fc36d25da338@oracle.com> <7b03a213-7fee-a87f-b48d-250662e730ef@oracle.com> <959abf54-d1da-95ee-9cf6-6c6d8ec5e4a1@oracle.com> <18115aa8-edaa-31b9-02a6-06721d9fbfc9@oracle.com> <939f3f5d-b8e7-939f-8953-d34a0f3ff6c9@oracle.com> <259ef902-778b-7eef-46e2-d1927950d21c@oracle.com> <73f7c647-3194-2a65-6cc6-a15cbf6c82be@oracle.com> <37837126-c9d5-1bb1-fc9a-6fb9b848efbe@oracle.com> <28955bc6-020a-29e1-953c-e9f48932cd56@oracle.com>
Message-ID: <0E11910B-A4B0-4F64-9B87-5A4BF065B9D2@oracle.com>

Here's a late comment: Is there any reason to put the deletion
and insertion in different places? If not, it would be easier to follow the
history, and to do merges, if they were placed at the same point
in the code. That is, insert the new code where the old code is
deleted.

On Apr 30, 2019, at 12:04 AM, Rahul Raghavan wrote:
>
> Thank you Vladimir Ivanov for suggestions.
>
> Please note following latest changes tried.
> - http://cr.openjdk.java.net/~rraghavan/8202414/webrev.04/
>
> Hope did not miss any points.
> Confirmed no failures with the reported test cases.
> Also hs-tier1 to tier4, hs-precheckin-comp testing in progress.
>
> Thanks,
> Rahul
>
> On 27/04/19 11:48 AM, Vladimir Ivanov wrote:
>> On 26/04/2019 19:30, Vladimir Ivanov wrote:
>> After thinking more about it, I believe the new offset alignment check supersedes is_unaligned_access(). And is_mismatched_access() is too conservative here: what is_mismatched_access() adds here (in addition to the existing alignment & size checks) is whether the types of the location and the stored value match, but what matters for IN are sizes and offsets only.
>> Type mismatches (e.g., byte vs boolean, char vs short) may cause problems when subsequent loads are replaced with values from initializing stores, but it should be already handled in MemNode::can_see_stored_value() and Load?Node::Ideal().
>> So, it seems both checks (is_unaligned_access() & is_mismatched_access()) can be safely omitted.
>> You are right, I missed that IN::captured_store_insertion_point() already inspects other stores which are already captured. Sorry for the confusion.
>> I agree that IN::can_capture_store() is the right place to put the fix in and I like (iii). (Just add a comment, "// mismatched access" is enough)

From sandhya.viswanathan at intel.com  Wed May  1 19:11:02 2019
From: sandhya.viswanathan at intel.com (Viswanathan, Sandhya)
Date: Wed, 1 May 2019 19:11:02 +0000
Subject: RFR (M) 8222074: Enhance auto vectorization for x86
In-Reply-To:
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB5C2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB845@FMSMSX126.amr.corp.intel.com> <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com> <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com>
Message-ID: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB553B@FMSMSX126.amr.corp.intel.com>

Thanks a lot Vladimir. I will look into fixing these.

Best Regards,
Sandhya

From dean.long at oracle.com  Wed May  1 19:20:56 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Wed, 1 May 2019 12:20:56 -0700
Subject: [13] RFR (S): 8223171: Redundant nmethod dependencies for effectively final methods
In-Reply-To: <5cdb781f-1922-9a5d-9f52-f6874fd6a259@oracle.com>
References: <5cdb781f-1922-9a5d-9f52-f6874fd6a259@oracle.com>
Message-ID:

Can you also add check_unique_method(ctxk, uniqm) to the version of
assert_unique_concrete_method that takes a Method*? Otherwise, the
changes look good to me.

dl

On 5/1/19 9:52 AM, Vladimir Ivanov wrote:
>
>> Does this allow us to assert !uniqm->can_be_statically_bound() in
>> Dependencies::assert_unique_concrete_method?
>
> In general, no. It doesn't hold for final methods: dependency is still
> needed when context is broad enough, since an overriding method can be
> loaded in a different part of the hierarchy (under the same context
> class).
>
> In case of the adjusted checks it's safe, since context == method
> holder when actual_receiver->is_final() == true.
>
>     if (!callee->is_final_method() && !callee->is_private() &&
>         !actual_receiver->is_final()) {
>       dependencies()->assert_unique_concrete_method(actual_receiver,
>                                                     cha_monomorphic_target);
>     }
>
> I refactored the patch a bit:
>   http://cr.openjdk.java.net/~vlivanov/8223171/webrev.01/
>
>>> Moreover, C2 does add dependencies for private methods.
>
> I take it back. Earlier checks handle private methods. Only methods on
> final classes get redundant dependencies.
>
> Best regards,
> Vladimir Ivanov

From vladimir.x.ivanov at oracle.com  Wed May  1 21:58:01 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 1 May 2019 14:58:01 -0700
Subject: RFR (M) 8222074: Enhance auto vectorization for x86
In-Reply-To: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com>
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB5C2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB845@FMSMSX126.amr.corp.intel.com> <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com> <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com>
Message-ID: <0cd3fd93-0f1e-a6d0-d4c3-f8d95b533ff7@oracle.com>

> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.01/

Nice job, Sandhya! Glad to hear the approach pays off!

Unfortunately, I must note that AD file becomes much more obscure.
Especially with those function pointers.

void emit_vshift16B_code(MacroAssembler& _masm, int opcode, XMMRegister dst,
                         XMMRegister src, XMMRegister shift,
                         XMMRegister tmp1, XMMRegister tmp2, Register scratch) {
  XX_Inst extendinst = get_extend_inst(opcode == Op_URShiftVB ? false : true);
  XX_Inst shiftinst = get_xx_inst(opcode);

  (_masm.*extendinst)(tmp1, src);
  (_masm.*shiftinst)(tmp1, shift);
  __ pshufd(tmp2, src, 0xE);
  (_masm.*extendinst)(tmp2, tmp2);
  (_masm.*shiftinst)(tmp2, shift);
  __ movdqu(dst, ExternalAddress(vector_short_to_byte_mask()), scratch);
  __ pand(tmp2, dst);
  __ pand(dst, tmp1);
  __ packuswb(dst, tmp2);
}

Have you tried to encapsulate that into x86-specific MacroAssembler?
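For readers unfamiliar with the pointer-to-member pattern in the helper quoted above, here is a toy comparison of that style with an encapsulated method. ToyAssembler, extend_bytes, Inst and emit_with_pointer are hypothetical names invented for this sketch; the assembler only records mnemonics, and the mapping of signed/unsigned to pmovsxbw/pmovzxbw is an assumption based on the "pmovxbw" placeholder in the format string.

```cpp
#include <string>
#include <vector>

// Hypothetical toy assembler that records mnemonics instead of emitting code.
class ToyAssembler {
 public:
  void pmovsxbw() { log_.push_back("pmovsxbw"); }
  void pmovzxbw() { log_.push_back("pmovzxbw"); }

  // Encapsulated alternative: the instruction choice lives inside the assembler.
  void extend_bytes(bool is_signed) {
    if (is_signed) {
      pmovsxbw();
    } else {
      pmovzxbw();
    }
  }

  const std::vector<std::string>& log() const { return log_; }

 private:
  std::vector<std::string> log_;
};

// Pointer-to-member style, similar in shape to the quoted helper.
using Inst = void (ToyAssembler::*)();

Inst get_extend_inst(bool is_signed) {
  return is_signed ? &ToyAssembler::pmovsxbw : &ToyAssembler::pmovzxbw;
}

void emit_with_pointer(ToyAssembler& masm, bool is_signed) {
  Inst extendinst = get_extend_inst(is_signed);
  (masm.*extendinst)();  // call through the member-function pointer
}
```

Both paths record the same instruction; the difference is only where the selection logic lives, which is the readability point raised in the review.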
8682 instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{ 8683 predicate(UseSSE > 3 && UseAVX <= 1 && n->as_Vector()->length() == 16); 8684 match(Set dst (LShiftVB src shift)); 8685 match(Set dst (RShiftVB src shift)); 8686 match(Set dst (URShiftVB src shift)); 8687 effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch); 8688 format %{"pmovxbw $tmp1,$src\n\t" 8689 "shiftop $tmp1,$shift\n\t" 8690 "pshufd $tmp2,$src\n\t" 8691 "pmovxbw $tmp2,$tmp2\n\t" 8692 "shiftop $tmp2,$shift\n\t" 8693 "movdqu $dst,[0x00ff00ff0x00ff00ff]\n\t" 8694 "pand $tmp2,$dst\n\t" 8695 "pand $dst,$tmp1\n\t" 8696 "packuswb $dst,$tmp2\n\t! packed16B shift" %} 8697 ins_encode %{ 8698 emit_vshift16B_code(_masm, this->as_Mach()->ideal_Opcode() , $dst$$XMMRegister, $src$$XMMRegister, $shift$$XMMRegister, $tmp1$$XMMRegister, $tmp2$$XMMRegister, $scratch$$Register); 8699 %} 8700 ins_pipe( pipe_slow ); 8701 %} can be turned into something like: instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{ predicate(n->as_Vector()->length() == 16); match(Set dst (LShiftVB src shift)); match(Set dst (RShiftVB src shift)); match(Set dst (URShiftVB src shift)); effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch); format %{"packed16B shift" %} ins_encode %{ int vlen = 0; // 128-bit BasicType elem_type = T_BYTE; int shift_mode = ...; // L/R/UR or S/U + L/R __ vshift(vlen, elem_type, shift_mode, $dst$$..., $src$$..., $shift$$..., $tmp1$$..., $tmp2$$..., $scratch$$...); %} Then MA::vshift can dispatch between different implementations depending on SSE/AVX level available. Do you see any problems with that from footprint perspective? Ideally, I'd prefer to see a library of operations on vectors encapsulated in MacroAssembler (or a subclass) and used in x86.ad. That will accommodate further reductions in AD instructions needed. Best regards, Vladimir Ivanov > With this webrev the ad file has only about 60 lines effectively added. 
> Also the generated product libjvm.so size only increases by about 0.26% vs the prior 1.50%. > I have used multiple match rules in one instruct for same size shift related rules and also for the new Abs/Neg rules. > What I noticed is that the adlc still duplicates lot of code and there is potential to further improve code size for multiple match rule case by improving the adlc itself. > The adlc improvement (like removing duplicate emits, formats, expand, pipeline etc) can be done as a separate RFE. > > In this webrev, I have also fixed the errors reported by Vladimir Ivanov and corrected the issues reported by jcheck tool. > Also taken into account reducing the temporary by using TEMP dst for multiply rules. > > The compiler jtreg tests and the java math tests pass on Haswell, SKX, and KNL. > > Your review and feedback is welcome. > > Best Regards, > Sandhya > > > -----Original Message----- > From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Viswanathan, Sandhya > Sent: Wednesday, April 10, 2019 10:22 AM > To: Vladimir Kozlov ; B. Blaser > Cc: hotspot-compiler-dev at openjdk.java.net > Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86 > > Yes good catch, in mul32B_reg_avx(), the last two instructions are the only place where dst is used: > > __ vpackuswb($dst$$XMMRegister, $tmp2$$XMMRegister, $tmp1$$XMMRegister, vector_len); > __ vpermq($dst$$XMMRegister, $dst$$XMMRegister, 0xD8, vector_len); > > Here dst can be same as tmp2 or tmp1 in packuswb() and so the effect TEMP dst is not required. > > Best Regards, > Sandhya > > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Wednesday, April 10, 2019 9:59 AM > To: Viswanathan, Sandhya ; B. 
Blaser > Cc: hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 > > On 4/10/19 8:36 AM, Viswanathan, Sandhya wrote: >> Hi Bernard, >> >> One could add TEMP dst in effect() to let the register allocator know that dst needs to be different from src. > > Yes, we use this way. Or, in mul4B_reg() case, we can use $dst instead $tmp2 to avoid overwriting > $src2 before we get value from it if $dst = $src2. > > On other hand, mul32B_reg_avx() and other have 'TEMP dst' effect but $dst is used only for final result. > > It is a little mess which may cause ineffective use of registers in compiled code. > > Thanks, > Vladimir > >> >> Best Regards, >> Sandhya >> >> >> -----Original Message----- >> From: B. Blaser [mailto:bsrbnd at gmail.com] >> Sent: Wednesday, April 10, 2019 4:10 AM >> To: Viswanathan, Sandhya >> Cc: Vladimir Kozlov ; hotspot-compiler-dev at openjdk.java.net >> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >> >> Hi Sandhya and Vladimir K., >> >> On Wed, 10 Apr 2019 at 03:06, Viswanathan, Sandhya wrote: >>> >>> Hi Vladimir, >>> >>> Yes, I missed the question below: >>>>> There are cases where we can use less `TEMP tmp` registers by using 'dst' register like in mul4B_reg(). Is it intentional to not use 'dst' there? >>> >>> No it is not intentional, we can use the dst register in those cases and reduced the tmps. >> >> I guess we have to be careful using $dst instead of $tmp registers as the allocator sometimes provides identical $src & $dst. Also, I'm not sure this would be possible in the case of mul4B_reg(): >> >> 7349 format %{"pmovsxbw $tmp,$src1\n\t" >> 7350 "pmovsxbw $tmp2,$src2\n\t" >> >> I believe this couldn't work if you use $dst instead of $tmp and $dst = $src2, what do you think? 
>>
>> Thanks,
>> Bernard
>>

From sandhya.viswanathan at intel.com Wed May 1 22:09:49 2019
From: sandhya.viswanathan at intel.com (Viswanathan, Sandhya)
Date: Wed, 1 May 2019 22:09:49 +0000
Subject: RFR (M) 8222074: Enhance auto vectorization for x86
In-Reply-To: <0cd3fd93-0f1e-a6d0-d4c3-f8d95b533ff7@oracle.com>
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB5C2@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB845@FMSMSX126.amr.corp.intel.com>
 <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com>
 <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com>
 <0cd3fd93-0f1e-a6d0-d4c3-f8d95b533ff7@oracle.com>
Message-ID: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB569D@FMSMSX126.amr.corp.intel.com>

Hi Vladimir,

I agree, I wanted to show both the approaches in this patch to get your feedback:
1) with emit as a function
2) with emit part in the instruct body itself

With emit as a function it becomes hard to read and I personally prefer it in the instruct itself as is done for vabsneg2D etc. That is what you are recommending as well so I feel good.

Once the adlc enhancement is done both the approaches should give similar binary size. Till then there will be small overhead with approach 2) as emit is duplicated per match rule.

I will send an updated patch fixing the two issues you mentioned in your previous email plus this change of using approach 2).

Please do let me know if you want to see any other change in this patch.
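
The readability concern with emit as a function, as raised in this thread, comes largely from selecting instructions through pointers to member functions. The following is only a minimal sketch of that contrast, not code from the webrev: ToyAssembler, ShiftMode, select_shift, vshift and emit_shift_via_pointer are hypothetical names, the toy assembler records mnemonics instead of emitting bytes, and registers are plain integers.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an assembler; it logs mnemonics only.
// None of these names are HotSpot APIs.
class ToyAssembler {
 public:
  enum ShiftMode { kLeft, kArithmeticRight, kLogicalRight };
  typedef void (ToyAssembler::*ShiftInst)(int dst, int shift);

  void psllw(int dst, int shift) { record("psllw", dst, shift); }
  void psraw(int dst, int shift) { record("psraw", dst, shift); }
  void psrlw(int dst, int shift) { record("psrlw", dst, shift); }

  // Style 1: hand the caller a pointer to the member function to invoke.
  static ShiftInst select_shift(ShiftMode mode) {
    switch (mode) {
      case kLeft:            return &ToyAssembler::psllw;
      case kArithmeticRight: return &ToyAssembler::psraw;
      case kLogicalRight:    return &ToyAssembler::psrlw;
    }
    assert(false && "unknown shift mode");
    return nullptr;
  }

  // Style 2: the assembler dispatches on the mode itself.
  void vshift(ShiftMode mode, int dst, int shift) {
    switch (mode) {
      case kLeft:            psllw(dst, shift); break;
      case kArithmeticRight: psraw(dst, shift); break;
      case kLogicalRight:    psrlw(dst, shift); break;
    }
  }

  const std::vector<std::string>& log() const { return log_; }

 private:
  void record(const char* op, int dst, int shift) {
    log_.push_back(std::string(op) + " xmm" + std::to_string(dst) +
                   ",xmm" + std::to_string(shift));
  }
  std::vector<std::string> log_;
};

// Free-function emitter in style 1: the call goes through a member pointer.
void emit_shift_via_pointer(ToyAssembler& masm, ToyAssembler::ShiftMode mode,
                            int dst, int shift) {
  ToyAssembler::ShiftInst inst = ToyAssembler::select_shift(mode);
  (masm.*inst)(dst, shift);
}
```

Both styles record the same instruction for the same mode; the difference is only where the selection logic lives and how the call site reads.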
Best Regards,
Sandhya


-----Original Message-----
From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
Sent: Wednesday, May 01, 2019 2:58 PM
To: Viswanathan, Sandhya ; Vladimir Kozlov
Cc: hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86

> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.01/

Nice job, Sandhya! Glad to hear the approach pays off!

Unfortunately, I must note that AD file becomes much more obscure. Especially with those function pointers.

void emit_vshift16B_code(MacroAssembler& _masm, int opcode, XMMRegister dst,
                         XMMRegister src, XMMRegister shift,
                         XMMRegister tmp1, XMMRegister tmp2, Register scratch) {
  XX_Inst extendinst = get_extend_inst(opcode == Op_URShiftVB ? false : true);
  XX_Inst shiftinst = get_xx_inst(opcode);

  (_masm.*extendinst)(tmp1, src);
  (_masm.*shiftinst)(tmp1, shift);
  __ pshufd(tmp2, src, 0xE);
  (_masm.*extendinst)(tmp2, tmp2);
  (_masm.*shiftinst)(tmp2, shift);
  __ movdqu(dst, ExternalAddress(vector_short_to_byte_mask()), scratch);
  __ pand(tmp2, dst);
  __ pand(dst, tmp1);
  __ packuswb(dst, tmp2);
}

Have you tried to encapsulate that into x86-specific MacroAssembler?

instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{
  predicate(UseSSE > 3 && UseAVX <= 1 && n->as_Vector()->length() == 16);
  match(Set dst (LShiftVB src shift));
  match(Set dst (RShiftVB src shift));
  match(Set dst (URShiftVB src shift));
  effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch);
  format %{"pmovxbw $tmp1,$src\n\t"
           "shiftop $tmp1,$shift\n\t"
           "pshufd $tmp2,$src\n\t"
           "pmovxbw $tmp2,$tmp2\n\t"
           "shiftop $tmp2,$shift\n\t"
           "movdqu $dst,[0x00ff00ff0x00ff00ff]\n\t"
           "pand $tmp2,$dst\n\t"
           "pand $dst,$tmp1\n\t"
           "packuswb $dst,$tmp2\n\t! packed16B shift" %}
  ins_encode %{
    emit_vshift16B_code(_masm, this->as_Mach()->ideal_Opcode() , $dst$$XMMRegister, $src$$XMMRegister, $shift$$XMMRegister, $tmp1$$XMMRegister, $tmp2$$XMMRegister, $scratch$$Register);
  %}
  ins_pipe( pipe_slow );
%}

can be turned into something like:

instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{
  predicate(n->as_Vector()->length() == 16);
  match(Set dst (LShiftVB src shift));
  match(Set dst (RShiftVB src shift));
  match(Set dst (URShiftVB src shift));
  effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch);
  format %{"packed16B shift" %}
  ins_encode %{
    int vlen = 0; // 128-bit
    BasicType elem_type = T_BYTE;
    int shift_mode = ...; // L/R/UR or S/U + L/R
    __ vshift(vlen, elem_type, shift_mode,
              $dst$$..., $src$$..., $shift$$...,
              $tmp1$$..., $tmp2$$..., $scratch$$...);
  %}

Then MA::vshift can dispatch between different implementations depending on SSE/AVX level available. Do you see any problems with that from footprint perspective?

Ideally, I'd prefer to see a library of operations on vectors encapsulated in MacroAssembler (or a subclass) and used in x86.ad. That will accommodate further reductions in AD instructions needed.

Best regards,
Vladimir Ivanov

From vladimir.x.ivanov at oracle.com Wed May 1 22:15:05 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 1 May 2019 15:15:05 -0700
Subject: [13] RFR (S): 8223171: Redundant nmethod dependencies for effectively final methods
In-Reply-To:
References: <5cdb781f-1922-9a5d-9f52-f6874fd6a259@oracle.com>
Message-ID: <50f8065b-445f-ae1d-00c8-743fd870404a@oracle.com>

> Can you also add check_unique_method(ctxk, uniqm) to the version of
> assert_unique_concrete_method that takes a Method*?

Like this?

http://cr.openjdk.java.net/~vlivanov/8223171/webrev.02/

Best regards,
Vladimir Ivanov

> On 5/1/19 9:52 AM, Vladimir Ivanov wrote:
>>
>>> Does this allow us to assert !uniqm->can_be_statically_bound() in
>>> Dependencies::assert_unique_concrete_method?
>>
>> In general, no.
>> It doesn't hold for final methods: dependency is still
>> needed when context is broad enough, since an overriding method can be
>> loaded in a different part of the hierarchy (under the same context
>> class).
>>
>> In case of the adjusted checks it's safe, since context == method
>> holder when actual_receiver->is_final() == true.
>>
>>   if (!callee->is_final_method() && !callee->is_private() &&
>>       !actual_receiver->is_final()) {
>>     dependencies()->assert_unique_concrete_method(actual_receiver,
>>                                                   cha_monomorphic_target);
>>   }
>>
>> I refactored the patch a bit:
>>   http://cr.openjdk.java.net/~vlivanov/8223171/webrev.01/
>>
>>>> Moreover, C2 does add dependencies for private methods.
>>
>> I take it back. Earlier checks handle private methods. Only methods on
>> final classes get redundant dependencies.
>>
>> Best regards,
>> Vladimir Ivanov
>

From sandhya.viswanathan at intel.com Wed May 1 22:16:44 2019
From: sandhya.viswanathan at intel.com (Viswanathan, Sandhya)
Date: Wed, 1 May 2019 22:16:44 +0000
Subject: RFR (M) 8222074: Enhance auto vectorization for x86
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB5C2@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB845@FMSMSX126.amr.corp.intel.com>
 <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com>
 <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com>
 <0cd3fd93-0f1e-a6d0-d4c3-f8d95b533ff7@oracle.com>
Message-ID: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB56B1@FMSMSX126.amr.corp.intel.com>

I should add here that your suggestion of adding generic shift instruction etc to the macroAssembler is also wonderful instead of function pointer.
I will look into making that change as well.

Best Regards,
Sandhya

From vladimir.x.ivanov at oracle.com Wed May 1 23:17:17 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 1 May 2019 16:17:17 -0700
Subject: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64
Message-ID: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com>

http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8223213

(It's a followup RFR on an earlier RFC [1].)

Recent changes severely affected how static initializers are executed and for long-running initializers it manifested as a severe slowdown. As an example, it led to a 3x slowdown on some Clojure applications (JDK-8219233 [2]). The root cause is that until a class is fully initialized, every invocation of static method on it goes through method resolution.

Proposed fix introduces fast class initialization barriers for C1, C2, and template interpreter on x86-64.
I did some experiments with cross-platform approaches, but haven't got satisfactory results.

On other platforms, behavior stays (mostly) intact. (I had to revert some changes introduced by JDK-8219492 [3], since the assumptions they rely on about accesses inside a class don't hold in all cases.)

The barrier is as simple as:

  if (holder->is_not_initialized() &&
      !holder->is_reentrant_initialization(current_thread)) {
    // trigger call site re-resolution and block there
  }

There are 3 places where barriers are added:
  * in template interpreter for invokestatic bytecode;
  * at nmethod verified entry point (for normal compilations);
  * c2i adapters;

For template interpreter, there's additional check added into TemplateTable::resolve_cache_and_index which calls into InterpreterRuntime::resolve_from_cache when fast path checks fail.

In case of nmethods, the barrier is put before frame construction, so existing compiler runtime routines can be reused (SharedRuntime::get_handle_wrong_method_stub()).

Also, C2 has a guard on entry (Parse::clinit_deopt()) which triggers nmethod recompilation once the class is fully initialized.

OSR compilations don't need a barrier.

Correspondence between barriers and transitions they cover:

  (1) from interpreter (barrier on caller side)
      * all transitions: interpreter, compiled (i2c), native, aot, ...

  (2) from compiled (barrier on callee side)
      to compiled, to native (barrier in native wrapper on entry)

  (3) c2i bypasses both barriers (interpreter and compiled) and requires a dedicated barrier in c2i

  (4) to Graal/AOT code:
      from interpreter: covered by interpreter barrier
      from compiled: call site patching is disabled, leading to repeated call site resolution until method holder is fully initialized (original behavior).
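
The barrier condition above can be modelled in a few lines to show which callers take the slow path. This is a sketch under stated assumptions rather than HotSpot code: HolderModel, InitState, CallPath and static_call_barrier are hypothetical names, threads are plain integer ids, and only the two predicate names are taken from the message.

```cpp
#include <cassert>

// Hypothetical model of a class's initialization state; not a HotSpot type.
enum class InitState { kLoaded, kBeingInitialized, kFullyInitialized };

struct HolderModel {
  InitState state;
  int init_thread;  // id of the thread running the initializer, or -1

  bool is_not_initialized() const {
    return state != InitState::kFullyInitialized;
  }
  bool is_reentrant_initialization(int current_thread) const {
    return state == InitState::kBeingInitialized &&
           init_thread == current_thread;
  }
};

enum class CallPath { kFast, kReResolve };

// Mirrors the quoted condition: only callers that are neither looking at a
// fully initialized class nor running its initializer take the slow path.
CallPath static_call_barrier(const HolderModel& holder, int current_thread) {
  // Model invariant: only a class being initialized has an owner thread.
  assert(holder.state == InitState::kBeingInitialized ||
         holder.init_thread == -1);
  if (holder.is_not_initialized() &&
      !holder.is_reentrant_initialization(current_thread)) {
    return CallPath::kReResolve;  // re-resolve the call site and block there
  }
  return CallPath::kFast;
}
```

In this model the initializing thread keeps calling static methods on the fast path, which is the case the long-running initializer scenario depends on, while every other thread is sent to re-resolution until initialization completes.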
Performance experiments with clojure [2] demonstrated that the fix almost completely recuperates the regression:

  (1) always reresolve (w/o the fix):     ~12,0s ( 1x)
  (2) C1/C2 barriers only:                 ~3,8s (~3x)
  (3) int/C1/C2 barriers:                  ~3,2s (-20%)
  --------
  (4) barriers disabled for invokestatic   ~3,2s

I deliberately tried to keep the patch backport-friendly for 8u/11u/12u and refrained from using newer features like nmethod barriers introduced recently. The fix can be refactored later specifically for 13 as a followup change.

Testing: clojure startup, tier1-5

Thanks!

Best regards,
Vladimir Ivanov

[1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-April/037760.html

[2] https://bugs.openjdk.java.net/browse/JDK-8219233

[3] https://bugs.openjdk.java.net/browse/JDK-8219492

From vladimir.x.ivanov at oracle.com Wed May 1 23:37:22 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 1 May 2019 16:37:22 -0700
Subject: [13] RFR (M): 8223216: C2: Unify class initialization checks between new, getstatic, and putstatic
Message-ID:

http://cr.openjdk.java.net/~vlivanov/8223216/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8223216

(The patch has minor dependencies on 8223213 [1] I sent out for review earlier.)

C2 implements class initialization checks for new and getstatic/putstatic differently: while "new" supports fast class initialization checks, static field accesses rely on uncommon traps which may lead to deoptimization/recompilation storms during long-running class initialisation.

Proposed patch unifies implementation between them and uses the following barrier:

  if (holder->is_initialized()) {
    uncommon_trap(initialized, reinterpret);
  }
  if (!holder->is_reentrant_initialization(current_thread)) {
    uncommon_trap(uninitialized, none);
  }

It also enhances checks for not-yet-initialized classes (Compile::needs_clinit_barrier) and unifies the implementation between new, invokestatic, and getfield/putfield.
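
Read literally, the two checks in the C2 barrier above give three outcomes for code compiled while the holder was still being initialized. The sketch below models that reading only; ClassModel, Outcome and clinit_barrier are hypothetical names, threads are integer ids, and the mapping of each outcome to an uncommon trap is an interpretation of the quoted pseudocode, not the patch itself.

```cpp
#include <cassert>

// Hypothetical model of the holder class; not a HotSpot type.
struct ClassModel {
  bool fully_initialized;
  int initializing_thread;  // -1 when no thread is running the initializer

  bool is_initialized() const { return fully_initialized; }
  bool is_reentrant_initialization(int current_thread) const {
    return !fully_initialized && initializing_thread == current_thread;
  }
};

enum class Outcome {
  kContinue,           // stay in the compiled code
  kTrapInitialized,    // models uncommon_trap(initialized, reinterpret)
  kTrapUninitialized   // models uncommon_trap(uninitialized, none)
};

// Applies the two checks in the same order as the quoted barrier.
Outcome clinit_barrier(const ClassModel& holder, int current_thread) {
  assert(current_thread >= 0);  // -1 is reserved for "no thread"
  if (holder.is_initialized()) {
    return Outcome::kTrapInitialized;
  }
  if (!holder.is_reentrant_initialization(current_thread)) {
    return Outcome::kTrapUninitialized;
  }
  return Outcome::kContinue;
}
```

Under this reading, only the thread that is running the initializer continues without a trap.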
Testing: tier1-5, targeted microbenchmarks, new test from 8223213

Thanks!

Best regards,
Vladimir Ivanov

[1] http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/
    https://bugs.openjdk.java.net/browse/JDK-8223213

From vladimir.x.ivanov at oracle.com Thu May 2 00:09:20 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 1 May 2019 17:09:20 -0700
Subject: RFR (M) 8222074: Enhance auto vectorization for x86
In-Reply-To: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB56B1@FMSMSX126.amr.corp.intel.com>
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB5C2@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB845@FMSMSX126.amr.corp.intel.com>
 <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com>
 <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com>
 <0cd3fd93-0f1e-a6d0-d4c3-f8d95b533ff7@oracle.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB56B1@FMSMSX126.amr.corp.intel.com>
Message-ID:

Sounds good, thanks!

Best regards,
Vladimir Ivanov

On 01/05/2019 15:16, Viswanathan, Sandhya wrote:
> I should add here that your suggestion of adding generic shift instruction etc to the macroAssembler is also wonderful instead of function pointer. I will look into making that change as well.
>
> Best Regards,
> Sandhya

From vladimir.kozlov at oracle.com Thu May 2 00:40:05 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 1 May 2019 17:40:05 -0700
Subject: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64
In-Reply-To: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com>
References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com>
Message-ID: <9e0616f5-d79b-e439-26dd-a8e3334c10ed@oracle.com>

Why do you skip patching code compiled by Graal and AOT?

The flag UseFastClassInitChecks could be diagnostic or even product. The feature is not for debugging.
Thanks,
Vladimir K

On 5/1/19 4:17 PM, Vladimir Ivanov wrote:
> http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8223213
>
> (It's a followup RFR on an earlier RFC [1].)
>
> Recent changes severely affected how static initializers are executed and for long-running initializers it manifested as a severe slowdown.
> As an example, it led to a 3x slowdown on some Clojure applications (JDK-8219233 [2]). The root cause is that until a class is fully initialized, every invocation of static method on it goes through method resolution.
>
> Proposed fix introduces fast class initialization barriers for C1, C2, and template interpreter on x86-64. I did some experiments with cross-platform approaches, but haven't got satisfactory results.
>
> On other platforms, behavior stays (mostly) intact. (I had to revert some changes introduced by JDK-8219492 [3], since the assumptions they rely on about accesses inside a class don't hold in all cases.)
>
> The barrier is as simple as:
>   if (holder->is_not_initialized() &&
>       !holder->is_reentrant_initialization(current_thread)) {
>     // trigger call site re-resolution and block there
>   }
>
> There are 3 places where barriers are added:
>   * in template interpreter for invokestatic bytecode;
>   * at nmethod verified entry point (for normal compilations);
>   * c2i adapters;
>
> For template interpreter, there's additional check added into TemplateTable::resolve_cache_and_index which calls into InterpreterRuntime::resolve_from_cache when fast path checks fail.
>
> In case of nmethods, the barrier is put before frame construction, so existing compiler runtime routines can be reused (SharedRuntime::get_handle_wrong_method_stub()).
>
> Also, C2 has a guard on entry (Parse::clinit_deopt()) which triggers nmethod recompilation once the class is fully initialized.
>
> OSR compilations don't need a barrier.
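The barrier condition quoted above can be modeled in a few lines of standalone C++. This is a minimal sketch, not HotSpot code: `InitState`, `Holder` and `needs_slow_path` are hypothetical names, thread identity is reduced to an int, and "not initialized" is assumed here to mean "not yet fully initialized".

```cpp
#include <cassert>

// Simplified class initialization states (an assumption for this sketch).
enum class InitState { Loaded, BeingInitialized, FullyInitialized };

// Hypothetical stand-in for the method holder class.
struct Holder {
    InitState state;
    int init_thread;  // id of the thread running the static initializer, -1 if none
};

// Returns true when a static call must take the slow path
// (re-resolve the call site and block there).
bool needs_slow_path(const Holder& h, int current_thread) {
    bool not_initialized = h.state != InitState::FullyInitialized;
    bool reentrant = h.state == InitState::BeingInitialized &&
                     h.init_thread == current_thread;
    return not_initialized && !reentrant;
}
```

In this model the initializing thread passes the barrier while its own static initializer is still running, other threads are diverted until initialization completes, and once the holder is fully initialized nobody takes the slow path.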
>
> Correspondence between barriers and transitions they cover:
>   (1) from interpreter (barrier on caller side)
>        * all transitions: interpreter, compiled (i2c), native, aot, ...
>
>   (2) from compiled (barrier on callee side)
>        to compiled, to native (barrier in native wrapper on entry)
>
>   (3) c2i bypasses both barriers (interpreter and compiled) and requires a dedicated barrier in c2i
>
>   (4) to Graal/AOT code:
>         from interpreter: covered by interpreter barrier
>         from compiled: call site patching is disabled, leading to repeated call site resolution until method holder is fully initialized (original behavior).
>
> Performance experiments with clojure [2] demonstrated that the fix almost completely recuperates the regression:
>
>   (1) always reresolve (w/o the fix):     ~12,0s ( 1x)
>   (2) C1/C2 barriers only:                 ~3,8s (~3x)
>   (3) int/C1/C2 barriers:                  ~3,2s (-20%)
> --------
>   (4) barriers disabled for invokestatic   ~3,2s
>
> I deliberately tried to keep the patch backport-friendly for 8u/11u/12u and refrained from using newer features like nmethod barriers introduced recently. The fix can be refactored later specifically for 13 as a followup change.
>
> Testing: clojure startup, tier1-5
>
> Thanks!
>
> Best regards,
> Vladimir Ivanov
>
> [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-April/037760.html
> [2] https://bugs.openjdk.java.net/browse/JDK-8219233
> [3] https://bugs.openjdk.java.net/browse/JDK-8219492

From vladimir.kozlov at oracle.com Thu May 2 00:42:19 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 1 May 2019 17:42:19 -0700
Subject: [13] RFR (M): 8223216: C2: Unify class initialization checks between new, getstatic, and putstatic
In-Reply-To:
References:
Message-ID: <79c9e6ca-bde7-db7d-4c74-51ee9ddac4f6@oracle.com>

Looks good.
Thanks,
Vladimir

On 5/1/19 4:37 PM, Vladimir Ivanov wrote:
> http://cr.openjdk.java.net/~vlivanov/8223216/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8223216
>
> (The patch has minor dependencies on 8223213 [1] I sent out for review earlier.)
>
> C2 implements class initialization checks for new and getstatic/putstatic differently: while "new" supports fast class initialization checks, static field accesses rely on uncommon traps which may lead to deoptimization/recompilation storms during long-running class initialisation.
>
> Proposed patch unifies implementation between them and uses the following barrier:
>   if (holder->is_initialized()) {
>     uncommon_trap(initialized, reinterpret);
>   }
>   if (!holder->is_reentrant_initialization(current_thread)) {
>     uncommon_trap(uninitialized, none);
>   }
>
> It also enhances checks for not-yet-initialized classes (Compile::needs_clinit_barrier) and unifies the implementation between new, invokestatic, and getfield/putfield.
>
> Testing: tier1-5, targeted microbenchmarks, new test from 8223213
>
> Thanks!
>
> Best regards,
> Vladimir Ivanov
>
> [1] http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/
>     https://bugs.openjdk.java.net/browse/JDK-8223213
>

From tom.rodriguez at oracle.com Thu May 2 00:44:38 2019
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Wed, 1 May 2019 17:44:38 -0700
Subject: RFR(S) 8218700: infinite loop in HotSpotJVMCIMetaAccessContext.fromClass after OutOfMemoryError
In-Reply-To: <53bcf718-e543-d40c-5486-58b98f66bcee@oracle.com>
References: <53bcf718-e543-d40c-5486-58b98f66bcee@oracle.com>
Message-ID:

You'll need to update your webrev after Vladimir's push. This code has moved into HotSpotJVMCIRuntime.java.

Maybe WeakReferenceHolder instead of WeakTypeRef? It needs a comment explaining that we're intentionally avoiding the use of ClassValue.remove as well.

Shouldn't the ref field be volatile?
ClassValue includes some barrier semantics and the new code needs similar guarantees.

tom

dean.long at oracle.com wrote on 4/26/19 12:09 PM:
> https://bugs.openjdk.java.net/browse/JDK-8218700
> http://cr.openjdk.java.net/~dlong/8218700/webrev.2/
>
> If we throw an OutOfMemoryError in the right place (see JDK-8222941), HotSpotJVMCIMetaAccessContext.fromClass can go into an infinite loop calling ClassValue.remove. To work around the problem, reset the value in a mutable cell instead of calling remove.
>
> dl

From vladimir.x.ivanov at oracle.com Thu May 2 02:13:37 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 1 May 2019 19:13:37 -0700
Subject: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64
In-Reply-To: <9e0616f5-d79b-e439-26dd-a8e3334c10ed@oracle.com>
References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com> <9e0616f5-d79b-e439-26dd-a8e3334c10ed@oracle.com>
Message-ID: <0ceb99f0-2c37-bb27-9ca4-18e1f145dbbe@oracle.com>

Thanks for the feedback, Vladimir!

> Why you skip patching code compiled by Graal and AOT?

It happens only for classes being initialized and effectively preserves current behavior (re-resolution until class is fully initialized).

The motivation is the following:

  * Graal needs to put class init barriers in nmethods at verified entry point in the same way C1/C2 does with this patch;

  * regarding AOTed code (I haven't done extensive exploration, but based on private discussions), I believe it needs additional barriers at method entry as well.

Once proper support lands in Graal or AOT, the patching can be re-enabled.

> The flag UseFastClassInitChecks could be diagnostic or even product. The feature is not for debugging.

The flag is used to signal that platform-specific support is available.
Unless there's a use case which benefits from the ability to turn it off (disable new barriers and fall back to re-resolution) from command line, I don't see much value in turning the flag into diagnostic/product one.

Best regards,
Vladimir Ivanov

> On 5/1/19 4:17 PM, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8223213
>>
>> (It's a followup RFR on an earlier RFC [1].)
>>
>> Recent changes severely affected how static initializers are executed and for long-running initializers it manifested as a severe slowdown.
>> As an example, it led to a 3x slowdown on some Clojure applications (JDK-8219233 [2]). The root cause is that until a class is fully initialized, every invocation of static method on it goes through method resolution.
>>
>> Proposed fix introduces fast class initialization barriers for C1, C2, and template interpreter on x86-64. I did some experiments with cross-platform approaches, but haven't got satisfactory results.
>>
>> On other platforms, behavior stays (mostly) intact. (I had to revert some changes introduced by JDK-8219492 [3], since the assumptions they rely on about accesses inside a class don't hold in all cases.)
>>
>> The barrier is as simple as:
>>   if (holder->is_not_initialized() &&
>>       !holder->is_reentrant_initialization(current_thread)) {
>>     // trigger call site re-resolution and block there
>>   }
>>
>> There are 3 places where barriers are added:
>>   * in template interpreter for invokestatic bytecode;
>>   * at nmethod verified entry point (for normal compilations);
>>   * c2i adapters;
>>
>> For template interpreter, there's additional check added into TemplateTable::resolve_cache_and_index which calls into InterpreterRuntime::resolve_from_cache when fast path checks fail.
>>
>> In case of nmethods, the barrier is put before frame construction, so existing compiler runtime routines can be reused (SharedRuntime::get_handle_wrong_method_stub()).
>>
>> Also, C2 has a guard on entry (Parse::clinit_deopt()) which triggers nmethod recompilation once the class is fully initialized.
>>
>> OSR compilations don't need a barrier.
>>
>> Correspondence between barriers and transitions they cover:
>>   (1) from interpreter (barrier on caller side)
>>        * all transitions: interpreter, compiled (i2c), native, aot, ...
>>
>>   (2) from compiled (barrier on callee side)
>>        to compiled, to native (barrier in native wrapper on entry)
>>
>>   (3) c2i bypasses both barriers (interpreter and compiled) and requires a dedicated barrier in c2i
>>
>>   (4) to Graal/AOT code:
>>         from interpreter: covered by interpreter barrier
>>         from compiled: call site patching is disabled, leading to repeated call site resolution until method holder is fully initialized (original behavior).
>>
>> Performance experiments with clojure [2] demonstrated that the fix almost completely recuperates the regression:
>>
>>   (1) always reresolve (w/o the fix):     ~12,0s ( 1x)
>>   (2) C1/C2 barriers only:                 ~3,8s (~3x)
>>   (3) int/C1/C2 barriers:                  ~3,2s (-20%)
>> --------
>>   (4) barriers disabled for invokestatic   ~3,2s
>>
>> I deliberately tried to keep the patch backport-friendly for 8u/11u/12u and refrained from using newer features like nmethod barriers introduced recently. The fix can be refactored later specifically for 13 as a followup change.
>>
>> Testing: clojure startup, tier1-5
>>
>> Thanks!
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-April/037760.html
>>
>> [2] https://bugs.openjdk.java.net/browse/JDK-8219233
>> [3] https://bugs.openjdk.java.net/browse/JDK-8219492

From dean.long at oracle.com Thu May 2 02:30:16 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Wed, 1 May 2019 19:30:16 -0700
Subject: [13] RFR (S): 8223171: Redundant nmethod dependencies for effectively final methods
In-Reply-To: <50f8065b-445f-ae1d-00c8-743fd870404a@oracle.com>
References: <5cdb781f-1922-9a5d-9f52-f6874fd6a259@oracle.com> <50f8065b-445f-ae1d-00c8-743fd870404a@oracle.com>
Message-ID: <8788c5d0-48f7-6cfd-f733-19ec5bee84b0@oracle.com>

Yes, that's exactly what I had in mind :-)

dl

On 5/1/19 3:15 PM, Vladimir Ivanov wrote:
>
>> Can you also add check_unique_method(ctxk, uniqm) to the version of assert_unique_concrete_method that takes a Method*?
>
> Like this?
>   http://cr.openjdk.java.net/~vlivanov/8223171/webrev.02/
>
> Best regards,
> Vladimir Ivanov
>
>> On 5/1/19 9:52 AM, Vladimir Ivanov wrote:
>>>
>>>> Does this allow us to assert !uniqm->can_be_statically_bound() in Dependencies::assert_unique_concrete_method?
>>>
>>> In general, no. It doesn't hold for final methods: dependency is still needed when context is broad enough, since an overriding method can be loaded in a different part of the hierarchy (under the same context class).
>>>
>>> In case of the adjusted checks it's safe, since context == method holder when actual_receiver->is_final() == true.
>>>
>>>   if (!callee->is_final_method() && !callee->is_private() && !actual_receiver->is_final()) {
>>>     dependencies()->assert_unique_concrete_method(actual_receiver, cha_monomorphic_target);
>>>   }
>>>
>>> I refactored the patch a bit:
>>>   http://cr.openjdk.java.net/~vlivanov/8223171/webrev.01/
>>>
>>>>> Moreover, C2 does add dependencies for private methods.
>>>
>>> I take it back. Earlier checks handle private methods. Only methods on final classes get redundant dependencies.
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>

From vladimir.kozlov at oracle.com Thu May 2 02:34:39 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 1 May 2019 19:34:39 -0700
Subject: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64
In-Reply-To: <0ceb99f0-2c37-bb27-9ca4-18e1f145dbbe@oracle.com>
References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com> <9e0616f5-d79b-e439-26dd-a8e3334c10ed@oracle.com> <0ceb99f0-2c37-bb27-9ca4-18e1f145dbbe@oracle.com>
Message-ID:

On 5/1/19 7:13 PM, Vladimir Ivanov wrote:
> Thanks for the feedback, Vladimir!
>
>> Why you skip patching code compiled by Graal and AOT?
>
> It happens only for classes being initialized and effectively preserves current behavior (re-resolution until class is fully initialized).
>
> The motivation is the following:
>
>   * Graal needs to put class init barriers in nmethods at verified entry point in the same way C1/C2 does with this patch;
>
>   * regarding AOTed code (I haven't done extensive exploration, but based on private discussions), I believe it needs additional barriers at method entry as well.

When Graal adds barriers, AOT code will get them automatically.

>
> Once proper support lands in Graal or AOT, the patching can be re-enabled.

Got it.

>
>> The flag UseFastClassInitChecks could be diagnostic or even product. The feature is not for debugging.
> The flag is used to signal that platform-specific support is available. Unless there's a use case which benefits from the ability to turn it off (disable new barriers and fall back to re-resolution) from command line, I don't see much value in turning the flag into diagnostic/product one.

Okay.
Thanks,
Vladimir

>
> Best regards,
> Vladimir Ivanov
>
>> On 5/1/19 4:17 PM, Vladimir Ivanov wrote:
>>> http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/
>>> https://bugs.openjdk.java.net/browse/JDK-8223213
>>>
>>> (It's a followup RFR on an earlier RFC [1].)
>>>
>>> Recent changes severely affected how static initializers are executed and for long-running initializers it manifested as a severe slowdown.
>>> As an example, it led to a 3x slowdown on some Clojure applications (JDK-8219233 [2]). The root cause is that until a class is fully initialized, every invocation of static method on it goes through method resolution.
>>>
>>> Proposed fix introduces fast class initialization barriers for C1, C2, and template interpreter on x86-64. I did some experiments with cross-platform approaches, but haven't got satisfactory results.
>>>
>>> On other platforms, behavior stays (mostly) intact. (I had to revert some changes introduced by JDK-8219492 [3], since the assumptions they rely on about accesses inside a class don't hold in all cases.)
>>>
>>> The barrier is as simple as:
>>>   if (holder->is_not_initialized() &&
>>>       !holder->is_reentrant_initialization(current_thread)) {
>>>     // trigger call site re-resolution and block there
>>>   }
>>>
>>> There are 3 places where barriers are added:
>>>   * in template interpreter for invokestatic bytecode;
>>>   * at nmethod verified entry point (for normal compilations);
>>>   * c2i adapters;
>>>
>>> For template interpreter, there's additional check added into TemplateTable::resolve_cache_and_index which calls into InterpreterRuntime::resolve_from_cache when fast path checks fail.
>>>
>>> In case of nmethods, the barrier is put before frame construction, so existing compiler runtime routines can be reused (SharedRuntime::get_handle_wrong_method_stub()).
>>>
>>> Also, C2 has a guard on entry (Parse::clinit_deopt()) which triggers nmethod recompilation once the class is fully initialized.
>>>
>>> OSR compilations don't need a barrier.
>>>
>>> Correspondence between barriers and transitions they cover:
>>>   (1) from interpreter (barrier on caller side)
>>>        * all transitions: interpreter, compiled (i2c), native, aot, ...
>>>
>>>   (2) from compiled (barrier on callee side)
>>>        to compiled, to native (barrier in native wrapper on entry)
>>>
>>>   (3) c2i bypasses both barriers (interpreter and compiled) and requires a dedicated barrier in c2i
>>>
>>>   (4) to Graal/AOT code:
>>>         from interpreter: covered by interpreter barrier
>>>         from compiled: call site patching is disabled, leading to repeated call site resolution until method holder is fully initialized (original behavior).
>>>
>>> Performance experiments with clojure [2] demonstrated that the fix almost completely recuperates the regression:
>>>
>>>   (1) always reresolve (w/o the fix):     ~12,0s ( 1x)
>>>   (2) C1/C2 barriers only:                 ~3,8s (~3x)
>>>   (3) int/C1/C2 barriers:                  ~3,2s (-20%)
>>> --------
>>>   (4) barriers disabled for invokestatic   ~3,2s
>>>
>>> I deliberately tried to keep the patch backport-friendly for 8u/11u/12u and refrained from using newer features like nmethod barriers introduced recently. The fix can be refactored later specifically for 13 as a followup change.
>>>
>>> Testing: clojure startup, tier1-5
>>>
>>> Thanks!
>>> >>> Best regards, >>> Vladimir Ivanov >>> >>> [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-April/037760.html >>> [2] https://bugs.openjdk.java.net/browse/JDK-8219233 >>> [3] https://bugs.openjdk.java.net/browse/JDK-8219492 From rahul.v.raghavan at oracle.com Thu May 2 06:45:36 2019 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Thu, 2 May 2019 12:15:36 +0530 Subject: [13] RFR: 8202414: Unsafe write after primitive array creation may result in array length change In-Reply-To: <0E11910B-A4B0-4F64-9B87-5A4BF065B9D2@oracle.com> References: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com> <392c665f-869c-29af-4fc5-e6f844820846@oracle.com> <3db5d7ab-ad99-310b-e891-fc36d25da338@oracle.com> <7b03a213-7fee-a87f-b48d-250662e730ef@oracle.com> <959abf54-d1da-95ee-9cf6-6c6d8ec5e4a1@oracle.com> <18115aa8-edaa-31b9-02a6-06721d9fbfc9@oracle.com> <939f3f5d-b8e7-939f-8953-d34a0f3ff6c9@oracle.com> <259ef902-778b-7eef-46e2-d1927950d21c@oracle.com> <73f7c647-3194-2a65-6cc6-a15cbf6c82be@oracle.com> <37837126-c9d5-1bb1-fc9a-6fb9b848efbe@oracle.com> <28955bc6-020a-29e1-953c-e9f48932cd56@oracle.com> <0E11910B-A4B0-4F64-9B87-5A4BF065B9D2@oracle.com> Message-ID: Hi John, Thank you, understood your point. But please note here new code cannot be inserted at the other deleted code location as such, due to the dependency on 'offset'! New code added / the main fix, can be added only after the line - AllocateNode* alloc = AllocateNode::Ideal_allocation(adr, phase, offset); The deleted code / cleanup done, because new code added supersedes the checks. (changeset committed - http://hg.openjdk.java.net/jdk/jdk/rev/f8d2b5ce4491) Thanks, Rahul On 01/05/19 11:50 PM, John Rose wrote: > Here's a late comment: Is there any reason to > put the deletion and insertion in different > places? If not, it would be easier to follow > the history, and to do merges, if they were > placed at the same point in the code. > That is, insert the new code where the old > code is deleted. 
> > On Apr 30, 2019, at 12:04 AM, Rahul Raghavan wrote: >> >> Thank you Vladimir Ivanov for suggestions. >> >> Please note following latest changes tried. >> - http://cr.openjdk.java.net/~rraghavan/8202414/webrev.04/ >> >> Hope did not miss any points. >> Confirmed no failures with the reported test cases. >> Also hs-tier1 to tier4, hs-precheckin-comp testing in progress. >> >> Thanks, >> Rahul >> >> On 27/04/19 11:48 AM, Vladimir Ivanov wrote: >>> On 26/04/2019 19:30, Vladimir Ivanov wrote: >>> After thinking more about it, I believe new offset alignment check supersedes is_unaligned_access(). And is_mismatched_access() is too conservative here: what is_mismatched_access() adds here (in addition to existing alignment & size checks) is whether type match between location and stored value, but what matters for IN are sizes and offsets only. >>> Type mismatches (e.g., byte vs boolean, char vs short) may cause problems when consequent loads are replaced with values from initializing stores, but it should be already handled in MemNode::can_see_stored_value() and Load?Node::Ideal(). >>> So, it seems both checks (is_unaligned_access() & is_mismatched_access()) can be safely omitted. >>> You are right, I missed that IN::captured_store_insertion_point() inspects already other stores which are already captured. Sorry for the confusion. >>> I agree that IN::can_capture_store() is the right place to put the fix in and I like (iii). (Just add a comment, "// mismatched access" is enough) > From patric.hedlin at oracle.com Thu May 2 07:53:54 2019 From: patric.hedlin at oracle.com (Patric Hedlin) Date: Thu, 2 May 2019 09:53:54 +0200 Subject: RFR(T): 8223139: Rename mandatory policy-do routines. In-Reply-To: <216752da-5cbd-b14a-8ea2-96959a889759@oracle.com> References: <216752da-5cbd-b14a-8ea2-96959a889759@oracle.com> Message-ID: <48624e8a-91a9-184e-fa29-623f25f24713@oracle.com> Thanks Dean. 
On 01/05/2019 00:05, dean.long at oracle.com wrote:
> Looks good, though "remove_empty_loop" also sounds good.

Oooh... I don't know, that might be pushing the limit... :)

/Patric

>
> dl
>
> On 4/30/19 7:10 AM, Patric Hedlin wrote:
>> Dear all,
>>
>> I would like to ask for help to review the following change/update:
>>
>> Issue:  https://bugs.openjdk.java.net/browse/JDK-8223139
>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8223139/
>>
>> 8223139: Rename mandatory policy-do routines.
>>
>>     These routines do not implement any policy. The policy is to always
>>     attempt these transforms if applicable.
>>     'policy_do_remove_empty_loop' -> 'do_remove_empty_loop'.
>>     'policy_do_one_iteration_loop' -> 'do_one_iteration_loop'.
>>
>>
>> Testing: Part of 8216137 (hs-tier1..4, hs-precheckin-comp, Kitchensink24h)
>>
>>
>> Best regards,
>> Patric

From tobias.hartmann at oracle.com Thu May 2 09:04:42 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 2 May 2019 11:04:42 +0200
Subject: 8222670 patch review: prevent downgraded tasks from recompiling
In-Reply-To: <427BC0A9-DAB2-43A3-AF93-F96414EC1E7E@amazon.com>
References: <99aae03d0315482c723abda2f2cb530b4b52f82d.camel@redhat.com> <427BC0A9-DAB2-43A3-AF93-F96414EC1E7E@amazon.com>
Message-ID: <0fca1798-5851-3e5d-e603-54282dc3be81@oracle.com>

Hi,

in the bug description you state:
> CompileBroker::compile_method fails to detect the pre-existing nmethod because comp_level doesn't match

But why is that? If a downgraded compilation succeeded at level 2, shouldn't a re-compilation at the same level be detected by CompileBroker::compilation_is_complete() in CompileBroker::compile_method()?

You need to update the copyright date in Level2RecompilationTest.java (should be 2019 only).

Thanks,
Tobias

On 26.04.19 09:36, Liu, Xin wrote:
> Gently ping.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8222670
> I got the new revision.
> https://cr.openjdk.java.net/~xliu/8222670/webrev.03/
>
> I finished up the test Level2RecompilationTest.java. If you want to start an OSR compilation, you have to specify a bci which points to the beginning of a BB.
> Giving them bci = 0 is good enough for general cases.
>
> Thanks,
> --lx
>
>
> On 4/19/19, 11:19 PM, "Liu, Xin" wrote:
>
> hi, Severin,
>
> Thanks for reviewing. Yes, it's irrelevant. I reverted it. Please check it out.
> https://cr.openjdk.java.net/~xliu/8222670/webrev.02/
>
> Please note that I added an assertion to InstanceKlass::add_osr_nmethod(nmethod* n) in this webrev.
> In my understanding, it is a potential memleak of codecache. If there's no higher level of osr compilation, those dups will stay in codecache forever.
>
> Further, it doesn't make sense to recompile with the same level and same bci. With this assertion, the following tests in tier1-test failed.
> test/hotspot/jtreg/compiler/intrinsics/unsafe/DirectByteBufferTest.java
> test/hotspot/jtreg/compiler/intrinsics/unsafe/HeapByteBufferTest.java
> test/jdk/java/util/stream/test/org/openjdk/tests/java/util/stream/ToArrayOpTest.java
> test/jdk/tools/pack200/Pack200Test.java
> test/jdk/java/util/Arrays/SortingNearlySortedPrimitive.java
>
> All crashes happen as I described in JDK-8222670. E.g. duplicated OSR compilations occur for level2.
>
> Program received signal SIGSEGV, Segmentation fault.
> # To suppress the following error report, specify this argument
> # after -XX: or in .hotspotrc: SuppressErrorAt=/instanceKlass.cpp:2972
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # Internal Error (/src/src/hotspot/share/oops/instanceKlass.cpp:2972), pid=8347, tid=8361
> # assert(prev == __null || !prev->is_in_use()) failed: redundunt OSR recompilation detected. memory leak in CodeCache!
> # > # JRE version: OpenJDK Runtime Environment (13.0) (slowdebug build 13-internal+0-adhoc..src) > # Java VM: OpenJDK 64-Bit Server VM (slowdebug 13-internal+0-adhoc..src, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) > # Problematic frame: > # V [libjvm.so+0xb3dbb4] InstanceKlass::add_osr_nmethod(nmethod*)+0xc4 > # > # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /build/JTwork/scratch/hs_err_pid8347.log > > Program received signal SIGSEGV, Segmentation fault. > Compiled method (c1) 19032 752 % 2 ByteBufferTest::stepUsingAccessors @ 382 (633 bytes) > total in heap [0x00007fffd8f9ff90,0x00007fffd8fac628] = 50840 > relocation [0x00007fffd8fa0110,0x00007fffd8fa1388] = 4728 > main code [0x00007fffd8fa13a0,0x00007fffd8fa7f80] = 27616 > stub code [0x00007fffd8fa7f80,0x00007fffd8fa86c0] = 1856 > oops [0x00007fffd8fa86c0,0x00007fffd8fa86c8] = 8 > metadata [0x00007fffd8fa86c8,0x00007fffd8fa8800] = 312 > scopes data [0x00007fffd8fa8800,0x00007fffd8fa9ff8] = 6136 > scopes pcs [0x00007fffd8fa9ff8,0x00007fffd8fac408] = 9232 > dependencies [0x00007fffd8fac408,0x00007fffd8fac418] = 16 > nul chk table [0x00007fffd8fac418,0x00007fffd8fac628] = 528 > Compiled method (c1) 19032 752 % 2 ByteBufferTest::stepUsingAccessors @ 382 (633 bytes) > total in heap [0x00007fffd8f9ff90,0x00007fffd8fac628] = 50840 > relocation [0x00007fffd8fa0110,0x00007fffd8fa1388] = 4728 > main code [0x00007fffd8fa13a0,0x00007fffd8fa7f80] = 27616 > stub code [0x00007fffd8fa7f80,0x00007fffd8fa86c0] = 1856 > oops [0x00007fffd8fa86c0,0x00007fffd8fa86c8] = 8 > metadata [0x00007fffd8fa86c8,0x00007fffd8fa8800] = 312 > scopes data [0x00007fffd8fa8800,0x00007fffd8fa9ff8] = 6136 > scopes pcs [0x00007fffd8fa9ff8,0x00007fffd8fac408] = 9232 > dependencies [0x00007fffd8fac408,0x00007fffd8fac418] = 16 > nul chk table 
[0x00007fffd8fac418,0x00007fffd8fac628] = 528 > > > Thanks, > --lx > > On 4/19/19, 9:31 AM, "Severin Gehwolf" wrote: > > On Thu, 2019-04-18 at 19:46 +0000, Liu, Xin wrote: > > Hi, hotspot-compiler group, > > > > Could you review this webrev for JDK-8222670? > > https://cr.openjdk.java.net/~xliu/8222670/webrev.01/ > > +++ new/test/hotspot/jtreg/compiler/tiered/TieredLevelsTest.java 2019-04-18 12:18:38.000000000 -0700 > @@ -89,7 +89,7 @@ > && actual == COMP_LEVEL_LIMITED_PROFILE) { > // for simple method full_profile may be replaced by limited_profile > if (IS_VERBOSE) { > - System.out.printf("Level check: full profiling was replaced " > + System.out.println("Level check: full profiling was replaced " > + "by limited profiling. Expected: %d, actual:%d", > expected, actual); > > This seems an unintended change, is it? > > Thanks, > Severin > > > > > From tobias.hartmann at oracle.com Thu May 2 09:16:49 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 2 May 2019 11:16:49 +0200 Subject: 8222670 patch review: prevent downgraded tasks from recompiling In-Reply-To: <0fca1798-5851-3e5d-e603-54282dc3be81@oracle.com> References: <99aae03d0315482c723abda2f2cb530b4b52f82d.camel@redhat.com> <427BC0A9-DAB2-43A3-AF93-F96414EC1E7E@amazon.com> <0fca1798-5851-3e5d-e603-54282dc3be81@oracle.com> Message-ID: Also, why did you rename clearMethodState0 to clearMethodState in whitebox.cpp? This will lead to failures: Warning: 'NoSuchMethodError' on register of sun.hotspot.WhiteBox::clearMethodState(Ljava/lang/reflect/Executable;)V Best regards, Tobias On 02.05.19 11:04, Tobias Hartmann wrote: > Hi, > > in the bug description you state: >> CompileBroker::compile_method fails to detect the pre-existing nmethod because comp_level doesn't match > > But why is that? If a downgraded compilation succeeded at level 2, shouldn't a re-compilation at the > same level be detected by CompileBroker::compilation_is_complete() in CompileBroker::compile_method()? 
>
> You need to update the copyright date in Level2RecompilationTest.java (should be 2019 only).
>
> Thanks,
> Tobias
>
> On 26.04.19 09:36, Liu, Xin wrote:
>> Gently ping.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8222670
>> I got the new revision.
>> https://cr.openjdk.java.net/~xliu/8222670/webrev.03/
>>
>> I finished up the test Level2RecompilationTest.java. If you want to start an OSR compilation, you have to specify a bci which points to the beginning of a BB.
>> Giving them bci = 0 is good enough for general cases.
>>
>> Thanks,
>> --lx
>>
>>
>> On 4/19/19, 11:19 PM, "Liu, Xin" wrote:
>>
>> hi, Severin,
>>
>> Thanks for reviewing. Yes, it's irrelevant. I reverted it. Please check it out.
>> https://cr.openjdk.java.net/~xliu/8222670/webrev.02/
>>
>> Please note that I added an assertion to InstanceKlass::add_osr_nmethod(nmethod* n) in this webrev.
>> In my understanding, it is a potential memleak of codecache. If there's no higher level of osr compilation, those dups will stay in codecache forever.
>>
>> Further, it doesn't make sense to recompile with the same level and same bci. With this assertion, the following tests in tier1-test failed.
>> test/hotspot/jtreg/compiler/intrinsics/unsafe/DirectByteBufferTest.java
>> test/hotspot/jtreg/compiler/intrinsics/unsafe/HeapByteBufferTest.java
>> test/jdk/java/util/stream/test/org/openjdk/tests/java/util/stream/ToArrayOpTest.java
>> test/jdk/tools/pack200/Pack200Test.java
>> test/jdk/java/util/Arrays/SortingNearlySortedPrimitive.java
>>
>> All crashes happen as I described in JDK-8222670. E.g. duplicated OSR compilations occur for level2.
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> # To suppress the following error report, specify this argument
>> # after -XX: or in .hotspotrc: SuppressErrorAt=/instanceKlass.cpp:2972
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> # Internal Error (/src/src/hotspot/share/oops/instanceKlass.cpp:2972), pid=8347, tid=8361
>> # assert(prev == __null || !prev->is_in_use()) failed: redundunt OSR recompilation detected. memory leak in CodeCache!
>> #
>> # JRE version: OpenJDK Runtime Environment (13.0) (slowdebug build 13-internal+0-adhoc..src)
>> # Java VM: OpenJDK 64-Bit Server VM (slowdebug 13-internal+0-adhoc..src, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>> # Problematic frame:
>> # V [libjvm.so+0xb3dbb4] InstanceKlass::add_osr_nmethod(nmethod*)+0xc4
>> #
>> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
>> #
>> # An error report file with more information is saved as:
>> # /build/JTwork/scratch/hs_err_pid8347.log
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> Compiled method (c1) 19032 752 % 2 ByteBufferTest::stepUsingAccessors @ 382 (633 bytes)
>> total in heap [0x00007fffd8f9ff90,0x00007fffd8fac628] = 50840
>> relocation [0x00007fffd8fa0110,0x00007fffd8fa1388] = 4728
>> main code [0x00007fffd8fa13a0,0x00007fffd8fa7f80] = 27616
>> stub code [0x00007fffd8fa7f80,0x00007fffd8fa86c0] = 1856
>> oops [0x00007fffd8fa86c0,0x00007fffd8fa86c8] = 8
>> metadata [0x00007fffd8fa86c8,0x00007fffd8fa8800] = 312
>> scopes data [0x00007fffd8fa8800,0x00007fffd8fa9ff8] = 6136
>> scopes pcs [0x00007fffd8fa9ff8,0x00007fffd8fac408] = 9232
>> dependencies [0x00007fffd8fac408,0x00007fffd8fac418] = 16
>> nul chk table [0x00007fffd8fac418,0x00007fffd8fac628] = 528
>> Compiled method (c1) 19032 752 % 2 ByteBufferTest::stepUsingAccessors @ 382 (633 bytes)
>> total in heap [0x00007fffd8f9ff90,0x00007fffd8fac628] = 50840
>> relocation [0x00007fffd8fa0110,0x00007fffd8fa1388] = 4728
>> main code [0x00007fffd8fa13a0,0x00007fffd8fa7f80] = 27616
>> stub code [0x00007fffd8fa7f80,0x00007fffd8fa86c0] = 1856
>> oops [0x00007fffd8fa86c0,0x00007fffd8fa86c8] = 8
>> metadata [0x00007fffd8fa86c8,0x00007fffd8fa8800] = 312
>> scopes data [0x00007fffd8fa8800,0x00007fffd8fa9ff8] = 6136
>> scopes pcs [0x00007fffd8fa9ff8,0x00007fffd8fac408] = 9232
>> dependencies [0x00007fffd8fac408,0x00007fffd8fac418] = 16
>> nul chk table [0x00007fffd8fac418,0x00007fffd8fac628] = 528
>>
>>
>> Thanks,
>> --lx
>>
>> On 4/19/19, 9:31 AM, "Severin Gehwolf" wrote:
>>
>> On Thu, 2019-04-18 at 19:46 +0000, Liu, Xin wrote:
>> > Hi, hotspot-compiler group,
>> >
>> > Could you review this webrev for JDK-8222670?
>> > https://cr.openjdk.java.net/~xliu/8222670/webrev.01/
>>
>> +++ new/test/hotspot/jtreg/compiler/tiered/TieredLevelsTest.java 2019-04-18 12:18:38.000000000 -0700
>> @@ -89,7 +89,7 @@
>> && actual == COMP_LEVEL_LIMITED_PROFILE) {
>> // for simple method full_profile may be replaced by limited_profile
>> if (IS_VERBOSE) {
>> - System.out.printf("Level check: full profiling was replaced "
>> + System.out.println("Level check: full profiling was replaced "
>> + "by limited profiling. Expected: %d, actual:%d",
>> expected, actual);
>>
>> This seems an unintended change, is it?
>>
>> Thanks,
>> Severin
>>
>>
>>
>>
>>

From tobias.hartmann at oracle.com Thu May 2 09:23:02 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 2 May 2019 11:23:02 +0200
Subject: RFR: 8221542: ~15% performance degradation due to less optimized inline decision
In-Reply-To: <5607f7ca-57b9-b409-3bce-efc1688f0678@loongson.cn>
References: <6aebd883-0be7-0b05-5364-262e138a1fbc@loongson.cn> <182d87da-0d99-3f33-fbe7-ef5818be0422@loongson.cn> <0936427d-f4d2-299a-87ce-860dce5e57e1@loongson.cn> <574d59f5-3437-738f-e10c-796dcb02b42e@oracle.com> <5275854c-ab35-f160-f6f0-6ab9ac86e3d0@loongson.cn> <8bc507fe-b6db-d697-8821-0547860de232@oracle.com> <1a398a1f-ed52-2197-5886-d9d5fd872974@loongson.cn> <5607f7ca-57b9-b409-3bce-efc1688f0678@loongson.cn>
Message-ID:

Hi Jie,

this looks good to me too but please add brackets to the checks in InlineTree::is_not_reached.

I've submitted some extended testing and let you know once it passed.

Someone from the runtime team should also have a look at this because your changes affect the
interpreter. CC'ing runtime-dev.

Thanks,
Tobias

On 29.04.19 15:43, Jie Fu wrote:
> Hi all,
>
> May I have another review for this change [1] to finalize the fix?
> Thanks a lot.
>
> Best regards,
> Jie
>
> [1] http://cr.openjdk.java.net/~vlivanov/jiefu/8221542/webrev.02/
>
>
> On 2019-04-20 11:35, Jie Fu wrote:
>> Ah, I got it.
>> I like your patch and benefit a lot from you.
>> Thank you so much, Vladimir.
>>
>> Any comments from other reviewers?
>> Thanks.
>>
>> Best regards,
>> Jie
>>
>> On 2019/4/20 11:18, Vladimir Ivanov wrote:
>>>
>>>>> After some explorations I decided to keep original behavior for immature profiles
>>>>> (profile.count == -1).
>>>>
>>>> I agree.
>>>>
>>>> I have two questions here.
>>>>
>>>> 1. What's the difference of the following two if statements?
>>>> -------------------------------------------------
>>>> +  if (!callee_method->was_executed_more_than(0))  return true; // callee was never executed
>>>> +
>>>> +  if (caller_method->is_not_reached(caller_bci))  return true; // call site not resolved
>>>> -------------------------------------------------
>>>> I think only one of them is needed.
>>>
>>> The checks are complementary: one inspects callee and the other looks at call site.
>>>
>>> "!callee_method->was_executed_more_than(0)" ensures that callee was executed at least once.
>>>
>>> "caller_method->is_not_reached(caller_bci)" inspects the state of the call site. If corresponding
>>> CP entry is not resolved, then the call site isn't reached. If is_not_reached() returns false,
>>> it's not a definitive answer: there's still a chance the site is not reached - consider the case
>>> of virtual calls where callee_method may differ for the same resolved method.
>>>
>>>> 2. Does the assert in InlineTree::is_not_reached(...) make sense?
>>>> Since we have
>>>> -------------------------------------------------
>>>> if (profile.count() > 0)   return false; // reachable according to profile
>>>> -------------------------------------------------
>>>> and
>>>> -------------------------------------------------
>>>> if (profile.count() == -1) {...}
>>>> -------------------------------------------------
>>>> before
>>>> -------------------------------------------------
>>>> assert(profile.count() == 0, "sanity");
>>>> -------------------------------------------------
>>>> is the assert redundant?
>>>
>>> Asserts are intended to be redundant :-) But still catch bugs from time to time.
>>>
>>> This one, in particular, checks invariant on profile.count() >= -1 (which is not very useful by
>>> itself), but also stresses that "profile.count() == 0" case is being processed.
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>
>
>

From claes.redestad at oracle.com Thu May 2 11:03:18 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Thu, 2 May 2019 13:03:18 +0200
Subject: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64
In-Reply-To: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com>
References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com>
Message-ID: <37ee4fe8-d962-6e05-82f0-7258f5083459@oracle.com>

Hi Vladimir,

On 2019-05-02 01:17, Vladimir Ivanov wrote:
> Performance experiments with clojure [2] demonstrated that the fix
> almost completely recuperates the regression:
>
>   (1) always reresolve (w/o the fix):    ~12,0s ( 1x)
>   (2) C1/C2 barriers only:                ~3,8s (~3x)
>   (3) int/C1/C2 barriers:                 ~3,2s (-20%)
> --------
>   (4) barriers disabled for invokestatic  ~3,2s

good stuff!

Just to add a few data points I turned some of my earlier experiments to try
and isolate some of these issues into a little stress test:

BadStress[1]:
11.0.1: 136ms
11.0.2: 13500ms
jdk/jdk baseline: 126ms
jdk/jdk patched: 123ms

GoodStress[2] (baseline):
11.0.1: 56ms
11.0.2: 54ms
jdk/jdk baseline: 48ms
jdk/jdk patched: 47ms

Observations:

- On latest jdk/jdk, we've already recuperated most of the cost exposed
  in these synthetic tests due to related fixes (mainly
  https://bugs.openjdk.java.net/browse/JDK-8188133 and
  https://bugs.openjdk.java.net/browse/JDK-8219974 ), but the patch
  helps a bit here too and we're net faster than 11.0.1 (also when
  taking into account how startup in general has improved since)

- The small 1ms startup improvement with the patch on the baseline test
  is sustained and significant, indicating we have some internal JDK
  classes exercised during bootstrap which benefit directly from your
  fixes. I've verified this improvement translates to all our other
  small-app startup tests.

- My tests were too naïve to capture all the overheads seen with clj

- Likely still good performance advice to avoid heavy lifting in static
  initializers.

All in all I think this is a great improvement and hope the added complexity is
deemed acceptable.

Thanks!

/Claes

[1]
public class BadStress {
  static void foo() {}
  static void bar() {}
  public static class Helper {
    static void foo() { BadStress.foo(); }
  }
  static {
    long start = System.nanoTime();
    for (int i = 0; i < 10_000_000; i++) {
      Helper.foo();
    }
    for (int i = 0; i < 10_000_000; i++) {
      bar();
    }
    long end = System.nanoTime();
    System.out.println("Elapsed: " + (end - start) + " ns");
  }
  public static void main(String... args) {}
}

[2]
public class GoodStress {
  public static class Helper {
    static void foo() {}
    static void bar() {}
  }
  static {
    long start = System.nanoTime();
    for (int i = 0; i < 10_000_000; i++) {
      Helper.foo();
    }
    for (int i = 0; i < 10_000_000; i++) {
      Helper.bar();
    }
    long end = System.nanoTime();
    System.out.println("Elapsed: " + (end - start) + " ns");
  }
  public static void main(String... args) {}
}

From tobias.hartmann at oracle.com Thu May 2 12:02:15 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 2 May 2019 14:02:15 +0200
Subject: RFR: 8221542: ~15% performance degradation due to less optimized inline decision
In-Reply-To:
References: <6aebd883-0be7-0b05-5364-262e138a1fbc@loongson.cn> <182d87da-0d99-3f33-fbe7-ef5818be0422@loongson.cn> <0936427d-f4d2-299a-87ce-860dce5e57e1@loongson.cn> <574d59f5-3437-738f-e10c-796dcb02b42e@oracle.com> <5275854c-ab35-f160-f6f0-6ab9ac86e3d0@loongson.cn> <8bc507fe-b6db-d697-8821-0547860de232@oracle.com> <1a398a1f-ed52-2197-5886-d9d5fd872974@loongson.cn> <5607f7ca-57b9-b409-3bce-efc1688f0678@loongson.cn>
Message-ID: <9834558d-20f7-5bc6-4058-7cd007b0ad5f@oracle.com>

On 02.05.19 11:23, Tobias Hartmann wrote:
> I've submitted some extended testing and let you know once it passed.

Testing passed.

Best regards,
Tobias

From nils.eliasson at oracle.com Thu May 2 12:31:29 2019
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Thu, 2 May 2019 14:31:29 +0200
Subject: RFR(M): 8216137: assert(Compile::current()->live_nodes() < Compile::current()->max_node_limit()) failed: Live Node limit exceeded limit
In-Reply-To: <0553d83e-9d77-0295-acff-9fd3e8a44043@oracle.com>
References: <0553d83e-9d77-0295-acff-9fd3e8a44043@oracle.com>
Message-ID: <48a14e3b-cd31-b906-7fb7-41be0e845b83@oracle.com>

Looks good!

Regards,
Nils

On 2019-04-30 16:28, Patric Hedlin wrote:
> Dear all,
>
> I would like to ask for help to review the following change/update:
>
> Issue:  https://bugs.openjdk.java.net/browse/JDK-8216137
> Webrev: http://cr.openjdk.java.net/~phedlin/tr8216137/
>
> 8216137: assert(Compile::current()->live_nodes() <
> Compile::current()->max_node_limit()) failed:
>          Live Node limit exceeded limit
>
> Also addressed:
>
> 8219520: assert(Compile::current()->live_nodes() <
> Compile::current()->max_node_limit()) failed:
>          Live Node limit exceeded limit
>
> Approach:
>
>     Adding a simplistic (ad-hoc) node budget mechanism, applied during
> loop transforms.
>
>
> Testing: hs-tier1..4, hs-precheckin-comp, Kitchensink24h
>
>
> Caveat:  Testing and benchmarking needs to be rerun but is currently
> experiencing issues.
>
>
> Best regards,
> Patric
>

From nils.eliasson at oracle.com Thu May 2 12:41:19 2019
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Thu, 2 May 2019 14:41:19 +0200
Subject: RFR(S): 8223140: Clean-up in 'ok_to_convert()'.
In-Reply-To:
References:
Message-ID:

+1

Regards,
Nils

On 2019-04-30 18:54, Vladimir Ivanov wrote:
> Looks good.
>
> I like precond/postcond macros.
>
> +static bool is_cloop_increment(Node* inc) {
> +  precond(inc->Opcode() == Op_AddI || inc->Opcode() == Op_AddL);
>
> Best regards,
> Vladimir Ivanov
>
> On 30/04/2019 07:11, Patric Hedlin wrote:
>> Dear all,
>>
>> I would like to ask for help to review the following change/update:
>>
>> Issue:  https://bugs.openjdk.java.net/browse/JDK-8223140
>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8223140/
>>
>> 8223140: Clean-up in 'ok_to_convert()'
>>
>>      Simplify logic in 'ok_to_convert()'.
>>      Rename 'is_loop_iv()' to 'is_cloop_ind_var()'.
>>      Adding precond/postcond macros.
>>
>>
>> Testing: Part of 8216137 (hs-tier1..4, hs-precheckin-comp,
>> Kitchensink24h)
>>
>>
>> Best regards,
>> Patric
>>

From patric.hedlin at oracle.com Thu May 2 13:32:58 2019
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Thu, 2 May 2019 15:32:58 +0200
Subject: RFR(S): 8223140: Clean-up in 'ok_to_convert()'.
In-Reply-To:
References:
Message-ID:

Thanks Nils.

/Patric

On 02/05/2019 14:41, Nils Eliasson wrote:
> +1
>
> Regards,
>
> Nils
>
>
> On 2019-04-30 18:54, Vladimir Ivanov wrote:
>> Looks good.
>>
>> I like precond/postcond macros.
>>
>> +static bool is_cloop_increment(Node* inc) {
>> +  precond(inc->Opcode() == Op_AddI || inc->Opcode() == Op_AddL);
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 30/04/2019 07:11, Patric Hedlin wrote:
>>> Dear all,
>>>
>>> I would like to ask for help to review the following change/update:
>>>
>>> Issue:  https://bugs.openjdk.java.net/browse/JDK-8223140
>>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8223140/
>>>
>>> 8223140: Clean-up in 'ok_to_convert()'
>>>
>>>      Simplify logic in 'ok_to_convert()'.
>>>      Rename 'is_loop_iv()' to 'is_cloop_ind_var()'.
>>>      Adding precond/postcond macros.
>>>
>>>
>>> Testing: Part of 8216137 (hs-tier1..4, hs-precheckin-comp,
>>> Kitchensink24h)
>>>
>>>
>>> Best regards,
>>> Patric
>>>

From patric.hedlin at oracle.com Thu May 2 13:33:39 2019
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Thu, 2 May 2019 15:33:39 +0200
Subject: RFR(M): 8216137: assert(Compile::current()->live_nodes() < Compile::current()->max_node_limit()) failed: Live Node limit exceeded limit
In-Reply-To: <48a14e3b-cd31-b906-7fb7-41be0e845b83@oracle.com>
References: <0553d83e-9d77-0295-acff-9fd3e8a44043@oracle.com> <48a14e3b-cd31-b906-7fb7-41be0e845b83@oracle.com>
Message-ID: <34aeb06a-d0f4-87db-0c0a-3cb179ddb9e6@oracle.com>

Thanks Nils.

/Patric

On 02/05/2019 14:31, Nils Eliasson wrote:
> Looks good!
>
> Regards,
>
> Nils
>
>
> On 2019-04-30 16:28, Patric Hedlin wrote:
>> Dear all,
>>
>> I would like to ask for help to review the following change/update:
>>
>> Issue:  https://bugs.openjdk.java.net/browse/JDK-8216137
>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8216137/
>>
>> 8216137: assert(Compile::current()->live_nodes() <
>> Compile::current()->max_node_limit()) failed:
>>          Live Node limit exceeded limit
>>
>> Also addressed:
>>
>> 8219520: assert(Compile::current()->live_nodes() <
>> Compile::current()->max_node_limit()) failed:
>>          Live Node limit exceeded limit
>>
>> Approach:
>>
>>     Adding a simplistic (ad-hoc) node budget mechanism, applied
>> during loop transforms.
>>
>>
>> Testing: hs-tier1..4, hs-precheckin-comp, Kitchensink24h
>>
>>
>> Caveat:  Testing and benchmarking needs to be rerun but is currently
>> experiencing issues.
>>
>>
>> Best regards,
>> Patric
>>

From fujie at loongson.cn Thu May 2 15:18:43 2019
From: fujie at loongson.cn (Jie Fu)
Date: Thu, 2 May 2019 23:18:43 +0800
Subject: RFR: 8221542: ~15% performance degradation due to less optimized inline decision
In-Reply-To:
References: <6aebd883-0be7-0b05-5364-262e138a1fbc@loongson.cn> <182d87da-0d99-3f33-fbe7-ef5818be0422@loongson.cn> <0936427d-f4d2-299a-87ce-860dce5e57e1@loongson.cn> <574d59f5-3437-738f-e10c-796dcb02b42e@oracle.com> <5275854c-ab35-f160-f6f0-6ab9ac86e3d0@loongson.cn> <8bc507fe-b6db-d697-8821-0547860de232@oracle.com> <1a398a1f-ed52-2197-5886-d9d5fd872974@loongson.cn> <5607f7ca-57b9-b409-3bce-efc1688f0678@loongson.cn>
Message-ID: <3910e97e-009e-598a-f91a-8872ecd7ec18@loongson.cn>

Hi Tobias,

Thank you for your review.
I will add the brackets as soon as I come back to my office this Sunday.

I sincerely hope that someone from the runtime-dev can also help to review this patch[1].

Thanks a lot.
Best regards,
Jie

[1] http://cr.openjdk.java.net/~vlivanov/jiefu/8221542/webrev.02/

On 2019-05-02 17:23, Tobias Hartmann wrote:
> Hi Jie,
>
> this looks good to me too but please add brackets to the checks in InlineTree::is_not_reached.
>
> I've submitted some extended testing and let you know once it passed.
>
> Someone from the runtime team should also have a look at this because your changes affect the
> interpreter. CC'ing runtime-dev.
>
> Thanks,
> Tobias
>
> On 29.04.19 15:43, Jie Fu wrote:
>> Hi all,
>>
>> May I have another review for this change [1] to finalize the fix?
>> Thanks a lot.
>>
>> Best regards,
>> Jie
>>
>> [1] http://cr.openjdk.java.net/~vlivanov/jiefu/8221542/webrev.02/
>>
>>
>> On 2019-04-20 11:35, Jie Fu wrote:
>>> Ah, I got it.
>>> I like your patch and benefit a lot from you.
>>> Thank you so much, Vladimir.
>>>
>>> Any comments from other reviewers?
>>> Thanks.
>>>
>>> Best regards,
>>> Jie
>>>
>>> On 2019/4/20 11:18, Vladimir Ivanov wrote:
>>>>>> After some explorations I decided to keep original behavior for immature profiles
>>>>>> (profile.count == -1).
>>>>> I agree.
>>>>>
>>>>> I have two questions here.
>>>>>
>>>>> 1. What's the difference of the following two if statements?
>>>>> -------------------------------------------------
>>>>> + if (!callee_method->was_executed_more_than(0)) return true; // callee was never executed
>>>>> +
>>>>> + if (caller_method->is_not_reached(caller_bci)) return true; // call site not resolved
>>>>> -------------------------------------------------
>>>>> I think only one of them is needed.
>>>> The checks are complementary: one inspects callee and the other looks at call site.
>>>>
>>>> "!callee_method->was_executed_more_than(0)" ensures that callee was executed at least once.
>>>>
>>>> "caller_method->is_not_reached(caller_bci)" inspects the state of the call site. If corresponding
>>>> CP entry is not resolved, then the call site isn't reached. If is_not_reached() returns false,
>>>> it's not a definitive answer: there's still a chance the site is not reached - consider the case
>>>> of virtual calls where callee_method may differ for the same resolved method.
>>>>
>>>>> 2. Does the assert in InlineTree::is_not_reached(...) make sense?
>>>>> Since we have
>>>>> -------------------------------------------------
>>>>> if (profile.count() > 0) return false; // reachable according to profile
>>>>> -------------------------------------------------
>>>>> and
>>>>> -------------------------------------------------
>>>>> if (profile.count() == -1) {...}
>>>>> -------------------------------------------------
>>>>> before
>>>>> -------------------------------------------------
>>>>> assert(profile.count() == 0, "sanity");
>>>>> -------------------------------------------------
>>>>> is the assert redundant?
>>>> Asserts are intended to be redundant :-) But still catch bugs from time to time.
>>>>
>>>> This one, in particular, checks invariant on profile.count() >= -1 (which is not very useful by
>>>> itself), but also stresses that "profile.count() == 0" case is being processed.
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>

From derekw at marvell.com Thu May 2 15:19:10 2019
From: derekw at marvell.com (Derek White)
Date: Thu, 2 May 2019 15:19:10 +0000
Subject: [EXT] Re: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64
In-Reply-To: <37ee4fe8-d962-6e05-82f0-7258f5083459@oracle.com>
References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com> <37ee4fe8-d962-6e05-82f0-7258f5083459@oracle.com>
Message-ID:

Hi Vladimir,

I want to be clear on the relationship between bugs and patches:

8223213 and patchset is intended to *replace* 8219233 and its patchset, or be applied on top of it?
https://bugs.openjdk.java.net/browse/JDK-8223213,
https://bugs.openjdk.java.net/browse/JDK-8219233

Thanks!
- Derek

> -----Original Message-----
> From: hotspot-dev On Behalf Of
> Claes Redestad
> Sent: Thursday, May 02, 2019 7:03 AM
> To: Vladimir Ivanov ; hotspot compiler
> ; hotspot-runtime-dev runtime-dev at openjdk.java.net>; hotspot-dev developers dev at openjdk.java.net>
> Subject: [EXT] Re: [13] RFR (M): 8223213: Implement fast class initialization
> checks on x86-64
>
> External Email
>
> ----------------------------------------------------------------------
> Hi Vladimir,
>
> On 2019-05-02 01:17, Vladimir Ivanov wrote:
> > Performance experiments with clojure [2] demonstrated that the fix
> > almost completely recuperates the regression:
> >
> >   (1) always reresolve (w/o the fix):    ~12,0s ( 1x)
> >   (2) C1/C2 barriers only:                ~3,8s (~3x)
> >   (3) int/C1/C2 barriers:                 ~3,2s (-20%)
> > --------
> >   (4) barriers disabled for invokestatic  ~3,2s
>
> good stuff!
>
> Just to add a few data points I turned some of my earlier experiments to try
> and isolate some of these issues into a little stress test:
>
> BadStress[1]:
> 11.0.1: 136ms
> 11.0.2: 13500ms
> jdk/jdk baseline: 126ms
> jdk/jdk patched: 123ms
>
> GoodStress[2] (baseline):
> 11.0.1: 56ms
> 11.0.2: 54ms
> jdk/jdk baseline: 48ms
> jdk/jdk patched: 47ms
>
> Observations:
>
> - On latest jdk/jdk, we've already recuperated most of the cost exposed
> in these synthetic tests due to related fixes (mainly
> https://bugs.openjdk.java.net/browse/JDK-8188133 and
> https://bugs.openjdk.java.net/browse/JDK-8219974 ), but the patch
> helps a bit here too and we're net faster than 11.0.1 (also when
> taking into account how startup in general has improved since)
>
> - The small 1ms startup improvement with the patch on the baseline test
> is sustained and significant, indicating we have some internal JDK
> classes exercised during bootstrap which benefit directly from your
> fixes. I've verified this improvement translates to all our other
> small-app startup tests.
>
> - My tests were too naïve to capture all the overheads seen with clj
>
> - Likely still good performance advice to avoid heavy lifting in static
> initializers.
>
> All in all I think this is a great improvement and hope the added complexity is
> deemed acceptable.
>
> Thanks!
>
> /Claes
>
> [1]
> public class BadStress {
>   static void foo() {}
>   static void bar() {}
>   public static class Helper {
>     static void foo() { BadStress.foo(); }
>   }
>   static {
>     long start = System.nanoTime();
>     for (int i = 0; i < 10_000_000; i++) {
>       Helper.foo();
>     }
>     for (int i = 0; i < 10_000_000; i++) {
>       bar();
>     }
>     long end = System.nanoTime();
>     System.out.println("Elapsed: " + (end - start) + " ns");
>   }
>   public static void main(String... args) {} }
>
> [2]
> public class GoodStress {
>   public static class Helper {
>     static void foo() {}
>     static void bar() {}
>   }
>   static {
>     long start = System.nanoTime();
>     for (int i = 0; i < 10_000_000; i++) {
>       Helper.foo();
>     }
>     for (int i = 0; i < 10_000_000; i++) {
>       Helper.bar();
>     }
>     long end = System.nanoTime();
>     System.out.println("Elapsed: " + (end - start) + " ns");
>   }
>   public static void main(String... args) {} }

From vladimir.x.ivanov at oracle.com Thu May 2 16:26:20 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 2 May 2019 09:26:20 -0700
Subject: [EXT] Re: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64
In-Reply-To:
References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com> <37ee4fe8-d962-6e05-82f0-7258f5083459@oracle.com>
Message-ID: <0201a50a-0006-1b28-9b7b-4ce5ed534795@oracle.com>

Derek,

> I want to be clear on the relationship between bugs and patches:
>
> 8223213 and patchset is intended to *replace* 8219233 and its patchset, or be applied on top of it?
> https://bugs.openjdk.java.net/browse/JDK-8223213,
> https://bugs.openjdk.java.net/browse/JDK-8219233

The former: 8223213 supersedes 8219233 patchset.
The plan is to use 8219233 as an umbrella for tracking the progress of relevant fixes and backports.

Best regards,
Vladimir Ivanov

>> -----Original Message-----
>> From: hotspot-dev On Behalf Of
>> Claes Redestad
>> Sent: Thursday, May 02, 2019 7:03 AM
>> To: Vladimir Ivanov ; hotspot compiler
>> ; hotspot-runtime-dev
> runtime-dev at openjdk.java.net>; hotspot-dev developers
> dev at openjdk.java.net>
>> Subject: [EXT] Re: [13] RFR (M): 8223213: Implement fast class initialization
>> checks on x86-64
>>
>> External Email
>>
>> ----------------------------------------------------------------------
>> Hi Vladimir,
>>
>> On 2019-05-02 01:17, Vladimir Ivanov wrote:
>>> Performance experiments with clojure [2] demonstrated that the fix
>>> almost completely recuperates the regression:
>>>
>>>   (1) always reresolve (w/o the fix):    ~12,0s ( 1x)
>>>   (2) C1/C2 barriers only:                ~3,8s (~3x)
>>>   (3) int/C1/C2 barriers:                 ~3,2s (-20%)
>>> --------
>>>   (4) barriers disabled for invokestatic  ~3,2s
>>
>> good stuff!
>>
>> Just to add a few data points I turned some of my earlier experiments to try
>> and isolate some of these issues into a little stress test:
>>
>> BadStress[1]:
>> 11.0.1: 136ms
>> 11.0.2: 13500ms
>> jdk/jdk baseline: 126ms
>> jdk/jdk patched: 123ms
>>
>> GoodStress[2] (baseline):
>> 11.0.1: 56ms
>> 11.0.2: 54ms
>> jdk/jdk baseline: 48ms
>> jdk/jdk patched: 47ms
>>
>> Observations:
>>
>> - On latest jdk/jdk, we've already recuperated most of the cost exposed
>> in these synthetic tests due to related fixes (mainly
>> https://bugs.openjdk.java.net/browse/JDK-8188133 and
>> https://bugs.openjdk.java.net/browse/JDK-8219974 ), but the patch
>> helps a bit here too and we're net faster than 11.0.1 (also when
>> taking into account how startup in general has improved since)
>>
>> - The small 1ms startup improvement with the patch on the baseline test
>> is sustained and significant, indicating we have some internal JDK
>> classes exercised during bootstrap which benefit directly from your
>> fixes. I've verified this improvement translates to all our other
>> small-app startup tests.
>>
>> - My tests were too naïve to capture all the overheads seen with clj
>>
>> - Likely still good performance advice to avoid heavy lifting in static
>> initializers.
>>
>> All in all I think this is a great improvement and hope the added complexity is
>> deemed acceptable.
>>
>> Thanks!
>>
>> /Claes
>>
>> [1]
>> public class BadStress {
>>   static void foo() {}
>>   static void bar() {}
>>   public static class Helper {
>>     static void foo() { BadStress.foo(); }
>>   }
>>   static {
>>     long start = System.nanoTime();
>>     for (int i = 0; i < 10_000_000; i++) {
>>       Helper.foo();
>>     }
>>     for (int i = 0; i < 10_000_000; i++) {
>>       bar();
>>     }
>>     long end = System.nanoTime();
>>     System.out.println("Elapsed: " + (end - start) + " ns");
>>   }
>>   public static void main(String... args) {} }
>>
>> [2]
>> public class GoodStress {
>>   public static class Helper {
>>     static void foo() {}
>>     static void bar() {}
>>   }
>>   static {
>>     long start = System.nanoTime();
>>     for (int i = 0; i < 10_000_000; i++) {
>>       Helper.foo();
>>     }
>>     for (int i = 0; i < 10_000_000; i++) {
>>       Helper.bar();
>>     }
>>     long end = System.nanoTime();
>>     System.out.println("Elapsed: " + (end - start) + " ns");
>>   }
>>   public static void main(String... args) {} }

From vladimir.x.ivanov at oracle.com Thu May 2 16:28:07 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 2 May 2019 09:28:07 -0700
Subject: [13] RFR (S): 8223171: Redundant nmethod dependencies for effectively final methods
In-Reply-To: <8788c5d0-48f7-6cfd-f733-19ec5bee84b0@oracle.com>
References: <5cdb781f-1922-9a5d-9f52-f6874fd6a259@oracle.com> <50f8065b-445f-ae1d-00c8-743fd870404a@oracle.com> <8788c5d0-48f7-6cfd-f733-19ec5bee84b0@oracle.com>
Message-ID: <7add187c-56d8-243b-3f1e-f2bfc933a7f7@oracle.com>

Thanks, Dean.

Best regards,
Vladimir Ivanov

On 01/05/2019 19:30, dean.long at oracle.com wrote:
> Yes, that's exactly what I had in mind :-)
>
> dl
>
> On 5/1/19 3:15 PM, Vladimir Ivanov wrote:
>>
>>> Can you also add check_unique_method(ctxk, uniqm) to the version of
>>> assert_unique_concrete_method that takes a Method*?
>>
>> Like this?
>>   http://cr.openjdk.java.net/~vlivanov/8223171/webrev.02/
>>
>> Best regards,
>> Vladimir Ivanov
>>
>>> On 5/1/19 9:52 AM, Vladimir Ivanov wrote:
>>>>
>>>>> Does this allow us to assert !uniqm->can_be_statically_bound() in
>>>>> Dependencies::assert_unique_concrete_method?
>>>>
>>>> In general, no. It doesn't hold for final methods: dependency is
>>>> still needed when context is broad enough, since an overriding
>>>> method can be loaded in a different part of the hierarchy (under the
>>>> same context class).
>>>>
>>>> In case of the adjusted checks it's safe, since context == method
>>>> holder when actual_receiver->is_final() == true.
>>>>
>>>>     if (!callee->is_final_method() && !callee->is_private() &&
>>>> !actual_receiver->is_final()) {
>>>> dependencies()->assert_unique_concrete_method(actual_receiver,
>>>> cha_monomorphic_target);
>>>>     }
>>>>
>>>> I refactored the patch a bit:
>>>>   http://cr.openjdk.java.net/~vlivanov/8223171/webrev.01/
>>>>
>>>>>> Moreover, C2 does add dependencies for private methods.
>>>>
>>>> I take it back. Earlier checks handle private methods. Only methods
>>>> on final classes get redundant dependencies.
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>
>

From john.r.rose at oracle.com Thu May 2 18:31:50 2019
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 2 May 2019 11:31:50 -0700
Subject: [13] RFR: 8202414: Unsafe write after primitive array creation may result in array length change
In-Reply-To:
References: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com> <392c665f-869c-29af-4fc5-e6f844820846@oracle.com> <3db5d7ab-ad99-310b-e891-fc36d25da338@oracle.com> <7b03a213-7fee-a87f-b48d-250662e730ef@oracle.com> <959abf54-d1da-95ee-9cf6-6c6d8ec5e4a1@oracle.com> <18115aa8-edaa-31b9-02a6-06721d9fbfc9@oracle.com> <939f3f5d-b8e7-939f-8953-d34a0f3ff6c9@oracle.com> <259ef902-778b-7eef-46e2-d1927950d21c@oracle.com> <73f7c647-3194-2a65-6cc6-a15cbf6c82be@oracle.com> <37837126-c9d5-1bb1-fc9a-6fb9b848efbe@oracle.com> <28955bc6-020a-29e1-953c-e9f48932cd56@oracle.com> <0E11910B-A4B0-4F64-9B87-5A4BF065B9D2@oracle.com>
Message-ID: <2BD07B45-71CE-41FD-B83B-6E043FB63F09@oracle.com>

On May 1, 2019, at 11:45 PM, Rahul Raghavan wrote:
>
> Thank you, understood your point.
> But please note here new code cannot be inserted at the other deleted code location as such, due to the dependency on 'offset'!

Oops, my bad! Carry on.
:-)

From vladimir.x.ivanov at oracle.com Thu May 2 21:55:48 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 2 May 2019 14:55:48 -0700
Subject: [13] RFR (M): 8223216: C2: Unify class initialization checks between new, getstatic, and putstatic
In-Reply-To: <79c9e6ca-bde7-db7d-4c74-51ee9ddac4f6@oracle.com>
References: <79c9e6ca-bde7-db7d-4c74-51ee9ddac4f6@oracle.com>
Message-ID:

Thanks, Vladimir.

Best regards,
Vladimir Ivanov

On 01/05/2019 17:42, Vladimir Kozlov wrote:
> Looks good.
>
> Thanks,
> Vladimir
>
> On 5/1/19 4:37 PM, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~vlivanov/8223216/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8223216
>>
>> (The patch has minor dependencies on 8223213 [1] I sent out for review
>> earlier.)
>>
>> C2 implements class initialization checks for new and
>> getstatic/putstatic differently: while "new" supports fast class
>> initialization checks, static field accesses rely on uncommon traps
>> which may lead to deoptimization/recompilation storms during
>> long-running class initialisation.
>>
>> Proposed patch unifies implementation between them and uses the
>> following barrier:
>>     if (holder->is_initialized()) {
>>       uncommon_trap(initialized, reinterpret);
>>     }
>>     if (!holder->is_reentrant_initialization(current_thread)) {
>>       uncommon_trap(uninitialized, none);
>>     }
>>
>> It also enhances checks for not-yet-initialized classes
>> (Compile::needs_clinit_barrier) and unifies the implementation between
>> new, invokestatic, and getfield/putfield.
>>
>> Testing: tier1-5, targeted microbenchmarks, new test from 8223213
>>
>> Thanks!
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/
>>     https://bugs.openjdk.java.net/browse/JDK-8223213
>>

From xxinliu at amazon.com Fri May 3 00:21:07 2019
From: xxinliu at amazon.com (Liu, Xin)
Date: Fri, 3 May 2019 00:21:07 +0000
Subject: 8222670 patch review: prevent downgraded tasks from recompiling
In-Reply-To: <0fca1798-5851-3e5d-e603-54282dc3be81@oracle.com>
References: <99aae03d0315482c723abda2f2cb530b4b52f82d.camel@redhat.com> <427BC0A9-DAB2-43A3-AF93-F96414EC1E7E@amazon.com> <0fca1798-5851-3e5d-e603-54282dc3be81@oracle.com>
Message-ID:

Hi, Tobias,

Thanks for the review. I fixed copyrights and the typo of clearMethodState0.
Here is the new revision.
https://cr.openjdk.java.net/~xliu/8222670/webrev.04/

On 5/2/19, 2:05 AM, "Tobias Hartmann" wrote:

    Hi,

    in the bug description you state:
    > CompileBroker::compile_method fails to detect the pre-existing nmethod because comp_level doesn't match

    But why is that? If a downgraded compilation succeeded at level 2, shouldn't a re-compilation at the
    same level be detected by CompileBroker::compilation_is_complete() in CompileBroker::compile_method()?

That's the very root cause of level2 recompilation. In CompileBroker::compile_method(), its input argument is comp_level = 3. CompileBroker::compilation_is_complete returns false because codecache only has level=2 nmethod. I don't know why, but HotSpot is also very stubborn. It will request level = 3 again and again. All of them are downgraded to level=2 when they dequeue. Level2RecompilationTest simulates this process. I didn't make it up. I observe the symptom in some real services as follows.
https://bugs.openjdk.java.net/secure/attachment/82079/lvl2_recomp_spring.log.zip

thanks,
--lx

    You need to update the copyright date in Level2RecompilationTest.java (should be 2019 only).

    Thanks,
    Tobias

    On 26.04.19 09:36, Liu, Xin wrote:
    > Gently ping.
    >
    > Bug: https://bugs.openjdk.java.net/browse/JDK-8222670
    > I got the new revision.
> https://cr.openjdk.java.net/~xliu/8222670/webrev.03/ > > I finish up test Level2RecompilationTest.java. if you want to start a OSR compilation, you have to specify bci which points to the begin of a BB. > Give them bci = 0 is good enough for general cases. > > Thanks, > --lx > > > > > On 4/19/19, 11:19 PM, "Liu, Xin" wrote: > > hi, Severin, > > Thanks for reviewing. Yes, it's irrelevant. I revert it. please check it out. > https://cr.openjdk.java.net/~xliu/8222670/webrev.02/ > > Please note that I added an assertion InstanceKlass::add_osr_nmethod(nmethod* n) in this webrev. > In my understanding, it is a potential memleak of codecache. If there's no higher level of osr compilation, those dups will stay in codecache forever. > > Further, it doesn?t make sense to recompile with the same level and same bci. With this assertion, the following tests in tier1-test failed. > test/hotspot/jtreg/compiler/intrinsics/unsafe/DirectByteBufferTest.java > test/hotspot/jtreg/compiler/intrinsics/unsafe/HeapByteBufferTest.java > test/jdk/java/util/stream/test/org/openjdk/tests/java/util/stream/ToArrayOpTest.java > test/jdk/tools/pack200/Pack200Test.java > test/jdk/java/util/Arrays/SortingNearlySortedPrimitive.java > > All crashes happen as I described in JDK-8222670. Eg. duplicated OSR compilations occur for level2. > > Program received signal SIGSEGV, Segmentation fault. > # To suppress the following error report, specify this argument > # after -XX: or in .hotspotrc: SuppressErrorAt=/instanceKlass.cpp:2972 > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/src/src/hotspot/share/oops/instanceKlass.cpp:2972), pid=8347, tid=8361 > # assert(prev == __null || !prev->is_in_use()) failed: redundunt OSR recompilation detected. memory leak in CodeCache! 
> # > # JRE version: OpenJDK Runtime Environment (13.0) (slowdebug build 13-internal+0-adhoc..src) > # Java VM: OpenJDK 64-Bit Server VM (slowdebug 13-internal+0-adhoc..src, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) > # Problematic frame: > # V [libjvm.so+0xb3dbb4] InstanceKlass::add_osr_nmethod(nmethod*)+0xc4 > # > # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /build/JTwork/scratch/hs_err_pid8347.log > > Program received signal SIGSEGV, Segmentation fault. > Compiled method (c1) 19032 752 % 2 ByteBufferTest::stepUsingAccessors @ 382 (633 bytes) > total in heap [0x00007fffd8f9ff90,0x00007fffd8fac628] = 50840 > relocation [0x00007fffd8fa0110,0x00007fffd8fa1388] = 4728 > main code [0x00007fffd8fa13a0,0x00007fffd8fa7f80] = 27616 > stub code [0x00007fffd8fa7f80,0x00007fffd8fa86c0] = 1856 > oops [0x00007fffd8fa86c0,0x00007fffd8fa86c8] = 8 > metadata [0x00007fffd8fa86c8,0x00007fffd8fa8800] = 312 > scopes data [0x00007fffd8fa8800,0x00007fffd8fa9ff8] = 6136 > scopes pcs [0x00007fffd8fa9ff8,0x00007fffd8fac408] = 9232 > dependencies [0x00007fffd8fac408,0x00007fffd8fac418] = 16 > nul chk table [0x00007fffd8fac418,0x00007fffd8fac628] = 528 > Compiled method (c1) 19032 752 % 2 ByteBufferTest::stepUsingAccessors @ 382 (633 bytes) > total in heap [0x00007fffd8f9ff90,0x00007fffd8fac628] = 50840 > relocation [0x00007fffd8fa0110,0x00007fffd8fa1388] = 4728 > main code [0x00007fffd8fa13a0,0x00007fffd8fa7f80] = 27616 > stub code [0x00007fffd8fa7f80,0x00007fffd8fa86c0] = 1856 > oops [0x00007fffd8fa86c0,0x00007fffd8fa86c8] = 8 > metadata [0x00007fffd8fa86c8,0x00007fffd8fa8800] = 312 > scopes data [0x00007fffd8fa8800,0x00007fffd8fa9ff8] = 6136 > scopes pcs [0x00007fffd8fa9ff8,0x00007fffd8fac408] = 9232 > dependencies [0x00007fffd8fac408,0x00007fffd8fac418] = 16 > nul chk table 
[0x00007fffd8fac418,0x00007fffd8fac628] = 528 > > > Thanks, > --lx > > On 4/19/19, 9:31 AM, "Severin Gehwolf" wrote: > > On Thu, 2019-04-18 at 19:46 +0000, Liu, Xin wrote: > > Hi, hotspot-compiler group, > > > > Could you review this webrev for JDK-8222670? > > https://cr.openjdk.java.net/~xliu/8222670/webrev.01/ > > +++ new/test/hotspot/jtreg/compiler/tiered/TieredLevelsTest.java 2019-04-18 12:18:38.000000000 -0700 > @@ -89,7 +89,7 @@ > && actual == COMP_LEVEL_LIMITED_PROFILE) { > // for simple method full_profile may be replaced by limited_profile > if (IS_VERBOSE) { > - System.out.printf("Level check: full profiling was replaced " > + System.out.println("Level check: full profiling was replaced " > + "by limited profiling. Expected: %d, actual:%d", > expected, actual); > > This seems an unintended change, is it? > > Thanks, > Severin > > > > > From dean.long at oracle.com Fri May 3 06:47:29 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Thu, 2 May 2019 23:47:29 -0700 Subject: RFR(S) 8218700: infinite loop in HotSpotJVMCIMetaAccessContext.fromClass after OutOfMemoryError In-Reply-To: References: <53bcf718-e543-d40c-5486-58b98f66bcee@oracle.com> Message-ID: <1e91a8e6-16bc-2ae0-8aaf-830e1c6b450a@oracle.com> On 5/1/19 5:44 PM, Tom Rodriguez wrote: > You'll need to update your webrev after Vladimir's push.? This code > has moved into HotSpootJVMCIRuntime.java. > Here's the updated version: http://cr.openjdk.java.net/~dlong/8218700/webrev.3/ > Maybe WeakReferenceHolder instead of WeakTypeRef?? It needs a comment > explaining that we're intentionally avoiding the use of > ClassValue.remove as well.? Shouldn't the ref field be volatile? > ClassValue includes some barrier semantics and the new code needs > similar guarantees. > I went ahead and made it volatile, but I don't understand what guarantee was missing, and what problem we want to eliminate, unless it is to reduce the possibility of duplicates.? 
But the fix for JDK-8201248 assumes that duplicates are possible, so I wasn't worried about that. dl > tom > > dean.long at oracle.com wrote on 4/26/19 12:09 PM: >> https://bugs.openjdk.java.net/browse/JDK-8218700 >> http://cr.openjdk.java.net/~dlong/8218700/webrev.2/ >> >> If we throw an OutOfMemoryError in the right place (see JDK-8222941), >> HotSpotJVMCIMetaAccessContext.fromClass can go into an infinite loop >> calling ClassValue.remove. To work around the problem, reset the >> value in a mutable cell instead of calling remove. >> >> dl From nils.eliasson at oracle.com Fri May 3 08:40:37 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 3 May 2019 10:40:37 +0200 Subject: RFR(S): 8223138: Small clean-up in loop-tree support. In-Reply-To: <757ff96c-ac8b-7d25-9222-dcd0830b1fed@oracle.com> References: <4883fb85-23ab-acf0-4687-3da50b070a4d@oracle.com> <757ff96c-ac8b-7d25-9222-dcd0830b1fed@oracle.com> Message-ID: <7ae9b35b-d679-d705-41d0-4a6646a2908b@oracle.com> Looks good! Regards, Nils On 2019-04-30 19:25, Patric Hedlin wrote: > Thanks Vladimir. > > On 2019-04-30 18:45, Vladimir Ivanov wrote: >> Looks good. >> >> Small nit: I find original version of IdealLoopTree::tail() easier to >> read. What do you think about the following? >> >> inline Node* IdealLoopTree::tail() { >> // Handle lazy update of _tail field >> if (_tail->in(0) == NULL) { >> _tail = _phase->get_ctrl(n); >> } >> return _tail; >> } >> > Sure, I'm fine with a revised "old" version as well. > > /Patric > >> Best regards, >> Vladimir Ivanov >> >> On 30/04/2019 07:09, Patric Hedlin wrote: >>> Dear all, >>> >>> I would like to ask for help to review the following change/update: >>> >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8223138 >>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8223138/ >>> >>> 8223138: Small clean-up in loop-tree support. >>> >>> Rename predicate 'is_inner()' to 'is_innermost()' to be accurate. >>>
Add 'is_root()' predicate for root parent test in loop-tree. >>> Change definition of 'is_loop()' to always lazy-read the tail, >>> since it should never be NULL. Clean-up of 'tail()' definition. >>> >>> >>> Testing: Part of 8216137 (hs-tier1..4, hs-precheckin-comp, >>> Kitchensink24h) >>> >>> >>> Best regards, >>> Patric >>> From patric.hedlin at oracle.com Fri May 3 08:42:19 2019 From: patric.hedlin at oracle.com (Patric Hedlin) Date: Fri, 3 May 2019 10:42:19 +0200 Subject: RFR(S): 8223138: Small clean-up in loop-tree support. In-Reply-To: <7ae9b35b-d679-d705-41d0-4a6646a2908b@oracle.com> References: <4883fb85-23ab-acf0-4687-3da50b070a4d@oracle.com> <757ff96c-ac8b-7d25-9222-dcd0830b1fed@oracle.com> <7ae9b35b-d679-d705-41d0-4a6646a2908b@oracle.com> Message-ID: Thanks Nils. /Patric On 03/05/2019 10:40, Nils Eliasson wrote: > Looks good! > > Regards, > > Nils > > On 2019-04-30 19:25, Patric Hedlin wrote: >> Thanks Vladimir. >> >> On 2019-04-30 18:45, Vladimir Ivanov wrote: >>> Looks good. >>> >>> Small nit: I find original version of IdealLoopTree::tail() easier >>> to read. What do you think about the following? >>> >>> inline Node* IdealLoopTree::tail() { >>> // Handle lazy update of _tail field >>> if (_tail->in(0) == NULL) { >>> _tail = _phase->get_ctrl(n); >>> } >>> return _tail; >>> } >>> >> Sure, I'm fine with a revised "old" version as well. >> >> /Patric >> >>> Best regards, >>> Vladimir Ivanov >>> >>> On 30/04/2019 07:09, Patric Hedlin wrote: >>>> Dear all, >>>> >>>> I would like to ask for help to review the following change/update: >>>> >>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8223138 >>>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8223138/ >>>> >>>> 8223138: Small clean-up in loop-tree support. >>>> >>>> Rename predicate 'is_inner()' to 'is_innermost()' to be accurate. >>>> Add 'is_root()' predicate for root parent test in loop-tree. >>>>
Change definition of 'is_loop()' to always lazy-read the tail, >>>> ???? since it should never be NULL. Clean-up of 'tail()' definition. >>>> >>>> >>>> Testing: Part of 8216137 (hs-tier1..4, hs-precheckin-comp, >>>> Kitchensink24h) >>>> >>>> >>>> Best regards, >>>> Patric >>>> From robbin.ehn at oracle.com Fri May 3 10:31:25 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Fri, 3 May 2019 12:31:25 +0200 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> Message-ID: <64a8afca-9dc8-b119-0a12-dd05799bdd22@oracle.com> Hi, please see this update: Inc: http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/index.html Full: http://cr.openjdk.java.net/~rehn/8221734/v2/webrev/ # Note http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html line 630 This is revert to the original, I accidental had left in a temporary test change, as you can see here in full diff: http://cr.openjdk.java.net/~rehn/8221734/v2/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html I think I manage to address all review comments. Dean can you please cast an extra eye on: http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/oops/method.cpp.sdiff.html This OR should be correct. Dan please do the same on the biased locking changes. I left out the merge with MutexLocker changes, since it was not interesting. There were some conflicts with JVMCI changes, so incremental contains some parts of that merge. Passes t1-5 and local testing. I'll continue with some additional testing. Thanks, Robbin On 4/25/19 2:05 PM, Robbin Ehn wrote: > Hi all, please review. > > Let's deopt with handshakes. > Removed VM op Deoptimize, instead we handshake. > Locks needs to be inflate since we are not in a safepoint. 
> > Goes on top of: > https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html > > Code: > http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html > Issue: > https://bugs.openjdk.java.net/browse/JDK-8221734 > > Passes t1-7 and multiple t1-5 runs. > > A few startup benchmark see a small speedup. > > Thanks, Robbin From tobias.hartmann at oracle.com Fri May 3 12:29:01 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 3 May 2019 14:29:01 +0200 Subject: 8222670 patch review: prevent downgraded tasks from recompiling In-Reply-To: References: <99aae03d0315482c723abda2f2cb530b4b52f82d.camel@redhat.com> <427BC0A9-DAB2-43A3-AF93-F96414EC1E7E@amazon.com> <0fca1798-5851-3e5d-e603-54282dc3be81@oracle.com> Message-ID: <88c6a4e1-b98b-b0a9-2f76-3f2595be7374@oracle.com> On 03.05.19 02:21, Liu, Xin wrote: > Thanks for the review. I fixed copyrights and the typo of clearMethodState0. > Here is the new revision. > https://cr.openjdk.java.net/~xliu/8222670/webrev.04/ Looks good to me but I think you should also add: if (PrintTieredEvents) { print_event(REMOVE_FROM_QUEUE, method, method, task->osr_bci(), (CompLevel) task->comp_level()); } > But why is that? If a downgraded compilation succeeded at level 2, shouldn't a re-compilation at the > same level be detected by CompileBroker::compilation_is_complete() in CompileBroker::compile_method()? > > That's the very root cause of level2 recompilation. > In CompileBroker::compile_method(), its input argument is comp_level = 3. > CompileBroker::compilation_is_complete returns false because codecache only has level=2 nmethod. > I don't know why, but hotpsot is also very stubborn. It will request level = 3 again and again. All of them are downgraded to level=2 when they dequeue. > > Level2RecompilationTest simulates this process. I didn't make it up. I observe the symptom in some real services as follows. 
> https://bugs.openjdk.java.net/secure/attachment/82079/lvl2_recomp_spring.log.zip Okay, got it. Thanks, Tobias From rwestrel at redhat.com Fri May 3 12:59:47 2019 From: rwestrel at redhat.com (Roland Westrelin) Date: Fri, 03 May 2019 14:59:47 +0200 Subject: RFR(S): 8222738: Shenandoah: assert(is_Proj()) failed when running cometd benchmarks In-Reply-To: <0f1a9600-2f2d-f360-9bc5-aa44f49d8990@redhat.com> References: <87zhonnwoq.fsf@redhat.com> <0f1a9600-2f2d-f360-9bc5-aa44f49d8990@redhat.com> Message-ID: <87lfznofr0.fsf@redhat.com> Thanks for the review. Actually, I think it's safer to also make the change below because we want to clone everything that's between the call and the fallthrough/exception paths, that is everything with a control of: the call itself or its control projection. Roland. diff -r f0739ec84bb4 -r 9968255985be src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp --- a/src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp Thu Apr 11 12:00:33 2019 +0200 +++ b/src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp Thu May 02 20:47:23 2019 +0200 @@ -1362,7 +1362,7 @@ if (idx < n->outcnt()) { Node* u = n->raw_out(idx); Node* c = phase->ctrl_or_self(u); - if (c == ctrl) { + if (phase->is_dominator(call, c) && phase->is_dominator(c, projs.fallthrough_proj)) { stack.set_index(idx+1); assert(!u->is_CFG(), ""); stack.push(u, 0); From vladimir.kozlov at oracle.com Fri May 3 16:03:31 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 3 May 2019 09:03:31 -0700 Subject: [13] RFR(S) 8223262: [AOT] jaotc crashes with assert(!(((ThreadShadow*)__the_thread__)->has_pending_exception())) failed: Should not allocate with exception pending Message-ID: <8603b6d7-8323-7078-aafa-c65437b06718@oracle.com> http://cr.openjdk.java.net/~kvn/8223262/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8223262 Added missing checks for pending exception. Fix was reviewed by Tom R. and Gilles D. and pushed into jvmci-8 by Doug S. 
I tested it with hs-tier4 and hs-tier6-graal where we had the problem. AOT compilation does not crush now but there are still test failures caused by JDK-8220623 which will be fixed later. -- Thanks, Vladimir From rkennke at redhat.com Fri May 3 16:59:52 2019 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 3 May 2019 18:59:52 +0200 Subject: RFR: JDK-8222079: Don't use memset to initialize fields decode_env constructor in disassembler.cpp In-Reply-To: References: Message-ID: Ping? Thanks, Roman > Recent gcc (I use version 9) complains about using memset to initialize > fields of decode_env. Let's use proper field initializers instead. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8222079 > Webrev: > http://cr.openjdk.java.net/~rkennke/JDK-8222079/webrev.01/ > > Can I please get a review? > > Thanks, Roman From tom.rodriguez at oracle.com Fri May 3 17:45:02 2019 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Fri, 3 May 2019 10:45:02 -0700 Subject: RFR(S) 8218700: infinite loop in HotSpotJVMCIMetaAccessContext.fromClass after OutOfMemoryError In-Reply-To: <1e91a8e6-16bc-2ae0-8aaf-830e1c6b450a@oracle.com> References: <53bcf718-e543-d40c-5486-58b98f66bcee@oracle.com> <1e91a8e6-16bc-2ae0-8aaf-830e1c6b450a@oracle.com> Message-ID: <3d15e9f0-8717-ac82-678d-2139dcfec7f8@oracle.com> dean.long at oracle.com wrote on 5/2/19 11:47 PM: > On 5/1/19 5:44 PM, Tom Rodriguez wrote: >> You'll need to update your webrev after Vladimir's push.? This code >> has moved into HotSpootJVMCIRuntime.java. >> > > Here's the updated version: > > http://cr.openjdk.java.net/~dlong/8218700/webrev.3/ Looks good to me. > >> Maybe WeakReferenceHolder instead of WeakTypeRef?? It needs a comment >> explaining that we're intentionally avoiding the use of >> ClassValue.remove as well.? Shouldn't the ref field be volatile? >> ClassValue includes some barrier semantics and the new code needs >> similar guarantees. 
>> > > I went ahead and made it volatile, but I don't understand what guarantee > was missing, and what problem we want to eliminate, unless it is to > reduce the possibility of duplicates.? But the fix for JDK-8201248 > assumes that duplicates are possible, so I wasn't worried about that. We're publishing a mutable locally created object to other threads so it seems like we need some sort of ordering barrier when we do so. Presumably the ClassValue would normally provide some ordering though it's a little unclear from the javadoc if it makes any such guarantees. Is the extra volatile unneeded? tom > > dl > >> tom >> >> dean.long at oracle.com wrote on 4/26/19 12:09 PM: >>> https://bugs.openjdk.java.net/browse/JDK-8218700 >>> http://cr.openjdk.java.net/~dlong/8218700/webrev.2/ >>> >>> If we throw an OutOfMemoryError in the right place (see JDK-8222941), >>> HotSpotJVMCIMetaAccessContext.fromClass can go into an infinite loop >>> calling ClassValue.remove.? To work around the problem, reset the >>> value in a mutable cell instead of calling remove. >>> >>> dl > From dean.long at oracle.com Fri May 3 17:54:09 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Fri, 3 May 2019 10:54:09 -0700 Subject: [13] RFR (M): 8223216: C2: Unify class initialization checks between new, getstatic, and putstatic In-Reply-To: References: Message-ID: <5e67b2d3-9856-069e-4886-8366c89bc3f8@oracle.com> I like the refactoring. Do you want to have a Runtime reviewer take a look at the new logic? Can you explain why Parse::clinit_deopt() changed from testing for InstanceKlass::fully_initialized to testing for InstanceKlass::being_initialized instead?? How do we know we it is the initializing thread? dl On 5/1/19 4:37 PM, Vladimir Ivanov wrote: > http://cr.openjdk.java.net/~vlivanov/8223216/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8223216 > > (The patch has minor dependencies on 8223213 [1] I sent out for review > earlier.) 
> > C2 implements class initialization checks for new and > getstatic/putstatic differently: while "new" supports fast class > initialization checks, static field accesses rely on uncommon traps > which may lead to deoptimization/recompilation storms during > long-running class initialisation. > > Proposed patch unifies implementation between them and uses the > following barrier: > ?? if (holder->is_initialized()) { > ???? uncommon_trap(initialized, reinterpret); > ?? } > ?? if (!holder->is_reentrant_initialization(current_thread)) { > ???? uncommon_trap(uninitialized, none); > ?? } > > It also enhances checks for not-yet-initialized classes > (Compile::needs_clinit_barrier) and unifies the implementation between > new, invokestatic, and getfield/putfield. > > Testing: tier1-5, targeted microbenchmarks, new test from 8223213 > > Thanks! > > Best regards, > Vladimir Ivanov > > [1] http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/ > ??? https://bugs.openjdk.java.net/browse/JDK-8223213 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dean.long at oracle.com Fri May 3 18:55:18 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Fri, 3 May 2019 11:55:18 -0700 Subject: RFR(S) 8218700: infinite loop in HotSpotJVMCIMetaAccessContext.fromClass after OutOfMemoryError In-Reply-To: <3d15e9f0-8717-ac82-678d-2139dcfec7f8@oracle.com> References: <53bcf718-e543-d40c-5486-58b98f66bcee@oracle.com> <1e91a8e6-16bc-2ae0-8aaf-830e1c6b450a@oracle.com> <3d15e9f0-8717-ac82-678d-2139dcfec7f8@oracle.com> Message-ID: <415627e2-165c-14a0-a069-2e01de5574d4@oracle.com> On 5/3/19 10:45 AM, Tom Rodriguez wrote: > > > dean.long at oracle.com wrote on 5/2/19 11:47 PM: >> On 5/1/19 5:44 PM, Tom Rodriguez wrote: >>> You'll need to update your webrev after Vladimir's push.? This code >>> has moved into HotSpootJVMCIRuntime.java. 
>>> >> >> Here's the updated version: >> >> http://cr.openjdk.java.net/~dlong/8218700/webrev.3/ > > Looks good to me. Thanks for the review. > >> >>> Maybe WeakReferenceHolder instead of WeakTypeRef? It needs a >>> comment explaining that we're intentionally avoiding the use of >>> ClassValue.remove as well. Shouldn't the ref field be volatile? >>> ClassValue includes some barrier semantics and the new code needs >>> similar guarantees. >>> >> >> I went ahead and made it volatile, but I don't understand what >> guarantee was missing, and what problem we want to eliminate, unless >> it is to reduce the possibility of duplicates. But the fix for >> JDK-8201248 assumes that duplicates are possible, so I wasn't worried >> about that. > > We're publishing a mutable locally created object to other threads so > it seems like we need some sort of ordering barrier when we do so. > Presumably the ClassValue would normally provide some ordering though > it's a little unclear from the javadoc if it makes any such > guarantees. Is the extra volatile unneeded? > ClassValue uses volatile internally so that an unsynchronized read sees the latest version. Using a volatile here should help in a similar way, but I believe there is still a race that allows duplicates if the weak reference gets cleared by GC. To prevent all duplicates I think we would need both volatile and more synchronization. dl > tom > >> >> dl >> >>> tom >>> >>> dean.long at oracle.com wrote on 4/26/19 12:09 PM: >>>> https://bugs.openjdk.java.net/browse/JDK-8218700 >>>> http://cr.openjdk.java.net/~dlong/8218700/webrev.2/ >>>> >>>> If we throw an OutOfMemoryError in the right place (see >>>> JDK-8222941), HotSpotJVMCIMetaAccessContext.fromClass can go into >>>> an infinite loop calling ClassValue.remove. To work around the >>>> problem, reset the value in a mutable cell instead of calling remove.
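The workaround discussed in this thread (keep a mutable cell in the ClassValue and reset its contents, rather than calling ClassValue.remove) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the webrev: the class name TypeCache, the factory parameter, and the get method are hypothetical, and only the WeakReferenceHolder name and the volatile ref field come from the discussion.

```java
import java.lang.ref.WeakReference;
import java.util.function.Function;

// Hypothetical illustration only; TypeCache and its API are not taken from the webrev.
final class TypeCache<T> {

    // Mutable cell stored in the ClassValue. The cell itself is never removed;
    // only the weak reference inside it is replaced.
    private static final class WeakReferenceHolder<T> {
        // volatile so that an unsynchronized reader sees the latest reference.
        private volatile WeakReference<T> ref;

        WeakReferenceHolder(T value) {
            set(value);
        }

        void set(T value) {
            ref = new WeakReference<>(value);
        }

        T get() {
            return ref.get();
        }
    }

    private final Function<Class<?>, T> factory;
    private final ClassValue<WeakReferenceHolder<T>> cache;

    TypeCache(Function<Class<?>, T> factory) {
        this.factory = factory;
        this.cache = new ClassValue<WeakReferenceHolder<T>>() {
            @Override
            protected WeakReferenceHolder<T> computeValue(Class<?> type) {
                return new WeakReferenceHolder<>(TypeCache.this.factory.apply(type));
            }
        };
    }

    T get(Class<?> type) {
        WeakReferenceHolder<T> holder = cache.get(type);
        T value = holder.get();
        if (value == null) {
            // The referent was cleared by GC: refill the same cell instead of
            // calling cache.remove(type) and looking the entry up again.
            // Two threads may both get here and create duplicates; this sketch
            // tolerates that, as the thread says the JDK-8201248 fix does.
            value = factory.apply(type);
            holder.set(value);
        }
        return value;
    }
}
```

As long as a caller keeps a strong reference to the returned value, later lookups for the same class return that same object. Ruling out duplicates entirely would need additional synchronization, as noted in the reply above.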
>>>> >>>> dl >> From dean.long at oracle.com Fri May 3 20:31:55 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Fri, 3 May 2019 13:31:55 -0700 Subject: RFR: JDK-8222079: Don't use memset to initialize fields decode_env constructor in disassembler.cpp In-Reply-To: References: Message-ID: <2593bcd8-d4c6-03d2-9d70-d90c94dbb828@oracle.com> Looks good. dl On 5/3/19 9:59 AM, Roman Kennke wrote: > Ping? > > Thanks, > Roman > > >> Recent gcc (I use version 9) complains about using memset to >> initialize fields of decode_env. Let's use proper field initializers >> instead. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8222079 >> Webrev: >> http://cr.openjdk.java.net/~rkennke/JDK-8222079/webrev.01/ >> >> Can I please get a review? >> >> Thanks, Roman From dean.long at oracle.com Fri May 3 20:43:56 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Fri, 3 May 2019 13:43:56 -0700 Subject: [13] RFR(S) 8223262: [AOT] jaotc crashes with assert(!(((ThreadShadow*)__the_thread__)->has_pending_exception())) failed: Should not allocate with exception pending In-Reply-To: <8603b6d7-8323-7078-aafa-c65437b06718@oracle.com> References: <8603b6d7-8323-7078-aafa-c65437b06718@oracle.com> Message-ID: Looks good. dl On 5/3/19 9:03 AM, Vladimir Kozlov wrote: > http://cr.openjdk.java.net/~kvn/8223262/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8223262 > > Added missing checks for pending exception. > Fix was reviewed by Tom R. and Gilles D.? and pushed into jvmci-8 by > Doug S. > > I tested it with hs-tier4 and hs-tier6-graal where we had the problem. > AOT compilation does not crush now but there are still test failures > caused by? JDK-8220623 which will be fixed later. > From daniel.daugherty at oracle.com Fri May 3 21:13:14 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty) Date: Fri, 3 May 2019 17:13:14 -0400 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: <64a8afca-9dc8-b119-0a12-dd05799bdd22@oracle.com> References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <64a8afca-9dc8-b119-0a12-dd05799bdd22@oracle.com> Message-ID: On 5/3/19 6:31 AM, Robbin Ehn wrote: > Hi, please see this update: > > Inc: > http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/index.html > Full: > http://cr.openjdk.java.net/~rehn/8221734/v2/webrev/ src/hotspot/share/aot/aotCodeHeap.cpp No comments. src/hotspot/share/aot/aotCompiledMethod.cpp No comments. src/hotspot/share/code/codeCache.cpp No comments. src/hotspot/share/code/nmethod.cpp No comments. src/hotspot/share/code/nmethod.hpp No comments. src/hotspot/share/gc/z/zBarrierSetNMethod.cpp No comments. src/hotspot/share/gc/z/zNMethod.cpp No comments. src/hotspot/share/jvmci/jvmciEnv.cpp No comments. src/hotspot/share/oops/markOop.hpp No changes to this file. src/hotspot/share/oops/method.cpp No comments. src/hotspot/share/oops/method.hpp No comments. src/hotspot/share/prims/jvmtiEventController.cpp No comments. src/hotspot/share/prims/methodHandles.cpp No comments. src/hotspot/share/prims/whitebox.cpp No comments. src/hotspot/share/runtime/biasedLocking.cpp nit - Please update copyright year for this file. Nice refactoring into more readable chunks! I'm assuming that Patricio is also reviewing these changes... src/hotspot/share/runtime/biasedLocking.hpp No comments. src/hotspot/share/runtime/deoptimization.cpp L778: bool _in_handshake; nit - needs one more space of indent. Nice refactoring while adding in the handshake support. src/hotspot/share/runtime/deoptimization.hpp L147: public: L148: L149: // Deoptimizes a frame lazily. nmethod gets patched deopt happens on return to the frame L163:
static void fix_monitors(JavaThread* thread, frame fr, RegisterMap* map) Style nit: I would put the blank line on L148 above L147. L164: { inflate_monitors(thread, fr, map); } Style nit: Should be: static void fix_monitors(JavaThread* thread, frame fr, RegisterMap* map) { inflate_monitors(thread, fr, map); } src/hotspot/share/runtime/mutex.hpp No comments. src/hotspot/share/runtime/mutexLocker.cpp No comments. (So OsrList_lock is now 'special-1' instead of 'leaf'. I presume the Compiler team is okay with that... src/hotspot/share/runtime/mutexLocker.hpp No comments. src/hotspot/share/runtime/synchronizer.cpp No comments. src/hotspot/share/runtime/thread.cpp No comments. src/hotspot/share/runtime/thread.hpp No comments. src/hotspot/share/runtime/vmOperations.cpp No comments. src/hotspot/share/runtime/vmOperations.hpp No comments. src/hotspot/share/services/dtraceAttacher.cpp No comments. > Dan please do the same on the biased locking changes. I did so and they look fine. Thumbs up! I don't need to see a webrev if you fix the nits... Dan > > # Note > http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html > line 630 > This is revert to the original, I accidental had left in a temporary > test change, as you can see here in full diff: > http://cr.openjdk.java.net/~rehn/8221734/v2/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html > > > I think I manage to address all review comments. > > Dean can you please cast an extra eye on: > http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/oops/method.cpp.sdiff.html > > This OR should be correct. > > Dan please do the same on the biased locking changes. > > I left out the merge with MutexLocker changes, since it was not > interesting. > There were some conflicts with JVMCI changes, so incremental contains > some parts of that merge.
> > Passes t1-5 and local testing. > I'll continue with some additional testing. > > Thanks, Robbin > > On 4/25/19 2:05 PM, Robbin Ehn wrote: >> Hi all, please review. >> >> Let's deopt with handshakes. >> Removed VM op Deoptimize, instead we handshake. >> Locks needs to be inflate since we are not in a safepoint. >> >> Goes on top of: >> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html >> >> >> Code: >> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8221734 >> >> Passes t1-7 and multiple t1-5 runs. >> >> A few startup benchmark see a small speedup. >> >> Thanks, Robbin From vladimir.kozlov at oracle.com Fri May 3 21:38:15 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 3 May 2019 14:38:15 -0700 Subject: [13] RFR(S) 8223262: [AOT] jaotc crashes with assert(!(((ThreadShadow*)__the_thread__)->has_pending_exception())) failed: Should not allocate with exception pending In-Reply-To: References: <8603b6d7-8323-7078-aafa-c65437b06718@oracle.com> Message-ID: <48aab9ef-97da-0e3e-2fa3-ceeaac8a9d5d@oracle.com> Thank you, Dean Vladimir On 5/3/19 1:43 PM, dean.long at oracle.com wrote: > Looks good. > > dl > > On 5/3/19 9:03 AM, Vladimir Kozlov wrote: >> http://cr.openjdk.java.net/~kvn/8223262/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8223262 >> >> Added missing checks for pending exception. >> Fix was reviewed by Tom R. and Gilles D. and pushed into jvmci-8 by Doug S. >> >> I tested it with hs-tier4 and hs-tier6-graal where we had the problem. AOT compilation does not crush now but there >> are still test failures caused by JDK-8220623 which will be fixed later.
>>
>

From vladimir.x.ivanov at oracle.com Fri May 3 22:49:45 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 3 May 2019 15:49:45 -0700
Subject: [13] RFR (M): 8223216: C2: Unify class initialization checks between new, getstatic, and putstatic
In-Reply-To: <5e67b2d3-9856-069e-4886-8366c89bc3f8@oracle.com>
References: <5e67b2d3-9856-069e-4886-8366c89bc3f8@oracle.com>
Message-ID:

Thanks for the feedback, Dean.

> Do you want to have a Runtime reviewer take a look at the new logic?

I'm definitely looking for feedback on 8223213 from the Runtime team. But 8223216 is C2-specific and incrementally builds on top of it, so I don't think there's anything new for the Runtime team to look at.

> Can you explain why Parse::clinit_deopt() changed from testing for
>
> InstanceKlass::fully_initialized
>
> to testing for
>
> InstanceKlass::being_initialized
>
> instead? How do we know it is the initializing thread?

The initializing thread is irrelevant here. The check is solely about the current state of the holder class. Parse::clinit_deopt() is not mandatory (the nmethod clinit barrier on entry covers all important cases), but an optimization. It is added by 8223213 specifically for C2 to trigger recompilation once the holder class is fully initialized. The motivation is to get better code when a class is fully initialized.

The change in 8223216 is intended as a refactoring: since there are only 2 states allowed here (being_initialized and fully_initialized), it doesn't matter which state is checked (== being_initialized vs != fully_initialized).

Best regards,
Vladimir Ivanov

> On 5/1/19 4:37 PM, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~vlivanov/8223216/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8223216
>>
>> (The patch has minor dependencies on 8223213 [1] I sent out for review
>> earlier.)
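The equivalence argument in the reply above can be sketched in a few lines. This is only a toy model: the `InitState` enum and the three helper names are hypothetical stand-ins, not HotSpot declarations, and the real InstanceKlass has more states than shown. The point is that the two formulations of the check agree whenever the state is restricted to being_initialized or fully_initialized, and only differ outside that set.

```cpp
#include <cassert>

// Hypothetical, simplified model of class initialization states.
// Only the two states named in the thread matter for the argument;
// "linked" stands in for any earlier state.
enum class InitState { linked, being_initialized, fully_initialized };

// Precondition described in the thread: at this point the holder class
// can only be in one of these two states.
inline bool allowed_at_clinit_deopt(InitState s) {
  return s == InitState::being_initialized || s == InitState::fully_initialized;
}

// Formulation A: "state != fully_initialized".
inline bool not_fully_initialized(InitState s) {
  return s != InitState::fully_initialized;
}

// Formulation B: "state == being_initialized".
inline bool is_being_initialized(InitState s) {
  return s == InitState::being_initialized;
}
```

For any state accepted by `allowed_at_clinit_deopt`, both formulations return the same answer, which is why the change can be read as a refactoring rather than a behavioral change.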
>>
>> C2 implements class initialization checks for new and
>> getstatic/putstatic differently: while "new" supports fast class
>> initialization checks, static field accesses rely on uncommon traps
>> which may lead to deoptimization/recompilation storms during
>> long-running class initialisation.
>>
>> Proposed patch unifies implementation between them and uses the
>> following barrier:
>>   if (holder->is_initialized()) {
>>     uncommon_trap(initialized, reinterpret);
>>   }
>>   if (!holder->is_reentrant_initialization(current_thread)) {
>>     uncommon_trap(uninitialized, none);
>>   }
>>
>> It also enhances checks for not-yet-initialized classes
>> (Compile::needs_clinit_barrier) and unifies the implementation between
>> new, invokestatic, and getfield/putfield.
>>
>> Testing: tier1-5, targeted microbenchmarks, new test from 8223213
>>
>> Thanks!
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/
>>     https://bugs.openjdk.java.net/browse/JDK-8223213
>>
>

From sandhya.viswanathan at intel.com Fri May 3 23:02:25 2019
From: sandhya.viswanathan at intel.com (Viswanathan, Sandhya)
Date: Fri, 3 May 2019 23:02:25 +0000
Subject: RFR (M) 8222074: Enhance auto vectorization for x86
In-Reply-To:
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB5C2@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB845@FMSMSX126.amr.corp.intel.com>
 <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com>
 <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com>
 <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com>
 <0cd3fd93-0f1e-a6d0-d4c3-f8d95b533ff7@oracle.com>
<02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB56B1@FMSMSX126.amr.corp.intel.com> Message-ID: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB7472@FMSMSX126.amr.corp.intel.com> Hi Vladimir, Please find below the updated webrev which implements all your inputs: http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.02/ Looking forward to your feedback. Best Regards, Sandhya -----Original Message----- From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] Sent: Wednesday, May 01, 2019 5:09 PM To: Viswanathan, Sandhya ; Vladimir Kozlov Cc: hotspot-compiler-dev at openjdk.java.net Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 Sounds good, thanks! Best regards, Vladimir Ivanov On 01/05/2019 15:16, Viswanathan, Sandhya wrote: > I should add here that your suggestion of adding generic shift instruction etc to the macroAssembler is also wonderful instead of function pointer. I will look into making that change as well. > > Best Regards, > Sandhya > > > -----Original Message----- > From: Viswanathan, Sandhya > Sent: Wednesday, May 01, 2019 3:10 PM > To: 'Vladimir Ivanov' ; Vladimir Kozlov > Cc: hotspot-compiler-dev at openjdk.java.net > Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86 > > Hi Vladimir, > > I agree, I wanted to show both the approaches in this patch to get your feedback: > 1) with emit as a function > 2) with emit part in the instruct body itself > > With emit as a function it becomes hard to read and I personally prefer it in the instruct itself as is done for vabsneg2D etc. That is what you are recommending as well so I feel good. > > Once the adlc enhancement is done both the approaches should give similar binary size. Till then there will be small overhead with approach 2) as emit is duplicated per match rule. > > I will send an updated patch fixing the two issues you mentioned in your previous email plus this change of using approach 2). > > Please do let me know if you want to see any other change in this patch. 
> > Best Regards, > Sandhya > > > > -----Original Message----- > From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] > Sent: Wednesday, May 01, 2019 2:58 PM > To: Viswanathan, Sandhya ; Vladimir Kozlov > Cc: hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 > > >> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.01/ > > Nice job, Sandhya! Glad to hear the approach pays off! > > Unfortunately, I must note that AD file becomes much more obscure. > Especially with those function pointers. > > 1528 void emit_vshift16B_code(MacroAssembler& _masm, int opcode, XMMRegister dst, > 1529 XMMRegister src, XMMRegister shift, > 1530 XMMRegister tmp1, XMMRegister tmp2, > Register scratch) { > 1531 XX_Inst extendinst = get_extend_inst(opcode == Op_URShiftVB ? > false : true); > 1532 XX_Inst shiftinst = get_xx_inst(opcode); > 1533 > 1534 (_masm.*extendinst)(tmp1, src); > 1535 (_masm.*shiftinst)(tmp1, shift); > 1536 __ pshufd(tmp2, src, 0xE); > 1537 (_masm.*extendinst)(tmp2, tmp2); > 1538 (_masm.*shiftinst)(tmp2, shift); > 1539 __ movdqu(dst, ExternalAddress(vector_short_to_byte_mask()), > scratch); > 1540 __ pand(tmp2, dst); > 1541 __ pand(dst, tmp1); > 1542 __ packuswb(dst, tmp2); > 1543 } > > Have you tried to encapsulate that into x86-specific MacroAssembler? 
> > 8682 instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{ > 8683 predicate(UseSSE > 3 && UseAVX <= 1 && n->as_Vector()->length() > == 16); > 8684 match(Set dst (LShiftVB src shift)); > 8685 match(Set dst (RShiftVB src shift)); > 8686 match(Set dst (URShiftVB src shift)); > 8687 effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch); > 8688 format %{"pmovxbw $tmp1,$src\n\t" > 8689 "shiftop $tmp1,$shift\n\t" > 8690 "pshufd $tmp2,$src\n\t" > 8691 "pmovxbw $tmp2,$tmp2\n\t" > 8692 "shiftop $tmp2,$shift\n\t" > 8693 "movdqu $dst,[0x00ff00ff0x00ff00ff]\n\t" > 8694 "pand $tmp2,$dst\n\t" > 8695 "pand $dst,$tmp1\n\t" > 8696 "packuswb $dst,$tmp2\n\t! packed16B shift" %} > 8697 ins_encode %{ > 8698 emit_vshift16B_code(_masm, this->as_Mach()->ideal_Opcode() , > $dst$$XMMRegister, $src$$XMMRegister, $shift$$XMMRegister, $tmp1$$XMMRegister, $tmp2$$XMMRegister, $scratch$$Register); > 8699 %} > 8700 ins_pipe( pipe_slow ); > 8701 %} > > can be turned into something like: > > instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{ > predicate(n->as_Vector()->length() == 16); > match(Set dst (LShiftVB src shift)); > match(Set dst (RShiftVB src shift)); > match(Set dst (URShiftVB src shift)); > effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch); > format %{"packed16B shift" %} > ins_encode %{ > int vlen = 0; // 128-bit > BasicType elem_type = T_BYTE; > int shift_mode = ...; // L/R/UR or S/U + L/R > __ vshift(vlen, elem_type, shift_mode, > $dst$$..., $src$$..., $shift$$..., > $tmp1$$..., $tmp2$$..., $scratch$$...); > %} > > Then MA::vshift can dispatch between different implementations depending on SSE/AVX level available. Do you see any problems with that from footprint perspective? > > Ideally, I'd prefer to see a library of operations on vectors encapsulated in MacroAssembler (or a subclass) and used in x86.ad. That will accommodate further reductions in AD instructions needed. 
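As a rough illustration of the quoted suggestion, here is a scalar model of what the widen, shift, mask, pack sequence computes per byte lane, with the operation chosen by a mode tag and a `switch` rather than by member-function pointers. Everything here is a sketch under stated assumptions: `ShiftMode`, `shift_lane`, and this `vshift16B` are hypothetical names, not the MacroAssembler API; the shift count is assumed to be in 0..15; and URShiftVB is assumed to zero-extend while the other two modes sign-extend, which is a guess at what `get_extend_inst` selects in the webrev.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mode tag standing in for LShiftVB / RShiftVB / URShiftVB.
enum class ShiftMode { Left, ArithRight, LogicalRight };

// Scalar model of one byte lane: widen to 16 bits, shift, keep the low byte.
// Assumes 0 <= count <= 15.
inline std::uint8_t shift_lane(std::uint8_t b, int count, ShiftMode mode) {
  const int sext = (b ^ 0x80) - 0x80;  // sign-extended lane value
  const int zext = b;                  // zero-extended lane value
  int w = 0;
  switch (mode) {
    case ShiftMode::Left:
      w = (sext & 0xFFFF) << count;    // shift within a 16-bit lane
      break;
    case ShiftMode::ArithRight:
      // Portable arithmetic right shift (never shifts a negative value).
      w = sext < 0 ? ~(~sext >> count) : (sext >> count);
      break;
    case ShiftMode::LogicalRight:
      w = zext >> count;
      break;
  }
  // Plays the role of the 0x00ff mask followed by the pack step.
  return static_cast<std::uint8_t>(w & 0x00FF);
}

// Applies the lane model to all 16 bytes of a 128-bit vector.
inline std::array<std::uint8_t, 16> vshift16B(const std::array<std::uint8_t, 16>& src,
                                              int count, ShiftMode mode) {
  std::array<std::uint8_t, 16> dst{};
  for (std::size_t i = 0; i < src.size(); ++i) {
    dst[i] = shift_lane(src[i], count, mode);
  }
  return dst;
}
```

A single entry point parameterized by mode keeps the call site short, and the same shape would leave room for an implementation to pick a different instruction sequence per SSE/AVX level without touching the callers.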
> > Best regards, > Vladimir Ivanov > >> With this webrev the ad file has only about 60 lines effectively added. >> Also the generated product libjvm.so size only increases by about 0.26% vs the prior 1.50%. >> I have used multiple match rules in one instruct for same size shift related rules and also for the new Abs/Neg rules. >> What I noticed is that the adlc still duplicates lot of code and there is potential to further improve code size for multiple match rule case by improving the adlc itself. >> The adlc improvement (like removing duplicate emits, formats, expand, pipeline etc) can be done as a separate RFE. >> >> In this webrev, I have also fixed the errors reported by Vladimir Ivanov and corrected the issues reported by jcheck tool. >> Also taken into account reducing the temporary by using TEMP dst for multiply rules. >> >> The compiler jtreg tests and the java math tests pass on Haswell, SKX, and KNL. >> >> Your review and feedback is welcome. >> >> Best Regards, >> Sandhya >> >> >> -----Original Message----- >> From: hotspot-compiler-dev >> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of >> Viswanathan, Sandhya >> Sent: Wednesday, April 10, 2019 10:22 AM >> To: Vladimir Kozlov ; B. Blaser >> >> Cc: hotspot-compiler-dev at openjdk.java.net >> Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86 >> >> Yes good catch, in mul32B_reg_avx(), the last two instructions are the only place where dst is used: >> >> __ vpackuswb($dst$$XMMRegister, $tmp2$$XMMRegister, $tmp1$$XMMRegister, vector_len); >> __ vpermq($dst$$XMMRegister, $dst$$XMMRegister, 0xD8, >> vector_len); >> >> Here dst can be same as tmp2 or tmp1 in packuswb() and so the effect TEMP dst is not required. >> >> Best Regards, >> Sandhya >> >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Wednesday, April 10, 2019 9:59 AM >> To: Viswanathan, Sandhya ; B. 
Blaser >> >> Cc: hotspot-compiler-dev at openjdk.java.net >> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >> >> On 4/10/19 8:36 AM, Viswanathan, Sandhya wrote: >>> Hi Bernard, >>> >>> One could add TEMP dst in effect() to let the register allocator know that dst needs to be different from src. >> >> Yes, we use this way. Or, in mul4B_reg() case, we can use $dst instead >> $tmp2 to avoid overwriting >> $src2 before we get value from it if $dst = $src2. >> >> On other hand, mul32B_reg_avx() and other have 'TEMP dst' effect but $dst is used only for final result. >> >> It is a little mess which may cause ineffective use of registers in compiled code. >> >> Thanks, >> Vladimir >> >>> >>> Best Regards, >>> Sandhya >>> >>> >>> -----Original Message----- >>> From: B. Blaser [mailto:bsrbnd at gmail.com] >>> Sent: Wednesday, April 10, 2019 4:10 AM >>> To: Viswanathan, Sandhya >>> Cc: Vladimir Kozlov ; >>> hotspot-compiler-dev at openjdk.java.net >>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >>> >>> Hi Sandhya and Vladimir K., >>> >>> On Wed, 10 Apr 2019 at 03:06, Viswanathan, Sandhya wrote: >>>> >>>> Hi Vladimir, >>>> >>>> Yes, I missed the question below: >>>>>> There are cases where we can use less `TEMP tmp` registers by using 'dst' register like in mul4B_reg(). Is it intentional to not use 'dst' there? >>>> >>>> No it is not intentional, we can use the dst register in those cases and reduced the tmps. >>> >>> I guess we have to be careful using $dst instead of $tmp registers as the allocator sometimes provides identical $src & $dst. Also, I'm not sure this would be possible in the case of mul4B_reg(): >>> >>> 7349 format %{"pmovsxbw $tmp,$src1\n\t" >>> 7350 "pmovsxbw $tmp2,$src2\n\t" >>> >>> I believe this couldn't work if you use $dst instead of $tmp and $dst = $src2, what do you think? 
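The hazard raised in the question above can be shown with a toy register file. This is a minimal sketch, not the AD-file code: `RegFile` and both function names are hypothetical, and integer multiplication stands in for the packed byte multiply. It shows why reusing the destination as a scratch register is only safe when the allocator guarantees that the destination differs from the source that is read last, which is what a `TEMP dst` effect requests.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy register file; indices play the role of register numbers.
using RegFile = std::array<int, 4>;

// Computes r[src1] * r[src2] into r[dst], using dst itself as the scratch
// register. Correct only if dst is a different register than src2.
inline void mul_dst_as_scratch(RegFile& r, std::size_t dst,
                               std::size_t src1, std::size_t src2) {
  r[dst] = r[src1];   // scratch write happens before src2 is read
  r[dst] *= r[src2];  // if dst == src2, this reads the clobbered value
}

// Same operation with a dedicated temporary: safe for any register assignment.
inline void mul_with_tmp(RegFile& r, std::size_t dst,
                         std::size_t src1, std::size_t src2) {
  const int tmp = r[src1];
  r[dst] = tmp * r[src2];
}
```

With registers holding 3 and 5, `mul_dst_as_scratch` yields 15 when dst is a third register, but 9 when dst aliases src2, because the second operand has already been overwritten by the first.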
>>> >>> Thanks, >>> Bernard >>> From vladimir.x.ivanov at oracle.com Fri May 3 23:22:16 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Fri, 3 May 2019 16:22:16 -0700 Subject: RFR (M) 8222074: Enhance auto vectorization for x86 In-Reply-To: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB7472@FMSMSX126.amr.corp.intel.com> References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB5C2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB845@FMSMSX126.amr.corp.intel.com> <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com> <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com> <0cd3fd93-0f1e-a6d0-d4c3-f8d95b533ff7@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB56B1@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB7472@FMSMSX126.amr.corp.intel.com> Message-ID: <52876f29-4da2-2885-fe18-5e362b57eb2b@oracle.com> > http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.02/ Much better! I like how AD files look now. I assume static footprint numbers you provided earlier are still valid. +void MacroAssembler::vabsnegd(int opcode, XMMRegister dst, Register scr) { + if (opcode == Op_AbsVD) { + andpd(dst, ExternalAddress(StubRoutines::x86::vector_double_sign_mask()), scr); + } else { + assert((opcode == Op_NegVD),"opcode should be Op_NegD"); + xorpd(dst, ExternalAddress(StubRoutines::x86::vector_double_sign_flip()), scr); + } +} It's a bit odd to see C2-specific stuff in MacroAssembler, but I'm perfectly fine with incrementally refactor it later. For now, just guard relevant code with #ifdef COMPILER2. Otherwise, looks very good! Best regards, Vladimir Ivanov > > Looking forward to your feedback. 
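For readers unfamiliar with the trick behind the `vabsnegd` helper quoted above: absolute value and negation of a double can each be done with one bitwise operation on the sign bit. The following is a scalar sketch only, assuming IEEE-754 binary64 doubles; `VecOp` and `abs_or_neg` are hypothetical names, and the two constants are an assumption about what the sign-mask and sign-flip stub tables contain per lane.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for the Op_AbsVD / Op_NegVD opcodes.
enum class VecOp { Abs, Neg };

// Scalar model of one double lane: AND clears the sign bit (abs),
// XOR toggles it (neg). Assumes IEEE-754 binary64 representation.
inline double abs_or_neg(VecOp op, double x) {
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  const std::uint64_t sign_mask = 0x7FFFFFFFFFFFFFFFULL;  // assumed mask contents
  const std::uint64_t sign_flip = 0x8000000000000000ULL;  // assumed flip contents
  bits = (op == VecOp::Abs) ? (bits & sign_mask) : (bits ^ sign_flip);
  std::memcpy(&x, &bits, sizeof x);
  return x;
}
```

Because both operations differ only in the bitwise instruction and the constant, a single helper keyed on the opcode can serve both match rules, which is the shape of the quoted patch.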
> > Best Regards, > Sandhya > > > -----Original Message----- > From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] > Sent: Wednesday, May 01, 2019 5:09 PM > To: Viswanathan, Sandhya ; Vladimir Kozlov > Cc: hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 > > Sounds good, thanks! > > Best regards, > Vladimir Ivanov > > On 01/05/2019 15:16, Viswanathan, Sandhya wrote: >> I should add here that your suggestion of adding generic shift instruction etc to the macroAssembler is also wonderful instead of function pointer. I will look into making that change as well. >> >> Best Regards, >> Sandhya >> >> >> -----Original Message----- >> From: Viswanathan, Sandhya >> Sent: Wednesday, May 01, 2019 3:10 PM >> To: 'Vladimir Ivanov' ; Vladimir Kozlov >> Cc: hotspot-compiler-dev at openjdk.java.net >> Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86 >> >> Hi Vladimir, >> >> I agree, I wanted to show both the approaches in this patch to get your feedback: >> 1) with emit as a function >> 2) with emit part in the instruct body itself >> >> With emit as a function it becomes hard to read and I personally prefer it in the instruct itself as is done for vabsneg2D etc. That is what you are recommending as well so I feel good. >> >> Once the adlc enhancement is done both the approaches should give similar binary size. Till then there will be small overhead with approach 2) as emit is duplicated per match rule. >> >> I will send an updated patch fixing the two issues you mentioned in your previous email plus this change of using approach 2). >> >> Please do let me know if you want to see any other change in this patch. 
>> >> Best Regards, >> Sandhya >> >> >> >> -----Original Message----- >> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] >> Sent: Wednesday, May 01, 2019 2:58 PM >> To: Viswanathan, Sandhya ; Vladimir Kozlov >> Cc: hotspot-compiler-dev at openjdk.java.net >> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >> >> >>> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.01/ >> >> Nice job, Sandhya! Glad to hear the approach pays off! >> >> Unfortunately, I must note that AD file becomes much more obscure. >> Especially with those function pointers. >> >> 1528 void emit_vshift16B_code(MacroAssembler& _masm, int opcode, XMMRegister dst, >> 1529 XMMRegister src, XMMRegister shift, >> 1530 XMMRegister tmp1, XMMRegister tmp2, >> Register scratch) { >> 1531 XX_Inst extendinst = get_extend_inst(opcode == Op_URShiftVB ? >> false : true); >> 1532 XX_Inst shiftinst = get_xx_inst(opcode); >> 1533 >> 1534 (_masm.*extendinst)(tmp1, src); >> 1535 (_masm.*shiftinst)(tmp1, shift); >> 1536 __ pshufd(tmp2, src, 0xE); >> 1537 (_masm.*extendinst)(tmp2, tmp2); >> 1538 (_masm.*shiftinst)(tmp2, shift); >> 1539 __ movdqu(dst, ExternalAddress(vector_short_to_byte_mask()), >> scratch); >> 1540 __ pand(tmp2, dst); >> 1541 __ pand(dst, tmp1); >> 1542 __ packuswb(dst, tmp2); >> 1543 } >> >> Have you tried to encapsulate that into x86-specific MacroAssembler? 
>> >> 8682 instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{ >> 8683 predicate(UseSSE > 3 && UseAVX <= 1 && n->as_Vector()->length() >> == 16); >> 8684 match(Set dst (LShiftVB src shift)); >> 8685 match(Set dst (RShiftVB src shift)); >> 8686 match(Set dst (URShiftVB src shift)); >> 8687 effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch); >> 8688 format %{"pmovxbw $tmp1,$src\n\t" >> 8689 "shiftop $tmp1,$shift\n\t" >> 8690 "pshufd $tmp2,$src\n\t" >> 8691 "pmovxbw $tmp2,$tmp2\n\t" >> 8692 "shiftop $tmp2,$shift\n\t" >> 8693 "movdqu $dst,[0x00ff00ff0x00ff00ff]\n\t" >> 8694 "pand $tmp2,$dst\n\t" >> 8695 "pand $dst,$tmp1\n\t" >> 8696 "packuswb $dst,$tmp2\n\t! packed16B shift" %} >> 8697 ins_encode %{ >> 8698 emit_vshift16B_code(_masm, this->as_Mach()->ideal_Opcode() , >> $dst$$XMMRegister, $src$$XMMRegister, $shift$$XMMRegister, $tmp1$$XMMRegister, $tmp2$$XMMRegister, $scratch$$Register); >> 8699 %} >> 8700 ins_pipe( pipe_slow ); >> 8701 %} >> >> can be turned into something like: >> >> instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{ >> predicate(n->as_Vector()->length() == 16); >> match(Set dst (LShiftVB src shift)); >> match(Set dst (RShiftVB src shift)); >> match(Set dst (URShiftVB src shift)); >> effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch); >> format %{"packed16B shift" %} >> ins_encode %{ >> int vlen = 0; // 128-bit >> BasicType elem_type = T_BYTE; >> int shift_mode = ...; // L/R/UR or S/U + L/R >> __ vshift(vlen, elem_type, shift_mode, >> $dst$$..., $src$$..., $shift$$..., >> $tmp1$$..., $tmp2$$..., $scratch$$...); >> %} >> >> Then MA::vshift can dispatch between different implementations depending on SSE/AVX level available. Do you see any problems with that from footprint perspective? >> >> Ideally, I'd prefer to see a library of operations on vectors encapsulated in MacroAssembler (or a subclass) and used in x86.ad. 
That will accommodate further reductions in AD instructions needed. >> >> Best regards, >> Vladimir Ivanov >> >>> With this webrev the ad file has only about 60 lines effectively added. >>> Also the generated product libjvm.so size only increases by about 0.26% vs the prior 1.50%. >>> I have used multiple match rules in one instruct for same size shift related rules and also for the new Abs/Neg rules. >>> What I noticed is that the adlc still duplicates lot of code and there is potential to further improve code size for multiple match rule case by improving the adlc itself. >>> The adlc improvement (like removing duplicate emits, formats, expand, pipeline etc) can be done as a separate RFE. >>> >>> In this webrev, I have also fixed the errors reported by Vladimir Ivanov and corrected the issues reported by jcheck tool. >>> Also taken into account reducing the temporary by using TEMP dst for multiply rules. >>> >>> The compiler jtreg tests and the java math tests pass on Haswell, SKX, and KNL. >>> >>> Your review and feedback is welcome. >>> >>> Best Regards, >>> Sandhya >>> >>> >>> -----Original Message----- >>> From: hotspot-compiler-dev >>> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of >>> Viswanathan, Sandhya >>> Sent: Wednesday, April 10, 2019 10:22 AM >>> To: Vladimir Kozlov ; B. Blaser >>> >>> Cc: hotspot-compiler-dev at openjdk.java.net >>> Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86 >>> >>> Yes good catch, in mul32B_reg_avx(), the last two instructions are the only place where dst is used: >>> >>> __ vpackuswb($dst$$XMMRegister, $tmp2$$XMMRegister, $tmp1$$XMMRegister, vector_len); >>> __ vpermq($dst$$XMMRegister, $dst$$XMMRegister, 0xD8, >>> vector_len); >>> >>> Here dst can be same as tmp2 or tmp1 in packuswb() and so the effect TEMP dst is not required. 
>>> >>> Best Regards, >>> Sandhya >>> >>> >>> -----Original Message----- >>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>> Sent: Wednesday, April 10, 2019 9:59 AM >>> To: Viswanathan, Sandhya ; B. Blaser >>> >>> Cc: hotspot-compiler-dev at openjdk.java.net >>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >>> >>> On 4/10/19 8:36 AM, Viswanathan, Sandhya wrote: >>>> Hi Bernard, >>>> >>>> One could add TEMP dst in effect() to let the register allocator know that dst needs to be different from src. >>> >>> Yes, we use this way. Or, in mul4B_reg() case, we can use $dst instead >>> $tmp2 to avoid overwriting >>> $src2 before we get value from it if $dst = $src2. >>> >>> On other hand, mul32B_reg_avx() and other have 'TEMP dst' effect but $dst is used only for final result. >>> >>> It is a little mess which may cause ineffective use of registers in compiled code. >>> >>> Thanks, >>> Vladimir >>> >>>> >>>> Best Regards, >>>> Sandhya >>>> >>>> >>>> -----Original Message----- >>>> From: B. Blaser [mailto:bsrbnd at gmail.com] >>>> Sent: Wednesday, April 10, 2019 4:10 AM >>>> To: Viswanathan, Sandhya >>>> Cc: Vladimir Kozlov ; >>>> hotspot-compiler-dev at openjdk.java.net >>>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >>>> >>>> Hi Sandhya and Vladimir K., >>>> >>>> On Wed, 10 Apr 2019 at 03:06, Viswanathan, Sandhya wrote: >>>>> >>>>> Hi Vladimir, >>>>> >>>>> Yes, I missed the question below: >>>>>>> There are cases where we can use less `TEMP tmp` registers by using 'dst' register like in mul4B_reg(). Is it intentional to not use 'dst' there? >>>>> >>>>> No it is not intentional, we can use the dst register in those cases and reduced the tmps. >>>> >>>> I guess we have to be careful using $dst instead of $tmp registers as the allocator sometimes provides identical $src & $dst. 
Also, I'm not sure this would be possible in the case of mul4B_reg():
>>>>
>>>> 7349 format %{"pmovsxbw $tmp,$src1\n\t"
>>>> 7350 "pmovsxbw $tmp2,$src2\n\t"
>>>>
>>>> I believe this couldn't work if you use $dst instead of $tmp and $dst = $src2, what do you think?
>>>>
>>>> Thanks,
>>>> Bernard
>>>>

From fujie at loongson.cn Fri May 3 23:24:28 2019
From: fujie at loongson.cn (Jie Fu)
Date: Sat, 4 May 2019 07:24:28 +0800
Subject: RFR: 8221542: ~15% performance degradation due to less optimized inline decision
In-Reply-To:
References: <6aebd883-0be7-0b05-5364-262e138a1fbc@loongson.cn>
 <182d87da-0d99-3f33-fbe7-ef5818be0422@loongson.cn>
 <0936427d-f4d2-299a-87ce-860dce5e57e1@loongson.cn>
 <574d59f5-3437-738f-e10c-796dcb02b42e@oracle.com>
 <5275854c-ab35-f160-f6f0-6ab9ac86e3d0@loongson.cn>
 <8bc507fe-b6db-d697-8821-0547860de232@oracle.com>
 <1a398a1f-ed52-2197-5886-d9d5fd872974@loongson.cn>
 <5607f7ca-57b9-b409-3bce-efc1688f0678@loongson.cn>
Message-ID: <8510740c-ac56-f8d2-3c5e-451dfa6948a0@loongson.cn>

Hi Vladimir Ivanov,

The patch in the attachment has been updated by adding brackets to the checks in InlineTree::is_not_reached.
Is it OK to be pushed?
If so, could you please sponsor it?

Thanks a lot.
Best regards,
Jie

On 2019-05-04 06:06, Vladimir Ivanov wrote:
> CCing Jie Fu.
>
> Best regards,
> Vladimir Ivanov
>
> On 03/05/2019 14:55, coleen.phillimore at oracle.com wrote:
>>
>> http://cr.openjdk.java.net/~vlivanov/jiefu/8221542/webrev.02/src/hotspot/share/oops/cpCache.cpp.frames.html
>>
>>
>> This looks like it should have gotten the wrong answer without this
>> change (there appears to be protection from an index out of range)
>> even without your patch. f2 is Method* for invokeinterface now.
>>
>> The runtime part of this change looks good to me.
>>
>> Thanks,
>> Coleen
>>
>>
>> On 5/2/19 5:23 AM, Tobias Hartmann wrote:
>>> Hi Jie,
>>>
>>> this looks good to me too but please add brackets to the checks in
>>> InlineTree::is_not_reached.
>>>
>>> I've submitted some extended testing and will let you know once it has passed.
>>>
>>> Someone from the runtime team should also have a look at this
>>> because your changes affect the
>>> interpreter. CC'ing runtime-dev.
>>>
>>> Thanks,
>>> Tobias
>>>
>>> On 29.04.19 15:43, Jie Fu wrote:
>>>> Hi all,
>>>>
>>>> May I have another review for this change [1] to finalize the fix?
>>>> Thanks a lot.
>>>>
>>>> Best regards,
>>>> Jie
>>>>
>>>> [1] http://cr.openjdk.java.net/~vlivanov/jiefu/8221542/webrev.02/
>>>>
>>>>
>>>> On 2019-04-20 11:35, Jie Fu wrote:
>>>>> Ah, I got it.
>>>>> I like your patch and have benefited a lot from you.
>>>>> Thank you so much, Vladimir.
>>>>>
>>>>> Any comments from other reviewers?
>>>>> Thanks.
>>>>>
>>>>> Best regards,
>>>>> Jie
>>>>>
>>>>> On 2019/4/20 11:18 AM, Vladimir Ivanov wrote:
>>>>>>>> After some explorations I decided to keep original behavior for
>>>>>>>> immature profiles
>>>>>>>> (profile.count == -1).
>>>>>>> I agree.
>>>>>>>
>>>>>>> I have two questions here.
>>>>>>>
>>>>>>> 1. What's the difference between the following two if statements?
>>>>>>> -------------------------------------------------
>>>>>>> + if (!callee_method->was_executed_more_than(0)) return true;
>>>>>>> // callee was never executed
>>>>>>> +
>>>>>>> + if (caller_method->is_not_reached(caller_bci)) return true;
>>>>>>> // call site not resolved
>>>>>>> -------------------------------------------------
>>>>>>> I think only one of them is needed.
>>>>>> The checks are complementary: one inspects the callee and the other
>>>>>> looks at the call site.
>>>>>>
>>>>>> "!callee_method->was_executed_more_than(0)" ensures that callee
>>>>>> was executed at least once.
>>>>>>
>>>>>> "caller_method->is_not_reached(caller_bci)" inspects the state of
>>>>>> the call site. If corresponding
>>>>>> CP entry is not resolved, then the call site isn't reached.
If >>>>>> is_not_reached() returns false, >>>>>> it's not a definitive answer: there's still a chance the site is >>>>>> not reached - consider the case >>>>>> of virtual calls where callee_method may differ for the same >>>>>> resolved method. >>>>>> >>>>>>> 2. Does the assert in InlineTree::is_not_reached(...) make sense? >>>>>>> Since we have >>>>>>> ------------------------------------------------- >>>>>>> if (profile.count() > 0) return false; // reachable according >>>>>>> to profile >>>>>>> ------------------------------------------------- >>>>>>> and >>>>>>> ------------------------------------------------- >>>>>>> if (profile.count() == -1) {...} >>>>>>> ------------------------------------------------- >>>>>>> before >>>>>>> ------------------------------------------------- >>>>>>> assert(profile.count() == 0, "sanity"); >>>>>>> ------------------------------------------------- >>>>>>> is the assert redundant? >>>>>> Asserts are intended to be redundant :-) But still catch bugs >>>>>> from time to time. >>>>>> >>>>>> This one, in particular, checks invariant on profile.count() >= >>>>>> -1 (which is not very useful by >>>>>> itself), but also stresses that "profile.count() == 0" case is >>>>>> being processed. >>>>>> >>>>>> Best regards, >>>>>> Vladimir Ivanov >>>> >> -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 8221542.patch Type: text/x-patch Size: 7109 bytes Desc: not available URL: From sandhya.viswanathan at intel.com Sat May 4 00:01:33 2019 From: sandhya.viswanathan at intel.com (Viswanathan, Sandhya) Date: Sat, 4 May 2019 00:01:33 +0000 Subject: RFR (M) 8222074: Enhance auto vectorization for x86 In-Reply-To: <52876f29-4da2-2885-fe18-5e362b57eb2b@oracle.com> References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB5C2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB845@FMSMSX126.amr.corp.intel.com> <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com> <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com> <0cd3fd93-0f1e-a6d0-d4c3-f8d95b533ff7@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB56B1@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB7472@FMSMSX126.amr.corp.intel.com> <52876f29-4da2-2885-fe18-5e362b57eb2b@oracle.com> Message-ID: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB74D2@FMSMSX126.amr.corp.intel.com> Hi Vladimir, The updated webrev with #ifdef change is at: http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.03/ Yes, the footprint numbers continue to hold with this patch: The x86.ad file is 126 lines smaller. The libjvm size increase is only 0.24%. Best Regards, Sandhya -----Original Message----- From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] Sent: Friday, May 03, 2019 4:22 PM To: Viswanathan, Sandhya Cc: hotspot-compiler-dev at openjdk.java.net; Vladimir Kozlov Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 > http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.02/ Much better! I like how AD files look now. 
I assume static footprint numbers you provided earlier are still valid. +void MacroAssembler::vabsnegd(int opcode, XMMRegister dst, Register scr) { + if (opcode == Op_AbsVD) { + andpd(dst, ExternalAddress(StubRoutines::x86::vector_double_sign_mask()), scr); + } else { + assert((opcode == Op_NegVD),"opcode should be Op_NegD"); + xorpd(dst, ExternalAddress(StubRoutines::x86::vector_double_sign_flip()), scr); + } +} It's a bit odd to see C2-specific stuff in MacroAssembler, but I'm perfectly fine with incrementally refactor it later. For now, just guard relevant code with #ifdef COMPILER2. Otherwise, looks very good! Best regards, Vladimir Ivanov > > Looking forward to your feedback. > > Best Regards, > Sandhya > > > -----Original Message----- > From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] > Sent: Wednesday, May 01, 2019 5:09 PM > To: Viswanathan, Sandhya ; Vladimir Kozlov > Cc: hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 > > Sounds good, thanks! > > Best regards, > Vladimir Ivanov > > On 01/05/2019 15:16, Viswanathan, Sandhya wrote: >> I should add here that your suggestion of adding generic shift instruction etc to the macroAssembler is also wonderful instead of function pointer. I will look into making that change as well. >> >> Best Regards, >> Sandhya >> >> >> -----Original Message----- >> From: Viswanathan, Sandhya >> Sent: Wednesday, May 01, 2019 3:10 PM >> To: 'Vladimir Ivanov' ; Vladimir Kozlov >> Cc: hotspot-compiler-dev at openjdk.java.net >> Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86 >> >> Hi Vladimir, >> >> I agree, I wanted to show both the approaches in this patch to get your feedback: >> 1) with emit as a function >> 2) with emit part in the instruct body itself >> >> With emit as a function it becomes hard to read and I personally prefer it in the instruct itself as is done for vabsneg2D etc. That is what you are recommending as well so I feel good. 
>> >> Once the adlc enhancement is done both the approaches should give similar binary size. Till then there will be small overhead with approach 2) as emit is duplicated per match rule. >> >> I will send an updated patch fixing the two issues you mentioned in your previous email plus this change of using approach 2). >> >> Please do let me know if you want to see any other change in this patch. >> >> Best Regards, >> Sandhya >> >> >> >> -----Original Message----- >> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] >> Sent: Wednesday, May 01, 2019 2:58 PM >> To: Viswanathan, Sandhya ; Vladimir Kozlov >> Cc: hotspot-compiler-dev at openjdk.java.net >> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >> >> >>> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.01/ >> >> Nice job, Sandhya! Glad to hear the approach pays off! >> >> Unfortunately, I must note that AD file becomes much more obscure. >> Especially with those function pointers. >> >> 1528 void emit_vshift16B_code(MacroAssembler& _masm, int opcode, XMMRegister dst, >> 1529 XMMRegister src, XMMRegister shift, >> 1530 XMMRegister tmp1, XMMRegister tmp2, >> Register scratch) { >> 1531 XX_Inst extendinst = get_extend_inst(opcode == Op_URShiftVB ? >> false : true); >> 1532 XX_Inst shiftinst = get_xx_inst(opcode); >> 1533 >> 1534 (_masm.*extendinst)(tmp1, src); >> 1535 (_masm.*shiftinst)(tmp1, shift); >> 1536 __ pshufd(tmp2, src, 0xE); >> 1537 (_masm.*extendinst)(tmp2, tmp2); >> 1538 (_masm.*shiftinst)(tmp2, shift); >> 1539 __ movdqu(dst, ExternalAddress(vector_short_to_byte_mask()), >> scratch); >> 1540 __ pand(tmp2, dst); >> 1541 __ pand(dst, tmp1); >> 1542 __ packuswb(dst, tmp2); >> 1543 } >> >> Have you tried to encapsulate that into x86-specific MacroAssembler? 
>> >> 8682 instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{ >> 8683 predicate(UseSSE > 3 && UseAVX <= 1 && n->as_Vector()->length() >> == 16); >> 8684 match(Set dst (LShiftVB src shift)); >> 8685 match(Set dst (RShiftVB src shift)); >> 8686 match(Set dst (URShiftVB src shift)); >> 8687 effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch); >> 8688 format %{"pmovxbw $tmp1,$src\n\t" >> 8689 "shiftop $tmp1,$shift\n\t" >> 8690 "pshufd $tmp2,$src\n\t" >> 8691 "pmovxbw $tmp2,$tmp2\n\t" >> 8692 "shiftop $tmp2,$shift\n\t" >> 8693 "movdqu $dst,[0x00ff00ff0x00ff00ff]\n\t" >> 8694 "pand $tmp2,$dst\n\t" >> 8695 "pand $dst,$tmp1\n\t" >> 8696 "packuswb $dst,$tmp2\n\t! packed16B shift" %} >> 8697 ins_encode %{ >> 8698 emit_vshift16B_code(_masm, this->as_Mach()->ideal_Opcode() , >> $dst$$XMMRegister, $src$$XMMRegister, $shift$$XMMRegister, $tmp1$$XMMRegister, $tmp2$$XMMRegister, $scratch$$Register); >> 8699 %} >> 8700 ins_pipe( pipe_slow ); >> 8701 %} >> >> can be turned into something like: >> >> instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{ >> predicate(n->as_Vector()->length() == 16); >> match(Set dst (LShiftVB src shift)); >> match(Set dst (RShiftVB src shift)); >> match(Set dst (URShiftVB src shift)); >> effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch); >> format %{"packed16B shift" %} >> ins_encode %{ >> int vlen = 0; // 128-bit >> BasicType elem_type = T_BYTE; >> int shift_mode = ...; // L/R/UR or S/U + L/R >> __ vshift(vlen, elem_type, shift_mode, >> $dst$$..., $src$$..., $shift$$..., >> $tmp1$$..., $tmp2$$..., $scratch$$...); >> %} >> >> Then MA::vshift can dispatch between different implementations depending on SSE/AVX level available. Do you see any problems with that from footprint perspective? >> >> Ideally, I'd prefer to see a library of operations on vectors encapsulated in MacroAssembler (or a subclass) and used in x86.ad. 
That will accommodate further reductions in AD instructions needed. >> >> Best regards, >> Vladimir Ivanov >> >>> With this webrev the ad file has only about 60 lines effectively added. >>> Also the generated product libjvm.so size only increases by about 0.26% vs the prior 1.50%. >>> I have used multiple match rules in one instruct for same size shift related rules and also for the new Abs/Neg rules. >>> What I noticed is that the adlc still duplicates lot of code and there is potential to further improve code size for multiple match rule case by improving the adlc itself. >>> The adlc improvement (like removing duplicate emits, formats, expand, pipeline etc) can be done as a separate RFE. >>> >>> In this webrev, I have also fixed the errors reported by Vladimir Ivanov and corrected the issues reported by jcheck tool. >>> Also taken into account reducing the temporary by using TEMP dst for multiply rules. >>> >>> The compiler jtreg tests and the java math tests pass on Haswell, SKX, and KNL. >>> >>> Your review and feedback is welcome. >>> >>> Best Regards, >>> Sandhya >>> >>> >>> -----Original Message----- >>> From: hotspot-compiler-dev >>> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of >>> Viswanathan, Sandhya >>> Sent: Wednesday, April 10, 2019 10:22 AM >>> To: Vladimir Kozlov ; B. Blaser >>> >>> Cc: hotspot-compiler-dev at openjdk.java.net >>> Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86 >>> >>> Yes good catch, in mul32B_reg_avx(), the last two instructions are the only place where dst is used: >>> >>> __ vpackuswb($dst$$XMMRegister, $tmp2$$XMMRegister, $tmp1$$XMMRegister, vector_len); >>> __ vpermq($dst$$XMMRegister, $dst$$XMMRegister, 0xD8, >>> vector_len); >>> >>> Here dst can be same as tmp2 or tmp1 in packuswb() and so the effect TEMP dst is not required. 
>>> >>> Best Regards, >>> Sandhya >>> >>> >>> -----Original Message----- >>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>> Sent: Wednesday, April 10, 2019 9:59 AM >>> To: Viswanathan, Sandhya ; B. Blaser >>> >>> Cc: hotspot-compiler-dev at openjdk.java.net >>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >>> >>> On 4/10/19 8:36 AM, Viswanathan, Sandhya wrote: >>>> Hi Bernard, >>>> >>>> One could add TEMP dst in effect() to let the register allocator know that dst needs to be different from src. >>> >>> Yes, we use this way. Or, in mul4B_reg() case, we can use $dst instead >>> $tmp2 to avoid overwriting >>> $src2 before we get value from it if $dst = $src2. >>> >>> On other hand, mul32B_reg_avx() and other have 'TEMP dst' effect but $dst is used only for final result. >>> >>> It is a little mess which may cause ineffective use of registers in compiled code. >>> >>> Thanks, >>> Vladimir >>> >>>> >>>> Best Regards, >>>> Sandhya >>>> >>>> >>>> -----Original Message----- >>>> From: B. Blaser [mailto:bsrbnd at gmail.com] >>>> Sent: Wednesday, April 10, 2019 4:10 AM >>>> To: Viswanathan, Sandhya >>>> Cc: Vladimir Kozlov ; >>>> hotspot-compiler-dev at openjdk.java.net >>>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >>>> >>>> Hi Sandhya and Vladimir K., >>>> >>>> On Wed, 10 Apr 2019 at 03:06, Viswanathan, Sandhya wrote: >>>>> >>>>> Hi Vladimir, >>>>> >>>>> Yes, I missed the question below: >>>>>>> There are cases where we can use less `TEMP tmp` registers by using 'dst' register like in mul4B_reg(). Is it intentional to not use 'dst' there? >>>>> >>>>> No it is not intentional, we can use the dst register in those cases and reduced the tmps. >>>> >>>> I guess we have to be careful using $dst instead of $tmp registers as the allocator sometimes provides identical $src & $dst. 
Also, I'm not sure this would be possible in the case of mul4B_reg(): >>>> >>>> 7349 format %{"pmovsxbw $tmp,$src1\n\t" >>>> 7350 "pmovsxbw $tmp2,$src2\n\t" >>>> >>>> I believe this couldn't work if you use $dst instead of $tmp and $dst = $src2, what do you think? >>>> >>>> Thanks, >>>> Bernard >>>> From jesper.wilhelmsson at oracle.com Sat May 4 01:13:04 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Sat, 4 May 2019 03:13:04 +0200 Subject: RFR: JDK-8222665 - Update Graal Message-ID: <74DF69E0-2792-492E-99DC-BEA707375BBB@oracle.com> Hi, Please review the patch to integrate recent Graal changes into OpenJDK. Graal tip to integrate: 556bed673d5bccbed227e2e108dc36eaf00239eb Bug: https://bugs.openjdk.java.net/browse/JDK-8222665 Webrev: http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/ This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug. Thanks, /Jesper -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From vladimir.kozlov at oracle.com Sat May 4 02:04:57 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 3 May 2019 19:04:57 -0700 Subject: RFR: JDK-8222665 - Update Graal In-Reply-To: <74DF69E0-2792-492E-99DC-BEA707375BBB@oracle.com> References: <74DF69E0-2792-492E-99DC-BEA707375BBB@oracle.com> Message-ID: <998bc1f0-ba55-6c06-21ed-349267a8271f@oracle.com> This is too old Graal's tip. You need at least take a29972bd6677f2e8165438caf1073ff596b95f26 to get next changes and to avoid rollback needed changes listed in overwrite file. Otherwise you will get tests failures. my changes: [GR-14499] Update jdk9 version of GraalServices.java and Dean's: [GR-15582] Replace getCompilationLevelAdjustment with excludeFromJVMCICompilation after JDK-8219403. 
Vladimir

From jesper.wilhelmsson at oracle.com Sat May 4 02:15:58 2019
From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com)
Date: Sat, 4 May 2019 04:15:58 +0200
Subject: RFR: JDK-8222665 - Update Graal
In-Reply-To: <998bc1f0-ba55-6c06-21ed-349267a8271f@oracle.com>
References: <74DF69E0-2792-492E-99DC-BEA707375BBB@oracle.com> <998bc1f0-ba55-6c06-21ed-349267a8271f@oracle.com>
Message-ID: <5C486583-81AC-4372-8B5A-8A7B369E8BCE@oracle.com>

Ok, then I withdraw this RFR and we need to wait until we have a more recent clean Graal nightly.

Thanks,
/Jesper

From vladimir.kozlov at oracle.com Sat May 4 17:54:07 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Sat, 4 May 2019 10:54:07 -0700
Subject: [13] RFR(M) 8223332: Update JVMCI
Message-ID: <085cd3bd-e2ba-4bdc-0573-c7f5cad98fed@oracle.com>

http://cr.openjdk.java.net/~kvn/8223332/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8223332

Sync latest JVMCI changes from graal-jvmci-8 [1]
The list is in the bug report.

[1] https://github.com/graalvm/graal-jvmci-8/commits/master

--
Thanks,
Vladimir

From igor.ignatyev at oracle.com Mon May 6 03:31:37 2019
From: igor.ignatyev at oracle.com (Igor Ignatev)
Date: Sun, 5 May 2019 20:31:37 -0700
Subject: RFR(trivial): 8223054: [TESTBUG] Put graalJarsCP before existing classpath in GraalUnitTestLauncher
In-Reply-To:
References:
Message-ID: <803B96E9-8EDD-4469-9137-63451E825724@oracle.com>

Looks good to me.

// moved to hotspot compiler list

-- Igor

> On May 4, 2019, at 6:32 PM, Pengfei Li (Arm Technology China) wrote:
>
> Hi,
>
> Please help review this trivial change on GraalUnitTestLauncher.
>
> Webrev: http://cr.openjdk.java.net/~pli/rfr/8223054/webrev.00/
> JBS: https://bugs.openjdk.java.net/browse/JDK-8223054
>
> The current Graal unit test in jtreg requires junit-4.12.jar as a dependency. In GraalUnitTestLauncher.java, we put the path of this file into graalJarsCP and concatenate it with the existing classpath. But the existing classpath may contain another version of JUnit with which the jtreg tool is built. (According to the OpenJDK "Building jtreg" webpage[1], the recommended version of JUnit to build jtreg is junit-4.10).
>
> In this patch, graalJarsCP is put before the existing classpath returned by System.getProperty() when generating the new classpath string to avoid incompatibility issues.
Jtreg graal unit test cases passed after this change.
>
> [1] https://openjdk.java.net/jtreg/build.html
>
> --
> Thanks,
> Pengfei
>

From robbin.ehn at oracle.com Mon May 6 08:42:11 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 6 May 2019 10:42:11 +0200
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To:
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <64a8afca-9dc8-b119-0a12-dd05799bdd22@oracle.com>
Message-ID:

Hi Dan,

> src/hotspot/share/runtime/biasedLocking.cpp
>     nit - Please update copyright year for this file.
>

Updated in 8220724.

>     Nice refactoring into more readable chunks! I'm assuming that
>     Patricio is also reviewing these changes...

Great, good!

> src/hotspot/share/runtime/deoptimization.cpp
>     L778:  bool _in_handshake;
>         nit - needs one more space of indent.

Fixed.

>
>     Nice refactoring while adding in the handshake support.

Great!

>
> src/hotspot/share/runtime/deoptimization.hpp
>     L147:  public:
>     L148:
>     L149:   // Deoptimizes a frame lazily. nmethod gets patched deopt happens on return to the frame
>     L163:   static void fix_monitors(JavaThread* thread, frame fr, RegisterMap* map)
>         Style nit: I would put the blank line on L148 above L147.

Fixed.

>
>     L164:     { inflate_monitors(thread, fr, map); }
>         Style nit: Should be:
>
>             static void fix_monitors(JavaThread* thread, frame fr, RegisterMap* map) {
>               inflate_monitors(thread, fr, map);
>             }

Fixed.

> src/hotspot/share/runtime/mutexLocker.cpp
>     No comments. (So OsrList_lock is now 'special-1' instead of 'leaf'.
>     I presume the Compiler team is okay with that...

Since we need to hold CodeCache_lock while iterating nmethods, all locks that might be taken needed to be pushed down under CodeCache_lock. So I hope they are okay with that.

> Thumbs up!  I don't need to see a webrev if you fix the nits...

Thanks Dan! Fixed!
I did t6-7 over the weekend, no issues found. /Robbin > > Dan > > >> >> # Note >> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html >> line 630 >> This is revert to the original, I accidental had left in a temporary test >> change, as you can see here in full diff: >> http://cr.openjdk.java.net/~rehn/8221734/v2/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html >> >> >> I think I manage to address all review comments. >> >> Dean can you please cast an extra eye on: >> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/oops/method.cpp.sdiff.html >> >> This OR should be correct. >> >> Dan please do the same on the biased locking changes. >> >> I left out the merge with MutexLocker changes, since it was not interesting. >> There were some conflicts with JVMCI changes, so incremental contains some >> parts of that merge. >> >> Passes t1-5 and local testing. >> I'll continue with some additional testing. >> >> Thanks, Robbin >> >> On 4/25/19 2:05 PM, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Let's deopt with handshakes. >>> Removed VM op Deoptimize, instead we handshake. >>> Locks needs to be inflate since we are not in a safepoint. >>> >>> Goes on top of: >>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html >>> >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8221734 >>> >>> Passes t1-7 and multiple t1-5 runs. >>> >>> A few startup benchmark see a small speedup. 
>>>
>>> Thanks, Robbin
>

From Pengfei.Li at arm.com Mon May 6 10:41:00 2019
From: Pengfei.Li at arm.com (Pengfei Li (Arm Technology China))
Date: Mon, 6 May 2019 10:41:00 +0000
Subject: RFR(trivial): 8223054: [TESTBUG] Put graalJarsCP before existing classpath in GraalUnitTestLauncher
In-Reply-To: <803B96E9-8EDD-4469-9137-63451E825724@oracle.com>
References: <803B96E9-8EDD-4469-9137-63451E825724@oracle.com>
Message-ID:

Thanks Igor. Do I need another reviewer for this trivial change?

// Also cc graal-dev list

--
Thanks,
Pengfei

From martin.doerr at sap.com Mon May 6 12:54:01 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 6 May 2019 12:54:01 +0000
Subject: RFR(S): jdk11u-dev backport 8216556: Unnecessary liveness computation with JVMTI
Message-ID:

Hi,

I'd like to backport this change to jdk11u because it's very simple and avoids some unnecessary overhead.
Applies almost cleanly (only needs manual resolution because a neighboring hunk has changed: CompileTheWorld removal).

bug:
https://bugs.openjdk.java.net/browse/JDK-8216556

original change:
http://hg.openjdk.java.net/jdk/jdk/rev/91ab128a65a3

jdk11u webrev:
http://cr.openjdk.java.net/~mdoerr/8216556_JVMTI_liveness/jdk11u/webrev.00/

I only had to reapply the change around
"if (CURRENT_ENV->should_retain_local_variables() || DeoptimizeALot || CompileTheWorld) {"
(ciMethod.cpp) because CompileTheWorld was removed.

Please review.

Best regards,
Martin

From goetz.lindenmaier at sap.com Mon May 6 12:59:23 2019
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 6 May 2019 12:59:23 +0000
Subject: RFR(S): jdk11u-dev backport 8216556: Unnecessary liveness computation with JVMTI
In-Reply-To:
References:
Message-ID:

Hi Martin,

you integrated this change well into 11.
Thanks for downporting it, it will help the debugging performance slightly.

Best regards,
Goetz.

From martin.doerr at sap.com Mon May 6 13:05:13 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 6 May 2019 13:05:13 +0000
Subject: RFR(S): jdk11u-dev backport 8216556: Unnecessary liveness computation with JVMTI
In-Reply-To:
References:
Message-ID:

Hi Götz,

thank you for reviewing.

Best regards,
Martin

From jesper.wilhelmsson at oracle.com Mon May 6 14:18:16 2019
From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com)
Date: Mon, 6 May 2019 16:18:16 +0200
Subject: RFR: JDK-8222665 - Update Graal
Message-ID:

Hi,

Please review the patch to integrate recent Graal changes into OpenJDK.
Graal tip to integrate: 88c3adb11b1bc10f6443435685b65227e7584b43

Bug: https://bugs.openjdk.java.net/browse/JDK-8222665
Webrev: http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/

This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug.

Thanks,
/Jesper

From eric.caspole at oracle.com Mon May 6 14:21:08 2019
From: eric.caspole at oracle.com (Eric Caspole)
Date: Mon, 6 May 2019 10:21:08 -0400
Subject: RFR (M) 8222074: Enhance auto vectorization for x86
In-Reply-To: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB7472@FMSMSX126.amr.corp.intel.com>
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB5C2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB845@FMSMSX126.amr.corp.intel.com> <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com> <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com> <0cd3fd93-0f1e-a6d0-d4c3-f8d95b533ff7@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB56B1@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB7472@FMSMSX126.amr.corp.intel.com>
Message-ID:

Hi Sandhya,

Could you add some
new JMH benchmarks to this webrev that target the Java code that shows the benefit of these changes? Or, you could look through the existing ones in test/micro/org/openjdk/bench/ and mention in the bug which existing ones exercise these changes. That will be a big help to us in the course of working on JDK 13.

Thanks,
Eric

On 5/3/19 19:02, Viswanathan, Sandhya wrote:
> Hi Vladimir,
>
> Please find below the updated webrev which implements all your inputs:
> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.02/
>
> Looking forward to your feedback.
>
> Best Regards,
> Sandhya

From patricio.chilano.mateo at oracle.com Mon May 6 16:10:16 2019
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Mon, 6 May 2019 12:10:16 -0400
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To:
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <64a8afca-9dc8-b119-0a12-dd05799bdd22@oracle.com>
Message-ID: <259f2edc-a842-8f14-39d6-74eb47a2964c@oracle.com>

Hi Robbin,

I'm going to just review the biased locking part since I'm not really familiar with the rest of the code.

In BiasedLocking::revoke_and_rebias_in_handshake(), why do you need to execute fast_revoke(obj, false)? If these are objects locked by the JavaThread you are handshaking then it seems they should be normal locks (no bias pattern) or the condition (mark->biased_locker() == THREAD && prototype_header->bias_epoch() == mark->bias_epoch()) you are testing for later should hold. Then that would save the extra comparisons in fast_revoke().

Also instead of placing the condition (mark->biased_locker() == THREAD && prototype_header->bias_epoch() == mark->bias_epoch()) inside an if() and then later use a ShouldNotReachHere(), wouldn't it be better to make that an assertion, place that code outside the if() and remove the ShouldNotReachHere()?

For the execution of revoke_bias() inside BiasedLocking::revoke_and_rebias_in_handshake() you could use a shorter version of BiasedLocking::revoke_and_rebias() that avoids the extra comparisons made for the general case and just starts at the walking the stack part, but I'm actually doing that for 8191890 so I can merge that with my patch.
In deoptimization.cpp you have methods inflate_monitors() and inflate_monitors_handshake(), but in inflate_monitors() you are not inflating the monitors, you just revoke the ones that have bias. You mentioned in your first email that we need to inflate if we are not at a safepoint, why is that? Since revocation seems to be the common factor between those methods, maybe s/inflate/revoke is a better name?

Thanks!
Patricio

On 5/6/19 4:42 AM, Robbin Ehn wrote:
> Hi Dan,
>
>> src/hotspot/share/runtime/biasedLocking.cpp
>>     nit - Please update copyright year for this file.
>>
>
> Updated in 8220724.
>
>>     Nice refactoring into more readable chunks! I'm assuming that
>>     Patricio is also reviewing these changes...
>
> Great, good!
>
>> src/hotspot/share/runtime/deoptimization.cpp
>>     L778:  bool _in_handshake;
>>         nit - needs one more space of indent.
>
> Fixed.
>
>>
>>     Nice refactoring while adding in the handshake support.
>
> Great!
>
>>
>> src/hotspot/share/runtime/deoptimization.hpp
>>     L147:  public:
>>     L148:
>>     L149:   // Deoptimizes a frame lazily. nmethod gets patched deopt happens on return to the frame
>>     L163:   static void fix_monitors(JavaThread* thread, frame fr, RegisterMap* map)
>>         Style nit: I would put the blank line on L148 above L147.
>
> Fixed.
>
>>
>>     L164:     { inflate_monitors(thread, fr, map); }
>>         Style nit: Should be:
>>
>>             static void fix_monitors(JavaThread* thread, frame fr, RegisterMap* map) {
>>               inflate_monitors(thread, fr, map);
>>             }
>
> Fixed.
>
>> src/hotspot/share/runtime/mutexLocker.cpp
>>     No comments. (So OsrList_lock is now 'special-1' instead of 'leaf'.
>>     I presume the Compiler team is okay with that...
>
> Since we need to hold CodeCache_lock while iterating nmethods, all locks
> that might be taken needed to be pushed down under CodeCache_lock.
> So I hope they are okay with that.
>
>> Thumbs up!  I don't need to see a webrev if you fix the nits...
>
> Thanks Dan! Fixed!
>
> I did t6-7 over the weekend, no issues found.
>
> /Robbin
>
>>
>> Dan
>>
>>
>>>
>>> # Note
>>> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html
>>> line 630
>>> This is a revert to the original; I had accidentally left in a temporary
>>> test change, as you can see here in the full diff:
>>> http://cr.openjdk.java.net/~rehn/8221734/v2/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html
>>>
>>>
>>> I think I managed to address all review comments.
>>>
>>> Dean can you please cast an extra eye on:
>>> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/oops/method.cpp.sdiff.html
>>>
>>> This OR should be correct.
>>>
>>> Dan please do the same on the biased locking changes.
>>>
>>> I left out the merge with MutexLocker changes, since it was not
>>> interesting.
>>> There were some conflicts with JVMCI changes, so incremental
>>> contains some parts of that merge.
>>>
>>> Passes t1-5 and local testing.
>>> I'll continue with some additional testing.
>>>
>>> Thanks, Robbin
>>>
>>> On 4/25/19 2:05 PM, Robbin Ehn wrote:
>>>> Hi all, please review.
>>>>
>>>> Let's deopt with handshakes.
>>>> Removed VM op Deoptimize, instead we handshake.
>>>> Locks need to be inflated since we are not in a safepoint.
>>>>
>>>> Goes on top of:
>>>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html
>>>>
>>>>
>>>> Code:
>>>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html
>>>> Issue:
>>>> https://bugs.openjdk.java.net/browse/JDK-8221734
>>>>
>>>> Passes t1-7 and multiple t1-5 runs.
>>>>
>>>> A few startup benchmarks see a small speedup.
>>>> >>>> Thanks, Robbin >> From tom.rodriguez at oracle.com Mon May 6 17:00:59 2019 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Mon, 6 May 2019 10:00:59 -0700 Subject: [13] RFR(M) 8223332: Update JVMCI In-Reply-To: <085cd3bd-e2ba-4bdc-0573-c7f5cad98fed@oracle.com> References: <085cd3bd-e2ba-4bdc-0573-c7f5cad98fed@oracle.com> Message-ID: Looks good. tom Vladimir Kozlov wrote on 5/4/19 10:54 AM: > http://cr.openjdk.java.net/~kvn/8223332/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8223332 > > Sync latest JVMCI changes from graal-jvmci-8 [1] > The list in the bug report. > > [1] https://github.com/graalvm/graal-jvmci-8/commits/master > From vladimir.kozlov at oracle.com Mon May 6 17:13:56 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 6 May 2019 10:13:56 -0700 Subject: [13] RFR(M) 8223332: Update JVMCI In-Reply-To: References: <085cd3bd-e2ba-4bdc-0573-c7f5cad98fed@oracle.com> Message-ID: Thank you, Tom Vladimir On 5/6/19 10:00 AM, Tom Rodriguez wrote: > Looks good. > > tom > > Vladimir Kozlov wrote on 5/4/19 10:54 AM: >> http://cr.openjdk.java.net/~kvn/8223332/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8223332 >> >> Sync latest JVMCI changes from graal-jvmci-8 [1] >> The list in the bug report. >> >> [1] https://github.com/graalvm/graal-jvmci-8/commits/master >> From vladimir.kozlov at oracle.com Mon May 6 18:11:38 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 6 May 2019 11:11:38 -0700 Subject: RFR: JDK-8222665 - Update Graal In-Reply-To: References: Message-ID: <5ff3bca1-2ca5-24d4-976a-0d3ffdcaa874@oracle.com> It seems webrev is wrong. Jesper, is it possible you sent old webrev? I looked on patch (from submitted test job) and it seems correct. For example, from next changes [1] it correctly updated only Copyright year in JDK (in JDK it was old 2018). But webrev shows reversed changes [2]. The patch does not have IsGraalPredicate.java changes. 
But webrev has it with reversed changes again [3]. The same for GraalServices.java file changes. No changes in patch but reverse changes in webrev. Thanks, Vladimir [1] https://github.com/oracle/graal/commit/4fa819e120212393122b55e2c95e9de7c6101ccf#diff-3f2f58ebefeb6c5489c4d264ec8ae502 [2] http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/meta/DefaultHotSpotLoweringProvider.java.udiff.html [3] http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/IsGraalPredicate.java.udiff.html On 5/6/19 7:18 AM, jesper.wilhelmsson at oracle.com wrote: > Hi, > > Please review the patch to integrate recent Graal changes into OpenJDK. > Graal tip to integrate: 88c3adb11b1bc10f6443435685b65227e7584b43 > > Bug: https://bugs.openjdk.java.net/browse/JDK-8222665 > Webrev: http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/ > > This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug. > > Thanks, > /Jesper > From jesper.wilhelmsson at oracle.com Mon May 6 18:32:37 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Mon, 6 May 2019 20:32:37 +0200 Subject: RFR: JDK-8222665 - Update Graal In-Reply-To: <5ff3bca1-2ca5-24d4-976a-0d3ffdcaa874@oracle.com> References: <5ff3bca1-2ca5-24d4-976a-0d3ffdcaa874@oracle.com> Message-ID: Sorry! I forgot to remove the old one so the script automatically created webrev.01 but still linked to the old in the email. Current webrev: http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.01/ /Jesper > On 6 May 2019, at 20:11, Vladimir Kozlov wrote: > > It seems webrev is wrong. Jesper, is it possible you sent old webrev? > > I looked on patch (from submitted test job) and it seems correct. 
For example, from next changes [1] it correctly updated only Copyright year in JDK (in JDK it was old 2018). > But webrev shows reversed changes [2]. > > The patch does not have IsGraalPredicate.java changes. But webrev has it with reversed changes again [3]. > > The same for GraalServices.java file changes. No changes in patch but reverse changes in webrev. > > Thanks, > Vladimir > > [1] https://github.com/oracle/graal/commit/4fa819e120212393122b55e2c95e9de7c6101ccf#diff-3f2f58ebefeb6c5489c4d264ec8ae502 > > [2] http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/meta/DefaultHotSpotLoweringProvider.java.udiff.html > > [3] http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/IsGraalPredicate.java.udiff.html > > On 5/6/19 7:18 AM, jesper.wilhelmsson at oracle.com wrote: >> Hi, >> Please review the patch to integrate recent Graal changes into OpenJDK. >> Graal tip to integrate: 88c3adb11b1bc10f6443435685b65227e7584b43 > >> Bug: https://bugs.openjdk.java.net/browse/JDK-8222665 >> Webrev: http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/ >> This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug. >> Thanks, >> /Jesper -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL:

From dean.long at oracle.com Mon May 6 18:39:23 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Mon, 6 May 2019 11:39:23 -0700
Subject: [13] RFR (M): 8223216: C2: Unify class initialization checks between new, getstatic, and putstatic
In-Reply-To:
References: <5e67b2d3-9856-069e-4886-8366c89bc3f8@oracle.com>
Message-ID: <2d4adb75-c86b-8115-16b0-fe7c5b4129e0@oracle.com>

OK, thanks for the explanation.  Looks good.

dl

On 5/3/19 3:49 PM, Vladimir Ivanov wrote:
> Thanks for the feedback, Dean.
>
>> Do you want to have a Runtime reviewer take a look at the new logic?
>
> I'm definitely looking for feedback on 8223213 from Runtime team. But
> 8223216 is C2-specific and incrementally builds on top of it, so I
> don't think there's anything new for Runtime team to look at.
>
>> Can you explain why Parse::clinit_deopt() changed from testing for
>>
>> InstanceKlass::fully_initialized
>>
>> to testing for
>>
>> InstanceKlass::being_initialized
>>
>> instead?  How do we know it is the initializing thread?
>
> Initializing thread is irrelevant here. The check is solely about the
> current state of the holder class.
>
> Parse::clinit_deopt() is not mandatory (nmethod clinit barrier on
> entry covers all important cases), but an optimization. It is added by
> 8223213 specifically for C2 to trigger recompilation once the holder
> class is fully initialized. The motivation is to get better code when
> a class is fully initialized.
>
> The change in 8223216 is intended as a refactoring: since there are
> only 2 states allowed here (being_initialized and fully_initialized),
> it doesn't matter what state is checked (== being_initialized vs !=
> fully_initialized).
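[Editorial illustration] The equivalence described in the paragraph above can be checked with a small stand-alone C++ sketch. This is not HotSpot code: InitState, its enumerators, and the helper names are hypothetical stand-ins chosen only to mirror the state names mentioned in the thread, and the extra "linked" state exists solely to show where the two phrasings would diverge.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a class initialization state.
// The real InstanceKlass has more states; "linked" is kept here only to
// show a state outside the two that the thread says are allowed.
enum class InitState { linked, being_initialized, fully_initialized };

// Check phrased as "!= fully_initialized".
inline bool not_fully_initialized(InitState s) {
  return s != InitState::fully_initialized;
}

// Check phrased as "== being_initialized".
inline bool is_being_initialized(InitState s) {
  return s == InitState::being_initialized;
}

// True when both phrasings give the same answer for state s.
inline bool checks_agree(InitState s) {
  return not_fully_initialized(s) == is_being_initialized(s);
}

// Assumed usage: the caller guarantees only the two allowed states reach
// this point, so either phrasing of the check could be used.
inline bool holder_still_initializing(InitState s) {
  assert(s == InitState::being_initialized ||
         s == InitState::fully_initialized);
  return is_being_initialized(s);
}
```

With only being_initialized and fully_initialized possible, checks_agree() holds for both; it fails for any third state, which is why the refactoring relies on the two-state assumption.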
>
> Best regards,
> Vladimir Ivanov
>
>> On 5/1/19 4:37 PM, Vladimir Ivanov wrote:
>>> http://cr.openjdk.java.net/~vlivanov/8223216/webrev.00/
>>> https://bugs.openjdk.java.net/browse/JDK-8223216
>>>
>>> (The patch has minor dependencies on 8223213 [1] I sent out for
>>> review earlier.)
>>>
>>> C2 implements class initialization checks for new and
>>> getstatic/putstatic differently: while "new" supports fast class
>>> initialization checks, static field accesses rely on uncommon traps
>>> which may lead to deoptimization/recompilation storms during
>>> long-running class initialisation.
>>>
>>> Proposed patch unifies implementation between them and uses the
>>> following barrier:
>>>   if (holder->is_initialized()) {
>>>     uncommon_trap(initialized, reinterpret);
>>>   }
>>>   if (!holder->is_reentrant_initialization(current_thread)) {
>>>     uncommon_trap(uninitialized, none);
>>>   }
>>>
>>> It also enhances checks for not-yet-initialized classes
>>> (Compile::needs_clinit_barrier) and unifies the implementation
>>> between new, invokestatic, and getfield/putfield.
>>>
>>> Testing: tier1-5, targeted microbenchmarks, new test from 8223213
>>>
>>> Thanks!
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> [1] http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/
>>> https://bugs.openjdk.java.net/browse/JDK-8223213
>>>
>>

From vladimir.kozlov at oracle.com Mon May 6 18:43:39 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 6 May 2019 11:43:39 -0700
Subject: RFR: JDK-8222665 - Update Graal
In-Reply-To:
References: <5ff3bca1-2ca5-24d4-976a-0d3ffdcaa874@oracle.com>
Message-ID:

Yes, this one looks good!

And testing seems fine - most failures are timeouts due to Graal runs with -Xcomp -XX:-TieredCompilation which is known issue 8222524.

Thanks,
Vladimir

On 5/6/19 11:32 AM, jesper.wilhelmsson at oracle.com wrote:
> Sorry!  I forgot to remove the old one so the script automatically created webrev.01 but still linked to the old in the email.
>
> Current webrev:
> http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.01/
>
> /Jesper
>
>> On 6 May 2019, at 20:11, Vladimir Kozlov wrote:
>>
>> It seems webrev is wrong. Jesper, is it possible you sent old webrev?
>>
>> I looked on patch (from submitted test job) and it seems correct. For example, from next changes [1] it correctly updated only Copyright year in JDK (in JDK it was old 2018).
>> But webrev shows reversed changes [2].
>>
>> The patch does not have IsGraalPredicate.java changes. But webrev has it with reversed changes again [3].
>>
>> The same for GraalServices.java file changes. No changes in patch but reverse changes in webrev.
>>
>> Thanks,
>> Vladimir
>>
>> [1] https://github.com/oracle/graal/commit/4fa819e120212393122b55e2c95e9de7c6101ccf#diff-3f2f58ebefeb6c5489c4d264ec8ae502
>>
>> [2] http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/meta/DefaultHotSpotLoweringProvider.java.udiff.html
>>
>> [3] http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/IsGraalPredicate.java.udiff.html
>>
>> On 5/6/19 7:18 AM, jesper.wilhelmsson at oracle.com wrote:
>>> Hi,
>>> Please review the patch to integrate recent Graal changes into OpenJDK.
>>> Graal tip to integrate: 88c3adb11b1bc10f6443435685b65227e7584b43
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8222665
>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/
>>> This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug.
>>> Thanks,
>>> /Jesper
>

From dean.long at oracle.com Mon May 6 18:45:33 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Mon, 6 May 2019 11:45:33 -0700
Subject: RFR(trivial): 8223054: [TESTBUG] Put graalJarsCP before existing classpath in GraalUnitTestLauncher
In-Reply-To:
References: <803B96E9-8EDD-4469-9137-63451E825724@oracle.com>
Message-ID:

Looks good (and trivial) to me.

dl

On 5/6/19 3:41 AM, Pengfei Li (Arm Technology China) wrote:
> Thanks Igor. Do I need another reviewer for this trivial change?
>
> // Also cc graal-dev list
>
> --
> Thanks,
> Pengfei
>
>> Looks good to me.
>>
>> // moved to hotspot compiler list
>>
>> -- Igor
>>
>>> On May 4, 2019, at 6:32 PM, Pengfei Li (Arm Technology China) wrote:
>>> Hi,
>>>
>>> Please help review this trivial change on GraalUnitTestLauncher.
>>>
>>> Webrev: http://cr.openjdk.java.net/~pli/rfr/8223054/webrev.00/
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8223054
>>>
>>> Current graal unit test in jtreg requires junit-4.12.jar as a dependency. In GraalUnitTestLauncher.java, we put the path of this file into graalJarsCP and concat it with existing classpath. But existing classpath may contain another version of junit with which the jtreg tool is built. (According to OpenJDK "Building jtreg" webpage[1], the recommended version of Junit to build jtreg is junit-4.10).
>>> In this patch, graalJarsCP is put before existing classpath returned by System.getProperty() when generating the new classpath string to avoid incompatibility issues. Jtreg graal unit test cases passed after this change.
>>> [1] https://openjdk.java.net/jtreg/build.html >>> >>> -- >>> Thanks, >>> Pengfei >>> From vladimir.x.ivanov at oracle.com Mon May 6 19:15:21 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Mon, 6 May 2019 12:15:21 -0700 Subject: RFR (M) 8222074: Enhance auto vectorization for x86 In-Reply-To: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB74D2@FMSMSX126.amr.corp.intel.com> References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB845@FMSMSX126.amr.corp.intel.com> <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com> <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com> <0cd3fd93-0f1e-a6d0-d4c3-f8d95b533ff7@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB56B1@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB7472@FMSMSX126.amr.corp.intel.com> <52876f29-4da2-2885-fe18-5e362b57eb2b@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB74D2@FMSMSX126.amr.corp.intel.com> Message-ID: > http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.03/ Looks good. Testing results are good as well. Best regards, Vladimir Ivanov > Yes, the footprint numbers continue to hold with this patch: > The x86.ad file is 126 lines smaller. > The libjvm size increase is only 0.24%. > > Best Regards, > Sandhya > > -----Original Message----- > From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] > Sent: Friday, May 03, 2019 4:22 PM > To: Viswanathan, Sandhya > Cc: hotspot-compiler-dev at openjdk.java.net; Vladimir Kozlov > Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 > > >> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.02/ > > Much better! I like how AD files look now. 
I assume static footprint > numbers you provided earlier are still valid. > > +void MacroAssembler::vabsnegd(int opcode, XMMRegister dst, Register scr) { > + if (opcode == Op_AbsVD) { > + andpd(dst, > ExternalAddress(StubRoutines::x86::vector_double_sign_mask()), scr); > + } else { > + assert((opcode == Op_NegVD),"opcode should be Op_NegD"); > + xorpd(dst, > ExternalAddress(StubRoutines::x86::vector_double_sign_flip()), scr); > + } > +} > > It's a bit odd to see C2-specific stuff in MacroAssembler, but I'm > perfectly fine with incrementally refactor it later. > > For now, just guard relevant code with #ifdef COMPILER2. > > Otherwise, looks very good! > > Best regards, > Vladimir Ivanov > >> >> Looking forward to your feedback. >> >> Best Regards, >> Sandhya >> >> >> -----Original Message----- >> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] >> Sent: Wednesday, May 01, 2019 5:09 PM >> To: Viswanathan, Sandhya ; Vladimir Kozlov >> Cc: hotspot-compiler-dev at openjdk.java.net >> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >> >> Sounds good, thanks! >> >> Best regards, >> Vladimir Ivanov >> >> On 01/05/2019 15:16, Viswanathan, Sandhya wrote: >>> I should add here that your suggestion of adding generic shift instruction etc to the macroAssembler is also wonderful instead of function pointer. I will look into making that change as well. 
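[Editorial illustration] The idea discussed above, a single generic shift entry point that dispatches on the operation instead of passing instruction function pointers around, might look roughly like the following C++ sketch. It is only a toy: ToyAssembler, ShiftOp, and emit_vshift16B are hypothetical names, the class records mnemonics instead of emitting machine code, and the mapping of each opcode to psllw/psraw/psrlw is an assumption about what the quoted get_xx_inst() helper selects. The nine-step sequence follows the format string quoted elsewhere in this thread.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the three ideal opcodes matched by vshift16B.
enum class ShiftOp { LShiftVB, RShiftVB, URShiftVB };

// ToyAssembler is NOT HotSpot's MacroAssembler; it only records mnemonics
// so that the dispatch logic can be inspected.
class ToyAssembler {
 public:
  // One generic entry point replaces the per-opcode function pointers.
  void vshift16B(ShiftOp op) {
    // Unsigned right shift zero-extends bytes to words; the others sign-extend.
    const std::string extend =
        (op == ShiftOp::URShiftVB) ? "pmovzxbw" : "pmovsxbw";
    const std::string shift = shift_mnemonic(op);
    emit(extend);      // widen the low 8 bytes of src into tmp1
    emit(shift);       // shift tmp1
    emit("pshufd");    // bring the high 8 bytes of src into tmp2
    emit(extend);      // widen tmp2
    emit(shift);       // shift tmp2
    emit("movdqu");    // load the short-to-byte mask into dst
    emit("pand");      // mask tmp2
    emit("pand");      // mask tmp1 into dst
    emit("packuswb");  // pack both halves back into 16 bytes
  }

  const std::vector<std::string>& code() const { return code_; }

 private:
  // Assumed mapping from opcode to the word-sized SSE shift instruction.
  static std::string shift_mnemonic(ShiftOp op) {
    switch (op) {
      case ShiftOp::LShiftVB:  return "psllw";
      case ShiftOp::RShiftVB:  return "psraw";
      case ShiftOp::URShiftVB: return "psrlw";
    }
    return "";  // not reached for valid enumerators
  }

  void emit(const std::string& mnemonic) {
    assert(!mnemonic.empty());
    code_.push_back(mnemonic);
  }

  std::vector<std::string> code_;
};

// Convenience helper for trying the sketch: returns the recorded sequence.
inline std::vector<std::string> emit_vshift16B(ShiftOp op) {
  ToyAssembler masm;
  masm.vshift16B(op);
  return masm.code();
}
```

Calling emit_vshift16B(ShiftOp::URShiftVB) yields a nine-entry sequence starting with "pmovzxbw" and "psrlw"; the AD instruct body would then only need to pass the opcode and operands, which is the readability gain the reviewers were after.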
>>> >>> Best Regards, >>> Sandhya >>> >>> >>> -----Original Message----- >>> From: Viswanathan, Sandhya >>> Sent: Wednesday, May 01, 2019 3:10 PM >>> To: 'Vladimir Ivanov' ; Vladimir Kozlov >>> Cc: hotspot-compiler-dev at openjdk.java.net >>> Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86 >>> >>> Hi Vladimir, >>> >>> I agree, I wanted to show both the approaches in this patch to get your feedback: >>> 1) with emit as a function >>> 2) with emit part in the instruct body itself >>> >>> With emit as a function it becomes hard to read and I personally prefer it in the instruct itself as is done for vabsneg2D etc. That is what you are recommending as well so I feel good. >>> >>> Once the adlc enhancement is done both the approaches should give similar binary size. Till then there will be small overhead with approach 2) as emit is duplicated per match rule. >>> >>> I will send an updated patch fixing the two issues you mentioned in your previous email plus this change of using approach 2). >>> >>> Please do let me know if you want to see any other change in this patch. >>> >>> Best Regards, >>> Sandhya >>> >>> >>> >>> -----Original Message----- >>> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] >>> Sent: Wednesday, May 01, 2019 2:58 PM >>> To: Viswanathan, Sandhya ; Vladimir Kozlov >>> Cc: hotspot-compiler-dev at openjdk.java.net >>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >>> >>> >>>> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.01/ >>> >>> Nice job, Sandhya! Glad to hear the approach pays off! >>> >>> Unfortunately, I must note that AD file becomes much more obscure. >>> Especially with those function pointers. >>> >>> 1528 void emit_vshift16B_code(MacroAssembler& _masm, int opcode, XMMRegister dst, >>> 1529 XMMRegister src, XMMRegister shift, >>> 1530 XMMRegister tmp1, XMMRegister tmp2, >>> Register scratch) { >>> 1531 XX_Inst extendinst = get_extend_inst(opcode == Op_URShiftVB ? 
>>> false : true); >>> 1532 XX_Inst shiftinst = get_xx_inst(opcode); >>> 1533 >>> 1534 (_masm.*extendinst)(tmp1, src); >>> 1535 (_masm.*shiftinst)(tmp1, shift); >>> 1536 __ pshufd(tmp2, src, 0xE); >>> 1537 (_masm.*extendinst)(tmp2, tmp2); >>> 1538 (_masm.*shiftinst)(tmp2, shift); >>> 1539 __ movdqu(dst, ExternalAddress(vector_short_to_byte_mask()), >>> scratch); >>> 1540 __ pand(tmp2, dst); >>> 1541 __ pand(dst, tmp1); >>> 1542 __ packuswb(dst, tmp2); >>> 1543 } >>> >>> Have you tried to encapsulate that into x86-specific MacroAssembler? >>> >>> 8682 instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{ >>> 8683 predicate(UseSSE > 3 && UseAVX <= 1 && n->as_Vector()->length() >>> == 16); >>> 8684 match(Set dst (LShiftVB src shift)); >>> 8685 match(Set dst (RShiftVB src shift)); >>> 8686 match(Set dst (URShiftVB src shift)); >>> 8687 effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch); >>> 8688 format %{"pmovxbw $tmp1,$src\n\t" >>> 8689 "shiftop $tmp1,$shift\n\t" >>> 8690 "pshufd $tmp2,$src\n\t" >>> 8691 "pmovxbw $tmp2,$tmp2\n\t" >>> 8692 "shiftop $tmp2,$shift\n\t" >>> 8693 "movdqu $dst,[0x00ff00ff0x00ff00ff]\n\t" >>> 8694 "pand $tmp2,$dst\n\t" >>> 8695 "pand $dst,$tmp1\n\t" >>> 8696 "packuswb $dst,$tmp2\n\t! 
packed16B shift" %} >>> 8697 ins_encode %{ >>> 8698 emit_vshift16B_code(_masm, this->as_Mach()->ideal_Opcode() , >>> $dst$$XMMRegister, $src$$XMMRegister, $shift$$XMMRegister, $tmp1$$XMMRegister, $tmp2$$XMMRegister, $scratch$$Register); >>> 8699 %} >>> 8700 ins_pipe( pipe_slow ); >>> 8701 %} >>> >>> can be turned into something like: >>> >>> instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{ >>> predicate(n->as_Vector()->length() == 16); >>> match(Set dst (LShiftVB src shift)); >>> match(Set dst (RShiftVB src shift)); >>> match(Set dst (URShiftVB src shift)); >>> effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch); >>> format %{"packed16B shift" %} >>> ins_encode %{ >>> int vlen = 0; // 128-bit >>> BasicType elem_type = T_BYTE; >>> int shift_mode = ...; // L/R/UR or S/U + L/R >>> __ vshift(vlen, elem_type, shift_mode, >>> $dst$$..., $src$$..., $shift$$..., >>> $tmp1$$..., $tmp2$$..., $scratch$$...); >>> %} >>> >>> Then MA::vshift can dispatch between different implementations depending on SSE/AVX level available. Do you see any problems with that from footprint perspective? >>> >>> Ideally, I'd prefer to see a library of operations on vectors encapsulated in MacroAssembler (or a subclass) and used in x86.ad. That will accommodate further reductions in AD instructions needed. >>> >>> Best regards, >>> Vladimir Ivanov >>> >>>> With this webrev the ad file has only about 60 lines effectively added. >>>> Also the generated product libjvm.so size only increases by about 0.26% vs the prior 1.50%. >>>> I have used multiple match rules in one instruct for same size shift related rules and also for the new Abs/Neg rules. >>>> What I noticed is that the adlc still duplicates lot of code and there is potential to further improve code size for multiple match rule case by improving the adlc itself. >>>> The adlc improvement (like removing duplicate emits, formats, expand, pipeline etc) can be done as a separate RFE. 
>>>> >>>> In this webrev, I have also fixed the errors reported by Vladimir Ivanov and corrected the issues reported by jcheck tool. >>>> Also taken into account reducing the temporary by using TEMP dst for multiply rules. >>>> >>>> The compiler jtreg tests and the java math tests pass on Haswell, SKX, and KNL. >>>> >>>> Your review and feedback is welcome. >>>> >>>> Best Regards, >>>> Sandhya >>>> >>>> >>>> -----Original Message----- >>>> From: hotspot-compiler-dev >>>> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of >>>> Viswanathan, Sandhya >>>> Sent: Wednesday, April 10, 2019 10:22 AM >>>> To: Vladimir Kozlov ; B. Blaser >>>> >>>> Cc: hotspot-compiler-dev at openjdk.java.net >>>> Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86 >>>> >>>> Yes good catch, in mul32B_reg_avx(), the last two instructions are the only place where dst is used: >>>> >>>> __ vpackuswb($dst$$XMMRegister, $tmp2$$XMMRegister, $tmp1$$XMMRegister, vector_len); >>>> __ vpermq($dst$$XMMRegister, $dst$$XMMRegister, 0xD8, >>>> vector_len); >>>> >>>> Here dst can be same as tmp2 or tmp1 in packuswb() and so the effect TEMP dst is not required. >>>> >>>> Best Regards, >>>> Sandhya >>>> >>>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>> Sent: Wednesday, April 10, 2019 9:59 AM >>>> To: Viswanathan, Sandhya ; B. Blaser >>>> >>>> Cc: hotspot-compiler-dev at openjdk.java.net >>>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >>>> >>>> On 4/10/19 8:36 AM, Viswanathan, Sandhya wrote: >>>>> Hi Bernard, >>>>> >>>>> One could add TEMP dst in effect() to let the register allocator know that dst needs to be different from src. >>>> >>>> Yes, we use this way. Or, in mul4B_reg() case, we can use $dst instead >>>> $tmp2 to avoid overwriting >>>> $src2 before we get value from it if $dst = $src2. 
>>>> >>>> On other hand, mul32B_reg_avx() and other have 'TEMP dst' effect but $dst is used only for final result. >>>> >>>> It is a little mess which may cause ineffective use of registers in compiled code. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>>> >>>>> Best Regards, >>>>> Sandhya >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: B. Blaser [mailto:bsrbnd at gmail.com] >>>>> Sent: Wednesday, April 10, 2019 4:10 AM >>>>> To: Viswanathan, Sandhya >>>>> Cc: Vladimir Kozlov ; >>>>> hotspot-compiler-dev at openjdk.java.net >>>>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >>>>> >>>>> Hi Sandhya and Vladimir K., >>>>> >>>>> On Wed, 10 Apr 2019 at 03:06, Viswanathan, Sandhya wrote: >>>>>> >>>>>> Hi Vladimir, >>>>>> >>>>>> Yes, I missed the question below: >>>>>>>> There are cases where we can use less `TEMP tmp` registers by using 'dst' register like in mul4B_reg(). Is it intentional to not use 'dst' there? >>>>>> >>>>>> No it is not intentional, we can use the dst register in those cases and reduced the tmps. >>>>> >>>>> I guess we have to be careful using $dst instead of $tmp registers as the allocator sometimes provides identical $src & $dst. Also, I'm not sure this would be possible in the case of mul4B_reg(): >>>>> >>>>> 7349 format %{"pmovsxbw $tmp,$src1\n\t" >>>>> 7350 "pmovsxbw $tmp2,$src2\n\t" >>>>> >>>>> I believe this couldn't work if you use $dst instead of $tmp and $dst = $src2, what do you think? 
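[Editorial illustration] The aliasing concern raised above can be made concrete with a small C++ sketch. This is not HotSpot code: registers are modelled as slots in a toy array, each "instruction" is a plain assignment, and the function names are hypothetical. The multiply step is an assumed simplification of what mul4B_reg() does after the two quoted pmovsxbw instructions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy register file: four registers, each holding a single int.
using RegFile = std::array<int, 4>;

// Variant A: $dst is used in place of $tmp. The first write goes to dst,
// so if the allocator picked dst == src2, src2 is clobbered before it is read.
inline int mul_with_dst_as_tmp(RegFile r, std::size_t dst, std::size_t src1,
                               std::size_t src2, std::size_t tmp2) {
  assert(dst < r.size() && src1 < r.size() && src2 < r.size() && tmp2 < r.size());
  r[dst] = r[src1];           // models "pmovsxbw $dst,$src1"
  r[tmp2] = r[src2];          // models "pmovsxbw $tmp2,$src2" (reads src2 too late)
  r[dst] = r[dst] * r[tmp2];  // assumed multiply step
  return r[dst];
}

// Variant B: $dst is used in place of $tmp2. src2 is read by the same
// step that writes dst, so dst == src2 is harmless.
inline int mul_with_dst_as_tmp2(RegFile r, std::size_t dst, std::size_t src1,
                                std::size_t src2, std::size_t tmp) {
  assert(dst < r.size() && src1 < r.size() && src2 < r.size() && tmp < r.size());
  r[tmp] = r[src1];          // models "pmovsxbw $tmp,$src1"
  r[dst] = r[src2];          // models "pmovsxbw $dst,$src2"
  r[dst] = r[tmp] * r[dst];  // assumed multiply step
  return r[dst];
}
```

With registers {3, 5, 0, 0}, src1 = 0 and src2 = 1, variant A returns 9 instead of 15 when dst aliases src2, while variant B still returns 15. Declaring TEMP dst, as suggested in the thread, is the other way out: it tells the allocator to keep dst distinct from the inputs.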
>>>>> >>>>> Thanks, >>>>> Bernard >>>>> From vladimir.x.ivanov at oracle.com Mon May 6 19:19:02 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Mon, 6 May 2019 12:19:02 -0700 Subject: RFR: 8221542: ~15% performance degradation due to less optimized inline decision In-Reply-To: <8510740c-ac56-f8d2-3c5e-451dfa6948a0@loongson.cn> References: <6aebd883-0be7-0b05-5364-262e138a1fbc@loongson.cn> <182d87da-0d99-3f33-fbe7-ef5818be0422@loongson.cn> <0936427d-f4d2-299a-87ce-860dce5e57e1@loongson.cn> <574d59f5-3437-738f-e10c-796dcb02b42e@oracle.com> <5275854c-ab35-f160-f6f0-6ab9ac86e3d0@loongson.cn> <8bc507fe-b6db-d697-8821-0547860de232@oracle.com> <1a398a1f-ed52-2197-5886-d9d5fd872974@loongson.cn> <5607f7ca-57b9-b409-3bce-efc1688f0678@loongson.cn> <8510740c-ac56-f8d2-3c5e-451dfa6948a0@loongson.cn> Message-ID: <0ee5e3ad-6eef-7976-7fa7-adb60a9eb4ad@oracle.com> Sure, pushed [1] Best regards, Vladimir Ivanov [1] http://hg.openjdk.java.net/jdk/jdk/rev/1abca1170080 On 03/05/2019 16:24, Jie Fu wrote: > Hi Vladimir Ivanov, > > The patch in the attachment has been updated by adding brackets to the > checks in InlineTree::is_not_reached. > > Is it OK to be pushed? > If so, could you please sponsor it? > > Thanks a lot. > > Best regards, > Jie > > > On 2019-05-04 06:06, Vladimir Ivanov wrote: >> CCing Jie Fu. >> >> Best regards, >> Vladimir Ivanov >> >> On 03/05/2019 14:55, coleen.phillimore at oracle.com wrote: >>> >>> http://cr.openjdk.java.net/~vlivanov/jiefu/8221542/webrev.02/src/hotspot/share/oops/cpCache.cpp.frames.html >>> >>> >>> This looks like it should have gotten the wrong answer without this >>> change (there appears to be protection from an index out of range) >>> even without your patch. f2 is Method* for invokeinterface now. >>> >>> The runtime part of this change look good to me.
>>> >>> Thanks, >>> Coleen >>> >>> >>> On 5/2/19 5:23 AM, Tobias Hartmann wrote: >>>> Hi Jie, >>>> >>>> this looks good to me too but please add brackets to the checks in >>>> InlineTree::is_not_reached. >>>> >>>> I've submitted some extended testing and let you know once it passed. >>>> >>>> Someone from the runtime team should also have a look at this >>>> because your changes affect the >>>> interpreter. CC'ing runtime-dev. >>>> >>>> Thanks, >>>> Tobias >>>> >>>> On 29.04.19 15:43, Jie Fu wrote: >>>>> Hi all, >>>>> >>>>> May I have another review for this change [1] to finalize the fix? >>>>> Thanks a lot. >>>>> >>>>> Best regards, >>>>> Jie >>>>> >>>>> [1] http://cr.openjdk.java.net/~vlivanov/jiefu/8221542/webrev.02/ >>>>> >>>>> >>>>> On 2019-04-20 11:35, Jie Fu wrote: >>>>>> Ah, I got it. >>>>>> I like your patch and benefit a lot from you. >>>>>> Thank you so much, Vladimir. >>>>>> >>>>>> Any comments from other reviewers? >>>>>> Thanks. >>>>>> >>>>>> Best regards, >>>>>> Jie >>>>>> >>>>>> On 2019/4/20 11:18, Vladimir Ivanov wrote: >>>>>>>>> After some explorations I decided to keep original behavior for >>>>>>>>> immature profiles >>>>>>>>> (profile.count == -1). >>>>>>>> I agree. >>>>>>>> >>>>>>>> I have two questions here. >>>>>>>> >>>>>>>> 1. What's the difference of the following two if statements? >>>>>>>> ------------------------------------------------- >>>>>>>> + if (!callee_method->was_executed_more_than(0)) return true; >>>>>>>> // callee was never executed >>>>>>>> + >>>>>>>> + if (caller_method->is_not_reached(caller_bci)) return true; >>>>>>>> // call site not resolved >>>>>>>> ------------------------------------------------- >>>>>>>> I think only one of them is needed. >>>>>>> The checks are complementary: one inspects callee and the other >>>>>>> looks at call site. >>>>>>> >>>>>>> "!callee_method->was_executed_more_than(0)" ensures that callee >>>>>>> was executed at least once.
>>>>>>> >>>>>>> "caller_method->is_not_reached(caller_bci)" inspects the state of >>>>>>> the call site. If corresponding >>>>>>> CP entry is not resolved, then the call site isn't reached. If >>>>>>> is_not_reached() returns false, >>>>>>> it's not a definitive answer: there's still a chance the site is >>>>>>> not reached - consider the case >>>>>>> of virtual calls where callee_method may differ for the same >>>>>>> resolved method. >>>>>>> >>>>>>>> 2. Does the assert in InlineTree::is_not_reached(...) make sense? >>>>>>>> Since we have >>>>>>>> ------------------------------------------------- >>>>>>>> if (profile.count() > 0) return false; // reachable according >>>>>>>> to profile >>>>>>>> ------------------------------------------------- >>>>>>>> and >>>>>>>> ------------------------------------------------- >>>>>>>> if (profile.count() == -1) {...} >>>>>>>> ------------------------------------------------- >>>>>>>> before >>>>>>>> ------------------------------------------------- >>>>>>>> assert(profile.count() == 0, "sanity"); >>>>>>>> ------------------------------------------------- >>>>>>>> is the assert redundant? >>>>>>> Asserts are intended to be redundant :-) But still catch bugs >>>>>>> from time to time. >>>>>>> >>>>>>> This one, in particular, checks invariant on profile.count() >= >>>>>>> -1 (which is not very useful by >>>>>>> itself), but also stresses that "profile.count() == 0" case is >>>>>>> being processed. >>>>>>> >>>>>>> Best regards, >>>>>>> Vladimir Ivanov >>>>> >>> > From jesper.wilhelmsson at oracle.com Mon May 6 19:19:55 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Mon, 6 May 2019 21:19:55 +0200 Subject: RFR: JDK-8222665 - Update Graal In-Reply-To: References: <5ff3bca1-2ca5-24d4-976a-0d3ffdcaa874@oracle.com> Message-ID: <6D547293-39BD-4652-8DD4-6F59D41C0C35@oracle.com> Thanks Vladimir! What about the overwritten diff? Should it be applied?
/Jesper > On 6 May 2019, at 20:43, Vladimir Kozlov wrote: > > Yes, this one looks good! And testing seems fine - most failures are timeouts due to Graal runs with -Xcomp -XX:-TieredCompilation which is known issue 8222524. > > Thanks, > Vladimir > > On 5/6/19 11:32 AM, jesper.wilhelmsson at oracle.com wrote: >> Sorry! I forgot to remove the old one so the script automatically created webrev.01 but still linked to the old in the email. >> Current webrev: >> http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.01/ >> /Jesper >>> On 6 May 2019, at 20:11, Vladimir Kozlov > wrote: >>> >>> It seems webrev is wrong. Jesper, is it possible you sent old webrev? >>> >>> I looked on patch (from submitted test job) and it seems correct. For example, from next changes [1] it correctly updated only Copyright year in JDK (in JDK it was old 2018). >>> But webrev shows reversed changes [2]. >>> >>> The patch does not have IsGraalPredicate.java changes. But webrev has it with reversed changes again [3]. >>> >>> The same for GraalServices.java file changes. No changes in patch but reverse changes in webrev. >>> >>> Thanks, >>> Vladimir >>> >>> [1] https://github.com/oracle/graal/commit/4fa819e120212393122b55e2c95e9de7c6101ccf#diff-3f2f58ebefeb6c5489c4d264ec8ae502 >>> >>> [2] http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/meta/DefaultHotSpotLoweringProvider.java.udiff.html >>> >>> [3] http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/IsGraalPredicate.java.udiff.html >>> >>> On 5/6/19 7:18 AM, jesper.wilhelmsson at oracle.com wrote: >>>> Hi, >>>> Please review the patch to integrate recent Graal changes into OpenJDK. 
>>>> Graal tip to integrate: 88c3adb11b1bc10f6443435685b65227e7584b43 > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8222665 >>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/ >>>> This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug. >>>> Thanks, >>>> /Jesper -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From vladimir.kozlov at oracle.com Mon May 6 19:43:00 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 6 May 2019 12:43:00 -0700 Subject: RFR: JDK-8222665 - Update Graal In-Reply-To: <6D547293-39BD-4652-8DD4-6F59D41C0C35@oracle.com> References: <5ff3bca1-2ca5-24d4-976a-0d3ffdcaa874@oracle.com> <6D547293-39BD-4652-8DD4-6F59D41C0C35@oracle.com> Message-ID: On 5/6/19 12:19 PM, jesper.wilhelmsson at oracle.com wrote: > Thanks Vladimir! > > What about the overwritten diff? Should it be applied? No need to apply. The overwritten diffs were pushed into Graal master and your last changes have them. I verified it. Thanks, Vladimir > /Jesper > >> On 6 May 2019, at 20:43, Vladimir Kozlov wrote: >> >> Yes, this one looks good! And testing seems fine - most failures are timeouts due to Graal runs with -Xcomp -XX:-TieredCompilation which is known issue 8222524. >> >> Thanks, >> Vladimir >> >> On 5/6/19 11:32 AM, jesper.wilhelmsson at oracle.com wrote: >>> Sorry! I forgot to remove the old one so the script automatically created webrev.01 but still linked to the old in the email. >>> Current webrev: >>> http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.01/ >>> /Jesper >>>> On 6 May 2019, at 20:11, Vladimir Kozlov > wrote: >>>> >>>> It seems webrev is wrong. Jesper, is it possible you sent old webrev? >>>> >>>> I looked on patch (from submitted test job) and it seems correct. 
For example, from next changes [1] it correctly updated only Copyright year in JDK (in JDK it was old 2018). >>>> But webrev shows reversed changes [2]. >>>> >>>> The patch does not have IsGraalPredicate.java changes. But webrev has it with reversed changes again [3]. >>>> >>>> The same for GraalServices.java file changes. No changes in patch but reverse changes in webrev. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> [1] https://github.com/oracle/graal/commit/4fa819e120212393122b55e2c95e9de7c6101ccf#diff-3f2f58ebefeb6c5489c4d264ec8ae502 >>>> >>>> [2] http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/meta/DefaultHotSpotLoweringProvider.java.udiff.html >>>> >>>> [3] http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/IsGraalPredicate.java.udiff.html >>>> >>>> On 5/6/19 7:18 AM, jesper.wilhelmsson at oracle.com wrote: >>>>> Hi, >>>>> Please review the patch to integrate recent Graal changes into OpenJDK. >>>>> Graal tip to integrate: 88c3adb11b1bc10f6443435685b65227e7584b43 > >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8222665 >>>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/ >>>>> This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug. >>>>> Thanks, >>>>> /Jesper > From jesper.wilhelmsson at oracle.com Mon May 6 19:48:29 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Mon, 6 May 2019 21:48:29 +0200 Subject: RFR: JDK-8222665 - Update Graal In-Reply-To: References: <5ff3bca1-2ca5-24d4-976a-0d3ffdcaa874@oracle.com> <6D547293-39BD-4652-8DD4-6F59D41C0C35@oracle.com> Message-ID: Thank you! /Jesper > On 6 May 2019, at 21:43, Vladimir Kozlov wrote: > > On 5/6/19 12:19 PM, jesper.wilhelmsson at oracle.com wrote: >> Thanks Vladimir! 
>> What about the overwritten diff? Should it be applied? > > No need to apply. The overwritten diffs were pushed into Graal master and your last changes have them. I verified it. > > Thanks, > Vladimir > >> /Jesper >>> On 6 May 2019, at 20:43, Vladimir Kozlov wrote: >>> >>> Yes, this one looks good! And testing seems fine - most failures are timeouts due to Graal runs with -Xcomp -XX:-TieredCompilation which is known issue 8222524. >>> >>> Thanks, >>> Vladimir >>> >>> On 5/6/19 11:32 AM, jesper.wilhelmsson at oracle.com wrote: >>>> Sorry! I forgot to remove the old one so the script automatically created webrev.01 but still linked to the old in the email. >>>> Current webrev: >>>> http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.01/ >>>> /Jesper >>>>> On 6 May 2019, at 20:11, Vladimir Kozlov > wrote: >>>>> >>>>> It seems webrev is wrong. Jesper, is it possible you sent old webrev? >>>>> >>>>> I looked on patch (from submitted test job) and it seems correct. For example, from next changes [1] it correctly updated only Copyright year in JDK (in JDK it was old 2018). >>>>> But webrev shows reversed changes [2]. >>>>> >>>>> The patch does not have IsGraalPredicate.java changes. But webrev has it with reversed changes again [3]. >>>>> >>>>> The same for GraalServices.java file changes. No changes in patch but reverse changes in webrev. 
>>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> [1] https://github.com/oracle/graal/commit/4fa819e120212393122b55e2c95e9de7c6101ccf#diff-3f2f58ebefeb6c5489c4d264ec8ae502 >>>>> >>>>> [2] http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/meta/DefaultHotSpotLoweringProvider.java.udiff.html >>>>> >>>>> [3] http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/IsGraalPredicate.java.udiff.html >>>>> >>>>> On 5/6/19 7:18 AM, jesper.wilhelmsson at oracle.com wrote: >>>>>> Hi, >>>>>> Please review the patch to integrate recent Graal changes into OpenJDK. >>>>>> Graal tip to integrate: 88c3adb11b1bc10f6443435685b65227e7584b43 > >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8222665 >>>>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8222665/webrev.00/ >>>>>> This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug. >>>>>> Thanks, >>>>>> /Jesper -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From vladimir.kozlov at oracle.com Mon May 6 21:34:25 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 6 May 2019 14:34:25 -0700 Subject: RFR(S) 8218700: infinite loop in HotSpotJVMCIMetaAccessContext.fromClass after OutOfMemoryError In-Reply-To: <415627e2-165c-14a0-a069-2e01de5574d4@oracle.com> References: <53bcf718-e543-d40c-5486-58b98f66bcee@oracle.com> <1e91a8e6-16bc-2ae0-8aaf-830e1c6b450a@oracle.com> <3d15e9f0-8717-ac82-678d-2139dcfec7f8@oracle.com> <415627e2-165c-14a0-a069-2e01de5574d4@oracle.com> Message-ID: <0817ed0f-a912-4ffc-6a4a-6623802b28ec@oracle.com> Looks good to me too. 
Thanks Vladimir On 5/3/19 11:55 AM, dean.long at oracle.com wrote: > On 5/3/19 10:45 AM, Tom Rodriguez wrote: >> >> >> dean.long at oracle.com wrote on 5/2/19 11:47 PM: >>> On 5/1/19 5:44 PM, Tom Rodriguez wrote: >>>> You'll need to update your webrev after Vladimir's push. This code has moved into HotSpotJVMCIRuntime.java. >>>> >>> >>> Here's the updated version: >>> >>> http://cr.openjdk.java.net/~dlong/8218700/webrev.3/ >> >> Looks good to me. > > Thanks for the review. > >> >>> >>>> Maybe WeakReferenceHolder instead of WeakTypeRef? It needs a comment explaining that we're intentionally avoiding >>>> the use of ClassValue.remove as well. Shouldn't the ref field be volatile? ClassValue includes some barrier >>>> semantics and the new code needs similar guarantees. >>>> >>> >>> I went ahead and made it volatile, but I don't understand what guarantee was missing, and what problem we want to >>> eliminate, unless it is to reduce the possibility of duplicates. But the fix for JDK-8201248 assumes that duplicates >>> are possible, so I wasn't worried about that. >> >> We're publishing a mutable locally created object to other threads so it seems like we need some sort of ordering >> barrier when we do so. Presumably the ClassValue would normally provide some ordering though it's a little unclear >> from the javadoc if it makes any such guarantees. Is the extra volatile unneeded? >> > > ClassValue uses volatile internally so that an unsynchronized read sees the latest version. Using a volatile here > should help in a similar way, but I believe there is still a race that allows duplicates if the weak reference gets > cleared by GC. To prevent all duplicates I think we would need both volatile and more synchronization.
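The workaround discussed in this thread (keep a mutable cell in the ClassValue and reset it, rather than calling ClassValue.remove) might look roughly like the following. This is a hedged sketch, not the code from the webrev: WeakCellCacheSketch, describe(), and the String payload are hypothetical names chosen for the example. Only ClassValue, WeakReference, and the WeakReferenceHolder name suggested in the review come from the JDK or the discussion.

```java
import java.lang.ref.WeakReference;

// Hypothetical cache keyed by Class, holding its value weakly in a mutable cell.
final class WeakCellCacheSketch {

    // Mutable cell stored in the ClassValue. The field is volatile so that an
    // unsynchronized reader sees the most recently published reference.
    static final class WeakReferenceHolder<T> {
        private volatile WeakReference<T> ref;

        T get() {
            WeakReference<T> r = ref;   // single volatile read
            return r == null ? null : r.get();
        }

        void set(T value) {
            ref = new WeakReference<>(value);
        }
    }

    // computeValue only creates an empty cell; it is never removed afterwards.
    private static final ClassValue<WeakReferenceHolder<String>> CELLS =
        new ClassValue<WeakReferenceHolder<String>>() {
            @Override
            protected WeakReferenceHolder<String> computeValue(Class<?> type) {
                return new WeakReferenceHolder<>();
            }
        };

    // Returns the cached description, recreating it if the weak reference was
    // cleared. Two racing threads may each create a value (duplicates are
    // possible), which matches the caveat raised in the review.
    static String describe(Class<?> c) {
        WeakReferenceHolder<String> cell = CELLS.get(c);
        String value = cell.get();
        if (value == null) {
            value = "type:" + c.getName();
            cell.set(value);            // reset the cell instead of ClassValue.remove
        }
        return value;
    }
}
```

The volatile field covers visibility of the published reference only. Ruling out duplicates entirely would need additional synchronization around the check-then-set in describe(), as noted above.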
> > dl > >> tom >> >>> >>> dl >>> >>>> tom >>>> >>>> dean.long at oracle.com wrote on 4/26/19 12:09 PM: >>>>> https://bugs.openjdk.java.net/browse/JDK-8218700 >>>>> http://cr.openjdk.java.net/~dlong/8218700/webrev.2/ >>>>> >>>>> If we throw an OutOfMemoryError in the right place (see JDK-8222941), HotSpotJVMCIMetaAccessContext.fromClass can >>>>> go into an infinite loop calling ClassValue.remove. To work around the problem, reset the value in a mutable cell >>>>> instead of calling remove. >>>>> >>>>> dl >>> > From vivek.r.deshpande at intel.com Mon May 6 22:00:57 2019 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Mon, 6 May 2019 22:00:57 +0000 Subject: RFR (M) 8222074: Enhance auto vectorization for x86 In-Reply-To: References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB845@FMSMSX126.amr.corp.intel.com> <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com> <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com> <0cd3fd93-0f1e-a6d0-d4c3-f8d95b533ff7@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB56B1@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB7472@FMSMSX126.amr.corp.intel.com> <52876f29-4da2-2885-fe18-5e362b57eb2b@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB74D2@FMSMSX126.amr.corp.intel.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A9F48413D@ORSMSX106.amr.corp.intel.com> Hi Vladimir Can I sponsor and push the patch?
Regards, Vivek -----Original Message----- From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Ivanov Sent: Monday, May 6, 2019 12:15 PM To: Viswanathan, Sandhya Cc: Vladimir Kozlov ; hotspot-compiler-dev at openjdk.java.net Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 > http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.03/ Looks good. Testing results are good as well. Best regards, Vladimir Ivanov > Yes, the footprint numbers continue to hold with this patch: > The x86.ad file is 126 lines smaller. > The libjvm size increase is only 0.24%. > > Best Regards, > Sandhya > > -----Original Message----- > From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] > Sent: Friday, May 03, 2019 4:22 PM > To: Viswanathan, Sandhya > Cc: hotspot-compiler-dev at openjdk.java.net; Vladimir Kozlov > > Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 > > >> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.02/ > > Much better! I like how AD files look now. I assume static footprint > numbers you provided earlier are still valid. > > +void MacroAssembler::vabsnegd(int opcode, XMMRegister dst, Register > +scr) { > + if (opcode == Op_AbsVD) { > + andpd(dst, > ExternalAddress(StubRoutines::x86::vector_double_sign_mask()), scr); > + } else { > + assert((opcode == Op_NegVD),"opcode should be Op_NegD"); > + xorpd(dst, > ExternalAddress(StubRoutines::x86::vector_double_sign_flip()), scr); > + } > +} > > It's a bit odd to see C2-specific stuff in MacroAssembler, but I'm > perfectly fine with incrementally refactor it later. > > For now, just guard relevant code with #ifdef COMPILER2. > > Otherwise, looks very good! > > Best regards, > Vladimir Ivanov > >> >> Looking forward to your feedback. 
>> >> Best Regards, >> Sandhya >> >> >> -----Original Message----- >> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] >> Sent: Wednesday, May 01, 2019 5:09 PM >> To: Viswanathan, Sandhya ; Vladimir >> Kozlov >> Cc: hotspot-compiler-dev at openjdk.java.net >> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >> >> Sounds good, thanks! >> >> Best regards, >> Vladimir Ivanov >> >> On 01/05/2019 15:16, Viswanathan, Sandhya wrote: >>> I should add here that your suggestion of adding generic shift instruction etc to the macroAssembler is also wonderful instead of function pointer. I will look into making that change as well. >>> >>> Best Regards, >>> Sandhya >>> >>> >>> -----Original Message----- >>> From: Viswanathan, Sandhya >>> Sent: Wednesday, May 01, 2019 3:10 PM >>> To: 'Vladimir Ivanov' ; Vladimir >>> Kozlov >>> Cc: hotspot-compiler-dev at openjdk.java.net >>> Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86 >>> >>> Hi Vladimir, >>> >>> I agree, I wanted to show both the approaches in this patch to get your feedback: >>> 1) with emit as a function >>> 2) with emit part in the instruct body itself >>> >>> With emit as a function it becomes hard to read and I personally prefer it in the instruct itself as is done for vabsneg2D etc. That is what you are recommending as well so I feel good. >>> >>> Once the adlc enhancement is done both the approaches should give similar binary size. Till then there will be small overhead with approach 2) as emit is duplicated per match rule. >>> >>> I will send an updated patch fixing the two issues you mentioned in your previous email plus this change of using approach 2). >>> >>> Please do let me know if you want to see any other change in this patch. 
>>> >>> Best Regards, >>> Sandhya >>> >>> >>> >>> -----Original Message----- >>> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] >>> Sent: Wednesday, May 01, 2019 2:58 PM >>> To: Viswanathan, Sandhya ; Vladimir >>> Kozlov >>> Cc: hotspot-compiler-dev at openjdk.java.net >>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >>> >>> >>>> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.01/ >>> >>> Nice job, Sandhya! Glad to hear the approach pays off! >>> >>> Unfortunately, I must note that AD file becomes much more obscure. >>> Especially with those function pointers. >>> >>> 1528 void emit_vshift16B_code(MacroAssembler& _masm, int opcode, XMMRegister dst, >>> 1529 XMMRegister src, XMMRegister shift, >>> 1530 XMMRegister tmp1, XMMRegister tmp2, >>> Register scratch) { >>> 1531 XX_Inst extendinst = get_extend_inst(opcode == Op_URShiftVB ? >>> false : true); >>> 1532 XX_Inst shiftinst = get_xx_inst(opcode); >>> 1533 >>> 1534 (_masm.*extendinst)(tmp1, src); >>> 1535 (_masm.*shiftinst)(tmp1, shift); >>> 1536 __ pshufd(tmp2, src, 0xE); >>> 1537 (_masm.*extendinst)(tmp2, tmp2); >>> 1538 (_masm.*shiftinst)(tmp2, shift); >>> 1539 __ movdqu(dst, ExternalAddress(vector_short_to_byte_mask()), >>> scratch); >>> 1540 __ pand(tmp2, dst); >>> 1541 __ pand(dst, tmp1); >>> 1542 __ packuswb(dst, tmp2); >>> 1543 } >>> >>> Have you tried to encapsulate that into x86-specific MacroAssembler? 
>>> >>> 8682 instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{ >>> 8683 predicate(UseSSE > 3 && UseAVX <= 1 && n->as_Vector()->length() >>> == 16); >>> 8684 match(Set dst (LShiftVB src shift)); >>> 8685 match(Set dst (RShiftVB src shift)); >>> 8686 match(Set dst (URShiftVB src shift)); >>> 8687 effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch); >>> 8688 format %{"pmovxbw $tmp1,$src\n\t" >>> 8689 "shiftop $tmp1,$shift\n\t" >>> 8690 "pshufd $tmp2,$src\n\t" >>> 8691 "pmovxbw $tmp2,$tmp2\n\t" >>> 8692 "shiftop $tmp2,$shift\n\t" >>> 8693 "movdqu $dst,[0x00ff00ff0x00ff00ff]\n\t" >>> 8694 "pand $tmp2,$dst\n\t" >>> 8695 "pand $dst,$tmp1\n\t" >>> 8696 "packuswb $dst,$tmp2\n\t! packed16B shift" %} >>> 8697 ins_encode %{ >>> 8698 emit_vshift16B_code(_masm, this->as_Mach()->ideal_Opcode() , >>> $dst$$XMMRegister, $src$$XMMRegister, $shift$$XMMRegister, $tmp1$$XMMRegister, $tmp2$$XMMRegister, $scratch$$Register); >>> 8699 %} >>> 8700 ins_pipe( pipe_slow ); >>> 8701 %} >>> >>> can be turned into something like: >>> >>> instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{ >>> predicate(n->as_Vector()->length() == 16); >>> match(Set dst (LShiftVB src shift)); >>> match(Set dst (RShiftVB src shift)); >>> match(Set dst (URShiftVB src shift)); >>> effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch); >>> format %{"packed16B shift" %} >>> ins_encode %{ >>> int vlen = 0; // 128-bit >>> BasicType elem_type = T_BYTE; >>> int shift_mode = ...; // L/R/UR or S/U + L/R >>> __ vshift(vlen, elem_type, shift_mode, >>> $dst$$..., $src$$..., $shift$$..., >>> $tmp1$$..., $tmp2$$..., $scratch$$...); >>> %} >>> >>> Then MA::vshift can dispatch between different implementations depending on SSE/AVX level available. Do you see any problems with that from footprint perspective? >>> >>> Ideally, I'd prefer to see a library of operations on vectors encapsulated in MacroAssembler (or a subclass) and used in x86.ad. 
That will accommodate further reductions in AD instructions needed. >>> >>> Best regards, >>> Vladimir Ivanov >>> >>>> With this webrev the ad file has only about 60 lines effectively added. >>>> Also the generated product libjvm.so size only increases by about 0.26% vs the prior 1.50%. >>>> I have used multiple match rules in one instruct for same size shift related rules and also for the new Abs/Neg rules. >>>> What I noticed is that the adlc still duplicates lot of code and there is potential to further improve code size for multiple match rule case by improving the adlc itself. >>>> The adlc improvement (like removing duplicate emits, formats, expand, pipeline etc) can be done as a separate RFE. >>>> >>>> In this webrev, I have also fixed the errors reported by Vladimir Ivanov and corrected the issues reported by jcheck tool. >>>> Also taken into account reducing the temporary by using TEMP dst for multiply rules. >>>> >>>> The compiler jtreg tests and the java math tests pass on Haswell, SKX, and KNL. >>>> >>>> Your review and feedback is welcome. >>>> >>>> Best Regards, >>>> Sandhya >>>> >>>> >>>> -----Original Message----- >>>> From: hotspot-compiler-dev >>>> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of >>>> Viswanathan, Sandhya >>>> Sent: Wednesday, April 10, 2019 10:22 AM >>>> To: Vladimir Kozlov ; B. Blaser >>>> >>>> Cc: hotspot-compiler-dev at openjdk.java.net >>>> Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86 >>>> >>>> Yes good catch, in mul32B_reg_avx(), the last two instructions are the only place where dst is used: >>>> >>>> __ vpackuswb($dst$$XMMRegister, $tmp2$$XMMRegister, $tmp1$$XMMRegister, vector_len); >>>> __ vpermq($dst$$XMMRegister, $dst$$XMMRegister, 0xD8, >>>> vector_len); >>>> >>>> Here dst can be same as tmp2 or tmp1 in packuswb() and so the effect TEMP dst is not required. 
>>>> >>>> Best Regards, >>>> Sandhya >>>> >>>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>> Sent: Wednesday, April 10, 2019 9:59 AM >>>> To: Viswanathan, Sandhya ; B. Blaser >>>> >>>> Cc: hotspot-compiler-dev at openjdk.java.net >>>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >>>> >>>> On 4/10/19 8:36 AM, Viswanathan, Sandhya wrote: >>>>> Hi Bernard, >>>>> >>>>> One could add TEMP dst in effect() to let the register allocator know that dst needs to be different from src. >>>> >>>> Yes, we use this way. Or, in mul4B_reg() case, we can use $dst >>>> instead >>>> $tmp2 to avoid overwriting >>>> $src2 before we get value from it if $dst = $src2. >>>> >>>> On other hand, mul32B_reg_avx() and other have 'TEMP dst' effect but $dst is used only for final result. >>>> >>>> It is a little mess which may cause ineffective use of registers in compiled code. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>>> >>>>> Best Regards, >>>>> Sandhya >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: B. Blaser [mailto:bsrbnd at gmail.com] >>>>> Sent: Wednesday, April 10, 2019 4:10 AM >>>>> To: Viswanathan, Sandhya >>>>> Cc: Vladimir Kozlov ; >>>>> hotspot-compiler-dev at openjdk.java.net >>>>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >>>>> >>>>> Hi Sandhya and Vladimir K., >>>>> >>>>> On Wed, 10 Apr 2019 at 03:06, Viswanathan, Sandhya wrote: >>>>>> >>>>>> Hi Vladimir, >>>>>> >>>>>> Yes, I missed the question below: >>>>>>>> There are cases where we can use less `TEMP tmp` registers by using 'dst' register like in mul4B_reg(). Is it intentional to not use 'dst' there? >>>>>> >>>>>> No it is not intentional, we can use the dst register in those cases and reduced the tmps. >>>>> >>>>> I guess we have to be careful using $dst instead of $tmp registers as the allocator sometimes provides identical $src & $dst. 
Also, I'm not sure this would be possible in the case of mul4B_reg(): >>>>> >>>>> 7349 format %{"pmovsxbw $tmp,$src1\n\t" >>>>> 7350 "pmovsxbw $tmp2,$src2\n\t" >>>>> >>>>> I believe this couldn't work if you use $dst instead of $tmp and $dst = $src2, what do you think? >>>>> >>>>> Thanks, >>>>> Bernard >>>>> From vladimir.kozlov at oracle.com Mon May 6 22:05:05 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 6 May 2019 15:05:05 -0700 Subject: RFR (M) 8222074: Enhance auto vectorization for x86 In-Reply-To: References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com> <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com> <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com> <0cd3fd93-0f1e-a6d0-d4c3-f8d95b533ff7@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB56B1@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB7472@FMSMSX126.amr.corp.intel.com> <52876f29-4da2-2885-fe18-5e362b57eb2b@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB74D2@FMSMSX126.amr.corp.intel.com> Message-ID: This looks good to me too. Thank you for cleaning this up and keeping size increase very small. Thanks, Vladimir K On 5/6/19 12:15 PM, Vladimir Ivanov wrote: > >> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.03/ > > Looks good. > > Testing results are good as well. > > Best regards, > Vladimir Ivanov > >> Yes, the footprint numbers continue to hold with this patch: >> The x86.ad file is 126 lines smaller. >> The libjvm size increase is only 0.24%.
>> >> Best Regards, >> Sandhya >> >> -----Original Message----- >> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] >> Sent: Friday, May 03, 2019 4:22 PM >> To: Viswanathan, Sandhya >> Cc: hotspot-compiler-dev at openjdk.java.net; Vladimir Kozlov >> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >> >> >>> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.02/ >> >> Much better! I like how AD files look now. I assume static footprint >> numbers you provided earlier are still valid. >> >> +void MacroAssembler::vabsnegd(int opcode, XMMRegister dst, Register scr) { >> + if (opcode == Op_AbsVD) { >> + andpd(dst, >> ExternalAddress(StubRoutines::x86::vector_double_sign_mask()), scr); >> + } else { >> + assert((opcode == Op_NegVD),"opcode should be Op_NegD"); >> + xorpd(dst, >> ExternalAddress(StubRoutines::x86::vector_double_sign_flip()), scr); >> + } >> +} >> >> It's a bit odd to see C2-specific stuff in MacroAssembler, but I'm >> perfectly fine with incrementally refactor it later. >> >> For now, just guard relevant code with #ifdef COMPILER2. >> >> Otherwise, looks very good! >> >> Best regards, >> Vladimir Ivanov >> >>> >>> Looking forward to your feedback. >>> >>> Best Regards, >>> Sandhya >>> >>> >>> -----Original Message----- >>> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] >>> Sent: Wednesday, May 01, 2019 5:09 PM >>> To: Viswanathan, Sandhya ; Vladimir Kozlov >>> Cc: hotspot-compiler-dev at openjdk.java.net >>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >>> >>> Sounds good, thanks! >>> >>> Best regards, >>> Vladimir Ivanov >>> >>> On 01/05/2019 15:16, Viswanathan, Sandhya wrote: >>>> I should add here that your suggestion of adding generic shift instruction etc to the macroAssembler is also >>>> wonderful instead of function pointer. I will look into making that change as well.
>>>>
>>>> Best Regards,
>>>> Sandhya
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Viswanathan, Sandhya
>>>> Sent: Wednesday, May 01, 2019 3:10 PM
>>>> To: 'Vladimir Ivanov' ; Vladimir Kozlov
>>>> Cc: hotspot-compiler-dev at openjdk.java.net
>>>> Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86
>>>>
>>>> Hi Vladimir,
>>>>
>>>> I agree, I wanted to show both the approaches in this patch to get your feedback:
>>>> 1) with emit as a function
>>>> 2) with emit part in the instruct body itself
>>>>
>>>> With emit as a function it becomes hard to read and I personally prefer it in the instruct itself as is done for
>>>> vabsneg2D etc. That is what you are recommending as well so I feel good.
>>>>
>>>> Once the adlc enhancement is done both the approaches should give similar binary size. Till then there will be small
>>>> overhead with approach 2) as emit is duplicated per match rule.
>>>>
>>>> I will send an updated patch fixing the two issues you mentioned in your previous email plus this change of using
>>>> approach 2).
>>>>
>>>> Please do let me know if you want to see any other change in this patch.
>>>>
>>>> Best Regards,
>>>> Sandhya
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
>>>> Sent: Wednesday, May 01, 2019 2:58 PM
>>>> To: Viswanathan, Sandhya ; Vladimir Kozlov
>>>> Cc: hotspot-compiler-dev at openjdk.java.net
>>>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86
>>>>
>>>>
>>>>> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.01/
>>>>
>>>> Nice job, Sandhya! Glad to hear the approach pays off!
>>>>
>>>> Unfortunately, I must note that AD file becomes much more obscure.
>>>> Especially with those function pointers.
>>>>
>>>> 1528 void emit_vshift16B_code(MacroAssembler& _masm, int opcode, XMMRegister dst,
>>>> 1529                          XMMRegister src, XMMRegister shift,
>>>> 1530                          XMMRegister tmp1, XMMRegister tmp2, Register scratch) {
>>>> 1531   XX_Inst extendinst = get_extend_inst(opcode == Op_URShiftVB ? false : true);
>>>> 1532   XX_Inst shiftinst = get_xx_inst(opcode);
>>>> 1533
>>>> 1534   (_masm.*extendinst)(tmp1, src);
>>>> 1535   (_masm.*shiftinst)(tmp1, shift);
>>>> 1536   __ pshufd(tmp2, src, 0xE);
>>>> 1537   (_masm.*extendinst)(tmp2, tmp2);
>>>> 1538   (_masm.*shiftinst)(tmp2, shift);
>>>> 1539   __ movdqu(dst, ExternalAddress(vector_short_to_byte_mask()), scratch);
>>>> 1540   __ pand(tmp2, dst);
>>>> 1541   __ pand(dst, tmp1);
>>>> 1542   __ packuswb(dst, tmp2);
>>>> 1543 }
>>>>
>>>> Have you tried to encapsulate that into x86-specific MacroAssembler?
>>>>
>>>> 8682 instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{
>>>> 8683   predicate(UseSSE > 3 && UseAVX <= 1 && n->as_Vector()->length() == 16);
>>>> 8684   match(Set dst (LShiftVB src shift));
>>>> 8685   match(Set dst (RShiftVB src shift));
>>>> 8686   match(Set dst (URShiftVB src shift));
>>>> 8687   effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch);
>>>> 8688   format %{"pmovxbw   $tmp1,$src\n\t"
>>>> 8689            "shiftop   $tmp1,$shift\n\t"
>>>> 8690            "pshufd    $tmp2,$src\n\t"
>>>> 8691            "pmovxbw   $tmp2,$tmp2\n\t"
>>>> 8692            "shiftop   $tmp2,$shift\n\t"
>>>> 8693            "movdqu    $dst,[0x00ff00ff0x00ff00ff]\n\t"
>>>> 8694            "pand      $tmp2,$dst\n\t"
>>>> 8695            "pand      $dst,$tmp1\n\t"
>>>> 8696            "packuswb  $dst,$tmp2\n\t! packed16B shift" %}
>>>> 8697   ins_encode %{
>>>> 8698     emit_vshift16B_code(_masm, this->as_Mach()->ideal_Opcode() ,
>>>>          $dst$$XMMRegister, $src$$XMMRegister, $shift$$XMMRegister, $tmp1$$XMMRegister, $tmp2$$XMMRegister, $scratch$$Register);
>>>> 8699   %}
>>>> 8700   ins_pipe( pipe_slow );
>>>> 8701 %}
>>>>
>>>> can be turned into something like:
>>>>
>>>> instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX tmp2, rRegI scratch) %{
>>>>   predicate(n->as_Vector()->length() == 16);
>>>>   match(Set dst (LShiftVB src shift));
>>>>   match(Set dst (RShiftVB src shift));
>>>>   match(Set dst (URShiftVB src shift));
>>>>   effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch);
>>>>   format %{"packed16B shift" %}
>>>>   ins_encode %{
>>>>     int vlen = 0; // 128-bit
>>>>     BasicType elem_type = T_BYTE;
>>>>     int shift_mode = ...; // L/R/UR or S/U + L/R
>>>>     __ vshift(vlen, elem_type, shift_mode,
>>>>               $dst$$..., $src$$..., $shift$$...,
>>>>               $tmp1$$..., $tmp2$$..., $scratch$$...);
>>>>   %}
>>>>
>>>> Then MA::vshift can dispatch between different implementations depending on SSE/AVX level available. Do you see any
>>>> problems with that from footprint perspective?
>>>>
>>>> Ideally, I'd prefer to see a library of operations on vectors encapsulated in MacroAssembler (or a subclass) and
>>>> used in x86.ad. That will accommodate further reductions in AD instructions needed.
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>>> With this webrev the ad file has only about 60 lines effectively added.
>>>>> Also the generated product libjvm.so size only increases by about 0.26% vs the prior 1.50%.
>>>>> I have used multiple match rules in one instruct for same size shift related rules and also for the new Abs/Neg rules.
>>>>> What I noticed is that the adlc still duplicates lot of code and there is potential to further improve code size
>>>>> for multiple match rule case by improving the adlc itself.
>>>>> The adlc improvement (like removing duplicate emits, formats, expand, pipeline etc) can be done as a separate RFE.
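Editorial sketch: the review suggestion quoted above (a single vshift entry point in the assembler that selects the instruction from the shift kind, instead of member-function pointers passed around in the AD file) can be illustrated in isolation. Everything named here is hypothetical: `ToyAssembler`, `ShiftMode`, and the one-lane 16-bit model stand in for MacroAssembler, the ideal opcodes, and XMM registers. The sketch models lane values only, not instruction encoding, and follows the widen, shift, mask, narrow shape of the quoted 16B sequence.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the LShiftVB / RShiftVB / URShiftVB opcodes.
enum class ShiftMode { Left, ArithRight, LogicalRight };

// Hypothetical toy assembler: computes one lane's value instead of emitting code.
struct ToyAssembler {
  // Widen a byte lane to 16 bits. The unsigned right shift zero-extends;
  // the other two modes sign-extend, matching the extend choice in the
  // quoted emit_vshift16B_code. The narrowing casts assume the usual
  // two's-complement behaviour (guaranteed from C++20 on).
  static uint16_t extend(uint8_t b, ShiftMode mode) {
    if (mode == ShiftMode::LogicalRight) return b;
    return static_cast<uint16_t>(static_cast<int16_t>(static_cast<int8_t>(b)));
  }

  // Single entry point: the mode is dispatched here, so callers never
  // handle member-function pointers. count is expected to be below 16.
  static uint8_t vshift_lane(ShiftMode mode, uint8_t src, unsigned count) {
    uint16_t wide = extend(src, mode);
    switch (mode) {
      case ShiftMode::Left:
        wide = static_cast<uint16_t>(wide << count);
        break;
      case ShiftMode::ArithRight:
        wide = static_cast<uint16_t>(static_cast<int16_t>(wide) >> count);
        break;
      case ShiftMode::LogicalRight:
        wide = static_cast<uint16_t>(wide >> count);
        break;
    }
    // Keep only the low byte, as the 0x00ff mask does before the pack step.
    return static_cast<uint8_t>(wide & 0x00FF);
  }
};
```

The design point is only that the caller passes a mode value and the choice of extend and shift instruction lives in one place; a real implementation would additionally branch on the SSE/AVX level, as the message suggests.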
>>>>> In this webrev, I have also fixed the errors reported by Vladimir Ivanov and corrected the issues reported by
>>>>> jcheck tool.
>>>>> Also taken into account reducing the temporary by using TEMP dst for multiply rules.
>>>>>
>>>>> The compiler jtreg tests and the java math tests pass on Haswell, SKX, and KNL.
>>>>>
>>>>> Your review and feedback is welcome.
>>>>>
>>>>> Best Regards,
>>>>> Sandhya
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: hotspot-compiler-dev
>>>>> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of
>>>>> Viswanathan, Sandhya
>>>>> Sent: Wednesday, April 10, 2019 10:22 AM
>>>>> To: Vladimir Kozlov ; B. Blaser
>>>>> Cc: hotspot-compiler-dev at openjdk.java.net
>>>>> Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86
>>>>>
>>>>> Yes good catch, in mul32B_reg_avx(), the last two instructions are the only place where dst is used:
>>>>>
>>>>>         __ vpackuswb($dst$$XMMRegister, $tmp2$$XMMRegister, $tmp1$$XMMRegister, vector_len);
>>>>>         __ vpermq($dst$$XMMRegister, $dst$$XMMRegister, 0xD8, vector_len);
>>>>>
>>>>> Here dst can be same as tmp2 or tmp1 in packuswb() and so the effect TEMP dst is not required.
>>>>>
>>>>> Best Regards,
>>>>> Sandhya
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>>> Sent: Wednesday, April 10, 2019 9:59 AM
>>>>> To: Viswanathan, Sandhya ; B. Blaser
>>>>> Cc: hotspot-compiler-dev at openjdk.java.net
>>>>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86
>>>>>
>>>>> On 4/10/19 8:36 AM, Viswanathan, Sandhya wrote:
>>>>>> Hi Bernard,
>>>>>>
>>>>>> One could add TEMP dst in effect() to let the register allocator know that dst needs to be different from src.
>>>>>
>>>>> Yes, we use this way. Or, in mul4B_reg() case, we can use $dst instead
>>>>> $tmp2 to avoid overwriting
>>>>> $src2 before we get value from it if $dst = $src2.
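Editorial sketch: the hazard discussed in this exchange is that the register allocator may assign $dst and $src2 the same register unless the rule declares TEMP dst. The C++ model below uses a small array as a hypothetical register file to show why reusing dst for the first temporary breaks under that aliasing, while reusing it for the second temporary does not. The function names, the integer "registers", and the three-step sequence are illustrative assumptions and are much simpler than the real mul4B_reg() rule.

```cpp
#include <array>
#include <cassert>

// Hypothetical register file: an index plays the role of an allocated register.
using Regs = std::array<int, 4>;

// Variant A: dst doubles as the FIRST temporary. If the allocator chose
// dst == src2, src2 is overwritten before it is ever read.
int mul_dst_as_tmp(Regs r, int dst, int src1, int src2, int tmp2) {
  r[dst] = r[src1];            // models "pmovsxbw dst, src1"
  r[tmp2] = r[src2];           // models "pmovsxbw tmp2, src2" (stale if dst == src2)
  r[dst] = r[dst] * r[tmp2];   // models "pmullw dst, tmp2"
  return r[dst];
}

// Variant B: dst doubles as the SECOND temporary. The first write to dst
// reads src2 itself, so dst == src2 is harmless, and src1 has already
// been copied out in case dst == src1.
int mul_dst_as_tmp2(Regs r, int dst, int src1, int src2, int tmp) {
  r[tmp] = r[src1];            // models "pmovsxbw tmp, src1"
  r[dst] = r[src2];            // models "pmovsxbw dst, src2"
  r[dst] = r[dst] * r[tmp];    // models "pmullw dst, tmp"
  return r[dst];
}
```

With registers 0 and 1 holding 3 and 5, variant A returns 15 when dst is a fresh register but 9 when dst aliases src2; variant B returns 15 in both cases, which is the point made about using $dst in place of $tmp2.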
>>>>>
>>>>> On other hand, mul32B_reg_avx() and other have 'TEMP dst' effect but $dst is used only for final result.
>>>>>
>>>>> It is a little mess which may cause ineffective use of registers in compiled code.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>>>
>>>>>> Best Regards,
>>>>>> Sandhya
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: B. Blaser [mailto:bsrbnd at gmail.com]
>>>>>> Sent: Wednesday, April 10, 2019 4:10 AM
>>>>>> To: Viswanathan, Sandhya
>>>>>> Cc: Vladimir Kozlov ;
>>>>>> hotspot-compiler-dev at openjdk.java.net
>>>>>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86
>>>>>>
>>>>>> Hi Sandhya and Vladimir K.,
>>>>>>
>>>>>> On Wed, 10 Apr 2019 at 03:06, Viswanathan, Sandhya wrote:
>>>>>>>
>>>>>>> Hi Vladimir,
>>>>>>>
>>>>>>> Yes, I missed the question below:
>>>>>>>>> There are cases where we can use less `TEMP tmp` registers by using 'dst' register like in mul4B_reg(). Is it
>>>>>>>>> intentional to not use 'dst' there?
>>>>>>>
>>>>>>> No it is not intentional, we can use the dst register in those cases and reduced the tmps.
>>>>>>
>>>>>> I guess we have to be careful using $dst instead of $tmp registers as the allocator sometimes provides identical
>>>>>> $src & $dst. Also, I'm not sure this would be possible in the case of mul4B_reg():
>>>>>>
>>>>>> 7349   format %{"pmovsxbw  $tmp,$src1\n\t"
>>>>>> 7350            "pmovsxbw  $tmp2,$src2\n\t"
>>>>>>
>>>>>> I believe this couldn't work if you use $dst instead of $tmp and $dst = $src2, what do you think?
>>>>>> >>>>>> Thanks, >>>>>> Bernard >>>>>> From sandhya.viswanathan at intel.com Mon May 6 22:35:19 2019 From: sandhya.viswanathan at intel.com (Viswanathan, Sandhya) Date: Mon, 6 May 2019 22:35:19 +0000 Subject: RFR (M) 8222074: Enhance auto vectorization for x86 In-Reply-To: References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com> <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com> <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com> <0cd3fd93-0f1e-a6d0-d4c3-f8d95b533ff7@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB56B1@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB7472@FMSMSX126.amr.corp.intel.com> <52876f29-4da2-2885-fe18-5e362b57eb2b@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB74D2@FMSMSX126.amr.corp.intel.com> Message-ID: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB7D9A@FMSMSX126.amr.corp.intel.com> Thanks a lot VladimirK and VladimirI. Vivek has offered to push the patch. Best Regards, Sandhya -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Monday, May 06, 2019 3:05 PM To: Vladimir Ivanov ; Viswanathan, Sandhya Cc: hotspot-compiler-dev at openjdk.java.net Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 This looks good to me too. Thank you for cleaning this up and keeping size increase very small. Thanks, Vladimir K On 5/6/19 12:15 PM, Vladimir Ivanov wrote: > >> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.03/ > > Looks good. > > Testing results are good as well. > > Best regards, > Vladimir Ivanov > >> Yes, the footprint numbers continue to hold with this patch: >> ??? The x86.ad file is 126 lines smaller. >> ??? The libjvm size increase is only 0.24%. 
>>>>>>
>>>>>> Thanks,
>>>>>> Bernard
>>>>>>

From xxinliu at amazon.com Mon May 6 22:49:25 2019
From: xxinliu at amazon.com (Liu, Xin)
Date: Mon, 6 May 2019 22:49:25 +0000
Subject: 8222670 patch review: prevent downgraded tasks from recompiling
In-Reply-To: <88c6a4e1-b98b-b0a9-2f76-3f2595be7374@oracle.com>
References: <99aae03d0315482c723abda2f2cb530b4b52f82d.camel@redhat.com> <427BC0A9-DAB2-43A3-AF93-F96414EC1E7E@amazon.com> <0fca1798-5851-3e5d-e603-54282dc3be81@oracle.com> <88c6a4e1-b98b-b0a9-2f76-3f2595be7374@oracle.com>
Message-ID:

Hi, Tobias,

Here is the new revision of the webrev. It includes the tiered event you mentioned.
https://cr.openjdk.java.net/~xliu/8222670/webrev.05/

Paul helped me run the patch through the submit repo. The run is clean.

Thanks,
--lx

From: "do-not-reply at oracle.com"
Reply-To: "mach5_admin_ww_grp at oracle.com"
Date: Monday, May 6, 2019 at 11:25 AM
To: "Hohensee, Paul"
Subject: [Mach5] mach5-one-phh-JDK-8222670-1-20190506-1730-2299657: PASSED

Job: mach5-one-phh-JDK-8222670-1-20190506-1730-2299657
BuildId: 2019-05-06-1725217.hohensee.source

No failed tests

Tasks Summary
NA: 0
UNABLE_TO_RUN: 0
PASSED: 76
HARNESS_ERROR: 0
EXECUTED_WITH_FAILURE: 0
KILLED: 0
FAILED: 0
NOTHING_TO_RUN: 0

On 5/3/19, 5:30 AM, "Tobias Hartmann" wrote:

On 03.05.19 02:21, Liu, Xin wrote:
> Thanks for the review. I fixed copyrights and the typo of clearMethodState0.
> Here is the new revision.
> https://cr.openjdk.java.net/~xliu/8222670/webrev.04/

Looks good to me but I think you should also add:

if (PrintTieredEvents) {
  print_event(REMOVE_FROM_QUEUE, method, method, task->osr_bci(), (CompLevel) task->comp_level());
}

> But why is that? If a downgraded compilation succeeded at level 2, shouldn't a re-compilation at the
> same level be detected by CompileBroker::compilation_is_complete() in CompileBroker::compile_method()?
>
> That's the very root cause of level2 recompilation.
> In CompileBroker::compile_method(), its input argument is comp_level = 3.
> CompileBroker::compilation_is_complete returns false because codecache only has level=2 nmethod.
> I don't know why, but hotspot is also very stubborn. It will request level = 3 again and again. All of them are downgraded to level=2 when they dequeue.
>
> Level2RecompilationTest simulates this process. I didn't make it up. I observe the symptom in some real services as follows.
> https://bugs.openjdk.java.net/secure/attachment/82079/lvl2_recomp_spring.log.zip

Okay, got it.

Thanks,
Tobias

From vladimir.x.ivanov at oracle.com Tue May 7 01:48:50 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Mon, 6 May 2019 18:48:50 -0700
Subject: [13] RFR (T): [Graal] assert(!m->can_be_statically_bound(InstanceKlass::cast(ctxk))) failed: redundant
Message-ID:

http://cr.openjdk.java.net/~vlivanov/8223422/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8223422

Graal hits newly added assert (as part of 8223171 [1]) which ensures there are no redundant dependencies added.

I propose to disable the assert for now when Graal is enabled, then fix Graal to avoid registering redundant
dependencies (in the same vein as 8223171 did for C1/C2), and then re-enable the assert in its original form.

Thanks!

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8223171

From david.holmes at oracle.com Tue May 7 02:49:05 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 7 May 2019 12:49:05 +1000
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To: <64a8afca-9dc8-b119-0a12-dd05799bdd22@oracle.com>
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <64a8afca-9dc8-b119-0a12-dd05799bdd22@oracle.com>
Message-ID: <903e7954-0c16-0fe4-1a1f-6e9c5c403a28@oracle.com>

Hi Robbin,

I took a look at this and it is a lot more complex than I had expected.
There are some interconnections that I'm not understanding here. From
existing code why does deopt need to revoke biases?
I don't see how deopt changes anything in respect to monitor ownership.
Then in the new code why do you have to inflate monitors to do deopt?
If this is truly new behaviour and I didn't just miss where this happens
in existing code, then how does this impact the number of ObjectMonitors
in existence and the monitor deflation process?

More comments below ...

On 3/05/2019 8:31 pm, Robbin Ehn wrote:
> Hi, please see this update:
>
> Inc:
> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/index.html
> Full:
> http://cr.openjdk.java.net/~rehn/8221734/v2/webrev/

src/hotspot/share/code/codeCache.cpp

This comment block no longer applies given there is no safepoint op now:

1190   // CodeCache can only be updated by a thread_in_VM and they will all be
1191   // stopped during the safepoint so CodeCache will be safe to update without
1192   // holding the CodeCache_lock.

The same comment block here:

1208   // CodeCache can only be updated by a thread_in_VM and they will all be
1209   // stopped dring the safepoint so CodeCache will be safe to update without
1210   // holding the CodeCache_lock.

is already incorrect if not actually at a safepoint. It makes the
relationship between the Compile_lock, CodeCache_lock and being at a
safepoint rather confusing.

---

src/hotspot/share/code/nmethod.cpp

The comment here:

1119   // If _method is already NULL the Method* is about to be unloaded,
1120   // so we don't have to break the cycle. Note that it is possible to
1121   // have the Method* live here, in case we unload the nmethod because
1122   // it is pointing to some oop (other than the Method*) being unloaded.

no longer fits the code given we no longer skip the _method==NULL case.
Further, the trailing comment here:

1123   Method::unlink_code(_method, this); // Break a cycle

seems unnecessary given we preceded this with 4 lines of commentary already!
Again this comment block: 1205 void nmethod::unlink_from_method() { 1206 // We need to check if both the _code and _from_compiled_code_entry_point 1207 // refer to this nmethod because there is a race in setting these two fields 1208 // in Method* as seen in bugid 4947125. 1209 // If the vep() points to the zombie nmethod, the memory for the nmethod 1210 // could be flushed and the compiler and vtable stubs could still call 1211 // through it. 1212 Method::unlink_code(method(), this); seems meaningless with the code change you've applied. --- src/hotspot/share/code/nmethod.hpp ! void unlink_from_method(); Now this doesn't take the acquire_lock parameter it would be useful to document what the locking expectations are: must this be called with a given lock held, or will it always acquire a given lock if needed? --- src/hotspot/share/oops/method.cpp + void Method::unlink_code(Method *method, CompiledMethod *compare) { + void Method::unlink_code(Method *method) { I'm not sure making these methods static just so the NULL check can be internalized is the best way to deal with this. Now you can't tell when NULL is expected and when it is an error. IMHO it is better to keep these as instance methods with a NULL check at the callsite if needed (or an assert if NULL is not expected). --- src/hotspot/share/runtime/biasedLocking.cpp src/hotspot/share/runtime/biasedLocking.hpp This seems like it would be better done after we have switched biased-locking revocation to use handshakes instead of safepoints. Otherwise we seem to be doing a partial conversion as a side-effect of this bug and it's far from obvious that it is complete/correct. --- src/hotspot/share/runtime/synchronizer.cpp These changes have me worried: assert(Universe::verify_in_progress() || ! !SafepointSynchronize::is_at_safepoint(), "invariant"); becomes: assert(Universe::verify_in_progress() || ! 
!Universe::heap()->is_gc_active(), "invariant"); I don't see an immediate equivalence between not being at a safepoint and not having a GC active. If I'm not at a safepoint now and the code following doesn't do a safepoint check then we remain outside of a safepoint. What is the same reasoning for a GC being active? ! ResourceMark rm(Self); ! ResourceMark rm; Are you suggesting the current thread is not Self? If that is the case then there should be numerous asserts earlier on to ensure we can't follow any code paths that expect that Self is the current thread! But I'm concerned that we've introduced a new way for a third-party thread to introduce monitor inflation almost independent of the threads using the monitor. Thanks, David ----- > > # Note > http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html > line 630 > This is revert to the original, I accidental had left in a temporary > test change, as you can see here in full diff: > http://cr.openjdk.java.net/~rehn/8221734/v2/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html > > > I think I manage to address all review comments. > > Dean can you please cast an extra eye on: > http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/oops/method.cpp.sdiff.html > > This OR should be correct. > > Dan please do the same on the biased locking changes. > > I left out the merge with MutexLocker changes, since it was not > interesting. > There were some conflicts with JVMCI changes, so incremental contains > some parts of that merge. > > Passes t1-5 and local testing. > I'll continue with some additional testing. > > Thanks, Robbin > > On 4/25/19 2:05 PM, Robbin Ehn wrote: >> Hi all, please review. >> >> Let's deopt with handshakes. >> Removed VM op Deoptimize, instead we handshake. >> Locks needs to be inflate since we are not in a safepoint. 
>> >> Goes on top of: >> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html >> >> >> Code: >> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8221734 >> >> Passes t1-7 and multiple t1-5 runs. >> >> A few startup benchmark see a small speedup. >> >> Thanks, Robbin From vladimir.kozlov at oracle.com Tue May 7 03:02:20 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 6 May 2019 20:02:20 -0700 Subject: [13] RFR (T): [Graal] assert(!m->can_be_statically_bound(InstanceKlass::cast(ctxk))) failed: redundant In-Reply-To: References: Message-ID: <5f5a9aa4-2227-ba9a-6c8b-8d62887b7673@oracle.com> Good. Thanks, Vladimir K On 5/6/19 6:48 PM, Vladimir Ivanov wrote: > http://cr.openjdk.java.net/~vlivanov/8223422/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8223422 > > Graal hits newly added assert (as part of 8223171 [1]) which ensures there are no redundant dependencies added. > > I propose to disable the assert for now when Graal is enabled, then fix Graal to avoid registering redundant > dependencies (in the same vein as 8223171 did for C1/C2), and then re-enable the assert in its original form. > > Thanks! > > Best regards, > Vladimir Ivanov > > [1] https://bugs.openjdk.java.net/browse/JDK-8223171 From vladimir.kozlov at oracle.com Tue May 7 03:03:14 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 6 May 2019 20:03:14 -0700 Subject: [13] RFR (T): [Graal] assert(!m->can_be_statically_bound(InstanceKlass::cast(ctxk))) failed: redundant In-Reply-To: <5f5a9aa4-2227-ba9a-6c8b-8d62887b7673@oracle.com> References: <5f5a9aa4-2227-ba9a-6c8b-8d62887b7673@oracle.com> Message-ID: <189f08ba-d16f-6d23-d9db-6c5b940e62b9@oracle.com> And trivial. On 5/6/19 8:02 PM, Vladimir Kozlov wrote: > Good. 
>
> Thanks,
> Vladimir K
>
> On 5/6/19 6:48 PM, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~vlivanov/8223422/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8223422
>>
>> Graal hits newly added assert (as part of 8223171 [1]) which ensures there are no redundant dependencies added.
>>
>> I propose to disable the assert for now when Graal is enabled, then fix Graal to avoid registering redundant
>> dependencies (in the same vein as 8223171 did for C1/C2), and then re-enable the assert in its original form.
>>
>> Thanks!
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8223171

From dean.long at oracle.com Tue May 7 03:28:37 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Mon, 6 May 2019 20:28:37 -0700
Subject: RFR(S) 8218700: infinite loop in HotSpotJVMCIMetaAccessContext.fromClass after OutOfMemoryError
In-Reply-To: <0817ed0f-a912-4ffc-6a4a-6623802b28ec@oracle.com>
References: <53bcf718-e543-d40c-5486-58b98f66bcee@oracle.com>
 <1e91a8e6-16bc-2ae0-8aaf-830e1c6b450a@oracle.com>
 <3d15e9f0-8717-ac82-678d-2139dcfec7f8@oracle.com>
 <415627e2-165c-14a0-a069-2e01de5574d4@oracle.com>
 <0817ed0f-a912-4ffc-6a4a-6623802b28ec@oracle.com>
Message-ID:

Thanks Vladimir.

dl

On 5/6/19 2:34 PM, Vladimir Kozlov wrote:
> Looks good to me too.
>
> Thanks
> Vladimir
>
> On 5/3/19 11:55 AM, dean.long at oracle.com wrote:
>> On 5/3/19 10:45 AM, Tom Rodriguez wrote:
>>>
>>>
>>> dean.long at oracle.com wrote on 5/2/19 11:47 PM:
>>>> On 5/1/19 5:44 PM, Tom Rodriguez wrote:
>>>>> You'll need to update your webrev after Vladimir's push.  This
>>>>> code has moved into HotSpotJVMCIRuntime.java.
>>>>>
>>>>
>>>> Here's the updated version:
>>>>
>>>> http://cr.openjdk.java.net/~dlong/8218700/webrev.3/
>>>
>>> Looks good to me.
>>
>> Thanks for the review.
>>
>>>
>>>>
>>>>> Maybe WeakReferenceHolder instead of WeakTypeRef?  It needs a
>>>>> comment explaining that we're intentionally avoiding the use of
>>>>> ClassValue.remove as well. Shouldn't the ref field be volatile?
>>>>> ClassValue includes some barrier semantics and the new code needs
>>>>> similar guarantees.
>>>>>
>>>>
>>>> I went ahead and made it volatile, but I don't understand what
>>>> guarantee was missing, and what problem we want to eliminate,
>>>> unless it is to reduce the possibility of duplicates.  But the fix
>>>> for JDK-8201248 assumes that duplicates are possible, so I wasn't
>>>> worried about that.
>>>
>>> We're publishing a mutable locally created object to other threads
>>> so it seems like we need some sort of ordering barrier when we do
>>> so. Presumably the ClassValue would normally provide some ordering
>>> though it's a little unclear from the javadoc if it makes any such
>>> guarantees. Is the extra volatile unneeded?
>>>
>>
>> ClassValue uses volatile internally so that an unsynchronized read
>> sees the latest version.  Using a volatile here should help in a
>> similar way, but I believe there is still a race that allows
>> duplicates if the weak reference gets cleared by GC.  To prevent all
>> duplicates I think we would need both volatile and more synchronization.
>>
>> dl
>>
>>> tom
>>>
>>>>
>>>> dl
>>>>
>>>>> tom
>>>>>
>>>>> dean.long at oracle.com wrote on 4/26/19 12:09 PM:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8218700
>>>>>> http://cr.openjdk.java.net/~dlong/8218700/webrev.2/
>>>>>>
>>>>>> If we throw an OutOfMemoryError in the right place (see
>>>>>> JDK-8222941), HotSpotJVMCIMetaAccessContext.fromClass can go into
>>>>>> an infinite loop calling ClassValue.remove. To work around the
>>>>>> problem, reset the value in a mutable cell instead of calling
>>>>>> remove.
>>>>>>
>>>>>> dl
>>>>
>>

From fujie at loongson.cn Tue May 7 03:35:09 2019
From: fujie at loongson.cn (Jie Fu)
Date: Tue, 7 May 2019 11:35:09 +0800
Subject: RFR: 8221542: ~15% performance degradation due to less optimized inline decision
In-Reply-To: <0ee5e3ad-6eef-7976-7fa7-adb60a9eb4ad@oracle.com>
References: <6aebd883-0be7-0b05-5364-262e138a1fbc@loongson.cn>
 <182d87da-0d99-3f33-fbe7-ef5818be0422@loongson.cn>
 <0936427d-f4d2-299a-87ce-860dce5e57e1@loongson.cn>
 <574d59f5-3437-738f-e10c-796dcb02b42e@oracle.com>
 <5275854c-ab35-f160-f6f0-6ab9ac86e3d0@loongson.cn>
 <8bc507fe-b6db-d697-8821-0547860de232@oracle.com>
 <1a398a1f-ed52-2197-5886-d9d5fd872974@loongson.cn>
 <5607f7ca-57b9-b409-3bce-efc1688f0678@loongson.cn>
 <8510740c-ac56-f8d2-3c5e-451dfa6948a0@loongson.cn>
 <0ee5e3ad-6eef-7976-7fa7-adb60a9eb4ad@oracle.com>
Message-ID: <4709d584-6846-7201-c74a-1207d05a949f@loongson.cn>

Thank you so much Vladimir Ivanov.

On 2019/5/7 3:19, Vladimir Ivanov wrote:
> Sure, pushed [1]
>
> Best regards,
> Vladimir Ivanov
>
> [1] http://hg.openjdk.java.net/jdk/jdk/rev/1abca1170080
>
> On 03/05/2019 16:24, Jie Fu wrote:
>> Hi Vladimir Ivanov,
>>
>> The patch in the attachment has been updated by adding brackets to
>> the checks in InlineTree::is_not_reached.
>>
>> Is it OK to be pushed?
>> If so, could you please sponsor it?
>>
>> Thanks a lot.
>>
>> Best regards,
>> Jie
>>
>>
>> On 2019-05-04 06:06, Vladimir Ivanov wrote:
>>> CCing Jie Fu.
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> On 03/05/2019 14:55, coleen.phillimore at oracle.com wrote:
>>>>
>>>> http://cr.openjdk.java.net/~vlivanov/jiefu/8221542/webrev.02/src/hotspot/share/oops/cpCache.cpp.frames.html
>>>>
>>>>
>>>> This looks like it should have gotten the wrong answer without this
>>>> change (there appears to be protection from an index out of range)
>>>> even without your patch.  f2 is Method* for invokeinterface now.
>>>>
>>>> The runtime part of this change look good to me.
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>
>>>> On 5/2/19 5:23 AM, Tobias Hartmann wrote:
>>>>> Hi Jie,
>>>>>
>>>>> this looks good to me too but please add brackets to the checks in
>>>>> InlineTree::is_not_reached.
>>>>>
>>>>> I've submitted some extended testing and let you know once it passed.
>>>>>
>>>>> Someone from the runtime team should also have a look at this
>>>>> because your changes affect the
>>>>> interpreter. CC'ing runtime-dev.
>>>>>
>>>>> Thanks,
>>>>> Tobias
>>>>>
>>>>> On 29.04.19 15:43, Jie Fu wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> May I have another review for this change [1] to finalize the fix?
>>>>>> Thanks a lot.
>>>>>>
>>>>>> Best regards,
>>>>>> Jie
>>>>>>
>>>>>> [1] http://cr.openjdk.java.net/~vlivanov/jiefu/8221542/webrev.02/
>>>>>>
>>>>>>
>>>>>> On 2019-04-20 11:35, Jie Fu wrote:
>>>>>>> Ah, I got it.
>>>>>>> I like your patch and benefit a lot from you.
>>>>>>> Thank you so much, Vladimir.
>>>>>>>
>>>>>>> Any comments from other reviewers?
>>>>>>> Thanks.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Jie
>>>>>>>
>>>>>>> On 2019/4/20 11:18, Vladimir Ivanov wrote:
>>>>>>>>>> After some explorations I decided to keep original behavior
>>>>>>>>>> for immature profiles
>>>>>>>>>> (profile.count == -1).
>>>>>>>>> I agree.
>>>>>>>>>
>>>>>>>>> I have two questions here.
>>>>>>>>>
>>>>>>>>> 1. What's the difference of the following two if statements?
>>>>>>>>> -------------------------------------------------
>>>>>>>>> +  if (!callee_method->was_executed_more_than(0)) return true;
>>>>>>>>> // callee was never executed
>>>>>>>>> +
>>>>>>>>> +  if (caller_method->is_not_reached(caller_bci)) return true;
>>>>>>>>> // call site not resolved
>>>>>>>>> -------------------------------------------------
>>>>>>>>> I think only one of them is needed.
>>>>>>>> The checks are complimentary: one inspects callee and the other
>>>>>>>> looks at call site.
>>>>>>>>
>>>>>>>> "!callee_method->was_executed_more_than(0)" ensures that callee
>>>>>>>> was executed at least once.
>>>>>>>>
>>>>>>>> "caller_method->is_not_reached(caller_bci)" inspects the state
>>>>>>>> of the call site. If corresponding
>>>>>>>> CP entry is not resolved, then the call site isn't reached. If
>>>>>>>> is_not_reached() returns false,
>>>>>>>> it's not a definitive answer: there's still a chance the site
>>>>>>>> is not reached - consider the case
>>>>>>>> of virtual calls where callee_method may differ for the same
>>>>>>>> resolved method.
>>>>>>>>
>>>>>>>>> 2. Does the assert in InlineTree::is_not_reached(...) make sense?
>>>>>>>>> Since we have
>>>>>>>>> -------------------------------------------------
>>>>>>>>> if (profile.count() > 0)   return false; // reachable
>>>>>>>>> according to profile
>>>>>>>>> -------------------------------------------------
>>>>>>>>> and
>>>>>>>>> -------------------------------------------------
>>>>>>>>> if (profile.count() == -1) {...}
>>>>>>>>> -------------------------------------------------
>>>>>>>>> before
>>>>>>>>> -------------------------------------------------
>>>>>>>>> assert(profile.count() == 0, "sanity");
>>>>>>>>> -------------------------------------------------
>>>>>>>>> is the assert redundant?
>>>>>>>> Asserts are intended to be redundant :-) But still catch bugs
>>>>>>>> from time to time.
>>>>>>>>
>>>>>>>> This one, in particular, checks invariant on profile.count() >=
>>>>>>>> -1 (which is not very useful by
>>>>>>>> itself), but also stresses that "profile.count() == 0" case is
>>>>>>>> being processed.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Vladimir Ivanov
>>>>>>
>>>>
>>

From patricio.chilano.mateo at oracle.com Tue May 7 03:55:28 2019
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Mon, 6 May 2019 23:55:28 -0400
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To: <259f2edc-a842-8f14-39d6-74eb47a2964c@oracle.com>
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com>
 <64a8afca-9dc8-b119-0a12-dd05799bdd22@oracle.com>
 <259f2edc-a842-8f14-39d6-74eb47a2964c@oracle.com>
Message-ID: <875b76f9-9a2c-4413-23bf-6add2cf0b5e0@oracle.com>

Sorry, in fourth paragraph I meant:
s/BiasedLocking::revoke_and_rebias()/revoke_bias()

Thanks,
Patricio

On 5/6/19 12:10 PM, Patricio Chilano wrote:
> Hi Robbin,
>
> I'm going to just review the biased locking part since I'm not really
> familiar with the rest of the code.
>
> In BiasedLocking::revoke_and_rebias_in_handshake(), why do you need to
> execute fast_revoke(obj, false)? If these are objects locked by the
> JavaThread you are handshaking then it seems they should be normal
> locks (no bias pattern) or the condition (mark->biased_locker() ==
> THREAD && prototype_header->bias_epoch() == mark->bias_epoch()) you
> are testing for later should hold. Then that would save the extra
> comparisons in fast_revoke().
>
> Also instead of placing the condition (mark->biased_locker() == THREAD
> && prototype_header->bias_epoch() == mark->bias_epoch()) inside an
> if() and then later use a ShouldNotReachHere(), wouldn't it be better
> to make that an assertion, place that code outside the if() and remove
> the ShouldNotReachHere()?
>
> For the execution of revoke_bias() inside
> BiasedLocking::revoke_and_rebias_in_handshake() you could use a
> shorter version of BiasedLocking::revoke_and_rebias() that avoids the
> extra comparisons made for the general case and just starts at the
> walking the stack part, but I'm actually doing that for 8191890 so I
> can merge that with my patch.
>
> In deoptimization.cpp you have methods inflate_monitors() and
> inflate_monitors_handshake(), but in inflate_monitors() you are not
> inflating the monitors, you just revoke the ones that have bias. You
> mentioned in your first email that we need to inflate if we are not at
> a safepoint, why is that? Since revocation seems to be the common
> factor between those methods, maybe s/inflate/revoke is a better name?
>
>
> Thanks!
> Patricio
>
>
> On 5/6/19 4:42 AM, Robbin Ehn wrote:
>> Hi Dan,
>>
>>> src/hotspot/share/runtime/biasedLocking.cpp
>>>      nit - Please update copyright year for this file.
>>>
>>
>> Updated in 8220724.
>>
>>>      Nice refactoring into more readable chunks! I'm assuming that
>>>      Patricio is also reviewing these changes...
>>
>> Great, good!
>>
>>> src/hotspot/share/runtime/deoptimization.cpp
>>>      L778:  bool _in_handshake;
>>>          nit - needs one more space of indent.
>>
>> Fixed.
>>
>>>
>>>      Nice refactoring while adding in the handshake support.
>>
>> Great!
>>
>>>
>>> src/hotspot/share/runtime/deoptimization.hpp
>>>      L147:  public:
>>>      L148:
>>>      L149:   // Deoptimizes a frame lazily. nmethod gets patched
>>> deopt happens on return to the frame
>>>      L163:   static void fix_monitors(JavaThread* thread, frame fr,
>>> RegisterMap* map)
>>>          Style nit: I would put the blank line on L148 above L147.
>>
>> Fixed.
>>
>>>
>>>      L164:     { inflate_monitors(thread, fr, map); }
>>>          Style nit: Should be:
>>>
>>>              static void fix_monitors(JavaThread* thread, frame fr,
>>> RegisterMap* map) {
>>>                inflate_monitors(thread, fr, map);
>>>              }
>>
>> Fixed.
>>
>>> src/hotspot/share/runtime/mutexLocker.cpp
>>>      No comments. (So OsrList_lock is now 'special-1' instead of
>>> 'leaf'.
>>>      I presume the Compiler team is okay with that...
>>
>> Since need we hold CodeCache_lock while iterating nmethods, all locks
>> that might be taken needed to be pushed down under CodeCache_lock.
>> So I hope they are okay with that.
>>
>>> Thumbs up!  I don't need to see a webrev if you fix the nits...
>>
>> Thanks Dan! Fixed!
>>
>> I did t6-7 over the weekend, no issues found.
>>
>> /Robbin
>>
>>>
>>> Dan
>>>
>>>
>>>>
>>>> # Note
>>>> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html
>>>> line 630
>>>> This is revert to the original, I accidental had left in a
>>>> temporary test change, as you can see here in full diff:
>>>> http://cr.openjdk.java.net/~rehn/8221734/v2/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html
>>>>
>>>>
>>>> I think I manage to address all review comments.
>>>>
>>>> Dean can you please cast an extra eye on:
>>>> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/oops/method.cpp.sdiff.html
>>>>
>>>> This OR should be correct.
>>>>
>>>> Dan please do the same on the biased locking changes.
>>>>
>>>> I left out the merge with MutexLocker changes, since it was not
>>>> interesting.
>>>> There were some conflicts with JVMCI changes, so incremental
>>>> contains some parts of that merge.
>>>>
>>>> Passes t1-5 and local testing.
>>>> I'll continue with some additional testing.
>>>>
>>>> Thanks, Robbin
>>>>
>>>> On 4/25/19 2:05 PM, Robbin Ehn wrote:
>>>>> Hi all, please review.
>>>>>
>>>>> Let's deopt with handshakes.
>>>>> Removed VM op Deoptimize, instead we handshake.
>>>>> Locks needs to be inflate since we are not in a safepoint.
>>>>>
>>>>> Goes on top of:
>>>>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html
>>>>>
>>>>>
>>>>> Code:
>>>>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html
>>>>> Issue:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8221734
>>>>>
>>>>> Passes t1-7 and multiple t1-5 runs.
>>>>>
>>>>> A few startup benchmark see a small speedup.
>>>>>
>>>>> Thanks, Robbin
>>>
>

From vladimir.x.ivanov at oracle.com Tue May 7 04:33:29 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Mon, 6 May 2019 21:33:29 -0700
Subject: [13] RFR (T): [Graal] assert(!m->can_be_statically_bound(InstanceKlass::cast(ctxk))) failed: redundant
In-Reply-To: <5f5a9aa4-2227-ba9a-6c8b-8d62887b7673@oracle.com>
References: <5f5a9aa4-2227-ba9a-6c8b-8d62887b7673@oracle.com>
Message-ID:

Thanks, Vladimir. Pushed.

Best regards,
Vladimir Ivanov

On 06/05/2019 20:02, Vladimir Kozlov wrote:
> Good.
>
> Thanks,
> Vladimir K
>
> On 5/6/19 6:48 PM, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~vlivanov/8223422/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8223422
>>
>> Graal hits newly added assert (as part of 8223171 [1]) which ensures
>> there are no redundant dependencies added.
>>
>> I propose to disable the assert for now when Graal is enabled, then
>> fix Graal to avoid registering redundant dependencies (in the same
>> vein as 8223171 did for C1/C2), and then re-enable the assert in its
>> original form.
>>
>> Thanks!
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8223171

From rwestrel at redhat.com Tue May 7 07:10:00 2019
From: rwestrel at redhat.com (Roland Westrelin)
Date: Tue, 07 May 2019 09:10:00 +0200
Subject: RFR(XS): 8223389: Shenandoah optimizations fail with assert(!phase->exceeding_node_budget())
Message-ID: <87v9ymn3jr.fsf@redhat.com>

http://cr.openjdk.java.net/~roland/8223389/webrev.00/

This is needed following 8216137.

Roland.

From tobias.hartmann at oracle.com Tue May 7 07:18:06 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 7 May 2019 09:18:06 +0200
Subject: 8222670 patch review: prevent downgraded tasks from recompiling
In-Reply-To:
References: <99aae03d0315482c723abda2f2cb530b4b52f82d.camel@redhat.com>
 <427BC0A9-DAB2-43A3-AF93-F96414EC1E7E@amazon.com>
 <0fca1798-5851-3e5d-e603-54282dc3be81@oracle.com>
 <88c6a4e1-b98b-b0a9-2f76-3f2595be7374@oracle.com>
Message-ID: <12d69927-9f05-b5cb-0fa3-3ecd8238490c@oracle.com>

On 07.05.19 00:49, Liu, Xin wrote:
> Here is the new revision of webrev. It includes the tieredEvent you mentioned.
> https://cr.openjdk.java.net/~xliu/8222670/webrev.05/

Looks good to me, I've sponsored the changes:
http://hg.openjdk.java.net/jdk/jdk/rev/1dc9bf9d016b

Thanks,
Tobias

> Paul help me to try the patch in submit repo. It's clean.
>
> Thanks,
> --lx
>
>
> From: "do-not-reply at oracle.com"
> Reply-To: "mach5_admin_ww_grp at oracle.com"
> Date: Monday, May 6, 2019 at 11:25 AM
> To: "Hohensee, Paul"
> Subject: [Mach5] mach5-one-phh-JDK-8222670-1-20190506-1730-2299657: PASSED
>
> Job: mach5-one-phh-JDK-8222670-1-20190506-1730-2299657
>
> BuildId: 2019-05-06-1725217.hohensee.source
>
> No failed tests
>
> Tasks Summary
>
> NA: 0
> UNABLE_TO_RUN: 0
> PASSED: 76
> HARNESS_ERROR: 0
> EXECUTED_WITH_FAILURE: 0
> KILLED: 0
> FAILED: 0
> NOTHING_TO_RUN: 0
>
>
>
> On 5/3/19, 5:30 AM, "Tobias Hartmann" wrote:
>
>
> On 03.05.19 02:21, Liu, Xin wrote:
> > Thanks for the review. I fixed copyrights and the typo of clearMethodState0.
> > Here is the new revision.
> > https://cr.openjdk.java.net/~xliu/8222670/webrev.04/
>
> Looks good to me but I think you should also add:
>
> if (PrintTieredEvents) {
>   print_event(REMOVE_FROM_QUEUE, method, method, task->osr_bci(), (CompLevel) task->comp_level());
> }
>
> > But why is that? If a downgraded compilation succeeded at level 2, shouldn't a re-compilation at the
> > same level be detected by CompileBroker::compilation_is_complete() in CompileBroker::compile_method()?
> >
> > That's the very root cause of level2 recompilation.
> > In CompileBroker::compile_method(), its input argument is comp_level = 3.
> > CompileBroker::compilation_is_complete returns false because codecache only has level=2 nmethod.
> > I don't know why, but hotspot is also very stubborn. It will request level = 3 again and again. All of them are downgraded to level=2 when they dequeue.
> >
> > Level2RecompilationTest simulates this process. I didn't make it up. I observe the symptom in some real services as follows.
> > https://bugs.openjdk.java.net/secure/attachment/82079/lvl2_recomp_spring.log.zip
>
> Okay, got it.
>
> Thanks,
> Tobias
>
>

From tobias.hartmann at oracle.com Tue May 7 07:20:38 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 7 May 2019 09:20:38 +0200
Subject: RFR(XS): 8223389: Shenandoah optimizations fail with assert(!phase->exceeding_node_budget())
In-Reply-To: <87v9ymn3jr.fsf@redhat.com>
References: <87v9ymn3jr.fsf@redhat.com>
Message-ID: <329222be-1435-eba4-f1c5-b9c37a3aa65f@oracle.com>

Hi Roland,

looks good and trivial.

Best regards,
Tobias

On 07.05.19 09:10, Roland Westrelin wrote:
>
> http://cr.openjdk.java.net/~roland/8223389/webrev.00/
>
> This is needed following 8216137.
>
> Roland.
>

From rkennke at redhat.com Tue May 7 09:16:46 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 7 May 2019 11:16:46 +0200
Subject: RFR(S): 8222738: Shenandoah: assert(is_Proj()) failed when running cometd benchmarks
In-Reply-To: <87lfznofr0.fsf@redhat.com>
References: <87zhonnwoq.fsf@redhat.com>
 <0f1a9600-2f2d-f360-9bc5-aa44f49d8990@redhat.com>
 <87lfznofr0.fsf@redhat.com>
Message-ID:

Ok. Thanks!

Roman

> Thanks for the review. Actually, I think it's safer to also make the
> change below because we want to clone everything that's between the call
> and the fallthrough/exception paths, that is everything with a control
> of: the call itself or its control projection.
>
> Roland.
>
>
> diff -r f0739ec84bb4 -r 9968255985be src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp
> --- a/src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp Thu Apr 11 12:00:33 2019 +0200
> +++ b/src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp Thu May 02 20:47:23 2019 +0200
> @@ -1362,7 +1362,7 @@
>        if (idx < n->outcnt()) {
>          Node* u = n->raw_out(idx);
>          Node* c = phase->ctrl_or_self(u);
> -        if (c == ctrl) {
> +        if (phase->is_dominator(call, c) && phase->is_dominator(c, projs.fallthrough_proj)) {
>            stack.set_index(idx+1);
>            assert(!u->is_CFG(), "");
>            stack.push(u, 0);
>

From shade at redhat.com Tue May 7 10:26:13 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 7 May 2019 12:26:13 +0200
Subject: RFR (XS) 8223450: Disable Shenandoah C2 barriers verification for x86_32
Message-ID:

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8223450

Shenandoah C2 barrier verification is disabled when unusual barrier configuration is requested.
However, that only takes care of options provided from the command line (as our tests assert). For
the configuration that disables barriers implicitly, e.g. x86_32, this is not enough, and tests fail
with false negatives. We need to disable C2 barriers verification explicitly there.
Fix:

diff -r 1c3292907e4b src/hotspot/share/gc/shenandoah/shenandoahArguments.cpp
--- a/src/hotspot/share/gc/shenandoah/shenandoahArguments.cpp   Tue May 07 12:19:28 2019 +0200
+++ b/src/hotspot/share/gc/shenandoah/shenandoahArguments.cpp   Tue May 07 12:23:15 2019 +0200
@@ -48,10 +48,12 @@
   FLAG_SET_DEFAULT(ShenandoahLoadRefBarrier,         false);
   FLAG_SET_DEFAULT(ShenandoahKeepAliveBarrier,       false);
   FLAG_SET_DEFAULT(ShenandoahStoreValEnqueueBarrier, false);
   FLAG_SET_DEFAULT(ShenandoahCASBarrier,             false);
   FLAG_SET_DEFAULT(ShenandoahCloneBarrier,           false);
+
+  FLAG_SET_DEFAULT(ShenandoahVerifyOptoBarriers,     false);
 #endif

 #ifdef _LP64
   // The optimized ObjArrayChunkedTask takes some bits away from the full 64 addressable
   // bits, fail if we ever attempt to address more than we can. Only valid on 64bit.

Testing: hotspot_gc_shenandoah

--
Thanks,
-Aleksey

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL:

From rkennke at redhat.com Tue May 7 10:34:51 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 7 May 2019 12:34:51 +0200
Subject: RFR (XS) 8223450: Disable Shenandoah C2 barriers verification for x86_32
In-Reply-To:
References:
Message-ID:

Looks good, thanks!

Roman

> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8223450
>
> Shenandoah C2 barrier verification is disabled when unusual barrier configuration is requested.
> However, that only takes care of options provided from the command line (as our tests assert). For
> the configuration that disables barriers implicitly, e.g. x86_32, this is not enough, and tests fail
> with false negatives. We need to disable C2 barriers verification explicitly there.
>
> Fix:
>
> diff -r 1c3292907e4b src/hotspot/share/gc/shenandoah/shenandoahArguments.cpp
> --- a/src/hotspot/share/gc/shenandoah/shenandoahArguments.cpp   Tue May 07 12:19:28 2019 +0200
> +++ b/src/hotspot/share/gc/shenandoah/shenandoahArguments.cpp   Tue May 07 12:23:15 2019 +0200
> @@ -48,10 +48,12 @@
>    FLAG_SET_DEFAULT(ShenandoahLoadRefBarrier,         false);
>    FLAG_SET_DEFAULT(ShenandoahKeepAliveBarrier,       false);
>    FLAG_SET_DEFAULT(ShenandoahStoreValEnqueueBarrier, false);
>    FLAG_SET_DEFAULT(ShenandoahCASBarrier,             false);
>    FLAG_SET_DEFAULT(ShenandoahCloneBarrier,           false);
> +
> +  FLAG_SET_DEFAULT(ShenandoahVerifyOptoBarriers,     false);
>  #endif
>
>  #ifdef _LP64
>    // The optimized ObjArrayChunkedTask takes some bits away from the full 64 addressable
>    // bits, fail if we ever attempt to address more than we can. Only valid on 64bit.
>
> Testing: hotspot_gc_shenandoah
>

From rwestrel at redhat.com Tue May 7 12:38:39 2019
From: rwestrel at redhat.com (Roland Westrelin)
Date: Tue, 07 May 2019 14:38:39 +0200
Subject: RFR(XS): 8223389: Shenandoah optimizations fail with assert(!phase->exceeding_node_budget())
In-Reply-To: <329222be-1435-eba4-f1c5-b9c37a3aa65f@oracle.com>
References: <87v9ymn3jr.fsf@redhat.com>
 <329222be-1435-eba4-f1c5-b9c37a3aa65f@oracle.com>
Message-ID: <87r29amoc0.fsf@redhat.com>

Thanks for the review, Tobias.

Roland.

From shade at redhat.com Tue May 7 14:12:59 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 7 May 2019 16:12:59 +0200
Subject: RFR (XS) 8223450: Disable Shenandoah C2 barriers verification for x86_32
In-Reply-To:
References:
Message-ID: <4b836e5b-d5f2-4f64-8f5d-57d75fdd6eb4@redhat.com>

Thanks, pushed under triviality rules.

-Aleksey

On 5/7/19 12:34 PM, Roman Kennke wrote:
> Looks good, thanks!
>
> Roman
>
>
>> Bug:
>>   https://bugs.openjdk.java.net/browse/JDK-8223450
>>
>> Shenandoah C2 barrier verification is disabled when unusual barrier configuration is requested.
>> However, that only takes care of options provided from the command line (as our tests assert). For
>> the configuration that disables barriers implicitly, e.g. x86_32, this is not enough, and tests fail
>> with false negatives. We need to disable C2 barriers verification explicitly there.
>>
>> Fix:
>>
>> diff -r 1c3292907e4b src/hotspot/share/gc/shenandoah/shenandoahArguments.cpp
>> --- a/src/hotspot/share/gc/shenandoah/shenandoahArguments.cpp   Tue May 07 12:19:28 2019 +0200
>> +++ b/src/hotspot/share/gc/shenandoah/shenandoahArguments.cpp   Tue May 07 12:23:15 2019 +0200
>> @@ -48,10 +48,12 @@
>>    FLAG_SET_DEFAULT(ShenandoahLoadRefBarrier,         false);
>>    FLAG_SET_DEFAULT(ShenandoahKeepAliveBarrier,       false);
>>    FLAG_SET_DEFAULT(ShenandoahStoreValEnqueueBarrier, false);
>>    FLAG_SET_DEFAULT(ShenandoahCASBarrier,             false);
>>    FLAG_SET_DEFAULT(ShenandoahCloneBarrier,           false);
>> +
>> +  FLAG_SET_DEFAULT(ShenandoahVerifyOptoBarriers,     false);
>>  #endif
>>
>>  #ifdef _LP64
>>    // The optimized ObjArrayChunkedTask takes some bits away from the full 64 addressable
>>    // bits, fail if we ever attempt to address more than we can. Only valid on 64bit.
>>
>> Testing: hotspot_gc_shenandoah
>>

--
Thanks,
-Aleksey

Red Hat GmbH, http://www.de.redhat.com/,
Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael O'Neill, Tom Savage, Eric Shander

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL:

From robbin.ehn at oracle.com Tue May 7 14:26:41 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Tue, 7 May 2019 16:26:41 +0200
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To: <259f2edc-a842-8f14-39d6-74eb47a2964c@oracle.com>
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com>
 <64a8afca-9dc8-b119-0a12-dd05799bdd22@oracle.com>
 <259f2edc-a842-8f14-39d6-74eb47a2964c@oracle.com>
Message-ID: <0fe2590d-a41a-0554-b9b2-efc3bb644684@oracle.com>

Hi Patricio,

On 5/6/19 6:10 PM, Patricio Chilano wrote:
> Hi Robbin,
>
> I'm going to just review the biased locking part since I'm not really familiar
> with the rest of the code.
>
> In BiasedLocking::revoke_and_rebias_in_handshake(), why do you need to execute
> fast_revoke(obj, false)? If these are objects locked by the JavaThread you are
> handshaking then it seems they should be normal locks (no bias pattern) or the
> condition (mark->biased_locker() == THREAD && prototype_header->bias_epoch() ==
> mark->bias_epoch()) you are testing for later should hold. Then that would save
> the extra comparisons in fast_revoke().

I tried to make as few changes as possible. What you are saying is true for the
current code also. Since often there are no locks on the stack, the biggest
overhead is walking the entire stack, twice if a lock is found. A couple of
comparisons don't really matter here. I wanted the biased code to be the same
as before, since this is not supposed to change anything there, but we needed
to avoid triggering a VM_BulkRevokeBias, so I avoided the update_heuristics.

>
> Also instead of placing the condition (mark->biased_locker() == THREAD &&
> prototype_header->bias_epoch() == mark->bias_epoch()) inside an if() and then
> later use a ShouldNotReachHere(), wouldn't it be better to make that an
> assertion, place that code outside the if() and remove the ShouldNotReachHere()?

If we were to fail here (hardware fault or so), you could lose your lock. I'd
rather have a crash in the release build than have to debug an application
that sometimes spontaneously drops a lock, so I'd like to keep a hard crash.
With your suggestion we should have a guarantee, e.g.:

+    guarantee(mark->biased_locker() == THREAD &&
+              prototype_header->bias_epoch() == mark->bias_epoch(), "Revoke failed, unhandled biased lock state");
+    ResourceMark rm;
+    log_info(biasedlocking)("Revoking bias by walking my own stack:");
...

Seems good?

>
> For the execution of revoke_bias() inside
> BiasedLocking::revoke_and_rebias_in_handshake() you could use a shorter version
> of BiasedLocking::revoke_and_rebias() that avoids the extra comparisons made for
> the general case and just starts at the walking the stack part, but I'm actually
> doing that for 8191890 so I can merge that with my patch.

Yes, ok, good.

>
> In deoptimization.cpp you have methods inflate_monitors() and
> inflate_monitors_handshake(), but in inflate_monitors() you are not inflating
> the monitors, you just revoke the ones that have bias. You mentioned in your
> first email that we need to inflate if we are not at a safepoint, why is that?
> Since revocation seems to be the common factor between those methods, maybe
> s/inflate/revoke is a better name?
>

I looked this over; we actually always only need to revoke. The JavaThread will
inflate, if needed, on its way to the interpreter when unpacking the stack
(since we are inflating the stack, we must move the stack locks, which can
require an inflation).

I renamed the methods and removed the early inflation.

I'll send an update to the RFR mail.

Thanks, Robbin

>
> Thanks!
> Patricio
>
>
> On 5/6/19 4:42 AM, Robbin Ehn wrote:
>> Hi Dan,
>>
>>> src/hotspot/share/runtime/biasedLocking.cpp
>>>      nit - Please update copyright year for this file.
>>>
>>
>> Updated in 8220724.
>>
>>>      Nice refactoring into more readable chunks! I'm assuming that
>>>      Patricio is also reviewing these changes...
>>
>> Great, good!
>>
>>> src/hotspot/share/runtime/deoptimization.cpp
>>>      L778:  bool _in_handshake;
>>>          nit - needs one more space of indent.
>>
>> Fixed.
>>
>>>
>>>      Nice refactoring while adding in the handshake support.
>>
>> Great!
>>
>>>
>>> src/hotspot/share/runtime/deoptimization.hpp
>>>      L147:  public:
>>>      L148:
>>>      L149:   // Deoptimizes a frame lazily. nmethod gets patched deopt
>>> happens on return to the frame
>>>      L163:   static void fix_monitors(JavaThread* thread, frame fr,
>>> RegisterMap* map)
>>>          Style nit: I would put the blank line on L148 above L147.
>>
>> Fixed.
>>
>>>
>>>      L164:     { inflate_monitors(thread, fr, map); }
>>>          Style nit: Should be:
>>>
>>>              static void fix_monitors(JavaThread* thread, frame fr,
>>> RegisterMap* map) {
>>>                inflate_monitors(thread, fr, map);
>>>              }
>>
>> Fixed.
>>
>>> src/hotspot/share/runtime/mutexLocker.cpp
>>>      No comments. (So OsrList_lock is now 'special-1' instead of 'leaf'.
>>>      I presume the Compiler team is okay with that...
>>
>> Since need we hold CodeCache_lock while iterating nmethods, all locks that
>> might be taken needed to be pushed down under CodeCache_lock.
>> So I hope they are okay with that.
>>
>>> Thumbs up!  I don't need to see a webrev if you fix the nits...
>>
>> Thanks Dan! Fixed!
>>
>> I did t6-7 over the weekend, no issues found.
>>
>> /Robbin
>>
>>>
>>> Dan
>>>
>>>
>>>>
>>>> # Note
>>>> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html
>>>> line 630
>>>> This is revert to the original, I accidental had left in a temporary test
>>>> change, as you can see here in full diff:
>>>> http://cr.openjdk.java.net/~rehn/8221734/v2/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html
>>>>
>>>>
>>>> I think I manage to address all review comments.
>>>> >>>> Dean can you please cast an extra eye on: >>>> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/oops/method.cpp.sdiff.html >>>> >>>> This OR should be correct. >>>> >>>> Dan please do the same on the biased locking changes. >>>> >>>> I left out the merge with MutexLocker changes, since it was not interesting. >>>> There were some conflicts with JVMCI changes, so incremental contains some >>>> parts of that merge. >>>> >>>> Passes t1-5 and local testing. >>>> I'll continue with some additional testing. >>>> >>>> Thanks, Robbin >>>> >>>> On 4/25/19 2:05 PM, Robbin Ehn wrote: >>>>> Hi all, please review. >>>>> >>>>> Let's deopt with handshakes. >>>>> Removed VM op Deoptimize, instead we handshake. >>>>> Locks needs to be inflate since we are not in a safepoint. >>>>> >>>>> Goes on top of: >>>>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html >>>>> >>>>> >>>>> Code: >>>>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html >>>>> Issue: >>>>> https://bugs.openjdk.java.net/browse/JDK-8221734 >>>>> >>>>> Passes t1-7 and multiple t1-5 runs. >>>>> >>>>> A few startup benchmark see a small speedup. >>>>> >>>>> Thanks, Robbin >>> > From robbin.ehn at oracle.com Tue May 7 16:01:50 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 7 May 2019 18:01:50 +0200 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: <903e7954-0c16-0fe4-1a1f-6e9c5c403a28@oracle.com> References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <64a8afca-9dc8-b119-0a12-dd05799bdd22@oracle.com> <903e7954-0c16-0fe4-1a1f-6e9c5c403a28@oracle.com> Message-ID: <76f1a03f-d1a4-b6e4-bba5-6444fab04116@oracle.com> Hi David, On 5/7/19 4:49 AM, David Holmes wrote: > Hi Robbin, > > I took a look at this and it is a lot more complex than I had expected. There > are some interconnections that I'm not understanding here. From existing code > why does deopt need to revoke biases? 
> I don't see how deopt changes anything in
> respect to monitor ownwership. Then in the new code why do you have to inflate
> monitors to do deopt? If this is truly new behaviour and I didn't just miss
> where this happens in existing code, then how does this impact the number of
> ObjectMonitors in existence and the monitor deflation process?

When going to the interpreter we expand the stack, which means moving everything
on the stack. Moving locks around might require an inflation, so we revoke the
biases beforehand. The revoking code seems to date back to duke, so I don't
think it is that new :)

>
> More comments below ...
>
> On 3/05/2019 8:31 pm, Robbin Ehn wrote:
>> Hi, please see this update:
>>
>> Inc:
>> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/index.html
>> Full:
>> http://cr.openjdk.java.net/~rehn/8221734/v2/webrev/
>
> src/hotspot/share/code/codeCache.cpp
>
> This comment block no longer applies give there is no safepoint op now:
>
> 1190   // CodeCache can only be updated by a thread_in_VM and they will all be
> 1191   // stopped during the safepoint so CodeCache will be safe to update without
> 1192   // holding the CodeCache_lock.
>
> The same comment block here:
>
> 1208   // CodeCache can only be updated by a thread_in_VM and they will all be
> 1209   // stopped dring the safepoint so CodeCache will be safe to update without
> 1210   // holding the CodeCache_lock.
>
> is already incorrect if not actually at a safepoint. It makes the relationship
> between the Compile_lock, CodeCache_lock and being at a safepoint rather confusing.

Yes, this area is filled with 'weird' comments.
mark_for_deoptimization takes CodeCache_lock unconditionally.
I just removed these two.

>
> ---
>
> src/hotspot/share/code/nmethod.cpp
>
> The comment here:
>
> 1119   // If _method is already NULL the Method* is about to be unloaded,
> 1120   // so we don't have to break the cycle. Note that it is possible to
> 1121   // have the Method* live here, in case we unload the nmethod because
> 1122   // it is pointing to some oop (other than the Method*) being unloaded.
>
> no longer fits the code given we no longer skip the _method==NULL case. Further,
> the trailing comment here:
>
> 1123   Method::unlink_code(_method, this); // Break a cycle
>
> seems unnecessary given we preceded this with 4 lines of commentary already!

Updated.

>
> Again this comment block:
>
> 1205 void nmethod::unlink_from_method() {
> 1206   // We need to check if both the _code and _from_compiled_code_entry_point
> 1207   // refer to this nmethod because there is a race in setting these two fields
> 1208   // in Method* as seen in bugid 4947125.
> 1209   // If the vep() points to the zombie nmethod, the memory for the nmethod
> 1210   // could be flushed and the compiler and vtable stubs could still call
> 1211   // through it.
> 1212   Method::unlink_code(method(), this);
>
> seems meaningless with the code change you've applied.

Moved it to Method::unlink_code.

>
> ---
>
> src/hotspot/share/code/nmethod.hpp
>
> !   void unlink_from_method();
>
> Now this doesn't take the acquire_lock parameter it would be useful to document
> what the locking expectations are: must this be called with a given lock held,
> or will it always acquire a given lock if needed?

Added comment.

>
> ---
>
> src/hotspot/share/oops/method.cpp
>
> + void Method::unlink_code(Method *method, CompiledMethod *compare) {
> + void Method::unlink_code(Method *method) {
>
> I'm not sure making these methods static just so the NULL check can be
> internalized is the best way to deal with this. Now you can't tell when NULL is
> expected and when it is an error. IMHO it is better to keep these as instance
> methods with a NULL check at the callsite if needed (or an assert if NULL is not
> expected).

All call sites were NULL-checked before calling the old clear_code(), and in no
place was that asserted.
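
[Editorial sketch] To make the trade-off discussed for Method::unlink_code concrete, here is a minimal, self-contained C++ sketch. It is not HotSpot code: `Method` and `CompiledMethod` below are toy stand-ins, the `bool` return value and the `unlink_code_if_matches` helper are hypothetical names introduced only for illustration, and the locking done by the real method is left out. It only contrasts a static entry point that internalizes the NULL check with an instance method whose caller has to perform the check.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for HotSpot's CompiledMethod (assumption, not the real class).
struct CompiledMethod {
  int id;
};

// Toy stand-in for HotSpot's Method (assumption, not the real class).
class Method {
 public:
  explicit Method(CompiledMethod* code) : _code(code) {}

  CompiledMethod* code() const { return _code; }

  // Static variant: the NULL check lives inside, so every caller may pass
  // NULL. The cost noted in the review is that a NULL argument can no
  // longer be told apart from a caller bug.
  static bool unlink_code(Method* method, CompiledMethod* compare) {
    if (method == nullptr) {
      return false;  // silently tolerated
    }
    return method->unlink_code_if_matches(compare);
  }

  // Instance variant (hypothetical name): callers must hold a valid Method*,
  // so each call site documents whether NULL is expected (an explicit check)
  // or a bug (an assert).
  bool unlink_code_if_matches(CompiledMethod* compare) {
    if (_code != compare) {
      return false;  // the method points at different code; leave it alone
    }
    _code = nullptr;
    return true;
  }

 private:
  CompiledMethod* _code;
};
```

With the static form a call such as `Method::unlink_code(maybe_null, nm)` is always safe; with the instance form the caller writes `if (m != nullptr) m->unlink_code_if_matches(nm);` or asserts `m != nullptr` first, which is the distinction the review comment is asking for.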
>
> ---
>
>  src/hotspot/share/runtime/biasedLocking.cpp
>  src/hotspot/share/runtime/biasedLocking.hpp
>
> This seems like it would be better done after we have switched biased-locking
> revocation to use handshakes instead of safepoints. Otherwise we seem to be
> doing a partial conversion as a side-effect of this bug and it's far from
> obvious that it is complete/correct.

This is not the same: we are in a handshake and we want to revoke the locks we
find on our own stack. Revoking with a handshake is what you do when you are
revoking another thread's bias. This is a thread-local operation.
As I said to Patricio, the only thing I changed is that we can't safepoint.
So it's the same revocation code as before, just without the possibility of a
safepoint.

>
> ---
>
> src/hotspot/share/runtime/synchronizer.cpp
>
> These changes have me worried:
>
>    assert(Universe::verify_in_progress() ||
> !          !SafepointSynchronize::is_at_safepoint(), "invariant");
>
> becomes:
>
>    assert(Universe::verify_in_progress() ||
> !          !Universe::heap()->is_gc_active(), "invariant");
>
> I don't see an immediate equivalence between not being at a safepoint and not
> having a GC active. If I'm not at a safepoint now and the code following doesn't
> do a safepoint check then we remain outside of a safepoint. What is the same
> reasoning for a GC being active?

When we use the handshake fallback path (handshaking on a platform that does not
support handshakes), the handshake is performed with a safepoint (this will go
away in JDK 14).
"calling hashCode() or any of its internal aliases may result in changes to
an object's markword if we need to assign a hashCode. That is, hashCode() can be
a heap mutator. Given that, it seems unwise to call hashCode while at a
safepoint."
Since the JavaThread inflates on demand, I removed the eager inflation.
Inflation is thus done by the JavaThread, so I am reverting this change.

>
> !         ResourceMark rm(Self);
> !         ResourceMark rm;
>
> Are you suggesting the current thread is not Self? If that is the case then
> there should be numerous asserts earlier on to ensure we can't follow any code
> paths that expect that Self is the current thread! But I'm concerned that we've
> introduced a new way for a third-party thread to introduce monitor inflation
> almost independent of the threads using the monitor.

Reverted this for the reason above.
(When the VM thread executed the handshake on behalf of the JavaThread, it did
the inflation, so Self was not Thread::current().)

I'll send an update to the RFR mail.

Thanks, Robbin

>
> Thanks,
> David
> -----
>
>>
>> # Note
>> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html
>> line 630
>> This is revert to the original, I accidental had left in a temporary test
>> change, as you can see here in full diff:
>> http://cr.openjdk.java.net/~rehn/8221734/v2/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html
>>
>>
>> I think I manage to address all review comments.
>>
>> Dean can you please cast an extra eye on:
>> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/oops/method.cpp.sdiff.html
>>
>> This OR should be correct.
>>
>> Dan please do the same on the biased locking changes.
>>
>> I left out the merge with MutexLocker changes, since it was not interesting.
>> There were some conflicts with JVMCI changes, so incremental contains some
>> parts of that merge.
>>
>> Passes t1-5 and local testing.
>> I'll continue with some additional testing.
>>
>> Thanks, Robbin
>>
>> On 4/25/19 2:05 PM, Robbin Ehn wrote:
>>> Hi all, please review.
>>>
>>> Let's deopt with handshakes.
>>> Removed VM op Deoptimize, instead we handshake.
>>> Locks needs to be inflate since we are not in a safepoint.
>>>
>>> Goes on top of:
>>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html
>>>
>>>
>>> Code:
>>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html
>>> Issue:
>>> https://bugs.openjdk.java.net/browse/JDK-8221734
>>>
>>> Passes t1-7 and multiple t1-5 runs.
>>>
>>> A few startup benchmark see a small speedup.
>>>
>>> Thanks, Robbin

From vladimir.x.ivanov at oracle.com Tue May 7 17:41:03 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Tue, 7 May 2019 10:41:03 -0700
Subject: RFR (M) 8222074: Enhance auto vectorization for x86
In-Reply-To:
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB845@FMSMSX126.amr.corp.intel.com> <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com> <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com> <0cd3fd93-0f1e-a6d0-d4c3-f8d95b533ff7@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB56B1@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB7472@FMSMSX126.amr.corp.intel.com>
Message-ID:

Good point, Eric. I agree that having microbenchmarks accompanying such
changes would be very valuable. There's a rich set of microbenchmarks in
Project Panama for the Vector API, and some of them target the
auto-vectorization case as well.

Sandhya, can you share the plans, please, regarding contributing the relevant
ones to the mainline?

I'm fine with handling that as a separate RFE.

Best regards,
Vladimir Ivanov

On 06/05/2019 07:21, Eric Caspole wrote:
> Hi Sandhya,
> Could add some new JMH to this webrev that target the java code that
> show the benefit of these changes?
> Or, you could look through the
> existing ones in
>  test/micro/org/openjdk/bench/
>
> and mention in the bug which existing ones exercise these changes. That
> will be a big help to us in the course of working on JDK 13.
> Thanks,
> Eric
>
>
>
> On 5/3/19 19:02, Viswanathan, Sandhya wrote:
>> Hi Vladimir,
>>
>> Please find below the updated webrev which implements all your inputs:
>> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.02/
>>
>> Looking forward to your feedback.
>>
>> Best Regards,
>> Sandhya
>>
>>
>> -----Original Message-----
>> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
>> Sent: Wednesday, May 01, 2019 5:09 PM
>> To: Viswanathan, Sandhya ; Vladimir
>> Kozlov
>> Cc: hotspot-compiler-dev at openjdk.java.net
>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86
>>
>> Sounds good, thanks!
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 01/05/2019 15:16, Viswanathan, Sandhya wrote:
>>> I should add here that your suggestion of adding generic shift
>>> instruction etc to the macroAssembler is also wonderful instead of
>>> function pointer.  I will look into making that change as well.
>>>
>>> Best Regards,
>>> Sandhya
>>>
>>>
>>> -----Original Message-----
>>> From: Viswanathan, Sandhya
>>> Sent: Wednesday, May 01, 2019 3:10 PM
>>> To: 'Vladimir Ivanov' ; Vladimir Kozlov
>>>
>>> Cc: hotspot-compiler-dev at openjdk.java.net
>>> Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86
>>>
>>> Hi Vladimir,
>>>
>>> I agree, I wanted to show both the approaches in this patch to get
>>> your feedback:
>>> 1) with emit as a function
>>> 2) with emit part in the instruct body itself
>>>
>>> With emit as a function it becomes hard to read and I personally
>>> prefer it in the instruct itself as is done for vabsneg2D etc. That
>>> is what you are recommending as well so I feel good.
>>>
>>> Once the adlc enhancement is done both the approaches should give
>>> similar binary size. Till then there will be small overhead with
>>> approach 2) as emit is duplicated per match rule.
>>>
>>> I will send an updated patch fixing the two issues you mentioned in
>>> your previous email plus this change of using approach 2).
>>>
>>> Please do let me know if you want to see any other change in this patch.
>>>
>>> Best Regards,
>>> Sandhya
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
>>> Sent: Wednesday, May 01, 2019 2:58 PM
>>> To: Viswanathan, Sandhya ; Vladimir
>>> Kozlov
>>> Cc: hotspot-compiler-dev at openjdk.java.net
>>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86
>>>
>>>
>>>> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.01/
>>>
>>> Nice job, Sandhya! Glad to hear the approach pays off!
>>>
>>> Unfortunately, I must note that AD file becomes much more obscure.
>>> Especially with those function pointers.
>>>
>>> 1528 void emit_vshift16B_code(MacroAssembler& _masm, int opcode,
>>> XMMRegister dst,
>>> 1529                         XMMRegister src, XMMRegister shift,
>>> 1530                         XMMRegister tmp1, XMMRegister tmp2,
>>> Register scratch) {
>>> 1531   XX_Inst extendinst = get_extend_inst(opcode == Op_URShiftVB ?
>>> false : true);
>>> 1532   XX_Inst shiftinst = get_xx_inst(opcode);
>>> 1533
>>> 1534   (_masm.*extendinst)(tmp1, src);
>>> 1535   (_masm.*shiftinst)(tmp1, shift);
>>> 1536   __ pshufd(tmp2, src, 0xE);
>>> 1537   (_masm.*extendinst)(tmp2, tmp2);
>>> 1538   (_masm.*shiftinst)(tmp2, shift);
>>> 1539   __ movdqu(dst, ExternalAddress(vector_short_to_byte_mask()),
>>> scratch);
>>> 1540   __ pand(tmp2, dst);
>>> 1541   __ pand(dst, tmp1);
>>> 1542   __ packuswb(dst, tmp2);
>>> 1543 }
>>>
>>> Have you tried to encapsulate that into x86-specific MacroAssembler?
>>>
>>> 8682 instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1,
>>> vecX tmp2, rRegI scratch) %{
>>> 8683   predicate(UseSSE > 3 && UseAVX <= 1 && n->as_Vector()->length()
>>> == 16);
>>> 8684   match(Set dst (LShiftVB src shift));
>>> 8685   match(Set dst (RShiftVB src shift));
>>> 8686   match(Set dst (URShiftVB src shift));
>>> 8687   effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch);
>>> 8688   format %{"pmovxbw   $tmp1,$src\n\t"
>>> 8689            "shiftop   $tmp1,$shift\n\t"
>>> 8690            "pshufd    $tmp2,$src\n\t"
>>> 8691            "pmovxbw   $tmp2,$tmp2\n\t"
>>> 8692            "shiftop   $tmp2,$shift\n\t"
>>> 8693            "movdqu    $dst,[0x00ff00ff0x00ff00ff]\n\t"
>>> 8694            "pand      $tmp2,$dst\n\t"
>>> 8695            "pand      $dst,$tmp1\n\t"
>>> 8696            "packuswb  $dst,$tmp2\n\t! packed16B shift" %}
>>> 8697   ins_encode %{
>>> 8698     emit_vshift16B_code(_masm, this->as_Mach()->ideal_Opcode() ,
>>> $dst$$XMMRegister, $src$$XMMRegister, $shift$$XMMRegister,
>>> $tmp1$$XMMRegister, $tmp2$$XMMRegister, $scratch$$Register);
>>> 8699   %}
>>> 8700   ins_pipe( pipe_slow );
>>> 8701 %}
>>>
>>> can be turned into something like:
>>>
>>> instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX
>>> tmp2, rRegI scratch) %{
>>>      predicate(n->as_Vector()->length() == 16);
>>>      match(Set dst (LShiftVB src shift));
>>>      match(Set dst (RShiftVB src shift));
>>>      match(Set dst (URShiftVB src shift));
>>>      effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch);
>>>      format %{"packed16B shift" %}
>>>      ins_encode %{
>>>        int vlen = 0; // 128-bit
>>>        BasicType elem_type = T_BYTE;
>>>        int shift_mode = ...; // L/R/UR or S/U + L/R
>>>        __ vshift(vlen, elem_type, shift_mode,
>>>                  $dst$$..., $src$$..., $shift$$...,
>>>           $tmp1$$..., $tmp2$$..., $scratch$$...);
>>>       %}
>>>
>>> Then MA::vshift can dispatch between different implementations
>>> depending on SSE/AVX level available. Do you see any problems with
>>> that from footprint perspective?
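
[Editorial sketch] The dispatch idea in the quoted suggestion can be illustrated with a small, self-contained C++ model. Everything here is an assumption made for illustration: `ToyMacroAssembler` only records mnemonics and is not HotSpot's MacroAssembler, the `use_avx` level stands in for the UseAVX flag, and the choice of `pmovzxbw`/`pmovsxbw` and `psllw`/`psraw`/`psrlw` is a guess at what `get_extend_inst()` and `get_xx_inst()` select in the webrev. The AVX branch is a placeholder, because the thread does not spell out that sequence.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Shift flavours covered by the vshift16B rule (LShiftVB, RShiftVB, URShiftVB).
enum class ShiftMode { Left, ArithmeticRight, LogicalRight };

// Toy stand-in for MacroAssembler: records mnemonics instead of emitting code.
class ToyMacroAssembler {
 public:
  explicit ToyMacroAssembler(int use_avx) : _use_avx(use_avx) {}

  // Single entry point used by the AD rule; the implementation is picked
  // here from the CPU feature level, so the rule needs no predicate on it.
  void vshift16B(ShiftMode mode) {
    if (_use_avx <= 1) {
      vshift16B_sse(mode);
    } else {
      // Placeholder only: the AVX sequence is not given in the thread.
      emit("avx_variant_placeholder");
    }
  }

  const std::vector<std::string>& emitted() const { return _emitted; }

 private:
  void emit(const std::string& mnemonic) { _emitted.push_back(mnemonic); }

  // Assumption: zero-extend for the unsigned right shift, sign-extend otherwise.
  static const char* extend_op(ShiftMode mode) {
    return mode == ShiftMode::LogicalRight ? "pmovzxbw" : "pmovsxbw";
  }

  // Assumption: the bytes are widened to words, so word-sized shifts are used.
  static const char* shift_op(ShiftMode mode) {
    switch (mode) {
      case ShiftMode::Left:            return "psllw";
      case ShiftMode::ArithmeticRight: return "psraw";
      case ShiftMode::LogicalRight:    return "psrlw";
    }
    return "psllw";  // not reached; keeps compilers quiet
  }

  // Mirrors the nine-step sequence of emit_vshift16B_code() quoted above.
  void vshift16B_sse(ShiftMode mode) {
    emit(extend_op(mode));  // tmp1 <- widened low half of src
    emit(shift_op(mode));   // tmp1 <- tmp1 shifted
    emit("pshufd");         // tmp2 <- high half of src
    emit(extend_op(mode));  // tmp2 <- widened tmp2
    emit(shift_op(mode));   // tmp2 <- tmp2 shifted
    emit("movdqu");         // dst  <- short-to-byte mask
    emit("pand");           // tmp2 <- tmp2 & mask
    emit("pand");           // dst  <- mask & tmp1
    emit("packuswb");       // dst  <- packed bytes
  }

  int _use_avx;
  std::vector<std::string> _emitted;
};
```

In this model the AD rule would shrink to one call such as `masm.vshift16B(ShiftMode::LogicalRight)`. Whether that helps footprint depends on how adlc duplicates the encode block per match rule, which the sketch does not model.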
>>>
>>> Ideally, I'd prefer to see a library of operations on vectors
>>> encapsulated in MacroAssembler (or a subclass) and used in x86.ad.
>>> That will accommodate further reductions in AD instructions needed.
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>>> With this webrev the ad file has only about 60 lines effectively added.
>>>> Also the generated product libjvm.so size only increases by about
>>>> 0.26% vs the prior 1.50%.
>>>> I have used multiple match rules in one instruct for same size shift
>>>> related rules and also for the new Abs/Neg rules.
>>>> What I noticed is that the adlc still duplicates lot of code and
>>>> there is potential to further improve code size for multiple match
>>>> rule case by improving the adlc itself.
>>>> The adlc improvement (like removing duplicate emits, formats,
>>>> expand, pipeline etc) can be done as a separate RFE.
>>>> In this webrev, I have also fixed the errors reported by Vladimir
>>>> Ivanov and corrected the issues reported by jcheck tool.
>>>> Also taken into account reducing the temporary by using TEMP dst for
>>>> multiply rules.
>>>>
>>>> The compiler jtreg tests and the java math tests pass on Haswell,
>>>> SKX, and KNL.
>>>>
>>>> Your review and feedback is welcome.
>>>>
>>>> Best Regards,
>>>> Sandhya
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: hotspot-compiler-dev
>>>> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of
>>>> Viswanathan, Sandhya
>>>> Sent: Wednesday, April 10, 2019 10:22 AM
>>>> To: Vladimir Kozlov ; B. Blaser
>>>>
>>>> Cc: hotspot-compiler-dev at openjdk.java.net
>>>> Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86
>>>>
>>>> Yes good catch, in mul32B_reg_avx(), the last two instructions are
>>>> the only place where dst is used:
>>>>
>>>>        __ vpackuswb($dst$$XMMRegister, $tmp2$$XMMRegister,
>>>> $tmp1$$XMMRegister, vector_len);
>>>>        __ vpermq($dst$$XMMRegister, $dst$$XMMRegister, 0xD8,
>>>> vector_len);
>>>>
>>>> Here dst can be same as tmp2 or tmp1 in packuswb() and so the effect
>>>> TEMP dst is not required.
>>>>
>>>> Best Regards,
>>>> Sandhya
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>> Sent: Wednesday, April 10, 2019 9:59 AM
>>>> To: Viswanathan, Sandhya ; B. Blaser
>>>>
>>>> Cc: hotspot-compiler-dev at openjdk.java.net
>>>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86
>>>>
>>>> On 4/10/19 8:36 AM, Viswanathan, Sandhya wrote:
>>>>> Hi Bernard,
>>>>>
>>>>> One could add TEMP dst in effect() to let the register allocator
>>>>> know that dst needs to be different from src.
>>>>
>>>> Yes, we use this way. Or, in mul4B_reg() case, we can use $dst instead
>>>> $tmp2 to avoid overwriting
>>>> $src2 before we get value from it if $dst = $src2.
>>>>
>>>> On other hand, mul32B_reg_avx() and other have 'TEMP dst' effect but
>>>> $dst is used only for final result.
>>>>
>>>> It is a little mess which may cause ineffective use of registers in
>>>> compiled code.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>>>
>>>>> Best Regards,
>>>>> Sandhya
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: B. Blaser [mailto:bsrbnd at gmail.com]
>>>>> Sent: Wednesday, April 10, 2019 4:10 AM
>>>>> To: Viswanathan, Sandhya
>>>>> Cc: Vladimir Kozlov ;
>>>>> hotspot-compiler-dev at openjdk.java.net
>>>>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86
>>>>>
>>>>> Hi Sandhya and Vladimir K.,
>>>>>
>>>>> On Wed, 10 Apr 2019 at 03:06, Viswanathan, Sandhya
>>>>> wrote:
>>>>>>
>>>>>> Hi Vladimir,
>>>>>>
>>>>>> Yes, I missed the question below:
>>>>>>>> There are cases where we can use less `TEMP tmp` registers by
>>>>>>>> using 'dst' register like in mul4B_reg(). Is it intentional to
>>>>>>>> not use 'dst' there?
>>>>>>
>>>>>> No it is not intentional, we can use the dst register in those
>>>>>> cases and reduced the tmps.
>>>>>
>>>>> I guess we have to be careful using $dst instead of $tmp registers
>>>>> as the allocator sometimes provides identical $src & $dst. Also,
>>>>> I'm not sure this would be possible in the case of mul4B_reg():
>>>>>
>>>>> 7349   format %{"pmovsxbw  $tmp,$src1\n\t"
>>>>> 7350            "pmovsxbw  $tmp2,$src2\n\t"
>>>>>
>>>>> I believe this couldn't work if you use $dst instead of $tmp and
>>>>> $dst = $src2, what do you think?
>>>>>
>>>>> Thanks,
>>>>> Bernard
>>>>>

From sandhya.viswanathan at intel.com Tue May 7 18:16:28 2019
From: sandhya.viswanathan at intel.com (Viswanathan, Sandhya)
Date: Tue, 7 May 2019 18:16:28 +0000
Subject: RFR (M) 8222074: Enhance auto vectorization for x86
In-Reply-To:
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A99813@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB845@FMSMSX126.amr.corp.intel.com> <21eeec09-624f-2dbd-b2f5-86d512233fe0@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AAB898@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABCE7@FMSMSX126.amr.corp.intel.com> <4a77b7c0-fc1a-441c-d018-70568876c4f4@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AABDA2@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB5094@FMSMSX126.amr.corp.intel.com> <0cd3fd93-0f1e-a6d0-d4c3-f8d95b533ff7@oracle.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB56B1@FMSMSX126.amr.corp.intel.com> <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB7472@FMSMSX126.amr.corp.intel.com>
Message-ID: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1AB8201@FMSMSX126.amr.corp.intel.com>

Yes, handling that as a separate RFE would be good.
Let me look through the existing microbenchmarks in mainline and the ones in
the Panama Vector API to identify which way to go.
Best Regards, Sandhya -----Original Message----- From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] Sent: Tuesday, May 07, 2019 10:41 AM To: Eric Caspole ; hotspot-compiler-dev at openjdk.java.net; Viswanathan, Sandhya Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 Good point, Eric. I agree that having microbenchmarks accompanying such changes would be very valuable. There's a rich set of microbenchmarks in Project Panama for Vector API and some of them target auto-vectorization case as well. Sandya, can you share the plans, please, regarding contributing relevant ones to the mainline? I'm fine with handling that as a separate RFE. Best regards, Vladimir Ivanov On 06/05/2019 07:21, Eric Caspole wrote: > Hi Sandhya, > Could add some new JMH to this webrev that target the java code that > show the benefit of these changes? Or, you could look through the > existing ones in > ?test/micro/org/openjdk/bench/ > > and mention in the bug which existing ones exercise these changes. That > will be a big help to us in the course of working on JDK 13. > Thanks, > Eric > > > > On 5/3/19 19:02, Viswanathan, Sandhya wrote: >> Hi Vladimir, >> >> Please find below the updated webrev which implements all your inputs: >> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.02/ >> >> Looking forward to your feedback. >> >> Best Regards, >> Sandhya >> >> >> -----Original Message----- >> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] >> Sent: Wednesday, May 01, 2019 5:09 PM >> To: Viswanathan, Sandhya ; Vladimir >> Kozlov >> Cc: hotspot-compiler-dev at openjdk.java.net >> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >> >> Sounds good, thanks! >> >> Best regards, >> Vladimir Ivanov >> >> On 01/05/2019 15:16, Viswanathan, Sandhya wrote: >>> I should add here that your suggestion of adding generic shift >>> instruction etc to the macroAssembler is also wonderful instead of >>> function pointer.? 
I will look into making that change as well. >>> >>> Best Regards, >>> Sandhya >>> >>> >>> -----Original Message----- >>> From: Viswanathan, Sandhya >>> Sent: Wednesday, May 01, 2019 3:10 PM >>> To: 'Vladimir Ivanov' ; Vladimir Kozlov >>> >>> Cc: hotspot-compiler-dev at openjdk.java.net >>> Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86 >>> >>> Hi Vladimir, >>> >>> I agree, I wanted to show both the approaches in this patch to get >>> your feedback: >>> 1) with emit as a function >>> 2) with emit part in the instruct body itself >>> >>> With emit as a function it becomes hard to read and I personally >>> prefer it in the instruct itself as is done for vabsneg2D etc. That >>> is what you are recommending as well so I feel good. >>> >>> Once the adlc enhancement is done both the approaches should give >>> similar binary size. Till then there will be small overhead with >>> approach 2) as emit is duplicated per match rule. >>> >>> I will send an updated patch fixing the two issues you mentioned in >>> your previous email plus this change of using approach 2). >>> >>> Please do let me know if you want to see any other change in this patch. >>> >>> Best Regards, >>> Sandhya >>> >>> >>> >>> -----Original Message----- >>> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] >>> Sent: Wednesday, May 01, 2019 2:58 PM >>> To: Viswanathan, Sandhya ; Vladimir >>> Kozlov >>> Cc: hotspot-compiler-dev at openjdk.java.net >>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >>> >>> >>>> http://cr.openjdk.java.net/~sviswanathan/8222074/webrev.01/ >>> >>> Nice job, Sandhya! Glad to hear the approach pays off! >>> >>> Unfortunately, I must note that AD file becomes much more obscure. >>> Especially with those function pointers. >>> >>> 1528 void emit_vshift16B_code(MacroAssembler& _masm, int opcode, >>> XMMRegister dst, >>> 1529???????????????????????? XMMRegister src, XMMRegister shift, >>> 1530???????????????????????? 
XMMRegister tmp1, XMMRegister tmp2, >>> Register scratch) { >>> 1531?? XX_Inst extendinst = get_extend_inst(opcode == Op_URShiftVB ? >>> false : true); >>> 1532?? XX_Inst shiftinst = get_xx_inst(opcode); >>> 1533 >>> 1534?? (_masm.*extendinst)(tmp1, src); >>> 1535?? (_masm.*shiftinst)(tmp1, shift); >>> 1536?? __ pshufd(tmp2, src, 0xE); >>> 1537?? (_masm.*extendinst)(tmp2, tmp2); >>> 1538?? (_masm.*shiftinst)(tmp2, shift); >>> 1539?? __ movdqu(dst, ExternalAddress(vector_short_to_byte_mask()), >>> scratch); >>> 1540?? __ pand(tmp2, dst); >>> 1541?? __ pand(dst, tmp1); >>> 1542?? __ packuswb(dst, tmp2); >>> 1543 } >>> >>> Have you tried to encapsulate that into x86-specific MacroAssembler? >>> >>> 8682 instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, >>> vecX tmp2, rRegI scratch) %{ >>> 8683?? predicate(UseSSE > 3? && UseAVX <= 1 && n->as_Vector()->length() >>> == 16); >>> 8684?? match(Set dst (LShiftVB src shift)); >>> 8685?? match(Set dst (RShiftVB src shift)); >>> 8686?? match(Set dst (URShiftVB src shift)); >>> 8687?? effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch); >>> 8688?? format %{"pmovxbw?? $tmp1,$src\n\t" >>> 8689??????????? "shiftop?? $tmp1,$shift\n\t" >>> 8690??????????? "pshufd??? $tmp2,$src\n\t" >>> 8691??????????? "pmovxbw?? $tmp2,$tmp2\n\t" >>> 8692??????????? "shiftop?? $tmp2,$shift\n\t" >>> 8693??????????? "movdqu??? $dst,[0x00ff00ff0x00ff00ff]\n\t" >>> 8694??????????? "pand????? $tmp2,$dst\n\t" >>> 8695??????????? "pand????? $dst,$tmp1\n\t" >>> 8696??????????? "packuswb? $dst,$tmp2\n\t! packed16B shift" %} >>> 8697?? ins_encode %{ >>> 8698???? emit_vshift16B_code(_masm, this->as_Mach()->ideal_Opcode() , >>> $dst$$XMMRegister, $src$$XMMRegister, $shift$$XMMRegister, >>> $tmp1$$XMMRegister, $tmp2$$XMMRegister, $scratch$$Register); >>> 8699?? %} >>> 8700?? 
ins_pipe( pipe_slow ); >>> 8701 %} >>> >>> can be turned into something like: >>> >>> instruct vshift16B(vecX dst, vecX src, vecS shift, vecX tmp1, vecX >>> tmp2, rRegI scratch) %{ >>> ???? predicate(n->as_Vector()->length() == 16); >>> ???? match(Set dst (LShiftVB src shift)); >>> ???? match(Set dst (RShiftVB src shift)); >>> ???? match(Set dst (URShiftVB src shift)); >>> ???? effect(TEMP dst, TEMP tmp1, TEMP tmp2, TEMP scratch); >>> ???? format %{"packed16B shift" %} >>> ???? ins_encode %{ >>> ?????? int vlen = 0; // 128-bit >>> ?????? BasicType elem_type = T_BYTE; >>> ?????? int shift_mode = ...; // L/R/UR or S/U + L/R >>> ?????? __ vshift(vlen, elem_type, shift_mode, >>> ???????????????? $dst$$..., $src$$..., $shift$$..., >>> ????????? $tmp1$$..., $tmp2$$..., $scratch$$...); >>> ????? %} >>> >>> Then MA::vshift can dispatch between different implementations >>> depending on SSE/AVX level available. Do you see any problems with >>> that from footprint perspective? >>> >>> Ideally, I'd prefer to see a library of operations on vectors >>> encapsulated in MacroAssembler (or a subclass) and used in x86.ad. >>> That will accommodate further reductions in AD instructions needed. >>> >>> Best regards, >>> Vladimir Ivanov >>> >>>> With this webrev the ad file has only about 60 lines effectively added. >>>> Also the generated product libjvm.so size only increases by about >>>> 0.26% vs the prior 1.50%. >>>> I have used multiple match rules in one instruct for same size shift >>>> related rules and also for the new Abs/Neg rules. >>>> What I noticed is that the adlc still duplicates lot of code and >>>> there is potential to further improve code size for multiple match >>>> rule case by improving the adlc itself. >>>> The adlc improvement (like removing duplicate emits, formats, >>>> expand, pipeline etc) can be done as a separate RFE. >>>> In this webrev, I have also fixed the errors reported by Vladimir >>>> Ivanov and corrected the issues reported by jcheck tool. 
>>>> Also taken into account reducing the temporary by using TEMP dst for >>>> multiply rules. >>>> >>>> The compiler jtreg tests and the java math tests pass on Haswell, >>>> SKX, and KNL. >>>> >>>> Your review and feedback is welcome. >>>> >>>> Best Regards, >>>> Sandhya >>>> >>>> >>>> -----Original Message----- >>>> From: hotspot-compiler-dev >>>> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of >>>> Viswanathan, Sandhya >>>> Sent: Wednesday, April 10, 2019 10:22 AM >>>> To: Vladimir Kozlov ; B. Blaser >>>> >>>> Cc: hotspot-compiler-dev at openjdk.java.net >>>> Subject: RE: RFR (M) 8222074: Enhance auto vectorization for x86 >>>> >>>> Yes good catch, in mul32B_reg_avx(), the last two instructions are >>>> the only place where dst is used: >>>> >>>> ?????? __ vpackuswb($dst$$XMMRegister, $tmp2$$XMMRegister, >>>> $tmp1$$XMMRegister, vector_len); >>>> ?????? __ vpermq($dst$$XMMRegister, $dst$$XMMRegister, 0xD8, >>>> vector_len); >>>> >>>> Here dst can be same as tmp2 or tmp1 in packuswb() and so the effect >>>> TEMP dst is not required. >>>> >>>> Best Regards, >>>> Sandhya >>>> >>>> >>>> -----Original Message----- >>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>>> Sent: Wednesday, April 10, 2019 9:59 AM >>>> To: Viswanathan, Sandhya ; B. Blaser >>>> >>>> Cc: hotspot-compiler-dev at openjdk.java.net >>>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86 >>>> >>>> On 4/10/19 8:36 AM, Viswanathan, Sandhya wrote: >>>>> Hi Bernard, >>>>> >>>>> One could add TEMP dst in effect() to let the register allocator >>>>> know that dst needs to be different from src. >>>> >>>> Yes, we use this way. Or, in mul4B_reg() case, we can use $dst instead >>>> $tmp2 to avoid overwriting >>>> $src2 before we get value from it if $dst = $src2. >>>> >>>> On other hand, mul32B_reg_avx() and other have 'TEMP dst' effect but >>>> $dst is used only for final result. 
>>>>
>>>> It is a little mess which may cause ineffective use of registers in
>>>> compiled code.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>>>
>>>>> Best Regards,
>>>>> Sandhya
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: B. Blaser [mailto:bsrbnd at gmail.com]
>>>>> Sent: Wednesday, April 10, 2019 4:10 AM
>>>>> To: Viswanathan, Sandhya
>>>>> Cc: Vladimir Kozlov ;
>>>>> hotspot-compiler-dev at openjdk.java.net
>>>>> Subject: Re: RFR (M) 8222074: Enhance auto vectorization for x86
>>>>>
>>>>> Hi Sandhya and Vladimir K.,
>>>>>
>>>>> On Wed, 10 Apr 2019 at 03:06, Viswanathan, Sandhya
>>>>> wrote:
>>>>>>
>>>>>> Hi Vladimir,
>>>>>>
>>>>>> Yes, I missed the question below:
>>>>>>>> There are cases where we can use less `TEMP tmp` registers by
>>>>>>>> using 'dst' register like in mul4B_reg(). Is it intentional to
>>>>>>>> not use 'dst' there?
>>>>>>
>>>>>> No it is not intentional, we can use the dst register in those
>>>>>> cases and reduced the tmps.
>>>>>
>>>>> I guess we have to be careful using $dst instead of $tmp registers
>>>>> as the allocator sometimes provides identical $src & $dst. Also,
>>>>> I'm not sure this would be possible in the case of mul4B_reg():
>>>>>
>>>>> 7349   format %{"pmovsxbw  $tmp,$src1\n\t"
>>>>> 7350            "pmovsxbw  $tmp2,$src2\n\t"
>>>>>
>>>>> I believe this couldn't work if you use $dst instead of $tmp and
>>>>> $dst = $src2, what do you think?
>>>>>
>>>>> Thanks,
>>>>> Bernard
>>>>>
From sergey.kuksenko at oracle.com Tue May 7 18:39:50 2019
From: sergey.kuksenko at oracle.com (Sergey Kuksenko)
Date: Tue, 7 May 2019 11:39:50 -0700
Subject: RFR: 8223504: improve performance of forall loops by better inlining of "iterator()" methods.
Message-ID: <58486996-d7da-30ab-77c2-b590395423c2@oracle.com> Hi All, I would like to ask for review the following change/update: https://bugs.openjdk.java.net/browse/JDK-8223504 http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.00/ Previous email discussion: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-February/032623.html Obtained performance improvements of forall loops: Maps (iterating entries, values, keys): - HashMap - 2x times speedup - TreeMap - +15% - ConcurrentHashMap - +16% Collections: - HashSet - +13% - ConcurrentLinkedQueue - +27% - LinkedList - +8% From shade at redhat.com Tue May 7 18:56:29 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 7 May 2019 20:56:29 +0200 Subject: RFR: 8223504: improve performance of forall loops by better inlining of "iterator()" methods. In-Reply-To: <58486996-d7da-30ab-77c2-b590395423c2@oracle.com> References: <58486996-d7da-30ab-77c2-b590395423c2@oracle.com> Message-ID: <92d61151-97ac-565a-1bfe-d25dd5ea1048@redhat.com> On 5/7/19 8:39 PM, Sergey Kuksenko wrote: > Hi All, > > I would like to ask for review the following change/update: > > https://bugs.openjdk.java.net/browse/JDK-8223504 > > http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.00/ The idea sounds fine. Nits (the usual drill): *) Copyright years need to be updated, at least in bytecodeInfo.cpp *) Do we need to put Iterator_klass initialization this early in WK_KLASSES_DO? It feels safer to initialize it at the end, to avoid surprising bootstrap issues. *) Backslash indent is off here in vmSymbols.hpp: 129 template(java_util_Iterator, "java/util/Iterator") \ *) Space after "if"? Also, I think you can use ciType::is_subtype_of instead here. Plus, since you declared iterator in WK klasses, SystemDictionary::Iterator_klass() should be available. 
100 if(retType->is_klass() && retType->as_klass()->is_subtype_of(C->env()->Iterator_klass())) { -- Thanks, -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From patricio.chilano.mateo at oracle.com Wed May 8 03:06:11 2019 From: patricio.chilano.mateo at oracle.com (Patricio Chilano) Date: Tue, 7 May 2019 23:06:11 -0400 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: <0fe2590d-a41a-0554-b9b2-efc3bb644684@oracle.com> References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <64a8afca-9dc8-b119-0a12-dd05799bdd22@oracle.com> <259f2edc-a842-8f14-39d6-74eb47a2964c@oracle.com> <0fe2590d-a41a-0554-b9b2-efc3bb644684@oracle.com> Message-ID: <3e3db7fd-ccc9-bf4a-3fdf-26f9c479a443@oracle.com> Hi Robbin, On 5/7/19 10:26 AM, Robbin Ehn wrote: > Hi Patricio, > > On 5/6/19 6:10 PM, Patricio Chilano wrote: >> Hi Robbin, >> >> I'm going to just review the biased locking part since I'm not really >> familiar with the rest of the code. >> >> In BiasedLocking::revoke_and_rebias_in_handshake(), why do you need >> to execute fast_revoke(obj, false)? If these are objects locked by >> the JavaThread you are handshaking then it seems they should be >> normal locks (no bias pattern) or the condition >> (mark->biased_locker() == THREAD && prototype_header->bias_epoch() == >> mark->bias_epoch()) you are testing for later should hold. Then that >> would save the extra comparisons in fast_revoke(). > > I tried to have as little changes possible. What you are saying are > true for > current code also. Since often there are no locks on stack, the biggest > over-head is walking the entire stack, twice if a lock is found. A > couple of > comparison doesn't really matter here. 
> I wanted the biased code to be the same as before, since this is not
> supposed to change anything there, but we needed to avoid triggering
> a VM_BulkRevokeBias, so I avoided the update_heuristics.

Ok, maybe we can simplify it later then.

>> Also instead of placing the condition (mark->biased_locker() ==
>> THREAD && prototype_header->bias_epoch() == mark->bias_epoch())
>> inside an if() and then later use a ShouldNotReachHere(), wouldn't it
>> be better to make that an assertion, place that code outside the if()
>> and remove the ShouldNotReachHere()?
>
> If we failed here (hardware fault or so), you could lose your
> lock. I'd rather have a crash in the release build than have to
> try debugging an application that sometimes spontaneously drops a lock,
> so I'd like to keep a hard crash.
>
> In your suggestion we should have a guarantee, e.g:
> +  guarantee(mark->biased_locker() == THREAD &&
> +            prototype_header->bias_epoch() == mark->bias_epoch(),
> "Revoke failed, unhandled biased lock state");
> +  ResourceMark rm;
> +  log_info(biasedlocking)("Revoking bias by walking my own stack:");
> ...
>
> Seems good?

Using guarantee() sounds good.

>> For the execution of revoke_bias() inside
>> BiasedLocking::revoke_and_rebias_in_handshake() you could use a
>> shorter version of BiasedLocking::revoke_and_rebias() that avoids the
>> extra comparisons made for the general case and just starts at the
>> walking the stack part, but I'm actually doing that for 8191890 so I
>> can merge that with my patch.
>
> Yes, ok, good.
>
>>
>> In deoptimization.cpp you have methods inflate_monitors() and
>> inflate_monitors_handshake(), but in inflate_monitors() you are not
>> inflating the monitors, you just revoke the ones that have bias. You
>> mentioned in your first email that we need to inflate if we are not
>> at a safepoint, why is that? Since revocation seems to be the common
>> factor between those methods, maybe s/inflate/revoke is a better name?
>>
>
> I looked this over, we actually always only need to revoke.
> The JavaThread will inflate, if needed, on it's way to the interpreter
> when
> unpacking the stack. (since we are inflating the stack, we must move
> the stack
> locks which can require an inflation)
> I renamed methods and remove early inflation.
>
> I'll send an update to RFR mail.

Thanks!

Patricio

> Thanks, Robbin
>
>>
>> Thanks!
>> Patricio
>>
>>
>> On 5/6/19 4:42 AM, Robbin Ehn wrote:
>>> Hi Dan,
>>>
>>>> src/hotspot/share/runtime/biasedLocking.cpp
>>>>      nit - Please update copyright year for this file.
>>>>
>>>
>>> Updated in 8220724.
>>>
>>>>      Nice refactoring into more readable chunks! I'm assuming that
>>>>      Patricio is also reviewing these changes...
>>>
>>> Great, good!
>>>
>>>> src/hotspot/share/runtime/deoptimization.cpp
>>>>      L778:  bool _in_handshake;
>>>>          nit - needs one more space of indent.
>>>
>>> Fixed.
>>>
>>>>
>>>>      Nice refactoring while adding in the handshake support.
>>>
>>> Great!
>>>
>>>>
>>>> src/hotspot/share/runtime/deoptimization.hpp
>>>>      L147:  public:
>>>>      L148:
>>>>      L149:   // Deoptimizes a frame lazily. nmethod gets patched
>>>> deopt happens on return to the frame
>>>>      L163:   static void fix_monitors(JavaThread* thread, frame fr,
>>>> RegisterMap* map)
>>>>          Style nit: I would put the blank line on L148 above L147.
>>>
>>> Fixed.
>>>
>>>>
>>>>      L164:     { inflate_monitors(thread, fr, map); }
>>>>          Style nit: Should be:
>>>>
>>>>              static void fix_monitors(JavaThread* thread, frame fr,
>>>> RegisterMap* map) {
>>>>                inflate_monitors(thread, fr, map);
>>>>              }
>>>
>>> Fixed.
>>>
>>>> src/hotspot/share/runtime/mutexLocker.cpp
>>>>      No comments. (So OsrList_lock is now 'special-1' instead of
>>>> 'leaf'.
>>>>      I presume the Compiler team is okay with that...
>>>
>>> Since need we hold CodeCache_lock while iterating nmethods, all
>>> locks that might be taken needed to be pushed down under
>>> CodeCache_lock.
>>> So I hope they are okay with that.
>>>
>>>> Thumbs up!  I don't need to see a webrev if you fix the nits...
>>>
>>> Thanks Dan! Fixed!
>>>
>>> I did t6-7 over the weekend, no issues found.
>>>
>>> /Robbin
>>>
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> # Note
>>>>> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html
>>>>> line 630
>>>>> This is revert to the original, I accidental had left in a
>>>>> temporary test change, as you can see here in full diff:
>>>>> http://cr.openjdk.java.net/~rehn/8221734/v2/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html
>>>>>
>>>>>
>>>>> I think I manage to address all review comments.
>>>>>
>>>>> Dean can you please cast an extra eye on:
>>>>> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/oops/method.cpp.sdiff.html
>>>>>
>>>>> This OR should be correct.
>>>>>
>>>>> Dan please do the same on the biased locking changes.
>>>>>
>>>>> I left out the merge with MutexLocker changes, since it was not
>>>>> interesting.
>>>>> There were some conflicts with JVMCI changes, so incremental
>>>>> contains some parts of that merge.
>>>>>
>>>>> Passes t1-5 and local testing.
>>>>> I'll continue with some additional testing.
>>>>>
>>>>> Thanks, Robbin
>>>>>
>>>>> On 4/25/19 2:05 PM, Robbin Ehn wrote:
>>>>>> Hi all, please review.
>>>>>>
>>>>>> Let's deopt with handshakes.
>>>>>> Removed VM op Deoptimize, instead we handshake.
>>>>>> Locks needs to be inflate since we are not in a safepoint.
>>>>>>
>>>>>> Goes on top of:
>>>>>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html
>>>>>>
>>>>>>
>>>>>> Code:
>>>>>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html
>>>>>> Issue:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8221734
>>>>>>
>>>>>> Passes t1-7 and multiple t1-5 runs.
>>>>>>
>>>>>> A few startup benchmark see a small speedup.
>>>>>>
>>>>>> Thanks, Robbin
>>>>
>>
From dean.long at oracle.com Wed May 8 06:59:23 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Tue, 7 May 2019 23:59:23 -0700
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To: <64a8afca-9dc8-b119-0a12-dd05799bdd22@oracle.com>
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <64a8afca-9dc8-b119-0a12-dd05799bdd22@oracle.com>
Message-ID:

On 5/3/19 3:31 AM, Robbin Ehn wrote:
> Dean can you please cast an extra eye on:
> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/oops/method.cpp.sdiff.html
>
> This OR should be correct.

Looks good.

I also noticed that nmethod::make_unloaded calling unlink_code is still doing an extra

    || method->from_compiled_entry() == compare->verified_entry_point()

that wasn't there before. However, this seems like an improvement that could be fixing a latent bug.

dl
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From robbin.ehn at oracle.com Wed May 8 09:22:26 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 8 May 2019 11:22:26 +0200
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To:
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <64a8afca-9dc8-b119-0a12-dd05799bdd22@oracle.com>
Message-ID: <5d989ff4-9724-0b01-ae5c-545163b7b446@oracle.com>

Hi Dean,

Good, thanks!
/Robbin

On 2019-05-08 08:59, dean.long at oracle.com wrote:
> On 5/3/19 3:31 AM, Robbin Ehn wrote:
>> Dean can you please cast an extra eye on:
>> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/oops/method.cpp.sdiff.html
>>
>> This OR should be correct.
>
> Looks good.
>
> I also noticed that nmethod::make_unloaded calling unlink_code is still doing an
> extra
>
> || method->from_compiled_entry() == compare->verified_entry_point()
>
>
> that wasn't there before. However, this seems like an improvement that could be
> fixing a latent bug.
>
> dl
From Pengfei.Li at arm.com Wed May 8 10:15:21 2019
From: Pengfei.Li at arm.com (Pengfei Li (Arm Technology China))
Date: Wed, 8 May 2019 10:15:21 +0000
Subject: RFR(T): 8223427: [TESTBUG] Disable JTReg Shenandoah tests when Graal is enabled
Message-ID:

Hi,

Could anyone help review this patch (tiny change on 75 test files)?

Webrev: http://cr.openjdk.java.net/~pli/rfr/8223427/webrev.00/
JBS: https://bugs.openjdk.java.net/browse/JDK-8223427

JVMCI compiler supports only a few OpenJDK GCs (only Serial, Parallel, ParallelOld and G1). In current jtreg cases, when Graal is enabled, other GC test cases are skipped except Shenandoah. We should add "!vm.graal.enabled" together with the "@requires vm.gc.Shenandoah" annotation in gc test cases to skip Shenandoah as well.

--
Thanks,
Pengfei
From rahul.v.raghavan at oracle.com Wed May 8 11:08:11 2019
From: rahul.v.raghavan at oracle.com (Rahul Raghavan)
Date: Wed, 8 May 2019 16:38:11 +0530
Subject: [13] RFR: 8223445: compiler/intrinsics/mathexact/LongMulOverflowTest.java java timeout
Message-ID:

Hi,

please review the following patch:
http://cr.openjdk.java.net/~rraghavan/8223445/webrev.00/

Related JBS links:
https://bugs.openjdk.java.net/browse/JDK-8223445
https://bugs.openjdk.java.net/browse/JDK-8207267

Similar to 8222417, 8222418 cases problem listing this test also with Graal, as it timeout with '-Djvmci.Compiler=graal -Xcomp -XX:-TieredCompilation' run.
Thanks, Rahul From shade at redhat.com Wed May 8 11:16:23 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 8 May 2019 13:16:23 +0200 Subject: RFR(T): 8223427: [TESTBUG] Disable JTReg Shenandoah tests when Graal is enabled In-Reply-To: References: Message-ID: On 5/8/19 12:15 PM, Pengfei Li (Arm Technology China) wrote: > Could anyone help review this patch (tiny change on 75 test files)? > > Webrev: http://cr.openjdk.java.net/~pli/rfr/8223427/webrev.00/ > JBS: https://bugs.openjdk.java.net/browse/JDK-8223427 Looks good. We already have a few tests that do "@requires vm.gc.Shenandoah & !vm.graal.enabled". -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From tobias.hartmann at oracle.com Wed May 8 12:24:11 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 8 May 2019 14:24:11 +0200 Subject: [13] RFR: 8223445: compiler/intrinsics/mathexact/LongMulOverflowTest.java java timeout In-Reply-To: References: Message-ID: Hi Rahul, looks good and trivial. Best regards, Tobias On 08.05.19 13:08, Rahul Raghavan wrote: > Hi, > > please review the following patch: > http://cr.openjdk.java.net/~rraghavan/8223445/webrev.00/ > > Related JBS links: > https://bugs.openjdk.java.net/browse/JDK-8223445 > https://bugs.openjdk.java.net/browse/JDK-8207267 > > Similar to 8222417, 8222418 cases problem listing this test also with Graal, as it timeout with > '-Djvmci.Compiler=graal -Xcomp -XX:-TieredCompilation' run. > > Thanks, > Rahul From rahul.v.raghavan at oracle.com Wed May 8 12:26:32 2019 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Wed, 8 May 2019 17:56:32 +0530 Subject: [13] RFR: 8223445: compiler/intrinsics/mathexact/LongMulOverflowTest.java java timeout In-Reply-To: References: Message-ID: Thank you Tobias. On 08/05/19 5:54 PM, Tobias Hartmann wrote: > Hi Rahul, > > looks good and trivial. 
>
> Best regards,
> Tobias
>
> On 08.05.19 13:08, Rahul Raghavan wrote:
>> Hi,
>>
>> please review the following patch:
>> http://cr.openjdk.java.net/~rraghavan/8223445/webrev.00/
>>
>> Related JBS links:
>> https://bugs.openjdk.java.net/browse/JDK-8223445
>> https://bugs.openjdk.java.net/browse/JDK-8207267
>>
>> Similar to 8222417, 8222418 cases problem listing this test also with Graal, as it timeout with
>> '-Djvmci.Compiler=graal -Xcomp -XX:-TieredCompilation' run.
>>
>> Thanks,
>> Rahul
From lutz.schmidt at sap.com Wed May 8 15:31:04 2019
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Wed, 8 May 2019 15:31:04 +0000
Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output
Message-ID: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com>

Dear Community,

may I please request comments and reviews for this change? Thank you!

I have created a new webrev which is based on the current jdk/jdk repo. There was some merge effort. The code which constitutes this patch was not changed. Here's the webrev link: https://cr.openjdk.java.net/~lucy/webrevs/8213084.01/

Regards,
Lutz

On 11.04.19, 23:24, "Schmidt, Lutz" wrote:

Dear All,

this topic was discussed back in Nov/Dec 2018:
http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-November/031552.html
http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-December/031641.html

Purpose of the discussion was to find out if my ideas are at all regarded useful and desirable. The result was mixed, some pro, some con. I let the input from back then influence my work of the last months. In particular, output verbosity can be controlled in a wide range now. In addition to the general -XX:+Print* switches, the amount of output can be adjusted by newly introduced -XX:PrintAssemblyOptions.
Here is the list (with default settings):

PrintAssemblyOptions help:
  hsdis-print-raw           test plugin by requesting raw output (deprecated)
  hsdis-print-raw-xml       test plugin by requesting raw xml (deprecated)
  hsdis-print-pc            turn off PC printing (on by default) (deprecated)
  hsdis-print-bytes         turn on instruction byte output (deprecated)
  hsdis-show-pc             toggle printing current pc, currently ON
  hsdis-show-offset         toggle printing current offset, currently OFF
  hsdis-show-bytes          toggle printing instruction bytes, currently OFF
  hsdis-show-data-hex       toggle formatting data as hex, currently ON
  hsdis-show-data-int       toggle formatting data as int, currently OFF
  hsdis-show-data-float     toggle formatting data as float, currently OFF
  hsdis-show-structs        toggle compiler data structures, currently OFF
  hsdis-show-comment        toggle instruction comments, currently OFF
  hsdis-show-block-comment  toggle block comments, currently OFF
  hsdis-align-instr         toggle instruction alignment, currently OFF

Finally, I have pushed my changes to a state where I can dare to request your comments and reviews. I would like to suggest and request that we first focus on the effects (i.e. the generated output) of the changes. Once we got that adjusted and accepted, we can check the actual implementation and add improvements there. Sounds like a plan?

Here is what you get:

The machine code generated by the JVM can be printed in three different formats:

- Hexadecimal.
  This is basically a hex dump of the memory range containing the code. This format is always available (PRODUCT and not-PRODUCT builds), regardless of the availability of a disassembler library. It applies to all sorts of code, be it blobs, stubs, compiled nmethods, ... This format seems useless at first glance, but it is not. In an upcoming, separate enhancement, the JVM will be made capable of reading files containing such code blocks and disassembling them post mortem. The most prominent example is an hs_err* file.
- Disassembled.
  This is an assembly listing of the instructions as found in the memory range occupied by the blob, stub, compiled nmethod ... As a prerequisite, a suitable disassembler library (hsdis-.so) must be available at runtime. Most often, that will only be the case in test environments. If no disassembler library is available, hexadecimal output is used as fallback.
- OptoAssembly.
  This is a meta code listing created only by the C2 compiler. As it is somewhat closer to the Java code, it may be helpful in linking assembly code to Java code.

All three formats can be merged with additional information, most prominently compiler-internal "knowledge" about blocks, related bytecodes, statistics counters, and much more.

Following the code itself, compiler-internal data structures, like oop maps, relocations, scopes, dependencies, exception handlers, are printed to aid in debugging.

The full set of information is available in non-PRODUCT builds. PRODUCT builds do not support OptoAssembly output. Data structures are unavailable as well.

So how does the output actually look like? Here are a few small snippets (linuxx86_64) to give you an idea. The complete output of an entire C2-compiled method, in multiple verbosity variants, is available here:
http://cr.openjdk.java.net/~lucy/webrevs/8213084/

OptoAssembly output for reference (always on with PrintAssembly):
=================================================================
036   B2: #     out( B7 B3 ) <- in( B1 )  Freq: 1
036     movl    RBP, [RSI + #12 (8-bit)]    # compressed ptr !
Field: java/lang/String.value (constant)
039     movl    R11, [RBP + #12 (8-bit)]    # range
03d     NullCheck RBP
03d   B3: #     out( B6 B4 ) <- in( B2 )  Freq: 0.999999
03d     cmpl    RDX, R11    # unsigned
040     jnb,us  B6  P=0.000000 C=5375.000000

PrintAssembly with no disassembler library available:
=====================================================
[Code]
[Entry Point]
  0x00007fc74d1d7b20: 448b 5608 49c1 e203 493b c20f 856f 69e7 ff90 9090 9090 9090 9090 9090 9090 9090
[Verified Entry Point]
  0x00007fc74d1d7b40: 8984 2400 a0fe ff55 4883 ec20 440f be5e 1445 85db 7521 8b6e 0c44 8b5d 0c41 3bd3
  0x00007fc74d1d7b60: 732c 0fb6 4415 1048 83c4 205d 4d8b 9728 0100 0041 8502 c348 8bee 8914 2444 895c
  0x00007fc74d1d7b80: 2404 be4d ffff ffe8 1483 e7ff 0f0b bee5 ffff ff89 5424 04e8 0483 e7ff 0f0b bef6
  0x00007fc74d1d7ba0: ffff ff89 5424 04e8 f482 e7ff 0f0b f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4
[Exception Handler]
  0x00007fc74d1d7bc0: e95b 0df5 ffe8 0000 0000 4883 2c24 05e9 0c7d e7ff
[End]

PrintAssembly with minimal verbosity:
=====================================
  0x00007f0434b89bd6: mov 0xc(%rsi),%ebp
  0x00007f0434b89bd9: mov 0xc(%rbp),%r11d
  0x00007f0434b89bdd: cmp %r11d,%edx
  0x00007f0434b89be0: jae 0x00007f0434b89c0e

PrintAssembly (previous plus code offsets from code begin):
===========================================================
  0x00007f63c11d7956 (+0x36): mov 0xc(%rsi),%ebp
  0x00007f63c11d7959 (+0x39): mov 0xc(%rbp),%r11d
  0x00007f63c11d795d (+0x3d): cmp %r11d,%edx
  0x00007f63c11d7960 (+0x40): jae 0x00007f63c11d798e

PrintAssembly (previous plus block comments):
===========================================================
  ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1
  0x00007f48211d76d6 (+0x36): mov 0xc(%rsi),%ebp
  0x00007f48211d76d9 (+0x39): mov 0xc(%rbp),%r11d
  ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999
  0x00007f48211d76dd (+0x3d): cmp %r11d,%edx
  0x00007f48211d76e0 (+0x40): jae 0x00007f48211d770e

PrintAssembly (previous plus instruction comments):
===========================================================
  ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1
  0x00007fc3e11d7a56 (+0x36): mov 0xc(%rsi),%ebp   ;*getfield value {reexecute=0 rethrow=0 return_oop=0}
                                                   ; - java.lang.String::charAt@8 (line 702)
  0x00007fc3e11d7a59 (+0x39): mov 0xc(%rbp),%r11d  ; implicit exception: dispatches to 0x00007fc3e11d7a9e
  ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999
  0x00007fc3e11d7a5d (+0x3d): cmp %r11d,%edx
  0x00007fc3e11d7a60 (+0x40): jae 0x00007fc3e11d7a8e

For completeness, here are the links to
Bug: https://bugs.openjdk.java.net/browse/JDK-8213084
Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8213084.00/

But please, as mentioned above, first focus on the output. The nitty details of the implementation I would like to discuss after the output format has received some support.

Thank you so much for your time!
Lutz
From sergey.kuksenko at oracle.com Wed May 8 20:18:26 2019
From: sergey.kuksenko at oracle.com (Sergey Kuksenko)
Date: Wed, 8 May 2019 13:18:26 -0700
Subject: RFR: 8223504: improve performance of forall loops by better inlining of "iterator()" methods.
In-Reply-To: <92d61151-97ac-565a-1bfe-d25dd5ea1048@redhat.com>
References: <58486996-d7da-30ab-77c2-b590395423c2@oracle.com> <92d61151-97ac-565a-1bfe-d25dd5ea1048@redhat.com>
Message-ID:

Updated: http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.01/

On 5/7/19 11:56 AM, Aleksey Shipilev wrote:
> On 5/7/19 8:39 PM, Sergey Kuksenko wrote:
>> Hi All,
>>
>> I would like to ask for review the following change/update:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8223504
>>
>> http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.00/
> The idea sounds fine.
>
> Nits (the usual drill):
>
> *) Copyright years need to be updated, at least in bytecodeInfo.cpp
>
> *) Do we need to put Iterator_klass initialization this early in WK_KLASSES_DO? It feels safer to
> initialize it at the end, to avoid surprising bootstrap issues.
> > *) Backslash indent is off here in vmSymbols.hpp: > > 129 template(java_util_Iterator, "java/util/Iterator") \ > > *) Space after "if"? Also, I think you can use ciType::is_subtype_of instead here. Plus, since you > declared iterator in WK klasses, SystemDictionary::Iterator_klass() should be available. > > 100 if(retType->is_klass() && retType->as_klass()->is_subtype_of(C->env()->Iterator_klass())) { > From shade at redhat.com Wed May 8 21:09:05 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 8 May 2019 23:09:05 +0200 Subject: RFR: 8223504: improve performance of forall loops by better inlining of "iterator()" methods. In-Reply-To: References: <58486996-d7da-30ab-77c2-b590395423c2@oracle.com> <92d61151-97ac-565a-1bfe-d25dd5ea1048@redhat.com> Message-ID: <42ef77e2-e9bb-c266-bdd1-89afef86e193@redhat.com> On 5/8/19 10:18 PM, Sergey Kuksenko wrote: > Updated: > http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.01/ Looks good to me. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From vladimir.x.ivanov at oracle.com Wed May 8 22:10:32 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 8 May 2019 15:10:32 -0700 Subject: RFR: 8223504: improve performance of forall loops by better inlining of "iterator()" methods. 
In-Reply-To:
References: <58486996-d7da-30ab-77c2-b590395423c2@oracle.com> <92d61151-97ac-565a-1bfe-d25dd5ea1048@redhat.com>
Message-ID: <359da83e-883b-de8a-0525-21ce4c797249@oracle.com>

> http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.01/

src/hotspot/share/opto/bytecodeInfo.cpp:

+  if (callee_method->name() == ciSymbol::iterator_name()) {
+    if (callee_method->signature()->return_type()->is_subtype_of(C->env()->Iterator_klass())) {
+      return true;
+    }
+  }

The check looks too broad to me: it returns true for any method named "iterator" which returns an instance of Iterator, which is much broader than just overrides/overloads of Iterable::iterator().

Can you elaborate, please, why did you decide to extend the check for non-Iterables?

Commenting on the general approach, it looks like a good candidate for a first-line filter before performing a more extensive analysis. I'd prefer to see BCEscapeAnalyzer extended to determine that returned Iterator is a freshly-allocated instance and decide whether to inline or not based on that instead. Among java.util classes you mentioned most iterators are trivial, so even naive analysis should get decent results.

And then the analysis can be applied to any method which returns an Object to see whether EA may benefit from inlining.

What do you think?

Best regards,
Vladimir Ivanov

> On 5/7/19 11:56 AM, Aleksey Shipilev wrote:
>> On 5/7/19 8:39 PM, Sergey Kuksenko wrote:
>>> Hi All,
>>>
>>> I would like to ask for review the following change/update:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8223504
>>>
>>> http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.00/
>> The idea sounds fine.
>>
>> Nits (the usual drill):
>>
>>   *) Copyright years need to be updated, at least in bytecodeInfo.cpp
>>
>>   *) Do we need to put Iterator_klass initialization this early in
>> WK_KLASSES_DO? It feels safer to
>> initialize it at the end, to avoid surprising bootstrap issues.
>>
>>
*) Backslash indent is off here in vmSymbols.hpp:
>>
>>   129   template(java_util_Iterator,
>> "java/util/Iterator")               \
>>
>>   *) Space after "if"? Also, I think you can use ciType::is_subtype_of
>> instead here. Plus, since you
>> declared iterator in WK klasses, SystemDictionary::Iterator_klass()
>> should be available.
>>
>>   100     if(retType->is_klass() &&
>> retType->as_klass()->is_subtype_of(C->env()->Iterator_klass())) {
>>
From vladimir.kozlov at oracle.com Wed May 8 22:52:23 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 8 May 2019 15:52:23 -0700
Subject: [13] RFR(T) 8223380: [Graal] compiler/c2/Test8062950.java failed with time out.
Message-ID: <264439ba-7f2c-47c5-f840-770c6afed921@oracle.com>

https://bugs.openjdk.java.net/browse/JDK-8223380

We should not run Java Graal with -XX:-TieredCompilation because Graal will be executed by the Interpreter instead of being compiled with C1. With -Xcomp and -Xbatch it is worse because test execution is blocked until compilation is finished - that is why such tests are slow with Graal.

Put the test on Graal's problem list with umbrella bug:
https://bugs.openjdk.java.net/browse/JDK-8207267

diff -r d266d24b6a0e test/hotspot/jtreg/ProblemList-graal.txt
--- a/test/hotspot/jtreg/ProblemList-graal.txt
+++ b/test/hotspot/jtreg/ProblemList-graal.txt
@@ -218,3 +218,5 @@
+# Next tests should be re-enabled once libgraal is introduced
 compiler/arguments/TestScavengeRootsInCode.java 8207267 generic-all
+compiler/c2/Test8062950.java 8207267 generic-all
 compiler/intrinsics/mathexact/LongMulOverflowTest.java 8207267 generic-all

We will run these tests with these flags when libgraal is introduced in JDK (JDK-8223220).
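For readers less familiar with the problem-list entries in the diff above, here is a minimal sketch of how one entry breaks down. It assumes the three-column layout visible in the diff (test path, bug id, platform), with comma-separated values allowed in the last two columns and '#' starting a comment; it is not jtreg's own parser, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: not jtreg's parser. Assumes each entry has the form
//   <test path> <bug id[,bug id...]> <platform[,platform...]>
// and that blank lines and lines starting with '#' carry no entry.
public class ProblemListEntrySketch {
    final String test;
    final List<String> bugIds;
    final List<String> platforms;

    ProblemListEntrySketch(String test, List<String> bugIds, List<String> platforms) {
        this.test = test;
        this.bugIds = bugIds;
        this.platforms = platforms;
    }

    // Returns null for blank lines and comments; throws on malformed entries.
    static ProblemListEntrySketch parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return null;
        }
        String[] fields = trimmed.split("\\s+");
        if (fields.length < 3) {
            throw new IllegalArgumentException("expected <test> <bug ids> <platforms>: " + line);
        }
        return new ProblemListEntrySketch(
                fields[0],
                Arrays.asList(fields[1].split(",")),
                Arrays.asList(fields[2].split(",")));
    }

    public static void main(String[] args) {
        ProblemListEntrySketch e =
                parse("compiler/c2/Test8062950.java 8207267 generic-all");
        System.out.println(e.test + " is excluded on " + e.platforms
                + " and tracked by " + e.bugIds);
    }
}
```

Read this way, the added entry excludes compiler/c2/Test8062950.java on all platforms (generic-all) and ties the exclusion to the umbrella bug 8207267, so the entry can be found again when that bug is resolved.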
Thanks, Vladimir From vladimir.kozlov at oracle.com Wed May 8 23:18:05 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 8 May 2019 16:18:05 -0700 Subject: [13] RFR(T) 8223539: compiler/graalunit/HotspotTest.java hotspot.test.CheckGraalIntrinsics AssertionError: found plugins for intrinsics Message-ID: https://bugs.openjdk.java.net/browse/JDK-8223539 8222074 added new Math intrinsics abs(F), abs(I), abs(J). Usually new intrinsics should be listed in CheckGraalIntrinsics.java Graal's unit test until they are implemented in Graal. And it was done in 8222074 changes [1]. But Graal has own intrinsics - plugins. And there already plugins for Math.abs(F) and Math.abs(D) [2]. That is why the test failed. The fix is trivial - remove line with Math.abs(F) from CheckGraalIntrinsics.java: --- a/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java +++ b/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java @@ -399,5 +399,4 @@ if (isJDK13OrHigher()) { add(toBeInvestigated, - "java/lang/Math.abs(F)F", "java/lang/Math.abs(I)I", "java/lang/Math.abs(J)J", Thanks, Vladimir [1] http://hg.openjdk.java.net/jdk/jdk/rev/1851a532ddfe#l24.1 [2] https://github.com/oracle/graal/blob/master/compiler/src/org.graalvm.compiler.replacements/src/org/graalvm/compiler/replacements/StandardGraphBuilderPlugins.java#L693 From vladimir.x.ivanov at oracle.com Wed May 8 23:31:22 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 8 May 2019 16:31:22 -0700 Subject: [13] RFR(T) 8223539: compiler/graalunit/HotspotTest.java hotspot.test.CheckGraalIntrinsics AssertionError: found plugins for intrinsics In-Reply-To: References: Message-ID: <3d08ee0b-e468-f167-1ff0-5b3f92234e7d@oracle.com> Looks good. 
Best regards,
Vladimir Ivanov

From vladimir.kozlov at oracle.com Wed May 8 23:41:00 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 8 May 2019 16:41:00 -0700
Subject: [13] RFR(T) 8223539: compiler/graalunit/HotspotTest.java hotspot.test.CheckGraalIntrinsics AssertionError: found plugins for intrinsics
In-Reply-To: <3d08ee0b-e468-f167-1ff0-5b3f92234e7d@oracle.com>
References: <3d08ee0b-e468-f167-1ff0-5b3f92234e7d@oracle.com>
Message-ID: <2235cfdf-4f20-85fa-fefe-1660cc8d6af3@oracle.com>

Thanks!
Vladimir K

On 5/8/19 4:31 PM, Vladimir Ivanov wrote:
> Looks good.
>
> Best regards,
> Vladimir Ivanov

From vladimir.x.ivanov at oracle.com Wed May 8 23:41:28 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 8 May 2019 16:41:28 -0700
Subject: [13] RFR(T) 8223380: [Graal] compiler/c2/Test8062950.java failed with time out.
In-Reply-To: <264439ba-7f2c-47c5-f840-770c6afed921@oracle.com>
References: <264439ba-7f2c-47c5-f840-770c6afed921@oracle.com>
Message-ID:

Looks good.
Best regards,
Vladimir Ivanov

From vladimir.kozlov at oracle.com Wed May 8 23:45:54 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 8 May 2019 16:45:54 -0700
Subject: [13] RFR(T) 8223380: [Graal] compiler/c2/Test8062950.java failed with time out.
In-Reply-To:
References: <264439ba-7f2c-47c5-f840-770c6afed921@oracle.com>
Message-ID:

Thanks!

On 5/8/19 4:41 PM, Vladimir Ivanov wrote:
> Looks good.
>
> Best regards,
> Vladimir Ivanov
>
> On 08/05/2019 15:52, Vladimir Kozlov wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8223380
>>
>> We should not run Java Graal with -XX:-TieredCompilation because Graal will be executed by the
>> interpreter instead of being compiled with C1. With -Xcomp and -Xbatch it is worse because test
>> execution is blocked until compilation is finished - that is why such tests are slow with Graal.
>>
>> Put the test on Graal's problem list with umbrella bug:
>> https://bugs.openjdk.java.net/browse/JDK-8207267
>>
>> diff -r d266d24b6a0e test/hotspot/jtreg/ProblemList-graal.txt
>> --- a/test/hotspot/jtreg/ProblemList-graal.txt
>> +++ b/test/hotspot/jtreg/ProblemList-graal.txt
>> @@ -218,3 +218,5 @@
>>
>> +# Next tests should be re-enabled once libgraal is introduced
>>  compiler/arguments/TestScavengeRootsInCode.java         8207267 generic-all
>> +compiler/c2/Test8062950.java                            8207267 generic-all
>>  compiler/intrinsics/mathexact/LongMulOverflowTest.java  8207267 generic-all
>>
>>
>> We will run these tests with these flags when libgraal is introduced in JDK (JDK-8223220).
>>
>> Thanks,
>> Vladimir
>>

From vladimir.kozlov at oracle.com Thu May 9 03:50:54 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 8 May 2019 20:50:54 -0700
Subject: [13] RFR(T) 8223531: [Graal] assert(type() == T_INT) failed: type check
Message-ID: <9d9ac006-41cb-721a-a9c3-9fece2625716@oracle.com>

http://cr.openjdk.java.net/~kvn/8223531/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8223531

In my JDK13 port of JVMCI changes (8220623) pushed recently I missed a few lines from the GR-13374 changes in graal-jvmci-8.

Doug Simon found that and prepared this fix.
Thanks,
Vladimir

From david.holmes at oracle.com Thu May 9 04:55:49 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 9 May 2019 14:55:49 +1000
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To: <76f1a03f-d1a4-b6e4-bba5-6444fab04116@oracle.com>
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <64a8afca-9dc8-b119-0a12-dd05799bdd22@oracle.com> <903e7954-0c16-0fe4-1a1f-6e9c5c403a28@oracle.com> <76f1a03f-d1a4-b6e4-bba5-6444fab04116@oracle.com>
Message-ID: <03bc2512-4427-147a-475f-1b4fd7424b26@oracle.com>

Hi Robbin,

On 8/05/2019 2:01 am, Robbin Ehn wrote:
> Hi David,
>
> On 5/7/19 4:49 AM, David Holmes wrote:
>> Hi Robbin,
>>
>> I took a look at this and it is a lot more complex than I had expected. There are some
>> interconnections that I'm not understanding here. From existing code why does deopt need to
>> revoke biases? I don't see how deopt changes anything in respect to monitor ownership. Then
>> in the new code why do you have to inflate monitors to do deopt? If this is truly new
>> behaviour and I didn't just miss where this happens in existing code, then how does this
>> impact the number of ObjectMonitors in existence and the monitor deflation process?
>
> When going to the interpreter we expand the stack, which means moving anything on the stack.
> Moving locks around might require an inflation, so we revoke the biases beforehand.
> The revoking code is seemingly from duke, so not that new I think :)

Some bits here I'm not familiar enough with. If we've already gone to stack-lock then I would have expected the bias to already have been revoked. If we're biased only then there's nothing on the stack so nothing to move and so no reason to revoke the bias. Maybe it's just considered simpler to revoke and inflate no matter what.

>>
>> More comments below ...
>>
>> On 3/05/2019 8:31 pm, Robbin Ehn wrote:
>>> Hi, please see this update:
>>>
>>> Inc:
>>> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/index.html
>>> Full:
>>> http://cr.openjdk.java.net/~rehn/8221734/v2/webrev/
>>
>> src/hotspot/share/code/codeCache.cpp
>>
>> This comment block no longer applies given there is no safepoint op now:
>>
>> 1190   // CodeCache can only be updated by a thread_in_VM and they will all be
>> 1191   // stopped during the safepoint so CodeCache will be safe to update without
>> 1192   // holding the CodeCache_lock.
>>
>> The same comment block here:
>>
>> 1208   // CodeCache can only be updated by a thread_in_VM and they will all be
>> 1209   // stopped dring the safepoint so CodeCache will be safe to update without
>> 1210   // holding the CodeCache_lock.
>>
>> is already incorrect if not actually at a safepoint. It makes the relationship between the
>> Compile_lock, CodeCache_lock and being at a safepoint rather confusing.
>
> Yes, this area is filled with 'weird' comments.
> mark_for_deoptimization takes CodeCache_lock unconditionally.
> I just removed these two.
>
>>
>> ---
>>
>> src/hotspot/share/code/nmethod.cpp
>>
>> The comment here:
>>
>> 1119   // If _method is already NULL the Method* is about to be unloaded,
>> 1120   // so we don't have to break the cycle. Note that it is possible to
>> 1121   // have the Method* live here, in case we unload the nmethod because
>> 1122   // it is pointing to some oop (other than the Method*) being unloaded.
>>
>> no longer fits the code given we no longer skip the _method==NULL case. Further, the
>> trailing comment here:
>>
>> 1123   Method::unlink_code(_method, this); // Break a cycle
>>
>> seems unnecessary given we preceded this with 4 lines of commentary already!
>
> Updated.
>
>>
>> Again this comment block:
>>
>> 1205 void nmethod::unlink_from_method() {
>> 1206   // We need to check if both the _code and _from_compiled_code_entry_point
>> 1207   // refer to this nmethod because there is a race in setting these two fields
>> 1208   // in Method* as seen in bugid 4947125.
>> 1209   // If the vep() points to the zombie nmethod, the memory for the nmethod
>> 1210   // could be flushed and the compiler and vtable stubs could still call
>> 1211   // through it.
>> 1212   Method::unlink_code(method(), this);
>>
>> seems meaningless with the code change you've applied.
>
> Moved it to Method::unlink_code.
>
>>
>> ---
>>
>> src/hotspot/share/code/nmethod.hpp
>>
>> !   void unlink_from_method();
>>
>> Now this doesn't take the acquire_lock parameter it would be useful to document what the
>> locking expectations are: must this be called with a given lock held, or will it always
>> acquire a given lock if needed?
>
> Added comment.
>
>>
>> ---
>>
>> src/hotspot/share/oops/method.cpp
>>
>> + void Method::unlink_code(Method *method, CompiledMethod *compare) {
>> + void Method::unlink_code(Method *method) {
>>
>> I'm not sure making these methods static just so the NULL check can be internalized is the
>> best way to deal with this. Now you can't tell when NULL is expected and when it is an
>> error. IMHO it is better to keep these as instance methods with a NULL check at the
>> callsite if needed (or an assert if NULL is not expected).
>
> All places were NULL checked before calling the old clear_code(), and in no place was that asserted.

I'd still prefer an external NULL check and a call to an instance method, rather than using a static method just to internalize the NULL check. That's not a style of API that I like - if you push on it hard enough you end up with only public static methods that do a NULL check then dispatch to private instance methods.
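
[Editorial illustration] The API trade-off discussed above can be made concrete with a small sketch. This is not HotSpot code: `Method` below is a simplified stand-in, and `unlink_code_if_present`, `on_nmethod_unload` and `on_make_not_entrant` are hypothetical names chosen only for illustration. Under those assumptions, it shows how the two shapes differ in where the NULL policy is visible to a reader.

```cpp
#include <cassert>

// Simplified stand-in for HotSpot's Method; NOT the real class.
struct Method {
  const void* code = nullptr;  // hypothetical pointer to compiled code

  // Instance-method shape: the caller decides whether a null
  // receiver is legal before making the call.
  void unlink_code() { code = nullptr; }

  // Static shape (hypothetical name): the null check is hidden
  // inside, so every call site silently tolerates null.
  static void unlink_code_if_present(Method* m) {
    if (m == nullptr) return;
    m->unlink_code();
  }
};

// Hypothetical call site where null is expected: the check is explicit.
inline void on_nmethod_unload(Method* m) {
  if (m != nullptr) {
    m->unlink_code();
  }
}

// Hypothetical call site where null would be a bug: assert instead of
// quietly ignoring it.
inline void on_make_not_entrant(Method* m) {
  assert(m != nullptr && "method must be set here");
  m->unlink_code();
}
```

With the instance-method shape, a reader can tell from each call site whether NULL is an expected state or an error; with the static shape, that information lives only inside the helper.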
>
>>
>> ---
>>
>>   src/hotspot/share/runtime/biasedLocking.cpp
>>   src/hotspot/share/runtime/biasedLocking.hpp
>>
>> This seems like it would be better done after we have switched biased-locking revocation to
>> use handshakes instead of safepoints. Otherwise we seem to be doing a partial conversion as
>> a side-effect of this bug and it's far from obvious that it is complete/correct.
>
> This is not the same, we are in a handshake and we want to revoke the locks we find on our stack.
> Doing revoke with handshake is when you are revoking another thread's bias. This is a thread-local operation.
> As I said to Patricio the only thing I changed is that we can't safepoint.
> So it's the same code as before to revoke, just minus the safepoint possibilities.

I'll wait to see new webrev.

>>
>> ---
>>
>> src/hotspot/share/runtime/synchronizer.cpp
>>
>> These changes have me worried:
>>
>>     assert(Universe::verify_in_progress() ||
>> !          !SafepointSynchronize::is_at_safepoint(), "invariant");
>>
>> becomes:
>>
>>     assert(Universe::verify_in_progress() ||
>> !          !Universe::heap()->is_gc_active(), "invariant");
>>
>> I don't see an immediate equivalence between not being at a safepoint and not having a GC
>> active. If I'm not at a safepoint now and the code following doesn't do a safepoint check
>> then we remain outside of a safepoint. What is the same reasoning for a GC being active?
>
> When we use the handshake fallback path, handshaking on a platform not
> supporting handshakes, the handshake is performed with a safepoint
> (will go away in JDK 14).
>
> "calling hashCode() or any of its internal aliases may result in changes
> to an object's markword if we need to assign a hashCode. That is,
> hashCode() can be a heap mutator. Given that, it seems unwise to call
> hashCode while at a safepoint. "
>
> Since the JavaThread inflates on demand I removed the eager inflation.
> Inflation is thus done by the JavaThread, I revert this.
>
>>
>> !         ResourceMark rm(Self);
>> !         ResourceMark rm;
>>
>> Are you suggesting the current thread is not Self? If that is the case then there should be
>> numerous asserts earlier on to ensure we can't follow any code paths that expect that Self
>> is the current thread! But I'm concerned that we've introduced a new way for a third-party
>> thread to introduce monitor inflation almost independent of the threads using the monitor.
>
> Reverted this because of above reason.
> (when VM thread executed the handshake on behalf of the JavaThread it did the
> inflation, so Self was not Thread::current())

I'm happy to hear that is reverted. I'm not at all sure that inflation by a third-party thread would work correctly - and I very much expect it might complicate the async monitor deflation work.

Thanks,
David
-----

> I'll send update to RFR mail.
>
> Thanks, Robbin
>
>>
>> Thanks,
>> David
>> -----
>>
>>>
>>> # Note
>>> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html
>>> line 630
>>> This is a revert to the original, I accidentally had left in a temporary
>>> test change, as you can see here in full diff:
>>> http://cr.openjdk.java.net/~rehn/8221734/v2/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html
>>>
>>>
>>> I think I managed to address all review comments.
>>>
>>> Dean can you please cast an extra eye on:
>>> http://cr.openjdk.java.net/~rehn/8221734/v2/inc/webrev/src/hotspot/share/oops/method.cpp.sdiff.html
>>>
>>> This OR should be correct.
>>>
>>> Dan please do the same on the biased locking changes.
>>>
>>> I left out the merge with MutexLocker changes, since it was not interesting.
>>> There were some conflicts with JVMCI changes, so incremental contains
>>> some parts of that merge.
>>>
>>> Passes t1-5 and local testing.
>>> I'll continue with some additional testing.
>>>
>>> Thanks, Robbin
>>>
>>> On 4/25/19 2:05 PM, Robbin Ehn wrote:
>>>> Hi all, please review.
>>>>
>>>> Let's deopt with handshakes.
>>>> Removed VM op Deoptimize, instead we handshake.
>>>> Locks need to be inflated since we are not in a safepoint.
>>>>
>>>> Goes on top of:
>>>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html
>>>>
>>>>
>>>> Code:
>>>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html
>>>> Issue:
>>>> https://bugs.openjdk.java.net/browse/JDK-8221734
>>>>
>>>> Passes t1-7 and multiple t1-5 runs.
>>>>
>>>> A few startup benchmarks see a small speedup.
>>>>
>>>> Thanks, Robbin

From Pengfei.Li at arm.com Thu May 9 05:34:03 2019
From: Pengfei.Li at arm.com (Pengfei Li (Arm Technology China))
Date: Thu, 9 May 2019 05:34:03 +0000
Subject: RFR(T): 8223427: [TESTBUG] Disable JTReg Shenandoah tests when Graal is enabled
In-Reply-To:
References:
Message-ID:

Thanks Aleksey. Is it required to have another reviewer for this change?

--
Thanks,
Pengfei

> > Could anyone help review this patch (tiny change on 75 test files)?
> >
> > Webrev: http://cr.openjdk.java.net/~pli/rfr/8223427/webrev.00/
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8223427
>
> Looks good.
>
> We already have a few tests that do "@requires vm.gc.Shenandoah & !vm.graal.enabled".
>
> -Aleksey

From shade at redhat.com Thu May 9 07:29:57 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 9 May 2019 09:29:57 +0200
Subject: RFR(T): 8223427: [TESTBUG] Disable JTReg Shenandoah tests when Graal is enabled
In-Reply-To:
References:
Message-ID: <29c11b50-5d85-a68b-573d-2e5116c4a2da@redhat.com>

Seeing that it is trivial and only affects Shenandoah tests, I'd say one review is enough.

Do you need a sponsor to push?

-Aleksey

On 5/9/19 7:34 AM, Pengfei Li (Arm Technology China) wrote:
> Thanks Aleksey. Is it required to have another reviewer for this change?
>
> --
> Thanks,
> Pengfei
>
>>> Could anyone help review this patch (tiny change on 75 test files)?
>>>
>>> Webrev: http://cr.openjdk.java.net/~pli/rfr/8223427/webrev.00/
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8223427

From Pengfei.Li at arm.com Thu May 9 07:33:24 2019
From: Pengfei.Li at arm.com (Pengfei Li (Arm Technology China))
Date: Thu, 9 May 2019 07:33:24 +0000
Subject: RFR(T): 8223427: [TESTBUG] Disable JTReg Shenandoah tests when Graal is enabled
In-Reply-To: <29c11b50-5d85-a68b-573d-2e5116c4a2da@redhat.com>
References: <29c11b50-5d85-a68b-573d-2e5116c4a2da@redhat.com>
Message-ID:

Hi Aleksey,

> Seeing that it is trivial and only affects Shenandoah tests, I'd say one review is
> enough.
>
> Do you need a sponsor to push?

Yes, I would be happy if you could help to push.

--
Thanks,
Pengfei

From shade at redhat.com Thu May 9 08:09:08 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 9 May 2019 10:09:08 +0200
Subject: RFR(T): 8223427: [TESTBUG] Disable JTReg Shenandoah tests when Graal is enabled
In-Reply-To:
References: <29c11b50-5d85-a68b-573d-2e5116c4a2da@redhat.com>
Message-ID:

On 5/9/19 9:33 AM, Pengfei Li (Arm Technology China) wrote:
> Hi Aleksey,
>
>> Seeing that it is trivial and only affects Shenandoah tests, I'd say one review is
>> enough.
>>
>> Do you need a sponsor to push?
>
> Yes, I would be happy if you could help to push.

There:
http://hg.openjdk.java.net/jdk/jdk/rev/206afa6372ae

-Aleksey

From tobias.hartmann at oracle.com Thu May 9 10:30:19 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 9 May 2019 12:30:19 +0200
Subject: [13] RFR(T) 8223531: [Graal] assert(type() == T_INT) failed: type check
In-Reply-To: <9d9ac006-41cb-721a-a9c3-9fece2625716@oracle.com>
References: <9d9ac006-41cb-721a-a9c3-9fece2625716@oracle.com>
Message-ID: <50eda8c3-1cf7-13f9-8670-bab8db26e51a@oracle.com>

Hi Vladimir,

looks good.

Best regards,
Tobias

On 09.05.19 05:50, Vladimir Kozlov wrote:
> http://cr.openjdk.java.net/~kvn/8223531/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8223531
>
> In my JDK13 port of JVMCI changes (8220623) pushed recently I missed a few lines from the GR-13374 changes
> in graal-jvmci-8.
>
> Doug Simon found that and prepared this fix.
>
> Thanks,
> Vladimir

From ralf.schmelter at sap.com Thu May 9 12:24:20 2019
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Thu, 9 May 2019 12:24:20 +0000
Subject: RFR (XS) 8223617: code_size2 needs adjustments
Message-ID:

Hi,

on my Skylake Windows machine the current jdk/jdk fails with the following assertion:

# Internal Error (c:/priv/hg/openjdk/src/hotspot/share/runtime/stubRoutines.cpp:279), pid=42504, tid=131676
# assert(code_size2 == 0 || buffer.insts_remaining() > 200) failed: increase code_size2

After increasing the code size by 200 bytes, the assertion vanishes.

webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8223617/webrev.0/
bugreport: https://bugs.openjdk.java.net/browse/JDK-8223617

Best regards,
Ralf

From martin.doerr at sap.com Thu May 9 12:37:35 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Thu, 9 May 2019 12:37:35 +0000
Subject: RFR (XS) 8223617: code_size2 needs adjustments
In-Reply-To:
References:
Message-ID:

Hi Ralf,

looks good to me.
Thanks,
Martin

From volker.simonis at gmail.com Thu May 9 13:11:59 2019
From: volker.simonis at gmail.com (Volker Simonis)
Date: Thu, 9 May 2019 15:11:59 +0200
Subject: RFR (XS) 8223617: code_size2 needs adjustments
In-Reply-To:
References:
Message-ID:

Looks good!

Just out of interest, do you know why we need more space now? Is it because of the CPU or because of another recent change that hasn't been tested in the slowdebug mode yet?

Regards,
Volker

On Thu, May 9, 2019 at 2:25 PM Schmelter, Ralf wrote:
>
> Hi,
>
> on my Skylake Windows machine the current jdk/jdk fails with the following assertion:
>
> # Internal Error (c:/priv/hg/openjdk/src/hotspot/share/runtime/stubRoutines.cpp:279), pid=42504, tid=131676
> # assert(code_size2 == 0 || buffer.insts_remaining() > 200) failed: increase code_size2
>
> After increasing the code size by 200 bytes, the assertion vanishes.
>
> webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8223617/webrev.0/
> bugreport: https://bugs.openjdk.java.net/browse/JDK-8223617
>
> Best regards,
> Ralf

From ralf.schmelter at sap.com Thu May 9 13:20:38 2019
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Thu, 9 May 2019 13:20:38 +0000
Subject: RFR (XS) 8223617: code_size2 needs adjustments
In-Reply-To:
References:
Message-ID:

My guess would be http://hg.openjdk.java.net/jdk/jdk/rev/1851a532ddfe

From xxinliu at amazon.com Thu May 9 15:47:52 2019
From: xxinliu at amazon.com (Liu, Xin)
Date: Thu, 9 May 2019 15:47:52 +0000
Subject: RFR(S): 8223537:testlibrary_tests/ctw/ClassesListTest.java fails with Agent timeout frequently
Message-ID: <59F07E3F-C116-4BDA-BBD0-FD5CFF33CDDC@amazon.com>

BUG: https://bugs.openjdk.java.net/browse/JDK-8223537
Webrev: https://cr.openjdk.java.net/~xliu/8223537/webrev.01/

Could you please review the patch to fix regression of JDK-8222670 for -Xcomp?
The root cause is a deadlock in blocking mode (-Xcomp). The compiler thread has to notify Java threads that the task is completed if recompilation happens in JDK-8222670.

Purging stale tasks after select_task can do that.

Could the sponsor submit this webrev to the submit repo to verify it?

Thanks,
--lx

From vladimir.kozlov at oracle.com Thu May 9 17:28:40 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 9 May 2019 10:28:40 -0700
Subject: RFR(S): 8223537:testlibrary_tests/ctw/ClassesListTest.java fails with Agent timeout frequently
In-Reply-To: <59F07E3F-C116-4BDA-BBD0-FD5CFF33CDDC@amazon.com>
References: <59F07E3F-C116-4BDA-BBD0-FD5CFF33CDDC@amazon.com>
Message-ID:

Looks good. I submitted testing.

Thanks,
Vladimir

From vladimir.kozlov at oracle.com Thu May 9 19:30:26 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 9 May 2019 12:30:26 -0700
Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output
In-Reply-To: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com>
References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com>
Message-ID:

Hi Lutz,

Thank you for doing this great work. I have just small comments:

x86_64.ad - empty change.

nmethod.cpp - LUCY?
+ st->print_cr("LUCY: NULL-oop");
+ tty->print("LUCY NULL-oop");

nmethod.cpp - use PTR64_FORMAT instead of '0x%016lx'.
vmreg.cpp - Use INTPTR_FORMAT instead of %ld for value().

disassembler.* - LUCY_OBSOLETE?
+#if defined(LUCY_OBSOLETE) // Used in SAPJVM only

compilerDefinitions.hpp - I don't see where tier_digit() is used.

disassembler.cpp - PrintAssemblyOptions. Why do you need to have 'hsdis-' in all option values?
You need to check for an invalid value and print the help output in such a case - it will be very useful if you forget a value's spelling. Also add a line for the 'help' value.

Do you need the following commented-out lines:

disassembler.cpp -
+// ptrdiff_t _offset;
+// Output suppressed because it messes up disassembly.
+// output()->print_cr("[Disassembling for mach='%s']", (const char*)arg);

disassembler_s390.cpp -
+// st->fill_to(((st->position()+3*tsize-1)/tsize)*tsize);

compile.cpp -
+// st->print("# "); _tf->dump_on(st); st->cr();

abstractDisassembler.cpp -
// st->print("0x%016lx", *((julong*)here));
st->print("0x%016lx", *((uintptr_t*)here));
// st->print("0x%08x%08x", *((juint*)here), *((juint*)(here+4)));

abstractDisassembler.cpp - maybe an explicit cast (byte*)?:
st->print("%2.2x", *byte);
st->print("%2.2x", *pos);
st->print("0x%02x", *here);

PTR64_FORMAT ?:
st->print("0x%016lx", *((uintptr_t*)here));

Thanks,
Vladimir

On 5/8/19 8:31 AM, Schmidt, Lutz wrote:
> Dear Community,
>
> may I please request comments and reviews for this change? Thank you!
>
> I have created a new webrev which is based on the current jdk/jdk repo. There was some merge effort. The code which constitutes this patch was not changed. Here's the webrev link:
> https://cr.openjdk.java.net/~lucy/webrevs/8213084.01/
>
> Regards,
> Lutz
>
> On 11.04.19, 23:24, "Schmidt, Lutz" wrote:
>
> Dear All,
>
> this topic was discussed back in Nov/Dec 2018:
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-November/031552.html
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-December/031641.html
>
> Purpose of the discussion was to find out if my ideas are at all regarded useful and desirable.
> The result was mixed, some pro, some con. I let the input from back then influence my work of the last months. In particular, output verbosity can be controlled in a wide range now. In addition to the general -XX:+Print* switches, the amount of output can be adjusted by newly introduced -XX:PrintAssemblyOptions. Here is the list (with default settings):
>
> PrintAssemblyOptions help:
>   hsdis-print-raw           test plugin by requesting raw output (deprecated)
>   hsdis-print-raw-xml       test plugin by requesting raw xml (deprecated)
>   hsdis-print-pc            turn off PC printing (on by default) (deprecated)
>   hsdis-print-bytes         turn on instruction byte output (deprecated)
>
>   hsdis-show-pc             toggle printing current pc, currently ON
>   hsdis-show-offset         toggle printing current offset, currently OFF
>   hsdis-show-bytes          toggle printing instruction bytes, currently OFF
>   hsdis-show-data-hex       toggle formatting data as hex, currently ON
>   hsdis-show-data-int       toggle formatting data as int, currently OFF
>   hsdis-show-data-float     toggle formatting data as float, currently OFF
>   hsdis-show-structs        toggle compiler data structures, currently OFF
>   hsdis-show-comment        toggle instruction comments, currently OFF
>   hsdis-show-block-comment  toggle block comments, currently OFF
>   hsdis-align-instr         toggle instruction alignment, currently OFF
>
> Finally, I have pushed my changes to a state where I can dare to request your comments and reviews. I would like to suggest and request that we first focus on the effects (i.e. the generated output) of the changes. Once we got that adjusted and accepted, we can check the actual implementation and add improvements there. Sounds like a plan? Here is what you get:
>
> The machine code generated by the JVM can be printed in three different formats:
> - Hexadecimal.
>     This is basically a hex dump of the memory range containing the code.
>     This format is always available (PRODUCT and not-PRODUCT builds), regardless
>     of the availability of a disassembler library.
>     It applies to all sorts of code, be it blobs, stubs, compiled nmethods, ...
>     This format seems useless at first glance, but it is not. In an upcoming,
>     separate enhancement, the JVM will be made capable of reading files
>     containing such code blocks and disassembling them post mortem. The most
>     prominent example is an hs_err* file.
> - Disassembled.
>     This is an assembly listing of the instructions as found in the memory range
>     occupied by the blob, stub, compiled nmethod ... As a prerequisite, a suitable
>     disassembler library (hsdis-.so) must be available at runtime.
>     Most often, that will only be the case in test environments. If no disassembler
>     library is available, hexadecimal output is used as fallback.
> - OptoAssembly.
>     This is a meta code listing created only by the C2 compiler. As it is somewhat
>     closer to the Java code, it may be helpful in linking assembly code to Java code.
>
> All three formats can be merged with additional information, most prominently compiler-internal "knowledge" about blocks, related bytecodes, statistics counters, and much more.
>
> Following the code itself, compiler-internal data structures, like oop maps, relocations, scopes, dependencies, exception handlers, are printed to aid in debugging.
>
> The full set of information is available in non-PRODUCT builds. PRODUCT builds do not support OptoAssembly output. Data structures are unavailable as well.
>
> So how does the output actually look like? Here are a few small snippets (linuxx86_64) to give you an idea. The complete output of an entire C2-compiled method, in multiple verbosity variants, is available here:
> http://cr.openjdk.java.net/~lucy/webrevs/8213084/
>
> OptoAssembly output for reference (always on with PrintAssembly):
> =================================================================
>
> 036 B2: # out( B7 B3 ) <- in( B1 ) Freq: 1
> 036 movl RBP, [RSI + #12 (8-bit)] # compressed ptr ! Field: java/lang/String.value (constant)
> 039 movl R11, [RBP + #12 (8-bit)] # range
> 03d NullCheck RBP
>
> 03d B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999
> 03d cmpl RDX, R11 # unsigned
> 040 jnb,us B6 P=0.000000 C=5375.000000
>
> PrintAssembly with no disassembler library available:
> =====================================================
>
> [Code]
> [Entry Point]
> 0x00007fc74d1d7b20: 448b 5608 49c1 e203 493b c20f 856f 69e7 ff90 9090 9090 9090 9090 9090 9090 9090
> [Verified Entry Point]
> 0x00007fc74d1d7b40: 8984 2400 a0fe ff55 4883 ec20 440f be5e 1445 85db 7521 8b6e 0c44 8b5d 0c41 3bd3
> 0x00007fc74d1d7b60: 732c 0fb6 4415 1048 83c4 205d 4d8b 9728 0100 0041 8502 c348 8bee 8914 2444 895c
> 0x00007fc74d1d7b80: 2404 be4d ffff ffe8 1483 e7ff 0f0b bee5 ffff ff89 5424 04e8 0483 e7ff 0f0b bef6
> 0x00007fc74d1d7ba0: ffff ff89 5424 04e8 f482 e7ff 0f0b f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4
> [Exception Handler]
> 0x00007fc74d1d7bc0: e95b 0df5 ffe8 0000 0000 4883 2c24 05e9 0c7d e7ff
> [End]
>
> PrintAssembly with minimal verbosity:
> =====================================
>
> 0x00007f0434b89bd6: mov 0xc(%rsi),%ebp
> 0x00007f0434b89bd9: mov 0xc(%rbp),%r11d
> 0x00007f0434b89bdd: cmp %r11d,%edx
> 0x00007f0434b89be0: jae 0x00007f0434b89c0e
>
> PrintAssembly (previous plus code offsets from code begin):
> ===========================================================
>
> 0x00007f63c11d7956 (+0x36): mov 0xc(%rsi),%ebp
> 0x00007f63c11d7959 (+0x39): mov 0xc(%rbp),%r11d
> 0x00007f63c11d795d (+0x3d): cmp %r11d,%edx
> 0x00007f63c11d7960 (+0x40): jae 0x00007f63c11d798e
>
> PrintAssembly (previous plus block comments):
> ===========================================================
>
> ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1
> 0x00007f48211d76d6 (+0x36): mov 0xc(%rsi),%ebp
> 0x00007f48211d76d9 (+0x39): mov 0xc(%rbp),%r11d
> ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999
> 0x00007f48211d76dd (+0x3d): cmp %r11d,%edx
> 0x00007f48211d76e0 (+0x40): jae 0x00007f48211d770e
>
PrintAssembly (previous plus instruction comments): > =========================================================== > > ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > 0x00007fc3e11d7a56 (+0x36): mov 0xc(%rsi),%ebp ;*getfield value {reexecute=0 rethrow=0 return_oop=0} > ; - java.lang.String::charAt@8 (line 702) > 0x00007fc3e11d7a59 (+0x39): mov 0xc(%rbp),%r11d ; implicit exception: dispatches to 0x00007fc3e11d7a9e > ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > 0x00007fc3e11d7a5d (+0x3d): cmp %r11d,%edx > 0x00007fc3e11d7a60 (+0x40): jae 0x00007fc3e11d7a8e > > For completeness, here are the links to > Bug: https://bugs.openjdk.java.net/browse/JDK-8213084 > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8213084.00/ > > But please, as mentioned above, first focus on the output. The nitty details of the implementation I would like to discuss after the output format has received some support. > > Thank you so much for your time! > Lutz

From gerard.ziemski at oracle.com Thu May 9 19:56:13 2019 From: gerard.ziemski at oracle.com (gerard ziemski) Date: Thu, 9 May 2019 14:56:13 -0500 Subject: RFR (T) 8223639: [JVMCI] jvmciCompiler.cpp needs to include "oops/objArrayOop.inline.hpp"" In-Reply-To: References: Message-ID:
hi all, Please review this trivial fix, where we add a missing include.
bug link at https://bugs.openjdk.java.net/browse/JDK-8223639
webrev http://cr.openjdk.java.net/~gziemski/8223639_rev1
Testing Mach5 tier1 in progress...
cheers

From dean.long at oracle.com Thu May 9 20:15:17 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Thu, 9 May 2019 13:15:17 -0700 Subject: RFR (T) 8223639: [JVMCI] jvmciCompiler.cpp needs to include "oops/objArrayOop.inline.hpp"" In-Reply-To: References: Message-ID:
Looks good. Thanks for fixing this. dl
On 5/9/19 12:56 PM, gerard ziemski wrote: > hi all, > > Please review this trivial fix, where we add a missing include.
> > bug link at https://bugs.openjdk.java.net/browse/JDK-8223639 > webrev http://cr.openjdk.java.net/~gziemski/8223639_rev1 > Testing Mach5 tier1 in progress... > > > cheers

From vladimir.kozlov at oracle.com Thu May 9 20:54:22 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 9 May 2019 13:54:22 -0700 Subject: RFR (T) 8223639: [JVMCI] jvmciCompiler.cpp needs to include "oops/objArrayOop.inline.hpp"" In-Reply-To: References: Message-ID:
Looks good and trivial. Thanks, Vladimir
On 5/9/19 12:56 PM, gerard ziemski wrote: > hi all, > > Please review this trivial fix, where we add a missing include. > > bug link at https://bugs.openjdk.java.net/browse/JDK-8223639 > webrev http://cr.openjdk.java.net/~gziemski/8223639_rev1 > Testing Mach5 tier1 in progress... > > > cheers

From john.r.rose at oracle.com Thu May 9 21:02:20 2019 From: john.r.rose at oracle.com (John Rose) Date: Thu, 9 May 2019 14:02:20 -0700 Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output In-Reply-To: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> Message-ID: <94F02B34-94C4-4E88-9854-28863704921C@oracle.com>
On May 8, 2019, at 8:31 AM, Schmidt, Lutz wrote: > > may I please request comments and reviews for this change? Thank you!
Short comment: This is great, thank you! I wrote the hsdis stuff long ago and am overjoyed to see it get this level of thoughtful care.
- John

From dean.long at oracle.com Thu May 9 21:03:15 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Thu, 9 May 2019 14:03:15 -0700 Subject: [13] RFR(T) 8223531: [Graal] assert(type() == T_INT) failed: type check In-Reply-To: <50eda8c3-1cf7-13f9-8670-bab8db26e51a@oracle.com> References: <9d9ac006-41cb-721a-a9c3-9fece2625716@oracle.com> <50eda8c3-1cf7-13f9-8670-bab8db26e51a@oracle.com> Message-ID:
+1 dl
On 5/9/19 3:30 AM, Tobias Hartmann wrote: > Hi Vladimir, > > looks good.
> > Best regards, > Tobias > > On 09.05.19 05:50, Vladimir Kozlov wrote: >> http://cr.openjdk.java.net/~kvn/8223531/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8223531 >> >> In my JDK13 port of JVMCI changes (8220623) pushed recently I missed few lines from GR-13374 changes >> in graal-jvmci-8. >> >> Doug Simon found that and prepared this fix. >> >> Thanks, >> Vladimir

From vladimir.kozlov at oracle.com Thu May 9 21:49:18 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 9 May 2019 14:49:18 -0700 Subject: [13] RFR(T) 8223531: [Graal] assert(type() == T_INT) failed: type check In-Reply-To: References: <9d9ac006-41cb-721a-a9c3-9fece2625716@oracle.com> <50eda8c3-1cf7-13f9-8670-bab8db26e51a@oracle.com> Message-ID: <654cbfbc-dcd3-d1d5-6ec2-0ad69ad67183@oracle.com>
Thank you, Dean and Tobias. Vladimir K
On 5/9/19 2:03 PM, dean.long at oracle.com wrote: > +1 > > dl > > On 5/9/19 3:30 AM, Tobias Hartmann wrote: >> Hi Vladimir, >> >> looks good. >> >> Best regards, >> Tobias >> >> On 09.05.19 05:50, Vladimir Kozlov wrote: >>> http://cr.openjdk.java.net/~kvn/8223531/webrev.00/ >>> https://bugs.openjdk.java.net/browse/JDK-8223531 >>> >>> In my JDK13 port of JVMCI changes (8220623) pushed recently I missed few lines from GR-13374 changes >>> in graal-jvmci-8. >>> >>> Doug Simon found that and prepared this fix. >>> >>> Thanks, >>> Vladimir >

From OGATAK at jp.ibm.com Fri May 10 06:30:05 2019 From: OGATAK at jp.ibm.com (Kazunori Ogata) Date: Fri, 10 May 2019 15:30:05 +0900 Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8158232 Message-ID:
Hi, May I get a review for the backport of 8158232: PPC64: improve byte, int and long array copy stubs by using VSX instructions?
This changeset appears to have no conflicts with the latest jdk8u-dev code, but the patch command failed to apply it. It seems the patch command could not find the code regions to patch because stubGenerator_ppc.cpp has sets of similar (but slightly different) functions.
I created a new webrev mainly to update line numbers in the patch file. I verified I can build fastdebug and release builds and there was no degradation in "make test" results.
http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8158232/webrev.02/
Regards, Ogata

From OGATAK at jp.ibm.com Fri May 10 06:55:51 2019 From: OGATAK at jp.ibm.com (Kazunori Ogata) Date: Fri, 10 May 2019 15:55:51 +0900 Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8158232 In-Reply-To: References: Message-ID:
Sorry, I forgot to put the links to the bug report and the original changeset. I also forgot to mention that this changeset is needed to backport AES intrinsics support [1] on ppc64 big-endian.
Bug report: https://bugs.openjdk.java.net/browse/JDK-8158232
Original changeset: http://hg.openjdk.java.net/jdk/jdk/rev/987528901b83
Webrev: http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8158232/webrev.02/
Refs: [1] https://bugs.openjdk.java.net/browse/JDK-8188868
Regards, Ogata
"hotspot-compiler-dev" wrote on 2019/05/10 15:30:05: > From: "Kazunori Ogata" > To: hotspot-compiler-dev at openjdk.java.net, jdk8u-dev at openjdk.java.net > Date: 2019/05/10 15:31 > Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8158232 > Sent by: "hotspot-compiler-dev" > > Hi, > > May I get review for backport of 8158232: PPC64: improve byte, int and > long array copy stubs by using VSX instructions? > > This changeset looks no conflict with the latest jdk8u-dev code, but the > patch command failed to apply it. It seems the patch command lost the > code regions to apply patches because stubGenerator_ppc.cpp has sets of > similar (but slightly different) functions. > > I created new webrev mainly to update line numbers in the patch file. I > verified I can build fastdebug and release builds and there was no > degradation in "make test" results.
> > http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8158232/webrev.02/ > > Regards, > Ogata > >

From lutz.schmidt at sap.com Fri May 10 12:23:11 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Fri, 10 May 2019 12:23:11 +0000 Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output In-Reply-To: <94F02B34-94C4-4E88-9854-28863704921C@oracle.com> References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> <94F02B34-94C4-4E88-9854-28863704921C@oracle.com> Message-ID: <442D899C-C8FF-478C-ABD0-2F73AEA4EF3A@sap.com>
Thank you, John! This soul balm feels good. Regards, Lutz
On 09.05.19, 23:02, "John Rose" wrote: On May 8, 2019, at 8:31 AM, Schmidt, Lutz wrote: > > may I please request comments and reviews for this change? Thank you! Short comment: This is great, thank you! I wrote the hsdis stuff long ago and am overjoyed to see it get this level of thoughtful care. - John

From lutz.schmidt at sap.com Fri May 10 15:44:12 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Fri, 10 May 2019 15:44:12 +0000 Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output In-Reply-To: References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> Message-ID: <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com>
Thank you, Vladimir! Please find my comments inline and let me know what you think. A new webrev with all the updates is here: https://cr.openjdk.java.net/~lucy/webrevs/8213084.02/
Please note: the webrev is not based on the most current jdk/jdk! I do not like the idea to "hg pull -u" to a repo state which is known to be broken. Once jdk/jdk is repaired, I will update the webrev in-place (provided there were no serious clashes) and send a short note.
Regards, Lutz
On 09.05.19, 21:30, "Vladimir Kozlov" wrote: Hi Lutz, Thank you for doing this great work. I have just small comments:
x86_64.ad - empty change.
File contains whitespace changes for formatting. Not visible in webrev.
nmethod.cpp - LUCY?
+ st->print_cr("LUCY: NULL-oop");
+ tty->print("LUCY NULL-oop");
Oops. Leftover debugging output. Removed. Reads "NULL-oop" now.

nmethod.cpp - use PTR64_FORMAT instead of '0x%016lx'.
Changed.

vmreg.cpp - Use INTPTR_FORMAT instead of %ld for value().
Changed.

disassembler.* - LUCY_OBSOLETE?
+#if defined(LUCY_OBSOLETE) // Used in SAPJVM only
This is fancy code to step backwards in CISC instructions. Used to print a +/- range around a given instruction address. Works reasonably well on s390, will probably not work at all for x86. I could not finally decide to kick it out. But now I did. It's gone.

compilerDefinitions.hpp - I don't see where tier_digit() is used.
I'm surprised myself. Introduced it and then made it obsolete. It's gone.

disassembler.cpp - PrintAssemblyOptions. Why you need to have 'hsdis-' in all options values? You need to check for invalid value and print help output in such case - it will be very useful if you forgot a value spelling. Also add line for 'help' value.
The hsdis- prefix existed before I started my work. I just kept it to not hurt anybody's feelings. Actually, the prefix has a minor practical use. It guards the many "if (strstr(..." instructions from being executed if there is no use. I'm personally not emotionally attached to the hsdis- prefix. I can remove it if you (and the other reviewers) like. Not changed as of now. Awaiting your input.
Printing help text: There is an option (hsdis-help) to request help text printout.
Options parsing doesn't exist here. It's just string comparisons. If one of the predefined strings is found - fine. If not - so what. If you would like to detect unrecognized input, process_options() needs significantly more intelligence. I can do that, but would like to do it in a separate effort. Your opinion?

Do you need next commented lines:
disassembler.cpp -
+// ptrdiff_t _offset;
Deleted.
+// Output suppressed because it messes up disassembly.
+// output()->print_cr("[Disassembling for mach='%s']", (const char*)arg); Uncommented, would like to keep it. Made the if condition permanently false. disassembler_s390.cpp - +// st->fill_to(((st->position()+3*tsize-1)/tsize)*tsize); Deleted. compile.cpp - +// st->print("# "); _tf->dump_on(st); st->cr(); Uncommented. abstractDisassembler.cpp - // st->print("0x%016lx", *((julong*)here)); st->print("0x%016lx", *((uintptr_t*)here)); // st->print("0x%08x%08x", *((juint*)here), *((juint*)(here+4))); Commented lines are gone. abstractDisassembler.cpp - may be explicit cast (byte*)?: st->print("%2.2x", *byte); st->print("%2.2x", *pos); st->print("0x%02x", *here); Didn't see the need because the pointers are char* (= address) anyway. And, according to cppreference.com, std::byte is a C++17 feature. We are not there yet. PTR64_FORMAT ?: st->print("0x%016lx", *((uintptr_t*)here)); I'm kind of hesitant on that. Nice output alignment clearly depends on this to output exactly 18 characters. Changed other occurrences, so I changed this one as well. Thanks, Vladimir On 5/8/19 8:31 AM, Schmidt, Lutz wrote: > Dear Community, > > may I please request comments and reviews for this change? Thank you! > > I have created a new webrev which is based on the current jdk/jdk repo. There was some merge effort. The code which constitutes this patch was not changed. Here's the webrev link: > https://cr.openjdk.java.net/~lucy/webrevs/8213084.01/ > > Regards, > Lutz > > On 11.04.19, 23:24, "Schmidt, Lutz" wrote: > > Dear All, > > this topic was discussed back in Nov/Dec 2018: > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-November/031552.html > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-December/031641.html > > Purpose of the discussion was to find out if my ideas are at all regarded useful and desirable. > The result was mixed, some pro, some con. I let the input from back then influence my work of the last months. 
In particular, output verbosity can be controlled in a wide range now. In addition to the general -XX:+Print* switches, the amount of output can be adjusted by newly introduced -XX:PrintAssemblyOptions. Here is the list (with default settings): > > PrintAssemblyOptions help: > hsdis-print-raw test plugin by requesting raw output (deprecated) > hsdis-print-raw-xml test plugin by requesting raw xml (deprecated) > hsdis-print-pc turn off PC printing (on by default) (deprecated) > hsdis-print-bytes turn on instruction byte output (deprecated) > > hsdis-show-pc toggle printing current pc, currently ON > hsdis-show-offset toggle printing current offset, currently OFF > hsdis-show-bytes toggle printing instruction bytes, currently OFF > hsdis-show-data-hex toggle formatting data as hex, currently ON > hsdis-show-data-int toggle formatting data as int, currently OFF > hsdis-show-data-float toggle formatting data as float, currently OFF > hsdis-show-structs toggle compiler data structures, currently OFF > hsdis-show-comment toggle instruction comments, currently OFF > hsdis-show-block-comment toggle block comments, currently OFF > hsdis-align-instr toggle instruction alignment, currently OFF > > Finally, I have pushed my changes to a state where I can dare to request your comments and reviews. I would like to suggest and request that we first focus on the effects (i.e. the generated output) of the changes. Once we got that adjusted and accepted, we can check the actual implementation and add improvements there. Sounds like a plan? Here is what you get: > > The machine code generated by the JVM can be printed in three different formats: > - Hexadecimal. > This is basically a hex dump of the memory range containing the code. > This format is always available (PRODUCT and not-PRODUCT builds), regardless > of the availability of a disassembler library. It applies to all sorts of > code, be it blobs, stubs, compiled nmethods, ... 
> This format seems useless at first glance, but it is not. In an upcoming, > separate enhancement, the JVM will be made capable of reading files > containing such code blocks and disassembling them post mortem. The most > prominent example is an hs_err* file. > - Disassembled. > This is an assembly listing of the instructions as found in the memory range > occupied by the blob, stub, compiled nmethod ... As a prerequisite, a suitable > disassembler library (hsdis-.so) must be available at runtime. > Most often, that will only be the case in test environments. If no disassembler > library is available, hexadecimal output is used as fallback. > - OptoAssembly. > This is a meta code listing created only by the C2 compiler. As it is somewhat > closer to the Java code, it may be helpful in linking assembly code to Java code. > > All three formats can be merged with additional information, most prominently compiler-internal "knowledge" about blocks, related bytecodes, statistics counters, and much more. > > Following the code itself, compiler-internal data structures, like oop maps, relocations, scopes, dependencies, exception handlers, are printed to aid in debugging. > > The full set of information is available in non-PRODUCT builds. PRODUCT builds do not support OptoAssembly output. Data structures are unavailable as well. > > So how does the output actually look like? Here are a few small snippets (linuxx86_64) to give you an idea. The complete output of an entire C2-compiled method, in multiple verbosity variants, is available here: > http://cr.openjdk.java.net/~lucy/webrevs/8213084/ > > OptoAssembly output for reference (always on with PrintAssembly): > ================================================================= > > 036 B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > 036 movl RBP, [RSI + #12 (8-bit)] # compressed ptr ! 
Field: java/lang/String.value (constant) > 039 movl R11, [RBP + #12 (8-bit)] # range > 03d NullCheck RBP > > 03d B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > 03d cmpl RDX, R11 # unsigned > 040 jnb,us B6 P=0.000000 C=5375.000000 > > PrintAssembly with no disassembler library available: > ===================================================== > > [Code] > [Entry Point] > 0x00007fc74d1d7b20: 448b 5608 49c1 e203 493b c20f 856f 69e7 ff90 9090 9090 9090 9090 9090 9090 9090 > [Verified Entry Point] > 0x00007fc74d1d7b40: 8984 2400 a0fe ff55 4883 ec20 440f be5e 1445 85db 7521 8b6e 0c44 8b5d 0c41 3bd3 > 0x00007fc74d1d7b60: 732c 0fb6 4415 1048 83c4 205d 4d8b 9728 0100 0041 8502 c348 8bee 8914 2444 895c > 0x00007fc74d1d7b80: 2404 be4d ffff ffe8 1483 e7ff 0f0b bee5 ffff ff89 5424 04e8 0483 e7ff 0f0b bef6 > 0x00007fc74d1d7ba0: ffff ff89 5424 04e8 f482 e7ff 0f0b f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 > [Exception Handler] > 0x00007fc74d1d7bc0: e95b 0df5 ffe8 0000 0000 4883 2c24 05e9 0c7d e7ff > [End] > > PrintAssembly with minimal verbosity: > ===================================== > > 0x00007f0434b89bd6: mov 0xc(%rsi),%ebp > 0x00007f0434b89bd9: mov 0xc(%rbp),%r11d > 0x00007f0434b89bdd: cmp %r11d,%edx > 0x00007f0434b89be0: jae 0x00007f0434b89c0e > > PrintAssembly (previous plus code offsets from code begin): > =========================================================== > > 0x00007f63c11d7956 (+0x36): mov 0xc(%rsi),%ebp > 0x00007f63c11d7959 (+0x39): mov 0xc(%rbp),%r11d > 0x00007f63c11d795d (+0x3d): cmp %r11d,%edx > 0x00007f63c11d7960 (+0x40): jae 0x00007f63c11d798e > > PrintAssembly (previous plus block comments): > =========================================================== > > ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > 0x00007f48211d76d6 (+0x36): mov 0xc(%rsi),%ebp > 0x00007f48211d76d9 (+0x39): mov 0xc(%rbp),%r11d > ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > 0x00007f48211d76dd (+0x3d): cmp %r11d,%edx > 0x00007f48211d76e0 (+0x40): jae 0x00007f48211d770e > > 
PrintAssembly (previous plus instruction comments): > =========================================================== > > ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > 0x00007fc3e11d7a56 (+0x36): mov 0xc(%rsi),%ebp ;*getfield value {reexecute=0 rethrow=0 return_oop=0} > ; - java.lang.String::charAt@8 (line 702) > 0x00007fc3e11d7a59 (+0x39): mov 0xc(%rbp),%r11d ; implicit exception: dispatches to 0x00007fc3e11d7a9e > ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > 0x00007fc3e11d7a5d (+0x3d): cmp %r11d,%edx > 0x00007fc3e11d7a60 (+0x40): jae 0x00007fc3e11d7a8e > > For completeness, here are the links to > Bug: https://bugs.openjdk.java.net/browse/JDK-8213084 > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8213084.00/ > > But please, as mentioned above, first focus on the output. The nitty details of the implementation I would like to discuss after the output format has received some support. > > Thank you so much for your time! > Lutz

From sergey.kuksenko at oracle.com Fri May 10 18:47:38 2019 From: sergey.kuksenko at oracle.com (Sergey Kuksenko) Date: Fri, 10 May 2019 11:47:38 -0700 Subject: RFR: 8223504: improve performance of forall loops by better inlining of "iterator()" methods. In-Reply-To: <359da83e-883b-de8a-0525-21ce4c797249@oracle.com> References: <58486996-d7da-30ab-77c2-b590395423c2@oracle.com> <92d61151-97ac-565a-1bfe-d25dd5ea1048@redhat.com> <359da83e-883b-de8a-0525-21ce4c797249@oracle.com> Message-ID: <294c512c-a613-7679-0b90-61e2fe015d3c@oracle.com>
Let me give a broader description. When HotSpot decides the "ultimate question of compilation, optimization and everything" (to inline or not to inline), there are two key parts to that decision: a check of sizes (callee and caller) and a check of frequencies (invocation count). The frequency check is reasonable: why should we inline a rarely invoked method? But sometimes we lose optimization opportunities because of it. Let's narrow the scenario. We have a loop and a method invocation before the loop.
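A minimal Java sketch of this scenario, assuming the usual javac desugaring of a for-each loop over an Iterable. The class and method names (ForEachSketch, sumForEach, sumDesugared) are made up for illustration and are not part of the webrev; the comments about counters restate the argument of this message, they are not measurements.

```java
import java.util.Iterator;
import java.util.List;

public class ForEachSketch {
    // A for-each ("forall") loop as it is written in source code.
    static int sumForEach(Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Roughly the shape the loop above is compiled to (hypothetical hand-written equivalent).
    static int sumDesugared(Iterable<Integer> values) {
        int sum = 0;
        // Loop prolog: iterator() is invoked once per call of this method,
        // so this call site looks cold compared to the loop body.
        Iterator<Integer> it = values.iterator();
        // Loop body: hasNext()/next() run once per element and drive the backedge counter.
        while (it.hasNext()) {
            sum += it.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(1, 2, 3, 4);
        System.out.println(sumForEach(data));
        System.out.println(sumDesugared(data));
    }
}
```

Both methods compute the same sum; the second form only makes visible that the iterator() call sits outside the part of the method that the backedge counter observes.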
Inlining that method is vital for the loop's performance. I see at least two key optimizations here: constant propagation and scalar replacement, maybe more. But if the loop has a large enough number of iterations -> HotSpot sees large backedge counters -> which means the prolog is considered relatively cold code (a small invocation count) -> that method (potentially vital for performance) is not inlined (due to the frequency/MinInliningThreshold cutoff). We can't say whether inlining is important until we look into the loop (or even whether there is a loop there). But we have to make the inlining decision before that.

So let's try to make a reasonable heuristic and narrow the scenario again. Limit our sight to Iterators. There is a very high probability that an Iterable::iterator() invocation is followed by a loop (this covers all for-all loops). There is also a high correlation between collection size and the number of loop iterations. Let's inline all iterators.

I don't think the idea of analyzing whether the "returned Iterator is a freshly-allocated instance" makes sense. First of all, it's an unnecessary complication. Moreover, I have results for chains of iterators where HotSpot can't inline the whole chain due to an absent profile (and/or profile pollution), but partial inlining of the chain has still shown performance benefits. To predict more effectively whether a particular inlining is important, we should look not into the method but at the usage of the method's result (into the loop).

About the first comment (too broad or too narrow a check): I have to note that this fix doesn't force inlining of all methods named "iterator". The fix only bypasses the frequency cutoff. All other checks (by size) are still in place. I made the check broader for two reasons: to simplify the modification and to have it apply more widely when it works. I could narrow it if you insist, but at the same time I think we should make the check broader still: don't look at the method name at all. If you have something
that returns an Iterator, there will be a loop after it with very high probability. So I'd vote for making the check wider: look only at the return type.

On 5/8/19 3:10 PM, Vladimir Ivanov wrote: >> http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.01/ > returned Iterator is a freshly-allocated instance > src/hotspot/share/opto/bytecodeInfo.cpp: > > +  if (callee_method->name() == ciSymbol::iterator_name()) { > +    if > (callee_method->signature()->return_type()->is_subtype_of(C->env()->Iterator_klass())) > { > +      return true; > +    } > +  } > > The check looks too broad for me: it returns true for any method with > a name "iterator" which returns an instance of Iterator which is much > broader than just overrides/overloads of Iterable::iterator(). > > Can you elaborate, please, why did you decide to extend the check for > non-Iterables? > > Commenting on the general approach, it looks like a good candidate for > a first-line filter before performing a more extensive analysis. I'd > prefer to see BCEscapeAnalyzer extended to determine that returned > Iterator is a freshly-allocated instance and decide whether to inline > or not based on that instead. Among java.util classes you mentioned > most iterators are trivial, so even naive analysis should get decent > results. > > And then the analysis can be applied to any method which returns an > Object to see whether EA may benefit from inlining. > > What do you think? > > Best regards, > Vladimir Ivanov > >> On 5/7/19 11:56 AM, Aleksey Shipilev wrote: >>> On 5/7/19 8:39 PM, Sergey Kuksenko wrote: >>>> Hi All, >>>> >>>> I would like to ask for review the following change/update: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8223504 >>>> >>>> http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.00/ >>> The idea sounds fine. >>> >>> Nits (the usual drill): >>> >>>   *) Copyright years need to be updated, at least in bytecodeInfo.cpp >>> >>>
*) Do we need to put Iterator_klass initialization this early in >>> WK_KLASSES_DO? It feels safer to >>> initialize it at the end, to avoid surprising bootstrap issues. >>> >>>   *) Backslash indent is off here in vmSymbols.hpp: >>> >>>   129   template(java_util_Iterator, >>> "java/util/Iterator")               \ >>> >>>   *) Space after "if"? Also, I think you can use >>> ciType::is_subtype_of instead here. Plus, since you >>> declared iterator in WK klasses, SystemDictionary::Iterator_klass() >>> should be available. >>> >>>   100     if(retType->is_klass() && >>> retType->as_klass()->is_subtype_of(C->env()->Iterator_klass())) { >>>

From vladimir.kozlov at oracle.com Fri May 10 21:16:13 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 10 May 2019 14:16:13 -0700 Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output In-Reply-To: <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com> References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com> Message-ID: <2ffc4c9c-91cb-2d04-e03e-6620d4443034@oracle.com>
Hi Lutz, My comments are inlined below.
On 5/10/19 8:44 AM, Schmidt, Lutz wrote: > Thank you, Vladimir! > Please find my comments inline and let me know what you think. > A new webrev with all the updates is here: https://cr.openjdk.java.net/~lucy/webrevs/8213084.02/
Found one more I missed last time: assembler_s390.hpp: still signed return (on other platforms it was converted to unsigned): static int instr_len(unsigned char *instr);
> Please note: the webrev is not based on the most current jdk/jdk! I do not like the idea to "hg pull -u" to a repo state which is known to be broken. Once jdk/jdk is repaired, I will update the webrev in-place (provided there were no serious clashes) and send a short note.
NP. Please provide the final webrev when you can so that I can run these changes through our testing to make sure no issues are present (especially in builds).
> Regards, > Lutz > > On 09.05.19, 21:30, "Vladimir Kozlov" wrote: > > Hi Lutz, > > Thank you for doing this great work. > > I have just small comments: > > x86_64.ad - empty change. > File contains whitespace changes for formatting. Not visible in webrev.
Okay.
> > nmethod.cpp - LUCY? > > + st->print_cr("LUCY: NULL-oop"); > + tty->print("LUCY NULL-oop"); > Oops. Leftover debugging output. Removed. Reads "NULL-oop" now.
Okay.
> > nmethod.cpp - use PTR64_FORMAT instead of '0x%016lx'. > Changed. > > vmreg.cpp - Use INTPTR_FORMAT instead of %ld for value(). > Changed. > > disassembler.* - LUCY_OBSOLETE? > > +#if defined(LUCY_OBSOLETE) // Used in SAPJVM only > This is fancy code to step backwards in CISC instructions. Used to print a +/- range around a given instruction address. Works reasonably well on s390, will probably not work at all for x86. I could not finally decide to kick it out. But now I did. It's gone.
Okay.
> > compilerDefinitions.hpp - I don't see where tier_digit() is used. > I'm surprised myself. Introduced it and then made it obsolete. It's gone. > > disassembler.cpp - PrintAssemblyOptions. Why you need to have 'hsdis-' in all options values? You > need to check for invalid value and print help output in such case - it will be very useful if you > forgot a value spelling. Also add line for 'help' value. > > The hsdis- prefix existed before I started my work. I just kept it to not hurt anybody's feelings. Actually, the prefix has a minor practical use. It guards the many "if (strstr(..." instructions from being executed if there is no use. I'm personally not emotionally attached to the hsdis- prefix. I can remove it if you (and the other reviewers) like. Not changed as of now. Awaiting your input.
It is a pain to type long values and annoying to type the same prefix. I think hsdis- prefix is useless because PrintAssemblyOptions is used only for disassembler and there are no values which don't have hsdis- prefix.
This code is not performance-critical enough to need a guard (the prefix check). And another commented-out new line: + // ost->print_cr("PrintAssemblyOptions='%s'", options());
> > Printing help text: There is an option (hsdis-help) to request help text printout. > > Options parsing doesn't exist here. It's just string comparisons. If one of the predefined strings is found - fine. If not - so what. If you would like to detect unrecognized input, process_options() needs significantly more intelligence. I can do that, but would like to do it in a separate effort. Your opinion?
Got it. I forgot that the PrintAssemblyOptions flag accepts a string with a *list* of values - you can't use if-else or switch without complicating the code. I noticed that PrintAssemblyOptions is defined as ccstr. Why is it not ccstrlist, which should be used here? I don't think the comment at the following location is correct for the ccstr type: http://hg.openjdk.java.net/jdk/jdk/file/ef73702a906e/src/hotspot/share/compiler/disassembler.cpp#l190 It would be nice to fix it, but you can do it later if you don't want to add more changes.
> > Do you need next commented lines: > > disassembler.cpp - > +// ptrdiff_t _offset; > Deleted. > > +// Output suppressed because it messes up disassembly. > +// output()->print_cr("[Disassembling for mach='%s']", (const char*)arg); > Uncommented, would like to keep it. Made the if condition permanently false. > > disassembler_s390.cpp - > +// st->fill_to(((st->position()+3*tsize-1)/tsize)*tsize); > Deleted. > > compile.cpp - > +// st->print("# "); _tf->dump_on(st); st->cr(); > Uncommented. > > > abstractDisassembler.cpp - > // st->print("0x%016lx", *((julong*)here)); > st->print("0x%016lx", *((uintptr_t*)here)); > // st->print("0x%08x%08x", *((juint*)here), *((juint*)(here+4))); > Commented lines are gone.
> > abstractDisassembler.cpp - may be explicit cast (byte*)?: > > st->print("%2.2x", *byte); > st->print("%2.2x", *pos); > st->print("0x%02x", *here); > Didn't see the need because the pointers are char* (= address) anyway. And, according to cppreference.com, std::byte is a C++17 feature. We are not there yet. okay > > PTR64_FORMAT ?: > st->print("0x%016lx", *((uintptr_t*)here)); > I'm kind of hesitant on that. Nice output alignment clearly depends on this to output exactly 18 characters. Changed other occurrences, so I changed this one as well. Thanks, Vladimir > > > Thanks, > Vladimir > > On 5/8/19 8:31 AM, Schmidt, Lutz wrote: > > Dear Community, > > > > may I please request comments and reviews for this change? Thank you! > > > > I have created a new webrev which is based on the current jdk/jdk repo. There was some merge effort. The code which constitutes this patch was not changed. Here's the webrev link: > > https://cr.openjdk.java.net/~lucy/webrevs/8213084.01/ > > > > Regards, > > Lutz > > > > On 11.04.19, 23:24, "Schmidt, Lutz" wrote: > > > > Dear All, > > > > this topic was discussed back in Nov/Dec 2018: > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-November/031552.html > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-December/031641.html > > > > Purpose of the discussion was to find out if my ideas are at all regarded useful and desirable. > > The result was mixed, some pro, some con. I let the input from back then influence my work of the last months. In particular, output verbosity can be controlled in a wide range now. In addition to the general -XX:+Print* switches, the amount of output can be adjusted by newly introduced -XX:PrintAssemblyOptions. 
Here is the list (with default settings):
> >
> >     PrintAssemblyOptions help:
> >       hsdis-print-raw           test plugin by requesting raw output (deprecated)
> >       hsdis-print-raw-xml       test plugin by requesting raw xml (deprecated)
> >       hsdis-print-pc            turn off PC printing (on by default) (deprecated)
> >       hsdis-print-bytes         turn on instruction byte output (deprecated)
> >
> >       hsdis-show-pc             toggle printing current pc, currently ON
> >       hsdis-show-offset         toggle printing current offset, currently OFF
> >       hsdis-show-bytes          toggle printing instruction bytes, currently OFF
> >       hsdis-show-data-hex       toggle formatting data as hex, currently ON
> >       hsdis-show-data-int       toggle formatting data as int, currently OFF
> >       hsdis-show-data-float     toggle formatting data as float, currently OFF
> >       hsdis-show-structs        toggle compiler data structures, currently OFF
> >       hsdis-show-comment        toggle instruction comments, currently OFF
> >       hsdis-show-block-comment  toggle block comments, currently OFF
> >       hsdis-align-instr         toggle instruction alignment, currently OFF
> >
> > Finally, I have pushed my changes to a state where I can dare to request your comments and reviews. I would like to suggest and request that we first focus on the effects (i.e. the generated output) of the changes. Once we got that adjusted and accepted, we can check the actual implementation and add improvements there. Sounds like a plan? Here is what you get:
> >
> > The machine code generated by the JVM can be printed in three different formats:
> >   - Hexadecimal.
> >     This is basically a hex dump of the memory range containing the code.
> >     This format is always available (PRODUCT and not-PRODUCT builds), regardless
> >     of the availability of a disassembler library. It applies to all sorts of
> >     code, be it blobs, stubs, compiled nmethods, ...
> >     This format seems useless at first glance, but it is not.
In an upcoming, > > separate enhancement, the JVM will be made capable of reading files > > containing such code blocks and disassembling them post mortem. The most > > prominent example is an hs_err* file. > > - Disassembled. > > This is an assembly listing of the instructions as found in the memory range > > occupied by the blob, stub, compiled nmethod ... As a prerequisite, a suitable > > disassembler library (hsdis-.so) must be available at runtime. > > Most often, that will only be the case in test environments. If no disassembler > > library is available, hexadecimal output is used as fallback. > > - OptoAssembly. > > This is a meta code listing created only by the C2 compiler. As it is somewhat > > closer to the Java code, it may be helpful in linking assembly code to Java code. > > > > All three formats can be merged with additional information, most prominently compiler-internal "knowledge" about blocks, related bytecodes, statistics counters, and much more. > > > > Following the code itself, compiler-internal data structures, like oop maps, relocations, scopes, dependencies, exception handlers, are printed to aid in debugging. > > > > The full set of information is available in non-PRODUCT builds. PRODUCT builds do not support OptoAssembly output. Data structures are unavailable as well. > > > > So how does the output actually look like? Here are a few small snippets (linuxx86_64) to give you an idea. The complete output of an entire C2-compiled method, in multiple verbosity variants, is available here: > > http://cr.openjdk.java.net/~lucy/webrevs/8213084/ > > > > OptoAssembly output for reference (always on with PrintAssembly): > > ================================================================= > > > > 036 B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > > 036 movl RBP, [RSI + #12 (8-bit)] # compressed ptr ! 
Field: java/lang/String.value (constant)
> > 039     movl    R11, [RBP + #12 (8-bit)]    # range
> > 03d     NullCheck RBP
> >
> > 03d B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999
> > 03d     cmpl    RDX, R11    # unsigned
> > 040     jnb,us  B6  P=0.000000 C=5375.000000
> >
> > PrintAssembly with no disassembler library available:
> > =====================================================
> >
> > [Code]
> > [Entry Point]
> >   0x00007fc74d1d7b20: 448b 5608 49c1 e203 493b c20f 856f 69e7 ff90 9090 9090 9090 9090 9090 9090 9090
> > [Verified Entry Point]
> >   0x00007fc74d1d7b40: 8984 2400 a0fe ff55 4883 ec20 440f be5e 1445 85db 7521 8b6e 0c44 8b5d 0c41 3bd3
> >   0x00007fc74d1d7b60: 732c 0fb6 4415 1048 83c4 205d 4d8b 9728 0100 0041 8502 c348 8bee 8914 2444 895c
> >   0x00007fc74d1d7b80: 2404 be4d ffff ffe8 1483 e7ff 0f0b bee5 ffff ff89 5424 04e8 0483 e7ff 0f0b bef6
> >   0x00007fc74d1d7ba0: ffff ff89 5424 04e8 f482 e7ff 0f0b f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4
> > [Exception Handler]
> >   0x00007fc74d1d7bc0: e95b 0df5 ffe8 0000 0000 4883 2c24 05e9 0c7d e7ff
> > [End]
> >
> > PrintAssembly with minimal verbosity:
> > =====================================
> >
> >   0x00007f0434b89bd6: mov 0xc(%rsi),%ebp
> >   0x00007f0434b89bd9: mov 0xc(%rbp),%r11d
> >   0x00007f0434b89bdd: cmp %r11d,%edx
> >   0x00007f0434b89be0: jae 0x00007f0434b89c0e
> >
> > PrintAssembly (previous plus code offsets from code begin):
> > ===========================================================
> >
> >   0x00007f63c11d7956 (+0x36): mov 0xc(%rsi),%ebp
> >   0x00007f63c11d7959 (+0x39): mov 0xc(%rbp),%r11d
> >   0x00007f63c11d795d (+0x3d): cmp %r11d,%edx
> >   0x00007f63c11d7960 (+0x40): jae 0x00007f63c11d798e
> >
> > PrintAssembly (previous plus block comments):
> > ===========================================================
> >
> >   ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1
> >   0x00007f48211d76d6 (+0x36): mov 0xc(%rsi),%ebp
> >   0x00007f48211d76d9 (+0x39): mov 0xc(%rbp),%r11d
> >   ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999
> >
0x00007f48211d76dd (+0x3d): cmp %r11d,%edx
> >   0x00007f48211d76e0 (+0x40): jae 0x00007f48211d770e
> >
> > PrintAssembly (previous plus instruction comments):
> > ===========================================================
> >
> >   ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1
> >   0x00007fc3e11d7a56 (+0x36): mov 0xc(%rsi),%ebp   ;*getfield value {reexecute=0 rethrow=0 return_oop=0}
> >                                                    ; - java.lang.String::charAt@8 (line 702)
> >   0x00007fc3e11d7a59 (+0x39): mov 0xc(%rbp),%r11d  ; implicit exception: dispatches to 0x00007fc3e11d7a9e
> >   ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999
> >   0x00007fc3e11d7a5d (+0x3d): cmp %r11d,%edx
> >   0x00007fc3e11d7a60 (+0x40): jae 0x00007fc3e11d7a8e
> >
> > For completeness, here are the links to
> >   Bug:    https://bugs.openjdk.java.net/browse/JDK-8213084
> >   Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8213084.00/
> >
> > But please, as mentioned above, first focus on the output. The nitty details of the implementation I would like to discuss after the output format has received some support.
> >
> > Thank you so much for your time!
> > Lutz
> >
> >

From tobias.hartmann at oracle.com Mon May 13 08:40:35 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 13 May 2019 10:40:35 +0200
Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output
In-Reply-To: <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com>
References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com>
Message-ID: <10030110-a115-0106-fdea-5b49f9503ca9@oracle.com>

Hi Lutz,

very nice work!

On 10.05.19 17:44, Schmidt, Lutz wrote:
> A new webrev with all the updates is here: https://cr.openjdk.java.net/~lucy/webrevs/8213084.02/

This looks good to me but as Vladimir suggested, it would be good to run some extended testing once the repo is fixed and you've rebased your patch.
Just noticed that there are two TODO's + disabled code in disassembler_s390.cpp, will you file a follow up bug to address these? Thanks, Tobias From lutz.schmidt at sap.com Mon May 13 10:39:49 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Mon, 13 May 2019 10:39:49 +0000 Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output In-Reply-To: <10030110-a115-0106-fdea-5b49f9503ca9@oracle.com> References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com> <10030110-a115-0106-fdea-5b49f9503ca9@oracle.com> Message-ID: <19946C68-3628-469B-9707-6AD6EEFBDEFC@sap.com> Thank you, Tobias! Answers to your questions are in my response to Vladimir's comments. I'm delaying sending that one out. I'm hoping that by this evening I can create and test a new webrev. Regards, Lutz ?On 13.05.19, 10:40, "Tobias Hartmann" wrote: Hi Lutz, very nice work! On 10.05.19 17:44, Schmidt, Lutz wrote: > A new webrev with all the updates is here: https://cr.openjdk.java.net/~lucy/webrevs/8213084.02/ This looks good to me but as Vladimir suggested, it would be good to run some extended testing once the repo is fixed and you've rebased your patch. Just noticed that there are two TODO's + disabled code in disassembler_s390.cpp, will you file a follow up bug to address these? Thanks, Tobias From ralf.schmelter at sap.com Mon May 13 15:05:52 2019 From: ralf.schmelter at sap.com (Schmelter, Ralf) Date: Mon, 13 May 2019 15:05:52 +0000 Subject: RFR (S) 8223770: code_size2 still too small in some compressed oops configurations Message-ID: Hi, this is an addition to https://bugs.openjdk.java.net/browse/JDK-8223617. It turns out, that the additionally given space is not always enough to not trigger the assertion. This is caused by some of the stub routines needing more memory (93 bytes overall), when the compressed oops mode used a non-zero disjoint base. 
To be on the safe side I've added another 200 bytes to the code_size2 constant and ran the hotspot/jtreg/gc/arguments/TestUseCompressedOopsErgo.java jtreg test, which showed the problem consistently before the change. webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8223770/webrev.0/ bugreport: https://bugs.openjdk.java.net/browse/JDK-8223770 Best regards, Ralf From martin.doerr at sap.com Mon May 13 15:30:01 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 13 May 2019 15:30:01 +0000 Subject: RFR (S) 8223770: code_size2 still too small in some compressed oops configurations In-Reply-To: References: Message-ID: Hi Ralf, looks good. Seems like you're the only user who really stresses the stub code generation in debug build on a modern Windows machine. Thanks and best regards, Martin -----Original Message----- From: hotspot-compiler-dev On Behalf Of Schmelter, Ralf Sent: Montag, 13. Mai 2019 17:06 To: hotspot-compiler-dev at openjdk.java.net Subject: [CAUTION] RFR (S) 8223770: code_size2 still too small in some compressed oops configurations Hi, this is an addition to https://bugs.openjdk.java.net/browse/JDK-8223617. It turns out, that the additionally given space is not always enough to not trigger the assertion. This is caused by some of the stub routines needing more memory (93 bytes overall), when the compressed oops mode used a non-zero disjoint base. To be on the safe side I've added another 200 bytes to the code_size2 constant and ran the hotspot/jtreg/gc/arguments/TestUseCompressedOopsErgo.java jtreg test, which showed the problem consistently before the change. 
webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8223770/webrev.0/ bugreport: https://bugs.openjdk.java.net/browse/JDK-8223770 Best regards, Ralf From gromero at linux.vnet.ibm.com Mon May 13 19:59:43 2019 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Mon, 13 May 2019 16:59:43 -0300 Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8158232 In-Reply-To: References: Message-ID: Hi Ogata, Thanks for the backport and for the webrev. I understand that offset adjustments in general, and particularly for this backport, are not considered a change that needs to be reviewed again. That said, and although I'm not a Reviewer, I tested it against SPECjvm and microbenchmarks for byte, int, and long and reviewed the change for jdk8u-dev. It looks good. Please, provide a "Fix Request" comment to the original bug explaining that the backport is low risk and affects PPC64-only, accordingly to [1] and [2]. Then please add the label "jdk8u-fix-request" to it. Once the approval to push is granted I'll sponsor the change. Thank you. Best regards, Gustavo [1] https://wiki.openjdk.java.net/display/jdk8u/Main [2] http://openjdk.java.net/projects/jdk-updates/approval.html On 05/10/2019 03:55 AM, Kazunori Ogata wrote: > Sorry, I forgot to put the links to the bug report and the original > changeset Also forgot to mention that this changeset is needed to > backport AES intrinsics support [1] on ppc64 big-endian. 
> > Bug report: > https://bugs.openjdk.java.net/browse/JDK-8158232 > > Original change set > http://hg.openjdk.java.net/jdk/jdk/rev/987528901b83 > > > Webrev: > http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8158232/webrev.02/ > > > Refs: > [1] https://bugs.openjdk.java.net/browse/JDK-8188868 > > > Regards, > Ogata > > "hotspot-compiler-dev" > wrote on 2019/05/10 15:30:05: > >> From: "Kazunori Ogata" >> To: hotspot-compiler-dev at openjdk.java.net, jdk8u-dev at openjdk.java.net >> Date: 2019/05/10 15:31 >> Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8158232 >> Sent by: "hotspot-compiler-dev" > >> >> Hi, >> >> May I get review for backport of 8158232: PPC64: improve byte, int and >> long array copy stubs by using VSX instructions? >> >> This changeset looks no conflict with the latest jdk8u-dev code, but the > >> patch command failed to apply it. It seems the patch command lost the >> code regions to apply patches because stubGenerator_ppc.cpp has sets of >> similar (but slightly different) functions. >> >> I created new webrev mainly to update line numbers in the patch file. I > >> verified I can build fastdebug and release builds and there was no >> degradation in "make test" results. >> >> http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8158232/webrev.02/ >> >> Regards, >> Ogata >> >> > > From ekaterina.pavlova at oracle.com Mon May 13 20:08:42 2019 From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova) Date: Mon, 13 May 2019 13:08:42 -0700 Subject: RFR(T): 8223235: [Graal] compiler/jsr292/NonInlinedCall/InvokeTest.java failed time out Message-ID: Hi All, please review trivial fix which adds several tests into Graal specific problem list. 
JBS: https://bugs.openjdk.java.net/browse/JDK-8223235 webrev: http://cr.openjdk.java.net/~epavlova//8223235/webrev.00/index.html thanks, -katya From vladimir.kozlov at oracle.com Mon May 13 20:40:18 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 13 May 2019 13:40:18 -0700 Subject: RFR(T): 8223235: [Graal] compiler/jsr292/NonInlinedCall/InvokeTest.java failed time out In-Reply-To: References: Message-ID: <977FB7F9-A2C4-4679-A6C6-369F093D125B@oracle.com> Looks good. Thanks Vladimir > On May 13, 2019, at 1:08 PM, Ekaterina Pavlova wrote: > > Hi All, > > please review trivial fix which adds several tests into Graal specific problem list. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8223235 > webrev: http://cr.openjdk.java.net/~epavlova//8223235/webrev.00/index.html > > thanks, > -katya > From ekaterina.pavlova at oracle.com Mon May 13 20:48:27 2019 From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova) Date: Mon, 13 May 2019 13:48:27 -0700 Subject: RFR(T): 8223235: [Graal] compiler/jsr292/NonInlinedCall/InvokeTest.java failed time out In-Reply-To: <977FB7F9-A2C4-4679-A6C6-369F093D125B@oracle.com> References: <977FB7F9-A2C4-4679-A6C6-369F093D125B@oracle.com> Message-ID: thanks Vladimir! On 5/13/19 1:40 PM, Vladimir Kozlov wrote: > Looks good. > > Thanks > Vladimir > >> On May 13, 2019, at 1:08 PM, Ekaterina Pavlova wrote: >> >> Hi All, >> >> please review trivial fix which adds several tests into Graal specific problem list. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8223235 >> webrev: http://cr.openjdk.java.net/~epavlova//8223235/webrev.00/index.html >> >> thanks, >> -katya >> > From jesper.wilhelmsson at oracle.com Tue May 14 00:19:01 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Tue, 14 May 2019 02:19:01 +0200 Subject: RFR: JDK-8223346 - Update Graal Message-ID: Hi, Please review the patch to integrate recent Graal changes into OpenJDK. 
Graal tip to integrate: 6a18d9ddacd8eecb0ae4877f687e171889939c0d Bug: https://bugs.openjdk.java.net/browse/JDK-8223346 Webrev: http://cr.openjdk.java.net/~jwilhelm/8223346/webrev.00/ This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug. Thanks, /Jesper -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From OGATAK at jp.ibm.com Tue May 14 06:02:49 2019 From: OGATAK at jp.ibm.com (Kazunori Ogata) Date: Tue, 14 May 2019 15:02:49 +0900 Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8158232 In-Reply-To: References: Message-ID: Hi Gustavo, Thank you for the suggestion. I'll proceed to put the fix request comment and tag in the original bug report. Thank you too for offering to sponsor this change. I'll let you know when it's approved. Regards, Ogata "Gustavo Romero" wrote on 2019/05/14 04:59:43: > From: "Gustavo Romero" > To: Kazunori Ogata/Japan/IBM at IBMJP, hotspot-compiler-dev at openjdk.java.net, > jdk8u-dev at openjdk.java.net > Date: 2019/05/14 04:59 > Subject: Re: [8u-dev, ppc] RFR for (almost clean) backport of 8158232 > > Hi Ogata, > > Thanks for the backport and for the webrev. > > I understand that offset adjustments in general, and particularly for this > backport, are not considered a change that needs to be reviewed again. > > That said, and although I'm not a Reviewer, I tested it against SPECjvm and > microbenchmarks for byte, int, and long and reviewed the change for jdk8u-dev. > > It looks good. > > Please, provide a "Fix Request" comment to the original bug explaining that > the backport is low risk and affects PPC64-only, accordingly to [1] and [2]. > Then please add the label "jdk8u-fix-request" to it. > > Once the approval to push is granted I'll sponsor the change. > > Thank you. 
> > Best regards, > Gustavo > > [1] https://wiki.openjdk.java.net/display/jdk8u/Main > [2] http://openjdk.java.net/projects/jdk-updates/approval.html > > On 05/10/2019 03:55 AM, Kazunori Ogata wrote: > > Sorry, I forgot to put the links to the bug report and the original > > changeset Also forgot to mention that this changeset is needed to > > backport AES intrinsics support [1] on ppc64 big-endian. > > > > Bug report: > > https://bugs.openjdk.java.net/browse/JDK-8158232 > > > > Original change set > > http://hg.openjdk.java.net/jdk/jdk/rev/987528901b83 > > > > > > Webrev: > > http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8158232/webrev.02/ > > > > > > Refs: > > [1] https://bugs.openjdk.java.net/browse/JDK-8188868 > > > > > > Regards, > > Ogata > > > > "hotspot-compiler-dev" > > wrote on 2019/05/10 15:30:05: > > > >> From: "Kazunori Ogata" > >> To: hotspot-compiler-dev at openjdk.java.net, jdk8u-dev at openjdk.java.net > >> Date: 2019/05/10 15:31 > >> Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8158232 > >> Sent by: "hotspot-compiler-dev" > > > >> > >> Hi, > >> > >> May I get review for backport of 8158232: PPC64: improve byte, int and > >> long array copy stubs by using VSX instructions? > >> > >> This changeset looks no conflict with the latest jdk8u-dev code, but the > > > >> patch command failed to apply it. It seems the patch command lost the > >> code regions to apply patches because stubGenerator_ppc.cpp has sets of > >> similar (but slightly different) functions. > >> > >> I created new webrev mainly to update line numbers in the patch file. I > > > >> verified I can build fastdebug and release builds and there was no > >> degradation in "make test" results. 
> >> > >> http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8158232/webrev.02/ > >> > >> Regards, > >> Ogata > >> > >> > > > > From david.holmes at oracle.com Tue May 14 09:44:06 2019 From: david.holmes at oracle.com (David Holmes) Date: Tue, 14 May 2019 19:44:06 +1000 Subject: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64 In-Reply-To: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com> References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com> Message-ID: <3e1ceae0-f7a9-e2e6-2b06-59a22540550d@oracle.com> Hi Vladimir, I'll be very happy to see this go in - though I do wish we had more platform coverage than just x86_64. Hopefully the other archs will jump on-board with this as well. I was initially confused by the UseFastClassInitChecks flag as I couldn't really see why you would want to turn it off (other than perhaps during testing) but I see that it is really used (as you explained to Vladimir K.) to exclude the new code for platforms which have not implemented it. Though I'm still not sure that we shouldn't have something to detect it being turned on at runtime on platforms that don't support it (it will likely crash quickly but still ...). Keep wondering if there is a better way to handle this aspect of the change ... I can't comment on the actual interpreter and compiler changes - sorry. This will need re-basing now that JDK-8219974 has been backed out. Thanks, David On 2/05/2019 9:17 am, Vladimir Ivanov wrote: > http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8223213 > > (It's a followup RFR on a earlier RFC [1].) > > Recent changes severely affected how static initializers are executed > and for long-running initializers it manifested as a severe slowdown. > As an example, it led to a 3x slowdown on some Clojure applications > (JDK-8219233 [2]). The root cause is that until a class is fully > initialized, every invocation of static method on it goes through method > resolution. 
>
> Proposed fix introduces fast class initialization barriers for C1, C2,
> and template interpreter on x86-64. I did some experiments with
> cross-platform approaches, but haven't got satisfactory results.
>
> On other platforms, behavior stays (mostly) intact. (I had to revert
> some changes introduced by JDK-8219492 [3], since the assumptions they
> rely on about accesses inside a class don't hold in all cases.)
>
> The barrier is as simple as:
>    if (holder->is_not_initialized() &&
>        !holder->is_reentrant_initialization(current_thread)) {
>      // trigger call site re-resolution and block there
>    }
>
> There are 3 places where barriers are added:
>   * in template interpreter for invokestatic bytecode;
>   * at nmethod verified entry point (for normal compilations);
>   * c2i adapters;
>
> For template interpreter, there's additional check added into
> TemplateTable::resolve_cache_and_index which calls into
> InterpreterRuntime::resolve_from_cache when fast path checks fail.
>
> In case of nmethods, the barrier is put before frame construction, so
> existing compiler runtime routines can be reused
> (SharedRuntime::get_handle_wrong_method_stub()).
>
> Also, C2 has a guard on entry (Parse::clinit_deopt()) which triggers
> nmethod recompilation once the class is fully initialized.
>
> OSR compilations don't need a barrier.
>
> Correspondence between barriers and transitions they cover:
>   (1) from interpreter (barrier on caller side)
>        * all transitions: interpreter, compiled (i2c), native, aot, ...
>
>   (2) from compiled (barrier on callee side)
>        to compiled, to native (barrier in native wrapper on entry)
>
>   (3) c2i bypasses both barriers (interpreter and compiled) and
> requires a dedicated barrier in c2i
>
>   (4) to Graal/AOT code:
>         from interpreter: covered by interpreter barrier
>
from compiled: call site patching is disabled, leading to
> repeated call site resolution until method holder is fully initialized
> (original behavior).
>
> Performance experiments with clojure [2] demonstrated that the fix
> almost completely recuperates the regression:
>
>   (1) always reresolve (w/o the fix):     ~12,0s ( 1x)
>   (2) C1/C2 barriers only:                 ~3,8s (~3x)
>   (3) int/C1/C2 barriers:                  ~3,2s (-20%)
> --------
>   (4) barriers disabled for invokestatic   ~3,2s
>
> I deliberately tried to keep the patch backport-friendly for 8u/11u/12u
> and refrained from using newer features like nmethod barriers introduced
> recently. The fix can be refactored later specifically for 13 as a
> followup change.
>
> Testing: clojure startup, tier1-5
>
> Thanks!
>
> Best regards,
> Vladimir Ivanov
>
> [1]
> https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-April/037760.html
> [2] https://bugs.openjdk.java.net/browse/JDK-8219233
> [3] https://bugs.openjdk.java.net/browse/JDK-8219492

From lutz.schmidt at sap.com Tue May 14 10:47:07 2019
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Tue, 14 May 2019 10:47:07 +0000
Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management
Message-ID:

Dear all,

May I please request reviews for my change?
Bug:    https://bugs.openjdk.java.net/browse/JDK-8223444
Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/

What this change is all about:
------------------------------
While working on another topic, I came across the code in share/memory/heap.cpp. I applied some small changes which I would call improvements.

Furthermore, and in particular with these changes, the platform-specific parameter CodeCacheMinBlockLength should be fine-tuned to minimize the number of residual small free blocks. Heap block allocation does not create free blocks smaller than CodeCacheMinBlockLength. This parameter value should match the minimal requested heap block size.
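The allocation rule described above (never leave a free block smaller than CodeCacheMinBlockLength behind) can be sketched with a toy model. This is not the code in heap.cpp: blocks are reduced to their length in segments, the first-fit search is an assumption, and the names FreeList, min_block_length and allocate are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Simplified model (not the HotSpot implementation): a free block is
// represented only by its length, measured in segments.
struct FreeList {
  std::list<std::size_t> blocks;  // lengths of the free blocks
  std::size_t min_block_length;   // stands in for CodeCacheMinBlockLength

  // Returns the number of segments handed out, or 0 if no block fits.
  std::size_t allocate(std::size_t length) {
    assert(length > 0 && "request must be non-empty");
    for (auto it = blocks.begin(); it != blocks.end(); ++it) {
      if (*it < length) continue;  // too small, keep searching
      const std::size_t remainder = *it - length;
      if (remainder < min_block_length) {
        // Splitting would leave a free block below the minimum,
        // so the whole block is handed out instead.
        const std::size_t granted = *it;
        blocks.erase(it);
        return granted;
      }
      *it = remainder;  // split: the tail stays on the free list
      return length;
    }
    return 0;
  }
};
```

With a free list of {10, 5} and a minimum of 4, a request for 8 segments receives the whole 10-segment block, because a 2-segment remainder would fall below the minimum. A request for 1 segment from the 5-segment block is split and leaves a 4-segment free block behind.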
If CodeCacheMinBlockLength is too small, such free blocks will never be re-allocated. The only chance for them to vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the beginning of) the free list, slowing down the free block search.

The following free block counts have been found after running JVM98 with different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown.

JDK-8223444 not applied
=======================

         Segment | free blocks with CodeCacheMinBlockLength=
            Size |     1     2     3     4     6     8
-----------------+-------------------------------------------
aarch        128 |     0   153    75    30    38     2
ppc          128 |     0   149    98    59    14     2
ppcle        128 |     0   219   161   110    69    34
s390         256 |     0   142    93    59    30    10
x86          128 |     0   215   157   118    42    11

JDK-8223444 applied
===================

         Segment | free blocks with CodeCacheMinBlockLength=   | suggested
            Size |     1     2     3     4     6     8         |  setting
-----------------+---------------------------------------------+------------
aarch        128 |   221   115    80    36     7     1         |     6
ppc          128 |   245   152   101    54    14     4         |     6
ppcle        128 |   243   144    89    72    20     5         |     6
s390         256 |   168    60    67     8     6     2         |     4
x86          128 |   223   139    83    50    11     2         |     6

Thank you for your time and opinion!
Lutz

From david.holmes at oracle.com Tue May 14 12:17:51 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 14 May 2019 22:17:51 +1000
Subject: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64
In-Reply-To: <3e1ceae0-f7a9-e2e6-2b06-59a22540550d@oracle.com>
References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com> <3e1ceae0-f7a9-e2e6-2b06-59a22540550d@oracle.com>
Message-ID:

Forgot to mention that your new test doesn't look like it will play nicely when run with Graal enabled, so you may need to split up into different @test sections and add "@requires !vm.graal.enabled" to exclude graal.
David On 14/05/2019 7:44 pm, David Holmes wrote: > Hi Vladimir, > > I'll be very happy to see this go in - though I do wish we had more > platform coverage than just x86_64. Hopefully the other archs will jump > on-board with this as well. > > I was initially confused by the UseFastClassInitChecks flag as I > couldn't really see why you would want to turn it off (other than > perhaps during testing) but I see that it is really used (as you > explained to Vladimir K.) to exclude the new code for platforms which > have not implemented it. Though I'm still not sure that we shouldn't > have something to detect it being turned on at runtime on platforms that > don't support it (it will likely crash quickly but still ...). Keep > wondering if there is a better way to handle this aspect of the change ... > > I can't comment on the actual interpreter and compiler changes - sorry. > > This will need re-basing now that JDK-8219974 has been backed out. > > Thanks, > David > > On 2/05/2019 9:17 am, Vladimir Ivanov wrote: >> http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8223213 >> >> (It's a followup RFR on a earlier RFC [1].) >> >> Recent changes severely affected how static initializers are executed >> and for long-running initializers it manifested as a severe slowdown. >> As an example, it led to a 3x slowdown on some Clojure applications >> (JDK-8219233 [2]). The root cause is that until a class is fully >> initialized, every invocation of static method on it goes through >> method resolution. >> >> Proposed fix introduces fast class initialization barriers for C1, C2, >> and template interpreter on x86-64. I did some experiments with >> cross-platform approaches, but haven't got satisfactory results. >> >> On other platforms, behavior stays (mostly) intact. (I had to revert >> some changes introduced by JDK-8219492 [3], since the assumptions they >> rely on about accesses inside a class don't hold in all cases.) 
>>
>> The barrier is as simple as:
>>     if (holder->is_not_initialized() &&
>>         !holder->is_reentrant_initialization(current_thread)) {
>>       // trigger call site re-resolution and block there
>>     }
>>
>> There are 3 places where barriers are added:
>>    * in template interpreter for invokestatic bytecode;
>>    * at nmethod verified entry point (for normal compilations);
>>    * c2i adapters;
>>
>> For template interpreter, there's additional check added into
>> TemplateTable::resolve_cache_and_index which calls into
>> InterpreterRuntime::resolve_from_cache when fast path checks fail.
>>
>> In case of nmethods, the barrier is put before frame construction, so
>> existing compiler runtime routines can be reused
>> (SharedRuntime::get_handle_wrong_method_stub()).
>>
>> Also, C2 has a guard on entry (Parse::clinit_deopt()) which triggers
>> nmethod recompilation once the class is fully initialized.
>>
>> OSR compilations don't need a barrier.
>>
>> Correspondence between barriers and transitions they cover:
>>    (1) from interpreter (barrier on caller side)
>>         * all transitions: interpreter, compiled (i2c), native, aot, ...
>>
>>    (2) from compiled (barrier on callee side)
>>         to compiled, to native (barrier in native wrapper on entry)
>>
>>    (3) c2i bypasses both barriers (interpreter and compiled) and
>> requires a dedicated barrier in c2i
>>
>>    (4) to Graal/AOT code:
>>          from interpreter: covered by interpreter barrier
>>          from compiled: call site patching is disabled, leading to
>> repeated call site resolution until method holder is fully initialized
>> (original behavior).
>>
>> Performance experiments with clojure [2] demonstrated that the fix
>> almost completely recuperates the regression:
>>
>>    (1) always reresolve (w/o the fix):     ~12,0s ( 1x)
>>    (2) C1/C2 barriers only:                 ~3,8s (~3x)
>>    (3) int/C1/C2 barriers:                  ~3,2s (-20%)
>> --------
>>
(4) barriers disabled for invokestatic? ~3,2s >> >> I deliberately tried to keep the patch backport-friendly for >> 8u/11u/12u and refrained from using newer features like nmethod >> barriers introduced recently. The fix can be refactored later >> specifically for 13 as a followup change. >> >> Testing: clojure startup, tier1-5 >> >> Thanks! >> >> Best regards, >> Vladimir Ivanov >> >> [1] >> https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-April/037760.html >> >> [2] https://bugs.openjdk.java.net/browse/JDK-8219233 >> [3] https://bugs.openjdk.java.net/browse/JDK-8219492 From shade at redhat.com Tue May 14 18:14:50 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 14 May 2019 20:14:50 +0200 Subject: RFR/RFC (XS) 8223911: Disable bad node budget verification until the fix Message-ID: <7784d166-6061-5495-454f-b6ec62eaefbd@redhat.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8223911 There is a bug that Patric is handling now. However, fastdebug is broken for more than a week, and there is no sight of the fix yet. Having this false negative failure is detrimental for testing, especially given impending fork to 13. Backing out the change seems too harsh. Let's just disable the assert? Fix: diff -r 809cbe8565a1 -r ee3c58aebd44 src/hotspot/share/opto/loopnode.hpp --- a/src/hotspot/share/opto/loopnode.hpp Tue May 14 20:05:43 2019 +0200 +++ b/src/hotspot/share/opto/loopnode.hpp Tue May 14 20:06:51 2019 +0200 @@ -1384,7 +1384,8 @@ uint required = _nodes_required; require_nodes_final(); uint delta = C->live_nodes() - live_at_begin; - assert(delta <= 2 * required, "Bad node estimate (actual: %d, request: %d)", + // Assert is disabled, see JDK-8223911 and related issues. 
+    assert(true || delta <= 2 * required, "Bad node estimate (actual: %d, request: %d)",
            delta, required);
   }

Testing: Linux x86_64 fastdebug, regression test from JDK-8223502 (I could have added it, but then
this fix would not be as trivial and would require passing jdk-submit), our larger workloads that
used to break

--
Thanks,
-Aleksey

From vladimir.kozlov at oracle.com  Tue May 14 18:20:08 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 14 May 2019 11:20:08 -0700
Subject: RFR/RFC (XS) 8223911: Disable bad node budget verification until the fix
In-Reply-To: <7784d166-6061-5495-454f-b6ec62eaefbd@redhat.com>
References: <7784d166-6061-5495-454f-b6ec62eaefbd@redhat.com>
Message-ID: <6D75E3CE-288B-457F-BA63-864CFF5F2216@oracle.com>

I agree.

Thanks
Vladimir

> On May 14, 2019, at 11:14 AM, Aleksey Shipilev wrote:
>
> Bug:
>   https://bugs.openjdk.java.net/browse/JDK-8223911
>
> There is a bug that Patric is handling now. However, fastdebug has been broken for more than a week, and
> no fix is in sight yet. Having this false negative failure is detrimental for testing,
> especially given the impending fork to 13. Backing out the change seems too harsh. Let's just disable
> the assert?
>
> Fix:
>
> diff -r 809cbe8565a1 -r ee3c58aebd44 src/hotspot/share/opto/loopnode.hpp
> --- a/src/hotspot/share/opto/loopnode.hpp Tue May 14 20:05:43 2019 +0200
> +++ b/src/hotspot/share/opto/loopnode.hpp Tue May 14 20:06:51 2019 +0200
> @@ -1384,7 +1384,8 @@
>      uint required = _nodes_required;
>      require_nodes_final();
>      uint delta = C->live_nodes() - live_at_begin;
> -    assert(delta <= 2 * required, "Bad node estimate (actual: %d, request: %d)",
> +    // Assert is disabled, see JDK-8223911 and related issues.
> +    assert(true || delta <= 2 * required, "Bad node estimate (actual: %d, request: %d)",
>             delta, required);
>    }
>
> Testing: Linux x86_64 fastdebug, regression test from JDK-8223502 (I could have added it, but then
> this fix would not be as trivial and would require passing jdk-submit), our larger workloads that
> used to break
>
> --
> Thanks,
> -Aleksey
>
>

From vladimir.kozlov at oracle.com  Tue May 14 18:53:59 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 14 May 2019 11:53:59 -0700
Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management
In-Reply-To:
References:
Message-ID: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com>

Good.

Do we need to be concerned about atomicity of marking? We know that memset() is not atomic (maybe I am wrong here).
Another thing: I did not get the logic in deallocate_tail(). split_block() marks only the second half of the split segments as
used and (after the call) stores bad values in it. What about the first part? Maybe add a comment.

Thanks,
Vladimir

On 5/14/19 3:47 AM, Schmidt, Lutz wrote:
> Dear all,
>
> May I please request reviews for my change?
> Bug:    https://bugs.openjdk.java.net/browse/JDK-8223444
> Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/
>
> What this change is all about:
> ------------------------------
> While working on another topic, I came across the code in share/memory/heap.cpp. I applied some small changes which I would call improvements.
>
> Furthermore, and in particular with these changes, the platform-specific parameter CodeCacheMinBlockLength should be fine-tuned to minimize the number of residual small free blocks. Heap block allocation does not create free blocks smaller than CodeCacheMinBlockLength. This parameter value should match the minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance for them to vanish is when a block next to them gets freed.
Otherwise, they linger around (mostly at the beginning of) the free list, slowing down the free block search.
>
> The following free block counts have been found after running JVM98 with different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown.
>
>   JDK-8223444 not applied
>   =======================
>
>            Segment  |  free blocks with CodeCacheMinBlockLength=
>             Size    |       1      2      3      4      6      8
>   ------------------+-------------------------------------------
>   aarch      128    |       0    153     75     30     38      2
>   ppc        128    |       0    149     98     59     14      2
>   ppcle      128    |       0    219    161    110     69     34
>   s390       256    |       0    142     93     59     30     10
>   x86        128    |       0    215    157    118     42     11
>
>
>   JDK-8223444 applied
>   ===================
>
>            Segment  |  free blocks with CodeCacheMinBlockLength=  |  suggested
>             Size    |       1      2      3      4      6      8  |   setting
>   ------------------+---------------------------------------------+------------
>   aarch      128    |     221    115     80     36      7      1  |     6
>   ppc        128    |     245    152    101     54     14      4  |     6
>   ppcle      128    |     243    144     89     72     20      5  |     6
>   s390       256    |     168     60     67      8      6      2  |     4
>   x86        128    |     223    139     83     50     11      2  |     6
>
> Thank you for your time and opinion!
> Lutz
>
>
>

From vladimir.kozlov at oracle.com  Tue May 14 19:32:13 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 14 May 2019 12:32:13 -0700
Subject: RFR: JDK-8223346 - Update Graal
In-Reply-To:
References:
Message-ID:

Changes seem fine, but I am not comfortable with the test results. There are a lot of timeouts again, and there are
also many graalunit test failures.

This time you have to apply the overwritten diffs after the merge (if we decide to push it) - these changes are not in
the Graal master repo yet.

Thanks,
Vladimir

On 5/13/19 5:19 PM, jesper.wilhelmsson at oracle.com wrote:
> Hi,
>
> Please review the patch to integrate recent Graal changes into OpenJDK.
> Graal tip to integrate: 6a18d9ddacd8eecb0ae4877f687e171889939c0d > > Bug: https://bugs.openjdk.java.net/browse/JDK-8223346 > Webrev: http://cr.openjdk.java.net/~jwilhelm/8223346/webrev.00/ > > This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug. > > Thanks, > /Jesper > From lutz.schmidt at sap.com Tue May 14 20:09:56 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Tue, 14 May 2019 20:09:56 +0000 Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management In-Reply-To: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com> References: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com> Message-ID: <9D607D81-5A25-406B-B05B-7D9C9C733D8F@sap.com> Hi Vladimir, I had the same thought re atomicity. memset() is not consistent even on one platform. But I believe it's not a factor here. The original code was a byte-by-byte loop. And we have byte atomicity on all supported platforms, even with memset(). It's a different thing with sequence of initialization. Do we really depend on byte(i) being initialized before byte(i+1)? If so, we would have a problem even with the explicit byte loop. Not on x86, but on ppc with its weak memory ordering. About segment map marking: There is a short description how the segment map works in heap.cpp, right before CodeHeap::find_start(). In short: each segment map element contains an (unsigned) index which, when subtracted from that element index, addresses the segment map element where the heap block starts. Thus, when you re-initialize the tail part of a heap block range to describe a newly formed heap block, the leading part remains valid. 
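
[Editorial sketch, not part of the original message.] The back-pointer scheme described above can be modeled in a few lines of C++. This is a simplified illustration under stated assumptions, not the code in heap.cpp: the names SegMap, mark_block, find_start and split_keeps_leading_part_valid are made up here, each map element is a plain int, and details of the real segment map (element width, any cap on stored values, free-segment marking) are deliberately left out.

```cpp
#include <cstddef>
#include <vector>

// Simplified, hypothetical model of a CodeHeap-style segment map.
// Element i stores the distance from segment i back to the first
// segment of the heap block that contains it.
struct SegMap {
    std::vector<int> dist;

    explicit SegMap(std::size_t n) : dist(n, 0) {}

    // Describe segments [beg, end) as one heap block starting at beg.
    void mark_block(std::size_t beg, std::size_t end) {
        for (std::size_t i = beg; i < end; ++i) {
            dist[i] = static_cast<int>(i - beg);
        }
    }

    // Subtract the stored distance from the element index to reach
    // the segment where the enclosing block starts.
    std::size_t find_start(std::size_t i) const {
        return i - static_cast<std::size_t>(dist[i]);
    }
};

// Splits a 15-segment block at segment 10 by re-marking only the tail,
// then checks that every lookup still lands on the right block start.
bool split_keeps_leading_part_valid() {
    SegMap map(20);
    map.mark_block(0, 15);   // block covering I .. I+14
    map.mark_block(15, 20);  // following block, I+15 .. I+19
    map.mark_block(10, 15);  // split: only the tail part is re-initialized

    return map.find_start(0) == 0 && map.find_start(9) == 0 &&    // leading part untouched
           map.find_start(10) == 10 && map.find_start(14) == 10 && // new tail block
           map.find_start(15) == 15 && map.find_start(19) == 15;   // neighbour unaffected
}
```

In this model, re-marking the tail never writes to elements 0..9, which is why the leading block stays correctly described after the split.
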

    Segmap      before                after
    Index       split                 split
     I            0 <- block start      0 <- block start (now shorter)
     I+1          1                     1    each index 0..9 still points
     I+2          2                     2    back to the block start
     I+3          3                     3
     I+4          4                     4
     I+5          5                     5
     I+6          6                     6
     I+7          7                     7
     I+8          8                     8
     I+9          9                     9
     I+10        10                     0 <- new block start
     I+11        11                     1
     I+12        12                     2
     I+13        13                     3
     I+14        14                     4
     I+15         0 <- block start      0 <- block start
     I+16         1                     1
     I+17         2                     2
     I+18         3                     3
     I+19         4                     4

There is a (very short) description about what's happening at the very end of search_freelist(). split_block() is called there as well. Would you like to see a similar comment in deallocate_tail()?

Once I have your response, I will create a new webrev reflecting your input. I need to do that anyway because the assert in heap.cpp:200 has to go away. It fires spuriously. The checks can't be done at that place. In addition, I will add one line of comment and rename a local variable. That's it.

Thanks,
Lutz


On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" wrote:

    Good.

    Do we need to be concerned about atomicity of marking? We know that memset() is not atomic (maybe I am wrong here).
    Another thing: I did not get the logic in deallocate_tail(). split_block() marks only the second half of the split segments as
    used and (after the call) stores bad values in it. What about the first part? Maybe add a comment.

    Thanks,
    Vladimir

    On 5/14/19 3:47 AM, Schmidt, Lutz wrote:
    > Dear all,
    >
    > May I please request reviews for my change?
    > Bug:    https://bugs.openjdk.java.net/browse/JDK-8223444
    > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/
    >
    > What this change is all about:
    > ------------------------------
    > While working on another topic, I came across the code in share/memory/heap.cpp. I applied some small changes which I would call improvements.
    >
    > Furthermore, and in particular with these changes, the platform-specific parameter CodeCacheMinBlockLength should be fine-tuned to minimize the number of residual small free blocks. Heap block allocation does not create free blocks smaller than CodeCacheMinBlockLength.
This parameter value should match the minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance for them to vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the beginning of) the free list, slowing down the free block search. > > The following free block counts have been found after running JVM98 with different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown. > > JDK-8223444 not applied > ======================= > > Segment | free blocks with CodeCacheMinBlockLength= > Size | 1 2 3 4 6 8 > -----------------+------------------------------------------- > aarch 128 | 0 153 75 30 38 2 > ppc 128 | 0 149 98 59 14 2 > ppcle 128 | 0 219 161 110 69 34 > s390 256 | 0 142 93 59 30 10 > x86 128 | 0 215 157 118 42 11 > > > JDK-8223444 applied > =================== > > Segment | free blocks with CodeCacheMinBlockLength= | suggested > Size | 1 2 3 4 6 8 | setting > -----------------+---------------------------------------------+------------ > aarch 128 | 221 115 80 36 7 1 | 6 > ppc 128 | 245 152 101 54 14 4 | 6 > ppcle 128 | 243 144 89 72 20 5 | 6 > s390 256 | 168 60 67 8 6 2 | 4 > x86 128 | 223 139 83 50 11 2 | 6 > > Thank you for your time and opinion! > Lutz > > > > From vladimir.kozlov at oracle.com Tue May 14 21:00:24 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 14 May 2019 14:00:24 -0700 Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management In-Reply-To: <9D607D81-5A25-406B-B05B-7D9C9C733D8F@sap.com> References: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com> <9D607D81-5A25-406B-B05B-7D9C9C733D8F@sap.com> Message-ID: <5ec26a0b-0a49-c458-8d90-aa92396610a5@oracle.com> On 5/14/19 1:09 PM, Schmidt, Lutz wrote: > Hi Vladimir, > > I had the same thought re atomicity. memset() is not consistent even on one platform. But I believe it's not a factor here. 
The original code was a byte-by-byte loop. And we have byte atomicity on all supported platforms, even with memset(). > > It's a different thing with sequence of initialization. Do we really depend on byte(i) being initialized before byte(i+1)? If so, we would have a problem even with the explicit byte loop. Not on x86, but on ppc with its weak memory ordering. Okay, if it is byte copy I am fine with it. > > About segment map marking: > There is a short description how the segment map works in heap.cpp, right before CodeHeap::find_start(). > In short: each segment map element contains an (unsigned) index which, when subtracted from that element index, addresses the segment map element where the heap block starts. Thus, when you re-initialize the tail part of a heap block range to describe a newly formed heap block, the leading part remains valid. > > Segmap before after > Index split split > I 0 <- block start 0 <- block start (now shorter) > I+1 1 1 each index 0..9 still points > I+2 2 2 back to the block start > I+3 3 3 > I+4 4 4 > I+5 5 5 > I+6 6 6 > I+7 7 7 > I+8 8 8 > I+9 9 9 > I+10 10 0 <- new block start > I+11 11 1 > I+12 12 2 > I+13 13 3 > I+14 14 4 > I+15 0 <- block start 0 <- block start > I+16 1 1 > I+17 2 2 > I+18 3 3 > I+19 4 4 > > There is a (very short) description about what's happening at the very end of search_freelist(). split_block() is called there as well. Would you like to see a similar comment in deallocate_tail()? Thank you, I forgot about that first block mapping is still valid. What about storing bad value (in debug mode) only in second part and not both parts? > > Once I have your response, I will create a new webrev reflecting your input. I need to do that anyway because the assert in heap.cpp:200 has to go away. It fires spuriously. The checks can't be done at that place. In addition, I will add one line of comment and rename a local variable. That's it. Okay. 
Thanks, Vladimir > > Thanks, > Lutz > > > ?On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" wrote: > > Good. > > Do we need to be concern about atomicity of marking? We know that memset() is not atomic (may be I am wrong here). > An other thing is I did not get logic in deallocate_tail(). split_block() marks only second half of split segments as > used and (after call) store bad values in it. What about first part? May be add comment. > > Thanks, > Vladimir > > On 5/14/19 3:47 AM, Schmidt, Lutz wrote: > > Dear all, > > > > May I please request reviews for my change? > > Bug: https://bugs.openjdk.java.net/browse/JDK-8223444 > > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/ > > > > What this change is all about: > > ------------------------------ > > While working on another topic, I came across the code in share/memory/heap.cpp. I applied some small changes which I would call improvements. > > > > Furthermore, and in particular with these changes, the platform-specific parameter CodeCacheMinBlockLength should by fine-tuned to minimize the number of residual small free blocks. Heap block allocation does not create free blocks smaller than CodeCacheMinBlockLength. This parameter value should match the minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance for them to vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the beginning of) the free list, slowing down the free block search. > > > > The following free block counts have been found after running JVM98 with different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown. 
> > > > JDK-8223444 not applied > > ======================= > > > > Segment | free blocks with CodeCacheMinBlockLength= > > Size | 1 2 3 4 6 8 > > -----------------+------------------------------------------- > > aarch 128 | 0 153 75 30 38 2 > > ppc 128 | 0 149 98 59 14 2 > > ppcle 128 | 0 219 161 110 69 34 > > s390 256 | 0 142 93 59 30 10 > > x86 128 | 0 215 157 118 42 11 > > > > > > JDK-8223444 applied > > =================== > > > > Segment | free blocks with CodeCacheMinBlockLength= | suggested > > Size | 1 2 3 4 6 8 | setting > > -----------------+---------------------------------------------+------------ > > aarch 128 | 221 115 80 36 7 1 | 6 > > ppc 128 | 245 152 101 54 14 4 | 6 > > ppcle 128 | 243 144 89 72 20 5 | 6 > > s390 256 | 168 60 67 8 6 2 | 4 > > x86 128 | 223 139 83 50 11 2 | 6 > > > > Thank you for your time and opinion! > > Lutz > > > > > > > > > > From dean.long at oracle.com Tue May 14 21:03:28 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 14 May 2019 14:03:28 -0700 Subject: RFR: JDK-8223346 - Update Graal In-Reply-To: References: Message-ID: I suggest doing the tiers 1-4 testing separate from the tier5+ testing, to reduce noise. There is a fix for CheckGraalIntrinsics coming to upstream Graal. When Jesper merges the overwritten changes, he could include that fix as well, so that compiler/graalunit/HotspotTest.java passes. The HeapMonitorStatArrayCorrectnessTest failure should have been fixed by JDK-8223441, unless Jesper's test repo is out of date. dl On 5/14/19 12:32 PM, Vladimir Kozlov wrote: > Changes seems fine but I am not comfortable about tests results. There > are a lot of timeouts again but there are many graalunit tests failures. > > This time you have to apply overwritten diffs after merge (if we > decide to push it) - these changes are not in Graal master repo yet. 
>
> Thanks,
> Vladimir
>
> On 5/13/19 5:19 PM, jesper.wilhelmsson at oracle.com wrote:
>> Hi,
>>
>> Please review the patch to integrate recent Graal changes into OpenJDK.
>> Graal tip to integrate: 6a18d9ddacd8eecb0ae4877f687e171889939c0d
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8223346
>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8223346/webrev.00/
>>
>> This integration did overwrite changes already in place in OpenJDK.
>> The diff has been attached to the umbrella bug.
>>
>> Thanks,
>> /Jesper
>>

From jesper.wilhelmsson at oracle.com  Tue May 14 21:40:39 2019
From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com)
Date: Tue, 14 May 2019 23:40:39 +0200
Subject: RFR: JDK-8223346 - Update Graal
In-Reply-To:
References:
Message-ID: <507D457E-C328-4337-968E-80036650FDE3@oracle.com>

Dean,

Please attach the diff you want me to add to the bug.

JDK-8223441 is in there.

Thanks,
/Jesper

> On 14 May 2019, at 23:03, dean.long at oracle.com wrote:
>
> I suggest doing the tiers 1-4 testing separate from the tier5+ testing, to reduce noise.
>
> There is a fix for CheckGraalIntrinsics coming to upstream Graal. When Jesper merges the overwritten changes, he could include that fix as well, so that compiler/graalunit/HotspotTest.java passes.
>
> The HeapMonitorStatArrayCorrectnessTest failure should have been fixed by JDK-8223441, unless Jesper's test repo is out of date.
>
> dl
>
> On 5/14/19 12:32 PM, Vladimir Kozlov wrote:
>> Changes seem fine, but I am not comfortable with the test results. There are a lot of timeouts again, and there are also many graalunit test failures.
>>
>> This time you have to apply the overwritten diffs after the merge (if we decide to push it) - these changes are not in the Graal master repo yet.
>>
>> Thanks,
>> Vladimir
>>
>> On 5/13/19 5:19 PM, jesper.wilhelmsson at oracle.com wrote:
>>> Hi,
>>>
>>> Please review the patch to integrate recent Graal changes into OpenJDK.
>>> Graal tip to integrate: 6a18d9ddacd8eecb0ae4877f687e171889939c0d
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8223346
>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8223346/webrev.00/
>>>
>>> This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug.
>>>
>>> Thanks,
>>> /Jesper
>>>
>

From lutz.schmidt at sap.com  Wed May 15 05:53:28 2019
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Wed, 15 May 2019 05:53:28 +0000
Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management
In-Reply-To: <5ec26a0b-0a49-c458-8d90-aa92396610a5@oracle.com>
References: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com> <9D607D81-5A25-406B-B05B-7D9C9C733D8F@sap.com> <5ec26a0b-0a49-c458-8d90-aa92396610a5@oracle.com>
Message-ID: <65F5D580-6B68-4422-AAE6-3406D7FCDE7A@sap.com>

Hi Vladimir,

thank you for your comments. About filling CodeHeap with bad values after split_block:
  - in deallocate_tail, the leading part must remain intact. It contains valid code.
  - in search_freelist, one free block is split into two. There I could invalidate the contents of both parts.
  - If you want added safety, wouldn't it then be better to invalidate the block contents during add_to_freelist()? You could then be sure there is no executable code in a free block.

Regards,
Lutz

On 14.05.19, 23:00, "Vladimir Kozlov" wrote:

    On 5/14/19 1:09 PM, Schmidt, Lutz wrote:
    > Hi Vladimir,
    >
    > I had the same thought re atomicity. memset() is not consistent even on one platform.
But I believe it's not a factor here. The original code was a byte-by-byte loop. And we have byte atomicity on all supported platforms, even with memset(). > > It's a different thing with sequence of initialization. Do we really depend on byte(i) being initialized before byte(i+1)? If so, we would have a problem even with the explicit byte loop. Not on x86, but on ppc with its weak memory ordering. Okay, if it is byte copy I am fine with it. > > About segment map marking: > There is a short description how the segment map works in heap.cpp, right before CodeHeap::find_start(). > In short: each segment map element contains an (unsigned) index which, when subtracted from that element index, addresses the segment map element where the heap block starts. Thus, when you re-initialize the tail part of a heap block range to describe a newly formed heap block, the leading part remains valid. > > Segmap before after > Index split split > I 0 <- block start 0 <- block start (now shorter) > I+1 1 1 each index 0..9 still points > I+2 2 2 back to the block start > I+3 3 3 > I+4 4 4 > I+5 5 5 > I+6 6 6 > I+7 7 7 > I+8 8 8 > I+9 9 9 > I+10 10 0 <- new block start > I+11 11 1 > I+12 12 2 > I+13 13 3 > I+14 14 4 > I+15 0 <- block start 0 <- block start > I+16 1 1 > I+17 2 2 > I+18 3 3 > I+19 4 4 > > There is a (very short) description about what's happening at the very end of search_freelist(). split_block() is called there as well. Would you like to see a similar comment in deallocate_tail()? Thank you, I forgot about that first block mapping is still valid. What about storing bad value (in debug mode) only in second part and not both parts? > > Once I have your response, I will create a new webrev reflecting your input. I need to do that anyway because the assert in heap.cpp:200 has to go away. It fires spuriously. The checks can't be done at that place. In addition, I will add one line of comment and rename a local variable. That's it. Okay. 
Thanks, Vladimir > > Thanks, > Lutz > > > On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" wrote: > > Good. > > Do we need to be concern about atomicity of marking? We know that memset() is not atomic (may be I am wrong here). > An other thing is I did not get logic in deallocate_tail(). split_block() marks only second half of split segments as > used and (after call) store bad values in it. What about first part? May be add comment. > > Thanks, > Vladimir > > On 5/14/19 3:47 AM, Schmidt, Lutz wrote: > > Dear all, > > > > May I please request reviews for my change? > > Bug: https://bugs.openjdk.java.net/browse/JDK-8223444 > > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/ > > > > What this change is all about: > > ------------------------------ > > While working on another topic, I came across the code in share/memory/heap.cpp. I applied some small changes which I would call improvements. > > > > Furthermore, and in particular with these changes, the platform-specific parameter CodeCacheMinBlockLength should by fine-tuned to minimize the number of residual small free blocks. Heap block allocation does not create free blocks smaller than CodeCacheMinBlockLength. This parameter value should match the minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance for them to vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the beginning of) the free list, slowing down the free block search. > > > > The following free block counts have been found after running JVM98 with different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown. 
    > >
    > >   JDK-8223444 not applied
    > >   =======================
    > >
    > >            Segment  |  free blocks with CodeCacheMinBlockLength=
    > >             Size    |       1      2      3      4      6      8
    > >   ------------------+-------------------------------------------
    > >   aarch      128    |       0    153     75     30     38      2
    > >   ppc        128    |       0    149     98     59     14      2
    > >   ppcle      128    |       0    219    161    110     69     34
    > >   s390       256    |       0    142     93     59     30     10
    > >   x86        128    |       0    215    157    118     42     11
    > >
    > >
    > >   JDK-8223444 applied
    > >   ===================
    > >
    > >            Segment  |  free blocks with CodeCacheMinBlockLength=  |  suggested
    > >             Size    |       1      2      3      4      6      8  |   setting
    > >   ------------------+---------------------------------------------+------------
    > >   aarch      128    |     221    115     80     36      7      1  |     6
    > >   ppc        128    |     245    152    101     54     14      4  |     6
    > >   ppcle      128    |     243    144     89     72     20      5  |     6
    > >   s390       256    |     168     60     67      8      6      2  |     4
    > >   x86        128    |     223    139     83     50     11      2  |     6
    > >
    > > Thank you for your time and opinion!
    > > Lutz
    > >
    > >
    > >
    >
    >

From robbin.ehn at oracle.com  Wed May 15 06:26:21 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 15 May 2019 08:26:21 +0200
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com>
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com>
Message-ID: <9940a897-d49d-0a22-267d-6b78424a45c2@oracle.com>

Hi, please see this update.

I think I have addressed all review comments.

Long story short, I was concerned about test coverage, so I added a stress test
using the WB, which sometimes crashed in rubbish code.

There are two bugs in the methods used by WB_DeoptimizeAll.
(It seems I'm the first user.)

CodeCache::mark_all_nmethods_for_deoptimization();
When iterating the nmethods we could see the methods being created in:
void AdapterHandlerLibrary::create_native_wrapper(const methodHandle& method)
and deopt the method when it was in use or before.
Native wrappers are supposed to live as long as the class.
I filtered out not_installed and native methods.

Deoptimization::deoptimize_all_marked();
The issue is that a not_entrant method can become a zombie at any time.
There are several ways to keep an nmethod from becoming a zombie: nmethodLocker, having
it on a stack, avoiding the safepoint poll in some states, etc., which also depends
on the type of nmethod.
The iterator only_alive_and_not_unloading returns not_entrant nmethods, but we
don't know their state prior to the last poll.
in_use -> not_entrant -> #poll# -> not_entrant -> zombie
If the iterator returns the nmethod after we have passed the poll, it can still be
not_entrant but then go zombie.
The problem happens when a second thread marks a method for deopt and makes it
not_entrant. Then, after a poll, we end up in deoptimize_all_marked(), but the method
is not yet a zombie, so the iterator returns it; it then becomes a zombie, thus
passes the if check and later hits the assert.
So there is a race between the iterator's check of the state and the if-statement's
check of the state. Fixed by also filtering out zombies.

If the stress test with correction of the bugs causes trouble in review, I can
do a follow-up with the stress test separately.

Good news, no issues found with deopt with handshakes.

This is v3:
http://cr.openjdk.java.net/~rehn/8221734/v3/webrev/

This is the full inc from v2 (review + stress test):
http://cr.openjdk.java.net/~rehn/8221734/v3/inc/

This inc is the review part from v2:
http://cr.openjdk.java.net/~rehn/8221734/v3/inc_review/

This inc is the additional stress test with bug fixes:
http://cr.openjdk.java.net/~rehn/8221734/v3/inc_test/

Additional biased locking change:
The original code uses the same copy of markOop in revoke_and_rebias.
To keep the same behavior, I now pass that copy into fast_revoke.

The stress test passes hundreds of iterations in mach5.
I ran thousands of stress test iterations locally; the issues above were reproducible.
Inc changes also pass t1-5.

As usual with this change-set, I'm continuously running more tests.

Thanks, Robbin

On 2019-04-25 14:05, Robbin Ehn wrote:
> Hi all, please review.
> > Let's deopt with handshakes. > Removed VM op Deoptimize, instead we handshake. > Locks needs to be inflate since we are not in a safepoint. > > Goes on top of: > https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html > > Code: > http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html > Issue: > https://bugs.openjdk.java.net/browse/JDK-8221734 > > Passes t1-7 and multiple t1-5 runs. > > A few startup benchmark see a small speedup. > > Thanks, Robbin From tobias.hartmann at oracle.com Wed May 15 07:03:34 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 15 May 2019 09:03:34 +0200 Subject: RFR/RFC (XS) 8223911: Disable bad node budget verification until the fix In-Reply-To: <6D75E3CE-288B-457F-BA63-864CFF5F2216@oracle.com> References: <7784d166-6061-5495-454f-b6ec62eaefbd@redhat.com> <6D75E3CE-288B-457F-BA63-864CFF5F2216@oracle.com> Message-ID: <13ea00c9-823c-bcb3-dc79-59fe0cbef6c7@oracle.com> +1 Best regards, Tobias On 14.05.19 20:20, Vladimir Kozlov wrote: > I agree. > > Thanks > Vladimir > >> On May 14, 2019, at 11:14 AM, Aleksey Shipilev wrote: >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8223911 >> >> There is a bug that Patric is handling now. However, fastdebug is broken for more than a week, and >> there is no sight of the fix yet. Having this false negative failure is detrimental for testing, >> especially given impending fork to 13. Backing out the change seems too harsh. Let's just disable >> the assert? 
>>
>> Fix:
>>
>> diff -r 809cbe8565a1 -r ee3c58aebd44 src/hotspot/share/opto/loopnode.hpp
>> --- a/src/hotspot/share/opto/loopnode.hpp Tue May 14 20:05:43 2019 +0200
>> +++ b/src/hotspot/share/opto/loopnode.hpp Tue May 14 20:06:51 2019 +0200
>> @@ -1384,7 +1384,8 @@
>>      uint required = _nodes_required;
>>      require_nodes_final();
>>      uint delta = C->live_nodes() - live_at_begin;
>> -    assert(delta <= 2 * required, "Bad node estimate (actual: %d, request: %d)",
>> +    // Assert is disabled, see JDK-8223911 and related issues.
>> +    assert(true || delta <= 2 * required, "Bad node estimate (actual: %d, request: %d)",
>>             delta, required);
>>    }
>>
>> Testing: Linux x86_64 fastdebug, regression test from JDK-8223502 (I could have added it, but then
>> this fix would not be as trivial and would require passing jdk-submit), our larger workloads that
>> used to break
>>
>> --
>> Thanks,
>> -Aleksey
>>
>>
>

From erik.osterlund at oracle.com  Wed May 15 07:22:30 2019
From: erik.osterlund at oracle.com (Erik Osterlund)
Date: Wed, 15 May 2019 09:22:30 +0200
Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management
In-Reply-To: <65F5D580-6B68-4422-AAE6-3406D7FCDE7A@sap.com>
References: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com> <9D607D81-5A25-406B-B05B-7D9C9C733D8F@sap.com> <5ec26a0b-0a49-c458-8d90-aa92396610a5@oracle.com> <65F5D580-6B68-4422-AAE6-3406D7FCDE7A@sap.com>
Message-ID:

Hi,

Didn't read the full conversation, only the part where it was claimed we have byte atomicity on all platforms with memset. That's not true. On SPARC, the default is to use block initializing stores (BIS), which exposes concurrent readers to out-of-thin-air values that were never in the original memory state or the new memory state. Moreover, the compiler is allowed to, and will also in practice, transform a loop that performs memset, into a memset call as long as the memory being memset is non-volatile.
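
[Editorial sketch, not part of the original message.] The compiler-related point can be illustrated with a small C++ helper. It is a sketch under stated assumptions, not HotSpot code: the name fill_bytes_no_memset is made up for illustration. Storing through a volatile-qualified pointer makes every byte store its own access, so the compiler is not free to replace the loop with a library memset() call. The sketch says nothing about ordering between the stores or about what a particular platform's memset() does internally.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, for illustration only (not a HotSpot function).
// Every store goes through a volatile lvalue, so the loop is kept as
// individual byte stores instead of being rewritten as a memset() call.
inline void fill_bytes_no_memset(std::uint8_t* dst, std::uint8_t value, std::size_t count) {
    volatile std::uint8_t* p = dst;
    for (std::size_t i = 0; i < count; ++i) {
        p[i] = value;  // one byte-sized store per element
    }
}
```
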
These problems were observed in card tables and led to the introduction of memset_with_concurrent_readers() to force the memset not to use BIS. Just a FYI. /Erik > On 15 May 2019, at 07:53, Schmidt, Lutz wrote: > > Hi Vladimir, > > thank you for your comments. About filling CodeHeap with bad values after split_block: > - in deallocate_tail, the leading part must remain intact. It contains valid code. > - in search_freelist, one free block is split into two. There I could invalidate the contents of both parts. > - If you want added safety, wouldn't it then be better to invalidate the block contents during add_to_freelist()? You could then be sure there is no executable code in a free block. > > Regards, > Lutz > > ?On 14.05.19, 23:00, "Vladimir Kozlov" wrote: > >> On 5/14/19 1:09 PM, Schmidt, Lutz wrote: >> Hi Vladimir, >> >> I had the same thought re atomicity. memset() is not consistent even on one platform. But I believe it's not a factor here. The original code was a byte-by-byte loop. And we have byte atomicity on all supported platforms, even with memset(). >> >> It's a different thing with sequence of initialization. Do we really depend on byte(i) being initialized before byte(i+1)? If so, we would have a problem even with the explicit byte loop. Not on x86, but on ppc with its weak memory ordering. > > Okay, if it is byte copy I am fine with it. > >> >> About segment map marking: >> There is a short description how the segment map works in heap.cpp, right before CodeHeap::find_start(). >> In short: each segment map element contains an (unsigned) index which, when subtracted from that element index, addresses the segment map element where the heap block starts. Thus, when you re-initialize the tail part of a heap block range to describe a newly formed heap block, the leading part remains valid. 
>>
>> Segmap    before              after
>> Index     split               split
>> I          0 <- block start    0 <- block start (now shorter)
>> I+1        1                   1    each index 0..9 still points
>> I+2        2                   2    back to the block start
>> I+3        3                   3
>> I+4        4                   4
>> I+5        5                   5
>> I+6        6                   6
>> I+7        7                   7
>> I+8        8                   8
>> I+9        9                   9
>> I+10      10                   0 <- new block start
>> I+11      11                   1
>> I+12      12                   2
>> I+13      13                   3
>> I+14      14                   4
>> I+15       0 <- block start    0 <- block start
>> I+16       1                   1
>> I+17       2                   2
>> I+18       3                   3
>> I+19       4                   4
>>
>> There is a (very short) description about what's happening at the very end of search_freelist(). split_block() is called there as well. Would you like to see a similar comment in deallocate_tail()?
>
> Thank you, I forgot about that first block mapping is still valid.
>
> What about storing bad value (in debug mode) only in second part and not both parts?
>
>>
>> Once I have your response, I will create a new webrev reflecting your input. I need to do that anyway because the assert in heap.cpp:200 has to go away. It fires spuriously. The checks can't be done at that place. In addition, I will add one line of comment and rename a local variable. That's it.
>
> Okay.
>
> Thanks,
> Vladimir
>
>>
>> Thanks,
>> Lutz
>>
>>
>> On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" wrote:
>>
>> Good.
>>
>> Do we need to be concern about atomicity of marking? We know that memset() is not atomic (may be I am wrong here).
>> An other thing is I did not get logic in deallocate_tail(). split_block() marks only second half of split segments as
>> used and (after call) store bad values in it. What about first part? May be add comment.
>>
>> Thanks,
>> Vladimir
>>
>>> On 5/14/19 3:47 AM, Schmidt, Lutz wrote:
>>> Dear all,
>>>
>>> May I please request reviews for my change?
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8223444 >>> Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/ >>> >>> What this change is all about: >>> ------------------------------ >>> While working on another topic, I came across the code in share/memory/heap.cpp. I applied some small changes which I would call improvements. >>> >>> Furthermore, and in particular with these changes, the platform-specific parameter CodeCacheMinBlockLength should by fine-tuned to minimize the number of residual small free blocks. Heap block allocation does not create free blocks smaller than CodeCacheMinBlockLength. This parameter value should match the minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance for them to vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the beginning of) the free list, slowing down the free block search. >>> >>> The following free block counts have been found after running JVM98 with different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown. 
>>>
>>> JDK-8223444 not applied
>>> =======================
>>>
>>>         Segment  |  free blocks with CodeCacheMinBlockLength=
>>>          Size    |     1     2     3     4     6     8
>>> -----------------+-------------------------------------------
>>> aarch     128    |     0   153    75    30    38     2
>>> ppc       128    |     0   149    98    59    14     2
>>> ppcle     128    |     0   219   161   110    69    34
>>> s390      256    |     0   142    93    59    30    10
>>> x86       128    |     0   215   157   118    42    11
>>>
>>>
>>> JDK-8223444 applied
>>> ===================
>>>
>>>         Segment  |  free blocks with CodeCacheMinBlockLength=  |  suggested
>>>          Size    |     1     2     3     4     6     8         |   setting
>>> -----------------+---------------------------------------------+------------
>>> aarch     128    |   221   115    80    36     7     1         |     6
>>> ppc       128    |   245   152   101    54    14     4         |     6
>>> ppcle     128    |   243   144    89    72    20     5         |     6
>>> s390      256    |   168    60    67     8     6     2         |     4
>>> x86       128    |   223   139    83    50    11     2         |     6
>>>
>>> Thank you for your time and opinion!
>>> Lutz
>>>
>>
>

From shade at redhat.com Wed May 15 09:58:44 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 15 May 2019 11:58:44 +0200
Subject: RFR/RFC (XS) 8223911: Disable bad node budget verification until the fix
In-Reply-To: <13ea00c9-823c-bcb3-dc79-59fe0cbef6c7@oracle.com>
References: <7784d166-6061-5495-454f-b6ec62eaefbd@redhat.com> <6D75E3CE-288B-457F-BA63-864CFF5F2216@oracle.com> <13ea00c9-823c-bcb3-dc79-59fe0cbef6c7@oracle.com>
Message-ID:

Thanks, pushed.

On 5/15/19 9:03 AM, Tobias Hartmann wrote:
> +1
>
> Best regards,
> Tobias
>
> On 14.05.19 20:20, Vladimir Kozlov wrote:
>> I agree.
>>
>> Thanks
>> Vladimir
>>
>>> On May 14, 2019, at 11:14 AM, Aleksey Shipilev wrote:
>>>
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8223911
>>>
>>> There is a bug that Patric is handling now. However, fastdebug is broken for more than a week, and
>>> there is no sight of the fix yet. Having this false negative failure is detrimental for testing,
>>> especially given impending fork to 13.
Backing out the change seems too harsh. Let's just disable >>> the assert? >>> >>> Fix: >>> >>> diff -r 809cbe8565a1 -r ee3c58aebd44 src/hotspot/share/opto/loopnode.hpp >>> --- a/src/hotspot/share/opto/loopnode.hpp Tue May 14 20:05:43 2019 +0200 >>> +++ b/src/hotspot/share/opto/loopnode.hpp Tue May 14 20:06:51 2019 +0200 >>> @@ -1384,7 +1384,8 @@ >>> uint required = _nodes_required; >>> require_nodes_final(); >>> uint delta = C->live_nodes() - live_at_begin; >>> - assert(delta <= 2 * required, "Bad node estimate (actual: %d, request: %d)", >>> + // Assert is disabled, see JDK-8223911 and related issues. >>> + assert(true || delta <= 2 * required, "Bad node estimate (actual: %d, request: %d)", >>> delta, required); >>> } >>> >>> Testing: Linux x86_64 fastdebug, regression test from JDK-8223502 (I could have added it, but then >>> this fix would not be as trivial and would require passing jdk-submit), our larger workloads that >>> used to break >>> >>> -- >>> Thanks, >>> -Aleksey >>> >>> >> -- Thanks, -Aleksey Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Michael O'Neill, Tom Savage, Eric Shander -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From thomas.stuefe at gmail.com Wed May 15 11:54:16 2019 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 15 May 2019 13:54:16 +0200 Subject: RFR (S) 8223770: code_size2 still too small in some compressed oops configurations In-Reply-To: References: Message-ID: Hi Ralf, looks good. Yes, this is a bit unsatisfying and a self-shrinking code heap would be better. ..Thomas On Mon, May 13, 2019 at 5:06 PM Schmelter, Ralf wrote: > Hi, > > this is an addition to https://bugs.openjdk.java.net/browse/JDK-8223617. 
> It turns out, that the additionally given space is not always enough to not
> trigger the assertion. This is caused by some of the stub routines needing
> more memory (93 bytes overall), when the compressed oops mode used a
> non-zero disjoint base. To be on the safe side I've added another 200 bytes
> to the code_size2 constant and ran the
> hotspot/jtreg/gc/arguments/TestUseCompressedOopsErgo.java jtreg test, which
> showed the problem consistently before the change.
>
> webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8223770/webrev.0/
> bugreport: https://bugs.openjdk.java.net/browse/JDK-8223770
>
> Best regards,
> Ralf
>

From lutz.schmidt at sap.com Wed May 15 12:53:59 2019
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Wed, 15 May 2019 12:53:59 +0000
Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management
In-Reply-To:
References: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com> <9D607D81-5A25-406B-B05B-7D9C9C733D8F@sap.com> <5ec26a0b-0a49-c458-8d90-aa92396610a5@oracle.com> <65F5D580-6B68-4422-AAE6-3406D7FCDE7A@sap.com>
Message-ID:

Hi Erik,

had not heard that. Just to make sure I understand your statement correctly: on SPARC, memset does something fancy which causes random byte values to become visible to concurrent readers. Correct?

As you state, sufficiently capable compilers will recognize explicit initialization loops and convert them to memset calls. Replacing the loop by a memset call therefore does not introduce a new risk. If at all, it changes probabilities.

Furthermore, I do not see concurrent readers. Segment map initialization is only called when the CodeHeap is expanded or when it is cleared in its entirety.

If anybody has a remaining concern, I'm ready to revert from memset to explicit loop.

Thanks for your heads up!
Lutz

From: Erik Osterlund
Date: Wednesday, 15.
May 2019 at 09:22 To: Lutz Schmidt Cc: Vladimir Kozlov , "hotspot-compiler-dev at openjdk.java.net" Subject: Re: RFR(S): 8223444: Improve CodeHeap Free Space Management Hi, Didn?t read the full conversation, only the part where it was claimed we have byte atomicity on all platforms with memset. That?s not true. On SPARC, the default is to use block initializing stores (BIS), which exposes concurrent readers to out-of-thin-air values that were never in the original memory state or the new memory state. Moreover, the compiler is allowed to, and will also in practice, transform a loop that performs memset, into a memset call as long as the memory being memset is non-volatile. These problems were observed in card tables and led to the introduction of memset_with_concurrent_readers() to force the memset not to use BIS. Just a FYI. /Erik On 15 May 2019, at 07:53, Schmidt, Lutz > wrote: Hi Vladimir, thank you for your comments. About filling CodeHeap with bad values after split_block: - in deallocate_tail, the leading part must remain intact. It contains valid code. - in search_freelist, one free block is split into two. There I could invalidate the contents of both parts. - If you want added safety, wouldn't it then be better to invalidate the block contents during add_to_freelist()? You could then be sure there is no executable code in a free block. Regards, Lutz On 14.05.19, 23:00, "Vladimir Kozlov" > wrote: On 5/14/19 1:09 PM, Schmidt, Lutz wrote: Hi Vladimir, I had the same thought re atomicity. memset() is not consistent even on one platform. But I believe it's not a factor here. The original code was a byte-by-byte loop. And we have byte atomicity on all supported platforms, even with memset(). It's a different thing with sequence of initialization. Do we really depend on byte(i) being initialized before byte(i+1)? If so, we would have a problem even with the explicit byte loop. Not on x86, but on ppc with its weak memory ordering. 
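
The remark above, that a compiler may turn a plain byte loop into a memset call unless the memory is volatile, can be illustrated with a small sketch. This is only an illustration under stated assumptions: `fill_bytes_volatile` is a hypothetical helper, not HotSpot's memset_with_concurrent_readers(), and it says nothing about ordering or visibility between threads. It only shows a fill written as individual volatile byte stores, which the compiler has to perform one by one.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not HotSpot code): fill n bytes starting at dst.
// Writing through a volatile pointer means every store is a separate
// byte access that the compiler may not merge or replace by a call to
// memset().
void fill_bytes_volatile(unsigned char* dst, unsigned char value, std::size_t n) {
  volatile unsigned char* p = dst;
  for (std::size_t i = 0; i < n; ++i) {
    p[i] = value;  // one volatile byte store per element
  }
}
```

A caller would use it like memset, for example `fill_bytes_volatile(map + beg, 0, end - beg)` with a hypothetical `map` array.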
Okay, if it is byte copy I am fine with it. About segment map marking: There is a short description how the segment map works in heap.cpp, right before CodeHeap::find_start(). In short: each segment map element contains an (unsigned) index which, when subtracted from that element index, addresses the segment map element where the heap block starts. Thus, when you re-initialize the tail part of a heap block range to describe a newly formed heap block, the leading part remains valid. Segmap before after Index split split I 0 <- block start 0 <- block start (now shorter) I+1 1 1 each index 0..9 still points I+2 2 2 back to the block start I+3 3 3 I+4 4 4 I+5 5 5 I+6 6 6 I+7 7 7 I+8 8 8 I+9 9 9 I+10 10 0 <- new block start I+11 11 1 I+12 12 2 I+13 13 3 I+14 14 4 I+15 0 <- block start 0 <- block start I+16 1 1 I+17 2 2 I+18 3 3 I+19 4 4 There is a (very short) description about what's happening at the very end of search_freelist(). split_block() is called there as well. Would you like to see a similar comment in deallocate_tail()? Thank you, I forgot about that first block mapping is still valid. What about storing bad value (in debug mode) only in second part and not both parts? Once I have your response, I will create a new webrev reflecting your input. I need to do that anyway because the assert in heap.cpp:200 has to go away. It fires spuriously. The checks can't be done at that place. In addition, I will add one line of comment and rename a local variable. That's it. Okay. Thanks, Vladimir Thanks, Lutz On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" on behalf of vladimir.kozlov at oracle.com> wrote: Good. Do we need to be concern about atomicity of marking? We know that memset() is not atomic (may be I am wrong here). An other thing is I did not get logic in deallocate_tail(). split_block() marks only second half of split segments as used and (after call) store bad values in it. What about first part? May be add comment. 
Thanks, Vladimir On 5/14/19 3:47 AM, Schmidt, Lutz wrote: Dear all, May I please request reviews for my change? Bug: https://bugs.openjdk.java.net/browse/JDK-8223444 Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/ What this change is all about: ------------------------------ While working on another topic, I came across the code in share/memory/heap.cpp. I applied some small changes which I would call improvements. Furthermore, and in particular with these changes, the platform-specific parameter CodeCacheMinBlockLength should by fine-tuned to minimize the number of residual small free blocks. Heap block allocation does not create free blocks smaller than CodeCacheMinBlockLength. This parameter value should match the minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance for them to vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the beginning of) the free list, slowing down the free block search. The following free block counts have been found after running JVM98 with different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown. JDK-8223444 not applied ======================= Segment | free blocks with CodeCacheMinBlockLength= Size | 1 2 3 4 6 8 -----------------+------------------------------------------- aarch 128 | 0 153 75 30 38 2 ppc 128 | 0 149 98 59 14 2 ppcle 128 | 0 219 161 110 69 34 s390 256 | 0 142 93 59 30 10 x86 128 | 0 215 157 118 42 11 JDK-8223444 applied =================== Segment | free blocks with CodeCacheMinBlockLength= | suggested Size | 1 2 3 4 6 8 | setting -----------------+---------------------------------------------+------------ aarch 128 | 221 115 80 36 7 1 | 6 ppc 128 | 245 152 101 54 14 4 | 6 ppcle 128 | 243 144 89 72 20 5 | 6 s390 256 | 168 60 67 8 6 2 | 4 x86 128 | 223 139 83 50 11 2 | 6 Thank you for your time and opinion! 
Lutz -------------- next part -------------- An HTML attachment was scrubbed... URL: From rahul.v.raghavan at oracle.com Wed May 15 14:46:32 2019 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Wed, 15 May 2019 20:16:32 +0530 Subject: [13] RFR: 8213416: Replace some enums with static const members in hotspot/compiler Message-ID: <1f7afc19-0756-33f8-54f5-2438ed5da886@oracle.com> Hi, Request help review and finalize fix for 8213416. - http://cr.openjdk.java.net/~rraghavan/8213416/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8213416 The requirement is to solve "enumeral and non-enumeral type in conditional expression" warnings triggered with -Wextra enabled for gcc on hotspot. (hotspot/compiler part is handled in this 8213416 task and hotspot/runtime in 8223400) The same warning is generated for ternary operator statements like- (x ? int_val : enum_val). e.g.: comp_level = TieredCompilation ? TieredStopAtLevel : CompLevel_highest_tier; Understood from comments that the following type typecast solution proposed earlier was not accepted. - comp_level = TieredCompilation ? TieredStopAtLevel : CompLevel_highest_tier; + comp_level = TieredCompilation ? TieredStopAtLevel : (int) CompLevel_highest_tier; and then proposed solution was to rewrite those enums to be static const members. Tried changes based on the comments info from JBS. Extracts of related JBS comments- - ".... it's just a simple code refactoring. " - "David H. only complained about NO_HASH which we can fix. We can also fix CompLevel_highest_tier usage - should use CompLevel type everywhere. But I would not touch Op_RegFlags - I don't want to complicate its construction and we have a lot of places where Op_ are used as uint. I would only fix places where it is used as int to make sure it is used as uint everywhere." 
Reported enums in question for hotspot/compiler 1) NO_HASH 2) CompLevel_highest_tier 3) Op_RegFlags 4) _lh_array_tag_obj_value, _lh_instance_slow_path_bit 1) NO_HASH tried [open/src/hotspot/share/opto/node.hpp] - enum { NO_HASH = 0 }; + static const uint NO_HASH = 0; 2) CompLevel_highest_tier Only one warning in process_compile() [open/src/hotspot/share/ci/ciReplay.cpp] comp_level = TieredCompilation ? TieredStopAtLevel : CompLevel_highest_tier; Following type changes tried did not help - - int comp_level = parse_int(comp_level_label); + CompLevel comp_level = parse_int(comp_level_label); ..... - comp_level = TieredCompilation ? TieredStopAtLevel : CompLevel_highest_tier; + comp_level = TieredCompilation ? (CompLevel) TieredStopAtLevel : CompLevel_highest_tier; The warning is with only ternary operator usage in this location. So tried simple code refactoring like following and got no more warnings! Is this okay? - comp_level = TieredCompilation ? TieredStopAtLevel : CompLevel_highest_tier; + if (TieredCompilation) { + comp_level = TieredStopAtLevel; + } else { + comp_level = CompLevel_highest_tier; + } 3) Op_RegFlags Warnings only for 'virtual uint MachNode::ideal_reg() const' ../open/src/hotspot/share/opto/machnode.hpp: In member function 'virtual uint MachNode::ideal_reg() const': ../open/src/hotspot/share/opto/machnode.hpp:304:95: warning: enumeral and non-enumeral type in conditional expression [-Wextra] virtual uint ideal_reg() const { const Type *t = _opnds[0]->type(); return t == TypeInt::CC ? Op_RegFlags : t->ideal_reg(); } ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Op_RegFlags is returned as uint itself here. How to modify code to solve warning? Again since the issue is with only ternary operator usage in only one location, can we go for simple code refactoring like following? - virtual uint ideal_reg() const { const Type *t = _opnds[0]->type(); return t == TypeInt::CC ? 
Op_RegFlags : t->ideal_reg(); } + virtual uint ideal_reg() const { + const Type *t = _opnds[0]->type(); + if (t == TypeInt::CC) { + return Op_RegFlags; + } else { + return t->ideal_reg(); + } + } 4) _lh_array_tag_obj_value, _lh_instance_slow_path_bit - warnings locations - (i) ../open/src/hotspot/share/oops/klass.cpp: In static member function 'static jint Klass::array_layout_helper(BasicType)': ../open/src/hotspot/share/oops/klass.cpp:212:23: warning: enumeral and non-enumeral type in conditional expression [-Wextra] int tag = isobj ? _lh_array_tag_obj_value : _lh_array_tag_type_value; (ii) ../open/src/hotspot/cpu/x86/c1_Runtime1_x86.cpp: In static member function 'static OopMapSet* Runtime1::generate_code_for(Runtime1::StubID, StubAssembler*)': ../open/src/hotspot/cpu/x86/c1_Runtime1_x86.cpp:1126:22: warning: enumeral and non-enumeral type in conditional expression [-Wextra] int tag = ((id == new_type_array_id) ~~~~~~~~~~~~~~~~~~~~~~~~~ ? Klass::_lh_array_tag_type_value ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ : Klass::_lh_array_tag_obj_value); (iii) ../open/src/hotspot/share/oops/klass.hpp: In static member function 'static jint Klass::instance_layout_helper(jint, bool)': ../open/src/hotspot/share/oops/klass.hpp:422:28: warning: enumeral and non-enumeral type in conditional expression [-Wextra] | (slow_path_flag ? _lh_instance_slow_path_bit : 0); ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Following changes fixed the warnings! (using static const int instead of unnamed enum) [open/src/hotspot/share/oops/klass.cpp] .......... 
   // Unpacking layout_helper:
-  enum {
-    _lh_neutral_value           = 0,  // neutral non-array non-instance value
-    _lh_instance_slow_path_bit  = 0x01,
-    _lh_log2_element_size_shift = BitsPerByte*0,
-    _lh_log2_element_size_mask  = BitsPerLong-1,
-    _lh_element_type_shift      = BitsPerByte*1,
-    _lh_element_type_mask       = right_n_bits(BitsPerByte),  // shifted mask
-    _lh_header_size_shift       = BitsPerByte*2,
-    _lh_header_size_mask        = right_n_bits(BitsPerByte),  // shifted mask
-    _lh_array_tag_bits          = 2,
-    _lh_array_tag_shift         = BitsPerInt - _lh_array_tag_bits,
-    _lh_array_tag_obj_value     = ~0x01   // 0x80000000 >> 30
-  };
+  static const int _lh_neutral_value           = 0;  // neutral non-array non-instance value
+  static const int _lh_instance_slow_path_bit  = 0x01;
+  static const int _lh_log2_element_size_shift = BitsPerByte*0;
+  static const int _lh_log2_element_size_mask  = BitsPerLong-1;
+  static const int _lh_element_type_shift      = BitsPerByte*1;
+  static const int _lh_element_type_mask       = right_n_bits(BitsPerByte);  // shifted mask
+  static const int _lh_header_size_shift       = BitsPerByte*2;
+  static const int _lh_header_size_mask        = right_n_bits(BitsPerByte);  // shifted mask
+  static const int _lh_array_tag_bits          = 2;
+  static const int _lh_array_tag_shift         = BitsPerInt - _lh_array_tag_bits;
+  static const int _lh_array_tag_obj_value     = ~0x01;   // 0x80000000 >> 30
.......
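
The two remedies discussed in this mail (a static const member instead of an unnamed enum, or an if/else instead of the conditional expression) can be shown in isolation. The following is a minimal sketch, not the HotSpot declarations: `WithEnum`, `WithConst`, `pick_const` and `pick_branch` are hypothetical names made up for the example. The expression that gcc reports with -Wextra is kept in a comment so the sketch itself compiles without the warning.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real declarations.
struct WithEnum {
  enum { tag_obj_value = ~0x01 };
};

struct WithConst {
  static const int tag_obj_value = ~0x01;
};
// Out-of-class definition. It is needed whenever the member is odr-used,
// which can happen when it appears as an operand of ?: next to another
// int lvalue.
const int WithConst::tag_obj_value;

// For reference, the form that triggers "enumeral and non-enumeral type
// in conditional expression" with gcc -Wextra:
//   return is_obj ? WithEnum::tag_obj_value : other;

// Remedy 1: both operands are int, so the warning does not apply.
int pick_const(bool is_obj, int other) {
  return is_obj ? WithConst::tag_obj_value : other;
}

// Remedy 2: keep the enum and avoid the conditional expression.
int pick_branch(bool is_obj, int other) {
  if (is_obj) {
    return WithEnum::tag_obj_value;
  }
  return other;
}
```

One design point worth keeping in mind with the first remedy: unlike an enumerator, a static const data member is an object, so depending on how it is used it may need the out-of-class definition shown above.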
- http://cr.openjdk.java.net/~rraghavan/8213416/webrev.00/ Understood the affected code locations details from the old sample patch attachment of related JDK-8211073 # https://bugs.openjdk.java.net/secure/attachment/79387/hotspot-disable-wextra.diff Also confirmed no similar warnings in hotspot/compiler with -Wextra, no issues with build with this proposed webrev.00 Thanks, Rahul From vladimir.kozlov at oracle.com Wed May 15 16:31:59 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 15 May 2019 09:31:59 -0700 Subject: [13] RFR: 8213416: Replace some enums with static const members in hotspot/compiler In-Reply-To: <1f7afc19-0756-33f8-54f5-2438ed5da886@oracle.com> References: <1f7afc19-0756-33f8-54f5-2438ed5da886@oracle.com> Message-ID: <8f18e15d-cae8-58eb-b4a1-870ca6ffaf15@oracle.com> Hi Rahul, Comments are inlined below. On 5/15/19 7:46 AM, Rahul Raghavan wrote: > Hi, > > Request help review and finalize fix for 8213416. > > - http://cr.openjdk.java.net/~rraghavan/8213416/webrev.00/ > > https://bugs.openjdk.java.net/browse/JDK-8213416 > > The requirement is to solve > ? "enumeral and non-enumeral type in conditional expression" warnings > ? triggered with -Wextra enabled for gcc on hotspot. > > (hotspot/compiler part is handled in this 8213416 task > and hotspot/runtime in 8223400) > > The same warning is generated for ternary operator statements like- > ? (x ? int_val : enum_val). > e.g.: > comp_level = TieredCompilation ? TieredStopAtLevel : CompLevel_highest_tier; > > > Understood from comments that the following type typecast solution proposed earlier was not accepted. > - comp_level = TieredCompilation ? TieredStopAtLevel : CompLevel_highest_tier; > + comp_level = TieredCompilation ? TieredStopAtLevel : (int) CompLevel_highest_tier; > and then proposed solution was to rewrite those enums to be static const members. > > > Tried changes based on the comments info from JBS. > Extracts of related JBS comments- > - ".... 
it's just a simple code refactoring. "
> - "David H. only complained about NO_HASH which we can fix.
> We can also fix CompLevel_highest_tier usage - should use CompLevel type everywhere.
> But I would not touch Op_RegFlags -
>   I don't want to complicate its construction and
>   we have a lot of places where Op_ are used as uint.
>   I would only fix places where it is used as int to make sure it is used as uint everywhere."
>
> Reported enums in question for hotspot/compiler
> 1) NO_HASH
> 2) CompLevel_highest_tier
> 3) Op_RegFlags
> 4) _lh_array_tag_obj_value, _lh_instance_slow_path_bit
>
>
>
> 1) NO_HASH
> tried [open/src/hotspot/share/opto/node.hpp]
> -  enum { NO_HASH = 0 };
> +  static const uint NO_HASH = 0;
>

Okay.

>
>
> 2) CompLevel_highest_tier
> Only one warning in process_compile()
> [open/src/hotspot/share/ci/ciReplay.cpp]
> comp_level = TieredCompilation ? TieredStopAtLevel : CompLevel_highest_tier;
>
> Following type changes tried did not help -
> -    int comp_level = parse_int(comp_level_label);
> +    CompLevel comp_level = parse_int(comp_level_label);
> .....
> -      comp_level = TieredCompilation ? TieredStopAtLevel : CompLevel_highest_tier;
> +      comp_level = TieredCompilation ? (CompLevel) TieredStopAtLevel : CompLevel_highest_tier;
>

Thank you for explaining that it did not work.

>
> The warning is with only ternary operator usage in this location.
> So tried simple code refactoring like following and got no more warnings!
> Is this okay?
> -      comp_level = TieredCompilation ? TieredStopAtLevel : CompLevel_highest_tier;
> +      if (TieredCompilation) {
> +        comp_level = TieredStopAtLevel;
> +      } else {
> +        comp_level = CompLevel_highest_tier;
> +      }
>

Good.
>
>
> 3) Op_RegFlags
> Warnings only for 'virtual uint MachNode::ideal_reg() const'
>
> ../open/src/hotspot/share/opto/machnode.hpp: In member function 'virtual uint MachNode::ideal_reg() const':
> ../open/src/hotspot/share/opto/machnode.hpp:304:95: warning: enumeral and non-enumeral type in conditional expression
> [-Wextra]
>    virtual uint ideal_reg() const { const Type *t = _opnds[0]->type(); return t == TypeInt::CC ? Op_RegFlags :
> t->ideal_reg(); }
>
>       ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Op_RegFlags is returned as uint itself here.
> How to modify code to solve warning?
> Again since the issue is with only ternary operator usage in only one location, can we go for simple code refactoring
> like following?
>
> -  virtual uint ideal_reg() const { const Type *t = _opnds[0]->type(); return t == TypeInt::CC ? Op_RegFlags :
> t->ideal_reg(); }
> +  virtual uint ideal_reg() const {
> +    const Type *t = _opnds[0]->type();
> +    if (t == TypeInt::CC) {
> +      return Op_RegFlags;
> +    } else {
> +      return t->ideal_reg();
> +    }
> +  }
>

Good. I agree.

>
>
> 4) _lh_array_tag_obj_value, _lh_instance_slow_path_bit -
> warnings locations -
>
> (i) ../open/src/hotspot/share/oops/klass.cpp: In static member function 'static jint
> Klass::array_layout_helper(BasicType)':
> ../open/src/hotspot/share/oops/klass.cpp:212:23: warning: enumeral and non-enumeral type in conditional expression
> [-Wextra]
>    int  tag   =  isobj ? _lh_array_tag_obj_value : _lh_array_tag_type_value;
>
> (ii) ../open/src/hotspot/cpu/x86/c1_Runtime1_x86.cpp: In static member function 'static OopMapSet*
> Runtime1::generate_code_for(Runtime1::StubID, StubAssembler*)':
> ../open/src/hotspot/cpu/x86/c1_Runtime1_x86.cpp:1126:22: warning: enumeral and non-enumeral type in conditional
> expression [-Wextra]
>            int tag = ((id == new_type_array_id)
>                       ~~~~~~~~~~~~~~~~~~~~~~~~~
>                       ?
Klass::_lh_array_tag_type_value
>                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>                       : Klass::_lh_array_tag_obj_value);
>
> (iii) ../open/src/hotspot/share/oops/klass.hpp: In static member function 'static jint
> Klass::instance_layout_helper(jint, bool)':
> ../open/src/hotspot/share/oops/klass.hpp:422:28: warning: enumeral and non-enumeral type in conditional expression
> [-Wextra]
>        |    (slow_path_flag ? _lh_instance_slow_path_bit : 0);
>              ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Following changes fixed the warnings!
> (using static const int instead of unnamed enum)
> [open/src/hotspot/share/oops/klass.cpp]
> ..........
>    // Unpacking layout_helper:
> -  enum {
> -    _lh_neutral_value           = 0,  // neutral non-array non-instance value
> -    _lh_instance_slow_path_bit  = 0x01,
> -    _lh_log2_element_size_shift = BitsPerByte*0,
> -    _lh_log2_element_size_mask  = BitsPerLong-1,
> -    _lh_element_type_shift      = BitsPerByte*1,
> -    _lh_element_type_mask       = right_n_bits(BitsPerByte),  // shifted mask
> -    _lh_header_size_shift       = BitsPerByte*2,
> -    _lh_header_size_mask        = right_n_bits(BitsPerByte),  // shifted mask
> -    _lh_array_tag_bits          = 2,
> -    _lh_array_tag_shift         = BitsPerInt - _lh_array_tag_bits,
> -    _lh_array_tag_obj_value     = ~0x01   // 0x80000000 >> 30
> -  };
> +  static const int _lh_neutral_value           = 0;  // neutral non-array non-instance value
> +  static const int _lh_instance_slow_path_bit  = 0x01;
> +  static const int _lh_log2_element_size_shift = BitsPerByte*0;
> +  static const int _lh_log2_element_size_mask  = BitsPerLong-1;
> +  static const int _lh_element_type_shift      = BitsPerByte*1;
> +  static const int _lh_element_type_mask       = right_n_bits(BitsPerByte);  // shifted mask
> +  static const int _lh_header_size_shift       = BitsPerByte*2;
> +  static const int _lh_header_size_mask
= right_n_bits(BitsPerByte);  // shifted mask
> +  static const int _lh_array_tag_bits          = 2;
> +  static const int _lh_array_tag_shift         = BitsPerInt - _lh_array_tag_bits;
> +  static const int _lh_array_tag_obj_value     = ~0x01;   // 0x80000000 >> 30
> .......
>

I am okay with it but Runtime group should agree too - it is their code.

>
>
> - http://cr.openjdk.java.net/~rraghavan/8213416/webrev.00/
>
> Understood the affected code locations details from the old sample patch attachment of related JDK-8211073
> # https://bugs.openjdk.java.net/secure/attachment/79387/hotspot-disable-wextra.diff
> Also confirmed no similar warnings in hotspot/compiler with -Wextra,
> no issues with build with this proposed webrev.00

Good.

Thanks,
Vladimir

>
>
> Thanks,
> Rahul

From vladimir.kozlov at oracle.com Wed May 15 17:00:18 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 15 May 2019 10:00:18 -0700
Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management
In-Reply-To: <65F5D580-6B68-4422-AAE6-3406D7FCDE7A@sap.com>
References: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com> <9D607D81-5A25-406B-B05B-7D9C9C733D8F@sap.com> <5ec26a0b-0a49-c458-8d90-aa92396610a5@oracle.com> <65F5D580-6B68-4422-AAE6-3406D7FCDE7A@sap.com>
Message-ID: <48fb591a-6055-c1c2-b052-6d8bc770da28@oracle.com>

On 5/14/19 10:53 PM, Schmidt, Lutz wrote:
> Hi Vladimir,
>
> thank you for your comments. About filling CodeHeap with bad values after split_block:
> - in deallocate_tail, the leading part must remain intact. It contains valid code.
> - in search_freelist, one free block is split into two. There I could invalidate the contents of both parts.

Thank you for explaining.

> - If you want added safety, wouldn't it then be better to invalidate the block contents during add_to_freelist()? You could then be sure there is no executable code in a free block.

Yes, it is preferable.

Another note (after looking more at the changes). You changed where freed tail goes.
Originally it was added to the next block _next_segment (making it larger), and you created a separate small block. Doesn't it create more fragmentation?

Thanks,
Vladimir

>
> Regards,
> Lutz
>
> On 14.05.19, 23:00, "Vladimir Kozlov" wrote:
>
> On 5/14/19 1:09 PM, Schmidt, Lutz wrote:
> > Hi Vladimir,
> >
> > I had the same thought re atomicity. memset() is not consistent even on one platform. But I believe it's not a factor here. The original code was a byte-by-byte loop. And we have byte atomicity on all supported platforms, even with memset().
> >
> > It's a different thing with sequence of initialization. Do we really depend on byte(i) being initialized before byte(i+1)? If so, we would have a problem even with the explicit byte loop. Not on x86, but on ppc with its weak memory ordering.
>
> Okay, if it is byte copy I am fine with it.
>
> >
> > About segment map marking:
> > There is a short description of how the segment map works in heap.cpp, right before CodeHeap::find_start().
> > In short: each segment map element contains an (unsigned) index which, when subtracted from that element index, addresses the segment map element where the heap block starts. Thus, when you re-initialize the tail part of a heap block range to describe a newly formed heap block, the leading part remains valid.
> >
> >    Segmap  before                 after
> >    Index   split                  split
> >    I        0 <- block start       0 <- block start (now shorter)
> >    I+1      1                      1    each index 0..9 still points
> >    I+2      2                      2    back to the block start
> >    I+3      3                      3
> >    I+4      4                      4
> >    I+5      5                      5
> >    I+6      6                      6
> >    I+7      7                      7
> >    I+8      8                      8
> >    I+9      9                      9
> >    I+10    10                      0 <- new block start
> >    I+11    11                      1
> >    I+12    12                      2
> >    I+13    13                      3
> >    I+14    14                      4
> >    I+15     0 <- block start       0 <- block start
> >    I+16     1                      1
> >    I+17     2                      2
> >    I+18     3                      3
> >    I+19     4                      4
> >
> > There is a (very short) description about what's happening at the very end of search_freelist(). split_block() is called there as well. Would you like to see a similar comment in deallocate_tail()?
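The segment map behaviour described above (re-marking only the tail of a split block while the leading part stays valid) can be sketched roughly as follows. This is a simplified model and not the HotSpot code: `SegMap`, `mark_block`, `find_block_start` and `split_block_sketch` are hypothetical names, and a plain `std::vector<unsigned>` is assumed where heap.cpp uses its own segment map representation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified model (assumption, not the heap.cpp implementation):
// segmap[i] holds the distance from segment i back to the first segment
// of the heap block containing it, so a block start holds 0.
using SegMap = std::vector<unsigned>;

// Mark segments [beg, end) as belonging to one heap block.
void mark_block(SegMap& segmap, std::size_t beg, std::size_t end) {
  assert(beg < end && end <= segmap.size());
  for (std::size_t i = beg; i < end; ++i) {
    segmap[i] = static_cast<unsigned>(i - beg);
  }
}

// Index of the first segment of the block that contains segment i.
std::size_t find_block_start(const SegMap& segmap, std::size_t i) {
  assert(i < segmap.size());
  return i - segmap[i];
}

// Split block [beg, end) at split_at. Only the tail is re-marked; the
// entries of the leading part already point back to beg and stay valid.
void split_block_sketch(SegMap& segmap, std::size_t beg,
                        std::size_t split_at, std::size_t end) {
  assert(beg < split_at && split_at < end);
  mark_block(segmap, split_at, end);
}
```

With the 20-segment layout from the table (blocks at I..I+14 and I+15..I+19), splitting the first block at I+10 leaves the entries for I..I+9 untouched while I+10..I+14 restart at 0.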
>
> Thank you, I forgot that the first block mapping is still valid.
>
> What about storing bad value (in debug mode) only in second part and not both parts?
>
> >
> > Once I have your response, I will create a new webrev reflecting your input. I need to do that anyway because the assert in heap.cpp:200 has to go away. It fires spuriously. The checks can't be done at that place. In addition, I will add one line of comment and rename a local variable. That's it.
>
> Okay.
>
> Thanks,
> Vladimir
>
> >
> > Thanks,
> > Lutz
> >
> >
> > On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" wrote:
> >
> > Good.
> >
> > Do we need to be concerned about atomicity of marking? We know that memset() is not atomic (maybe I am wrong here).
> > Another thing is I did not get the logic in deallocate_tail(). split_block() marks only the second half of split segments as
> > used and (after call) stores bad values in it. What about the first part? Maybe add a comment.
> >
> > Thanks,
> > Vladimir
> >
> > On 5/14/19 3:47 AM, Schmidt, Lutz wrote:
> > > Dear all,
> > >
> > > May I please request reviews for my change?
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8223444
> > > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/
> > >
> > > What this change is all about:
> > > ------------------------------
> > > While working on another topic, I came across the code in share/memory/heap.cpp. I applied some small changes which I would call improvements.
> > >
> > > Furthermore, and in particular with these changes, the platform-specific parameter CodeCacheMinBlockLength should be fine-tuned to minimize the number of residual small free blocks. Heap block allocation does not create free blocks smaller than CodeCacheMinBlockLength. This parameter value should match the minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance for them to vanish is when a block next to them gets freed.
Otherwise, they linger around (mostly at the beginning of) the free list, slowing down the free block search.
> > >
> > > The following free block counts have been found after running JVM98 with different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown.
> > >
> > > JDK-8223444 not applied
> > > =======================
> > >
> > >         Segment  |  free blocks with CodeCacheMinBlockLength=
> > >          Size    |      1      2      3      4      6      8
> > > -----------------+-------------------------------------------
> > > aarch     128    |      0    153     75     30     38      2
> > > ppc       128    |      0    149     98     59     14      2
> > > ppcle     128    |      0    219    161    110     69     34
> > > s390      256    |      0    142     93     59     30     10
> > > x86       128    |      0    215    157    118     42     11
> > >
> > >
> > > JDK-8223444 applied
> > > ===================
> > >
> > >         Segment  |  free blocks with CodeCacheMinBlockLength=  |  suggested
> > >          Size    |      1      2      3      4      6      8   |   setting
> > > -----------------+---------------------------------------------+------------
> > > aarch     128    |    221    115     80     36      7      1   |     6
> > > ppc       128    |    245    152    101     54     14      4   |     6
> > > ppcle     128    |    243    144     89     72     20      5   |     6
> > > s390      256    |    168     60     67      8      6      2   |     4
> > > x86       128    |    223    139     83     50     11      2   |     6
> > >
> > > Thank you for your time and opinion!
> > > Lutz
> > >

From patricio.chilano.mateo at oracle.com Wed May 15 19:45:45 2019
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Wed, 15 May 2019 15:45:45 -0400
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To: <9940a897-d49d-0a22-267d-6b78424a45c2@oracle.com>
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <9940a897-d49d-0a22-267d-6b78424a45c2@oracle.com>
Message-ID: <142878b0-f048-d455-4e44-5308bb511549@oracle.com>

Hi Robbin,

Biased locking changes look good to me. Just a small comment based on one of the things I mentioned before.
I can't help but think that, since get_monitors_from_stack() will return monitors owned by the handshaked thread, the assert in Deoptimization::revoke_handshake():

assert(!mark->has_bias_pattern() || mark->biased_locker() == thread, "Can't revoke");

should be combined with the later guarantee() in BiasedLocking::revoke_own_locks_in_handshake() to be:

guarantee(!mark->has_bias_pattern() || (mark->biased_locker() == thread && prototype_header->bias_epoch() == mark->bias_epoch()), "Can't revoke");

and that would make it more evident that we can remove the call to fast_revoke() in BiasedLocking::revoke_own_locks_in_handshake(). I know you said you don't want to change biased locking behavior, but I don't think doing that should change anything. We know the condition above should hold; otherwise, if the handshaked thread is not the real biaser (because of an expired epoch), we could hit the guarantee() in BiasedLocking::revoke_own_locks_in_handshake() anyway later on if some other thread rebiased the lock before we were able to revoke it.

Thanks!
Patricio

On 5/15/19 2:26 AM, Robbin Ehn wrote:
> Hi, please see this update.
>
> I think I got all review comments fixed.
>
> Long story short, I was concerned about test coverage, so I added a stress test
> using the WB, which sometimes crashed in rubbish code.
>
> There are two bugs in the methods used by WB_DeoptimizeAll.
> (Seems I'm the first user)
>
> CodeCache::mark_all_nmethods_for_deoptimization();
> When iterating the nmethods we could see the methods being created in:
> void AdapterHandlerLibrary::create_native_wrapper(const methodHandle& method)
> And deopt the method when it was in use or before.
> Native wrappers are supposed to live as long as the class.
> I filtered out not_installed and native methods.
>
> Deoptimization::deoptimize_all_marked();
> The issue is that a not_entrant method can go to zombie at any time.
> There are several ways to make a nmethod not go zombie: nmethodLocker, have it
> on stack, avoid safepoint poll in some states, etc., which also depends on
> what type of nmethod.
> The iterator only_alive_and_not_unloading returns not_entrant nmethods, but we
> don't know their state prior to the last poll.
> in_use -> not_entrant -> #poll# -> not_entrant -> zombie
> If the iterator returns the nmethod after we passed the poll it can still be
> not_entrant but go zombie.
> The problem happens when a second thread marks a method for deopt and makes it
> not_entrant. Then after a poll we end up in deoptimize_all_marked(), but the
> method is not yet a zombie, so the iterator returns it, it becomes a zombie, thus
> passes the if check and later hits the assert.
> So there is a race between the iterator check of state and if-statement check of
> state. Fixed by also filtering out zombies.
>
> If the stress test with correction of the bugs causes trouble in review, I can
> do a follow-up with the stress test separately.
>
> Good news, no issues found with deopt with handshakes.
>
> This is v3:
> http://cr.openjdk.java.net/~rehn/8221734/v3/webrev/
>
> This full inc from v2 (review + stress test):
> http://cr.openjdk.java.net/~rehn/8221734/v3/inc/
>
> This inc is the review part from v2:
> http://cr.openjdk.java.net/~rehn/8221734/v3/inc_review/
>
> This inc is the additional stress test with bug fixes:
> http://cr.openjdk.java.net/~rehn/8221734/v3/inc_test/
>
> Additional biased locking change:
> The original code uses the same copy of markOop in revoke_and_rebias.
> To keep the same behavior I now pass that copy into fast_revoke.
>
> The stress test passes hundreds of iterations in mach5.
> Thousands of stress tests locally, the issues above were reproducible.
> Inc changes also pass t1-5.
>
> As usual with this change-set, I'm continuously running more tests.
>
> Thanks, Robbin
>
> On 2019-04-25 14:05, Robbin Ehn wrote:
>> Hi all, please review.
>>
>> Let's deopt with handshakes.
>> Removed VM op Deoptimize, instead we handshake.
>> Locks need to be inflated since we are not in a safepoint.
>>
>> Goes on top of:
>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html
>>
>>
>> Code:
>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html
>> Issue:
>> https://bugs.openjdk.java.net/browse/JDK-8221734
>>
>> Passes t1-7 and multiple t1-5 runs.
>>
>> A few startup benchmarks see a small speedup.
>>
>> Thanks, Robbin

From ekaterina.pavlova at oracle.com Wed May 15 20:22:47 2019
From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova)
Date: Wed, 15 May 2019 13:22:47 -0700
Subject: RFR(T): 8223910: TestFloatJNIArgs and TestTrichotomyExpressions time out with Graal as JIT
Message-ID:

Hi,

Please review the following changes which disable two tests from running with Graal.

test/hotspot/jtreg/compiler/floatingpoint/TestFloatJNIArgs.java is executed in 3 configurations.
The 3rd one sets "-XX:-TieredCompilation -Xcomp", which in Graal as JIT mode results in all Graal methods
being compiled by Graal itself running in interpreter mode, which is very slow and causes the test to time out.
So, I split this test in two and disabled the 2nd one from running with Graal.

compiler/codegen/TestTrichotomyExpressions.java sets "-XX:-TieredCompilation -Xbatch".
Current slowness should go away once we have libgraal. So putting this test into the Graal-specific problem list.
JBS: https://bugs.openjdk.java.net/browse/JDK-8223910
webrev: http://cr.openjdk.java.net/~epavlova//8223910/webrev.00/index.html

thanks,
-katya

From jatin.bhateja at intel.com Thu May 16 02:11:30 2019
From: jatin.bhateja at intel.com (Bhateja, Jatin)
Date: Thu, 16 May 2019 02:11:30 +0000
Subject: [PATCH] Elemental shifts and rotates speedup
Message-ID:

Hi All,

Please find a patch with the following changes:

A) Intrinsification of two vector APIs:
   1) VectorShuffle.shuffleIota(VectorSpecies, int)
   2) VectorShuffle.toVector()

B) Re-implementation of the following vector APIs using the above intrinsified APIs:
   1) Vector.shiftLanesLeft(int)
   2) Vector.shiftLanesRight(int)
   3) Vector.rotateLanesLeft(int)
   4) Vector.rotateLanesRight(int)

With this we see ~2X gains in elemental shift and rotate operations.

Webrev: http://cr.openjdk.java.net/~kkharbas/Jatin/rotate_and_shift_lanes/webrev.00/

Kindly review the patch.

Best Regards,
Jatin

From david.holmes at oracle.com Thu May 16 05:43:42 2019
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 16 May 2019 15:43:42 +1000
Subject: [13] RFR: 8213416: Replace some enums with static const members in hotspot/compiler
In-Reply-To: <1f7afc19-0756-33f8-54f5-2438ed5da886@oracle.com>
References: <1f7afc19-0756-33f8-54f5-2438ed5da886@oracle.com>
Message-ID: <55bac532-4df7-d3a3-6b87-90f21e228aab@oracle.com>

This all seems like unnecessary churn to me - is any of this code actually wrong? Can we not just disable this particular warning? Is there any point using "static const" when we should be aiming to use C++11 constexpr in the (not too distant?) future?

Converting from enums to unrelated ints seems a big step backwards in software engineering terms. :(

Cheers,
David
-----

On 16/05/2019 12:46 am, Rahul Raghavan wrote:
> Hi,
>
> Request help review and finalize fix for 8213416.
>
> - http://cr.openjdk.java.net/~rraghavan/8213416/webrev.00/
>
> https://bugs.openjdk.java.net/browse/JDK-8213416
>
> The requirement is to solve
>   "enumeral and non-enumeral type in conditional expression" warnings
>   triggered with -Wextra enabled for gcc on hotspot.
>
> (hotspot/compiler part is handled in this 8213416 task
> and hotspot/runtime in 8223400)
>
> The same warning is generated for ternary operator statements like-
>   (x ? int_val : enum_val).
> e.g.:
> comp_level = TieredCompilation ? TieredStopAtLevel : CompLevel_highest_tier;
>
>
> Understood from comments that the following type typecast solution
> proposed earlier was not accepted.
> - comp_level = TieredCompilation ? TieredStopAtLevel : CompLevel_highest_tier;
> + comp_level = TieredCompilation ? TieredStopAtLevel : (int) CompLevel_highest_tier;
> and then proposed solution was to rewrite those enums to be static const
> members.
>
>
> Tried changes based on the comments info from JBS.
> Extracts of related JBS comments-
> - ".... it's just a simple code refactoring. "
> - "David H. only complained about NO_HASH which we can fix.
> We can also fix CompLevel_highest_tier usage - should use CompLevel type
> everywhere.
> But I would not touch Op_RegFlags -
>   I don't want to complicate its construction and
>   we have a lot of places where Op_ are used as uint.
>   I would only fix places where it is used as int to make sure it is
> used as uint everywhere."
>
>
>
> Reported enums in question for hotspot/compiler
> 1) NO_HASH
> 2) CompLevel_highest_tier
> 3) Op_RegFlags
> 4) _lh_array_tag_obj_value, _lh_instance_slow_path_bit
>
>
>
> 1) NO_HASH
> tried [open/src/hotspot/share/opto/node.hpp]
> -  enum { NO_HASH = 0 };
> +  static const uint NO_HASH = 0;
>
>
>
> 2) CompLevel_highest_tier
> Only one warning in process_compile()
> [open/src/hotspot/share/ci/ciReplay.cpp]
> comp_level = TieredCompilation ? TieredStopAtLevel : CompLevel_highest_tier;
>
> Following type changes tried did not help -
> -    int comp_level = parse_int(comp_level_label);
> +    CompLevel comp_level = parse_int(comp_level_label);
> .....
> -      comp_level = TieredCompilation ? TieredStopAtLevel : CompLevel_highest_tier;
> +      comp_level = TieredCompilation ? (CompLevel) TieredStopAtLevel : CompLevel_highest_tier;
>
>
> The warning is with only ternary operator usage in this location.
> So tried simple code refactoring like following and got no more warnings!
> Is this okay?
> -      comp_level = TieredCompilation ? TieredStopAtLevel : CompLevel_highest_tier;
> +      if (TieredCompilation) {
> +        comp_level = TieredStopAtLevel;
> +      } else {
> +        comp_level = CompLevel_highest_tier;
> +      }
>
>
>
> 3) Op_RegFlags
> Warnings only for 'virtual uint MachNode::ideal_reg() const'
>
> ../open/src/hotspot/share/opto/machnode.hpp: In member function 'virtual uint MachNode::ideal_reg() const':
> ../open/src/hotspot/share/opto/machnode.hpp:304:95: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
>    virtual uint ideal_reg() const { const Type *t = _opnds[0]->type(); return t == TypeInt::CC ? Op_RegFlags : t->ideal_reg(); }
>
>       ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Op_RegFlags is returned as uint itself here.
> How to modify code to solve warning?
> Again since the issue is with only ternary operator usage in only one
> location, can we go for simple code refactoring like following?
>
> -  virtual uint ideal_reg() const { const Type *t = _opnds[0]->type(); return t == TypeInt::CC ? Op_RegFlags : t->ideal_reg(); }
> +  virtual uint ideal_reg() const {
> +    const Type *t = _opnds[0]->type();
> +    if (t == TypeInt::CC) {
> +      return Op_RegFlags;
> +    } else {
> +      return t->ideal_reg();
> +    }
> +  }
>
>
>
> 4) _lh_array_tag_obj_value, _lh_instance_slow_path_bit -
> warnings locations -
>
> (i) ../open/src/hotspot/share/oops/klass.cpp: In static member function 'static jint Klass::array_layout_helper(BasicType)':
> ../open/src/hotspot/share/oops/klass.cpp:212:23: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
>    int  tag   =  isobj ? _lh_array_tag_obj_value : _lh_array_tag_type_value;
>
> (ii) ../open/src/hotspot/cpu/x86/c1_Runtime1_x86.cpp: In static member function 'static OopMapSet* Runtime1::generate_code_for(Runtime1::StubID, StubAssembler*)':
> ../open/src/hotspot/cpu/x86/c1_Runtime1_x86.cpp:1126:22: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
>            int tag = ((id == new_type_array_id)
>                       ~~~~~~~~~~~~~~~~~~~~~~~~~
>                       ? Klass::_lh_array_tag_type_value
>                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>                       : Klass::_lh_array_tag_obj_value);
>
> (iii) ../open/src/hotspot/share/oops/klass.hpp: In static member function 'static jint Klass::instance_layout_helper(jint, bool)':
> ../open/src/hotspot/share/oops/klass.hpp:422:28: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
>        |    (slow_path_flag ? _lh_instance_slow_path_bit : 0);
>              ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Following changes fixed the warnings!
> (using static const int instead of unnamed enum)
> [open/src/hotspot/share/oops/klass.cpp]
> ..........
>    // Unpacking layout_helper:
> -  enum {
> -    _lh_neutral_value           = 0,  // neutral non-array non-instance value
> -    _lh_instance_slow_path_bit  = 0x01,
> -    _lh_log2_element_size_shift = BitsPerByte*0,
> -    _lh_log2_element_size_mask  = BitsPerLong-1,
> -    _lh_element_type_shift      = BitsPerByte*1,
> -    _lh_element_type_mask       = right_n_bits(BitsPerByte),  // shifted mask
> -    _lh_header_size_shift       = BitsPerByte*2,
> -    _lh_header_size_mask        = right_n_bits(BitsPerByte),  // shifted mask
> -    _lh_array_tag_bits          = 2,
> -    _lh_array_tag_shift         = BitsPerInt - _lh_array_tag_bits,
> -    _lh_array_tag_obj_value     = ~0x01   // 0x80000000 >> 30
> -  };
> +  static const int _lh_neutral_value           = 0;  // neutral non-array non-instance value
> +  static const int _lh_instance_slow_path_bit  = 0x01;
> +  static const int _lh_log2_element_size_shift = BitsPerByte*0;
> +  static const int _lh_log2_element_size_mask  = BitsPerLong-1;
> +  static const int _lh_element_type_shift      = BitsPerByte*1;
> +  static const int _lh_element_type_mask       = right_n_bits(BitsPerByte);  // shifted mask
> +  static const int _lh_header_size_shift       = BitsPerByte*2;
> +  static const int _lh_header_size_mask        = right_n_bits(BitsPerByte);  // shifted mask
> +  static const int _lh_array_tag_bits          = 2;
> +  static const int _lh_array_tag_shift         = BitsPerInt - _lh_array_tag_bits;
> +  static const int _lh_array_tag_obj_value     = ~0x01;   // 0x80000000 >> 30
> .......
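The three ways of avoiding the "enumeral and non-enumeral type in conditional expression" warning that come up in this thread (explicit cast, if/else refactoring, static const int member) can be compared in a small stand-alone sketch. The names and values below are illustrative stand-ins, not the HotSpot declarations: `level_with_cast`, `level_with_if_else` and `LayoutHelperSketch` are hypothetical, and the enumerator value 4 is an assumption made only for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for the HotSpot enum; the value is illustrative.
enum CompLevel { CompLevel_none = 0, CompLevel_highest_tier = 4 };

// The pattern gcc -Wextra reports: one arm is an int, the other an enumerator.
//   int level = tiered ? stop_at_level : CompLevel_highest_tier;

// Variant 1: explicit cast on the enumerator arm.
int level_with_cast(bool tiered, int stop_at_level) {
  return tiered ? stop_at_level : static_cast<int>(CompLevel_highest_tier);
}

// Variant 2: if/else instead of the conditional operator.
int level_with_if_else(bool tiered, int stop_at_level) {
  int level;
  if (tiered) {
    level = stop_at_level;
  } else {
    level = CompLevel_highest_tier;
  }
  return level;
}

// Variant 3: a static const int member instead of an unnamed enum, so both
// arms of the conditional expression have type int.
struct LayoutHelperSketch {
  static const int lh_instance_slow_path_bit = 0x01;
  static int slow_path_bits(bool slow_path_flag) {
    return slow_path_flag ? lh_instance_slow_path_bit : 0;
  }
};
// Out-of-class definition; needed if the constant is ever odr-used,
// for example when it is bound to a const reference.
const int LayoutHelperSketch::lh_instance_slow_path_bit;

// Small usage check for the three variants.
void check_variants() {
  assert(level_with_cast(false, 1) == level_with_if_else(false, 1));
  assert(level_with_cast(true, 1) == level_with_if_else(true, 1));
  assert(LayoutHelperSketch::slow_path_bits(true) == 0x01);
}
```

One trade-off worth noting for variant 3: unlike an enumerator, a static const data member is an object, so code that odr-uses it relies on the out-of-class definition being present in exactly one translation unit.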
>
>
>
> - http://cr.openjdk.java.net/~rraghavan/8213416/webrev.00/
>
> Understood the affected code locations details from the old sample patch
> attachment of related JDK-8211073
> # https://bugs.openjdk.java.net/secure/attachment/79387/hotspot-disable-wextra.diff
>
> Also confirmed no similar warnings in hotspot/compiler with -Wextra,
> no issues with build with this proposed webrev.00
>
>
> Thanks,
> Rahul

From rahul.v.raghavan at oracle.com Thu May 16 06:33:04 2019
From: rahul.v.raghavan at oracle.com (Rahul Raghavan)
Date: Thu, 16 May 2019 12:03:04 +0530
Subject: [13] RFR: 8213416: Replace some enums with static const members in hotspot/compiler
In-Reply-To: <8f18e15d-cae8-58eb-b4a1-870ca6ffaf15@oracle.com>
References: <1f7afc19-0756-33f8-54f5-2438ed5da886@oracle.com> <8f18e15d-cae8-58eb-b4a1-870ca6ffaf15@oracle.com>
Message-ID: <62869e18-3deb-435d-1ce8-7726866d79eb@oracle.com>

Hi,

Thank you Vladimir for review comments.

>> 4) _lh_array_tag_obj_value, _lh_instance_slow_path_bit -
>> [open/src/hotspot/share/oops/klass.cpp]
>> ..........
>
> I am okay with it but Runtime group should agree too - it is their code.
>

Yes, I missed that it is Runtime code.
Please note the plan is to handle only the hotspot/compiler part of the requirement changes in JDK-8213416.
As per earlier JBS comments, the new JDK-8223400 was created to cover the requirements in hotspot/runtime.
So may I suggest moving the above runtime change requirement details to JDK-8223400, and using only the remaining changes, as in the updated webrev below, here for 8213416.
- http://cr.openjdk.java.net/~rraghavan/8213416/webrev.01/ Thanks, Rahul From lutz.schmidt at sap.com Thu May 16 09:55:28 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Thu, 16 May 2019 09:55:28 +0000 Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output In-Reply-To: <2ffc4c9c-91cb-2d04-e03e-6620d4443034@oracle.com> References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com> <2ffc4c9c-91cb-2d04-e03e-6620d4443034@oracle.com> Message-ID: Hi Vladimir, sorry for the delayed reaction on your comments. - now it reads "static unsigned int instr_len()". This change added cpu/s390/assembler_s390.inline.hpp to the list of modified files. - testing from my side will be via the submit repo (BuildId: 2019-05-15-1543576.lutz.schmidt.source, no failures). In addition, I added the patch to our internal builds so that our inhouse testing will cover it (no issues detected last night). - All the "hsdis-" prefixes in the PrintAssemblyOptions are gone, as are "print-pc" and "print-bytes". The latter two were legacy anyway. I kept them for compatibility. But now, without the prefix, there is no compatibility anymore. - Options parsing improvement will be done in a separate effort. I have created JDK-8223765 for that. - there is a new webrev, based on the current jdk/jdk repo: https://cr.openjdk.java.net/~lucy/webrevs/8213084.03/ ~thartmann: The disabled code in disassembler_s390.cpp is something I would like to have. So far, I could not find time to make it work reliably. I would like to keep it in as a reminder and a template to build on. Thanks, Lutz On 10.05.19, 23:16, "Vladimir Kozlov" wrote: Hi Lutz, My comments are inlined below. On 5/10/19 8:44 AM, Schmidt, Lutz wrote: > Thank you, Vladimir! > Please find my comments inline and let me know what you think. 
> A new webrev with all the updates is here: https://cr.openjdk.java.net/~lucy/webrevs/8213084.02/

Found one more I missed last time:

assembler_s390.hpp: still signed return (on other platforms it was converted to unsigned):
static int instr_len(unsigned char *instr);

> Please note: the webrev is not based on the most current jdk/jdk! I do not like the idea to "hg pull -u" to a repo state which is known to be broken. Once jdk/jdk is repaired, I will update the webrev in-place (provided there were no serious clashes) and send a short note.

NP. Please, provide final webrev when you can so that I can run these changes through our testing to make sure no issues are present (especially in builds).

> Regards,
> Lutz
>
> On 09.05.19, 21:30, "Vladimir Kozlov" wrote:
>
> Hi Lutz,
>
> Thank you for doing this great work.
>
> I have just small comments:
>
> x86_64.ad - empty change.
> File contains whitespace changes for formatting. Not visible in webrev.

Okay.

>
> nmethod.cpp - LUCY?
>
> + st->print_cr("LUCY: NULL-oop");
> + tty->print("LUCY NULL-oop");
> Oops. Leftover debugging output. Removed. Reads "NULL-oop" now.

Okay.

>
> nmethod.cpp - use PTR64_FORMAT instead of '0x%016lx'.
> Changed.
>
> vmreg.cpp - Use INTPTR_FORMAT instead of %ld for value().
> Changed.
>
> disassembler.* - LUCY_OBSOLETE?
>
> +#if defined(LUCY_OBSOLETE) // Used in SAPJVM only
> This is fancy code to step backwards in CISC instructions. Used to print a +/- range around a given instruction address. Works reasonably well on s390, will probably not work at all for x86. I could not finally decide to kick it out. But now I did. It's gone.

Okay.

>
> compilerDefinitions.hpp - I don't see where tier_digit() is used.
> I'm surprised myself. Introduced it and then made it obsolete. It's gone.
>
> disassembler.cpp - PrintAssemblyOptions. Why do you need to have 'hsdis-' in all options values?
You > need to check for invalid value and print help output in such case - it will be very useful if you > forgot a value spelling. Also add line for 'help' value. > > The hsdis- prefix existed before I started my work. I just kept it to not hurt anybody's feelings__. Actually, the prefix has a minor practical use. It guards the many "if (strstr(..." instructions from being executed if there is no use. I'm personally not emotionally attached to the hsdis- prefix. I can remove it if you (and the other reviewers) like. Not changed as of now. Awaiting your input. It is a pain to type long values and annoying to type the same prefix. I think hsdis- prefix is useless because PrintAssemblyOptions is used only for disassembler and there are no values which don't have hsdis- prefix. This is not performance critical code to have a guard (check prefix). And an other commented new line: + // ost->print_cr("PrintAssemblyOptions='%s'", options()); > > Printing help text: There is an option (hsdis-help) to request help text printout. > > Options parsing doesn't exist here. It's just string comparisons. If one of the predefined strings is found - fine. If not - so what. If you would like to detect unrecognized input, process_options() needs significantly more intelligence. I can do that, but would like to do it in a separate effort. Your opinion? Got it. I forgot that PrintAssemblyOptions flag accepts string with *list* of values - you can't use if-else or switch without complicating the code. I noticed that PrintAssemblyOptions is defined as ccstr. Why it is not ccstrlist which should be use here? I don't think next comment is correct for ccstr type: http://hg.openjdk.java.net/jdk/jdk/file/ef73702a906e/src/hotspot/share/compiler/disassembler.cpp#l190 It would be nice to fix it but you can do it later if you don't want to add more changes. > > Do you need next commented lines: > > disassembler.cpp - > +// ptrdiff_t _offset; > Deleted. 
> > +// Output suppressed because it messes up disassembly. > +// output()->print_cr("[Disassembling for mach='%s']", (const char*)arg); > Uncommented, would like to keep it. Made the if condition permanently false. > > disassembler_s390.cpp - > +// st->fill_to(((st->position()+3*tsize-1)/tsize)*tsize); > Deleted. > > compile.cpp - > +// st->print("# "); _tf->dump_on(st); st->cr(); > Uncommented. > > > abstractDisassembler.cpp - > // st->print("0x%016lx", *((julong*)here)); > st->print("0x%016lx", *((uintptr_t*)here)); > // st->print("0x%08x%08x", *((juint*)here), *((juint*)(here+4))); > Commented lines are gone. > > abstractDisassembler.cpp - may be explicit cast (byte*)?: > > st->print("%2.2x", *byte); > st->print("%2.2x", *pos); > st->print("0x%02x", *here); > Didn't see the need because the pointers are char* (= address) anyway. And, according to cppreference.com, std::byte is a C++17 feature. We are not there yet. okay > > PTR64_FORMAT ?: > st->print("0x%016lx", *((uintptr_t*)here)); > I'm kind of hesitant on that. Nice output alignment clearly depends on this to output exactly 18 characters. Changed other occurrences, so I changed this one as well. Thanks, Vladimir > > > Thanks, > Vladimir > > On 5/8/19 8:31 AM, Schmidt, Lutz wrote: > > Dear Community, > > > > may I please request comments and reviews for this change? Thank you! > > > > I have created a new webrev which is based on the current jdk/jdk repo. There was some merge effort. The code which constitutes this patch was not changed. 
Here's the webrev link:
> > https://cr.openjdk.java.net/~lucy/webrevs/8213084.01/
> >
> > Regards,
> > Lutz
> >
> > On 11.04.19, 23:24, "Schmidt, Lutz" wrote:
> >
> > Dear All,
> >
> > this topic was discussed back in Nov/Dec 2018:
> > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-November/031552.html
> > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-December/031641.html
> >
> > Purpose of the discussion was to find out if my ideas are at all regarded useful and desirable.
> > The result was mixed, some pro, some con. I let the input from back then influence my work of the last months. In particular, output verbosity can be controlled in a wide range now. In addition to the general -XX:+Print* switches, the amount of output can be adjusted by newly introduced -XX:PrintAssemblyOptions. Here is the list (with default settings):
> >
> > PrintAssemblyOptions help:
> >   hsdis-print-raw           test plugin by requesting raw output (deprecated)
> >   hsdis-print-raw-xml       test plugin by requesting raw xml (deprecated)
> >   hsdis-print-pc            turn off PC printing (on by default) (deprecated)
> >   hsdis-print-bytes         turn on instruction byte output (deprecated)
> >
> >   hsdis-show-pc             toggle printing current pc, currently ON
> >   hsdis-show-offset         toggle printing current offset, currently OFF
> >   hsdis-show-bytes          toggle printing instruction bytes, currently OFF
> >   hsdis-show-data-hex       toggle formatting data as hex, currently ON
> >   hsdis-show-data-int       toggle formatting data as int, currently OFF
> >   hsdis-show-data-float     toggle formatting data as float, currently OFF
> >   hsdis-show-structs        toggle compiler data structures, currently OFF
> >   hsdis-show-comment        toggle instruction comments, currently OFF
> >   hsdis-show-block-comment  toggle block comments, currently OFF
> >   hsdis-align-instr         toggle instruction alignment, currently OFF
> >
> > Finally, I have pushed my changes to a state where I can dare to request your comments and reviews.
I would like to suggest and request that we first focus on the effects (i.e. the generated output) of the changes. Once we got that adjusted and accepted, we can check the actual implementation and add improvements there. Sounds like a plan? Here is what you get: > > > > The machine code generated by the JVM can be printed in three different formats: > > - Hexadecimal. > > This is basically a hex dump of the memory range containing the code. > > This format is always available (PRODUCT and not-PRODUCT builds), regardless > > of the availability of a disassembler library. It applies to all sorts of > > code, be it blobs, stubs, compiled nmethods, ... > > This format seems useless at first glance, but it is not. In an upcoming, > > separate enhancement, the JVM will be made capable of reading files > > containing such code blocks and disassembling them post mortem. The most > > prominent example is an hs_err* file. > > - Disassembled. > > This is an assembly listing of the instructions as found in the memory range > > occupied by the blob, stub, compiled nmethod ... As a prerequisite, a suitable > > disassembler library (hsdis-.so) must be available at runtime. > > Most often, that will only be the case in test environments. If no disassembler > > library is available, hexadecimal output is used as fallback. > > - OptoAssembly. > > This is a meta code listing created only by the C2 compiler. As it is somewhat > > closer to the Java code, it may be helpful in linking assembly code to Java code. > > > > All three formats can be merged with additional information, most prominently compiler-internal "knowledge" about blocks, related bytecodes, statistics counters, and much more. > > > > Following the code itself, compiler-internal data structures, like oop maps, relocations, scopes, dependencies, exception handlers, are printed to aid in debugging. > > > > The full set of information is available in non-PRODUCT builds. PRODUCT builds do not support OptoAssembly output. 
Data structures are unavailable as well. > > > > So how does the output actually look like? Here are a few small snippets (linuxx86_64) to give you an idea. The complete output of an entire C2-compiled method, in multiple verbosity variants, is available here: > > http://cr.openjdk.java.net/~lucy/webrevs/8213084/ > > > > OptoAssembly output for reference (always on with PrintAssembly): > > ================================================================= > > > > 036 B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > > 036 movl RBP, [RSI + #12 (8-bit)] # compressed ptr ! Field: java/lang/String.value (constant) > > 039 movl R11, [RBP + #12 (8-bit)] # range > > 03d NullCheck RBP > > > > 03d B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > > 03d cmpl RDX, R11 # unsigned > > 040 jnb,us B6 P=0.000000 C=5375.000000 > > > > PrintAssembly with no disassembler library available: > > ===================================================== > > > > [Code] > > [Entry Point] > > 0x00007fc74d1d7b20: 448b 5608 49c1 e203 493b c20f 856f 69e7 ff90 9090 9090 9090 9090 9090 9090 9090 > > [Verified Entry Point] > > 0x00007fc74d1d7b40: 8984 2400 a0fe ff55 4883 ec20 440f be5e 1445 85db 7521 8b6e 0c44 8b5d 0c41 3bd3 > > 0x00007fc74d1d7b60: 732c 0fb6 4415 1048 83c4 205d 4d8b 9728 0100 0041 8502 c348 8bee 8914 2444 895c > > 0x00007fc74d1d7b80: 2404 be4d ffff ffe8 1483 e7ff 0f0b bee5 ffff ff89 5424 04e8 0483 e7ff 0f0b bef6 > > 0x00007fc74d1d7ba0: ffff ff89 5424 04e8 f482 e7ff 0f0b f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 > > [Exception Handler] > > 0x00007fc74d1d7bc0: e95b 0df5 ffe8 0000 0000 4883 2c24 05e9 0c7d e7ff > > [End] > > > > PrintAssembly with minimal verbosity: > > ===================================== > > > > 0x00007f0434b89bd6: mov 0xc(%rsi),%ebp > > 0x00007f0434b89bd9: mov 0xc(%rbp),%r11d > > 0x00007f0434b89bdd: cmp %r11d,%edx > > 0x00007f0434b89be0: jae 0x00007f0434b89c0e > > > > PrintAssembly (previous plus code offsets from code begin): > > 
=========================================================== > > > > 0x00007f63c11d7956 (+0x36): mov 0xc(%rsi),%ebp > > 0x00007f63c11d7959 (+0x39): mov 0xc(%rbp),%r11d > > 0x00007f63c11d795d (+0x3d): cmp %r11d,%edx > > 0x00007f63c11d7960 (+0x40): jae 0x00007f63c11d798e > > > > PrintAssembly (previous plus block comments): > > =========================================================== > > > > ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > > 0x00007f48211d76d6 (+0x36): mov 0xc(%rsi),%ebp > > 0x00007f48211d76d9 (+0x39): mov 0xc(%rbp),%r11d > > ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > > 0x00007f48211d76dd (+0x3d): cmp %r11d,%edx > > 0x00007f48211d76e0 (+0x40): jae 0x00007f48211d770e > > > > PrintAssembly (previous plus instruction comments): > > =========================================================== > > > > ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > > 0x00007fc3e11d7a56 (+0x36): mov 0xc(%rsi),%ebp ;*getfield value {reexecute=0 rethrow=0 return_oop=0} > > ; - java.lang.String::charAt at 8 (line 702) > > 0x00007fc3e11d7a59 (+0x39): mov 0xc(%rbp),%r11d ; implicit exception: dispatches to 0x00007fc3e11d7a9e > > ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > > 0x00007fc3e11d7a5d (+0x3d): cmp %r11d,%edx > > 0x00007fc3e11d7a60 (+0x40): jae 0x00007fc3e11d7a8e > > > > For completeness, here are the links to > > Bug: https://bugs.openjdk.java.net/browse/JDK-8213084 > > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8213084.00/ > > > > But please, as mentioned above, first focus on the output. The nitty details of the implementation I would like to discuss after the output format has received some support. > > > > Thank you so much for your time! 
> > Lutz > > > > > > > > > > From rahul.v.raghavan at oracle.com Thu May 16 09:56:57 2019 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Thu, 16 May 2019 15:26:57 +0530 Subject: [13] RFR: 8213416: Replace some enums with static const members in hotspot/compiler In-Reply-To: <55bac532-4df7-d3a3-6b87-90f21e228aab@oracle.com> References: <1f7afc19-0756-33f8-54f5-2438ed5da886@oracle.com> <55bac532-4df7-d3a3-6b87-90f21e228aab@oracle.com> Message-ID: <9263192b-4f10-b470-7b56-64295319bd4d@oracle.com> Hi, Thank you, David, for the review comments. I would kindly ask Magnus to help answer the main questions. Sharing some notes and related links: - 8211073: Remove -Wno-extra from Hotspot https://bugs.openjdk.java.net/browse/JDK-8211073 - Discussions in earlier thread - https://mail.openjdk.java.net/pipermail/hotspot-dev/2018-September/034314.html As I understand it, -Wextra does help catch valid and useful warnings, but it also brings some overly strict ones, such as the "enumeral and non-enumeral type in conditional expression" warnings. An extract from Magnus's JBS comments on 8211073 regarding the 'enum-warning': "... If you think that gcc is a bit too picky here, I agree. It's not obvious per se that the added casts improve the code. However, this is the price we need to pay to be able to enable -Wextra, and *that* is something that is likely to improve the code." Thanks, Rahul On 16/05/19 11:13 AM, David Holmes wrote: > This all seems like unnecessary churn to me - is any of this code > actually wrong? can we not just disable this particular warning? is > there any point using "static const" when we should be aiming to use > C++11 constexpr in the (not too distant?) future? > > Converting from enums to unrelated ints seems a big step backwards in > software engineering terms.
:( > > Cheers, > David > ----- > From fujie at loongson.cn Thu May 16 11:15:56 2019 From: fujie at loongson.cn (Jie Fu) Date: Thu, 16 May 2019 19:15:56 +0800 Subject: RFR:8222302:[TESTBUG]test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU In-Reply-To: References: Message-ID: Ping. Could someone help to review this change? I would greatly appreciate it if this test case could be used for our mips-port jdk. Thanks a lot. Best regards, Jie On 2019/4/11 ??9:51, Jie Fu wrote: > Hi all, > > JBS:    https://bugs.openjdk.java.net/browse/JDK-8222302 > Webrev: http://cr.openjdk.java.net/~jiefu/8222302/webrev.00/ > > TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU (not > AArch64, PPC, S390x, SPARC or X86). > It is designed to test "UseSHASpecificTestCaseForUnsupportedCPU"[1] > and "GenericTestCaseForOtherCPU"[2] on any other CPU[3]. > But when they run on any other CPU (e.g., mips), an exception[4] is > always thrown, which causes the failure. > So there seems to be a logical bug in it. > > The change has been tested on mips and x86. > Could you please review it? > Thanks a lot.
> > Best regards, > Jie > > [1] > http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l56 > [2] > http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l58 > [3] > http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForOtherCPU.java#l34 > [4] > http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/SHAOptionsBase.java#l92 > > From tobias.hartmann at oracle.com Thu May 16 12:59:38 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 16 May 2019 14:59:38 +0200 Subject: RFR(T): 8223910: TestFloatJNIArgs and TestTrichotomyExpressions time out with Graal as JIT In-Reply-To: References: Message-ID: Hi Katya, looks good to me. Thanks, Tobias On 15.05.19 22:22, Ekaterina Pavlova wrote: > Hi, > > Please review the following changes, which disable two tests from running with Graal. > > test/hotspot/jtreg/compiler/floatingpoint/TestFloatJNIArgs.java is executed in 3 configurations. > The 3rd one sets "-XX:-TieredCompilation -Xcomp", which in Graal-as-JIT mode results in > all Graal methods being compiled by Graal itself running in interpreter mode, which is > very slow and causes the test to time out. So I split this test in two and disabled the 2nd > one from running with Graal. > > compiler/codegen/TestTrichotomyExpressions.java sets "-XX:-TieredCompilation -Xbatch". > The current slowness should go away once we have libgraal, so I am putting this test into the > Graal-specific problem list. > >
JBS: https://bugs.openjdk.java.net/browse/JDK-8223910 > webrev: http://cr.openjdk.java.net/~epavlova//8223910/webrev.00/index.html > > thanks, > -katya From vladimir.kozlov at oracle.com Thu May 16 16:47:49 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 16 May 2019 09:47:49 -0700 Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output In-Reply-To: References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com> <2ffc4c9c-91cb-2d04-e03e-6620d4443034@oracle.com> Message-ID: <06ce086b-43f9-b570-8b97-55c7c14745a0@oracle.com> Nice. I submitted our tier1-3 testing. Thanks, Vladimir On 5/16/19 2:55 AM, Schmidt, Lutz wrote: > Hi Vladimir, > > sorry for the delayed reaction on your comments. > > - now it reads "static unsigned int instr_len()". This change added cpu/s390/assembler_s390.inline.hpp to the list of modified files. > - testing from my side will be via the submit repo (BuildId: 2019-05-15-1543576.lutz.schmidt.source, no failures). In addition, I added the patch to our internal builds so that our inhouse testing will cover it (no issues detected last night). > - All the "hsdis-" prefixes in the PrintAssemblyOptions are gone, as are "print-pc" and "print-bytes". The latter two were legacy anyway. I kept them for compatibility. But now, without the prefix, there is no compatibility anymore. > - Options parsing improvement will be done in a separate effort. I have created JDK-8223765 for that. > - there is a new webrev, based on the current jdk/jdk repo: https://cr.openjdk.java.net/~lucy/webrevs/8213084.03/ > > ~thartmann: > The disabled code in disassembler_s390.cpp is something I would like to have. So far, I could not find time to make it work reliably. I would like to keep it in as a reminder and a template to build on. > > Thanks, > Lutz > > On 10.05.19, 23:16, "Vladimir Kozlov" wrote: > > Hi Lutz, > > My comments are inlined below.
> > On 5/10/19 8:44 AM, Schmidt, Lutz wrote: > > Thank you, Vladimir! > > Please find my comments inline and let me know what you think. > > A new webrev with all the updates is here: https://cr.openjdk.java.net/~lucy/webrevs/8213084.02/ > > Found one more I missed last time: > > assembler_s390.hpp: still signed return (on other platforms it was converted to unsigned): > static int instr_len(unsigned char *instr); > > > Please note: the webrev is not based on the most current jdk/jdk! I do not like the idea to "hg pull -u" to a repo state which is known to be broken. Once jdk/jdk is repaired, I will update the webrev in-place (provided there were no serious clashes) and send a short note. > > NP. Please, provide final webrev when you can so that I can run these changes through our testing to > make sure no issues are present (especially in builds). > > > Regards, > > Lutz > > > > On 09.05.19, 21:30, "Vladimir Kozlov" wrote: > > > > Hi Lutz, > > > > Thank you for doing this great work. > > > > I have just small comments: > > > > x86_64.ad - empty change. > > File contains whitespace changes for formatting. Not visible in webrev. > > Okay. > > > > > nmethod.cpp - LUCY? > > > > + st->print_cr("LUCY: NULL-oop"); > > + tty->print("LUCY NULL-oop"); > > Oops. Leftover debugging output. Removed. Reads "NULL-oop" now. > > Okay. > > > > > nmethod.cpp - use PTR64_FORMAT instead of '0x%016lx'. > > Changed. > > > > vmreg.cpp - Use INTPTR_FORMAT instead of %ld for value(). > > Changed. > > > > disassembler.* - LUCY_OBSOLETE? > > > > +#if defined(LUCY_OBSOLETE) // Used in SAPJVM only > > This is fancy code to step backwards in CISC instructions. Used to print a +/- range around a given instruction address. Works reasonably well on s390, will probably not work at all for x86. I could not finally decide to kick it out. But now I did. It's gone. > > Okay. > > > > > compilerDefinitions.hpp - I don't see where tier_digit() is used. > > I'm surprised myself.
Introduced it and then made it obsolete. It's gone. > > > > disassembler.cpp - PrintAssemblyOptions. Why do you need to have 'hsdis-' in all options values? You > > need to check for invalid value and print help output in such case - it will be very useful if you > > forgot a value spelling. Also add line for 'help' value. > > > > The hsdis- prefix existed before I started my work. I just kept it to not hurt anybody's feelings. Actually, the prefix has a minor practical use. It guards the many "if (strstr(..." instructions from being executed if there is no use. I'm personally not emotionally attached to the hsdis- prefix. I can remove it if you (and the other reviewers) like. Not changed as of now. Awaiting your input. > > It is a pain to type long values and annoying to type the same prefix. I think hsdis- prefix is > useless because PrintAssemblyOptions is used only for disassembler and there are no values which > don't have hsdis- prefix. This is not performance critical code to have a guard (check prefix). > > And another commented new line: > + // ost->print_cr("PrintAssemblyOptions='%s'", options()); > > > > > Printing help text: There is an option (hsdis-help) to request help text printout. > > > Options parsing doesn't exist here. It's just string comparisons. If one of the predefined strings is found - fine. If not - so what. If you would like to detect unrecognized input, process_options() needs significantly more intelligence. I can do that, but would like to do it in a separate effort. Your opinion? > > Got it. I forgot that PrintAssemblyOptions flag accepts string with *list* of values - you can't use > if-else or switch without complicating the code. > > I noticed that PrintAssemblyOptions is defined as ccstr. Why is it not ccstrlist, which should be used > here?
I don't think next comment is correct for ccstr type: > > http://hg.openjdk.java.net/jdk/jdk/file/ef73702a906e/src/hotspot/share/compiler/disassembler.cpp#l190 > > It would be nice to fix it but you can do it later if you don't want to add more changes. > > > > > Do you need next commented lines: > > > > disassembler.cpp - > > +// ptrdiff_t _offset; > > Deleted. > > > > +// Output suppressed because it messes up disassembly. > > +// output()->print_cr("[Disassembling for mach='%s']", (const char*)arg); > > Uncommented, would like to keep it. Made the if condition permanently false. > > > > disassembler_s390.cpp - > > +// st->fill_to(((st->position()+3*tsize-1)/tsize)*tsize); > > Deleted. > > > > compile.cpp - > > +// st->print("# "); _tf->dump_on(st); st->cr(); > > Uncommented. > > > > > > abstractDisassembler.cpp - > > // st->print("0x%016lx", *((julong*)here)); > > st->print("0x%016lx", *((uintptr_t*)here)); > > // st->print("0x%08x%08x", *((juint*)here), *((juint*)(here+4))); > > Commented lines are gone. > > > > abstractDisassembler.cpp - may be explicit cast (byte*)?: > > > > st->print("%2.2x", *byte); > > st->print("%2.2x", *pos); > > st->print("0x%02x", *here); > > Didn't see the need because the pointers are char* (= address) anyway. And, according to cppreference.com, std::byte is a C++17 feature. We are not there yet. > > okay > > > > > PTR64_FORMAT ?: > > st->print("0x%016lx", *((uintptr_t*)here)); > > I'm kind of hesitant on that. Nice output alignment clearly depends on this to output exactly 18 characters. Changed other occurrences, so I changed this one as well. > > Thanks, > Vladimir > > > > > > > Thanks, > > Vladimir > > > > On 5/8/19 8:31 AM, Schmidt, Lutz wrote: > > > Dear Community, > > > > > > may I please request comments and reviews for this change? Thank you! > > > > > > I have created a new webrev which is based on the current jdk/jdk repo. There was some merge effort. The code which constitutes this patch was not changed. 
Here's the webrev link: > > > https://cr.openjdk.java.net/~lucy/webrevs/8213084.01/ > > > > > > Regards, > > > Lutz > > > > > > On 11.04.19, 23:24, "Schmidt, Lutz" wrote: > > > > > > Dear All, > > > > > > this topic was discussed back in Nov/Dec 2018: > > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-November/031552.html > > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-December/031641.html > > > > > > Purpose of the discussion was to find out if my ideas are at all regarded useful and desirable. > > > The result was mixed, some pro, some con. I let the input from back then influence my work of the last months. In particular, output verbosity can be controlled in a wide range now. In addition to the general -XX:+Print* switches, the amount of output can be adjusted by newly introduced -XX:PrintAssemblyOptions. Here is the list (with default settings): > > > > > > PrintAssemblyOptions help: > > > hsdis-print-raw test plugin by requesting raw output (deprecated) > > > hsdis-print-raw-xml test plugin by requesting raw xml (deprecated) > > > hsdis-print-pc turn off PC printing (on by default) (deprecated) > > > hsdis-print-bytes turn on instruction byte output (deprecated) > > > > > > hsdis-show-pc toggle printing current pc, currently ON > > > hsdis-show-offset toggle printing current offset, currently OFF > > > hsdis-show-bytes toggle printing instruction bytes, currently OFF > > > hsdis-show-data-hex toggle formatting data as hex, currently ON > > > hsdis-show-data-int toggle formatting data as int, currently OFF > > > hsdis-show-data-float toggle formatting data as float, currently OFF > > > hsdis-show-structs toggle compiler data structures, currently OFF > > > hsdis-show-comment toggle instruction comments, currently OFF > > > hsdis-show-block-comment toggle block comments, currently OFF > > > hsdis-align-instr toggle instruction alignment, currently OFF > > > > > > Finally, I have pushed my changes to a state where I can 
dare to request your comments and reviews. I would like to suggest and request that we first focus on the effects (i.e. the generated output) of the changes. Once we got that adjusted and accepted, we can check the actual implementation and add improvements there. Sounds like a plan? Here is what you get: > > > > > > The machine code generated by the JVM can be printed in three different formats: > > > - Hexadecimal. > > > This is basically a hex dump of the memory range containing the code. > > > This format is always available (PRODUCT and not-PRODUCT builds), regardless > > > of the availability of a disassembler library. It applies to all sorts of > > > code, be it blobs, stubs, compiled nmethods, ... > > > This format seems useless at first glance, but it is not. In an upcoming, > > > separate enhancement, the JVM will be made capable of reading files > > > containing such code blocks and disassembling them post mortem. The most > > > prominent example is an hs_err* file. > > > - Disassembled. > > > This is an assembly listing of the instructions as found in the memory range > > > occupied by the blob, stub, compiled nmethod ... As a prerequisite, a suitable > > > disassembler library (hsdis-.so) must be available at runtime. > > > Most often, that will only be the case in test environments. If no disassembler > > > library is available, hexadecimal output is used as fallback. > > > - OptoAssembly. > > > This is a meta code listing created only by the C2 compiler. As it is somewhat > > > closer to the Java code, it may be helpful in linking assembly code to Java code. > > > > > > All three formats can be merged with additional information, most prominently compiler-internal "knowledge" about blocks, related bytecodes, statistics counters, and much more. > > > > > > Following the code itself, compiler-internal data structures, like oop maps, relocations, scopes, dependencies, exception handlers, are printed to aid in debugging. 
> > > > > > The full set of information is available in non-PRODUCT builds. PRODUCT builds do not support OptoAssembly output. Data structures are unavailable as well. > > > > > > So how does the output actually look like? Here are a few small snippets (linuxx86_64) to give you an idea. The complete output of an entire C2-compiled method, in multiple verbosity variants, is available here: > > > http://cr.openjdk.java.net/~lucy/webrevs/8213084/ > > > > > > OptoAssembly output for reference (always on with PrintAssembly): > > > ================================================================= > > > > > > 036 B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > > > 036 movl RBP, [RSI + #12 (8-bit)] # compressed ptr ! Field: java/lang/String.value (constant) > > > 039 movl R11, [RBP + #12 (8-bit)] # range > > > 03d NullCheck RBP > > > > > > 03d B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > > > 03d cmpl RDX, R11 # unsigned > > > 040 jnb,us B6 P=0.000000 C=5375.000000 > > > > > > PrintAssembly with no disassembler library available: > > > ===================================================== > > > > > > [Code] > > > [Entry Point] > > > 0x00007fc74d1d7b20: 448b 5608 49c1 e203 493b c20f 856f 69e7 ff90 9090 9090 9090 9090 9090 9090 9090 > > > [Verified Entry Point] > > > 0x00007fc74d1d7b40: 8984 2400 a0fe ff55 4883 ec20 440f be5e 1445 85db 7521 8b6e 0c44 8b5d 0c41 3bd3 > > > 0x00007fc74d1d7b60: 732c 0fb6 4415 1048 83c4 205d 4d8b 9728 0100 0041 8502 c348 8bee 8914 2444 895c > > > 0x00007fc74d1d7b80: 2404 be4d ffff ffe8 1483 e7ff 0f0b bee5 ffff ff89 5424 04e8 0483 e7ff 0f0b bef6 > > > 0x00007fc74d1d7ba0: ffff ff89 5424 04e8 f482 e7ff 0f0b f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 > > > [Exception Handler] > > > 0x00007fc74d1d7bc0: e95b 0df5 ffe8 0000 0000 4883 2c24 05e9 0c7d e7ff > > > [End] > > > > > > PrintAssembly with minimal verbosity: > > > ===================================== > > > > > > 0x00007f0434b89bd6: mov 0xc(%rsi),%ebp > > > 0x00007f0434b89bd9: mov 0xc(%rbp),%r11d > > 
> 0x00007f0434b89bdd: cmp %r11d,%edx > > > 0x00007f0434b89be0: jae 0x00007f0434b89c0e > > > > > > PrintAssembly (previous plus code offsets from code begin): > > > =========================================================== > > > > > > 0x00007f63c11d7956 (+0x36): mov 0xc(%rsi),%ebp > > > 0x00007f63c11d7959 (+0x39): mov 0xc(%rbp),%r11d > > > 0x00007f63c11d795d (+0x3d): cmp %r11d,%edx > > > 0x00007f63c11d7960 (+0x40): jae 0x00007f63c11d798e > > > > > > PrintAssembly (previous plus block comments): > > > =========================================================== > > > > > > ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > > > 0x00007f48211d76d6 (+0x36): mov 0xc(%rsi),%ebp > > > 0x00007f48211d76d9 (+0x39): mov 0xc(%rbp),%r11d > > > ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > > > 0x00007f48211d76dd (+0x3d): cmp %r11d,%edx > > > 0x00007f48211d76e0 (+0x40): jae 0x00007f48211d770e > > > > > > PrintAssembly (previous plus instruction comments): > > > =========================================================== > > > > > > ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > > > 0x00007fc3e11d7a56 (+0x36): mov 0xc(%rsi),%ebp ;*getfield value {reexecute=0 rethrow=0 return_oop=0} > > > ; - java.lang.String::charAt at 8 (line 702) > > > 0x00007fc3e11d7a59 (+0x39): mov 0xc(%rbp),%r11d ; implicit exception: dispatches to 0x00007fc3e11d7a9e > > > ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > > > 0x00007fc3e11d7a5d (+0x3d): cmp %r11d,%edx > > > 0x00007fc3e11d7a60 (+0x40): jae 0x00007fc3e11d7a8e > > > > > > For completeness, here are the links to > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8213084 > > > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8213084.00/ > > > > > > But please, as mentioned above, first focus on the output. The nitty details of the implementation I would like to discuss after the output format has received some support. > > > > > > Thank you so much for your time! 
> > > Lutz > > > > > > > > > > > > > > > > > > > From vladimir.kozlov at oracle.com Thu May 16 17:48:59 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 16 May 2019 10:48:59 -0700 Subject: RFR:8222302:[TESTBUG]test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU In-Reply-To: References: Message-ID: Looks good to me. Thanks, Vladimir On 5/16/19 4:15 AM, Jie Fu wrote: > Ping. > > Could someone help to review this change? > I would greatly appreciate it if this test case could be used for our mips-port jdk. > > Thanks a lot. > Best regards, > Jie > > On 2019/4/11 ??9:51, Jie Fu wrote: >> Hi all, >> >> JBS:    https://bugs.openjdk.java.net/browse/JDK-8222302 >> Webrev: http://cr.openjdk.java.net/~jiefu/8222302/webrev.00/ >> >> TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU (not AArch64, PPC, S390x, SPARC or X86). >> It is designed to test "UseSHASpecificTestCaseForUnsupportedCPU"[1] and "GenericTestCaseForOtherCPU"[2] on any other >> CPU[3]. >> But when they run on any other CPU (e.g., mips), an exception[4] is always thrown, which causes the failure. >> So there seems to be a logical bug in it. >> >> The change has been tested on mips and x86. >> Could you please review it? >> Thanks a lot.
>> >> Best regards, >> Jie >> >> [1] >> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l56 >> >> [2] >> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l58 >> >> [3] >> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForOtherCPU.java#l34 >> >> [4] >> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/SHAOptionsBase.java#l92 >> >> >> > From ekaterina.pavlova at oracle.com Thu May 16 18:37:54 2019 From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova) Date: Thu, 16 May 2019 11:37:54 -0700 Subject: RFR(T): 8223910: TestFloatJNIArgs and TestTrichotomyExpressions time out with Graal as JIT In-Reply-To: References: Message-ID: <51ed8e9a-206e-8af2-d444-10984f582175@oracle.com> Thanks Tobias. On 5/16/19 5:59 AM, Tobias Hartmann wrote: > Hi Katya, > > looks good to me. > > Thanks, > Tobias > > On 15.05.19 22:22, Ekaterina Pavlova wrote: >> Hi, >> >> Please review the following changes, which disable two tests from running with Graal. >> >> test/hotspot/jtreg/compiler/floatingpoint/TestFloatJNIArgs.java is executed in 3 configurations. >> The 3rd one sets "-XX:-TieredCompilation -Xcomp", which in Graal-as-JIT mode results in >> all Graal methods being compiled by Graal itself running in interpreter mode, which is >> very slow and causes the test to time out. So I split this test in two and disabled the 2nd >> one from running with Graal. >> >> compiler/codegen/TestTrichotomyExpressions.java sets "-XX:-TieredCompilation -Xbatch". >> The current slowness should go away once we have libgraal, so I am putting this test into the >> Graal-specific problem list. >> >>
JBS: https://bugs.openjdk.java.net/browse/JDK-8223910 >> webrev: http://cr.openjdk.java.net/~epavlova//8223910/webrev.00/index.html >> >> thanks, >> -katya From vladimir.kozlov at oracle.com Thu May 16 18:38:20 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 16 May 2019 11:38:20 -0700 Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output In-Reply-To: <06ce086b-43f9-b570-8b97-55c7c14745a0@oracle.com> References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com> <2ffc4c9c-91cb-2d04-e03e-6620d4443034@oracle.com> <06ce086b-43f9-b570-8b97-55c7c14745a0@oracle.com> Message-ID: linux-x64-zero build is broken: workspace/open/src/hotspot/share/compiler/abstractDisassembler.cpp:332:42: error: 'instr_len' is not a member of 'Assembler' int instr_size_in_bytes = Assembler::instr_len(pos); ^~~~~~~~~ Other builds and testing are good. Thanks, Vladimir On 5/16/19 9:47 AM, Vladimir Kozlov wrote: > Nice. > > I submitted our tier1-3 testing. > > Thanks, > Vladimir > > On 5/16/19 2:55 AM, Schmidt, Lutz wrote: >> Hi Vladimir, >> >> sorry for the delayed reaction on your comments. >> >> - now it reads "static unsigned int instr_len()". This change added cpu/s390/assembler_s390.inline.hpp to the list >> of modified files. >> - testing from my side will be via the submit repo (BuildId: 2019-05-15-1543576.lutz.schmidt.source, no failures). In >> addition, I added the patch to our internal builds so that our inhouse testing will cover it (no issues detected last >> night). >> - All the "hsdis-" prefixes in the PrintAssemblyOptions are gone, as are "print-pc" and "print-bytes". The latter >> two were legacy anyway. I kept them for compatibility. But now, without the prefix, there is no compatibility anymore. >> - Options parsing improvement will be done in a separate effort. I have created JDK-8223765 for that. >>
- there is a new webrev, based on the current jdk/jdk repo: https://cr.openjdk.java.net/~lucy/webrevs/8213084.03/ >> >> ~thartmann: >> The disabled code in disassembler_s390.cpp is something I would like to have. So far, I could not find time to make it >> work reliably. I would like to keep it in as a reminder and a template to build on. >> >> Thanks, >> Lutz >> >> On 10.05.19, 23:16, "Vladimir Kozlov" wrote: >> >> Hi Lutz, >> My comments are inlined below. >> On 5/10/19 8:44 AM, Schmidt, Lutz wrote: >> > Thank you, Vladimir! >> > Please find my comments inline and let me know what you think. >> > A new webrev with all the updates is here: https://cr.openjdk.java.net/~lucy/webrevs/8213084.02/ >> Found one more I missed last time: >> assembler_s390.hpp: still signed return (on other platforms it was converted to unsigned): >> static int instr_len(unsigned char *instr); >> > Please note: the webrev is not based on the most current jdk/jdk! I do not like the idea to "hg pull -u" to a >> repo state which is known to be broken. Once jdk/jdk is repaired, I will update the webrev in-place (provided there >> were no serious clashes) and send a short note. >> NP. Please, provide final webrev when you can so that I can run these changes through our testing to >> make sure no issues are present (especially in builds). >> > Regards, >> > Lutz >> > >> > On 09.05.19, 21:30, "Vladimir Kozlov" wrote: >> > >> > Hi Lutz, >> > >> > Thank you for doing this great work. >> > >> > I have just small comments: >> > >> > x86_64.ad - empty change. >> > File contains whitespace changes for formatting. Not visible in webrev. >> Okay. >> > >> > nmethod.cpp - LUCY? >> > >> > + st->print_cr("LUCY: NULL-oop"); >> > + tty->print("LUCY NULL-oop"); >> > Oops. Leftover debugging output. Removed.
Reads "NULL-oop" now. >> Okay. >> > >> > nmethod.cpp - use PTR64_FORMAT instead of '0x%016lx'. >> > Changed. >> > >> > vmreg.cpp - Use INTPTR_FORMAT instead of %ld for value(). >> > Changed. >> > >> > disassembler.* - LUCY_OBSOLETE? >> > >> > +#if defined(LUCY_OBSOLETE) // Used in SAPJVM only >> > This is fancy code to step backwards in CISC instructions. Used to print a +/- range around a given instruction >> address. Works reasonably well on s390, will probably not work at all for x86. I could not finally decide to kick it >> out. But now I did. It's gone. >> Okay. >> > >> > compilerDefinitions.hpp - I don't see where tier_digit() is used. >> > I'm surprised myself. Introduced it and then made it obsolete. It's gone. >> > >> > disassembler.cpp - PrintAssemblyOptions. Why do you need to have 'hsdis-' in all options values? You >> > need to check for invalid value and print help output in such case - it will be very useful if you >> > forgot a value spelling. Also add line for 'help' value. >> > >> > The hsdis- prefix existed before I started my work. I just kept it to not hurt anybody's feelings. Actually, >> the prefix has a minor practical use. It guards the many "if (strstr(..." instructions from being executed if there is >> no use. I'm personally not emotionally attached to the hsdis- prefix. I can remove it if you (and the other reviewers) >> like. Not changed as of now. Awaiting your input. >> It is a pain to type long values and annoying to type the same prefix. I think hsdis- prefix is >> useless because PrintAssemblyOptions is used only for disassembler and there are no values which >> don't have hsdis- prefix. This is not performance critical code to have a guard (check prefix). >> And another commented new line: >> +
// ost->print_cr("PrintAssemblyOptions='%s'", options()); >> ???? > >> ???? > Printing help text: There is an option (hsdis-help) to request help text printout. > >> ???? > Options parsing doesn't exist here. It's just string comparisons. If one of the predefined strings is found - >> fine. If not - so what. If you would like to detect unrecognized input, process_options() needs significantly more >> intelligence. I can do that, but would like to do it in a separate effort. Your opinion? >> ???? Got it. I forgot that PrintAssemblyOptions flag accepts string with *list* of values - you can't use >> ???? if-else or switch without complicating the code. >> ???? I noticed that PrintAssemblyOptions is defined as ccstr. Why it is not ccstrlist which should be use >> ???? here? I don't think next comment is correct for ccstr type: >> ???? http://hg.openjdk.java.net/jdk/jdk/file/ef73702a906e/src/hotspot/share/compiler/disassembler.cpp#l190 >> ???? It would be nice to fix it but you can do it later if you don't want to add more changes. >> ???? > >> ???? >????? Do you need next commented lines: >> ???? > >> ???? >????? disassembler.cpp - >> ???? >????? +//? ptrdiff_t???? _offset; >> ???? > Deleted. >> ???? > >> ???? >????? +//????? Output suppressed because it messes up disassembly. >> ???? >????? +//????? output()->print_cr("[Disassembling for mach='%s']", (const char*)arg); >> ???? > Uncommented, would like to keep it. Made the if condition permanently false. >> ???? > >> ???? >????? disassembler_s390.cpp - >> ???? >????? +//??? st->fill_to(((st->position()+3*tsize-1)/tsize)*tsize); >> ???? > Deleted. >> ???? > >> ???? >????? compile.cpp - >> ???? >????? +//? st->print("#? ");? _tf->dump_on(st);? st->cr(); >> ???? > Uncommented. >> ???? > >> ???? > >> ???? >????? abstractDisassembler.cpp - >> ???? >????? //????????????????? st->print("0x%016lx", *((julong*)here)); >> ???? >???????????????????????? st->print("0x%016lx", *((uintptr_t*)here)); >> ???? >????? 
//????????????????? st->print("0x%08x%08x", *((juint*)here), *((juint*)(here+4))); >> ???? > Commented lines are gone. >> ???? > >> ???? >????? abstractDisassembler.cpp - may be explicit cast (byte*)?: >> ???? > >> ???? >?????????????? st->print("%2.2x", *byte); >> ???? >?????????????? st->print("%2.2x", *pos); >> ???? >?????????????????????? st->print("0x%02x", *here); >> ???? > Didn't see the need because the pointers are char* (= address) anyway. And, according to cppreference.com, >> std::byte is a C++17 feature. We are not there yet. >> ???? okay >> ???? > >> ???? >????? PTR64_FORMAT ?: >> ???? >???????????????????????? st->print("0x%016lx", *((uintptr_t*)here)); >> ???? > I'm kind of hesitant on that. Nice output alignment clearly depends on this to output exactly 18 characters. >> Changed other occurrences, so I changed this one as well. >> ???? Thanks, >> ???? Vladimir >> ???? > >> ???? > >> ???? >????? Thanks, >> ???? >????? Vladimir >> ???? > >> ???? >????? On 5/8/19 8:31 AM, Schmidt, Lutz wrote: >> ???? >????? > Dear Community, >> ???? >????? > >> ???? >????? > may I please request comments and reviews for this change? Thank you! >> ???? >????? > >> ???? >????? > I have created a new webrev which is based on the current jdk/jdk repo. There was some merge effort. The >> code which constitutes this patch was not changed. Here's the webrev link: >> ???? >????? > https://cr.openjdk.java.net/~lucy/webrevs/8213084.01/ >> ???? >????? > >> ???? >????? > Regards, >> ???? >????? > Lutz >> ???? >????? > >> ???? >????? > On 11.04.19, 23:24, "Schmidt, Lutz" wrote: >> ???? >????? > >> ???? >????? >????? Dear All, >> ???? >????? > >> ???? >????? >????? this topic was discussed back in Nov/Dec 2018: >> ???? >????? >????? http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-November/031552.html >> ???? >????? >????? http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-December/031641.html >> ???? >????? > >> ???? >????? >????? 
Purpose of the discussion was to find out if my ideas are at all regarded useful and desirable. >> ???? >????? >????? The result was mixed, some pro, some con. I let the input from back then influence my work of the >> last months. In particular, output verbosity can be controlled in a wide range now. In addition to the general >> -XX:+Print* switches, the amount of output can be adjusted by newly introduced -XX:PrintAssemblyOptions. Here is the >> list (with default settings): >> ???? >????? > >> ???? >????? >????? PrintAssemblyOptions help: >> ???? >????? >??????? hsdis-print-raw?????? test plugin by requesting raw output (deprecated) >> ???? >????? >??????? hsdis-print-raw-xml?? test plugin by requesting raw xml (deprecated) >> ???? >????? >??????? hsdis-print-pc??????? turn off PC printing (on by default) (deprecated) >> ???? >????? >??????? hsdis-print-bytes???? turn on instruction byte output (deprecated) >> ???? >????? > >> ???? >????? >??????? hsdis-show-pc??????????? toggle printing current pc,??????? currently ON >> ???? >????? >??????? hsdis-show-offset??????? toggle printing current offset,??? currently OFF >> ???? >????? >??????? hsdis-show-bytes???????? toggle printing instruction bytes, currently OFF >> ???? >????? >??????? hsdis-show-data-hex????? toggle formatting data as hex,???? currently ON >> ???? >????? >??????? hsdis-show-data-int????? toggle formatting data as int,???? currently OFF >> ???? >????? >??????? hsdis-show-data-float??? toggle formatting data as float,?? currently OFF >> ???? >????? >??????? hsdis-show-structs?????? toggle compiler data structures,?? currently OFF >> ???? >????? >??????? hsdis-show-comment?????? toggle instruction comments,?????? currently OFF >> ???? >????? >??????? hsdis-show-block-comment toggle block comments,???????????? currently OFF >> ???? >????? >??????? hsdis-align-instr??????? toggle instruction alignment,????? currently OFF >> ???? >????? > >> ???? >????? >????? 
Finally, I have pushed my changes to a state where I can dare to request your comments and reviews. >> I would like to suggest and request that we first focus on the effects (i.e. the generated output) of the changes. >> Once we got that adjusted and accepted, we can check the actual implementation and add improvements there. Sounds like >> a plan? Here is what you get: >> ???? >????? > >> ???? >????? >????? The machine code generated by the JVM can be printed in three different formats: >> ???? >????? >?????? - Hexadecimal. >> ???? >????? >???????? This is basically a hex dump of the memory range containing the code. >> ???? >????? >???????? This format is always available (PRODUCT and not-PRODUCT builds), regardless >> ???? >????? >???????? of the availability of a disassembler library. It applies to all sorts of >> ???? >????? >???????? code, be it blobs, stubs, compiled nmethods, ... >> ???? >????? >???????? This format seems useless at first glance, but it is not. In an upcoming, >> ???? >????? >???????? separate enhancement, the JVM will be made capable of reading files >> ???? >????? >???????? containing such code blocks and disassembling them post mortem. The most >> ???? >????? >???????? prominent example is an hs_err* file. >> ???? >????? >?????? - Disassembled. >> ???? >????? >???????? This is an assembly listing of the instructions as found in the memory range >> ???? >????? >???????? occupied by the blob, stub, compiled nmethod ... As a prerequisite, a suitable >> ???? >????? >???????? disassembler library (hsdis-.so) must be available at runtime. >> ???? >????? >???????? Most often, that will only be the case in test environments. If no disassembler >> ???? >????? >???????? library is available, hexadecimal output is used as fallback. >> ???? >????? >?????? - OptoAssembly. >> ???? >????? >???????? This is a meta code listing created only by the C2 compiler. As it is somewhat >> ???? >????? >???????? 
closer to the Java code, it may be helpful in linking assembly code to Java code. >> ???? >????? > >> ???? >????? >????? All three formats can be merged with additional information, most prominently compiler-internal >> "knowledge" about blocks, related bytecodes, statistics counters, and much more. >> ???? >????? > >> ???? >????? >????? Following the code itself, compiler-internal data structures, like oop maps, relocations, scopes, >> dependencies, exception handlers, are printed to aid in debugging. >> ???? >????? > >> ???? >????? >????? The full set of information is available in non-PRODUCT builds. PRODUCT builds do not support >> OptoAssembly output. Data structures are unavailable as well. >> ???? >????? > >> ???? >????? >????? So how does the output actually look like? Here are a few small snippets (linuxx86_64) to give you >> an idea. The complete output of an entire C2-compiled method, in multiple verbosity variants, is available here: >> ???? >????? >??????? http://cr.openjdk.java.net/~lucy/webrevs/8213084/ >> ???? >????? > >> ???? >????? >????? OptoAssembly output for reference (always on with PrintAssembly): >> ???? >????? >????? ================================================================= >> ???? >????? > >> ???? >????? >????? 036???? B2: #?? out( B7 B3 ) <- in( B1 )? Freq: 1 >> ???? >????? >????? 036???? movl??? RBP, [RSI + #12 (8-bit)]??????? # compressed ptr ! Field: java/lang/String.value >> (constant) >> ???? >????? >????? 039???? movl??? R11, [RBP + #12 (8-bit)]??????? # range >> ???? >????? >????? 03d???? NullCheck RBP >> ???? >????? > >> ???? >????? >????? 03d???? B3: #?? out( B6 B4 ) <- in( B2 )? Freq: 0.999999 >> ???? >????? >????? 03d???? cmpl??? RDX, R11??????? # unsigned >> ???? >????? >????? 040???? jnb,us? B6? P=0.000000 C=5375.000000 >> ???? >????? > >> ???? >????? >????? PrintAssembly with no disassembler library available: >> ???? >????? >????? ===================================================== >> ???? >????? > >> ???? >????? 
>????? [Code] >> ???? >????? >????? [Entry Point] >> ???? >????? >??????? 0x00007fc74d1d7b20: 448b 5608 49c1 e203 493b c20f 856f 69e7 ff90 9090 9090 9090 9090 9090 9090 9090 >> ???? >????? >????? [Verified Entry Point] >> ???? >????? >??????? 0x00007fc74d1d7b40: 8984 2400 a0fe ff55 4883 ec20 440f be5e 1445 85db 7521 8b6e 0c44 8b5d 0c41 3bd3 >> ???? >????? >??????? 0x00007fc74d1d7b60: 732c 0fb6 4415 1048 83c4 205d 4d8b 9728 0100 0041 8502 c348 8bee 8914 2444 895c >> ???? >????? >??????? 0x00007fc74d1d7b80: 2404 be4d ffff ffe8 1483 e7ff 0f0b bee5 ffff ff89 5424 04e8 0483 e7ff 0f0b bef6 >> ???? >????? >??????? 0x00007fc74d1d7ba0: ffff ff89 5424 04e8 f482 e7ff 0f0b f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 >> ???? >????? >????? [Exception Handler] >> ???? >????? >??????? 0x00007fc74d1d7bc0: e95b 0df5 ffe8 0000 0000 4883 2c24 05e9 0c7d e7ff >> ???? >????? >????? [End] >> ???? >????? > >> ???? >????? >????? PrintAssembly with minimal verbosity: >> ???? >????? >????? ===================================== >> ???? >????? > >> ???? >????? >??????? 0x00007f0434b89bd6:?? mov??? 0xc(%rsi),%ebp >> ???? >????? >??????? 0x00007f0434b89bd9:?? mov??? 0xc(%rbp),%r11d >> ???? >????? >??????? 0x00007f0434b89bdd:?? cmp??? %r11d,%edx >> ???? >????? >??????? 0x00007f0434b89be0:?? jae??? 0x00007f0434b89c0e >> ???? >????? > >> ???? >????? >????? PrintAssembly (previous plus code offsets from code begin): >> ???? >????? >????? =========================================================== >> ???? >????? > >> ???? >????? >??????? 0x00007f63c11d7956 (+0x36):?? mov??? 0xc(%rsi),%ebp >> ???? >????? >??????? 0x00007f63c11d7959 (+0x39):?? mov??? 0xc(%rbp),%r11d >> ???? >????? >??????? 0x00007f63c11d795d (+0x3d):?? cmp??? %r11d,%edx >> ???? >????? >??????? 0x00007f63c11d7960 (+0x40):?? jae??? 0x00007f63c11d798e >> ???? >????? > >> ???? >????? >????? PrintAssembly (previous plus block comments): >> ???? >????? >????? =========================================================== >> ???? >????? 
> >> ???? >????? >????? ;; B2: #?????? out( B7 B3 ) <- in( B1 )? Freq: 1 >> ???? >????? >??????? 0x00007f48211d76d6 (+0x36):?? mov??? 0xc(%rsi),%ebp >> ???? >????? >??????? 0x00007f48211d76d9 (+0x39):?? mov??? 0xc(%rbp),%r11d >> ???? >????? >?????? ;; B3: #?????? out( B6 B4 ) <- in( B2 )? Freq: 0.999999 >> ???? >????? >??????? 0x00007f48211d76dd (+0x3d):?? cmp??? %r11d,%edx >> ???? >????? >??????? 0x00007f48211d76e0 (+0x40):?? jae??? 0x00007f48211d770e >> ???? >????? > >> ???? >????? >????? PrintAssembly (previous plus instruction comments): >> ???? >????? >????? =========================================================== >> ???? >????? > >> ???? >????? >????? ;; B2: #?????? out( B7 B3 ) <- in( B1 )? Freq: 1 >> ???? >????? >??????? 0x00007fc3e11d7a56 (+0x36):?? mov??? 0xc(%rsi),%ebp?????????? ;*getfield value {reexecute=0 >> rethrow=0 return_oop=0} >> ???? >????? >????????????????????????????????????????????????????????????????????? ; - java.lang.String::charAt at 8 >> (line 702) >> ???? >????? >??????? 0x00007fc3e11d7a59 (+0x39):?? mov??? 0xc(%rbp),%r11d????????? ; implicit exception: dispatches to >> 0x00007fc3e11d7a9e >> ???? >????? >?????? ;; B3: #?????? out( B6 B4 ) <- in( B2 )? Freq: 0.999999 >> ???? >????? >??????? 0x00007fc3e11d7a5d (+0x3d):?? cmp??? %r11d,%edx >> ???? >????? >??????? 0x00007fc3e11d7a60 (+0x40):?? jae??? 0x00007fc3e11d7a8e >> ???? >????? > >> ???? >????? >????? For completeness, here are the links to >> ???? >????? >????? Bug:??? https://bugs.openjdk.java.net/browse/JDK-8213084 >> ???? >????? >????? Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8213084.00/ >> ???? >????? > >> ???? >????? >????? But please, as mentioned above, first focus on the output. The nitty details of the implementation >> I would like to discuss after the output format has received some support. >> ???? >????? > >> ???? >????? >????? Thank you so much for your time! >> ???? >????? >????? Lutz >> ???? >????? > >> ???? >????? > >> ???? >????? > >> ???? >????? 
From ekaterina.pavlova at oracle.com Thu May 16 19:22:56 2019 From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova) Date: Thu, 16 May 2019 12:22:56 -0700 Subject: RFR(T) 8224017: [Graal] gc/z/TestUncommit.java fails with Graal Message-ID: Hi, please review one more trivial change which disables recently added gc/z/TestUncommit.java from running in Graal as JIT mode. Cleaned up tests which fail due to 8196611 in test/hotspot/jtreg/ProblemList-graal.txt as well. JBS: https://bugs.openjdk.java.net/browse/JDK-8224017 webrev: http://cr.openjdk.java.net/~epavlova//8224017/webrev.00/index.html thanks, -katya From lutz.schmidt at sap.com Thu May 16 19:22:58 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Thu, 16 May 2019 19:22:58 +0000 Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output In-Reply-To: References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com> <2ffc4c9c-91cb-2d04-e03e-6620d4443034@oracle.com> <06ce086b-43f9-b570-8b97-55c7c14745a0@oracle.com> Message-ID: <29AEA376-56DC-47A5-8935-9EE700C6345E@sap.com> Hi Vladimir, thanks for the extensive testing. And sorry for me neglecting ZERO. I will add a dummy instr_len() function. I saw another potential issue. There is no static initializer for AbstractDisassembler::_show_bytes. What is the correct macro to test for ZERO? Is it just "#ifdef ZERO"? I will prepare a new webrev with just these two additions as delta. But it won't be before Friday morning, my time. Thanks, Lutz On 16.05.19, 20:38, "Vladimir Kozlov" wrote: linux-x64-zero build is broken: workspace/open/src/hotspot/share/compiler/abstractDisassembler.cpp:332:42: error: 'instr_len' is not a member of 'Assembler' int instr_size_in_bytes = Assembler::instr_len(pos); ^~~~~~~~~ Other builds and testing are good. Thanks, Vladimir On 5/16/19 9:47 AM, Vladimir Kozlov wrote: > Nice. > > I submitted our tier1-3 testing.
> > Thanks, > Vladimir > > On 5/16/19 2:55 AM, Schmidt, Lutz wrote: >> Hi Vladimir, >> >> sorry for the delayed reaction on your comments. >> >> - now it reads "static unsigned int instr_len()". This change added cpu/s390/assembler_s390.inline.hpp to the list >> of modified files. >> - testing from my side will be via the submit repo (BuildId: 2019-05-15-1543576.lutz.schmidt.source, no failures). In >> addition, I added the patch to our internal builds so that our in-house testing will cover it (no issues detected last >> night). >> - All the "hsdis-" prefixes in the PrintAssemblyOptions are gone, as are "print-pc" and "print-bytes". The latter >> two were legacy anyway. I kept them for compatibility. But now, without the prefix, there is no compatibility anymore. >> - Options parsing improvement will be done in a separate effort. I have created JDK-8223765 for that. >> - there is a new webrev, based on the current jdk/jdk repo: https://cr.openjdk.java.net/~lucy/webrevs/8213084.03/ >> >> ~thartmann: >> The disabled code in disassembler_s390.cpp is something I would like to have. So far, I could not find time to make it >> work reliably. I would like to keep it in as a reminder and a template to build on. >> >> Thanks, >> Lutz >> >> On 10.05.19, 23:16, "Vladimir Kozlov" wrote: >> >> Hi Lutz, >> My comments are inlined below. >> On 5/10/19 8:44 AM, Schmidt, Lutz wrote: >> > Thank you, Vladimir! >> > Please find my comments inline and let me know what you think. >> > A new webrev with all the updates is here: https://cr.openjdk.java.net/~lucy/webrevs/8213084.02/ >> Found one more I missed last time: >> assembler_s390.hpp: still signed return (on other platforms it was converted to unsigned): >> static int instr_len(unsigned char *instr); >> > Please note: the webrev is not based on the most current jdk/jdk! I do not like the idea to "hg pull -u" to a >> repo state which is known to be broken.
Once jdk/jdk is repaired, I will update the webrev in-place (provided there >> were no serious clashes) and send a short note. >> NP. Please, provide final webrev when you can so that I can run these changes through our testing to >> make sure no issues are present (especially in builds). >> > Regards, >> > Lutz >> > >> > On 09.05.19, 21:30, "Vladimir Kozlov" wrote: >> > >> > Hi Lutz, >> > >> > Thank you for doing this great work. >> > >> > I have just small comments: >> > >> > x86_64.ad - empty change. >> > File contains whitespace changes for formatting. Not visible in webrev. >> Okay. >> > >> > nmethod.cpp - LUCY? >> > >> > + st->print_cr("LUCY: NULL-oop"); >> > + tty->print("LUCY NULL-oop"); >> > Oops. Leftover debugging output. Removed. Reads "NULL-oop" now. >> Okay. >> > >> > nmethod.cpp - use PTR64_FORMAT instead of '0x%016lx'. >> > Changed. >> > >> > vmreg.cpp - Use INTPTR_FORMAT instead of %ld for value(). >> > Changed. >> > >> > disassembler.* - LUCY_OBSOLETE? >> > >> > +#if defined(LUCY_OBSOLETE) // Used in SAPJVM only >> > This is fancy code to step backwards in CISC instructions. Used to print a +/- range around a given instruction >> address. Works reasonably well on s390, will probably not work at all for x86. I could not finally decide to kick it >> out. But now I did. It's gone. >> Okay. >> > >> > compilerDefinitions.hpp - I don't see where tier_digit() is used. >> > I'm surprised myself. Introduced it and then made it obsolete. It's gone. >> > >> > disassembler.cpp - PrintAssemblyOptions. Why do you need to have 'hsdis-' in all options values? You >> > need to check for invalid value and print help output in such case - it will be very useful if you >> > forgot a value spelling. Also add line for 'help' value. >> > >> > The hsdis- prefix existed before I started my work. I just kept it to not hurt anybody's feelings. Actually, >> the prefix has a minor practical use. It guards the many "if (strstr(..."
instructions from being executed if there is >> no use. I'm personally not emotionally attached to the hsdis- prefix. I can remove it if you (and the other reviewers) >> like. Not changed as of now. Awaiting your input. >> It is a pain to type long values and annoying to type the same prefix. I think hsdis- prefix is >> useless because PrintAssemblyOptions is used only for disassembler and there are no values which >> don't have hsdis- prefix. This is not performance critical code to have a guard (check prefix). >> And another commented new line: >> + // ost->print_cr("PrintAssemblyOptions='%s'", options()); >> > >> > Printing help text: There is an option (hsdis-help) to request help text printout. > >> > Options parsing doesn't exist here. It's just string comparisons. If one of the predefined strings is found - >> fine. If not - so what. If you would like to detect unrecognized input, process_options() needs significantly more >> intelligence. I can do that, but would like to do it in a separate effort. Your opinion? >> Got it. I forgot that PrintAssemblyOptions flag accepts string with *list* of values - you can't use >> if-else or switch without complicating the code. >> I noticed that PrintAssemblyOptions is defined as ccstr. Why is it not ccstrlist, which should be used >> here? I don't think next comment is correct for ccstr type: >> http://hg.openjdk.java.net/jdk/jdk/file/ef73702a906e/src/hotspot/share/compiler/disassembler.cpp#l190 >> It would be nice to fix it but you can do it later if you don't want to add more changes. >> > >> > Do you need next commented lines: >> > >> > disassembler.cpp - >> > +// ptrdiff_t _offset; >> > Deleted. >> > >> > +// Output suppressed because it messes up disassembly. >> > +// output()->print_cr("[Disassembling for mach='%s']", (const char*)arg); >> > Uncommented, would like to keep it. Made the if condition permanently false.
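On the PrintAssemblyOptions handling discussed above (plain string comparisons against a list of values, with unknown values silently ignored), a minimal sketch may help. It is not the code from the webrev: DisasmFlags, parse_options_sketch and example_usage are hypothetical names, and only three of the option strings from the help listing in this thread are covered, with the defaults shown there.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical flag set; the defaults mirror the help listing quoted in this
// thread (show-pc ON, show-offset OFF, show-bytes OFF).
struct DisasmFlags {
  bool show_pc     = true;
  bool show_offset = false;
  bool show_bytes  = false;
};

// Toggle each flag whose keyword occurs anywhere in the option string.
// There is no tokenizing and no error path: a word that matches none of the
// keywords is simply ignored.
inline DisasmFlags parse_options_sketch(const char* options) {
  DisasmFlags f;
  if (options == nullptr) return f;
  if (std::strstr(options, "hsdis-show-pc")     != nullptr) f.show_pc     = !f.show_pc;
  if (std::strstr(options, "hsdis-show-offset") != nullptr) f.show_offset = !f.show_offset;
  if (std::strstr(options, "hsdis-show-bytes")  != nullptr) f.show_bytes  = !f.show_bytes;
  return f;
}

// Usage example: two valid values toggle their flags; a misspelled value
// leaves every flag at its default.
inline void example_usage() {
  DisasmFlags ok = parse_options_sketch("hsdis-show-offset,hsdis-show-bytes");
  assert(ok.show_pc && ok.show_offset && ok.show_bytes);

  DisasmFlags typo = parse_options_sketch("hsdis-show-ofset");
  assert(typo.show_pc && !typo.show_offset && !typo.show_bytes);
}
```

Because each keyword is searched for independently, a misspelled value matches nothing and produces no diagnostic, which is why reporting invalid input would need a real parser that splits the list first.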
>> > >> > disassembler_s390.cpp - >> > +// st->fill_to(((st->position()+3*tsize-1)/tsize)*tsize); >> > Deleted. >> > >> > compile.cpp - >> > +// st->print("# "); _tf->dump_on(st); st->cr(); >> > Uncommented. >> > >> > >> > abstractDisassembler.cpp - >> > // st->print("0x%016lx", *((julong*)here)); >> > st->print("0x%016lx", *((uintptr_t*)here)); >> > // st->print("0x%08x%08x", *((juint*)here), *((juint*)(here+4))); >> > Commented lines are gone. >> > >> > abstractDisassembler.cpp - may be explicit cast (byte*)?: >> > >> > st->print("%2.2x", *byte); >> > st->print("%2.2x", *pos); >> > st->print("0x%02x", *here); >> > Didn't see the need because the pointers are char* (= address) anyway. And, according to cppreference.com, >> std::byte is a C++17 feature. We are not there yet. >> okay >> > >> > PTR64_FORMAT ?: >> > st->print("0x%016lx", *((uintptr_t*)here)); >> > I'm kind of hesitant on that. Nice output alignment clearly depends on this to output exactly 18 characters. >> Changed other occurrences, so I changed this one as well. >> Thanks, >> Vladimir >> > >> > >> > Thanks, >> > Vladimir >> > >> > On 5/8/19 8:31 AM, Schmidt, Lutz wrote: >> > > Dear Community, >> > > >> > > may I please request comments and reviews for this change? Thank you! >> > > >> > > I have created a new webrev which is based on the current jdk/jdk repo. There was some merge effort. The >> code which constitutes this patch was not changed. Here's the webrev link: >> > > https://cr.openjdk.java.net/~lucy/webrevs/8213084.01/ >> > > >> > > Regards, >> > > Lutz >> > > >> > > On 11.04.19, 23:24, "Schmidt, Lutz" wrote: >> > > >> > > Dear All, >> > > >> > > this topic was discussed back in Nov/Dec 2018: >> > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-November/031552.html >> > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-December/031641.html >> > > >> > > Purpose of the discussion was to find out if my ideas are at all regarded useful and desirable. 
>> > > The result was mixed, some pro, some con. I let the input from back then influence my work of the >> last months. In particular, output verbosity can be controlled in a wide range now. In addition to the general >> -XX:+Print* switches, the amount of output can be adjusted by newly introduced -XX:PrintAssemblyOptions. Here is the >> list (with default settings): >> > > >> > > PrintAssemblyOptions help: >> > > hsdis-print-raw test plugin by requesting raw output (deprecated) >> > > hsdis-print-raw-xml test plugin by requesting raw xml (deprecated) >> > > hsdis-print-pc turn off PC printing (on by default) (deprecated) >> > > hsdis-print-bytes turn on instruction byte output (deprecated) >> > > >> > > hsdis-show-pc toggle printing current pc, currently ON >> > > hsdis-show-offset toggle printing current offset, currently OFF >> > > hsdis-show-bytes toggle printing instruction bytes, currently OFF >> > > hsdis-show-data-hex toggle formatting data as hex, currently ON >> > > hsdis-show-data-int toggle formatting data as int, currently OFF >> > > hsdis-show-data-float toggle formatting data as float, currently OFF >> > > hsdis-show-structs toggle compiler data structures, currently OFF >> > > hsdis-show-comment toggle instruction comments, currently OFF >> > > hsdis-show-block-comment toggle block comments, currently OFF >> > > hsdis-align-instr toggle instruction alignment, currently OFF >> > > >> > > Finally, I have pushed my changes to a state where I can dare to request your comments and reviews. >> I would like to suggest and request that we first focus on the effects (i.e. the generated output) of the changes. >> Once we got that adjusted and accepted, we can check the actual implementation and add improvements there. Sounds like >> a plan? Here is what you get: >> > > >> > > The machine code generated by the JVM can be printed in three different formats: >> > > - Hexadecimal. >> > > This is basically a hex dump of the memory range containing the code. 
>> > > This format is always available (PRODUCT and not-PRODUCT builds), regardless >> > > of the availability of a disassembler library. It applies to all sorts of >> > > code, be it blobs, stubs, compiled nmethods, ... >> > > This format seems useless at first glance, but it is not. In an upcoming, >> > > separate enhancement, the JVM will be made capable of reading files >> > > containing such code blocks and disassembling them post mortem. The most >> > > prominent example is an hs_err* file. >> > > - Disassembled. >> > > This is an assembly listing of the instructions as found in the memory range >> > > occupied by the blob, stub, compiled nmethod ... As a prerequisite, a suitable >> > > disassembler library (hsdis-.so) must be available at runtime. >> > > Most often, that will only be the case in test environments. If no disassembler >> > > library is available, hexadecimal output is used as fallback. >> > > - OptoAssembly. >> > > This is a meta code listing created only by the C2 compiler. As it is somewhat >> > > closer to the Java code, it may be helpful in linking assembly code to Java code. >> > > >> > > All three formats can be merged with additional information, most prominently compiler-internal >> "knowledge" about blocks, related bytecodes, statistics counters, and much more. >> > > >> > > Following the code itself, compiler-internal data structures, like oop maps, relocations, scopes, >> dependencies, exception handlers, are printed to aid in debugging. >> > > >> > > The full set of information is available in non-PRODUCT builds. PRODUCT builds do not support >> OptoAssembly output. Data structures are unavailable as well. >> > > >> > > So how does the output actually look like? Here are a few small snippets (linuxx86_64) to give you >> an idea. 
The complete output of an entire C2-compiled method, in multiple verbosity variants, is available here: >> > > http://cr.openjdk.java.net/~lucy/webrevs/8213084/ >> > > >> > > OptoAssembly output for reference (always on with PrintAssembly): >> > > ================================================================= >> > > >> > > 036 B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 >> > > 036 movl RBP, [RSI + #12 (8-bit)] # compressed ptr ! Field: java/lang/String.value >> (constant) >> > > 039 movl R11, [RBP + #12 (8-bit)] # range >> > > 03d NullCheck RBP >> > > >> > > 03d B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 >> > > 03d cmpl RDX, R11 # unsigned >> > > 040 jnb,us B6 P=0.000000 C=5375.000000 >> > > >> > > PrintAssembly with no disassembler library available: >> > > ===================================================== >> > > >> > > [Code] >> > > [Entry Point] >> > > 0x00007fc74d1d7b20: 448b 5608 49c1 e203 493b c20f 856f 69e7 ff90 9090 9090 9090 9090 9090 9090 9090 >> > > [Verified Entry Point] >> > > 0x00007fc74d1d7b40: 8984 2400 a0fe ff55 4883 ec20 440f be5e 1445 85db 7521 8b6e 0c44 8b5d 0c41 3bd3 >> > > 0x00007fc74d1d7b60: 732c 0fb6 4415 1048 83c4 205d 4d8b 9728 0100 0041 8502 c348 8bee 8914 2444 895c >> > > 0x00007fc74d1d7b80: 2404 be4d ffff ffe8 1483 e7ff 0f0b bee5 ffff ff89 5424 04e8 0483 e7ff 0f0b bef6 >> > > 0x00007fc74d1d7ba0: ffff ff89 5424 04e8 f482 e7ff 0f0b f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 >> > > [Exception Handler] >> > > 0x00007fc74d1d7bc0: e95b 0df5 ffe8 0000 0000 4883 2c24 05e9 0c7d e7ff >> > > [End] >> > > >> > > PrintAssembly with minimal verbosity: >> > > ===================================== >> > > >> > > 0x00007f0434b89bd6: mov 0xc(%rsi),%ebp >> > > 0x00007f0434b89bd9: mov 0xc(%rbp),%r11d >> > > 0x00007f0434b89bdd: cmp %r11d,%edx >> > > 0x00007f0434b89be0: jae 0x00007f0434b89c0e >> > > >> > > PrintAssembly (previous plus code offsets from code begin): >> > > =========================================================== >> > > >> > > 
0x00007f63c11d7956 (+0x36): mov 0xc(%rsi),%ebp >> > > 0x00007f63c11d7959 (+0x39): mov 0xc(%rbp),%r11d >> > > 0x00007f63c11d795d (+0x3d): cmp %r11d,%edx >> > > 0x00007f63c11d7960 (+0x40): jae 0x00007f63c11d798e >> > > >> > > PrintAssembly (previous plus block comments): >> > > =========================================================== >> > > >> > > ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 >> > > 0x00007f48211d76d6 (+0x36): mov 0xc(%rsi),%ebp >> > > 0x00007f48211d76d9 (+0x39): mov 0xc(%rbp),%r11d >> > > ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 >> > > 0x00007f48211d76dd (+0x3d): cmp %r11d,%edx >> > > 0x00007f48211d76e0 (+0x40): jae 0x00007f48211d770e >> > > >> > > PrintAssembly (previous plus instruction comments): >> > > =========================================================== >> > > >> > > ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 >> > > 0x00007fc3e11d7a56 (+0x36): mov 0xc(%rsi),%ebp ;*getfield value {reexecute=0 >> rethrow=0 return_oop=0} >> > > ; - java.lang.String::charAt at 8 >> (line 702) >> > > 0x00007fc3e11d7a59 (+0x39): mov 0xc(%rbp),%r11d ; implicit exception: dispatches to >> 0x00007fc3e11d7a9e >> > > ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 >> > > 0x00007fc3e11d7a5d (+0x3d): cmp %r11d,%edx >> > > 0x00007fc3e11d7a60 (+0x40): jae 0x00007fc3e11d7a8e >> > > >> > > For completeness, here are the links to >> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8213084 >> > > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8213084.00/ >> > > >> > > But please, as mentioned above, first focus on the output. The nitty details of the implementation >> I would like to discuss after the output format has received some support. >> > > >> > > Thank you so much for your time! 
>> > > Lutz >> > > >> > > >> > > >> > > >> > >> > >> >> From vladimir.kozlov at oracle.com Thu May 16 19:33:54 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 16 May 2019 12:33:54 -0700 Subject: RFR(T) 8224017: [Graal] gc/z/TestUncommit.java fails with Graal In-Reply-To: References: Message-ID: <3B58D693-28D7-48C0-A21D-5E20AD852140@oracle.com> Looks good and trivial. Thanks Vladimir > On May 16, 2019, at 12:22 PM, Ekaterina Pavlova wrote: > > Hi, > > please review one more trivial change which disables recently added gc/z/TestUncommit.java from running > in Graal as JIT mode. Cleaned up tests which fail due to 8196611 in test/hotspot/jtreg/ProblemList-graal.txt as well. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8224017 > webrev: http://cr.openjdk.java.net/~epavlova//8224017/webrev.00/index.html > > thanks, > -katya From vladimir.kozlov at oracle.com Thu May 16 19:40:11 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 16 May 2019 12:40:11 -0700 Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output In-Reply-To: <29AEA376-56DC-47A5-8935-9EE700C6345E@sap.com> References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com> <2ffc4c9c-91cb-2d04-e03e-6620d4443034@oracle.com> <06ce086b-43f9-b570-8b97-55c7c14745a0@oracle.com> <29AEA376-56DC-47A5-8935-9EE700C6345E@sap.com> Message-ID: <8b68b45b-acf5-daf2-be94-0bacac917aac@oracle.com> I am not sure about exact parameters but I see build testing uses next: configure --with-jvm-variants=zero --with-jvm-features=-shenandoahgc Vladimir On 5/16/19 12:22 PM, Schmidt, Lutz wrote: > Hi Vladimir, > > thanks for the extensive testing. And sorry for me neglecting ZERO. I will add a dummy instr_len() function. I saw another potential issue. There is no static initializer for AbstractDisassembler::_show_bytes. What is the correct macro to test for ZERO? Is it just "#ifdef ZERO"? 
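For the ZERO question above, here is a minimal sketch of what a guarded dummy could look like, assuming the build defines ZERO for Zero variants as the question suggests. It is an illustration only, not the HotSpot source: AssemblerSketch is a hypothetical stand-in class, the dummy's return value of 1 is arbitrary, and the non-ZERO branch uses the s390 rule that the two high-order bits of the first opcode byte select a length of 2, 4, or 6 bytes.

```cpp
#include <cassert>

// Hypothetical stand-in for a platform Assembler class.
class AssemblerSketch {
 public:
#ifdef ZERO
  // Dummy for a build without a native instruction set: it only has to exist
  // so that shared code calling instr_len() compiles. The value 1 is an
  // arbitrary placeholder chosen for this sketch.
  static unsigned int instr_len(unsigned char* instr) {
    (void)instr;
    return 1;
  }
#else
  // s390-style rule: bits 0-1 of the first opcode byte give the length.
  //   00 -> 2 bytes, 01 or 10 -> 4 bytes, 11 -> 6 bytes.
  static unsigned int instr_len(unsigned char* instr) {
    assert(instr != nullptr);
    switch (*instr >> 6) {
      case 0:  return 2;
      case 3:  return 6;
      default: return 4;
    }
  }
#endif
};
```

Returning unsigned int matches the signature change requested earlier in the review; a length can never be negative, so the unsigned type avoids sign-conversion warnings at the call site.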
> > I will prepare a new webrev with just these two additions as delta. But it won't be before Friday morning, my time. > > Thanks, > Lutz > > On 16.05.19, 20:38, "Vladimir Kozlov" wrote: > > linux-x64-zero build is broken: > > workspace/open/src/hotspot/share/compiler/abstractDisassembler.cpp:332:42: error: 'instr_len' is not a member of 'Assembler' > int instr_size_in_bytes = Assembler::instr_len(pos); > ^~~~~~~~~ > Other builds and testing are good. > > Thanks, > Vladimir > > On 5/16/19 9:47 AM, Vladimir Kozlov wrote: > > Nice. > > > > I submitted our tier1-3 testing. > > > > Thanks, > > Vladimir > > > > On 5/16/19 2:55 AM, Schmidt, Lutz wrote: > >> Hi Vladimir, > >> > >> sorry for the delayed reaction to your comments. > >> > >> - now it reads "static unsigned int instr_len()". This change added cpu/s390/assembler_s390.inline.hpp to the list > >> of modified files. > >> - testing from my side will be via the submit repo (BuildId: 2019-05-15-1543576.lutz.schmidt.source, no failures). In > >> addition, I added the patch to our internal builds so that our inhouse testing will cover it (no issues detected last > >> night). > >> - All the "hsdis-" prefixes in the PrintAssemblyOptions are gone, as are "print-pc" and "print-bytes". The latter > >> two were legacy anyway. I kept them for compatibility. But now, without the prefix, there is no compatibility anymore. > >> - Options parsing improvement will be done in a separate effort. I have created JDK-8223765 for that. > >> - there is a new webrev, based on the current jdk/jdk repo: https://cr.openjdk.java.net/~lucy/webrevs/8213084.03/ > >> > >> ~thartmann: > >> The disabled code in disassembler_s390.cpp is something I would like to have. So far, I could not find time to make it > >> work reliably. I would like to keep it in as a reminder and a template to build on. > >> > >> Thanks, > >> Lutz > >> > >> On 10.05.19, 23:16, "Vladimir Kozlov" wrote: > >> > >> Hi Lutz, > >> My comments are inlined below.
> >> On 5/10/19 8:44 AM, Schmidt, Lutz wrote: > >> > Thank you, Vladimir! > >> > Please find my comments inline and let me know what you think. > >> > A new webrev with all the updates is here: https://cr.openjdk.java.net/~lucy/webrevs/8213084.02/ > >> Found one more I missed last time: > >> assembler_s390.hpp: still signed return (on other platforms it was converted to unsigned): > >> static int instr_len(unsigned char *instr); > >> > Please note: the webrev is not based on the most current jdk/jdk! I do not like the idea to "hg pull -u" to a > >> repo state which is known to be broken. Once jdk/jdk is repaired, I will update the webrev in-place (provided there > >> were no serious clashes) and send a short note. > >> NP. Please, provide final webrev when you can so that I can run these changes through our testing to > >> make sure no issues are present (especially in builds). > >> > Regards, > >> > Lutz > >> > > >> > On 09.05.19, 21:30, "Vladimir Kozlov" wrote: > >> > > >> > Hi Lutz, > >> > > >> > Thank you for doing this great work. > >> > > >> > I have just small comments: > >> > > >> > x86_64.ad - empty change. > >> > File contains whitespace changes for formatting. Not visible in webrev. > >> Okay. > >> > > >> > nmethod.cpp - LUCY? > >> > > >> > + st->print_cr("LUCY: NULL-oop"); > >> > + tty->print("LUCY NULL-oop"); > >> > Oops. Leftover debugging output. Removed. Reads "NULL-oop" now. > >> Okay. > >> > > >> > nmethod.cpp - use PTR64_FORMAT instead of '0x%016lx'. > >> > Changed. > >> > > >> > vmreg.cpp - Use INTPTR_FORMAT instead of %ld for value(). > >> > Changed. > >> > > >> > disassembler.* - LUCY_OBSOLETE? > >> > > >> > +#if defined(LUCY_OBSOLETE) // Used in SAPJVM only > >> > This is fancy code to step backwards in CISC instructions. Used to print a +/- range around a given instruction > >> address. Works reasonably well on s390, will probably not work at all for x86. I could not finally decide to kick it > >> out. But now I did. It's gone. > >> Okay.
> >> > > >> > compilerDefinitions.hpp - I don't see where tier_digit() is used. > >> > I'm surprised myself. Introduced it and then made it obsolete. It's gone. > >> > > >> > disassembler.cpp - PrintAssemblyOptions. Why you need to have 'hsdis-' in all options values? You > >> > need to check for invalid value and print help output in such case - it will be very useful if you > >> > forgot a value spelling. Also add line for 'help' value. > >> > > >> > The hsdis- prefix existed before I started my work. I just kept it to not hurt anybody's feelings__. Actually, > >> the prefix has a minor practical use. It guards the many "if (strstr(..." instructions from being executed if there is > >> no use. I'm personally not emotionally attached to the hsdis- prefix. I can remove it if you (and the other reviewers) > >> like. Not changed as of now. Awaiting your input. > >> It is a pain to type long values and annoying to type the same prefix. I think hsdis- prefix is > >> useless because PrintAssemblyOptions is used only for disassembler and there are no values which > >> don't have hsdis- prefix. This is not performance critical code to have a guard (check prefix). > >> And an other commented new line: > >> + // ost->print_cr("PrintAssemblyOptions='%s'", options()); > >> > > >> > Printing help text: There is an option (hsdis-help) to request help text printout. > > >> > Options parsing doesn't exist here. It's just string comparisons. If one of the predefined strings is found - > >> fine. If not - so what. If you would like to detect unrecognized input, process_options() needs significantly more > >> intelligence. I can do that, but would like to do it in a separate effort. Your opinion? > >> Got it. I forgot that PrintAssemblyOptions flag accepts string with *list* of values - you can't use > >> if-else or switch without complicating the code. > >> I noticed that PrintAssemblyOptions is defined as ccstr. Why it is not ccstrlist which should be use > >> here? 
I don't think next comment is correct for ccstr type: > >> http://hg.openjdk.java.net/jdk/jdk/file/ef73702a906e/src/hotspot/share/compiler/disassembler.cpp#l190 > >> It would be nice to fix it but you can do it later if you don't want to add more changes. > >> > > >> > Do you need next commented lines: > >> > > >> > disassembler.cpp - > >> > +// ptrdiff_t _offset; > >> > Deleted. > >> > > >> > +// Output suppressed because it messes up disassembly. > >> > +// output()->print_cr("[Disassembling for mach='%s']", (const char*)arg); > >> > Uncommented, would like to keep it. Made the if condition permanently false. > >> > > >> > disassembler_s390.cpp - > >> > +// st->fill_to(((st->position()+3*tsize-1)/tsize)*tsize); > >> > Deleted. > >> > > >> > compile.cpp - > >> > +// st->print("# "); _tf->dump_on(st); st->cr(); > >> > Uncommented. > >> > > >> > > >> > abstractDisassembler.cpp - > >> > // st->print("0x%016lx", *((julong*)here)); > >> > st->print("0x%016lx", *((uintptr_t*)here)); > >> > // st->print("0x%08x%08x", *((juint*)here), *((juint*)(here+4))); > >> > Commented lines are gone. > >> > > >> > abstractDisassembler.cpp - may be explicit cast (byte*)?: > >> > > >> > st->print("%2.2x", *byte); > >> > st->print("%2.2x", *pos); > >> > st->print("0x%02x", *here); > >> > Didn't see the need because the pointers are char* (= address) anyway. And, according to cppreference.com, > >> std::byte is a C++17 feature. We are not there yet. > >> okay > >> > > >> > PTR64_FORMAT ?: > >> > st->print("0x%016lx", *((uintptr_t*)here)); > >> > I'm kind of hesitant on that. Nice output alignment clearly depends on this to output exactly 18 characters. > >> Changed other occurrences, so I changed this one as well. > >> Thanks, > >> Vladimir > >> > > >> > > >> > Thanks, > >> > Vladimir > >> > > >> > On 5/8/19 8:31 AM, Schmidt, Lutz wrote: > >> > > Dear Community, > >> > > > >> > > may I please request comments and reviews for this change? Thank you! 
> >> > > > >> > > I have created a new webrev which is based on the current jdk/jdk repo. There was some merge effort. The > >> code which constitutes this patch was not changed. Here's the webrev link: > >> > > https://cr.openjdk.java.net/~lucy/webrevs/8213084.01/ > >> > > > >> > > Regards, > >> > > Lutz > >> > > > >> > > On 11.04.19, 23:24, "Schmidt, Lutz" wrote: > >> > > > >> > > Dear All, > >> > > > >> > > this topic was discussed back in Nov/Dec 2018: > >> > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-November/031552.html > >> > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-December/031641.html > >> > > > >> > > Purpose of the discussion was to find out if my ideas are at all regarded useful and desirable. > >> > > The result was mixed, some pro, some con. I let the input from back then influence my work of the > >> last months. In particular, output verbosity can be controlled in a wide range now. In addition to the general > >> -XX:+Print* switches, the amount of output can be adjusted by newly introduced -XX:PrintAssemblyOptions. 
Here is the > >> list (with default settings): > >> > > > >> > > PrintAssemblyOptions help: > >> > > hsdis-print-raw test plugin by requesting raw output (deprecated) > >> > > hsdis-print-raw-xml test plugin by requesting raw xml (deprecated) > >> > > hsdis-print-pc turn off PC printing (on by default) (deprecated) > >> > > hsdis-print-bytes turn on instruction byte output (deprecated) > >> > > > >> > > hsdis-show-pc toggle printing current pc, currently ON > >> > > hsdis-show-offset toggle printing current offset, currently OFF > >> > > hsdis-show-bytes toggle printing instruction bytes, currently OFF > >> > > hsdis-show-data-hex toggle formatting data as hex, currently ON > >> > > hsdis-show-data-int toggle formatting data as int, currently OFF > >> > > hsdis-show-data-float toggle formatting data as float, currently OFF > >> > > hsdis-show-structs toggle compiler data structures, currently OFF > >> > > hsdis-show-comment toggle instruction comments, currently OFF > >> > > hsdis-show-block-comment toggle block comments, currently OFF > >> > > hsdis-align-instr toggle instruction alignment, currently OFF > >> > > > >> > > Finally, I have pushed my changes to a state where I can dare to request your comments and reviews. > >> I would like to suggest and request that we first focus on the effects (i.e. the generated output) of the changes. > >> Once we got that adjusted and accepted, we can check the actual implementation and add improvements there. Sounds like > >> a plan? Here is what you get: > >> > > > >> > > The machine code generated by the JVM can be printed in three different formats: > >> > > - Hexadecimal. > >> > > This is basically a hex dump of the memory range containing the code. > >> > > This format is always available (PRODUCT and not-PRODUCT builds), regardless > >> > > of the availability of a disassembler library. It applies to all sorts of > >> > > code, be it blobs, stubs, compiled nmethods, ... 
> >> > > This format seems useless at first glance, but it is not. In an upcoming, > >> > > separate enhancement, the JVM will be made capable of reading files > >> > > containing such code blocks and disassembling them post mortem. The most > >> > > prominent example is an hs_err* file. > >> > > - Disassembled. > >> > > This is an assembly listing of the instructions as found in the memory range > >> > > occupied by the blob, stub, compiled nmethod ... As a prerequisite, a suitable > >> > > disassembler library (hsdis-.so) must be available at runtime. > >> > > Most often, that will only be the case in test environments. If no disassembler > >> > > library is available, hexadecimal output is used as fallback. > >> > > - OptoAssembly. > >> > > This is a meta code listing created only by the C2 compiler. As it is somewhat > >> > > closer to the Java code, it may be helpful in linking assembly code to Java code. > >> > > > >> > > All three formats can be merged with additional information, most prominently compiler-internal > >> "knowledge" about blocks, related bytecodes, statistics counters, and much more. > >> > > > >> > > Following the code itself, compiler-internal data structures, like oop maps, relocations, scopes, > >> dependencies, exception handlers, are printed to aid in debugging. > >> > > > >> > > The full set of information is available in non-PRODUCT builds. PRODUCT builds do not support > >> OptoAssembly output. Data structures are unavailable as well. > >> > > > >> > > So how does the output actually look like? Here are a few small snippets (linuxx86_64) to give you > >> an idea. 
The complete output of an entire C2-compiled method, in multiple verbosity variants, is available here: > >> > > http://cr.openjdk.java.net/~lucy/webrevs/8213084/ > >> > > > >> > > OptoAssembly output for reference (always on with PrintAssembly): > >> > > ================================================================= > >> > > > >> > > 036 B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > >> > > 036 movl RBP, [RSI + #12 (8-bit)] # compressed ptr ! Field: java/lang/String.value > >> (constant) > >> > > 039 movl R11, [RBP + #12 (8-bit)] # range > >> > > 03d NullCheck RBP > >> > > > >> > > 03d B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > >> > > 03d cmpl RDX, R11 # unsigned > >> > > 040 jnb,us B6 P=0.000000 C=5375.000000 > >> > > > >> > > PrintAssembly with no disassembler library available: > >> > > ===================================================== > >> > > > >> > > [Code] > >> > > [Entry Point] > >> > > 0x00007fc74d1d7b20: 448b 5608 49c1 e203 493b c20f 856f 69e7 ff90 9090 9090 9090 9090 9090 9090 9090 > >> > > [Verified Entry Point] > >> > > 0x00007fc74d1d7b40: 8984 2400 a0fe ff55 4883 ec20 440f be5e 1445 85db 7521 8b6e 0c44 8b5d 0c41 3bd3 > >> > > 0x00007fc74d1d7b60: 732c 0fb6 4415 1048 83c4 205d 4d8b 9728 0100 0041 8502 c348 8bee 8914 2444 895c > >> > > 0x00007fc74d1d7b80: 2404 be4d ffff ffe8 1483 e7ff 0f0b bee5 ffff ff89 5424 04e8 0483 e7ff 0f0b bef6 > >> > > 0x00007fc74d1d7ba0: ffff ff89 5424 04e8 f482 e7ff 0f0b f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 > >> > > [Exception Handler] > >> > > 0x00007fc74d1d7bc0: e95b 0df5 ffe8 0000 0000 4883 2c24 05e9 0c7d e7ff > >> > > [End] > >> > > > >> > > PrintAssembly with minimal verbosity: > >> > > ===================================== > >> > > > >> > > 0x00007f0434b89bd6: mov 0xc(%rsi),%ebp > >> > > 0x00007f0434b89bd9: mov 0xc(%rbp),%r11d > >> > > 0x00007f0434b89bdd: cmp %r11d,%edx > >> > > 0x00007f0434b89be0: jae 0x00007f0434b89c0e > >> > > > >> > > PrintAssembly (previous plus code offsets from code begin): > >> > > 
=========================================================== > >> > > > >> > > 0x00007f63c11d7956 (+0x36): mov 0xc(%rsi),%ebp > >> > > 0x00007f63c11d7959 (+0x39): mov 0xc(%rbp),%r11d > >> > > 0x00007f63c11d795d (+0x3d): cmp %r11d,%edx > >> > > 0x00007f63c11d7960 (+0x40): jae 0x00007f63c11d798e > >> > > > >> > > PrintAssembly (previous plus block comments): > >> > > =========================================================== > >> > > > >> > > ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > >> > > 0x00007f48211d76d6 (+0x36): mov 0xc(%rsi),%ebp > >> > > 0x00007f48211d76d9 (+0x39): mov 0xc(%rbp),%r11d > >> > > ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > >> > > 0x00007f48211d76dd (+0x3d): cmp %r11d,%edx > >> > > 0x00007f48211d76e0 (+0x40): jae 0x00007f48211d770e > >> > > > >> > > PrintAssembly (previous plus instruction comments): > >> > > =========================================================== > >> > > > >> > > ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > >> > > 0x00007fc3e11d7a56 (+0x36): mov 0xc(%rsi),%ebp ;*getfield value {reexecute=0 > >> rethrow=0 return_oop=0} > >> > > ; - java.lang.String::charAt at 8 > >> (line 702) > >> > > 0x00007fc3e11d7a59 (+0x39): mov 0xc(%rbp),%r11d ; implicit exception: dispatches to > >> 0x00007fc3e11d7a9e > >> > > ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > >> > > 0x00007fc3e11d7a5d (+0x3d): cmp %r11d,%edx > >> > > 0x00007fc3e11d7a60 (+0x40): jae 0x00007fc3e11d7a8e > >> > > > >> > > For completeness, here are the links to > >> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8213084 > >> > > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8213084.00/ > >> > > > >> > > But please, as mentioned above, first focus on the output. The nitty details of the implementation > >> I would like to discuss after the output format has received some support. > >> > > > >> > > Thank you so much for your time! 
> >> > > Lutz > >> > > > >> > > > >> > > > >> > > > >> > > >> > > >> > >> > > From ekaterina.pavlova at oracle.com Thu May 16 19:52:42 2019 From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova) Date: Thu, 16 May 2019 12:52:42 -0700 Subject: RFR(T) 8224017: [Graal] gc/z/TestUncommit.java fails with Graal In-Reply-To: <3B58D693-28D7-48C0-A21D-5E20AD852140@oracle.com> References: <3B58D693-28D7-48C0-A21D-5E20AD852140@oracle.com> Message-ID: Thanks Vladimir. On 5/16/19 12:33 PM, Vladimir Kozlov wrote: > Looks good and trivial. > > Thanks > Vladimir > >> On May 16, 2019, at 12:22 PM, Ekaterina Pavlova wrote: >> >> Hi, >> >> please review one more trivial change which disables recently added gc/z/TestUncommit.java from running >> in Graal as JIT mode. Cleaned up tests which fail due to 8196611 in test/hotspot/jtreg/ProblemList-graal.txt as well. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8224017 >> webrev: http://cr.openjdk.java.net/~epavlova//8224017/webrev.00/index.html >> >> thanks, >> -katya > From jesper.wilhelmsson at oracle.com Thu May 16 20:08:01 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Thu, 16 May 2019 22:08:01 +0200 Subject: RFR: JDK-8223346 - Update Graal In-Reply-To: <507D457E-C328-4337-968E-80036650FDE3@oracle.com> References: <507D457E-C328-4337-968E-80036650FDE3@oracle.com> Message-ID: <9CB307E3-F31F-4349-932E-E52DE0773FA5@oracle.com> New webrev with overwritten and the extra Graal change applied: http://cr.openjdk.java.net/~jwilhelm/8223346/webrev.01/ /Jesper > On 14 May 2019, at 23:40, jesper.wilhelmsson at oracle.com wrote: > > Dean, > > Please attach the diff you want me to add to the bug. > JDK-8223441 is in there. <> > > Thanks, > /Jesper > >> On 14 May 2019, at 23:03, dean.long at oracle.com wrote: >> >> I suggest doing the tiers 1-4 testing separate from the tier5+ testing, to reduce noise. >> >> There is a fix for CheckGraalIntrinsics coming to upstream Graal. 
When Jesper merges the overwritten changes, he could include that fix as well, so that compiler/graalunit/HotspotTest.java passes. >> >> The HeapMonitorStatArrayCorrectnessTest failure should have been fixed by JDK-8223441, unless Jesper's test repo is out of date. >> >> dl >> >> On 5/14/19 12:32 PM, Vladimir Kozlov wrote: >>> Changes seems fine but I am not comfortable about tests results. There are a lot of timeouts again but there are many graalunit tests failures. >>> >>> This time you have to apply overwritten diffs after merge (if we decide to push it) - these changes are not in Graal master repo yet. >>> >>> Thanks, >>> Vladimir >>> >>> On 5/13/19 5:19 PM, jesper.wilhelmsson at oracle.com wrote: >>>> Hi, >>>> >>>> Please review the patch to integrate recent Graal changes into OpenJDK. >>>> Graal tip to integrate: 6a18d9ddacd8eecb0ae4877f687e171889939c0d >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8223346 >>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8223346/webrev.00/ >>>> >>>> This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug. >>>> >>>> Thanks, >>>> /Jesper >>>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From vladimir.kozlov at oracle.com Thu May 16 20:09:31 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 16 May 2019 13:09:31 -0700 Subject: RFR: JDK-8223346 - Update Graal In-Reply-To: <9CB307E3-F31F-4349-932E-E52DE0773FA5@oracle.com> References: <507D457E-C328-4337-968E-80036650FDE3@oracle.com> <9CB307E3-F31F-4349-932E-E52DE0773FA5@oracle.com> Message-ID: Looks good.
Thanks Vladimir > On May 16, 2019, at 1:08 PM, jesper.wilhelmsson at oracle.com wrote: > > New webrev with overwritten and the extra Graal change applied: > > http://cr.openjdk.java.net/~jwilhelm/8223346/webrev.01/ > > /Jesper > > >> On 14 May 2019, at 23:40, jesper.wilhelmsson at oracle.com wrote: >> >> Dean, >> >> Please attach the diff you want me to add to the bug. >> JDK-8223441 is in there. >> >> Thanks, >> /Jesper >> >>> On 14 May 2019, at 23:03, dean.long at oracle.com wrote: >>> >>> I suggest doing the tiers 1-4 testing separate from the tier5+ testing, to reduce noise. >>> >>> There is a fix for CheckGraalIntrinsics coming to upstream Graal. When Jesper merges the overwritten changes, he could include that fix as well, so that compiler/graalunit/HotspotTest.java passes. >>> >>> The HeapMonitorStatArrayCorrectnessTest failure should have been fixed by JDK-8223441, unless Jesper's test repo is out of date. >>> >>> dl >>> >>>> On 5/14/19 12:32 PM, Vladimir Kozlov wrote: >>>> Changes seems fine but I am not comfortable about tests results. There are a lot of timeouts again but there are many graalunit tests failures. >>>> >>>> This time you have to apply overwritten diffs after merge (if we decide to push it) - these changes are not in Graal master repo yet. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>>> On 5/13/19 5:19 PM, jesper.wilhelmsson at oracle.com wrote: >>>>> Hi, >>>>> >>>>> Please review the patch to integrate recent Graal changes into OpenJDK. >>>>> Graal tip to integrate: 6a18d9ddacd8eecb0ae4877f687e171889939c0d >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8223346 >>>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8223346/webrev.00/ >>>>> >>>>> This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug. >>>>> >>>>> Thanks, >>>>> /Jesper >>>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lutz.schmidt at sap.com Thu May 16 22:16:06 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Thu, 16 May 2019 22:16:06 +0000 Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management In-Reply-To: <48fb591a-6055-c1c2-b052-6d8bc770da28@oracle.com> References: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com> <9D607D81-5A25-406B-B05B-7D9C9C733D8F@sap.com> <5ec26a0b-0a49-c458-8d90-aa92396610a5@oracle.com> <65F5D580-6B68-4422-AAE6-3406D7FCDE7A@sap.com> <48fb591a-6055-c1c2-b052-6d8bc770da28@oracle.com> Message-ID: <138023EE-F390-4618-A885-AD6473B593DE@sap.com>

Hi Vladimir,

I have implemented that "added safety feature". What it does:
- During reserve() and expand_by(), all newly committed heap memory is initialized with badCodeHeapNewVal.
- Whenever a FreeBlock is added to the free block list, its memory (except for the header part) is initialized with badCodeHeapNewVal.
- During verify(), it is checked that all heap memory in free blocks is initialized as expected.

Please find the diff from webrev.01 to webrev.02 attached as text file. The latest full webrev, based on current jdk/jdk, is here: https://cr.openjdk.java.net/~lucy/webrevs/8223444.02/

There is a dubious assert in nmethod.cpp:338 (PcDescCache::reset_to()). It implicitly relies on a new HeapBlock being initialized with a bit pattern that forms a negative value when accessed as int. It cost me quite some time to find that out. And it's bad code, in my opinion. I would suggest just deleting that assert.

I understand your concerns re my changes to deallocate_tail(). In the beginning, I had the same. But I did some testing/tracing. That additional free block (just one) is consumed by subsequent allocate() calls and eventually vanishes. There is an advantage that comes with my changes: the HeapBlock whose tail is deallocated no longer needs to be the last block before _next_segment. That is a prerequisite if we want to get rid of issues like the one described in JDK-8223770.
We could just allocate a generously large block for stubs and at the end deallocate_tail them. Sorry for the long text. Lutz On 15.05.19, 19:00, "Vladimir Kozlov" wrote: On 5/14/19 10:53 PM, Schmidt, Lutz wrote: > Hi Vladimir, > > thank you for your comments. About filling CodeHeap with bad values after split_block: > - in deallocate_tail, the leading part must remain intact. It contains valid code. > - in search_freelist, one free block is split into two. There I could invalidate the contents of both parts Thank you for explaining. > - If you want added safety, wouldn't it then be better to invalidate the block contents during add_to_freelist()? You could then be sure there is no executable code in a free block. Yes, it is preferable. Another note (after looking more at the changes). You changed where the freed tail goes. Originally it was added to next block _next_segment (making it larger) and you created a separate small block. Doesn't it create more fragmentation? Thanks, Vladimir > > Regards, > Lutz > > On 14.05.19, 23:00, "Vladimir Kozlov" wrote: > > On 5/14/19 1:09 PM, Schmidt, Lutz wrote: > > Hi Vladimir, > > > > I had the same thought re atomicity. memset() is not consistent even on one platform. But I believe it's not a factor here. The original code was a byte-by-byte loop. And we have byte atomicity on all supported platforms, even with memset(). > > > > It's a different thing with sequence of initialization. Do we really depend on byte(i) being initialized before byte(i+1)? If so, we would have a problem even with the explicit byte loop. Not on x86, but on ppc with its weak memory ordering. > > Okay, if it is byte copy I am fine with it. > > > > > About segment map marking: > > There is a short description of how the segment map works in heap.cpp, right before CodeHeap::find_start(). > > In short: each segment map element contains an (unsigned) index which, when subtracted from that element index, addresses the segment map element where the heap block starts.
Thus, when you re-initialize the tail part of a heap block range to describe a newly formed heap block, the leading part remains valid.
> >
> > Segmap   before              after
> > Index    split               split
> > I        0 <- block start    0 <- block start (now shorter)
> > I+1      1                   1    each index 0..9 still points
> > I+2      2                   2    back to the block start
> > I+3      3                   3
> > I+4      4                   4
> > I+5      5                   5
> > I+6      6                   6
> > I+7      7                   7
> > I+8      8                   8
> > I+9      9                   9
> > I+10     10                  0 <- new block start
> > I+11     11                  1
> > I+12     12                  2
> > I+13     13                  3
> > I+14     14                  4
> > I+15     0 <- block start    0 <- block start
> > I+16     1                   1
> > I+17     2                   2
> > I+18     3                   3
> > I+19     4                   4
> >
> > There is a (very short) description about what's happening at the very end of search_freelist(). split_block() is called there as well. Would you like to see a similar comment in deallocate_tail()? > > Thank you, I forgot that the first block mapping is still valid. > > What about storing bad value (in debug mode) only in second part and not both parts? > > > > > Once I have your response, I will create a new webrev reflecting your input. I need to do that anyway because the assert in heap.cpp:200 has to go away. It fires spuriously. The checks can't be done at that place. In addition, I will add one line of comment and rename a local variable. That's it. > > Okay. > > Thanks, > Vladimir > > > > > Thanks, > > Lutz > > > > > > On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" wrote: > > > > Good. > > > > Do we need to be concerned about atomicity of marking? We know that memset() is not atomic (maybe I am wrong here). > > Another thing is I did not get the logic in deallocate_tail(). split_block() marks only second half of split segments as > > used and (after call) store bad values in it. What about first part? Maybe add a comment. > > > > Thanks, > > Vladimir > > > > On 5/14/19 3:47 AM, Schmidt, Lutz wrote: > > > Dear all, > > > > > > May I please request reviews for my change?
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8223444 > > > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/ > > > > > > What this change is all about: > > > ------------------------------ > > > While working on another topic, I came across the code in share/memory/heap.cpp. I applied some small changes which I would call improvements. > > > > > > Furthermore, and in particular with these changes, the platform-specific parameter CodeCacheMinBlockLength should by fine-tuned to minimize the number of residual small free blocks. Heap block allocation does not create free blocks smaller than CodeCacheMinBlockLength. This parameter value should match the minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance for them to vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the beginning of) the free list, slowing down the free block search. > > > > > > The following free block counts have been found after running JVM98 with different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown. 
> > >
> > > JDK-8223444 not applied
> > > =======================
> > >
> > > Segment          |  free blocks with CodeCacheMinBlockLength=
> > > Size             |      1      2      3      4      6      8
> > > -----------------+-------------------------------------------
> > > aarch        128 |      0    153     75     30     38      2
> > > ppc          128 |      0    149     98     59     14      2
> > > ppcle        128 |      0    219    161    110     69     34
> > > s390         256 |      0    142     93     59     30     10
> > > x86          128 |      0    215    157    118     42     11
> > >
> > >
> > > JDK-8223444 applied
> > > ===================
> > >
> > > Segment          |  free blocks with CodeCacheMinBlockLength=  | suggested
> > > Size             |      1      2      3      4      6      8   | setting
> > > -----------------+---------------------------------------------+------------
> > > aarch        128 |    221    115     80     36      7      1   |     6
> > > ppc          128 |    245    152    101     54     14      4   |     6
> > > ppcle        128 |    243    144     89     72     20      5   |     6
> > > s390         256 |    168     60     67      8      6      2   |     4
> > > x86          128 |    223    139     83     50     11      2   |     6
> > >
> > > Thank you for your time and opinion!
> > > Lutz
> > >
-------------- next part -------------- A non-text attachment was scrubbed... Name: 8223444.delta01-02.diff Type: application/octet-stream Size: 9210 bytes Desc: 8223444.delta01-02.diff URL: From jesper.wilhelmsson at oracle.com Thu May 16 22:40:31 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Fri, 17 May 2019 00:40:31 +0200 Subject: RFR: JDK-8223346 - Update Graal In-Reply-To: References: <507D457E-C328-4337-968E-80036650FDE3@oracle.com> <9CB307E3-F31F-4349-932E-E52DE0773FA5@oracle.com> Message-ID: Thanks Vladimir! /Jesper > On 16 May 2019, at 22:09, Vladimir Kozlov wrote: > > Looks good. > > Thanks > Vladimir > > On May 16, 2019, at 1:08 PM, jesper.wilhelmsson at oracle.com wrote: > >> New webrev with overwritten and the extra Graal change applied: >> >> http://cr.openjdk.java.net/~jwilhelm/8223346/webrev.01/ >> >> /Jesper >> >> >>> On 14 May 2019, at 23:40, jesper.wilhelmsson at oracle.com wrote: >>> >>> Dean, >>> >>> Please attach the diff you want me to add to the bug.
>>> JDK-8223441 is in there. <> >>> >>> Thanks, >>> /Jesper >>> >>>> On 14 May 2019, at 23:03, dean.long at oracle.com wrote: >>>> >>>> I suggest doing the tiers 1-4 testing separate from the tier5+ testing, to reduce noise. >>>> >>>> There is a fix for CheckGraalIntrinsics coming to upstream Graal. When Jesper merges the overwritten changes, he could include that fix as well, so that compiler/graalunit/HotspotTest.java passes. >>>> >>>> The HeapMonitorStatArrayCorrectnessTest failure should have been fixed by JDK-8223441, unless Jesper's test repo is out of date. >>>> >>>> dl >>>> <> >>>> On 5/14/19 12:32 PM, Vladimir Kozlov wrote: >>>>> Changes seems fine but I am not comfortable about tests results. There are a lot of timeouts again but there are many graalunit tests failures. >>>>> >>>>> This time you have to apply overwritten diffs after merge (if we decide to push it) - these changes are not in Graal master repo yet. >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 5/13/19 5:19 PM, jesper.wilhelmsson at oracle.com wrote: >>>>>> Hi, >>>>>> >>>>>> Please review the patch to integrate recent Graal changes into OpenJDK. >>>>>> Graal tip to integrate: 6a18d9ddacd8eecb0ae4877f687e171889939c0d >>>>>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8223346 >>>>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8223346/webrev.00/ >>>>>> >>>>>> This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug. >>>>>> >>>>>> Thanks, >>>>>> /Jesper >>>>>> >>>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL:

From fujie at loongson.cn Thu May 16 23:17:37 2019
From: fujie at loongson.cn (Jie Fu)
Date: Fri, 17 May 2019 07:17:37 +0800
Subject: RFR:8222302:[TESTBUG]test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU
In-Reply-To:
References:
Message-ID: <6553dd4f-73e1-d91e-80ef-1f0473fb698c@loongson.cn>

Hi Vladimir,

Thank you for your review.

Do you think the change is simple enough?
And could you please sponsor it?

Thanks a lot.
Best regards,
Jie

On 2019-05-17 01:48, Vladimir Kozlov wrote:
> Looks good to me.
>
> Thanks,
> Vladimir
>
> On 5/16/19 4:15 AM, Jie Fu wrote:
>> Ping.
>>
>> Could someone help to review this change?
>> I would greatly appreciate it if this test case could be used for our
>> mips-port jdk.
>>
>> Thanks a lot.
>> Best regards,
>> Jie
>>
>> On 2019/4/11 9:51, Jie Fu wrote:
>>> Hi all,
>>>
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8222302
>>> Webrev: http://cr.openjdk.java.net/~jiefu/8222302/webrev.00/
>>>
>>> TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU (not
>>> AArch64, PPC, S390x, SPARC or X86).
>>> It is designed to test "UseSHASpecificTestCaseForUnsupportedCPU"[1]
>>> and "GenericTestCaseForOtherCPU"[2] on any other CPU[3].
>>> But when they run on any other CPU (e.g., mips), an exception[4] is
>>> always thrown, which causes the failure.
>>> So there seems to be a logical bug in it.
>>>
>>> The change has been tested on mips and x86.
>>> Could you please review it?
>>> Thanks a lot.
>>> >>> Best regards, >>> Jie >>> >>> [1] >>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l56 >>> >>> [2] >>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l58 >>> >>> [3] >>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForOtherCPU.java#l34 >>> >>> [4] >>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/SHAOptionsBase.java#l92 >>> >>> >>> >> From vladimir.kozlov at oracle.com Thu May 16 23:17:38 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 16 May 2019 16:17:38 -0700 Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management In-Reply-To: <138023EE-F390-4618-A885-AD6473B593DE@sap.com> References: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com> <9D607D81-5A25-406B-B05B-7D9C9C733D8F@sap.com> <5ec26a0b-0a49-c458-8d90-aa92396610a5@oracle.com> <65F5D580-6B68-4422-AAE6-3406D7FCDE7A@sap.com> <48fb591a-6055-c1c2-b052-6d8bc770da28@oracle.com> <138023EE-F390-4618-A885-AD6473B593DE@sap.com> Message-ID: On 5/16/19 3:16 PM, Schmidt, Lutz wrote: > Hi Vladimir, > > I have implemented that "added safety feature". What it does: > - During reserve() and expand_by(), all newly committed heap memory is initialized with badCodeHeapNewVal. > - Whenever a FreeBlock is added to the free block list, it's memory (except for the header part) is initialized with badCodeHeapNewVal. > - During verify(), it is checked that all heap memory in free blocks is initialized as expected. > > Please find the diff from webrev.01 to webrev.02 attached as text file. The latest full webrev, based on current jdk/jdk, is here: > https://cr.openjdk.java.net/~lucy/webrevs/8223444.02/ Good. I will run testing on it. 
> > There is a dubious assert in nmethod.cpp:338 (PcDescCache::reset_to()). It implicitly relies on a new HeapBlock to be initialized with a bit pattern forming a negative value when accessed as int. It cost me quite some time to find that out. And it's bad code, in my opinion. I would suggest to just delete that assert. Yes, I agree. Such implicit dependence is bad and useless. We use reset_to() in AOT where scopes should already have valid offsets: http://hg.openjdk.java.net/jdk/jdk/file/9feb4852536f/src/hotspot/share/aot/aotCompiledMethod.hpp#l152 I am not sure how we pass this assert in AOT case. > > I understand your concerns re my changes to deallocate_tail(). In the beginning, I had the same. But I did some testing/tracing. That additional free block (just one) is consumed by subsequent allocate() calls and eventually vanishes. > There is an advantage that comes with my changes: the HeapBlock whose tail is deallocated does no longer need to be the last block before _next_segment. That is a prerequisite if we want to get rid of issues like the one described in JDK-8223770. We could just allocate a generously large block for stubs and at the end deallocate_tail them. Okay, it sounds good. Thanks, Vladimir > > Sorry for the long text. > Lutz > > ?On 15.05.19, 19:00, "Vladimir Kozlov" wrote: > > On 5/14/19 10:53 PM, Schmidt, Lutz wrote: > > Hi Vladimir, > > > > thank you for your comments. About filling CodeHeap with bad values after split_block: > > - in deallocate_tail, the leading part must remain intact. It contains valid code. > > - in search_freelist, one free block is split into two. There I could invalidate the contents of both parts > > Thank you for explaining. > > > - If you want added safety, wouldn't it then be better to invalidate the block contents during add_to_freelist()? You could then be sure there is no executable code in a free block. > > Yes, it is preferable. > > An other note (after looking more on changes). 
You changed where the freed tail goes. Originally it was added to the next block _next_segment (making it larger), and you created a separate small block. Doesn't that create more fragmentation?
>
> Thanks,
> Vladimir
> >
> > Regards,
> > Lutz
> >
> > On 14.05.19, 23:00, "Vladimir Kozlov" wrote:
> >
> > On 5/14/19 1:09 PM, Schmidt, Lutz wrote:
> > > Hi Vladimir,
> > >
> > > I had the same thought re atomicity. memset() is not consistent even on one platform. But I believe it's not a factor here. The original code was a byte-by-byte loop. And we have byte atomicity on all supported platforms, even with memset().
> > >
> > > It's a different thing with the sequence of initialization. Do we really depend on byte(i) being initialized before byte(i+1)? If so, we would have a problem even with the explicit byte loop. Not on x86, but on ppc with its weak memory ordering.
> >
> > Okay, if it is byte copy I am fine with it.
> >
> > >
> > > About segment map marking:
> > > There is a short description of how the segment map works in heap.cpp, right before CodeHeap::find_start().
> > > In short: each segment map element contains an (unsigned) index which, when subtracted from that element index, addresses the segment map element where the heap block starts. Thus, when you re-initialize the tail part of a heap block range to describe a newly formed heap block, the leading part remains valid.
> > >
> > > Segmap    before            after
> > > Index     split             split
> > > I          0 <- block start  0 <- block start (now shorter)
> > > I+1        1                 1    each index 0..9 still points
> > > I+2        2                 2    back to the block start
> > > I+3        3                 3
> > > I+4        4                 4
> > > I+5        5                 5
> > > I+6        6                 6
> > > I+7        7                 7
> > > I+8        8                 8
> > > I+9        9                 9
> > > I+10      10                 0 <- new block start
> > > I+11      11                 1
> > > I+12      12                 2
> > > I+13      13                 3
> > > I+14      14                 4
> > > I+15       0 <- block start  0 <- block start
> > > I+16       1                 1
> > > I+17       2                 2
> > > I+18       3                 3
> > > I+19       4                 4
> > >
> > > There is a (very short) description about what's happening at the very end of search_freelist().
split_block() is called there as well. Would you like to see a similar comment in deallocate_tail()? > > > > Thank you, I forgot about that first block mapping is still valid. > > > > What about storing bad value (in debug mode) only in second part and not both parts? > > > > > > > > Once I have your response, I will create a new webrev reflecting your input. I need to do that anyway because the assert in heap.cpp:200 has to go away. It fires spuriously. The checks can't be done at that place. In addition, I will add one line of comment and rename a local variable. That's it. > > > > Okay. > > > > Thanks, > > Vladimir > > > > > > > > Thanks, > > > Lutz > > > > > > > > > On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" wrote: > > > > > > Good. > > > > > > Do we need to be concern about atomicity of marking? We know that memset() is not atomic (may be I am wrong here). > > > An other thing is I did not get logic in deallocate_tail(). split_block() marks only second half of split segments as > > > used and (after call) store bad values in it. What about first part? May be add comment. > > > > > > Thanks, > > > Vladimir > > > > > > On 5/14/19 3:47 AM, Schmidt, Lutz wrote: > > > > Dear all, > > > > > > > > May I please request reviews for my change? > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8223444 > > > > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/ > > > > > > > > What this change is all about: > > > > ------------------------------ > > > > While working on another topic, I came across the code in share/memory/heap.cpp. I applied some small changes which I would call improvements. > > > > > > > > Furthermore, and in particular with these changes, the platform-specific parameter CodeCacheMinBlockLength should by fine-tuned to minimize the number of residual small free blocks. Heap block allocation does not create free blocks smaller than CodeCacheMinBlockLength. 
This parameter value should match the minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance for them to vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the beginning of) the free list, slowing down the free block search.
> > > >
> > > > The following free block counts have been found after running JVM98 with different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown.
> > > >
> > > > JDK-8223444 not applied
> > > > =======================
> > > >
> > > > Segment          |  free blocks with CodeCacheMinBlockLength=
> > > > Size             |     1     2     3     4     6     8
> > > > -----------------+-------------------------------------------
> > > > aarch    128     |     0   153    75    30    38     2
> > > > ppc      128     |     0   149    98    59    14     2
> > > > ppcle    128     |     0   219   161   110    69    34
> > > > s390     256     |     0   142    93    59    30    10
> > > > x86      128     |     0   215   157   118    42    11
> > > >
> > > >
> > > > JDK-8223444 applied
> > > > ===================
> > > >
> > > > Segment          |  free blocks with CodeCacheMinBlockLength=  |  suggested
> > > > Size             |     1     2     3     4     6     8         |  setting
> > > > -----------------+---------------------------------------------+------------
> > > > aarch    128     |   221   115    80    36     7     1         |     6
> > > > ppc      128     |   245   152   101    54    14     4         |     6
> > > > ppcle    128     |   243   144    89    72    20     5         |     6
> > > > s390     256     |   168    60    67     8     6     2         |     4
> > > > x86      128     |   223   139    83    50    11     2         |     6
> > > >
> > > > Thank you for your time and opinion!
> > > > Lutz

From vladimir.kozlov at oracle.com Thu May 16 23:33:49 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 16 May 2019 16:33:49 -0700
Subject: RFR:8222302:[TESTBUG]test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU
In-Reply-To: <6553dd4f-73e1-d91e-80ef-1f0473fb698c@loongson.cn>
References: <6553dd4f-73e1-d91e-80ef-1f0473fb698c@loongson.cn>
Message-ID: <1b0a489c-e3c1-204d-68a4-e410c23f9078@oracle.com>

I think you need a second review to make sure it is the right change, especially from someone who works on other platforms.

Thanks,
Vladimir

On 5/16/19 4:17 PM, Jie Fu wrote:
> Hi Vladimir,
>
> Thank you for your review.
>
> Do you think the change is simple enough?
> And could you please sponsor it?
>
> Thanks a lot.
> Best regards,
> Jie
>
> On 2019-05-17 01:48, Vladimir Kozlov wrote:
>> Looks good to me.
>>
>> Thanks,
>> Vladimir
>>
>> On 5/16/19 4:15 AM, Jie Fu wrote:
>>> Ping.
>>>
>>> Could someone help to review this change?
>>> I would greatly appreciate it if this test case could be used for our mips-port jdk.
>>>
>>> Thanks a lot.
>>> Best regards,
>>> Jie
>>>
>>> On 2019/4/11 9:51, Jie Fu wrote:
>>>> Hi all,
>>>>
>>>> JBS:    https://bugs.openjdk.java.net/browse/JDK-8222302
>>>> Webrev: http://cr.openjdk.java.net/~jiefu/8222302/webrev.00/
>>>>
>>>> TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU (not AArch64, PPC, S390x, SPARC or X86).
>>>> It is designed to test "UseSHASpecificTestCaseForUnsupportedCPU"[1] and "GenericTestCaseForOtherCPU"[2] on any other
>>>> CPU[3].
>>>> But when they run on any other CPU (e.g., mips), an exception[4] is always thrown, which causes the failure.
>>>> So there seems to be a logical bug in it.
>>>>
>>>> The change has been tested on mips and x86.
>>>> Could you please review it?
>>>> Thanks a lot.
>>>> >>>> Best regards, >>>> Jie >>>> >>>> [1] >>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l56 >>>> >>>> [2] >>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l58 >>>> >>>> [3] >>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForOtherCPU.java#l34 >>>> >>>> [4] >>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/SHAOptionsBase.java#l92 >>>> >>>> >>>> >>> > > From david.holmes at oracle.com Fri May 17 04:39:31 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 17 May 2019 14:39:31 +1000 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: <9940a897-d49d-0a22-267d-6b78424a45c2@oracle.com> References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <9940a897-d49d-0a22-267d-6b78424a45c2@oracle.com> Message-ID: Hi Robbin, On 15/05/2019 4:26 pm, Robbin Ehn wrote: > Hi, please see this update. > > I think I got all review comments fix. The changes related to my comments all seem fine - thanks. > Long story short, I was concerned about test coverage, so I added a > stress test using the WB, which sometimes crashed in rubbish code. > > There are two bugs in the methods used by WB_DeoptimizeAll. > (Seems I'm the first user) > > CodeCache::mark_all_nmethods_for_deoptimization(); > When iterating the nmethods we could see the methods being create in: > void AdapterHandlerLibrary::create_native_wrapper(const methodHandle& > method) > And deopt the method when it was in use or before. > Native wrappers are suppose to live as long as the class. > I filtered out not_installed and native methods. > > Deoptimization::deoptimize_all_marked(); > The issue is that a not_entrant method can go to zombie at anytime. 
> There are several ways to make a nmethod not go zombie: nmethodLocker, > have it > on stack, avoid safepoint poll in some states, etc.., which is also > depending on > what type of nmethod. > The iterator only_alive_and_not_unloading returns not_entrant nmethods, > but we > don't know there state prior last poll. > in_use -> not_entrant -> #poll# -> not_entrant -> zombie > If the iterator returns the nmethod after we passed the poll it can > still be > not_entrant but go zombie. > The problem happens when a second thread marks a method for deopt and > makes it > not_entrant. Then after a poll we end-up in deoptimize_all_marked(), but > the > method is not yet a zombie, so the iterator returns it, it becomes a > zombie thus > pass the if check and later hit the assert. > So there is a race between the iterator check of state and if-statement > check of > state. Fixed by also filtering out zombies. > > If the stress test with correction of the bugs causes trouble in review, > I can do a follow-up with the stress test separately. I'm not concerned about combining these. One nit in the test: 59 Thread.currentThread().sleep(10); should just be Thread.sleep(10) as its not an instance method. Thanks, David ----- > > Good news, no issues found with deopt with handshakes. > > This is v3: > http://cr.openjdk.java.net/~rehn/8221734/v3/webrev/ > > This full inc from v2 (review + stress test): > http://cr.openjdk.java.net/~rehn/8221734/v3/inc/ > > This inc is the review part from v2: > http://cr.openjdk.java.net/~rehn/8221734/v3/inc_review/ > > This inc is the additional stress test with bug fixes: > http://cr.openjdk.java.net/~rehn/8221734/v3/inc_test/ > > Additional biased locking change: > The original code use same copy of markOop in revoke_and_rebias. > The keep same behavior I now pass in that copy into fast_revoke. > > The stress test passes hundreds of iterations in mach5. > Thousands stress tests locally, the issues above was reproduce-able. 
> Inc changes also pass t1-5.
>
> As usual with this change-set, I'm continuously running more tests.
>
> Thanks, Robbin
>
> On 2019-04-25 14:05, Robbin Ehn wrote:
>> Hi all, please review.
>>
>> Let's deopt with handshakes.
>> Removed VM op Deoptimize, instead we handshake.
>> Locks need to be inflated since we are not in a safepoint.
>>
>> Goes on top of:
>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html
>>
>>
>> Code:
>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html
>> Issue:
>> https://bugs.openjdk.java.net/browse/JDK-8221734
>>
>> Passes t1-7 and multiple t1-5 runs.
>>
>> A few startup benchmarks see a small speedup.
>>
>> Thanks, Robbin

From OGATAK at jp.ibm.com Fri May 17 05:34:20 2019
From: OGATAK at jp.ibm.com (Kazunori Ogata)
Date: Fri, 17 May 2019 14:34:20 +0900
Subject: RFR: 8224090: [PPC64] Fix SLP patterns for filling an array with double float literals
Message-ID:

Hi,

May I get a review for a webrev to fix SLP patterns that use PPC64 VSX instructions?

We found that the SLP patterns added by JDK-8208171 [1] use an incorrect data type, so the patterns have never been used. Further, the pattern for filling -1.0 is confused with the operation for filling -1L.

This webrev fixes the pattern that fills an array with 0 to use 0.0d instead of 0d, and removes the pattern to fill with -1 because the bit pattern of -1.0d is not easy to generate using a single VSX instruction. It should be better to load the literal from the TOC and use the general repl2D_reg_Ex pattern.

I also fixed some comments in "format %{ ... %}" to show the correct matching types.
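
To make the -1.0 versus -1L confusion described above concrete, here is a minimal Java sketch. It is not part of the webrev; the class name DoubleBitPatterns and the helper hex() are made up for illustration, and the only library call used is Double.doubleToRawLongBits. It prints the raw 64-bit patterns involved: 0.0d is all zero bits, -1L is all one bits, and -1.0d is neither of the two, which is consistent with the remark that its bit pattern needs to be loaded as a literal rather than produced by a trivial fill.

```java
public class DoubleBitPatterns {
    // Hypothetical helper (illustration only): format a 64-bit pattern as 16 hex digits.
    static String hex(long bits) {
        return String.format("0x%016X", bits);
    }

    public static void main(String[] args) {
        // IEEE 754 double 0.0: sign, exponent and mantissa bits are all zero.
        System.out.println(" 0.0d : " + hex(Double.doubleToRawLongBits(0.0d)));
        // IEEE 754 double -1.0: sign bit set, biased exponent 0x3FF, mantissa zero.
        System.out.println("-1.0d : " + hex(Double.doubleToRawLongBits(-1.0d)));
        // Two's complement long -1: all 64 bits set.
        System.out.println("-1L   : " + hex(-1L));
    }
}
```

Running it shows 0x0000000000000000 for 0.0d, 0xBFF0000000000000 for -1.0d and 0xFFFFFFFFFFFFFFFF for -1L, so a rule that matches the integer constant -1 and a rule that matches the double constant -1.0 describe different vector contents.
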
Bug: https://bugs.openjdk.java.net/browse/JDK-8224090
Webrev: http://cr.openjdk.java.net/~horii/8224090/webrev.00/

Ref:
[1] https://bugs.openjdk.java.net/browse/JDK-8208171

Regards,
Ogata

From lutz.schmidt at sap.com Fri May 17 13:19:48 2019
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Fri, 17 May 2019 13:19:48 +0000
Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output
In-Reply-To: <8b68b45b-acf5-daf2-be94-0bacac917aac@oracle.com>
References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com> <2ffc4c9c-91cb-2d04-e03e-6620d4443034@oracle.com> <06ce086b-43f9-b570-8b97-55c7c14745a0@oracle.com> <29AEA376-56DC-47A5-8935-9EE700C6345E@sap.com> <8b68b45b-acf5-daf2-be94-0bacac917aac@oracle.com>
Message-ID: <7938CE23-798C-4484-9A75-BFC40C297FDD@sap.com>

Hi Vladimir,

here is what I changed to overcome the ZERO build issues:

----------- 8< ----------------
diff -r c0568c492760 src/hotspot/cpu/zero/assembler_zero.hpp
--- a/src/hotspot/cpu/zero/assembler_zero.hpp Fri May 17 14:11:44 2019 +0200
+++ b/src/hotspot/cpu/zero/assembler_zero.hpp Fri May 17 14:14:06 2019 +0200
@@ -37,6 +37,12 @@
 
  public:
   void pd_patch_instruction(address branch, address target, const char* file, int line);
+
+  //---< calculate length of instruction >---
+  static unsigned int instr_len(unsigned char *instr) { return 1; }
+
+  //---< longest instructions >---
+  static unsigned int instr_maxlen() { return 1; }
 };
 
 class MacroAssembler : public Assembler {
diff -r c0568c492760 src/hotspot/share/compiler/abstractDisassembler.cpp
--- a/src/hotspot/share/compiler/abstractDisassembler.cpp Fri May 17 14:11:44 2019 +0200
+++ b/src/hotspot/share/compiler/abstractDisassembler.cpp Fri May 17 14:14:06 2019 +0200
@@ -61,6 +61,9 @@
 bool AbstractDisassembler::_show_bytes = false; // set "true" to see what's in memory bit by bit
                                                 // might prove cumbersome because instr_len is hard to find on x86
 #endif
+#if defined(ZERO)
+bool AbstractDisassembler::_show_bytes = false; // set "true" to see what's in memory bit by bit
+#endif
 
 // Return #bytes printed. Callers may use that for output alignment.
 // Print instruction address, and offset from blob begin.
----------- >8 ----------------

This delta is contained as the only change in the new webrev#04 which is based on the current (13:10 GMT) jdk/jdk repo:
https://cr.openjdk.java.net/~lucy/webrevs/8213084.04/

Regards,
Lutz

On 16.05.19, 21:40, "Vladimir Kozlov" wrote:

    I am not sure about the exact parameters, but I see that build testing uses the following:

    configure --with-jvm-variants=zero --with-jvm-features=-shenandoahgc

    Vladimir

    On 5/16/19 12:22 PM, Schmidt, Lutz wrote:
    > Hi Vladimir,
    >
    > thanks for the extensive testing. And sorry for me neglecting ZERO. I will add a dummy instr_len() function. I saw another potential issue. There is no static initializer for AbstractDisassembler::_show_bytes. What is the correct macro to test for ZERO? Is it just "#ifdef ZERO"?
    >
    > I will prepare a new webrev with just these two additions as delta. But it won't be before Friday morning, my time.
    >
    > Thanks,
    > Lutz
    >
    > On 16.05.19, 20:38, "Vladimir Kozlov" wrote:
    >
    > linux-x64-zero build is broken:
    >
    > workspace/open/src/hotspot/share/compiler/abstractDisassembler.cpp:332:42: error: 'instr_len' is not a member of 'Assembler'
    >     int instr_size_in_bytes = Assembler::instr_len(pos);
    >                                          ^~~~~~~~~
    > Other builds and testing are good.
    >
    > Thanks,
    > Vladimir
    >
    > On 5/16/19 9:47 AM, Vladimir Kozlov wrote:
    > > Nice.
    > >
    > > I submitted our tier1-3 testing.
    > >
    > > Thanks,
    > > Vladimir
    > >
    > > On 5/16/19 2:55 AM, Schmidt, Lutz wrote:
    > >> Hi Vladimir,
    > >>
    > >> sorry for the delayed reaction on your comments.
    > >>
    > >> - now it reads "static unsigned int instr_len()". This change added cpu/s390/assembler_s390.inline.hpp to the list
    > >> of modified files.
    > >> - testing from my side will be via the submit repo (BuildId: 2019-05-15-1543576.lutz.schmidt.source, no failures).
In > >> addition, I added the patch to our internal builds so that our inhouse testing will cover it (no issues detected last > >> night). > >> - All the "hsdis-" prefixes in the PrintAssemblyOptions are gone, as are "print-pc" and "print-bytes". The latter > >> two were legacy anyway. I kept them for compatibility. But now, without the prefix, there is no compatibility anymore. > >> - Options parsing improvement will be done in a separate effort. I have created JDK-8223765 for that. > >> - there is a new webrev, based on the current jdk/jdk repo: https://cr.openjdk.java.net/~lucy/webrevs/8213084.03/ > >> > >> ~thartmann: > >> The disabled code in disassembler_s390.cpp is something I would like to have. So far, I could not find time to make it > >> work reliably. I would like to keep it in as a reminder and a template to build on. > >> > >> Thanks, > >> Lutz > >> > >> On 10.05.19, 23:16, "Vladimir Kozlov" wrote: > >> > >> Hi Lutz, > >> My comments are inlined below. > >> On 5/10/19 8:44 AM, Schmidt, Lutz wrote: > >> > Thank you, Vladimir! > >> > Please find my comments inline and let me know what you think. > >> > A new webrev with all the updates is here: https://cr.openjdk.java.net/~lucy/webrevs/8213084.02/ > >> Found one more I missed last time: > >> assembler_s390.hpp: still singed return (on other platforms it was converted to unsigned): > >> static int instr_len(unsigned char *instr); > >> > Please note: the webrev is not based on the most current jdk/jdk! I do not like the idea to "hg pull -u" to a > >> repo state which is known to be broken. Once jdk/jdk is repaired, I will update the webrev in-place (provided there > >> were no serious clashes) and sent a short note. > >> NP. Please, provide final webrev when you can so that I can run these changes through our testing to > >> make sure no issues are present (especially in builds). 
> >> > Regards, > >> > Lutz > >> > > >> > On 09.05.19, 21:30, "Vladimir Kozlov" wrote: > >> > > >> > Hi Lutz, > >> > > >> > Thank you for doing this great work. > >> > > >> > I have just small comments: > >> > > >> > x86_64.ad - empty change. > >> > File contains whitespace changes for formatting. Not visible in webrev. > >> Okay. > >> > > >> > nmethod.cpp - LUCY? > >> > > >> > + st->print_cr("LUCY: NULL-oop"); > >> > + tty->print("LUCY NULL-oop"); > >> > Oops. Leftover debugging output. Removed. Reads "NULL-oop" now. > >> Okay. > >> > > >> > nmethod.cpp - use PTR64_FORMAT instead of '0x%016lx'. > >> > Changed. > >> > > >> > vmreg.cpp - Use INTPTR_FORMAT instead of %ld for value(). > >> > Changed. > >> > > >> > disassembler.* - LUCY_OBSOLETE? > >> > > >> > +#if defined(LUCY_OBSOLETE) // Used in SAPJVM only > >> > This is fancy code to step backwards in CISC instructions. Used to print a +/- range around a given instruction > >> address. Works reasonably well on s390, will probably not work at all for x86. I could not finally decide to kick it > >> out. But now I did. It's gone. > >> Okay. > >> > > >> > compilerDefinitions.hpp - I don't see where tier_digit() is used. > >> > I'm surprised myself. Introduced it and then made it obsolete. It's gone. > >> > > >> > disassembler.cpp - PrintAssemblyOptions. Why you need to have 'hsdis-' in all options values? You > >> > need to check for invalid value and print help output in such case - it will be very useful if you > >> > forgot a value spelling. Also add line for 'help' value. > >> > > >> > The hsdis- prefix existed before I started my work. I just kept it to not hurt anybody's feelings__. Actually, > >> the prefix has a minor practical use. It guards the many "if (strstr(..." instructions from being executed if there is > >> no use. I'm personally not emotionally attached to the hsdis- prefix. I can remove it if you (and the other reviewers) > >> like. Not changed as of now. Awaiting your input. 
> >> It is a pain to type long values and annoying to type the same prefix. I think hsdis- prefix is > >> useless because PrintAssemblyOptions is used only for disassembler and there are no values which > >> don't have hsdis- prefix. This is not performance critical code to have a guard (check prefix). > >> And an other commented new line: > >> + // ost->print_cr("PrintAssemblyOptions='%s'", options()); > >> > > >> > Printing help text: There is an option (hsdis-help) to request help text printout. > > >> > Options parsing doesn't exist here. It's just string comparisons. If one of the predefined strings is found - > >> fine. If not - so what. If you would like to detect unrecognized input, process_options() needs significantly more > >> intelligence. I can do that, but would like to do it in a separate effort. Your opinion? > >> Got it. I forgot that PrintAssemblyOptions flag accepts string with *list* of values - you can't use > >> if-else or switch without complicating the code. > >> I noticed that PrintAssemblyOptions is defined as ccstr. Why it is not ccstrlist which should be use > >> here? I don't think next comment is correct for ccstr type: > >> http://hg.openjdk.java.net/jdk/jdk/file/ef73702a906e/src/hotspot/share/compiler/disassembler.cpp#l190 > >> It would be nice to fix it but you can do it later if you don't want to add more changes. > >> > > >> > Do you need next commented lines: > >> > > >> > disassembler.cpp - > >> > +// ptrdiff_t _offset; > >> > Deleted. > >> > > >> > +// Output suppressed because it messes up disassembly. > >> > +// output()->print_cr("[Disassembling for mach='%s']", (const char*)arg); > >> > Uncommented, would like to keep it. Made the if condition permanently false. > >> > > >> > disassembler_s390.cpp - > >> > +// st->fill_to(((st->position()+3*tsize-1)/tsize)*tsize); > >> > Deleted. > >> > > >> > compile.cpp - > >> > +// st->print("# "); _tf->dump_on(st); st->cr(); > >> > Uncommented. 
> >> > > >> > > >> > abstractDisassembler.cpp - > >> > // st->print("0x%016lx", *((julong*)here)); > >> > st->print("0x%016lx", *((uintptr_t*)here)); > >> > // st->print("0x%08x%08x", *((juint*)here), *((juint*)(here+4))); > >> > Commented lines are gone. > >> > > >> > abstractDisassembler.cpp - may be explicit cast (byte*)?: > >> > > >> > st->print("%2.2x", *byte); > >> > st->print("%2.2x", *pos); > >> > st->print("0x%02x", *here); > >> > Didn't see the need because the pointers are char* (= address) anyway. And, according to cppreference.com, > >> std::byte is a C++17 feature. We are not there yet. > >> okay > >> > > >> > PTR64_FORMAT ?: > >> > st->print("0x%016lx", *((uintptr_t*)here)); > >> > I'm kind of hesitant on that. Nice output alignment clearly depends on this to output exactly 18 characters. > >> Changed other occurrences, so I changed this one as well. > >> Thanks, > >> Vladimir > >> > > >> > > >> > Thanks, > >> > Vladimir > >> > > >> > On 5/8/19 8:31 AM, Schmidt, Lutz wrote: > >> > > Dear Community, > >> > > > >> > > may I please request comments and reviews for this change? Thank you! > >> > > > >> > > I have created a new webrev which is based on the current jdk/jdk repo. There was some merge effort. The > >> code which constitutes this patch was not changed. Here's the webrev link: > >> > > https://cr.openjdk.java.net/~lucy/webrevs/8213084.01/ > >> > > > >> > > Regards, > >> > > Lutz > >> > > > >> > > On 11.04.19, 23:24, "Schmidt, Lutz" wrote: > >> > > > >> > > Dear All, > >> > > > >> > > this topic was discussed back in Nov/Dec 2018: > >> > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-November/031552.html > >> > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-December/031641.html > >> > > > >> > > Purpose of the discussion was to find out if my ideas are at all regarded useful and desirable. > >> > > The result was mixed, some pro, some con. 
I let the input from back then influence my work of the > >> last months. In particular, output verbosity can be controlled in a wide range now. In addition to the general > >> -XX:+Print* switches, the amount of output can be adjusted by newly introduced -XX:PrintAssemblyOptions. Here is the > >> list (with default settings): > >> > > > >> > > PrintAssemblyOptions help: > >> > > hsdis-print-raw test plugin by requesting raw output (deprecated) > >> > > hsdis-print-raw-xml test plugin by requesting raw xml (deprecated) > >> > > hsdis-print-pc turn off PC printing (on by default) (deprecated) > >> > > hsdis-print-bytes turn on instruction byte output (deprecated) > >> > > > >> > > hsdis-show-pc toggle printing current pc, currently ON > >> > > hsdis-show-offset toggle printing current offset, currently OFF > >> > > hsdis-show-bytes toggle printing instruction bytes, currently OFF > >> > > hsdis-show-data-hex toggle formatting data as hex, currently ON > >> > > hsdis-show-data-int toggle formatting data as int, currently OFF > >> > > hsdis-show-data-float toggle formatting data as float, currently OFF > >> > > hsdis-show-structs toggle compiler data structures, currently OFF > >> > > hsdis-show-comment toggle instruction comments, currently OFF > >> > > hsdis-show-block-comment toggle block comments, currently OFF > >> > > hsdis-align-instr toggle instruction alignment, currently OFF > >> > > > >> > > Finally, I have pushed my changes to a state where I can dare to request your comments and reviews. > >> I would like to suggest and request that we first focus on the effects (i.e. the generated output) of the changes. > >> Once we got that adjusted and accepted, we can check the actual implementation and add improvements there. Sounds like > >> a plan? Here is what you get: > >> > > > >> > > The machine code generated by the JVM can be printed in three different formats: > >> > > - Hexadecimal. 
> >> > > This is basically a hex dump of the memory range containing the code. > >> > > This format is always available (PRODUCT and not-PRODUCT builds), regardless > >> > > of the availability of a disassembler library. It applies to all sorts of > >> > > code, be it blobs, stubs, compiled nmethods, ... > >> > > This format seems useless at first glance, but it is not. In an upcoming, > >> > > separate enhancement, the JVM will be made capable of reading files > >> > > containing such code blocks and disassembling them post mortem. The most > >> > > prominent example is an hs_err* file. > >> > > - Disassembled. > >> > > This is an assembly listing of the instructions as found in the memory range > >> > > occupied by the blob, stub, compiled nmethod ... As a prerequisite, a suitable > >> > > disassembler library (hsdis-.so) must be available at runtime. > >> > > Most often, that will only be the case in test environments. If no disassembler > >> > > library is available, hexadecimal output is used as fallback. > >> > > - OptoAssembly. > >> > > This is a meta code listing created only by the C2 compiler. As it is somewhat > >> > > closer to the Java code, it may be helpful in linking assembly code to Java code. > >> > > > >> > > All three formats can be merged with additional information, most prominently compiler-internal > >> "knowledge" about blocks, related bytecodes, statistics counters, and much more. > >> > > > >> > > Following the code itself, compiler-internal data structures, like oop maps, relocations, scopes, > >> dependencies, exception handlers, are printed to aid in debugging. > >> > > > >> > > The full set of information is available in non-PRODUCT builds. PRODUCT builds do not support > >> OptoAssembly output. Data structures are unavailable as well. > >> > > > >> > > So how does the output actually look like? Here are a few small snippets (linuxx86_64) to give you > >> an idea. 
The complete output of an entire C2-compiled method, in multiple verbosity variants, is available here: > >> > > http://cr.openjdk.java.net/~lucy/webrevs/8213084/ > >> > > > >> > > OptoAssembly output for reference (always on with PrintAssembly): > >> > > ================================================================= > >> > > > >> > > 036 B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > >> > > 036 movl RBP, [RSI + #12 (8-bit)] # compressed ptr ! Field: java/lang/String.value > >> (constant) > >> > > 039 movl R11, [RBP + #12 (8-bit)] # range > >> > > 03d NullCheck RBP > >> > > > >> > > 03d B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > >> > > 03d cmpl RDX, R11 # unsigned > >> > > 040 jnb,us B6 P=0.000000 C=5375.000000 > >> > > > >> > > PrintAssembly with no disassembler library available: > >> > > ===================================================== > >> > > > >> > > [Code] > >> > > [Entry Point] > >> > > 0x00007fc74d1d7b20: 448b 5608 49c1 e203 493b c20f 856f 69e7 ff90 9090 9090 9090 9090 9090 9090 9090 > >> > > [Verified Entry Point] > >> > > 0x00007fc74d1d7b40: 8984 2400 a0fe ff55 4883 ec20 440f be5e 1445 85db 7521 8b6e 0c44 8b5d 0c41 3bd3 > >> > > 0x00007fc74d1d7b60: 732c 0fb6 4415 1048 83c4 205d 4d8b 9728 0100 0041 8502 c348 8bee 8914 2444 895c > >> > > 0x00007fc74d1d7b80: 2404 be4d ffff ffe8 1483 e7ff 0f0b bee5 ffff ff89 5424 04e8 0483 e7ff 0f0b bef6 > >> > > 0x00007fc74d1d7ba0: ffff ff89 5424 04e8 f482 e7ff 0f0b f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 > >> > > [Exception Handler] > >> > > 0x00007fc74d1d7bc0: e95b 0df5 ffe8 0000 0000 4883 2c24 05e9 0c7d e7ff > >> > > [End] > >> > > > >> > > PrintAssembly with minimal verbosity: > >> > > ===================================== > >> > > > >> > > 0x00007f0434b89bd6: mov 0xc(%rsi),%ebp > >> > > 0x00007f0434b89bd9: mov 0xc(%rbp),%r11d > >> > > 0x00007f0434b89bdd: cmp %r11d,%edx > >> > > 0x00007f0434b89be0: jae 0x00007f0434b89c0e > >> > > > >> > > PrintAssembly (previous plus code offsets from code begin): > >> > > 
=========================================================== > >> > > > >> > > 0x00007f63c11d7956 (+0x36): mov 0xc(%rsi),%ebp > >> > > 0x00007f63c11d7959 (+0x39): mov 0xc(%rbp),%r11d > >> > > 0x00007f63c11d795d (+0x3d): cmp %r11d,%edx > >> > > 0x00007f63c11d7960 (+0x40): jae 0x00007f63c11d798e > >> > > > >> > > PrintAssembly (previous plus block comments): > >> > > =========================================================== > >> > > > >> > > ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > >> > > 0x00007f48211d76d6 (+0x36): mov 0xc(%rsi),%ebp > >> > > 0x00007f48211d76d9 (+0x39): mov 0xc(%rbp),%r11d > >> > > ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > >> > > 0x00007f48211d76dd (+0x3d): cmp %r11d,%edx > >> > > 0x00007f48211d76e0 (+0x40): jae 0x00007f48211d770e > >> > > > >> > > PrintAssembly (previous plus instruction comments): > >> > > =========================================================== > >> > > > >> > > ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > >> > > 0x00007fc3e11d7a56 (+0x36): mov 0xc(%rsi),%ebp ;*getfield value {reexecute=0 > >> rethrow=0 return_oop=0} > >> > > ; - java.lang.String::charAt at 8 > >> (line 702) > >> > > 0x00007fc3e11d7a59 (+0x39): mov 0xc(%rbp),%r11d ; implicit exception: dispatches to > >> 0x00007fc3e11d7a9e > >> > > ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > >> > > 0x00007fc3e11d7a5d (+0x3d): cmp %r11d,%edx > >> > > 0x00007fc3e11d7a60 (+0x40): jae 0x00007fc3e11d7a8e > >> > > > >> > > For completeness, here are the links to > >> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8213084 > >> > > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8213084.00/ > >> > > > >> > > But please, as mentioned above, first focus on the output. The nitty details of the implementation > >> I would like to discuss after the output format has received some support. > >> > > > >> > > Thank you so much for your time! 
> >> > > Lutz
> >> > >
> >> > >
> >> > >
> >> > >
> >> >
> >> >
> >>
> >>
>
>

From vladimir.kozlov at oracle.com Fri May 17 14:48:45 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 17 May 2019 07:48:45 -0700
Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output
In-Reply-To: <7938CE23-798C-4484-9A75-BFC40C297FDD@sap.com>
References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com>
 <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com>
 <2ffc4c9c-91cb-2d04-e03e-6620d4443034@oracle.com>
 <06ce086b-43f9-b570-8b97-55c7c14745a0@oracle.com>
 <29AEA376-56DC-47A5-8935-9EE700C6345E@sap.com>
 <8b68b45b-acf5-daf2-be94-0bacac917aac@oracle.com>
 <7938CE23-798C-4484-9A75-BFC40C297FDD@sap.com>
Message-ID: <37865A98-264E-4775-9366-194A804AE5AA@oracle.com>

Good.

Thanks
Vladimir

> On May 17, 2019, at 6:19 AM, Schmidt, Lutz wrote:
>
> Hi Vladimir,
> here is what I changed to overcome the ZERO build issues:
>
> ----------- 8< ----------------
> diff -r c0568c492760 src/hotspot/cpu/zero/assembler_zero.hpp
> --- a/src/hotspot/cpu/zero/assembler_zero.hpp Fri May 17 14:11:44 2019 +0200
> +++ b/src/hotspot/cpu/zero/assembler_zero.hpp Fri May 17 14:14:06 2019 +0200
> @@ -37,6 +37,12 @@
>
>   public:
>    void pd_patch_instruction(address branch, address target, const char* file, int line);
> +
> +  //---< calculate length of instruction >---
> +  static unsigned int instr_len(unsigned char *instr) { return 1; }
> +
> +  //---< longest instructions >---
> +  static unsigned int instr_maxlen() { return 1; }
>  };
>
>  class MacroAssembler : public Assembler {
> diff -r c0568c492760 src/hotspot/share/compiler/abstractDisassembler.cpp
> --- a/src/hotspot/share/compiler/abstractDisassembler.cpp Fri May 17 14:11:44 2019 +0200
> +++ b/src/hotspot/share/compiler/abstractDisassembler.cpp Fri May 17 14:14:06 2019 +0200
> @@ -61,6 +61,9 @@
>  bool AbstractDisassembler::_show_bytes = false; // set "true" to see what's in memory bit by bit
>  // might prove cumbersome because instr_len is hard to find on x86
>  #endif
> +#if defined(ZERO)
> +bool AbstractDisassembler::_show_bytes = false; // set "true" to see what's in memory bit by bit
> +#endif
>
>  // Return #bytes printed. Callers may use that for output alignment.
>  // Print instruction address, and offset from blob begin.
> ----------- >8 ----------------
>
> This delta is contained as the only change in the new webrev#04 which is based on the current (13:10 GMT) jdk/jdk repo:
> https://cr.openjdk.java.net/~lucy/webrevs/8213084.04/
>
> Regards,
> Lutz
>
>
> On 16.05.19, 21:40, "Vladimir Kozlov" wrote:
>
>     I am not sure about the exact parameters but I see build testing uses the following:
>
>     configure --with-jvm-variants=zero --with-jvm-features=-shenandoahgc
>
>     Vladimir
>
>> On 5/16/19 12:22 PM, Schmidt, Lutz wrote:
>> Hi Vladimir,
>>
>> thanks for the extensive testing. And sorry for me neglecting ZERO. I will add a dummy instr_len() function. I saw another potential issue. There is no static initializer for AbstractDisassembler::_show_bytes. What is the correct macro to test for ZERO? Is it just "#ifdef ZERO"?
>>
>> I will prepare a new webrev with just these two additions as delta. But it will not be before Friday morning, my time.
>>
>> Thanks,
>> Lutz
>>
>> On 16.05.19, 20:38, "Vladimir Kozlov" wrote:
>>
>>     linux-x64-zero build is broken:
>>
>>     workspace/open/src/hotspot/share/compiler/abstractDisassembler.cpp:332:42: error: 'instr_len' is not a member of 'Assembler'
>>     int instr_size_in_bytes = Assembler::instr_len(pos);
>>                                          ^~~~~~~~~
>>     Other builds and testing are good.
>>
>>     Thanks,
>>     Vladimir
>>
>>> On 5/16/19 9:47 AM, Vladimir Kozlov wrote:
>>> Nice.
>>>
>>> I submitted our tier1-3 testing.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>> On 5/16/19 2:55 AM, Schmidt, Lutz wrote:
>>>> Hi Vladimir,
>>>>
>>>> sorry for the delayed reaction to your comments.
>>>>
>>>> - now it reads "static unsigned int instr_len()". This change added cpu/s390/assembler_s390.inline.hpp to the list
>>>> of modified files.
>>>> - testing from my side will be via the submit repo (BuildId: 2019-05-15-1543576.lutz.schmidt.source, no failures). In
>>>> addition, I added the patch to our internal builds so that our inhouse testing will cover it (no issues detected last
>>>> night).
>>>> - All the "hsdis-" prefixes in the PrintAssemblyOptions are gone, as are "print-pc" and "print-bytes". The latter
>>>> two were legacy anyway. I kept them for compatibility. But now, without the prefix, there is no compatibility anymore.
>>>> - Options parsing improvement will be done in a separate effort. I have created JDK-8223765 for that.
>>>> - there is a new webrev, based on the current jdk/jdk repo: https://cr.openjdk.java.net/~lucy/webrevs/8213084.03/
>>>>
>>>> ~thartmann:
>>>> The disabled code in disassembler_s390.cpp is something I would like to have. So far, I could not find time to make it
>>>> work reliably. I would like to keep it in as a reminder and a template to build on.
>>>>
>>>> Thanks,
>>>> Lutz
>>>>
>>>> On 10.05.19, 23:16, "Vladimir Kozlov" wrote:
>>>>
>>>> Hi Lutz,
>>>> My comments are inlined below.
>>>>> On 5/10/19 8:44 AM, Schmidt, Lutz wrote:
>>>>> Thank you, Vladimir!
>>>>> Please find my comments inline and let me know what you think.
>>>>> A new webrev with all the updates is here: https://cr.openjdk.java.net/~lucy/webrevs/8213084.02/
>>>> Found one more I missed last time:
>>>> assembler_s390.hpp: still signed return (on other platforms it was converted to unsigned):
>>>> static int instr_len(unsigned char *instr);
>>>>> Please note: the webrev is not based on the most current jdk/jdk! I do not like the idea to "hg pull -u" to a
>>>> repo state which is known to be broken. Once jdk/jdk is repaired, I will update the webrev in-place (provided there
>>>> were no serious clashes) and send a short note.
>>>> NP. Please provide the final webrev when you can so that I can run these changes through our testing to
>>>> make sure no issues are present (especially in builds).
>>>>> Regards,
>>>>> Lutz
>>>>>
>>>>> On 09.05.19, 21:30, "Vladimir Kozlov" wrote:
>>>>>
>>>>> Hi Lutz,
>>>>>
>>>>> Thank you for doing this great work.
>>>>>
>>>>> I have just small comments:
>>>>>
>>>>> x86_64.ad - empty change.
>>>>> File contains whitespace changes for formatting. Not visible in webrev.
>>>> Okay.
>>>>>
>>>>> nmethod.cpp - LUCY?
>>>>>
>>>>> + st->print_cr("LUCY: NULL-oop");
>>>>> + tty->print("LUCY NULL-oop");
>>>>> Oops. Leftover debugging output. Removed. Reads "NULL-oop" now.
>>>> Okay.
>>>>>
>>>>> nmethod.cpp - use PTR64_FORMAT instead of '0x%016lx'.
>>>>> Changed.
>>>>>
>>>>> vmreg.cpp - Use INTPTR_FORMAT instead of %ld for value().
>>>>> Changed.
>>>>>
>>>>> disassembler.* - LUCY_OBSOLETE?
>>>>>
>>>>> +#if defined(LUCY_OBSOLETE) // Used in SAPJVM only
>>>>> This is fancy code to step backwards in CISC instructions. Used to print a +/- range around a given instruction
>>>> address. Works reasonably well on s390, will probably not work at all for x86. I could not finally decide to kick it
>>>> out. But now I did. It's gone.
>>>> Okay.
>>>>>
>>>>> compilerDefinitions.hpp - I don't see where tier_digit() is used.
>>>>> I'm surprised myself. Introduced it and then made it obsolete. It's gone.
>>>>>
>>>>> disassembler.cpp - PrintAssemblyOptions. Why do you need to have 'hsdis-' in all option values? You
>>>>> need to check for an invalid value and print help output in such a case - it will be very useful if you
>>>>> forgot a value spelling. Also add a line for the 'help' value.
>>>>>
>>>>> The hsdis- prefix existed before I started my work. I just kept it to not hurt anybody's feelings. Actually,
>>>> the prefix has a minor practical use. It guards the many "if (strstr(..." instructions from being executed if there is
>>>> no use. I'm personally not emotionally attached to the hsdis- prefix. I can remove it if you (and the other reviewers)
>>>> like. Not changed as of now. Awaiting your input.
>>>> It is a pain to type long values and annoying to type the same prefix. I think the hsdis- prefix is
>>>> useless because PrintAssemblyOptions is used only for the disassembler and there are no values which
>>>> don't have the hsdis- prefix. This is not performance critical code to have a guard (check prefix).
>>>> And another commented new line:
>>>> + // ost->print_cr("PrintAssemblyOptions='%s'", options());
>>>>>
>>>>> Printing help text: There is an option (hsdis-help) to request help text printout.
>>>>> Options parsing doesn't exist here. It's just string comparisons. If one of the predefined strings is found -
>>>> fine. If not - so what. If you would like to detect unrecognized input, process_options() needs significantly more
>>>> intelligence. I can do that, but would like to do it in a separate effort. Your opinion?
>>>> Got it. I forgot that the PrintAssemblyOptions flag accepts a string with a *list* of values - you can't use
>>>> if-else or switch without complicating the code.
>>>> I noticed that PrintAssemblyOptions is defined as ccstr. Why is it not ccstrlist, which should be used
>>>> here? I don't think the next comment is correct for the ccstr type:
>>>> http://hg.openjdk.java.net/jdk/jdk/file/ef73702a906e/src/hotspot/share/compiler/disassembler.cpp#l190
>>>> It would be nice to fix it but you can do it later if you don't want to add more changes.
>>>>>
>>>>> Do you need the next commented lines:
>>>>>
>>>>> disassembler.cpp -
>>>>> +// ptrdiff_t _offset;
>>>>> Deleted.
>>>>>
>>>>> +// Output suppressed because it messes up disassembly.
>>>>> +// output()->print_cr("[Disassembling for mach='%s']", (const char*)arg);
>>>>> Uncommented, would like to keep it. Made the if condition permanently false.
>>>>>
>>>>> disassembler_s390.cpp -
>>>>> +// st->fill_to(((st->position()+3*tsize-1)/tsize)*tsize);
>>>>> Deleted.
>>>>> >>>>> compile.cpp - >>>>> +// st->print("# "); _tf->dump_on(st); st->cr(); >>>>> Uncommented. >>>>> >>>>> >>>>> abstractDisassembler.cpp - >>>>> // st->print("0x%016lx", *((julong*)here)); >>>>> st->print("0x%016lx", *((uintptr_t*)here)); >>>>> // st->print("0x%08x%08x", *((juint*)here), *((juint*)(here+4))); >>>>> Commented lines are gone. >>>>> >>>>> abstractDisassembler.cpp - may be explicit cast (byte*)?: >>>>> >>>>> st->print("%2.2x", *byte); >>>>> st->print("%2.2x", *pos); >>>>> st->print("0x%02x", *here); >>>>> Didn't see the need because the pointers are char* (= address) anyway. And, according to cppreference.com, >>>> std::byte is a C++17 feature. We are not there yet. >>>> okay >>>>> >>>>> PTR64_FORMAT ?: >>>>> st->print("0x%016lx", *((uintptr_t*)here)); >>>>> I'm kind of hesitant on that. Nice output alignment clearly depends on this to output exactly 18 characters. >>>> Changed other occurrences, so I changed this one as well. >>>> Thanks, >>>> Vladimir >>>>> >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>>> On 5/8/19 8:31 AM, Schmidt, Lutz wrote: >>>>>> Dear Community, >>>>>> >>>>>> may I please request comments and reviews for this change? Thank you! >>>>>> >>>>>> I have created a new webrev which is based on the current jdk/jdk repo. There was some merge effort. The >>>> code which constitutes this patch was not changed. Here's the webrev link: >>>>>> https://cr.openjdk.java.net/~lucy/webrevs/8213084.01/ >>>>>> >>>>>> Regards, >>>>>> Lutz >>>>>> >>>>>> On 11.04.19, 23:24, "Schmidt, Lutz" wrote: >>>>>> >>>>>> Dear All, >>>>>> >>>>>> this topic was discussed back in Nov/Dec 2018: >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-November/031552.html >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-December/031641.html >>>>>> >>>>>> Purpose of the discussion was to find out if my ideas are at all regarded useful and desirable. >>>>>> The result was mixed, some pro, some con. 
I let the input from back then influence my work of the >>>> last months. In particular, output verbosity can be controlled in a wide range now. In addition to the general >>>> -XX:+Print* switches, the amount of output can be adjusted by newly introduced -XX:PrintAssemblyOptions. Here is the >>>> list (with default settings): >>>>>> >>>>>> PrintAssemblyOptions help: >>>>>> hsdis-print-raw test plugin by requesting raw output (deprecated) >>>>>> hsdis-print-raw-xml test plugin by requesting raw xml (deprecated) >>>>>> hsdis-print-pc turn off PC printing (on by default) (deprecated) >>>>>> hsdis-print-bytes turn on instruction byte output (deprecated) >>>>>> >>>>>> hsdis-show-pc toggle printing current pc, currently ON >>>>>> hsdis-show-offset toggle printing current offset, currently OFF >>>>>> hsdis-show-bytes toggle printing instruction bytes, currently OFF >>>>>> hsdis-show-data-hex toggle formatting data as hex, currently ON >>>>>> hsdis-show-data-int toggle formatting data as int, currently OFF >>>>>> hsdis-show-data-float toggle formatting data as float, currently OFF >>>>>> hsdis-show-structs toggle compiler data structures, currently OFF >>>>>> hsdis-show-comment toggle instruction comments, currently OFF >>>>>> hsdis-show-block-comment toggle block comments, currently OFF >>>>>> hsdis-align-instr toggle instruction alignment, currently OFF >>>>>> >>>>>> Finally, I have pushed my changes to a state where I can dare to request your comments and reviews. >>>> I would like to suggest and request that we first focus on the effects (i.e. the generated output) of the changes. >>>> Once we got that adjusted and accepted, we can check the actual implementation and add improvements there. Sounds like >>>> a plan? Here is what you get: >>>>>> >>>>>> The machine code generated by the JVM can be printed in three different formats: >>>>>> - Hexadecimal. >>>>>> This is basically a hex dump of the memory range containing the code. 
>>>>>> This format is always available (PRODUCT and not-PRODUCT builds), regardless >>>>>> of the availability of a disassembler library. It applies to all sorts of >>>>>> code, be it blobs, stubs, compiled nmethods, ... >>>>>> This format seems useless at first glance, but it is not. In an upcoming, >>>>>> separate enhancement, the JVM will be made capable of reading files >>>>>> containing such code blocks and disassembling them post mortem. The most >>>>>> prominent example is an hs_err* file. >>>>>> - Disassembled. >>>>>> This is an assembly listing of the instructions as found in the memory range >>>>>> occupied by the blob, stub, compiled nmethod ... As a prerequisite, a suitable >>>>>> disassembler library (hsdis-.so) must be available at runtime. >>>>>> Most often, that will only be the case in test environments. If no disassembler >>>>>> library is available, hexadecimal output is used as fallback. >>>>>> - OptoAssembly. >>>>>> This is a meta code listing created only by the C2 compiler. As it is somewhat >>>>>> closer to the Java code, it may be helpful in linking assembly code to Java code. >>>>>> >>>>>> All three formats can be merged with additional information, most prominently compiler-internal >>>> "knowledge" about blocks, related bytecodes, statistics counters, and much more. >>>>>> >>>>>> Following the code itself, compiler-internal data structures, like oop maps, relocations, scopes, >>>> dependencies, exception handlers, are printed to aid in debugging. >>>>>> >>>>>> The full set of information is available in non-PRODUCT builds. PRODUCT builds do not support >>>> OptoAssembly output. Data structures are unavailable as well. >>>>>> >>>>>> So how does the output actually look like? Here are a few small snippets (linuxx86_64) to give you >>>> an idea. 
The complete output of an entire C2-compiled method, in multiple verbosity variants, is available here: >>>>>> http://cr.openjdk.java.net/~lucy/webrevs/8213084/ >>>>>> >>>>>> OptoAssembly output for reference (always on with PrintAssembly): >>>>>> ================================================================= >>>>>> >>>>>> 036 B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 >>>>>> 036 movl RBP, [RSI + #12 (8-bit)] # compressed ptr ! Field: java/lang/String.value >>>> (constant) >>>>>> 039 movl R11, [RBP + #12 (8-bit)] # range >>>>>> 03d NullCheck RBP >>>>>> >>>>>> 03d B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 >>>>>> 03d cmpl RDX, R11 # unsigned >>>>>> 040 jnb,us B6 P=0.000000 C=5375.000000 >>>>>> >>>>>> PrintAssembly with no disassembler library available: >>>>>> ===================================================== >>>>>> >>>>>> [Code] >>>>>> [Entry Point] >>>>>> 0x00007fc74d1d7b20: 448b 5608 49c1 e203 493b c20f 856f 69e7 ff90 9090 9090 9090 9090 9090 9090 9090 >>>>>> [Verified Entry Point] >>>>>> 0x00007fc74d1d7b40: 8984 2400 a0fe ff55 4883 ec20 440f be5e 1445 85db 7521 8b6e 0c44 8b5d 0c41 3bd3 >>>>>> 0x00007fc74d1d7b60: 732c 0fb6 4415 1048 83c4 205d 4d8b 9728 0100 0041 8502 c348 8bee 8914 2444 895c >>>>>> 0x00007fc74d1d7b80: 2404 be4d ffff ffe8 1483 e7ff 0f0b bee5 ffff ff89 5424 04e8 0483 e7ff 0f0b bef6 >>>>>> 0x00007fc74d1d7ba0: ffff ff89 5424 04e8 f482 e7ff 0f0b f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 >>>>>> [Exception Handler] >>>>>> 0x00007fc74d1d7bc0: e95b 0df5 ffe8 0000 0000 4883 2c24 05e9 0c7d e7ff >>>>>> [End] >>>>>> >>>>>> PrintAssembly with minimal verbosity: >>>>>> ===================================== >>>>>> >>>>>> 0x00007f0434b89bd6: mov 0xc(%rsi),%ebp >>>>>> 0x00007f0434b89bd9: mov 0xc(%rbp),%r11d >>>>>> 0x00007f0434b89bdd: cmp %r11d,%edx >>>>>> 0x00007f0434b89be0: jae 0x00007f0434b89c0e >>>>>> >>>>>> PrintAssembly (previous plus code offsets from code begin): >>>>>> =========================================================== >>>>>> >>>>>> 
0x00007f63c11d7956 (+0x36): mov 0xc(%rsi),%ebp >>>>>> 0x00007f63c11d7959 (+0x39): mov 0xc(%rbp),%r11d >>>>>> 0x00007f63c11d795d (+0x3d): cmp %r11d,%edx >>>>>> 0x00007f63c11d7960 (+0x40): jae 0x00007f63c11d798e >>>>>> >>>>>> PrintAssembly (previous plus block comments): >>>>>> =========================================================== >>>>>> >>>>>> ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 >>>>>> 0x00007f48211d76d6 (+0x36): mov 0xc(%rsi),%ebp >>>>>> 0x00007f48211d76d9 (+0x39): mov 0xc(%rbp),%r11d >>>>>> ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 >>>>>> 0x00007f48211d76dd (+0x3d): cmp %r11d,%edx >>>>>> 0x00007f48211d76e0 (+0x40): jae 0x00007f48211d770e >>>>>> >>>>>> PrintAssembly (previous plus instruction comments): >>>>>> =========================================================== >>>>>> >>>>>> ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 >>>>>> 0x00007fc3e11d7a56 (+0x36): mov 0xc(%rsi),%ebp ;*getfield value {reexecute=0 >>>> rethrow=0 return_oop=0} >>>>>> ; - java.lang.String::charAt at 8 >>>> (line 702) >>>>>> 0x00007fc3e11d7a59 (+0x39): mov 0xc(%rbp),%r11d ; implicit exception: dispatches to >>>> 0x00007fc3e11d7a9e >>>>>> ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 >>>>>> 0x00007fc3e11d7a5d (+0x3d): cmp %r11d,%edx >>>>>> 0x00007fc3e11d7a60 (+0x40): jae 0x00007fc3e11d7a8e >>>>>> >>>>>> For completeness, here are the links to >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213084 >>>>>> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8213084.00/ >>>>>> >>>>>> But please, as mentioned above, first focus on the output. The nitty details of the implementation >>>> I would like to discuss after the output format has received some support. >>>>>> >>>>>> Thank you so much for your time! 
>>>>>> Lutz
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>
>>
>
>

From fujie at loongson.cn Fri May 17 15:03:29 2019
From: fujie at loongson.cn (Jie Fu)
Date: Fri, 17 May 2019 23:03:29 +0800
Subject: RFR:8222302:[TESTBUG]test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU
In-Reply-To: <1b0a489c-e3c1-204d-68a4-e410c23f9078@oracle.com>
References: <6553dd4f-73e1-d91e-80ef-1f0473fb698c@loongson.cn>
 <1b0a489c-e3c1-204d-68a4-e410c23f9078@oracle.com>
Message-ID: <282a11f2-1604-b1ee-ead4-70c95fcf993c@loongson.cn>

Hi all,

May I have a second review for this change?
We need this test case for our mips port.

Thanks in advance.
Best regards,
Jie

On 2019-05-17 07:33, Vladimir Kozlov wrote:
> I think you need a second review to make sure it is the right change.
> Especially from someone who works on other platforms.
>
> Thanks,
> Vladimir
>
> On 5/16/19 4:17 PM, Jie Fu wrote:
>> Hi Vladimir,
>>
>> Thank you for your review.
>>
>> Do you think the change is simple enough?
>> And could you please sponsor it?
>>
>> Thanks a lot.
>> Best regards,
>> Jie
>>
>> On 2019-05-17 01:48, Vladimir Kozlov wrote:
>>> Looks good to me.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 5/16/19 4:15 AM, Jie Fu wrote:
>>>> Ping.
>>>>
>>>> Could someone help to review this change?
>>>> I would greatly appreciate it if this test case could be used for our
>>>> mips-port jdk.
>>>>
>>>> Thanks a lot.
>>>> Best regards,
>>>> Jie
>>>>
>>>> On 2019/4/11 9:51, Jie Fu wrote:
>>>>> Hi all,
>>>>>
>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8222302
>>>>> Webrev: http://cr.openjdk.java.net/~jiefu/8222302/webrev.00/
>>>>>
>>>>> TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU (not
>>>>> AArch64, PPC, S390x, SPARC or X86).
>>>>> It is designed to test
>>>>> "UseSHASpecificTestCaseForUnsupportedCPU"[1] and
>>>>> "GenericTestCaseForOtherCPU"[2] on any other CPU[3].
>>>>> But when they run on any other CPU (e.g., mips), an exception[4]
>>>>> is always thrown, which causes the failure.
>>>>> So there seems to be a logical bug in it.
>>>>>
>>>>> The change has been tested on mips and x86.
>>>>> Could you please review it?
>>>>> Thanks a lot.
>>>>>
>>>>> Best regards,
>>>>> Jie
>>>>>
>>>>> [1]
>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l56
>>>>>
>>>>> [2]
>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l58
>>>>>
>>>>> [3]
>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForOtherCPU.java#l34
>>>>>
>>>>> [4]
>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/SHAOptionsBase.java#l92
>>>>>
>>>>>
>>>>>
>>>>
>>
>>

From vladimir.kozlov at oracle.com Fri May 17 19:27:03 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 17 May 2019 12:27:03 -0700
Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management
In-Reply-To:
References: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com>
 <9D607D81-5A25-406B-B05B-7D9C9C733D8F@sap.com>
 <5ec26a0b-0a49-c458-8d90-aa92396610a5@oracle.com>
 <65F5D580-6B68-4422-AAE6-3406D7FCDE7A@sap.com>
 <48fb591a-6055-c1c2-b052-6d8bc770da28@oracle.com>
 <138023EE-F390-4618-A885-AD6473B593DE@sap.com>
Message-ID: <1c264e0b-b497-44f6-b0c6-6d0cf8a6104c@oracle.com>

Hi Lutz,

Testing passed.
I think you need a second review because the changes are not trivial.

Thanks,
Vladimir

On 5/16/19 4:17 PM, Vladimir Kozlov wrote:
> On 5/16/19 3:16 PM, Schmidt, Lutz wrote:
>> Hi Vladimir,
>>
>> I have implemented that "added safety feature". What it does:
>>   - During reserve() and expand_by(), all newly committed heap memory is initialized with badCodeHeapNewVal.
>>   - Whenever a FreeBlock is added to the free block list, its memory (except for the header part) is initialized with
>> badCodeHeapNewVal.
>>   - During verify(), it is checked that all heap memory in free blocks is initialized as expected.
>>
>> Please find the diff from webrev.01 to webrev.02 attached as text file. The latest full webrev, based on current
>> jdk/jdk, is here:
>> https://cr.openjdk.java.net/~lucy/webrevs/8223444.02/
>
> Good. I will run testing on it.
>
>>
>> There is a dubious assert in nmethod.cpp:338 (PcDescCache::reset_to()). It implicitly relies on a new HeapBlock to be
>> initialized with a bit pattern forming a negative value when accessed as int. It cost me quite some time to find that
>> out. And it's bad code, in my opinion. I would suggest to just delete that assert.
>
> Yes, I agree. Such implicit dependence is bad and useless.
> We use reset_to() in AOT where scopes should already have valid offsets:
> http://hg.openjdk.java.net/jdk/jdk/file/9feb4852536f/src/hotspot/share/aot/aotCompiledMethod.hpp#l152
> I am not sure how we pass this assert in the AOT case.
>
>>
>> I understand your concerns re my changes to deallocate_tail(). In the beginning, I had the same. But I did some
>> testing/tracing. That additional free block (just one) is consumed by subsequent allocate() calls and eventually
>> vanishes.
>> There is an advantage that comes with my changes: the HeapBlock whose tail is deallocated no longer needs to be
>> the last block before _next_segment. That is a prerequisite if we want to get rid of issues like the one described in
>> JDK-8223770. We could just allocate a generously large block for stubs and at the end deallocate_tail them.
>
> Okay, it sounds good.
>
> Thanks,
> Vladimir
>
>>
>> Sorry for the long text.
>> Lutz
>>
>> On 15.05.19, 19:00, "Vladimir Kozlov" wrote:
>>
>>      On 5/14/19 10:53 PM, Schmidt, Lutz wrote:
>>      > Hi Vladimir,
>>      >
>>      > thank you for your comments. About filling CodeHeap with bad values after split_block:
>>      >   - in deallocate_tail, the leading part must remain intact. It contains valid code.
>>      >   - in search_freelist, one free block is split into two. There I could invalidate the contents of both parts
>>      Thank you for explaining.
>>      >   - If you want added safety, wouldn't it then be better to invalidate the block contents during
>> add_to_freelist()? You could then be sure there is no executable code in a free block.
>>      Yes, it is preferable.
>>      Another note (after looking more at the changes). You changed where the freed tail goes. Originally it was added to the next
>> block
>>      _next_segment (making it larger) and you created a separate small block. Doesn't it create more fragmentation?
>>      Thanks,
>>      Vladimir
>>      >
>>      > Regards,
>>      > Lutz
>>      >
>>      > On 14.05.19, 23:00, "Vladimir Kozlov" wrote:
>>      >
>>      >      On 5/14/19 1:09 PM, Schmidt, Lutz wrote:
>>      >      > Hi Vladimir,
>>      >      >
>>      >      > I had the same thought re atomicity. memset() is not consistent even on one platform. But I believe it's
>> not a factor here. The original code was a byte-by-byte loop. And we have byte atomicity on all supported platforms,
>> even with memset().
>>      >      >
>>      >      > It's a different thing with sequence of initialization. Do we really depend on byte(i) being initialized
>> before byte(i+1)? If so, we would have a problem even with the explicit byte loop. Not on x86, but on ppc with its
>> weak memory ordering.
>>      >
>>      >      Okay, if it is byte copy I am fine with it.
>>      >
>>      >      >
>>      >      > About segment map marking:
>>      >      > There is a short description how the segment map works in heap.cpp, right before CodeHeap::find_start().
>>      >      > In short: each segment map element contains an (unsigned) index which, when subtracted from that element
>> index, addresses the segment map element where the heap block starts. Thus, when you re-initialize the tail part of a
>> heap block range to describe a newly formed heap block, the leading part remains valid.
>>      >      >
>>      >      > Segmap  before             after
>>      >      > Index   split              split
>>      >      >    I       0 <- block start   0 <- block start (now shorter)
>>      >      >    I+1     1                  1    each index 0..9 still points
>>      >      >    I+2     2                  2    back to the block start
>>      >      >    I+3     3                  3
>>      >      >    I+4     4                  4
>>      >      >    I+5     5                  5
>>      >      >    I+6     6                  6
>>      >      >    I+7     7                  7
>>      >      >    I+8     8                  8
>>      >      >    I+9     9                  9
>>      >      >    I+10    10                 0 <- new block start
>>      >      >    I+11    11                 1
>>      >      >    I+12    12                 2
>>      >      >    I+13    13                 3
>>      >      >    I+14    14                 4
>>      >      >    I+15    0 <- block start   0 <- block start
>>      >      >    I+16    1                  1
>>      >      >    I+17    2                  2
>>      >      >    I+18    3                  3
>>      >      >    I+19    4                  4
>>      >      >
>>      >      > There is a (very short) description about what's happening at the very end of search_freelist().
>> split_block() is called there as well. Would you like to see a similar comment in deallocate_tail()?
>>      >
>>      >      Thank you, I forgot that the first block mapping is still valid.
>>      >
>>      >      What about storing the bad value (in debug mode) only in the second part and not both parts?
>>      >
>>      >
> >> ???? >????? > Once I have your response, I will create a new webrev reflecting your input. I need to do that anyway >> because the assert in heap.cpp:200 has to go away. It fires spuriously. The checks can't be done at that place. In >> addition, I will add one line of comment and rename a local variable. That's it. >> ???? > >> ???? >????? Okay. >> ???? > >> ???? >????? Thanks, >> ???? >????? Vladimir >> ???? > >> ???? >????? > >> ???? >????? > Thanks, >> ???? >????? > Lutz >> ???? >????? > >> ???? >????? > >> ???? >????? > On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" >> wrote: >> ???? >????? > >> ???? >????? >????? Good. >> ???? >????? > >> ???? >????? >????? Do we need to be concern about atomicity of marking? We know that memset() is not atomic (may be I >> am wrong here). >> ???? >????? >????? An other thing is I did not get logic in deallocate_tail(). split_block() marks only second half of >> split segments as >> ???? >????? >????? used and (after call) store bad values in it. What about first part? May be add comment. >> ???? >????? > >> ???? >????? >????? Thanks, >> ???? >????? >????? Vladimir >> ???? >????? > >> ???? >????? >????? On 5/14/19 3:47 AM, Schmidt, Lutz wrote: >> ???? >????? >????? > Dear all, >> ???? >????? >????? > >> ???? >????? >????? > May I please request reviews for my change? >> ???? >????? >????? > Bug:??? https://bugs.openjdk.java.net/browse/JDK-8223444 >> ???? >????? >????? > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/ >> ???? >????? >????? > >> ???? >????? >????? > What this change is all about: >> ???? >????? >????? > ------------------------------ >> ???? >????? >????? > While working on another topic, I came across the code in share/memory/heap.cpp. I applied some >> small changes which I would call improvements. >> ???? >????? >????? > >> ???? >????? >????? 
> Furthermore, and in particular with these changes, the platform-specific parameter >> CodeCacheMinBlockLength should by fine-tuned to minimize the number of residual small free blocks. Heap block >> allocation does not create free blocks smaller than CodeCacheMinBlockLength. This parameter value should match the >> minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance >> for them to vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the beginning of) >> the free list, slowing down the free block search. >> ???? >????? >????? > >> ???? >????? >????? > The following free block counts have been found after running JVM98 with different >> CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown. >> ???? >????? >????? > >> ???? >????? >????? > JDK-8223444 not applied >> ???? >????? >????? > ======================= >> ???? >????? >????? > >> ???? >????? >????? >????????? Segment? |? free blocks with CodeCacheMinBlockLength= >> ???? >????? >????? >??????????? Size?? |?????? 1????? 2????? 3????? 4????? 6????? 8 >> ???? >????? >????? > -----------------+------------------------------------------- >> ???? >????? >????? > aarch????? 128?? |?????? 0??? 153???? 75???? 30???? 38????? 2 >> ???? >????? >????? > ppc??????? 128?? |?????? 0??? 149???? 98???? 59???? 14????? 2 >> ???? >????? >????? > ppcle????? 128?? |?????? 0??? 219??? 161??? 110???? 69???? 34 >> ???? >????? >????? > s390?????? 256?? |?????? 0??? 142???? 93???? 59???? 30???? 10 >> ???? >????? >????? > x86??????? 128?? |?????? 0??? 215??? 157??? 118???? 42???? 11 >> ???? >????? >????? > >> ???? >????? >????? > >> ???? >????? >????? > JDK-8223444 applied >> ???? >????? >????? > =================== >> ???? >????? >????? > >> ???? >????? >????? >????????? Segment? |? free blocks with CodeCacheMinBlockLength=? |? suggested >> ???? >????? >????? >??????????? Size?? 
|?????? 1????? 2????? 3????? 4????? 6????? 8? |?? setting >> ???? >????? >????? > -----------------+---------------------------------------------+------------ >> ???? >????? >????? > aarch????? 128?? |???? 221??? 115???? 80???? 36????? 7????? 1? |???? 6 >> ???? >????? >????? > ppc??????? 128?? |???? 245??? 152??? 101???? 54???? 14????? 4? |???? 6 >> ???? >????? >????? > ppcle????? 128?? |???? 243??? 144???? 89???? 72???? 20????? 5? |???? 6 >> ???? >????? >????? > s390?????? 256?? |???? 168???? 60???? 67????? 8????? 6????? 2? |???? 4 >> ???? >????? >????? > x86??????? 128?? |???? 223??? 139???? 83???? 50???? 11????? 2? |???? 6 >> ???? >????? >????? > >> ???? >????? >????? > Thank you for your time and opinion! >> ???? >????? >????? > Lutz >> ???? >????? >????? > >> ???? >????? >????? > >> ???? >????? >????? > >> ???? >????? >????? > >> ???? >????? > >> ???? >????? > >> ???? > >> ???? > >> From dean.long at oracle.com Fri May 17 20:59:04 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Fri, 17 May 2019 13:59:04 -0700 Subject: RFR:8222302:[TESTBUG]test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU In-Reply-To: <282a11f2-1604-b1ee-ead4-70c95fcf993c@loongson.cn> References: <6553dd4f-73e1-d91e-80ef-1f0473fb698c@loongson.cn> <1b0a489c-e3c1-204d-68a4-e410c23f9078@oracle.com> <282a11f2-1604-b1ee-ead4-70c95fcf993c@loongson.cn> Message-ID: <1a7a031f-f975-337a-77db-57fb43f5fada@oracle.com> Seems fine to me.? However, these tests seem overly complicated.? I don't think we test other flags quite so thoroughly! dl On 5/17/19 8:03 AM, Jie Fu wrote: > Hi all, > > May I have a second review for this change? > We need this test case for our misp port. > > Thanks in advance. > Best regards, > Jie > > > On 2019?05?17? 07:33, Vladimir Kozlov wrote: >> I think you need second review to make sure it is right change. >> Especially who works on other platforms. 
>> >> Thanks, >> Vladimir >> >> On 5/16/19 4:17 PM, Jie Fu wrote: >>> Hi Vladimir, >>> >>> Thank you for your review. >>> >>> Do you think the change is simple enough? >>> And could you please sponsor it? >>> >>> Thanks a lot. >>> Best regards, >>> Jie >>> >>> On 2019?05?17? 01:48, Vladimir Kozlov wrote: >>>> Looks good to me. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 5/16/19 4:15 AM, Jie Fu wrote: >>>>> Ping. >>>>> >>>>> Could someone help to review this change? >>>>> I would greatly appreciate if this test case could be used for our >>>>> mips-port jdk. >>>>> >>>>> Thanks a lot. >>>>> Best regards, >>>>> Jie >>>>> >>>>> On 2019/4/11 ??9:51, Jie Fu wrote: >>>>>> Hi all, >>>>>> >>>>>> JBS:??? https://bugs.openjdk.java.net/browse/JDK-8222302 >>>>>> Webrev: http://cr.openjdk.java.net/~jiefu/8222302/webrev.00/ >>>>>> >>>>>> TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU (not >>>>>> AArch64, PPC, S390x, SPARC or X86). >>>>>> It is designed to test >>>>>> "UseSHASpecificTestCaseForUnsupportedCPU"[1] and >>>>>> "GenericTestCaseForOtherCPU"[2] on any other CPU[3]. >>>>>> But when they run on any other CPU (e.g., mips), an exception[4] >>>>>> is always thrown, which causes the failure. >>>>>> So there seems to be a logical bug in it. >>>>>> >>>>>> The change has been tested on mips and x86. >>>>>> Could you please review it? >>>>>> Thanks a lot. 
>>>>>> >>>>>> Best regards, >>>>>> Jie >>>>>> >>>>>> [1] >>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l56 >>>>>> >>>>>> [2] >>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l58 >>>>>> >>>>>> [3] >>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForOtherCPU.java#l34 >>>>>> >>>>>> [4] >>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/SHAOptionsBase.java#l92 >>>>>> >>>>>> >>>>>> >>>>> >>> >>> > > From fujie at loongson.cn Sat May 18 01:37:43 2019 From: fujie at loongson.cn (Jie Fu) Date: Sat, 18 May 2019 09:37:43 +0800 Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached Message-ID: Hi all, JBS:??? https://bugs.openjdk.java.net/browse/JDK-8224162 Webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.00/ I'm sorry to introduce this assertion failure. Please review the suggested fix and give me some advice. Leonid, could you please help to test the patch? I don't have the reproducer you mentioned in the JBS. Thanks a lot. Best regards, Jie From fujie at loongson.cn Sat May 18 03:51:28 2019 From: fujie at loongson.cn (Jie Fu) Date: Sat, 18 May 2019 11:51:28 +0800 Subject: RFR:8222302:[TESTBUG]test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU In-Reply-To: <1a7a031f-f975-337a-77db-57fb43f5fada@oracle.com> References: <6553dd4f-73e1-d91e-80ef-1f0473fb698c@loongson.cn> <1b0a489c-e3c1-204d-68a4-e410c23f9078@oracle.com> <282a11f2-1604-b1ee-ead4-70c95fcf993c@loongson.cn> <1a7a031f-f975-337a-77db-57fb43f5fada@oracle.com> Message-ID: <4a40d62d-4d9b-e6bf-771a-6daeefcc7b1d@loongson.cn> Thanks dl for your review. Vladimir, is it OK to be pushed? 
Thanks a lot.
Best regards,
Jie

On 2019/5/18 4:59, dean.long at oracle.com wrote:
> Seems fine to me. However, these tests seem overly complicated. I
> don't think we test other flags quite so thoroughly!
>
> dl
>
> On 5/17/19 8:03 AM, Jie Fu wrote:
>> Hi all,
>>
>> May I have a second review for this change?
>> We need this test case for our mips port.
>>
>> Thanks in advance.
>> Best regards,
>> Jie
>>
>>
>> On 2019-05-17 07:33, Vladimir Kozlov wrote:
>>> I think you need second review to make sure it is right change.
>>> Especially who works on other platforms.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 5/16/19 4:17 PM, Jie Fu wrote:
>>>> Hi Vladimir,
>>>>
>>>> Thank you for your review.
>>>>
>>>> Do you think the change is simple enough?
>>>> And could you please sponsor it?
>>>>
>>>> Thanks a lot.
>>>> Best regards,
>>>> Jie
>>>>
>>>> On 2019-05-17 01:48, Vladimir Kozlov wrote:
>>>>> Looks good to me.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 5/16/19 4:15 AM, Jie Fu wrote:
>>>>>> Ping.
>>>>>>
>>>>>> Could someone help to review this change?
>>>>>> I would greatly appreciate if this test case could be used for
>>>>>> our mips-port jdk.
>>>>>>
>>>>>> Thanks a lot.
>>>>>> Best regards,
>>>>>> Jie
>>>>>>
>>>>>> On 2019/4/11 9:51, Jie Fu wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8222302
>>>>>>> Webrev: http://cr.openjdk.java.net/~jiefu/8222302/webrev.00/
>>>>>>>
>>>>>>> TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU
>>>>>>> (not AArch64, PPC, S390x, SPARC or X86).
>>>>>>> It is designed to test
>>>>>>> "UseSHASpecificTestCaseForUnsupportedCPU"[1] and
>>>>>>> "GenericTestCaseForOtherCPU"[2] on any other CPU[3].
>>>>>>> But when they run on any other CPU (e.g., mips), an exception[4]
>>>>>>> is always thrown, which causes the failure.
>>>>>>> So there seems to be a logical bug in it.
>>>>>>>
>>>>>>> The change has been tested on mips and x86.
>>>>>>> Could you please review it?
>>>>>>> Thanks a lot.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Jie
>>>>>>>
>>>>>>> [1]
>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l56
>>>>>>>
>>>>>>> [2]
>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l58
>>>>>>>
>>>>>>> [3]
>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForOtherCPU.java#l34
>>>>>>>
>>>>>>> [4]
>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/SHAOptionsBase.java#l92
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>>
>>
>>
>

From leonid.mesnik at oracle.com  Sat May 18 04:13:15 2019
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Fri, 17 May 2019 21:13:15 -0700
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To:
References:
Message-ID:

Hi,

I started testing, but it takes a long time, so I will update the bug with the status on Monday.

Leonid

> On May 17, 2019, at 6:37 PM, Jie Fu wrote:
>
> Hi all,
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8224162
> Webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.00/
>
> I'm sorry to introduce this assertion failure.
> Please review the suggested fix and give me some advice.
>
> Leonid, could you please help to test the patch?
> I don't have the reproducer you mentioned in the JBS.
>
> Thanks a lot.
> Best regards,
> Jie
>
>

From vladimir.x.ivanov at oracle.com  Sat May 18 15:53:35 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Sat, 18 May 2019 18:53:35 +0300
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To:
References:
Message-ID: <5ed61999-59ef-15a6-83ac-04d5f70e31a8@oracle.com>

Looks good.

Best regards,
Vladimir Ivanov

On 18/05/2019 04:37, Jie Fu wrote:
> Hi all,
>
> JBS:    https://bugs.openjdk.java.net/browse/JDK-8224162
> Webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.00/
>
> I'm sorry to introduce this assertion failure.
> Please review the suggested fix and give me some advice.
>
> Leonid, could you please help to test the patch?
> I don't have the reproducer you mentioned in the JBS.
>
> Thanks a lot.
> Best regards,
> Jie
>
>

From vladimir.kozlov at oracle.com  Sat May 18 16:15:47 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Sat, 18 May 2019 09:15:47 -0700
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To:
References:
Message-ID:

Hi Jie,

So the counter was incremented while this code is executed. And you fixed it by caching initial value.
Looks good.

Thanks,
Vladimir

On 5/17/19 6:37 PM, Jie Fu wrote:
> Hi all,
>
> JBS:    https://bugs.openjdk.java.net/browse/JDK-8224162
> Webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.00/
>
> I'm sorry to introduce this assertion failure.
> Please review the suggested fix and give me some advice.
>
> Leonid, could you please help to test the patch?
> I don't have the reproducer you mentioned in the JBS.
>
> Thanks a lot.
> Best regards,
> Jie
>
>

From vladimir.kozlov at oracle.com  Sat May 18 19:13:17 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Sat, 18 May 2019 12:13:17 -0700
Subject: RFR:8222302:[TESTBUG]test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU
In-Reply-To: <4a40d62d-4d9b-e6bf-771a-6daeefcc7b1d@loongson.cn>
References: <6553dd4f-73e1-d91e-80ef-1f0473fb698c@loongson.cn> <1b0a489c-e3c1-204d-68a4-e410c23f9078@oracle.com> <282a11f2-1604-b1ee-ead4-70c95fcf993c@loongson.cn> <1a7a031f-f975-337a-77db-57fb43f5fada@oracle.com> <4a40d62d-4d9b-e6bf-771a-6daeefcc7b1d@loongson.cn>
Message-ID: <9c37d210-7392-b6e7-a3b7-241769d68b1c@oracle.com>

Yes. I tested it and pushed:
http://hg.openjdk.java.net/jdk/jdk/rev/24c0eeb3ebe7

Regards,
Vladimir

On 5/17/19 8:51 PM, Jie Fu wrote:
> Thanks dl for your review.
>
> Vladimir, is it OK to be pushed?
>
> Thanks a lot.
> Best regards,
> Jie
>
> On 2019/5/18 4:59, dean.long at oracle.com wrote:
>> Seems fine to me. However, these tests seem overly complicated. I don't think we test other flags quite so thoroughly!
>>
>> dl
>>
>> On 5/17/19 8:03 AM, Jie Fu wrote:
>>> Hi all,
>>>
>>> May I have a second review for this change?
>>> We need this test case for our mips port.
>>>
>>> Thanks in advance.
>>> Best regards,
>>> Jie
>>>
>>>
>>> On 2019-05-17 07:33, Vladimir Kozlov wrote:
>>>> I think you need second review to make sure it is right change. Especially who works on other platforms.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 5/16/19 4:17 PM, Jie Fu wrote:
>>>>> Hi Vladimir,
>>>>>
>>>>> Thank you for your review.
>>>>>
>>>>> Do you think the change is simple enough?
>>>>> And could you please sponsor it?
>>>>>
>>>>> Thanks a lot.
>>>>> Best regards,
>>>>> Jie
>>>>>
>>>>> On 2019-05-17 01:48, Vladimir Kozlov wrote:
>>>>>> Looks good to me.
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 5/16/19 4:15 AM, Jie Fu wrote:
>>>>>>> Ping.
>>>>>>>
>>>>>>> Could someone help to review this change?
>>>>>>> I would greatly appreciate if this test case could be used for our mips-port jdk.
>>>>>>>
>>>>>>> Thanks a lot.
>>>>>>> Best regards,
>>>>>>> Jie
>>>>>>>
>>>>>>> On 2019/4/11 9:51, Jie Fu wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8222302
>>>>>>>> Webrev: http://cr.openjdk.java.net/~jiefu/8222302/webrev.00/
>>>>>>>>
>>>>>>>> TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU (not AArch64, PPC, S390x, SPARC or X86).
>>>>>>>> It is designed to test "UseSHASpecificTestCaseForUnsupportedCPU"[1] and "GenericTestCaseForOtherCPU"[2] on any
>>>>>>>> other CPU[3].
>>>>>>>> But when they run on any other CPU (e.g., mips), an exception[4] is always thrown, which causes the failure.
>>>>>>>> So there seems to be a logical bug in it.
>>>>>>>>
>>>>>>>> The change has been tested on mips and x86.
>>>>>>>> Could you please review it?
>>>>>>>> Thanks a lot.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Jie
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l56
>>>>>>>>
>>>>>>>> [2]
>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java#l58
>>>>>>>>
>>>>>>>> [3]
>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForOtherCPU.java#l34
>>>>>>>>
>>>>>>>> [4]
>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/bf07e140c49c/test/hotspot/jtreg/compiler/intrinsics/sha/cli/SHAOptionsBase.java#l92
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>>
>>
>

From fujie at loongson.cn  Sun May 19 00:40:52 2019
From: fujie at loongson.cn (Jie Fu)
Date: Sun, 19 May 2019 08:40:52 +0800
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To:
References:
Message-ID:

Thanks Vladimir Ivanov and Vladimir Kozlov for your review.

Let's wait for Leonid's test result.

Thanks.
Best regards,
Jie

On 2019-05-19 00:15, Vladimir Kozlov wrote:
> Hi Jie,
>
> So the counter was incremented while this code is executed. And you
> fixed it by caching initial value.
> Looks good.
>
> Thanks,
> Vladimir
>
> On 5/17/19 6:37 PM, Jie Fu wrote:
>> Hi all,
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8224162
>> Webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.00/
>>
>> I'm sorry to introduce this assertion failure.
>> Please review the suggested fix and give me some advice.
>>
>> Leonid, could you please help to test the patch?
>> I don't have the reproducer you mentioned in the JBS.
>>
>> Thanks a lot.
>> Best regards,
>> Jie
>>
>>

From fujie at loongson.cn  Sun May 19 00:43:46 2019
From: fujie at loongson.cn (Jie Fu)
Date: Sun, 19 May 2019 08:43:46 +0800
Subject: RFR:8222302:[TESTBUG]test/hotspot/jtreg/compiler/intrinsics/sha/cli/TestUseSHAOptionOnUnsupportedCPU.java fails on any other CPU
In-Reply-To: <9c37d210-7392-b6e7-a3b7-241769d68b1c@oracle.com>
References: <6553dd4f-73e1-d91e-80ef-1f0473fb698c@loongson.cn> <1b0a489c-e3c1-204d-68a4-e410c23f9078@oracle.com> <282a11f2-1604-b1ee-ead4-70c95fcf993c@loongson.cn> <1a7a031f-f975-337a-77db-57fb43f5fada@oracle.com> <4a40d62d-4d9b-e6bf-771a-6daeefcc7b1d@loongson.cn> <9c37d210-7392-b6e7-a3b7-241769d68b1c@oracle.com>
Message-ID: <03eb1c0e-f4b4-83b1-018e-5fc267621555@loongson.cn>

Thank you so much, Vladimir.

On 2019-05-19 03:13, Vladimir Kozlov wrote:
> Yes. I tested it and pushed:
> http://hg.openjdk.java.net/jdk/jdk/rev/24c0eeb3ebe7
>
> Regards,
> Vladimir

From dms at samersoff.net  Sun May 19 15:38:32 2019
From: dms at samersoff.net (Dmitry Samersoff)
Date: Sun, 19 May 2019 18:38:32 +0300
Subject: [aarch64-port-dev ] RFR(S): 8215792: AArch64: String.indexOf generates incorrect result
In-Reply-To: <07582b62-ccdf-97c8-5bd9-f441b488fa03@bell-sw.com>
References: <32345571546521566@sas2-ce04c18c415c.qloud-c.yandex.net> <07582b62-ccdf-97c8-5bd9-f441b488fa03@bell-sw.com>
Message-ID: <46a604a9-8d91-220e-67da-8d58db68f670@samersoff.net>

Dmitrij,

I'm OK with this one-line fix.

I would recommend pushing the patch that fixes the actual bug (webrev.01), then proceeding with the documentation (i.e. webrev.02) in a separate turn.

-Dmitry

On 09.01.2019 17:50, Dmitrij Pochepko wrote:
> Hi all,
>
>
> here is my version of this patch consisting of single "sub" instruction
> (I haven't changed test):
> http://cr.openjdk.java.net/~dpochepk/8215792/webrev.01/
>
> cnt2 is a counter for characters yet to be checked. So, instead of
> checking all characters in source string for first character match
> (which was initial reason for this bug), now it checks (str2len - str1len
> + 1).
>
>
> Actually I think this "sub" instruction was initially lost while working
> on this intrinsic and moving this instruction between this block
> (generate_string_indexof_linear) and caller code. Regular tests couldn't
> catch this problem.
>
>
> I ran some testing to ensure regular usecases are not affected and it
> seems fine. Affected testcase and your test pass as well.
>
>
> btw: now this code is even faster, because fewer characters will be
> loaded and checked
>
>
> Thanks,
>
> Dmitrij
>
> On 04/01/2019 3:52 PM, Dmitrij Pochepko wrote:
>> Sure.
>>
>> I could miss something, so, need to try it. I'll send webrev with
>> patch once it's done.
>>
>>
>> Thanks,
>>
>> Dmitrij
>>
>>
>> On 04.01.2019 14:04, Pengfei Li (Arm Technology China) wrote:
>>> Hi Dmitrij,
>>>
>>> Thanks a lot for your reply.
>>>
>>>> since cnt2 is used as counter, wouldn't it be easier and shorter
>>>> just to subtract cnt1 from cnt2 at the beginning of this code.
>>>> Total (cnt2 - cnt1 +1) combinations must be checked. That is why
>>>> first subtraction is by (wordSize/str2_chr_size - 1).
>>>> Then whole fix will be probably just 1 line at the beginning:
>>>> sub(cnt2, cnt2, cnt1);
>>> I don't think the whole fix could be as easy as "sub(cnt2, cnt2,
>>> cnt1)" because cnt2 is the counter which counts number of bytes not
>>> processed. It could be different from the number of bytes after
>>> current first-character-match index.
>>>
>>> But this is just my thought. Perhaps I didn't understand your idea
>>> and code thoroughly. So could you post your shorter fix and let's
>>> test if it's right?
>>>
>>> --
>>> Thanks,
>>> Pengfei
>>>
>>

From dms at samersoff.net  Sun May 19 15:42:11 2019
From: dms at samersoff.net (Dmitry Samersoff)
Date: Sun, 19 May 2019 18:42:11 +0300
Subject: [aarch64-port-dev ] RFR: 8218748: AARCH64: String::compareTo intrinsic documentation and maintenance improvement
In-Reply-To:
References: <86f91401-b7f6-b634-fef1-b0615b8fcde0@redhat.com>
Message-ID: <399186f2-5cb7-b713-33c5-b29b6eb73eb8@samersoff.net>

Dmitrij,

The changes look good to me.

-Dmitry

On 25.02.2019 19:52, Dmitrij Pochepko wrote:
> Hi Andrew, Pengfei,
>
> I created webrev.02 with all your suggestions implemented:
>
> webrev: http://cr.openjdk.java.net/~dpochepk/8218748/webrev.02/
>
> - comments are now both in separate section and inlined into code.
> - documentation mismatch mentioned by Pengfei is fixed:
> -- SHORT_LAST_INIT label name misprint changed to correct SHORT_LAST
> -- SHORT_LOOP_TAIL block now merged with last instruction.
Documentation > is updated respectively > - minor other changes to layout and wording > > Newly developed tests were run as sanity and they passed. > > Thanks, > Dmitrij > > On 22/02/2019 6:42 PM, Andrew Haley wrote: >> On 2/22/19 10:31 AM, Pengfei Li (Arm Technology China) wrote: >> >>> So personally, I still prefer to inline the comments with the >>> original code block to avoid this kind of inconsistencies. And it >>> makes us easier to review or maintain the code together with the >>> doc, as we don't need to scroll back and force. I don't know the >>> benefit of making the code documentation as a separate part. What's >>> your opinion, Andrew Haley? >> I agree with you. There's no harm having both inline and separate. >> From tobias.hartmann at oracle.com Mon May 20 07:29:39 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 20 May 2019 09:29:39 +0200 Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management In-Reply-To: <1c264e0b-b497-44f6-b0c6-6d0cf8a6104c@oracle.com> References: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com> <9D607D81-5A25-406B-B05B-7D9C9C733D8F@sap.com> <5ec26a0b-0a49-c458-8d90-aa92396610a5@oracle.com> <65F5D580-6B68-4422-AAE6-3406D7FCDE7A@sap.com> <48fb591a-6055-c1c2-b052-6d8bc770da28@oracle.com> <138023EE-F390-4618-A885-AD6473B593DE@sap.com> <1c264e0b-b497-44f6-b0c6-6d0cf8a6104c@oracle.com> Message-ID: Hi Lutz, Just wondering if we shouldn't fail in CodeHeap::verify() instead of just printing a warning? Also, in heap.cpp:528, why do you need 'res'? You could just update found_block in line 578, right? Best regards, Tobias On 17.05.19 21:27, Vladimir Kozlov wrote: > Hi Lutz, > > Testing passed. > I think you need second review because changes are not trivial. > > Thanks, > Vladimir > > On 5/16/19 4:17 PM, Vladimir Kozlov wrote: >> On 5/16/19 3:16 PM, Schmidt, Lutz wrote: >>> Hi Vladimir, >>> >>> I have implemented that "added safety feature". What it does: >>> ? 
- During reserve() and expand_by(), all newly committed heap memory is initialized with >>> badCodeHeapNewVal. >>> ? - Whenever a FreeBlock is added to the free block list, it's memory (except for the header >>> part) is initialized with badCodeHeapNewVal. >>> ? - During verify(), it is checked that all heap memory in free blocks is initialized as expected. >>> >>> Please find the diff from webrev.01 to webrev.02 attached as text file. The latest full webrev, >>> based on current jdk/jdk, is here: >>> https://cr.openjdk.java.net/~lucy/webrevs/8223444.02/ >> >> Good. I will run testing on it. >> >>> >>> There is a dubious assert in nmethod.cpp:338 (PcDescCache::reset_to()). It implicitly relies on a >>> new HeapBlock to be initialized with a bit pattern forming a negative value when accessed as int. >>> It cost me quite some time to find that out. And it's bad code, in my opinion. I would suggest to >>> just delete that assert. >> >> Yes, I agree. Such implicit dependence is bad and useless. >> We use reset_to() in AOT where scopes should already have valid offsets: >> http://hg.openjdk.java.net/jdk/jdk/file/9feb4852536f/src/hotspot/share/aot/aotCompiledMethod.hpp#l152 >> I am not sure how we pass this assert in AOT case. >> >>> >>> I understand your concerns re my changes to deallocate_tail(). In the beginning, I had the same. >>> But I did some testing/tracing. That additional free block (just one) is consumed by subsequent >>> allocate() calls and eventually vanishes. >>> There is an advantage that comes with my changes: the HeapBlock whose tail is deallocated does no >>> longer need to be the last block before _next_segment. That is a prerequisite if we want to get >>> rid of issues like the one described in JDK-8223770. We could just allocate a generously large >>> block for stubs and at the end deallocate_tail them. >> >> Okay, it sounds good. >> >> Thanks, >> Vladimir >> >>> >>> Sorry for the long text. 
>>> Lutz >>> >>> ?On 15.05.19, 19:00, "Vladimir Kozlov" wrote: >>> >>> ???? On 5/14/19 10:53 PM, Schmidt, Lutz wrote: >>> ???? > Hi Vladimir, >>> ???? > >>> ???? > thank you for your comments. About filling CodeHeap with bad values after split_block: >>> ???? >?? - in deallocate_tail, the leading part must remain intact. It contains valid code. >>> ???? >?? - in search_freelist, one free block is split into two. There I could invalidate the >>> contents of both parts >>> ???? Thank you for explaining. >>> ???? >?? - If you want added safety, wouldn't it then be better to invalidate the block contents >>> during add_to_freelist()? You could then be sure there is no executable code in a free block. >>> ???? Yes, it is preferable. >>> ???? An other note (after looking more on changes). You changed where freed tail goes. Originally >>> it was added to next block >>> ???? _next_segment (make it larger) and you created separate small block. Is not it create more >>> fragmentation? >>> ???? Thanks, >>> ???? Vladimir >>> ???? > >>> ???? > Regards, >>> ???? > Lutz >>> ???? > >>> ???? > On 14.05.19, 23:00, "Vladimir Kozlov" wrote: >>> ???? > >>> ???? >????? On 5/14/19 1:09 PM, Schmidt, Lutz wrote: >>> ???? >????? > Hi Vladimir, >>> ???? >????? > >>> ???? >????? > I had the same thought re atomicity. memset() is not consistent even on one >>> platform. But I believe it's not a factor here. The original code was a byte-by-byte loop. And we >>> have byte atomicity on all supported platforms, even with memset(). >>> ???? >????? > >>> ???? >????? > It's a different thing with sequence of initialization. Do we really depend on >>> byte(i) being initialized before byte(i+1)? If so, we would have a problem even with the explicit >>> byte loop. Not on x86, but on ppc with its weak memory ordering. >>> ???? > >>> ???? >????? Okay, if it is byte copy I am fine with it. >>> ???? > >>> ???? >????? > >>> ???? >????? > About segment map marking: >>> ???? >????? 
> There is a short description how the segment map works in heap.cpp, right before >>> CodeHeap::find_start(). >>> ???? >????? > In short: each segment map element contains an (unsigned) index which, when >>> subtracted from that element index, addresses the segment map element where the heap block >>> starts. Thus, when you re-initialize the tail part of a heap block range to describe a newly >>> formed heap block, the leading part remains valid. >>> ???? >????? > >>> ???? >????? > Segmap? before???????????? after >>> ???? >????? > Index??? split???????????? split >>> ???? >????? >??? I?????? 0 <- block start?? 0 <- block start (now shorter) >>> ???? >????? >??? I+1???? 1????????????????? 1??? each index 0..9 still points >>> ???? >????? >??? I+2???? 2????????????????? 2??? back to the block start >>> ???? >????? >??? I+3???? 3????????????????? 3 >>> ???? >????? >??? I+4???? 4????????????????? 4 >>> ???? >????? >??? I+5???? 5????????????????? 5 >>> ???? >????? >??? I+6???? 6????????????????? 6 >>> ???? >????? >??? I+7???? 7????????????????? 7 >>> ???? >????? >??? I+8???? 8????????????????? 8 >>> ???? >????? >??? I+9???? 9????????????????? 9 >>> ???? >????? >??? I+10??? 10???????????????? 0 <- new block start >>> ???? >????? >??? I+11??? 11???????????????? 1 >>> ???? >????? >??? I+12??? 12???????????????? 2 >>> ???? >????? >??? I+13??? 13???????????????? 3 >>> ???? >????? >??? I+14??? 14???????????????? 4 >>> ???? >????? >??? I+15??? 0 <- block start?? 0 <- block start >>> ???? >????? >??? I+16??? 1????????????????? 1 >>> ???? >????? >??? I+17??? 2????????????????? 2 >>> ???? >????? >??? I+18??? 3????????????????? 3 >>> ???? >????? >??? I+19??? 4????????????????? 4 >>> ???? >????? > >>> ???? >????? > There is a (very short) description about what's happening at the very end of >>> search_freelist(). split_block() is called there as well. Would you like to see a similar comment >>> in deallocate_tail()? >>> ???? > >>> ???? >????? 
>>>     >      Thank you, I forgot that the first block mapping is still valid.
>>>     >
>>>     >      What about storing bad value (in debug mode) only in second part and not both parts?
>>>     >
>>>     >      >
>>>     >      > Once I have your response, I will create a new webrev reflecting your input. I need to do that anyway because the assert in heap.cpp:200 has to go away. It fires spuriously. The checks can't be done at that place. In addition, I will add one line of comment and rename a local variable. That's it.
>>>     >
>>>     >      Okay.
>>>     >
>>>     >      Thanks,
>>>     >      Vladimir
>>>     >
>>>     >      >
>>>     >      > Thanks,
>>>     >      > Lutz
>>>     >      >
>>>     >      >
>>>     >      > On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" wrote:
>>>     >      >
>>>     >      >      Good.
>>>     >      >
>>>     >      >      Do we need to be concerned about atomicity of marking? We know that memset() is not atomic (maybe I am wrong here).
>>>     >      >      Another thing is I did not get the logic in deallocate_tail(). split_block() marks only the second half of split segments as used and (after call) stores bad values in it. What about the first part? Maybe add a comment.
>>>     >      >
>>>     >      >      Thanks,
>>>     >      >      Vladimir
>>>     >      >
>>>     >      >      On 5/14/19 3:47 AM, Schmidt, Lutz wrote:
>>>     >      >      > Dear all,
>>>     >      >      >
>>>     >      >      > May I please request reviews for my change?
>>>     >      >      > Bug:    https://bugs.openjdk.java.net/browse/JDK-8223444
>>>     >      >      > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/
>>>     >      >      >
>>>     >      >      > What this change is all about:
>>>     >      >      > ------------------------------
>>>     >      >      > While working on another topic, I came across the code in share/memory/heap.cpp. I applied some small changes which I would call improvements.
>>>     >      >      >
>>>     >      >      > Furthermore, and in particular with these changes, the platform-specific parameter CodeCacheMinBlockLength should be fine-tuned to minimize the number of residual small free blocks. Heap block allocation does not create free blocks smaller than CodeCacheMinBlockLength. This parameter value should match the minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance for them to vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the beginning of) the free list, slowing down the free block search.
>>>     >      >      >
>>>     >      >      > The following free block counts have been found after running JVM98 with different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown.
>>>     >      >      >
>>>     >      >      > JDK-8223444 not applied
>>>     >      >      > =======================
>>>     >      >      >
>>>     >      >      >         Segment  |  free blocks with CodeCacheMinBlockLength=
>>>     >      >      >           Size   |       1      2      3      4      6      8
>>>     >      >      > -----------------+-------------------------------------------
>>>     >      >      > aarch      128   |       0    153     75     30     38      2
>>>     >      >      > ppc        128   |       0    149     98     59     14      2
>>>     >      >      > ppcle      128   |       0    219    161    110     69     34
>>>     >      >      > s390       256   |       0    142     93     59     30     10
>>>     >      >      > x86        128   |       0    215    157    118     42     11
>>>     >      >      >
>>>     >      >      >
>>>     >      >      > JDK-8223444 applied
>>>     >      >      > ===================
>>>     >      >      >
>>>     >      >      >         Segment  |  free blocks with CodeCacheMinBlockLength=  |  suggested
>>>     >      >      >           Size   |       1      2      3      4      6      8  |   setting
>>>     >      >      > -----------------+---------------------------------------------+------------
>>>     >      >      > aarch      128   |     221    115     80     36      7      1  |     6
>>>     >      >      > ppc        128   |     245    152    101     54     14      4  |     6
>>>     >      >      > ppcle      128   |     243    144     89     72     20      5  |     6
>>>     >      >      > s390       256   |     168     60     67      8      6      2  |     4
>>>     >      >      > x86        128   |     223    139     83     50     11      2  |     6
>>>     >      >      >
>>>     >      >      > Thank you for your time and opinion!
>>>     >      >      > Lutz
>>>     >      >      >
>>>     >      >      >
>>>     >      >      >
>>>     >      >      >
>>>     >      >
>>>     >      >
>>>     >
>>>     >
>>>

From leonid.mesnik at oracle.com Mon May 20 08:12:46 2019
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Mon, 20 May 2019 01:12:46 -0700
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To:
References:
Message-ID:

The failure is still reproduced with patch. I attached full hs_err to the bug.
hs_err # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (open/src/hotspot/share/opto/bytecodeInfo.cpp:343), pid=3096, tid=3128 # assert(profile_count == 0) failed: sanity # # JRE version: Java(TM) SE Runtime Environment (13.0) (fastdebug build 13-internal+0-2019-05-18-0457052.lmesnik.null) # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 13-internal+0-2019-05-18-0457052.lmesnik.null, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) # Problematic frame: # V [libjvm.so+0x6cbf6c] InlineTree::is_not_reached(ciMethod*, ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc # # Core dump will be written. Default location: Core dumps may be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h" (or dumping to /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kit\ chensink14D_java/scratch/0/core.3096) # # If you would like to submit a bug report, please visit: # http://bugreport.java.com/bugreport/crash.jsp # --------------- S U M M A R Y ------------ Command Line: -Xbootclasspath/a:. 
-XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI -XX:MaxRAMPercentage=12 -XX:+DeoptimizeALot -XX:MaxRAMPercentage=50 -XX:+HeapDumpOnOutOfMemoryError -XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false -XX:+DisplayVMOutputToS\ tderr -XX:+UsePerfData -Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags -XX:+DisableExplicitGC -XX:+StartAttachListener -XX:NativeMemoryTracking=detail -XX:+FlightRecorder --add-exports=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAME\ D --add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED --add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED -Djava.io.tmpdir=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applicatio\ ns_kitchensink_Kitchensink14D_java/scratch/0/java.io.tmpdir -Duser.home=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/user.home -agentpath:/scratch/lmesnik/ws/ks-apps/build/\ linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so applications.kitchensink.process.stress.Main /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/kitchensink.fin\ al.properties Host: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz, 4 cores, 14G, Oracle Linux Server release 7.5 Time: Sun May 19 05:15:11 2019 PDT elapsed time: 111312 seconds (1d 6h 55m 12s) --------------- T H R E A D --------------- Current thread (0x00002ae4e83bc000): JavaThread "C2 CompilerThread0" daemon [_thread_in_native, id=3128, stack(0x00002ae522de2000,0x00002ae522ee3000)] Current CompileTask: C2:111312036 146944 4 spec.benchmarks.derby.DerbyHarness$Client::handleResultSet (77 bytes) Stack: [0x00002ae522de2000,0x00002ae522ee3000], sp=0x00002ae522edf650, free space=1013k Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM 
code, C=native code) V [libjvm.so+0x6cbf6c] InlineTree::is_not_reached(ciMethod*, ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc V [libjvm.so+0x6d12e0] InlineTree::ok_to_inline(ciMethod*, JVMState*, ciCallProfile&, WarmCallInfo*, bool&)+0x1950 V [libjvm.so+0xb6e075] Compile::call_generator(ciMethod*, int, bool, JVMState*, bool, float, ciKlass*, bool, bool)+0x905 V [libjvm.so+0xb6f6b9] Parse::do_call()+0x469 V [libjvm.so+0x1441b70] Parse::do_one_bytecode()+0xff0 V [libjvm.so+0x1432520] Parse::do_one_block()+0x650 V [libjvm.so+0x1432a23] Parse::do_all_blocks()+0x113 V [libjvm.so+0x14348e4] Parse::Parse(JVMState*, ciMethod*, float)+0xc54 V [libjvm.so+0x803d0c] ParseGenerator::generate(JVMState*)+0x18c V [libjvm.so+0x9c08b4] Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool, bool, DirectiveSet*)+0xe74 V [libjvm.so+0x801d9d] C2Compiler::compile_method(ciEnv*, ciMethod*, int, DirectiveSet*)+0x10d V [libjvm.so+0x9cd17d] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x46d V [libjvm.so+0x9ce1d8] CompileBroker::compiler_thread_loop()+0x418 V [libjvm.so+0x16c0baa] JavaThread::thread_main_inner()+0x26a V [libjvm.so+0x16c9267] JavaThread::run()+0x227 V [libjvm.so+0x16c62f6] Thread::call_run()+0xf6 V [libjvm.so+0x13e0d5e] thread_native_entry(Thread*)+0x10e Leonid > On May 18, 2019, at 5:40 PM, Jie Fu wrote: > > Thanks Vladimir Ivanov and Vladimir Kozlov for your review. > Let's wait for Leonid's test result. > > Thanks. > Best regards, > Jie > > On 2019?05?19? 00:15, Vladimir Kozlov wrote: >> Hi Jie, >> >> So the counter was incremented while this code is executed. And you fixed it by caching initial value. >> Looks good. >> >> Thanks, >> Vladimir >> >> On 5/17/19 6:37 PM, Jie Fu wrote: >>> Hi all, >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8224162 >>> Webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.00/ >>> >>> I'm sorry to introduce this assertion failure. >>> Please review the suggested fix and give me some advice. 
>>>
>>> Leonid, could you please help to test the patch?
>>> I don't have the reproducer you mentioned in the JBS.
>>>
>>> Thanks a lot.
>>> Best regards,
>>> Jie
>>>
>>>
>
>

From lutz.schmidt at sap.com Mon May 20 08:22:30 2019
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Mon, 20 May 2019 08:22:30 +0000
Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output
In-Reply-To: <37865A98-264E-4775-9366-194A804AE5AA@oracle.com>
References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com> <2ffc4c9c-91cb-2d04-e03e-6620d4443034@oracle.com> <06ce086b-43f9-b570-8b97-55c7c14745a0@oracle.com> <29AEA376-56DC-47A5-8935-9EE700C6345E@sap.com> <8b68b45b-acf5-daf2-be94-0bacac917aac@oracle.com> <7938CE23-798C-4484-9A75-BFC40C297FDD@sap.com> <37865A98-264E-4775-9366-194A804AE5AA@oracle.com>
Message-ID: <3A11FD93-C2E0-48BE-B927-D718D58B50C6@sap.com>

Hi Vladimir,

I appreciate the time you spent making this change "push-worthy". Thank you for reviewing it!

I am now hoping for a second reviewer to set aside some time and have a look.

Thanks,
Lutz

On 17.05.19, 16:48, "Vladimir Kozlov" wrote:

    Good.
Thanks Vladimir > On May 17, 2019, at 6:19 AM, Schmidt, Lutz wrote: > > Hi Vladimir, > here is what I changed to overcome the ZERO build issues: > > ----------- 8< ---------------- > diff -r c0568c492760 src/hotspot/cpu/zero/assembler_zero.hpp > --- a/src/hotspot/cpu/zero/assembler_zero.hpp Fri May 17 14:11:44 2019 +0200 > +++ b/src/hotspot/cpu/zero/assembler_zero.hpp Fri May 17 14:14:06 2019 +0200 > @@ -37,6 +37,12 @@ > > public: > void pd_patch_instruction(address branch, address target, const char* file, int line); > + > + //---< calculate length of instruction >--- > + static unsigned int instr_len(unsigned char *instr) { return 1; } > + > + //---< longest instructions >--- > + static unsigned int instr_maxlen() { return 1; } > }; > > class MacroAssembler : public Assembler { > diff -r c0568c492760 src/hotspot/share/compiler/abstractDisassembler.cpp > --- a/src/hotspot/share/compiler/abstractDisassembler.cpp Fri May 17 14:11:44 2019 +0200 > +++ b/src/hotspot/share/compiler/abstractDisassembler.cpp Fri May 17 14:14:06 2019 +0200 > @@ -61,6 +61,9 @@ > bool AbstractDisassembler::_show_bytes = false; // set "true" to see what's in memory bit by bit > // might prove cumbersome because instr_len is hard to find on x86 > #endif > +#if defined(ZERO) > +bool AbstractDisassembler::_show_bytes = false; // set "true" to see what's in memory bit by bit > +#endif > > // Return #bytes printed. Callers may use that for output alignment. > // Print instruction address, and offset from blob begin. 
> ----------- >8 ---------------- > > This delta is contained as the only change in the new webrev#04 which is based on the current (13:10 GMT) jdk/jdk repo: > https://cr.openjdk.java.net/~lucy/webrevs/8213084.04/ > > Regards, > Lutz > > > On 16.05.19, 21:40, "Vladimir Kozlov" wrote: > > I am not sure about exact parameters but I see build testing uses next: > > configure --with-jvm-variants=zero --with-jvm-features=-shenandoahgc > > Vladimir > >> On 5/16/19 12:22 PM, Schmidt, Lutz wrote: >> Hi Vladimir, >> >> thanks for the extensive testing. And sorry for me neglecting ZERO. I will add a dummy instr_len() function. I saw another potential issue. There is no static initializer for AbstractDisassembler::_show_bytes. What is the correct macro to test for ZERO? Is it just "#ifdef ZERO"? >> >> I will prepare a new webrev with just these two additions as delta. But it'll be not before Friday morning, my time. >> >> Thanks, >> Lutz >> >> On 16.05.19, 20:38, "Vladimir Kozlov" wrote: >> >> linux-x64-zero build is broke: >> >> workspace/open/src/hotspot/share/compiler/abstractDisassembler.cpp:332:42: error: 'instr_len' is not a member of 'Assembler' >> int instr_size_in_bytes = Assembler::instr_len(pos); >> ^~~~~~~~~ >> Other builds and testing are good. >> >> Thanks, >> Vladimir >> >>> On 5/16/19 9:47 AM, Vladimir Kozlov wrote: >>> Nice. >>> >>> I submitted our tier1-3 testing. >>> >>> Thanks, >>> Vladimir >>> >>>> On 5/16/19 2:55 AM, Schmidt, Lutz wrote: >>>> Hi Vladimir, >>>> >>>> sorry for the delayed reaction on your comments. >>>> >>>> - now it reads "static unsigned int instr_len()". This change added cpu/s390/assembler_s390.inline.hpp to the list >>>> of modified files. >>>> - testing from my side will be via the submit repo (BuildId: 2019-05-15-1543576.lutz.schmidt.source, no failures). In >>>> addition, I added the patch to our internal builds so that our inhouse testing will cover it (no issues detected last >>>> night). 
>>>> - All the "hsdis-" prefixes in the PrintAssemblyOptions are gone, as are "print-pc" and "print-bytes". The latter two were legacy anyway. I kept them for compatibility. But now, without the prefix, there is no compatibility anymore.
>>>> - Options parsing improvement will be done in a separate effort. I have created JDK-8223765 for that.
>>>> - there is a new webrev, based on the current jdk/jdk repo: https://cr.openjdk.java.net/~lucy/webrevs/8213084.03/
>>>>
>>>> ~thartmann:
>>>> The disabled code in disassembler_s390.cpp is something I would like to have. So far, I could not find time to make it work reliably. I would like to keep it in as a reminder and a template to build on.
>>>>
>>>> Thanks,
>>>> Lutz
>>>>
>>>> On 10.05.19, 23:16, "Vladimir Kozlov" wrote:
>>>>
>>>> Hi Lutz,
>>>> My comments are inlined below.
>>>>> On 5/10/19 8:44 AM, Schmidt, Lutz wrote:
>>>>> Thank you, Vladimir!
>>>>> Please find my comments inline and let me know what you think.
>>>>> A new webrev with all the updates is here: https://cr.openjdk.java.net/~lucy/webrevs/8213084.02/
>>>> Found one more I missed last time:
>>>> assembler_s390.hpp: still signed return (on other platforms it was converted to unsigned):
>>>> static int instr_len(unsigned char *instr);
>>>>> Please note: the webrev is not based on the most current jdk/jdk! I do not like the idea to "hg pull -u" to a repo state which is known to be broken. Once jdk/jdk is repaired, I will update the webrev in-place (provided there were no serious clashes) and send a short note.
>>>> NP. Please, provide final webrev when you can so that I can run these changes through our testing to make sure no issues are present (especially in builds).
>>>>> Regards,
>>>>> Lutz
>>>>>
>>>>> On 09.05.19, 21:30, "Vladimir Kozlov" wrote:
>>>>>
>>>>> Hi Lutz,
>>>>>
>>>>> Thank you for doing this great work.
>>>>>
>>>>> I have just small comments:
>>>>>
>>>>> x86_64.ad - empty change.
>>>>> File contains whitespace changes for formatting. Not visible in webrev. >>>> Okay. >>>>> >>>>> nmethod.cpp - LUCY? >>>>> >>>>> + st->print_cr("LUCY: NULL-oop"); >>>>> + tty->print("LUCY NULL-oop"); >>>>> Oops. Leftover debugging output. Removed. Reads "NULL-oop" now. >>>> Okay. >>>>> >>>>> nmethod.cpp - use PTR64_FORMAT instead of '0x%016lx'. >>>>> Changed. >>>>> >>>>> vmreg.cpp - Use INTPTR_FORMAT instead of %ld for value(). >>>>> Changed. >>>>> >>>>> disassembler.* - LUCY_OBSOLETE? >>>>> >>>>> +#if defined(LUCY_OBSOLETE) // Used in SAPJVM only >>>>> This is fancy code to step backwards in CISC instructions. Used to print a +/- range around a given instruction >>>> address. Works reasonably well on s390, will probably not work at all for x86. I could not finally decide to kick it >>>> out. But now I did. It's gone. >>>> Okay. >>>>> >>>>> compilerDefinitions.hpp - I don't see where tier_digit() is used. >>>>> I'm surprised myself. Introduced it and then made it obsolete. It's gone. >>>>> >>>>> disassembler.cpp - PrintAssemblyOptions. Why you need to have 'hsdis-' in all options values? You >>>>> need to check for invalid value and print help output in such case - it will be very useful if you >>>>> forgot a value spelling. Also add line for 'help' value. >>>>> >>>>> The hsdis- prefix existed before I started my work. I just kept it to not hurt anybody's feelings__. Actually, >>>> the prefix has a minor practical use. It guards the many "if (strstr(..." instructions from being executed if there is >>>> no use. I'm personally not emotionally attached to the hsdis- prefix. I can remove it if you (and the other reviewers) >>>> like. Not changed as of now. Awaiting your input. >>>> It is a pain to type long values and annoying to type the same prefix. I think hsdis- prefix is >>>> useless because PrintAssemblyOptions is used only for disassembler and there are no values which >>>> don't have hsdis- prefix. 
This is not performance critical code to have a guard (check prefix). >>>> And an other commented new line: >>>> + // ost->print_cr("PrintAssemblyOptions='%s'", options()); >>>>> >>>>> Printing help text: There is an option (hsdis-help) to request help text printout. > >>>>> Options parsing doesn't exist here. It's just string comparisons. If one of the predefined strings is found - >>>> fine. If not - so what. If you would like to detect unrecognized input, process_options() needs significantly more >>>> intelligence. I can do that, but would like to do it in a separate effort. Your opinion? >>>> Got it. I forgot that PrintAssemblyOptions flag accepts string with *list* of values - you can't use >>>> if-else or switch without complicating the code. >>>> I noticed that PrintAssemblyOptions is defined as ccstr. Why it is not ccstrlist which should be use >>>> here? I don't think next comment is correct for ccstr type: >>>> http://hg.openjdk.java.net/jdk/jdk/file/ef73702a906e/src/hotspot/share/compiler/disassembler.cpp#l190 >>>> It would be nice to fix it but you can do it later if you don't want to add more changes. >>>>> >>>>> Do you need next commented lines: >>>>> >>>>> disassembler.cpp - >>>>> +// ptrdiff_t _offset; >>>>> Deleted. >>>>> >>>>> +// Output suppressed because it messes up disassembly. >>>>> +// output()->print_cr("[Disassembling for mach='%s']", (const char*)arg); >>>>> Uncommented, would like to keep it. Made the if condition permanently false. >>>>> >>>>> disassembler_s390.cpp - >>>>> +// st->fill_to(((st->position()+3*tsize-1)/tsize)*tsize); >>>>> Deleted. >>>>> >>>>> compile.cpp - >>>>> +// st->print("# "); _tf->dump_on(st); st->cr(); >>>>> Uncommented. >>>>> >>>>> >>>>> abstractDisassembler.cpp - >>>>> // st->print("0x%016lx", *((julong*)here)); >>>>> st->print("0x%016lx", *((uintptr_t*)here)); >>>>> // st->print("0x%08x%08x", *((juint*)here), *((juint*)(here+4))); >>>>> Commented lines are gone. 
>>>>> >>>>> abstractDisassembler.cpp - may be explicit cast (byte*)?: >>>>> >>>>> st->print("%2.2x", *byte); >>>>> st->print("%2.2x", *pos); >>>>> st->print("0x%02x", *here); >>>>> Didn't see the need because the pointers are char* (= address) anyway. And, according to cppreference.com, >>>> std::byte is a C++17 feature. We are not there yet. >>>> okay >>>>> >>>>> PTR64_FORMAT ?: >>>>> st->print("0x%016lx", *((uintptr_t*)here)); >>>>> I'm kind of hesitant on that. Nice output alignment clearly depends on this to output exactly 18 characters. >>>> Changed other occurrences, so I changed this one as well. >>>> Thanks, >>>> Vladimir >>>>> >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>>> On 5/8/19 8:31 AM, Schmidt, Lutz wrote: >>>>>> Dear Community, >>>>>> >>>>>> may I please request comments and reviews for this change? Thank you! >>>>>> >>>>>> I have created a new webrev which is based on the current jdk/jdk repo. There was some merge effort. The >>>> code which constitutes this patch was not changed. Here's the webrev link: >>>>>> https://cr.openjdk.java.net/~lucy/webrevs/8213084.01/ >>>>>> >>>>>> Regards, >>>>>> Lutz >>>>>> >>>>>> On 11.04.19, 23:24, "Schmidt, Lutz" wrote: >>>>>> >>>>>> Dear All, >>>>>> >>>>>> this topic was discussed back in Nov/Dec 2018: >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-November/031552.html >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-December/031641.html >>>>>> >>>>>> Purpose of the discussion was to find out if my ideas are at all regarded useful and desirable. >>>>>> The result was mixed, some pro, some con. I let the input from back then influence my work of the >>>> last months. In particular, output verbosity can be controlled in a wide range now. In addition to the general >>>> -XX:+Print* switches, the amount of output can be adjusted by newly introduced -XX:PrintAssemblyOptions. 
Here is the >>>> list (with default settings): >>>>>> >>>>>> PrintAssemblyOptions help: >>>>>> hsdis-print-raw test plugin by requesting raw output (deprecated) >>>>>> hsdis-print-raw-xml test plugin by requesting raw xml (deprecated) >>>>>> hsdis-print-pc turn off PC printing (on by default) (deprecated) >>>>>> hsdis-print-bytes turn on instruction byte output (deprecated) >>>>>> >>>>>> hsdis-show-pc toggle printing current pc, currently ON >>>>>> hsdis-show-offset toggle printing current offset, currently OFF >>>>>> hsdis-show-bytes toggle printing instruction bytes, currently OFF >>>>>> hsdis-show-data-hex toggle formatting data as hex, currently ON >>>>>> hsdis-show-data-int toggle formatting data as int, currently OFF >>>>>> hsdis-show-data-float toggle formatting data as float, currently OFF >>>>>> hsdis-show-structs toggle compiler data structures, currently OFF >>>>>> hsdis-show-comment toggle instruction comments, currently OFF >>>>>> hsdis-show-block-comment toggle block comments, currently OFF >>>>>> hsdis-align-instr toggle instruction alignment, currently OFF >>>>>> >>>>>> Finally, I have pushed my changes to a state where I can dare to request your comments and reviews. >>>> I would like to suggest and request that we first focus on the effects (i.e. the generated output) of the changes. >>>> Once we got that adjusted and accepted, we can check the actual implementation and add improvements there. Sounds like >>>> a plan? Here is what you get: >>>>>> >>>>>> The machine code generated by the JVM can be printed in three different formats: >>>>>> - Hexadecimal. >>>>>> This is basically a hex dump of the memory range containing the code. >>>>>> This format is always available (PRODUCT and not-PRODUCT builds), regardless >>>>>> of the availability of a disassembler library. It applies to all sorts of >>>>>> code, be it blobs, stubs, compiled nmethods, ... >>>>>> This format seems useless at first glance, but it is not. 
In an upcoming, >>>>>> separate enhancement, the JVM will be made capable of reading files >>>>>> containing such code blocks and disassembling them post mortem. The most >>>>>> prominent example is an hs_err* file. >>>>>> - Disassembled. >>>>>> This is an assembly listing of the instructions as found in the memory range >>>>>> occupied by the blob, stub, compiled nmethod ... As a prerequisite, a suitable >>>>>> disassembler library (hsdis-.so) must be available at runtime. >>>>>> Most often, that will only be the case in test environments. If no disassembler >>>>>> library is available, hexadecimal output is used as fallback. >>>>>> - OptoAssembly. >>>>>> This is a meta code listing created only by the C2 compiler. As it is somewhat >>>>>> closer to the Java code, it may be helpful in linking assembly code to Java code. >>>>>> >>>>>> All three formats can be merged with additional information, most prominently compiler-internal >>>> "knowledge" about blocks, related bytecodes, statistics counters, and much more. >>>>>> >>>>>> Following the code itself, compiler-internal data structures, like oop maps, relocations, scopes, >>>> dependencies, exception handlers, are printed to aid in debugging. >>>>>> >>>>>> The full set of information is available in non-PRODUCT builds. PRODUCT builds do not support >>>> OptoAssembly output. Data structures are unavailable as well. >>>>>> >>>>>> So how does the output actually look like? Here are a few small snippets (linuxx86_64) to give you >>>> an idea. The complete output of an entire C2-compiled method, in multiple verbosity variants, is available here: >>>>>> http://cr.openjdk.java.net/~lucy/webrevs/8213084/ >>>>>> >>>>>> OptoAssembly output for reference (always on with PrintAssembly): >>>>>> ================================================================= >>>>>> >>>>>> 036 B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 >>>>>> 036 movl RBP, [RSI + #12 (8-bit)] # compressed ptr ! 
Field: java/lang/String.value >>>> (constant) >>>>>> 039 movl R11, [RBP + #12 (8-bit)] # range >>>>>> 03d NullCheck RBP >>>>>> >>>>>> 03d B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 >>>>>> 03d cmpl RDX, R11 # unsigned >>>>>> 040 jnb,us B6 P=0.000000 C=5375.000000 >>>>>> >>>>>> PrintAssembly with no disassembler library available: >>>>>> ===================================================== >>>>>> >>>>>> [Code] >>>>>> [Entry Point] >>>>>> 0x00007fc74d1d7b20: 448b 5608 49c1 e203 493b c20f 856f 69e7 ff90 9090 9090 9090 9090 9090 9090 9090 >>>>>> [Verified Entry Point] >>>>>> 0x00007fc74d1d7b40: 8984 2400 a0fe ff55 4883 ec20 440f be5e 1445 85db 7521 8b6e 0c44 8b5d 0c41 3bd3 >>>>>> 0x00007fc74d1d7b60: 732c 0fb6 4415 1048 83c4 205d 4d8b 9728 0100 0041 8502 c348 8bee 8914 2444 895c >>>>>> 0x00007fc74d1d7b80: 2404 be4d ffff ffe8 1483 e7ff 0f0b bee5 ffff ff89 5424 04e8 0483 e7ff 0f0b bef6 >>>>>> 0x00007fc74d1d7ba0: ffff ff89 5424 04e8 f482 e7ff 0f0b f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 >>>>>> [Exception Handler] >>>>>> 0x00007fc74d1d7bc0: e95b 0df5 ffe8 0000 0000 4883 2c24 05e9 0c7d e7ff >>>>>> [End] >>>>>> >>>>>> PrintAssembly with minimal verbosity: >>>>>> ===================================== >>>>>> >>>>>> 0x00007f0434b89bd6: mov 0xc(%rsi),%ebp >>>>>> 0x00007f0434b89bd9: mov 0xc(%rbp),%r11d >>>>>> 0x00007f0434b89bdd: cmp %r11d,%edx >>>>>> 0x00007f0434b89be0: jae 0x00007f0434b89c0e >>>>>> >>>>>> PrintAssembly (previous plus code offsets from code begin): >>>>>> =========================================================== >>>>>> >>>>>> 0x00007f63c11d7956 (+0x36): mov 0xc(%rsi),%ebp >>>>>> 0x00007f63c11d7959 (+0x39): mov 0xc(%rbp),%r11d >>>>>> 0x00007f63c11d795d (+0x3d): cmp %r11d,%edx >>>>>> 0x00007f63c11d7960 (+0x40): jae 0x00007f63c11d798e >>>>>> >>>>>> PrintAssembly (previous plus block comments): >>>>>> =========================================================== >>>>>> >>>>>> ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 >>>>>> 0x00007f48211d76d6 (+0x36): mov 
0xc(%rsi),%ebp >>>>>> 0x00007f48211d76d9 (+0x39): mov 0xc(%rbp),%r11d >>>>>> ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 >>>>>> 0x00007f48211d76dd (+0x3d): cmp %r11d,%edx >>>>>> 0x00007f48211d76e0 (+0x40): jae 0x00007f48211d770e >>>>>> >>>>>> PrintAssembly (previous plus instruction comments): >>>>>> =========================================================== >>>>>> >>>>>> ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 >>>>>> 0x00007fc3e11d7a56 (+0x36): mov 0xc(%rsi),%ebp ;*getfield value {reexecute=0 >>>> rethrow=0 return_oop=0} >>>>>> ; - java.lang.String::charAt at 8 >>>> (line 702) >>>>>> 0x00007fc3e11d7a59 (+0x39): mov 0xc(%rbp),%r11d ; implicit exception: dispatches to >>>> 0x00007fc3e11d7a9e >>>>>> ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 >>>>>> 0x00007fc3e11d7a5d (+0x3d): cmp %r11d,%edx >>>>>> 0x00007fc3e11d7a60 (+0x40): jae 0x00007fc3e11d7a8e >>>>>> >>>>>> For completeness, here are the links to >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213084 >>>>>> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8213084.00/ >>>>>> >>>>>> But please, as mentioned above, first focus on the output. The nitty details of the implementation >>>> I would like to discuss after the output format has received some support. >>>>>> >>>>>> Thank you so much for your time! 
>>>>>> Lutz
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>
>>
>
>

From lutz.schmidt at sap.com Mon May 20 08:22:30 2019
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Mon, 20 May 2019 08:22:30 +0000
Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management
In-Reply-To: <1c264e0b-b497-44f6-b0c6-6d0cf8a6104c@oracle.com>
References: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com> <9D607D81-5A25-406B-B05B-7D9C9C733D8F@sap.com> <5ec26a0b-0a49-c458-8d90-aa92396610a5@oracle.com> <65F5D580-6B68-4422-AAE6-3406D7FCDE7A@sap.com> <48fb591a-6055-c1c2-b052-6d8bc770da28@oracle.com> <138023EE-F390-4618-A885-AD6473B593DE@sap.com> <1c264e0b-b497-44f6-b0c6-6d0cf8a6104c@oracle.com>
Message-ID: <5D17D33C-EBAF-496D-B4E8-B7375247D6E8@sap.com>

Hi Vladimir,

thank you for your additional thoughts and for the review. And yes, I assumed I would need a second review.

So I'm sitting here waiting patiently for a second reviewer.

Thanks,
Lutz

On 17.05.19, 21:27, "Vladimir Kozlov" wrote:

    Hi Lutz,

    Testing passed.

    I think you need second review because changes are not trivial.

    Thanks,
    Vladimir

    On 5/16/19 4:17 PM, Vladimir Kozlov wrote:
    > On 5/16/19 3:16 PM, Schmidt, Lutz wrote:
    >> Hi Vladimir,
    >>
    >> I have implemented that "added safety feature". What it does:
    >>   - During reserve() and expand_by(), all newly committed heap memory is initialized with badCodeHeapNewVal.
    >>   - Whenever a FreeBlock is added to the free block list, its memory (except for the header part) is initialized with badCodeHeapNewVal.
    >>   - During verify(), it is checked that all heap memory in free blocks is initialized as expected.
    >>
    >> Please find the diff from webrev.01 to webrev.02 attached as text file. The latest full webrev, based on current jdk/jdk, is here:
    >> https://cr.openjdk.java.net/~lucy/webrevs/8223444.02/
    >
    > Good. I will run testing on it.
    >
    >>
    >> There is a dubious assert in nmethod.cpp:338 (PcDescCache::reset_to()).
It implicitly relies on a new HeapBlock to be >> initialized with a bit pattern forming a negative value when accessed as int. It cost me quite some time to find that >> out. And it's bad code, in my opinion. I would suggest to just delete that assert. > > Yes, I agree. Such implicit dependence is bad and useless. > We use reset_to() in AOT where scopes should already have valid offsets: > http://hg.openjdk.java.net/jdk/jdk/file/9feb4852536f/src/hotspot/share/aot/aotCompiledMethod.hpp#l152 > I am not sure how we pass this assert in AOT case. > >> >> I understand your concerns re my changes to deallocate_tail(). In the beginning, I had the same. But I did some >> testing/tracing. That additional free block (just one) is consumed by subsequent allocate() calls and eventually >> vanishes. >> There is an advantage that comes with my changes: the HeapBlock whose tail is deallocated does no longer need to be >> the last block before _next_segment. That is a prerequisite if we want to get rid of issues like the one described in >> JDK-8223770. We could just allocate a generously large block for stubs and at the end deallocate_tail them. > > Okay, it sounds good. > > Thanks, > Vladimir > >> >> Sorry for the long text. >> Lutz >> >> On 15.05.19, 19:00, "Vladimir Kozlov" wrote: >> >> On 5/14/19 10:53 PM, Schmidt, Lutz wrote: >> > Hi Vladimir, >> > >> > thank you for your comments. About filling CodeHeap with bad values after split_block: >> > - in deallocate_tail, the leading part must remain intact. It contains valid code. >> > - in search_freelist, one free block is split into two. There I could invalidate the contents of both parts >> Thank you for explaining. >> > - If you want added safety, wouldn't it then be better to invalidate the block contents during >> add_to_freelist()? You could then be sure there is no executable code in a free block. >> Yes, it is preferable. >> An other note (after looking more on changes). You changed where freed tail goes. 
Originally it was added to next >> block >> _next_segment (make it larger) and you created separate small block. Doesn't that create more fragmentation? >> Thanks, >> Vladimir >> > >> > Regards, >> > Lutz >> > >> > On 14.05.19, 23:00, "Vladimir Kozlov" wrote: >> > >> > On 5/14/19 1:09 PM, Schmidt, Lutz wrote: >> > > Hi Vladimir, >> > > >> > > I had the same thought re atomicity. memset() is not consistent even on one platform. But I believe it's >> not a factor here. The original code was a byte-by-byte loop. And we have byte atomicity on all supported platforms, >> even with memset(). >> > > >> > > It's a different thing with the sequence of initialization. Do we really depend on byte(i) being initialized >> before byte(i+1)? If so, we would have a problem even with the explicit byte loop. Not on x86, but on ppc with its >> weak memory ordering. >> > >> > Okay, if it is byte copy I am fine with it. >> > >> > > >> > > About segment map marking: >> > > There is a short description of how the segment map works in heap.cpp, right before CodeHeap::find_start(). >> > > In short: each segment map element contains an (unsigned) index which, when subtracted from that element's >> index, addresses the segment map element where the heap block starts. Thus, when you re-initialize the tail part of a >> heap block range to describe a newly formed heap block, the leading part remains valid.
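As a minimal sketch of the back-pointer scheme just described (an illustration under simplifying assumptions, not the HotSpot code): the names `SegMap`, `mark_block`, `find_start` and `split_block` are hypothetical, and every element here stores the full distance to its block start, whereas the real segment map in heap.cpp may use a more compact encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of a CodeHeap segment map.
// Assumption: element i holds the distance from segment i back to the
// first segment of the heap block it belongs to.
using SegMap = std::vector<std::size_t>;

// Describe segments [beg, end) as one heap block.
void mark_block(SegMap& map, std::size_t beg, std::size_t end) {
  assert(beg <= end && end <= map.size());
  for (std::size_t i = beg; i < end; ++i) {
    map[i] = i - beg;  // 0 at the block start, then 1, 2, ...
  }
}

// Index of the first segment of the block that contains segment i.
std::size_t find_start(const SegMap& map, std::size_t i) {
  assert(i < map.size());
  return i - map[i];
}

// Split a block that ends at segment end - 1 into two blocks at split_at.
// Only the tail [split_at, end) is re-marked; the leading part still
// points back to the old block start and needs no update.
void split_block(SegMap& map, std::size_t split_at, std::size_t end) {
  mark_block(map, split_at, end);
}
```

For a 15-segment block at indexes 0..14 that is split at index 10, `find_start` keeps returning 0 for indexes 0..9 and returns 10 for indexes 10..14, which corresponds to the before/after columns of the table that follows in the mail.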
>> > >
>> > >   Segmap    before               after
>> > >   Index     split                split
>> > >   I          0 <- block start     0 <- block start (now shorter)
>> > >   I+1        1                    1    each index 0..9 still points
>> > >   I+2        2                    2    back to the block start
>> > >   I+3        3                    3
>> > >   I+4        4                    4
>> > >   I+5        5                    5
>> > >   I+6        6                    6
>> > >   I+7        7                    7
>> > >   I+8        8                    8
>> > >   I+9        9                    9
>> > >   I+10      10                    0 <- new block start
>> > >   I+11      11                    1
>> > >   I+12      12                    2
>> > >   I+13      13                    3
>> > >   I+14      14                    4
>> > >   I+15       0 <- block start     0 <- block start
>> > >   I+16       1                    1
>> > >   I+17       2                    2
>> > >   I+18       3                    3
>> > >   I+19       4                    4
>> > > >> > > There is a (very short) description about what's happening at the very end of search_freelist(). >> split_block() is called there as well. Would you like to see a similar comment in deallocate_tail()? >> > >> > Thank you, I forgot about that first block mapping is still valid. >> > >> > What about storing bad value (in debug mode) only in second part and not both parts? >> > >> > > >> > > Once I have your response, I will create a new webrev reflecting your input. I need to do that anyway >> because the assert in heap.cpp:200 has to go away. It fires spuriously. The checks can't be done at that place. In >> addition, I will add one line of comment and rename a local variable. That's it. >> > >> > Okay. >> > >> > Thanks, >> > Vladimir >> > >> > > >> > > Thanks, >> > > Lutz >> > > >> > > >> > > On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" >> wrote: >> > > >> > > Good. >> > > >> > > Do we need to be concern about atomicity of marking? We know that memset() is not atomic (may be I >> am wrong here). >> > > An other thing is I did not get logic in deallocate_tail(). split_block() marks only second half of >> split segments as >> > > used and (after call) store bad values in it. What about first part? May be add comment. >> > > >> > > Thanks, >> > > Vladimir >> > > >> > > On 5/14/19 3:47 AM, Schmidt, Lutz wrote: >> > > > Dear all, >> > > > >> > > > May I please request reviews for my change?
>> > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8223444 >> > > > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/ >> > > > >> > > > What this change is all about: >> > > > ------------------------------ >> > > > While working on another topic, I came across the code in share/memory/heap.cpp. I applied some >> small changes which I would call improvements. >> > > > >> > > > Furthermore, and in particular with these changes, the platform-specific parameter >> CodeCacheMinBlockLength should be fine-tuned to minimize the number of residual small free blocks. Heap block >> allocation does not create free blocks smaller than CodeCacheMinBlockLength. This parameter value should match the >> minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance >> for them to vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the beginning of) >> the free list, slowing down the free block search. >> > > > >> > > > The following free block counts have been found after running JVM98 with different >> CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown.
>> > > >
>> > > > JDK-8223444 not applied
>> > > > =======================
>> > > >
>> > > >          Segment |  free blocks with CodeCacheMinBlockLength=
>> > > >             Size |      1      2      3      4      6      8
>> > > > -----------------+-------------------------------------------
>> > > > aarch        128 |      0    153     75     30     38      2
>> > > > ppc          128 |      0    149     98     59     14      2
>> > > > ppcle        128 |      0    219    161    110     69     34
>> > > > s390         256 |      0    142     93     59     30     10
>> > > > x86          128 |      0    215    157    118     42     11
>> > > >
>> > > >
>> > > > JDK-8223444 applied
>> > > > ===================
>> > > >
>> > > >          Segment |  free blocks with CodeCacheMinBlockLength=  | suggested
>> > > >             Size |      1      2      3      4      6      8   | setting
>> > > > -----------------+---------------------------------------------+------------
>> > > > aarch        128 |    221    115     80     36      7      1   |     6
>> > > > ppc          128 |    245    152    101     54     14      4   |     6
>> > > > ppcle        128 |    243    144     89     72     20      5   |     6
>> > > > s390         256 |    168     60     67      8      6      2   |     4
>> > > > x86          128 |    223    139     83     50     11      2   |     6
>> > > > >> > > > Thank you for your time and opinion! >> > > > Lutz >> > > > >> > > > >> > > > >> > > > >> > > >> > > >> > >> > >> From fujie at loongson.cn Mon May 20 08:33:47 2019 From: fujie at loongson.cn (Jie Fu) Date: Mon, 20 May 2019 16:33:47 +0800 Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached In-Reply-To: References: Message-ID: Thank you, Leonid. Bad news. Well, I think there are two cases which may still cause the assertion to fail: -1) profile.count() had been overflowed -2) profile.count() was not initialized correctly I will check it further. Thanks a lot. Best regards, Jie On 2019/5/20 4:12 PM, Leonid Mesnik wrote: > The failure is still reproduced with patch. I attached full hs_err to the bug.
> > hs_err > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (open/src/hotspot/share/opto/bytecodeInfo.cpp:343), pid=3096, tid=3128 > # assert(profile_count == 0) failed: sanity > # > # JRE version: Java(TM) SE Runtime Environment (13.0) (fastdebug build 13-internal+0-2019-05-18-0457052.lmesnik.null) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 13-internal+0-2019-05-18-0457052.lmesnik.null, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) > # Problematic frame: > # V [libjvm.so+0x6cbf6c] InlineTree::is_not_reached(ciMethod*, ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc > # > # Core dump will be written. Default location: Core dumps may be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h" (or dumping to /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kit\ > chensink14D_java/scratch/0/core.3096) > # > # If you would like to submit a bug report, please visit: > # http://bugreport.java.com/bugreport/crash.jsp > # > > --------------- S U M M A R Y ------------ > > Command Line: -Xbootclasspath/a:. 
-XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI -XX:MaxRAMPercentage=12 -XX:+DeoptimizeALot -XX:MaxRAMPercentage=50 -XX:+HeapDumpOnOutOfMemoryError -XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false -XX:+DisplayVMOutputToS\ > tderr -XX:+UsePerfData -Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags -XX:+DisableExplicitGC -XX:+StartAttachListener -XX:NativeMemoryTracking=detail -XX:+FlightRecorder --add-exports=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAME\ > D --add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED --add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED -Djava.io.tmpdir=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applicatio\ > ns_kitchensink_Kitchensink14D_java/scratch/0/java.io.tmpdir -Duser.home=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/user.home -agentpath:/scratch/lmesnik/ws/ks-apps/build/\ > linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so applications.kitchensink.process.stress.Main /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/kitchensink.fin\ > al.properties > > Host: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz, 4 cores, 14G, Oracle Linux Server release 7.5 > Time: Sun May 19 05:15:11 2019 PDT elapsed time: 111312 seconds (1d 6h 55m 12s) > > --------------- T H R E A D --------------- > > Current thread (0x00002ae4e83bc000): JavaThread "C2 CompilerThread0" daemon [_thread_in_native, id=3128, stack(0x00002ae522de2000,0x00002ae522ee3000)] > > > Current CompileTask: > C2:111312036 146944 4 spec.benchmarks.derby.DerbyHarness$Client::handleResultSet (77 bytes) > > Stack: [0x00002ae522de2000,0x00002ae522ee3000], sp=0x00002ae522edf650, free space=1013k > Native frames: (J=compiled Java code, A=aot 
compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x6cbf6c] InlineTree::is_not_reached(ciMethod*, ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc > V [libjvm.so+0x6d12e0] InlineTree::ok_to_inline(ciMethod*, JVMState*, ciCallProfile&, WarmCallInfo*, bool&)+0x1950 > V [libjvm.so+0xb6e075] Compile::call_generator(ciMethod*, int, bool, JVMState*, bool, float, ciKlass*, bool, bool)+0x905 > V [libjvm.so+0xb6f6b9] Parse::do_call()+0x469 > V [libjvm.so+0x1441b70] Parse::do_one_bytecode()+0xff0 > V [libjvm.so+0x1432520] Parse::do_one_block()+0x650 > V [libjvm.so+0x1432a23] Parse::do_all_blocks()+0x113 > V [libjvm.so+0x14348e4] Parse::Parse(JVMState*, ciMethod*, float)+0xc54 > V [libjvm.so+0x803d0c] ParseGenerator::generate(JVMState*)+0x18c > V [libjvm.so+0x9c08b4] Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool, bool, DirectiveSet*)+0xe74 > V [libjvm.so+0x801d9d] C2Compiler::compile_method(ciEnv*, ciMethod*, int, DirectiveSet*)+0x10d > V [libjvm.so+0x9cd17d] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x46d > V [libjvm.so+0x9ce1d8] CompileBroker::compiler_thread_loop()+0x418 > V [libjvm.so+0x16c0baa] JavaThread::thread_main_inner()+0x26a > V [libjvm.so+0x16c9267] JavaThread::run()+0x227 > V [libjvm.so+0x16c62f6] Thread::call_run()+0xf6 > V [libjvm.so+0x13e0d5e] thread_native_entry(Thread*)+0x10e > > Leonid > >> On May 18, 2019, at 5:40 PM, Jie Fu wrote: >> >> Thanks Vladimir Ivanov and Vladimir Kozlov for your review. >> Let's wait for Leonid's test result. >> >> Thanks. >> Best regards, >> Jie >> >> On 2019-05-19 00:15, Vladimir Kozlov wrote: >>> Hi Jie, >>> >>> So the counter was incremented while this code is executed. And you fixed it by caching initial value. >>> Looks good.
>>> >>> Thanks, >>> Vladimir >>> >>> On 5/17/19 6:37 PM, Jie Fu wrote: >>>> Hi all, >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8224162 >>>> Webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.00/ >>>> >>>> I'm sorry to introduce this assertion failure. >>>> Please review the suggested fix and give me some advice. >>>> >>>> Leonid, could you please help to test the patch? >>>> I don't have the reproducer you mentioned in the JBS. >>>> >>>> Thanks a lot. >>>> Best regards, >>>> Jie >>>> >>>> >> From robbin.ehn at oracle.com Mon May 20 08:37:53 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 20 May 2019 10:37:53 +0200 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <9940a897-d49d-0a22-267d-6b78424a45c2@oracle.com> Message-ID: <843c5e48-9df2-a45d-d68d-e4275832279b@oracle.com> Hi David, > > I'm not concerned about combining these. > Great! > One nit in the test: > > 59             Thread.currentThread().sleep(10); > > should just be Thread.sleep(10) as it's not an instance method. Fixed in v4 which contains the simplifications in biased locking requested by Patricio. Sending v4 to RFR mail. Thanks! /Robbin > > Thanks, > David > ----- > >> >> Good news, no issues found with deopt with handshakes. >> >> This is v3: >> http://cr.openjdk.java.net/~rehn/8221734/v3/webrev/ >> >> This full inc from v2 (review + stress test): >> http://cr.openjdk.java.net/~rehn/8221734/v3/inc/ >> >> This inc is the review part from v2: >> http://cr.openjdk.java.net/~rehn/8221734/v3/inc_review/ >> >> This inc is the additional stress test with bug fixes: >> http://cr.openjdk.java.net/~rehn/8221734/v3/inc_test/ >> >> Additional biased locking change: >> The original code uses the same copy of markOop in revoke_and_rebias. >> To keep the same behavior, I now pass that copy into fast_revoke. >> >> The stress test passes hundreds of iterations in mach5.
>> Thousands stress tests locally, the issues above was reproduce-able. >> Inc changes also passes t1-5. >> >> As usual with this change-set, I'm continuously running more test. >> >> Thanks, Robbin >> >> On 2019-04-25 14:05, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Let's deopt with handshakes. >>> Removed VM op Deoptimize, instead we handshake. >>> Locks needs to be inflate since we are not in a safepoint. >>> >>> Goes on top of: >>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html >>> >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8221734 >>> >>> Passes t1-7 and multiple t1-5 runs. >>> >>> A few startup benchmark see a small speedup. >>> >>> Thanks, Robbin From robbin.ehn at oracle.com Mon May 20 08:38:40 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 20 May 2019 10:38:40 +0200 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: <142878b0-f048-d455-4e44-5308bb511549@oracle.com> References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <9940a897-d49d-0a22-267d-6b78424a45c2@oracle.com> <142878b0-f048-d455-4e44-5308bb511549@oracle.com> Message-ID: <5c2fd5e1-38a3-90d3-de11-fb98900bc202@oracle.com> Hi Patricio, I have made the simplifications, sending v4 to RFR mail, thanks! /Robbin On 2019-05-15 21:45, Patricio Chilano wrote: > Hi Robbin, > > Biased locking changes look good to me. > > Just a small comment based on one of the things I mentioned before. 
I can't help > to think that since get_monitors_from_stack() will return monitors owned by the > handshaked thread then the assert in Deoptimization::revoke_handshake(): > > assert(!mark->has_bias_pattern() || mark->biased_locker() == thread, "Can't > revoke"); > > should be combined by the later guarantee() in > BiasedLocking::revoke_own_locks_in_handshake() to be: > > guarantee(!mark->has_bias_pattern() || (mark->biased_locker() == thread && > prototype_header->bias_epoch() == mark->bias_epoch()), "Can't revoke"); > > and that would make more evident we can remove the call to fast_revoke() in > BiasedLocking::revoke_own_locks_in_handshake(). I know you said you don't want > to change biased locking behavior, but I don't think doing that should change > anything. We know the condition above should hold, otherwise, if the handshaked > thread is not the real biaser (because of expired epoch) then we could hit the > guarantee() in BiasedLocking:revoke_own_locks_in_handshake() anyways later on if > some other thread rebiased the lock before we were able to revoke it. > > Thanks! > > Patricio > > > On 5/15/19 2:26 AM, Robbin Ehn wrote: >> Hi, please see this update. >> >> I think I got all review comments fix. >> >> Long story short, I was concerned about test coverage, so I added a stress test >> using the WB, which sometimes crashed in rubbish code. >> >> There are two bugs in the methods used by WB_DeoptimizeAll. >> (Seems I'm the first user) >> >> CodeCache::mark_all_nmethods_for_deoptimization(); >> When iterating the nmethods we could see the methods being create in: >> void AdapterHandlerLibrary::create_native_wrapper(const methodHandle& method) >> And deopt the method when it was in use or before. >> Native wrappers are suppose to live as long as the class. >> I filtered out not_installed and native methods. >> >> Deoptimization::deoptimize_all_marked(); >> The issue is that a not_entrant method can go to zombie at anytime. 
>> There are several ways to make a nmethod not go zombie: nmethodLocker, have it >> on stack, avoid safepoint poll in some states, etc.., which is also depending on >> what type of nmethod. >> The iterator only_alive_and_not_unloading returns not_entrant nmethods, but we >> don't know there state prior last poll. >> in_use -> not_entrant -> #poll# -> not_entrant -> zombie >> If the iterator returns the nmethod after we passed the poll it can still be >> not_entrant but go zombie. >> The problem happens when a second thread marks a method for deopt and makes it >> not_entrant. Then after a poll we end-up in deoptimize_all_marked(), but the >> method is not yet a zombie, so the iterator returns it, it becomes a zombie thus >> pass the if check and later hit the assert. >> So there is a race between the iterator check of state and if-statement check of >> state. Fixed by also filtering out zombies. >> >> If the stress test with correction of the bugs causes trouble in review, I can >> do a follow-up with the stress test separately. >> >> Good news, no issues found with deopt with handshakes. >> >> This is v3: >> http://cr.openjdk.java.net/~rehn/8221734/v3/webrev/ >> >> This full inc from v2 (review + stress test): >> http://cr.openjdk.java.net/~rehn/8221734/v3/inc/ >> >> This inc is the review part from v2: >> http://cr.openjdk.java.net/~rehn/8221734/v3/inc_review/ >> >> This inc is the additional stress test with bug fixes: >> http://cr.openjdk.java.net/~rehn/8221734/v3/inc_test/ >> >> Additional biased locking change: >> The original code use same copy of markOop in revoke_and_rebias. >> The keep same behavior I now pass in that copy into fast_revoke. >> >> The stress test passes hundreds of iterations in mach5. >> Thousands stress tests locally, the issues above was reproduce-able. >> Inc changes also passes t1-5. >> >> As usual with this change-set, I'm continuously running more test. 
>> >> Thanks, Robbin >> >> On 2019-04-25 14:05, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Let's deopt with handshakes. >>> Removed VM op Deoptimize, instead we handshake. >>> Locks needs to be inflate since we are not in a safepoint. >>> >>> Goes on top of: >>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html >>> >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8221734 >>> >>> Passes t1-7 and multiple t1-5 runs. >>> >>> A few startup benchmark see a small speedup. >>> >>> Thanks, Robbin > From robbin.ehn at oracle.com Mon May 20 09:04:59 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 20 May 2019 11:04:59 +0200 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> Message-ID: <45fae2e7-b1ee-faca-9333-fc921d8befae@oracle.com> Hi all, please see this update v4. I have fixed the simplification Patricio talked about and David's nit. The interesting part is now the full diff of bias locking cpp file: http://cr.openjdk.java.net/~rehn/8221734/v4/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html It's very clean. Full: http://cr.openjdk.java.net/~rehn/8221734/v4/ Inc: http://cr.openjdk.java.net/~rehn/8221734/v4/inc/ I have seen no issues in T1-7, KS and other assorted testing. Thanks, Robbin On 2019-04-25 14:05, Robbin Ehn wrote: > Hi all, please review. > > Let's deopt with handshakes. > Removed VM op Deoptimize, instead we handshake. > Locks needs to be inflate since we are not in a safepoint. > > Goes on top of: > https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html > > Code: > http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html > Issue: > https://bugs.openjdk.java.net/browse/JDK-8221734 > > Passes t1-7 and multiple t1-5 runs. 
> > A few startup benchmarks see a small speedup. > > Thanks, Robbin From fujie at loongson.cn Mon May 20 09:32:47 2019 From: fujie at loongson.cn (Jie Fu) Date: Mon, 20 May 2019 17:32:47 +0800 Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached In-Reply-To: References: Message-ID: Ah, I had missed these cases: - http://hg.openjdk.java.net/jdk/jdk/file/8c63164bd540/src/hotspot/share/ci/ciMethod.cpp#l515 - http://hg.openjdk.java.net/jdk/jdk/file/8c63164bd540/src/hotspot/share/ci/ciMethod.cpp#l533 On 2019/5/20 4:33 PM, Jie Fu wrote: > Thank you, Leonid. > > Bad news. > > Well, I think there are two cases which may still cause the assertion > fail: > -1) profile.count() had been overflowed > -2) profile.count() was not initialized correctly > > I will check it further. > > Thanks a lot. > Best regard, > Jie > > On 2019/5/20 4:12 PM, Leonid Mesnik wrote: >> The failure is still reproduced with patch. I attached full hs_err to >> the bug. >> >> hs_err >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> #  Internal Error (open/src/hotspot/share/opto/bytecodeInfo.cpp:343), >> pid=3096, tid=3128 >> #  assert(profile_count == 0) failed: sanity >> # >> # JRE version: Java(TM) SE Runtime Environment (13.0) (fastdebug >> build 13-internal+0-2019-05-18-0457052.lmesnik.null) >> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug >> 13-internal+0-2019-05-18-0457052.lmesnik.null, mixed mode, sharing, >> tiered, compressed oops, g1 gc, linux-amd64) >> # Problematic frame: >> # V  [libjvm.so+0x6cbf6c]  InlineTree::is_not_reached(ciMethod*, >> ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc >> # >> # Core dump will be written.
Default location: Core dumps may be >> processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I >> %h" (or dumping to >> /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kit\ >> chensink14D_java/scratch/0/core.3096) >> # >> # If you would like to submit a bug report, please visit: >> #   http://bugreport.java.com/bugreport/crash.jsp >> # >> >> ---------------  S U M M A R Y ------------ >> >> Command Line: -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions >> -XX:+WhiteBoxAPI -XX:MaxRAMPercentage=12 -XX:+DeoptimizeALot >> -XX:MaxRAMPercentage=50 -XX:+HeapDumpOnOutOfMemoryError >> -XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false >> -XX:+DisplayVMOutputToS\ >> tderr -XX:+UsePerfData >> -Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags >> -XX:+DisableExplicitGC -XX:+StartAttachListener >> -XX:NativeMemoryTracking=detail -XX:+FlightRecorder >> --add-exports=java.base/java.lang=ALL-UNNAMED >> --add-opens=java.base/java.lang=ALL-UNNAME\ >> D >> --add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED >> --add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED >> -Djava.io.tmpdir=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applicatio\ >> >> ns_kitchensink_Kitchensink14D_java/scratch/0/java.io.tmpdir >> -Duser.home=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/user.home >> -agentpath:/scratch/lmesnik/ws/ks-apps/build/\ >> linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so >> applications.kitchensink.process.stress.Main >> /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/kitchensink.fin\ >> al.properties >> >> Host: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz, 4 cores, 14G, Oracle >>
Linux Server release 7.5 >> Time: Sun May 19 05:15:11 2019 PDT elapsed time: 111312 seconds (1d >> 6h 55m 12s) >> >> ---------------  T H R E A D  --------------- >> >> Current thread (0x00002ae4e83bc000):  JavaThread "C2 CompilerThread0" >> daemon [_thread_in_native, id=3128, >> stack(0x00002ae522de2000,0x00002ae522ee3000)] >> >> >> Current CompileTask: >> C2:111312036 146944       4 >> spec.benchmarks.derby.DerbyHarness$Client::handleResultSet (77 bytes) >> >> Stack: [0x00002ae522de2000,0x00002ae522ee3000], >> sp=0x00002ae522edf650,  free space=1013k >> Native frames: (J=compiled Java code, A=aot compiled Java code, >> j=interpreted, Vv=VM code, C=native code) >> V  [libjvm.so+0x6cbf6c]  InlineTree::is_not_reached(ciMethod*, >> ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc >> V  [libjvm.so+0x6d12e0]  InlineTree::ok_to_inline(ciMethod*, >> JVMState*, ciCallProfile&, WarmCallInfo*, bool&)+0x1950 >> V  [libjvm.so+0xb6e075]  Compile::call_generator(ciMethod*, int, >> bool, JVMState*, bool, float, ciKlass*, bool, bool)+0x905 >> V  [libjvm.so+0xb6f6b9]  Parse::do_call()+0x469 >> V  [libjvm.so+0x1441b70]  Parse::do_one_bytecode()+0xff0 >> V  [libjvm.so+0x1432520]  Parse::do_one_block()+0x650 >> V  [libjvm.so+0x1432a23]  Parse::do_all_blocks()+0x113 >> V  [libjvm.so+0x14348e4]  Parse::Parse(JVMState*, ciMethod*, >> float)+0xc54 >> V  [libjvm.so+0x803d0c] ParseGenerator::generate(JVMState*)+0x18c >> V  [libjvm.so+0x9c08b4]  Compile::Compile(ciEnv*, C2Compiler*, >> ciMethod*, int, bool, bool, bool, DirectiveSet*)+0xe74 >> V  [libjvm.so+0x801d9d]  C2Compiler::compile_method(ciEnv*, >> ciMethod*, int, DirectiveSet*)+0x10d >> V  [libjvm.so+0x9cd17d] >> CompileBroker::invoke_compiler_on_method(CompileTask*)+0x46d >> V  [libjvm.so+0x9ce1d8] CompileBroker::compiler_thread_loop()+0x418 >> V  [libjvm.so+0x16c0baa]  JavaThread::thread_main_inner()+0x26a >> V  [libjvm.so+0x16c9267]  JavaThread::run()+0x227 >> V  [libjvm.so+0x16c62f6]  Thread::call_run()+0xf6 >> V 
[libjvm.so+0x13e0d5e]  thread_native_entry(Thread*)+0x10e >> >> Leonid >> >>> On May 18, 2019, at 5:40 PM, Jie Fu wrote: >>> >>> Thanks Vladimir Ivanov and Vladimir Kozlov for your review. >>> Let's wait for Leonid's test result. >>> >>> Thanks. >>> Best regards, >>> Jie >>> >>> On 2019-05-19 00:15, Vladimir Kozlov wrote: >>>> Hi Jie, >>>> >>>> So the counter was incremented while this code is executed. And you >>>> fixed it by caching initial value. >>>> Looks good. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 5/17/19 6:37 PM, Jie Fu wrote: >>>>> Hi all, >>>>> >>>>> JBS:    https://bugs.openjdk.java.net/browse/JDK-8224162 >>>>> Webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.00/ >>>>> >>>>> I'm sorry to introduce this assertion failure. >>>>> Please review the suggested fix and give me some advice. >>>>> >>>>> Leonid, could you please help to test the patch? >>>>> I don't have the reproducer you mentioned in the JBS. >>>>> >>>>> Thanks a lot. >>>>> Best regards, >>>>> Jie >>>>> >>>>> >>> > From fujie at loongson.cn Mon May 20 10:01:41 2019 From: fujie at loongson.cn (Jie Fu) Date: Mon, 20 May 2019 18:01:41 +0800 Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached In-Reply-To: References: Message-ID: Hi all, Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.01/ In my previous patch, I had missed the typecheck profile case [1]. Please review and give me some advice. Thanks a lot. Best regards, Jie [1] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-May/033797.html On 2019/5/20 4:12 PM, Leonid Mesnik wrote: > The failure is still reproduced with patch. I attached full hs_err to the bug.
> > hs_err > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (open/src/hotspot/share/opto/bytecodeInfo.cpp:343), pid=3096, tid=3128 > # assert(profile_count == 0) failed: sanity > # > # JRE version: Java(TM) SE Runtime Environment (13.0) (fastdebug build 13-internal+0-2019-05-18-0457052.lmesnik.null) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 13-internal+0-2019-05-18-0457052.lmesnik.null, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) > # Problematic frame: > # V [libjvm.so+0x6cbf6c] InlineTree::is_not_reached(ciMethod*, ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc > # > # Core dump will be written. Default location: Core dumps may be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h" (or dumping to /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kit\ > chensink14D_java/scratch/0/core.3096) > # > # If you would like to submit a bug report, please visit: > # http://bugreport.java.com/bugreport/crash.jsp > # > > --------------- S U M M A R Y ------------ > > Command Line: -Xbootclasspath/a:. 
-XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI -XX:MaxRAMPercentage=12 -XX:+DeoptimizeALot -XX:MaxRAMPercentage=50 -XX:+HeapDumpOnOutOfMemoryError -XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false -XX:+DisplayVMOutputToS\ > tderr -XX:+UsePerfData -Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags -XX:+DisableExplicitGC -XX:+StartAttachListener -XX:NativeMemoryTracking=detail -XX:+FlightRecorder --add-exports=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAME\ > D --add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED --add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED -Djava.io.tmpdir=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applicatio\ > ns_kitchensink_Kitchensink14D_java/scratch/0/java.io.tmpdir -Duser.home=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/user.home -agentpath:/scratch/lmesnik/ws/ks-apps/build/\ > linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so applications.kitchensink.process.stress.Main /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/kitchensink.fin\ > al.properties > > Host: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz, 4 cores, 14G, Oracle Linux Server release 7.5 > Time: Sun May 19 05:15:11 2019 PDT elapsed time: 111312 seconds (1d 6h 55m 12s) > > --------------- T H R E A D --------------- > > Current thread (0x00002ae4e83bc000): JavaThread "C2 CompilerThread0" daemon [_thread_in_native, id=3128, stack(0x00002ae522de2000,0x00002ae522ee3000)] > > > Current CompileTask: > C2:111312036 146944 4 spec.benchmarks.derby.DerbyHarness$Client::handleResultSet (77 bytes) > > Stack: [0x00002ae522de2000,0x00002ae522ee3000], sp=0x00002ae522edf650, free space=1013k > Native frames: (J=compiled Java code, A=aot 
compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x6cbf6c] InlineTree::is_not_reached(ciMethod*, ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc > V [libjvm.so+0x6d12e0] InlineTree::ok_to_inline(ciMethod*, JVMState*, ciCallProfile&, WarmCallInfo*, bool&)+0x1950 > V [libjvm.so+0xb6e075] Compile::call_generator(ciMethod*, int, bool, JVMState*, bool, float, ciKlass*, bool, bool)+0x905 > V [libjvm.so+0xb6f6b9] Parse::do_call()+0x469 > V [libjvm.so+0x1441b70] Parse::do_one_bytecode()+0xff0 > V [libjvm.so+0x1432520] Parse::do_one_block()+0x650 > V [libjvm.so+0x1432a23] Parse::do_all_blocks()+0x113 > V [libjvm.so+0x14348e4] Parse::Parse(JVMState*, ciMethod*, float)+0xc54 > V [libjvm.so+0x803d0c] ParseGenerator::generate(JVMState*)+0x18c > V [libjvm.so+0x9c08b4] Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool, bool, DirectiveSet*)+0xe74 > V [libjvm.so+0x801d9d] C2Compiler::compile_method(ciEnv*, ciMethod*, int, DirectiveSet*)+0x10d > V [libjvm.so+0x9cd17d] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x46d > V [libjvm.so+0x9ce1d8] CompileBroker::compiler_thread_loop()+0x418 > V [libjvm.so+0x16c0baa] JavaThread::thread_main_inner()+0x26a > V [libjvm.so+0x16c9267] JavaThread::run()+0x227 > V [libjvm.so+0x16c62f6] Thread::call_run()+0xf6 > V [libjvm.so+0x13e0d5e] thread_native_entry(Thread*)+0x10e > > Leonid > >> On May 18, 2019, at 5:40 PM, Jie Fu wrote: >> >> Thanks Vladimir Ivanov and Vladimir Kozlov for your review. >> Let's wait for Leonid's test result. >> >> Thanks. >> Best regards, >> Jie >> >> On 2019?05?19? 00:15, Vladimir Kozlov wrote: >>> Hi Jie, >>> >>> So the counter was incremented while this code is executed. And you fixed it by caching initial value. >>> Looks good. 
>>> >>> Thanks, >>> Vladimir >>> >>> On 5/17/19 6:37 PM, Jie Fu wrote: >>>> Hi all, >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8224162 >>>> Webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.00/ >>>> >>>> I'm sorry to introduce this assertion failure. >>>> Please review the suggested fix and give me some advice. >>>> >>>> Leonid, could you please help to test the patch? >>>> I don't have the reproducer you mentioned in the JBS. >>>> >>>> Thanks a lot. >>>> Best regards, >>>> Jie >>>> >>>> >> From lutz.schmidt at sap.com Mon May 20 10:49:21 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Mon, 20 May 2019 10:49:21 +0000 Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management In-Reply-To: References: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com> <9D607D81-5A25-406B-B05B-7D9C9C733D8F@sap.com> <5ec26a0b-0a49-c458-8d90-aa92396610a5@oracle.com> <65F5D580-6B68-4422-AAE6-3406D7FCDE7A@sap.com> <48fb591a-6055-c1c2-b052-6d8bc770da28@oracle.com> <138023EE-F390-4618-A885-AD6473B593DE@sap.com> <1c264e0b-b497-44f6-b0c6-6d0cf8a6104c@oracle.com> Message-ID: <1E1422A6-7D0F-4DFF-8B65-A8171A50868A@sap.com> Hi Tobias, thank you for your comments. I do not need "res". It's just personal preference. I like variable names which tell something about their contents. A reasonably capable compiler should be able to avoid stack footprint for res. Yes, I can convert the warning into an assert if you like. I opted for warning to get information about all segments with unexpected state, not just the first one. Now that it works, I can live with assert as well. Do you need to see a new full webrev? 
Here is the delta: diff -r 802d016189e0 src/hotspot/share/memory/heap.cpp --- a/src/hotspot/share/memory/heap.cpp Mon May 20 12:12:01 2019 +0200 +++ b/src/hotspot/share/memory/heap.cpp Mon May 20 12:43:31 2019 +0200 @@ -629,14 +629,10 @@ size_t segn = seg1 + b->length(); for (size_t i = seg1; i < segn; i++) { nseg++; - if (is_segment_unused(seg_map[i])) { - warning("CodeHeap: unused segment. %d [%d..%d], %s block", (int)i, (int)seg1, (int)segn, b->free()? "free":"used"); - } + assert(!is_segment_unused(seg_map[i]), "CodeHeap: unused segment. %d [%d..%d], %s block", (int)i, (int)seg1, (int)segn, b->free()? "free":"used"); } } - if (nseg != _next_segment) { - warning("CodeHeap: segment count mismatch. found %d, expected %d.", (int)nseg, (int)_next_segment); - } + assert(nseg == _next_segment, "CodeHeap: segment count mismatch. found %d, expected %d.", (int)nseg, (int)_next_segment); // Verify that the number of free blocks is not out of hand. static int free_block_threshold = 10000; Thanks, Lutz On 20.05.19, 09:29, "Tobias Hartmann" wrote: Hi Lutz, Just wondering if we shouldn't fail in CodeHeap::verify() instead of just printing a warning? Also, in heap.cpp:528, why do you need 'res'? You could just update found_block in line 578, right? Best regards, Tobias On 17.05.19 21:27, Vladimir Kozlov wrote: > Hi Lutz, > > Testing passed. > I think you need a second review because the changes are not trivial. > > Thanks, > Vladimir > > On 5/16/19 4:17 PM, Vladimir Kozlov wrote: >> On 5/16/19 3:16 PM, Schmidt, Lutz wrote: >>> Hi Vladimir, >>> >>> I have implemented that "added safety feature". What it does: >>> - During reserve() and expand_by(), all newly committed heap memory is initialized with >>> badCodeHeapNewVal. >>> - Whenever a FreeBlock is added to the free block list, its memory (except for the header >>> part) is initialized with badCodeHeapNewVal. >>> - During verify(), it is checked that all heap memory in free blocks is initialized as expected.
>>> >>> Please find the diff from webrev.01 to webrev.02 attached as text file. The latest full webrev, >>> based on current jdk/jdk, is here: >>> https://cr.openjdk.java.net/~lucy/webrevs/8223444.02/ >> >> Good. I will run testing on it. >> >>> >>> There is a dubious assert in nmethod.cpp:338 (PcDescCache::reset_to()). It implicitly relies on a >>> new HeapBlock to be initialized with a bit pattern forming a negative value when accessed as int. >>> It cost me quite some time to find that out. And it's bad code, in my opinion. I would suggest to >>> just delete that assert. >> >> Yes, I agree. Such implicit dependence is bad and useless. >> We use reset_to() in AOT where scopes should already have valid offsets: >> http://hg.openjdk.java.net/jdk/jdk/file/9feb4852536f/src/hotspot/share/aot/aotCompiledMethod.hpp#l152 >> I am not sure how we pass this assert in AOT case. >> >>> >>> I understand your concerns re my changes to deallocate_tail(). In the beginning, I had the same. >>> But I did some testing/tracing. That additional free block (just one) is consumed by subsequent >>> allocate() calls and eventually vanishes. >>> There is an advantage that comes with my changes: the HeapBlock whose tail is deallocated does no >>> longer need to be the last block before _next_segment. That is a prerequisite if we want to get >>> rid of issues like the one described in JDK-8223770. We could just allocate a generously large >>> block for stubs and at the end deallocate_tail them. >> >> Okay, it sounds good. >> >> Thanks, >> Vladimir >> >>> >>> Sorry for the long text. >>> Lutz >>> >>> On 15.05.19, 19:00, "Vladimir Kozlov" wrote: >>> >>> On 5/14/19 10:53 PM, Schmidt, Lutz wrote: >>> > Hi Vladimir, >>> > >>> > thank you for your comments. About filling CodeHeap with bad values after split_block: >>> > - in deallocate_tail, the leading part must remain intact. It contains valid code. >>> > - in search_freelist, one free block is split into two. 
There I could invalidate the >>> contents of both parts >>> Thank you for explaining. >>> > - If you want added safety, wouldn't it then be better to invalidate the block contents >>> during add_to_freelist()? You could then be sure there is no executable code in a free block. >>> Yes, it is preferable. >>> An other note (after looking more on changes). You changed where freed tail goes. Originally >>> it was added to next block >>> _next_segment (make it larger) and you created separate small block. Is not it create more >>> fragmentation? >>> Thanks, >>> Vladimir >>> > >>> > Regards, >>> > Lutz >>> > >>> > On 14.05.19, 23:00, "Vladimir Kozlov" wrote: >>> > >>> > On 5/14/19 1:09 PM, Schmidt, Lutz wrote: >>> > > Hi Vladimir, >>> > > >>> > > I had the same thought re atomicity. memset() is not consistent even on one >>> platform. But I believe it's not a factor here. The original code was a byte-by-byte loop. And we >>> have byte atomicity on all supported platforms, even with memset(). >>> > > >>> > > It's a different thing with sequence of initialization. Do we really depend on >>> byte(i) being initialized before byte(i+1)? If so, we would have a problem even with the explicit >>> byte loop. Not on x86, but on ppc with its weak memory ordering. >>> > >>> > Okay, if it is byte copy I am fine with it. >>> > >>> > > >>> > > About segment map marking: >>> > > There is a short description how the segment map works in heap.cpp, right before >>> CodeHeap::find_start(). >>> > > In short: each segment map element contains an (unsigned) index which, when >>> subtracted from that element index, addresses the segment map element where the heap block >>> starts. Thus, when you re-initialize the tail part of a heap block range to describe a newly >>> formed heap block, the leading part remains valid. 
>>> > > >>> > > Segmap before after >>> > > Index split split >>> > > I 0 <- block start 0 <- block start (now shorter) >>> > > I+1 1 1 each index 0..9 still points >>> > > I+2 2 2 back to the block start >>> > > I+3 3 3 >>> > > I+4 4 4 >>> > > I+5 5 5 >>> > > I+6 6 6 >>> > > I+7 7 7 >>> > > I+8 8 8 >>> > > I+9 9 9 >>> > > I+10 10 0 <- new block start >>> > > I+11 11 1 >>> > > I+12 12 2 >>> > > I+13 13 3 >>> > > I+14 14 4 >>> > > I+15 0 <- block start 0 <- block start >>> > > I+16 1 1 >>> > > I+17 2 2 >>> > > I+18 3 3 >>> > > I+19 4 4 >>> > > >>> > > There is a (very short) description about what's happening at the very end of >>> search_freelist(). split_block() is called there as well. Would you like to see a similar comment >>> in deallocate_tail()? >>> > >>> > Thank you, I forgot about that first block mapping is still valid. >>> > >>> > What about storing bad value (in debug mode) only in second part and not both parts? >>> > >>> > > >>> > > Once I have your response, I will create a new webrev reflecting your input. I need >>> to do that anyway because the assert in heap.cpp:200 has to go away. It fires spuriously. The >>> checks can't be done at that place. In addition, I will add one line of comment and rename a >>> local variable. That's it. >>> > >>> > Okay. >>> > >>> > Thanks, >>> > Vladimir >>> > >>> > > >>> > > Thanks, >>> > > Lutz >>> > > >>> > > >>> > > On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" >>> wrote: >>> > > >>> > > Good. >>> > > >>> > > Do we need to be concern about atomicity of marking? We know that memset() is >>> not atomic (may be I am wrong here). >>> > > An other thing is I did not get logic in deallocate_tail(). split_block() >>> marks only second half of split segments as >>> > > used and (after call) store bad values in it. What about first part? May be >>> add comment. 
>>> > > >>> > > Thanks, >>> > > Vladimir >>> > > >>> > > On 5/14/19 3:47 AM, Schmidt, Lutz wrote: >>> > > > Dear all, >>> > > > >>> > > > May I please request reviews for my change? >>> > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8223444 >>> > > > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/ >>> > > > >>> > > > What this change is all about: >>> > > > ------------------------------ >>> > > > While working on another topic, I came across the code in >>> share/memory/heap.cpp. I applied some small changes which I would call improvements. >>> > > > >>> > > > Furthermore, and in particular with these changes, the platform-specific >>> parameter CodeCacheMinBlockLength should by fine-tuned to minimize the number of residual small >>> free blocks. Heap block allocation does not create free blocks smaller than >>> CodeCacheMinBlockLength. This parameter value should match the minimal requested heap block size. >>> If it is too small, such free blocks will never be re-allocated. The only chance for them to >>> vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the >>> beginning of) the free list, slowing down the free block search. >>> > > > >>> > > > The following free block counts have been found after running JVM98 with >>> different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the >>> CodeHeap state at VM shutdown. 
>>> > > > >>> > > > JDK-8223444 not applied >>> > > > ======================= >>> > > > >>> > > > Segment | free blocks with CodeCacheMinBlockLength= >>> > > > Size | 1 2 3 4 6 8 >>> > > > -----------------+------------------------------------------- >>> > > > aarch 128 | 0 153 75 30 38 2 >>> > > > ppc 128 | 0 149 98 59 14 2 >>> > > > ppcle 128 | 0 219 161 110 69 34 >>> > > > s390 256 | 0 142 93 59 30 10 >>> > > > x86 128 | 0 215 157 118 42 11 >>> > > > >>> > > > >>> > > > JDK-8223444 applied >>> > > > =================== >>> > > > >>> > > > Segment | free blocks with CodeCacheMinBlockLength= | suggested >>> > > > Size | 1 2 3 4 6 8 | setting >>> > > > -----------------+---------------------------------------------+------------ >>> > > > aarch 128 | 221 115 80 36 7 1 | 6 >>> > > > ppc 128 | 245 152 101 54 14 4 | 6 >>> > > > ppcle 128 | 243 144 89 72 20 5 | 6 >>> > > > s390 256 | 168 60 67 8 6 2 | 4 >>> > > > x86 128 | 223 139 83 50 11 2 | 6 >>> > > > >>> > > > Thank you for your time and opinion! >>> > > > Lutz >>> > > > >>> > > > >>> > > > >>> > > > >>> > > >>> > > >>> > >>> > >>> From dms at samersoff.net Mon May 20 10:52:30 2019 From: dms at samersoff.net (Dmitry Samersoff) Date: Mon, 20 May 2019 13:52:30 +0300 Subject: [aarch64-port-dev ] 8218966: AArch64: String.compareTo() can read memory after string In-Reply-To: References: Message-ID: Dmitrij, The fix looks good to me. -Dmitry On 21.02.2019 18:26, Dmitrij Pochepko wrote: > Hi all, > > Please review a fix for "8218966: AArch64: String.compareTo() can read > memory after string". > > bug: https://bugs.openjdk.java.net/browse/JDK-8218966 > webrev: http://cr.openjdk.java.net/~dpochepk/8218966/webrev/ > > Intrinsic implementation returns wrong value in rare cases for strings > longer than 72 characters. > > Changes: > > - Different encodings case. Small 16-characters loop and post-loop code > are re-organized to stop at string end. Post-loop now also uses > compare_string_16_x_LU() to avoid code duplication. 
> - Changed calculation of prefetchLoopExitCondition. It might be > incorrect in case when SoftwarePrefetchHintDistance was set to > non-default small value. > - Same encoding case. Moved loop counter update out of prefetch block. > It might miss end-of-string check when prefetch is disabled, with memory > after string being read. > - Added 2 tests. They are quite similar but > TestStringCompareToSameLength compares strings of same length, and > TestStringCompareToDifferentLength is for different lengths. Tests cover > 8218966 case. And also they cover different parts of intrinsic, taking > into account conditions in the implementation and possible > SoftwarePrefetchHintDistance values. > > Testing: > > Existing jtreg and jck tests were not able to detect 8218966 case. But > they pass with the fix applied. Newly added jtreg tests can detect the > issue and potential problems in case of changes in the implementation. > The following testing was performed: > > - jck with default vm flags > - jck with -Xcomp -XX:-TieredCompilation > - hotspot jtreg tests (including new tests): compiler/*, runtime/*, gc/* > with default vm flags > - hotspot jtreg tests (including new tests): compiler/*, runtime/*, gc/* > with -Xcomp -XX:-TieredCompilation > - jdk jtreg tier1-3 tests with default vm flags > - jdk jtreg tier1-3 tests with -Xcomp -XX:-TieredCompilation > > No regressions were found. > > I'd like to thank Pengfei Li (Pengfei.Li at arm.com) for pre-review and > additional testing. > > I'm also about to send separate additional webrev with compareTo > intrinsic documentation and maintenance-related improvements as separate > enhancement. 
> > Thanks, > Dmitrij > From tobias.hartmann at oracle.com Mon May 20 10:59:28 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 20 May 2019 12:59:28 +0200 Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management In-Reply-To: <1E1422A6-7D0F-4DFF-8B65-A8171A50868A@sap.com> References: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com> <9D607D81-5A25-406B-B05B-7D9C9C733D8F@sap.com> <5ec26a0b-0a49-c458-8d90-aa92396610a5@oracle.com> <65F5D580-6B68-4422-AAE6-3406D7FCDE7A@sap.com> <48fb591a-6055-c1c2-b052-6d8bc770da28@oracle.com> <138023EE-F390-4618-A885-AD6473B593DE@sap.com> <1c264e0b-b497-44f6-b0c6-6d0cf8a6104c@oracle.com> <1E1422A6-7D0F-4DFF-8B65-A8171A50868A@sap.com> Message-ID: <6b5bd0f3-044b-9e89-7c95-66b060d86d8e@oracle.com> Hi Lutz, On 20.05.19 12:49, Schmidt, Lutz wrote: > Hi Tobias, > > thank you for your comments. > > I do not need "res". It's just personal preference. I like variable names which tell something about their contents. A reasonably capable compiler should be able to avoid stack footprint for res. Okay, fine with me. > Yes, I can convert the warning into an assert if you like. I opted for warning to get information about all segments with unexpected state, not just the first one. Now that it works, I can live with assert as well. Do you need to see a new full webrev? Here is the delta: > > diff -r 802d016189e0 src/hotspot/share/memory/heap.cpp > --- a/src/hotspot/share/memory/heap.cpp Mon May 20 12:12:01 2019 +0200 > +++ b/src/hotspot/share/memory/heap.cpp Mon May 20 12:43:31 2019 +0200 > @@ -629,14 +629,10 @@ > size_t segn = seg1 + b->length(); > for (size_t i = seg1; i < segn; i++) { > nseg++; > - if (is_segment_unused(seg_map[i])) { > - warning("CodeHeap: unused segment. %d [%d..%d], %s block", (int)i, (int)seg1, (int)segn, b->free()? "free":"used"); > - } > + assert(!is_segment_unused(seg_map[i]), "CodeHeap: unused segment. %d [%d..%d], %s block", (int)i, (int)seg1, (int)segn, b->free()?
"free":"used"); > } > } > - if (nseg != _next_segment) { > - warning("CodeHeap: segment count mismatch. found %d, expected %d.", (int)nseg, (int)_next_segment); > - } > + assert(nseg == _next_segment, "CodeHeap: segment count mismatch. found %d, expected %d.", (int)nseg, (int)_next_segment); > > // Verify that the number of free blocks is not out of hand. > static int free_block_threshold = 10000; Looks good. Please re-run testing before pushing. Thanks, Tobias From lutz.schmidt at sap.com Mon May 20 11:05:16 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Mon, 20 May 2019 11:05:16 +0000 Subject: RFR(S): 8223444: Improve CodeHeap Free Space Management In-Reply-To: <6b5bd0f3-044b-9e89-7c95-66b060d86d8e@oracle.com> References: <24edfdcf-8b88-b401-3e36-fd0914ffa226@oracle.com> <9D607D81-5A25-406B-B05B-7D9C9C733D8F@sap.com> <5ec26a0b-0a49-c458-8d90-aa92396610a5@oracle.com> <65F5D580-6B68-4422-AAE6-3406D7FCDE7A@sap.com> <48fb591a-6055-c1c2-b052-6d8bc770da28@oracle.com> <138023EE-F390-4618-A885-AD6473B593DE@sap.com> <1c264e0b-b497-44f6-b0c6-6d0cf8a6104c@oracle.com> <1E1422A6-7D0F-4DFF-8B65-A8171A50868A@sap.com> <6b5bd0f3-044b-9e89-7c95-66b060d86d8e@oracle.com> Message-ID: <473E08D3-A2B5-480F-B260-527C7DDF51C3@sap.com> Thank you, Tobias! Local builds and tests are currently running. I will ask jdk/submit for its opinion before pushing. Regards, Lutz On 20.05.19, 12:59, "Tobias Hartmann" wrote: Hi Lutz, On 20.05.19 12:49, Schmidt, Lutz wrote: > Hi Tobias, > > thank you for your comments. > > I do not need "res". It's just personal preference. I like variable names which tell something about their contents. A reasonably capable compiler should be able to avoid stack footprint for res. Okay, fine with me. > Yes, I can convert the warning into an assert if you like. I opted for warning to get information about all segments with unexpected state, not just the first one. Now that it works, I can live with assert as well. Do you need to see a new full webrev?
Here is the delta: > > diff -r 802d016189e0 src/hotspot/share/memory/heap.cpp > --- a/src/hotspot/share/memory/heap.cpp Mon May 20 12:12:01 2019 +0200 > +++ b/src/hotspot/share/memory/heap.cpp Mon May 20 12:43:31 2019 +0200 > @@ -629,14 +629,10 @@ > size_t segn = seg1 + b->length(); > for (size_t i = seg1; i < segn; i++) { > nseg++; > - if (is_segment_unused(seg_map[i])) { > - warning("CodeHeap: unused segment. %d [%d..%d], %s block", (int)i, (int)seg1, (int)segn, b->free()? "free":"used"); > - } > + assert(!is_segment_unused(seg_map[i]), "CodeHeap: unused segment. %d [%d..%d], %s block", (int)i, (int)seg1, (int)segn, b->free()? "free":"used"); > } > } > - if (nseg != _next_segment) { > - warning("CodeHeap: segment count mismatch. found %d, expected %d.", (int)nseg, (int)_next_segment); > - } > + assert(nseg == _next_segment, "CodeHeap: segment count mismatch. found %d, expected %d.", (int)nseg, (int)_next_segment); > > // Verify that the number of free blocks is not out of hand. > static int free_block_threshold = 10000; Looks good. Please re-run testing before pushing. Thanks, Tobias From rwestrel at redhat.com Mon May 20 11:42:21 2019 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 20 May 2019 13:42:21 +0200 Subject: RFR(XL): 6312651: Compiler should only use verified interface types for optimization Message-ID: <87ftp9l5cy.fsf@redhat.com> http://cr.openjdk.java.net/~roland/6312651/webrev.00/ This is a fix for a bug that John filed almost 14 years ago but that, as far as I understand, has always affected c2. The fix is implemented along the lines of John's recommendation: 1) the type system should keep track of an object's instance class and the set of interfaces it implements separately (or array of instance klass + set of interfaces), 2) interfaces in signatures should not be trusted. For 1), the _klass in TypeOopPtr/TypeKlassPtr now only refers to instance classes (or arrays of instance class).
TypeOopPtr/TypeKlassPtr has an extra field _interfaces that carries the set of interfaces implemented by that type. For klass pointers, TypeKlassPtr is no longer sufficient because for arrays, we need the type to also include interfaces that the element implements. So I added 2 classes to the type system, TypeInstKlassPtr and TypeAryKlassPtr, that mirror TypeInstPtr/TypeAryPtr. TypeAryKlassPtr has a TypeKlassPtr element which can then record what interfaces the element type implements. The meet implementation for both also mirrors TypeInstPtr/TypeAryPtr closely. In the existing implementation the TypeOopPtr/TypeKlassPtr klass() accessor is used in a lot of places to perform class equality checks, subtyping tests or to build a TypeOopPtr from a TypeKlassPtr or the other way around. That can't work now because klass() only gives partial information about the type. Rather than expose _klass and _interfaces through accessors, I have: - made the klass() accessor non-public so code that's not closely tied to the actual type system implementation shouldn't use it - made some operations go through the type classes: testing for class equality should be done through klass_eq() which tests both class and interface set equality, same goes for subtyping tests. Creating a TypeKlassPtr from a TypeOopPtr or the other way around is now always done with the as_klass_type()/as_instance_type() methods. - introduced 3 new accessors: instance_klass() on TypeInstPtr/TypeInstKlassPtr which only returns the instance class, ignoring any interface implemented by the type; exact_klass() on TypeOopPtr/TypeKlassPtr which only works for exact types and can return an interface or array of interfaces; base_element_type() on TypeAryPtr/TypeAryKlassPtr which returns the base element of the array and in case of an array of objects, only the instance class, not whatever interface the element implements. These 3 are AFAICT good enough to support existing c2 optimizations.
For 2), TypeOopPtr::cast_to_non_interface() is used to filter out interfaces whenever signatures are used. As expected, this change makes several workarounds no longer necessary and cleans up the code quite a bit. For instance, CheckCastPPNode::Value() now falls back to the "right" implementation of ConstraintCastNode::Value() which leverages the type system. Fixing CheckCastPPNode::Value() exposes a bug in the C2 type checking logic: if a type check is proved to always fail during optimizations, the CheckCastPPNode becomes top, but the actual type checking logic in Phase::gen_subtype_check() doesn't always optimize out, so the data path dies but not the control path. To fix that, I added a PartialSubtypeCheckNode::Value() method that triggers on subtype check failures consistently with CheckCastPPNode::Value(). The problem is that PartialSubtypeCheckNode is not checked first in Phase::gen_subtype_check(). So I added a duplicate test on the PartialSubtypeCheckNode result first in that logic, under an Opaque4Node so the actual test is never compiled into the final code. Note that "8220416: Comparison of klass pointers is not optimized any more" is fixed by this. CastNullCheckDroppingsTest.java needs a change because testObjClassCast() has a call to objClass.cast() followed by a cast to String, so 2 casts in a row once compiled. With the existing implementation, c2 can't see the casts as redundant and it keeps the second one. That one triggers a deopt if the test is passed null. The Class::cast implementation never traps for null. With the updated implementation, c2 sees the second cast as useless and optimizes it out. I dropped CmpNNode::sub() because CmpN nodes are only created during final graph reshape so this is dead code. Performance is unaffected by this change AFAICT. Roland.
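Editorial illustration (not part of Roland's message or the webrev): a minimal sketch of point 1) above, assuming a type is modeled as an instance class plus a separately tracked set of interfaces. SketchType, same_instance_klass() and demo_distinguishes_interfaces() are hypothetical names made up for this sketch; only the name klass_eq() is borrowed from the description above, and its body here is a guess at the idea, not C2's implementation.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a C2 pointer type. This is NOT TypeOopPtr or
// TypeKlassPtr; it only models "instance class + set of interfaces".
struct SketchType {
  std::string instance_klass;        // instance class only, never an interface
  std::set<std::string> interfaces;  // interfaces, tracked separately

  // Class equality in this sketch has to compare both parts.
  bool klass_eq(const SketchType& other) const {
    return instance_klass == other.instance_klass &&
           interfaces == other.interfaces;
  }
};

// Comparing the instance class alone sees only part of the type.
inline bool same_instance_klass(const SketchType& a, const SketchType& b) {
  return a.instance_klass == b.instance_klass;
}

// Two interface-typed values modeled (by assumption) as Object plus one
// interface each: the instance classes match, the full types do not.
inline bool demo_distinguishes_interfaces() {
  SketchType runnable{"java/lang/Object", {"java/lang/Runnable"}};
  SketchType comparable{"java/lang/Object", {"java/lang/Comparable"}};
  return same_instance_klass(runnable, comparable) &&
         !runnable.klass_eq(comparable);
}
```

In this toy model a comparison on the instance class alone would treat the two values as the same type, which is the kind of partial information the message attributes to klass().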
From fujie at loongson.cn Mon May 20 13:21:29 2019 From: fujie at loongson.cn (Jie Fu) Date: Mon, 20 May 2019 21:21:29 +0800 Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached In-Reply-To: References: Message-ID: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> Hi all, A refinement since the profile.count() won't change during compilation. ---------------------------------------------------- diff -r 13507abf416c src/hotspot/share/opto/bytecodeInfo.cpp --- a/src/hotspot/share/opto/bytecodeInfo.cpp Sat May 18 15:42:21 2019 +0900 +++ b/src/hotspot/share/opto/bytecodeInfo.cpp Mon May 20 21:13:13 2019 +0800 @@ -334,8 +334,8 @@ if (caller_method->is_not_reached(caller_bci)) { return true; // call site not resolved } - if (profile.count() == -1) { - return false; // immature profile; optimistically treat as reached + if (profile.count() <= -1) { + return false; // immature or typecheck profile; optimistically treat as reached } assert(profile.count() == 0, "sanity"); ---------------------------------------------------- Please review this one and give me some advice. Thanks. Best regards, Jie On 2019-05-20 18:01, Jie Fu wrote: > Hi all, > > Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.01/ > > In my previous patch, I had missed the typecheck profile case[1]. > Please review and give me some advice. > > Thanks a lot. > Best regards, > Jie > > [1] > https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-May/033797.html > > On 2019/5/20 ??4:12, Leonid Mesnik wrote: >> The failure is still reproduced with the patch. I attached full hs_err to >> the bug.
>> >> hs_err >> # >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (open/src/hotspot/share/opto/bytecodeInfo.cpp:343), >> pid=3096, tid=3128 >> # assert(profile_count == 0) failed: sanity >> # >> # JRE version: Java(TM) SE Runtime Environment (13.0) (fastdebug >> build 13-internal+0-2019-05-18-0457052.lmesnik.null) >> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug >> 13-internal+0-2019-05-18-0457052.lmesnik.null, mixed mode, sharing, >> tiered, compressed oops, g1 gc, linux-amd64) >> # Problematic frame: >> # V [libjvm.so+0x6cbf6c] InlineTree::is_not_reached(ciMethod*, >> ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc >> # >> # Core dump will be written. Default location: Core dumps may be >> processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I >> %h" (or dumping to >> /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kit\ >> chensink14D_java/scratch/0/core.3096) >> # >> # If you would like to submit a bug report, please visit: >> # http://bugreport.java.com/bugreport/crash.jsp >> # >> >> --------------- S U M M A R Y ------------ >> >> Command Line: -Xbootclasspath/a:. 
-XX:+UnlockDiagnosticVMOptions >> -XX:+WhiteBoxAPI -XX:MaxRAMPercentage=12 -XX:+DeoptimizeALot >> -XX:MaxRAMPercentage=50 -XX:+HeapDumpOnOutOfMemoryError >> -XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false >> -XX:+DisplayVMOutputToS\ >> tderr -XX:+UsePerfData >> -Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags >> -XX:+DisableExplicitGC -XX:+StartAttachListener >> -XX:NativeMemoryTracking=detail -XX:+FlightRecorder >> --add-exports=java.base/java.lang=ALL-UNNAMED >> --add-opens=java.base/java.lang=ALL-UNNAME\ >> D >> --add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED >> --add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED >> -Djava.io.tmpdir=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applicatio\ >> >> ns_kitchensink_Kitchensink14D_java/scratch/0/java.io.tmpdir >> -Duser.home=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/user.home >> -agentpath:/scratch/lmesnik/ws/ks-apps/build/\ >> linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so >> applications.kitchensink.process.stress.Main >> /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/kitchensink.fin\ >> al.properties >> >> Host: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz, 4 cores, 14G, Oracle >> Linux Server release 7.5 >> Time: Sun May 19 05:15:11 2019 PDT elapsed time: 111312 seconds (1d >> 6h 55m 12s) >> >> --------------- T H R E A D --------------- >> >> Current thread (0x00002ae4e83bc000): JavaThread "C2 CompilerThread0" >> daemon [_thread_in_native, id=3128, >> stack(0x00002ae522de2000,0x00002ae522ee3000)] >> >> >> Current CompileTask: >> C2:111312036 146944 4 >> spec.benchmarks.derby.DerbyHarness$Client::handleResultSet (77 bytes) >> >> Stack: [0x00002ae522de2000,0x00002ae522ee3000], >> 
sp=0x00002ae522edf650, free space=1013k >> Native frames: (J=compiled Java code, A=aot compiled Java code, >> j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x6cbf6c] InlineTree::is_not_reached(ciMethod*, >> ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc >> V [libjvm.so+0x6d12e0] InlineTree::ok_to_inline(ciMethod*, >> JVMState*, ciCallProfile&, WarmCallInfo*, bool&)+0x1950 >> V [libjvm.so+0xb6e075] Compile::call_generator(ciMethod*, int, >> bool, JVMState*, bool, float, ciKlass*, bool, bool)+0x905 >> V [libjvm.so+0xb6f6b9] Parse::do_call()+0x469 >> V [libjvm.so+0x1441b70] Parse::do_one_bytecode()+0xff0 >> V [libjvm.so+0x1432520] Parse::do_one_block()+0x650 >> V [libjvm.so+0x1432a23] Parse::do_all_blocks()+0x113 >> V [libjvm.so+0x14348e4] Parse::Parse(JVMState*, ciMethod*, >> float)+0xc54 >> V [libjvm.so+0x803d0c] ParseGenerator::generate(JVMState*)+0x18c >> V [libjvm.so+0x9c08b4] Compile::Compile(ciEnv*, C2Compiler*, >> ciMethod*, int, bool, bool, bool, DirectiveSet*)+0xe74 >> V [libjvm.so+0x801d9d] C2Compiler::compile_method(ciEnv*, >> ciMethod*, int, DirectiveSet*)+0x10d >> V [libjvm.so+0x9cd17d] >> CompileBroker::invoke_compiler_on_method(CompileTask*)+0x46d >> V [libjvm.so+0x9ce1d8] CompileBroker::compiler_thread_loop()+0x418 >> V [libjvm.so+0x16c0baa] JavaThread::thread_main_inner()+0x26a >> V [libjvm.so+0x16c9267] JavaThread::run()+0x227 >> V [libjvm.so+0x16c62f6] Thread::call_run()+0xf6 >> V [libjvm.so+0x13e0d5e] thread_native_entry(Thread*)+0x10e >> >> Leonid >> >>> On May 18, 2019, at 5:40 PM, Jie Fu wrote: >>> >>> Thanks Vladimir Ivanov and Vladimir Kozlov for your review. >>> Let's wait for Leonid's test result. >>> >>> Thanks. >>> Best regards, >>> Jie >>> >>> On 2019?05?19? 00:15, Vladimir Kozlov wrote: >>>> Hi Jie, >>>> >>>> So the counter was incremented while this code is executed. And you >>>> fixed it by caching initial value. >>>> Looks good. 
>>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 5/17/19 6:37 PM, Jie Fu wrote: >>>>> Hi all, >>>>> >>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8224162 >>>>> Webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.00/ >>>>> >>>>> I'm sorry to introduce this assertion failure. >>>>> Please review the suggested fix and give me some advice. >>>>> >>>>> Leonid, could you please help to test the patch? >>>>> I don't have the reproducer you mentioned in the JBS. >>>>> >>>>> Thanks a lot. >>>>> Best regards, >>>>> Jie >>>>> >>>>> >>> > From vladimir.kozlov at oracle.com Mon May 20 15:36:58 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 20 May 2019 08:36:58 -0700 Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached In-Reply-To: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> Message-ID: <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> Hi Jie Please, send updated webrev. It is confusing to which changes this should be applied. Thanks Vladimir > On May 20, 2019, at 6:21 AM, Jie Fu wrote: > > Hi all, > > A refinement since the profile.count() won't change during compilation. > ---------------------------------------------------- > diff -r 13507abf416c src/hotspot/share/opto/bytecodeInfo.cpp > --- a/src/hotspot/share/opto/bytecodeInfo.cpp Sat May 18 15:42:21 2019 +0900 > +++ b/src/hotspot/share/opto/bytecodeInfo.cpp Mon May 20 21:13:13 2019 +0800 > @@ -334,8 +334,8 @@ > if (caller_method->is_not_reached(caller_bci)) { > return true; // call site not resolved > } > - if (profile.count() == -1) { > - return false; // immature profile; optimistically treat as reached > + if (profile.count() <= -1) { > + return false; // immature or typecheck profile; optimistically treat as reached > } > assert(profile.count() == 0, "sanity"); > > ---------------------------------------------------- > Please review this one and give me some advice. > > Thanks. 
> Best regards,
> Jie
>
>> On 2019-05-20 18:01, Jie Fu wrote:
>> Hi all,
>>
>> Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.01/
>>
>> In my previous patch, I had lost the case of typecheck profile[1].
>> Please review and give me some advice.
>>
>> Thanks a lot.
>> Best regards,
>> Jie
>>
>> [1] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-May/033797.html
>>
>>> On 2019/5/20 4:12, Leonid Mesnik wrote:
>>> The failure is still reproduced with patch. I attached full hs_err to the bug.
>>>
>>> hs_err
>>> #
>>> # A fatal error has been detected by the Java Runtime Environment:
>>> #
>>> # Internal Error (open/src/hotspot/share/opto/bytecodeInfo.cpp:343), pid=3096, tid=3128
>>> # assert(profile_count == 0) failed: sanity
>>> #
>>> # JRE version: Java(TM) SE Runtime Environment (13.0) (fastdebug build 13-internal+0-2019-05-18-0457052.lmesnik.null)
>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 13-internal+0-2019-05-18-0457052.lmesnik.null, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>> # Problematic frame:
>>> # V [libjvm.so+0x6cbf6c] InlineTree::is_not_reached(ciMethod*, ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc
>>> #
>>> # Core dump will be written. Default location: Core dumps may be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h" (or dumping to /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kit\
>>> chensink14D_java/scratch/0/core.3096)
>>> #
>>> # If you would like to submit a bug report, please visit:
>>> # http://bugreport.java.com/bugreport/crash.jsp
>>> #
>>>
>>> --------------- S U M M A R Y ------------
>>>
>>> Command Line: -Xbootclasspath/a:.
-XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI -XX:MaxRAMPercentage=12 -XX:+DeoptimizeALot -XX:MaxRAMPercentage=50 -XX:+HeapDumpOnOutOfMemoryError -XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false -XX:+DisplayVMOutputToS\ >>> tderr -XX:+UsePerfData -Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags -XX:+DisableExplicitGC -XX:+StartAttachListener -XX:NativeMemoryTracking=detail -XX:+FlightRecorder --add-exports=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAME\ >>> D --add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED --add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED -Djava.io.tmpdir=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applicatio\ >>> ns_kitchensink_Kitchensink14D_java/scratch/0/java.io.tmpdir -Duser.home=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/user.home -agentpath:/scratch/lmesnik/ws/ks-apps/build/\ >>> linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so applications.kitchensink.process.stress.Main /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/kitchensink.fin\ >>> al.properties >>> >>> Host: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz, 4 cores, 14G, Oracle Linux Server release 7.5 >>> Time: Sun May 19 05:15:11 2019 PDT elapsed time: 111312 seconds (1d 6h 55m 12s) >>> >>> --------------- T H R E A D --------------- >>> >>> Current thread (0x00002ae4e83bc000): JavaThread "C2 CompilerThread0" daemon [_thread_in_native, id=3128, stack(0x00002ae522de2000,0x00002ae522ee3000)] >>> >>> >>> Current CompileTask: >>> C2:111312036 146944 4 spec.benchmarks.derby.DerbyHarness$Client::handleResultSet (77 bytes) >>> >>> Stack: [0x00002ae522de2000,0x00002ae522ee3000], sp=0x00002ae522edf650, free space=1013k >>> Native 
frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>> V [libjvm.so+0x6cbf6c] InlineTree::is_not_reached(ciMethod*, ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc
>>> V [libjvm.so+0x6d12e0] InlineTree::ok_to_inline(ciMethod*, JVMState*, ciCallProfile&, WarmCallInfo*, bool&)+0x1950
>>> V [libjvm.so+0xb6e075] Compile::call_generator(ciMethod*, int, bool, JVMState*, bool, float, ciKlass*, bool, bool)+0x905
>>> V [libjvm.so+0xb6f6b9] Parse::do_call()+0x469
>>> V [libjvm.so+0x1441b70] Parse::do_one_bytecode()+0xff0
>>> V [libjvm.so+0x1432520] Parse::do_one_block()+0x650
>>> V [libjvm.so+0x1432a23] Parse::do_all_blocks()+0x113
>>> V [libjvm.so+0x14348e4] Parse::Parse(JVMState*, ciMethod*, float)+0xc54
>>> V [libjvm.so+0x803d0c] ParseGenerator::generate(JVMState*)+0x18c
>>> V [libjvm.so+0x9c08b4] Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool, bool, DirectiveSet*)+0xe74
>>> V [libjvm.so+0x801d9d] C2Compiler::compile_method(ciEnv*, ciMethod*, int, DirectiveSet*)+0x10d
>>> V [libjvm.so+0x9cd17d] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x46d
>>> V [libjvm.so+0x9ce1d8] CompileBroker::compiler_thread_loop()+0x418
>>> V [libjvm.so+0x16c0baa] JavaThread::thread_main_inner()+0x26a
>>> V [libjvm.so+0x16c9267] JavaThread::run()+0x227
>>> V [libjvm.so+0x16c62f6] Thread::call_run()+0xf6
>>> V [libjvm.so+0x13e0d5e] thread_native_entry(Thread*)+0x10e
>>>
>>> Leonid
>>>
>>>> On May 18, 2019, at 5:40 PM, Jie Fu wrote:
>>>>
>>>> Thanks Vladimir Ivanov and Vladimir Kozlov for your review.
>>>> Let's wait for Leonid's test result.
>>>>
>>>> Thanks.
>>>> Best regards,
>>>> Jie
>>>>
>>>>> On 2019-05-19 00:15, Vladimir Kozlov wrote:
>>>>> Hi Jie,
>>>>>
>>>>> So the counter was incremented while this code is executed. And you fixed it by caching initial value.
>>>>> Looks good.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>>> On 5/17/19 6:37 PM, Jie Fu wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8224162
>>>>>> Webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.00/
>>>>>>
>>>>>> I'm sorry to introduce this assertion failure.
>>>>>> Please review the suggested fix and give me some advice.
>>>>>>
>>>>>> Leonid, could you please help to test the patch?
>>>>>> I don't have the reproducer you mentioned in the JBS.
>>>>>>
>>>>>> Thanks a lot.
>>>>>> Best regards,
>>>>>> Jie
>>>>>>
>>>>>>
>>>>
>>
>
>

From patricio.chilano.mateo at oracle.com Mon May 20 17:14:46 2019
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Mon, 20 May 2019 13:14:46 -0400
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To: <45fae2e7-b1ee-faca-9333-fc921d8befae@oracle.com>
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <45fae2e7-b1ee-faca-9333-fc921d8befae@oracle.com>
Message-ID: <6c0ad565-99d8-8e90-48c6-3d88ac19ca2f@oracle.com>

Hi Robbin,

Changes to biased locking look good! Thanks for the change.

nit:
--- a/src/hotspot/share/runtime/deoptimization.cpp
+++ b/src/hotspot/share/runtime/deoptimization.cpp
@@ -1318,3 +1318,2 @@
     oop obj = (objects_to_revoke->at(i))();
-    markOop mark = obj->mark();
     BiasedLocking::revoke_own_locks_in_handshake(objects_to_revoke->at(i), thread);

Thanks!

Patricio

On 5/20/19 5:04 AM, Robbin Ehn wrote:
> Hi all, please see this update v4.
>
> I have fixed the simplification Patricio talked about and David's nit.
>
> The interesting part is now the full diff of bias locking cpp file:
> http://cr.openjdk.java.net/~rehn/8221734/v4/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html
>
> It's very clean.
>
> Full:
> http://cr.openjdk.java.net/~rehn/8221734/v4/
> Inc:
> http://cr.openjdk.java.net/~rehn/8221734/v4/inc/
>
> I have seen no issues in T1-7, KS and other assorted testing.
>
> Thanks, Robbin
>
>
> On 2019-04-25 14:05, Robbin Ehn wrote:
>> Hi all, please review.
>>
>> Let's deopt with handshakes.
>> Removed VM op Deoptimize, instead we handshake.
>> Locks needs to be inflate since we are not in a safepoint.
>>
>> Goes on top of:
>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html
>>
>>
>> Code:
>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html
>> Issue:
>> https://bugs.openjdk.java.net/browse/JDK-8221734
>>
>> Passes t1-7 and multiple t1-5 runs.
>>
>> A few startup benchmark see a small speedup.
>>
>> Thanks, Robbin

From robbin.ehn at oracle.com Mon May 20 17:23:22 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Mon, 20 May 2019 19:23:22 +0200
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To: <6c0ad565-99d8-8e90-48c6-3d88ac19ca2f@oracle.com>
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <45fae2e7-b1ee-faca-9333-fc921d8befae@oracle.com> <6c0ad565-99d8-8e90-48c6-3d88ac19ca2f@oracle.com>
Message-ID: <84e11be0-37cf-91d9-4d5c-91d9741b3c32@oracle.com>

Hi Patricio,

On 2019-05-20 19:14, Patricio Chilano wrote:
> Hi Robbin,
>
> Changes to biased locking look good! Thanks for the change.

Thanks, great!

>
> nit:
> --- a/src/hotspot/share/runtime/deoptimization.cpp
> +++ b/src/hotspot/share/runtime/deoptimization.cpp
> @@ -1318,3 +1318,2 @@
>      oop obj = (objects_to_revoke->at(i))();
> -    markOop mark = obj->mark();
>      BiasedLocking::revoke_own_locks_in_handshake(objects_to_revoke->at(i), thread);
>

Thanks, I'll fix this locally.
Crossing fingers I do not need a v5 :)

/Robbin

>
> Thanks!
>
> Patricio
>
> On 5/20/19 5:04 AM, Robbin Ehn wrote:
>> Hi all, please see this update v4.
>>
>> I have fixed the simplification Patricio talked about and David's nit.
>>
>> The interesting part is now the full diff of bias locking cpp file:
>> http://cr.openjdk.java.net/~rehn/8221734/v4/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html
>>
>> It's very clean.
>> >> Full: >> http://cr.openjdk.java.net/~rehn/8221734/v4/ >> Inc: >> http://cr.openjdk.java.net/~rehn/8221734/v4/inc/ >> >> I have seen no issues in T1-7, KS and other assorted testing. >> >> Thanks, Robbin >> >> >> On 2019-04-25 14:05, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Let's deopt with handshakes. >>> Removed VM op Deoptimize, instead we handshake. >>> Locks needs to be inflate since we are not in a safepoint. >>> >>> Goes on top of: >>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html >>> >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8221734 >>> >>> Passes t1-7 and multiple t1-5 runs. >>> >>> A few startup benchmark see a small speedup. >>> >>> Thanks, Robbin > From rahul.v.raghavan at oracle.com Mon May 20 18:16:10 2019 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Mon, 20 May 2019 23:46:10 +0530 Subject: [13] RFR: 8213416: Replace some enums with static const members in hotspot/compiler In-Reply-To: <62869e18-3deb-435d-1ce8-7726866d79eb@oracle.com> References: <1f7afc19-0756-33f8-54f5-2438ed5da886@oracle.com> <8f18e15d-cae8-58eb-b4a1-870ca6ffaf15@oracle.com> <62869e18-3deb-435d-1ce8-7726866d79eb@oracle.com> Message-ID: <8b22fc8b-af06-31a2-4033-4984ac4fcb5d@oracle.com> Hi, With reference to below email thread, request help to confirm next steps for JDK-8213416. So may I go ahead with webrev changes related to hotspot/compiler for 8213416 ? - http://cr.openjdk.java.net/~rraghavan/8213416/webrev.01/ (also will add similar hotspot/runtime related details in JBS comments for JDK-8223400) Thanks, Rahul On 16/05/19 3:26 PM, Rahul Raghavan wrote:> Hi, > > Thank you David for review comments. > > I will kindly request help from Magnus to reply for the main questions. 
> > > Sharing some notes, related links - > - 8211073: Remove -Wno-extra from Hotspot > https://bugs.openjdk.java.net/browse/JDK-8211073 > - Discussions in earlier thread - > https://mail.openjdk.java.net/pipermail/hotspot-dev/2018-September/034314.html > > > So understood -Wextra do help in catching valid/useful warnings also, > but along with some too strict ones like "enumeral and non-enumeral type > in conditional expression" type warnings. > > Extracts from 8211073 JBS comments from Magnus regarding the > 'enum-warning' - > "... If you think that gcc is a bit too picky here, I agree. It's not > obvious per se that the added casts improve the code. However, this is > the price we need to pay to be able to enable -Wextra, and *that* is > something that is likely to improve the code." > > > Thanks, > Rahul > > On 16/05/19 11:13 AM, David Holmes wrote: >> This all seems like unnecessary churn to me - is any of this code >> actually wrong? can we not just disable this particular warning? is >> there any point using "static const" when we should be aiming to use >> C++11 constexpr in the (not too distant?) future? >> >> Converting from enums to unrelated ints seems a big step backwards in >> software engineering terms. :( >> >> Cheers, >> David >> ----- >> On 16/05/19 12:03 PM, Rahul Raghavan wrote: > Hi, > > Thank you Vladimir for review comments. > >>> 4) _lh_array_tag_obj_value, _lh_instance_slow_path_bit - >>> [open/src/hotspot/share/oops/klass.cpp] >>> .......... >> >> I am okay with it but Runtime group should agree too - it is their code. >> > Yes, I missed that it is Runtime code. > > Please note plan is to handle only the hotspot/compiler part of the > requirement changes in JDK-8213416. > As per earlier JBS comments new JDK-8223400 was created to cover the > requirements in hotspot/runtime. 
> So may I suggest moving the above runtime change requirement details to
> JDK-8223400;
> and use only the balance changes, as in below updated webrev, here for
> 8213416.
>
> - http://cr.openjdk.java.net/~rraghavan/8213416/webrev.01/
>
>
> Thanks,
> Rahul

From ekaterina.pavlova at oracle.com Mon May 20 20:09:06 2019
From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova)
Date: Mon, 20 May 2019 13:09:06 -0700
Subject: RFR(XS) 8222482: [Graal] Update java-allocation-instrumenter.jar handling in graalunit README.md
Message-ID:

Hi All,

Please review the change which updates test/hotspot/jtreg/compiler/graalunit/README.md.
One more auxiliary script test/hotspot/jtreg/compiler/graalunit/downloadLibs.sh has been also added.

JBS: https://bugs.openjdk.java.net/browse/JDK-8222482
webrev: http://cr.openjdk.java.net/~epavlova/8222482/webrev.00/index.html

thanks,
-katya

From vladimir.kozlov at oracle.com Mon May 20 20:25:44 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 20 May 2019 13:25:44 -0700
Subject: RFR(XS) 8222482: [Graal] Update java-allocation-instrumenter.jar handling in graalunit README.md
In-Reply-To:
References:
Message-ID: <265BE516-B554-4CBD-B1DC-6A97C49E168D@oracle.com>

Okay.

Thanks
Vladimir

> On May 20, 2019, at 1:09 PM, Ekaterina Pavlova wrote:
>
> Hi All,
>
> Please review the change which updates test/hotspot/jtreg/compiler/graalunit/README.md.
> One more auxiliary script test/hotspot/jtreg/compiler/graalunit/downloadLibs.sh has been also added.
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8222482 > webrev: http://cr.openjdk.java.net/~epavlova/8222482/webrev.00/index.html > > thanks, > -katya From ekaterina.pavlova at oracle.com Mon May 20 20:28:21 2019 From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova) Date: Mon, 20 May 2019 13:28:21 -0700 Subject: RFR(XS) 8222482: [Graal] Update java-allocation-instrumenter.jar handling in graalunit README.md In-Reply-To: <265BE516-B554-4CBD-B1DC-6A97C49E168D@oracle.com> References: <265BE516-B554-4CBD-B1DC-6A97C49E168D@oracle.com> Message-ID: <79f43014-1526-dea3-b01a-61c0ee56ce33@oracle.com> Thanks Vladimir, I will wait for Aleksey's review as well. -katya On 5/20/19 1:25 PM, Vladimir Kozlov wrote: > Okay. > > Thanks > Vladimir > >> On May 20, 2019, at 1:09 PM, Ekaterina Pavlova wrote: >> >> Hi All, >> >> Please review the change which updates test/hotspot/jtreg/compiler/graalunit/README.md. >> One more auxiliary script test/hotspot/jtreg/compiler/graalunit/downloadLibs.sh has been also added. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8222482 >> webrev: http://cr.openjdk.java.net/~epavlova/8222482/webrev.00/index.html >> >> thanks, >> -katya > From gromero at linux.vnet.ibm.com Mon May 20 20:46:58 2019 From: gromero at linux.vnet.ibm.com (Gustavo Romero) Date: Mon, 20 May 2019 17:46:58 -0300 Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8158232 In-Reply-To: References: Message-ID: <81c12391-1406-948d-e8c5-9f66437a1b92@linux.vnet.ibm.com> Hi, Pushed to jdk8u-dev: http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/rev/39678a65a0e8 Thank you. Best regards, Gustavo On 05/14/2019 03:02 AM, Kazunori Ogata wrote: > Hi Gustavo, > > Thank you for the suggestion. I'll proceed to put the fix request comment > and tag in the original bug report. > > Thank you too for offering to sponsor this change. I'll let you know when > it's approved. 
> > > Regards, > Ogata > > > "Gustavo Romero" wrote on 2019/05/14 > 04:59:43: > >> From: "Gustavo Romero" >> To: Kazunori Ogata/Japan/IBM at IBMJP, > hotspot-compiler-dev at openjdk.java.net, >> jdk8u-dev at openjdk.java.net >> Date: 2019/05/14 04:59 >> Subject: Re: [8u-dev, ppc] RFR for (almost clean) backport of 8158232 >> >> Hi Ogata, >> >> Thanks for the backport and for the webrev. >> >> I understand that offset adjustments in general, and particularly for > this >> backport, are not considered a change that needs to be reviewed again. >> >> That said, and although I'm not a Reviewer, I tested it against SPECjvm > and >> microbenchmarks for byte, int, and long and reviewed the change for > jdk8u-dev. >> >> It looks good. >> >> Please, provide a "Fix Request" comment to the original bug explaining > that >> the backport is low risk and affects PPC64-only, accordingly to [1] and > [2]. >> Then please add the label "jdk8u-fix-request" to it. >> >> Once the approval to push is granted I'll sponsor the change. >> >> Thank you. >> >> Best regards, >> Gustavo >> >> [1] https://wiki.openjdk.java.net/display/jdk8u/Main >> [2] http://openjdk.java.net/projects/jdk-updates/approval.html >> >> On 05/10/2019 03:55 AM, Kazunori Ogata wrote: >>> Sorry, I forgot to put the links to the bug report and the original >>> changeset Also forgot to mention that this changeset is needed to >>> backport AES intrinsics support [1] on ppc64 big-endian. 
>>> >>> Bug report: >>> https://bugs.openjdk.java.net/browse/JDK-8158232 >>> >>> Original change set >>> http://hg.openjdk.java.net/jdk/jdk/rev/987528901b83 >>> >>> >>> Webrev: >>> http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8158232/webrev.02/ >>> >>> >>> Refs: >>> [1] https://bugs.openjdk.java.net/browse/JDK-8188868 >>> >>> >>> Regards, >>> Ogata >>> >>> "hotspot-compiler-dev" >>> wrote on 2019/05/10 15:30:05: >>> >>>> From: "Kazunori Ogata" >>>> To: hotspot-compiler-dev at openjdk.java.net, jdk8u-dev at openjdk.java.net >>>> Date: 2019/05/10 15:31 >>>> Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8158232 >>>> Sent by: "hotspot-compiler-dev" >>> >>>> >>>> Hi, >>>> >>>> May I get review for backport of 8158232: PPC64: improve byte, int > and >>>> long array copy stubs by using VSX instructions? >>>> >>>> This changeset looks no conflict with the latest jdk8u-dev code, but > the >>> >>>> patch command failed to apply it. It seems the patch command lost > the >>>> code regions to apply patches because stubGenerator_ppc.cpp has sets > of >>>> similar (but slightly different) functions. >>>> >>>> I created new webrev mainly to update line numbers in the patch file. > I >>> >>>> verified I can build fastdebug and release builds and there was no >>>> degradation in "make test" results. >>>> >>>> http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8158232/webrev.02/ >>>> >>>> Regards, >>>> Ogata >>>> >>>> >>> >>> > > From fujie at loongson.cn Mon May 20 20:50:26 2019 From: fujie at loongson.cn (Jie Fu) Date: Tue, 21 May 2019 04:50:26 +0800 Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached In-Reply-To: <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> Message-ID: Hi all, The updated webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.02/ The call site count is < 0 for a typecheck profile[1][2]. 
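A minimal stand-alone sketch of the check order, as read from the diff quoted further down in this message, may make the cases easier to see. It is not the HotSpot code: `classify_call_site` and `SiteState` are hypothetical names, the resolved/unresolved test is passed in as a plain boolean, and the early return for positive counts is an assumption (the final assert could not hold otherwise).

```cpp
#include <cassert>

// Hypothetical outcome names; the real function returns a bool.
enum class SiteState { Reached, NotReached, NeedsHeuristics };

// Simplified model of the checks around the patched lines.
// call_site_unresolved stands in for caller_method->is_not_reached(caller_bci).
SiteState classify_call_site(int profile_count, bool call_site_unresolved) {
  if (profile_count > 0) {
    return SiteState::Reached;      // assumption: positive counts are handled earlier
  }
  if (call_site_unresolved) {
    return SiteState::NotReached;   // call site not resolved
  }
  if (profile_count <= -1) {
    return SiteState::Reached;      // immature or typecheck profile; treat as reached
  }
  assert(profile_count == 0);       // only a zero count can get this far
  return SiteState::NeedsHeuristics;
}
```

With the earlier `== -1` test, a count such as -7 would have reached the assert; with `<= -1` every negative count returns before it.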
Please review it and give me some advice.

Thanks.
Best regards,
Jie

[1] http://hg.openjdk.java.net/jdk/jdk/file/46ae54c3026d/src/hotspot/share/ci/ciMethod.cpp#l515
[2] http://hg.openjdk.java.net/jdk/jdk/file/46ae54c3026d/src/hotspot/share/ci/ciMethod.cpp#l533

On 2019-05-20 23:36, Vladimir Kozlov wrote:
> Hi Jie
>
> Please, send updated webrev. It is confusing to which changes this should be applied.
>
> Thanks
> Vladimir
>
>> On May 20, 2019, at 6:21 AM, Jie Fu wrote:
>>
>> Hi all,
>>
>> A refinement since the profile.count() won't change during compilation.
>> ----------------------------------------------------
>> diff -r 13507abf416c src/hotspot/share/opto/bytecodeInfo.cpp
>> --- a/src/hotspot/share/opto/bytecodeInfo.cpp Sat May 18 15:42:21 2019 +0900
>> +++ b/src/hotspot/share/opto/bytecodeInfo.cpp Mon May 20 21:13:13 2019 +0800
>> @@ -334,8 +334,8 @@
>>    if (caller_method->is_not_reached(caller_bci)) {
>>      return true; // call site not resolved
>>    }
>> -  if (profile.count() == -1) {
>> -    return false; // immature profile; optimistically treat as reached
>> +  if (profile.count() <= -1) {
>> +    return false; // immature or typecheck profile; optimistically treat as reached
>>    }
>>    assert(profile.count() == 0, "sanity");
>>
>> ----------------------------------------------------
>> Please review this one and give me some advice.
>>
>> Thanks.
>> Best regards,
>> Jie
>>
>>> On 2019-05-20 18:01, Jie Fu wrote:
>>> Hi all,
>>>
>>> Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.01/
>>>
>>> In my previous patch, I had lost the case of typecheck profile[1].
>>> Please review and give me some advice.
>>>
>>> Thanks a lot.
>>> Best regards,
>>> Jie
>>>
>>> [1] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-May/033797.html
>>>
>>>> On 2019/5/20 4:12, Leonid Mesnik wrote:
>>>> The failure is still reproduced with patch. I attached full hs_err to the bug.
>>>> >>>> hs_err >>>> # >>>> # A fatal error has been detected by the Java Runtime Environment: >>>> # >>>> # Internal Error (open/src/hotspot/share/opto/bytecodeInfo.cpp:343), pid=3096, tid=3128 >>>> # assert(profile_count == 0) failed: sanity >>>> # >>>> # JRE version: Java(TM) SE Runtime Environment (13.0) (fastdebug build 13-internal+0-2019-05-18-0457052.lmesnik.null) >>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 13-internal+0-2019-05-18-0457052.lmesnik.null, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64) >>>> # Problematic frame: >>>> # V [libjvm.so+0x6cbf6c] InlineTree::is_not_reached(ciMethod*, ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc >>>> # >>>> # Core dump will be written. Default location: Core dumps may be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h" (or dumping to /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kit\ >>>> chensink14D_java/scratch/0/core.3096) >>>> # >>>> # If you would like to submit a bug report, please visit: >>>> # http://bugreport.java.com/bugreport/crash.jsp >>>> # >>>> >>>> --------------- S U M M A R Y ------------ >>>> >>>> Command Line: -Xbootclasspath/a:. 
-XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI -XX:MaxRAMPercentage=12 -XX:+DeoptimizeALot -XX:MaxRAMPercentage=50 -XX:+HeapDumpOnOutOfMemoryError -XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false -XX:+DisplayVMOutputToS\ >>>> tderr -XX:+UsePerfData -Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags -XX:+DisableExplicitGC -XX:+StartAttachListener -XX:NativeMemoryTracking=detail -XX:+FlightRecorder --add-exports=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAME\ >>>> D --add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED --add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED -Djava.io.tmpdir=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applicatio\ >>>> ns_kitchensink_Kitchensink14D_java/scratch/0/java.io.tmpdir -Duser.home=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/user.home -agentpath:/scratch/lmesnik/ws/ks-apps/build/\ >>>> linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so applications.kitchensink.process.stress.Main /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/kitchensink.fin\ >>>> al.properties >>>> >>>> Host: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz, 4 cores, 14G, Oracle Linux Server release 7.5 >>>> Time: Sun May 19 05:15:11 2019 PDT elapsed time: 111312 seconds (1d 6h 55m 12s) >>>> >>>> --------------- T H R E A D --------------- >>>> >>>> Current thread (0x00002ae4e83bc000): JavaThread "C2 CompilerThread0" daemon [_thread_in_native, id=3128, stack(0x00002ae522de2000,0x00002ae522ee3000)] >>>> >>>> >>>> Current CompileTask: >>>> C2:111312036 146944 4 spec.benchmarks.derby.DerbyHarness$Client::handleResultSet (77 bytes) >>>> >>>> Stack: [0x00002ae522de2000,0x00002ae522ee3000], sp=0x00002ae522edf650, free 
space=1013k
>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
>>>> V [libjvm.so+0x6cbf6c] InlineTree::is_not_reached(ciMethod*, ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc
>>>> V [libjvm.so+0x6d12e0] InlineTree::ok_to_inline(ciMethod*, JVMState*, ciCallProfile&, WarmCallInfo*, bool&)+0x1950
>>>> V [libjvm.so+0xb6e075] Compile::call_generator(ciMethod*, int, bool, JVMState*, bool, float, ciKlass*, bool, bool)+0x905
>>>> V [libjvm.so+0xb6f6b9] Parse::do_call()+0x469
>>>> V [libjvm.so+0x1441b70] Parse::do_one_bytecode()+0xff0
>>>> V [libjvm.so+0x1432520] Parse::do_one_block()+0x650
>>>> V [libjvm.so+0x1432a23] Parse::do_all_blocks()+0x113
>>>> V [libjvm.so+0x14348e4] Parse::Parse(JVMState*, ciMethod*, float)+0xc54
>>>> V [libjvm.so+0x803d0c] ParseGenerator::generate(JVMState*)+0x18c
>>>> V [libjvm.so+0x9c08b4] Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool, bool, DirectiveSet*)+0xe74
>>>> V [libjvm.so+0x801d9d] C2Compiler::compile_method(ciEnv*, ciMethod*, int, DirectiveSet*)+0x10d
>>>> V [libjvm.so+0x9cd17d] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x46d
>>>> V [libjvm.so+0x9ce1d8] CompileBroker::compiler_thread_loop()+0x418
>>>> V [libjvm.so+0x16c0baa] JavaThread::thread_main_inner()+0x26a
>>>> V [libjvm.so+0x16c9267] JavaThread::run()+0x227
>>>> V [libjvm.so+0x16c62f6] Thread::call_run()+0xf6
>>>> V [libjvm.so+0x13e0d5e] thread_native_entry(Thread*)+0x10e
>>>>
>>>> Leonid
>>>>
>>>>> On May 18, 2019, at 5:40 PM, Jie Fu wrote:
>>>>>
>>>>> Thanks Vladimir Ivanov and Vladimir Kozlov for your review.
>>>>> Let's wait for Leonid's test result.
>>>>>
>>>>> Thanks.
>>>>> Best regards,
>>>>> Jie
>>>>>
>>>>>> On 2019-05-19 00:15, Vladimir Kozlov wrote:
>>>>>> Hi Jie,
>>>>>>
>>>>>> So the counter was incremented while this code is executed. And you fixed it by caching initial value.
>>>>>> Looks good.
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>>> On 5/17/19 6:37 PM, Jie Fu wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8224162
>>>>>>> Webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.00/
>>>>>>>
>>>>>>> I'm sorry to introduce this assertion failure.
>>>>>>> Please review the suggested fix and give me some advice.
>>>>>>>
>>>>>>> Leonid, could you please help to test the patch?
>>>>>>> I don't have the reproducer you mentioned in the JBS.
>>>>>>>
>>>>>>> Thanks a lot.
>>>>>>> Best regards,
>>>>>>> Jie
>>>>>>>
>>>>>>>
>>

From daniel.daugherty at oracle.com Mon May 20 22:33:09 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 20 May 2019 18:33:09 -0400
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To: <9940a897-d49d-0a22-267d-6b78424a45c2@oracle.com>
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <9940a897-d49d-0a22-267d-6b78424a45c2@oracle.com>
Message-ID: <97a5f58b-1640-cf93-68ef-4fff6011fe95@oracle.com>

> This full inc from v2 (review + stress test):
> http://cr.openjdk.java.net/~rehn/8221734/v3/inc/

src/hotspot/share/aot/aotCompiledMethod.cpp
    No comments.

src/hotspot/share/aot/aotCompiledMethod.hpp
    No comments.

src/hotspot/share/code/codeCache.cpp
    L1145: // Deoptimize all(most) methods
        Yea... that comment and the name of the function need help even
        before your changes. Sigh... Maybe:

           // Mark methods for deopt (if safe or possible).

    L1151:     // Not installed are unsafe to mark for deopt, normally never deopted.
    L1152:     // A not_entrant method may become a zombie at any time,
    L1153:     // since we don't know on which side of last safepoint it became not_entrant
    L1154:     // (state must be in_use).
    L1155:     // Native method are unsafe to mark for deopt, normally never deopted.
    L1156:     if (!nm->method()->is_method_handle_intrinsic() &&
    L1157:         !nm->is_not_installed() &&
    L1158:         nm->is_in_use() &&
    L1159:         !nm->is_native_method()) {
    L1160:       nm->mark_for_deoptimization();
        Please consider replacing L1151-5 with the following comment after
        L1159. I prefer the comment for an if-statement to be inside the
        if-statement (but this is Compiler team code so they may have a
        different preference).

                 // Intrinsics and native methods are never deopted. A method that is
                 // not installed yet or is not in use is not safe to deopt; the
                 // is_in_use() check covers the not_entrant and not zombie cases.
                 // Note: A not_entrant method can become a zombie at anytime if it was
                 // made not_entrant before the previous safepoint/handshake.

    L1187:     // only_alive_and_not_unloading returns not_entrant nmethods.
    L1188:     // A not_entrant can become a zombie at anytime,
    L1189:     // if it was made not_entrant before previous safepoint/handshake.
    L1190:     // We check that it is not not_entrant and not zombie,
    L1191:     // by checking is_in_use().
    L1192:     if (nm->is_marked_for_deoptimization() && nm->is_in_use()) {
    L1193:       nm->make_not_entrant();
        Please consider replacing L1187-91 with the following comment after
        L1192 (also indented):

                 // only_alive_and_not_unloading() can return not_entrant nmethods.
                 // A not_entrant method can become a zombie at anytime if it was
                 // made not_entrant before the previous safepoint/handshake. The
                 // is_in_use() check covers the not_entrant and not zombie cases
                 // that have become true after the method was marked for deopt.

src/hotspot/share/code/compiledMethod.hpp
    No comments.

src/hotspot/share/code/nmethod.cpp
    No comments.

src/hotspot/share/oops/method.cpp
    L957:
// We need to check if both the _code and _from_compiled_code_entry_point ??? L958: ? // refer to this nmethod because there is a race in setting these two fields ??? L959: ? // in Method* as seen in bugid 4947125. ??? L960: ? // If the vep() points to the zombie nmethod, the memory for the nmethod ??? L961: ? // could be flushed and the compiler and vtable stubs could still call ??? L962: ? // through it. ??? L963: ? if (code() == compare || ??? L964: ????? from_compiled_entry() == compare->verified_entry_point()) { ??????? Now that you've moved the comment closer to the code, I can see a ??????? disconnect between the comment and the code. The comment: ??????????? // ... check if both the _code and _from_compiled_code_entry_point ??? ? ? ? ? // refer to this nmethod ??????? The code: ??????????? code() == compare || ??? ? ????? from_compiled_entry() == compare->verified_entry_point() ??????? So the comment is "both" "and" and the the code is "||". One of ??????? these is not right. src/hotspot/share/oops/method.hpp ??? No comments. src/hotspot/share/prims/whitebox.cpp ??? No comments. src/hotspot/share/runtime/biasedLocking.cpp ??? No comments. src/hotspot/share/runtime/biasedLocking.hpp ??? No comments. src/hotspot/share/runtime/deoptimization.cpp ??? L1297: void Deoptimization::revoke_safepoint(JavaThread* thread, frame fr, RegisterMap* map) { ??????? Perhaps revoke_using_safepoint(). ??? L1311: void Deoptimization::revoke_handshake(JavaThread* thread, frame fr, RegisterMap* map) { ??????? Perhaps revoke_using_handshake(). ??? old L1325: ??? ObjectSynchronizer::inflate(thread, obj, ObjectSynchronizer::inflate_cause_vm_internal); ??????? Why was this deleted? ??????? Update: See synchronizer.cpp below. src/hotspot/share/runtime/deoptimization.hpp ??? No comments. src/hotspot/share/runtime/synchronizer.cpp ??? Deleting ObjectSynchronizer::inflate() in Deoptimization::inflate_monitors_handshake() ??? 
which is now Deoptimization::revoke_handshake() allows these changes ??? to be reverted... test/hotspot/jtreg/compiler/codecache/stress/UnexpectedDeoptimizationAllTest.java ??? L2:? * Copyright (c) 2014, 2016, Oracle and/or its affiliates. All rights reserved. ??????? This is listed as a new file, but it has an old (and dual) copyright year. ??? Do any of the options used in the test require non-product bits? ??? (I don't think so, but...) Dan On 5/15/19 2:26 AM, Robbin Ehn wrote: > Hi, please see this update. > > I think I got all review comments fix. > > Long story short, I was concerned about test coverage, so I added a > stress test > using the WB, which sometimes crashed in rubbish code. > > There are two bugs in the methods used by WB_DeoptimizeAll. > (Seems I'm the first user) > > CodeCache::mark_all_nmethods_for_deoptimization(); > When iterating the nmethods we could see the methods being create in: > void AdapterHandlerLibrary::create_native_wrapper(const methodHandle& > method) > And deopt the method when it was in use or before. > Native wrappers are suppose to live as long as the class. > I filtered out not_installed and native methods. > > Deoptimization::deoptimize_all_marked(); > The issue is that a not_entrant method can go to zombie at anytime. > There are several ways to make a nmethod not go zombie: nmethodLocker, > have it > on stack, avoid safepoint poll in some states, etc.., which is also > depending on > what type of nmethod. > The iterator only_alive_and_not_unloading returns not_entrant > nmethods, but we > don't know there state prior last poll. > in_use -> not_entrant -> #poll# -> not_entrant -> zombie > If the iterator returns the nmethod after we passed the poll it can > still be > not_entrant but go zombie. > The problem happens when a second thread marks a method for deopt and > makes it > not_entrant. 
Then after a poll we end-up in deoptimize_all_marked(), > but the > method is not yet a zombie, so the iterator returns it, it becomes a > zombie thus > pass the if check and later hit the assert. > So there is a race between the iterator check of state and > if-statement check of > state. Fixed by also filtering out zombies. > > If the stress test with correction of the bugs causes trouble in > review, I can > do a follow-up with the stress test separately. > > Good news, no issues found with deopt with handshakes. > > This is v3: > http://cr.openjdk.java.net/~rehn/8221734/v3/webrev/ > > This full inc from v2 (review + stress test): > http://cr.openjdk.java.net/~rehn/8221734/v3/inc/ > > This inc is the review part from v2: > http://cr.openjdk.java.net/~rehn/8221734/v3/inc_review/ > > This inc is the additional stress test with bug fixes: > http://cr.openjdk.java.net/~rehn/8221734/v3/inc_test/ > > Additional biased locking change: > The original code use same copy of markOop in revoke_and_rebias. > The keep same behavior I now pass in that copy into fast_revoke. > > The stress test passes hundreds of iterations in mach5. > Thousands stress tests locally, the issues above was reproduce-able. > Inc changes also passes t1-5. > > As usual with this change-set, I'm continuously running more test. > > Thanks, Robbin > > On 2019-04-25 14:05, Robbin Ehn wrote: >> Hi all, please review. >> >> Let's deopt with handshakes. >> Removed VM op Deoptimize, instead we handshake. >> Locks needs to be inflate since we are not in a safepoint. >> >> Goes on top of: >> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html >> >> >> Code: >> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8221734 >> >> Passes t1-7 and multiple t1-5 runs. >> >> A few startup benchmark see a small speedup. 
>> >> Thanks, Robbin From daniel.daugherty at oracle.com Mon May 20 22:49:11 2019 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 20 May 2019 18:49:11 -0400 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: <45fae2e7-b1ee-faca-9333-fc921d8befae@oracle.com> References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <45fae2e7-b1ee-faca-9333-fc921d8befae@oracle.com> Message-ID: <79a2eb2b-6546-6d6e-ad48-46d7ddc66002@oracle.com> On 5/20/19 5:04 AM, Robbin Ehn wrote: > Hi all, please see this update v4. > > I have fixed the simplification Patricio talked about and David's nit. > > The interesting part is now the full diff of bias locking cpp file: > http://cr.openjdk.java.net/~rehn/8221734/v4/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html > > It's very clean. > > Full: > http://cr.openjdk.java.net/~rehn/8221734/v4/ > Inc: > http://cr.openjdk.java.net/~rehn/8221734/v4/inc/ src/hotspot/share/runtime/biasedLocking.cpp ??? L640: ? assert(mark->biased_locker() == THREAD && ??? L641: ??????????? prototype_header->bias_epoch() == mark->bias_epoch(), "Revoke failed, unhandled biased lock state"); ??????? nit - please reduce L641 indent by 3 spaces. ??? L698: ??????? assert(THREAD->is_Java_thread(), ""); ??????? nit - s/""/"must be a JavaThread"/ src/hotspot/share/runtime/biasedLocking.hpp ??? No comment. src/hotspot/share/runtime/deoptimization.cpp ??? L1321: ??? markOop mark = obj->mark(); ??????? Is now unused (which is good since it could get out of sync ??????? with the one fetched in revoke_own_locks_in_handshake()). test/hotspot/jtreg/compiler/codecache/stress/UnexpectedDeoptimizationAllTest.java ??? No comments. No need to see a new webrev if you decide to fix the bits. Dan > > I have seen no issues in T1-7, KS and other assorted testing. > > Thanks, Robbin > > > On 2019-04-25 14:05, Robbin Ehn wrote: >> Hi all, please review. >> >> Let's deopt with handshakes. >> Removed VM op Deoptimize, instead we handshake. 
>> Locks needs to be inflate since we are not in a safepoint. >> >> Goes on top of: >> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html >> >> >> Code: >> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8221734 >> >> Passes t1-7 and multiple t1-5 runs. >> >> A few startup benchmark see a small speedup. >> >> Thanks, Robbin From vladimir.kozlov at oracle.com Tue May 21 00:48:47 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 20 May 2019 17:48:47 -0700 Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached In-Reply-To: References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> Message-ID: Okay, this explains failure I think and yes ciCallProfile is copy of original profiling information and should not change during compilation. Good. Leonid, please test this one line fix. Thanks, Vladimir On 5/20/19 1:50 PM, Jie Fu wrote: > Hi all, > > The updated webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.02/ > > The call site count is < 0 for a typecheck profile[1][2]. > Please review it and give me some advice. > > Thanks. > Best regards, > Jie > > [1] http://hg.openjdk.java.net/jdk/jdk/file/46ae54c3026d/src/hotspot/share/ci/ciMethod.cpp#l515 > [2] http://hg.openjdk.java.net/jdk/jdk/file/46ae54c3026d/src/hotspot/share/ci/ciMethod.cpp#l533 > > > On 2019?05?20? 23:36, Vladimir Kozlov wrote: >> Hi Jie >> >> Please, send updated webrev. It is confusing to which changes this should be applied. >> >> Thanks >> Vladimir >> >>> On May 20, 2019, at 6:21 AM, Jie Fu wrote: >>> >>> Hi all, >>> >>> A refinement since the profile.count() won't change during compilation. >>> ---------------------------------------------------- >>> diff -r 13507abf416c src/hotspot/share/opto/bytecodeInfo.cpp >>> --- a/src/hotspot/share/opto/bytecodeInfo.cpp?? 
Sat May 18 15:42:21 2019 +0900
>>> +++ b/src/hotspot/share/opto/bytecodeInfo.cpp   Mon May 20 21:13:13 2019 +0800
>>> @@ -334,8 +334,8 @@
>>>    if (caller_method->is_not_reached(caller_bci)) {
>>>      return true; // call site not resolved
>>>    }
>>> -  if (profile.count() == -1) {
>>> -    return false; // immature profile; optimistically treat as reached
>>> +  if (profile.count() <= -1) {
>>> +    return false; // immature or typecheck profile; optimistically treat as reached
>>>    }
>>>    assert(profile.count() == 0, "sanity");
>>>
>>> ----------------------------------------------------
>>> Please review this one and give me some advice.
>>>
>>> Thanks.
>>> Best regards,
>>> Jie
>>>
>>>> On 2019-05-20 18:01, Jie Fu wrote:
>>>> Hi all,
>>>>
>>>> Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.01/
>>>>
>>>> In my previous patch, I had lost the case of typecheck profile[1].
>>>> Please review and give me some advice.
>>>>
>>>> Thanks a lot.
>>>> Best regards,
>>>> Jie
>>>>
>>>> [1] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-May/033797.html
>>>>
>>>>> On 2019/5/20 ??4:12, Leonid Mesnik wrote:
>>>>> The failure is still reproduced with patch. I attached full hs_err to the bug.
>>>>>
>>>>> hs_err
>>>>> #
>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>> #
>>>>> #  Internal Error (open/src/hotspot/share/opto/bytecodeInfo.cpp:343), pid=3096, tid=3128
>>>>> #  assert(profile_count == 0) failed: sanity
>>>>> #
>>>>> # JRE version: Java(TM) SE Runtime Environment (13.0) (fastdebug build 13-internal+0-2019-05-18-0457052.lmesnik.null)
>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 13-internal+0-2019-05-18-0457052.lmesnik.null, mixed mode,
>>>>> sharing, tiered, compressed oops, g1 gc, linux-amd64)
>>>>> # Problematic frame:
>>>>> # V  [libjvm.so+0x6cbf6c]
InlineTree::is_not_reached(ciMethod*, ciMethod*, int, ciCallProfile&) [clone >>>>> .constprop.153]+0xbc >>>>> # >>>>> # Core dump will be written. Default location: Core dumps may be processed with "/usr/libexec/abrt-hook-ccpp %s %c >>>>> %p %u %g %t e %P %I %h" (or dumping to >>>>> /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kit\ >>>>> chensink14D_java/scratch/0/core.3096) >>>>> # >>>>> # If you would like to submit a bug report, please visit: >>>>> #?? http://bugreport.java.com/bugreport/crash.jsp >>>>> # >>>>> >>>>> ---------------? S U M M A R Y ------------ >>>>> >>>>> Command Line: -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI -XX:MaxRAMPercentage=12 >>>>> -XX:+DeoptimizeALot -XX:MaxRAMPercentage=50 -XX:+HeapDumpOnOutOfMemoryError -XX:+CrashOnOutOfMemoryError >>>>> -Djava.net.preferIPv6Addresses=false -XX:+DisplayVMOutputToS\ >>>>> tderr -XX:+UsePerfData -Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags -XX:+DisableExplicitGC >>>>> -XX:+StartAttachListener -XX:NativeMemoryTracking=detail -XX:+FlightRecorder >>>>> --add-exports=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAME\ >>>>> D --add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED >>>>> --add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED >>>>> -Djava.io.tmpdir=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applicatio\ >>>>> ns_kitchensink_Kitchensink14D_java/scratch/0/java.io.tmpdir >>>>> -Duser.home=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/user.home >>>>> -agentpath:/scratch/lmesnik/ws/ks-apps/build/\ >>>>> linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so applications.kitchensink.process.stress.Main >>>>> 
/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/kitchensink.fin\ >>>>> >>>>> al.properties >>>>> >>>>> Host: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz, 4 cores, 14G, Oracle Linux Server release 7.5 >>>>> Time: Sun May 19 05:15:11 2019 PDT elapsed time: 111312 seconds (1d 6h 55m 12s) >>>>> >>>>> ---------------? T H R E A D? --------------- >>>>> >>>>> Current thread (0x00002ae4e83bc000):? JavaThread "C2 CompilerThread0" daemon [_thread_in_native, id=3128, >>>>> stack(0x00002ae522de2000,0x00002ae522ee3000)] >>>>> >>>>> >>>>> Current CompileTask: >>>>> C2:111312036 146944?????? 4 spec.benchmarks.derby.DerbyHarness$Client::handleResultSet (77 bytes) >>>>> >>>>> Stack: [0x00002ae522de2000,0x00002ae522ee3000], sp=0x00002ae522edf650,? free space=1013k >>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code) >>>>> V? [libjvm.so+0x6cbf6c]? InlineTree::is_not_reached(ciMethod*, ciMethod*, int, ciCallProfile&) [clone >>>>> .constprop.153]+0xbc >>>>> V? [libjvm.so+0x6d12e0]? InlineTree::ok_to_inline(ciMethod*, JVMState*, ciCallProfile&, WarmCallInfo*, bool&)+0x1950 >>>>> V? [libjvm.so+0xb6e075]? Compile::call_generator(ciMethod*, int, bool, JVMState*, bool, float, ciKlass*, bool, >>>>> bool)+0x905 >>>>> V? [libjvm.so+0xb6f6b9]? Parse::do_call()+0x469 >>>>> V? [libjvm.so+0x1441b70]? Parse::do_one_bytecode()+0xff0 >>>>> V? [libjvm.so+0x1432520]? Parse::do_one_block()+0x650 >>>>> V? [libjvm.so+0x1432a23]? Parse::do_all_blocks()+0x113 >>>>> V? [libjvm.so+0x14348e4]? Parse::Parse(JVMState*, ciMethod*, float)+0xc54 >>>>> V? [libjvm.so+0x803d0c] ParseGenerator::generate(JVMState*)+0x18c >>>>> V? [libjvm.so+0x9c08b4]? Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool, bool, DirectiveSet*)+0xe74 >>>>> V? [libjvm.so+0x801d9d]? C2Compiler::compile_method(ciEnv*, ciMethod*, int, DirectiveSet*)+0x10d >>>>> V? 
[libjvm.so+0x9cd17d] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x46d >>>>> V? [libjvm.so+0x9ce1d8] CompileBroker::compiler_thread_loop()+0x418 >>>>> V? [libjvm.so+0x16c0baa]? JavaThread::thread_main_inner()+0x26a >>>>> V? [libjvm.so+0x16c9267]? JavaThread::run()+0x227 >>>>> V? [libjvm.so+0x16c62f6]? Thread::call_run()+0xf6 >>>>> V? [libjvm.so+0x13e0d5e]? thread_native_entry(Thread*)+0x10e >>>>> >>>>> Leonid >>>>> >>>>>> On May 18, 2019, at 5:40 PM, Jie Fu wrote: >>>>>> >>>>>> Thanks Vladimir Ivanov and Vladimir Kozlov for your review. >>>>>> Let's wait for Leonid's test result. >>>>>> >>>>>> Thanks. >>>>>> Best regards, >>>>>> Jie >>>>>> >>>>>>> On 2019?05?19? 00:15, Vladimir Kozlov wrote: >>>>>>> Hi Jie, >>>>>>> >>>>>>> So the counter was incremented while this code is executed. And you fixed it by caching initial value. >>>>>>> Looks good. >>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>>> On 5/17/19 6:37 PM, Jie Fu wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> JBS:??? https://bugs.openjdk.java.net/browse/JDK-8224162 >>>>>>>> Webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.00/ >>>>>>>> >>>>>>>> I'm sorry to introduce this assertion failure. >>>>>>>> Please review the suggested fix and give me some advice. >>>>>>>> >>>>>>>> Leonid, could you please help to test the patch? >>>>>>>> I don't have the reproducer you mentioned in the JBS. >>>>>>>> >>>>>>>> Thanks a lot. >>>>>>>> Best regards, >>>>>>>> Jie >>>>>>>> >>>>>>>> >>> > > From robbin.ehn at oracle.com Tue May 21 09:39:31 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 21 May 2019 11:39:31 +0200 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: <97a5f58b-1640-cf93-68ef-4fff6011fe95@oracle.com> References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <9940a897-d49d-0a22-267d-6b78424a45c2@oracle.com> <97a5f58b-1640-cf93-68ef-4fff6011fe95@oracle.com> Message-ID: Hi Dan, On 2019-05-21 00:33, Daniel D. 
Daugherty wrote: > src/hotspot/share/code/codeCache.cpp > ??? L1145: // Deoptimize all(most) methods > ??????? Yea... that comment and the name of the function need help even > ??????? before your changes. Sigh... Maybe: > > ?????????? // Mark methods for deopt (if safe or possible). Changed. > > ??? L1151: ??? // Not installed are unsafe to mark for deopt, normally never > deopted. > ??? L1152: ??? // A not_entrant method may become a zombie at any time, > ??? L1153: ??? // since we don't know on which side of last safepoint it became > not_entrant > ??? L1154: ??? // (state must be in_use). > ??? L1155: ??? // Native method are unsafe to mark for deopt, normally never > deopted. > ??? L1156: ??? if (!nm->method()->is_method_handle_intrinsic() && > ??? L1157: ??????? !nm->is_not_installed() && > ??? L1158: ??????? nm->is_in_use() && > ??? L1159: ??????? !nm->is_native_method()) { > ??? L1160: ????? nm->mark_for_deoptimization(); > ??????? Please consider replacing L1151-5 with the following comment after > ??????? L1159. I prefer the comment for an if-statement to be inside the > ??????? if-statement (but this is Compiler team code so they may have a > ??????? different preference). > > ???????????????? // Intrinsics and native methods are never deopted. A method > that is > ???????????????? // not installed yet or is not in use is not safe to deopt; the > ???????????????? // is_in_use() check covers the not_entrant and not zombie cases. > ???????????????? // Note: A not_entrant method can become a zombie at anytime > if it was > ???????????????? // made not_entrant before the previous safepoint/handshake. Changed. > > ??? L1187: ??? // only_alive_and_not_unloading returns not_entrant nmethods. > ??? L1188: ??? // A not_entrant can become a zombie at anytime, > ??? L1189: ??? // if it was made not_entrant before previous safepoint/handshake. > ??? L1190: ??? // We check that it is not not_entrant and not zombie, > ??? L1191: ??? // by checking is_in_use(). > ??? 
L1192: ??? if (nm->is_marked_for_deoptimization() && nm->is_in_use()) { > ??? L1193: ????? nm->make_not_entrant(); > ??????? Please consider replacing L1187-91 with the following comment after > ??????? L1192 (also indented): > ???????????????? // only_alive_and_not_unloading() can return not_entrant > nmethods. > ???????????????? // A not_entrant method can become a zombie at anytime if it was > ???????????????? // made not_entrant before the previous safepoint/handshake. The > ???????????????? // is_in_use() check covers the not_entrant and not zombie cases > ???????????????? // that have become true after the method was marked for deopt. Changed. > > src/hotspot/share/code/compiledMethod.hpp > ??? No comments. > > src/hotspot/share/code/nmethod.cpp > ??? No comments. > > src/hotspot/share/oops/method.cpp > ??? L957: ? // We need to check if both the _code and > _from_compiled_code_entry_point > ??? L958: ? // refer to this nmethod because there is a race in setting these > two fields > ??? L959: ? // in Method* as seen in bugid 4947125. > ??? L960: ? // If the vep() points to the zombie nmethod, the memory for the > nmethod > ??? L961: ? // could be flushed and the compiler and vtable stubs could still call > ??? L962: ? // through it. > ??? L963: ? if (code() == compare || > ??? L964: ????? from_compiled_entry() == compare->verified_entry_point()) { > ??????? Now that you've moved the comment closer to the code, I can see a > ??????? disconnect between the comment and the code. The comment: > > ??????????? // ... check if both the _code and _from_compiled_code_entry_point > ??? ? ? ? ? // refer to this nmethod > > ??????? The code: > > ??????????? code() == compare || > ??? ? ????? from_compiled_entry() == compare->verified_entry_point() > > ??????? So the comment is "both" "and" and the the code is "||". One of > ??????? these is not right. 
The comment is wrong; the check is an "or". Changed to:
// We need to check if either the _code or _from_compiled_code_entry_point

This is the original:
// We need to check if both the _code and _from_compiled_code_entry_point
...
if (method() != NULL && (method()->code() == this ||
                         method()->from_compiled_entry() == verified_entry_point())) {

>
> src/hotspot/share/oops/method.hpp
>     No comments.
>
> src/hotspot/share/prims/whitebox.cpp
>     No comments.
>
> src/hotspot/share/runtime/biasedLocking.cpp
>     No comments.
>
> src/hotspot/share/runtime/biasedLocking.hpp
>     No comments.
>
> src/hotspot/share/runtime/deoptimization.cpp
>     L1297: void Deoptimization::revoke_safepoint(JavaThread* thread, frame fr,
> RegisterMap* map) {
>         Perhaps revoke_using_safepoint().

Changed.

>
>     L1311: void Deoptimization::revoke_handshake(JavaThread* thread, frame fr,
> RegisterMap* map) {
>         Perhaps revoke_using_handshake().

Changed.

>
>     old L1325:     ObjectSynchronizer::inflate(thread, obj,
> ObjectSynchronizer::inflate_cause_vm_internal);
>         Why was this deleted?
>         Update: See synchronizer.cpp below.
>
> src/hotspot/share/runtime/deoptimization.hpp
>     No comments.
>
> src/hotspot/share/runtime/synchronizer.cpp
>     Deleting ObjectSynchronizer::inflate() in
> Deoptimization::inflate_monitors_handshake()
>     which is now Deoptimization::revoke_handshake() allows these changes
>     to be reverted...

Yes, the JavaThread will inflate, if needed, when unpacking the stack in
preparation for the interpreter.

>
> test/hotspot/jtreg/compiler/codecache/stress/UnexpectedDeoptimizationAllTest.java
>     L2:  * Copyright (c) 2014, 2016, Oracle and/or its affiliates. All rights
> reserved.
>         This is listed as a new file, but it has an old (and dual) copyright year.

Fixed.

>
>     Do any of the options used in the test require non-product bits?
>     (I don't think so, but...)

I don't know if WB works with a release build.

Thanks!
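As an aside on the comment fix at the top of this reply, here is a minimal sketch of why the check has to be an "or". It is not the HotSpot code: FakeMethod, FakeNMethod and refers_to() are hypothetical names made up for illustration. The only thing assumed from the thread is that the two fields are set one after the other, so there can be a window where just one of them still points at the nmethod.

```cpp
#include <cassert>

// Hypothetical stand-in for an nmethod: only its verified entry point matters here.
struct FakeNMethod {
  const void* verified_entry;
};

// Hypothetical stand-in for Method*: two fields that are updated separately.
struct FakeMethod {
  const FakeNMethod* code;            // plays the role of _code
  const void* from_compiled_entry;    // plays the role of _from_compiled_code_entry_point

  // "Either/or" check: the method still refers to 'compare' if at least one
  // of the two fields points at it. An "and" here would miss the window in
  // which only one field has been updated.
  bool refers_to(const FakeNMethod* compare) const {
    return code == compare || from_compiled_entry == compare->verified_entry;
  }
};
```

With both fields set, refers_to() is true; after clearing only code it is still true, which is the case the corrected comment describes; it becomes false only once both fields have moved on.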
/Robbin > > Dan > > > On 5/15/19 2:26 AM, Robbin Ehn wrote: >> Hi, please see this update. >> >> I think I got all review comments fix. >> >> Long story short, I was concerned about test coverage, so I added a stress test >> using the WB, which sometimes crashed in rubbish code. >> >> There are two bugs in the methods used by WB_DeoptimizeAll. >> (Seems I'm the first user) >> >> CodeCache::mark_all_nmethods_for_deoptimization(); >> When iterating the nmethods we could see the methods being create in: >> void AdapterHandlerLibrary::create_native_wrapper(const methodHandle& method) >> And deopt the method when it was in use or before. >> Native wrappers are suppose to live as long as the class. >> I filtered out not_installed and native methods. >> >> Deoptimization::deoptimize_all_marked(); >> The issue is that a not_entrant method can go to zombie at anytime. >> There are several ways to make a nmethod not go zombie: nmethodLocker, have it >> on stack, avoid safepoint poll in some states, etc.., which is also depending on >> what type of nmethod. >> The iterator only_alive_and_not_unloading returns not_entrant nmethods, but we >> don't know there state prior last poll. >> in_use -> not_entrant -> #poll# -> not_entrant -> zombie >> If the iterator returns the nmethod after we passed the poll it can still be >> not_entrant but go zombie. >> The problem happens when a second thread marks a method for deopt and makes it >> not_entrant. Then after a poll we end-up in deoptimize_all_marked(), but the >> method is not yet a zombie, so the iterator returns it, it becomes a zombie thus >> pass the if check and later hit the assert. >> So there is a race between the iterator check of state and if-statement check of >> state. Fixed by also filtering out zombies. >> >> If the stress test with correction of the bugs causes trouble in review, I can >> do a follow-up with the stress test separately. >> >> Good news, no issues found with deopt with handshakes. 
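The two filters described in the quoted text above can be sketched as follows. This is a toy model under stated assumptions, not the patch itself: State, FakeNMethod, mark_all_for_deopt() and deoptimize_all_marked() are hypothetical names, the state machine is reduced to the four states mentioned in the thread, and the sketch models only the filtering, not the safepoint/handshake protocol that makes the is_in_use() check sufficient in the VM.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical, simplified nmethod life cycle as described in the thread:
// in_use -> not_entrant -> (poll) -> zombie.
enum class State { not_installed, in_use, not_entrant, zombie };

struct FakeNMethod {
  std::atomic<State> state{State::in_use};
  bool marked_for_deopt = false;
  bool is_native = false;

  bool is_in_use() const { return state.load() == State::in_use; }

  // Models what the iterator lets through: in_use and not_entrant.
  bool is_alive() const {
    State s = state.load();
    return s == State::in_use || s == State::not_entrant;
  }
};

// First fix: only mark methods that are installed, in use, and not native.
// Checking is_in_use() also rejects not_installed, not_entrant and zombie.
inline int mark_all_for_deopt(std::vector<FakeNMethod*>& methods) {
  int marked = 0;
  for (FakeNMethod* nm : methods) {
    if (nm->is_in_use() && !nm->is_native) {
      nm->marked_for_deopt = true;
      ++marked;
    }
  }
  return marked;
}

// Second fix: the iterator may hand back a not_entrant method that is about
// to become a zombie, so the transition is attempted only for in_use methods.
inline int deoptimize_all_marked(std::vector<FakeNMethod*>& methods) {
  int transitioned = 0;
  for (FakeNMethod* nm : methods) {
    if (!nm->is_alive()) continue;  // iterator-level filter
    if (nm->marked_for_deopt && nm->is_in_use()) {
      nm->state.store(State::not_entrant);
      ++transitioned;
    }
  }
  return transitioned;
}
```

In this model a method that another thread has already marked and made not_entrant is simply skipped by deoptimize_all_marked(), so a later change to zombie cannot reach the transition code.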
>> >> This is v3: >> http://cr.openjdk.java.net/~rehn/8221734/v3/webrev/ >> >> This full inc from v2 (review + stress test): >> http://cr.openjdk.java.net/~rehn/8221734/v3/inc/ >> >> This inc is the review part from v2: >> http://cr.openjdk.java.net/~rehn/8221734/v3/inc_review/ >> >> This inc is the additional stress test with bug fixes: >> http://cr.openjdk.java.net/~rehn/8221734/v3/inc_test/ >> >> Additional biased locking change: >> The original code use same copy of markOop in revoke_and_rebias. >> The keep same behavior I now pass in that copy into fast_revoke. >> >> The stress test passes hundreds of iterations in mach5. >> Thousands stress tests locally, the issues above was reproduce-able. >> Inc changes also passes t1-5. >> >> As usual with this change-set, I'm continuously running more test. >> >> Thanks, Robbin >> >> On 2019-04-25 14:05, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Let's deopt with handshakes. >>> Removed VM op Deoptimize, instead we handshake. >>> Locks needs to be inflate since we are not in a safepoint. >>> >>> Goes on top of: >>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html >>> >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8221734 >>> >>> Passes t1-7 and multiple t1-5 runs. >>> >>> A few startup benchmark see a small speedup. >>> >>> Thanks, Robbin > From robbin.ehn at oracle.com Tue May 21 09:48:24 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 21 May 2019 11:48:24 +0200 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: <79a2eb2b-6546-6d6e-ad48-46d7ddc66002@oracle.com> References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <45fae2e7-b1ee-faca-9333-fc921d8befae@oracle.com> <79a2eb2b-6546-6d6e-ad48-46d7ddc66002@oracle.com> Message-ID: Hi Dan. On 2019-05-21 00:49, Daniel D. 
Daugherty wrote:
> On 5/20/19 5:04 AM, Robbin Ehn wrote:
>> Hi all, please see this update v4.
>>
>> I have fixed the simplification Patricio talked about and David's nit.
>>
>> The interesting part is now the full diff of bias locking cpp file:
>> http://cr.openjdk.java.net/~rehn/8221734/v4/webrev/src/hotspot/share/runtime/biasedLocking.cpp.sdiff.html
>>
>> It's very clean.
>>
>> Full:
>> http://cr.openjdk.java.net/~rehn/8221734/v4/
>> Inc:
>> http://cr.openjdk.java.net/~rehn/8221734/v4/inc/
>
> src/hotspot/share/runtime/biasedLocking.cpp
>     L640:   assert(mark->biased_locker() == THREAD &&
>     L641:             prototype_header->bias_epoch() == mark->bias_epoch(),
> "Revoke failed, unhandled biased lock state");
>         nit - please reduce L641 indent by 3 spaces.

Fixed.

>
>     L698:         assert(THREAD->is_Java_thread(), "");
>         nit - s/""/"must be a JavaThread"/

This method is restored, so I have no changes here.
I will skip adding the comment because Patricio's biased locking changeset will
apply cleanly if I leave this method untouched. So we fix this assert in his
changeset instead.

>
> src/hotspot/share/runtime/biasedLocking.hpp
>     No comment.
>
> src/hotspot/share/runtime/deoptimization.cpp
>     L1321:     markOop mark = obj->mark();
>         Is now unused (which is good since it could get out of sync
>         with the one fetched in revoke_own_locks_in_handshake()).

Removed.

>
> test/hotspot/jtreg/compiler/codecache/stress/UnexpectedDeoptimizationAllTest.java
>     No comments.
>
> No need to see a new webrev if you decide to fix the bits.

Thanks Dan! I'll send out a v6.

/Robbin

>
> Dan
>
>>
>> I have seen no issues in T1-7, KS and other assorted testing.
>>
>> Thanks, Robbin
>>
>>
>> On 2019-04-25 14:05, Robbin Ehn wrote:
>>> Hi all, please review.
>>>
>>> Let's deopt with handshakes.
>>> Removed VM op Deoptimize, instead we handshake.
>>> Locks needs to be inflate since we are not in a safepoint.
>>> >>> Goes on top of: >>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html >>> >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8221734 >>> >>> Passes t1-7 and multiple t1-5 runs. >>> >>> A few startup benchmark see a small speedup. >>> >>> Thanks, Robbin > From robbin.ehn at oracle.com Tue May 21 10:27:56 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Tue, 21 May 2019 12:27:56 +0200 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> Message-ID: Hi all, please this update. Dean do you have any more comments? For some reason webrev don't shows whitespace fix in biasedLocking.cpp. But is clearly visible in: http://cr.openjdk.java.net/~rehn/8221734/v5/inc/webrev/open.patch Inc: http://cr.openjdk.java.net/~rehn/8221734/v5/inc/webrev/ Full: http://cr.openjdk.java.net/~rehn/8221734/v5/webrev/ Thanks, Robbin On 2019-04-25 14:05, Robbin Ehn wrote: > Hi all, please review. > > Let's deopt with handshakes. > Removed VM op Deoptimize, instead we handshake. > Locks needs to be inflate since we are not in a safepoint. > > Goes on top of: > https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html > > Code: > http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html > Issue: > https://bugs.openjdk.java.net/browse/JDK-8221734 > > Passes t1-7 and multiple t1-5 runs. > > A few startup benchmark see a small speedup. 
> > Thanks, Robbin From tobias.hartmann at oracle.com Tue May 21 10:38:12 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 21 May 2019 12:38:12 +0200 Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output In-Reply-To: <3A11FD93-C2E0-48BE-B927-D718D58B50C6@sap.com> References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com> <2ffc4c9c-91cb-2d04-e03e-6620d4443034@oracle.com> <06ce086b-43f9-b570-8b97-55c7c14745a0@oracle.com> <29AEA376-56DC-47A5-8935-9EE700C6345E@sap.com> <8b68b45b-acf5-daf2-be94-0bacac917aac@oracle.com> <7938CE23-798C-4484-9A75-BFC40C297FDD@sap.com> <37865A98-264E-4775-9366-194A804AE5AA@oracle.com> <3A11FD93-C2E0-48BE-B927-D718D58B50C6@sap.com> Message-ID: Hi Lutz, this looks good to me. Best regards, Tobias On 20.05.19 10:22, Schmidt, Lutz wrote: > Hi Vladimir, > > I appreciate the time you spent making this change "push-worthy". Thank you for reviewing it! > > I am now hoping for a second reviewer to set aside some time and have a look. > > Thanks, > Lutz > > ?On 17.05.19, 16:48, "Vladimir Kozlov" wrote: > > Good. 
> > Thanks > Vladimir > > > On May 17, 2019, at 6:19 AM, Schmidt, Lutz wrote: > > > > Hi Vladimir, > > here is what I changed to overcome the ZERO build issues: > > > > ----------- 8< ---------------- > > diff -r c0568c492760 src/hotspot/cpu/zero/assembler_zero.hpp > > --- a/src/hotspot/cpu/zero/assembler_zero.hpp Fri May 17 14:11:44 2019 +0200 > > +++ b/src/hotspot/cpu/zero/assembler_zero.hpp Fri May 17 14:14:06 2019 +0200 > > @@ -37,6 +37,12 @@ > > > > public: > > void pd_patch_instruction(address branch, address target, const char* file, int line); > > + > > + //---< calculate length of instruction >--- > > + static unsigned int instr_len(unsigned char *instr) { return 1; } > > + > > + //---< longest instructions >--- > > + static unsigned int instr_maxlen() { return 1; } > > }; > > > > class MacroAssembler : public Assembler { > > diff -r c0568c492760 src/hotspot/share/compiler/abstractDisassembler.cpp > > --- a/src/hotspot/share/compiler/abstractDisassembler.cpp Fri May 17 14:11:44 2019 +0200 > > +++ b/src/hotspot/share/compiler/abstractDisassembler.cpp Fri May 17 14:14:06 2019 +0200 > > @@ -61,6 +61,9 @@ > > bool AbstractDisassembler::_show_bytes = false; // set "true" to see what's in memory bit by bit > > // might prove cumbersome because instr_len is hard to find on x86 > > #endif > > +#if defined(ZERO) > > +bool AbstractDisassembler::_show_bytes = false; // set "true" to see what's in memory bit by bit > > +#endif > > > > // Return #bytes printed. Callers may use that for output alignment. > > // Print instruction address, and offset from blob begin. 
> > ----------- >8 ---------------- > > > > This delta is contained as the only change in the new webrev#04 which is based on the current (13:10 GMT) jdk/jdk repo: > > https://cr.openjdk.java.net/~lucy/webrevs/8213084.04/ > > > > Regards, > > Lutz > > > > > > On 16.05.19, 21:40, "Vladimir Kozlov" wrote: > > > > I am not sure about exact parameters but I see build testing uses next: > > > > configure --with-jvm-variants=zero --with-jvm-features=-shenandoahgc > > > > Vladimir > > > >> On 5/16/19 12:22 PM, Schmidt, Lutz wrote: > >> Hi Vladimir, > >> > >> thanks for the extensive testing. And sorry for me neglecting ZERO. I will add a dummy instr_len() function. I saw another potential issue. There is no static initializer for AbstractDisassembler::_show_bytes. What is the correct macro to test for ZERO? Is it just "#ifdef ZERO"? > >> > >> I will prepare a new webrev with just these two additions as delta. But it'll be not before Friday morning, my time. > >> > >> Thanks, > >> Lutz > >> > >> On 16.05.19, 20:38, "Vladimir Kozlov" wrote: > >> > >> linux-x64-zero build is broke: > >> > >> workspace/open/src/hotspot/share/compiler/abstractDisassembler.cpp:332:42: error: 'instr_len' is not a member of 'Assembler' > >> int instr_size_in_bytes = Assembler::instr_len(pos); > >> ^~~~~~~~~ > >> Other builds and testing are good. > >> > >> Thanks, > >> Vladimir > >> > >>> On 5/16/19 9:47 AM, Vladimir Kozlov wrote: > >>> Nice. > >>> > >>> I submitted our tier1-3 testing. > >>> > >>> Thanks, > >>> Vladimir > >>> > >>>> On 5/16/19 2:55 AM, Schmidt, Lutz wrote: > >>>> Hi Vladimir, > >>>> > >>>> sorry for the delayed reaction on your comments. > >>>> > >>>> - now it reads "static unsigned int instr_len()". This change added cpu/s390/assembler_s390.inline.hpp to the list > >>>> of modified files. > >>>> - testing from my side will be via the submit repo (BuildId: 2019-05-15-1543576.lutz.schmidt.source, no failures). 
In > >>>> addition, I added the patch to our internal builds so that our in-house testing will cover it (no issues detected last > >>>> night). > >>>> - All the "hsdis-" prefixes in the PrintAssemblyOptions are gone, as are "print-pc" and "print-bytes". The latter > >>>> two were legacy anyway. I kept them for compatibility. But now, without the prefix, there is no compatibility anymore. > >>>> - Options parsing improvement will be done in a separate effort. I have created JDK-8223765 for that. > >>>> - there is a new webrev, based on the current jdk/jdk repo: https://cr.openjdk.java.net/~lucy/webrevs/8213084.03/ > >>>> > >>>> ~thartmann: > >>>> The disabled code in disassembler_s390.cpp is something I would like to have. So far, I could not find time to make it > >>>> work reliably. I would like to keep it in as a reminder and a template to build on. > >>>> > >>>> Thanks, > >>>> Lutz > >>>> > >>>> On 10.05.19, 23:16, "Vladimir Kozlov" wrote: > >>>> > >>>> Hi Lutz, > >>>> My comments are inlined below. > >>>>> On 5/10/19 8:44 AM, Schmidt, Lutz wrote: > >>>>> Thank you, Vladimir! > >>>>> Please find my comments inline and let me know what you think. > >>>>> A new webrev with all the updates is here: https://cr.openjdk.java.net/~lucy/webrevs/8213084.02/ > >>>> Found one more I missed last time: > >>>> assembler_s390.hpp: still signed return (on other platforms it was converted to unsigned): > >>>> static int instr_len(unsigned char *instr); > >>>>> Please note: the webrev is not based on the most current jdk/jdk! I do not like the idea to "hg pull -u" to a > >>>> repo state which is known to be broken. Once jdk/jdk is repaired, I will update the webrev in-place (provided there > >>>> were no serious clashes) and send a short note. > >>>> NP. Please, provide final webrev when you can so that I can run these changes through our testing to > >>>> make sure no issues are present (especially in builds).
> >>>>> Regards, > >>>>> Lutz > >>>>> > >>>>> On 09.05.19, 21:30, "Vladimir Kozlov" wrote: > >>>>> > >>>>> Hi Lutz, > >>>>> > >>>>> Thank you for doing this great work. > >>>>> > >>>>> I have just small comments: > >>>>> > >>>>> x86_64.ad - empty change. > >>>>> File contains whitespace changes for formatting. Not visible in webrev. > >>>> Okay. > >>>>> > >>>>> nmethod.cpp - LUCY? > >>>>> > >>>>> + st->print_cr("LUCY: NULL-oop"); > >>>>> + tty->print("LUCY NULL-oop"); > >>>>> Oops. Leftover debugging output. Removed. Reads "NULL-oop" now. > >>>> Okay. > >>>>> > >>>>> nmethod.cpp - use PTR64_FORMAT instead of '0x%016lx'. > >>>>> Changed. > >>>>> > >>>>> vmreg.cpp - Use INTPTR_FORMAT instead of %ld for value(). > >>>>> Changed. > >>>>> > >>>>> disassembler.* - LUCY_OBSOLETE? > >>>>> > >>>>> +#if defined(LUCY_OBSOLETE) // Used in SAPJVM only > >>>>> This is fancy code to step backwards in CISC instructions. Used to print a +/- range around a given instruction > >>>> address. Works reasonably well on s390, will probably not work at all for x86. I could not finally decide to kick it > >>>> out. But now I did. It's gone. > >>>> Okay. > >>>>> > >>>>> compilerDefinitions.hpp - I don't see where tier_digit() is used. > >>>>> I'm surprised myself. Introduced it and then made it obsolete. It's gone. > >>>>> > >>>>> disassembler.cpp - PrintAssemblyOptions. Why you need to have 'hsdis-' in all options values? You > >>>>> need to check for invalid value and print help output in such case - it will be very useful if you > >>>>> forgot a value spelling. Also add line for 'help' value. > >>>>> > >>>>> The hsdis- prefix existed before I started my work. I just kept it to not hurt anybody's feelings. Actually, > >>>> the prefix has a minor practical use. It guards the many "if (strstr(..." instructions from being executed if there is > >>>> no use. I'm personally not emotionally attached to the hsdis- prefix. I can remove it if you (and the other reviewers) > >>>> like.
Not changed as of now. Awaiting your input. > >>>> It is a pain to type long values and annoying to type the same prefix. I think hsdis- prefix is > >>>> useless because PrintAssemblyOptions is used only for disassembler and there are no values which > >>>> don't have hsdis- prefix. This is not performance critical code to have a guard (check prefix). > >>>> And an other commented new line: > >>>> + // ost->print_cr("PrintAssemblyOptions='%s'", options()); > >>>>> > >>>>> Printing help text: There is an option (hsdis-help) to request help text printout. > > >>>>> Options parsing doesn't exist here. It's just string comparisons. If one of the predefined strings is found - > >>>> fine. If not - so what. If you would like to detect unrecognized input, process_options() needs significantly more > >>>> intelligence. I can do that, but would like to do it in a separate effort. Your opinion? > >>>> Got it. I forgot that PrintAssemblyOptions flag accepts string with *list* of values - you can't use > >>>> if-else or switch without complicating the code. > >>>> I noticed that PrintAssemblyOptions is defined as ccstr. Why it is not ccstrlist which should be use > >>>> here? I don't think next comment is correct for ccstr type: > >>>> http://hg.openjdk.java.net/jdk/jdk/file/ef73702a906e/src/hotspot/share/compiler/disassembler.cpp#l190 > >>>> It would be nice to fix it but you can do it later if you don't want to add more changes. > >>>>> > >>>>> Do you need next commented lines: > >>>>> > >>>>> disassembler.cpp - > >>>>> +// ptrdiff_t _offset; > >>>>> Deleted. > >>>>> > >>>>> +// Output suppressed because it messes up disassembly. > >>>>> +// output()->print_cr("[Disassembling for mach='%s']", (const char*)arg); > >>>>> Uncommented, would like to keep it. Made the if condition permanently false. > >>>>> > >>>>> disassembler_s390.cpp - > >>>>> +// st->fill_to(((st->position()+3*tsize-1)/tsize)*tsize); > >>>>> Deleted. 
> >>>>> > >>>>> compile.cpp - > >>>>> +// st->print("# "); _tf->dump_on(st); st->cr(); > >>>>> Uncommented. > >>>>> > >>>>> > >>>>> abstractDisassembler.cpp - > >>>>> // st->print("0x%016lx", *((julong*)here)); > >>>>> st->print("0x%016lx", *((uintptr_t*)here)); > >>>>> // st->print("0x%08x%08x", *((juint*)here), *((juint*)(here+4))); > >>>>> Commented lines are gone. > >>>>> > >>>>> abstractDisassembler.cpp - may be explicit cast (byte*)?: > >>>>> > >>>>> st->print("%2.2x", *byte); > >>>>> st->print("%2.2x", *pos); > >>>>> st->print("0x%02x", *here); > >>>>> Didn't see the need because the pointers are char* (= address) anyway. And, according to cppreference.com, > >>>> std::byte is a C++17 feature. We are not there yet. > >>>> okay > >>>>> > >>>>> PTR64_FORMAT ?: > >>>>> st->print("0x%016lx", *((uintptr_t*)here)); > >>>>> I'm kind of hesitant on that. Nice output alignment clearly depends on this to output exactly 18 characters. > >>>> Changed other occurrences, so I changed this one as well. > >>>> Thanks, > >>>> Vladimir > >>>>> > >>>>> > >>>>> Thanks, > >>>>> Vladimir > >>>>> > >>>>>> On 5/8/19 8:31 AM, Schmidt, Lutz wrote: > >>>>>> Dear Community, > >>>>>> > >>>>>> may I please request comments and reviews for this change? Thank you! > >>>>>> > >>>>>> I have created a new webrev which is based on the current jdk/jdk repo. There was some merge effort. The > >>>> code which constitutes this patch was not changed. 
Here's the webrev link: > >>>>>> https://cr.openjdk.java.net/~lucy/webrevs/8213084.01/ > >>>>>> > >>>>>> Regards, > >>>>>> Lutz > >>>>>> > >>>>>> On 11.04.19, 23:24, "Schmidt, Lutz" wrote: > >>>>>> > >>>>>> Dear All, > >>>>>> > >>>>>> this topic was discussed back in Nov/Dec 2018: > >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-November/031552.html > >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-December/031641.html > >>>>>> > >>>>>> Purpose of the discussion was to find out if my ideas are at all regarded useful and desirable. > >>>>>> The result was mixed, some pro, some con. I let the input from back then influence my work of the > >>>> last months. In particular, output verbosity can be controlled in a wide range now. In addition to the general > >>>> -XX:+Print* switches, the amount of output can be adjusted by newly introduced -XX:PrintAssemblyOptions. Here is the > >>>> list (with default settings): > >>>>>> > >>>>>> PrintAssemblyOptions help: > >>>>>> hsdis-print-raw test plugin by requesting raw output (deprecated) > >>>>>> hsdis-print-raw-xml test plugin by requesting raw xml (deprecated) > >>>>>> hsdis-print-pc turn off PC printing (on by default) (deprecated) > >>>>>> hsdis-print-bytes turn on instruction byte output (deprecated) > >>>>>> > >>>>>> hsdis-show-pc toggle printing current pc, currently ON > >>>>>> hsdis-show-offset toggle printing current offset, currently OFF > >>>>>> hsdis-show-bytes toggle printing instruction bytes, currently OFF > >>>>>> hsdis-show-data-hex toggle formatting data as hex, currently ON > >>>>>> hsdis-show-data-int toggle formatting data as int, currently OFF > >>>>>> hsdis-show-data-float toggle formatting data as float, currently OFF > >>>>>> hsdis-show-structs toggle compiler data structures, currently OFF > >>>>>> hsdis-show-comment toggle instruction comments, currently OFF > >>>>>> hsdis-show-block-comment toggle block comments, currently OFF > >>>>>> 
hsdis-align-instr toggle instruction alignment, currently OFF > >>>>>> > >>>>>> Finally, I have pushed my changes to a state where I can dare to request your comments and reviews. > >>>> I would like to suggest and request that we first focus on the effects (i.e. the generated output) of the changes. > >>>> Once we got that adjusted and accepted, we can check the actual implementation and add improvements there. Sounds like > >>>> a plan? Here is what you get: > >>>>>> > >>>>>> The machine code generated by the JVM can be printed in three different formats: > >>>>>> - Hexadecimal. > >>>>>> This is basically a hex dump of the memory range containing the code. > >>>>>> This format is always available (PRODUCT and not-PRODUCT builds), regardless > >>>>>> of the availability of a disassembler library. It applies to all sorts of > >>>>>> code, be it blobs, stubs, compiled nmethods, ... > >>>>>> This format seems useless at first glance, but it is not. In an upcoming, > >>>>>> separate enhancement, the JVM will be made capable of reading files > >>>>>> containing such code blocks and disassembling them post mortem. The most > >>>>>> prominent example is an hs_err* file. > >>>>>> - Disassembled. > >>>>>> This is an assembly listing of the instructions as found in the memory range > >>>>>> occupied by the blob, stub, compiled nmethod ... As a prerequisite, a suitable > >>>>>> disassembler library (hsdis-.so) must be available at runtime. > >>>>>> Most often, that will only be the case in test environments. If no disassembler > >>>>>> library is available, hexadecimal output is used as fallback. > >>>>>> - OptoAssembly. > >>>>>> This is a meta code listing created only by the C2 compiler. As it is somewhat > >>>>>> closer to the Java code, it may be helpful in linking assembly code to Java code. 
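[Editorial sketch] The hexadecimal fallback described in the list above can be illustrated roughly as follows. This is a minimal sketch and not code from the webrev: the function name hex_dump, its parameters, and the use of std::string are assumptions made for illustration (HotSpot itself prints through an outputStream). The layout, a 16-digit address followed by groups of two bytes with 32 bytes per line, is read off the sample output quoted later in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not HotSpot code): formats `len` bytes starting at `code`.
// Each output line is "0x<16-digit hex address>:" followed by groups of two bytes
// (four hex digits) separated by single spaces. `base` is the address printed for
// the first byte; `bytes_per_line` must be nonzero and even.
std::string hex_dump(const unsigned char* code, std::size_t len, std::uint64_t base,
                     std::size_t bytes_per_line = 32) {
  assert(bytes_per_line != 0 && bytes_per_line % 2 == 0);
  std::string out;
  char buf[32];
  for (std::size_t i = 0; i < len; ++i) {
    if (i % bytes_per_line == 0) {
      if (i != 0) out += '\n';  // terminate the previous line
      std::snprintf(buf, sizeof(buf), "0x%016llx:",
                    static_cast<unsigned long long>(base + i));
      out += buf;
    }
    if (i % 2 == 0) out += ' ';  // start a new two-byte group
    std::snprintf(buf, sizeof(buf), "%02x", static_cast<unsigned>(code[i]));
    out += buf;
  }
  if (len != 0) out += '\n';
  return out;
}
```

Called on the first bytes of the sample entry point with base 0x00007fc74d1d7b20, this would yield a line beginning "0x00007fc74d1d7b20: 448b 5608". Section markers such as [Entry Point] are deliberately left out of the sketch.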
> >>>>>> > >>>>>> All three formats can be merged with additional information, most prominently compiler-internal > >>>> "knowledge" about blocks, related bytecodes, statistics counters, and much more. > >>>>>> > >>>>>> Following the code itself, compiler-internal data structures, like oop maps, relocations, scopes, > >>>> dependencies, exception handlers, are printed to aid in debugging. > >>>>>> > >>>>>> The full set of information is available in non-PRODUCT builds. PRODUCT builds do not support > >>>> OptoAssembly output. Data structures are unavailable as well. > >>>>>> > >>>>>> So how does the output actually look like? Here are a few small snippets (linuxx86_64) to give you > >>>> an idea. The complete output of an entire C2-compiled method, in multiple verbosity variants, is available here: > >>>>>> http://cr.openjdk.java.net/~lucy/webrevs/8213084/ > >>>>>> > >>>>>> OptoAssembly output for reference (always on with PrintAssembly): > >>>>>> ================================================================= > >>>>>> > >>>>>> 036 B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > >>>>>> 036 movl RBP, [RSI + #12 (8-bit)] # compressed ptr ! 
Field: java/lang/String.value > >>>> (constant) > >>>>>> 039 movl R11, [RBP + #12 (8-bit)] # range > >>>>>> 03d NullCheck RBP > >>>>>> > >>>>>> 03d B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > >>>>>> 03d cmpl RDX, R11 # unsigned > >>>>>> 040 jnb,us B6 P=0.000000 C=5375.000000 > >>>>>> > >>>>>> PrintAssembly with no disassembler library available: > >>>>>> ===================================================== > >>>>>> > >>>>>> [Code] > >>>>>> [Entry Point] > >>>>>> 0x00007fc74d1d7b20: 448b 5608 49c1 e203 493b c20f 856f 69e7 ff90 9090 9090 9090 9090 9090 9090 9090 > >>>>>> [Verified Entry Point] > >>>>>> 0x00007fc74d1d7b40: 8984 2400 a0fe ff55 4883 ec20 440f be5e 1445 85db 7521 8b6e 0c44 8b5d 0c41 3bd3 > >>>>>> 0x00007fc74d1d7b60: 732c 0fb6 4415 1048 83c4 205d 4d8b 9728 0100 0041 8502 c348 8bee 8914 2444 895c > >>>>>> 0x00007fc74d1d7b80: 2404 be4d ffff ffe8 1483 e7ff 0f0b bee5 ffff ff89 5424 04e8 0483 e7ff 0f0b bef6 > >>>>>> 0x00007fc74d1d7ba0: ffff ff89 5424 04e8 f482 e7ff 0f0b f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 > >>>>>> [Exception Handler] > >>>>>> 0x00007fc74d1d7bc0: e95b 0df5 ffe8 0000 0000 4883 2c24 05e9 0c7d e7ff > >>>>>> [End] > >>>>>> > >>>>>> PrintAssembly with minimal verbosity: > >>>>>> ===================================== > >>>>>> > >>>>>> 0x00007f0434b89bd6: mov 0xc(%rsi),%ebp > >>>>>> 0x00007f0434b89bd9: mov 0xc(%rbp),%r11d > >>>>>> 0x00007f0434b89bdd: cmp %r11d,%edx > >>>>>> 0x00007f0434b89be0: jae 0x00007f0434b89c0e > >>>>>> > >>>>>> PrintAssembly (previous plus code offsets from code begin): > >>>>>> =========================================================== > >>>>>> > >>>>>> 0x00007f63c11d7956 (+0x36): mov 0xc(%rsi),%ebp > >>>>>> 0x00007f63c11d7959 (+0x39): mov 0xc(%rbp),%r11d > >>>>>> 0x00007f63c11d795d (+0x3d): cmp %r11d,%edx > >>>>>> 0x00007f63c11d7960 (+0x40): jae 0x00007f63c11d798e > >>>>>> > >>>>>> PrintAssembly (previous plus block comments): > >>>>>> =========================================================== > >>>>>> > >>>>>> 
;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > >>>>>> 0x00007f48211d76d6 (+0x36): mov 0xc(%rsi),%ebp > >>>>>> 0x00007f48211d76d9 (+0x39): mov 0xc(%rbp),%r11d > >>>>>> ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > >>>>>> 0x00007f48211d76dd (+0x3d): cmp %r11d,%edx > >>>>>> 0x00007f48211d76e0 (+0x40): jae 0x00007f48211d770e > >>>>>> > >>>>>> PrintAssembly (previous plus instruction comments): > >>>>>> =========================================================== > >>>>>> > >>>>>> ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1 > >>>>>> 0x00007fc3e11d7a56 (+0x36): mov 0xc(%rsi),%ebp ;*getfield value {reexecute=0 > >>>> rethrow=0 return_oop=0} > >>>>>> ; - java.lang.String::charAt at 8 > >>>> (line 702) > >>>>>> 0x00007fc3e11d7a59 (+0x39): mov 0xc(%rbp),%r11d ; implicit exception: dispatches to > >>>> 0x00007fc3e11d7a9e > >>>>>> ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > >>>>>> 0x00007fc3e11d7a5d (+0x3d): cmp %r11d,%edx > >>>>>> 0x00007fc3e11d7a60 (+0x40): jae 0x00007fc3e11d7a8e > >>>>>> > >>>>>> For completeness, here are the links to > >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213084 > >>>>>> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8213084.00/ > >>>>>> > >>>>>> But please, as mentioned above, first focus on the output. The nitty details of the implementation > >>>> I would like to discuss after the output format has received some support. > >>>>>> > >>>>>> Thank you so much for your time! 
> >>>>>> Lutz > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>> > >>>>> > >>>> > >>>> > >> > >> > > > > > > > From lutz.schmidt at sap.com Tue May 21 10:45:43 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Tue, 21 May 2019 10:45:43 +0000 Subject: [PING] Re: RFR(L): 8213084: Rework and enhance Print[Opto]Assembly output In-Reply-To: References: <09368D29-29D0-4854-8BA4-58508DCC44D2@sap.com> <7066294D-5750-4D7A-9F0B-DE027811819A@sap.com> <2ffc4c9c-91cb-2d04-e03e-6620d4443034@oracle.com> <06ce086b-43f9-b570-8b97-55c7c14745a0@oracle.com> <29AEA376-56DC-47A5-8935-9EE700C6345E@sap.com> <8b68b45b-acf5-daf2-be94-0bacac917aac@oracle.com> <7938CE23-798C-4484-9A75-BFC40C297FDD@sap.com> <37865A98-264E-4775-9366-194A804AE5AA@oracle.com> <3A11FD93-C2E0-48BE-B927-D718D58B50C6@sap.com> Message-ID: <4C093381-6F88-47F3-ABD9-CC71E5BD1CEF@sap.com> Hi Tobias, thank you very much for your review! I'll send the patch through jdk/submit one more time, just to be on the safe side (there have been some rebase clashes recently). If successful, I'll push. Best, Lutz On 21.05.19, 12:38, "Tobias Hartmann" wrote: Hi Lutz, this looks good to me. Best regards, Tobias On 20.05.19 10:22, Schmidt, Lutz wrote: > Hi Vladimir, > > I appreciate the time you spent making this change "push-worthy". Thank you for reviewing it! > > I am now hoping for a second reviewer to set aside some time and have a look. > > Thanks, > Lutz > > On 17.05.19, 16:48, "Vladimir Kozlov" wrote: > > Good.
> > Thanks > Vladimir > > > On May 17, 2019, at 6:19 AM, Schmidt, Lutz wrote: > > > > Hi Vladimir, > > here is what I changed to overcome the ZERO build issues: > > > > ----------- 8< ---------------- > > diff -r c0568c492760 src/hotspot/cpu/zero/assembler_zero.hpp > > --- a/src/hotspot/cpu/zero/assembler_zero.hpp Fri May 17 14:11:44 2019 +0200 > > +++ b/src/hotspot/cpu/zero/assembler_zero.hpp Fri May 17 14:14:06 2019 +0200 > > @@ -37,6 +37,12 @@ > > > > public: > > void pd_patch_instruction(address branch, address target, const char* file, int line); > > + > > + //---< calculate length of instruction >--- > > + static unsigned int instr_len(unsigned char *instr) { return 1; } > > + > > + //---< longest instructions >--- > > + static unsigned int instr_maxlen() { return 1; } > > }; > > > > class MacroAssembler : public Assembler { > > diff -r c0568c492760 src/hotspot/share/compiler/abstractDisassembler.cpp > > --- a/src/hotspot/share/compiler/abstractDisassembler.cpp Fri May 17 14:11:44 2019 +0200 > > +++ b/src/hotspot/share/compiler/abstractDisassembler.cpp Fri May 17 14:14:06 2019 +0200 > > @@ -61,6 +61,9 @@ > > bool AbstractDisassembler::_show_bytes = false; // set "true" to see what's in memory bit by bit > > // might prove cumbersome because instr_len is hard to find on x86 > > #endif > > +#if defined(ZERO) > > +bool AbstractDisassembler::_show_bytes = false; // set "true" to see what's in memory bit by bit > > +#endif > > > > // Return #bytes printed. Callers may use that for output alignment. > > // Print instruction address, and offset from blob begin. 
> > ----------- >8 ---------------- > > > > This delta is contained as the only change in the new webrev#04 which is based on the current (13:10 GMT) jdk/jdk repo: > > https://cr.openjdk.java.net/~lucy/webrevs/8213084.04/ > > > > Regards, > > Lutz > > > > > > On 16.05.19, 21:40, "Vladimir Kozlov" wrote: > > > > I am not sure about exact parameters but I see build testing uses next: > > > > configure --with-jvm-variants=zero --with-jvm-features=-shenandoahgc > > > > Vladimir > > > >> On 5/16/19 12:22 PM, Schmidt, Lutz wrote: > >> Hi Vladimir, > >> > >> thanks for the extensive testing. And sorry for me neglecting ZERO. I will add a dummy instr_len() function. I saw another potential issue. There is no static initializer for AbstractDisassembler::_show_bytes. What is the correct macro to test for ZERO? Is it just "#ifdef ZERO"? > >> > >> I will prepare a new webrev with just these two additions as delta. But it'll be not before Friday morning, my time. > >> > >> Thanks, > >> Lutz > >> > >> On 16.05.19, 20:38, "Vladimir Kozlov" wrote: > >> > >> linux-x64-zero build is broke: > >> > >> workspace/open/src/hotspot/share/compiler/abstractDisassembler.cpp:332:42: error: 'instr_len' is not a member of 'Assembler' > >> int instr_size_in_bytes = Assembler::instr_len(pos); > >> ^~~~~~~~~ > >> Other builds and testing are good. > >> > >> Thanks, > >> Vladimir > >> > >>> On 5/16/19 9:47 AM, Vladimir Kozlov wrote: > >>> Nice. > >>> > >>> I submitted our tier1-3 testing. > >>> > >>> Thanks, > >>> Vladimir > >>> > >>>> On 5/16/19 2:55 AM, Schmidt, Lutz wrote: > >>>> Hi Vladimir, > >>>> > >>>> sorry for the delayed reaction on your comments. > >>>> > >>>> - now it reads "static unsigned int instr_len()". This change added cpu/s390/assembler_s390.inline.hpp to the list > >>>> of modified files. > >>>> - testing from my side will be via the submit repo (BuildId: 2019-05-15-1543576.lutz.schmidt.source, no failures). 
In > >>>> addition, I added the patch to our internal builds so that our in-house testing will cover it (no issues detected last > >>>> night). > >>>> - All the "hsdis-" prefixes in the PrintAssemblyOptions are gone, as are "print-pc" and "print-bytes". The latter > >>>> two were legacy anyway. I kept them for compatibility. But now, without the prefix, there is no compatibility anymore. > >>>> - Options parsing improvement will be done in a separate effort. I have created JDK-8223765 for that. > >>>> - there is a new webrev, based on the current jdk/jdk repo: https://cr.openjdk.java.net/~lucy/webrevs/8213084.03/ > >>>> > >>>> ~thartmann: > >>>> The disabled code in disassembler_s390.cpp is something I would like to have. So far, I could not find time to make it > >>>> work reliably. I would like to keep it in as a reminder and a template to build on. > >>>> > >>>> Thanks, > >>>> Lutz > >>>> > >>>> On 10.05.19, 23:16, "Vladimir Kozlov" wrote: > >>>> > >>>> Hi Lutz, > >>>> My comments are inlined below. > >>>>> On 5/10/19 8:44 AM, Schmidt, Lutz wrote: > >>>>> Thank you, Vladimir! > >>>>> Please find my comments inline and let me know what you think. > >>>>> A new webrev with all the updates is here: https://cr.openjdk.java.net/~lucy/webrevs/8213084.02/ > >>>> Found one more I missed last time: > >>>> assembler_s390.hpp: still signed return (on other platforms it was converted to unsigned): > >>>> static int instr_len(unsigned char *instr); > >>>>> Please note: the webrev is not based on the most current jdk/jdk! I do not like the idea to "hg pull -u" to a > >>>> repo state which is known to be broken. Once jdk/jdk is repaired, I will update the webrev in-place (provided there > >>>> were no serious clashes) and send a short note. > >>>> NP. Please, provide final webrev when you can so that I can run these changes through our testing to > >>>> make sure no issues are present (especially in builds).
> >>>>> Regards, > >>>>> Lutz > >>>>> > >>>>> On 09.05.19, 21:30, "Vladimir Kozlov" wrote: > >>>>> > >>>>> Hi Lutz, > >>>>> > >>>>> Thank you for doing this great work. > >>>>> > >>>>> I have just small comments: > >>>>> > >>>>> x86_64.ad - empty change. > >>>>> File contains whitespace changes for formatting. Not visible in webrev. > >>>> Okay. > >>>>> > >>>>> nmethod.cpp - LUCY? > >>>>> > >>>>> + st->print_cr("LUCY: NULL-oop"); > >>>>> + tty->print("LUCY NULL-oop"); > >>>>> Oops. Leftover debugging output. Removed. Reads "NULL-oop" now. > >>>> Okay. > >>>>> > >>>>> nmethod.cpp - use PTR64_FORMAT instead of '0x%016lx'. > >>>>> Changed. > >>>>> > >>>>> vmreg.cpp - Use INTPTR_FORMAT instead of %ld for value(). > >>>>> Changed. > >>>>> > >>>>> disassembler.* - LUCY_OBSOLETE? > >>>>> > >>>>> +#if defined(LUCY_OBSOLETE) // Used in SAPJVM only > >>>>> This is fancy code to step backwards in CISC instructions. Used to print a +/- range around a given instruction > >>>> address. Works reasonably well on s390, will probably not work at all for x86. I could not finally decide to kick it > >>>> out. But now I did. It's gone. > >>>> Okay. > >>>>> > >>>>> compilerDefinitions.hpp - I don't see where tier_digit() is used. > >>>>> I'm surprised myself. Introduced it and then made it obsolete. It's gone. > >>>>> > >>>>> disassembler.cpp - PrintAssemblyOptions. Why you need to have 'hsdis-' in all options values? You > >>>>> need to check for invalid value and print help output in such case - it will be very useful if you > >>>>> forgot a value spelling. Also add line for 'help' value. > >>>>> > >>>>> The hsdis- prefix existed before I started my work. I just kept it to not hurt anybody's feelings. Actually, > >>>> the prefix has a minor practical use. It guards the many "if (strstr(..." instructions from being executed if there is > >>>> no use. I'm personally not emotionally attached to the hsdis- prefix. I can remove it if you (and the other reviewers) > >>>> like.
Not changed as of now. Awaiting your input. > >>>> It is a pain to type long values and annoying to type the same prefix. I think hsdis- prefix is > >>>> useless because PrintAssemblyOptions is used only for disassembler and there are no values which > >>>> don't have hsdis- prefix. This is not performance critical code to have a guard (check prefix). > >>>> And an other commented new line: > >>>> + // ost->print_cr("PrintAssemblyOptions='%s'", options()); > >>>>> > >>>>> Printing help text: There is an option (hsdis-help) to request help text printout. > > >>>>> Options parsing doesn't exist here. It's just string comparisons. If one of the predefined strings is found - > >>>> fine. If not - so what. If you would like to detect unrecognized input, process_options() needs significantly more > >>>> intelligence. I can do that, but would like to do it in a separate effort. Your opinion? > >>>> Got it. I forgot that PrintAssemblyOptions flag accepts string with *list* of values - you can't use > >>>> if-else or switch without complicating the code. > >>>> I noticed that PrintAssemblyOptions is defined as ccstr. Why it is not ccstrlist which should be use > >>>> here? I don't think next comment is correct for ccstr type: > >>>> http://hg.openjdk.java.net/jdk/jdk/file/ef73702a906e/src/hotspot/share/compiler/disassembler.cpp#l190 > >>>> It would be nice to fix it but you can do it later if you don't want to add more changes. > >>>>> > >>>>> Do you need next commented lines: > >>>>> > >>>>> disassembler.cpp - > >>>>> +// ptrdiff_t _offset; > >>>>> Deleted. > >>>>> > >>>>> +// Output suppressed because it messes up disassembly. > >>>>> +// output()->print_cr("[Disassembling for mach='%s']", (const char*)arg); > >>>>> Uncommented, would like to keep it. Made the if condition permanently false. > >>>>> > >>>>> disassembler_s390.cpp - > >>>>> +// st->fill_to(((st->position()+3*tsize-1)/tsize)*tsize); > >>>>> Deleted. 
> >>>>> > >>>>> compile.cpp - > >>>>> +// st->print("# "); _tf->dump_on(st); st->cr(); > >>>>> Uncommented. > >>>>> > >>>>> > >>>>> abstractDisassembler.cpp - > >>>>> // st->print("0x%016lx", *((julong*)here)); > >>>>> st->print("0x%016lx", *((uintptr_t*)here)); > >>>>> // st->print("0x%08x%08x", *((juint*)here), *((juint*)(here+4))); > >>>>> Commented lines are gone. > >>>>> > >>>>> abstractDisassembler.cpp - may be explicit cast (byte*)?: > >>>>> > >>>>> st->print("%2.2x", *byte); > >>>>> st->print("%2.2x", *pos); > >>>>> st->print("0x%02x", *here); > >>>>> Didn't see the need because the pointers are char* (= address) anyway. And, according to cppreference.com, > >>>> std::byte is a C++17 feature. We are not there yet. > >>>> okay > >>>>> > >>>>> PTR64_FORMAT ?: > >>>>> st->print("0x%016lx", *((uintptr_t*)here)); > >>>>> I'm kind of hesitant on that. Nice output alignment clearly depends on this to output exactly 18 characters. > >>>> Changed other occurrences, so I changed this one as well. > >>>> Thanks, > >>>> Vladimir > >>>>> > >>>>> > >>>>> Thanks, > >>>>> Vladimir > >>>>> > >>>>>> On 5/8/19 8:31 AM, Schmidt, Lutz wrote: > >>>>>> Dear Community, > >>>>>> > >>>>>> may I please request comments and reviews for this change? Thank you! > >>>>>> > >>>>>> I have created a new webrev which is based on the current jdk/jdk repo. There was some merge effort. The > >>>> code which constitutes this patch was not changed. 
Here's the webrev link: > >>>>>> https://cr.openjdk.java.net/~lucy/webrevs/8213084.01/ > >>>>>> > >>>>>> Regards, > >>>>>> Lutz > >>>>>> > >>>>>> On 11.04.19, 23:24, "Schmidt, Lutz" wrote: > >>>>>> > >>>>>> Dear All, > >>>>>> > >>>>>> this topic was discussed back in Nov/Dec 2018: > >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-November/031552.html > >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-December/031641.html > >>>>>> > >>>>>> Purpose of the discussion was to find out if my ideas are at all regarded useful and desirable. > >>>>>> The result was mixed, some pro, some con. I let the input from back then influence my work of the > >>>> last months. In particular, output verbosity can be controlled in a wide range now. In addition to the general > >>>> -XX:+Print* switches, the amount of output can be adjusted by newly introduced -XX:PrintAssemblyOptions. Here is the > >>>> list (with default settings): > >>>>>> > >>>>>> PrintAssemblyOptions help: > >>>>>> hsdis-print-raw test plugin by requesting raw output (deprecated) > >>>>>> hsdis-print-raw-xml test plugin by requesting raw xml (deprecated) > >>>>>> hsdis-print-pc turn off PC printing (on by default) (deprecated) > >>>>>> hsdis-print-bytes turn on instruction byte output (deprecated) > >>>>>> > >>>>>> hsdis-show-pc toggle printing current pc, currently ON > >>>>>> hsdis-show-offset toggle printing current offset, currently OFF > >>>>>> hsdis-show-bytes toggle printing instruction bytes, currently OFF > >>>>>> hsdis-show-data-hex toggle formatting data as hex, currently ON > >>>>>> hsdis-show-data-int toggle formatting data as int, currently OFF > >>>>>> hsdis-show-data-float toggle formatting data as float, currently OFF > >>>>>> hsdis-show-structs toggle compiler data structures, currently OFF > >>>>>> hsdis-show-comment toggle instruction comments, currently OFF > >>>>>> hsdis-show-block-comment toggle block comments, currently OFF > >>>>>> 
hsdis-align-instr toggle instruction alignment, currently OFF > >>>>>> > >>>>>> Finally, I have pushed my changes to a state where I can dare to request your comments and reviews. > >>>> I would like to suggest and request that we first focus on the effects (i.e. the generated output) of the changes. > >>>> Once we got that adjusted and accepted, we can check the actual implementation and add improvements there. Sounds like > >>>> a plan? Here is what you get: > >>>>>> > >>>>>> The machine code generated by the JVM can be printed in three different formats: > >>>>>> - Hexadecimal. > >>>>>> This is basically a hex dump of the memory range containing the code. > >>>>>> This format is always available (PRODUCT and not-PRODUCT builds), regardless > >>>>>> of the availability of a disassembler library. It applies to all sorts of > >>>>>> code, be it blobs, stubs, compiled nmethods, ... > >>>>>> This format seems useless at first glance, but it is not. In an upcoming, > >>>>>> separate enhancement, the JVM will be made capable of reading files > >>>>>> containing such code blocks and disassembling them post mortem. The most > >>>>>> prominent example is an hs_err* file. > >>>>>> - Disassembled. > >>>>>> This is an assembly listing of the instructions as found in the memory range > >>>>>> occupied by the blob, stub, compiled nmethod ... As a prerequisite, a suitable > >>>>>> disassembler library (hsdis-.so) must be available at runtime. > >>>>>> Most often, that will only be the case in test environments. If no disassembler > >>>>>> library is available, hexadecimal output is used as fallback. > >>>>>> - OptoAssembly. > >>>>>> This is a meta code listing created only by the C2 compiler. As it is somewhat > >>>>>> closer to the Java code, it may be helpful in linking assembly code to Java code. 
> >>>>>>
> >>>>>> All three formats can be merged with additional information, most prominently compiler-internal "knowledge" about blocks, related bytecodes, statistics counters, and much more.
> >>>>>>
> >>>>>> Following the code itself, compiler-internal data structures, like oop maps, relocations, scopes, dependencies, exception handlers, are printed to aid in debugging.
> >>>>>>
> >>>>>> The full set of information is available in non-PRODUCT builds. PRODUCT builds do not support OptoAssembly output. Data structures are unavailable as well.
> >>>>>>
> >>>>>> So how does the output actually look like? Here are a few small snippets (linuxx86_64) to give you an idea. The complete output of an entire C2-compiled method, in multiple verbosity variants, is available here:
> >>>>>> http://cr.openjdk.java.net/~lucy/webrevs/8213084/
> >>>>>>
> >>>>>> OptoAssembly output for reference (always on with PrintAssembly):
> >>>>>> =================================================================
> >>>>>>
> >>>>>> 036 B2: # out( B7 B3 ) <- in( B1 ) Freq: 1
> >>>>>> 036 movl RBP, [RSI + #12 (8-bit)] # compressed ptr !
Field: java/lang/String.value > >>>> (constant) > >>>>>> 039 movl R11, [RBP + #12 (8-bit)] # range > >>>>>> 03d NullCheck RBP > >>>>>> > >>>>>> 03d B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999 > >>>>>> 03d cmpl RDX, R11 # unsigned > >>>>>> 040 jnb,us B6 P=0.000000 C=5375.000000 > >>>>>> > >>>>>> PrintAssembly with no disassembler library available: > >>>>>> ===================================================== > >>>>>> > >>>>>> [Code] > >>>>>> [Entry Point] > >>>>>> 0x00007fc74d1d7b20: 448b 5608 49c1 e203 493b c20f 856f 69e7 ff90 9090 9090 9090 9090 9090 9090 9090 > >>>>>> [Verified Entry Point] > >>>>>> 0x00007fc74d1d7b40: 8984 2400 a0fe ff55 4883 ec20 440f be5e 1445 85db 7521 8b6e 0c44 8b5d 0c41 3bd3 > >>>>>> 0x00007fc74d1d7b60: 732c 0fb6 4415 1048 83c4 205d 4d8b 9728 0100 0041 8502 c348 8bee 8914 2444 895c > >>>>>> 0x00007fc74d1d7b80: 2404 be4d ffff ffe8 1483 e7ff 0f0b bee5 ffff ff89 5424 04e8 0483 e7ff 0f0b bef6 > >>>>>> 0x00007fc74d1d7ba0: ffff ff89 5424 04e8 f482 e7ff 0f0b f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 f4f4 > >>>>>> [Exception Handler] > >>>>>> 0x00007fc74d1d7bc0: e95b 0df5 ffe8 0000 0000 4883 2c24 05e9 0c7d e7ff > >>>>>> [End] > >>>>>> > >>>>>> PrintAssembly with minimal verbosity: > >>>>>> ===================================== > >>>>>> > >>>>>> 0x00007f0434b89bd6: mov 0xc(%rsi),%ebp > >>>>>> 0x00007f0434b89bd9: mov 0xc(%rbp),%r11d > >>>>>> 0x00007f0434b89bdd: cmp %r11d,%edx > >>>>>> 0x00007f0434b89be0: jae 0x00007f0434b89c0e > >>>>>> > >>>>>> PrintAssembly (previous plus code offsets from code begin): > >>>>>> =========================================================== > >>>>>> > >>>>>> 0x00007f63c11d7956 (+0x36): mov 0xc(%rsi),%ebp > >>>>>> 0x00007f63c11d7959 (+0x39): mov 0xc(%rbp),%r11d > >>>>>> 0x00007f63c11d795d (+0x3d): cmp %r11d,%edx > >>>>>> 0x00007f63c11d7960 (+0x40): jae 0x00007f63c11d798e > >>>>>> > >>>>>> PrintAssembly (previous plus block comments): > >>>>>> =========================================================== > >>>>>> > >>>>>> 
;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1
> >>>>>> 0x00007f48211d76d6 (+0x36): mov 0xc(%rsi),%ebp
> >>>>>> 0x00007f48211d76d9 (+0x39): mov 0xc(%rbp),%r11d
> >>>>>> ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999
> >>>>>> 0x00007f48211d76dd (+0x3d): cmp %r11d,%edx
> >>>>>> 0x00007f48211d76e0 (+0x40): jae 0x00007f48211d770e
> >>>>>>
> >>>>>> PrintAssembly (previous plus instruction comments):
> >>>>>> ===========================================================
> >>>>>>
> >>>>>> ;; B2: # out( B7 B3 ) <- in( B1 ) Freq: 1
> >>>>>> 0x00007fc3e11d7a56 (+0x36): mov 0xc(%rsi),%ebp ;*getfield value {reexecute=0 rethrow=0 return_oop=0}
> >>>>>> ; - java.lang.String::charAt@8 (line 702)
> >>>>>> 0x00007fc3e11d7a59 (+0x39): mov 0xc(%rbp),%r11d ; implicit exception: dispatches to 0x00007fc3e11d7a9e
> >>>>>> ;; B3: # out( B6 B4 ) <- in( B2 ) Freq: 0.999999
> >>>>>> 0x00007fc3e11d7a5d (+0x3d): cmp %r11d,%edx
> >>>>>> 0x00007fc3e11d7a60 (+0x40): jae 0x00007fc3e11d7a8e
> >>>>>>
> >>>>>> For completeness, here are the links to
> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8213084
> >>>>>> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8213084.00/
> >>>>>>
> >>>>>> But please, as mentioned above, first focus on the output. The nitty details of the implementation I would like to discuss after the output format has received some support.
> >>>>>>
> >>>>>> Thank you so much for your time!
> >>>>>> Lutz

From vladimir.x.ivanov at oracle.com  Tue May 21 14:30:45 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Tue, 21 May 2019 17:30:45 +0300
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To:
References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn>
 <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com>
Message-ID: <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com>

> The updated webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.02/
>
> The call site count is < 0 for a typecheck profile[1][2].
> Please review it and give me some advice.

It doesn't explain the failure since the assert is hit while parsing
invoke* bytecode while type check failures are recorded for checkcast,
aastore, and instanceof. Am I missing something important here?

Best regards,
Vladimir Ivanov

> On 2019-05-20 23:36, Vladimir Kozlov wrote:
>> Hi Jie
>>
>> Please, send updated webrev. It is confusing to which changes this
>> should be applied.
>>
>> Thanks
>> Vladimir
>>
>>> On May 20, 2019, at 6:21 AM, Jie Fu wrote:
>>>
>>> Hi all,
>>>
>>> A refinement since the profile.count() won't change during compilation.
>>> ----------------------------------------------------
>>> diff -r 13507abf416c src/hotspot/share/opto/bytecodeInfo.cpp
>>> --- a/src/hotspot/share/opto/bytecodeInfo.cpp   Sat May 18 15:42:21 2019 +0900
>>> +++ b/src/hotspot/share/opto/bytecodeInfo.cpp   Mon May 20 21:13:13 2019 +0800
>>> @@ -334,8 +334,8 @@
>>>    if (caller_method->is_not_reached(caller_bci)) {
>>>      return true; // call site not resolved
>>>    }
>>> -  if (profile.count() == -1) {
>>> -    return false; // immature profile; optimistically treat as reached
>>> +  if (profile.count() <= -1) {
>>> +    return false; // immature or typecheck profile; optimistically treat as reached
>>>    }
>>>
assert(profile.count() == 0, "sanity"); >>> >>> ---------------------------------------------------- >>> Please review this one and give me some advice. >>> >>> Thanks. >>> Best regards, >>> Jie >>> >>>> On 2019?05?20? 18:01, Jie Fu wrote: >>>> Hi all, >>>> >>>> Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.01/ >>>> >>>> In my previous patch, I had lost the case of typecheck profile[1]. >>>> Please review and give me some advice. >>>> >>>> Thanks a lot. >>>> Best regards, >>>> Jie >>>> >>>> [1] >>>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-May/033797.html >>>> >>>> >>>>> On 2019/5/20 ??4:12, Leonid Mesnik wrote: >>>>> The failure is still reproduced with patch. I attached full hs_err >>>>> to the bug. >>>>> >>>>> hs_err >>>>> # >>>>> # A fatal error has been detected by the Java Runtime Environment: >>>>> # >>>>> #? Internal Error >>>>> (open/src/hotspot/share/opto/bytecodeInfo.cpp:343), pid=3096, tid=3128 >>>>> #? assert(profile_count == 0) failed: sanity >>>>> # >>>>> # JRE version: Java(TM) SE Runtime Environment (13.0) (fastdebug >>>>> build 13-internal+0-2019-05-18-0457052.lmesnik.null) >>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug >>>>> 13-internal+0-2019-05-18-0457052.lmesnik.null, mixed mode, sharing, >>>>> tiered, compressed oops, g1 gc, linux-amd64) >>>>> # Problematic frame: >>>>> # V? [libjvm.so+0x6cbf6c]? InlineTree::is_not_reached(ciMethod*, >>>>> ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc >>>>> # >>>>> # Core dump will be written. Default location: Core dumps may be >>>>> processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P >>>>> %I %h" (or dumping to >>>>> /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kit\ >>>>> >>>>> chensink14D_java/scratch/0/core.3096) >>>>> # >>>>> # If you would like to submit a bug report, please visit: >>>>> #?? 
http://bugreport.java.com/bugreport/crash.jsp >>>>> # >>>>> >>>>> ---------------? S U M M A R Y ------------ >>>>> >>>>> Command Line: -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions >>>>> -XX:+WhiteBoxAPI -XX:MaxRAMPercentage=12 -XX:+DeoptimizeALot >>>>> -XX:MaxRAMPercentage=50 -XX:+HeapDumpOnOutOfMemoryError >>>>> -XX:+CrashOnOutOfMemoryError -Djava.net.preferIPv6Addresses=false >>>>> -XX:+DisplayVMOutputToS\ >>>>> tderr -XX:+UsePerfData >>>>> -Xlog:gc*,gc+heap=debug:gc.log:uptime,timemillis,level,tags >>>>> -XX:+DisableExplicitGC -XX:+StartAttachListener >>>>> -XX:NativeMemoryTracking=detail -XX:+FlightRecorder >>>>> --add-exports=java.base/java.lang=ALL-UNNAMED >>>>> --add-opens=java.base/java.lang=ALL-UNNAME\ >>>>> D >>>>> --add-exports=java.xml/com.sun.org.apache.xerces.internal.parsers=ALL-UNNAMED >>>>> --add-exports=java.xml/com.sun.org.apache.xerces.internal.util=ALL-UNNAMED >>>>> -Djava.io.tmpdir=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applicatio\ >>>>> >>>>> ns_kitchensink_Kitchensink14D_java/scratch/0/java.io.tmpdir >>>>> -Duser.home=/scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/user.home >>>>> -agentpath:/scratch/lmesnik/ws/ks-apps/build/\ >>>>> linux-x64/images/test/hotspot/jtreg/native/libJvmtiStressModule.so >>>>> applications.kitchensink.process.stress.Main >>>>> /scratch/lmesnik/ws/ks-apps/build/linux-x64/test-support/jtreg_closed_test_hotspot_jtreg_applications_kitchensink_Kitchensink14D_java/scratch/0/kitchensink.fin\ >>>>> >>>>> al.properties >>>>> >>>>> Host: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz, 4 cores, 14G, >>>>> Oracle Linux Server release 7.5 >>>>> Time: Sun May 19 05:15:11 2019 PDT elapsed time: 111312 seconds (1d >>>>> 6h 55m 12s) >>>>> >>>>> ---------------? T H R E A D? --------------- >>>>> >>>>> Current thread (0x00002ae4e83bc000):? 
JavaThread "C2 >>>>> CompilerThread0" daemon [_thread_in_native, id=3128, >>>>> stack(0x00002ae522de2000,0x00002ae522ee3000)] >>>>> >>>>> >>>>> Current CompileTask: >>>>> C2:111312036 146944?????? 4 >>>>> spec.benchmarks.derby.DerbyHarness$Client::handleResultSet (77 bytes) >>>>> >>>>> Stack: [0x00002ae522de2000,0x00002ae522ee3000], >>>>> sp=0x00002ae522edf650,? free space=1013k >>>>> Native frames: (J=compiled Java code, A=aot compiled Java code, >>>>> j=interpreted, Vv=VM code, C=native code) >>>>> V? [libjvm.so+0x6cbf6c]? InlineTree::is_not_reached(ciMethod*, >>>>> ciMethod*, int, ciCallProfile&) [clone .constprop.153]+0xbc >>>>> V? [libjvm.so+0x6d12e0]? InlineTree::ok_to_inline(ciMethod*, >>>>> JVMState*, ciCallProfile&, WarmCallInfo*, bool&)+0x1950 >>>>> V? [libjvm.so+0xb6e075]? Compile::call_generator(ciMethod*, int, >>>>> bool, JVMState*, bool, float, ciKlass*, bool, bool)+0x905 >>>>> V? [libjvm.so+0xb6f6b9]? Parse::do_call()+0x469 >>>>> V? [libjvm.so+0x1441b70]? Parse::do_one_bytecode()+0xff0 >>>>> V? [libjvm.so+0x1432520]? Parse::do_one_block()+0x650 >>>>> V? [libjvm.so+0x1432a23]? Parse::do_all_blocks()+0x113 >>>>> V? [libjvm.so+0x14348e4]? Parse::Parse(JVMState*, ciMethod*, >>>>> float)+0xc54 >>>>> V? [libjvm.so+0x803d0c] ParseGenerator::generate(JVMState*)+0x18c >>>>> V? [libjvm.so+0x9c08b4]? Compile::Compile(ciEnv*, C2Compiler*, >>>>> ciMethod*, int, bool, bool, bool, DirectiveSet*)+0xe74 >>>>> V? [libjvm.so+0x801d9d]? C2Compiler::compile_method(ciEnv*, >>>>> ciMethod*, int, DirectiveSet*)+0x10d >>>>> V? [libjvm.so+0x9cd17d] >>>>> CompileBroker::invoke_compiler_on_method(CompileTask*)+0x46d >>>>> V? [libjvm.so+0x9ce1d8] CompileBroker::compiler_thread_loop()+0x418 >>>>> V? [libjvm.so+0x16c0baa]? JavaThread::thread_main_inner()+0x26a >>>>> V? [libjvm.so+0x16c9267]? JavaThread::run()+0x227 >>>>> V? [libjvm.so+0x16c62f6]? Thread::call_run()+0xf6 >>>>> V? [libjvm.so+0x13e0d5e]? 
thread_native_entry(Thread*)+0x10e
>>>>>
>>>>> Leonid
>>>>>
>>>>>> On May 18, 2019, at 5:40 PM, Jie Fu wrote:
>>>>>>
>>>>>> Thanks Vladimir Ivanov and Vladimir Kozlov for your review.
>>>>>> Let's wait for Leonid's test result.
>>>>>>
>>>>>> Thanks.
>>>>>> Best regards,
>>>>>> Jie
>>>>>>
>>>>>>> On 2019-05-19 00:15, Vladimir Kozlov wrote:
>>>>>>> Hi Jie,
>>>>>>>
>>>>>>> So the counter was incremented while this code is executed. And
>>>>>>> you fixed it by caching initial value.
>>>>>>> Looks good.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>>> On 5/17/19 6:37 PM, Jie Fu wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> JBS:    https://bugs.openjdk.java.net/browse/JDK-8224162
>>>>>>>> Webrev: http://cr.openjdk.java.net/~jiefu/8224162/webrev.00/
>>>>>>>>
>>>>>>>> I'm sorry to introduce this assertion failure.
>>>>>>>> Please review the suggested fix and give me some advice.
>>>>>>>>
>>>>>>>> Leonid, could you please help to test the patch?
>>>>>>>> I don't have the reproducer you mentioned in the JBS.
>>>>>>>>
>>>>>>>> Thanks a lot.
>>>>>>>> Best regards,
>>>>>>>> Jie

From shade at redhat.com  Tue May 21 16:04:15 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 21 May 2019 18:04:15 +0200
Subject: RFR(XS) 8222482: [Graal] Update java-allocation-instrumenter.jar handling in graalunit README.md
In-Reply-To: <79f43014-1526-dea3-b01a-61c0ee56ce33@oracle.com>
References: <265BE516-B554-4CBD-B1DC-6A97C49E168D@oracle.com>
 <79f43014-1526-dea3-b01a-61c0ee56ce33@oracle.com>
Message-ID: <8e1ff36c-0099-6386-e82d-09e062f7d215@redhat.com>

I have not tested the patch, but it looks good.

-Aleksey

On 5/20/19 10:28 PM, Ekaterina Pavlova wrote:
> Thanks Vladimir,
> I will wait for Aleksey's review as well.
>
> -katya
>
> On 5/20/19 1:25 PM, Vladimir Kozlov wrote:
>> Okay.
>>
>> Thanks
>> Vladimir
>>
>>> On May 20, 2019, at 1:09 PM, Ekaterina Pavlova wrote:
>>>
>>> Hi All,
>>>
>>> Please review the change which updates test/hotspot/jtreg/compiler/graalunit/README.md.
>>> One more auxiliary script test/hotspot/jtreg/compiler/graalunit/downloadLibs.sh has been also added.
>>>
>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8222482
>>> webrev: http://cr.openjdk.java.net/~epavlova/8222482/webrev.00/index.html
>>>
>>> thanks,
>>> -katya

From vladimir.x.ivanov at oracle.com  Tue May 21 17:33:18 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Tue, 21 May 2019 20:33:18 +0300
Subject: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64
In-Reply-To:
References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com>
 <3e1ceae0-f7a9-e2e6-2b06-59a22540550d@oracle.com>
Message-ID: <3d9c0897-0275-c341-fe33-5f0b6c94f253@oracle.com>

Thanks for the feedback, David!

Updated webrev:
  http://cr.openjdk.java.net/~vlivanov/8223213/webrev.01/

Some responses inline:

> Forgot to mention that your new test doesn't look like it will play
> nicely when run with Graal enabled, so you may need to split up into
> different @test sections and add "@requires !vm.graal.enabled" to
> exclude graal.

What kind of problem when running with Graal do you have in mind?

I double-checked that the test passes with Graal enabled.

>> I'll be very happy to see this go in - though I do wish we had more
>> platform coverage than just x86_64. Hopefully the other archs will
>> jump on-board with this as well.

Yes, fully agree with you. It should be pretty straightforward for
maintainers to mirror x86-specific changes for their architectures.

>> I was initially confused by the UseFastClassInitChecks flag as I
>> couldn't really see why you would want to turn it off (other than
>> perhaps during testing) but I see that it is really used (as you
>> explained to Vladimir K.) to exclude the new code for platforms which
>> have not implemented it. Though I'm still not sure that we shouldn't
>> have something to detect it being turned on at runtime on platforms
>> that don't support it (it will likely crash quickly but still ...).
>> Keep wondering if there is a better way to handle this aspect of the
>> change ...

I deliberately made the flag develop, so it's possible to change it from
command-line only in debug builds. I could introduce additional
platform-specific validation, but it doesn't look worth the effort for
such narrow case (and there are other develop flags which guard broken
functionality).

>> I can't comment on the actual interpreter and compiler changes - sorry.

No problem, I'll wait for more reviews from Runtime team.

>> This will need re-basing now that JDK-8219974 has been backed out.

Done.

Best regards,
Vladimir Ivanov

>> On 2/05/2019 9:17 am, Vladimir Ivanov wrote:
>>> http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/
>>> https://bugs.openjdk.java.net/browse/JDK-8223213
>>>
>>> (It's a followup RFR on a earlier RFC [1].)
>>>
>>> Recent changes severely affected how static initializers are executed
>>> and for long-running initializers it manifested as a severe slowdown.
>>> As an example, it led to a 3x slowdown on some Clojure applications
>>> (JDK-8219233 [2]). The root cause is that until a class is fully
>>> initialized, every invocation of static method on it goes through
>>> method resolution.
>>>
>>> Proposed fix introduces fast class initialization barriers for C1,
>>> C2, and template interpreter on x86-64. I did some experiments with
>>> cross-platform approaches, but haven't got satisfactory results.
>>>
>>> On other platforms, behavior stays (mostly) intact.
(I had to revert
>>> some changes introduced by JDK-8219492 [3], since the assumptions
>>> they rely on about accesses inside a class don't hold in all cases.)
>>>
>>> The barrier is as simple as:
>>>     if (holder->is_not_initialized() &&
>>>         !holder->is_reentrant_initialization(current_thread)) {
>>>       // trigger call site re-resolution and block there
>>>     }
>>>
>>> There are 3 places where barriers are added:
>>>    * in template interpreter for invokestatic bytecode;
>>>    * at nmethod verified entry point (for normal compilations);
>>>    * c2i adapters;
>>>
>>> For template interpreter, there's additional check added into
>>> TemplateTable::resolve_cache_and_index which calls into
>>> InterpreterRuntime::resolve_from_cache when fast path checks fail.
>>>
>>> In case of nmethods, the barrier is put before frame construction, so
>>> existing compiler runtime routines can be reused
>>> (SharedRuntime::get_handle_wrong_method_stub()).
>>>
>>> Also, C2 has a guard on entry (Parse::clinit_deopt()) which triggers
>>> nmethod recompilation once the class is fully initialized.
>>>
>>> OSR compilations don't need a barrier.
>>>
>>> Correspondence between barriers and transitions they cover:
>>>    (1) from interpreter (barrier on caller side)
>>>         * all transitions: interpreter, compiled (i2c), native, aot, ...
>>>
>>>    (2) from compiled (barrier on callee side)
>>>         to compiled, to native (barrier in native wrapper on entry)
>>>
>>>    (3) c2i bypasses both barriers (interpreter and compiled) and
>>>        requires a dedicated barrier in c2i
>>>
>>>    (4) to Graal/AOT code:
>>>          from interpreter: covered by interpreter barrier
>>>          from compiled: call site patching is disabled, leading to
>>>          repeated call site resolution until method holder is fully
>>>          initialized (original behavior).
>>>
>>> Performance experiments with clojure [2] demonstrated that the fix
>>> almost completely recuperates the regression:
>>>
>>>
(1) always reresolve (w/o the fix):    ~12,0s ( 1x)
>>>    (2) C1/C2 barriers only:                ~3,8s (~3x)
>>>    (3) int/C1/C2 barriers:                 ~3,2s (-20%)
>>> --------
>>>    (4) barriers disabled for invokestatic  ~3,2s
>>>
>>> I deliberately tried to keep the patch backport-friendly for
>>> 8u/11u/12u and refrained from using newer features like nmethod
>>> barriers introduced recently. The fix can be refactored later
>>> specifically for 13 as a followup change.
>>>
>>> Testing: clojure startup, tier1-5
>>>
>>> Thanks!
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-April/037760.html
>>>
>>> [2] https://bugs.openjdk.java.net/browse/JDK-8219233
>>> [3] https://bugs.openjdk.java.net/browse/JDK-8219492

From daniel.daugherty at oracle.com  Tue May 21 19:06:28 2019
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 21 May 2019 15:06:28 -0400
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To:
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com>
Message-ID: <2f312755-c10d-af13-f4f8-7aa6c9f8466c@oracle.com>

On 5/21/19 6:27 AM, Robbin Ehn wrote:
> Hi all, please this update.
>
> Dean do you have any more comments?
>
> For some reason webrev don't shows whitespace fix in biasedLocking.cpp.
> But is clearly visible in:
> http://cr.openjdk.java.net/~rehn/8221734/v5/inc/webrev/open.patch

Also visible via the file's patch link:

http://cr.openjdk.java.net/~rehn/8221734/v5/inc/webrev/src/hotspot/share/runtime/biasedLocking.cpp.patch

I believe the default for webrev is to ignore leading and trailing
white space changes for most of the "diffs" that are generated. So
a white space change like this:

    < This is  the line.
    > This is the line.

will (likely) show up. :-)

>
> Inc:
> http://cr.openjdk.java.net/~rehn/8221734/v5/inc/webrev/

src/hotspot/share/code/codeCache.cpp
    No comments.

src/hotspot/share/oops/method.cpp
    No comments.

src/hotspot/share/runtime/biasedLocking.cpp
    No comments.

src/hotspot/share/runtime/deoptimization.cpp
    No comments.

src/hotspot/share/runtime/deoptimization.hpp
    No comments.

test/hotspot/jtreg/compiler/codecache/stress/UnexpectedDeoptimizationAllTest.java
    No comments.

Thumbs up!

Dan

> Full:
> http://cr.openjdk.java.net/~rehn/8221734/v5/webrev/
>
> Thanks, Robbin
>
> On 2019-04-25 14:05, Robbin Ehn wrote:
>> Hi all, please review.
>>
>> Let's deopt with handshakes.
>> Removed VM op Deoptimize, instead we handshake.
>> Locks needs to be inflate since we are not in a safepoint.
>>
>> Goes on top of:
>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html
>>
>> Code:
>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html
>> Issue:
>> https://bugs.openjdk.java.net/browse/JDK-8221734
>>
>> Passes t1-7 and multiple t1-5 runs.
>>
>> A few startup benchmark see a small speedup.
>>
>> Thanks, Robbin

From dean.long at oracle.com  Tue May 21 21:35:10 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Tue, 21 May 2019 14:35:10 -0700
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To:
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com>
Message-ID:

On 5/21/19 3:27 AM, Robbin Ehn wrote:
> Dean do you have any more comments?

No, you already addressed my concerns.  Thanks.

dl

From david.holmes at oracle.com  Tue May 21 22:34:59 2019
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 22 May 2019 08:34:59 +1000
Subject: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64
In-Reply-To: <3d9c0897-0275-c341-fe33-5f0b6c94f253@oracle.com>
References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com>
 <3e1ceae0-f7a9-e2e6-2b06-59a22540550d@oracle.com>
 <3d9c0897-0275-c341-fe33-5f0b6c94f253@oracle.com>
Message-ID:

Hi Vladimir,

On 22/05/2019 3:33 am, Vladimir Ivanov wrote:
> Thanks for the feedback, David!
>
> Updated webrev:
>
http://cr.openjdk.java.net/~vlivanov/8223213/webrev.01/
>
> Some responses inline:

>> Forgot to mention that your new test doesn't look like it will play
>> nicely when run with Graal enabled, so you may need to split up into
>> different @test sections and add "@requires !vm.graal.enabled" to
>> exclude graal.
>
> What kind of problem when running with Graal do you have in mind?
>
> I double-checked that the test passes with Graal enabled.

Your various @run lines are trying to execute in different compilation
modes: -Xint, C1 only, C2 only, plus various permutations. If you take
those command-lines and then add all the Graal flags to it then you are
no longer testing any of the things you wanted to test. At best you test
Graal 7 times - which is pointless. At worst you may find some variations
will timeout when Graal is applied.

For that matter an Xcomp run will also negate your intentions.

Cheers,
David
-----

>>> I'll be very happy to see this go in - though I do wish we had more
>>> platform coverage than just x86_64. Hopefully the other archs will
>>> jump on-board with this as well.
>
> Yes, fully agree with you. It should be pretty straightforward for
> maintainers to mirror x86-specific changes for their architectures.
>
>>> I was initially confused by the UseFastClassInitChecks flag as I
>>> couldn't really see why you would want to turn it off (other than
>>> perhaps during testing) but I see that it is really used (as you
>>> explained to Vladimir K.) to exclude the new code for platforms which
>>> have not implemented it. Though I'm still not sure that we shouldn't
>>> have something to detect it being turned on at runtime on platforms
>>> that don't support it (it will likely crash quickly but still ...).
>>> Keep wondering if there is a better way to handle this aspect of the
>>> change ...
>
> I deliberately made the flag develop, so it's possible to change it from
> command-line only in debug builds.
I could introduce additional
> platform-specific validation, but it doesn't look worth the effort for
> such narrow case (and there are other develop flags which guard broken
> functionality).
>
>>> I can't comment on the actual interpreter and compiler changes - sorry.
>
> No problem, I'll wait for more reviews from Runtime team.
>
>>> This will need re-basing now that JDK-8219974 has been backed out.
>
> Done.
>
> Best regards,
> Vladimir Ivanov
>
>>> On 2/05/2019 9:17 am, Vladimir Ivanov wrote:
>>>> http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/
>>>> https://bugs.openjdk.java.net/browse/JDK-8223213
>>>>
>>>> (It's a followup RFR on a earlier RFC [1].)
>>>>
>>>> Recent changes severely affected how static initializers are
>>>> executed and for long-running initializers it manifested as a severe
>>>> slowdown.
>>>> As an example, it led to a 3x slowdown on some Clojure applications
>>>> (JDK-8219233 [2]). The root cause is that until a class is fully
>>>> initialized, every invocation of static method on it goes through
>>>> method resolution.
>>>>
>>>> Proposed fix introduces fast class initialization barriers for C1,
>>>> C2, and template interpreter on x86-64. I did some experiments with
>>>> cross-platform approaches, but haven't got satisfactory results.
>>>>
>>>> On other platforms, behavior stays (mostly) intact. (I had to revert
>>>> some changes introduced by JDK-8219492 [3], since the assumptions
>>>> they rely on about accesses inside a class don't hold in all cases.)
>>>>
>>>> The barrier is as simple as:
>>>>     if (holder->is_not_initialized() &&
>>>>         !holder->is_reentrant_initialization(current_thread)) {
>>>>       // trigger call site re-resolution and block there
>>>>     }
>>>>
>>>> There are 3 places where barriers are added:
>>>>    * in template interpreter for invokestatic bytecode;
>>>>    * at nmethod verified entry point (for normal compilations);
>>>>
* c2i adapters;
>>>>
>>>> For template interpreter, there's additional check added into
>>>> TemplateTable::resolve_cache_and_index which calls into
>>>> InterpreterRuntime::resolve_from_cache when fast path checks fail.
>>>>
>>>> In case of nmethods, the barrier is put before frame construction,
>>>> so existing compiler runtime routines can be reused
>>>> (SharedRuntime::get_handle_wrong_method_stub()).
>>>>
>>>> Also, C2 has a guard on entry (Parse::clinit_deopt()) which triggers
>>>> nmethod recompilation once the class is fully initialized.
>>>>
>>>> OSR compilations don't need a barrier.
>>>>
>>>> Correspondence between barriers and transitions they cover:
>>>>    (1) from interpreter (barrier on caller side)
>>>>         * all transitions: interpreter, compiled (i2c), native, aot, ...
>>>>
>>>>    (2) from compiled (barrier on callee side)
>>>>         to compiled, to native (barrier in native wrapper on entry)
>>>>
>>>>    (3) c2i bypasses both barriers (interpreter and compiled) and
>>>>        requires a dedicated barrier in c2i
>>>>
>>>>    (4) to Graal/AOT code:
>>>>          from interpreter: covered by interpreter barrier
>>>>          from compiled: call site patching is disabled, leading to
>>>>          repeated call site resolution until method holder is fully
>>>>          initialized (original behavior).
>>>>
>>>> Performance experiments with clojure [2] demonstrated that the fix
>>>> almost completely recuperates the regression:
>>>>
>>>>    (1) always reresolve (w/o the fix):    ~12,0s ( 1x)
>>>>    (2) C1/C2 barriers only:                ~3,8s (~3x)
>>>>    (3) int/C1/C2 barriers:                 ~3,2s (-20%)
>>>> --------
>>>>    (4) barriers disabled for invokestatic  ~3,2s
>>>>
>>>> I deliberately tried to keep the patch backport-friendly for
>>>> 8u/11u/12u and refrained from using newer features like nmethod
>>>> barriers introduced recently. The fix can be refactored later
>>>> specifically for 13 as a followup change.
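
The barrier condition quoted in this thread can be modeled in isolation. What follows is only a minimal sketch under stated assumptions: `InitState`, `Holder` and `needs_init_barrier` are hypothetical stand-ins invented for illustration, not the real HotSpot `InstanceKlass` API, and thread identity is reduced to a plain integer id.

```cpp
#include <cstdint>

// Hypothetical, simplified model of a class's initialization state.
enum class InitState { Loaded, BeingInitialized, FullyInitialized };

struct Holder {
    InitState state;
    std::uint64_t init_thread;  // id of the thread running the initializer, 0 if none

    // Mirrors the name used in the quoted condition: anything short of
    // "fully initialized" counts as not initialized.
    bool is_not_initialized() const {
        return state != InitState::FullyInitialized;
    }

    // True only for the thread that is itself running the static initializer.
    bool is_reentrant_initialization(std::uint64_t current_thread) const {
        return state == InitState::BeingInitialized && init_thread == current_thread;
    }
};

// Same shape as the quoted condition. A true result means the caller must
// leave the fast path (re-resolve the call site and block until the class is
// initialized); false means the call may proceed directly.
bool needs_init_barrier(const Holder& holder, std::uint64_t current_thread) {
    return holder.is_not_initialized() &&
           !holder.is_reentrant_initialization(current_thread);
}
```

In this model the initializing thread passes straight through while the initializer is running, every other thread takes the slow path until initialization completes, and after that the check is false for everyone.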
>>>>
>>>> Testing: clojure startup, tier1-5
>>>>
>>>> Thanks!
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>> [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-April/037760.html
>>>>
>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8219233
>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8219492

From vivek.r.deshpande at intel.com  Wed May 22 00:12:37 2019
From: vivek.r.deshpande at intel.com (Deshpande, Vivek R)
Date: Wed, 22 May 2019 00:12:37 +0000
Subject: RFR(XS) 8224558: x86 Fix replicateB encoding
Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EBAE0@ORSMSX106.amr.corp.intel.com>

Hi All

The encoding for replicateB in x86.ad uses the dst register as one of the sources without initializing it when the source of the scalar value is memory. This leads to wrong replication in the resulting vector.
I have a fix for the bug in this webrev:
http://cr.openjdk.java.net/~vdeshpande/8224558/webrev.00/
I have created the following JBS entry:
https://bugs.openjdk.java.net/browse/JDK-8224558
Kindly requesting review for the patch.

Regards,
Vivek

From fujie at loongson.cn  Wed May 22 03:54:48 2019
From: fujie at loongson.cn (Jie Fu)
Date: Wed, 22 May 2019 11:54:48 +0800
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To: <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com>
References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn>
 <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com>
 <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com>
Message-ID: <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn>

On 2019/5/21 10:30 PM, Vladimir Ivanov wrote:
> It doesn't explain the failure since the assert is hit while parsing
> invoke* bytecode while type check failures are recorded for checkcast,
> aastore, and instanceof. Am I missing something important here?
>
Good question. I'm so sorry to make you confused.

After a long time of digging into the code, I think this failure was caused by an overflow of profile.count().

A reproducer is constructed here:
 - http://cr.openjdk.java.net/~jiefu/8224162/CounterOverflow.java

And I've changed the comment for the patch:
 - http://cr.openjdk.java.net/~jiefu/8224162/webrev.02/

Please review it and give me some advice.

Thanks.
Best regards,
Jie

From leonid.mesnik at oracle.com  Wed May 22 03:57:39 2019
From: leonid.mesnik at oracle.com (Leonid Mesnik)
Date: Tue, 21 May 2019 20:57:39 -0700
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To: <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn>
References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn>
 <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com>
 <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com>
 <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn>
Message-ID:

Could you please add the reproducer as a regression test for the fix? Please verify that the test fails without the fix and passes with it.

Leonid

> On May 21, 2019, at 8:54 PM, Jie Fu wrote:
>
> On 2019/5/21 10:30 PM, Vladimir Ivanov wrote:
>> It doesn't explain the failure since the assert is hit while parsing invoke* bytecode while type check failures are recorded for checkcast, aastore, and instanceof. Am I missing something important here?
>>
> Good question. I'm so sorry to make you confused.
>
> After a long time of digging into the code, I think this failure was caused by an overflow of profile.count().
>
> A reproducer is constructed here:
> - http://cr.openjdk.java.net/~jiefu/8224162/CounterOverflow.java
>
> And I've changed the comment for the patch:
> - http://cr.openjdk.java.net/~jiefu/8224162/webrev.02/
>
> Please review it and give me some advice.
>
> Thanks.
> Best regards, > Jie > > From david.holmes at oracle.com Wed May 22 04:29:30 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 22 May 2019 14:29:30 +1000 Subject: RFR (trivial) 8224570: Update ProblemList-graal.txt Message-ID: Bug: https://bugs.openjdk.java.net/browse/JDK-8224570 webrev: http://cr.openjdk.java.net/~dholmes/8224570/webrev/ A number of bugs have been fixed for which their entries have not been removed from ProblemList-graal.txt. Thanks, David diff -r f98a0ab24887 test/hotspot/jtreg/ProblemList-graal.txt --- a/test/hotspot/jtreg/ProblemList-graal.txt +++ b/test/hotspot/jtreg/ProblemList-graal.txt @@ -226,14 +226,6 @@ runtime/exceptionMsgs/AbstractMethodError/AbstractMethodErrorTest.java 8222582 generic-all -vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects002/referringObjects002.java 8220032 generic-all -vmTestbase/nsk/jdi/VirtualMachine/instanceCounts/instancecounts003/instancecounts003.java 8220032 generic-all - -vmTestbase/nsk/jdi/ClassLoaderReference/definedClasses/definedclasses003/TestDescription.java 8222422 generic-all -vmTestbase/nsk/jdi/ClassLoaderReference/definedClasses/definedclasses005/TestDescription.java 8222422 generic-all - -runtime/exceptionMsgs/ArrayIndexOutOfBoundsException/ArrayIndexOutOfBoundsExceptionTest.java 8222292 generic-all - serviceability/dcmd/compiler/CodelistTest.java 8220449 generic-all # Graal unit tests From fujie at loongson.cn Wed May 22 05:33:42 2019 From: fujie at loongson.cn (Jie Fu) Date: Wed, 22 May 2019 13:33:42 +0800 Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached In-Reply-To: References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com> <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn> Message-ID: <296a2419-53cb-3d96-91f9-8ee26c024f55@loongson.cn> On 2019/5/22 ??11:57, Leonid Mesnik wrote: > Could you please add reproducer 
as regression test for fix. Please verify that test failed without fix and pass with fix. I'd like to do that. Thanks. > > Leonid > >> On May 21, 2019, at 8:54 PM, Jie Fu wrote: >> >> On 2019/5/21 ??10:30, Vladimir Ivanov wrote: >>> It doesn't explain the failure since the assert is hit while parsing invoke* bytecode while type check failures are recorded for checkcast, aastore, and instanceof. Am I missing something important here? >>> >> Good question. I'm so sorry to make you confused. >> >> After a long time of digging into the code, I think this failure was cause by the overflow of profile.cout. >> >> A reproducer is constructed here: >> - http://cr.openjdk.java.net/~jiefu/8224162/CounterOverflow.java >> >> And I've changed the comment for the patch: >> - http://cr.openjdk.java.net/~jiefu/8224162/webrev.02/ >> >> Please review it and give me some advice. >> >> Thanks. >> Best regards, >> Jie >> >> From robbin.ehn at oracle.com Wed May 22 06:31:34 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 22 May 2019 08:31:34 +0200 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: <2f312755-c10d-af13-f4f8-7aa6c9f8466c@oracle.com> References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <2f312755-c10d-af13-f4f8-7aa6c9f8466c@oracle.com> Message-ID: <7e616936-b83b-b8af-2079-cab46b73d8e7@oracle.com> > Thumbs up! Thanks Dan! /Robbin > > Dan > >> Full: >> http://cr.openjdk.java.net/~rehn/8221734/v5/webrev/ >> >> Thanks, Robbin >> >> On 2019-04-25 14:05, Robbin Ehn wrote: >>> Hi all, please review. >>> >>> Let's deopt with handshakes. >>> Removed VM op Deoptimize, instead we handshake. >>> Locks needs to be inflate since we are not in a safepoint. >>> >>> Goes on top of: >>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033491.html >>> >>> >>> Code: >>> http://cr.openjdk.java.net/~rehn/8221734/v1/webrev/index.html >>> Issue: >>> https://bugs.openjdk.java.net/browse/JDK-8221734 >>> >>> Passes t1-7 and multiple t1-5 runs. 
>>> >>> A few startup benchmark see a small speedup. >>> >>> Thanks, Robbin > From robbin.ehn at oracle.com Wed May 22 06:31:53 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 22 May 2019 08:31:53 +0200 Subject: RFR(m): 8221734: Deoptimize with handshakes In-Reply-To: References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> Message-ID: <9c67a8fc-bd23-eed9-a484-c4f03daec18b@oracle.com> Thanks Dean! /Robbin On 2019-05-21 23:35, dean.long at oracle.com wrote: > On 5/21/19 3:27 AM, Robbin Ehn wrote: >> Dean do you have any more comments? > > No, you already addressed my concerns.? Thanks. > > dl From tobias.hartmann at oracle.com Wed May 22 07:48:07 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 22 May 2019 09:48:07 +0200 Subject: RFR (trivial) 8224570: Update ProblemList-graal.txt In-Reply-To: References: Message-ID: Hi David, reviewed and considered trivial. Best regards, Tobias On 22.05.19 06:29, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8224570 > webrev: http://cr.openjdk.java.net/~dholmes/8224570/webrev/ > > A number of bugs have been fixed for which their entries have not been removed from > ProblemList-graal.txt. > > Thanks, > David > > diff -r f98a0ab24887 test/hotspot/jtreg/ProblemList-graal.txt > --- a/test/hotspot/jtreg/ProblemList-graal.txt > +++ b/test/hotspot/jtreg/ProblemList-graal.txt > @@ -226,14 +226,6 @@ > > ?runtime/exceptionMsgs/AbstractMethodError/AbstractMethodErrorTest.java ?????? 8222582 generic-all > > -vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects002/referringObjects002.java > 8220032 generic-all > -vmTestbase/nsk/jdi/VirtualMachine/instanceCounts/instancecounts003/instancecounts003.java ?????? 
> 8220032 generic-all > - > -vmTestbase/nsk/jdi/ClassLoaderReference/definedClasses/definedclasses003/TestDescription.java > 8222422 generic-all > -vmTestbase/nsk/jdi/ClassLoaderReference/definedClasses/definedclasses005/TestDescription.java > 8222422 generic-all > - > -runtime/exceptionMsgs/ArrayIndexOutOfBoundsException/ArrayIndexOutOfBoundsExceptionTest.java > 8222292 generic-all > - > ?serviceability/dcmd/compiler/CodelistTest.java 8220449 generic-all > > ?# Graal unit tests From tobias.hartmann at oracle.com Wed May 22 08:12:57 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 22 May 2019 10:12:57 +0200 Subject: RFR(XS) 8224558: x86 Fix replicateB encoding In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EBAE0@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EBAE0@ORSMSX106.amr.corp.intel.com> Message-ID: <140eaec4-a0fb-cfbf-cbae-d9b5661df758@oracle.com> Hi Vivek, I wonder why this never showed up, could you please add a regression test? You also need to fix the format strings. Thanks, Tobias On 22.05.19 02:12, Deshpande, Vivek R wrote: > Hi All > > ? > > The encoding for replicateB in x86.ad uses dst register as one of the source without initializing, > when the source for the scalar value is memory. > > This leads to wrong replication in the resulting vector. > > I have a fix for the bug in this webrev: > > http://cr.openjdk.java.net/~vdeshpande/8224558/webrev.00/ > > I have created following JBS Entry: > > https://bugs.openjdk.java.net/browse/JDK-8224558 > > ? > > Kindly requesting review for the patch. > > ? > > Regards, > > Vivek > From aoqi at loongson.cn Wed May 22 08:41:11 2019 From: aoqi at loongson.cn (Ao Qi) Date: Wed, 22 May 2019 16:41:11 +0800 Subject: RFR(trivial): JDK-8224568: minimal and zero build fails after JDK-8213084 Message-ID: Hi, Could I please get reviews for this? 
JBS:
https://bugs.openjdk.java.net/browse/JDK-8224568

Webrev:
http://cr.openjdk.java.net/~aoqi/8224568/webrev.00/

Tested:
linux-x86_64-{server, zero, minimal}-{release, fastdebug} build
linux-x86_64-server-release hotspot:tier1

Thanks,
Ao Qi

From rwestrel at redhat.com Wed May 22 08:55:18 2019
From: rwestrel at redhat.com (Roland Westrelin)
Date: Wed, 22 May 2019 10:55:18 +0200
Subject: RFR(S): 8224580: Matcher can cause oop field/array element to be reloaded
Message-ID: <877eailvgp.fsf@redhat.com>

http://cr.openjdk.java.net/~roland/8224580/webrev.00/

With Shenandoah, we have a prototype where, rather than using an extra
header field for the forwarding pointer, we use the mark word. When the
Shenandoah barrier is applied to an oop, the oop is first checked to be
in cset which is done by converting the oop to an integer and extracting
some of the bits of the address and then, if that test fails, the
forwarding pointer is loaded by masking off some bits in the mark
word. We're seeing a case where, in compiled code, the in cset test and
the mark word load for the same oop use 2 different loads, i.e. the load
of the field/array element is duplicated: the oop is reloaded between
the in cset test and the loading of the forwarding pointer. This only
happens with compressed oops.

I traced it down to the matcher, where the LoadN node for the
field/array element is not marked as shared and so is emitted
twice. This happens because the LoadN result is only used through a
DecodeN which has multiple uses. I see the matcher has code to deal with
that situation:

    if( mop == Op_AddP && m->in(AddPNode::Base)->is_DecodeNarrowPtr()) {
      // Bases used in addresses must be shared but since
      // they are shared through a DecodeN they may appear
      // to have a single use so force sharing here.
      set_shared(m->in(AddPNode::Base)->in(1));
    }

But that logic doesn't trigger in our case because the mark word load is
at offset 0 so there's no AddP.
The fix I propose is to always mark the input of a DecodeN as shared if
the DecodeN is shared.

Roland.

From fujie at loongson.cn Wed May 22 08:55:07 2019
From: fujie at loongson.cn (Jie Fu)
Date: Wed, 22 May 2019 16:55:07 +0800
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To:
References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com> <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn>
Message-ID: <259a914e-1c9c-c884-6114-6f855a96afb6@loongson.cn>

On 2019/5/22 11:57, Leonid Mesnik wrote:
> Could you please add reproducer as regression test for fix. Please verify that test failed without fix and pass with fix.

Done.

Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.03/

# Testing
On an i7-8700 at 3.20GHz machine, the test results:
-----------------------------------------------
- fastdebug:
make test TEST="test/hotspot/jtreg/compiler/profiling/TestProfileCounterOverflow.java" CONF=fastdebug
  Before fix: failed, elapsed time: 37s
  After  fix: pass,   elapsed time: 43s
- slowdebug:
make test TEST="test/hotspot/jtreg/compiler/profiling/TestProfileCounterOverflow.java" CONF=slowdebug
  Before fix: failed, elapsed time: 38s
  After  fix: pass,   elapsed time: 43s
-----------------------------------------------

Please review it and give me some advice.

Thanks.
Best regards,
Jie

>
> Leonid
>
>> On May 21, 2019, at 8:54 PM, Jie Fu wrote:
>>
>> On 2019/5/21 10:30, Vladimir Ivanov wrote:
>>> It doesn't explain the failure since the assert is hit while parsing invoke* bytecode while type check failures are recorded for checkcast, aastore, and instanceof. Am I missing something important here?
>>>
>> Good question. I'm so sorry to make you confused.
>>
>> After a long time of digging into the code, I think this failure was caused by the overflow of profile.count.
>> >> A reproducer is constructed here: >> - http://cr.openjdk.java.net/~jiefu/8224162/CounterOverflow.java >> >> And I've changed the comment for the patch: >> - http://cr.openjdk.java.net/~jiefu/8224162/webrev.02/ >> >> Please review it and give me some advice. >> >> Thanks. >> Best regards, >> Jie >> >> From rwestrel at redhat.com Wed May 22 09:06:40 2019 From: rwestrel at redhat.com (Roland Westrelin) Date: Wed, 22 May 2019 11:06:40 +0200 Subject: RFR(S): 8224496: Shenandoah compilation fails with assert(is_CountedLoopEnd()) failed: invalid node class Message-ID: <874l5mluxr.fsf@redhat.com> http://cr.openjdk.java.net/~roland/8224496/webrev.00/ Expanding a barrier in the outer loop of a strip mined loop nest confuses the loop strip mining verification logic. This is not a new problem and the way this relatively rare case has been handled so far is by turning the outer strip mined loop into a regular loop so the verification logic doesn't trigger. In the move to the new barrier scheme, some of that logic was dropped. This fix puts it back and also fixes a bug that it causes. Roland. From rkennke at redhat.com Wed May 22 09:14:51 2019 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 22 May 2019 11:14:51 +0200 Subject: RFR(S): 8224496: Shenandoah compilation fails with assert(is_CountedLoopEnd()) failed: invalid node class In-Reply-To: <874l5mluxr.fsf@redhat.com> References: <874l5mluxr.fsf@redhat.com> Message-ID: The fix looks good to me! Thanks! Roman > http://cr.openjdk.java.net/~roland/8224496/webrev.00/ > > Expanding a barrier in the outer loop of a strip mined loop nest > confuses the loop strip mining verification logic. This is not a new > problem and the way this relatively rare case has been handled so far is > by turning the outer strip mined loop into a regular loop so the > verification logic doesn't trigger. In the move to the new barrier > scheme, some of that logic was dropped. This fix puts it back and also > fixes a bug that it causes. 
>
> Roland.
>

From rwestrel at redhat.com Wed May 22 09:25:43 2019
From: rwestrel at redhat.com (Roland Westrelin)
Date: Wed, 22 May 2019 11:25:43 +0200
Subject: RFR(S): 8173196: [REDO] C2 does not optimize redundant memory operations with G1
Message-ID: <871s0qlu20.fsf@redhat.com>

http://cr.openjdk.java.net/~roland/8173196/webrev.00/

Previous attempt at this was discussed here:

http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-January/021014.html

And the follow up bugs with some comments on a possible fix:

https://bugs.openjdk.java.net/browse/JDK-8172850

The new fix is very similar to the previous one. The 2 differences are:

- aarch64 code shouldn't need any change because of 8209420 ("Track
  membars for volatile accesses so they can be properly optimized")

- The membar only affects the raw memory slice which is now properly
  handled by MachNodes thanks to 8209691 ("Allow MemBar on single memory
  slice")

Roland.

From tobias.hartmann at oracle.com Wed May 22 09:35:32 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 22 May 2019 11:35:32 +0200
Subject: [13] RFR(S): 8223581: C2 compilation failed with assert(!q->is_MergeMem())
Message-ID:

Hi,

please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8223581
http://cr.openjdk.java.net/~thartmann/8223581/webrev.00/

We hit an assert during parsing with incremental inlining when merging memory edges into a target
block because of a MergeMem that has another MergeMem as input.
The root cause is the same as with
8221592 [1]: After the fix for 8059241 [2], we don't always execute a round of PhaseRemoveUseless /
IGVN after incremental inlining and as a result, MergeMems that feed into other MergeMems are not
cleaned immediately (but they are on the IGVN worklist and will be cleaned up eventually).

To not confuse the parser, we need to remove them eagerly. This is already done in
GraphKit::replace_call() for the non-exceptional memory edge but the implementation fails to also
handle the exceptional case.

Instead of a Node_List, I'm now using a Unique_Node_List to avoid the costly contains() call that
has O(n) complexity.

I've verified the fix by many runs of the api/java_lang JCK tests and testing on relevant tiers.

Thanks,
Tobias

[1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033505.html
[2] https://bugs.openjdk.java.net/browse/JDK-8059241

From vladimir.x.ivanov at oracle.com Wed May 22 09:57:48 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 22 May 2019 12:57:48 +0300
Subject: [13] RFR(S): 8223581: C2 compilation failed with assert(!q->is_MergeMem())
In-Reply-To:
References:
Message-ID: <84e3b9c2-2e37-f076-7deb-9d1a7d4aca51@oracle.com>

The fix looks good.

Is it worth extracting the repetitive code [1] [2] into a helper method?
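
As a rough illustration of the kind of helper being asked about, the sketch below factors the "queue every MergeMem user of a memory node" loop into one function and pairs it with a worklist that silently drops duplicate pushes. ToyNode, UniqueWorklist and record_merge_mem_users are hypothetical stand-ins invented for this example; they are not HotSpot's Node or Unique_Node_List classes, and nothing here is taken from the webrev.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Toy stand-in for a graph node; NOT HotSpot's Node class.
struct ToyNode {
  bool is_merge_mem = false;     // plays the role of Node::is_MergeMem()
  std::vector<ToyNode*> uses;    // def-use edges: nodes that consume this one
};

// Worklist that ignores duplicate pushes, so callers never need a
// linear contains() scan before pushing.
class UniqueWorklist {
 public:
  void push(ToyNode* n) {
    assert(n != nullptr);
    if (seen_.insert(n).second) {  // true only on the first insertion
      order_.push_back(n);
    }
  }
  std::size_t size() const { return order_.size(); }
  ToyNode* at(std::size_t i) const { return order_.at(i); }

 private:
  std::vector<ToyNode*> order_;        // preserves insertion order
  std::unordered_set<ToyNode*> seen_;  // average O(1) membership test
};

// Hypothetical helper: if 'mem' is a MergeMem, queue every use of it
// that is itself a MergeMem. Does nothing for any other kind of node.
void record_merge_mem_users(ToyNode* mem, UniqueWorklist& wl) {
  assert(mem != nullptr);
  if (!mem->is_merge_mem) {
    return;
  }
  for (ToyNode* use : mem->uses) {
    if (use->is_merge_mem) {
      wl.push(use);
    }
  }
}
```

Calling such a helper once for the fall-through memory state and once for the exceptional one would replace the two copies of the loop, and the set-backed worklist is what makes a separate contains() check unnecessary.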

Best regards,
Vladimir Ivanov

[1]
       C->gvn_replace_by(callprojs.fallthrough_memproj, final_mem);
+      if (final_mem->is_MergeMem()) {
+        // Keep track of MergeMems feeding into other MergeMems
+        for (SimpleDUIterator i(final_mem); i.has_next(); i.next()) {
+          Node* use = i.get();
+          if (use->is_MergeMem()) {
+            wl.push(use);
+          }
+        }
+      }

[2]
+      C->gvn_replace_by(callprojs.catchall_memproj, ex_mem);
+      if (ex_mem->is_MergeMem()) {
+        // Keep track of MergeMems feeding into other MergeMems
+        for (SimpleDUIterator i(ex_mem); i.has_next(); i.next()) {
+          Node* use = i.get();
+          if (use->is_MergeMem()) {
+            wl.push(use);
+          }
+        }
+      }

On 22/05/2019 12:35, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8223581
> http://cr.openjdk.java.net/~thartmann/8223581/webrev.00/
>
> We hit an assert during parsing with incremental inlining when merging memory edges into a target
> block because of a MergeMem that has another MergeMem as input. The root cause is the same as with
> 8221592 [1]: After the fix for 8059241 [2], we don't always execute a round of PhaseRemoveUseless /
> IGVN after incremental inlining and as a result, MergeMems that feed into other MergeMems are not
> cleaned immediately (but they are on the IGVN worklist and will be cleaned up eventually).
>
> To not confuse the parser, we need to remove them eagerly. This is already done in
> GraphKit::replace_call() for the non-exceptional memory edge but the implementation misses to also
> handle the exceptional case.
>
> Instead of a Node_List, I'm now using a Unique_Node_List to avoid the costly contains() call that
> has O(n) complexity.
>
> I've verified the fix by many runs of the api/java_lang JCK tests and testing on relevant tiers.
> > Thanks, > Tobias > > [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-April/033505.html > [2] https://bugs.openjdk.java.net/browse/JDK-8059241 > From david.holmes at oracle.com Wed May 22 10:12:18 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 22 May 2019 20:12:18 +1000 Subject: RFR (trivial) 8224570: Update ProblemList-graal.txt In-Reply-To: References: Message-ID: <0799b832-1830-44a3-7207-df952a956f7d@oracle.com> Thanks Tobias! David On 22/05/2019 5:48 pm, Tobias Hartmann wrote: > Hi David, > > reviewed and considered trivial. > > Best regards, > Tobias > > On 22.05.19 06:29, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8224570 >> webrev: http://cr.openjdk.java.net/~dholmes/8224570/webrev/ >> >> A number of bugs have been fixed for which their entries have not been removed from >> ProblemList-graal.txt. >> >> Thanks, >> David >> >> diff -r f98a0ab24887 test/hotspot/jtreg/ProblemList-graal.txt >> --- a/test/hotspot/jtreg/ProblemList-graal.txt >> +++ b/test/hotspot/jtreg/ProblemList-graal.txt >> @@ -226,14 +226,6 @@ >> >> ?runtime/exceptionMsgs/AbstractMethodError/AbstractMethodErrorTest.java ?????? 
8222582 generic-all >> >> -vmTestbase/nsk/jdi/ObjectReference/referringObjects/referringObjects002/referringObjects002.java >> 8220032 generic-all >> -vmTestbase/nsk/jdi/VirtualMachine/instanceCounts/instancecounts003/instancecounts003.java >> 8220032 generic-all >> - >> -vmTestbase/nsk/jdi/ClassLoaderReference/definedClasses/definedclasses003/TestDescription.java >> 8222422 generic-all >> -vmTestbase/nsk/jdi/ClassLoaderReference/definedClasses/definedclasses005/TestDescription.java >> 8222422 generic-all >> - >> -runtime/exceptionMsgs/ArrayIndexOutOfBoundsException/ArrayIndexOutOfBoundsExceptionTest.java >> 8222292 generic-all >> - >> ?serviceability/dcmd/compiler/CodelistTest.java 8220449 generic-all >> >> ?# Graal unit tests From tobias.hartmann at oracle.com Wed May 22 10:22:18 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 22 May 2019 12:22:18 +0200 Subject: [13] RFR(S): 8223581: C2 compilation failed with assert(!q->is_MergeMem()) In-Reply-To: <84e3b9c2-2e37-f076-7deb-9d1a7d4aca51@oracle.com> References: <84e3b9c2-2e37-f076-7deb-9d1a7d4aca51@oracle.com> Message-ID: <02db9000-6a78-0861-3608-2957086fca9b@oracle.com> Thanks Vladimir. On 22.05.19 11:57, Vladimir Ivanov wrote: > Does it worth to extract repetitive code [1] [2] into a helper method? Sure, what about this? http://cr.openjdk.java.net/~thartmann/8223581/webrev.01/ Best regards, Tobias From vladimir.x.ivanov at oracle.com Wed May 22 10:23:37 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 22 May 2019 13:23:37 +0300 Subject: [13] RFR(S): 8223581: C2 compilation failed with assert(!q->is_MergeMem()) In-Reply-To: <02db9000-6a78-0861-3608-2957086fca9b@oracle.com> References: <84e3b9c2-2e37-f076-7deb-9d1a7d4aca51@oracle.com> <02db9000-6a78-0861-3608-2957086fca9b@oracle.com> Message-ID: > http://cr.openjdk.java.net/~thartmann/8223581/webrev.01/ Looks good! 
Best regards, Vladimir Ivanov From tobias.hartmann at oracle.com Wed May 22 10:24:55 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 22 May 2019 12:24:55 +0200 Subject: [13] RFR(S): 8223581: C2 compilation failed with assert(!q->is_MergeMem()) In-Reply-To: References: <84e3b9c2-2e37-f076-7deb-9d1a7d4aca51@oracle.com> <02db9000-6a78-0861-3608-2957086fca9b@oracle.com> Message-ID: <2c5fe7fa-de24-27d5-8b6d-e75ca36de0b7@oracle.com> Thanks Vladimir! Best regards, Tobias On 22.05.19 12:23, Vladimir Ivanov wrote: > >> http://cr.openjdk.java.net/~thartmann/8223581/webrev.01/ > > Looks good! > > Best regards, > Vladimir Ivanov > From shade at redhat.com Wed May 22 14:19:06 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 22 May 2019 16:19:06 +0200 Subject: RFR(trivial): JDK-8224568: minimal and zero build fails after JDK-8213084 In-Reply-To: References: Message-ID: <91ae61b2-1f1b-bcbb-ea8e-5c524ec19a8c@redhat.com> On 5/22/19 10:41 AM, Ao Qi wrote: > JBS: > https://bugs.openjdk.java.net/browse/JDK-8224568 > > Webrev: > http://cr.openjdk.java.net/~aoqi/8224568/webrev.00/ Header addition looks trivial and good. Rewiring platform-specific ifdefs raises a lot of questions (mostly to the original patch): since we are always defining to the same value, why do we even have platform-specific defines there? Also, that means that new platform would break without new platform-specific block there. I think we are better off just dropping the platform-specific defines and just unconditionally set _show_bytes=false. Lutz, please advise here. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From rkennke at redhat.com Wed May 22 14:34:27 2019 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 22 May 2019 16:34:27 +0200 Subject: RFR(S): 8224580: Matcher can cause oop field/array element to be reloaded In-Reply-To: <877eailvgp.fsf@redhat.com> References: <877eailvgp.fsf@redhat.com> Message-ID: <70fedac3-59a2-e077-4de0-af6f6604dc16@redhat.com> Hi Roland, The change looks reasonable to me. I have run tests with the patch and can confirm that the original bug went away. I've also run a bunch of other tests and workloads and looks good too. Roman > http://cr.openjdk.java.net/~roland/8224580/webrev.00/ > > With Shenandoah, we have a prototype where, rather than using an extra > header field for the forwarding pointer, we use the mark word. When the > shenandoah barrier is applied to an oop, the oop is first checked to be > in cset which is done by converting the oop to an integer and extracting > some of the bits of the address and then, if that test fails, the > forwarding pointer is loaded by masking off some bits in the mark > word. We're seeing a case where, in compiled code, the in cset test and > the mark word load for the same oop use 2 different loads, i.e. the load > of the field/array element is duplicated, the oop is reloaded between > the in cset test and loading of the forwarding pointer. This only with > compressed oops. > > I traced it down to the matcher, where the LoadN node for the > field/array element is not marked as shared and so is emitted > twice. This happens because the LoadN result is only used through a > decodeN which has multiple uses. I see the matcher has code to deal with > that situation: > > if( mop == Op_AddP && m->in(AddPNode::Base)->is_DecodeNarrowPtr()) { > // Bases used in addresses must be shared but since > // they are shared through a DecodeN they may appear > // to have a single use so force sharing here. 
>       set_shared(m->in(AddPNode::Base)->in(1));
>     }
>
> But that logic doesn't trigger in our case because the mark word load is
> at offset 0 so there's no AddP. The fix I propose is to always mark the
> input of a DecodeN as shared if the DecodeN is shared.
>
> Roland.
>

From tobias.hartmann at oracle.com Wed May 22 14:41:36 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 22 May 2019 16:41:36 +0200
Subject: [13] 8224539: C2 compilation fails during ArrayCopyNode optimizations with assert(i < _max) failed: oob: i=1, _max=1
Message-ID: <4c342e3f-6778-f6bc-9318-7ecf0c81e233@oracle.com>

Hi,

please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8224539
http://cr.openjdk.java.net/~thartmann/8224539/webrev.00/

The fix for JDK-8212243 [1] changed the implementation of the ArrayCopyNode optimizations to access
the src/dst adr nodes to get the base:
http://hg.openjdk.java.net/jdk/jdk/rev/e3d79743f57d#l10.10
http://hg.openjdk.java.net/jdk/jdk/rev/e3d79743f57d#l10.23

Now it can happen that either one is top if the array size is known and the offset is out of bounds.
For example, with incremental inlining we might not know the constant array size when the
ArrayCopyNode is created but only later, when we execute ideal transformations (see regression test).
The ArrayCopyNode will eventually be removed because the range checks fail but control is still valid
at the time when we hit the assert.

Tested with regression test and relevant tiers (running).
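
The reported assert (i=1, _max=1) is consistent with asking a node that has a single input slot for the input at index 1. A toy model of that situation, and of a guard that checks for top before looking up the base, might look like the following. ToyNode, kBaseIdx and base_or_null are hypothetical names made up for this sketch; it does not reproduce HotSpot's Node API or the contents of the webrev.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a graph node; NOT HotSpot's Node class.
struct ToyNode {
  bool is_top = false;            // models the "top" (dead/unknown) node
  std::vector<ToyNode*> inputs;   // input edges; slot 0 plays the role of control
};

// Hypothetical index of the base input on an address node (an assumption
// of this sketch, chosen to match the i=1 in the assert message).
constexpr std::size_t kBaseIdx = 1;

// Returns the base input of 'adr', or nullptr when 'adr' is top or has no
// slot at kBaseIdx. Indexing inputs[kBaseIdx] unconditionally would be out
// of bounds for a node with a single input slot.
ToyNode* base_or_null(const ToyNode* adr) {
  assert(adr != nullptr);
  if (adr->is_top || adr->inputs.size() <= kBaseIdx) {
    return nullptr;
  }
  return adr->inputs[kBaseIdx];
}
```

A caller in this model would treat a null result as "give up on the optimization for now" and leave the node to be removed once the failing range check is processed.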
Thanks, Tobias [1] https://bugs.openjdk.java.net/browse/JDK-8212243 From rwestrel at redhat.com Wed May 22 14:44:42 2019 From: rwestrel at redhat.com (Roland Westrelin) Date: Wed, 22 May 2019 16:44:42 +0200 Subject: [13] 8224539: C2 compilation fails during ArrayCopyNode optimizations with assert(i < _max) failed: oob: i=1, _max=1 In-Reply-To: <4c342e3f-6778-f6bc-9318-7ecf0c81e233@oracle.com> References: <4c342e3f-6778-f6bc-9318-7ecf0c81e233@oracle.com> Message-ID: <87sgt6k0px.fsf@redhat.com> > http://cr.openjdk.java.net/~thartmann/8224539/webrev.00/ Looks good to me. Roland. From vladimir.x.ivanov at oracle.com Wed May 22 15:08:25 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 22 May 2019 18:08:25 +0300 Subject: [13] 8224539: C2 compilation fails during ArrayCopyNode optimizations with assert(i < _max) failed: oob: i=1, _max=1 In-Reply-To: <4c342e3f-6778-f6bc-9318-7ecf0c81e233@oracle.com> References: <4c342e3f-6778-f6bc-9318-7ecf0c81e233@oracle.com> Message-ID: > http://cr.openjdk.java.net/~thartmann/8224539/webrev.00/ Looks good. Best regards, Vladimir Ivanov > The fix for JDK-8212243 [1] changed the implementation of the ArrayCopyNode optimizations to access > the src/dst adr nodes to get the base: > http://hg.openjdk.java.net/jdk/jdk/rev/e3d79743f57d#l10.10 > http://hg.openjdk.java.net/jdk/jdk/rev/e3d79743f57d#l10.23 > > Now it can happen that either one is top if the array size is known and the offset is out of bounds. > For example, with incremental inlining we might not know the constant array size once the > ArrayCopyNode is created but only once we execute ideal transformations (see regression test). The > ArrayCopyNode will eventually be removed because the range checks fail but control is still valid at > the time when we hit the assert. > > Tested with regression test and relevant tiers (running). 
> > Thanks, > Tobias > > [1] https://bugs.openjdk.java.net/browse/JDK-8212243 > From tobias.hartmann at oracle.com Wed May 22 15:09:12 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 22 May 2019 17:09:12 +0200 Subject: [13] 8224539: C2 compilation fails during ArrayCopyNode optimizations with assert(i < _max) failed: oob: i=1, _max=1 In-Reply-To: <87sgt6k0px.fsf@redhat.com> References: <4c342e3f-6778-f6bc-9318-7ecf0c81e233@oracle.com> <87sgt6k0px.fsf@redhat.com> Message-ID: Thanks Roland. Best regards, Tobias On 22.05.19 16:44, Roland Westrelin wrote: > >> http://cr.openjdk.java.net/~thartmann/8224539/webrev.00/ > > Looks good to me. > > Roland. > From tobias.hartmann at oracle.com Wed May 22 15:09:27 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 22 May 2019 17:09:27 +0200 Subject: [13] 8224539: C2 compilation fails during ArrayCopyNode optimizations with assert(i < _max) failed: oob: i=1, _max=1 In-Reply-To: References: <4c342e3f-6778-f6bc-9318-7ecf0c81e233@oracle.com> Message-ID: <6d483b93-5c64-4a5b-2708-74cd5fb2bbb9@oracle.com> Thanks Vladimir. Best regards, Tobias On 22.05.19 17:08, Vladimir Ivanov wrote: > >> http://cr.openjdk.java.net/~thartmann/8224539/webrev.00/ > > Looks good. > > Best regards, > Vladimir Ivanov > >> The fix for JDK-8212243 [1] changed the implementation of the ArrayCopyNode optimizations to access >> the src/dst adr nodes to get the base: >> http://hg.openjdk.java.net/jdk/jdk/rev/e3d79743f57d#l10.10 >> http://hg.openjdk.java.net/jdk/jdk/rev/e3d79743f57d#l10.23 >> >> Now it can happen that either one is top if the array size is known and the offset is out of bounds. >> For example, with incremental inlining we might not know the constant array size once the >> ArrayCopyNode is created but only once we execute ideal transformations (see regression test). The >> ArrayCopyNode will eventually be removed because the range checks fail but control is still valid at >> the time when we hit the assert. 
>> >> Tested with regression test and relevant tiers (running). >> >> Thanks, >> Tobias >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8212243 >> From lutz.schmidt at sap.com Wed May 22 15:37:28 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Wed, 22 May 2019 15:37:28 +0000 Subject: RFR(trivial): JDK-8224568: minimal and zero build fails after JDK-8213084 In-Reply-To: <91ae61b2-1f1b-bcbb-ea8e-5c524ec19a8c@redhat.com> References: <91ae61b2-1f1b-bcbb-ea8e-5c524ec19a8c@redhat.com> Message-ID: <8476FF23-32C6-4342-8E56-B9752AE0CAAF@sap.com> Hi all, sorry for reacting only with delay. I agree with Aleksey's idea to unconditionally set _show_bytes=false. The platform-specific initialization has become far less important - if not obsolete. It is now possible to toggle _show_bytes via -XX:PrintAssemblyOptions=show_bytes. Thanks for fixing this! Regards, Lutz ?On 22.05.19, 16:19, "Aleksey Shipilev" wrote: On 5/22/19 10:41 AM, Ao Qi wrote: > JBS: > https://bugs.openjdk.java.net/browse/JDK-8224568 > > Webrev: > http://cr.openjdk.java.net/~aoqi/8224568/webrev.00/ Header addition looks trivial and good. Rewiring platform-specific ifdefs raises a lot of questions (mostly to the original patch): since we are always defining to the same value, why do we even have platform-specific defines there? Also, that means that new platform would break without new platform-specific block there. I think we are better off just dropping the platform-specific defines and just unconditionally set _show_bytes=false. Lutz, please advise here. 
-Aleksey From aoqi at loongson.cn Wed May 22 15:50:45 2019 From: aoqi at loongson.cn (Ao Qi) Date: Wed, 22 May 2019 23:50:45 +0800 Subject: RFR(trivial): JDK-8224568: minimal and zero build fails after JDK-8213084 In-Reply-To: <8476FF23-32C6-4342-8E56-B9752AE0CAAF@sap.com> References: <91ae61b2-1f1b-bcbb-ea8e-5c524ec19a8c@redhat.com> <8476FF23-32C6-4342-8E56-B9752AE0CAAF@sap.com> Message-ID: On Wed, May 22, 2019 at 11:37 PM Schmidt, Lutz wrote: > > Hi all, > > sorry for reacting only with delay. > > I agree with Aleksey's idea to unconditionally set _show_bytes=false. Should the comment for x86/arm/aarch64 at line 49 be kept ? 47 #if defined(X86) || defined(ARM) || defined(AARCH64) 48 bool AbstractDisassembler::_show_bytes = false; // set "true" to see what's in memory bit by bit 49 // might prove cumbersome because instr_len is hard to find on x86 or arm 50 #else 51 bool AbstractDisassembler::_show_bytes = false; // set "true" to see what's in memory bit by bit 52 #endif Cheers, Ao Qi > > The platform-specific initialization has become far less important - if not obsolete. It is now possible to toggle _show_bytes via -XX:PrintAssemblyOptions=show_bytes. > > Thanks for fixing this! > > Regards, > Lutz > > > ?On 22.05.19, 16:19, "Aleksey Shipilev" wrote: > > On 5/22/19 10:41 AM, Ao Qi wrote: > > JBS: > > https://bugs.openjdk.java.net/browse/JDK-8224568 > > > > Webrev: > > http://cr.openjdk.java.net/~aoqi/8224568/webrev.00/ > > Header addition looks trivial and good. > > Rewiring platform-specific ifdefs raises a lot of questions (mostly to the original patch): since we > are always defining to the same value, why do we even have platform-specific defines there? Also, > that means that new platform would break without new platform-specific block there. I think we are > better off just dropping the platform-specific defines and just unconditionally set _show_bytes=false. > > Lutz, please advise here. 
> > -Aleksey > > > From vladimir.kozlov at oracle.com Wed May 22 15:53:13 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 22 May 2019 08:53:13 -0700 Subject: [13] 8224539: C2 compilation fails during ArrayCopyNode optimizations with assert(i < _max) failed: oob: i=1, _max=1 In-Reply-To: References: <4c342e3f-6778-f6bc-9318-7ecf0c81e233@oracle.com> Message-ID: <86DF39D5-FAEC-40E1-B9D6-6F37117BF751@oracle.com> +1 Thanks Vladimir > On May 22, 2019, at 8:08 AM, Vladimir Ivanov wrote: > > >> http://cr.openjdk.java.net/~thartmann/8224539/webrev.00/ > > Looks good. > > Best regards, > Vladimir Ivanov > >> The fix for JDK-8212243 [1] changed the implementation of the ArrayCopyNode optimizations to access >> the src/dst adr nodes to get the base: >> http://hg.openjdk.java.net/jdk/jdk/rev/e3d79743f57d#l10.10 >> http://hg.openjdk.java.net/jdk/jdk/rev/e3d79743f57d#l10.23 >> Now it can happen that either one is top if the array size is known and the offset is out of bounds. >> For example, with incremental inlining we might not know the constant array size once the >> ArrayCopyNode is created but only once we execute ideal transformations (see regression test). The >> ArrayCopyNode will eventually be removed because the range checks fail but control is still valid at >> the time when we hit the assert. >> Tested with regression test and relevant tiers (running). 
>> Thanks, >> Tobias >> [1] https://bugs.openjdk.java.net/browse/JDK-8212243 From vladimir.x.ivanov at oracle.com Wed May 22 15:58:08 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 22 May 2019 18:58:08 +0300 Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached In-Reply-To: <259a914e-1c9c-c884-6114-6f855a96afb6@loongson.cn> References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com> <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn> <259a914e-1c9c-c884-6114-6f855a96afb6@loongson.cn> Message-ID: <1060f01d-dcfa-3a04-284d-1c6a95c791fc@oracle.com> Nice catch, Jie! I'm in favor of fixing the overflow instead and keep counts always non-negative. The root cause for negative counts is uint->int conversion and values can be normalized to max_int. Moreover, on 64-bit platforms it's easy to catch (and fix) uint overflow since all the info is there - counts already are 64-bit (intptr_t). Regarding the test: test/hotspot/jtreg/compiler/profiling/TestProfileCounterOverflow.java: 28 * @requires (vm.debug == true) The test runs just fine with product binaries. So, I'd prefer to avoid limiting it only to debug. 51 } else { 52 // Sleep to wait for the compilation to be finished 53 Thread.sleep(2000); 54 } You can just turn off background compilation instead (-Xbatch). 
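To make the uint->int point above concrete, here is a minimal Java sketch (not HotSpot code) of what happens when a wide count is narrowed to a 32-bit int, and of the "normalize to max_int" idea. The names CounterClamp, truncate and saturate are made up for illustration, and the sketch assumes the incoming count is non-negative.

```java
final class CounterClamp {
    // Plain narrowing keeps only the low 32 bits, so a large count can turn negative.
    static int truncate(long count) {
        return (int) count;
    }

    // Saturating narrowing clamps to Integer.MAX_VALUE instead of wrapping.
    // Assumption: count is never negative on input.
    static int saturate(long count) {
        return (int) Math.min(count, (long) Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        long calls = Integer.MAX_VALUE + 10000L; // a count just past the int range
        System.out.println(truncate(calls) < 0);                  // wrapped to a negative value
        System.out.println(saturate(calls) == Integer.MAX_VALUE); // clamped instead
    }
}
```

With the clamped form, a consumer that only checks "count == 0" or "count > 0" keeps seeing a sensible non-negative value even after the underlying counter has exceeded the int range.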
I was able to significantly speed up the test with the following changes:

/**
 * @test
 * @run main/othervm -Xbatch -XX:-UseOnStackReplacement -XX:MaxTrivialSize=0 compiler.profiling.TestProfileCounterOverflow
 */

package compiler.profiling;

public class TestProfileCounterOverflow {
    public static void test(long iterations) throws Exception {
        for (long j = 0; j < iterations; j++) {
            call();
        }
    }

    public static void call() {}

    public static void main(String[] args) throws Exception {
        // trigger profiling on tier3
        for (int i = 0; i < 500; i++) {
            test(1);
        }

        test(Integer.MAX_VALUE + 10000L); // overflow call counter

        // trigger c2 compilation
        for (int i = 0; i < 10_000; i++) {
            test(1);
        }
        System.out.println("TEST PASSED");
    }
}

Best regards,
Vladimir Ivanov

On 22/05/2019 11:55, Jie Fu wrote:
> On 2019/5/22 11:57, Leonid Mesnik wrote:
>> Could you please add reproducer as regression test for fix. Please
>> verify that test failed without fix and pass with fix.
>
> Done.
>
> Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.03/
>
> # Testing
> On an i7-8700 at 3.20GHz machine, the test results:
> -----------------------------------------------
> - fastdebug: make test TEST="test/hotspot/jtreg/compiler/profiling/TestProfileCounterOverflow.java" CONF=fastdebug
>   Before fix: failed, elapsed time: 37s
>   After  fix: pass,   elapsed time: 43s
> - slowdebug: make test TEST="test/hotspot/jtreg/compiler/profiling/TestProfileCounterOverflow.java" CONF=slowdebug
>   Before fix: failed, elapsed time: 38s
>   After  fix: pass,   elapsed time: 43s
> -----------------------------------------------
>
> Please review it and give me some advice.
>
> Thanks.
> Best regards,
> Jie
>
>>
>> Leonid
>>
>>> On May 21, 2019, at 8:54 PM, Jie Fu wrote:
>>>
>>> On 2019/5/21 10:30, Vladimir Ivanov wrote:
>>>> It doesn't explain the failure since the assert is hit while parsing
>>>> invoke* bytecode while type check failures are recorded for
>>>> checkcast, aastore, and instanceof.
>>>> Am I missing something important
>>>> here?
>>>>
>>> Good question. I'm so sorry to make you confused.
>>>
>>> After a long time of digging into the code, I think this failure was
>>> caused by the overflow of profile.count.
>>>
>>> A reproducer is constructed here:
>>>   - http://cr.openjdk.java.net/~jiefu/8224162/CounterOverflow.java
>>>
>>> And I've changed the comment for the patch:
>>>   - http://cr.openjdk.java.net/~jiefu/8224162/webrev.02/
>>>
>>> Please review it and give me some advice.
>>>
>>> Thanks.
>>> Best regards,
>>> Jie
>>>
>>>
>

From tobias.hartmann at oracle.com Wed May 22 16:00:06 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 22 May 2019 18:00:06 +0200
Subject: [13] 8224539: C2 compilation fails during ArrayCopyNode optimizations with assert(i < _max) failed: oob: i=1, _max=1
In-Reply-To: <86DF39D5-FAEC-40E1-B9D6-6F37117BF751@oracle.com>
References: <4c342e3f-6778-f6bc-9318-7ecf0c81e233@oracle.com> <86DF39D5-FAEC-40E1-B9D6-6F37117BF751@oracle.com>
Message-ID:

Thanks Vladimir.

Best regards,
Tobias

On 22.05.19 17:53, Vladimir Kozlov wrote:
> +1
>
> Thanks
> Vladimir
>
>> On May 22, 2019, at 8:08 AM, Vladimir Ivanov wrote:
>>
>>
>>> http://cr.openjdk.java.net/~thartmann/8224539/webrev.00/
>>
>> Looks good.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>>> The fix for JDK-8212243 [1] changed the implementation of the ArrayCopyNode optimizations to access
>>> the src/dst adr nodes to get the base:
>>> http://hg.openjdk.java.net/jdk/jdk/rev/e3d79743f57d#l10.10
>>> http://hg.openjdk.java.net/jdk/jdk/rev/e3d79743f57d#l10.23
>>> Now it can happen that either one is top if the array size is known and the offset is out of bounds.
>>> For example, with incremental inlining we might not know the constant array size once the
>>> ArrayCopyNode is created but only once we execute ideal transformations (see regression test).
The >>> ArrayCopyNode will eventually be removed because the range checks fail but control is still valid at >>> the time when we hit the assert. >>> Tested with regression test and relevant tiers (running). >>> Thanks, >>> Tobias >>> [1] https://bugs.openjdk.java.net/browse/JDK-8212243 > From lutz.schmidt at sap.com Wed May 22 16:16:40 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Wed, 22 May 2019 16:16:40 +0000 Subject: RFR(trivial): JDK-8224568: minimal and zero build fails after JDK-8213084 In-Reply-To: References: <91ae61b2-1f1b-bcbb-ea8e-5c524ec19a8c@redhat.com> <8476FF23-32C6-4342-8E56-B9752AE0CAAF@sap.com> Message-ID: <654DDDC6-73A0-487F-A57D-17AE450B577C@sap.com> Hi, maybe rephrase the line to // might prove cumbersome on platforms where instr_len is hard to find out Thanks, Lutz ?On 22.05.19, 17:50, "Ao Qi" wrote: On Wed, May 22, 2019 at 11:37 PM Schmidt, Lutz wrote: > > Hi all, > > sorry for reacting only with delay. > > I agree with Aleksey's idea to unconditionally set _show_bytes=false. Should the comment for x86/arm/aarch64 at line 49 be kept ? 47 #if defined(X86) || defined(ARM) || defined(AARCH64) 48 bool AbstractDisassembler::_show_bytes = false; // set "true" to see what's in memory bit by bit 49 // might prove cumbersome because instr_len is hard to find on x86 or arm 50 #else 51 bool AbstractDisassembler::_show_bytes = false; // set "true" to see what's in memory bit by bit 52 #endif Cheers, Ao Qi > > The platform-specific initialization has become far less important - if not obsolete. It is now possible to toggle _show_bytes via -XX:PrintAssemblyOptions=show_bytes. > > Thanks for fixing this! > > Regards, > Lutz > > > On 22.05.19, 16:19, "Aleksey Shipilev" wrote: > > On 5/22/19 10:41 AM, Ao Qi wrote: > > JBS: > > https://bugs.openjdk.java.net/browse/JDK-8224568 > > > > Webrev: > > http://cr.openjdk.java.net/~aoqi/8224568/webrev.00/ > > Header addition looks trivial and good. 
> > Rewiring platform-specific ifdefs raises a lot of questions (mostly to the original patch): since we > are always defining to the same value, why do we even have platform-specific defines there? Also, > that means that new platform would break without new platform-specific block there. I think we are > better off just dropping the platform-specific defines and just unconditionally set _show_bytes=false. > > Lutz, please advise here. > > -Aleksey > > > From aoqi at loongson.cn Wed May 22 16:26:34 2019 From: aoqi at loongson.cn (Ao Qi) Date: Thu, 23 May 2019 00:26:34 +0800 Subject: RFR(trivial): JDK-8224568: minimal and zero build fails after JDK-8213084 In-Reply-To: <654DDDC6-73A0-487F-A57D-17AE450B577C@sap.com> References: <91ae61b2-1f1b-bcbb-ea8e-5c524ec19a8c@redhat.com> <8476FF23-32C6-4342-8E56-B9752AE0CAAF@sap.com> <654DDDC6-73A0-487F-A57D-17AE450B577C@sap.com> Message-ID: Hi, Updated: http://cr.openjdk.java.net/~aoqi/8224568/webrev.01/ What do you think? Thanks, Ao Qi On Thu, May 23, 2019 at 12:16 AM Schmidt, Lutz wrote: > > Hi, > > maybe rephrase the line to > // might prove cumbersome on platforms where instr_len is hard to find out > > Thanks, > Lutz > > ?On 22.05.19, 17:50, "Ao Qi" wrote: > > On Wed, May 22, 2019 at 11:37 PM Schmidt, Lutz wrote: > > > > Hi all, > > > > sorry for reacting only with delay. > > > > I agree with Aleksey's idea to unconditionally set _show_bytes=false. > > Should the comment for x86/arm/aarch64 at line 49 be kept ? > > 47 #if defined(X86) || defined(ARM) || defined(AARCH64) > 48 bool AbstractDisassembler::_show_bytes = false; // set "true" to > see what's in memory bit by bit > 49 // might prove > cumbersome because instr_len is hard to find on x86 or arm > 50 #else > 51 bool AbstractDisassembler::_show_bytes = false; // set "true" to > see what's in memory bit by bit > 52 #endif > > Cheers, > Ao Qi > > > > > The platform-specific initialization has become far less important - if not obsolete. 
It is now possible to toggle _show_bytes via -XX:PrintAssemblyOptions=show_bytes. > > > > Thanks for fixing this! > > > > Regards, > > Lutz > > > > > > On 22.05.19, 16:19, "Aleksey Shipilev" wrote: > > > > On 5/22/19 10:41 AM, Ao Qi wrote: > > > JBS: > > > https://bugs.openjdk.java.net/browse/JDK-8224568 > > > > > > Webrev: > > > http://cr.openjdk.java.net/~aoqi/8224568/webrev.00/ > > > > Header addition looks trivial and good. > > > > Rewiring platform-specific ifdefs raises a lot of questions (mostly to the original patch): since we > > are always defining to the same value, why do we even have platform-specific defines there? Also, > > that means that new platform would break without new platform-specific block there. I think we are > > better off just dropping the platform-specific defines and just unconditionally set _show_bytes=false. > > > > Lutz, please advise here. > > > > -Aleksey > > > > > > > > From shade at redhat.com Wed May 22 16:29:33 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 22 May 2019 18:29:33 +0200 Subject: RFR(trivial): JDK-8224568: minimal and zero build fails after JDK-8213084 In-Reply-To: References: <91ae61b2-1f1b-bcbb-ea8e-5c524ec19a8c@redhat.com> <8476FF23-32C6-4342-8E56-B9752AE0CAAF@sap.com> <654DDDC6-73A0-487F-A57D-17AE450B577C@sap.com> Message-ID: <67345916-68e8-de04-36b5-ff35e88fd6e3@redhat.com> On 5/22/19 6:26 PM, Ao Qi wrote: > Updated: http://cr.openjdk.java.net/~aoqi/8224568/webrev.01/ Looks fine and trivial to me. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From aoqi at loongson.cn Wed May 22 16:49:19 2019 From: aoqi at loongson.cn (Ao Qi) Date: Thu, 23 May 2019 00:49:19 +0800 Subject: RFR(trivial): JDK-8224568: minimal and zero build fails after JDK-8213084 In-Reply-To: <67345916-68e8-de04-36b5-ff35e88fd6e3@redhat.com> References: <91ae61b2-1f1b-bcbb-ea8e-5c524ec19a8c@redhat.com> <8476FF23-32C6-4342-8E56-B9752AE0CAAF@sap.com> <654DDDC6-73A0-487F-A57D-17AE450B577C@sap.com> <67345916-68e8-de04-36b5-ff35e88fd6e3@redhat.com> Message-ID: On Thu, May 23, 2019 at 12:29 AM Aleksey Shipilev wrote: > > On 5/22/19 6:26 PM, Ao Qi wrote: > > Updated: http://cr.openjdk.java.net/~aoqi/8224568/webrev.01/ > > Looks fine and trivial to me. Thanks Aleksey and Lutz! I need a sponsor. Could you help? Thanks, Ao Qi > > -Aleksey > From shade at redhat.com Wed May 22 16:51:49 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 22 May 2019 18:51:49 +0200 Subject: RFR(trivial): JDK-8224568: minimal and zero build fails after JDK-8213084 In-Reply-To: References: <91ae61b2-1f1b-bcbb-ea8e-5c524ec19a8c@redhat.com> <8476FF23-32C6-4342-8E56-B9752AE0CAAF@sap.com> <654DDDC6-73A0-487F-A57D-17AE450B577C@sap.com> <67345916-68e8-de04-36b5-ff35e88fd6e3@redhat.com> Message-ID: <708ff693-ba03-2a3a-279e-41a1b90d3444@redhat.com> On 5/22/19 6:49 PM, Ao Qi wrote: > On Thu, May 23, 2019 at 12:29 AM Aleksey Shipilev wrote: >> >> On 5/22/19 6:26 PM, Ao Qi wrote: >>> Updated: http://cr.openjdk.java.net/~aoqi/8224568/webrev.01/ >> >> Looks fine and trivial to me. > > Thanks Aleksey and Lutz! I need a sponsor. Could you help? Yes, I'll sponsor. I am doing quick jdk-submit run to catch surprises. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From vladimir.x.ivanov at oracle.com Wed May 22 16:58:21 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 22 May 2019 19:58:21 +0300 Subject: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64 In-Reply-To: References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com> <3e1ceae0-f7a9-e2e6-2b06-59a22540550d@oracle.com> <3d9c0897-0275-c341-fe33-5f0b6c94f253@oracle.com> Message-ID: <42a8fc79-9497-b2eb-8dd9-a56e4ed85255@oracle.com> >>> Forgot to mention that your new test doesn't look like it will play >>> nicely when run with Graal enabled, so you may need to split up into >>> different @test sections and add "@requires !vm.graal.enabled" to >>> exclude graal. >> >> What kind of problem when running with Graal do you have in mind? >> >> I double-checked that the test passes with Graal enabled. > > Your various @run lines are trying to execute in different compilation > modes: -Xint, C1 only, C2 only, plus various permutations. If you take > those command-lines and then add all the Graal flags to it then you are > no longer testing any of the things you wanted to test. At best you test > Graal 7 times - which is pointless. At worst you may find some > variations will timeout when Graal is applied. The test explicitly checks class initialization invariants and specifies execution modes to stress different scenarios. It doesn't explicitly specify C2: it just runs with -XX:-TieredCompilation and Graal is used if it is set as top-tier compiler. For other modes (interpreter & C1), whether C2 or Graal are used is irrelevant. But I take your point: -XX:-TieredCompilation is much slower with Graal and doesn't test the logic as intended (no relevant compiled code generated). So I'll add "@requires !vm.graal.enabled" as you suggest. (Will update the webrev in-place.) > For that matter an Xcomp run will also negate your intentions. 
With -Xcomp it works as intended except pre-warmup phase, but there's a dedicated -Xint run mode which covers that. So I don't think it's necessary to exclude the test when -Xcomp is specified. Best regards, Vladimir Ivanov >>>> I'll be very happy to see this go in - though I do wish we had more >>>> platform coverage than just x86_64. Hopefully the other archs will >>>> jump on-board with this as well. >> >> Yes, fully agree with you. It should be pretty straightforward for >> maintainers to mirror x86-specific changes for their architectures. >> >>>> I was initially confused by the UseFastClassInitChecks flag as I >>>> couldn't really see why you would want to turn it off (other than >>>> perhaps during testing) but I see that it is really used (as you >>>> explained to Vladimir K.) to exclude the new code for platforms >>>> which have not implemented it. Though I'm still not sure that we >>>> shouldn't have something to detect it being turned on at runtime on >>>> platforms that don't support it (it will likely crash quickly but >>>> still ...). Keep wondering if there is a better way to handle this >>>> aspect of the change ... >> >> I deliberately made the flag develop, so it's possible to change it >> from command-line only in debug builds. I could introduce additional >> platform-specific validation, but it doesn't look worth the effort for >> such narrow case (and there are other develop flags which guard broken >> functionality). >> >>>> I can't comment on the actual interpreter and compiler changes - sorry. >> >> No problem, I'll wait for more reviews from Runtime team. >> >>>> This will need re-basing now that JDK-8219974 has been backed out. >> >> Done. >> >> Best regards, >> Vladimir Ivanov >> >>>> On 2/05/2019 9:17 am, Vladimir Ivanov wrote: >>>>> http://cr.openjdk.java.net/~vlivanov/8223213/webrev.00/ >>>>> https://bugs.openjdk.java.net/browse/JDK-8223213 >>>>> >>>>> (It's a followup RFR on a earlier RFC [1].) 
>>>>>
>>>>> Recent changes severely affected how static initializers are
>>>>> executed and for long-running initializers it manifested as a
>>>>> severe slowdown.
>>>>> As an example, it led to a 3x slowdown on some Clojure applications
>>>>> (JDK-8219233 [2]). The root cause is that until a class is fully
>>>>> initialized, every invocation of a static method on it goes through
>>>>> method resolution.
>>>>>
>>>>> Proposed fix introduces fast class initialization barriers for C1,
>>>>> C2, and template interpreter on x86-64. I did some experiments with
>>>>> cross-platform approaches, but haven't got satisfactory results.
>>>>>
>>>>> On other platforms, behavior stays (mostly) intact. (I had to
>>>>> revert some changes introduced by JDK-8219492 [3], since the
>>>>> assumptions they rely on about accesses inside a class don't hold
>>>>> in all cases.)
>>>>>
>>>>> The barrier is as simple as:
>>>>>     if (holder->is_not_initialized() &&
>>>>>         !holder->is_reentrant_initialization(current_thread)) {
>>>>>       // trigger call site re-resolution and block there
>>>>>     }
>>>>>
>>>>> There are 3 places where barriers are added:
>>>>>    * in template interpreter for invokestatic bytecode;
>>>>>    * at nmethod verified entry point (for normal compilations);
>>>>>    * c2i adapters;
>>>>>
>>>>> For template interpreter, there's an additional check added into
>>>>> TemplateTable::resolve_cache_and_index which calls into
>>>>> InterpreterRuntime::resolve_from_cache when fast path checks fail.
>>>>>
>>>>> In case of nmethods, the barrier is put before frame construction,
>>>>> so existing compiler runtime routines can be reused
>>>>> (SharedRuntime::get_handle_wrong_method_stub()).
>>>>>
>>>>> Also, C2 has a guard on entry (Parse::clinit_deopt()) which
>>>>> triggers nmethod recompilation once the class is fully initialized.
>>>>>
>>>>> OSR compilations don't need a barrier.
>>>>>
>>>>> Correspondence between barriers and transitions they cover:
>>>>>    (1) from interpreter (barrier on caller side)
>>>>>         * all transitions: interpreter, compiled (i2c), native,
>>>>> aot, ...
>>>>>
>>>>>    (2) from compiled (barrier on callee side)
>>>>>         to compiled, to native (barrier in native wrapper on entry)
>>>>>
>>>>>    (3) c2i bypasses both barriers (interpreter and compiled) and
>>>>> requires a dedicated barrier in c2i
>>>>>
>>>>>    (4) to Graal/AOT code:
>>>>>          from interpreter: covered by interpreter barrier
>>>>>          from compiled: call site patching is disabled, leading to
>>>>> repeated call site resolution until method holder is fully
>>>>> initialized (original behavior).
>>>>>
>>>>> Performance experiments with clojure [2] demonstrated that the fix
>>>>> almost completely recuperates the regression:
>>>>>
>>>>>    (1) always reresolve (w/o the fix):    ~12,0s ( 1x)
>>>>>    (2) C1/C2 barriers only:                ~3,8s (~3x)
>>>>>    (3) int/C1/C2 barriers:                 ~3,2s (-20%)
>>>>> --------
>>>>>    (4) barriers disabled for invokestatic  ~3,2s
>>>>>
>>>>> I deliberately tried to keep the patch backport-friendly for
>>>>> 8u/11u/12u and refrained from using newer features like nmethod
>>>>> barriers introduced recently. The fix can be refactored later
>>>>> specifically for 13 as a followup change.
>>>>>
>>>>> Testing: clojure startup, tier1-5
>>>>>
>>>>> Thanks!
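The barrier condition quoted above can be modeled in plain Java. This is only a sketch of the logic, not the HotSpot implementation: ClinitBarrierSketch, Holder, State and needsBarrier are hypothetical names, and the sketch assumes that "not initialized" in the pseudo-code means "not yet fully initialized", which is how the surrounding description reads.

```java
final class ClinitBarrierSketch {
    enum State { LOADED, BEING_INITIALIZED, FULLY_INITIALIZED }

    // Hypothetical stand-in for the method holder's initialization state.
    static final class Holder {
        State state = State.LOADED;
        Thread initThread; // thread currently running the static initializer, if any

        boolean isNotInitialized() {
            return state != State.FULLY_INITIALIZED;
        }

        boolean isReentrantInitialization(Thread current) {
            return state == State.BEING_INITIALIZED && initThread == current;
        }
    }

    // True when the caller has to leave the fast path (re-resolve the call site and block).
    static boolean needsBarrier(Holder holder, Thread current) {
        return holder.isNotInitialized() && !holder.isReentrantInitialization(current);
    }

    public static void main(String[] args) {
        Holder h = new Holder();
        Thread self = Thread.currentThread();
        System.out.println(needsBarrier(h, self)); // true: class not initialized yet

        h.state = State.BEING_INITIALIZED;
        h.initThread = self;
        System.out.println(needsBarrier(h, self));                 // false: the initializing thread passes
        System.out.println(needsBarrier(h, new Thread(() -> {}))); // true: any other thread is stopped

        h.state = State.FULLY_INITIALIZED;
        h.initThread = null;
        System.out.println(needsBarrier(h, self)); // false: fast path from now on
    }
}
```

The reentrancy test is what lets a static initializer call static methods of its own class without blocking on itself, while other threads stay on the slow path until initialization completes.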
>>>>> >>>>> Best regards, >>>>> Vladimir Ivanov >>>>> >>>>> [1] >>>>> https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-April/037760.html >>>>> >>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8219233 >>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8219492 From aoqi at loongson.cn Wed May 22 17:16:39 2019 From: aoqi at loongson.cn (Ao Qi) Date: Thu, 23 May 2019 01:16:39 +0800 Subject: RFR(trivial): JDK-8224568: minimal and zero build fails after JDK-8213084 In-Reply-To: <708ff693-ba03-2a3a-279e-41a1b90d3444@redhat.com> References: <91ae61b2-1f1b-bcbb-ea8e-5c524ec19a8c@redhat.com> <8476FF23-32C6-4342-8E56-B9752AE0CAAF@sap.com> <654DDDC6-73A0-487F-A57D-17AE450B577C@sap.com> <67345916-68e8-de04-36b5-ff35e88fd6e3@redhat.com> <708ff693-ba03-2a3a-279e-41a1b90d3444@redhat.com> Message-ID: On Thu, May 23, 2019 at 12:51 AM Aleksey Shipilev wrote: > > On 5/22/19 6:49 PM, Ao Qi wrote: > > On Thu, May 23, 2019 at 12:29 AM Aleksey Shipilev wrote: > >> > >> On 5/22/19 6:26 PM, Ao Qi wrote: > >>> Updated: http://cr.openjdk.java.net/~aoqi/8224568/webrev.01/ > >> > >> Looks fine and trivial to me. > > > > Thanks Aleksey and Lutz! I need a sponsor. Could you help? > > Yes, I'll sponsor. I am doing quick jdk-submit run to catch surprises. Thanks. I am also testing. Seems fine so far. > > -Aleksey > From lutz.schmidt at sap.com Wed May 22 17:20:39 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Wed, 22 May 2019 17:20:39 +0000 Subject: RFR(trivial): JDK-8224568: minimal and zero build fails after JDK-8213084 In-Reply-To: References: <91ae61b2-1f1b-bcbb-ea8e-5c524ec19a8c@redhat.com> <8476FF23-32C6-4342-8E56-B9752AE0CAAF@sap.com> <654DDDC6-73A0-487F-A57D-17AE450B577C@sap.com> <67345916-68e8-de04-36b5-ff35e88fd6e3@redhat.com> <708ff693-ba03-2a3a-279e-41a1b90d3444@redhat.com> Message-ID: Hi, the change looks good. I'm not a reviewer, though. 
Regards,
Lutz

On 22.05.19, 19:16, "Ao Qi" wrote:

On Thu, May 23, 2019 at 12:51 AM Aleksey Shipilev wrote:
>
> On 5/22/19 6:49 PM, Ao Qi wrote:
> > On Thu, May 23, 2019 at 12:29 AM Aleksey Shipilev wrote:
> >>
> >> On 5/22/19 6:26 PM, Ao Qi wrote:
> >>> Updated: http://cr.openjdk.java.net/~aoqi/8224568/webrev.01/
> >>
> >> Looks fine and trivial to me.
> >
> > Thanks Aleksey and Lutz! I need a sponsor. Could you help?
>
> Yes, I'll sponsor. I am doing quick jdk-submit run to catch surprises.

Thanks. I am also testing. Seems fine so far.

>
> -Aleksey
>

From vladimir.x.ivanov at oracle.com Wed May 22 17:47:01 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 22 May 2019 20:47:01 +0300
Subject: RFR: 8223504: improve performance of forall loops by better inlining of "iterator()" methods.
In-Reply-To: <294c512c-a613-7679-0b90-61e2fe015d3c@oracle.com>
References: <58486996-d7da-30ab-77c2-b590395423c2@oracle.com> <92d61151-97ac-565a-1bfe-d25dd5ea1048@redhat.com> <359da83e-883b-de8a-0525-21ce4c797249@oracle.com> <294c512c-a613-7679-0b90-61e2fe015d3c@oracle.com>
Message-ID: <51756d46-c2ad-5377-12ce-67208deb12a5@oracle.com>

Thanks for the thorough explanation, Sergey.

What I like about the patch you propose is its simplicity. A more accurate heuristic which takes inlining effects into account (e.g., on EA) would scale to a much wider range of use cases and C2 has a rich toolkit to make such a heuristic possible, but it would definitely require more effort.

Thinking about it for a while, I agree with your proposal: the heuristic is acceptable for the case of iterators, the benefits are evident, and risks (with overinlining and premature inlining) are low.

I still hope there'll be a more generic solution available at some point which will supersede such a special case for Iterable.

Regarding the check itself, I'm in favor of limiting it to Iterable::iterator() overrides/overloads, but I'm OK with a more generic check on return type.
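For readers less familiar with the pattern under discussion in this thread, here is a small Java sketch of it. An enhanced for loop over an Iterable calls iterator() once before the loop and then hasNext()/next() on every iteration, which is why the iterator() call site can look cold compared to the loop body. The class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class ForEachShape {
    // Enhanced for loop as written in source.
    static int sumForEach(Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Roughly the equivalent explicit form: one iterator() call, then the hot loop.
    static int sumExplicit(Iterable<Integer> values) {
        int sum = 0;
        Iterator<Integer> it = values.iterator(); // runs once per loop entry
        while (it.hasNext()) {                    // evaluated on every iteration
            sum += it.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4);
        System.out.println(sumForEach(data) == sumExplicit(data)); // both forms compute the same sum
    }
}
```

If iterator() is not inlined, the Iterator object it returns is opaque to the compiler at the loop, which is the situation the proposed heuristic tries to avoid.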
Best regards,
Vladimir Ivanov

On 10/05/2019 21:47, Sergey Kuksenko wrote:
> Let me do a broader description.
>
> When hotspot makes a decision on the "ultimate question of compilation,
> optimization and everything" (inline or not inline), there are two key
> parts of that decision: the check of sizes (callee and caller) and the
> check of frequencies (invocation count). The frequency check is reasonable:
> why should we inline a rarely invoked method? But sometimes we lose
> optimization opportunities with that.
>
> Let's narrow the scenario. We have a loop and a method invocation before
> the loop. Inlining the method is vital for the loop performance. I
> see at least two key optimizations here: constant propagation and scalar
> replacement, maybe more. But if the loop has a large enough number of
> iterations -> hotspot has large enough backedge counters -> it means
> that the prolog is considered relatively cold code (small invocation
> counter) -> that method (potentially vital for performance)
> is not inlined (due to the frequency/MinInliningThreshold cut-off).
>
> We can't say if inlining is important until we look into the loop (even
> if there is a loop there). But we have to make a decision about inlining
> before that. So let's try to make a reasonable heuristic and narrow the
> scenario again. Limit our sight to Iterators. There is a very high
> probability that after an Iterable::iterator() invocation there is a loop
> (this covers all for-all loops). Also there is a high correlation between
> collection size and the number of loop iterations. Let's inline all
> iterators. I don't think the idea to analyze if "returned Iterator is a
> freshly-allocated instance" makes sense. First of all, it's an unnecessary
> complication. Moreover, I have results where we have a chain of iterators:
> hotspot can't inline the whole chain due to absence of profile (and/or
> profile pollution), but partial inlining of the chain has shown
> performance benefits.
> To get a more effective prediction of whether that
> particular inline is important we should look not into the method, but
> at the usage of the method's result (into the loop).
>
> About the first comment (too broad or too narrow a check): I have to note
> that this fix doesn't force inlining for all methods with the "iterator" name.
> The fix only excludes the frequency cut-off. All other checks (by sizes) are
> still in place. I did the broader check for two reasons: to simplify
> modifications and to have wider applicability when it works. I could narrow
> it if you insist, but at the same time I think we have to make that
> check broader - don't look into the method name at all. If you have
> something that returns an Iterator - there will be a loop after that with a
> very high probability. So I'd vote for making that wider - check only
> the return type.
>
> On 5/8/19 3:10 PM, Vladimir Ivanov wrote:
>>> http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.01/
>> returned Iterator is a freshly-allocated instance
>> src/hotspot/share/opto/bytecodeInfo.cpp:
>>
>> +  if (callee_method->name() == ciSymbol::iterator_name()) {
>> +    if (callee_method->signature()->return_type()->is_subtype_of(C->env()->Iterator_klass())) {
>> +      return true;
>> +    }
>> +  }
>>
>> The check looks too broad for me: it returns true for any method with
>> a name "iterator" which returns an instance of Iterator which is much
>> broader than just overrides/overloads of Iterable::iterator().
>>
>> Can you elaborate, please, why did you decide to extend the check for
>> non-Iterables?
>>
>> Commenting on the general approach, it looks like a good candidate for
>> a first-line filter before performing a more extensive analysis. I'd
>> prefer to see BCEscapeAnalyzer extended to determine that returned
>> Iterator is a freshly-allocated instance and decide whether to inline
>> or not based on that instead.
Among java.util classes you mentioned >> most iterators are trivial, so even naive analysis should get decent >> results. >> >> And then the analysis can be applied to any method which returns an >> Object to see whether EA may benefit from inlining. >> >> What do you think? >> >> Best regards, >> Vladimir Ivanov >> >>> On 5/7/19 11:56 AM, Aleksey Shipilev wrote: >>>> On 5/7/19 8:39 PM, Sergey Kuksenko wrote: >>>>> Hi All, >>>>> >>>>> I would like to ask for review the following change/update: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8223504 >>>>> >>>>> http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.00/ >>>> The idea sounds fine. >>>> >>>> Nits (the usual drill): >>>> >>>> ? *) Copyright years need to be updated, at least in bytecodeInfo.cpp >>>> >>>> ? *) Do we need to put Iterator_klass initialization this early in >>>> WK_KLASSES_DO? It feels safer to >>>> initialize it at the end, to avoid surprising bootstrap issues. >>>> >>>> ? *) Backslash indent is off here in vmSymbols.hpp: >>>> >>>> ? 129?? template(java_util_Iterator, >>>> "java/util/Iterator")?????????????? \ >>>> >>>> ? *) Space after "if"? Also, I think you can use >>>> ciType::is_subtype_of instead here. Plus, since you >>>> declared iterator in WK klasses, SystemDictionary::Iterator_klass() >>>> should be available. >>>> >>>> ? 100???? if(retType->is_klass() && >>>> retType->as_klass()->is_subtype_of(C->env()->Iterator_klass())) { >>>> From vivek.r.deshpande at intel.com Wed May 22 17:48:57 2019 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Wed, 22 May 2019 17:48:57 +0000 Subject: RFR(XS) 8224558: x86 Fix replicateB encoding In-Reply-To: <140eaec4-a0fb-cfbf-cbae-d9b5661df758@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EBAE0@ORSMSX106.amr.corp.intel.com> <140eaec4-a0fb-cfbf-cbae-d9b5661df758@oracle.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4ECDD8@ORSMSX106.amr.corp.intel.com> Hi Tobias Thanks for looking at it. 
Yes I will fix the format strings.
I came across this issue with vector API tests.
I tried to write a reproducer using the autovectorizer test, but it always uses the register based rule (instruct Repl32B) instead of memory based rule (instruct Repl32B_mem) and register based rule gives correct result.
This is why it never showed up.
Could you please help me with forcing to use memory based rule with autovectorizer based test.

Regards,
Vivek

-----Original Message-----
From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com]
Sent: Wednesday, May 22, 2019 1:13 AM
To: Deshpande, Vivek R; 'hotspot-compiler-dev at openjdk.java.net compiler'
Subject: Re: RFR(XS) 8224558: x86 Fix replicateB encoding

Hi Vivek,

I wonder why this never showed up, could you please add a regression test?

You also need to fix the format strings.

Thanks,
Tobias

On 22.05.19 02:12, Deshpande, Vivek R wrote:
> Hi All
>
> The encoding for replicateB in x86.ad uses dst register as one of the
> source without initializing, when the source for the scalar value is memory.
>
> This leads to wrong replication in the resulting vector.
>
> I have a fix for the bug in this webrev:
>
> http://cr.openjdk.java.net/~vdeshpande/8224558/webrev.00/
>
> I have created following JBS Entry:
>
> https://bugs.openjdk.java.net/browse/JDK-8224558
>
> Kindly requesting review for the patch.
>
> Regards,
>
> Vivek
>

From robbin.ehn at oracle.com Wed May 22 18:03:49 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Wed, 22 May 2019 20:03:49 +0200
Subject: RFR(m): 8221734: Deoptimize with handshakes
In-Reply-To: <449adf93-cf99-8724-3d6b-905db2d69b61@oracle.com>
References: <89b00912-1f84-3458-d53b-fbe6d372affe@oracle.com> <9c67a8fc-bd23-eed9-a484-c4f03daec18b@oracle.com> <449adf93-cf99-8724-3d6b-905db2d69b61@oracle.com>
Message-ID:

Hi Coleen,

On 2019-05-22 12:09, coleen.phillimore at oracle.com wrote:
>
> Hi Robbin, I've also reviewed the codeCache parts for redefinition and it looks
> good.
Thanks, great!

/Robbin

> Thanks,
> Coleen
>
> On 5/22/19 2:31 AM, Robbin Ehn wrote:
>> Thanks Dean!
>>
>> /Robbin
>>
>> On 2019-05-21 23:35, dean.long at oracle.com wrote:
>>> On 5/21/19 3:27 AM, Robbin Ehn wrote:
>>>> Dean do you have any more comments?
>>>
>>> No, you already addressed my concerns. Thanks.
>>>
>>> dl
>

From vladimir.x.ivanov at oracle.com Wed May 22 18:04:16 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 22 May 2019 21:04:16 +0300
Subject: RFR(XS) 8224558: x86 Fix replicateB encoding
In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EBAE0@ORSMSX106.amr.corp.intel.com>
References: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EBAE0@ORSMSX106.amr.corp.intel.com>
Message-ID: <6220b641-2f53-10cc-c4e4-9b9989f0a133@oracle.com>

> http://cr.openjdk.java.net/~vdeshpande/8224558/webrev.00/

What's the benefit of keeping those memory variants now and not simply rely on a pair of load + replicate instructions to match (ReplicateB (LoadB mem)) (e.g., Repl4B_mem vs loadB+Repl4B)?

Best regards,
Vladimir Ivanov

From vivek.r.deshpande at intel.com Wed May 22 18:19:04 2019
From: vivek.r.deshpande at intel.com (Deshpande, Vivek R)
Date: Wed, 22 May 2019 18:19:04 +0000
Subject: RFR(XS) 8224558: x86 Fix replicateB encoding
In-Reply-To: <6220b641-2f53-10cc-c4e4-9b9989f0a133@oracle.com>
References: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EBAE0@ORSMSX106.amr.corp.intel.com> <6220b641-2f53-10cc-c4e4-9b9989f0a133@oracle.com>
Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4ECE3E@ORSMSX106.amr.corp.intel.com>

Hi Vladimir

They both (Repl4B_mem vs loadB+Repl4B) would be the same in this case. Maybe we can remove the memory variants as the current memory variants are not correct.
I can prepare a patch according to that.
Regards,
Vivek

-----Original Message-----
From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
Sent: Wednesday, May 22, 2019 11:04 AM
To: Deshpande, Vivek R; 'hotspot-compiler-dev at openjdk.java.net compiler'
Subject: Re: RFR(XS) 8224558: x86 Fix replicateB encoding

> http://cr.openjdk.java.net/~vdeshpande/8224558/webrev.00/

What's the benefit of keeping those memory variants now and not simply rely on a pair of load + replicate instructions to match (ReplicateB (LoadB mem)) (e.g., Repl4B_mem vs loadB+Repl4B)?

Best regards,
Vladimir Ivanov

From vladimir.x.ivanov at oracle.com Wed May 22 18:37:09 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 22 May 2019 21:37:09 +0300
Subject: RFR(XS) 8224558: x86 Fix replicateB encoding
In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4ECE3E@ORSMSX106.amr.corp.intel.com>
References: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EBAE0@ORSMSX106.amr.corp.intel.com> <6220b641-2f53-10cc-c4e4-9b9989f0a133@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4ECE3E@ORSMSX106.amr.corp.intel.com>
Message-ID:

> On 22 May 2019, at 21:19, Deshpande, Vivek R wrote:
>
> Hi Vladimir
>
> They both (Repl4B_mem vs loadB+Repl4B) would be the same in this case. Maybe we can remove the memory variants as the current memory variants are not correct.
> I can prepare a patch according to that.

Yes, I prefer redundant instructions to go away.

The only difference I noticed is how byte is loaded (movzbl in fixed Repl4B_mem vs movsbl in loadB), but I assume it doesn't matter, right?
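
As a rough illustration of the movzbl vs movsbl question above (a sketch only, not taken from x86.ad or the webrev): if the replicate step broadcasts just the lowest byte of the loaded scalar, then a zero-extending and a sign-extending load give the same vector, because the two loads differ only in the upper 24 bits. The helper names below (load_zero_extended, load_sign_extended, replicate4) are hypothetical, and whether the real Repl4B rule reads only the low byte depends on its actual instruction sequence, which is assumed here rather than checked.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model of a zero-extending byte load (in the manner of movzbl).
std::uint32_t load_zero_extended(std::int8_t b) {
  return static_cast<std::uint32_t>(static_cast<std::uint8_t>(b));
}

// Hypothetical model of a sign-extending byte load (in the manner of movsbl).
std::uint32_t load_sign_extended(std::int8_t b) {
  return static_cast<std::uint32_t>(static_cast<std::int32_t>(b));
}

// Assumed replicate step: broadcast the lowest byte of the scalar into 4 lanes.
std::array<std::uint8_t, 4> replicate4(std::uint32_t scalar) {
  std::array<std::uint8_t, 4> lanes;
  lanes.fill(static_cast<std::uint8_t>(scalar & 0xFFu));
  return lanes;
}

// Small self-check: the scalars differ for a negative byte, the vectors do not.
void demo_replicate() {
  const std::int8_t b = -1;
  assert(load_zero_extended(b) != load_sign_extended(b));
  assert(replicate4(load_zero_extended(b)) == replicate4(load_sign_extended(b)));
}
```

Under that assumption the choice of load would only matter if some later instruction consumed the upper bits of the scalar register.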
Best regards,
Vladimir Ivanov

>
> Regards,
> Vivek
>
> -----Original Message-----
> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
> Sent: Wednesday, May 22, 2019 11:04 AM
> To: Deshpande, Vivek R; 'hotspot-compiler-dev at openjdk.java.net compiler'
> Subject: Re: RFR(XS) 8224558: x86 Fix replicateB encoding
>
>
>> http://cr.openjdk.java.net/~vdeshpande/8224558/webrev.00/
>
> What's the benefit of keeping those memory variants now and not simply rely on a pair of load + replicate instructions to match (ReplicateB (LoadB mem)) (e.g., Repl4B_mem vs loadB+Repl4B)?
>
> Best regards,
> Vladimir Ivanov

From OGATAK at jp.ibm.com Wed May 22 18:43:50 2019
From: OGATAK at jp.ibm.com (Kazunori Ogata)
Date: Thu, 23 May 2019 03:43:50 +0900
Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8158232
In-Reply-To: <81c12391-1406-948d-e8c5-9f66437a1b92@linux.vnet.ibm.com>
References: <81c12391-1406-948d-e8c5-9f66437a1b92@linux.vnet.ibm.com>
Message-ID:

Hi Gustavo,

Thank you for sponsoring the patch.

Regards,
Ogata

"Gustavo Romero" wrote on 2019/05/21 05:46:58:

> From: "Gustavo Romero"
> To: Kazunori Ogata/Japan/IBM at IBMJP
> Cc: hotspot-compiler-dev at openjdk.java.net, jdk8u-dev at openjdk.java.net
> Date: 2019/05/21 05:47
> Subject: Re: [8u-dev, ppc] RFR for (almost clean) backport of 8158232
>
> Hi,
>
> Pushed to jdk8u-dev:
>
> http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/rev/39678a65a0e8
>
> Thank you.
>
> Best regards,
> Gustavo
>
> On 05/14/2019 03:02 AM, Kazunori Ogata wrote:
> > Hi Gustavo,
> >
> > Thank you for the suggestion. I'll proceed to put the fix request comment
> > and tag in the original bug report.
> >
> > Thank you too for offering to sponsor this change. I'll let you know when
> > it's approved.
> >
> >
> > Regards,
> > Ogata
> >
> >
> > "Gustavo Romero" wrote on 2019/05/14 04:59:43:
> >
> >> From: "Gustavo Romero"
> >> To: Kazunori Ogata/Japan/IBM at IBMJP, hotspot-compiler-dev at openjdk.java.net,
> >> jdk8u-dev at openjdk.java.net
> >> Date: 2019/05/14 04:59
> >> Subject: Re: [8u-dev, ppc] RFR for (almost clean) backport of 8158232
> >>
> >> Hi Ogata,
> >>
> >> Thanks for the backport and for the webrev.
> >>
> >> I understand that offset adjustments in general, and particularly for this
> >> backport, are not considered a change that needs to be reviewed again.
> >>
> >> That said, and although I'm not a Reviewer, I tested it against SPECjvm and
> >> microbenchmarks for byte, int, and long and reviewed the change for jdk8u-dev.
> >>
> >> It looks good.
> >>
> >> Please, provide a "Fix Request" comment to the original bug explaining that
> >> the backport is low risk and affects PPC64-only, according to [1] and [2].
> >> Then please add the label "jdk8u-fix-request" to it.
> >>
> >> Once the approval to push is granted I'll sponsor the change.
> >>
> >> Thank you.
> >>
> >> Best regards,
> >> Gustavo
> >>
> >> [1] https://wiki.openjdk.java.net/display/jdk8u/Main
> >> [2] http://openjdk.java.net/projects/jdk-updates/approval.html
> >>
> >> On 05/10/2019 03:55 AM, Kazunori Ogata wrote:
> >>> Sorry, I forgot to put the links to the bug report and the original
> >>> changeset. Also forgot to mention that this changeset is needed to
> >>> backport AES intrinsics support [1] on ppc64 big-endian.
> >>>
> >>> Bug report:
> >>> https://bugs.openjdk.java.net/browse/JDK-8158232
> >>>
> >>> Original change set
> >>> http://hg.openjdk.java.net/jdk/jdk/rev/987528901b83
> >>>
> >>>
> >>> Webrev:
> >>> http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8158232/webrev.02/
> >>>
> >>>
> >>> Refs:
> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8188868
> >>>
> >>>
> >>> Regards,
> >>> Ogata
> >>>
> >>> "hotspot-compiler-dev"
> >>> wrote on 2019/05/10 15:30:05:
> >>>
> >>>> From: "Kazunori Ogata"
> >>>> To: hotspot-compiler-dev at openjdk.java.net, jdk8u-dev at openjdk.java.net
> >>>> Date: 2019/05/10 15:31
> >>>> Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8158232
> >>>> Sent by: "hotspot-compiler-dev"
> >>>>
> >>>> Hi,
> >>>>
> >>>> May I get review for backport of 8158232: PPC64: improve byte, int and
> >>>> long array copy stubs by using VSX instructions?
> >>>>
> >>>> This changeset appears to have no conflicts with the latest jdk8u-dev code, but the
> >>>> patch command failed to apply it. It seems the patch command lost the
> >>>> code regions to apply patches because stubGenerator_ppc.cpp has sets of
> >>>> similar (but slightly different) functions.
> >>>>
> >>>> I created new webrev mainly to update line numbers in the patch file. I
> >>>> verified I can build fastdebug and release builds and there was no
> >>>> degradation in "make test" results.
> >>>>
> >>>> http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8158232/webrev.02/
> >>>>
> >>>> Regards,
> >>>> Ogata
> >>>>
> >>>>
> >>>
> >>>
> >
> >

From gromero at linux.vnet.ibm.com Wed May 22 19:25:59 2019
From: gromero at linux.vnet.ibm.com (Gustavo Romero)
Date: Wed, 22 May 2019 16:25:59 -0300
Subject: RFR(M): 8223660: jtreg: Decouple Unsafe from RTM tests
Message-ID: <33df6862-948f-b6c4-8768-193f54127c20@linux.vnet.ibm.com>

Hi,

Could I get reviews for the following change please?
Bug : https://bugs.openjdk.java.net/browse/JDK-8223660
Webrev: http://cr.openjdk.java.net/~gromero/8223660/v1/

It removes from the RTM jtreg tests the use of Unsafe native methods as abort provokers in a transaction. Relying on Unsafe native methods to abort a transaction makes the RTM tests brittle because an Unsafe native method can be converted to non-native at any time, breaking the RTM tests. This is the second time it has happened.

This change removes the use of Unsafe native methods (currently, pageSize() is used, but is not native anymore) and adds an isolated native method in library libXAbortProvoker.so for the XAbortProvoker class that will be used to abort transactions in the RTM tests, making the RTM jtreg tests more self-contained and so less brittle.

I tested the change on x86_64 and PPC64, w/ RTM/HTM CPU feature available.

Thank you!

Best regards,
Gustavo

From fujie at loongson.cn Thu May 23 07:29:17 2019
From: fujie at loongson.cn (Jie Fu)
Date: Thu, 23 May 2019 15:29:17 +0800
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To: <1060f01d-dcfa-3a04-284d-1c6a95c791fc@oracle.com>
References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com> <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn> <259a914e-1c9c-c884-6114-6f855a96afb6@loongson.cn> <1060f01d-dcfa-3a04-284d-1c6a95c791fc@oracle.com>
Message-ID:

Great improvement.
Thank you so much, Vladimir Ivanov.
With fastdebug, the elapsed time dropped from 37s to 4s on our test machine.

Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.04/

For time and safety reasons, I still prefer the one-line fix.
I think "profile.count < -1" is good enough to catch the overflow and no other action is required for this issue.

The overflow problem seems to be there for quite a long time.
It requires much effort and time to fix it correctly and completely, and the risk seems a little high.
To finish this issue ASAP, it might be better to handle the overflow problem in a separate bug ID.
What do you think?

Thanks a lot.
Best regards,
Jie

On 2019/5/22 11:58, Vladimir Ivanov wrote:
> Nice catch, Jie!
>
> I'm in favor of fixing the overflow instead and keep counts always
> non-negative. The root cause for negative counts is uint->int
> conversion and values can be normalized to max_int.
>
> Moreover, on 64-bit platforms it's easy to catch (and fix) uint
> overflow since all the info is there - counts already are 64-bit
> (intptr_t).
>
> Regarding the test:
>
> test/hotspot/jtreg/compiler/profiling/TestProfileCounterOverflow.java:
>
>   28  * @requires (vm.debug == true)
>
> The test runs just fine with product binaries. So, I'd prefer to avoid
> limiting it only to debug.
>
>
>   51         } else {
>   52             // Sleep to wait for the compilation to be finished
>   53             Thread.sleep(2000);
>   54         }
>
> You can just turn off background compilation instead (-Xbatch).
>
>
> I was able to significantly speed up the test with the following changes:
>
> /**
>  * @test
>  * @run main/othervm -Xbatch -XX:-UseOnStackReplacement -XX:MaxTrivialSize=0 compiler.profiling.TestProfileCounterOverflow
>  */
>
> package compiler.profiling;
>
> public class TestProfileCounterOverflow {
>     public static void test(long iterations) throws Exception {
>         for (long j = 0; j < iterations; j++) {
>             call();
>         }
>     }
>     public static void call() {}
>
>     public static void main(String[] args) throws Exception {
>         // trigger profiling on tier3
>         for (int i = 0; i < 500; i++) {
>             test(1);
>         }
>
>         test(Integer.MAX_VALUE + 10000L); // overflow call counter
>
>         // trigger c2 compilation
>         for (int i = 0; i < 10_000; i++) {
>             test(1);
>         }
>         System.out.println("TEST PASSED");
>     }
> }
>
> Best regards,
> Vladimir Ivanov
>
> On 22/05/2019 11:55, Jie Fu wrote:
>> On 2019/5/22 11:57, Leonid Mesnik wrote:
>>> Could you please add reproducer as regression test for fix. Please
>>> verify that test failed without fix and pass with fix.
>>
>> Done.
>>
>> Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.03/
>>
>> # Testing
>> On an i7-8700 at 3.20GHz machine, the test results:
>> -----------------------------------------------
>> - fastdebug: make test TEST="test/hotspot/jtreg/compiler/profiling/TestProfileCounterOverflow.java" CONF=fastdebug
>>    Before fix: failed, elapsed time: 37s
>>    After  fix: pass,   elapsed time: 43s
>> - slowdebug: make test TEST="test/hotspot/jtreg/compiler/profiling/TestProfileCounterOverflow.java" CONF=slowdebug
>>    Before fix: failed, elapsed time: 38s
>>    After  fix: pass,   elapsed time: 43s
>> -----------------------------------------------
>>
>> Please review it and give me some advice.
>>
>> Thanks.
>> Best regards,
>> Jie
>>
>>>
>>> Leonid
>>>
>>>> On May 21, 2019, at 8:54 PM, Jie Fu wrote:
>>>>
>>>> On 2019/5/21 10:30, Vladimir Ivanov wrote:
>>>>> It doesn't explain the failure since the assert is hit while
>>>>> parsing invoke* bytecode while type check failures are recorded
>>>>> for checkcast, aastore, and instanceof. Am I missing something
>>>>> important here?
>>>>>
>>>> Good question. I'm so sorry to make you confused.
>>>>
>>>> After a long time of digging into the code, I think this failure
>>>> was caused by the overflow of profile.count.
>>>>
>>>> A reproducer is constructed here:
>>>>   - http://cr.openjdk.java.net/~jiefu/8224162/CounterOverflow.java
>>>>
>>>> And I've changed the comment for the patch:
>>>>   - http://cr.openjdk.java.net/~jiefu/8224162/webrev.02/
>>>>
>>>> Please review it and give me some advice.
>>>>
>>>> Thanks.
>>>> Best regards,
>>>> Jie
>>>>
>>>>
>>

From vladimir.x.ivanov at oracle.com Thu May 23 10:21:29 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 23 May 2019 13:21:29 +0300
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To:
References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com> <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn> <259a914e-1c9c-c884-6114-6f855a96afb6@loongson.cn> <1060f01d-dcfa-3a04-284d-1c6a95c791fc@oracle.com>
Message-ID: <4c2da2fb-7550-d51b-539a-4656fc67bb00@oracle.com>

> Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.04/
>
> For time and safety reasons, I still prefer the one-line fix.
> I think "profile.count < -1" is good enough to catch the overflow and no
> other action is required for this issue.
>
> The overflow problem seems to be there for quite a long time.
> It requires much effort and time to fix it correctly and completely, and
> the risk seems a little high.

(Just noticed that the bug is classified as P1. IMO it should be reassessed as P3 and I left a comment there asking for that.)

I'm still in favor of fixing the root cause rather than putting band-aids in otherwise perfectly valid code.

I don't consider fixing CounterData::count() and its usages to properly handle overflow as overly complicated. There's a limited number of usages and they don't properly handle overflow either. So, fixing the bug is highly desirable even though it has been left unnoticed for a long time.

Best regards,
Vladimir Ivanov

> To finish this issue ASAP, it might be better to handle the overflow
> problem in a separate bug ID.
> What do you think?

> On 2019/5/22 11:58, Vladimir Ivanov wrote:
>> Nice catch, Jie!
>>
>> I'm in favor of fixing the overflow instead and keep counts always
>> non-negative.
>> The root cause for negative counts is uint->int
>> conversion and values can be normalized to max_int.
>>
>> Moreover, on 64-bit platforms it's easy to catch (and fix) uint
>> overflow since all the info is there - counts already are 64-bit
>> (intptr_t).
>>
>> Regarding the test:
>>
>> test/hotspot/jtreg/compiler/profiling/TestProfileCounterOverflow.java:
>>
>>   28  * @requires (vm.debug == true)
>>
>> The test runs just fine with product binaries. So, I'd prefer to avoid
>> limiting it only to debug.
>>
>>
>>   51         } else {
>>   52             // Sleep to wait for the compilation to be finished
>>   53             Thread.sleep(2000);
>>   54         }
>>
>> You can just turn off background compilation instead (-Xbatch).
>>
>>
>> I was able to significantly speed up the test with the following changes:
>>
>> /**
>>  * @test
>>  * @run main/othervm -Xbatch -XX:-UseOnStackReplacement -XX:MaxTrivialSize=0 compiler.profiling.TestProfileCounterOverflow
>>  */
>>
>> package compiler.profiling;
>>
>> public class TestProfileCounterOverflow {
>>     public static void test(long iterations) throws Exception {
>>         for (long j = 0; j < iterations; j++) {
>>             call();
>>         }
>>     }
>>     public static void call() {}
>>
>>     public static void main(String[] args) throws Exception {
>>         // trigger profiling on tier3
>>         for (int i = 0; i < 500; i++) {
>>             test(1);
>>         }
>>
>>         test(Integer.MAX_VALUE + 10000L); // overflow call counter
>>
>>         // trigger c2 compilation
>>         for (int i = 0; i < 10_000; i++) {
>>             test(1);
>>         }
>>         System.out.println("TEST PASSED");
>>     }
>> }
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 22/05/2019 11:55, Jie Fu wrote:
>>> On 2019/5/22 11:57, Leonid Mesnik wrote:
>>>> Could you please add reproducer as regression test for fix. Please
>>>> verify that test failed without fix and pass with fix.
>>>
>>> Done.
>>>
>>> Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.03/
>>>
>>> # Testing
>>> On an i7-8700 at 3.20GHz machine, the test results:
>>> -----------------------------------------------
>>> - fastdebug: make test TEST="test/hotspot/jtreg/compiler/profiling/TestProfileCounterOverflow.java" CONF=fastdebug
>>>    Before fix: failed, elapsed time: 37s
>>>    After  fix: pass,   elapsed time: 43s
>>> - slowdebug: make test TEST="test/hotspot/jtreg/compiler/profiling/TestProfileCounterOverflow.java" CONF=slowdebug
>>>    Before fix: failed, elapsed time: 38s
>>>    After  fix: pass,   elapsed time: 43s
>>> -----------------------------------------------
>>>
>>> Please review it and give me some advice.
>>>
>>> Thanks.
>>> Best regards,
>>> Jie
>>>
>>>>
>>>> Leonid
>>>>
>>>>> On May 21, 2019, at 8:54 PM, Jie Fu wrote:
>>>>>
>>>>> On 2019/5/21 10:30, Vladimir Ivanov wrote:
>>>>>> It doesn't explain the failure since the assert is hit while
>>>>>> parsing invoke* bytecode while type check failures are recorded
>>>>>> for checkcast, aastore, and instanceof. Am I missing something
>>>>>> important here?
>>>>>>
>>>>> Good question. I'm so sorry to make you confused.
>>>>>
>>>>> After a long time of digging into the code, I think this failure
>>>>> was caused by the overflow of profile.count.
>>>>>
>>>>> A reproducer is constructed here:
>>>>>   - http://cr.openjdk.java.net/~jiefu/8224162/CounterOverflow.java
>>>>>
>>>>> And I've changed the comment for the patch:
>>>>>   - http://cr.openjdk.java.net/~jiefu/8224162/webrev.02/
>>>>>
>>>>> Please review it and give me some advice.
>>>>>
>>>>> Thanks.
>>>>> Best regards,
>>>>> Jie
>>>>>
>>>>>
>>>
>

From fujie at loongson.cn Thu May 23 10:54:39 2019
From: fujie at loongson.cn (Jie Fu)
Date: Thu, 23 May 2019 18:54:39 +0800
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To: <4c2da2fb-7550-d51b-539a-4656fc67bb00@oracle.com>
References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com> <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn> <259a914e-1c9c-c884-6114-6f855a96afb6@loongson.cn> <1060f01d-dcfa-3a04-284d-1c6a95c791fc@oracle.com> <4c2da2fb-7550-d51b-539a-4656fc67bb00@oracle.com>
Message-ID:

> I'm still in favor of fixing the root cause rather than putting band-aids in
> otherwise perfectly valid code.
>
> I don't consider fixing CounterData::count() and its usages to
> properly handle overflow as overly complicated. There's a limited
> number of usages and they don't properly handle overflow either. So,
> fixing the bug is highly desirable even though it has been left
> unnoticed for a long time.

OK.
I'd like to do it.

Thanks.
Best regards,
Jie

From adinn at redhat.com Thu May 23 10:55:59 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 23 May 2019 11:55:59 +0100
Subject: RFR: 8207851: Implement JEP 352
Message-ID: <80da32b2-7acb-7b94-b82c-5dcd5cf95539@redhat.com>

Hi,

Could I please have reviews of the following change set which implements JEP 352:

JEP: https://openjdk.java.net/jeps/352
JIRA: https://bugs.openjdk.java.net/browse/JDK-8207851
webrev: http://cr.openjdk.java.net/~adinn/8207851/webrev.00/

I would also very much like to target this implementation for JDK13.

Testing:

The webrev includes a simple test (in directory test/jdk/java/nio/MappedByteBuffer) which ensures that an NVRAM-backed MappedByteBuffer can be created, updated and forced using cache line flushes.
This test is marked as ignored because it requires, inter alia, a suitably configured host, fitted with an NVRAM DIMM device or employing a pseudo-NVRAM device simulated over volatile RAM.

The above test has been run successfully on Linux x86_64 with an Optane DIMM and with a pseudo-NVRAM device. Further, more rigorous testing has been done in both the above configurations using the Narayana Transactions logger and Infinispan distributed data grid.

Testing of /successful/ use of the API on Linux AArch64 has not yet been possible with either emulated or real NVRAM devices as it requires an updated (ARMv8.2) CPU hardware capability as well as access to AArch64 compatible NVRAM devices.

n.b. an AArch64 compatibility flag (-X:UsePOCForPOP) has been provided in the current patch to support testing on older CPUs using simulated NVRAM. Unfortunately, it has not yet been possible to obtain access to an AArch64 v8.1 machine that supports simulation of NVRAM devices via volatile RAM.

In consequence, AArch64 testing has been limited to ensuring that the relevant API failure modes correctly manifest: i.e.

  v8.1 CPUs which lack the relevant hardware instructions refuse to map
  NVRAM-backed buffers, throwing UnsupportedOperationException

  v8.1 CPUs which bypass this failure via compatibility mode fail at the
  mmap stage with IOException due to lack of NVRAM mapping support in
  the underlying OS mmap API

It is expected that the omissions in AArch64 testing will be rectified in the next few weeks. While this is desirable, the omissions are not viewed as critical since there is currently no general access to the relevant hardware.

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No.
03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From aph at redhat.com Thu May 23 12:18:10 2019
From: aph at redhat.com (Andrew Haley)
Date: Thu, 23 May 2019 13:18:10 +0100
Subject: RFR: 8224671: AArch64: mauve System.arraycopy test failure
Message-ID:

Well, this one was a doozy.

Mauve is an old Java test suite, from many years ago. Thankfully
though, AdoptOpenJDK still run it, and we received a surprising report
about a test failure in System.arraycopy().

It turns out that in C1-compiled code an ArrayStoreException is
wrongly thrown as an ArrayIndexOutOfBoundsException. After a lot of
debugging, the bug turned out to be the calculation of ~x, a bitwise
inversion:

    __ eonw(rscratch1, r0, 0);

This looks reasonable enough: EOR r0 with ~0. But it's wrong: the EONW
instruction has no immediate form. So how are we not getting an
assembly-time error? This turns out to be a misfeature of C++ combined
with the nasty way we declare registers in HotSpot.

When C++ processes the overloads for eonw() it looks first for an
exact match of the immediate operand then applies the default integer
conversions. One of these silently (!) converts 0 to a NULL
pointer. Unfortunately (very unfortunately) the register declaration
for r0 is a NULL pointer, so

    __ eonw(rscratch1, r0, 0);

generates

    eonw x8, r0, r0

which always sets r0 to -1, which causes the ArrayIndexOutOfBoundsException.

The fix is to use ZR instead of r0.

In order to make sure that this never happens again I've provided a
declaration for the immediate form of these instructions. It will
generate a compile-time error if this mistake ever happens again.

http://cr.openjdk.java.net/~aph/8224671/jdk-test.changeset

We'll need backports for all releases.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
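
The overload pitfall described in the message above can be reproduced in a few lines of standalone C++. This is only a toy model under stated assumptions: ToyAssembler, GuardedToyAssembler, encoding() and the register constants are hypothetical stand-ins, not the HotSpot declarations, and the guard shown here uses a deleted int overload, which is one possible way to get a compile-time error and not necessarily what the changeset does.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in: a register is a pointer whose value is its number,
// so register 0 is represented by a null pointer.
struct RegisterImpl;
typedef RegisterImpl* Register;

static const Register r0 = nullptr;
static const Register rscratch1 =
    reinterpret_cast<Register>(static_cast<std::uintptr_t>(8));

static int encoding(Register r) {
  return static_cast<int>(reinterpret_cast<std::uintptr_t>(r));
}

struct ToyAssembler {
  // Register-register form only; there is no immediate form.
  std::string eonw(Register rd, Register rn, Register rm) {
    return "eonw w" + std::to_string(encoding(rd)) +
           ", w" + std::to_string(encoding(rn)) +
           ", w" + std::to_string(encoding(rm));
  }
};

struct GuardedToyAssembler : ToyAssembler {
  using ToyAssembler::eonw;
  // A literal 0 is an exact match for int, so this deleted overload is
  // selected and the call is rejected at compile time.
  std::string eonw(Register rd, Register rn, int imm) = delete;
};

// The literal 0 is a null pointer constant, so it silently becomes r0.
std::string mistaken_call() {
  ToyAssembler masm;
  return masm.eonw(rscratch1, r0, 0);
}

// EON computes rn ^ ~rm; when both operands are the same register the
// result is all ones, whatever the register holds.
std::uint32_t eon(std::uint32_t rn, std::uint32_t rm) { return rn ^ ~rm; }

void demo_eonw() {
  assert(mistaken_call() == "eonw w8, w0, w0");
  assert(eon(0x1234u, 0x1234u) == 0xFFFFFFFFu);
  // GuardedToyAssembler().eonw(rscratch1, r0, 0);  // would not compile
}
```

In this model the mistaken call compiles without complaint and encodes register 0 as the third operand, while the same call on GuardedToyAssembler is a hard error.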
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From tobias.hartmann at oracle.com Thu May 23 12:39:28 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 23 May 2019 14:39:28 +0200
Subject: [13] RFR(S): 8224658: Unsafe access C2 compile fails with assert(flat != TypePtr::BOTTOM) failed: cannot alias-analyze an untyped ptr: adr_type = NULL
Message-ID: <2b0a8781-4688-558d-353e-8ef3483fc833@oracle.com>

Hi,

please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8224658
http://cr.openjdk.java.net/~thartmann/8224658/webrev.00/

We hit an assert in alias analysis when compiling an unsafe off-heap access intrinsic with zero address. We should simply bail out in this case because if that code is ever executed, it will crash the VM anyway.

Thanks to Aleksey for reporting the issue and providing the test case.

Best regards,
Tobias

From tobias.hartmann at oracle.com Thu May 23 12:44:37 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 23 May 2019 14:44:37 +0200
Subject: RFR(XS) 8224558: x86 Fix replicateB encoding
In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4ECDD8@ORSMSX106.amr.corp.intel.com>
References: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EBAE0@ORSMSX106.amr.corp.intel.com> <140eaec4-a0fb-cfbf-cbae-d9b5661df758@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4ECDD8@ORSMSX106.amr.corp.intel.com>
Message-ID: <5ca63c64-eb2c-1a11-c758-462782d0933d@oracle.com>

Hi Vivek,

On 22.05.19 19:48, Deshpande, Vivek R wrote:
> I came across this issue with vector API tests.
> I tried to write a reproducer using the autovectorizer test, but it always uses the register based rule (instruct Repl32B) instead of memory based rule (instruct Repl32B_mem) and register based rule gives correct result.
> This is why it never showed up.

Okay, thanks for the explanation.

> Could you please help me with forcing to use memory based rule with autovectorizer based test.
I'm afraid I can't help, I'm not even sure what the autovectorizer is. Is that specific to the vector API?

Best regards,
Tobias

From adinn at redhat.com Thu May 23 12:45:55 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 23 May 2019 13:45:55 +0100
Subject: RFR: 8224671: AArch64: mauve System.arraycopy test failure
In-Reply-To:
References:
Message-ID:

Oh, nice catch.

The patch looks good for head and for all backports so count it as reviewed.

Are those the only cases where we need a variant with an unsigned int argument?

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

On 23/05/2019 13:18, Andrew Haley wrote:
> Well, this one was a doozy.
>
> Mauve is an old Java test suite, from many years ago. Thankfully
> though, AdoptOpenJDK still run it, and we received a surprising report
> about a test failure in System.arraycopy().
>
> It turns out that in C1-compiled code an ArrayStoreException is
> wrongly thrown as an ArrayIndexOutOfBoundsException. After a lot of
> debugging, the bug turned out to be the calculation of ~x, a bitwise
> inversion:
>
>     __ eonw(rscratch1, r0, 0);
>
> This looks reasonable enough: EOR r0 with ~0. But it's wrong: the EONW
> instruction has no immediate form. So how are we not getting an
> assembly-time error? This turns out to be a misfeature of C++ combined
> with the nasty way we declare registers in HotSpot.
>
> When C++ processes the overloads for eonw() it looks first for an
> exact match of the immediate operand then applies the default integer
> conversions. One of these silently (!) converts 0 to a NULL
> pointer. Unfortunately (very unfortunately) the register declaration
> for r0 is a NULL pointer, so
>
>     __ eonw(rscratch1, r0, 0);
>
> generates
>
>     eonw x8, r0, r0
>
> which always sets r0 to -1, which causes the ArrayIndexOutOfBoundsException.
>
> The fix is to use ZR instead of r0.
>
> In order to make sure that this never happens again I've provided a
> declaration for the immediate form of these instructions. It will
> generate a compile-time error if this mistake ever happens again.
>
> http://cr.openjdk.java.net/~aph/8224671/jdk-test.changeset
>
> We'll need backports for all releases.
>

From vladimir.x.ivanov at oracle.com Thu May 23 13:09:00 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 23 May 2019 16:09:00 +0300
Subject: [13] RFR(S): 8224658: Unsafe access C2 compile fails with assert(flat != TypePtr::BOTTOM) failed: cannot alias-analyze an untyped ptr: adr_type = NULL
In-Reply-To: <2b0a8781-4688-558d-353e-8ef3483fc833@oracle.com>
References: <2b0a8781-4688-558d-353e-8ef3483fc833@oracle.com>
Message-ID: <445d5158-21f3-0fc1-43ac-273ad81d0533@oracle.com>

The fix should work fine for Unsafe.getXxx(0) case, but what if address turns into 0 later? For example, I don't see a reason why it can't theoretically happen in presence of post-parse inlining happening in effectively unreachable code.

Best regards,
Vladimir Ivanov

On 23/05/2019 15:39, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8224658
> http://cr.openjdk.java.net/~thartmann/8224658/webrev.00/
>
> We hit an assert in alias analysis when compiling an unsafe off-heap access intrinsic with zero
> address. We should simply bail out in this case because if that code is ever executed, it will crash
> the VM anyway.
>
> Thanks to Aleksey for reporting the issue and providing the test case.
>
> Best regards,
> Tobias
>

From aph at redhat.com Thu May 23 13:10:48 2019
From: aph at redhat.com (Andrew Haley)
Date: Thu, 23 May 2019 14:10:48 +0100
Subject: RFR: 8224671: AArch64: mauve System.arraycopy test failure
In-Reply-To:
References:
Message-ID:

On 5/23/19 1:45 PM, Andrew Dinn wrote:
> Oh, nice catch.
> > The patch looks good for head and for all backports so count it as reviewed. > > Are those the only cases where we need a variant with an unsigned int > argument? There probably are some more cases where there is this risk, but I'd rather change the definitions of our registers (so that none of them is a null pointer constant) than add a ton of overloads. That's for the future: this simple patch is good for the backports. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From tobias.hartmann at oracle.com Thu May 23 13:21:56 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 23 May 2019 15:21:56 +0200 Subject: [13] RFR(S): 8224658: Unsafe access C2 compile fails with assert(flat != TypePtr::BOTTOM) failed: cannot alias-analyze an untyped ptr: adr_type = NULL In-Reply-To: <445d5158-21f3-0fc1-43ac-273ad81d0533@oracle.com> References: <2b0a8781-4688-558d-353e-8ef3483fc833@oracle.com> <445d5158-21f3-0fc1-43ac-273ad81d0533@oracle.com> Message-ID: <837ac2a7-d8eb-34ff-06f0-586ed75bd639@oracle.com> Hi Vladimir, thanks for looking at this. On 23.05.19 15:09, Vladimir Ivanov wrote: > The fix should work fine for Unsafe.getXxx(0) case, but what if address turns into 0 later? For > example, I don't see a reason why it can't theoretically happen in presence of post-parse inlining > happening in effectively unreachable code. Yes that's possible and I thought it doesn't matter because we won't hit that assert during IGVN. I've just checked in detail and C2 actually completely removes the unsafe access in that case which is obviously incorrect. I'll come back once I have a fix ready. 
Thanks,
Tobias

From matthias.baesken at sap.com Thu May 23 11:16:43 2019
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Thu, 23 May 2019 11:16:43 +0000
Subject: deoptimization.cpp Events::log / Events::log_deopt_message - was : RE: RFR: 8224221: add memprotect calls to event log
Message-ID:

Hello, could someone please comment on the Events::log / Events::log_deopt_message calls in deoptimization.cpp? Should they all go to the deopt log (Events::log_deopt_message) instead?

Best regards, Matthias

> >
> > Btw when looking into the other already present Events::log* calls I wondered about this:
> > In deoptimization.cpp, there are 3 calls to Events::log, like
> >
> >   Events::log(thread, "DEOPT UNPACKING pc=" INTPTR_FORMAT " sp=" INTPTR_FORMAT " mode %d",
> >               p2i(stub_frame.pc()), p2i(stub_frame.sp()), exec_mode);
> >
> > But just one Events::log_deopt_message
> >
> >   Events::log_deopt_message(thread, "Uncommon trap: reason=%s action=%s pc=" INTPTR_FORMAT " method=%s @ %d %s",
> >                             trap_reason_name(reason), trap_action_name(action), p2i(fr.pc()),
> >                             trap_method->name_and_sig_as_C_string(), trap_bci, nm->compiler_name());
> >
> > I think all 4 messages should go to the separate deoptimization log and use Events::log_deopt_message.
> > Or is there a special intention why some go into the deopt log (Events::log_deopt_message) and some into the general events log (Events::log)?
>
> I have no idea sorry. Best to open an issue and/or discuss this with the
> compiler folk.
> From lutz.schmidt at sap.com Thu May 23 13:45:03 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Thu, 23 May 2019 13:45:03 +0000 Subject: RFR(XS): 8224652: 32-bit build failures after JDK-8213084 Message-ID: Dear All, I would like to request reviews for this tiny fix, bringing 32-bit platforms back to life: Bug: https://bugs.openjdk.java.net/browse/JDK-8224652 Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8224652.00/ ~shade: thank you Aleksey for providing the fix. Regards, Lutz From aph at redhat.com Thu May 23 14:11:37 2019 From: aph at redhat.com (Andrew Haley) Date: Thu, 23 May 2019 15:11:37 +0100 Subject: RFR: 8224671: AArch64: mauve System.arraycopy test failure In-Reply-To: References: Message-ID: On 5/23/19 1:45 PM, Andrew Dinn wrote: > The patch looks good for head and for all backports so count it as reviewed. > > Are those the only cases where we need a variant with an unsigned int > argument? Probably not, but I'd rather fix those by redefining our registers so that the declarations don't use NULL. At the same time, it'd fix a case of UB, so it's definitely worth doing. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From tobias.hartmann at oracle.com Thu May 23 14:12:59 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 23 May 2019 16:12:59 +0200 Subject: RFR(XS): 8224652: 32-bit build failures after JDK-8213084 In-Reply-To: References: Message-ID: <93ba0374-bced-6891-6b0b-250185179e1c@oracle.com> Hi Lutz, looks good to me. Best regards, Tobias On 23.05.19 15:45, Schmidt, Lutz wrote: > Dear All, > > I would like to request reviews for this tiny fix, bringing 32-bit platforms back to life: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8224652 > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8224652.00/ > > ~shade: thank you Aleksey for providing the fix. 
>
> Regards,
> Lutz
>
>

From lutz.schmidt at sap.com Thu May 23 14:18:44 2019
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Thu, 23 May 2019 14:18:44 +0000
Subject: RFR(XS): 8224652: 32-bit build failures after JDK-8213084
In-Reply-To: <93ba0374-bced-6891-6b0b-250185179e1c@oracle.com>
References: <93ba0374-bced-6891-6b0b-250185179e1c@oracle.com>
Message-ID: <5ABC7660-D084-40FC-9DC9-688D0BC7DD59@sap.com>

Thanks, Tobias, for reviewing.

Regards,
Lutz

On 23.05.19, 16:12, "Tobias Hartmann" wrote:

    Hi Lutz,

    looks good to me.

    Best regards,
    Tobias

    On 23.05.19 15:45, Schmidt, Lutz wrote:
    > Dear All,
    >
    > I would like to request reviews for this tiny fix, bringing 32-bit platforms back to life:
    >
    > Bug: https://bugs.openjdk.java.net/browse/JDK-8224652
    > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8224652.00/
    >
    > ~shade: thank you Aleksey for providing the fix.
    >
    > Regards,
    > Lutz
    >
    >

From nils.eliasson at oracle.com Thu May 23 14:25:43 2019
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Thu, 23 May 2019 16:25:43 +0200
Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC
Message-ID:

Hi,

In ZGC we use load barriers on references. In the original implementation these were added as macro nodes at parse time. The load barrier node consumes and produces control flow in order to be able to be lowered into a check with a slow path late. The load barrier nodes are fixed in the control flow, and extensions to different optimizations are needed to move the barriers out of loops and past other unrelated control flow.

With this patch the barriers are instead added after the loop optimizations, before macro node expansion. This makes the entire pipeline until that point oblivious to the barriers. A dump of the IR with ZGC or EpsilonGC will be basically identical at that point, and the diff compared to SerialGC or ParallelGC, which use write barriers, is really small.

Benefits

- A major complexity reduction. One can reason about and implement loop optimizations without caring about the barriers. The escape analysis doesn't need to know about the barriers. Loads float freely like they are supposed to.

- Fewer nodes early. The inlining will become more deterministic. A barrier-heavy GC will not run into node limits earlier. Also, node-limit-bounded optimizations like unrolling and peeling will not be penalized by barriers.

- Better test coverage, or reduced testing cost, when the same optimization doesn't need to be verified with every GC.

- Better control over where barriers end up. It is trivial to guarantee that the load and its barrier are not separated by a safepoint.

Design

The implementation uses an extra phase that piggybacks on PhaseIdealLoop, which provides control and dominator information for all loads. This extra phase is needed because we need to splice the control flow when adding the load barriers.

Barriers are inserted on the load nodes in post order (any successor first). This is to guarantee that the dominator information above every insertion is correct. This is also important within blocks. Two loads in the same block can float in relation to each other. The addition of barriers serializes their order. Any def-use relationship is upheld by expanding them in post order.

Barrier insertion is done in stages. In this first stage a single macro node that represents the barrier is added with all the dependencies that are required. In the macro expansion phase the barrier nodes are expanded into their final shape, adding nodes that represent the conditional load barrier check. (Write barriers in other GCs could possibly be expanded here directly.)

All the barriers that are needed for unsafe reference operations (cas, swap, cmpx) are also expanded late. They already have control flow, so the expansion is straightforward.

The barriers for the unsafe reference operations (cas, getandset, cmpx) have also been simplified. The cas-load-cas dance has been replaced by a pre-load. The pre-load is a load with a barrier that is kept alive by an extra (required) edge on the unsafe-primitive nodes (specialized as ZCompareAndSwap, ZGetAndSet, ZCompareAndExchange).

One challenge that was encountered early, and that has caused considerable work, is that nodes (like loads) can end up between calls and their catch projections. This is usually handled after matching, in PhaseCFG::call_catch_cleanup, where the nodes after the call are cloned to all catch blocks. At this stage they are in an ordered list, so that is a straightforward process. For late barrier insertion we need to splice in control earlier, before matching, and control flow between calls and catches is not allowed. This requires us to add a transformation pass where all loads and their dependent instructions are cloned out to the catch blocks before we can start splicing in control flow. This transformation doesn't replace the legacy call_catch_cleanup fully, but that could be a future goal.

In the original barrier implementation there were two different load barrier implementations: the basic and the optimized. With the new approach to barriers on unsafe, the basic one is no longer required and has been removed. (It provided options for skipping the self-healing, and passed the ref in a register, guaranteeing that the oop wasn't reloaded.)

The wart that was fixup_partial_loads in zHeap has also been made redundant.

Dominating barriers are no longer removed on weak loads. Weak barriers don't guarantee self-healing.

Follow-up work:

- Consolidate all uses of GrowableArray::insert_sorted to use the new version.

- Refactor the phases. There are a lot of simplifications and verifications that can be done with better-defined phases.

- Simplify the remaining barrier optimizations. There might still be code paths that are no longer needed.

Testing:

Hotspot tier 1-6, CTW, jcstress, micros, runthese, kitchensink, and then some. All with -XX:+ZVerifyViews.
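The shape a load barrier takes after macro expansion, a conditional check with a slow path that heals the field, can be sketched in plain C++. This is a hedged illustration only: `oop_bits`, `kBadMask`, `slow_path`, `load_with_barrier`, `slow_path_calls` and the choice of the low bit as the "bad" marker are all hypothetical and are not taken from the ZGC sources or the webrev.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: a reference is just a word, and one bit marks it as stale.
typedef std::uintptr_t oop_bits;
static const oop_bits kBadMask = 0x1;   // assumption: low bit set means "needs the slow path"

static int slow_path_calls = 0;         // counts slow-path entries, for demonstration only

// Slow path: compute a good reference and store it back into the field,
// so that the next load of the same field passes the check (self-healing).
inline oop_bits slow_path(oop_bits* field, oop_bits ref) {
  ++slow_path_calls;
  oop_bits good = ref & ~kBadMask;
  *field = good;
  return good;
}

// The load followed by the conditional barrier check.
inline oop_bits load_with_barrier(oop_bits* field) {
  oop_bits ref = *field;                // the plain load
  if ((ref & kBadMask) != 0) {          // the conditional check
    ref = slow_path(field, ref);        // rarely taken path
  }
  return ref;
}
```

Calling `load_with_barrier` twice on a field holding a stale reference enters the slow path only the first time, because the first call has already written the healed value back.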
Bug: https://bugs.openjdk.java.net/browse/JDK-8224675 Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/ Please review, Regards, Nils From rkennke at redhat.com Thu May 23 14:31:37 2019 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 23 May 2019 16:31:37 +0200 Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC In-Reply-To: References: Message-ID: <59c61d94-fdb3-c380-1ff7-dd1fbc5752c5@redhat.com> Quick glance showed a problem: you are renaming/moving BarrierSetC2::add_users_to_worklist() but ShenandoahBarrierSetC2 is not updated accordingly. Roman > Hi, > > In ZGC we use load barriers on references. In the original > implementation these where added as macro nodes at parse time. The load > barrier node consumes and produces control flow in order to be able to > be lowered into a check with a slow path late. The load barrier nodes > are fixed in the control flow, and extensions to different optimizations > are need the barriers out of loop and past other unrelated control flow. > > With this patch the barriers are instead added after the loop > optimizations, before macro node expansion. This makes the entire > pipeline until that point oblivious about the barriers. A dump of the IR > with ZGC or EpsilonGC will be basically identical at that point, and the > diff compared to serialGC or ParallelGC that use write barriers is > really small. > > Benefits > > - A major complexity reduction. One can reason about and implement loop > optimization without caring about the barriers. The escape analysis > doesn't need to know about the barriers. Loads float freely like they > are supposed to. > > - Less nodes early. The inlining will become more deterministic. A > barrier heavy GC will not run into node limits earlier. Also node limit > bounded optimization like unrolling and peeling will not be penalized by > barriers. > > - Better test coverage, or reduce testing cost when the same > optimization doesn't need to be verified with every GC. 
> > - Better control on where barriers end up. It is trivial to guarantee > that the load and barriers are not separated by a safepoint. > > Design > > The implementation uses an extra phase that piggy back on PhaseIdealLoop > which provides control and dominator information for all loads. This > extra phase is needed because we need to splice the control flow when > adding the load barriers. > > Barriers are inserted on the loads nodes in post order (any successor > first). This is to guarantee the dominator information above every > insertion is correct. This is also important within blocks. Two loads in > the same block can float in relation to each other. The addition of > barriers serializes their order. Any def-use relationship is upheld by > expanding them post order. > > Barrier insertion is done in stages. In this first stage a single macro > node that represents the barrier is added with all dependencies that is > required. In the macro expansion phase the barrier nodes is expanded > into the final shape, adding nodes that represent the conditional load > barrier check. (Write barriers in other GCs could possibly be expanded > here directly) > > All the barriers that are needed for unsafe reference operations (cas, > swap, cmpx) are also expanded late. They already have control flow, so > the expansion is straight forward. > > The barriers for the unsafe reference operations (cas, getandset, cmpx) > have also been simplified. The cas-load-cas dance have been replaced by > a pre-load. The pre-load is a load with a barrier, that is kept alive by > an extra (required) edge on the unsafe-primitive-nodes (specialized as > ZCompareAndSwap, ZGetAndSet, ZCompareAndExchange). > > One challenge that was encountered early and that have caused > considerable work is that nodes (like loads) can end up between calls > and their catch projections. 
This is usually handled after matching, in > PhaseCFG::call_catch_cleanup, where the nodes after the call are cloned > to all catch blocks. At this stage they are in an ordered list, so that > is a straight forward process. For late barrier insertion we need to > splice in control earlier, before matching, and control flow between > calls and catches is not allowed. This requires us to add a > transformation pass where all loads and their dependent instructions are > cloned out to the catch blocks before we can start splicing in control > flow. This transformation doesn't replace the legacy call_catch_cleanup > fully, but it could be a future goal. > > In the original barrier implementation there where two different load > barrier implementations: the basic and the optimized. With the new > approach to barriers on unsafe, the basic is no longer required and has > been removed. (It provided options for skipping the self healing, and > passed the ref in a register, guaranteeing that the oop wasn't reloaded.) > > The wart that was fixup_partial_loads in zHeap has also been made > redundant. > > Dominating barriers are no longer removed on weak loads. Weak barriers > doesn't guarantee self-healing. > > Follow up work: > > - Consolidate all uses of GrowableArray::insert_sorted to use the new > version > > - Refactor the phases. There are a lot of simplifications and > verification that can be done with more well defined phases. > > - Simplify the remaining barrier optimizations. There might still be > code paths that are no longer needed. > > > Testing: > > Hotspot tier 1-6, CTW, jcstress, micros, runthese, kitchensink, and then > some. All with -XX:+ZVerifyViews. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8224675 > > Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/ > > > Please review, > > Regards, > > Nils > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL:

From lutz.schmidt at sap.com Thu May 23 16:05:06 2019
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Thu, 23 May 2019 16:05:06 +0000
Subject: RFR(S): 8224672: (lib)hsdis-.so search incorrect after JDK-8213084
Message-ID: <14C6583D-9ECA-4F00-A945-6D8AB06DA91A@sap.com>

Dear All,

I would like to request reviews for this little patch, fixing an issue with finding the hsdis disassembler library:

Bug:    https://bugs.openjdk.java.net/browse/JDK-8224672
Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8224672.00/

Tracing the library search locations now shows:
   Trying to load: /jdk/lib/server/libhsdis-amd64.dylib
   Trying to load: /jdk/lib/server/hsdis-amd64.dylib
   Trying to load: /jdk/lib/hsdis-amd64.dylib
   Trying to load: hsdis-amd64.dylib via LD_LIBRARY_PATH or equivalent
   Loaded disassembler from hsdis-amd64.dylib

~shade: thank you Aleksey for discovering the issue.

Regards,
Lutz

From shade at redhat.com Thu May 23 16:33:50 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 23 May 2019 18:33:50 +0200
Subject: RFR(XS): 8224652: 32-bit build failures after JDK-8213084
In-Reply-To:
References:
Message-ID:

On 5/23/19 3:45 PM, Schmidt, Lutz wrote:
> Dear All,
>
> I would like to request reviews for this tiny fix, bringing 32-bit platforms back to life:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8224652
> Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8224652.00/

Looks good to me. You don't need to credit me for the fix: most of the work is testing it actually works.

-Aleksey

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From vladimir.kozlov at oracle.com Thu May 23 16:43:30 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 23 May 2019 09:43:30 -0700 Subject: RFR(S): 8224672: (lib)hsdis-.so search incorrect after JDK-8213084 In-Reply-To: <14C6583D-9ECA-4F00-A945-6D8AB06DA91A@sap.com> References: <14C6583D-9ECA-4F00-A945-6D8AB06DA91A@sap.com> Message-ID: <47E36D2D-DF1C-477F-A7FD-353BDECC0341@oracle.com> Looks good. Thanks Vladimir > On May 23, 2019, at 9:05 AM, Schmidt, Lutz wrote: > > Dear All, > > I would like to request reviews for this little patch, fixing an issue with finding the hsdis disassembler library: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8224672 > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8224672.00/ > > Tracing the library search locations now shows: > Trying to load: /jdk/lib/server/libhsdis-amd64.dylib > Trying to load: /jdk/lib/server/hsdis-amd64.dylib > Trying to load: /jdk/lib/hsdis-amd64.dylib > Trying to load: hsdis-amd64.dylib via LD_LIBRARY_PATH or equivalent > Loaded disassembler from hsdis-amd64.dylib > > ~shade: thank you Aleksey for discovering the issue. > > Regards, > Lutz > > > From lutz.schmidt at sap.com Thu May 23 16:46:19 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Thu, 23 May 2019 16:46:19 +0000 Subject: RFR(S): 8224672: (lib)hsdis-.so search incorrect after JDK-8213084 In-Reply-To: <47E36D2D-DF1C-477F-A7FD-353BDECC0341@oracle.com> References: <14C6583D-9ECA-4F00-A945-6D8AB06DA91A@sap.com> <47E36D2D-DF1C-477F-A7FD-353BDECC0341@oracle.com> Message-ID: <1BF83A81-8CBF-43F8-BEC5-A8706206989C@sap.com> Thanks for reviewing, Vladimir! Regards, Lutz ?On 23.05.19, 18:43, "Vladimir Kozlov" wrote: Looks good. 
Thanks Vladimir > On May 23, 2019, at 9:05 AM, Schmidt, Lutz wrote: > > Dear All, > > I would like to request reviews for this little patch, fixing an issue with finding the hsdis disassembler library: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8224672 > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8224672.00/ > > Tracing the library search locations now shows: > Trying to load: /jdk/lib/server/libhsdis-amd64.dylib > Trying to load: /jdk/lib/server/hsdis-amd64.dylib > Trying to load: /jdk/lib/hsdis-amd64.dylib > Trying to load: hsdis-amd64.dylib via LD_LIBRARY_PATH or equivalent > Loaded disassembler from hsdis-amd64.dylib > > ~shade: thank you Aleksey for discovering the issue. > > Regards, > Lutz > > > From shade at redhat.com Thu May 23 16:51:59 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 23 May 2019 18:51:59 +0200 Subject: RFR(S): 8224672: (lib)hsdis-.so search incorrect after JDK-8213084 In-Reply-To: <14C6583D-9ECA-4F00-A945-6D8AB06DA91A@sap.com> References: <14C6583D-9ECA-4F00-A945-6D8AB06DA91A@sap.com> Message-ID: On 5/23/19 6:05 PM, Schmidt, Lutz wrote: > Bug:????https://bugs.openjdk.java.net/browse/JDK-8224672 > Webrev:?https://cr.openjdk.java.net/~lucy/webrevs/8224672.00/ Thanks, I checked it restores my workflow (and I use libhsdis explicitly everywhere). The patch looks good too. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From lutz.schmidt at sap.com Thu May 23 17:00:26 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Thu, 23 May 2019 17:00:26 +0000 Subject: RFR(S): 8224672: (lib)hsdis-.so search incorrect after JDK-8213084 In-Reply-To: References: <14C6583D-9ECA-4F00-A945-6D8AB06DA91A@sap.com> Message-ID: <97433E11-2E7A-441F-9CB5-3975FD21E4C0@sap.com> Thanks for your review, Aleksey! Will only push the change tomorrow. 
Regards,
Lutz

On 23.05.19, 18:51, "Aleksey Shipilev" wrote:

    On 5/23/19 6:05 PM, Schmidt, Lutz wrote:
    > Bug: https://bugs.openjdk.java.net/browse/JDK-8224672
    > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8224672.00/

    Thanks, I checked it restores my workflow (and I use libhsdis explicitly everywhere). The patch looks good too.

    -Aleksey

From per.liden at oracle.com Thu May 23 19:32:42 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 23 May 2019 21:32:42 +0200
Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC
In-Reply-To:
References:
Message-ID:

Hi Nils,

On 2019-05-23 16:25, Nils Eliasson wrote:
[...]
> The wart that was fixup_partial_loads in zHeap has also been made
> redundant.

We should also be able to remove the function, task and closure for this:

class ZFixupPartialLoadsClosure : public ZRootsIteratorClosure {
public:
  virtual void do_oop(oop* p) {
    ZBarrier::mark_barrier_on_root_oop_field(p);
  }

  virtual void do_oop(narrowOop* p) {
    ShouldNotReachHere();
  }
};

class ZFixupPartialLoadsTask : public ZTask {
private:
  ZThreadRootsIterator _thread_roots;

public:
  ZFixupPartialLoadsTask() :
      ZTask("ZFixupPartialLoadsTask"),
      _thread_roots() {}

  virtual void work() {
    ZFixupPartialLoadsClosure cl;
    _thread_roots.oops_do(&cl);
  }
};

void ZHeap::fixup_partial_loads() {
  ZFixupPartialLoadsTask task;
  _workers.run_parallel(&task);
}

cheers,
Per

> Testing:
>
> Hotspot tier 1-6, CTW, jcstress, micros, runthese, kitchensink, and then
> some. All with -XX:+ZVerifyViews.
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8224675 > > Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/ > > > Please review, > > Regards, > > Nils > From wangxue at loongson.cn Fri May 24 07:12:05 2019 From: wangxue at loongson.cn (Wang Xue) Date: Fri, 24 May 2019 15:12:05 +0800 Subject: RFR(trival): 8224723: [TESTBUG] compiler/arraycopy/TestArrayCopyWithBadOffset.java failed Message-ID: <08cad8cb-b149-b603-77ab-983a39cde08f@loongson.cn> Hi all, Bug:? https://bugs.openjdk.java.net/browse/JDK-8224723 It can be fixed by --------------------------------------------------- diff -r ecb7b9a98f0e test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java --- a/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Thu May 23 14:14:13 2019 -0700 +++ b/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Fri May 24 14:10:19 2019 +0800 @@ -25,6 +25,7 @@ ? * @test ? * @bug 8224539 ? * @summary Test arraycopy optimizations with bad src/dst array offsets. + * @requires (vm.debug == true) ? * @run main/othervm -Xbatch -XX:+AlwaysIncrementalInline ? * compiler.arraycopy.TestArrayCopyWithBadOffset ? */ --------------------------------------------------- Could you please review it? Thanks, Wang Xue From tobias.hartmann at oracle.com Fri May 24 07:16:32 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 24 May 2019 09:16:32 +0200 Subject: RFR(trival): 8224723: [TESTBUG] compiler/arraycopy/TestArrayCopyWithBadOffset.java failed In-Reply-To: <08cad8cb-b149-b603-77ab-983a39cde08f@loongson.cn> References: <08cad8cb-b149-b603-77ab-983a39cde08f@loongson.cn> Message-ID: <4fc0598e-cca7-67c6-c367-11042208a28e@oracle.com> Hi, thanks for fixing this! Please just add -XX:+IgnoreUnrecognizedVMOptions to the tests @run statement such that it is also executed with a product build. Thanks, Tobias On 24.05.19 09:12, Wang Xue wrote: > Hi all, > > Bug:? 
https://bugs.openjdk.java.net/browse/JDK-8224723
>
> It can be fixed by
> ---------------------------------------------------
> diff -r ecb7b9a98f0e test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java
> --- a/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Thu May 23 14:14:13 2019 -0700
> +++ b/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Fri May 24 14:10:19 2019 +0800
> @@ -25,6 +25,7 @@
>   * @test
>   * @bug 8224539
>   * @summary Test arraycopy optimizations with bad src/dst array offsets.
> + * @requires (vm.debug == true)
>   * @run main/othervm -Xbatch -XX:+AlwaysIncrementalInline
>   *                   compiler.arraycopy.TestArrayCopyWithBadOffset
>   */
> ---------------------------------------------------
>
> Could you please review it?
>
> Thanks,
> Wang Xue
>
>

From wangxue at loongson.cn Fri May 24 07:40:51 2019
From: wangxue at loongson.cn (Wang Xue)
Date: Fri, 24 May 2019 15:40:51 +0800
Subject: RFR(trival): 8224723: [TESTBUG] compiler/arraycopy/TestArrayCopyWithBadOffset.java failed
In-Reply-To: <4fc0598e-cca7-67c6-c367-11042208a28e@oracle.com>
References: <08cad8cb-b149-b603-77ab-983a39cde08f@loongson.cn> <4fc0598e-cca7-67c6-c367-11042208a28e@oracle.com>
Message-ID: <06bb95a4-4d63-27dc-82d9-0e33414aba77@loongson.cn>

Hi Tobias,

Thanks for your suggestion.

Updated patch:
----------------------------------------------------------------
diff -r d84176dd57b0 test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java
--- a/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Thu May 23 18:47:24 2019 -0700
+++ b/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Fri May 24 15:31:55 2019 +0800
@@ -25,7 +25,7 @@
  * @test
  * @bug 8224539
  * @summary Test arraycopy optimizations with bad src/dst array offsets.
- * @run main/othervm -Xbatch -XX:+AlwaysIncrementalInline
+ * @run main/othervm -Xbatch -XX:+AlwaysIncrementalInline -XX:+IgnoreUnrecognizedVMOptions
  *                   compiler.arraycopy.TestArrayCopyWithBadOffset
  */
----------------------------------------------------------------

Thanks,
Wang Xue

On 5/24/19 15:16, Tobias Hartmann wrote:
> Hi,
>
> thanks for fixing this! Please just add -XX:+IgnoreUnrecognizedVMOptions to the test's @run statement
> such that it is also executed with a product build.
>
> Thanks,
> Tobias
>
> On 24.05.19 09:12, Wang Xue wrote:
>> Hi all,
>>
>> Bug:  https://bugs.openjdk.java.net/browse/JDK-8224723
>>
>> It can be fixed by
>> ---------------------------------------------------
>> diff -r ecb7b9a98f0e test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java
>> --- a/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Thu May 23 14:14:13 2019 -0700
>> +++ b/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Fri May 24 14:10:19 2019 +0800
>> @@ -25,6 +25,7 @@
>>   * @test
>>   * @bug 8224539
>>   * @summary Test arraycopy optimizations with bad src/dst array offsets.
>> + * @requires (vm.debug == true)
>>   * @run main/othervm -Xbatch -XX:+AlwaysIncrementalInline
>>   *                   compiler.arraycopy.TestArrayCopyWithBadOffset
>>   */
>> ---------------------------------------------------
>>
>> Could you please review it?
>> >> Thanks, >> Wang Xue >> >> From tobias.hartmann at oracle.com Fri May 24 07:51:41 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 24 May 2019 09:51:41 +0200 Subject: RFR(trival): 8224723: [TESTBUG] compiler/arraycopy/TestArrayCopyWithBadOffset.java failed In-Reply-To: <06bb95a4-4d63-27dc-82d9-0e33414aba77@loongson.cn> References: <08cad8cb-b149-b603-77ab-983a39cde08f@loongson.cn> <4fc0598e-cca7-67c6-c367-11042208a28e@oracle.com> <06bb95a4-4d63-27dc-82d9-0e33414aba77@loongson.cn> Message-ID: <823a693a-c817-174f-8f5f-9c2b3d4b8237@oracle.com> Hi Wang, looks good, but the IgnoreUnrecognizedVMOptions should come first. I've changed that and sponsored your fix: http://hg.openjdk.java.net/jdk/jdk/rev/e93621d4db2c Thanks, Tobias On 24.05.19 09:40, Wang Xue wrote: > Hi Tobias, > > Thanks for your suggestion. > > Update the patch > ---------------------------------------------------------------- > diff -r d84176dd57b0 test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java > --- a/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Thu May 23 18:47:24 2019 > -0700 > +++ b/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Fri May 24 15:31:55 2019 > +0800 > @@ -25,7 +25,7 @@ > ? * @test > ? * @bug 8224539 > ? * @summary Test arraycopy optimizations with bad src/dst array offsets. > - * @run main/othervm -Xbatch -XX:+AlwaysIncrementalInline > + * @run main/othervm -Xbatch -XX:+AlwaysIncrementalInline -XX:+IgnoreUnrecognizedVMOptions > ? * compiler.arraycopy.TestArrayCopyWithBadOffset > ? */ > ---------------------------------------------------------------- > > > Thanks, > Wang Xue > > ? 5/24/19 15:16, Tobias Hartmann ??: >> Hi, >> >> thanks for fixing this! Please just add -XX:+IgnoreUnrecognizedVMOptions to the tests @run statement >> such that it is also executed with a product build. >> >> Thanks, >> Tobias >> >> On 24.05.19 09:12, Wang Xue wrote: >>> Hi all, >>> >>> Bug:? 
https://bugs.openjdk.java.net/browse/JDK-8224723 >>> >>> It can be fixed by >>> --------------------------------------------------- >>> diff -r ecb7b9a98f0e test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java >>> --- a/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Thu May 23 14:14:13 2019 >>> -0700 >>> +++ b/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Fri May 24 14:10:19 2019 >>> +0800 >>> @@ -25,6 +25,7 @@ >>> ?? * @test >>> ?? * @bug 8224539 >>> ?? * @summary Test arraycopy optimizations with bad src/dst array offsets. >>> + * @requires (vm.debug == true) >>> ?? * @run main/othervm -Xbatch -XX:+AlwaysIncrementalInline >>> ?? * compiler.arraycopy.TestArrayCopyWithBadOffset >>> ?? */ >>> --------------------------------------------------- >>> >>> Could you please review it? >>> >>> Thanks, >>> Wang Xue >>> >>> > From wangxue at loongson.cn Fri May 24 08:05:33 2019 From: wangxue at loongson.cn (Wang Xue) Date: Fri, 24 May 2019 16:05:33 +0800 Subject: RFR(trival): 8224723: [TESTBUG] compiler/arraycopy/TestArrayCopyWithBadOffset.java failed In-Reply-To: <823a693a-c817-174f-8f5f-9c2b3d4b8237@oracle.com> References: <08cad8cb-b149-b603-77ab-983a39cde08f@loongson.cn> <4fc0598e-cca7-67c6-c367-11042208a28e@oracle.com> <06bb95a4-4d63-27dc-82d9-0e33414aba77@loongson.cn> <823a693a-c817-174f-8f5f-9c2b3d4b8237@oracle.com> Message-ID: <986b60fa-c025-9c5d-668e-6915fbba5914@loongson.cn> Hi Tobias, Thank you very much. Wang Xue ? 5/24/19 15:51, Tobias Hartmann ??: > Hi Wang, > > looks good, but the IgnoreUnrecognizedVMOptions should come first. > > I've changed that and sponsored your fix: > http://hg.openjdk.java.net/jdk/jdk/rev/e93621d4db2c > > Thanks, > Tobias > > On 24.05.19 09:40, Wang Xue wrote: >> Hi Tobias, >> >> Thanks for your suggestion. 
>>
>> Update the patch
>> ----------------------------------------------------------------
>> diff -r d84176dd57b0 test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java
>> --- a/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Thu May 23 18:47:24 2019
>> -0700
>> +++ b/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Fri May 24 15:31:55 2019
>> +0800
>> @@ -25,7 +25,7 @@
>>   * @test
>>   * @bug 8224539
>>   * @summary Test arraycopy optimizations with bad src/dst array offsets.
>> - * @run main/othervm -Xbatch -XX:+AlwaysIncrementalInline
>> + * @run main/othervm -Xbatch -XX:+AlwaysIncrementalInline -XX:+IgnoreUnrecognizedVMOptions
>>   * compiler.arraycopy.TestArrayCopyWithBadOffset
>>   */
>> ----------------------------------------------------------------
>>
>>
>> Thanks,
>> Wang Xue
>>
>> On 5/24/19 15:16, Tobias Hartmann wrote:
>>> Hi,
>>>
>>> thanks for fixing this! Please just add -XX:+IgnoreUnrecognizedVMOptions to the test's @run statement
>>> such that it is also executed with a product build.
>>>
>>> Thanks,
>>> Tobias
>>>
>>> On 24.05.19 09:12, Wang Xue wrote:
>>>> Hi all,
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8224723
>>>>
>>>> It can be fixed by
>>>> ---------------------------------------------------
>>>> diff -r ecb7b9a98f0e test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java
>>>> --- a/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Thu May 23 14:14:13 2019
>>>> -0700
>>>> +++ b/test/hotspot/jtreg/compiler/arraycopy/TestArrayCopyWithBadOffset.java Fri May 24 14:10:19 2019
>>>> +0800
>>>> @@ -25,6 +25,7 @@
>>>>    * @test
>>>>    * @bug 8224539
>>>>    * @summary Test arraycopy optimizations with bad src/dst array offsets.
>>>> + * @requires (vm.debug == true)
>>>>    * @run main/othervm -Xbatch -XX:+AlwaysIncrementalInline
>>>>    * compiler.arraycopy.TestArrayCopyWithBadOffset
>>>>
*/ >>>> --------------------------------------------------- >>>> >>>> Could you please review it? >>>> >>>> Thanks, >>>> Wang Xue >>>> >>>> From fujie at loongson.cn Fri May 24 08:53:37 2019 From: fujie at loongson.cn (Jie Fu) Date: Fri, 24 May 2019 16:53:37 +0800 Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached In-Reply-To: <4c2da2fb-7550-d51b-539a-4656fc67bb00@oracle.com> References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com> <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn> <259a914e-1c9c-c884-6114-6f855a96afb6@loongson.cn> <1060f01d-dcfa-3a04-284d-1c6a95c791fc@oracle.com> <4c2da2fb-7550-d51b-539a-4656fc67bb00@oracle.com> Message-ID: <38b00331-34f7-b4a8-f033-f1489a154806@loongson.cn> Hi Vladimir Ivanov and all, What do you think of this version: http://cr.openjdk.java.net/~jiefu/8224162/webrev.05/ Could you please give me some comments? Thanks a lot. Best regards, Jie On 2019/5/23 ??6:21, Vladimir Ivanov wrote: > I'm still in favor of fixing the root cause than putting band-aids in > otherwise perfectly valid code. > > I don't consider fixing CounterData::count() and its usages to > properly handle overflow as overly complicated. There's a limited > number of usages and they don't properly handle overflow as well. So, > fixing the bug is highly desireable even though it has been left > unnoticed for a long time. From martin.doerr at sap.com Fri May 24 09:33:39 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 24 May 2019 09:33:39 +0000 Subject: RFR(M): 8223660: jtreg: Decouple Unsafe from RTM tests In-Reply-To: <33df6862-948f-b6c4-8768-193f54127c20@linux.vnet.ibm.com> References: <33df6862-948f-b6c4-8768-193f54127c20@linux.vnet.ibm.com> Message-ID: Hi Gustavo, looks good to me. Tested on AIX. The previously failing tests have passed. 
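A minimal sketch of why a test-owned native method is less brittle than one borrowed from Unsafe, as the change quoted below argues. The class NativeProvokerSketch, its nested Provoker class and the doAbort() name are hypothetical and are not taken from the webrev. The native declaration is only inspected by reflection and never called, so no native library is needed to run this:

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class NativeProvokerSketch {
    // Hypothetical stand-in for a provoker class owned by the test itself.
    // The method is declared native but never invoked in this sketch.
    public static class Provoker {
        public static native int doAbort();
    }

    // Reports whether the named no-argument method is declared native.
    public static boolean isNative(Class<?> owner, String name) {
        try {
            Method m = owner.getDeclaredMethod(name);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + name, e);
        }
    }

    public static void main(String[] args) {
        // The test controls this declaration, so it stays native.
        System.out.println(isNative(Provoker.class, "doAbort"));   // true
        // An ordinary Java method, shown for contrast.
        System.out.println(isNative(Object.class, "toString"));    // false
    }
}
```

A test that depends on a JDK-internal method being native has no such guarantee, which is the failure mode described in the quoted request.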
Best regards, Martin > -----Original Message----- > From: Gustavo Romero > Sent: Mittwoch, 22. Mai 2019 21:26 > To: hotspot-compiler-dev at openjdk.java.net > Cc: Doerr, Martin ; Lindenmaier, Goetz > ; vladimir.kozlov at oracle.com > Subject: RFR(M): 8223660: jtreg: Decouple Unsafe from RTM tests > > Hi, > > Could I get reviews for the following change please? > > Bug : https://bugs.openjdk.java.net/browse/JDK-8223660 > Webrev: http://cr.openjdk.java.net/~gromero/8223660/v1/ > > It removes from the RTM jtreg tests the use of Unsafe native methods as > abort provokers in a transaction. > > Relying on Unsafe native methods to abort a transaction makes the RTM test > brittle because an Unsafe native method can be converted to non-native at > any time, breaking the RTM tests. This is the second time it happens. > > This change removes the use of Unsafe native methods (currently, pageSize() > is used, but is not native anymore) and adds an isolated native method in > library libXAbortProvoker.so for the XAbortProvoker class that will be used > to abort transactions in the RTM tests, turning the RTM jtreg tests more > self-contained and so less brittle. > > I tested the change on x86_64 and PPC64, w/ RTM/HTM CPU feature > available. > > Thank you! > > Best regards, > Gustavo From nils.eliasson at oracle.com Fri May 24 09:40:39 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 24 May 2019 11:40:39 +0200 Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC In-Reply-To: <59c61d94-fdb3-c380-1ff7-dd1fbc5752c5@redhat.com> References: <59c61d94-fdb3-c380-1ff7-dd1fbc5752c5@redhat.com> Message-ID: <1a5dc2fd-cd59-4042-798e-b261c9017368@oracle.com> Hi Roman, I removed the method. I have verified that Shenandoah builds and isn't obviously broken. Webrev updated in place. 
Regards, Nils On 2019-05-23 16:31, Roman Kennke wrote: > Quick glance showed a problem: you are renaming/moving > BarrierSetC2::add_users_to_worklist() but ShenandoahBarrierSetC2 is not > updated accordingly. > > Roman > > >> Hi, >> >> In ZGC we use load barriers on references. In the original >> implementation these where added as macro nodes at parse time. The load >> barrier node consumes and produces control flow in order to be able to >> be lowered into a check with a slow path late. The load barrier nodes >> are fixed in the control flow, and extensions to different optimizations >> are need the barriers out of loop and past other unrelated control flow. >> >> With this patch the barriers are instead added after the loop >> optimizations, before macro node expansion. This makes the entire >> pipeline until that point oblivious about the barriers. A dump of the IR >> with ZGC or EpsilonGC will be basically identical at that point, and the >> diff compared to serialGC or ParallelGC that use write barriers is >> really small. >> >> Benefits >> >> - A major complexity reduction. One can reason about and implement loop >> optimization without caring about the barriers. The escape analysis >> doesn't need to know about the barriers. Loads float freely like they >> are supposed to. >> >> - Less nodes early. The inlining will become more deterministic. A >> barrier heavy GC will not run into node limits earlier. Also node limit >> bounded optimization like unrolling and peeling will not be penalized by >> barriers. >> >> - Better test coverage, or reduce testing cost when the same >> optimization doesn't need to be verified with every GC. >> >> - Better control on where barriers end up. It is trivial to guarantee >> that the load and barriers are not separated by a safepoint. >> >> Design >> >> The implementation uses an extra phase that piggy back on PhaseIdealLoop >> which provides control and dominator information for all loads. 
This >> extra phase is needed because we need to splice the control flow when >> adding the load barriers. >> >> Barriers are inserted on the loads nodes in post order (any successor >> first). This is to guarantee the dominator information above every >> insertion is correct. This is also important within blocks. Two loads in >> the same block can float in relation to each other. The addition of >> barriers serializes their order. Any def-use relationship is upheld by >> expanding them post order. >> >> Barrier insertion is done in stages. In this first stage a single macro >> node that represents the barrier is added with all dependencies that is >> required. In the macro expansion phase the barrier nodes is expanded >> into the final shape, adding nodes that represent the conditional load >> barrier check. (Write barriers in other GCs could possibly be expanded >> here directly) >> >> All the barriers that are needed for unsafe reference operations (cas, >> swap, cmpx) are also expanded late. They already have control flow, so >> the expansion is straight forward. >> >> The barriers for the unsafe reference operations (cas, getandset, cmpx) >> have also been simplified. The cas-load-cas dance have been replaced by >> a pre-load. The pre-load is a load with a barrier, that is kept alive by >> an extra (required) edge on the unsafe-primitive-nodes (specialized as >> ZCompareAndSwap, ZGetAndSet, ZCompareAndExchange). >> >> One challenge that was encountered early and that have caused >> considerable work is that nodes (like loads) can end up between calls >> and their catch projections. This is usually handled after matching, in >> PhaseCFG::call_catch_cleanup, where the nodes after the call are cloned >> to all catch blocks. At this stage they are in an ordered list, so that >> is a straight forward process. For late barrier insertion we need to >> splice in control earlier, before matching, and control flow between >> calls and catches is not allowed. 
This requires us to add a >> transformation pass where all loads and their dependent instructions are >> cloned out to the catch blocks before we can start splicing in control >> flow. This transformation doesn't replace the legacy call_catch_cleanup >> fully, but it could be a future goal. >> >> In the original barrier implementation there where two different load >> barrier implementations: the basic and the optimized. With the new >> approach to barriers on unsafe, the basic is no longer required and has >> been removed. (It provided options for skipping the self healing, and >> passed the ref in a register, guaranteeing that the oop wasn't reloaded.) >> >> The wart that was fixup_partial_loads in zHeap has also been made >> redundant. >> >> Dominating barriers are no longer removed on weak loads. Weak barriers >> doesn't guarantee self-healing. >> >> Follow up work: >> >> - Consolidate all uses of GrowableArray::insert_sorted to use the new >> version >> >> - Refactor the phases. There are a lot of simplifications and >> verification that can be done with more well defined phases. >> >> - Simplify the remaining barrier optimizations. There might still be >> code paths that are no longer needed. >> >> >> Testing: >> >> Hotspot tier 1-6, CTW, jcstress, micros, runthese, kitchensink, and then >> some. All with -XX:+ZVerifyViews. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8224675 >> >> Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/ >> >> >> Please review, >> >> Regards, >> >> Nils >> From nils.eliasson at oracle.com Fri May 24 09:41:22 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 24 May 2019 11:41:22 +0200 Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC In-Reply-To: References: Message-ID: <7be41f00-bede-c64a-2bc8-2c4b9981f309@oracle.com> Hi Per, I removed the code and updated the webrev. Thanks, Nils On 2019-05-23 21:32, Per Liden wrote: > Hi Nils, > > On 2019-05-23 16:25, Nils Eliasson wrote: > [...] 
>> The wart that was fixup_partial_loads in zHeap has also been made
>> redundant.
>
> We should also be able to remove the function, task and closure for this:
>
> class ZFixupPartialLoadsClosure : public ZRootsIteratorClosure {
> public:
>   virtual void do_oop(oop* p) {
>     ZBarrier::mark_barrier_on_root_oop_field(p);
>   }
>
>   virtual void do_oop(narrowOop* p) {
>     ShouldNotReachHere();
>   }
> };
>
> class ZFixupPartialLoadsTask : public ZTask {
> private:
>   ZThreadRootsIterator _thread_roots;
>
> public:
>   ZFixupPartialLoadsTask() :
>       ZTask("ZFixupPartialLoadsTask"),
>       _thread_roots() {}
>
>   virtual void work() {
>     ZFixupPartialLoadsClosure cl;
>     _thread_roots.oops_do(&cl);
>   }
> };
>
> void ZHeap::fixup_partial_loads() {
>   ZFixupPartialLoadsTask task;
>   _workers.run_parallel(&task);
> }
>
> cheers,
> Per
>
>> Testing:
>>
>> Hotspot tier 1-6, CTW, jcstress, micros, runthese, kitchensink, and
>> then some. All with -XX:+ZVerifyViews.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8224675
>>
>> Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/
>>
>>
>> Please review,
>>
>> Regards,
>>
>> Nils
>>

From rkennke at redhat.com Fri May 24 09:42:09 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 24 May 2019 11:42:09 +0200
Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC
In-Reply-To: <1a5dc2fd-cd59-4042-798e-b261c9017368@oracle.com>
References: <59c61d94-fdb3-c380-1ff7-dd1fbc5752c5@redhat.com> <1a5dc2fd-cd59-4042-798e-b261c9017368@oracle.com>
Message-ID:

Hi Nils,

> I removed the method. I have verified that Shenandoah builds and isn't
> obviously broken.
>
> Webrev updated in place.

Thanks! Will review more thoroughly and run some tests later.

Nice approach, btw!
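The "post order (any successor first)" rule from the design notes quoted below can be illustrated with a small graph walk. This is only a sketch under assumptions: PostOrderSketch, the string node names and the successor map are hypothetical and are unrelated to the actual C2 node classes or to the code in the webrev.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PostOrderSketch {
    // Returns the nodes reachable from root so that every successor of a
    // node appears in the result before the node itself.
    public static List<String> postOrder(String root, Map<String, List<String>> successors) {
        List<String> order = new ArrayList<>();
        visit(root, successors, new HashSet<>(), order);
        return order;
    }

    private static void visit(String node, Map<String, List<String>> successors,
                              Set<String> seen, List<String> order) {
        if (!seen.add(node)) {
            return; // already handled via another path
        }
        for (String next : successors.getOrDefault(node, List.of())) {
            visit(next, successors, seen, order);
        }
        order.add(node); // emitted only after all successors
    }

    public static void main(String[] args) {
        // Hypothetical graph: two loads that both feed the same use.
        Map<String, List<String>> graph = Map.of(
            "entry", List.of("loadA", "loadB"),
            "loadA", List.of("use"),
            "loadB", List.of("use"));
        System.out.println(postOrder("entry", graph)); // [use, loadA, loadB, entry]
    }
}
```

Processing in this order means that whatever lies above the node currently being handled has not been touched yet, which is the property the quoted text relies on for keeping dominator information valid above each insertion point.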
Roman > Regards, > > Nils > > On 2019-05-23 16:31, Roman Kennke wrote: >> Quick glance showed a problem: you are renaming/moving >> BarrierSetC2::add_users_to_worklist() but ShenandoahBarrierSetC2 is not >> updated accordingly. >> >> Roman >> >> >>> Hi, >>> >>> In ZGC we use load barriers on references. In the original >>> implementation these where added as macro nodes at parse time. The load >>> barrier node consumes and produces control flow in order to be able to >>> be lowered into a check with a slow path late. The load barrier nodes >>> are fixed in the control flow, and extensions to different optimizations >>> are need the barriers out of loop and past other unrelated control flow. >>> >>> With this patch the barriers are instead added after the loop >>> optimizations, before macro node expansion. This makes the entire >>> pipeline until that point oblivious about the barriers. A dump of the IR >>> with ZGC or EpsilonGC will be basically identical at that point, and the >>> diff compared to serialGC or ParallelGC that use write barriers is >>> really small. >>> >>> Benefits >>> >>> - A major complexity reduction. One can reason about and implement loop >>> optimization without caring about the barriers. The escape analysis >>> doesn't need to know about the barriers. Loads float freely like they >>> are supposed to. >>> >>> - Less nodes early. The inlining will become more deterministic. A >>> barrier heavy GC will not run into node limits earlier. Also node limit >>> bounded optimization like unrolling and peeling will not be penalized by >>> barriers. >>> >>> - Better test coverage, or reduce testing cost when the same >>> optimization doesn't need to be verified with every GC. >>> >>> - Better control on where barriers end up. It is trivial to guarantee >>> that the load and barriers are not separated by a safepoint. 
>>> >>> Design >>> >>> The implementation uses an extra phase that piggy back on PhaseIdealLoop >>> which provides control and dominator information for all loads. This >>> extra phase is needed because we need to splice the control flow when >>> adding the load barriers. >>> >>> Barriers are inserted on the loads nodes in post order (any successor >>> first). This is to guarantee the dominator information above every >>> insertion is correct. This is also important within blocks. Two loads in >>> the same block can float in relation to each other. The addition of >>> barriers serializes their order. Any def-use relationship is upheld by >>> expanding them post order. >>> >>> Barrier insertion is done in stages. In this first stage a single macro >>> node that represents the barrier is added with all dependencies that is >>> required. In the macro expansion phase the barrier nodes is expanded >>> into the final shape, adding nodes that represent the conditional load >>> barrier check. (Write barriers in other GCs could possibly be expanded >>> here directly) >>> >>> All the barriers that are needed for unsafe reference operations (cas, >>> swap, cmpx) are also expanded late. They already have control flow, so >>> the expansion is straight forward. >>> >>> The barriers for the unsafe reference operations (cas, getandset, cmpx) >>> have also been simplified. The cas-load-cas dance have been replaced by >>> a pre-load. The pre-load is a load with a barrier, that is kept alive by >>> an extra (required) edge on the unsafe-primitive-nodes (specialized as >>> ZCompareAndSwap, ZGetAndSet, ZCompareAndExchange). >>> >>> One challenge that was encountered early and that have caused >>> considerable work is that nodes (like loads) can end up between calls >>> and their catch projections. This is usually handled after matching, in >>> PhaseCFG::call_catch_cleanup, where the nodes after the call are cloned >>> to all catch blocks. 
At this stage they are in an ordered list, so that >>> is a straight forward process. For late barrier insertion we need to >>> splice in control earlier, before matching, and control flow between >>> calls and catches is not allowed. This requires us to add a >>> transformation pass where all loads and their dependent instructions are >>> cloned out to the catch blocks before we can start splicing in control >>> flow. This transformation doesn't replace the legacy call_catch_cleanup >>> fully, but it could be a future goal. >>> >>> In the original barrier implementation there where two different load >>> barrier implementations: the basic and the optimized. With the new >>> approach to barriers on unsafe, the basic is no longer required and has >>> been removed. (It provided options for skipping the self healing, and >>> passed the ref in a register, guaranteeing that the oop wasn't >>> reloaded.) >>> >>> The wart that was fixup_partial_loads in zHeap has also been made >>> redundant. >>> >>> Dominating barriers are no longer removed on weak loads. Weak barriers >>> doesn't guarantee self-healing. >>> >>> Follow up work: >>> >>> - Consolidate all uses of GrowableArray::insert_sorted to use the new >>> version >>> >>> - Refactor the phases. There are a lot of simplifications and >>> verification that can be done with more well defined phases. >>> >>> - Simplify the remaining barrier optimizations. There might still be >>> code paths that are no longer needed. >>> >>> >>> Testing: >>> >>> Hotspot tier 1-6, CTW, jcstress, micros, runthese, kitchensink, and then >>> some. All with -XX:+ZVerifyViews. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8224675 >>> >>> Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/ >>> >>> >>> Please review, >>> >>> Regards, >>> >>> Nils >>> -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL:

From adinn at redhat.com Fri May 24 10:06:41 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 24 May 2019 11:06:41 +0100
Subject: RFR: 8207851: Implement JEP 352
In-Reply-To: <80da32b2-7acb-7b94-b82c-5dcd5cf95539@redhat.com>
References: <80da32b2-7acb-7b94-b82c-5dcd5cf95539@redhat.com>
Message-ID: <67ad071d-376b-8b0d-9b2f-42dca17a1041@redhat.com>

Ping! Any takers for a review?

Also, can anyone advise me on what I might need to do to target this JEP to JDK13, other than the obvious reviewing and pushing of the implementation?

regards,

Andrew Dinn
-----------

On 23/05/2019 11:55, Andrew Dinn wrote:
> Hi,
>
> Could I please have reviews of the following change set which implements
> JEP 352:
>
> JEP: https://openjdk.java.net/jeps/352
> JIRA: https://bugs.openjdk.java.net/browse/JDK-8207851
> webrev: http://cr.openjdk.java.net/~adinn/8207851/webrev.00/
>
> I would also very much like to target this implementation for JDK13.
>
> Testing:
>
> The webrev includes a simple test (in directory
> test/jdk/java/nio/MappedByteBuffer) which ensures that an NVRAM-backed
> MappedByteBuffer can be created, updated and forced using cache line
> flushes. This test is marked as ignored because it requires, inter alia,
> a suitably configured host, fitted with an NVRAM DIMM device or
> employing a pseudo-NVRAM device simulated over volatile RAM.
>
> The above test has been run successfully on Linux x86_64 with an Optane
> DIMM and with a pseudo-NVRAM device. Further, more rigorous testing has
> been done in both the above configurations using the Narayana
> Transactions logger and Infinispan distributed data grid.
>
> Testing of /successful/ use of the API on Linux AArch64 has not yet been
> possible with either emulated or real NVRAM devices as it requires an
> updated (ARMv8.2) CPU hardware capability as well as access to AArch64
> compatible NVRAM devices. n.b.
an AArch64 compatibility flag
> (-X:UsePOCForPOP) has been provided in the current patch to support
> testing on older CPUs using simulated NVRAM. Unfortunately, it has not
> yet been possible to obtain access to an AArch64 v8.1 machine that
> supports simulation of NVRAM devices via volatile RAM.
>
> In consequence, AArch64 testing has been limited to ensuring that the
> relevant API failure modes correctly manifest: i.e.
>
> v8.1 CPUs which lack the relevant hardware instructions refuse to map
> NVRAM-backed buffers, throwing UnsupportedOperationException
>
> v8.1 CPUs which bypass this failure via compatibility mode fail at the
> mmap stage with IOException due to lack of NVRAM mapping support in the
> underlying OS mmap API
>
> It is expected that the omissions in AArch64 testing will be rectified
> in the next few weeks. While this is desirable, the omissions are not
> viewed as critical since there is currently no general access to the
> relevant hardware.
>
> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
>

From gromero at linux.vnet.ibm.com Fri May 24 12:26:05 2019
From: gromero at linux.vnet.ibm.com (Gustavo Romero)
Date: Fri, 24 May 2019 09:26:05 -0300
Subject: RFR(M): 8223660: jtreg: Decouple Unsafe from RTM tests
In-Reply-To:
References: <33df6862-948f-b6c4-8768-193f54127c20@linux.vnet.ibm.com>
Message-ID: <1f83213d-d5db-4ed8-3ab7-46835b6da08b@linux.vnet.ibm.com>

On 05/24/2019 06:33 AM, Doerr, Martin wrote:
> looks good to me. Tested on AIX. The previously failing tests have passed.

Thanks for the review, Martin!
Best regards, Gustavo From lutz.schmidt at sap.com Fri May 24 12:28:37 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Fri, 24 May 2019 12:28:37 +0000 Subject: RFR(T): 8224742: JLONG_FORMAT_W incompatible with type jlong Message-ID: <686635EA-BA76-4931-B62D-8052623B35A8@sap.com> Hi all, MacOS build was broken by JDK-8224652. May I please request reviews for this tiny, maybe trivial, build fix? Bug: https://bugs.openjdk.java.net/browse/JDK-8224742 Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8224742.00/ Tested locally on MacOS and Linux. The added line is protected by #if defined(_LP64) && defined(__APPLE__) Thanks, Lutz Dr. Lutz Schmidt | SAP JVM & SapMachine | TI SAP CP Core | T: +49 (6227) 7-42834 http://sapjvm:1080 http://sapmachine.io From tobias.hartmann at oracle.com Fri May 24 12:39:45 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 24 May 2019 14:39:45 +0200 Subject: RFR(T): 8224742: JLONG_FORMAT_W incompatible with type jlong In-Reply-To: <686635EA-BA76-4931-B62D-8052623B35A8@sap.com> References: <686635EA-BA76-4931-B62D-8052623B35A8@sap.com> Message-ID: <721fb281-c53e-b5b8-0c56-a96487554d53@oracle.com> Hi Lutz, looks good and trivial to me but you can remove the extra whitespace between "%" and "#width". Thanks, Tobias On 24.05.19 14:28, Schmidt, Lutz wrote: > Hi all, > > MacOS build was broken by JDK-8224652. May I please request reviews for this tiny, maybe trivial, build fix? > > Bug: https://bugs.openjdk.java.net/browse/JDK-8224742 > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8224742.00/ > > Tested locally on MacOS and Linux. The added line is protected by > #if defined(_LP64) && defined(__APPLE__) > > > Thanks, > Lutz > > > Dr. 
Lutz Schmidt | SAP JVM & SapMachine | TI SAP CP Core | T: +49 (6227) 7-42834
> http://sapjvm:1080
> http://sapmachine.io
>
>

From lutz.schmidt at sap.com Fri May 24 12:50:52 2019
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Fri, 24 May 2019 12:50:52 +0000
Subject: RFR(T): 8224742: JLONG_FORMAT_W incompatible with type jlong
In-Reply-To: <721fb281-c53e-b5b8-0c56-a96487554d53@oracle.com>
References: <686635EA-BA76-4931-B62D-8052623B35A8@sap.com> <721fb281-c53e-b5b8-0c56-a96487554d53@oracle.com>
Message-ID:

Hi Tobias,

thanks for reviewing. I'll remove the whitespace and push. jdk/submit testing is pending.

Regards,
Lutz

On 24.05.19, 14:39, "Tobias Hartmann" wrote:

    Hi Lutz,

    looks good and trivial to me but you can remove the extra whitespace between "%" and "#width".

    Thanks,
    Tobias

    On 24.05.19 14:28, Schmidt, Lutz wrote:
    > Hi all,
    >
    > MacOS build was broken by JDK-8224652. May I please request reviews for this tiny, maybe trivial, build fix?
    >
    > Bug: https://bugs.openjdk.java.net/browse/JDK-8224742
    > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8224742.00/
    >
    > Tested locally on MacOS and Linux. The added line is protected by
    > #if defined(_LP64) && defined(__APPLE__)
    >
    >
    > Thanks,
    > Lutz
    >
    >
    > Dr. Lutz Schmidt | SAP JVM & SapMachine | TI SAP CP Core | T: +49 (6227) 7-42834
    > http://sapjvm:1080
    > http://sapmachine.io
    >
    >

From christoph.langer at sap.com Fri May 24 12:53:38 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Fri, 24 May 2019 12:53:38 +0000
Subject: RFR(T): 8224742: JLONG_FORMAT_W incompatible with type jlong
In-Reply-To: <721fb281-c53e-b5b8-0c56-a96487554d53@oracle.com>
References: <686635EA-BA76-4931-B62D-8052623B35A8@sap.com> <721fb281-c53e-b5b8-0c56-a96487554d53@oracle.com>
Message-ID:

+1

My Mac build is fixed again

> -----Original Message-----
> From: hotspot-compiler-dev <hotspot-compiler-dev-bounces at openjdk.java.net> On Behalf Of Tobias Hartmann
> Sent: Friday, 24 May 2019 14:40
> To: Schmidt, Lutz; hotspot-compiler-
> dev at openjdk.java.net
> Subject: Re: RFR(T): 8224742: JLONG_FORMAT_W incompatible with type
> jlong
>
> Hi Lutz,
>
> looks good and trivial to me but you can remove the extra whitespace
> between "%" and "#width".
>
> Thanks,
> Tobias
>
> On 24.05.19 14:28, Schmidt, Lutz wrote:
> > Hi all,
> >
> > MacOS build was broken by JDK-8224652. May I please request reviews for
> this tiny, maybe trivial, build fix?
> >
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8224742
> > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8224742.00/
> >
> > Tested locally on MacOS and Linux. The added line is protected by
> > #if defined(_LP64) && defined(__APPLE__)
> >
> >
> > Thanks,
> > Lutz
> >
> >
> > Dr. Lutz Schmidt | SAP JVM & SapMachine | TI SAP CP Core | T: +49 (6227)
> 7-42834
> > http://sapjvm:1080
> > http://sapmachine.io
> >
> >

From lutz.schmidt at sap.com Fri May 24 13:07:14 2019
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Fri, 24 May 2019 13:07:14 +0000
Subject: RFR(T): 8224742: JLONG_FORMAT_W incompatible with type jlong
In-Reply-To:
References: <686635EA-BA76-4931-B62D-8052623B35A8@sap.com> <721fb281-c53e-b5b8-0c56-a96487554d53@oracle.com>
Message-ID:

Thank you, Christoph.

The push is on its way, slowly. Network performance is, well, sluggish today.

Regards,
Lutz

On 24.05.19, 14:53, "Langer, Christoph" wrote:

    +1

    My Mac build is fixed again

    > -----Original Message-----
    > From: hotspot-compiler-dev <hotspot-compiler-dev-bounces at openjdk.java.net> On Behalf Of Tobias Hartmann
    > Sent: Friday, 24 May 2019 14:40
    > To: Schmidt, Lutz; hotspot-compiler-
    > dev at openjdk.java.net
    > Subject: Re: RFR(T): 8224742: JLONG_FORMAT_W incompatible with type
    > jlong
    >
    > Hi Lutz,
    >
    > looks good and trivial to me but you can remove the extra whitespace
    > between "%" and "#width".
    >
    > Thanks,
    > Tobias
    >
    > On 24.05.19 14:28, Schmidt, Lutz wrote:
    > > Hi all,
    > >
    > > MacOS build was broken by JDK-8224652.
May I please request reviews for
    > this tiny, maybe trivial, build fix?
    > >
    > > Bug: https://bugs.openjdk.java.net/browse/JDK-8224742
    > > Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8224742.00/
    > >
    > > Tested locally on MacOS and Linux. The added line is protected by
    > > #if defined(_LP64) && defined(__APPLE__)
    > >
    > >
    > > Thanks,
    > > Lutz
    > >
    > >
    > > Dr. Lutz Schmidt | SAP JVM & SapMachine | TI SAP CP Core | T: +49 (6227)
    > 7-42834
    > > http://sapjvm:1080
    > > http://sapmachine.io
    > >
    > >

From stuart.monteith at linaro.org Fri May 24 14:37:24 2019
From: stuart.monteith at linaro.org (Stuart Monteith)
Date: Fri, 24 May 2019 15:37:24 +0100
Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC
In-Reply-To:
References:
Message-ID:

That's interesting, and seems beneficial for ZGC on aarch64, where
before your patch the ZGC load barriers broke assumptions the
memory-fence optimisation code was making.

I'm currently testing your patch, with the following put on top for aarch64:

diff -r ead187ebe684 src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad
--- a/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Fri May 24 13:11:48 2019 +0100
+++ b/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Fri May 24 15:34:17 2019 +0100
@@ -56,6 +56,8 @@
 //
 instruct loadBarrierSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{
   match(Set dst (LoadBarrierSlowReg mem));
+  predicate(!n->as_LoadBarrierSlowReg()->is_weak());
+
   effect(DEF dst, KILL cr);

   format %{"LoadBarrierSlowReg $dst, $mem" %}
@@ -70,7 +72,8 @@
 // Execute ZGC load barrier (weak) slow path
 //
 instruct loadBarrierWeakSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{
-  match(Set dst (LoadBarrierWeakSlowReg mem));
+  match(Set dst (LoadBarrierSlowReg mem));
+  predicate(n->as_LoadBarrierSlowReg()->is_weak());

   effect(DEF dst, KILL cr);

@@ -81,3 +84,60 @@
   %}
   ins_pipe(pipe_slow);
 %}
+
+
+// Specialized versions of compareAndExchangeP that adds a keepalive that is consumed
+// but doesn't affect output.
+
+instruct z_compareAndExchangeP(iRegPNoSp res, indirect mem,
+                               iRegP oldval, iRegP newval, iRegP keepalive,
+                               rFlagsReg cr) %{
+  match(Set oldval (ZCompareAndExchangeP (Binary mem keepalive) (Binary oldval newval)));
+  ins_cost(2 * VOLATILE_REF_COST);
+  effect(TEMP_DEF res, KILL cr);
+  format %{
+    "cmpxchg $res = $mem, $oldval, $newval\t# (ptr, weak) if $mem == $oldval then $mem <-- $newval"
+  %}
+  ins_encode %{
+    __ cmpxchg($mem$$Register, $oldval$$Register, $newval$$Register,
+               Assembler::xword, /*acquire*/ false, /*release*/ true,
+               /*weak*/ false, $res$$Register);
+  %}
+  ins_pipe(pipe_slow);
+%}
+
+instruct z_compareAndSwapP(iRegINoSp res,
+                           indirect mem,
+                           iRegP oldval, iRegP newval, iRegP keepalive,
+                           rFlagsReg cr) %{
+
+  match(Set res (ZCompareAndSwapP (Binary mem keepalive) (Binary oldval newval)));
+  match(Set res (ZWeakCompareAndSwapP (Binary mem keepalive) (Binary oldval newval)));
+
+  ins_cost(2 * VOLATILE_REF_COST);
+
+  effect(KILL cr);
+
+  format %{
+    "cmpxchg $mem, $oldval, $newval\t# (ptr) if $mem == $oldval then $mem <-- $newval"
+    "cset $res, EQ\t# $res <-- (EQ ? 1 : 0)"
+  %}
+
+  ins_encode(aarch64_enc_cmpxchg(mem, oldval, newval),
+             aarch64_enc_cset_eq(res));
+
+  ins_pipe(pipe_slow);
+%}
+
+
+instruct z_get_and_setP(indirect mem, iRegP newv, iRegPNoSp prev,
+                        iRegP keepalive) %{
+  match(Set prev (ZGetAndSetP mem (Binary newv keepalive)));
+
+  ins_cost(2 * VOLATILE_REF_COST);
+  format %{ "atomic_xchg $prev, $newv, [$mem]" %}
+  ins_encode %{
+    __ atomic_xchg($prev$$Register, $newv$$Register, as_Register($mem$$base));
+  %}
+  ins_pipe(pipe_serial);
+%}
\ No newline at end of file

On Thu, 23 May 2019 at 15:38, Nils Eliasson wrote:
>
> Hi,
>
> In ZGC we use load barriers on references. In the original
> implementation these where added as macro nodes at parse time. The load
> barrier node consumes and produces control flow in order to be able to
> be lowered into a check with a slow path late.
The load barrier nodes > are fixed in the control flow, and extensions to different optimizations > are needed to get the barriers out of loops and past other unrelated control flow. > > With this patch the barriers are instead added after the loop > optimizations, before macro node expansion. This makes the entire > pipeline until that point oblivious to the barriers. A dump of the IR > with ZGC or EpsilonGC will be basically identical at that point, and the > diff compared to SerialGC or ParallelGC that use write barriers is > really small. > > Benefits > > - A major complexity reduction. One can reason about and implement loop > optimization without caring about the barriers. The escape analysis > doesn't need to know about the barriers. Loads float freely like they > are supposed to. > > - Fewer nodes early. The inlining will become more deterministic. A > barrier-heavy GC will not run into node limits earlier. Also, node-limit-bounded > optimizations like unrolling and peeling will not be penalized by > barriers. > > - Better test coverage, or reduced testing cost when the same > optimization doesn't need to be verified with every GC. > > - Better control over where barriers end up. It is trivial to guarantee > that the load and barriers are not separated by a safepoint. > > Design > > The implementation uses an extra phase that piggybacks on PhaseIdealLoop > which provides control and dominator information for all loads. This > extra phase is needed because we need to splice the control flow when > adding the load barriers. > > Barriers are inserted on the load nodes in post order (any successor > first). This is to guarantee the dominator information above every > insertion is correct. This is also important within blocks. Two loads in > the same block can float in relation to each other. The addition of > barriers serializes their order. Any def-use relationship is upheld by > expanding them post order. > > Barrier insertion is done in stages.
In the first stage a single macro > node that represents the barrier is added with all the dependencies that are > required. In the macro expansion phase the barrier nodes are expanded > into the final shape, adding nodes that represent the conditional load > barrier check. (Write barriers in other GCs could possibly be expanded > here directly.) > > All the barriers that are needed for unsafe reference operations (cas, > swap, cmpx) are also expanded late. They already have control flow, so > the expansion is straightforward. > > The barriers for the unsafe reference operations (cas, getandset, cmpx) > have also been simplified. The cas-load-cas dance has been replaced by > a pre-load. The pre-load is a load with a barrier that is kept alive by > an extra (required) edge on the unsafe-primitive-nodes (specialized as > ZCompareAndSwap, ZGetAndSet, ZCompareAndExchange). > > One challenge that was encountered early and that has caused > considerable work is that nodes (like loads) can end up between calls > and their catch projections. This is usually handled after matching, in > PhaseCFG::call_catch_cleanup, where the nodes after the call are cloned > to all catch blocks. At this stage they are in an ordered list, so that > is a straightforward process. For late barrier insertion we need to > splice in control earlier, before matching, and control flow between > calls and catches is not allowed. This requires us to add a > transformation pass where all loads and their dependent instructions are > cloned out to the catch blocks before we can start splicing in control > flow. This transformation doesn't replace the legacy call_catch_cleanup > fully, but it could be a future goal. > > In the original barrier implementation there were two different load > barrier implementations: the basic and the optimized. With the new > approach to barriers on unsafe, the basic is no longer required and has > been removed.
(It provided options for skipping the self-healing, and > passed the ref in a register, guaranteeing that the oop wasn't reloaded.) > > The wart that was fixup_partial_loads in zHeap has also been made > redundant. > > Dominating barriers are no longer removed on weak loads. Weak barriers > don't guarantee self-healing. > > Follow up work: > > - Consolidate all uses of GrowableArray::insert_sorted to use the new > version > > - Refactor the phases. There are a lot of simplifications and > verification that can be done with better-defined phases. > > - Simplify the remaining barrier optimizations. There might still be > code paths that are no longer needed. > > > Testing: > > Hotspot tier 1-6, CTW, jcstress, micros, runthese, kitchensink, and then > some. All with -XX:+ZVerifyViews. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8224675 > > Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/ > > > Please review, > > Regards, > > Nils > From sergey.kuksenko at oracle.com Fri May 24 17:25:47 2019 From: sergey.kuksenko at oracle.com (Sergey Kuksenko) Date: Fri, 24 May 2019 10:25:47 -0700 Subject: RFR: 8223504: improve performance of forall loops by better inlining of "iterator()" methods. In-Reply-To: <51756d46-c2ad-5377-12ce-67208deb12a5@oracle.com> References: <58486996-d7da-30ab-77c2-b590395423c2@oracle.com> <92d61151-97ac-565a-1bfe-d25dd5ea1048@redhat.com> <359da83e-883b-de8a-0525-21ce4c797249@oracle.com> <294c512c-a613-7679-0b90-61e2fe015d3c@oracle.com> <51756d46-c2ad-5377-12ce-67208deb12a5@oracle.com> Message-ID: <29bca6d1-d63e-c046-88f9-c6ef21bd089b@oracle.com> Please review the next version: http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.03/ Now only the return type is checked. It works the same way on all forall loops. Besides, it works perfectly on "descendingIterator()" methods and all other methods that return an iterator but have other names, like Enumeration::asIterator.
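For illustration only (this is not part of the webrev): a minimal Java sketch of the three call-site shapes mentioned above, each of which obtains an Iterator once, before a loop that then drives it. The class name IteratorCallSites and its method names are hypothetical, and whether C2 actually inlines any of these calls depends on the patch and on the remaining size checks; the sketch only shows where the iterator-returning call sits relative to the loop.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;

// Hypothetical example class; names are for illustration only.
public class IteratorCallSites {

    // A forall loop: javac compiles this to a single values.iterator() call
    // followed by a loop over hasNext()/next().
    static int sumForEach(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Same shape, but the method returning the Iterator is not named "iterator".
    static int sumDescending(Deque<Integer> values) {
        int sum = 0;
        for (Iterator<Integer> it = values.descendingIterator(); it.hasNext(); ) {
            sum += it.next();
        }
        return sum;
    }

    // Enumeration::asIterator (Java 9+) also returns an Iterator before the loop.
    static int sumEnumeration(Enumeration<Integer> values) {
        int sum = 0;
        for (Iterator<Integer> it = values.asIterator(); it.hasNext(); ) {
            sum += it.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(1, 2, 3, 4);
        System.out.println(sumForEach(data));
        System.out.println(sumDescending(new ArrayDeque<>(data)));
        System.out.println(sumEnumeration(Collections.enumeration(data)));
    }
}
```

In each method the call that returns the Iterator executes once per method invocation while the loop body may execute many times, which is the "cold prolog, hot loop" situation described in the quoted discussion below.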
On 5/22/19 10:47 AM, Vladimir Ivanov wrote: > Thanks for the thorough explanation, Sergey. > > What I like about the patch you propose is its simplicity. > > A more accurate heuristic which takes inlining effects into account > (e.g., on EA) would scale to a much wider range of use cases and C2 > has a rich toolkit to make such a heuristic possible, but it would > definitely require more effort. > > Thinking about it for a while, I agree with your proposal: the > heuristic is acceptable for the case of iterators, the benefits are > evident, and risks (with overinlining and premature inlining) are low. > > I still hope there'll be a more generic solution available at some > point which will supersede such a special case for Iterable. > > Regarding the check itself, I'm in favor of limiting it > to Iterable::iterator() overrides/overloads, but I'm OK with a more > generic check on return type. > > Best regards, > Vladimir Ivanov > > On 10/05/2019 21:47, Sergey Kuksenko wrote: >> Let me give a broader description. >> >> When hotspot makes a decision on the "ultimate question of compilation, >> optimization and everything" (inline or not inline), there are two key >> parts to that decision: a check of sizes (callee and caller) and >> a check of frequencies (invocation count). The frequency check is >> reasonable: why should we inline a rarely invoked method? But sometimes >> we lose optimization opportunities because of it. >> >> Let's narrow the scenario. We have a loop and a method invocation >> before the loop. Inlining the method is vital for the loop >> performance. I see at least two key optimizations here: constant >> propagation and scalar replacement, maybe more.
But if the loop has >> a large enough number of iterations -> hotspot has large enough >> backedge counters -> but it means that the prolog is considered >> relatively cold code (a small invocation counter) -> that >> method (potentially vital for performance) is not inlined (due to the >> frequency/MinInliningThreshold cutoff). >> >> We can't say if inlining is important until we look into the loop >> (even if there is a loop there). But we have to make a decision about >> inlining before that. So let's try to make a reasonable heuristic and >> narrow the scenario again. Limit our sight to Iterators. There is a >> very high probability that after an Iterable::iterator() invocation >> there is a loop (this covers all forall loops). Also there is a high >> correlation between collection size and the number of loop iterations. >> Let's inline all iterators. I don't think the idea to analyze if >> "returned Iterator is a freshly-allocated instance" makes sense. >> First of all, it's an unnecessary complication. Moreover, I have results >> where we have a chain of iterators and hotspot can't inline the whole chain >> due to the absence of a profile (and/or profile pollution), but partial >> inlining of the chain has shown performance benefits. To predict more >> effectively whether a particular inline is important, we should >> look not into the method but at the usage of the method's results >> (into the loop). >> >> About the first comment (too broad or too narrow a check): I have to note >> that this fix doesn't force inlining for all methods with the "iterator" >> name. The fix only excludes the frequency cutoff. All other checks (by >> sizes) are still in place. I made the check broader for two reasons: to >> simplify the modifications and to have wider applicability when it works. I >> could narrow it if you insist, but at the same time I think we have >> to make that check broader - don't look at the method name at all. If >> you have something
that returns Iterator - there will be a loop after >> that with a very high probability. So I'd vote for making that wider >> - check only return type. >> >> On 5/8/19 3:10 PM, Vladimir Ivanov wrote: >>>> http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.01/ >>> returned Iterator is a freshly-allocated instance >>> src/hotspot/share/opto/bytecodeInfo.cpp: >>> >>> +  if (callee_method->name() == ciSymbol::iterator_name()) { >>> +    if >>> (callee_method->signature()->return_type()->is_subtype_of(C->env()->Iterator_klass())) >>> { >>> +      return true; >>> +    } >>> +  } >>> >>> The check looks too broad to me: it returns true for any method >>> with a name "iterator" which returns an instance of Iterator which >>> is much broader than just overrides/overloads of Iterable::iterator(). >>> >>> Can you elaborate, please, why you decided to extend the check >>> for non-Iterables? >>> >>> Commenting on the general approach, it looks like a good candidate >>> for a first-line filter before performing a more extensive analysis. >>> I'd prefer to see BCEscapeAnalyzer extended to determine that >>> returned Iterator is a freshly-allocated instance and decide whether >>> to inline or not based on that instead. Among java.util classes you >>> mentioned most iterators are trivial, so even naive analysis should >>> get decent results. >>> >>> And then the analysis can be applied to any method which returns an >>> Object to see whether EA may benefit from inlining. >>> >>> What do you think? >>> >>> Best regards, >>> Vladimir Ivanov >>> >>>> On 5/7/19 11:56 AM, Aleksey Shipilev wrote: >>>>> On 5/7/19 8:39 PM, Sergey Kuksenko wrote: >>>>>> Hi All, >>>>>> >>>>>> I would like to ask for a review of the following change/update: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8223504 >>>>>> >>>>>> http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.00/ >>>>> The idea sounds fine. >>>>> >>>>> Nits (the usual drill): >>>>>
*) Copyright years need to be updated, at least in bytecodeInfo.cpp >>>>> >>>>>   *) Do we need to put Iterator_klass initialization this early in >>>>> WK_KLASSES_DO? It feels safer to >>>>> initialize it at the end, to avoid surprising bootstrap issues. >>>>> >>>>>   *) Backslash indent is off here in vmSymbols.hpp: >>>>> >>>>>   129   template(java_util_Iterator, >>>>> "java/util/Iterator")               \ >>>>> >>>>>   *) Space after "if"? Also, I think you can use >>>>> ciType::is_subtype_of instead here. Plus, since you >>>>> declared iterator in WK klasses, >>>>> SystemDictionary::Iterator_klass() should be available. >>>>> >>>>>   100     if(retType->is_klass() && >>>>> retType->as_klass()->is_subtype_of(C->env()->Iterator_klass())) { >>>>> From vivek.r.deshpande at intel.com Fri May 24 18:22:59 2019 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Fri, 24 May 2019 18:22:59 +0000 Subject: RFR(XS) 8224558: x86 Fix replicateB encoding In-Reply-To: <5ca63c64-eb2c-1a11-c758-462782d0933d@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EBAE0@ORSMSX106.amr.corp.intel.com> <140eaec4-a0fb-cfbf-cbae-d9b5661df758@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4ECDD8@ORSMSX106.amr.corp.intel.com> <5ca63c64-eb2c-1a11-c758-462782d0933d@oracle.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EFF40@ORSMSX106.amr.corp.intel.com> Hi Tobias By Autovectorizer, I was referring to Superword Optimization. As per earlier discussion with VladimirI, I have a patch which removes the wrong memory based instructs. Could you all please review it? http://cr.openjdk.java.net/~vdeshpande/8224558/webrev.01/ I can push the patch, after both of you find it ok.
Regards, Vivek -----Original Message----- From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] Sent: Thursday, May 23, 2019 5:45 AM To: Deshpande, Vivek R ; 'hotspot-compiler-dev at openjdk.java.net compiler' Cc: Viswanathan, Sandhya Subject: Re: RFR(XS) 8224558: x86 Fix replicateB encoding Hi Vivek, On 22.05.19 19:48, Deshpande, Vivek R wrote: > I came across this issue with vector API tests. > I tried to write a reproducer using the autovectorizer test, but it always uses the register based rule(instruct Repl32B) instead of memory based rule (instruct Repl32B_mem) and register based rule gives correct result. > This is why it never showed up. Okay, thanks for the explanation. > Could you please help me with forcing to use memory based rule with autovectorizer based test. I'm afraid I can't help, I'm not even sure what the autovectorizer is. Is that specific to the vector API? Best regards, Tobias From rahul.v.raghavan at oracle.com Fri May 24 20:17:08 2019 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Sat, 25 May 2019 01:47:08 +0530 Subject: [13] RFR: 8213416: Replace some enums with static const members in hotspot/compiler In-Reply-To: <8b22fc8b-af06-31a2-4033-4984ac4fcb5d@oracle.com> References: <1f7afc19-0756-33f8-54f5-2438ed5da886@oracle.com> <8f18e15d-cae8-58eb-b4a1-870ca6ffaf15@oracle.com> <62869e18-3deb-435d-1ce8-7726866d79eb@oracle.com> <8b22fc8b-af06-31a2-4033-4984ac4fcb5d@oracle.com> Message-ID: Hi, Request one more review approval for latest http://cr.openjdk.java.net/~rraghavan/8213416/webrev.01/ Thanks, Rahul On 20/05/19 11:46 PM, Rahul Raghavan wrote: > Hi, > > With reference to below email thread, request help to confirm next steps > for JDK-8213416. > > So may I go ahead with webrev changes related to hotspot/compiler for > 8213416 ? 
> - http://cr.openjdk.java.net/~rraghavan/8213416/webrev.01/ > > (also will add similar hotspot/runtime related details in JBS comments > for JDK-8223400) > > Thanks, > Rahul > > > > On 16/05/19 3:26 PM, Rahul Raghavan wrote:> Hi, > > > > Thank you David for review comments. > > I will kindly request help from Magnus to reply for the main questions. > > > > Sharing some notes, related links - > > - 8211073: Remove -Wno-extra from Hotspot > >????? https://bugs.openjdk.java.net/browse/JDK-8211073 > > - Discussions in earlier thread - > > > https://mail.openjdk.java.net/pipermail/hotspot-dev/2018-September/034314.html > > > > So understood -Wextra do help in catching valid/useful warnings also, > > but along with some too strict ones like "enumeral and non-enumeral type > > in conditional expression" type warnings. > > > > Extracts from 8211073 JBS comments from Magnus regarding the > > 'enum-warning' - > > "... If you think that gcc is a bit too picky here, I agree. It's not > > obvious per se that the added casts improve the code. However, this is > > the price we need to pay to be able to enable -Wextra, and *that* is > > something that is likely to improve the code." > > > > > > Thanks, > > Rahul > > > > On 16/05/19 11:13 AM, David Holmes wrote: > >> This all seems like unnecessary churn to me - is any of this code > >> actually wrong? can we not just disable this particular warning? is > >> there any point using "static const" when we should be aiming to use > >> C++11 constexpr in the (not too distant?) future? > >> > >> Converting from enums to unrelated ints seems a big step backwards in > >> software engineering terms. :( > >> > >> Cheers, > >> David > >> ----- > >> > > > > On 16/05/19 12:03 PM, Rahul Raghavan wrote: >> Hi, >> >> Thank you Vladimir for review comments. >> >>>> 4) _lh_array_tag_obj_value, _lh_instance_slow_path_bit - >>>> [open/src/hotspot/share/oops/klass.cpp] >>>> .......... 
>>> >>> I am okay with it but Runtime group should agree too - it is their code. >>> >> Yes, I missed that it is Runtime code. >> >> Please note plan is to handle only the hotspot/compiler part of the >> requirement changes in JDK-8213416. >> As per earlier JBS comments new JDK-8223400 was created to cover the >> requirements in hotspot/runtime. >> So may I suggest moving the above runtime change requirement details >> to JDK-822340; >> and use only the balance changes, as in below updated webrev, here for >> 8213416. >> >> - http://cr.openjdk.java.net/~rraghavan/8213416/webrev.01/ >> >> >> Thanks, >> Rahul On 15/05/19 10:01 PM, Vladimir Kozlov wrote:> Hi Rahul, > > Comments are inlined below. > > On 5/15/19 7:46 AM, Rahul Raghavan wrote: >> Hi, >> >> Request help review and finalize fix for 8213416. >> >> - http://cr.openjdk.java.net/~rraghavan/8213416/webrev.00/ >> >> https://bugs.openjdk.java.net/browse/JDK-8213416 >> >> The requirement is to solve >> "enumeral and non-enumeral type in conditional expression" warnings >> triggered with -Wextra enabled for gcc on hotspot. >> >> (hotspot/compiler part is handled in this 8213416 task >> and hotspot/runtime in 8223400) >> >> The same warning is generated for ternary operator statements like- >> (x ? int_val : enum_val). >> e.g.: >> comp_level = TieredCompilation ? TieredStopAtLevel : >> CompLevel_highest_tier; >> >> >> Understood from comments that the following type typecast solution >> proposed earlier was not accepted. >> - comp_level = TieredCompilation ? TieredStopAtLevel : >> CompLevel_highest_tier; >> + comp_level = TieredCompilation ? TieredStopAtLevel : (int) >> CompLevel_highest_tier; >> and then proposed solution was to rewrite those enums to be static >> const members. >> >> >> Tried changes based on the comments info from JBS. >> Extracts of related JBS comments- >> - ".... it's just a simple code refactoring. " >> - "David H. only complained about NO_HASH which we can fix. 
>> We can also fix CompLevel_highest_tier usage - should use CompLevel >> type everywhere. >> But I would not touch Op_RegFlags - >> I don't want to complicate its construction and >> we have a lot of places where Op_ are used as uint. >> I would only fix places where it is used as int to make sure it is >> used as uint everywhere." >> >> Reported enums in question for hotspot/compiler >> 1) NO_HASH >> 2) CompLevel_highest_tier >> 3) Op_RegFlags >> 4) _lh_array_tag_obj_value, _lh_instance_slow_path_bit >> >> >> >> 1) NO_HASH >> tried [open/src/hotspot/share/opto/node.hpp] >> - enum { NO_HASH = 0 }; >> + static const uint NO_HASH = 0; >> > > Okay. > >> >> >> 2) CompLevel_highest_tier >> Only one warning in process_compile() >> [open/src/hotspot/share/ci/ciReplay.cpp] >> comp_level = TieredCompilation ? TieredStopAtLevel : >> CompLevel_highest_tier; >> >> Following type changes tried did not help - >> - int comp_level = parse_int(comp_level_label); >> + CompLevel comp_level = parse_int(comp_level_label); >> ..... >> - comp_level = TieredCompilation ? TieredStopAtLevel : >> CompLevel_highest_tier; >> + comp_level = TieredCompilation ? (CompLevel) TieredStopAtLevel >> : CompLevel_highest_tier; >> > > Thank you for explaining that it did not work. >> >> The warning is with only ternary operator usage in this location. >> So tried simple code refactoring like following and got no more warnings! >> Is this okay? >> - comp_level = TieredCompilation ? TieredStopAtLevel : >> CompLevel_highest_tier; >> + if (TieredCompilation) { >> + comp_level = TieredStopAtLevel; >> + } else { >> + comp_level = CompLevel_highest_tier > + } >> > > Good. 
> >> >> >> 3) Op_RegFlags >> Warnings only for 'virtual uint MachNode::ideal_reg() const' >> >> ../open/src/hotspot/share/opto/machnode.hpp: In member function >> 'virtual uint MachNode::ideal_reg() const': >> ../open/src/hotspot/share/opto/machnode.hpp:304:95: warning: enumeral >> and non-enumeral type in conditional expression [-Wextra] >> virtual uint ideal_reg() const { const Type *t = >> _opnds[0]->type(); return t == TypeInt::CC ? Op_RegFlags : >> t->ideal_reg(); } >> >> ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> Op_RegFlags is returned as uint itself here. >> How to modify code to solve warning? >> Again since the issue is with only ternary operator usage in only one >> location, can we go for simple code refactoring like following? >> >> - virtual uint ideal_reg() const { const Type *t = _opnds[0]->type(); >> return t == TypeInt::CC ? Op_RegFlags : t->ideal_reg(); } >> + virtual uint ideal_reg() const { >> + const Type *t = _opnds[0]->type(); >> + if (t == TypeInt::CC) { >> + return Op_RegFlags; >> + } else { >> + return t->ideal_reg(); >> + } >> + } >> > > Good. I agree. > >> >> >> 4) _lh_array_tag_obj_value, _lh_instance_slow_path_bit - >> warnings locations - >> >> (i) ../open/src/hotspot/share/oops/klass.cpp: In static member >> function 'static jint Klass::array_layout_helper(BasicType)': >> ../open/src/hotspot/share/oops/klass.cpp:212:23: warning: enumeral and >> non-enumeral type in conditional expression [-Wextra] >> int tag = isobj ? _lh_array_tag_obj_value : >> _lh_array_tag_type_value; >> >> (ii) ../open/src/hotspot/cpu/x86/c1_Runtime1_x86.cpp: In static member >> function 'static OopMapSet* >> Runtime1::generate_code_for(Runtime1::StubID, StubAssembler*)': >> ../open/src/hotspot/cpu/x86/c1_Runtime1_x86.cpp:1126:22: warning: >> enumeral and non-enumeral type in conditional expression [-Wextra] >> int tag = ((id == new_type_array_id) >> ~~~~~~~~~~~~~~~~~~~~~~~~~ >> ? 
Klass::_lh_array_tag_type_value >> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> : Klass::_lh_array_tag_obj_value); >> >> (iii) ../open/src/hotspot/share/oops/klass.hpp: In static member >> function 'static jint Klass::instance_layout_helper(jint, bool)': >> ../open/src/hotspot/share/oops/klass.hpp:422:28: warning: enumeral and >> non-enumeral type in conditional expression [-Wextra] >> | (slow_path_flag ? _lh_instance_slow_path_bit : 0); >> ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> >> Following changes fixed the warnings! >> (using static const int instead of unnamed enum) >> [open/src/hotspot/share/oops/klass.cpp] >> .......... >> // Unpacking layout_helper: >> - enum { >> - _lh_neutral_value = 0, // neutral non-array >> non-instance value >> - _lh_instance_slow_path_bit = 0x01, >> - _lh_log2_element_size_shift = BitsPerByte*0, >> - _lh_log2_element_size_mask = BitsPerLong-1, >> - _lh_element_type_shift = BitsPerByte*1, >> - _lh_element_type_mask = right_n_bits(BitsPerByte), // >> shifted mask >> - _lh_header_size_shift = BitsPerByte*2, >> - _lh_header_size_mask = right_n_bits(BitsPerByte), // >> shifted mask >> - _lh_array_tag_bits = 2, >> - _lh_array_tag_shift = BitsPerInt - _lh_array_tag_bits, >> - _lh_array_tag_obj_value = ~0x01 // 0x80000000 >> 30 >> - }; >> + static const int _lh_neutral_value = 0; // neutral >> non-array non-instance value >> + static const int _lh_instance_slow_path_bit = 0x01; >> + static const int _lh_log2_element_size_shift = BitsPerByte*0; >> + static const int _lh_log2_element_size_mask = BitsPerLong-1; >> + static const int _lh_element_type_shift = BitsPerByte*1; >> + static const int _lh_element_type_mask = >> right_n_bits(BitsPerByte); // shifted mask >> + static const int _lh_header_size_shift = BitsPerByte*2; >> + static const int _lh_header_size_mask = >> right_n_bits(BitsPerByte); // shifted mask >> + static const int _lh_array_tag_bits = 2; >> + static const int _lh_array_tag_shift = BitsPerInt - >> _lh_array_tag_bits; >> + 
static const int _lh_array_tag_obj_value = ~0x01; // >> 0x80000000 >> 30 >> ....... >> > > I am okay with it but Runtime group should agree too - it is their code. > >> >> >> - http://cr.openjdk.java.net/~rraghavan/8213416/webrev.00/ >> >> Understood the affected code locations details from the old sample >> patch attachment of related JDK-8211073 >> # >> https://bugs.openjdk.java.net/secure/attachment/79387/hotspot-disable-wextra.diff >> >> Also confirmed no similar warnings in hotspot/compiler with -Wextra, >> no issues with build with this proposed webrev.00 > > Good. > > Thanks, > Vladimir > >> >> >> Thanks, >> Rahul From Alan.Bateman at oracle.com Sun May 26 18:25:29 2019 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Sun, 26 May 2019 19:25:29 +0100 Subject: RFR: 8207851: Implement JEP 352 In-Reply-To: <80da32b2-7acb-7b94-b82c-5dcd5cf95539@redhat.com> References: <80da32b2-7acb-7b94-b82c-5dcd5cf95539@redhat.com> Message-ID: <506d2bee-9376-52f7-731c-4d872c944847@oracle.com> On 23/05/2019 11:55, Andrew Dinn wrote: > Hi, > > Could I please have reviews of the following change set which implements > JEP 352: > > JEP: https://openjdk.java.net/jeps/352 > JIRA: https://bugs.openjdk.java.net/browse/JDK-8207851 > webrev: http://cr.openjdk.java.net/~adinn/8207851/webrev.00/ > > I would also very much like to target this implementation for JDK13. > You may want to take a pass over the JEP to make sure that everything is accurate. I notice, for example, the section on BufferPoolMXBean has the old name READ_ONLY_PERSISTENT. We went through a couple of iterations in the discussion here so there might be a few others. I think the API changes are okay. I don't see a CSR yet but I assume you'll get to that soon. I've read through the changes to java.base and jdk.unsupported. Just a few minor points: - I assume the update FileChannel.java should be dropped as it's just a left over from when we agreed to split out the updates to the Java SE API. 
- com.sun.nio.file.ExtendedMapMode.*_SYNC are missing javadoc, or rather the descriptions are truncated with "...". I think this dates from when were working out the right place to expose these constants. The source file (and the internal ExtendedMapModes are missing copyright headers too). - We didn't discuss the name of the buffer pool that is exposed through the JMX/management interface. We could take inspiration from the names of the CodeHeap spaces that are exposed with MemoryPoolMXBeans as there is an established convention for naming there, e.g. "mapped - 'non-volatile memory'". - Minor nit in Unmapper is that the methods to increment/decrement the usage should use Java conventions so probably should be incrementUsage and decrementUsage. - PmemTest. This is awkward and I wonder if it should be @run main/manual rather than @ignore. Also `@modules jdk.unsupported` would be useful to ensure it will be skipped if run with a test JDK that doesn't have this module. -Alan From tobias.hartmann at oracle.com Mon May 27 09:03:53 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 27 May 2019 11:03:53 +0200 Subject: [13] RFR: 8213416: Replace some enums with static const members in hotspot/compiler In-Reply-To: References: <1f7afc19-0756-33f8-54f5-2438ed5da886@oracle.com> <8f18e15d-cae8-58eb-b4a1-870ca6ffaf15@oracle.com> <62869e18-3deb-435d-1ce8-7726866d79eb@oracle.com> <8b22fc8b-af06-31a2-4033-4984ac4fcb5d@oracle.com> Message-ID: <20c8f52f-7a43-a658-77ad-40c8a3d74ff1@oracle.com> Hi Rahul, On 24.05.19 22:17, Rahul Raghavan wrote: > Request one more review approval for latest > http://cr.openjdk.java.net/~rraghavan/8213416/webrev.01/ Looks good to me. 
Best regards, Tobias From OGATAK at jp.ibm.com Mon May 27 09:06:04 2019 From: OGATAK at jp.ibm.com (Kazunori Ogata) Date: Mon, 27 May 2019 18:06:04 +0900 Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8185696: PPC64: Improve VSR support to use up to 64 registers Message-ID: Hi, I'm requesting backport of 8185696: PPC64: Improve VSR support to use up to 64 registers to jdk8u-dev. This patch can be applied almost cleanly, but one chunk failed (other than a copyright year conflict) because the definition of MTVSRWA_OPCODE is missing in src/cpu/ppc/vm/assembler_ppc.hpp. MTVSRWA_OPCODE was added in 8144019: PPC64 C1: Introduce Client Compiler [1], which added tiered compilation feature in JDK9. Since JDK8 does not support tiered compilation, we cannot apply this change set. So I manually applied the failed chunk by skipping the MTVSRWA_OPCODE definition because no code in this changeset and other changesets I'm going to backport uses this opcode. I'll leave it to the future backport that really needs this opcode. Is this fix acceptable for backport request? If there is no objection in a few days, I'll go forward to add jdk8u-fix-request tag in the original bug report. I verified I can build both fastdebug and release version, and no degradation in jtreg. 
Original bug report: https://bugs.openjdk.java.net/browse/JDK-8185969 Webrev: http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8185969/webrev.02/ Refs: [1] https://bugs.openjdk.java.net/browse/JDK-8144019 From tobias.hartmann at oracle.com Mon May 27 09:06:28 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 27 May 2019 11:06:28 +0200 Subject: RFR(XS) 8224558: x86 Fix replicateB encoding In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EFF40@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EBAE0@ORSMSX106.amr.corp.intel.com> <140eaec4-a0fb-cfbf-cbae-d9b5661df758@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4ECDD8@ORSMSX106.amr.corp.intel.com> <5ca63c64-eb2c-1a11-c758-462782d0933d@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EFF40@ORSMSX106.amr.corp.intel.com> Message-ID: <1d76d87c-31d8-b199-81a8-befeb2979a5d@oracle.com> Hi Vivek, On 24.05.19 20:22, Deshpande, Vivek R wrote: > By Autovectorizer, I was referring to Superword Optimization. > As per earlier discussion with VladimirI, I have a patch which removes the wrong memory based instructs. > Could you all please review it? > http://cr.openjdk.java.net/~vdeshpande/8224558/webrev.01/ > I can push the patch, after both of you find it ok. Thanks for the clarification, the patch looks good to me. Best regards, Tobias From tobias.hartmann at oracle.com Mon May 27 09:17:26 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 27 May 2019 11:17:26 +0200 Subject: RFR(S): 8173196: [REDO] C2 does not optimize redundant memory operations with G1 In-Reply-To: <871s0qlu20.fsf@redhat.com> References: <871s0qlu20.fsf@redhat.com> Message-ID: Hi Roland, this looks good to me. I'll run some extended testing and let you know once it's done. 
Thanks, Tobias On 22.05.19 11:25, Roland Westrelin wrote: > > http://cr.openjdk.java.net/~roland/8173196/webrev.00/ > > Previous attempt at this was discussed here: > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-January/021014.html > > And the follow up bugs with some comments on a possible fix: > > https://bugs.openjdk.java.net/browse/JDK-8172850 > > The new fix is very similar to the previous one. The 2 differences are: > > - aarch64 code shouldn't need any change because of 8209420 ("Track > membars for volatile accesses so they can be properly optimized") > > - The membar only affects the raw memory slice which is now properly > handled by MachNodes thanks to 8209691 ("Allow MemBar on single memory > slice") > > Roland. > From nils.eliasson at oracle.com Mon May 27 10:59:12 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Mon, 27 May 2019 12:59:12 +0200 Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC In-Reply-To: References: Message-ID: <4b5fa91d-cc8f-3988-1376-09917216237b@oracle.com> I am looking forward to your results! // Nils On 2019-05-24 16:37, Stuart Monteith wrote: > That's interesting, and seems beneficial for ZGC on aarch64, where > before your patch the ZGC load barriers broke assumptions the > memory-fence optimisation code was making. 
> > I'm currently testing your patch, with the following put on top for aarch64: > > diff -r ead187ebe684 src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad > --- a/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Fri May 24 13:11:48 2019 +0100 > +++ b/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Fri May 24 15:34:17 2019 +0100 > @@ -56,6 +56,8 @@ > // > instruct loadBarrierSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{ > match(Set dst (LoadBarrierSlowReg mem)); > + predicate(!n->as_LoadBarrierSlowReg()->is_weak()); > + > effect(DEF dst, KILL cr); > > format %{"LoadBarrierSlowReg $dst, $mem" %} > @@ -70,7 +72,8 @@ > // Execute ZGC load barrier (weak) slow path > // > instruct loadBarrierWeakSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{ > - match(Set dst (LoadBarrierWeakSlowReg mem)); > + match(Set dst (LoadBarrierSlowReg mem)); > + predicate(n->as_LoadBarrierSlowReg()->is_weak()); > > effect(DEF dst, KILL cr); > > @@ -81,3 +84,60 @@ > %} > ins_pipe(pipe_slow); > %} > + > + > +// Specialized versions of compareAndExchangeP that adds a keepalive > that is consumed > +// but doesn't affect output. 
> + > +instruct z_compareAndExchangeP(iRegPNoSp res, indirect mem, > + iRegP oldval, iRegP newval, iRegP keepalive, > + rFlagsReg cr) %{ > + match(Set oldval (ZCompareAndExchangeP (Binary mem keepalive) > (Binary oldval newval))); > + ins_cost(2 * VOLATILE_REF_COST); > + effect(TEMP_DEF res, KILL cr); > + format %{ > + "cmpxchg $res = $mem, $oldval, $newval\t# (ptr, weak) if $mem == > $oldval then $mem <-- $newval" > + %} > + ins_encode %{ > + __ cmpxchg($mem$$Register, $oldval$$Register, $newval$$Register, > + Assembler::xword, /*acquire*/ false, /*release*/ true, > + /*weak*/ false, $res$$Register); > + %} > + ins_pipe(pipe_slow); > +%} > + > +instruct z_compareAndSwapP(iRegINoSp res, > + indirect mem, > + iRegP oldval, iRegP newval, iRegP keepalive, > + rFlagsReg cr) %{ > + > + match(Set res (ZCompareAndSwapP (Binary mem keepalive) (Binary > oldval newval))); > + match(Set res (ZWeakCompareAndSwapP (Binary mem keepalive) (Binary > oldval newval))); > + > + ins_cost(2 * VOLATILE_REF_COST); > + > + effect(KILL cr); > + > + format %{ > + "cmpxchg $mem, $oldval, $newval\t# (ptr) if $mem == $oldval then > $mem <-- $newval" > + "cset $res, EQ\t# $res <-- (EQ ? 1 : 0)" > + %} > + > + ins_encode(aarch64_enc_cmpxchg(mem, oldval, newval), > + aarch64_enc_cset_eq(res)); > + > + ins_pipe(pipe_slow); > +%} > + > + > +instruct z_get_and_setP(indirect mem, iRegP newv, iRegPNoSp prev, > + iRegP keepalive) %{ > + match(Set prev (ZGetAndSetP mem (Binary newv keepalive))); > + > + ins_cost(2 * VOLATILE_REF_COST); > + format %{ "atomic_xchg $prev, $newv, [$mem]" %} > + ins_encode %{ > + __ atomic_xchg($prev$$Register, $newv$$Register, as_Register($mem$$base)); > + %} > + ins_pipe(pipe_serial); > +%} > \ No newline at end of file > > On Thu, 23 May 2019 at 15:38, Nils Eliasson wrote: >> Hi, >> >> In ZGC we use load barriers on references. In the original >> implementation these where added as macro nodes at parse time. 
The load >> barrier node consumes and produces control flow in order to be able to >> be lowered into a check with a slow path late. The load barrier nodes >> are fixed in the control flow, and extensions to different optimizations >> are need the barriers out of loop and past other unrelated control flow. >> >> With this patch the barriers are instead added after the loop >> optimizations, before macro node expansion. This makes the entire >> pipeline until that point oblivious about the barriers. A dump of the IR >> with ZGC or EpsilonGC will be basically identical at that point, and the >> diff compared to serialGC or ParallelGC that use write barriers is >> really small. >> >> Benefits >> >> - A major complexity reduction. One can reason about and implement loop >> optimization without caring about the barriers. The escape analysis >> doesn't need to know about the barriers. Loads float freely like they >> are supposed to. >> >> - Less nodes early. The inlining will become more deterministic. A >> barrier heavy GC will not run into node limits earlier. Also node limit >> bounded optimization like unrolling and peeling will not be penalized by >> barriers. >> >> - Better test coverage, or reduce testing cost when the same >> optimization doesn't need to be verified with every GC. >> >> - Better control on where barriers end up. It is trivial to guarantee >> that the load and barriers are not separated by a safepoint. >> >> Design >> >> The implementation uses an extra phase that piggy back on PhaseIdealLoop >> which provides control and dominator information for all loads. This >> extra phase is needed because we need to splice the control flow when >> adding the load barriers. >> >> Barriers are inserted on the loads nodes in post order (any successor >> first). This is to guarantee the dominator information above every >> insertion is correct. This is also important within blocks. Two loads in >> the same block can float in relation to each other. 
The addition of >> barriers serializes their order. Any def-use relationship is upheld by >> expanding them post order. >> >> Barrier insertion is done in stages. In this first stage a single macro >> node that represents the barrier is added with all dependencies that is >> required. In the macro expansion phase the barrier nodes is expanded >> into the final shape, adding nodes that represent the conditional load >> barrier check. (Write barriers in other GCs could possibly be expanded >> here directly) >> >> All the barriers that are needed for unsafe reference operations (cas, >> swap, cmpx) are also expanded late. They already have control flow, so >> the expansion is straight forward. >> >> The barriers for the unsafe reference operations (cas, getandset, cmpx) >> have also been simplified. The cas-load-cas dance have been replaced by >> a pre-load. The pre-load is a load with a barrier, that is kept alive by >> an extra (required) edge on the unsafe-primitive-nodes (specialized as >> ZCompareAndSwap, ZGetAndSet, ZCompareAndExchange). >> >> One challenge that was encountered early and that have caused >> considerable work is that nodes (like loads) can end up between calls >> and their catch projections. This is usually handled after matching, in >> PhaseCFG::call_catch_cleanup, where the nodes after the call are cloned >> to all catch blocks. At this stage they are in an ordered list, so that >> is a straight forward process. For late barrier insertion we need to >> splice in control earlier, before matching, and control flow between >> calls and catches is not allowed. This requires us to add a >> transformation pass where all loads and their dependent instructions are >> cloned out to the catch blocks before we can start splicing in control >> flow. This transformation doesn't replace the legacy call_catch_cleanup >> fully, but it could be a future goal. 
>> >> In the original barrier implementation there where two different load >> barrier implementations: the basic and the optimized. With the new >> approach to barriers on unsafe, the basic is no longer required and has >> been removed. (It provided options for skipping the self healing, and >> passed the ref in a register, guaranteeing that the oop wasn't reloaded.) >> >> The wart that was fixup_partial_loads in zHeap has also been made >> redundant. >> >> Dominating barriers are no longer removed on weak loads. Weak barriers >> doesn't guarantee self-healing. >> >> Follow up work: >> >> - Consolidate all uses of GrowableArray::insert_sorted to use the new >> version >> >> - Refactor the phases. There are a lot of simplifications and >> verification that can be done with more well defined phases. >> >> - Simplify the remaining barrier optimizations. There might still be >> code paths that are no longer needed. >> >> >> Testing: >> >> Hotspot tier 1-6, CTW, jcstress, micros, runthese, kitchensink, and then >> some. All with -XX:+ZVerifyViews. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8224675 >> >> Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/ >> >> >> Please review, >> >> Regards, >> >> Nils >> From nils.eliasson at oracle.com Mon May 27 11:04:18 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Mon, 27 May 2019 13:04:18 +0200 Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC In-Reply-To: References: <59c61d94-fdb3-c380-1ff7-dd1fbc5752c5@redhat.com> <1a5dc2fd-cd59-4042-798e-b261c9017368@oracle.com> Message-ID: <55cf2fc9-871b-74eb-9776-b28b1fd0c94c@oracle.com> Thanks! I hope it can be an evolution that can be useful to all GCs in the long run. Serial and parallel should be fairly straight forward, G1 requires a little bit more care, and I guess Shenandoah too. // Nils On 2019-05-24 11:42, Roman Kennke wrote: > Hi Nils, > >> I removed the method. I have verified that Shenandoah builds and isn't >> obviously broken. 
>> >> Webrev updated in place. > Thanks! Will review more thoroughly and run some tests later. > > Nice approach, btw! > > Roman > > > >> Regards, >> >> Nils >> >> On 2019-05-23 16:31, Roman Kennke wrote: >>> Quick glance showed a problem: you are renaming/moving >>> BarrierSetC2::add_users_to_worklist() but ShenandoahBarrierSetC2 is not >>> updated accordingly. >>> >>> Roman >>> >>> >>>> Hi, >>>> >>>> In ZGC we use load barriers on references. In the original >>>> implementation these where added as macro nodes at parse time. The load >>>> barrier node consumes and produces control flow in order to be able to >>>> be lowered into a check with a slow path late. The load barrier nodes >>>> are fixed in the control flow, and extensions to different optimizations >>>> are need the barriers out of loop and past other unrelated control flow. >>>> >>>> With this patch the barriers are instead added after the loop >>>> optimizations, before macro node expansion. This makes the entire >>>> pipeline until that point oblivious about the barriers. A dump of the IR >>>> with ZGC or EpsilonGC will be basically identical at that point, and the >>>> diff compared to serialGC or ParallelGC that use write barriers is >>>> really small. >>>> >>>> Benefits >>>> >>>> - A major complexity reduction. One can reason about and implement loop >>>> optimization without caring about the barriers. The escape analysis >>>> doesn't need to know about the barriers. Loads float freely like they >>>> are supposed to. >>>> >>>> - Less nodes early. The inlining will become more deterministic. A >>>> barrier heavy GC will not run into node limits earlier. Also node limit >>>> bounded optimization like unrolling and peeling will not be penalized by >>>> barriers. >>>> >>>> - Better test coverage, or reduce testing cost when the same >>>> optimization doesn't need to be verified with every GC. >>>> >>>> - Better control on where barriers end up. 
It is trivial to guarantee >>>> that the load and barriers are not separated by a safepoint. >>>> >>>> Design >>>> >>>> The implementation uses an extra phase that piggy back on PhaseIdealLoop >>>> which provides control and dominator information for all loads. This >>>> extra phase is needed because we need to splice the control flow when >>>> adding the load barriers. >>>> >>>> Barriers are inserted on the loads nodes in post order (any successor >>>> first). This is to guarantee the dominator information above every >>>> insertion is correct. This is also important within blocks. Two loads in >>>> the same block can float in relation to each other. The addition of >>>> barriers serializes their order. Any def-use relationship is upheld by >>>> expanding them post order. >>>> >>>> Barrier insertion is done in stages. In this first stage a single macro >>>> node that represents the barrier is added with all dependencies that is >>>> required. In the macro expansion phase the barrier nodes is expanded >>>> into the final shape, adding nodes that represent the conditional load >>>> barrier check. (Write barriers in other GCs could possibly be expanded >>>> here directly) >>>> >>>> All the barriers that are needed for unsafe reference operations (cas, >>>> swap, cmpx) are also expanded late. They already have control flow, so >>>> the expansion is straight forward. >>>> >>>> The barriers for the unsafe reference operations (cas, getandset, cmpx) >>>> have also been simplified. The cas-load-cas dance have been replaced by >>>> a pre-load. The pre-load is a load with a barrier, that is kept alive by >>>> an extra (required) edge on the unsafe-primitive-nodes (specialized as >>>> ZCompareAndSwap, ZGetAndSet, ZCompareAndExchange). >>>> >>>> One challenge that was encountered early and that have caused >>>> considerable work is that nodes (like loads) can end up between calls >>>> and their catch projections. 
This is usually handled after matching, in >>>> PhaseCFG::call_catch_cleanup, where the nodes after the call are cloned >>>> to all catch blocks. At this stage they are in an ordered list, so that >>>> is a straight forward process. For late barrier insertion we need to >>>> splice in control earlier, before matching, and control flow between >>>> calls and catches is not allowed. This requires us to add a >>>> transformation pass where all loads and their dependent instructions are >>>> cloned out to the catch blocks before we can start splicing in control >>>> flow. This transformation doesn't replace the legacy call_catch_cleanup >>>> fully, but it could be a future goal. >>>> >>>> In the original barrier implementation there where two different load >>>> barrier implementations: the basic and the optimized. With the new >>>> approach to barriers on unsafe, the basic is no longer required and has >>>> been removed. (It provided options for skipping the self healing, and >>>> passed the ref in a register, guaranteeing that the oop wasn't >>>> reloaded.) >>>> >>>> The wart that was fixup_partial_loads in zHeap has also been made >>>> redundant. >>>> >>>> Dominating barriers are no longer removed on weak loads. Weak barriers >>>> doesn't guarantee self-healing. >>>> >>>> Follow up work: >>>> >>>> - Consolidate all uses of GrowableArray::insert_sorted to use the new >>>> version >>>> >>>> - Refactor the phases. There are a lot of simplifications and >>>> verification that can be done with more well defined phases. >>>> >>>> - Simplify the remaining barrier optimizations. There might still be >>>> code paths that are no longer needed. >>>> >>>> >>>> Testing: >>>> >>>> Hotspot tier 1-6, CTW, jcstress, micros, runthese, kitchensink, and then >>>> some. All with -XX:+ZVerifyViews. 
>>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8224675 >>>> >>>> Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/ >>>> >>>> >>>> Please review, >>>> >>>> Regards, >>>> >>>> Nils >>>> From vladimir.x.ivanov at oracle.com Mon May 27 11:14:25 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Mon, 27 May 2019 14:14:25 +0300 Subject: RFR(XS) 8224558: x86 Fix replicateB encoding In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EFF40@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EBAE0@ORSMSX106.amr.corp.intel.com> <140eaec4-a0fb-cfbf-cbae-d9b5661df758@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4ECDD8@ORSMSX106.amr.corp.intel.com> <5ca63c64-eb2c-1a11-c758-462782d0933d@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EFF40@ORSMSX106.amr.corp.intel.com> Message-ID: > http://cr.openjdk.java.net/~vdeshpande/8224558/webrev.01/ Looks good. Best regards, Vladimir Ivanov > I can push the patch, after both of you find it ok. > > Regards, > Vivek > > -----Original Message----- > From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] > Sent: Thursday, May 23, 2019 5:45 AM > To: Deshpande, Vivek R ; 'hotspot-compiler-dev at openjdk.java.net compiler' > Cc: Viswanathan, Sandhya > Subject: Re: RFR(XS) 8224558: x86 Fix replicateB encoding > > Hi Vivek, > > On 22.05.19 19:48, Deshpande, Vivek R wrote: >> I came across this issue with vector API tests. >> I tried to write a reproducer using the autovectorizer test, but it always uses the register based rule(instruct Repl32B) instead of memory based rule (instruct Repl32B_mem) and register based rule gives correct result. >> This is why it never showed up. > > Okay, thanks for the explanation. > >> Could you please help me with forcing to use memory based rule with autovectorizer based test. > I'm afraid I can't help, I'm not even sure what the autovectorizer is. Is that specific to the vector API? 
> > Best regards, > Tobias > From vladimir.x.ivanov at oracle.com Mon May 27 11:16:36 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Mon, 27 May 2019 14:16:36 +0300 Subject: RFR: 8223504: improve performance of forall loops by better inlining of "iterator()" methods. In-Reply-To: <29bca6d1-d63e-c046-88f9-c6ef21bd089b@oracle.com> References: <58486996-d7da-30ab-77c2-b590395423c2@oracle.com> <92d61151-97ac-565a-1bfe-d25dd5ea1048@redhat.com> <359da83e-883b-de8a-0525-21ce4c797249@oracle.com> <294c512c-a613-7679-0b90-61e2fe015d3c@oracle.com> <51756d46-c2ad-5377-12ce-67208deb12a5@oracle.com> <29bca6d1-d63e-c046-88f9-c6ef21bd089b@oracle.com> Message-ID: <4edf73f2-acbb-4c81-0116-c65f092ba9be@oracle.com> > http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.03/ Reviewed. Best regards, Vladimir Ivanov > Now, only return type is checked. It works the same way on all forall > loops. Besides it perfectly works on "descendingIterator()" methods and > all other methods returning iterator but having other names like > Enumeration::asIterator. > > On 5/22/19 10:47 AM, Vladimir Ivanov wrote: >> Thanks for the thorough explanation, Sergey. >> >> What I like about the patch you propose is its simplicity. >> >> More accurate heuristic which takes inlining effects into account >> (e.g., on EA) would scale to a much wider range of use cases and C2 >> has a rich toolkit to make such heuristic possible, but it would >> definitely require more effort. >> >> Thinking about it for a while, I agree with your proposal: the >> heuristic is acceptable for the case of iterators, the benefits are >> evident, and risks (with overinlining and premature inlining) are low. >> >> I still hope there'll be a more generic solution available at some >> point which will supersede such special case for Iterable. >> >> Regarding the check itself, I'm in favor of limiting it >> toIterable::iterator() overrides/overloads, but I'm OK with a more >> generic check on return type. 
>> >> Best regards, >> Vladimir Ivanov >> >> On 10/05/2019 21:47, Sergey Kuksenko wrote: >>> Let me do a broader description. >>> >>> When hotspot makes a decision on the "ultimate question of compilation, >>> optimization and everything" (inline or not inline), there are two key >>> parts of that decision. It is a check of sizes (callee and caller) and >>> a check of frequencies (invocation count). The frequency check is >>> reasonable: why should we inline a rarely invoked method? But sometimes >>> we lose optimization opportunities with that. >>> >>> Let's narrow the scenario. We have a loop and a method invocation >>> before the loop. Inlining the method is vital for the loop >>> performance. I see at least two key optimizations here: constant >>> propagation and scalar replacement, maybe more. But if the loop has >>> large enough amount of iterations -> hotspot has large enough >>> backedge counters -> but it means that prolog is considered as >>> relatively cold code (small amount of invocation counter) -> that >>> method (potentially vital for performance) is not inlined (due to >>> frequency/MinInlineThreashold cut off). >>> >>> We can't say if inlining is important until we look into the loop >>> (even if there is a loop there). But we have to make a decision about >>> inline before that. So let's try to make a reasonable heuristic and >>> narrow the scenario again. Limit our sight to Iterators. There is a >>> very high probability that after Iterable::iterator() invocation >>> there is a loop (covers all for-all loops). Also there is a high >>> correlation between collection size and amount of loop iterations. >>> Let's inline all iterators. I don't think the idea to analyze if >>> "returned Iterator is a freshly-allocated instance" makes sense. >>> First of all it's an unnecessary complication.
Moreover, I have results >>> when we have chain of iterators, hotspot can't inline the whole chain >>> due to absence of profile (and/or profile pollution), but partial >>> inline of the chain has shown performance benefits. To get more >>> effective prediction if that particular inline is important we should >>> look not into the method, but to the usage of the method results >>> (into the loop). >>> >>> About the first comment (too broad or too narrow check). I have to note >>> that this fix doesn't force inline for all methods with "iterator" >>> name. The fix only excludes frequency cut off. All other checks (by >>> sizes) are still in place. I did broader check for two reasons: to >>> simplify modifications and to have wider applicability when it works. I >>> could narrow it if you insist, but at the same time I think we have >>> to make that check broader - don't look into method name at all. If >>> you have something that returns Iterator - there will be loop after >>> that with a very high probability. So I'd vote for making that wider >>> - check only return type. >>> >>> On 5/8/19 3:10 PM, Vladimir Ivanov wrote: >>>>> http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.01/ >>>> returned Iterator is a freshly-allocated instance >>>> src/hotspot/share/opto/bytecodeInfo.cpp: >>>> >>>> +   if (callee_method->name() == ciSymbol::iterator_name()) { >>>> +     if (callee_method->signature()->return_type()->is_subtype_of(C->env()->Iterator_klass())) { >>>> +       return true; >>>> +     } >>>> +   } >>>> >>>> The check looks too broad for me: it returns true for any method >>>> with a name "iterator" which returns an instance of Iterator which >>>> is much broader than just overrides/overloads of Iterable::iterator(). >>>> >>>> Can you elaborate, please, why did you decide to extend the check >>>> for non-Iterables?
>>>> >>>> Commenting on the general approach, it looks like a good candidate >>>> for a first-line filter before performing a more extensive analysis. >>>> I'd prefer to see BCEscapeAnalyzer extended to determine that >>>> returned Iterator is a freshly-allocated instance and decide whether >>>> to inline or not based on that instead. Among java.util classes you >>>> mentioned most iterators are trivial, so even naive analysis should >>>> get decent results. >>>> >>>> And then the analysis can be applied to any method which returns an >>>> Object to see whether EA may benefit from inlining. >>>> >>>> What do you think? >>>> >>>> Best regards, >>>> Vladimir Ivanov >>>> >>>>> On 5/7/19 11:56 AM, Aleksey Shipilev wrote: >>>>>> On 5/7/19 8:39 PM, Sergey Kuksenko wrote: >>>>>>> Hi All, >>>>>>> >>>>>>> I would like to ask for a review of the following change/update: >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8223504 >>>>>>> >>>>>>> http://cr.openjdk.java.net/~skuksenko/hotspot/8223504/webrev.00/ >>>>>> The idea sounds fine. >>>>>> >>>>>> Nits (the usual drill): >>>>>> >>>>>> *) Copyright years need to be updated, at least in bytecodeInfo.cpp >>>>>> >>>>>> *) Do we need to put Iterator_klass initialization this early in >>>>>> WK_KLASSES_DO? It feels safer to >>>>>> initialize it at the end, to avoid surprising bootstrap issues. >>>>>> >>>>>> *) Backslash indent is off here in vmSymbols.hpp: >>>>>> >>>>>> 129 template(java_util_Iterator, >>>>>> "java/util/Iterator") \ >>>>>> >>>>>> *) Space after "if"? Also, I think you can use >>>>>> ciType::is_subtype_of instead here. Plus, since you >>>>>> declared iterator in WK klasses, >>>>>> SystemDictionary::Iterator_klass() should be available. >>>>>> >>>>>> 100
if(retType->is_klass() && >>>>>> retType->as_klass()->is_subtype_of(C->env()->Iterator_klass())) { >>>>>> From rwestrel at redhat.com Mon May 27 11:54:10 2019 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 27 May 2019 13:54:10 +0200 Subject: RFR(S): 8173196: [REDO] C2 does not optimize redundant memory operations with G1 In-Reply-To: References: <871s0qlu20.fsf@redhat.com> Message-ID: <87lfysaza5.fsf@redhat.com> > this looks good to me. I'll run some extended testing and let you know once it's done. Thanks, Tobias! Roland. From rwestrel at redhat.com Mon May 27 11:57:48 2019 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 27 May 2019 13:57:48 +0200 Subject: RFR(S): 8224580: Matcher can cause oop field/array element to be reloaded In-Reply-To: <70fedac3-59a2-e077-4de0-af6f6604dc16@redhat.com> References: <877eailvgp.fsf@redhat.com> <70fedac3-59a2-e077-4de0-af6f6604dc16@redhat.com> Message-ID: <87imtwaz43.fsf@redhat.com> > The change looks reasonable to me. > > I have run tests with the patch and can confirm that the original bug > went away. I've also run a bunch of other tests and workloads and looks > good too. Thanks Roman. Anyone else for this? Shenandoah needs this for a key change that's also out for review: https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-May/025931.html (8224584: Shenandoah: Eliminate forwarding pointer word) Roland. From robbin.ehn at oracle.com Mon May 27 13:32:49 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 27 May 2019 15:32:49 +0200 Subject: RFR(xxs): 8224795: some runtime/SelectionResolution tests are timing out Message-ID: <2ed3b64f-3d15-04e5-d84e-557fd166658d@oracle.com> Hi all, There is a fault in changeset 8221734: Deoptimize with handshakes. Which causes us clear the method's code unconditionally. It must be checked that it's pointing to this nmethod before clearing. 
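To illustrate the fault described above, here is a minimal sketch of the difference between clearing a method's code unconditionally and clearing it only when it still points to the nmethod being unlinked. The `Method` and `NMethod` structs below are hypothetical, heavily simplified stand-ins and not the HotSpot classes; any locking or entry-point handling the real code may need is left out on purpose.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for HotSpot's Method and nmethod.
struct NMethod;

struct Method {
  NMethod* code = nullptr;  // currently installed compiled code, if any

  // Faulty shape: drops whatever code is installed, even if it belongs
  // to a newer nmethod than the one being unlinked.
  void unlink_code_unconditionally() { code = nullptr; }

  // Checked shape: clear only if `compare` is still the installed code.
  void unlink_code(NMethod* compare) {
    if (code == compare) {
      code = nullptr;
    }
  }
};

struct NMethod {
  Method* method;

  // Mirrors the shape of the one-line fix: pass `this` so the method can
  // verify that it still points to this nmethod before clearing.
  void unlink_from_method() {
    if (method != nullptr) {
      method->unlink_code(this);
    }
  }
};
```

In this model, unlinking a stale nmethod leaves a newer, already installed nmethod in place, whereas the unconditional variant would throw it away and force the method back to slower execution.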
I made an error while changing some code after reviews of 8221734 here: http://cr.openjdk.java.net/~rehn/8221734/v3/inc_review/webrev/src/hotspot/share/code/nmethod.cpp.udiff.html This patch restores correct behavior: diff -r d871ce8ab96b src/hotspot/share/code/nmethod.cpp --- a/src/hotspot/share/code/nmethod.cpp Sat May 25 20:55:33 2019 +0900 +++ b/src/hotspot/share/code/nmethod.cpp Mon May 27 12:29:26 2019 +0200 @@ -1261,7 +1261,7 @@ void nmethod::unlink_from_method() { if (method() != NULL) { - method()->unlink_code(); + method()->unlink_code(this); } } Issue: https://bugs.openjdk.java.net/browse/JDK-8224795 I do not see any new issues in t1-7 and SelectionResolution (executed in t4 and t6) now passes. Also locally verified SelectionResolution tests with Xcomp. Thanks, Robbin From tobias.hartmann at oracle.com Mon May 27 13:33:47 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 27 May 2019 15:33:47 +0200 Subject: RFR(xxs): 8224795: some runtime/SelectionResolution tests are timing out In-Reply-To: <2ed3b64f-3d15-04e5-d84e-557fd166658d@oracle.com> References: <2ed3b64f-3d15-04e5-d84e-557fd166658d@oracle.com> Message-ID: <814eb78a-64d1-cf65-d7ed-48d3e2c85853@oracle.com> Hi Robbin, this looks good to me. Best regards, Tobias On 27.05.19 15:32, Robbin Ehn wrote: > Hi all, > > There is a fault in changeset 8221734: Deoptimize with handshakes. > Which causes us clear the method's code unconditionally. > It must be checked that it's pointing to this nmethod before clearing. > I made an error while changing some code after reviews of 8221734 here: > http://cr.openjdk.java.net/~rehn/8221734/v3/inc_review/webrev/src/hotspot/share/code/nmethod.cpp.udiff.html > > > This patch restores correct behavior: > diff -r d871ce8ab96b src/hotspot/share/code/nmethod.cpp > --- a/src/hotspot/share/code/nmethod.cpp Sat May 25 20:55:33 2019 +0900 > +++ b/src/hotspot/share/code/nmethod.cpp
Mon May 27 12:29:26 2019 +0200 > @@ -1261,7 +1261,7 @@ > > void nmethod::unlink_from_method() { > if (method() != NULL) { > - method()->unlink_code(); > + method()->unlink_code(this); > } > } > > Issue: > https://bugs.openjdk.java.net/browse/JDK-8224795 > > I do not see any new issues in t1-7 and SelectionResolution (executed in t4 and t6) now passes. Also > locally verified SelectionResolution tests with Xcomp. > > Thanks, Robbin From robbin.ehn at oracle.com Mon May 27 13:41:34 2019 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Mon, 27 May 2019 15:41:34 +0200 Subject: RFR(xxs): 8224795: some runtime/SelectionResolution tests are timing out In-Reply-To: <814eb78a-64d1-cf65-d7ed-48d3e2c85853@oracle.com> References: <2ed3b64f-3d15-04e5-d84e-557fd166658d@oracle.com> <814eb78a-64d1-cf65-d7ed-48d3e2c85853@oracle.com> Message-ID: <3c65ab4a-73c6-7a83-0841-9d07934cb48b@oracle.com> Thanks Tobias! /Robbin On 2019-05-27 15:33, Tobias Hartmann wrote: > Hi Robbin, > > this looks good to me. > > Best regards, > Tobias > > On 27.05.19 15:32, Robbin Ehn wrote: >> Hi all, >> >> There is a fault in changeset 8221734: Deoptimize with handshakes. >> Which causes us clear the method's code unconditionally. >> It must be checked that it's pointing to this nmethod before clearing. >> I made an error while changing some code after reviews of 8221734 here: >> http://cr.openjdk.java.net/~rehn/8221734/v3/inc_review/webrev/src/hotspot/share/code/nmethod.cpp.udiff.html >> >> >> This patch restores correct behavior: >> diff -r d871ce8ab96b src/hotspot/share/code/nmethod.cpp >> --- a/src/hotspot/share/code/nmethod.cpp Sat May 25 20:55:33 2019 +0900 >> +++ b/src/hotspot/share/code/nmethod.cpp Mon May 27 12:29:26 2019 +0200 >> @@ -1261,7 +1261,7 @@ >> >> void nmethod::unlink_from_method() { >> if (method() != NULL) { >> - method()->unlink_code(); >> + method()->unlink_code(this); >>
} >> } >> >> Issue: >> https://bugs.openjdk.java.net/browse/JDK-8224795 >> >> I do not see any new issues in t1-7 and SelectionResolution (executed in t4 and t6) now passes. Also >> locally verified SelectionResolution tests with Xcomp. >> >> Thanks, Robbin From tobias.hartmann at oracle.com Mon May 27 15:22:31 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 27 May 2019 17:22:31 +0200 Subject: RFR(S): 8173196: [REDO] C2 does not optimize redundant memory operations with G1 In-Reply-To: References: <871s0qlu20.fsf@redhat.com> Message-ID: <379f29f4-2f14-3b6a-efd1-b501a46a8f87@oracle.com> On 27.05.19 11:17, Tobias Hartmann wrote: > I'll run some extended testing and let you know once it's done. Testing passed. Best regards, Tobias From martin.doerr at sap.com Mon May 27 15:24:04 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 27 May 2019 15:24:04 +0000 Subject: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64 In-Reply-To: <42a8fc79-9497-b2eb-8dd9-a56e4ed85255@oracle.com> References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com> <3e1ceae0-f7a9-e2e6-2b06-59a22540550d@oracle.com> <3d9c0897-0275-c341-fe33-5f0b6c94f253@oracle.com> <42a8fc79-9497-b2eb-8dd9-a56e4ed85255@oracle.com> Message-ID: Hi Vladimir, I've looked at your change and think it's good. Methods belonging to uninitialized classes are not inlined, so it looks fine to perform the check only at the nmethod entry. I'm not so familiar with JVMCI/AOT, so I haven't looked into this topic. Are these assertions safe? + assert(method()->needs_clinit_barrier(), "barrier not needed"); + assert(method()->holder()->is_being_initialized(), "barrier not needed"); Can it happen that initialization concurrently completes before they are evaluated? A small suggestion for x86 TemplateTable::invokeinterface: It'd be nice to replace the load of the interface klass with your new load_method_holder. Thanks for sharing performance numbers.
Best regards, Martin From gnu.andrew at redhat.com Mon May 27 16:31:35 2019 From: gnu.andrew at redhat.com (Andrew John Hughes) Date: Mon, 27 May 2019 17:31:35 +0100 Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8185696: PPC64: Improve VSR support to use up to 64 registers In-Reply-To: References: Message-ID: <1bd63cd1-efbb-e70d-62e5-510d364f712b@redhat.com> On 27/05/2019 10:06, Kazunori Ogata wrote: > Hi, > > I'm requesting backport of 8185696: PPC64: Improve VSR support to use up > to 64 registers to jdk8u-dev. This patch can be applied almost cleanly, > but one chunk failed (other than a copyright year conflict) because the > definition of MTVSRWA_OPCODE is missing in > src/cpu/ppc/vm/assembler_ppc.hpp. > > MTVSRWA_OPCODE was added in 8144019: PPC64 C1: Introduce Client Compiler > [1], which added tiered compilation feature in JDK9. Since JDK8 does not > support tiered compilation, we cannot apply this change set. > > So I manually applied the failed chunk by skipping the MTVSRWA_OPCODE > definition because no code in this changeset and other changesets I'm > going to backport uses this opcode. I'll leave it to the future backport > that really needs this opcode. > > Is this fix acceptable for backport request? If there is no objection in > a few days, I'll go forward to add jdk8u-fix-request tag in the original > bug report. I verified I can build both fastdebug and release version, > and no degradation in jtreg. > > > Original bug report: > https://bugs.openjdk.java.net/browse/JDK-8185969 > > Webrev: > http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8185969/webrev.02/ > > Refs: > [1] https://bugs.openjdk.java.net/browse/JDK-8144019 > > This mostly looks ok, but why is the copyright header updated to 2019 for assembler_ppc.hpp, but not register_ppc.hpp and register_ppc.cpp? Thanks, -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. 
(http://www.redhat.com) PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 https://keybase.io/gnu_andrew -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 228 bytes Desc: OpenPGP digital signature URL: From martin.doerr at sap.com Mon May 27 16:41:37 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Mon, 27 May 2019 16:41:37 +0000 Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8185696: PPC64: Improve VSR support to use up to 64 registers In-Reply-To: <1bd63cd1-efbb-e70d-62e5-510d364f712b@redhat.com> References: <1bd63cd1-efbb-e70d-62e5-510d364f712b@redhat.com> Message-ID: Hi, I think it's fine. I guess the copyright updates in the other files were part of other changes which have not been backported. I don't think we have to update copyrights in backport changes other than what comes naturally with the change. Best regards, Martin > -----Original Message----- > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Andrew John Hughes > Sent: Montag, 27. Mai 2019 18:32 > To: Kazunori Ogata ; hotspot-compiler- > dev at openjdk.java.net; jdk8u-dev at openjdk.java.net > Subject: Re: [8u-dev, ppc] RFR for (almost clean) backport of 8185696: PPC64: > Improve VSR support to use up to 64 registers > > > > On 27/05/2019 10:06, Kazunori Ogata wrote: > > Hi, > > > > I'm requesting backport of 8185696: PPC64: Improve VSR support to use up > > to 64 registers to jdk8u-dev. This patch can be applied almost cleanly, > > but one chunk failed (other than a copyright year conflict) because the > > definition of MTVSRWA_OPCODE is missing in > > src/cpu/ppc/vm/assembler_ppc.hpp. > > > > MTVSRWA_OPCODE was added in 8144019: PPC64 C1: Introduce Client > Compiler > > [1], which added tiered compilation feature in JDK9. 
Since JDK8 does not > > support tiered compilation, we cannot apply this change set. > > > > So I manually applied the failed chunk by skipping the MTVSRWA_OPCODE > > definition because no code in this changeset and other changesets I'm > > going to backport uses this opcode. I'll leave it to the future backport > > that really needs this opcode. > > > > Is this fix acceptable for backport request? If there is no objection in > > a few days, I'll go forward to add jdk8u-fix-request tag in the original > > bug report. I verified I can build both fastdebug and release version, > > and no degradation in jtreg. > > > > > > Original bug report: > > https://bugs.openjdk.java.net/browse/JDK-8185969 > > > > Webrev: > > http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8185969/webrev.02/ > > > > Refs: > > [1] https://bugs.openjdk.java.net/browse/JDK-8144019 > > > > > > This mostly looks ok, but why is the copyright header updated to 2019 > for assembler_ppc.hpp, but not register_ppc.hpp and register_ppc.cpp? > > Thanks, > -- > Andrew :) > > Senior Free Java Software Engineer > Red Hat, Inc. (http://www.redhat.com) > > PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net) > Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 > https://keybase.io/gnu_andrew From gnu.andrew at redhat.com Mon May 27 17:28:28 2019 From: gnu.andrew at redhat.com (Andrew John Hughes) Date: Mon, 27 May 2019 18:28:28 +0100 Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8185696: PPC64: Improve VSR support to use up to 64 registers In-Reply-To: References: <1bd63cd1-efbb-e70d-62e5-510d364f712b@redhat.com> Message-ID: <6459888e-be23-e362-3b09-c5cd4afa701f@redhat.com> On 27/05/2019 17:41, Doerr, Martin wrote: > Hi, > > I think it's fine. > > I guess the copyright updates in the other files were part of other changes which have not been backported. > I don't think we have to update copyrights in backport changes other than what comes naturally with the change. 
>
> Best regards,
> Martin
>
>

The change to the copyright header in assembler_ppc.hpp is an addition
in this backport. So either that should be dropped or the same should be
applied to register_ppc.{c,h}pp (the remaining file is already 2019).

I tend towards dropping it, but we should at least be consistent within
the same patch.

Best regards,
-- 
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222
https://keybase.io/gnu_andrew

From david.holmes at oracle.com Mon May 27 21:39:12 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 28 May 2019 07:39:12 +1000
Subject: RFR(xxs): 8224795: some runtime/SelectionResolution tests are timing out
In-Reply-To: <2ed3b64f-3d15-04e5-d84e-557fd166658d@oracle.com>
References: <2ed3b64f-3d15-04e5-d84e-557fd166658d@oracle.com>
Message-ID:

Looks good Robbin!

Thanks,
David

On 27/05/2019 11:32 pm, Robbin Ehn wrote:
> Hi all,
>
> There is a fault in changeset 8221734: Deoptimize with handshakes.
> Which causes us to clear the method's code unconditionally.
> It must be checked that it's pointing to this nmethod before clearing.
> I made an error while changing some code after reviews of 8221734 here:
> http://cr.openjdk.java.net/~rehn/8221734/v3/inc_review/webrev/src/hotspot/share/code/nmethod.cpp.udiff.html
>
>
> This patch restores correct behavior:
> diff -r d871ce8ab96b src/hotspot/share/code/nmethod.cpp
> --- a/src/hotspot/share/code/nmethod.cpp    Sat May 25 20:55:33 2019 +0900
> +++ b/src/hotspot/share/code/nmethod.cpp    Mon May 27 12:29:26 2019 +0200
> @@ -1261,7 +1261,7 @@
>
>  void nmethod::unlink_from_method() {
>    if (method() != NULL) {
> -    method()->unlink_code();
> +    method()->unlink_code(this);
>    }
>  }
>
> Issue:
> https://bugs.openjdk.java.net/browse/JDK-8224795
>
> I do not see any new issues in t1-7 and SelectionResolution (executed in
> t4 and t6) now passes. Also locally verified SelectionResolution tests
> with Xcomp.
>
> Thanks, Robbin

From OGATAK at jp.ibm.com Tue May 28 05:22:20 2019
From: OGATAK at jp.ibm.com (Kazunori Ogata)
Date: Tue, 28 May 2019 14:22:20 +0900
Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8185696: PPC64: Improve VSR support to use up to 64 registers
In-Reply-To: <6459888e-be23-e362-3b09-c5cd4afa701f@redhat.com>
References: <1bd63cd1-efbb-e70d-62e5-510d364f712b@redhat.com> <6459888e-be23-e362-3b09-c5cd4afa701f@redhat.com>
Message-ID:

Hi Andrew and Martin,

Thank you for your comments. My original intention to change the copyright year was that I did some work to apply the original patch to this file. I now realized I made no change in the code that was modified in the original patch. So I agree not updating the copyright year is more natural.

I updated webrev:

http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8185969/webrev.03/

Regards,
Ogata

Andrew John Hughes wrote on 2019/05/28 02:28:28:

> From: Andrew John Hughes
> To: "Doerr, Martin" , Kazunori Ogata
> , "hotspot-compiler-dev at openjdk.java.net" compiler-dev at openjdk.java.net>, "jdk8u-dev at openjdk.java.net" dev at openjdk.java.net>
> Date: 2019/05/28 02:28
> Subject: [EXTERNAL] Re: [8u-dev, ppc] RFR for (almost clean) backport of
> 8185696: PPC64: Improve VSR support to use up to 64 registers
>
>
>
> On 27/05/2019 17:41, Doerr, Martin wrote:
> > Hi,
> >
> > I think it's fine.
> >
> > I guess the copyright updates in the other files were part of other
> changes which have not been backported.
> > I don't think we have to update copyrights in backport changes other
> than what comes naturally with the change.
> > > > Best regards, > > Martin > > > > > > The change to the copyright header in assembler_ppc.hpp is an addition > in this backport. So either that should be dropped or the same should be > applied to register_ppc.{c,h}pp (the remaining file is already 2019). > > I tend towards dropping it, but we should at least be consistent within > the same patch. > > Best regards, > -- > Andrew :) > > Senior Free Java Software Engineer > Red Hat, Inc. (http://www.redhat.com) > > PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net) > Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 > https://keybase.io/gnu_andrew > > [attachment "signature.asc" deleted by Kazunori Ogata/Japan/IBM] From OGATAK at jp.ibm.com Tue May 28 05:30:35 2019 From: OGATAK at jp.ibm.com (Kazunori Ogata) Date: Tue, 28 May 2019 14:30:35 +0900 Subject: [PING] Re: RFR: 8224090: [PPC64] Fix SLP patterns for filling an array with double float literals In-Reply-To: References: Message-ID: Hi, May I get review for this fix? Regards, Ogata "hotspot-compiler-dev" wrote on 2019/05/17 14:34:20: > From: "Kazunori Ogata" > To: hotspot-compiler-dev at openjdk.java.net, ppc-aix-port-dev at openjdk.java.net > Date: 2019/05/17 14:35 > Subject: [EXTERNAL] RFR: 8224090: [PPC64] Fix SLP patterns for filling an > array with double float literals > Sent by: "hotspot-compiler-dev" > > Hi, > > May I get review for a webrev to fix SLP patterns that use PPC64 VSX > instructions? > > We found that SLP patterns added by JDK-8208171 [1] use incorrect data > type, so the patterns have never been used. Further, the pattern for > filling -1.0 is confused with the operation for filling -1L. > > This webrev fixes the pattern to fill an array with 0 to use 0.0d instead > of 0d, and removes the pattern to will with -1 because the bit pattern of > -1.0d is not easy to generate using a single VSX instruction. It's should > be better to load the literal from TOC and use general repl2D_reg_Ex > pattern. 
>
> I also fixed some comments in "format %{ ... %}" to show correct matching
> types.
>
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8224090
>
> Webrev: http://cr.openjdk.java.net/~horii/8224090/webrev.00/
>
> Ref:
> [1] https://bugs.openjdk.java.net/browse/JDK-8208171
>
>
> Regards,
> Ogata
>
>

From robbin.ehn at oracle.com Tue May 28 06:08:54 2019
From: robbin.ehn at oracle.com (Robbin Ehn)
Date: Tue, 28 May 2019 08:08:54 +0200
Subject: RFR(xxs): 8224795: some runtime/SelectionResolution tests are timing out
In-Reply-To:
References: <2ed3b64f-3d15-04e5-d84e-557fd166658d@oracle.com>
Message-ID: <13ef6bdc-0d7e-86e5-5797-98fd51ade313@oracle.com>

Thanks David!

/Robbin

On 2019-05-27 23:39, David Holmes wrote:
> Looks good Robbin!
>
> Thanks,
> David
>
> On 27/05/2019 11:32 pm, Robbin Ehn wrote:
>> Hi all,
>>
>> There is a fault in changeset 8221734: Deoptimize with handshakes.
>> Which causes us to clear the method's code unconditionally.
>> It must be checked that it's pointing to this nmethod before clearing.
>> I made an error while changing some code after reviews of 8221734 here:
>> http://cr.openjdk.java.net/~rehn/8221734/v3/inc_review/webrev/src/hotspot/share/code/nmethod.cpp.udiff.html
>>
>>
>> This patch restores correct behavior:
>> diff -r d871ce8ab96b src/hotspot/share/code/nmethod.cpp
>> --- a/src/hotspot/share/code/nmethod.cpp    Sat May 25 20:55:33 2019 +0900
>> +++ b/src/hotspot/share/code/nmethod.cpp    Mon May 27 12:29:26 2019 +0200
>> @@ -1261,7 +1261,7 @@
>>
>>  void nmethod::unlink_from_method() {
>>    if (method() != NULL) {
>> -    method()->unlink_code();
>> +    method()->unlink_code(this);
>>    }
>>  }
>>
>> Issue:
>> https://bugs.openjdk.java.net/browse/JDK-8224795
>>
>> I do not see any new issues in t1-7 and SelectionResolution (executed in t4
>> and t6) now passes. Also locally verified SelectionResolution tests with Xcomp.
>> >> Thanks, Robbin From rahul.v.raghavan at oracle.com Tue May 28 06:40:55 2019 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Tue, 28 May 2019 12:10:55 +0530 Subject: [13] RFR: 8213416: Replace some enums with static const members in hotspot/compiler In-Reply-To: <20c8f52f-7a43-a658-77ad-40c8a3d74ff1@oracle.com> References: <1f7afc19-0756-33f8-54f5-2438ed5da886@oracle.com> <8f18e15d-cae8-58eb-b4a1-870ca6ffaf15@oracle.com> <62869e18-3deb-435d-1ce8-7726866d79eb@oracle.com> <8b22fc8b-af06-31a2-4033-4984ac4fcb5d@oracle.com> <20c8f52f-7a43-a658-77ad-40c8a3d74ff1@oracle.com> Message-ID: <225ec4e1-10bf-2156-be74-e0ff3608e45f@oracle.com> Thank you Tobias. On 27/05/19 2:33 PM, Tobias Hartmann wrote: > Hi Rahul, > > On 24.05.19 22:17, Rahul Raghavan wrote: >> Request one more review approval for latest >> http://cr.openjdk.java.net/~rraghavan/8213416/webrev.01/ > > Looks good to me. > > Best regards, > Tobias > From nils.eliasson at oracle.com Tue May 28 07:55:41 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 28 May 2019 09:55:41 +0200 Subject: RFR(XS): 8224538: LoadBarrierNode::common_barrier must check address Message-ID: Hi, This is patch fixes a problem in LoadBarrierNode::common_barriers that can lead to broken IRs in some very rare cases. The common barrier optimizations tries to merge two barriers if they have the same oop-in, and there is a site in the control flow where they can be merged. The problem is that it doesn't check the address too, it might be pinned by checkcast nodes. In the failing IR the address is the same if it is traced upwards, but it isn't obvious which checkcast nodes can be skipped. Since the common_barriers optimization is removed in the late loadbarrier insertion patch, I choose to create a good-enough fix that can be easily backported. I will make sure to push this fix before the late loadbarrier patch. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8224538 Webrev: http://cr.openjdk.java.net/~neliasso/8224538/webrev.01/ Regards, Nils From martin.doerr at sap.com Tue May 28 09:21:29 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 28 May 2019 09:21:29 +0000 Subject: RFR: 8224090: [PPC64] Fix SLP patterns for filling an array with double float literals In-Reply-To: References: Message-ID: Hi Ogata, looks good. Thanks for fixing. Best regards, Martin > -----Original Message----- > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Kazunori Ogata > Sent: Freitag, 17. Mai 2019 07:34 > To: hotspot-compiler-dev at openjdk.java.net; ppc-aix-port- > dev at openjdk.java.net > Subject: RFR: 8224090: [PPC64] Fix SLP patterns for filling an array with double > float literals > > Hi, > > May I get review for a webrev to fix SLP patterns that use PPC64 VSX > instructions? > > We found that SLP patterns added by JDK-8208171 [1] use incorrect data > type, so the patterns have never been used. Further, the pattern for > filling -1.0 is confused with the operation for filling -1L. > > This webrev fixes the pattern to fill an array with 0 to use 0.0d instead > of 0d, and removes the pattern to will with -1 because the bit pattern of > -1.0d is not easy to generate using a single VSX instruction. It's should > be better to load the literal from TOC and use general repl2D_reg_Ex > pattern. > > I also fixed some comments in "format %{ ... %}" to show correct matching > types. 
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8224090 > > Webrev: http://cr.openjdk.java.net/~horii/8224090/webrev.00/ > > Ref: > [1] https://bugs.openjdk.java.net/browse/JDK-8208171 > > > Regards, > Ogata From tobias.hartmann at oracle.com Tue May 28 09:27:43 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 28 May 2019 11:27:43 +0200 Subject: [13] RFR(T): 8224870: Problemlist compiler/c2/Test8004741.java until JDK-8214904 is fixed Message-ID: <6636599f-781c-bad5-eb9c-18dc3bf6f69a@oracle.com> Hi, please review the following trivial fix that problemlists Test8004741 until 8214904 is fixed: https://bugs.openjdk.java.net/browse/JDK-8224870 http://cr.openjdk.java.net/~thartmann/8224870/webrev.00/ Thanks, Tobias From OGATAK at jp.ibm.com Tue May 28 10:25:54 2019 From: OGATAK at jp.ibm.com (Kazunori Ogata) Date: Tue, 28 May 2019 19:25:54 +0900 Subject: RFR: 8224090: [PPC64] Fix SLP patterns for filling an array with double float literals In-Reply-To: References: Message-ID: Hi Martin, Thank you for your review. Regards, Ogata "Doerr, Martin" wrote on 2019/05/28 18:21:29: > From: "Doerr, Martin" > To: Kazunori Ogata , "hotspot-compiler- > dev at openjdk.java.net" , "ppc-aix- > port-dev at openjdk.java.net" > Date: 2019/05/28 18:21 > Subject: [EXTERNAL] RE: RFR: 8224090: [PPC64] Fix SLP patterns for filling > an array with double float literals > > Hi Ogata, > > looks good. Thanks for fixing. > > Best regards, > Martin > > > > -----Original Message----- > > From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Kazunori Ogata > > Sent: Freitag, 17. Mai 2019 07:34 > > To: hotspot-compiler-dev at openjdk.java.net; ppc-aix-port- > > dev at openjdk.java.net > > Subject: RFR: 8224090: [PPC64] Fix SLP patterns for filling an array with double > > float literals > > > > Hi, > > > > May I get review for a webrev to fix SLP patterns that use PPC64 VSX > > instructions? 
> > > > We found that SLP patterns added by JDK-8208171 [1] use incorrect data > > type, so the patterns have never been used. Further, the pattern for > > filling -1.0 is confused with the operation for filling -1L. > > > > This webrev fixes the pattern to fill an array with 0 to use 0.0d instead > > of 0d, and removes the pattern to will with -1 because the bit pattern of > > -1.0d is not easy to generate using a single VSX instruction. It's should > > be better to load the literal from TOC and use general repl2D_reg_Ex > > pattern. > > > > I also fixed some comments in "format %{ ... %}" to show correct matching > > types. > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8224090 > > > > Webrev: http://cr.openjdk.java.net/~horii/8224090/webrev.00/ > > > > Ref: > > [1] https://bugs.openjdk.java.net/browse/JDK-8208171 > > > > > > Regards, > > Ogata > > From rahul.v.raghavan at oracle.com Tue May 28 10:33:04 2019 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Tue, 28 May 2019 16:03:04 +0530 Subject: RFR: 8220449: serviceability/dcmd/compiler/CodelistTest.java failure Message-ID: Hi, Please review the following change based on the analysis and fix proposal as contributed by Gary Adams. - http://cr.openjdk.java.net/~rraghavan/8220449/webrev.00/ - https://bugs.openjdk.java.net/browse/JDK-8220449 'The test does attempt to disable BackgroundCompilation, but it apparently did not work as expected.' Fix is to add -XX:-BackgroundCompilation to test to eager initialize JVMCI. This case is similar to fix done in tests for JDK-8205400. 
8205400: '[Graal] DisassembleCodeBlobTest.java fails with can't be enqueued for compilation on level 4'
# https://bugs.openjdk.java.net/browse/JDK-8205400
# http://hg.openjdk.java.net/jdk/jdk/rev/ca4eea543d23

Thanks,
Rahul

From patric.hedlin at oracle.com Tue May 28 10:35:52 2019
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Tue, 28 May 2019 12:35:52 +0200
Subject: RFR(M): 8223363: Bad node estimate assertion failure
Message-ID: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com>

Dear all,

I would like to ask for help to review the following change/update:

Issue:  https://bugs.openjdk.java.net/browse/JDK-8223363
Webrev: http://cr.openjdk.java.net/~phedlin/tr8223363/

8223363: Bad node estimate assertion failure

    Tightening node budget estimates. New loop clone estimate attempting to
    approximate complex loop cases (slightly) more accurately. Still ad-hoc.

Also addressed:

Issue:  https://bugs.openjdk.java.net/browse/JDK-8223502
Issue:  https://bugs.openjdk.java.net/browse/JDK-8224648

8223502: Node estimate for loop unswitching is not correct: assert(delta <= 2 * required) failed: Bad node estimate
8224648: assert(!exceeding_node_budget()) failed: Too many NODES required! failure with ctw

Testing: hs-tier1..7, hs-precheckin-comp, Lucene (CTW)

Caveat:  Multiple failures present in several tiers.

Best regards,
Patric

From shade at redhat.com Tue May 28 10:42:01 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 28 May 2019 12:42:01 +0200
Subject: RFR(M): 8223363: Bad node estimate assertion failure
In-Reply-To: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com>
References: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com>
Message-ID: <08d8d09a-44a8-15ff-835c-363820140151@redhat.com>

On 5/28/19 12:35 PM, Patric Hedlin wrote:
> Dear all,
>
> I would like to ask for help to review the following change/update:
>
> Issue:  https://bugs.openjdk.java.net/browse/JDK-8223363
> Webrev: http://cr.openjdk.java.net/~phedlin/tr8223363/

Quick nit: there is a test attached here you can use:
https://bugs.openjdk.java.net/browse/JDK-8223502

-Aleksey

From patric.hedlin at oracle.com Tue May 28 10:57:12 2019
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Tue, 28 May 2019 12:57:12 +0200
Subject: RFR(M): 8223363: Bad node estimate assertion failure
In-Reply-To: <08d8d09a-44a8-15ff-835c-363820140151@redhat.com>
References: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com> <08d8d09a-44a8-15ff-835c-363820140151@redhat.com>
Message-ID: <82222832-8d19-3972-a98b-45f5de55685f@oracle.com>

Right Aleksey, this test has also been used/verified (and should be mentioned as well).

Best regards,
Patric

On 28/05/2019 12:42, Aleksey Shipilev wrote:
> On 5/28/19 12:35 PM, Patric Hedlin wrote:
>> Dear all,
>>
>> I would like to ask for help to review the following change/update:
>>
>> Issue:  https://bugs.openjdk.java.net/browse/JDK-8223363
>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8223363/
> Quick nit: there is a test attached here you can use:
> https://bugs.openjdk.java.net/browse/JDK-8223502
>
> -Aleksey
>

From vladimir.x.ivanov at oracle.com Tue May 28 11:40:31 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Tue, 28 May 2019 14:40:31 +0300
Subject: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64
In-Reply-To:
References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com> <3e1ceae0-f7a9-e2e6-2b06-59a22540550d@oracle.com> <3d9c0897-0275-c341-fe33-5f0b6c94f253@oracle.com> <42a8fc79-9497-b2eb-8dd9-a56e4ed85255@oracle.com>
Message-ID:

Thanks, Martin.

Updated webrev:
http://cr.openjdk.java.net/~vlivanov/8223213/webrev.02/

> Are these assertions safe?
> + assert(method()->needs_clinit_barrier(), "barrier not needed");
> + assert(method()->holder()->is_being_initialized(), "barrier not needed");
> Can it happen that initialization concurrently completes before they are evaluated?

Good point. Even though ciInstanceKlass caches initialization state of the corresponding InstanceKlass, it seems there's a possibility that the state is updated during the compilation (see ciInstanceKlass::update_if_shared). I enhanced the asserts to check that initialization has been started.

> A small suggestion for x86 TemplateTable::invokeinterface:
> It'd be nice to replace load of interface klass by your new load_method_holder.

Agree. Updated.

Best regards,
Vladimir Ivanov

From shade at redhat.com Tue May 28 12:14:59 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 28 May 2019 14:14:59 +0200
Subject: RFR(M): 8223363: Bad node estimate assertion failure
In-Reply-To: <82222832-8d19-3972-a98b-45f5de55685f@oracle.com>
References: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com> <08d8d09a-44a8-15ff-835c-363820140151@redhat.com> <82222832-8d19-3972-a98b-45f5de55685f@oracle.com>
Message-ID: <5409c3e8-ef31-c753-abf7-e9334d18f517@redhat.com>

On 5/28/19 12:57 PM, Patric Hedlin wrote:
> Right Aleksey, this test has also been used/verified (and should be mentioned as well).

What I meant is that you can use that as regression test for this patch.

-Aleksey

From shade at redhat.com Tue May 28 12:47:49 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 28 May 2019 14:47:49 +0200
Subject: [13] RFR(T): 8224870: Problemlist compiler/c2/Test8004741.java until JDK-8214904 is fixed
In-Reply-To: <6636599f-781c-bad5-eb9c-18dc3bf6f69a@oracle.com>
References: <6636599f-781c-bad5-eb9c-18dc3bf6f69a@oracle.com>
Message-ID:

On 5/28/19 11:27 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following trivial fix that problemlists Test8004741 until 8214904 is fixed:
> https://bugs.openjdk.java.net/browse/JDK-8224870
> http://cr.openjdk.java.net/~thartmann/8224870/webrev.00/

Looks trivial and good.

-Aleksey

From tobias.hartmann at oracle.com Tue May 28 12:54:17 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 28 May 2019 14:54:17 +0200
Subject: [13] RFR(T): 8224870: Problemlist compiler/c2/Test8004741.java until JDK-8214904 is fixed
In-Reply-To:
References: <6636599f-781c-bad5-eb9c-18dc3bf6f69a@oracle.com>
Message-ID: <44d7c110-50e0-f070-599d-5061a6ee356a@oracle.com>

Thanks Aleksey. Pushed.

Best regards,
Tobias

On 28.05.19 14:47, Aleksey Shipilev wrote:
> On 5/28/19 11:27 AM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following trivial fix that problemlists Test8004741 until 8214904 is fixed:
>> https://bugs.openjdk.java.net/browse/JDK-8224870
>> http://cr.openjdk.java.net/~thartmann/8224870/webrev.00/
>
> Looks trivial and good.
>
> -Aleksey
>

From per.liden at oracle.com Tue May 28 12:58:45 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 28 May 2019 14:58:45 +0200
Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC
In-Reply-To: <7be41f00-bede-c64a-2bc8-2c4b9981f309@oracle.com>
References: <7be41f00-bede-c64a-2bc8-2c4b9981f309@oracle.com>
Message-ID: <12b53de0-17a6-1178-23f4-1dabba4ef2e0@oracle.com>

Hi Nils,

We should now also be able to remove the following ugly hacks (yay!):

diff --git a/src/hotspot/share/classfile/vmSymbols.cpp b/src/hotspot/share/classfile/vmSymbols.cpp
--- a/src/hotspot/share/classfile/vmSymbols.cpp
+++ b/src/hotspot/share/classfile/vmSymbols.cpp
@@ -778,9 +778,6 @@
 #endif // COMPILER1
 #ifdef COMPILER2
   case vmIntrinsics::_clone:
-#if INCLUDE_ZGC
-    if (UseZGC) return true;
-#endif
   case vmIntrinsics::_copyOf:
   case vmIntrinsics::_copyOfRange:
     // These intrinsics use both the objectcopy and the arraycopy
diff --git a/src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp b/src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp
--- a/src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp
+++ b/src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp
@@ -462,7 +462,6 @@
   }

   bool weak = (access.decorators() & ON_WEAK_OOP_REF) != 0;
-  assert(access.is_parse_access(), "entry not supported at optimization time");
   if (p->isa_Load()) {
     p->as_Load()->set_barrier(weak);
   }
diff --git a/src/hotspot/share/runtime/stackValue.cpp b/src/hotspot/share/runtime/stackValue.cpp
--- a/src/hotspot/share/runtime/stackValue.cpp
+++ b/src/hotspot/share/runtime/stackValue.cpp
@@ -133,11 +133,6 @@
   }
 #endif
   // Deoptimization must make sure all oops have passed load barriers
-#if INCLUDE_ZGC
-  if (UseZGC) {
-    val = ZBarrier::load_barrier_on_oop_field_preloaded((oop*)value_addr, val);
-  }
-#endif
 #if INCLUDE_SHENANDOAHGC
   if (UseShenandoahGC) {
     val = ShenandoahBarrierSet::barrier_set()->load_reference_barrier(val);

cheers,
Per

On 5/24/19 11:41 AM, Nils Eliasson wrote:
> Hi Per,
>
> I removed the code and updated the webrev.
>
> Thanks,
>
> Nils
>
> On 2019-05-23 21:32, Per Liden wrote:
>> Hi Nils,
>>
>> On 2019-05-23 16:25, Nils Eliasson wrote:
>> [...]
>>> The wart that was fixup_partial_loads in zHeap has also been made
>>> redundant.
>>
>> We should also be able to remove the function, task and closure for this:
>>
>> class ZFixupPartialLoadsClosure : public ZRootsIteratorClosure {
>> public:
>>   virtual void do_oop(oop* p) {
>>     ZBarrier::mark_barrier_on_root_oop_field(p);
>>   }
>>
>>   virtual void do_oop(narrowOop* p) {
>>     ShouldNotReachHere();
>>   }
>> };
>>
>> class ZFixupPartialLoadsTask : public ZTask {
>> private:
>>   ZThreadRootsIterator _thread_roots;
>>
>> public:
>>   ZFixupPartialLoadsTask() :
>>       ZTask("ZFixupPartialLoadsTask"),
>>       _thread_roots() {}
>>
>>   virtual void work() {
>>     ZFixupPartialLoadsClosure cl;
>>     _thread_roots.oops_do(&cl);
>>   }
>> };
>>
>> void ZHeap::fixup_partial_loads() {
>>   ZFixupPartialLoadsTask task;
>>   _workers.run_parallel(&task);
>> }
>>
>> cheers,
>> Per
>>
>>> Testing:
>>>
>>> Hotspot tier 1-6, CTW, jcstress, micros, runthese, kitchensink, and
>>> then some. All with -XX:+ZVerifyViews.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8224675
>>>
>>> Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/
>>>
>>>
>>> Please review,
>>>
>>> Regards,
>>>
>>> Nils
>>>

From volker.simonis at gmail.com Tue May 28 13:04:17 2019
From: volker.simonis at gmail.com (Volker Simonis)
Date: Tue, 28 May 2019 15:04:17 +0200
Subject: RFR: 8224090: [PPC64] Fix SLP patterns for filling an array with double float literals
In-Reply-To:
References:
Message-ID:

Hi Ogata,

your change looks good, but you should use "jlong_cast()" in the
"immD_0()" operand.
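
For readers following the thread, here is a minimal sketch of why a bit-level check can matter for a "double zero" operand. This is not HotSpot code: `bits_of` is a hypothetical stand-in for `jlong_cast`, the two predicate helpers are invented for illustration, and the link to -0.0 is an assumption about the intent of the suggestion above, not something stated in the thread.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for HotSpot's jlong_cast(): reinterpret the bits of a
// double as a 64-bit integer without changing them.
static inline int64_t bits_of(double d) {
  static_assert(sizeof(int64_t) == sizeof(double), "size mismatch");
  int64_t bits;
  std::memcpy(&bits, &d, sizeof(bits));
  return bits;
}

// Value comparison: true for both +0.0 and -0.0, because IEEE 754 treats the
// two zeros as equal.
static inline bool is_zero_by_value(double d) {
  return d == 0.0;
}

// Bit comparison: true only for +0.0, whose 64 bits are all clear. -0.0 has
// the sign bit set, so it is rejected.
static inline bool is_zero_by_bits(double d) {
  return bits_of(d) == 0;
}
```

Under that assumption, a pattern that fills a vector with all-zero bits would only be correct for a literal accepted by the bit comparison; a literal of -0.0 would pass the value comparison but needs a different bit pattern.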
Thank you and best regards, Volker On Tue, May 28, 2019 at 12:26 PM Kazunori Ogata wrote: > > Hi Martin, > > Thank you for your review. > > Regards, > Ogata > > "Doerr, Martin" wrote on 2019/05/28 18:21:29: > > > From: "Doerr, Martin" > > To: Kazunori Ogata , "hotspot-compiler- > > dev at openjdk.java.net" , "ppc-aix- > > port-dev at openjdk.java.net" > > Date: 2019/05/28 18:21 > > Subject: [EXTERNAL] RE: RFR: 8224090: [PPC64] Fix SLP patterns for > filling > > an array with double float literals > > > > Hi Ogata, > > > > looks good. Thanks for fixing. > > > > Best regards, > > Martin > > > > > > > -----Original Message----- > > > From: hotspot-compiler-dev > > bounces at openjdk.java.net> On Behalf Of Kazunori Ogata > > > Sent: Freitag, 17. Mai 2019 07:34 > > > To: hotspot-compiler-dev at openjdk.java.net; ppc-aix-port- > > > dev at openjdk.java.net > > > Subject: RFR: 8224090: [PPC64] Fix SLP patterns for filling an array > with double > > > float literals > > > > > > Hi, > > > > > > May I get review for a webrev to fix SLP patterns that use PPC64 VSX > > > instructions? > > > > > > We found that SLP patterns added by JDK-8208171 [1] use incorrect data > > > type, so the patterns have never been used. Further, the pattern for > > > filling -1.0 is confused with the operation for filling -1L. > > > > > > This webrev fixes the pattern to fill an array with 0 to use 0.0d > instead > > > of 0d, and removes the pattern to will with -1 because the bit pattern > of > > > -1.0d is not easy to generate using a single VSX instruction. It's > should > > > be better to load the literal from TOC and use general repl2D_reg_Ex > > > pattern. > > > > > > I also fixed some comments in "format %{ ... %}" to show correct > matching > > > types. 
> > > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8224090 > > > > > > Webrev: http://cr.openjdk.java.net/~horii/8224090/webrev.00/ > > > > > > Ref: > > > [1] https://bugs.openjdk.java.net/browse/JDK-8208171 > > > > > > > > > Regards, > > > Ogata > > > > > > From patric.hedlin at oracle.com Tue May 28 13:07:55 2019 From: patric.hedlin at oracle.com (Patric Hedlin) Date: Tue, 28 May 2019 15:07:55 +0200 Subject: RFR(M): 8223363: Bad node estimate assertion failure In-Reply-To: <5409c3e8-ef31-c753-abf7-e9334d18f517@redhat.com> References: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com> <08d8d09a-44a8-15ff-835c-363820140151@redhat.com> <82222832-8d19-3972-a98b-45f5de55685f@oracle.com> <5409c3e8-ef31-c753-abf7-e9334d18f517@redhat.com> Message-ID: <936169ea-0e8a-540f-29e1-615243e483a5@oracle.com> Ooops, sorry. The test-case from 8223502 is here: Webrev: http://cr.openjdk.java.net/~phedlin/tr8223502/ Best regards, Patric On 28/05/2019 14:14, Aleksey Shipilev wrote: > On 5/28/19 12:57 PM, Patric Hedlin wrote: >> Right Aleksey, this test has also been used/verified (and should be mentioned as well). > What I meant is that you can use that as regression test for this patch. > > -Aleksey > From shade at redhat.com Tue May 28 13:51:44 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 28 May 2019 15:51:44 +0200 Subject: RFR(M): 8223363: Bad node estimate assertion failure In-Reply-To: <936169ea-0e8a-540f-29e1-615243e483a5@oracle.com> References: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com> <08d8d09a-44a8-15ff-835c-363820140151@redhat.com> <82222832-8d19-3972-a98b-45f5de55685f@oracle.com> <5409c3e8-ef31-c753-abf7-e9334d18f517@redhat.com> <936169ea-0e8a-540f-29e1-615243e483a5@oracle.com> Message-ID: <412dc2e7-b748-5467-8d5d-6cea5bd81957@redhat.com> On 5/28/19 3:07 PM, Patric Hedlin wrote: > Ooops, sorry. 
The test-case from 8223502 is here: > > Webrev: http://cr.openjdk.java.net/~phedlin/tr8223502/ Yes, so what prevents us from including that test in this changeset? Surely it acts like the regression tests for the fix. -Aleksey From martin.doerr at sap.com Tue May 28 14:03:44 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 28 May 2019 14:03:44 +0000 Subject: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64 In-Reply-To: References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com> <3e1ceae0-f7a9-e2e6-2b06-59a22540550d@oracle.com> <3d9c0897-0275-c341-fe33-5f0b6c94f253@oracle.com> <42a8fc79-9497-b2eb-8dd9-a56e4ed85255@oracle.com> Message-ID: Hi Vladimir, thanks for the update. Looks good. Best regards, Martin > -----Original Message----- > From: Vladimir Ivanov > Sent: Tuesday, 28 May 2019 13:41 > To: Doerr, Martin ; hotspot compiler compiler-dev at openjdk.java.net>; hotspot-runtime-dev dev at openjdk.java.net>; hotspot-dev developers dev at openjdk.java.net> > Subject: Re: [13] RFR (M): 8223213: Implement fast class initialization checks > on x86-64 > > Thanks, Martin. > > Updated webrev: > http://cr.openjdk.java.net/~vlivanov/8223213/webrev.02/ > > > Are these assertions safe? > > + assert(method()->needs_clinit_barrier(), "barrier not needed"); > > + assert(method()->holder()->is_being_initialized(), "barrier not > needed"); > > Can it happen that initialization concurrently completes before they are > evaluated? > > Good point. Even though ciInstanceKlass caches initialization state of > the corresponding InstanceKlass, it seems there's a possibility that the > state is updated during the compilation (see > ciInstanceKlass::update_if_shared). I enhanced the asserts to check that > initialization has been started.
> > > A small suggestion for x86 TemplateTable::invokeinterface: > > It'd be nice to replace load of interface klass by your new > load_method_holder. > > Agree. Updated. > > Best regards, > Vladimir Ivanov From martin.doerr at sap.com Tue May 28 14:04:15 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 28 May 2019 14:04:15 +0000 Subject: RFR: 8224090: [PPC64] Fix SLP patterns for filling an array with double float literals In-Reply-To: References: Message-ID: Right, this needs to get changed. Best regards, Martin > -----Original Message----- > From: Volker Simonis > Sent: Dienstag, 28. Mai 2019 15:04 > To: Kazunori Ogata > Cc: Doerr, Martin ; hotspot-compiler- > dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net > Subject: Re: RFR: 8224090: [PPC64] Fix SLP patterns for filling an array with > double float literals > > Hi Ogata, > > you change looks good, but you should use "jlong_cast()" in the > "immD_0()" operand. > > Thank you and best regards, > Volker > > > On Tue, May 28, 2019 at 12:26 PM Kazunori Ogata > wrote: > > > > Hi Martin, > > > > Thank you for your review. > > > > Regards, > > Ogata > > > > "Doerr, Martin" wrote on 2019/05/28 18:21:29: > > > > > From: "Doerr, Martin" > > > To: Kazunori Ogata , "hotspot-compiler- > > > dev at openjdk.java.net" , > "ppc-aix- > > > port-dev at openjdk.java.net" > > > Date: 2019/05/28 18:21 > > > Subject: [EXTERNAL] RE: RFR: 8224090: [PPC64] Fix SLP patterns for > > filling > > > an array with double float literals > > > > > > Hi Ogata, > > > > > > looks good. Thanks for fixing. > > > > > > Best regards, > > > Martin > > > > > > > > > > -----Original Message----- > > > > From: hotspot-compiler-dev > > > bounces at openjdk.java.net> On Behalf Of Kazunori Ogata > > > > Sent: Freitag, 17. 
Mai 2019 07:34 > > > > To: hotspot-compiler-dev at openjdk.java.net; ppc-aix-port- > > > > dev at openjdk.java.net > > > > Subject: RFR: 8224090: [PPC64] Fix SLP patterns for filling an array > > with double > > > > float literals > > > > > > > > Hi, > > > > > > > > May I get review for a webrev to fix SLP patterns that use PPC64 VSX > > > > instructions? > > > > > > > > We found that SLP patterns added by JDK-8208171 [1] use incorrect > data > > > > type, so the patterns have never been used. Further, the pattern for > > > > filling -1.0 is confused with the operation for filling -1L. > > > > > > > > This webrev fixes the pattern to fill an array with 0 to use 0.0d > > instead > > > > of 0d, and removes the pattern to will with -1 because the bit pattern > > of > > > > -1.0d is not easy to generate using a single VSX instruction. It's > > should > > > > be better to load the literal from TOC and use general repl2D_reg_Ex > > > > pattern. > > > > > > > > I also fixed some comments in "format %{ ... %}" to show correct > > matching > > > > types. > > > > > > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8224090 > > > > > > > > Webrev: http://cr.openjdk.java.net/~horii/8224090/webrev.00/ > > > > > > > > Ref: > > > > [1] https://bugs.openjdk.java.net/browse/JDK-8208171 > > > > > > > > > > > > Regards, > > > > Ogata > > > > > > > > > > From adinn at redhat.com Tue May 28 15:00:39 2019 From: adinn at redhat.com (Andrew Dinn) Date: Tue, 28 May 2019 16:00:39 +0100 Subject: RFR: 8207851: Implement JEP 352 In-Reply-To: <506d2bee-9376-52f7-731c-4d872c944847@oracle.com> References: <80da32b2-7acb-7b94-b82c-5dcd5cf95539@redhat.com> <506d2bee-9376-52f7-731c-4d872c944847@oracle.com> Message-ID: Hi Alan, Thank you for the review. On 26/05/2019 19:25, Alan Bateman wrote: > You may want to take a pass over the JEP to make sure that everything is > accurate. I notice, for example, the section on BufferPoolMXBean has the > old name READ_ONLY_PERSISTENT. 
We went through a couple of iterations in
> the discussion here so there might be a few others.

Thanks for spotting that. I corrected several wrong mentions of
PERSISTENT instead of SYNC. I also coerced a mention of MAP_SYNC to
render using a fixed width font. I also updated the description of the
exception signature changes to FileChannel.map to explain how they will
correlate with specific cases handled by this JEP implementation (or,
rather, cases that are explicitly not handled by the implementation).
Finally I updated the MXBean name as you suggested (see below).

> I think the API changes are okay. I don't see a CSR yet but I assume
> you'll get to that soon.

Yes, I'll raise one ASAP. Could you clarify what changes I need to
document in the CSR? Here are my current thoughts:

ManagementFactoryHelper/FileChannelImpl
I am assuming the change that exposes the new MXBean needs to be
mentioned somewhere. However, that change doesn't actually affect any
API. It just means that a new bean with a new name appears in the list
of memory beans. I don't see anything which documents those bean names.
Am I missing something? (probably :).

com/sun/nio/file/ExtendedMapMode (in jdk.unsupported)

I'm assuming the CSR needs to propose javadoc for the 2 exposed MapMode
values, explaining what these modes are used for and which exceptions
documented in FileChannel.map get thrown for the cases where their use
is unsupported by the JVM or the OS, respectively. Is that correct?

jdk/internal/misc/ExtendedMapMode (in java.base)

Do I need to provide javadoc for the two new MapMode values and include
them in the CSR? I was assuming not.

FileChannelImpl method map
The javadoc in FileChannel lists the new exceptions that might be thrown
by this implementation but does not mention any specifics to say how
they might relate to use of the XXX_SYNC MapModes.
Do I need to propose updates for the FileChannel javadoc in the CSR or am I ok to provide that detail in the doc for com/sun/nio/file/ExtendedMapMode? Unsafe method writebackMemory I was assuming Unsafe.writebackMemory is internal to the JDK so does not need a mention in the CSR. Is that correct? > I've read through the changes to java.base and jdk.unsupported. > > Just a few minor points: > > - I assume the update FileChannel.java should be dropped as it's just a > left over from when we agreed to split out the updates to the Java SE API. Yes, that is a leftover. It has been removed from the latest patch. > - com.sun.nio.file.ExtendedMapMode.*_SYNC are missing javadoc, or rather > the descriptions are truncated with "...". I think this dates from when > were working out the right place to expose these constants. The source > file (and the internal ExtendedMapModes are missing copyright headers too). Thanks, I have added javadoc comments. > - We didn't discuss the name of the buffer pool that is exposed through > the JMX/management interface. We could take inspiration from the names > of the CodeHeap spaces that are exposed with MemoryPoolMXBeans as there > is an established convention for naming there, e.g. "mapped - > 'non-volatile memory'". The JEP used the name mapped_persistent" while the code named it mapped_sync. I have changed both to use the name "mapped - 'non-volatile memory'". Does this need further discussion by other parties? Or is that a final decision? > - Minor nit in Unmapper is that the methods to increment/decrement the > usage should use Java conventions so probably should be incrementUsage > and decrementUsage. Caught me red-handed. Also fixed. > - PmemTest. This is awkward and I wonder if it should be @run > main/manual rather than @ignore. Also `@modules jdk.unsupported` would > be useful to ensure it will be skipped if run with a test JDK that > doesn't have this module. 
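Editorial aside: the two jtreg suggestions quoted above (@run main/manual instead of @ignore, plus @modules jdk.unsupported) would give a test header along the following lines. This is a hypothetical sketch, not the PmemTest from the webrev: the class name PmemTestSketch, the temporary file and the 4096-byte mapping size are assumptions, and it maps with the standard FileChannel.MapMode.READ_WRITE so that it runs on any file system. The test under review is expected to use one of the new SYNC map modes instead, which needs a file system backed by non-volatile memory and is the reason a manual run makes sense.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/*
 * @test
 * @summary Hypothetical sketch of a manually run mapped-buffer test
 * @modules jdk.unsupported
 * @run main/manual PmemTestSketch
 */
final class PmemTestSketch {

    /** Maps the file read-write, stores one byte, forces it out and reads it back. */
    static byte roundTrip(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Assumption: plain READ_WRITE stands in for the SYNC mode discussed in the thread.
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            buffer.put(0, (byte) 42);
            buffer.force(); // write the mapped region back to the underlying storage
            return buffer.get(0);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("pmem-sketch", ".dat");
        file.toFile().deleteOnExit();
        if (roundTrip(file) != 42) {
            throw new RuntimeException("unexpected value read back from mapping");
        }
        System.out.println("mapped round trip OK");
    }
}
```

With @run main/manual the test is skipped in automated runs rather than silently ignored, and the @modules line lets jtreg skip it when the test JDK lacks jdk.unsupported.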
Yes, I agree that @run main/manual is far better than @ignore and the @modules requirement is also a very good idea. I have updated both. New webrev: http://cr.openjdk.java.net/~adinn/8207851/webrev.01 regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From tobias.hartmann at oracle.com Tue May 28 15:08:39 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 28 May 2019 17:08:39 +0200 Subject: RFR: 8220449: serviceability/dcmd/compiler/CodelistTest.java failure In-Reply-To: References: Message-ID: <688b8b23-923b-41e1-069d-a7bed637a9bc@oracle.com> Hi Rahul, looks good and trivial to me. Best regards, Tobias On 28.05.19 12:33, Rahul Raghavan wrote: > Hi, > > Please review the following change based on the analysis > and fix proposal as contributed by Gary Adams. > > - http://cr.openjdk.java.net/~rraghavan/8220449/webrev.00/ > > - https://bugs.openjdk.java.net/browse/JDK-8220449 > 'The test does attempt to disable BackgroundCompilation, but it apparently did not work as expected.' > Fix is to add -XX:-BackgroundCompilation to test to eager initialize JVMCI. > > > This case is similar to fix done in tests for JDK-8205400. > 8205400: '[Graal] DisassembleCodeBlobTest.java fails with can't be enqueued for compilation on > level 4' > # https://bugs.openjdk.java.net/browse/JDK-8205400 > # http://hg.openjdk.java.net/jdk/jdk/rev/ca4eea543d23 > > > Thanks, > Rahul From tobias.hartmann at oracle.com Tue May 28 15:12:28 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 28 May 2019 17:12:28 +0200 Subject: RFR(XS): 8224538: LoadBarrierNode::common_barrier must check address In-Reply-To: References: Message-ID: Hi Nils, this looks reasonable to me but please add parentheses around the == expression.
From the failure history it looks like this failure also showed up without ZGC? Thanks, Tobias On 28.05.19 09:55, Nils Eliasson wrote: > Hi, > > This patch fixes a problem in LoadBarrierNode::common_barriers that can lead to broken IRs in > some very rare cases. The common barrier optimization tries to merge two barriers if they have the > same oop-in, and there is a site in the control flow where they can be merged. The problem is that > it doesn't check the address too, it might be pinned by checkcast nodes. > > In the failing IR the address is the same if it is traced upwards, but it isn't obvious which > checkcast nodes can be skipped. Since the common_barriers optimization is removed in the late > loadbarrier insertion patch, I chose to create a good-enough fix that can be easily backported. I > will make sure to push this fix before the late loadbarrier patch. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8224538 > > Webrev: http://cr.openjdk.java.net/~neliasso/8224538/webrev.01/ > > Regards, > > Nils > From Alan.Bateman at oracle.com Tue May 28 16:14:03 2019 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Tue, 28 May 2019 17:14:03 +0100 Subject: RFR: 8207851: Implement JEP 352 In-Reply-To: References: <80da32b2-7acb-7b94-b82c-5dcd5cf95539@redhat.com> <506d2bee-9376-52f7-731c-4d872c944847@oracle.com> Message-ID: <607e2f32-1f01-e45f-42e8-a645fa374cf7@oracle.com> On 28/05/2019 16:00, Andrew Dinn wrote: > : > Yes, I'll raise one ASAP. Could you clarify what changes I need to > document in the CSR? Here are my current thoughts: > > ManagementFactoryHelper/FileChannelImpl > I am assuming the change that exposes the new MXBean needs to be > mentioned somewhere. However, that change doesn't actually affect any > API. It just means that a new bean with a new name appears in the list > of memory beans. I don't see anything which documents those bean names. > Am I missing something? (probably :).
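Editorial aside on the question quoted above about where the bean names show up: the buffer pool names can be listed at run time through the standard java.lang.management API. The following is a minimal sketch that assumes nothing beyond that platform API; the class name BufferPoolNames is made up. The existing "direct" and "mapped" pools should appear in its output, and with the change under review an additional pool would be listed under whatever name is finally agreed in this thread.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not part of the 8207851 webrev.
final class BufferPoolNames {

    /** Returns the names of all buffer pools the running JVM exposes. */
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            result.add(pool.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            // Name, number of buffers, and an estimate of the memory they use.
            System.out.println(pool.getName()
                    + ": count=" + pool.getCount()
                    + ", memoryUsed=" + pool.getMemoryUsed());
        }
    }
}
```

A monitoring tool that matches on these strings is the kind of consumer that would notice a new or renamed pool, which is why listing the name in the CSR seems useful even though no API signature changes.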
It's not the main event but I think useful to list it in the CSR. You are right that the existing "direct" and "mapped" aren't documented anywhere but it wouldn't be too surprising to find tools that rely on them. > > com/sun/nio/file/ExtendedMapMode (in jdk.unsupported) > > I'm assuming the CSR needs to propose javadoc for the 2 exposed MapMode > values, explaining what these modes are used for and which exceptions > documented in FileChannel.map get thrown for the cases where their use > is unsupported by the JVM or the OS, respectively. Is that correct? Yes, including the javadoc for the class and the two new map modes. The javadoc for both modes can reference the UOE thrown by FileChannel.map when the mode is not supported. > > jdk/internal/misc/ExtendedMapMode (in java.base) > > Do I need to provide javadoc for the two new MapMode values and include > them in the CSR? I was assuming not. Right, it's JDK internal so no need to list that. > > FileChannelImpl method map > The javadoc in FileChannel lists the new exceptions that might be thrown > by this implementation but does not mention any specifics to say how > they might relate to use of the XXX_SYNC MapModes. Do I need to propose > updates for the FileChannel javadoc in the CSR or am I ok to provide > that detail in the doc for com/sun/nio/file/ExtendedMapMode? JDK-8221397 had the "enabling" changes so no changes to FileChannel.map, just the reference from ExtendedMapMode. > > Unsafe method writebackMemory > I was assuming Unsafe.writebackMemory is internal to the JDK so does not > need a mention in the CSR. Is that correct? That's right. -Alan From nils.eliasson at oracle.com Tue May 28 16:24:24 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 28 May 2019 18:24:24 +0200 Subject: RFR(XS): 8224538: LoadBarrierNode::common_barrier must check address In-Reply-To: References: Message-ID: I'll update the webrev. This fixes the crash in RunThese - and that was with ZGC. 
The other crashes are from sparkexamples-application. When I push this patch we need to open a new bug for that crash. Thanks, Nils On 2019-05-28 17:12, Tobias Hartmann wrote: > Hi Nils, > > this looks reasonable to me but please add parentheses around the == expression. > > From the failure history it looks like this failure also showed up without ZGC? > > Thanks, > Tobias > > On 28.05.19 09:55, Nils Eliasson wrote: >> Hi, >> >> This patch fixes a problem in LoadBarrierNode::common_barriers that can lead to broken IRs in >> some very rare cases. The common barrier optimization tries to merge two barriers if they have the >> same oop-in, and there is a site in the control flow where they can be merged. The problem is that >> it doesn't check the address too, it might be pinned by checkcast nodes. >> >> In the failing IR the address is the same if it is traced upwards, but it isn't obvious which >> checkcast nodes can be skipped. Since the common_barriers optimization is removed in the late >> loadbarrier insertion patch, I chose to create a good-enough fix that can be easily backported. I >> will make sure to push this fix before the late loadbarrier patch. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8224538 >> >> Webrev: http://cr.openjdk.java.net/~neliasso/8224538/webrev.01/ >> >> Regards, >> >> Nils >> From patric.hedlin at oracle.com Tue May 28 16:48:31 2019 From: patric.hedlin at oracle.com (Patric Hedlin) Date: Tue, 28 May 2019 18:48:31 +0200 Subject: RFR(M): 8223363: Bad node estimate assertion failure In-Reply-To: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com> References: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com> Message-ID: Missing webrev with test-case for JDK-8223502: Webrev: http://cr.openjdk.java.net/~phedlin/tr8223502/ Best regards, Patric On 2019-05-28 12:35, Patric Hedlin wrote: > Dear all, > > I would like to ask for help to review the following change/update: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8223363
> Webrev: http://cr.openjdk.java.net/~phedlin/tr8223363/ > > 8223363: Bad node estimate assertion failure > > Tightening node budget estimates. New loop clone estimate > attempting to > approximate complex loop cases (slightly) more accurately. Still > ad-hoc. > > > Also addressed: > > Issue: https://bugs.openjdk.java.net/browse/JDK-8223502 > Issue: https://bugs.openjdk.java.net/browse/JDK-8224648 > > 8223502: Node estimate for loop unswitching is not correct: > assert(delta <= 2 * required) failed: Bad node estimate > 8224648: assert(!exceeding_node_budget()) failed: Too many NODES > required! failure with ctw > > > Testing: hs-tier1..7, hs-precheckin-comp, Lucene (CTW) > > Caveat: Multiple failures present in several tiers. > > > Best regards, > Patric > From patric.hedlin at oracle.com Tue May 28 16:49:21 2019 From: patric.hedlin at oracle.com (Patric Hedlin) Date: Tue, 28 May 2019 18:49:21 +0200 Subject: RFR(M): 8223363: Bad node estimate assertion failure In-Reply-To: <412dc2e7-b748-5467-8d5d-6cea5bd81957@redhat.com> References: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com> <08d8d09a-44a8-15ff-835c-363820140151@redhat.com> <82222832-8d19-3972-a98b-45f5de55685f@oracle.com> <5409c3e8-ef31-c753-abf7-e9334d18f517@redhat.com> <936169ea-0e8a-540f-29e1-615243e483a5@oracle.com> <412dc2e7-b748-5467-8d5d-6cea5bd81957@redhat.com> Message-ID: On 2019-05-28 15:51, Aleksey Shipilev wrote: > On 5/28/19 3:07 PM, Patric Hedlin wrote: >> Ooops, sorry. The test-case from 8223502 is here: Webrev: >> http://cr.openjdk.java.net/~phedlin/tr8223502/ > Yes, so what prevents us from including that test in this changeset? > Surely it acts like the regression tests for the fix. Absolutely nothing. The error was only in my webrev generation that didn't include both bookmarks. They will be pushed "as one". (I can generate a new, single, complete, webrev in the morning if you prefer.)
Best regards, Patric From shade at redhat.com Tue May 28 16:50:33 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 28 May 2019 18:50:33 +0200 Subject: RFR(M): 8223363: Bad node estimate assertion failure In-Reply-To: References: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com> <08d8d09a-44a8-15ff-835c-363820140151@redhat.com> <82222832-8d19-3972-a98b-45f5de55685f@oracle.com> <5409c3e8-ef31-c753-abf7-e9334d18f517@redhat.com> <936169ea-0e8a-540f-29e1-615243e483a5@oracle.com> <412dc2e7-b748-5467-8d5d-6cea5bd81957@redhat.com> Message-ID: <20544c1c-12fc-520b-0cdd-30e32b6cda32@redhat.com> On 5/28/19 6:49 PM, Patric Hedlin wrote: > On 2019-05-28 15:51, Aleksey Shipilev wrote: >> On 5/28/19 3:07 PM, Patric Hedlin wrote: >>> Ooops, sorry. The test-case from 8223502 is here: Webrev: >>> http://cr.openjdk.java.net/~phedlin/tr8223502/ >> Yes, so what prevents us from including that test in this changeset? Surely it acts like the >> regression tests for the fix. > > Absolutely nothing. The error was only in my webrev generation that didn't include both bookmarks. > They will be pushed "as one". (I can generate a new, single, complete, webrev in the morning if you > prefer.) Yes, please. I prefer full webrevs to understand what exactly is being pushed. -Aleksey
From vivek.r.deshpande at intel.com Tue May 28 17:01:58 2019 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Tue, 28 May 2019 17:01:58 +0000 Subject: RFR(XS) 8224558: x86 Fix replicateB encoding In-Reply-To: References: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EBAE0@ORSMSX106.amr.corp.intel.com> <140eaec4-a0fb-cfbf-cbae-d9b5661df758@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4ECDD8@ORSMSX106.amr.corp.intel.com> <5ca63c64-eb2c-1a11-c758-462782d0933d@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4EFF40@ORSMSX106.amr.corp.intel.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4F1C1E@ORSMSX106.amr.corp.intel.com> Thanks. Pushed the patch. http://hg.openjdk.java.net/jdk/jdk/rev/d1fa0f8d8c9a Regards, Vivek -----Original Message----- From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com] Sent: Monday, May 27, 2019 4:14 AM To: Deshpande, Vivek R ; 'hotspot-compiler-dev at openjdk.java.net compiler' Cc: Viswanathan, Sandhya Subject: Re: RFR(XS) 8224558: x86 Fix replicateB encoding > http://cr.openjdk.java.net/~vdeshpande/8224558/webrev.01/ Looks good. Best regards, Vladimir Ivanov > I can push the patch, after both of you find it ok. > > Regards, > Vivek > > -----Original Message----- > From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] > Sent: Thursday, May 23, 2019 5:45 AM > To: Deshpande, Vivek R ; 'hotspot-compiler-dev at openjdk.java.net compiler' > Cc: Viswanathan, Sandhya > Subject: Re: RFR(XS) 8224558: x86 Fix replicateB encoding > > Hi Vivek, > > On 22.05.19 19:48, Deshpande, Vivek R wrote: >> I came across this issue with vector API tests. >> I tried to write a reproducer using the autovectorizer test, but it always uses the register based rule(instruct Repl32B) instead of memory based rule (instruct Repl32B_mem) and register based rule gives correct result. >> This is why it never showed up.
> > Okay, thanks for the explanation. > >> Could you please help me with forcing to use memory based rule with autovectorizer based test. > I'm afraid I can't help, I'm not even sure what the autovectorizer is. Is that specific to the vector API? > > Best regards, > Tobias > From vladimir.kozlov at oracle.com Tue May 28 17:32:16 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 28 May 2019 10:32:16 -0700 Subject: [13] RFR: 8213416: Replace some enums with static const members in hotspot/compiler In-Reply-To: <20c8f52f-7a43-a658-77ad-40c8a3d74ff1@oracle.com> References: <1f7afc19-0756-33f8-54f5-2438ed5da886@oracle.com> <8f18e15d-cae8-58eb-b4a1-870ca6ffaf15@oracle.com> <62869e18-3deb-435d-1ce8-7726866d79eb@oracle.com> <8b22fc8b-af06-31a2-4033-4984ac4fcb5d@oracle.com> <20c8f52f-7a43-a658-77ad-40c8a3d74ff1@oracle.com> Message-ID: <0A00046F-0454-4469-B348-B687B1B96989@oracle.com> Yes, I like this version of fix. Reviewed. Thanks Vladimir > On May 27, 2019, at 2:03 AM, Tobias Hartmann wrote: > > Hi Rahul, > >> On 24.05.19 22:17, Rahul Raghavan wrote: >> Request one more review approval for latest >> http://cr.openjdk.java.net/~rraghavan/8213416/webrev.01/ > > Looks good to me. > > Best regards, > Tobias From dean.long at oracle.com Wed May 29 00:31:14 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 28 May 2019 17:31:14 -0700 Subject: RFR(XS) 8224931: disable JAOTC invokedynamic support until 8223533 is fixed Message-ID: https://bugs.openjdk.java.net/browse/JDK-8224931 http://cr.openjdk.java.net/~dlong/8224931/webrev/ Disable JAOTC invokedynamic support temporarily. Register a plugin that always returns false for supportsDynamicInvoke(), which causes the parser to insert deoptimize nodes.
dl From vladimir.kozlov at oracle.com Wed May 29 00:58:44 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 28 May 2019 17:58:44 -0700 Subject: RFR(XS) 8224931: disable JAOTC invokedynamic support until 8223533 is fixed In-Reply-To: References: Message-ID: <7100BBCD-EE5A-472C-A9CB-F16196C2B038@oracle.com> Good. Thanks Vladimir > On May 28, 2019, at 5:31 PM, dean.long at oracle.com wrote: > > https://bugs.openjdk.java.net/browse/JDK-8224931 > http://cr.openjdk.java.net/~dlong/8224931/webrev/ > > Disable JAOTC invokedynamic support temporarily. Register a plugin that always returns false for supportsDynamicInvoke(), which causes the parser to insert deoptimize nodes. > > dl From rahul.v.raghavan at oracle.com Wed May 29 02:23:35 2019 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Wed, 29 May 2019 07:53:35 +0530 Subject: RFR: 8220449: serviceability/dcmd/compiler/CodelistTest.java failure In-Reply-To: <688b8b23-923b-41e1-069d-a7bed637a9bc@oracle.com> References: <688b8b23-923b-41e1-069d-a7bed637a9bc@oracle.com> Message-ID: <182fb5e4-eb8b-a16a-17d2-2fbb87c0dcc9@oracle.com> Thank you Tobias. On 28/05/19 8:38 PM, Tobias Hartmann wrote: > Hi Rahul, > > looks good and trivial to me. > > Best regards, > Tobias From OGATAK at jp.ibm.com Wed May 29 05:16:11 2019 From: OGATAK at jp.ibm.com (Kazunori Ogata) Date: Wed, 29 May 2019 14:16:11 +0900 Subject: RFR: 8224090: [PPC64] Fix SLP patterns for filling an array with double float literals In-Reply-To: References: Message-ID: Hi Volker and Martin, Thank you for your comment. I updated the webrev. 
http://cr.openjdk.java.net/~horii/8224090/webrev.01/ Regards, Ogata "Doerr, Martin" wrote on 2019/05/28 23:04:15: > From: "Doerr, Martin" > To: Volker Simonis , Kazunori Ogata > Cc: "hotspot-compiler-dev at openjdk.java.net" dev at openjdk.java.net>, "ppc-aix-port-dev at openjdk.java.net" dev at openjdk.java.net> > Date: 2019/05/28 23:08 > Subject: RE: RFR: 8224090: [PPC64] Fix SLP patterns for filling > an array with double float literals > > Right, this needs to get changed. > > Best regards, > Martin > > > -----Original Message----- > > From: Volker Simonis > > Sent: Dienstag, 28. Mai 2019 15:04 > > To: Kazunori Ogata > > Cc: Doerr, Martin ; hotspot-compiler- > > dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net > > Subject: Re: RFR: 8224090: [PPC64] Fix SLP patterns for filling an array with > > double float literals > > > > Hi Ogata, > > > > you change looks good, but you should use "jlong_cast()" in the > > "immD_0()" operand. > > > > Thank you and best regards, > > Volker > > > > > > On Tue, May 28, 2019 at 12:26 PM Kazunori Ogata > > wrote: > > > > > > Hi Martin, > > > > > > Thank you for your review. > > > > > > Regards, > > > Ogata > > > > > > "Doerr, Martin" wrote on 2019/05/28 18:21:29: > > > > > > > From: "Doerr, Martin" > > > > To: Kazunori Ogata , "hotspot-compiler- > > > > dev at openjdk.java.net" , > > "ppc-aix- > > > > port-dev at openjdk.java.net" > > > > Date: 2019/05/28 18:21 > > > > Subject: [EXTERNAL] RE: RFR: 8224090: [PPC64] Fix SLP patterns for > > > filling > > > > an array with double float literals > > > > > > > > Hi Ogata, > > > > > > > > looks good. Thanks for fixing. > > > > > > > > Best regards, > > > > Martin > > > > > > > > > > > > > -----Original Message----- > > > > > From: hotspot-compiler-dev > > > > bounces at openjdk.java.net> On Behalf Of Kazunori Ogata > > > > > Sent: Freitag, 17. 
Mai 2019 07:34 > > > > > To: hotspot-compiler-dev at openjdk.java.net; ppc-aix-port- > > > > > dev at openjdk.java.net > > > > > Subject: RFR: 8224090: [PPC64] Fix SLP patterns for filling an array > > > with double > > > > > float literals > > > > > > > > > > Hi, > > > > > > > > > > May I get review for a webrev to fix SLP patterns that use PPC64 VSX > > > > > instructions? > > > > > > > > > > We found that SLP patterns added by JDK-8208171 [1] use incorrect > > data > > > > > type, so the patterns have never been used. Further, the pattern for > > > > > filling -1.0 is confused with the operation for filling -1L. > > > > > > > > > > This webrev fixes the pattern to fill an array with 0 to use 0.0d > > > instead > > > > > of 0d, and removes the pattern to will with -1 because the bit pattern > > > of > > > > > -1.0d is not easy to generate using a single VSX instruction. It's > > > should > > > > > be better to load the literal from TOC and use general repl2D_reg_Ex > > > > > pattern. > > > > > > > > > > I also fixed some comments in "format %{ ... %}" to show correct > > > matching > > > > > types. > > > > > > > > > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8224090 > > > > > > > > > > Webrev: http://cr.openjdk.java.net/~horii/8224090/webrev.00/ > > > > > > > > > > Ref: > > > > > [1] https://bugs.openjdk.java.net/browse/JDK-8208171 > > > > > > > > > > > > > > > Regards, > > > > > Ogata > > > > > > > > > > > > > > > From dean.long at oracle.com Wed May 29 05:28:32 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 28 May 2019 22:28:32 -0700 Subject: RFR(XS) 8224931: disable JAOTC invokedynamic support until 8223533 is fixed In-Reply-To: <7100BBCD-EE5A-472C-A9CB-F16196C2B038@oracle.com> References: <7100BBCD-EE5A-472C-A9CB-F16196C2B038@oracle.com> Message-ID: Thanks Vladimir. dl On 5/28/19 5:58 PM, Vladimir Kozlov wrote: > Good. 
> > Thanks > Vladimir > >> On May 28, 2019, at 5:31 PM, dean.long at oracle.com wrote: >> >> https://bugs.openjdk.java.net/browse/JDK-8224931 >> http://cr.openjdk.java.net/~dlong/8224931/webrev/ >> >> Disable JAOTC invokedynamic support temporarily. Register a plugin that always returns false for supportsDynamicInvoke(), which causes the parser to insert deoptimize nodes. >> >> dl From igor.ignatyev at oracle.com Wed May 29 07:10:30 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 29 May 2019 00:10:30 -0700 Subject: RFR(T/S) : 8224945 : googlemock update breaks the build of arm32 Message-ID: http://cr.openjdk.java.net/~iignatyev//8224945/webrev.00/index.html > 10 lines changed: 10 ins; 0 del; 0 mod; Hi all, could you please review this small and trivial patch which undefines R, F1 and F2 macros in unittest.hpp so they won't conflict w/ typenames used in gmock? testing: extensive build testing (which found JDK-8224949 -- an unrelated breakage on linux-x86) webrev: http://cr.openjdk.java.net/~iignatyev//8224945/webrev.00/index.html JBS: https://bugs.openjdk.java.net/browse/JDK-8224945 Thanks, -- Igor From david.holmes at oracle.com Wed May 29 07:35:24 2019 From: david.holmes at oracle.com (David Holmes) Date: Wed, 29 May 2019 17:35:24 +1000 Subject: RFR(T/S) : 8224945 : googlemock update breaks the build of arm32 In-Reply-To: References: Message-ID: <08bbcfde-33c5-1f07-6597-9db588a09452@oracle.com> Hi Igor, This looks good to me. Thanks for fixing so promptly. David On 29/05/2019 5:10 pm, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8224945/webrev.00/index.html >> 10 lines changed: 10 ins; 0 del; 0 mod; > > > Hi all, > > could you please review this small and trivial patch which undefines R, F1 and F2 macros in unittest.hpp so they won't conflict w/ typenames used in gmock? 
> > testing: extensive build testing (which found JDK-8224949 -- an unrelated breakage on linux-x86) > webrev: http://cr.openjdk.java.net/~iignatyev//8224945/webrev.00/index.html > JBS: https://bugs.openjdk.java.net/browse/JDK-8224945 > > Thanks, > -- Igor > From rwestrel at redhat.com Wed May 29 07:48:47 2019 From: rwestrel at redhat.com (Roland Westrelin) Date: Wed, 29 May 2019 09:48:47 +0200 Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC In-Reply-To: References: Message-ID: <87a7f5bt0g.fsf@redhat.com> Hi Nils, > Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/ zBarrierSetC2.cpp: typo loadbarrers line 756 lcm.cpp: void PhaseCFG:: call_catch_cleanup(Block* block) { space after ::? loopnode.cpp: Node *u I thought the usually recommended style was: Node* u loopnode.cpp: Do we really need a new entry in the gc api for barrier_insertion() couldn't this go under optimize_loops()? memnode.hpp: 168 enum LoadBarrier { 169 UnProcessed = 0, 170 RequireBarrier = 1, 171 WeakBarrier = 3, // Inclusive with RequireBarrier 172 ExpandedBarrier = 4 173 }; Shouldn't that be abstracted away through the gc api somehow? zBarrierSetC2.cpp: typo (witch): 981 // For each load use, see witch catch projs dominates, create load clone lazily and reconnect In fixup_uses_in_catch() wouldn't you need to deal with phis the way you do in call_catch_cleanup_one(): 1019 // We found no single catchproj that dominated the use - The use is at a point after 1020 // where control flow from multiple catch projs have merged. We will have to create 1021 // phi nodes before the use and tie the output from the cloned loads together. It 1022 // can be a single phi or a number of chained phis, depending on control flow Is there a chance that a load would be processed by fixup_uses_in_catch()? I see call_catch_cleanup_one() is where they are expected to be handled but you only go there if load->is_barrier_required() is true. 
So could you have a load for which is_barrier_required() is true have a use for which is_barrier_required() is not true and both of them be in the catch block? Roland. From patric.hedlin at oracle.com Wed May 29 07:48:23 2019 From: patric.hedlin at oracle.com (Patric Hedlin) Date: Wed, 29 May 2019 09:48:23 +0200 Subject: RFR(M): 8223363: Bad node estimate assertion failure In-Reply-To: <20544c1c-12fc-520b-0cdd-30e32b6cda32@redhat.com> References: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com> <08d8d09a-44a8-15ff-835c-363820140151@redhat.com> <82222832-8d19-3972-a98b-45f5de55685f@oracle.com> <5409c3e8-ef31-c753-abf7-e9334d18f517@redhat.com> <936169ea-0e8a-540f-29e1-615243e483a5@oracle.com> <412dc2e7-b748-5467-8d5d-6cea5bd81957@redhat.com> <20544c1c-12fc-520b-0cdd-30e32b6cda32@redhat.com> Message-ID: <7a2fb3a4-a68f-d98c-a39c-02cc9cd48e1a@oracle.com> Updated webrev: http://cr.openjdk.java.net/~phedlin/tr8223363/ Now also including the testcase for/from 8223502. Best regards, Patric On 28/05/2019 18:50, Aleksey Shipilev wrote: > On 5/28/19 6:49 PM, Patric Hedlin wrote: >> On 2019-05-28 15:51, Aleksey Shipilev wrote: >>> On 5/28/19 3:07 PM, Patric Hedlin wrote: >>>> Ooops, sorry. The test-case from 8223502 is here: Webrev: >>>> http://cr.openjdk.java.net/~phedlin/tr8223502/ >>> Yes, so what prevents us from including that test in this changeset? Surely it acts like the >>> regression tests for the fix. >> Absolutely nothing. The error was only in my webrev generation that didn't include both bookmarks. >> They will be pushed "as one". (I can generate a new, single, complete, webrev in the morning if you >> prefer.) > Yes, please. I prefer full webrevs to understand what exactly is being pushed. 
>
> -Aleksey
>

From tobias.hartmann at oracle.com Wed May 29 07:54:46 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 29 May 2019 09:54:46 +0200
Subject: RFR(XS): 8224538: LoadBarrierNode::common_barrier must check address
In-Reply-To:
References:
Message-ID:

Hi Nils,

On 28.05.19 18:24, Nils Eliasson wrote:
> I'll update the webrev.

Thanks.

> This fixes the crash in RunThese - and that was with ZGC. The other crashes are from
> sparkexamples-application.
>
> When I push this patch we need to open a new bug for that crash.

Okay, I've re-added the ZGC label and filed JDK-8224957 for the non-ZGC related failures.

Best regards,
Tobias

From nils.eliasson at oracle.com Wed May 29 07:56:15 2019
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Wed, 29 May 2019 09:56:15 +0200
Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC
In-Reply-To: <12b53de0-17a6-1178-23f4-1dabba4ef2e0@oracle.com>
References: <7be41f00-bede-c64a-2bc8-2c4b9981f309@oracle.com> <12b53de0-17a6-1178-23f4-1dabba4ef2e0@oracle.com>
Message-ID:

Thanks Per,

I put a new webrev here:

http://cr.openjdk.java.net/~neliasso/8224675/webrev.02/

/ Nils

On 2019-05-28 14:58, Per Liden wrote:
> Hi Nils,
>
> We should now also be able to remove the following ugly hacks (yay!):
>
> diff --git a/src/hotspot/share/classfile/vmSymbols.cpp b/src/hotspot/share/classfile/vmSymbols.cpp
> --- a/src/hotspot/share/classfile/vmSymbols.cpp
> +++ b/src/hotspot/share/classfile/vmSymbols.cpp
> @@ -778,9 +778,6 @@
>  #endif // COMPILER1
>  #ifdef COMPILER2
>    case vmIntrinsics::_clone:
> -#if INCLUDE_ZGC
> -    if (UseZGC) return true;
> -#endif
>    case vmIntrinsics::_copyOf:
>    case vmIntrinsics::_copyOfRange:
>      // These intrinsics use both the objectcopy and the arraycopy
> diff --git a/src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp b/src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp
> --- a/src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp
> +++ b/src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp
> @@ -462,7 +462,6 @@
>    }
>
>    bool weak = (access.decorators() & ON_WEAK_OOP_REF) != 0;
> -  assert(access.is_parse_access(), "entry not supported at optimization time");
>    if (p->isa_Load()) {
>      p->as_Load()->set_barrier(weak);
>    }
> diff --git a/src/hotspot/share/runtime/stackValue.cpp b/src/hotspot/share/runtime/stackValue.cpp
> --- a/src/hotspot/share/runtime/stackValue.cpp
> +++ b/src/hotspot/share/runtime/stackValue.cpp
> @@ -133,11 +133,6 @@
>        }
>  #endif
>        // Deoptimization must make sure all oops have passed load barriers
> -#if INCLUDE_ZGC
> -      if (UseZGC) {
> -        val = ZBarrier::load_barrier_on_oop_field_preloaded((oop*)value_addr, val);
> -      }
> -#endif
>  #if INCLUDE_SHENANDOAHGC
>        if (UseShenandoahGC) {
>          val = ShenandoahBarrierSet::barrier_set()->load_reference_barrier(val);
>
> cheers,
> Per
>
> On 5/24/19 11:41 AM, Nils Eliasson wrote:
>> Hi Per,
>>
>> I removed the code and updated the webrev.
>>
>> Thanks,
>>
>> Nils
>>
>> On 2019-05-23 21:32, Per Liden wrote:
>>> Hi Nils,
>>>
>>> On 2019-05-23 16:25, Nils Eliasson wrote:
>>> [...]
>>>> The wart that was fixup_partial_loads in zHeap has also been made
>>>> redundant.
>>>
>>> We should also be able to remove the function, task and closure for
>>> this:
>>>
>>>  327 class ZFixupPartialLoadsClosure : public ZRootsIteratorClosure {
>>>  328 public:
>>>  329   virtual void do_oop(oop* p) {
>>>  330     ZBarrier::mark_barrier_on_root_oop_field(p);
>>>  331   }
>>>  332
>>>  333   virtual void do_oop(narrowOop* p) {
>>>  334     ShouldNotReachHere();
>>>  335   }
>>>  336 };
>>>  337
>>>  338 class ZFixupPartialLoadsTask : public ZTask {
>>>  339 private:
>>>  340   ZThreadRootsIterator _thread_roots;
>>>  341
>>>  342 public:
>>>  343   ZFixupPartialLoadsTask() :
>>>  344       ZTask("ZFixupPartialLoadsTask"),
>>>  345       _thread_roots() {}
>>>  346
>>>  347   virtual void work() {
>>>  348     ZFixupPartialLoadsClosure cl;
>>>  349     _thread_roots.oops_do(&cl);
>>>  350   }
>>>  351 };
>>>  352
>>>  353 void ZHeap::fixup_partial_loads() {
>>>  354   ZFixupPartialLoadsTask task;
>>>  355   _workers.run_parallel(&task);
>>>  356 }
>>>
>>> cheers,
>>> Per
>>>
>>>> Testing:
>>>>
>>>> Hotspot tier 1-6, CTW, jcstress, micros, runthese, kitchensink, and
>>>> then some. All with -XX:+ZVerifyViews.
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8224675
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/
>>>>
>>>>
>>>> Please review,
>>>>
>>>> Regards,
>>>>
>>>> Nils
>>>>

From volker.simonis at gmail.com Wed May 29 08:23:21 2019
From: volker.simonis at gmail.com (Volker Simonis)
Date: Wed, 29 May 2019 10:23:21 +0200
Subject: RFR(T/S) : 8224945 : googlemock update breaks the build of arm32
In-Reply-To:
References:
Message-ID:

Hi Igor,

thanks for considering ppc in your fix as well.

Your changes are definitely required on ppc but I'm not sure if that's
enough. I'm currently running a build to verify that. I'll let you
know in an hour or so.

Best regards,
Volker

On Wed, May 29, 2019 at 9:11 AM Igor Ignatyev wrote:
>
> http://cr.openjdk.java.net/~iignatyev//8224945/webrev.00/index.html
> > 10 lines changed: 10 ins; 0 del; 0 mod;
>
>
> Hi all,
>
> could you please review this small and trivial patch which undefines R, F1 and F2 macros in unittest.hpp so they won't conflict w/ typenames used in gmock?
>
> testing: extensive build testing (which found JDK-8224949 -- an unrelated breakage on linux-x86)
> webrev: http://cr.openjdk.java.net/~iignatyev//8224945/webrev.00/index.html
> JBS: https://bugs.openjdk.java.net/browse/JDK-8224945
>
> Thanks,
> -- Igor

From volker.simonis at gmail.com Wed May 29 08:54:14 2019
From: volker.simonis at gmail.com (Volker Simonis)
Date: Wed, 29 May 2019 10:54:14 +0200
Subject: RFR(T/S) : 8224945 : googlemock update breaks the build of arm32
In-Reply-To:
References:
Message-ID:

Looks good!

With your change I can successfully build ppc64 platforms again!

Thanks and thumbs up from me!

On Wed, May 29, 2019 at 10:23 AM Volker Simonis wrote:
>
> Hi Igor,
>
> thanks for considering ppc in your fix as well.
>
> Your changes are definitely required on ppc but I'm not sure if that's
> enough. I'm currently running a build to verify that. I'll let you
> know in an hour or so.
>
> Best regards,
> Volker
>
> On Wed, May 29, 2019 at 9:11 AM Igor Ignatyev wrote:
> >
> > http://cr.openjdk.java.net/~iignatyev//8224945/webrev.00/index.html
> > > 10 lines changed: 10 ins; 0 del; 0 mod;
> >
> >
> > Hi all,
> >
> > could you please review this small and trivial patch which undefines R, F1 and F2 macros in unittest.hpp so they won't conflict w/ typenames used in gmock?
> >
> > testing: extensive build testing (which found JDK-8224949 -- an unrelated breakage on linux-x86)
> > webrev: http://cr.openjdk.java.net/~iignatyev//8224945/webrev.00/index.html
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8224945
> >
> > Thanks,
> > -- Igor

From adam.farley at uk.ibm.com Wed May 29 09:45:05 2019
From: adam.farley at uk.ibm.com (Adam Farley8)
Date: Wed, 29 May 2019 10:45:05 +0100
Subject: RFR: JDK-8224963: Char-Byte Performance Enhancement
Message-ID:

Hi All,

Could someone familiar with the Hotspot JIT please review and opine on the below?
The Char-Byte encoding/decoding methods inside some of the sun.nio.cs classes (such as US_ASCII) see a lot of use, and OpenJDK on the OpenJ9 VM seems to do this a lot faster. Is it possible to achieve a similar improvement on OpenJDK on Hotspot by tweaking the CL code to match Hotspot JIT compiler idioms, or by introducing a method name for the HS JIT to match on? An example of these changes to US_ASCII.java is linked below. No OpenJ9 code is included in the work item or the webrev, to avoid contamination. Work item: https://bugs.openjdk.java.net/browse/JDK-8224963 Example Webrev: http://cr.openjdk.java.net/~afarley/8224963/webrev/ Best Regards Adam Farley IBM Runtimes Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From rkennke at redhat.com Wed May 29 10:18:42 2019 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 29 May 2019 12:18:42 +0200 Subject: RFR(S): 8224580: Matcher can cause oop field/array element to be reloaded In-Reply-To: <87imtwaz43.fsf@redhat.com> References: <877eailvgp.fsf@redhat.com> <70fedac3-59a2-e077-4de0-af6f6604dc16@redhat.com> <87imtwaz43.fsf@redhat.com> Message-ID: <214cc3ee-7b10-d88d-f154-54a940a7a56d@redhat.com> >> The change looks reasonable to me. >> >> I have run tests with the patch and can confirm that the original bug >> went away. I've also run a bunch of other tests and workloads and looks >> good too. > > Thanks Roman. Anyone else for this? > Shenandoah needs this for a key change that's also out for review: > > https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-May/025931.html > (8224584: Shenandoah: Eliminate forwarding pointer word) FWIW, I've pushed the "8224584: Shenandoah: Eliminate forwarding pointer word" change, in order to unblock our work. 
It is really important to fix this nasty bug though. Can we please get another review from C2/compiler folks? Thanks, Roman -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From nils.eliasson at oracle.com Wed May 29 12:00:23 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 29 May 2019 14:00:23 +0200 Subject: RFR(M): 8223363: Bad node estimate assertion failure In-Reply-To: <7a2fb3a4-a68f-d98c-a39c-02cc9cd48e1a@oracle.com> References: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com> <08d8d09a-44a8-15ff-835c-363820140151@redhat.com> <82222832-8d19-3972-a98b-45f5de55685f@oracle.com> <5409c3e8-ef31-c753-abf7-e9334d18f517@redhat.com> <936169ea-0e8a-540f-29e1-615243e483a5@oracle.com> <412dc2e7-b748-5467-8d5d-6cea5bd81957@redhat.com> <20544c1c-12fc-520b-0cdd-30e32b6cda32@redhat.com> <7a2fb3a4-a68f-d98c-a39c-02cc9cd48e1a@oracle.com> Message-ID: Looks good, // Nils On 2019-05-29 09:48, Patric Hedlin wrote: > Updated webrev: http://cr.openjdk.java.net/~phedlin/tr8223363/ > > Now also including the testcase for/from 8223502. > > Best regards, > Patric > > On 28/05/2019 18:50, Aleksey Shipilev wrote: >> On 5/28/19 6:49 PM, Patric Hedlin wrote: >>> On 2019-05-28 15:51, Aleksey Shipilev wrote: >>>> On 5/28/19 3:07 PM, Patric Hedlin wrote: >>>>> Ooops, sorry. The test-case from 8223502 is here: Webrev: >>>>> http://cr.openjdk.java.net/~phedlin/tr8223502/ >>>> Yes, so what prevents us from including that test in this >>>> changeset? Surely it acts like the >>>> regression tests for the fix. >>> Absolutely nothing. The error was only in my webrev generation that >>> didn't include both bookmarks. >>> They will be pushed "as one". (I can generate a new, single, >>> complete, webrev in the morning if you >>> prefer.) >> Yes, please. I prefer full webrevs to understand what exactly is >> being pushed. 
>>
>> -Aleksey
>>
>

From vladimir.x.ivanov at oracle.com Wed May 29 12:02:20 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 29 May 2019 15:02:20 +0300
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To: <38b00331-34f7-b4a8-f033-f1489a154806@loongson.cn>
References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com> <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn> <259a914e-1c9c-c884-6114-6f855a96afb6@loongson.cn> <1060f01d-dcfa-3a04-284d-1c6a95c791fc@oracle.com> <4c2da2fb-7550-d51b-539a-4656fc67bb00@oracle.com> <38b00331-34f7-b4a8-f033-f1489a154806@loongson.cn>
Message-ID: <0669b1e3-5258-7765-aac8-8d3e5c47066c@oracle.com>

Thanks, Jie.

> http://cr.openjdk.java.net/~jiefu/8224162/webrev.05/

src/hotspot/share/ci/ciMethod.cpp:

     if (count >= 0) {
       count += receivers_count_total;
+      // Check and handle the overflow condition
+      if (count < 0) {
+        count = max_jint;
+      }
     }

There are other places where overflow can occur:

   int rcount = call->receiver_count(i) + epsilon;
   receivers_count_total += rcount;

Maybe introduce a helper routine for saturating addition?

src/hotspot/share/oops/methodData.hpp:

   uint count() const {
-    return uint_at(count_off);
+    intptr_t raw_data = intptr_at(count_off);
+    if (raw_data > max_jint) {
+      raw_data = max_jint;
+    } else if (raw_data < min_jint) {
+      raw_data = min_jint;
+    }
+    return uint(raw_data);
   }

Should uintptr_t be used here instead? It looks like this version won't work as expected on 32-bit platforms (since uint => intptr_t conversion becomes lossy).

Also, I don't understand why you added comparison against min_jint. Since counts are unsigned, the following should be enough:

   uint count() const {
     uintx raw_data = (uintx)intptr_at(count_off);
     if (raw_data > max_jint) {
       raw_data = max_jint;
     }
     return uint(raw_data);
   }

Best regards,
Vladimir Ivanov

> On 2019/5/23 6:21, Vladimir Ivanov wrote:
>> I'm still in favor of fixing the root cause than putting band-aids in
>> otherwise perfectly valid code.
>>
>> I don't consider fixing CounterData::count() and its usages to
>> properly handle overflow as overly complicated. There's a limited
>> number of usages and they don't properly handle overflow as well. So,
>> fixing the bug is highly desireable even though it has been left
>> unnoticed for a long time.
>

From nils.eliasson at oracle.com Wed May 29 12:18:27 2019
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Wed, 29 May 2019 14:18:27 +0200
Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC
In-Reply-To: <87a7f5bt0g.fsf@redhat.com>
References: <87a7f5bt0g.fsf@redhat.com>
Message-ID: <87a3f42d-94ff-b588-85ea-e8bbf07cc2df@oracle.com>

Hi Roland,

On 2019-05-29 09:48, Roland Westrelin wrote:
> Hi Nils,
>
>> Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/
> zBarrierSetC2.cpp: typo loadbarrers line 756
Fixed
> lcm.cpp:
>
> void PhaseCFG:: call_catch_cleanup(Block* block) {
>
> space after ::?
Fixed
>
> loopnode.cpp:
>
> Node *u
>
> I thought the usually recommended style was:
>
> Node* u
Yes.
> loopnode.cpp:
>
> Do we really need a new entry in the gc api for barrier_insertion()
> couldn't this go under optimize_loops()?

It could. There is only remove_range_check_casts in between right now. I choose to have it as its own for separation, in a similar way to how LoopOptsNone are used.

> memnode.hpp:
>
>   168   enum LoadBarrier {
>   169     UnProcessed     = 0,
>   170     RequireBarrier  = 1,
>   171     WeakBarrier     = 3,  // Inclusive with RequireBarrier
>   172     ExpandedBarrier = 4
>   173   };
>
> Shouldn't that be abstracted away through the gc api somehow?

Yes.
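
The "Inclusive with RequireBarrier" comment in the enum quoted above follows from the values themselves: 3 is binary 11, so it contains the RequireBarrier bit (binary 01). A minimal standalone sketch of that encoding, assuming the values are meant to be tested as bit masks; the two helper functions are hypothetical names used only for illustration and are not taken from the webrev:

```cpp
#include <cassert>

// Values copied from the memnode.hpp enum quoted in the review.
enum LoadBarrier {
  UnProcessed     = 0,
  RequireBarrier  = 1,
  WeakBarrier     = 3,  // Inclusive with RequireBarrier
  ExpandedBarrier = 4
};

// Hypothetical helper (not from the webrev): true if the RequireBarrier
// bit is set, which holds for both RequireBarrier and WeakBarrier.
inline bool requires_barrier(int state) {
  return (state & RequireBarrier) != 0;
}

// Hypothetical helper (not from the webrev): true only if both bits of
// WeakBarrier are set.
inline bool is_weak_barrier(int state) {
  return (state & WeakBarrier) == WeakBarrier;
}
```

Under this reading a weak load answers yes to both checks, while a plain RequireBarrier load answers yes only to the first.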
I would have preferred using the node-flags, but they are all taken unless we expand it to 32 bits, or overload the flags that are only used during codegen and later. That would require some verification of the flag use to catch any mistakes. > > zBarrierSetC2.cpp: > > typo (witch): > > 981 // For each load use, see witch catch projs dominates, create load clone lazily and reconnect Fixed > > In fixup_uses_in_catch() wouldn't you need to deal with phis the way you > do in call_catch_cleanup_one(): > > 1019 // We found no single catchproj that dominated the use - The use is at a point after > 1020 // where control flow from multiple catch projs have merged. We will have to create > 1021 // phi nodes before the use and tie the output from the cloned loads together. It > 1022 // can be a single phi or a number of chained phis, depending on control flow No - it's a two step process. There is a load hanging off a call with multiple catch projs. (There can be no other control-flow in between.) The load have uses, either (1) in the catch blocks, or (2) in some cases in the call block, hanging off the load. fixup_uses_in_catch responsibility is to turn (2) into (1). It clones all uses of the load, in the call block, out to the catch blocks. This is done recursively. There can never be any phis here. > Is there a chance that a load would be processed by > fixup_uses_in_catch()? I see call_catch_cleanup_one() is where they are > expected to be handled but you only go there if > load->is_barrier_required() is true. So could you have a load for which > is_barrier_required() is true have a use for which is_barrier_required() > is not true and both of them be in the catch block? In theory yes. If the load without a barrier is a use of the load with a barrier, it would be processed first. When the load with a barrier is processed, it would clone the load without a barrier, like any other node. Another case is two loads that both require a barrier. 
One would be processed first, that one can't have the other one as a use, because the loads are processed in post order. And when the other load is processed, the first has already been cloned out. Thanks for the feedback! I'll get back with a new webrev. // Nils > > Roland. From nils.eliasson at oracle.com Wed May 29 12:21:34 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 29 May 2019 14:21:34 +0200 Subject: RFR(S): 8224580: Matcher can cause oop field/array element to be reloaded In-Reply-To: <214cc3ee-7b10-d88d-f154-54a940a7a56d@redhat.com> References: <877eailvgp.fsf@redhat.com> <70fedac3-59a2-e077-4de0-af6f6604dc16@redhat.com> <87imtwaz43.fsf@redhat.com> <214cc3ee-7b10-d88d-f154-54a940a7a56d@redhat.com> Message-ID: <18e8ca11-3b1b-df57-4fad-52160b4c88b8@oracle.com> Looks good! Reviewed, / Nils On 2019-05-29 12:18, Roman Kennke wrote: >>> The change looks reasonable to me. >>> >>> I have run tests with the patch and can confirm that the original bug >>> went away. I've also run a bunch of other tests and workloads and looks >>> good too. >> Thanks Roman. Anyone else for this? >> Shenandoah needs this for a key change that's also out for review: >> >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-May/025931.html >> (8224584: Shenandoah: Eliminate forwarding pointer word) > FWIW, I've pushed the "8224584: Shenandoah: Eliminate forwarding pointer > word" change, in order to unblock our work. It is really important to > fix this nasty bug though. Can we please get another review from > C2/compiler folks? > > Thanks, > Roman > From vladimir.x.ivanov at oracle.com Wed May 29 12:22:27 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 29 May 2019 15:22:27 +0300 Subject: RFR: JDK-8224963: Char-Byte Performance Enhancement In-Reply-To: References: Message-ID: <68b970be-e551-577d-0ca1-28c16880a2ff@oracle.com> Hi Adam, The bug mentions ~6x improvement in throughput. 
Are there have any microbenchmarks you can share which demonstrate that? That would greatly simplify the analysis of changes you propose. Also, if you can elaborate on what optimization opportunities C2 misses in original code, please, do. Best regards, Vladimir Ivanov On 29/05/2019 12:45, Adam Farley8 wrote: > Hi All, > > Could someone familiar with the Hotspot JIT please review and opine on > the below? > > The Char-Byte encoding/decoding methods inside some of the sun.nio.cs > classes > (such as US_ASCII) see a lot of use, and OpenJDK on the OpenJ9 VM seems to > do this a lot faster. > > Is it possible to achieve a similar improvement on OpenJDK on Hotspot by > tweaking the CL code to match Hotspot JIT compiler idioms, or by > introducing > a method name for the HS JIT to match on? > > An example of these changes to US_ASCII.java is linked below. No OpenJ9 > code > is included in the work item or the webrev, to avoid contamination. > > Work item: https://bugs.openjdk.java.net/browse/JDK-8224963 > > Example Webrev: _http://cr.openjdk.java.net/~afarley/8224963/webrev/_ > > Best Regards > > Adam Farley > IBM Runtimes > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From nils.eliasson at oracle.com Wed May 29 12:31:09 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 29 May 2019 14:31:09 +0200 Subject: RFR(XS): 8224538: LoadBarrierNode::common_barrier must check address In-Reply-To: References: Message-ID: <540347ca-30c1-3199-8acf-225782c886ec@oracle.com> I updated the webrev in place. Thanks, // Nils On 2019-05-29 09:54, Tobias Hartmann wrote: > Hi Nils, > > On 28.05.19 18:24, Nils Eliasson wrote: >> I'll update the webrev. > Thanks. > >> This fixes the crash in RunThese - and that was with ZGC. The other crashes are from >> sparkexamples-application. 
>> >> When I push this patch we need to open a new bug for that crash. > Okay, I've re-added the ZGC label and filed JDK-8224957 for the non-ZGC related failures. > > Best regards, > Tobias From adinn at redhat.com Wed May 29 12:35:03 2019 From: adinn at redhat.com (Andrew Dinn) Date: Wed, 29 May 2019 13:35:03 +0100 Subject: RFR: 8224974: Implement JEP 352 In-Reply-To: <607e2f32-1f01-e45f-42e8-a645fa374cf7@oracle.com> References: <80da32b2-7acb-7b94-b82c-5dcd5cf95539@redhat.com> <506d2bee-9376-52f7-731c-4d872c944847@oracle.com> <607e2f32-1f01-e45f-42e8-a645fa374cf7@oracle.com> Message-ID: <61731ea4-eda2-4126-8cc2-9afbfbbd278f@redhat.com> Hi Alan, I have created a new JEP implementation JIRA (note change to thread title) and associated draft CSR Impl JIRA: https://bugs.openjdk.java.net/browse/JDK-8224974 CSR JIRA: regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From tobias.hartmann at oracle.com Wed May 29 12:38:59 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 29 May 2019 14:38:59 +0200 Subject: RFR(XS): 8224538: LoadBarrierNode::common_barrier must check address In-Reply-To: <540347ca-30c1-3199-8acf-225782c886ec@oracle.com> References: <540347ca-30c1-3199-8acf-225782c886ec@oracle.com> Message-ID: <532513d4-6927-6718-b9e8-c28397069b58@oracle.com> Thanks, looks good! Best regards, Tobias On 29.05.19 14:31, Nils Eliasson wrote: > I updated the webrev in place. > > Thanks, > > // Nils > > On 2019-05-29 09:54, Tobias Hartmann wrote: >> Hi Nils, >> >> On 28.05.19 18:24, Nils Eliasson wrote: >>> I'll update the webrev. >> Thanks. >> >>> This fixes the crash in RunThese - and that was with ZGC. The other crashes are from >>> sparkexamples-application. >>> >>> When I push this patch we need to open a new bug for that crash. 
>> Okay, I've re-added the ZGC label and filed JDK-8224957 for the non-ZGC related failures. >> >> Best regards, >> Tobias From adinn at redhat.com Wed May 29 12:50:20 2019 From: adinn at redhat.com (Andrew Dinn) Date: Wed, 29 May 2019 13:50:20 +0100 Subject: RFR: 8224974: Implement JEP 352 In-Reply-To: <607e2f32-1f01-e45f-42e8-a645fa374cf7@oracle.com> References: <80da32b2-7acb-7b94-b82c-5dcd5cf95539@redhat.com> <506d2bee-9376-52f7-731c-4d872c944847@oracle.com> <607e2f32-1f01-e45f-42e8-a645fa374cf7@oracle.com> Message-ID: <06ef3b25-9e34-f6f3-e4c4-319adea52ae7@redhat.com> Hi Alan, Apologies for the previous post which escaped from the lab while Dr Funkenstein was struggling to push the right buttons (and work out what happened when he pushed them). I have created an implementation subtask and associated CSR. I also updated the last webrev to record the javadoc changes necessitated in order to complete the CSR. Finally, I set the JEP fix version to 13 and pressed the big red "target" button. Impl JIRA: https://bugs.openjdk.java.net/browse/JDK-8224974 CSR JIRA: https://bugs.openjdk.java.net/browse/JDK-8224975 webrev: http://cr.openjdk.java.net/~adinn/8224974/webrev.02/ n.b. I have switched to using the subtask JIRA id in $title and in the cr.openjdk webrev link ... regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From patric.hedlin at oracle.com Wed May 29 12:51:30 2019 From: patric.hedlin at oracle.com (Patric Hedlin) Date: Wed, 29 May 2019 14:51:30 +0200 Subject: RFR(M): 8223363: Bad node estimate assertion failure In-Reply-To: References: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com> <08d8d09a-44a8-15ff-835c-363820140151@redhat.com> <82222832-8d19-3972-a98b-45f5de55685f@oracle.com> <5409c3e8-ef31-c753-abf7-e9334d18f517@redhat.com> <936169ea-0e8a-540f-29e1-615243e483a5@oracle.com> <412dc2e7-b748-5467-8d5d-6cea5bd81957@redhat.com> <20544c1c-12fc-520b-0cdd-30e32b6cda32@redhat.com> <7a2fb3a4-a68f-d98c-a39c-02cc9cd48e1a@oracle.com> Message-ID: Thanks for reviewing Nils. /Patric On 29/05/2019 14:00, Nils Eliasson wrote: > Looks good, > > // Nils > > > On 2019-05-29 09:48, Patric Hedlin wrote: >> Updated webrev: http://cr.openjdk.java.net/~phedlin/tr8223363/ >> >> Now also including the testcase for/from 8223502. >> From eric.caspole at oracle.com Wed May 29 14:07:04 2019 From: eric.caspole at oracle.com (Eric Caspole) Date: Wed, 29 May 2019 10:07:04 -0400 Subject: RFR: JDK-8224963: Char-Byte Performance Enhancement In-Reply-To: <68b970be-e551-577d-0ca1-28c16880a2ff@oracle.com> References: <68b970be-e551-577d-0ca1-28c16880a2ff@oracle.com> Message-ID: <47be7536-8b6a-d87c-986e-8c79e666813e@oracle.com> Hi Adam, It would be helpful if you could make a JMH to exercise your optimization webrev and add it to the webrev. Or, if you find one in the existing collection in the jdk repo[1] just point out which one it is. Thanks, Eric [1] http://hg.openjdk.java.net/jdk/jdk/file/bda9984d8ee4/test/micro/org/openjdk/bench On 5/29/19 08:22, Vladimir Ivanov wrote: > Hi Adam, > > The bug mentions ~6x improvement in throughput. Are there have any > microbenchmarks you can share which demonstrate that? That would greatly > simplify the analysis of changes you propose. 
>
> Also, if you can elaborate on what optimization opportunities C2 misses
> in original code, please, do.
>
> Best regards,
> Vladimir Ivanov
>
> On 29/05/2019 12:45, Adam Farley8 wrote:
>> Hi All,
>>
>> Could someone familiar with the Hotspot JIT please review and opine on
>> the below?
>>
>> The Char-Byte encoding/decoding methods inside some of the sun.nio.cs
>> classes
>> (such as US_ASCII) see a lot of use, and OpenJDK on the OpenJ9 VM
>> seems to
>> do this a lot faster.
>>
>> Is it possible to achieve a similar improvement on OpenJDK on Hotspot by
>> tweaking the CL code to match Hotspot JIT compiler idioms, or by
>> introducing
>> a method name for the HS JIT to match on?
>>
>> An example of these changes to US_ASCII.java is linked below. No
>> OpenJ9 code
>> is included in the work item or the webrev, to avoid contamination.
>>
>> Work item: https://bugs.openjdk.java.net/browse/JDK-8224963
>>
>> Example Webrev: _http://cr.openjdk.java.net/~afarley/8224963/webrev/_
>>
>> Best Regards
>>
>> Adam Farley
>> IBM Runtimes
>>
>> Unless stated otherwise above:
>> IBM United Kingdom Limited - Registered in England and Wales with
>> number 741598.
>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
>> 3AU

From gromero at linux.vnet.ibm.com Wed May 29 14:04:46 2019
From: gromero at linux.vnet.ibm.com (Gustavo Romero)
Date: Wed, 29 May 2019 11:04:46 -0300
Subject: RFR(M): 8223660: jtreg: Decouple Unsafe from RTM tests
In-Reply-To: <33df6862-948f-b6c4-8768-193f54127c20@linux.vnet.ibm.com>
References: <33df6862-948f-b6c4-8768-193f54127c20@linux.vnet.ibm.com>
Message-ID: <2a712e51-165f-1f29-11ed-029bd1d75021@linux.vnet.ibm.com>

Hi,

Could I get a second review for that change please?

Thanks!

Best regards,
Gustavo

On 05/22/2019 04:25 PM, Gustavo Romero wrote:
> Hi,
>
> Could I get reviews for the following change please?
>
> Bug   : https://bugs.openjdk.java.net/browse/JDK-8223660
> Webrev: http://cr.openjdk.java.net/~gromero/8223660/v1/
>
> It removes from the RTM jtreg tests the use of Unsafe native methods as
> abort provokers in a transaction.
>
> Relying on Unsafe native methods to abort a transaction makes the RTM test
> brittle because an Unsafe native method can be converted to non-native at
> any time, breaking the RTM tests. This is the second time it happens.
>
> This change removes the use of Unsafe native methods (currently, pageSize()
> is used, but is not native anymore) and adds an isolated native method in
> library libXAbortProvoker.so for the XAbortProvoker class that will be used
> to abort transactions in the RTM tests, turning the RTM jtreg tests more
> self-contained and so less brittle.
>
> I tested the change on x86_64 and PPC64, w/ RTM/HTM CPU feature available.
>
> Thank you!
>
> Best regards,
> Gustavo
>

From shade at redhat.com Wed May 29 14:09:26 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 29 May 2019 16:09:26 +0200
Subject: RFR(M): 8223660: jtreg: Decouple Unsafe from RTM tests
In-Reply-To: <2a712e51-165f-1f29-11ed-029bd1d75021@linux.vnet.ibm.com>
References: <33df6862-948f-b6c4-8768-193f54127c20@linux.vnet.ibm.com> <2a712e51-165f-1f29-11ed-029bd1d75021@linux.vnet.ibm.com>
Message-ID: <5f7858ba-e468-c32d-7b5e-0a9c154c1944@redhat.com>

On 5/29/19 4:04 PM, Gustavo Romero wrote:
>> Bug   : https://bugs.openjdk.java.net/browse/JDK-8223660
>> Webrev: http://cr.openjdk.java.net/~gromero/8223660/v1/

Looks good to me.

--
Thanks,
-Aleksey

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL:

From nils.eliasson at oracle.com Wed May 29 14:54:03 2019
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Wed, 29 May 2019 16:54:03 +0200
Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC
In-Reply-To: <87a3f42d-94ff-b588-85ea-e8bbf07cc2df@oracle.com>
References: <87a7f5bt0g.fsf@redhat.com> <87a3f42d-94ff-b588-85ea-e8bbf07cc2df@oracle.com>
Message-ID: <5a262187-a9e5-0057-2904-487d0b034712@oracle.com>

Latest webrev uploaded with all the fixes you suggested:

http://cr.openjdk.java.net/~neliasso/8224675/webrev.03/

Regards,
Nils

On 2019-05-29 14:18, Nils Eliasson wrote:
> Hi Roland,
>
> On 2019-05-29 09:48, Roland Westrelin wrote:
>> Hi Nils,
>>
>>> Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/
>> zBarrierSetC2.cpp: typo loadbarrers line 756
> Fixed
>> lcm.cpp:
>>
>> void PhaseCFG:: call_catch_cleanup(Block* block) {
>>
>> space after ::?
> Fixed
>>
>> loopnode.cpp:
>>
>> Node *u
>>
>> I thought the usually recommended style was:
>>
>> Node* u
> Yes.
>> loopnode.cpp:
>>
>> Do we really need a new entry in the gc api for barrier_insertion()
>> couldn't this go under optimize_loops()?
>
> It could. There is only remove_range_check_casts in between right now.
> I choose to have it as its own for separation, in a similar way to how
> LoopOptsNone are used.
>
>
>> memnode.hpp:
>>
>>   168   enum LoadBarrier {
>>   169     UnProcessed     = 0,
>>   170     RequireBarrier  = 1,
>>   171     WeakBarrier     = 3,  // Inclusive with RequireBarrier
>>   172     ExpandedBarrier = 4
>>   173   };
>>
>> Shouldn't that be abstracted away through the gc api somehow?
>
> Yes.
>
> I would have preferred using the node-flags, but they are all taken
> unless we expand it to 32 bits, or overload the flags that are only
> used during codegen and later. That would require some verification of
> the flag use to catch any mistakes.
>
>>
>> zBarrierSetC2.cpp:
>>
>> typo (witch):
>>
>> 981     // For each load use, see witch catch projs dominates, create load clone lazily and reconnect
> Fixed
>>
>> In fixup_uses_in_catch() wouldn't you need to deal with phis the way you
>> do in call_catch_cleanup_one():
>>
>> 1019     // We found no single catchproj that dominated the use - The use is at a point after
>> 1020     // where control flow from multiple catch projs have merged. We will have to create
>> 1021     // phi nodes before the use and tie the output from the cloned loads together. It
>> 1022     // can be a single phi or a number of chained phis, depending on control flow
>
> No - it's a two step process.
>
> There is a load hanging off a call with multiple catch projs. (There
> can be no other control-flow in between.) The load have uses, either
> (1) in the catch blocks, or (2) in some cases in the call block,
> hanging off the load.
>
> fixup_uses_in_catch responsibility is to turn (2) into (1). It clones
> all uses of the load, in the call block, out to the catch blocks. This
> is done recursively. There can never be any phis here.
>
>
>> Is there a chance that a load would be processed by
>> fixup_uses_in_catch()? I see call_catch_cleanup_one() is where they are
>> expected to be handled but you only go there if
>> load->is_barrier_required() is true. So could you have a load for which
>> is_barrier_required() is true have a use for which is_barrier_required()
>> is not true and both of them be in the catch block?
>
> In theory yes. If the load without a barrier is a use of the load with
> a barrier, it would be processed first. When the load with a barrier
> is processed, it would clone the load without a barrier, like any
> other node.
>
> Another case is two loads that both require a barrier. One would be
> processed first, that one can't have the other one as a use, because
> the loads are processed in post order.
And when the other load is > processed, the first has already been cloned out. > > Thanks for the feedback! > > I'll get back with a new webrev. > > // Nils > >> >> Roland. From adam.farley at uk.ibm.com Wed May 29 14:53:50 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Wed, 29 May 2019 15:53:50 +0100 Subject: RFR: JDK-8224963: Char-Byte Performance Enhancement In-Reply-To: <68b970be-e551-577d-0ca1-28c16880a2ff@oracle.com> References: <68b970be-e551-577d-0ca1-28c16880a2ff@oracle.com> Message-ID: Hi Vladimir, I have a locally-written performance test I used to get the "6x". Will chase up with the guy who wrote it to see if I can share it. If not, I'll write a new one. As for the enhancements, two options are: - matching on the new method names, and replacing the inner logic with some souped-up version of said logic. - alter the code to match on one of the C2 idioms, though I imagine if it were that simple, OpenJDK would come with a list of said idioms so everything people write can be easily accelerated by the JIT. As for how OpenJ9 does it specifically, I don't know, and I suspect it's safer if I don't find out, contamination-wise. Does any of that help? Best Regards Adam Farley IBM Runtimes Vladimir Ivanov wrote on 29/05/2019 13:22:27: > From: Vladimir Ivanov > To: Adam Farley8 , hotspot-compiler- > dev at openjdk.java.net > Date: 29/05/2019 13:22 > Subject: Re: RFR: JDK-8224963: Char-Byte Performance Enhancement > > Hi Adam, > > The bug mentions ~6x improvement in throughput. Are there have any > microbenchmarks you can share which demonstrate that? That would greatly > simplify the analysis of changes you propose. > > Also, if you can elaborate on what optimization opportunities C2 misses > in original code, please, do. > > Best regards, > Vladimir Ivanov > > On 29/05/2019 12:45, Adam Farley8 wrote: > > Hi All, > > > > Could someone familiar with the Hotspot JIT please review and opine on > > the below? 
> > > > The Char-Byte encoding/decoding methods inside some of the sun.nio.cs > > classes > > (such as US_ASCII) see a lot of use, and OpenJDK on the OpenJ9 VM seems to > > do this a lot faster. > > > > Is it possible to achieve a similar improvement on OpenJDK on Hotspot by > > tweaking the CL code to match Hotspot JIT compiler idioms, or by > > introducing > > a method name for the HS JIT to match on? > > > > An example of these changes to US_ASCII.java is linked below. No OpenJ9 > > code > > is included in the work item or the webrev, to avoid contamination. > > > > Work item: https://urldefense.proofpoint.com/v2/url? > u=https-3A__bugs.openjdk.java.net_browse_JDK-2D8224963&d=DwIC- > g&c=jf_iaSHvJObTbx-siA1ZOg&r=P5m8KWUXJf- > CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=4XPqGhxLchCLvSQhTIu3Wvm63NE2XpuEJf- > PzjFCXb4&s=2ChxP3IE0tkvevxSXfil3PGlpEHkUPxgwMxHH5J-A34&e= > > > > Example Webrev: _https://urldefense.proofpoint.com/v2/url? > u=http-3A__cr.openjdk.java.net_-7Eafarley_8224963_webrev_-5F&d=DwIC- > g&c=jf_iaSHvJObTbx-siA1ZOg&r=P5m8KWUXJf- > CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=4XPqGhxLchCLvSQhTIu3Wvm63NE2XpuEJf- > PzjFCXb4&s=fCeNvvk3Fehc6ssZfoNkJao_NJyoxeov7cxiyMSvuwQ&e= > > > > Best Regards > > > > Adam Farley > > IBM Runtimes > > > > Unless stated otherwise above: > > IBM United Kingdom Limited - Registered in England and Wales with number > > 741598. > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU > Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From gromero at linux.vnet.ibm.com Wed May 29 15:12:56 2019
From: gromero at linux.vnet.ibm.com (Gustavo Romero)
Date: Wed, 29 May 2019 12:12:56 -0300
Subject: RFR(M): 8223660: jtreg: Decouple Unsafe from RTM tests
In-Reply-To: <5f7858ba-e468-c32d-7b5e-0a9c154c1944@redhat.com>
References: <33df6862-948f-b6c4-8768-193f54127c20@linux.vnet.ibm.com>
 <2a712e51-165f-1f29-11ed-029bd1d75021@linux.vnet.ibm.com>
 <5f7858ba-e468-c32d-7b5e-0a9c154c1944@redhat.com>
Message-ID:

On 05/29/2019 11:09 AM, Aleksey Shipilev wrote:
> On 5/29/19 4:04 PM, Gustavo Romero wrote:
>>> Bug   : https://bugs.openjdk.java.net/browse/JDK-8223660
>>> Webrev: http://cr.openjdk.java.net/~gromero/8223660/v1/
>
> Looks good to me.

Thanks for the review, Aleksey!

Best regards,
Gustavo

From stuart.monteith at linaro.org Wed May 29 16:06:42 2019
From: stuart.monteith at linaro.org (Stuart Monteith)
Date: Wed, 29 May 2019 17:06:42 +0100
Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC
In-Reply-To:
References:
Message-ID:

Hello Nils,

I've tested with JTReg and JCStress, with and without
-XX:+UseBarriersWithVolatile. I found one error, which was a typo on my
part in z_compareAndExchangeP. I've fixed it and JCStress is now clean.

Your changes are good as far as my testing is concerned. I'm continuing
to look at the code that is generated.

diff -r d48227fa72cf src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad
--- a/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Wed May 29 14:53:47 2019 +0100
+++ b/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Wed May 29 16:39:59 2019 +0100
@@ -56,6 +56,8 @@
 //
 instruct loadBarrierSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{
   match(Set dst (LoadBarrierSlowReg mem));
+  predicate(!n->as_LoadBarrierSlowReg()->is_weak());
+
   effect(DEF dst, KILL cr);

   format %{"LoadBarrierSlowReg $dst, $mem" %}
@@ -70,7 +72,8 @@
 // Execute ZGC load barrier (weak) slow path
 //
 instruct loadBarrierWeakSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{
-  match(Set dst (LoadBarrierWeakSlowReg mem));
+  match(Set dst (LoadBarrierSlowReg mem));
+  predicate(n->as_LoadBarrierSlowReg()->is_weak());

   effect(DEF dst, KILL cr);

@@ -81,3 +84,60 @@
   %}
   ins_pipe(pipe_slow);
 %}
+
+
+// Specialized versions of compareAndExchangeP that adds a keepalive that is consumed
+// but doesn't affect output.
+
+instruct z_compareAndExchangeP(iRegPNoSp res, indirect mem,
+                               iRegP oldval, iRegP newval, iRegP keepalive,
+                               rFlagsReg cr) %{
+  match(Set res (ZCompareAndExchangeP (Binary mem keepalive) (Binary oldval newval)));
+  ins_cost(2 * VOLATILE_REF_COST);
+  effect(TEMP_DEF res, KILL cr);
+  format %{
+    "cmpxchg $res = $mem, $oldval, $newval\t# (ptr, weak) if $mem == $oldval then $mem <-- $newval"
+  %}
+  ins_encode %{
+    __ cmpxchg($mem$$Register, $oldval$$Register, $newval$$Register,
+               Assembler::xword, /*acquire*/ false, /*release*/ true,
+               /*weak*/ false, $res$$Register);
+  %}
+  ins_pipe(pipe_slow);
+%}
+
+instruct z_compareAndSwapP(iRegINoSp res,
+                           indirect mem,
+                           iRegP oldval, iRegP newval, iRegP keepalive,
+                           rFlagsReg cr) %{
+
+  match(Set res (ZCompareAndSwapP (Binary mem keepalive) (Binary oldval newval)));
+  match(Set res (ZWeakCompareAndSwapP (Binary mem keepalive) (Binary oldval newval)));
+
+  ins_cost(2 * VOLATILE_REF_COST);
+
+  effect(KILL cr);
+
+  format %{
+    "cmpxchg $mem, $oldval, $newval\t# (ptr) if $mem == $oldval then $mem
<-- $newval" + "cset $res, EQ\t# $res <-- (EQ ? 1 : 0)" + %} + + ins_encode(aarch64_enc_cmpxchg(mem, oldval, newval), + aarch64_enc_cset_eq(res)); + + ins_pipe(pipe_slow); +%} + + +instruct z_get_and_setP(indirect mem, iRegP newv, iRegPNoSp prev, + iRegP keepalive) %{ + match(Set prev (ZGetAndSetP mem (Binary newv keepalive))); + + ins_cost(2 * VOLATILE_REF_COST); + format %{ "atomic_xchg $prev, $newv, [$mem]" %} + ins_encode %{ + __ atomic_xchg($prev$$Register, $newv$$Register, as_Register($mem$$base)); + %} + ins_pipe(pipe_serial); +%} On Fri, 24 May 2019 at 15:37, Stuart Monteith wrote: > > That's interesting, and seems beneficial for ZGC on aarch64, where > before your patch the ZGC load barriers broke assumptions the > memory-fence optimisation code was making. > > I'm currently testing your patch, with the following put on top for aarch64: > > diff -r ead187ebe684 src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad > --- a/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Fri May 24 13:11:48 2019 +0100 > +++ b/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Fri May 24 15:34:17 2019 +0100 > @@ -56,6 +56,8 @@ > // > instruct loadBarrierSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{ > match(Set dst (LoadBarrierSlowReg mem)); > + predicate(!n->as_LoadBarrierSlowReg()->is_weak()); > + > effect(DEF dst, KILL cr); > > format %{"LoadBarrierSlowReg $dst, $mem" %} > @@ -70,7 +72,8 @@ > // Execute ZGC load barrier (weak) slow path > // > instruct loadBarrierWeakSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{ > - match(Set dst (LoadBarrierWeakSlowReg mem)); > + match(Set dst (LoadBarrierSlowReg mem)); > + predicate(n->as_LoadBarrierSlowReg()->is_weak()); > > effect(DEF dst, KILL cr); > > @@ -81,3 +84,60 @@ > %} > ins_pipe(pipe_slow); > %} > + > + > +// Specialized versions of compareAndExchangeP that adds a keepalive > that is consumed > +// but doesn't affect output. 
> + > +instruct z_compareAndExchangeP(iRegPNoSp res, indirect mem, > + iRegP oldval, iRegP newval, iRegP keepalive, > + rFlagsReg cr) %{ > + match(Set oldval (ZCompareAndExchangeP (Binary mem keepalive) > (Binary oldval newval))); > + ins_cost(2 * VOLATILE_REF_COST); > + effect(TEMP_DEF res, KILL cr); > + format %{ > + "cmpxchg $res = $mem, $oldval, $newval\t# (ptr, weak) if $mem == > $oldval then $mem <-- $newval" > + %} > + ins_encode %{ > + __ cmpxchg($mem$$Register, $oldval$$Register, $newval$$Register, > + Assembler::xword, /*acquire*/ false, /*release*/ true, > + /*weak*/ false, $res$$Register); > + %} > + ins_pipe(pipe_slow); > +%} > + > +instruct z_compareAndSwapP(iRegINoSp res, > + indirect mem, > + iRegP oldval, iRegP newval, iRegP keepalive, > + rFlagsReg cr) %{ > + > + match(Set res (ZCompareAndSwapP (Binary mem keepalive) (Binary > oldval newval))); > + match(Set res (ZWeakCompareAndSwapP (Binary mem keepalive) (Binary > oldval newval))); > + > + ins_cost(2 * VOLATILE_REF_COST); > + > + effect(KILL cr); > + > + format %{ > + "cmpxchg $mem, $oldval, $newval\t# (ptr) if $mem == $oldval then > $mem <-- $newval" > + "cset $res, EQ\t# $res <-- (EQ ? 1 : 0)" > + %} > + > + ins_encode(aarch64_enc_cmpxchg(mem, oldval, newval), > + aarch64_enc_cset_eq(res)); > + > + ins_pipe(pipe_slow); > +%} > + > + > +instruct z_get_and_setP(indirect mem, iRegP newv, iRegPNoSp prev, > + iRegP keepalive) %{ > + match(Set prev (ZGetAndSetP mem (Binary newv keepalive))); > + > + ins_cost(2 * VOLATILE_REF_COST); > + format %{ "atomic_xchg $prev, $newv, [$mem]" %} > + ins_encode %{ > + __ atomic_xchg($prev$$Register, $newv$$Register, as_Register($mem$$base)); > + %} > + ins_pipe(pipe_serial); > +%} > \ No newline at end of file > > On Thu, 23 May 2019 at 15:38, Nils Eliasson wrote: > > > > Hi, > > > > In ZGC we use load barriers on references. In the original > > implementation these where added as macro nodes at parse time. 
The load
> > barrier node consumes and produces control flow in order to be able to
> > be lowered into a check with a slow path late. The load barrier nodes
> > are fixed in the control flow, and extensions to different optimizations
> > are needed to move the barriers out of loops and past other unrelated
> > control flow.
> >
> > With this patch the barriers are instead added after the loop
> > optimizations, before macro node expansion. This makes the entire
> > pipeline until that point oblivious to the barriers. A dump of the IR
> > with ZGC or EpsilonGC will be basically identical at that point, and the
> > diff compared to SerialGC or ParallelGC, which use write barriers, is
> > really small.
> >
> > Benefits
> >
> > - A major complexity reduction. One can reason about and implement loop
> > optimization without caring about the barriers. The escape analysis
> > doesn't need to know about the barriers. Loads float freely like they
> > are supposed to.
> >
> > - Fewer nodes early. The inlining will become more deterministic. A
> > barrier-heavy GC will not run into node limits earlier. Also, optimizations
> > bounded by node limits, like unrolling and peeling, will not be penalized
> > by barriers.
> >
> > - Better test coverage, or reduced testing cost, when the same
> > optimization doesn't need to be verified with every GC.
> >
> > - Better control over where barriers end up. It is trivial to guarantee
> > that the load and barriers are not separated by a safepoint.
> >
> > Design
> >
> > The implementation uses an extra phase that piggybacks on PhaseIdealLoop,
> > which provides control and dominator information for all loads. This
> > extra phase is needed because we need to splice the control flow when
> > adding the load barriers.
> >
> > Barriers are inserted on the load nodes in post order (any successor
> > first). This is to guarantee the dominator information above every
> > insertion is correct. This is also important within blocks.
Two loads in
> > the same block can float in relation to each other. The addition of
> > barriers serializes their order. Any def-use relationship is upheld by
> > expanding them in post order.
> >
> > Barrier insertion is done in stages. In this first stage a single macro
> > node that represents the barrier is added with all the dependencies that
> > are required. In the macro expansion phase the barrier node is expanded
> > into its final shape, adding nodes that represent the conditional load
> > barrier check. (Write barriers in other GCs could possibly be expanded
> > here directly.)
> >
> > All the barriers that are needed for unsafe reference operations (cas,
> > swap, cmpx) are also expanded late. They already have control flow, so
> > the expansion is straightforward.
> >
> > The barriers for the unsafe reference operations (cas, getandset, cmpx)
> > have also been simplified. The cas-load-cas dance has been replaced by
> > a pre-load. The pre-load is a load with a barrier that is kept alive by
> > an extra (required) edge on the unsafe-primitive nodes (specialized as
> > ZCompareAndSwap, ZGetAndSet, ZCompareAndExchange).
> >
> > One challenge that was encountered early and that has caused
> > considerable work is that nodes (like loads) can end up between calls
> > and their catch projections. This is usually handled after matching, in
> > PhaseCFG::call_catch_cleanup, where the nodes after the call are cloned
> > to all catch blocks. At that stage they are in an ordered list, so that
> > is a straightforward process. For late barrier insertion we need to
> > splice in control earlier, before matching, and control flow between
> > calls and catches is not allowed. This requires us to add a
> > transformation pass where all loads and their dependent instructions are
> > cloned out to the catch blocks before we can start splicing in control
> > flow. This transformation doesn't replace the legacy call_catch_cleanup
> > fully, but it could be a future goal.
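
The post-order rule described above (a load is handled only after everything that uses it) can be illustrated with a small, self-contained sketch. This is not the code from the webrev: Node is a hypothetical stand-in that only records def-use edges, and post_order and processing_order are illustrative names. The real pass works on C2's IR with control and dominator information from PhaseIdealLoop, which this sketch leaves out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an IR node: an id plus def-use edges only.
struct Node {
  int id;
  std::vector<Node*> uses;  // nodes that consume this node's value
  bool visited;
  explicit Node(int i) : id(i), visited(false) {}
};

// Depth-first walk along def-use edges. A node is appended only after
// all of its uses have been appended, i.e. "any successor first".
inline void post_order(Node* n, std::vector<Node*>& out) {
  if (n->visited) return;
  n->visited = true;
  for (std::size_t i = 0; i < n->uses.size(); i++) {
    post_order(n->uses[i], out);
  }
  out.push_back(n);
}

// Order in which the given loads would be processed under this rule.
inline std::vector<Node*> processing_order(const std::vector<Node*>& loads) {
  std::vector<Node*> out;
  for (std::size_t i = 0; i < loads.size(); i++) {
    post_order(loads[i], out);
  }
  return out;
}
```

For a chain a -> b -> c (b uses a, c uses b), the resulting order is c, b, a. The load furthest down the def-use chain is expanded first, so whatever is spliced in for it cannot invalidate information that is still needed for the loads above it.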
> > > > In the original barrier implementation there where two different load > > barrier implementations: the basic and the optimized. With the new > > approach to barriers on unsafe, the basic is no longer required and has > > been removed. (It provided options for skipping the self healing, and > > passed the ref in a register, guaranteeing that the oop wasn't reloaded.) > > > > The wart that was fixup_partial_loads in zHeap has also been made > > redundant. > > > > Dominating barriers are no longer removed on weak loads. Weak barriers > > doesn't guarantee self-healing. > > > > Follow up work: > > > > - Consolidate all uses of GrowableArray::insert_sorted to use the new > > version > > > > - Refactor the phases. There are a lot of simplifications and > > verification that can be done with more well defined phases. > > > > - Simplify the remaining barrier optimizations. There might still be > > code paths that are no longer needed. > > > > > > Testing: > > > > Hotspot tier 1-6, CTW, jcstress, micros, runthese, kitchensink, and then > > some. All with -XX:+ZVerifyViews. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8224675 > > > > Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/ > > > > > > Please review, > > > > Regards, > > > > Nils > > From vladimir.x.ivanov at oracle.com Wed May 29 16:19:36 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 29 May 2019 19:19:36 +0300 Subject: RFR: JDK-8224963: Char-Byte Performance Enhancement In-Reply-To: References: <68b970be-e551-577d-0ca1-28c16880a2ff@oracle.com> Message-ID: Adam, Among all options, I'm in favor of enhancing C2 to produce better code. Then on my preference list goes rewriting JDK code to make it amenable to missing optimizations (the patch you propose). And, as a last resort, I'd consider introducing new intrinsics. The microbenchmarks would help understand what pieces as missing in C2 and decide how to proceed. 
I haven't had HotSpot vs J9 comparison in mind, but in absence of benchmarks available comparing generated code (by C2) between original and updated JDK version would help understand what goes wrong. Best regards, Vladimir Ivanov On 29/05/2019 17:53, Adam Farley8 wrote: > Hi Vladimir, > > I have a locally-written performance test I used to get the "6x". > Will chase up with the guy who wrote it to see if I can share it. > If not, I'll write a new one. > > As for the enhancements, two options are: > > - matching on the new method names, and replacing the inner logic > with some souped-up version of said logic. > > - alter the code to match on one of the C2 idioms, though I imagine > if it were that simple, OpenJDK would come with a list of said > idioms so everything people write can be easily accelerated by the > JIT. > > As for how OpenJ9 does it specifically, I don't know, and I suspect > it's safer if I don't find out, contamination-wise. > > Does any of that help? > > Best Regards > > Adam Farley > IBM Runtimes > > > Vladimir Ivanov wrote on 29/05/2019 13:22:27: > >> From: Vladimir Ivanov >> To: Adam Farley8 , hotspot-compiler- >> dev at openjdk.java.net >> Date: 29/05/2019 13:22 >> Subject: Re: RFR: JDK-8224963: Char-Byte Performance Enhancement >> >> Hi Adam, >> >> The bug mentions ~6x improvement in throughput. Are there have any >> microbenchmarks you can share which demonstrate that? That would greatly >> simplify the analysis of changes you propose. >> >> Also, if you can elaborate on what optimization opportunities C2 misses >> in original code, please, do. >> >> Best regards, >> Vladimir Ivanov >> >> On 29/05/2019 12:45, Adam Farley8 wrote: >> > Hi All, >> > >> > Could someone familiar with the Hotspot JIT please review and opine on >> > the below? >> > >> > The Char-Byte encoding/decoding methods inside some of the sun.nio.cs >> > classes >> > (such as US_ASCII) see a lot of use, and OpenJDK on the OpenJ9 VM seems to >> > do this a lot faster. 
>> > >> > Is it possible to achieve a similar improvement on OpenJDK on Hotspot by >> > tweaking the CL code to match Hotspot JIT compiler idioms, or by >> > introducing >> > a method name for the HS JIT to match on? >> > >> > An example of these changes to US_ASCII.java is linked below. No OpenJ9 >> > code >> > is included in the work item or the webrev, to avoid contamination. >> > >> > Work item: https://urldefense.proofpoint.com/v2/url? >> u=https-3A__bugs.openjdk.java.net_browse_JDK-2D8224963&d=DwIC- >> g&c=jf_iaSHvJObTbx-siA1ZOg&r=P5m8KWUXJf- >> CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=4XPqGhxLchCLvSQhTIu3Wvm63NE2XpuEJf- >> PzjFCXb4&s=2ChxP3IE0tkvevxSXfil3PGlpEHkUPxgwMxHH5J-A34&e= >> > >> > Example Webrev: _https://urldefense.proofpoint.com/v2/url? >> u=http-3A__cr.openjdk.java.net_-7Eafarley_8224963_webrev_-5F&d=DwIC- >> g&c=jf_iaSHvJObTbx-siA1ZOg&r=P5m8KWUXJf- >> CeVJc0hDGD9AQ2LkcXDC0PMV9ntVw5Ho&m=4XPqGhxLchCLvSQhTIu3Wvm63NE2XpuEJf- >> PzjFCXb4&s=fCeNvvk3Fehc6ssZfoNkJao_NJyoxeov7cxiyMSvuwQ&e= >> > >> > Best Regards >> > >> > Adam Farley >> > IBM Runtimes >> > >> > Unless stated otherwise above: >> > IBM United Kingdom Limited - Registered in England and Wales with number >> > 741598. >> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU >> > > Unless stated otherwise above: > IBM United Kingdom Limited - Registered in England and Wales with number > 741598. 
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU From vladimir.kozlov at oracle.com Wed May 29 16:52:06 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 29 May 2019 09:52:06 -0700 Subject: RFR(XS): 8224538: LoadBarrierNode::common_barrier must check address In-Reply-To: <532513d4-6927-6718-b9e8-c28397069b58@oracle.com> References: <540347ca-30c1-3199-8acf-225782c886ec@oracle.com> <532513d4-6927-6718-b9e8-c28397069b58@oracle.com> Message-ID: <6a1f0d8e-45b3-1c22-bdc1-0de56df3fff5@oracle.com> +1 Thanks, Vladimir On 5/29/19 5:38 AM, Tobias Hartmann wrote: > Thanks, looks good! > > Best regards, > Tobias > > On 29.05.19 14:31, Nils Eliasson wrote: >> I updated the webrev in place. >> >> Thanks, >> >> // Nils >> >> On 2019-05-29 09:54, Tobias Hartmann wrote: >>> Hi Nils, >>> >>> On 28.05.19 18:24, Nils Eliasson wrote: >>>> I'll update the webrev. >>> Thanks. >>> >>>> This fixes the crash in RunThese - and that was with ZGC. The other crashes are from >>>> sparkexamples-application. >>>> >>>> When I push this patch we need to open a new bug for that crash. >>> Okay, I've re-added the ZGC label and filed JDK-8224957 for the non-ZGC related failures. 
>>> >>> Best regards, >>> Tobias From vladimir.kozlov at oracle.com Wed May 29 17:00:12 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 29 May 2019 10:00:12 -0700 Subject: RFR(M): 8223363: Bad node estimate assertion failure In-Reply-To: <7a2fb3a4-a68f-d98c-a39c-02cc9cd48e1a@oracle.com> References: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com> <08d8d09a-44a8-15ff-835c-363820140151@redhat.com> <82222832-8d19-3972-a98b-45f5de55685f@oracle.com> <5409c3e8-ef31-c753-abf7-e9334d18f517@redhat.com> <936169ea-0e8a-540f-29e1-615243e483a5@oracle.com> <412dc2e7-b748-5467-8d5d-6cea5bd81957@redhat.com> <20544c1c-12fc-520b-0cdd-30e32b6cda32@redhat.com> <7a2fb3a4-a68f-d98c-a39c-02cc9cd48e1a@oracle.com> Message-ID: <358ae47a-422f-e243-febe-086820ec6bc9@oracle.com> Hi Patric Add to test '@requires !vm.graal.enabled' to avoid running with Graal. And add link to test results to the bug report. I assume you need to run 1-3 tiers and hs-precheckin-comp. Thanks, Vladimir On 5/29/19 12:48 AM, Patric Hedlin wrote: > Updated webrev: http://cr.openjdk.java.net/~phedlin/tr8223363/ > > Now also including the testcase for/from 8223502. > > Best regards, > Patric > > On 28/05/2019 18:50, Aleksey Shipilev wrote: >> On 5/28/19 6:49 PM, Patric Hedlin wrote: >>> On 2019-05-28 15:51, Aleksey Shipilev wrote: >>>> On 5/28/19 3:07 PM, Patric Hedlin wrote: >>>>> Ooops, sorry. The test-case from 8223502 is here: Webrev: >>>>> http://cr.openjdk.java.net/~phedlin/tr8223502/ >>>> Yes, so what prevents us from including that test in this changeset? Surely it acts like the >>>> regression tests for the fix. >>> Absolutely nothing. The error was only in my webrev generation that didn't include both bookmarks. >>> They will be pushed "as one". (I can generate a new, single, complete, webrev in the morning if you >>> prefer.) >> Yes, please. I prefer full webrevs to understand what exactly is being pushed. 
>> >> -Aleksey >> > From igor.ignatyev at oracle.com Wed May 29 18:55:15 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 29 May 2019 11:55:15 -0700 Subject: RFR(T/S) : 8224945 : googlemock update breaks the build of arm32 In-Reply-To: References: Message-ID: <6542B8D1-DEA8-4A1D-A52F-4C8A001229D0@oracle.com> Volker, David, thanks for review! pushed. -- Igor > On May 29, 2019, at 1:54 AM, Volker Simonis wrote: > > Looks good! With your change I can successfully build ppc64 platfroms again! > > Thanks and thumbs up from me! > > On Wed, May 29, 2019 at 10:23 AM Volker Simonis > wrote: >> >> Hi Igor, >> >> thanks for considering ppc in your fix as well. >> >> Your changes are definitely required on ppc but I'm not sure if that's >> enough. I'm currently running a build to verify that. I'll let you >> know in an hour or so. >> >> Best regards, >> Volker >> >> On Wed, May 29, 2019 at 9:11 AM Igor Ignatyev wrote: >>> >>> http://cr.openjdk.java.net/~iignatyev//8224945/webrev.00/index.html >>>> 10 lines changed: 10 ins; 0 del; 0 mod; >>> >>> >>> Hi all, >>> >>> could you please review this small and trivial patch which undefines R, F1 and F2 macros in unittest.hpp so they won't conflict w/ typenames used in gmock? >>> >>> testing: extensive build testing (which found JDK-8224949 -- an unrelated breakage on linux-x86) >>> webrev: http://cr.openjdk.java.net/~iignatyev//8224945/webrev.00/index.html >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8224945 >>> >>> Thanks, >>> -- Igor From tom.rodriguez at oracle.com Wed May 29 18:59:17 2019 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Wed, 29 May 2019 11:59:17 -0700 Subject: RFR(XS) 8224931: disable JAOTC invokedynamic support until 8223533 is fixed In-Reply-To: References: Message-ID: <54ae61a3-059a-0975-4b53-2b55ca36e36d@oracle.com> Looks good. 
tom

dean.long at oracle.com wrote on 5/28/19 5:31 PM:
> https://bugs.openjdk.java.net/browse/JDK-8224931
> http://cr.openjdk.java.net/~dlong/8224931/webrev/
>
> Disable JAOTC invokedynamic support temporarily. Register a plugin that
> always returns false for supportsDynamicInvoke(), which causes the
> parser to insert deoptimize nodes.
>
> dl

From dean.long at oracle.com Wed May 29 19:34:21 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Wed, 29 May 2019 12:34:21 -0700
Subject: RFR(S): 8224580: Matcher can cause oop field/array element to be reloaded
In-Reply-To: <87imtwaz43.fsf@redhat.com>
References: <877eailvgp.fsf@redhat.com>
 <70fedac3-59a2-e077-4de0-af6f6604dc16@redhat.com>
 <87imtwaz43.fsf@redhat.com>
Message-ID:

It seems to do what you want, but I'm curious, what's the worse that can
happen if we call set_shared() on a node that doesn't need it?

dl

On 5/27/19 4:57 AM, Roland Westrelin wrote:
>> The change looks reasonable to me.
>>
>> I have run tests with the patch and can confirm that the original bug
>> went away. I've also run a bunch of other tests and workloads and looks
>> good too.
> Thanks Roman. Anyone else for this?
> Shenandoah needs this for a key change that's also out for review:
>
> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-May/025931.html
> (8224584: Shenandoah: Eliminate forwarding pointer word)
>
> Roland.

From vladimir.kozlov at oracle.com Wed May 29 19:49:46 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 29 May 2019 12:49:46 -0700
Subject: RFR: 8224974: Implement JEP 352
In-Reply-To: <06ef3b25-9e34-f6f3-e4c4-319adea52ae7@redhat.com>
References: <80da32b2-7acb-7b94-b82c-5dcd5cf95539@redhat.com>
 <506d2bee-9376-52f7-731c-4d872c944847@oracle.com>
 <607e2f32-1f01-e45f-42e8-a645fa374cf7@oracle.com>
 <06ef3b25-9e34-f6f3-e4c4-319adea52ae7@redhat.com>
Message-ID: <2153cb8e-b4c1-d3c5-4c28-3e32d50db2ea@oracle.com>

Hi, Andrew

I tried to test these changes and build failed on all systems except Linux because:

workspace/open/src/hotspot/share/prims/unsafe.cpp:446:3: error: use of undeclared identifier 'JNU_ThrowRuntimeException'
  JNU_ThrowRuntimeException(env, "writeback is not implemented");
  ^
workspace/open/src/hotspot/share/prims/unsafe.cpp:447:10: error: use of undeclared identifier 'IOS_THROWN'
  return IOS_THROWN;
         ^
workspace/open/src/hotspot/share/prims/unsafe.cpp:473:3: error: use of undeclared identifier 'JNU_ThrowRuntimeException'
  JNU_ThrowRuntimeException(env, "writeback sync is not implemented");
  ^
workspace/open/src/hotspot/share/prims/unsafe.cpp:474:10: error: use of undeclared identifier 'IOS_THROWN'
  return IOS_THROWN;
         ^
workspace/open/src/hotspot/share/prims/unsafe.cpp:488:3: error: use of undeclared identifier 'JNU_ThrowRuntimeException'
  JNU_ThrowRuntimeException(env, "writeback sync is not implemented");
  ^
workspace/open/src/hotspot/share/prims/unsafe.cpp:489:10: error: use of undeclared identifier 'IOS_THROWN'
  return IOS_THROWN;

------------------------------------------------------------

Also Graal test should be fixed for new intrinsics (add them to 'toBeInvestigated' for isJDK13orHigher):

src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java

java.lang.AssertionError: missing Graal intrinsics for:
    jdk/internal/misc/Unsafe.writeback0(J)V
    jdk/internal/misc/Unsafe.writebackPostSync0()V
    jdk/internal/misc/Unsafe.writebackPreSync0()V
    at org.graalvm.compiler.hotspot.test.CheckGraalIntrinsics.test(CheckGraalIntrinsics.java:653)

Regards,
Vladimir

On 5/29/19 5:50 AM, Andrew Dinn wrote:
> Hi Alan,
>
> Apologies for the previous post which escaped from the lab while Dr
> Funkenstein was struggling to push the right buttons (and work out what
> happened when he pushed them).
>
> I have created an implementation subtask and associated CSR. I also
> updated the last webrev to record the javadoc changes necessitated in
> order to complete the CSR. Finally, I set the JEP fix version to 13 and
> pressed the big red "target" button.
>
> Impl JIRA: https://bugs.openjdk.java.net/browse/JDK-8224974
> CSR JIRA: https://bugs.openjdk.java.net/browse/JDK-8224975
> webrev: http://cr.openjdk.java.net/~adinn/8224974/webrev.02/
>
> n.b. I have switched to using the subtask JIRA id in $title and in the
> cr.openjdk webrev link ...
>
> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
>

From vladimir.kozlov at oracle.com Wed May 29 19:58:00 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 29 May 2019 12:58:00 -0700
Subject: RFR(S): 8173196: [REDO] C2 does not optimize redundant memory operations with G1
In-Reply-To:
References: <871s0qlu20.fsf@redhat.com>
Message-ID:

Looks good to me too.

Thanks,
Vladimir

On 5/27/19 2:17 AM, Tobias Hartmann wrote:
> Hi Roland,
>
> this looks good to me. I'll run some extended testing and let you know once it's done.
> > Thanks, > Tobias > > On 22.05.19 11:25, Roland Westrelin wrote: >> >> http://cr.openjdk.java.net/~roland/8173196/webrev.00/ >> >> Previous attempt at this was discussed here: >> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-January/021014.html >> >> And the follow up bugs with some comments on a possible fix: >> >> https://bugs.openjdk.java.net/browse/JDK-8172850 >> >> The new fix is very similar to the previous one. The 2 differences are: >> >> - aarch64 code shouldn't need any change because of 8209420 ("Track >> membars for volatile accesses so they can be properly optimized") >> >> - The membar only affects the raw memory slice which is now properly >> handled by MachNodes thanks to 8209691 ("Allow MemBar on single memory >> slice") >> >> Roland. >> From dean.long at oracle.com Wed May 29 20:10:34 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Wed, 29 May 2019 13:10:34 -0700 Subject: RFR(XS) 8224931: disable JAOTC invokedynamic support until 8223533 is fixed In-Reply-To: <54ae61a3-059a-0975-4b53-2b55ca36e36d@oracle.com> References: <54ae61a3-059a-0975-4b53-2b55ca36e36d@oracle.com> Message-ID: Thanks Tom. dl On 5/29/19 11:59 AM, Tom Rodriguez wrote: > Looks good. > > tom > > dean.long at oracle.com wrote on 5/28/19 5:31 PM: >> https://bugs.openjdk.java.net/browse/JDK-8224931 >> http://cr.openjdk.java.net/~dlong/8224931/webrev/ >> >> Disable JAOTC invokedynamic support temporarily.? Register a plugin >> that always returns false for supportsDynamicInvoke(), which causes >> the parser to insert deoptimize nodes. 
>> >> dl From rkennke at redhat.com Wed May 29 20:31:11 2019 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 29 May 2019 22:31:11 +0200 Subject: RFR(S): 8224580: Matcher can cause oop field/array element to be reloaded In-Reply-To: References: <877eailvgp.fsf@redhat.com> <70fedac3-59a2-e077-4de0-af6f6604dc16@redhat.com> <87imtwaz43.fsf@redhat.com> Message-ID: <1febdda4-4d19-d19d-1abc-9b33b2bc5d2e@redhat.com> > It seems to do what you want, but I'm curious, what's the worse that can > happen if we call set_shared() on a node that doesn't need it? What would that be? set_shared() has been there before the change, but only if it's preceded by an address. The only case where this is different is loads from offset==0, in other words, loads of the mark-word. That would be locking code (which needs stricter access, e.g. volatile, anyway) or hashcode, as far as I can tell. In other words, the worst that can happen is that hashcode is not reloaded, where it could have been reloaded before? Doesn't sound like a problem to me. However, reloading in case of Shenandoah's barrier is a severe *correctness* problem. And a fairly nasty one too. Roman > dl > > On 5/27/19 4:57 AM, Roland Westrelin wrote: >>> The change looks reasonable to me. >>> >>> I have run tests with the patch and can confirm that the original bug >>> went away. I've also run a bunch of other tests and workloads and looks >>> good too. >> Thanks Roman. Anyone else for this? >> Shenandoah needs this for a key change that's also out for review: >> >> https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-May/025931.html >> >> (8224584: Shenandoah: Eliminate forwarding pointer word) >> >> Roland. > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From daniel.daugherty at oracle.com Wed May 29 22:31:04 2019 From: daniel.daugherty at oracle.com (Daniel D. 
Daugherty)
Date: Wed, 29 May 2019 18:31:04 -0400
Subject: RFR(T): 8225022 Put compiler/graalunit/JttThreadsTest.java on ProblemList-graal.txt
Message-ID:

Greetings,

Trivial change to reduce noise in the JDK13 CI:

$ hg diff
diff -r 5b400b9121d0 test/hotspot/jtreg/ProblemList-graal.txt
--- a/test/hotspot/jtreg/ProblemList-graal.txt Wed May 29 14:07:27 2019 -0400
+++ b/test/hotspot/jtreg/ProblemList-graal.txt Wed May 29 18:21:39 2019 -0400
@@ -39,6 +39,7 @@
 compiler/compilercontrol/mixed/RandomCommandsTest.java 8181753 generic-all

 compiler/graalunit/HotspotJdk9Test.java 8224254 generic-all
+compiler/graalunit/JttThreadsTest.java 8207757 generic-all

 compiler/jvmci/SecurityRestrictionsTest.java 8181837 generic-all

Thanks, in advance, for any questions, comments or suggestions.

Dan

From vladimir.kozlov at oracle.com Wed May 29 22:33:47 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 29 May 2019 15:33:47 -0700
Subject: RFR(T): 8225022 Put compiler/graalunit/JttThreadsTest.java on ProblemList-graal.txt
In-Reply-To:
References:
Message-ID:

Okay.

Thanks,
Vladimir

On 5/29/19 3:31 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> Trivial change to reduce noise in the JDK13 CI:
>
> $ hg diff
> diff -r 5b400b9121d0 test/hotspot/jtreg/ProblemList-graal.txt
> --- a/test/hotspot/jtreg/ProblemList-graal.txt Wed May 29 14:07:27 2019 -0400
> +++ b/test/hotspot/jtreg/ProblemList-graal.txt Wed May 29 18:21:39 2019 -0400
> @@ -39,6 +39,7 @@
>  compiler/compilercontrol/mixed/RandomCommandsTest.java 8181753 generic-all
>
>  compiler/graalunit/HotspotJdk9Test.java 8224254 generic-all
> +compiler/graalunit/JttThreadsTest.java 8207757 generic-all
>
>  compiler/jvmci/SecurityRestrictionsTest.java 8181837 generic-all
>
> Thanks, in advance, for any questions, comments or suggestions.
>
> Dan
>

From daniel.daugherty at oracle.com Wed May 29 23:11:05 2019
From: daniel.daugherty at oracle.com (Daniel D.
Daugherty)
Date: Wed, 29 May 2019 19:11:05 -0400
Subject: RFR(T): 8225022 Put compiler/graalunit/JttThreadsTest.java on ProblemList-graal.txt
In-Reply-To:
References:
Message-ID:

Thanks!

Dan

On 5/29/19 6:33 PM, Vladimir Kozlov wrote:
> Okay.
>
> Thanks,
> Vladimir
>
> On 5/29/19 3:31 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> Trivial change to reduce noise in the JDK13 CI:
>>
>> $ hg diff
>> diff -r 5b400b9121d0 test/hotspot/jtreg/ProblemList-graal.txt
>> --- a/test/hotspot/jtreg/ProblemList-graal.txt Wed May 29 14:07:27
>> 2019 -0400
>> +++ b/test/hotspot/jtreg/ProblemList-graal.txt Wed May 29 18:21:39
>> 2019 -0400
>> @@ -39,6 +39,7 @@
>>  compiler/compilercontrol/mixed/RandomCommandsTest.java 8181753
>> generic-all
>>
>>  compiler/graalunit/HotspotJdk9Test.java 8224254 generic-all
>> +compiler/graalunit/JttThreadsTest.java 8207757 generic-all
>>
>>  compiler/jvmci/SecurityRestrictionsTest.java 8181837 generic-all
>>
>> Thanks, in advance, for any questions, comments or suggestions.
>>
>> Dan
>>

From fujie at loongson.cn Thu May 30 06:16:12 2019
From: fujie at loongson.cn (Jie Fu)
Date: Thu, 30 May 2019 14:16:12 +0800
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To: <0669b1e3-5258-7765-aac8-8d3e5c47066c@oracle.com>
References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn>
 <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com>
 <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com>
 <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn>
 <259a914e-1c9c-c884-6114-6f855a96afb6@loongson.cn>
 <1060f01d-dcfa-3a04-284d-1c6a95c791fc@oracle.com>
 <4c2da2fb-7550-d51b-539a-4656fc67bb00@oracle.com>
 <38b00331-34f7-b4a8-f033-f1489a154806@loongson.cn>
 <0669b1e3-5258-7765-aac8-8d3e5c47066c@oracle.com>
Message-ID:

Hi Vladimir Ivanov,

Thank you for your kind review and nice suggestions.
Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.06/

Please see comments inlined below.

Thanks a lot.
Best regards,
Jie

On 2019/5/29 8:02, Vladimir Ivanov wrote:
> There are other places where overflow can occur:
>
>           int rcount = call->receiver_count(i) + epsilon;
>
>           receivers_count_total += rcount;
>
> Maybe introduce a helper routine for saturating addition?
>
Good catch. Fixed.

>
> src/hotspot/share/oops/methodData.hpp:
>
>    uint count() const {
> -    return uint_at(count_off);
> +    intptr_t raw_data = intptr_at(count_off);
> +    if (raw_data > max_jint) {
> +      raw_data = max_jint;
> +    } else if (raw_data < min_jint) {
> +      raw_data = min_jint;
> +    }
> +    return uint(raw_data);
>    }
>
> Should uintptr_t be used here instead? It looks like this version
> won't work as expected on 32-bit platforms (since uint => intptr_t
> conversion becomes lossy).
uintptr_t shouldn't be used here since the counter may contain a negative value[1][2].
If uintx was used, max_jint would be returned by CounterData::count() in cases of negative values, which is unexpected.
And I don't think the type conversion is problematic due to the inaccuracy of profile[3].
What we care about is just whether the counter is greater or less than a specific threshold, not its absolute value.

>
> Also, I don't understand why you added comparison against min_jint.
> Since counts are unsigned, the following should be enough:
I did it because the counter may contain a negative value.
It might be better to also detect and handle the underflow condition.
The interpreter also does this here[4].

[1] http://hg.openjdk.java.net/jdk/jdk/file/c41783eb76eb/src/hotspot/share/ci/ciMethod.cpp#l515
[2] http://hg.openjdk.java.net/jdk/jdk/file/c41783eb76eb/src/hotspot/share/ci/ciMethod.cpp#l533
[3] http://hg.openjdk.java.net/jdk/jdk/file/c41783eb76eb/src/hotspot/share/oops/methodData.hpp#l49
[4] http://hg.openjdk.java.net/jdk/jdk/file/c41783eb76eb/src/hotspot/cpu/x86/interp_masm_x86.cpp#l1407

>
>   uint count() const {
>     uintx raw_data = (uintx)intptr_at(count_off);
>     if (raw_data > max_jint) {
>       raw_data = max_jint;
>     }
>     return uint(raw_data);
>   }

From OGATAK at jp.ibm.com Thu May 30 07:10:24 2019
From: OGATAK at jp.ibm.com (Kazunori Ogata)
Date: Thu, 30 May 2019 16:10:24 +0900
Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8185969: PPC64: Improve VSR support to use up to 64 registers
In-Reply-To:
References: <1bd63cd1-efbb-e70d-62e5-510d364f712b@redhat.com>
 <6459888e-be23-e362-3b09-c5cd4afa701f@redhat.com>
Message-ID:

Hi,

Fixed a typo in the bug ID in the title.

Since it looks like there is no further objection, I added the jdk8u-fix-request tag to the original bug report.

Regards,
Ogata

Kazunori Ogata/Japan/IBM wrote on 2019/05/28 14:22:20:

> From: Kazunori Ogata/Japan/IBM
> To: Andrew John Hughes
> Cc: "hotspot-compiler-dev at openjdk.java.net", "jdk8u-dev at openjdk.java.net", "Doerr, Martin"
> Date: 2019/05/28 14:22
> Subject: Re: [EXTERNAL] Re: [8u-dev, ppc] RFR for (almost clean) backport
> of 8185696: PPC64: Improve VSR support to use up to 64 registers
>
> Hi Andrew and Martin,
>
> Thank you for your comments.
>
> My original intention to change the copyright year was that I did some
> work to apply the original patch to this file. I now realized I made no
> change in the code that was modified in the original patch. So I agree
> not updating the copyright year is more natural.
> > I updated webrev: > http://cr.openjdk.java.net/~horii/jdk8u_aes_be/8185969/webrev.03/ > > Regards, > Ogata > > Andrew John Hughes wrote on 2019/05/28 02:28:28: > > > From: Andrew John Hughes > > To: "Doerr, Martin" , Kazunori Ogata > > , "hotspot-compiler-dev at openjdk.java.net" > compiler-dev at openjdk.java.net>, "jdk8u-dev at openjdk.java.net" > dev at openjdk.java.net> > > Date: 2019/05/28 02:28 > > Subject: [EXTERNAL] Re: [8u-dev, ppc] RFR for (almost clean) backport of > > 8185696: PPC64: Improve VSR support to use up to 64 registers > > > > > > > > On 27/05/2019 17:41, Doerr, Martin wrote: > > > Hi, > > > > > > I think it's fine. > > > > > > I guess the copyright updates in the other files were part of other > > changes which have not been backported. > > > I don't think we have to update copyrights in backport changes other > > than what comes naturally with the change. > > > > > > Best regards, > > > Martin > > > > > > > > > > The change to the copyright header in assembler_ppc.hpp is an addition > > in this backport. So either that should be dropped or the same should be > > applied to register_ppc.{c,h}pp (the remaining file is already 2019). > > > > I tend towards dropping it, but we should at least be consistent within > > the same patch. > > > > Best regards, > > -- > > Andrew :) > > > > Senior Free Java Software Engineer > > Red Hat, Inc. 
(http://www.redhat.com) > > > > PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net) > > Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 > > https://keybase.io/gnu_andrew > > > > [attachment "signature.asc" deleted by Kazunori Ogata/Japan/IBM] From vladimir.x.ivanov at oracle.com Thu May 30 14:59:51 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Thu, 30 May 2019 17:59:51 +0300 Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached In-Reply-To: References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com> <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn> <259a914e-1c9c-c884-6114-6f855a96afb6@loongson.cn> <1060f01d-dcfa-3a04-284d-1c6a95c791fc@oracle.com> <4c2da2fb-7550-d51b-539a-4656fc67bb00@oracle.com> <38b00331-34f7-b4a8-f033-f1489a154806@loongson.cn> <0669b1e3-5258-7765-aac8-8d3e5c47066c@oracle.com> Message-ID: <8e25f068-609a-de80-a020-fbc0ede4b96a@oracle.com> Thanks for clarifications, Jie! src/hotspot/share/oops/methodData.hpp: // Direct accessor uint count() const { - return uint_at(count_off); + intptr_t raw_data = intptr_at(count_off); + if (raw_data > max_jint) { + raw_data = max_jint; + } else if (raw_data < min_jint) { + raw_data = min_jint; + } + return uint(raw_data); } I agree with this change. Switching return type to int would make it clearer. src/hotspot/share/ci/ciMethod.cpp: +int ciMethod::saturated_add_int(int a, int b) { - int rcount = call->receiver_count(i) + epsilon; + int rcount = saturated_add_int(call->receiver_count(i), epsilon); call->receiver_count(i) is uint, so you still can experience overflow when converting from uint to int. Considering receiver counts are positive, I'd use saturating add on uints (and don't care about min_jint). Otherwise, the fix looks good. 
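
For illustration only, the "saturating add on uints" suggested above could be sketched roughly as follows. This is not the code from the webrev: the name saturated_add_uint and the choice of INT_MAX as the cap are assumptions made for this sketch, picked so that a later uint-to-int conversion of the result cannot overflow.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper (not taken from the webrev): add two unsigned counters
// and clamp at `limit` instead of wrapping around. The default limit is
// INT_MAX so that the result still fits in a signed int afterwards.
static unsigned int saturated_add_uint(unsigned int a, unsigned int b,
                                       unsigned int limit = INT_MAX) {
  if (a >= limit) {
    return limit;  // first operand is already at or above the cap
  }
  // limit - a cannot underflow here because a < limit.
  return (b > limit - a) ? limit : a + b;
}
```

With a helper of this shape, an expression such as call->receiver_count(i) + epsilon would become a single call, and no comparison against min_jint is needed because both operands are unsigned.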
Inaccuracies in profiling data are expected, but overflows may cause drastic changes in behavior, so it's better to catch such cases and handle them properly.

Regarding handling overflow during profiling:

  * C1 doesn't handle counter overflow [1]

  * What template interpreter does to avoid overflow is not enough for concurrent case: it stores new value into memory and then conditionally decrements it, but another thread may already see it. Proper solution would be to keep the value in register, but that requires a temporary register to burn.

So, it seems easier and cheaper (from the perspective of profiling overhead) to handle that on compiler side when interpreting the data.

Best regards,
Vladimir Ivanov

[1] http://hg.openjdk.java.net/jdk/jdk/file/c41783eb76eb/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp#l1617

On 30/05/2019 09:16, Jie Fu wrote:
> Hi Vladimir Ivanov,
>
> Thank you for your kind review and nice suggestions.
> Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.06/
>
> Please see comments inlined below.
>
> Thanks a lot.
> Best regards,
> Jie
>
> On 2019/5/29 8:02, Vladimir Ivanov wrote:
>> There are other places where overflow can occur:
>>
>>           int rcount = call->receiver_count(i) + epsilon;
>>
>>           receivers_count_total += rcount;
>>
>> Maybe introduce a helper routine for saturating addition?
>>
> Good catch. Fixed.
>>
>> src/hotspot/share/oops/methodData.hpp:
>>
>>    uint count() const {
>> -    return uint_at(count_off);
>> +    intptr_t raw_data = intptr_at(count_off);
>> +    if (raw_data > max_jint) {
>> +      raw_data = max_jint;
>> +    } else if (raw_data < min_jint) {
>> +      raw_data = min_jint;
>> +    }
>> +    return uint(raw_data);
>>    }
>>
>> Should uintptr_t be used here instead? It looks like this version
>> won't work as expected on 32-bit platforms (since uint => intptr_t
>> conversion becomes lossy).
> uintptr_t shouldn't be used here since the counter may contain a
> negative value[1][2].
> If uintx was used, max_jint would be returned by CounterData::count() in
> cases of negative values, which is unexpected.
> And I don't think the type conversion is problematic due to the
> inaccuracy of profile[3].
> What we care about is just whether the counter is greater or less than a
> specific threshold, not its absolute value.
>>
>> Also, I don't understand why you added comparison against min_jint.
>> Since counts are unsigned, the following should be enough:
> I did it because the counter may contain a negative value.
> It might be better to also detect and handle the underflow condition.
> The interpreter also does this here[4].
>
> [1]
> http://hg.openjdk.java.net/jdk/jdk/file/c41783eb76eb/src/hotspot/share/ci/ciMethod.cpp#l515
>
> [2]
> http://hg.openjdk.java.net/jdk/jdk/file/c41783eb76eb/src/hotspot/share/ci/ciMethod.cpp#l533
>
> [3]
> http://hg.openjdk.java.net/jdk/jdk/file/c41783eb76eb/src/hotspot/share/oops/methodData.hpp#l49
>
> [4]
> http://hg.openjdk.java.net/jdk/jdk/file/c41783eb76eb/src/hotspot/cpu/x86/interp_masm_x86.cpp#l1407
>
>
>
>>
>>   uint count() const {
>>     uintx raw_data = (uintx)intptr_at(count_off);
>>     if (raw_data > max_jint) {
>>       raw_data = max_jint;
>>     }
>>     return uint(raw_data);
>>   }
>

From adinn at redhat.com Thu May 30 16:08:02 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 30 May 2019 17:08:02 +0100
Subject: RFR: 8224974: Implement JEP 352
In-Reply-To: <2153cb8e-b4c1-d3c5-4c28-3e32d50db2ea@oracle.com>
References: <80da32b2-7acb-7b94-b82c-5dcd5cf95539@redhat.com>
 <506d2bee-9376-52f7-731c-4d872c944847@oracle.com>
 <607e2f32-1f01-e45f-42e8-a645fa374cf7@oracle.com>
 <06ef3b25-9e34-f6f3-e4c4-319adea52ae7@redhat.com>
 <2153cb8e-b4c1-d3c5-4c28-3e32d50db2ea@oracle.com>
Message-ID: <656f462e-c655-0c48-7c90-190c92e0bc28@redhat.com>

Hi Vladimir,

Thank you for reviewing the patch.

On 29/05/2019 20:49, Vladimir Kozlov wrote:
> I tried to test these changes and build failed on all systems except
> Linux because:
>
> workspace/open/src/hotspot/share/prims/unsafe.cpp:446:3: error: use of
> undeclared identifier 'JNU_ThrowRuntimeException'
>    JNU_ThrowRuntimeException(env, "writeback is not implemented");
>    ^

Apologies for that. I forgot to test this via the submit repo after
cut-and-pasting the checks for OS and CPU support from the map0 native
method to the Unsafe writeback method.

I had to make some tweaks to this code anyway in order to spot an issue
Alan noticed when checking the CSR (the map code was not distinguishing
the precise cases where IOException and UnsupportedOperationException
needed to be thrown and would sometimes have replaced the latter with
the former on Windows/x86_64).

I have uploaded a new webrev which attempts to address the problem.

http://cr.openjdk.java.net/~adinn/8224974/webrev.03

The test to see whether writeback is enabled on the current cpu_os
combination is now performed in Java from methods Unsafe.writebackMemory
and FileChannelImpl.map, using a call to Unsafe.isWritebackEnabled().

There are also 'belt and braces' checks in the corresponding native
implementation methods:

Unsafe asserts that VM_Version::supports_data_cache_line_writeback()
returns true. The result of this method indicates whether support is
available on both CPU and OS. It returns a value computed using a call
to a new OS-specific method os::supports_map_sync() and, on hardware for
which that is true (AArch64 and x86_64), a test of the relevant CPU
status bits.

FileChannelImpl still relies on conditional compilation to reject calls
on invalid OS/CPU combinations (the VM_VERSION method is not available
for it to call). In the branch for !LINUX || !(AArch64 || amd64) it
throws an InternalError as this path should not be reached.

Unfortunately, this latest webrev still fails when uploaded to the
submit repo.
The problem seems to be specific to Windows/x86_64 builds. The branch that failed is JDK-8224974-03. The returned text is appended after my signature. Would you be able to provide some details about the errors? > ------------------------------------------------------------ > Also Graal test should be fixed for new intrinsics (add them to > 'toBeInvestigated' for isJDK13orHigher): > > src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java That has been fixed in the webrev mentioned above. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander ----- 8< -------- 8< -------- 8< -------- 8< -------- 8< -------- 8< --- BuildId: 2019-05-30-1509485.adinn.source No failed tests Tasks Summary UNABLE_TO_RUN: 18 NOTHING_TO_RUN: 0 KILLED: 0 EXECUTED_WITH_FAILURE: 4 FAILED: 0 HARNESS_ERROR: 0 NA: 0 PASSED: 55 Build 1 Unable to run windows-x64-install-windows-x64-build-19 Dependency task failed: mach5...1512-2804499-windows-x64-windows-x64-build-12 4 Executed with failure windows-x64-windows-x64-build-12 error while building, return value: 2 windows-x64-debug-windows-x64-build-13 error while building, return value: 2 windows-x64-open-windows-x64-build-14 error while building, return value: 2 windows-x64-open-debug-windows-x64-build-15 error while building, return value: 2 Test 17 Unable to run tier1-product-open_test_hotspot_jtreg_tier1_common-windows-x64-22 Dependency task failed: mach5...1512-2804499-windows-x64-windows-x64-build-12 tier1-debug-open_test_hotspot_jtreg_tier1_common-windows-x64-debug-28 Dependency task failed: mach5...804499-windows-x64-debug-windows-x64-build-13 tier1-debug-open_test_hotspot_jtreg_tier1_compiler_1-windows-x64-debug-31 Dependency task failed: mach5...804499-windows-x64-debug-windows-x64-build-13 
tier1-debug-open_test_hotspot_jtreg_tier1_compiler_2-windows-x64-debug-34 Dependency task failed: mach5...804499-windows-x64-debug-windows-x64-build-13 tier1-debug-open_test_hotspot_jtreg_tier1_compiler_3-windows-x64-debug-37 Dependency task failed: mach5...804499-windows-x64-debug-windows-x64-build-13 tier1-debug-open_test_hotspot_jtreg_tier1_compiler_not_xcomp-windows-x64-debug-40 Dependency task failed: mach5...804499-windows-x64-debug-windows-x64-build-13 tier1-debug-open_test_hotspot_jtreg_tier1_gc_1-windows-x64-debug-43 Dependency task failed: mach5...804499-windows-x64-debug-windows-x64-build-13 tier1-debug-open_test_hotspot_jtreg_tier1_gc_2-windows-x64-debug-46 Dependency task failed: mach5...804499-windows-x64-debug-windows-x64-build-13 tier1-product-open_test_hotspot_jtreg_tier1_gc_gcbasher-windows-x64-25 Dependency task failed: mach5...1512-2804499-windows-x64-windows-x64-build-12 tier1-debug-open_test_hotspot_jtreg_tier1_gc_gcbasher-windows-x64-debug-49 Dependency task failed: mach5...804499-windows-x64-debug-windows-x64-build-13 See all 17... From gnu.andrew at redhat.com Thu May 30 16:33:06 2019 From: gnu.andrew at redhat.com (Andrew John Hughes) Date: Thu, 30 May 2019 17:33:06 +0100 Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8185969: PPC64: Improve VSR support to use up to 64 registers In-Reply-To: References: <1bd63cd1-efbb-e70d-62e5-510d364f712b@redhat.com> <6459888e-be23-e362-3b09-c5cd4afa701f@redhat.com> Message-ID: On 30/05/2019 08:10, Kazunori Ogata wrote: > Hi, > > Fixed a typo in the bug ID in the title. > > Since it looks there is no further objection, I added jdk8u-fix-request > tag in the original bug report. > > > Regards, > Ogata > The updated webrev looks fine now. However, please don't assume an acceptable review just because there is no objection. Thanks, -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. 
(http://www.redhat.com)

PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222
https://keybase.io/gnu_andrew

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 228 bytes
Desc: OpenPGP digital signature
URL:

From vladimir.kozlov at oracle.com Thu May 30 16:58:49 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 30 May 2019 09:58:49 -0700
Subject: RFR: 8224974: Implement JEP 352
In-Reply-To: <656f462e-c655-0c48-7c90-190c92e0bc28@redhat.com>
References: <80da32b2-7acb-7b94-b82c-5dcd5cf95539@redhat.com>
 <506d2bee-9376-52f7-731c-4d872c944847@oracle.com>
 <607e2f32-1f01-e45f-42e8-a645fa374cf7@oracle.com>
 <06ef3b25-9e34-f6f3-e4c4-319adea52ae7@redhat.com>
 <2153cb8e-b4c1-d3c5-4c28-3e32d50db2ea@oracle.com>
 <656f462e-c655-0c48-7c90-190c92e0bc28@redhat.com>
Message-ID:

On 5/30/19 9:08 AM, Andrew Dinn wrote:
> Hi Vladimir,
>
> Thank you for reviewing the patch.
>
> On 29/05/2019 20:49, Vladimir Kozlov wrote:
>> I tried to test these changes and build failed on all systems except
>> Linux because:
>>
>> workspace/open/src/hotspot/share/prims/unsafe.cpp:446:3: error: use of
>> undeclared identifier 'JNU_ThrowRuntimeException'
>>    JNU_ThrowRuntimeException(env, "writeback is not implemented");
>>    ^
>
> Apologies for that. I forgot to test this via the submit repo after
> cut-and-pasting the checks for OS and CPU support from the map0 native
> method to the Unsafe writeback method.
>
> I had to make some tweaks to this code anyway in order to spot an issue
> Alan noticed when checking the CSR (the map code was not distinguishing
> the precise cases where IOException and UnsupportedOperationException
> needed to be thrown and would sometimes have replaced the latter with
> the former on Windows/x86_64).

Okay.

>
> I have uploaded a new webrev which attempts to address the problem.
>
> http://cr.openjdk.java.net/~adinn/8224974/webrev.03

I looked only at the HotSpot code.

stubGenerator_x86_64.cpp - in generate_data_cache_writeback() the following are not used:

+ bool optimized = VM_Version::supports_clflushopt();
+ bool no_evict = VM_Version::supports_clwb();

vm_version_x86.hpp - it should check the CPUID flag in 32-bit:

+#else
+ static bool supports_clflush() { return true; }

We don't check has_match_rule() in LibraryCallKit any more. In .ad files you need to add a predicate to the new instructions:

predicate(VM_Version::supports_data_cache_line_flush());

Also add this check to Matcher::match_rule_supported(), which you can then use in C2Compiler::is_intrinsic_supported().

DISABLE_UNSAFE_WRITEBACK_INTRINSIC should be checked much earlier, maybe together with os::supports_map_sync() when you set _data_cache_line_flush_size.

>
> The test to see whether writeback is enabled on the current cpu_os
> combination is now performed in Java from methods Unsafe.writebackMemory
> and FileChannelImpl.map, using a call to Unsafe.isWritebackEnabled()
>
> There are also 'belt and braces' checks in the corresponding native
> implementation methods:
>
> Unsafe asserts that VM_Version::supports_data_cache_line_writeback()
> returns true. The result of this method indicates whether support is
> available on both CPU and OS. It returns a value computed using a call
> to a new OS-specific method os::supports_map_sync() and, on hardware for
> which that is true (AArch64 and x86_64), a test of the relevant CPU
> status bits.
>
> FileChannelImpl still relies on conditional compilation to reject calls
> on invalid OS/CPU combinations (the VM_VERSION method is not available
> for it to call). In the branch for !LINUX || !(AArch64 || amd64) it
> throws an InternalError as this path should not be reached.
>
> Unfortunately, this latest webrev still fails when uploaded to the
> submit repo. The problem seems to be specific to Windows/x86_64 builds.
> The branch that failed is JDK-8224974-03.
The returned text is appended > after my signature. Would you be able to provide some details about the > errors t:/workspace/open/src/java.base/windows/native/libnio/ch/FileChannelImpl.c(64): error C2220: warning treated as error - no 'object' file generated t:/workspace/open/src/java.base/windows/native/libnio/ch/FileChannelImpl.c(64): warning C4029: declared formal parameter list different from definition > >> ------------------------------------------------------------ >> Also Graal test should be fixed for new intrinsics (add them to >> 'toBeInvestigated' for isJDK13orHigher): >> >> src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/CheckGraalIntrinsics.java > That has been fixed in the webrev mentioned above. Good. Thanks, Vladimir > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > > > ----- 8< -------- 8< -------- 8< -------- 8< -------- 8< -------- 8< --- > > BuildId: 2019-05-30-1509485.adinn.source > No failed tests > Tasks Summary > > UNABLE_TO_RUN: 18 > NOTHING_TO_RUN: 0 > KILLED: 0 > EXECUTED_WITH_FAILURE: 4 > FAILED: 0 > HARNESS_ERROR: 0 > NA: 0 > PASSED: 55 > Build > > 1 Unable to run > windows-x64-install-windows-x64-build-19 Dependency task > failed: mach5...1512-2804499-windows-x64-windows-x64-build-12 > 4 Executed with failure > windows-x64-windows-x64-build-12 error while building, > return value: 2 > windows-x64-debug-windows-x64-build-13 error while building, > return value: 2 > windows-x64-open-windows-x64-build-14 error while building, > return value: 2 > windows-x64-open-debug-windows-x64-build-15 error while > building, return value: 2 > > Test > > 17 Unable to run > > tier1-product-open_test_hotspot_jtreg_tier1_common-windows-x64-22 > Dependency task failed: > 
mach5...1512-2804499-windows-x64-windows-x64-build-12 > > tier1-debug-open_test_hotspot_jtreg_tier1_common-windows-x64-debug-28 > Dependency task failed: > mach5...804499-windows-x64-debug-windows-x64-build-13 > > tier1-debug-open_test_hotspot_jtreg_tier1_compiler_1-windows-x64-debug-31 Dependency > task failed: mach5...804499-windows-x64-debug-windows-x64-build-13 > > tier1-debug-open_test_hotspot_jtreg_tier1_compiler_2-windows-x64-debug-34 Dependency > task failed: mach5...804499-windows-x64-debug-windows-x64-build-13 > > tier1-debug-open_test_hotspot_jtreg_tier1_compiler_3-windows-x64-debug-37 Dependency > task failed: mach5...804499-windows-x64-debug-windows-x64-build-13 > > tier1-debug-open_test_hotspot_jtreg_tier1_compiler_not_xcomp-windows-x64-debug-40 > Dependency task failed: > mach5...804499-windows-x64-debug-windows-x64-build-13 > > tier1-debug-open_test_hotspot_jtreg_tier1_gc_1-windows-x64-debug-43 > Dependency task failed: > mach5...804499-windows-x64-debug-windows-x64-build-13 > > tier1-debug-open_test_hotspot_jtreg_tier1_gc_2-windows-x64-debug-46 > Dependency task failed: > mach5...804499-windows-x64-debug-windows-x64-build-13 > > tier1-product-open_test_hotspot_jtreg_tier1_gc_gcbasher-windows-x64-25 > Dependency task failed: > mach5...1512-2804499-windows-x64-windows-x64-build-12 > > tier1-debug-open_test_hotspot_jtreg_tier1_gc_gcbasher-windows-x64-debug-49 > Dependency task failed: > mach5...804499-windows-x64-debug-windows-x64-build-13 > See all 17... 
> From tom.rodriguez at oracle.com Thu May 30 18:27:49 2019 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Thu, 30 May 2019 11:27:49 -0700 Subject: RFR 8209626: [JVMCI] Use implicit exception table for dispatch and printing In-Reply-To: <4bdbc2a1-59dd-1838-5831-af2759a3c4ac@oracle.com> References: <09596941-d41c-fcc5-e54b-640f080d2881@oracle.com> <8dce515a-d066-bd29-8c58-bf8a515171e4@oracle.com> <46e4f527-7ba8-a5a2-157e-34d5001a44ad@oracle.com> <9a8bfea4-3a64-fd5f-afff-78d4af062124@oracle.com> <4bdbc2a1-59dd-1838-5831-af2759a3c4ac@oracle.com> Message-ID: <11db7afe-d111-0908-851e-f100b6050f75@oracle.com> I have updated this webrev to include fixes to AOT to properly capture the implicit exception table and record the offset for it in the AOT binary. It required minor Graal changes which I will push upstream separately. Please rereview. tom Tom Rodriguez wrote on 12/12/18 11:22 PM: > > > Vladimir Kozlov wrote on 12/12/18 2:29 PM: >> On 12/12/18 1:06 PM, Tom Rodriguez wrote: >>> They all look like preexisting failures to me.? The >>> CheckGraalIntrinsics one you mentioned in chat and >> >> yes >> >>> compiler/aot/DeoptimizationTest.java which seems to have been failing >>> at least intermittently for a while.? What do you think? >> >> SIGFPE is new. And I think your changes in sharedRuntime.cpp may >> affected execution of AOT methods because they are marked as compiled >> by Graal (compiler_jvmci): >> >> http://hg.openjdk.java.net/jdk/jdk/file/9e28eff3d40f/src/hotspot/share/aot/aotCompiledMethod.hpp#l131 > > > Yes I think I need to move some code around to properly support AOT. > I'll send out an updated webrev soon but I think we can defer this one > until jdk 13. > > tom > >> >> >> Vladimir >> >>> >>> It does make me wonder if AOT needs any extra support to use the >>> implicit exception table.? I would assume we'd be seeing problems if >>> that was the case but don't really know. 
>>>
>>> tom
>>>
>>> Vladimir Kozlov wrote on 12/12/18 11:12 AM:
>>>> Tom,
>>>>
>>>> Some tests failed.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 12/12/18 10:42 AM, Tom Rodriguez wrote:
>>>>> http://cr.openjdk.java.net/~never/8209626/webrev
>>>>> https://bugs.openjdk.java.net/browse/JDK-8209626
>>>>>
>>>>> Graal handles implicit exceptions by deoptimizing and that's
>>>>> currently done in a way that's hard to understand from the
>>>>> PrintNMethods output. Basically there's just an extra PcDesc at the
>>>>> implicit check location and the runtime assumes that a fault with a
>>>>> PcDesc underneath it an implicit check. This changes JVMCI to use
>>>>> the implicit exception table to mark these locations specially
>>>>> which simplifies the dispatching and
>>>>> printing. The new print output looks like this:
>>>>>
>>>>>   0x0000000120f053a0: mov    DWORD PTR [rsp-0x14000],eax
>>>>>   0x0000000120f053a7: sub    rsp,0x18
>>>>>   0x0000000120f053ab: mov    QWORD PTR [rsp+0x10],rbp ;*aload_0
>>>>> {reexecute=1 rethrow=0 return_oop=0}
>>>>>                                                             ; -
>>>>> java.lang.StringLatin1::equals@0 (line 94)
>>>>>
>>>>>   0x0000000120f053b0: mov    eax,DWORD PTR [rsi+0xc]        ;
>>>>> implicit exception: deoptimizes
>>>>>                                                             ;
>>>>> ImmutableOopMap{rdx=Oop rsi=Oop }
>>>>> ;*aload_0 {reexecute=1 rethrow=0 return_oop=0}
>>>>>                                                             ; -
>>>>> java.lang.StringLatin1::equals@0 (line 94)
>>>>>
>>>>>   0x0000000120f053b3: mov    r10d,DWORD PTR [rdx+0xc]       ;
>>>>> implicit exception: deoptimizes
>>>>>                                                             ;
>>>>> ImmutableOopMap{rdx=Oop rsi=Oop }
>>>>>
>>>>> The scope information is still printed in the normal original
>>>>> location. This has been in use with JVMCI 8 for several months.
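
Editorial sketch (not part of the original message): the idea of marking implicit check locations in a table, so that a fault can be classified by a lookup instead of by the presence of a PcDesc, might look roughly like the following. Everything here is hypothetical. ImplicitTable, ImplicitEntry and FaultAction are illustrative names, not HotSpot's ImplicitExceptionTable API, and the explicit deoptimizes flag is an assumption made for readability rather than the encoding used in the webrev.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// What the runtime should do when a hardware fault hits compiled code.
enum class FaultAction { NotImplicit, Dispatch, Deoptimize };

// One entry per instruction that may fault on a null receiver (hypothetical layout).
struct ImplicitEntry {
  uint32_t exec_offset;  // code offset of the possibly-faulting instruction
  uint32_t cont_offset;  // where execution continues when dispatching
  bool     deoptimizes;  // true: leave compiled code instead of dispatching
};

class ImplicitTable {
 public:
  void add(uint32_t exec, uint32_t cont, bool deopt) {
    // Each instruction offset is expected to appear at most once.
    assert(classify(exec) == FaultAction::NotImplicit && "one entry per instruction");
    entries_.push_back({exec, cont, deopt});
  }

  // Classify a fault at fault_offset; on Dispatch, report the continuation.
  FaultAction classify(uint32_t fault_offset, uint32_t* cont_out = nullptr) const {
    for (const ImplicitEntry& e : entries_) {
      if (e.exec_offset != fault_offset) continue;
      if (e.deoptimizes) return FaultAction::Deoptimize;
      if (cont_out != nullptr) *cont_out = e.cont_offset;
      return FaultAction::Dispatch;
    }
    return FaultAction::NotImplicit;  // not a recorded implicit check
  }

 private:
  std::vector<ImplicitEntry> entries_;
};
```

A printer can walk the same entries to annotate the disassembly, which is the kind of "implicit exception: deoptimizes" comment shown in the listing above.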
>>>>>
>>>>> tom

From vladimir.kozlov at oracle.com Thu May 30 18:55:13 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 30 May 2019 11:55:13 -0700
Subject: RFR 8209626: [JVMCI] Use implicit exception table for dispatch and printing
In-Reply-To: <11db7afe-d111-0908-851e-f100b6050f75@oracle.com>
References: <09596941-d41c-fcc5-e54b-640f080d2881@oracle.com> <8dce515a-d066-bd29-8c58-bf8a515171e4@oracle.com> <46e4f527-7ba8-a5a2-157e-34d5001a44ad@oracle.com> <9a8bfea4-3a64-fd5f-afff-78d4af062124@oracle.com> <4bdbc2a1-59dd-1838-5831-af2759a3c4ac@oracle.com> <11db7afe-d111-0908-851e-f100b6050f75@oracle.com>
Message-ID: <8bbf04d2-09d6-6f9f-930c-77e97a03bdc9@oracle.com>

Hi Tom,

Can you add comment to next new code that it is to support old JVMCI? Otherwise it is confusing because in JDK repo we will not have the issue:

getDeclaredMethod("implicitExceptionTable");

In jvmci/jvmciRuntime.hpp spacing is not aligned with the rest of arguments.

The rest seems fine. what testing you did with new code?

Thanks,
Vladimir

On 5/30/19 11:27 AM, Tom Rodriguez wrote:
> I have updated this webrev to include fixes to AOT to properly capture the implicit exception table and record the
> offset for it in the AOT binary. It required minor Graal changes which I will push upstream separately. Please rereview.
>
> tom
>
> Tom Rodriguez wrote on 12/12/18 11:22 PM:
>>
>>
>> Vladimir Kozlov wrote on 12/12/18 2:29 PM:
>>> On 12/12/18 1:06 PM, Tom Rodriguez wrote:
>>>> They all look like preexisting failures to me. The CheckGraalIntrinsics one you mentioned in chat and
>>>
>>> yes
>>>
>>>> compiler/aot/DeoptimizationTest.java which seems to have been failing at least intermittently for a while. What do
>>>> you think?
>>>
>>> SIGFPE is new.
And I think your changes in sharedRuntime.cpp may affected execution of AOT methods because they are
>>> marked as compiled by Graal (compiler_jvmci):
>>>
>>> http://hg.openjdk.java.net/jdk/jdk/file/9e28eff3d40f/src/hotspot/share/aot/aotCompiledMethod.hpp#l131
>>
>>
>> Yes I think I need to move some code around to properly support AOT. I'll send out an updated webrev soon but I think
>> we can defer this one until jdk 13.
>>
>> tom
>>
>>>
>>>
>>> Vladimir
>>>
>>>>
>>>> It does make me wonder if AOT needs any extra support to use the implicit exception table. I would assume we'd be
>>>> seeing problems if that was the case but don't really know.
>>>>
>>>> tom
>>>>
>>>> Vladimir Kozlov wrote on 12/12/18 11:12 AM:
>>>>> Tom,
>>>>>
>>>>> Some tests failed.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 12/12/18 10:42 AM, Tom Rodriguez wrote:
>>>>>> http://cr.openjdk.java.net/~never/8209626/webrev
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8209626
>>>>>>
>>>>>> Graal handles implicit exceptions by deoptimizing and that's currently done in a way that's hard to understand
>>>>>> from the PrintNMethods output. Basically there's just an extra PcDesc at the implicit check location and the
>>>>>> runtime assumes that a fault with a PcDesc underneath it an implicit check. This changes JVMCI to use the
>>>>>> implicit exception table to mark these locations specially which simplifies the dispatching and
>>>>>> printing. The new print output looks like this:
>>>>>>
>>>>>>   0x0000000120f053a0: mov    DWORD PTR [rsp-0x14000],eax
>>>>>>   0x0000000120f053a7: sub    rsp,0x18
>>>>>>   0x0000000120f053ab: mov    QWORD PTR [rsp+0x10],rbp ;*aload_0 {reexecute=1 rethrow=0 return_oop=0}
>>>>>>                                                             ; - java.lang.StringLatin1::equals@0 (line 94)
>>>>>>
>>>>>>   0x0000000120f053b0: mov    eax,DWORD PTR [rsi+0xc]
; implicit exception: deoptimizes
>>>>>>                                                             ; ImmutableOopMap{rdx=Oop rsi=Oop }
>>>>>> ;*aload_0 {reexecute=1 rethrow=0 return_oop=0}
>>>>>>                                                             ; - java.lang.StringLatin1::equals@0 (line 94)
>>>>>>
>>>>>>   0x0000000120f053b3: mov    r10d,DWORD PTR [rdx+0xc]       ; implicit exception: deoptimizes
>>>>>>                                                             ; ImmutableOopMap{rdx=Oop rsi=Oop }
>>>>>>
>>>>>> The scope information is still printed in the normal original location. This has been in use with JVMCI 8 for
>>>>>> several months.
>>>>>>
>>>>>> tom

From igor.veresov at oracle.com Thu May 30 19:33:13 2019
From: igor.veresov at oracle.com (Igor Veresov)
Date: Thu, 30 May 2019 12:33:13 -0700
Subject: RFR(L) 8223320: [AOT] jck test api/javax_script/ScriptEngine/PutGet.html fails when test classes are AOTed
Message-ID:

Graal models boxing (a call to valueOf()) as a BoxNode. If scalarized, it is encoded in the debug info as an allocation of a box object. However, for certain ranges of values the box object has to come from caches. The reason is that for these values JLS guarantees the identity of the boxes.
The fix essentially propagates the information on whether the Box is a result of Box.valueOf() or new Box() to the deoptimization machinery that checks if the object is in the range that should be in a cache and gets it from there instead of allocating it.

Mach5: tier1-6, tier2-6 with Graal

Webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.00/

I'd like to push all this into JDK13 first and then follow up with a change to the upstream Graal.

Thanks,
igor
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From vladimir.kozlov at oracle.com Thu May 30 21:15:33 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 30 May 2019 14:15:33 -0700
Subject: RFR(L) 8223320: [AOT] jck test api/javax_script/ScriptEngine/PutGet.html fails when test classes are AOTed
In-Reply-To:
References:
Message-ID: <1eb97f37-19eb-82db-b58b-36f4c87d8921@oracle.com>

CCing to runtime group too.

So you went hard and correct way ;-)

deoptimization.cpp next #ifdef sequence is strange since we support only 64-bit on SPARC:

+#ifdef _LP64
+ jlong res = (jlong)low->get_int();
+#else
+#ifdef SPARC

Also the code here is guarded by INCLUDE_JVMCI but it should be applicable to AOT code too. Right? May be || INCLUDE_AOT?

Please, add more comment. For example add one in aotLoader.cpp for initialize_box_caches() to explain why we need to eager initialize caches for AOT.

Thanks,
Vladimir

On 5/30/19 12:33 PM, Igor Veresov wrote:
> Graal models boxing (a call to valueOf()) as a BoxNode. If scalarized, it is encoded in the debug info as an allocation
> of a box object. However, for certain ranges of values the box object has to come from caches. The reason is that for
> these values JLS guarantees the identity of the boxes.
> The fix essentially propagates the information on whether the Box is a result of Box.valueOf() or new Box() to the
> deoptimization machinery that checks if the object is in the range that should be in a cache and gets it from there
> instead of allocating it.
>
> Mach5: tier1-6, tier2-6 with Graal
>
> Webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.00/
>
> I'd like to push all this into JDK13 first and then follow up with a change to the upstream Graal.
>
> Thanks,
> igor
>
>
>

From igor.veresov at oracle.com Thu May 30 22:29:30 2019
From: igor.veresov at oracle.com (Igor Veresov)
Date: Thu, 30 May 2019 15:29:30 -0700
Subject: RFR(L) 8223320: [AOT] jck test api/javax_script/ScriptEngine/PutGet.html fails when test classes are AOTed
In-Reply-To: <1eb97f37-19eb-82db-b58b-36f4c87d8921@oracle.com>
References: <1eb97f37-19eb-82db-b58b-36f4c87d8921@oracle.com>
Message-ID: <6953E8A8-B856-4D8F-9EEF-ABDB6C8A4904@oracle.com>

> On May 30, 2019, at 2:15 PM, Vladimir Kozlov wrote:
>
> CCing to runtime group too.
>
> So you went hard and correct way ;-)

Well, there isn't much of choice... Must abide the spec.

>
> deoptimization.cpp next #ifdef sequence is strange since we support only 64-bit on SPARC:
>
> +#ifdef _LP64
> + jlong res = (jlong)low->get_int();
> +#else
> +#ifdef SPARC
>

Right. I guess JVMCI and AOT will never support 32 bit, so I'll just remove it.

> Also the code here is guarded by INCLUDE_JVMCI but it should be applicable to AOT code too. Right? May be || INCLUDE_AOT?
>

Good point, will add that.

> Please, add more comment. For example add one in aotLoader.cpp for initialize_box_caches() to explain why we need to eager initialize caches for AOT.
>

Ok.

New webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.01/

igor

> Thanks,
> Vladimir
>
> On 5/30/19 12:33 PM, Igor Veresov wrote:
>> Graal models boxing (a call to valueOf()) as a BoxNode. If scalarized, it is encoded in the debug info as an allocation of a box object. However, for certain ranges of values the box object has to come from caches. The reason is that for these values JLS guarantees the identity of the boxes.
>> The fix essentially propagates the information on whether the Box is a result of Box.valueOf() or new Box() to the deoptimization machinery that checks if the object is in the range that should be in a cache and gets it from there instead of allocating it.
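
Editorial sketch (not part of the original message): the decision the quoted description assigns to the deoptimization machinery, namely "reuse the cached box or allocate a fresh one", could be expressed along these lines. This is a hedged illustration only. BoxKind, BoxOrigin and must_come_from_cache are hypothetical names that do not appear in the webrev, and the ranges are the ones from JLS 5.1.7 plus the default -128..127 cache that the JDK library keeps for Long; the adjustable upper bound for Integer is passed in as a parameter.

```cpp
#include <cassert>
#include <cstdint>

// Primitive kinds whose boxes can be subject to caching (illustrative subset).
enum class BoxKind { Boolean, Byte, Char, Short, Int, Long };

// How the scalarized box was created in the compiled code.
enum class BoxOrigin { ValueOf, New };

// Returns true when rematerializing the box must hand out the shared cached
// object instead of allocating, so that reference identity is preserved.
inline bool must_come_from_cache(BoxOrigin origin, BoxKind kind, int64_t value,
                                 int64_t int_cache_high = 127) {
  assert(int_cache_high >= 127);  // the Integer cache always covers at least 127
  if (origin == BoxOrigin::New) {
    return false;  // new Box(...) always yields a distinct object
  }
  switch (kind) {
    case BoxKind::Boolean:
    case BoxKind::Byte:
      return true;  // every possible value is cached
    case BoxKind::Char:
      return value >= 0 && value <= 127;
    case BoxKind::Short:
    case BoxKind::Long:
      return value >= -128 && value <= 127;
    case BoxKind::Int:
      return value >= -128 && value <= int_cache_high;
  }
  return false;
}
```

Under these assumptions, a scalarized Integer.valueOf(127) has to be fetched from the cache, while Integer.valueOf(128) with the default bound, or any new Integer(...), is allocated as before.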
>> Mach5: tier1-6, tier2-6 with Graal
>> Webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.00/
>> I'd like to push all this into JDK13 first and then follow up with a change to the upstream Graal.
>> Thanks,
>> igor
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From vladimir.kozlov at oracle.com Thu May 30 23:12:14 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 30 May 2019 16:12:14 -0700
Subject: RFR(L) 8223320: [AOT] jck test api/javax_script/ScriptEngine/PutGet.html fails when test classes are AOTed
In-Reply-To: <6953E8A8-B856-4D8F-9EEF-ABDB6C8A4904@oracle.com>
References: <1eb97f37-19eb-82db-b58b-36f4c87d8921@oracle.com> <6953E8A8-B856-4D8F-9EEF-ABDB6C8A4904@oracle.com>
Message-ID:

This looks good to me.

Thanks
Vladimir

> On May 30, 2019, at 3:29 PM, Igor Veresov wrote:
>
>
>> On May 30, 2019, at 2:15 PM, Vladimir Kozlov wrote:
>>
>> CCing to runtime group too.
>>
>> So you went hard and correct way ;-)
>
> Well, there isn't much of choice... Must abide the spec.
>
>>
>> deoptimization.cpp next #ifdef sequence is strange since we support only 64-bit on SPARC:
>>
>> +#ifdef _LP64
>> + jlong res = (jlong)low->get_int();
>> +#else
>> +#ifdef SPARC
>>
>
> Right. I guess JVMCI and AOT will never support 32 bit, so I'll just remove it.
>
>> Also the code here is guarded by INCLUDE_JVMCI but it should be applicable to AOT code too. Right? May be || INCLUDE_AOT?
>>
>
> Good point, will add that.
>
>> Please, add more comment. For example add one in aotLoader.cpp for initialize_box_caches() to explain why we need to eager initialize caches for AOT.
>>
>
> Ok.
>
>
> New webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.01/
>
>
> igor
>
>> Thanks,
>> Vladimir
>>
>>> On 5/30/19 12:33 PM, Igor Veresov wrote:
>>> Graal models boxing (a call to valueOf()) as a BoxNode. If scalarized, it is encoded in the debug info as an allocation of a box object.
However, for certain ranges of values the box object has to come from caches. The reason is that for these values JLS guarantees the identity of the boxes.
>>> The fix essentially propagates the information on whether the Box is a result of Box.valueOf() or new Box() to the deoptimization machinery that checks if the object is in the range that should be in a cache and gets it from there instead of allocating it.
>>> Mach5: tier1-6, tier2-6 with Graal
>>> Webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.00/
>>> I'd like to push all this into JDK13 first and then follow up with a change to the upstream Graal.
>>> Thanks,
>>> igor
>

From OGATAK at jp.ibm.com Fri May 31 04:54:16 2019
From: OGATAK at jp.ibm.com (Kazunori Ogata)
Date: Fri, 31 May 2019 13:54:16 +0900
Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8185969: PPC64: Improve VSR support to use up to 64 registers
In-Reply-To:
References: <1bd63cd1-efbb-e70d-62e5-510d364f712b@redhat.com> <6459888e-be23-e362-3b09-c5cd4afa701f@redhat.com>
Message-ID:

Hi Andrew,

Thank you for your confirmation.

> However, please don't assume an acceptable review just because there is
> no objection.

I see. I thought review was optional in this case since it is almost clean and no change in the actual code. But I should have waited for your reply, as you reviewed the patch. I'll be more careful.

Regards,
Ogata

Andrew John Hughes wrote on 2019/05/31 01:33:06:

> From: Andrew John Hughes
> To: Kazunori Ogata , "hotspot-compiler-
> dev at openjdk.java.net" , "jdk8u-
> dev at openjdk.java.net"
> Date: 2019/05/31 01:33
> Subject: [EXTERNAL] Re: [8u-dev, ppc] RFR for (almost clean) backport of
> 8185969: PPC64: Improve VSR support to use up to 64 registers
>
> On 30/05/2019 08:10, Kazunori Ogata wrote:
> > Hi,
> >
> > Fixed a typo in the bug ID in the title.
> >
> > Since it looks there is no further objection, I added jdk8u-fix-request
> > tag in the original bug report.
> >
> >
> > Regards,
> > Ogata
> >
>
> The updated webrev looks fine now.
>
> However, please don't assume an acceptable review just because there is
> no objection.
>
> Thanks,
> --
> Andrew :)
>
> Senior Free Java Software Engineer
> Red Hat, Inc. (http://www.redhat.com)
>
> PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net)
> Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222
> https://keybase.io/gnu_andrew
>
> [attachment "signature.asc" deleted by Kazunori Ogata/Japan/IBM]

From fujie at loongson.cn Fri May 31 05:03:04 2019
From: fujie at loongson.cn (Jie Fu)
Date: Fri, 31 May 2019 13:03:04 +0800
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To: <8e25f068-609a-de80-a020-fbc0ede4b96a@oracle.com>
References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com> <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn> <259a914e-1c9c-c884-6114-6f855a96afb6@loongson.cn> <1060f01d-dcfa-3a04-284d-1c6a95c791fc@oracle.com> <4c2da2fb-7550-d51b-539a-4656fc67bb00@oracle.com> <38b00331-34f7-b4a8-f033-f1489a154806@loongson.cn> <0669b1e3-5258-7765-aac8-8d3e5c47066c@oracle.com> <8e25f068-609a-de80-a020-fbc0ede4b96a@oracle.com>
Message-ID: <063fa1f0-864c-b049-1ece-534322505bf7@loongson.cn>

Hi Vladimir Ivanov,

Thank you for your review and guidance.
I benefit a lot from the discussion with you.
The patch had been updated based on your suggestions:
- http://cr.openjdk.java.net/~jiefu/8224162/webrev.07/

Also I had changed the parameter type of CounterData::set_count(...) from uint to int.
It is only used here[1], which I think is safe and clearer to do that.
Please review it and give me some advice.

Testing:
  make test TEST="tier1 tier2 tier3" JTREG="JOBS=4" CONF=release on Linux/x64

Thanks a lot.
Best regards,
Jie

[1] http://hg.openjdk.java.net/jdk/jdk/file/b0513c833960/src/hotspot/share/oops/methodData.hpp#l1170

On 2019/5/30 10:59, Vladimir Ivanov wrote:
> Switching return type to int would make it clearer.
>
Done

>
> call->receiver_count(i) is uint, so you still can experience overflow
> when converting from uint to int. Considering receiver counts are
> positive, I'd use saturating add on uints (and don't care about
> min_jint).

Fixed.

> Regarding handling overflow during profiling:
>
>   * C1 doesn't handle counter overflow [1]
>
>   * What template interpreter does to avoid overflow is not enough for
> concurrent case: it stores new value into memory and then
> conditionally decrements it, but another thread may already see it.
> Proper solution would be to keep the value in register, but that
> requires a temporary register to burn.

Nice catch!
I didn't realize the problem before.
Thanks.

> So, it seems easier and cheaper (from the perspective of profiling
> overhead) to handle that on compiler side when interpreting the data.
>
I agree.
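
Editorial sketch (not part of the original message): the "saturating add on uints" suggested above can be illustrated as follows. This is a minimal example under stated assumptions, not the code from the webrev. saturating_add, clamp_to_int and total_receiver_count are hypothetical helper names, and uint32_t/int32_t stand in for HotSpot's uint and jint.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Add two unsigned counters, clamping at the maximum instead of wrapping.
constexpr uint32_t saturating_add(uint32_t a, uint32_t b) {
  return (a > std::numeric_limits<uint32_t>::max() - b)
             ? std::numeric_limits<uint32_t>::max()
             : a + b;
}

// Narrow an unsigned total to a non-negative signed value, clamping at
// INT32_MAX (the analogue of max_jint) so the result never turns negative.
constexpr int32_t clamp_to_int(uint32_t v) {
  return v > static_cast<uint32_t>(std::numeric_limits<int32_t>::max())
             ? std::numeric_limits<int32_t>::max()
             : static_cast<int32_t>(v);
}

// Sum per-receiver counts without overflow, then narrow once at the end.
inline int32_t total_receiver_count(const uint32_t* counts, int n) {
  assert(n >= 0);
  uint32_t total = 0;
  for (int i = 0; i < n; i++) {
    total = saturating_add(total, counts[i]);
  }
  return clamp_to_int(total);
}

static_assert(saturating_add(1u, 2u) == 3u, "small values add normally");
static_assert(saturating_add(0xFFFFFFFFu, 1u) == 0xFFFFFFFFu, "sum saturates");
static_assert(clamp_to_int(0x80000000u) == 0x7FFFFFFF, "narrowing clamps");
```

Because the accumulation stays unsigned and only the final narrowing clamps, a profile with several hot receivers yields a large positive count rather than a wrapped negative one.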
>
> Best regards,
> Vladimir Ivanov
>
> [1]
> http://hg.openjdk.java.net/jdk/jdk/file/c41783eb76eb/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp#l1617

From patric.hedlin at oracle.com Fri May 31 08:55:02 2019
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Fri, 31 May 2019 10:55:02 +0200
Subject: RFR(M): 8223363: Bad node estimate assertion failure
In-Reply-To: <358ae47a-422f-e243-febe-086820ec6bc9@oracle.com>
References: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com> <08d8d09a-44a8-15ff-835c-363820140151@redhat.com> <82222832-8d19-3972-a98b-45f5de55685f@oracle.com> <5409c3e8-ef31-c753-abf7-e9334d18f517@redhat.com> <936169ea-0e8a-540f-29e1-615243e483a5@oracle.com> <412dc2e7-b748-5467-8d5d-6cea5bd81957@redhat.com> <20544c1c-12fc-520b-0cdd-30e32b6cda32@redhat.com> <7a2fb3a4-a68f-d98c-a39c-02cc9cd48e1a@oracle.com> <358ae47a-422f-e243-febe-086820ec6bc9@oracle.com>
Message-ID:

Updated testcase with "@requires !vm.graal.enabled".

Refreshed Webrev: http://cr.openjdk.java.net/~phedlin/tr8223363/

Best regards,
Patric

On 29/05/2019 19:00, Vladimir Kozlov wrote:
>
> Hi Patric
>
> Add to test '@requires !vm.graal.enabled' to avoid running with Graal.
> And add link to test results to the bug report. I assume you need to
> run 1-3 tiers and hs-precheckin-comp.
>
> Thanks,
> Vladimir
>
> On 5/29/19 12:48 AM, Patric Hedlin wrote:
>> Updated webrev: http://cr.openjdk.java.net/~phedlin/tr8223363/
>>
>> Now also including the testcase for/from 8223502.
>>
>> Best regards,
>> Patric
>>
>> On 28/05/2019 18:50, Aleksey Shipilev wrote:
>>> On 5/28/19 6:49 PM, Patric Hedlin wrote:
>>>> On 2019-05-28 15:51, Aleksey Shipilev wrote:
>>>>> On 5/28/19 3:07 PM, Patric Hedlin wrote:
>>>>>> Ooops, sorry. The test-case from 8223502 is here: Webrev:
>>>>>> http://cr.openjdk.java.net/~phedlin/tr8223502/
>>>>> Yes, so what prevents us from including that test in this
>>>>> changeset? Surely it acts like the
>>>>> regression tests for the fix.
>>>> Absolutely nothing.
The error was only in my webrev generation that
>>>> didn't include both bookmarks.
>>>> They will be pushed "as one". (I can generate a new, single,
>>>> complete, webrev in the morning if you
>>>> prefer.)
>>> Yes, please. I prefer full webrevs to understand what exactly is
>>> being pushed.
>>>
>>> -Aleksey
>>>
>>

From nils.eliasson at oracle.com Fri May 31 09:10:52 2019
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Fri, 31 May 2019 11:10:52 +0200
Subject: RFR(XS): 8224538: LoadBarrierNode::common_barrier must check address
In-Reply-To: <6a1f0d8e-45b3-1c22-bdc1-0de56df3fff5@oracle.com>
References: <540347ca-30c1-3199-8acf-225782c886ec@oracle.com> <532513d4-6927-6718-b9e8-c28397069b58@oracle.com> <6a1f0d8e-45b3-1c22-bdc1-0de56df3fff5@oracle.com>
Message-ID:

Thanks for the review Tobias and Vladimir,

// Nils

On 2019-05-29 18:52, Vladimir Kozlov wrote:
> +1
>
> Thanks,
> Vladimir
>
> On 5/29/19 5:38 AM, Tobias Hartmann wrote:
>> Thanks, looks good!
>>
>> Best regards,
>> Tobias
>>
>> On 29.05.19 14:31, Nils Eliasson wrote:
>>> I updated the webrev in place.
>>>
>>> Thanks,
>>>
>>> // Nils
>>>
>>> On 2019-05-29 09:54, Tobias Hartmann wrote:
>>>> Hi Nils,
>>>>
>>>> On 28.05.19 18:24, Nils Eliasson wrote:
>>>>> I'll update the webrev.
>>>> Thanks.
>>>>
>>>>> This fixes the crash in RunThese - and that was with ZGC. The
>>>>> other crashes are from
>>>>> sparkexamples-application.
>>>>>
>>>>> When I push this patch we need to open a new bug for that crash.
>>>> Okay, I've re-added the ZGC label and filed JDK-8224957 for the
>>>> non-ZGC related failures.
>>>>
>>>> Best regards,
>>>> Tobias

From martin.doerr at sap.com Fri May 31 09:43:20 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Fri, 31 May 2019 09:43:20 +0000
Subject: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64
In-Reply-To: <762bb879-63df-ed5e-c419-8c7c5544246c@oracle.com>
References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com> <3e1ceae0-f7a9-e2e6-2b06-59a22540550d@oracle.com> <3d9c0897-0275-c341-fe33-5f0b6c94f253@oracle.com> <42a8fc79-9497-b2eb-8dd9-a56e4ed85255@oracle.com> <956790f2-5230-fa65-ca0a-77aa106ef462@oracle.com> <762bb879-63df-ed5e-c419-8c7c5544246c@oracle.com>
Message-ID:

Hi Vladimir,

thanks for doing this change.

Unfortunately, I had missed that the assertions don't work with -Xcomp:
assert(method->holder()->is_being_initialized() || method->holder()->is_initialized(), ...
LIR_Assembler::clinit_barrier (x86)
MachPrologNode::emit (x86)
Parse::clinit_deopt() (C2 code)

Assertion failures can be reproduced by running jck tests vm/constantpool/Initialization/Initialization010/Initialization01001m051/Initialization/* with -Xcomp.
They fail in C1 and if you use -XX:-TieredCompilation in C2.

And after thinking longer about it, I think that it's not ideal that we check is_being_initialized() several times in C2 (recheck in GraphKit::clinit_barrier for deoptimization). Should we better enforce consistency by only checking clinit_barrier_on_entry()?

Sorry that these issues didn't come into my mind earlier.

Best regards,
Martin

> -----Original Message-----
> From: hotspot-dev On Behalf Of
> Vladimir Ivanov
> Sent: Thursday, May 30, 2019 12:38
> To: hotspot-dev developers
> Subject: Re: [13] RFR (M): 8223213: Implement fast class initialization checks
> on x86-64
>
> Thanks for reviews, Vladimir, Claes, David, Martin, and Coleen!
>
> Best regards,
> Vladimir Ivanov
>
> On 29/05/2019 22:06, coleen.phillimore at oracle.com wrote:
> >
> > Vladimir,
> >
> > This looks good to me.
> >
> > On 5/29/19 12:20 PM, Vladimir Ivanov wrote:
> >> Thanks, Coleen.
> >>
> >> Updated webrev:
> >>   http://cr.openjdk.java.net/~vlivanov/8223213/webrev.03
> >>
> >> Incremental webrev:
> >>   http://cr.openjdk.java.net/~vlivanov/8223213/webrev.03_02/
> >>
> >>> I reviewed mostly the interpreter and shared change. As someone else
> >>> commented, I don't like the addition of a develop flag because some
> >>> platforms don't support class initialization barriers. Isn't it
> >>> normal to do this with the misnamed VM_Version, like adding
> >>> VM_Version::supports_class_initialization_barriers() with x86
> >>> returning true until the other platforms false until they implement
> >>> the feature. Then there isn't another flag configuration to test (or
> >>> not test).
> >>
> >> I like your suggestion. Didn't know VM_Version is used in such a way.
> >>
> >> Replaced UseFastClassInitChecks with
> >> VM_Version::supports_fast_class_init_checks().
> >>
> >>> http://cr.openjdk.java.net/~vlivanov/8223213/webrev.02/src/hotspot/cpu/x86/interp_masm_x86.cpp.udiff.html
> >>>
> >>>
> >>> I have to admit that the relationship between resolved bytecode in
> >>> bytecode_1/bytecode_2 and which _f1/_f2 held the Method* was actually
> >>> a surprise to me. There's nothing structurally in cpCache other than
> >>> reading the code that enforces this and it wasn't always a Method* in
> >>> f2 for invokeinterface, for example, so it's sort of an accident.
> >>>
> >>> But this code is correct and I think as a follow up we should make
> >>> load_invoke_cp_cache_entry() call load_resolved_method_at_index() too
> >>> and have some assert in cpCache that this is true, or rewrite the
> >>> cpCache completely.
> >>
> >> I went ahead and changed load_invoke_cp_cache_entry() to call
> >> load_resolved_method_at_index() (along with some other minor
> >> refactorings). If you have any ideas/suggestions about the assert, I
> >> can add it as well.
> >
> > I think the change to use load_resolved_method_at_index() is good
> > because if someone moves things around, it'll now fail very quickly. I
> > think this is the right amount of refactoring.
> >
> > Thanks,
> > Coleen
> >>
> >> Best regards,
> >> Vladimir Ivanov
> >>
> >>> The code to do the initialization barrier in the interpreter looks good.
> >>>
> >>> Thanks,
> >>> Coleen
> >>>
> >>> On 5/28/19 7:40 AM, Vladimir Ivanov wrote:
> >>>> Thanks, Martin.
> >>>>
> >>>> Updated webrev:
> >>>>   http://cr.openjdk.java.net/~vlivanov/8223213/webrev.02/
> >>>>
> >>>>> Are these assertions safe?
> >>>>> +   assert(method()->needs_clinit_barrier(), "barrier not needed");
> >>>>> +   assert(method()->holder()->is_being_initialized(), "barrier not
> >>>>> needed");
> >>>>> Can it happen that initialization concurrently completes before
> >>>>> they are evaluated?
> >>>>
> >>>> Good point. Even though ciInstanceKlass caches initialization state
> >>>> of the corresponding InstanceKlass, it seems there's a possibility
> >>>> that the state is updated during the compilation (see
> >>>> ciInstanceKlass::update_if_shared). I enhanced the asserts to check
> >>>> that initialization has been started.
> >>>
> >>> Ok, this makes sense.
> >>>>
> >>>>> A small suggestion for x86 TemplateTable::invokeinterface:
> >>>>> It'd be nice to replace load of interface klass by your new
> >>>>> load_method_holder.
> >>>>
> >>>> Agree. Updated.
> >>>
> >>> This is nice.
> >>>
> >>>
> >>>>
> >>>> Best regards,
> >>>> Vladimir Ivanov
> >>>
> >

From patric.hedlin at oracle.com Fri May 31 10:31:08 2019
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Fri, 31 May 2019 12:31:08 +0200
Subject: RFR(T): 8225110: IGV build definition uses non-secure transport
Message-ID: <55100074-5ff7-9fcc-0bb0-42df8075e5cb@oracle.com>

Dear all,

I would like to ask for help to review the following _trivial_ change/update:

Issue: https://bugs.openjdk.java.net/browse/JDK-8225110

Diff below.

Best regards,
Patric

-----8<-----

diff -r dd321e3596c0 -r 590e647d30fa src/utils/IdealGraphVisualizer/nbproject/platform.properties
--- a/src/utils/IdealGraphVisualizer/nbproject/platform.properties Wed May 29 13:58:05 2019 +0100
+++ b/src/utils/IdealGraphVisualizer/nbproject/platform.properties Thu May 16 17:09:11 2019 +0200
@@ -7,7 +7,7 @@
 nbplatform.default.netbeans.dest.dir=${suite.dir}/nbplatform
 nbplatform.default.harness.dir=${nbplatform.default.netbeans.dest.dir}/harness
 bootstrap.url=http://bits.netbeans.org/dev/nbms-and-javadoc/lastSuccessfulBuild/artifact/nbbuild/netbeans/harness/tasks.jar
-autoupdate.catalog.url=http://updates.netbeans.org/netbeans/updates/7.4/uc/final/distribution/catalog.xml.gz
+autoupdate.catalog.url=https://updates.netbeans.org/netbeans/updates/7.4/uc/final/distribution/catalog.xml.gz
 suite.dir=${basedir}
 nbplatform.active=default
 ## Not disabled because of NetBeans bug 206347:

From adinn at redhat.com Fri May 31 10:35:54 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 31 May 2019 11:35:54 +0100
Subject: RFR: 8224975: CSR: Implement JEP 352
Message-ID:

Could I please have reviews for the following CSR which details the changes needed for the JEP 352 implementation task:

CSR JIRA: https://bugs.openjdk.java.net/browse/JDK-8224975

I'm still hoping to target this for JDK13. The OpenJDK Project Lead explained that this CSR needs to be reviewed with at least provisional agreement before that can happen.

Also, the JEP still needs endorsing by at least one relevant Group or Area Lead (I think that probably means Alan, Brian or Vladimir).

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No.
03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From nils.eliasson at oracle.com Fri May 31 11:45:04 2019
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Fri, 31 May 2019 13:45:04 +0200
Subject: RFR(T): 8225110: IGV build definition uses non-secure transport
In-Reply-To: <55100074-5ff7-9fcc-0bb0-42df8075e5cb@oracle.com>
References: <55100074-5ff7-9fcc-0bb0-42df8075e5cb@oracle.com>
Message-ID: <54d20c03-5e26-56ae-9bbc-97426d015acb@oracle.com>

Looks good!

// Nils

On 2019-05-31 12:31, Patric Hedlin wrote:
> Dear all,
>
> I would like to ask for help to review the following _trivial_
> change/update:
>
> Issue: https://bugs.openjdk.java.net/browse/JDK-8225110
>
> Diff below.
>
> Best regards,
> Patric
>
> -----8<-----
>
> diff -r dd321e3596c0 -r 590e647d30fa
> src/utils/IdealGraphVisualizer/nbproject/platform.properties
> --- a/src/utils/IdealGraphVisualizer/nbproject/platform.properties Wed
> May 29 13:58:05 2019 +0100
> +++ b/src/utils/IdealGraphVisualizer/nbproject/platform.properties Thu
> May 16 17:09:11 2019 +0200
> @@ -7,7 +7,7 @@
>  nbplatform.default.netbeans.dest.dir=${suite.dir}/nbplatform
>  nbplatform.default.harness.dir=${nbplatform.default.netbeans.dest.dir}/harness
>
>  bootstrap.url=http://bits.netbeans.org/dev/nbms-and-javadoc/lastSuccessfulBuild/artifact/nbbuild/netbeans/harness/tasks.jar
>
> -autoupdate.catalog.url=http://updates.netbeans.org/netbeans/updates/7.4/uc/final/distribution/catalog.xml.gz
>
> +autoupdate.catalog.url=https://updates.netbeans.org/netbeans/updates/7.4/uc/final/distribution/catalog.xml.gz
>
>  suite.dir=${basedir}
>  nbplatform.active=default
>  ## Not disabled because of NetBeans bug 206347:
>

From patric.hedlin at oracle.com Fri May 31 12:00:59 2019
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Fri, 31 May 2019 14:00:59 +0200
Subject: RFR(T): 8225110: IGV build definition uses non-secure transport
In-Reply-To: <54d20c03-5e26-56ae-9bbc-97426d015acb@oracle.com>
References:
 <55100074-5ff7-9fcc-0bb0-42df8075e5cb@oracle.com> <54d20c03-5e26-56ae-9bbc-97426d015acb@oracle.com>
Message-ID: <349a4eea-2ac4-b285-f6d4-e9cc23eb865b@oracle.com>

Thanks Nils.

Pushed.

/Patric

On 31/05/2019 13:45, Nils Eliasson wrote:
> Looks good!
>
> // Nils
>
> On 2019-05-31 12:31, Patric Hedlin wrote:
>> Dear all,
>>
>> I would like to ask for help to review the following _trivial_
>> change/update:
>>
>> Issue: https://bugs.openjdk.java.net/browse/JDK-8225110
>>
>> Diff below.
>>
>> Best regards,
>> Patric
>>
>> -----8<-----
>>
>> diff -r dd321e3596c0 -r 590e647d30fa
>> src/utils/IdealGraphVisualizer/nbproject/platform.properties
>> --- a/src/utils/IdealGraphVisualizer/nbproject/platform.properties
>> Wed May 29 13:58:05 2019 +0100
>> +++ b/src/utils/IdealGraphVisualizer/nbproject/platform.properties
>> Thu May 16 17:09:11 2019 +0200
>> @@ -7,7 +7,7 @@
>>  nbplatform.default.netbeans.dest.dir=${suite.dir}/nbplatform
>>  nbplatform.default.harness.dir=${nbplatform.default.netbeans.dest.dir}/harness
>>
>>  bootstrap.url=http://bits.netbeans.org/dev/nbms-and-javadoc/lastSuccessfulBuild/artifact/nbbuild/netbeans/harness/tasks.jar
>>
>> -autoupdate.catalog.url=http://updates.netbeans.org/netbeans/updates/7.4/uc/final/distribution/catalog.xml.gz
>>
>> +autoupdate.catalog.url=https://updates.netbeans.org/netbeans/updates/7.4/uc/final/distribution/catalog.xml.gz
>>
>>  suite.dir=${basedir}
>>  nbplatform.active=default
>>  ## Not disabled because of NetBeans bug 206347:
>>

From fujie at loongson.cn Fri May 31 13:09:18 2019
From: fujie at loongson.cn (Jie Fu)
Date: Fri, 31 May 2019 21:09:18 +0800
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To:
References: <3a9a1a08-76eb-df30-2c23-a4cb4d3d52d7@loongson.cn> <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com> <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn> <259a914e-1c9c-c884-6114-6f855a96afb6@loongson.cn>
<1060f01d-dcfa-3a04-284d-1c6a95c791fc@oracle.com> <4c2da2fb-7550-d51b-539a-4656fc67bb00@oracle.com> <38b00331-34f7-b4a8-f033-f1489a154806@loongson.cn> <0669b1e3-5258-7765-aac8-8d3e5c47066c@oracle.com> <8e25f068-609a-de80-a020-fbc0ede4b96a@oracle.com> <063fa1f0-864c-b049-1ece-534322505bf7@loongson.cn> <828dfe7a-bbfd-3782-62e9-ae4ac4490cb7@loongson.cn>
Message-ID: <289bd842-4f56-f5ec-70ff-bf31a1c56f19@loongson.cn>

Hi Vladimir Ivanov,

Thanks for your great idea.
With your help, the overflow problem has been fixed on both 64- & 32-bit platforms.

Please review it: http://cr.openjdk.java.net/~jiefu/8224162/webrev.09/

Thanks a lot.
Best regards,
Jie

On 2019/5/31 ??6:02, Vladimir Ivanov wrote:
> Thanks for checking 32-bit port!
>
> I'd try to fix the overflow in ciMethod::call_profile_at_bci() by
> enforcing the following property:
>   // The call site count is 0 with known morphism (only 1 or 2 receivers)
>   // or < 0 in the case of a type check failure for checkcast,
> aastore, instanceof.
>   // The call site count is > 0 in the case of a polymorphic virtual
> call.
>
> It knows the bci and has access to the actual bytecode.
>
> For invoke* it would turn negative values into max_jint and for
> checkcast/aastore/instanceof turn positive values into min_jint.
>
> Also, I would avoid "#ifndef _LP64" and keep the code on both 64- &
> 32-bit, though on 64-bit it should be effectively unused (as of now,
> but not necessarily in the future).
>
> Best regards,
> Vladimir Ivanov
>
> On 31/05/2019 12:41, Jie Fu wrote:
>> Hi Vladimir Ivanov,
>>
>> The previous version still failed on 32-bit VM.
>>
>> Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.08/
>>
>> There seems no way to detect the invoke-profile counter overflow in
>> CounterData::count() on 32-bit systems.
>> So I did that in Compile::call_generator(...) for 32-bit platforms.
>>
>> What do you think of this version?
>>
>> Thanks a lot.
>> Best regards,
>> Jie
>>
>> On 2019/5/31 ??3:55, Jie Fu wrote:
>>> Hi Vladimir Ivanov,
>>>
>>> I'm wondering whether the patch[1] works for 32-bit JVM.
>>> And now I'm doing experiments on a 32-bit system.
>>> Please just wait for me and I'll let you know the result ASAP.
>>>
>>> Thanks a lot.
>>> Best regards,
>>> Jie
>>>
>>> [1] http://cr.openjdk.java.net/~jiefu/8224162/webrev.07/
>>>
>>>
>>> On 2019/5/31 ??1:03, Jie Fu wrote:
>>>> Hi Vladimir Ivanov,
>>>>
>>>> Thank you for your review and guidance.
>>>> I benefit a lot from the discussion with you.
>>>> The patch had been updated based on your suggestions:
>>>> - http://cr.openjdk.java.net/~jiefu/8224162/webrev.07/
>>>>
>>>> Also I had changed the parameter type of
>>>> CounterData::set_count(...) from uint to int.
>>>> It is only used here[1], which I think is safe and clearer to do that.
>>>> Please review it and give me some advice.
>>>>
>>>> Testing:
>>>>   make test TEST="tier1 tier2 tier3" JTREG="JOBS=4" CONF=release on
>>>> Linux/x64
>>>>
>>>> Thanks a lot.
>>>> Best regards,
>>>> Jie
>>>>
>>>> [1]
>>>> http://hg.openjdk.java.net/jdk/jdk/file/b0513c833960/src/hotspot/share/oops/methodData.hpp#l1170
>>>>
>>>>
>>>>
>>>> On 2019/5/30 ??10:59, Vladimir Ivanov wrote:
>>>>> Switching return type to int would make it clearer.
>>>>>
>>>> Done
>>>>
>>>>
>>>>>
>>>>> call->receiver_count(i) is uint, so you still can experience
>>>>> overflow when converting from uint to int. Considering receiver
>>>>> counts are positive, I'd use saturating add on uints (and don't
>>>>> care about min_jint).
>>>>
>>>> Fixed.
>>>>
>>>>
>>>>> Regarding handling overflow during profiling:
>>>>>
>>>>>   * C1 doesn't handle counter overflow [1]
>>>>>
>>>>>   * What template interpreter does to avoid overflow is not enough
>>>>> for concurrent case: it stores new value into memory and then
>>>>> conditionally decrements it, but another thread may already see
>>>>> it.
Proper solution would be to keep the value in register, but
>>>>> that requires a temporary register to burn.
>>>>
>>>> Nice catch! I didn't realize the problem before. Thanks.
>>>>
>>>>
>>>>> So, it seems easier and cheaper (from the perspective of profiling
>>>>> overhead) to handle that on compiler side when interpreting the data.
>>>>>
>>>> I agree.
>>>>
>>>>
>>>>>
>>>>> Best regards,
>>>>> Vladimir Ivanov
>>>>>
>>>>> [1]
>>>>> http://hg.openjdk.java.net/jdk/jdk/file/c41783eb76eb/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp#l1617
>>>>>
>>

From peter.januschke at sap.com Fri May 31 13:46:33 2019
From: peter.januschke at sap.com (Januschke, Peter)
Date: Fri, 31 May 2019 13:46:33 +0000
Subject: RFR(S): 8222103: [testbug] compiler/compilercontrol/jcmd/ClearDirectivesFileStackTest may exceed VM limit
In-Reply-To: <9F82E0AB-45DB-45F4-B453-691F4A1AB244@oracle.com>
References: <9F82E0AB-45DB-45F4-B453-691F4A1AB244@oracle.com>
Message-ID:

Hi,

sorry for taking so long. Since I am a member of SAP's works council, I have only very limited time for doing development.

So now here is a solution as proposed by Igor:

http://cr.openjdk.java.net/~mdoerr/8222103_testbug/webrev.02/
https://bugs.openjdk.java.net/browse/JDK-8222103

Please review. I also need a sponsor.

Best regards
Peter

From: Igor Ignatyev
Sent: Tuesday, 9 April 2019 18:30
To: Nils Eliasson ; Januschke, Peter
Cc: hotspot-compiler-dev at openjdk.java.net compiler
Subject: Re: RFR(S): 8222103: [testbug] compiler/compilercontrol/jcmd/ClearDirectivesFileStackTest may exceed VM limit

Nils, thanks for the clarification.

Peter, I haven't looked at these tests for quite a while, and, I guess, presence of 'jcmd' in the path got me confused, so I (now obviously) incorrectly assumed that the directives are added by jcmd. In this case, I agree that it is a test bug.
I'd however prefer it to be fixed slightly differently: instead of adding 'CompilerDirectivesLimit' to the command line, I'd read its value using WhiteBox and limit ClearDirectivesFileStackTest::AMOUNT by it. You'll also need to update the year in the copyright notice.

Thanks,
-- Igor

On Apr 9, 2019, at 4:55 AM, Nils Eliasson wrote:

Hi,

The design is that you add a number of directives together to get a desired behavior. Adding just some of them makes no sense.

If you add directives by command line, and get an error (syntax or limit) - it will print the error and stop the startup. This allows the user to correct the problem and retry.

If you add by jcmd, the error is printed on the jcmd console, but the VM continues on unaffected. The user can correct the problem and make another try.

I think there is a separate test of the limit. And if that is already covered, testing it in this test too seems unnecessary.

Regards,

// Nils

On 2019-04-09 11:34, Januschke, Peter wrote:

Hi Igor,

the current implementation prints a message and then stops:

bool DirectivesStack::check_capacity(int request_size, outputStream* st) {
  if ((request_size + _depth) > CompilerDirectivesLimit) {
    st->print_cr("Could not add %i more directives. Currently %i/%i directives.",
                 request_size, _depth, CompilerDirectivesLimit);
    return false;
  }
  return true;
}

Best regards
Peter

From: Igor Ignatyev
Sent: Monday, 8 April 2019 19:25
To: Januschke, Peter
Cc: hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR(S): 8222103: [testbug] compiler/compilercontrol/jcmd/ClearDirectivesFileStackTest may exceed VM limit

Hi Peter,

I don't think it's a test bug, VM shouldn't stop execution if someone requests too many compiler directives, instead it should reject requests which will exceed its capacity.
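
A minimal sketch of the capacity rule in the check_capacity snippet quoted above, and of limiting the number of generated directives by it, as suggested elsewhere in this thread. The class and method names (DirectiveAmount, clampAmount) and the sample values are hypothetical; a real test would presumably read CompilerDirectivesLimit and the current stack depth from the running VM rather than hard-code them.

```java
public class DirectiveAmount {
    // Mirrors the quoted check: a request is rejected when
    // request_size + depth > limit, so at most (limit - depth) directives fit.
    static int clampAmount(int requested, int limit, int currentDepth) {
        int room = Math.max(0, limit - currentDepth);
        return Math.min(requested, room);
    }

    public static void main(String[] args) {
        int limit = 50;      // placeholder for the CompilerDirectivesLimit value
        int depth = 1;       // placeholder for the directives already on the stack
        int requested = 80;  // placeholder for a randomly chosen amount
        System.out.println(clampAmount(requested, limit, depth));
    }
}
```

With these placeholder values the request is cut down to the room left on the stack, so the capacity check would accept it.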
-- Igor

On Apr 8, 2019, at 1:18 AM, Januschke, Peter wrote:

Hi,

I propose the following fix to the test mentioned in the subject:

Problem: The test generates a random number of compiler directives, which might be greater than the value of CompilerDirectivesLimit. This causes the VM to stop execution upon the corresponding capacity check.

Fix: set CompilerDirectivesLimit to the max random number used.

http://cr.openjdk.java.net/~goetz/wr19/peter/8222103-01
https://bugs.openjdk.java.net/browse/JDK-8222103

Please review. I also need a sponsor.

Best regards
Peter

From vladimir.kozlov at oracle.com Fri May 31 14:26:03 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 31 May 2019 07:26:03 -0700
Subject: RFR(M): 8223363: Bad node estimate assertion failure
In-Reply-To:
References: <97125c44-5709-3cc6-7a29-70ee1a0d1f7c@oracle.com> <08d8d09a-44a8-15ff-835c-363820140151@redhat.com> <82222832-8d19-3972-a98b-45f5de55685f@oracle.com> <5409c3e8-ef31-c753-abf7-e9334d18f517@redhat.com> <936169ea-0e8a-540f-29e1-615243e483a5@oracle.com> <412dc2e7-b748-5467-8d5d-6cea5bd81957@redhat.com> <20544c1c-12fc-520b-0cdd-30e32b6cda32@redhat.com> <7a2fb3a4-a68f-d98c-a39c-02cc9cd48e1a@oracle.com> <358ae47a-422f-e243-febe-086820ec6bc9@oracle.com>
Message-ID:

Looks good.

thanks,
Vladimir

On 5/31/19 1:55 AM, Patric Hedlin wrote:
> Updated testcase with "@requires !vm.graal.enabled".
>
> Refreshed Webrev: http://cr.openjdk.java.net/~phedlin/tr8223363/
>
> Best regards,
> Patric
>
> On 29/05/2019 19:00, Vladimir Kozlov wrote:
>>
>> Hi Patric
>>
>> Add to test '@requires !vm.graal.enabled' to avoid running with Graal.
>> And add link to test results to the bug report. I assume you need to run 1-3 tiers and hs-precheckin-comp.
>>
>> Thanks,
>> Vladimir
>>
>> On 5/29/19 12:48 AM, Patric Hedlin wrote:
>>> Updated webrev: http://cr.openjdk.java.net/~phedlin/tr8223363/
>>>
>>> Now also including the testcase for/from 8223502.
>>>
>>> Best regards,
>>> Patric
>>>
>>> On 28/05/2019 18:50, Aleksey Shipilev wrote:
>>>> On 5/28/19 6:49 PM, Patric Hedlin wrote:
>>>>> On 2019-05-28 15:51, Aleksey Shipilev wrote:
>>>>>> On 5/28/19 3:07 PM, Patric Hedlin wrote:
>>>>>>> Ooops, sorry. The test-case from 8223502 is here: Webrev:
>>>>>>> http://cr.openjdk.java.net/~phedlin/tr8223502/
>>>>>> Yes, so what prevents us from including that test in this changeset? Surely it acts like the
>>>>>> regression tests for the fix.
>>>>> Absolutely nothing. The error was only in my webrev generation that didn't include both bookmarks.
>>>>> They will be pushed "as one". (I can generate a new, single, complete, webrev in the morning if you
>>>>> prefer.)
>>>> Yes, please. I prefer full webrevs to understand what exactly is being pushed.
>>>>
>>>> -Aleksey
>>>>
>>>
>

From stuart.monteith at linaro.org Fri May 31 14:31:26 2019
From: stuart.monteith at linaro.org (Stuart Monteith)
Date: Fri, 31 May 2019 15:31:26 +0100
Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC
In-Reply-To:
References:
Message-ID:

Incidentally, my plan is to incorporate the changes necessary for
aarch64 for "Late GC barrier insertion for ZGC" into my patch for
implementing ZGC on aarch64, and hopefully get that merged for JDK13.

BR,
Stuart

On Wed, 29 May 2019 at 17:06, Stuart Monteith wrote:
>
> Hello Nils,
> I've tested with JTReg and JCStress, with and without
> -XX:+UseBarriersWithVolatile. I found one error, which was a typo on
> my part in z_compareAndExchangeP. I've fixed it and JCStress is now
> clean.
> Your changes are good as far as my testing is concerned. I'm
> continuing to look at the code that is generated.
>
>
> diff -r d48227fa72cf src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad
> --- a/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Wed May 29 14:53:47 2019 +0100
> +++ b/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Wed May 29 16:39:59 2019 +0100
> @@ -56,6 +56,8 @@
>  //
>  instruct loadBarrierSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{
>    match(Set dst (LoadBarrierSlowReg mem));
> +  predicate(!n->as_LoadBarrierSlowReg()->is_weak());
> +
>    effect(DEF dst, KILL cr);
>
>    format %{"LoadBarrierSlowReg $dst, $mem" %}
> @@ -70,7 +72,8 @@
>  // Execute ZGC load barrier (weak) slow path
>  //
>  instruct loadBarrierWeakSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{
> -  match(Set dst (LoadBarrierWeakSlowReg mem));
> +  match(Set dst (LoadBarrierSlowReg mem));
> +  predicate(n->as_LoadBarrierSlowReg()->is_weak());
>
>    effect(DEF dst, KILL cr);
>
> @@ -81,3 +84,60 @@
>    %}
>    ins_pipe(pipe_slow);
>  %}
> +
> +
> +// Specialized versions of compareAndExchangeP that adds a keepalive
> that is consumed
> +// but doesn't affect output.
> +
> +instruct z_compareAndExchangeP(iRegPNoSp res, indirect mem,
> +                               iRegP oldval, iRegP newval, iRegP keepalive,
> +                               rFlagsReg cr) %{
> +  match(Set res (ZCompareAndExchangeP (Binary mem keepalive) (Binary
> oldval newval)));
> +  ins_cost(2 * VOLATILE_REF_COST);
> +  effect(TEMP_DEF res, KILL cr);
> +  format %{
> +    "cmpxchg $res = $mem, $oldval, $newval\t# (ptr, weak) if $mem ==
> $oldval then $mem <-- $newval"
> +  %}
> +  ins_encode %{
> +    __ cmpxchg($mem$$Register, $oldval$$Register, $newval$$Register,
> +               Assembler::xword, /*acquire*/ false, /*release*/ true,
> +               /*weak*/ false, $res$$Register);
> +  %}
> +  ins_pipe(pipe_slow);
> +%}
> +
> +instruct z_compareAndSwapP(iRegINoSp res,
> +                           indirect mem,
> +                           iRegP oldval, iRegP newval, iRegP keepalive,
> +                           rFlagsReg cr) %{
> +
> +  match(Set res (ZCompareAndSwapP (Binary mem keepalive) (Binary
> oldval newval)));
> +  match(Set res (ZWeakCompareAndSwapP (Binary mem keepalive) (Binary
> oldval newval)));
> +
> +  ins_cost(2 * VOLATILE_REF_COST);
> +
> +  effect(KILL cr);
> +
> +  format %{
> +    "cmpxchg $mem, $oldval, $newval\t# (ptr) if $mem == $oldval then
> $mem <-- $newval"
> +    "cset $res, EQ\t# $res <-- (EQ ? 1 : 0)"
> +  %}
> +
> +  ins_encode(aarch64_enc_cmpxchg(mem, oldval, newval),
> +             aarch64_enc_cset_eq(res));
> +
> +  ins_pipe(pipe_slow);
> +%}
> +
> +
> +instruct z_get_and_setP(indirect mem, iRegP newv, iRegPNoSp prev,
> +                        iRegP keepalive) %{
> +  match(Set prev (ZGetAndSetP mem (Binary newv keepalive)));
> +
> +  ins_cost(2 * VOLATILE_REF_COST);
> +  format %{ "atomic_xchg $prev, $newv, [$mem]" %}
> +  ins_encode %{
> +    __ atomic_xchg($prev$$Register, $newv$$Register, as_Register($mem$$base));
> +  %}
> +  ins_pipe(pipe_serial);
> +%}
>
> On Fri, 24 May 2019 at 15:37, Stuart Monteith
> wrote:
> >
> > That's interesting, and seems beneficial for ZGC on aarch64, where
> > before your patch the ZGC load barriers broke assumptions the
> > memory-fence optimisation code was making.
> >
> > I'm currently testing your patch, with the following put on top for aarch64:
> >
> > diff -r ead187ebe684 src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad
> > --- a/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Fri May 24 13:11:48 2019 +0100
> > +++ b/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Fri May 24 15:34:17 2019 +0100
> > @@ -56,6 +56,8 @@
> >  //
> >  instruct loadBarrierSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{
> >    match(Set dst (LoadBarrierSlowReg mem));
> > +  predicate(!n->as_LoadBarrierSlowReg()->is_weak());
> > +
> >    effect(DEF dst, KILL cr);
> >
> >    format %{"LoadBarrierSlowReg $dst, $mem" %}
> > @@ -70,7 +72,8 @@
> >  // Execute ZGC load barrier (weak) slow path
> >  //
> >  instruct loadBarrierWeakSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{
> > -  match(Set dst (LoadBarrierWeakSlowReg mem));
> > +  match(Set dst (LoadBarrierSlowReg mem));
> > +  predicate(n->as_LoadBarrierSlowReg()->is_weak());
> >
> >    effect(DEF dst, KILL cr);
> >
> > @@ -81,3 +84,60 @@
> >    %}
> >    ins_pipe(pipe_slow);
> >  %}
> > +
> > +
> > +// Specialized versions of compareAndExchangeP that adds a keepalive
> > that is consumed
> > +// but doesn't affect output.
> > +
> > +instruct z_compareAndExchangeP(iRegPNoSp res, indirect mem,
> > +                               iRegP oldval, iRegP newval, iRegP keepalive,
> > +                               rFlagsReg cr) %{
> > +  match(Set oldval (ZCompareAndExchangeP (Binary mem keepalive)
> > (Binary oldval newval)));
> > +  ins_cost(2 * VOLATILE_REF_COST);
> > +  effect(TEMP_DEF res, KILL cr);
> > +  format %{
> > +    "cmpxchg $res = $mem, $oldval, $newval\t# (ptr, weak) if $mem ==
> > $oldval then $mem <-- $newval"
> > +  %}
> > +  ins_encode %{
> > +    __ cmpxchg($mem$$Register, $oldval$$Register, $newval$$Register,
> > +               Assembler::xword, /*acquire*/ false, /*release*/ true,
> > +               /*weak*/ false, $res$$Register);
> > +  %}
> > +  ins_pipe(pipe_slow);
> > +%}
> > +
> > +instruct z_compareAndSwapP(iRegINoSp res,
> > +                           indirect mem,
> > +                           iRegP oldval, iRegP newval, iRegP keepalive,
> > +                           rFlagsReg cr) %{
> > +
> > +  match(Set res (ZCompareAndSwapP (Binary mem keepalive) (Binary
> > oldval newval)));
> > +  match(Set res (ZWeakCompareAndSwapP (Binary mem keepalive) (Binary
> > oldval newval)));
> > +
> > +  ins_cost(2 * VOLATILE_REF_COST);
> > +
> > +  effect(KILL cr);
> > +
> > +  format %{
> > +    "cmpxchg $mem, $oldval, $newval\t# (ptr) if $mem == $oldval then
> > $mem <-- $newval"
> > +    "cset $res, EQ\t# $res <-- (EQ ? 1 : 0)"
> > +  %}
> > +
> > +  ins_encode(aarch64_enc_cmpxchg(mem, oldval, newval),
> > +             aarch64_enc_cset_eq(res));
> > +
> > +  ins_pipe(pipe_slow);
> > +%}
> > +
> > +
> > +instruct z_get_and_setP(indirect mem, iRegP newv, iRegPNoSp prev,
> > +                        iRegP keepalive) %{
> > +  match(Set prev (ZGetAndSetP mem (Binary newv keepalive)));
> > +
> > +  ins_cost(2 * VOLATILE_REF_COST);
> > +  format %{ "atomic_xchg $prev, $newv, [$mem]" %}
> > +  ins_encode %{
> > +    __ atomic_xchg($prev$$Register, $newv$$Register, as_Register($mem$$base));
> > +  %}
> > +  ins_pipe(pipe_serial);
> > +%}
> > \ No newline at end of file
> >
> > On Thu, 23 May 2019 at 15:38, Nils Eliasson wrote:
> > >
> > > Hi,
> > >
> > > In ZGC we use load barriers on references.
In the original > > > implementation these where added as macro nodes at parse time. The load > > > barrier node consumes and produces control flow in order to be able to > > > be lowered into a check with a slow path late. The load barrier nodes > > > are fixed in the control flow, and extensions to different optimizations > > > are need the barriers out of loop and past other unrelated control flow. > > > > > > With this patch the barriers are instead added after the loop > > > optimizations, before macro node expansion. This makes the entire > > > pipeline until that point oblivious about the barriers. A dump of the IR > > > with ZGC or EpsilonGC will be basically identical at that point, and the > > > diff compared to serialGC or ParallelGC that use write barriers is > > > really small. > > > > > > Benefits > > > > > > - A major complexity reduction. One can reason about and implement loop > > > optimization without caring about the barriers. The escape analysis > > > doesn't need to know about the barriers. Loads float freely like they > > > are supposed to. > > > > > > - Less nodes early. The inlining will become more deterministic. A > > > barrier heavy GC will not run into node limits earlier. Also node limit > > > bounded optimization like unrolling and peeling will not be penalized by > > > barriers. > > > > > > - Better test coverage, or reduce testing cost when the same > > > optimization doesn't need to be verified with every GC. > > > > > > - Better control on where barriers end up. It is trivial to guarantee > > > that the load and barriers are not separated by a safepoint. > > > > > > Design > > > > > > The implementation uses an extra phase that piggy back on PhaseIdealLoop > > > which provides control and dominator information for all loads. This > > > extra phase is needed because we need to splice the control flow when > > > adding the load barriers. > > > > > > Barriers are inserted on the loads nodes in post order (any successor > > > first). 
This is to guarantee the dominator information above every > > > insertion is correct. This is also important within blocks. Two loads in > > > the same block can float in relation to each other. The addition of > > > barriers serializes their order. Any def-use relationship is upheld by > > > expanding them post order. > > > > > > Barrier insertion is done in stages. In this first stage a single macro > > > node that represents the barrier is added with all dependencies that is > > > required. In the macro expansion phase the barrier nodes is expanded > > > into the final shape, adding nodes that represent the conditional load > > > barrier check. (Write barriers in other GCs could possibly be expanded > > > here directly) > > > > > > All the barriers that are needed for unsafe reference operations (cas, > > > swap, cmpx) are also expanded late. They already have control flow, so > > > the expansion is straight forward. > > > > > > The barriers for the unsafe reference operations (cas, getandset, cmpx) > > > have also been simplified. The cas-load-cas dance have been replaced by > > > a pre-load. The pre-load is a load with a barrier, that is kept alive by > > > an extra (required) edge on the unsafe-primitive-nodes (specialized as > > > ZCompareAndSwap, ZGetAndSet, ZCompareAndExchange). > > > > > > One challenge that was encountered early and that have caused > > > considerable work is that nodes (like loads) can end up between calls > > > and their catch projections. This is usually handled after matching, in > > > PhaseCFG::call_catch_cleanup, where the nodes after the call are cloned > > > to all catch blocks. At this stage they are in an ordered list, so that > > > is a straight forward process. For late barrier insertion we need to > > > splice in control earlier, before matching, and control flow between > > > calls and catches is not allowed. 
This requires us to add a > > > transformation pass where all loads and their dependent instructions are > > > cloned out to the catch blocks before we can start splicing in control > > > flow. This transformation doesn't replace the legacy call_catch_cleanup > > > fully, but it could be a future goal. > > > > > > In the original barrier implementation there where two different load > > > barrier implementations: the basic and the optimized. With the new > > > approach to barriers on unsafe, the basic is no longer required and has > > > been removed. (It provided options for skipping the self healing, and > > > passed the ref in a register, guaranteeing that the oop wasn't reloaded.) > > > > > > The wart that was fixup_partial_loads in zHeap has also been made > > > redundant. > > > > > > Dominating barriers are no longer removed on weak loads. Weak barriers > > > doesn't guarantee self-healing. > > > > > > Follow up work: > > > > > > - Consolidate all uses of GrowableArray::insert_sorted to use the new > > > version > > > > > > - Refactor the phases. There are a lot of simplifications and > > > verification that can be done with more well defined phases. > > > > > > - Simplify the remaining barrier optimizations. There might still be > > > code paths that are no longer needed. > > > > > > > > > Testing: > > > > > > Hotspot tier 1-6, CTW, jcstress, micros, runthese, kitchensink, and then > > > some. All with -XX:+ZVerifyViews. > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8224675 > > > > > > Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/ > > > > > > > > > Please review, > > > > > > Regards, > > > > > > Nils > > > From nils.eliasson at oracle.com Fri May 31 14:40:24 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 31 May 2019 16:40:24 +0200 Subject: RFR(XL): 8224675: Late GC barrier insertion for ZGC In-Reply-To: References: Message-ID: <2d9b78d8-7ce1-50a7-6b21-f8f3497a73e0@oracle.com> Great! 
Nice to see that the patch is working well for you. The latest webrev is webrev.03. It has all the fixes mentioned in this thread, and is based upon jdk/jdk today after my fix for "8224538: LoadBarrierNode::common_barrier must check address" Best regards, Nils On 2019-05-31 16:31, Stuart Monteith wrote: > Incidentally, my plan is to incorporate the changes necessary for > aarch64 for "Late GC barrier insertion for ZGC" into my patch for > implementing ZGC on aarch64, and hopefully get that merged for JDK13. > > BR, > Stuart > > On Wed, 29 May 2019 at 17:06, Stuart Monteith > wrote: >> Hello Nils, >> I've tested with JTReg and JCStress, with and without >> -XX:+UseBarriersWithVolatile. I found one error, which was a typo on >> my part in z_compareAndExchangeP. I've fixed it and JCStress is now >> clean. >> Your changes are good as far as my testing is concerned. I'm >> continuing to look at the code that is generated. >> >> >> diff -r d48227fa72cf src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad >> --- a/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Wed May 29 14:53:47 2019 +0100 >> +++ b/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Wed May 29 16:39:59 2019 +0100 >> @@ -56,6 +56,8 @@ >> // >> instruct loadBarrierSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{ >> match(Set dst (LoadBarrierSlowReg mem)); >> + predicate(!n->as_LoadBarrierSlowReg()->is_weak()); >> + >> effect(DEF dst, KILL cr); >> >> format %{"LoadBarrierSlowReg $dst, $mem" %} >> @@ -70,7 +72,8 @@ >> // Execute ZGC load barrier (weak) slow path >> // >> instruct loadBarrierWeakSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{ >> - match(Set dst (LoadBarrierWeakSlowReg mem)); >> + match(Set dst (LoadBarrierSlowReg mem)); >> + predicate(n->as_LoadBarrierSlowReg()->is_weak()); >> >> effect(DEF dst, KILL cr); >> >> @@ -81,3 +84,60 @@ >> %} >> ins_pipe(pipe_slow); >> %} >> + >> + >> +// Specialized versions of compareAndExchangeP that adds a keepalive >> that is consumed >> +// but doesn't affect output. 
>> + >> +instruct z_compareAndExchangeP(iRegPNoSp res, indirect mem, >> + iRegP oldval, iRegP newval, iRegP keepalive, >> + rFlagsReg cr) %{ >> + match(Set res (ZCompareAndExchangeP (Binary mem keepalive) (Binary >> oldval newval))); >> + ins_cost(2 * VOLATILE_REF_COST); >> + effect(TEMP_DEF res, KILL cr); >> + format %{ >> + "cmpxchg $res = $mem, $oldval, $newval\t# (ptr, weak) if $mem == >> $oldval then $mem <-- $newval" >> + %} >> + ins_encode %{ >> + __ cmpxchg($mem$$Register, $oldval$$Register, $newval$$Register, >> + Assembler::xword, /*acquire*/ false, /*release*/ true, >> + /*weak*/ false, $res$$Register); >> + %} >> + ins_pipe(pipe_slow); >> +%} >> + >> +instruct z_compareAndSwapP(iRegINoSp res, >> + indirect mem, >> + iRegP oldval, iRegP newval, iRegP keepalive, >> + rFlagsReg cr) %{ >> + >> + match(Set res (ZCompareAndSwapP (Binary mem keepalive) (Binary >> oldval newval))); >> + match(Set res (ZWeakCompareAndSwapP (Binary mem keepalive) (Binary >> oldval newval))); >> + >> + ins_cost(2 * VOLATILE_REF_COST); >> + >> + effect(KILL cr); >> + >> + format %{ >> + "cmpxchg $mem, $oldval, $newval\t# (ptr) if $mem == $oldval then >> $mem <-- $newval" >> + "cset $res, EQ\t# $res <-- (EQ ? 
1 : 0)" >> + %} >> + >> + ins_encode(aarch64_enc_cmpxchg(mem, oldval, newval), >> + aarch64_enc_cset_eq(res)); >> + >> + ins_pipe(pipe_slow); >> +%} >> + >> + >> +instruct z_get_and_setP(indirect mem, iRegP newv, iRegPNoSp prev, >> + iRegP keepalive) %{ >> + match(Set prev (ZGetAndSetP mem (Binary newv keepalive))); >> + >> + ins_cost(2 * VOLATILE_REF_COST); >> + format %{ "atomic_xchg $prev, $newv, [$mem]" %} >> + ins_encode %{ >> + __ atomic_xchg($prev$$Register, $newv$$Register, as_Register($mem$$base)); >> + %} >> + ins_pipe(pipe_serial); >> +%} >> >> On Fri, 24 May 2019 at 15:37, Stuart Monteith >> wrote: >>> That's interesting, and seems beneficial for ZGC on aarch64, where >>> before your patch the ZGC load barriers broke assumptions the >>> memory-fence optimisation code was making. >>> >>> I'm currently testing your patch, with the following put on top for aarch64: >>> >>> diff -r ead187ebe684 src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad >>> --- a/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Fri May 24 13:11:48 2019 +0100 >>> +++ b/src/hotspot/cpu/aarch64/gc/z/z_aarch64.ad Fri May 24 15:34:17 2019 +0100 >>> @@ -56,6 +56,8 @@ >>> // >>> instruct loadBarrierSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{ >>> match(Set dst (LoadBarrierSlowReg mem)); >>> + predicate(!n->as_LoadBarrierSlowReg()->is_weak()); >>> + >>> effect(DEF dst, KILL cr); >>> >>> format %{"LoadBarrierSlowReg $dst, $mem" %} >>> @@ -70,7 +72,8 @@ >>> // Execute ZGC load barrier (weak) slow path >>> // >>> instruct loadBarrierWeakSlowReg(iRegP dst, memory mem, rFlagsReg cr) %{ >>> - match(Set dst (LoadBarrierWeakSlowReg mem)); >>> + match(Set dst (LoadBarrierSlowReg mem)); >>> + predicate(n->as_LoadBarrierSlowReg()->is_weak()); >>> >>> effect(DEF dst, KILL cr); >>> >>> @@ -81,3 +84,60 @@ >>> %} >>> ins_pipe(pipe_slow); >>> %} >>> + >>> + >>> +// Specialized versions of compareAndExchangeP that adds a keepalive >>> that is consumed >>> +// but doesn't affect output. 
>>> + >>> +instruct z_compareAndExchangeP(iRegPNoSp res, indirect mem, >>> + iRegP oldval, iRegP newval, iRegP keepalive, >>> + rFlagsReg cr) %{ >>> + match(Set oldval (ZCompareAndExchangeP (Binary mem keepalive) >>> (Binary oldval newval))); >>> + ins_cost(2 * VOLATILE_REF_COST); >>> + effect(TEMP_DEF res, KILL cr); >>> + format %{ >>> + "cmpxchg $res = $mem, $oldval, $newval\t# (ptr, weak) if $mem == >>> $oldval then $mem <-- $newval" >>> + %} >>> + ins_encode %{ >>> + __ cmpxchg($mem$$Register, $oldval$$Register, $newval$$Register, >>> + Assembler::xword, /*acquire*/ false, /*release*/ true, >>> + /*weak*/ false, $res$$Register); >>> + %} >>> + ins_pipe(pipe_slow); >>> +%} >>> + >>> +instruct z_compareAndSwapP(iRegINoSp res, >>> + indirect mem, >>> + iRegP oldval, iRegP newval, iRegP keepalive, >>> + rFlagsReg cr) %{ >>> + >>> + match(Set res (ZCompareAndSwapP (Binary mem keepalive) (Binary >>> oldval newval))); >>> + match(Set res (ZWeakCompareAndSwapP (Binary mem keepalive) (Binary >>> oldval newval))); >>> + >>> + ins_cost(2 * VOLATILE_REF_COST); >>> + >>> + effect(KILL cr); >>> + >>> + format %{ >>> + "cmpxchg $mem, $oldval, $newval\t# (ptr) if $mem == $oldval then >>> $mem <-- $newval" >>> + "cset $res, EQ\t# $res <-- (EQ ? 1 : 0)" >>> + %} >>> + >>> + ins_encode(aarch64_enc_cmpxchg(mem, oldval, newval), >>> + aarch64_enc_cset_eq(res)); >>> + >>> + ins_pipe(pipe_slow); >>> +%} >>> + >>> + >>> +instruct z_get_and_setP(indirect mem, iRegP newv, iRegPNoSp prev, >>> + iRegP keepalive) %{ >>> + match(Set prev (ZGetAndSetP mem (Binary newv keepalive))); >>> + >>> + ins_cost(2 * VOLATILE_REF_COST); >>> + format %{ "atomic_xchg $prev, $newv, [$mem]" %} >>> + ins_encode %{ >>> + __ atomic_xchg($prev$$Register, $newv$$Register, as_Register($mem$$base)); >>> + %} >>> + ins_pipe(pipe_serial); >>> +%} >>> \ No newline at end of file >>> >>> On Thu, 23 May 2019 at 15:38, Nils Eliasson wrote: >>>> Hi, >>>> >>>> In ZGC we use load barriers on references. 
In the original >>>> implementation these where added as macro nodes at parse time. The load >>>> barrier node consumes and produces control flow in order to be able to >>>> be lowered into a check with a slow path late. The load barrier nodes >>>> are fixed in the control flow, and extensions to different optimizations >>>> are need the barriers out of loop and past other unrelated control flow. >>>> >>>> With this patch the barriers are instead added after the loop >>>> optimizations, before macro node expansion. This makes the entire >>>> pipeline until that point oblivious about the barriers. A dump of the IR >>>> with ZGC or EpsilonGC will be basically identical at that point, and the >>>> diff compared to serialGC or ParallelGC that use write barriers is >>>> really small. >>>> >>>> Benefits >>>> >>>> - A major complexity reduction. One can reason about and implement loop >>>> optimization without caring about the barriers. The escape analysis >>>> doesn't need to know about the barriers. Loads float freely like they >>>> are supposed to. >>>> >>>> - Less nodes early. The inlining will become more deterministic. A >>>> barrier heavy GC will not run into node limits earlier. Also node limit >>>> bounded optimization like unrolling and peeling will not be penalized by >>>> barriers. >>>> >>>> - Better test coverage, or reduce testing cost when the same >>>> optimization doesn't need to be verified with every GC. >>>> >>>> - Better control on where barriers end up. It is trivial to guarantee >>>> that the load and barriers are not separated by a safepoint. >>>> >>>> Design >>>> >>>> The implementation uses an extra phase that piggy back on PhaseIdealLoop >>>> which provides control and dominator information for all loads. This >>>> extra phase is needed because we need to splice the control flow when >>>> adding the load barriers. >>>> >>>> Barriers are inserted on the loads nodes in post order (any successor >>>> first). 
>>>> This is to guarantee that the dominator information above every insertion is correct. This is also important within blocks. Two loads in the same block can float in relation to each other. The addition of barriers serializes their order. Any def-use relationship is upheld by expanding them in post order.
>>>>
>>>> Barrier insertion is done in stages. In this first stage a single macro node that represents the barrier is added with all the dependencies that are required. In the macro expansion phase the barrier nodes are expanded into their final shape, adding nodes that represent the conditional load barrier check. (Write barriers in other GCs could possibly be expanded here directly.)
>>>>
>>>> All the barriers that are needed for unsafe reference operations (cas, swap, cmpx) are also expanded late. They already have control flow, so the expansion is straightforward.
>>>>
>>>> The barriers for the unsafe reference operations (cas, getandset, cmpx) have also been simplified. The cas-load-cas dance has been replaced by a pre-load. The pre-load is a load with a barrier that is kept alive by an extra (required) edge on the unsafe-primitive-nodes (specialized as ZCompareAndSwap, ZGetAndSet, ZCompareAndExchange).
>>>>
>>>> One challenge that was encountered early and that has caused considerable work is that nodes (like loads) can end up between calls and their catch projections. This is usually handled after matching, in PhaseCFG::call_catch_cleanup, where the nodes after the call are cloned to all catch blocks. At this stage they are in an ordered list, so that is a straightforward process. For late barrier insertion we need to splice in control earlier, before matching, and control flow between calls and catches is not allowed.
>>>> This requires us to add a transformation pass where all loads and their dependent instructions are cloned out to the catch blocks before we can start splicing in control flow. This transformation doesn't replace the legacy call_catch_cleanup fully, but it could be a future goal.
>>>>
>>>> In the original barrier implementation there were two different load barrier implementations: the basic and the optimized. With the new approach to barriers on unsafe, the basic is no longer required and has been removed. (It provided options for skipping the self-healing, and passed the ref in a register, guaranteeing that the oop wasn't reloaded.)
>>>>
>>>> The wart that was fixup_partial_loads in zHeap has also been made redundant.
>>>>
>>>> Dominating barriers are no longer removed on weak loads. Weak barriers don't guarantee self-healing.
>>>>
>>>> Follow up work:
>>>>
>>>> - Consolidate all uses of GrowableArray::insert_sorted to use the new version
>>>>
>>>> - Refactor the phases. There are a lot of simplifications and verification that can be done with more well-defined phases.
>>>>
>>>> - Simplify the remaining barrier optimizations. There might still be code paths that are no longer needed.
>>>>
>>>>
>>>> Testing:
>>>>
>>>> Hotspot tier 1-6, CTW, jcstress, micros, runthese, kitchensink, and then some. All with -XX:+ZVerifyViews.
>>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8224675 >>>> >>>> Webrev: http://cr.openjdk.java.net/~neliasso/8224675/webrev.01/ >>>> >>>> >>>> Please review, >>>> >>>> Regards, >>>> >>>> Nils >>>> From martin.doerr at sap.com Fri May 31 15:54:42 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 31 May 2019 15:54:42 +0000 Subject: RFR(M): 8224827: Implement fast class initialization checks on s390 Message-ID: Hi, please review the s390 implementation of JDK-8223213: http://cr.openjdk.java.net/~mdoerr/8224827_s390_fast_clinit/webrev.00/ x86 changeset: http://hg.openjdk.java.net/jdk/jdk/rev/9ad765641e8f Best regards, Martin -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin.doerr at sap.com Fri May 31 15:54:41 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 31 May 2019 15:54:41 +0000 Subject: RFR(M): 8224826: Implement fast class initialization checks on PPC64 Message-ID: Hi, please review the PPC64 implementation of JDK-8223213: http://cr.openjdk.java.net/~mdoerr/8224826_ppc64_fast_clinit/webrev.00/ x86 changeset: http://hg.openjdk.java.net/jdk/jdk/rev/9ad765641e8f Best regards, Martin -------------- next part -------------- An HTML attachment was scrubbed... URL: From adam.farley at uk.ibm.com Fri May 31 16:07:24 2019 From: adam.farley at uk.ibm.com (Adam Farley8) Date: Fri, 31 May 2019 17:07:24 +0100 Subject: RFR: JDK-8224963: Char-Byte Performance Enhancement In-Reply-To: References: <68b970be-e551-577d-0ca1-28c16880a2ff@oracle.com> Message-ID: Hi Vladimir, Here's a minimised version of the benchmark, which converts chars to bytes using nio. I found that the conversion rates are similar between Hotspot and OpenJ9 for encoding single-character buffers, and that the difference becomes palpable as you increase the size of the buffer. 4096-char buffers, for example, show the 6x difference I mentioned earlier. 
This makes sense to me, as we're spending less time messing around with objects at the test level, and more time actually utilising the encoding code. You should just be able to run the benchmark on the command line. "java ASCIIEncodingBenchmark " Benchmark code: http://cr.openjdk.java.net/~afarley/8224963/ASCIIEncodingBenchmark.java If you need a microbenchmark for a specific framework, name it and I'll get it done. Or one of my team will get it done. Off for a week. :) Best Regards Adam Farley IBM Runtimes Vladimir Ivanov wrote on 29/05/2019 17:19:36: > From: Vladimir Ivanov > To: Adam Farley8 > Cc: hotspot-compiler-dev at openjdk.java.net > Date: 29/05/2019 17:23 > Subject: Re: RFR: JDK-8224963: Char-Byte Performance Enhancement > > Adam, > > Among all options, I'm in favor of enhancing C2 to produce better code. > Then on my preference list goes rewriting JDK code to make it amenable > to missing optimizations (the patch you propose). And, as a last resort, > I'd consider introducing new intrinsics. > > The microbenchmarks would help understand what pieces as missing in C2 > and decide how to proceed. > > I haven't had HotSpot vs J9 comparison in mind, but in absence of > benchmarks available comparing generated code (by C2) between original > and updated JDK version would help understand what goes wrong. > > Best regards, > Vladimir Ivanov > > On 29/05/2019 17:53, Adam Farley8 wrote: > > Hi Vladimir, > > > > I have a locally-written performance test I used to get the "6x". > > Will chase up with the guy who wrote it to see if I can share it. > > If not, I'll write a new one. > > > > As for the enhancements, two options are: > > > > - matching on the new method names, and replacing the inner logic > > with some souped-up version of said logic. 
> >
> > - alter the code to match on one of the C2 idioms, though I imagine if it were that simple, OpenJDK would come with a list of said idioms so everything people write can be easily accelerated by the JIT.
> >
> > As for how OpenJ9 does it specifically, I don't know, and I suspect it's safer if I don't find out, contamination-wise.
> >
> > Does any of that help?
> >
> > Best Regards
> >
> > Adam Farley
> > IBM Runtimes
> >
> >
> > Vladimir Ivanov wrote on 29/05/2019 13:22:27:
> >
> >> From: Vladimir Ivanov
> >> To: Adam Farley8 , hotspot-compiler-dev at openjdk.java.net
> >> Date: 29/05/2019 13:22
> >> Subject: Re: RFR: JDK-8224963: Char-Byte Performance Enhancement
> >>
> >> Hi Adam,
> >>
> >> The bug mentions ~6x improvement in throughput. Are there any microbenchmarks you can share which demonstrate that? That would greatly simplify the analysis of changes you propose.
> >>
> >> Also, if you can elaborate on what optimization opportunities C2 misses in original code, please, do.
> >>
> >> Best regards,
> >> Vladimir Ivanov
> >>
> >> On 29/05/2019 12:45, Adam Farley8 wrote:
> >> > Hi All,
> >> >
> >> > Could someone familiar with the Hotspot JIT please review and opine on the below?
> >> >
> >> > The Char-Byte encoding/decoding methods inside some of the sun.nio.cs classes (such as US_ASCII) see a lot of use, and OpenJDK on the OpenJ9 VM seems to do this a lot faster.
> >> >
> >> > Is it possible to achieve a similar improvement on OpenJDK on Hotspot by tweaking the CL code to match Hotspot JIT compiler idioms, or by introducing a method name for the HS JIT to match on?
> >> >
> >> > An example of these changes to US_ASCII.java is linked below. No OpenJ9 code is included in the work item or the webrev, to avoid contamination.
> >> >
> >> > Work item: https://bugs.openjdk.java.net/browse/JDK-8224963
> >> >
> >> > Example Webrev: http://cr.openjdk.java.net/~afarley/8224963/webrev/
> >> >
> >> > Best Regards
> >> >
> >> > Adam Farley
> >> > IBM Runtimes
> >> >
> >> > Unless stated otherwise above:
> >> > IBM United Kingdom Limited - Registered in England and Wales with number 741598.
> >> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

From gnu.andrew at redhat.com Fri May 31 16:18:22 2019
From: gnu.andrew at redhat.com (Andrew John Hughes)
Date: Fri, 31 May 2019 17:18:22 +0100
Subject: [8u-dev, ppc] RFR for (almost clean) backport of 8185969: PPC64: Improve VSR support to use up to 64 registers
In-Reply-To:
References: <1bd63cd1-efbb-e70d-62e5-510d364f712b@redhat.com> <6459888e-be23-e362-3b09-c5cd4afa701f@redhat.com>
Message-ID: <21b41ea7-f8e8-39ee-aebf-b6f80ab41367@redhat.com>

On 31/05/2019 05:54, Kazunori Ogata wrote:
> Hi Andrew,
>
> Thank you for your confirmation.
>
>> However, please don't assume an acceptable review just because there is no objection.
>
> I see. I thought review was optional in this case since it is almost clean and no change in the actual code. But I should have waited for your reply, as you reviewed the patch. I'll be more careful.
>
>
> Regards,
> Ogata
>
>

You didn't do anything wrong. I just wanted to clarify that there should always be an explicit ok if a review is required, in case there was any misunderstanding.

Regards,
--
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222
https://keybase.io/gnu_andrew

From vladimir.kozlov at oracle.com Fri May 31 16:38:19 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 31 May 2019 09:38:19 -0700
Subject: [13] RFR(M) 8225019: Update JVMCI
Message-ID:

http://cr.openjdk.java.net/~kvn/8225019/webrev.02/
https://bugs.openjdk.java.net/browse/JDK-8225019

Sync latest changes from graal-jvmci-8.

Several compiler/jvmci tests failed because the VM will now exit if the JVMCI Compiler is specified incorrectly (-Djvmci.Compiler=null) when UseJVMCICompiler is ON.

The exit was added by [GR-15954] "Fail gracefully if JVMCI compiler initialization fails".
This is correct behavior. Tests were fixed by replacing -Djvmci.Compiler=null with -XX:-UseJVMCICompiler.
Note, I reproduced the same failures back to JDK 10 when we enabled Graal as JIT. Tests passed before because Graal initialization failures were ignored.

Found another test, compiler/uncommontrap/DeoptReallocFailure.java, which uses a small Java heap (-Xmx100m) to trigger allocation failures.
We should not run it with Java Graal - I put it in ProblemList-graal.txt with JDK-8196611 umbrella bug. Tested with tier1-3. Thanks, Vladimir From igor.ignatyev at oracle.com Fri May 31 17:09:26 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Fri, 31 May 2019 10:09:26 -0700 Subject: RFR(S): 8222103: [testbug] compiler/compilercontrol/jcmd/ClearDirectivesFileStackTest may exceed VM limit In-Reply-To: References: <9F82E0AB-45DB-45F4-B453-691F4A1AB244@oracle.com> Message-ID: <21920A56-9559-4CB3-A084-5F9F3EF92BCA@oracle.com> Hi Peter, the patch looks good to me. -- Igor > On May 31, 2019, at 6:46 AM, Januschke, Peter wrote: > > Hi, > > sorry for taking so long. Since I am a member of SAP?s work council, I have only very limited time for doing development. > > So now here is a solution like proposed by Igor: > > http://cr.openjdk.java.net/~mdoerr/8222103_testbug/webrev.02/ > > https://bugs.openjdk.java.net/browse/JDK-8222103 > > Please review, and I please need a sponsor. > > Best regards > > Peter > > From: Igor Ignatyev > Sent: Dienstag, 9. April 2019 18:30 > To: Nils Eliasson ; Januschke, Peter > Cc: hotspot-compiler-dev at openjdk.java.net compiler > Subject: Re: RFR(S): 8222103: [testbug] compiler/compilercontrol/jcmd/ClearDirectivesFileStackTest may exceed VM limit > > Nils, > > thanks for the clarification. > > Peter, > > I haven't looked at these tests for quite a while, and, I guess, presence of 'jcmd' in the path got me confused, so I (now obviously) incorrectly assumed that the directives are added by jcmd. in this case, I agree that it is a test bug. I'd however prefer it to be fixed slightly different and instead of adding 'CompilerDirectivesLimit to command line I'd read its value using WhiteBox and limit ClearDirectivesFileStackTest::AMOUNT by it. > > you'll also need to update year in the copyright notice. 
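The approach suggested above (read the limit, then bound the test's AMOUNT by it) could be sketched roughly as follows. This is not the code from the webrev: the helper name boundedAmount, the limit value 50, and the random range are illustrative assumptions, and the real test would obtain the limit through the WhiteBox API rather than a hard-coded constant.

```java
import java.util.Random;

public class DirectiveAmountSketch {
    // Hypothetical helper: never request more directives than the VM still
    // has room for, given how many are already on the directives stack.
    static int boundedAmount(int requested, long directivesLimit, int alreadyOnStack) {
        long room = Math.max(0L, directivesLimit - alreadyOnStack);
        return (int) Math.min((long) requested, room);
    }

    public static void main(String[] args) {
        long limit = 50;          // assumed value, stands in for CompilerDirectivesLimit
        int alreadyOnStack = 1;   // assumed: one directive is present before the test adds any
        int requested = new Random().nextInt(100) + 1;  // random request in 1..100
        int amount = boundedAmount(requested, limit, alreadyOnStack);
        System.out.println("requested=" + requested + " bounded=" + amount);
    }
}
```

With the request clamped this way, the capacity check quoted further down the thread would not be tripped by the test's own random choice.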
> > Thanks, > -- Igor > On Apr 9, 2019, at 4:55 AM, Nils Eliasson > wrote: > > Hi, > The design is that you add a number of directives together to get a desired behavior. Adding just some of them make no sense. > If you add directives by command line, and get an error (syntax or limit) - it will print the error stop the startup. This allows the user to correct the problem and retry. > If you add by jcmd, the error is printed on the jcmd console, but the VM continues on unaffected. The user can correct the problem and make another try. > I think there is a separate test of the limit. And if that is already covered, testing it in this test too, seems unnecessary. > Regards, > // Nils > > On 2019-04-09 11:34, Januschke, Peter wrote: > > Hi Igor, > > the current implementation prints a message and then stops: > > bool DirectivesStack::check_capacity(int request_size, outputStream* st) { > if ((request_size + _depth) > CompilerDirectivesLimit) { > st->print_cr("Could not add %i more directives. Currently %i/%i directives.", request_size, _depth, CompilerDirectivesLimit); > return false; > } > return true; > } > > Best regards > > Peter > > From: Igor Ignatyev > Sent: Montag, 8. April 2019 19:25 > To: Januschke, Peter > Cc: hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR(S): 8222103: [testbug] compiler/compilercontrol/jcmd/ClearDirectivesFileStackTest may exceed VM limit > > Hi Peter, > > I don't think it's a test bug, VM shouldn't stop execution if someone requests too many compiler directives, instead it should reject requests which will exceed its capacity. > > -- Igor > > > > On Apr 8, 2019, at 1:18 AM, Januschke, Peter > wrote: > > Hi, > > I propose the following fix to the test mentioned in the subject: > > Problem: > The test generates a random number of compiler directives, which might be greater than the value of CompilerDirectivesLimit. This causes the VM to stop execution upon the corresponding capacity check. 
> > Fix: set CompilerDirectivesLimit to the max random number used. > > http://cr.openjdk.java.net/~goetz/wr19/peter/8222103-01 > > https://bugs.openjdk.java.net/browse/JDK-8222103 > > Please review, and I please need a sponsor. > > Best regards > > Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From tom.rodriguez at oracle.com Fri May 31 18:36:17 2019 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Fri, 31 May 2019 11:36:17 -0700 Subject: RFR(L) 8223320: [AOT] jck test api/javax_script/ScriptEngine/PutGet.html fails when test classes are AOTed In-Reply-To: <6953E8A8-B856-4D8F-9EEF-ABDB6C8A4904@oracle.com> References: <1eb97f37-19eb-82db-b58b-36f4c87d8921@oracle.com> <6953E8A8-B856-4D8F-9EEF-ABDB6C8A4904@oracle.com> Message-ID: <1f372704-6dbb-c211-49a8-d8c3266c4de2@oracle.com> you're missing the ByteCache. Isn't box the generic term for these classes whether they are cached or not? I guess what?s confusing is that both new Integer(n) and Integer.valueOf(n) are both boxes but one with be marked as box and the other won't. It might be clearer to use cached or auto_box instead of just box. Otherwise this looks ok to me. Thanks for taking care of this. tom Igor Veresov wrote on 5/30/19 3:29 PM: > >> On May 30, 2019, at 2:15 PM, Vladimir Kozlov >> > wrote: >> >> CCing to runtime group too. >> >> So you went hard and correct way ;-) > > Well, there isn?t much of choice? Must abide the spec. > >> >> deoptimization.cpp next #ifdef sequence is strange since we support >> only 64-bit on SPARC: >> >> +#ifdef _LP64 >> + ??????????????????????jlong res = (jlong)low->get_int(); >> +#else >> +#ifdef SPARC >> > > Right. I guess JVMCI and AOT will never support 32 bit, so I?ll just > remove it. > >> Also the code here is guarded by INCLUDE_JVMCI but it should be >> applicable to AOT code too. Right? May be || INCLUDE_AOT? >> > > Good point, will add that. > >> Please, add more comment. 
For example add one in aotLoader.cpp for >> initialize_box_caches() to explain why we need to eager initialize >> caches for AOT. >> > > Ok. > > > New webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.01/ > > > > igor > >> Thanks, >> Vladimir >> >> On 5/30/19 12:33 PM, Igor Veresov wrote: >>> Graal models boxing (a call to valueOf()) as a BoxNode. If >>> scalarized, it is encoded in the debug info as an allocation of a box >>> object. However, for certain ranges of values the box object has to >>> come from caches. The reason is that for these values JLS guarantees >>> the identity of the boxes. >>> The fix essentially propagates the information on whether the Box is >>> a result of Box.valueOf() or new Box() to the deoptimization >>> machinery that checks if the object is in the range that should be in >>> a cache and gets it from there instead of allocating it. >>> Mach5: tier1-6, tier2-6 with Graal >>> Webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.00/ >>> >>> I?d like to push all this into JDK13 first and then follow up with a >>> change to the upstream Graal. >>> Thanks, >>> igor > From tom.rodriguez at oracle.com Fri May 31 19:00:11 2019 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Fri, 31 May 2019 12:00:11 -0700 Subject: [13] RFR(M) 8225019: Update JVMCI In-Reply-To: References: Message-ID: <264d0cb0-d49f-dab4-aa6b-2ed13822bb27@oracle.com> Looks good. tom Vladimir Kozlov wrote on 5/31/19 9:38 AM: > http://cr.openjdk.java.net/~kvn/8225019/webrev.02/ > https://bugs.openjdk.java.net/browse/JDK-8225019 > > Sync latest changes from graal-jvmci-8. > > Several compiler/jvmci tests failed because now VM will exit if JVMCI > Compiler specified incorrectly (-Djvmci.Compiler=null) when > UseJVMCICompiler is ON. > > The exit was added by [GR-15954] "Fail gracefully if JVMCI compiler > initialization fails". > This is correct behavior. Tests were fixed by replacing > -Djvmci.Compiler=null with -XX:-UseJVMCICompiler. 
> Note, I reproduced the same failures back to JDK 10 when we enabled > Graal as JIT. Tests passed before because Graal initialization failures > were ignored. > > Found an other test compiler/uncommontrap/DeoptReallocFailure.java which > use small Java heap -Xmx100m to trigger allocation failures. We should > not run it with Java Graal - I put it in ProblemList-graal.txt with > JDK-8196611 umbrella bug. > > Tested with tier1-3. > > Thanks, > Vladimir From igor.veresov at oracle.com Fri May 31 19:30:43 2019 From: igor.veresov at oracle.com (Igor Veresov) Date: Fri, 31 May 2019 12:30:43 -0700 Subject: RFR(L) 8223320: [AOT] jck test api/javax_script/ScriptEngine/PutGet.html fails when test classes are AOTed In-Reply-To: <1f372704-6dbb-c211-49a8-d8c3266c4de2@oracle.com> References: <1eb97f37-19eb-82db-b58b-36f4c87d8921@oracle.com> <6953E8A8-B856-4D8F-9EEF-ABDB6C8A4904@oracle.com> <1f372704-6dbb-c211-49a8-d8c3266c4de2@oracle.com> Message-ID: <07D0C373-36BE-4D79-A483-631467148594@oracle.com> Thanks for review! I did the renames and added support for ByteCache. Updated webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.02/ igor > On May 31, 2019, at 11:36 AM, Tom Rodriguez wrote: > > you're missing the ByteCache. > > Isn't box the generic term for these classes whether they are cached or not? I guess what?s confusing is that both new Integer(n) and Integer.valueOf(n) are both boxes but one with be marked as box and the other won't. It might be clearer to use cached or auto_box instead of just box. > > Otherwise this looks ok to me. Thanks for taking care of this. > > tom > > Igor Veresov wrote on 5/30/19 3:29 PM: >>> On May 30, 2019, at 2:15 PM, Vladimir Kozlov > wrote: >>> >>> CCing to runtime group too. >>> >>> So you went hard and correct way ;-) >> Well, there isn?t much of choice? Must abide the spec. 
>>> >>> deoptimization.cpp next #ifdef sequence is strange since we support only 64-bit on SPARC: >>> >>> +#ifdef _LP64 >>> + jlong res = (jlong)low->get_int(); >>> +#else >>> +#ifdef SPARC >>> >> Right. I guess JVMCI and AOT will never support 32 bit, so I?ll just remove it. >>> Also the code here is guarded by INCLUDE_JVMCI but it should be applicable to AOT code too. Right? May be || INCLUDE_AOT? >>> >> Good point, will add that. >>> Please, add more comment. For example add one in aotLoader.cpp for initialize_box_caches() to explain why we need to eager initialize caches for AOT. >>> >> Ok. >> New webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.01/ >> igor >>> Thanks, >>> Vladimir >>> >>> On 5/30/19 12:33 PM, Igor Veresov wrote: >>>> Graal models boxing (a call to valueOf()) as a BoxNode. If scalarized, it is encoded in the debug info as an allocation of a box object. However, for certain ranges of values the box object has to come from caches. The reason is that for these values JLS guarantees the identity of the boxes. >>>> The fix essentially propagates the information on whether the Box is a result of Box.valueOf() or new Box() to the deoptimization machinery that checks if the object is in the range that should be in a cache and gets it from there instead of allocating it. >>>> Mach5: tier1-6, tier2-6 with Graal >>>> Webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.00/ >>>> I?d like to push all this into JDK13 first and then follow up with a change to the upstream Graal. >>>> Thanks, >>>> igor -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tom.rodriguez at oracle.com Fri May 31 19:36:17 2019 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Fri, 31 May 2019 12:36:17 -0700 Subject: RFR(L) 8223320: [AOT] jck test api/javax_script/ScriptEngine/PutGet.html fails when test classes are AOTed In-Reply-To: <07D0C373-36BE-4D79-A483-631467148594@oracle.com> References: <1eb97f37-19eb-82db-b58b-36f4c87d8921@oracle.com> <6953E8A8-B856-4D8F-9EEF-ABDB6C8A4904@oracle.com> <1f372704-6dbb-c211-49a8-d8c3266c4de2@oracle.com> <07D0C373-36BE-4D79-A483-631467148594@oracle.com> Message-ID: Looks good. tom Igor Veresov wrote on 5/31/19 12:30 PM: > Thanks for review! > > I did the renames and added support for ByteCache. > > Updated webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.02/ > > > igor > > > >> On May 31, 2019, at 11:36 AM, Tom Rodriguez > > wrote: >> >> you're missing the ByteCache. >> >> Isn't box the generic term for these classes whether they are cached >> or not? ?I guess what?s confusing is that both new Integer(n) and >> Integer.valueOf(n) are both boxes but one with be marked as box and >> the other won't. ?It might be clearer to use cached or auto_box >> instead of just box. >> >> Otherwise this looks ok to me. ?Thanks for taking care of this. >> >> tom >> >> Igor Veresov wrote on 5/30/19 3:29 PM: >>>> On May 30, 2019, at 2:15 PM, Vladimir Kozlov >>>> >>>> > wrote: >>>> >>>> CCing to runtime group too. >>>> >>>> So you went hard and correct way ;-) >>> Well, there isn?t much of choice? Must abide the spec. >>>> >>>> deoptimization.cpp next #ifdef sequence is strange since we support >>>> only 64-bit on SPARC: >>>> >>>> +#ifdef _LP64 >>>> + ??????????????????????jlong res = (jlong)low->get_int(); >>>> +#else >>>> +#ifdef SPARC >>>> >>> Right. I guess JVMCI and AOT will never support 32 bit, so I?ll just >>> remove it. >>>> Also the code here is guarded by INCLUDE_JVMCI but it should be >>>> applicable to AOT code too. Right? May be || INCLUDE_AOT? 
>>>> >>> Good point, will add that. >>>> Please, add more comment. For example add one in aotLoader.cpp for >>>> initialize_box_caches() to explain why we need to eager initialize >>>> caches for AOT. >>>> >>> Ok. >>> New webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.01/ >>> >>> >>> igor >>>> Thanks, >>>> Vladimir >>>> >>>> On 5/30/19 12:33 PM, Igor Veresov wrote: >>>>> Graal models boxing (a call to valueOf()) as a BoxNode. If >>>>> scalarized, it is encoded in the debug info as an allocation of a >>>>> box object. However, for certain ranges of values the box object >>>>> has to come from caches. The reason is that for these values JLS >>>>> guarantees the identity of the boxes. >>>>> The fix essentially propagates the information on whether the Box >>>>> is a result of Box.valueOf() or new Box() to the deoptimization >>>>> machinery that checks if the object is in the range that should be >>>>> in a cache and gets it from there instead of allocating it. >>>>> Mach5: tier1-6, tier2-6 with Graal >>>>> Webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.00/ >>>>> >>>>> >>>>> I?d like to push all this into JDK13 first and then follow up with >>>>> a change to the upstream Graal. >>>>> Thanks, >>>>> igor > From igor.veresov at oracle.com Fri May 31 19:50:41 2019 From: igor.veresov at oracle.com (Igor Veresov) Date: Fri, 31 May 2019 12:50:41 -0700 Subject: RFR(L) 8223320: [AOT] jck test api/javax_script/ScriptEngine/PutGet.html fails when test classes are AOTed In-Reply-To: References: <1eb97f37-19eb-82db-b58b-36f4c87d8921@oracle.com> <6953E8A8-B856-4D8F-9EEF-ABDB6C8A4904@oracle.com> <1f372704-6dbb-c211-49a8-d8c3266c4de2@oracle.com> <07D0C373-36BE-4D79-A483-631467148594@oracle.com> Message-ID: <95A23A32-4C0D-438F-A13A-D59938D88330@oracle.com> Thanks Vladimir and Tom! igor > On May 31, 2019, at 12:36 PM, Tom Rodriguez wrote: > > Looks good. > > tom > > Igor Veresov wrote on 5/31/19 12:30 PM: >> Thanks for review! 
>> I did the renames and added support for ByteCache. >> Updated webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.02/ >> igor >>> On May 31, 2019, at 11:36 AM, Tom Rodriguez > wrote: >>> >>> you're missing the ByteCache. >>> >>> Isn't box the generic term for these classes whether they are cached or not? I guess what?s confusing is that both new Integer(n) and Integer.valueOf(n) are both boxes but one with be marked as box and the other won't. It might be clearer to use cached or auto_box instead of just box. >>> >>> Otherwise this looks ok to me. Thanks for taking care of this. >>> >>> tom >>> >>> Igor Veresov wrote on 5/30/19 3:29 PM: >>>>> On May 30, 2019, at 2:15 PM, Vladimir Kozlov > wrote: >>>>> >>>>> CCing to runtime group too. >>>>> >>>>> So you went hard and correct way ;-) >>>> Well, there isn?t much of choice? Must abide the spec. >>>>> >>>>> deoptimization.cpp next #ifdef sequence is strange since we support only 64-bit on SPARC: >>>>> >>>>> +#ifdef _LP64 >>>>> + jlong res = (jlong)low->get_int(); >>>>> +#else >>>>> +#ifdef SPARC >>>>> >>>> Right. I guess JVMCI and AOT will never support 32 bit, so I?ll just remove it. >>>>> Also the code here is guarded by INCLUDE_JVMCI but it should be applicable to AOT code too. Right? May be || INCLUDE_AOT? >>>>> >>>> Good point, will add that. >>>>> Please, add more comment. For example add one in aotLoader.cpp for initialize_box_caches() to explain why we need to eager initialize caches for AOT. >>>>> >>>> Ok. >>>> New webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.01/ >>>> igor >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 5/30/19 12:33 PM, Igor Veresov wrote: >>>>>> Graal models boxing (a call to valueOf()) as a BoxNode. If scalarized, it is encoded in the debug info as an allocation of a box object. However, for certain ranges of values the box object has to come from caches. The reason is that for these values JLS guarantees the identity of the boxes. 
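The identity guarantee referred to above can be observed directly from Java. The following is only an illustrative sketch, not code from the webrev; the class and method names (BoxIdentitySketch, sameBox) are made up for the example. It relies on the documented behaviour that Integer.valueOf always caches values in the range -128 to 127.

```java
public class BoxIdentitySketch {
    // Returns true when two independent valueOf() calls for the same int
    // hand back the very same object (reference equality, not equals()).
    static boolean sameBox(int value) {
        Integer a = Integer.valueOf(value);
        Integer b = Integer.valueOf(value);
        return a == b;
    }

    public static void main(String[] args) {
        // Inside -128..127 the boxes must be identical.
        System.out.println("127:  " + sameBox(127));
        System.out.println("-128: " + sameBox(-128));
        // Outside that range identity is not guaranteed either way; a VM may
        // cache more values, so no particular result is assumed here.
        System.out.println("1000: " + sameBox(1000));
    }
}
```

A box that was scalarized by the compiler and is later rematerialized during deoptimization has to preserve this property, which is why an object in the cached range must be fetched from the cache instead of being freshly allocated.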
>>>>>> The fix essentially propagates the information on whether the Box is a result of Box.valueOf() or new Box() to the deoptimization machinery that checks if the object is in the range that should be in a cache and gets it from there instead of allocating it. >>>>>> Mach5: tier1-6, tier2-6 with Graal >>>>>> Webrev: http://cr.openjdk.java.net/~iveresov/8223320/webrev.00/ >>>>>> I?d like to push all this into JDK13 first and then follow up with a change to the upstream Graal. >>>>>> Thanks, >>>>>> igor -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Fri May 31 20:10:52 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 31 May 2019 13:10:52 -0700 Subject: [13] RFR(M) 8225019: Update JVMCI In-Reply-To: <264d0cb0-d49f-dab4-aa6b-2ed13822bb27@oracle.com> References: <264d0cb0-d49f-dab4-aa6b-2ed13822bb27@oracle.com> Message-ID: Thank you, Tom Vladimir On 5/31/19 12:00 PM, Tom Rodriguez wrote: > Looks good. > > tom > > Vladimir Kozlov wrote on 5/31/19 9:38 AM: >> http://cr.openjdk.java.net/~kvn/8225019/webrev.02/ >> https://bugs.openjdk.java.net/browse/JDK-8225019 >> >> Sync latest changes from graal-jvmci-8. >> >> Several compiler/jvmci tests failed because now VM will exit if JVMCI Compiler specified incorrectly >> (-Djvmci.Compiler=null) when UseJVMCICompiler is ON. >> >> The exit was added by [GR-15954] "Fail gracefully if JVMCI compiler initialization fails". >> This is correct behavior. Tests were fixed by replacing -Djvmci.Compiler=null with -XX:-UseJVMCICompiler. >> Note, I reproduced the same failures back to JDK 10 when we enabled Graal as JIT. Tests passed before because Graal >> initialization failures were ignored. >> >> Found an other test compiler/uncommontrap/DeoptReallocFailure.java which use small Java heap -Xmx100m to trigger >> allocation failures. We should not run it with Java Graal - I put it in ProblemList-graal.txt with JDK-8196611 >> umbrella bug. 
>>
>> Tested with tier1-3.
>>
>> Thanks,
>> Vladimir

From vladimir.x.ivanov at oracle.com Fri May 31 20:15:11 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 31 May 2019 23:15:11 +0300
Subject: [13] RFR (S): 8225106: C2: Parse::clinit_deopt asserts when holder klass is in error state
Message-ID: <28899e8f-daa5-2d90-c139-dcb833c9a93f@oracle.com>

http://cr.openjdk.java.net/~vlivanov/8225106/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8225106

Some asserts introduced as part of JDK-8223213 [1] don't take into account the case when the class which participates in a class initialization barrier is in error state (class initialization failed with an exception).

The proposed fix adjusts the asserts, plus slightly improves handling of cases when classes in error state are encountered.

Testing: hs-precheckin-comp, tier1-4

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8223213

From dean.long at oracle.com Fri May 31 20:15:35 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Fri, 31 May 2019 13:15:35 -0700
Subject: [13] RFR(M) 8225019: Update JVMCI
In-Reply-To:
References:
Message-ID:

Looks good. Was there a JDK bug for this fix:

  GR-15425 Ensure not compilable OSR at level 4 falls back to level 1.

It looks like it could apply to C1/C2 too and not just JVMCI.

dl

On 5/31/19 9:38 AM, Vladimir Kozlov wrote:
> http://cr.openjdk.java.net/~kvn/8225019/webrev.02/
> https://bugs.openjdk.java.net/browse/JDK-8225019
>
> Sync latest changes from graal-jvmci-8.
>
> Several compiler/jvmci tests failed because the VM will now exit if the
> JVMCI compiler is specified incorrectly (-Djvmci.Compiler=null) when
> UseJVMCICompiler is ON.
>
> The exit was added by [GR-15954] "Fail gracefully if JVMCI compiler
> initialization fails".
> This is correct behavior. Tests were fixed by replacing
> -Djvmci.Compiler=null with -XX:-UseJVMCICompiler.
> Note, I reproduced the same failures back to JDK 10 when we enabled
> Graal as JIT. Tests passed before because Graal initialization
> failures were ignored.
>
> Found another test, compiler/uncommontrap/DeoptReallocFailure.java,
> which uses a small Java heap (-Xmx100m) to trigger allocation failures. We
> should not run it with Java Graal - I put it in ProblemList-graal.txt
> with JDK-8196611 umbrella bug.
>
> Tested with tier1-3.
>
> Thanks,
> Vladimir

From vladimir.kozlov at oracle.com Fri May 31 20:28:06 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 31 May 2019 13:28:06 -0700
Subject: [13] RFR(M) 8225019: Update JVMCI
In-Reply-To:
References:
Message-ID: <0c8ee377-16a0-d1a3-51ff-e2798d4a4923@oracle.com>

Thank you, Dean

There is JDK-8173112. Yes, I think it may help C2.

Tom explained that the changes fixed an issue with Graal being unable to OSR compile a method due to a low/0 invocation count and getting stuck in tier3.

Igor pointed out in a comment that JDK-8173112 is about tier3 code speed and not about how fast we replace it with other tiers.

The underlying problem is not fixed yet - how to force profile counters to only grow and not reset back to 0:

http://hg.openjdk.java.net/jdk/jdk/file/ae908641e726/src/hotspot/share/c1/c1_LIRGenerator.cpp#l3280

Thanks,
Vladimir

On 5/31/19 1:15 PM, dean.long at oracle.com wrote:
> Looks good. Was there a JDK bug for this fix:
>
>   GR-15425 Ensure not compilable OSR at level 4 falls back to level 1.
>
> It looks like it could apply to C1/C2 too and not just JVMCI.
>
> dl
>
> On 5/31/19 9:38 AM, Vladimir Kozlov wrote:
>> http://cr.openjdk.java.net/~kvn/8225019/webrev.02/
>> https://bugs.openjdk.java.net/browse/JDK-8225019
>>
>> Sync latest changes from graal-jvmci-8.
>>
>> Several compiler/jvmci tests failed because the VM will now exit if the JVMCI compiler is specified incorrectly
>> (-Djvmci.Compiler=null) when UseJVMCICompiler is ON.
>>
>> The exit was added by [GR-15954] "Fail gracefully if JVMCI compiler initialization fails".
>> This is correct behavior. Tests were fixed by replacing -Djvmci.Compiler=null with -XX:-UseJVMCICompiler.
>> Note, I reproduced the same failures back to JDK 10 when we enabled Graal as JIT. Tests passed before because Graal
>> initialization failures were ignored.
>>
>> Found another test, compiler/uncommontrap/DeoptReallocFailure.java, which uses a small Java heap (-Xmx100m) to trigger
>> allocation failures. We should not run it with Java Graal - I put it in ProblemList-graal.txt with JDK-8196611
>> umbrella bug.
>>
>> Tested with tier1-3.
>>
>> Thanks,
>> Vladimir
>

From tom.rodriguez at oracle.com Fri May 31 20:41:56 2019
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Fri, 31 May 2019 13:41:56 -0700
Subject: RFR 8209626: [JVMCI] Use implicit exception table for dispatch and printing
In-Reply-To: <8bbf04d2-09d6-6f9f-930c-77e97a03bdc9@oracle.com>
References: <09596941-d41c-fcc5-e54b-640f080d2881@oracle.com> <8dce515a-d066-bd29-8c58-bf8a515171e4@oracle.com> <46e4f527-7ba8-a5a2-157e-34d5001a44ad@oracle.com> <9a8bfea4-3a64-fd5f-afff-78d4af062124@oracle.com> <4bdbc2a1-59dd-1838-5831-af2759a3c4ac@oracle.com> <11db7afe-d111-0908-851e-f100b6050f75@oracle.com> <8bbf04d2-09d6-6f9f-930c-77e97a03bdc9@oracle.com>
Message-ID:

Updated webrev at http://cr.openjdk.java.net/~never/8209626.1/webrev

Vladimir Kozlov wrote on 5/30/19 11:55 AM:
> Hi Tom,
>
> Can you add a comment to the next new code that it is to support old JVMCI?
> Otherwise it is confusing because in JDK repo we will not have the issue:
>
> getDeclaredMethod("implicitExceptionTable");

I added a comment and also added a check that it's required starting with JDK13. The mach5 job is in progress.

>
> In jvmci/jvmciRuntime.hpp spacing is not aligned with the rest of
> arguments.

Fixed.

>
> The rest seems fine. What testing did you do with the new code?

The existing graal tests should exercise the new implicit exception dispatch path adequately and I manually inspected the PrintNMethods output.
The JDK8 version of this code has been in use for many months. A sample of the disassembly is below:

  0x00000001230c3a07:   mov    r10d,DWORD PTR [rax*8+0xc]
  ; implicit exception: deoptimizes
  ; ImmutableOopMap {rax=NarrowOop rsi=Oop }

By the way the new PrintNMethods default output seems awful. No comments at all by default and the labels marking the Entry Point seem to be totally disconnected. The alignment is kind of screwy too.

  0x00000001230c3a12:   jle    0x00000001230c3a5a
;*if_icmplt {reexecute=0 rethrow=0 return_oop=0}
; - java.lang.StringLatin1::charAt@7 (line 47)
; - java.lang.String::charAt@12 (line 708)
  0x00000001230c3a18:   cmp    r10d,edx

tom

>
> Thanks,
> Vladimir
>
> On 5/30/19 11:27 AM, Tom Rodriguez wrote:
>> I have updated this webrev to include fixes to AOT to properly capture
>> the implicit exception table and record the offset for it in the AOT
>> binary. It required minor Graal changes which I will push upstream
>> separately. Please rereview.
>>
>> tom
>>
>> Tom Rodriguez wrote on 12/12/18 11:22 PM:
>>>
>>>
>>> Vladimir Kozlov wrote on 12/12/18 2:29 PM:
>>>> On 12/12/18 1:06 PM, Tom Rodriguez wrote:
>>>>> They all look like preexisting failures to me. The
>>>>> CheckGraalIntrinsics one you mentioned in chat and
>>>>
>>>> yes
>>>>
>>>>> compiler/aot/DeoptimizationTest.java which seems to have been
>>>>> failing at least intermittently for a while. What do you think?
>>>>
>>>> SIGFPE is new. And I think your changes in sharedRuntime.cpp may
>>>> have affected execution of AOT methods because they are marked as
>>>> compiled by Graal (compiler_jvmci):
>>>>
>>>> http://hg.openjdk.java.net/jdk/jdk/file/9e28eff3d40f/src/hotspot/share/aot/aotCompiledMethod.hpp#l131
>>>
>>>
>>>
>>> Yes I think I need to move some code around to properly support AOT.
>>> I'll send out an updated webrev soon but I think we can defer this
>>> one until jdk 13.
>>>
>>> tom
>>>
>>>>
>>>>
>>>> Vladimir
>>>>
>>>>>
>>>>> It does make me wonder if AOT needs any extra support to use the
>>>>> implicit exception table. I would assume we'd be seeing problems
>>>>> if that was the case but don't really know.
>>>>>
>>>>> tom
>>>>>
>>>>> Vladimir Kozlov wrote on 12/12/18 11:12 AM:
>>>>>> Tom,
>>>>>>
>>>>>> Some tests failed.
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 12/12/18 10:42 AM, Tom Rodriguez wrote:
>>>>>>> http://cr.openjdk.java.net/~never/8209626/webrev
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8209626
>>>>>>>
>>>>>>> Graal handles implicit exceptions by deoptimizing and that's
>>>>>>> currently done in a way that's hard to understand from the
>>>>>>> PrintNMethods output. Basically there's just an extra PcDesc at
>>>>>>> the implicit check location and the runtime assumes that a fault
>>>>>>> with a PcDesc underneath it is an implicit check. This changes
>>>>>>> JVMCI to use the implicit exception table to mark these locations
>>>>>>> specially which simplifies the dispatching and
>>>>>>> printing. The new print output looks like this:
>>>>>>>
>>>>>>>    0x0000000120f053a0: mov    DWORD PTR [rsp-0x14000],eax
>>>>>>>    0x0000000120f053a7: sub    rsp,0x18
>>>>>>>    0x0000000120f053ab: mov    QWORD PTR [rsp+0x10],rbp ;*aload_0 {reexecute=1 rethrow=0 return_oop=0}
>>>>>>>                                                              ; - java.lang.StringLatin1::equals@0 (line 94)
>>>>>>>
>>>>>>>    0x0000000120f053b0: mov    eax,DWORD PTR [rsi+0xc]        ; implicit exception: deoptimizes
>>>>>>>                                                              ; ImmutableOopMap{rdx=Oop rsi=Oop }
>>>>>>> ;*aload_0 {reexecute=1 rethrow=0 return_oop=0}
>>>>>>>                                                              ; - java.lang.StringLatin1::equals@0 (line 94)
>>>>>>>
>>>>>>>    0x0000000120f053b3: mov    r10d,DWORD PTR [rdx+0xc]       ; implicit exception: deoptimizes
>>>>>>>                                                              ; ImmutableOopMap{rdx=Oop rsi=Oop }
>>>>>>>
>>>>>>> The scope information is still printed in the normal original
>>>>>>> location. This has been in use with JVMCI 8 for several months.
>>>>>>>
>>>>>>> tom

From vladimir.x.ivanov at oracle.com Fri May 31 20:53:35 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 31 May 2019 23:53:35 +0300
Subject: [13] RFR (XS): 8225141: Better handling of classes in error state in fast class initialization checks
Message-ID: <18cc95c1-4b46-2eec-bb16-34380aad9394@oracle.com>

http://cr.openjdk.java.net/~vlivanov/8225141/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8225141

Fast class initialization checks don't properly handle classes in error state when performed from the (previously) initializing thread.

One way to fix it is to add one more fast path check (InstanceKlass::_init_state == being_initialized) into the barrier, but that would require significant changes, since both newly introduced checks (JDK-8223213 [1]) and existing C1 checks should be changed.

What I propose is to set InstanceKlass::_init_thread only for the duration when the klass is in being_initialized state and reset it back to NULL when changing class state. It makes the existing "_init_thread == current_thread" check equivalent to "_init_state == being_initialized && _init_thread == current_thread".

Testing: hs-precheckin-comp, tier1-4

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8223213

From vladimir.kozlov at oracle.com Fri May 31 20:54:03 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 31 May 2019 13:54:03 -0700
Subject: RFR 8209626: [JVMCI] Use implicit exception table for dispatch and printing
In-Reply-To:
References: <09596941-d41c-fcc5-e54b-640f080d2881@oracle.com> <8dce515a-d066-bd29-8c58-bf8a515171e4@oracle.com> <46e4f527-7ba8-a5a2-157e-34d5001a44ad@oracle.com> <9a8bfea4-3a64-fd5f-afff-78d4af062124@oracle.com> <4bdbc2a1-59dd-1838-5831-af2759a3c4ac@oracle.com> <11db7afe-d111-0908-851e-f100b6050f75@oracle.com> <8bbf04d2-09d6-6f9f-930c-77e97a03bdc9@oracle.com>
Message-ID: <11cd6929-af13-1708-847a-639dc1991efa@oracle.com>

On 5/31/19 1:41 PM, Tom Rodriguez wrote:
> Updated webrev at http://cr.openjdk.java.net/~never/8209626.1/webrev

Looks good.

>
> Vladimir Kozlov wrote on 5/30/19 11:55 AM:
>> Hi Tom,
>>
>> Can you add a comment to the next new code that it is to support old JVMCI? Otherwise it is confusing because in JDK repo we
>> will not have the issue:
>>
>> getDeclaredMethod("implicitExceptionTable");
>
> I added a comment and also added a check that it's required starting with JDK13. The mach5 job is in progress.
>

Okay.

>>
>> In jvmci/jvmciRuntime.hpp spacing is not aligned with the rest of arguments.
>
> Fixed.
>
>>
>> The rest seems fine. What testing did you do with the new code?
>
> The existing graal tests should exercise the new implicit exception dispatch path adequately and I manually inspected
> the PrintNMethods output. The JDK8 version of this code has been in use for many months. A sample of the disassembly is
> below:
>
>   0x00000001230c3a07:   mov    r10d,DWORD PTR [rax*8+0xc]
>   ; implicit exception: deoptimizes
>   ; ImmutableOopMap {rax=NarrowOop rsi=Oop }
>
>
> By the way the new PrintNMethods default output seems awful. No comments at all by default and the labels marking the
> Entry Point seem to be totally disconnected. The alignment is kind of screwy too.
>
>   0x00000001230c3a12:   jle    0x00000001230c3a5a
> ;*if_icmplt {reexecute=0 rethrow=0 return_oop=0}
> ; - java.lang.StringLatin1::charAt@7 (line 47)
> ; - java.lang.String::charAt@12 (line 708)
>   0x00000001230c3a18:   cmp    r10d,edx

Strange, Lutz fixed output recently for 8213084. Can you try PrintAssembly instead? Maybe his changes affect PrintNMethods in the wrong way.

Thanks,
Vladimir

>
> tom
>
>>
>> Thanks,
>> Vladimir
>>
>> On 5/30/19 11:27 AM, Tom Rodriguez wrote:
>>> I have updated this webrev to include fixes to AOT to properly capture the implicit exception table and record the
>>> offset for it in the AOT binary. It required minor Graal changes which I will push upstream separately. Please
>>> rereview.
>>>
>>> tom
>>>
>>> Tom Rodriguez wrote on 12/12/18 11:22 PM:
>>>>
>>>>
>>>> Vladimir Kozlov wrote on 12/12/18 2:29 PM:
>>>>> On 12/12/18 1:06 PM, Tom Rodriguez wrote:
>>>>>> They all look like preexisting failures to me. The CheckGraalIntrinsics one you mentioned in chat and
>>>>>
>>>>> yes
>>>>>
>>>>>> compiler/aot/DeoptimizationTest.java which seems to have been failing at least intermittently for a while. What
>>>>>> do you think?
>>>>>
>>>>> SIGFPE is new. And I think your changes in sharedRuntime.cpp may have affected execution of AOT methods because they are
>>>>> marked as compiled by Graal (compiler_jvmci):
>>>>>
>>>>> http://hg.openjdk.java.net/jdk/jdk/file/9e28eff3d40f/src/hotspot/share/aot/aotCompiledMethod.hpp#l131
>>>>
>>>>
>>>>
>>>> Yes I think I need to move some code around to properly support AOT. I'll send out an updated webrev soon but I
>>>> think we can defer this one until jdk 13.
>>>>
>>>> tom
>>>>
>>>>>
>>>>>
>>>>> Vladimir
>>>>>
>>>>>>
>>>>>> It does make me wonder if AOT needs any extra support to use the implicit exception table. I would assume we'd be
>>>>>> seeing problems if that was the case but don't really know.
>>>>>>
>>>>>> tom
>>>>>>
>>>>>> Vladimir Kozlov wrote on 12/12/18 11:12 AM:
>>>>>>> Tom,
>>>>>>>
>>>>>>> Some tests failed.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 12/12/18 10:42 AM, Tom Rodriguez wrote:
>>>>>>>> http://cr.openjdk.java.net/~never/8209626/webrev
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8209626
>>>>>>>>
>>>>>>>> Graal handles implicit exceptions by deoptimizing and that's currently done in a way that's hard to understand
>>>>>>>> from the PrintNMethods output. Basically there's just an extra PcDesc at the implicit check location and the
>>>>>>>> runtime assumes that a fault with a PcDesc underneath it is an implicit check. This changes JVMCI to use the
>>>>>>>> implicit exception table to mark these locations specially which simplifies the dispatching and
>>>>>>>> printing. The new print output looks like this:
>>>>>>>>
>>>>>>>>    0x0000000120f053a0: mov    DWORD PTR [rsp-0x14000],eax
>>>>>>>>    0x0000000120f053a7: sub    rsp,0x18
>>>>>>>>    0x0000000120f053ab: mov    QWORD PTR [rsp+0x10],rbp ;*aload_0 {reexecute=1 rethrow=0 return_oop=0}
>>>>>>>>                                                              ; - java.lang.StringLatin1::equals@0 (line 94)
>>>>>>>>
>>>>>>>>    0x0000000120f053b0: mov    eax,DWORD PTR [rsi+0xc]        ; implicit exception: deoptimizes
>>>>>>>>                                                              ; ImmutableOopMap{rdx=Oop rsi=Oop }
>>>>>>>> ;*aload_0 {reexecute=1 rethrow=0 return_oop=0}
>>>>>>>>                                                              ; - java.lang.StringLatin1::equals@0 (line 94)
>>>>>>>>
>>>>>>>>    0x0000000120f053b3: mov    r10d,DWORD PTR [rdx+0xc]       ; implicit exception: deoptimizes
>>>>>>>>                                                              ; ImmutableOopMap{rdx=Oop rsi=Oop }
>>>>>>>>
>>>>>>>> The scope information is still printed in the normal original location. This has been in use with JVMCI 8 for
>>>>>>>> several months.
>>>>>>>>
>>>>>>>> tom

From vladimir.x.ivanov at oracle.com Fri May 31 20:59:30 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 31 May 2019 23:59:30 +0300
Subject: [13] RFR (M): 8223213: Implement fast class initialization checks on x86-64
In-Reply-To:
References: <85a4a478-9200-87f2-c966-49af21f687c2@oracle.com> <3e1ceae0-f7a9-e2e6-2b06-59a22540550d@oracle.com> <3d9c0897-0275-c341-fe33-5f0b6c94f253@oracle.com> <42a8fc79-9497-b2eb-8dd9-a56e4ed85255@oracle.com> <956790f2-5230-fa65-ca0a-77aa106ef462@oracle.com> <762bb879-63df-ed5e-c419-8c7c5544246c@oracle.com>
Message-ID: <935f5428-2e4f-a02e-f8d1-4d5051721ead@oracle.com>

Thanks, Martin. I filed JDK-8225106 [1] and sent out the fix for review [2].

> And after thinking longer about it, I think that it's not ideal that we check is_being_initialized() several times in C2 (recheck in GraphKit::clinit_barrier for deoptimization).
> Should we better enforce consistency by only checking clinit_barrier_on_entry()?

I introduced ciInstanceKlass::is_not_initialized() (mirrors InstanceKlass::is_not_initialized()) to properly narrow cases when class initialization has been started. I'd prefer to keep the asserts in place.

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8225106
[2] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-May/034058.html

>> -----Original Message-----
>> From: hotspot-dev On Behalf Of
>> Vladimir Ivanov
>> Sent: Thursday, 30 May 2019 12:38
>> To: hotspot-dev developers
>> Subject: Re: [13] RFR (M): 8223213: Implement fast class initialization checks
>> on x86-64
>>
>> Thanks for reviews, Vladimir, Claes, David, Martin, and Coleen!
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 29/05/2019 22:06, coleen.phillimore at oracle.com wrote:
>>>
>>> Vladimir,
>>>
>>> This looks good to me.
>>>
>>> On 5/29/19 12:20 PM, Vladimir Ivanov wrote:
>>>> Thanks, Coleen.
>>>>
>>>> Updated webrev:
>>>>   http://cr.openjdk.java.net/~vlivanov/8223213/webrev.03
>>>>
>>>> Incremental webrev:
>>>>   http://cr.openjdk.java.net/~vlivanov/8223213/webrev.03_02/
>>>>
>>>>> I reviewed mostly the interpreter and shared change. As someone else
>>>>> commented, I don't like the addition of a develop flag because some
>>>>> platforms don't support class initialization barriers. Isn't it
>>>>> normal to do this with the misnamed VM_Version, like adding
>>>>> VM_Version::supports_class_initialization_barriers() with x86
>>>>> returning true and the other platforms false until they implement
>>>>> the feature? Then there isn't another flag configuration to test (or
>>>>> not test).
>>>>
>>>> I like your suggestion. Didn't know VM_Version is used in such a way.
>>>>
>>>> Replaced UseFastClassInitChecks with
>>>> VM_Version::supports_fast_class_init_checks().
>>>>
>>>>> http://cr.openjdk.java.net/~vlivanov/8223213/webrev.02/src/hotspot/cpu/x86/interp_masm_x86.cpp.udiff.html
>>>>>
>>>>>
>>>>> I have to admit that the relationship between resolved bytecode in
>>>>> bytecode_1/bytecode_2 and which _f1/_f2 held the Method* was actually
>>>>> a surprise to me. There's nothing structurally in cpCache other than
>>>>> reading the code that enforces this and it wasn't always a Method* in
>>>>> f2 for invokeinterface, for example, so it's sort of an accident.
>>>>>
>>>>> But this code is correct and I think as a follow up we should make
>>>>> load_invoke_cp_cache_entry() call load_resolved_method_at_index() too
>>>>> and have some assert in cpCache that this is true, or rewrite the
>>>>> cpCache completely.
>>>>
>>>> I went ahead and changed load_invoke_cp_cache_entry() to call
>>>> load_resolved_method_at_index() (along with some other minor
>>>> refactorings). If you have any ideas/suggestions about the assert, I
>>>> can add it as well.
>>>
>>> I think the change to use load_resolved_method_at_index() is good
>>> because if someone moves things around, it'll now fail very quickly. I
>>> think this is the right amount of refactoring.
>>>
>>> Thanks,
>>> Coleen
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>>> The code to do the initialization barrier in the interpreter looks good.
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>
>>>>> On 5/28/19 7:40 AM, Vladimir Ivanov wrote:
>>>>>> Thanks, Martin.
>>>>>>
>>>>>> Updated webrev:
>>>>>>   http://cr.openjdk.java.net/~vlivanov/8223213/webrev.02/
>>>>>>
>>>>>>> Are these assertions safe?
>>>>>>> +   assert(method()->needs_clinit_barrier(), "barrier not needed");
>>>>>>> +   assert(method()->holder()->is_being_initialized(), "barrier not needed");
>>>>>>> Can it happen that initialization concurrently completes before
>>>>>>> they are evaluated?
>>>>>>
>>>>>> Good point. Even though ciInstanceKlass caches initialization state
>>>>>> of the corresponding InstanceKlass, it seems there's a possibility
>>>>>> that the state is updated during the compilation (see
>>>>>> ciInstanceKlass::update_if_shared). I enhanced the asserts to check
>>>>>> that initialization has been started.
>>>>>
>>>>> Ok, this makes sense.
>>>>>>
>>>>>>> A small suggestion for x86 TemplateTable::invokeinterface:
>>>>>>> It'd be nice to replace load of interface klass by your new
>>>>>>> load_method_holder.
>>>>>>
>>>>>> Agree. Updated.
>>>>>
>>>>> This is nice.
>>>>>
>>>>>
>>>>>>
>>>>>> Best regards,
>>>>>> Vladimir Ivanov
>>>>>
>>>

From tom.rodriguez at oracle.com Fri May 31 21:19:45 2019
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Fri, 31 May 2019 14:19:45 -0700
Subject: RFR 8209626: [JVMCI] Use implicit exception table for dispatch and printing
In-Reply-To: <11cd6929-af13-1708-847a-639dc1991efa@oracle.com>
References: <09596941-d41c-fcc5-e54b-640f080d2881@oracle.com> <8dce515a-d066-bd29-8c58-bf8a515171e4@oracle.com> <46e4f527-7ba8-a5a2-157e-34d5001a44ad@oracle.com> <9a8bfea4-3a64-fd5f-afff-78d4af062124@oracle.com> <4bdbc2a1-59dd-1838-5831-af2759a3c4ac@oracle.com> <11db7afe-d111-0908-851e-f100b6050f75@oracle.com> <8bbf04d2-09d6-6f9f-930c-77e97a03bdc9@oracle.com> <11cd6929-af13-1708-847a-639dc1991efa@oracle.com>
Message-ID: <1a061238-0ee4-8fbf-97b6-59472ef1576c@oracle.com>

>> By the way the new PrintNMethods default output seems awful. No
>> comments at all by default and the labels marking the Entry Point seem
>> to be totally disconnected. The alignment is kind of screwy too.
>>
>>    0x00000001230c3a12:   jle    0x00000001230c3a5a
>> ;*if_icmplt {reexecute=0 rethrow=0 return_oop=0}
>> ; - java.lang.StringLatin1::charAt@7 (line 47)
>> ; - java.lang.String::charAt@12 (line 708)
>>    0x00000001230c3a18:   cmp    r10d,edx
>
> Strange, Lutz fixed output recently for 8213084. Can you try
> PrintAssembly instead? Maybe his changes affect PrintNMethods in the wrong way.

I don't see any differences between those. From digging into the code a bit more it looks like I have to add -XX:PrintAssemblyOptions=show-comments,show-block-comments to get something that looks like the old output. Why aren't these on by default? I think I need to adjust my move_to calls to match the existing code, but the show-comments output still seems awfully far to the left. Oddly it's further to the left than a block comment. Maybe it's a problem with the _post_decode_alignment alignment? It seems like that's only recalculated in one place that I think is kind of unreachable or not commonly reached.

tom

>
> Thanks,
> Vladimir
>
>>
>> tom
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 5/30/19 11:27 AM, Tom Rodriguez wrote:
>>>> I have updated this webrev to include fixes to AOT to properly
>>>> capture the implicit exception table and record the offset for it in
>>>> the AOT binary. It required minor Graal changes which I will push
>>>> upstream separately. Please rereview.
>>>>
>>>> tom
>>>>
>>>> Tom Rodriguez wrote on 12/12/18 11:22 PM:
>>>>>
>>>>>
>>>>> Vladimir Kozlov wrote on 12/12/18 2:29 PM:
>>>>>> On 12/12/18 1:06 PM, Tom Rodriguez wrote:
>>>>>>> They all look like preexisting failures to me. The
>>>>>>> CheckGraalIntrinsics one you mentioned in chat and
>>>>>>
>>>>>> yes
>>>>>>
>>>>>>> compiler/aot/DeoptimizationTest.java which seems to have been
>>>>>>> failing at least intermittently for a while. What do you think?
>>>>>>
>>>>>> SIGFPE is new. And I think your changes in sharedRuntime.cpp may
>>>>>> have affected execution of AOT methods because they are marked as
>>>>>> compiled by Graal (compiler_jvmci):
>>>>>>
>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/9e28eff3d40f/src/hotspot/share/aot/aotCompiledMethod.hpp#l131
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Yes I think I need to move some code around to properly support
>>>>> AOT. I'll send out an updated webrev soon but I think we can defer
>>>>> this one until jdk 13.
>>>>>
>>>>> tom
>>>>>
>>>>>>
>>>>>>
>>>>>> Vladimir
>>>>>>
>>>>>>>
>>>>>>> It does make me wonder if AOT needs any extra support to use the
>>>>>>> implicit exception table. I would assume we'd be seeing problems
>>>>>>> if that was the case but don't really know.
>>>>>>>
>>>>>>> tom
>>>>>>>
>>>>>>> Vladimir Kozlov wrote on 12/12/18 11:12 AM:
>>>>>>>> Tom,
>>>>>>>>
>>>>>>>> Some tests failed.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>> On 12/12/18 10:42 AM, Tom Rodriguez wrote:
>>>>>>>>> http://cr.openjdk.java.net/~never/8209626/webrev
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8209626
>>>>>>>>>
>>>>>>>>> Graal handles implicit exceptions by deoptimizing and that's
>>>>>>>>> currently done in a way that's hard to understand from the
>>>>>>>>> PrintNMethods output. Basically there's just an extra PcDesc at
>>>>>>>>> the implicit check location and the runtime assumes that a
>>>>>>>>> fault with a PcDesc underneath it is an implicit check. This
>>>>>>>>> changes JVMCI to use the implicit exception table to mark these
>>>>>>>>> locations specially which simplifies the dispatching and
>>>>>>>>> printing. The new print output looks like this:
>>>>>>>>>
>>>>>>>>>    0x0000000120f053a0: mov    DWORD PTR [rsp-0x14000],eax
>>>>>>>>>    0x0000000120f053a7: sub    rsp,0x18
>>>>>>>>>    0x0000000120f053ab: mov    QWORD PTR [rsp+0x10],rbp ;*aload_0 {reexecute=1 rethrow=0 return_oop=0}
>>>>>>>>>                                                              ; - java.lang.StringLatin1::equals@0 (line 94)
>>>>>>>>>
>>>>>>>>>    0x0000000120f053b0: mov    eax,DWORD PTR [rsi+0xc]        ; implicit exception: deoptimizes
>>>>>>>>>                                                              ; ImmutableOopMap{rdx=Oop rsi=Oop }
>>>>>>>>> ;*aload_0 {reexecute=1 rethrow=0 return_oop=0}
>>>>>>>>>                                                              ; - java.lang.StringLatin1::equals@0 (line 94)
>>>>>>>>>
>>>>>>>>>    0x0000000120f053b3: mov    r10d,DWORD PTR [rdx+0xc]       ; implicit exception: deoptimizes
>>>>>>>>>                                                              ; ImmutableOopMap{rdx=Oop rsi=Oop }
>>>>>>>>>
>>>>>>>>> The scope information is still printed in the normal original
>>>>>>>>> location. This has been in use with JVMCI 8 for several months.
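
As a rough illustration of the dispatch idea in the quoted description (mark implicit check locations in a table instead of inferring them from a PcDesc), here is a minimal sketch. Everything in it is hypothetical: `ImplicitTableModel`, `ImplicitEntry` and `FaultAction` are illustrative names rather than HotSpot types, and the convention that an entry whose continuation equals its own offset means "deoptimize" is an assumption made for the example, not something stated in this thread.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical model of an implicit exception table; not the HotSpot class.
struct ImplicitEntry {
    uint32_t exec_offset;  // code offset of the instruction that may fault
    uint32_t cont_offset;  // code offset to continue at after the fault
};

enum class FaultAction { not_implicit, deoptimize, continue_at };

struct FaultResult {
    FaultAction action;
    uint32_t target;  // meaningful only when action == continue_at
};

class ImplicitTableModel {
public:
    // Keep entries sorted by exec_offset so lookup can use binary search.
    void add(uint32_t exec_offset, uint32_t cont_offset) {
        ImplicitEntry e{exec_offset, cont_offset};
        auto pos = std::lower_bound(entries_.begin(), entries_.end(), e, by_exec);
        entries_.insert(pos, e);
    }

    // Classify a fault at the given code offset.
    FaultResult lookup(uint32_t fault_offset) const {
        ImplicitEntry key{fault_offset, 0};
        auto pos = std::lower_bound(entries_.begin(), entries_.end(), key, by_exec);
        if (pos == entries_.end() || pos->exec_offset != fault_offset) {
            return {FaultAction::not_implicit, 0};  // not a marked location
        }
        // Assumption for this sketch: continuation == own offset means "deoptimize".
        if (pos->cont_offset == pos->exec_offset) {
            return {FaultAction::deoptimize, 0};
        }
        return {FaultAction::continue_at, pos->cont_offset};
    }

private:
    static bool by_exec(const ImplicitEntry& a, const ImplicitEntry& b) {
        return a.exec_offset < b.exec_offset;
    }
    std::vector<ImplicitEntry> entries_;
};
```

With a table like this, both the fault handler and the disassembly printer can ask the same question ("is this offset a marked implicit check, and what happens there?"), which is one way to read the claim that the change simplifies dispatching and printing.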
>>>>>>>>>
>>>>>>>>> tom

From vladimir.x.ivanov at oracle.com Fri May 31 21:47:22 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Sat, 1 Jun 2019 00:47:22 +0300
Subject: RFR: 8224162: assert(profile.count() == 0) failed: sanity in InlineTree::is_not_reached
In-Reply-To: <289bd842-4f56-f5ec-70ff-bf31a1c56f19@loongson.cn>
References: <262145A0-09CB-4CD5-8B49-A81CC0B68380@oracle.com> <282b2c79-1ce0-95bb-c37a-d151edcc02f4@oracle.com> <03736619-e07f-e33c-635b-5e8d722d0142@loongson.cn> <259a914e-1c9c-c884-6114-6f855a96afb6@loongson.cn> <1060f01d-dcfa-3a04-284d-1c6a95c791fc@oracle.com> <4c2da2fb-7550-d51b-539a-4656fc67bb00@oracle.com> <38b00331-34f7-b4a8-f033-f1489a154806@loongson.cn> <0669b1e3-5258-7765-aac8-8d3e5c47066c@oracle.com> <8e25f068-609a-de80-a020-fbc0ede4b96a@oracle.com> <063fa1f0-864c-b049-1ece-534322505bf7@loongson.cn> <828dfe7a-bbfd-3782-62e9-ae4ac4490cb7@loongson.cn> <289bd842-4f56-f5ec-70ff-bf31a1c56f19@loongson.cn>
Message-ID:

> Please review it: http://cr.openjdk.java.net/~jiefu/8224162/webrev.09/

Looks good! I'll submit it for testing.

Some nitpicking:

src/hotspot/share/ci/ciMethod.cpp

+template
+int ciMethod::saturated_add(L a, R b) {

What do you think about moving it to globalDefinitions.hpp?

Probably, it's worth getting rid of the template parameters and introducing specializations (akin to JAVA_INTEGER_OP) since the implementation makes sense only for int/uint.

+ jlong sum = src1 + src2;

I'd prefer to see explicit casts to jlong to stress there's no overflow possible here for 32-bit values.

+ return c < 0 ? max_jint : c;
+ case Bytecodes::_instanceof: return c > 0 ? min_jint : c;

Please, put parentheses around ternary expressions.

Best regards,
Vladimir Ivanov

>
> Thanks a lot.
> Best regards,
> Jie
>
> On 2019/5/31 ??6:02, Vladimir Ivanov wrote:
>> Thanks for checking 32-bit port!
>>
>> I'd try to fix the overflow in ciMethod::call_profile_at_bci() by
>> enforcing the following property:
>>   // The call site count is 0 with known morphism (only 1 or 2 receivers)
>>   // or < 0 in the case of a type check failure for checkcast, aastore, instanceof.
>>   // The call site count is > 0 in the case of a polymorphic virtual call.
>>
>> It knows the bci and has access to the actual bytecode.
>>
>> For invoke* it would turn negative values into max_jint and for
>> checkcast/aastore/instanceof turn positive values into min_jint.
>>
>> Also, I would avoid "#ifndef _LP64" and keep the code on both 64- &
>> 32-bit, though on 64-bit it should be effectively unused (as of now,
>> but not necessarily in the future).
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 31/05/2019 12:41, Jie Fu wrote:
>>> Hi Vladimir Ivanov,
>>>
>>> The previous version still failed on 32-bit VM.
>>>
>>> Updated: http://cr.openjdk.java.net/~jiefu/8224162/webrev.08/
>>>
>>> There seems to be no way to detect the invoke-profile counter overflow in
>>> CounterData::count() on 32-bit systems.
>>> So I did that in Compile::call_generator(...) for 32-bit platforms.
>>>
>>> What do you think of this version?
>>>
>>> Thanks a lot.
>>> Best regards,
>>> Jie
>>>
>>> On 2019/5/31 ??3:55, Jie Fu wrote:
>>>> Hi Vladimir Ivanov,
>>>>
>>>> I'm wondering whether the patch[1] works for 32-bit JVM.
>>>> And now I'm doing experiments on a 32-bit system.
>>>> Please just wait for me and I'll let you know the result ASAP.
>>>>
>>>> Thanks a lot.
>>>> Best regards,
>>>> Jie
>>>>
>>>> [1] http://cr.openjdk.java.net/~jiefu/8224162/webrev.07/
>>>>
>>>>
>>>> On 2019/5/31 ??1:03, Jie Fu wrote:
>>>>> Hi Vladimir Ivanov,
>>>>>
>>>>> Thank you for your review and guidance.
>>>>> I benefit a lot from the discussion with you.
>>>>> The patch has been updated based on your suggestions:
>>>>> - http://cr.openjdk.java.net/~jiefu/8224162/webrev.07/
>>>>>
>>>>> Also I have changed the parameter type of
>>>>> CounterData::set_count(...) from uint to int.
>>>>> It is only used here [1], so I think the change is safe and makes the code clearer.
>>>>> Please review it and give me some advice.
>>>>>
>>>>> Testing:
>>>>>   make test TEST="tier1 tier2 tier3" JTREG="JOBS=4" CONF=release on
>>>>> Linux/x64
>>>>>
>>>>> Thanks a lot.
>>>>> Best regards,
>>>>> Jie
>>>>>
>>>>> [1]
>>>>> http://hg.openjdk.java.net/jdk/jdk/file/b0513c833960/src/hotspot/share/oops/methodData.hpp#l1170
>>>>>
>>>>>
>>>>>
>>>>> On 2019/5/30 ??10:59, Vladimir Ivanov wrote:
>>>>>> Switching return type to int would make it clearer.
>>>>>>
>>>>> Done
>>>>>
>>>>>
>>>>>>
>>>>>> call->receiver_count(i) is uint, so you still can experience
>>>>>> overflow when converting from uint to int. Considering receiver
>>>>>> counts are positive, I'd use saturating add on uints (and don't
>>>>>> care about min_jint).
>>>>>
>>>>> Fixed.
>>>>>
>>>>>
>>>>>> Regarding handling overflow during profiling:
>>>>>>
>>>>>>   * C1 doesn't handle counter overflow [1]
>>>>>>
>>>>>>   * What the template interpreter does to avoid overflow is not enough
>>>>>> for the concurrent case: it stores the new value into memory and then
>>>>>> conditionally decrements it, but another thread may already see
>>>>>> it. A proper solution would be to keep the value in a register, but
>>>>>> that requires a temporary register to burn.
>>>>>
>>>>> Nice catch! I didn't realize the problem before. Thanks.
>>>>>
>>>>>
>>>>>> So, it seems easier and cheaper (from the perspective of profiling
>>>>>> overhead) to handle that on the compiler side when interpreting the data.
>>>>>>
>>>>> I agree.
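
The overflow handling discussed in this thread (saturating adds for int/uint with explicit widening, plus restoring the sign convention of a call site count on the compiler side) can be sketched roughly as below. This is a standalone illustration under stated assumptions, not the code from the webrev: `saturated_add` here is a pair of plain overloads, and `SiteKind` and `normalize_call_site_count` are hypothetical names standing in for the bytecode switch in ciMethod::call_profile_at_bci().

```cpp
#include <cstdint>
#include <limits>

// Saturating add for signed 32-bit values. The operands are widened
// explicitly to 64 bits, so the sum itself cannot overflow.
inline int32_t saturated_add(int32_t a, int32_t b) {
    const int64_t sum = static_cast<int64_t>(a) + static_cast<int64_t>(b);
    if (sum > std::numeric_limits<int32_t>::max()) {
        return std::numeric_limits<int32_t>::max();
    }
    if (sum < std::numeric_limits<int32_t>::min()) {
        return std::numeric_limits<int32_t>::min();
    }
    return static_cast<int32_t>(sum);
}

// Saturating add for unsigned 32-bit values (e.g. receiver counts).
inline uint32_t saturated_add(uint32_t a, uint32_t b) {
    const uint64_t sum = static_cast<uint64_t>(a) + static_cast<uint64_t>(b);
    return (sum > std::numeric_limits<uint32_t>::max())
               ? std::numeric_limits<uint32_t>::max()
               : static_cast<uint32_t>(sum);
}

// Hypothetical classification of the bytecode at the profiled bci.
enum class SiteKind { invoke, type_check /* checkcast, aastore, instanceof */ };

// Restore the expected sign of a count that may have wrapped around:
// invoke sites must not be negative, type-check sites must not be positive.
inline int32_t normalize_call_site_count(SiteKind kind, int32_t c) {
    switch (kind) {
        case SiteKind::invoke:
            return (c < 0) ? std::numeric_limits<int32_t>::max() : c;
        case SiteKind::type_check:
            return (c > 0) ? std::numeric_limits<int32_t>::min() : c;
    }
    return c;
}
```

Doing the clamping where the profile is read, rather than where it is written, matches the reasoning quoted above: the profiling code stays cheap and the compiler absorbs the cost of interpreting a wrapped counter.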
>>>>>
>>>>>
>>>>>>
>>>>>> Best regards,
>>>>>> Vladimir Ivanov
>>>>>>
>>>>>> [1]
>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/c41783eb76eb/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp#l1617
>>>>>>
>>>
>

From dean.long at oracle.com Fri May 31 23:14:43 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Fri, 31 May 2019 16:14:43 -0700
Subject: [13] RFR (XS): 8225141: Better handling of classes in error state in fast class initialization checks
In-Reply-To: <18cc95c1-4b46-2eec-bb16-34380aad9394@oracle.com>
References: <18cc95c1-4b46-2eec-bb16-34380aad9394@oracle.com>
Message-ID: <70f49678-782e-9b38-d4b8-79cb88c71e70@oracle.com>

Looks good.

dl

On 5/31/19 1:53 PM, Vladimir Ivanov wrote:
> http://cr.openjdk.java.net/~vlivanov/8225141/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8225141
>
> Fast class initialization checks don't properly handle classes in
> error state when performed from (previously) initializing thread.
>
> One way to fix it is to add one more fast path check
> (InstanceKlass::_init_state == being_initialized) into the barrier,
> but that would require significant changes, since both newly
> introduced checks (JDK-8223213 [1]) and existing C1 checks should be
> changed.
>
> What I propose is to set InstanceKlass::_init_thread only for the
> duration when the klass is in being_initialized state and reset it
> back to NULL when changing class state. It makes existing
> "_init_thread == current_thread" check equivalent to "_init_state ==
> being_initialized && _init_thread == current_thread".
>
> Testing: hs-precheckin-comp, tier1-4
>
> Best regards,
> Vladimir Ivanov
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8223213
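
The proposal quoted above relies on an invariant: the recorded initializing thread is non-NULL only while the class is in the being_initialized state. A minimal model of that invariant is sketched below. The types `KlassModel`, `ThreadModel` and `InitState` are hypothetical stand-ins written for this illustration and are not the HotSpot declarations; the shape of the fast check (fully initialized, or initializing thread equals current thread) is likewise an assumption of the sketch.

```cpp
// Hypothetical, simplified stand-ins for illustration only; these are not
// the HotSpot declarations of InstanceKlass or Thread.
enum class InitState { linked, being_initialized, fully_initialized, initialization_error };

struct ThreadModel { int id; };

struct KlassModel {
    InitState init_state = InitState::linked;
    const ThreadModel* init_thread = nullptr;  // non-null only while being_initialized

    // Entering being_initialized records the initializing thread.
    void begin_initialization(const ThreadModel* t) {
        init_state = InitState::being_initialized;
        init_thread = t;
    }

    // Leaving being_initialized (success or error) clears the thread again.
    void finish_initialization(bool succeeded) {
        init_state = succeeded ? InitState::fully_initialized
                               : InitState::initialization_error;
        init_thread = nullptr;
    }

    // Short form of the check; `current` is assumed to be non-null.
    bool fast_check_passes(const ThreadModel* current) const {
        return init_state == InitState::fully_initialized || init_thread == current;
    }

    // Long form that also tests the state explicitly. With the invariant
    // maintained above, both forms give the same answer.
    bool explicit_check_passes(const ThreadModel* current) const {
        return init_state == InitState::fully_initialized ||
               (init_state == InitState::being_initialized && init_thread == current);
    }
};
```

In this model a class whose initialization failed no longer passes the short check from the thread that attempted the initialization, because the thread field was cleared on the state change, so no additional state comparison is needed on the fast path.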