From fairoz.matte at oracle.com Fri Mar 1 00:17:06 2019 From: fairoz.matte at oracle.com (Fairoz Matte) Date: Thu, 28 Feb 2019 16:17:06 -0800 (PST) Subject: [13] RFR (S): 8219513: [TESTBUG] : compiler/codegen/aes/TestCipherBlockChainingEncrypt.java timeout on Solaris-sparc In-Reply-To: <3ef52bf7-9d39-333d-e1f5-1e8a1988c0c5@oracle.com> References: <3ef52bf7-9d39-333d-e1f5-1e8a1988c0c5@oracle.com> Message-ID: Thanks Vladimir. > -----Original Message----- > From: Vladimir Kozlov > Sent: Friday, March 01, 2019 1:05 AM > To: Fairoz Matte ; Igor Ignatyev > > Cc: hotspot compiler > Subject: Re: [13] RFR (S): 8219513: [TESTBUG] : > compiler/codegen/aes/TestCipherBlockChainingEncrypt.java timeout on > Solaris-sparc > > Looks good. > > Thanks, > Vladimir > > On 2/28/19 1:05 AM, Fairoz Matte wrote: > > Hi Vladimir, > > > > Thanks for the review. > > > >> -----Original Message----- > >> From: Vladimir Kozlov > >> Sent: Wednesday, February 27, 2019 11:17 PM > >> To: Fairoz Matte ; Igor Ignatyev > >> > >> Cc: hotspot compiler > >> Subject: Re: [13] RFR (S): 8219513: [TESTBUG] : > >> compiler/codegen/aes/TestCipherBlockChainingEncrypt.java timeout on > >> Solaris-sparc > >> > >> Hi Fairoz, > >> > >> I am fine with reduced number of iterations but did you verified that > >> intrinsics are still used/generated in execute() method? > > > > Yes I checked that using -XX:+PrintIntrinsics during testing. > >> > >> And I think the test is missing intrinsic availability check on > >> testing system - see 8207153 changes for TestBase64.java, for example > [1]. > > > > Thanks for the pointer, I have updated the webrev > > http://cr.openjdk.java.net/~fmatte/8219513/webrev.02/ > > > > Thanks, > > Fairoz > >> > >> Thanks, > >> Vladimir > >> > >> [1] http://hg.openjdk.java.net/jdk/jdk/rev/ae001a1deb74#l2.1 > >> > >> On 2/26/19 8:36 PM, Fairoz Matte wrote: > >>> Hi Igor, > >>> > >>> Thanks for the review, below is the webrev with suggested changes. 
> >>> http://cr.openjdk.java.net/~fmatte/8219513/webrev.01/ > >>> > >>> Testing Mach5 hs-tier1 - 3 (Observed that test case passes in > >>> product and debug builds) > >>> > >>> Thanks, > >>> Fairoz > >>> > >>>> -----Original Message----- > >>>> From: Igor Ignatyev > >>>> Sent: Monday, February 25, 2019 10:46 PM > >>>> To: Fairoz Matte > >>>> Cc: hotspot compiler > >>>> Subject: Re: [13] RFR (S): 8219513: [TESTBUG] : > >>>> compiler/codegen/aes/TestCipherBlockChainingEncrypt.java timeout > on > >>>> Solaris-sparc > >>>> > >>>> Hi Fairoz, > >>>> > >>>> reduction loop iterations to 2k is good. > >>>> > >>>> adjustment of timeout for slow debug build (or any other type of vm > >>>> configuration) should not be done at test-level, and should be done > >>>> via TIMEOUT_FACTOR make var. instead. > >>>> > >>>> Thanks, > >>>> -- Igor > >>>> > >>>>> On Feb 25, 2019, at 8:29 AM, Fairoz Matte > >>>>> > >>>> wrote: > >>>>> > >>>>> Hi, > >>>>> > >>>>> http://cr.openjdk.java.net/~fmatte/8219513/webrev.00/ > >>>>> https://bugs.openjdk.java.net/browse/JDK-8219513 > >>>>> > >>>>> Please review this tiny change to adjust the timeout observed on > >>>>> debug > >>>> builds. > >>>>> Single test iteration exercises the aes intrinsic several times. > >>>>> There are encrypt and decrypt calls in the loop that all use aes. > >>>>> Reducing count to 2_000 iterations still triggers the compilation. > >>>>> > >>>>> Test case uses timeout of 300 seconds is not sufficient for slow > >>>>> debug > >>>> builds. > >>>>> That has been adjusted to 600 seconds. 
> >>>>>
> >>>>> Thanks,
> >>>>> Fairoz
> >>>>

From fujie at loongson.cn Fri Mar 1 00:56:30 2019
From: fujie at loongson.cn (Jie Fu)
Date: Fri, 1 Mar 2019 08:56:30 +0800
Subject: RFR (trivial): 8219919: RuntimeStub's name lost with PrintFrameConverterAssembly
In-Reply-To: <944da3d8-4a5c-ae02-6ed8-3c5cc2bc11f3@oracle.com>
References: <7c9d8242-389a-0b90-57cc-b9f2c8ee588b@loongson.cn> <944da3d8-4a5c-ae02-6ed8-3c5cc2bc11f3@oracle.com>
Message-ID:

Thank you so much, Vladimir.

I would really appreciate it if someone could help to sponsor it.
Thanks in advance.

Best regards,
Jie

On 2019/3/1 3:36 AM, Vladimir Kozlov wrote:
> Looks good.
>
> thanks,
> Vladimir
>
> On 2/28/19 4:29 AM, Jie Fu wrote:
>> Hi all,
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8219919
>>
>> The RuntimeStub's name is lost when dumping C2's runtime stub with
>> -XX:+PrintFrameConverterAssembly.
>> However, it does exist when dumping with -XX:+PrintStubCode.
>>
>> It would be more friendly if the stub's name was dumped as well for
>> JVM debuggers with -XX:+PrintFrameConverterAssembly.
>>
>> It can be fixed by
>> ---------------------------------------
>> diff -r 56089cf6152c src/hotspot/share/opto/output.cpp
>> --- a/src/hotspot/share/opto/output.cpp Tue Feb 26 05:46:02 2019 -0800
>> +++ b/src/hotspot/share/opto/output.cpp Thu Feb 28 19:52:40 2019 +0800
>> @@ -1556,6 +1556,8 @@
>>        }
>>        if (method() != NULL) {
>>          method()->print_metadata();
>> +      } else if (stub_name() != NULL) {
>> +        tty->print_cr("Generating RuntimeStub - %s", stub_name());
>>        }
>>        dump_asm(node_offsets, node_offset_limit);
>>        if (xtty != NULL) {
>> ---------------------------------------
>>
>> The change has been tested on Linux/x64.
>> Could you please review it?
>> Thanks a lot.
>> >> Best regards, >> Jie >> >> From vivek.r.deshpande at intel.com Fri Mar 1 01:23:44 2019 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Fri, 1 Mar 2019 01:23:44 +0000 Subject: RFR(XS):8216580:X86: Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic In-Reply-To: <5cc2946e-7770-0323-6f63-405e7e539fd6@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A9A14A6DA@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9A15F100@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4006FA@ORSMSX106.amr.corp.intel.com> <5cc2946e-7770-0323-6f63-405e7e539fd6@oracle.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A9F42D917@ORSMSX106.amr.corp.intel.com> Hi Vladimir Thanks for your inputs. I have made the changes according to your suggestion. The webrev is here: http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.02/ This addresses the questions you had raised. With this patch the checks are applied to all the nodes but returns true only in case of muladds2i. Regards, Vivek -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Wednesday, February 13, 2019 12:29 PM To: Deshpande, Vivek R ; 'Tobias Hartmann' ; 'hotspot-compiler-dev at openjdk.java.net compiler' Cc: Viswanathan, Sandhya ; Raj, Guru Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic Hi Vivek, Most of new checks are loop invariant: !s1_ctrl_inv and !s1_ctrl->is_RangeCheck() I think you don't need to search for is_muladds2i() if those checks return false. Most general question is: why it should apply only to muladds2i nodes only? Can we do the same for others? Thanks, Vladimir On 2/8/19 2:17 PM, Deshpande, Vivek R wrote: > Hi Vladimir > > Would you please take a look at this patch. > > The Adjacent LoadS have different control RangeCheck node for accesses of type a[2i] and a[2i+1]. 
> This patch allows those nodes to be isomorphic as they belong same counted loop and MulAddS2I nodes. > > Webrev: > http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.01/ > Bug ID: > https://bugs.openjdk.java.net/browse/JDK-8216580 > > Regards, > Vivek > > -----Original Message----- > From: Deshpande, Vivek R > Sent: Monday, January 28, 2019 9:45 AM > To: Tobias Hartmann ; > hotspot-compiler-dev at openjdk.java.net compiler > ; Vladimir Kozlov > > Cc: Viswanathan, Sandhya ; Raj, Guru > > Subject: RE: RFR(XS):8216580:X86: Fix generation of VNNI vector code > by allowing adjacent LoadS nodes to be isomorphic > > Hi Vladimir > > Would you please take a look at the patch. > The Adjacent LoadS have different control RangeCheck node for accesses of type a[2i] and a[2i+1]. > This patch allows those nodes to be isomorphic as they belong same counted loop and MulAddS2I nodes. > > Webrev: > http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.01/ > > Regards, > Vivek > > -----Original Message----- > From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] > Sent: Tuesday, January 15, 2019 2:57 AM > To: Deshpande, Vivek R ; > hotspot-compiler-dev at openjdk.java.net compiler > > Cc: Vladimir Kozlov ; Viswanathan, Sandhya > ; Raj, Guru > Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code > by allowing adjacent LoadS nodes to be isomorphic > > Hi Vivek, > > please add parentheses around the == comparison in lines 1225,1226. > > Otherwise this looks reasonable to me but I'm not too familiar with that code. > > Best regards, > Tobias > > On 12.01.19 01:03, Deshpande, Vivek R wrote: >> Hi Tobias >> >> The webrev for the bug JDK-821650 is here: >> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.00/ >> This fixes generation of vector code by allowing adjacent LoadS nodes to be isomorphic when they have different control RangeCheck nodes for a[i] and a[i+1] accesses in same MulAddS2I node. >> Could you please review it. 
>> >> Regards, >> Vivek >> >> -----Original Message----- >> From: Deshpande, Vivek R >> Sent: Friday, January 11, 2019 11:38 AM >> To: 'Tobias Hartmann' ; >> hotspot-compiler-dev at openjdk.java.net compiler >> >> Cc: Vladimir Kozlov ; Viswanathan, >> Sandhya ; Raj, Guru >> >> Subject: RE: RFR(S):8216050:X86: Fix for Superword optimization fails >> with assert(0 <= i && i < _len) failed: illegal index >> >> Hi Tobias >> >> Thanks for reviewing the patch. >> I have made the changes according to your suggestion. >> In this webrev: >> http://cr.openjdk.java.net/~vdeshpande/8216050/webrev.01/ >> I have fix for the crash reported in the 8216050. >> >> The lower cost is needed for generation of vpdpwssd instruction, by combining AddVI and MulAddVS2VI. >> For other instructions pmaddwd and vpmaddwd, they get generated on platforms upto skylake with default cost. >> >> I have updated the bug also with the link to webrev. >> >> I have created a different bug JDK-8216580 for >> 3) Fix generation of vector code by allowing adjacent LoadS nodes to be isomorphic when they have different control RangeCheck nodes >> for a[i] and a[i+1] accesses in same MulAddS2I node >> >> Thank you. >> Regards, >> Vivek >> >> -----Original Message----- >> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] >> Sent: Friday, January 11, 2019 4:49 AM >> To: Deshpande, Vivek R ; >> hotspot-compiler-dev at openjdk.java.net compiler >> >> Cc: Vladimir Kozlov ; Viswanathan, >> Sandhya ; Raj, Guru >> >> Subject: Re: RFR(S):8216050:X86: Fix for Superword optimization fails >> with assert(0 <= i && i < _len) failed: illegal index >> >> Hi Vivek, >> >> On 11.01.19 07:58, Deshpande, Vivek R wrote: >>> 1) Fix for the crash by matching the operand by swapping to right positions. >> >> Looks good but the change to loopopts.cpp:530 screwed up the indentation around the ifs, please fix. >> >>> 2) Cost based generation of vpdpwssd instruction. 
>>
>> Other instructions added by JDK-8214751 still miss a cost definition, for example:
>> http://hg.openjdk.java.net/jdk/jdk/rev/4bb6e0871bf7#l5.20
>>
>>> 3) Fix generation of vector code by allowing adjacent LoadS nodes to
>>> be isomorphic when they have different control RangeCheck nodes
>>>     for a[i] and a[i+1] accesses in same MulAddS2I node
>>
>> This is unrelated to the original bug, right? If so, this should be integrated with a separate RFE.
>>
>> Thanks,
>> Tobias
>>

From jatin.bhateja at intel.com Fri Mar 1 02:35:02 2019
From: jatin.bhateja at intel.com (Bhateja, Jatin)
Date: Fri, 1 Mar 2019 02:35:02 +0000
Subject: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request
In-Reply-To: <04476179-590e-9315-667c-cc6885477194@oracle.com>
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <45438f90-dcac-7941-1cf5-366555821e2e@redhat.com> <3eb77fb3-db71-8e57-a9f3-ebf635a0291c@redhat.com> <1341b8ab-1ab1-0270-86c4-5a4ac4945d03@oracle.com> <35c1db6b-a238-1e1e-9986-3d1a31b00bc2@redhat.com> <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com> <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <04476179-590e-9315-667c-cc6885477194@oracle.com>
Message-ID:

Hi Pengfei,

Please find my response in following mail.

Best Regards,
Jatin

> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Friday, March 1, 2019 12:57 AM
> To: Pengfei Li (Arm Technology China) ; Bhateja, Jatin
> ; B. Blaser ; aarch64-port-
> dev at openjdk.java.net
> Cc: hotspot-compiler-dev at openjdk.java.net; Viswanathan, Sandhya
>
> Subject: Re: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point
> Math.min/max intrinsics, approval request
>
> Thank you, Pengfei
>
> Then lets keep branch prediction heuristic shared. I take back my previous
> suggestion to have a function for it.
>
> Jatin, can you answer Pengfei's question about your change?
>
> Thanks,
> Vladimir
>
> On 2/27/19 10:45 PM, Pengfei Li (Arm Technology China) wrote:
> > Hi Vladimir, Jatin and All,
> >
> >> So I have question for aarch64 developers. Are aarch64 fmin/fmax
> >> instructions always faster than code generated by default? If
> >> this is true new conditions should be x86 specific. To have a
> >> separate function to do these checks. We have precedent -
> >> clear_upper_avx(). Maybe later we have to add other conditions for
> >> other platforms too.
> >
> > I am the author of original AArch64 fmin/fmax intrinsics patch[1], but not a
> > reviewer.
> >
> > Both Andrew Haley and I have tested the performance of AArch64
> > fmin/fmax instructions before. As far as I could remember, the result is
> > similar to what we have seen here on x86. If selecting the min/max values
> > from an array of random numbers, fmin/fmax instructions show better
> > performance. But for an already (almost) sorted array, fmin/fmax
> > instructions do make the performance worse, but not too much. So
> > personally I think, adding heuristic in shared code would benefit AArch64 as
> > well.
> >
> > I didn't quite understand Jatin's additional code below.
> > --
> > +#ifdef X86
> > +  // Being conservative since all the phi edges may not be set
> > +  // by now. This is done to skip over reduction scenarios.
> > +  if (a->is_Phi() || b->is_Phi())
> > +    return false;
> > +#endif
> > --
> > Is it going to black out *all* reduction scenarios? I see the intrinsics benefit
> > the reduction in some cases. And in my opinion, adding this kind of
> > platform-dependent macros in hotspot shared code is not so good.

Proposed check was added based on the common reduction scenario cases which showed performance degradation with new intrinsic sequence for X86.
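
To make the term concrete: a "reduction scenario" here presumably means a loop that folds an array into a single value through Math.min/max, so that one input of each call is the value carried over from the previous iteration (which the compiler typically represents as a Phi at the loop head). The following is only an illustrative sketch; the class and method names are made up and are taken neither from the patch nor from its benchmarks.

```java
import java.util.Arrays;

public class MinMaxShapes {
    // Reduction: 'max' is carried from one iteration to the next, so one
    // operand of Math.max is a loop-carried value.
    public static float maxReduce(float[] a) {
        float max = Float.NEGATIVE_INFINITY;
        for (int i = 0; i < a.length; i++) {
            max = Math.max(max, a[i]);
        }
        return max;
    }

    // Element-wise: both operands are fresh loads in every iteration,
    // nothing is carried across iterations.
    public static void maxElementwise(float[] a, float[] b, float[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = Math.max(a[i], b[i]);
        }
    }

    public static void main(String[] args) {
        float[] a = {1f, 5f, 3f};
        float[] b = {4f, 2f, 6f};
        float[] out = new float[3];
        maxElementwise(a, b, out);
        System.out.println(maxReduce(a));
        System.out.println(Arrays.toString(out));
    }
}
```

Under that reading, a check of the form `a->is_Phi() || b->is_Phi()` would be expected to skip loops shaped like maxReduce while leaving loops shaped like maxElementwise untouched.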
> >
> > [1] http://hg.openjdk.java.net/jdk/jdk/rev/f15af1e2c683
> >
> > --
> > Thanks,
> > Pengfei
> >

From adinn at redhat.com Fri Mar 1 09:44:31 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 1 Mar 2019 09:44:31 +0000
Subject: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request
In-Reply-To:
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <1341b8ab-1ab1-0270-86c4-5a4ac4945d03@oracle.com> <35c1db6b-a238-1e1e-9986-3d1a31b00bc2@redhat.com> <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com> <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <04476179-590e-9315-667c-cc6885477194@oracle.com>
Message-ID:

On 01/03/2019 02:35, Bhateja, Jatin wrote:
>>> I didn't quite understand Jatin's additional code below.
>>> --
>>> +#ifdef X86
>>> +  // Being conservative since all the phi edges may not be set
>>> +  // by now. This is done to skip over reduction scenarios.
>>> +  if (a->is_Phi() || b->is_Phi())
>>> +    return false;
>>> +#endif
>>> --
>>> Is it going to black out *all* reduction scenarios? I see the intrinsics benefit
>>> the reduction in some cases. And in my opinion, adding this kind of
>>> platform-dependent macros in hotspot shared code is not so good.
>
> Proposed check was added based on the common reduction scenario cases which showed
> performance degradation with new intrinsic sequence for X86.

That doesn't actually clarify things very well. Are you saying:

1a) your patch disables FPMinMax reduction for all architectures?

or

1b) your patch disables FPMinMax reduction for x86?

and

2a) it does so because when reduction is enabled x86 fails to show
performance improvement for applications of reduction?

or

2b) it does so because when reduction is enabled x86 fails to show
performance improvement for selection of the FPMin/Max intrinsic?

I think you are saying 1a and 2b but I'd prefer to be sure.
I would like a clear answer because Pengfei has a pending patch which shows significant benefit on AArch64 using first the FPMin/Max intrinsic and then, for extra gain, FPMin/Max reduction. My own investigations have not shown any detrimental effect to using the intrinsic or reduction and Andrew Haley seems to have withdrawn the claim that the intrinsic can worsen performance. So, it is quite important to understand what your patch does and why.

If there is some other way to avoid the slowdown on x86 (whether that comes with use of the intrinsic or with use of reduction) without clobbering the gains to be had on AArch64 then that would be preferable.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From jatin.bhateja at intel.com Fri Mar 1 10:20:56 2019
From: jatin.bhateja at intel.com (Bhateja, Jatin)
Date: Fri, 1 Mar 2019 10:20:56 +0000
Subject: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request
In-Reply-To:
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <1341b8ab-1ab1-0270-86c4-5a4ac4945d03@oracle.com> <35c1db6b-a238-1e1e-9986-3d1a31b00bc2@redhat.com> <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com> <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <04476179-590e-9315-667c-cc6885477194@oracle.com>
Message-ID:

Hi Andrew,

Please see my response embedded in following mail.

Thanks,
Jatin

> -----Original Message-----
> From: Andrew Dinn [mailto:adinn at redhat.com]
> Sent: Friday, March 1, 2019 3:15 PM
> To: Bhateja, Jatin ; Vladimir Kozlov
> ; Pengfei Li (Arm Technology China)
> ; B.
Blaser ; aarch64-port-
> dev at openjdk.java.net
> Cc: hotspot-compiler-dev at openjdk.java.net
> Subject: Re: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point
> Math.min/max intrinsics, approval request
>
> On 01/03/2019 02:35, Bhateja, Jatin wrote:
>
> >>> I didn't quite understand Jatin's additional code below.
> >>> --
> >>> +#ifdef X86
> >>> +  // Being conservative since all the phi edges may not be set
> >>> +  // by now. This is done to skip over reduction scenarios.
> >>> +  if (a->is_Phi() || b->is_Phi())
> >>> +    return false;
> >>> +#endif
> >>> --
> >>> Is it going to black out *all* reduction scenarios? I see the
> >>> intrinsics benefit the reduction in some cases. And in my opinion,
> >>> adding this kind of platform-dependent macros in hotspot shared
> >>> code is not so good.
> >
> > Proposed check was added based on the common reduction scenario cases
> > which showed performance degradation with new intrinsic sequence for
> > X86.
> That doesn't actually clarify things very well. Are you saying:
>
> 1a) your patch disables FPMinMax reduction for all architectures?
>
> or
>
> 1b) your patch disables FPMinMax reduction for x86?
>
> and
>
> 2a) it does so because when reduction is enabled x86 fails to show
> performance improvement for applications of reduction?
>
> or
>
> 2b) it does so because when reduction is enabled x86 fails to show
> performance improvement for selection of the FPMin/Max intrinsic?
>

The current patch under review does not contain the above code change to bypass intrinsic creation for reduction patterns. On X86, performance degrades with the intrinsic relative to the non-intrinsic implementation in reduction scenarios, both with and without data variance (i.e. with and without branch prediction effects). I could not find the right hooks that can be called from common code to add such target-specific checks during ideal (DAG) construction. Please share if you know of any.

> I think you are saying 1a and 2b but I'd prefer to be sure.
I would like a clear
> answer because Pengfei has a pending patch which shows significant benefit
> on AArch64 using first the FPMin/Max intrinsic and then, for extra gain,
> FPMin/Max reduction. My own investigations have not shown any detrimental
> effect to using the intrinsic or reduction and Andrew Haley seems to have
> withdrawn the claim that the intrinsic can worsen performance. So, it is quite
> important to understand what your patch does and why.
>
> If there is some other way to avoid the slowdown on x86 (whether that
> comes with use of the intrinsic or with use of reduction) without clobbering
> the gains to be had on AArch64 then that would be preferable.
>
> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From goetz.lindenmaier at sap.com Fri Mar 1 11:28:24 2019
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 1 Mar 2019 11:28:24 +0000
Subject: 8219582: PPC: Crash after C1 checkcast patched and GC
In-Reply-To:
References:
Message-ID:

Hi Martin,

Thanks for addressing this issue!

Well, this shuffling of registers is quite complex and error-prone. Forcing the register allocation to have 5 different registers seems to be the safer fix. But I checked, about 10% of the cases in jvm98 only use 4 registers here, and I would consider this a significant amount to justify the complexity.

For me, all the renaming and assignment of conditions makes the code harder to read (e.g., keep_object_alive encodes checkcast && reg_conflict, i.e., it rather means "must_move_obj" ...), as do the names of the registers (k_RInfo, klass_RInfo ... what's the idea behind these names?). But I understand you want to keep this similar to the code on other platforms, which is also desirable.

So consider this reviewed.

Best regards,
  Goetz.

> -----Original Message-----
> From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Doerr, Martin
> Sent: Tuesday, February 26, 2019 6:51 PM
> To: Anton Kozlov ; 'hotspot-compiler-
> dev at openjdk.java.net'
> Subject: [CAUTION] RE: 8219582: PPC: Crash after C1 checkcast patched and
> GC
>
> Hi Anton,
>
> I noticed that my fix had missed that the Runtime1::slow_subtype_check_id
> stub needs fixed registers. So I need to undo the register changes before
> and after that call.
>
> Quite messy. Now I remember why I didn't want to do it this way originally.
>
> I just added the shuffling in place:
> http://cr.openjdk.java.net/~mdoerr/8219582_ppc64_c1_fix/webrev.01/
>
> I'll run tests and try to find somebody for a 2nd review.
>
> Best regards,
> Martin
>
>
> -----Original Message-----
> From: Anton Kozlov
> Sent: Montag, 25. Februar 2019 16:25
> To: Doerr, Martin ; 'hotspot-compiler-
> dev at openjdk.java.net'
> Subject: Re: 8219582: PPC: Crash after C1 checkcast patched and GC
>
> Hi, Martin,
>
> my bad, the null check looked like optimization (why to resolve if the klass
> doesn't matter), but it's actually specified
>
>    If objectref is null, then the operand stack is unchanged.
>    Otherwise, the named class, array, or interface type is resolved ...
>
> Your patch looks correct; reproducer does not fail anymore.
>
> A very minor note:
> >    } else if (code == lir_checkcast) {
> >      Label success, failure;
> > -    emit_typecheck_helper(op, &success, /*fallthru*/&failure, &success); // Moves obj to dst.
> > +    emit_typecheck_helper(op, &success, /*fallthru*/&failure, &success);
> >      __ b(*op->stub()->entry());
> >      __ align(32, 12);
> >      __ bind(success);
> > +    __ mr(op->result_opr()->as_register(), op->object()->as_register());
> shouldn't __mr_if_needed be here?
As it was before, obj and dst can be > same register: > > > __ cmpdi(CCR0, obj, 0); > > - if (move_obj_to_dst || reg_conflict) { > > - __ mr_if_needed(dst, obj); > ^^^ here > > - if (reg_conflict) { obj = dst; } > > - } > > > > Thanks for fixing the patch! > -- Anton > > On 25.02.2019 17:20, Doerr, Martin wrote: > > Hi Anton, > > > > your proposal fixes the issue, but introduces another one: > > We must not use the patching stub before the null check (resolving is not > allowed at this place). > > JCK tests exist which verify this: > > vm/instr/checkcast > > vm/instr/instanceof > > > > So I suggest to fix it this way: > > http://cr.openjdk.java.net/~mdoerr/8219582_ppc64_c1_fix/webrev.01/ > > It is closer to the x86 implementation. > > > > Can you verify this proposal, please? > > > > Thanks again for your helpful analysis of the problem. > > > > Best regards, > > Martin > > > > > > -----Original Message----- > > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Doerr, Martin > > Sent: Freitag, 22. Februar 2019 17:12 > > To: 'hotspot-compiler-dev at openjdk.java.net' dev at openjdk.java.net>; Anton Kozlov > > Subject: [CAUTION] RFR: 8219582: PPC: Crash after C1 checkcast patched > and GC > > > > Hi Anton, > > > > reposting on hotspot-compiler-dev. > > Thanks for analyzing the issue and for providing a fix. I'll take a closer look > next week. > > > > Best regards, > > Martin > > > > > > -----Original Message----- > > From: hotspot-runtime-dev bounces at openjdk.java.net> On Behalf Of Anton Kozlov > > Sent: Freitag, 22. 
Februar 2019 16:05
> > To: hotspot-runtime-dev at openjdk.java.net
> > Subject: RFR: 8219582: PPC: Crash after C1 checkcast patched and GC
> >
> > Hi,
> >
> > bug: https://bugs.openjdk.java.net/browse/JDK-8219582
> > webrev: http://cr.openjdk.java.net/~akozlov/8219582/webrev.00/
> >
> > PPC C1 checkcast implementation overcomes possible object-to-check and
> > temp registers conflict by using destination register as temp to store the
> > object. It usually works, but after object moved to dst and before checkcast
> > completed, safepoint may occur because of implicit runtime call from
> > klass2reg_with_patching. During the call, oop in dst register is not visible to
> > GC, so it will not be updated after GC moved objects.
> >
> > Please review the fix, that is to load klass (and may be call runtime) at
> > beginning of the LIR instruction, when all oops are in place expected by GC.
> >
> > Thanks,
> > Anton
> >

From magnus.ihse.bursie at oracle.com Fri Mar 1 14:25:43 2019
From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie)
Date: Fri, 1 Mar 2019 15:25:43 +0100
Subject: RFR (trivial): 8219519: Remove linux_sparc.ad and linux_aarch64.ad
In-Reply-To: <44494f4f-26ef-3b9e-b38e-eca8178aba89@loongson.cn>
References: <900ea372-d9fd-e1fd-051d-3d96e1ec2c66@loongson.cn> <9c080917-84eb-ba84-5951-f159b103e6ed@oracle.com> <44494f4f-26ef-3b9e-b38e-eca8178aba89@loongson.cn>
Message-ID:

On 2019-02-27 03:25, Jie Fu wrote:
> Hi Tobias,
>
> Thanks a lot for your review.
>
> It's a bit difficult for me to test this patch since I don't have a
> sparc or arm machine.
> I've analyzed the adlc processing logic in
> make/hotspot/gensrc/GensrcAdlc.gmk finding that ad-files under
> ./src/hotspot/os_cpu/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH)
> are optional.

What do you mean by "optional"? The build code does this:

  ##############################################################################
  # Concatenate all ad source files into a single file, which will be fed to
  # adlc.

...

  AD_SRC_FILES := $(call uniq, $(wildcard $(foreach d, $(AD_SRC_ROOTS), \
      $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_CPU).ad \
      $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_CPU_ARCH).ad \
      $d/os_cpu/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH).ad \
    )))

so it will definitely pick up both those files and use it in creating the concatenated ad file.

That being said, maybe this is not the correct behavior. I see that the linux_sparc.ad file is essentially empty, so you can probably remove that. The aarch64 file otoh seems to contain valid code. I would not presume that you can just remove it!

/Magnus

> Since both linux_sparc.ad and linux_aarch64.ad are useless for the
> generation of C2, it would be better to remove them.
>
> I'll try my best to test it.
>
> By the way, I would really appreciate it if someone with sparc or aarch64
> development environment could help to verify this change.
> Thanks in advance.
>
> Best regards,
> Jie
>
>
> On 2019/2/27 12:51 AM, Tobias Hartmann wrote:
>> Hi Jie,
>>
>> this looks good to me assuming that you have tested the change on
>> these platforms.
>>
>> Best regards,
>> Tobias
>>
>> On 21.02.19 10:35, Jie Fu wrote:
>>> Hi all,
>>>
>>> The following two source files are useless for the generation of C2
>>> and should be removed.
>>>   1) ./src/hotspot/os_cpu/linux_sparc/linux_sparc.ad
>>>   2) ./src/hotspot/os_cpu/linux_aarch64/linux_aarch64.ad
>>>
>>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8219519
>>> Webrev: http://cr.openjdk.java.net/~jiefu/8219519/webrev.00/
>>>
>>> Could you please review it?
>>> Thanks a lot.
>>>
>>> Best regards,
>>> Jie
>>>
>>>
>

From adinn at redhat.com Fri Mar 1 14:39:14 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 1 Mar 2019 14:39:14 +0000
Subject: RFR (trivial): 8219519: Remove linux_sparc.ad and linux_aarch64.ad
In-Reply-To:
References: <900ea372-d9fd-e1fd-051d-3d96e1ec2c66@loongson.cn> <9c080917-84eb-ba84-5951-f159b103e6ed@oracle.com> <44494f4f-26ef-3b9e-b38e-eca8178aba89@loongson.cn>
Message-ID: <213140b5-52f1-4d45-8198-d1f6a7676922@redhat.com>

On 01/03/2019 14:25, Magnus Ihse Bursie wrote:
> On 2019-02-27 03:25, Jie Fu wrote:
>> It's a bit difficult for me to test this patch since I don't have a
>> sparc or arm machine.
>> I've analyzed the adlc processing logic in
>> make/hotspot/gensrc/GensrcAdlc.gmk finding that ad-files under
>> ./src/hotspot/os_cpu/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH)
>> are optional.
> What do you mean by "optional"? The build code does this:
>
>   ##############################################################################
>   # Concatenate all ad source files into a single file, which will be fed to
>   # adlc.
>
> ...
>
>   AD_SRC_FILES := $(call uniq, $(wildcard $(foreach d, $(AD_SRC_ROOTS), \
>       $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_CPU).ad \
>       $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_CPU_ARCH).ad \
>       $d/os_cpu/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH).ad \
>     )))
>
> so it will definitely pick up both those files and use it in creating
> the concatenated ad file.

That's interesting because Pengfei Li claims he applied the patch and successfully built OpenJDK on AArch64.

https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2019-February/006975.html

Does the build system actually need those files to exist when it builds the concatenated file?

> That being said, maybe this is not the correct behavior.

Well, something sounds fishy.

> I see that the linux_sparc.ad file is essentially empty, so you can
> probably remove that.
The aarch64 file otoh seems to contain valid code. > I would not presume that you can just remove it! He is ok to remove it as far as any contents are concerned. Indeed, I told him this was ok in a review in the above thread after Pengfei reported that OpenJDK built without the file being present. As to the contents, the encoding defined in that file is completely redundant (I don't really know how it got there as I don't believe it was ever used) regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From glaubitz at physik.fu-berlin.de Fri Mar 1 14:29:46 2019 From: glaubitz at physik.fu-berlin.de (John Paul Adrian Glaubitz) Date: Fri, 1 Mar 2019 15:29:46 +0100 Subject: RFR (trivial): 8219519: Remove linux_sparc.ad and linux_aarch64.ad In-Reply-To: References: <900ea372-d9fd-e1fd-051d-3d96e1ec2c66@loongson.cn> <9c080917-84eb-ba84-5951-f159b103e6ed@oracle.com> <44494f4f-26ef-3b9e-b38e-eca8178aba89@loongson.cn> Message-ID: <004720da-c137-433f-19f1-708fe8f74a0b@physik.fu-berlin.de> Hello! On 3/1/19 3:25 PM, Magnus Ihse Bursie wrote: >> It's a bit difficult for me to test this patch since I don't have a sparc or arm machine. Both SPARC and AArch64 machines running Linux can be accessed through the gcc compile farm. Any open source developer can request an account for these machines. See: > https://gcc.gnu.org/wiki/CompileFarm > https://cfarm.tetaneutral.net/machines/list/ I'm admin for the sparc64 box running Linux in case someone needs any particular package to be installed. Thanks, Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaubitz at debian.org `. 
`' Freie Universitaet Berlin - glaubitz at physik.fu-berlin.de `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913 From magnus.ihse.bursie at oracle.com Fri Mar 1 14:47:13 2019 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 1 Mar 2019 15:47:13 +0100 Subject: RFR (trivial): 8219519: Remove linux_sparc.ad and linux_aarch64.ad In-Reply-To: <213140b5-52f1-4d45-8198-d1f6a7676922@redhat.com> References: <900ea372-d9fd-e1fd-051d-3d96e1ec2c66@loongson.cn> <9c080917-84eb-ba84-5951-f159b103e6ed@oracle.com> <44494f4f-26ef-3b9e-b38e-eca8178aba89@loongson.cn> <213140b5-52f1-4d45-8198-d1f6a7676922@redhat.com> Message-ID: On 2019-03-01 15:39, Andrew Dinn wrote: > On 01/03/2019 14:25, Magnus Ihse Bursie wrote: >> On 2019-02-27 03:25, Jie Fu wrote: >>> It's a bit difficult for me to test this patch since I don't have a >>> sparc or arm machine. >>> I've analyzed the adlc processing logic in >>> make/hotspot/gensrc/GensrcAdlc.gmk finding that ad-files under >>> ./src/hotspot/os_cpu/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH) >>> are optional. >> What do you mean by "optional"? The build code does this: >> >> >> ############################################################################## >> ? # Concatenate all ad source files into a single file, which will be fed to >> ? # adlc. >> >> ... >> >> ? AD_SRC_FILES := $(call uniq, $(wildcard $(foreach d, $(AD_SRC_ROOTS), \ >> ????? $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_CPU).ad \ >> ????? $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_CPU_ARCH).ad \ >> >> $d/os_cpu/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH).ad >> \ >> ??? ))) >> >> so it will definitely pick up both those files and use it in creating >> the concatenated ad file. > That's interesting because Pengfei Li claims he applied the patch and > successfully built OpenJDK on AArch64. 
> > > https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2019-February/006975.html > > Does the build system actually need those files to exist when it builds > the concatenated file? No, the build system does not "need" it. If it is not there, it is not included (nor reported MIA), but if it is there, it is included. > >> That being said, maybe this is not the correct behavior. > Well, something sounds fishy. > >> I see that the linux_sparc.ad file is essentially empty, so you can >> probably remove that. The aarch64 file otoh seems to contain valid code. >> I would not presume that you can just remove it! > He is ok to remove it as far as any contents are concerned. Indeed, I > told him this was ok in a review in the above thread after Pengfei > reported that OpenJDK built without the file being present. > > As to the contents, the encoding defined in that file is completely > redundant (I don't really know how it got there as I don't believe it > was ever used) Ok, it might very well be the case that the file is not needed since it's contents is redundant. I can't say anything about that; that's the domain of the adlc experts. However, it is incorrect to claim that the build does not use file in question. But from the build PoV, it's perfectly fine to remove it if it's not needed. But just not on the grounds that it is not used by the build system! /Magnus > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander -------------- next part -------------- An HTML attachment was scrubbed... 
From adinn at redhat.com Fri Mar 1 16:01:07 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 1 Mar 2019 16:01:07 +0000
Subject: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request
In-Reply-To: <8bf4cc54-6e66-fab4-b3fe-4b026780924d@redhat.com>
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com>
 <3eb77fb3-db71-8e57-a9f3-ebf635a0291c@redhat.com>
 <1341b8ab-1ab1-0270-86c4-5a4ac4945d03@oracle.com>
 <35c1db6b-a238-1e1e-9986-3d1a31b00bc2@redhat.com>
 <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com>
 <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com>
 <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com>
 <7194e0cc-0f4f-7348-7b50-1347acbf9f92@redhat.com>
 <8bf4cc54-6e66-fab4-b3fe-4b026780924d@redhat.com>
Message-ID:

On 28/02/2019 12:38, Andrew Haley wrote:
> On 2/28/19 9:54 AM, Andrew Haley wrote:
>> On 2/27/19 8:21 PM, Vladimir Kozlov wrote:
>>
>>> So I have question for aarch64 developers. Are aarch64 fmin/fmax
>>> instructions are always faster than code generated by default?
>> Be aware that AArch64 is an abstract architecture, so it cannot be
>> said to have performance properties.
>>
>> In real hardware, however, the answer is no. Nothing like. I have seen
>> the fmin/fmax instructions cause a 3x slowdown on a reduction loop.
>
> So Andrew Dinn asked me what machine, and what test. After some time
> trying I confess that I cannot reproduce this result. I didn't think
> much of it at the time, which was why I didn't record that
> information. My apologies.

Ok, that's good to know. The tests I ran to check the benefits of
FPMax/Min intrinsics were for 3 different AArch64 CPUs (AppliedMicro,
Qualcomm and AMD) and they only showed a small degradation in
performance for the intrinsic with sorted data and a good improvement
with random data.

Also, I can now provide some details, including timings, for the
fpmin/max reduction tests I tried on Qualcomm and AMD.

I tested 3 separate implementations:

 1) only Pengfei's fpmin/max intrinsics with no reduction rules (novec)
 2) Pengfei's fpmin/max intrinsics plus reduction rules (vec)
 3) Pengfei's fpmin/max intrinsics plus my upgraded reduction rules
    (vecplus)

The difference between vec and vecplus is that Pengfei only uses the
vector reduction instruction fmaxv for reducing a 4 float vector (T4S).

instruct reduce_max4F(vRegF dst, vRegF src1, vecX src2) %{
  . . .
  ins_encode %{
    __ fmaxv(as_FloatRegister($dst$$reg), __ T4S,
             as_FloatRegister($src2$$reg));
    __ fmaxs(as_FloatRegister($dst$$reg), as_FloatRegister($dst$$reg),
             as_FloatRegister($src1$$reg));
  %}
  . . .

The fmaxv vector operation picks the max of the 4 new vector elements in
one step. The subsequent scalar compare picks it or the current
reduction value for the next cycle round the loop.

For the 2 float and 2 double reduction rules (T2S and T2D) Pengfei's
rules compare the 2 vector entries independently, using a vector
element pick and two scalar comparisons. Here is the double version of
Pengfei's encoding (the float version simply replaces D with F and d
with f)

instruct reduce_max2D(vRegD dst, vRegD src1, vecX src2, vecX tmp) %{
  . . .
  ins_encode %{
    __ fmaxd(as_FloatRegister($dst$$reg), as_FloatRegister($src1$$reg),
             as_FloatRegister($src2$$reg));
    __ ins(as_FloatRegister($tmp$$reg), __ D,
           as_FloatRegister($src2$$reg), 0, 1);
    __ fmaxd(as_FloatRegister($dst$$reg), as_FloatRegister($dst$$reg),
             as_FloatRegister($tmp$$reg));
  %}
  . . .

My alternative patch modifies the 2D rule to work like the 4S rule i.e.

instruct reduce_max2D(vRegD dst, vRegD src1, vecX src2, vecX tmp) %{
  . . .
  ins_encode %{
    __ fmaxv(as_FloatRegister($dst$$reg), __ T2D,
             as_FloatRegister($src2$$reg));
    __ fmaxd(as_FloatRegister($dst$$reg), as_FloatRegister($dst$$reg),
             as_FloatRegister($src1$$reg));
  %}
  . . .

There is a corresponding tweak to the 2F rule but it is somewhat
immaterial since I could not produce a test that would cause it to be
applied.
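
The difference between the two 2D encodings above can be modelled in
plain Java. This is only a sketch under stated assumptions: the class
and method names are hypothetical, Math.max stands in for the scalar and
across-lanes max instructions, and it says nothing about timing. It
merely illustrates that both ways of combining the running value with
the two vector lanes should produce the same result.

```java
public class ReduceMaxSketch {
    // Models the element-wise 2D rule: scalar max with lane 0,
    // copy lane 1 into a temporary, then a second scalar max.
    public static double elementwise(double src1, double[] src2) {
        double dst = Math.max(src1, src2[0]);
        double tmp = src2[1];
        return Math.max(dst, tmp);
    }

    // Models the alternative rule: one across-lanes max (the role the
    // vector reduction instruction plays), then one scalar max with
    // the running reduction value.
    public static double acrossLanes(double src1, double[] src2) {
        double dst = Math.max(src2[0], src2[1]);
        return Math.max(dst, src1);
    }

    public static void main(String[] args) {
        double[] lanes = {3.5, -1.0};
        System.out.println(elementwise(2.0, lanes));
        System.out.println(acrossLanes(2.0, lanes));
    }
}
```

The second form needs no temporary for the lane copy, which matches the
point made above about avoiding the extra element transfer.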

In fact, like Pengfei, I found it hard to come up with tests that caused
the reduction to be performed. The obvious example one would want to
work would be something like this:

  double da[] = ...
  double db[] = ...

  @Benchmark
  public void testVecMaxDoubleReduce2() {
    double max = 0.0;
    for (int z = 0; z < COUNT; z++) {
      for (int i = 0; i < LENGTH; i++) {
        max = Math.max(max, da[i]);
      }
    }
    dc[0] = max;
  }

Obviously there are 3 more equivalent benchmarks obtained by
substituting Math.min for Math.max and/or float for double.

For this test the max operations are translated to the FPMax intrinsic.
However, the reduction is not applied. That's because the compiler never
considers performing the da[i] loads in the loop body as vector loads.

Vectorized loading is only performed when two arrays are loaded and
combined using a binary operator. So, the following test does get
vectorized and, as a consequence, is then vector reduced.

  @Benchmark
  public void testVecMaxDoubleReduce3() {
    double max = 0.0;
    for (int z = 0; z < COUNT; z++) {
      for (int i = 0; i < LENGTH; i++) {
        max = Math.max(max, da[i] + db[i]);
      }
    }
    dc[0] = max;
  }

In this case the compiler spots that the adds refer to two elements of
da and db using the same index and decides that it can perform the add
as a 2D vector op. Now that it has the sum as a 2D value it is able to
use the 2D FPMax reduction rule to compute the value of the Math.max call.

This also works for the other 3 cases where a min and/or float type is
substituted.

I got the following result (in us/op) from running these tests on the
AMD box

Benchmark                 No Redn        Redn           Full Redn
testVecMaxDoubleReduce2   6042 ± 0.47    6041 ± 0.17    6042 ± 0.37
testVecMaxDoubleReduce3   6042 ± 0.57    6042 ± 0.55    3576 ± 143.86
testVecMaxFloatReduce2    6041 ± 0.05    6042 ± 0.47    6042 ± 0.38
testVecMaxFloatReduce3    6042 ± 0.31    1556 ± 17.25   1562 ± 21.96
testVecMinDoubleReduce2   6041 ± 0.05    6042 ± 0.34    6042 ± 0.39
testVecMinDoubleReduce3   6042 ± 0.36    6050 ± 7.20    3322 ± 23.41
testVecMinFloatReduce2    6042 ± 0.49    6042 ± 0.58    6042 ± 0.30
testVecMinFloatReduce3    6042 ± 0.44    1535 ± 3.24    1581 ± 61.56

The 3 columns show no reduction, reduction using Pengfei's T4S rule and
reduction using both T4S and T2D rules. In all 3 cases the FpMin/Max
intrinsic is enabled.

As you can see the Reduce2 tests don't get reduced. The Reduce3 tests do
get reduced with both sets of reduction rules (I verified this in the
debugger and by eyeballing the generated code). For the T4S cases there
is a clear improvement with both sets of rules. For the 2D case
Pengfei's rules don't give a better result but mine do (when you look at
the generated code it is clear that Pengfei's reduction rule is not
going to give a better result over the no reduction case).

Results on the Qualcomm box were pretty much identical.

n.b. the variation in the average values and spreads for some of the
float and double runs are misleading. My machine crashed while running
the tests remotely and I ended up perturbing the box I was running the
tests on when I re-established a login session and checked the progress
of the runs. Previous tests provided more consistent results where the
FloatReduce3 values for vec and vecplus were very close and had a very
small spread (as you would expect) and where the DoubleReduce3 values
for vecplus were also both close to 3300 with a very small spread.

So, these figures seem to show that the use of fmaxv reduction for both
T4S and T2D saves cycles by avoiding the need for extra vector register
element transfers and fpmax/min comparisons.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From adinn at redhat.com Fri Mar 1 16:06:13 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 1 Mar 2019 16:06:13 +0000
Subject: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request
In-Reply-To:
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com>
 <3eb77fb3-db71-8e57-a9f3-ebf635a0291c@redhat.com>
 <1341b8ab-1ab1-0270-86c4-5a4ac4945d03@oracle.com>
 <35c1db6b-a238-1e1e-9986-3d1a31b00bc2@redhat.com>
 <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com>
 <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com>
 <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com>
 <7194e0cc-0f4f-7348-7b50-1347acbf9f92@redhat.com>
 <8bf4cc54-6e66-fab4-b3fe-4b026780924d@redhat.com>
Message-ID: <9033fb77-b4c8-8e69-cbc2-9bc3afcdd6bb@redhat.com>

On 01/03/2019 16:01, Andrew Dinn wrote:
> My alternative patch modifies the 2D rule to work like the 4S rule i.e.
>
> instruct reduce_max2D(vRegD dst, vRegD src1, vecX src2, vecX tmp) %{
>   . . .
>   ins_encode %{
>     __ fmaxv(as_FloatRegister($dst$$reg), __ T2D,
>              as_FloatRegister($src2$$reg));
>     __ fmaxd(as_FloatRegister($dst$$reg), as_FloatRegister($dst$$reg),
>              as_FloatRegister($src1$$reg));
>   %}
>   . . .

Doh! One more tweak that could be applied here is, of course, to get rid
of that redundant 'vecX tmp' argument. It almost certainly won't affect
the current benchmark figures but it does free up another vector
register which might help some real FP code.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From bsrbnd at gmail.com Fri Mar 1 16:39:03 2019
From: bsrbnd at gmail.com (B.
Blaser)
Date: Fri, 1 Mar 2019 17:39:03 +0100
Subject: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request
In-Reply-To:
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com>
 <1341b8ab-1ab1-0270-86c4-5a4ac4945d03@oracle.com>
 <35c1db6b-a238-1e1e-9986-3d1a31b00bc2@redhat.com>
 <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com>
 <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com>
 <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com>
 <04476179-590e-9315-667c-cc6885477194@oracle.com>
Message-ID:

On Fri, 1 Mar 2019 at 10:44, Andrew Dinn wrote:
>
> If there is some other way to avoid the slowdown on x86 (whether that
> comes with use of the intrinsic or with use of reduction) without
> clobbering the gains to be had on AArch64 then that would be preferable.

Then, I'd like to suggest another alternative here under.

This is an optimization of the API generated code using only one
'ucomis[s/d]' vs five before. There's between 5-10% gain with the
current benchmark [1] for predictable, unpredictable and reduction
scenarios.

Before:
Benchmark                       Mode  Cnt  Score     Error  Units
FpMinMaxIntrinsics.dMax         avgt       8633.782         ns/op
FpMinMaxIntrinsics.dMin         avgt       8694.123         ns/op
FpMinMaxIntrinsics.dMinReduce   avgt        710.493         ns/op
FpMinMaxIntrinsics.fMax         avgt       8578.784         ns/op
FpMinMaxIntrinsics.fMin         avgt       8734.432         ns/op
FpMinMaxIntrinsics.fMinReduce   avgt        719.532         ns/op

After:
Benchmark                       Mode  Cnt  Score     Error  Units
FpMinMaxIntrinsics.dMax         avgt       8050.014         ns/op
FpMinMaxIntrinsics.dMin         avgt       8027.534         ns/op
FpMinMaxIntrinsics.dMinReduce   avgt        675.791         ns/op
FpMinMaxIntrinsics.fMax         avgt       8022.847         ns/op
FpMinMaxIntrinsics.fMin         avgt       7945.885         ns/op
FpMinMaxIntrinsics.fMinReduce   avgt        659.173         ns/op

I haven't observed any regression until now, so statistics aren't
necessary any more. There's no need to deal with legacy 'xmm'
registers and only one temporary integer register is necessary.
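
The control flow behind the single-compare approach can be sketched in
Java. This is an illustrative model, not the patch itself:
FpMinMaxSketch and minMax are hypothetical names, the branch order
follows the flag comments in the patch's emit_fp_min_max helper, and the
raw-bit OR/AND stands in for the vpor/vpand that the patch uses when the
operands compare equal (where only the sign of zero can differ).

```java
public class FpMinMaxSketch {
    // One ordered comparison decides every outcome; the branches mirror
    // the above / parity (unordered) / equal / below cases.
    public static float minMax(float a, float b, boolean min) {
        if (a > b) {                                  // "above"
            return min ? b : a;
        }
        if (Float.isNaN(a) || Float.isNaN(b)) {       // "parity": unordered
            return Float.NaN;
        }
        if (a == b) {                                 // "equal", incl. +0.0 vs -0.0
            int ia = Float.floatToRawIntBits(a);
            int ib = Float.floatToRawIntBits(b);
            // OR keeps a set sign bit (min prefers -0.0),
            // AND clears it unless both are negative (max prefers +0.0).
            return Float.intBitsToFloat(min ? (ia | ib) : (ia & ib));
        }
        return min ? a : b;                           // "below"
    }

    public static void main(String[] args) {
        System.out.println(minMax(0.0f, -0.0f, true));
        System.out.println(minMax(0.0f, -0.0f, false));
        System.out.println(minMax(1.0f, Float.NaN, true));
    }
}
```

Under these assumptions the model agrees with Math.min and Math.max for
NaN, infinities and signed zeros, which is the behaviour the intrinsic
has to preserve.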

Any feedback would be more than welcome (hotspot:tier1 is OK on x86_64 xeon).

Thanks,
Bernard

[1] http://hg.openjdk.java.net/jdk/submit/rev/ab2b1418f0db

diff --git a/src/hotspot/cpu/x86/x86_64.ad b/src/hotspot/cpu/x86/x86_64.ad
--- a/src/hotspot/cpu/x86/x86_64.ad
+++ b/src/hotspot/cpu/x86/x86_64.ad
@@ -808,6 +808,57 @@
   __ bind(done);
 }
 
+// fp min    # max
+// ucomis[s/d]
+//   ja  -> b     # a
+//   jp  -> NaN   # NaN
+//   je  -> a | b # a & b
+//   jb  -> a     # b
+void emit_fp_min_max(MacroAssembler& _masm, XMMRegister dst, XMMRegister a, XMMRegister b, Register tmp, bool min, bool single) {
+  Label nan, equal, above, done;
+
+  if (single)
+    __ ucomiss(a, b);
+  else
+    __ ucomisd(a, b);
+
+  __ jccb(Assembler::above, above); // CF=0 & ZF=0
+  __ jccb(Assembler::parity, nan);  // PF=1
+  __ jccb(Assembler::equal, equal); // ZF=1
+
+  // below
+  if (single)
+    __ movflt(dst, min ? a : b);
+  else
+    __ movdbl(dst, min ? a : b);
+  __ jmp(done);
+
+  __ bind(nan);
+  if (single) {
+    __ movl(tmp, 0x7fc00000); // Float.NaN
+    __ movdl(dst, tmp);
+  }
+  else {
+    __ mov64(tmp, 0x7ff8000000000000L); // Double.NaN
+    __ movdq(dst, tmp);
+  }
+  __ jmp(done);
+
+  __ bind(equal);
+  if (min)
+    __ vpor(dst, a, b, Assembler::AVX_128bit);
+  else
+    __ vpand(dst, a, b, Assembler::AVX_128bit);
+  __ jmp(done);
+
+  __ bind(above);
+  if (single)
+    __ movflt(dst, min ? b : a);
+  else
+    __ movdbl(dst, min ? b : a);
+
+  __ bind(done);
+}
 
 //=============================================================================
 const RegMask& MachConstantBaseNode::_out_RegMask = RegMask::Empty;
@@ -5470,6 +5521,63 @@
   ins_pipe( fpu_reg_reg );
 %}
 
+// max = java.lang.Math.max(float a, float b)
+instruct maxF_reg(regF dst, regF a, regF b, rRegI tmp) %{
+  predicate(UseAVX > 0);
+  match(Set dst (MaxF a b));
+  effect(USE a, USE b, TEMP tmp);
+
+  format %{ "$dst = max($a, $b)\t# intrinsic (float)" %}
+  ins_encode %{
+    emit_fp_min_max(_masm, $dst$$XMMRegister, $a$$XMMRegister, $b$$XMMRegister, $tmp$$Register,
+                    false /*min*/, true /*single*/);
+  %}
+  ins_pipe( pipe_slow );
+%}
+
+// max = java.lang.Math.max(double a, double b)
+instruct maxD_reg(regD dst, regD a, regD b, rRegL tmp) %{
+  predicate(UseAVX > 0);
+  match(Set dst (MaxD a b));
+  effect(USE a, USE b, TEMP tmp);
+
+  format %{ "$dst = max($a, $b)\t# intrinsic (double)" %}
+  ins_encode %{
+    emit_fp_min_max(_masm, $dst$$XMMRegister, $a$$XMMRegister, $b$$XMMRegister, $tmp$$Register,
+                    false /*min*/, false /*single*/);
+  %}
+  ins_pipe( pipe_slow );
+%}
+
+
+// min = java.lang.Math.min(float a, float b)
+instruct minF_reg(regF dst, regF a, regF b, rRegI tmp) %{
+  predicate(UseAVX > 0);
+  match(Set dst (MinF a b));
+  effect(USE a, USE b, TEMP tmp);
+
+  format %{ "$dst = min($a, $b)\t# intrinsic (float)" %}
+  ins_encode %{
+    emit_fp_min_max(_masm, $dst$$XMMRegister, $a$$XMMRegister, $b$$XMMRegister, $tmp$$Register,
+                    true /*min*/, true /*single*/);
+  %}
+  ins_pipe( pipe_slow );
+%}
+
+// min = java.lang.Math.min(double a, double b)
+instruct minD_reg(regD dst, regD a, regD b, rRegL tmp) %{
+  predicate(UseAVX > 0);
+  match(Set dst (MinD a b));
+  effect(USE a, USE b, TEMP tmp);
+
+  format %{ "$dst = min($a, $b)\t# intrinsic (double)" %}
+  ins_encode %{
+    emit_fp_min_max(_masm, $dst$$XMMRegister, $a$$XMMRegister, $b$$XMMRegister, $tmp$$Register,
+                    true /*min*/, false /*single*/);
+  %}
+  ins_pipe( pipe_slow );
+%}
+
 // Load Effective Address
instruct leaP8(rRegP dst, indOffset8 mem) %{ From martin.doerr at sap.com Fri Mar 1 16:58:59 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 1 Mar 2019 16:58:59 +0000 Subject: 8219582: PPC: Crash after C1 checkcast patched and GC In-Reply-To: References: Message-ID: Hi G?tz, thank you for the review. Pushed. Best regards, Martin -----Original Message----- From: Lindenmaier, Goetz Sent: Freitag, 1. M?rz 2019 12:28 To: Doerr, Martin ; Anton Kozlov ; 'hotspot-compiler-dev at openjdk.java.net' Subject: RE: 8219582: PPC: Crash after C1 checkcast patched and GC Hi Martin, Thanks for addressing this issue! Well, this shuffling of registers is quite complex and error prone. Forcing the register allocation to have 5 different registers seems to be the safer fix. But I checked, about 10% of the cases in jvm98 only use 4 registers here, and I would consider this a significant amount to justify the complexity. For me, all the renaming and assignment of conditions makes the code harder to read, (e.g., keep_object_alive encodes checkcast && reg_conflict, I.e., it rather means "must_move_obj" ...), as well as the names of registers (k_RInfo, klass_RInfo ... what's the idea behind these names?) But I understand you want to keep this similar to the code on other platforms, which is also desireable. So consider this reviewed. Best regards, Goetz. > -----Original Message----- > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Doerr, Martin > Sent: Tuesday, February 26, 2019 6:51 PM > To: Anton Kozlov ; 'hotspot-compiler- > dev at openjdk.java.net' > Subject: [CAUTION] RE: 8219582: PPC: Crash after C1 checkcast patched and > GC > > Hi Anton, > > I noticed that my fix had missed that the Runtime1::slow_subtype_check_id > stub needs fixed registers. So I need to undo the register changes before > and after that call. > > Quite messy. Now I remember why I didn't want to do it this way originally ?? 
> > I just added the shuffling in place: > http://cr.openjdk.java.net/~mdoerr/8219582_ppc64_c1_fix/webrev.01/ > > I'll run tests and try to find somebody for a 2nd review. > > Best regards, > Martin > > > -----Original Message----- > From: Anton Kozlov > Sent: Montag, 25. Februar 2019 16:25 > To: Doerr, Martin ; 'hotspot-compiler- > dev at openjdk.java.net' > Subject: Re: 8219582: PPC: Crash after C1 checkcast patched and GC > > Hi, Martin, > > my bad, the null check looked like optimization (why to resolve if the klass > doesn't matter), but it's actually specified > > If objectref is null , then the operand stack is unchanged. > Otherwise, the named class, array, or interface type is resolved ... > > Your patch looks correct; reproducer does not fail anymore. > > A very minor note: > > } else if (code == lir_checkcast) { > > Label success, failure; > > - emit_typecheck_helper(op, &success, /*fallthru*/&failure, &success); > // Moves obj to dst. > > + emit_typecheck_helper(op, &success, /*fallthru*/&failure, &success); > > __ b(*op->stub()->entry()); > > __ align(32, 12); > > __ bind(success); > > + __ mr(op->result_opr()->as_register(), op->object()->as_register()); > shouldn't __mr_if_needed be here? As it was before, obj and dst can be > same register: > > > __ cmpdi(CCR0, obj, 0); > > - if (move_obj_to_dst || reg_conflict) { > > - __ mr_if_needed(dst, obj); > ^^^ here > > - if (reg_conflict) { obj = dst; } > > - } > > > > Thanks for fixing the patch! > -- Anton > > On 25.02.2019 17:20, Doerr, Martin wrote: > > Hi Anton, > > > > your proposal fixes the issue, but introduces another one: > > We must not use the patching stub before the null check (resolving is not > allowed at this place). > > JCK tests exist which verify this: > > vm/instr/checkcast > > vm/instr/instanceof > > > > So I suggest to fix it this way: > > http://cr.openjdk.java.net/~mdoerr/8219582_ppc64_c1_fix/webrev.01/ > > It is closer to the x86 implementation. 
> > > > Can you verify this proposal, please? > > > > Thanks again for your helpful analysis of the problem. > > > > Best regards, > > Martin > > > > > > -----Original Message----- > > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Doerr, Martin > > Sent: Freitag, 22. Februar 2019 17:12 > > To: 'hotspot-compiler-dev at openjdk.java.net' dev at openjdk.java.net>; Anton Kozlov > > Subject: [CAUTION] RFR: 8219582: PPC: Crash after C1 checkcast patched > and GC > > > > Hi Anton, > > > > reposting on hotspot-compiler-dev. > > Thanks for analyzing the issue and for providing a fix. I'll take a closer look > next week. > > > > Best regards, > > Martin > > > > > > -----Original Message----- > > From: hotspot-runtime-dev bounces at openjdk.java.net> On Behalf Of Anton Kozlov > > Sent: Freitag, 22. Februar 2019 16:05 > > To: hotspot-runtime-dev at openjdk.java.net > > Subject: RFR: 8219582: PPC: Crash after C1 checkcast patched and GC > > > > Hi, > > > > bug: https://bugs.openjdk.java.net/browse/JDK-8219582 > > webrev: http://cr.openjdk.java.net/~akozlov/8219582/webrev.00/ > > > > PPC C1 checkcast implementation overcomes possible object-to-check and > temp registers conflict by using destination register as temp to store the > object. It usually works, but after object moved to dst and before checkcast > completed, safepoint may occur because of implicit runtime call from > klass2reg_with_patching. During the call, oop in dst register is not visible to > GC, so it will not be updated after GC moved objects. > > > > Please review the fix, that is to load klass (and may be call runtime) at > beginning of the LIR instruction, when all oops are in place expected by GC. 
> > > > Thanks, > > Anton > > From vladimir.kozlov at oracle.com Fri Mar 1 18:02:22 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 1 Mar 2019 10:02:22 -0800 Subject: RFR(XS):8216580:X86: Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A9F42D917@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A9A14A6DA@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9A15F100@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4006FA@ORSMSX106.amr.corp.intel.com> <5cc2946e-7770-0323-6f63-405e7e539fd6@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F42D917@ORSMSX106.amr.corp.intel.com> Message-ID: This looks good. I assume you did full testing of these new changes on VNNI machine. I will submit testing on what we have. Thanks, Vladimir On 2/28/19 5:23 PM, Deshpande, Vivek R wrote: > Hi Vladimir > > Thanks for your inputs. I have made the changes according to your suggestion. > The webrev is here: > http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.02/ > This addresses the questions you had raised. > With this patch the checks are applied to all the nodes but returns true only in case of muladds2i. > > Regards, > Vivek > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Wednesday, February 13, 2019 12:29 PM > To: Deshpande, Vivek R ; 'Tobias Hartmann' ; 'hotspot-compiler-dev at openjdk.java.net compiler' > Cc: Viswanathan, Sandhya ; Raj, Guru > Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic > > Hi Vivek, > > Most of new checks are loop invariant: !s1_ctrl_inv and > !s1_ctrl->is_RangeCheck() > > I think you don't need to search for is_muladds2i() if those checks return false. > > Most general question is: why it should apply only to muladds2i nodes only? Can we do the same for others? 
> > Thanks, > Vladimir > > On 2/8/19 2:17 PM, Deshpande, Vivek R wrote: >> Hi Vladimir >> >> Would you please take a look at this patch. >> >> The Adjacent LoadS have different control RangeCheck node for accesses of type a[2i] and a[2i+1]. >> This patch allows those nodes to be isomorphic as they belong same counted loop and MulAddS2I nodes. >> >> Webrev: >> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.01/ >> Bug ID: >> https://bugs.openjdk.java.net/browse/JDK-8216580 >> >> Regards, >> Vivek >> >> -----Original Message----- >> From: Deshpande, Vivek R >> Sent: Monday, January 28, 2019 9:45 AM >> To: Tobias Hartmann ; >> hotspot-compiler-dev at openjdk.java.net compiler >> ; Vladimir Kozlov >> >> Cc: Viswanathan, Sandhya ; Raj, Guru >> >> Subject: RE: RFR(XS):8216580:X86: Fix generation of VNNI vector code >> by allowing adjacent LoadS nodes to be isomorphic >> >> Hi Vladimir >> >> Would you please take a look at the patch. >> The Adjacent LoadS have different control RangeCheck node for accesses of type a[2i] and a[2i+1]. >> This patch allows those nodes to be isomorphic as they belong same counted loop and MulAddS2I nodes. >> >> Webrev: >> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.01/ >> >> Regards, >> Vivek >> >> -----Original Message----- >> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] >> Sent: Tuesday, January 15, 2019 2:57 AM >> To: Deshpande, Vivek R ; >> hotspot-compiler-dev at openjdk.java.net compiler >> >> Cc: Vladimir Kozlov ; Viswanathan, Sandhya >> ; Raj, Guru >> Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code >> by allowing adjacent LoadS nodes to be isomorphic >> >> Hi Vivek, >> >> please add parentheses around the == comparison in lines 1225,1226. >> >> Otherwise this looks reasonable to me but I'm not too familiar with that code. 
>> >> Best regards, >> Tobias >> >> On 12.01.19 01:03, Deshpande, Vivek R wrote: >>> Hi Tobias >>> >>> The webrev for the bug JDK-821650 is here: >>> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.00/ >>> This fixes generation of vector code by allowing adjacent LoadS nodes to be isomorphic when they have different control RangeCheck nodes for a[i] and a[i+1] accesses in same MulAddS2I node. >>> Could you please review it. >>> >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Deshpande, Vivek R >>> Sent: Friday, January 11, 2019 11:38 AM >>> To: 'Tobias Hartmann' ; >>> hotspot-compiler-dev at openjdk.java.net compiler >>> >>> Cc: Vladimir Kozlov ; Viswanathan, >>> Sandhya ; Raj, Guru >>> >>> Subject: RE: RFR(S):8216050:X86: Fix for Superword optimization fails >>> with assert(0 <= i && i < _len) failed: illegal index >>> >>> Hi Tobias >>> >>> Thanks for reviewing the patch. >>> I have made the changes according to your suggestion. >>> In this webrev: >>> http://cr.openjdk.java.net/~vdeshpande/8216050/webrev.01/ >>> I have fix for the crash reported in the 8216050. >>> >>> The lower cost is needed for generation of vpdpwssd instruction, by combining AddVI and MulAddVS2VI. >>> For other instructions pmaddwd and vpmaddwd, they get generated on platforms upto skylake with default cost. >>> >>> I have updated the bug also with the link to webrev. >>> >>> I have created a different bug JDK-8216580 for >>> 3) Fix generation of vector code by allowing adjacent LoadS nodes to be isomorphic when they have different control RangeCheck nodes >>> for a[i] and a[i+1] accesses in same MulAddS2I node >>> >>> Thank you. 
>>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] >>> Sent: Friday, January 11, 2019 4:49 AM >>> To: Deshpande, Vivek R ; >>> hotspot-compiler-dev at openjdk.java.net compiler >>> >>> Cc: Vladimir Kozlov ; Viswanathan, >>> Sandhya ; Raj, Guru >>> >>> Subject: Re: RFR(S):8216050:X86: Fix for Superword optimization fails >>> with assert(0 <= i && i < _len) failed: illegal index >>> >>> Hi Vivek, >>> >>> On 11.01.19 07:58, Deshpande, Vivek R wrote: >>>> 1) Fix for the crash by matching the operand by swapping to right positions. >>> >>> Looks good but the change to loopopts.cpp:530 screwed up the indentation around the ifs, please fix. >>> >>>> 2) Cost based generation of vpdpwssd instruction. >>> >>> Other instructions added by JDK-8214751 still miss a cost definition, for example: >>> http://hg.openjdk.java.net/jdk/jdk/rev/4bb6e0871bf7#l5.20 >>> >>>> 3) Fix generation of vector code by allowing adjacent LoadS nodes to >>>> be isomorphic when they have different control RangeCheck nodes >>>> ????for a[i] and a[i+1] accesses in same MulAddS2I node >>> >>> This is unrelated to the original bug, right? If so, this should be integrated with a separate RFE. 
>>> >>> Thanks, >>> Tobias >>> From vivek.r.deshpande at intel.com Fri Mar 1 19:53:28 2019 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Fri, 1 Mar 2019 19:53:28 +0000 Subject: RFR(XS):8216580:X86: Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic In-Reply-To: References: <53E8E64DB2403849AFD89B7D4DAC8B2A9A14A6DA@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9A15F100@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4006FA@ORSMSX106.amr.corp.intel.com> <5cc2946e-7770-0323-6f63-405e7e539fd6@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F42D917@ORSMSX106.amr.corp.intel.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A9F42EDE2@ORSMSX106.amr.corp.intel.com> Hi Vladimir Thanks for the review. I am also working on testing it on the VNNI enabled h/w. Regards, Vivek -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Friday, March 1, 2019 10:02 AM To: Deshpande, Vivek R ; 'Tobias Hartmann' ; 'hotspot-compiler-dev at openjdk.java.net compiler' Cc: Viswanathan, Sandhya ; Raj, Guru Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic This looks good. I assume you did full testing of these new changes on VNNI machine. I will submit testing on what we have. Thanks, Vladimir On 2/28/19 5:23 PM, Deshpande, Vivek R wrote: > Hi Vladimir > > Thanks for your inputs. I have made the changes according to your suggestion. > The webrev is here: > http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.02/ > This addresses the questions you had raised. > With this patch the checks are applied to all the nodes but returns true only in case of muladds2i. 
> > Regards, > Vivek > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Wednesday, February 13, 2019 12:29 PM > To: Deshpande, Vivek R ; 'Tobias > Hartmann' ; > 'hotspot-compiler-dev at openjdk.java.net compiler' > > Cc: Viswanathan, Sandhya ; Raj, Guru > > Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code > by allowing adjacent LoadS nodes to be isomorphic > > Hi Vivek, > > Most of new checks are loop invariant: !s1_ctrl_inv and > !s1_ctrl->is_RangeCheck() > > I think you don't need to search for is_muladds2i() if those checks return false. > > Most general question is: why it should apply only to muladds2i nodes only? Can we do the same for others? > > Thanks, > Vladimir > > On 2/8/19 2:17 PM, Deshpande, Vivek R wrote: >> Hi Vladimir >> >> Would you please take a look at this patch. >> >> The Adjacent LoadS have different control RangeCheck node for accesses of type a[2i] and a[2i+1]. >> This patch allows those nodes to be isomorphic as they belong same counted loop and MulAddS2I nodes. >> >> Webrev: >> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.01/ >> Bug ID: >> https://bugs.openjdk.java.net/browse/JDK-8216580 >> >> Regards, >> Vivek >> >> -----Original Message----- >> From: Deshpande, Vivek R >> Sent: Monday, January 28, 2019 9:45 AM >> To: Tobias Hartmann ; >> hotspot-compiler-dev at openjdk.java.net compiler >> ; Vladimir Kozlov >> >> Cc: Viswanathan, Sandhya ; Raj, Guru >> >> Subject: RE: RFR(XS):8216580:X86: Fix generation of VNNI vector code >> by allowing adjacent LoadS nodes to be isomorphic >> >> Hi Vladimir >> >> Would you please take a look at the patch. >> The Adjacent LoadS have different control RangeCheck node for accesses of type a[2i] and a[2i+1]. >> This patch allows those nodes to be isomorphic as they belong same counted loop and MulAddS2I nodes. 
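For readers unfamiliar with the access pattern discussed above (a[2i] and a[2i+1] feeding one MulAddS2I node), here is a minimal Java sketch of the kind of loop involved: two adjacent short loads per input array, multiplied pairwise and summed into an int. The class and method names are illustrative assumptions and are not taken from the webrev. Whether C2 actually vectorizes such a loop depends on the JVM build and the hardware.

```java
public class MulAddS2ILoopSketch {
    // Hypothetical helper: for each i, multiply the adjacent shorts
    // a[2i], a[2i+1] with b[2i], b[2i+1] and add the two int products.
    public static int[] mulAddPairs(short[] a, short[] b) {
        int[] out = new int[a.length / 2];
        for (int i = 0; i < out.length; i++) {
            // a[2*i] and a[2*i+1] are the adjacent short loads; each has
            // its own range check in the unoptimized loop body.
            out[i] = (a[2 * i] * b[2 * i]) + (a[2 * i + 1] * b[2 * i + 1]);
        }
        return out;
    }

    public static void main(String[] args) {
        short[] a = {1, 2, 3, 4};
        short[] b = {5, 6, 7, 8};
        int[] out = mulAddPairs(a, b);
        // 1*5 + 2*6 = 17 and 3*7 + 4*8 = 53
        System.out.println(out[0] + " " + out[1]);
    }
}
```

The sketch only shows the source-level shape; it makes no claim about which instructions are generated for it.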
>> >> Webrev: >> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.01/ >> >> Regards, >> Vivek >> >> -----Original Message----- >> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] >> Sent: Tuesday, January 15, 2019 2:57 AM >> To: Deshpande, Vivek R ; >> hotspot-compiler-dev at openjdk.java.net compiler >> >> Cc: Vladimir Kozlov ; Viswanathan, >> Sandhya ; Raj, Guru >> >> Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code >> by allowing adjacent LoadS nodes to be isomorphic >> >> Hi Vivek, >> >> please add parentheses around the == comparison in lines 1225,1226. >> >> Otherwise this looks reasonable to me but I'm not too familiar with that code. >> >> Best regards, >> Tobias >> >> On 12.01.19 01:03, Deshpande, Vivek R wrote: >>> Hi Tobias >>> >>> The webrev for the bug JDK-821650 is here: >>> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.00/ >>> This fixes generation of vector code by allowing adjacent LoadS nodes to be isomorphic when they have different control RangeCheck nodes for a[i] and a[i+1] accesses in same MulAddS2I node. >>> Could you please review it. >>> >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Deshpande, Vivek R >>> Sent: Friday, January 11, 2019 11:38 AM >>> To: 'Tobias Hartmann' ; >>> hotspot-compiler-dev at openjdk.java.net compiler >>> >>> Cc: Vladimir Kozlov ; Viswanathan, >>> Sandhya ; Raj, Guru >>> >>> Subject: RE: RFR(S):8216050:X86: Fix for Superword optimization >>> fails with assert(0 <= i && i < _len) failed: illegal index >>> >>> Hi Tobias >>> >>> Thanks for reviewing the patch. >>> I have made the changes according to your suggestion. >>> In this webrev: >>> http://cr.openjdk.java.net/~vdeshpande/8216050/webrev.01/ >>> I have fix for the crash reported in the 8216050. >>> >>> The lower cost is needed for generation of vpdpwssd instruction, by combining AddVI and MulAddVS2VI. 
>>> For other instructions pmaddwd and vpmaddwd, they get generated on platforms upto skylake with default cost. >>> >>> I have updated the bug also with the link to webrev. >>> >>> I have created a different bug JDK-8216580 for >>> 3) Fix generation of vector code by allowing adjacent LoadS nodes to be isomorphic when they have different control RangeCheck nodes >>> for a[i] and a[i+1] accesses in same MulAddS2I node >>> >>> Thank you. >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] >>> Sent: Friday, January 11, 2019 4:49 AM >>> To: Deshpande, Vivek R ; >>> hotspot-compiler-dev at openjdk.java.net compiler >>> >>> Cc: Vladimir Kozlov ; Viswanathan, >>> Sandhya ; Raj, Guru >>> >>> Subject: Re: RFR(S):8216050:X86: Fix for Superword optimization >>> fails with assert(0 <= i && i < _len) failed: illegal index >>> >>> Hi Vivek, >>> >>> On 11.01.19 07:58, Deshpande, Vivek R wrote: >>>> 1) Fix for the crash by matching the operand by swapping to right positions. >>> >>> Looks good but the change to loopopts.cpp:530 screwed up the indentation around the ifs, please fix. >>> >>>> 2) Cost based generation of vpdpwssd instruction. >>> >>> Other instructions added by JDK-8214751 still miss a cost definition, for example: >>> http://hg.openjdk.java.net/jdk/jdk/rev/4bb6e0871bf7#l5.20 >>> >>>> 3) Fix generation of vector code by allowing adjacent LoadS nodes >>>> to be isomorphic when they have different control RangeCheck nodes >>>> ????for a[i] and a[i+1] accesses in same MulAddS2I node >>> >>> This is unrelated to the original bug, right? If so, this should be integrated with a separate RFE. >>> >>> Thanks, >>> Tobias >>> From bsrbnd at gmail.com Fri Mar 1 19:58:19 2019 From: bsrbnd at gmail.com (B. 
Blaser)
Date: Fri, 1 Mar 2019 20:58:19 +0100
Subject: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request
In-Reply-To:
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com>
 <1341b8ab-1ab1-0270-86c4-5a4ac4945d03@oracle.com>
 <35c1db6b-a238-1e1e-9986-3d1a31b00bc2@redhat.com>
 <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com>
 <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com>
 <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com>
 <04476179-590e-9315-667c-cc6885477194@oracle.com>
Message-ID:

On Fri, 1 Mar 2019 at 17:39, B. Blaser wrote:
>
> On Fri, 1 Mar 2019 at 10:44, Andrew Dinn wrote:
> >
> > If there is some other way to avoid the slowdown on x86 (whether that
> > comes with use of the intrinsic or with use of reduction) without
> > clobbering the gains to be had on AArch64 then that would be preferable.
>
> Then, I'd like to suggest another alternative here under.
>
> This is an optimization of the API generated code using only one
> 'ucomis[s/d]' vs five before.
> There's between 5-10% gain with the current benchmark [1] for
> predictable, unpredictable and reduction scenarios.
>
> Before:
>
> Benchmark                      Mode  Cnt     Score  Error  Units
> FpMinMaxIntrinsics.dMax        avgt       8633.782         ns/op
> FpMinMaxIntrinsics.dMin        avgt       8694.123         ns/op
> FpMinMaxIntrinsics.dMinReduce  avgt        710.493         ns/op
> FpMinMaxIntrinsics.fMax        avgt       8578.784         ns/op
> FpMinMaxIntrinsics.fMin        avgt       8734.432         ns/op
> FpMinMaxIntrinsics.fMinReduce  avgt        719.532         ns/op
>
>
> After:
>
> Benchmark                      Mode  Cnt     Score  Error  Units
> FpMinMaxIntrinsics.dMax        avgt       8050.014         ns/op
> FpMinMaxIntrinsics.dMin        avgt       8027.534         ns/op
> FpMinMaxIntrinsics.dMinReduce  avgt        675.791         ns/op
> FpMinMaxIntrinsics.fMax        avgt       8022.847         ns/op
> FpMinMaxIntrinsics.fMin        avgt       7945.885         ns/op
> FpMinMaxIntrinsics.fMinReduce  avgt        659.173         ns/op
>
>
> I haven't observed any regression until now, so statistics aren't
> necessary any more.
> There's no need to deal with legacy 'xmm' registers and only one
> temporary integer register is necessary.
>
> Any feedback would be more than welcome (hotspot:tier1 is OK on x86_64 xeon).

Small correction below as equivalent fp values might have different
representations, see JLS §4.2.3.
The gain is still roughly 5-10%.

Bernard

diff --git a/src/hotspot/cpu/x86/x86_64.ad b/src/hotspot/cpu/x86/x86_64.ad
--- a/src/hotspot/cpu/x86/x86_64.ad
+++ b/src/hotspot/cpu/x86/x86_64.ad
@@ -808,6 +808,81 @@
   __ bind(done);
 }

+// Math.min()    # Math.max()
+// --------------------------
+// ucomis[s/d]   #
+// ja   -> b     # a
+// jp   -> NaN   # NaN
+// jb   -> a     # b
+// je            #
+// |-jz -> a | b # a & b
+// |    -> a     #
+void emit_fp_min_max(MacroAssembler& _masm, XMMRegister dst,
+                     XMMRegister a, XMMRegister b, Register tmp,
+                     bool min, bool single) {
+
+  Label nan, zero, below, above, done;
+
+  if (single)
+    __ ucomiss(a, b);
+  else
+    __ ucomisd(a, b);
+
+  __ jccb(Assembler::above, above); // CF=0 & ZF=0
+  __ jccb(Assembler::parity, nan);  // PF=1
+  __ jccb(Assembler::below, below); // CF=1
+
+  // equal
+  if (single) {
+    __ movdl(tmp, a);
+    __ shll(tmp, 1); // skip sign bit
+    __ testl(tmp, tmp);
+    __ jccb(Assembler::zero, zero);
+    __ movflt(dst, a);
+    __ jmp(done);
+  }
+  else {
+    __ movdq(tmp, a);
+    __ shlq(tmp, 1); // skip sign bit
+    __ testq(tmp, tmp);
+    __ jccb(Assembler::zero, zero);
+    __ movdbl(dst, a);
+    __ jmp(done);
+  }
+
+  __ bind(zero);
+  if (min)
+    __ vpor(dst, a, b, Assembler::AVX_128bit);
+  else
+    __ vpand(dst, a, b, Assembler::AVX_128bit);
+  __ jmp(done);
+
+  __ bind(above);
+  if (single)
+    __ movflt(dst, min ? b : a);
+  else
+    __ movdbl(dst, min ? b : a);
+  __ jmp(done);
+
+  __ bind(nan);
+  if (single) {
+    __ movl(tmp, 0x7fc00000); // Float.NaN
+    __ movdl(dst, tmp);
+  }
+  else {
+    __ mov64(tmp, 0x7ff8000000000000L); // Double.NaN
+    __ movdq(dst, tmp);
+  }
+  __ jmp(done);
+
+  __ bind(below);
+  if (single)
+    __ movflt(dst, min ? a : b);
+  else
+    __ movdbl(dst, min ? a : b);
+
+  __ bind(done);
+}
 //=============================================================================
 const RegMask& MachConstantBaseNode::_out_RegMask = RegMask::Empty;
@@ -5470,6 +5545,63 @@
   ins_pipe( fpu_reg_reg );
 %}

+// max = java.lang.Math.max(float a, float b)
+instruct maxF_reg(regF dst, regF a, regF b, rRegI tmp) %{
+  predicate(UseAVX > 0);
+  match(Set dst (MaxF a b));
+  effect(USE a, USE b, TEMP tmp);
+
+  format %{ "$dst = max($a, $b)\t# intrinsic (float)" %}
+  ins_encode %{
+    emit_fp_min_max(_masm, $dst$$XMMRegister, $a$$XMMRegister, $b$$XMMRegister, $tmp$$Register,
+                    false /*min*/, true /*single*/);
+  %}
+  ins_pipe( pipe_slow );
+%}
+
+// max = java.lang.Math.max(double a, double b)
+instruct maxD_reg(regD dst, regD a, regD b, rRegL tmp) %{
+  predicate(UseAVX > 0);
+  match(Set dst (MaxD a b));
+  effect(USE a, USE b, TEMP tmp);
+
+  format %{ "$dst = max($a, $b)\t# intrinsic (double)" %}
+  ins_encode %{
+    emit_fp_min_max(_masm, $dst$$XMMRegister, $a$$XMMRegister, $b$$XMMRegister, $tmp$$Register,
+                    false /*min*/, false /*single*/);
+  %}
+  ins_pipe( pipe_slow );
+%}
+
+
+// min = java.lang.Math.min(float a, float b)
+instruct minF_reg(regF dst, regF a, regF b, rRegI tmp) %{
+  predicate(UseAVX > 0);
+  match(Set dst (MinF a b));
+  effect(USE a, USE b, TEMP tmp);
+
+  format %{ "$dst = min($a, $b)\t# intrinsic (float)" %}
+  ins_encode %{
+    emit_fp_min_max(_masm, $dst$$XMMRegister, $a$$XMMRegister, $b$$XMMRegister, $tmp$$Register,
+                    true /*min*/, true /*single*/);
+  %}
+  ins_pipe( pipe_slow );
+%}
+
+// min = java.lang.Math.min(double a, double b)
+instruct minD_reg(regD dst, regD a, regD b, rRegL tmp) %{
+  predicate(UseAVX > 0);
+  match(Set dst (MinD a b));
+  effect(USE a, USE b, TEMP tmp);
+
+  format %{ "$dst = min($a, $b)\t# intrinsic (double)" %}
+  ins_encode %{
+    emit_fp_min_max(_masm, $dst$$XMMRegister, $a$$XMMRegister, $b$$XMMRegister, $tmp$$Register,
+                    true /*min*/, false /*single*/);
+  %}
+  ins_pipe( pipe_slow );
+%}
+
 // Load Effective Address
instruct leaP8(rRegP dst, indOffset8 mem) %{ From vladimir.kozlov at oracle.com Fri Mar 1 21:46:53 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 1 Mar 2019 13:46:53 -0800 Subject: RFR(XS):8216580:X86: Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A9F42EDE2@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A9A14A6DA@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9A15F100@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4006FA@ORSMSX106.amr.corp.intel.com> <5cc2946e-7770-0323-6f63-405e7e539fd6@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F42D917@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F42EDE2@ORSMSX106.amr.corp.intel.com> Message-ID: My testing passed. I think you can push after you finish testing. Please, re-base you changes to jdk/jdk repository before push. I see that webrev.02 was prepared vs jdk/jdk12 which is wrong. Thanks, Vladimir On 3/1/19 11:53 AM, Deshpande, Vivek R wrote: > Hi Vladimir > > Thanks for the review. I am also working on testing it on the VNNI enabled h/w. > > Regards, > Vivek > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Friday, March 1, 2019 10:02 AM > To: Deshpande, Vivek R ; 'Tobias Hartmann' ; 'hotspot-compiler-dev at openjdk.java.net compiler' > Cc: Viswanathan, Sandhya ; Raj, Guru > Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic > > This looks good. I assume you did full testing of these new changes on VNNI machine. I will submit testing on what we have. > > Thanks, > Vladimir > > On 2/28/19 5:23 PM, Deshpande, Vivek R wrote: >> Hi Vladimir >> >> Thanks for your inputs. I have made the changes according to your suggestion. 
>> The webrev is here: >> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.02/ >> This addresses the questions you had raised. >> With this patch the checks are applied to all the nodes but returns true only in case of muladds2i. >> >> Regards, >> Vivek >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Wednesday, February 13, 2019 12:29 PM >> To: Deshpande, Vivek R ; 'Tobias >> Hartmann' ; >> 'hotspot-compiler-dev at openjdk.java.net compiler' >> >> Cc: Viswanathan, Sandhya ; Raj, Guru >> >> Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code >> by allowing adjacent LoadS nodes to be isomorphic >> >> Hi Vivek, >> >> Most of new checks are loop invariant: !s1_ctrl_inv and >> !s1_ctrl->is_RangeCheck() >> >> I think you don't need to search for is_muladds2i() if those checks return false. >> >> Most general question is: why it should apply only to muladds2i nodes only? Can we do the same for others? >> >> Thanks, >> Vladimir >> >> On 2/8/19 2:17 PM, Deshpande, Vivek R wrote: >>> Hi Vladimir >>> >>> Would you please take a look at this patch. >>> >>> The Adjacent LoadS have different control RangeCheck node for accesses of type a[2i] and a[2i+1]. >>> This patch allows those nodes to be isomorphic as they belong same counted loop and MulAddS2I nodes. >>> >>> Webrev: >>> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.01/ >>> Bug ID: >>> https://bugs.openjdk.java.net/browse/JDK-8216580 >>> >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Deshpande, Vivek R >>> Sent: Monday, January 28, 2019 9:45 AM >>> To: Tobias Hartmann ; >>> hotspot-compiler-dev at openjdk.java.net compiler >>> ; Vladimir Kozlov >>> >>> Cc: Viswanathan, Sandhya ; Raj, Guru >>> >>> Subject: RE: RFR(XS):8216580:X86: Fix generation of VNNI vector code >>> by allowing adjacent LoadS nodes to be isomorphic >>> >>> Hi Vladimir >>> >>> Would you please take a look at the patch. 
>>> The Adjacent LoadS have different control RangeCheck node for accesses of type a[2i] and a[2i+1]. >>> This patch allows those nodes to be isomorphic as they belong same counted loop and MulAddS2I nodes. >>> >>> Webrev: >>> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.01/ >>> >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] >>> Sent: Tuesday, January 15, 2019 2:57 AM >>> To: Deshpande, Vivek R ; >>> hotspot-compiler-dev at openjdk.java.net compiler >>> >>> Cc: Vladimir Kozlov ; Viswanathan, >>> Sandhya ; Raj, Guru >>> >>> Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code >>> by allowing adjacent LoadS nodes to be isomorphic >>> >>> Hi Vivek, >>> >>> please add parentheses around the == comparison in lines 1225,1226. >>> >>> Otherwise this looks reasonable to me but I'm not too familiar with that code. >>> >>> Best regards, >>> Tobias >>> >>> On 12.01.19 01:03, Deshpande, Vivek R wrote: >>>> Hi Tobias >>>> >>>> The webrev for the bug JDK-821650 is here: >>>> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.00/ >>>> This fixes generation of vector code by allowing adjacent LoadS nodes to be isomorphic when they have different control RangeCheck nodes for a[i] and a[i+1] accesses in same MulAddS2I node. >>>> Could you please review it. >>>> >>>> Regards, >>>> Vivek >>>> >>>> -----Original Message----- >>>> From: Deshpande, Vivek R >>>> Sent: Friday, January 11, 2019 11:38 AM >>>> To: 'Tobias Hartmann' ; >>>> hotspot-compiler-dev at openjdk.java.net compiler >>>> >>>> Cc: Vladimir Kozlov ; Viswanathan, >>>> Sandhya ; Raj, Guru >>>> >>>> Subject: RE: RFR(S):8216050:X86: Fix for Superword optimization >>>> fails with assert(0 <= i && i < _len) failed: illegal index >>>> >>>> Hi Tobias >>>> >>>> Thanks for reviewing the patch. >>>> I have made the changes according to your suggestion. 
>>>> In this webrev: >>>> http://cr.openjdk.java.net/~vdeshpande/8216050/webrev.01/ >>>> I have fix for the crash reported in the 8216050. >>>> >>>> The lower cost is needed for generation of vpdpwssd instruction, by combining AddVI and MulAddVS2VI. >>>> For other instructions pmaddwd and vpmaddwd, they get generated on platforms upto skylake with default cost. >>>> >>>> I have updated the bug also with the link to webrev. >>>> >>>> I have created a different bug JDK-8216580 for >>>> 3) Fix generation of vector code by allowing adjacent LoadS nodes to be isomorphic when they have different control RangeCheck nodes >>>> for a[i] and a[i+1] accesses in same MulAddS2I node >>>> >>>> Thank you. >>>> Regards, >>>> Vivek >>>> >>>> -----Original Message----- >>>> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] >>>> Sent: Friday, January 11, 2019 4:49 AM >>>> To: Deshpande, Vivek R ; >>>> hotspot-compiler-dev at openjdk.java.net compiler >>>> >>>> Cc: Vladimir Kozlov ; Viswanathan, >>>> Sandhya ; Raj, Guru >>>> >>>> Subject: Re: RFR(S):8216050:X86: Fix for Superword optimization >>>> fails with assert(0 <= i && i < _len) failed: illegal index >>>> >>>> Hi Vivek, >>>> >>>> On 11.01.19 07:58, Deshpande, Vivek R wrote: >>>>> 1) Fix for the crash by matching the operand by swapping to right positions. >>>> >>>> Looks good but the change to loopopts.cpp:530 screwed up the indentation around the ifs, please fix. >>>> >>>>> 2) Cost based generation of vpdpwssd instruction. >>>> >>>> Other instructions added by JDK-8214751 still miss a cost definition, for example: >>>> http://hg.openjdk.java.net/jdk/jdk/rev/4bb6e0871bf7#l5.20 >>>> >>>>> 3) Fix generation of vector code by allowing adjacent LoadS nodes >>>>> to be isomorphic when they have different control RangeCheck nodes >>>>> ????for a[i] and a[i+1] accesses in same MulAddS2I node >>>> >>>> This is unrelated to the original bug, right? If so, this should be integrated with a separate RFE. 
>>>> >>>> Thanks, >>>> Tobias >>>> From fujie at loongson.cn Fri Mar 1 23:33:09 2019 From: fujie at loongson.cn (Jie Fu) Date: Sat, 2 Mar 2019 07:33:09 +0800 Subject: RFR (trivial): 8219519: Remove linux_sparc.ad and linux_aarch64.ad In-Reply-To: <004720da-c137-433f-19f1-708fe8f74a0b@physik.fu-berlin.de> References: <900ea372-d9fd-e1fd-051d-3d96e1ec2c66@loongson.cn> <9c080917-84eb-ba84-5951-f159b103e6ed@oracle.com> <44494f4f-26ef-3b9e-b38e-eca8178aba89@loongson.cn> <004720da-c137-433f-19f1-708fe8f74a0b@physik.fu-berlin.de> Message-ID: <0dca2010-4a2a-8688-5a01-2bf294920b58@loongson.cn> That's great! Thanks Adrian. On 2019?03?01? 22:29, John Paul Adrian Glaubitz wrote: > Hello! > > On 3/1/19 3:25 PM, Magnus Ihse Bursie wrote: >>> It's a bit difficult for me to test this patch since I don't have a sparc or arm machine. > Both SPARC and AArch64 machines running Linux can be accessed through the gcc > compile farm. Any open source developer can request an account for these > machines. > > See: > >> https://gcc.gnu.org/wiki/CompileFarm >> https://cfarm.tetaneutral.net/machines/list/ > I'm admin for the sparc64 box running Linux in case someone needs any particular > package to be installed. > > Thanks, > Adrian > From fujie at loongson.cn Fri Mar 1 23:35:21 2019 From: fujie at loongson.cn (Jie Fu) Date: Sat, 2 Mar 2019 07:35:21 +0800 Subject: RFR (trivial): 8219919: RuntimeStub's name lost with PrintFrameConverterAssembly In-Reply-To: References: <7c9d8242-389a-0b90-57cc-b9f2c8ee588b@loongson.cn> <944da3d8-4a5c-ae02-6ed8-3c5cc2bc11f3@oracle.com> Message-ID: <93eb066c-b9e3-31a0-506c-c0e793f7a776@loongson.cn> Thanks Vladimir. On 2019?03?01? 08:56, Jie Fu wrote: > Thank you so much, Vladimir. > > I really appreciate if someone could help to sponsor it. > Thanks in advance. > > Best regards, > Jie > > On 2019/3/1 ??3:36, Vladimir Kozlov wrote: >> Looks good. 
>> >> thanks, >> Vladimir >> >> On 2/28/19 4:29 AM, Jie Fu wrote: >>> Hi all, >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8219919 >>> >>> The RuntimeStub's name is lost when dumping C2's runtime stub with >>> -XX:+PrintFrameConverterAssembly. >>> However, it do exist when dumping with -XX:+PrintStubCode. >>> >>> It would be more friendly if the stub's name was dumped as well for >>> JVM debuggers with -XX:+PrintFrameConverterAssembly. >>> >>> It can be fixed by >>> --------------------------------------- >>> diff -r 56089cf6152c src/hotspot/share/opto/output.cpp >>> --- a/src/hotspot/share/opto/output.cpp Tue Feb 26 05:46:02 2019 -0800 >>> +++ b/src/hotspot/share/opto/output.cpp Thu Feb 28 19:52:40 2019 +0800 >>> @@ -1556,6 +1556,8 @@ >>> } >>> if (method() != NULL) { >>> method()->print_metadata(); >>> + } else if (stub_name() != NULL) { >>> + tty->print_cr("Generating RuntimeStub - %s", stub_name()); >>> } >>> dump_asm(node_offsets, node_offset_limit); >>> if (xtty != NULL) { >>> --------------------------------------- >>> >>> The change has been tested on Linux/x64. >>> Could you please review it? >>> Thanks a lot. >>> >>> Best regards, >>> Jie >>> >>> > From fujie at loongson.cn Fri Mar 1 23:40:36 2019 From: fujie at loongson.cn (Jie Fu) Date: Sat, 2 Mar 2019 07:40:36 +0800 Subject: RFR (trivial): 8219519: Remove linux_sparc.ad and linux_aarch64.ad In-Reply-To: References: <900ea372-d9fd-e1fd-051d-3d96e1ec2c66@loongson.cn> <9c080917-84eb-ba84-5951-f159b103e6ed@oracle.com> <44494f4f-26ef-3b9e-b38e-eca8178aba89@loongson.cn> <213140b5-52f1-4d45-8198-d1f6a7676922@redhat.com> Message-ID: <539801b1-9cdf-13e2-47c3-3780b9d75c76@loongson.cn> Thanks Magnus and Andrew Dinn for your kind review. On 2019?03?01? 
22:47, Magnus Ihse Bursie wrote: > On 2019-03-01 15:39, Andrew Dinn wrote: >> On 01/03/2019 14:25, Magnus Ihse Bursie wrote: >>> On 2019-02-27 03:25, Jie Fu wrote: >>>> It's a bit difficult for me to test this patch since I don't have a >>>> sparc or arm machine. >>>> I've analyzed the adlc processing logic in >>>> make/hotspot/gensrc/GensrcAdlc.gmk finding that ad-files under >>>> ./src/hotspot/os_cpu/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH) >>>> are optional. >>> What do you mean by "optional"? The build code does this: >>> >>> >>> ############################################################################## >>> # Concatenate all ad source files into a single file, which will be fed to >>> # adlc. >>> >>> ... >>> >>> AD_SRC_FILES := $(call uniq, $(wildcard $(foreach d, $(AD_SRC_ROOTS), \ >>> $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_CPU).ad \ >>> $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_CPU_ARCH).ad \ >>> >>> $d/os_cpu/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH).ad >>> \ >>> ))) >>> >>> so it will definitely pick up both those files and use it in creating >>> the concatenated ad file. >> That's interesting because Pengfei Li claims he applied the patch and >> successfully built OpenJDK on AArch64. >> >> >> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2019-February/006975.html >> >> Does the build system actually need those files to exist when it builds >> the concatenated file? > No, the build system does not "need" it. If it is not there, it is not > included (nor reported MIA), but if it is there, it is included. > >>> That being said, maybe this is not the correct behavior. >> Well, something sounds fishy. >> >>> I see that the linux_sparc.ad file is essentially empty, so you can >>> probably remove that. The aarch64 file otoh seems to contain valid code. >>> I would not presume that you can just remove it! >> He is ok to remove it as far as any contents are concerned. 
Indeed, I >> told him this was ok in a review in the above thread after Pengfei >> reported that OpenJDK built without the file being present. >> >> As to the contents, the encoding defined in that file is completely >> redundant (I don't really know how it got there as I don't believe it >> was ever used) > > Ok, it might very well be the case that the file is not needed since > it's contents is redundant. I can't say anything about that; that's > the domain of the adlc experts. However, it is incorrect to claim that > the build does not use file in question. But from the build PoV, it's > perfectly fine to remove it if it's not needed. But just not on the > grounds that it is not used by the build system! > > /Magnus >> regards, >> >> >> Andrew Dinn >> ----------- >> Senior Principal Software Engineer >> Red Hat UK Ltd >> Registered in England and Wales under Company Registration No. 03798903 >> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From akozlov at azul.com Sat Mar 2 08:56:55 2019 From: akozlov at azul.com (Anton Kozlov) Date: Sat, 2 Mar 2019 11:56:55 +0300 Subject: 8219582: PPC: Crash after C1 checkcast patched and GC In-Reply-To: References: Message-ID: <7ec7310e-6c52-8844-31aa-729ecce3d467@azul.com> Hi, Martin, Goetz, thank you for your efforts on fixing the bug. Best regards, Anton On 01.03.2019 19:58, Doerr, Martin wrote: > Hi G?tz, > > thank you for the review. Pushed. > > Best regards, > Martin > > > -----Original Message----- > From: Lindenmaier, Goetz > Sent: Freitag, 1. M?rz 2019 12:28 > To: Doerr, Martin ; Anton Kozlov ; 'hotspot-compiler-dev at openjdk.java.net' > Subject: RE: 8219582: PPC: Crash after C1 checkcast patched and GC > > Hi Martin, > > Thanks for addressing this issue! > > Well, this shuffling of registers is quite complex and error prone. > Forcing the register allocation to have 5 different registers seems > to be the safer fix. 
But I checked, about 10% of the cases in jvm98 > only use 4 registers here, and I would consider this a significant > amount to justify the complexity. > > For me, all the renaming and assignment of conditions makes > the code harder to read, (e.g., keep_object_alive encodes > checkcast && reg_conflict, I.e., it rather means "must_move_obj" ...), > as well as the names of registers (k_RInfo, klass_RInfo ... what's the idea > behind these names?) > But I understand you want to keep this similar to the code on > other platforms, which is also desireable. > > So consider this reviewed. > > Best regards, > Goetz. > >> -----Original Message----- >> From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Doerr, Martin >> Sent: Tuesday, February 26, 2019 6:51 PM >> To: Anton Kozlov ; 'hotspot-compiler- >> dev at openjdk.java.net' >> Subject: [CAUTION] RE: 8219582: PPC: Crash after C1 checkcast patched and >> GC >> >> Hi Anton, >> >> I noticed that my fix had missed that the Runtime1::slow_subtype_check_id >> stub needs fixed registers. So I need to undo the register changes before >> and after that call. >> >> Quite messy. Now I remember why I didn't want to do it this way originally ?? >> >> I just added the shuffling in place: >> http://cr.openjdk.java.net/~mdoerr/8219582_ppc64_c1_fix/webrev.01/ >> >> I'll run tests and try to find somebody for a 2nd review. >> >> Best regards, >> Martin >> >> >> -----Original Message----- >> From: Anton Kozlov >> Sent: Montag, 25. Februar 2019 16:25 >> To: Doerr, Martin ; 'hotspot-compiler- >> dev at openjdk.java.net' >> Subject: Re: 8219582: PPC: Crash after C1 checkcast patched and GC >> >> Hi, Martin, >> >> my bad, the null check looked like optimization (why to resolve if the klass >> doesn't matter), but it's actually specified >> >> If objectref is null , then the operand stack is unchanged. >> Otherwise, the named class, array, or interface type is resolved ... 
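The rule quoted above can be observed at the Java source level: a cast of a null reference passes through unchanged, while a cast of a non-null object of the wrong type throws. A minimal sketch follows; the class and method names are illustrative assumptions, and it only demonstrates the language-level behavior, not the C1 code path or the timing of class resolution discussed in this thread.

```java
public class CheckcastNullSketch {
    // The reference cast below is compiled by javac to a checkcast instruction.
    public static String castToString(Object o) {
        return (String) o;
    }

    public static void main(String[] args) {
        // A null reference passes the cast and stays null.
        System.out.println(castToString(null));
        try {
            // A non-null Integer is not a String, so the cast fails.
            castToString(Integer.valueOf(1));
        } catch (ClassCastException e) {
            System.out.println("ClassCastException");
        }
    }
}
```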
>> >> Your patch looks correct; reproducer does not fail anymore. >> >> A very minor note: >>> } else if (code == lir_checkcast) { >>> Label success, failure; >>> - emit_typecheck_helper(op, &success, /*fallthru*/&failure, &success); >> // Moves obj to dst. >>> + emit_typecheck_helper(op, &success, /*fallthru*/&failure, &success); >>> __ b(*op->stub()->entry()); >>> __ align(32, 12); >>> __ bind(success); >>> + __ mr(op->result_opr()->as_register(), op->object()->as_register()); >> shouldn't __mr_if_needed be here? As it was before, obj and dst can be >> same register: >> >>> __ cmpdi(CCR0, obj, 0); >>> - if (move_obj_to_dst || reg_conflict) { >>> - __ mr_if_needed(dst, obj); >> ^^^ here >>> - if (reg_conflict) { obj = dst; } >>> - } >>> >> >> Thanks for fixing the patch! >> -- Anton >> >> On 25.02.2019 17:20, Doerr, Martin wrote: >>> Hi Anton, >>> >>> your proposal fixes the issue, but introduces another one: >>> We must not use the patching stub before the null check (resolving is not >> allowed at this place). >>> JCK tests exist which verify this: >>> vm/instr/checkcast >>> vm/instr/instanceof >>> >>> So I suggest to fix it this way: >>> http://cr.openjdk.java.net/~mdoerr/8219582_ppc64_c1_fix/webrev.01/ >>> It is closer to the x86 implementation. >>> >>> Can you verify this proposal, please? >>> >>> Thanks again for your helpful analysis of the problem. >>> >>> Best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Doerr, Martin >>> Sent: Freitag, 22. Februar 2019 17:12 >>> To: 'hotspot-compiler-dev at openjdk.java.net' > dev at openjdk.java.net>; Anton Kozlov >>> Subject: [CAUTION] RFR: 8219582: PPC: Crash after C1 checkcast patched >> and GC >>> >>> Hi Anton, >>> >>> reposting on hotspot-compiler-dev. >>> Thanks for analyzing the issue and for providing a fix. I'll take a closer look >> next week. 
>>> >>> Best regards, >>> Martin >>> >>> >>> -----Original Message----- >>> From: hotspot-runtime-dev > bounces at openjdk.java.net> On Behalf Of Anton Kozlov >>> Sent: Freitag, 22. Februar 2019 16:05 >>> To: hotspot-runtime-dev at openjdk.java.net >>> Subject: RFR: 8219582: PPC: Crash after C1 checkcast patched and GC >>> >>> Hi, >>> >>> bug: https://bugs.openjdk.java.net/browse/JDK-8219582 >>> webrev: http://cr.openjdk.java.net/~akozlov/8219582/webrev.00/ >>> >>> PPC C1 checkcast implementation overcomes possible object-to-check and >> temp registers conflict by using destination register as temp to store the >> object. It usually works, but after object moved to dst and before checkcast >> completed, safepoint may occur because of implicit runtime call from >> klass2reg_with_patching. During the call, oop in dst register is not visible to >> GC, so it will not be updated after GC moved objects. >>> >>> Please review the fix, that is to load klass (and may be call runtime) at >> beginning of the LIR instruction, when all oops are in place expected by GC. >>> >>> Thanks, >>> Anton >>> From bsrbnd at gmail.com Sat Mar 2 14:55:55 2019 From: bsrbnd at gmail.com (B. Blaser) Date: Sat, 2 Mar 2019 15:55:55 +0100 Subject: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <1341b8ab-1ab1-0270-86c4-5a4ac4945d03@oracle.com> <35c1db6b-a238-1e1e-9986-3d1a31b00bc2@redhat.com> <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com> <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <04476179-590e-9315-667c-cc6885477194@oracle.com> Message-ID: On Fri, 1 Mar 2019 at 11:21, Bhateja, Jatin wrote: > > Current patch which is under review does not contain above code change to bypass intrinsic creation for reduction patterns. 
> For X86 performance degrades with intrinsic w.r.t to non-intrinsic implementation in reduction
> scenarios with and without data variance (i.e. with and without branch predication effects).
>
> I could not find right hooks which can be called from common code for adding any such target specific checks during ideal(DAG) construction.
> Please share if you know any.

I guess if you want to back out all reduction scenarios, a probably better way to do this would be to add predicates to your matching rules:

instruct minF_random_reg(legRegF dst, legRegF a, legRegF b, legRegF tmp, legRegF atmp, legRegF btmp) %{
  predicate(UseAVX > 0 && !n->is_reduction());

Reductions being computed properly here:

diff --git a/src/hotspot/share/opto/loopTransform.cpp b/src/hotspot/share/opto/loopTransform.cpp
--- a/src/hotspot/share/opto/loopTransform.cpp
+++ b/src/hotspot/share/opto/loopTransform.cpp
@@ -2039,7 +2039,8 @@
       if (n_ctrl != NULL && loop->is_member(get_loop(n_ctrl))) {
         // Now test it to see if it fits the standard pattern for a reduction operator.
         int opc = def_node->Opcode();
-        if (opc != ReductionNode::opcode(opc, def_node->bottom_type()->basic_type())) {
+        if (opc != ReductionNode::opcode(opc, def_node->bottom_type()->basic_type())
+            || opc == Op_MinD || opc == Op_MinF || opc == Op_MaxD || opc == Op_MaxF) {
           if (!def_node->is_reduction()) { // Not marked yet
             // To be a reduction, the arithmetic node must have the phi as input and provide a def to it
             bool ok = false;

And if this is a reduction you could use alternative rules, see [0]:

instruct minF_reg(regF dst, regF a, regF b, rRegI tmp) %{
  predicate(UseAVX > 0 && n->is_reduction());

But I'm not sure if 'blend/min/max' is really preferable to a single 'ucomisd'?

To summarize:

              | blend/min/max | one ucomisd
--------------|---------------|------------
predictable   | ! tiny loss ! | 10% gain
unpredictable | 50% gain      | 10% gain
reduction     | !!high loss!! | 10% gain

If we'd like to maximize the unpredictable gain we find in examples like random search trees [1], we'd have to choose the 'blend/min/max' variant. To avoid regressions, we'd have to use statistics which might be shared between architectures [2]. However, data isn't collected per call-site [3] and the prediction might be wrong [4] in which case we'd have a tiny loss. For reduction scenarios, it'd be safer to back out all of them when using 'blend/min/max'.

A safer variant would be to optimize the current compiled code of the API using only one 'ucomisd' [0] vs several before [5]. The gain would be stable of about 10% in every scenarios. So, no need to do predictions and no need to bail out any more!

The discussion is still open... any opinion?

Thanks,
Bernard

[0] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-March/033011.html
[1] http://hg.openjdk.java.net/jdk/submit/file/d164e0b595e6/test/hotspot/jtreg/compiler/intrinsics/math/TestFpMinMaxIntrinsics.java#l185
[2] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-February/032973.html
[3] https://bugs.openjdk.java.net/browse/JDK-8015416
[4] http://hg.openjdk.java.net/jdk/submit/file/d164e0b595e6/src/hotspot/share/opto/library_call.cpp#l6612
[5] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-January/032564.html

From jatin.bhateja at intel.com Sat Mar 2 19:51:07 2019
From: jatin.bhateja at intel.com (Bhateja, Jatin)
Date: Sat, 2 Mar 2019 19:51:07 +0000
Subject: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request
In-Reply-To:
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <1341b8ab-1ab1-0270-86c4-5a4ac4945d03@oracle.com> <35c1db6b-a238-1e1e-9986-3d1a31b00bc2@redhat.com> <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com> <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <04476179-590e-9315-667c-cc6885477194@oracle.com>
Message-ID: > -----Original Message----- > From: B. Blaser [mailto:bsrbnd at gmail.com] > Sent: Saturday, March 2, 2019 8:26 PM > To: Bhateja, Jatin > Cc: Andrew Dinn ; Vladimir Kozlov > ; Pengfei Li (Arm Technology China) > ; aarch64-port-dev at openjdk.java.net; hotspot- > compiler-dev at openjdk.java.net > Subject: Re: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point > Math.min/max intrinsics, approval request > > On Fri, 1 Mar 2019 at 11:21, Bhateja, Jatin wrote: > > > > Current patch which is under review does not contain above code change > to bypass intrinsic creation for reduction patterns. > > For X86 performance degrades with intrinsic w.r.t to non-intrinsic > > implementation in reduction scenarios with and without data variance (i.e. > with and without branch predication effects). > > > > I could not find right hooks which can be called from common code for > adding any such target specific checks during ideal(DAG) construction. > > Please share if you know any. > > I guess if you want to back out all reduction scenarios, a probably better way > to do this would be to add predicates to your matching > rules: Having multiple selection patterns based on node properties is good if we have optimized selection patterns with and without properties (in this case reduction) , what I was asking for was a target specific hook at the ideal construction level which can be used to generate different graphs based on target requirements e.g. one target may see intrinsic creation beneficial where as other may not under some cases. 
> > instruct minF_random_reg(legRegF dst, legRegF a, legRegF b, legRegF tmp, > legRegF atmp, legRegF btmp) %{ > predicate(UseAVX > 0 && !n->is_reduction()); > > Reductions being computed properly here: > > diff --git a/src/hotspot/share/opto/loopTransform.cpp > b/src/hotspot/share/opto/loopTransform.cpp > --- a/src/hotspot/share/opto/loopTransform.cpp > +++ b/src/hotspot/share/opto/loopTransform.cpp > @@ -2039,7 +2039,8 @@ > if (n_ctrl != NULL && loop->is_member(get_loop(n_ctrl))) { > // Now test it to see if it fits the standard pattern for a reduction > operator. > int opc = def_node->Opcode(); > - if (opc != ReductionNode::opcode(opc, > def_node->bottom_type()->basic_type())) { > + if (opc != ReductionNode::opcode(opc, > def_node->bottom_type()->basic_type()) > + || opc == Op_MinD || opc == Op_MinF || opc == Op_MaxD > || opc == Op_MaxF) { > if (!def_node->is_reduction()) { // Not marked yet > // To be a reduction, the arithmetic node must have the phi as input > and provide a def to it > bool ok = false; > > And if this is a reduction you could use alternative rules, see [0]: > > instruct minF_reg(regF dst, regF a, regF b, rRegI tmp) %{ > predicate(UseAVX > 0 && n->is_reduction()); > > But I'm not sure if 'blend/min/max' is really preferable to a single 'ucomisd'? > > To summarize: > > | blend/min/max | one ucomisd > --------------|---------------|------------ > predictable | ! tiny loss ! | 10% gain > unpredictable | 50% gain | 10% gain > reduction | !!high loss!! | 10% gain I tried reduction with multiple combination of data (NaN , signed zeros and strict FP). I'm not sure if we will see 10% gains for reduction in all the cases, but performance won't degrade as with blend/min/max. > > If we'd like to maximize the unpredictable gain we find in examples like > random search trees [1], we'd have to choose the 'blend/min/max' > variant. To avoid regressions, we'd have to use statistics which might be > shared between architectures [2]. 
However, data isn't collected per call-site > [3] and the prediction might be wrong [4] in which case we'd have a tiny loss. > For reduction scenarios, it'd be safer to back out all of them when using > 'blend/min/max'. > > A safer variant would be to optimize the current compiled code of the API > using only one 'ucomisd' [0] vs several before [5]. The gain would be stable > of about 10% in every scenarios. So, no need to do predictions and no need > to bail out any more! > > The discussion is still open... any opinion? > > Thanks, > Bernard > > > [0] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019- > March/033011.html > [1] > http://hg.openjdk.java.net/jdk/submit/file/d164e0b595e6/test/hotspot/jtreg > /compiler/intrinsics/math/TestFpMinMaxIntrinsics.java#l185 > [2] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019- > February/032973.html > [3] https://bugs.openjdk.java.net/browse/JDK-8015416 > [4] > http://hg.openjdk.java.net/jdk/submit/file/d164e0b595e6/src/hotspot/shar > e/opto/library_call.cpp#l6612 > [5] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019- > January/032564.html From Pengfei.Li at arm.com Mon Mar 4 10:06:16 2019 From: Pengfei.Li at arm.com (Pengfei Li (Arm Technology China)) Date: Mon, 4 Mar 2019 10:06:16 +0000 Subject: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <3eb77fb3-db71-8e57-a9f3-ebf635a0291c@redhat.com> <1341b8ab-1ab1-0270-86c4-5a4ac4945d03@oracle.com> <35c1db6b-a238-1e1e-9986-3d1a31b00bc2@redhat.com> <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com> <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <7194e0cc-0f4f-7348-7b50-1347acbf9f92@redhat.com> <8bf4cc54-6e66-fab4-b3fe-4b026780924d@redhat.com> Message-ID: Hi Andrew Dinn, Thanks for your work on testing my pending patch. 
But I have to say the 3rd implementation (vecplus) is buggy. The main reason I didn't use FMAXV/FMINV for 2 doubles is the architecture doesn't support that.

> instruct reduce_max4F(vRegF dst, vRegF src1, vecX src2) %{
> . . .
>   ins_encode %{
>     __ fmaxv(as_FloatRegister($dst$$reg), __ T4S,
>              as_FloatRegister($src2$$reg));
>     __ fmaxs(as_FloatRegister($dst$$reg), as_FloatRegister($dst$$reg),
>              as_FloatRegister($src1$$reg));
>   %}
> . . .

Above matching rule for the float reduction node is correct.

> instruct reduce_max2D(vRegD dst, vRegD src1, vecX src2, vecX tmp) %{
> . . .
>   ins_encode %{
>     __ fmaxv(as_FloatRegister($dst$$reg), __ T2D,
>              as_FloatRegister($src2$$reg));
>     __ fmaxd(as_FloatRegister($dst$$reg), as_FloatRegister($dst$$reg),
>              as_FloatRegister($src1$$reg));
>   %}
> . . .

But this one for the double reduction node is *not* correct. From page 1502 of the Arm Architecture Reference Manual, we could see that the arrangement specifier for the floating-point min/max reduction instructions can only be 4S. It cannot be anything else. So below is the code I added in assembler_aarch64.hpp in my patch.

+#define INSN(NAME, opc) \
+  void NAME(FloatRegister Vd, SIMD_Arrangement T, FloatRegister Vn) { \
+    starti; \
+    assert(T == T4S, "arrangement must be T4S"); \
+    f(0b01101110, 31, 24), f(opc, 23), f(0b0110000111110, 22, 10); \
+    rf(Vn, 5), rf(Vd, 0); \
+  }
+
+  INSN(fmaxv, 0);
+  INSN(fminv, 1);
+
+#undef INSN
+

I hard-coded the arrangement bits in the encoding and added an assert (T == T4S) before that. So if you use "__ fmaxv(as_FloatRegister($dst$$reg), __ T2D, as_FloatRegister($src2$$reg));" in the ad file. The code generated will still be something like "fmaxv s21, v16.4s" (with 4S arrangement). And if you run the test with fastdebug build JDK, the assertion failure will be hit.

Maybe I shouldn't hard-code the arrangement bits in the encoding as it may hide problems. I will modify my previous webrev and post it in a new thread if you think so.
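
The concern about hard-coded arrangement bits can be sketched outside of HotSpot. The following is only an illustration under stated assumptions: `Arrangement`, `encode_hardcoded` and `encode_checked` are hypothetical names, not the assembler's API, and the only thing taken from the patch is the bit layout of the INSN macro quoted in this mail (0b01101110 in bits 31..24, opc in bit 23, 0b0110000111110 in bits 22..10, Vn at bit 5, Vd at bit 0).

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the assembler's SIMD_Arrangement enum.
enum class Arrangement { T2S, T4S, T2D };

// Mirrors the fixed bits of the INSN macro: the arrangement argument is
// accepted but never reaches the instruction word, so Q and sz stay at the
// 4S values no matter what the caller passes.
inline uint32_t encode_hardcoded(unsigned opc, Arrangement /*unused*/,
                                 unsigned vn, unsigned vd) {
  return (0x6Eu << 24)          // 0b01101110 in bits 31..24 (Q = 1)
       | ((opc & 1u) << 23)     // 0 = fmaxv, 1 = fminv
       | (0x0C3Eu << 10)        // 0b0110000111110 in bits 22..10 (sz = 0)
       | ((vn & 31u) << 5)      // source vector register
       | (vd & 31u);            // destination register
}

// Assumed alternative: reject anything but 4S even when asserts are
// compiled out, instead of silently emitting a 4S instruction.
inline uint32_t encode_checked(unsigned opc, Arrangement t,
                               unsigned vn, unsigned vd) {
  if (t != Arrangement::T4S) {
    throw std::invalid_argument("arrangement must be T4S");
  }
  return encode_hardcoded(opc, t, vn, vd);
}
```

In this sketch `encode_hardcoded(0, Arrangement::T2D, 16, 21)` returns the same word as the T4S call, which is the silent product-build behaviour described above, while `encode_checked` refuses the T2D request.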
--
Thanks,
Pengfei

From lutz.schmidt at sap.com Mon Mar 4 16:25:34 2019
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Mon, 4 Mar 2019 16:25:34 +0000
Subject: RFR(XS): 8219214: Infinite Loop in CodeSection::dump()
In-Reply-To: <77BFD1E1-02E9-470E-8638-90E4A8A95756@sap.com>
References: <77BFD1E1-02E9-470E-8638-90E4A8A95756@sap.com>
Message-ID: <63CA5BCB-8663-43EF-9902-BBC506B76C6B@sap.com>

Dear All,

the "mini-poll" mentioned below showed a clear result:
yes - 2 votes (delete)
no - 0 votes (keep and fix)

I therefore have prepared a new webrev. With that, I removed the methods

CodeBuffer::decode_all()
CodeBuffer::skip_decode()
CodeSection::dump()

from the files share/asm/codeBuffer.cpp and share/asm/codeBuffer.hpp. Submit repo results are still pending, but local tests on various platforms run ok.

Your reviews are appreciated!

Bug: https://bugs.openjdk.java.net/browse/JDK-8219214
Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8219214.01/

Thanks,
Lutz

On 20.02.19, 14:58, "Schmidt, Lutz" wrote:

Dear All,

I would like to propose the following change which fixes an infinite loop.

Bug: https://bugs.openjdk.java.net/browse/JDK-8219214
Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8219214.00/

The method affected is called from nowhere inside hotspot code, but could be called via a debugger for diagnostic purposes. BUT: even that doesn't seem to have happened in the past.

As an alternative, I would suggest to delete the method. Please voice your opinion. I am unbiased to either solution.

Thanks,
Lutz

From bsrbnd at gmail.com Mon Mar 4 21:15:19 2019
From: bsrbnd at gmail.com (B.
Blaser) Date: Mon, 4 Mar 2019 22:15:19 +0100 Subject: [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <1341b8ab-1ab1-0270-86c4-5a4ac4945d03@oracle.com> <35c1db6b-a238-1e1e-9986-3d1a31b00bc2@redhat.com> <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com> <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <04476179-590e-9315-667c-cc6885477194@oracle.com> Message-ID: On Sat, 2 Mar 2019 at 20:51, Bhateja, Jatin wrote: > > Having multiple selection patterns based on node properties is good if we have > optimized selection patterns with and without properties (in this case reduction) Pushed to jdk/submit as third changeset on branch JDK-8217561: http://hg.openjdk.java.net/jdk/submit/rev/9aa98249f99c I think this is our best solution, could we have a Reviewer feedback for this (hotspot:tier1 is OK on x86_64 xeon)? Thanks, Bernard From vivek.r.deshpande at intel.com Tue Mar 5 00:34:36 2019 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Tue, 5 Mar 2019 00:34:36 +0000 Subject: RFR(XS):8216580:X86: Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic In-Reply-To: References: <53E8E64DB2403849AFD89B7D4DAC8B2A9A14A6DA@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9A15F100@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4006FA@ORSMSX106.amr.corp.intel.com> <5cc2946e-7770-0323-6f63-405e7e539fd6@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F42D917@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F42EDE2@ORSMSX106.amr.corp.intel.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4306DA@ORSMSX106.amr.corp.intel.com> Hi Vladimir I have tested the patch with compiler tests on VNNI h/w and it passed. While doing tests in jdk, I noticed that the checks should be guarded against NULL. 
So I have added those checks: if(s1_ctrl != NULL && s2_ctrl != NULL) { ... The webrev is here: http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.03/ I have also rebased the patch on jdk/jdk. Regards, Vivek -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Friday, March 1, 2019 1:47 PM To: Deshpande, Vivek R ; 'Tobias Hartmann' ; 'hotspot-compiler-dev at openjdk.java.net compiler' Cc: Viswanathan, Sandhya ; Raj, Guru Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic My testing passed. I think you can push after you finish testing. Please, re-base you changes to jdk/jdk repository before push. I see that webrev.02 was prepared vs jdk/jdk12 which is wrong. Thanks, Vladimir On 3/1/19 11:53 AM, Deshpande, Vivek R wrote: > Hi Vladimir > > Thanks for the review. I am also working on testing it on the VNNI enabled h/w. > > Regards, > Vivek > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Friday, March 1, 2019 10:02 AM > To: Deshpande, Vivek R ; 'Tobias > Hartmann' ; > 'hotspot-compiler-dev at openjdk.java.net compiler' > > Cc: Viswanathan, Sandhya ; Raj, Guru > > Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code > by allowing adjacent LoadS nodes to be isomorphic > > This looks good. I assume you did full testing of these new changes on VNNI machine. I will submit testing on what we have. > > Thanks, > Vladimir > > On 2/28/19 5:23 PM, Deshpande, Vivek R wrote: >> Hi Vladimir >> >> Thanks for your inputs. I have made the changes according to your suggestion. >> The webrev is here: >> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.02/ >> This addresses the questions you had raised. >> With this patch the checks are applied to all the nodes but returns true only in case of muladds2i. 
>> >> Regards, >> Vivek >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Wednesday, February 13, 2019 12:29 PM >> To: Deshpande, Vivek R ; 'Tobias >> Hartmann' ; >> 'hotspot-compiler-dev at openjdk.java.net compiler' >> >> Cc: Viswanathan, Sandhya ; Raj, Guru >> >> Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code >> by allowing adjacent LoadS nodes to be isomorphic >> >> Hi Vivek, >> >> Most of new checks are loop invariant: !s1_ctrl_inv and >> !s1_ctrl->is_RangeCheck() >> >> I think you don't need to search for is_muladds2i() if those checks return false. >> >> Most general question is: why it should apply only to muladds2i nodes only? Can we do the same for others? >> >> Thanks, >> Vladimir >> >> On 2/8/19 2:17 PM, Deshpande, Vivek R wrote: >>> Hi Vladimir >>> >>> Would you please take a look at this patch. >>> >>> The Adjacent LoadS have different control RangeCheck node for accesses of type a[2i] and a[2i+1]. >>> This patch allows those nodes to be isomorphic as they belong same counted loop and MulAddS2I nodes. >>> >>> Webrev: >>> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.01/ >>> Bug ID: >>> https://bugs.openjdk.java.net/browse/JDK-8216580 >>> >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Deshpande, Vivek R >>> Sent: Monday, January 28, 2019 9:45 AM >>> To: Tobias Hartmann ; >>> hotspot-compiler-dev at openjdk.java.net compiler >>> ; Vladimir Kozlov >>> >>> Cc: Viswanathan, Sandhya ; Raj, Guru >>> >>> Subject: RE: RFR(XS):8216580:X86: Fix generation of VNNI vector code >>> by allowing adjacent LoadS nodes to be isomorphic >>> >>> Hi Vladimir >>> >>> Would you please take a look at the patch. >>> The Adjacent LoadS have different control RangeCheck node for accesses of type a[2i] and a[2i+1]. >>> This patch allows those nodes to be isomorphic as they belong same counted loop and MulAddS2I nodes. 
>>> >>> Webrev: >>> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.01/ >>> >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] >>> Sent: Tuesday, January 15, 2019 2:57 AM >>> To: Deshpande, Vivek R ; >>> hotspot-compiler-dev at openjdk.java.net compiler >>> >>> Cc: Vladimir Kozlov ; Viswanathan, >>> Sandhya ; Raj, Guru >>> >>> Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code >>> by allowing adjacent LoadS nodes to be isomorphic >>> >>> Hi Vivek, >>> >>> please add parentheses around the == comparison in lines 1225,1226. >>> >>> Otherwise this looks reasonable to me but I'm not too familiar with that code. >>> >>> Best regards, >>> Tobias >>> >>> On 12.01.19 01:03, Deshpande, Vivek R wrote: >>>> Hi Tobias >>>> >>>> The webrev for the bug JDK-821650 is here: >>>> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.00/ >>>> This fixes generation of vector code by allowing adjacent LoadS nodes to be isomorphic when they have different control RangeCheck nodes for a[i] and a[i+1] accesses in same MulAddS2I node. >>>> Could you please review it. >>>> >>>> Regards, >>>> Vivek >>>> >>>> -----Original Message----- >>>> From: Deshpande, Vivek R >>>> Sent: Friday, January 11, 2019 11:38 AM >>>> To: 'Tobias Hartmann' ; >>>> hotspot-compiler-dev at openjdk.java.net compiler >>>> >>>> Cc: Vladimir Kozlov ; Viswanathan, >>>> Sandhya ; Raj, Guru >>>> >>>> Subject: RE: RFR(S):8216050:X86: Fix for Superword optimization >>>> fails with assert(0 <= i && i < _len) failed: illegal index >>>> >>>> Hi Tobias >>>> >>>> Thanks for reviewing the patch. >>>> I have made the changes according to your suggestion. >>>> In this webrev: >>>> http://cr.openjdk.java.net/~vdeshpande/8216050/webrev.01/ >>>> I have fix for the crash reported in the 8216050. >>>> >>>> The lower cost is needed for generation of vpdpwssd instruction, by combining AddVI and MulAddVS2VI. 
>>>> For other instructions pmaddwd and vpmaddwd, they get generated on platforms upto skylake with default cost. >>>> >>>> I have updated the bug also with the link to webrev. >>>> >>>> I have created a different bug JDK-8216580 for >>>> 3) Fix generation of vector code by allowing adjacent LoadS nodes to be isomorphic when they have different control RangeCheck nodes >>>> for a[i] and a[i+1] accesses in same MulAddS2I node >>>> >>>> Thank you. >>>> Regards, >>>> Vivek >>>> >>>> -----Original Message----- >>>> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] >>>> Sent: Friday, January 11, 2019 4:49 AM >>>> To: Deshpande, Vivek R ; >>>> hotspot-compiler-dev at openjdk.java.net compiler >>>> >>>> Cc: Vladimir Kozlov ; Viswanathan, >>>> Sandhya ; Raj, Guru >>>> >>>> Subject: Re: RFR(S):8216050:X86: Fix for Superword optimization >>>> fails with assert(0 <= i && i < _len) failed: illegal index >>>> >>>> Hi Vivek, >>>> >>>> On 11.01.19 07:58, Deshpande, Vivek R wrote: >>>>> 1) Fix for the crash by matching the operand by swapping to right positions. >>>> >>>> Looks good but the change to loopopts.cpp:530 screwed up the indentation around the ifs, please fix. >>>> >>>>> 2) Cost based generation of vpdpwssd instruction. >>>> >>>> Other instructions added by JDK-8214751 still miss a cost definition, for example: >>>> http://hg.openjdk.java.net/jdk/jdk/rev/4bb6e0871bf7#l5.20 >>>> >>>>> 3) Fix generation of vector code by allowing adjacent LoadS nodes >>>>> to be isomorphic when they have different control RangeCheck nodes >>>>> ????for a[i] and a[i+1] accesses in same MulAddS2I node >>>> >>>> This is unrelated to the original bug, right? If so, this should be integrated with a separate RFE. 
>>>> >>>> Thanks, >>>> Tobias >>>> From adinn at redhat.com Tue Mar 5 08:57:10 2019 From: adinn at redhat.com (Andrew Dinn) Date: Tue, 5 Mar 2019 08:57:10 +0000 Subject: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <1341b8ab-1ab1-0270-86c4-5a4ac4945d03@oracle.com> <35c1db6b-a238-1e1e-9986-3d1a31b00bc2@redhat.com> <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com> <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <7194e0cc-0f4f-7348-7b50-1347acbf9f92@redhat.com> <8bf4cc54-6e66-fab4-b3fe-4b026780924d@redhat.com> Message-ID: Hi Pengfei, On 04/03/2019 10:06, Pengfei Li (Arm Technology China) wrote: > > Thanks for your work on testing my pending patch. But I have to say > the 3rd implementation (vecplus) is buggy. The main reason I didn't > use FMAXV/FMINV for 2 doubles is the architecture doesn't support > that. > . . . > So if you use "__ fmaxv(as_FloatRegister($dst$$reg), __ T2D, > as_FloatRegister($src2$$reg));" in the ad file. The code generated > will still be something like "fmaxv s21, v16.4s" (with 4S > arrangement). And if you run the test with fastdebug build JDK, the > assertion failure will be hit. Thank you for pointing this out. I had not noticed the warning about this encoding being reserved in the ARM ARM nor that your encoding was forcing the Q and sz bits to be 1 0. What seems very odd to me is the difference between fmaxv and fminv. Both Q == 1 encodings (i.e. with sz in {0, 1}) are reserved for fmaxv. However, the encoding for fminv accepts both Q == 1 encodings with the expected interpretation. > Maybe I shouldn't hard-code the arrangement bits in the encoding as > it may hide problems. I will modify my previous webrev and post it in > a new thread if you think so. 
Yes, I think it would probably be better to leave the assert in place and use the encoding implied by the SIMD_Arrangement parameter i.e. T4S ==> Q=1, sz=0 and T2D ==> Q=1, sz=1. That way the assert will catch errors in debug builds and non-debug builds should be stopped by a SIGILL exception.

Also, if you adjust the assert to only apply to the fmax case then we can still translate 2D FMIN reduction using the fminv instruction.

    assert(opc == 1 || T == T4S, "fmaxv arrangement must be T4S");

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From lutz.schmidt at sap.com Tue Mar 5 10:54:43 2019
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Tue, 5 Mar 2019 10:54:43 +0000
Subject: RFR(XS): 8219214: Infinite Loop in CodeSection::dump()
In-Reply-To: <63CA5BCB-8663-43EF-9902-BBC506B76C6B@sap.com>
References: <77BFD1E1-02E9-470E-8638-90E4A8A95756@sap.com> <63CA5BCB-8663-43EF-9902-BBC506B76C6B@sap.com>
Message-ID:

Update:
Meanwhile, Mach5 tasks reported 0 failed tests.
Build Details: 2019-03-04-1542344.lutz.schmidt.source

Regards,
Lutz

On 04.03.19, 17:25, "Schmidt, Lutz" wrote:

Dear All,

the "mini-poll" mentioned below showed a clear result:
yes - 2 votes (delete)
no - 0 votes (keep and fix)

I therefore have prepared a new webrev. With that, I removed the methods

CodeBuffer::decode_all()
CodeBuffer::skip_decode()
CodeSection::dump()

from the files share/asm/codeBuffer.cpp and share/asm/codeBuffer.hpp. Submit repo results are still pending, but local tests on various platforms run ok.

Your reviews are appreciated!

Bug: https://bugs.openjdk.java.net/browse/JDK-8219214
Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8219214.01/

Thanks,
Lutz

On 20.02.19, 14:58, "Schmidt, Lutz" wrote:

Dear All,

I would like to propose the following change which fixes an infinite loop.
Bug: https://bugs.openjdk.java.net/browse/JDK-8219214
Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8219214.00/

The method affected is called from nowhere inside hotspot code, but could be called via a debugger for diagnostic purposes. BUT: even that doesn't seem to have happened in the past.

As an alternative, I would suggest to delete the method. Please voice your opinion. I am unbiased to either solution.

Thanks,
Lutz

From dmitrij.pochepko at bell-sw.com Tue Mar 5 12:32:48 2019
From: dmitrij.pochepko at bell-sw.com (Dmitrij Pochepko)
Date: Tue, 5 Mar 2019 15:32:48 +0300
Subject: RFR: 8218749 - AARCH64: String compress intrinsic documentation and maintenance improvement
Message-ID: <899ed098-8179-0df3-41e4-b6ff5b5e73c6@bell-sw.com>

Hi all,

please review patch for JDK-8218749: AARCH64: String compress intrinsic documentation and maintenance improvement

webrev: http://cr.openjdk.java.net/~dpochepk/8218749/webrev.01/

This patch adds documentation with test and moves equivalent code generation into separate method. Documentation is both added into documentation block and inlined into code.

Testing:
- hotspot jtreg tests: compiler/*, gc/* and runtime/*
- jdk jtreg tests: tier1-tier3
- jck tests
No regression found.

I'd like to thank Pengfei Li for help in pre-review.

CR: https://bugs.openjdk.java.net/browse/JDK-8218749

Thanks,
Dmitrij

From claes.redestad at oracle.com Tue Mar 5 15:54:58 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Tue, 5 Mar 2019 16:54:58 +0100
Subject: RFR: 8220159: Optimize various RegMask operations by introducing watermarks
Message-ID: <6035628e-9673-cecf-1e55-40320237a6a4@oracle.com>

Hi,

by introducing a low and high water mark of the RegMask words that we are sure have register bits, we can reduce time spent doing a variety of operations.
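
As a rough illustration of the watermark idea (a minimal sketch, not the code in the webrev): `TinyRegMask`, its 8-word size and all member names below are assumptions made up for this example. The mask remembers the lowest and highest word index that may hold a set bit, so emptiness and overlap checks only visit that range instead of every word.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature register mask; not HotSpot's RegMask.
class TinyRegMask {
 public:
  static const int kWords = 8;  // 8 * 32 = 256 register bits

  // An empty mask starts with an inverted range (_lwm > _hwm),
  // so loops over [_lwm, _hwm] do no work at all.
  TinyRegMask() : _lwm(kWords - 1), _hwm(0) {
    for (int i = 0; i < kWords; i++) _words[i] = 0;
  }

  void insert(int reg) {
    assert(reg >= 0 && reg < kWords * 32);
    int w = reg >> 5;
    _words[w] |= uint32_t(1) << (reg & 31);
    if (w < _lwm) _lwm = w;  // widen the range to include word w
    if (w > _hwm) _hwm = w;
  }

  // Only words between the watermarks can contain bits.
  bool is_not_empty() const {
    uint32_t bits = 0;
    for (int i = _lwm; i <= _hwm; i++) bits |= _words[i];
    return bits != 0;
  }

  // Two masks can only share bits where their ranges intersect.
  bool overlaps(const TinyRegMask& other) const {
    int lo = _lwm > other._lwm ? _lwm : other._lwm;
    int hi = _hwm < other._hwm ? _hwm : other._hwm;
    for (int i = lo; i <= hi; i++) {
      if (_words[i] & other._words[i]) return true;
    }
    return false;
  }

  // Number of words a scan has to visit.
  int words_in_range() const { return _hwm >= _lwm ? _hwm - _lwm + 1 : 0; }

 private:
  uint32_t _words[kWords];
  int _lwm;  // lowest word index that may have bits set
  int _hwm;  // highest word index that may have bits set
};
```

The watermarks in this sketch are conservative: they only ever widen, so a mask whose bits were later cleared would still be scanned over its old range, which is safe but not minimal.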
Bug: https://bugs.openjdk.java.net/browse/JDK-8220159
Webrev: http://cr.openjdk.java.net/~redestad/8220159/open.00/

On a few profiled startup application the average number of instructions spent compiling methods in C2 drops 10-15%, and we see improvements on a range of startup and footprint applications.

There are also a few improvements on microbenchmarks, likely due to a positive effect on warmup times.

Testing: tier1-3

Thanks!

/Claes

From nils.eliasson at oracle.com Tue Mar 5 15:42:18 2019
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Tue, 5 Mar 2019 16:42:18 +0100
Subject: RFR: 8220159: Optimize various RegMask operations by introducing watermarks
In-Reply-To: <6035628e-9673-cecf-1e55-40320237a6a4@oracle.com>
References: <6035628e-9673-cecf-1e55-40320237a6a4@oracle.com>
Message-ID: <4ba4e387-8cf9-bf2c-e103-bf59845c1695@oracle.com>

Hi Claes,

Terrific work!

Reviewed,
Nils

On 2019-03-05 16:54, Claes Redestad wrote:
> Hi,
>
> by introducing a low and high water mark of the RegMask words that we
> are sure have register bits, we can reduce time spent doing a variety of
> operations.
>
> Bug:    https://bugs.openjdk.java.net/browse/JDK-8220159
> Webrev: http://cr.openjdk.java.net/~redestad/8220159/open.00/
>
> On a few profiled startup application the average number of instructions
> spent compiling methods in C2 drops 10-15%, and we see improvements on a
> range of startup and footprint applications.
>
> There are also a few improvements on microbenchmarks, likely due to a
> positive effect on warmup times.
>
> Testing: tier1-3
>
> Thanks!
>
> /Claes

From vladimir.kozlov at oracle.com Tue Mar 5 20:23:32 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 5 Mar 2019 12:23:32 -0800
Subject: RFR(XS): 8219214: Infinite Loop in CodeSection::dump()
In-Reply-To: <63CA5BCB-8663-43EF-9902-BBC506B76C6B@sap.com>
References: <77BFD1E1-02E9-470E-8638-90E4A8A95756@sap.com> <63CA5BCB-8663-43EF-9902-BBC506B76C6B@sap.com>
Message-ID:

Looks good.

Thanks,
Vladimir

On 3/4/19 8:25 AM, Schmidt, Lutz wrote:
> Dear All,
>
> the "mini-poll" mentioned below showed a clear result:
> yes - 2 votes (delete)
> no - 0 votes (keep and fix)
>
> I therefore have prepared a new webrev. With that, I removed the methods
>
> CodeBuffer::decode_all()
> CodeBuffer::skip_decode()
> CodeSection::dump()
>
> from the files share/asm/codeBuffer.cpp and share/asm/codeBuffer.hpp. Submit repo results are still pending, but local tests on various platforms run ok.
>
> Your reviews are appreciated!
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8219214
> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8219214.01/
>
> Thanks,
> Lutz
>
>
> On 20.02.19, 14:58, "Schmidt, Lutz" wrote:
>
> Dear All,
>
> I would like to propose the following change which fixes an infinite loop.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8219214
> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8219214.00/
>
> The method affected is called from nowhere inside hotspot code, but could be called via a debugger for diagnostic purposes. BUT: even that doesn't seem to have happened in the past.
>
> As an alternative, I would suggest to delete the method. Please voice your opinion. I am unbiased to either solution.
> > Thanks, > Lutz > > > > > From vladimir.kozlov at oracle.com Tue Mar 5 20:39:56 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 5 Mar 2019 12:39:56 -0800 Subject: RFR(XS):8216580:X86: Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A9F4306DA@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A9A14A6DA@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9A15F100@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4006FA@ORSMSX106.amr.corp.intel.com> <5cc2946e-7770-0323-6f63-405e7e539fd6@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F42D917@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F42EDE2@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4306DA@ORSMSX106.amr.corp.intel.com> Message-ID: <818fddad-6809-2e9b-191f-8cc980cbd7be@oracle.com> Looks good. Thanks, Vladimir On 3/4/19 4:34 PM, Deshpande, Vivek R wrote: > Hi Vladimir > > I have tested the patch with compiler tests on VNNI h/w and it passed. > While doing tests in jdk, I noticed that the checks should be guarded against NULL. > So I have added those checks: > if(s1_ctrl != NULL && s2_ctrl != NULL) { ... > The webrev is here: > http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.03/ > I have also rebased the patch on jdk/jdk. > > Regards, > Vivek > > -----Original Message----- > From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] > Sent: Friday, March 1, 2019 1:47 PM > To: Deshpande, Vivek R ; 'Tobias Hartmann' ; 'hotspot-compiler-dev at openjdk.java.net compiler' > Cc: Viswanathan, Sandhya ; Raj, Guru > Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic > > My testing passed. I think you can push after you finish testing. > Please, re-base you changes to jdk/jdk repository before push. I see that webrev.02 was prepared vs jdk/jdk12 which is wrong. 
> > Thanks, > Vladimir > > On 3/1/19 11:53 AM, Deshpande, Vivek R wrote: >> Hi Vladimir >> >> Thanks for the review. I am also working on testing it on the VNNI enabled h/w. >> >> Regards, >> Vivek >> >> -----Original Message----- >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >> Sent: Friday, March 1, 2019 10:02 AM >> To: Deshpande, Vivek R ; 'Tobias >> Hartmann' ; >> 'hotspot-compiler-dev at openjdk.java.net compiler' >> >> Cc: Viswanathan, Sandhya ; Raj, Guru >> >> Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code >> by allowing adjacent LoadS nodes to be isomorphic >> >> This looks good. I assume you did full testing of these new changes on VNNI machine. I will submit testing on what we have. >> >> Thanks, >> Vladimir >> >> On 2/28/19 5:23 PM, Deshpande, Vivek R wrote: >>> Hi Vladimir >>> >>> Thanks for your inputs. I have made the changes according to your suggestion. >>> The webrev is here: >>> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.02/ >>> This addresses the questions you had raised. >>> With this patch the checks are applied to all the nodes but returns true only in case of muladds2i. >>> >>> Regards, >>> Vivek >>> >>> -----Original Message----- >>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] >>> Sent: Wednesday, February 13, 2019 12:29 PM >>> To: Deshpande, Vivek R ; 'Tobias >>> Hartmann' ; >>> 'hotspot-compiler-dev at openjdk.java.net compiler' >>> >>> Cc: Viswanathan, Sandhya ; Raj, Guru >>> >>> Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code >>> by allowing adjacent LoadS nodes to be isomorphic >>> >>> Hi Vivek, >>> >>> Most of new checks are loop invariant: !s1_ctrl_inv and >>> !s1_ctrl->is_RangeCheck() >>> >>> I think you don't need to search for is_muladds2i() if those checks return false. >>> >>> Most general question is: why it should apply only to muladds2i nodes only? Can we do the same for others? 
>>> >>> Thanks, >>> Vladimir >>> >>> On 2/8/19 2:17 PM, Deshpande, Vivek R wrote: >>>> Hi Vladimir >>>> >>>> Would you please take a look at this patch. >>>> >>>> The Adjacent LoadS have different control RangeCheck node for accesses of type a[2i] and a[2i+1]. >>>> This patch allows those nodes to be isomorphic as they belong same counted loop and MulAddS2I nodes. >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.01/ >>>> Bug ID: >>>> https://bugs.openjdk.java.net/browse/JDK-8216580 >>>> >>>> Regards, >>>> Vivek >>>> >>>> -----Original Message----- >>>> From: Deshpande, Vivek R >>>> Sent: Monday, January 28, 2019 9:45 AM >>>> To: Tobias Hartmann ; >>>> hotspot-compiler-dev at openjdk.java.net compiler >>>> ; Vladimir Kozlov >>>> >>>> Cc: Viswanathan, Sandhya ; Raj, Guru >>>> >>>> Subject: RE: RFR(XS):8216580:X86: Fix generation of VNNI vector code >>>> by allowing adjacent LoadS nodes to be isomorphic >>>> >>>> Hi Vladimir >>>> >>>> Would you please take a look at the patch. >>>> The Adjacent LoadS have different control RangeCheck node for accesses of type a[2i] and a[2i+1]. >>>> This patch allows those nodes to be isomorphic as they belong same counted loop and MulAddS2I nodes. >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.01/ >>>> >>>> Regards, >>>> Vivek >>>> >>>> -----Original Message----- >>>> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] >>>> Sent: Tuesday, January 15, 2019 2:57 AM >>>> To: Deshpande, Vivek R ; >>>> hotspot-compiler-dev at openjdk.java.net compiler >>>> >>>> Cc: Vladimir Kozlov ; Viswanathan, >>>> Sandhya ; Raj, Guru >>>> >>>> Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code >>>> by allowing adjacent LoadS nodes to be isomorphic >>>> >>>> Hi Vivek, >>>> >>>> please add parentheses around the == comparison in lines 1225,1226. >>>> >>>> Otherwise this looks reasonable to me but I'm not too familiar with that code. 
>>>> >>>> Best regards, >>>> Tobias >>>> >>>> On 12.01.19 01:03, Deshpande, Vivek R wrote: >>>>> Hi Tobias >>>>> >>>>> The webrev for the bug JDK-821650 is here: >>>>> http://cr.openjdk.java.net/~vdeshpande/8216580/webrev.00/ >>>>> This fixes generation of vector code by allowing adjacent LoadS nodes to be isomorphic when they have different control RangeCheck nodes for a[i] and a[i+1] accesses in same MulAddS2I node. >>>>> Could you please review it. >>>>> >>>>> Regards, >>>>> Vivek >>>>> >>>>> -----Original Message----- >>>>> From: Deshpande, Vivek R >>>>> Sent: Friday, January 11, 2019 11:38 AM >>>>> To: 'Tobias Hartmann' ; >>>>> hotspot-compiler-dev at openjdk.java.net compiler >>>>> >>>>> Cc: Vladimir Kozlov ; Viswanathan, >>>>> Sandhya ; Raj, Guru >>>>> >>>>> Subject: RE: RFR(S):8216050:X86: Fix for Superword optimization >>>>> fails with assert(0 <= i && i < _len) failed: illegal index >>>>> >>>>> Hi Tobias >>>>> >>>>> Thanks for reviewing the patch. >>>>> I have made the changes according to your suggestion. >>>>> In this webrev: >>>>> http://cr.openjdk.java.net/~vdeshpande/8216050/webrev.01/ >>>>> I have fix for the crash reported in the 8216050. >>>>> >>>>> The lower cost is needed for generation of vpdpwssd instruction, by combining AddVI and MulAddVS2VI. >>>>> For other instructions pmaddwd and vpmaddwd, they get generated on platforms upto skylake with default cost. >>>>> >>>>> I have updated the bug also with the link to webrev. >>>>> >>>>> I have created a different bug JDK-8216580 for >>>>> 3) Fix generation of vector code by allowing adjacent LoadS nodes to be isomorphic when they have different control RangeCheck nodes >>>>> for a[i] and a[i+1] accesses in same MulAddS2I node >>>>> >>>>> Thank you. 
>>>>> Regards, >>>>> Vivek >>>>> >>>>> -----Original Message----- >>>>> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] >>>>> Sent: Friday, January 11, 2019 4:49 AM >>>>> To: Deshpande, Vivek R ; >>>>> hotspot-compiler-dev at openjdk.java.net compiler >>>>> >>>>> Cc: Vladimir Kozlov ; Viswanathan, >>>>> Sandhya ; Raj, Guru >>>>> >>>>> Subject: Re: RFR(S):8216050:X86: Fix for Superword optimization >>>>> fails with assert(0 <= i && i < _len) failed: illegal index >>>>> >>>>> Hi Vivek, >>>>> >>>>> On 11.01.19 07:58, Deshpande, Vivek R wrote: >>>>>> 1) Fix for the crash by matching the operand by swapping to right positions. >>>>> >>>>> Looks good but the change to loopopts.cpp:530 screwed up the indentation around the ifs, please fix. >>>>> >>>>>> 2) Cost based generation of vpdpwssd instruction. >>>>> >>>>> Other instructions added by JDK-8214751 still miss a cost definition, for example: >>>>> http://hg.openjdk.java.net/jdk/jdk/rev/4bb6e0871bf7#l5.20 >>>>> >>>>>> 3) Fix generation of vector code by allowing adjacent LoadS nodes >>>>>> to be isomorphic when they have different control RangeCheck nodes >>>>>> for a[i] and a[i+1] accesses in same MulAddS2I node >>>>> >>>>> This is unrelated to the original bug, right? If so, this should be integrated with a separate RFE. 
>>>>> >>>>> Thanks, >>>>> Tobias >>>>> From vivek.r.deshpande at intel.com Tue Mar 5 22:09:00 2019 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Tue, 5 Mar 2019 22:09:00 +0000 Subject: RFR(XS):8216580:X86: Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic In-Reply-To: <818fddad-6809-2e9b-191f-8cc980cbd7be@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A9A14A6DA@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9A15F100@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4006FA@ORSMSX106.amr.corp.intel.com> <5cc2946e-7770-0323-6f63-405e7e539fd6@oracle.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F42D917@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F42EDE2@ORSMSX106.amr.corp.intel.com> <53E8E64DB2403849AFD89B7D4DAC8B2A9F4306DA@ORSMSX106.amr.corp.intel.com> <818fddad-6809-2e9b-191f-8cc980cbd7be@oracle.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A9F43164B@ORSMSX106.amr.corp.intel.com> Thanks Vladimir. I have pushed the change. Regards, Vivek -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Tuesday, March 5, 2019 12:40 PM To: Deshpande, Vivek R ; 'Tobias Hartmann' ; 'hotspot-compiler-dev at openjdk.java.net compiler' Cc: Viswanathan, Sandhya ; Raj, Guru Subject: Re: RFR(XS):8216580:X86: Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic Looks good. Thanks, Vladimir From vivek.r.deshpande at intel.com Wed Mar 6 01:24:36 2019 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Wed, 6 Mar 2019 01:24:36 +0000 Subject: RFR(XXS): 8220211: Small update to Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic (JDK-8216580) Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A9F431813@ORSMSX106.amr.corp.intel.com> Hi I have a webrev for small update to the bug JDK-8216580. In this fix 8216580, the index variable for the loop iterator is wrongly used. The webrev is here: http://cr.openjdk.java.net/~vdeshpande/8220211/webrev.00/ I have also submitted the submit repo test. Regards, Vivek From vladimir.kozlov at oracle.com Wed Mar 6 01:37:45 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 5 Mar 2019 17:37:45 -0800 Subject: RFR(XXS): 8220211: Small update to Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic (JDK-8216580) In-Reply-To: <53E8E64DB2403849AFD89B7D4DAC8B2A9F431813@ORSMSX106.amr.corp.intel.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A9F431813@ORSMSX106.amr.corp.intel.com> Message-ID: <33a8afba-a937-5964-1150-f32a62601dd5@oracle.com> Looks good. Thanks, Vladimir On 3/5/19 5:24 PM, Deshpande, Vivek R wrote: > Hi > > I have a webrev for small update to the bug JDK-8216580. > > In this fix 8216580, the index variable for the loop iterator is wrongly used. > > The webrev is here: > > http://cr.openjdk.java.net/~vdeshpande/8220211/webrev.00/ > > I have also submitted the submit repo test. 
> > Regards, > > Vivek > From Pengfei.Li at arm.com Wed Mar 6 02:05:58 2019 From: Pengfei.Li at arm.com (Pengfei Li (Arm Technology China)) Date: Wed, 6 Mar 2019 02:05:58 +0000 Subject: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <1341b8ab-1ab1-0270-86c4-5a4ac4945d03@oracle.com> <35c1db6b-a238-1e1e-9986-3d1a31b00bc2@redhat.com> <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com> <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <7194e0cc-0f4f-7348-7b50-1347acbf9f92@redhat.com> <8bf4cc54-6e66-fab4-b3fe-4b026780924d@redhat.com> Message-ID: Hi Andrew Dinn, > What seems very odd to me is the difference between fmaxv and fminv. > Both Q == 1 encodings (i.e. with sz in {0, 1}) are reserved for fmaxv. > However, the encoding for fminv accepts both Q == 1 encodings with the > expected interpretation. In the latest published version of the ArmARM doc, I don't see such difference between fmaxv and fminv. I guess what you have seen might be a bug of the previous version docs. > Yes, I think it would probably be better to leave the assert in place and use > the encoding implied by the SIMD_Arrangement parameter i.e. T2S ==> > Q=1,sz=0 and T2D ==> Q=1, sz=1. That way the assert will catch errors in > debug builds and non-debug builds should be stopped by a SIGILL exception. Thanks, I will do this and post a new webrev in a new thread then. 
-- Thanks, Pengfei From vivek.r.deshpande at intel.com Wed Mar 6 05:07:07 2019 From: vivek.r.deshpande at intel.com (Deshpande, Vivek R) Date: Wed, 6 Mar 2019 05:07:07 +0000 Subject: RFR(XXS): 8220211: Small update to Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic (JDK-8216580) In-Reply-To: <33a8afba-a937-5964-1150-f32a62601dd5@oracle.com> References: <53E8E64DB2403849AFD89B7D4DAC8B2A9F431813@ORSMSX106.amr.corp.intel.com> <33a8afba-a937-5964-1150-f32a62601dd5@oracle.com> Message-ID: <53E8E64DB2403849AFD89B7D4DAC8B2A9F431B62@ORSMSX106.amr.corp.intel.com> Thanks Vladimir. It passed the submit repo tests. I would push the patch. Regards, Vivek -----Original Message----- From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] Sent: Tuesday, March 5, 2019 5:38 PM To: Deshpande, Vivek R ; 'hotspot-compiler-dev at openjdk.java.net compiler' Subject: Re: RFR(XXS): 8220211: Small update to Fix generation of VNNI vector code by allowing adjacent LoadS nodes to be isomorphic (JDK-8216580) Looks good. Thanks, Vladimir On 3/5/19 5:24 PM, Deshpande, Vivek R wrote: > Hi > > I have a webrev for small update to the bug JDK-8216580. > > In this fix 8216580, the index variable for the loop iterator is wrongly used. > > The webrev is here: > > http://cr.openjdk.java.net/~vdeshpande/8220211/webrev.00/ > > I have also submitted the submit repo test. > > Regards, > > Vivek > From tobias.hartmann at oracle.com Wed Mar 6 09:08:25 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 6 Mar 2019 10:08:25 +0100 Subject: RFR (trivial): 8219519: Remove linux_sparc.ad and linux_aarch64.ad In-Reply-To: References: <900ea372-d9fd-e1fd-051d-3d96e1ec2c66@loongson.cn> <9c080917-84eb-ba84-5951-f159b103e6ed@oracle.com> <44494f4f-26ef-3b9e-b38e-eca8178aba89@loongson.cn> <3134c5a6-0a16-5091-6e09-b6c3c347c84c@oracle.com> Message-ID: Hi Jie, sorry, I was out of the office. Pushed. 
Best regards, Tobias On 28.02.19 02:57, Jie Fu wrote: > Hi Tobias, > > I have exported the patch in the attachment. > Could you please push it for me? > Thanks a lot. > > Best regards, > Jie > > On 2019/2/27 9:05 PM, Jie Fu wrote: >> Hi Tobias, >> >> Thanks for your review. >> Could you please sponsor it or find a sponsor for me? >> Thanks a lot. >> >> Best regards, >> Jie >> >> >> On 2019-02-27 18:03, Tobias Hartmann wrote: >>> On 27.02.19 06:31, Jie Fu wrote: >>>> As for the linux_sparc.ad, it is actually empty for the adlc compiler. >>>> It should be safe too. >>>> >>>> Tobias, is it still required to test it on the sparc platform? >>>> Thanks. >>> No, I think it's good to go. >>> >>> Best regards, >>> Tobias >> From fujie at loongson.cn Wed Mar 6 09:14:03 2019 From: fujie at loongson.cn (Jie Fu) Date: Wed, 6 Mar 2019 17:14:03 +0800 Subject: RFR (trivial): 8219519: Remove linux_sparc.ad and linux_aarch64.ad In-Reply-To: References: <900ea372-d9fd-e1fd-051d-3d96e1ec2c66@loongson.cn> <9c080917-84eb-ba84-5951-f159b103e6ed@oracle.com> <44494f4f-26ef-3b9e-b38e-eca8178aba89@loongson.cn> <3134c5a6-0a16-5091-6e09-b6c3c347c84c@oracle.com> Message-ID: <9d9d6f36-2eec-bd9d-0f74-98af70506e5f@loongson.cn> Thank you so much, Tobias. On 2019/3/6 5:08 PM, Tobias Hartmann wrote: > Hi Jie, > > sorry, I was out of the office. Pushed. > > Best regards, > Tobias From tobias.hartmann at oracle.com Wed Mar 6 09:15:57 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 6 Mar 2019 10:15:57 +0100 Subject: RFR: 8220159: Optimize various RegMask operations by introducing watermarks In-Reply-To: <6035628e-9673-cecf-1e55-40320237a6a4@oracle.com> References: <6035628e-9673-cecf-1e55-40320237a6a4@oracle.com> Message-ID: <58ecaaab-7634-89b4-f08e-f695fe4ceb05@oracle.com> Hi Claes, nice improvements, looks good to me! Best regards, Tobias On 05.03.19 16:54, Claes Redestad wrote: > Hi, > > by introducing a low and high water mark of the RegMask words that we > are sure have register bits, we can reduce time spent doing a variety of > operations. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8220159 > Webrev: http://cr.openjdk.java.net/~redestad/8220159/open.00/ > > On a few profiled startup application the average number of instructions > spent compiling methods in C2 drops 10-15%, and we see improvements on a > range of startup and footprint applications. > > There are also a few improvements on microbenchmarks, likely due to a > positive effect on warmup times. > > Testing: tier1-3 > > Thanks! > > /Claes From tobias.hartmann at oracle.com Wed Mar 6 09:18:25 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 6 Mar 2019 10:18:25 +0100 Subject: RFR(XS): 8219214: Infinite Loop in CodeSection::dump() In-Reply-To: References: <77BFD1E1-02E9-470E-8638-90E4A8A95756@sap.com> <63CA5BCB-8663-43EF-9902-BBC506B76C6B@sap.com> Message-ID: <16b3b06a-094d-f4df-341b-8655f5040e43@oracle.com> +1 Best regards, Tobias On 05.03.19 21:23, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir > > On 3/4/19 8:25 AM, Schmidt, Lutz wrote: >> Dear All, >> >> the "mini-poll" mentioned below showed a clear result: >> yes - 2 votes (delete) >> no - 0 votes (keep and fix) >> >> I therefore have prepared a new webrev. With that, I removed the methods >> >> CodeBuffer::decode_all() >> CodeBuffer::skip_decode() >> CodeSection::dump() >> >> from the files share/asm/codeBuffer.cpp and share/asm/codeBuffer.hpp. Submit repo results are >> still pending, but local tests on various platforms run ok. >> >> Your reviews are appreciated! >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8219214 >> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8219214.01/ >> >> Thanks, >> Lutz >> >> >> On 20.02.19, 14:58, "Schmidt, Lutz" wrote: >> >> Dear All, >> I would like to propose the following change which fixes an infinite loop. >> Bug: https://bugs.openjdk.java.net/browse/JDK-8219214 >> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8219214.00/ >> The method affected is called from nowhere inside hotspot code, but could be called via >> a debugger for diagnostic purposes. BUT: even that doesn't seem to have happened in the past. >> As an alternative, I would suggest to delete the method. Please voice your opinion. I am >> unbiased to either solution. >> Thanks, >> Lutz >> From lutz.schmidt at sap.com Wed Mar 6 10:05:02 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Wed, 6 Mar 2019 10:05:02 +0000 Subject: RFR(XS): 8219214: Infinite Loop in CodeSection::dump() In-Reply-To: <16b3b06a-094d-f4df-341b-8655f5040e43@oracle.com> References: <77BFD1E1-02E9-470E-8638-90E4A8A95756@sap.com> <63CA5BCB-8663-43EF-9902-BBC506B76C6B@sap.com> <16b3b06a-094d-f4df-341b-8655f5040e43@oracle.com> Message-ID: <4773A872-8E05-48A7-A9A6-51C9E17B7EEC@sap.com> Vladimir, Tobias, thanks a lot for reviewing. I'll go ahead and push. Regards, Lutz On 06.03.19, 10:18, "Tobias Hartmann" wrote: +1 Best regards, Tobias From dcherepanov at azul.com Wed Mar 6 11:07:40 2019 From: dcherepanov at azul.com (Dmitry Cherepanov) Date: Wed, 6 Mar 2019 11:07:40 +0000 Subject: RFR: 8211100: hotspot C1 issue with comparing long numbers on x86 32-bit In-Reply-To: References: <659DF4FF-71B9-472D-A064-038ADF2A50FF@oracle.com> <0C5ACDFD-EAA1-4EE0-AD1C-845B0B488680@azul.com> Message-ID: Igor, Sorry for the delay in responding. I updated comp_op (in c1_LIRAssembler_x86.cpp) to make use of tmp1 for this case. The changes: http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.03/ For this change, I got assertion failed (from cpu_regnrLo, in c1_LIR.hpp). 
Sorry if this is an obvious question - Do I understand correctly that another part of this solution should be an additional change that would allocate tmp1? Or is there existing code that should already take care of it, so that we just need to enable the allocation of tmp1 for this case? Another question: given that this is a major issue on x86 32-bit systems, would you mind if we proceed with the current minimal/low-risk fix (http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.01/) and create a new JBS issue to investigate a more generic approach separately? Thanks, Dmitry > On Oct 2, 2018, at 8:09 PM, Igor Veresov wrote: > > Right, I forgot how it works. Sorry for the confusion. I think there is no way to explicitly describe a register kill in C1. I guess the only option is to just avoid clobbering opr1. So maybe we should make use of tmp1 for lir_cmp to save/restore opr1? Again, tmp1 would have to be allocated only for this particular case. > > igor > > > >> On Oct 1, 2018, at 7:15 AM, Dmitry Cherepanov wrote: >> >> Hi Igor, >> >> Thanks for the suggestions. I tried to make the opr1 a temporary >> >> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.02/ >> >> but the generated code still has the problem. Looking into the log with -XX:TraceLinearScanLevel=4 (http://cr.openjdk.java.net/~dcherepanov/8211100/TraceLinearScanLevel.02.log), it seems the reason for this is that the opr1 (virtual register R165 in the log) is also an input operand, so its range becomes wider and the shorter ranges (corresponding to the opr1 marked as temp) are merged into a single range. Can the input operand be temporary at the same time? >> >> Dmitry >> >>> On Sep 27, 2018, at 2:18 AM, Igor Veresov wrote: >>> >>> Edit: It may be more consistent to check for is_double_cpu() instead of T_LONG. Although that's semantically equivalent. >>> >>>> On Sep 26, 2018, at 9:35 AM, Igor Veresov wrote: >>>> >>>> It doesn't seem to me like the proper way to fix it. 
The problem is that the cmp is destroying opr1 without telling the register allocator about it. >>>> >>>> One possible solution would be to make opr1 also a temp (see LIR_OpVisitState::visit(LIR_Op* op) in c1_LIR.cpp), only for x86 32bit and only if the operand type is T_LONG. >>>> Another solution is to maintain a temporary register for lir_cmp and use it to save/restore opr1 when emitting the code in LIR_Assembler::comp_op(). Again, the temporary register has to be there only for x86 32bit and T_LONG. >>>> >>>> igor >>>> >>>> >>>>> On Sep 26, 2018, at 1:29 AM, Tobias Hartmann wrote: >>>>> >>>>> Hi Dmitry, >>>>> >>>>> this looks good to me but Igor (who implemented 8201447) should have a look as well. >>>>> >>>>> Best regards, >>>>> Tobias >>>>> >>>>> On 26.09.2018 09:04, Dmitry Cherepanov wrote: >>>>>> Hi Tobias, >>>>>> >>>>>> Thanks for the review, updated patch avoids the additional move on x86_64 and includes the >>>>>> regression test. >>>>>> >>>>>> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.01/ >>>>>> >>>>>> >>>>>> Dmitry >>>>>> >>>>>>> On Sep 25, 2018, at 6:40 PM, Tobias Hartmann >>>>>> > wrote: >>>>>>> >>>>>>> Hi Dmitry, >>>>>>> >>>>>>> Shouldn't this at least be guarded by an #ifndef _LP64 to avoid the additional move on x86_64? >>>>>>> >>>>>>> Could you please add the regression test to the webrev? Or did this reproduce with other tests? >>>>>>> >>>>>>> Thanks, >>>>>>> Tobias >>>>>>> >>>>>>> On 25.09.2018 16:00, Dmitry Cherepanov wrote: >>>>>>>> Hello, >>>>>>>> >>>>>>>> Please review a patch that resolves issue in x86 32bit builds. It slightly adjusts the fix for >>>>>>>> JDK-8201447 (C1 does backedge profiling incorrectly) by creating a copy of the left operand and >>>>>>>> using it for incrementing backedge counter. 
>>>>>>>> >>>>>>>> JBS issue: https://bugs.openjdk.java.net/browse/JDK-8211100 >>>>>>>> webrev: http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.00/ >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Dmitry >>>> >>> >> > From martin.doerr at sap.com Wed Mar 6 11:43:36 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 6 Mar 2019 11:43:36 +0000 Subject: RFR(M): 8219584: Try to dump error file by thread which causes safepoint timeout Message-ID: Hi, my proposal JDK-8219584 is currently being reviewed on hotspot-runtime-dev, but it contains a small test which explicitly uses C2. May I get a review for TestAbortVMOnSafepointTimeout.java, please? Webrev: http://cr.openjdk.java.net/~mdoerr/8219584_kill_thread_on_safepoint_timeout/webrev.02/ Bug with description of the feature: https://bugs.openjdk.java.net/browse/JDK-8219584 The purpose of the method "test_loop" is to loop long enough to hit a safepoint timeout (with configured timeout delay). I compile it directly by C2 with -XX:-UseCountedLoopSafepoints and -XX:LoopStripMiningIter=0. This should force the loop to get compiled without safepoint and 2 billion divisions should definitely take long enough to hit a 500ms safepoint timeout. I've tested it many times on all platforms we have and I've never seen it failing. Is it fine to rely on this? Best regards, Martin -------------- next part -------------- An HTML attachment was scrubbed... URL: From goetz.lindenmaier at sap.com Wed Mar 6 15:16:24 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 6 Mar 2019 15:16:24 +0000 Subject: RFR(XS): 8210651: compiler/ciReplay/TestServerVM.java is failing on windows Message-ID: Hi, The test is failing in our environment as we pass -Dhttp.nonProxyHosts with a list of | separated hosts to the tests. This is needed to make other tests pass. The arguments are passed to a shell, where the '|' is interpreted as a pipe. Escaping the | as for other symbols helps. 
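The escaping described above can be sketched as follows. This is a minimal illustration, not the code from the webrev: the class name ShellEscapeSketch, the helper escapeForSh and the set of escaped characters are assumptions, and the host names are placeholders. It assumes the command line is handed to an sh-style shell, where a backslash makes '|' literal.

```java
final class ShellEscapeSketch {
    // Hypothetical helper (not taken from the webrev): put a backslash in
    // front of characters that an sh-style shell would otherwise treat as
    // command separators or pipes.
    static String escapeForSh(String arg) {
        StringBuilder sb = new StringBuilder(arg.length() + 8);
        for (int i = 0; i < arg.length(); i++) {
            char c = arg.charAt(i);
            if (c == '|' || c == ';' || c == '&') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder value; a real list of hosts would come from the test environment.
        String option = "-Dhttp.nonProxyHosts=localhost|127.0.0.1";
        System.out.println(escapeForSh(option));
    }
}
```

Without the backslash, the shell would split the option at '|' and try to run the second host name as a command.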
http://cr.openjdk.java.net/~goetz/wr19/8219651-ciReplay_win/ Best regards, Goetz. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmitrij.pochepko at bell-sw.com Wed Mar 6 15:17:58 2019 From: dmitrij.pochepko at bell-sw.com (Dmitrij Pochepko) Date: Wed, 6 Mar 2019 18:17:58 +0300 Subject: RFR: 8219366: AARCH64: String inflate intrinsic documentation and maintenance improvement Message-ID: <7d875294-dbb4-1e15-4064-2d801c7b6f8d@bell-sw.com> Hi all, Please review patch for 8219366: AARCH64: String inflate intrinsic documentation and maintenance improvement webrev: http://cr.openjdk.java.net/~dpochepk/8219366/webrev.01/ Patch includes: - documentation - new jtreg test - label names are now uppercase Since no code was changed, only sanity tests were executed (hotspot jtreg compiler/* including new test). No regressions found. I'd like to thank Pengfei Li for help in pre-review. CR: https://bugs.openjdk.java.net/browse/JDK-8219366 Thanks, Dmitrij From tobias.hartmann at oracle.com Wed Mar 6 15:59:01 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 6 Mar 2019 16:59:01 +0100 Subject: RFR(XS): 8210651: compiler/ciReplay/TestServerVM.java is failing on windows In-Reply-To: References: Message-ID: Hi Goetz, this looks good to me. (the bug ID in the subject of this email is wrong). Best regards, Tobias On 06.03.19 16:16, Lindenmaier, Goetz wrote: > Hi, > > ? > > The test is failing in our environment as we pass -Dhttp.nonProxyHosts with a list > > of | separated hosts to the tests. This is needed to make other tests pass. > > The arguments are passed to a shell, where the ?|? is interpreted as a pipe. > > Escaping the | as for other symbols helps. > > http://cr.openjdk.java.net/~goetz/wr19/8219651-ciReplay_win/ > > ? > > Best regards, > > ? Goetz. > > ? 
> From goetz.lindenmaier at sap.com Wed Mar 6 16:01:33 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Wed, 6 Mar 2019 16:01:33 +0000 Subject: RFR(XS): 8219651: compiler/ciReplay/TestServerVM.java is failing on windows Message-ID: Thanks for reviewing! ... Oh, 0->9, but in the change it is correct...luckily. Best regards, Goetz. > -----Original Message----- > From: Tobias Hartmann > Sent: Mittwoch, 6. M?rz 2019 16:59 > To: Lindenmaier, Goetz ; 'hotspot-compiler- > dev at openjdk.java.net' > Subject: Re: RFR(XS): 8210651: compiler/ciReplay/TestServerVM.java is failing > on windows > > Hi Goetz, > > this looks good to me. > > (the bug ID in the subject of this email is wrong). > > Best regards, > Tobias > > On 06.03.19 16:16, Lindenmaier, Goetz wrote: > > Hi, > > > > > > > > The test is failing in our environment as we pass -Dhttp.nonProxyHosts with a > list > > > > of | separated hosts to the tests. This is needed to make other tests pass. > > > > The arguments are passed to a shell, where the '|' is interpreted as a pipe. > > > > Escaping the | as for other symbols helps. > > > > http://cr.openjdk.java.net/~goetz/wr19/8219651-ciReplay_win/ > > > > > > > > Best regards, > > > > ? Goetz. > > > > > > From vladimir.kozlov at oracle.com Wed Mar 6 18:03:04 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 6 Mar 2019 10:03:04 -0800 Subject: [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com> <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <04476179-590e-9315-667c-cc6885477194@oracle.com> Message-ID: <328908ec-14bd-3caf-3e27-78b924696170@oracle.com> Hi Bernard, Can you prepare final patch for review? Changesets are good to see incremental changes but I already lost what whole changes are. 
Also in latest changeset branch prediction code in library_call.cpp is commented. Is this what you want in final changes? Thanks, Vladimir On 3/4/19 1:15 PM, B. Blaser wrote: > On Sat, 2 Mar 2019 at 20:51, Bhateja, Jatin wrote: >> >> Having multiple selection patterns based on node properties is good if we have >> optimized selection patterns with and without properties (in this case reduction) > > Pushed to jdk/submit as third changeset on branch JDK-8217561: > > http://hg.openjdk.java.net/jdk/submit/rev/9aa98249f99c > > I think this is our best solution, could we have a Reviewer feedback > for this (hotspot:tier1 is OK on x86_64 xeon)? > > Thanks, > Bernard > From vladimir.kozlov at oracle.com Wed Mar 6 18:37:49 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 6 Mar 2019 10:37:49 -0800 Subject: RFR(XS): 8219651: compiler/ciReplay/TestServerVM.java is failing on windows In-Reply-To: References: Message-ID: <0b7721ae-1b41-f744-b3da-008240381e85@oracle.com> Fix looks good. I would treat it as trivial fix. Thanks, Vladimir On 3/6/19 8:01 AM, Lindenmaier, Goetz wrote: > Thanks for reviewing! > > ... Oh, 0->9, but in the change it is correct...luckily. > > Best regards, > Goetz. > >> -----Original Message----- >> From: Tobias Hartmann >> Sent: Mittwoch, 6. M?rz 2019 16:59 >> To: Lindenmaier, Goetz ; 'hotspot-compiler- >> dev at openjdk.java.net' >> Subject: Re: RFR(XS): 8210651: compiler/ciReplay/TestServerVM.java is failing >> on windows >> >> Hi Goetz, >> >> this looks good to me. >> >> (the bug ID in the subject of this email is wrong). >> >> Best regards, >> Tobias >> >> On 06.03.19 16:16, Lindenmaier, Goetz wrote: >>> Hi, >>> >>> >>> >>> The test is failing in our environment as we pass -Dhttp.nonProxyHosts with a >> list >>> >>> of | separated hosts to the tests. This is needed to make other tests pass. >>> >>> The arguments are passed to a shell, where the '|' is interpreted as a pipe. >>> >>> Escaping the | as for other symbols helps. 
>>> >>> http://cr.openjdk.java.net/~goetz/wr19/8219651-ciReplay_win/ >>> >>> >>> >>> Best regards, >>> >>> Goetz. >>> >>> >>> From vladimir.kozlov at oracle.com Wed Mar 6 19:14:00 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 6 Mar 2019 11:14:00 -0800 Subject: RFR(M): 8219584: Try to dump error file by thread which causes safepoint timeout In-Reply-To: References: Message-ID: <233153e3-16f6-503b-d8af-7c22838dc34a@oracle.com> Hi Martin, On 3/6/19 3:43 AM, Doerr, Martin wrote: > Hi, > > my proposal JDK-8219584 is currently being reviewed on hotspot-runtime-dev, but it contains a small test which > explicitly uses C2. > > May I get a review for TestAbortVMOnSafepointTimeout.java, please? > > Webrev: > > http://cr.openjdk.java.net/~mdoerr/8219584_kill_thread_on_safepoint_timeout/webrev.02/ > > Bug with description of the feature: > > https://bugs.openjdk.java.net/browse/JDK-8219584 > > The purpose of the method 'test_loop' is to loop long enough to hit a safepoint timeout (with configured timeout delay). > > I compile it directly by C2 with -XX:-UseCountedLoopSafepoints and -XX:LoopStripMiningIter=0. Yes, this combination of flags should work even if one of the flags is set by the testing infrastructure. Also, you correctly use @requires vm.compiler2.enabled so that the test only runs on VM builds where these flags are available. > > This should force the loop to get compiled without safepoint and 2 billion divisions should definitely take long enough > to hit a 500ms safepoint timeout. Compiling with -Xcomp may produce unexpected results. Did you look at the generated code for the test_loop() method? Also use something smaller than Integer.MAX_VALUE for the limit (subtract 100, for example) to simplify the logic for overflow checks. You may also add -XX:LoopUnrollLimit=0 to avoid unrolling and other loop optimizations which you don't need. Check the generated code. > > I've tested it many times on all platforms we have and I've never seen it failing. Did you test on SPARC?
How long does it take to run on it? > > Is it fine to rely on this? Yes, I think so. Vladimir > > Best regards, > > Martin > From bsrbnd at gmail.com Wed Mar 6 19:57:49 2019 From: bsrbnd at gmail.com (B. Blaser) Date: Wed, 6 Mar 2019 20:57:49 +0100 Subject: [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: <328908ec-14bd-3caf-3e27-78b924696170@oracle.com> References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com> <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <04476179-590e-9315-667c-cc6885477194@oracle.com> <328908ec-14bd-3caf-3e27-78b924696170@oracle.com> Message-ID: Hi Vladimir, I'd like to keep branch predictions commented out until method data is collected per call-site because current statistics aren't accurate enough to really improve the following numbers. I tried Math.min(float) with the current patch [1] on both standard and reduction scenarios [2] for negative zero, zero, constant and random arrays (NaN being rather uncommon). I had to make an average between min(a,b) and its mirror min(b,a) for reductions because of the asymmetrical API implementation. To summarize:

              |           pattern           |          array          |
              | blend/min/max | one ucomisd | +/-0.0 | const | random |
--------------|---------------|-------------|--------|-------|--------|
predictable   |    8% gain    |   unused    |  yes   |  yes  |  no    |
unpredictable |   57% gain    |   unused    |  no    |  no   |  yes   |
reduction     |    unused     |  25% gain   |  yes   |  yes  |  yes   |

We see that the suggested fix to use 'ucomisd' for reductions and 'blend/min/max' otherwise is always faster than before. I'll prepare the final webrev based on all JDK-8217561 changesets very soon.
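For readers following the corner cases listed above (negative zero, zero, NaN), here is a minimal sketch of what Math.min(float, float) has to guarantee and of a reduction over an array. It is not the benchmark from [2]; the class name FpMinSketch and its helper methods are made up for illustration.

```java
final class FpMinSketch {
    // Sequential min reduction over an array. Float.POSITIVE_INFINITY is
    // the identity element for min, so an empty array yields +Infinity.
    static float minReduce(float[] values) {
        float result = Float.POSITIVE_INFINITY;
        for (float v : values) {
            result = Math.min(result, v);
        }
        return result;
    }

    // -0.0f == 0.0f is true, so the bit pattern is needed to tell them apart.
    static boolean isNegativeZero(float f) {
        return Float.floatToIntBits(f) == Float.floatToIntBits(-0.0f);
    }

    public static void main(String[] args) {
        // Math.min treats -0.0f as smaller than 0.0f, whatever the argument order.
        System.out.println(isNegativeZero(Math.min(0.0f, -0.0f)));
        System.out.println(isNegativeZero(Math.min(-0.0f, 0.0f)));
        // If either argument is NaN, the result is NaN.
        System.out.println(Float.isNaN(Math.min(Float.NaN, 1.0f)));
        System.out.println(Float.isNaN(Math.min(1.0f, Float.NaN)));
        // Reduction scenario: each result feeds the next call.
        System.out.println(minReduce(new float[] {3.0f, -0.0f, 0.0f, 7.5f}));
    }
}
```

These rules are why a plain compare-and-select is not enough: the signed-zero and NaN cases need extra handling whichever instruction sequence is chosen.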
Thanks, Bernard [1] http://hg.openjdk.java.net/jdk/submit/log?rev=branch%28%22JDK-8217561%22%29 [2] http://hg.openjdk.java.net/jdk/submit/file/ab2b1418f0db/test/micro/org/openjdk/bench/vm/compiler/FpMinMaxIntrinsics.java On Wed, 6 Mar 2019 at 19:03, Vladimir Kozlov wrote: > > Hi Bernard, > > Can you prepare final patch for review? Changesets are good to see incremental changes but I already lost what whole > changes are. > > Also in latest changeset branch prediction code in library_call.cpp is commented. Is this what you want in final changes? > > Thanks, > Vladimir > > On 3/4/19 1:15 PM, B. Blaser wrote: > > On Sat, 2 Mar 2019 at 20:51, Bhateja, Jatin wrote: > >> > >> Having multiple selection patterns based on node properties is good if we have > >> optimized selection patterns with and without properties (in this case reduction) > > > > Pushed to jdk/submit as third changeset on branch JDK-8217561: > > > > http://hg.openjdk.java.net/jdk/submit/rev/9aa98249f99c > > > > I think this is our best solution, could we have a Reviewer feedback > > for this (hotspot:tier1 is OK on x86_64 xeon)? > > > > Thanks, > > Bernard > > From bsrbnd at gmail.com Wed Mar 6 22:25:35 2019 From: bsrbnd at gmail.com (B. Blaser) Date: Wed, 6 Mar 2019 23:25:35 +0100 Subject: [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com> <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <04476179-590e-9315-667c-cc6885477194@oracle.com> <328908ec-14bd-3caf-3e27-78b924696170@oracle.com> Message-ID: Here it is: http://cr.openjdk.java.net/~bsrbnd/jdk8217561/webrev.06/ Any feedback is welcome (jdk/submit report is good), Bernard On Wed, 6 Mar 2019 at 20:57, B. 
Blaser wrote: > > Hi Vladimir, > > I'd like to keep branch predictions commented out until method data is > collected per call-site because current statistics aren't accurate > enough to really improve the following numbers. > > I tried Math.min(float) with the current patch [1] on both standard > and reduction scenarios [2] for negative zero, zero, constant and > random arrays (NaN being rather uncommon). I had to make an average > between min(a,b) and its mirror min(b,a) for reductions because of the > asymmetrical API implementation. > > To summarize: > > | pattern | array | > | blend/min/max | one ucomisd | +/-0.0 | const | random | > --------------|---------------|-------------|--------|-------|--------| > predictable | 8% gain | unused | yes | yes | no | > unpredictable | 57% gain | unused | no | no | yes | > reduction | unused | 25% gain | yes | yes | yes | > > We see that the suggested fix to use 'ucomisd' for reductions and > 'blend/min/max' otherwise is always faster than before. I'll prepare > the final webrev based on all JDK-8217561 changesets very soon. > > Thanks, > Bernard > > [1] http://hg.openjdk.java.net/jdk/submit/log?rev=branch%28%22JDK-8217561%22%29 > [2] http://hg.openjdk.java.net/jdk/submit/file/ab2b1418f0db/test/micro/org/openjdk/bench/vm/compiler/FpMinMaxIntrinsics.java > > > On Wed, 6 Mar 2019 at 19:03, Vladimir Kozlov wrote: > > > > Hi Bernard, > > > > Can you prepare final patch for review? Changesets are good to see incremental changes but I already lost what whole > > changes are. > > > > Also in latest changeset branch prediction code in library_call.cpp is commented. Is this what you want in final changes? > > > > Thanks, > > Vladimir > > > > On 3/4/19 1:15 PM, B. 
Blaser wrote: > > > On Sat, 2 Mar 2019 at 20:51, Bhateja, Jatin wrote: > > >> > > >> Having multiple selection patterns based on node properties is good if we have > > >> optimized selection patterns with and without properties (in this case reduction) > > > > > > Pushed to jdk/submit as third changeset on branch JDK-8217561: > > > > > > http://hg.openjdk.java.net/jdk/submit/rev/9aa98249f99c > > > > > > I think this is our best solution, could we have a Reviewer feedback > > > for this (hotspot:tier1 is OK on x86_64 xeon)? > > > > > > Thanks, > > > Bernard > > > From vladimir.kozlov at oracle.com Thu Mar 7 00:55:22 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 6 Mar 2019 16:55:22 -0800 Subject: [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <04476179-590e-9315-667c-cc6885477194@oracle.com> <328908ec-14bd-3caf-3e27-78b924696170@oracle.com> Message-ID: <98ecb229-04c3-467c-5315-d777c87f9cae@oracle.com> Okay. Lets push this version. Do you need sponsor to push? Thanks, Vladimir On 3/6/19 2:25 PM, B. Blaser wrote: > Here it is: > > http://cr.openjdk.java.net/~bsrbnd/jdk8217561/webrev.06/ > > Any feedback is welcome (jdk/submit report is good), > Bernard > > On Wed, 6 Mar 2019 at 20:57, B. Blaser wrote: >> >> Hi Vladimir, >> >> I'd like to keep branch predictions commented out until method data is >> collected per call-site because current statistics aren't accurate >> enough to really improve the following numbers. >> >> I tried Math.min(float) with the current patch [1] on both standard >> and reduction scenarios [2] for negative zero, zero, constant and >> random arrays (NaN being rather uncommon). I had to make an average >> between min(a,b) and its mirror min(b,a) for reductions because of the >> asymmetrical API implementation. 
>> >> To summarize: >> >> | pattern | array | >> | blend/min/max | one ucomisd | +/-0.0 | const | random | >> --------------|---------------|-------------|--------|-------|--------| >> predictable | 8% gain | unused | yes | yes | no | >> unpredictable | 57% gain | unused | no | no | yes | >> reduction | unused | 25% gain | yes | yes | yes | >> >> We see that the suggested fix to use 'ucomisd' for reductions and >> 'blend/min/max' otherwise is always faster than before. I'll prepare >> the final webrev based on all JDK-8217561 changesets very soon. >> >> Thanks, >> Bernard >> >> [1] http://hg.openjdk.java.net/jdk/submit/log?rev=branch%28%22JDK-8217561%22%29 >> [2] http://hg.openjdk.java.net/jdk/submit/file/ab2b1418f0db/test/micro/org/openjdk/bench/vm/compiler/FpMinMaxIntrinsics.java >> >> >> On Wed, 6 Mar 2019 at 19:03, Vladimir Kozlov wrote: >>> >>> Hi Bernard, >>> >>> Can you prepare final patch for review? Changesets are good to see incremental changes but I already lost what whole >>> changes are. >>> >>> Also in latest changeset branch prediction code in library_call.cpp is commented. Is this what you want in final changes? >>> >>> Thanks, >>> Vladimir >>> >>> On 3/4/19 1:15 PM, B. Blaser wrote: >>>> On Sat, 2 Mar 2019 at 20:51, Bhateja, Jatin wrote: >>>>> >>>>> Having multiple selection patterns based on node properties is good if we have >>>>> optimized selection patterns with and without properties (in this case reduction) >>>> >>>> Pushed to jdk/submit as third changeset on branch JDK-8217561: >>>> >>>> http://hg.openjdk.java.net/jdk/submit/rev/9aa98249f99c >>>> >>>> I think this is our best solution, could we have a Reviewer feedback >>>> for this (hotspot:tier1 is OK on x86_64 xeon)? 
>>>> >>>> Thanks, >>>> Bernard >>>> From vladimir.kozlov at oracle.com Thu Mar 7 06:11:20 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 6 Mar 2019 22:11:20 -0800 Subject: [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: <98ecb229-04c3-467c-5315-d777c87f9cae@oracle.com> References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <04476179-590e-9315-667c-cc6885477194@oracle.com> <328908ec-14bd-3caf-3e27-78b924696170@oracle.com> <98ecb229-04c3-467c-5315-d777c87f9cae@oracle.com> Message-ID: <1099dac8-02f3-ffcb-da23-3036213dd27c@oracle.com> I run tier1-3 testing with these changes. All passed except compiler/intrinsics/math/TestFpMinMaxIntrinsics.java failed on SPARC: java.lang.reflect.InvocationTargetException at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ... Caused by: java.lang.StackOverflowError at compiler.intrinsics.math.TestFpMinMaxIntrinsics$Node.toString(TestFpMinMaxIntrinsics.java:262) at java.base/java.lang.invoke.StringConcatFactory$Stringifiers$ObjectStringifier.valueOf(StringConcatFactory.java:1702) at compiler.intrinsics.math.TestFpMinMaxIntrinsics$Node.toString(TestFpMinMaxIntrinsics.java:264) at java.base/java.lang.invoke.StringConcatFactory$Stringifiers$ObjectStringifier.valueOf(StringConcatFactory.java:1702) at compiler.intrinsics.math.TestFpMinMaxIntrinsics$Node.toString(TestFpMinMaxIntrinsics.java:264) Looks like deep recursion. It failed for last new @run commands with -XX:CompileCommand=dontinline,TestFpMinMaxIntrinsics.min* TestFpMinMaxIntrinsics sortedSearchTree 1 Vladimir On 3/6/19 4:55 PM, Vladimir Kozlov wrote: > Okay. Lets push this version. Do you need sponsor to push? > > Thanks, > Vladimir > > On 3/6/19 2:25 PM, B. 
Blaser wrote: >> Here it is: >> >> http://cr.openjdk.java.net/~bsrbnd/jdk8217561/webrev.06/ >> >> Any feedback is welcome (jdk/submit report is good), >> Bernard >> >> On Wed, 6 Mar 2019 at 20:57, B. Blaser wrote: >>> >>> Hi Vladimir, >>> >>> I'd like to keep branch predictions commented out until method data is >>> collected per call-site because current statistics aren't accurate >>> enough to really improve the following numbers. >>> >>> I tried Math.min(float) with the current patch [1] on both standard >>> and reduction scenarios [2] for negative zero, zero, constant and >>> random arrays (NaN being rather uncommon). I had to make an average >>> between min(a,b) and its mirror min(b,a) for reductions because of the >>> asymmetrical API implementation. >>> >>> To summarize: >>>
>>>               |           pattern           |          array          |
>>>               | blend/min/max | one ucomisd | +/-0.0 | const | random |
>>> --------------|---------------|-------------|--------|-------|--------|
>>> predictable   |    8% gain    |   unused    |  yes   |  yes  |  no    |
>>> unpredictable |   57% gain    |   unused    |  no    |  no   |  yes   |
>>> reduction     |    unused     |  25% gain   |  yes   |  yes  |  yes   |
>>> >>> We see that the suggested fix to use 'ucomisd' for reductions and >>> 'blend/min/max' otherwise is always faster than before. I'll prepare >>> the final webrev based on all JDK-8217561 changesets very soon. >>> >>> Thanks, >>> Bernard >>> >>> [1] http://hg.openjdk.java.net/jdk/submit/log?rev=branch%28%22JDK-8217561%22%29 >>> [2] >>> http://hg.openjdk.java.net/jdk/submit/file/ab2b1418f0db/test/micro/org/openjdk/bench/vm/compiler/FpMinMaxIntrinsics.java >>> >>> >>> On Wed, 6 Mar 2019 at 19:03, Vladimir Kozlov wrote: >>>> >>>> Hi Bernard, >>>> >>>> Can you prepare final patch for review? Changesets are good to see incremental changes but I already lost what whole >>>> changes are.
>>>> >>>> Also in latest changeset branch prediction code in library_call.cpp is commented. Is this what you want in final >>>> changes? >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 3/4/19 1:15 PM, B. Blaser wrote: >>>>> On Sat, 2 Mar 2019 at 20:51, Bhateja, Jatin wrote: >>>>>> >>>>>> Having multiple selection patterns based on node properties is good if we have >>>>>> optimized selection patterns with and without properties (in this case reduction) >>>>> >>>>> Pushed to jdk/submit as third changeset on branch JDK-8217561: >>>>> >>>>> http://hg.openjdk.java.net/jdk/submit/rev/9aa98249f99c >>>>> >>>>> I think this is our best solution, could we have a Reviewer feedback >>>>> for this (hotspot:tier1 is OK on x86_64 xeon)? >>>>> >>>>> Thanks, >>>>> Bernard >>>>> From goetz.lindenmaier at sap.com Thu Mar 7 08:38:33 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 7 Mar 2019 08:38:33 +0000 Subject: RFR(XS): 8219651: compiler/ciReplay/TestServerVM.java is failing on windows In-Reply-To: <0b7721ae-1b41-f744-b3da-008240381e85@oracle.com> References: <0b7721ae-1b41-f744-b3da-008240381e85@oracle.com> Message-ID: Thanks Vladimir! Best regards, Goetz. > -----Original Message----- > From: Vladimir Kozlov > Sent: Mittwoch, 6. M?rz 2019 19:38 > To: Lindenmaier, Goetz ; Tobias Hartmann > ; 'hotspot-compiler-dev at openjdk.java.net' > > Subject: Re: RFR(XS): 8219651: compiler/ciReplay/TestServerVM.java is failing > on windows > > Fix looks good. I would treat it as trivial fix. > > Thanks, > Vladimir > > On 3/6/19 8:01 AM, Lindenmaier, Goetz wrote: > > Thanks for reviewing! > > > > ... Oh, 0->9, but in the change it is correct...luckily. > > > > Best regards, > > Goetz. > > > >> -----Original Message----- > >> From: Tobias Hartmann > >> Sent: Mittwoch, 6. 
M?rz 2019 16:59 > >> To: Lindenmaier, Goetz ; 'hotspot-compiler- > >> dev at openjdk.java.net' > >> Subject: Re: RFR(XS): 8210651: compiler/ciReplay/TestServerVM.java is > failing > >> on windows > >> > >> Hi Goetz, > >> > >> this looks good to me. > >> > >> (the bug ID in the subject of this email is wrong). > >> > >> Best regards, > >> Tobias > >> > >> On 06.03.19 16:16, Lindenmaier, Goetz wrote: > >>> Hi, > >>> > >>> > >>> > >>> The test is failing in our environment as we pass -Dhttp.nonProxyHosts > with a > >> list > >>> > >>> of | separated hosts to the tests. This is needed to make other tests pass. > >>> > >>> The arguments are passed to a shell, where the '|' is interpreted as a pipe. > >>> > >>> Escaping the | as for other symbols helps. > >>> > >>> http://cr.openjdk.java.net/~goetz/wr19/8219651-ciReplay_win/ > >>> > >>> > >>> > >>> Best regards, > >>> > >>> ? Goetz. > >>> > >>> > >>> From Pengfei.Li at arm.com Thu Mar 7 09:26:46 2019 From: Pengfei.Li at arm.com (Pengfei Li (Arm Technology China)) Date: Thu, 7 Mar 2019 09:26:46 +0000 Subject: [aarch64-port-dev ] RFR(S): 8214922: Add vectorization support for fmin/fmax In-Reply-To: References: <87d0pv2iow.fsf@redhat.com> <877eg32bzq.fsf@redhat.com> <871s6a3map.fsf@redhat.com> <87va371n6b.fsf@redhat.com> <40d1a9a7-47f3-4e13-032d-70932b03d215@redhat.com> Message-ID: Hi Andrew Dinn, Please see below updated webrev for the pending patch of fmin/fmax vectorization. The only difference between webrev.03 and webrev.02 is the hard-coded arrangement bits in fmaxv/fminv encodings are replaced. webrev: http://cr.openjdk.java.net/~pli/rfr/8214922/webrev.03/ JBS: https://bugs.openjdk.java.net/browse/JDK-8214922 -- Thanks, Pengfei From bsrbnd at gmail.com Thu Mar 7 11:42:28 2019 From: bsrbnd at gmail.com (B. 
Blaser) Date: Thu, 7 Mar 2019 12:42:28 +0100 Subject: [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: <1099dac8-02f3-ffcb-da23-3036213dd27c@oracle.com> References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <04476179-590e-9315-667c-cc6885477194@oracle.com> <328908ec-14bd-3caf-3e27-78b924696170@oracle.com> <98ecb229-04c3-467c-5315-d777c87f9cae@oracle.com> <1099dac8-02f3-ffcb-da23-3036213dd27c@oracle.com> Message-ID: Thanks for your approval, Vladimir, I'll push it. I had timeouts on jdk/submit with the initial search tree but reducing the loop number made the test pass [1]. It seems you have a stack overflow only when printing the tree at the end of the test on your SPARC system which isn't concerned by this fix. I'll comment out the recursive printing of the tree when pushing, the insertion being iterative. However, if we still have timeout/overflow reports on some systems, I'll comment out the search tree example as I added it only to try a realistic use case. Bernard [1] http://hg.openjdk.java.net/jdk/submit/rev/d164e0b595e6#l2.7 On Thu, 7 Mar 2019 at 07:11, Vladimir Kozlov wrote: > > I run tier1-3 testing with these changes. > All passed except compiler/intrinsics/math/TestFpMinMaxIntrinsics.java failed on SPARC: > > java.lang.reflect.InvocationTargetException > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ... 
> Caused by: java.lang.StackOverflowError > at compiler.intrinsics.math.TestFpMinMaxIntrinsics$Node.toString(TestFpMinMaxIntrinsics.java:262) > at java.base/java.lang.invoke.StringConcatFactory$Stringifiers$ObjectStringifier.valueOf(StringConcatFactory.java:1702) > at compiler.intrinsics.math.TestFpMinMaxIntrinsics$Node.toString(TestFpMinMaxIntrinsics.java:264) > at java.base/java.lang.invoke.StringConcatFactory$Stringifiers$ObjectStringifier.valueOf(StringConcatFactory.java:1702) > at compiler.intrinsics.math.TestFpMinMaxIntrinsics$Node.toString(TestFpMinMaxIntrinsics.java:264) > > Looks like deep recursion. > > It failed for last new @run commands with -XX:CompileCommand=dontinline,TestFpMinMaxIntrinsics.min* > TestFpMinMaxIntrinsics sortedSearchTree 1 > > Vladimir > > On 3/6/19 4:55 PM, Vladimir Kozlov wrote: > > Okay. Lets push this version. Do you need sponsor to push? > > > > Thanks, > > Vladimir > > > > On 3/6/19 2:25 PM, B. Blaser wrote: > >> Here it is: > >> > >> http://cr.openjdk.java.net/~bsrbnd/jdk8217561/webrev.06/ > >> > >> Any feedback is welcome (jdk/submit report is good), > >> Bernard From tobias.hartmann at oracle.com Thu Mar 7 12:54:50 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 7 Mar 2019 13:54:50 +0100 Subject: [13] RFR(S): 8218201: Failures when vmIntrinsics::_getClass is not inlined Message-ID: <8b04ca12-3619-9e44-7f79-e50eeca7f16c@oracle.com> Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8218201 http://cr.openjdk.java.net/~thartmann/8218201/webrev.00/ When intrinsification is disabled, the BCEscapeAnalyzer marks the return value of the (native) method Object::getClass as "return allocated value" which means that "only newly allocated unescaped objects are returned". The OptimizePtrCompare optimization then uses this information to incorrectly fold 'obj.getClass() == Object.class' (see TestGetClass.java:39) to always false. 
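The comparison in question has roughly the following shape. This is a minimal sketch and not the contents of TestGetClass.java; the class name GetClassCompareSketch and the method isPlainObject are made up for illustration.

```java
final class GetClassCompareSketch {
    // Must be true exactly when obj is an instance of java.lang.Object
    // itself. Folding this comparison to a constant false, as described
    // above, gives the wrong answer for `new Object()`.
    static boolean isPlainObject(Object obj) {
        return obj.getClass() == Object.class;
    }

    public static void main(String[] args) {
        System.out.println(isPlainObject(new Object())); // true
        System.out.println(isPlainObject("text"));       // false
    }
}
```

Object::getClass returns the existing Class object for the receiver's class, not a freshly allocated object, so its result can be identical to a reference obtained elsewhere, such as Object.class.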
This is a very old issue and I can't trace back why a special case for the _getClass intrinsic has been added to the BCEscapeAnalyzer. Since I don't think we should make any assumptions about the returned Object, I've removed the special case. Thanks, Tobias From nils.eliasson at oracle.com Thu Mar 7 13:17:23 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Thu, 7 Mar 2019 14:17:23 +0100 Subject: RFR(S): 8219448: split-if update_uses accesses stale idom data In-Reply-To: References: <9188a8bc-fe5e-d51e-092e-48e947265679@oracle.com> Message-ID: <9226c09c-f166-daf7-ab07-457d39a21974@oracle.com> Updated webrev: http://cr.openjdk.java.net/~neliasso/8219448/webrev.02 Regards, Nils On 2019-02-28 22:55, Vladimir Kozlov wrote: > Got it. In such case you need to replace lazy_replace() at line 575 > with remove_dead_node() and you don't need assert outcnt() == 0 then. > > Thanks, > Vladimir > > On 2/28/19 11:48 AM, Nils Eliasson wrote: >> Hi, >> >> I just moved it up from line 570. >> >> http://hg.openjdk.java.net/jdk/jdk/file/196ab0abc685/src/hotspot/share/opto/split_if.cpp#l570 >> >> >> And on 575 we will call the replace on it, to finally kill it. >> >> http://hg.openjdk.java.net/jdk/jdk/file/196ab0abc685/src/hotspot/share/opto/split_if.cpp#l575 >> >> >> So the node is dead, we just must handle the phi-uses first, and >> while doing that, a correct idom is required. >> >> The code that triggers this bug has a diamond-shaped control flow >> below the split-if-region. The region-nodes that need their idom >> corrected is far down and isn't touched during the split. But there >> is a call down there. It has one of its data edges defined by a phi >> hanging on the split-region. So when we try to call spinup on it, it >> will traverse a broken idom chain. >> >> The conclusion of my investigation is that all regions that have the >> split-region as its idom, must be updated (even if they are below). 
>> For that we have the lazy_update mechanism, and to make it trigger, I >> must mark the region as killed slightly earlier. >> >> Regards, >> >> Nils >> >> >> >> >> >> On 2019-02-28 20:00, Vladimir Kozlov wrote: >>> Hi Nils, >>> >>> You are updating map so that next code in idom_no_update() works for >>> you: >>> http://hg.openjdk.java.net/jdk/jdk/file/196ab0abc685/src/hotspot/share/opto/loopnode.hpp#l928 >>> >>> Which seems a hack to me. I think we should fix spinup() method to >>> skip old region when looking for idom (I assume that is where you >>> have the problem). >>> >>> Thanks, >>> Vladimir >>> >>> On 2/28/19 5:18 AM, Nils Eliasson wrote: >>>> Hi, >>>> >>>> This patch fixes some of the idom updates in split-if. The updates >>>> are there, but at the end, which is to late. They need to be >>>> correct when handling the uses of the region being split. When the >>>> stale idom is seen we end up asserting or crashing. >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8219448 >>>> >>>> http://cr.openjdk.java.net/~neliasso/8219448/webrev.01/ >>>> >>>> Please review, >>>> >>>> Nils Eliasson >>>> From claes.redestad at oracle.com Thu Mar 7 13:23:22 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Thu, 7 Mar 2019 14:23:22 +0100 Subject: [13] RFR(S): 8218201: Failures when vmIntrinsics::_getClass is not inlined In-Reply-To: <8b04ca12-3619-9e44-7f79-e50eeca7f16c@oracle.com> References: <8b04ca12-3619-9e44-7f79-e50eeca7f16c@oracle.com> Message-ID: <421a501d-35eb-febe-5c05-e073ee6c585f@oracle.com> Looks good, and nice touch cleaning up the unused return values! 
/Claes On 2019-03-07 13:54, Tobias Hartmann wrote: > Hi, > > please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8218201 > http://cr.openjdk.java.net/~thartmann/8218201/webrev.00/ > > When intrinsification is disabled, the BCEscapeAnalyzer marks the return value of the (native) > method Object::getClass as "return allocated value" which means that "only newly allocated unescaped > objects are returned". The OptimizePtrCompare optimization then uses this information to incorrectly > fold 'obj.getClass() == Object.class' (see TestGetClass.java:39) to always false. > > This is a very old issue and I can't trace back why a special case for the _getClass intrinsic has > been added to the BCEscapeAnalyzer. Since I don't think we should make any assumptions about the > returned Object, I've removed the special case. > > Thanks, > Tobias > From tobias.hartmann at oracle.com Thu Mar 7 13:23:57 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 7 Mar 2019 14:23:57 +0100 Subject: [13] RFR(S): 8218201: Failures when vmIntrinsics::_getClass is not inlined In-Reply-To: <421a501d-35eb-febe-5c05-e073ee6c585f@oracle.com> References: <8b04ca12-3619-9e44-7f79-e50eeca7f16c@oracle.com> <421a501d-35eb-febe-5c05-e073ee6c585f@oracle.com> Message-ID: <2a2ad213-b67d-2708-cb1f-b3ed58d8eb2e@oracle.com> Thanks Claes! Best regards, Tobias On 07.03.19 14:23, Claes Redestad wrote: > Looks good, and nice touch cleaning up the unused return values! > > /Claes > > On 2019-03-07 13:54, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8218201 >> http://cr.openjdk.java.net/~thartmann/8218201/webrev.00/ >> >> When intrinsification is disabled, the BCEscapeAnalyzer marks the return value of the (native) >> method Object::getClass as "return allocated value" which means that "only newly allocated unescaped >> objects are returned". 
The OptimizePtrCompare optimization then uses this information to incorrectly >> fold 'obj.getClass() == Object.class' (see TestGetClass.java:39) to always false. >> >> This is a very old issue and I can't trace back why a special case for the _getClass intrinsic has >> been added to the BCEscapeAnalyzer. Since I don't think we should make any assumptions about the >> returned Object, I've removed the special case. >> >> Thanks, >> Tobias >> From tobias.hartmann at oracle.com Thu Mar 7 13:29:37 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 7 Mar 2019 14:29:37 +0100 Subject: RFR(S): 8219448: split-if update_uses accesses stale idom data In-Reply-To: <9226c09c-f166-daf7-ab07-457d39a21974@oracle.com> References: <9188a8bc-fe5e-d51e-092e-48e947265679@oracle.com> <9226c09c-f166-daf7-ab07-457d39a21974@oracle.com> Message-ID: <88e902b8-2e15-1812-3ae3-e5de2921ef13@oracle.com> Hi Nils, looks good to me. Best regards, Tobias On 07.03.19 14:17, Nils Eliasson wrote: > Updated webrev: > > http://cr.openjdk.java.net/~neliasso/8219448/webrev.02 > > Regards, > > Nils > > On 2019-02-28 22:55, Vladimir Kozlov wrote: >> Got it. In such case you need to replace lazy_replace() at line 575 with remove_dead_node() and >> you don't need assert outcnt() == 0 then. >> >> Thanks, >> Vladimir >> >> On 2/28/19 11:48 AM, Nils Eliasson wrote: >>> Hi, >>> >>> I just moved it up from line 570. >>> >>> http://hg.openjdk.java.net/jdk/jdk/file/196ab0abc685/src/hotspot/share/opto/split_if.cpp#l570 >>> >>> And on 575 we will call the replace on it, to finally kill it. >>> >>> http://hg.openjdk.java.net/jdk/jdk/file/196ab0abc685/src/hotspot/share/opto/split_if.cpp#l575 >>> >>> So the node is dead, we just must handle the phi-uses first, and while doing that, a correct idom >>> is required. >>> >>> The code that triggers this bug has a diamond-shaped control flow below the split-if-region. 
The >>> region-nodes that need their idom corrected is far down and isn't touched during the split. But >>> there is a call down there. It has one of its data edges defined by a phi hanging on the >>> split-region. So when we try to call spinup on it, it will traverse a broken idom chain. >>> >>> The conclusion of my investigation is that all regions that have the split-region as its idom, >>> must be updated (even if they are below). For that we have the lazy_update mechanism, and to make >>> it trigger, I must mark the region as killed slightly earlier. >>> >>> Regards, >>> >>> Nils >>> >>> >>> >>> >>> >>> On 2019-02-28 20:00, Vladimir Kozlov wrote: >>>> Hi Nils, >>>> >>>> You are updating map so that next code in idom_no_update() works for you: >>>> http://hg.openjdk.java.net/jdk/jdk/file/196ab0abc685/src/hotspot/share/opto/loopnode.hpp#l928 >>>> Which seems a hack to me. I think we should fix spinup() method to skip old region when looking >>>> for idom (I assume that is where you have the problem). >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 2/28/19 5:18 AM, Nils Eliasson wrote: >>>>> Hi, >>>>> >>>>> This patch fixes some of the idom updates in split-if. The updates are there, but at the end, >>>>> which is to late. They need to be correct when handling the uses of the region being split. >>>>> When the stale idom is seen we end up asserting or crashing. 
>>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8219448 >>>>> >>>>> http://cr.openjdk.java.net/~neliasso/8219448/webrev.01/ >>>>> >>>>> Please review, >>>>> >>>>> Nils Eliasson >>>>> From nils.eliasson at oracle.com Thu Mar 7 14:29:54 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Thu, 7 Mar 2019 15:29:54 +0100 Subject: RFR(S): 8219448: split-if update_uses accesses stale idom data In-Reply-To: <88e902b8-2e15-1812-3ae3-e5de2921ef13@oracle.com> References: <9188a8bc-fe5e-d51e-092e-48e947265679@oracle.com> <9226c09c-f166-daf7-ab07-457d39a21974@oracle.com> <88e902b8-2e15-1812-3ae3-e5de2921ef13@oracle.com> Message-ID: <527b713f-1276-2f80-c58a-b396fc03caec@oracle.com> Thank you, Tobias! // Nils On 2019-03-07 14:29, Tobias Hartmann wrote: > Hi Nils, > > looks good to me. > > Best regards, > Tobias > > On 07.03.19 14:17, Nils Eliasson wrote: >> Updated webrev: >> >> http://cr.openjdk.java.net/~neliasso/8219448/webrev.02 >> >> Regards, >> >> Nils >> >> On 2019-02-28 22:55, Vladimir Kozlov wrote: >>> Got it. In such case you need to replace lazy_replace() at line 575 with remove_dead_node() and >>> you don't need assert outcnt() == 0 then. >>> >>> Thanks, >>> Vladimir >>> >>> On 2/28/19 11:48 AM, Nils Eliasson wrote: >>>> Hi, >>>> >>>> I just moved it up from line 570. >>>> >>>> http://hg.openjdk.java.net/jdk/jdk/file/196ab0abc685/src/hotspot/share/opto/split_if.cpp#l570 >>>> >>>> And on 575 we will call the replace on it, to finally kill it. >>>> >>>> http://hg.openjdk.java.net/jdk/jdk/file/196ab0abc685/src/hotspot/share/opto/split_if.cpp#l575 >>>> >>>> So the node is dead, we just must handle the phi-uses first, and while doing that, a correct idom >>>> is required. >>>> >>>> The code that triggers this bug has a diamond-shaped control flow below the split-if-region. The >>>> region-nodes that need their idom corrected is far down and isn't touched during the split. But >>>> there is a call down there. 
It has one of its data edges defined by a phi hanging on the >>>> split-region. So when we try to call spinup on it, it will traverse a broken idom chain. >>>> >>>> The conclusion of my investigation is that all regions that have the split-region as its idom, >>>> must be updated (even if they are below). For that we have the lazy_update mechanism, and to make >>>> it trigger, I must mark the region as killed slightly earlier. >>>> >>>> Regards, >>>> >>>> Nils >>>> >>>> >>>> >>>> >>>> >>>> On 2019-02-28 20:00, Vladimir Kozlov wrote: >>>>> Hi Nils, >>>>> >>>>> You are updating map so that next code in idom_no_update() works for you: >>>>> http://hg.openjdk.java.net/jdk/jdk/file/196ab0abc685/src/hotspot/share/opto/loopnode.hpp#l928 >>>>> Which seems a hack to me. I think we should fix spinup() method to skip old region when looking >>>>> for idom (I assume that is where you have the problem). >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 2/28/19 5:18 AM, Nils Eliasson wrote: >>>>>> Hi, >>>>>> >>>>>> This patch fixes some of the idom updates in split-if. The updates are there, but at the end, >>>>>> which is to late. They need to be correct when handling the uses of the region being split. >>>>>> When the stale idom is seen we end up asserting or crashing. 
>>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8219448 >>>>>> >>>>>> http://cr.openjdk.java.net/~neliasso/8219448/webrev.01/ >>>>>> >>>>>> Please review, >>>>>> >>>>>> Nils Eliasson >>>>>> From nils.eliasson at oracle.com Thu Mar 7 14:32:08 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Thu, 7 Mar 2019 15:32:08 +0100 Subject: [13] RFR(S): 8218201: Failures when vmIntrinsics::_getClass is not inlined In-Reply-To: <421a501d-35eb-febe-5c05-e073ee6c585f@oracle.com> References: <8b04ca12-3619-9e44-7f79-e50eeca7f16c@oracle.com> <421a501d-35eb-febe-5c05-e073ee6c585f@oracle.com> Message-ID: +1 // Nils On 2019-03-07 14:23, Claes Redestad wrote: > Looks good, and nice touch cleaning up the unused return values! > > /Claes > > On 2019-03-07 13:54, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8218201 >> http://cr.openjdk.java.net/~thartmann/8218201/webrev.00/ >> >> When intrinsification is disabled, the BCEscapeAnalyzer marks the >> return value of the (native) >> method Object::getClass as "return allocated value" which means that >> "only newly allocated unescaped >> objects are returned". The OptimizePtrCompare optimization then uses >> this information to incorrectly >> fold 'obj.getClass() == Object.class' (see TestGetClass.java:39) to >> always false. >> >> This is a very old issue and I can't trace back why a special case >> for the _getClass intrinsic has >> been added to the BCEscapeAnalyzer. Since I don't think we should >> make any assumptions about the >> returned Object, I've removed the special case. 
>> >> Thanks, >> Tobias >> From tobias.hartmann at oracle.com Thu Mar 7 15:03:20 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 7 Mar 2019 16:03:20 +0100 Subject: [13] RFR(S): 8218201: Failures when vmIntrinsics::_getClass is not inlined In-Reply-To: References: <8b04ca12-3619-9e44-7f79-e50eeca7f16c@oracle.com> <421a501d-35eb-febe-5c05-e073ee6c585f@oracle.com> Message-ID: <976a3bae-6fba-65b7-7819-7da7d02e1305@oracle.com> Thanks Nils! Best regards, Tobias On 07.03.19 15:32, Nils Eliasson wrote: > +1 > > // Nils > > On 2019-03-07 14:23, Claes Redestad wrote: >> Looks good, and nice touch cleaning up the unused return values! >> >> /Claes >> >> On 2019-03-07 13:54, Tobias Hartmann wrote: >>> Hi, >>> >>> please review the following patch: >>> https://bugs.openjdk.java.net/browse/JDK-8218201 >>> http://cr.openjdk.java.net/~thartmann/8218201/webrev.00/ >>> >>> When intrinsification is disabled, the BCEscapeAnalyzer marks the return value of the (native) >>> method Object::getClass as "return allocated value" which means that "only newly allocated unescaped >>> objects are returned". The OptimizePtrCompare optimization then uses this information to incorrectly >>> fold 'obj.getClass() == Object.class' (see TestGetClass.java:39) to always false. >>> >>> This is a very old issue and I can't trace back why a special case for the _getClass intrinsic has >>> been added to the BCEscapeAnalyzer. Since I don't think we should make any assumptions about the >>> returned Object, I've removed the special case. >>> >>> Thanks, >>> Tobias >>> From bsrbnd at gmail.com Thu Mar 7 15:21:21 2019 From: bsrbnd at gmail.com (B. 
Blaser) Date: Thu, 7 Mar 2019 16:21:21 +0100 Subject: [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <04476179-590e-9315-667c-cc6885477194@oracle.com> <328908ec-14bd-3caf-3e27-78b924696170@oracle.com> <98ecb229-04c3-467c-5315-d777c87f9cae@oracle.com> <1099dac8-02f3-ffcb-da23-3036213dd27c@oracle.com> Message-ID: Pushed: http://hg.openjdk.java.net/jdk/jdk/rev/ff399127078a Search tree printing commented here: http://hg.openjdk.java.net/jdk/jdk/rev/ff399127078a#l8.221 Thanks to Jatin Bhateja for his contribution! Bernard On Thu, 7 Mar 2019 at 12:42, B. Blaser wrote: > > Thanks for your approval, Vladimir, I'll push it. > > I had timeouts on jdk/submit with the initial search tree but reducing > the loop number made the test pass [1]. > It seems you have a stack overflow only when printing the tree at the > end of the test on your SPARC system which isn't concerned by this > fix. > I'll comment out the recursive printing of the tree when pushing, the > insertion being iterative. > However, if we still have timeout/overflow reports on some systems, > I'll comment out the search tree example as I added it only to try a > realistic use case. > > Bernard > > [1] http://hg.openjdk.java.net/jdk/submit/rev/d164e0b595e6#l2.7 > > > On Thu, 7 Mar 2019 at 07:11, Vladimir Kozlov wrote: > > > > I run tier1-3 testing with these changes. > > All passed except compiler/intrinsics/math/TestFpMinMaxIntrinsics.java failed on SPARC: > > > > java.lang.reflect.InvocationTargetException > > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > > ... 
> > Caused by: java.lang.StackOverflowError > > at compiler.intrinsics.math.TestFpMinMaxIntrinsics$Node.toString(TestFpMinMaxIntrinsics.java:262) > > at java.base/java.lang.invoke.StringConcatFactory$Stringifiers$ObjectStringifier.valueOf(StringConcatFactory.java:1702) > > at compiler.intrinsics.math.TestFpMinMaxIntrinsics$Node.toString(TestFpMinMaxIntrinsics.java:264) > > at java.base/java.lang.invoke.StringConcatFactory$Stringifiers$ObjectStringifier.valueOf(StringConcatFactory.java:1702) > > at compiler.intrinsics.math.TestFpMinMaxIntrinsics$Node.toString(TestFpMinMaxIntrinsics.java:264) > > > > Looks like deep recursion. > > > > It failed for last new @run commands with -XX:CompileCommand=dontinline,TestFpMinMaxIntrinsics.min* > > TestFpMinMaxIntrinsics sortedSearchTree 1 > > > > Vladimir > > > > On 3/6/19 4:55 PM, Vladimir Kozlov wrote: > > > Okay. Lets push this version. Do you need sponsor to push? > > > > > > Thanks, > > > Vladimir > > > > > > On 3/6/19 2:25 PM, B. Blaser wrote: > > >> Here it is: > > >> > > >> http://cr.openjdk.java.net/~bsrbnd/jdk8217561/webrev.06/ > > >> > > >> Any feedback is welcome (jdk/submit report is good), > > >> Bernard From martin.doerr at sap.com Thu Mar 7 15:38:38 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 7 Mar 2019 15:38:38 +0000 Subject: RFR(M): 8219584: Try to dump error file by thread which causes safepoint timeout In-Reply-To: <233153e3-16f6-503b-d8af-7c22838dc34a@oracle.com> References: <233153e3-16f6-503b-d8af-7c22838dc34a@oracle.com> Message-ID: Hi Vladimir, thank you for reviewing it so quickly and for your good points. > Compiling with -Xcomp may produce unexpected result. In general, yes, but I didn't see any uncommon traps in this simple method. Were you concerned about that? I'm using -Xcomp + compileonly in order to get the test_loop method precompiled. (Also avoids OnStackReplacement.) > Did you look on generated code for test_loop() method? Yes, looks good. 
Also see OptoAssembly from SPARC below. > Also use something smaller then Integer.MAX_VALUE for limit (subtract -100 for example) to simplify logic for overflow > checks. I've tried, but there was no change in OptoAssembly other than the constant value. I've looked at the loopnode.cpp overflow checking logic and it appears to be implemented nicely and can detect that stride = 1 and limit "< max_jint" will not overflow. Do you agree? > You may also add -XX:LoopUnrollLimit=0 to avoid unrolling and other loop optimizations which you don't need. Check > generated code. Good idea. Makes the OptoAssembly better readably. Added in place. Thanks. > Did you tested on SPARC? Of course. Works great on this platform. Also see OptoAssembly below. > How long it takes to run on it? Took less than a minute. Hmm... seems like the jtreg stuff takes most of the time. The test itself shouldn't run much longer than 1 second (500ms GuaranteedSafepointInterval + 500ms SafepointTimeoutDelay) until the Java Thread gets killed. Can I add you as reviewer? Thanks and best regards, Martin OptoAssembly SPARC 000 B1: # B5 B2 <- BLOCK HEAD IS JUNK Freq: 1 000 ! stack bang (144 bytes) SAVE R_SP,-144,R_SP 014 MOV R_I0,R_L3 ! spill 018 + MOV #0,R_I0 01c CWBeq R_L3,#0,B5 ! int P=0.100000 C=-1.000000 01c 020 B2: # B3 <- B1 Freq: 0.9 020 + SET #2147483647,R_L0 028 + MOV #1,R_L2 028 02c B3: # B6 B4 <- B2 B4 Loop: B3-B4 inner Freq: 9 02c SREM R_L2,R_L3,R_L1 040 + CWBeq R_L1,#0,B6 ! int P=0.100000 C=-1.000000 040 044 B4: # B3 B5 <- B6 B3 Freq: 9 044 + ADD R_L2,#1,R_L2 048 CWBlt R_L2,R_L0,B3 ! Loop end P=0.900000 C=-1.000000 048 04c B5: # N1 <- B4 B1 Freq: 1 04c LDX [R_G2 + #poll_offset],L0 ! Load local polling address LDX [L0],G0 !Poll for Safepointing RET RESTORE 05c + ! return 05c 05c B6: # B4 <- B3 Freq: 0.9 05c + ADD R_I0,#1,R_I0 060 BA B4 ! short branch -----Original Message----- From: Vladimir Kozlov Sent: Mittwoch, 6. 
März 2019 20:14 To: Doerr, Martin ; 'hotspot-compiler-dev at openjdk.java.net' Cc: David Holmes (david.holmes at oracle.com) Subject: Re: RFR(M): 8219584: Try to dump error file by thread which causes safepoint timeout Hi Martin, On 3/6/19 3:43 AM, Doerr, Martin wrote: > Hi, > > my proposal JDK-8219584 is currently being reviewed on hotspot-runtime-dev, but it contains a small test which > explicitly uses C2. > > May I get a review for TestAbortVMOnSafepointTimeout.java, please? > > Webrev: > > http://cr.openjdk.java.net/~mdoerr/8219584_kill_thread_on_safepoint_timeout/webrev.02/ > > Bug with description of the feature: > > https://bugs.openjdk.java.net/browse/JDK-8219584 > > The purpose of the method "test_loop" is to loop long enough to hit a safepoint timeout (with configured timeout delay). > > I compile it directly by C2 with -XX:-UseCountedLoopSafepoints and -XX:LoopStripMiningIter=0. Yes, these flags combination should work even if one of this flag is set by testing infra. Also you correctly use @requires vm.compiler2.enabled to run VM build where these flag are available. > > This should force the loop to get compiled without safepoint and 2 billion divisions should definitely take long enough > to hit a 500ms safepoint timeout. Compiling with -Xcomp may produce unexpected result. Did you look on generated code for test_loop() method? Also use something smaller then Integer.MAX_VALUE for limit (subtract -100 for example) to simplify logic for overflow checks. You may also add -XX:LoopUnrollLimit=0 to avoid unrolling and other loop optimizations which you don't need. Check generated code. > > I've tested it many times on all platforms we have and I've never seen it failing. Did you tested on SPARC? How long it takes to run on it? > > Is it fine to rely on this? Yes, I think so. 
Vladimir > > Best regards, > > Martin > From vladimir.kozlov at oracle.com Thu Mar 7 17:18:26 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 7 Mar 2019 09:18:26 -0800 Subject: [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <04476179-590e-9315-667c-cc6885477194@oracle.com> <328908ec-14bd-3caf-3e27-78b924696170@oracle.com> <98ecb229-04c3-467c-5315-d777c87f9cae@oracle.com> <1099dac8-02f3-ffcb-da23-3036213dd27c@oracle.com> Message-ID: <23423aa4-aa9d-6f1c-01d3-59532730f975@oracle.com> Thank you, Bernard I looked on our regular testing results after your push and there are no failures I observed before. Vladimir On 3/7/19 7:21 AM, B. Blaser wrote: > Pushed: > http://hg.openjdk.java.net/jdk/jdk/rev/ff399127078a > > Search tree printing commented here: > http://hg.openjdk.java.net/jdk/jdk/rev/ff399127078a#l8.221 > > Thanks to Jatin Bhateja for his contribution! > Bernard > > On Thu, 7 Mar 2019 at 12:42, B. Blaser wrote: >> >> Thanks for your approval, Vladimir, I'll push it. >> >> I had timeouts on jdk/submit with the initial search tree but reducing >> the loop number made the test pass [1]. >> It seems you have a stack overflow only when printing the tree at the >> end of the test on your SPARC system which isn't concerned by this >> fix. >> I'll comment out the recursive printing of the tree when pushing, the >> insertion being iterative. >> However, if we still have timeout/overflow reports on some systems, >> I'll comment out the search tree example as I added it only to try a >> realistic use case. >> >> Bernard >> >> [1] http://hg.openjdk.java.net/jdk/submit/rev/d164e0b595e6#l2.7 >> >> >> On Thu, 7 Mar 2019 at 07:11, Vladimir Kozlov wrote: >>> >>> I run tier1-3 testing with these changes. 
>>> All passed except compiler/intrinsics/math/TestFpMinMaxIntrinsics.java failed on SPARC: >>> >>> java.lang.reflect.InvocationTargetException >>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >>> ... >>> Caused by: java.lang.StackOverflowError >>> at compiler.intrinsics.math.TestFpMinMaxIntrinsics$Node.toString(TestFpMinMaxIntrinsics.java:262) >>> at java.base/java.lang.invoke.StringConcatFactory$Stringifiers$ObjectStringifier.valueOf(StringConcatFactory.java:1702) >>> at compiler.intrinsics.math.TestFpMinMaxIntrinsics$Node.toString(TestFpMinMaxIntrinsics.java:264) >>> at java.base/java.lang.invoke.StringConcatFactory$Stringifiers$ObjectStringifier.valueOf(StringConcatFactory.java:1702) >>> at compiler.intrinsics.math.TestFpMinMaxIntrinsics$Node.toString(TestFpMinMaxIntrinsics.java:264) >>> >>> Looks like deep recursion. >>> >>> It failed for last new @run commands with -XX:CompileCommand=dontinline,TestFpMinMaxIntrinsics.min* >>> TestFpMinMaxIntrinsics sortedSearchTree 1 >>> >>> Vladimir >>> >>> On 3/6/19 4:55 PM, Vladimir Kozlov wrote: >>>> Okay. Lets push this version. Do you need sponsor to push? >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 3/6/19 2:25 PM, B. 
Blaser wrote: >>>>> Here it is: >>>>> >>>>> http://cr.openjdk.java.net/~bsrbnd/jdk8217561/webrev.06/ >>>>> >>>>> Any feedback is welcome (jdk/submit report is good), >>>>> Bernard From vladimir.kozlov at oracle.com Thu Mar 7 17:50:13 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 7 Mar 2019 09:50:13 -0800 Subject: RFR(M): 8219584: Try to dump error file by thread which causes safepoint timeout In-Reply-To: References: <233153e3-16f6-503b-d8af-7c22838dc34a@oracle.com> Message-ID: <157080d0-45a4-7bb5-6fe5-ddab0c9a3a32@oracle.com> On 3/7/19 7:38 AM, Doerr, Martin wrote: > Hi Vladimir, > > thank you for reviewing it so quickly and for your good points. > >> Compiling with -Xcomp may produce unexpected result. > In general, yes, but I didn't see any uncommon traps in this simple method. > Were you concerned about that? Yes, uncommon traps was my concern. That is why I asked if you looked on generated code. > > I'm using -Xcomp + compileonly in order to get the test_loop method precompiled. > (Also avoids OnStackReplacement.) > >> Did you look on generated code for test_loop() method? > Yes, looks good. Also see OptoAssembly from SPARC below. Yes, it looks good! > >> Also use something smaller then Integer.MAX_VALUE for limit (subtract -100 for example) to simplify logic for overflow >> checks. > I've tried, but there was no change in OptoAssembly other than the constant value. > I've looked at the loopnode.cpp overflow checking logic and it appears to be implemented nicely and can detect that stride = 1 and limit "< max_jint" will not overflow. > Do you agree? I agree since generated code look good. > >> You may also add -XX:LoopUnrollLimit=0 to avoid unrolling and other loop optimizations which you don't need. Check >> generated code. > Good idea. Makes the OptoAssembly better readably. Added in place. Thanks. > >> Did you tested on SPARC? > Of course. Works great on this platform. Also see OptoAssembly below. > >> How long it takes to run on it? 
> Took less than a minute. Hmm... seems like the jtreg stuff takes most of the time. > The test itself shouldn't run much longer than 1 second (500ms GuaranteedSafepointInterval + 500ms SafepointTimeoutDelay) until the Java Thread gets killed. I was concern that you may get a timeout due to long loop if it is not interrupted. But since you set times to 0.5 sec it could happens only if we have a bug. So it seems good. > > Can I add you as reviewer? Yes. I looked on changes and they are good Thanks, Vladimir > > Thanks and best regards, > Martin > > > OptoAssembly SPARC > > 000 B1: # B5 B2 <- BLOCK HEAD IS JUNK Freq: 1 > 000 ! stack bang (144 bytes) > SAVE R_SP,-144,R_SP > 014 > MOV R_I0,R_L3 ! spill > 018 + MOV #0,R_I0 > 01c CWBeq R_L3,#0,B5 ! int P=0.100000 C=-1.000000 > 01c > 020 B2: # B3 <- B1 Freq: 0.9 > 020 + SET #2147483647,R_L0 > 028 + MOV #1,R_L2 > 028 > 02c B3: # B6 B4 <- B2 B4 Loop: B3-B4 inner Freq: 9 > 02c SREM R_L2,R_L3,R_L1 > 040 + CWBeq R_L1,#0,B6 ! int P=0.100000 C=-1.000000 > 040 > 044 B4: # B3 B5 <- B6 B3 Freq: 9 > 044 + ADD R_L2,#1,R_L2 > 048 CWBlt R_L2,R_L0,B3 ! Loop end P=0.900000 C=-1.000000 > 048 > 04c B5: # N1 <- B4 B1 Freq: 1 > 04c LDX [R_G2 + #poll_offset],L0 ! Load local polling address > LDX [L0],G0 !Poll for Safepointing > RET > RESTORE > 05c + ! return > 05c > 05c B6: # B4 <- B3 Freq: 0.9 > 05c + ADD R_I0,#1,R_I0 > 060 BA B4 ! short branch > > > > -----Original Message----- > From: Vladimir Kozlov > Sent: Mittwoch, 6. März 2019 20:14 > To: Doerr, Martin ; 'hotspot-compiler-dev at openjdk.java.net' > Cc: David Holmes (david.holmes at oracle.com) > Subject: Re: RFR(M): 8219584: Try to dump error file by thread which causes safepoint timeout > > Hi Martin, > > On 3/6/19 3:43 AM, Doerr, Martin wrote: >> Hi, >> >> my proposal JDK-8219584 is currently being reviewed on hotspot-runtime-dev, but it contains a small test which >> explicitly uses C2. >> >> May I get a review for TestAbortVMOnSafepointTimeout.java, please? 
>> >> Webrev: >> >> http://cr.openjdk.java.net/~mdoerr/8219584_kill_thread_on_safepoint_timeout/webrev.02/ >> >> Bug with description of the feature: >> >> https://bugs.openjdk.java.net/browse/JDK-8219584 >> >> The purpose of the method "test_loop" is to loop long enough to hit a safepoint timeout (with configured timeout delay). >> >> I compile it directly by C2 with -XX:-UseCountedLoopSafepoints and -XX:LoopStripMiningIter=0. > > Yes, these flags combination should work even if one of this flag is set by testing infra. > Also you correctly use @requires vm.compiler2.enabled to run VM build where these flag are available. > >> >> This should force the loop to get compiled without safepoint and 2 billion divisions should definitely take long enough >> to hit a 500ms safepoint timeout. > > Compiling with -Xcomp may produce unexpected result. Did you look on generated code for test_loop() method? > Also use something smaller then Integer.MAX_VALUE for limit (subtract -100 for example) to simplify logic for overflow > checks. > You may also add -XX:LoopUnrollLimit=0 to avoid unrolling and other loop optimizations which you don't need. Check > generated code. > >> >> I've tested it many times on all platforms we have and I've never seen it failing. > > Did you tested on SPARC? How long it takes to run on it? > >> >> Is it fine to rely on this? > > Yes, I think so. > > Vladimir > >> >> Best regards, >> >> Martin >> From martin.doerr at sap.com Thu Mar 7 17:57:19 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Thu, 7 Mar 2019 17:57:19 +0000 Subject: RFR(M): 8219584: Try to dump error file by thread which causes safepoint timeout In-Reply-To: <157080d0-45a4-7bb5-6fe5-ddab0c9a3a32@oracle.com> References: <233153e3-16f6-503b-d8af-7c22838dc34a@oracle.com> <157080d0-45a4-7bb5-6fe5-ddab0c9a3a32@oracle.com> Message-ID: Hi Vladimir, thank you very much for reviewing. Best regards, Martin -----Original Message----- From: Vladimir Kozlov Sent: Donnerstag, 7. 
März 2019 18:50 To: Doerr, Martin ; 'hotspot-compiler-dev at openjdk.java.net' Cc: David Holmes (david.holmes at oracle.com) Subject: Re: RFR(M): 8219584: Try to dump error file by thread which causes safepoint timeout On 3/7/19 7:38 AM, Doerr, Martin wrote: > Hi Vladimir, > > thank you for reviewing it so quickly and for your good points. > >> Compiling with -Xcomp may produce unexpected result. > In general, yes, but I didn't see any uncommon traps in this simple method. > Were you concerned about that? Yes, uncommon traps was my concern. That is why I asked if you looked on generated code. > > I'm using -Xcomp + compileonly in order to get the test_loop method precompiled. > (Also avoids OnStackReplacement.) > >> Did you look on generated code for test_loop() method? > Yes, looks good. Also see OptoAssembly from SPARC below. Yes, it looks good! > >> Also use something smaller then Integer.MAX_VALUE for limit (subtract -100 for example) to simplify logic for overflow >> checks. > I've tried, but there was no change in OptoAssembly other than the constant value. > I've looked at the loopnode.cpp overflow checking logic and it appears to be implemented nicely and can detect that stride = 1 and limit "< max_jint" will not overflow. > Do you agree? I agree since generated code look good. > >> You may also add -XX:LoopUnrollLimit=0 to avoid unrolling and other loop optimizations which you don't need. Check >> generated code. > Good idea. Makes the OptoAssembly better readably. Added in place. Thanks. > >> Did you tested on SPARC? > Of course. Works great on this platform. Also see OptoAssembly below. > >> How long it takes to run on it? > Took less than a minute. Hmm... seems like the jtreg stuff takes most of the time. > The test itself shouldn't run much longer than 1 second (500ms GuaranteedSafepointInterval + 500ms SafepointTimeoutDelay) until the Java Thread gets killed. I was concern that you may get a timeout due to long loop if it is not interrupted. 
But since you set times to 0.5 sec it could happens only if we have a bug. So it seems good. > > Can I add you as reviewer? Yes. I looked on changes and they are good Thanks, Vladimir > > Thanks and best regards, > Martin > > > OptoAssembly SPARC > > 000 B1: # B5 B2 <- BLOCK HEAD IS JUNK Freq: 1 > 000 ! stack bang (144 bytes) > SAVE R_SP,-144,R_SP > 014 > MOV R_I0,R_L3 ! spill > 018 + MOV #0,R_I0 > 01c CWBeq R_L3,#0,B5 ! int P=0.100000 C=-1.000000 > 01c > 020 B2: # B3 <- B1 Freq: 0.9 > 020 + SET #2147483647,R_L0 > 028 + MOV #1,R_L2 > 028 > 02c B3: # B6 B4 <- B2 B4 Loop: B3-B4 inner Freq: 9 > 02c SREM R_L2,R_L3,R_L1 > 040 + CWBeq R_L1,#0,B6 ! int P=0.100000 C=-1.000000 > 040 > 044 B4: # B3 B5 <- B6 B3 Freq: 9 > 044 + ADD R_L2,#1,R_L2 > 048 CWBlt R_L2,R_L0,B3 ! Loop end P=0.900000 C=-1.000000 > 048 > 04c B5: # N1 <- B4 B1 Freq: 1 > 04c LDX [R_G2 + #poll_offset],L0 ! Load local polling address > LDX [L0],G0 !Poll for Safepointing > RET > RESTORE > 05c + ! return > 05c > 05c B6: # B4 <- B3 Freq: 0.9 > 05c + ADD R_I0,#1,R_I0 > 060 BA B4 ! short branch > > > > -----Original Message----- > From: Vladimir Kozlov > Sent: Mittwoch, 6. März 2019 20:14 > To: Doerr, Martin ; 'hotspot-compiler-dev at openjdk.java.net' > Cc: David Holmes (david.holmes at oracle.com) > Subject: Re: RFR(M): 8219584: Try to dump error file by thread which causes safepoint timeout > > Hi Martin, > > On 3/6/19 3:43 AM, Doerr, Martin wrote: >> Hi, >> >> my proposal JDK-8219584 is currently being reviewed on hotspot-runtime-dev, but it contains a small test which >> explicitly uses C2. >> >> May I get a review for TestAbortVMOnSafepointTimeout.java, please? >> >> Webrev: >> >> http://cr.openjdk.java.net/~mdoerr/8219584_kill_thread_on_safepoint_timeout/webrev.02/ >> >> Bug with description of the feature: >> >> https://bugs.openjdk.java.net/browse/JDK-8219584 >> >> The purpose of the method "test_loop" is to loop long enough to hit a safepoint timeout (with configured timeout delay). 
>> >> I compile it directly by C2 with -XX:-UseCountedLoopSafepoints and -XX:LoopStripMiningIter=0. > > Yes, these flags combination should work even if one of this flag is set by testing infra. > Also you correctly use @requires vm.compiler2.enabled to run VM build where these flag are available. > >> >> This should force the loop to get compiled without safepoint and 2 billion divisions should definitely take long enough >> to hit a 500ms safepoint timeout. > > Compiling with -Xcomp may produce unexpected result. Did you look on generated code for test_loop() method? > Also use something smaller then Integer.MAX_VALUE for limit (subtract -100 for example) to simplify logic for overflow > checks. > You may also add -XX:LoopUnrollLimit=0 to avoid unrolling and other loop optimizations which you don't need. Check > generated code. > >> >> I've tested it many times on all platforms we have and I've never seen it failing. > > Did you tested on SPARC? How long it takes to run on it? > >> >> Is it fine to rely on this? > > Yes, I think so. > > Vladimir > >> >> Best regards, >> >> Martin >> From vladimir.kozlov at oracle.com Thu Mar 7 18:46:04 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 7 Mar 2019 10:46:04 -0800 Subject: [13] RFR(S): 8218201: Failures when vmIntrinsics::_getClass is not inlined In-Reply-To: <8b04ca12-3619-9e44-7f79-e50eeca7f16c@oracle.com> References: <8b04ca12-3619-9e44-7f79-e50eeca7f16c@oracle.com> Message-ID: <9739ec8d-703b-a10b-9325-c88f213c1d8e@oracle.com> _getClass is special case because it is native call and we can't do bytecode analysis. But we know what it does (it loads klass from object and mirror from klass) - no allocations, no locals, no arguments returned. I think it is simple missing _return_allocated = false setting in addition to _return_local = false. _return_allocated is set to true by default optimistically. 
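To make the failure mode concrete, here is a minimal Java sketch of the comparison in question. It is not the TestGetClass.java from the webrev; the class name, method name, and warm-up count are assumptions. If the escape analyzer wrongly treats the result of getClass() as a newly allocated, unescaped object, the pointer comparison could be folded to false, whereas a correct VM should report PASS.

```java
// Hypothetical sketch; not the test from the webrev.
final class GetClassCompareSketch {
    // Returns true only for direct instances of java.lang.Object.
    static boolean isPlainObject(Object obj) {
        return obj.getClass() == Object.class;
    }

    public static void main(String[] args) {
        boolean ok = true;
        // Repeat the calls so the method is likely to get JIT-compiled;
        // the iteration count is arbitrary.
        for (int i = 0; i < 100_000; i++) {
            ok &= isPlainObject(new Object());
            ok &= !isPlainObject("not a plain Object");
        }
        System.out.println(ok ? "PASS" : "FAIL");
    }
}
```

The problem is only expected to show up when the _getClass intrinsic is not used, so a plain run of this sketch would normally pass either way.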
Thanks, Vladimir On 3/7/19 4:54 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8218201 > http://cr.openjdk.java.net/~thartmann/8218201/webrev.00/ > > When intrinsification is disabled, the BCEscapeAnalyzer marks the return value of the (native) > method Object::getClass as "return allocated value" which means that "only newly allocated unescaped > objects are returned". The OptimizePtrCompare optimization then uses this information to incorrectly > fold 'obj.getClass() == Object.class' (see TestGetClass.java:39) to always false. > > This is a very old issue and I can't trace back why a special case for the _getClass intrinsic has > been added to the BCEscapeAnalyzer. Since I don't think we should make any assumptions about the > returned Object, I've removed the special case. > > Thanks, > Tobias > From vladimir.kozlov at oracle.com Thu Mar 7 18:48:11 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 7 Mar 2019 10:48:11 -0800 Subject: RFR(S): 8219448: split-if update_uses accesses stale idom data In-Reply-To: <88e902b8-2e15-1812-3ae3-e5de2921ef13@oracle.com> References: <9188a8bc-fe5e-d51e-092e-48e947265679@oracle.com> <9226c09c-f166-daf7-ab07-457d39a21974@oracle.com> <88e902b8-2e15-1812-3ae3-e5de2921ef13@oracle.com> Message-ID: <4d31dbf5-567e-e7b4-1b8e-32edf7a4469d@oracle.com> +1 Thanks, Vladimir On 3/7/19 5:29 AM, Tobias Hartmann wrote: > Hi Nils, > > looks good to me. > > Best regards, > Tobias > > On 07.03.19 14:17, Nils Eliasson wrote: >> Updated webrev: >> >> http://cr.openjdk.java.net/~neliasso/8219448/webrev.02 >> >> Regards, >> >> Nils >> >> On 2019-02-28 22:55, Vladimir Kozlov wrote: >>> Got it. In such case you need to replace lazy_replace() at line 575 with remove_dead_node() and >>> you don't need assert outcnt() == 0 then. >>> >>> Thanks, >>> Vladimir >>> >>> On 2/28/19 11:48 AM, Nils Eliasson wrote: >>>> Hi, >>>> >>>> I just moved it up from line 570. 
>>>> >>>> http://hg.openjdk.java.net/jdk/jdk/file/196ab0abc685/src/hotspot/share/opto/split_if.cpp#l570 >>>> >>>> And on 575 we will call the replace on it, to finally kill it. >>>> >>>> http://hg.openjdk.java.net/jdk/jdk/file/196ab0abc685/src/hotspot/share/opto/split_if.cpp#l575 >>>> >>>> So the node is dead, we just must handle the phi-uses first, and while doing that, a correct idom >>>> is required. >>>> >>>> The code that triggers this bug has a diamond-shaped control flow below the split-if-region. The >>>> region-nodes that need their idom corrected is far down and isn't touched during the split. But >>>> there is a call down there. It has one of its data edges defined by a phi hanging on the >>>> split-region. So when we try to call spinup on it, it will traverse a broken idom chain. >>>> >>>> The conclusion of my investigation is that all regions that have the split-region as its idom, >>>> must be updated (even if they are below). For that we have the lazy_update mechanism, and to make >>>> it trigger, I must mark the region as killed slightly earlier. >>>> >>>> Regards, >>>> >>>> Nils >>>> >>>> >>>> >>>> >>>> >>>> On 2019-02-28 20:00, Vladimir Kozlov wrote: >>>>> Hi Nils, >>>>> >>>>> You are updating map so that next code in idom_no_update() works for you: >>>>> http://hg.openjdk.java.net/jdk/jdk/file/196ab0abc685/src/hotspot/share/opto/loopnode.hpp#l928 >>>>> Which seems a hack to me. I think we should fix spinup() method to skip old region when looking >>>>> for idom (I assume that is where you have the problem). >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 2/28/19 5:18 AM, Nils Eliasson wrote: >>>>>> Hi, >>>>>> >>>>>> This patch fixes some of the idom updates in split-if. The updates are there, but at the end, >>>>>> which is to late. They need to be correct when handling the uses of the region being split. >>>>>> When the stale idom is seen we end up asserting or crashing. 
>>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8219448
>>>>>>
>>>>>> http://cr.openjdk.java.net/~neliasso/8219448/webrev.01/
>>>>>>
>>>>>> Please review,
>>>>>>
>>>>>> Nils Eliasson
>>>>>>

From dean.long at oracle.com Thu Mar 7 19:20:07 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Thu, 7 Mar 2019 11:20:07 -0800
Subject: [13] RFR(S): 8218201: Failures when vmIntrinsics::_getClass is not inlined
In-Reply-To: <9739ec8d-703b-a10b-9325-c88f213c1d8e@oracle.com>
References: <8b04ca12-3619-9e44-7f79-e50eeca7f16c@oracle.com> <9739ec8d-703b-a10b-9325-c88f213c1d8e@oracle.com>
Message-ID: <1e422175-61a6-2f6e-f351-e23c17fad606@oracle.com>

I agree. We don't want obj.getClass() to treat obj as "global escape".

dl

PS - Original change goes back to JDK-6488063.

On 3/7/19 10:46 AM, Vladimir Kozlov wrote:
> _getClass is special case because it is native call and we can't do
> bytecode analysis.
> But we know what it does (it loads klass from object and mirror from
> klass) - no allocations, no locals, no arguments returned.
> I think it is simple missing _return_allocated = false setting in
> addition to _return_local = false. _return_allocated is set to true by
> default optimistically.
>
> Thanks,
> Vladimir
>
> On 3/7/19 4:54 AM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch:
>> https://bugs.openjdk.java.net/browse/JDK-8218201
>> http://cr.openjdk.java.net/~thartmann/8218201/webrev.00/
>>
>> When intrinsification is disabled, the BCEscapeAnalyzer marks the
>> return value of the (native)
>> method Object::getClass as "return allocated value" which means that
>> "only newly allocated unescaped
>> objects are returned". The OptimizePtrCompare optimization then uses
>> this information to incorrectly
>> fold 'obj.getClass() == Object.class' (see TestGetClass.java:39) to
>> always false.
>> >> This is a very old issue and I can't trace back why a special case >> for the _getClass intrinsic has >> been added to the BCEscapeAnalyzer. Since I don't think we should >> make any assumptions about the >> returned Object, I've removed the special case. >> >> Thanks, >> Tobias >> From tobias.hartmann at oracle.com Fri Mar 8 08:25:33 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 8 Mar 2019 09:25:33 +0100 Subject: [13] RFR(S): 8218201: Failures when vmIntrinsics::_getClass is not inlined In-Reply-To: <1e422175-61a6-2f6e-f351-e23c17fad606@oracle.com> References: <8b04ca12-3619-9e44-7f79-e50eeca7f16c@oracle.com> <9739ec8d-703b-a10b-9325-c88f213c1d8e@oracle.com> <1e422175-61a6-2f6e-f351-e23c17fad606@oracle.com> Message-ID: <8deb2b91-9a1e-b17e-bd70-c27190516ea6@oracle.com> Vladimir, Dean, thanks for the review. My first fix version had '_return_allocated = false' but then I incorrectly assumed that the behavior would be the same as if we don't special case at all. I missed that the receiver is of course also involved and you are right that we should not treat it as "global escape". New webrev: http://cr.openjdk.java.net/~thartmann/8218201/webrev.01/ Best regards, Tobias On 07.03.19 20:20, dean.long at oracle.com wrote: > I agree.? We don't want obj.getClass() to treat obj as "global escape". > > dl > > PS - Original change goes back to JDK-6488063. > > On 3/7/19 10:46 AM, Vladimir Kozlov wrote: >> _getClass is special case because it is native call and we can't do bytecode analysis. >> But we know what it does (it loads klass from object and mirror from klass) - no allocations, no >> locals, no arguments returned. >> I think it is simple missing _return_allocated = false setting in addition to _return_local = >> false. _return_allocated is set to true by default optimistically. 
>>
>> Thanks,
>> Vladimir
>>
>> On 3/7/19 4:54 AM, Tobias Hartmann wrote:
>>> Hi,
>>>
>>> please review the following patch:
>>> https://bugs.openjdk.java.net/browse/JDK-8218201
>>> http://cr.openjdk.java.net/~thartmann/8218201/webrev.00/
>>>
>>> When intrinsification is disabled, the BCEscapeAnalyzer marks the return value of the (native)
>>> method Object::getClass as "return allocated value" which means that "only newly allocated unescaped
>>> objects are returned". The OptimizePtrCompare optimization then uses this information to incorrectly
>>> fold 'obj.getClass() == Object.class' (see TestGetClass.java:39) to always false.
>>>
>>> This is a very old issue and I can't trace back why a special case for the _getClass intrinsic has
>>> been added to the BCEscapeAnalyzer. Since I don't think we should make any assumptions about the
>>> returned Object, I've removed the special case.
>>>
>>> Thanks,
>>> Tobias
>>>
>

From nils.eliasson at oracle.com Fri Mar 8 10:52:06 2019
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Fri, 8 Mar 2019 11:52:06 +0100
Subject: RFR(S): 8219517: assert(false) failed: infinite loop in PhaseIterGVN::optimize
Message-ID:

Hi all,

Background:

We can get stuck in an infinite loop in IGVN. The method reproducing the problem is quite a big graph, and after some optimization, a huge loop will die. But since it is so big, it takes a while before it has been pruned.

In an inner loop there is a phi on memory that gets reduced to a self looping heart, with a membar on each in edge. There is also a connected region that keeps it alive. (From the start there is other memory state coming into this loop, but it gets disconnected early when the loop dies.)

+---+                 +---+
|   v                 v   |
| Membar +-+ +---+ Membar |
|          | |            |
|          v v            |
|          Phi            |
|          + + +          |
|          | | |          |
+----------+ | +----------+
             |
             v
           LoadN

In IGVN, Ideal() will be called on the Load.

On iteration 1 - A split_through_phi on one edge will be performed, because we can prove that other edge of the phi is a loop. Now the Load hangs of one of the membars.

On iteration 2 - Optimize_memory_chain will suggest the in to the membar as a more ideal memory, and then the load get the phi back as the memory input.

Repeat.

I have gone great lengths to show that this code is part of a huge loop, that is dead, and will be eliminated in due time.

My suggested solution to breaking the infinite loop, is to change the first case, by simply not perform the memory replacement when both inputs are self loops.

https://bugs.openjdk.java.net/browse/JDK-8219517

http://cr.openjdk.java.net/~neliasso/8219517/webrev.01/

Regards,

Nils

From nils.eliasson at oracle.com Fri Mar 8 11:06:42 2019
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Fri, 8 Mar 2019 12:06:42 +0100
Subject: RFR(S): 8219642: ciReplay loads wrong data when MethodData size changes
In-Reply-To: <78fd3106-ff46-1ba2-20b6-4a8b2c57681e@oracle.com>
References: <78fd3106-ff46-1ba2-20b6-4a8b2c57681e@oracle.com>
Message-ID: <2fa31e7d-ec1f-5122-7518-dd75e3b81b74@oracle.com>

A second review, please.

Regards,

Nils

On 2019-02-25 14:16, Nils Eliasson wrote:
>
> Hi,
>
> I stumbled upon this problem when trying to
> reproduce https://bugs.openjdk.java.net/browse/JDK-8219448 on JDK 13.
> The crash was recorded on a late 12 build, but the issue doesn't
> reproduce on 13. A bisection revealed that JDK-8210832
> "Remove sneaky
> locking in class Monitor" caused the problem, at it doesn't even touch
> the compilers.
>
> The problem is that when ciReplay serializes a ciMethodData it will
> serialize a MethodData as an array preceded by the size.
> But a MethodData contains an inlined Mutex, and its size changed with
> the removal of sneaky locking.
>
> This fix adds code for detecting a size change of MethodData, and
> tries to recover by adding padding or dropping data.
> Since all non significant serialization data are in the beginning, the padding or
> dropping of data is done from the start.
>
> https://bugs.openjdk.java.net/browse/JDK-8219642
>
> http://cr.openjdk.java.net/~neliasso/8219642/webrev.01/
>
> Please review,
>
> Nils Eliasson
>
>

From tobias.hartmann at oracle.com Fri Mar 8 13:14:17 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 8 Mar 2019 14:14:17 +0100
Subject: RFR(S): 8219642: ciReplay loads wrong data when MethodData size changes
In-Reply-To: <2fa31e7d-ec1f-5122-7518-dd75e3b81b74@oracle.com>
References: <78fd3106-ff46-1ba2-20b6-4a8b2c57681e@oracle.com> <2fa31e7d-ec1f-5122-7518-dd75e3b81b74@oracle.com>
Message-ID:

Hi Nils,

looks good to me too.

Best regards,
Tobias

On 08.03.19 12:06, Nils Eliasson wrote:
> A second review, please.
>
> Regards,
>
> Nils
>
>
> On 2019-02-25 14:16, Nils Eliasson wrote:
>>
>> Hi,
>>
>> I stumbled upon this problem when trying to
>> reproduce https://bugs.openjdk.java.net/browse/JDK-8219448 on JDK 13. The crash was recorded on a
>> late 12 build, but the issue doesn't reproduce on 13. A bisection revealed that JDK-8210832
>> "Remove sneaky locking in class Monitor" caused
>> the problem, at it doesn't even touch the compilers.
>>
>> The problem is that when ciReplay serializes a ciMethodData it will serialize a MethodData as an
>> array preceded by the size.
>> But a MethodData contains an inlined Mutex, and its size changed with the removal of sneaky locking.
>>
>> This fix adds code for detecting a size change of MethodData, and tries to recover by adding
>> padding or dropping data. Since all non significant serialization data are in the beginning, the
>> padding or dropping of data is done from the start.
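The recovery strategy described above (pad or drop at the start when the recorded size differs from the expected one) can be sketched in a few lines. This is only an illustration in Java, not the C++ change in the webrev; the class name, the method name, and the use of a long[] for the serialized words are assumptions.

```java
import java.util.Arrays;

// Hypothetical sketch of "pad or drop from the start"; not the ciReplay code.
final class ReplayDataResizeSketch {
    // Returns an array of exactly `expected` words. The tail of `recorded`
    // is always preserved; a size mismatch is absorbed at the front.
    static long[] fitToExpectedSize(long[] recorded, int expected) {
        long[] result = new long[expected]; // zero-initialized
        if (recorded.length >= expected) {
            // Recorded data is too long: drop the leading surplus words.
            System.arraycopy(recorded, recorded.length - expected, result, 0, expected);
        } else {
            // Recorded data is too short: leave zero padding at the start.
            System.arraycopy(recorded, 0, result, expected - recorded.length, recorded.length);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fitToExpectedSize(new long[] {1, 2, 3, 4}, 3)));
        System.out.println(Arrays.toString(fitToExpectedSize(new long[] {1, 2}, 4)));
    }
}
```

Keeping the tail intact reflects the stated assumption that the significant data sits at the end of the serialized block.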
>> >> https://bugs.openjdk.java.net/browse/JDK-8219642 >> >> http://cr.openjdk.java.net/~neliasso/8219642/webrev.01/ >> >> Please review, >> >> Nils Eliasson >> >> >> From tobias.hartmann at oracle.com Fri Mar 8 13:34:14 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 8 Mar 2019 14:34:14 +0100 Subject: [13] RFR(S): 8220341: Class redefinition fails with assert(!is_unloaded()) failed: unloaded method on the stack Message-ID: <4ed958d5-729b-9d2b-8d8a-bcfd7bbdea40@oracle.com> Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8220341 http://cr.openjdk.java.net/~thartmann/8220341/webrev.00/ The assert added by 8163511 [1] is wrong and should be replaced by a check because class redefinition can encounter unloaded methods in the compile queue during marking (they will be removed from the queue later). Verified fix with failing tests at hs-tier6. Will run more tests. Thanks, Tobias [1] https://bugs.openjdk.java.net/browse/JDK-8163511 From nils.eliasson at oracle.com Fri Mar 8 14:25:26 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Fri, 8 Mar 2019 15:25:26 +0100 Subject: [13] RFR(S): 8220341: Class redefinition fails with assert(!is_unloaded()) failed: unloaded method on the stack In-Reply-To: <4ed958d5-729b-9d2b-8d8a-bcfd7bbdea40@oracle.com> References: <4ed958d5-729b-9d2b-8d8a-bcfd7bbdea40@oracle.com> Message-ID: <1cb7626a-b857-48a6-07c9-63e05372e948@oracle.com> Hi Tobias, Looks good! Regards, Nils On 2019-03-08 14:34, Tobias Hartmann wrote: > Hi, > > please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8220341 > http://cr.openjdk.java.net/~thartmann/8220341/webrev.00/ > > The assert added by 8163511 [1] is wrong and should be replaced by a check because class > redefinition can encounter unloaded methods in the compile queue during marking (they will be > removed from the queue later). > > Verified fix with failing tests at hs-tier6. Will run more tests. 
>
> Thanks,
> Tobias
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8163511

From eric.caspole at oracle.com Fri Mar 8 17:58:22 2019
From: eric.caspole at oracle.com (eric.caspole at oracle.com)
Date: Fri, 8 Mar 2019 12:58:22 -0500
Subject: RFR (S): 8220368 : Update String.indexOf to test all the C2 intrinsics
Message-ID: <89236cb0-0f52-795b-2593-a1abfd89b0b7@oracle.com>

Hi everybody,

Could I have reviews on this update to the String.indexOf JMH adding several new benchmarks so as to test all the intrinsics of Latin1 and UTF-16:

* _indexOfIL
* _indexOfIU
* _indexOfIUL
* _indexOfL
* _indexOfU
* _indexOfU_char
* _indexOfUL

These should be helpful in testing updates from Graal and performance regression testing in general.

JBS:

https://bugs.openjdk.java.net/browse/JDK-8220368

Webrev:

http://cr.openjdk.java.net/~ecaspole/JDK-8220368/01/webrev/

Thanks,

Eric

From vladimir.kozlov at oracle.com Fri Mar 8 18:09:08 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 8 Mar 2019 10:09:08 -0800
Subject: RFR (S): 8220368 : Update String.indexOf to test all the C2 intrinsics
In-Reply-To: <89236cb0-0f52-795b-2593-a1abfd89b0b7@oracle.com>
References: <89236cb0-0f52-795b-2593-a1abfd89b0b7@oracle.com>
Message-ID:

Good.

Thanks,
Vladimir

On 3/8/19 9:58 AM, eric.caspole at oracle.com wrote:
> Hi everybody,
>
> Could I have reviews on this update to the String.indexOf JMH adding several new benchmarks so as to test all the
> intrinsics of Latin1 and UTF-16:
>
> * _indexOfIL
> * _indexOfIU
> * _indexOfIUL
> * _indexOfL
> * _indexOfU
> * _indexOfU_char
> * _indexOfUL
>
> These should be helpful in testing updates from Graal and performance regression testing in general.
> > JBS: > > https://bugs.openjdk.java.net/browse/JDK-8220368 > > Webrev: > > http://cr.openjdk.java.net/~ecaspole/JDK-8220368/01/webrev/ > > > Thanks, > > Eric > From vladimir.kozlov at oracle.com Fri Mar 8 18:11:58 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 8 Mar 2019 10:11:58 -0800 Subject: [13] RFR(S): 8220341: Class redefinition fails with assert(!is_unloaded()) failed: unloaded method on the stack In-Reply-To: <1cb7626a-b857-48a6-07c9-63e05372e948@oracle.com> References: <4ed958d5-729b-9d2b-8d8a-bcfd7bbdea40@oracle.com> <1cb7626a-b857-48a6-07c9-63e05372e948@oracle.com> Message-ID: <46e6d4b1-f164-c251-80bb-238cfe5eb9e0@oracle.com> +1 Thanks, Vladimir On 3/8/19 6:25 AM, Nils Eliasson wrote: > Hi Tobias, > > Looks good! > > Regards, > Nils > > > On 2019-03-08 14:34, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8220341 >> http://cr.openjdk.java.net/~thartmann/8220341/webrev.00/ >> >> The assert added by 8163511 [1] is wrong and should be replaced by a check because class >> redefinition can encounter unloaded methods in the compile queue during marking (they will be >> removed from the queue later). >> >> Verified fix with failing tests at hs-tier6. Will run more tests. >> >> Thanks, >> Tobias >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8163511 From vladimir.kozlov at oracle.com Fri Mar 8 18:27:00 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 8 Mar 2019 10:27:00 -0800 Subject: RFR(S): 8219517: assert(false) failed: infinite loop in PhaseIterGVN::optimize In-Reply-To: References: Message-ID: <70607611-a318-7634-2c35-bccee4300ffd@oracle.com> Hi Nils, What if it is normal diamond shaped graph and no loop? You can have case when both branches point to the same memory too. May be add is_Loop() check. Thanks, Vladimir On 3/8/19 2:52 AM, Nils Eliasson wrote: > Hi all, > > Background: > > We can get stuck in an infinite loop in IGVN. 
> The method reproducing the problem is quite a big graph, and after some
> optimization, a huge loop will die. But since it is so big, it takes a while before it has been pruned.
>
> In an inner loop there is a phi on memory that gets reduced to a self looping heart, with a membar on each in edge.
> There is also a connected region that keeps it alive. (From the start there is other memory state coming into this loop,
> but it gets disconnected early when the loop dies.)
>
> +---+                 +---+
> |   v                 v   |
> | Membar +-+ +---+ Membar |
> |          | |            |
> |          v v            |
> |          Phi            |
> |          + + +          |
> |          | | |          |
> +----------+ | +----------+
>              |
>              v
>            LoadN
>
> In IGVN, Ideal() will be called on the Load.
>
> On iteration 1 - A split_through_phi on one edge will be performed, because we can prove that other edge of the phi is a
> loop. Now the Load hangs of one of the membars.
>
> On iteration 2 - Optimize_memory_chain will suggest the in to the membar as a more ideal memory, and then the load get
> the phi back as the memory input.
>
> Repeat.
>
> I have gone great lengths to show that this code is part of a huge loop, that is dead, and will be eliminated in due time.
>
> My suggested solution to breaking the infinite loop, is to change the first case, by simply not perform the memory
> replacement when both inputs are self loops.
> > https://bugs.openjdk.java.net/browse/JDK-8219517 > > http://cr.openjdk.java.net/~neliasso/8219517/webrev.01/ > > Regards, > > Nils > > > > From vladimir.kozlov at oracle.com Fri Mar 8 18:29:26 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 8 Mar 2019 10:29:26 -0800 Subject: [13] RFR(S): 8218201: Failures when vmIntrinsics::_getClass is not inlined In-Reply-To: <8deb2b91-9a1e-b17e-bd70-c27190516ea6@oracle.com> References: <8b04ca12-3619-9e44-7f79-e50eeca7f16c@oracle.com> <9739ec8d-703b-a10b-9325-c88f213c1d8e@oracle.com> <1e422175-61a6-2f6e-f351-e23c17fad606@oracle.com> <8deb2b91-9a1e-b17e-bd70-c27190516ea6@oracle.com> Message-ID: Looks good. Thanks, Vladimir On 3/8/19 12:25 AM, Tobias Hartmann wrote: > Vladimir, Dean, thanks for the review. > > My first fix version had '_return_allocated = false' but then I incorrectly assumed that the > behavior would be the same as if we don't special case at all. I missed that the receiver is of > course also involved and you are right that we should not treat it as "global escape". > > New webrev: > http://cr.openjdk.java.net/~thartmann/8218201/webrev.01/ > > Best regards, > Tobias > > On 07.03.19 20:20, dean.long at oracle.com wrote: >> I agree.? We don't want obj.getClass() to treat obj as "global escape". >> >> dl >> >> PS - Original change goes back to JDK-6488063. >> >> On 3/7/19 10:46 AM, Vladimir Kozlov wrote: >>> _getClass is special case because it is native call and we can't do bytecode analysis. >>> But we know what it does (it loads klass from object and mirror from klass) - no allocations, no >>> locals, no arguments returned. >>> I think it is simple missing _return_allocated = false setting in addition to _return_local = >>> false. _return_allocated is set to true by default optimistically. 
>>> >>> Thanks, >>> Vladimir >>> >>> On 3/7/19 4:54 AM, Tobias Hartmann wrote: >>>> Hi, >>>> >>>> please review the following patch: >>>> https://bugs.openjdk.java.net/browse/JDK-8218201 >>>> http://cr.openjdk.java.net/~thartmann/8218201/webrev.00/ >>>> >>>> When intrinsification is disabled, the BCEscapeAnalyzer marks the return value of the (native) >>>> method Object::getClass as "return allocated value" which means that "only newly allocated unescaped >>>> objects are returned". The OptimizePtrCompare optimization then uses this information to incorrectly >>>> fold 'obj.getClass() == Object.class' (see TestGetClass.java:39) to always false. >>>> >>>> This is a very old issue and I can't trace back why a special case for the _getClass intrinsic has >>>> been added to the BCEscapeAnalyzer. Since I don't think we should make any assumptions about the >>>> returned Object, I've removed the special case. >>>> >>>> Thanks, >>>> Tobias >>>> >> From dean.long at oracle.com Fri Mar 8 19:22:11 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Fri, 8 Mar 2019 11:22:11 -0800 Subject: [13] RFR(S): 8218201: Failures when vmIntrinsics::_getClass is not inlined In-Reply-To: References: <8b04ca12-3619-9e44-7f79-e50eeca7f16c@oracle.com> <9739ec8d-703b-a10b-9325-c88f213c1d8e@oracle.com> <1e422175-61a6-2f6e-f351-e23c17fad606@oracle.com> <8deb2b91-9a1e-b17e-bd70-c27190516ea6@oracle.com> Message-ID: <9e883ca4-6591-d47f-ba01-174c1d28a8a7@oracle.com> +1 dl On 3/8/19 10:29 AM, Vladimir Kozlov wrote: > Looks good. > > Thanks, > Vladimir > > On 3/8/19 12:25 AM, Tobias Hartmann wrote: >> Vladimir, Dean, thanks for the review. >> >> My first fix version had '_return_allocated = false' but then I >> incorrectly assumed that the >> behavior would be the same as if we don't special case at all. I >> missed that the receiver is of >> course also involved and you are right that we should not treat it as >> "global escape". 
>> >> New webrev: >> http://cr.openjdk.java.net/~thartmann/8218201/webrev.01/ >> >> Best regards, >> Tobias >> >> On 07.03.19 20:20, dean.long at oracle.com wrote: >>> I agree.? We don't want obj.getClass() to treat obj as "global escape". >>> >>> dl >>> >>> PS - Original change goes back to JDK-6488063. >>> >>> On 3/7/19 10:46 AM, Vladimir Kozlov wrote: >>>> _getClass is special case because it is native call and we can't do >>>> bytecode analysis. >>>> But we know what it does (it loads klass from object and mirror >>>> from klass) - no allocations, no >>>> locals, no arguments returned. >>>> I think it is simple missing _return_allocated = false setting in >>>> addition to _return_local = >>>> false. _return_allocated is set to true by default optimistically. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 3/7/19 4:54 AM, Tobias Hartmann wrote: >>>>> Hi, >>>>> >>>>> please review the following patch: >>>>> https://bugs.openjdk.java.net/browse/JDK-8218201 >>>>> http://cr.openjdk.java.net/~thartmann/8218201/webrev.00/ >>>>> >>>>> When intrinsification is disabled, the BCEscapeAnalyzer marks the >>>>> return value of the (native) >>>>> method Object::getClass as "return allocated value" which means >>>>> that "only newly allocated unescaped >>>>> objects are returned". The OptimizePtrCompare optimization then >>>>> uses this information to incorrectly >>>>> fold 'obj.getClass() == Object.class' (see TestGetClass.java:39) >>>>> to always false. >>>>> >>>>> This is a very old issue and I can't trace back why a special case >>>>> for the _getClass intrinsic has >>>>> been added to the BCEscapeAnalyzer. Since I don't think we should >>>>> make any assumptions about the >>>>> returned Object, I've removed the special case. 
>>>>> >>>>> Thanks, >>>>> Tobias >>>>> >>> From claes.redestad at oracle.com Fri Mar 8 21:51:57 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Fri, 8 Mar 2019 22:51:57 +0100 Subject: RFR (S): 8220368 : Update String.indexOf to test all the C2 intrinsics In-Reply-To: <89236cb0-0f52-795b-2593-a1abfd89b0b7@oracle.com> References: <89236cb0-0f52-795b-2593-a1abfd89b0b7@oracle.com> Message-ID: <73ccf002-9420-9507-b2f7-12ee8b0fb2ca@oracle.com> Looks OK to me. Thanks! /Claes On 2019-03-08 18:58, eric.caspole at oracle.com wrote: > Hi everybody, > > Could I have reviews on this update to the String.indexOf JMH adding > several new benchmarks so as to test all the intrinsics of Latin1 and > UTF-16: > > ? _indexOfIL > ? _indexOfIU > ? _indexOfIUL > ? _indexOfL > ? _indexOfU > ? _indexOfU_char > ? _indexOfUL > > These should be helpful in testing updates from Graal and performance > regression testing in general. > > JBS: > > https://bugs.openjdk.java.net/browse/JDK-8220368 > > Webrev: > > http://cr.openjdk.java.net/~ecaspole/JDK-8220368/01/webrev/ > > > Thanks, > > Eric > From jesper.wilhelmsson at oracle.com Sat Mar 9 08:46:04 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Sat, 9 Mar 2019 09:46:04 +0100 Subject: RFR: JDK-8218074 - Update Graal Message-ID: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com> Hi, Please review the patch to integrate the latest Graal changes into OpenJDK. Graal tip to integrate: db79f81716886b7883370cd6ea1bbf5c42966fa5 JBS duplicates fixed by this integration: https://bugs.openjdk.java.net/browse/JDK-8217161 https://bugs.openjdk.java.net/browse/JDK-8218698 https://bugs.openjdk.java.net/browse/JDK-8218859 JBS duplicates deferred to the next integration: https://bugs.openjdk.java.net/browse/JDK-8214947 Bug: https://bugs.openjdk.java.net/browse/JDK-8218074 Webrev: http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/ This integration did overwrite changes already in place in OpenJDK. 
A subtask was created to hold the diff: https://bugs.openjdk.java.net/browse/JDK-8220387

Thanks,
/Jesper

From vladimir.kozlov at oracle.com Sat Mar 9 17:42:35 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Sat, 9 Mar 2019 09:42:35 -0800
Subject: RFR: JDK-8218074 - Update Graal
In-Reply-To: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com>
References: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com>
Message-ID:

Hi Jesper,

Thank you for doing this.

I looked on 8220387 to see what changes your webrev does not have and it is strange. You should not have such big difference. How you prepared overwritten-diffs.txt?

For example, JAOTC changes "8215322: add @file support to jaotc" are in overwritten-diffs.txt. But changeset is listed in the 8218074 Graal changelog list [1].

Thanks,
Vladimir

[1] 8433917 Wed Dec 19 16:20:00 2018 -0800 Igor Ignatyev [GR-13142] Add at-file support to jaotc.

On 3/9/19 12:46 AM, jesper.wilhelmsson at oracle.com wrote:
> Hi,
>
> Please review the patch to integrate the latest Graal changes into OpenJDK.
> Graal tip to integrate: db79f81716886b7883370cd6ea1bbf5c42966fa5
>
> JBS duplicates fixed by this integration:
> https://bugs.openjdk.java.net/browse/JDK-8217161
> https://bugs.openjdk.java.net/browse/JDK-8218698
> https://bugs.openjdk.java.net/browse/JDK-8218859
>
> JBS duplicates deferred to the next integration:
> https://bugs.openjdk.java.net/browse/JDK-8214947
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8218074
> Webrev: http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/
>
> This integration did overwrite changes already in place in OpenJDK.
> A subtask was created to hold the diff: https://bugs.openjdk.java.net/browse/JDK-8220387
>
> Thanks,
> /Jesper
>

From jesper.wilhelmsson at oracle.com Mon Mar 11 00:56:23 2019
From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com)
Date: Mon, 11 Mar 2019 01:56:23 +0100
Subject: RFR: JDK-8218074 - Update Graal
In-Reply-To:
References: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com>
Message-ID:

Hi Vladimir,

I followed the given instructions and used mx to create the diff:

mx --java-home=$JAVA_HOME updategraalinopenjdk cleanjdk/open 12

I assume JDK-8215322 is in there because it was actually pushed to OpenJDK and now it was overwritten (with the same diff). It seems the mx script does not realize that it's the same change.

The diff and patch and all changes made was done by the new script, so no extra human labor has been put into it at this point. If a human is required to modify the overwritten-diff I suggest that is part of the review step (as now) and that a new patch is simply added to the subtask.

I have removed JDK-8215322 from the diff now.

Thanks,
/Jesper

> On 9 Mar 2019, at 18:42, Vladimir Kozlov wrote:
>
> Hi Jesper,
>
> Thank you for doing this.
>
> I looked on 8220387 to see what changes your webrev does not have and it is strange. You should not have such big difference. How you prepared overwritten-diffs.txt?
>
> For example, JAOTC changes "8215322: add @file support to jaotc" are in overwritten-diffs.txt. But changeset is listed in the 8218074 Graal changelog list [1].
>
> Thanks,
> Vladimir
>
> [1] 8433917 Wed Dec 19 16:20:00 2018 -0800 Igor Ignatyev [GR-13142] Add at-file support to jaotc.
>
> On 3/9/19 12:46 AM, jesper.wilhelmsson at oracle.com wrote:
>> Hi,
>> Please review the patch to integrate the latest Graal changes into OpenJDK.
>> Graal tip to integrate: db79f81716886b7883370cd6ea1bbf5c42966fa5 >> JBS duplicates fixed by this integration: >> https://bugs.openjdk.java.net/browse/JDK-8217161 >> https://bugs.openjdk.java.net/browse/JDK-8218698 >> https://bugs.openjdk.java.net/browse/JDK-8218859 >> JBS duplicates deferred to the next integration: >> https://bugs.openjdk.java.net/browse/JDK-8214947 >> Bug: https://bugs.openjdk.java.net/browse/JDK-8218074 >> Webrev: http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/ >> This integration did overwrite changes already in place in OpenJDK. A subtask was created to hold the diff: https://bugs.openjdk.java.net/browse/JDK-8220387 >> Thanks, >> /Jesper -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From tobias.hartmann at oracle.com Mon Mar 11 08:32:25 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 11 Mar 2019 09:32:25 +0100 Subject: [13] RFR(S): 8218201: Failures when vmIntrinsics::_getClass is not inlined In-Reply-To: <9e883ca4-6591-d47f-ba01-174c1d28a8a7@oracle.com> References: <8b04ca12-3619-9e44-7f79-e50eeca7f16c@oracle.com> <9739ec8d-703b-a10b-9325-c88f213c1d8e@oracle.com> <1e422175-61a6-2f6e-f351-e23c17fad606@oracle.com> <8deb2b91-9a1e-b17e-bd70-c27190516ea6@oracle.com> <9e883ca4-6591-d47f-ba01-174c1d28a8a7@oracle.com> Message-ID: <713cb35a-990c-1dcd-22b4-a58d9c0eac69@oracle.com> Thanks Vladimir and Dean. Best regards, Tobias On 08.03.19 20:22, dean.long at oracle.com wrote: > +1 > > dl > > On 3/8/19 10:29 AM, Vladimir Kozlov wrote: >> Looks good. >> >> Thanks, >> Vladimir >> >> On 3/8/19 12:25 AM, Tobias Hartmann wrote: >>> Vladimir, Dean, thanks for the review. >>> >>> My first fix version had '_return_allocated = false' but then I incorrectly assumed that the >>> behavior would be the same as if we don't special case at all. 
I missed that the receiver is of >>> course also involved and you are right that we should not treat it as "global escape". >>> >>> New webrev: >>> http://cr.openjdk.java.net/~thartmann/8218201/webrev.01/ >>> >>> Best regards, >>> Tobias >>> >>> On 07.03.19 20:20, dean.long at oracle.com wrote: >>>> I agree. We don't want obj.getClass() to treat obj as "global escape". >>>> >>>> dl >>>> >>>> PS - Original change goes back to JDK-6488063. >>>> >>>> On 3/7/19 10:46 AM, Vladimir Kozlov wrote: >>>>> _getClass is special case because it is native call and we can't do bytecode analysis. >>>>> But we know what it does (it loads klass from object and mirror from klass) - no allocations, no >>>>> locals, no arguments returned. >>>>> I think it is simple missing _return_allocated = false setting in addition to _return_local = >>>>> false. _return_allocated is set to true by default optimistically. >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 3/7/19 4:54 AM, Tobias Hartmann wrote: >>>>>> Hi, >>>>>> >>>>>> please review the following patch: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8218201 >>>>>> http://cr.openjdk.java.net/~thartmann/8218201/webrev.00/ >>>>>> >>>>>> When intrinsification is disabled, the BCEscapeAnalyzer marks the return value of the (native) >>>>>> method Object::getClass as "return allocated value" which means that "only newly allocated >>>>>> unescaped >>>>>> objects are returned". The OptimizePtrCompare optimization then uses this information to >>>>>> incorrectly >>>>>> fold 'obj.getClass() == Object.class' (see TestGetClass.java:39) to always false. >>>>>> >>>>>> This is a very old issue and I can't trace back why a special case for the _getClass intrinsic >>>>>> has >>>>>> been added to the BCEscapeAnalyzer. Since I don't think we should make any assumptions about the >>>>>> returned Object, I've removed the special case.
>>>>>> >>>>>> Thanks, >>>>>> Tobias >>>>>> >>>> > From tobias.hartmann at oracle.com Mon Mar 11 08:33:03 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 11 Mar 2019 09:33:03 +0100 Subject: [13] RFR(S): 8220341: Class redefinition fails with assert(!is_unloaded()) failed: unloaded method on the stack In-Reply-To: <46e6d4b1-f164-c251-80bb-238cfe5eb9e0@oracle.com> References: <4ed958d5-729b-9d2b-8d8a-bcfd7bbdea40@oracle.com> <1cb7626a-b857-48a6-07c9-63e05372e948@oracle.com> <46e6d4b1-f164-c251-80bb-238cfe5eb9e0@oracle.com> Message-ID: <7f7018d9-d6e6-c37a-2630-4dafc0b0c709@oracle.com> Vladimir, Nils, thanks for the review! Best regards, Tobias On 08.03.19 19:11, Vladimir Kozlov wrote: > +1 > > Thanks, > Vladimir > > On 3/8/19 6:25 AM, Nils Eliasson wrote: >> Hi Tobias, >> >> Looks good! >> >> Regards, >> Nils >> >> >> On 2019-03-08 14:34, Tobias Hartmann wrote: >>> Hi, >>> >>> please review the following patch: >>> https://bugs.openjdk.java.net/browse/JDK-8220341 >>> http://cr.openjdk.java.net/~thartmann/8220341/webrev.00/ >>> >>> The assert added by 8163511 [1] is wrong and should be replaced by a check because class >>> redefinition can encounter unloaded methods in the compile queue during marking (they will be >>> removed from the queue later). >>> >>> Verified fix with failing tests at hs-tier6. Will run more tests. 
>>> >>> Thanks, >>> Tobias >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8163511 From doug.simon at oracle.com Mon Mar 11 09:45:29 2019 From: doug.simon at oracle.com (Doug Simon) Date: Mon, 11 Mar 2019 10:45:29 +0100 Subject: RFR: JDK-8218074 - Update Graal In-Reply-To: References: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com> Message-ID: <264E7B5C-BE4F-4702-95EB-2AD56020AF99@oracle.com> > On 11 Mar 2019, at 01:56, jesper.wilhelmsson at oracle.com wrote: > > Hi Vladimir, > > I followed the given instructions and used mx to create the diff: > > mx --java-home=$JAVA_HOME updategraalinopenjdk cleanjdk/open 12 > > I assume JDK-8215322 is in there because it was actually pushed to OpenJDK and now it was overwritten (with the same diff). It seems the mx script does not realize that it's the same change. Correct. To detect that we would have to add logic for diffs of diffs and the complexity didn't seem worth it. I'd be more than happy to review a smart person's attempt at adding it though ;-) -Doug > The diff and patch and all changes made was done by the new script, so no extra human labor has been put into it at this point. If a human is required to modify the overwritten-diff I suggest that is part of the review step (as now) and that a new patch is simply added to the subtask. > > I have removed JDK-8215322 from the diff now. > > Thanks, > /Jesper > >> On 9 Mar 2019, at 18:42, Vladimir Kozlov wrote: >> >> Hi Jesper, >> >> Thank you for doing this. >> >> I looked on 8220387 to see what changes your webrev does not have and it is strange. You should not have such big difference. How you prepared overwritten-diffs.txt? >> >> For example, JAOTC changes "8215322: add @file support to jaotc" are in overwritten-diffs.txt. But changeset is listed in the 8218074 Graal changelog list [1]. >> >> Thanks, >> Vladimir >> >> [1] 8433917 Wed Dec 19 16:20:00 2018 -0800 Igor Ignatyev [GR-13142] Add at-file support to jaotc.
>> >> On 3/9/19 12:46 AM, jesper.wilhelmsson at oracle.com wrote: >>> Hi, >>> Please review the patch to integrate the latest Graal changes into OpenJDK. >>> Graal tip to integrate: db79f81716886b7883370cd6ea1bbf5c42966fa5 >>> JBS duplicates fixed by this integration: >>> https://bugs.openjdk.java.net/browse/JDK-8217161 >>> https://bugs.openjdk.java.net/browse/JDK-8218698 >>> https://bugs.openjdk.java.net/browse/JDK-8218859 >>> JBS duplicates deferred to the next integration: >>> https://bugs.openjdk.java.net/browse/JDK-8214947 >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218074 >>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/ >>> This integration did overwrite changes already in place in OpenJDK. A subtask was created to hold the diff: https://bugs.openjdk.java.net/browse/JDK-8220387 >>> Thanks, >>> /Jesper > From adinn at redhat.com Mon Mar 11 09:51:57 2019 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 11 Mar 2019 09:51:57 +0000 Subject: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <35c1db6b-a238-1e1e-9986-3d1a31b00bc2@redhat.com> <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com> <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <7194e0cc-0f4f-7348-7b50-1347acbf9f92@redhat.com> <8bf4cc54-6e66-fab4-b3fe-4b026780924d@redhat.com> Message-ID: <553b872f-786c-951c-5c98-7db6993f02be@redhat.com> On 06/03/2019 02:05, Pengfei Li (Arm Technology China) wrote: > Hi Andrew Dinn, > >> What seems very odd to me is the difference between fmaxv and >> fminv. Both Q == 1 encodings (i.e. with sz in {0, 1}) are reserved >> for fmaxv. However, the encoding for fminv accepts both Q == 1 >> encodings with the expected interpretation. > > In the latest published version of the ArmARM doc, I don't see such > difference between fmaxv and fminv. 
I guess what you have seen might
> be a bug of the previous version docs.

Ok, I'm not sure what documentation you are looking at so I need to explain where I am getting my information.

Up until today I have been using the Arm ARM C.a, which is dated 20 December 2017. I just downloaded and checked the Arm ARM D.a, dated 31 October 2018, which I think is the latest version. It shows the same difference between fmaxv and fminv in the same section of the manual.

Looking at the 2018 manual (D.a) the details are on pages C7-1501 onwards. For fmaxv it says under this heading:

  Single-precision and double-precision variant
  . . .
  Decode for this encoding

  integer d = UInt(Rd);
  integer n = UInt(Rn);
  if sz:Q != '01' then UNDEFINED;
  integer esize = 32 << UInt(sz);
  integer datasize = if Q == '1' then 128 else 64;
  . . .

and then on page C7-1502 it says

  . . .
  For the single-precision and double-precision variant: is an arrangement specifier, encoded in the "Q:sz" field. It can have the following values:

  4S when Q = 1 , sz = 0

  The following encodings are reserved:
  - Q = 0 , sz = x .
  - Q = 1 , sz = 1 .

However, for fminv on page C7-1502 it says under the following heading

  Single-precision and double-precision variant
  Decode for this encoding

  integer d = UInt(Rd);
  integer n = UInt(Rn);
  integer m = UInt(Rm);
  if sz:Q == '10' then UNDEFINED;
  integer esize = 32 << UInt(sz);
  integer datasize = if Q == '1' then 128 else 64;
  integer elements = datasize DIV esize;

and later on page C7-1503 it says

  . . .
  For the single-precision and double-precision variant: is an arrangement specifier, encoded in the "sz:Q" field. It can have the following values:

  2S when sz = 0 , Q = 0
  4S when sz = 0 , Q = 1
  2D when sz = 1 , Q = 1

  The encoding sz = 1 , Q = 0 is reserved.

Do you have a later version of the manual than D.a? If not, are you looking at some other part of the manual? Note that the first decode is UNDEFINED if sz:Q != '01' and the second is UNDEFINED if sz:Q == '10'.
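To make the two quoted decode rules easier to compare, here is a minimal Python sketch that transcribes only the UNDEFINED checks and the arrangement tables quoted above. The function names are hypothetical, and the excerpts are labelled by position rather than by instruction name; nothing here is taken from HotSpot or from any Arm tooling.

```python
# Sketch only: a transcription of the two manual excerpts quoted above.
# Function names are invented for illustration.

def first_excerpt_arrangement(sz: int, q: int) -> str:
    """Excerpt quoted for fmaxv: if sz:Q != '01' then UNDEFINED."""
    if (sz, q) != (0, 1):
        raise ValueError(f"UNDEFINED encoding: sz={sz}, Q={q}")
    return "4S"  # the only value listed: Q = 1, sz = 0


def second_excerpt_arrangement(sz: int, q: int) -> str:
    """Second excerpt (pages C7-1502/1503): if sz:Q == '10' then UNDEFINED."""
    if (sz, q) == (1, 0):
        raise ValueError(f"UNDEFINED encoding: sz={sz}, Q={q}")
    table = {(0, 0): "2S", (0, 1): "4S", (1, 1): "2D"}
    return table[(sz, q)]


def accepted(decode):
    """Return every (sz, Q) pair the given decode rule accepts, with its arrangement."""
    result = []
    for sz in (0, 1):
        for q in (0, 1):
            try:
                result.append(((sz, q), decode(sz, q)))
            except ValueError:
                pass  # reserved / UNDEFINED encoding
    return result


print(accepted(first_excerpt_arrangement))   # [((0, 1), '4S')]
print(accepted(second_excerpt_arrangement))  # [((0, 0), '2S'), ((0, 1), '4S'), ((1, 1), '2D')]
```

Read this way, the first rule accepts a single encoding while the second rejects a single encoding, which is the asymmetry being asked about.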
>> Yes, I think it would probably be better to leave the assert in >> place and use the encoding implied by the SIMD_Arrangement >> parameter i.e. T4S ==> Q=1,sz=0 and T2D ==> Q=1, sz=1. That way the >> assert will catch errors in debug builds and non-debug builds >> should be stopped by a SIGILL exception. > > Thanks, I will do this and post a new webrev in a new thread then. Ok, thank you. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From Pengfei.Li at arm.com Mon Mar 11 10:32:37 2019 From: Pengfei.Li at arm.com (Pengfei Li (Arm Technology China)) Date: Mon, 11 Mar 2019 10:32:37 +0000 Subject: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: <553b872f-786c-951c-5c98-7db6993f02be@redhat.com> References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <35c1db6b-a238-1e1e-9986-3d1a31b00bc2@redhat.com> <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com> <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <7194e0cc-0f4f-7348-7b50-1347acbf9f92@redhat.com> <8bf4cc54-6e66-fab4-b3fe-4b026780924d@redhat.com> <553b872f-786c-951c-5c98-7db6993f02be@redhat.com> Message-ID: Hi Andrew Dinn, > Do you have a later version of the manual than D.a? If not, are you looking at > some other part of the manual? Note that the first decode is undefined if > sz:Q != '01' and the second is UNDEFINED if sz:Q == '10'. Ah~ Perhaps the armARM doc is hard to read sometimes. In the D.a version: - The FMAXV instruction is defined in section C7.2.103 on page 1501-1502. - The FMINV instruction is defined in section C7.2.113 on page 1521-1522. - Section C7.2.104 which follows the FMAXV section defines the "FMIN (vector)" instruction.
This is actually the binary vector operation, not used for reduction fmin. The armARM just sorts these sections in alphabetical order. Opposite instructions are not necessarily adjacent. > > Thanks, I will do this and post a new webrev in a new thread then. > > Ok, thank you. I've already posted it in a new thread. See https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2019-March/007039.html -- Thanks, Pengfei From adinn at redhat.com Mon Mar 11 11:30:59 2019 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 11 Mar 2019 11:30:59 +0000 Subject: [aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request In-Reply-To: References: <02FCFB8477C4EF43A2AD8E0C60F3DA2BB1A70A33@FMSMSX126.amr.corp.intel.com> <3a850f71-0c13-135b-5150-4bdf46654a74@oracle.com> <806a3da6-7125-0ce3-4ec5-d352d7bdcf50@oracle.com> <0ebdb182-2b44-207d-81b7-e1dc1d19150e@oracle.com> <7194e0cc-0f4f-7348-7b50-1347acbf9f92@redhat.com> <8bf4cc54-6e66-fab4-b3fe-4b026780924d@redhat.com> <553b872f-786c-951c-5c98-7db6993f02be@redhat.com> Message-ID: <606ea437-99f4-1d0a-f03b-28f069c9e20d@redhat.com> Hi Pengfei, On 11/03/2019 10:32, Pengfei Li (Arm Technology China) wrote: > Ah~ Perhaps the armARM doc is hard to read sometimes. Doh! It is obviously difficult for me to read :-) > In the D.a version: - The FMAXV instruction is defined in section > C7.2.103 on page 1501-1502. - The FMINV instruction is defined in > section C7.2.113 on page 1521-1522. - Section C7.2.104 which follows > the FMAXV section defines the "FMIN (vector)" instruction. This is > actually the binary vector operation, not used for reduction fmin. > > The armARM just sorts these sections in alphabetical order. Opposite > instructions are not necessarily adjacent. Thank you for the correction! > I've already posted it in a new thread. See > https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2019-March/007039.html Yes, got it. I will reply in that thread. 
regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From adinn at redhat.com Mon Mar 11 11:39:20 2019 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 11 Mar 2019 11:39:20 +0000 Subject: [aarch64-port-dev ] RFR(S): 8214922: Add vectorization support for fmin/fmax In-Reply-To: References: <87d0pv2iow.fsf@redhat.com> <877eg32bzq.fsf@redhat.com> <871s6a3map.fsf@redhat.com> <87va371n6b.fsf@redhat.com> <40d1a9a7-47f3-4e13-032d-70932b03d215@redhat.com> Message-ID: <954d6760-a914-747b-aab5-b928490510a4@redhat.com> Hi Pengfei, On 07/03/2019 09:26, Pengfei Li (Arm Technology China) wrote: > Please see below updated webrev for the pending patch of fmin/fmax > vectorization. > The only difference between webrev.03 and webrev.02 is the > hard-coded > arrangement bits in fmaxv/fminv encodings are replaced. > > webrev: http://cr.openjdk.java.net/~pli/rfr/8214922/webrev.03/ > JBS: https://bugs.openjdk.java.net/browse/JDK-8214922 Yes, that looks good to push. Thanks. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From claes.redestad at oracle.com Mon Mar 11 12:48:34 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Mon, 11 Mar 2019 13:48:34 +0100 Subject: RFR: 8220420: Cleanup c1_LinearScan Message-ID: Hi, please review this small optimization and cleanup of c1_LinearScan.
Webrev: http://cr.openjdk.java.net/~redestad/8220420/open.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8220420 Testing: tier1-3 /Claes From tobias.hartmann at oracle.com Mon Mar 11 12:57:12 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 11 Mar 2019 13:57:12 +0100 Subject: RFR: 8220420: Cleanup c1_LinearScan In-Reply-To: References: Message-ID: Hi Claes, looks good to me. Best regards, Tobias On 11.03.19 13:48, Claes Redestad wrote: > Hi, > > please review this small optimization and cleanup of c1_LinearScan. > > Webrev: http://cr.openjdk.java.net/~redestad/8220420/open.00/ > Bug:??? https://bugs.openjdk.java.net/browse/JDK-8220420 > > Testing: tier1-3 > > /Claes From claes.redestad at oracle.com Mon Mar 11 13:04:06 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Mon, 11 Mar 2019 14:04:06 +0100 Subject: RFR: 8220420: Cleanup c1_LinearScan In-Reply-To: References: Message-ID: On 2019-03-11 13:57, Tobias Hartmann wrote: > Hi Claes, > > looks good to me. Thanks, Tobias! /Claes From nils.eliasson at oracle.com Mon Mar 11 13:31:02 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Mon, 11 Mar 2019 14:31:02 +0100 Subject: RFR: 8220420: Cleanup c1_LinearScan In-Reply-To: References: Message-ID: I like it, Reviewed! // Nils On 2019-03-11 13:48, Claes Redestad wrote: > Hi, > > please review this small optimization and cleanup of c1_LinearScan. > > Webrev: http://cr.openjdk.java.net/~redestad/8220420/open.00/ > Bug:??? 
https://bugs.openjdk.java.net/browse/JDK-8220420 > > Testing: tier1-3 > > /Claes From nils.eliasson at oracle.com Mon Mar 11 13:33:10 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Mon, 11 Mar 2019 14:33:10 +0100 Subject: RFR(S): 8219517: assert(false) failed: infinite loop in PhaseIterGVN::optimize In-Reply-To: <70607611-a318-7634-2c35-bccee4300ffd@oracle.com> References: <70607611-a318-7634-2c35-bccee4300ffd@oracle.com> Message-ID: <0e174fc7-eb7d-77f3-0932-f1bec0e7a27d@oracle.com> On 2019-03-08 19:27, Vladimir Kozlov wrote: > Hi Nils, > > What if it is normal diamond shaped graph and no loop? It can't be. The load's memory edge is 'mem'. 'mem' is a phi, and we loop over its in-edges. The optimization only triggers when: mem == opt_mem_chain(mem->in(x)). That can only happen when there is a loop. (And my addition is to block the optimization when both are loops: mem == opt_mem_chain(mem->in(1)) == opt_mem_chain(mem->in(2)).) Regards, / Nils > You can have case when both branches point to the same memory too. May > be add is_Loop() check. > > Thanks, > Vladimir > > > On 3/8/19 2:52 AM, Nils Eliasson wrote: >> Hi all, >> >> Background: >> >> We can get stuck in an infinite loop in IGVN. The method reproducing >> the problem is quite a big graph, and after some optimization, a huge >> loop will die. But since it is so big, it takes a while before it has >> been pruned. >> >> In an inner loop there is a phi on memory that gets reduced to a self >> looping heart, with a membar on each in edge. There is also a >> connected region that keeps it alive. (From the start there is other >> memory state coming into this loop, but it gets disconnected early >> when the loop dies.) >>
>> +---+                 +---+
>> |   v                 v   |
>> | Membar +-+ +---+ Membar |
>> |          | |            |
>> |          v v            |
>> |          Phi            |
>> |          + + +          |
>> |          | | |          |
>> +----------+ | +----------+
>>              |
>>              v
>>            LoadN
>> >> In IGVN, Ideal() will be called on the Load. >> >> On iteration 1 - A split_through_phi on one edge will be performed, >> because we can prove that other edge of the phi is a loop. Now the >> Load hangs of one of the membars. >> >> On iteration 2 - Optimize_memory_chain will suggest the in to the >> membar as a more ideal memory, and then the load get the phi back as >> the memory input. >> >> Repeat. >> >> I have gone great lengths to show that this code is part of a huge >> loop, that is dead, and will be eliminated in due time. >> >> My suggested solution to breaking the infinite loop, is to change the >> first case, by simply not perform the memory replacement when both >> inputs are self loops. >> >> https://bugs.openjdk.java.net/browse/JDK-8219517 >> >> http://cr.openjdk.java.net/~neliasso/8219517/webrev.01/ >> >> Regards, >> >> Nils >> >> >> >> From stefan.karlsson at oracle.com Mon Mar 11 14:23:53 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 11 Mar 2019 15:23:53 +0100 Subject: RFR: 8220343: Move scavenge_root_nmethods from shared code Message-ID: <312e77d3-e624-7deb-509e-89b1ee96124f@oracle.com> Hi all, Please review this patch to move the scavenge root code out from CodeCache and nmethod. http://cr.openjdk.java.net/~stefank/8220343/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8220343 The CodeCache::scavenge_root_nmethods_do function and its implementation in CodeCache and nmethod are only used by a subset of our GCs (Serial, Parallel, and CMS), but not by G1, ZGC, Shenandoah, or Epsilon. I want to move all of that into the GC subsystem and then only let those GCs using it pay the cost of having that code. This is a continuation of the work of the GC Interface, where G1, ZGC, Shenandoah, and Epsilon use the register_nmethod, unregister_nmethod, and flush_nmethod calls, but the other GCs don't.
This patch builds upon: JDK-8220411: Remove ScavengeRootsInCode=0 code https://bugs.openjdk.java.net/browse/JDK-8220411 and also depends on the the resolution of: JDK-8220342: Remove scavenge_root_nmethods_do from VM_HeapWalkOperation::collect_simple_roots https://bugs.openjdk.java.net/browse/JDK-8220342 Thanks, StefanK From claes.redestad at oracle.com Mon Mar 11 15:49:48 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Mon, 11 Mar 2019 16:49:48 +0100 Subject: RFR: 8220420: Cleanup c1_LinearScan In-Reply-To: References: Message-ID: <5f40d056-daee-7a2d-a2cc-5da4d3c26a20@oracle.com> On 2019-03-11 14:31, Nils Eliasson wrote: > I like it, > > Reviewed! Thanks, Nils! /Claes From vladimir.kozlov at oracle.com Mon Mar 11 16:55:36 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 11 Mar 2019 09:55:36 -0700 Subject: RFR: 8220343: Move scavenge_root_nmethods from shared code In-Reply-To: <312e77d3-e624-7deb-509e-89b1ee96124f@oracle.com> References: <312e77d3-e624-7deb-509e-89b1ee96124f@oracle.com> Message-ID: Looks good to me. Please, test with Graal (at least tier3). Thanks, Vladimir On 3/11/19 7:23 AM, Stefan Karlsson wrote: > Hi all, > > Please review this patch to move the scavenge root code out from CodeCache and nmethod. > > http://cr.openjdk.java.net/~stefank/8220343/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8220343 > > The CodeCache::scavenge_root_nmethods_do function and its implementation in CodeCache and nmethod, is only used by set > of our GCs (Serial, Parallel, and CMS), but not by G1, ZGC, Shenandoah, or Epsilon. I want to move all of that into GC > subsystem and then only let those GCs using it pay the cost of having that code. > > This is a continuation of the work of the GC Interface, where G1, ZGC, Shenandoah, and Epsilon, uses the > register_nmethod, unregister_nmethod, and flush_nmethod calls, but the other GCs don't. 
> > This patch builds upon: > ?JDK-8220411: Remove ScavengeRootsInCode=0 code > ?https://bugs.openjdk.java.net/browse/JDK-8220411 > > and also depends on the the resolution of: > ?JDK-8220342: Remove scavenge_root_nmethods_do from VM_HeapWalkOperation::collect_simple_roots > ?https://bugs.openjdk.java.net/browse/JDK-8220342 > > Thanks, > StefanK From vladimir.kozlov at oracle.com Mon Mar 11 17:56:38 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 11 Mar 2019 10:56:38 -0700 Subject: RFR: JDK-8218074 - Update Graal In-Reply-To: <264E7B5C-BE4F-4702-95EB-2AD56020AF99@oracle.com> References: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com> <264E7B5C-BE4F-4702-95EB-2AD56020AF99@oracle.com> Message-ID: <0962869c-b281-edde-41d1-f76fe15d3dae@oracle.com> I am fine doing small manual labor to find what is missing and what is overwritten. For that I need some information. First, how overwritten-diffs.txt was created? If it is created by using 'hg' to find all changes done in OpenJDK since last 'Update Graal' then I would understand why changes are in this file. Next, can you attach to bug report original patch created by updategraalinopenjdk before applying it to OpenJDK sources? Do we have it? This way I can compare overwritten-diffs.txt with it and find if there duplicate and I don't need to re-apply changes. Some changes were pushed into OpenJDK because we did not want to wait a fix for months - last updated was done 3 months ago!!! If we do more frequent updates overwritten-diffs.txt will be empty or very small. Manual work should not take a lot of time in such case. 
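The comparison described above, checking overwritten-diffs.txt against the original update patch to spot duplicates, could be approximated with a short script. The following is only a sketch under stated assumptions: both inputs are unified diffs, both use the same file paths, and a change counts as a duplicate when every added or removed line for a file also appears in the other patch. The function names and the demo patches are hypothetical; this is not how updategraalinopenjdk works.

```python
from collections import defaultdict


def changed_lines(patch_text):
    """Map each file path in a unified diff to its set of '+'/'-' lines."""
    changes = defaultdict(set)
    current = None
    lines = patch_text.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        # A file header is a '--- ' line immediately followed by a '+++ ' line.
        if line.startswith("--- ") and i + 1 < len(lines) and lines[i + 1].startswith("+++ "):
            path = lines[i + 1][4:].split("\t")[0].strip()
            # Drop the conventional a/ or b/ prefix if present.
            current = path[2:] if path.startswith(("a/", "b/")) else path
            i += 2
            continue
        if current is not None and line.startswith(("+", "-")):
            changes[current].add(line)
        i += 1
    return changes


def already_upstream(overwritten_text, update_text):
    """Files whose overwritten changes all appear in the update patch."""
    overwritten = changed_lines(overwritten_text)
    update = changed_lines(update_text)
    return sorted(
        path for path, lines in overwritten.items()
        if lines <= update.get(path, set())
    )


# Tiny made-up patches, only to show the shape of the inputs.
overwritten_demo = """\
--- a/src/Foo.java
+++ b/src/Foo.java
@@ -1,2 +1,2 @@
-int x = 1;
+int x = 2;
--- a/src/Bar.java
+++ b/src/Bar.java
@@ -1 +1 @@
-old
+local fix
"""

update_demo = """\
--- a/src/Foo.java
+++ b/src/Foo.java
@@ -1,2 +1,3 @@
-int x = 1;
+int x = 2;
+int y = 3;
"""

print(already_upstream(overwritten_demo, update_demo))  # ['src/Foo.java']
```

Files that are not reported (src/Bar.java in the demo) would be the ones that still need a manual look. Matching on line sets ignores hunk positions and context, so it can over-report duplicates; a real check would likely need to be stricter.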
Thanks, Vladimir On 3/11/19 2:45 AM, Doug Simon wrote: > > >> On 11 Mar 2019, at 01:56, jesper.wilhelmsson at oracle.com wrote: >> >> Hi Vladimir, >> >> I followed the given instructions and used mx to create the diff: >> >> mx --java-home=$JAVA_HOME updategraalinopenjdk cleanjdk/open 12 >> >> I assume JDK-8215322 is in there because it was actually pushed to OpenJDK and now it was overwritten (with the same diff). It seems the mx script do not realize that it's the same cahange. > > Correct. To detect that we would have to add logic for diffs of diffs and the complexity didn?t seem worth it. I?d more more than happy to review a smart person?s attempt at adding it though ;-) > > -Doug > >> The diff and patch and all changes made was done by the new script, so no extra human labor has been put into it at this point. If a human is required to modify the overwritten-diff I suggest that is part of the review step (as now) and that a new patch is simply added to the subtask. >> >> I have removed JDK-8215322 from the diff now. >> >> Thanks, >> /Jesper >> >>> On 9 Mar 2019, at 18:42, Vladimir Kozlov wrote: >>> >>> Hi Jesper, >>> >>> Thank you for doing this. >>> >>> I looked on 8220387 to see what changes your webrev does not have and it is strange. You should not have such big difference. How you prepared overwritten-diffs.txt? >>> >>> For example, JAOTC changes "8215322: add @file support to jaotc" are in overwritten-diffs.txt. But changeset is listed in the 8218074 Graal changelog list [1]. >>> >>> Thanks, >>> Vladimir >>> >>> [1] 8433917 Wed Dec 19 16:20:00 2018 -0800 Igor Ignatyev [GR-13142] Add at-file support to jaotc. >>> >>> On 3/9/19 12:46 AM, jesper.wilhelmsson at oracle.com wrote: >>>> Hi, >>>> Please review the patch to integrate the latest Graal changes into OpenJDK. 
>>>> Graal tip to integrate: db79f81716886b7883370cd6ea1bbf5c42966fa5 >>>> JBS duplicates fixed by this integration: >>>> https://bugs.openjdk.java.net/browse/JDK-8217161 >>>> https://bugs.openjdk.java.net/browse/JDK-8218698 >>>> https://bugs.openjdk.java.net/browse/JDK-8218859 >>>> JBS duplicates deferred to the next integration: >>>> https://bugs.openjdk.java.net/browse/JDK-8214947 >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218074 >>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/ >>>> This integration did overwrite changes already in place in OpenJDK. A subtask was created to hold the diff: https://bugs.openjdk.java.net/browse/JDK-8220387 >>>> Thanks, >>>> /Jesper >> > From vladimir.kozlov at oracle.com Mon Mar 11 18:16:51 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 11 Mar 2019 11:16:51 -0700 Subject: RFR(S): 8219517: assert(false) failed: infinite loop in PhaseIterGVN::optimize In-Reply-To: <0e174fc7-eb7d-77f3-0932-f1bec0e7a27d@oracle.com> References: <70607611-a318-7634-2c35-bccee4300ffd@oracle.com> <0e174fc7-eb7d-77f3-0932-f1bec0e7a27d@oracle.com> Message-ID: <40cefb9c-7e8a-5358-93a7-d0cba0eea81e@oracle.com> On 3/11/19 6:33 AM, Nils Eliasson wrote: > On 2019-03-08 19:27, Vladimir Kozlov wrote: >> Hi Nils, >> >> What if it is normal diamond shaped graph and no loop? > > It can't be. The loads memory edge is 'mem'. 'mem' is a phi, and we loop over its in-edges . The optimization only > triggers when: mem ==opt_mem_chain(mem->in(x)). That can only happen when there is a loop. Okay, I got it. > > (And my addition is to block the optimization when both are loops: mem == opt_mem_chain(mem->in(1)) == > opt_mem_chain(mem->in(2)). Good. Reviewed. Thanks, Vladimir > > Regards, > > / Nils > >> You can have case when both branches point to the same memory too. May be add is_Loop() check. 
>> >> Thanks, >> Vladimir >> >> >> On 3/8/19 2:52 AM, Nils Eliasson wrote: >>> Hi all, >>> >>> Background: >>> >>> We can get stuck in an infinite loop in IGVN. The method reproducing the problem is quite a big graph, and after some >>> optimization, a huge loop will die. But since it is so big, it takes a while before it has been pruned. >>> >>> In an inner loop there is a phi on memory that gets reduced to a self looping heart, with a membar on each in edge. >>> There is also a connected region that keeps it alive. (From the start there is other memory state coming into this >>> loop, but it gets disconnected early when the loop dies.) >>>
>>> +---+                 +---+
>>> |   v                 v   |
>>> | Membar +-+ +---+ Membar |
>>> |          | |            |
>>> |          v v            |
>>> |          Phi            |
>>> |          + + +          |
>>> |          | | |          |
>>> +----------+ | +----------+
>>>              |
>>>              v
>>>            LoadN
>>> >>> In IGVN, Ideal() will be called on the Load. >>> >>> On iteration 1 - A split_through_phi on one edge will be performed, because we can prove that other edge of the phi >>> is a loop. Now the Load hangs of one of the membars. >>> >>> On iteration 2 - Optimize_memory_chain will suggest the in to the membar as a more ideal memory, and then the load >>> get the phi back as the memory input. >>> >>> Repeat. >>> >>> I have gone great lengths to show that this code is part of a huge loop, that is dead, and will be eliminated in due >>> time. >>> >>> My suggested solution to breaking the infinite loop, is to change the first case, by simply not perform the memory >>> replacement when both inputs are self loops.
>>> >>> https://bugs.openjdk.java.net/browse/JDK-8219517 >>> >>> http://cr.openjdk.java.net/~neliasso/8219517/webrev.01/ >>> >>> Regards, >>> >>> Nils >>> >>> >>> >>> From doug.simon at oracle.com Mon Mar 11 19:19:18 2019 From: doug.simon at oracle.com (Doug Simon) Date: Mon, 11 Mar 2019 20:19:18 +0100 Subject: RFR: JDK-8218074 - Update Graal In-Reply-To: <0962869c-b281-edde-41d1-f76fe15d3dae@oracle.com> References: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com> <264E7B5C-BE4F-4702-95EB-2AD56020AF99@oracle.com> <0962869c-b281-edde-41d1-f76fe15d3dae@oracle.com> Message-ID: <82828C39-E5E7-40FE-9600-4EA45D233B2E@oracle.com> > On 11 Mar 2019, at 18:56, Vladimir Kozlov wrote: > > I am fine doing small manual labor to find what is missing and what is overwritten. For that I need some information. > > First, how overwritten-diffs.txt was created? If it is created by using 'hg' to find all changes done in OpenJDK since last 'Update Graal' then I would understand why changes are in this file. Yes, that's how it's done: https://github.com/oracle/graal/blob/master/compiler/mx.compiler/mx_updategraalinopenjdk.py#L317 -Doug > Next, can you attach to bug report original patch created by updategraalinopenjdk before applying it to OpenJDK sources? Do we have it? This way I can compare overwritten-diffs.txt with it and find if there duplicate and I don't need to re-apply changes. > > Some changes were pushed into OpenJDK because we did not want to wait a fix for months - last updated was done 3 months ago!!! If we do more frequent updates overwritten-diffs.txt will be empty or very small. Manual work should not take a lot of time in such case.
>
> Thanks,
> Vladimir
>
> On 3/11/19 2:45 AM, Doug Simon wrote:
>>> On 11 Mar 2019, at 01:56, jesper.wilhelmsson at oracle.com wrote:
>>>
>>> Hi Vladimir,
>>>
>>> I followed the given instructions and used mx to create the diff:
>>>
>>> mx --java-home=$JAVA_HOME updategraalinopenjdk cleanjdk/open 12
>>>
>>> I assume JDK-8215322 is in there because it was actually pushed to OpenJDK and now it was overwritten (with the same diff). It seems the mx script do not realize that it's the same change.
>> Correct. To detect that we would have to add logic for diffs of diffs and the complexity didn't seem worth it. I'd be more than happy to review a smart person's attempt at adding it though ;-)
>> -Doug
>>> The diff and patch and all changes made was done by the new script, so no extra human labor has been put into it at this point. If a human is required to modify the overwritten-diff I suggest that is part of the review step (as now) and that a new patch is simply added to the subtask.
>>>
>>> I have removed JDK-8215322 from the diff now.
>>>
>>> Thanks,
>>> /Jesper
>>>
>>>> On 9 Mar 2019, at 18:42, Vladimir Kozlov wrote:
>>>>
>>>> Hi Jesper,
>>>>
>>>> Thank you for doing this.
>>>>
>>>> I looked on 8220387 to see what changes your webrev does not have and it is strange. You should not have such big difference. How you prepared overwritten-diffs.txt?
>>>>
>>>> For example, JAOTC changes "8215322: add @file support to jaotc" are in overwritten-diffs.txt. But changeset is listed in the 8218074 Graal changelog list [1].
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> [1] 8433917 Wed Dec 19 16:20:00 2018 -0800 Igor Ignatyev [GR-13142] Add at-file support to jaotc.
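As a rough idea of what the "diffs of diffs" logic mentioned in this exchange might involve, here is a minimal Python sketch. It is not how the mx script works and it is not part of mx; the function names `split_hunks` and `already_upstream` are invented here, and the sketch assumes both inputs are plain unified diffs made against the same base. It drops the `@@` headers so that a hunk still matches when only its line numbers have shifted, then reports the hunks that occur verbatim in both diffs.

```python
def split_hunks(diff_text):
    """Return {path: set of hunk bodies} for a unified diff.

    '@@' headers are dropped so that a hunk compares equal even when its
    line numbers have shifted. Limitation of this sketch: a removed line
    whose text itself starts with '-- ' would be mistaken for a file header.
    """
    result = {}
    path = None
    body = []
    # The trailing sentinel makes sure the last hunk is flushed.
    for line in diff_text.splitlines() + ["@@ end"]:
        if line.startswith("--- "):
            continue
        if line.startswith("+++ ") or line.startswith("@@"):
            if path is not None and body:
                result.setdefault(path, set()).add(tuple(body))
            body = []
            if line.startswith("+++ "):
                path = line[4:].split("\t")[0]
                if path.startswith("b/"):
                    path = path[2:]
            continue
        if path is not None and line[:1] in ("+", "-", " "):
            body.append(line)
    return result


def already_upstream(openjdk_diff, graal_diff):
    """Hunks of openjdk_diff that graal_diff contains verbatim, per file."""
    ours = split_hunks(openjdk_diff)
    theirs = split_hunks(graal_diff)
    common = {}
    for path, hunk_set in ours.items():
        shared = hunk_set & theirs.get(path, set())
        if shared:
            common[path] = shared
    return common
```

Hunks reported by `already_upstream` would be candidates for dropping from the overwritten diff; anything else still needs a human look, which matches the review step described above.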
>>>>> Graal tip to integrate: db79f81716886b7883370cd6ea1bbf5c42966fa5 >>>>> JBS duplicates fixed by this integration: >>>>> https://bugs.openjdk.java.net/browse/JDK-8217161 >>>>> https://bugs.openjdk.java.net/browse/JDK-8218698 >>>>> https://bugs.openjdk.java.net/browse/JDK-8218859 >>>>> JBS duplicates deferred to the next integration: >>>>> https://bugs.openjdk.java.net/browse/JDK-8214947 >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218074 >>>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/ >>>>> This integration did overwrite changes already in place in OpenJDK. A subtask was created to hold the diff: https://bugs.openjdk.java.net/browse/JDK-8220387 >>>>> Thanks, >>>>> /Jesper >>> From nils.eliasson at oracle.com Mon Mar 11 19:34:37 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Mon, 11 Mar 2019 20:34:37 +0100 Subject: RFR(S): 8219517: assert(false) failed: infinite loop in PhaseIterGVN::optimize In-Reply-To: <40cefb9c-7e8a-5358-93a7-d0cba0eea81e@oracle.com> References: <70607611-a318-7634-2c35-bccee4300ffd@oracle.com> <0e174fc7-eb7d-77f3-0932-f1bec0e7a27d@oracle.com> <40cefb9c-7e8a-5358-93a7-d0cba0eea81e@oracle.com> Message-ID: Thank you, Vladimir! // Nils On 2019-03-11 19:16, Vladimir Kozlov wrote: > On 3/11/19 6:33 AM, Nils Eliasson wrote: >> On 2019-03-08 19:27, Vladimir Kozlov wrote: >>> Hi Nils, >>> >>> What if it is normal diamond shaped graph and no loop? >> >> It can't be. The loads memory edge is 'mem'. 'mem' is a phi, and we >> loop over its in-edges . The optimization only triggers when: mem >> ==opt_mem_chain(mem->in(x)). That can only happen when there is a loop. > > Okay, I got it. > >> >> (And my addition is to block the optimization when both are loops: >> mem == opt_mem_chain(mem->in(1)) == opt_mem_chain(mem->in(2)). > > Good. Reviewed. > > Thanks, > Vladimir > >> >> Regards, >> >> / Nils >> >>> You can have case when both branches point to the same memory too. >>> May be add is_Loop() check. 
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>
>>> On 3/8/19 2:52 AM, Nils Eliasson wrote:
>>>> Hi all,
>>>>
>>>> Background:
>>>>
>>>> We can get stuck in an infinite loop in IGVN. The method
>>>> reproducing the problem is quite a big graph, and after some
>>>> optimization, a huge loop will die. But since it is so big, it
>>>> takes a while before it has been pruned.
>>>>
>>>> In an inner loop there is a phi on memory that gets reduced to a
>>>> self looping heart, with a membar on each in edge. There is also a
>>>> connected region that keeps it alive. (From the start there is
>>>> other memory state coming into this loop, but it gets disconnected
>>>> early when the loop dies.)
>>>>
>>>> +---+                 +---+
>>>> |   v                 v   |
>>>> | Membar +-+ +---+ Membar |
>>>> |          | |            |
>>>> |          v v            |
>>>> |          Phi            |
>>>> |          + + +          |
>>>> |          | | |          |
>>>> +----------+ | +----------+
>>>>              |
>>>>              v
>>>>            LoadN
>>>>
>>>> In IGVN, Ideal() will be called on the Load.
>>>>
>>>> On iteration 1 - A split_through_phi on one edge will be performed,
>>>> because we can prove that other edge of the phi is a loop. Now the
>>>> Load hangs of one of the membars.
>>>>
>>>> On iteration 2 - Optimize_memory_chain will suggest the in to the
>>>> membar as a more ideal memory, and then the load get the phi back
>>>> as the memory input.
>>>>
>>>> Repeat.
>>>>
>>>> I have gone great lengths to show that this code is part of a huge
>>>> loop, that is dead, and will be eliminated in due time.
>>>>
>>>> My suggested solution to breaking the infinite loop, is to change
>>>> the first case, by simply not perform the memory replacement when
>>>> both inputs are self loops.
>>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8219517 >>>> >>>> http://cr.openjdk.java.net/~neliasso/8219517/webrev.01/ >>>> >>>> Regards, >>>> >>>> Nils >>>> >>>> >>>> >>>> From jesper.wilhelmsson at oracle.com Mon Mar 11 20:55:51 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Mon, 11 Mar 2019 21:55:51 +0100 Subject: RFR: JDK-8218074 - Update Graal In-Reply-To: <0962869c-b281-edde-41d1-f76fe15d3dae@oracle.com> References: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com> <264E7B5C-BE4F-4702-95EB-2AD56020AF99@oracle.com> <0962869c-b281-edde-41d1-f76fe15d3dae@oracle.com> Message-ID: <78CAC4B7-B144-43B9-B684-6FE60F90099D@oracle.com> > On 11 Mar 2019, at 18:56, Vladimir Kozlov wrote: > > I am fine doing small manual labor to find what is missing and what is overwritten. For that I need some information. > > First, how overwritten-diffs.txt was created? If it is created by using 'hg' to find all changes done in OpenJDK since last 'Update Graal' then I would understand why changes are in this file. > > Next, can you attach to bug report original patch created by updategraalinopenjdk before applying it to OpenJDK sources? Do we have it? This way I can compare overwritten-diffs.txt with it and find if there duplicate and I don't need to re-apply changes. The mx script applies the patch automatically so I don't have it as a separate patch. But the only changes that my script does after the mx script has applied the patch is in the test directory, so the patch available in the webrev should be what the mx script created if you ignore anything in test. > Some changes were pushed into OpenJDK because we did not want to wait a fix for months - last updated was done 3 months ago!!! If we do more frequent updates overwritten-diffs.txt will be empty or very small. Manual work should not take a lot of time in such case. 
Yes, the intention is to do this more often, hopefully on a weekly basis going forward, and that should minimize the need to push to the OpenJDK as you say. I hope this "first" integration is a one-off in that respect. Obviously how often we can integrate depends on how much time the review takes ;-)
/Jesper

>
> Thanks,
> Vladimir
>
> On 3/11/19 2:45 AM, Doug Simon wrote:
>>> On 11 Mar 2019, at 01:56, jesper.wilhelmsson at oracle.com wrote:
>>>
>>> Hi Vladimir,
>>>
>>> I followed the given instructions and used mx to create the diff:
>>>
>>> mx --java-home=$JAVA_HOME updategraalinopenjdk cleanjdk/open 12
>>>
>>> I assume JDK-8215322 is in there because it was actually pushed to OpenJDK and now it was overwritten (with the same diff). It seems the mx script do not realize that it's the same change.
>> Correct. To detect that we would have to add logic for diffs of diffs and the complexity didn't seem worth it. I'd be more than happy to review a smart person's attempt at adding it though ;-)
>> -Doug
>>> The diff and patch and all changes made was done by the new script, so no extra human labor has been put into it at this point. If a human is required to modify the overwritten-diff I suggest that is part of the review step (as now) and that a new patch is simply added to the subtask.
>>>
>>> I have removed JDK-8215322 from the diff now.
>>>
>>> Thanks,
>>> /Jesper
>>>
>>>> On 9 Mar 2019, at 18:42, Vladimir Kozlov wrote:
>>>>
>>>> Hi Jesper,
>>>>
>>>> Thank you for doing this.
>>>>
>>>> I looked on 8220387 to see what changes your webrev does not have and it is strange. You should not have such big difference. How you prepared overwritten-diffs.txt?
>>>>
>>>> For example, JAOTC changes "8215322: add @file support to jaotc" are in overwritten-diffs.txt. But changeset is listed in the 8218074 Graal changelog list [1].
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> [1] 8433917 Wed Dec 19 16:20:00 2018 -0800 Igor Ignatyev [GR-13142] Add at-file support to jaotc.
>>>>
>>>> On 3/9/19 12:46 AM, jesper.wilhelmsson at oracle.com wrote:
>>>>> Hi,
>>>>> Please review the patch to integrate the latest Graal changes into OpenJDK.
>>>>> Graal tip to integrate: db79f81716886b7883370cd6ea1bbf5c42966fa5
>>>>> JBS duplicates fixed by this integration:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8217161
>>>>> https://bugs.openjdk.java.net/browse/JDK-8218698
>>>>> https://bugs.openjdk.java.net/browse/JDK-8218859
>>>>> JBS duplicates deferred to the next integration:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8214947
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218074
>>>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/
>>>>> This integration did overwrite changes already in place in OpenJDK. A subtask was created to hold the diff: https://bugs.openjdk.java.net/browse/JDK-8220387
>>>>> Thanks,
>>>>> /Jesper
>>>

From stefan.karlsson at oracle.com Mon Mar 11 21:59:37 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Mon, 11 Mar 2019 22:59:37 +0100
Subject: RFR: 8220343: Move scavenge_root_nmethods from shared code
In-Reply-To:
References: <312e77d3-e624-7deb-509e-89b1ee96124f@oracle.com>
Message-ID: <6a9a3bd2-2713-8206-efab-01f6cd4194e8@oracle.com>

Hi Vladimir,

On 2019-03-11 17:55, Vladimir Kozlov wrote:
> Looks good to me. Please, test with Graal (at least tier3).

Thanks for reviewing. Testing of this showed that I inadvertently removed the following line:

 https://cr.openjdk.java.net/~stefank/8220343/webrev.02.delta
 https://cr.openjdk.java.net/~stefank/8220343/webrev.02

Rerunning test right now, including Graal testing.

Thanks,
StefanK

>
> Thanks,
> Vladimir
>
> On 3/11/19 7:23 AM, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to move the scavenge root code out from
>> CodeCache and nmethod.
>>
>> http://cr.openjdk.java.net/~stefank/8220343/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8220343
>>
>> The CodeCache::scavenge_root_nmethods_do function and its
>> implementation in CodeCache and nmethod is only used by a set of our
>> GCs (Serial, Parallel, and CMS), but not by G1, ZGC, Shenandoah, or
>> Epsilon. I want to move all of that into the GC subsystem and then only
>> let those GCs using it pay the cost of having that code.
>>
>> This is a continuation of the work of the GC Interface, where G1,
>> ZGC, Shenandoah, and Epsilon use the register_nmethod,
>> unregister_nmethod, and flush_nmethod calls, but the other GCs don't.
>>
>> This patch builds upon:
>>   JDK-8220411: Remove ScavengeRootsInCode=0 code
>>   https://bugs.openjdk.java.net/browse/JDK-8220411
>>
>> and also depends on the resolution of:
>>   JDK-8220342: Remove scavenge_root_nmethods_do from
>> VM_HeapWalkOperation::collect_simple_roots
>>   https://bugs.openjdk.java.net/browse/JDK-8220342
>>
>> Thanks,
>> StefanK

From vladimir.kozlov at oracle.com Tue Mar 12 00:44:30 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 11 Mar 2019 17:44:30 -0700
Subject: RFR: JDK-8218074 - Update Graal
In-Reply-To: <78CAC4B7-B144-43B9-B684-6FE60F90099D@oracle.com>
References: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com> <264E7B5C-BE4F-4702-95EB-2AD56020AF99@oracle.com> <0962869c-b281-edde-41d1-f76fe15d3dae@oracle.com> <78CAC4B7-B144-43B9-B684-6FE60F90099D@oracle.com>
Message-ID:

I looked through all changes done in OpenJDK since the last 'Graal Update'.

I found only a few issues, listed below. The remaining changes in overwritten-diffs.txt are not overwritten; they are kept, so we don't need to re-apply them.
----------------------------------------------------------------------- New Igor's fix for 8196568 overwrote fix for 8217678 in StandardGraphBuilderPlugins.java file http://hg.openjdk.java.net/jdk/jdk/rev/b693b0d2053d Igor, please confirm that your changes also fix 8217678 and we don't need to apply it again. ----------------------------------------------------------------------- Next changes were reverted: JDK-8217289 http://hg.openjdk.java.net/jdk/jdk/rev/0040f89feb78 Jesper, I would suggest to revert your changes to corresponding file: hg revert src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/BigIntegerIntrinsicsTest.java I asked Patric to file PR for it month ago. Does anyone known if it was filed? I sent e-mail to him. ---------------------------------------------------------------------- Several files has incorrect Copyright years change from 2019 back to 2018. For example: http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.lir.amd64/src/org/graalvm/compiler/lir/amd64/AMD64ArrayIndexOfOp.java.udiff.html Please Fix. Doug, we need PR to update Copyright years in master repo so they match. Thanks, Vladimir On 3/11/19 1:55 PM, jesper.wilhelmsson at oracle.com wrote: >> On 11 Mar 2019, at 18:56, Vladimir Kozlov wrote: >> >> I am fine doing small manual labor to find what is missing and what is overwritten. For that I need some information. >> >> First, how overwritten-diffs.txt was created? If it is created by using 'hg' to find all changes done in OpenJDK since last 'Update Graal' then I would understand why changes are in this file. >> >> Next, can you attach to bug report original patch created by updategraalinopenjdk before applying it to OpenJDK sources? Do we have it? This way I can compare overwritten-diffs.txt with it and find if there duplicate and I don't need to re-apply changes. 
>
> The mx script applies the patch automatically so I don't have it as a separate patch. But the only changes that my script does after the mx script has applied the patch is in the test directory, so the patch available in the webrev should be what the mx script created if you ignore anything in test.
>
>> Some changes were pushed into OpenJDK because we did not want to wait a fix for months - last updated was done 3 months ago!!! If we do more frequent updates overwritten-diffs.txt will be empty or very small. Manual work should not take a lot of time in such case.
>
> Yes, the intention is to do this more often, hopefully on a weekly basis going forward, and that should minimize the need to push to the OpenJDK as you say. I hope this "first" integration is a one-off in that respect. Obviously how often we can integrate depends on how much time the review takes ;-)
> /Jesper
>
>>
>> Thanks,
>> Vladimir
>>
>> On 3/11/19 2:45 AM, Doug Simon wrote:
>>>> On 11 Mar 2019, at 01:56, jesper.wilhelmsson at oracle.com wrote:
>>>>
>>>> Hi Vladimir,
>>>>
>>>> I followed the given instructions and used mx to create the diff:
>>>>
>>>> mx --java-home=$JAVA_HOME updategraalinopenjdk cleanjdk/open 12
>>>>
>>>> I assume JDK-8215322 is in there because it was actually pushed to OpenJDK and now it was overwritten (with the same diff). It seems the mx script do not realize that it's the same change.
>>> Correct. To detect that we would have to add logic for diffs of diffs and the complexity didn't seem worth it. I'd be more than happy to review a smart person's attempt at adding it though ;-)
>>> -Doug
>>>> The diff and patch and all changes made was done by the new script, so no extra human labor has been put into it at this point. If a human is required to modify the overwritten-diff I suggest that is part of the review step (as now) and that a new patch is simply added to the subtask.
>>>>
>>>> I have removed JDK-8215322 from the diff now.
>>>> >>>> Thanks, >>>> /Jesper >>>> >>>>> On 9 Mar 2019, at 18:42, Vladimir Kozlov wrote: >>>>> >>>>> Hi Jesper, >>>>> >>>>> Thank you for doing this. >>>>> >>>>> I looked on 8220387 to see what changes your webrev does not have and it is strange. You should not have such big difference. How you prepared overwritten-diffs.txt? >>>>> >>>>> For example, JAOTC changes "8215322: add @file support to jaotc" are in overwritten-diffs.txt. But changeset is listed in the 8218074 Graal changelog list [1]. >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> [1] 8433917 Wed Dec 19 16:20:00 2018 -0800 Igor Ignatyev [GR-13142] Add at-file support to jaotc. >>>>> >>>>> On 3/9/19 12:46 AM, jesper.wilhelmsson at oracle.com wrote: >>>>>> Hi, >>>>>> Please review the patch to integrate the latest Graal changes into OpenJDK. >>>>>> Graal tip to integrate: db79f81716886b7883370cd6ea1bbf5c42966fa5 >>>>>> JBS duplicates fixed by this integration: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8217161 >>>>>> https://bugs.openjdk.java.net/browse/JDK-8218698 >>>>>> https://bugs.openjdk.java.net/browse/JDK-8218859 >>>>>> JBS duplicates deferred to the next integration: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8214947 >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218074 >>>>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/ >>>>>> This integration did overwrite changes already in place in OpenJDK. A subtask was created to hold the diff: https://bugs.openjdk.java.net/browse/JDK-8220387 >>>>>> Thanks, >>>>>> /Jesper >>>> > From bsrbnd at gmail.com Tue Mar 12 00:54:56 2019 From: bsrbnd at gmail.com (B. Blaser) Date: Tue, 12 Mar 2019 01:54:56 +0100 Subject: RFR 8220407 (XS): compiler/intrinsics/math/TestFpMinMaxIntrinsics.java timedout Message-ID: Hi, Please review the following trivial fix to disable the unnecessary slow search tree examples which might cause timeouts on some systems. 
Thanks,
Bernard

diff --git a/test/hotspot/jtreg/compiler/intrinsics/math/TestFpMinMaxIntrinsics.java b/test/hotspot/jtreg/compiler/intrinsics/math/TestFpMinMaxIntrinsics.java
--- a/test/hotspot/jtreg/compiler/intrinsics/math/TestFpMinMaxIntrinsics.java
+++ b/test/hotspot/jtreg/compiler/intrinsics/math/TestFpMinMaxIntrinsics.java
@@ -45,16 +45,6 @@
  * -XX:CompileCommand=print,compiler/intrinsics/math/TestFpMinMaxIntrinsics.*Test*
  * -XX:CompileCommand=compileonly,compiler/intrinsics/math/TestFpMinMaxIntrinsics.*Test*
  * compiler.intrinsics.math.TestFpMinMaxIntrinsics reductionTests 100
- * @run main/othervm -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions
- * -XX:+TieredCompilation
- * -XX:CompileCommand=print,compiler/intrinsics/math/TestFpMinMaxIntrinsics.min*
- * -XX:CompileCommand=dontinline,compiler/intrinsics/math/TestFpMinMaxIntrinsics.min*
- * compiler.intrinsics.math.TestFpMinMaxIntrinsics randomSearchTree 1
- * @run main/othervm -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions
- * -XX:+TieredCompilation
- * -XX:CompileCommand=print,compiler/intrinsics/math/TestFpMinMaxIntrinsics.min*
- * -XX:CompileCommand=dontinline,compiler/intrinsics/math/TestFpMinMaxIntrinsics.min*
- * compiler.intrinsics.math.TestFpMinMaxIntrinsics sortedSearchTree 1
  */

 package compiler.intrinsics.math;

From vladimir.kozlov at oracle.com Tue Mar 12 00:57:52 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 11 Mar 2019 17:57:52 -0700
Subject: RFR 8220407 (XS): compiler/intrinsics/math/TestFpMinMaxIntrinsics.java timedout
In-Reply-To:
References:
Message-ID: <445f6acb-3e69-adad-4388-7b6be38c154f@oracle.com>

Looks good and trivial.

Thanks,
Vladimir

On 3/11/19 5:54 PM, B. Blaser wrote:
> Hi,
>
> Please review the following trivial fix to disable the unnecessary
> slow search tree examples which might cause timeouts on some systems.
>
> Thanks,
> Bernard
>
> diff --git a/test/hotspot/jtreg/compiler/intrinsics/math/TestFpMinMaxIntrinsics.java b/test/hotspot/jtreg/compiler/intrinsics/math/TestFpMinMaxIntrinsics.java
> --- a/test/hotspot/jtreg/compiler/intrinsics/math/TestFpMinMaxIntrinsics.java
> +++ b/test/hotspot/jtreg/compiler/intrinsics/math/TestFpMinMaxIntrinsics.java
> @@ -45,16 +45,6 @@
>   * -XX:CompileCommand=print,compiler/intrinsics/math/TestFpMinMaxIntrinsics.*Test*
>   * -XX:CompileCommand=compileonly,compiler/intrinsics/math/TestFpMinMaxIntrinsics.*Test*
>   * compiler.intrinsics.math.TestFpMinMaxIntrinsics reductionTests 100
> - * @run main/othervm -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions
> - * -XX:+TieredCompilation
> - * -XX:CompileCommand=print,compiler/intrinsics/math/TestFpMinMaxIntrinsics.min*
> - * -XX:CompileCommand=dontinline,compiler/intrinsics/math/TestFpMinMaxIntrinsics.min*
> - * compiler.intrinsics.math.TestFpMinMaxIntrinsics randomSearchTree 1
> - * @run main/othervm -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions
> - * -XX:+TieredCompilation
> - * -XX:CompileCommand=print,compiler/intrinsics/math/TestFpMinMaxIntrinsics.min*
> - * -XX:CompileCommand=dontinline,compiler/intrinsics/math/TestFpMinMaxIntrinsics.min*
> - * compiler.intrinsics.math.TestFpMinMaxIntrinsics sortedSearchTree 1
>   */
>
>  package compiler.intrinsics.math;
>

From vladimir.kozlov at oracle.com Tue Mar 12 03:21:56 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 11 Mar 2019 20:21:56 -0700
Subject: RFR: JDK-8218074 - Update Graal
In-Reply-To:
References: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com> <264E7B5C-BE4F-4702-95EB-2AD56020AF99@oracle.com> <0962869c-b281-edde-41d1-f76fe15d3dae@oracle.com> <78CAC4B7-B144-43B9-B684-6FE60F90099D@oracle.com>
Message-ID: <2180ceb1-77a0-5f3f-2483-7fa9000c6c0e@oracle.com>

Igor confirmed that we don't need to patch StandardGraphBuilderPlugins.java.
Vladimir

On 3/11/19 5:44 PM, Vladimir Kozlov wrote:
> I looked through all changes done in OpenJDK since last 'Graal Update'.
>
> I found only few issues - see following. The rest changes in overwritten-diffs.txt are not overwritten, they are kept -
> we don't need to re-apply them.
>
> -----------------------------------------------------------------------
> New Igor's fix for 8196568 overwrote fix for 8217678 in StandardGraphBuilderPlugins.java file
>
> http://hg.openjdk.java.net/jdk/jdk/rev/b693b0d2053d
>
> Igor, please confirm that your changes also fix 8217678 and we don't need to apply it again.
>
> -----------------------------------------------------------------------
> Next changes were reverted: JDK-8217289
>
> http://hg.openjdk.java.net/jdk/jdk/rev/0040f89feb78
>
> Jesper, I would suggest to revert your changes to corresponding file:
>
>  hg revert
> src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/BigIntegerIntrinsicsTest.java
>
>
> I asked Patric to file PR for it month ago. Does anyone known if it was filed? I sent e-mail to him.
>
> ----------------------------------------------------------------------
> Several files has incorrect Copyright years change from 2019 back to 2018. For example:
>
> http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.lir.amd64/src/org/graalvm/compiler/lir/amd64/AMD64ArrayIndexOfOp.java.udiff.html
>
>
> Please Fix.
>
> Doug, we need PR to update Copyright years in master repo so they match.
>
> Thanks,
> Vladimir
>
> On 3/11/19 1:55 PM, jesper.wilhelmsson at oracle.com wrote:
>>> On 11 Mar 2019, at 18:56, Vladimir Kozlov wrote:
>>>
>>> I am fine doing small manual labor to find what is missing and what is overwritten. For that I need some information.
>>>
>>> First, how overwritten-diffs.txt was created?
If it is created by using 'hg' to find all changes done in OpenJDK
>>> since last 'Update Graal' then I would understand why changes are in this file.
>>>
>>> Next, can you attach to bug report original patch created by updategraalinopenjdk before applying it to OpenJDK
>>> sources? Do we have it? This way I can compare overwritten-diffs.txt with it and find if there duplicate and I don't
>>> need to re-apply changes.
>>
>> The mx script applies the patch automatically so I don't have it as a separate patch. But the only changes that my
>> script does after the mx script has applied the patch is in the test directory, so the patch available in the webrev
>> should be what the mx script created if you ignore anything in test.
>>
>>> Some changes were pushed into OpenJDK because we did not want to wait a fix for months - last updated was done 3
>>> months ago!!! If we do more frequent updates overwritten-diffs.txt will be empty or very small. Manual work should
>>> not take a lot of time in such case.
>>
>> Yes, the intention is to do this more often, hopefully on a weekly basis going forward, and that should minimize the
>> need to push to the OpenJDK as you say. I hope this "first" integration is a one-off in that respect. Obviously how
>> often we can integrate depends on how much time the review takes ;-)
>> /Jesper
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 3/11/19 2:45 AM, Doug Simon wrote:
>>>>> On 11 Mar 2019, at 01:56, jesper.wilhelmsson at oracle.com wrote:
>>>>>
>>>>> Hi Vladimir,
>>>>>
>>>>> I followed the given instructions and used mx to create the diff:
>>>>>
>>>>> mx --java-home=$JAVA_HOME updategraalinopenjdk cleanjdk/open 12
>>>>>
>>>>> I assume JDK-8215322 is in there because it was actually pushed to OpenJDK and now it was overwritten (with the
>>>>> same diff). It seems the mx script do not realize that it's the same change.
>>>> Correct. To detect that we would have to add logic for diffs of diffs and the complexity didn't seem worth it. I'd
>>>> be more than happy to review a smart person's attempt at adding it though ;-)
>>>> -Doug
>>>>> The diff and patch and all changes made was done by the new script, so no extra human labor has been put into it at
>>>>> this point. If a human is required to modify the overwritten-diff I suggest that is part of the review step (as
>>>>> now) and that a new patch is simply added to the subtask.
>>>>>
>>>>> I have removed JDK-8215322 from the diff now.
>>>>>
>>>>> Thanks,
>>>>> /Jesper
>>>>>
>>>>>> On 9 Mar 2019, at 18:42, Vladimir Kozlov wrote:
>>>>>>
>>>>>> Hi Jesper,
>>>>>>
>>>>>> Thank you for doing this.
>>>>>>
>>>>>> I looked on 8220387 to see what changes your webrev does not have and it is strange. You should not have such big
>>>>>> difference. How you prepared overwritten-diffs.txt?
>>>>>>
>>>>>> For example, JAOTC changes "8215322: add @file support to jaotc" are in overwritten-diffs.txt. But changeset is
>>>>>> listed in the 8218074 Graal changelog list [1].
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> [1] 8433917 Wed Dec 19 16:20:00 2018 -0800 Igor Ignatyev [GR-13142] Add at-file support to jaotc.
>>>>>>
>>>>>> On 3/9/19 12:46 AM, jesper.wilhelmsson at oracle.com wrote:
>>>>>>> Hi,
>>>>>>> Please review the patch to integrate the latest Graal changes into OpenJDK.
>>>>>>> Graal tip to integrate: db79f81716886b7883370cd6ea1bbf5c42966fa5
>>>>>>> JBS duplicates fixed by this integration:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8217161
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8218698
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8218859
>>>>>>> JBS duplicates deferred to the next integration:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8214947
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218074
>>>>>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/
>>>>>>> This integration did overwrite changes already in place in OpenJDK.
A subtask was created to hold the diff:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8220387
>>>>>>> Thanks,
>>>>>>> /Jesper
>>>>>
>>

From Ningsheng.Jian at arm.com Tue Mar 12 04:28:36 2019
From: Ningsheng.Jian at arm.com (Ningsheng Jian (Arm Technology China))
Date: Tue, 12 Mar 2019 04:28:36 +0000
Subject: [aarch64-port-dev ] RFR(S): 8214922: Add vectorization support for fmin/fmax
In-Reply-To: <954d6760-a914-747b-aab5-b928490510a4@redhat.com>
References: <87d0pv2iow.fsf@redhat.com> <877eg32bzq.fsf@redhat.com> <871s6a3map.fsf@redhat.com> <87va371n6b.fsf@redhat.com> <40d1a9a7-47f3-4e13-032d-70932b03d215@redhat.com> <954d6760-a914-747b-aab5-b928490510a4@redhat.com>
Message-ID: <371872d2-4b07-f1bd-cadc-7758d5ac698e@arm.com>

Hi Andrew,

On 03/11/2019 07:39 PM, Andrew Dinn wrote:
> Hi Pengfei,
>
> On 07/03/2019 09:26, Pengfei Li (Arm Technology China) wrote:
>> Please see below updated webrev for the pending patch of fmin/fmax
>> vectorization.
>> The only difference between webrev.03 and webrev.02 is the
>> hard-coded
>> arrangement bits in fmaxv/fminv encodings are replaced.
>>
>> webrev: http://cr.openjdk.java.net/~pli/rfr/8214922/webrev.03/
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8214922
> Yes, that looks good to push. Thanks.
>

Thanks! It passed the submit repo tests and our local jtreg tests. Pushed.

Regards,
Ningsheng

From dean.long at oracle.com Tue Mar 12 04:57:36 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Mon, 11 Mar 2019 21:57:36 -0700
Subject: RFR: JDK-8218074 - Update Graal
In-Reply-To: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com>
References: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com>
Message-ID:

On 3/9/19 12:46 AM, jesper.wilhelmsson at oracle.com wrote:
> JBS duplicates deferred to the next integration:
> https://bugs.openjdk.java.net/browse/JDK-8214947

But this change is already in both JDK and Graal.
dl From jesper.wilhelmsson at oracle.com Tue Mar 12 08:25:37 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Tue, 12 Mar 2019 09:25:37 +0100 Subject: RFR: JDK-8218074 - Update Graal In-Reply-To: References: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com> <264E7B5C-BE4F-4702-95EB-2AD56020AF99@oracle.com> <0962869c-b281-edde-41d1-f76fe15d3dae@oracle.com> <78CAC4B7-B144-43B9-B684-6FE60F90099D@oracle.com> Message-ID: <478C2306-0347-467F-8C2F-49F13CB7D3BA@oracle.com> I have reverted the changes to BigIntegerIntrinsicsTest.java and updated all copyright years that was present in overwritten-diffs.txt. New webrev for integration: http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.01/ Thanks, /Jesper > On 12 Mar 2019, at 01:44, Vladimir Kozlov wrote: > > I looked through all changes done in OpenJDK since last 'Graal Update'. > > I found only few issues - see following. The rest changes in overwritten-diffs.txt are not overwritten, they are kept - we don't need to re-apply them. > > ----------------------------------------------------------------------- > New Igor's fix for 8196568 overwrote fix for 8217678 in StandardGraphBuilderPlugins.java file > > http://hg.openjdk.java.net/jdk/jdk/rev/b693b0d2053d > > Igor, please confirm that your changes also fix 8217678 and we don't need to apply it again. > > ----------------------------------------------------------------------- > Next changes were reverted: JDK-8217289 > > http://hg.openjdk.java.net/jdk/jdk/rev/0040f89feb78 > > Jesper, I would suggest to revert your changes to corresponding file: > > hg revert src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/BigIntegerIntrinsicsTest.java > > I asked Patric to file PR for it month ago. Does anyone known if it was filed? I sent e-mail to him. 
> > ---------------------------------------------------------------------- > Several files has incorrect Copyright years change from 2019 back to 2018. For example: > > http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.lir.amd64/src/org/graalvm/compiler/lir/amd64/AMD64ArrayIndexOfOp.java.udiff.html > > Please Fix. > > Doug, we need PR to update Copyright years in master repo so they match. > > Thanks, > Vladimir > > On 3/11/19 1:55 PM, jesper.wilhelmsson at oracle.com wrote: >>> On 11 Mar 2019, at 18:56, Vladimir Kozlov wrote: >>> >>> I am fine doing small manual labor to find what is missing and what is overwritten. For that I need some information. >>> >>> First, how overwritten-diffs.txt was created? If it is created by using 'hg' to find all changes done in OpenJDK since last 'Update Graal' then I would understand why changes are in this file. >>> >>> Next, can you attach to bug report original patch created by updategraalinopenjdk before applying it to OpenJDK sources? Do we have it? This way I can compare overwritten-diffs.txt with it and find if there duplicate and I don't need to re-apply changes. >> The mx script applies the patch automatically so I don't have it as a separate patch. But the only changes that my script does after the mx script has applied the patch is in the test directory, so the patch available in the webrev should be what the mx script created if you ignore anything in test. >>> Some changes were pushed into OpenJDK because we did not want to wait a fix for months - last updated was done 3 months ago!!! If we do more frequent updates overwritten-diffs.txt will be empty or very small. Manual work should not take a lot of time in such case. >> Yes, the intention is to do this more often, hopefully on a weekly basis going forward, and that should minimize the need to push to the OpenJDK as you say. I hope this "first" integration is a one-off in that respect. 
Obviously how often we can integrate depends on how much time the review takes ;-) >> /Jesper >>> >>> Thanks, >>> Vladimir >>> >>> On 3/11/19 2:45 AM, Doug Simon wrote: >>>>> On 11 Mar 2019, at 01:56, jesper.wilhelmsson at oracle.com wrote: >>>>> >>>>> Hi Vladimir, >>>>> >>>>> I followed the given instructions and used mx to create the diff: >>>>> >>>>> mx --java-home=$JAVA_HOME updategraalinopenjdk cleanjdk/open 12 >>>>> >>>>> I assume JDK-8215322 is in there because it was actually pushed to OpenJDK and now it was overwritten (with the same diff). It seems the mx script do not realize that it's the same cahange. >>>> Correct. To detect that we would have to add logic for diffs of diffs and the complexity didn?t seem worth it. I?d more more than happy to review a smart person?s attempt at adding it though ;-) >>>> -Doug >>>>> The diff and patch and all changes made was done by the new script, so no extra human labor has been put into it at this point. If a human is required to modify the overwritten-diff I suggest that is part of the review step (as now) and that a new patch is simply added to the subtask. >>>>> >>>>> I have removed JDK-8215322 from the diff now. >>>>> >>>>> Thanks, >>>>> /Jesper >>>>> >>>>>> On 9 Mar 2019, at 18:42, Vladimir Kozlov wrote: >>>>>> >>>>>> Hi Jesper, >>>>>> >>>>>> Thank you for doing this. >>>>>> >>>>>> I looked on 8220387 to see what changes your webrev does not have and it is strange. You should not have such big difference. How you prepared overwritten-diffs.txt? >>>>>> >>>>>> For example, JAOTC changes "8215322: add @file support to jaotc" are in overwritten-diffs.txt. But changeset is listed in the 8218074 Graal changelog list [1]. >>>>>> >>>>>> Thanks, >>>>>> Vladimir >>>>>> >>>>>> [1] 8433917 Wed Dec 19 16:20:00 2018 -0800 Igor Ignatyev [GR-13142] Add at-file support to jaotc. 
>>>>>> >>>>>> On 3/9/19 12:46 AM, jesper.wilhelmsson at oracle.com wrote: >>>>>>> Hi, >>>>>>> Please review the patch to integrate the latest Graal changes into OpenJDK. >>>>>>> Graal tip to integrate: db79f81716886b7883370cd6ea1bbf5c42966fa5 >>>>>>> JBS duplicates fixed by this integration: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8217161 >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8218698 >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8218859 >>>>>>> JBS duplicates deferred to the next integration: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8214947 >>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218074 >>>>>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/ >>>>>>> This integration did overwrite changes already in place in OpenJDK. A subtask was created to hold the diff: https://bugs.openjdk.java.net/browse/JDK-8220387 >>>>>>> Thanks, >>>>>>> /Jesper >>>>> -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From claes.redestad at oracle.com Tue Mar 12 12:52:31 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Tue, 12 Mar 2019 13:52:31 +0100 Subject: RFR: 8220501: Improve c1_ValueStack locks handling Message-ID: <22a3a4a0-08e2-6873-66af-f549301bdb44@oracle.com> Hi, the _locks Values in ValueStack is often empty/unused, and allocating it lazily is a small startup/warmup optimization. Webrev: http://cr.openjdk.java.net/~redestad/8220501/open.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8220501 Testing: tier1-3 Thanks! 
/Claes

From claes.redestad at oracle.com Tue Mar 12 13:15:42 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Tue, 12 Mar 2019 14:15:42 +0100
Subject: RFR: 8220501: Improve c1_ValueStack locks handling
In-Reply-To:
References: <22a3a4a0-08e2-6873-66af-f549301bdb44@oracle.com>
Message-ID: <80c8291b-b45e-4581-df36-10e27878ed49@oracle.com>

On 2019-03-12 14:14, Tobias Hartmann wrote:
> Hi Claes,
>
> looks good to me.

Thanks, Tobias!

/Claes

From tobias.hartmann at oracle.com Tue Mar 12 13:14:43 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 12 Mar 2019 14:14:43 +0100
Subject: RFR: 8220501: Improve c1_ValueStack locks handling
In-Reply-To: <22a3a4a0-08e2-6873-66af-f549301bdb44@oracle.com>
References: <22a3a4a0-08e2-6873-66af-f549301bdb44@oracle.com>
Message-ID:

Hi Claes,

looks good to me.

Best regards,
Tobias

On 12.03.19 13:52, Claes Redestad wrote:
> Hi,
>
> the _locks Values in ValueStack is often empty/unused, and allocating it
> lazily is a small startup/warmup optimization.
>
> Webrev: http://cr.openjdk.java.net/~redestad/8220501/open.00/
> Bug:    https://bugs.openjdk.java.net/browse/JDK-8220501
>
> Testing: tier1-3
>
> Thanks!
>
> /Claes

From nils.eliasson at oracle.com Tue Mar 12 13:54:27 2019
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Tue, 12 Mar 2019 14:54:27 +0100
Subject: RFR: 8220501: Improve c1_ValueStack locks handling
In-Reply-To:
References: <22a3a4a0-08e2-6873-66af-f549301bdb44@oracle.com>
Message-ID: <663aeb60-d4d5-54ab-7151-92897ccad79f@oracle.com>

+1

// Nils

> Hi Claes,
>
> looks good to me.
>
> Best regards,
> Tobias
>
> On 12.03.19 13:52, Claes Redestad wrote:
>> Hi,
>>
>> the _locks Values in ValueStack is often empty/unused, and allocating it
>> lazily is a small startup/warmup optimization.
>>
>> Webrev: http://cr.openjdk.java.net/~redestad/8220501/open.00/
>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8220501
>>
>> Testing: tier1-3
>>
>> Thanks!
>> >> /Claes From claes.redestad at oracle.com Tue Mar 12 14:21:31 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Tue, 12 Mar 2019 15:21:31 +0100 Subject: RFR: 8220501: Improve c1_ValueStack locks handling In-Reply-To: <663aeb60-d4d5-54ab-7151-92897ccad79f@oracle.com> References: <22a3a4a0-08e2-6873-66af-f549301bdb44@oracle.com> <663aeb60-d4d5-54ab-7151-92897ccad79f@oracle.com> Message-ID: <7241aa0a-f45d-c45d-dcb3-b8189554fe3f@oracle.com> On 2019-03-12 14:54, Nils Eliasson wrote: > +1 > > // Nils Thanks for reviewing, Nils! /Claes From nils.eliasson at oracle.com Tue Mar 12 14:12:54 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 12 Mar 2019 15:12:54 +0100 Subject: RFR: 8220501: Improve c1_ValueStack locks handling In-Reply-To: <7241aa0a-f45d-c45d-dcb3-b8189554fe3f@oracle.com> References: <22a3a4a0-08e2-6873-66af-f549301bdb44@oracle.com> <663aeb60-d4d5-54ab-7151-92897ccad79f@oracle.com> <7241aa0a-f45d-c45d-dcb3-b8189554fe3f@oracle.com> Message-ID: <7bc0c95e-41a5-bf92-1056-e9a3a735dfd6@oracle.com> Always a pleasure, Claes! /Nils On 2019-03-12 15:21, Claes Redestad wrote: > On 2019-03-12 14:54, Nils Eliasson wrote: >> +1 >> >> // Nils > > Thanks for reviewing, Nils! > > /Claes From nils.eliasson at oracle.com Tue Mar 12 14:42:50 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Tue, 12 Mar 2019 15:42:50 +0100 Subject: RFR(S): 8219517: assert(false) failed: infinite loop in PhaseIterGVN::optimize In-Reply-To: References: Message-ID: <112841f9-8d2a-1404-6a97-98241fe956ec@oracle.com> I need a second review, Regards, Nils On 2019-03-08 11:52, Nils Eliasson wrote: > Hi all, > > Background: > > We can get stuck in an infinite loop in IGVN. The method reproducing > the problem is quite a big graph, and after some optimization, a huge > loop will die. But since it is so big, it takes a while before it has > been pruned. 
>
> In an inner loop there is a phi on memory that gets reduced to a self
> looping heart, with a membar on each in edge. There is also a
> connected region that keeps it alive. (From the start there is other
> memory state coming into this loop, but it gets disconnected early
> when the loop dies.)
>
> +---+                 +---+
> |   v                 v   |
> | Membar +-+ +---+ Membar |
> |          | |            |
> |          v v            |
> |          Phi            |
> |          + + +          |
> |          | | |          |
> +----------+ | +----------+
>              |
>              v
>            LoadN
>
> In IGVN, Ideal() will be called on the Load.
>
> On iteration 1 - A split_through_phi on one edge will be performed,
> because we can prove that the other edge of the phi is a loop. Now the
> Load hangs off one of the membars.
>
> On iteration 2 - Optimize_memory_chain will suggest the input to the
> membar as a more ideal memory, and then the load gets the phi back as
> the memory input.
>
> Repeat.
>
> I have gone to great lengths to show that this code is part of a huge
> loop that is dead and will be eliminated in due time.
>
> My suggested solution for breaking the infinite loop is to change the
> first case, by simply not performing the memory replacement when both
> inputs are self loops.
>
> https://bugs.openjdk.java.net/browse/JDK-8219517
>
> http://cr.openjdk.java.net/~neliasso/8219517/webrev.01/
>
> Regards,
>
> Nils
>
>
>

From goetz.lindenmaier at sap.com Tue Mar 12 14:39:09 2019
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 12 Mar 2019 14:39:09 +0000
Subject: 8220501: Improve c1_ValueStack locks handling
In-Reply-To: <22a3a4a0-08e2-6873-66af-f549301bdb44@oracle.com>
References: <22a3a4a0-08e2-6873-66af-f549301bdb44@oracle.com>
Message-ID:

Hi Claes,

it seems that your change breaks the slowdebug build:

.../src/hotspot/share/c1/c1_CFGPrinter.cpp: In member function ‘void CFGPrinterOutput::print_state(BlockBegin*)’:
.../src/hotspot/share/c1/c1_CFGPrinter.cpp:169:7: error: ‘for_each_lock_value’ was not declared in this scope
       for_each_lock_value(state, index, value) {
       ^~~~~~~~~~~~~~~~~~~

Best regards,
Goetz.

> -----Original Message-----
> From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Claes Redestad
> Sent: Dienstag, 12. März 2019 13:53
> To: hotspot compiler
> Subject: RFR: 8220501: Improve c1_ValueStack locks handling
>
> Hi,
>
> the _locks Values in ValueStack is often empty/unused, and allocating it
> lazily is a small startup/warmup optimization.
>
> Webrev: http://cr.openjdk.java.net/~redestad/8220501/open.00/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8220501
>
> Testing: tier1-3
>
> Thanks!
>
> /Claes

From claes.redestad at oracle.com Tue Mar 12 15:03:04 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Tue, 12 Mar 2019 16:03:04 +0100
Subject: RFR (urgent, trivial): 8220515: Revert removal of for_each_lock_value removal
Message-ID: <2532ec28-fda1-3457-9ad9-286174336a7b@oracle.com>

Hi,

revert the removal of the for_each_lock_value macro part of JDK-8220501, as this breaks debug builds.

Bug: https://bugs.openjdk.java.net/browse/JDK-8220515
Patch:

--- a/src/hotspot/share/c1/c1_ValueStack.hpp Tue Mar 12 15:26:45 2019 +0100
+++ b/src/hotspot/share/c1/c1_ValueStack.hpp Tue Mar 12 15:56:59 2019 +0100
@@ -261,6 +261,14 @@
        index += value->type()->size())


+#define for_each_lock_value(state, index, value)                    \
+  int temp_var = state->locks_size();                               \
+  for (index = 0;                                                   \
+       index < temp_var && (value = state->lock_at(index), true);   \
+       index++)                                                     \
+    if (value != NULL)
+
+
 // Macro definition for simple iteration of all state values of a ValueStack
 // Because the code cannot be executed in a single loop, the code must be passed
 // as a macro parameter.

From claes.redestad at oracle.com Tue Mar 12 15:06:35 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Tue, 12 Mar 2019 16:06:35 +0100
Subject: 8220501: Improve c1_ValueStack locks handling
In-Reply-To:
References: <22a3a4a0-08e2-6873-66af-f549301bdb44@oracle.com>
Message-ID: <8eaaebd7-bf42-539c-8be4-14be64e0fdbc@oracle.com>

Hi Goetz,

yes, I've filed https://bugs.openjdk.java.net/browse/JDK-8220515 and sent out a RFR for a partial removal.

/Claes

On 2019-03-12 15:39, Lindenmaier, Goetz wrote:
> Hi Claes,
>
> it seems that your change breaks the slowdebug build:
>
> .../src/hotspot/share/c1/c1_CFGPrinter.cpp: In member function ‘void CFGPrinterOutput::print_state(BlockBegin*)’:
> .../src/hotspot/share/c1/c1_CFGPrinter.cpp:169:7: error: ‘for_each_lock_value’ was not declared in this scope
>        for_each_lock_value(state, index, value) {
>        ^~~~~~~~~~~~~~~~~~~
>
> Best regards,
> Goetz.
>
>
>
>> -----Original Message-----
>> From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Claes Redestad
>> Sent: Dienstag, 12. März 2019 13:53
>> To: hotspot compiler
>> Subject: RFR: 8220501: Improve c1_ValueStack locks handling
>>
>> Hi,
>>
>> the _locks Values in ValueStack is often empty/unused, and allocating it
>> lazily is a small startup/warmup optimization.
>>
>> Webrev: http://cr.openjdk.java.net/~redestad/8220501/open.00/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8220501
>>
>> Testing: tier1-3
>>
>> Thanks!
>>
>> /Claes

From tobias.hartmann at oracle.com Tue Mar 12 15:09:34 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 12 Mar 2019 16:09:34 +0100
Subject: RFR (urgent, trivial): 8220515: Revert removal of for_each_lock_value removal
In-Reply-To: <2532ec28-fda1-3457-9ad9-286174336a7b@oracle.com>
References: <2532ec28-fda1-3457-9ad9-286174336a7b@oracle.com>
Message-ID: <82661b8a-3abb-ca7a-6854-4a025581e664@oracle.com>

Hi Claes,

reviewed.

Best regards,
Tobias

On 12.03.19 16:03, Claes Redestad wrote:
> Hi,
>
> revert the removal of the for_each_lock_value macro part of JDK-8220501,
> as this breaks debug builds.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8220515
> Patch:
>
> --- a/src/hotspot/share/c1/c1_ValueStack.hpp    Tue Mar 12 15:26:45 2019 +0100
> +++ b/src/hotspot/share/c1/c1_ValueStack.hpp    Tue Mar 12 15:56:59 2019 +0100
> @@ -261,6 +261,14 @@
>         index += value->type()->size())
>
>
> +#define for_each_lock_value(state, index, value)                    \
> +  int temp_var = state->locks_size();                               \
> +  for (index = 0;                                                   \
> +       index < temp_var && (value = state->lock_at(index), true);   \
> +       index++)                                                     \
> +    if (value != NULL)
> +
> +
>  // Macro definition for simple iteration of all state values of a ValueStack
>  // Because the code cannot be executed in a single loop, the code must be passed
>  // as a macro parameter.
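
For readers unfamiliar with this style of iteration macro, the following is a minimal sketch of how a call site such as the one in c1_CFGPrinter.cpp might use it. Only the macro body follows the patch quoted above; Value, MockValueStack, and count_non_null_locks are hypothetical stand-ins invented for this example and are not HotSpot's real classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a C1 IR value; not HotSpot's Value type.
struct Value { int id; };

// Hypothetical stand-in exposing only the two accessors the macro needs.
class MockValueStack {
  std::vector<Value*> _locks;
 public:
  void lock(Value* v)        { _locks.push_back(v); }
  int locks_size() const     { return static_cast<int>(_locks.size()); }
  Value* lock_at(int i) const { return _locks[i]; }
};

// Macro body as quoted in the JDK-8220515 patch.
#define for_each_lock_value(state, index, value)                    \
  int temp_var = state->locks_size();                               \
  for (index = 0;                                                   \
       index < temp_var && (value = state->lock_at(index), true);   \
       index++)                                                     \
    if (value != NULL)

// Counts the non-NULL lock entries; the braces after the macro become
// the body of the trailing 'if' inside the expanded 'for' loop.
int count_non_null_locks(MockValueStack* state) {
  int index;
  Value* value;
  int count = 0;
  for_each_lock_value(state, index, value) {
    count++;
  }
  return count;
}
```

Because the macro declares temp_var in the enclosing scope, it can be expanded only once per scope, and the caller must declare index and value beforehand.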

From claes.redestad at oracle.com Tue Mar 12 15:11:10 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Tue, 12 Mar 2019 16:11:10 +0100
Subject: RFR (urgent, trivial): 8220515: Revert removal of for_each_lock_value removal
In-Reply-To: <82661b8a-3abb-ca7a-6854-4a025581e664@oracle.com>
References: <2532ec28-fda1-3457-9ad9-286174336a7b@oracle.com> <82661b8a-3abb-ca7a-6854-4a025581e664@oracle.com>
Message-ID:

Thanks, pushed.

/Claes

On 2019-03-12 16:09, Tobias Hartmann wrote:
> Hi Claes,
>
> reviewed.

From goetz.lindenmaier at sap.com Tue Mar 12 15:34:24 2019
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 12 Mar 2019 15:34:24 +0000
Subject: 8220501: Improve c1_ValueStack locks handling
In-Reply-To: <8eaaebd7-bf42-539c-8be4-14be64e0fdbc@oracle.com>
References: <22a3a4a0-08e2-6873-66af-f549301bdb44@oracle.com> <8eaaebd7-bf42-539c-8be4-14be64e0fdbc@oracle.com>
Message-ID:

Thanks!
... sorry, I wasn't fast enough to review it!
Works now, thanks!

Best regards,
Goetz.

> -----Original Message-----
> From: Claes Redestad
> Sent: Dienstag, 12. März 2019 16:07
> To: Lindenmaier, Goetz ; hotspot compiler
> Subject: Re: 8220501: Improve c1_ValueStack locks handling
>
> Hi Goetz,
>
> yes, I've filed https://bugs.openjdk.java.net/browse/JDK-8220515 and
> sent out a RFR for a partial removal.
>
> /Claes
>
> On 2019-03-12 15:39, Lindenmaier, Goetz wrote:
> > Hi Claes,
> >
> > it seems that your change breaks the slowdebug build:
> >
> > .../src/hotspot/share/c1/c1_CFGPrinter.cpp: In member function ‘void CFGPrinterOutput::print_state(BlockBegin*)’:
> > .../src/hotspot/share/c1/c1_CFGPrinter.cpp:169:7: error: ‘for_each_lock_value’ was not declared in this scope
> >        for_each_lock_value(state, index, value) {
> >        ^~~~~~~~~~~~~~~~~~~
> >
> > Best regards,
> > Goetz.
> >
> >
> >
> >> -----Original Message-----
> >> From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Claes Redestad
> >> Sent: Dienstag, 12. März 2019 13:53
> >> To: hotspot compiler
> >> Subject: RFR: 8220501: Improve c1_ValueStack locks handling
> >>
> >> Hi,
> >>
> >> the _locks Values in ValueStack is often empty/unused, and allocating it
> >> lazily is a small startup/warmup optimization.
> >>
> >> Webrev: http://cr.openjdk.java.net/~redestad/8220501/open.00/
> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8220501
> >>
> >> Testing: tier1-3
> >>
> >> Thanks!
> >>
> >> /Claes

From nils.eliasson at oracle.com Tue Mar 12 15:46:15 2019
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Tue, 12 Mar 2019 16:46:15 +0100
Subject: RFR (urgent, trivial): 8220515: Revert removal of for_each_lock_value removal
In-Reply-To: <2532ec28-fda1-3457-9ad9-286174336a7b@oracle.com>
References: <2532ec28-fda1-3457-9ad9-286174336a7b@oracle.com>
Message-ID:

Reviewed,

Nils

On 2019-03-12 16:03, Claes Redestad wrote:
> Hi,
>
> revert the removal of the for_each_lock_value macro part of JDK-8220501,
> as this breaks debug builds.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8220515
> Patch:
>
> --- a/src/hotspot/share/c1/c1_ValueStack.hpp    Tue Mar 12 15:26:45 2019 +0100
> +++ b/src/hotspot/share/c1/c1_ValueStack.hpp    Tue Mar 12 15:56:59 2019 +0100
> @@ -261,6 +261,14 @@
>         index += value->type()->size())
>
>
> +#define for_each_lock_value(state, index, value)                    \
> +  int temp_var = state->locks_size();                               \
> +  for (index = 0;                                                   \
> +       index < temp_var && (value = state->lock_at(index), true);   \
> +       index++)                                                     \
> +    if (value != NULL)
> +
> +
>  // Macro definition for simple iteration of all state values of a ValueStack
>  // Because the code cannot be executed in a single loop, the code must be passed
>  // as a macro parameter.
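
The macro relies on locks_size() and lock_at(), which matters once the lock list is allocated lazily as proposed in JDK-8220501. The following is a rough sketch of that idea only, assuming a NULL pointer stands for "no locks yet"; LazyLockStack, Value, and locks_allocated are hypothetical names made up for this illustration, and the actual webrev may be structured quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a C1 IR value; not HotSpot's Value type.
struct Value { int id; };

// Sketch of a state object whose lock list is created on first use.
class LazyLockStack {
  std::vector<Value*>* _locks;  // NULL until the first lock() call
 public:
  LazyLockStack() : _locks(NULL) {}
  ~LazyLockStack() { delete _locks; }
  LazyLockStack(const LazyLockStack&) = delete;
  LazyLockStack& operator=(const LazyLockStack&) = delete;

  // True once the backing list has been allocated.
  bool locks_allocated() const { return _locks != NULL; }

  // An unallocated list is reported as empty, so callers that loop up
  // to locks_size() never dereference the NULL pointer.
  int locks_size() const {
    return _locks == NULL ? 0 : static_cast<int>(_locks->size());
  }

  // Only valid for 0 <= i < locks_size().
  Value* lock_at(int i) const { return (*_locks)[i]; }

  // Allocates the list on demand and returns the index of the new lock.
  int lock(Value* obj) {
    if (_locks == NULL) {
      _locks = new std::vector<Value*>();
    }
    _locks->push_back(obj);
    return locks_size() - 1;
  }
};
```

With this shape, states that never take a lock pay for one pointer rather than for an empty container, which is the kind of small startup saving the RFR describes.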
From vladimir.kozlov at oracle.com Tue Mar 12 15:52:54 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 12 Mar 2019 08:52:54 -0700 Subject: RFR: JDK-8218074 - Update Graal In-Reply-To: References: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com> Message-ID: <9b5be24d-dd41-07d7-f477-a658359dda00@oracle.com> Yes, the webrev does not have changes for StringCompressInflateTest.java which means it is in sync. We don't need to do anything for 8214947 fix. Vladimir On 3/11/19 9:57 PM, dean.long at oracle.com wrote: > On 3/9/19 12:46 AM, jesper.wilhelmsson at oracle.com wrote: >> JBS duplicates deferred to the next integration: >> https://bugs.openjdk.java.net/browse/JDK-8214947 > But this change is already in both JDK and Graal. > > dl From vladimir.kozlov at oracle.com Tue Mar 12 15:57:23 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 12 Mar 2019 08:57:23 -0700 Subject: RFR: JDK-8218074 - Update Graal In-Reply-To: <478C2306-0347-467F-8C2F-49F13CB7D3BA@oracle.com> References: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com> <264E7B5C-BE4F-4702-95EB-2AD56020AF99@oracle.com> <0962869c-b281-edde-41d1-f76fe15d3dae@oracle.com> <78CAC4B7-B144-43B9-B684-6FE60F90099D@oracle.com> <478C2306-0347-467F-8C2F-49F13CB7D3BA@oracle.com> Message-ID: Thank you, Jesper This looks good. Reviewed. And I see testing was good too. Thanks, Vladimir On 3/12/19 1:25 AM, jesper.wilhelmsson at oracle.com wrote: > I have reverted the changes to BigIntegerIntrinsicsTest.java and updated all copyright years that was present in overwritten-diffs.txt. > > New webrev for integration: > http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.01/ > > Thanks, > /Jesper > >> On 12 Mar 2019, at 01:44, Vladimir Kozlov wrote: >> >> I looked through all changes done in OpenJDK since last 'Graal Update'. >> >> I found only few issues - see following. The rest changes in overwritten-diffs.txt are not overwritten, they are kept - we don't need to re-apply them. 
>> >> ----------------------------------------------------------------------- >> New Igor's fix for 8196568 overwrote fix for 8217678 in StandardGraphBuilderPlugins.java file >> >> http://hg.openjdk.java.net/jdk/jdk/rev/b693b0d2053d >> >> Igor, please confirm that your changes also fix 8217678 and we don't need to apply it again. >> >> ----------------------------------------------------------------------- >> Next changes were reverted: JDK-8217289 >> >> http://hg.openjdk.java.net/jdk/jdk/rev/0040f89feb78 >> >> Jesper, I would suggest to revert your changes to corresponding file: >> >> hg revert src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/BigIntegerIntrinsicsTest.java >> >> I asked Patric to file PR for it month ago. Does anyone known if it was filed? I sent e-mail to him. >> >> ---------------------------------------------------------------------- >> Several files has incorrect Copyright years change from 2019 back to 2018. For example: >> >> http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.lir.amd64/src/org/graalvm/compiler/lir/amd64/AMD64ArrayIndexOfOp.java.udiff.html >> >> Please Fix. >> >> Doug, we need PR to update Copyright years in master repo so they match. >> >> Thanks, >> Vladimir >> >> On 3/11/19 1:55 PM, jesper.wilhelmsson at oracle.com wrote: >>>> On 11 Mar 2019, at 18:56, Vladimir Kozlov wrote: >>>> >>>> I am fine doing small manual labor to find what is missing and what is overwritten. For that I need some information. >>>> >>>> First, how overwritten-diffs.txt was created? If it is created by using 'hg' to find all changes done in OpenJDK since last 'Update Graal' then I would understand why changes are in this file. >>>> >>>> Next, can you attach to bug report original patch created by updategraalinopenjdk before applying it to OpenJDK sources? Do we have it? 
This way I can compare overwritten-diffs.txt with it and find if there duplicate and I don't need to re-apply changes. >>> The mx script applies the patch automatically so I don't have it as a separate patch. But the only changes that my script does after the mx script has applied the patch is in the test directory, so the patch available in the webrev should be what the mx script created if you ignore anything in test. >>>> Some changes were pushed into OpenJDK because we did not want to wait a fix for months - last updated was done 3 months ago!!! If we do more frequent updates overwritten-diffs.txt will be empty or very small. Manual work should not take a lot of time in such case. >>> Yes, the intention is to do this more often, hopefully on a weekly basis going forward, and that should minimize the need to push to the OpenJDK as you say. I hope this "first" integration is a one-off in that respect. Obviously how often we can integrate depends on how much time the review takes ;-) >>> /Jesper >>>> >>>> Thanks, >>>> Vladimir >>>> >>>> On 3/11/19 2:45 AM, Doug Simon wrote: >>>>>> On 11 Mar 2019, at 01:56, jesper.wilhelmsson at oracle.com wrote: >>>>>> >>>>>> Hi Vladimir, >>>>>> >>>>>> I followed the given instructions and used mx to create the diff: >>>>>> >>>>>> mx --java-home=$JAVA_HOME updategraalinopenjdk cleanjdk/open 12 >>>>>> >>>>>> I assume JDK-8215322 is in there because it was actually pushed to OpenJDK and now it was overwritten (with the same diff). It seems the mx script do not realize that it's the same cahange. >>>>> Correct. To detect that we would have to add logic for diffs of diffs and the complexity didn?t seem worth it. I?d more more than happy to review a smart person?s attempt at adding it though ;-) >>>>> -Doug >>>>>> The diff and patch and all changes made was done by the new script, so no extra human labor has been put into it at this point. 
If a human is required to modify the overwritten-diff I suggest that is part of the review step (as now) and that a new patch is simply added to the subtask. >>>>>> >>>>>> I have removed JDK-8215322 from the diff now. >>>>>> >>>>>> Thanks, >>>>>> /Jesper >>>>>> >>>>>>> On 9 Mar 2019, at 18:42, Vladimir Kozlov wrote: >>>>>>> >>>>>>> Hi Jesper, >>>>>>> >>>>>>> Thank you for doing this. >>>>>>> >>>>>>> I looked on 8220387 to see what changes your webrev does not have and it is strange. You should not have such big difference. How you prepared overwritten-diffs.txt? >>>>>>> >>>>>>> For example, JAOTC changes "8215322: add @file support to jaotc" are in overwritten-diffs.txt. But changeset is listed in the 8218074 Graal changelog list [1]. >>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>> [1] 8433917 Wed Dec 19 16:20:00 2018 -0800 Igor Ignatyev [GR-13142] Add at-file support to jaotc. >>>>>>> >>>>>>> On 3/9/19 12:46 AM, jesper.wilhelmsson at oracle.com wrote: >>>>>>>> Hi, >>>>>>>> Please review the patch to integrate the latest Graal changes into OpenJDK. >>>>>>>> Graal tip to integrate: db79f81716886b7883370cd6ea1bbf5c42966fa5 >>>>>>>> JBS duplicates fixed by this integration: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8217161 >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8218698 >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8218859 >>>>>>>> JBS duplicates deferred to the next integration: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8214947 >>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218074 >>>>>>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/ >>>>>>>> This integration did overwrite changes already in place in OpenJDK. 
A subtask was created to hold the diff: https://bugs.openjdk.java.net/browse/JDK-8220387 >>>>>>>> Thanks, >>>>>>>> /Jesper >>>>>> > From jesper.wilhelmsson at oracle.com Tue Mar 12 19:53:46 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Tue, 12 Mar 2019 20:53:46 +0100 Subject: RFR: JDK-8218074 - Update Graal In-Reply-To: References: <13A5D9F9-9B93-45F0-A230-EA61DFDB438F@oracle.com> <264E7B5C-BE4F-4702-95EB-2AD56020AF99@oracle.com> <0962869c-b281-edde-41d1-f76fe15d3dae@oracle.com> <78CAC4B7-B144-43B9-B684-6FE60F90099D@oracle.com> <478C2306-0347-467F-8C2F-49F13CB7D3BA@oracle.com> Message-ID: <61B9917A-7DA4-4639-B06D-135B0A6D826A@oracle.com> Thanks for the review! The integration is pushed now. /Jesper > On 12 Mar 2019, at 16:57, Vladimir Kozlov wrote: > > Thank you, Jesper > > This looks good. Reviewed. > And I see testing was good too. > > Thanks, > Vladimir > > On 3/12/19 1:25 AM, jesper.wilhelmsson at oracle.com wrote: >> I have reverted the changes to BigIntegerIntrinsicsTest.java and updated all copyright years that was present in overwritten-diffs.txt. >> New webrev for integration: >> http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.01/ >> Thanks, >> /Jesper >>> On 12 Mar 2019, at 01:44, Vladimir Kozlov wrote: >>> >>> I looked through all changes done in OpenJDK since last 'Graal Update'. >>> >>> I found only few issues - see following. The rest changes in overwritten-diffs.txt are not overwritten, they are kept - we don't need to re-apply them. >>> >>> ----------------------------------------------------------------------- >>> New Igor's fix for 8196568 overwrote fix for 8217678 in StandardGraphBuilderPlugins.java file >>> >>> http://hg.openjdk.java.net/jdk/jdk/rev/b693b0d2053d >>> >>> Igor, please confirm that your changes also fix 8217678 and we don't need to apply it again. 
>>> >>> ----------------------------------------------------------------------- >>> Next changes were reverted: JDK-8217289 >>> >>> http://hg.openjdk.java.net/jdk/jdk/rev/0040f89feb78 >>> >>> Jesper, I would suggest to revert your changes to corresponding file: >>> >>> hg revert src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.test/src/org/graalvm/compiler/hotspot/test/BigIntegerIntrinsicsTest.java >>> >>> I asked Patric to file PR for it month ago. Does anyone known if it was filed? I sent e-mail to him. >>> >>> ---------------------------------------------------------------------- >>> Several files has incorrect Copyright years change from 2019 back to 2018. For example: >>> >>> http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.lir.amd64/src/org/graalvm/compiler/lir/amd64/AMD64ArrayIndexOfOp.java.udiff.html >>> >>> Please Fix. >>> >>> Doug, we need PR to update Copyright years in master repo so they match. >>> >>> Thanks, >>> Vladimir >>> >>> On 3/11/19 1:55 PM, jesper.wilhelmsson at oracle.com wrote: >>>>> On 11 Mar 2019, at 18:56, Vladimir Kozlov wrote: >>>>> >>>>> I am fine doing small manual labor to find what is missing and what is overwritten. For that I need some information. >>>>> >>>>> First, how overwritten-diffs.txt was created? If it is created by using 'hg' to find all changes done in OpenJDK since last 'Update Graal' then I would understand why changes are in this file. >>>>> >>>>> Next, can you attach to bug report original patch created by updategraalinopenjdk before applying it to OpenJDK sources? Do we have it? This way I can compare overwritten-diffs.txt with it and find if there duplicate and I don't need to re-apply changes. >>>> The mx script applies the patch automatically so I don't have it as a separate patch. 
But the only changes that my script does after the mx script has applied the patch is in the test directory, so the patch available in the webrev should be what the mx script created if you ignore anything in test. >>>>> Some changes were pushed into OpenJDK because we did not want to wait a fix for months - last updated was done 3 months ago!!! If we do more frequent updates overwritten-diffs.txt will be empty or very small. Manual work should not take a lot of time in such case. >>>> Yes, the intention is to do this more often, hopefully on a weekly basis going forward, and that should minimize the need to push to the OpenJDK as you say. I hope this "first" integration is a one-off in that respect. Obviously how often we can integrate depends on how much time the review takes ;-) >>>> /Jesper >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> On 3/11/19 2:45 AM, Doug Simon wrote: >>>>>>> On 11 Mar 2019, at 01:56, jesper.wilhelmsson at oracle.com wrote: >>>>>>> >>>>>>> Hi Vladimir, >>>>>>> >>>>>>> I followed the given instructions and used mx to create the diff: >>>>>>> >>>>>>> mx --java-home=$JAVA_HOME updategraalinopenjdk cleanjdk/open 12 >>>>>>> >>>>>>> I assume JDK-8215322 is in there because it was actually pushed to OpenJDK and now it was overwritten (with the same diff). It seems the mx script do not realize that it's the same cahange. >>>>>> Correct. To detect that we would have to add logic for diffs of diffs and the complexity didn?t seem worth it. I?d more more than happy to review a smart person?s attempt at adding it though ;-) >>>>>> -Doug >>>>>>> The diff and patch and all changes made was done by the new script, so no extra human labor has been put into it at this point. If a human is required to modify the overwritten-diff I suggest that is part of the review step (as now) and that a new patch is simply added to the subtask. >>>>>>> >>>>>>> I have removed JDK-8215322 from the diff now. 
>>>>>>> >>>>>>> Thanks, >>>>>>> /Jesper >>>>>>> >>>>>>>> On 9 Mar 2019, at 18:42, Vladimir Kozlov wrote: >>>>>>>> >>>>>>>> Hi Jesper, >>>>>>>> >>>>>>>> Thank you for doing this. >>>>>>>> >>>>>>>> I looked on 8220387 to see what changes your webrev does not have and it is strange. You should not have such big difference. How you prepared overwritten-diffs.txt? >>>>>>>> >>>>>>>> For example, JAOTC changes "8215322: add @file support to jaotc" are in overwritten-diffs.txt. But changeset is listed in the 8218074 Graal changelog list [1]. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Vladimir >>>>>>>> >>>>>>>> [1] 8433917 Wed Dec 19 16:20:00 2018 -0800 Igor Ignatyev [GR-13142] Add at-file support to jaotc. >>>>>>>> >>>>>>>> On 3/9/19 12:46 AM, jesper.wilhelmsson at oracle.com wrote: >>>>>>>>> Hi, >>>>>>>>> Please review the patch to integrate the latest Graal changes into OpenJDK. >>>>>>>>> Graal tip to integrate: db79f81716886b7883370cd6ea1bbf5c42966fa5 >>>>>>>>> JBS duplicates fixed by this integration: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8217161 >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8218698 >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8218859 >>>>>>>>> JBS duplicates deferred to the next integration: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8214947 >>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8218074 >>>>>>>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8218074/webrev.00/ >>>>>>>>> This integration did overwrite changes already in place in OpenJDK. A subtask was created to hold the diff: https://bugs.openjdk.java.net/browse/JDK-8220387 >>>>>>>>> Thanks, >>>>>>>>> /Jesper >>>>>>>
From igor.veresov at oracle.com Wed Mar 13 02:24:04 2019 From: igor.veresov at oracle.com (Igor Veresov) Date: Tue, 12 Mar 2019 19:24:04 -0700 Subject: RFR: 8211100: hotspot C1 issue with comparing long numbers on x86 32-bit In-Reply-To: References: <659DF4FF-71B9-472D-A064-038ADF2A50FF@oracle.com> <0C5ACDFD-EAA1-4EE0-AD1C-845B0B488680@azul.com> Message-ID: Dmitry, After some digging around I think your original fix is ok. In addition to !_LP64 can you add ifdef X86? igor > On Mar 6, 2019, at 3:07 AM, Dmitry Cherepanov wrote: > > Igor, > > Sorry for the delay in responding. > > I updated comp_op (in c1_LIRAssembler_x86.cpp) to make use of tmp1 for this case. The changes: http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.03/ > > For this change, I got assertion failed (from cpu_regnrLo, in c1_LIR.hpp). Sorry if this is an obvious question - Am I correctly understand that another part of this solution should be an additional change that would allocate tmp1? Or is there an existing code that should take care of it already and just need to enable the allocation of tmp1 for this case? > > Another question: given that this is a major issue on x86 32bit system, would you mind if we proceed with the current minimal/low-risk fix (http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.01/) and create new JBS issue to investigate more generic approach separately? > > Thanks, > > Dmitry > >> On Oct 2, 2018, at 8:09 PM, Igor Veresov wrote: >> >> Right, I forgot how it works. Sorry for the confusion. I think there is no way to explicitly describe a register kill in C1. I guess the only option is to just avoid clobbering opr1. So may be we should make use of tmp1 for lir_cmp to save/restore opr1? Again, tmp1 would have to be allocated only for this particular case.
>> >> igor >> >> >> >>> On Oct 1, 2018, at 7:15 AM, Dmitry Cherepanov wrote: >>> >>> Hi Igor, >>> >>> Thanks for the suggestions. I tried to make the opr1 a temporary >>> >>> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.02/ >>> >>> but the generated code still has the problem. Looking into the log with -XX:TraceLinearScanLevel=4 (http://cr.openjdk.java.net/~dcherepanov/8211100/TraceLinearScanLevel.02.log) seems like the reason for this is that the opr1 (virtual register R165 in the log) is also an input operand and its range becomes wider and the shorter ranges (corresponding to the opr1 marked as temp) are merged to the single range. Can the input operand be temporary at the same time? >>> >>> Dmitry >>> >>>> On Sep 27, 2018, at 2:18 AM, Igor Veresov wrote: >>>> >>>> Edit: It may be more consistent to check for is_double_cpu() instead of T_LONG. Although that's semantically equivalent. >>>> >>>>> On Sep 26, 2018, at 9:35 AM, Igor Veresov wrote: >>>>> >>>>> It doesn't seem to me like the proper way to fix it. The problem is that the cmp is destroying opr1 without telling the register allocator about it. >>>>> >>>>> One possible solution would be to make opr1 also a temp (see LIR_OpVisitState::visit(LIR_Op* op) in c1_LIR.cpp), only for x86 32bit and only if the operand type is T_LONG. >>>>> Another solution is to maintain a temporary register for lir_cmp and use it to save/restore opr1 when emitting the code in LIR_Assembler::comp_op(). Again, the temporary register has to be there only for x86 32bit and T_LONG. >>>>> >>>>> igor >>>>> >>>>> >>>>>> On Sep 26, 2018, at 1:29 AM, Tobias Hartmann wrote: >>>>>> >>>>>> Hi Dmitry, >>>>>> >>>>>> this looks good to me but Igor (who implemented 8201447) should have a look as well.
>>>>>> >>>>>> Best regards, >>>>>> Tobias >>>>>> >>>>>> On 26.09.2018 09:04, Dmitry Cherepanov wrote: >>>>>>> Hi Tobias, >>>>>>> >>>>>>> Thanks for the review, updated patch avoids the additional move on x86_64 and includes the >>>>>>> regression test. >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.01/ >>>>>>> >>>>>>> >>>>>>> Dmitry >>>>>>> >>>>>>>> On Sep 25, 2018, at 6:40 PM, Tobias Hartmann >>>>>>> > wrote: >>>>>>>> >>>>>>>> Hi Dmitry, >>>>>>>> >>>>>>>> Shouldn't this at least be guarded by an #ifndef _LP64 to avoid the additional move on x86_64? >>>>>>>> >>>>>>>> Could you please add the regression test to the webrev? Or did this reproduce with other tests? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Tobias >>>>>>>> >>>>>>>> On 25.09.2018 16:00, Dmitry Cherepanov wrote: >>>>>>>>> Hello, >>>>>>>>> >>>>>>>>> Please review a patch that resolves issue in x86 32bit builds. It slightly adjusts the fix for >>>>>>>>> JDK-8201447 (C1 does backedge profiling incorrectly) by creating a copy of the left operand and >>>>>>>>> using it for incrementing backedge counter. >>>>>>>>> >>>>>>>>> JBS issue: https://bugs.openjdk.java.net/browse/JDK-8211100 >>>>>>>>> webrev: http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.00/ >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Dmitry >>>>> >>>> >>> >> > From stefan.karlsson at oracle.com Wed Mar 13 08:31:19 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 13 Mar 2019 09:31:19 +0100 Subject: RFR: 8220343: Move scavenge_root_nmethods from shared code In-Reply-To: <312e77d3-e624-7deb-509e-89b1ee96124f@oracle.com> References: <312e77d3-e624-7deb-509e-89b1ee96124f@oracle.com> Message-ID: Updates to the patch: http://cr.openjdk.java.net/~stefank/8220343/webrev.03.delta/ http://cr.openjdk.java.net/~stefank/8220343/webrev.03.delta/ - Removes anonymous namespace, which causes link problems on Windows.
- Fixed test that looks for _scavenge_root_nmethods Tested with tier1-3 Thanks, StefanK On 2019-03-11 15:23, Stefan Karlsson wrote: > Hi all, > > Please review this patch to move the scavenge root code out from > CodeCache and nmethod. > > http://cr.openjdk.java.net/~stefank/8220343/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8220343 > > The CodeCache::scavenge_root_nmethods_do function and its implementation > in CodeCache and nmethod, is only used by set of our GCs (Serial, > Parallel, and CMS), but not by G1, ZGC, Shenandoah, or Epsilon. I want > to move all of that into GC subsystem and then only let those GCs using > it pay the cost of having that code. > > This is a continuation of the work of the GC Interface, where G1, ZGC, > Shenandoah, and Epsilon, uses the register_nmethod, unregister_nmethod, > and flush_nmethod calls, but the other GCs don't. > > This patch builds upon: > JDK-8220411: Remove ScavengeRootsInCode=0 code > https://bugs.openjdk.java.net/browse/JDK-8220411 > > and also depends on the resolution of: > JDK-8220342: Remove scavenge_root_nmethods_do from > VM_HeapWalkOperation::collect_simple_roots > https://bugs.openjdk.java.net/browse/JDK-8220342 > > Thanks, > StefanK From rwestrel at redhat.com Wed Mar 13 09:04:32 2019 From: rwestrel at redhat.com (Roland Westrelin) Date: Wed, 13 Mar 2019 10:04:32 +0100 Subject: RFR(S): 8220374: C2: LoopStripMining doesn't strip as expected Message-ID: <87ef7buov3.fsf@redhat.com> http://cr.openjdk.java.net/~roland/8220374/webrev.00/ Fix for 8193597 accidentally broke loop strip mining: that change caused the inner and outer loop to have the same exit condition. So loop strip mining has been effectively disabled for more than a year (a month after it was pushed !?). The fix above correctly sets the inner and outer loop exit condition. Initially, when the loop is strip mined, the inner loop exit condition is left as is and the outer loop exit condition is set to always exit.
After optimizations, the outer loop exit condition must be set to be the same as the inner loop exit condition and then, the inner loop exit condition must be adjusted so the inner loop runs for no more than LoopStripMiningIter iterations. Thanks to Martin Doerr for the test case and the bug report. Roland. From rkennke at redhat.com Wed Mar 13 09:56:25 2019 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 13 Mar 2019 10:56:25 +0100 Subject: RFR(S): 8220374: C2: LoopStripMining doesn't strip as expected In-Reply-To: <87ef7buov3.fsf@redhat.com> References: <87ef7buov3.fsf@redhat.com> Message-ID: Hi Roland, > http://cr.openjdk.java.net/~roland/8220374/webrev.00/ > > Fix for 8193597 accidentally broke loop strip mining: that change caused > the inner and outer loop to have the same exit condition. So loop strip > mining has been effectively disabled for more than a year (a month after > it was pushed !?). Wow! WTF!? > The fix above correctly sets the inner and outer loop exit > condition. Initially, when the loop is strip mined, the inner loop exit > condition is left as is and the outer loop exit condition is set to > always exit. After optimizations, the outer loop exit condition must be > set to be the same as the inner loop exit condition and then, the inner > loop exit condition must be adjusted so the inner loop runs for no more > than LoopStripMiningIter iterations. > > Thanks to Martin Doerr for the test case and the bug report. Looks good to me! Thanks! Roman From claes.redestad at oracle.com Wed Mar 13 11:03:24 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Wed, 13 Mar 2019 12:03:24 +0100 Subject: RFR: 8220502: Inefficient pre-sizing of PhiResolverState arrays in c1_LIRGenerator Message-ID: Hi, the PhiResolverState arrays are pre-sized to the maximum possible number of nodes that can be put into them, which when instrumenting turns out to be quite excessive. 
By not pre-sizing to the theoretical maximum I can measure a substantial improvements in calling methods. Webrev: http://cr.openjdk.java.net/~redestad/8220502/open.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8220502 Testing: tier1-3 On a set of startup and footprint benchmarks, 60% of collected metrics show 0.2-2% statistically significant improvements, with no regression on any metric. Thanks! /Claes From dcherepanov at azul.com Wed Mar 13 12:43:58 2019 From: dcherepanov at azul.com (Dmitry Cherepanov) Date: Wed, 13 Mar 2019 12:43:58 +0000 Subject: RFR: 8211100: hotspot C1 issue with comparing long numbers on x86 32-bit In-Reply-To: References: <659DF4FF-71B9-472D-A064-038ADF2A50FF@oracle.com> <0C5ACDFD-EAA1-4EE0-AD1C-845B0B488680@azul.com> Message-ID: Igor, Updated version of original fix (with ifdef X86 added): http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.04/ Is it okay to push it? Thanks, Dmitry > On Mar 13, 2019, at 5:24 AM, Igor Veresov wrote: > > Dmitry, > > After some digging around I think your original fix is ok. In addition to !_LP64 can you add ifdef X86? > > igor > > > >> On Mar 6, 2019, at 3:07 AM, Dmitry Cherepanov wrote: >> >> Igor, >> >> Sorry for the delay in responding. >> >> I updated comp_op (in c1_LIRAssembler_x86.cpp) to make use of tmp1 for this case. The changes: http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.03/ >> >> For this change, I got assertion failed (from cpu_regnrLo, in c1_LIR.hpp). Sorry if this is an obvious question - Am I correctly understand that another part of this solution should be an additional change that would allocate tmp1? Or is there an existing code that should take care of it already and just need to enable the allocation of tmp1 for this case? 
>> >> Another question: given that this is a major issue on x86 32bit system, would you mind if we proceed with the current minimal/low-risk fix (http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.01/) and create new JBS issue to investigate more generic approach separately? >> >> Thanks, >> >> Dmitry >> >>> On Oct 2, 2018, at 8:09 PM, Igor Veresov wrote: >>> >>> Right, I forgot how it works. Sorry for the confusion. I think there is no way to explicitly describe a register kill in C1. I guess the only option is to just avoid clobbering opr1. So may be we should make use of tmp1 for lir_cmp to save/restore opr1? Again, tmp1 would have to be allocated only for this particular case. >>> >>> igor >>> >>> >>> >>>> On Oct 1, 2018, at 7:15 AM, Dmitry Cherepanov wrote: >>>> >>>> Hi Igor, >>>> >>>> Thanks for the suggestions. I tried to make the opr1 a temporary >>>> >>>> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.02/ >>>> >>>> but the generated code still has the problem. Looking into the log with -XX:TraceLinearScanLevel=4 (http://cr.openjdk.java.net/~dcherepanov/8211100/TraceLinearScanLevel.02.log) seems like the reason for this is that the opr1 (virtual register R165 in the log) is also an input operand and its range becomes wider and the shorter ranges (corresponding to the opr1 marked as temp) are merged to the single range. Can the input operand be temporary at the same time? >>>> >>>> Dmitry >>>> >>>>> On Sep 27, 2018, at 2:18 AM, Igor Veresov wrote: >>>>> >>>>> Edit: It may be more consistent to check for is_double_cpu() instead of T_LONG. Although that's semantically equivalent. >>>>> >>>>>> On Sep 26, 2018, at 9:35 AM, Igor Veresov wrote: >>>>>> >>>>>> It doesn't seem to me like the proper way to fix it. The problem is that the cmp is destroying opr1 without telling the register allocator about it.
>>>>>> >>>>>> One possible solution would be to make opr1 also a temp (see LIR_OpVisitState::visit(LIR_Op* op) in c1_LIR.cpp), only for x86 32bit and only if the operand type is T_LONG. >>>>>> Another solution is to maintain a temporary register for lir_cmp and use it to save/restore opr1 when emitting the code in LIR_Assembler::comp_op(). Again, the temporary register has to be there only for x86 32bit and T_LONG. >>>>>> >>>>>> igor >>>>>> >>>>>> >>>>>>> On Sep 26, 2018, at 1:29 AM, Tobias Hartmann wrote: >>>>>>> >>>>>>> Hi Dmitry, >>>>>>> >>>>>>> this looks good to me but Igor (who implemented 8201447) should have a look as well. >>>>>>> >>>>>>> Best regards, >>>>>>> Tobias >>>>>>> >>>>>>> On 26.09.2018 09:04, Dmitry Cherepanov wrote: >>>>>>>> Hi Tobias, >>>>>>>> >>>>>>>> Thanks for the review, updated patch avoids the additional move on x86_64 and includes the >>>>>>>> regression test. >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.01/ >>>>>>>> >>>>>>>> >>>>>>>> Dmitry >>>>>>>> >>>>>>>>> On Sep 25, 2018, at 6:40 PM, Tobias Hartmann >>>>>>>> > wrote: >>>>>>>>> >>>>>>>>> Hi Dmitry, >>>>>>>>> >>>>>>>>> Shouldn't this at least be guarded by an #ifndef _LP64 to avoid the additional move on x86_64? >>>>>>>>> >>>>>>>>> Could you please add the regression test to the webrev? Or did this reproduce with other tests? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Tobias >>>>>>>>> >>>>>>>>> On 25.09.2018 16:00, Dmitry Cherepanov wrote: >>>>>>>>>> Hello, >>>>>>>>>> >>>>>>>>>> Please review a patch that resolves issue in x86 32bit builds. It slightly adjusts the fix for >>>>>>>>>> JDK-8201447 (C1 does backedge profiling incorrectly) by creating a copy of the left operand and >>>>>>>>>> using it for incrementing backedge counter. 
>>>>>>>>>> >>>>>>>>>> JBS issue: https://bugs.openjdk.java.net/browse/JDK-8211100 >>>>>>>>>> webrev: http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.00/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Dmitry >>>>>> >>>>> >>>> >>> >> > From tobias.hartmann at oracle.com Wed Mar 13 13:58:25 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 13 Mar 2019 14:58:25 +0100 Subject: RFR(S): 8219517: assert(false) failed: infinite loop in PhaseIterGVN::optimize In-Reply-To: <112841f9-8d2a-1404-6a97-98241fe956ec@oracle.com> References: <112841f9-8d2a-1404-6a97-98241fe956ec@oracle.com> Message-ID: Hi Nils, this looks reasonable to me. Small typo: "check second edge to" -> "check second edge too" Best regards, Tobias On 12.03.19 15:42, Nils Eliasson wrote: > I need a second review, > > Regards, > > Nils > > > On 2019-03-08 11:52, Nils Eliasson wrote: >> Hi all, >> >> Background: >> >> We can get stuck in an infinite loop in IGVN. The method reproducing the problem is quite a big >> graph, and after some optimization, a huge loop will die. But since it is so big, it takes a while >> before it has been pruned. >> >> In an inner loop there is a phi on memory that gets reduced to a self looping heart, with a membar >> on each in edge. There is also a connected region that keeps it alive. (From the start there is >> other memory state coming into this loop, but it gets disconnected early when the loop dies.) >>
>> +---+                 +---+
>> |   v                 v   |
>> | Membar +-+ +---+ Membar |
>> |          | |            |
>> |          v v            |
>> |          Phi            |
>> |          + + +          |
>> |          | | |          |
>> +----------+ | +----------+
>>              |
>>              v
>>            LoadN
>> >> In IGVN, Ideal() will be called on the Load. >> >> On iteration 1 - A split_through_phi on one edge will be performed, because we can prove that >> other edge of the phi is a loop.
Now the Load hangs of one of the membars. >> >> On iteration 2 - Optimize_memory_chain will suggest the in to the membar as a more ideal memory, >> and then the load get the phi back as the memory input. >> >> Repeat. >> >> I have gone great lengths to show that this code is part of a huge loop, that is dead, and will be >> eliminated in due time. >> >> My suggested solution to breaking the infinite loop, is to change the first case, by simply not >> perform the memory replacement when both inputs are self loops. >> >> https://bugs.openjdk.java.net/browse/JDK-8219517 >> >> http://cr.openjdk.java.net/~neliasso/8219517/webrev.01/ >> >> Regards, >> >> Nils >> >> >> >> From tobias.hartmann at oracle.com Wed Mar 13 14:00:35 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 13 Mar 2019 15:00:35 +0100 Subject: RFR: 8220502: Inefficient pre-sizing of PhiResolverState arrays in c1_LIRGenerator In-Reply-To: References: Message-ID: <17cd5016-ceb1-9587-4b34-d71dd9e6bf99@oracle.com> Hi Claes, very nice, reviewed. Best regards, Tobias On 13.03.19 12:03, Claes Redestad wrote: > Hi, > > the PhiResolverState arrays are pre-sized to the maximum possible number > of nodes that can be put into them, which when instrumenting turns out > to be quite excessive. By not pre-sizing to the theoretical maximum I > can measure a substantial improvements in calling methods. > > Webrev: http://cr.openjdk.java.net/~redestad/8220502/open.00/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8220502 > > Testing: tier1-3 > > On a set of startup and footprint benchmarks, 60% of collected metrics > show 0.2-2% statistically significant improvements, with no > regression on any metric. > > Thanks!
> > /Claes From tobias.hartmann at oracle.com Wed Mar 13 14:09:53 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 13 Mar 2019 15:09:53 +0100 Subject: RFR(S): 8220374: C2: LoopStripMining doesn't strip as expected In-Reply-To: <87ef7buov3.fsf@redhat.com> References: <87ef7buov3.fsf@redhat.com> Message-ID: <42027540-aa62-0b10-c0ee-d98e8d6a5ece@oracle.com> Hi Roland, On 13.03.19 10:04, Roland Westrelin wrote: > http://cr.openjdk.java.net/~roland/8220374/webrev.00/ > > Fix for 8193597 accidentally broke loop strip mining: that change caused > the inner and outer loop to have the same exit condition. So loop strip > mining has been effectively disabled for more than a year (a month after > it was pushed !?). Oh wow. We should probably mark this for a backport then, right? > The fix above correctly sets the inner and outer loop exit > condition. Initially, when the loop is strip mined, the inner loop exit > condition is left as is and the outer loop exit condition is set to > always exit. After optimizations, the outer loop exit condition must be > set to be the same as the inner loop exit condition and then, the inner > loop exit condition must be adjusted so the inner loop runs for no more > than LoopStripMiningIter iterations. Looks reasonable to me. > Thanks to Martin Doerr for the test case and the bug report. Have you tested with a release build? Because SafepointALot is a debug flag and will probably fail. Also, please add some newlines to the @run statement. 
Thanks, Tobias From claes.redestad at oracle.com Wed Mar 13 14:11:39 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Wed, 13 Mar 2019 15:11:39 +0100 Subject: RFR: 8220502: Inefficient pre-sizing of PhiResolverState arrays in c1_LIRGenerator In-Reply-To: <17cd5016-ceb1-9587-4b34-d71dd9e6bf99@oracle.com> References: <17cd5016-ceb1-9587-4b34-d71dd9e6bf99@oracle.com> Message-ID: <9e4e2b75-52eb-10d2-41c7-c8b81a7c97e8@oracle.com> On 2019-03-13 15:00, Tobias Hartmann wrote: > Hi Claes, > > very nice, reviewed. Thanks, Tobias! /Claes From nils.eliasson at oracle.com Wed Mar 13 14:19:33 2019 From: nils.eliasson at oracle.com (Nils Eliasson) Date: Wed, 13 Mar 2019 15:19:33 +0100 Subject: RFR(S): 8219517: assert(false) failed: infinite loop in PhaseIterGVN::optimize In-Reply-To: References: <112841f9-8d2a-1404-6a97-98241fe956ec@oracle.com> Message-ID: <58cabb89-6803-ad8a-b851-77a14c887d42@oracle.com> I'll fix that. Thanks Tobias, // Nils On 2019-03-13 14:58, Tobias Hartmann wrote: > Hi Nils, > > this looks reasonable to me. Small typo: > "check second edge to" -> "check second edge too" > > Best regards, > Tobias > > On 12.03.19 15:42, Nils Eliasson wrote: >> I need a second review, >> >> Regards, >> >> Nils >> >> >> On 2019-03-08 11:52, Nils Eliasson wrote: >>> Hi all, >>> >>> Background: >>> >>> We can get stuck in an infinite loop in IGVN. The method reproducing the problem is quite a big >>> graph, and after some optimization, a huge loop will die. But since it is so big, it takes a while >>> before it has been pruned. >>> >>> In an inner loop there is a phi on memory that gets reduced to a self looping heart, with a membar >>> on each in edge. There is also a connected region that keeps it alive. (From the start there is >>> other memory state coming into this loop, but it gets disconnected early when the loop dies.) >>>
>>> +---+                 +---+
>>> |   v                 v   |
>>> | Membar +-+ +---+ Membar |
>>> |          | |            |
>>> |          v v            |
>>> |          Phi            |
>>> |          + + +          |
>>> |          | | |          |
>>> +----------+ | +----------+
>>>              |
>>>              v
>>>            LoadN
>>> >>> In IGVN, Ideal() will be called on the Load. >>> >>> On iteration 1 - A split_through_phi on one edge will be performed, because we can prove that >>> other edge of the phi is a loop. Now the Load hangs of one of the membars. >>> >>> On iteration 2 - Optimize_memory_chain will suggest the in to the membar as a more ideal memory, >>> and then the load get the phi back as the memory input. >>> >>> Repeat. >>> >>> I have gone great lengths to show that this code is part of a huge loop, that is dead, and will be >>> eliminated in due time. >>> >>> My suggested solution to breaking the infinite loop, is to change the first case, by simply not >>> perform the memory replacement when both inputs are self loops. >>> >>> https://bugs.openjdk.java.net/browse/JDK-8219517 >>> >>> http://cr.openjdk.java.net/~neliasso/8219517/webrev.01/ >>> >>> Regards, >>> >>> Nils >>> >>> >>> >>> From martin.doerr at sap.com Wed Mar 13 15:44:47 2019 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 13 Mar 2019 15:44:47 +0000 Subject: RFR(S): 8220374: C2: LoopStripMining doesn't strip as expected In-Reply-To: <42027540-aa62-0b10-c0ee-d98e8d6a5ece@oracle.com> References: <87ef7buov3.fsf@redhat.com> <42027540-aa62-0b10-c0ee-d98e8d6a5ece@oracle.com> Message-ID: Hi Roland, > Oh wow. We should probably mark this for a backport then, right? Makes sense. I suggest to backport JDK-8219584, too. It's needed for the test to work correctly. > Have you tested with a release build? Because SafepointALot is a debug flag and will probably fail. It has been changed to develop by the change mentioned above. The local variable "outer_cmp" is unused and may possibly cause build warnings. Besides that, the fix looks good to me. Thanks for fixing it so quickly.
Best regards, Martin -----Original Message----- From: hotspot-compiler-dev On Behalf Of Tobias Hartmann Sent: Wednesday, March 13, 2019 15:10 To: Roland Westrelin ; hotspot-compiler-dev at openjdk.java.net Subject: Re: RFR(S): 8220374: C2: LoopStripMining doesn't strip as expected Hi Roland, On 13.03.19 10:04, Roland Westrelin wrote: > http://cr.openjdk.java.net/~roland/8220374/webrev.00/ > > Fix for 8193597 accidentally broke loop strip mining: that change caused > the inner and outer loop to have the same exit condition. So loop strip > mining has been effectively disabled for more than a year (a month after > it was pushed !?). Oh wow. We should probably mark this for a backport then, right? > The fix above correctly sets the inner and outer loop exit > condition. Initially, when the loop is strip mined, the inner loop exit > condition is left as is and the outer loop exit condition is set to > always exit. After optimizations, the outer loop exit condition must be > set to be the same as the inner loop exit condition and then, the inner > loop exit condition must be adjusted so the inner loop runs for no more > than LoopStripMiningIter iterations. Looks reasonable to me. > Thanks to Martin Doerr for the test case and the bug report. Have you tested with a release build? Because SafepointALot is a debug flag and will probably fail. Also, please add some newlines to the @run statement. Thanks, Tobias From vladimir.kozlov at oracle.com Wed Mar 13 17:10:14 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 13 Mar 2019 10:10:14 -0700 Subject: RFR: 8220343: Move scavenge_root_nmethods from shared code In-Reply-To: References: <312e77d3-e624-7deb-509e-89b1ee96124f@oracle.com> Message-ID: <79bee740-9710-b9c8-2421-40695a6314c3@oracle.com> New webrev link http://cr.openjdk.java.net/~stefank/8220343/webrev.03 Thank you for testing, Stefan. Compiler changes look fine to me.
Thanks, Vladimir On 3/13/19 1:31 AM, Stefan Karlsson wrote: > Updates to the patch: > http://cr.openjdk.java.net/~stefank/8220343/webrev.03.delta/ > http://cr.openjdk.java.net/~stefank/8220343/webrev.03.delta/ > > - Removes anonymous namespace, which causes link problems on Windows. > - Fixed test that looks for _scavenge_root_nmethods > > Tested with tier1-3 > > Thanks, > StefanK > > On 2019-03-11 15:23, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to move the scavenge root code out from CodeCache and nmethod. >> >> http://cr.openjdk.java.net/~stefank/8220343/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8220343 >> >> The CodeCache::scavenge_root_nmethods_do function and its implementation in CodeCache and nmethod, >> is only used by set of our GCs (Serial, Parallel, and CMS), but not by G1, ZGC, Shenandoah, or >> Epsilon. I want to move all of that into GC subsystem and then only let those GCs using it pay the >> cost of having that code. >> >> This is a continuation of the work of the GC Interface, where G1, ZGC, Shenandoah, and Epsilon, >> uses the register_nmethod, unregister_nmethod, and flush_nmethod calls, but the other GCs don't.
>> >> This patch builds upon: >> JDK-8220411: Remove ScavengeRootsInCode=0 code >> https://bugs.openjdk.java.net/browse/JDK-8220411 >> >> and also depends on the resolution of: >> JDK-8220342: Remove scavenge_root_nmethods_do from VM_HeapWalkOperation::collect_simple_roots >> https://bugs.openjdk.java.net/browse/JDK-8220342 >> >> Thanks, >> StefanK From stefan.karlsson at oracle.com Wed Mar 13 17:11:44 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 13 Mar 2019 18:11:44 +0100 Subject: RFR: 8220343: Move scavenge_root_nmethods from shared code In-Reply-To: <79bee740-9710-b9c8-2421-40695a6314c3@oracle.com> References: <312e77d3-e624-7deb-509e-89b1ee96124f@oracle.com> <79bee740-9710-b9c8-2421-40695a6314c3@oracle.com> Message-ID: <56ddf653-ec13-3a23-dcbc-79d664fe077a@oracle.com> Thanks for fixing the link and reviewing! StefanK On 2019-03-13 18:10, Vladimir Kozlov wrote: > New webrev link > http://cr.openjdk.java.net/~stefank/8220343/webrev.03 > > Thank you for testing, Stefan. Compiler changes look fine to me. > > Thanks, > Vladimir > > On 3/13/19 1:31 AM, Stefan Karlsson wrote: >> Updates to the patch: >> http://cr.openjdk.java.net/~stefank/8220343/webrev.03.delta/ >> http://cr.openjdk.java.net/~stefank/8220343/webrev.03.delta/ >> >> - Removes anonymous namespace, which causes link problems on Windows. >> - Fixed test that looks for _scavenge_root_nmethods >> >> Tested with tier1-3 >> >> Thanks, >> StefanK >> >> On 2019-03-11 15:23, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please review this patch to move the scavenge root code out from >>> CodeCache and nmethod. >>> >>> http://cr.openjdk.java.net/~stefank/8220343/webrev.01/ >>> https://bugs.openjdk.java.net/browse/JDK-8220343 >>> >>> The CodeCache::scavenge_root_nmethods_do function and its >>> implementation in CodeCache and nmethod, is only used by set of our >>> GCs (Serial, Parallel, and CMS), but not by G1, ZGC, Shenandoah, or >>> Epsilon.
I want to move all of that into GC subsystem and then only >>> let those GCs using it pay the cost of having that code. >>> >>> This is a continuation of the work of the GC Interface, where G1, >>> ZGC, Shenandoah, and Epsilon, uses the register_nmethod, >>> unregister_nmethod, and flush_nmethod calls, but the other GCs don't. >>> >>> This patch builds upon: >>> JDK-8220411: Remove ScavengeRootsInCode=0 code >>> https://bugs.openjdk.java.net/browse/JDK-8220411 >>> >>> and also depends on the resolution of: >>> JDK-8220342: Remove scavenge_root_nmethods_do from >>> VM_HeapWalkOperation::collect_simple_roots >>> https://bugs.openjdk.java.net/browse/JDK-8220342 >>> >>> Thanks, >>> StefanK From sgehwolf at redhat.com Wed Mar 13 17:29:00 2019 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Wed, 13 Mar 2019 18:29:00 +0100 Subject: RFR(S): 8220374: C2: LoopStripMining doesn't strip as expected In-Reply-To: <42027540-aa62-0b10-c0ee-d98e8d6a5ece@oracle.com> References: <87ef7buov3.fsf@redhat.com> <42027540-aa62-0b10-c0ee-d98e8d6a5ece@oracle.com> Message-ID: On Wed, 2019-03-13 at 15:09 +0100, Tobias Hartmann wrote: > Have you tested with a release build? Because SafepointALot is a > debug flag and will probably fail. No. SafepointALot is a diagnostic VM option and the test has -XX:+UnlockDiagnosticVMOptions. It should be fine. Thanks, Severin From vladimir.kozlov at oracle.com Wed Mar 13 18:15:30 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 13 Mar 2019 11:15:30 -0700 Subject: RFR: 8220502: Inefficient pre-sizing of PhiResolverState arrays in c1_LIRGenerator In-Reply-To: References: Message-ID: By default GrowableArray allocates 2 elements. Looking at the C1 code I see that you can indeed benefit from not resizing by default the _virtual_operands and _vreg_table arrays. But _other_operands may need to be resized to the actual number of registers, I think. But it needs to be verified.
Also the code is used only for Phi nodes, as I understand, that is why it may not have big impact regardless resizing. In general I agree with these changes. Thanks, Vladimir On 3/13/19 4:03 AM, Claes Redestad wrote: > Hi, > > the PhiResolverState arrays are pre-sized to the maximum possible number > of nodes that can be put into them, which when instrumenting turns out > to be quite excessive. By not pre-sizing to the theoretical maximum I > can measure a substantial improvements in calling methods. > > Webrev: http://cr.openjdk.java.net/~redestad/8220502/open.00/ > Bug:??? https://bugs.openjdk.java.net/browse/JDK-8220502 > > Testing: tier1-3 > > On a set of startup and footprint benchmarks, 60% of collected metrics > show 0.2-2% statistically significant improvements, with no > regression on any metric. > > Thanks! > > /Claes From claes.redestad at oracle.com Wed Mar 13 19:22:21 2019 From: claes.redestad at oracle.com (Claes Redestad) Date: Wed, 13 Mar 2019 20:22:21 +0100 Subject: RFR: 8220502: Inefficient pre-sizing of PhiResolverState arrays in c1_LIRGenerator In-Reply-To: References: Message-ID: On 2019-03-13 19:15, Vladimir Kozlov wrote: > By default GrowableArray allocate 2 elements. > > Looking on C1 code I see that it indeed you can benefit to not resizing > by default _virtual_operands and _vreg_table arrays. But _other_operands > may need to be resized to actual registers numbers I think. But it needs > to be verified. Also the code is used only for Phi nodes, as I > understand, that is why it may not have big impact regardless resizing. Only used for Phi nodes, yes. When instrumenting before/after, we do less than a third as many calls to ::grow in create_node than we did from PhiResolverState::reset before the patch. Cost of LIRGenerator::move_to_phi, which spans both all ::reset and all ::create_node, drops ~72%. > > In general I agree with these changes. Thanks! 
/Claes From vladimir.kozlov at oracle.com Wed Mar 13 23:31:16 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 13 Mar 2019 16:31:16 -0700 Subject: RFR: 8220502: Inefficient pre-sizing of PhiResolverState arrays in c1_LIRGenerator In-Reply-To: References: Message-ID: Claes, did experiment with reserving more space only for _other_operands(LIR_OprDesc::vreg_base). It showed that "slight reduction in ::grow calls, but overall more instructions per operation, LIRGenerator::move_to_phi". Based on this I think changes are good. Reviewed. Thanks, Vladimir On 3/13/19 12:22 PM, Claes Redestad wrote: > On 2019-03-13 19:15, Vladimir Kozlov wrote: >> By default GrowableArray allocate 2 elements. >> >> Looking on C1 code I see that it indeed you can benefit to not resizing by default >> _virtual_operands and _vreg_table arrays. But _other_operands may need to be resized to actual >> registers numbers I think. But it needs to be verified. Also the code is used only for Phi nodes, >> as I understand, that is why it may not have big impact regardless resizing. > > Only used for Phi nodes, yes. > > When instrumenting before/after, we do less than a third as many calls > to ::grow in create_node than we did from PhiResolverState::reset > before the patch. Cost of LIRGenerator::move_to_phi, which spans both all ::reset and all > ::create_node, drops ~72%. > >> >> In general I agree with these changes. > > Thanks! > > /Claes From erik.osterlund at oracle.com Thu Mar 14 06:51:12 2019 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Thu, 14 Mar 2019 07:51:12 +0100 Subject: RFR: 8220343: Move scavenge_root_nmethods from shared code In-Reply-To: References: <312e77d3-e624-7deb-509e-89b1ee96124f@oracle.com> Message-ID: <2182503D-E699-4199-87A1-33BE1CA7715F@oracle.com> Hi Stefan, Looks good. 
/Erik

> On 13 Mar 2019, at 09:31, Stefan Karlsson wrote:
>
> Updates to the patch:
> http://cr.openjdk.java.net/~stefank/8220343/webrev.03.delta/
> http://cr.openjdk.java.net/~stefank/8220343/webrev.03.delta/
>
> - Removes anonymous namespace, which causes link problems on Windows.
> - Fixed test that looks for _scavenge_root_nmethods
>
> Tested with tier1-3
>
> Thanks,
> StefanK
>
>> On 2019-03-11 15:23, Stefan Karlsson wrote:
>> Hi all,
>> Please review this patch to move the scavenge root code out from CodeCache and nmethod.
>> http://cr.openjdk.java.net/~stefank/8220343/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8220343
>> The CodeCache::scavenge_root_nmethods_do function and its implementation in CodeCache and nmethod, is only used by set of our GCs (Serial, Parallel, and CMS), but not by G1, ZGC, Shenandoah, or Epsilon. I want to move all of that into GC subsystem and then only let those GCs using it pay the cost of having that code.
>> This is a continuation of the work of the GC Interface, where G1, ZGC, Shenandoah, and Epsilon, uses the register_nmethod, unregister_nmethod, and flush_nmethod calls, but the other GCs don't.
>> This patch builds upon:
>> JDK-8220411: Remove ScavengeRootsInCode=0 code
>> https://bugs.openjdk.java.net/browse/JDK-8220411
>> and also depends on the resolution of:
>> JDK-8220342: Remove scavenge_root_nmethods_do from VM_HeapWalkOperation::collect_simple_roots
>> https://bugs.openjdk.java.net/browse/JDK-8220342
>> Thanks,
>> StefanK

From tobias.hartmann at oracle.com Thu Mar 14 08:11:20 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 14 Mar 2019 09:11:20 +0100
Subject: RFR(S): 8220374: C2: LoopStripMining doesn't strip as expected
In-Reply-To:
References: <87ef7buov3.fsf@redhat.com>
 <42027540-aa62-0b10-c0ee-d98e8d6a5ece@oracle.com>
Message-ID:

On 13.03.19 18:29, Severin Gehwolf wrote:
> No. SafepointALot is a diagnostic VM option and the test has
> -XX:+UnlockDiagnosticVMOptions.
> It should be fine.

Yes, that changed with JDK-8219584 (I was looking at slightly outdated code).

Best regards,
Tobias

From rahul.v.raghavan at oracle.com Thu Mar 14 08:24:11 2019
From: rahul.v.raghavan at oracle.com (Rahul Raghavan)
Date: Thu, 14 Mar 2019 13:54:11 +0530
Subject: [13] RFR: 8202414: Unsafe write after primitive array creation may result in array length change
Message-ID: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com>

Hi,

Please review the following fix proposal for JDK-8202414.

Webrev - http://cr.openjdk.java.net/~rraghavan/8202414/webrev.00/

-- Related links
> https://bugs.openjdk.java.net/browse/JDK-8202414
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-September/030536.html

-- As per suggestions in JBS added following change in InitializeNode::can_capture_store() to return false for unaligned stores.

=============
diff -r 3086f9259e97 src/hotspot/share/opto/memnode.cpp
--- a/src/hotspot/share/opto/memnode.cpp Wed Mar 13 00:48:52 2019 -0400
+++ b/src/hotspot/share/opto/memnode.cpp Wed Mar 13 19:50:07 2019 +0530
@@ -3541,7 +3541,7 @@
 // within the initialized memory.
 intptr_t InitializeNode::can_capture_store(StoreNode* st, PhaseTransform* phase, bool can_reshape) {
   const int FAIL = 0;
-  if (st->is_unaligned_access()) {
+  if (st->is_unaligned_access() || ((get_store_offset(st, phase) % BytesPerInt) != 0)) {
     return FAIL;
   }
   if (st->req() != MemNode::ValueIn + 1)
==============

-- Added the new jtreg test from the JBS unit test.
(test/hotspot/jtreg/compiler/c2/Test8202414.java)
Understood the test with unaligned access will not work for Sparc due to hardware restrictions. The test always fails with SIGBUS crash, with or without above fix.
So added @requires (os.arch != "sparc") & (os.arch != "sparcv9")

-- Confirmed the above change solved the original reported 8202414 test case failure.
Also no issues so far for hs-tier1 to tier4, hs-precheckin-comp testing.

-- Could not work out any related additions in LibraryCallKit::inline_unsafe_access().
Hope above fix proposal is correct, complete solution for the issue.

Thanks,
Rahul

From tobias.hartmann at oracle.com Thu Mar 14 10:33:19 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 14 Mar 2019 11:33:19 +0100
Subject: [13] RFR(T): 8220611: compiler/classUnloading/methodUnloading/TestOverloadCompileQueues.java timeout
Message-ID:

Hi,

please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8220611
http://cr.openjdk.java.net/~thartmann/8220611/webrev.00/

The test should be skipped if Graal is used as JIT because compilation thresholds are adjusted such that execution of Graal is extremely slow (Graal needs to compile itself).

Thanks,
Tobias

From stefan.karlsson at oracle.com Thu Mar 14 10:36:37 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 14 Mar 2019 11:36:37 +0100
Subject: RFR: 8220343: Move scavenge_root_nmethods from shared code
In-Reply-To: <312e77d3-e624-7deb-509e-89b1ee96124f@oracle.com>
References: <312e77d3-e624-7deb-509e-89b1ee96124f@oracle.com>
Message-ID:

New webrevs:
http://cr.openjdk.java.net/~stefank/8220343/webrev.04.delta
http://cr.openjdk.java.net/~stefank/8220343/webrev.04

The call to prune the scavengable nmethods were inadvertently moved to the old location of CodeCache::gc_prolog call, but should have been moved to the old location of the CodeCache::gc_epilogue call. This does not affect the functionality, but delays the pruning until next full or young GC.

StefanK

On 2019-03-11 15:23, Stefan Karlsson wrote:
> Hi all,
>
> Please review this patch to move the scavenge root code out from
> CodeCache and nmethod.
>
> http://cr.openjdk.java.net/~stefank/8220343/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8220343
>
> The CodeCache::scavenge_root_nmethods_do function and its implementation
> in CodeCache and nmethod, is only used by set of our GCs (Serial,
> Parallel, and CMS), but not by G1, ZGC, Shenandoah, or Epsilon. I want
> to move all of that into GC subsystem and then only let those GCs using
> it pay the cost of having that code.
>
> This is a continuation of the work of the GC Interface, where G1, ZGC,
> Shenandoah, and Epsilon, uses the register_nmethod, unregister_nmethod,
> and flush_nmethod calls, but the other GCs don't.
>
> This patch builds upon:
>  JDK-8220411: Remove ScavengeRootsInCode=0 code
>  https://bugs.openjdk.java.net/browse/JDK-8220411
>
> and also depends on the resolution of:
>  JDK-8220342: Remove scavenge_root_nmethods_do from
> VM_HeapWalkOperation::collect_simple_roots
>  https://bugs.openjdk.java.net/browse/JDK-8220342
>
> Thanks,
> StefanK

From rwestrel at redhat.com Thu Mar 14 12:21:43 2019
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 14 Mar 2019 13:21:43 +0100
Subject: RFR(S): 8220374: C2: LoopStripMining doesn't strip as expected
In-Reply-To: <42027540-aa62-0b10-c0ee-d98e8d6a5ece@oracle.com>
References: <87ef7buov3.fsf@redhat.com>
 <42027540-aa62-0b10-c0ee-d98e8d6a5ece@oracle.com>
Message-ID: <87wol1tzmw.fsf@redhat.com>

Hi Tobias,

Thanks for the review.

> Oh wow. We should probably mark this for a backport then, right?

Yes. What tag do you use?

> Also, please add some newlines to the @run statement.

I will do that before I push it.

Roland.

From rwestrel at redhat.com Thu Mar 14 12:22:04 2019
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 14 Mar 2019 13:22:04 +0100
Subject: RFR(S): 8220374: C2: LoopStripMining doesn't strip as expected
In-Reply-To:
References: <87ef7buov3.fsf@redhat.com>
Message-ID: <87tvg5tzmb.fsf@redhat.com>

Thanks for the review, Roman.

Roland.
From rwestrel at redhat.com Thu Mar 14 12:22:49 2019
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 14 Mar 2019 13:22:49 +0100
Subject: RFR(S): 8220374: C2: LoopStripMining doesn't strip as expected
In-Reply-To:
References: <87ef7buov3.fsf@redhat.com>
 <42027540-aa62-0b10-c0ee-d98e8d6a5ece@oracle.com>
Message-ID: <87r2b9tzl2.fsf@redhat.com>

Hi Martin,

> The local variable "outer_cmp" is unused and may possibly cause build warnings.

I removed it. Thanks for the review.

Roland.

From shade at redhat.com Thu Mar 14 12:25:21 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 14 Mar 2019 13:25:21 +0100
Subject: RFR(S): 8220374: C2: LoopStripMining doesn't strip as expected
In-Reply-To: <87wol1tzmw.fsf@redhat.com>
References: <87ef7buov3.fsf@redhat.com>
 <42027540-aa62-0b10-c0ee-d98e8d6a5ece@oracle.com>
 <87wol1tzmw.fsf@redhat.com>
Message-ID: <8b0fc2cf-d5df-60a8-59f4-385655e19a9b@redhat.com>

On 3/14/19 1:21 PM, Roland Westrelin wrote:
>> Oh wow. We should probably mark this for a backport then, right?
>
> Yes. What tag do you use?

Given that the issue is marked with redhat-openjdk, it is on our radar, and we would handle through the JDK Updates process. (It would require trial applies to 11u and 12u, jdk12u-fix-request and jdk11u-fix-request tags, figuring out testing, etc. -- not as simple as putting the tag)

-Aleksey

From erik.osterlund at oracle.com Thu Mar 14 12:34:42 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Thu, 14 Mar 2019 13:34:42 +0100
Subject: RFR: 8220343: Move scavenge_root_nmethods from shared code
In-Reply-To:
References: <312e77d3-e624-7deb-509e-89b1ee96124f@oracle.com>
Message-ID: <832798cc-4e10-db89-6f46-478a27dae6b4@oracle.com>

Hi Stefan,

Looks good.
/Erik

On 2019-03-14 11:36, Stefan Karlsson wrote:
> New webrevs:
>  http://cr.openjdk.java.net/~stefank/8220343/webrev.04.delta
>  http://cr.openjdk.java.net/~stefank/8220343/webrev.04
>
> The call to prune the scavengable nmethods were inadvertently moved to
> the old location of CodeCache::gc_prolog call, but should have been
> moved to the old location of the CodeCache::gc_epilogue call. This
> does not affect the functionality, but delays the pruning until next
> full or young GC.
>
> StefanK
>
> On 2019-03-11 15:23, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to move the scavenge root code out from
>> CodeCache and nmethod.
>>
>> http://cr.openjdk.java.net/~stefank/8220343/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8220343
>>
>> The CodeCache::scavenge_root_nmethods_do function and its
>> implementation in CodeCache and nmethod, is only used by set of our
>> GCs (Serial, Parallel, and CMS), but not by G1, ZGC, Shenandoah, or
>> Epsilon. I want to move all of that into GC subsystem and then only
>> let those GCs using it pay the cost of having that code.
>>
>> This is a continuation of the work of the GC Interface, where G1,
>> ZGC, Shenandoah, and Epsilon, uses the register_nmethod,
>> unregister_nmethod, and flush_nmethod calls, but the other GCs don't.
>>
>> This patch builds upon:
>>   JDK-8220411: Remove ScavengeRootsInCode=0 code
>>   https://bugs.openjdk.java.net/browse/JDK-8220411
>>
>> and also depends on the resolution of:
>>   JDK-8220342: Remove scavenge_root_nmethods_do from
>> VM_HeapWalkOperation::collect_simple_roots
>>   https://bugs.openjdk.java.net/browse/JDK-8220342
>>
>> Thanks,
>> StefanK

From stefan.karlsson at oracle.com Thu Mar 14 12:35:44 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 14 Mar 2019 13:35:44 +0100
Subject: RFR: 8220343: Move scavenge_root_nmethods from shared code
In-Reply-To: <832798cc-4e10-db89-6f46-478a27dae6b4@oracle.com>
References: <312e77d3-e624-7deb-509e-89b1ee96124f@oracle.com>
 <832798cc-4e10-db89-6f46-478a27dae6b4@oracle.com>
Message-ID: <2f8a779a-d5a5-2982-e716-54e5946d2566@oracle.com>

Thanks, Erik.

StefanK

On 2019-03-14 13:34, Erik Österlund wrote:
> Hi Stefan,
>
> Looks good.
>
> /Erik
>
> On 2019-03-14 11:36, Stefan Karlsson wrote:
>> New webrevs:
>>  http://cr.openjdk.java.net/~stefank/8220343/webrev.04.delta
>>  http://cr.openjdk.java.net/~stefank/8220343/webrev.04
>>
>> The call to prune the scavengable nmethods were inadvertently moved to
>> the old location of CodeCache::gc_prolog call, but should have been
>> moved to the old location of the CodeCache::gc_epilogue call. This
>> does not affect the functionality, but delays the pruning until next
>> full or young GC.
>>
>> StefanK
>>
>> On 2019-03-11 15:23, Stefan Karlsson wrote:
>>> Hi all,
>>>
>>> Please review this patch to move the scavenge root code out from
>>> CodeCache and nmethod.
>>>
>>> http://cr.openjdk.java.net/~stefank/8220343/webrev.01/
>>> https://bugs.openjdk.java.net/browse/JDK-8220343
>>>
>>> The CodeCache::scavenge_root_nmethods_do function and its
>>> implementation in CodeCache and nmethod, is only used by set of our
>>> GCs (Serial, Parallel, and CMS), but not by G1, ZGC, Shenandoah, or
>>> Epsilon.
>>> I want to move all of that into GC subsystem and then only
>>> let those GCs using it pay the cost of having that code.
>>>
>>> This is a continuation of the work of the GC Interface, where G1,
>>> ZGC, Shenandoah, and Epsilon, uses the register_nmethod,
>>> unregister_nmethod, and flush_nmethod calls, but the other GCs don't.
>>>
>>> This patch builds upon:
>>>   JDK-8220411: Remove ScavengeRootsInCode=0 code
>>>   https://bugs.openjdk.java.net/browse/JDK-8220411
>>>
>>> and also depends on the resolution of:
>>>   JDK-8220342: Remove scavenge_root_nmethods_do from
>>> VM_HeapWalkOperation::collect_simple_roots
>>>   https://bugs.openjdk.java.net/browse/JDK-8220342
>>>
>>> Thanks,
>>> StefanK
>

From vladimir.kozlov at oracle.com Thu Mar 14 17:08:30 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 14 Mar 2019 10:08:30 -0700
Subject: [13] RFR(T): 8220611: compiler/classUnloading/methodUnloading/TestOverloadCompileQueues.java timeout
In-Reply-To:
References:
Message-ID:

Looks good and trivial.

Thanks,
Vladimir

On 3/14/19 3:33 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8220611
> http://cr.openjdk.java.net/~thartmann/8220611/webrev.00/
>
> The test should be skipped if Graal is used as JIT because compilation thresholds are adjusted such
> that execution of Graal is extremely slow (Graal needs to compile itself).
>
> Thanks,
> Tobias
>

From igor.ignatyev at oracle.com Thu Mar 14 23:03:56 2019
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Thu, 14 Mar 2019 16:03:56 -0700
Subject: RFR(T) : 8220689 : problem list RandomCommandsTest in graal runs
Message-ID: <0AB325D9-8CB1-4FCE-9FFF-53C64F787A62@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8220689/webrev.00
> 1 line changed: 1 ins; 0 del; 0 mod;

Hi all,

could you please review this trivial one-liner which puts RandomCommandsTest in graal specific problem list?

webrev: http://cr.openjdk.java.net/~iignatyev//8220689/webrev.00
JBS: https://bugs.openjdk.java.net/browse/JDK-8220689

Thanks,
-- Igor

From vladimir.kozlov at oracle.com Thu Mar 14 23:19:45 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 14 Mar 2019 16:19:45 -0700
Subject: RFR(T) : 8220689 : problem list RandomCommandsTest in graal runs
In-Reply-To: <0AB325D9-8CB1-4FCE-9FFF-53C64F787A62@oracle.com>
References: <0AB325D9-8CB1-4FCE-9FFF-53C64F787A62@oracle.com>
Message-ID: <2606ff4a-9084-476a-01bb-28080927bcd2@oracle.com>

Good.

thanks,
Vladimir

On 3/14/19 4:03 PM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8220689/webrev.00
>> 1 line changed: 1 ins; 0 del; 0 mod;
>
> Hi all,
>
> could you please review this trivial one-liner which puts RandomCommandsTest in graal specific problem list?
>
> webrev: http://cr.openjdk.java.net/~iignatyev//8220689/webrev.00
> JBS: https://bugs.openjdk.java.net/browse/JDK-8220689
>
> Thanks,
> -- Igor
>

From tobias.hartmann at oracle.com Fri Mar 15 07:26:16 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 15 Mar 2019 08:26:16 +0100
Subject: [13] RFR(T): 8220611: compiler/classUnloading/methodUnloading/TestOverloadCompileQueues.java timeout
In-Reply-To:
References:
Message-ID:

Thanks Vladimir.

Best regards,
Tobias

On 14.03.19 18:08, Vladimir Kozlov wrote:
> Looks good and trivial.
>
> Thanks,
> Vladimir
>
> On 3/14/19 3:33 AM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch:
>> https://bugs.openjdk.java.net/browse/JDK-8220611
>> http://cr.openjdk.java.net/~thartmann/8220611/webrev.00/
>>
>> The test should be skipped if Graal is used as JIT because compilation thresholds are adjusted such
>> that execution of Graal is extremely slow (Graal needs to compile itself).
>>
>> Thanks,
>> Tobias
>>

From tobias.hartmann at oracle.com Fri Mar 15 10:01:02 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 15 Mar 2019 11:01:02 +0100
Subject: A bug in C2 that causes a large amount of physical memory to be allocated
In-Reply-To:
References:
Message-ID:

Hi Jia Peng,

thanks for reporting this! It seems that the bug was fixed in JDK 12 by JDK-8208677 [1]:
http://hg.openjdk.java.net/jdk/jdk/rev/aa3bfacc912c#l4.7

Coleen, any details here?

Thanks,
Tobias

[1] https://bugs.openjdk.java.net/browse/JDK-8208677

On 15.03.19 10:46, ??? wrote:
> Recently, our company(PerfMa) is troubleshooting a JVM problem for a customer (JDK1.8.0_191-b12), and found that the process is always killed by the OS, which is caused by memory leaks. Finally, it was discovered that OOM is caused by a large amount of memory allocated by C2 thread. This is a bug in C2. The following is the troubleshooting process:
>
> First, through /proc//smaps, I saw a lot of 64MB of memory allocation, and RSS is basically exhausted.
>
> 7fd690000000-7fd693f23000 rw-p 00000000 00:00 0
> Size:              64652 kB
> Rss:               64652 kB
> Pss:               64652 kB
> Shared_Clean:          0 kB
> Shared_Dirty:          0 kB
> Private_Clean:         0 kB
> Private_Dirty:     64652 kB
> Referenced:        64652 kB
> Anonymous:         64652 kB
> AnonHugePages:         0 kB
> Swap:                  0 kB
> KernelPageSize:        4 kB
> MMUPageSize:           4 kB
> Locked:                0 kB
> VmFlags: rd wr mr mw me nr sd
> 7fd693f23000-7fd694000000 ---p 00000000 00:00 0
> Size:                884 kB
> Rss:                   0 kB
> Pss:                   0 kB
> Shared_Clean:          0 kB
> Shared_Dirty:          0 kB
> Private_Clean:         0 kB
> Private_Dirty:         0 kB
> Referenced:            0 kB
> Anonymous:             0 kB
> AnonHugePages:         0 kB
> Swap:                  0 kB
> KernelPageSize:        4 kB
> MMUPageSize:           4 kB
> Locked:                0 kB
> VmFlags: mr mw me nr sd
>
> Then trace the system call through the strace command, combined with the above virtual address, we found the corresponding mmap system call
>
> [pid 71] 13:34:41.982589 mmap(0x7fd690000000, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7fd690000000 <0.000107>
>
> The thread that executes mmap is the 71 thread, so the thread is dumped through jstack, and the corresponding thread is found to be C2 CompilerThread0.
>
> "C2 CompilerThread0" #39 daemon prio=9 os_prio=0 tid=0x00007fd8acebb000 nid=0x47 runnable [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>
> Then grep the output of strace, see a lot of memory were allocated by this thread, probably more than 2G.
>
> C2 is memory-limited under normal circumstances, but why is there a memory consumption greater than 2G? Finally, we found that a bug in the JVM will cause memory leaks. The location of this code is the nmethod::metadata_do method of nmethod.cpp:
>
> void nmethod::metadata_do(void f(Metadata*)) {
>   address low_boundary = verified_entry_point();
>   if (is_not_entrant()) {
>     low_boundary += NativeJump::instruction_size;
>     // %%% Note: On SPARC we patch only a 4-byte trap, not a full NativeJump.
>     // (See comment above.)
>   }
>   {
>     // Visit all immediate references that are embedded in the instruction stream.
>     RelocIterator iter(this, low_boundary);
>     while (iter.next()) {
>       if (iter.type() == relocInfo::metadata_type ) {
>         metadata_Relocation* r = iter.metadata_reloc();
>         // In this metadata, we must only follow those metadatas directly embedded in
>         // the code. Other metadatas (oop_index>0) are seen as part of
>         // the metadata section below.
>         assert(1 == (r->metadata_is_immediate()) +
>                (r->metadata_addr() >= metadata_begin() && r->metadata_addr() < metadata_end()),
>                "metadata must be found in exactly one place");
>         if (r->metadata_is_immediate() && r->metadata_value() != NULL) {
>           Metadata* md = r->metadata_value();
>           if (md != _method) f(md);
>         }
>       } else if (iter.type() == relocInfo::virtual_call_type) {
>         // Check compiledIC holders associated with this nmethod
>         CompiledIC *ic = CompiledIC_at(&iter);
>         if (ic->is_icholder_call()) {
>           CompiledICHolder* cichk = ic->cached_icholder();
>           f(cichk->holder_metadata());
>           f(cichk->holder_klass());
>         } else {
>           Metadata* ic_oop = ic->cached_metadata();
>           if (ic_oop != NULL) {
>             f(ic_oop);
>           }
>         }
>       }
>     }
>   }
>
> Because CompiledIC is a ResourceObj, but it is not freed by ResourceMark, so when this method is called multiple times, OOM will appear.
>
> The fix is very simple, just add ResourceMark rm; before CompiledIC *ic = CompiledIC_at(&iter);
>
> Can this patch be entered into JDK8?
>
> Best,
> - Jia Peng @ PerfMa
>

From nijiaben at perfma.com Fri Mar 15 10:18:45 2019
From: nijiaben at perfma.com (nijiaben)
Date: Fri, 15 Mar 2019 18:18:45 +0800
Subject: A bug in C2 that causes a large amount of physical memory to be allocated
Message-ID:

Hi, All

Recently, I'm troubleshooting a JVM problem (JDK1.8.0_191-b12), and found that the process is always killed by the OS, which is caused by memory leaks. Finally, it was discovered that OOM is caused by a large amount of memory allocated by C2 thread. This is a bug in C2.
The following is the troubleshooting process:

First, through /proc//smaps, I saw a lot of 64MB of memory allocation, and RSS is basically exhausted.

7fd690000000-7fd693f23000 rw-p 00000000 00:00 0
Size:              64652 kB
Rss:               64652 kB
Pss:               64652 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:     64652 kB
Referenced:        64652 kB
Anonymous:         64652 kB
AnonHugePages:         0 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
VmFlags: rd wr mr mw me nr sd
7fd693f23000-7fd694000000 ---p 00000000 00:00 0
Size:                884 kB
Rss:                   0 kB
Pss:                   0 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
Referenced:            0 kB
Anonymous:             0 kB
AnonHugePages:         0 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
VmFlags: mr mw me nr sd

Then trace the system call through the strace command, combined with the above virtual address, we found the corresponding mmap system call

[pid 71] 13:34:41.982589 mmap(0x7fd690000000, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7fd690000000 <0.000107>

The thread that executes mmap is the 71 thread, so the thread is dumped through jstack, and the corresponding thread is found to be C2 CompilerThread0.

"C2 CompilerThread0" #39 daemon prio=9 os_prio=0 tid=0x00007fd8acebb000 nid=0x47 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

Then grep the output of strace, see a lot of memory were allocated by this thread, probably more than 2G.

C2 is memory-limited under normal circumstances, but why is there a memory consumption greater than 2G? Finally, we found that a bug in the JVM will cause memory leaks. The location of this code is the nmethod::metadata_do method of nmethod.cpp:

void nmethod::metadata_do(void f(Metadata*)) {
  address low_boundary = verified_entry_point();
  if (is_not_entrant()) {
    low_boundary += NativeJump::instruction_size;
    // %%% Note: On SPARC we patch only a 4-byte trap, not a full NativeJump.
    // (See comment above.)
  }
  {
    // Visit all immediate references that are embedded in the instruction stream.
    RelocIterator iter(this, low_boundary);
    while (iter.next()) {
      if (iter.type() == relocInfo::metadata_type ) {
        metadata_Relocation* r = iter.metadata_reloc();
        // In this metadata, we must only follow those metadatas directly embedded in
        // the code. Other metadatas (oop_index>0) are seen as part of
        // the metadata section below.
        assert(1 == (r->metadata_is_immediate()) +
               (r->metadata_addr() >= metadata_begin() && r->metadata_addr() < metadata_end()),
               "metadata must be found in exactly one place");
        if (r->metadata_is_immediate() && r->metadata_value() != NULL) {
          Metadata* md = r->metadata_value();
          if (md != _method) f(md);
        }
      } else if (iter.type() == relocInfo::virtual_call_type) {
        // Check compiledIC holders associated with this nmethod
        CompiledIC *ic = CompiledIC_at(&iter);
        if (ic->is_icholder_call()) {
          CompiledICHolder* cichk = ic->cached_icholder();
          f(cichk->holder_metadata());
          f(cichk->holder_klass());
        } else {
          Metadata* ic_oop = ic->cached_metadata();
          if (ic_oop != NULL) {
            f(ic_oop);
          }
        }
      }
    }
  }

Because CompiledIC is a ResourceObj, but it is not freed by ResourceMark, so when this method is called multiple times, OOM will appear.

The fix is very simple, just add ResourceMark rm; before CompiledIC *ic = CompiledIC_at(&iter);

And I found out that it has been solved on JDK12.
http://hg.openjdk.java.net/jdk-updates/jdk12u/rev/aa3bfacc912c

Can this patch be entered into JDK8?

Thanks,
nijiaben @ PerfMa

From tobias.hartmann at oracle.com Fri Mar 15 10:42:24 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 15 Mar 2019 11:42:24 +0100
Subject: A bug in C2 that causes a large amount of physical memory to be allocated
In-Reply-To:
References:
Message-ID:

On 15.03.19 11:08, ??? wrote:
> Thank you, will this patch consider backport to jdk8?

Probably a point fix is more suitable for a backport but I leave this to Coleen / the runtime team to decide.

Best regards,
Tobias

From shade at redhat.com Fri Mar 15 10:45:48 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 15 Mar 2019 11:45:48 +0100
Subject: A bug in C2 that causes a large amount of physical memory to be allocated
In-Reply-To:
References:
Message-ID: <9782702d-af3b-e72a-b369-ee69f9efb064@redhat.com>

On 3/15/19 11:42 AM, Tobias Hartmann wrote:
> On 15.03.19 11:08, ??? wrote:
>> Thank you, will this patch consider backport to jdk8?
>
> Probably a point fix is more suitable for a backport but I leave this to Coleen / the runtime team
> to decide.

What a coincidence! I think my build server OOMs on tests with similar symptoms.

If there are no objections and/or subtle problems with it, I am willing to pick this up for 11u and 8u backports:

$ hg qdiff
diff -r 3086207c8650 src/hotspot/share/code/nmethod.cpp
--- a/src/hotspot/share/code/nmethod.cpp Tue Mar 05 08:24:58 2019 -0500
+++ b/src/hotspot/share/code/nmethod.cpp Fri Mar 15 11:39:27 2019 +0100
@@ -1538,10 +1538,11 @@
           Metadata* md = r->metadata_value();
           if (md != _method) f(md);
         }
       } else if (iter.type() == relocInfo::virtual_call_type) {
         // Check compiledIC holders associated with this nmethod
+        ResourceMark rm;
         CompiledIC *ic = CompiledIC_at(&iter);
         if (ic->is_icholder_call()) {
           CompiledICHolder* cichk = ic->cached_icholder();
           f(cichk->holder_metadata());
           f(cichk->holder_klass());

-Aleksey

From sgehwolf at redhat.com Fri Mar 15 11:05:08 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Fri, 15 Mar 2019 12:05:08 +0100
Subject: A bug in C2 that causes a large amount of physical memory to be allocated
In-Reply-To:
References:
Message-ID: <2c86fda2e7178fcd7e0184b061111dd2676cb975.camel@redhat.com>

On Fri, 2019-03-15 at 11:42 +0100, Tobias Hartmann wrote:
> On 15.03.19 11:08, ??? wrote:
> > Thank you, will this patch consider backport to jdk8?
>
> Probably a point fix is more suitable for a backport but I leave this to Coleen / the runtime team
> to decide.

FWIW, it seems to affect OpenJDK 11u and OpenJDK 8u.

Thanks,
Severin

From shade at redhat.com Fri Mar 15 11:08:14 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 15 Mar 2019 12:08:14 +0100
Subject: A bug in C2 that causes a large amount of physical memory to be allocated
In-Reply-To: <9782702d-af3b-e72a-b369-ee69f9efb064@redhat.com>
References: <9782702d-af3b-e72a-b369-ee69f9efb064@redhat.com>
Message-ID: <4bfc14c4-f80c-9d5f-a036-4555dda9d016@redhat.com>

On 3/15/19 11:45 AM, Aleksey Shipilev wrote:
> On 3/15/19 11:42 AM, Tobias Hartmann wrote:
>> On 15.03.19 11:08, ??? wrote:
>>> Thank you, will this patch consider backport to jdk8?
>>
>> Probably a point fix is more suitable for a backport but I leave this to Coleen / the runtime team
>> to decide.
>
> What a coincidence! I think my build server OOMs on tests with similar symptoms.
>
> If there are no objections and/or subtle problems with it, I am willing to pick this up for 11u and
> 8u backports:
>
> $ hg qdiff
> diff -r 3086207c8650 src/hotspot/share/code/nmethod.cpp
> --- a/src/hotspot/share/code/nmethod.cpp Tue Mar 05 08:24:58 2019 -0500
> +++ b/src/hotspot/share/code/nmethod.cpp Fri Mar 15 11:39:27 2019 +0100
> @@ -1538,10 +1538,11 @@
>           Metadata* md = r->metadata_value();
>           if (md != _method) f(md);
>         }
>       } else if (iter.type() == relocInfo::virtual_call_type) {
>         // Check compiledIC holders associated with this nmethod
> +       ResourceMark rm;
>         CompiledIC *ic = CompiledIC_at(&iter);
>         if (ic->is_icholder_call()) {
>           CompiledICHolder* cichk = ic->cached_icholder();
>           f(cichk->holder_metadata());
>           f(cichk->holder_klass());

Tracked here:
https://bugs.openjdk.java.net/browse/JDK-8220718

-Aleksey

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL:

From rkennke at redhat.com Fri Mar 15 11:39:22 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 15 Mar 2019 12:39:22 +0100
Subject: RFR: JDK-8220714: C2 Compilation failure when accessing off-heap memory using Unsafe
Message-ID: <00c27700-5c6f-f7dd-603e-c69a324dd8ba@redhat.com>

A user reported misbehaving off-heap access. It looks like a C2
compilation failure that seems to only trigger with Shenandoah:

https://mail.openjdk.java.net/pipermail/shenandoah-dev/2019-March/009060.html

E.g. G1 generates this assembly for swapping two array elements:
mov (%r8),%r9d
mov (%r10),%r11d
mov %r9d,(%r10)
mov %r11d,(%r8)

While with Shenandoah we get this:
mov (%r9),%ecx
mov %ecx,(%r10,%r11,1)
mov %ecx,(%r9)

I.e. the two loads seem to have been wrongly coalesced into one.

Even though that is only triggered by Shenandoah, it seems to be a legit
and generic C2 problem.
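
To make the difference between the two listings concrete, here is a small C++ sketch that replays each instruction sequence against a toy memory array. It is an illustration only, not HotSpot code: the names `Memory`, `swap_two_loads` and `swap_coalesced_load` are made up for this example, array indices stand in for the addresses held in registers, and the mapping of `(%r10,%r11,1)` to the second element is an assumption.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Toy "memory": each slot stands in for one 32-bit off-heap location.
// (Hypothetical model, not part of HotSpot.)
using Memory = std::array<std::int32_t, 4>;

// Mirrors the G1 listing: two independent loads, then two stores.
inline void swap_two_loads(Memory& mem, std::size_t a, std::size_t b) {
  std::int32_t r9  = mem[a];   // mov (%r8),%r9d
  std::int32_t r11 = mem[b];   // mov (%r10),%r11d
  mem[b] = r9;                 // mov %r9d,(%r10)
  mem[a] = r11;                // mov %r11d,(%r8)
}

// Mirrors the Shenandoah listing: a single load feeds both stores,
// so the value that was in mem[b] is never read.
inline void swap_coalesced_load(Memory& mem, std::size_t a, std::size_t b) {
  std::int32_t ecx = mem[a];   // mov (%r9),%ecx
  mem[b] = ecx;                // mov %ecx,(%r10,%r11,1)  (assumed to address b)
  mem[a] = ecx;                // mov %ecx,(%r9)
}
```

Starting from `{1, 2, 0, 0}` and swapping slots 0 and 1, the first function leaves 2 and 1 in those slots, while the second leaves 1 in both: the value originally stored in the second slot is lost, which is the misbehaviour described above.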
Bug:
https://bugs.openjdk.java.net/browse/JDK-8220714
Webrev:
http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.00/

The issue seems to be that off-heap accesses are supposed to use
MO_RELAXED mem-ordering instead of MO_UNORDERED, as implemented in
C2Access::needs_cpu_membar(). However, it seems we wrongly set this here
(library_call.cpp, around line 2410):

  // Can base be NULL? Otherwise, always on-heap access.
  bool can_access_non_heap =
  TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
  if (!can_access_non_heap) {
    decorators |= IN_HEAP;
  }

However, heap_base_oop is initialized to top() a few lines up, and then
never updated, at least not for the off-heap-access case. And top
doesn't match NULL_PTR afaik.

The proposed fix uses base instead of heap_base_oop for that check; this
should be updated correctly through make_unsafe_addr() and
classify_unsafe_addr().

For some reason, this bug only seems to be exposed when running
Shenandoah, and then only when actually compiling with barriers. We
could not reproduce this with any other GC. It totally eludes me why
that is so. For this reason, the testcase goes under the
gc/shenandoah/compiler directory.

Roman

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: OpenPGP digital signature
URL:

From martin.doerr at sap.com Fri Mar 15 16:26:46 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Fri, 15 Mar 2019 16:26:46 +0000
Subject: RFR(S): 8220374: C2: LoopStripMining doesn't strip as expected
In-Reply-To: <87r2b9tzl2.fsf@redhat.com>
References: <87ef7buov3.fsf@redhat.com> <42027540-aa62-0b10-c0ee-d98e8d6a5ece@oracle.com> <87r2b9tzl2.fsf@redhat.com>
Message-ID:

Hi Roland,

I have requested 11u and 12u backport of JDK-8219584. After that, this one should be trivial to backport.
Best regards, Martin From rkennke at redhat.com Fri Mar 15 18:33:56 2019 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 15 Mar 2019 19:33:56 +0100 Subject: RFR: JDK-8220714: C2 Compilation failure when accessing off-heap memory using Unsafe In-Reply-To: <00c27700-5c6f-f7dd-603e-c69a324dd8ba@redhat.com> References: <00c27700-5c6f-f7dd-603e-c69a324dd8ba@redhat.com> Message-ID: jdk-submit job also passed. Roman > A user reported misbehaving off-heap access. It looks like a C2 > compilation failure that seems to only trigger with Shenandoah: > > https://mail.openjdk.java.net/pipermail/shenandoah-dev/2019-March/009060.html > > Eg with G1 generates this assembly for swapping two array elements: > mov (%r8),%r9d > mov (%r10),%r11d > mov %r9d,(%r10) > mov %r11d,(%r8) > > While with Shenandoah we get this: > mov (%r9),%ecx > mov %ecx,(%r10,%r11,1) > mov %ecx,(%r9) > > I.e. the two loads seem to have been wrongly coalesced into one. > > Even though that is only triggered by Shenandoah, it seems to be a legit > and generic C2 problem. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8220714 > Webrev: > http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.00/ > > The issue seems to be that off-heap accesses are supposed to use > MO_RELAXED mem-ordering instead of MO_UNORDERED, as implemented in > C2Access::needs_cpu_membar(). However, it seems we wrongly set this here > (library_call.cpp around l2410: > > // Can base be NULL? Otherwise, always on-heap access. > bool can_access_non_heap = > TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop)); > if (!can_access_non_heap) { > decorators |= IN_HEAP; > } > > However, heap_base_oop is initialized to top() a few lines up, and then > never updated, at least not for the off-heap-access case. And top > doesn't match NULL_PTR afaik. > > The proposed fix uses base instead of heap_base_oop for that check, this > should be updated correcly through make_unsafe_addr() and > classify_unsafe_addr(). 
>
> For some reason, this bug only seems to be exposed when running
> Shenandoah, and then only when actually compiling with barriers. We
> could not reproduce this with any other GC. It totally eludes me why
> that is so. For this reason, the testcase goes under the
> gc/shenandoah/compiler directory.
>
> Roman
>

From vladimir.kozlov at oracle.com Sat Mar 16 00:36:34 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 15 Mar 2019 17:36:34 -0700
Subject: RFR: JDK-8220714: C2 Compilation failure when accessing off-heap memory using Unsafe
In-Reply-To: <00c27700-5c6f-f7dd-603e-c69a324dd8ba@redhat.com>
References: <00c27700-5c6f-f7dd-603e-c69a324dd8ba@redhat.com>
Message-ID: <78d50a10-b2c8-8fc1-58a1-83d118a99965@oracle.com>

Hi Roman,

I would suggest waiting until Vladimir I. looks at this; he made most of the changes in this code.

Looking at the changes, it seems you can modify it a little:

  // Can base be NULL? Otherwise, always on-heap access.
  bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(base));

  if (!can_access_non_heap) {
    heap_base_oop = base;
    decorators |= IN_HEAP;
  } else if (type == T_OBJECT) {
    return false; // off-heap oop accesses are not supported
  }

Thanks,
Vladimir K

On 3/15/19 4:39 AM, Roman Kennke wrote:
> A user reported misbehaving off-heap access. It looks like a C2
> compilation failure that seems to only trigger with Shenandoah:
>
> https://mail.openjdk.java.net/pipermail/shenandoah-dev/2019-March/009060.html
>
> Eg with G1 generates this assembly for swapping two array elements:
> mov (%r8),%r9d
> mov (%r10),%r11d
> mov %r9d,(%r10)
> mov %r11d,(%r8)
>
> While with Shenandoah we get this:
> mov (%r9),%ecx
> mov %ecx,(%r10,%r11,1)
> mov %ecx,(%r9)
>
> I.e. the two loads seem to have been wrongly coalesced into one.
>
> Even though that is only triggered by Shenandoah, it seems to be a legit
> and generic C2 problem.
> > Bug: > https://bugs.openjdk.java.net/browse/JDK-8220714 > Webrev: > http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.00/ > > The issue seems to be that off-heap accesses are supposed to use > MO_RELAXED mem-ordering instead of MO_UNORDERED, as implemented in > C2Access::needs_cpu_membar(). However, it seems we wrongly set this here > (library_call.cpp around l2410: > > // Can base be NULL? Otherwise, always on-heap access. > bool can_access_non_heap = > TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop)); > if (!can_access_non_heap) { > decorators |= IN_HEAP; > } > > However, heap_base_oop is initialized to top() a few lines up, and then > never updated, at least not for the off-heap-access case. And top > doesn't match NULL_PTR afaik. > > The proposed fix uses base instead of heap_base_oop for that check, this > should be updated correcly through make_unsafe_addr() and > classify_unsafe_addr(). > > For some reason, this bug only seems to be exposed when running > Shenandoah, and then only when actually compiling with barriers. We > could not reproduce this with any other GC. It totally eludes me why > that is so. For this reason, the testcase goes under the > gc/shenandoah/compiler directory. > > Roman > From jesper.wilhelmsson at oracle.com Sat Mar 16 05:15:47 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Sat, 16 Mar 2019 06:15:47 +0100 Subject: RFR: JDK-8220389 - Update Graal Message-ID: Hi, Please review the patch to integrate the latest Graal changes into OpenJDK. Graal tip to integrate: fe5d30fb9d5b1cfbf455dc161e749381a93732d1 JBS duplicates deferred to the next integration: https://bugs.openjdk.java.net/browse/JDK-8214947 Bug: https://bugs.openjdk.java.net/browse/JDK-8220389 Webrev: http://cr.openjdk.java.net/~jwilhelm/8220389/webrev.00/ This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug. 
Thanks, /Jesper -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From igor.veresov at oracle.com Sat Mar 16 05:47:59 2019 From: igor.veresov at oracle.com (Igor Veresov) Date: Fri, 15 Mar 2019 22:47:59 -0700 Subject: RFR: 8211100: hotspot C1 issue with comparing long numbers on x86 32-bit In-Reply-To: References: <659DF4FF-71B9-472D-A064-038ADF2A50FF@oracle.com> <0C5ACDFD-EAA1-4EE0-AD1C-845B0B488680@azul.com> Message-ID: Yes, looks good. igor > On Mar 13, 2019, at 5:43 AM, Dmitry Cherepanov wrote: > > Igor, > > Updated version of original fix (with ifdef X86 added): > http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.04/ > > Is it okay to push it? > > Thanks, > > Dmitry > >> On Mar 13, 2019, at 5:24 AM, Igor Veresov wrote: >> >> Dmitry, >> >> After some digging around I think your original fix is ok. In addition to !_LP64 can you add ifdef X86? >> >> igor >> >> >> >>> On Mar 6, 2019, at 3:07 AM, Dmitry Cherepanov wrote: >>> >>> Igor, >>> >>> Sorry for the delay in responding. >>> >>> I updated comp_op (in c1_LIRAssembler_x86.cpp) to make use of tmp1 for this case. The changes: http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.03/ >>> >>> For this change, I got assertion failed (from cpu_regnrLo, in c1_LIR.hpp). Sorry if this is an obvious question - Am I correctly understand that another part of this solution should be an additional change that would allocate tmp1? Or is there an existing code that should take care of it already and just need to enable the allocation of tmp1 for this case? >>> >>> Another question: given that this is a major issue on x86 32bit system, would you mind if we proceed with the current minimal/low-risk fix (http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.01/) and create new JBS issue to investigate more generic approach separately? 
>>> >>> Thanks, >>> >>> Dmitry >>> >>>> On Oct 2, 2018, at 8:09 PM, Igor Veresov wrote: >>>> >>>> Right, I forgot how it works. Sorry for the confusion. I think there is no way to explicitly describe a register kill in C1. I guess the only option is to just avoid clobbering opr1. So may be we should make use of tmp1 for lir_cmp to save/restore opr1? Again, tmp1 would have to be allocated only for this particular case. >>>> >>>> igor >>>> >>>> >>>> >>>>> On Oct 1, 2018, at 7:15 AM, Dmitry Cherepanov wrote: >>>>> >>>>> Hi Igor, >>>>> >>>>> Thanks for the suggestions. I tried to make the opr1 a temporary >>>>> >>>>> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.02/ >>>>> >>>>> but the generated code still has the problem. Looking into the log with -XX:TraceLinearScanLevel=4 (http://cr.openjdk.java.net/~dcherepanov/8211100/TraceLinearScanLevel.02.log) seems like the reason for this is that the opr1 (virtual register R165 in the log) is also an input operand and its range becomes wider and the shorter ranges (corresponding to the opr1 marked as temp) are merged to the single range. Can the input operand be temporary at the same time? >>>>> >>>>> Dmitry >>>>> >>>>>> On Sep 27, 2018, at 2:18 AM, Igor Veresov wrote: >>>>>> >>>>>> Edit: It may be more consistent to check for is_double_cpu() instead of T_LONG. Although that?s semantically equivalent. >>>>>> >>>>>>> On Sep 26, 2018, at 9:35 AM, Igor Veresov wrote: >>>>>>> >>>>>>> It doesn?t seem to me like the proper way to fix it. The problem is that the cmp is destroying opr1 without telling the register allocator about it. >>>>>>> >>>>>>> One possible solution would be to make opr1 also a temp (see LIR_OpVisitState::visit(LIR_Op* op) in c1_LIR.cpp), only for x86 32bit and only if the operand type is T_LONG. >>>>>>> Another solution is to maintain a temporary register for lir_cmp and use it to save/restore opr1 when emitting the code in LIR_Assembler::comp_op(). 
Again, the temporary register has to be there only for x86 32bit and T_LONG. >>>>>>> >>>>>>> igor >>>>>>> >>>>>>> >>>>>>>> On Sep 26, 2018, at 1:29 AM, Tobias Hartmann wrote: >>>>>>>> >>>>>>>> Hi Dmitry, >>>>>>>> >>>>>>>> this looks good to me but Igor (who implemented 8201447) should have a look as well. >>>>>>>> >>>>>>>> Best regards, >>>>>>>> Tobias >>>>>>>> >>>>>>>> On 26.09.2018 09:04, Dmitry Cherepanov wrote: >>>>>>>>> Hi Tobias, >>>>>>>>> >>>>>>>>> Thanks for the review, updated patch avoids the additional move on x86_64 and includes the >>>>>>>>> regression test. >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.01/ >>>>>>>>> >>>>>>>>> >>>>>>>>> Dmitry >>>>>>>>> >>>>>>>>>> On Sep 25, 2018, at 6:40 PM, Tobias Hartmann >>>>>>>>> > wrote: >>>>>>>>>> >>>>>>>>>> Hi Dmitry, >>>>>>>>>> >>>>>>>>>> Shouldn't this at least be guarded by an #ifndef _LP64 to avoid the additional move on x86_64? >>>>>>>>>> >>>>>>>>>> Could you please add the regression test to the webrev? Or did this reproduce with other tests? >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Tobias >>>>>>>>>> >>>>>>>>>> On 25.09.2018 16:00, Dmitry Cherepanov wrote: >>>>>>>>>>> Hello, >>>>>>>>>>> >>>>>>>>>>> Please review a patch that resolves issue in x86 32bit builds. It slightly adjusts the fix for >>>>>>>>>>> JDK-8201447 (C1 does backedge profiling incorrectly) by creating a copy of the left operand and >>>>>>>>>>> using it for incrementing backedge counter. >>>>>>>>>>> >>>>>>>>>>> JBS issue: https://bugs.openjdk.java.net/browse/JDK-8211100 >>>>>>>>>>> webrev: http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.00/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> Dmitry >>>>>>> >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From volker.simonis at gmail.com Sat Mar 16 08:12:16 2019
From: volker.simonis at gmail.com (Volker Simonis)
Date: Sat, 16 Mar 2019 09:12:16 +0100
Subject: RFR: 8211100: hotspot C1 issue with comparing long numbers on x86 32-bit
In-Reply-To:
References: <659DF4FF-71B9-472D-A064-038ADF2A50FF@oracle.com> <0C5ACDFD-EAA1-4EE0-AD1C-845B0B488680@azul.com>
Message-ID:

Hi Dmitry,

sorry, but I don't understand how the regression test works. Can you please explain what's the expected result of the test without and with your fix?

Thanks,
Volker

Dmitry Cherepanov wrote on Wed, 13 Mar 2019 at 13:44:

> Igor,
>
> Updated version of original fix (with ifdef X86 added):
> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.04/
>
> Is it okay to push it?
>
> Thanks,
>
> Dmitry
>
> > On Mar 13, 2019, at 5:24 AM, Igor Veresov
> wrote:
> >
> > Dmitry,
> >
> > After some digging around I think your original fix is ok. In addition
> to !_LP64 can you add ifdef X86?
> >
> > igor
> >
> >
> >
> >> On Mar 6, 2019, at 3:07 AM, Dmitry Cherepanov
> wrote:
> >>
> >> Igor,
> >>
> >> Sorry for the delay in responding.
> >>
> >> I updated comp_op (in c1_LIRAssembler_x86.cpp) to make use of tmp1 for
> this case. The changes:
> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.03/
> >>
> >> For this change, I got assertion failed (from cpu_regnrLo, in
> c1_LIR.hpp). Sorry if this is an obvious question - Am I correctly
> understand that another part of this solution should be an additional
> change that would allocate tmp1? Or is there an existing code that should
> take care of it already and just need to enable the allocation of tmp1 for
> this case?
> >>
> >> Another question: given that this is a major issue on x86 32bit system,
> would you mind if we proceed with the current minimal/low-risk fix (
> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.01/) and create
> new JBS issue to investigate more generic approach separately?
> >> > >> Thanks, > >> > >> Dmitry > >> > >>> On Oct 2, 2018, at 8:09 PM, Igor Veresov > wrote: > >>> > >>> Right, I forgot how it works. Sorry for the confusion. I think there > is no way to explicitly describe a register kill in C1. I guess the only > option is to just avoid clobbering opr1. So may be we should make use of > tmp1 for lir_cmp to save/restore opr1? Again, tmp1 would have to be > allocated only for this particular case. > >>> > >>> igor > >>> > >>> > >>> > >>>> On Oct 1, 2018, at 7:15 AM, Dmitry Cherepanov > wrote: > >>>> > >>>> Hi Igor, > >>>> > >>>> Thanks for the suggestions. I tried to make the opr1 a temporary > >>>> > >>>> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.02/ > >>>> > >>>> but the generated code still has the problem. Looking into the log > with -XX:TraceLinearScanLevel=4 ( > http://cr.openjdk.java.net/~dcherepanov/8211100/TraceLinearScanLevel.02.log) > seems like the reason for this is that the opr1 (virtual register R165 in > the log) is also an input operand and its range becomes wider and the > shorter ranges (corresponding to the opr1 marked as temp) are merged to the > single range. Can the input operand be temporary at the same time? > >>>> > >>>> Dmitry > >>>> > >>>>> On Sep 27, 2018, at 2:18 AM, Igor Veresov > wrote: > >>>>> > >>>>> Edit: It may be more consistent to check for is_double_cpu() instead > of T_LONG. Although that?s semantically equivalent. > >>>>> > >>>>>> On Sep 26, 2018, at 9:35 AM, Igor Veresov > wrote: > >>>>>> > >>>>>> It doesn?t seem to me like the proper way to fix it. The problem is > that the cmp is destroying opr1 without telling the register allocator > about it. > >>>>>> > >>>>>> One possible solution would be to make opr1 also a temp (see > LIR_OpVisitState::visit(LIR_Op* op) in c1_LIR.cpp), only for x86 32bit and > only if the operand type is T_LONG. 
> >>>>>> Another solution is to maintain a temporary register for lir_cmp > and use it to save/restore opr1 when emitting the code in > LIR_Assembler::comp_op(). Again, the temporary register has to be there > only for x86 32bit and T_LONG. > >>>>>> > >>>>>> igor > >>>>>> > >>>>>> > >>>>>>> On Sep 26, 2018, at 1:29 AM, Tobias Hartmann < > tobias.hartmann at oracle.com> wrote: > >>>>>>> > >>>>>>> Hi Dmitry, > >>>>>>> > >>>>>>> this looks good to me but Igor (who implemented 8201447) should > have a look as well. > >>>>>>> > >>>>>>> Best regards, > >>>>>>> Tobias > >>>>>>> > >>>>>>> On 26.09.2018 09:04, Dmitry Cherepanov wrote: > >>>>>>>> Hi Tobias, > >>>>>>>> > >>>>>>>> Thanks for the review, updated patch avoids the additional move > on x86_64 and includes the > >>>>>>>> regression test. > >>>>>>>> > >>>>>>>> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.01/ > >>>>>>>> > >>>>>>>> > >>>>>>>> Dmitry > >>>>>>>> > >>>>>>>>> On Sep 25, 2018, at 6:40 PM, Tobias Hartmann < > tobias.hartmann at oracle.com > >>>>>>>>> > wrote: > >>>>>>>>> > >>>>>>>>> Hi Dmitry, > >>>>>>>>> > >>>>>>>>> Shouldn't this at least be guarded by an #ifndef _LP64 to avoid > the additional move on x86_64? > >>>>>>>>> > >>>>>>>>> Could you please add the regression test to the webrev? Or did > this reproduce with other tests? > >>>>>>>>> > >>>>>>>>> Thanks, > >>>>>>>>> Tobias > >>>>>>>>> > >>>>>>>>> On 25.09.2018 16:00, Dmitry Cherepanov wrote: > >>>>>>>>>> Hello, > >>>>>>>>>> > >>>>>>>>>> Please review a patch that resolves issue in x86 32bit builds. > It slightly adjusts the fix for > >>>>>>>>>> JDK-8201447 (C1 does backedge profiling incorrectly) by > creating a copy of the left operand and > >>>>>>>>>> using it for incrementing backedge counter. 
> >>>>>>>>>>
> >>>>>>>>>> JBS issue: https://bugs.openjdk.java.net/browse/JDK-8211100
> >>>>>>>>>> webrev:
> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.00/
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>>
> >>>>>>>>>> Dmitry
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dean.long at oracle.com Sun Mar 17 06:06:47 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Sat, 16 Mar 2019 23:06:47 -0700
Subject: RFR: JDK-8220389 - Update Graal
In-Reply-To:
References:
Message-ID:

On 3/15/19 10:15 PM, jesper.wilhelmsson at oracle.com wrote:
> This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug.

Something doesn't look right. The changes in overwritten-diffs.txt look old, and 'hg log src/jdk.internal.vm.compiler' says the last change was:

8218074: Update Graal

dl

From dean.long at oracle.com Sun Mar 17 06:09:01 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Sat, 16 Mar 2019 23:09:01 -0700
Subject: RFR: JDK-8220389 - Update Graal
In-Reply-To:
References:
Message-ID: <09cd765b-3427-27b5-d127-4aae16b373e8@oracle.com>

On 3/15/19 10:15 PM, jesper.wilhelmsson at oracle.com wrote:
> JBS duplicates deferred to the next integration:
> https://bugs.openjdk.java.net/browse/JDK-8214947

This is already in, so why is it listed as deferred?

dl

From zhaixiang at loongson.cn Mon Mar 18 02:55:58 2019
From: zhaixiang at loongson.cn (Leslie Zhai)
Date: Mon, 18 Mar 2019 10:55:58 +0800
Subject: Request for Comments: Potential leak of memory pointed to by 'name' about jvmciCodeInstaller
In-Reply-To:
References:
Message-ID: <1f40d096-2e3f-4f47-5812-931d624e239d@loongson.cn>

Hi Doug,

Thanks for your kind response!

On 2019-03-18 05:55, Doug Simon wrote:
> Hi Leslie,
>
> As a point of process, I think hotspot-compiler-dev at openjdk.java.net is probably a better list for JVMCI bugs.
Sorry for my wrong posting!

>
> In any case, thanks for the investigation. However, I don't think this is a bug as RuntimeStub simply passes along the name argument which is eventually stored to CodeBlob::_name without any further copying. If we subsequently freed that argument, the CodeBlob::_name would become invalid.

Thanks for pointing out my fault!
I might have introduced a use-after-free[2] issue even though I carefully free the allocated memory at the end of the `installCode` C2V_VMENTRY...
The reduced testcase is able to reproduce my fault:

----- 8< -------- 8< -------- 8< -------- 8< -------- 8< -------- 8< ---
$ cat t.cpp
#include <iostream>
#include <stdlib.h>
#include <string.h>

class CodeBlob {
public:
  CodeBlob(const char* name) : _name(name) {};

  const char* name() { return _name; }

protected:
  const char* _name;
};

class RuntimeBlob : public CodeBlob {
public:
  RuntimeBlob(const char* name) : CodeBlob(name) {}
};

class RuntimeStub : public RuntimeBlob {
private:
  RuntimeStub(const char* name) : RuntimeBlob(name) {}

public:
  static RuntimeStub* new_runtime_stub(const char* stub_name) {
    RuntimeStub* stub = new RuntimeStub(stub_name);
    return stub;
  }
};

static void install(CodeBlob*& cb, char*& name) {
  name = strdup("some stubName");
  cb = RuntimeStub::new_runtime_stub(name);
}

// To simulate C2V_VMENTRY(jint, installCode, (JNIEnv *jniEnv, jobject, jobject target, jobject compiled_code, jobject installed_code, jobject speculation_log))
int main(int argc, char *argv[]) {
  CodeBlob* cb = NULL;
  char* cb_name = NULL;
  install(cb, cb_name);
  if (cb_name) free(cb_name); // <--- MY FAULT
  std::cout << cb->name() << std::endl;
  return 0;
}
----- 8< -------- 8< -------- 8< -------- 8< -------- 8< -------- 8< ---

t.cpp:9:24: warning: Use of memory after it is freed
  const char* name() { return _name; }
                       ^~~~~~~~~~~~
t.cpp:42:3: warning: Potential leak of memory pointed to by 'cb'
  std::cout << cb->name() << std::endl;
  ^~~~~~~~~~~~~~~~~~~~~~~

So it might be a potential leak of memory pointed to by 'cb' in
jvmciCompilerToVM? But the `cb` might be used in other place just like `name` :) Just comment MY FAULT //if (cb_name) free(cb_name); And see what dynamic analysis say: $ clang++ -fsanitize=address t.cpp $ ./a.out ================================================================= ==4482==ERROR: LeakSanitizer: detected memory leaks Direct leak of 8 byte(s) in 1 object(s) allocated from: #0 0x12017b674 (/home/loongson/zhaixiang/a.out+0x12017b674) #1 0x12018097c (/home/loongson/zhaixiang/a.out+0x12018097c) #2 0x12018074c (/home/loongson/zhaixiang/a.out+0x12018074c) #3 0x1201804c4 (/home/loongson/zhaixiang/a.out+0x1201804c4) #4 0xfff6cc719c (/lib64/libc.so.6+0x4319c) #5 0x12002b198 (/home/loongson/zhaixiang/a.out+0x12002b198) Indirect leak of 14 byte(s) in 1 object(s) allocated from: #0 0x1200de204 (/home/loongson/zhaixiang/a.out+0x1200de204) #1 0x120180688 (/home/loongson/zhaixiang/a.out+0x120180688) #2 0x1201804c4 (/home/loongson/zhaixiang/a.out+0x1201804c4) #3 0xfff6cc719c (/lib64/libc.so.6+0x4319c) #4 0x12002b198 (/home/loongson/zhaixiang/a.out+0x12002b198) SUMMARY: AddressSanitizer: 22 byte(s) leaked in 2 allocation(s). Thanks, Leslie Zhai > > -Doug > >> On 17 Mar 2019, at 10:30, Leslie Zhai wrote: >> >> Hi, >> >> Bug reported[1] by the clang static analyzer. >> >> Description: Potential leak of memory pointed to by 'name' >> File: /home/zhaixiang/jdk/src/hotspot/share/jvmci/jvmciCodeInstaller.cpp >> Line: 653 >> >> 652 char* name = strdup(java_lang_String::as_utf8_string(stubName)); >> >> 5 ? Memory is allocated ? >> >> 653 cb = RuntimeStub::new_runtime_stub(name, >> >> 6 ? Potential leak of memory pointed to by 'name' >> >> I checked `install` function in src/hotspot/share/jvmci/jvmciCodeInstaller.cpp and `installCode` C2V_VMENTRY in src/hotspot/share/jvmci/jvmciCompilerToVM.cpp carefully. There is no `free` to release the allocated memory, so I argue that it is a Memory leak issue, not a False positive[2]. 
May I file a bug if it is real potential leak of memory issue? >> >> Because I think webrev is related to BUGID[3], so I just paste my patch here: >> >> >> ----- 8< -------- 8< -------- 8< -------- 8< -------- 8< -------- 8< --- >> diff -r 1a18b8d56d73 src/hotspot/share/jvmci/jvmciCodeInstaller.cpp >> --- a/src/hotspot/share/jvmci/jvmciCodeInstaller.cpp Sat Mar 16 15:05:21 2019 -0700 >> +++ b/src/hotspot/share/jvmci/jvmciCodeInstaller.cpp Sun Mar 17 17:06:50 2019 +0800 >> @@ -623,7 +623,7 @@ >> #endif // INCLUDE_AOT >> // constructor used to create a method >> -JVMCIEnv::CodeInstallResult CodeInstaller::install(JVMCICompiler* compiler, Handle target, Handle compiled_code, CodeBlob*& cb, Handle installed_code, Handle speculation_log, TRAPS) { >> +JVMCIEnv::CodeInstallResult CodeInstaller::install(JVMCICompiler* compiler, Handle target, Handle compiled_code, CodeBlob*& cb, char*& cb_name, Handle installed_code, Handle speculation_log, TRAPS) { >> CodeBuffer buffer("JVMCI Compiler CodeBuffer"); >> jobject compiled_code_obj = JNIHandles::make_local(compiled_code()); >> OopRecorder* recorder = new OopRecorder(&_arena, true); >> @@ -649,8 +649,8 @@ >> if (stubName == NULL) { >> JVMCI_ERROR_OK("stub should have a name"); >> } >> - char* name = strdup(java_lang_String::as_utf8_string(stubName)); >> - cb = RuntimeStub::new_runtime_stub(name, >> + cb_name = strdup(java_lang_String::as_utf8_string(stubName)); >> + cb = RuntimeStub::new_runtime_stub(cb_name, >> &buffer, >> CodeOffsets::frame_never_safe, >> stack_slots, >> diff -r 1a18b8d56d73 src/hotspot/share/jvmci/jvmciCodeInstaller.hpp >> --- a/src/hotspot/share/jvmci/jvmciCodeInstaller.hpp Sat Mar 16 15:05:21 2019 -0700 >> +++ b/src/hotspot/share/jvmci/jvmciCodeInstaller.hpp Sun Mar 17 17:06:50 2019 +0800 >> @@ -207,7 +207,7 @@ >> #if INCLUDE_AOT >> JVMCIEnv::CodeInstallResult gather_metadata(Handle target, Handle compiled_code, CodeMetadata& metadata, TRAPS); >> #endif >> - JVMCIEnv::CodeInstallResult 
install(JVMCICompiler* compiler, Handle target, Handle compiled_code, CodeBlob*& cb, Handle installed_code, Handle speculation_log, TRAPS); >> + JVMCIEnv::CodeInstallResult install(JVMCICompiler* compiler, Handle target, Handle compiled_code, CodeBlob*& cb, char*& cb_name, Handle installed_code, Handle speculation_log, TRAPS); >> static address runtime_call_target_address(oop runtime_call); >> static VMReg get_hotspot_reg(jint jvmciRegisterNumber, TRAPS); >> diff -r 1a18b8d56d73 src/hotspot/share/jvmci/jvmciCompilerToVM.cpp >> --- a/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp Sat Mar 16 15:05:21 2019 -0700 >> +++ b/src/hotspot/share/jvmci/jvmciCompilerToVM.cpp Sun Mar 17 17:06:50 2019 +0800 >> @@ -677,6 +677,7 @@ >> Handle target_handle(THREAD, JNIHandles::resolve(target)); >> Handle compiled_code_handle(THREAD, JNIHandles::resolve(compiled_code)); >> CodeBlob* cb = NULL; >> + char* cb_name = NULL; >> Handle installed_code_handle(THREAD, JNIHandles::resolve(installed_code)); >> Handle speculation_log_handle(THREAD, JNIHandles::resolve(speculation_log)); >> @@ -685,7 +686,7 @@ >> TraceTime install_time("installCode", JVMCICompiler::codeInstallTimer()); >> bool is_immutable_PIC = HotSpotCompiledCode::isImmutablePIC(compiled_code_handle) > 0; >> CodeInstaller installer(is_immutable_PIC); >> - JVMCIEnv::CodeInstallResult result = installer.install(compiler, target_handle, compiled_code_handle, cb, installed_code_handle, speculation_log_handle, CHECK_0); >> + JVMCIEnv::CodeInstallResult result = installer.install(compiler, target_handle, compiled_code_handle, cb, cb_name, installed_code_handle, speculation_log_handle, CHECK_0); >> if (PrintCodeCacheOnCompilation) { >> stringStream s; >> @@ -722,6 +723,7 @@ >> } >> } >> } >> + if (cb_name) free(cb_name); >> return result; >> C2V_END >> >> ----- 8< -------- 8< -------- 8< -------- 8< -------- 8< -------- 8< --- >> >> >> I ran clang static analyzer again, and it is not reproducible owing to I fixed the issue, not False 
negative :)
>>
>> hotspot:tier1 linux-x86_64-server-fastdebug 2 fail:
>>
>> * compiler/c2/Test8062950.java: it is also reproducible for mips64el without the patch
>> * runtime/classFileParserBug/TestEmptyBootstrapMethodsAttr.java: Test empty bootstrap_methods table within BootstrapMethods attribute
>>
>>
>> Please point out my any fault!
>>
>> Thanks,
>>
>> Leslie Zhai
>>
>> [1] https://raw.githubusercontent.com/xiangzhai/jdk-dev/master/jvmciCodeInstaller.cpp.png
>>
>> [2] https://bugs.llvm.org/show_bug.cgi?id=40913 Potential Use-after-free issue reported by clang static analyzer.
>>
>> [3] https://mail.openjdk.java.net/pipermail/jdk8u-dev/2018-September/007855.html
>>
>>

From doug.simon at oracle.com Mon Mar 18 07:44:33 2019
From: doug.simon at oracle.com (Doug Simon)
Date: Mon, 18 Mar 2019 08:44:33 +0100
Subject: Request for Comments: Potential leak of memory pointed to by 'name' about jvmciCodeInstaller
In-Reply-To: <1f40d096-2e3f-4f47-5812-931d624e239d@loongson.cn>
References: <1f40d096-2e3f-4f47-5812-931d624e239d@loongson.cn>
Message-ID:

> On 18 Mar 2019, at 03:55, Leslie Zhai wrote:
>
> Hi Doug,
>
> Thanks for your kind response!
>
>
> On 2019-03-18 05:55, Doug Simon wrote:
>> Hi Leslie,
>>
>> As a point of process, I think hotspot-compiler-dev at openjdk.java.net is probably a better list for JVMCI bugs.
>
> Sorry for my wrong posting!
>
>>
>> In any case, thanks for the investigation. However, I don't think this is a bug as RuntimeStub simply passes along the name argument which is eventually stored to CodeBlob::_name without any further copying. If we subsequently freed that argument, the CodeBlob::_name would become invalid.
>
> Thanks for pointing out my fault!
> I might brought in Use-after-free[2] issue even that I carefully free the allocated memory in the end of `installCode` C2V_VMENTRY...
> The reduced testcase is able to reproduce my fault:
>
> ----- 8< -------- 8< -------- 8< -------- 8< -------- 8< -------- 8< ---
> $ cat t.cpp
> #include <stdlib.h>
> #include <string.h>
> #include <iostream>
>
> class CodeBlob {
> public:
>   CodeBlob(const char* name) : _name(name) {};
>
>   const char* name() { return _name; }
>
> protected:
>   const char* _name;
> };
>
> class RuntimeBlob : public CodeBlob {
> public:
>   RuntimeBlob(const char* name) : CodeBlob(name) {}
> };
>
> class RuntimeStub : public RuntimeBlob {
> private:
>   RuntimeStub(const char* name) : RuntimeBlob(name) {}
>
> public:
>   static RuntimeStub* new_runtime_stub(const char* stub_name) {
>     RuntimeStub* stub = new RuntimeStub(stub_name);
>     return stub;
>   }
> };
>
> static void install(CodeBlob*& cb, char*& name) {
>   name = strdup("some stubName");
>   cb = RuntimeStub::new_runtime_stub(name);
> }
>
> // To simulate C2V_VMENTRY(jint, installCode, (JNIEnv *jniEnv, jobject, jobject target, jobject compiled_code, jobject installed_code, jobject speculation_log))
> int main(int argc, char *argv[]) {
>   CodeBlob* cb = NULL;
>   char* cb_name = NULL;
>   install(cb, cb_name);
>   if (cb_name) free(cb_name); // <--- MY FAULT
>   std::cout << cb->name() << std::endl;
>   return 0;
> }
> ----- 8< -------- 8< -------- 8< -------- 8< -------- 8< -------- 8< ---
>
> t.cpp:9:24: warning: Use of memory after it is freed
>   const char* name() { return _name; }
>                        ^~~~~~~~~~~~
> t.cpp:42:3: warning: Potential leak of memory pointed to by 'cb'
>   std::cout << cb->name() << std::endl;
>   ^~~~~~~~~~~~~~~~~~~~~~~
>
>
> So It might be Potential leak of memory pointed to by 'cb' about jvmciCompilerToVM? But the `cb` might be used in other place just like `name` :)

Indeed. RuntimeStubs are never freed as they are used as helpers for "normal" compiled code. Once created, their addresses are stored in static variables that live for the remainder of the VM process.

-Doug
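As an aside on the ownership question in this exchange: the variant of the reduced test case below is only a sketch, and it is not how HotSpot handles CodeBlob::_name (there the pointer is stored without copying, as Doug explains). It assumes a hypothetical stand-in class, OwningBlob, that copies the name into a std::string, so the caller's buffer can be freed without leaving the blob with a dangling pointer. OwningBlob and make_blob are invented names used purely for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for a code blob; NOT HotSpot's CodeBlob.
class OwningBlob {
 public:
  // std::string copies the bytes, so the blob owns its own storage.
  explicit OwningBlob(const char* name) : _name(name) {}
  const char* name() const { return _name.c_str(); }

 private:
  std::string _name;
};

// Mirrors install() from the reduced test case, except that the blob
// keeps a copy of the name instead of the caller's pointer.
inline OwningBlob make_blob() {
  char* tmp = static_cast<char*>(std::malloc(16));
  assert(tmp != nullptr);
  std::strcpy(tmp, "some stubName");  // 13 characters plus the terminator
  OwningBlob blob(tmp);
  std::free(tmp);  // safe here: blob no longer refers to tmp
  return blob;
}
```

With this layout, make_blob().name() still yields the stored name after the temporary buffer has been freed. The price is one extra copy per blob, which the real code avoids by simply never freeing the name.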
From dcherepanov at azul.com Mon Mar 18 10:39:45 2019
From: dcherepanov at azul.com (Dmitry Cherepanov)
Date: Mon, 18 Mar 2019 10:39:45 +0000
Subject: RFR: 8211100: hotspot C1 issue with comparing long numbers on x86 32-bit
In-Reply-To:
References: <659DF4FF-71B9-472D-A064-038ADF2A50FF@oracle.com> <0C5ACDFD-EAA1-4EE0-AD1C-845B0B488680@azul.com>
Message-ID:

Hi Volker,

With the fix, the test method reaches the break statement in the first iteration of the loop and passes. Without the fix, the test method enters an infinite loop and fails with a timeout error.

Thanks,

Dmitry

> On Mar 16, 2019, at 11:12 AM, Volker Simonis wrote:
>
> Hi Dmitry,
>
> sorry, but I don't understand how the regression test works. Can you please explain what's the expected result of the test without and with your fix?
>
> Thanks,
> Volker
>
> On Wed, Mar 13, 2019 at 13:44, Dmitry Cherepanov wrote:
> Igor,
>
> Updated version of original fix (with ifdef X86 added):
> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.04/
>
> Is it okay to push it?
>
> Thanks,
>
> Dmitry
>
> > On Mar 13, 2019, at 5:24 AM, Igor Veresov wrote:
> >
> > Dmitry,
> >
> > After some digging around I think your original fix is ok. In addition to !_LP64 can you add ifdef X86?
> >
> > igor
> >
> >
> >
> >> On Mar 6, 2019, at 3:07 AM, Dmitry Cherepanov wrote:
> >>
> >> Igor,
> >>
> >> Sorry for the delay in responding.
> >>
> >> I updated comp_op (in c1_LIRAssembler_x86.cpp) to make use of tmp1 for this case. The changes: http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.03/
> >>
> >> For this change, I got assertion failed (from cpu_regnrLo, in c1_LIR.hpp). Sorry if this is an obvious question - Am I correctly understand that another part of this solution should be an additional change that would allocate tmp1? Or is there an existing code that should take care of it already and just need to enable the allocation of tmp1 for this case?
> >>
> >> Another question: given that this is a major issue on x86 32bit system, would you mind if we proceed with the current minimal/low-risk fix (http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.01/) and create new JBS issue to investigate more generic approach separately?
> >>
> >> Thanks,
> >>
> >> Dmitry
> >>
> >>> On Oct 2, 2018, at 8:09 PM, Igor Veresov wrote:
> >>>
> >>> Right, I forgot how it works. Sorry for the confusion. I think there is no way to explicitly describe a register kill in C1. I guess the only option is to just avoid clobbering opr1. So may be we should make use of tmp1 for lir_cmp to save/restore opr1? Again, tmp1 would have to be allocated only for this particular case.
> >>>
> >>> igor
> >>>
> >>>
> >>>
> >>>> On Oct 1, 2018, at 7:15 AM, Dmitry Cherepanov wrote:
> >>>>
> >>>> Hi Igor,
> >>>>
> >>>> Thanks for the suggestions. I tried to make the opr1 a temporary
> >>>>
> >>>> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.02/
> >>>>
> >>>> but the generated code still has the problem. Looking into the log with -XX:TraceLinearScanLevel=4 (http://cr.openjdk.java.net/~dcherepanov/8211100/TraceLinearScanLevel.02.log) seems like the reason for this is that the opr1 (virtual register R165 in the log) is also an input operand and its range becomes wider and the shorter ranges (corresponding to the opr1 marked as temp) are merged to the single range. Can the input operand be temporary at the same time?
> >>>>
> >>>> Dmitry
> >>>>
> >>>>> On Sep 27, 2018, at 2:18 AM, Igor Veresov wrote:
> >>>>>
> >>>>> Edit: It may be more consistent to check for is_double_cpu() instead of T_LONG. Although that's semantically equivalent.
> >>>>>
> >>>>>> On Sep 26, 2018, at 9:35 AM, Igor Veresov wrote:
> >>>>>>
> >>>>>> It doesn't seem to me like the proper way to fix it. The problem is that the cmp is destroying opr1 without telling the register allocator about it.
> >>>>>>
> >>>>>> One possible solution would be to make opr1 also a temp (see LIR_OpVisitState::visit(LIR_Op* op) in c1_LIR.cpp), only for x86 32bit and only if the operand type is T_LONG.
> >>>>>> Another solution is to maintain a temporary register for lir_cmp and use it to save/restore opr1 when emitting the code in LIR_Assembler::comp_op(). Again, the temporary register has to be there only for x86 32bit and T_LONG.
> >>>>>>
> >>>>>> igor
> >>>>>>
> >>>>>>
> >>>>>>> On Sep 26, 2018, at 1:29 AM, Tobias Hartmann wrote:
> >>>>>>>
> >>>>>>> Hi Dmitry,
> >>>>>>>
> >>>>>>> this looks good to me but Igor (who implemented 8201447) should have a look as well.
> >>>>>>>
> >>>>>>> Best regards,
> >>>>>>> Tobias
> >>>>>>>
> >>>>>>> On 26.09.2018 09:04, Dmitry Cherepanov wrote:
> >>>>>>>> Hi Tobias,
> >>>>>>>>
> >>>>>>>> Thanks for the review, updated patch avoids the additional move on x86_64 and includes the
> >>>>>>>> regression test.
> >>>>>>>>
> >>>>>>>> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.01/
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Dmitry
> >>>>>>>>
> >>>>>>>>> On Sep 25, 2018, at 6:40 PM, Tobias Hartmann wrote:
> >>>>>>>>>
> >>>>>>>>> Hi Dmitry,
> >>>>>>>>>
> >>>>>>>>> Shouldn't this at least be guarded by an #ifndef _LP64 to avoid the additional move on x86_64?
> >>>>>>>>>
> >>>>>>>>> Could you please add the regression test to the webrev? Or did this reproduce with other tests?
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Tobias
> >>>>>>>>>
> >>>>>>>>> On 25.09.2018 16:00, Dmitry Cherepanov wrote:
> >>>>>>>>>> Hello,
> >>>>>>>>>>
> >>>>>>>>>> Please review a patch that resolves issue in x86 32bit builds. It slightly adjusts the fix for
> >>>>>>>>>> JDK-8201447 (C1 does backedge profiling incorrectly) by creating a copy of the left operand and
> >>>>>>>>>> using it for incrementing backedge counter.
> >>>>>>>>>>
> >>>>>>>>>> JBS issue: https://bugs.openjdk.java.net/browse/JDK-8211100
> >>>>>>>>>> webrev: http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.00/
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>>
> >>>>>>>>>> Dmitry

From volker.simonis at gmail.com Mon Mar 18 13:25:07 2019
From: volker.simonis at gmail.com (Volker Simonis)
Date: Mon, 18 Mar 2019 14:25:07 +0100
Subject: RFR: 8211100: hotspot C1 issue with comparing long numbers on x86 32-bit
In-Reply-To:
References: <659DF4FF-71B9-472D-A064-038ADF2A50FF@oracle.com> <0C5ACDFD-EAA1-4EE0-AD1C-845B0B488680@azul.com>
Message-ID:

Hi Dmitry,

thanks for the confirmation. I don't like tests that fail implicitly by timing out. First, because it is not immediately evident whether it is really a test failure or just an infrastructure problem, and second, because they unnecessarily use CPU time until they time out.

Wouldn't it be possible to change the test so that it fails explicitly with an exception, e.g. by introducing and checking a counter within the loop?

Thanks,
Volker

From aph at redhat.com Mon Mar 18 15:53:04 2019
From: aph at redhat.com (Andrew Haley)
Date: Mon, 18 Mar 2019 15:53:04 +0000
Subject: RFR: 8211100: hotspot C1 issue with comparing long numbers on x86 32-bit
In-Reply-To:
References: <659DF4FF-71B9-472D-A064-038ADF2A50FF@oracle.com> <0C5ACDFD-EAA1-4EE0-AD1C-845B0B488680@azul.com>
Message-ID: <95dd86ac-ff05-59a4-9185-2e99775cff35@redhat.com>

On 3/13/19 12:43 PM, Dmitry Cherepanov wrote:
> Updated version of original fix (with ifdef X86 added):
> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.04/
>
> Is it okay to push it?

Surely you cannot contemplate pushing this patch with no explanatory comment.

"BEWARE! On 32-bit x86 cmp clobbers its left argument so we need a temp copy."

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From dmitrij.pochepko at bell-sw.com Mon Mar 18 17:03:04 2019
From: dmitrij.pochepko at bell-sw.com (Dmitrij Pochepko)
Date: Mon, 18 Mar 2019 20:03:04 +0300
Subject: RFR(S): 8216989 - CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier() does not check for zero length on AARCH64
Message-ID: <2f857512-c206-a977-1c48-118f9c9d9a63@bell-sw.com>

Hi all,

please review patch for JDK-8216989 CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier() does not check for zero length on AARCH64
webrev: http://cr.openjdk.java.net/~dpochepk/8216989/webrev.01/

All platforms except AARCH64 perform a zero-length check in the arraycopy post barrier and skip card marking for zero-length arrays. The missing check can lead to wrong card marking. This patch adds such a check.
Testing (using parallel gc, because default g1 is not affected):

- JCK
- jtreg hotspot tests: compiler/*, gc/* and runtime/*
- jtreg jdk tier1-3 tests

no regressions found.

CR: https://bugs.openjdk.java.net/browse/JDK-8216989

Thanks,
Dmitrij

From vladimir.kozlov at oracle.com Mon Mar 18 19:06:34 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 18 Mar 2019 12:06:34 -0700
Subject: RFR: JDK-8220389 - Update Graal
In-Reply-To:
References:
Message-ID: <38826dcc-6ff0-a9d4-f463-ca3050caf4af@oracle.com>

Changes are fine.

There are failures in testing but they seem unrelated.

There is a strange thing about the generated overwritten-diffs.txt file pointed out by Dean, but it is a problem of the script which generates it.

Thanks,
Vladimir

On 3/15/19 10:15 PM, jesper.wilhelmsson at oracle.com wrote:
> Hi,
>
> Please review the patch to integrate the latest Graal changes into OpenJDK.
> Graal tip to integrate: fe5d30fb9d5b1cfbf455dc161e749381a93732d1
>
> JBS duplicates deferred to the next integration:
> https://bugs.openjdk.java.net/browse/JDK-8214947
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8220389
> Webrev: http://cr.openjdk.java.net/~jwilhelm/8220389/webrev.00/
>
> This integration did overwrite changes already in place in OpenJDK. The diff has been attached to the umbrella bug.
>
> Thanks,
> /Jesper

From dean.long at oracle.com Mon Mar 18 19:14:05 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Mon, 18 Mar 2019 12:14:05 -0700
Subject: RFR: JDK-8220389 - Update Graal
In-Reply-To: <38826dcc-6ff0-a9d4-f463-ca3050caf4af@oracle.com>
References: <38826dcc-6ff0-a9d4-f463-ca3050caf4af@oracle.com>
Message-ID:

The change in make/test/JtregGraalUnit.gmk seems to be changing the indentation only, but the original indentation looks correct. Can we revert this change and correct the script?

dl

From jesper.wilhelmsson at oracle.com Tue Mar 19 00:53:10 2019
From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com)
Date: Tue, 19 Mar 2019 01:53:10 +0100
Subject: RFR: JDK-8220389 - Update Graal
In-Reply-To:
References: <38826dcc-6ff0-a9d4-f463-ca3050caf4af@oracle.com>
Message-ID: <4668115D-1C38-4309-AFAE-B3F42D6E3056@oracle.com>

I have reverted the make/test/JtregGraalUnit.gmk changes locally.

Thanks,
/Jesper

From jesper.wilhelmsson at oracle.com Tue Mar 19 00:55:49 2019
From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com)
Date: Tue, 19 Mar 2019 01:55:49 +0100
Subject: RFR: JDK-8220389 - Update Graal
In-Reply-To: <38826dcc-6ff0-a9d4-f463-ca3050caf4af@oracle.com>
References: <38826dcc-6ff0-a9d4-f463-ca3050caf4af@oracle.com>
Message-ID:

Are you referring to the indentation changes in make/test/JtregGraalUnit.gmk? Those have been reverted now.

Should the changes in overwritten-diffs be re-applied to the OpenJDK?

Thanks,
/Jesper

From dean.long at oracle.com Tue Mar 19 03:33:33 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Mon, 18 Mar 2019 20:33:33 -0700
Subject: RFR: JDK-8220389 - Update Graal
In-Reply-To:
References: <38826dcc-6ff0-a9d4-f463-ca3050caf4af@oracle.com>
Message-ID: <3e3bdd38-10ea-12de-4544-321d697a98de@oracle.com>

On 3/18/19 5:55 PM, jesper.wilhelmsson at oracle.com wrote:
> Should the changes in overwritten-diffs be re-applied to the OpenJDK?

No, that file shouldn't have been created, because there have been no JDK changes to those packages since 8218074. Check the date on the file. I suspect it's an old file.

dl

From rwestrel at redhat.com Tue Mar 19 08:14:49 2019
From: rwestrel at redhat.com (Roland Westrelin)
Date: Tue, 19 Mar 2019 09:14:49 +0100
Subject: RFR(S): 8220374: C2: LoopStripMining doesn't strip as expected
In-Reply-To:
References: <87ef7buov3.fsf@redhat.com> <42027540-aa62-0b10-c0ee-d98e8d6a5ece@oracle.com> <87r2b9tzl2.fsf@redhat.com>
Message-ID: <87wokvs2km.fsf@redhat.com>

Hi Martin,

> I have requested 11u and 12u backport of JDK-8219584. After that, this one should be trivial to backport.

Great. Thanks.

Roland.
From dcherepanov at azul.com Tue Mar 19 10:02:30 2019
From: dcherepanov at azul.com (Dmitry Cherepanov)
Date: Tue, 19 Mar 2019 10:02:30 +0000
Subject: RFR: 8211100: hotspot C1 issue with comparing long numbers on x86 32-bit
In-Reply-To:
References: <659DF4FF-71B9-472D-A064-038ADF2A50FF@oracle.com> <0C5ACDFD-EAA1-4EE0-AD1C-845B0B488680@azul.com>
Message-ID:

Hi Volker,

Thanks for the suggestion. I also don't like this aspect of the test, but I currently don't see a way to improve it. I tried introducing and checking a counter within the loop, but that makes the issue invisible (the test doesn't fail with a build without the fix).

Thanks,

Dmitry

From dcherepanov at azul.com Tue Mar 19 10:04:16 2019
From: dcherepanov at azul.com (Dmitry Cherepanov)
Date: Tue, 19 Mar 2019 10:04:16 +0000
Subject: RFR: 8211100: hotspot C1 issue with comparing long numbers on x86 32-bit
In-Reply-To: <95dd86ac-ff05-59a4-9185-2e99775cff35@redhat.com>
References: <659DF4FF-71B9-472D-A064-038ADF2A50FF@oracle.com> <0C5ACDFD-EAA1-4EE0-AD1C-845B0B488680@azul.com> <95dd86ac-ff05-59a4-9185-2e99775cff35@redhat.com>
Message-ID: <2B24B9C9-5E8B-42E6-BAB5-28EBBE55522A@azul.com>

Updated webrev with the comment: http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.05/

Thanks,

Dmitry

From adinn at redhat.com Tue Mar 19 10:04:47 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Tue, 19 Mar 2019 10:04:47 +0000
Subject: RFR(S): 8216989 - CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier() does not check for zero length on AARCH64
In-Reply-To: <2f857512-c206-a977-1c48-118f9c9d9a63@bell-sw.com>
References: <2f857512-c206-a977-1c48-118f9c9d9a63@bell-sw.com>
Message-ID:

Hello Dmitrij,

On 18/03/2019 17:03, Dmitrij Pochepko wrote:
> Hi all,
>
> please review patch for JDK-8216989
> CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier() does
> not check for zero length on AARCH64
> webrev: http://cr.openjdk.java.net/~dpochepk/8216989/webrev.01/
>
> All platforms except AARCH64 performs zero length check in arraycopy
> post barrier and skip card marking for zero length arrays. Missing check
> can lead to wrong card marking. This patch adds such check.
>
> Testing (using parallel gc, because default g1 is not affected):
>
> - JCK
> - jtreg hotspot tests: compiler/*, gc/* and runtime/*
> - jtreg jdk tier1-3 tests
>
> no regressions found.
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8216989

The patch looks good on its own. However, I think there is a bigger problem here and a better solution is available.

On x86_64 the stub code calls arraycopy_epilogue passing in a destination start address (dst) and count (cnt). The implementation in ShenandoahBarrierSetAssembler calls out to a C runtime handler (ShenandoahRuntime::write_ref_array_post_entry), passing dst and cnt.
The one in ModRefBarrierSetAssembler virtually invokes either G1BarrierSetAssembler::gen_write_ref_array_post_barrier or CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier, passing in dst and count. The G1 implementation calls out to C runtime code passing dst and count. The CardTable implementation subtracts 1 from count and then adds dst to cnt to get an inclusive end address and then processes the dst entries in a loop. So, the latter bails out at the start if cnt == 0 as there is no work to do. On AArch64 for some unexplained reason it is the stub code which modifies cnt, performing the decrement and address add. So, cnt is passed in to the ModRef and Shenandoah barrier set implementations of arraycopy_prologue as an inclusive end pointer. The implementation of arraycopy_prologue in ModRefBarrierSetAssembler passes these values on through to G1BarrierSetAssembler::gen_write_ref_array_post_barrier and CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier -- the AArch64 versions -- using a virtual invoke, just as for x86 So, the AArch64 CardTable implementation now needs to bail out if end < start where the x86 version bails out if cnt == 0. And, of course, it doesn't include code for the decrement and pointer add. However, the AArch64 ShenandoahBarrierSetAssembler::arraycopy_epilogue implementation now has to convert its inclusive end pointer argument back to a count. So, it adds BytesPerHeapOop to end, subtracts start and then shifts by LogBytesPerHeapOop. Similarly, the G1 implementation of gen_write_ref_array_post_barrier has to convert its end pointer back to a count performing the same operations. This looks like a pointless divergence from x86_64 that simply adds more work and complicates the code. So, I suggest fixing AArch64 by following x86_64 i.e. 
relocating the conversion of count to an inclusive end pointer back into the CardTable implementation of gen_write_ref_array_post_barrier removing the pointer arithmetic to re-establish the count in ShenandoahBarrierSetAssembler::arraycopy_epilogue and G1BarrierSetAssembler::gen_write_ref_array_post_barrier rewriting your bail out test to detect cnt == 0 regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From aph at redhat.com Tue Mar 19 10:19:05 2019 From: aph at redhat.com (Andrew Haley) Date: Tue, 19 Mar 2019 10:19:05 +0000 Subject: RFR: 8211100: hotspot C1 issue with comparing long numbers on x86 32-bit In-Reply-To: <2B24B9C9-5E8B-42E6-BAB5-28EBBE55522A@azul.com> References: <659DF4FF-71B9-472D-A064-038ADF2A50FF@oracle.com> <0C5ACDFD-EAA1-4EE0-AD1C-845B0B488680@azul.com> <95dd86ac-ff05-59a4-9185-2e99775cff35@redhat.com> <2B24B9C9-5E8B-42E6-BAB5-28EBBE55522A@azul.com> Message-ID: <554f312f-861e-b394-1ab8-cfa20ab559fd@redhat.com> On 3/19/19 10:04 AM, Dmitry Cherepanov wrote: > Updated webrev with the comment: > http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.05/ OK, that's fine. Thanks for keeping x86-32 alive. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. 
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Tue Mar 19 10:52:43 2019 From: aph at redhat.com (Andrew Haley) Date: Tue, 19 Mar 2019 10:52:43 +0000 Subject: [aarch64-port-dev ] RFR(S): 8216989 - CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier() does not check for zero length on AARCH64 In-Reply-To: References: <2f857512-c206-a977-1c48-118f9c9d9a63@bell-sw.com> Message-ID: <288f6d65-5aff-0ba1-1a1c-890700b985db@redhat.com> On 3/19/19 10:04 AM, Andrew Dinn wrote: > On AArch64 for some unexplained reason it is the stub code which > modifies cnt, performing the decrement and address add. That's how x86 did it back then. x86 got fixed by http://hg.openjdk.java.net/jdk8/jdk8/hotspot/rev/3f281b313240 We should look at that patch *extremely* *carefully*, and make sure we've got all the fixes. The AArch64 version is based on that code, so it will probably have inherited its bugs. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From adinn at redhat.com Tue Mar 19 11:16:02 2019 From: adinn at redhat.com (Andrew Dinn) Date: Tue, 19 Mar 2019 11:16:02 +0000 Subject: [aarch64-port-dev ] RFR(S): 8216989 - CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier() does not check for zero length on AARCH64 In-Reply-To: <288f6d65-5aff-0ba1-1a1c-890700b985db@redhat.com> References: <2f857512-c206-a977-1c48-118f9c9d9a63@bell-sw.com> <288f6d65-5aff-0ba1-1a1c-890700b985db@redhat.com> Message-ID: On 19/03/2019 10:52, Andrew Haley wrote: > On 3/19/19 10:04 AM, Andrew Dinn wrote: >> On AArch64 for some unexplained reason it is the stub code which >> modifies cnt, performing the decrement and address add. > > That's how x86 did it back then. x86 got fixed by > http://hg.openjdk.java.net/jdk8/jdk8/hotspot/rev/3f281b313240 > > We should look at that patch *extremely* *carefully*, and make sure we've > got all the fixes. 
The AArch64 version is based on that code, so it will
> probably have inherited its bugs.

Ok, noted.

Dmitrij, please prepare a revised patch to re-align AArch64 with x86_64
as previously recommended and then check it is consistent with the above
commit. I'll do the same at re-review to ensure that we have two sets of
eyes on the problem.

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From volker.simonis at gmail.com Tue Mar 19 14:33:45 2019
From: volker.simonis at gmail.com (Volker Simonis)
Date: Tue, 19 Mar 2019 15:33:45 +0100
Subject: RFR: 8211100: hotspot C1 issue with comparing long numbers on x86 32-bit
In-Reply-To:
References: <659DF4FF-71B9-472D-A064-038ADF2A50FF@oracle.com> <0C5ACDFD-EAA1-4EE0-AD1C-845B0B488680@azul.com>
Message-ID:

On Tue, Mar 19, 2019 at 11:02 AM Dmitry Cherepanov wrote:
>
> Hi Volker,
>
> Thanks for the suggestion. I also don't like the aspect of the test but currently don't see a way to improve this. I tried introducing/checking counter within the loop but it makes the issue invisible (the test doesn't fail with build without the fix).
>

Hi Dmitry,

thanks for trying an alternative solution. No more objections from my
side if the test is really that sensitive.

Regards,
Volker

> Thanks,
>
> Dmitry
>
> > On Mar 18, 2019, at 4:25 PM, Volker Simonis wrote:
> >
> > Hi Dmitry,
> >
> > thanks for the confirmation. I don't like tests which fail implicitly
> > by timing out. First because it is not immediately evident if it is
> > really a test failure or just an infrastructure problem and second
> > because they unnecessarily use cpu time until they time out.
> >
> > Wouldn't it be possible to change the tests such that it fails
> > explicitly with an Exception - e.g. by introducing and checking a
> > counter within the loop?
> >
> > Thanks,
> > Volker
> >
> > On Mon, Mar 18, 2019 at 11:39 AM Dmitry Cherepanov wrote:
> >>
> >> Hi Volker,
> >>
> >> With the fix, the test method reaches the break statement in the first iteration of the loop and passes. Without the fix, the test method enters an infinite loop and fails with a timeout error.
> >>
> >> Thanks,
> >>
> >> Dmitry
> >>
> >>> On Mar 16, 2019, at 11:12 AM, Volker Simonis wrote:
> >>>
> >>> Hi Dmitry,
> >>>
> >>> sorry, but I don't understand how the regression test works. Can you please explain what's the expected result of the test without and with your fix?
> >>>
> >>> Thanks,
> >>> Volker
> >>>
> >>> Dmitry Cherepanov wrote on Wed, Mar 13, 2019 at 13:44:
> >>> Igor,
> >>>
> >>> Updated version of original fix (with ifdef X86 added):
> >>> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.04/
> >>>
> >>> Is it okay to push it?
> >>>
> >>> Thanks,
> >>>
> >>> Dmitry
> >>>
> >>>> On Mar 13, 2019, at 5:24 AM, Igor Veresov wrote:
> >>>>
> >>>> Dmitry,
> >>>>
> >>>> After some digging around I think your original fix is ok. In addition to !_LP64 can you add ifdef X86?
> >>>>
> >>>> igor
> >>>>
> >>>>
> >>>>
> >>>>> On Mar 6, 2019, at 3:07 AM, Dmitry Cherepanov wrote:
> >>>>>
> >>>>> Igor,
> >>>>>
> >>>>> Sorry for the delay in responding.
> >>>>>
> >>>>> I updated comp_op (in c1_LIRAssembler_x86.cpp) to make use of tmp1 for this case. The changes: http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.03/
> >>>>>
> >>>>> For this change, I got assertion failed (from cpu_regnrLo, in c1_LIR.hpp). Sorry if this is an obvious question - Do I understand correctly that another part of this solution should be an additional change that would allocate tmp1? Or is there an existing code that should take care of it already and just need to enable the allocation of tmp1 for this case?
> >>>>>
> >>>>> Another question: given that this is a major issue on x86 32bit system, would you mind if we proceed with the current minimal/low-risk fix (http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.01/) and create new JBS issue to investigate more generic approach separately?
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Dmitry
> >>>>>
> >>>>>> On Oct 2, 2018, at 8:09 PM, Igor Veresov wrote:
> >>>>>>
> >>>>>> Right, I forgot how it works. Sorry for the confusion. I think there is no way to explicitly describe a register kill in C1. I guess the only option is to just avoid clobbering opr1. So may be we should make use of tmp1 for lir_cmp to save/restore opr1? Again, tmp1 would have to be allocated only for this particular case.
> >>>>>>
> >>>>>> igor
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> On Oct 1, 2018, at 7:15 AM, Dmitry Cherepanov wrote:
> >>>>>>>
> >>>>>>> Hi Igor,
> >>>>>>>
> >>>>>>> Thanks for the suggestions. I tried to make the opr1 a temporary
> >>>>>>>
> >>>>>>> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.02/
> >>>>>>>
> >>>>>>> but the generated code still has the problem. Looking into the log with -XX:TraceLinearScanLevel=4 (http://cr.openjdk.java.net/~dcherepanov/8211100/TraceLinearScanLevel.02.log) seems like the reason for this is that the opr1 (virtual register R165 in the log) is also an input operand and its range becomes wider and the shorter ranges (corresponding to the opr1 marked as temp) are merged to the single range. Can the input operand be temporary at the same time?
> >>>>>>>
> >>>>>>> Dmitry
> >>>>>>>
> >>>>>>>> On Sep 27, 2018, at 2:18 AM, Igor Veresov wrote:
> >>>>>>>>
> >>>>>>>> Edit: It may be more consistent to check for is_double_cpu() instead of T_LONG. Although that's semantically equivalent.
> >>>>>>>>
> >>>>>>>>> On Sep 26, 2018, at 9:35 AM, Igor Veresov wrote:
> >>>>>>>>>
> >>>>>>>>> It doesn't seem to me like the proper way to fix it.
The problem is that the cmp is destroying opr1 without telling the register allocator about it. > >>>>>>>>> > >>>>>>>>> One possible solution would be to make opr1 also a temp (see LIR_OpVisitState::visit(LIR_Op* op) in c1_LIR.cpp), only for x86 32bit and only if the operand type is T_LONG. > >>>>>>>>> Another solution is to maintain a temporary register for lir_cmp and use it to save/restore opr1 when emitting the code in LIR_Assembler::comp_op(). Again, the temporary register has to be there only for x86 32bit and T_LONG. > >>>>>>>>> > >>>>>>>>> igor > >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> On Sep 26, 2018, at 1:29 AM, Tobias Hartmann wrote: > >>>>>>>>>> > >>>>>>>>>> Hi Dmitry, > >>>>>>>>>> > >>>>>>>>>> this looks good to me but Igor (who implemented 8201447) should have a look as well. > >>>>>>>>>> > >>>>>>>>>> Best regards, > >>>>>>>>>> Tobias > >>>>>>>>>> > >>>>>>>>>> On 26.09.2018 09:04, Dmitry Cherepanov wrote: > >>>>>>>>>>> Hi Tobias, > >>>>>>>>>>> > >>>>>>>>>>> Thanks for the review, updated patch avoids the additional move on x86_64 and includes the > >>>>>>>>>>> regression test. > >>>>>>>>>>> > >>>>>>>>>>> http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.01/ > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Dmitry > >>>>>>>>>>> > >>>>>>>>>>>> On Sep 25, 2018, at 6:40 PM, Tobias Hartmann >>>>>>>>>>>> > wrote: > >>>>>>>>>>>> > >>>>>>>>>>>> Hi Dmitry, > >>>>>>>>>>>> > >>>>>>>>>>>> Shouldn't this at least be guarded by an #ifndef _LP64 to avoid the additional move on x86_64? > >>>>>>>>>>>> > >>>>>>>>>>>> Could you please add the regression test to the webrev? Or did this reproduce with other tests? > >>>>>>>>>>>> > >>>>>>>>>>>> Thanks, > >>>>>>>>>>>> Tobias > >>>>>>>>>>>> > >>>>>>>>>>>> On 25.09.2018 16:00, Dmitry Cherepanov wrote: > >>>>>>>>>>>>> Hello, > >>>>>>>>>>>>> > >>>>>>>>>>>>> Please review a patch that resolves issue in x86 32bit builds. 
It slightly adjusts the fix for
> >>>>>>>>>>>>> JDK-8201447 (C1 does backedge profiling incorrectly) by creating a copy of the left operand and
> >>>>>>>>>>>>> using it for incrementing backedge counter.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> JBS issue: https://bugs.openjdk.java.net/browse/JDK-8211100
> >>>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~dcherepanov/8211100/webrev.00/
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Dmitry
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
>

From volker.simonis at gmail.com Tue Mar 19 18:54:28 2019
From: volker.simonis at gmail.com (Volker Simonis)
Date: Tue, 19 Mar 2019 19:54:28 +0100
Subject: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code
Message-ID:

Hi,

can I please have a review for the following small ppc64-only C1 patch
which fixes a nasty, day-one problem which only recently popped up more
frequently:

http://cr.openjdk.java.net/~simonis/webrevs/2019/8221083/
https://bugs.openjdk.java.net/browse/JDK-8221083

The C1 generated code for comparing two oops erroneously emits a 32-bit
instead of a 64-bit compare instruction. Because oops are only compared
for equality/inequality, this bug only manifests for oops which are
equal in their 32 least-significant bits but unequal otherwise. This
means the two oops have to be exactly 4GB apart from each other in the
heap or their 32 least significant bits have to be zero when compared
to 'null'.

This makes the occurrence of this bug extremely unlikely, but when it
happens, the consequences are usually a semantically wrong program
execution and not a crash, which makes it very hard to detect.

The regression test reproduces the issue by allocating an object at an
address with the 32 least significant bits being zero and compares it
with another null object.

The fix also removes some adjacent code which has never been used (and
tested) until now.
Thank you and best regards,
Volker

From fw at deneb.enyo.de Tue Mar 19 21:47:34 2019
From: fw at deneb.enyo.de (Florian Weimer)
Date: Tue, 19 Mar 2019 22:47:34 +0100
Subject: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code
In-Reply-To: (Volker Simonis's message of "Tue, 19 Mar 2019 19:54:28 +0100")
References:
Message-ID: <87k1guiljd.fsf@mid.deneb.enyo.de>

* Volker Simonis:

> The regression test reproduces the issue by allocating an object at an
> address with the 32 least significant bits being zero and compares
> it with another null object.

    for (int i = 0; i < 10; i++) {
        System.gc();
        for (int j = 0; j < 1024; j++) {
            s = new String("I'm not null!!!");
            if (WB.getObjectAddress(s) == 0x700000000L) break;
        }
        if (WB.getObjectAddress(s) == 0x700000000L) {
            System.out.println("Got object at address 0x700000000");
            break;
        }
    }

I think this could use a labeled loop, like this:

    GC_TESTS:
    for (int i = 0; i < 10; i++) {
        System.gc();
        for (int j = 0; j < 1024; j++) {
            s = new String("I'm not null!!!");
            if (WB.getObjectAddress(s) == 0x700000000L) {
                System.out.println("Got object at address 0x700000000");
                break GC_TESTS;
            }
        }
    }

(Untested.)

On the other hand, neither this version nor yours properly detects when
s is *not* actually the object with the desired address, in which case
the test objective fails to materialize, I think.

From martin.doerr at sap.com Wed Mar 20 11:24:17 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Wed, 20 Mar 2019 11:24:17 +0000
Subject: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code
In-Reply-To: <87k1guiljd.fsf@mid.deneb.enyo.de>
References: <87k1guiljd.fsf@mid.deneb.enyo.de>
Message-ID:

Hi Volker,

I think the 2 nested loops are not really needed. If you use System.gc() + new String you should get it allocated at the desired location. I believe this is sufficient with the flags you are using.

The test seems to test interpreter, C1, and C2 with the loop limit of 30_000.
I like that.

The C1 fix looks good. Thanks for finding and fixing this bug.

Best regards,
Martin


-----Original Message-----
From: hotspot-compiler-dev On Behalf Of Florian Weimer
Sent: Tuesday, March 19, 2019 22:48
To: Volker Simonis
Cc: hotspot compiler
Subject: Re: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code

* Volker Simonis:

> The regression test reproduces the issue by allocating an object at an
> address with the 32 least significant bits being zero and compares
> it with another null object.

    for (int i = 0; i < 10; i++) {
        System.gc();
        for (int j = 0; j < 1024; j++) {
            s = new String("I'm not null!!!");
            if (WB.getObjectAddress(s) == 0x700000000L) break;
        }
        if (WB.getObjectAddress(s) == 0x700000000L) {
            System.out.println("Got object at address 0x700000000");
            break;
        }
    }

I think this could use a labeled loop, like this:

    GC_TESTS:
    for (int i = 0; i < 10; i++) {
        System.gc();
        for (int j = 0; j < 1024; j++) {
            s = new String("I'm not null!!!");
            if (WB.getObjectAddress(s) == 0x700000000L) {
                System.out.println("Got object at address 0x700000000");
                break GC_TESTS;
            }
        }
    }

(Untested.)

On the other hand, neither this version nor yours properly detects when
s is *not* actually the object with the desired address, in which case
the test objective fails to materialize, I think.
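
The failure mode discussed in the 8221083 thread can be sketched in plain Java, without ppc64 assembly or the WhiteBox API. This is only an illustration under stated assumptions: oops are modelled as long values, the class and method names (OopCompareSketch, equalLow32, equal64) are hypothetical, and nothing here is taken from the webrev. The address 0x700000000L is the one the regression test looks for.

```java
public class OopCompareSketch {
    // Models the buggy compare: only the 32 least-significant bits take part.
    public static boolean equalLow32(long a, long b) {
        return (int) a == (int) b;
    }

    // Models the intended compare: all 64 bits take part.
    public static boolean equal64(long a, long b) {
        return a == b;
    }

    public static void main(String[] args) {
        long nullOop = 0L;
        long obj = 0x700000000L;         // low 32 bits are all zero
        long other = obj + (1L << 32);   // same low 32 bits, 4GB further up
        long ordinary = 0x700000008L;    // low 32 bits are non-zero

        // A truncated compare reports a non-null object as equal to null.
        System.out.println(equalLow32(obj, nullOop));      // true
        System.out.println(equal64(obj, nullOop));         // false

        // Two distinct addresses 4GB apart also look equal when truncated.
        System.out.println(equalLow32(obj, other));        // true
        System.out.println(equal64(obj, other));           // false

        // For most addresses both compares agree, which is why the bug is rare.
        System.out.println(equalLow32(ordinary, nullOop)); // false
    }
}
```

The last case is the reason the problem stayed hidden for so long: the two compares only disagree when the low 32 bits happen to coincide.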
From dmitrij.pochepko at bell-sw.com Wed Mar 20 17:29:47 2019 From: dmitrij.pochepko at bell-sw.com (Dmitrij Pochepko) Date: Wed, 20 Mar 2019 20:29:47 +0300 Subject: [aarch64-port-dev ] RFR(S): 8216989 - CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier() does not check for zero length on AARCH64 In-Reply-To: References: <2f857512-c206-a977-1c48-118f9c9d9a63@bell-sw.com> <288f6d65-5aff-0ba1-1a1c-890700b985db@redhat.com> Message-ID: On 19/03/2019 2:16 PM, Andrew Dinn wrote: > On 19/03/2019 10:52, Andrew Haley wrote: >> On 3/19/19 10:04 AM, Andrew Dinn wrote: >>> On AArch64 for some unexplained reason it is the stub code which >>> modifies cnt, performing the decrement and address add. >> That's how x86 did it back then. x86 got fixed by >> http://hg.openjdk.java.net/jdk8/jdk8/hotspot/rev/3f281b313240 >> >> We should look at that patch *extremely* *carefully*, and make sure we've >> got all the fixes. The AArch64 version is based on that code, so it will >> probably have inherited its bugs. > Ok, noted. > > Dmitrij, please prepare a revised patch to re-align AArch64 with x86_64 > as previously recommended and then check it is consistent with the above > commit. I'll do the same at re-review to ensure that we have two sets of > eyes on the problem. 
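
The count versus inclusive-end-pointer conversions discussed in this thread can be sketched in Java. This is a simplified model, not HotSpot code: the class and method names are hypothetical, the card size (512 bytes) and oop size (8 bytes) are assumptions, and the card table is a plain byte array in which 0 stands for "dirty".

```java
import java.util.Arrays;

public class CardMarkSketch {
    static final int CARD_SHIFT = 9;               // assumed: 512-byte cards
    static final int LOG_BYTES_PER_HEAP_OOP = 3;   // assumed: 8-byte oops
    static final int BYTES_PER_HEAP_OOP = 1 << LOG_BYTES_PER_HEAP_OOP;

    // Dirties the cards covering `count` oops starting at byte address `start`
    // and returns how many cards were dirtied.
    public static int postBarrier(byte[] cards, long start, long count) {
        if (count == 0) {
            // Nothing was copied. Without this check the inclusive end
            // computed below would lie before `start`.
            return 0;
        }
        long end = start + (count - 1) * BYTES_PER_HEAP_OOP; // inclusive end
        int first = (int) (start >>> CARD_SHIFT);
        int last = (int) (end >>> CARD_SHIFT);
        for (int i = first; i <= last; i++) {
            cards[i] = 0; // 0 means "dirty" in this sketch
        }
        return last - first + 1;
    }

    // The reverse conversion that a callee needs when it is handed an
    // inclusive end pointer but wants a count: add one oop, subtract the
    // start, shift by the log of the oop size.
    public static long countFromInclusiveEnd(long start, long end) {
        return (end + BYTES_PER_HEAP_OOP - start) >>> LOG_BYTES_PER_HEAP_OOP;
    }

    public static void main(String[] args) {
        byte[] cards = new byte[16];
        Arrays.fill(cards, (byte) -1);                        // -1 means "clean" here
        System.out.println(postBarrier(cards, 512, 0));       // 0
        System.out.println(postBarrier(cards, 512, 65));      // 2
        System.out.println(countFromInclusiveEnd(512, 1016)); // 64
    }
}
```

Passing the count through unchanged and converting to an end pointer only inside the card-table code keeps the zero-length test a simple count == 0 check and makes the reverse conversion unnecessary for callees that want a count.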
Hi,

Please take a look at http://cr.openjdk.java.net/~dpochepk/8216989/webrev.02/

I changed the patch according to the x86 fix and current code layout:

- renamed method parameters from "end" to "count" to avoid confusing names and match x86
- updated G1BarrierSetAssembler::gen_write_ref_array_post_barrier implementation: removed code with calculation of "count" value and using "count" directly instead
- updated ShenandoahBarrierSetAssembler::arraycopy_epilogue in same way as G1 code
- updated CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier with zero length branch and calculation of inclusive end pointer to match original logic
- updated arraycopy_epilogue usage by removing unnecessary end pointer calculation and providing array length (count) instead

I also ran jtreg hotspot/compiler, hotspot/gc, hotspot/runtime and jck with G1GC, ParallelGC and ShenandoahGC.

No regressions found.

Thanks,
Dmitrij

From vladimir.x.ivanov at oracle.com Thu Mar 21 00:21:55 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 20 Mar 2019 17:21:55 -0700
Subject: RFR: JDK-8220714: C2 Compilation failure when accessing off-heap memory using Unsafe
In-Reply-To: <78d50a10-b2c8-8fc1-58a1-83d118a99965@oracle.com>
References: <00c27700-5c6f-f7dd-603e-c69a324dd8ba@redhat.com> <78d50a10-b2c8-8fc1-58a1-83d118a99965@oracle.com>
Message-ID: <5a87b5c6-f403-8bb7-f803-e041dafaf160@oracle.com>

Proposed fix looks good [1].

(FYI the patch [2] differs significantly from the webrev.)

> Looking on changes and it seems you can modify it a little:
>
>   // Can base be NULL? Otherwise, always on-heap access.
>   bool can_access_non_heap =
> TypePtr::NULL_PTR->higher_equal(_gvn.type(base));
>   if (!can_access_non_heap) {
>     heap_base_oop = base;
>     decorators |= IN_HEAP;
>   } else if (type == T_OBJECT) {
>     return false; // off-heap oop accesses are not supported
>   }

It doesn't look right.
There're 3 cases we care about here:
  * on-heap accesses (non-NULL base)
  * off-heap accesses (NULL base)
  * mixed accesses (base may be NULL)

'can_access_non_heap' distinguishes on-heap accesses from off-heap/mixed ones.

'heap_base_oop != top()' distinguishes off-heap accesses from on-heap/mixed ones.

Best regards,
Vladimir Ivanov

[1]

-  bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
+  bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(base));

[2] http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.00/jdk-jdk.changeset

-  bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
+  bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(UseNewCode ? base : heap_base_oop));

> On 3/15/19 4:39 AM, Roman Kennke wrote:
>> A user reported misbehaving off-heap access. It looks like a C2
>> compilation failure that seems to only trigger with Shenandoah:
>>
>> https://mail.openjdk.java.net/pipermail/shenandoah-dev/2019-March/009060.html
>>
>>
>> Eg with G1 generates this assembly for swapping two array elements:
>> mov (%r8),%r9d
>> mov (%r10),%r11d
>> mov %r9d,(%r10)
>> mov %r11d,(%r8)
>>
>> While with Shenandoah we get this:
>> mov (%r9),%ecx
>> mov %ecx,(%r10,%r11,1)
>> mov %ecx,(%r9)
>>
>> I.e. the two loads seem to have been wrongly coalesced into one.
>>
>> Even though that is only triggered by Shenandoah, it seems to be a legit
>> and generic C2 problem.
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8220714
>> Webrev:
>> http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.00/
>>
>> The issue seems to be that off-heap accesses are supposed to use
>> MO_RELAXED mem-ordering instead of MO_UNORDERED, as implemented in
>> C2Access::needs_cpu_membar(). However, it seems we wrongly set this here
>> (library_call.cpp around l2410:
>>
>>   // Can base be NULL? Otherwise, always on-heap access.
>>   bool can_access_non_heap =
>> TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
>>   if (!can_access_non_heap) {
>>     decorators |= IN_HEAP;
>>   }
>>
>> However, heap_base_oop is initialized to top() a few lines up, and then
>> never updated, at least not for the off-heap-access case. And top
>> doesn't match NULL_PTR afaik.
>>
>> The proposed fix uses base instead of heap_base_oop for that check, this
>> should be updated correctly through make_unsafe_addr() and
>> classify_unsafe_addr().
>>
>> For some reason, this bug only seems to be exposed when running
>> Shenandoah, and then only when actually compiling with barriers. We
>> could not reproduce this with any other GC. It totally eludes me why
>> that is so. For this reason, the testcase goes under the
>> gc/shenandoah/compiler directory.
>>
>> Roman
>>

From rkennke at redhat.com Thu Mar 21 12:00:38 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 21 Mar 2019 13:00:38 +0100
Subject: RFR: JDK-8220714: C2 Compilation failure when accessing off-heap memory using Unsafe
In-Reply-To: <5a87b5c6-f403-8bb7-f803-e041dafaf160@oracle.com>
References: <00c27700-5c6f-f7dd-603e-c69a324dd8ba@redhat.com> <78d50a10-b2c8-8fc1-58a1-83d118a99965@oracle.com> <5a87b5c6-f403-8bb7-f803-e041dafaf160@oracle.com>
Message-ID:

Hi Vladimir,

Thanks for reviewing!

Not sure what happened to the webrev. Parts of it seem to have snuck in from a previous working version. Here's a good one:

http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.01/

Thanks,
Roman

> Proposed fix looks good [1].
>
> (FYI the patch [2] differs significantly from the webrev.)
>
>> Looking on changes and it seems you can modify it a little:
>>
>>   // Can base be NULL? Otherwise, always on-heap access.
>>   bool can_access_non_heap =
>> TypePtr::NULL_PTR->higher_equal(_gvn.type(base));
>>   if (!can_access_non_heap) {
>>     heap_base_oop = base;
>>     decorators |= IN_HEAP;
>>   } else if (type == T_OBJECT) {
>>     return false; // off-heap oop accesses are not supported
>>   }
>
> It doesn't look right.
>
> There're 3 cases we care about here:
>   * on-heap accesses (non-NULL base)
>   * off-heap accesses (NULL base)
>   * mixed accesses (base may be NULL)
>
> 'can_access_non_heap' distinguishes on-heap accesses from off-heap/mixed
> ones.
>
> 'heap_base_oop != top()' distinguishes off-heap accesses from
> on-heap/mixed ones.
>
> Best regards,
> Vladimir Ivanov
>
> [1]
>
> -  bool can_access_non_heap =
> TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
> +  bool can_access_non_heap =
> TypePtr::NULL_PTR->higher_equal(_gvn.type(base));
>
>
> [2]
> http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.00/jdk-jdk.changeset
>
> -  bool can_access_non_heap =
> TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
> +  bool can_access_non_heap =
> TypePtr::NULL_PTR->higher_equal(_gvn.type(UseNewCode ? base :
> heap_base_oop));
>
>
>> On 3/15/19 4:39 AM, Roman Kennke wrote:
>>> A user reported misbehaving off-heap access. It looks like a C2
>>> compilation failure that seems to only trigger with Shenandoah:
>>>
>>> https://mail.openjdk.java.net/pipermail/shenandoah-dev/2019-March/009060.html
>>>
>>>
>>> Eg with G1 generates this assembly for swapping two array elements:
>>> mov (%r8),%r9d
>>> mov (%r10),%r11d
>>> mov %r9d,(%r10)
>>> mov %r11d,(%r8)
>>>
>>> While with Shenandoah we get this:
>>> mov (%r9),%ecx
>>> mov %ecx,(%r10,%r11,1)
>>> mov %ecx,(%r9)
>>>
>>> I.e. the two loads seem to have been wrongly coalesced into one.
>>>
>>> Even though that is only triggered by Shenandoah, it seems to be a legit
>>> and generic C2 problem.
>>>
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8220714
>>> Webrev:
>>> http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.00/
>>>
>>> The issue seems to be that off-heap accesses are supposed to use
>>> MO_RELAXED mem-ordering instead of MO_UNORDERED, as implemented in
>>> C2Access::needs_cpu_membar().
However, it seems we wrongly set this here
>>> (library_call.cpp around l2410:
>>>
>>>   // Can base be NULL? Otherwise, always on-heap access.
>>>   bool can_access_non_heap =
>>> TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
>>>   if (!can_access_non_heap) {
>>>     decorators |= IN_HEAP;
>>>   }
>>>
>>> However, heap_base_oop is initialized to top() a few lines up, and then
>>> never updated, at least not for the off-heap-access case. And top
>>> doesn't match NULL_PTR afaik.
>>>
>>> The proposed fix uses base instead of heap_base_oop for that check, this
>>> should be updated correctly through make_unsafe_addr() and
>>> classify_unsafe_addr().
>>>
>>> For some reason, this bug only seems to be exposed when running
>>> Shenandoah, and then only when actually compiling with barriers. We
>>> could not reproduce this with any other GC. It totally eludes me why
>>> that is so. For this reason, the testcase goes under the
>>> gc/shenandoah/compiler directory.
>>>
>>> Roman
>>>

From rwestrel at redhat.com Thu Mar 21 16:35:50 2019
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 21 Mar 2019 17:35:50 +0100
Subject: RFR: JDK-8220714: C2 Compilation failure when accessing off-heap memory using Unsafe
In-Reply-To:
References: <00c27700-5c6f-f7dd-603e-c69a324dd8ba@redhat.com> <78d50a10-b2c8-8fc1-58a1-83d118a99965@oracle.com> <5a87b5c6-f403-8bb7-f803-e041dafaf160@oracle.com>
Message-ID: <87o964rxqx.fsf@redhat.com>

> http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.01/

That looks good to me.

Roland.
From rkennke at redhat.com Thu Mar 21 16:39:29 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 21 Mar 2019 17:39:29 +0100
Subject: RFR: JDK-8220714: C2 Compilation failure when accessing off-heap memory using Unsafe
In-Reply-To: <87o964rxqx.fsf@redhat.com>
References: <00c27700-5c6f-f7dd-603e-c69a324dd8ba@redhat.com> <78d50a10-b2c8-8fc1-58a1-83d118a99965@oracle.com> <5a87b5c6-f403-8bb7-f803-e041dafaf160@oracle.com> <87o964rxqx.fsf@redhat.com>
Message-ID: <08956008-ab27-aa4a-0679-e24936e307bb@redhat.com>

Thanks, Roland!

Roman

>
>> http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.01/
>
> That looks good to me.
>
> Roland.
>

From vladimir.x.ivanov at oracle.com Thu Mar 21 16:44:27 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 21 Mar 2019 09:44:27 -0700
Subject: RFR: JDK-8220714: C2 Compilation failure when accessing off-heap memory using Unsafe
In-Reply-To:
References: <00c27700-5c6f-f7dd-603e-c69a324dd8ba@redhat.com> <78d50a10-b2c8-8fc1-58a1-83d118a99965@oracle.com> <5a87b5c6-f403-8bb7-f803-e041dafaf160@oracle.com>
Message-ID:

test/hotspot/jtreg/gc/shenandoah/compiler/TestUnsafeOffheapSwap.java

Since it's not a GC bug, I'd expect to see the regression test among
compiler tests (compiler/unsafe/ or compiler/c2/).

  28  * @requires vm.gc.Shenandoah

I prefer "-XX:+IgnoreUnrecognizedVMOptions". As you noted, the bug is
generic, but is triggered only w/ Shenandoah enabled in this particular
case.

Best regards,
Vladimir Ivanov

On 21/03/2019 05:00, Roman Kennke wrote:
> Hi Vladimir,
>
> Thanks for reviewing!
>
> Not sure what happened to the webrev. Parts of it seem to have snuck in
> from a previous working version. Here's a good one:
>
> http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.01/
>
> Thanks,
> Roman
>
>
>> Proposed fix looks good [1].
>>
>> (FYI the patch [2] differs significantly from the webrev.)
>>
>>> Looking on changes and it seems you can modify it a little:
>>>
>>>   // Can base be NULL?
Otherwise, always on-heap access.
>>>   bool can_access_non_heap =
>>> TypePtr::NULL_PTR->higher_equal(_gvn.type(base));
>>>   if (!can_access_non_heap) {
>>>     heap_base_oop = base;
>>>     decorators |= IN_HEAP;
>>>   } else if (type == T_OBJECT) {
>>>     return false; // off-heap oop accesses are not supported
>>>   }
>>
>> It doesn't look right.
>>
>> There're 3 cases we care about here:
>>   * on-heap accesses (non-NULL base)
>>   * off-heap accesses (NULL base)
>>   * mixed accesses (base may be NULL)
>>
>> 'can_access_non_heap' distinguishes on-heap accesses from
>> off-heap/mixed ones.
>>
>> 'heap_base_oop != top()' distinguishes off-heap accesses from
>> on-heap/mixed ones.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1]
>>
>> -  bool can_access_non_heap =
>> TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
>> +  bool can_access_non_heap =
>> TypePtr::NULL_PTR->higher_equal(_gvn.type(base));
>>
>>
>> [2]
>> http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.00/jdk-jdk.changeset
>>
>>
>> -  bool can_access_non_heap =
>> TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
>> +  bool can_access_non_heap =
>> TypePtr::NULL_PTR->higher_equal(_gvn.type(UseNewCode ? base :
>> heap_base_oop));
>>
>>
>>> On 3/15/19 4:39 AM, Roman Kennke wrote:
>>>> A user reported misbehaving off-heap access. It looks like a C2
>>>> compilation failure that seems to only trigger with Shenandoah:
>>>>
>>>> https://mail.openjdk.java.net/pipermail/shenandoah-dev/2019-March/009060.html
>>>>
>>>>
>>>> Eg with G1 generates this assembly for swapping two array elements:
>>>> mov (%r8),%r9d
>>>> mov (%r10),%r11d
>>>> mov %r9d,(%r10)
>>>> mov %r11d,(%r8)
>>>>
>>>> While with Shenandoah we get this:
>>>> mov (%r9),%ecx
>>>> mov %ecx,(%r10,%r11,1)
>>>> mov %ecx,(%r9)
>>>>
>>>> I.e. the two loads seem to have been wrongly coalesced into one.
>>>>
>>>> Even though that is only triggered by Shenandoah, it seems to be a
>>>> legit
>>>> and generic C2 problem.
>>>>
>>>> Bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8220714
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.00/
>>>>
>>>> The issue seems to be that off-heap accesses are supposed to use
>>>> MO_RELAXED mem-ordering instead of MO_UNORDERED, as implemented in
>>>> C2Access::needs_cpu_membar(). However, it seems we wrongly set this
>>>> here (library_call.cpp, around line 2410):
>>>>
>>>>   // Can base be NULL? Otherwise, always on-heap access.
>>>>   bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
>>>>   if (!can_access_non_heap) {
>>>>     decorators |= IN_HEAP;
>>>>   }
>>>>
>>>> However, heap_base_oop is initialized to top() a few lines up, and then
>>>> never updated, at least not for the off-heap-access case. And top
>>>> doesn't match NULL_PTR afaik.
>>>>
>>>> The proposed fix uses base instead of heap_base_oop for that check;
>>>> this should be updated correctly through make_unsafe_addr() and
>>>> classify_unsafe_addr().
>>>>
>>>> For some reason, this bug only seems to be exposed when running
>>>> Shenandoah, and then only when actually compiling with barriers. We
>>>> could not reproduce this with any other GC. It totally eludes me why
>>>> that is so. For this reason, the testcase goes under the
>>>> gc/shenandoah/compiler directory.
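
A toy Java model of the check quoted above may make the effect of the one-word fix easier to see. It is only a sketch under stated assumptions: BaseKind, inHeapBeforeFix and inHeapAfterFix are hypothetical names standing in for what C2 knows about `base` and for whether IN_HEAP ends up in `decorators`. None of it is HotSpot code.

```java
public class InHeapDecoratorModel {
    // Hypothetical stand-in for what C2 knows about the Unsafe base argument.
    enum BaseKind { NON_NULL, NULL, MAYBE_NULL }

    // Before the fix the check looked at heap_base_oop, which (per the
    // description above) is still top() at that point and does not match
    // NULL_PTR, so can_access_non_heap is false for every kind of base.
    static boolean inHeapBeforeFix(BaseKind base) {
        boolean canAccessNonHeap = false;
        return !canAccessNonHeap;
    }

    // After the fix the check looks at base itself: only a base that is
    // known to be non-NULL is treated as a pure on-heap access.
    static boolean inHeapAfterFix(BaseKind base) {
        boolean canAccessNonHeap = (base != BaseKind.NON_NULL);
        return !canAccessNonHeap;
    }

    public static void main(String[] args) {
        for (BaseKind kind : BaseKind.values()) {
            System.out.println(kind + ": before=" + inHeapBeforeFix(kind)
                    + " after=" + inHeapAfterFix(kind));
        }
    }
}
```

In this model the on-heap case is tagged IN_HEAP both before and after the change, while the off-heap and mixed cases are tagged only before it, which is the mislabelling the thread describes.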
>>>>
>>>> Roman
>>>>

From rkennke at redhat.com Thu Mar 21 21:35:21 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 21 Mar 2019 22:35:21 +0100
Subject: RFR: JDK-8220714: C2 Compilation failure when accessing off-heap memory using Unsafe
In-Reply-To:
References: <00c27700-5c6f-f7dd-603e-c69a324dd8ba@redhat.com> <78d50a10-b2c8-8fc1-58a1-83d118a99965@oracle.com> <5a87b5c6-f403-8bb7-f803-e041dafaf160@oracle.com>
Message-ID: <974140f1-2f86-6fd2-7bab-8626d28fc69a@redhat.com>

> test/hotspot/jtreg/gc/shenandoah/compiler/TestUnsafeOffheapSwap.java
>
> Since it's not a GC bug, I'd expect to see the regression test among
> compiler tests (compiler/unsafe/ or compiler/c2/).

We specifically have gc/shenandoah/compiler to host compiler tests that
are known to fail only with Shenandoah. I'd rather keep it there as it
is. Maybe we shall try to come up with a more generic test that also
fails without Shenandoah? (Note: I tried for a little bit - not very
much -, but failed... ).

>   28  * @requires vm.gc.Shenandoah
>
> I prefer "-XX:+IgnoreUnrecognizedVMOptions". As you noted, the bug is
> generic, but is triggered only w/ Shenandoah enabled in this particular
> case.

The @requires vm.gc.Shenandoah seems to be the correct tag to filter this.

Can I push it as it is?

Roman

> Best regards,
> Vladimir Ivanov
>
> On 21/03/2019 05:00, Roman Kennke wrote:
>> Hi Vladimir,
>>
>> Thanks for reviewing!
>>
>> Not sure what happened to the webrev. Parts of it seem to have snuck
>> in from a previous working version. Here's a good one:
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.01/
>>
>> Thanks,
>> Roman
>>
>>
>>> Proposed fix looks good [1].
>>>
>>> (FYI the patch [2] differs significantly from the webrev.)
>>>
>>>> Looking on changes and it seems you can modify it a little:
>>>>
>>>>   // Can base be NULL? Otherwise, always on-heap access.
>>>>   bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(base));
>>>>
if (!can_access_non_heap) {
>>>>     heap_base_oop = base;
>>>>     decorators |= IN_HEAP;
>>>>   } else if (type == T_OBJECT) {
>>>>     return false; // off-heap oop accesses are not supported
>>>>   }
>>>
>>> It doesn't look right.
>>>
>>> There're 3 cases we care about here:
>>>    * on-heap accesses (non-NULL base)
>>>    * off-heap accesses (NULL base)
>>>    * mixed accesses (base may be NULL)
>>>
>>> 'can_access_non_heap' distinguishes on-heap accesses from
>>> off-heap/mixed ones.
>>>
>>> 'heap_base_oop != top()' distinguishes off-heap accesses from
>>> on-heap/mixed ones.
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> [1]
>>>
>>> -  bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
>>> +  bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(base));
>>>
>>>
>>> [2]
>>> http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.00/jdk-jdk.changeset
>>>
>>>
>>> -  bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
>>> +  bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(UseNewCode ? base : heap_base_oop));
>>>
>>>
>>>> On 3/15/19 4:39 AM, Roman Kennke wrote:
>>>>> A user reported misbehaving off-heap access. It looks like a C2
>>>>> compilation failure that seems to only trigger with Shenandoah:
>>>>>
>>>>> https://mail.openjdk.java.net/pipermail/shenandoah-dev/2019-March/009060.html
>>>>>
>>>>>
>>>>> Eg with G1 generates this assembly for swapping two array elements:
>>>>> mov (%r8),%r9d
>>>>> mov (%r10),%r11d
>>>>> mov %r9d,(%r10)
>>>>> mov %r11d,(%r8)
>>>>>
>>>>> While with Shenandoah we get this:
>>>>> mov (%r9),%ecx
>>>>> mov %ecx,(%r10,%r11,1)
>>>>> mov %ecx,(%r9)
>>>>>
>>>>> I.e. the two loads seem to have been wrongly coalesced into one.
>>>>>
>>>>> Even though that is only triggered by Shenandoah, it seems to be a legit
>>>>> and generic C2 problem.
>>>>>
>>>>> Bug:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8220714
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.00/
>>>>>
>>>>> The issue seems to be that off-heap accesses are supposed to use
>>>>> MO_RELAXED mem-ordering instead of MO_UNORDERED, as implemented in
>>>>> C2Access::needs_cpu_membar(). However, it seems we wrongly set this
>>>>> here (library_call.cpp, around line 2410):
>>>>>
>>>>>   // Can base be NULL? Otherwise, always on-heap access.
>>>>>   bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
>>>>>   if (!can_access_non_heap) {
>>>>>     decorators |= IN_HEAP;
>>>>>   }
>>>>>
>>>>> However, heap_base_oop is initialized to top() a few lines up, and
>>>>> then never updated, at least not for the off-heap-access case. And top
>>>>> doesn't match NULL_PTR afaik.
>>>>>
>>>>> The proposed fix uses base instead of heap_base_oop for that check;
>>>>> this should be updated correctly through make_unsafe_addr() and
>>>>> classify_unsafe_addr().
>>>>>
>>>>> For some reason, this bug only seems to be exposed when running
>>>>> Shenandoah, and then only when actually compiling with barriers. We
>>>>> could not reproduce this with any other GC. It totally eludes me why
>>>>> that is so. For this reason, the testcase goes under the
>>>>> gc/shenandoah/compiler directory.
>>>>>
>>>>> Roman
>>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: OpenPGP digital signature
URL:

From vladimir.x.ivanov at oracle.com Thu Mar 21 21:39:00 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 21 Mar 2019 14:39:00 -0700
Subject: RFR: JDK-8220714: C2 Compilation failure when accessing off-heap memory using Unsafe
In-Reply-To: <974140f1-2f86-6fd2-7bab-8626d28fc69a@redhat.com>
References: <00c27700-5c6f-f7dd-603e-c69a324dd8ba@redhat.com> <78d50a10-b2c8-8fc1-58a1-83d118a99965@oracle.com> <5a87b5c6-f403-8bb7-f803-e041dafaf160@oracle.com> <974140f1-2f86-6fd2-7bab-8626d28fc69a@redhat.com>
Message-ID: <549dea5b-1c56-efbf-614f-dfa3024525a8@oracle.com>

On 21/03/2019 14:35, Roman Kennke wrote:
>> test/hotspot/jtreg/gc/shenandoah/compiler/TestUnsafeOffheapSwap.java
>>
>> Since it's not a GC bug, I'd expect to see the regression test among
>> compiler tests (compiler/unsafe/ or compiler/c2/).
>
> We specifically have gc/shenandoah/compiler to host compiler tests that
> are known to fail only with Shenandoah. I'd rather keep it there as it
> is. Maybe we shall try to come up with a more generic test that also
> fails without Shenandoah? (Note: I tried for a little bit - not very
> much -, but failed... ).
>
>>   28  * @requires vm.gc.Shenandoah
>>
>> I prefer "-XX:+IgnoreUnrecognizedVMOptions". As you noted, the bug is
>> generic, but is triggered only w/ Shenandoah enabled in this particular
>> case.
>
> The @requires vm.gc.Shenandoah seems to be the correct tag to filter this.
>
> Can I push it as it is?

Ok.

Best regards,
Vladimir Ivanov

>> On 21/03/2019 05:00, Roman Kennke wrote:
>>> Hi Vladimir,
>>>
>>> Thanks for reviewing!
>>>
>>> Not sure what happened to the webrev. Parts of it seem to have snuck
>>> in from a previous working version. Here's a good one:
>>>
>>> http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.01/
>>>
>>> Thanks,
>>> Roman
>>>
>>>
>>>> Proposed fix looks good [1].
>>>>
>>>> (FYI the patch [2] differs significantly from the webrev.)
>>>>
>>>>> Looking on changes and it seems you can modify it a little:
>>>>>
>>>>>   // Can base be NULL? Otherwise, always on-heap access.
>>>>>   bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(base));
>>>>>   if (!can_access_non_heap) {
>>>>>     heap_base_oop = base;
>>>>>     decorators |= IN_HEAP;
>>>>>   } else if (type == T_OBJECT) {
>>>>>     return false; // off-heap oop accesses are not supported
>>>>>   }
>>>>
>>>> It doesn't look right.
>>>>
>>>> There're 3 cases we care about here:
>>>>    * on-heap accesses (non-NULL base)
>>>>    * off-heap accesses (NULL base)
>>>>    * mixed accesses (base may be NULL)
>>>>
>>>> 'can_access_non_heap' distinguishes on-heap accesses from
>>>> off-heap/mixed ones.
>>>>
>>>> 'heap_base_oop != top()' distinguishes off-heap accesses from
>>>> on-heap/mixed ones.
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>> [1]
>>>>
>>>> -  bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
>>>> +  bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(base));
>>>>
>>>>
>>>> [2]
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.00/jdk-jdk.changeset
>>>>
>>>>
>>>> -  bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
>>>> +  bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(UseNewCode ? base : heap_base_oop));
>>>>
>>>>
>>>>> On 3/15/19 4:39 AM, Roman Kennke wrote:
>>>>>> A user reported misbehaving off-heap access.
It looks like a C2
>>>>>> compilation failure that seems to only trigger with Shenandoah:
>>>>>>
>>>>>> https://mail.openjdk.java.net/pipermail/shenandoah-dev/2019-March/009060.html
>>>>>>
>>>>>>
>>>>>> Eg with G1 generates this assembly for swapping two array elements:
>>>>>> mov (%r8),%r9d
>>>>>> mov (%r10),%r11d
>>>>>> mov %r9d,(%r10)
>>>>>> mov %r11d,(%r8)
>>>>>>
>>>>>> While with Shenandoah we get this:
>>>>>> mov (%r9),%ecx
>>>>>> mov %ecx,(%r10,%r11,1)
>>>>>> mov %ecx,(%r9)
>>>>>>
>>>>>> I.e. the two loads seem to have been wrongly coalesced into one.
>>>>>>
>>>>>> Even though that is only triggered by Shenandoah, it seems to be a legit
>>>>>> and generic C2 problem.
>>>>>>
>>>>>> Bug:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8220714
>>>>>> Webrev:
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8220714/webrev.00/
>>>>>>
>>>>>> The issue seems to be that off-heap accesses are supposed to use
>>>>>> MO_RELAXED mem-ordering instead of MO_UNORDERED, as implemented in
>>>>>> C2Access::needs_cpu_membar(). However, it seems we wrongly set this
>>>>>> here (library_call.cpp, around line 2410):
>>>>>>
>>>>>>   // Can base be NULL? Otherwise, always on-heap access.
>>>>>>   bool can_access_non_heap = TypePtr::NULL_PTR->higher_equal(_gvn.type(heap_base_oop));
>>>>>>   if (!can_access_non_heap) {
>>>>>>     decorators |= IN_HEAP;
>>>>>>   }
>>>>>>
>>>>>> However, heap_base_oop is initialized to top() a few lines up, and
>>>>>> then never updated, at least not for the off-heap-access case. And top
>>>>>> doesn't match NULL_PTR afaik.
>>>>>>
>>>>>> The proposed fix uses base instead of heap_base_oop for that check;
>>>>>> this should be updated correctly through make_unsafe_addr() and
>>>>>> classify_unsafe_addr().
>>>>>>
>>>>>> For some reason, this bug only seems to be exposed when running
>>>>>> Shenandoah, and then only when actually compiling with barriers. We
>>>>>> could not reproduce this with any other GC.
It totally eludes me why
>>>>>> that is so. For this reason, the testcase goes under the
>>>>>> gc/shenandoah/compiler directory.
>>>>>>
>>>>>> Roman
>>>>>>
>

From volker.simonis at gmail.com Fri Mar 22 13:48:25 2019
From: volker.simonis at gmail.com (Volker Simonis)
Date: Fri, 22 Mar 2019 14:48:25 +0100
Subject: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code
In-Reply-To:
References: <87k1guiljd.fsf@mid.deneb.enyo.de>
Message-ID:

Hi Florian, Martin,

thanks for looking at the change.

I think Martin is right: we either get an object allocated at
0x700000000 right with the first allocation after System.gc(), or it is
unlikely that we'll get it at all. The loops are actually a leftover
from my first experiments with other GCs like G1, but in the end I
decided to explicitly use SerialGC because it was the simplest and
most reliable way of allocating an object at the desired address.

Please find the new webrev here:

http://cr.openjdk.java.net/~simonis/webrevs/2019/8221083.v1/

Thank you and best regards,
Volker

On Wed, Mar 20, 2019 at 12:24 PM Doerr, Martin wrote:
>
> Hi Volker,
>
> I think the 2 nested loops are not really needed.
> If you use System.gc() + new String you should get it allocated at the desired location.
> I believe this is sufficient with the flags you are using.
>
> The test seems to test interpreter, C1, and C2 with the loop limit of 30_000. I like that.
>
> The C1 fix looks good.
>
> Thanks for finding and fixing this bug.
>
> Best regards,
> Martin
>
>
> -----Original Message-----
> From: hotspot-compiler-dev On Behalf Of Florian Weimer
> Sent: Dienstag, 19. März 2019 22:48
> To: Volker Simonis
> Cc: hotspot compiler
> Subject: Re: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code
>
> * Volker Simonis:
>
> > The regression test reproduces the issue by allocating an object at an
> > address with the 32-bit least significant bits being zero and compares
> > it with another null object.
>
>     for (int i = 0; i < 10; i++) {
>         System.gc();
>         for (int j = 0; j < 1024; j++) {
>             s = new String("I'm not null!!!");
>             if (WB.getObjectAddress(s) == 0x700000000L) break;
>         }
>         if (WB.getObjectAddress(s) == 0x700000000L) {
>             System.out.println("Got object at address 0x700000000");
>             break;
>         }
>     }
>
> I think this could use a labeled loop, like this:
>
>     GC_TESTS: for (int i = 0; i < 10; i++) {
>         System.gc();
>         for (int j = 0; j < 1024; j++) {
>             s = new String("I'm not null!!!");
>             if (WB.getObjectAddress(s) == 0x700000000L) {
>                 System.out.println("Got object at address 0x700000000");
>                 break GC_TESTS;
>             }
>         }
>     }
>
> (Untested.)
>
> On the other hand, neither this version nor yours properly detects when
> s is *not* actually the object with the desired address, in which case
> the test objective fails to materialize, I think.

From martin.doerr at sap.com Fri Mar 22 15:47:14 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Fri, 22 Mar 2019 15:47:14 +0000
Subject: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code
In-Reply-To:
References: <87k1guiljd.fsf@mid.deneb.enyo.de>
Message-ID:

Hi Volker,

thanks for the updated test.

Looks like there are still a few leftovers from earlier experiments. I don't see any usage of Unsafe.
Not sure if all options and @modules etc. are still required.

If you have time, it would be nice to clean up a little bit.

Best regards,
Martin


-----Original Message-----
From: Volker Simonis
Sent: Freitag, 22. März 2019 14:48
To: Doerr, Martin
Cc: Florian Weimer; hotspot compiler
Subject: Re: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code

Hi Florian, Martin,

thanks for looking at the change.

I think Martin is right: we either get an object allocated at
0x700000000 right with the first allocation after System.gc(), or it is
unlikely that we'll get it at all.
The loops are actually a leftover
from my first experiments with other GCs like G1, but in the end I
decided to explicitly use SerialGC because it was the simplest and
most reliable way of allocating an object at the desired address.

Please find the new webrev here:

http://cr.openjdk.java.net/~simonis/webrevs/2019/8221083.v1/

Thank you and best regards,
Volker

On Wed, Mar 20, 2019 at 12:24 PM Doerr, Martin wrote:
>
> Hi Volker,
>
> I think the 2 nested loops are not really needed.
> If you use System.gc() + new String you should get it allocated at the desired location.
> I believe this is sufficient with the flags you are using.
>
> The test seems to test interpreter, C1, and C2 with the loop limit of 30_000. I like that.
>
> The C1 fix looks good.
>
> Thanks for finding and fixing this bug.
>
> Best regards,
> Martin
>
>
> -----Original Message-----
> From: hotspot-compiler-dev On Behalf Of Florian Weimer
> Sent: Dienstag, 19. März 2019 22:48
> To: Volker Simonis
> Cc: hotspot compiler
> Subject: Re: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code
>
> * Volker Simonis:
>
> > The regression test reproduces the issue by allocating an object at an
> > address with the 32-bit least significant bits being zero and compares
> > it with another null object.
>
>     for (int i = 0; i < 10; i++) {
>         System.gc();
>         for (int j = 0; j < 1024; j++) {
>             s = new String("I'm not null!!!");
>             if (WB.getObjectAddress(s) == 0x700000000L) break;
>         }
>         if (WB.getObjectAddress(s) == 0x700000000L) {
>             System.out.println("Got object at address 0x700000000");
>             break;
>         }
>     }
>
> I think this could use a labeled loop, like this:
>
>     GC_TESTS: for (int i = 0; i < 10; i++) {
>         System.gc();
>         for (int j = 0; j < 1024; j++) {
>             s = new String("I'm not null!!!");
>             if (WB.getObjectAddress(s) == 0x700000000L) {
>                 System.out.println("Got object at address 0x700000000");
>                 break GC_TESTS;
>             }
>         }
>     }
>
> (Untested.)
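
The labeled-break shape quoted above can be tried without the WhiteBox API. The following is a self-contained sketch, not the test from the webrev: addressOf is a hypothetical stand-in for WB.getObjectAddress, fed from a fixed table of made-up addresses, and the found flag is an addition that records whether the target address was ever hit.

```java
public class LabeledBreakSketch {
    static final long TARGET = 0x700000000L;

    // Hypothetical replacement for WB.getObjectAddress(): returns made-up
    // addresses so the control flow can be exercised on any JVM.
    static long addressOf(long[][] fakeAddresses, int round, int attempt) {
        return fakeAddresses[round][attempt];
    }

    // Returns true if some attempt produced TARGET, false otherwise.
    static boolean search(long[][] fakeAddresses) {
        boolean found = false;
        GC_TESTS:
        for (int i = 0; i < fakeAddresses.length; i++) {
            for (int j = 0; j < fakeAddresses[i].length; j++) {
                if (addressOf(fakeAddresses, i, j) == TARGET) {
                    found = true;
                    break GC_TESTS; // leaves both loops at once
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        long[][] hit  = { { 0x6ffff0000L, 0x6ffff0010L }, { TARGET, 0x700000010L } };
        long[][] miss = { { 0x6ffff0000L }, { 0x700000010L } };
        System.out.println(search(hit));
        System.out.println(search(miss));
    }
}
```

Returning the flag lets a caller tell a successful search apart from one that simply ran out of attempts.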
>
> On the other hand, neither this version nor yours properly detects when
> s is *not* actually the object with the desired address, in which case
> the test objective fails to materialize, I think.

From rahul.v.raghavan at oracle.com Mon Mar 25 07:51:37 2019
From: rahul.v.raghavan at oracle.com (Rahul Raghavan)
Date: Mon, 25 Mar 2019 13:21:37 +0530
Subject: [13] RFR: 8219612: [TESTBUG] compiler.codecache.stress.Helper.TestCaseImpl can't be defined in different runtime package as its nest host
Message-ID:

Hi,

Please review the following fix proposal for JDK-8219612.
http://cr.openjdk.java.net/~rraghavan/8219612/webrev.00/

- https://bugs.openjdk.java.net/browse/JDK-8219612
- http://hg.openjdk.java.net/valhalla/valhalla/rev/ab7ea72963c9

Thanks to Mandy Chung.
As mentioned in JBS, the above fix changeset is the same as the one contributed by Mandy Chung in the valhalla
repo nestmates branch.

Testbug issue - compiler.codecache.stress.Helper.TestCaseImpl can't be defined in different runtime
package as its nest host.

Proposed fix - Test rewritten to use top-level classes rather than nested ones.

Thanks,
Rahul

From tobias.hartmann at oracle.com Mon Mar 25 08:57:16 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 25 Mar 2019 09:57:16 +0100
Subject: [13] RFR: 8219612: [TESTBUG] compiler.codecache.stress.Helper.TestCaseImpl can't be defined in different runtime package as its nest host
In-Reply-To:
References:
Message-ID: <65bc319b-edd7-614d-f081-c45c5daa862a@oracle.com>

Hi Rahul,

this looks good to me.

Best regards,
Tobias

On 25.03.19 08:51, Rahul Raghavan wrote:
> Hi,
>
> Please review the following fix proposal for JDK-8219612.
> http://cr.openjdk.java.net/~rraghavan/8219612/webrev.00/
>
>
> - https://bugs.openjdk.java.net/browse/JDK-8219612
> - http://hg.openjdk.java.net/valhalla/valhalla/rev/ab7ea72963c9
>
> Thanks to Mandy Chung.
> As mentioned in JBS, the above fix changeset is the same as the one contributed by Mandy Chung in the valhalla
> repo nestmates branch.
>
> Testbug issue - compiler.codecache.stress.Helper.TestCaseImpl can't be defined in different runtime
> package as its nest host.
>
> Proposed fix - Test rewritten to use top-level classes rather than nested ones.
>
>
> Thanks,
> Rahul

From rahul.v.raghavan at oracle.com Mon Mar 25 09:30:49 2019
From: rahul.v.raghavan at oracle.com (Rahul Raghavan)
Date: Mon, 25 Mar 2019 15:00:49 +0530
Subject: [13] RFR: 8202414: Unsafe write after primitive array creation may result in array length change
In-Reply-To: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com>
References: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com>
Message-ID: <392c665f-869c-29af-4fc5-e6f844820846@oracle.com>

Hi,

Please help review the following revised fix proposal for JDK-8202414.
- http://cr.openjdk.java.net/~rraghavan/8202414/webrev.01/

Although I did not receive comments on the earlier '8202414/webrev.00',
on checking again it seems to be wrong or too restrictive.
So I tried the following revised change:

 intptr_t InitializeNode::can_capture_store(StoreNode* st, PhaseTransform* phase, bool can_reshape) {
   const int FAIL = 0;
   if (st->is_unaligned_access()) {
     return FAIL;
   }
+  if ((st->memory_size() >= BytesPerInt) && ((get_store_offset(st, phase) % BytesPerInt) != 0)) {
+    return FAIL;
+  }
   if (st->req() != MemNode::ValueIn + 1)
     return FAIL;                // an inscrutable StoreNode (card mark?)

Confirmed no issues with the reported 8202414 test case.
Also no issues in hs-tier1 to tier4 and hs-precheckin-comp testing.

Please let me know if I missed something here.

Thanks,
Rahul

On 14/03/19 1:54 PM, Rahul Raghavan wrote:
> Hi,
>
> Please review the following fix proposal for JDK-8202414.
>
> Webrev - http://cr.openjdk.java.net/~rraghavan/8202414/webrev.00/
>
>
> -- Related links
>
> https://bugs.openjdk.java.net/browse/JDK-8202414
>
>
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-September/030536.html
>
>
>
> -- As per suggestions in JBS added following change in
> InitializeNode::can_capture_store() to return false for unaligned stores.
> =============
> diff -r 3086f9259e97 src/hotspot/share/opto/memnode.cpp
> --- a/src/hotspot/share/opto/memnode.cpp Wed Mar 13 00:48:52 2019 -0400
> +++ b/src/hotspot/share/opto/memnode.cpp Wed Mar 13 19:50:07 2019 +0530
> @@ -3541,7 +3541,7 @@
>  // within the initialized memory.
>  intptr_t InitializeNode::can_capture_store(StoreNode* st, PhaseTransform* phase, bool can_reshape) {
>    const int FAIL = 0;
> -  if (st->is_unaligned_access()) {
> +  if (st->is_unaligned_access() || ((get_store_offset(st, phase) % BytesPerInt) != 0)) {
>      return FAIL;
>    }
>    if (st->req() != MemNode::ValueIn + 1)
> ==============
>
>
> -- Added the new jtreg test from the JBS unit test.
> (test/hotspot/jtreg/compiler/c2/Test8202414.java)
> Understood the test with unaligned access will not work for Sparc due to
> hardware restrictions. The test always fails with SIGBUS crash, with or
> without above fix. So added
>    @requires (os.arch != "sparc") & (os.arch != "sparcv9")
>
>
> -- Confirmed the above change solved the original reported 8202414 test
> case failure. Also no issues so far for hs-tier1 to tier4,
> hs-precheckin-comp testing.
>
> -- Could not work out any related additions in
> LibraryCallKit::inline_unsafe_access().
> Hope above fix proposal is correct, complete solution for the issue.
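
The difference between the two conditions in this thread (the webrev.00 diff quoted above and the revised webrev.01 condition) can be checked with a small Java model. This is a sketch under stated assumptions: BYTES_PER_INT is taken to be 4, the method names are hypothetical, the is_unaligned_access() part of the check is left out, and offsets are plain non-negative numbers rather than C2 nodes.

```java
public class CaptureStoreCheck {
    // Assumed value of HotSpot's BytesPerInt for this illustration.
    static final int BYTES_PER_INT = 4;

    // Model of the webrev.00 condition: reject every store whose offset is
    // not a multiple of BytesPerInt, regardless of the store size.
    static boolean rejectedByWebrev00(int memorySize, long offset) {
        return offset % BYTES_PER_INT != 0;
    }

    // Model of the webrev.01 condition: reject only stores that are at least
    // int-sized and whose offset is not a multiple of BytesPerInt.
    static boolean rejectedByWebrev01(int memorySize, long offset) {
        return memorySize >= BYTES_PER_INT && offset % BYTES_PER_INT != 0;
    }

    public static void main(String[] args) {
        int[][] cases = { {1, 17}, {2, 18}, {4, 17}, {4, 16}, {8, 20} };
        for (int[] c : cases) {
            System.out.println("size=" + c[0] + " offset=" + c[1]
                    + " webrev.00 rejects=" + rejectedByWebrev00(c[0], c[1])
                    + " webrev.01 rejects=" + rejectedByWebrev01(c[0], c[1]));
        }
    }
}
```

In this model a byte store at offset 17 is rejected by the first condition but not by the second, which is one way to read the remark that the earlier version was too restrictive.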
>
>
> Thanks,
> Rahul

From adinn at redhat.com Mon Mar 25 10:01:08 2019
From: adinn at redhat.com (Andrew Dinn)
Date: Mon, 25 Mar 2019 10:01:08 +0000
Subject: [aarch64-port-dev ] RFR(S): 8216989 - CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier() does not check for zero length on AARCH64
In-Reply-To:
References: <2f857512-c206-a977-1c48-118f9c9d9a63@bell-sw.com> <288f6d65-5aff-0ba1-1a1c-890700b985db@redhat.com>
Message-ID: <077381e1-34e4-8857-32e8-740bbd4df882@redhat.com>

Hi Dmitrij,

On 20/03/2019 17:29, Dmitrij Pochepko wrote:
> Please take a look at
> http://cr.openjdk.java.net/~dpochepk/8216989/webrev.02/
> I changed patch according to x86 fix and current code layout:
>
> - renamed methods parameters from "end" to "count" to avoid confusing
> names and match x86
> - updated G1BarrierSetAssembler::gen_write_ref_array_post_barrier
> implementation: removed code with calculation of "count" value and using
> "count" directly instead
> - updated ShenandoahBarrierSetAssembler::arraycopy_epilogue in same way
> as G1 code
> - updated CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier
> with zero length branch and calculation of inclusive end pointer to
> match original logic
> - updated arraycopy_epilogue usage by removing unnecessary end pointer
> calculation and providing array length (count) instead
>
> I also run jtreg hotspot/compiler, hotspot/gc, hotspot/runtime and jck
> with G1GC, ParallelGC and ShenandoahGC
> No regressions found.

Yes, with that change AArch64 matches the x86 code and the changes are
consistent with the JDK8 change set that Andrew identified as the reason
why AArch64 and x86 diverged.

So, it's ok to push.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No.
03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From claes.redestad at oracle.com Mon Mar 25 13:30:24 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Mon, 25 Mar 2019 14:30:24 +0100
Subject: RFR: 8221343: x86_32 crashes on startup with "_hwm out of range"
Message-ID: <1d7b5150-93c2-f5ff-7cf6-45e7f00cdf17@oracle.com>

Hi,

some of the RegMasks allocated in Matcher::init_first_stack_mask aren't
properly initialized on 32-bit builds, which hits an assert after
JDK-8220159. One simple fix is to explicitly construct empty RegMasks
into the allocated memory.

Bug:    https://bugs.openjdk.java.net/browse/JDK-8221343
Webrev: http://cr.openjdk.java.net/~redestad/8221343/open.00/

Testing: verified reproducer in bug, pass tier1-3

Thanks!

/Claes

From tobias.hartmann at oracle.com Mon Mar 25 13:41:50 2019
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 25 Mar 2019 14:41:50 +0100
Subject: RFR: 8221343: x86_32 crashes on startup with "_hwm out of range"
In-Reply-To: <1d7b5150-93c2-f5ff-7cf6-45e7f00cdf17@oracle.com>
References: <1d7b5150-93c2-f5ff-7cf6-45e7f00cdf17@oracle.com>
Message-ID:

Hi Claes,

this looks good to me.

Best regards,
Tobias

On 25.03.19 14:30, Claes Redestad wrote:
> Hi,
>
> some of the RegMasks allocated in Matcher::init_first_stack_mask aren't
> properly initialized on 32-bit builds, which hits an assert after
> JDK-8220159.  One simple fix is to explicitly construct empty RegMasks
> into the allocated memory.
>
> Bug:    https://bugs.openjdk.java.net/browse/JDK-8221343
> Webrev: http://cr.openjdk.java.net/~redestad/8221343/open.00/
>
> Testing: verified reproducer in bug, pass tier1-3
>
> Thanks!
>
> /Claes

From claes.redestad at oracle.com Mon Mar 25 13:45:48 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Mon, 25 Mar 2019 14:45:48 +0100
Subject: RFR: 8221343: x86_32 crashes on startup with "_hwm out of range"
In-Reply-To:
References: <1d7b5150-93c2-f5ff-7cf6-45e7f00cdf17@oracle.com>
Message-ID: <9ff04900-f0f4-d43d-f3e0-1ee45bd56401@oracle.com>

On 2019-03-25 14:41, Tobias Hartmann wrote:
> Hi Claes,
>
> this looks good to me.

Thanks, Tobias!

/Claes

From claes.redestad at oracle.com Mon Mar 25 13:44:20 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Mon, 25 Mar 2019 14:44:20 +0100
Subject: RFR: 8221404: Remove double alignment of RegMasks in Matcher
Message-ID:

Hi,

RegMasks are allocated double-aligned, which doesn't seem to have any
real effect on any of our supported platforms. Simplify.

Bug:    https://bugs.openjdk.java.net/browse/JDK-8221404
Webrev: http://cr.openjdk.java.net/~redestad/8221404/open.00

Testing: tier1-3 (together with JDK-8221343)

Thanks!

/Claes

From volker.simonis at gmail.com Mon Mar 25 15:28:19 2019
From: volker.simonis at gmail.com (Volker Simonis)
Date: Mon, 25 Mar 2019 16:28:19 +0100
Subject: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code
In-Reply-To:
References: <87k1guiljd.fsf@mid.deneb.enyo.de>
Message-ID:

On Fri, Mar 22, 2019 at 4:47 PM Doerr, Martin wrote:
>
> Hi Volker,
>
> thanks for the updated test.
>
> Looks like there are still a few leftovers from earlier experiments. I don't see any usage of Unsafe.
> Not sure if all options and @modules etc. are still required.
>

You're right :) Unsafe isn't required any more:

http://cr.openjdk.java.net/~simonis/webrevs/2019/8221083.v2

> If you have time, it would be nice to clean up a little bit.
>
> Best regards,
> Martin
>
>
> -----Original Message-----
> From: Volker Simonis
> Sent: Freitag, 22.
März 2019 14:48
> To: Doerr, Martin
> Cc: Florian Weimer; hotspot compiler
> Subject: Re: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code
>
> Hi Florian, Martin,
>
> thanks for looking at the change.
>
> I think Martin is right: we either get an object allocated at
> 0x700000000 right with the first allocation after System.gc(), or it is
> unlikely that we'll get it at all. The loops are actually a leftover
> from my first experiments with other GCs like G1, but in the end I
> decided to explicitly use SerialGC because it was the simplest and
> most reliable way of allocating an object at the desired address.
>
> Please find the new webrev here:
>
> http://cr.openjdk.java.net/~simonis/webrevs/2019/8221083.v1/
>
> Thank you and best regards,
> Volker
>
> On Wed, Mar 20, 2019 at 12:24 PM Doerr, Martin wrote:
> >
> > Hi Volker,
> >
> > I think the 2 nested loops are not really needed.
> > If you use System.gc() + new String you should get it allocated at the desired location.
> > I believe this is sufficient with the flags you are using.
> >
> > The test seems to test interpreter, C1, and C2 with the loop limit of 30_000. I like that.
> >
> > The C1 fix looks good.
> >
> > Thanks for finding and fixing this bug.
> >
> > Best regards,
> > Martin
> >
> >
> > -----Original Message-----
> > From: hotspot-compiler-dev On Behalf Of Florian Weimer
> > Sent: Dienstag, 19. März 2019 22:48
> > To: Volker Simonis
> > Cc: hotspot compiler
> > Subject: Re: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code
> >
> > * Volker Simonis:
> >
> > > The regression test reproduces the issue by allocating an object at an
> > > address with the 32-bit least significant bits being zero and compares
> > > it with another null object.
> >
> >     for (int i = 0; i < 10; i++) {
> >         System.gc();
> >         for (int j = 0; j < 1024; j++) {
> >             s = new String("I'm not null!!!");
> >             if (WB.getObjectAddress(s) == 0x700000000L) break;
> >         }
> >         if (WB.getObjectAddress(s) == 0x700000000L) {
> >             System.out.println("Got object at address 0x700000000");
> >             break;
> >         }
> >     }
> >
> > I think this could use a labeled loop, like this:
> >
> >     GC_TESTS: for (int i = 0; i < 10; i++) {
> >         System.gc();
> >         for (int j = 0; j < 1024; j++) {
> >             s = new String("I'm not null!!!");
> >             if (WB.getObjectAddress(s) == 0x700000000L) {
> >                 System.out.println("Got object at address 0x700000000");
> >                 break GC_TESTS;
> >             }
> >         }
> >     }
> >
> > (Untested.)
> >
> > On the other hand, neither this version nor yours properly detects when
> > s is *not* actually the object with the desired address, in which case
> > the test objective fails to materialize, I think.

From martin.doerr at sap.com Mon Mar 25 15:34:01 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 25 Mar 2019 15:34:01 +0000
Subject: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code
In-Reply-To:
References: <87k1guiljd.fsf@mid.deneb.enyo.de>
Message-ID:

Hi Volker,

looks good.

Thanks,
Martin


-----Original Message-----
From: Volker Simonis
Sent: Montag, 25. März 2019 16:28
To: Doerr, Martin
Cc: hotspot compiler
Subject: Re: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code

On Fri, Mar 22, 2019 at 4:47 PM Doerr, Martin wrote:
>
> Hi Volker,
>
> thanks for the updated test.
>
> Looks like there are still a few leftovers from earlier experiments. I don't see any usage of Unsafe.
> Not sure if all options and @modules etc. are still required.
>

You're right :) Unsafe isn't required any more:

http://cr.openjdk.java.net/~simonis/webrevs/2019/8221083.v2

> If you have time, it would be nice to clean up a little bit.
> > Best regards, > Martin > > > -----Original Message----- > From: Volker Simonis > Sent: Freitag, 22. M?rz 2019 14:48 > To: Doerr, Martin > Cc: Florian Weimer ; hotspot compiler > Subject: Re: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code > > Hi Florian, Martin, > > thanks for looking at the change. > > I think Martin is right, we either get an object allocated at > 0x700000000 right with the first allocation after System.gc() or it is > unlikely that we'll get it at all. The loops are actually a leftover > from my first experiments with other GC's like G1, but in the end I > decided to explicitly use SerialGC because it was the simplest and > most reliably way of allocation an object at the desired address. > > Please find the new webrev here: > > http://cr.openjdk.java.net/~simonis/webrevs/2019/8221083.v1/ > > Thank you and best regards, > Volker > > On Wed, Mar 20, 2019 at 12:24 PM Doerr, Martin wrote: > > > > Hi Volker, > > > > I think the 2 nested loops are not really needed. > > If you use System.gc() + new String you should get it allocated at the desired location. > > I believe this is sufficient with the flags you are using. > > > > The test seems to test interpreter, C1, and C2 with the loop limit of 30_000. I like that. > > > > The C1 fix looks good. > > > > Thanks for finding and fixing this bug. > > > > Best regards, > > Martin > > > > > > -----Original Message----- > > From: hotspot-compiler-dev On Behalf Of Florian Weimer > > Sent: Dienstag, 19. M?rz 2019 22:48 > > To: Volker Simonis > > Cc: hotspot compiler > > Subject: Re: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code > > > > * Volker Simonis: > > > > > The regression test reproduces the issue by allocation an object at an > > > address with the 32-bit least significant bits being zero and comperes > > > it with another null object. 
> > > > 79 for (int i = 0; i < 10; i++) { > > 80 System.gc(); > > 81 for (int j = 0; j < 1024; j++) { > > 82 s = new String("I'm not null!!!"); > > 83 if (WB.getObjectAddress(s) == 0x700000000L) break; > > 84 } > > 85 if (WB.getObjectAddress(s) == 0x700000000L) { > > 86 System.out.println("Got object at address 0x700000000"); > > 87 break; > > 88 } > > 89 } > > > > I think this could use a labeled loop, like this: > > > > GC_TESTS: for (int i = 0; i < 10; i++) { > > System.gc(); > > for (int j = 0; j < 1024; j++) { > > s = new String("I'm not null!!!"); > > if (WB.getObjectAddress(s) == 0x700000000L) { > > System.out.println("Got object at address 0x700000000"); > > break GC_TESTS; > > } > > } > > } > > > > (Untested.) > > > > On the other hand, neither this version or yours properly detects when > > s is *not* actually the object with the desired address, in which case > > the test objective fails to materialize, I think. From vladimir.kozlov at oracle.com Mon Mar 25 16:13:23 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 25 Mar 2019 09:13:23 -0700 Subject: [13] RFR: 8219612: [TESTBUG] compiler.codecache.stress.Helper.TestCaseImpl can't be defined in different runtime package as its nest host In-Reply-To: <65bc319b-edd7-614d-f081-c45c5daa862a@oracle.com> References: <65bc319b-edd7-614d-f081-c45c5daa862a@oracle.com> Message-ID: +1 Vladimir On 3/25/19 1:57 AM, Tobias Hartmann wrote: > Hi Rahul, > > this looks good to me. > > Best regards, > Tobias > > On 25.03.19 08:51, Rahul Raghavan wrote: >> Hi, >> >> Please review the following fix proposal for JDK-8219612. >> http://cr.openjdk.java.net/~rraghavan/8219612/webrev.00/ >> >> >> - https://bugs.openjdk.java.net/browse/JDK-8219612 >> - http://hg.openjdk.java.net/valhalla/valhalla/rev/ab7ea72963c9 >> >> Thanks to Mandy Chung. >> As mentioned in JBS above fix changeset is same as the one contributed by Mandy Chung, in valhalla >> repo nestmates branch. 
>>
>> Testbug issue - compiler.codecache.stress.Helper.TestCaseImpl can't be defined in different runtime
>> package as its nest host.
>>
>> Proposed fix - Test rewritten to use top-level classes rather than nested ones.
>>
>>
>> Thanks,
>> Rahul

From vladimir.kozlov at oracle.com  Mon Mar 25 16:30:23 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 25 Mar 2019 09:30:23 -0700
Subject: RFR: 8221343: x86_32 crashes on startup with "_hwm out of range"
In-Reply-To:
References: <1d7b5150-93c2-f5ff-7cf6-45e7f00cdf17@oracle.com>
Message-ID: <0ec63296-5170-6795-4f17-3c45c898db63@oracle.com>

+1

Vladimir

On 3/25/19 6:41 AM, Tobias Hartmann wrote:
> Hi Claes,
>
> this looks good to me.
>
> Best regards,
> Tobias
>
> On 25.03.19 14:30, Claes Redestad wrote:
>> Hi,
>>
>> some of the RegMasks allocated in Matcher::init_first_stack_mask aren't
>> properly initialized on 32-bit builds, which hits an assert after
>> JDK-8220159. One simple fix is to explicitly construct empty RegMasks
>> into the allocated memory.
>>
>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8221343
>> Webrev: http://cr.openjdk.java.net/~redestad/8221343/open.00/
>>
>> Testing: verified reproducer in bug, pass tier1-3
>>
>> Thanks!
>>
>> /Claes

From claes.redestad at oracle.com  Mon Mar 25 16:42:00 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Mon, 25 Mar 2019 17:42:00 +0100
Subject: RFR: 8221343: x86_32 crashes on startup with "_hwm out of range"
In-Reply-To: <0ec63296-5170-6795-4f17-3c45c898db63@oracle.com>
References: <1d7b5150-93c2-f5ff-7cf6-45e7f00cdf17@oracle.com> <0ec63296-5170-6795-4f17-3c45c898db63@oracle.com>
Message-ID:

On 2019-03-25 17:30, Vladimir Kozlov wrote:
> +1
>
> Vladimir

Thanks for reviewing!
/Claes From dmitrij.pochepko at bell-sw.com Mon Mar 25 16:53:33 2019 From: dmitrij.pochepko at bell-sw.com (Dmitrij Pochepko) Date: Mon, 25 Mar 2019 19:53:33 +0300 Subject: [aarch64-port-dev ] RFR(S): 8216989 - CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier() does not check for zero length on AARCH64 In-Reply-To: <077381e1-34e4-8857-32e8-740bbd4df882@redhat.com> References: <2f857512-c206-a977-1c48-118f9c9d9a63@bell-sw.com> <288f6d65-5aff-0ba1-1a1c-890700b985db@redhat.com> <077381e1-34e4-8857-32e8-740bbd4df882@redhat.com> Message-ID: Hi Andrew, Thank you for review. pushed to jdk/jdk (had to resolve 1 hunk, because of http://hg.openjdk.java.net/jdk/jdk/rev/f4f0dce5d0bb). And I'm going to propose this patch for backporting into 11u and 12u (Final jdk/jdk patch doesn't apply cleanly for both. Separate webrevs and testing will be provided later). Thanks, Dmitrij On 25/03/2019 1:01 PM, Andrew Dinn wrote: > Hi Dmitrij, > > On 20/03/2019 17:29, Dmitrij Pochepko wrote: >> Please take a look at >> http://cr.openjdk.java.net/~dpochepk/8216989/webrev.02/ >> I changed patch according to x86 fix and current code layout: >> >> - renamed methods parameters from "end" to "count" to avoid confusing >> names and match x86 >> - updated G1BarrierSetAssembler::gen_write_ref_array_post_barrier >> implementation: removed code with calculation of "count" value and using >> "count" directly instead >> - updated ShenandoahBarrierSetAssembler::arraycopy_epilogue in same way >> as G1 code >> - updated CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier >> with zero length branch and calculation of inclusive end pointer to >> match original logic >> - updated arraycopy_epilogue usage by removing unnecessary end pointer >> calculation and providing array length (count) instead >> >> I also run jtreg hotspot/compiler, hotspot/gc, hotspot/runtime and jck >> with G1GC, ParallelGC and ShenandoahGC >> No regressions found. 
> Yes, with that change AArch64 matches the x86 code and the changes are > consistent with the JDK8 change set that Andrew identified as the reason > why AArch64 and x86 diverged. So, it's ok to push. > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From adinn at redhat.com Mon Mar 25 17:16:49 2019 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 25 Mar 2019 17:16:49 +0000 Subject: [aarch64-port-dev ] RFR(S): 8216989 - CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier() does not check for zero length on AARCH64 In-Reply-To: References: <2f857512-c206-a977-1c48-118f9c9d9a63@bell-sw.com> <288f6d65-5aff-0ba1-1a1c-890700b985db@redhat.com> <077381e1-34e4-8857-32e8-740bbd4df882@redhat.com> Message-ID: On 25/03/2019 16:53, Dmitrij Pochepko wrote: > Thank you for review. > pushed to jdk/jdk (had to resolve 1 hunk, because of > http://hg.openjdk.java.net/jdk/jdk/rev/f4f0dce5d0bb). Thank you! > And I'm going to propose this patch for backporting into 11u and 12u > (Final jdk/jdk patch doesn't apply cleanly for both. Separate webrevs > and testing will be provided later). This conversation really needs to happen on the updates list so please raise it there. However, since we are already talking about it: Backporting to 11u sounds to me like a worthwhile precaution. I'm not sure it is worth worrying about 12u. The bug does not appear to have crashed anything yet (at least not as far as we know :-) and 12u offers a very limited window for a problem to manifest. I think it is more important to backport it to the jdk8u aarch64 tree (which will probably be around for ... some time). However, let's defer any decision on that to those managing the updates projects. 
regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From vladimir.kozlov at oracle.com Mon Mar 25 17:20:47 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 25 Mar 2019 10:20:47 -0700 Subject: RFR: 8221404: Remove double alignment of RegMasks in Matcher In-Reply-To: References: Message-ID: <9cf2258b-c432-6bb1-7da6-f7bd9c6a6a98@oracle.com> Intel's Skylake has 64 bytes L1 cache line. Keeping first 64 register masks bits in one cache line should help. I don't think we should do this change. Thanks, Vladimir On 3/25/19 6:44 AM, Claes Redestad wrote: > Hi, > > RegMask are allocated double-aligned, which doesn't seem to have any > real effect on any of our supported platforms. Simplify. > > Bug:??? https://bugs.openjdk.java.net/browse/JDK-8221404 > Webrev: http://cr.openjdk.java.net/~redestad/8221404/open.00 > > Testing: tier1-3 (together with JDK-8221343) > > Thanks! > > /Claes From igor.ignatyev at oracle.com Mon Mar 25 17:44:41 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Mon, 25 Mar 2019 10:44:41 -0700 Subject: [13] RFR: 8219612: [TESTBUG] compiler.codecache.stress.Helper.TestCaseImpl can't be defined in different runtime package as its nest host In-Reply-To: References: Message-ID: <8E118BA1-B8F8-47B8-86F5-86B46E36FF1E@oracle.com> Hi Rahul, the fix looks good to me, I have a meta question though (most probably for Mandy): wouldn't this restriction break backwards compatibility? Thanks, -- Igor > On Mar 25, 2019, at 12:51 AM, Rahul Raghavan wrote: > > Hi, > > Please review the following fix proposal for JDK-8219612. > http://cr.openjdk.java.net/~rraghavan/8219612/webrev.00/ > > > - https://bugs.openjdk.java.net/browse/JDK-8219612 > - http://hg.openjdk.java.net/valhalla/valhalla/rev/ab7ea72963c9 > > Thanks to Mandy Chung. 
> As mentioned in JBS above fix changeset is same as the one contributed by Mandy Chung, in valhalla repo nestmates branch.
>
> Testbug issue - compiler.codecache.stress.Helper.TestCaseImpl can't be defined in different runtime package as its nest host.
>
> Proposed fix - Test rewritten to use top-level classes rather than nested ones.
>
>
> Thanks,
> Rahul

From claes.redestad at oracle.com  Mon Mar 25 18:16:11 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Mon, 25 Mar 2019 19:16:11 +0100
Subject: RFR: 8221404: Remove double alignment of RegMasks in Matcher
In-Reply-To: <9cf2258b-c432-6bb1-7da6-f7bd9c6a6a98@oracle.com>
References: <9cf2258b-c432-6bb1-7da6-f7bd9c6a6a98@oracle.com>
Message-ID:

Withdrawing with intent to rework this a bit.

Notes:

- A single RegMask on my Intel ivy bridge workstation is 22 4-byte mask
words - 88 bytes - plus the 8 bytes for _lwm and _hwm, so 96 bytes. As
double-aligning only shifts the object at most 4 bytes we'll still look
at RegMasks that span 2 cache lines.

- Quick performance runs show neutral results.

- I'm having second thoughts about the utility of _lwm since it sticks
to 0 for many (most?) masks, and my intent with removing the double-
alignment was to make the RegMask layout more malleable to
experimentation with layout without memory waste. I have some data that
suggest we could improve a bit by putting watermarks and the AllStack
bit closer to the first masks.

Thanks!

/Claes

On 2019-03-25 18:20, Vladimir Kozlov wrote:
> Intel's Skylake has 64 bytes L1 cache line.
> Keeping first 64 register masks bits in one cache line should help. I
> don't think we should do this change.
>
> Thanks,
> Vladimir
>
> On 3/25/19 6:44 AM, Claes Redestad wrote:
>> Hi,
>>
>> RegMask are allocated double-aligned, which doesn't seem to have any
>> real effect on any of our supported platforms. Simplify.
>>
>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8221404
>> Webrev: http://cr.openjdk.java.net/~redestad/8221404/open.00
>>
>> Testing: tier1-3 (together with JDK-8221343)
>>
>> Thanks!
>>
>> /Claes

From vladimir.kozlov at oracle.com  Mon Mar 25 19:33:10 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 25 Mar 2019 12:33:10 -0700
Subject: [13] RFR: 8202414: Unsafe write after primitive array creation may result in array length change
In-Reply-To: <392c665f-869c-29af-4fc5-e6f844820846@oracle.com>
References: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com> <392c665f-869c-29af-4fc5-e6f844820846@oracle.com>
Message-ID: <3db5d7ab-ad99-310b-e891-fc36d25da338@oracle.com>

On 3/25/19 2:30 AM, Rahul Raghavan wrote:
> Hi,
>
> Request help review the following revised fix proposal for JDK-8202414.
>
> - http://cr.openjdk.java.net/~rraghavan/8202414/webrev.01/
>
> Though did not receive comments for earlier '8202414/webrev.00',
> when checked again seems the same to be wrong or too restrictive.
> So tried the revised changes -
>
> intptr_t InitializeNode::can_capture_store(StoreNode* st, PhaseTransform* phase, bool can_reshape) {
>    const int FAIL = 0;
>    if (st->is_unaligned_access()) {
>      return FAIL;
>    }
> +  if ((st->memory_size() >= BytesPerInt) && ((get_store_offset(st, phase) % BytesPerInt) != 0)) {
> +    return FAIL;
> +  }

Suggestion:

if ((get_store_offset(st, phase) % st->memory_size()) != 0) {

Vladimir

>    if (st->req() != MemNode::ValueIn + 1)
>      return FAIL;                // an inscrutable StoreNode (card mark?)
>
> Confirmed no issues with reported 8202414 test case.
> Also no issues for hs-tier1 to tier4, hs-precheckin-comp testing.
> Please let me know if missed something here.
>
>
> Thanks,
> Rahul
>
>
>
> On 14/03/19 1:54 PM, Rahul Raghavan wrote:
>> Hi,
>>
>> Please review the following fix proposal for JDK-8202414.
>>
>> Webrev - http://cr.openjdk.java.net/~rraghavan/8202414/webrev.00/
>>
>>
>> -- Related links
>>  > https://bugs.openjdk.java.net/browse/JDK-8202414
>>  > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2018-September/030536.html
>>
>>
>> -- As per suggestions in JBS added following change in InitializeNode::can_capture_store() to return false for
>> unaligned stores.
>> =============
>> diff -r 3086f9259e97 src/hotspot/share/opto/memnode.cpp
>> --- a/src/hotspot/share/opto/memnode.cpp Wed Mar 13 00:48:52 2019 -0400
>> +++ b/src/hotspot/share/opto/memnode.cpp Wed Mar 13 19:50:07 2019 +0530
>> @@ -3541,7 +3541,7 @@
>>   // within the initialized memory.
>>   intptr_t InitializeNode::can_capture_store(StoreNode* st, PhaseTransform* phase, bool can_reshape) {
>>     const int FAIL = 0;
>> -  if (st->is_unaligned_access()) {
>> +  if (st->is_unaligned_access() || ((get_store_offset(st, phase) % BytesPerInt) != 0)) {
>>       return FAIL;
>>     }
>>     if (st->req() != MemNode::ValueIn + 1)
>> ==============
>>
>>
>> -- Added the new jtreg test from the JBS unit test.
>> (test/hotspot/jtreg/compiler/c2/Test8202414.java)
>> Understood the test with unaligned access will not work for Sparc due to hardware restrictions. The test always fails
>> with SIGBUS crash, with or without above fix. So added
>>     @requires (os.arch != "sparc") & (os.arch != "sparcv9")
>>
>>
>> -- Confirmed the above change solved the original reported 8202414 test case failure. Also no issues so far for hs-tier1
>> to tier4, hs-precheckin-comp testing.
>>
>> -- Could not work out any related additions in LibraryCallKit::inline_unsafe_access().
>> Hope above fix proposal is correct, complete solution for the issue.
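
The three offset checks that appear in this thread (webrev.00, webrev.01, and the reviewer's suggested form) accept and reject different stores. The following is only a minimal sketch in plain Java, assuming a store is fully described by its byte offset and its size in bytes. BYTES_PER_INT mirrors HotSpot's BytesPerInt; the class and method names are hypothetical and are not HotSpot API, and the separate is_unaligned_access() test is left out.

```java
public class StoreAlignmentSketch {
    // Stands in for HotSpot's BytesPerInt (assumption: 4 bytes).
    static final int BYTES_PER_INT = 4;

    // Offset part of the webrev.00 condition: reject every store whose
    // offset is not a multiple of the int size, whatever the store size.
    static boolean rejectedByWebrev00(long offset, int size) {
        return offset % BYTES_PER_INT != 0;
    }

    // webrev.01 condition: only int-sized or larger stores are rejected
    // when their offset is not a multiple of the int size.
    static boolean rejectedByWebrev01(long offset, int size) {
        return size >= BYTES_PER_INT && offset % BYTES_PER_INT != 0;
    }

    // Suggested condition: reject a store that is not aligned to its own
    // size. The size is assumed to be at least 1.
    static boolean rejectedBySuggestion(long offset, int size) {
        return offset % size != 0;
    }

    public static void main(String[] args) {
        long[][] cases = { {17, 1}, {18, 2}, {17, 2}, {20, 8}, {17, 4}, {16, 4} };
        System.out.println("offset size webrev00 webrev01 suggestion");
        for (long[] c : cases) {
            long offset = c[0];
            int size = (int) c[1];
            System.out.println(offset + " " + size + " "
                    + rejectedByWebrev00(offset, size) + " "
                    + rejectedByWebrev01(offset, size) + " "
                    + rejectedBySuggestion(offset, size));
        }
    }
}
```

In this model a 2-byte store at offset 17 and an 8-byte store at offset 20 are caught only by the size-relative form, while a 1-byte store at offset 17 is rejected only by the webrev.00 form.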
>>
>>
>> Thanks,
>> Rahul

From dean.long at oracle.com  Mon Mar 25 22:52:51 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Mon, 25 Mar 2019 15:52:51 -0700
Subject: RFR: 8221404: Remove double alignment of RegMasks in Matcher
In-Reply-To:
References: <9cf2258b-c432-6bb1-7da6-f7bd9c6a6a98@oracle.com>
Message-ID: <4428c892-0244-9de9-12ee-978da6fd9e26@oracle.com>

Since you are going to rework this, does it make sense to have separate
masks for GPR and FP registers?

dl

On 3/25/19 11:16 AM, Claes Redestad wrote:
> Withdrawing with intent to rework this a bit.
>
> Notes:
>
> - A single RegMask on my Intel ivy bridge workstation is 22 4-byte mask
> words - 88 bytes - plus the 8 bytes for _lwm and _hwm, so 96 bytes. As
> double-aligning only shifts the object at most 4 bytes we'll still look
> at RegMasks that span 2 cache lines.
>
> - Quick performance runs show neutral results.
>
> - I'm having second thoughts about the utility of _lwm since it sticks
> to 0 for many (most?) masks, and my intent with removing the double-
> alignment was to make the RegMask layout more malleable to
> experimentation with layout without memory waste. I have some data that
> suggest we could improve a bit by putting watermarks and the AllStack
> bit closer to the first masks.
>
> Thanks!
>
> /Claes
>
> On 2019-03-25 18:20, Vladimir Kozlov wrote:
>> Intel's Skylake has 64 bytes L1 cache line.
>> Keeping first 64 register masks bits in one cache line should help. I
>> don't think we should do this change.
>>
>> Thanks,
>> Vladimir
>>
>> On 3/25/19 6:44 AM, Claes Redestad wrote:
>>> Hi,
>>>
>>> RegMask are allocated double-aligned, which doesn't seem to have any
>>> real effect on any of our supported platforms. Simplify.
>>>
>>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8221404
>>> Webrev: http://cr.openjdk.java.net/~redestad/8221404/open.00
>>>
>>> Testing: tier1-3 (together with JDK-8221343)
>>>
>>> Thanks!
>>> >>> /Claes From goetz.lindenmaier at sap.com Tue Mar 26 07:10:44 2019 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 26 Mar 2019 07:10:44 +0000 Subject: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code In-Reply-To: References: Message-ID: Hi Volker, so basically the code handled pointers (Object references) the same as integers and other 32-bit values, because they take up on Java 'slot'? Is that what "is_single_cpu()" means? If so, the fix looks good and complete. Reviewed. For the test, while it is extremely unlikely that it ever fails, (now after the code has been fixed), I think it should throw a RuntimeException with a message like "wrong compare resulting in assumption of reference being null" or the like, instead of throwing a NPE. Don't need a new webrev in case you want to adapt this. Best regards, Goetz. > -----Original Message----- > From: hotspot-compiler-dev bounces at openjdk.java.net> On Behalf Of Volker Simonis > Sent: Dienstag, 19. M?rz 2019 19:54 > To: hotspot compiler > Subject: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code > > Hi, > > can I please have a review for the following small ppc64-only C1 patch > which fixes a nasty, day-one problem which only recently popped up > more frequently: > > http://cr.openjdk.java.net/~simonis/webrevs/2019/8221083/ > https://bugs.openjdk.java.net/browse/JDK-8221083 > > The C1 generated code for comparing two oops erroneously emits a > 32-bit instead of an 64-bit compare instruction. Because oops are only > compared for equality/inequality, this bug only becomes manifests for > oops which are equal in their 32 least-significant bits but unequal > otherwise. This means the two oops have to be exactly 4GB apart from > each other in the heap or their 32 least significant bits have to be > zero when compared to 'null'. 
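
The failure condition described in the quoted text (two references that agree in their 32 least-significant bits) can be shown with plain Java long values. This is only an illustrative sketch: it models addresses as longs instead of touching real oops, and the class and method names are hypothetical.

```java
public class OopCompareSketch {
    // Models a 32-bit compare: only the low 32 bits of each address take part.
    static boolean equal32(long a, long b) {
        return (int) a == (int) b;
    }

    // Models a 64-bit compare: the full address takes part.
    static boolean equal64(long a, long b) {
        return a == b;
    }

    public static void main(String[] args) {
        long nullOop = 0L;
        long oop = 0x700000000L;        // low 32 bits are all zero
        long other = oop + (1L << 32);  // exactly 4 GB away from oop

        // The truncated compare cannot tell the object from null.
        System.out.println("32-bit compare with null:   " + equal32(oop, nullOop));
        System.out.println("64-bit compare with null:   " + equal64(oop, nullOop));

        // The truncated compare cannot tell two objects 4 GB apart from each other.
        System.out.println("32-bit compare, 4 GB apart: " + equal32(oop, other));
        System.out.println("64-bit compare, 4 GB apart: " + equal64(oop, other));
    }
}
```

The address 0x700000000 is the one the regression test in this thread tries to obtain; any address whose low 32 bits are zero would behave the same way in this model.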
> > This makes the occurrence of this bug extremely unlikely, but when it > happens, the consequences are usually a semantically wrong program > execution and not a crash, which makes it very hard to detect. > > The regression test reproduces the issue by allocation an object at an > address with the 32-bit least significant bits being zero and comperes > it with another null object. > > The fix also removes some adjacent code which has never been used (and > tested) until now. > > Thank you and best regards, > Volker From per.liden at oracle.com Tue Mar 26 07:24:09 2019 From: per.liden at oracle.com (Per Liden) Date: Tue, 26 Mar 2019 08:24:09 +0100 Subject: RFR: 8221456: nmethod::make_unloaded() clears _method member too early Message-ID: nmethod::make_unloaded() clears the _method member too early, before passing the nmethod to CollectedHeap::unregister_nmethod(). This is not what happens when an nmethod is unregistered via nmethod::make_not_entrant_or_zombie(). We should align this behavior. Clearing the _method member after it has been unregistered is useful, since the GC can then print the method name/signature in logs, etc. Moving the clearing of _method until after CollectedHeap::unregister_nmethod() should be a safe and uncontroversial thing to do. Today, ZGC can crash if -Xlog:gc+nmethod=debug is used and an nmethod is unloaded via nmethod::make_unloaded(), because it tries to log the name of the method. 
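
The ordering problem can be modelled outside the VM. The sketch below is a loose Java analogy, not HotSpot code: NMethod, Method, unregister and the two makeUnloaded variants are hypothetical names, and unregister stands in for a GC hook that logs the method name while unregistering.

```java
public class UnloadOrderSketch {
    static final class Method {
        final String name;
        Method(String name) { this.name = name; }
    }

    static final class NMethod {
        Method method;
        NMethod(Method method) { this.method = method; }
    }

    // Stand-in for an unregister hook that logs the method name.
    // It dereferences nm.method, so it fails if the field was already cleared.
    static String unregister(NMethod nm) {
        return "unregistering " + nm.method.name;
    }

    // Clears the field first, then unregisters: the hook sees a null method.
    static String makeUnloadedClearFirst(NMethod nm) {
        nm.method = null;
        return unregister(nm);
    }

    // Unregisters first, then clears the field: the hook can still log the name.
    static String makeUnloadedClearLast(NMethod nm) {
        String log = unregister(nm);
        nm.method = null;
        return log;
    }

    public static void main(String[] args) {
        System.out.println(makeUnloadedClearLast(new NMethod(new Method("foo()V"))));
        try {
            makeUnloadedClearFirst(new NMethod(new Method("foo()V")));
        } catch (NullPointerException e) {
            System.out.println("clear-first ordering failed while logging");
        }
    }
}
```

In both variants the field ends up cleared; only the point at which it is cleared differs.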
Bug: https://bugs.openjdk.java.net/browse/JDK-8221456 Webrev: http://cr.openjdk.java.net/~pliden/8221456/webrev.0 /Per From volker.simonis at gmail.com Tue Mar 26 10:50:10 2019 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 26 Mar 2019 11:50:10 +0100 Subject: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code In-Reply-To: References: Message-ID: On Tue, Mar 26, 2019 at 8:10 AM Lindenmaier, Goetz wrote: > > Hi Volker, > > so basically the code handled pointers (Object references) the > same as integers and other 32-bit values, because they take up > on Java 'slot'? Is that what "is_single_cpu()" means? > Yes, exactly. "is_single_cpu()" means that it uses one Java 'slot' but if it's an object or an array we still need to use a 64-bit compare. > If so, the fix looks good and complete. Reviewed. Thanks. > > For the test, while it is extremely unlikely that it ever fails, > (now after the code has been fixed), > I think it should throw a RuntimeException with a message > like "wrong compare resulting in assumption of reference being null" > or the like, instead of throwing a NPE. > Don't need a new webrev in case you want to adapt this. > Done. > Best regards, > Goetz. > > > -----Original Message----- > > From: hotspot-compiler-dev > bounces at openjdk.java.net> On Behalf Of Volker Simonis > > Sent: Dienstag, 19. M?rz 2019 19:54 > > To: hotspot compiler > > Subject: RFR(S): 8221083: [ppc64] Wrong oop compare in C1-generated code > > > > Hi, > > > > can I please have a review for the following small ppc64-only C1 patch > > which fixes a nasty, day-one problem which only recently popped up > > more frequently: > > > > http://cr.openjdk.java.net/~simonis/webrevs/2019/8221083/ > > https://bugs.openjdk.java.net/browse/JDK-8221083 > > > > The C1 generated code for comparing two oops erroneously emits a > > 32-bit instead of an 64-bit compare instruction. 
Because oops are only
> > compared for equality/inequality, this bug only becomes manifest for
> > oops which are equal in their 32 least-significant bits but unequal
> > otherwise. This means the two oops have to be exactly 4GB apart from
> > each other in the heap or their 32 least significant bits have to be
> > zero when compared to 'null'.
> >
> > This makes the occurrence of this bug extremely unlikely, but when it
> > happens, the consequences are usually a semantically wrong program
> > execution and not a crash, which makes it very hard to detect.
> >
> > The regression test reproduces the issue by allocating an object at an
> > address with the 32 least significant bits being zero and compares
> > it with another null object.
> >
> > The fix also removes some adjacent code which has never been used (and
> > tested) until now.
> >
> > Thank you and best regards,
> > Volker

From shade at redhat.com  Tue Mar 26 13:19:01 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 26 Mar 2019 14:19:01 +0100
Subject: RFR (XS) 8220198: Lots of com/sun/crypto/provider/Cipher tests fail on x86_32 due to missing SHA512 stubs
Message-ID:

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8220198

Fix:
  http://cr.openjdk.java.net/~shade/8220198/webrev.01/

Actually, we might consider enabling these stubs for 32-bit builds, but it would require more work.
I want to get this easy patch in to make it cleanly backportable to 12u and 11u, where x86_32 is
broken too. Not very sure about the assert in GraphKit: it turns the cryptic crash into proper
failure, and I _think_ there are no false negatives from it, because calling to NULL stub would
crash eventually anyway.

Testing: Linux {x86_64, x86_32} tier1, Cipher tests are fixed on 32-bit, jdk-submit (running)

Thanks,
-Aleksey

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL:

From vladimir.kozlov at oracle.com  Tue Mar 26 16:46:03 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 26 Mar 2019 09:46:03 -0700
Subject: RFR (XS) 8220198: Lots of com/sun/crypto/provider/Cipher tests fail on x86_32 due to missing SHA512 stubs
In-Reply-To:
References:
Message-ID: <1478e3b4-625d-491a-8876-0a6a078ee479@oracle.com>

Thank you, Aleksey

vm_version_x86.cpp changes are good.

And I agree with assert in GraphKit.

But I would also suggest to add (stubAddr == NULL) check in LibraryCallKit::inline_sha_implCompress(). We do such check
in other intrinsics [1].

And please file a new bug to fix other intrinsics which are missing the check - I see such cases in library_call.cpp.

Thanks,
Vladimir

[1] http://hg.openjdk.java.net/jdk/jdk/file/c12b897021ea/src/hotspot/share/opto/library_call.cpp#l5986

On 3/26/19 6:19 AM, Aleksey Shipilev wrote:
> Bug:
>   https://bugs.openjdk.java.net/browse/JDK-8220198
>
> Fix:
>   http://cr.openjdk.java.net/~shade/8220198/webrev.01/
>
> Actually, we might consider enabling these stubs for 32-bit builds, but it would require more work.
> I want to get this easy patch in to make it cleanly backportable to 12u and 11u, where x86_32 is
> broken too. Not very sure about the assert in GraphKit: it turns the cryptic crash into proper
> failure, and I _think_ there are no false negatives from it, because calling to NULL stub would
> crash eventually anyway.
> > Testing: Linux {x86_64, x86_32} tier1, Cipher tests are fixed on 32-bit, jdk-submit (running) > > Thanks, > -Aleksey > > From shade at redhat.com Tue Mar 26 17:37:52 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 26 Mar 2019 18:37:52 +0100 Subject: RFR (XS) 8220198: Lots of com/sun/crypto/provider/Cipher tests fail on x86_32 due to missing SHA512 stubs In-Reply-To: <1478e3b4-625d-491a-8876-0a6a078ee479@oracle.com> References: <1478e3b4-625d-491a-8876-0a6a078ee479@oracle.com> Message-ID: <2ee58490-141b-263d-7f2e-5b2f393ffbf6@redhat.com> On 3/26/19 5:46 PM, Vladimir Kozlov wrote: > But I would also suggest to add (stubAddr == NULL) check in > LibraryCallKit::inline_sha_implCompress(). We do such check in other intrinsics [1]. Yes, gracefully returning on uninitialized stub in release bits is a saner behavior. I still want the assert before that check to catch the actual stubs bug early. Like this: http://cr.openjdk.java.net/~shade/8220198/webrev.02/ Still passes tier1. I'll run jdk-submit one more time. > And please file a new bug to fix other intrinsics which missing the check - I see such cases in library_call.cpp. https://bugs.openjdk.java.net/browse/JDK-8221495 -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From vladimir.kozlov at oracle.com Tue Mar 26 18:33:56 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 26 Mar 2019 11:33:56 -0700 Subject: RFR (XS) 8220198: Lots of com/sun/crypto/provider/Cipher tests fail on x86_32 due to missing SHA512 stubs In-Reply-To: References: <1478e3b4-625d-491a-8876-0a6a078ee479@oracle.com> <2ee58490-141b-263d-7f2e-5b2f393ffbf6@redhat.com> <0d05bcd6-c9e1-547b-673e-7a45d18b84dd@oracle.com> Message-ID: <641bcbcf-e429-91a3-db8c-a15efdc854ac@oracle.com> Added CC list I missed in my previous reply. 
On 3/26/19 10:56 AM, Aleksey Shipilev wrote: > On 3/26/19 6:52 PM, Vladimir Kozlov wrote: >> I don't think we should have stub_addr check asserts in library_call.cpp (in graphKit it is fine). >> It is normal if an implementation is done only on some platforms. Consider it as >> Matcher::match_rule_supported() check we have in some intrinsics. > > > I think I misplaced one of the assert blocks. Look here: > http://cr.openjdk.java.net/~shade/8220198/webrev.02/src/hotspot/share/opto/library_call.cpp.sdiff.html > > In first instance, we are about to call make_runtime_call with "stubAddr == NULL", which is going to > assert there. If there is only the "return false", then it would silently disable the intrinsic > without exposing the actual bug. That is not good. > > In second instance, we do know the intrinsic is enabled, because we are under (klass_SHA_name != > NULL) and going for inline_sha_implCompressMB generation with potentially NULL stub_addr. That > method would call make_runtime_call with NULL then, and assert there, but then we have the same > hidden problem as in first instance. Agree. There are asserts or checks for flags which complement your new assert. Your assert will catch cases when flag is set but stub is not generated as in this bug. 
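
The combination agreed on here, an assert that exposes a missing stub in debug builds followed by a NULL check that falls back quietly in release builds, can be approximated in plain Java. This is a rough analogy only: Java's assert statement (active only when the JVM runs with -ea) plays the role of a debug-build assert, a stub address of 0 plays the role of NULL, and the class and method names are hypothetical.

```java
public class StubCheckSketch {
    // Returns true if the intrinsic would be inlined, false if the caller
    // should fall back to the regular (non-intrinsic) path.
    static boolean inlineIntrinsic(boolean flagEnabled, long stubAddr) {
        if (!flagEnabled) {
            return false;   // intrinsic is disabled: nothing to check
        }
        // Debug-build analogue: with -ea this fails loudly when the flag is
        // set but no stub was generated, which is the actual bug.
        assert stubAddr != 0 : "flag is set but stub was not generated";
        // Release-build analogue: without -ea the assert is skipped and the
        // missing stub only disables the intrinsic.
        if (stubAddr == 0) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(inlineIntrinsic(false, 0L));      // intrinsic disabled
        System.out.println(inlineIntrinsic(true, 0x1000L));  // stub present
        // inlineIntrinsic(true, 0L) throws AssertionError with -ea and
        // returns false without it.
    }
}
```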

Thanks,
Vladimir

>
> -Aleksey
>

From shade at redhat.com  Tue Mar 26 18:39:58 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 26 Mar 2019 19:39:58 +0100
Subject: RFR (XS) 8220198: Lots of com/sun/crypto/provider/Cipher tests fail on x86_32 due to missing SHA512 stubs
In-Reply-To: <641bcbcf-e429-91a3-db8c-a15efdc854ac@oracle.com>
References: <1478e3b4-625d-491a-8876-0a6a078ee479@oracle.com> <2ee58490-141b-263d-7f2e-5b2f393ffbf6@redhat.com> <0d05bcd6-c9e1-547b-673e-7a45d18b84dd@oracle.com> <641bcbcf-e429-91a3-db8c-a15efdc854ac@oracle.com>
Message-ID: <76c64b26-d637-736f-1356-6ff23c4c9067@redhat.com>

On 3/26/19 7:33 PM, Vladimir Kozlov wrote:
> On 3/26/19 10:56 AM, Aleksey Shipilev wrote:
>> On 3/26/19 6:52 PM, Vladimir Kozlov wrote:
>> ...
>> http://cr.openjdk.java.net/~shade/8220198/webrev.02/src/hotspot/share/opto/library_call.cpp.sdiff.html
>>
>> In first instance, we are about to call make_runtime_call with "stubAddr == NULL", which is going to
>> assert there. If there is only the "return false", then it would silently disable the intrinsic
>> without exposing the actual bug. That is not good.
>>
>> In second instance, we do know the intrinsic is enabled, because we are under (klass_SHA_name !=
>> NULL) and going for inline_sha_implCompressMB generation with potentially NULL stub_addr. That
>> method would call make_runtime_call with NULL then, and assert there, but then we have the same
>> hidden problem as in first instance.
>
> Agree. There are asserts or checks for flags which complement your new assert.
> Your assert will catch cases when flag is set but stub is not generated as in this bug.

Okay, good. So to be clear, you are agreeing with that webrev, right?
  http://cr.openjdk.java.net/~shade/8220198/webrev.02/

-Aleksey

From vladimir.kozlov at oracle.com  Tue Mar 26 18:49:26 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 26 Mar 2019 11:49:26 -0700
Subject: RFR (XS) 8220198: Lots of com/sun/crypto/provider/Cipher tests fail on x86_32 due to missing SHA512 stubs
In-Reply-To: <76c64b26-d637-736f-1356-6ff23c4c9067@redhat.com>
References: <1478e3b4-625d-491a-8876-0a6a078ee479@oracle.com> <2ee58490-141b-263d-7f2e-5b2f393ffbf6@redhat.com> <0d05bcd6-c9e1-547b-673e-7a45d18b84dd@oracle.com> <641bcbcf-e429-91a3-db8c-a15efdc854ac@oracle.com> <76c64b26-d637-736f-1356-6ff23c4c9067@redhat.com>
Message-ID: <489b5afb-dfe1-97f8-f58e-5512a6ebc66a@oracle.com>

On 3/26/19 11:39 AM, Aleksey Shipilev wrote:
> On 3/26/19 7:33 PM, Vladimir Kozlov wrote:
>> On 3/26/19 10:56 AM, Aleksey Shipilev wrote:
>>> On 3/26/19 6:52 PM, Vladimir Kozlov wrote:
>>>
>>> http://cr.openjdk.java.net/~shade/8220198/webrev.02/src/hotspot/share/opto/library_call.cpp.sdiff.html
>>>
>>> In first instance, we are about to call make_runtime_call with "stubAddr == NULL", which is going to
>>> assert there. If there is only the "return false", then it would silently disable the intrinsic
>>> without exposing the actual bug. That is not good.
>>>
>>> In second instance, we do know the intrinsic is enabled, because we are under (klass_SHA_name !=
>>> NULL) and going for inline_sha_implCompressMB generation with potentially NULL stub_addr. That
>>> method would call make_runtime_call with NULL then, and assert there, but then we have the same
>>> hidden problem as in first instance.
>>
>> Agree. There are asserts or checks for flags which complement your new assert.
>> Your assert will catch cases when flag is set but stub is not generated as in this bug.
>
> Okay, good. So to be clear, you are agreeing with that webrev, right?
>    http://cr.openjdk.java.net/~shade/8220198/webrev.02/

Yes, it is good.
Vladimir > > -Aleksey > From erik.osterlund at oracle.com Wed Mar 27 06:59:46 2019 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Wed, 27 Mar 2019 07:59:46 +0100 Subject: RFR: 8221456: nmethod::make_unloaded() clears _method member too early In-Reply-To: References: Message-ID: Hi Per, Looks good. Thanks, /Erik > On 26 Mar 2019, at 08:24, Per Liden wrote: > > nmethod::make_unloaded() clears the _method member too early, before passing the nmethod to CollectedHeap::unregister_nmethod(). This is not what happens when an nmethod is unregistered via nmethod::make_not_entrant_or_zombie(). We should align this behavior. Clearing the _method member after it has been unregistered is useful, since the GC can then print the method name/signature in logs, etc. Moving the clearing of _method until after CollectedHeap::unregister_nmethod() should be a safe and uncontroversial thing to do. > > Today, ZGC can crash if -Xlog:gc+nmethod=debug is used and an nmethod is unloaded via nmethod::make_unloaded(), because it tries to log the name of the method. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8221456 > Webrev: http://cr.openjdk.java.net/~pliden/8221456/webrev.0 > > /Per From tobias.hartmann at oracle.com Wed Mar 27 07:10:17 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 27 Mar 2019 08:10:17 +0100 Subject: RFR (XS) 8220198: Lots of com/sun/crypto/provider/Cipher tests fail on x86_32 due to missing SHA512 stubs In-Reply-To: <489b5afb-dfe1-97f8-f58e-5512a6ebc66a@oracle.com> References: <1478e3b4-625d-491a-8876-0a6a078ee479@oracle.com> <2ee58490-141b-263d-7f2e-5b2f393ffbf6@redhat.com> <0d05bcd6-c9e1-547b-673e-7a45d18b84dd@oracle.com> <641bcbcf-e429-91a3-db8c-a15efdc854ac@oracle.com> <76c64b26-d637-736f-1356-6ff23c4c9067@redhat.com> <489b5afb-dfe1-97f8-f58e-5512a6ebc66a@oracle.com> Message-ID: Hi Aleksey, looks good to me too. 
Best regards, Tobias On 26.03.19 19:49, Vladimir Kozlov wrote: > On 3/26/19 11:39 AM, Aleksey Shipilev wrote: >> On 3/26/19 7:33 PM, Vladimir Kozlov wrote: >>> On 3/26/19 10:56 AM, Aleksey Shipilev wrote: >>>> On 3/26/19 6:52 PM, Vladimir Kozlov wrote: >>>> >>>> http://cr.openjdk.java.net/~shade/8220198/webrev.02/src/hotspot/share/opto/library_call.cpp.sdiff.html >>>> >>>> >>>> In first instance, we are about to call make_runtime_call with "stubAddr == NULL", which is >>>> going to >>>> assert there. If there is only the "return false", then it would silently disable the intrinsic >>>> without exposing the actual bug. That is not good. >>>> >>>> In second instance, we do know the intrinsic is enabled, because we are under (klass_SHA_name != >>>> NULL) and going for inline_sha_implCompressMB generation with potentially NULL stub_addr. That >>>> method would call make_runtime_call with NULL then, and assert there, but then we have the same >>>> hidden problem as in first instance. >>> >>> Agree. There are asserts or checks for flags which complement your new assert. >>> Your assert will catch cases when flag is set but stub is not generated as in this bug. >> >> Okay, good. So to be clear, you are agreeing with that webrev, right? >> http://cr.openjdk.java.net/~shade/8220198/webrev.02/ > > Yes, it is good. > > Vladimir > >> >> -Aleksey >> From per.liden at oracle.com Wed Mar 27 07:35:30 2019 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Mar 2019 08:35:30 +0100 Subject: RFR: 8221456: nmethod::make_unloaded() clears _method member too early In-Reply-To: References: Message-ID: <8a81ba36-0daa-5eb2-6b18-90f030ece01e@oracle.com> Thanks Erik! /Per On 3/27/19 7:59 AM, Erik Osterlund wrote: > Hi Per, > > Looks good. > > Thanks, > /Erik > >> On 26 Mar 2019, at 08:24, Per Liden wrote: >> >> nmethod::make_unloaded() clears the _method member too early, before passing the nmethod to CollectedHeap::unregister_nmethod(). 
This is not what happens when an nmethod is unregistered via nmethod::make_not_entrant_or_zombie(). We should align this behavior. Clearing the _method member after it has been unregistered is useful, since the GC can then print the method name/signature in logs, etc. Moving the clearing of _method until after CollectedHeap::unregister_nmethod() should be a safe and uncontroversial thing to do. >> >> Today, ZGC can crash if -Xlog:gc+nmethod=debug is used and an nmethod is unloaded via nmethod::make_unloaded(), because it tries to log the name of the method. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8221456 >> Webrev: http://cr.openjdk.java.net/~pliden/8221456/webrev.0 >> >> /Per > From fujie at loongson.cn Wed Mar 27 10:15:08 2019 From: fujie at loongson.cn (Jie Fu) Date: Wed, 27 Mar 2019 18:15:08 +0800 Subject: RFR: 8221542: ~15% performance degradation due to less optimized inline decision Message-ID: Hi all, JBS: https://bugs.openjdk.java.net/browse/JDK-8221542 Webrev: http://cr.openjdk.java.net/~jiefu/monte_carlo-perf-drop/webrev.00/ ## Symptom ~15% performance degradation (from 700 ops/m to 600 ops/m) was observed randomly on x86 while running SPECjvm2008's scimark.monte_carlo with -XX:-TieredCompilation. ## Reproduce It can be always reproduced with the script[1] in less than 5 minutes. ## Reason The drop was caused by a not-inline decision on spec.benchmarks.scimark.utils.Random:: in spec.benchmarks.scimark.monte_carlo.MonteCarlo::integrate. ## Fix It might be better to make a little change to the inline heuristic[2]. For callers without loops, the original heuristic works fine. But for callers with loops, it would be better to make a not-inline decision more conservatively. ## Testing - Running scimark.monte_carlo on jdk/x64 with -XX:-TieredCompilation for about 5000 times, no performance drop
Also on jdk8u/mips64 with -XX:-TieredCompilation, no performance drop - Running make test TEST="micro" on jdk/x64, no performance regression - Running SPECjvm2008 on jdk8u/x64 with -XX:-TieredCompilation, no performance regression For more detailed info, please see the JBS. Could you please review it? Thanks a lot. Best regards, Jie [1] http://cr.openjdk.java.net/~jiefu/monte_carlo-perf-drop/reproduce.sh [2] http://hg.openjdk.java.net/jdk/jdk/file/0a2d73e02076/src/hotspot/share/opto/bytecodeInfo.cpp#l375 From tobias.hartmann at oracle.com Wed Mar 27 10:58:28 2019 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 27 Mar 2019 11:58:28 +0100 Subject: RFR: 8221456: nmethod::make_unloaded() clears _method member too early In-Reply-To: References: Message-ID: <7b285295-3756-e07c-175f-ced23977554a@oracle.com> Hi Per, this looks good to me. Best regards, Tobias On 26.03.19 08:24, Per Liden wrote: > nmethod::make_unloaded() clears the _method member too early, before passing the nmethod to > CollectedHeap::unregister_nmethod(). This is not what happens when an nmethod is unregistered via > nmethod::make_not_entrant_or_zombie(). We should align this behavior. Clearing the _method member > after it has been unregistered is useful, since the GC can then print the method name/signature in > logs, etc. Moving the clearing of _method until after CollectedHeap::unregister_nmethod() should be > a safe and uncontroversial thing to do. > > Today, ZGC can crash if -Xlog:gc+nmethod=debug is used and an nmethod is unloaded via > nmethod::make_unloaded(), because it tries to log the name of the method. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8221456 > Webrev: http://cr.openjdk.java.net/~pliden/8221456/webrev.0 > > /Per From rahul.v.raghavan at oracle.com Wed Mar 27 10:58:03 2019 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Wed, 27 Mar 2019 16:28:03 +0530 Subject: [13] RFR: 8219612: [TESTBUG] compiler.codecache.stress.Helper.TestCaseImpl can't be defined in different runtime package as its nest host In-Reply-To: References: <65bc319b-edd7-614d-f081-c45c5daa862a@oracle.com> Message-ID: Thank you Vladimir. On 25/03/19 9:43 PM, Vladimir Kozlov wrote: > +1 > > Vladimir > From rahul.v.raghavan at oracle.com Wed Mar 27 10:56:59 2019 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Wed, 27 Mar 2019 16:26:59 +0530 Subject: [13] RFR: 8219612: [TESTBUG] compiler.codecache.stress.Helper.TestCaseImpl can't be defined in different runtime package as its nest host In-Reply-To: <65bc319b-edd7-614d-f081-c45c5daa862a@oracle.com> References: <65bc319b-edd7-614d-f081-c45c5daa862a@oracle.com> Message-ID: <24af1772-53b4-0341-92d6-b46e41a85e4d@oracle.com> Thank you Tobias. On 25/03/19 2:27 PM, Tobias Hartmann wrote: > Hi Rahul, > > this looks good to me. > > Best regards, > Tobias From per.liden at oracle.com Wed Mar 27 11:09:11 2019 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Mar 2019 12:09:11 +0100 Subject: RFR: 8221456: nmethod::make_unloaded() clears _method member too early In-Reply-To: <7b285295-3756-e07c-175f-ced23977554a@oracle.com> References: <7b285295-3756-e07c-175f-ced23977554a@oracle.com> Message-ID: <0a1c78e9-02d6-5469-b506-a63f3361ccd0@oracle.com> Thanks for reviewing, Tobias! /Per On 2019-03-27 11:58, Tobias Hartmann wrote: > Hi Per, > > this looks good to me. > > Best regards, > Tobias > > On 26.03.19 08:24, Per Liden wrote: >> nmethod::make_unloaded() clears the _method member too early, before passing the nmethod to >> CollectedHeap::unregister_nmethod(). 
This is not what happens when an nmethod is unregistered via >> nmethod::make_not_entrant_or_zombie(). We should align this behavior. Clearing the _method member >> after it has been unregistered is useful, since the GC can then print the method name/signature in >> logs, etc. Moving the clearing of _method until after CollectedHeap::unregister_nmethod() should be >> a safe and uncontroversial thing to do. >> >> Today, ZGC can crash if -Xlog:gc+nmethod=debug is used and an nmethod is unloaded via >> nmethod::make_unloaded(), because it tries to log the name of the method. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8221456 >> Webrev: http://cr.openjdk.java.net/~pliden/8221456/webrev.0 >> >> /Per From rahul.v.raghavan at oracle.com Wed Mar 27 12:30:39 2019 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Wed, 27 Mar 2019 18:00:39 +0530 Subject: [13] RFR: 8219612: [TESTBUG] compiler.codecache.stress.Helper.TestCaseImpl can't be defined in different runtime package as its nest host In-Reply-To: <8E118BA1-B8F8-47B8-86F5-86B46E36FF1E@oracle.com> References: <8E118BA1-B8F8-47B8-86F5-86B46E36FF1E@oracle.com> Message-ID: <72696c24-5495-3fff-44ef-9affb533f772@oracle.com> Thank you Igor. I will continue to push the fix. Also request help from Mandy to reply the backwards compatibility question correctly. (understood from JBS notes, the IncompatibleClassChangeError type failure would have been generated for existing TestCaseImpl itself, if it's implementation had actual nestmate relationship validation !) Thanks, Rahul On 25/03/19 11:14 PM, Igor Ignatyev wrote: > Hi Rahul, > > the fix looks good to me, I have a meta question though (most probably for Mandy): wouldn't this restriction break backwards compatibility? > > Thanks, > -- Igor > >> On Mar 25, 2019, at 12:51 AM, Rahul Raghavan wrote: >> >> Hi, >> >> Please review the following fix proposal for JDK-8219612. 
>> http://cr.openjdk.java.net/~rraghavan/8219612/webrev.00/ >> >> >> - https://bugs.openjdk.java.net/browse/JDK-8219612 >> - http://hg.openjdk.java.net/valhalla/valhalla/rev/ab7ea72963c9 >> >> Thanks to Mandy Chung. >> As mentioned in JBS above fix changeset is same as the one contributed by Mandy Chung, in valhalla repo nestmates branch. >> >> Testbug issue - compiler.codecache.stress.Helper.TestCaseImpl can't be defined in different runtime package as its nest host. >> >> Proposed fix - Test rewritten to use top-level classes rather then nested ones. >> >> >> Thanks, >> Rahul > From dmitrij.pochepko at bell-sw.com Wed Mar 27 13:33:17 2019 From: dmitrij.pochepko at bell-sw.com (Dmitrij Pochepko) Date: Wed, 27 Mar 2019 16:33:17 +0300 Subject: RFR: 8219654: AARCH64: Arrays:equals intrinsic documentation and maintenance improvement Message-ID: <04c0ba39-ff65-ea31-6b7e-cc7132f51e32@bell-sw.com> Hi all, please review patch for 8219654: AARCH64: Arrays:equals intrinsic documentation and maintenance improvement webrev: http://cr.openjdk.java.net/~dpochepk/8219654/webrev.01/ Changes: - added documentation and variable renaming - added stub prerequisites checks for debug build to guard against incorrect stub usage - removed unused code block (EARLY_OUT) - changed loop threshold constant calculation for stub code. It could be incorrect in several cases when non-default SoftwarePrefetchHintDistance value is specified - added test to cover all code branches Testing: - jtreg tests: hotspot/compiler, hotspot/runtime, hotspot/gc, jdk tier1-3 - jck no regressions found CR: https://bugs.openjdk.java.net/browse/JDK-8219654 I'd like to thank Pengfei Li for help with prereview. 
I have only one more documentation patch left to be sent (hasNegatives intrinsic) and would like to remind that unreviewed patch queue is already quite large: https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2019-January/006795.html https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2019-February/006940.html https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2019-February/006956.html https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2019-March/007028.html https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2019-March/007036.html Thanks, Dmitrij -------------- next part -------------- An HTML attachment was scrubbed... URL: From rahul.v.raghavan at oracle.com Wed Mar 27 13:44:54 2019 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Wed, 27 Mar 2019 19:14:54 +0530 Subject: [13] RFR: 8202414: Unsafe write after primitive array creation may result in array length change In-Reply-To: <3db5d7ab-ad99-310b-e891-fc36d25da338@oracle.com> References: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com> <392c665f-869c-29af-4fc5-e6f844820846@oracle.com> <3db5d7ab-ad99-310b-e891-fc36d25da338@oracle.com> Message-ID: Hi, Thank you Vladimir. Yes, tried following fix. (needed to add checks to avoid SIGFPE crash). + int size_in_bytes = st->memory_size(); + if ((size_in_bytes != 0) && (get_store_offset(st, phase) % size_in_bytes) != 0) { + return FAIL; + } - http://cr.openjdk.java.net/~rraghavan/8202414/webrev.02/ Confirmed no issues with testing for this revised fix. 
Thanks, Rahul On 26/03/19 1:03 AM, Vladimir Kozlov wrote: > > Suggestion: > > if ((get_store_offset(st, phase) % st->memory_size()) != 0) { > > Vladimir > > From vladimir.kozlov at oracle.com Wed Mar 27 16:44:39 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 27 Mar 2019 09:44:39 -0700 Subject: [13] RFR: 8202414: Unsafe write after primitive array creation may result in array length change In-Reply-To: References: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com> <392c665f-869c-29af-4fc5-e6f844820846@oracle.com> <3db5d7ab-ad99-310b-e891-fc36d25da338@oracle.com> Message-ID: <04642bf3-bab5-8430-0b04-3a09b1b305ff@oracle.com> Looks good. Thanks, Vladimir On 3/27/19 6:44 AM, Rahul Raghavan wrote: > Hi, > > Thank you Vladimir. > > Yes, tried following fix. > (needed to add checks to avoid SIGFPE crash). > > + int size_in_bytes = st->memory_size(); > + if ((size_in_bytes != 0) && (get_store_offset(st, phase) % size_in_bytes) != 0) { > + return FAIL; > + } > > > - http://cr.openjdk.java.net/~rraghavan/8202414/webrev.02/ > > Confirmed no issues with testing for this revised fix. > > Thanks, > Rahul > > On 26/03/19 1:03 AM, Vladimir Kozlov wrote: >> >> Suggestion: >> >> if ((get_store_offset(st, phase) % st->memory_size()) != 0) { >> >> Vladimir >> >> From vladimir.x.ivanov at oracle.com Wed Mar 27 17:12:55 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 27 Mar 2019 10:12:55 -0700 Subject: [13] RFR: 8202414: Unsafe write after primitive array creation may result in array length change In-Reply-To: References: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com> <392c665f-869c-29af-4fc5-e6f844820846@oracle.com> <3db5d7ab-ad99-310b-e891-fc36d25da338@oracle.com> Message-ID: <7b03a213-7fee-a87f-b48d-250662e730ef@oracle.com> First, I'd like to note that it's a good practice to include problem & root cause descriptions in the request. Otherwise, reviewers have to find that information themselves which complicates review process. 
(In this particular case, I found some analysis from the submitter [1] in the bug only after carefully reading through it.) On 27/03/2019 06:44, Rahul Raghavan wrote: > Hi, > > Thank you Vladimir. > > Yes, tried following fix. > (needed to add checks to avoid SIGFPE crash). > > + int size_in_bytes = st->memory_size(); > + if ((size_in_bytes != 0) && (get_store_offset(st, phase) % > size_in_bytes) != 0) { > + return FAIL; > + } > > > - http://cr.openjdk.java.net/~rraghavan/8202414/webrev.02/ It seems the problem is due to mismatched unsafe store being captured as a initializing one. Why not check for it explicitly? if (st->is_unaligned_access() || st->is_mismatched_access()) { return FAIL; } Best regards, Vladimir Ivanov [1] For your convenience, our analysis shows the problem may relate to array InitializeNode logic. It `capture_store` the the memory write of Unsafe.putInt. Since the putInt occupied offset range [17, 21] from the array pointer, then it decided to `clear_memory` of offset range [16, 17] of the array pointer. This range actually cannot pass the assert "assert((end_offset % BytesPerInt) == 0, "odd end offset")". While in jvm product mode, without the assert, the compiler falsely calculated to clear range [13, 17], which will clear the three most significant bytes of the `length` of this array. > > Confirmed no issues with testing for this revised fix. 
> > Thanks, > Rahul > > On 26/03/19 1:03 AM, Vladimir Kozlov wrote: >> >> Suggestion: >> >> if ((get_store_offset(st, phase) % st->memory_size()) != 0) { >> >> Vladimir >> >> From dean.long at oracle.com Thu Mar 28 00:27:12 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Wed, 27 Mar 2019 17:27:12 -0700 Subject: [13] RFR: 8202414: Unsafe write after primitive array creation may result in array length change In-Reply-To: <7b03a213-7fee-a87f-b48d-250662e730ef@oracle.com> References: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com> <392c665f-869c-29af-4fc5-e6f844820846@oracle.com> <3db5d7ab-ad99-310b-e891-fc36d25da338@oracle.com> <7b03a213-7fee-a87f-b48d-250662e730ef@oracle.com> Message-ID: I don't think we can use is_mismatched_access(), because we seem to have the same problem even if int[] is used. I took another look at this, and I believe we can fix this in InitializeNode::complete_stores(), while still allowing the captured store optimization. dl On 3/27/19 10:12 AM, Vladimir Ivanov wrote: > First, I'd like to note that it's a good practice to include problem & > root cause descriptions in the request. Otherwise, reviewers have to > find that information themselves which complicates review process. > > (In this particular case, I found some analysis from the submitter [1] > in the bug only after carefully reading through it.) > > On 27/03/2019 06:44, Rahul Raghavan wrote: >> Hi, >> >> Thank you Vladimir. >> >> Yes, tried following fix. >> (needed to add checks to avoid SIGFPE crash). >> >> + int size_in_bytes = st->memory_size(); >> + if ((size_in_bytes != 0) && (get_store_offset(st, phase) % >> size_in_bytes) != 0) { >> + return FAIL; >> + } >> >> >> - http://cr.openjdk.java.net/~rraghavan/8202414/webrev.02/ > > It seems the problem is due to mismatched unsafe store being captured > as a initializing one. Why not check for it explicitly? > > if (st->is_unaligned_access() || st->is_mismatched_access()) { > return FAIL; >
} > > Best regards, > Vladimir Ivanov > > > [1] > > For your convenience, our analysis shows the problem may relate to > array InitializeNode logic. > It `capture_store` the the memory write of Unsafe.putInt. > Since the putInt occupied offset range [17, 21] from the array pointer, > then it decided to `clear_memory` of offset range [16, 17] of the > array pointer. > This range actually cannot pass the assert "assert((end_offset % > BytesPerInt) == 0, "odd end offset")". > While in jvm product mode, without the assert, the compiler falsely > calculated to clear range [13, 17], > which will clear the three most significant bytes of the `length` of > this array. > > >> >> Confirmed no issues with testing for this revised fix. >> >> Thanks, >> Rahul >> >> On 26/03/19 1:03 AM, Vladimir Kozlov wrote: >>> >>> Suggestion: >>> >>> if ((get_store_offset(st, phase) % st->memory_size()) != 0) { >>> >>> Vladimir >>> >>> From vladimir.x.ivanov at oracle.com Thu Mar 28 06:21:45 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 27 Mar 2019 23:21:45 -0700 Subject: [13] RFR: 8202414: Unsafe write after primitive array creation may result in array length change In-Reply-To: References: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com> <392c665f-869c-29af-4fc5-e6f844820846@oracle.com> <3db5d7ab-ad99-310b-e891-fc36d25da338@oracle.com> <7b03a213-7fee-a87f-b48d-250662e730ef@oracle.com> Message-ID: On 27/03/2019 17:27, dean.long at oracle.com wrote: > I don't think we can use is_mismatched_access(), because we seem to have > the same problem even if int[] is used. I took another look at this, > and I believe we can fix this in InitializeNode::complete_stores(), > while still allowing the captured store optimization. Yes, good point. So, what you are saying is that is_mismatched_access() is not sufficient to cover all the cases and the only missing case is , right? 
I believe checking that offset is in doubn Best regards, Vladimir Ivanov > On 3/27/19 10:12 AM, Vladimir Ivanov wrote: >> First, I'd like to note that it's a good practice to include problem & >> root cause descriptions in the request. Otherwise, reviewers have to >> find that information themselves which complicates review process. >> >> (In this particular case, I found some analysis from the submitter [1] >> in the bug only after carefully reading through it.) >> >> On 27/03/2019 06:44, Rahul Raghavan wrote: >>> Hi, >>> >>> Thank you Vladimir. >>> >>> Yes, tried following fix. >>> (needed to add checks to avoid SIGFPE crash). >>> >>> + int size_in_bytes = st->memory_size(); >>> + if ((size_in_bytes != 0) && (get_store_offset(st, phase) % >>> size_in_bytes) != 0) { >>> + return FAIL; >>> + } >>> >>> >>> - http://cr.openjdk.java.net/~rraghavan/8202414/webrev.02/ >> >> It seems the problem is due to mismatched unsafe store being captured >> as a initializing one. Why not check for it explicitly? >> >> if (st->is_unaligned_access() || st->is_mismatched_access()) { >> return FAIL; >> } >> >> Best regards, >> Vladimir Ivanov >> >> >> [1] >> >> For your convenience, our analysis shows the problem may relate to >> array InitializeNode logic. >> It `capture_store` the the memory write of Unsafe.putInt. >> Since the putInt occupied offset range [17, 21] from the array pointer, >> then it decided to `clear_memory` of offset range [16, 17] of the >> array pointer. >> This range actually cannot pass the assert "assert((end_offset % >> BytesPerInt) == 0, "odd end offset")". >> While in jvm product mode, without the assert, the compiler falsely >> calculated to clear range [13, 17], >> which will clear the three most significant bytes of the `length` of >> this array. >> >> >>> >>> Confirmed no issues with testing for this revised fix. 
>>> >>> Thanks, >>> Rahul >>> >>> On 26/03/19 1:03 AM, Vladimir Kozlov wrote: >>>> >>>> Suggestion: >>>> >>>> if ((get_store_offset(st, phase) % st->memory_size()) != 0) { >>>> >>>> Vladimir >>>> >>>> > From vladimir.x.ivanov at oracle.com Thu Mar 28 06:23:20 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 27 Mar 2019 23:23:20 -0700 Subject: [13] RFR: 8202414: Unsafe write after primitive array creation may result in array length change In-Reply-To: References: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com> <392c665f-869c-29af-4fc5-e6f844820846@oracle.com> <3db5d7ab-ad99-310b-e891-fc36d25da338@oracle.com> <7b03a213-7fee-a87f-b48d-250662e730ef@oracle.com> Message-ID: Sorry, hit "send" too early. Please, ignore. Best regards, Vladimir Ivanov On 27/03/2019 23:21, Vladimir Ivanov wrote: > > On 27/03/2019 17:27, dean.long at oracle.com wrote: >> I don't think we can use is_mismatched_access(), because we seem to >> have the same problem even if int[] is used. I took another look at >> this, and I believe we can fix this in >> InitializeNode::complete_stores(), while still allowing the captured >> store optimization. > > Yes, good point. > > So, what you are saying is that is_mismatched_access() is not sufficient > to cover all the cases and the only missing case is , right? > > I believe checking that offset is in doubn >> On 3/27/19 10:12 AM, Vladimir Ivanov wrote: >>> First, I'd like to note that it's a good practice to include problem >>> & root cause descriptions in the request. Otherwise, reviewers have >>> to find that information themselves which complicates review process. >>> >>> (In this particular case, I found some analysis from the submitter >>> [1] in the bug only after carefully reading through it.) >>> >>> On 27/03/2019 06:44, Rahul Raghavan wrote: >>>> Hi, >>>> >>>> Thank you Vladimir. >>>> >>>> Yes, tried following fix. >>>> (needed to add checks to avoid SIGFPE crash). >>>> >>>> +
int size_in_bytes = st->memory_size(); >>>> + if ((size_in_bytes != 0) && (get_store_offset(st, phase) % >>>> size_in_bytes) != 0) { >>>> + return FAIL; >>>> + } >>>> >>>> >>>> - http://cr.openjdk.java.net/~rraghavan/8202414/webrev.02/ >>> >>> It seems the problem is due to mismatched unsafe store being captured >>> as a initializing one. Why not check for it explicitly? >>> >>> if (st->is_unaligned_access() || st->is_mismatched_access()) { >>> return FAIL; >>> } >>> >>> Best regards, >>> Vladimir Ivanov >>> >>> >>> [1] >>> >>> For your convenience, our analysis shows the problem may relate to >>> array InitializeNode logic. >>> It `capture_store` the the memory write of Unsafe.putInt. >>> Since the putInt occupied offset range [17, 21] from the array pointer, >>> then it decided to `clear_memory` of offset range [16, 17] of the >>> array pointer. >>> This range actually cannot pass the assert "assert((end_offset % >>> BytesPerInt) == 0, "odd end offset")". >>> While in jvm product mode, without the assert, the compiler falsely >>> calculated to clear range [13, 17], >>> which will clear the three most significant bytes of the `length` of >>> this array. >>> >>> >>>> >>>> Confirmed no issues with testing for this revised fix. 
>>>> >>>> Thanks, >>>> Rahul >>>> >>>> On 26/03/19 1:03 AM, Vladimir Kozlov wrote: >>>>> >>>>> Suggestion: >>>>> >>>>> if ((get_store_offset(st, phase) % st->memory_size()) != 0) { >>>>> >>>>> Vladimir >>>>> >>>>> >> From vladimir.x.ivanov at oracle.com Thu Mar 28 06:21:51 2019 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 27 Mar 2019 23:21:51 -0700 Subject: RFR: 8221542: ~15% performance degradation due to less optimized inline decision In-Reply-To: References: Message-ID: Hi Jie, The heuristic quirk looks very similar to the one Sergey reported recently: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-February/032623.html Overall, tweaking the heuristic to favor inlining doesn't look the right thing here. profile.count=0 is a sign the profile isn't mature enough and it's likely the callee doesn't have enough profiling info as well. (And that's what Sergey observed on some of the microbenchmarks during his experiments.) In your particular case (Random::), tweaking the heuristic so is_init_with_ea [1] overrules "profile.count > 0" may be a more promising approach. After all, the fact that the call site is being considered for inlining (and not pruned along with the basic block it belongs to) is a strong signal in favor of "profile.count > 0" case. (Though it's not guaranteed due to the immaturity of profile data.) But IMO the root problem is that top-tier compilation happens too early: profile data isn't mature enough yet and it will easily lead to similar problems later (during compilation). Best regards, Vladimir Ivanov [1] http://hg.openjdk.java.net/jdk/jdk/file/9c84d2865c2d/src/hotspot/share/opto/bytecodeInfo.cpp#l81 On 27/03/2019 03:15, Jie Fu wrote: > Hi all, > > JBS:
https://bugs.openjdk.java.net/browse/JDK-8221542 > Webrev: http://cr.openjdk.java.net/~jiefu/monte_carlo-perf-drop/webrev.00/ > > ## Symptom > ~15% performance degradation (from 700 ops/m to 600 ops/m) was observed > randomly on x86 while running SPECjvm2008's scimark.monte_carlo with > -XX:-TieredCompilation. > > ## Reproduce > It can be always reproduced with the script[1] in less than 5 minutes. > > ## Reason > The drop was caused by a not-inline decision on > spec.benchmarks.scimark.utils.Random:: in > spec.benchmarks.scimark.monte_carlo.MonteCarlo::integrate. > > ## Fix > It might be better to make a little change to the inline heuristic[2]. > > For callers without loops, the original heuristic works fine. > But for callers with loops, it would be better to make a not-inline > decision more conservatively. > > ## Testing > - Running scimark.monte_carlo on jdk/x64 with -XX:-TieredCompilation for > about 5000 times, no performance drop > Also on jdk8u/mips64 with -XX:-TieredCompilation, no performance drop > - Running make test TEST="micro" on jdk/x64, no performance regression > - Running SPECjvm2008 on jdk8u/x64 with -XX:-TieredCompilation, no > performance regression > > For more detailed info, please see the JBS. > > Could you please review it? > Thanks a lot. > > Best regards, > Jie > > [1] http://cr.openjdk.java.net/~jiefu/monte_carlo-perf-drop/reproduce.sh > [2] > http://hg.openjdk.java.net/jdk/jdk/file/0a2d73e02076/src/hotspot/share/opto/bytecodeInfo.cpp#l375 > > > From jesper.wilhelmsson at oracle.com Thu Mar 28 07:37:23 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Thu, 28 Mar 2019 08:37:23 +0100 Subject: RFR: JDK-8221341 - Update Graal Message-ID: <9B53B1EC-1420-4844-9276-EA25E8F13987@oracle.com> Hi, Please review the patch to integrate the latest Graal changes into OpenJDK. 
Graal tip to integrate: 7970bd76ff60600ab5a2fc96cd24ddd7ed017cf8

JBS duplicates fixed by this integration:
https://bugs.openjdk.java.net/browse/JDK-8220643
https://bugs.openjdk.java.net/browse/JDK-8220810

JBS duplicates deferred to the next integration:
https://bugs.openjdk.java.net/browse/JDK-8214947

Bug: https://bugs.openjdk.java.net/browse/JDK-8221341
Webrev: http://cr.openjdk.java.net/~jwilhelm/8221341/webrev.00/

Thanks,
/Jesper

From fujie at loongson.cn Thu Mar 28 07:54:16 2019
From: fujie at loongson.cn (Jie Fu)
Date: Thu, 28 Mar 2019 15:54:16 +0800
Subject: RFR: 8221542: ~15% performance degradation due to less optimized inline decision
In-Reply-To:
References:
Message-ID: <755f3890-02de-8649-15b5-4789d279b8fe@loongson.cn>

Hi Vladimir,

Thanks for your review and valuable suggestions.
I will study your suggestions and Sergey's discussion to find a better solution.

Thanks a lot.
Best regards,
Jie

On 2019/3/28 2:21 PM, Vladimir Ivanov wrote:
> Hi Jie,
>
> The heuristic quirk looks very similar to the one Sergey reported
> recently:
>
>
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-February/032623.html
>
>
> Overall, tweaking the heuristic to favor inlining doesn't look the
> right thing here. profile.count=0 is a sign the profile isn't mature
> enough and it's likely the callee doesn't have enough profiling info
> as well. (And that's what Sergey observed on some of the
> microbenchmarks during his experiments.)
>
> In your particular case (Random::<init>), tweaking the heuristic so
> is_init_with_ea [1] overrules "profile.count > 0" may be a more
> promising approach. After all, the fact that the call site is being
> considered for inlining (and not pruned along with the basic block it
> belongs to) is a strong signal in favor of "profile.count > 0" case.
> (Though it's not guaranteed due to the immaturity of profile data.) > > But IMO the root problem is that top-tier compilation happens too > early: profile data isn't mature enough yet and it will easily lead to > similar problems later (during compilation). > > Best regards, > Vladimir Ivanov > > [1] > http://hg.openjdk.java.net/jdk/jdk/file/9c84d2865c2d/src/hotspot/share/opto/bytecodeInfo.cpp#l81 > > On 27/03/2019 03:15, Jie Fu wrote: >> Hi all, >> >> JBS:??? https://bugs.openjdk.java.net/browse/JDK-8221542 >> Webrev: >> http://cr.openjdk.java.net/~jiefu/monte_carlo-perf-drop/webrev.00/ >> >> ## Symptom >> ~15% performance degradation (from 700 ops/m to 600 ops/m) was >> observed randomly on x86 while running SPECjvm2008's >> scimark.monte_carlo with -XX:-TieredCompilation. >> >> ## Reproduce >> It can be always reproduced with the script[1] in less than 5 minutes. >> >> ## Reason >> The drop was caused by a not-inline decision on >> spec.benchmarks.scimark.utils.Random:: in >> spec.benchmarks.scimark.monte_carlo.MonteCarlo::integrate. >> >> ## Fix >> It might be better to make a little change to the inline heuristic[2]. >> >> For callers without loops, the original heuristic works fine. >> But for callers with loops, it would be better to make a not-inline >> decision more conservatively. >> >> ## Testing >> - Running scimark.monte_carlo on jdk/x64 with -XX:-TieredCompilation >> for about 5000 times, no performance drop >> ?? Also on jdk8u/mips64 with -XX:-TieredCompilation, no performance drop >> - Running make test TEST="micro" on jdk/x64, no performance regression >> - Running SPECjvm2008 on jdk8u/x64 with -XX:-TieredCompilation, no >> performance regression >> >> For more detailed info, please see the JBS. >> >> Could you please review it? >> Thanks a lot. 
>>
>> Best regards,
>> Jie
>>
>> [1] http://cr.openjdk.java.net/~jiefu/monte_carlo-perf-drop/reproduce.sh
>> [2]
>> http://hg.openjdk.java.net/jdk/jdk/file/0a2d73e02076/src/hotspot/share/opto/bytecodeInfo.cpp#l375
>>
>>
>>

From lutz.schmidt at sap.com Thu Mar 28 13:19:40 2019
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Thu, 28 Mar 2019 13:19:40 +0000
Subject: RFR(XS): 8221482: Initialize VMRegImpl::regName[] earlier to prevent assert during PrintStubCode
In-Reply-To: <30549BAC-6DA9-45D5-A4AC-32BC243E7F2B@sap.com>
References: <30549BAC-6DA9-45D5-A4AC-32BC243E7F2B@sap.com>
Message-ID: <294B2CF4-0251-4623-BBA4-8A36F115B133@sap.com>

Cross-posting to hotspot-compiler-dev because bug was moved from runtime to compiler...

On 28.03.19, 12:14, "Schmidt, Lutz" wrote:

Dear Community,

may I please request reviews for this tiny change. The purpose is to initialize the regName[] array earlier during VM init.

Bug: https://bugs.openjdk.java.net/browse/JDK-8221482
Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8221482.01/

Submit-repo tests pending...

Thanks,
Lutz

From vladimir.kozlov at oracle.com Thu Mar 28 16:30:11 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 28 Mar 2019 09:30:11 -0700
Subject: RFR(XS): 8221482: Initialize VMRegImpl::regName[] earlier to prevent assert during PrintStubCode
In-Reply-To: <294B2CF4-0251-4623-BBA4-8A36F115B133@sap.com>
References: <30549BAC-6DA9-45D5-A4AC-32BC243E7F2B@sap.com> <294B2CF4-0251-4623-BBA4-8A36F115B133@sap.com>
Message-ID: <00031742-7e6a-90a0-3e77-8259c587e57b@oracle.com>

Make sense. Good.

Thanks,
Vladimir

On 3/28/19 6:19 AM, Schmidt, Lutz wrote:
> Cross-posting to hotspot-compiler-dev because bug was moved from runtime to compiler...
>
> On 28.03.19, 12:14, "Schmidt, Lutz" wrote:
>
> Dear Community,
>
> may I please request reviews for this tiny change. The purpose is to initialize the regName[] array earlier during VM init.
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8221482 > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8221482.01/ > > Submit-repo tests pending... > > Thanks, > Lutz > > > > From lutz.schmidt at sap.com Thu Mar 28 16:29:10 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Thu, 28 Mar 2019 16:29:10 +0000 Subject: RFR(XS): 8221482: Initialize VMRegImpl::regName[] earlier to prevent assert during PrintStubCode In-Reply-To: <00031742-7e6a-90a0-3e77-8259c587e57b@oracle.com> References: <30549BAC-6DA9-45D5-A4AC-32BC243E7F2B@sap.com> <294B2CF4-0251-4623-BBA4-8A36F115B133@sap.com> <00031742-7e6a-90a0-3e77-8259c587e57b@oracle.com> Message-ID: Hi Vladimir, thanks for reviewing. May I consider this change trivial? Regards, Lutz ?On 28.03.19, 17:30, "Vladimir Kozlov" wrote: Make sense. Good. Thanks, Vladimir On 3/28/19 6:19 AM, Schmidt, Lutz wrote: > Cross-posting to hotspot-compiler-dev because bug was moved from runtime to compiler... > > On 28.03.19, 12:14, "Schmidt, Lutz" wrote: > > Dear Community, > > may I please request reviews for this tiny change. The purpose is to initialize the regName[] array earlier during VM init. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8221482 > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8221482.01/ > > Submit-repo tests pending... > > Thanks, > Lutz > > > > From vladimir.kozlov at oracle.com Thu Mar 28 16:50:55 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 28 Mar 2019 09:50:55 -0700 Subject: RFR(XS): 8221482: Initialize VMRegImpl::regName[] earlier to prevent assert during PrintStubCode In-Reply-To: References: <30549BAC-6DA9-45D5-A4AC-32BC243E7F2B@sap.com> <294B2CF4-0251-4623-BBA4-8A36F115B133@sap.com> <00031742-7e6a-90a0-3e77-8259c587e57b@oracle.com> Message-ID: Yes, it is trivial but wait results of testing before pushing. Thanks, Vladimir On 3/28/19 9:29 AM, Schmidt, Lutz wrote: > Hi Vladimir, > thanks for reviewing. May I consider this change trivial? 
> Regards, > Lutz > > ?On 28.03.19, 17:30, "Vladimir Kozlov" wrote: > > Make sense. Good. > > Thanks, > Vladimir > > On 3/28/19 6:19 AM, Schmidt, Lutz wrote: > > Cross-posting to hotspot-compiler-dev because bug was moved from runtime to compiler... > > > > On 28.03.19, 12:14, "Schmidt, Lutz" wrote: > > > > Dear Community, > > > > may I please request reviews for this tiny change. The purpose is to initialize the regName[] array earlier during VM init. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8221482 > > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8221482.01/ > > > > Submit-repo tests pending... > > > > Thanks, > > Lutz > > > > > > > > > > From lutz.schmidt at sap.com Thu Mar 28 16:53:17 2019 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Thu, 28 Mar 2019 16:53:17 +0000 Subject: RFR(XS): 8221482: Initialize VMRegImpl::regName[] earlier to prevent assert during PrintStubCode In-Reply-To: References: <30549BAC-6DA9-45D5-A4AC-32BC243E7F2B@sap.com> <294B2CF4-0251-4623-BBA4-8A36F115B133@sap.com> <00031742-7e6a-90a0-3e77-8259c587e57b@oracle.com> Message-ID: <83C16D57-3F27-446F-8E63-B15A19B712C9@sap.com> Sure, will wait... ?On 28.03.19, 17:50, "Vladimir Kozlov" wrote: Yes, it is trivial but wait results of testing before pushing. Thanks, Vladimir On 3/28/19 9:29 AM, Schmidt, Lutz wrote: > Hi Vladimir, > thanks for reviewing. May I consider this change trivial? > Regards, > Lutz > > On 28.03.19, 17:30, "Vladimir Kozlov" wrote: > > Make sense. Good. > > Thanks, > Vladimir > > On 3/28/19 6:19 AM, Schmidt, Lutz wrote: > > Cross-posting to hotspot-compiler-dev because bug was moved from runtime to compiler... > > > > On 28.03.19, 12:14, "Schmidt, Lutz" wrote: > > > > Dear Community, > > > > may I please request reviews for this tiny change. The purpose is to initialize the regName[] array earlier during VM init. 
> > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8221482 > > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8221482.01/ > > > > Submit-repo tests pending... > > > > Thanks, > > Lutz > > > > > > > > > > From vladimir.kozlov at oracle.com Thu Mar 28 17:48:05 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 28 Mar 2019 10:48:05 -0700 Subject: RFR: JDK-8221341 - Update Graal In-Reply-To: <9B53B1EC-1420-4844-9276-EA25E8F13987@oracle.com> References: <9B53B1EC-1420-4844-9276-EA25E8F13987@oracle.com> Message-ID: <9ce1ee74-d524-e29e-3175-48279af339a2@oracle.com> On 3/28/19 12:37 AM, jesper.wilhelmsson at oracle.com wrote: > Hi, > > Please review the patch to integrate the latest Graal changes into OpenJDK. > Graal tip to integrate: 7970bd76ff60600ab5a2fc96cd24ddd7ed017cf8 > > JBS duplicates fixed by this integration: > https://bugs.openjdk.java.net/browse/JDK-8220643 > https://bugs.openjdk.java.net/browse/JDK-8220810 > > JBS duplicates deferred to the next integration: > https://bugs.openjdk.java.net/browse/JDK-8214947 We should investigate why this bug is still referenced in RFR. We already discussed it: https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-March/033130.html > > Bug: https://bugs.openjdk.java.net/browse/JDK-8221341 > Webrev: http://cr.openjdk.java.net/~jwilhelm/8221341/webrev.00/ We also discussed indentation change in make/test/JtregGraalUnit.gmk Why the change showed up again? Otherwise changes looks good. Doug, it is good to have only one Graal class to access Unsafe class - GraalUnsafeAccess.java. But we should not use sun.misc.Unsafe in JDK 13 - we should have version for JDK9+ which use jdk.internal.misc.Unsafe. It is for an other update. 
Thanks,
Vladimir

>
> Thanks,
> /Jesper
>

From doug.simon at oracle.com Thu Mar 28 17:51:36 2019
From: doug.simon at oracle.com (Doug Simon)
Date: Thu, 28 Mar 2019 18:51:36 +0100
Subject: RFR: JDK-8221341 - Update Graal
In-Reply-To: <9ce1ee74-d524-e29e-3175-48279af339a2@oracle.com>
References: <9B53B1EC-1420-4844-9276-EA25E8F13987@oracle.com> <9ce1ee74-d524-e29e-3175-48279af339a2@oracle.com>
Message-ID: <71A07FCC-2B29-4856-872A-3D028BE66B80@oracle.com>

> On 28 Mar 2019, at 18:48, Vladimir Kozlov wrote:
>
> On 3/28/19 12:37 AM, jesper.wilhelmsson at oracle.com wrote:
>> Hi,
>> Please review the patch to integrate the latest Graal changes into OpenJDK.
>> Graal tip to integrate: 7970bd76ff60600ab5a2fc96cd24ddd7ed017cf8
>> JBS duplicates fixed by this integration:
>> https://bugs.openjdk.java.net/browse/JDK-8220643
>> https://bugs.openjdk.java.net/browse/JDK-8220810
>> JBS duplicates deferred to the next integration:
>> https://bugs.openjdk.java.net/browse/JDK-8214947
>
> We should investigate why this bug is still referenced in RFR. We already discussed it:
> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-March/033130.html
>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8221341
>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8221341/webrev.00/
>
> We also discussed indentation change in make/test/JtregGraalUnit.gmk
> Why the change showed up again?
>
> Otherwise changes looks good.
>
> Doug, it is good to have only one Graal class to access Unsafe class - GraalUnsafeAccess.java.
> But we should not use sun.misc.Unsafe in JDK 13 - we should have version for JDK9+ which use jdk.internal.misc.Unsafe. It is for an other update.

Shouldn't we use sun.misc.Unsafe for as long as it's available? The advantage is that it is publicly exported and means no need for --add-exports when running/testing Graal from outside JDK.
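
The export argument can be illustrated with a minimal Java sketch. It assumes the widely used reflective lookup of the `theUnsafe` field; the class name `UnsafeLookupSketch` is hypothetical and is not taken from Graal's GraalUnsafeAccess.java. Because sun.misc.Unsafe sits in the exported jdk.unsupported module, code like this should compile (with compiler warnings) and run without any --add-exports option, whereas a jdk.internal.misc.Unsafe variant used from outside java.base would need one.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical illustration only; not the Graal implementation.
final class UnsafeLookupSketch {

    // Looks up the singleton via reflection, since Unsafe.getUnsafe()
    // rejects callers that were not loaded by a trusted class loader.
    static Unsafe lookup() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    public static void main(String[] args) {
        Unsafe unsafe = lookup();
        System.out.println("obtained sun.misc.Unsafe: " + (unsafe != null));
    }
}
```

Keeping the lookup in a single class, as the review comment notes for GraalUnsafeAccess.java, means only one place has to change if the accessor is later switched to another Unsafe class.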
-Doug From vladimir.kozlov at oracle.com Thu Mar 28 18:10:58 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 28 Mar 2019 11:10:58 -0700 Subject: RFR: JDK-8221341 - Update Graal In-Reply-To: <71A07FCC-2B29-4856-872A-3D028BE66B80@oracle.com> References: <9B53B1EC-1420-4844-9276-EA25E8F13987@oracle.com> <9ce1ee74-d524-e29e-3175-48279af339a2@oracle.com> <71A07FCC-2B29-4856-872A-3D028BE66B80@oracle.com> Message-ID: <35b291f0-565b-b722-b8e5-aeee732a5f1a@oracle.com> > Shouldn?t we use sun.misc.Unsafe for as long as it?s available? The advantage is that it is publicly exported and means no need for ?add-exports when running/testing Graal from outside JDK. I thought it is oversight. But I am fine if it is done intentionally. Agree. Vladimir On 3/28/19 10:51 AM, Doug Simon wrote: > > >> On 28 Mar 2019, at 18:48, Vladimir Kozlov wrote: >> >> On 3/28/19 12:37 AM, jesper.wilhelmsson at oracle.com wrote: >>> Hi, >>> Please review the patch to integrate the latest Graal changes into OpenJDK. >>> Graal tip to integrate: 7970bd76ff60600ab5a2fc96cd24ddd7ed017cf8 >>> JBS duplicates fixed by this integration: >>> https://bugs.openjdk.java.net/browse/JDK-8220643 >>> https://bugs.openjdk.java.net/browse/JDK-8220810 >>> JBS duplicates deferred to the next integration: >>> https://bugs.openjdk.java.net/browse/JDK-8214947 >> >> We should investigate why this bug is still referenced in RFR. We already discussed it: >> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-March/033130.html >> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8221341 >>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8221341/webrev.00/ >> >> We also discussed indentation change in make/test/JtregGraalUnit.gmk >> Why the change showed up again? >> >> Otherwise changes looks good. >> >> Doug, it is good to have only one Graal class to access Unsafe class - GraalUnsafeAccess.java. 
>> But we should not use sun.misc.Unsafe in JDK 13 - we should have version for JDK9+ which use jdk.internal.misc.Unsafe. It is for an other update. > > Shouldn?t we use sun.misc.Unsafe for as long as it?s available? The advantage is that it is publicly exported and means no need for ?add-exports when running/testing Graal from outside JDK. > > -Doug > From jesper.wilhelmsson at oracle.com Thu Mar 28 18:33:23 2019 From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com) Date: Thu, 28 Mar 2019 19:33:23 +0100 Subject: RFR: JDK-8221341 - Update Graal In-Reply-To: <9ce1ee74-d524-e29e-3175-48279af339a2@oracle.com> References: <9B53B1EC-1420-4844-9276-EA25E8F13987@oracle.com> <9ce1ee74-d524-e29e-3175-48279af339a2@oracle.com> Message-ID: Hi Vladimir, Thanks for reviewing! > On 28 Mar 2019, at 18:48, Vladimir Kozlov wrote: > > On 3/28/19 12:37 AM, jesper.wilhelmsson at oracle.com wrote: >> Hi, >> Please review the patch to integrate the latest Graal changes into OpenJDK. >> Graal tip to integrate: 7970bd76ff60600ab5a2fc96cd24ddd7ed017cf8 >> JBS duplicates fixed by this integration: >> https://bugs.openjdk.java.net/browse/JDK-8220643 >> https://bugs.openjdk.java.net/browse/JDK-8220810 >> JBS duplicates deferred to the next integration: >> https://bugs.openjdk.java.net/browse/JDK-8214947 > > We should investigate why this bug is still referenced in RFR. We already discussed it: > https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-March/033130.html Oops, sorry, missed that. I have removed the link from the next update issue so it won't show up again. >> Bug: https://bugs.openjdk.java.net/browse/JDK-8221341 >> Webrev: http://cr.openjdk.java.net/~jwilhelm/8221341/webrev.00/ > > We also discussed indentation change in make/test/JtregGraalUnit.gmk > Why the change showed up again? My understanding was that this was caused by a bug in the mx script. As long as that bug is there this will keep happening. 
I will try to remember to revert this going forward, but it is something that I will need to do manually every time. Someone should fix the mx script. I don't know who owns that though.

Thanks,
/Jesper

>
> Otherwise changes looks good.
>
> Doug, it is good to have only one Graal class to access Unsafe class - GraalUnsafeAccess.java.
> But we should not use sun.misc.Unsafe in JDK 13 - we should have version for JDK9+ which use jdk.internal.misc.Unsafe. It is for an other update.
>
> Thanks,
> Vladimir
>
>> Thanks,
>> /Jesper

From vladimir.x.ivanov at oracle.com Thu Mar 28 18:42:12 2019
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 28 Mar 2019 11:42:12 -0700
Subject: [13] RFR: 8202414: Unsafe write after primitive array creation may result in array length change
In-Reply-To:
References: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com> <392c665f-869c-29af-4fc5-e6f844820846@oracle.com> <3db5d7ab-ad99-310b-e891-fc36d25da338@oracle.com> <7b03a213-7fee-a87f-b48d-250662e730ef@oracle.com>
Message-ID:

> I don't think we can use is_mismatched_access(), because we seem to have
> the same problem even if int[] is used. I took another look at this,
> and I believe we can fix this in InitializeNode::complete_stores(),
> while still allowing the captured store optimization.

Yes, good point.

So, IMO the main question is what's the best place to fix it:
  InitializeNode::complete_stores() vs InitializeNode::can_capture_store

Capturing stores and then InitializeNode::complete_stores() looks attractive, but there are other cases arising with unsafe accesses which aren't tested well yet. So, if we choose that route, the regression test should be improved.
On the other hand, "!st->is_unaligned_access()" constraint in InitializeNode::can_capture_store() can be removed.

Forbidding mismatched accesses in InitializeNode::can_capture_store (both marked as such and based on actual offset) looks like a safer fix to me: it keeps InitializeNode::complete_stores() exposed only to well-behaved accesses.

How much do we lose by not capturing mismatched/unaligned initialized stores? Does it worth optimizing for it?

Best regards,
Vladimir Ivanov

> On 3/27/19 10:12 AM, Vladimir Ivanov wrote:
>> First, I'd like to note that it's a good practice to include problem &
>> root cause descriptions in the request. Otherwise, reviewers have to
>> find that information themselves which complicates review process.
>>
>> (In this particular case, I found some analysis from the submitter [1]
>> in the bug only after carefully reading through it.)
>>
>> On 27/03/2019 06:44, Rahul Raghavan wrote:
>>> Hi,
>>>
>>> Thank you Vladimir.
>>>
>>> Yes, tried following fix.
>>> (needed to add checks to avoid SIGFPE crash).
>>>
>>> +  int size_in_bytes = st->memory_size();
>>> +  if ((size_in_bytes != 0) && (get_store_offset(st, phase) %
>>> size_in_bytes) != 0) {
>>> +    return FAIL;
>>> +  }
>>>
>>>
>>> - http://cr.openjdk.java.net/~rraghavan/8202414/webrev.02/
>>
>> It seems the problem is due to mismatched unsafe store being captured
>> as a initializing one. Why not check for it explicitly?
>>
>>    if (st->is_unaligned_access() || st->is_mismatched_access()) {
>>      return FAIL;
>>    }
>>
>> Best regards,
>> Vladimir Ivanov
>>
>>
>> [1]
>>
>> For your convenience, our analysis shows the problem may relate to
>> array InitializeNode logic.
>> It `capture_store` the memory write of Unsafe.putInt.
>> Since the putInt occupied offset range [17, 21] from the array pointer,
>> then it decided to `clear_memory` of offset range [16, 17] of the
>> array pointer.
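
The alignment arithmetic behind the proposed check can be sketched in Java as follows. This is only a model of the modulo test quoted above, not HotSpot code: `StoreAlignmentSketch` and `isNaturallyAligned` are hypothetical names, and the offsets 16 and 17 are the ones mentioned in the quoted analysis. A zero size skips the test so that the modulo cannot divide by zero, which is the reason given for the extra guard in the patch.

```java
// Hypothetical model of the offset check discussed in this thread.
final class StoreAlignmentSketch {

    // Returns true when a store of sizeInBytes at the given offset passes
    // the modulo test. A size of 0 skips the test (no division by zero).
    static boolean isNaturallyAligned(long offset, int sizeInBytes) {
        if (sizeInBytes == 0) {
            return true;
        }
        return offset % sizeInBytes == 0;
    }

    public static void main(String[] args) {
        // 4-byte store at offset 17: 17 % 4 == 1, so it would be rejected.
        System.out.println(isNaturallyAligned(17, 4)); // false
        // 4-byte store at offset 16: 16 % 4 == 0, so it would be accepted.
        System.out.println(isNaturallyAligned(16, 4)); // true
    }
}
```

In terms of the patch, a `false` result here corresponds to the `return FAIL` branch, so the misaligned store would not be captured as an initializing store.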
>> This range actually cannot pass the assert "assert((end_offset % >> BytesPerInt) == 0, "odd end offset")". >> While in jvm product mode, without the assert, the compiler falsely >> calculated to clear range [13, 17], >> which will clear the three most significant bytes of the `length` of >> this array. >> >> >>> >>> Confirmed no issues with testing for this revised fix. >>> >>> Thanks, >>> Rahul >>> >>> On 26/03/19 1:03 AM, Vladimir Kozlov wrote: >>>> >>>> Suggestion: >>>> >>>> if ((get_store_offset(st, phase) % st->memory_size()) != 0) { >>>> >>>> Vladimir >>>> >>>> > From doug.simon at oracle.com Thu Mar 28 18:52:01 2019 From: doug.simon at oracle.com (Doug Simon) Date: Thu, 28 Mar 2019 19:52:01 +0100 Subject: RFR: JDK-8221341 - Update Graal In-Reply-To: References: <9B53B1EC-1420-4844-9276-EA25E8F13987@oracle.com> <9ce1ee74-d524-e29e-3175-48279af339a2@oracle.com> Message-ID: <6192C685-DB3B-4399-8D28-B8E1F8D4AE72@oracle.com> > On 28 Mar 2019, at 19:33, jesper.wilhelmsson at oracle.com wrote: > > Hi Vladimir, > > Thanks for reviewing! > >> On 28 Mar 2019, at 18:48, Vladimir Kozlov > wrote: >> >> On 3/28/19 12:37 AM, jesper.wilhelmsson at oracle.com wrote: >>> Hi, >>> Please review the patch to integrate the latest Graal changes into OpenJDK. >>> Graal tip to integrate: 7970bd76ff60600ab5a2fc96cd24ddd7ed017cf8 >>> JBS duplicates fixed by this integration: >>> https://bugs.openjdk.java.net/browse/JDK-8220643 >>> https://bugs.openjdk.java.net/browse/JDK-8220810 >>> JBS duplicates deferred to the next integration: >>> https://bugs.openjdk.java.net/browse/JDK-8214947 >> >> We should investigate why this bug is still referenced in RFR. We already discussed it: >> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-March/033130.html > > Oops, sorry, missed that. I have removed the link from the next update issue so it won't show up again. 
>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8221341
>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8221341/webrev.00/
>>
>> We also discussed indentation change in make/test/JtregGraalUnit.gmk
>> Why the change showed up again?
>
> My understanding was that this was caused by a bug in the mx script. As long as that bug is there this will keep happening. I will try to remember to revert this going forward, but it is something that I will need to do manually every time. Someone should fix the mx script. I don't know who owns that though.

Anyone who can code Python and submit a pull request ;-) I believe these are the relevant lines:

https://github.com/oracle/graal/blob/master/compiler/mx.compiler/mx_updategraalinopenjdk.py#L297-L308

-Doug

>
>>
>> Otherwise changes looks good.
>>
>> Doug, it is good to have only one Graal class to access Unsafe class - GraalUnsafeAccess.java.
>> But we should not use sun.misc.Unsafe in JDK 13 - we should have version for JDK9+ which use jdk.internal.misc.Unsafe. It is for an other update.
>>
>> Thanks,
>> Vladimir
>>
>>> Thanks,
>>> /Jesper
>

From jesper.wilhelmsson at oracle.com Fri Mar 29 00:26:27 2019
From: jesper.wilhelmsson at oracle.com (jesper.wilhelmsson at oracle.com)
Date: Fri, 29 Mar 2019 01:26:27 +0100
Subject: RFR: JDK-8221341 - Update Graal
In-Reply-To: <6192C685-DB3B-4399-8D28-B8E1F8D4AE72@oracle.com>
References: <9B53B1EC-1420-4844-9276-EA25E8F13987@oracle.com> <9ce1ee74-d524-e29e-3175-48279af339a2@oracle.com> <6192C685-DB3B-4399-8D28-B8E1F8D4AE72@oracle.com>
Message-ID:

> On 28 Mar 2019, at 19:52, Doug Simon wrote:
>> On 28 Mar 2019, at 19:33, jesper.wilhelmsson at oracle.com wrote:
>>
>> Hi Vladimir,
>>
>> Thanks for reviewing!
>> >>> On 28 Mar 2019, at 18:48, Vladimir Kozlov > wrote: >>> >>> On 3/28/19 12:37 AM, jesper.wilhelmsson at oracle.com wrote: >>>> Hi, >>>> Please review the patch to integrate the latest Graal changes into OpenJDK. >>>> Graal tip to integrate: 7970bd76ff60600ab5a2fc96cd24ddd7ed017cf8 >>>> JBS duplicates fixed by this integration: >>>> https://bugs.openjdk.java.net/browse/JDK-8220643 >>>> https://bugs.openjdk.java.net/browse/JDK-8220810 >>>> JBS duplicates deferred to the next integration: >>>> https://bugs.openjdk.java.net/browse/JDK-8214947 >>> >>> We should investigate why this bug is still referenced in RFR. We already discussed it: >>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-March/033130.html >> >> Oops, sorry, missed that. I have removed the link from the next update issue so it won't show up again. >> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8221341 >>>> Webrev: http://cr.openjdk.java.net/~jwilhelm/8221341/webrev.00/ >>> >>> We also discussed indentation change in make/test/JtregGraalUnit.gmk >>> Why the change showed up again? >> >> My understanding was that this was caused by a bug in the mx script. As long as that bug is there this will keep happening. I will try to remember to revert this going forward, but it is something that I will need to do manually every time. Someone should fix the mx script. I don't know who owns that though. > > Anyone who can code Python and submit a pull request ;-) I believe these are the relevant lines: > > https://github.com/oracle/graal/blob/master/compiler/mx.compiler/mx_updategraalinopenjdk.py#L297-L308 It seems to me that any logic to figure out the correct indentation would be fragile at best. The change that caused this breakage was cleaning up the indentation to make it the same as the rest of the file. I wouldn't expect this to change again in a way that wouldn't require the logic to change as well. I suggest to simply add the two missing spaces in line 304. 
/Jesper

> -Doug
>
>>
>>>
>>> Otherwise changes looks good.
>>>
>>> Doug, it is good to have only one Graal class to access Unsafe class - GraalUnsafeAccess.java.
>>> But we should not use sun.misc.Unsafe in JDK 13 - we should have version for JDK9+ which use jdk.internal.misc.Unsafe. It is for an other update.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>> Thanks,
>>>> /Jesper

From vladimir.kozlov at oracle.com Fri Mar 29 00:39:19 2019
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 28 Mar 2019 17:39:19 -0700
Subject: RFR: JDK-8221341 - Update Graal
In-Reply-To:
References: <9B53B1EC-1420-4844-9276-EA25E8F13987@oracle.com> <9ce1ee74-d524-e29e-3175-48279af339a2@oracle.com> <6192C685-DB3B-4399-8D28-B8E1F8D4AE72@oracle.com>
Message-ID: <85e285cf-9eac-0ce8-adde-d21bc7ab1536@oracle.com>

I filed GR-14808 to fix indentation generated for JtregGraalUnit.gmk

Vladimir

On 3/28/19 5:26 PM, jesper.wilhelmsson at oracle.com wrote:
>> On 28 Mar 2019, at 19:52, Doug Simon wrote:
>>> On 28 Mar 2019, at 19:33, jesper.wilhelmsson at oracle.com wrote:
>>>
>>> Hi Vladimir,
>>>
>>> Thanks for reviewing!
>>>
>>>> On 28 Mar 2019, at 18:48, Vladimir Kozlov wrote:
>>>>
>>>> On 3/28/19 12:37 AM, jesper.wilhelmsson at oracle.com wrote:
>>>>> Hi,
>>>>> Please review the patch to integrate the latest Graal changes into OpenJDK.
>>>>> Graal tip to integrate: 7970bd76ff60600ab5a2fc96cd24ddd7ed017cf8 >>>>> JBS duplicates fixed by this integration: >>>>> https://bugs.openjdk.java.net/browse/JDK-8220643 >>>>> https://bugs.openjdk.java.net/browse/JDK-8220810 >>>>> JBS duplicates deferred to the next integration: >>>>> https://bugs.openjdk.java.net/browse/JDK-8214947 >>>> >>>> We should investigate why this bug is still referenced in RFR. We already discussed it: >>>> https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-March/033130.html >>> >>> Oops, sorry, missed that. I have removed the link from the next update issue so it won't show up >>> again. >>> >>>>> Bug:https://bugs.openjdk.java.net/browse/JDK-8221341 >>>>> Webrev:http://cr.openjdk.java.net/~jwilhelm/8221341/webrev.00/ >>>> >>>> We also discussed indentation change in make/test/JtregGraalUnit.gmk >>>> Why the change showed up again? >>> >>> My understanding was that this was caused by a bug in the mx script. As long as that bug is there >>> this will keep happening. I will try to remember to revert this going forward, but it is >>> something that I will need to do manually every time. Someone should fix the mx script. I don't >>> know who owns that though. >> >> Anyone who can code Python and submit a pull request ;-) I believe these are the relevant lines: >> >> https://github.com/oracle/graal/blob/master/compiler/mx.compiler/mx_updategraalinopenjdk.py#L297-L308 > > It seems to me that any logic to figure out the correct indentation would be fragile at best. The > change that caused this breakage was cleaning up the indentation to make it the same as the rest of > the file. I wouldn't expect this to change again in a way that wouldn't require the logic to change > as well. I suggest to simply add the two missing spaces in line 304. > > /Jesper > >> -Doug >> >>> >>>> >>>> Otherwise changes looks good. >>>> >>>> Doug, it is good to have only one Graal class to access Unsafe class - GraalUnsafeAccess.java. 
>>>> But we should not use sun.misc.Unsafe in JDK 13 - we should have version for JDK9+ which use >>>> jdk.internal.misc.Unsafe. It is for an other update. >>>> >>>> Thanks, >>>> Vladimir >>>> >>>>> Thanks, >>>>> /Jesper > From david.holmes at oracle.com Fri Mar 29 03:59:40 2019 From: david.holmes at oracle.com (David Holmes) Date: Fri, 29 Mar 2019 13:59:40 +1000 Subject: RFR(XS): 8221482: Initialize VMRegImpl::regName[] earlier to prevent assert during PrintStubCode In-Reply-To: <30549BAC-6DA9-45D5-A4AC-32BC243E7F2B@sap.com> References: <30549BAC-6DA9-45D5-A4AC-32BC243E7F2B@sap.com> Message-ID: Hi Lutz, cc'd the compiler team On 28/03/2019 9:14 pm, Schmidt, Lutz wrote: > Dear Community, > > may I please request reviews for this tiny change. The purpose is to initialize the regName[] array earlier during VM init. I can see that will fix the assertion for you, but then begs the question as to whether VMRegImpl::set_regName itself has any initialization dependencies. The answer to that is not obvious to me. I _think_ the Register setup only depends on C++ static initialization. Hopefully someone from compiler team can confirm this change is in fact safe. Thanks, David > Bug: https://bugs.openjdk.java.net/browse/JDK-8221482 > Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8221482.01/ > > Submit-repo tests pending... > > Thanks, > Lutz > > From vladimir.kozlov at oracle.com Fri Mar 29 18:53:28 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 29 Mar 2019 11:53:28 -0700 Subject: RFR(XS): 8221482: Initialize VMRegImpl::regName[] earlier to prevent assert during PrintStubCode In-Reply-To: References: <30549BAC-6DA9-45D5-A4AC-32BC243E7F2B@sap.com> Message-ID: <0dfb7424-3595-4709-b6ed-33db4bdfc34d@oracle.com> On 3/28/19 8:59 PM, David Holmes wrote: > Hi Lutz, > > cc'd the compiler team > > On 28/03/2019 9:14 pm, Schmidt, Lutz wrote: >> Dear Community, >> >> may I please request reviews for this tiny change. 
The purpose is to initialize the regName[] >> array earlier during VM init. > > I can see that will fix the assertion for you, but then begs the question as to whether > VMRegImpl::set_regName itself has any initialization dependencies. The answer to that is not obvious > to me. I _think_ the Register setup only depends on C++ static initialization. > > Hopefully someone from compiler team can confirm this change is in fact safe. The array is static: http://hg.openjdk.java.net/jdk/jdk/file/6a1406c718ec/src/hotspot/share/code/vmreg.cpp#l37 And register's names are encoded: http://hg.openjdk.java.net/jdk/jdk/file/6a1406c718ec/src/hotspot/cpu/x86/register_x86.cpp#l41 There are no initialization dependencies. Vladimir > > Thanks, > David > >> Bug:    https://bugs.openjdk.java.net/browse/JDK-8221482 >> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8221482.01/ >> >> Submit-repo tests pending... >> >> Thanks, >> Lutz >> >> From dean.long at oracle.com Fri Mar 29 23:31:55 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Fri, 29 Mar 2019 16:31:55 -0700 Subject: [13] RFR: 8202414: Unsafe write after primitive array creation may result in array length change In-Reply-To: References: <7e900022-4e16-2ab9-1f4d-89e1510e2646@oracle.com> <392c665f-869c-29af-4fc5-e6f844820846@oracle.com> <3db5d7ab-ad99-310b-e891-fc36d25da338@oracle.com> <7b03a213-7fee-a87f-b48d-250662e730ef@oracle.com> Message-ID: <959abf54-d1da-95ee-9cf6-6c6d8ec5e4a1@oracle.com> On 3/28/19 11:42 AM, Vladimir Ivanov wrote: > >> I don't think we can use is_mismatched_access(), because we seem to >> have the same problem even if int[] is used. I took another look at >> this, and I believe we can fix this in >> InitializeNode::complete_stores(), while still allowing the captured >> store optimization. > > Yes, good point. > > So, IMO the main question is what's the best place to fix it: >
InitializeNode::complete_stores() vs InitializeNode::can_capture_store > > Capturing stores and then InitializeNode::complete_stores() looks > attractive, but there are other cases arising with unsafe accesses > which aren't tested well yet. So, if we choose that route, the > regression test should be improved. On the other hand, > "!st->is_unaligned_access()" constraint in > InitializeNode::can_capture_store() can be removed. > I agree that we need better regression tests if we go this route. Do we have enough regression tests for the is_unaligned_access() case to enable that optimization first? > Forbidding mismatched accesses in InitializeNode::can_capture_store > (both marked as such and based on actual offset) looks like a safer > fix to me: it keeps InitializeNode::complete_stores() exposed only to > well-behaved accesses. > > How much do we lose by not capturing mismatched/unaligned initialized > stores? Is it worth optimizing for? > It does seem like it would be rare that optimizing it would make a difference, unless we had a microbenchmark that focuses on it. dl > Best regards, > Vladimir Ivanov > >> On 3/27/19 10:12 AM, Vladimir Ivanov wrote: >>> First, I'd like to note that it's a good practice to include problem >>> & root cause descriptions in the request. Otherwise, reviewers have >>> to find that information themselves which complicates the review process. >>> >>> (In this particular case, I found some analysis from the submitter >>> [1] in the bug only after carefully reading through it.) >>> >>> On 27/03/2019 06:44, Rahul Raghavan wrote: >>>> Hi, >>>> >>>> Thank you Vladimir. >>>> >>>> Yes, tried the following fix. >>>> (needed to add checks to avoid SIGFPE crash). >>>> >>>> +  int size_in_bytes = st->memory_size(); >>>> +  if ((size_in_bytes != 0) && (get_store_offset(st, phase) % >>>> size_in_bytes) != 0) { >>>> +    return FAIL; >>>> +
} >>>> >>>> >>>> - http://cr.openjdk.java.net/~rraghavan/8202414/webrev.02/ >>> >>> It seems the problem is due to a mismatched unsafe store being >>> captured as an initializing one. Why not check for it explicitly? >>> >>>    if (st->is_unaligned_access() || st->is_mismatched_access()) { >>>      return FAIL; >>>    } >>> >>> Best regards, >>> Vladimir Ivanov >>> >>> >>> [1] >>> >>> For your convenience, our analysis shows the problem may relate to >>> array InitializeNode logic. >>> It `capture_store` the memory write of Unsafe.putInt. >>> Since the putInt occupied offset range [17, 21] from the array pointer, >>> then it decided to `clear_memory` of offset range [16, 17] of the >>> array pointer. >>> This range actually cannot pass the assert "assert((end_offset % >>> BytesPerInt) == 0, "odd end offset")". >>> While in jvm product mode, without the assert, the compiler falsely >>> calculated to clear range [13, 17], >>> which will clear the three most significant bytes of the `length` of >>> this array. >>> >>> >>>> >>>> Confirmed no issues with testing for this revised fix. >>>> >>>> Thanks, >>>> Rahul >>>> >>>> On 26/03/19 1:03 AM, Vladimir Kozlov wrote: >>>>> >>>>> Suggestion: >>>>> >>>>> if ((get_store_offset(st, phase) % st->memory_size()) != 0) { >>>>> >>>>> Vladimir >>>>> >>>>> >> From felix.yang at huawei.com Sat Mar 30 00:58:53 2019 From: felix.yang at huawei.com (Yangfei (Felix)) Date: Sat, 30 Mar 2019 00:58:53 +0000 Subject: RFR: 8221658: aarch64: add necessary predicate for ubfx patterns Message-ID: Hi, Please review this patch adding necessary predicate for ubfx patterns in aarch64.ad. Bug: https://bugs.openjdk.java.net/browse/JDK-8221658 Webrev: http://cr.openjdk.java.net/~fyang/8221658/webrev.00 Currently, this issue is only reproduced with an aarch64 8u jdk. Although it is not reproduced with aarch64 jdk11 or newer versions, it's better for them to have this fix. Jtreg tested with aarch64 jdk8u & jdk13 fastdebug build.
Also passed the private fuzz test. Thanks, Felix From fujie at loongson.cn Sat Mar 30 10:05:34 2019 From: fujie at loongson.cn (Jie Fu) Date: Sat, 30 Mar 2019 18:05:34 +0800 Subject: RFR: 8221542: ~15% performance degradation due to less optimized inline decision In-Reply-To: References: Message-ID: Hi Vladimir, I appreciate your suggestion. I agree with you that to solve these kinds of problems completely, the compile policy needs to be carefully co-designed with the inline decision system. But it is really a big topic which deserves years of research. I have updated the patch based on your advice. Webrev: http://cr.openjdk.java.net/~jiefu/monte_carlo-perf-drop/webrev.01/ Testing: - Running scimark.monte_carlo on jdk/x64 and jdk8u/mips64 with -XX:-TieredCompilation: no performance drop - Running SPECjvm2008 on jdk8u/mips64 with -XX:-TieredCompilation: no performance regression - Running make test TEST="micro" on jdk/x64: no performance regression - Running make test TEST="tier1 tier2 tier3" JTREG="JOBS=3" CONF=release on jdk/x64: no regression Could you please review it and give me some advice? Thanks a lot. Best regards, Jie On 2019/3/28 2:21, Vladimir Ivanov wrote: > Hi Jie, > > The heuristic quirk looks very similar to the one Sergey reported > recently: > > > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-February/032623.html > > > Overall, tweaking the heuristic to favor inlining doesn't look like the > right thing here. profile.count=0 is a sign the profile isn't mature > enough and it's likely the callee doesn't have enough profiling info > as well. (And that's what Sergey observed on some of the > microbenchmarks during his experiments.) > > In your particular case (Random::), tweaking the heuristic so > is_init_with_ea [1] overrules "profile.count > 0" may be a more > promising approach.
After all, the fact that the call site is being > considered for inlining (and not pruned along with the basic block it > belongs to) is a strong signal in favor of "profile.count > 0" case. > (Though it's not guaranteed due to the immaturity of profile data.) > > But IMO the root problem is that top-tier compilation happens too > early: profile data isn't mature enough yet and it will easily lead to > similar problems later (during compilation). > > Best regards, > Vladimir Ivanov > > [1] > http://hg.openjdk.java.net/jdk/jdk/file/9c84d2865c2d/src/hotspot/share/opto/bytecodeInfo.cpp#l81 > > On 27/03/2019 03:15, Jie Fu wrote: >> Hi all, >> >> JBS:    https://bugs.openjdk.java.net/browse/JDK-8221542 >> Webrev: >> http://cr.openjdk.java.net/~jiefu/monte_carlo-perf-drop/webrev.00/ >> >> ## Symptom >> ~15% performance degradation (from 700 ops/m to 600 ops/m) was >> observed randomly on x86 while running SPECjvm2008's >> scimark.monte_carlo with -XX:-TieredCompilation. >> >> ## Reproduce >> It can be always reproduced with the script[1] in less than 5 minutes. >> >> ## Reason >> The drop was caused by a not-inline decision on >> spec.benchmarks.scimark.utils.Random:: in >> spec.benchmarks.scimark.monte_carlo.MonteCarlo::integrate. >> >> ## Fix >> It might be better to make a little change to the inline heuristic[2]. >> >> For callers without loops, the original heuristic works fine. >> But for callers with loops, it would be better to make a not-inline >> decision more conservatively. >> >> ## Testing >> - Running scimark.monte_carlo on jdk/x64 with -XX:-TieredCompilation >> for about 5000 times, no performance drop >>    Also on jdk8u/mips64 with -XX:-TieredCompilation, no performance drop >> - Running make test TEST="micro" on jdk/x64, no performance regression >> - Running SPECjvm2008 on jdk8u/x64 with -XX:-TieredCompilation, no >> performance regression >> >> For more detailed info, please see the JBS. >> >> Could you please review it? >> Thanks a lot.
>> >> Best regards, >> Jie >> >> [1] http://cr.openjdk.java.net/~jiefu/monte_carlo-perf-drop/reproduce.sh >> [2] >> http://hg.openjdk.java.net/jdk/jdk/file/0a2d73e02076/src/hotspot/share/opto/bytecodeInfo.cpp#l375 >> >> >> From aph at redhat.com Sun Mar 31 08:52:28 2019 From: aph at redhat.com (Andrew Haley) Date: Sun, 31 Mar 2019 09:52:28 +0100 Subject: [aarch64-port-dev ] RFR: 8221658: aarch64: add necessary predicate for ubfx patterns In-Reply-To: References: Message-ID: <130fbe62-4fac-5a8d-aade-74e340459e23@redhat.com> On 3/30/19 12:58 AM, Yangfei (Felix) wrote: > Please review this patch adding necessary predicate for ubfx patterns in aarch64.ad. > Bug: https://bugs.openjdk.java.net/browse/JDK-8221658 > Webrev: http://cr.openjdk.java.net/~fyang/8221658/webrev.00 > > Currently, this issue is only reproduced with an aarch64 8u jdk. > Although it is not reproduced with aarch64 jdk11 or newer versions, it's better for them to have this fix. > Jtreg tested with aarch64 jdk8u & jdk13 fastdebug build. Also passed the private fuzz test. Can't this be done by using a match operand? -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
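For readers following the 8221658 thread: the messages above do not say what the added predicate checks. A plausible reading, and it is only an assumption here, is the architectural limit of the 32-bit UBFX instruction, which extracts `width` bits starting at bit `lsb` and is only encodable when lsb + width <= 32. The Java sketch below models that limit for the shape (src >>> rshift) & mask. The class and method names (UbfxSketch, ubfx32, fitsUbfx32) are hypothetical and are not taken from aarch64.ad or the webrev.

```java
public final class UbfxSketch {

    // Models the 32-bit "ubfx Wd, Wn, #lsb, #width": take `width` bits of src
    // starting at bit `lsb` and zero-extend them.
    // The instruction is only encodable for 0 <= lsb <= 31 and 1 <= width <= 32 - lsb.
    public static int ubfx32(int src, int lsb, int width) {
        if (lsb < 0 || lsb > 31 || width < 1 || width > 32 - lsb) {
            throw new IllegalArgumentException("not encodable: lsb=" + lsb + " width=" + width);
        }
        int mask = (width == 32) ? -1 : (1 << width) - 1;
        return (src >>> lsb) & mask;
    }

    // Hypothetical stand-in for a matcher predicate (an assumption, not the patch):
    // may (src >>> rshift) & mask be emitted as a single 32-bit ubfx?
    public static boolean fitsUbfx32(int rshift, int mask) {
        // The mask must be a contiguous run of low bits, i.e. 2^k - 1 with k >= 1.
        if (mask <= 0 || (mask & (mask + 1)) != 0) {
            return false;
        }
        int width = Integer.bitCount(mask);
        int lsb = rshift & 31; // Java uses only the low 5 bits of an int shift count
        return lsb + width <= 32;
    }

    public static void main(String[] args) {
        int x = 0xCAFEBABE;
        // 8 bits starting at bit 4: lsb + width = 12, so the field fits.
        System.out.println(fitsUbfx32(4, 0xFF));
        System.out.println(((x >>> 4) & 0xFF) == ubfx32(x, 4, 8));
        // 16 bits starting at bit 24: lsb + width = 40, so the field does not fit.
        System.out.println(fitsUbfx32(24, 0xFFFF));
    }
}
```

Note that (x >>> 24) & 0xFFFF is still a perfectly valid Java expression. It just cannot be expressed as one ubfx with width 16, which is the kind of case a predicate on the pattern would have to exclude under the assumption above.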